Introduction to Modeling and Generating Probabilistic Input Processes for Simulation

Michael E. Kuhl (RIT), Emily K. Lada (SAS Institute Inc.), Natalie M. Steiger (University of Maine), Mary Ann Wagner (SAIC), James R. Wilson (NC State University)

<www.ise.ncsu.edu/jwilson>

December 11, 2007


OVERVIEW

I. Introduction

II. Univariate Input Models

A. Generalized Beta Distribution Family

B. Johnson Translation System of Distributions

C. Bézier Distribution Family

III. Time-Dependent Arrival Processes

IV. Conclusions and Recommendations


I. Introduction

• Stochastic simulations require valid input models, e.g., probability distributions that accurately mimic the random input processes driving the target system.


• Problems in using many conventional probability models:

1. They cannot adequately represent real-world behavior, e.g., in the tails of the underlying distribution.

2. Parameter estimation based on sample data or subjective information (expert opinion) is often troublesome.

3. Fine-tuning the fitted model is difficult; e.g., many conventional probability distributions have the following drawbacks:

(a) A limited number of parameters available to control the fitted distribution, and

(b) No effective mechanism for directly manipulating the fitted distribution while simultaneously updating its parameter estimates.


• Conventional approach to identifying an input model uses sample data to select from a list of well-known alternatives based on

1. informal graphical techniques such as probability plots, Q–Q plots, histograms, empirical frequency distributions, or box plots; and

2. statistical goodness-of-fit tests such as the Kolmogorov-Smirnov, chi-squared, and Anderson-Darling tests.


• Drawbacks of conventional input modeling

1. Visual comparison of a histogram to a fitted probability density function (p.d.f.) depends on the (arbitrary) layout of the histogram.

2. Problems with statistical goodness-of-fit tests include:

(a) In small samples, low power to detect lack of fit results in an inability to reject any alternatives.

(b) In large samples, practically insignificant fit discrepancies result in rejection of all alternatives.


• Problems in estimating the parameters of the selected input model from sample data:

– Matching the mean and standard deviation of the fitted distribution with those of the sample often fails to capture relevant shape characteristics.

– Some estimation methods, such as maximum likelihood and percentile matching, may simply fail to estimate some parameters.

– Users lack a comprehensive basis for selecting the “best-fitting” model.


• Problems with parameter estimation based on subjective information (expert opinion):

– Subjective estimates of moments such as the mean and standard deviation can be unreliable and depend critically on the units of measurement.

– Subjective estimates of extreme quantiles (e.g., lower and upper limits of the fitted distribution) are unreliable.

• Practitioners lack definitive procedures for identifying and estimating valid input models; thus, output analysis is often based on incorrect input processes.

• We focus on methods for input modeling that alleviate many of these problems.


II. Univariate Input Models

A. Generalized Beta Distribution Family

If X is a continuous random variable with lower limit a and upper limit b whose distribution is to be approximated and subsequently sampled in a simulation, then often we can model the behavior of X using a generalized beta distribution.

• Generalized beta p.d.f.

$$
f_X(x) = \frac{\Gamma(\theta_1 + \theta_2)\,(x - a)^{\theta_1 - 1}\,(b - x)^{\theta_2 - 1}}{\Gamma(\theta_1)\,\Gamma(\theta_2)\,(b - a)^{\theta_1 + \theta_2 - 1}} \quad \text{for } a \le x \le b, \tag{1}
$$

where $\Gamma(z) = \int_0^\infty t^{z-1} e^{-t}\, dt$ (for $z > 0$) denotes the gamma function.


• The beta p.d.f. can accommodate a wide variety of shapes, including

– symmetric and positively or negatively skewed unimodal p.d.f.’s;

– J- and U-shaped p.d.f.’s;

– left- and right-triangular p.d.f.’s; and

– uniform p.d.f.’s.

• Some examples illustrating the range of distributional shapes achievable with the beta p.d.f. follow.


Positively and Negatively Skewed Unimodal Beta Densities


U-shaped Beta Densities


J-shaped and Left-triangular Beta Densities


Symmetric and Uniform Beta Densities


• The cumulative distribution function (c.d.f.) of the beta variate X,

$$
F_X(x) = \Pr\{X \le x\} = \int_{-\infty}^{x} f_X(w)\, dw \quad \text{for all real } x,
$$

has no convenient analytical expression.


• Mean and variance of X are given by

$$
\mu_X = E[X] = \frac{\theta_1 b + \theta_2 a}{\theta_1 + \theta_2}, \qquad
\sigma_X^2 = E\big[(X - \mu_X)^2\big] = \frac{(b - a)^2\, \theta_1 \theta_2}{(\theta_1 + \theta_2)^2 (\theta_1 + \theta_2 + 1)}. \tag{2}
$$

• Provided $\theta_1, \theta_2 > 1$ so that the p.d.f. (1) is unimodal, the mode is given by

$$
m = \frac{(\theta_1 - 1)\, b + (\theta_2 - 1)\, a}{\theta_1 + \theta_2 - 2}. \tag{3}
$$

• The key distributional characteristics (2) and (3) are simple functions of a, b, θ1, and θ2; and this facilitates rapid input modeling.
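
As a rough illustration of how directly (2) and (3) can be evaluated, the following Python sketch computes the mean, variance, and mode from a, b, θ1, and θ2. The function name `beta_summary` and the choice to return `None` for the mode outside the unimodal case are assumptions made for this example only.

```python
def beta_summary(a, b, theta1, theta2):
    """Return (mean, variance, mode) of a generalized beta distribution on [a, b].

    The mode is None unless theta1 > 1 and theta2 > 1, the unimodal case of (3).
    """
    if not (a < b and theta1 > 0 and theta2 > 0):
        raise ValueError("require a < b and positive shape parameters")
    total = theta1 + theta2
    mean = (theta1 * b + theta2 * a) / total                                 # mean in (2)
    variance = (b - a) ** 2 * theta1 * theta2 / (total ** 2 * (total + 1))  # variance in (2)
    mode = None
    if theta1 > 1 and theta2 > 1:
        mode = ((theta1 - 1) * b + (theta2 - 1) * a) / (total - 2)          # mode in (3)
    return mean, variance, mode


mean, variance, mode = beta_summary(a=0.0, b=10.0, theta1=2.0, theta2=3.0)
print(mean, variance, mode)
```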


Fitting Beta Distributions to Data or Subjective Information

Given the data set $\{X_i : i = 1, \ldots, n\}$ of size n, we let

$$
X_{(1)} \le X_{(2)} \le \cdots \le X_{(n)}
$$

denote the order statistics; and we compute the sample statistics

$$
a = 2X_{(1)} - X_{(2)}, \qquad b = 2X_{(n)} - X_{(n-1)}, \qquad
\bar{X} = n^{-1} \sum_{i=1}^{n} X_i, \qquad S^2 = (n - 1)^{-1} \sum_{i=1}^{n} \big(X_i - \bar{X}\big)^2.
$$


• Moment-matching estimates of θ1, θ2 are computed from

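$$
\theta_1 = \frac{d_1^2\,(1 - d_1)}{d_2^2} - d_1, \qquad
\theta_2 = \frac{d_1\,(1 - d_1)^2}{d_2^2} - (1 - d_1),
$$

where

$$
d_1 = \frac{\bar{X} - a}{b - a} \quad \text{and} \quad d_2 = \frac{S}{b - a}.
$$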
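
The moment-matching recipe above can be sketched in a few lines of Python. This is only an illustrative sketch, not the BetaFit implementation: the function name `fit_beta_moments` is hypothetical, and the sketch does no feasibility checking beyond requiring at least three observations and a nondegenerate range.

```python
import statistics


def fit_beta_moments(data):
    """Moment-matching estimates (a, b, theta1, theta2) from a sample."""
    xs = sorted(data)
    if len(xs) < 3:
        raise ValueError("need at least three observations")
    # Endpoint estimates from the two smallest and two largest order statistics.
    a = 2 * xs[0] - xs[1]
    b = 2 * xs[-1] - xs[-2]
    if b <= a:
        raise ValueError("degenerate sample: estimated range is empty")
    xbar = statistics.fmean(xs)
    s = statistics.stdev(xs)          # sample standard deviation, divisor n - 1
    d1 = (xbar - a) / (b - a)         # standardized sample mean
    d2 = s / (b - a)                  # standardized sample standard deviation
    theta1 = d1 ** 2 * (1 - d1) / d2 ** 2 - d1
    theta2 = d1 * (1 - d1) ** 2 / d2 ** 2 - (1 - d1)
    return a, b, theta1, theta2


print(fit_beta_moments([1.0, 2.0, 3.0, 4.0, 5.0]))
```

If the sample variance is large relative to the estimated range, these formulas can return nonpositive shape estimates, so a caller would need to check the result before using it.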


• BetaFit (AbouRizk, Halpin, and Wilson 1994) is a Windows-based package for fitting the beta distribution to sample data by computing a, b, θ1, and θ2 using the following estimation methods:

– moment matching;

– feasibility-constrained moment matching (so that the feasibility conditions a < X(1) and X(n) < b are always satisfied);

– maximum likelihood (assuming a and b are known and thus are not estimated); and

– ordinary least squares (OLS) and diagonally weighted least squares (DWLS) estimation of the c.d.f.

• BetaFit is in the public domain and is available on the Web via

<www.ise.ncsu.edu/jwilson/page3>.


Application of BetaFit to a Sample of n = 9,980 Observations of End-to-End Chain Lengths (in Ångströms) of Nafion, an Ionic Polymer Used As a “Smart Material,” Based on the Method of Moment Matching


Result of Applying BetaFit to Nafion Data Set Using Maximum Likelihood Estimation


Result of Applying BetaFit to Nafion Data Set Using Ordinary Least Squares Estimation of the C.d.f.


• Rapid input modeling with subjective estimates a, m, and b of the minimum, mode, and maximum, respectively, of the target distribution:

$$
\theta_1 = \frac{d^2 + 3d + 4}{d^2 + 1} \quad \text{and} \quad \theta_2 = \frac{4d^2 + 3d + 1}{d^2 + 1}, \tag{4}
$$

where

$$
d = \frac{b - m}{m - a}.
$$

The mode of the fitted beta distribution will differ from m by at most 4.4%; in practice the error is usually at most 1%.
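
A minimal Python sketch of (4) follows; it also evaluates the mode formula (3) for the fitted parameters so that the result can be compared with the subjective mode m. The function names `beta_from_min_mode_max` and `fitted_mode` are hypothetical, chosen for this example.

```python
def beta_from_min_mode_max(a, m, b):
    """Shape parameters (theta1, theta2) from subjective a < m < b via (4)."""
    if not a < m < b:
        raise ValueError("require a < m < b")
    d = (b - m) / (m - a)
    theta1 = (d * d + 3 * d + 4) / (d * d + 1)
    theta2 = (4 * d * d + 3 * d + 1) / (d * d + 1)
    return theta1, theta2


def fitted_mode(a, b, theta1, theta2):
    """Mode of the beta distribution on [a, b] via (3); needs theta1, theta2 > 1."""
    return ((theta1 - 1) * b + (theta2 - 1) * a) / (theta1 + theta2 - 2)


# Example: an operation time with minimum 0, most likely value 1, maximum 4.
theta1, theta2 = beta_from_min_mode_max(0.0, 1.0, 4.0)
print(theta1, theta2, fitted_mode(0.0, 4.0, theta1, theta2))
```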


• VIBES (AbouRizk, Halpin, and Wilson 1991) is a Windows-based package for fitting the beta distribution to subjective estimates of:

1. the endpoints a and b; and

2. any of the following combinations of distributional characteristics:

– the mean $\mu_X$ and the variance $\sigma_X^2$,

– the mean $\mu_X$ and the mode m,

– the mode m and the variance $\sigma_X^2$,

– the mode m and an arbitrary quantile $x_p = F_X^{-1}(p)$ for $p \in (0, 1)$, or

– two quantiles $x_p$ and $x_q$ for $p, q \in (0, 1)$.


• Advantages of the beta distribution as an input-modeling tool:

– sufficient flexibility to represent with reasonable accuracy a wide diversity of distributional shapes; and

– convenient estimation of parameters from sample data or subjective information.

• Disadvantages of the beta distribution as an input-modeling tool:

– difficult to explain; and

– difficult to sample: some popular beta variate generators break down when θ1 > 10 or θ2 > 10.


Application of Beta Distributions to Pharmaceutical Manufacturing

Pearlswig (1995) developed a simulation of a proposed facility for manufacturing effervescent tablets.

• For each operation, he obtained three time estimates (a, m, and b) from the process engineers.

• Extremely conservative estimates given for upper limits (so that b ≫ m).

• With triangular distributions to model processing times, bottlenecks resulted in excessively low simulation estimates of annual production.

• Using (4), Pearlswig fitted beta distributions to all operation times; and then the simulation results conformed to production levels of similar plants elsewhere.


B. Johnson Translation System of Distributions

To fit a distribution to the continuous random variable X, Johnson (1949a) proposed finding a “translation” of X to a standard normal random variable Z with mean 0 and variance 1 so that Z ∼ N(0, 1).

For a detailed discussion of the Johnson translation system, see

DeBrota, D. J., R. S. Dittus, S. D. Roberts, J. R. Wilson, J. J. Swain, and S. Venkatraman. 1989a. Modeling input processes with Johnson distributions. In Proceedings of the 1989 Winter Simulation Conference, pp. 308–318. Available online via

<www.ise.ncsu.edu/jwilson/files/wsc89jnsn.pdf>.


The proposed normalizing translations have the general form

$$
Z = \gamma + \delta \cdot g\!\left(\frac{X - \xi}{\lambda}\right), \tag{5}
$$

where γ and δ are shape parameters, λ is a scale parameter, ξ is a location parameter, and the function g(·) defines the four distribution families in the Johnson translation system,

$$
g(y) =
\begin{cases}
\ln(y), & \text{for } S_L \text{ (lognormal) family,} \\
\ln\!\left(y + \sqrt{y^2 + 1}\right), & \text{for } S_U \text{ (unbounded) family,} \\
\ln[y/(1 - y)], & \text{for } S_B \text{ (bounded) family,} \\
y, & \text{for } S_N \text{ (normal) family.}
\end{cases}
$$


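• Johnson c.d.f.

If (5) is an exact normalizing translation of X to a standard normal random variable, then the c.d.f. of X is given by

$$
F_X(x) = \Phi\!\left[\gamma + \delta \cdot g\!\left(\frac{x - \xi}{\lambda}\right)\right] \quad \text{for all } x \in H,
$$

where $\Phi(z) = (2\pi)^{-1/2} \int_{-\infty}^{z} \exp\!\left(-\tfrac{1}{2} w^2\right) dw$ is the standard normal c.d.f.; and the space of X is

$$
H =
\begin{cases}
[\xi, +\infty), & \text{for } S_L \text{ (lognormal) family,} \\
(-\infty, +\infty), & \text{for } S_U \text{ (unbounded) family,} \\
[\xi, \xi + \lambda], & \text{for } S_B \text{ (bounded) family,} \\
(-\infty, +\infty), & \text{for } S_N \text{ (normal) family.}
\end{cases}
$$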


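• Johnson p.d.f. is

$$
f_X(x) = \frac{\delta}{\lambda (2\pi)^{1/2}}\; g'\!\left(\frac{x - \xi}{\lambda}\right) \exp\!\left\{ -\frac{1}{2} \left[ \gamma + \delta \cdot g\!\left(\frac{x - \xi}{\lambda}\right) \right]^2 \right\}
$$

for all $x \in H$, where

$$
g'(y) =
\begin{cases}
1/y, & \text{for } S_L \text{ (lognormal) family,} \\
1/\sqrt{y^2 + 1}, & \text{for } S_U \text{ (unbounded) family,} \\
1/[y(1 - y)], & \text{for } S_B \text{ (bounded) family,} \\
1, & \text{for } S_N \text{ (normal) family.}
\end{cases}
$$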
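
To make the c.d.f. and p.d.f. expressions concrete, here is a small Python sketch that evaluates both for any of the four families. The function names and the string codes "SL", "SU", "SB", and "SN" are assumptions of this example, and x is assumed to lie in the interior of H. The SB branch of `g_prime` uses 1/[y(1 − y)], the derivative of ln[y/(1 − y)].

```python
import math


def g(y, family):
    """Johnson translation function g(y) for the given family code."""
    if family == "SL":
        return math.log(y)
    if family == "SU":
        return math.asinh(y)              # equals ln(y + sqrt(y^2 + 1))
    if family == "SB":
        return math.log(y / (1.0 - y))
    if family == "SN":
        return y
    raise ValueError("unknown family: " + family)


def g_prime(y, family):
    """Derivative of g with respect to y."""
    if family == "SL":
        return 1.0 / y
    if family == "SU":
        return 1.0 / math.sqrt(y * y + 1.0)
    if family == "SB":
        return 1.0 / (y * (1.0 - y))
    if family == "SN":
        return 1.0
    raise ValueError("unknown family: " + family)


def johnson_cdf(x, gamma, delta, lam, xi, family):
    z = gamma + delta * g((x - xi) / lam, family)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))   # standard normal c.d.f.


def johnson_pdf(x, gamma, delta, lam, xi, family):
    y = (x - xi) / lam
    z = gamma + delta * g(y, family)
    return delta / (lam * math.sqrt(2.0 * math.pi)) * g_prime(y, family) * math.exp(-0.5 * z * z)


print(johnson_cdf(1.7, 0.5, 1.2, 2.0, 1.0, "SB"), johnson_pdf(1.7, 0.5, 1.2, 2.0, 1.0, "SB"))
```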

Following are examples illustrating all the distributional shapes in the Johnson system.


Symmetric Bimodal and Nearly Uniform Johnson SB Densities


Nearly J-shaped and Symmetric Unimodal Johnson SB Densities


Positively Skewed and Symmetric Unimodal Johnson SB Densities


Symmetric and Negatively Skewed Johnson SU Densities


Fitting Johnson Distributions to Sample Data

We select an estimation method and the desired translation function g(·) and then obtain estimates of γ, δ, λ, and ξ.

The Johnson system has the flexibility to match

(a) any feasible combination of values for the mean $\mu_X$, variance $\sigma_X^2$, skewness

$$
\alpha_X = E\big[(X - \mu_X)^3 / \sigma_X^3\big] \quad \big(\text{often denoted by } \sqrt{\beta_1}\big),
$$

and kurtosis

$$
\beta_X = E\big[(X - \mu_X)^4 / \sigma_X^4\big] \quad (\text{often denoted by } \beta_2);
$$

or

(b) sample estimates of the moments $\mu_X$, $\sigma_X^2$, $\alpha_X$, and $\beta_X$.


• FITTR1 (Swain, Venkatraman, and Wilson 1988) is a software package for fitting Johnson distributions to sample data using the following estimation methods:

– OLS and DWLS estimation of the c.d.f.;

– minimum $L_1$ and $L_\infty$ norm estimation of the c.d.f.;

– moment matching; and

– percentile matching.


• VISIFIT (DeBrota et al. 1989b) is a Windows-based software package for fitting Johnson SB distributions to subjective information, possibly combined with sample data. The user must provide estimates of a, b, and any two of the following characteristics:

– the mode m;

– the mean $\mu_X$;

– the median $x_{0.5}$;

– arbitrary quantile(s) $x_p$ or $x_q$ for $p, q \in (0, 1)$;

– the width of the central 95% of the distribution; or

– the standard deviation $\sigma_X$.

Swain, Venkatraman, and Wilson (1988), DeBrota et al. (1989b), FITTR1, and VISIFIT are available on the Web via

<www.ise.ncsu.edu/jwilson/more_info>.


Generating Johnson Variates by Inversion

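[1] Generate Z ∼ N(0, 1).

[2] Apply to Z the inverse translation

$$
X = \xi + \lambda \cdot g^{-1}\!\left(\frac{Z - \gamma}{\delta}\right), \tag{6}
$$

where for all real z we define the inverse translation function

$$
g^{-1}(z) =
\begin{cases}
e^{z}, & \text{for } S_L \text{ (lognormal) family,} \\
\left(e^{z} - e^{-z}\right)/2, & \text{for } S_U \text{ (unbounded) family,} \\
1/\left(1 + e^{-z}\right), & \text{for } S_B \text{ (bounded) family,} \\
z, & \text{for } S_N \text{ (normal) family.}
\end{cases} \tag{7}
$$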
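
Steps [1] and [2] translate almost line for line into code. The Python sketch below is one possible rendering: the names `g_inverse` and `johnson_variate` and the family codes "SL", "SU", "SB", "SN" are assumptions of this example, and the standard normal variate comes from the standard library's `random.gauss` rather than from any particular generator discussed in the source.

```python
import math
import random


def g_inverse(z, family):
    """Inverse translation function (7) for the given family code."""
    if family == "SL":
        return math.exp(z)
    if family == "SU":
        return math.sinh(z)               # equals (e^z - e^-z) / 2
    if family == "SB":
        return 1.0 / (1.0 + math.exp(-z))
    if family == "SN":
        return z
    raise ValueError("unknown family: " + family)


def johnson_variate(gamma, delta, lam, xi, family, rng=random):
    """Generate one Johnson variate by the inverse translation (6)."""
    z = rng.gauss(0.0, 1.0)                                   # step [1]
    return xi + lam * g_inverse((z - gamma) / delta, family)  # step [2]


rng = random.Random(2007)   # seeded only to make the example repeatable
print([johnson_variate(0.5, 1.5, 3.0, 2.0, "SB", rng) for _ in range(3)])
```

With the SB family, every generated value falls between ξ and ξ + λ (here, between 2 and 5), which gives a quick sanity check on the parameter order.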


Application of Johnson Distributions to Smart Materials Research

• Matthews et al. (2006) and Weiland et al. (2005) use a multiscale modeling approach to predict material stiffness of a certain class of smart materials called ionic polymers.

• Material stiffness depends on effective length of the polymer chains comprising the material.

• In a case study of the ionic polymer Nafion, Matthews et al. (2006) develop a simulation of polymer-chain conformation on a nanoscopic level so as to generate a large number of end-to-end chain lengths.

• The chain-length p.d.f. is estimated and used as input to a macroscopic-level mathematical model to predict material stiffness.


Johnson SU C.d.f. Fitted to n = 9,980 Nafion Chain Lengths Using DWLS Estimation

[Figure: fitted c.d.f.; horizontal axis from −10 to 90, vertical axis from 0 to 1]


Johnson SU P.d.f. Fitted to n = 9,980 Nafion Chain Lengths Using DWLS Estimation

[Figure: fitted p.d.f.; horizontal axis from −10 to 90, vertical axis from 0 to 0.1]


• Matthews et al. (2006) and Weiland et al. (2005) obtain more accurate and intuitively appealing fits to Nafion chain-length data with Johnson p.d.f.’s than with other distributions.

– Material stiffness is computed from the second derivative $f''_X(x)$ of the fitted p.d.f.

– There is a relatively simple relationship between the Johnson parameters and material stiffness.


Application of Johnson Distributions to Healthcare

• To model arrival patterns of patients who have scheduled appointments at community healthcare clinics in San Diego, Alexopoulos et al. (2006) estimate the distribution of patient tardiness, that is, deviation from the scheduled appointment time.

• Alexopoulos et al. (2006) perform an exhaustive analysis of 18 continuous distributions, concluding that the SU distribution provided superior fits to the available data.


C. Bézier Distribution Family

Definition of Bézier Curves

• A Bézier curve is often used to approximate a smooth function on a bounded interval by forcing the Bézier curve to pass in the vicinity of selected control points

$$
\big\{ \mathbf{p}_i \equiv (x_i, z_i)^{\mathsf T} : i = 0, 1, \ldots, n \big\}
$$

in two-dimensional Euclidean space.


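• A Bézier curve of degree n with control points $\{\mathbf{p}_0, \mathbf{p}_1, \ldots, \mathbf{p}_n\}$ is given parametrically by

$$
\mathbf{P}(t) = \big[P_x(t; n, \mathbf{x}),\, P_z(t; n, \mathbf{z})\big]^{\mathsf T} = \sum_{i=0}^{n} B_{n,i}(t)\, \mathbf{p}_i \quad \text{for } t \in [0, 1], \tag{8}
$$

where $\mathbf{x} \equiv (x_0, x_1, \ldots, x_n)^{\mathsf T}$ and $\mathbf{z} \equiv (z_0, z_1, \ldots, z_n)^{\mathsf T}$, and where the blending function,

$$
B_{n,i}(t) \equiv \frac{n!}{i!\,(n - i)!}\; t^{i} (1 - t)^{n - i} \quad \text{for } t \in [0, 1], \tag{9}
$$

is the ith Bernstein polynomial for i = 0, 1, . . . , n.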
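
As a small worked illustration of (8) and (9), the following Python sketch evaluates the Bernstein polynomials and a point on a Bézier curve. The function names `bernstein` and `bezier_point` and the three control points are hypothetical, used only for this example.

```python
from math import comb


def bernstein(n, i, t):
    """Bernstein polynomial B_{n,i}(t) from (9)."""
    return comb(n, i) * t ** i * (1 - t) ** (n - i)


def bezier_point(control_points, t):
    """Point (x, z) on the Bezier curve (8) for control points [(x0, z0), ...]."""
    n = len(control_points) - 1
    x = sum(bernstein(n, i, t) * p[0] for i, p in enumerate(control_points))
    z = sum(bernstein(n, i, t) * p[1] for i, p in enumerate(control_points))
    return x, z


points = [(0.0, 0.0), (1.0, 2.0), (3.0, 3.0)]   # a degree-2 curve
print(bezier_point(points, 0.0), bezier_point(points, 0.5), bezier_point(points, 1.0))
```

The curve starts at the first control point and ends at the last one, while the interior control point only attracts the curve, which is the sense in which the curve passes "in the vicinity of" the control points.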


Bézier Distribution and Density Functions

• If X is a continuous random variable on [a, b] with c.d.f. $F_X(\cdot)$ and p.d.f. $f_X(\cdot)$, then we can approximate $F_X(\cdot)$ arbitrarily closely using a Bézier curve of the form (8) by taking a sufficient number (n + 1) of control points with appropriate coordinates

$$
\mathbf{p}_i = (x_i, z_i)^{\mathsf T}
$$

for the ith control point, where i = 0, . . . , n.


• If X is Bézier, then the c.d.f. of X is given parametrically by

$$
\mathbf{P}(t) = \big[P_x(t; n, \mathbf{x}),\, P_z(t; n, \mathbf{z})\big]^{\mathsf T} = \big\{x(t),\, F_X[x(t)]\big\}^{\mathsf T} \quad \text{for } t \in [0, 1], \tag{10}
$$

where

$$
\left.
\begin{aligned}
x(t) &= P_x(t; n, \mathbf{x}) = \sum_{i=0}^{n} B_{n,i}(t)\, x_i, \\
F_X[x(t)] &= P_z(t; n, \mathbf{z}) = \sum_{i=0}^{n} B_{n,i}(t)\, z_i
\end{aligned}
\right\} \quad \text{for } t \in [0, 1]. \tag{11}
$$

For a detailed discussion of Bézier distributions, see

Wagner, M. A. F., and J. R. Wilson. 1996a. Using univariate Bézier distributions to model simulation input processes. IIE Transactions 28 (9): 699–711. Available online via

<www.ise.ncsu.edu/jwilson/files/wagner96iie.pdf>


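• If X is Bézier with c.d.f. $F_X(\cdot)$ given by (10), then the p.d.f. $f_X(x)$ is

$$
\mathbf{P}^{*}(t) = \big[P_x^{*}(t; n, \mathbf{x}),\, P_z^{*}(t; n, \mathbf{x}, \mathbf{z})\big]^{\mathsf T} = \big\{x(t),\, f_X[x(t)]\big\}^{\mathsf T} \quad \text{for } t \in [0, 1],
$$

where $x(t) = P_x^{*}(t; n, \mathbf{x}) = P_x(t; n, \mathbf{x})$ as in (11) and

$$
f_X[x(t)] = P_z^{*}(t; n, \mathbf{x}, \mathbf{z})
= \frac{P_z(t; n - 1, \Delta\mathbf{z})}{P_x(t; n - 1, \Delta\mathbf{x})}
= \frac{\sum_{i=0}^{n-1} B_{n-1,i}(t)\, \Delta z_i}{\sum_{i=0}^{n-1} B_{n-1,i}(t)\, \Delta x_i},
$$

where

$$
\Delta\mathbf{x} \equiv (\Delta x_0, \ldots, \Delta x_{n-1})^{\mathsf T} \quad \text{and} \quad \Delta\mathbf{z} \equiv (\Delta z_0, \ldots, \Delta z_{n-1})^{\mathsf T},
$$

with

$$
\Delta x_i \equiv x_{i+1} - x_i \quad \text{and} \quad \Delta z_i \equiv z_{i+1} - z_i \quad \text{for } i = 0, 1, \ldots, n - 1.
$$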
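
The parametric c.d.f. and p.d.f. can be evaluated together at a given t, as in the Python sketch below. The function names and the example control points are hypothetical; the sketch assumes the x-coordinates are such that the denominator of the p.d.f. ratio is positive at the chosen t.

```python
from math import comb


def bernstein(n, i, t):
    """Bernstein polynomial B_{n,i}(t)."""
    return comb(n, i) * t ** i * (1 - t) ** (n - i)


def bezier_cdf_pdf(xs, zs, t):
    """Return (x(t), F_X[x(t)], f_X[x(t)]) for control coordinates xs, zs."""
    n = len(xs) - 1
    x = sum(bernstein(n, i, t) * xs[i] for i in range(n + 1))
    cdf = sum(bernstein(n, i, t) * zs[i] for i in range(n + 1))
    # First differences of the control-point coordinates.
    dx = [xs[i + 1] - xs[i] for i in range(n)]
    dz = [zs[i + 1] - zs[i] for i in range(n)]
    numerator = sum(bernstein(n - 1, i, t) * dz[i] for i in range(n))
    denominator = sum(bernstein(n - 1, i, t) * dx[i] for i in range(n))
    return x, cdf, numerator / denominator


# Degree-3 example on [0, 4] with z0 = 0 and z3 = 1.
print(bezier_cdf_pdf([0.0, 1.0, 3.0, 4.0], [0.0, 0.1, 0.9, 1.0], 0.5))
```

With only two control points, (0, 0) and (2, 1), the same function reproduces the uniform distribution on [0, 2], whose density is 1/2 everywhere.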


Generating Bézier Variates by Inversion

[1] Generate a random number U ∼ Uniform[0, 1].

[2] Find $t_U \in [0, 1]$ such that

$$
F_X[x(t_U)] = \sum_{i=0}^{n} B_{n,i}(t_U)\, z_i = U. \tag{12}
$$

[3] Deliver the variate

$$
X = x(t_U) = \sum_{i=0}^{n} B_{n,i}(t_U)\, x_i.
$$

Codes to implement this approach are available on the Web via

<www.ise.ncsu.edu/jwilson/page3>.
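
Steps [1] to [3] can be sketched in Python as follows. This is not the code distributed at the site above: the source does not say how the root in step [2] is found, so the sketch assumes plain bisection, and it assumes z0 = 0, zn = 1, and a c.d.f. coordinate that is nondecreasing in t. The names `bezier_coord` and `bezier_variate` are hypothetical.

```python
import random
from math import comb


def bezier_coord(coeffs, t):
    """Evaluate sum_i B_{n,i}(t) * coeffs[i] for one coordinate of the curve."""
    n = len(coeffs) - 1
    return sum(comb(n, i) * t ** i * (1 - t) ** (n - i) * c for i, c in enumerate(coeffs))


def bezier_variate(xs, zs, u=None, rng=random, tol=1e-12):
    """Generate one Bezier variate by inversion, solving (12) by bisection."""
    if u is None:
        u = rng.random()                  # step [1]
    lo, hi = 0.0, 1.0
    while hi - lo > tol:                  # step [2]: bracket t_U with F = u
        mid = 0.5 * (lo + hi)
        if bezier_coord(zs, mid) < u:
            lo = mid
        else:
            hi = mid
    t_u = 0.5 * (lo + hi)
    return bezier_coord(xs, t_u)          # step [3]


xs, zs = [0.0, 1.0, 3.0, 4.0], [0.0, 0.1, 0.9, 1.0]
print(bezier_variate(xs, zs, u=0.5), bezier_variate(xs, zs, rng=random.Random(7)))
```

Bisection is slow but robust; a Newton-type search would converge faster at the cost of needing safeguards near the endpoints.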


Using PRIME to Model Bézier Distributions

• PRIME (Wagner and Wilson 1996a) is a Windows-based system for fitting Bézier distributions to data or subjective information.

• PRIME is available on the previously mentioned Web site.

• Control points appear as indexed black squares that can be manipulated with the mouse and keyboard.

– Each control point exerts on the c.d.f. a “magnetic” attraction whose strength is given by the associated Bernstein polynomial (9).

– Moving a control point causes the displayed c.d.f. to be updated (nearly) instantaneously.


[Figure: PRIME Windows Showing the Bézier C.d.f. (Left Panel) with Its Control Points and the P.d.f. (Right Panel)]


PRIME includes the following methods for fitting Bézier distributions to sample data:

• OLS estimation of the c.d.f.;

• minimum $L_1$ and $L_\infty$ norm estimation of the c.d.f.;

• maximum likelihood estimation (assuming a and b are known);

• moment matching; and

• percentile matching.

For automatic estimation of the number of control points, see

Wagner, M. A. F., and J. R. Wilson. 1996b. Recent developments in input modeling with Bézier distributions. In Proceedings of the 1996 Winter Simulation Conference, pp. 1448–1456. Available online as

<www.ise.ncsu.edu/jwilson/files/bezwsc96.pdf>.


[Figure: Bézier Distribution Fitted to n = 9,980 Nafion Chain Lengths Using OLS Estimation of the C.d.f.]


Advantages of the Bézier distribution family:

• It is extremely flexible and can represent a wide diversity of distributional shapes, including multiple modes and mixed distributions.

• If data are available, then the likelihood ratio test of Wagner and Wilson (1996b) can be used with any of the available estimation methods to find automatically both the number and location of the control points.

• In the absence of data, PRIME can be used to determine the conceptualized distribution based on known quantitative or qualitative information.

• As the number (n + 1) of control points increases, so does the flexibility in fitting Bézier distributions.


III. Time-Dependent Arrival Processes

• Many simulation applications require high-fidelity input models of arrival processes with arrival rates that depend strongly on time.

• Nonhomogeneous Poisson processes (NHPPs) have been used successfully to model complex time-dependent arrival processes in many applications.

• An NHPP $\{N(t) : t \ge 0\}$ is a counting process such that

– $N(t)$ is the number of arrivals in the time interval $(0, t]$;

– $\lambda(t)$ is the instantaneous arrival rate at time $t$, and $\lambda(t)$ satisfies the Poisson postulates; and

– the (cumulative) mean-value function is given by

$$\mu(t) \equiv E[N(t)] = \int_0^t \lambda(z)\, dz \quad \text{for all } t \ge 0. \qquad (13)$$
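To make (13) concrete, the mean-value function can be approximated numerically for any given rate function. The sketch below uses the midpoint rule; the sinusoidal rate function and the 480-minute horizon are hypothetical choices made only for illustration.

```python
import math

def mean_value(rate, t, steps=10_000):
    """Approximate mu(t), the integral of rate(z) over (0, t], by the midpoint rule."""
    h = t / steps
    return h * sum(rate((j + 0.5) * h) for j in range(steps))

def example_rate(z):
    # Hypothetical arrival rate (arrivals per minute) with one cycle per 480 minutes
    return 0.5 + 0.4 * math.sin(2.0 * math.pi * z / 480.0)

# Expected number of arrivals over one full 480-minute cycle
mu_S = mean_value(example_rate, 480.0)
print(mu_S)
```

Because the sinusoidal term integrates to zero over a full cycle, the expected count over (0, 480] in this example is driven entirely by the constant term of the rate.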


• We discuss the nonparametric approach of Leemis (1991, 2000, 2004) for modeling and simulation of NHPPs; see

Leemis, L. M. 1991. Nonparametric estimation of the cumulative intensity function for a nonhomogeneous Poisson process. Management Science 37 (7): 886–900.

Leemis, L. M. 2000. Nonparametric estimation of the cumulative intensity function for a nonhomogeneous Poisson process from overlapping realizations. Management Science 46 (7): 989–998.

Leemis, L. M. 2004. Nonparametric estimation and variate generation for a nonhomogeneous Poisson process from event count data. IIE Transactions 36: 1155–1160.


• The context is a recent application to modeling and simulating unscheduled patient arrivals to a community healthcare clinic (Alexopoulos et al. 2005).

• Suppose we have a time interval (0, S] over which we observe several independent replications (realizations) of a stream of unscheduled patient arrivals constituting an NHPP with arrival rate λ(t) for t ∈ (0, S].

For example, (0, S] might represent the time period on each weekday during which unscheduled patients may walk into a clinic (say, between 9 A.M. and 5 P.M., so that S = 480 minutes).


• Suppose k realizations of the arrival stream over (0, S] have been recorded so that we have

– $n_i$ patient arrivals in the ith realization for i = 1, 2, . . . , k; and

– $n = \sum_{i=1}^{k} n_i$ patient arrivals accumulated over all realizations.


• Let $\{t_{(i)} : i = 1, \ldots, n\}$ denote the overall set of arrival times for all unscheduled patients expressed as an offset from the beginning of (0, S] and then sorted in increasing order.

For example, if we observed n = 250 patient arrivals over k = 5 days, each with an observation interval of length S = 480 minutes, then

– $t_{(1)} = 2.5$ minutes means that over all 5 days, the earliest arrival occurred 2.5 minutes after the clinic opened on one of those days;

– $t_{(2)} = 4.73$ minutes means that the second-earliest arrival occurred 4.73 minutes after the clinic opened on one of those days; and

– $t_{(n)} = 478.5$ minutes means that the latest arrival occurred 478.5 minutes after the clinic opened on one of those days.


• We estimate the mean-value function µ(t) as follows.

– We take $t_{(0)} \equiv 0$ and $t_{(n+1)} \equiv S$.

– For $t_{(i)} < t \le t_{(i+1)}$ and i = 0, 1, . . . , n, we take

$$\hat{\mu}(t) = \frac{i\,n}{(n+1)k} + \frac{n\,[t - t_{(i)}]}{(n+1)k\,[t_{(i+1)} - t_{(i)}]}. \qquad (14)$$

• Equation (14) provides a basis for modeling and simulating unscheduled patient-arrival streams when the arrival rate exhibits a strong dependence on time.
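The estimator in (14) is a piecewise-linear interpolation through the pooled, sorted arrival times, so it takes only a few lines to evaluate. The following Python sketch is one possible implementation under that reading; the function name `make_mu_hat` and the sample data are hypothetical.

```python
from bisect import bisect_left

def make_mu_hat(realizations, S):
    """Build the estimator (14) from k lists of arrival times observed on (0, S]."""
    k = len(realizations)
    pooled = sorted(a for r in realizations for a in r)  # t_(1) <= ... <= t_(n)
    n = len(pooled)
    knots = [0.0] + pooled + [S]                         # t_(0) = 0, t_(n+1) = S

    def mu_hat(time):
        if time <= 0.0:
            return 0.0
        if time >= S:
            return n / k
        # Index i such that t_(i) < time <= t_(i+1)
        i = bisect_left(knots, time) - 1
        frac = (time - knots[i]) / (knots[i + 1] - knots[i])
        return n * (i + frac) / ((n + 1) * k)

    return mu_hat

# Hypothetical data: k = 2 realizations on (0, 4]
mu_hat = make_mu_hat([[1.0, 3.0], [2.0]], 4.0)
print(mu_hat(2.5), mu_hat(4.0))
```

At t = S the estimate equals n/k, the average number of arrivals per realization, which agrees with (14) evaluated at i = n.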


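[Figure: Nonparametric Estimator of Mean Value Function. The piecewise-linear estimate $\hat{\mu}(t)$ starts at (0, 0), reaches the heights $n/[(n+1)k]$, $2n/[(n+1)k]$, $3n/[(n+1)k]$, . . . , $n^2/[(n+1)k]$ at $t_{(1)}, t_{(2)}, t_{(3)}, \ldots, t_{(n)}$, and ends at $n/k$ at $t_{(n+1)} \equiv S$.]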


Goodness-of-fit Testing for the Fitted Mean-Value Function

• In addition to the realizations of the target arrival process that were used to compute the estimated mean-value function $\hat{\mu}(t)$, suppose we observe one additional realization $\{A'_i : i = 1, 2, \ldots, n'\}$ independently of the previously observed realizations, with the ith patient arriving at time $A'_i$ for i = 1, . . . , n′.

• If the target arrival stream is an NHPP with mean-value function µ(t) for t ∈ (0, S], then the transformed arrival times $\{B'_i = \mu(A'_i) : i = 1, 2, \ldots, n'\}$ constitute a homogeneous Poisson process with an arrival rate of 1.


• If the target arrival stream is an NHPP with mean-value function µ(t) for t ∈ (0, S], then the corresponding transformed interarrival times $\{X'_i = B'_i - B'_{i-1} : i = 1, 2, \ldots, n'\}$ (with $B'_0 \equiv 0$) constitute a random sample from an exponential distribution with a mean of 1.

• To test the adequacy of the fitted mean-value function $\hat{\mu}(t)$ as an approximation to µ(t), apply the Kolmogorov-Smirnov test to the data set $\{X''_i = \hat{\mu}(A'_i) - \hat{\mu}(A'_{i-1}) : i = 1, 2, \ldots, n'\}$ (with $A'_0 \equiv 0$), where the hypothesized c.d.f. in the goodness-of-fit test is

$$F_{X''_i}(x) = 1 - e^{-x} \quad \text{for all } x \ge 0.$$
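The test can be sketched in Python as follows. The sketch assumes that the fitted mean-value function is available as a callable `mu_hat` that returns 0 at time 0. The function names are hypothetical, and the rejection rule uses the large-sample 5% critical value of approximately 1.358/√m for a sample of size m, which is an approximation introduced here rather than something specified in the tutorial.

```python
import math

def transformed_interarrivals(mu_hat, arrivals):
    """Return X''_i = mu_hat(A'_i) - mu_hat(A'_{i-1}) with A'_0 = 0."""
    prev = 0.0
    out = []
    for a in arrivals:
        cur = mu_hat(a)
        out.append(cur - prev)
        prev = cur
    return out

def ks_statistic_exp1(sample):
    """Kolmogorov-Smirnov statistic against the c.d.f. F(x) = 1 - exp(-x)."""
    xs = sorted(sample)
    m = len(xs)
    d = 0.0
    for j, x in enumerate(xs, start=1):
        f = 1.0 - math.exp(-x)
        # Compare F with the empirical c.d.f. just after and just before x
        d = max(d, j / m - f, f - (j - 1) / m)
    return d

def ks_reject_5pct(d, m):
    """Approximate large-sample decision at the 5% level (assumed critical value)."""
    return d > 1.358 / math.sqrt(m)

# Hypothetical check with a stand-in fitted function mu_hat(t) = 0.5 * t
x2 = transformed_interarrivals(lambda t: 0.5 * t, [1.2, 2.0, 5.5, 6.1, 9.0])
d = ks_statistic_exp1(x2)
print(d, ks_reject_5pct(d, len(x2)))
```

For small n′, exact Kolmogorov-Smirnov critical values from a table or a statistics library would be more appropriate than the large-sample approximation.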


Generating Realizations of the Fitted NHPP

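[1] Set $i \leftarrow 1$ and $N \leftarrow 0$.
[2] Generate $U_i \sim \text{Uniform}(0, 1)$.
[3] Set $B_i \leftarrow -\ln(1 - U_i)$.
[4] While $B_i < n/k$ do
    Begin
        Set $m \leftarrow \left\lfloor \dfrac{(n+1)k B_i}{n} \right\rfloor$;
        Set $A_i \leftarrow t_{(m)} + \{ t_{(m+1)} - t_{(m)} \} \left\{ \dfrac{(n+1)k B_i}{n} - m \right\}$;
        Set $N \leftarrow N + 1$; Set $i \leftarrow i + 1$;
        Generate $U_i \sim \text{Uniform}(0, 1)$;
        Set $B_i \leftarrow B_{i-1} - \ln(1 - U_i)$.
    End

NHPP Simulation Procedure of Leemis (1991)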
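The procedure above translates almost line for line into Python. The sketch below is an illustration rather than a reference implementation; the function name, the `rng` argument, and the guard that caps m at n (to protect against floating-point rounding when B is just below n/k) are additions made here.

```python
import math
import random

def generate_nhpp(t_sorted, k, S, rng=random):
    """Generate one realization on (0, S] from the fitted mean-value function.

    t_sorted: pooled, sorted arrival times t_(1), ..., t_(n) from k realizations.
    rng: any object with a random() method returning a value in [0, 1).
    """
    n = len(t_sorted)
    knots = [0.0] + list(t_sorted) + [S]       # t_(0) = 0, t_(n+1) = S
    arrivals = []
    b = -math.log(1.0 - rng.random())          # steps [2] and [3]
    while b < n / k:                           # step [4]
        scaled = (n + 1) * k * b / n
        m = min(math.floor(scaled), n)         # guard against rounding up to n + 1
        arrivals.append(knots[m] + (knots[m + 1] - knots[m]) * (scaled - m))
        b -= math.log(1.0 - rng.random())      # next unit-rate Poisson event time
    return arrivals

# Hypothetical pooled data from k = 2 realizations on (0, 4]
print(generate_nhpp([1.0, 2.0, 3.0], 2, 4.0))
```

The number of arrivals N in the pseudocode corresponds to the length of the returned list.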


Advantages of Leemis’s Nonparametric Approach to Modeling and Simulation of NHPPs

• It does not require the assumption of any particular form for the arrival rate λ(t) as a function of t.

• It provides a strongly consistent estimator of the mean-value function; that is,

$$\lim_{k \to \infty} \hat{\mu}(t) = \mu(t) \quad \text{for all } t \in (0, S] \text{ with probability 1}.$$

• The simulation algorithm given above, which is based on inversion of $\hat{\mu}(t)$ so that

$$A_i = \hat{\mu}^{-1}(B_i) \quad \text{for } i = 1, \ldots, N,$$

is also asymptotically valid as k → ∞.
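The link between the estimator (14) and the update for $A_i$ in the simulation procedure can be made explicit with a short derivation. It is offered as a worked check rather than as part of the original tutorial; it assumes only (14) and the definition of m in the procedure.

```latex
% For 0 <= B < n/k, let m = floor((n+1)kB/n), so that
%   m n / ((n+1)k) <= B < (m+1) n / ((n+1)k),
% i.e., B lies in the range of \hat{\mu} on the segment [t_{(m)}, t_{(m+1)}].
% Setting \hat{\mu}(A) = B in (14) with i = m and solving for A:
\begin{align*}
B &= \frac{m\,n}{(n+1)k}
     + \frac{n\,[A - t_{(m)}]}{(n+1)k\,[t_{(m+1)} - t_{(m)}]} \\[4pt]
\frac{(n+1)k\,B}{n} - m &= \frac{A - t_{(m)}}{t_{(m+1)} - t_{(m)}} \\[4pt]
A &= t_{(m)} + \bigl[t_{(m+1)} - t_{(m)}\bigr]
     \left\{ \frac{(n+1)k\,B}{n} - m \right\}
   = \hat{\mu}^{-1}(B).
\end{align*}
```

The last line is exactly the assignment to $A_i$ in step [4] of the procedure, and the stopping condition $B_i < n/k$ corresponds to $\hat{\mu}(S) = n/k$.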


Application to Organ Transplantation Policy Analysis

• The United Network for Organ Sharing (UNOS) applied a simplified variant of this approach in the development and use of the UNOS Liver Allocation Model (ULAM) for analyzing the cadaveric liver-allocation system in the U.S. (see Harper et al. 2000).

• ULAM incorporated models of

(a) the streams of liver-transplant patients arriving at 115 transplant centers, and

(b) the streams of donated organs arriving at 61 organ procurement organizations in the United States.

Virtually all these arrival streams exhibited long-term trends as well as strong dependencies on the time of day, the day of the week, and the geographic location of the arrival stream.


Handling Arrival Processes Having Trends and Cyclic Effects

• Kuhl, Sumant, and Wilson (2006) develop a “semiparametric” method for modeling and simulating arrival processes that may exhibit a long-term trend or nested periodic phenomena (such as daily and weekly cycles), where the latter effects might not necessarily possess the symmetry of sinusoidal oscillations.

• See

Kuhl, M. E., S. G. Sumant, and J. R. Wilson. 2006. An automated multiresolution procedure for modeling complex arrival processes. INFORMS Journal on Computing 18 (1): 3–18. Available online via

<www.ise.ncsu.edu/jwilson/files/kuhl06joc.pdf>


[Figure: Fitted Rate Function over 100 Replications of a Test Process with One Cyclic Rate Component and Long-term Trend. Horizontal axis: Time t (0 to 10); vertical axis: Arrival Rate (0 to 700).]


[Figure: Fitted Mean-Value Function over 100 Replications of a Test Process with One Cyclic Rate Component and Long-term Trend. Horizontal axis: Time t (0 to 10); vertical axis: Cumulative Mean Arrivals (0 to 1200).]


Web-based Input Modeling Software

<www.rit.edu/~kuhl1/simulation>


IV. Conclusions and Recommendations

• The common thread running through this tutorial is the focus on robust input models that are

– computationally tractable and

– sufficiently flexible to represent adequately many of the probabilistic phenomena that arise in applications of discrete-event stochastic simulation.

• Notably absent is a discussion of Bayesian input-modeling techniques, a topic that will receive increasing attention in the future.

• Additional material on input modeling is available via

<www.ise.ncsu.edu/jwilson/more_info>.
