Simulation Modeling and Analysis: Output Analysis
Dec 22, 2015
2
Outline
• Stochastic Nature of Output
• Taxonomy of Simulation Outputs
• Measures of Performance
  – Point Estimation
  – Interval Estimation
• Output Analysis in Terminating Simulations
• Output Analysis in Steady-state Simulations
3
Introduction
• Output Analysis
  – Analysis of data produced by simulation
• Goal
  – To predict system performance
  – To compare alternatives
• Why is it needed?
  – To evaluate the precision of the simulation performance parameter as an estimator
4
Introduction -contd
• Each simulation run is a sample point
• Attempts to increase the sample size by increasing run length may fail because of autocorrelation
• Initial conditions affect the output
5
Stochastic Nature of Output Data
• Model Input Variables are Random Variables
• The Model Transforms Input into Output
• Output Data are Random Variables
• Replications of a model run can be obtained by repeating the run using different random number streams
6
Example: M/G/1 Queue
• Poisson arrivals with average rate $\lambda = 0.1$ per minute
• Service times Normal with mean $\mu = 9.5$ minutes and standard deviation $\sigma = 1.75$ minutes
• Runs
  – One 5000-minute run
  – Five 1000-minute runs with 3 replications each
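The idea of obtaining replications by changing the random number stream can be sketched in Python. This is a minimal sketch, not the model used for the slides: it assumes a FIFO single server that starts empty and idle, tracks waiting time in queue with the Lindley recursion, and truncates the rare negative Normal service time at zero. The function name `mg1_mean_wait` and the seeds are hypothetical.

```python
import random

def mg1_mean_wait(run_length, seed, lam=0.1, mu=9.5, sigma=1.75):
    """Average waiting time in queue for customers arriving in [0, run_length)."""
    rng = random.Random(seed)          # one random number stream per replication
    clock = rng.expovariate(lam)       # time of the first arrival
    wait = 0.0                         # first customer finds the system empty
    total, count = 0.0, 0
    while clock < run_length:
        total += wait
        count += 1
        service = max(0.0, rng.gauss(mu, sigma))   # assumed truncation at zero
        gap = rng.expovariate(lam)                  # time to the next arrival
        # Lindley recursion: W_{k+1} = max(0, W_k + S_k - A_{k+1})
        wait = max(0.0, wait + service - gap)
        clock += gap
    return total / count if count else 0.0

# Three replications of a 5000-minute run, each with its own stream
replications = [mg1_mean_wait(5000.0, seed) for seed in (1, 2, 3)]
print(replications)
```

Each replication returns a different value for the same model and run length, which is the stochastic behavior the preceding slides describe.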
7
Taxonomy of Simulation Outputs
• Terminating (Transient) Simulations
  – Runs until a terminating event takes place
  – Uses well-specified initial conditions
• Non-terminating (Steady-state) Simulations
  – Runs continually or over a very long time
  – Results must be independent of initial data
  – Termination?
• What determines the type of simulation?
Examples: Non-terminating Systems
• Many shifts of a widget manufacturing process.
• Expansion in workload of a computer service bureau.
8
9
Measures of Performance: Point Estimation
• Means
• Proportions
• Quantiles
10
Measures of Performance: Point Estimation (Discrete-time Data)
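• Point estimator of $\theta$ (of $\phi$ for continuous-time data) based on the discrete-time simulation output $(Y_1, Y_2, \ldots, Y_n)$
  $\hat{\theta} = \frac{1}{n} \sum_{i=1}^{n} Y_i$
• Unbiased point estimator
  $E(\hat{\theta}) = \theta$
• Bias
  $b = E(\hat{\theta}) - \theta$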
11
Measures of Performance: Point Estimation (Continuous-time Data)
• Point estimator of $\phi$ based on the continuous-time simulation output $(Y(t),\ 0 < t < T_e)$
  $\hat{\phi} = \frac{1}{T_e} \int_{0}^{T_e} Y(t)\, dt$
• Unbiased point estimator
  $E(\hat{\phi}) = \phi$
• Bias
  $b = E(\hat{\phi}) - \phi$
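For simulation output such as queue length, Y(t) is piecewise constant, so the integral in the continuous-time estimator reduces to a sum of rectangle areas. The sketch below assumes that form; the function name `time_average` and the sample trajectory are hypothetical.

```python
def time_average(event_times, values, t_end):
    """Time average of a piecewise-constant Y(t) on [0, t_end].

    values[k] is the level of Y(t) from event_times[k] until the next
    event time (or until t_end for the last entry); event_times[0] is 0.
    """
    area = 0.0
    for k, level in enumerate(values):
        start = event_times[k]
        stop = event_times[k + 1] if k + 1 < len(event_times) else t_end
        area += level * (stop - start)   # rectangle under Y(t)
    return area / t_end

# Queue length 0 on [0,2), 1 on [2,5), 2 on [5,6), 1 on [6,10]
# area = 0*2 + 1*3 + 2*1 + 1*4 = 9, so the time average is 0.9
print(time_average([0, 2, 5, 6], [0, 1, 2, 1], 10))
```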
12
Measures of Performance: Interval Estimation (Discrete-time Data)
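• Variance and variance estimator
  $\sigma^2(\hat{\theta})$ = true variance of the point estimator $\hat{\theta}$
  $\hat{\sigma}^2(\hat{\theta})$ = estimator of the variance of the point estimator
• Bias (in variance estimation)
  $B = E[\hat{\sigma}^2(\hat{\theta})] / \sigma^2(\hat{\theta})$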
13
Measures of Performance: Interval Estimation - contd
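• If $B \approx 1$, then $t = (\hat{\theta} - \theta)/\hat{\sigma}(\hat{\theta})$ has approximately a t distribution with f degrees of freedom, where $\hat{\sigma}(\hat{\theta}) = \sqrt{\hat{\sigma}^2(\hat{\theta})}$
• A $100(1 - \alpha)\%$ confidence interval for $\theta$ is
  $\hat{\theta} - t_{\alpha/2,f}\, \hat{\sigma}(\hat{\theta}) < \theta < \hat{\theta} + t_{\alpha/2,f}\, \hat{\sigma}(\hat{\theta})$
• Cases
  – Statistically independent observations
  – Statistically dependent observations (time series)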
14
Measures of Performance: Interval Estimation - contd
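• Statistically independent observations
  – Sample variance
    $S^2 = \sum_{i=1}^{n} (Y_i - \hat{\theta})^2 / (n-1)$
  – Unbiased estimator of $\sigma^2(\hat{\theta})$
    $\hat{\sigma}^2(\hat{\theta}) = S^2/n$
  – Standard error of the point estimator
    $\hat{\sigma}(\hat{\theta}) = S/\sqrt{n}$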
15
Measures of Performance: Interval Estimation - contd
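• Statistically dependent observations
  – Variance of $\hat{\theta}$
    $\sigma^2(\hat{\theta}) = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} \mathrm{cov}(Y_i, Y_j)$
  – Lag-k autocovariance
    $\gamma_k = \mathrm{cov}(Y_i, Y_{i+k})$
  – Lag-k autocorrelation
    $\rho_k = \gamma_k / \gamma_0$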
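The definitions above are in terms of true covariances. In practice they are estimated from the output series. The sketch below uses one common sample estimator (divisor n), which is an assumption here and not something the slides specify; the function names are hypothetical.

```python
def autocovariance(y, k):
    """Sample lag-k autocovariance of the series y (divisor n)."""
    n = len(y)
    mean = sum(y) / n
    return sum((y[i] - mean) * (y[i + k] - mean) for i in range(n - k)) / n

def autocorrelation(y, k):
    """Sample lag-k autocorrelation: lag-k autocovariance over lag-0."""
    return autocovariance(y, k) / autocovariance(y, 0)

trend = [1, 2, 3, 4, 5]          # neighbors move together
zigzag = [1, -1, 1, -1]          # neighbors move in opposite directions
print(autocorrelation(trend, 1))    # positive
print(autocorrelation(zigzag, 1))   # negative
```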
16
Measures of Performance: Interval Estimation - contd
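• Statistically dependent observations (contd)
  – Variance of $\hat{\theta}$
    $\sigma^2(\hat{\theta}) = \frac{\gamma_0}{n} \left[ 1 + 2 \sum_{k=1}^{n-1} \left(1 - \frac{k}{n}\right) \rho_k \right] = \frac{\gamma_0}{n}\, c$
  – Positively autocorrelated time series ($\rho_k > 0$)
  – Negatively autocorrelated time series ($\rho_k < 0$)
  – Bias (in variance estimation)
    $B = E(S^2/n) / \sigma^2(\hat{\theta}) = (n/c - 1)/(n - 1)$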
17
Measures of Performance: Interval Estimation - contd
• Statistically dependent observations (contd)
• Cases
  – Independent data: $\rho_k = 0$, $c = 1$, $B = 1$
  – Positively correlated data: $\rho_k > 0$, $c > 1$, $B < 1$; $S^2/n$ is biased low (underestimation)
  – Negatively correlated data: $\rho_k < 0$, $c < 1$, $B > 1$; $S^2/n$ is biased high (overestimation)
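The three cases can be checked numerically from the formulas for c and B. The sketch below assumes, purely for illustration, a geometric autocorrelation function $\rho_k = \varphi^k$ (as in a first-order autoregressive series); the slides do not prescribe any particular form. With negative $\varphi$ the correlations alternate in sign, but the negative lag-1 term dominates, so c still falls below 1.

```python
def variance_inflation(rho, n):
    """c = 1 + 2 * sum_{k=1}^{n-1} (1 - k/n) * rho(k)."""
    return 1.0 + 2.0 * sum((1.0 - k / n) * rho(k) for k in range(1, n))

def variance_bias(rho, n):
    """B = (n/c - 1) / (n - 1), the bias ratio of S^2/n."""
    c = variance_inflation(rho, n)
    return (n / c - 1.0) / (n - 1.0)

n = 100
for label, phi in (("independent", 0.0), ("positive", 0.8), ("negative", -0.5)):
    rho = lambda k, phi=phi: phi ** k    # assumed geometric decay
    print(label, variance_inflation(rho, n), variance_bias(rho, n))
```

For the positively correlated case B comes out far below 1, which shows how badly S²/n can understate the variance of the sample mean when autocorrelation is ignored.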
18
Output Analysis for Terminating Simulations
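• Method of independent replications
  – n = sample size
  – Number of replications: r = 1, 2, …, R
  – $Y_{ji}$ is the i-th observation in replication j
  – $Y_{ji}$, $Y_{jk}$ (same replication) are autocorrelated
  – $Y_{ri}$, $Y_{sk}$ (different replications, $r \ne s$) are statistically independent
  – Estimator of mean (r = 1, 2, …, R)
    $\hat{\theta}_r = \frac{1}{n_r} \sum_{i=1}^{n_r} Y_{ri}$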
19
Output Analysis for Terminating Simulations - contd
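• Confidence Interval (R fixed; discrete data)
  – Overall point estimate
    $\hat{\theta} = \frac{1}{R} \sum_{r=1}^{R} \hat{\theta}_r$
  – Variance estimate
    $\hat{\sigma}^2(\hat{\theta}) = \frac{1}{(R-1)R} \sum_{r=1}^{R} (\hat{\theta}_r - \hat{\theta})^2$
  – Standard error of the point estimator
    $\hat{\sigma}(\hat{\theta}) = \sqrt{\hat{\sigma}^2(\hat{\theta})}$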
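The across-replication calculation can be sketched as follows. The replication means in the example are made-up numbers, the function name is hypothetical, and the critical value $t_{\alpha/2,R-1}$ is passed in by the caller (from a table or a statistics library) to keep the sketch dependency-free.

```python
import math

def replication_interval(rep_means, t_crit):
    """Point estimate, standard error and confidence interval
    from R independent replication means.

    t_crit is t_{alpha/2, R-1}, supplied by the caller.
    """
    R = len(rep_means)
    theta_hat = sum(rep_means) / R
    # Variance estimate of the overall mean: sum of squares / ((R-1) R)
    var_hat = sum((m - theta_hat) ** 2 for m in rep_means) / ((R - 1) * R)
    std_err = math.sqrt(var_hat)
    half = t_crit * std_err
    return theta_hat, std_err, (theta_hat - half, theta_hat + half)

# Hypothetical replication means; t_{0.025,3} = 3.182 for a 95% interval
print(replication_interval([3.0, 5.0, 4.0, 4.0], t_crit=3.182))
```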
20
Output Analysis for Terminating Simulations - contd
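• Estimator and Interval (R fixed; continuous data)
  – Estimator of mean (r = 1, 2, …, R)
    $\hat{\phi}_r = \frac{1}{T_e} \int_{0}^{T_e} Y_r(t)\, dt$
  – Overall point estimate
    $\hat{\phi} = \frac{1}{R} \sum_{r=1}^{R} \hat{\phi}_r$
  – Variance estimate
    $\hat{\sigma}^2(\hat{\phi}) = \frac{1}{(R-1)R} \sum_{r=1}^{R} (\hat{\phi}_r - \hat{\phi})^2$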
21
Output Analysis in Terminating Simulations - contd
• Confidence Intervals with Specified Precision
• Half-length of the confidence interval (h.l.)
  $\text{h.l.} = t_{\alpha/2,f}\, \hat{\sigma}(\hat{\theta}) = t_{\alpha/2,f}\, S/\sqrt{R} < \epsilon$
• Required number of replications
  $R^* > (z_{\alpha/2}\, S_0 / \epsilon)^2$
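A first estimate of the required number of replications follows directly from the last formula. The sketch below assumes $S_0$ is a standard deviation estimate taken from a few pilot replications (an interpretation, since the slide does not define it) and rounds up to the next integer. Because $t_{\alpha/2,R-1} \ge z_{\alpha/2}$, the result is a starting value; the half-length should still be checked against $\epsilon$ with the t quantile once the runs are made.

```python
import math
from statistics import NormalDist

def initial_replications(s0, epsilon, alpha=0.05):
    """Smallest integer R with R >= (z_{alpha/2} * s0 / epsilon)^2."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # standard normal quantile
    return math.ceil((z * s0 / epsilon) ** 2)

# Hypothetical pilot estimate S0 = 2.0 and target half-length 0.5
print(initial_replications(s0=2.0, epsilon=0.5))
```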
22
Output Analysis for Steady State Simulations
• Let $(Y_1, Y_2, \ldots, Y_n)$ be an autocorrelated time series
• Long-run measure of performance to be estimated (independent of initial conditions)
  $\theta = \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} Y_i$
• Sample size n (or $T_e$) is a design choice.
23
Output Analysis for Steady State Simulations -contd
• Considerations affecting the choice of n
  – Estimator bias due to initial conditions
  – Desired precision of point estimator
  – Budget/computer constraints
24
Output Analysis for Steady State Simulations -contd
• Initialization bias and initialization methods
  – Intelligent initialization
    • Using actual field data
    • Using data from a simpler model
  – Use of phases in simulation
    • Initialization phase ($0 < t < T_0$; for i = 1, 2, …, d)
    • Data collection phase ($T_0 < t < T_e$; for i = d+1, d+2, …, n)
    • Rule of thumb: $(n - d) > 10d$
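The two-phase idea for discrete-time output can be sketched as below: the first d observations are discarded and the rule of thumb is enforced before averaging. The function name and the sample run are hypothetical, and treating a violation as an error is a design choice of this sketch.

```python
def delete_warmup(observations, d):
    """Drop the first d observations (initialization phase) and
    average the remaining n - d (data collection phase)."""
    n = len(observations)
    if n - d <= 10 * d:
        raise ValueError("rule of thumb violated: need (n - d) > 10 d")
    kept = observations[d:]
    return sum(kept) / (n - d)

# Hypothetical run: 2 warm-up observations followed by 22 retained ones
run = [0.0, 1.0] + [5.0] * 22
print(delete_warmup(run, d=2))
```

In this made-up run the raw average would be pulled down by the two start-up values, while the truncated average reflects only the later observations.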
25
Output Analysis for Steady State Simulations -contd
• Example: M/G/1 queue
  – Batched data
  – Batched means
  – Averaging batch means within a replication (i.e., along the batches)
  – Averaging batch means within a batch (i.e., along the replications)
26
Steady State Simulations: Replication Method
• Cases
  1. $Y_{rj}$ is an individual observation from within a replication
  2. $Y_{rj}$ is a batch mean of discrete data from within a replication
  3. $Y_{rj}$ is a batch mean of continuous data over a given interval
27
Steady State Simulations: Replication Method -contd
• Sample average for replication r of all (nondeleted) observations
  $\bar{Y}_r(n,d) = \bar{Y}_r = \frac{1}{n-d} \sum_{j=d+1}^{n} Y_{rj}$
• Replication averages are independent and identically distributed RVs
• Overall point estimator
  $\bar{Y}(n,d) = \bar{Y} = \frac{1}{R} \sum_{r=1}^{R} \bar{Y}_r(n,d)$
28
Steady State Simulations: Replication Method -contd
• Sample Variance
S2 = [1/(R-1)] r=1R (Y*r - Y*)
• Standard error = S/ R
• 100(1-)% Confidence interval
Y* - t /2,R-1 S/ R < < Y* + t /2,R-1 S/ R
29
Steady State Simulations: Sample Size
• Greater precision can be achieved by
  – Increasing the run length
  – Increasing the number of replications
30
Steady State Simulations: Batch Means for Interval Estimation
• Single, long replication with batches
  – Batch means treated as if they were independent
  – Batch means (continuous)
    $\bar{Y}_j = \frac{1}{m} \int_{(j-1)m}^{jm} Y(t)\, dt$
  – Batch means (discrete)
    $\bar{Y}_j = \frac{1}{m} \sum_{i=(j-1)m+1}^{jm} Y_i$
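The discrete batch-means calculation can be sketched as below. The interval function treats the batch means as if they were independent, as the slide states, and applies the usual t interval to them; the function names and data are hypothetical, and observations left over after the last full batch are dropped as a simplification.

```python
import math

def batch_means(y, m):
    """Means of consecutive non-overlapping batches of size m."""
    k = len(y) // m                      # number of complete batches
    return [sum(y[j * m:(j + 1) * m]) / m for j in range(k)]

def batch_means_interval(y, m, t_crit):
    """Grand mean and confidence interval from batch means.

    t_crit is t_{alpha/2, k-1} for k batches, supplied by the caller.
    """
    means = batch_means(y, m)
    k = len(means)
    grand = sum(means) / k
    s2 = sum((b - grand) ** 2 for b in means) / (k - 1)
    half = t_crit * math.sqrt(s2 / k)
    return grand, (grand - half, grand + half)

y = [1, 3, 2, 4, 3, 5, 4, 6]             # hypothetical output series
print(batch_means(y, 2))                  # [2.0, 3.0, 4.0, 5.0]
print(batch_means_interval(y, 2, t_crit=3.182))   # t_{0.025,3}
```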
31
Steady State Simulations: Batch Size Selection Guidelines
• Number of batches < 30
• Diagnose correlation with the lag-1 autocorrelation obtained from a large number of batch means based on a smaller batch size
• If the total sample size is to be selected sequentially, allow the batch size and the number of batches to grow with the run length.