Simulation Modeling and Analysis
Output Analysis
Outline
• Stochastic Nature of Output
• Taxonomy of Simulation Outputs
• Measures of Performance
– Point Estimation
– Interval Estimation
• Output Analysis in Terminating Simulations
• Output Analysis in Steady-state Simulations
Introduction
• Output Analysis
– Analysis of data produced by simulation
• Goal
– To predict system performance
– To compare alternatives
• Why is it needed?
– To evaluate the precision of each simulated performance measure as an estimator of the true system parameter
Introduction -contd
• Each simulation run is a sample point
• Attempts to increase the sample size by increasing run length may fail because of autocorrelation
• Initial conditions affect the output
Stochastic Nature of Output Data
• Model Input Variables are Random Variables
• The Model Transforms Input into Output
• Output Data are Random Variables
• Replications of a model run can be obtained by repeating the run using different random number streams
Example: M/G/1 Queue
• Arrivals Poisson with average rate $\lambda = 0.1$ per minute
• Service times Normal with $\mu = 9.5$ minutes and $\sigma = 1.75$ minutes
• Runs
– One 5000 minute run
– Five 1000 minute runs w/ 3 replications each
Taxonomy of Simulation Outputs
• Terminating (Transient) Simulations
– Runs until a terminating event takes place
– Uses well specified initial conditions
• Non-terminating (Steady-state) Simulations
– Runs continually or over a very long time
– Results must be independent of initial data
– Termination?
• What determines the type of simulation?
Examples: Non-terminating Systems
• Many shifts of a widget manufacturing process.
• Expansion in workload of a computer service bureau.
Measures of Performance: Point Estimation
• Means
• Proportions
• Quantiles
Measures of Performance: Point Estimation (Discrete-time Data)
• Point estimator $\theta^*$ (of $\theta$) based on the simulation discrete-time output $(Y_1, Y_2, \ldots, Y_n)$
$\theta^* = (1/n) \sum_{i=1}^{n} Y_i$
• Unbiased point estimator
$E(\theta^*) = \theta$
• Bias
$b = E(\theta^*) - \theta$
Measures of Performance: Point Estimation (Continuous-time data)
• Point estimator $\theta^*$ (of $\theta$) based on the simulation continuous-time output $(Y(t),\ 0 < t < T_e)$
$\theta^* = (1/T_e) \int_0^{T_e} Y(t)\,dt$
• Unbiased point estimator
$E(\theta^*) = \theta$
• Bias
$b = E(\theta^*) - \theta$
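The two point estimators can be sketched in Python as below. This is a minimal illustration, assuming the continuous-time output Y(t) is piecewise constant (as a queue length would be); the function names and the sample data are hypothetical.

```python
def discrete_mean(ys):
    """Point estimate (1/n) * sum(Y_i) for discrete-time output."""
    return sum(ys) / len(ys)


def time_average(times, values, t_end):
    """Point estimate (1/T_e) * integral of Y(t) dt for piecewise-constant Y(t).

    times[k] is the instant at which Y(t) jumps to values[k]; times[0] must be 0.
    """
    area = 0.0
    for k, (t, y) in enumerate(zip(times, values)):
        # The current value holds until the next jump, or until T_e for the last one.
        t_next = times[k + 1] if k + 1 < len(times) else t_end
        area += y * (t_next - t)
    return area / t_end


# Hypothetical data: waiting times of 4 customers, and a queue-length path on [0, 10].
waits = [0.0, 2.5, 1.0, 4.5]
print(discrete_mean(waits))                                   # 2.0
print(time_average([0, 3, 5, 9], [0, 1, 2, 1], t_end=10.0))   # 1.1
```

The discrete estimator weights every observation equally, while the continuous one weights each value by how long it persisted.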
Measures of Performance: Interval Estimation (Discrete-time Data)
• Variance and variance estimator
$\sigma^2(\theta^*)$ = true variance of point estimator
$\sigma^{2*}(\theta^*)$ = estimator of variance of point estimator
• Bias (in variance estimation)
$B = E(\sigma^{2*}(\theta^*)) / \sigma^2(\theta^*)$
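Measures of Performance: Interval Estimation - contd
• If $B \approx 1$, then $t = (\theta^* - \theta)/\sqrt{\sigma^{2*}(\theta^*)}$ has approximately a t distribution with f degrees of freedom (d.o.f. = f).
• A $100(1-\alpha)\%$ confidence interval for $\theta$ is therefore
$\theta^* - t_{\alpha/2,f} \sqrt{\sigma^{2*}(\theta^*)} < \theta < \theta^* + t_{\alpha/2,f} \sqrt{\sigma^{2*}(\theta^*)}$
• Cases
– Statistically independent observations
– Statistically dependent observations (time series).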
Measures of Performance: Interval Estimation - contd
• Statistically independent observations
– Sample variance
$S^2 = \sum_{i=1}^{n} (Y_i - \theta^*)^2 / (n-1)$
– Unbiased estimator of $\sigma^2(\theta^*)$
$\sigma^{2*}(\theta^*) = S^2/n$
– Standard error of the point estimator
$\sigma^*(\theta^*) = S/\sqrt{n}$
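For statistically independent observations, the sample variance, standard error and t-based confidence interval can be sketched as follows. The t quantile is passed in by the caller (for example from a t table) rather than computed, and the function name and data are hypothetical.

```python
import math


def mean_confidence_interval(ys, t_crit):
    """Return (mean, standard error, lower, upper) for independent observations.

    t_crit is the t_{alpha/2, n-1} quantile, supplied by the caller
    (for example from a t table); it is not computed here.
    """
    n = len(ys)
    mean = sum(ys) / n
    s2 = sum((y - mean) ** 2 for y in ys) / (n - 1)   # sample variance S^2
    se = math.sqrt(s2 / n)                            # standard error S / sqrt(n)
    return mean, se, mean - t_crit * se, mean + t_crit * se


# Hypothetical data; 2.776 is the tabulated t_{0.025, 4} value for n = 5.
data = [4.0, 6.0, 5.0, 7.0, 3.0]
mean, se, lower, upper = mean_confidence_interval(data, t_crit=2.776)
print(mean, se, lower, upper)
```

This interval is only trustworthy when the observations really are independent; applied to autocorrelated output it inherits the bias in $S^2/n$.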
Measures of Performance: Interval Estimation - contd
• Statistically dependent observations
– Variance of $\theta^*$
$\sigma^2(\theta^*) = (1/n^2) \sum_{i=1}^{n} \sum_{j=1}^{n} \mathrm{cov}(Y_i, Y_j)$
– Lag k autocovariance
$\gamma_k = \mathrm{cov}(Y_i, Y_{i+k})$
– Lag k autocorrelation
$\rho_k = \gamma_k / \gamma_0$
Measures of Performance: Interval Estimation - contd
• Statistically dependent observations (contd)
– Variance of $\theta^*$
$\sigma^2(\theta^*) = (\gamma_0/n) [1 + 2 \sum_{k=1}^{n-1} (1 - k/n) \rho_k] = (\gamma_0/n)\,c$
– Positively autocorrelated time series ($\rho_k > 0$)
– Negatively autocorrelated time series ($\rho_k < 0$)
– Bias (in variance estimation)
$B = E(S^2/n) / \sigma^2(\theta^*) = (n/c - 1)/(n-1)$
Measures of Performance: Interval Estimation - contd
• Statistically dependent observations (contd)
• Cases
– Independent data: $\rho_k = 0$, $c = 1$, $B = 1$
– Positively correlated data: $\rho_k > 0$, $c > 1$, $B < 1$, $S^2/n$ is biased low (underestimation)
– Negatively correlated data: $\rho_k < 0$, $c < 1$, $B > 1$, $S^2/n$ is biased high (overestimation)
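The three cases can be checked numerically with a short Python sketch of c and B. The geometric pattern rho_k = phi**k used here is an assumption chosen purely for illustration (it is the autocorrelation of a first-order autoregressive series), and the function names are hypothetical.

```python
def variance_inflation(rhos, n):
    """c = 1 + 2 * sum_{k=1}^{n-1} (1 - k/n) * rho_k, where rhos[k-1] holds rho_k."""
    return 1.0 + 2.0 * sum((1.0 - k / n) * rhos[k - 1] for k in range(1, n))


def variance_bias(c, n):
    """B = E(S^2/n) / sigma^2(theta*) = (n/c - 1) / (n - 1)."""
    return (n / c - 1.0) / (n - 1.0)


n = 100
for phi in (0.0, 0.5, -0.5):
    # Assumed autocorrelation pattern rho_k = phi**k, for illustration only.
    rhos = [phi ** k for k in range(1, n)]
    c = variance_inflation(rhos, n)
    print(phi, round(c, 3), round(variance_bias(c, n), 3))
```

With phi = 0 the sketch returns c = 1 and B = 1; a positive phi gives c > 1 and B < 1, and a negative phi gives the reverse, matching the cases listed.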
Output Analysis for Terminating Simulations
• Method of independent replications
– $n_r$ = Sample size of replication r
– Number of replications r=1,2,…,R
– $Y_{ri}$ = i-th observation in replication r
– $Y_{ri}$, $Y_{rk}$ (same replication) are autocorrelated
– $Y_{ri}$, $Y_{sk}$ (different replications, $r \neq s$) are statistically independent
– Estimator of mean (r =1,2,…,R)
$\theta^*_r = (1/n_r) \sum_{i=1}^{n_r} Y_{ri}$
Output Analysis for Terminating Simulations - contd
• Confidence Interval (R fixed; discrete data)
– Overall point estimate
$\theta^* = (1/R) \sum_{r=1}^{R} \theta^*_r$
– Variance estimate
$\sigma^{2*}(\theta^*) = [1/((R-1)R)] \sum_{r=1}^{R} (\theta^*_r - \theta^*)^2$
– Standard error of the point estimator
$\sigma^*(\theta^*) = \sqrt{\sigma^{2*}(\theta^*)}$
Output Analysis for Terminating Simulations - contd
• Estimator and Interval (R fixed; continuous data)
– Estimator of mean (r =1,2,…,R)
$\theta^*_r = (1/T_e) \int_0^{T_e} Y_r(t)\,dt$
– Overall point estimate
$\theta^* = (1/R) \sum_{r=1}^{R} \theta^*_r$
– Variance estimate
$\sigma^{2*}(\theta^*) = [1/((R-1)R)] \sum_{r=1}^{R} (\theta^*_r - \theta^*)^2$
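Output Analysis in Terminating Simulations - contd
• Confidence Intervals with Specified Precision
• Half-length of the confidence interval (h.l.), required to be below a specified precision $\epsilon$
$\text{h.l.} = t_{\alpha/2,f} \sqrt{\sigma^{2*}(\theta^*)} = t_{\alpha/2,f}\,S/\sqrt{R} < \epsilon$
• Required number of replications, with $S_0$ an initial estimate of the standard deviation
$R^* > (z_{\alpha/2}\,S_0/\epsilon)^2$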
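The replication-count bound can be evaluated with the Python standard library as sketched below. The pilot values (S_0 = 4.0, target half-length 1.5) are hypothetical, and the result is only a first estimate: because the t quantile exceeds the normal quantile, the final R may need to be somewhat larger.

```python
import math
from statistics import NormalDist


def required_replications(s0, epsilon, alpha=0.05):
    """Smallest integer R with R >= (z_{alpha/2} * S_0 / epsilon)^2."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)   # standard normal quantile z_{alpha/2}
    return math.ceil((z * s0 / epsilon) ** 2)


# Hypothetical pilot values: S_0 = 4.0 from an initial set of replications,
# target half-length epsilon = 1.5 at 95% confidence.
print(required_replications(4.0, 1.5))
```

Halving the target half-length roughly quadruples the number of replications needed, since R grows with the square of $S_0/\epsilon$.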
Output Analysis for Steady State Simulations
• Let $(Y_1, Y_2, \ldots, Y_n)$ be an autocorrelated time series
• Long-run measure of performance to be estimated (independent of initial conditions, I.C.s)
$\theta = \lim_{n \to \infty} (1/n) \sum_{i=1}^{n} Y_i$
• Sample size n (or $T_e$) is a design choice.
Output Analysis for Steady State Simulations -contd
• Considerations affecting the choice of n
– Estimator bias due to initial conditions
– Desired precision of point estimator
– Budget/computer constraints
Output Analysis for Steady State Simulations -contd
• Initialization bias and Initialization methods
– Intelligent initialization
• Using actual field data
• Using data from a simpler model
– Use of phases in simulation
• Initialization phase ($0 < t < T_0$; for i=1,2,…,d)
• Data collection phase ($T_0 < t < T_e$; for i=d+1,d+2,…,n)
• Rule of thumb: $(n-d) > 10d$
Output Analysis for Steady State Simulations -contd
• Example M/G/1 queue
– Batched data
– Batched means
– Averaging batch means within a replication (i.e. along the batches)
– Averaging batch means within a batch (i.e. along the replications).
Steady State Simulations: Replication Method
• Cases
1. $Y_{rj}$ is an individual observation from within a replication
2. $Y_{rj}$ is a batch mean of discrete data from within a replication
3. $Y_{rj}$ is a batch mean of continuous data over a given interval
Steady State Simulations: Replication Method -contd
• Sample average for replication r of all (nondeleted) observations
$Y^*_r(n,d) = Y^*_r = [1/(n-d)] \sum_{j=d+1}^{n} Y_{rj}$
• Replication averages are independent and identically distributed RVs
• Overall point estimator
$Y^*(n,d) = Y^* = [1/R] \sum_{r=1}^{R} Y^*_r(n,d)$
Steady State Simulations: Replication Method -contd
• Sample Variance
$S^2 = [1/(R-1)] \sum_{r=1}^{R} (Y^*_r - Y^*)^2$
• Standard error = $S/\sqrt{R}$
• $100(1-\alpha)\%$ Confidence interval
$Y^* - t_{\alpha/2,R-1}\,S/\sqrt{R} < \theta < Y^* + t_{\alpha/2,R-1}\,S/\sqrt{R}$
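The replication method with deletion of the initialization phase can be sketched in Python as follows. The function name and the tiny data set are hypothetical, all runs are assumed to have the same length n, and the t quantile is supplied by the caller from a table.

```python
import math


def replication_deletion(runs, d, t_crit):
    """Point estimate and confidence interval from R runs of equal length n.

    runs[r] holds Y_r1..Y_rn; the first d observations of each run are
    discarded. t_crit is t_{alpha/2, R-1}, supplied by the caller.
    """
    rep_avgs = [sum(run[d:]) / (len(run) - d) for run in runs]   # Y*_r(n, d)
    r = len(rep_avgs)
    overall = sum(rep_avgs) / r                                  # Y*(n, d)
    s2 = sum((y - overall) ** 2 for y in rep_avgs) / (r - 1)     # S^2
    half = t_crit * math.sqrt(s2 / r)                            # t * S / sqrt(R)
    return overall, overall - half, overall + half


# Hypothetical output: 3 runs of 6 observations, first 2 deleted as warm-up.
runs = [
    [0.0, 1.0, 4.0, 5.0, 6.0, 5.0],
    [0.0, 2.0, 6.0, 7.0, 6.0, 5.0],
    [0.0, 1.0, 3.0, 4.0, 5.0, 4.0],
]
print(replication_deletion(runs, d=2, t_crit=4.303))   # 4.303 is tabulated t_{0.025, 2}
```

In this toy data the deleted observations are visibly lower than the rest, so keeping them would pull every replication average, and hence the overall estimate, downward.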
Steady State Simulations: Sample Size
• Greater precision can be achieved by
– Increasing the run length
– Increasing the number of replications
Steady State Simulations: Batch Means for Interval Estimation
• Single, long replication with batches
– Batch means treated as if they were independent
– Batch means (continuous)
$Y^*_j = (1/m) \int_{(j-1)m}^{jm} Y(t)\,dt$
– Batch means (discrete)
$Y^*_j = (1/m) \sum_{i=(j-1)m+1}^{jm} Y_i$
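The discrete batch means and the resulting standard error can be sketched in Python as below. The function names and data are hypothetical, and dropping a trailing partial batch is a design choice made here for simplicity.

```python
import math


def batch_means(ys, m):
    """Split ys into consecutive batches of size m and return the batch means.

    Trailing observations that do not fill a whole batch are dropped.
    """
    k = len(ys) // m
    return [sum(ys[j * m:(j + 1) * m]) / m for j in range(k)]


def batch_means_standard_error(ys, m):
    """Grand mean and S/sqrt(k), treating the k batch means as independent."""
    means = batch_means(ys, m)
    k = len(means)
    grand = sum(means) / k
    s2 = sum((b - grand) ** 2 for b in means) / (k - 1)
    return grand, math.sqrt(s2 / k)


ys = [1.0, 3.0, 2.0, 4.0, 6.0, 8.0, 5.0, 7.0, 9.0]   # hypothetical observations
print(batch_means(ys, 3))                  # [2.0, 6.0, 7.0]
print(batch_means_standard_error(ys, 3))
```

The standard error is only meaningful if the batch size m is large enough for the batch means to be close to uncorrelated.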
Steady State Simulations: Batch Size Selection Guidelines
• Number of batches < 30
• Diagnose correlation with lag 1 autocorrelation obtained from a large number of batch means from a smaller batch size
• If the total sample size is to be selected sequentially, allow the batch size and the number of batches to grow with run length.
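The lag-1 autocorrelation diagnostic can be sketched in Python as below. The estimator shown is one common form, not necessarily the one intended by the slides, and the function name and sample series are hypothetical.

```python
def lag1_autocorrelation(xs):
    """Sample lag-1 autocorrelation: sum (x_i - m)(x_{i+1} - m) / sum (x_i - m)^2."""
    n = len(xs)
    mean = sum(xs) / n
    num = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den


# Hypothetical batch means: a steadily drifting series and an alternating one.
print(lag1_autocorrelation([1.0, 2.0, 3.0, 4.0, 5.0]))            # positive
print(lag1_autocorrelation([1.0, -1.0, 1.0, -1.0, 1.0, -1.0]))    # negative
```

A value near zero across many small batches suggests that rebatching into fewer, larger batches will give means that can reasonably be treated as independent.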