a comprehensive review of methods for simulation output analysis

advertisement
T.C
ATILIM UNIVERSITY
MODES
ADVANCED SYSTEM SIMULATION
MODES 650
A COMPREHENSIVE REVIEW OF METHODS
FOR SIMULATION OUTPUT ANALYSIS
Christos Alexopoulos
School of Industrial and Systems Engineering
Georgia Institute of Technology
Atlanta, Georgia 30332–0205, U.S.A.
represented by:
Adel Agila
Introduction
• Simulation output analysis
– Point estimator and confidence interval
– Variance estimation (σ2) confidence interval
• Independent and identically distributed (IID)
– Suppose X1,…Xm are iid
The Methods
• The Replication/Deletion Approach
• The Regenerative Method
• The Batch Means Method
nonoverlapping Batch Means(NBM)
• Overlapping Batch Means(OBM)
• Consistent Batch Means Estimation Methods
• The Standardized Time Series Method(STS)
• The Weighted Area Estimator
• Batched Area Estimators
types of simulations with regard to output
analysis:
• Finite-horizon simulations.
• In this case the simulation starts in a specific
state and is run until some terminating event
occurs. EX: Bank
• The output process is not expected to achieve
any steady-state behavior.
• Steady-state simulations.
• The purpose of a steady-state simulation is the
study of the long-run behavior of the system of
interest. EX: Hospital
Finite-horizon simulations.
• n output data are needed
• X1, X2,..., Xn are collected with the objective of
estimating
is the sample mean of the data.
Xi can be :
The transit time of unit i through a network , or the
total time station i is busy during the ith hour.
Then is an unbiased estimator for μ
Xi are generally dependent random variables
Finite-horizon simulations.
• Then, let
be the sample
variance of the data
• Then, the estimator
a biased
estimator of
If Xi are positively correlated ,
To overcome the problem, run k independent
replications
Replications
• Xij are the output data can be illustrated as:
•
and the replicate
•
averages are
are (IID)random var’s
• Here their means
•
is unbiased estimator of μ and
their sample variance
is an unbiased
estimator of
Approximate 1-α cI for u
•
0<α<1,
Tk-1,1- α/2 = upper critical point
for t distribution,k-1 DOF
STEADY-STATE ANALYSIS
• Stationary process :The process X = {Xi} is
called stationary if the joint distribution of
is independent of i for all indices
j1,j2, . . ., jk and all k ≥ 1.
• weakly stationary process: If E(Xi) = µ, Var(Xi) ≡
σX2 < ∞ for all i, and the Cov (Xi ,Xi+j) is
independent of i, then X is called WSP.
• σX2 the (asymptotic) variance parameter of X.
Stationary Process
The stochastic process X is stationary for t1,…,tk, t∈ T, if
d
( X t1 t ,..., X tk t )  ( X t1 ,..., X tk )
d
where "  " denotes equality in distribution
•
Stationary time series with positive
autocorrelation
•
Stationary time series with negative
autocorrelation
•
Nonstationary time series with an upward
trend
Stationary Process
• A discrete-time stationary process X = {Xi : i≥1} with mean µ and
variance σX2 = Cov(Xi,Xi),

2


2
X
 2 Cov( X 1, X 1 j )
j 1
• Variance of the sample mean
n 1

1 2
j
Var ( X n )   X  2 (1  )Cov( X1, X1 j ) 
n
n
j 1

• Then
nVar ( X n )   2 (the (asymptotic) variance parameter)
• lim
n 
• For iid,

 Sn2 ( X )  1  2
2 n1
j
E
(1  )Cov( X 1, X 1 j ) 

   X 
n  1 j 1
n
 n  n

Stationary Process
•
The expected value of the variance estimator is:
n
1
 S n2 ( X )  an
E
Var ( X n )

n 1
 n 
 S n2 ( X ) 
E
  Var ( X n ) when positively correlated
n


– If Xi are independent, thenS
2
n
(X )
n
is an unbiased estimator of Var ( X n )
– If the autocorrelation is positive, then S
– If the autocorrelation is negative, then
2
n
(X )
n
S n2 ( X )
n
is biased low as an estimator of Var ( X n )
is biased high as an estimator of Var ( X n )
Stationary Process
• Cov(X, Y) = Σ ( Xi - X ) ( Yi - Y ) / N = Σ xiyi / N
• where
• N is the number of scores in each set of data
X is the mean of the N scores in the first data set
Xi is the ithe raw score in the first set of scores
xi is the ith deviation score in the first set of scores
Y is the mean of the N scores in the second data set
Yi is the ithe raw score in the second set of scores
yi is the ith deviation score in the second set of scores
Cov(X, Y) is the covariance of corresponding scores in
the two sets of data
Functional Central Limit Theorem (FCLT) Assumption.
• Suppose the series
is convergent,
and σ2 >0. where
Rj= Cov ( Xi ,Xi+j) , and σ2 : -the (asymptotic)
variance parameter of X - equals
n


1
 
lim n E    j 
 j 1 
As n→∞ ,we have the following convergent:
1


t

  t 


X
 n
t ≥ 0.
2
2
n 
jn
n
j 1
j
The variance of the sample mean in terms of the
autocovariance function is
Assumption: Along the above equation 0<σ2 <∞.
Imply
The paper focuses on methods for obtaining CIs
for μ , which involve estimating σ2 .
CI:1-α Confidence Interval
The Replication/Deletion Approach
• K independent replications
• each of length l +n observations.
• Discard the first l observations from each run.
• Use the IID sample means
• Compute the point estimate
• If k is large ,compute the approximate 1-α cI
for μ:
Ex: Note
The Replication/Deletion Approach
• For (l ,n, and k)
• (a) As l increased for fixed n, the “systematic”
error in each Yi(l , n) due to the initial conditions
decreased.
• (b) As n increased for fixed l , the systematic and
sampling errors in Yi(l , n) decreased.
• (c) #of replications k cannot effect The Yi(l , n) k.
• (d) For fixed n, the CI is valid only if l / lnk → ∞ as
k → ∞. l increase faster than lnk.
• Replication method (more )is expensive
The Regenerative Method
• The basic concept underlying this approach is that
for many systems a simulation run can be divided
into a series of cycles such that the evolution of the
system in a cycle is a probabilistic replica of the
evolution in any other cycle. IID cycles
• The Method (we have)
• Random time indices 1≤T1<T2 <….
• The portion (XTi+j,j) ≥ 0 has the same distribution for i .
• Then
• And the SS_mean
• Where E(Zi) < ∞ , E(Zi) ≠0
The
Regenerative
Method
To obtain estimates of the expected value of some random
variable X
Y1=62-53=9
Y3=124-117=7
Z1=62-24=38
Y2=70-62=8
The Regenerative Method
The Regenerative Method
• Disadvantages
• difficult to apply in practice because the
majority of simulations have either no
regenerative points or very long cycle lengths.
The Batch Means Method
nonoverlapping Batch Means(NBM)
• To compute points and CI estimators for the
mean µ. The method
• suppose the sample x1, x2,….xn .
• Divide the sample into k batches with m
observations (n=km).
• Then, for i=1,2,……,k, the ith batch consists of
the observations X(i+1)m+1 , X(i+1)m+2,….,Xim
• And the ith batch mean
• The NBM-based estimator of the mean is
Batch means method (Nonoverlapping batch mean)
• Non-overlapping batch mean (NBM)
X 1 ,..., X m , X m 1 ,..., X 2 m , ... X (i 1) m 1 ,..., X im, ... X ( k 1) m 1 ,..., X km
Y1,m
Y2,m
Yi ,m
Yk ,m
X nkm
Batch 1
m
observations
with batch
mean Y1,m
Batch k
m
observations
with batch
mean Yk,m
Nonoverlapping batch mean (NBM)
• Suppose the batch means become
uncorrelated
as m  ∞
nVar ( X n ) 
nVar (Yi ,m )
k
 mVar (Yi ,m )
• NBM estimator for σ2
k
m
2
VˆB (k , m) 
(
Y

X
)
 i ,m n
k  1 i 1
• Confidence Interval
X n  tk 1,1 / 2
VˆB (k , m)
n
Consistent Batch Means Estimation Methods
• Alternative rules that yield strongly consistent
estimators for
The Assumption of Strong Approximation(ASA)
Given a constant
, and a finite random
variable C such that, as n →∞
Where w(n) is a standard Brownian motion process.
λ→1/2
normal distribution and low correlation among Xi .
λ→0
the absence of one of the above .
The Assumption of Strong Approximation(ASA)
• Theorem suppose
• ASA is hold, mn is batch sizes and kn is batch counts
• Such that
• And
• Then,
, as n →∞
for some finite integer q ≥ 1.
and
• Where N(0,1) is a standard normal random
variable.
Overlapping Batch Means(OBM)
• For a given m, this method uses all n-m+1
overlapping batches to estimate µ and
• the first batch X1,…….,Xm, the second batch
X2,……….., Xm+1.etc
• The OBM estimator of µ is
• Where
(batch mean)
• The OBM estimator of σ2 is
• Where k=n/m.
Overlapping Batch Mean (OBM)
X1 , X 2 , X 3 , ... , X m, X m1 , X m2 , X m3 , ...
Y1,m
Y2,m
• OBM estimator for σ2
n  m 1
nm
2
VˆO (m) 
(
Y

X
)
 i ,m n
(n  m  1)(n  m) i 1
NBM vs. OBM
• Under mild conditions
E VˆB (k , m)    2  (k  1) n  o(1 n)
E VˆO (m)    2   m  o(1 m)
– Thus, both have similar bias
• Variance of the estimators
Var VˆB (k , m) 
3
 , as k , m  
2
Var VˆO (m) 
– Thus, the OBM method gives better
(asymptotic)performance than NBM.
29
The Standardized Time Series Method(STS)
•
•
•
•
•
The estimator based on STS applied to batches.
The method
For the sample X1,X2,…..Xn. Define
D0,n=0, and Di,n=Ŷi-Ŷn, for i=1,….,n;
Scales the sequence Di,n by
and the time
by setting t=i/n.
• Then STS is
• If X satisfies a FCLT, then as n →∞
Standardized Time Series
• Define the ‘centered’ partial sums of Xi as
• Central Limit Theorem
• Define the continuous time process
`
S nt 


t

Tn
n
Question: How does Tn(t) behave as n increases?
n=100
32
n=1000
33
n=10000
34
n=1,000,000
35
The Weighted Area Estimator
• To estimate σ2
• We define the square of the weighted area under
the standardized time series.
• its limiting functional is
• Where
• And satisfies the following
• If the above hold, then
• Where denotes equivalence in distribution,
and Nor(0,1) denotes the standard normal random
variable.
The Weighted Area Estimator
• Under assumption FCLT, the continuous
mapping theorem (CMT) implies
2
2
2
• A(f)=σ Xv ,where Xv denotes chi-squared
random variable with v degrees of freedom.
2
2
• And var[A(f)]=var[σ Xv ]=2σ4.
Batched Area Estimators
• Divide the run into contiguous, nonoverlapping batches
X 1 ,..., X m , X m 1 ,..., X 2 m , ... X (i 1) m 1 ,..., X im, ... X ( k 1) m 1 ,..., X km
•
•
•
•
•
Y1,m
Y2,m
Yi ,m
X nkm
Yk ,m
Form an STS estimator from each batch.
Take the average of the estimators.
The STS from batch i ( i=1,2,….K
then
Where
(i=1,…,k),&(j=1,….m)
• If Assumption FCLT holds,
• then
Batched Area Estimators
• Where (i) the [Zi: i = 1,……,k] are IID. standard
normal random variables;
(ii) the [Zi: i =1,…….,k] are independent of the
Bs; and (iii) Bs denotes a standard Brownian
bridge on [s,s+1], for
The area estimator from batch i is
Batched Area Estimators
and the batched area estimator for σ2 is
Since the Ti,m, i = 1,...,k, converge to independent
Brownian bridges as m becomes large (with
fixed k), we shall assume that the the Ai (f; m)
are asymptotically independent as m →∞.
Then ,we have
Then, the variance of the batched area estimator :
as m →∞.
Download