T.C ATILIM UNIVERSITY MODES ADVANCED SYSTEM SIMULATION MODES 650 A COMPREHENSIVE REVIEW OF METHODS FOR SIMULATION OUTPUT ANALYSIS Christos Alexopoulos School of Industrial and Systems Engineering Georgia Institute of Technology Atlanta, Georgia 30332–0205, U.S.A. represented by: Adel Agila Introduction • Simulation output analysis – Point estimator and confidence interval – Variance estimation (σ2) confidence interval • Independent and identically distributed (IID) – Suppose X1,…Xm are iid The Methods • The Replication/Deletion Approach • The Regenerative Method • The Batch Means Method nonoverlapping Batch Means(NBM) • Overlapping Batch Means(OBM) • Consistent Batch Means Estimation Methods • The Standardized Time Series Method(STS) • The Weighted Area Estimator • Batched Area Estimators types of simulations with regard to output analysis: • Finite-horizon simulations. • In this case the simulation starts in a specific state and is run until some terminating event occurs. EX: Bank • The output process is not expected to achieve any steady-state behavior. • Steady-state simulations. • The purpose of a steady-state simulation is the study of the long-run behavior of the system of interest. EX: Hospital Finite-horizon simulations. • n output data are needed • X1, X2,..., Xn are collected with the objective of estimating is the sample mean of the data. Xi can be : The transit time of unit i through a network , or the total time station i is busy during the ith hour. Then is an unbiased estimator for μ Xi are generally dependent random variables Finite-horizon simulations. • Then, let be the sample variance of the data • Then, the estimator a biased estimator of If Xi are positively correlated , To overcome the problem, run k independent replications Replications • Xij are the output data can be illustrated as: • and the replicate • averages are are (IID)random var’s • Here their means • is unbiased estimator of μ and their sample variance is an unbiased estimator of Approximate 1-α cI for u • 0<α<1, Tk-1,1- α/2 = upper critical point for t distribution,k-1 DOF STEADY-STATE ANALYSIS • Stationary process :The process X = {Xi} is called stationary if the joint distribution of is independent of i for all indices j1,j2, . . ., jk and all k ≥ 1. • weakly stationary process: If E(Xi) = µ, Var(Xi) ≡ σX2 < ∞ for all i, and the Cov (Xi ,Xi+j) is independent of i, then X is called WSP. • σX2 the (asymptotic) variance parameter of X. Stationary Process The stochastic process X is stationary for t1,…,tk, t∈ T, if d ( X t1 t ,..., X tk t ) ( X t1 ,..., X tk ) d where " " denotes equality in distribution • Stationary time series with positive autocorrelation • Stationary time series with negative autocorrelation • Nonstationary time series with an upward trend Stationary Process • A discrete-time stationary process X = {Xi : i≥1} with mean µ and variance σX2 = Cov(Xi,Xi), 2 2 X 2 Cov( X 1, X 1 j ) j 1 • Variance of the sample mean n 1 1 2 j Var ( X n ) X 2 (1 )Cov( X1, X1 j ) n n j 1 • Then nVar ( X n ) 2 (the (asymptotic) variance parameter) • lim n • For iid, Sn2 ( X ) 1 2 2 n1 j E (1 )Cov( X 1, X 1 j ) X n 1 j 1 n n n Stationary Process • The expected value of the variance estimator is: n 1 S n2 ( X ) an E Var ( X n ) n 1 n S n2 ( X ) E Var ( X n ) when positively correlated n – If Xi are independent, thenS 2 n (X ) n is an unbiased estimator of Var ( X n ) – If the autocorrelation is positive, then S – If the autocorrelation is negative, then 2 n (X ) n S n2 ( X ) n is biased low as an estimator of Var ( X n ) is biased high as an estimator of Var ( X n ) Stationary Process • Cov(X, Y) = Σ ( Xi - X ) ( Yi - Y ) / N = Σ xiyi / N • where • N is the number of scores in each set of data X is the mean of the N scores in the first data set Xi is the ithe raw score in the first set of scores xi is the ith deviation score in the first set of scores Y is the mean of the N scores in the second data set Yi is the ithe raw score in the second set of scores yi is the ith deviation score in the second set of scores Cov(X, Y) is the covariance of corresponding scores in the two sets of data Functional Central Limit Theorem (FCLT) Assumption. • Suppose the series is convergent, and σ2 >0. where Rj= Cov ( Xi ,Xi+j) , and σ2 : -the (asymptotic) variance parameter of X - equals n 1 lim n E j j 1 As n→∞ ,we have the following convergent: 1 t t X n t ≥ 0. 2 2 n jn n j 1 j The variance of the sample mean in terms of the autocovariance function is Assumption: Along the above equation 0<σ2 <∞. Imply The paper focuses on methods for obtaining CIs for μ , which involve estimating σ2 . CI:1-α Confidence Interval The Replication/Deletion Approach • K independent replications • each of length l +n observations. • Discard the first l observations from each run. • Use the IID sample means • Compute the point estimate • If k is large ,compute the approximate 1-α cI for μ: Ex: Note The Replication/Deletion Approach • For (l ,n, and k) • (a) As l increased for fixed n, the “systematic” error in each Yi(l , n) due to the initial conditions decreased. • (b) As n increased for fixed l , the systematic and sampling errors in Yi(l , n) decreased. • (c) #of replications k cannot effect The Yi(l , n) k. • (d) For fixed n, the CI is valid only if l / lnk → ∞ as k → ∞. l increase faster than lnk. • Replication method (more )is expensive The Regenerative Method • The basic concept underlying this approach is that for many systems a simulation run can be divided into a series of cycles such that the evolution of the system in a cycle is a probabilistic replica of the evolution in any other cycle. IID cycles • The Method (we have) • Random time indices 1≤T1<T2 <…. • The portion (XTi+j,j) ≥ 0 has the same distribution for i . • Then • And the SS_mean • Where E(Zi) < ∞ , E(Zi) ≠0 The Regenerative Method To obtain estimates of the expected value of some random variable X Y1=62-53=9 Y3=124-117=7 Z1=62-24=38 Y2=70-62=8 The Regenerative Method The Regenerative Method • Disadvantages • difficult to apply in practice because the majority of simulations have either no regenerative points or very long cycle lengths. The Batch Means Method nonoverlapping Batch Means(NBM) • To compute points and CI estimators for the mean µ. The method • suppose the sample x1, x2,….xn . • Divide the sample into k batches with m observations (n=km). • Then, for i=1,2,……,k, the ith batch consists of the observations X(i+1)m+1 , X(i+1)m+2,….,Xim • And the ith batch mean • The NBM-based estimator of the mean is Batch means method (Nonoverlapping batch mean) • Non-overlapping batch mean (NBM) X 1 ,..., X m , X m 1 ,..., X 2 m , ... X (i 1) m 1 ,..., X im, ... X ( k 1) m 1 ,..., X km Y1,m Y2,m Yi ,m Yk ,m X nkm Batch 1 m observations with batch mean Y1,m Batch k m observations with batch mean Yk,m Nonoverlapping batch mean (NBM) • Suppose the batch means become uncorrelated as m ∞ nVar ( X n ) nVar (Yi ,m ) k mVar (Yi ,m ) • NBM estimator for σ2 k m 2 VˆB (k , m) ( Y X ) i ,m n k 1 i 1 • Confidence Interval X n tk 1,1 / 2 VˆB (k , m) n Consistent Batch Means Estimation Methods • Alternative rules that yield strongly consistent estimators for The Assumption of Strong Approximation(ASA) Given a constant , and a finite random variable C such that, as n →∞ Where w(n) is a standard Brownian motion process. λ→1/2 normal distribution and low correlation among Xi . λ→0 the absence of one of the above . The Assumption of Strong Approximation(ASA) • Theorem suppose • ASA is hold, mn is batch sizes and kn is batch counts • Such that • And • Then, , as n →∞ for some finite integer q ≥ 1. and • Where N(0,1) is a standard normal random variable. Overlapping Batch Means(OBM) • For a given m, this method uses all n-m+1 overlapping batches to estimate µ and • the first batch X1,…….,Xm, the second batch X2,……….., Xm+1.etc • The OBM estimator of µ is • Where (batch mean) • The OBM estimator of σ2 is • Where k=n/m. Overlapping Batch Mean (OBM) X1 , X 2 , X 3 , ... , X m, X m1 , X m2 , X m3 , ... Y1,m Y2,m • OBM estimator for σ2 n m 1 nm 2 VˆO (m) ( Y X ) i ,m n (n m 1)(n m) i 1 NBM vs. OBM • Under mild conditions E VˆB (k , m) 2 (k 1) n o(1 n) E VˆO (m) 2 m o(1 m) – Thus, both have similar bias • Variance of the estimators Var VˆB (k , m) 3 , as k , m 2 Var VˆO (m) – Thus, the OBM method gives better (asymptotic)performance than NBM. 29 The Standardized Time Series Method(STS) • • • • • The estimator based on STS applied to batches. The method For the sample X1,X2,…..Xn. Define D0,n=0, and Di,n=Ŷi-Ŷn, for i=1,….,n; Scales the sequence Di,n by and the time by setting t=i/n. • Then STS is • If X satisfies a FCLT, then as n →∞ Standardized Time Series • Define the ‘centered’ partial sums of Xi as • Central Limit Theorem • Define the continuous time process ` S nt t Tn n Question: How does Tn(t) behave as n increases? n=100 32 n=1000 33 n=10000 34 n=1,000,000 35 The Weighted Area Estimator • To estimate σ2 • We define the square of the weighted area under the standardized time series. • its limiting functional is • Where • And satisfies the following • If the above hold, then • Where denotes equivalence in distribution, and Nor(0,1) denotes the standard normal random variable. The Weighted Area Estimator • Under assumption FCLT, the continuous mapping theorem (CMT) implies 2 2 2 • A(f)=σ Xv ,where Xv denotes chi-squared random variable with v degrees of freedom. 2 2 • And var[A(f)]=var[σ Xv ]=2σ4. Batched Area Estimators • Divide the run into contiguous, nonoverlapping batches X 1 ,..., X m , X m 1 ,..., X 2 m , ... X (i 1) m 1 ,..., X im, ... X ( k 1) m 1 ,..., X km • • • • • Y1,m Y2,m Yi ,m X nkm Yk ,m Form an STS estimator from each batch. Take the average of the estimators. The STS from batch i ( i=1,2,….K then Where (i=1,…,k),&(j=1,….m) • If Assumption FCLT holds, • then Batched Area Estimators • Where (i) the [Zi: i = 1,……,k] are IID. standard normal random variables; (ii) the [Zi: i =1,…….,k] are independent of the Bs; and (iii) Bs denotes a standard Brownian bridge on [s,s+1], for The area estimator from batch i is Batched Area Estimators and the batched area estimator for σ2 is Since the Ti,m, i = 1,...,k, converge to independent Brownian bridges as m becomes large (with fixed k), we shall assume that the the Ai (f; m) are asymptotically independent as m →∞. Then ,we have Then, the variance of the batched area estimator : as m →∞.