MEDIAN CONFIDENCE INTERVALS – GROUPING DATA INTO BATCHES AND COMPARISON WITH OTHER TECHNIQUES

Johann Christoph Strelen
Rheinische Friedrich–Wilhelms–Universität Bonn
Römerstr. 164, 53117 Bonn, Germany
E-mail: strelen@cs.uni-bonn.de

ASTC 2002, San Diego

Median Confidence Intervals (MCI)

A new confidence interval (CI) technique for simulation results
• Easy to apply
• Accurate
• Generally applicable

Main Features

• Easy to obtain: w independent replications (simulation runs), or a single simulation run with w successive phases for batches of data; typically w = 5 or 6.
• The variance of the estimator is not used, and correlated output is handled implicitly. This avoids a serious problem that usually arises when confidence intervals for simulation results are derived.
• Even if the variance does not exist, an MCI can be constructed, whereas a classical CI cannot.
• Sequential procedure: if a median confidence interval (MCI) is too wide for a given confidence level, it can be narrowed. Each replication is extended, continuing from its last state. Batches are handled similarly.
• If a measure is estimated by a function of several estimators, an MCI can still be given. Example: Λ(n) estimates the throughput of a queue and W(n) the mean waiting time. Then the product Λ(n)W(n) estimates the mean number of customers in the queue (Little's formula).
• For some samples of independent random variables, e.g. normally distributed ones, we found MCIs that are sometimes slightly wider than the usual CIs. But such simple statistics seldom occur in simulation.
• In simulation the output is usually dependent and its distribution is unknown. Under these circumstances classical CIs are only approximate: they are usually too narrow, the nominal confidence level is not attained, and they fail too often to cover the true unknown value. MCIs are more accurate.
• The MCI technique is exact when the median and the mean of an estimator coincide.
  This holds for symmetric distributions; the most important one in simulation is the normal distribution. Owing to the central limit theorem and the long simulation runs that fast computers allow, many estimators are nearly normally distributed.
• In principle, however, the MCI technique is not restricted to the case median = mean. In the general case one must know a single value of the estimator distribution function $F_\theta(x)$, namely the probability $F = F_\theta(\theta)$, where $\theta$ is the unknown parameter.
• Not every confidence level is possible, only the values
  $1 - F^w - (1 - F)^w, \quad w = 2, 3, \ldots$
  Here w is the number of independent replications or of batches of data.
• In the special case median = mean, F = 0.5 holds, and the possible confidence levels are
  50%, 75%, 87.5%, 93.75%, 96.875%, 98.4%, 99.2%, 99.6%, 99.8%, 99.9%, ...

The Basic Principle

Sample $X_{1,1}, \ldots, X_{1,m}$ of random variables.
$\theta$: unknown parameter to estimate.
$T(X_{1,1}, \ldots, X_{1,m})$: estimator, with distribution function $F_\theta(x)$.

Novel kind of confidence interval:
  $[\, T^{\min}, \; T^{\max} )$    (1)
where
  $T^{\min} = \min_{1 \le i \le w} T_i, \quad T^{\max} = \max_{1 \le i \le w} T_i,$
and
  $T_i = T(X_{i,1}, \ldots, X_{i,m}), \quad i = 1, \ldots, w,$
are the estimators for w independent replications $X_{i,1}, \ldots, X_{i,m}$ of the sample $X_{1,1}, \ldots, X_{1,m}$.

Theorem

The interval (1) is a confidence interval for the parameter $\theta$ with confidence level $1 - F^w - (1 - F)^w$, i.e.
  $P\{T^{\min} \le \theta < T^{\max}\} = 1 - F^w - (1 - F)^w$
holds, where $F = F_\theta(\theta)$ is the value of the estimator distribution function at $\theta$.

The Most Important Special Case: Mean = Median

Here the unknown parameter is the median of the estimator, $F_\theta(\theta) = 1/2$. This holds for unbiased estimators with symmetric distributions, e.g. when the estimator is normally distributed. Then
  $P\{T^{\min} \le \theta < T^{\max}\} = 1 - 0.5^{w-1}$
holds for the confidence interval, and the possible confidence levels are $1 - 0.5^{w-1}$, $w = 2, 3, \ldots$

Batch Median Confidence Intervals for steady state statistics.
We applied the idea of the batch means method: the output data are grouped into batches, and these batches are assumed to be independent.

A single simulation run:
– first the transient phase,
– then w phases for w batches of output data.

From each batch one obtains an estimate $\hat{T}_i$, $i = 1, \ldots, w$. The batch median confidence interval (BMCI) is
  $[\, \min_{1 \le i \le w} \hat{T}_i, \; \max_{1 \le i \le w} \hat{T}_i )$.

Confidence Intervals in Simulation are Usually Approximate

Assumptions are not satisfied:
• The distribution of the estimator (normal, Student)
• Independence of the random variables in the sample
• For some methods, other assumptions

For median confidence intervals the assumptions are weaker:
• Only symmetry of the distribution of the estimator
• Independence of the replications, not of the random variables within them

What does "approximate confidence interval" mean?
If many confidence intervals for a parameter of a simulation model are calculated in many simulation runs, the true value lies in some of them and not in the others. The coverage C is the fraction of runs in which the interval contains the true value. If the limit of this coverage equals the confidence level CL, the confidence interval technique is exact; otherwise it is approximate: the confidence level is not reached, CL ≠ C.

Numerical Experience

Many simulation studies: comparison of classical confidence interval methods with median confidence intervals or with batch median confidence intervals.

Each study: many independent simulation experiments to estimate the coverage of each confidence interval technique considered.

Each simulation experiment:
w = 5 independent replications for median confidence intervals (MCI) and for the replication/deletion method, or
w = 5 batches for batch median confidence intervals (BMCI) and for the batch means method.
w = 5 implies a confidence level CL = 93.75% for the MCIs and BMCIs.

Measure of accuracy: the error CL − C = confidence level − observed coverage.

1.
M/M/1 Queueing System

Law and Kelton: comparative study of different well-known methods for confidence intervals. Utilization 0.8; known to be statistically difficult.
400 independent simulation experiments for each run length n and each CI method (e.g. n = 2560 delays), giving the observed coverage C.
We conducted a corresponding simulation study with the same model and the same run lengths, including batch median confidence intervals (BMCI).

Method                               CL − C
Batch Means                          0.102
Standardized Time Series             0.102
Spectrum Analysis                    0.067
Autoregressive Method                0.145
Regenerative Method, Classical       0.155
Regenerative Method, Jackknife       0.137
Batch Median Confidence Intervals    0.045
Errors CL − C

An error of 0.145 means, for example, confidence level CL = 90% and observed coverage C = 75.5%.

2. M/M/1 Queueing System

Comparison of the replication/deletion method (RD) and median confidence intervals (MCI). Low and high utilization (ρ = 0.25 and 0.8). Short and long simulation runs.

ρ      Run     Replication/Deletion   Median Confidence Intervals
0.25   Short   0.023                  0.017
0.25   Long    0.003                  0.002
0.8    Short   0.056                  0.043
0.8    Long    0.012                  0.004
Errors CL − C

Long runs: both methods are good.
Short runs: MCIs are slightly better.

3. M/M/1 Queueing System, Ratios of Estimators

The same M/M/1 model as before. Comparison of jackknife intervals and median confidence intervals for the mean delay $\hat{W}$, estimated as the ratio $\hat{Q}/\hat{\lambda}$ of the mean number of jobs in the waiting room and the mean throughput.

ρ      Run     RD, Jackknife   Median Confidence Intervals
0.25   Short   0.091           0.015
0.25   Long    0.076           0.001
0.8    Short   0.121           0.037
0.8    Long    0.085           0.003
Errors CL − C

Median confidence intervals are more accurate.

6. Pareto distribution

We are interested in parameters of heavy-tailed Pareto distributions, $F(x) = 1 - x^{-a}$, $0 < a \le 2$, $x \ge 1$, with expectation $a/(a - 1)$ for $a > 1$ and median $2^{1/a}$; the variance does not exist.
1000 simulation experiments, each with sample size n = 5000.
The classical confidence interval for the expectation does not exist.
Median confidence intervals for the expectation:

a      CL − C
2      0.016    good
1.5    0.107    bad
1.1             not acceptable

Order statistic for the median:

a      Confidence Interval   Median CI
2      0.082                 0.000
1.5    0.079                 0.000
1.1    0.086                 0.005
Errors CL − C

5. Reliability Model

The model consists of three components and functions as long as component 1 works and either component 2 or component 3 works. $G_i$ is the time to failure of component i, i = 1, 2, 3, and
  $G = \min\{G_1, \max\{G_2, G_3\}\}$
is the time to failure of the whole system. The random variables $G_i$ are independent, and each $G_i$ has a Weibull distribution
  $F(x) = 1 - \exp(-\sqrt{x}), \quad x > 0.$
The estimator of the expectation of G has a very skewed, non-normal distribution, and for small sample sizes all confidence intervals are quite inaccurate.
8000 simulation experiments, each with sample size n = 5 or 40.

n     Classical CI   Median CI
5     0.191          0.147
40    0.069          0.032
Errors CL − C

Potential Further Development of the Technique

The assumption that the estimator distribution is symmetric can be dropped, and the estimator may even be biased; only $F = F_\theta(\theta)$, the value of the estimator distribution function at the unknown parameter $\theta$, must be known. We then speak of "min-max confidence intervals" (MMCI). They are exact if the w replications are independent, and their confidence level is $CL = 1 - F^w - (1 - F)^w$.

Crucial problem: this value $F_\theta(\theta)$. We do not know an adequate method for estimating it efficiently.

But the MMCI idea works. We tried a brute-force procedure: very long and expensive simulations gave an empirical distribution of G, from which we obtained the distribution $\hat{F}_\theta(x)$ by convolution. With an estimate $\hat{\theta}$ of the unknown parameter we then obtained $\hat{F} = \hat{F}_\theta(\hat{\theta})$ and an estimated confidence level $\widehat{CL}$.

n     Coverage   $\widehat{CL}$
5     0.791      0.791
40    0.909      0.907
Coverages and Estimated Confidence Levels

Accurate, isn't it? But not practicable in this form.
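
Illustration of the basic principle

The min-max construction from "The Basic Principle", the confidence level from the theorem, and the coverage measure C can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the function names (confidence_level, median_confidence_interval, estimate_coverage), the default arguments, and the normally distributed test sample are hypothetical choices made for this illustration and are not taken from the studies reported above.

```python
import random
from typing import Callable, List, Sequence, Tuple


def confidence_level(w: int, F: float = 0.5) -> float:
    """Confidence level 1 - F^w - (1 - F)^w of the interval [T_min, T_max).

    F is the value of the estimator distribution function at the unknown
    parameter; F = 0.5 is the special case median = mean.
    """
    if w < 2:
        raise ValueError("at least w = 2 replications are needed")
    return 1.0 - F ** w - (1.0 - F) ** w


def median_confidence_interval(estimates: Sequence[float]) -> Tuple[float, float]:
    """Return (T_min, T_max) for the w replication (or batch) estimates."""
    return min(estimates), max(estimates)


def estimate_coverage(
    true_value: float,
    w: int,
    m: int,
    experiments: int,
    draw: Callable[[random.Random], float],
    estimator: Callable[[List[float]], float],
    seed: int = 1,
) -> float:
    """Fraction of experiments whose interval [T_min, T_max) covers true_value.

    Each experiment uses w independent replications of m observations.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(experiments):
        estimates = [
            estimator([draw(rng) for _ in range(m)]) for _ in range(w)
        ]
        t_min, t_max = median_confidence_interval(estimates)
        if t_min <= true_value < t_max:
            hits += 1
    return hits / experiments


if __name__ == "__main__":
    w = 5
    cl = confidence_level(w)
    # Assumed toy model: i.i.d. standard normal observations, sample mean
    # as estimator, so that median = mean = 0 holds exactly.
    c = estimate_coverage(
        true_value=0.0,
        w=w,
        m=50,
        experiments=2000,
        draw=lambda rng: rng.gauss(0.0, 1.0),
        estimator=lambda xs: sum(xs) / len(xs),
    )
    print(f"CL = {cl:.4f}, observed coverage C = {c:.4f}, error CL - C = {cl - c:.4f}")
```

In this toy setting the estimator is symmetric about the true value, so the observed coverage should lie close to the nominal level for w = 5. For batch median confidence intervals, the same median_confidence_interval function would be applied to the w batch estimates instead of replication estimates.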