MIN-MAX CONFIDENCE INTERVALS

Johann Christoph Strelen
Rheinische Friedrich–Wilhelms–Universität Bonn
Römerstr. 164, 53117 Bonn, Germany
E-mail: strelen@cs.uni-bonn.de
July 2004

STOCHASTIC SIMULATION

Random input $\Longrightarrow$ random output: two different runs of the same model $\longrightarrow$ different output.

Due to the stochastic nature of simulation results, careful statistical analysis is needed to interpret the calculated values correctly. If it is omitted, there is a significant probability of making erroneous inferences about the system under study.

CONFIDENCE INTERVALS

An unknown parameter $\theta$ is to be estimated from an output process $X_1, X_2, \dots, X_n$.

Confidence interval
$I(X_1, \dots, X_n) = [L(X_1, \dots, X_n),\, U(X_1, \dots, X_n)]$
such that
$P\{\theta \in I(X_1, \dots, X_n)\} = 1 - \alpha$.

Confidence level $1 - \alpha$, where the small probability $\alpha$ is given.

The width $U(X_1, \dots, X_n) - L(X_1, \dots, X_n)$ should be small.

ESTIMATORS

Functions $T(X_1, \dots, X_n)$ for the estimation of the unknown parameter $\theta$ such that, given an output $x_1, \dots, x_n$, one may expect
$T(x_1, \dots, x_n) \approx \theta$.

$E[T(X_1, \dots, X_n)] = \theta$: $T$ is unbiased.
$\lim_{n \to \infty} E[T(X_1, \dots, X_n)] = \theta$: $T$ is asymptotically unbiased.

Statistical Theory

For the construction of the confidence interval, the probability distributions of the interval bounds, $U(X_1, \dots, X_n)$ and $L(X_1, \dots, X_n)$, are determined.

Usual assumptions:
• the $X_1, \dots, X_n$ are independent random variables
• they are identically distributed
• often: they are normally distributed

CLASSICAL CONFIDENCE INTERVALS

$\bar{Y} \pm t_{n-1,\,1-\alpha/2} \sqrt{S^2/n}$

Sample $(Y_1, \dots, Y_n)$ of independent, normally distributed random variables
Confidence level $1 - \alpha$, $0 < \alpha < 1$
Sample mean $\bar{Y} = (Y_1 + \dots + Y_n)/n$
Sample variance $S^2 = (Y_1^2 + \dots + Y_n^2)/(n - 1) - \frac{n}{n-1} \bar{Y}^2$
$t_{n-1,\,1-\alpha/2}$: the $(1 - \alpha/2)$-quantile of the Student distribution with $n - 1$ degrees of freedom

But: in simulation, the assumptions of the statistical theory are mostly not fulfilled.

Remedies

• Central limit theorem: $Y_1 + \dots + Y_n$ is nearly normally distributed if the sample $Y_1, \dots, Y_n$ is IID and $n$ is large.
• Independent replications of the simulation with different random number streams: the estimators in these runs are independent.
• Grouping consecutive results of a long simulation run into batches, which are considered to be (nearly) independent.
• Evaluating only the steady-state phase of each simulation run and ignoring the transient phase.

ACCURACY

Inaccurate confidence intervals are not unusual in simulation, e.g. assumed confidence level 90%, coverage only 80%. This means: over many different simulations, only approximately 80% of the confidence intervals contain the real value.

Comparative numerical studies: more elaborate techniques (regenerative method, autoregressive processes, spectral estimation method, standardized time series method) may be less accurate than the batch-means method and the replication/deletion method. Median confidence intervals may be even more accurate.

In long simulation runs, the accuracy of the confidence intervals is better.

Min-Max Confidence Intervals (MMCI), Median Confidence Intervals (MCI)

A new confidence interval (CI) technique for simulation results
• Easy to apply
• Accurate
• Generally applicable

Main Features

• Easy to obtain: $w$ independent replications (simulation runs), or a single simulation run with $w$ subsequent phases for batches of data; typically $w = 5$ or $6$.
• The variance of the estimator is not used; correlated output is considered implicitly. Hence a serious problem is avoided which usually arises when confidence intervals for simulation results are derived.
• Even if the variance does not exist, an MCI can be constructed, whereas a classical CI cannot.
• Sequential procedure: if a median confidence interval (MCI) is too wide for a given confidence level, it can be narrowed: each replication is extended, continuing from its last state. Batches are handled similarly.
• If a measure is estimated with a function of several estimators, an MCI can be given. Example: $\Lambda(n)$ estimates the throughput of a queue and $W(n)$ the mean waiting time. Then the product $\Lambda(n) W(n)$ estimates the mean number of customers in the queue (Little's formula).
• For some samples of independent random variables, e.g. normally distributed ones, we found MCIs which are sometimes slightly wider than the usual CIs. But such simple statistics seldom occur in simulation.
• In simulation the output is usually dependent, and the distribution is unknown. Under these circumstances classical CIs are only approximate: they are usually too narrow, the confidence level is not realistic, and the CIs too often fail to cover the real unknown value. MCIs are more accurate.
• The MCI technique is exact when the median and the mean of an estimator coincide. This holds for symmetrical distributions; the most important one in simulation is the normal distribution. Due to the central limit theorem and long simulation runs on fast computers, many estimators are nearly normally distributed.
• In principle, however, the MCI technique is not restricted to the case median = mean. In the general case one must know a single value of the estimator distribution function $F_\theta(x)$, namely the probability $F = F_\theta(\theta)$, where $\theta$ is the unknown parameter.
• Not every confidence level is possible, only the values $1 - F^w - (1 - F)^w$, $w = 2, 3, \dots$ Here, $w$ is the number of independent replications or of batches of data.
• In the special case median = mean, $F = 0.5$ holds, and the possible confidence levels are 50%, 75%, 87.5%, 93.75%, 96.875%, 98.4%, 99.2%, 99.6%, 99.8%, 99.9%, ...

The Basic Principle

Sample $X_{1,1}, \dots, X_{1,m}$ of random variables: the output of one run of a steady-state simulation or of $n$ terminating runs.
$\theta$: the unknown parameter to estimate.
$T(X_{1,1}, \dots, X_{1,m})$: the estimator, with distribution function $F_\theta(x)$.
Novel kind of confidence interval

$[\,T^{\min},\, T^{\max})$    (1)

where
$T^{\min} = \min_{1 \le i \le w} T_i$, $\quad T^{\max} = \max_{1 \le i \le w} T_i$,
and $T_i = T(X_{i,1}, \dots, X_{i,m})$, $i = 1, \dots, w$, are the estimators for $w$ independent replications $X_{i,1}, \dots, X_{i,m}$ of the sample $X_{1,1}, \dots, X_{1,m}$.

Theorem 1

The interval (1) is a confidence interval for the parameter $\theta$ with the confidence level $1 - F^w - (1 - F)^w$, i.e.
$P\{T^{\min} \le \theta < T^{\max}\} = 1 - F^w - (1 - F)^w$
holds, where $F = F_\theta(\theta)$ is the value of the estimator distribution function at $\theta$.

The Most Important Special Case: Mean = Median

Here the unknown parameter is the median of the estimator, $F_\theta(\theta) = 1/2$. This holds for unbiased estimators with symmetrical distributions, e.g. when the estimator is normally distributed. Then
$P\{T^{\min} \le \theta < T^{\max}\} = 1 - 0.5^{\,w-1}$
holds for the confidence interval, and the possible confidence levels are $1 - 0.5^{\,w-1}$, $w = 2, 3, \dots$

Batch Median Confidence Intervals

For steady-state statistics we applied the idea of the batch means method: grouping the output data into batches and assuming these batches to be independent.

A single simulation run:
– first the transient phase,
– then $w$ phases for $w$ batches of output data.

From each batch one obtains an estimate $\hat{T}_i$, $i = 1, \dots, w$.

The batch median confidence interval (BMCI) is
$[\,\min_{1 \le i \le w} \hat{T}_i,\; \max_{1 \le i \le w} \hat{T}_i\,)$.

Interesting application where F can be calculated: order statistics as estimates for quantiles

Consider a sample $X_1, \dots, X_n$ and the corresponding ordered sequence $X_{(1)}, \dots, X_{(n)}$, $X_{(i)} \le X_{(j)}$ if $i < j$, where the $X_i$ are IID with the strictly increasing distribution function $F(x)$.

The $q$-quantile $\theta = x_q$, $q \in (0, 1)$, $F(x_q) = q$, is estimated by $X_{(r)}$, $r \in \{1, 2, \dots, n\}$.

Let $F_\theta(x)$ denote the distribution function of the estimator, namely $X_{(r)}$.

Here, $F = F_\theta(x_q)$ is known:

Theorem 2

If the $q$-quantile $x_q$ is estimated by $X_{(r)}$, the min-max confidence interval (1) has precisely the confidence level of Theorem 1 with

$F = \sum_{i=r}^{n} \binom{n}{i} q^i (1 - q)^{n-i}$.    (2)

Remarks
1. Here the value $F = F_\theta(x_q)$ is independent of the actual distribution function of the sample elements $X_i$.
2. Theorem 2 is not useful for the simulation of the extremes, $q = 0$ or $q = 1$. Here one gets the confidence level 0.
3. Usually, $r \approx qn$ is chosen.

Corollary

If the sample size $n$ is odd, $r = \lceil n/2 \rceil$ and $q = 0.5$, i.e. the median is estimated, then $F = 0.5$ holds.

Confidence Intervals in Simulation are Usually Approximate

What does an approximate confidence interval mean? If many confidence intervals for a parameter of a simulation model are calculated in many simulations, the real value lies in some of them and not in the others. The coverage $C$ is the fraction of runs in which the interval contains the real value.

If the limit of this coverage equals the confidence level $CL = 1 - \alpha$, the confidence interval technique is exact, otherwise approximate: the confidence level is not reached, $CL \ne C$.

In general, the assumptions are not satisfied:
• The distribution of the estimator (normal, Student)
• Independence of the r.v. in the sample
• For some methods, other assumptions

For median confidence intervals the assumptions are weaker:
• Only symmetry of the distribution of the estimator
• Independence of the replications, not of the r.v. within them

Numerical Experience

Many simulation studies. Comparison of classical confidence interval methods with median confidence intervals or with batch median confidence intervals.

Each study: many independent simulation experiments for the estimation of the coverage of each considered confidence interval technique.

Each simulation experiment:
$w = 5$ independent replications for median confidence intervals (MCI) and for the replication/deletion method,
or
$w = 5$ batches for batch median confidence intervals (BMCI) and for the batch means method.

$w = 5$ implies a confidence level $CL = 93.75\%$ for the MCIs and BMCIs.

Measure for the accuracy: the error $CL - C$ = confidence level minus observed coverage.

1. M/M/1 Queueing System: Waiting Times (Delays)

Law and Kelton: comparative study of different well-known methods for confidence intervals. Utilization 0.8; known to be statistically difficult.

400 independent simulation experiments for each run length $n$ and each CI method, e.g. $n = 2560$ delays, $\longrightarrow$ coverage $C$.

We conducted a corresponding simulation study with the same model and the same run lengths, including batch median confidence intervals (BMCI).

Method                               $CL - C$
Batch Means                          0.102
Standardized Time Series             0.102
Spectrum Analysis                    0.067
Autoregressive Method                0.145
Regenerative Method, Classical       0.155
Regenerative Method, Jackknife       0.137
Batch Median Confidence Intervals    0.045

Errors $CL - C$

An error of 0.145 means, e.g., confidence level $CL = 90\%$ and observed coverage $C = 75.5\%$.

2. M/M/1 Queueing System

Comparison of the replication/deletion method (RD) and median confidence intervals (MCI). Low and high utilization ($\rho = 0.25$ and $0.8$). Short and long simulation runs.

$\rho$   Run     Replication/deletion   Median confidence intervals
0.25     Short   0.023                  0.017
0.25     Long    0.003                  0.002
0.8      Short   0.056                  0.043
0.8      Long    0.012                  0.004

Errors $CL - C$

Long runs: both methods are good. Short runs: MCIs are slightly better.

3. M/M/1 Queueing System, Ratios of Estimators

The same M/M/1 model as before. Comparison of jackknife intervals and median confidence intervals for the mean delay, $\hat{W}^{(r)}$, as the ratio $\hat{Q}/\hat{\lambda}$ of the mean number of jobs in the waiting room and the mean throughput.

$\rho$   Run     RD, Jackknife   Median confidence intervals
0.25     Short   0.091           0.015
0.25     Long    0.076           0.001
0.8      Short   0.121           0.037
0.8      Long    0.085           0.003

Errors $CL - C$

Median confidence intervals are more accurate.

4. Pareto distribution

We are interested in parameters of heavy-tailed Pareto distributions, $F(x) = 1 - x^{-a}$, $0 < a \le 2$, $x \ge 1$, with expectation $a/(a - 1)$ for $a > 1$ and median $2^{1/a}$; the variance does not exist.

1000 simulation experiments, each with sample size $n = 5000$.

The classical confidence interval for the expectation does not exist.
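The order-statistic variant of this Pareto experiment can be sketched in a few lines of Python. This is only an illustrative sketch, not the code used for the study: the function names are hypothetical, the Pareto variates are drawn by inverse transform, and an odd sample size is assumed so that, by the Corollary, $F = 0.5$ holds exactly for the sample median.

```python
import math
import random


def order_stat_F(n, r, q):
    """F from equation (2): P{X_(r) <= x_q} = sum_{i=r}^{n} C(n,i) q^i (1-q)^(n-i).

    Direct summation; only suitable for moderate n (very large n would need
    log-space arithmetic to avoid overflow).
    """
    return sum(math.comb(n, i) * q**i * (1 - q) ** (n - i) for i in range(r, n + 1))


def pareto_sample(a, n, rng):
    """n IID Pareto variates with F(x) = 1 - x^(-a), x >= 1, by inverse transform."""
    # 1 - rng.random() lies in (0, 1], so the result is always >= 1.
    return [(1.0 - rng.random()) ** (-1.0 / a) for _ in range(n)]


def median_estimate(sample):
    """Order statistic X_(r) with r = ceil(n/2), the estimator of the median."""
    ordered = sorted(sample)
    r = math.ceil(len(ordered) / 2)
    return ordered[r - 1]


def pareto_median_mci(a, n, w, rng):
    """Median confidence interval [T_min, T_max) from w independent replications."""
    estimates = [median_estimate(pareto_sample(a, n, rng)) for _ in range(w)]
    return min(estimates), max(estimates)
```

With `w = 5` and odd `n`, the interval returned by `pareto_median_mci` should cover the true median $2^{1/a}$ in about 93.75% of repeated experiments, whatever the value of `a`, since only the rank of the order statistic enters $F$.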
Median confidence intervals for the expectation:

$a$    $CL - C$
2      0.016      good
1.5    0.107      bad
1.1               not acceptable

Median confidence intervals for the order statistic for the median:

$a$    Confidence Interval   Median CI
2      0.082                 0.000
1.5    0.079                 0.000
1.1    0.086                 0.005

Errors $CL - C$

5. Reliability Model

The model consists of three components and functions as long as component 1 works and either component 2 or 3 works. $G_i$ is the time to failure of component $i$, $i = 1, 2, 3$, and $G = \min\{G_1, \max\{G_2, G_3\}\}$ is the time to failure of the whole system. The random variables $G_i$ are independent, and each $G_i$ has a Weibull distribution $F(x) = 1 - \exp(-\sqrt{x})$, $x > 0$.

The estimator of the expectation of $G$ has a very skewed, non-normal distribution; for small sample sizes all confidence intervals are quite inaccurate.

8000 simulation experiments, each with sample size $n = 5$ or $40$.

$n$    Classical CI   Median CI
5      0.191          0.147
40     0.069          0.032

Errors $CL - C$

Potential Further Development of the Technique

The assumption of symmetry of the estimator distribution can be dropped, and the estimator may even be biased; only $F = F_\theta(\theta)$, the value of the estimator distribution function at the unknown parameter $\theta$, must be known. Then we speak of "min-max confidence intervals" (MMCI). They are exact if the $w$ replications are independent; their confidence level is $CL = 1 - F^w - (1 - F)^w$.

Crucial problem: this value $F_\theta(\theta)$. We do not know an adequate method for estimating it efficiently.

But the MMCI idea works. We tried a brute-force procedure: very long and expensive simulations gave an empirical distribution of the r.v. $G$ of example 5; from it we obtained the distribution function $\hat{F}_\theta(x)$ of the estimator by convolution; and with an estimate $\hat{\theta}$ of the unknown parameter we obtained $\hat{F} = \hat{F}_\theta(\hat{\theta})$ and an estimate $\widehat{CL}$.

$n$    Coverage   $\widehat{CL}$
5      0.791      0.791
40     0.909      0.907

Coverages and Estimated Confidence Levels

Accurate, isn't it? But not practicable in this form.
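The construction of Theorem 1 and the coverage measure $C$ used throughout the numerical studies can be illustrated with a small Python sketch. It rests on assumptions that are not part of the slides: the function names are hypothetical, and the toy model (IID normal observations, estimator = sample mean, so $F = 0.5$) merely stands in for a real simulation run.

```python
import random


def mmci_confidence_level(w, F=0.5):
    """Confidence level 1 - F^w - (1 - F)^w of the min-max interval (Theorem 1)."""
    return 1.0 - F**w - (1.0 - F) ** w


def min_max_interval(estimates):
    """Min-max confidence interval [T_min, T_max) from w independent estimates."""
    return min(estimates), max(estimates)


def coverage_experiment(theta, w, m, experiments, rng):
    """Observed coverage C of the interval for a toy model (assumption: normal data).

    Each replication is the mean of m IID N(theta, 1) observations, so the
    estimator is symmetric around theta and F = 0.5.
    """
    hits = 0
    for _ in range(experiments):
        estimates = [
            sum(rng.gauss(theta, 1.0) for _ in range(m)) / m for _ in range(w)
        ]
        lo, hi = min_max_interval(estimates)
        if lo <= theta < hi:
            hits += 1
    return hits / experiments


if __name__ == "__main__":
    rng = random.Random(2004)
    cl = mmci_confidence_level(5)  # 0.9375 for w = 5 and F = 0.5
    c = coverage_experiment(theta=3.0, w=5, m=20, experiments=4000, rng=rng)
    print(f"CL = {cl:.4f}, observed coverage C = {c:.4f}, error CL - C = {cl - c:+.4f}")
```

In this idealized setting the error $CL - C$ should be close to zero up to sampling noise. Replacing the toy replication by dependent or skewed simulation output is where the approximation errors reported in the tables would come from.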