CDA6530: Performance Models of Computers and Networks Chapter 9:Statistical Analysis of Simulated Data and Confidence Interval Sample Mean R.v. X: E[X]=µ, Var[X]=¾2 Q: how to use simulation to derive? Simulate X repeatedly X1, , Xn are i.i.d., =statistic X Xn X i Sample mean: X¹ ´ i= 1 E [ X¹ ] = µ n ¾2 V ar ( X¹ ) = n 2 Sample Variance ¾2 unknown in simulation 2 Hard to use V ar ( X¹ ) = ¾ to measure n simulation variance Thus we need to estimate ¾2 Sample variance S2: Pn 2 ¹ ( X ¡ X ) i 2 i = 1 S = n¡ 1 n-1 instead of n is to provide unbiased estimator E [S2 ] = ¾2 3 Estimate Error Sample mean X¹ is a good estimator of µ, but has an error How confidence we are sure that the sample mean is within an acceptable error? From central limit theorm: p ( X¹ ¡ µ) n » N ( 0; 1) ¾ It means that: p ( X¹ ¡ µ) n » N ( 0; 1) S 4 Confidence Interval R.v. Z» N(0,1), for 0<®<1, define: P(Z>z®) = ® z0.025 = 1.96 P(- z0.025 <Z< z0.025) =1-2® = 0.95 S ¹ P ( X ¡ z®=2 p < µ < n S ¹ P ( X ¡ 1:96 p < µ < n S ¹ X + z®=2 p ) ¼ 1 ¡ ® n S ¹ X + 1:96 p ) ¼ 0:95 n 95% confidence interval (® = 0.05) of an estimate is: p ( X¹ § 1:96S= n) 5 When to stop a simulation? Repeatedly generate data (sample) until 100(1-®) percent confidence interval estimate of µ is less than I Generate at least 100 data values. Continue generate, until youp generated k values such that 2z®=2 S= k < I The 100(1-®) percent confidence interval of estimate is p p ( X¹ ¡ z®=2 S= k; X¹ + z®=2 S= k) 6 Fix no. of simulation runs Suppose we only simulate 100 times k=100 What is the 95% confidence interval? p p ( X¹ ¡ z®=2 S= k; X¹ + z®=2 S= k) p p ( X¹ ¡ 1:96S= k; X¹ + 1:96S= k) 7 Example: Generating Expo. Distribution Compare 100 samples and 1000 samples confidence intervals 8