Errors in Experimental Measurements

• Sources of errors
• Accuracy, precision, resolution
• A mathematical model of errors
• Confidence intervals
  • For means
  • For proportions
• How many measurements are needed for desired error?

Copyright 2004 David J. Lilja

Why do we need statistics?

1. Noise, noise, noise, noise, noise!
   OK – not really this type of noise

445 446 397 226 388 3445 188 1002 47762 432 54 12 98 345 2245 8839 77492 472 565 999 1 34 882 545 4022 827 572 597 364

2. Aggregate data into meaningful information.
   x ...

What is a statistic?

• "A quantity that is computed from a sample [of data]." (Merriam-Webster)
• → A single number used to summarize a larger collection of values.

What are statistics?

• "A branch of mathematics dealing with the collection, analysis, interpretation, and presentation of masses of numerical data." (Merriam-Webster)
• → We are most interested in analysis and interpretation here.
• "Lies, damn lies, and statistics!"

Goals

• Provide intuitive conceptual background for some standard statistical tools.
  • Draw meaningful conclusions in the presence of noisy measurements.
  • Allow you to correctly and intelligently apply techniques in new situations.
• → Don't simply plug and crank from a formula.

Goals

• Present techniques for aggregating large quantities of data.
  • Obtain a big-picture view of your results.
  • Obtain new insights from complex measurement and simulation results.
• → E.g. How does a new feature impact the overall system?

Sources of Experimental Errors

Accuracy, precision, resolution
Experimental errors

• Errors → noise in measured values
• Systematic errors
  • Result of an experimental "mistake"
  • Typically produce constant or slowly varying bias
  • Controlled through skill of experimenter
  • Examples
    • Temperature change causes clock drift
    • Forgetting to clear the cache before a timing run

Experimental errors

• Random errors
  • Unpredictable, non-deterministic
  • Unbiased → equal probability of increasing or decreasing measured value
  • Result of
    • Limitations of measuring tool
    • Observer reading output of tool
    • Random processes within system
  • Typically cannot be controlled
  • Use statistical tools to characterize and quantify

Example: Quantization → Random error

Quantization error

• Timer resolution → quantization error
• Repeated measurements: X ± Δ
• Completely unpredictable

A Model of Errors

One error source:

Error   Measured value   Probability
-E      x - E            ½
+E      x + E            ½

A Model of Errors

Two error sources:

Error 1   Error 2   Measured value   Probability
-E        -E        x - 2E           ¼
-E        +E        x                ¼
+E        -E        x                ¼
+E        +E        x + 2E           ¼

[Figure: A Model of Errors. Probability (0 to 0.6) versus measured value (x-E, x, x+E)]

[Figure: Probability of Obtaining a Specific Measured Value]

A Model of Errors

• Pr(X = x_i) = Pr(measure x_i) ∝ number of paths from real value to x_i
• Pr(X = x_i) ~ binomial distribution
• As the number of error sources becomes large (n → ∞), Binomial → Gaussian (Normal)
• Thus, the bell curve

[Figure: Frequency of Measuring Specific Values. Labels: accuracy, precision, resolution, mean of measured values, true value]

Accuracy, Precision, Resolution

• Systematic errors → accuracy
  • How close mean of measured values is to true value
• Random errors → precision
  • Repeatability of measurements
• Characteristics of tools → resolution
  • Smallest increment between measured values
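
The error model above (each of n independent error sources shifts the measurement by +E or -E with probability ½) can be checked numerically. The following is a minimal Python sketch, not part of the original slides: the function names, the true value of 100, E = 1, and the choice of 20 error sources are all assumptions made for illustration.

```python
import random
from collections import Counter
from math import comb


def path_probability(n_sources, n_positive):
    """Pr(exactly n_positive of the n_sources errors are +E) = C(n, k) / 2**n."""
    return comb(n_sources, n_positive) / 2 ** n_sources


def simulate_measurements(true_value, error, n_sources, n_trials, seed=0):
    """Count how often each measured value occurs over n_trials simulated measurements."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_trials):
        # Each error source independently adds +error or -error with probability 1/2.
        total_error = sum(rng.choice((-error, error)) for _ in range(n_sources))
        counts[true_value + total_error] += 1
    return counts


# Two error sources: measured values x - 2E, x, x + 2E (keyed here by the
# number of +E errors) with probabilities 1/4, 1/2, 1/4, as in the table.
two_source_model = {k: path_probability(2, k) for k in range(3)}

# Hypothetical values for illustration: true value 100, E = 1, 20 error sources.
counts = simulate_measurements(true_value=100, error=1, n_sources=20, n_trials=10_000)

if __name__ == "__main__":
    for value in sorted(counts):
        print(f"{value:4d} {'#' * (counts[value] // 50)}")
```

Printing the counts as a rough histogram should show the bell shape emerging as the number of error sources grows; with only two sources the three-bar distribution from the table appears instead.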
Quantifying Accuracy, Precision, Resolution

• Accuracy
  • Hard to determine true accuracy
  • Relative to a predefined standard
    • E.g. definition of a "second"
• Resolution
  • Dependent on tools
• Precision
  • Quantify amount of imprecision using statistical tools

Confidence Interval for the Mean

[Figure: distribution with area 1 - α between c1 and c2 and area α/2 in each tail]

Normalize x

$$z = \frac{\bar{x} - x}{s / \sqrt{n}}$$

$$n = \text{number of measurements}$$

$$\bar{x} = \text{mean} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

$$s = \text{standard deviation} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$

Confidence Interval for the Mean

• Normalized z follows a Student's t distribution
  • (n-1) degrees of freedom
  • Area left of c2 = 1 - α/2
  • Tabulated values for t

Confidence Interval for the Mean

• As n → ∞, normalized distribution becomes Gaussian (normal)

Confidence Interval for the Mean

$$c_1 = \bar{x} - t_{1-\alpha/2;\,n-1} \frac{s}{\sqrt{n}}$$

$$c_2 = \bar{x} + t_{1-\alpha/2;\,n-1} \frac{s}{\sqrt{n}}$$

Then,

$$\Pr(c_1 \le x \le c_2) = 1 - \alpha$$

An Example

Experiment   Measured value
1            8.0 s
2            7.0 s
3            5.0 s
4            9.0 s
5            9.5 s
6            11.3 s
7            5.2 s
8            8.5 s

An Example (cont.)

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = 7.94$$

$$s = \text{sample standard deviation} = 2.14$$

An Example (cont.)

• 90% CI → 90% chance actual value in interval
• 90% CI → α = 0.10
  • 1 - α/2 = 0.95
• n = 8 → 7 degrees of freedom

90% Confidence Interval

$$a = 1 - \alpha/2 = 1 - 0.10/2 = 0.95$$

$$t_{a;\,n-1} = t_{0.95;\,7} = 1.895$$

$$c_1 = 7.94 - \frac{1.895\,(2.14)}{\sqrt{8}} = 6.5$$

$$c_2 = 7.94 + \frac{1.895\,(2.14)}{\sqrt{8}} = 9.4$$

Tabulated t values (rows: n, the degrees of freedom; columns: a):

n     a = 0.90   a = 0.95   a = 0.975
…     …          …          …
5     1.476      2.015      2.571
6     1.440      1.943      2.447
7     1.415      1.895      2.365
…     …          …          …
∞     1.282      1.645      1.960

95% Confidence Interval

$$a = 1 - \alpha/2 = 1 - 0.05/2 = 0.975$$

$$t_{a;\,n-1} = t_{0.975;\,7} = 2.365$$

$$c_1 = 7.94 - \frac{2.365\,(2.14)}{\sqrt{8}} = 6.1$$

$$c_2 = 7.94 + \frac{2.365\,(2.14)}{\sqrt{8}} = 9.7$$

(t value from the same table, row 7, column a = 0.975.)

What does it mean?
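
One way to make the two intervals above concrete is to recompute them from the eight measurements. The following is a minimal Python sketch, not part of the original slides. It assumes the t values are copied from the table (Python's standard library has no Student's t quantile function); `confidence_interval` and `T_7_DOF` are illustrative names.

```python
from math import sqrt
from statistics import mean, stdev

# t_{1-alpha/2; 7} for 7 degrees of freedom, copied from the table in the slides.
# Keys are confidence levels (1 - alpha).
T_7_DOF = {0.90: 1.895, 0.95: 2.365}


def confidence_interval(samples, t_value):
    """Return (c1, c2) = x_bar -/+ t * s / sqrt(n) for the mean of the samples."""
    n = len(samples)
    x_bar = mean(samples)
    s = stdev(samples)  # sample standard deviation (n - 1 in the denominator)
    half_width = t_value * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width


measurements = [8.0, 7.0, 5.0, 9.0, 9.5, 11.3, 5.2, 8.5]  # seconds

c1_90, c2_90 = confidence_interval(measurements, T_7_DOF[0.90])
c1_95, c2_95 = confidence_interval(measurements, T_7_DOF[0.95])

if __name__ == "__main__":
    print(f"90% CI: [{c1_90:.1f}, {c2_90:.1f}]")
    print(f"95% CI: [{c1_95:.1f}, {c2_95:.1f}]")
```

The only quantity that changes between the two calls is the t value, which is why the 95% interval is wider than the 90% interval for the same data.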
• 90% CI = [6.5, 9.4]
  • 90% chance real value is between 6.5 and 9.4
• 95% CI = [6.1, 9.7]
  • 95% chance real value is between 6.1 and 9.7
• Why is the interval wider when we are more confident?

Higher Confidence → Wider Interval?

[Figure: the 90% interval spans 6.5 to 9.4; the 95% interval spans 6.1 to 9.7]

Key Assumption

• Measurement errors are Normally distributed.
• Is this true for most measurements on real computer systems?

Key Assumption

• Saved by the Central Limit Theorem
  • Sum of a "large number" of values from any distribution will be approximately Normally (Gaussian) distributed.
• What is a "large number?"
  • Typically assumed to be >≈ 6 or 7.

How many measurements?

• Width of interval inversely proportional to √n
• Want to minimize number of measurements
• Find confidence interval for mean, such that:
  • Pr(actual mean in interval) = (1 - α)

$$(c_1, c_2) = \big((1 - e)\,\bar{x},\ (1 + e)\,\bar{x}\big)$$

How many measurements?

$$(c_1, c_2) = (1 \mp e)\,\bar{x} = \bar{x} \mp z_{1-\alpha/2} \frac{s}{\sqrt{n}}$$

$$z_{1-\alpha/2} \frac{s}{\sqrt{n}} = \bar{x}\,e$$

$$n = \left( \frac{z_{1-\alpha/2}\, s}{\bar{x}\, e} \right)^2$$

How many measurements?

• But n depends on knowing mean and standard deviation!
• Estimate s with small number of measurements
• Use this s to find n needed for desired interval width

How many measurements?

• Mean = 7.94 s
• Standard deviation = 2.14 s
• Want 90% confidence that the mean is within a 7% interval around the actual mean.
  • α = 0.10
  • (1 - α/2) = 0.95
  • Interval width 7% → error = ± 3.5%
  • e = 0.035

How many measurements?

$$n = \left( \frac{z_{1-\alpha/2}\, s}{\bar{x}\, e} \right)^2 = \left( \frac{1.895\,(2.14)}{0.035\,(7.94)} \right)^2 = 212.9$$

• Here z_{1-α/2} is taken as 1.895, the t_{0.95;7} value used for the initial 8 measurements.
• 213 measurements → 90% chance true mean is within ± 3.5% interval

Proportions

• p = Pr(success) in n trials of binomial experiment
• Estimate proportion: p = m/n
  • m = number of successes
  • n = total number of trials
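
Stepping back to the sample-size estimate for the mean, the calculation n = (z s / (e x̄))² can be sketched in Python as follows. This is a minimal illustration, not part of the original slides; `measurements_needed` is a hypothetical helper name. The second call, which uses the normal quantile from `statistics.NormalDist` instead of the tabulated t value, is an added comparison and an assumption about how one might refine the estimate.

```python
from math import ceil
from statistics import NormalDist


def measurements_needed(x_bar, s, e, quantile):
    """n = (quantile * s / (e * x_bar))**2, rounded up to a whole measurement."""
    return ceil((quantile * s / (e * x_bar)) ** 2)


x_bar, s = 7.94, 2.14  # estimates from the 8 initial measurements (seconds)
e = 0.035              # allowed error: +/- 3.5% of the mean

# As in the slides: the tabulated t_{0.95;7} = 1.895 stands in for z_{0.95}.
n_slides = measurements_needed(x_bar, s, e, 1.895)

# Comparison (not from the slides): the normal quantile z_{0.95}, about 1.645.
z_95 = NormalDist().inv_cdf(0.95)
n_normal = measurements_needed(x_bar, s, e, z_95)

if __name__ == "__main__":
    print(n_slides, n_normal)
```

Because the t value for 7 degrees of freedom is larger than the normal quantile, the estimate from the slides is the more conservative of the two.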
Proportions

$$c_1 = p - z_{1-\alpha/2} \sqrt{\frac{p(1-p)}{n}}$$

$$c_2 = p + z_{1-\alpha/2} \sqrt{\frac{p(1-p)}{n}}$$

Proportions

• How much time does processor spend in OS?
• Interrupt every 10 ms
• Increment counters
  • n = number of interrupts
  • m = number of interrupts when PC within OS
• Run for 1 minute
  • n = 6000
  • m = 658

Proportions

$$(c_1, c_2) = p \mp z_{1-\alpha/2} \sqrt{\frac{p(1-p)}{n}} = 0.1097 \mp 1.96 \sqrt{\frac{0.1097\,(1 - 0.1097)}{6000}} = (0.1018,\ 0.1176)$$

• 95% confidence interval for proportion
• So 95% certain processor spends 10.2-11.8% of its time in OS

Number of measurements for proportions

$$(1 \mp e)\,p = p \mp z_{1-\alpha/2} \sqrt{\frac{p(1-p)}{n}}$$

$$e\,p = z_{1-\alpha/2} \sqrt{\frac{p(1-p)}{n}}$$

$$n = \frac{z_{1-\alpha/2}^2\; p(1-p)}{(e\,p)^2}$$

Number of measurements for proportions

• How long to run OS experiment?
• Want 95% confidence
• ± 0.5%
  • e = 0.005 (relative to p)
  • p = 0.1097

Number of measurements for proportions

$$n = \frac{z_{1-\alpha/2}^2\; p(1-p)}{(e\,p)^2} = \frac{(1.960)^2\,(0.1097)(1 - 0.1097)}{\big[0.005\,(0.1097)\big]^2} = 1{,}247{,}102$$

• 10 ms interrupts → 3.46 hours

Important Points

• Use statistics to
  • Deal with noisy measurements
  • Aggregate large amounts of data
• Errors in measurements are due to:
  • Accuracy, precision, resolution of tools
  • Other sources of noise
• → Systematic, random errors

Important Points: Model errors with bell curve

[Figure: bell curve. Labels: accuracy, precision, resolution, mean of measured values]
Important Points

• Use confidence intervals to quantify precision
• Confidence intervals for
  • Mean of n samples
  • Proportions
• Confidence level
  • Pr(actual mean within computed interval)
• Compute number of measurements needed for desired interval width
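
As a closing illustration of the proportion formulas, the OS-time example can be reproduced with a short Python sketch. This is not part of the original slides: `proportion_ci` and `trials_needed` are hypothetical helper names, and the quantile comes from `statistics.NormalDist` (about 1.95996) rather than the rounded 1.960, so the required number of samples comes out slightly below the 1,247,102 shown in the slides.

```python
from math import ceil, sqrt
from statistics import NormalDist


def normal_quantile(confidence):
    """z_{1-alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


def proportion_ci(m, n, confidence):
    """Return (c1, c2) = p -/+ z * sqrt(p(1-p)/n) with p = m/n."""
    p = m / n
    half_width = normal_quantile(confidence) * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def trials_needed(p, e, confidence):
    """n = z^2 * p(1-p) / (e*p)^2, where e is the error relative to p."""
    z = normal_quantile(confidence)
    return ceil(z ** 2 * p * (1 - p) / (e * p) ** 2)


# OS-time example: 6000 timer interrupts in one minute, 658 of them inside the OS.
c1, c2 = proportion_ci(m=658, n=6000, confidence=0.95)

# Samples needed for +/- 0.5% of p at 95% confidence, and the run time at 10 ms each.
n_needed = trials_needed(p=0.1097, e=0.005, confidence=0.95)
hours = n_needed * 0.010 / 3600

if __name__ == "__main__":
    print(f"95% CI: ({c1:.4f}, {c2:.4f})")
    print(f"n = {n_needed}, about {hours:.2f} hours")
```

Note how tightening the relative error from roughly ±7% (the one-minute run) to ±0.5% multiplies the run time by about 200, since n grows with 1/e².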