Quantifying measurement error
Blake Laing, Ph.D., Southern Adventist University [7/29/2015]

LEARNING OUTCOMES
1. Quantify the precision of a single measurement or a mean. Interpret confidence intervals.
2. Quantify the accuracy of a measurement or a mean using absolute error (or percentage error).
3. Quantitatively compare a measurement to an expected value, using standard error and absolute error (or percentage standard error and percent error).
4. Quantitatively compare two measurements and their associated confidence intervals.
5. Determine whether differences are likely due to random error or systematic error.
6. Estimate standard deviation or standard error from data, a histogram, or other information using the statistical interpretation of standard deviation.

BACKGROUND INFORMATION
“Measurement error” may sound like a mistake, but it simply means the uncertainty in a measured value. We distinguish among three types of measurement error: random error, systematic error, and personal error (OK, mistakes do happen). Precision is limited by random error. Accuracy is limited by systematic error.

The precision of a single measurement can be quantified¹ by the standard deviation σ. You don’t need to know the formula, but you do need to know how to interpret it.

Working definition of standard deviation: about 68% of data will be within one standard deviation of the mean, 95% within two standard deviations, and so on.

| Interval | Confidence level | “Chances” | Measurements outside CI |
|---|---|---|---|
| 𝑥̅ ± 𝜎 | 68.27% | 1 in 3 | 32% |
| 𝑥̅ ± 2𝜎 | 95.45% | 1 in 22 | 5% |
| 𝑥̅ ± 3𝜎 | 99.73% | 1 in 370 | 0.3% |

The precision of the mean is quantified by the standard error α,

$$\alpha = \frac{\sigma}{\sqrt{N}}$$

where N is the number of measurements. This is also called the error of the mean.

¹ I’m talking about the sample standard deviation here, which admittedly isn’t the best tool for small data sets. Our interpretation will assume that the data are normally distributed (a “bell curve”).
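As a rough illustration of the standard deviation and standard error described above, here is a minimal Python sketch. The readings are made-up values, not data from this lab, and the variable names are only for illustration. In the lab itself you will use the spreadsheet functions instead.

```python
import math
import statistics

# Hypothetical glucose readings in mg/dL (made-up values for illustration only).
readings = [98, 102, 95, 101, 99, 104, 97, 100, 103, 96]

n = len(readings)
mean = statistics.mean(readings)

# Sample standard deviation: precision of a single measurement.
sigma = statistics.stdev(readings)

# Standard error: precision of the mean, alpha = sigma / sqrt(N).
alpha = sigma / math.sqrt(n)

# Fraction of readings inside the 68% interval (mean - sigma, mean + sigma).
count_within = sum(1 for x in readings if abs(x - mean) <= sigma)
fraction_within = count_within / n

print(f"mean = {mean:.1f} mg/dL")
print(f"sigma = {sigma:.1f} mg/dL, alpha = {alpha:.1f} mg/dL")
print(f"fraction within one sigma = {fraction_within:.0%}")
```

With only ten readings the fraction inside one standard deviation will not be exactly 68%, which is the same behavior you should expect from your own data.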
General/University Physics I Laboratory, Southern Adventist University

To quantify the accuracy of a measurement, you must compare to an expected value. The expected value may be referred to as the “theoretical”, “known”, or “actual” value, but this is a little presumptuous, isn’t it? Perhaps your measurement is closer to the truth.

RESEARCH QUESTIONS
A small bottle of a glucose (D-glucose, also called dextrose) solution has been provided to you. Your lab instructor can tell you the mass of glucose and volume of distilled water used to make the solution. Your goal is to quantitatively answer the following three questions.
1. What is the precision of your glucometer?
2. What is the accuracy of your glucometer?
3. What is the concentration of glucose in the provided solution?

Experimental notes
The manufacturer’s instructions for the testing strips note that the reaction site must be completely full of solution. Also note that the drop must be placed in the tip of the strip, not on top.

Measurements mean nothing without units. For repeated measurements record units in the column label.

Always use (non-erasable) pen to record data and never use correction fluid. It’s OK to make mistakes! Just cross your mistake out once.

CALCULATIONS

INDIVIDUAL DATA
1. Enter your measured data in the provided spreadsheet template then have only one person from your group copy your glucose concentrations and paste into the provided cloud-based spreadsheet for the whole-class data. Anyone can view the class data spreadsheet, but if the whole class attempts to edit at once, we could experience problems.
2. In Calculations Table 1, calculate the mean (or average), minimum, and maximum glucose concentrations. Note that a “hint box” has been provided in the spreadsheet, which suggests that you use spreadsheet functions such as Average, Min, and Max for this.
3. Create a histogram manually.
   a. You’ll divide the range of possible measurements into “bins” of equal width.
      The number of data points within each bin is called the frequency. You’ll count manually² to determine each frequency. An easy way to do this is to sort the data.
   b. I suggest choosing a bin size such that most of the data is within 9 bins. Making these choices is sometimes called “binning the data”. It’s a balancing act: too many bins would make all frequencies either 0 or 1, and too few bins would put all data in one bin. You might as well make a choice that makes the histogram look the most like a “bell curve”.
   c. Add the frequencies and verify that the result is what you should expect. If not, then something is wrong.

   ² Yes, there is a way to do this automatically using the Frequency function. We used to do it that way, but it seemed that people didn’t learn enough.

4. Quantify the precision of each measurement by calculating the (sample)³ standard deviation 𝜎 using the function STDEV() in Calculations Table 2.
   a. The 68% confidence interval for each measurement is the range of values in the interval (𝑐̅ − 𝜎, 𝑐̅ + 𝜎), where 𝑐̅ is the mean concentration. Calculate the limits of this range and determine the percentage of measurements which are within this range. It won’t be exactly 68%.
   b. The 95% confidence interval for each measurement is the range of values in the interval (𝑐̅ − 2𝜎, 𝑐̅ + 2𝜎). Determine the percentage of measurements which are within this range.
5. Quantify the precision of the mean of all measurements by calculating the standard error α. Calculate the end points of the 68% confidence interval (𝑐̅ − α, 𝑐̅ + α). Calculate the end points of the 95% confidence interval (𝑐̅ − 2α, 𝑐̅ + 2α).
6. At the 68% confidence limit, the uncertainty in the mean value is α. If, for instance, α were 0.5 mg/dL (uncertainties only need one significant figure), then the tenths place of the mean is the least significant digit.
   Go back and adjust the significant figures on the mean value accordingly. Always use this information to display only significant digits in all calculated mean values. Adjust the number of digits displayed in the spreadsheet using the “magic points” box, which changes how many significant figures are displayed (the whole, unrounded number is still used in calculations).

   Figure 1: Mario always hits the magic points box before proceeding with his calculations.

7. The precision of each measured value and of the mean of many values has now been described. Now describe the accuracy of the mean value by comparing to the expected value. Calculate the expected glucose concentration in mg/dL in Calculations Table 3. Compare by taking the absolute error and the percentage error. Pay attention to significant figures. If possible, do not manually enter numbers to obtain correct significant figures.
8. Compare the measured mean value to a single measurement using a different kind of glucometer in Calculations Table 4. The right tool for this job is the percent difference.
9. Some students will lose points all semester long for incorrect “sig figs”. Compare with a classmate to be sure that you are only displaying significant digits, then check with the Learning Assistant (LA) or instructor.

³ There are two kinds of standard deviation, and in this course we will only use the sample standard deviation.

CLASS DATA
1. Paste in data from the online class data spreadsheet. If using a browser other than Internet Explorer, you might not be able to copy and paste all the data at once.
2. Create a histogram as before.
3. Visually estimate what the standard deviation σ must be such that 68% of the data is within one σ of the mean. The provided spreadsheet template will draw “error bars” representing the width of the 68% confidence interval based on the estimated value for σ, which you control using the “slider”.
   Also, the number of data points within your estimated standard deviation is counted for you. You can calculate this number as a percentage.
4. Compute the actual standard deviation and check to see if your estimate was close. Notice whether the standard deviation has changed much with the additional data points. It shouldn’t, unless there are more outliers than we should expect.

THIS IS OUTRAGEOUS!
First of all, the results of your measurements for aqueous glucose cannot immediately be applied to determine the uncertainty when measuring capillary blood. It is true, however, that a recent study [1] found that many of the glucometers on store shelves do not meet the minimum FDA requirements for accuracy and precision, and yet millions of patients rely on these devices to make medical decisions. If you would like to learn more about public advocacy efforts to correct this, visit http://www.stripsafely.com/strip/

REFERENCES
[1] G. Freckmann et al., J. Diabetes Sci. Technol. 6(5), 1060-75 (2012)

PRE-LABORATORY INVESTIGATION (DUE AT THE BEGINNING OF LAB)
1. Draw lines matching the quantity that is the right tool for each job.

| Jobs | Tools |
|---|---|
| Quantify precision of the mean | Absolute error |
| Quantify accuracy | Percent error |
| Quantify random error in each measurement | Standard deviation |
| Quantify systematic error | Standard error |

2. Fatima measures the acceleration of gravity at her location from the mean of 100 measurements to be 𝑔̅ = 9.71439 m/s² with a standard deviation of σ = 0.241 m/s².
   a. Fatima wants to state with 95% confidence that the uncertainty due to random error in her mean value is not more than a certain value. What is the uncertainty in 𝑔̅?
   b. Fatima compares her measured value of 𝑔̅ to the expected value for this location: 9.79660 m/s², where all digits are significant. What is the absolute error? Suggestion: keep one extra digit in all calculations and put a line over the least significant digit.
   c.
      Absolute error is a measure of systematic error. We don’t know whether this absolute error is large or small until we compare it to something. One comparison is to the expected value. Calculate the percent error, being careful with significant figures.
   d. Absolute error can also be compared to the standard error by taking a ratio (“how many standard errors away is Fatima’s result?”). There is a 32% probability that random error could cause the absolute error to be more than one standard error. There’s a 0.3% probability for three standard errors. Using this kind of reasoning, what can you say about the likelihood that the difference between Fatima’s measurement and the expected value is due to random error?

RAW DATA (YOU’RE USING A NON-ERASABLE PEN, RIGHT?)
Name, date:
Working with:

Reli-on Prime measurements
Record units in the column header.

| trial | Glucose ( ) | trial | Glucose ( ) | trial | Glucose ( ) |
|---|---|---|---|---|---|
| 1 | | 21 | | 41 | |
| 2 | | 22 | | 42 | |
| 3 | | 23 | | 43 | |
| 4 | | 24 | | 44 | |
| 5 | | 25 | | 45 | |
| 6 | | 26 | | 46 | |
| 7 | | 27 | | 47 | |
| 8 | | 28 | | 48 | |
| 9 | | 29 | | 49 | |
| 10 | | 30 | | 50 | |
| 11 | | 31 | | | |
| 12 | | 32 | | | |
| 13 | | 33 | | | |
| 14 | | 34 | | | |
| 15 | | 35 | | | |
| 16 | | 36 | | | |
| 17 | | 37 | | | |
| 18 | | 38 | | | |
| 19 | | 39 | | | |
| 20 | | 40 | | | |

Expected concentration

| Glucose mass ( ) | Water volume ( ) | Glucose concentration ( ) |
|---|---|---|
| | | |

QUESTIONS
1. Does the standard deviation get much smaller as more measurements are taken? How about the standard error? Answer these questions quantitatively by making a table of the standard deviation and standard error for 5, 25, and 50 data points using your data, and for all points of the class data. Explain whether σ or α would be appropriate to describe the precision of the glucometer.

| Number | σ ( ) | α ( ) |
|---|---|---|
| 5 | | |
| 25 | | |
| 50 | | |

2. Compare the 68% confidence interval of the measured mean glucose concentration graphically. Draw on the number line below a line indicating the expected and the mean value.
   Graphically illustrate this comparison below by drawing (to scale on the same horizontal axis) error bars representing the range 𝑐̅ ± α = (𝑐̅ − α, 𝑐̅ + α) and indicate the position of the expected value. Label the tick marks on the horizontal axis. For example, does your data look like this or this?
3. Quantitatively compare your mean glucose concentration to the expected value by calculating “how many standard errors away” the expected value is from the mean value. Is the difference likely due to random error in the determination of the mean value? A compelling argument would include probability.
4. How many data points on the whole-class histogram are within one standard deviation of the average? The bin counts are displayed on the histogram. What did you expect the count to be?
5. Quantitatively compare the mean glucose measurement for the class data to the single measurement made by a different glucometer. “Quantitative” means using numbers to make your case. (You say the difference is small? Compared to what?)
6. Glucose monitors in the United States are regulated by the FDA and are subject to the ISO standard that 95% of measured results (using whole blood) must be within ±20% of the true value. Suppose that the glucometers used in this lab were correctly calibrated such that the average 𝑐̅ is the true value, and that if we had used whole blood we would have obtained the same standard deviation. Would these devices conform to the ISO standard? Justify.

ADDITIONAL QUESTIONS TO DISCUSS (WILL NOT BE GRADED)
It’s your first day in lab, and maybe the first time you’ve had to think so hard about how to answer quantitative questions. Here are some additional questions that I find interesting. I like to make my test questions interesting.
7.
   Suppose that a friend has been diagnosed with diabetes and has been taking 4 glucose readings per day. She is distressed by the occasionally wild variations in her readings. If she were using the glucometer tested today, explain to her what kind of differences should be expected between the measured value and the actual value, both for each measurement and for the weekly average of measurements, in terms of percentages (not mentioning σ or α) so she doesn’t drive herself crazy.

   If 20 is the standard deviation, then about 2/3 (68%) of the time a value will be off by less than 20, but about 32% of the time it will be off by more than 20. We can’t say what range each measurement will be in, because her actual glucose concentration won’t be static like our solution.
8. Unbeknownst to many, glucometer manuals often call for periodic calibration using a control solution. If the solution provided to you were an appropriate control solution, would you say that calibration is necessary, or is it likely that the difference is due to random error?
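
The “how many standard errors away” reasoning used in several of the questions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mean, expected value, and standard error below are made-up numbers, the function names are hypothetical, and the probability assumes normally distributed random error.

```python
import math

def standard_errors_away(mean, expected, alpha):
    """Ratio of the absolute error to the standard error."""
    return abs(mean - expected) / alpha

def prob_at_least(z):
    """Probability that random error alone puts the mean at least
    z standard errors from the true value (normal distribution assumed)."""
    return math.erfc(z / math.sqrt(2))

# Hypothetical values in mg/dL, for illustration only.
mean, expected, alpha = 99.5, 102.0, 1.0

absolute_error = abs(mean - expected)
percent_error = 100 * absolute_error / expected
z = standard_errors_away(mean, expected, alpha)

print(f"absolute error = {absolute_error:.1f} mg/dL")
print(f"percent error = {percent_error:.1f}%")
print(f"{z:.1f} standard errors away")
print(f"probability from random error alone = {prob_at_least(z):.1%}")
```

A small probability suggests that random error alone is an unlikely explanation, so systematic error (for example, calibration) becomes the more plausible one. Where to draw that line is a judgment you should state and defend.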