Compilation: Statistics in Student Reports (2006)

Date: Mon, 10 Jul 2006
From: "SULLIVAN, PETER"
Subject: statistics used by modelers

I am trying to figure out what statistics to use when I have students report uncertainties. In the past I have had them use the Average Deviation of the Mean (ADM). A more common statistic would be the standard deviation, yet my understanding is that the standard deviation is only appropriate for data sets with 30 or more trials, whereas the ADM is appropriate for smaller data sets. Do any modelers use either of these statistics, and have a reason why one is more appropriate than the other?

------------------
Date: Thu, 20 Jul 2006
From: Tim Erickson

Using any of these is appropriate as long as we recognize what they mean. They are all measures of spread, just as the mean and the median are measures of center. You could use:

- Average [absolute] Deviation [from] the Mean, also known as the Mean Absolute Deviation (MAD)
- Standard Deviation (SD)
- Interquartile Range (IQR): the distance between the 25th and 75th percentiles of the data, and the width of the box in a box plot
- Range: the maximum minus the minimum value
- Standard ERROR (SE), which is probably what you're looking for: SE = SD/sqrt(N)
- Or any other statistic, including one of your own design, that measures how spread out the data are

The idea of a magic number of 30 doesn't necessarily apply here. The key is to have the uncertainty mean what you think it does. To that I offer three responses: the first a more or less official response, which I find unsatisfying, and two more that may be more what we're about in modeling.

In order to do this, we need a context where you would need such an uncertainty. Suppose you're rolling balls down a ramp from 50, 100, 150, 200, 250 cm, and students are using stopwatches to time the roll. They roll the balls 5 times from each height and average the results.
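The measures of spread listed above are easy to compute by hand or in a few lines of code. A minimal sketch, using only the Python standard library and a made-up set of five stopwatch readings (the numbers are hypothetical, not from the ramp experiment):

```python
from statistics import mean, quantiles, stdev

times = [1.02, 0.95, 1.08, 0.98, 1.01]  # five stopwatch readings, in seconds (made up)

m = mean(times)
mad = mean(abs(t - m) for t in times)   # Mean Absolute Deviation (MAD)
sd = stdev(times)                       # sample Standard Deviation (SD)
rng = max(times) - min(times)           # Range
se = sd / len(times) ** 0.5             # Standard Error of the mean: SD/sqrt(N)

# IQR from the quartile cut points (one of several interpolation conventions):
q1, _, q3 = quantiles(times, n=4)
iqr = q3 - q1
```

Any of these would serve as a reported "uncertainty", as long as the class agrees on which one it is using and what it means.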
Then you plot average time versus distance, and do your best to fit the square-root curve to the result. The question is: what is the uncertainty in the measurement of time? And what do we do with that?

PLAN A: Official Stats

If you consider the measurements to be a sample from a distribution of all possible measurements, we ask: what's the mean of the "source" population? We cannot answer that directly, but we can say that, for any reasonable distribution of source measurements, the distribution of the MEAN of a sample of measurements will be normal, with a standard deviation equal to (the standard deviation of the source) divided by (the square root of the number of measurements in the sample). This quantity is called the standard error (SE).

Without confusing the issue with more statsy gobbledygook: many scientists, in this situation, use error bars with a length of two (well, 1.96) SE to represent 95% confidence (NOT probability) that the "true" value is within the error bars. You could use 3 SE to give you 99% confidence.

Example: at 100 cm, students measure 9 times and get a mean of 1.00 seconds with a standard deviation of 0.06 seconds. One standard error, then, is 0.06/sqrt(9) = 0.02 seconds. So the error bar will extend from 0.96 to 1.04 seconds, that is, 1.00 seconds +/- 2 SE. Then, when you try to fit a curve, you make it go through the error bars as well as you can.

Error bars are nice because they help students see that the measurements are uncertain. Notice that when using SE, the error bars get smaller as the sample size increases. That's because we're asking where we think the MEAN of the underlying "truth" is, and we know that better with bigger samples, even though the spread of the individual data values may remain the same (that is, students don't get any more precise because there are more measurements, but more measurements give you a more precise mean value).
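The arithmetic in the worked example above (N = 9, mean 1.00 s, SD 0.06 s) can be checked in a couple of lines:

```python
from math import sqrt

n, mean_t, sd = 9, 1.00, 0.06   # the values from the example above

se = sd / sqrt(n)                # SE = SD/sqrt(N) = 0.06/3 = 0.02 s
low = mean_t - 2 * se            # lower end of the ~95% error bar
high = mean_t + 2 * se           # upper end: bar runs from 0.96 to 1.04 s
```

The same two lines, with 3 in place of 2, give the 99%-confidence bar.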
The disadvantage is that students have yet another formula to remember, and the rationale behind it is rather deep and subtle, depending as it does on the Central Limit Theorem (ewwww...) and other things we will not discuss.

PLAN B: Your Own Measure

The Standard Error (SE) is official. It has particular properties that practicing scientists use to their advantage. But that may not be relevant to students in our classes; in fact, it may be confusing. So use whatever measure you want. Many students find the MAD understandable. A disadvantage is that the uncertainty doesn't go down as N increases. This may not be a problem.

PLAN C: Don't Do It At All!

As mentioned in a recent post, there are some situations (like this one) where you may get what you need without doing the calculation. Don't calculate the mean or any uncertainty. Instead, plot all of the data. The spread of points acts like a visual error bar, reinforcing the uncertainty in the measurements. Then, when you fit a curve, make it go through the clouds of points as well as possible. The human eye and brain naturally make adjustments that kinda-sorta do the sqrt(N) thing.

Having said all that, we mean different things by uncertainty in measurement. I've been trying to address the issue of repeated measurements that we believe to be unbiased, when we think there is some underlying "true" value -- and we get different values because of measurement error or some other kind of fluctuation that we cannot model.
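The Plan B point, that the MAD stays roughly constant while the SE shrinks like 1/sqrt(N), can be seen in a small simulation. This is a hedged sketch, not from the original discussion: the data are simulated Gaussian noise around an assumed "true" time of 1.0 s with an assumed spread of 0.06 s.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the comparison is repeatable

def spread_stats(n):
    """Simulate n stopwatch readings; return (MAD, SE) for the sample."""
    data = [random.gauss(1.0, 0.06) for _ in range(n)]
    m = mean(data)
    mad = mean(abs(x - m) for x in data)   # stays near 0.05 regardless of n
    se = stdev(data) / n ** 0.5            # shrinks roughly like 1/sqrt(n)
    return mad, se

mad5, se5 = spread_stats(5)
mad500, se500 = spread_stats(500)
# MAD is about the same at both sample sizes; SE is roughly 10x smaller at N = 500.
```

So if students report MAD, more trials will not make the reported uncertainty smaller; if they report SE, it will.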