Statistics in student reports

Compilation: Statistics in Student Reports (2006)
Date: Mon, 10 Jul 2006
From: "SULLIVAN, PETER"
Subject: statistics used by modelers
I am trying to figure out what statistics to use when I have students report uncertainties. In the
past I have had them use Average Deviation of the Mean (ADM). However, a more common
statistic would be the standard deviation. Yet my understanding is that this statistic is only
appropriate for data sets with 30 or more trials, whereas the ADM is appropriate for smaller-size
data sets. Do any modelers use either of these stats and have a reason why one is more
appropriate than the other?
------------------
Date: Thu, 20 Jul 2006
From: Tim Erickson
Using any of these is appropriate as long as we recognize what they mean. They are all measures
of spread, just as the mean and the median are measures of center. You could use:
- Average [absolute] Deviation [from] the Mean (also known as Mean Absolute Deviation, MAD)
- Standard Deviation (SD)
- Interquartile Range (IQR): the distance between the 25th and 75th percentiles in the data, and the width of the box in a box plot
- Range: the maximum minus the minimum value
- Standard ERROR (SE), which is probably what you're looking for. SE = SD/sqrt(N)
- Or… any other statistic, including one of your own design, that measures how spread out the data are.
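All of these are easy to compute. Here is a minimal sketch using only the Python standard library; the stopwatch readings are invented for illustration:

```python
import statistics

def spread_measures(data):
    """Return the spread statistics listed above for a list of numbers."""
    n = len(data)
    mean = statistics.mean(data)
    sd = statistics.stdev(data)                   # sample standard deviation (SD)
    mad = sum(abs(x - mean) for x in data) / n    # mean absolute deviation (MAD)
    q1, q2, q3 = statistics.quantiles(data, n=4)  # quartiles
    iqr = q3 - q1                                 # interquartile range (IQR)
    rng = max(data) - min(data)                   # range
    se = sd / n ** 0.5                            # standard error of the mean (SE)
    return {"MAD": mad, "SD": sd, "IQR": iqr, "range": rng, "SE": se}

times = [1.02, 0.95, 1.08, 0.99, 1.01]  # hypothetical stopwatch readings (s)
print(spread_measures(times))
```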
The idea of a magic number of 30 doesn't necessarily apply here. The key is to have the
uncertainty mean what you think it does. To that I offer three responses, the first a more or less
official response, which I find unsatisfying, and two more that may be more what we're about in
modeling. In order to do this, we need a context where you would need such an uncertainty.
Suppose you're rolling balls down a ramp from 50, 100, 150, 200, 250 cm, and students are using
stopwatches to time the roll. They roll the balls 5 times from each height and average the results.
Then you plot average time versus distance, and do your best to fit the square-root curve to the
result. The question is, what is the uncertainty in the measurement of time? And what do we do
with that?
PLAN A: Official Stats
If you consider the measurements to be a sample from a distribution of all measurements, we ask,
what's the mean of the "source" population? We cannot answer that directly, but we can say that
with any reasonable distribution of source measurements, the distribution of the MEAN of a
sample of measurements will be normal with a standard deviation equal to (the standard
deviation of the source) divided by (the square root of the number of measurements in the
sample). This quantity is called the standard error (SE). Without confusing the issue with more
statsy gobbledygook, many scientists, in this situation, use error bars with a length of two (well,
1.96) SE to represent 95% confidence (NOT probability) that the "true" value is within the error
bars. You could use 3 SE to give you 99% confidence. Example: at 100 cm, students measure 9
times and get a mean of 1.00 seconds with a standard deviation of 0.06 seconds. One standard
error, then, is 0.06/sqrt(9) = 0.02 seconds. So the error bar will extend from 0.96 to 1.04 seconds,
that is, 1.00 seconds +/- 2 SE. Then, when you try to fit a curve, you make it go through the error
bars as well as you can. Error bars are nice because they help students see that the measurements
are uncertain. Notice that using SE, the error bars get smaller as the sample size increases. That's
because we're asking for where we think the MEAN of the underlying "truth" is, and we know
that better with bigger samples, even though the spread of the individual data values may remain
the same (that is, students don't get any more precise because there are more measurements, but
more measurements give you a more precise mean value). The disadvantage is that students have
yet another formula to remember, and the rationale behind it is rather deep and subtle, depending
as it does on the Central Limit Theorem (ewwww...) and other things we will not discuss.
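The worked example above (9 timings at 100 cm, mean 1.00 s, SD 0.06 s) reduces to a few lines of arithmetic; the numbers below are the ones quoted in the text:

```python
import math

n, mean, sd = 9, 1.00, 0.06      # values from the example above
se = sd / math.sqrt(n)           # standard error = SD/sqrt(N) = 0.02 s
lo = mean - 2 * se               # bottom of the ~95% error bar
hi = mean + 2 * se               # top of the ~95% error bar
print(f"SE = {se:.2f} s; error bar from {lo:.2f} to {hi:.2f} s")
```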
PLAN B: Your Own Measure
The Standard Error (SE) is official. It has particular properties that practicing scientists use to
their advantage. But that may not be relevant to students in our classes, in fact, it may be
confusing. So use whatever measure you want. Many students find MAD understandable. A
disadvantage is that you don't get the uncertainty going down as N increases. This may not be a
problem.
PLAN C: Don't Do It At All!
As mentioned in a recent post, there are some situations (like this one) where you may get what
you need without doing the calculation. Don't calculate the mean or any uncertainty. Instead, plot
all of the data. The spread of points acts like a visual error bar, reinforcing the uncertainty in the
measurements. Then when you fit a curve, make it go through the clouds of points as well as
possible. The human eye and brain naturally make adjustments that kinda-sorta do the sqrt(N)
thing.
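Plan C can also be done numerically rather than by eye: keep every raw timing and fit t = k·sqrt(d) through the whole cloud by least squares. The data below are invented to mimic the ramp setup described earlier:

```python
import math

# five hypothetical timings at each ramp distance (cm -> seconds)
raw = {
    50:  [0.72, 0.70, 0.75, 0.69, 0.74],
    100: [1.02, 0.98, 1.05, 0.96, 1.00],
    150: [1.21, 1.25, 1.19, 1.24, 1.22],
    200: [1.42, 1.39, 1.45, 1.40, 1.41],
    250: [1.58, 1.61, 1.56, 1.60, 1.57],
}

# flatten to (distance, time) pairs -- no averaging per height
pairs = [(d, t) for d, ts in raw.items() for t in ts]

# least-squares slope for t = k*sqrt(d):  k = sum(t*sqrt(d)) / sum(d),
# since sum((sqrt(d))^2) over the pairs is just sum(d)
k = sum(t * math.sqrt(d) for d, t in pairs) / sum(d for d, _ in pairs)
print(f"fitted k = {k:.4f} s/sqrt(cm)")
```

Because every point carries equal weight, heights with more scatter automatically pull the fit less per point, which is the kinda-sorta sqrt(N) effect mentioned above.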
Having said all that, we mean different things by uncertainty in measurement. I've been trying to
address the issue of repeated measurements that we believe to be unbiased, when we think there
is some underlying "true" value -- and we get different values because of measurement error or
some other kind of fluctuation that we cannot model.