Lab 2: Exercises in Uncertainty

A key concept in any lab science is the uncertainty in measurements. No instrument or method is a perfect tool. No matter how expensive or advanced the equipment may be, every measurement will have some amount of "error" involved. In this lab we will explore the methods used to quantify and handle this problem.

Note that the key word in the last sentence is quantify. In layperson's terms, "quantify" means "stick a number on it". Numbers should be used to describe the quality of data, for example, "We found that gravity was 9.6 ± 0.2 m/s²". This description not only tells the reader the experimental value for gravity, it also tells the reader exactly how good the measurement was.

Most students are familiar with reading the uncertainty in a ruler. For a given ruler, a measurement can only be made to perhaps ±0.5 mm. We assume that students are competent in measuring and stating this kind of uncertainty, and will spend today's lab concentrating on more powerful methods.

Circles, Measurements and Graphs

To start, we are going to look for circles. Take a piece of string and a caliper and find as many circles (or more likely, cylinders) as you can in ten minutes. Use the caliper to measure the diameter of each circle, and wrap the string around the circle to measure the circumference. Open Microsoft Excel on one of the computers and make a table with the diameters in one column and the circumferences in the other.

Next we will make a graph. Pick the Chart Wizard option from the toolbar. The button looks like this: [image: Chart Wizard button]. Pick X-Y Scatter Plot and take the default option. Add a title, labels and units to your chart. Click Finish and Excel will generate a graph.

Under the Chart menu, click on "Add Trendline". In the options, choose to display the equation and the R-squared value. This will put a least-squares fit on your graph and give the slope. The R-squared number should be close to one. If it is not, it means that some of your data may be nonlinear.
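For readers who want to see what the trendline is doing behind the scenes, here is a minimal Python sketch of an ordinary least-squares fit. The function name linear_fit and the data values are made up for illustration and are not part of the lab. The standard errors it reports use the textbook formulas for a straight-line fit, which are the same kind of slope and intercept uncertainties that Excel's LINEST returns.

```python
import math

def linear_fit(x, y):
    """Ordinary least-squares fit of y = slope*x + intercept.

    Returns slope, intercept, their standard errors, and R-squared.
    (Hypothetical helper written for this illustration.)
    """
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    # Residual and total sums of squares give R-squared.
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot

    # Standard errors use n - 2 degrees of freedom (two fitted parameters).
    s = math.sqrt(ss_res / (n - 2))
    slope_err = s / math.sqrt(sxx)
    intercept_err = s * math.sqrt(sum(xi ** 2 for xi in x) / (n * sxx))
    return slope, intercept, slope_err, intercept_err, r_squared

# Made-up caliper and string readings in cm (illustrative only, not lab data).
diameters = [2.1, 3.4, 5.0, 6.6, 8.3, 10.2]
circumferences = [6.7, 10.6, 15.8, 20.6, 26.2, 31.9]

slope, intercept, slope_err, intercept_err, r2 = linear_fit(diameters, circumferences)
print(f"slope     = {slope:.3f} +/- {slope_err:.3f}")
print(f"intercept = {intercept:.3f} +/- {intercept_err:.3f}")
print(f"R-squared = {r2:.4f}")
```

Here the diameters are treated as the x values and the circumferences as the y values, so the fitted slope can be compared directly with the value you expect for a circle.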
Note that if your R-squared number is exactly equal to one, this is often a sign that you have made some kind of error, as it implies straight-line data with no "noise".

Before you do the next step, save your data and graphs. Excel is a little quirky and can freeze while using the following function, so make sure that you will not have to redo all of your work.

We've found the slope in the graph, but what's the uncertainty in that slope? Excel can tell us! We will use the LINEST feature in Excel. Go to an empty cell in Excel and type

=LINEST(known Y value cells, known X value cells, 1, true)

where "known Y value cells" are the data entries that correspond to the circumferences and "known X value cells" are the diameters. For example, if we had eight rows of data with the diameters in column A and the circumferences in column B, we would type the following:

=LINEST(B1:B8, A1:A8, 1, true)

When we type this in, Excel will return a single value, the slope. LINEST is an array function, which means that even though it is trying to give back a lot of values, it only displays one per cell. What Excel is expecting you to do is highlight a 2x2 set of cells with the LINEST formula in the upper-left cell. Once this is done, go up to the formula bar, highlight the entire LINEST equation you typed in and hit Ctrl-Shift-Enter (on a Mac use Apple-Return). This will fill in the 2x2 square. Upper left will be the slope, lower left will be the uncertainty on the slope. Upper right will be the intercept and lower right will be the uncertainty on the intercept.

A little reflection on circles will tell you exactly what the slope and intercept should be. Do these true values fall inside your uncertainties?

Repeated Measurements

When a scientist performs an experiment, he or she will often do it multiple times. Why? Because repeated measurements allow random measurement errors to cancel. The average (mean) of several measurements is almost always a better value than just one measurement. The reported value for your measurements will be the mean value.
For students who need a refresher, the mean is the sum of all of the measurements divided by the number of measurements.

But what about a statement like "We found that gravity was 9.6 ± 0.2 m/s²"? It was stated in the previous paragraph that more measurements imply a better estimate of the true value. How is that expressed in the statement of the measurement? It is reflected in the ±0.2 part. If we took more measurements, we could shrink this number.

Here we need to introduce a concept known as the standard error. Many students know about the standard deviation from their statistics classes. A close relative is the sample standard deviation. It is defined as:

$$\sigma = \left[ \frac{\sum_i (x_{\text{mean}} - x_i)^2}{n - 1} \right]^{1/2}$$

Hard to follow? Well, let's take the following numbers:

18  15  22  14  8

These total 77, so the mean value is 77/5 = 15.4. To calculate the sample standard deviation we first need to sum the squares of the differences:

$(18 - 15.4)^2 = 6.76$
$(15 - 15.4)^2 = 0.16$
$(22 - 15.4)^2 = 43.56$
$(14 - 15.4)^2 = 1.96$
$(8 - 15.4)^2 = 54.76$

That sums to 107.2. Next we need to divide by the number of measurements minus one and then take the square root. So our final value will be $(107.2/4)^{1/2} = 5$. Why 5 rather than 5.177? Because uncertainties are normally reported with only one sig fig (an exception is made for leading ones, hence 1.7 would be reported, even though it consists of two sig figs).

So the ±0.2 part in the gravity measurement is the sample standard deviation, right? No, it is actually something called the standard error. The standard error is the sample standard deviation divided by the square root of the number of measurements. In this case, the standard error would be $5.177/\sqrt{5} = 2$ (2.3 for those of you who want to see the next digit). The standard error gives the proper scaling for the reduced uncertainty of repeated measurements.

At this point, you may feel like all this math makes you want to drop physics and change careers.
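The arithmetic in the worked example can be checked with a few lines of Python. This is only a sketch using the standard library's statistics module and the five numbers from the example; the variable names are our own choice.

```python
import math
import statistics

# The five example measurements from the text.
measurements = [18, 15, 22, 14, 8]

mean = statistics.mean(measurements)
# statistics.stdev is the sample standard deviation (n - 1 in the denominator).
sample_sd = statistics.stdev(measurements)
# Standard error: sample standard deviation over the square root of n.
std_error = sample_sd / math.sqrt(len(measurements))

print(f"mean                      = {mean}")
print(f"sample standard deviation = {sample_sd:.3f}")
print(f"standard error            = {std_error:.3f}")
```

The unrounded results agree with the values quoted above (about 5.177 and 2.3) before they are trimmed to one significant figure.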
These numbers are used so often that there is no need to calculate them by hand; just about any mathematics software will generate them for you. In Excel, enter one of the data sets into a column. Then go to the Tools menu and select Data Analysis. This will open a box with many options. Choose Descriptive Statistics and click OK. This will open a new box which prompts for an input range; select your column of data. For output options, choose Summary Statistics. Excel will then produce a set of numbers. The first is the mean and the second is the standard error, the two quantities we want.

Propagation of Error

How do the uncertainties in the more basic measurements come into play? For multiplication, we have a simple rule. We define the percentage uncertainty to be the uncertainty divided by the mean. In other words, for our gravity example we would take ±0.2 and divide by 9.6 to get a 2.1% percentage uncertainty. The percentage uncertainty in a product is the square root of the sum of the squares of the percentage uncertainties of its factors. Again, that's too many words when a formula will do the job much better:

$$\frac{\pm c}{C} = \left[ \left(\frac{\pm a}{A}\right)^2 + \left(\frac{\pm b}{B}\right)^2 \right]^{1/2}$$

Here ±a, ±b and ±c are the uncertainties in A, B and C. The above formula assumes that C = AB, but it works the same way for C = A/B. Note that the rules are slightly different for exponents: the percentage uncertainty in d² is actually twice the percentage uncertainty in d.

The uncertainty in a sum roughly follows the uncertainty in a product in that we use a Pythagorean-style formula. For the uncertainty in a sum we would write:

$$\pm c = \left[ (\pm a)^2 + (\pm b)^2 \right]^{1/2}$$

In other words, we use the uncertainties themselves rather than the respective percentage uncertainties.

In today's lab your instructor will give you three sets of numbers. These sets correspond to imaginary measurements of quantities A, B and C. Calculate your answer for A by hand, and then use Excel to calculate B and C. When that is completed, use the error propagation techniques to calculate a value for C/(A+B).
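As a rough illustration of how the two rules combine, the sketch below propagates uncertainties through a sum and then a quotient. The helper names add and divide and all of the numbers are hypothetical, chosen only to make the arithmetic easy to follow; they are not the data sets your instructor will hand out. The quadrature formulas also assume that the individual uncertainties are independent of one another.

```python
import math

def add(a, da, b, db):
    """Sum a + b: absolute uncertainties combine in quadrature."""
    return a + b, math.hypot(da, db)

def divide(a, da, b, db):
    """Quotient a / b: percentage (fractional) uncertainties combine in quadrature."""
    value = a / b
    fractional = math.hypot(da / a, db / b)
    return value, abs(value) * fractional

# Hypothetical means and standard errors, for illustration only.
A, dA = 12.0, 0.3
B, dB = 8.0, 0.4
C, dC = 50.0, 1.0

# Step 1: the sum in the denominator uses the absolute uncertainties.
total, d_total = add(A, dA, B, dB)
# Step 2: the quotient uses the percentage uncertainties of C and (A + B).
result, d_result = divide(C, dC, total, d_total)

print(f"A + B     = {total:.1f} +/- {d_total:.1f}")
print(f"C/(A + B) = {result:.2f} +/- {d_result:.2f}")
```

The point of the two-step structure is that the sum must be handled first with its own rule, and only then does its combined uncertainty enter the quotient as a percentage.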
The Meaning of Error Bars

Statistics courses spend a large amount of time on hypothesis testing. In this class we will play a little "fast and loose" and claim that 68% of the time the true value falls within the error bars. People familiar with the Gaussian distribution will recognize that this means that 95% of the time the true value is within twice the value of the error bars, and 99.7% of the time within three times the value. Again, roughly speaking, most experimenters shoot for the 95% mark. If we find that our value differs from the accepted value by more than twice the error bar, we are concerned, as there is only a 5% chance that the difference is due to some statistical fluke. Students should always reference their numbers with respect to the error bars in their discussions.

We have covered many concepts in today's lab, and for most students many of them are new. The methods for handling uncertainty are very important, and we will come back to them again and again this quarter. Do not be discouraged if you don't feel that you have mastered them today; instead, treat today's exercises as an introduction.
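For the curious, the 68%, 95% and 99.7% figures quoted in the error bar discussion can be reproduced from the Gaussian distribution with Python's built-in error function. This is only a sketch: the function names are our own, and the accepted value of 9.8 m/s² used for the comparison is an assumed nominal number, not one given in this handout.

```python
import math

def coverage(k):
    """Probability that a Gaussian value lies within k error bars of the mean."""
    return math.erf(k / math.sqrt(2))

def error_bars_away(measured, uncertainty, accepted):
    """How many error bars separate a measurement from the accepted value."""
    return abs(measured - accepted) / uncertainty

for k in (1, 2, 3):
    print(f"within {k} error bar(s): {coverage(k):.1%}")

# Gravity example from the text, compared with an assumed accepted value.
n_bars = error_bars_away(9.6, 0.2, 9.8)
verdict = "consistent" if n_bars <= 2 else "cause for concern"
print(f"difference = {n_bars:.1f} error bars: {verdict}")
```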