Introduction to Data and Error Analysis for General Physics Lab

1. INTRODUCTION
The determination of the laws of physics comes from observations and experiments.
It is essential to learn physics by performing experiments and interpreting experimental
data properly.
We consider two basic types of experiments that scientists often perform in order to
learn about the physical world:
1) To determine the numerical value of some physical quantity through measurement.
2) To test whether a particular theory is consistent with experimental data.
Our lab experiments are designed to teach you the techniques for making measurements
and for comparing data from an experiment with the predictions of physical laws which
you learn from the lectures. In order to obtain meaningful results from an experiment,
you need to analyze the data with a good understanding of experimental errors. It is very
important to learn how to identify sources of experimental errors and to estimate their
sizes. Please keep in mind that any valid experimental data must be presented
with the associated errors. For example, the measurement of the speed of light is
given as:
\[ c = (2.99792458 \pm 0.00000004) \times 10^{8}\ \mathrm{m/s} \]
The information we can learn from the above data presentation includes two aspects:
i) the measured speed of light is $2.99792458 \times 10^{8}$ m/s;
ii) the error (or uncertainty) of the measurement is 4 m/s.
For our lab course, it is required that you present experimental data with errors
indicated in your lab report.
In the following sections we briefly introduce the basic concepts in experimental data
and error analysis through definitions and simple examples.
2. BASIC CONCEPTS
2.1 True value, Experimental Value and Error
True value, $A_0$: It is an exact physical value which often appears in fundamental
laws of physics. Examples are the speed of light in vacuum ($c$) in Maxwell's equations, the gravitational acceleration ($g$) in Newton's equation, and so on. The numerical
values of these physical quantities must be determined through measurements.
Experimental value, $A$: It is the numerical value obtained by performing experiments designed to measure $A_0$. In general, the measured value does not exactly
equal the true value. This is because the experimental instruments and methods are
not perfect, so the measured value $A$ has an uncertainty, which is called the experimental error. The smaller the experimental error, the closer the measured value
$A$ is to the true value $A_0$.
Experimental error, $\Delta$: It is the difference between $A$ and $A_0$, $\Delta = A - A_0$.
It indicates how close a measured value comes to its true value. However, we do
not know the exact value of $A_0$, so $\Delta$ is also unknown. The task of data
analysis is to find the sources of errors and to estimate the size of the errors. Based
on the estimated $\Delta$, we can give a range of values where the true value, $A_0$, is likely
to lie. The size of the error indicates the accuracy of the experimental value $A$.
2.2 The Rule for Data Recording
Before we discuss error analysis, we need to understand the rule for data recording in experiments. A meaningful experimental value is different from a
pure mathematical value: it carries physical meaning. When we make measurements and record the measured data, we must determine how many valid digits should
be recorded. This idea can be illustrated through a simple example.
Assume one measures the length of an object by using a ruler which has a minimum scale
division of 1 mm. The measurement is a little bit more than 89 mm (but less than 90 mm).
One can record the measured value as 89.5 mm. Here, 89 mm is accurate, and the
additional 0.5 mm is estimated from the reading. Because it is hard for our eyes to tell
whether that little bit more is exactly 0.5 mm, the recorded value has some
uncertainty. We cannot be certain that our estimated 0.5 mm is not actually 0.4 or 0.6
mm, so we say that the measurement error is ±0.1 mm. Therefore, the final measured
result should be reported as: Length = (89.5 ± 0.1) mm. In this example, we see that
three digits should be recorded in the measurement, which indicates the measurement
(read out) precision of the instrument. If one writes the measured value as 89.4987 mm,
this means one can read the ruler to an accuracy better than 0.001 mm. This is
certainly not true. On the other hand, if one writes the value as 89 mm, this indicates
that the read out error is about 1 mm, clearly too large.
In general, the rule of data recording in an experiment is that the number of valid
digits of recorded data should reflect the minimum read out scale of the measurement
instrument.
In some cases we need to determine a physical quantity by measuring several values.
The number of valid digits for the final result should be determined by the
smallest number of valid digits among the individual values. For example, suppose we measure the resistance
of a resistor by measuring the circuit current, $I$, and the voltage, $V$. Assuming the read
out values are $V = 10.5$ V and $I = 1.522$ A, the resistance $R$ should be recorded as
\[ R = \frac{V}{I} = \frac{10.5}{1.522} = 6.90\ \Omega . \]
Here the $R$ value has three valid digits, as determined by the voltage, which is read
out with three valid digits.
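The valid-digit rule for a computed quantity can be sketched in a few lines of Python. This is only an illustration: the helper name `round_sig` is hypothetical, and the readings are the voltage and current quoted in the resistor example.

```python
from math import floor, log10


def round_sig(value: float, digits: int) -> float:
    """Round `value` to `digits` significant (valid) digits."""
    if value == 0:
        return 0.0
    exponent = floor(log10(abs(value)))  # decimal position of the leading digit
    return round(value, digits - 1 - exponent)


# Readings from the resistor example: V has three valid digits, I has four.
voltage = 10.5    # volts
current = 1.522   # amperes

# The result keeps the smaller number of valid digits (three, from V).
resistance = round_sig(voltage / current, 3)
print(f"R = {resistance:.2f} ohm")
```

The same helper applied to the ruler reading, `round_sig(89.4987, 3)`, gives 89.5, which is the value the ruler can actually support.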
2.3 Different Types of Errors
The discussion of experimental error here does not include mistakes made
while performing the experiment. These mistakes include reading errors, recording errors, and
incorrect instrument operation. These kinds of errors are difficult to eliminate through data
analysis. Therefore, great care should be taken to prevent them from occurring.
We discuss here two fundamentally different types of errors associated with any measurement procedure: systematic and random errors.
Systematic Errors
There are basically two sources of systematic errors:
1) Instrument calibration. For example, the zero-point has not been tuned correctly
before the measurement: suppose it is at a, not at the zero point, at the beginning.
Then all the measured data points will shift by a constant a. As another example,
if the full range of a voltage meter is 0-1.9 volts, but the meter scale shows a
full scale of 0-2 volts, then the measured voltage value using this meter will be
systematically increased by a factor of 2/1.9. Therefore, checking the instrument
calibration including the zero-point tuning is important to avoid the systematic
errors.
2) Experiment method error. This kind of error is often due to an imperfect experimental
design: the experimental conditions are not exactly the same as
the theoretical model assumes. When comparing the experimental data with the
theoretical expectations, one must take into account the experimental method errors.
We sometimes call such an error a theoretical systematic error.
Understanding the systematic error in an experiment is not an easy job: we must fully
understand the experimental principles and carefully check the instrument to estimate
the size of the systematic errors.
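Once a calibration error has been identified, the two instrument examples above (a zero-point offset and a wrong full-scale value) can be corrected as in the following sketch. The function names and the raw readings are hypothetical and serve only as illustration.

```python
def remove_zero_offset(readings, offset):
    """Undo a zero-point shift: every reading was displaced by `offset`."""
    return [r - offset for r in readings]


def rescale(readings, true_full_scale, shown_full_scale):
    """Undo a scale error: the dial shows `shown_full_scale` at full
    deflection while the true value there is `true_full_scale`."""
    factor = true_full_scale / shown_full_scale
    return [r * factor for r in readings]


# Hypothetical raw voltmeter readings, in volts.
raw = [0.50, 1.00, 2.00]

# Meter whose true range is 0-1.9 V but whose scale is printed as 0-2 V:
# readings are too large by 2/1.9, so multiply by 1.9/2.
corrected = rescale(raw, true_full_scale=1.9, shown_full_scale=2.0)
print(corrected)
```

Unlike random errors, these corrections do not shrink with repeated readings; they have to be found by checking the instrument.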
Figure 1: Random error distribution. The error $\Delta$ obeys a Gaussian function, $f(\Delta) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\Delta^2/2\sigma^2}$ (horizontal axis: $\Delta$, from $-4$ to $4$).
Random Errors
Random error is often due to the limited precision of the experimental instruments
and to imperfectly performed experiments. A special kind of random error comes from the physical
process itself. For example, the measurement of the lifetime of radioactive particles must
take into account the fact that radioactive decay is a random process. Often, under certain
conditions, fluctuations due to this kind of error obey the Gaussian distribution (see Ref.
[1]), as shown in Figure 1. We often refer to these as statistical errors. In general, random
errors can be reduced by repeating the measurement.
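The effect of repetition can be illustrated with a small simulation. The sketch below assumes purely Gaussian noise of unit width around a true value of zero; the function name, the number of trials, and the seed are arbitrary choices made for this illustration.

```python
import random
import statistics


def spread_of_means(n_repeats: int, trials: int = 2000,
                    sigma: float = 1.0, seed: int = 0) -> float:
    """Scatter (standard deviation) of the mean of `n_repeats` noisy readings,
    estimated from `trials` simulated experiments."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.gauss(0.0, sigma) for _ in range(n_repeats)])
        for _ in range(trials)
    ]
    return statistics.stdev(means)


spread_4 = spread_of_means(4)      # mean of 4 readings
spread_100 = spread_of_means(100)  # mean of 100 readings
print(f"N = 4:   scatter of the mean = {spread_4:.3f}")
print(f"N = 100: scatter of the mean = {spread_100:.3f}")
```

The scatter of the averaged result falls roughly as one over the square root of the number of readings, so 100 readings give a mean about five times more precise than 4 readings.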
3. DATA ANALYSIS
Data analysis includes determination of the measured mean values and the standard
deviations. We discuss the standard method to present the measurement results in this
section. We first give the definitions and discuss how to combine the errors, then we
briefly introduce the least squares method for linear variable relations.
3.1 Mean Value and Standard Deviation
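For $N$ measurement samples of a physical quantity with true value $\mu$, with each measured
value $x_i$, the sample mean value $\bar{x}$ is defined by
\[ \bar{x} \equiv \frac{1}{N}\sum_{i=1}^{N} x_i \equiv \langle x \rangle \tag{1} \]
and the corresponding sample variance is given by
\[ \mathrm{var}(x) \equiv \frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2 . \tag{2} \]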
Figure 2: Interpreting the standard deviation $\sigma$. The shaded area indicates the probability of a measurement $x$ lying within $t$ standard deviations of $\mu$: 68% for $\mu \pm \sigma$, 95% for $\mu \pm 2\sigma$, and 99.7% for $\mu \pm 3\sigma$.
The sample standard deviation, $\sigma$, is given by
\[ \sigma = \sqrt{\mathrm{var}(x)} = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2} . \tag{3} \]
Standard deviation represents how the measured values spread out in repeated measurements, and therefore is a good estimate of the statistical error of the experiment. As
shown in Fig. 2, the exact meaning of the standard deviation, $\sigma$, can be related to the
probability, $p_t$, of finding a single measurement of $x$ within the range $(\mu - t\sigma,\ \mu + t\sigma)$.
This is seen to be 68.3%, 95.5% and 99.7% for $t = 1$, 2, and 3 respectively. For $N$ large
enough (typically $N \geq 20$) it can be shown that the probability for $\bar{x}$ to be in the interval
$(\mu - t\sigma/\sqrt{N},\ \mu + t\sigma/\sqrt{N})$ is about 68%, 95%, and 99.7% for $t = 1$, 2 and 3 respectively.
So, to estimate $\mu$, one measures $\bar{x}$ and one has that
\[ \mu = \bar{x} \pm \frac{2\sigma}{\sqrt{N}} \]
with a confidence of 95%. Thus the accuracy in determining $\mu$ improves as $N$ increases.
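As a minimal sketch of Eqs. (1) to (3) and of the 95% interval, the following Python snippet uses the standard `statistics` module. The function name `summarize` and the five ruler readings are hypothetical; with so few readings the 95% figure is only approximate, since the text assumes a reasonably large $N$.

```python
import math
import statistics


def summarize(values):
    """Return (mean, sample standard deviation, 95% half-width 2*sigma/sqrt(N))."""
    n = len(values)
    mean = statistics.fmean(values)       # Eq. (1)
    sigma = statistics.stdev(values)      # Eq. (3), uses the N-1 denominator
    half_width = 2.0 * sigma / math.sqrt(n)
    return mean, sigma, half_width


# Hypothetical repeated readings of one length, in mm.
lengths_mm = [89.4, 89.5, 89.6, 89.5, 89.5]

mean, sigma, half_width = summarize(lengths_mm)
print(f"mean = {mean:.2f} mm, sigma = {sigma:.3f} mm")
print(f"mu = {mean:.2f} +/- {half_width:.2f} mm (about 95% confidence)")
```

Note that `statistics.stdev` is the sample standard deviation; `statistics.pstdev` would divide by $N$ instead of $N - 1$ and is not what Eq. (3) defines.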
3.2 Combining errors
We are often confronted with a situation where the result of an experiment is the
combination of two or more measurements. We want to know the error on the
final answer in terms of the errors on the individual measurements.
Linear situation
As a very simple example, consider the final result $a$ which is related to the measured
values $b$ and $c$ by
\[ a = b - c . \]
To find the error on $a$, first differentiate:
\[ \delta a = \delta b - \delta c . \]
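If we were talking about maximum possible errors, then we would simply add the magnitudes of $\delta b$ and $\delta c$ to get the maximum possible $\delta a$. But it is more sensible to consider
the root mean square deviations:
\[
\begin{aligned}
\sigma_a^2 &= \langle (a - \bar{a})^2 \rangle \\
&= \langle [(b - c) - (\bar{b} - \bar{c})]^2 \rangle \\
&= \langle (b - \bar{b})^2 \rangle + \langle (c - \bar{c})^2 \rangle - 2\langle (b - \bar{b})(c - \bar{c}) \rangle \\
&\equiv \sigma_b^2 + \sigma_c^2 - 2\,\mathrm{cov}(b, c) .
\end{aligned}
\]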
The last term involves the covariance of b and c. This has to do with whether their errors
are correlated or not. It can be positive, negative or, in the case where the errors are
uncorrelated, zero. Thus, provided that the errors on b and c are uncorrelated, the rule
is that we add the contributions $\sigma_b$ and $\sigma_c$ in quadrature:
\[ \sigma_a^2 = \sigma_b^2 + \sigma_c^2 . \tag{4} \]
However, it should be emphasized that Eq. (4) can be applied only when the individual errors are
uncorrelated. To illustrate this point, let's consider the following example.
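Example: if
\[ a = b + b , \]
then the two variables on the right-hand side of the equation are completely correlated.
Thus if the measurement error of $b$ is $\sigma_b$, then $\sigma_a$ is simply $2\sigma_b$. We notice that here
$\sigma_a \neq \sqrt{2}\,\sigma_b$, as would be expected from Eq. (4). (Recall: $\sigma_a^2 = \sigma_b^2 + \sigma_b^2 + 2\,\mathrm{cov}(b, b) = 4\sigma_b^2$, where
$2\,\mathrm{cov}(b, b) \equiv 2\langle (b - \bar{b})(b - \bar{b}) \rangle = 2\sigma_b^2$.)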
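The role of the covariance term can be checked numerically. The following sketch evaluates the variance of $a = b \pm c$ including the covariance; the function name `sigma_linear` and its `sign` argument are hypothetical conveniences for this illustration.

```python
import math


def sigma_linear(sigma_b: float, sigma_c: float,
                 cov_bc: float = 0.0, sign: int = +1) -> float:
    """Error on a = b + sign*c (sign = +1 or -1), covariance term included."""
    variance = sigma_b ** 2 + sigma_c ** 2 + 2 * sign * cov_bc
    return math.sqrt(variance)


# Uncorrelated errors: plain quadrature sum, Eq. (4).
print(sigma_linear(3.0, 4.0))                       # 5.0

# a = b + b: c is the same variable as b, so cov(b, c) = sigma_b**2.
print(sigma_linear(1.0, 1.0, cov_bc=1.0, sign=+1))  # 2.0, i.e. 2*sigma_b

# a = b - b is identically zero, and so is its error.
print(sigma_linear(1.0, 1.0, cov_bc=1.0, sign=-1))  # 0.0
```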
Non-linear situations
For this case the correct answer can be obtained by first differentiating, then collecting together the terms of each independent variable and finally adding these terms in
quadrature, i.e. for $y(x_1, x_2, \ldots, x_n)$,
\[ \sigma_y^2 = \sum_{i=1}^{n} \left( \frac{\partial y}{\partial x_i} \right)^2 \sigma_{x_i}^2 . \tag{5} \]
Thus, for example, if
\[ a = b^r c^s , \]
where $r$ and $s$ are known constants, then, assuming the errors on $b$ and $c$ are uncorrelated,
\[ \left( \frac{\sigma_a}{a} \right)^2 = r^2 \left( \frac{\sigma_b}{b} \right)^2 + s^2 \left( \frac{\sigma_c}{c} \right)^2 , \tag{6} \]
i.e. the fractional errors on $b$ and $c$ are combined to give the fractional error on $a$.
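Eq. (5) can also be evaluated numerically when the derivatives are tedious to work out by hand. The sketch below approximates the partial derivatives by central differences and compares the result with the analytic form of Eq. (6). The function name `propagate`, the step size, and the numerical values of $b$, $c$, $r$ and $s$ are assumptions made for this illustration, and uncorrelated errors are assumed throughout.

```python
import math
from typing import Callable, Sequence


def propagate(func: Callable[..., float], values: Sequence[float],
              sigmas: Sequence[float], rel_step: float = 1e-6) -> float:
    """Eq. (5) with partial derivatives taken by central differences."""
    variance = 0.0
    for i, (x, s) in enumerate(zip(values, sigmas)):
        h = rel_step * max(abs(x), 1.0)
        up = list(values)
        down = list(values)
        up[i] = x + h
        down[i] = x - h
        dydx = (func(*up) - func(*down)) / (2.0 * h)
        variance += (dydx * s) ** 2
    return math.sqrt(variance)


# Hypothetical inputs for a = b**r * c**s.
r, s = 2.0, -1.0
b, sigma_b = 3.0, 0.1
c, sigma_c = 2.0, 0.05

a = b ** r * c ** s
numeric = propagate(lambda b_, c_: b_ ** r * c_ ** s, [b, c], [sigma_b, sigma_c])
analytic = abs(a) * math.sqrt((r * sigma_b / b) ** 2 + (s * sigma_c / c) ** 2)
print(f"a = {a:.3f}, numeric sigma_a = {numeric:.4f}, Eq. (6) sigma_a = {analytic:.4f}")
```

The two estimates agree to well below the quoted precision, which is a useful sanity check before trusting a hand-derived propagation formula.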
As before, when dealing with ratios or products we must be careful about correlations.
When correlations are present between b and c in the above example, the fractional error
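on $a$ is given by
\[ \left( \frac{\sigma_a}{a} \right)^2 = r^2 \left( \frac{\sigma_b}{b} \right)^2 + s^2 \left( \frac{\sigma_c}{c} \right)^2 + 2rs\,\frac{\mathrm{cov}(b, c)}{bc} . \tag{7} \]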
Example: if $a = b/c = (100 \pm 10)/(1 \pm 0.2)$ (assuming the errors on $b$ and $c$ are independent), then
\[ (\sigma_a/a)^2 = (10/100)^2 + (0.2/1)^2 = 0.01 + 0.04 = 0.05 , \]
\[ \sigma_a = a\sqrt{0.05} = 100 \times 0.22 = 22 . \]
The final result should be presented as
\[ a = 100 \pm 22 \quad \text{or} \quad a = 100\,(1 \pm 22\%) . \]
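In summary, for simple functions of two independent variables, $y(x_1, x_2)$, the measurement results can be presented as:
\[
\begin{aligned}
y = x_1 + x_2 &\;\longrightarrow\; y = (x_1 + x_2) \pm \sqrt{\sigma_{x_1}^2 + \sigma_{x_2}^2} \\
y = x_1 - x_2 &\;\longrightarrow\; y = (x_1 - x_2) \pm \sqrt{\sigma_{x_1}^2 + \sigma_{x_2}^2} \\
y = x_1 x_2 &\;\longrightarrow\; y = (x_1 x_2)\left[ 1 \pm \sqrt{(\sigma_{x_1}/x_1)^2 + (\sigma_{x_2}/x_2)^2} \right] \\
y = \frac{x_1}{x_2} &\;\longrightarrow\; y = \frac{x_1}{x_2}\left[ 1 \pm \sqrt{(\sigma_{x_1}/x_1)^2 + (\sigma_{x_2}/x_2)^2} \right] .
\end{aligned}
\]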
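The four summary rules translate directly into small helper functions. The sketch below is one possible arrangement; the function names are hypothetical, each returns a (value, error) pair, and independent errors are assumed as in the summary. It is checked against the ratio example $(100 \pm 10)/(1 \pm 0.2)$.

```python
import math


def add(x1, s1, x2, s2):
    """y = x1 + x2: absolute errors add in quadrature."""
    return x1 + x2, math.hypot(s1, s2)


def subtract(x1, s1, x2, s2):
    """y = x1 - x2: absolute errors add in quadrature."""
    return x1 - x2, math.hypot(s1, s2)


def multiply(x1, s1, x2, s2):
    """y = x1 * x2: fractional errors add in quadrature."""
    y = x1 * x2
    return y, abs(y) * math.hypot(s1 / x1, s2 / x2)


def divide(x1, s1, x2, s2):
    """y = x1 / x2: fractional errors add in quadrature."""
    y = x1 / x2
    return y, abs(y) * math.hypot(s1 / x1, s2 / x2)


value, error = divide(100.0, 10.0, 1.0, 0.2)
print(f"a = {value:.0f} +/- {error:.0f}")
```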
Combining results of different experiments
When several experiments measure the same physical quantity and give a set of answers
$a_i$ with different errors $\sigma_i$, then the best estimates of $a$ and its accuracy $\sigma$ are given by
\[ a = \frac{\sum_i (a_i/\sigma_i^2)}{\sum_i (1/\sigma_i^2)} \tag{8} \]
and
\[ \frac{1}{\sigma^2} = \sum_i \frac{1}{\sigma_i^2} . \tag{9} \]
Thus each experiment is weighted by $1/\sigma_i^2$. In some sense, $1/\sigma_i^2$ gives a measure of the
information quality of that particular experiment.
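Eqs. (8) and (9) amount to a weighted average. A minimal sketch follows, with a hypothetical function name `combine` and two made-up results used only as an illustration.

```python
import math


def combine(values, sigmas):
    """Weighted mean and its error, with weights 1/sigma_i**2 (Eqs. 8 and 9)."""
    weights = [1.0 / s ** 2 for s in sigmas]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, math.sqrt(1.0 / total)


# Two hypothetical measurements of the same quantity: 10 +/- 1 and 12 +/- 2.
best, sigma = combine([10.0, 12.0], [1.0, 2.0])
print(f"combined: {best:.2f} +/- {sigma:.2f}")
```

The combined value lies closer to the more precise measurement, and the combined error is smaller than either individual error.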
3.3 Least Squares Method
The least squares method is very often used in data analysis to determine the experimental parameters from a set of measured data points. In this section, we only consider
the simplest situation, where the relation between the variables is linear. The mathematical proof will not be given here; only the formulae used in this method are given
below.
Consider variables $y$ and $x$ that are related to one another linearly:
\[ y = a + bx , \tag{10} \]
where $a$ and $b$ are two parameters to be determined. (Note that the above
formula is the equation of a line, with $a$ as the intercept and $b$ as the slope.) Assuming we
have measured a set of data points $\{x_i, y_i\}$, $i = 1, \ldots, N$, we need to determine $a$ and $b$ from
the measurements. We first define the following quantities for the calculation:
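\[ \bar{x} = \frac{1}{N}\sum_i x_i \tag{11} \]
\[ \bar{y} = \frac{1}{N}\sum_i y_i \tag{12} \]
\[ L_{xx} = \sum_i (x_i - \bar{x})^2 \tag{13} \]
\[ L_{yy} = \sum_i (y_i - \bar{y})^2 \tag{14} \]
\[ L_{xy} = \sum_i (x_i - \bar{x})(y_i - \bar{y}) \tag{15} \]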
Then the measured values of the line parameters $a$ and $b$ are determined by the
following formulae:
\[ b = \frac{L_{xy}}{L_{xx}} \tag{16} \]
\[ a = \bar{y} - b\bar{x} . \tag{17} \]
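The errors on $a$ and $b$ can be calculated using the following formulae:
\[ \sigma_a^2 = \frac{\sum_i x_i^2}{N\sum_i x_i^2 - \left(\sum_i x_i\right)^2}\,\sigma_y^2 \tag{18} \]
\[ \sigma_b^2 = \frac{N}{N\sum_i x_i^2 - \left(\sum_i x_i\right)^2}\,\sigma_y^2 \tag{19} \]
where $\sigma_y$ is the uncertainty of the $y$ measurements and can be determined by the following
formula:
\[ \sigma_y = \sqrt{\frac{\sum_i \left[ y_i - (a + b x_i) \right]^2}{N - 2}} . \tag{20} \]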
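The formulae of this section fit in one short function. The sketch below follows Eqs. (11) to (20) step by step; the function name `fit_line` and the four data points are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
import math


def fit_line(xs, ys):
    """Least squares fit of y = a + b*x; returns (a, b, sigma_a, sigma_b, sigma_y)."""
    n = len(xs)
    x_mean = sum(xs) / n                                   # Eq. (11)
    y_mean = sum(ys) / n                                   # Eq. (12)
    l_xx = sum((x - x_mean) ** 2 for x in xs)              # Eq. (13)
    l_xy = sum((x - x_mean) * (y - y_mean)
               for x, y in zip(xs, ys))                    # Eq. (15)
    b = l_xy / l_xx                                        # Eq. (16)
    a = y_mean - b * x_mean                                # Eq. (17)
    residual_sq = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    sigma_y = math.sqrt(residual_sq / (n - 2))             # Eq. (20)
    sum_x2 = sum(x * x for x in xs)
    denom = n * sum_x2 - sum(xs) ** 2
    sigma_a = sigma_y * math.sqrt(sum_x2 / denom)          # Eq. (18)
    sigma_b = sigma_y * math.sqrt(n / denom)               # Eq. (19)
    return a, b, sigma_a, sigma_b, sigma_y


# Hypothetical data scattered around y = 1 + 2x.
a, b, sigma_a, sigma_b, sigma_y = fit_line([0.0, 1.0, 2.0, 3.0],
                                           [1.1, 2.9, 5.1, 6.9])
print(f"a = {a:.2f} +/- {sigma_a:.2f}, b = {b:.2f} +/- {sigma_b:.2f}")
```

At least three points are needed, since Eq. (20) divides by $N - 2$.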
4. EXAMPLE OF DATA ANALYSIS
Figure 3: RC circuit experiment setup diagram (diagram labels: V, switch S with positions 1 and 2, 6 V, C, R). When the switch is connected to 1, C is charging; when it is connected to 2, C is discharging.
Consider an experiment to measure the RC time constant in the following circuit (see Fig.
3), and to determine the resistance, $R$, for a given capacitance value, $C = 10\,(1 \pm 2\%)\ \mu\mathrm{F}$
(ref: your lab manual, Capacitance experiment).
Experiment description
A fully charged capacitor of capacitance $C$ (with initial voltage of 6 V) is connected in
series with a resistor of resistance $R$ in the circuit. It will lose charge, so that the potential
difference, $V$, across the capacitor will decay exponentially according to the following law
for a discharging capacitor:
\[ V(t) = V_0\, e^{-t/RC} , \]
where $V_0$ is the initial voltage across the capacitor and $V(t)$ is the voltage at time $t$. The
product $RC$ is called the `time constant'.
1) Measure the circuit time constant, $RC$.
2) Determine the resistance value of the resistor, $R$.
Measurement and data analysis
To determine the time constant, we take data on voltage ($V$) as a function of time ($t$)
by reading the voltage every 5 seconds. Assume that the recorded data are listed in
Table 1. Please note: the recorded voltage data have three digits, and the minimum
read out scale of the voltage is mV.

Time (seconds)     0.0    5.0    10.0    15.0    20.0    25.0
Voltage (volts)    6.00   2.22   0.823   0.305   0.118   0.045
Table 1: Recorded data on voltage vs. time

For data analysis, we rewrite the capacitor discharging formula by taking the logarithm of both sides:
\[ \ln V(t) = \ln V_0 - \frac{1}{RC}\,t , \]
so that $\ln V$ is linear in $t$ and the formulae in Section 3.3 can be applied directly
in the analysis. In order to see the linear relation directly, the data can be plotted on
semi-log paper with voltage along the log scale and time along the linear scale. The
data points should lie on a straight line. The slope of the line is $-1/RC$.
The standard way to determine the slope of a line from a set of measured data points
is the least squares method. We use the formulae presented in the last section. Let
\[ y \equiv \ln V(t), \quad x \equiv t, \quad a \equiv \ln V_0, \quad b \equiv -1/RC , \]
then the original equation becomes
\[ \ln V(t) = \ln V_0 - \frac{1}{RC}\,t \;\longrightarrow\; y = a + bx . \]
Following the discussion in the Least Squares Method section, we can calculate all
the quantities for determining $a$, $b$, $\sigma_a$, and $\sigma_b$. The calculated results are listed in
Table 2.

$\bar{x}$   $\bar{y}$   $L_{xx}$   $L_{yy}$   $L_{xy}$   $\sum x_i^2$   $\sum x_i$
12.5        -0.672      437.5      16.770     -85.7      1375.0         75.0
Table 2: Calculated quantities for least squares method
From these calculated quantities, we determine the measured parameters and errors
as:
\[ b = \frac{L_{xy}}{L_{xx}} \equiv -\frac{1}{RC} = -0.196\ \mathrm{s^{-1}} \]
\[ a = \bar{y} - b\bar{x} \equiv \ln V_0 = 1.78 \]
\[ \sigma_y = \sqrt{\frac{\sum_i \left[ y_i - (a + b x_i) \right]^2}{N - 2}} = 1.90 \times 10^{-2} \]
\[ \sigma_a = \sqrt{\frac{\sum_i x_i^2}{N\sum_i x_i^2 - \left(\sum_i x_i\right)^2}\,\sigma_y^2} = 1.37 \times 10^{-2} \]
\[ \sigma_b = \sqrt{\frac{N}{N\sum_i x_i^2 - \left(\sum_i x_i\right)^2}\,\sigma_y^2} = 9.07 \times 10^{-4}\ \mathrm{s^{-1}} \]
Finally, we obtain the experimental results on the time constant $RC$ and resistance
value $R$ in Table 3.

The time constant                 $RC = -1/b = 5.10$ seconds
The error on the time constant    $\sigma_{RC} = (\sigma_b/|b|)\,RC = 0.02$ seconds
The resistance value              $R = -\frac{1}{bC} = \frac{1}{0.196 \times 10 \times 10^{-6}}\ \Omega = 510\ \mathrm{k\Omega}$
The fractional error on R         $\sigma_R/R = \sqrt{(\sigma_C/C)^2 + (\sigma_b/b)^2} = 2.05\%$
Table 3: Determined parameters and errors

The measurement errors indicated in Table 3 are the random errors that mainly
come from the uncertainties in reading the value of the voltage (you might not read
out the voltage exactly on time, and the last digit of the recorded data contains errors).
By repeating the measurement, such errors will decrease.
In addition to the random error, we should also consider possible sources of systematic errors.
- There may be calibration errors in the voltage meter and in the clock.
- The measured time constant in fact includes the resistance of the circuit and
the instrument. If you determine $R$ through the measured time constant, the determined resistance value will be larger than its true value. Once we know the
circuit and instrument resistance, we should make corrections: $b = -1/RC$,
and $R = R_0 + r$, where $R_0$ is the resistance of the resistor, and $r$ is the resistance
of the instrument and circuit, so that $R_0 = R - r$.
- The given capacitance value is not exact, but has an uncertainty of two percent.
This error cannot be reduced by repeating the measurement. From the calculation we know that the minimum error on the $R$ value is 2%, which comes
from the capacitance value uncertainty.
Presentation of final results
The measured time constant: $RC = 5.10 \pm 0.02$ seconds
The measured resistance of the resistor: $R_0 = 500\,(1 \pm 2.1\%)\ \mathrm{k\Omega}$.
Note: we have assumed that the equivalent resistance of the circuit and instrument,
$r$, is 10 kΩ, so that the final result is $R_0 = R - r$.
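The whole analysis chain can be recomputed from the Table 1 data with a short script. The sketch below applies Eqs. (16) to (20) to $\ln V$ versus $t$ and then propagates the errors to $RC$ and $R$. It assumes a capacitance of 10 microfarad known to 2% and a circuit-plus-instrument resistance of 10 kilo-ohm; all variable names are hypothetical.

```python
import math

# Table 1: voltage across the capacitor, read every 5 seconds.
times_s = [0.0, 5.0, 10.0, 15.0, 20.0, 25.0]
volts = [6.00, 2.22, 0.823, 0.305, 0.118, 0.045]

# Assumptions for this sketch: C = 10 microfarad with 2 % uncertainty,
# circuit and instrument resistance r = 10 kilo-ohm.
capacitance_f = 10e-6
frac_sigma_c = 0.02
r_circuit_ohm = 10e3

xs = times_s
ys = [math.log(v) for v in volts]              # y = ln V, x = t
n = len(xs)

x_mean = sum(xs) / n
y_mean = sum(ys) / n
l_xx = sum((x - x_mean) ** 2 for x in xs)
l_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))

b = l_xy / l_xx                                # slope, b = -1/RC
a = y_mean - b * x_mean                        # intercept, a = ln V0

residual_sq = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
sigma_y = math.sqrt(residual_sq / (n - 2))     # Eq. (20)
sum_x2 = sum(x * x for x in xs)
denom = n * sum_x2 - sum(xs) ** 2
sigma_a = sigma_y * math.sqrt(sum_x2 / denom)  # Eq. (18)
sigma_b = sigma_y * math.sqrt(n / denom)       # Eq. (19)

rc = -1.0 / b                                  # time constant in seconds
sigma_rc = rc * sigma_b / abs(b)               # same fractional error as b
resistance_ohm = rc / capacitance_f            # R = -1/(b C)
frac_sigma_r = math.hypot(frac_sigma_c, sigma_b / b)
r0_ohm = resistance_ohm - r_circuit_ohm        # resistor alone, R0 = R - r

print(f"b = {b:.4f} 1/s, a = {a:.3f}, sigma_y = {sigma_y:.4f}")
print(f"sigma_a = {sigma_a:.4f}, sigma_b = {sigma_b:.2e} 1/s")
print(f"RC = {rc:.2f} +/- {sigma_rc:.2f} s")
print(f"R = {resistance_ohm / 1e3:.0f} kOhm, sigma_R/R = {100 * frac_sigma_r:.2f} %")
print(f"R0 = {r0_ohm / 1e3:.0f} kOhm")
```

Because the script carries full precision through every step, its last digits can differ slightly from values obtained by rounding $b$ to three digits first.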
Reference:
[1] John R. Taylor, An Introduction to Error Analysis.