Lab 2

Least-Squares Curve Fits
Curve fitting plays a large part in experimental analysis, either in the interpretation of the
data or in its empirical representation. There are a large number of ways to fit curves of various
sorts to experimental data. Here we wish to concentrate upon the most widely-used technique:
least squares curve fits. By far the most common of the least-squares fitting techniques is the
linear least-squares fit, or the linear regression, as it is sometimes called.
To introduce this technique, suppose that we vary one experimental quantity x and
measure the resulting variation in another quantity y. The quantity x could be, for example, the
value of a mass which we attach to a spring, and y could be the resulting period of oscillation of
the spring-mass system. Or x could be the amount of current we supply to an electrical
resistance coil within an insulated box, and y the resulting temperature within the box. In any
case, suppose that we have some reason to believe that there exists (at least approximately) a
linear relationship between x and y, of the form y = ax + b, where a and b are constants to be
determined. Let us suppose moreover that, for n values of x, x_1, x_2, ..., x_n, we have measured
n corresponding values of y, y_1, y_2, ..., y_n. Then one very simple, and very frequently used,
means of determining a and b is just to plot the values (x_i, y_i) on a sheet of graph paper and,
by eye, draw a straight line which seems to fit the data well. The quantities a and b can then be
extracted as the slope and the y-intercept of this straight line. The figure at right shows this
procedure.
Although this method can often produce quite reasonable fits, it is possible to do
considerably better without too much extra effort. Let $\hat{y}_i = a x_i + b$. Note that the quantity
$\hat{y}_i$ is not the same as the observed, or measured, value $y_i$, but rather the value we would
measure if the linear relation we have postulated were completely valid (which it is not; rather, it is
only approximately valid). What we want to do is to determine the two constants a and b by requiring
the $\hat{y}_i$ values to be as close as possible, in some sense, to the observed values $y_i$. A very
common method of doing this is to seek to minimize the sum of the squares of the differences between
$\hat{y}_i$ and $y_i$. In other words, we want to minimize the quantity

D = \sum_{i=1}^{n} (\hat{y}_i - y_i)^2

Now we have that $\hat{y}_i = a x_i + b$, so substitution into the above equation gives

D = \sum_{i=1}^{n} (a x_i + b - y_i)^2
We now want to pick values of a and b so as to minimize D. Hence we will require that

\frac{\partial D}{\partial a} = \frac{\partial D}{\partial b} = 0

Thus

\frac{\partial D}{\partial a} = 2 \sum_{i=1}^{n} x_i (a x_i + b - y_i) = 0
\quad\Longrightarrow\quad
\left( \sum_{i=1}^{n} x_i^2 \right) a + \left( \sum_{i=1}^{n} x_i \right) b = \sum_{i=1}^{n} x_i y_i \qquad (1)

\frac{\partial D}{\partial b} = 2 \sum_{i=1}^{n} (a x_i + b - y_i) = 0
\quad\Longrightarrow\quad
\left( \sum_{i=1}^{n} x_i \right) a + n b = \sum_{i=1}^{n} y_i \qquad (2)
Equations (1) and (2) above give us two linear equations for a and b whose solution gives us the
best linear fit through the data, at least in a least squares sense. It should be noted that most
modern scientific calculators include, as an intrinsic function, the ability to calculate a linear
least-squares fit.
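As an illustration, a minimal sketch of this fit in Python (assuming NumPy is available; the helper
name linear_fit is our own) simply assembles and solves the 2x2 system of equations (1) and (2):

    import numpy as np

    def linear_fit(x, y):
        """Return the slope a and intercept b that minimize D = sum (a*x_i + b - y_i)^2."""
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        n = len(x)
        # Normal equations (1) and (2), written as a 2x2 linear system for (a, b).
        A = np.array([[np.sum(x**2), np.sum(x)],
                      [np.sum(x),    n       ]])
        rhs = np.array([np.sum(x * y), np.sum(y)])
        a, b = np.linalg.solve(A, rhs)
        return a, b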
This procedure can be readily extended to higher-order polynomials. For example,
suppose that we want to fit a quadratic of the form y = ax^2 + bx + c to n pairs of data (x_1, y_1),
(x_2, y_2), ..., (x_n, y_n). We again form the quantity D as before and minimize it with respect to
a, b, and c. This yields the following three linear equations for a, b, and c:

\left( \sum_{i=1}^{n} x_i^4 \right) a + \left( \sum_{i=1}^{n} x_i^3 \right) b + \left( \sum_{i=1}^{n} x_i^2 \right) c = \sum_{i=1}^{n} x_i^2 y_i

\left( \sum_{i=1}^{n} x_i^3 \right) a + \left( \sum_{i=1}^{n} x_i^2 \right) b + \left( \sum_{i=1}^{n} x_i \right) c = \sum_{i=1}^{n} x_i y_i

\left( \sum_{i=1}^{n} x_i^2 \right) a + \left( \sum_{i=1}^{n} x_i \right) b + n c = \sum_{i=1}^{n} y_i
In general, an m-th order polynomial fit will yield m + 1 linear equations for the m + 1 constants in
the polynomial.
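A sketch of the quadratic case, again assuming NumPy (the helper name quadratic_fit is ours),
assembles the three normal equations above into a 3x3 system; in practice numpy.polyfit(x, y, 2)
performs an equivalent computation:

    import numpy as np

    def quadratic_fit(x, y):
        """Return (a, b, c) minimizing sum (a*x_i**2 + b*x_i + c - y_i)^2."""
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        n = len(x)
        # Coefficient matrix of the three normal equations for (a, b, c).
        A = np.array([[np.sum(x**4), np.sum(x**3), np.sum(x**2)],
                      [np.sum(x**3), np.sum(x**2), np.sum(x)],
                      [np.sum(x**2), np.sum(x),    n]])
        rhs = np.array([np.sum(x**2 * y), np.sum(x * y), np.sum(y)])
        a, b, c = np.linalg.solve(A, rhs)
        return a, b, c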
We can use basically the same procedure with certain other types of curve fits. For
example, suppose that we wish to fit a power-law relation of the form

y = c x^a,

where a and c are constants to be determined, to the experimental data. Taking logs of both sides
of this equation, we have

\ln y = a \ln x + \ln c

If we now let b = \ln c, we have

\ln y = a \ln x + b

Notice now that this is exactly the same as our previous linear fit, except that now the equation
involves ln y instead of y and ln x instead of x. Hence the linear least-squares fitting procedure
goes through exactly as before, provided we simply substitute the logged quantities for the actual
quantities themselves.
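In code, a minimal sketch of this power-law fit (the name power_law_fit is our own; NumPy assumed)
is just a straight-line fit to the logged data:

    import numpy as np

    def power_law_fit(x, y):
        """Fit y = c * x**a by a linear least-squares fit of ln y against ln x."""
        a, b = np.polyfit(np.log(x), np.log(y), 1)   # ln y = a ln x + b
        return a, np.exp(b)                          # c = e**b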
A similar trick works for curve fits of the form

y = c e^{a x}

Again taking logs of both sides, we have

\ln y = a x + \ln c

Setting b = \ln c, we have

\ln y = a x + b,

so that once again we can use a linear least-squares fit, this time between ln y and x.
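The corresponding sketch for the exponential form (exponential_fit is again our own name) differs
from the power-law version only in which variable gets logged:

    import numpy as np

    def exponential_fit(x, y):
        """Fit y = c * exp(a*x) by a linear least-squares fit of ln y against x."""
        a, b = np.polyfit(x, np.log(y), 1)   # ln y = a x + b
        return a, np.exp(b)                  # c = e**b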
It is not always obvious which type of curve will best fit the data. One useful procedure
is simply to plot the data graphically on various types of graph paper: linear, semi-log, and
log-log. Obviously, if the data fall fairly close to a straight line on linear paper, then a linear fit
of the form y = ax + b is a good choice. On the other hand, if the data fall close to a straight line
on semi-log paper, then an exponential fit of the form y = c e^{ax} is indicated. Finally, if the data
are nearly linear on log-log paper, then a power-law fit y = c x^a may be used.
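The same check can be done on screen rather than on graph paper; a minimal sketch using Matplotlib
(an assumption on our part; any plotting tool would do) draws the data on linear, semi-log, and
log-log axes side by side:

    import matplotlib.pyplot as plt

    def plot_three_ways(x, y):
        """Plot the data on linear, semi-log, and log-log axes to see which looks straightest."""
        fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12, 4))
        ax1.plot(x, y, "o")
        ax1.set_title("linear axes (linear fit?)")
        ax2.semilogy(x, y, "o")
        ax2.set_title("semi-log axes (exponential fit?)")
        ax3.loglog(x, y, "o")
        ax3.set_title("log-log axes (power-law fit?)")
        plt.show()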
Having chosen a particular functional form for the curve fit, we may now ask how good
the resulting fit really is. A quantitative answer to this question is given by the correlation
coefficient. Let
y_m = \frac{1}{n} \sum_{i=1}^{n} y_i
be the mean of the experimentally observed y values, and likewise let
\sigma_y = \sqrt{ \frac{ \sum_{i=1}^{n} (y_i - y_m)^2 }{ n - 1 } }
be the (sample) standard deviation. Suppose again that we are fitting a linear relation of the form
y = ax + b to the experimental data, and let $\hat{y}_i = a x_i + b$. Then we define

\sigma_{y,x} = \sqrt{ \frac{ \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 }{ n - 2 } }

and note that, for a perfect fit, $\sigma_{y,x} = 0$.
We now define the correlation coefficient r of the fit as

r = \sqrt{ 1 - \frac{\sigma_{y,x}^2}{\sigma_y^2} }
Note that, for a perfect fit ($\sigma_{y,x} = 0$), r = 1. In general, the closer r is to 1, the better the fit. Most
modern scientific calculators, in carrying out linear, power-law, or exponential curve fits, will
also provide the value of the correlation coefficient.
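A minimal sketch of this calculation (assuming NumPy; y_fit holds the fitted values $\hat{y}_i$, and
the function name is our own) follows the definitions above directly:

    import numpy as np

    def correlation_coefficient(y, y_fit):
        """Correlation coefficient r of a linear fit, as defined above."""
        y = np.asarray(y, dtype=float)
        y_fit = np.asarray(y_fit, dtype=float)
        n = len(y)
        sigma_y = np.sqrt(np.sum((y - y.mean())**2) / (n - 1))   # sample standard deviation
        sigma_yx = np.sqrt(np.sum((y - y_fit)**2) / (n - 2))     # scatter of the data about the fit
        return np.sqrt(1.0 - sigma_yx**2 / sigma_y**2)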
The Experiment
We want to derive an experimental relation between the
period of oscillation of a spring-mass system (the length of time
that the system takes to make one complete oscillation) and the
mass of the system. We will be using the simple system
sketched at right, using a spring of fixed spring constant k, but
varying the mass m carried by the spring.
Using the masses available in class, load the system by
placing one or more masses on the spring platform and set the system into motion. Using a
stopwatch, measure the period of oscillation T by measuring the time required to complete a
fixed number of oscillations (the more, the better), and then dividing this time by the number of
oscillations. It is recommended that this measurement be done simultaneously by as
many team members as possible, with the recorded value being the average of the measurements.
Do this for five different mass values and record the values in the table on the lab sheet (note that
we are not really recording mass, but rather weight in pounds; however, the two differ only by a
constant factor g).
We will investigate three possible data fits: linear (T = aW + b), power-law (T = aW^b),
and exponential (T = ae^{bW}). To decide which of these is the most appropriate, we will carry out
fits for all three possible functional forms using a scientific calculator and compare the
correlation coefficients. The fit yielding a correlation coefficient closest to one is the best fit.
Report on the data sheet the functional form yielding the best fit and the associated values of a
and b. Then plot the best fit and the data points on the linear graph paper on the back of the data
sheet. Because we are using linear paper, if you decide that a power-law fit is best, you should
plot lnT vs. lnW. On the other hand, if you decide that an exponential fit is best, you should plot
lnT vs. W.
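If a calculator is not handy, the same comparison can be sketched in a few lines of Python (NumPy
assumed; best_fit_form is our own name, and W and T are the arrays of measured weights and periods):

    import numpy as np

    def best_fit_form(W, T):
        """Compare linear, power-law, and exponential fits of period T against weight W.

        Each candidate is reduced to a straight-line fit v = slope*u + intercept; for the
        power-law and exponential forms, a = exp(intercept) and b = slope.
        """
        candidates = {
            "linear      (T = a*W + b)":    (W, T),
            "power law   (T = a*W**b)":     (np.log(W), np.log(T)),
            "exponential (T = a*e**(b*W))": (W, np.log(T)),
        }
        for name, (u, v) in candidates.items():
            slope, intercept = np.polyfit(u, v, 1)
            r = abs(np.corrcoef(u, v)[0, 1])   # correlation coefficient of the straight-line fit
            print(f"{name}: r = {r:.4f}, slope = {slope:.4f}, intercept = {intercept:.4f}")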