Chabot Engineering - Courtesy of Prof. E. Allen • SJSU CoE MATE-153
LabNotes Rev. Sp07
Chapter 5
Workshop on Fitting of Linear Data
(Contributed by E.L. Allen, SJSU)
5.0 Learning Objectives
After successfully completing this laboratory workshop, including the assigned reading, the lab
report sheets, the lab quizzes, and any required reports, the student will be able to:
1. Distinguish between dependent and independent variables.
2. Properly display x-y data in a scatter plot with appropriate scaling and axes, both manually
and using an Excel spreadsheet with its graphing functions.
3. Recognize from a scatter plot whether data is linear, has some other functional dependence,
or has no functional dependence between x and y.
4. Perform a linear regression analysis of experimental data, both manually and using an Excel
spreadsheet or the linear regression commands in MATLAB (e.g., the Basic Fitting Interface).
5. Correctly display a fitted line and its equation, and demonstrate understanding of the relationship
between fitting parameters and experimental data.
6. Demonstrate understanding of the distinction between raw data and experimental results.
5.1 References
1. C. Chatfield, Statistics for Technology, Chapman and Hall, London, 1983.
2. W.J. Palm, Introduction to MATLAB 7 for Engineers, McGraw-Hill, 2005, pp. 312-315.
5.2 Linear Regression
Linear regression is a technique for finding the linear relationship between the x and y values of
experimental data. It assumes that there is a linear relationship between x and y. This may not
always be the case: there may be a linear relationship between x and a function of y, there may
be a non-linear relationship, or there may be no relationship at all between the data. We apply
linear regression only when we have reason to think that a linear relationship exists. It is
necessary first to identify which is the independent variable and which is the dependent variable.
Here is a method to follow. Many of you have linear regression programs on your engineering
calculators; linear regression is also built into Excel and MATLAB. However, to
understand the meaning and reliability of fitted data, it is best to perform a manual linear
regression at least once in your life. In the exercise that follows, you may use your calculator for
adding columns of data only, but not for fitting. When you have finished the linear regression
exercise, you may use your calculator's program to check your result; then you can plot the data
in Excel or MATLAB and compare the results obtained with Excel's or MATLAB's algorithm.
More on this technique, and other statistical methods, can be found in the references.
Before beginning any calculations for a linear regression analysis, consider the following points:
1. Identify x and y, that is the independent and dependent variables in the experiment.
2. Make a scatter diagram, i.e. plot the data on an x-y coordinate system and see if it is roughly
linear.
3. If it isn't, think about whether you should plot some f(y), such as log y or $y^2$; this generally
requires some notion of what the experiment is about and what relationship is expected in
the data. If you need to transform the x or y data, do so, and create a new set of data (cf. ENGR25).
4. You now have n pairs of data, which we shall refer to as $(x_i, y_i)$. The variable $x_i$ is referred to
as the control or independent variable; $y_i$ is referred to as the response or dependent variable.
5. Assume y is subject to scatter (i.e. random errors) and that x is not.
6. Assume a linear relationship exists, as $y = a_0 + a_1 x$ (equivalent to $y = mx + b$).
7. Our objective is to find estimates for $a_0$ and $a_1$ such that the line gives a “good fit”. The
method of least squares is one way to do this. In general, the values of $a_0$ and $a_1$ are
interpreted to have some fundamental physical meaning. This is one way of determining the
value of physical constants or materials properties. The values so obtained are only as
reliable as the measurements they are derived from.
8. The assumption of a linear fit sets aside the possibility that there is a more
complex relationship between the variables, that the relationship is linear only over a limited
range of the independent variable, or that other factors which we have not explicitly
considered influence the data. All of these possibilities must be considered in
interpreting the data.
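Point 3 above, transforming the data before fitting, can be illustrated with a minimal Python sketch. The data here are hypothetical: they are generated from an assumed exponential law, y = 3 exp(0.5 x), and do not come from any experiment in this chapter. The helper names `transform_y` and `successive_slopes` are illustrative only.

```python
import math

# Hypothetical data generated from y = 3 * exp(0.5 * x); not experimental values.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [3.0 * math.exp(0.5 * xi) for xi in x]


def transform_y(y_values, f=math.log):
    """Create a new data set by applying f to every y value."""
    return [f(yi) for yi in y_values]


def successive_slopes(x_values, y_values):
    """Slope between each neighbouring pair of points.

    Roughly equal slopes suggest the points lie close to a straight line.
    """
    return [
        (y_values[i + 1] - y_values[i]) / (x_values[i + 1] - x_values[i])
        for i in range(len(x_values) - 1)
    ]


raw_slopes = successive_slopes(x, y)   # slopes grow steadily: y vs. x is curved
log_y = transform_y(y)                 # new data set: log y
log_slopes = successive_slopes(x, log_y)  # slopes are constant: log y vs. x is linear

print("slopes of y vs. x:    ", raw_slopes)
print("slopes of log y vs. x:", log_slopes)
```

For real data with scatter the slopes of the transformed set will not be exactly equal, so a scatter plot of the transformed data remains the practical check.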
Now that your data is available for a linear regression, you are ready to begin. The following
steps provide a guide to performing a linear regression analysis:
1. Consider that at $x_i$, the predicted value of y is $y_p = a_0 + a_1 x_i$.
2. The difference between the observed y-value ($y_i$) and the predicted value ($y_p$) is
$e_i = y_i - (a_0 + a_1 x_i)$, where $e_i$ is the deviation or residual. There is a value of $e_i$ for each data pair.
3. Choose values of $a_0$ and $a_1$ that minimize the sum of the squares of the $e_i$ over all data
pairs. We will call this sum of squares S.
$$S = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left[ y_i - (a_0 + a_1 x_i) \right]^2$$
$$S = f(a_0, a_1)$$
4. To minimize S, we need to find the partial derivatives $\partial S / \partial a_0$ and $\partial S / \partial a_1$, set them equal to
zero, and solve the resulting simultaneous equations for the least squares estimates
$\hat{a}_0$ and $\hat{a}_1$. In the pages that follow, each sum is assumed to be from i=1 to i=n, where n is
the total number of data pairs.
$$\frac{\partial S}{\partial a_0} = \sum 2 \left[ y_i - (a_0 + a_1 x_i) \right] (-1)$$
$$\frac{\partial S}{\partial a_1} = \sum 2 \left[ y_i - (a_0 + a_1 x_i) \right] (-x_i)$$
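5. Next set the two partial derivatives equal to zero, and solve for the values of $\hat{a}_0$ and $\hat{a}_1$:
$$\frac{\partial S}{\partial a_0} = 0 = \sum 2 \left[ y_i - \hat{a}_0 - \hat{a}_1 x_i \right] (-1)$$
$$0 = -\sum 2 y_i + \sum 2 \hat{a}_0 + \sum 2 \hat{a}_1 x_i$$
$$\sum y_i = n \hat{a}_0 + \hat{a}_1 \sum x_i$$
and similarly, for the second partial:
$$\frac{\partial S}{\partial a_1} = 0 = \sum 2 \left[ y_i - \hat{a}_0 - \hat{a}_1 x_i \right] (-x_i)$$
$$0 = -\sum 2 x_i y_i + \sum 2 \hat{a}_0 x_i + \sum 2 \hat{a}_1 x_i^2$$
$$\sum x_i y_i = \hat{a}_0 \sum x_i + \hat{a}_1 \sum x_i^2$$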
6. We now have two normal equations in $\hat{a}_0$ and $\hat{a}_1$ which need to be solved simultaneously.
$$n \hat{a}_0 + \hat{a}_1 \sum x_i = \sum y_i$$
$$\hat{a}_0 \sum x_i + \hat{a}_1 \sum x_i^2 = \sum x_i y_i$$
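To solve the normal equations, collect the coefficients of $\hat{a}_0$ and $\hat{a}_1$. Let $x = \hat{a}_0$ and
$y = \hat{a}_1$, and let a through f be the coefficients:
$$ax + by = c$$
$$dx + ey = f$$
Here $a = n$, $b = \sum x_i$, $c = \sum y_i$, $d = \sum x_i$, $e = \sum x_i^2$, and $f = \sum x_i y_i$.
Solving each equation for x, we get:
$$x = \frac{c - by}{a} \quad \text{and} \quad x = \frac{f - ey}{d}$$
$$\frac{c - by}{a} = \frac{f - ey}{d}$$
and by rearranging, eventually we get:
$$y = \hat{a}_1 = \frac{dc - af}{db - ae}$$
or
$$\hat{a}_1 = \frac{\sum x_i \sum y_i - n \sum x_i y_i}{\left( \sum x_i \right)^2 - n \sum x_i^2}$$
where $\hat{a}_1$ is the slope of the least squares fit.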
7. Doing a similar calculation for the intercept, we find:
$$\hat{a}_0 = \frac{\sum y_i - \hat{a}_1 \sum x_i}{n}$$
8. To find the linear fit, then, we simply calculate $\hat{a}_0$ and $\hat{a}_1$ from our data set. Then
plot the line and confirm that there is a reasonable fit to the data. This explanation does not
include further statistical analysis, such as how good the linear fit is or what the confidence
intervals are. These should be investigated further in other courses.
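The steps above can be sketched in a few lines of Python, using only the sums that appear in the slope and intercept formulas. This is a minimal illustration, not a replacement for the manual exercise: the function names `linear_fit` and `sum_squared_residuals` and the three-point data set are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
def linear_fit(x, y):
    """Least squares estimates (a0_hat, a1_hat) for the line y = a0 + a1*x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")

    # The four sums that appear in the normal equations.
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)

    denominator = sum_x ** 2 - n * sum_x2
    if denominator == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    a1_hat = (sum_x * sum_y - n * sum_xy) / denominator  # slope
    a0_hat = (sum_y - a1_hat * sum_x) / n                # intercept
    return a0_hat, a1_hat


def sum_squared_residuals(x, y, a0, a1):
    """S = sum of [y_i - (a0 + a1*x_i)]^2 for a trial line."""
    return sum((yi - (a0 + a1 * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data, chosen for easy hand checking (not from the lab sheets).
x_demo = [1.0, 2.0, 3.0]
y_demo = [2.0, 3.0, 5.0]

a0_hat, a1_hat = linear_fit(x_demo, y_demo)
print("intercept:", a0_hat)
print("slope:    ", a1_hat)
print("S at fit: ", sum_squared_residuals(x_demo, y_demo, a0_hat, a1_hat))
```

For this data set n = 3, Σx = 6, Σy = 10, Σxy = 23, and Σx² = 14, so the slope is (60 − 69)/(36 − 42) = 1.5 and the intercept is (10 − 9)/3 ≈ 0.333. Evaluating `sum_squared_residuals` for a line with a slightly different slope or intercept gives a larger S, which is what "least squares" means.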
LAB REPORT SHEET 5.1
Determining a Linear Relationship
Key Member (Encourage all members of the team to participate, ensure that everyone
understands the material, and organize the tasks and divide them between the team
members):
Graphics Analyst (Generate the needed plots and figures.): _______________
Other Group Members:
Consider the following sets of experimental data. In each case, determine:
• Whether there appears to be any functional relationship at all among the data
• If there is a relationship, determine which is the independent variable and
which is the dependent variable.
• Determine whether there is a linear relationship between the data, and why you
might expect one.
• If the relationship is not linear, determine whether there is a way to transform the
data so that it is linear.
1. You have data from a meteorologist on temperature for each month of the year,
measured in one location using the same thermometer, on three different days each
month. Data is in degrees Fahrenheit.
Month   Readings (°F)
Jan     23, 34, 13
Feb     12, 45, 32
Mar     14, 32, 33
Apr     38, 50, 51
May     50, 62, 65
June    70, 75, 79
July    80, 82, 95
Aug     88, 83, 75
Sep     84, 75, 90
Oct     78, 65, 50
Nov     50, 45, 32
Dec     32, 30, 28
2. You have data on the electrical resistance of an aluminum wire at various temperatures:
Temp (°C)   Resistance (ohms)
20          1.12E-07
30          1.16E-07
40          1.21E-07
50          1.25E-07
60          1.29E-07
70          1.34E-07
80          1.38E-07
90          1.43E-07
100         1.47E-07
110         1.51E-07
120         1.56E-07
130         1.60E-07
140         1.64E-07
150         1.69E-07
3. You have data on the impact energy of a bullet, measured at various take-off
velocities. Answer the questions above and determine the mass of the bullet.
Velocity (m/s)   Energy (J)
50               1.25E+02
55               1.51E+02
60               1.80E+02
65               2.11E+02
70               2.45E+02
75               2.81E+02
80               3.20E+02
85               3.61E+02
90               4.05E+02
95               4.51E+02
100              5.00E+02
LAB REPORT SHEET 5.2
Performing a Linear Regression
Key Member (Encourage all members of the team to participate, ensure that everyone
understands the material, and organize the tasks and divide them between the team members):
_______________
Calculations Expert (Work with your calculator to help your team solve the problems.)
Graphics Analyst (Generate the needed plots and figures.): _______________
Other Group Members:
Exercise 1
The following data was recorded in an experiment which measured the variation of the specific
heat of a chemical with temperature. In this temperature regime, it was expected that the specific
heat (Cp) should depend linearly on the absolute temperature, T. Two measurements were made
at each temperature.
Temperature (°C)   Specific Heat (J/mol °C)
50                 1.6     1.64
60                 1.63    1.65
70                 1.67    1.68
80                 1.7     1.72
90                 1.71    1.72
100                1.71    1.74
1. Note that the two measurements made at the same temperature can be considered
independent measurements, so you have 12 pieces of data.
2. Plot the data on a scatter diagram and determine whether a linear relationship exists.
3. Fit a straight line to the data by eye; find the slope and intercept of this line; write an
equation for this line.
4. Estimate the specific heat of this chemical when the temperature is 75 °C.
Exercise 2
1. Perform a linear regression analysis on the data of Exercise 1. Using the regression values for
the slope and the intercept, add the fitted line to your plot from Exercise 1, using a different
color pencil from the previous line fit to the data by eye.
2. Estimate the specific heat of this chemical when the temperature is 75 °C.
3. What is the percentage difference between the estimated values for 75 °C found in Exercise 1
and Exercise 2? Which value is more accurate? Why?