Chapter 10 - TeacherWeb

"Each of us is a statistical impossibility around which hover a million other lives that were never destined to be born." -- Loren Eiseley
Chapter 10: Re-expressing Data: Get it Straight! (pages 220 – 243)
OVERVIEW: What happens if our scatterplot is not linear? We make it linear. If a scatterplot shows a
curved pattern, it may be conveniently modeled by an exponential growth or decay function of the
form $y = ab^x$ or a power function of the form $y = ax^b$.
In these situations, we can linearize the data by using logarithms. One advantage of logarithms
is that they produce smaller numbers, which makes graphical displays more convenient to construct.
Definition: $\log_b x = y$ if and only if $b^y = x$. [x > 0, b > 0, and b is not equal to 1]
Rules for logarithms
1. $\log(AB) = \log A + \log B$
2. $\log(A/B) = \log A - \log B$
3. $\log A^p = p \log A$
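A quick numeric check can make the definition and the three rules concrete. This is only a sketch in Python, which these notes do not otherwise use; the values of `A`, `B`, and `p` are arbitrary numbers picked for illustration.

```python
import math

A, B, p = 8.0, 2.0, 3.0  # arbitrary positive numbers, chosen only for illustration

# Definition: log_b(x) = y exactly when b**y = x (here b = 2 and x = 8)
y = math.log(A, B)
print(y, B ** y)

# Each pair holds (left side, right side) of one rule, using common logs
rules = {
    "log(AB) = log A + log B": (math.log10(A * B), math.log10(A) + math.log10(B)),
    "log(A/B) = log A - log B": (math.log10(A / B), math.log10(A) - math.log10(B)),
    "log(A^p) = p log A": (math.log10(A ** p), p * math.log10(A)),
}
for name, (left, right) in rules.items():
    # isclose allows for tiny floating-point rounding differences
    print(name, math.isclose(left, right))
```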
Note that $y = ab^x$ (exponential function)
==> $\log y = \log(ab^x)$
==> $\log y = \log a + x \log b$
This is a linear relationship between the variables x and log y, since log a and log b are constants.
Also, $y = ax^b$ (power function)
==> $\log y = \log(ax^b)$
==> $\log y = \log a + b \log x$
This is a linear relationship between the variables log x and log y.
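To see both linearizations at work, the sketch below builds points that lie exactly on an exponential curve and on a power curve, then checks that the slope between neighbouring transformed points never changes. The constants `a` and `b`, the x-values, and the helper `successive_slopes` are all made up for illustration.

```python
import math

def successive_slopes(us, vs):
    """Slope between each pair of neighbouring points (u, v)."""
    return [(vs[i + 1] - vs[i]) / (us[i + 1] - us[i]) for i in range(len(us) - 1)]

a, b = 3.0, 1.5          # hypothetical constants
xs = [1, 2, 3, 4, 5]     # hypothetical x-values

# Exponential y = a * b**x: plot (x, log y); the slope should be log b
exp_log_y = [math.log10(a * b ** x) for x in xs]
exp_slopes = successive_slopes(xs, exp_log_y)

# Power y = a * x**b: plot (log x, log y); the slope should be b
log_x = [math.log10(x) for x in xs]
pow_log_y = [math.log10(a * x ** b) for x in xs]
pow_slopes = successive_slopes(log_x, pow_log_y)

print(exp_slopes, math.log10(b))
print(pow_slopes, b)
```

A constant slope is exactly what "linear" means, so each transformation straightens the curve it was designed for.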
Let’s take a closer look by linearizing some data using both models.
Ex. The table shows the temperature of an instrument measured as its distance from a heat source is varied.

Distance (cm)      1    2    3    4    5    6    7    8
Temperature (°F)   130  105  95   87   83   80   78   77
1. Plot scatterplot: looks like a curve
2. Run Linear Regression: ŷ = ____   r = ____   r² = ____
3. Residual Plot: definite pattern
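The linear regression and its residuals can also be checked away from the calculator. The sketch below is one way to do it in Python, which these notes do not use themselves. The helper `least_squares` and the list names `distance` and `temperature` (standing in for L1 and L2) are names chosen here for illustration.

```python
import math

def least_squares(xs, ys):
    """Return (intercept, slope, r) for the least-squares line ys = intercept + slope * xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope, sxy / math.sqrt(sxx * syy)

distance = [1, 2, 3, 4, 5, 6, 7, 8]               # L1
temperature = [130, 105, 95, 87, 83, 80, 78, 77]  # L2

intercept, slope, r = least_squares(distance, temperature)

# Residual = observed y minus predicted y
residuals = [y - (intercept + slope * x) for x, y in zip(distance, temperature)]

print(f"y-hat = {intercept:.3f} + ({slope:.3f})x   r = {r:.3f}   r^2 = {r * r:.3f}")
print([round(e, 1) for e in residuals])
```

The residuals come out positive at both ends and negative in the middle, which is the curved pattern the residual plot shows.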
Let’s try the exponential model (x, log y):
1. Linearize data by taking log of y (L2) and loading into L3.
2. Scatterplot (L1, L3): still a curve
3. Run Linear Regression* (L1, L3): ŷ = ____   r = ____   r² = ____
4. Residual Plot (L1, RESID): still a definite pattern
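The exponential-model steps can be mirrored in a few lines of Python as a rough check on the calculator work. This is a sketch only: `least_squares`, `distance`, `temperature`, and `log_temperature` are illustrative names standing in for the regression command and for lists L1, L2, and L3.

```python
import math

def least_squares(xs, ys):
    """Return (intercept, slope, r) for the least-squares line ys = intercept + slope * xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope, sxy / math.sqrt(sxx * syy)

distance = [1, 2, 3, 4, 5, 6, 7, 8]               # L1
temperature = [130, 105, 95, 87, 83, 80, 78, 77]  # L2

# Step 1: take the common log of y and store it (L3)
log_temperature = [math.log10(t) for t in temperature]

# Step 3: regress log y on x
intercept, slope, r = least_squares(distance, log_temperature)

# Step 4: residuals are measured on the log scale
residuals = [ly - (intercept + slope * x) for x, ly in zip(distance, log_temperature)]

print(f"log y-hat = {intercept:.4f} + ({slope:.4f})x   r = {r:.3f}   r^2 = {r * r:.3f}")
print([round(e, 3) for e in residuals])
```

Rounded to two decimal places, the intercept and slope agree with the 2.09 and .03 that appear in step 5.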
TIME OUT: Normally, we would not proceed because the residual plot says this is not a good model.
We will continue with the process for the sake of keeping the notes organized.
So far we have transformed an exponential pattern into a linear pattern and run a regression on the
transformed data. When the residual plot shows no pattern, we perform the inverse transformation on
the linear equation, solving for ŷ, to find a curve of best fit for the original data.
5. Transform back in order to plot curve on original data
$\log_{10} \hat{y} = 2.09 - 0.03x$
Now, because $\log_b x = y$ iff $b^y = x$, we have $b^{\log_b x} = x$, so
$10^{\log_{10} \hat{y}} = 10^{2.09 - 0.03x}$
$\hat{y} = 10^{2.09 - 0.03x}$
is what our transformed equation ends up looking like.
6. TI-83: turn off Y1 and load Y2 with the transformed equation
7. Stat Plot on original data (L1, L2)
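Undoing the log can be sketched in Python as follows. The coefficients 2.09 and -0.03 are the ones from step 5; the function name `predict_temperature` and the names `a` and `b` (borrowed from the form y = ab^x) are chosen here for illustration.

```python
intercept, slope = 2.09, -0.03  # from log y-hat = 2.09 - 0.03x

def predict_temperature(x):
    """Raise 10 to both sides of the linear equation to get y-hat back."""
    return 10 ** (intercept + slope * x)

# The same curve written in the form y = a * b**x
a = 10 ** intercept  # roughly 123
b = 10 ** slope      # roughly 0.933

for x in range(1, 9):
    # Both columns should agree, since 10**(p + q*x) = (10**p) * (10**q)**x
    print(x, round(predict_temperature(x), 1), round(a * b ** x, 1))
```

Writing the curve as a * b**x shows that the back-transformed equation really is an exponential function of x.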
Let’s try the power model (log x, log y):
1. Linearize data by taking log of x (L1) and loading into L3, and log of y (L2) and loading into L4.
2. Scatterplot (L3, L4): much more linear
3. Run Linear Regression* (L3, L4): ŷ = ____   r = ____   r² = ____
4. Residual Plot (L1, RESID): less pattern than others
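The power-model regression can be reproduced the same way. In this sketch, `least_squares`, `log_distance`, and `log_temperature` are illustrative names standing in for the regression command and for lists L3 and L4.

```python
import math

def least_squares(xs, ys):
    """Return (intercept, slope, r) for the least-squares line ys = intercept + slope * xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope, sxy / math.sqrt(sxx * syy)

distance = [1, 2, 3, 4, 5, 6, 7, 8]               # L1
temperature = [130, 105, 95, 87, 83, 80, 78, 77]  # L2

# Step 1: take logs of both variables (L3 and L4)
log_distance = [math.log10(d) for d in distance]
log_temperature = [math.log10(t) for t in temperature]

# Step 3: regress log y on log x
intercept, slope, r = least_squares(log_distance, log_temperature)

# Step 4: residuals on the log scale
residuals = [ly - (intercept + slope * lx)
             for lx, ly in zip(log_distance, log_temperature)]

print(f"log y-hat = {intercept:.4f} + ({slope:.4f}) log x   r = {r:.3f}   r^2 = {r * r:.3f}")
print([round(e, 3) for e in residuals])
```

For this data set, r² comes out noticeably closer to 1 than it does for the linear or exponential fits.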
5. Transform back in order to plot curve on original data
We have $\log y = a + b \log x$
$10^{\log y} = 10^{a + b \log x}$
$y = (10^a)(10^{\log x})^b$
$y = 10^a (x^b)$
is what our transformed equation looks like.
6. TI-83: turn off Y1 and load Y2 with the transformed equation
7. Stat Plot on original data (L1, L2)
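Here is a sketch of the power-model back-transformation in Python. It refits log y on log x so that `a` and `b` have actual values, then evaluates y = 10^a (x^b) at each distance. The helper `least_squares` and the function name `predict_temperature` are assumptions made for this illustration.

```python
import math

def least_squares(xs, ys):
    """Return (intercept, slope, r) for the least-squares line ys = intercept + slope * xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope, sxy / math.sqrt(sxx * syy)

distance = [1, 2, 3, 4, 5, 6, 7, 8]               # L1
temperature = [130, 105, 95, 87, 83, 80, 78, 77]  # L2

# a and b in log y = a + b log x
a, b, _ = least_squares([math.log10(d) for d in distance],
                        [math.log10(t) for t in temperature])

def predict_temperature(x):
    """Back-transformed power curve: y = 10**a * x**b."""
    return 10 ** a * x ** b

# Compare the curve with the original (untransformed) data
for x, observed in zip(distance, temperature):
    print(x, observed, round(predict_temperature(x), 1))
```

The predictions are on the original temperature scale, so they can be compared directly with the table or plotted over the original scatterplot.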
*We have used the common log, which has base 10, in our example, but you may also use the natural log (ln).
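The footnote can be checked numerically: fitting with log base 10 or with ln gives different coefficients but the same curve once you transform back with the matching base. In this sketch the helper `least_squares` and the variable names are assumptions made for illustration.

```python
import math

def least_squares(xs, ys):
    """Return (intercept, slope, r) for the least-squares line ys = intercept + slope * xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope, sxy / math.sqrt(sxx * syy)

distance = [1, 2, 3, 4, 5, 6, 7, 8]
temperature = [130, 105, 95, 87, 83, 80, 78, 77]

# Exponential model fitted twice: once with common logs, once with natural logs
a10, b10, r10 = least_squares(distance, [math.log10(t) for t in temperature])
ae, be, re = least_squares(distance, [math.log(t) for t in temperature])

# Transform back with the base that matches the log that was used
curve_base_10 = [10 ** (a10 + b10 * x) for x in distance]
curve_base_e = [math.exp(ae + be * x) for x in distance]

print([round(v, 2) for v in curve_base_10])
print([round(v, 2) for v in curve_base_e])
```

The two lists match, and so do the correlations, because ln y is just log y multiplied by the constant ln 10.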
Shortcut to Identify Exponential Growth
A variable grows exponentially if it is multiplied by a fixed number greater than 1 in each equal
time period. In other words, the ratio $y_n / y_{n-1}$ of successive values is constant.
1. Test to see if you have common ratios among data.
2. Transform points into linear pattern (x, log y) and look at the differences between successive
values in L3, which has log y in it. Approximately constant ratios in y (equivalently, approximately
constant differences in log y) are good evidence that the scatterplot of log y on x is linear.
Don't forget the residuals. They are the most useful tool for deciding which model best fits a data set.
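The common-ratio test is easy to sketch in Python. The `growth` list below is a made-up series that grows by exactly 50% each period, included for contrast with the temperature data from the earlier example; the helper names are also assumptions. Note that the successive differences of log y are just the logs of the ratios, so the two checks always agree.

```python
import math

def common_ratios(ys):
    """Ratios y_n / y_(n-1) of successive values."""
    return [ys[i] / ys[i - 1] for i in range(1, len(ys))]

def log_differences(ys):
    """Differences log y_n - log y_(n-1) of successive values."""
    logs = [math.log10(y) for y in ys]
    return [logs[i] - logs[i - 1] for i in range(1, len(logs))]

growth = [100.0, 150.0, 225.0, 337.5]             # hypothetical exponential growth
temperature = [130, 105, 95, 87, 83, 80, 78, 77]  # data from the temperature example

growth_ratios = common_ratios(growth)
temperature_ratios = common_ratios(temperature)

print(growth_ratios)                                  # constant ratios
print([round(q, 3) for q in temperature_ratios])      # ratios that drift
print([round(d, 3) for d in log_differences(temperature)])
```

The temperature ratios drift steadily toward 1 instead of staying constant, which fits with the exponential model leaving a definite pattern in its residual plot.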
Stick with me. I know it may look confusing, but if you just practice the keystrokes
on the calculator with a few examples, you will have it in no time.
Complete a regression analysis for the following age and income data as indicated.

Age (years)       20    25    30    35    40    45    50    55     60
Income ($1,000)   18.5  23.6  29.8  38.5  49.0  64.1  78.5  102.0  130.8
1. Construct and label a scatterplot of the data.
2. Perform a linear regression on the data and plot the regression line on the scatterplot.
3. Discuss the goodness of fit of the linear regression, referencing the correlation coefficient and residual plot.
4. Perform the exponential and power transformations and show linear regression on both sets of data.
Exponential: ŷ = ____   r = ____   r² = ____   Residual Plot:
Power: ŷ = ____   r = ____   r² = ____   Residual Plot:
5. Plot each one of the models on the original scatterplot.
Exponential:
Power:
6. Which of the three models fits the data best, and which would you therefore use to make predictions?
7. What income would you predict for a 47-year-old person based on this data?
Type of Model    Transform and Run Regression On    Equation of Curve to Plot on Original Data
Linear
Exponential
Power