Linear Regression

Least-Squares Regression:
Linear Regression
Section 3.2
Reference Text:
The Practice of Statistics, Fourth Edition.
Starnes, Yates, Moore
Warm up/ quiz
• Draw a quick sketch of three scatterplots:
– Draw a plot with r ≈ .9
– Draw a plot with r ≈ -.5
– Draw a plot with r ≈ 0
Today’s Objective
• Regression line, introducing “y-hat”: ŷ = a + bx
– Predicted value
– Slope
– Y-intercept
• Extrapolation
• Residuals
• Least-squares Regression Line
– How to use your calculator effectively for time
Your Poster!
• Take a look at your poster: Do you think you
could draw a straight line that would go straight
through the middle where you have ½ your
points above and ½ your points below?
– Calculate your line:
• m = (y2 - y1) / (x2 - x1)
• Point slope form: y – y1 = m ( x - x1)
• In math-land this is known as a “line of
best fit”
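As a rough sketch of the calculation above: assuming you pick two points by eye from your poster, the slope formula and point-slope form can be combined to get y = mx + b. The points (2, 5) and (6, 13) are made-up illustration values, and the function name line_through is hypothetical.

```python
def line_through(p1, p2):
    """Return (m, b) for the line y = m*x + b through two points."""
    (x1, y1), (x2, y2) = p1, p2
    if x2 == x1:
        raise ValueError("vertical line: slope is undefined")
    m = (y2 - y1) / (x2 - x1)  # slope: rise over run
    b = y1 - m * x1            # point-slope form y - y1 = m(x - x1), evaluated at x = 0
    return m, b


# Two illustrative points chosen by eye (not real poster data).
m, b = line_through((2, 5), (6, 13))
print(f"y = {m}x + {b}")
```

A line drawn this way depends on which two points you happen to pick, which is why a more systematic method is useful.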
Regression Line
• In statistics, this is called a regression
line!
• A line that describes how a response
variable y changes as an explanatory
variable x changes. We often use a
regression line to predict the value of y for
a given value of x.
Formulas for Regression Line
• The RL is linear, so it follows the form y = mx + b
– In Statistics, we say
ŷ = a + bx
– In this context, ŷ is called the predicted value
– WARNING: we are entering “predicting” statistics,
so using the “ŷ” symbol is very important
– [story about AP training]
ŷ = a + bx
– So ŷ is the predicted value
– And ‘a’ is the y-intercept, the predicted value of y
when x=0
– And ‘b’ is the slope
Equation of Regression
The Meaning of Slope
• In a simple algebraic function like y = 2x + 17,
what is the real meaning of the slope?
– For every increase in x of 1 unit, y increases by 2
• In the function y = 2x + 17 what is the meaning
of the y intercept?
– It is the value y takes on when x = 0
• In statistics, if the regression line is ŷ = 3.505 - .00344x
– What is the slope?
– What is the y-intercept?
Context
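One way to see what the slope and y-intercept mean is to plug numbers into the line ŷ = 3.505 - .00344x from this slide. This is only a sketch: the slide does not say what x and y measure, so x = 100 is an arbitrary illustration value, and the names A, B and predict are hypothetical.

```python
A = 3.505      # y-intercept: the predicted value of y when x = 0
B = -0.00344   # slope: change in predicted y for each 1-unit increase in x


def predict(x):
    """Predicted value y-hat = a + b*x."""
    return A + B * x


# Moving x up by 1 changes y-hat by exactly the slope, wherever you start.
change = predict(101) - predict(100)
print(round(change, 5))

# At x = 0 the prediction is the y-intercept.
print(predict(0))
```

The negative slope means the predicted value goes down as x goes up.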
Extrapolation
Take a look at your poster!
• Take a look at the range of your data.
• Your line is linear, so it goes on and on
even past your data points
• Predict an output value when you input a
large number outside your data range
• Put it into context: examples?
• This is what’s known as extrapolating!
Extrapolation
• Extrapolation is the use of a regression line for
prediction outside the interval of values of
the explanatory variable x used to obtain
the line. Such predictions are often not
accurate.
• “Just because your line behaves the way it does within
the confines of your data does not mean it won’t get all squirrelly later on!
We can’t predict the behavior of data at the extremes.”
Example
• Some data were collected on the weight of a male white laboratory
rat for the first 25 weeks after its birth. A scatterplot of the weight (in
grams) and time since birth (in weeks) shows a fairly strong, positive
relationship. The linear regression equation weight = 100 + 40(time)
models the data fairly well.
• 1) What is the slope of the regression line? Explain what
it means in context.
• 2) What’s the y-intercept? Explain in context.
• 3) Predict the rat’s weight after 16 weeks. Show your
work.
• 4) Should you use the line to predict the rat’s weight at
age 2 years?
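After you have tried questions 3 and 4 by hand, a short sketch like this can check your arithmetic and flag extrapolation. It assumes the data cover weeks 0 through 25 (the slide says "the first 25 weeks") and that 2 years is about 104 weeks. The function names are hypothetical.

```python
INTERCEPT = 100        # grams: predicted weight at time 0
SLOPE = 40             # grams per week, according to the model
DATA_RANGE = (0, 25)   # weeks covered by the data (assumed from the slide)


def predict_weight(weeks):
    """Predicted weight in grams: weight = 100 + 40(time)."""
    return INTERCEPT + SLOPE * weeks


def is_extrapolation(weeks, data_range=DATA_RANGE):
    """True when the input lies outside the x-values used to fit the line."""
    low, high = data_range
    return weeks < low or weeks > high


for weeks in (16, 104):  # 104 weeks is roughly 2 years
    grams = predict_weight(weeks)
    note = "extrapolation" if is_extrapolation(weeks) else "within data range"
    print(f"{weeks} weeks: {grams} g ({note})")
```

The line will happily return a number for any input. The range check is what reminds you that a prediction far outside the data should not be trusted.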
Residuals
• Look at your graph: how far away are your
points from your line?
• A residual is the difference between an
observed value of the response variable
and the value predicted by the regression
line.
• Residual = observed y – predicted y
Finding a residual
• Find and interpret the residual for the hiker who
weighed 187 pounds.
• Regression line:
– Pack weight = 16.3 + .0908(187) = 33.28 lbs
– His actual pack weight was 30 pounds.
– Residual = observed – predicted
– Residual = 30 – 33.28 = -3.28
– The (-) sign tells us that the observed value is below the
predicted value by 3.28 pounds.
– Negative: below predicted; positive: above predicted
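The residual arithmetic for the 187-pound hiker can be sketched as follows. It assumes the line ŷ = 16.3 + 0.0908x, since those are the coefficients that reproduce the 33.28 lb prediction. The function name residual is hypothetical.

```python
def residual(observed, predicted):
    """Residual = observed y - predicted y."""
    return observed - predicted


# Predicted pack weight for a 187 lb hiker (coefficients assumed as noted above).
predicted_pack = 16.3 + 0.0908 * 187

# The hiker's actual pack weighed 30 lb.
r = residual(30, predicted_pack)

print(round(predicted_pack, 2), round(r, 2))
if r < 0:
    print("observed value is below the predicted value")
else:
    print("observed value is at or above the predicted value")
```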
Least-Squares Regression Line
• We’ve been using the least-squares
regression line this whole time!
ŷ = a + bx
• We will talk about where the LSRL comes
from next class!
• But for now… let’s learn how to use our tool
to make our Stats crunching FAST!
LSRL TI-83/ TI-89
• TI-83
– Put your data in L1 and L2 (x in L1, y in L2)
– STAT> CALC>#8 >Enter
• Did you know? Your TI-83 defaults to using L1
and L2 as its lists, so as long as you put your data
in L1 and L2, you don’t have to tell it which lists to use!
• TI-89
– Statistics/List Editor> F4 (CALC)>#3> #1
– Practice next slide!
Practice with your TI Calculator
Body Weight (lbs)    Backpack Weight (lbs)
120                  26
187                  30
109                  26
103                  24
131                  29
165                  35
158                  31
116                  28
• You should get: ŷ = 16.3 + .0908x
Homework
Worksheet