AP Statistics Section 3.2 A Regression Lines

Linear relationships between two quantitative variables are quite common.
Correlation measures the direction and strength of these relationships. Just as we
drew a density curve to model the data in a histogram, we can summarize the overall
pattern in a linear relationship by drawing a regression line on the scatterplot.
Note that regression requires that
we have an explanatory variable
and a response variable. A
regression line is often used to
predict the value of y for a given
value of x.
Who: 16 healthy young adults
What: Explanatory variable: change in NEA (cal)
Response variable: fat gain (kg)
Why: Do changes in NEA explain weight gain?
When, where, how and by whom? The data come from a controlled
experiment in which subjects were forced to overeat for an 8-week period.
Results of the study were published in Science magazine in 1999.
[Scatterplot: fat gain (kg) versus NEA change (calories). The x-axis runs from -100 to 700 calories and the y-axis from 0 to 8 kg.]
Numerical summary: The correlation between NEA change and fat gain is r = -0.7786.
A least-squares regression line relating y to x has an equation of the form
$\hat{y} = a + bx$
In this equation, b is the slope and a is the y-intercept.
The following formula gives the value of b:
$b = r \frac{s_y}{s_x}$
Once you have computed b, you can then find the value of a using this equation:
$a = \bar{y} - b\bar{x}$
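As a rough illustration of the two formulas above, here is a minimal Python sketch (not part of the original slides). The function name `least_squares_line` and the toy data are made up for illustration; the data are not the NEA study data.

```python
from math import sqrt

def least_squares_line(xs, ys):
    """Return (a, b) for y-hat = a + b*x, using b = r*s_y/s_x and a = y-bar - b*x-bar."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample standard deviations (divide by n - 1).
    s_x = sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    # Correlation r: sum of products of standardized values, divided by n - 1.
    r = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
            for x, y in zip(xs, ys)) / (n - 1)
    b = r * s_y / s_x        # slope
    a = y_bar - b * x_bar    # y-intercept
    return a, b

# Made-up toy data (NOT the study data), chosen to lie exactly on y = 5 - 2x.
toy_x = [0, 1, 2, 3]
toy_y = [5, 3, 1, -1]
a, b = least_squares_line(toy_x, toy_y)
print(f"y-hat = {a:.3f} + ({b:.3f})x")
```

Because the toy points fall exactly on a line, the sketch should recover a = 5 and b = -2 up to floating-point rounding.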
We can also find these values on our TI-83/84, the same way we found r earlier.
For this example, the least-squares line (LSL) is
$\hat{y} = 3.505 - 0.0034x$
or
FatGain (kg) = 3.505 - 0.0034 (NEA change (cal))
Interpreting b: The slope b is the predicted rate of change in the
response variable y as the explanatory variable x changes.
The slope b = -0.0034 tells us that predicted fat gain goes down by
0.0034 kg for each additional calorie of NEA.
You cannot say how important a
relationship is by looking at how
big the regression slope is.
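One way to see why the size of the slope says little about importance is that the slope changes whenever the units change. The Python sketch below (not part of the original slides; variable names are illustrative) re-expresses the slide's slope with fat gain measured in grams instead of kilograms.

```python
# Slope from the slide: predicted change in fat gain (kg) per extra calorie of NEA.
b_kg_per_cal = -0.0034

# Same relationship with fat gain measured in grams (1 kg = 1000 g).
b_g_per_cal = b_kg_per_cal * 1000

# The predicted change over 100 extra calories is the same physical amount either way.
change_kg = b_kg_per_cal * 100
change_g = b_g_per_cal * 100

print(f"slope: {b_kg_per_cal} kg/cal = {b_g_per_cal:.1f} g/cal")
print(f"100 extra calories: {change_kg:.2f} kg = {change_g:.0f} g")
```

The slope looks a thousand times larger in grams per calorie, yet the relationship between the two variables is unchanged.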
Interpreting a:
The y-intercept a = 3.505 kg is the
fat gain estimated by the model if
NEA does not change when a
person overeats.
Model: Using the equation above,
draw the LSL on your scatterplot.
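[Scatterplot: fat gain (kg) versus NEA change (calories), on which to draw the LSL.]
To draw the line, write the slope as a rise over a run:
$-0.0034 = \frac{-34}{10000} = \frac{-0.34}{100} = \frac{-1.7}{500}$
so the line falls 1.7 kg for every 500 calories of NEA.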
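A line can be drawn from any two points on it. As a small sketch (Python, not part of the original slides; the function name `predicted_fat_gain` is illustrative), the following evaluates the slide's equation at x = 0 and x = 500 to get two points to plot.

```python
A = 3.505    # y-intercept (kg), from the slide
B = -0.0034  # slope (kg per calorie), from the slide

def predicted_fat_gain(nea_change_cal):
    """Predicted fat gain (kg) from the least-squares line y-hat = A + B*x."""
    return A + B * nea_change_cal

# Two convenient x-values inside the plotted range.
for x in (0, 500):
    print(f"x = {x} cal -> y-hat = {predicted_fat_gain(x):.3f} kg")
```

This gives the points (0, 3.505) and (500, 1.805); the second sits 1.7 kg below the first, which matches the rise-over-run form of the slope.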
TI-83/84:
8:LinReg(a+bx) L1, L2, Y1
To enter Y1: VARS, then Y-VARS, then 1:Function, then 1:Y1
Press ENTER to run the regression, then GRAPH to see the line.
Prediction: Predict the fat gain for an individual whose NEA increases
by 400 cal by:
(a) using the graph: about 2.2 kg
(b) using the equation: $\hat{y} = 3.505 - 0.0034(400) = 2.145$ kg
Predict the fat gain for an individual whose NEA increases by 1500 cal.
$\hat{y} = 3.505 - 0.0034(1500)$
$\hat{y} = -1.595$
So we are predicting that this individual loses fat when he/she
overeats. What went wrong?
1500 is way outside the range of NEA values in our data.
Extrapolation is the use of a
regression line for prediction
outside the range of values of the
explanatory variable x used to
obtain the line. Such predictions
are often not accurate.
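A simple guard against extrapolation is to check x against the range of the data before predicting. The Python sketch below is illustrative only (not part of the original slides): the function name is hypothetical, and the range of -100 to 700 calories is an assumption read off the scatterplot's x-axis, not the exact minimum and maximum of the data.

```python
import warnings

A = 3.505    # y-intercept (kg), from the slide
B = -0.0034  # slope (kg per calorie), from the slide

# ASSUMPTION: observed NEA changes lie within the scatterplot's x-axis limits.
X_MIN, X_MAX = -100, 700

def predict_with_range_check(nea_change_cal):
    """Predict fat gain (kg), warning when x falls outside the assumed data range."""
    if not (X_MIN <= nea_change_cal <= X_MAX):
        warnings.warn(
            f"x = {nea_change_cal} is outside [{X_MIN}, {X_MAX}]: "
            "this prediction is an extrapolation."
        )
    return A + B * nea_change_cal

print(round(predict_with_range_check(400), 3))   # inside the range
print(round(predict_with_range_check(1500), 3))  # outside the range, triggers a warning
```

The two calls reproduce the slide's predictions for 400 and 1500 calories; only the second triggers the warning, flagging the negative "fat gain" as an unreliable extrapolation.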