AP Statistics Section 3.2 B Residuals

One of the purposes of drawing a
regression line is to predict y based on x.
Since any prediction errors we make
are errors in y, we would like to find the
line that makes the vertical distances from
our data points to the line as small as
possible. The predicted response will
usually not be exactly the same as the
actual observed response.
One of the first principles of data
analysis is to look for an overall
pattern and for striking deviations
from that pattern.
Residuals show how well the least-squares
line (LSL) fits the data. A residual is the
difference between an observed
value of the response variable and
the value predicted by the
regression line.
residual  y - yˆ
Example: Refer back to Section 3.2 A
and find the residual for the subject
whose NEA rose by 135 calories.
$\hat{y} = 3.505 - 0.0034(135)$
$\hat{y} = 3.046$
$\text{residual} = 2.7 - 3.046 = -0.346$
The residual is negative because
the data point lies below the line.
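
The arithmetic in this example can be checked with a short Python sketch. The intercept 3.505 and the slope magnitude 0.0034 come from the example; the slope is taken as negative because that is the only sign that gives the stated prediction of 3.046. The function and variable names are illustrative and not part of the notes.

```python
# Regression line from the Section 3.2 A example
# (fat gain predicted from change in NEA). Names are illustrative.
INTERCEPT = 3.505
SLOPE = -0.0034  # negative: fat gain falls as NEA rises

def predict(x):
    """Predicted response (y-hat) for explanatory value x."""
    return INTERCEPT + SLOPE * x

def residual(x, y_observed):
    """Observed y minus predicted y."""
    return y_observed - predict(x)

y_hat = predict(135)
res = residual(135, 2.7)
print(round(y_hat, 3))  # 3.046
print(round(res, 3))    # -0.346, negative: the point lies below the line
```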
The sum of the least-squares
residuals is always 0.
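
To see this zero-sum property numerically, here is a minimal sketch that fits a least-squares line by the usual slope and intercept formulas and then adds up the residuals. The data values are made up purely for illustration, and the helper name `least_squares` is hypothetical.

```python
# Hypothetical data, invented only to illustrate the property.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

def least_squares(xs, ys):
    """Return (intercept, slope) of the least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

a, b = least_squares(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
total = sum(residuals)

# The sum is 0 up to floating-point rounding error.
print(abs(total) < 1e-9)  # True
```

The individual residuals here are not zero; the positive and negative ones cancel.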
The graph below at the right is the residual plot
for the NEA vs. fat gain example in Section 3.2 A.
A residual plot makes it easier to study the
residuals by plotting them against the
explanatory variable. Because the mean of the
residuals is always 0, the horizontal line at zero
helps orient us. This “residual = 0” line
corresponds to the regression
line we drew in the section
3.2 A notes.
The residual plot magnifies the deviations
from the line to make patterns easier to
see. If the regression line captures the
overall pattern of the data, there should be
no pattern in the residuals, such as in the
graph to the left.
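
For contrast, the sketch below shows what a pattern in the residuals can look like when a straight line is fit to curved data. The data (y = x squared) are hypothetical and chosen only because the curve is obvious; the (x, residual) pairs it prints are the points a residual plot would display.

```python
# Hypothetical curved data: y = x squared. Not from the notes.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [x ** 2 for x in xs]

# Least-squares slope and intercept from the usual formulas.
n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar

# Residual = observed y minus predicted y, paired with x for plotting.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
for x, r in zip(xs, residuals):
    print(x, r)

# Signs of the residuals read left to right along the x-axis.
signs = ["+" if r > 0 else "-" for r in residuals]
print(signs)  # ['+', '-', '-', '-', '+']
```

The residuals are positive at both ends and negative in the middle, a curved pattern that signals the line does not capture the overall shape of the data.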
CALCULATOR:
Put the data in 2 lists and find the LSL.
2nd Y= (STAT PLOT)
Xlist: explanatory variable
Ylist: RESID (press 2nd STAT (LIST), choose NAMES, scroll down to RESID, press ENTER)
GRAPH
ZOOM 9 (ZoomStat)
Here are two important things to
look for when you examine a
residual plot.
1. The residual plot should show no obvious patterns,
such as a fan-shaped spread.
2. The residuals should be relatively small in size.
Since smallness is relative, we could find the
standard deviation of the residuals, which is
given by the equation
$s = \sqrt{\dfrac{\sum \text{residuals}^2}{n - 2}}$
The standard deviation of the residuals
represents the amount of error that could
“consistently” occur using the LSL to make
predictions.
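
As a rough illustration of this calculation, the sketch below computes the standard deviation of the residuals by squaring the residuals, summing them, dividing by n - 2, and taking the square root. The residual values are hypothetical, and the function name `residual_sd` is illustrative.

```python
import math

def residual_sd(residuals):
    """Standard deviation of the residuals: sqrt(sum of squares / (n - 2))."""
    n = len(residuals)
    if n <= 2:
        raise ValueError("need more than 2 residuals")
    return math.sqrt(sum(r ** 2 for r in residuals) / (n - 2))

# Hypothetical residuals from a least-squares fit to 5 data points.
residuals = [0.06, -0.13, 0.18, -0.21, 0.10]
s = residual_sd(residuals)
print(round(s, 3))  # 0.189
```

The divisor is n - 2 rather than n because two quantities, the slope and the intercept, were estimated from the data.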