Math 137 Module 13 intro SE and residuals.docx

Module 13: SSE and Residual Plots
Sum of Squared Errors (SSE)
The estimate made from a model is the predicted value (denoted as ŷ).
Residual = observed – predicted = y – ŷ
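As a small illustration of the residual formula, here is a minimal sketch in Python. The data points and the line ŷ = 1 + 2x are made up for this example; they are not from the module's graphs.

```python
# Hypothetical data and a hypothetical model line y-hat = 1 + 2x (illustration only).
xs = [1, 2, 3, 4]
ys = [3.5, 4.5, 7.0, 9.5]

def predict(x):
    """Predicted value y-hat from the assumed line."""
    return 1 + 2 * x

# Residual = observed - predicted, one residual per data point.
residuals = [y - predict(x) for x, y in zip(xs, ys)]
print(residuals)
```

A positive residual means the observed point lies above the line, and a negative residual means it lies below.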
Which line is better?
What’s a way we can determine which is better?
Each yellow square has a side equal to a residual, so the sum of the areas of the squares is the SSE. When we compare these sums (using computer technology), the line on the left
has an SSE of 57.8. The line on the right has a
smaller SSE of 43.9.
So the line on the right fits the points better, but is it the best fit?
Computer technology finds the best-fit line, the one for which the SSE is a minimum.
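To see how technology might compare lines by SSE, here is a minimal sketch in Python. The data and the two candidate lines are hypothetical (they are not the lines in the figure), and the helper names `sse` and `least_squares_line` are made up for this example. The best-fit line is computed with the standard least-squares formulas for slope and intercept.

```python
def sse(xs, ys, slope, intercept):
    """Sum of squared errors: add up (observed - predicted)^2 over all points."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

def least_squares_line(xs, ys):
    """Slope and intercept of the line that minimizes the SSE."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data (illustration only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 6]

# Two hypothetical candidate lines.
sse_a = sse(xs, ys, slope=1.0, intercept=1.0)
sse_b = sse(xs, ys, slope=0.5, intercept=2.5)

# The least-squares line and its SSE.
best_slope, best_intercept = least_squares_line(xs, ys)
sse_best = sse(xs, ys, best_slope, best_intercept)

print("Line A SSE:", sse_a)
print("Line B SSE:", sse_b)
print("Best-fit SSE:", sse_best)
```

For this data, line A has the smaller SSE of the two candidates, but the least-squares line has a smaller SSE than either one.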
Residuals
To create a residual plot, we take the residuals and plot these errors as distances from a baseline, positioned along the
explanatory or x-variable.
Recall that the error or residual is the vertical distance between the data point and the line of
regression, which is given by:
y – ŷ
Take these distances and plot them as vertical distances based on the x-value.
Here we are showing the graph of the points with an attached line which shows
the distances. When we do our residual plots, these connecting lines will not be
present.
You may want to draw the lines at first to get used to marking the distances, and
then erase them afterward to get your completed residual plot.
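The steps for building a residual plot can be sketched in Python as follows. The data and the regression line ŷ = 1.8 + 0.8x are hypothetical, and `text_residual_plot` is a made-up helper that draws a crude text version of the plot turned on its side, so the baseline runs top to bottom instead of left to right.

```python
# Hypothetical data and a hypothetical fitted line (illustration only).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 6]
slope, intercept = 0.8, 1.8  # assumed regression line: y-hat = 1.8 + 0.8x

# Each plotted point is (x, residual), where residual = y - y-hat.
points = [(x, y - (intercept + slope * x)) for x, y in zip(xs, ys)]

def text_residual_plot(points, scale=5):
    """Crude text plot, turned on its side: '|' is the baseline (residual 0),
    marks to the right are positive residuals, marks to the left are negative."""
    width = max(round(abs(r) * scale) for _, r in points)
    rows = []
    for x, r in points:
        marks = "#" * round(abs(r) * scale)
        left = marks.rjust(width) if r < 0 else " " * width
        right = marks if r > 0 else ""
        rows.append(f"x={x}  {left}|{right}")
    return "\n".join(rows)

print(text_residual_plot(points))
```

On graph paper or in a graphing tool, the same `points` would be plotted as dots, with x on the horizontal axis and the residual on the vertical axis.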
Example: A
[Residual plot: Residual (vertical axis, -3 to 2) versus x (horizontal axis, 2.5 to 15)]
If there is NO PATTERN in the residual plot, then the linear
model is a good fit.
Example: B
[Residual plot: Residual (vertical axis, -2 to 2) versus x (horizontal axis, 2.5 to 15)]
If there IS A PATTERN in the residual plot, then the linear
model is not the best fit, and perhaps another equation would be a
better model for the data.
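To see what a pattern in the residuals can look like, here is a minimal sketch in Python. The data are hypothetical and deliberately follow a curve (y = x²), and the helper `least_squares_line` is made up for this example. We fit a straight line anyway and look at the residuals.

```python
def least_squares_line(xs, ys):
    """Slope and intercept of the line that minimizes the SSE."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical curved data: y = x^2 (illustration only).
xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]

slope, intercept = least_squares_line(xs, ys)

# Residual = observed - predicted for each point.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
print(residuals)
```

The residuals are positive at both ends and negative in the middle, which gives a U-shaped residual plot. That kind of pattern suggests a curved model may describe the data better than a line.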