Section 9.2 Linear Regression

Section 9.2 Objectives
• Find the equation of a regression line
• Predict y-values using a regression equation
Regression lines
• After verifying that the linear correlation between
two variables is significant, next we determine the
equation of the line that best models the data
(regression line).
• Can be used to predict the value of y for a given value
of x.
[Figure: regression line drawn through a scatter plot; horizontal axis x, vertical axis y]
Residuals
Residual
• The difference between the observed y-value and the
predicted y-value for a given x-value on the line.
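For a given x-value,
d_i = (observed y-value) − (predicted y-value)
[Figure: scatter plot with a regression line; horizontal axis x, vertical axis y. The residuals d1 through d6 are the vertical distances between each observed y-value and the predicted y-value on the line.]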
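The residual definition above can be sketched in a few lines of Python. The data points and the candidate line here are hypothetical, chosen only so the arithmetic is easy to follow; they are not from the slides.

```python
# Hypothetical toy data and a hypothetical candidate line (illustration only).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.5, 5.5, 8.0]
m, b = 2.0, 0.0  # assumed line: y-hat = 2x + 0


def residuals(xs, ys, m, b):
    """Return d_i = (observed y) - (predicted y) for each data point."""
    return [y - (m * x + b) for x, y in zip(xs, ys)]


d = residuals(xs, ys, m, b)
print(d)  # [0.0, 0.5, -0.5, 0.0]
```

A positive residual means the observed point lies above the line, a negative residual means it lies below, and a residual of zero means the point is on the line.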
Regression Line
Regression line (line of best fit)
• The line for which the sum of the squares of the
residuals is a minimum.
• The equation of a regression line for an independent
variable x and a dependent variable y is
ŷ = mx + b
where ŷ is the predicted y-value for a given x-value, m is the slope, and b is the y-intercept.
The Equation of a Regression Line
• ŷ = mx + b where
$$m = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2}$$
$$b = \bar{y} - m\bar{x} = \frac{\sum y}{n} - m\,\frac{\sum x}{n}$$
• ȳ is the mean of the y-values in the data
• x̄ is the mean of the x-values in the data
• The regression line always passes through the point (x̄, ȳ).
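As a minimal sketch, the slope and intercept formulas can be written directly in Python. The function name `regression_line` and the test data are assumptions made for illustration; the data are hypothetical points that lie exactly on y = 3x + 1, so the expected result is easy to check by hand.

```python
def regression_line(xs, ys):
    """Return (m, b) for y-hat = m*x + b, using the sum formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        # All x-values are identical, so the slope formula divides by zero.
        raise ValueError("all x-values are equal; slope is undefined")

    m = (n * sum_xy - sum_x * sum_y) / denom
    b = sum_y / n - m * sum_x / n  # b = y-bar - m * x-bar
    return m, b


# Hypothetical data lying exactly on y = 3x + 1.
m, b = regression_line([0, 1, 2, 3], [1, 4, 7, 10])
print(m, b)  # 3.0 1.0

# The line passes through (x-bar, y-bar) = (1.5, 5.5).
print(m * 1.5 + b)  # 5.5
```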
Example: Finding the Equation of a
Regression Line
Find the equation of the
regression line for the gross
domestic products and
carbon dioxide emissions
data.
GDP (trillions of $), x    CO2 emissions (millions of metric tons), y
1.6                        428.2
3.6                        828.8
4.9                        1214.2
1.1                        444.6
0.9                        264.0
2.9                        415.3
2.7                        571.8
2.3                        454.9
1.6                        358.7
1.5                        573.5
Solution: Finding the Equation of a
Regression Line
Recall from section 9.1:
x      y         xy          x²       y²
1.6    428.2     685.12      2.56     183,355.24
3.6    828.8     2983.68     12.96    686,909.44
4.9    1214.2    5949.58     24.01    1,474,281.64
1.1    444.6     489.06      1.21     197,669.16
0.9    264.0     237.6       0.81     69,696
2.9    415.3     1204.37     8.41     172,474.09
2.7    571.8     1543.86     7.29     326,955.24
2.3    454.9     1046.27     5.29     206,934.01
1.6    358.7     573.92      2.56     128,665.69
1.5    573.5     860.25      2.25     328,902.25
Σx = 23.1   Σy = 5554   Σxy = 15,573.71   Σx² = 67.35   Σy² = 3,775,842.76
Solution: Finding the Equation of a
Regression Line
Σx = 23.1   Σy = 5554   Σxy = 15,573.71   Σx² = 67.35   Σy² = 3,775,842.76
$$m = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2} = \frac{10(15{,}573.71) - (23.1)(5554)}{10(67.35) - 23.1^2} = \frac{27{,}439.7}{139.89} \approx 196.151977$$
$$b = \bar{y} - m\bar{x} \approx \frac{5554}{10} - (196.151977)\,\frac{23.1}{10} = 555.4 - (196.151977)(2.31) \approx 102.2889$$
Equation of the regression line: ŷ = 196.152x + 102.289
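The hand calculation can be cross-checked with a short Python sketch that recomputes the sums from the ten data pairs. The variable names `gdp` and `co2` are chosen here for illustration; the numbers are the ones from the example.

```python
# Data from the example: GDP (trillions of $) and CO2 emissions
# (millions of metric tons).
gdp = [1.6, 3.6, 4.9, 1.1, 0.9, 2.9, 2.7, 2.3, 1.6, 1.5]
co2 = [428.2, 828.8, 1214.2, 444.6, 264.0, 415.3, 571.8, 454.9, 358.7, 573.5]

n = len(gdp)
sum_x = sum(gdp)
sum_y = sum(co2)
sum_xy = sum(x * y for x, y in zip(gdp, co2))
sum_x2 = sum(x * x for x in gdp)

# Slope and y-intercept from the sum formulas.
m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b = sum_y / n - m * sum_x / n

print(round(m, 3), round(b, 3))  # 196.152 102.289
```

Floating-point sums differ from the hand-computed sums only in the last few bits, so the rounded slope and intercept agree with the values on the slide.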
Solution: Finding the Equation of a
Regression Line
• To sketch the regression line, use any two x-values
within the range of the data and calculate the
corresponding y-values from the regression line.
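As a rough illustration of the two-point approach, the sketch below evaluates the regression equation at the smallest and largest x-values in the data (0.9 and 4.9). Choosing the endpoints is an assumption made here; any two x-values within the range of the data work equally well.

```python
# Coefficients of the regression line from the example.
m, b = 196.152, 102.289

# Two x-values within the range of the data (here, its endpoints).
x1, x2 = 0.9, 4.9

# Corresponding points on the line; joining them gives the sketch.
p1 = (x1, m * x1 + b)
p2 = (x2, m * x2 + b)

print(p1[0], round(p1[1], 3))  # 0.9 278.826
print(p2[0], round(p2[1], 3))  # 4.9 1063.434
```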
Example: Predicting y-Values Using
Regression Equations
The regression equation for the gross domestic products
(in trillions of dollars) and carbon dioxide emissions (in
millions of metric tons) data is ŷ = 196.152x + 102.289.
Use this equation to predict the expected carbon dioxide
emissions for the following gross domestic products.
(Recall from section 9.1 that x and y have a significant
linear correlation.)
1. 1.2 trillion dollars
2. 2.0 trillion dollars
3. 2.5 trillion dollars
Solution: Predicting y-Values Using
Regression Equations
ŷ = 196.152x + 102.289
1. 1.2 trillion dollars
ŷ = 196.152(1.2) + 102.289 ≈ 337.671
When the gross domestic product is $1.2 trillion, the
CO2 emissions are about 337.671 million metric tons.
2. 2.0 trillion dollars
ŷ = 196.152(2.0) + 102.289 = 494.593
When the gross domestic product is $2.0 trillion, the
CO2 emissions are about 494.593 million metric tons.
Solution: Predicting y-Values Using
Regression Equations
3. 2.5 trillion dollars
ŷ = 196.152(2.5) + 102.289 = 592.669
When the gross domestic product is $2.5 trillion, the
CO2 emissions are about 592.669 million metric tons.
Prediction values are meaningful only for x-values in
(or close to) the range of the data. The x-values in the
original data set range from 0.9 to 4.9. So, it would
not be appropriate to use the regression line to predict
carbon dioxide emissions for gross domestic products
such as $0.2 trillion or $14.5 trillion.
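The three predictions and the range caution can be combined in one small sketch. The function name `predict_co2` is hypothetical, and the strict cutoff at the data range is an assumption made here for simplicity; the text also allows x-values close to the range.

```python
# Range of x-values (GDP, trillions of $) in the original data set.
X_MIN, X_MAX = 0.9, 4.9


def predict_co2(gdp):
    """Predict CO2 emissions (millions of metric tons) from GDP.

    Raises ValueError outside the data range. This strict cutoff is an
    assumption; the text permits values close to the range as well.
    """
    if not (X_MIN <= gdp <= X_MAX):
        raise ValueError(f"GDP {gdp} is outside the data range {X_MIN} to {X_MAX}")
    return 196.152 * gdp + 102.289


for gdp in (1.2, 2.0, 2.5):
    print(gdp, round(predict_co2(gdp), 3))
# 1.2 337.671
# 2.0 494.593
# 2.5 592.669
```

Calling `predict_co2(14.5)` raises a `ValueError` instead of returning an extrapolated value.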
Section 9.2 Summary
• Found the equation of a regression line
• Predicted y-values using a regression equation