“Quiz 13” Practice Question on Linear Regression

Practice Problem (From p. 593 # 8) Let x be per capita income in thousands of
dollars. Let y be the death rate per 1000 residents. Six small cities in Oregon (Albany,
Bend, Corvallis, Grants Pass, Klamath Falls and Roseburg) gave the following x and y
values.
x:    8.6    9.3   10.1    8.0    8.3    8.7
y:    8.4    7.6    5.4   10.6    8.3    9.3
For this data: $\sum x = 53$, $\sum x^2 = 471.04$, $\sum y = 49.6$, $\sum y^2 = 425.22$, $\sum xy = 432.06$.
Find: (a) the line of best fit; (b) the correlation coefficient; (c) the coefficient of
determination, and explain what it means.
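The five sums quoted in the problem can be checked directly from the six data pairs. The following is a minimal sketch in plain Python; the variable names (`sum_x`, `sum_x2`, and so on) are illustrative choices, not notation from the textbook.

```python
# Data from the problem: per capita income (x, thousands of dollars)
# and death rate per 1000 residents (y) for the six cities.
x = [8.6, 9.3, 10.1, 8.0, 8.3, 8.7]
y = [8.4, 7.6, 5.4, 10.6, 8.3, 9.3]

n = len(x)
sum_x = sum(x)                                   # sum of x
sum_x2 = sum(xi ** 2 for xi in x)                # sum of x squared
sum_y = sum(y)                                   # sum of y
sum_y2 = sum(yi ** 2 for yi in y)                # sum of y squared
sum_xy = sum(xi * yi for xi, yi in zip(x, y))    # sum of the products xy

# Round to two decimals to hide floating-point noise.
print(n, round(sum_x, 2), round(sum_x2, 2),
      round(sum_y, 2), round(sum_y2, 2), round(sum_xy, 2))
```

The rounded values should agree with the sums given in the problem statement.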
Answer. (a) First we compute
$$SS_x = \sum x^2 - \frac{(\sum x)^2}{n} = 471.04 - \frac{53^2}{6} = 2.8733333$$
and
$$SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} = 432.06 - \frac{(53)(49.6)}{6} = -6.0733333$$
Therefore, the slope is
$$b = \frac{SS_{xy}}{SS_x} = -2.1137$$
and the y-intercept is
$$a = \bar{y} - b\bar{x} = \frac{49.6}{6} - (-2.1137)\left(\frac{53}{6}\right) = 26.938$$
Therefore, the equation of the least squares line is y = −2.1137x + 26.938.
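The arithmetic in part (a) can be reproduced from the given sums alone. This is a small sketch assuming the shortcut (computational) formulas used in the answer; the names `ss_x`, `ss_xy`, `a`, and `b` are illustrative.

```python
# Summary sums given in the problem statement.
n = 6
sum_x, sum_x2 = 53.0, 471.04
sum_y, sum_xy = 49.6, 432.06

# Shortcut formulas for the sums of squares.
ss_x = sum_x2 - sum_x ** 2 / n        # SSx
ss_xy = sum_xy - sum_x * sum_y / n    # SSxy

# Slope and intercept of the least squares line y = bx + a.
b = ss_xy / ss_x
a = sum_y / n - b * (sum_x / n)

print(f"y = {b:.4f}x + {a:.3f}")
```

Carrying the unrounded slope through the intercept calculation gives the same intercept to three decimals as the hand computation with b rounded to −2.1137.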
(b) Most of the information we need for computing
$$r = \frac{SS_{xy}}{\sqrt{SS_x \, SS_y}}$$
was found in (a); we additionally compute
$$SS_y = \sum y^2 - \frac{(\sum y)^2}{n} = 425.22 - \frac{49.6^2}{6} = 15.1933333$$
Therefore,
$$r = \frac{SS_{xy}}{\sqrt{SS_x \, SS_y}} = \frac{-6.0733333}{\sqrt{(2.8733333)(15.1933333)}} = -0.9191948$$
Because r is close to −1, there is a strong negative linear correlation between x and y.
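As a check on part (b), the correlation coefficient can be recomputed from the given sums. This sketch assumes the same shortcut formulas as the answer and uses only the standard `math` module; the variable names are illustrative.

```python
import math

# Summary sums given in the problem statement.
n = 6
sum_x, sum_x2 = 53.0, 471.04
sum_y, sum_y2 = 49.6, 425.22
sum_xy = 432.06

# Sums of squares via the shortcut formulas.
ss_x = sum_x2 - sum_x ** 2 / n
ss_y = sum_y2 - sum_y ** 2 / n
ss_xy = sum_xy - sum_x * sum_y / n

# Pearson correlation coefficient.
r = ss_xy / math.sqrt(ss_x * ss_y)

print(round(r, 4))
```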
(c) The coefficient of determination is $r^2 = (-0.9191948)^2 = 0.8449$. Therefore, approximately 84.5% of the variation in y (the death rate) is explained by the regression line on x (per capita income), while about 15.5% is unexplained.
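One way to see what "explained variation" means is to compare the scatter of y about the fitted line with its scatter about the mean. The sketch below is an illustration rather than part of the textbook solution: it uses the deviation-from-the-mean form of the sums of squares, which is algebraically equivalent to the shortcut formulas in the answer, and all names are illustrative.

```python
# Data from the problem statement.
x = [8.6, 9.3, 10.1, 8.0, 8.3, 8.7]
y = [8.4, 7.6, 5.4, 10.6, 8.3, 9.3]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares in deviation form.
ss_x = sum((xi - x_bar) ** 2 for xi in x)
ss_y = sum((yi - y_bar) ** 2 for yi in y)      # total variation in y
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least squares line y = bx + a.
b = ss_xy / ss_x
a = y_bar - b * x_bar

# Unexplained variation: squared residuals about the fitted line.
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

r_squared = 1 - sse / ss_y    # fraction of variation explained
unexplained = sse / ss_y      # fraction left unexplained

print(round(r_squared, 4), round(unexplained, 4))
```

For a least squares line, `1 - sse / ss_y` equals the square of the correlation coefficient, so this reproduces the value of r² found above from a different direction.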