SBE10 SM14

advertisement
Chapter 14
Simple Linear Regression
Learning Objectives
1.
Understand how regression analysis can be used to develop an equation that estimates
mathematically how two variables are related.
2.
Understand the differences between the regression model, the regression equation, and the estimated
regression equation.
3.
Know how to fit an estimated regression equation to a set of sample data based upon the leastsquares method.
4.
Be able to determine how good a fit is provided by the estimated regression equation and compute
the sample correlation coefficient from the regression analysis output.
5.
Understand the assumptions necessary for statistical inference and be able to test for a significant
relationship.
6.
Know how to develop confidence interval estimates of y given a specific value of x in both the case
of a mean value of y and an individual value of y.
7.
Learn how to use a residual plot to make a judgement as to the validity of the regression assumptions.
8.
Know the definition of the following terms:
independent and dependent variable
simple linear regression
regression model
regression equation and estimated regression equation
scatter diagram
coefficient of determination
standard error of the estimate
confidence interval
prediction interval
residual plot
14 - 1
Chapter 14
Solutions:
a.
16
14
12
10
y
1
8
6
4
2
0
0
1
2
3
4
5
6
x
b.
There appears to be a positive linear relationship between x and y.
c.
Many different straight lines can be drawn to provide a linear approximation of the
relationship between x and y; in part (d) we will determine the equation of a straight line
that “best” represents the relationship according to the least squares criterion.
d.
x
xi 15

3
n
5
y
( xi  x )( yi  y )  26
b1 
yi 40

8
n
5
( xi  x ) 2  10
( xi  x )( yi  y ) 26

 2.6
10
( xi  x )2
b0  y  b1 x  8  (2.6)(3)  0.2
yˆ  0.2  2.6 x
e.
yˆ  0.2  2.6(4)  10.6
14 - 2
Simple Linear Regression
a.
60
50
40
y
2.
30
20
10
0
0
5
10
15
20
25
x
b.
There appears to be a negative linear relationship between x and y.
c.
Many different straight lines can be drawn to provide a linear approximation of the
relationship between x and y; in part (d) we will determine the equation of a straight line
that “best” represents the relationship according to the least squares criterion.
d.
x
xi 55

 11
n
5
y
( xi  x )( yi  y )  540
b1 
yi 175

 35
n
5
( xi  x ) 2  180
( xi  x )( yi  y ) 540

 3
180
( xi  x )2
b0  y  b1 x  35  (3)(11)  68
yˆ  68  3x
e.
yˆ  68  3(10)  38
14 - 3
Chapter 14
a.
30
25
20
y
3.
15
10
5
0
0
5
10
15
x
b.
x
xi 50

 10
n
5
y
( xi  x )( yi  y )  171
b1 
yi 83

 16.6
n
5
( xi  x ) 2  190
( xi  x )( yi  y ) 171

 0.9
190
( xi  x )2
b0  y  b1 x  16.6  (0.9)(10)  7.6
yˆ  7.6  0.9 x
c.
yˆ  7.6  0.9(6)  13
14 - 4
20
25
Simple Linear Regression
4. a.
135
130
125
y
120
115
110
105
100
61
62
63
64
65
66
67
68
69
x
b.
There appears to be a positive linear relationship between x and y.
c.
Many different straight lines can be drawn to provide a linear approximation of the
relationship between x and y; in part (d) we will determine the equation of a straight line
that “best” represents the relationship according to the least squares criterion.
d.
x
xi 325

 65
n
5
y
( xi  x )( yi  y )  110
b1 
yi 585

 117
n
5
( xi  x ) 2  20
( xi  x )( yi  y ) 110

 5.5
20
( xi  x ) 2
b0  y  b1 x  117  (5.5)(65)  240.5
y  240.5  55
. x
e.
y  240.5  55
. x  240.5  55
. (63)  106 pounds
14 - 5
Chapter 14
a.
2500
2000
Price ($)
5.
1500
1000
500
0
1
1.5
2
2.5
3
3.5
4
4.5
Baggage Capacity
b.
Let x = baggage capacity and y = price ($).
There appears to be a positive linear relationship between x and y.
c.
Many different straight lines can be drawn to provide a linear approximation of the
relationship between x and y; in part (d) we will determine the equation of a straight line
that “best” represents the relationship according to the least squares criterion.
d.
x
xi 29.5

 3.277778
n
9
y
( xi  x )( yi  y )  2909.888891
b1 
yi 11,110

 123.444444
n
9
( xi  x ) 2  4.555559
( xi  x )( yi  y ) 2909.888891

 638.755615
4.555559
( xi  x )2
b0  y  b1 x  1234.4444  (638.7561)(3.2778)  859.254658
yˆ  859.26  638.76 x
e.
A one point increase in the baggage capacity rating will increase the price by approximately $639.
f.
yˆ  859.26  638.76 x  859.26  638.76(3)  $1057
14 - 6
Simple Linear Regression
a.
30
25
20
Bonus
6.
15
10
5
0
50.00
70.00
90.00
110.00
130.00
150.00
170.00
190.00
Salary
b.
Let x = salary and y = bonus.
There appears to be a positive linear relationship between x and y.
c.
x
xi 1420

 142
n
10
xi  1420
b1 
yi  160
y
yi 160

 16
n
10
( xi  x )( yi  y )  1114
( xi  x ) 2  6046
( xi  x )( yi  y ) 1114

 .184254
6046
( xi  x )2
b0  y  b1 x  16  (.184254)(142)  10.164068
yˆ  10.1641  .1843x
d.
$1000 increase in salary will increase the bonus by $180.
e
yˆ  10.1641  .1843x  10.1641  .1843(120)  11.95 or approximately $12,000
14 - 7
Chapter 14
a.
44,000
42,000
40,000
Price
7.
38,000
36,000
34,000
32,000
30,000
0
1
2
3
4
5
6
Reliability
b.
x  xi / n  47 /15  3.1333
y  yi / n  548, 434 /15  36,562.2667
( xi  x )( yi  y )  36, 086.5333
b1 
( xi  x ) 2  27.7333
( xi  x )( yi  y ) 36, 086.5333

 1301.1987
27.7333
( xi  x )2
b0  y  b1 x  36,562.2667  (1301.1987)(3.1333)  40,639.3126
yˆ  40,369  1301.2 x
c.
The scatter diagram and the slope of the estimated regression equation indicate a negative linear
relationship between reliability and price. Thus, it appears that higher reliable cars actually cost less.
Although this result may surprise you, it may be due to the fact that higher priced cars have more
options that may increase the likelihood of problems.
d.
A car with a good reliability rating corresponds to x = 3.
yˆ  40,369  1301.2 x  40,369  1301.2(3)  36, 465.40
Thus, the estimate of the price of an upscale sedan with a good reliability rating is approximately
$36,465.
14 - 8
Simple Linear Regression
a.
1800
1600
1400
1200
Price
8.
1000
800
600
400
200
0
0
1
2
3
4
5
6
Sidetrack Capability
b.
There appears to be a positive linear relationship between x = sidetrack capability and y = price, with
higher priced models having a higher level of handling.
c.
x
xi 28

 2.8
n
10
y
( xi  x )( yi  y )  4003.2
b1 
yi 10, 621

 1062.1
n
10
( xi  x ) 2  19.6
( xi  x )( yi  y ) 4003.2

 204.2449
19.6
( xi  x )2
b0  y  b1 x  1062.1  (204.2449)(2.8)  490.2143
yˆ  490.21  204.24 x
d.
yˆ  490.21  204.24 x  490.21  204.24(4)  1307
14 - 9
Chapter 14
a.
150
140
130
120
110
y
9.
100
90
80
70
60
50
0
2
4
6
8
x
b.
x
xi 70

7
n
10
y
( xi  x )( yi  y )  568
b1 
yi 1080

 108
n
10
( xi  x ) 2  142
( xi  x )( yi  y ) 568

4
142
( xi  x )2
b0  y  b1 x  108  (4)(7)  80
y  80  4 x
c.
y  80  4 x  80  4(9)  116
14 - 10
10
12
14
Simple Linear Regression
10. a.
450
400
350
Price
300
250
200
150
100
50
0
0
10
20
30
40
50
Rating
b.
The scatter diagram and the slope of the estimated regression equation indicate a negative linear
relationship between rating and price. Thus, it appears that sleeping bags with a lower temperature
rating cost more than sleeping bags with a higher temperature rating. In other words, it costs more to
stay warmer.
c.
x  xi / n  209 /11  19
( xi  x )( yi  y )  10, 090
b1 
y  yi / n  2849 /11  259
( xi  x ) 2  1912
( xi  x )( yi  y ) 10, 090

 5.2772
1912
( xi  x )2
b0  y  b1 x  259  (5.2772)(19)  359.2668
yˆ  359.2668  5.2772 x
d.
yˆ  359.2668  5.2772 x  359.2668  5.2772(20)  253.72
Thus, the estimate of the price of sleeping bag with a temperature rating of 20 is approximately $254.
14 - 11
Chapter 14
11. a.
35
Departures (%)
30
25
20
15
10
5
0
10
15
20
25
30
35
Arrivals (%)
b.
There appears to be a positive linear relationship between the variables.
c.
Let x = percentage of late arrivals and y = percentage of late departures.
x
xi 273
y
265

 21 y  i 
 20.3846
n
13
n
13
( xi  x )( yi  y )  142
b1 
( xi  x ) 2  166
( xi  x )( yi  y ) 142

 .8554
166
( xi  x )2
b0  y  b1 x  20.3846  (.8554)(21)  2.4212
yˆ  2.42  .86 x
d.
A one percent increase in the percentage of late arrivals will increase the percentage of late arrivals
by .86 or slightly less than one percent.
e.
yˆ  2.42  .86 x  2.42  .86(22)  21.34%
14 - 12
Simple Linear Regression
12. a.
12000
11000
Price
10000
9000
8000
7000
6000
5000
700
720
740
760
780
800
820
840
Weight
b.
The scatter diagram indicates a positive linear relationship between weight and price. Thus, it
appears that PWC’s that weigh more have a higher price.
c.
x  xi / n  7730 /10  773
y  yi / n  92, 200 /10  9220
( xi  x )( yi  y )  332, 400
( xi  x ) 2  14,810
b1 
( xi  x )( yi  y ) 332, 400

 22.4443
14,810
( xi  x )2
b0  y  b1 x  9220  (22.4443)(773)  8129.4439
yˆ  8129.4439  22.4443x
d.
yˆ  8129.4439  22.4443x  8129.4439  22.4443(750)  8703.78
Thus, the estimate of the price of Jet Ski with a weight of 750 pounds is approximately $8704.
e.
No. The relationship between weight and price is not deterministic.
f.
The weight of the Kawasaki SX-R 800 is so far below the lowest weight for the data used to develop
the estimated regression equation that we would not recommend using the estimated regression
equation to predict the price for this model.
14 - 13
Chapter 14
13. a.
30.0
25.0
y
20.0
15.0
10.0
5.0
0.0
0.0
20.0
40.0
60.0
80.0
100.0
120.0
x
b.
x
xi 399

 57
n
7
y
( xi  x )( yi  y )  1233.7
b1 
yi 97.1

 13.8714
n
7
( xi  x ) 2  7648
( xi  x )( yi  y ) 1233.7

 0.1613
7648
( xi  x )2
b0  y  b1 x  13.8714  (0.1613)(57)  4.6773
y  4.68  016
. x
c.
y  4.68  016
. x  4.68  016
. (52.5)  13.08 or approximately $13,080.
The agent's request for an audit appears to be justified.
14 - 14
140.0
Simple Linear Regression
Salary
14. a.
30.0
29.0
28.0
27.0
26.0
25.0
24.0
23.0
22.0
21.0
20.0
0.00
20.00
40.00
60.00
80.00
100.00
Index
b.
Let x = cost of living index and y = starting salary ($1000s)
x
xi 369.16

 36.916
n
10
y
( xi  x )( yi  y )  311.9592
b1 
yi 267.7

 26.77
n
10
( xi  x ) 2  7520.4042
( xi  x )( yi  y ) 311.9592

 .0415
7520.4042
( xi  x )2
b0  y  b1 x  26.77  (.0415)(36.916)  28.30
yˆ  28.30  .0415x
c.
15. a.
yˆ  28.30  .0415x  28.30  .0415(50)  26.2
The estimated regression equation and the mean for the dependent variable are:
yi  0.2  2.6xi
y 8
The sum of squares due to error and the total sum of squares are
SSE  ( yi  yi ) 2  12.40
SST  ( yi  y ) 2  80
Thus, SSR = SST - SSE = 80 - 12.4 = 67.6
b.
r2 = SSR/SST = 67.6/80 = .845
The least squares line provided a very good fit; 84.5% of the variability in y has been explained by
the least squares line.
c.
rxy  .845  .9192
14 - 15
Chapter 14
16. a.
The estimated regression equation and the mean for the dependent variable are:
yˆi  68  3x
y  35
The sum of squares due to error and the total sum of squares are
SSE  ( yi  yˆi ) 2  230
SST  ( yi  y ) 2  1850
Thus, SSR = SST - SSE = 1850 - 230 = 1620
b.
r2 = SSR/SST = 1620/1850 = .876
The least squares line provided an excellent fit; 87.6% of the variability in y has been explained by
the estimated regression equation.
c.
rxy  .876  .936
Note: the sign for r is negative because the slope of the estimated regression equation is negative.
(b1 = -3)
17.
The estimated regression equation and the mean for the dependent variable are:
yˆi  7.6  .9 x
y  16.6
The sum of squares due to error and the total sum of squares are
SSE  ( yi  yˆi ) 2  127.3
SST  ( yi  y ) 2  281.2
Thus, SSR = SST - SSE = 281.2 – 127.3 = 153.9
r2 = SSR/SST = 153.9/281.2 = .547
We see that 54.7% of the variability in y has been explained by the least squares line.
rxy  .547  .740
18. a.
The estimated regression equation and the mean for the dependent variable are:
yˆ  1790.5  581.1x
y  3650
The sum of squares due to error and the total sum of squares are
SSE  ( yi  yˆi ) 2  85,135.14
SST  ( yi  y ) 2  335, 000
Thus, SSR = SST - SSE = 335,000 - 85,135.14 = 249,864.86
b.
r2 = SSR/SST = 249,864.86/335,000 = .746
We see that 74.6% of the variability in y has been explained by the least squares line.
c.
rxy  .746  .8637
14 - 16
Simple Linear Regression
19.
The estimated regression equation and the mean for the dependent variable are:
yˆi  40,639  1301.2 x
y  36,562.27
The sum of squares due to error and the total sum of squares are
SSE  ( yi  yˆi ) 2  47,116,828
SST  ( yi  y ) 2  94, 072,519
Thus, SSR = SST - SSE = 94,072,519 – 47,116,828 = 46,955,691
r2 = SSR/SST = 46,955,691/94,072,519 = .4991
We see that 49.91% of the variability in y has been explained by the least squares line.
rxy  .4991  .71
20. a.
70
65
60
Score
55
50
45
40
35
30
25
20
1500.00
2000.00
2500.00
3000.00
3500.00
4000.00
4500.00
Price
The scatter diagram indicates a positive linear relationship between price and score.
x  xi / n  29, 600 /10  2960
( xi  x )( yi  y )  34,840
b1 
y  yi / n  496 /10  49.6
( xi  x ) 2  2, 744, 000
( xi  x )( yi  y )
34,840

 .012697
2
2, 744, 000
( xi  x )
b0  y  b1 x  49.6  (.012697)(2960)  12.0169
yˆ  12.0169  .0127 x
14 - 17
Chapter 14
b.
The sum of squares due to error and the total sum of squares are
SSE  ( yi  yˆi ) 2  540.0446
SST  ( yi  y ) 2  982.4
Thus, SSR = SST - SSE = 982.4 – 540.0446 = 442.3554
r2 = SSR/SST = 442.3554/982.4 = .4503
The fit provided by the estimated regression equation is not that good; only 45.03% of the variability
in y has been explained by the least squares line.
c.
yˆ  12.0169  .0127 x  12.0169  .0127(3200)  52.66
The estimate of the overall score for a 42-inch plasma television is approximately 53.
21. a.
x
xi 3450

 575
n
6
y
yi 33, 700

 5616.67
n
6
( xi  x )( yi  y )  712,500
b1 
( xi  x ) 2  93, 750
( xi  x )( yi  y ) 712,500

 7.6
93, 750
( xi  x ) 2
b0  y  b1 x  5616.67  (7.6)(575)  1246.67
y  1246.67  7.6 x
b.
$7.60
c.
The sum of squares due to error and the total sum of squares are:
SSE  ( yi  yˆi ) 2  233,333.33
SST  ( yi  y ) 2  5, 648,333.33
Thus, SSR = SST - SSE = 5,648,333.33 - 233,333.33 = 5,415,000
r2 = SSR/SST = 5,415,000/5,648,333.33 = .9587
We see that 95.87% of the variability in y has been explained by the estimated regression equation.
d.
22. a.
y  1246.67  7.6x  1246.67  7.6(500)  $5046.67
Let x = speed (ppm) and y = price ($)
x
xi 149.9

 14.99
n
10
y
( xi  x )( yi  y )  34,359.810
b1 
yi 10, 221

 1022.1
n
10
( xi  x ) 2  291.3890
( xi  x )( yi  y ) 34,359.810

 117.917320
291.3890
( xi  x ) 2
14 - 18
Simple Linear Regression
b0  y  b1 x  1022.1  (117.917320)(14.99)  745.480627
yˆ  745.480627  117.917320x
b.
The sum of squares due to error and the total sum of squares are:
SSE  ( yi  yˆi ) 2  1, 678, 294
SST  ( yi  y ) 2  5, 729,911
Thus, SSR = SST - SSE = 5,729,911 - 1,678,294 = 4,051,617
r2 = SSR/SST = 4,051,617/5,729,911 = 0.7071
Approximately 71% of the variability in price is explained by the speed.
c.
rxy  .7071  .84
It reflects a linear relationship that is between weak and strong.
23. a.
s2 = MSE = SSE / (n - 2) = 12.4 / 3 = 4.133
b.
s  MSE  4.133  2.033
c.
( xi  x ) 2  10
sb1 
d.
t
s
( xi  x )
2

2.033
10
 0.643
b1
2.6

 4.044
sb1 .643
Using t table (3 degrees of freedom), area in tail is between .01 and .025
p-value is between .02 and .05
Using Excel or Minitab, the p-value corresponding to t = 4.04 is .0272.
Because p-value   , we reject H0: 1 = 0
e.
MSR = SSR / 1 = 67.6
F = MSR / MSE = 67.6 / 4.133 = 16.36
Using F table (1 degree of freedom numerator and 3 denominator), p-value is between .025 and .05
Using Excel or Minitab, the p-value corresponding to F = 16.36 is .0272.
Because p-value   , we reject H0: 1 = 0
14 - 19
Chapter 14
Source
of Variation
Regression
Error
Total
24. a.
Sum
of Squares
67.6
12.4
80.0
Mean
Square
67.6
4.133
F
16.36
p-value
.0272
s2 = MSE = SSE/(n - 2) = 230/3 = 76.6667
b.
s  MSE  76.6667  8.7560
c.
( xi  x ) 2  180
sb1 
d.
Degrees
of Freedom
1
3
4
t
s
( xi  x )
2

8.7560
180
 0.6526
b1
3

 4.59
sb1 .653
Using t table (3 degrees of freedom), area in tail is less than .01; p-value is less than .02
Using Excel or Minitab, the p-value corresponding to t = -4.59 is .0193.
Because p-value   , we reject H0: 1 = 0
e.
MSR = SSR/1 = 1620
F = MSR/MSE = 1620/76.6667 = 21.13
Using F table (1 degree of freedom numerator and 3 denominator), p-value is less than .025
Using Excel or Minitab, the p-value corresponding to F = 21.13 is .0193.
Because p-value   , we reject H0: 1 = 0
Source
of Variation
Regression
Error
Total
25. a.
Sum
of Squares
230
1620
1850
Degrees
of Freedom
1
3
4
s2 = MSE = SSE/(n - 2) = 127.3/3 = 42.4333
s  MSE  42.4333  6.5141
b.
( xi  x ) 2  190
sb1 
s
( xi  x )
2

6.5141
190
 0.4726
14 - 20
Mean
Square
230
76.6667
F
21.13
p-value
.0193
Simple Linear Regression
t
b1
.9

 1.90
sb1 .4726
Using t table (3 degrees of freedom), area in tail is between .05 and .10
p-value is between .10 and .20
Using Excel or Minitab, the p-value corresponding to t = 1.90 is .1530.
Because p-value >  , we cannot reject H0: 1 = 0; x and y do not appear to be related.
c.
MSR = SSR/1 = 153.9 /1 = 153.9
F = MSR/MSE = 153.9/42.4333 = 3.63
Using F table (1 degree of freedom numerator and 3 denominator), p-value is greater than .10
Using Excel or Minitab, the p-value corresponding to F = 3.63 is .1530.
Because p-value >  , we cannot reject H0: 1 = 0; x and y do not appear to be related.
26. a.
In solving exercise 18, we found SSE = 85,135.14
s2 = MSE = SSE/(n - 2) = 85,135.14/4 = 21,283.79
s  MSE  21,28379
.  14589
.
( xi  x ) 2  0.74
sb1 
t
s
( xi  x )
2

145.89
0.74
 169.59
b1
581.1

 3.43
sb1 169.59
Using t table (4 degrees of freedom), area in tail is between .01 and .025
p-value is between .02 and .05
Using Excel or Minitab, the p-value corresponding to t = 3.43 is .0266.
Because p-value   , we reject H0: 1 = 0
b.
MSR = SSR/1 = 249,864.86/1 = 249.864.86
F = MSR/MSE = 249,864.86/21,283.79 = 11.74
Using F table (1 degree of freedom numerator and 4 denominator), p-value is between .025 and .05
14 - 21
Chapter 14
Using Excel or Minitab, the p-value corresponding to F = 11.74 is .0266.
Because p-value   , we reject H0: 1 = 0
c.
Source
of Variation
Regression
Error
Total
27. a.
x
Sum
of Squares
249864.86
85135.14
335000
xi 37

 3.7
n
10
y
( xi  x )( yi  y )  315.2
b1 
Degrees
of Freedom
1
4
5
Mean
Square
249864.86
21283.79
F
11.74
p-value
.0266
yi 1654

 165.4
n
10
( xi  x ) 2  10.1
( xi  x )( yi  y ) 315.2

 31.2079
10.1
( xi  x ) 2
b0  y  b1 x  165.4  (31.2079)(3.7)  49.9308
yˆ  49.9308  31.2079 x
b.
SSE = ( yi  yˆi ) 2  2487.66 SST = ( yi  y ) 2 = 12,324.4
Thus, SSR = SST - SSE = 12,324.4 - 2487.66 = 9836.74
MSR = SSR/1 = 9836.74
MSE = SSE/(n - 2) = 2487.66/8 = 310.96
F = MSR / MSE = 9836.74/310.96 = 31.63
Using F table (1 degree of freedom numerator and 8 denominator), p-value is less than .01
Using Excel or Minitab, the p-value corresponding to F = 31.63 is .001.
Because p-value   , we reject H0: 1 = 0
Upper support and price are related.
c.
r2 = SSR/SST = 9,836.74/12,324.4 = .80
The estimated regression equation provided a good fit; we should feel comfortable using the
estimated regression equation to estimate the price given the upper support rating.
d.
ŷ = 49.93 + 31.21(4) = 174.77
14 - 22
Simple Linear Regression
28.
The sum of squares due to error and the total sum of squares are
SSE  ( yi  yˆi ) 2  12,953.09
SST  ( yi  y ) 2  66, 200
Thus, SSR = SST - SSE = 66,200 – 12,953.09 = 53,246.91
s2 = MSE = SSE / (n - 2) = 12,953.09 / 9 = 1439.2322
s  MSE  1439.2322  37.9372
We can use either the t test or F test to determine whether temperature rating and price are related.
We will first illustrate the use of the t test.
Note: from the solution to exercise 10 ( xi  x ) 2  1912
s
sb1 
t
( xi  x )
b1
sb

5.2772
.8676
2

37.9372
1912
 .8676
 6.0825
1
Using t table (9 degrees of freedom), area in tail is less than .005; p-value is less than .01
Using Excel or Minitab, the p-value corresponding to t = -6.0825 is .000.
Because p-value   , we reject H0: 1 = 0
Because we can reject H0: 1 = 0 we conclude that temperature rating and price are related.
Next we illustrate the use of the F test.
MSR = SSR / 1 = 53,246.91
F = MSR / MSE = 53,246.91 / 1439.2322 = 37.00
Using F table (1 degree of freedom numerator and 9 denominator), p-value is less than .01
Using Excel or Minitab, the p-value corresponding to F = 37.00 is .000.
Because p-value   , we reject H0: 1 = 0
Because we can reject H0: 1 = 0 we conclude that temperature rating and price are related.
The ANOVA table is shown below.
Source
of Variation
Regression
Error
Total
Sum
of Squares
53,246.91
12,953.09
66,200
Degrees
of Freedom
1
9
10
14 - 23
Mean
Square
53,246.91
1439.2322
F
37.00
p-value
.000
Chapter 14
29.
SSE = ( yi  yˆi ) 2  233,333.33
SST = ( yi  y ) 2 = 5,648,333.33
Thus, SSR = SST – SSE = 5,648,333.33 –233,333.33 = 5,415,000
MSE = SSE/(n - 2) = 233,333.33/(6 - 2) = 58,333.33
MSR = SSR/1 = 5,415,000
F = MSR / MSE = 5,415,000 / 58,333.25 = 92.83
Source of
Variation
Regression
Error
Total
Sum
of Squares
5,415,000.00
233,333.33
5,648,333.33
Degrees of
Freedom
1
4
5
Mean
Square
5,415,000
58,333.33
F
92.83
p-value
.0006
Using F table (1 degree of freedom numerator and 4 denominator), p-value is less than .01
Using Excel or Minitab, the p-value corresponding to F = 92.83 is .0006.
Because p-value   , we reject H0: 1 = 0. Production volume and total cost are related.
30.
SSE = ( yi  yˆi ) 2  1,678,294
SST = ( yi  y ) 2 = 5,729,911
Thus, SSR = SST – SSE = 5,729,911 – 1,678,294 = 4,051,617
s2 = MSE = SSE/(n-2) = 1,678,294/8 = 209,786.8
s  209,786.8  458.02
 ( xi  x ) 2 = 291.3890
sb1 
t
s
( xi  x )
2

458.02
291.3890
 26.83
b1 117.9173

 4.39
sb1
26.83
Using t table (1 degree of freedom numerator and 8 denominator), area in tail is less than .005
p-value is less than .01
Using Excel or Minitab, the p-value corresponding to t = 4.39 is .002.
Because p-value   , we reject H0: 1 = 0
There is a significant relationship between x and y.
14 - 24
Simple Linear Regression
31.
SSE = 540.04 and SST = 982.40
Thus, SSR = SST - SSE = 982.40 – 540.04 = 442.36
s2 = MSE = SSE / (n - 2) = 540.04 / 8 = 67.5050
s  MSE  67.5050  8.216
MSR = SSR / 1 = 442.36
F = MSR / MSE = 442.36 / 67.5050 = 6.55
Using F table (1 degree of freedom numerator and 8 denominator), p-value is between .025 and .05
Using Excel or Minitab, the p-value corresponding to F = 6.55 is .034.
Because p-value   , we reject H0: 1 = 0
Conclusion: price and overall score are related
32. a.
b.
s = 2.033
x 3
( xi  x ) 2  10
s yˆp  s
2
1 ( xp  x )
1 (4  3)2


2.033

 1.11
n ( xi  x )2
5
10
y  0.2  2.6x  0.2  2.6(4)  10.6
y p  t / 2 sy p
10.6  3.182 (1.11) = 10.6  3.53
or 7.07 to 14.13
2
1 ( xp  x )
1 (4  3)2
 s 1 
 2.033 1  
 2.32
2
n ( xi  x )
5
10
c.
sind
d.
y p  t / 2 sind
10.6  3.182 (2.32) = 10.6  7.38
or 3.22 to 17.98
33. a.
b.
s = 8.7560
x  11
( xi  x )2  180
14 - 25
Chapter 14
s yˆp  s
2
1 ( xp  x )
1 (8  11)2


8.7560

 4.3780
n ( xi  x )2
5
180
yˆ  68  3x  68  3(10)  38
y p  t / 2 sy p
38  3.182(4.3780) = 38  13.93
or 24.07 to 51.93
c.
sind  s 1 
d.
y p  t / 2 sind
2
1 ( xp  x )
1 (8  11)2

 8.7560 1  
 9.7895
2
n ( xi  x )
5
180
38  3.182(9.7895) = 38  31.15
or 6.85 to 69.15
34.
s = 6.5141
x  10
s yˆp  s
( xi  x ) 2  190
2
1 ( xp  x )
1 (12  10)2


6.5141

 3.0627
n ( xi  x )2
5
190
yˆ  7.6  .9 x  7.6  .9(12)  18.40
y p  t / 2 sy p
18.40  3.182(3.0627) = 18.40  9.75
or 8.65 to 28.15
sind  s 1 
2
1 ( xp  x )
1 (12  10)2


6.5141
1


 7.1982
n ( xi  x )2
5
190
y p  t / 2 sind
18.40  3.182(7.1982) = 18.40  22.90
or -4.50 to 41.30
14 - 26
Simple Linear Regression
35. a.
Note: some of the values shown were computed in exercises 18 and 26.
s = 145.89
x  3.2
s yˆp  s
( xi  x ) 2  0.74
2
1 ( xp  x )
1 (3  3.2)2

 145.89

 68.54
2
n ( xi  x )
6
0.74
yˆ  1790.5  581.1x  1790.5  581.1(3)  3533.8
y p  t / 2 sy p
3533.8  2.776 (68.54) = 3533.8  190.27
or $3343.53 to $3724.07
b.
sind  s 1 
2
1 ( xp  x )
1 (3  3.2)2


145.89
1


 161.19
n ( xi  x )2
6
0.74
y p  t / 2 sind
3533.8  2.776 (161.19) = 3533.8  447.46
or $3086.34 to $3981.26
36. a.
b.
yˆ  359.2668  5.2772 x =359.2668 – 5.2772(30) = 200.95 or approximately $201.
s = 37.9372
x  19
s yˆp  s
( xi  x ) 2  1912
2
1 ( xp  x )
1 (30  19)2


37.9372

 14.90
n ( xi  x )2
11
1912
y p  t / 2 sy p
200.95  2.262 (14.90) = 200.95  33.70
or 167.25 to 234.65
c.
sind  s 1 
2
1 ( xp  x )
1 (30  19)2


37.9372
1


 40.76
n ( xi  x )2
11
1912
y p  t / 2 sind
14 - 27
Chapter 14
200.95  2.262 (40.76) = 200.95  92.20
or 108.75 to 293.15
d.
37. a.
As expected, the prediction interval is much wider than the confidence interval. This is due to the
fact that it is more difficult to predict the results for one new model with a temperature rating of 30
Fo than it is to estimate the mean for all models with a temperature rating of 30 Fo .
x  57
( xi  x ) 2  7648
s2 = 1.88
s = 1.37
2
1 ( xp  x )
1 (52.5  57)2
s yˆp  s

 1.37

 0.52
2
n ( xi  x )
7
7648
y p  t / 2 sy p
ŷ = 4.68 + 0.16x = 4.68 + 0.16(52.5) = 13.08
13.08  2.571 (.52) = 13.08  1.34
or 11.74 to 14.42 or $11,740 to $14,420
b.
sind = 1.47
13.08  2.571 (1.47) = 13.08  3.78
or 9.30 to 16.86 or $9,300 to $16,860
c.
Yes, $20,400 is much larger than anticipated.
d.
Any deductions exceeding the $16,860 upper limit could suggest an audit.
38. a.
b.
y  1246.67  7.6(500)  $5046.67
x  575
( xi  x ) 2  93, 750
s2 = MSE = 58,333.33 s = 241.52
sind  s 1 
2
1 ( xp  x )
1 (500  575)2


241.52
1


 267.50
n ( xi  x )2
6
93,750
y p  t / 2 sind
5046.67  4.604 (267.50) = 5046.67  1231.57
or $3815.10 to $6278.24
c.
Based on one month, $6000 is not out of line since $3815.10 to $6278.24 is the prediction interval.
However, a sequence of five to seven months with consistently high costs should cause concern.
14 - 28
Simple Linear Regression
39. a.
Let x = miles of track and y = weekday ridership in thousands.
x
xi 203

 29
n
7
y
( xi  x )( yi  y )  1471
b1 
yi 309

 44.1429
n
7
( xi  x ) 2  838
( xi  x )( yi  y ) 1471

 1.7554
838
( xi  x )2
b0  y  b1 x  44.1429  (1.7554)(29)  6.76
yˆ  6.76  1.755x
b.
SST =3620.9 SSE = 1038.7 SSR = 2582.1
r2 = SSR/SST = 2582.1/3620.9 = .713
The estimated regression equation explained 71.3% of the variability in y; a good fit.
c.
s2 = MSE = 1038.7/5 = 207.7
s  207.7  14.41
s yˆp  s
2
1 ( xp  x )
1 (30  29)2


14.41

 5.47
n ( xi  x )2
7
838
yˆ  6.76  1.755x  6.76  1.755(30)  45.9
y p  t / 2 sy p
45.9  2.571(5.47) = 45.9  14.1
or 31.8 to 60
d.
sind  s 1 
2
1 ( xp  x )
1 (30  29)2

 14.41 1  
 15.41
2
n ( xi  x )
7
838
y p  t / 2 sind
45.9  2.571(15.41) = 45.9  39.6
or 6.3 to 85.5
The prediction interval is so wide that it would not be of much value in the planning process. A
larger data set would be beneficial.
14 - 29
Chapter 14
40. a.
9
b.
ŷ = 20.0 + 7.21x
c.
1.3626
d.
SSE = SST - SSR = 51,984.1 - 41,587.3 = 10,396.8
MSE = 10,396.8/7 = 1,485.3
F = MSR / MSE = 41,587.3 /1,485.3 = 28.00
Using F table (1 degree of freedom numerator and 7 denominator), p-value is less than .01
Using Excel or Minitab, the p-value corresponding to F = 28.00 is .0011.
Because p-value   , we reject H0: B1 = 0.
e.
41. a.
b.
ŷ = 20.0 + 7.21(50) = 380.5 or $380,500
ŷ = 6.1092 + .8951x
t
b1  B1 .8951  0

 6.01
sb1
.149
Using the t table (8 degrees of freedom), area in tail is less than .005
p-value is less than .01
Using Excel or Minitab, the p-value corresponding to t = 6.01 is .0003.
Because p-value   , we reject H0: B1 = 0
c.
42 a.
ŷ = 6.1092 + .8951(25) = 28.49 or $28.49 per month
ŷ = 80.0 + 50.0x
b.
30
c.
F = MSR / MSE = 6828.6/82.1 = 83.17
Using F table (1 degree of freedom numerator and 28 denominator), p-value is less than .01
Using Excel or Minitab, the p-value corresponding to F = 83.17 is .000.
Because p-value <  = .05, we reject H0: B1 = 0.
Branch office sales are related to the salespersons.
d.
ŷ = 80 + 50 (12) = 680 or $680,000
14 - 30
Simple Linear Regression
43. a.
The Minitab output is shown below:
The regression equation is
Price = 4.98 + 2.94 Weight
Predictor
Constant
Weight
Coef
4.979
2.9370
S = 8.457
SE Coef
3.380
0.2934
R-Sq = 80.7%
T
1.47
10.01
P
0.154
0.000
R-Sq(adj) = 79.9%
Analysis of Variance
Source
Regression
Residual Error
Total
DF
1
24
25
SS
7167.9
1716.6
8884.5
MS
7167.9
71.5
F
100.22
P
0.000
Predicted Values for New Observations
New Obs
1
Fit
SE Fit
34.35
1.66
(
95.0% CI
30.93,
b.
The p-value = .000 <  = .05 (t or F); significant relationship
c.
r2 = .807. The least squares line provided a very good fit.
d.
The 95% confidence interval is 30.93 to 37.77.
e.
The 95% prediction interval is 16.56 to 52.14.
37.77)
(
95.0% PI
16.56,
52.14)
44. a/b. The scatter diagram shows a linear relationship between the two variables.
c.
The Minitab output is shown below:
The regression equation is
Rental$ = 37.1 - 0.779 Vacancy%
Predictor
Constant
Vacancy%
Coef
37.066
-0.7791
S = 4.889
SE Coef
3.530
0.2226
R-Sq = 43.4%
T
10.50
-3.50
P
0.000
0.003
R-Sq(adj) = 39.8%
Analysis of Variance
Source
Regression
Residual Error
Total
DF
1
16
17
SS
292.89
382.37
675.26
MS
292.89
23.90
F
12.26
P
0.003
Predicted Values for New Observations
New Obs
1
2
Fit
17.59
28.26
SE Fit
2.51
1.42
95.0% CI
( 12.27,
22.90)
( 25.26,
31.26)
14 - 31
(
(
95.0% PI
5.94,
29.23)
17.47,
39.05)
Chapter 14
Values of Predictors for New Observations
New Obs
1
2
Vacancy%
25.0
11.3
d.
Since the p-value = 0.003 is less than  = .05, the relationship is significant.
e.
r2 = .434. The least squares line does not provide a very good fit.
f.
The 95% confidence interval is 12.27 to 22.90 or $12.27 to $22.90.
g.
The 95% prediction interval is 17.47 to 39.05 or $17.47 to $39.05.
45. a.
x
xi 70

 14
n
5
y
( xi  x )( yi  y )  200
b1 
yi 76

 15.2
n
5
( xi  x ) 2  126
( xi  x )( yi  y ) 200

 1.5873
126
( xi  x )2
b0  y  b1 x  15.2  (1.5873)(14)  7.0222
yˆ  7.02  1.59 x
b.
The residuals are 3.48, -2.47, -4.83, -1.6, and 5.22
c.
6
Residuals
4
2
0
-2
-4
-6
0
5
10
15
20
25
x
With only 5 observations it is difficult to determine if the assumptions are satisfied.
However, the plot does suggest curvature in the residuals that would indicate that the error
term assumptions are not satisfied. The scatter diagram for these data also indicates that the
underlying relationship between x and y may be curvilinear.
14 - 32
Simple Linear Regression
d.
s2  23.78
hi 
1 ( xi  x ) 2
1 ( x  14) 2

  i
2
n ( xi  x )
5
126
The standardized residuals are 1.32, -.59, -1.11, -.40, 1.49.
e.
46. a.
The standardized residual plot has the same shape as the original residual plot. The
curvature observed indicates that the assumptions regarding the error term may not be
satisfied.
yˆ  2.32  .64 x
b.
4
3
Residuals
2
1
0
-1
-2
-3
-4
0
2
4
6
8
10
x
The assumption that the variance is the same for all values of x is questionable. The variance appears
to increase for larger values of x.
47. a.
Let x = advertising expenditures and y = revenue
yˆ  29.4  1.55 x
b.
SST = 1002 SSE = 310.28 SSR = 691.72
MSR = SSR / 1 = 691.72
MSE = SSE / (n - 2) = 310.28/ 5 = 62.0554
F = MSR / MSE = 691.72/ 62.0554= 11.15
Using F table (1 degree of freedom numerator and 5 denominator), p-value is between .01 and .025
Using Excel or Minitab, the p-value corresponding to F = 11.15 is .0206.
Because p-value   = .05, we conclude that the two variables are related.
14 - 33
Chapter 14
c.
10
Residuals
5
0
-5
-10
-15
25
35
45
55
65
Predicted Values
d.
48. a.
The residual plot leads us to question the assumption of a linear relationship between x and y. Even
though the relationship is significant at the .05 level of significance, it would be extremely dangerous
to extrapolate beyond the range of the data.
yˆ  80  4 x
8
6
Residuals
4
2
0
-2
-4
-6
-8
0
2
4
6
8
x
b.
The assumptions concerning the error term appear reasonable.
14 - 34
10
12
14
Simple Linear Regression
49. a.
Let x = return on investment (ROE) and y = price/earnings (P/E) ratio.
yˆ  32.13  3.22 x
b.
Standardized Residuals
2
1.5
1
0.5
0
-0.5
-1
-1.5
0
10
20
30
40
50
60
x
c.
50. a.
There is an unusual trend in the residuals. The assumptions concerning the error term appear
questionable.
The MINITAB output is shown below:
The regression equation is
Y = 66.1 + 0.402 X
Predictor
Constant
X
Coef
66.10
0.4023
S = 12.62
SE Coef
32.06
0.2276
R-sq = 38.5%
T
2.06
1.77
p
0.094
0.137
R-sq(adj) = 26.1%
Analysis of Variance
SOURCE
DF
Regression
1
Residual Error 5
Total
6
Unusual Observations
Obs.
X
Y
1
135
145.00
SS
497.2
795.7
1292.9
Fit
120.42
MS
497.2
159.1
SEFit
4.87
F
3.12
Residual
24.58
p
0.137
St.Resid
2.11R
R denotes an observation with a large standardized residual.
The standardized residuals are: 2.11, -1.08, .14, -.38, -.78, -.04, -.41
The first observation appears to be an outlier since it has a large standardized residual.
14 - 35
Chapter 14
b.
2.5
2.0
Standardized Residual
1.5
1.0
0.5
0.0
-0.5
-1.0
110
115
120
125
Fitted Value
130
135
140
The standardized residual plot indicates that the observation x = 135, y = 145 may be an outlier;
note that this observation has a standardized residual of 2.11.
The scatter diagram is shown below
150
145
140
135
130
y
c.
125
120
115
110
105
100
100
110
120
130
140
150
160
170
180
x
The scatter diagram also indicates that the observation x = 135, y = 145 may be an outlier; the
implication is that for simple linear regression an outlier can be identified by looking at the scatter
diagram.
14 - 36
Simple Linear Regression
51. a.
The Minitab output is shown below:
The regression equation is
Y = 13.0 + 0.425 X
Predictor
Constant
X
Coef
13.002
0.4248
S = 3.181
SE Coef
2.396
0.2116
R-sq = 40.2%
T
5.43
2.01
p
0.002
0.091
R-sq(adj) = 30.2%
Analysis of Variance
SOURCE
DF
Regression
1
Residual Error 6
Total
7
SS
40.78
60.72
101.50
Unusual Observations
Obs.
X
Y
7
12.0
24.00
8
22.0
19.00
MS
40.78
10.12
Fit Stdev.Fit
18.10
1.20
22.35
2.78
F
4.03
Residual
5.90
-3.35
p
0.091
St.Resid
2.00R
-2.16RX
R denotes an observation with a large standardized residual.
X denotes an observation whose X value gives it large influence.
The standardized residuals are: -1.00, -.41, .01, -.48, .25, .65, -2.00, -2.16
The last two observations in the data set appear to be outliers since the standardized residuals for
these observations are 2.00 and -2.16, respectively.
b.
Using MINITAB, we obtained the following leverage values:
.28, .24, .16, .14, .13, .14, .14, .76
MINITAB identifies an observation as having high leverage if hi > 6/n; for these data, 6/n =
6/8 = .75. Since the leverage for the observation x = 22, y = 19 is .76, MINITAB would
identify observation 8 as a high leverage point. Thus, we conclude that observation 8 is an
influential observation.
14 - 37
Chapter 14
c.
30
25
y
20
15
10
5
0
0
5
10
15
20
25
x
The scatter diagram indicates that the observation x = 22, y = 19 is an influential observation.
52. a.
The Minitab output is shown below:
The regression equation is
Shipment = 4.09 + 0.196 Media$
Predictor
Constant
MediaExp
S = 5.044
Coef
4.089
0.19552
SE Coef
2.168
0.03635
R-Sq = 78.3%
T
1.89
5.38
P
0.096
0.001
R-Sq(adj) = 75.6%
Analysis of Variance
Source
Regression
Residual Error
Total
DF
1
8
9
Unusual Observations
Obs
Media$ Shipment
1
120
36.30
SS
735.84
203.51
939.35
Fit
27.55
MS
735.84
25.44
F
28.93
SE Fit
3.30
Residual
8.75
P
0.001
St Resid
2.30R
R denotes an observation with a large standardized residual
b.
Minitab identifies observation 1 as having a large standardized residual; thus, we would consider
observation 1 to be an outlier.
14 - 38
Simple Linear Regression
53. a.
The Minitab output is shown below:
The regression equation is
Price = 28.0 + 0.173 Volume
Predictor
Constant
Volume
Coef
27.958
0.17289
S = 17.53
SE Coef
4.521
0.07804
R-Sq = 17.0%
T
6.18
2.22
P
0.000
0.036
R-Sq(adj) = 13.5%
Analysis of Variance
Source
Regression
Residual Error
Total
DF
1
24
25
SS
1508.4
7376.1
8884.5
Unusual Observations
Obs
Volume
Price
22
230
40.00
MS
1508.4
307.3
Fit
67.72
F
4.91
SE Fit
15.40
P
0.036
Residual
-27.72
St Resid
-3.31RX
R denotes an observation with a large standardized residual
X denotes an observation whose X value gives it large influence.
The Minitab output identifies observation 22 as having a large standardized residual and is an
observation whose x value gives it large influence. The following residual plot verifies these
observations.
2
1
Standardized Residual
b.
0
-1
-2
-3
-4
30
40
50
Fitted Value
14 - 39
60
70
Chapter 14
54. a.
The Minitab output is shown below:
The regression equation is
Salary = 707 + 0.00482 MktCap
Predictor
Constant
MktCap
Coef
707.0
0.0048154
S = 379.8
SE Coef
118.0
0.0008076
R-Sq = 66.4%
T
5.99
5.96
P
0.000
0.000
R-Sq(adj) = 64.5%
Analysis of Variance
Source
Regression
Residual Error
Total
Unusual
Obs
6
17
DF
1
18
19
Observations
MktCap
Salary
507217
3325.0
120967
116.2
SS
5129071
2596647
7725718
Fit
3149.5
1289.5
MS
5129071
144258
F
35.55
SE Fit
338.6
86.4
Residual
175.5
-1173.3
P
0.000
St Resid
1.02 X
-3.17R
R denotes an observation with a large standardized residual
X denotes an observation whose X value gives it large influence.
b.
Minitab identifies observation 6 as an observation whose x value gives it large influence and
observation 17 as having a large standardized residual. A standardized residual plot against the
predicted values is shown below:
Standardized Residuals
3
2
1
0
-1
-2
-3
-4
500
1000
1500
2000
2500
3000
3500
Predicted Values
55.
No. Regression or correlation analysis can never prove that two variables are causally related.
56.
The estimate of a mean value is an estimate of the average of all y values associated with the same x.
The estimate of an individual y value is an estimate of only one of the y values associated with a
particular x.
57.
The purpose of testing whether 1  0 is to determine whether or not there is a significant relationship
between x and y. However, rejecting 1  0 does not necessarily imply a good fit. For example, if
1  0 is rejected and r2 is low, there is a statistically significant relationship between x and y but the
fit is not very good.
14 - 40
Simple Linear Regression
58. a.
The Minitab output is shown below:
The regression equation is
Price = 9.26 + 0.711 Shares
Predictor
Constant
Shares
S = 1.419
Coef
9.265
0.7105
SE Coef
1.099
0.1474
R-Sq = 74.4%
T
8.43
4.82
P
0.000
0.001
R-Sq(adj) = 71.2%
Analysis of Variance
Source
Regression
Residual Error
Total
DF
1
8
9
SS
46.784
16.116
62.900
MS
46.784
2.015
F
23.22
P
0.001
b.
Since the p-value corresponding to F = 23.22 = .001 <  = .05, the relationship is significant.
c.
r 2 = .744; a good fit. The least squares line explained 74.4% of the variability in Price.
d.
yˆ  9.26  .711(6)  13.53
59. a.
The Minitab output is shown below:
The regression equation is
Options = - 3.83 + 0.296 Common
Predictor
Constant
Common
S = 11.04
Coef
-3.834
0.29567
SE Coef
5.903
0.02648
R-Sq = 91.9%
T
-0.65
11.17
P
0.529
0.000
R-Sq(adj) = 91.2%
Analysis of Variance
Source
Regression
Residual Error
Total
DF
1
11
12
SS
15208
1341
16550
MS
15208
122
F
124.72
P
0.000
b.
yˆ  3.83  .296(150)  40.57 ; approximately 40.6 million shares of options grants outstanding.
c.
r 2 = .919; a very good fit. The least squares line explained 91.9% of the variability in Options.
14 - 41
Chapter 14
60. a.
1300
1280
S&P500
1260
1240
1220
1200
1180
1160
10000
10200
10400
10600
10800
11000
DJIA
b.
The Minitab output is shown below:
The regression equation is
S&P500 = - 182 + 0.133 DJIA
Predictor
Constant
DJIA
Coef
-182.11
0.133428
S = 6.89993
SE Coef
71.83
0.006739
R-Sq = 95.6%
T
-2.54
19.80
P
0.021
0.000
R-Sq(adj) = 95.4%
Analysis of Variance
Source
Regression
Residual Error
Total
DF
1
18
19
SS
18666
857
19523
MS
18666
48
F
392.06
P
0.000
c.
Using the F test, the p-value corresponding to F = 392.06 is .000. Because the p-value   =.05, we
reject H 0 : 1  0 ; there is a significant relationship.
d.
With R-Sq = 95.6%, the estimated regression equation provided an excellent fit.
e.
yˆ  182.11  .133428DJIA=  182.11  .133428(11,000)  1285.60 or 1286.
f.
The DJIA is not that far beyond the range of the data. With the excellent fit provided by the
estimated regression equation, we should not be too concerned about using the estimated regression
equation to predict the S&P500.
14 - 42
Simple Linear Regression
61.
The Minitab output is shown below:
The regression equation is
Expense = 10.5 + 0.953 Usage
Predictor
Constant
X
Coef
10.528
0.9534
S = 4.250
SE Coef
3.745
0.1382
R-sq = 85.6%
T
2.81
6.90
p
0.023
0.000
R-sq(adj) = 83.8%
Analysis of Variance
SOURCE
DF
Regression
1
Residual Error 8
Total
9
Fit
39.13
SS
860.05
144.47
1004.53
Stdev.Fit
1.49
(
MS
860.05
18.06
95% C.I.
35.69, 42.57)
F
47.62
p
0.000
95% P.I.
28.74, 49.52)
(
a.
ŷ = 10.5 + .953 Usage
b.
Since the p-value corresponding to F = 47.62 = .000 <  = .05, we reject H0: 1 = 0.
c.
The 95% prediction interval is 28.74 to 49.52 or $2874 to $4952
d.
Yes, since the expected expense is $3913.
62. a.
The Minitab output is shown below:
The regression equation is
Defects = 22.2 - 0.148 Speed
Predictor
Constant
Speed
Coef
22.174
-0.14783
S = 1.489
SE Coef
1.653
0.04391
R-Sq = 73.9%
T
13.42
-3.37
P
0.000
0.028
R-Sq(adj) = 67.4%
Analysis of Variance
Source
Regression
Residual Error
Total
DF
1
4
5
SS
25.130
8.870
34.000
MS
25.130
2.217
F
11.33
P
0.028
Predicted Values for New Observations
New Obs
Fit
1
14.783
SE Fit
0.896
(
95.0% CI
12.294, 17.271)
(
95.0% PI
9.957, 19.608)
b.
Since the p-value corresponding to F = 11.33 = .028 <  = .05, the relationship is significant.
c.
r 2 = .739; a good fit. The least squares line explained 73.9% of the variability in the number of
defects.
d.
Using the Minitab output in part (a), the 95% confidence interval is 12.294 to 17.271.
14 - 43
Chapter 14
63. a.
9
8
7
Days
6
5
4
3
2
1
0
0
5
10
15
20
Distance
There appears to be a negative linear relationship between distance to work and number of days
absent.
b.
The MINITAB output is shown below:
The regression equation is
Days = 8.10 - 0.344 Distance
Predictor
Constant
X
S = 1.289
Coef
8.0978
-0.34420
SE Coef
0.8088
0.07761
R-sq = 71.1%
T
10.01
-4.43
p
0.000
0.002
R-sq(adj) = 67.5%
Analysis of Variance
SOURCE
DF
Regression
1
Residual Error 8
Total
9
Fit
6.377
Stdev.Fit
0.512
SS
32.699
13.301
46.000
(
MS
32.699
1.663
95% C.I.
5.195, 7.559)
F
19.67
(
p
0.002
95% P.I.
3.176, 9.577)
c.
Since the p-value corresponding to F = 419.67 is .002 <  = .05. We reject H0 : 1 = 0.
d.
r2 = .711. The estimated regression equation explained 71.1% of the variability in y; this is a
reasonably good fit.
e.
The 95% confidence interval is 5.195 to 7.559 or approximately 5.2 to 7.6 days.
14 - 44
Simple Linear Regression
64. a.
The Minitab output is shown below:
The regression equation is
Cost = 220 + 132 Age
Predictor
Constant
X
S = 75.50
Coef
220.00
131.67
SE Coef
58.48
17.80
R-sq = 87.3%
T
3.76
7.40
p
0.006
0.000
R-sq(adj) = 85.7%
Analysis of Variance
SOURCE
DF
Regression
1
Residual Error 8
Total
9
Fit
746.7
SS
312050
45600
357650
Stdev.Fit
29.8
(
MS
312050
5700
95% C.I.
678.0, 815.4)
F
54.75
(
p
0.000
95% P.I.
559.5, 933.9)
b.
Since the p-value corresponding to F = 54.75 is .000 <  = .05, we reject H0: 1 = 0.
c.
r2 = .873. The least squares line provided a very good fit.
d.
The 95% prediction interval is 559.5 to 933.9 or $559.50 to $933.90
65. a.
The Minitab output is shown below:
The regression equation is
Points = 5.85 + 0.830 Hours
Predictor
Constant
X
S = 7.523
Coef
5.847
0.8295
SE Coef
7.972
0.1095
R-sq = 87.8%
T
0.73
7.58
p
0.484
0.000
R-sq(adj) = 86.2%
Analysis of Variance
SOURCE
DF
Regression
1
Residual Error 8
Total
9
Fit
84.65
Stdev.Fit
3.67
SS
3249.7
452.8
3702.5
(
MS
3249.7
56.6
95% C.I.
76.19, 93.11)
F
57.42
(
p
0.000
95% P.I.
65.35, 103.96)
b.
Since the p-value corresponding to F = 57.42 is .000 <  = .05, we reject H0: 1 = 0.
c.
84.65 points
d.
The 95% prediction interval is 65.35 to 103.96
14 - 45
Chapter 14
66. a.
The Minitab output is shown below:
The regression equation is
Horizon = 0.275 + 0.950 S&P 500
Predictor
Constant
S&P 500
Coef
0.2747
0.9498
S = 2.664
SE Coef
0.9004
0.3569
R-Sq = 47.0%
T
0.31
2.66
P
0.768
0.029
R-Sq(adj) = 40.3%
Analysis of Variance
Source
DF
Regression
1
Residual Error 8
Total
9
SS
50.255
56.781
107.036
MS
50.255
7.098
F
7.08
P
0.029
The market beta for Horizon is b1 = .95
b.
Since the p-value = 0.029 is less than  = .05, the relationship is significant.
c.
r2 = .470. The least squares line does not provide a very good fit.
d.
Texas Instruments has higher risk with a market beta of 1.46.
67. a.
The Minitab output is shown below:
The regression equation is
Audit% = - 0.471 +0.000039 Income
Predictor
Constant
Income
Coef
-0.4710
0.00003868
S = 0.2088
SE Coef
0.5842
0.00001731
R-Sq = 21.7%
T
-0.81
2.23
P
0.431
0.038
R-Sq(adj) = 17.4%
Analysis of Variance
Source
Regression
Residual Error
Total
DF
1
18
19
SS
0.21749
0.78451
1.00200
MS
0.21749
0.04358
F
4.99
P
0.038
Predicted Values for New Observations
New Obs
1
Fit
0.8828
SE Fit
0.0523
95.0% CI
( 0.7729, 0.9927)
95.0% PI
( 0.4306, 1.3349)
b.
Since the p-value = 0.038 is less than  = .05, the relationship is significant.
c.
r2 = .217. The least squares line does not provide a very good fit.
d.
The 95% confidence interval is .7729 to .9927.
14 - 46
Simple Linear Regression
68. a.
Satisfaction Rating (%)
90
80
70
60
50
40
30
20
20
30
40
50
60
70
Top Five (%)
b.
There appears to be a positive linear relationship between the two variables.
A portion of the Minitab output for this problem follows.
The regression equation is
Rating (%) = 9.4 + 1.29 Top Five (%)
Predictor
Constant
Top Five (%)
Coef
9.37
1.2875
SE Coef
10.24
0.2293
S = 6.62647
R-Sq = 74.1%
T
0.92
5.61
P
0.380
0.000
R-Sq(adj) = 71.8%
Analysis of Variance
Source
Regression
Residual Error
Total
DF
1
11
12
SS
1383.9
483.0
1866.9
MS
1383.9
43.9
F
31.52
P
0.000
c.
ŷ = 9.37 + 1.2875 Top Five (%)
d.
Since the p-value = .000 < α = .05, the relationship is significant.
e.
r 2 = .741; a good fit. The least squares line explained 74.1% of the variability in the satisfaction
rating.
f.
rxy  r 2  .741  .86
14 - 47
Download