Chapter 12 Correlation and Simple Regression Analysis

advertisement
Chapter 12: Correlation and Simple Regression Analysis
Chapter 12
Correlation and Simple Regression Analysis
LEARNING OBJECTIVES
The overall objective of this chapter is to give you an understanding of bivariate
regression analysis, thereby enabling you to:
1.
2.
3.
4.
5.
6.
7.
8.
9.
Define correlation and when is it used
Define simple regression and its use.
Interpret the slope and intercept of the equation.
Understand the usefulness of residual analysis in examining the fit of the
regression line to the data and in testing the assumptions underlying regression
analysis.
Compute a standard error of the estimate and interpret its meaning.
Compute a coefficient of determination and interpret it.
Test hypotheses about the slope of the regression model and interpret the results.
Estimate values of y using the regression model.
Develop a linear trend line and use it to forecast.
CHAPTER TEACHING STRATEGY
This chapter is about all aspects of simple (bivariate, linear) regression. Early in the
chapter through scatter plots, the student begins to understand that the object of simple
regression is to fit a line through the points. Fairly soon in the process, the student
learns how to solve for slope and y intercept and develop the equation of the regression
line. Most of the remaining material on simple regression is to determine how good the
fit of the line is and if assumptions underlying the process are met.
The student begins to understand that by entering values of the independent variable
into the regression model, predicted values can be determined. The question then
becomes: Are the predicted values good estimates of the actual dependent values? One
rule to emphasize is that the regression model should not be used to predict for
independent variable values that are outside the range of values used to construct the
model. MINITAB issues a warning for such activity when attempted. There are many
instances where the relationship between x and y are linear over a given interval but
outside the interval the relationship becomes curvilinear or unpredictable. Of course,
with this caution having been given, many forecasters use such regression models to
© 2010 John Wiley & Sons Canada, Ltd.
388
Chapter 12: Correlation and Simple Regression Analysis
extrapolate to values of x outside the domain of those used to construct the model.
Such forecasts are introduced in section 12.9, “Using Regression to Develop a
Forecasting Trend Line”. Whether the forecasts obtained under such conditions are any
better than "seat of the pants" or "crystal ball" estimates remains to be seen.
The concept of residual analysis is a good one to show graphically and numerically
how the model relates to the data and the fact that it more closely fits some points than
others, etc. A graphical or numerical analysis of residuals demonstrates that the
regression line fits the data in a manner analogous to the way a mean fits a set of
numbers. The regression model passes through the points such that the vertical
distances from the actual y values to the predicted values will sum to zero. The fact
that the residuals sum to zero points out the need to square the errors (residuals) in
order to get a handle on total error. This leads to the sum of squares error and then on
to the standard error of the estimate. In addition, students can learn why the process is
called least squares analysis (the slope and intercept formulas are derived by calculus
such that the sum of squares of error is minimized - hence "least squares"). Students
can learn that by examining the values of se, the residuals, r2, and the t ratio to test the
slope they can begin to make a judgment about the fit of the model to the data. Many
of the chapter problems ask the student to comment on these items (se, r2, etc.).
It is my view that for many of these students, the most important facet of this chapter
lies in understanding the "buzz" words of regression such as standard error of the
estimate, coefficient of determination, etc. because they may only interface regression
again as some type of computer printout to be deciphered. The concepts then may be
more important than the calculations.
© 2010 John Wiley & Sons Canada, Ltd.
389
Chapter 12: Correlation and Simple Regression Analysis
CHAPTER OUTLINE
12.1
Correlation
12.2
Introduction to Simple Regression Analysis
12.3
Determining the Equation of the Regression Line
12.4
Residual Analysis: Testing the Regression Model I-Predicted values, Residuals, and
Sum of Squares of Error
Using Residuals to Test the Assumptions of the Regression Model
Using the Computer for Residual Analysis
12.5
Standard Error of the Estimate: Testing the Regression Model II-Standard Error of the
Estimate and r Squared
12.6
Coefficient of Determination: Testing the Regression Model II-Standard Error of the
Estimate and r Squared
Relationship Between r and r2
12.7
Hypothesis Tests for the Slope of the Regression Model and Testing the Overall
Model
Testing the Slope
Testing the Overall Model
12.8
Estimation
Confidence Intervals to Estimate the Conditional Mean of y: µy/x
Prediction Intervals to Estimate a Single Value of y
12.9
Using Regression to Develop a Forecasting Trend Line
Determining the Equation of the Trend Line
Forecasting Using the Equation of the Trend Line
Alternative Coding for Time Periods
12.10
Interpreting Computer Output
© 2010 John Wiley & Sons Canada, Ltd.
390
Chapter 12: Correlation and Simple Regression Analysis
KEY TERMS
Bivariate regression
Coefficient of correlation (r)
Coefficient of Determination (r2)
Confidence Interval
Correlation
Dependent Variable
Deterministic Model
Heteroscedasticity
Homoscedasticity
Independent Variable
Least Squares Analysis
Outliers
Pearson product-moment correlation coefficient
Prediction Interval
Probabilistic Model
Regression Analysis
Residual
Residual Plot
Scatter Plot
Simple Regression
Standard Error of the Estimate (se)
Sum of Squares of Error (SSE)
Time-series data
Trend
© 2010 John Wiley & Sons Canada, Ltd.
391
Chapter 12: Correlation and Simple Regression Analysis
SOLUTIONS TO PROBLEMS IN CHAPTER 12
12.1
x = 80
y2 = 815
r =
x2 = 1,148
xy = 624
 xy 
y = 69
n=7
 x y
n

( x)  
( y ) 2 
2
2
 x 
  y 

n  
n 

=
2
(80)( 69)
7
=
2

(80)  
(69) 2 
1,148 
 815 

7 
7 

624 
r =
r =
12.2
 164.571
= –0.927
177.533
x = 1,087
y2 = 878,686
r =
 164.571
=
(233.714)(134.857)
x2 = 322,345
xy= 507,509
 xy 

( x)
2
 x 
n

2
y = 2,032
n=5
 x y
n

( y )
2
  y 
n
 
2



=
(1,087)( 2,032)
5
=

(1,087) 2  
(2,032) 2 
322,345 
 878,686 

5
5



507,509 
r =
r =
65,752.2
65,752.2
=
= .975
67,449.5
(86,031.2)(52,881.2)
© 2010 John Wiley & Sons Canada, Ltd.
392
Chapter 12: Correlation and Simple Regression Analysis
12.3
Air Canada (x)
0.75
0.76
0.84
0.85
0.86
0.86
WestJet (y)
11.92
12.09
12.25
11.85
11.78
11.74
x = 4.92
x2 = 4.0474
r =
y = 71.63
xy = 58.7181
2
y = 855.3355
 xy 

( x)
2
 x 
n

2
 x y
n

( y ) 2 
2
y

 

n 
 
=
(4.92)(71.63)
 0185 = – .37
6

0.0500

(4.92) 2  
(71.63) 2 
4
.
0474

855
.
3355




6 
6


58.7181 
r =
12.4
x = 59,788,660
y = 90,545
xy = 415,640,524,882
r =
 xy 
x2 = 428,249,006,044,248
y2 = 713,381,745
n = 23
 x y
n

( x)  
( y ) 2 
2
2
 x 
  y 

n  
n 

=
2
(59,788,660)(90,545)
23
=
2
2

(59,788,660)  
(90,545) 
428,249,006,044,248 
 713,381,745 

23
23



415,640,524,882 
r =
© 2010 John Wiley & Sons Canada, Ltd.
393
Chapter 12: Correlation and Simple Regression Analysis
180,268,167,504
r =
= .578
(272,827,968,453,000)(356,929,700)
12.5
Correlation between Year 1 and Year 2:
x = 17.09
y = 15.12
xy = 48.97
r =
x2 = 58.7911
y2 = 41.7054
n=8
 xy 
 x y
n

( x)  
( y ) 2 
2
2
 x 
  y 

n  
n 

=
2
(17.09)(15.12)
8
=
2
2

(17.09)  
(15.12) 
58.7911 
 41.7054 

8
8



48.97 
r =
r =
16.6699
16.6699
=
= .975
17.1038
(22.28259)(13.1286)
Correlation between Year 2 and Year 3:
x = 15.12
x2 = 41.7054
y = 15.86
y2 = 42.0396
xy = 41.5934
n=8
r =
 xy 
 x y
n

( x)  
( y ) 2 
2
2
 x 
  y 

n  
n 

=
2
(15.12)(15.86)
8
=
2

(15.12)  
(15.86) 2 
41.7054 
 42.0396 

8
8



41.5934 
r =
© 2010 John Wiley & Sons Canada, Ltd.
394
Chapter 12: Correlation and Simple Regression Analysis
r =
11.618
11.618
=
= .985
11.795
(13.1286)(10.59715)
Correlation between Year 1 and Year 3:
x = 17.09
x2 = 58.7911
y = 15.86
y2 = 42.0396
xy = 48.5827
n=8
r =
 xy 
 x y
n

( x)  
( y ) 2 
2
2
 x 
  y 

n  
n 

=
2
(17.09)(15.86)
8
2

(17.09)  
(15.86) 2 
58
.
7911

42
.
0396




8
8



48.5827 
r =
r =
14.702
(22.2826)(10.5972)
=
14.702
= .957
15.367
The years 2 and 3 are the most correlated with r = .985.
© 2010 John Wiley & Sons Canada, Ltd.
395
Chapter 12: Correlation and Simple Regression Analysis
12.6
x
12
21
28
8
20
y
17
15
22
19
24
30
25
y
20
15
10
5
0
0
5
10
15
20
25
30
x
x = 89
x2 = 1,833
b1 =
SS xy
SS xx
y = 97
y2 = 1,935
x y
 xy   n

( x)
x  
2
(89)(97)
5
= 0.16238
(89) 2
1,833 
5
1,767 
=
2
n
b0 =
 y  b  x  97  0.1624 89
n
1
n
5
xy = 1,767
n=5
5
= 16.5093
ŷ = 16.50965 + 0.162379 x
© 2010 John Wiley & Sons Canada, Ltd.
396
Chapter 12: Correlation and Simple Regression Analysis
12.7
x_
140
119
103
91
65
29
24
x
x2
b1 =
_y_
25
29
46
70
88
112
128
y
= 571
= 58,293
SS xy
SS xx
x y
 xy   n

( x)
x  
2
2
n
b0 =
= 498
y2 = 45,154
(571)( 498)
7
(571) 2
58,293 
7
30,099 
=
 y  b  x  498  (0.898) 571
n
1
n
7
xy = 30,099
n=7
7
= –0.898
= 144.414
ŷ = 144.414 – 0.898 x
© 2010 John Wiley & Sons Canada, Ltd.
397
Chapter 12: Correlation and Simple Regression Analysis
12.8
(Advertising) x
12.5
3.7
21.6
60.0
37.6
6.1
16.8
41.2
(Sales) y
148
55
338
994
541
89
126
379
x = 199.5
x2 = 7,667.15
SS xy
b1 =
SS xx
y
y2
x y
 xy   n

( x)
x  
2
(199.5)( 2,670)
8
(199.5) 2
7,667.15 
8
107,610.4 
=
2
n
b0 =
 y  b  x  2,670  15.24 199.5
n
1
n
xy = 107,610.4
n=8
= 2,670
= 1,587,328
8
8
= 15.240
= -46.292
ŷ = -46.292 + 15.240 x
12.9
(Prime) x
16
6
8
4
7
x = 41
x2 = 421
r =
 xy 

( x)
2
 x 
n

2
(Bond) y
5
12
9
15
7
y = 48
y2 = 524
 x y
xy = 333
n=5
n

( y )
2
  y 
n
 
2



=
(41)( 48)
5
2

(41)  
(48) 2 
421

524


5  
5 

333 
r =
© 2010 John Wiley & Sons Canada, Ltd.
398
Chapter 12: Correlation and Simple Regression Analysis
r =
 60.6
= – .828
(84.8)(63.2)
Since r = – 0.828, the bond rate can be predicted by the prime interest rate.
b1 =
SS xy
SS xx
x y
 xy   n

( x)
x  
2
2
n
b0 =
(41)( 48)
5
= –0.715
(41) 2
421 
5
333 
=
 y  b  x  48  (0.715) 41
n
1
n
5
5
= 15.460
ŷ = 15.460 – 0.715 x
12.10
Firm Births (x)
58.1
55.4
57
58.5
57.4
58
Bankruptcies (y)
34.3
35
38.5
40.1
35.5
37.9
x = 344.4
y = 221.3
y2 = 8188.41
xy = 12,708.08
 x y
r =
 xy 
x2 = 19,774.78
n

( x)  
( y ) 2 
2
2
 x 
  y 

n  
n 

n=6
=
2
© 2010 John Wiley & Sons Canada, Ltd.
399
Chapter 12: Correlation and Simple Regression Analysis
(344.4)( 221.3)
6
2

(344.4)  
(221.3) 2 
19
,
774
.
78

8188
.
41




6
6



12,708.08 
r =
5.46
= 0.428
(6.22)(26.1283)
r =
r = 0.428 represents a relatively moderate positive relationship between the
number of firm births and the number of business bankruptcies. .
b1 =
SS xy
SS xx
x y
 xy   n

( x)
x  
2
(344.4)( 221.3)
6
=
(344.4) 2
19,774.78 
6
12,708.08 
=
2
n
b1 = 0.878
b0 =
 y  b  x  221.3  (0.878) 344.4
n
1
n
6
6
= –13.503
ŷ = –13.503 + 0.878 x
The slope of the line, b1 = 0.878, means that for every 10,000 increase of firm
births, the number of business bankruptcies is predicted to increase by 878.
12.11
Number of Farms (x)
574,993
480,877
430,503
366,110
338,552
318,361
293,089
280,043
276,548
246,923
229,373
x = 3,835,372
Average Size (y)
122
145
164
188
202
207
231
242
246
273
295
y = 2315
x2 = 1,451,587,168,444
© 2010 John Wiley & Sons Canada, Ltd.
400
Chapter 12: Correlation and Simple Regression Analysis
y2 = 515,797
b1 =
SS xy
SS xx
xy = 752,175,501
x y
 xy   n

( x)
x  
2
n = 11
=
2
n
(3,835,372)( 2315)
752,175,501 
11
=
= –0.000481
(3,835,372) 2
1,451,587,168,444 
11
b0 =
 y  b  x  2315  (0.000481) 3,835,372
n
1
n
11
11
= 378.2
ŷ = 378.2 – 0.0005 x
12.12
r =
Steel (x)
90.6
88.8
89.7
79.7
84.3
88.8
91.3
95.2
95.5
98.5
Orders (y)
2.74
2.87
2.93
2.87
2.98
3.09
3.36
3.61
3.75
3.95
x = 902.4
y = 32.15
x2 = 81,705.14
y2 = 104.9815
xy = 2917.906
n = 10
 xy 

( x)
2
 x 
n

2
 x y
n

( y )
2
  y 
n
 
2



=
© 2010 John Wiley & Sons Canada, Ltd.
401
Chapter 12: Correlation and Simple Regression Analysis
(902.4)(32.15)
10
2

(902.4)  
(32.15) 2 
81
,
705
.
14

104
.
9815


10  
10 

2917.906 
r =
16.69
= 0.794
(272.564)(1.61925)
r =
r = 0.794 represents a relatively strong positive relationship between the
raw steel production and the annual new orders. .
b1 =
SS xy
SS xx
x y
 xy   n

( x)
x  
2
2
n
b0 =
(902.4)(32.15)
10
= 0.06123
(902.4) 2
81,705.14 
10
2917.906 
=
 y  b  x  32.15  (0.06123) 902.4
n
1
n
10
10
= –2.31040
ŷ = –2.31040 + 0.06123 x
© 2010 John Wiley & Sons Canada, Ltd.
402
Chapter 12: Correlation and Simple Regression Analysis
4.5
4
3.5
New Orders
3
2.5
2
1.5
1
0.5
0
0
20
40
60
80
100
120
Raw Steel Production
12.13
x
15
8
19
12
5
y
47
36
56
44
21
ŷ = 13.625 + 2.303 x
Residuals:
x
15
8
19
12
5
y
47
36
56
44
21
ŷ
48.1694
32.0489
57.3811
41.2606
25.1401
Residuals (y- ŷ )
-1.1694
3.9511
-1.3811
2.7394
-4.1401
© 2010 John Wiley & Sons Canada, Ltd.
403
Chapter 12: Correlation and Simple Regression Analysis
12.14
x
12
21
28
8
20
y
17
15
22
19
24
Predicted ( ŷ )
18.4582
19.9196
21.0563
17.8087
19.7572
Residuals (y- ŷ )
-1.4582
-4.9196
0.9437
1.1913
4.2428
ŷ = 16.51 + 0.162 x
12.15
x
140
119
103
91
65
29
24
y
25
29
46
70
88
112
128
Predicted ( ŷ ) Residuals (y- ŷ )
18.6597
6.3403
37.5229
-8.5229
51.8948
-5.8948
62.6737
7.3263
86.0281
1.9720
118.3648
-6.3648
122.8561
5.1439
ŷ = 144.414 – 0.898 x
12.16
x
12.5
3.7
21.6
60.0
37.6
6.1
16.8
41.2
y
148
55
338
994
541
89
126
379
Predicted ( ŷ )
144.2053
10.0954
282.8873
868.0945
526.7236
46.6708
209.7364
581.5868
Residuals (y- ŷ )
3.7947
44.9047
55.1127
125.9055
14.2764
42.3292
-83.7364
-202.5868
ŷ = – 46.292 + 15.240x
© 2010 John Wiley & Sons Canada, Ltd.
404
Chapter 12: Correlation and Simple Regression Analysis
12.17
x
16
6
8
4
7
y
5
12
9
15
7
Predicted ( ŷ )
4.0259
11.1722
9.7429
12.6014
10.4576
Residuals (y- ŷ )
0.9741
0.8278
-0.7429
2.3986
-3.4576
ŷ = 15.460 – 0.715 x
12.18
_ x_
58.1
55.4
57.0
58.5
57.4
58.0
_ y_
34.3
35.0
38.5
40.1
35.5
37.9
Predicted ( ŷ )
37.4978
35.1277
36.5322
37.8489
36.8833
37.4100
Residuals (y- ŷ )
-3.1978
-0.1277
1.9678
2.2511
-1.3833
0.4900
The residual for x = 58.1 is relatively large, but the residual for x = 55.4 is quite
small.
12.19
_x_ _ y_
5
47
7
38
11
32
12
24
19
22
25
10
Predicted ( ŷ )
42.2756
38.9836
32.3997
30.7537
19.2317
9.3558
Residuals (y- ŷ )
4.7244
-0.9836
-0.3997
-6.7537
2.7683
0.6442
ŷ = 50.506 - 1.646 x
No apparent violation of assumptions
© 2010 John Wiley & Sons Canada, Ltd.
405
Chapter 12: Correlation and Simple Regression Analysis
12.20
Distance (x)
1,245
425
1,346
973
255
865
1,080
296
Cost y
2.64
2.31
2.45
2.52
2.19
2.55
2.40
2.37
( ŷ )
2.5376
2.3322
2.5629
2.4694
2.2896
2.4424
2.4962
2.2998
(y- ŷ )
.1024
-.0222
-.1129
.0506
-.0996
.1076
-.0962
.0702
ŷ = 2.2257 – 0.00025 x
No apparent violation of assumptions
12.21
Error terms appear to be non independent
© 2010 John Wiley & Sons Canada, Ltd.
406
Chapter 12: Correlation and Simple Regression Analysis
12.22
There appears to be nonlinear regression
12.23
The Residuals vs. Fits graphic is strongly indicative of a violation of the
homoscedasticity assumption of regression. Because the residuals are very close
together for small values of x, there is little variability in the residuals at the left
end of the graph. On the other hand, for larger values of x, the graph flares out
indicating a much greater variability at the upper end. Thus, there is a lack of
homogeneity of error across the values of the independent variable.
12.24 SSE = y2 – b0y – b1xy = 1,935 – (16.50965)(97) – 0.16238(1767) = 46.6385
se 
SSE
46.6385
= 3.94

n2
3
Approximately 68% of the residuals should fall within ±1se.
3 out of 5 or 60% of the actual residuals fell within ± 1se.
12.25 SSE = y2 – b0y – b1xy = 45,154 – 144.414(498) – (–.89824)(30,099) =
SSE = 272.0
se 
SSE
272.0
= 7.376

n2
5
6 out of 7 = 85.7% fall within + 1se
7 out of 7 = 100% fall within + 2se
© 2010 John Wiley & Sons Canada, Ltd.
407
Chapter 12: Correlation and Simple Regression Analysis
12.26 SSE = y2 – b0y – b1xy = 1,587,328 – (–46.29)(2,670) – 15.24(107,610.4) =
SSE = 70,940
se 
SSE
70,940
= 108.7

n2
6
Six out of eight (75%) of the sales estimates are within $108.7 million.
12.27 SSE = y2 – b0y – b1xy = 524 – 15.46(48) – (–0.71462)(333) = 19.8885
se 
SSE
19.8885
= 2.575

n2
3
Four out of five (80%) of the estimates are within 2.575 of the actual rate for
bonds. This amount of error is probably not acceptable to financial analysts.
12.28
_ x_
_ y_
Predicted ( ŷ )
58.1
55.4
57.0
58.5
57.4
58.0
34.3
35.0
38.5
40.1
35.5
37.9
37.4978
35.1277
36.5322
37.8489
36.8833
37.4100
SSE =
se 
 ( y  yˆ )
2
Residuals (y– ŷ )
( y  yˆ ) 2
-3.1978
10.2259
-0.1277
0.0163
1.9678
3.8722
2.2511
5.0675
-1.3833
1.9135
0.4900
0.2401
2
 ( y  yˆ ) = 21.3355
= 21.3355
SSE
21.3355
= 2.3095

n2
4
This standard error of the estimate indicates that the regression model is
with + 2.3095(1,000) bankruptcies about 68% of the time. In this
particular problem, 5/6 or 83.3% of the residuals are within this standard
error of the estimate.
© 2010 John Wiley & Sons Canada, Ltd.
408
Chapter 12: Correlation and Simple Regression Analysis
12.29
 ( y  yˆ )
SSE =
se 
12.30
(y– ŷ )2
22.3200
.9675
.1597
45.6125
7.6635
.4150
(y- ŷ )2 = 77.1382
(y– ŷ )
4.7244
-0.9836
-0.3996
-6.7537
2.7683
0.6442
2
= 77.1382
SSE
77.1382
= 4.391

n2
4
(y- ŷ )
.1024
-.0222
-.1129
.0506
-.0996
.1076
-.0962
.0702
(y- ŷ )2
.0105
.0005
.0127
.0026
.0099
.0116
.0093
.0049
2
(y– ŷ ) = .0620
SSE =
 ( y  yˆ )
2
= .0620
SSE
.0620
= .1017

n2
6
The model produces estimates that are ±.1017 or within about 10 cents 68% of the
time. However, the range of milk costs is only 45 cents for this data.
se 
12.31
Volume (x)
728.6
497.9
439.1
377.9
375.5
363.8
276.3
Sales (y)
10.5
48.1
64.8
20.1
11.4
123.8
89.0
n=7
x = 3059.1
2
x = 1,464,071.97
y = 367.7
y = 30,404.31
2
© 2010 John Wiley & Sons Canada, Ltd.
xy = 141,558.6
409
Chapter 12: Correlation and Simple Regression Analysis
b1 = –.1504
b0 = 118.257
ŷ = 118.257 – .1504x
SSE = y2 – b0y – b1xy
= 30,404.31 – (118.257)(367.7) – (–0.1504)(141,558.6) = 8211.6245
se 
SSE
8211.6245
= 40.526

n2
5
This is a relatively large standard error of the estimate given the sales values
(ranging from 10.5 to 123.8).
12.32 r2 = 1 
SSE
46.6385
= .123
 1
2
2
(
y
)
(
97
)

1,935 
 y2  n
5
This is a low value of r2
12.33 r2 = 1 
SSE
272.121
= .972
 1
2
2
(
y
)
(
498
)

45,154 
 y2  n
7
This is a high value of r2.
12.34 r2 = 1 
SSE
70,940
= .898
 1
2
( y )
(2,670) 2
2
1,587,328 
y  n
8
This value of r2 is relatively high.
© 2010 John Wiley & Sons Canada, Ltd.
410
Chapter 12: Correlation and Simple Regression Analysis
12.35 r2 = 1 
SSE
19.8885
 1
2
( y )
(48) 2
2
524 
y  n
5
= .685
This value of r2 is a modest value.
68.5% of the variation of y is accounted for by x but 31.5% is unaccounted for.
12.36 r2 = 1 
SSE
21.33547
= .183
 1
2
( y )
(221.3) 2
2
8,188.41 
y  n
6
This value is a low value of r2.
Only 18.3% of the variability of y is accounted for by the x values and 81.7% are
unaccounted for.
12.37
Year
1997
1998
1999
2000
2001
2002
2003
2004
2005
Long-term
Interest Rates
(%)
6.14
5.28
5.54
5.93
5.48
5.30
4.80
4.58
4.07
Real GDP
Growth (%)
4.2
4.1
5.5
5.2
1.8
2.9
1.8
3.3
2.9
Equation of the regression line: Long term interest rate = 4.3557 + 0.2498 (Real GDP)
Standard error of the model = 0.6031
r-squared = 0.2592
Real GDP is not a good predictor of the Long-term Interest Rate because the t-value for
the Real GDP Growth is not statistically significant.
© 2010 John Wiley & Sons Canada, Ltd.
411
Chapter 12: Correlation and Simple Regression Analysis
se
12.38 sb =
x
2

( x)
2

n
3.94
(89) 2
1833 
5
= .2498
b1 = 0.162
Ho: 1 = 0
Ha: 1  0
 = .05
This is a two-tail test, /2 = .025
df = n – 2 = 5 – 2 = 3
t.025,3 = ±3.182
t =
b1  1 0.162  0
= 0.65

sb
.2498
Since the observed t = 0.65 < t.025,3 = 3.182, the decision is to fail to reject the
null hypothesis.
se
12.39 sb =
x
2

( x )
2

n
7.376
(571) 2
58,293 
7
= .068145
b1 = – 0.898
Ho: 1 = 0
Ha: 1  0
 = .01
Two-tail test, /2 = .005
df = n – 2 = 7 – 2 = 5
t.005,5 = ±4.032
t=
b1  1  0.898  0
= –13.18

sb
.068145
Since the observed t = –13.18 < t.005,5 = – 4.032, the decision is to reject the null
hypothesis.
© 2010 John Wiley & Sons Canada, Ltd.
412
Chapter 12: Correlation and Simple Regression Analysis
se
12.40 sb =
x
2

( x )

2
n
108.7
(199.5) 2
7,667.15 
8
= 2.095
b1 = 15.240
Ho: 1 = 0
Ha: 1  0
 = .10
For a two-tail test, /2 = .05
df = n – 2 = 8 – 2 = 6
t.05,6 = + 1.943
t =
b1  1 15.240  0
= 7.27

sb
2.095
Since the observed t = 7.27 > t.05,6 = 1.943, the decision is to reject the null
hypothesis.
se
12.41 sb =
x
2

( x)
n
2

2.575
(41) 2
421 
5
= .27963
b1 = – 0.715
Ho: 1= 0
Ha: 1  0
 = .05
For a two-tail test, /2 = .025
df = n – 2 = 5 – 2 = 3
t.025,3 = ±3.182
t =
b1  1  0.715  0
= – 2.56

sb
.27963
Since the observed t = –2.56 > t.025,3 = –3.182, the decision is to fail to reject the
null hypothesis.
© 2010 John Wiley & Sons Canada, Ltd.
413
Chapter 12: Correlation and Simple Regression Analysis
se
12.42 sb =
x
2

( x)
2
n

2.3095
(344.4) 2
19,774.78 
6
= 0.926025
b1 = 0.878
Ho: 1 = 0
Ha: 1  0
 = .05
For a two-tail test, /2 = .025
df = n – 2 = 6 – 2 = 4
t.025,4 = ±2.776
t =
b1  1 0.878  0
= 0.948

sb
.926025
Since the observed t = 0.948 < t.025,4 = 2.776, the decision is to fail to reject the
null hypothesis.
12.43 F = 8.26 with a p-value of .021. The overall model is significant at  = .05 but
not at  = .01. For simple regression,
t =
F = 2.874
t.05,8 = 1.86 but t.01,8 = 2.896. The slope is significant at  = .05 but not at
 = .01.
12.44 x0 = 25
95% confidence
df = n – 2 = 5 – 2 = 3
x
 x  89 =
n
x = 89
5
/2 = .025
t.025,3 = ±3.182
17.8
x2 = 1,833
se = 3.94
© 2010 John Wiley & Sons Canada, Ltd.
414
Chapter 12: Correlation and Simple Regression Analysis
ŷ = 16.5 + 0.162(25) = 20.55
ŷ ± t /2,n-2 se
1

n
( x0  x) 2
( x) 2
2
x


n
20.55 ± 3.182(3.94)
1 ( 25  17.8) 2

(89) 2
5
1,833 
5
= 20.55 ± 3.182(3.94)(.63903) =
20.55 ± 8.01
12.54 < E(y25) < 28.56
12.45 x0 = 100
For 90% confidence, /2 = .05
df = n – 2 = 7 – 2 = 5
t.05,5 = ±2.015
x
 x  571
n
7
= 81.57143
x= 571
x2 = 58,293
se = 7.377
ŷ = 144.414 – .898(100) = 54.614
ŷ ± t /2,n-2 se 1 
1

n
( x0  x) 2
=
2
(
x
)

 x2  n
54.614 ± 2.015(7.377) 1 
1 (100  81.57143) 2
=

(571) 2
7
58,293 
7
54.614 ± 2.015(7.377)(1.08252) = 54.614 ± 16.091
38.523 < y < 70.705
For x0 = 130,
ŷ = 144.414 – 0.898(130) = 27.674
© 2010 John Wiley & Sons Canada, Ltd.
415
Chapter 12: Correlation and Simple Regression Analysis
y ± t /2,n-2 se 1 
1

n
( x0  x) 2
( x ) 2
2
x  n
=
1 (130  81.57143) 2
27.674 ± 2.015(7.377) 1  
=
(571) 2
7
58,293 
7
27.674 ± 2.015(7.377)(1.1589) =
27.674 ± 17.227
10.447 < y < 44.901
The confidence interval of y for x0 = 130 is wider than the confidence interval of y
for x0 = 100, because x0 = 100 is nearer to the value of x = 81.57 than is x0 = 130.
12.46 x0 = 20
For 98% confidence,
/2 = .01
df = n – 2 = 8 – 2 = 6
t.01,6 = + 3.143
x
 x  199.5
n
8
= 24.9375
x = 199.5
x2 = 7,667.15
se = 108.8
ŷ = – 46.29 + 15.24(20) = 258.51
ŷ ± t /2,n-2 se
1

n
( x0  x) 2
( x) 2
2
x  n
258.51 ± (3.143)(108.8)
1

8
(20  24.9375) 2
(199.5) 2
7,667.15 
8
258.51 ± (3.143)(108.8)(0.36614) = 258.51 ± 125.20
133.31 < E(y20) < 383.71
For single y value:
© 2010 John Wiley & Sons Canada, Ltd.
416
Chapter 12: Correlation and Simple Regression Analysis
ŷ ± t /2,n-2 se 1 
1

n
( x0  x) 2
( x ) 2
2
x  n
1
258.51 ± (3.143)(108.8) 1  
8
(20  24.9375) 2
(199.5) 2
7,667.15 
8
258.51 ± (3.143)(108.8)(1.06492) = 258.51 ± 364.16
– 105.65 < y < 622.67
The prediction interval for the single value of y is wider than the confidence
interval for the average value of y because the average is more towards the
middle and individual values of y can vary more than values of the average.
12.47 x0 = 10
For 99% confidence
df = n – 2 = 5 – 2 = 3
t.005,3 = + 5.841
x
 x  41
n
5
/2 = .005
= 8.20
x = 41
x2 = 421
se = 2.575
ŷ = 15.46 – 0.715(10) = 8.31
ŷ ± t /2,n-2 se
1

n
( x0  x) 2
( x) 2
2
x  n
8.31 ± 5.841(2.575)
1 (10  8.2) 2

(41) 2
5
421 
5
=
8.31 ± 5.841(2.575)(.488065) = 8.31 ± 7.34
0.97 < E(y10) < 15.65
If the prime interest rate is 10%, we are 99% confident that the average bond rate
is between 0.97% and 15.65%.
© 2010 John Wiley & Sons Canada, Ltd.
417
Chapter 12: Correlation and Simple Regression Analysis
12.48
Year (x)
2004
2005
2006
2007
2008
Fertilizer ($ millions) (y)
3,481.4
2,697.2
3,609.2
4,637.7
6,867.9
x = 10,030
x2 = 20,120,190
b1 =
b0 =
SS xy
SS xx

y = 21,293.4
y2 = 101,097,670.14
 xy 
x
2
 x y

n
( x) 2
n
(10,030)( 21,293.4)
5
= 871.35
(10,030) 2
20,120,190 
5
42,723,273.9 
=
 y  b  x  21,293.4  871.35 10,030
n
1
n
5
xy = 42,723,273.9
n=5
5
= – 1,743,669.42
ŷ = – 1,743,669.42+ 871.35 x
ŷ (2010) = – 1,743,669.42+ 871.35 (2010) = 7,744.08 ($millions)
12.49
Month
May 2009
June 2009
July 2009
August 2009
September 2009
October 2009
November 2009
Long-Term Interest Rates
3.22
3.47
3.42
3.473
3.366
3.423
3.41
Trend line: Long term interest rate = 3.3371 + 0.015071 (Recoded Month)
Forecast for December 2009 = 3.46
© 2010 John Wiley & Sons Canada, Ltd.
418
Chapter 12: Correlation and Simple Regression Analysis
12.50
Year
1994
1995
1996
1997
1998
1999
2000
2001
2002
2003
2004
New Housing
Prices Index
102.0
101.2
99.3
100.0
101.0
101.9
104.1
107.0
111.3
116.7
123.2
Equation of the trend line = New housing price index =94.0946+2.01(Recoded year)
Forecast for new housing prices for 2005 = 118.2145
12.51
x = 36
y = 44
xy = 188
r =
x2 = 256
y2 = 300
n =7
 xy 
 x y
n

( x)  
( y ) 2 
2
2
 x 
  y 

n  
n 

2
 38.2857
r =
(70.85714)( 23.42857)
12.52
x
5
7
3
16
12
=
(36)( 44)
7
2

(36)  
(44) 2 
256

300




7 
7 

188 
=
 38.2857
= –.940
40.7441
y
8
9
11
27
15
© 2010 John Wiley & Sons Canada, Ltd.
419
Chapter 12: Correlation and Simple Regression Analysis
9
13
x = 52
y = 83
xy = 865
x2 = 564
y2 = 1,389
n=6
a)
ŷ = 2.6941 + 1.2853 x
b)
ŷ (Predicted Values)
9.1206
11.6912
6.5500
23.2588
18.1177
14.2618
c)
(y- ŷ ) residuals
-1.1206
-2.6912
4.4500
3.7412
-3.1176
-1.2618
(y- ŷ )2
1.2557
7.2426
19.8025
13.9966
9.7194
1.5921
SSE = 53.6089
se 
d)
b1 = 1.2853
b0 = 2.6941
SSE
53.6089
= 3.661

n2
4
r2 = 1 
SSE
53.6089
 1
2
( y )
(83) 2
2
1
,
389

y  n
6
= .777
© 2010 John Wiley & Sons Canada, Ltd.
420
Chapter 12: Correlation and Simple Regression Analysis
e)
Ho: 1 = 0
Ha: 1  0
 = .01
Two-tailed test, /2 = .005
df = n – 2 = 6 – 2 = 4
t.005,4 = ±4.604
se
sb =
x
t =
2

( x ) 2

n
3.661
(52) 2
564 
6
= .34389
b1  1 1.2853  0
= 3.74

sb
.34389
Since the observed t = 3.74 < t.005,4 = 4.604, the decision is to fail to reject
the null hypothesis.
f)
12.53
The r2 = 77.74% is modest. There appears to be some prediction with this
model. The slope of the regression line is not significantly different from
zero using  = .01. However, for  = .05, the null hypothesis of a zero
slope is rejected. The standard error of the estimate, se = 3.661 is not
particularly small given the range of values for y (11 – 3 = 8).
x
53
47
41
50
58
62
45
60
x = 416
y = 57
xy = 3,106
a)
y
5
5
7
4
10
12
3
11
x2 = 22,032
y2 = 489
n=8
b1 = 0.355
b0 = –11.335
ŷ = –11.335 + 0.355 x
© 2010 John Wiley & Sons Canada, Ltd.
421
Chapter 12: Correlation and Simple Regression Analysis
b)
ŷ (Predicted Values)
7.48
5.35
3.22
6.415
9.255
10.675
4.64
9.965
c)
(y- ŷ ) residuals
-2.48
-0.35
3.78
-2.415
0.745
1.325
-1.64
1.035
(y- ŷ )2
6.1504
0.1225
14.2884
5.8322
0.5550
1.7556
2.6896
1.0712
SSE = 32.4649
SSE
32.4649
= 2.3261

n2
6
d)
se =
e)
r2 = 1 
f)
Ho: 1 = 0
Ha: 1  0
SSE
32.4649
 1
2
( y )
(57) 2
2
489

y


8
n
= .608
 = .05
Two-tailed test, /2 = .025
df = n – 2 = 8 – 2 = 6
t.025,6 = ±2.447
se
sb =
x
2

( x) 2
n

2.3261
(416) 2
22,032 
8
= 0.116305
© 2010 John Wiley & Sons Canada, Ltd.
422
Chapter 12: Correlation and Simple Regression Analysis
t =
b1  1 0.355  0
= 3.05

sb
.116305
Since the observed t = 3.05 > t.025,6 = 2.447, the decision is to reject the
null hypothesis.
The population slope is different from zero.
g)
This model produces only a modest r2 = .608. Almost 40% of the
variance of y is unaccounted for by x. The range of y values is 12 – 3 = 9
and the standard error of the estimate is 2.33. Given this small range, the
se is not small.
12.54 x = 1,263
y = 417
xy = 88,288
b0 = 25.42778
x2 = 268,295
y2 = 29,135
n=6
b1 = 0.209369
SSE = y2 - b0y - b1xy =
29,135 – (25.42778)(417) – (0.209369)(88,288) = 46.845468
r2 = 1 
SSE
46.845468
= .695
 1
2
153.5
( y )
2
y  n
Coefficient of determination = r2 = .695
12.55 a) x0 = 60
x = 524
y = 215
xy = 15,125
x2 = 36,224
y2 = 6,411
n=8
se = 3.201
95% Confidence Interval
df = n – 2 = 8 – 2 = 6
b1 = .5481
b0 = –9.026
/2 = .025
t.025,6 = ±2.447
ŷ = –9.026 + 0.5481(60) = 23.86
© 2010 John Wiley & Sons Canada, Ltd.
423
Chapter 12: Correlation and Simple Regression Analysis
x
 x  524
n
8
ŷ ± t /2,n-2 se
= 65.5
( x0  x) 2
( x) 2
2
x


n
1

n
1

8
23.86 + 2.447(3.201)
(60  65.5) 2
(524) 2
36,224 
8
23.86 + 2.447(3.201)(.375372) = 23.86 + 2.94
20.92 < E(y60) < 26.8
b) x0 = 70
ŷ 70 = –9.026 + 0.5481(70) = 29.341
ŷ + t/2,n-2 se 1 
1

n
( x0  x) 2
( x ) 2
2
x  n
1
29.341 + 2.447(3.201) 1  
8
(70  65.5) 2
(524) 2
36,224 
8
29.341 + 2.447(3.201)(1.06567) = 29.341 + 8.347
20.994 < y < 37.688
c) The confidence interval for (b) is much wider because part (b) is for a single value
of y which produces a much greater possible variation. In actuality, x0 = 70 in
part (b) is slightly closer to the mean (x) than x0 = 60. However, the width of the
single interval is much greater than that of the average or expected y value in
part (a).
12.56
Year
Cost
© 2010 John Wiley & Sons Canada, Ltd.
424
Chapter 12: Correlation and Simple Regression Analysis
1
2
3
4
5
56
54
49
46
45
x = 15
x2 = 55
b1 =
b0 =
y = 250
y2 = 12,594
SS xy
SS xx

 xy 
x
2
 x y

n
( x) 2
(15)( 250)
5
= –3
(15) 2
55 
5
720 
=
n
 y  b  x  250  (3) 15
1
n
n
5
xy = 720
n=5
5
= 59
ŷ = 59 – 3 x
ŷ (7) = 59 – 3(7) = 38 ($ millions)
12.57 y = 267
x = 21
xy = 1,256
y2 = 15,971
x2 = 101
n =5
b0 = 9.234375
b1 = 10.515625
SSE = y2 – b0y – b1xy =
15,971 – (9.234375)(267) – (10.515625)(1,256) = 297.7969
r2 = 1 
SSE
297.7969
= .826
 1
2
1
,
713
.
2
(
y
)

 y2  n
If a regression model would have been developed to predict number of cars sold
by the number of sales people, the model would have had an r2 of 82.6%. This
result means that 82.6% of the variance in the number of cars sold is explained by
variation in the number of salespeople.
12.58 n = 12
x = 548
x2 = 26,592
© 2010 John Wiley & Sons Canada, Ltd.
425
Chapter 12: Correlation and Simple Regression Analysis
y = 5940
y2 = 3,211,546
b1 = 10.626383
xy = 287,908
b0 = 9.728511
ŷ = 9.728511 + 10.626383 x
SSE = y2 – b0y – b1xy =
3,211,546 – (9.728511)(5940) – (10.626383)(287,908) = 94,337.9762
se 
SSE
94,337.9762
= 97.1277

n2
10
SSE
94,337.9762
= .652
 1
2
271
,
246
(
y
)

 y2  n
se
97.1277

= 2.4539
2
2
(
x
)
(
548
)

26,592 
 x2  n
12
r2 = 1 
sb =
t =
10.626383  0
= 4.33
2.4539
If  = .01, then t.005,10 = +3.169. Since the observed t = 4.33 > t.005,10 = 3.169, the
decision is to reject the null hypothesis. The slope is significantly larger than
zero. Average family income is a significant predictor of number of rentals at a
store.
12.59
Sales(y)
Number of Units(x)
17.1
7.9
4.8
4.7
4.6
4.0
2.9
2.7
2.7
12.4
7.5
6.8
8.7
4.6
5.1
11.2
5.1
2.9
y = 51.4
x2 = 538.97
y2 = 460.1
xy = 440.46
x = 64.3
n=9
© 2010 John Wiley & Sons Canada, Ltd.
426
Chapter 12: Correlation and Simple Regression Analysis
b0 = –0.863565
b1 = 0.92025
ŷ = –0.863565 + 0.92025 x
SSE = y2 – b0y – b1xy =
460.1 – (–0.863565)(51.4) – (0.92025)(440.46) = 99.153926
SSE
99.153926
= .405
 1
2
166
.
55
(
y
)

 y2  n
 x y
 xy  n
=
2
2

( x)  
( y ) 
2
2
 x 
  y 

n  
n 

r2 = 1 
r =
(64.3)(51.4)
9

(64.3) 2  
(51.4) 2 
538
.
97

460
.
1




9 
9 

440.46 
r =
73.23556
r =
= 0.636
(79.58222)(166.54889)
Since r = 0.405 = 0.636, the restaurant chain’s sales can be predicted by its
number of units.
12.60
Year (x)
1
2
3
4
5
6
7
8
Total Employment (1,000s) (y)
11,152
10,935
11,050
10,845
10,776
10,764
10,697
9,234
© 2010 John Wiley & Sons Canada, Ltd.
427
Chapter 12: Correlation and Simple Regression Analysis
9
10
11
12
13
9,223
9,158
9,147
9,313
9,353
x = 91
x2 = 819
b1 =
b0 =
SS xy
SS xx
y = 131,647
y2 = 1,342,147,171

 xy 
x
2
 x y

n
( x) 2
n
(91)(131,647)
13
= –198.9725
(91) 2
819 
13
885,316 
=
 y  b  x  131,647  (198.9725) 91
n
1
n
13
xy = 885,316
n = 13
13
= 11,519.4998
ŷ = 11,519.4998 – 198.9725 x
ŷ (2011) = ŷ (17) = 11,519.4998 – 198.9725 (17) = 8,136.9673
ŷ (2011) = 8,137 (1,000s)
12.61
x
2.4
3.9
1.1
2.8
2.4
1.6
2.8
2.2
2.2
0.1
y
6.7
7
7.8
7.9
7.2
6.9
6.1
6
6.1
8.4
© 2010 John Wiley & Sons Canada, Ltd.
428
Chapter 12: Correlation and Simple Regression Analysis
r =
 xy 
 x y
n

( x)  
( y ) 2 
2
2
 x 
  y 

n  
n 

=
2
(21.5)( 70.1)
10
2

(21.5)  
(70.1) 2 
55
.
87

497
.
57




10  
10 

146.94 
r =
r =
 3.775
= –.489
(9.645)(6.169)
The r value of –.489 denotes a moderate negative correlation.
12.62 x = 3,337,494
x2 = 736,622,100,116
xy = 965,566,451,720
r =
 xy 

( x)
2
 x 
n

2
y = 4,406,569
y2 = 1,274,609,514,995
n = 16
 x y
n

( y )
2
  y 
n
 
2



=
(3,337,494)( 4,406,569)
16

(3,337,494) 2  
(4,406,569) 2 
736
,
622
,
100
,
116

1
,
274
,
609
,
514
,
995




16
16



965,566,451,720 
r =
46,385,351,839.6
r =
= 0.934
(40,442,962,613.8)(60,993,868,009.9)
The r value of 0.934 denotes a strong positive correlation.
b1 = 1.1469
b0 = 36,168.0228
ŷ = 36,168.0228+ 1.1469 x
© 2010 John Wiley & Sons Canada, Ltd.
429
Chapter 12: Correlation and Simple Regression Analysis
ŷ (200,000) = 36,168.0228+ 1.1469 (200,000) = 265,548.0228
ŷ + t/2,n-2se
1

n
( x0  x) 2
( x) 2
2
x  n
 = .05, t.025,14 = + 2.145
x0 = 200,000, n = 16
x = 208,593.375
SSE = y2 – b0y – b1xy =
1,274,609,514,995– (36,168.0228)( 4,406,569) – (1.1469)( 965,566,451,720) =
= 7,824,463,455.5
se 
SSE
7,824,463,455.5
= 23,640.8597

n2
14
Confidence Interval =
265,548.0228+ (2.145)( 23,640.8597)
1

16
(200,000  208,593.375) 2
=
(3,337,494) 2
736,622,100,116 
16
265,548.0228 + 12,861.2626
252,686.7542 to 278,409.2854
H0: 1 = 0
Ha: 1  0
 = .05
t =
df = 14
Table t.025,14 = + 2.145
b1  0
1.1469  0
1.1469
= 9.7559


sb

 .11756
23
,
640
.
8597


 40,442,962,613.8 


Since the observed t = 9.7559 > t.025,14 = 2.145, the decision is to reject the null
© 2010 John Wiley & Sons Canada, Ltd.
430
Chapter 12: Correlation and Simple Regression Analysis
hypothesis.
12.63 x = 10.797
y = 516.8
xy = 1091.2081
x2 = 20.673549
y2 = 61,899.06
n=7
b1 = 73.15543
b0 = –39.00846
ŷ = –39.00846+ 73.15543 x
SSE = y2 – b0 y – b1 xy
SSE = 61,899.06 – (–39.00846)(516.8) – (73.15543)( 1091.2081) = 2,230.83435
se 
SSE
2,230.83435
= 21.12266

n2
5
r2 = 1 
SSE
2,230.83435
= 1 – .09395 = .90605
 1
2
( y )
(516.8) 2
2
61,899.06 
y  n
7
12.64 x = 542,911
y = 858,326
y2 = 77,988,664,304
b1 =
SS xy
SS xx

 xy 
x
2
n = 10
 x y

n
( x) 2
n
x2 = 31,262,604,453
xy = 49,065,104,049
(542,911)(858,326)
10
(542,911) 2
31,262,604,453 
10
49,065,104,049 
=
b1 = 1.379481
b0 =
ŷ
 y  b  x  858,326  (1.379481) 542,911
n
1
n
10
10
= 10,939.059081
= 10,939 + 1.3795 x
r2 for this model is 0.78801183 . There is predictability in this model.
Test for slope: t = 5.45 with a p-value of 0.0006. The null hypothesis that the
population slope is zero is rejected.
© 2010 John Wiley & Sons Canada, Ltd.
431
Chapter 12: Correlation and Simple Regression Analysis
Time-Series Trend Line:
x = 55
x2 = 385
b1 =
b0 =
SS xy
SS xx
y = 542,911
y2 = 31,262,604,453

 xy 
x
 x y

2
xy = 3,102,042
n = 10
(55)(542,911)
10
= 1406.442424
(55) 2
385 
10
3,102,042 
n
( x) 2
=
n
 y  b  x  542,911  (1406.442424) 55
1
n
n
10
10
= 46,555.666667
ŷ = 46555.666667+ 1406.442424 x
ŷ (2008) = ŷ (11) = 46,555.666667 + 1406.442424 (11) = 62,026.5
12.65 x = 1034.7
x2 = 217,604.49
xy = 2,356,711.6
b1 =
b0 =
SS xy
SS xx

y = 18,840
y2 = 62,486,228
n=7
 xy 
x
2
 x y

n
( x) 2
(1034.7)(18,840)
7
= – 6.620826
(1034.7) 2
217,604.49 
7
2,356,711.6 
=
n
 y  b  x  18,840  (6.620826) 1034.7
n
1
n
7
7
= 3670.081237
ŷ = 3670.081237– 6.620826 x
SSE = y2 – b0y – b1xy
= 62,486,228 – (3670.081237)(18,840) – (–6.620826)(2,356,711.6)
SSE = 8,945,274.9307
© 2010 John Wiley & Sons Canada, Ltd.
432
Chapter 12: Correlation and Simple Regression Analysis
SSE
8,945,274.9307
= 1337.5556

n2
5
se 
r2 = 1 
SSE
8,945,274.9307
= 1 – .7594 = .2406
 1
2
2
(
y
)
(
18
,
840
)

62,486,228 
 y2  n
7
The r value of –0.4905 denotes a moderate negative correlation.
H0: 1 = 0
Ha: 1 0
 = .05
 x 

t.025,5 = + 2.571
2
SSxx =
t =
x
2
n
 217,604.49 
(1034.7) 2
= 64,661.048571
7
b1  0
 6.620826  0
= – 1.26

se
1337.5556
SS xx
64,661.048571
Since the observed t = – 1.26> t.025,5 = – 2.571, the decision is to fail to reject the
null hypothesis.
12.66 Let Water Use = y and Temperature = x
x = 196
y = 3,880
xy = 112,006
x2 = 5,834
y2 = 2,188,612
n=8
b1 = 16.42054
b0 = 82.69671
ŷ = 82.69671 + 16.42054 x
ŷ 38 = 82.69671 + 16.42054 (38) = 707
SSE = y2 – b0 y – b1 xy
SSE = 2,188,612– (82.69671)( 3,880) – (16.42054)( 112,006) = 28,549.76196
se 
SSE
28,549.76196
= 68.980386

n2
6
© 2010 John Wiley & Sons Canada, Ltd.
433
Chapter 12: Correlation and Simple Regression Analysis
r2 = 1 
SSE
28,549.76196
= 1 – .093053 = .9069
 1
2
2
(
y
)
(
3
,
880
)

2,188,612 
 y2  n
8
Testing the slope:
Ho: 1 = 0
Ha: 1  0
 = .01
Since this is a two-tailed test, /2 = .005
df = n – 2 = 8 – 2 = 6
t.005,6 = ±3.707
se
sb =
x
t =
2

( x ) 2
n

68.980386
(196) 2
5,834 
8
= 2.147266
b1  1 16.42054  0
= 7.65

sb
2.147266
Since the observed t = 7.65 > t.005,6 = 3.707, the decision is to reject the null
hypothesis.
12.67 The F value for overall predictability is 7.12 with an associated p-value of .0205
which is significant at  = .05. It is not significant at alpha of .01. The coefficient
of determination is .372 with an adjusted r2 of .32. This represents the modest
predictability. The standard error of the estimate is 982.219 which in units of
1,000 labourers means that about 68% of the predictions are within 982,219 of the
actual figures. The regression model is:
Number of Union Members = 22,348.97 – 0.0524 Labour Force. For a labour
force of 100,000 (thousand, actually 100 million), substitute x = 100,000 and get a
predicted value of 17,108.97 (thousand) which is actually 17,108,970 union
members.
12.68 The Residual Diagnostics from output indicate a relatively healthy set of
residuals. The Histogram indicates that the error terms are generally normally
distributed. This is somewhat confirmed by the semi straight line Normal Plot of
Residuals. However, the Residuals vs. Fitted Values graph indicates that there
may be some heteroscedasticity with greater error variance for small x values.
© 2010 John Wiley & Sons Canada, Ltd.
434
Download