252solnI1 11/9/07
(Open this document in 'Page Layout' view!)
I. LINEAR REGRESSION-Confidence Intervals and Tests
1. Confidence Intervals for $b_1$.
2. Tests for $b_1$.
3. Confidence Intervals and Tests for $b_0$
   Text 13.40-13.42, 13.49 [13.35-13.37, 13.43] (13.35-13.37, 13.43) (In 13.42 [13.37] do test for $b_0$ as well)
4. Prediction and Confidence Intervals for $y$
   Text 13.55-13.56, 13.58 [13.49-13.51] (13.47-13.49)
This document includes exercises 13.35 to 13.43.
----------------------------------------------------------------------------------------------------------------------------- ----
Problems involving confidence intervals and tests for the coefficients of a simple
regression.
Answers below are heavily edited versions of the answers in the Instructor's Solution Manual. Note that your text uses $S_{YX}$ for the standard error, which I call $s_e$. The sum $S_{xy} = \sum xy - n\bar{x}\bar{y}$ and the covariance $s_{xy} = \frac{\sum xy - n\bar{x}\bar{y}}{n-1}$, which was introduced last term, are not the same thing.
Exercise 13.40 [13.35 in 8th and 9th]: You have found $n = 18$, $b_1 = 4.5$ and $s_{b_1} = 1.5$. a) What is the t ratio for a significance test? b) At a 95% confidence level, what are the critical values of t? c) What is your statistical decision about significance? d) Construct a 95% confidence interval for the slope.
Solution: From the outline $s_{b_1}^2 = s_e^2\left(\frac{1}{\sum X^2 - n\bar{X}^2}\right) = \frac{s_e^2}{SS_x}$ and $\beta_1 = b_1 \pm t_{\alpha/2}\, s_{b_1}$, where $df = n - 2$.
To test $H_0: \beta_1 = \beta_{10}$ against $H_1: \beta_1 \ne \beta_{10}$, use $t = \frac{b_1 - \beta_{10}}{s_{b_1}}$. Remember that $\beta_{10}$ is most often zero, and if the null hypothesis is false in that case we say that $\beta_1$ is significant.
Given: $n = 18$, $b_1 = 4.5$ and $s_{b_1} = 1.5$. $H_0: \beta_1 = 0$.
a) $t = \frac{b_1 - 0}{s_{b_1}} = \frac{4.5}{1.5} = 3.00$.
b) $df = n - 2 = 18 - 2 = 16$, $\alpha = .05$. This is a 2-sided test, so use $t^{16}_{.025} = 2.120$.
c) Make a diagram. The rejection regions are above $t^{n-2}_{\alpha/2} = t^{16}_{.025} = 2.120$ and below $-t^{n-2}_{\alpha/2} = -t^{16}_{.025} = -2.120$. Since 3.00 is in the upper rejection region, reject the null hypothesis, which is equivalent to saying that the regression equation is useful.
d) $\beta_1 = b_1 \pm t_{\alpha/2}\, s_{b_1} = 4.5 \pm 2.120(1.5) = 4.5 \pm 3.18$. So $1.32 \le \beta_1 \le 7.68$. This is equivalent to saying the coefficient is significant, since the interval does not include zero.
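As a quick check on the arithmetic, here is a minimal Python sketch of this exercise. It assumes scipy is available for the t critical value; the course itself used Minitab, so this is only an illustration, not the text's method.

from scipy import stats

n, b1, s_b1 = 18, 4.5, 1.5          # given in the exercise
df = n - 2                          # 16 degrees of freedom
t_ratio = (b1 - 0) / s_b1           # a) test ratio for H0: beta1 = 0
t_crit = stats.t.ppf(0.975, df)     # b) two-sided 5% critical value, about 2.120

print(f"t ratio = {t_ratio:.2f}, critical values = +/-{t_crit:.3f}")
print("reject H0" if abs(t_ratio) > t_crit else "do not reject H0")   # c)

half_width = t_crit * s_b1          # d) 95% confidence interval for the slope
print(f"95% CI for beta1: {b1 - half_width:.2f} to {b1 + half_width:.2f}")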
Exercise 13.41 [13.36 in 8th and 9th]: If $SSR = 60$, $SSE = 40$ and $n = 20$, a) Find the F statistic for the ANOVA. b) Find the critical value of F for $\alpha = .05$. c) Make a statistical decision. d) Calculate the correlation assuming that $b_1$ is negative. e) Test the correlation for significance.
Solution: If $k$ is the number of independent variables (1 in this section), the ANOVA table we can get from this is below. We know $SSR = 60$, $SSE = 40$, and $n = 20$. We assume $\alpha = .05$.
Source            SS     DF         MS                     F
Regression        SSR    k          MSR = SSR/k            F = MSR/MSE
Error (Residual)  SSE    n - k - 1  MSE = SSE/(n - k - 1)
Total             SST    n - 1
If we fill in the data above, we get the following.
Source            SS     DF         MS                     F
Regression        60     1          MSR = SSR/k            F = MSR/MSE
Error (Residual)  40     n - k - 1  MSE = SSE/(n - k - 1)
Total             100    20 - 1
If we divide SS by DF and MSR by MSE, we can finish the table.
Source            SS     DF    MS            F         F.05
Regression        60     1     MSR = 60      F = 27    F(1,18).05 = 4.41
Error (Residual)  40     18    MSE = 2.222
Total             100    19
So we have a) $MSR = \frac{SSR}{k} = \frac{60}{1} = 60$, $s_e^2 = MSE = \frac{SSE}{n-k-1} = \frac{40}{18} = 2.222$, and $F = \frac{MSR}{MSE} = \frac{60}{2.222} = 27$. The null hypothesis is $H_0$: Regression is useless, which is equivalent to $H_0: \beta_1 = 0$ if $k = 1$. b) We get $F^{(1,18)}_{.05} = 4.41$ from the table. c) Our rejection zone is above 4.41 and, since $F_{calc} > F^{(1,18)}_{.05}$, we reject the null hypothesis.
d) From the outline $R^2 = \frac{SSR}{SST} = \frac{60}{100} = 0.60$. Remember $r^2 = R^2$ and the correlation has the same sign as the simple regression coefficient. So $r = -\sqrt{.60} = -.7746$.
e) According to the text and the outline, if we want to test $H_0: \rho_{xy} = 0$ against $H_1: \rho_{xy} \ne 0$ and $x$ and $y$ are normally distributed, we use $t^{n-2} = \frac{r}{s_r} = \frac{r}{\sqrt{\frac{1-r^2}{n-2}}}$. The value of our test ratio is $t = \frac{-.7746}{\sqrt{\frac{1-.6}{18}}} = -5.1962$. This is a two-sided test and the rejection zones are above $t^{n-2}_{\alpha/2} = t^{18}_{.025} = 2.101$ and below $-t^{n-2}_{\alpha/2} = -t^{18}_{.025} = -2.101$. Since -5.1962 is in the lower rejection zone, there is a significant correlation between $x$ and $y$.
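A similar hedged sketch in Python for this exercise, assuming scipy for the F and t critical values; it reproduces the F ratio, R-squared, r, and the correlation test, but it is only an illustration of the formulas above.

from math import sqrt
from scipy import stats

SSR, SSE, n, k = 60.0, 40.0, 20, 1
MSR, MSE = SSR / k, SSE / (n - k - 1)
F = MSR / MSE                               # a) 27
F_crit = stats.f.ppf(0.95, k, n - k - 1)    # b) F(1,18) at 5%, about 4.41

R2 = SSR / (SSR + SSE)                      # d) 0.60
r = -sqrt(R2)                               # slope is negative, so r is negative
t_r = r / sqrt((1 - R2) / (n - 2))          # e) test ratio for H0: rho = 0
t_crit = stats.t.ppf(0.975, n - 2)          # about 2.101 with 18 df

print(f"F = {F:.2f} vs F.05 = {F_crit:.2f}; R^2 = {R2:.2f}, r = {r:.4f}")
print(f"t for the correlation = {t_r:.4f} vs +/-{t_crit:.3f}")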
Exercise 13.42 [13.37 in 8th and 9th]: For the Petfood problem, test the slope for significance and construct a 95% confidence interval for the slope.
Solution: Recall that in problems 13.4 and 13.16 $\bar{x} = \frac{\sum x}{n} = \frac{150}{12} = 12.5$ and $\bar{y} = \frac{\sum y}{n} = \frac{28.5}{12} = 2.375$. We had spare parts: $S_{xy} = \sum xy - n\bar{x}\bar{y} = 384 - 12(12.5)(2.375) = 27.75$, $SS_x = \sum x^2 - n\bar{x}^2 = 2250 - 12(12.5)^2 = 375$, $SST = SS_y = \sum y^2 - n\bar{y}^2 = 70.69 - 12(2.375)^2 = 3.0025$. $b_1 = \frac{S_{xy}}{SS_x} = \frac{27.75}{375} = 0.074$ (the slope), $b_0 = \bar{y} - b_1\bar{x} = 2.375 - 0.074(12.5) = 1.45$ (the intercept). $\hat{Y} = b_0 + b_1 x$ became $\hat{Y} = 1.45 + 0.074x$. $SSR = b_1 S_{xy} = 0.074(27.75) = 2.0535$.
$R^2 = \frac{SSR}{SST} = \frac{2.0535}{3.0025} = .6839$. $SSE = SST - SSR = 3.0025 - 2.0535 = 0.9490$.
a) $s_e^2 = \frac{SSE}{n-2} = \frac{0.9490}{12-2} = 0.09490$. The standard error (of the estimate) is $s_e = S_{YX} = \sqrt{0.09490} = .3081$.
So $s_{b_1}^2 = s_e^2\left(\frac{1}{\sum X^2 - n\bar{X}^2}\right) = \frac{s_e^2}{SS_x} = \frac{.09490}{375} = 0.0002531$, $s_{b_1} = \sqrt{0.0002531} = .01591$, and
$s_{b_0}^2 = s_e^2\left(\frac{1}{n} + \frac{\bar{X}^2}{\sum X^2 - n\bar{X}^2}\right) = .09490\left(\frac{1}{12} + \frac{12.5^2}{375}\right) = .04745$, $s_{b_0} = \sqrt{0.04745} = .21783$.
b 0
b  0 0.074
1.450
t b1  1

 4.651
t b0  0

 6.6566 We test each of these against
s b1
.01591
s b0
.217831
t n2  t 10  2.228 . Make a diagram with zero in the middle and an upper ‘reject’ zone above 2.228 and


2
.025
a lower reject zone below -2.228. Since both t-ratios are in the upper ‘reject’ zone, we reject the null
hypothesis and say that both coefficients are significant.
b) The confidence intervals for the two coefficients are $\beta_1 = b_1 \pm t_{\alpha/2}\, s_{b_1} = 0.074 \pm 2.228(0.01591) = 0.074 \pm .0354$, so $0.039 \le \beta_1 \le 0.109$, and $\beta_0 = b_0 \pm t_{\alpha/2}\, s_{b_0} = 1.450 \pm 2.228(0.21783) = 1.45 \pm 0.49$, so $0.96 \le \beta_0 \le 1.94$. This is equivalent to saying the coefficients are significant, since the intervals do not include zero.
Remember the printout given in 252solnG1.
Regression Analysis: Sales versus Space

The regression equation is
Sales = 1.45 + 0.0740 Space

Predictor       Coef      SE Coef    T       P
Constant        1.4500    0.2178     6.66    0.000
Space           0.07400   0.01591    4.65    0.001

S = 0.3081      R-Sq = 68.4%     R-Sq(adj) = 65.2%

Analysis of Variance
Source            DF    SS        MS        F        P
Regression         1    2.0535    2.0535    21.64    0.001
Residual Error    10    0.9490    0.0949
Total             11    3.0025
Notice the F and t ratios are those given above and that the p-values associated with them are essentially zero. We are then saying that we will reject the null hypotheses of no significance at any reasonable significance level. The F test and the t-test on the slope are essentially the same test. Note that 21.64 is 4.65 squared and that $F^{(1,10)}_{.05} = \left[t^{10}_{.025}\right]^2$.
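If you want to reproduce these numbers yourself, the following Python sketch works from the summary sums quoted above from problems 13.4 and 13.16. It assumes scipy is installed; it is only an illustrative check, not the Minitab run shown above.

from math import sqrt
from scipy import stats

# Summary sums quoted above from problems 13.4 and 13.16
n, sum_x, sum_y = 12, 150.0, 28.5
sum_xy, sum_x2, sum_y2 = 384.0, 2250.0, 70.69

xbar, ybar = sum_x / n, sum_y / n
S_xy = sum_xy - n * xbar * ybar                 # 27.75
SS_x = sum_x2 - n * xbar ** 2                   # 375
SST = sum_y2 - n * ybar ** 2                    # 3.0025

b1 = S_xy / SS_x                                # slope, 0.074
b0 = ybar - b1 * xbar                           # intercept, 1.45
SSR = b1 * S_xy
SSE = SST - SSR
s_e2 = SSE / (n - 2)                            # MSE, 0.0949

s_b1 = sqrt(s_e2 / SS_x)                        # 0.01591
s_b0 = sqrt(s_e2 * (1 / n + xbar ** 2 / SS_x))  # 0.21783
t_crit = stats.t.ppf(0.975, n - 2)              # 2.228

print(f"b1 = {b1:.4f} +/- {t_crit * s_b1:.4f}, t = {b1 / s_b1:.2f}")
print(f"b0 = {b0:.4f} +/- {t_crit * s_b0:.4f}, t = {b0 / s_b0:.2f}")
print(f"F = {SSR / s_e2:.2f}, R-Sq = {SSR / SST:.3f}")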
Exercise 13.49 in 10th edition only: I’m afraid that you need to read the statement of the problem in the
text. From the Instructor’s Solution Manual
13.49 (a) Sears, Roebuck and Company's stock moves only 60.3% as much as the overall market and is much less volatile. LSI Logic's stock moves 142.1% more than the overall market and is considered extremely volatile. The stocks of Disney Company, Ford Motor Company, and IBM move 10.9%, 34%, and 44.9% more than the overall market, respectively, and are considered somewhat more volatile than the market.
(b) Investors can use the beta value as a measure of the volatility of a stock to assess its risk. See (b) in the problem below.
Exercise 13.43 in 8th and 9th: From the Instructor's Solution Manual
13.43 (a)
For Proctor and Gamble, the estimated value of its stock will increase by 0.53% on average
when the S & P 500 index increases by 1%.
For Ford Motor Company, the estimated value of its stock will increase by 0.77% on average
when the S & P 500 index increases by 1%.
For IBM, the estimated value of its stock will increase by 0.94% on average when the S & P
500 index increases by 1%.
For Immunex Corp, the estimated value of its stock will increase by 1.34% on average when
the S & P 500 index increases by 1%.
For LSI Logic, the estimated value of its stock will increase by 2.04% on average when the S
& P 500 index increases by 1%.
(b) A stock is riskier than the market if the estimated absolute value of the beta exceeds one. This can be used to gauge the volatility of a stock relative to how the market behaves in general.
James T. McClave, P. George Benson and Terry Sincich, Statistics for Business and Economics, 8th ed., Prentice Hall, 2001, the text used in 2003, had some more problems of this type if you want more practice.
Exercise 10.31: From the outline $s_{b_1}^2 = s_e^2\left(\frac{1}{\sum X^2 - n\bar{X}^2}\right) = \frac{s_e^2}{SS_x}$, and a confidence interval for the slope is $\beta_1 = b_1 \pm t_{\alpha/2}\, s_{b_1}$, where $df = n - 2$.
a) Given $b_1 = 31$, $s_e = 3$, $SS_x = \sum x^2 - n\bar{x}^2 = 35$ and $n = 10$.
$s_{b_1}^2 = \frac{s_e^2}{SS_x} = \frac{3^2}{35} = 0.25714$, $s_{b_1} = \sqrt{0.25714} = 0.5071$.
$df = n - 2 = 10 - 2 = 8$, so $t^{8}_{.025} = 2.306$ and $t^{8}_{.05} = 1.860$.
So we find for a 90% interval $\beta_1 = b_1 \pm t_{\alpha/2}\, s_{b_1} = 31 \pm 1.860(0.5071) = 31 \pm 0.94$, or 30.06 to 31.94.
And for a 95% interval $\beta_1 = 31 \pm 2.306(0.5071) = 31 \pm 1.17$, or 29.83 to 32.17.
b) Given $b_1 = 64$, $SSE = 1960$, $SS_x = \sum x^2 - n\bar{x}^2 = 30$ and $n = 14$.
$s_e^2 = \frac{SSE}{n-2} = \frac{1960}{12} = 163.3333$, $s_{b_1}^2 = \frac{s_e^2}{SS_x} = \frac{163.3333}{30} = 5.4444$, $s_{b_1} = \sqrt{5.4444} = 2.3333$.
$df = n - 2 = 12$, so $t^{12}_{.025} = 2.179$ and $t^{12}_{.05} = 1.782$.
So we find for a 90% interval $\beta_1 = 64 \pm 1.782(2.3333) = 64 \pm 4.16$, or 59.84 to 68.16.
And for a 95% interval $\beta_1 = 64 \pm 2.179(2.3333) = 64 \pm 5.08$, or 58.92 to 69.08.
c) Given $b_1 = 8.4$, $SSE = 146$, $SS_x = \sum x^2 - n\bar{x}^2 = 64$ and $n = 20$.
$s_e^2 = \frac{SSE}{n-2} = \frac{146}{18} = 8.1111$, $s_{b_1}^2 = \frac{s_e^2}{SS_x} = \frac{8.1111}{64} = 0.12674$, $s_{b_1} = \sqrt{0.12674} = 0.35600$.
$df = n - 2 = 18$, so $t^{18}_{.025} = 2.101$ and $t^{18}_{.05} = 1.734$.
So we find for a 90% interval $\beta_1 = 8.4 \pm 1.734(0.35600) = 8.4 \pm 0.617$.
And for a 95% interval $\beta_1 = 8.4 \pm 2.101(0.35600) = 8.4 \pm 0.748$.
Note that if the null hypothesis is $H_0: \beta_1 = 0$ (or that the slope is insignificant), the confidence intervals above show that it is false.
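A small Python sketch of the same recipe, written as a hypothetical helper function slope_ci (my own name, not something from the text) that accepts either $s_e$ or SSE, assuming scipy is available:

from math import sqrt
from scipy import stats

def slope_ci(b1, SS_x, n, s_e=None, SSE=None, conf=0.95):
    """Confidence interval for the slope from summary quantities."""
    if s_e is None:                       # derive s_e from SSE when needed
        s_e = sqrt(SSE / (n - 2))
    s_b1 = s_e / sqrt(SS_x)
    t = stats.t.ppf(0.5 + conf / 2, n - 2)
    return b1 - t * s_b1, b1 + t * s_b1

print(slope_ci(31, 35, 10, s_e=3, conf=0.90))     # a) roughly (30.06, 31.94)
print(slope_ci(64, 30, 14, SSE=1960, conf=0.95))  # b) roughly (58.92, 69.08)
print(slope_ci(8.4, 64, 20, SSE=146, conf=0.90))  # c) roughly 8.4 +/- 0.617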
Exercise 10.32: a) Scattergram. [Fitted-line plot of y against x, titled 'Exercise 10.32': Y = 0.535714 + 0.821429X, R-Squared = 0.727.]
b) Row    y    x    x²    xy    y²
   1      1    1     1     1     1
   2      3    4    16    12     9
   3      3    3     9     9     9
   4      1    2     4     2     1
   5      4    5    25    20    16
   6      7    6    36    42    49
   7      2    0     0     0     4
   Sum   21   21    91    86    89

$\sum x = 21$, $\sum y = 21$, $\sum x^2 = 91$, $\sum y^2 = 89$, $\sum xy = 86$, and $n = 7$. $\bar{x} = \frac{\sum x}{n} = \frac{21}{7} = 3$, $\bar{y} = \frac{\sum y}{n} = \frac{21}{7} = 3$.

Spare Parts: $S_{xy} = \sum xy - n\bar{x}\bar{y} = 86 - 7(3)(3) = 23$. $SS_x = \sum x^2 - n\bar{x}^2 = 91 - 7(3)^2 = 28$. $SS_y = \sum y^2 - n\bar{y}^2 = 89 - 7(3)^2 = 26$.
$b_1 = \frac{S_{xy}}{SS_x} = \frac{23}{28} = 0.82143$ and $b_0 = \bar{y} - b_1\bar{x} = 3 - 0.82143(3) = 0.53571$. So the equation is $\hat{Y} = 0.5357 + 0.82143x$.
c) To plot the regression line note that for $x = 0$, $\hat{Y} = 0.5357 + 0.82143(0) = 0.5357$, and for $x = 6$, $\hat{Y} = 0.5357 + 0.82143(6) = 5.4643$. The line is plotted above.
d) To see if $x$ contributes to $y$, test $H_{10}: \beta_1 = 0$ against $H_{11}: \beta_1 \ne 0$. Also test $H_{00}: \beta_0 = 0$ against $H_{01}: \beta_0 \ne 0$.
e) $SSE = SST - SSR = SS_y - b_1 S_{xy}$. So $s_e^2 = \frac{SSE}{n-2} = \frac{SS_y - b_1 S_{xy}}{n-2} = \frac{26 - 0.8214(23)}{5} = 1.4216$.
$s_{b_1}^2 = \frac{s_e^2}{SS_x} = \frac{1.4216}{28} = 0.0508$, $s_{b_1} = \sqrt{0.0508} = 0.2253$. From the outline
$s_{b_0}^2 = s_e^2\left(\frac{1}{n} + \frac{\bar{X}^2}{\sum X^2 - n\bar{X}^2}\right) = s_e^2\left(\frac{1}{n} + \frac{\bar{X}^2}{SS_x}\right) = 1.4216\left(\frac{1}{7} + \frac{3^2}{28}\right) = 0.6600$, $s_{b_0} = \sqrt{0.6600} = 0.8124$.
We use these in the t-tests. $df = n - 2 = 7 - 2 = 5$. Make a diagram. Show an almost normal curve with a 95% 'accept' region between $-t^{5}_{.025} = -2.571$ and $t^{5}_{.025} = 2.571$.
 H 0 :  1   10
b  10
From the Outline - To test 
use t  1
, where, as usual, 10  0 . So
s b1
 H 1 :  1   10
b   10 0.8214  0
t 1

 3.646 . Since this is in our 'reject' region, reject the null hypothesis and
s b1
0.2253
conclude that the slope is significant.
H 0 :  0   00
b   00
b   00
0.5375  0

 0.6616 .
To test 
use t  0
, where  00  0 . So t  0
.8124
H
:



s b0
s b0
00
 1 0
Since this is not in our 'reject' region, do not reject the null hypothesis and conclude that the intercept is
insignificant.
The Minitab instructions for this problem follow:

#To make the graph
MTB > %Fitline c1 c2;
SUBC> Confidence 95.0;
SUBC> Title 'Exercise 10.32'.
Executing from file: W:\WMINITAB\MACROS\Fitline.MAC
Macro is running ... please wait

#To do the regression
MTB > regress c1 on 1 c2

Regression Analysis

The regression equation is
y = 0.536 + 0.821 x

Predictor     Coef      Stdev     t-ratio    p
Constant      0.5357    0.8124    0.66       0.539
x             0.8214    0.2253    3.65       0.015

s = 1.192     R-sq = 72.7%     R-sq(adj) = 67.2%

Analysis of Variance
SOURCE        DF    SS        MS        F        p
Regression     1    18.893    18.893    13.29    0.015
Error          5     7.107     1.421
Total          6    26.000
Note that the computer got the same coefficients and values of t that we did.
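For comparison, a short numpy/scipy sketch that recomputes these quantities from the raw data; this is only an illustrative check in Python, not the Minitab macro used in the course.

import numpy as np
from scipy import stats

x = np.array([1, 4, 3, 2, 5, 6, 0], dtype=float)
y = np.array([1, 3, 3, 1, 4, 7, 2], dtype=float)
n = len(x)

S_xy = np.sum(x * y) - n * x.mean() * y.mean()          # 23
SS_x = np.sum(x * x) - n * x.mean() ** 2                # 28
SS_y = np.sum(y * y) - n * y.mean() ** 2                # 26

b1 = S_xy / SS_x                                        # 0.8214
b0 = y.mean() - b1 * x.mean()                           # 0.5357
s_e2 = (SS_y - b1 * S_xy) / (n - 2)                     # 1.4216
s_b1 = np.sqrt(s_e2 / SS_x)                             # 0.2253
s_b0 = np.sqrt(s_e2 * (1 / n + x.mean() ** 2 / SS_x))   # 0.8124

t_crit = stats.t.ppf(0.975, n - 2)                      # 2.571 with 5 df
print(f"slope:     t = {b1 / s_b1:.3f} vs +/-{t_crit:.3f}")
print(f"intercept: t = {b0 / s_b0:.3f} vs +/-{t_crit:.3f}")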
Exercise 10.33: Data comes from the previous problem.
$df = n - 2 = 5$, so $t^{5}_{.10} = 1.476$ and $t^{5}_{.01} = 3.365$.
So we find, for an 80% interval for the slope, $\beta_1 = b_1 \pm t_{\alpha/2}\, s_{b_1} = 0.821 \pm 1.476(0.2253) = 0.82 \pm 0.33$, or 0.49 to 1.15, and for the intercept $\beta_0 = b_0 \pm t_{\alpha/2}\, s_{b_0} = 0.5357 \pm 1.476(0.8124) = 0.54 \pm 1.20$, or -0.66 to 1.74 (this interval includes zero).
And, for a 98% interval for the slope, $\beta_1 = b_1 \pm t_{\alpha/2}\, s_{b_1} = 0.821 \pm 3.365(0.2253) = 0.82 \pm 0.76$, or 0.06 to 1.58, and for the intercept $\beta_0 = b_0 \pm t_{\alpha/2}\, s_{b_0} = 0.5357 \pm 3.365(0.8124) = 0.54 \pm 2.73$, or -2.19 to 3.27 (this interval includes zero).
Exercise 10.42†: (10.40 in the old book) a) Scattergram. [Fitted-line plot of y against x, titled 'Exercise 10.42': Y = 44.1305 + 0.236617X, R-Squared = 0.086.]
b) Row    y     x      x²      xy      y²
    1     40    12     144     480    1600
    2     73    71    5041    5183    5329
    3     95    70    4900    6650    9025
    4     60    81    6561    4860    3600
    5     81    43    1849    3483    6561
    6     27    50    2500    1350     729
    7     53    42    1764    2226    2809
    8     66    18     324    1188    4356
    9     25    35    1225     875     625
   10     63    82    6724    5166    3969
   11     70    20     400    1400    4900
   12     47    81    6561    3807    2209
   13     80    40    1600    3200    6400
   14     51    33    1089    1683    2601
   15     32    45    2025    1440    1024
   16     50    10     100     500    2500
   17     52    65    4225    3380    2704
   18     30    20     400     600     900
   19     42    21     441     882    1764
  Sum   1037   839   47873   48353   63605

$n = 19$, $\bar{x} = \frac{\sum x}{n} = \frac{839}{19} = 44.1579$, $\bar{y} = \frac{\sum y}{n} = \frac{1037}{19} = 54.5789$.
19
Spare Parts:
S xy 
xy  nx y  2561 .26 .
SS x
SS y

x
y
S xy
2
 nx 2  10824 .5 .
2
 ny 2  7006 .63  SST
2561 .26
 0.2366
SS x 10824 .5
b0  y  b1 x
 54.5789  0.2366 44.1579   44.131 . Notice
that this is almost the same as the mean of y. So

the equation is Y  44 .1  0.237 x . As in
Exercise 10.15
SS xy 2
2561 .26 2  .086 . On
R2 

SS x SS y 10824 .57006 .63 
a zero to one scale, this is pathetic!
b1 



The Minitab routine below was used to calculate the solution to this problem.

#252sols                  Fakes simple OLS regression by hand
regress c1 on 1 c2        #Put y in c1, x in c2
let c3=c2*c2              #c3 is x squared, c4 is xy
let c4=c1*c2
let c5=c1*c1              #c5 is y squared
let k1=sum(c1)            #k1 is sum of y
let k2=sum(c2)            #k2 is sum of x
let k3=sum(c3)            #k3 is sum of x squared
let k4=sum(c4)            #k4 is sum of xy
let k5=sum(c5)            #k5 is sum of y squared
print c1-c5
let k17=count(c1)         #k17 is n
let k21=k2/k17            #k21 is mean of x
let k22=k1/k17            #k22 is mean of y
let k18=k21*k21*k17
let k18=k3-k18            #k18 is SSx
let k19=k22*k22*k17
let k19=k5-k19            #k19 is SSy
let k20=k21*k22*k17
let k20=k4-k20            #k20 is SSxy
print k1-k5
print k17-k22
end
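A rough Python equivalent of that Minitab routine, carried through to the coefficient tests for this data set. It is a sketch assuming scipy is available; the summary sums are simply those from the table above.

from math import sqrt
from scipy import stats

# Summary sums from the table above (y is the success index, x the interactions)
n = 19
sum_y, sum_x = 1037.0, 839.0
sum_x2, sum_xy, sum_y2 = 47873.0, 48353.0, 63605.0

xbar, ybar = sum_x / n, sum_y / n
SS_x = sum_x2 - n * xbar ** 2                   # 10824.5
SS_y = sum_y2 - n * ybar ** 2                   # 7006.63 = SST
S_xy = sum_xy - n * xbar * ybar                 # 2561.26

b1 = S_xy / SS_x                                # 0.2366
b0 = ybar - b1 * xbar                           # 44.13
s_e2 = (SS_y - b1 * S_xy) / (n - 2)             # 376.5, so s_e is about 19.40
s_b1 = sqrt(s_e2 / SS_x)                        # 0.1865
s_b0 = sqrt(s_e2 * (1 / n + xbar ** 2 / SS_x))  # 9.36

t_crit = stats.t.ppf(0.975, n - 2)              # 2.110 with 17 df
print(f"b1 = {b1:.4f}, t = {b1 / s_b1:.2f} vs +/-{t_crit:.3f}")
print(f"b0 = {b0:.2f},  t = {b0 / s_b0:.2f}")
print(f"R-sq = {S_xy ** 2 / (SS_x * SS_y):.3f}, F = {b1 * S_xy / s_e2:.2f}")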
c) $SSE = SST - SSR = SS_y - b_1 S_{xy}$. So $s_e = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{SS_y - b_1 S_{xy}}{n-2}} = \sqrt{\frac{7006.63 - 0.2366(2561.26)}{17}} = \sqrt{376.5080} = 19.40$. About 95% of managerial success indices should lie within two standard deviations of the regression line. Twice this standard deviation is about 38.80.
d) The low value of $R^2$ should make us suspicious that the regression explains very little. The scattergram is extremely diffuse, leading us to feel that there is not much of a pattern to the data.
e) There are several ways to go about formally checking whether these interactions affect the index of managerial success:
(i) If we use the method for checking the significance of the slope used in the last problem, to test $H_0: \beta_1 = 0$ against $H_1: \beta_1 \ne 0$ we use $t = \frac{b_1 - 0}{s_{b_1}}$, where $s_{b_1}^2 = \frac{s_e^2}{SS_x} = \frac{376.5080}{10824.5} = 0.0348$ and $s_{b_1} = 0.1865$, so $t = \frac{0.2366 - 0}{0.1865} = 1.27$.
$df = n - 2 = 19 - 2 = 17$. Make a diagram. Show an almost normal curve with a 95% 'accept' region between $-t^{17}_{.025} = -2.110$ and $t^{17}_{.025} = 2.110$. Since 1.27 is between these two values, we cannot reject the null hypothesis and must conclude that the slope is not significant.
1

2 

H :   0
X2
  s e2  1  X  and that to test  0 0
Note: Recall that s b20  s e2  
use
H1 :  0  0
n SS x 
n
X 2  nX 2 





b0  0
t
 4.71 . Since this is in our 'reject' region, reject the null hypothesis and conclude that the
s b0

intercept is significant.
(ii) Run the regression on the computer. The Minitab results are

MTB > regress c1 on 1 c2

Regression Analysis

The regression equation is
y = 44.1 + 0.237 x

Predictor     Coef      Stdev     t-ratio    p
Constant      44.130    9.362     4.71       0.000
x             0.2366    0.1865    1.27       0.222

s = 19.40     R-sq = 8.6%     R-sq(adj) = 3.3%

Analysis of Variance
SOURCE        DF    SS        MS       F       p
Regression     1     606.0    606.0    1.61    0.222
Error         17    6400.6    376.5
Total         18    7006.6
The p-value for the slope is 0.222. Since this is above our 5% significance level, do not reject the null hypothesis that the slope is zero; the slope is insignificant.
(iii) The ANOVA is another version of the test in (ii). You already know that SST is 7006.6. You can get the regression sum of squares, SSR, by computing $SSR = b_1 S_{xy} = 0.2366(2561.26) = 606.0$ or, because $R^2 = \frac{SSR}{SST}$, $SSR = R^2(SST)$. You can get $SSE = SST - SSR$. The degrees of freedom for a simple regression sum of squares, SSR, is 1, and for the SST it is $n - 1 = 18$, leaving $df = n - 2 = 19 - 2 = 17$ for SSE. The F value of 1.61 has 1 and 17 degrees of freedom and is equal to the square of the t that we computed in (i). If you look up $F^{(1,17)}_{.05} = 4.45$, you will find that it is equal to $\left[t^{17}_{.025}\right]^2 = 2.110^2$. Notice that the p-value is again 22.2%. In other words, in simple regression, the F test duplicates the t-test on the slope.
17 
f) You already know that t .025
 2.110 . So we find, for a 95% interval for the slope,
1  b1  t 2 sb1  0.2366  2.110 0.1865   0.24  0.39 or -0.15 to 0.63 (Notice that this interval includes
zero so that the slope is not significant.) . For the intercept
 0  b0  t 2 sb0  44.13  2.110 0.9.362   44.13  19.75 or 24.4 to 63.9 (This interval does not include
zero.)..