252z9932

4/21/98 252z9932
4 a. A bank wishes to compare deposit size at 4 branches. The result of the analysis of variance appears
below.
Sample means:
Branch 1: 4.87
Branch 2: 2.20
Branch 3: 4.31
Branch 4: 1.46
Source      SS      DF    MS       F
Between     56.33    3    18.78    67.56
Within       6.67   24     0.278
Total       63.00   27
(i) Do mean deposits differ between banks? Why? (2)
(ii) Assuming that this table is based on four random samples of equal size, how many numbers
are in each column? (1)
(iii) Do a confidence interval for the mean deposit in branch 1. (2)
(iv) Do a Scheffe confidence interval for the difference between deposits in branches 2 and 4 and
explain why this type of interval might be preferred to one using t. (3)
b. In a study of inflation, 3 company sizes (Factor A), 5 degrees of industrial concentration (Factor B) and
two types of product (Factor C) are distinguished. An ANOVA table was generated using the average price
rise over 5 years for a random sample of 4 firms in each category. Generate an ANOVA table showing all
possible interactions, using the following data: SSA = 22, SSB = 227, SSC = 32, SSAB = 54, SSABC =
136, SST = 1219. Using a 1% significance level, which of the differences and interactions are significant?
(8) Correction! SSAC = 10, SSBC = 98.
Solution: a)
(i) This is a simple ANOVA. Use $\alpha = .05$. $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$. Since $F_{.05}^{(3,24)} = 3.01$ is less than
67.56, reject $H_0$.
(ii) Since $n - 1 = 27$, $n = 28$. If we divide that by 4, we get 7.
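The arithmetic behind (i) and (ii) can be rechecked with a short sketch. The sums of squares and degrees of freedom are copied from the ANOVA table; the critical value 3.01 is the table value quoted above, not something the code computes, and the variable names are only illustrative.

```python
# Values copied from the ANOVA table in the problem statement.
ss_between, df_between = 56.33, 3
ss_within, df_within = 6.67, 24

ms_between = ss_between / df_between   # mean square between branches
ms_within = ss_within / df_within      # mean square within branches (MSW)
f_stat = ms_between / ms_within

f_crit = 3.01                          # F(.05; 3, 24), table value quoted in the solution
reject_h0 = f_stat > f_crit

n_total = df_between + df_within + 1   # total DF is n - 1
m = df_between + 1                     # number of branches (columns)
n_per_branch = n_total // m            # equal sample sizes assumed, as in (ii)

print(round(f_stat, 2), reject_h0, n_per_branch)
```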
(iii) As explained in class, $\mu_1 = \bar{x}_{\cdot 1} \pm t_{\alpha/2}^{(n-m)} \sqrt{\dfrac{MSW}{n_1}}$, where MSW and the degrees of freedom come
from the within line of the ANOVA. If we use $\alpha = .05$, $t_{.025}^{(24)} = 2.064$ and
$\mu_1 = 4.87 \pm 2.064\sqrt{\dfrac{0.278}{7}} = 4.87 \pm 0.41$.
(iv) From the outline, $\mu_2 - \mu_4 = (\bar{x}_2 - \bar{x}_4) \pm \sqrt{(m-1)F_{\alpha}^{(m-1,\,n-m)}}\,\sqrt{MSW\left(\dfrac{1}{n_2} + \dfrac{1}{n_4}\right)}$, where the degrees of
freedom are the same as those used in the F-test in the ANOVA ($m$ is the number of columns), so that
we use $F_{.05}^{(3,24)} = 3.01$. So
$\mu_2 - \mu_4 = (2.20 - 1.46) \pm \sqrt{3(3.01)}\,\sqrt{0.278\left(\dfrac{1}{7} + \dfrac{1}{7}\right)} = 0.74 \pm 0.85$.
This interval has a 95% confidence level jointly with any other Scheffe intervals you might do. The intervals
using t have a 95% confidence level only one at a time.
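As a rough check of (iii) and (iv), the sketch below computes both half-widths from the quantities in the ANOVA table. The critical values are the table values quoted in the solution. The last line applies an ordinary t interval to the same difference purely for comparison, which is an addition for illustration and not part of the original answer.

```python
import math

msw = 0.278            # mean square within, from the ANOVA table
n_per_branch = 7       # from part (ii)
m = 4                  # number of branches (columns)
t_crit = 2.064         # t(.025; 24), table value quoted in the solution
f_crit = 3.01          # F(.05; 3, 24), table value quoted in the solution

# (iii) half-width of the interval for the mean of branch 1
half_t = t_crit * math.sqrt(msw / n_per_branch)

# (iv) Scheffe half-width for branch 2 minus branch 4
diff = 2.20 - 1.46
se_diff = math.sqrt(msw * (1 / n_per_branch + 1 / n_per_branch))
half_scheffe = math.sqrt((m - 1) * f_crit) * se_diff

# Same difference with an ordinary t interval (illustration only)
half_t_diff = t_crit * se_diff

print(round(half_t, 2), round(diff, 2), round(half_scheffe, 2), round(half_t_diff, 2))
```

The Scheffe half-width is the larger of the two for the difference, which is the price of a confidence level that holds for all contrasts at once.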
b) There are $3 \times 5 \times 2 = 30$ groups with 4 observations in each group, so $n = 30 \times 4 = 120$.
's' means 'significant difference' ($H_0$ rejected), 'ns' means 'no significant difference' ($H_0$ not rejected).
Source             SS     DF    MS        F       F.01
Factor A            22      2   11.00     1.547   F(2,90) = 4.85   ns
Factor B           227      4   56.75     7.980   F(4,90) = 3.54   s
Factor C            32      1   32.00     4.500   F(1,90) = 6.93   ns
Interaction AB      54      8    6.75     0.949   F(8,90) = 2.72   ns
Interaction AC      10      2    5.00     0.703   F(2,90) = 4.85   ns
Interaction BC      98      4   24.50     3.445   F(4,90) = 3.54   ns
Interaction ABC    136      8   17.00     2.391   F(8,90) = 2.72   ns
Error (Within)     640     90    7.1111
Total             1219    119
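The table can be regenerated from the given sums of squares with a sketch like the following. The degrees of freedom follow the usual rule for a three-way layout with replication (levels minus one, multiplied together for interactions). The 1% critical values are the ones quoted in the table and are keyed here by numerator degrees of freedom; the dictionary names are only illustrative.

```python
a, b, c, r = 3, 5, 2, 4   # levels of A, B, C and firms per cell
ss = {"A": 22, "B": 227, "C": 32, "AB": 54, "AC": 10, "BC": 98, "ABC": 136}
sst = 1219

df = {
    "A": a - 1, "B": b - 1, "C": c - 1,
    "AB": (a - 1) * (b - 1),
    "AC": (a - 1) * (c - 1),
    "BC": (b - 1) * (c - 1),
    "ABC": (a - 1) * (b - 1) * (c - 1),
}

n = a * b * c * r
df_error = n - a * b * c          # n minus the number of cells
sse = sst - sum(ss.values())      # error SS is what is left of the total
mse = sse / df_error

# 1% critical values quoted in the solution, keyed by numerator DF (denominator DF = 90)
f_crit_01 = {1: 6.93, 2: 4.85, 4: 3.54, 8: 2.72}

significant = {}
for name, value in ss.items():
    f_stat = (value / df[name]) / mse
    significant[name] = f_stat > f_crit_01[df[name]]
    print(name, value, df[name], round(value / df[name], 2), round(f_stat, 3), significant[name])
```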
5. An airline wishes to explain the number of passengers it carries over 10 months as a consequence of
advertising in the previous month. It collects data as follows:
Observation   Advertising ($1000)   Passengers (1000s)    xy
     1                10                    16            160
     2                12                    18            216
     3                 8                    14            112
     4                17                    24            408
     5                10                    17            170
     6                15                    22            330
     7                10                    15            150
     8                11                    21            231
     9                19                    25            475
    10                10                    18            180
                                             Sum of xy = 2432
(The xy column is added here. You were given at least 8 examples of this calculation!)

For your convenience the following values are given:
$\sum x = 122$, $\sum x^2 = 1604$, $\sum y = 190$, $\sum y^2 = 3740$, $n = 10$.
a. Compute the regression equation $Y = b_0 + b_1 x$ to predict the number of passengers. (6)
b. Compute $R^2$. (4)
c. Compute $s_e$. (3)
Solution:
Spare Parts Computation:
$\bar{x} = \dfrac{\sum x}{n} = \dfrac{122}{10} = 12.2$
$\bar{y} = \dfrac{\sum y}{n} = \dfrac{190}{10} = 19.0$
$SS_{xx} = \sum x^2 - n\bar{x}^2 = 1604 - 10(12.2)^2 = 115.6$
$S_{xy} = \sum xy - n\bar{x}\bar{y} = 2432 - 10(12.2)(19.0) = 114.0$
$SS_{yy} = \sum y^2 - n\bar{y}^2 = 3740 - 10(19.0)^2 = 130.0 = TSS$
a. $b_1 = \dfrac{S_{xy}}{SS_{xx}} = \dfrac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2} = \dfrac{114.0}{115.6} = 0.9862$
$b_0 = \bar{y} - b_1\bar{x} = 19.0 - 0.9862(12.2) = 6.9689$
$Y = b_0 + b_1 x$ becomes $\hat{Y} = 6.9689 + 0.9862x$.
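The spare parts and the coefficients can be reproduced directly from the ten observations. This is a minimal sketch using the computational formulas above; the list names are illustrative.

```python
x = [10, 12, 8, 17, 10, 15, 10, 11, 19, 10]   # advertising ($1000)
y = [16, 18, 14, 24, 17, 22, 15, 21, 25, 18]   # passengers (1000s)
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Spare parts: corrected sums of squares and cross products
ss_xx = sum(xi * xi for xi in x) - n * x_bar ** 2
s_xy = sum(xi * yi for xi, yi in zip(x, y)) - n * x_bar * y_bar
ss_yy = sum(yi * yi for yi in y) - n * y_bar ** 2

b1 = s_xy / ss_xx            # slope
b0 = y_bar - b1 * x_bar      # intercept

print(round(ss_xx, 1), round(s_xy, 1), round(ss_yy, 1), round(b1, 4), round(b0, 4))
```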
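b. $RSS = b_1 S_{xy} = b_1\left(\sum xy - n\bar{x}\bar{y}\right) = 0.9862(114.0) = 112.4221$
$R^2 = \dfrac{RSS}{TSS} = \dfrac{112.4221}{130.0} = 0.8648$ or
$R^2 = \dfrac{S_{xy}^2}{SS_{xx}\,SS_{yy}} = \dfrac{\left(\sum xy - n\bar{x}\bar{y}\right)^2}{\left(\sum x^2 - n\bar{x}^2\right)\left(\sum y^2 - n\bar{y}^2\right)} = \dfrac{(114.0)^2}{(115.6)(130.0)} = .8648$ ($0 \le R^2 \le 1$ always!)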
c. $ESS = TSS - RSS = 130.0 - 112.4221 = 17.5779$
$s_e^2 = \dfrac{ESS}{n-2} = \dfrac{17.5779}{8} = 2.1972$ or
$s_e^2 = \dfrac{SS_{yy} - b_1 S_{xy}}{n-2} = \dfrac{\sum y^2 - n\bar{y}^2 - b_1\left(\sum xy - n\bar{x}\bar{y}\right)}{n-2} = \dfrac{130.0 - 0.9862(114.0)}{8} = 2.1972$
or $s_e^2 = \dfrac{(1 - R^2)\,TSS}{n-2} = \dfrac{(1 - R^2)\left(\sum y^2 - n\bar{y}^2\right)}{n-2}$
or $s_e^2 = \dfrac{\sum y^2 - n\bar{y}^2 - b_1^2\left(\sum x^2 - n\bar{x}^2\right)}{n-2}$
$s_e = \sqrt{2.1972} = 1.4823$
($s_e^2$ is always positive!)
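Parts b and c can be checked from the spare parts alone. The sketch below starts from $SS_{xx}$, $S_{xy}$ and $SS_{yy}$ as computed in the solution and confirms that the alternative formulas agree; the `_alt` names are only illustrative.

```python
import math

n = 10
ss_xx, s_xy, ss_yy = 115.6, 114.0, 130.0   # spare parts from the solution

b1 = s_xy / ss_xx
tss = ss_yy
rss = b1 * s_xy                 # regression (explained) sum of squares
ess = tss - rss                 # error sum of squares

r2 = rss / tss
r2_alt = s_xy ** 2 / (ss_xx * ss_yy)

se2 = ess / (n - 2)
se2_alt = (1 - r2) * tss / (n - 2)
se2_alt2 = (ss_yy - b1 ** 2 * ss_xx) / (n - 2)
se = math.sqrt(se2)

print(round(rss, 4), round(r2, 4), round(se2, 4), round(se, 4))
```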
6. Continuing the previous problem. ($\alpha = .02$)
a. Compute $s_{b_0}$ and do a significance test on $b_0$. (4)
b. Do a confidence interval for $b_1$. (3)
c. Do a prediction interval for passengers when the expenditure on advertising is $10 (thousand). (5)
d. Using your SST etc., put together the ANOVA table ($\alpha = .05$). (6)
a. $s_{b_0}^2 = s_e^2\left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{SS_{xx}}\right) = s_e^2\left(\dfrac{1}{n} + \dfrac{\bar{x}^2}{\sum x^2 - n\bar{x}^2}\right) = 2.1972\left(\dfrac{1}{10} + \dfrac{12.2^2}{115.6}\right) = 3.0488$
$s_{b_0} = \sqrt{3.0488} = 1.7461$
$b_0 = 6.9689$, $H_0: \beta_0 = 0$, $H_1: \beta_0 \ne 0$
$t = \dfrac{b_0 - \beta_{00}}{s_{b_0}} = \dfrac{b_0 - 0}{s_{b_0}} = \dfrac{6.9689}{1.7461} = 3.991$. Since this is not between
$\pm t_{\alpha/2}^{(n-2)} = \pm t_{.01}^{(8)} = \pm 2.896$, reject $H_0$ and conclude that $\beta_0$ is significant.
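A quick numerical check of part a, assuming the values carried over from problem 5. The critical value 2.896 is the table value quoted in the solution.

```python
import math

n, x_bar = 10, 12.2
ss_xx = 115.6
se2 = 2.1972           # s_e^2 from problem 5c
b0 = 6.9689            # intercept from problem 5a
t_crit = 2.896         # t(.01; 8), table value quoted in the solution

var_b0 = se2 * (1 / n + x_bar ** 2 / ss_xx)
s_b0 = math.sqrt(var_b0)

t_stat = (b0 - 0) / s_b0            # H0: beta_0 = 0
reject_h0 = abs(t_stat) > t_crit    # two-sided test at alpha = .02

print(round(var_b0, 4), round(s_b0, 4), round(t_stat, 3), reject_h0)
```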
b. $s_{b_1}^2 = \dfrac{s_e^2}{SS_{xx}} = \dfrac{s_e^2}{\sum x^2 - n\bar{x}^2} = \dfrac{2.1972}{115.6} = 0.01900$
$s_{b_1} = \sqrt{0.01900} = 0.1379$
so $\beta_1 = b_1 \pm t_{\alpha/2}^{(n-2)}\, s_{b_1} = 0.9862 \pm 2.896(0.1379) = 0.986 \pm 0.399$
c. If $\hat{Y} = 6.9689 + 0.9862x$ and $x_0 = 10$, then $\hat{Y}_0 = 6.9689 + 0.9862(10) = 16.8309$
$s_{Y_0}^2 = s_e^2\left(\dfrac{1}{n} + \dfrac{(x_0 - \bar{x})^2}{\sum x^2 - n\bar{x}^2} + 1\right) = 2.1972\left(\dfrac{1}{10} + \dfrac{(10 - 12.2)^2}{115.6} + 1\right) = 2.5089$
$s_{Y_0} = \sqrt{2.5089} = 1.5840$.
So $Y_0 = \hat{Y}_0 \pm t\, s_{Y_0} = 16.8309 \pm 2.896(1.5840) = 16.8 \pm 4.6$.
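The prediction interval in part c can be sketched the same way. The "+ 1" inside the parentheses is what distinguishes a prediction interval for a single month from an interval for the mean response; the variable names here are illustrative.

```python
import math

n, x_bar, ss_xx = 10, 12.2, 115.6
se2 = 2.1972                 # s_e^2 from problem 5c
b0, b1 = 6.9689, 0.9862      # coefficients from problem 5a
t_crit = 2.896               # t(.01; 8), table value quoted in the solution
x0 = 10                      # advertising of $10 thousand

y0_hat = b0 + b1 * x0
var_pred = se2 * (1 / n + (x0 - x_bar) ** 2 / ss_xx + 1)
s_pred = math.sqrt(var_pred)
half_width = t_crit * s_pred

print(round(y0_hat, 4), round(var_pred, 4), round(s_pred, 4), round(half_width, 1))
```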
d. From the previous page $RSS = 112.4221$, $TSS = 130.0$ and $ESS = 17.5779$. $H_0$ is that there is no
relation between Y and X.
Source            SS          DF    MS          F        F.05
Regression        112.4221     1    112.4221    51.165   F(1,8) = 5.32   s
Error (Within)     17.5779     8      2.1972
Total             130.0000     9
Since the table F is less than the computed F, reject $H_0$.
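The entries of this table follow from the three sums of squares. A minimal sketch, with the 5% critical value taken from the F table as quoted above:

```python
n = 10
rss, ess, tss = 112.4221, 17.5779, 130.0   # from problem 5

df_reg = 1             # one independent variable
df_err = n - 2
df_total = n - 1

msr = rss / df_reg     # regression mean square
mse = ess / df_err     # error mean square, equal to s_e^2
f_stat = msr / mse

f_crit = 5.32          # F(.05; 1, 8), table value quoted in the solution
reject_h0 = f_stat > f_crit

print(round(mse, 4), round(f_stat, 3), reject_h0)
```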