4/21/98 252z9931

4 a. A bank wishes to compare deposit size at 4 branches. The result of the analysis of variance appears
below.
Sample means:
Branch 1: 4.87
Branch 2: 2.29
Branch 3: 4.31
Branch 4: 1.46
Source     SS      DF   MS      F
Between    55.33    3   18.44   78.14
Within      5.67   24   0.236
Total      61.00   27
(i) Do mean deposits differ between banks? Why? (2)
(ii) Assuming that this table is based on four random samples of equal size, how many numbers
are in each column? (1)
(iii) Do a confidence interval for the mean deposit in branch 2. (2)
(iv) Do a Scheffe confidence interval for the difference between deposits in branches 1 and 4 and
explain why this type of interval might be preferred to one using t. (3)
b. In a study of inflation, 3 company sizes (Factor A) , 5 degrees of industrial concentration (Factor B) and
two types of product (Factor C) are distinguished. An ANOVA table was generated using the average price
rise over 5 years for a random sample of 4 firms in each category. Generate an ANOVA table showing all
possible interactions, using the following data. SSA = 22, SSB = 227, SSC = 32, SSAB = 54, SSABC =
136, SST = 1219. Using a 1% significance level, which of the differences and interactions are significant?
(8) Correction! SSAC = 10, SSBC = 98 !!!!!
Solution: a)
(i) This is a simple ANOVA. Use $\alpha = .05$. $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$. Since $F_{.05}^{(3,24)} = 3.01$ is less than the computed $F = 78.14$, reject $H_0$.
(ii) Since $n - 1 = 27$, $n = 28$. If we divide that by 4, we get 7.
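Parts (i) and (ii) can be checked with a few lines of Python. This is a minimal sketch, not part of the original solution: the variable names are illustrative, and the critical value 3.01 is copied from the F table value quoted above rather than computed. Because it divides the unrounded sums of squares, the F it prints differs slightly from the table's 78.14, which was formed from rounded mean squares.

```python
# One-way ANOVA summary for problem 4a.
# Sums of squares and degrees of freedom are copied from the ANOVA table.
ss_between, df_between = 55.33, 3
ss_within, df_within = 5.67, 24

ms_between = ss_between / df_between   # mean square between branches
ms_within = ss_within / df_within      # mean square within branches (MSW)
f_stat = ms_between / ms_within

F_CRIT_05 = 3.01   # assumed table value F.05(3, 24), quoted in the solution

n_total = df_between + df_within + 1   # total DF is n - 1
n_groups = df_between + 1              # between DF is m - 1
per_branch = n_total // n_groups       # equal sample sizes assumed

print(f"F = {f_stat:.2f}, reject H0: {f_stat > F_CRIT_05}")
print(f"n = {n_total}, observations per branch = {per_branch}")
```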
(iii) As explained in class, $\mu_2 = \bar{x}_{.2} \pm t_{\alpha/2}^{(n-m)} \sqrt{\frac{MSW}{n_2}}$, where MSW and the degrees of freedom come from the within line of the ANOVA. If we use $\alpha = .05$, $t_{.025}^{(24)} = 2.064$ and
$\mu_2 = 2.29 \pm 2.064 \sqrt{\frac{0.236}{7}} = 2.29 \pm 0.38$.
(iv) From the outline, $\mu_1 - \mu_4 = (\bar{x}_1 - \bar{x}_4) \pm \sqrt{(m-1) F_{\alpha}^{(m-1,\,n-m)}} \sqrt{MSW \left( \frac{1}{n_1} + \frac{1}{n_4} \right)}$, where the degrees of freedom are the same as those used in the F-test in the ANOVA ($m$ is the number of columns), so that we use $F_{.05}^{(3,24)} = 3.01$. So
$\mu_1 - \mu_4 = (4.87 - 1.46) \pm \sqrt{3(3.01)} \sqrt{0.236 \left( \frac{1}{7} + \frac{1}{7} \right)} = 3.41 \pm 0.78$.
This has a 95% confidence level together with any other intervals you might do. The intervals using t have a 95% confidence level alone.
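As a rough check on (iv), the Scheffe half-width can be computed next to the ordinary t half-width for the same pair of branches. This is a sketch under stated assumptions: the function name is hypothetical, and both critical values (3.01 and 2.064) are table values copied from the solution.

```python
import math

def scheffe_interval(mean_i, mean_j, n_i, n_j, msw, m, f_crit):
    """Return (difference, half_width) for a Scheffe interval on two means."""
    diff = mean_i - mean_j
    std_err = math.sqrt(msw * (1 / n_i + 1 / n_j))
    half_width = math.sqrt((m - 1) * f_crit) * std_err
    return diff, half_width

# Branches 1 and 4, 7 observations each, m = 4 columns.
F_CRIT = 3.01    # assumed table value F.05(3, 24)
T_CRIT = 2.064   # assumed table value t.025 with 24 degrees of freedom
diff, scheffe_half = scheffe_interval(4.87, 1.46, 7, 7, 0.236, 4, F_CRIT)
t_half = T_CRIT * math.sqrt(0.236 * (1 / 7 + 1 / 7))

print(f"Scheffe: {diff:.2f} +/- {scheffe_half:.2f}")
print(f"single t interval half-width: {t_half:.2f}")
```

The Scheffe interval is wider than the single t interval, which is the price of holding the 95% level over all comparisons at once.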
b) There are $3 \times 5 \times 2 = 30$ groups with 4 observations in each group, so $n = 30 \times 4 = 120$. On some of the
first exams taken, SSAC and SSBC were missing. If you assumed that these were zero, SSW was 748 (that is, 1219 - 22 - 227 - 32 - 54 - 136) with degrees of freedom as below. If you assumed that these did not exist at all, within degrees of freedom should be 96.
‘s’ means ‘significant difference’ ($H_0$ rejected), ‘ns’ means ‘no significant difference’ ($H_0$ not rejected).
Source             SS     DF    MS        F       F.01
Factor A            22     2    11.00     1.547   F(2,90) = 4.85  ns
Factor B           227     4    56.75     7.980   F(4,90) = 3.54  s
Factor C            32     1    32.00     4.500   F(1,90) = 6.93  ns
Interaction AB      54     8     6.75     0.949   F(8,90) = 2.72  ns
Interaction AC      10     2     5.00     0.703   F(2,90) = 4.85  ns
Interaction BC      98     4    24.50     3.445   F(4,90) = 3.54  ns
Interaction ABC    136     8    17.00     2.391   F(8,90) = 2.72  ns
Error (Within)     640    90     7.1111
Total             1219   119
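The three-factor table can be rebuilt from the given sums of squares. The following is a minimal sketch rather than the method used on the exam: the dictionary layout and names are assumptions, and the 1% critical values are copied from the solution table instead of being computed.

```python
# Sums of squares from problem 4b, including the corrected SSAC and SSBC.
a, b, c, reps = 3, 5, 2, 4   # factor levels and firms per category
ss = {"A": 22, "B": 227, "C": 32, "AB": 54, "AC": 10, "BC": 98, "ABC": 136}
ss_total = 1219

df = {
    "A": a - 1, "B": b - 1, "C": c - 1,
    "AB": (a - 1) * (b - 1), "AC": (a - 1) * (c - 1),
    "BC": (b - 1) * (c - 1), "ABC": (a - 1) * (b - 1) * (c - 1),
}
n = a * b * c * reps
ss_within = ss_total - sum(ss.values())   # error sum of squares
df_within = (n - 1) - sum(df.values())    # error degrees of freedom
ms_within = ss_within / df_within

# Assumed 1% table values for 90 denominator DF, keyed by numerator DF.
f_crit = {1: 6.93, 2: 4.85, 4: 3.54, 8: 2.72}

results = {}
for name in ss:
    f_stat = (ss[name] / df[name]) / ms_within
    significant = f_stat > f_crit[df[name]]
    results[name] = (f_stat, significant)
    print(f"{name:4s} F = {f_stat:.3f} {'s' if significant else 'ns'}")
```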
5. An airline wishes to explain the number of passengers it carries over 10 months as a consequence of
advertising in the previous month. It collects data as follows:
Observation   Advertising   Passengers   xy
              ($1000)       (1000s)
 1            10            15           150
 2            12            17           204
 3             8            13           104
 4            17            23           391
 5            10            16           160
 6            15            21           315
 7            10            14           140
 8            11            20           220
 9            19            24           456
10            10            17           170
                            $\sum xy =$  2310
(The xy column is added here. You were given at least 8 examples of this computation.)

For your convenience the following values are given: $\sum x = 122$, $\sum x^2 = 1604$, $\sum y = 180$, $\sum y^2 = 3370$, $n = 10$.
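The given sums can be verified directly from the ten observations. This short sketch only recomputes them; the list names are illustrative.

```python
# Raw data from problem 5.
advertising = [10, 12, 8, 17, 10, 15, 10, 11, 19, 10]   # x, in $1000
passengers = [15, 17, 13, 23, 16, 21, 14, 20, 24, 17]   # y, in 1000s

n = len(advertising)
sum_x = sum(advertising)
sum_y = sum(passengers)
sum_x2 = sum(x * x for x in advertising)
sum_y2 = sum(y * y for y in passengers)
sum_xy = sum(x * y for x, y in zip(advertising, passengers))

print(n, sum_x, sum_x2, sum_y, sum_y2, sum_xy)
```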




a. Compute the regression equation $Y = b_0 + b_1 x$ to predict the number of passengers. (6)
b. Compute $R^2$. (4)
c. Compute $s_e$. (3)
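Solution:
Spare Parts Computation:
$\bar{x} = \frac{\sum x}{n} = \frac{122}{10} = 12.2$
$\bar{y} = \frac{\sum y}{n} = \frac{180}{10} = 18.0$
$S_{xy} = \sum xy - n\bar{x}\bar{y} = 2310 - 10(12.2)(18.0) = 114.0$
$SS_{xx} = \sum x^2 - n\bar{x}^2 = 1604 - 10(12.2)^2 = 115.6$
$SS_{yy} = \sum y^2 - n\bar{y}^2 = 3370 - 10(18.0)^2 = 130.0 = TSS$

a. $b_1 = \frac{S_{xy}}{SS_{xx}} = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2} = \frac{114.0}{115.6} = 0.9862$
$b_0 = \bar{y} - b_1\bar{x} = 18.0 - 0.9862(12.2) = 5.9689$
$Y = b_0 + b_1 x$ becomes $\hat{Y} = 5.9689 + 0.9862x$.

b. $RSS = b_1 S_{xy} = b_1 \left( \sum xy - n\bar{x}\bar{y} \right) = 0.9862(114.0) = 112.4221$
$R^2 = \frac{RSS}{TSS} = \frac{112.4221}{130.0} = 0.8648$ or
$R^2 = \frac{S_{xy}^2}{SS_{xx} SS_{yy}} = \frac{\left( \sum xy - n\bar{x}\bar{y} \right)^2}{\left( \sum x^2 - n\bar{x}^2 \right)\left( \sum y^2 - n\bar{y}^2 \right)} = \frac{(114.0)^2}{(115.6)(130.0)} = .8648$ ($0 \le R^2 \le 1$ always!)

c. $ESS = TSS - RSS = 130.0 - 112.4221 = 17.5779$
$s_e^2 = \frac{ESS}{n-2} = \frac{17.5779}{8} = 2.1973$ or
$s_e^2 = \frac{SS_{yy} - b_1 S_{xy}}{n-2} = \frac{\left( \sum y^2 - n\bar{y}^2 \right) - b_1 \left( \sum xy - n\bar{x}\bar{y} \right)}{n-2} = \frac{130.0 - 0.9862(114.0)}{8} = 2.1972$ or
$s_e^2 = \frac{(1 - R^2)\,TSS}{n-2} = \frac{(1 - R^2)\left( \sum y^2 - n\bar{y}^2 \right)}{n-2}$ or
$s_e^2 = \frac{\left( \sum y^2 - n\bar{y}^2 \right) - b_1^2 \left( \sum x^2 - n\bar{x}^2 \right)}{n-2}$
$s_e = \sqrt{2.1973} = 1.4823$
($s_e^2$ is always positive!)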
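The whole computation for problem 5 fits in a short script that starts from the given sums. It is offered as a sketch for checking the arithmetic, with variable names chosen here for illustration. Since it carries full precision for the slope, its results can differ from the hand calculation in the fourth decimal place.

```python
import math

# Sums given in problem 5.
n = 10
sum_x, sum_x2 = 122, 1604
sum_y, sum_y2 = 180, 3370
sum_xy = 2310

x_bar = sum_x / n
y_bar = sum_y / n
s_xy = sum_xy - n * x_bar * y_bar    # Sxy
ss_xx = sum_x2 - n * x_bar ** 2      # SSxx
ss_yy = sum_y2 - n * y_bar ** 2      # SSyy, which is TSS

b1 = s_xy / ss_xx                    # slope
b0 = y_bar - b1 * x_bar              # intercept
rss = b1 * s_xy                      # regression sum of squares
r_squared = rss / ss_yy
ess = ss_yy - rss                    # error sum of squares
s_e = math.sqrt(ess / (n - 2))       # standard error of the estimate

print(f"Y-hat = {b0:.4f} + {b1:.4f} x")
print(f"R^2 = {r_squared:.4f}, s_e = {s_e:.4f}")
```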
6. Continuing the previous problem. ($\alpha = .01$)
a. Compute $s_{b_1}$ and do a significance test on $b_1$. (4)
b. Do a confidence interval for $b_0$. (3)
c. Do a prediction interval for passengers when the expenditure on advertising is $20 (thousand). (5)
d. Using your SST etc., put together the ANOVA table. (6)
a. $s_{b_1}^2 = \frac{s_e^2}{SS_{xx}} = \frac{s_e^2}{\sum x^2 - n\bar{x}^2} = \frac{2.1972}{115.6} = 0.01900$, so $s_{b_1} = \sqrt{0.01900} = 0.1379$.
$H_0: \beta_1 = 0$, $H_1: \beta_1 \ne 0$
$t = \frac{b_1 - \beta_{10}}{s_{b_1}} = \frac{b_1 - 0}{s_{b_1}} = \frac{0.9862}{0.1379} = 7.152$. Since this is not between $\pm t_{.005}^{(n-2)} = \pm t_{.005}^{(8)} = \pm 3.355$, reject $H_0$ and conclude that $\beta_1$ is significant.
1
1
x 2 
b. s b20  s e2  
 s e2  
 n SSxx 
n




2

  2.1972  1  12 .2
 10 115 .6
x 2  nx 2 


x2

  3.0480


sb0  3.0480  1.7459 b0  5.9689 so  0  b0  sb0  5.9689  3.3551.7459  5.96  5.86
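For part b, the standard error of the intercept and the interval half-width can be checked with the sketch below. The names are illustrative and the t value is the quoted table value. Expect small differences from the hand calculation in the third or fourth decimal, because the rounded inputs are not exactly the ones carried through by hand.

```python
import math

# Rounded quantities carried over from problem 5.
s_e_sq = 2.1972   # s_e squared
ss_xx = 115.6     # SSxx
n = 10
x_bar = 12.2
b0 = 5.9689       # estimated intercept
T_CRIT = 3.355    # assumed table value t.005 with 8 degrees of freedom

s_b0 = math.sqrt(s_e_sq * (1 / n + x_bar ** 2 / ss_xx))
half_width = T_CRIT * s_b0

print(f"s_b0 = {s_b0:.4f}")
print(f"beta0 lies in {b0 - half_width:.2f} to {b0 + half_width:.2f}")
```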
c. If $\hat{Y} = 5.9689 + 0.9862x$ and $x_0 = 20$, then $\hat{Y}_0 = 5.9689 + 0.9862(20) = 25.6929$.
$s_{Y_0}^2 = s_e^2 \left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum x^2 - n\bar{x}^2} + 1 \right) = 2.1972 \left( \frac{1}{10} + \frac{(20 - 12.2)^2}{115.6} + 1 \right) = 3.5725$
$s_{Y_0} = \sqrt{3.5725} = 1.8901$.
So $Y_0 = \hat{Y}_0 \pm t\, s_{Y_0} = 25.6929 \pm 3.355(1.8901) = 25.7 \pm 6.3$.
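The prediction interval in part c follows the same pattern, with an extra "+ 1" inside the variance for a single new observation. A minimal sketch, assuming the rounded values from problem 5 and the quoted table value of t; `prediction_interval` is a hypothetical helper name.

```python
import math

def prediction_interval(x0, b0, b1, s_e_sq, n, x_bar, ss_xx, t_crit):
    """Return (y_hat, half_width) for one new observation at x0."""
    y_hat = b0 + b1 * x0
    s_pred = math.sqrt(s_e_sq * (1 / n + (x0 - x_bar) ** 2 / ss_xx + 1))
    return y_hat, t_crit * s_pred

# Advertising of $20 thousand; t.005 with 8 DF is taken from the table.
y_hat, half_width = prediction_interval(
    x0=20, b0=5.9689, b1=0.9862, s_e_sq=2.1972,
    n=10, x_bar=12.2, ss_xx=115.6, t_crit=3.355,
)
print(f"predicted passengers (1000s): {y_hat:.1f} +/- {half_width:.1f}")
```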
d. From the previous page $RSS = 112.4221$, $TSS = 130.0$ and $ESS = 17.5779$. $H_0$ is that there is no relation between Y and X.
Source            SS         DF   MS         F        F.01
Regression        112.4221    1   112.4221   51.165   F(1,8) = 11.26  s
Error (Within)     17.5779    8     2.1972
Total             130.0000    9
Since the table F is less than the computed F, reject $H_0$.
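The regression ANOVA table can be assembled from RSS and ESS alone. The sketch below uses illustrative names, and the critical value 11.26 is copied from the table rather than computed.

```python
# Sums of squares from problem 5.
rss = 112.4221   # regression sum of squares
ess = 17.5779    # error sum of squares
n = 10

df_reg = 1             # one independent variable
df_err = n - 2
ms_reg = rss / df_reg
ms_err = ess / df_err  # equals s_e squared
f_stat = ms_reg / ms_err

F_CRIT_01 = 11.26      # assumed table value F.01(1, 8)
reject_h0 = f_stat > F_CRIT_01

print(f"MSR = {ms_reg:.4f}, MSE = {ms_err:.4f}, F = {f_stat:.3f}")
print(f"reject H0: {reject_h0}")
```

In simple regression the F statistic is the square of the t statistic for the slope, so 7.152 squared (about 51.15) agrees with this F up to rounding.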