4/21/98 252z9932

4. a. A bank wishes to compare deposit size at 4 branches. The result of the analysis of variance appears below.

Sample means:   Branch 1: 4.87   Branch 2: 2.20   Branch 3: 4.31   Branch 4: 1.46

Source      SS      DF    MS       F
Between     56.33    3    18.78    67.56
Within       6.67   24     0.278
Total       63.00   27

(i) Do mean deposits differ between branches? Why? (2)
(ii) Assuming that this table is based on four random samples of equal size, how many numbers are in each column? (1)
(iii) Do a confidence interval for the mean deposit in branch 1. (2)
(iv) Do a Scheffe confidence interval for the difference between deposits in branches 2 and 4 and explain why this type of interval might be preferred to one using t. (3)

b. In a study of inflation, 3 company sizes (Factor A), 5 degrees of industrial concentration (Factor B) and two types of product (Factor C) are distinguished. An ANOVA table was generated using the average price rise over 5 years for a random sample of 4 firms in each category. Generate an ANOVA table showing all possible interactions, using the following data: SSA = 22, SSB = 227, SSC = 32, SSAB = 54, SSABC = 136, SST = 1219. Using a 1% significance level, which of the differences and interactions are significant? (8)
Correction! SSAC = 10, SSBC = 98.

Solution:
a)
(i) This is a simple ANOVA. Use $\alpha = .05$. $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$. Since $F_{.05}^{(3,24)} = 3.01$ is less than the computed $F = 67.56$, reject $H_0$.

(ii) Since $n - 1 = 27$, $n = 28$. If we divide that by 4, we get 7.

(iii) As explained in class, $\mu_1 = \bar{x}_{\cdot 1} \pm t_{\alpha/2}^{(n-m)} \sqrt{\frac{MSW}{n_1}}$, where $MSW$ and the degrees of freedom come from the within line of the ANOVA. If we use $\alpha = .05$, $t_{.025}^{(24)} = 2.064$ and $\mu_1 = 4.87 \pm 2.064 \sqrt{\frac{0.278}{7}} = 4.87 \pm 0.41$.

(iv) From the outline, $\mu_2 - \mu_4 = (\bar{x}_{\cdot 2} - \bar{x}_{\cdot 4}) \pm \sqrt{(m-1) F_{\alpha}^{(m-1,\,n-m)}} \sqrt{MSW \left( \frac{1}{n_2} + \frac{1}{n_4} \right)}$, where the degrees of freedom are the same as those used in the F-test in the ANOVA ($m$ is the number of columns), so that we use $F_{.05}^{(3,24)} = 3.01$. So $\mu_2 - \mu_4 = (2.20 - 1.46) \pm \sqrt{3(3.01)} \sqrt{0.278 \left( \frac{1}{7} + \frac{1}{7} \right)} = 0.74 \pm 0.85$.
This has a 95% confidence level together with any other intervals you might do.
The intervals using t have a 95% confidence level alone.

b) There are $3 \times 5 \times 2 = 30$ groups with 4 observations in each group, so $n = 30 \times 4 = 120$. 's' means 'significant difference' ($H_0$ rejected), 'ns' means 'no significant difference' ($H_0$ accepted).

Source             SS      DF    MS        F        F.01
Factor A             22      2   11.00     1.547    F(2,90) = 4.85   ns
Factor B            227      4   56.75     7.980    F(4,90) = 3.54   s
Factor C             32      1   32.00     4.500    F(1,90) = 6.93   ns
Interaction AB       54      8    6.75     0.949    F(8,90) = 2.72   ns
Interaction AC       10      2    5.00     0.703    F(2,90) = 4.85   ns
Interaction BC       98      4   24.50     3.445    F(4,90) = 3.54   ns
Interaction ABC     136      8   17.00     2.391    F(8,90) = 2.72   ns
Error (Within)      640     90    7.1111
Total              1219    119

5. An airline wishes to explain the number of passengers it carries over 10 months as a consequence of advertising in the previous month. It collects data as follows. (The xy column is added here. You were given at least 8 examples of this calculation!)

Observation   Advertising ($1000)   Passengers (1000s)    xy
                      x                     y
     1               10                    16             160
     2               12                    18             216
     3                8                    14             112
     4               17                    24             408
     5               10                    17             170
     6               15                    22             330
     7               10                    15             150
     8               11                    21             231
     9               19                    25             475
    10               10                    18             180
                                               Sum xy =  2432

For your convenience the following values are given: $\sum x = 122$, $\sum x^2 = 1604$, $\sum y = 190$, $\sum y^2 = 3740$, $n = 10$.

a. Compute the regression equation $\hat{Y} = b_0 + b_1 x$ to predict the number of passengers. (6)
b. Compute $R^2$. (4)
c. Compute $s_e$. (3)

Solution:
Spare Parts Computation:
$\bar{x} = \frac{\sum x}{n} = \frac{122}{10} = 12.2$
$\bar{y} = \frac{\sum y}{n} = \frac{190}{10} = 19.0$
$SS_{xx} = \sum x^2 - n\bar{x}^2 = 1604 - 10(12.2)^2 = 115.6$
$S_{xy} = \sum xy - n\bar{x}\bar{y} = 2432 - 10(12.2)(19.0) = 114.0$
$SS_{yy} = \sum y^2 - n\bar{y}^2 = 3740 - 10(19.0)^2 = 130.0 = TSS$

a. $b_1 = \frac{S_{xy}}{SS_{xx}} = \frac{\sum xy - n\bar{x}\bar{y}}{\sum x^2 - n\bar{x}^2} = \frac{114.0}{115.6} = 0.9862$
$b_0 = \bar{y} - b_1 \bar{x} = 19.0 - 0.9862(12.2) = 6.9689$
$\hat{Y} = b_0 + b_1 x$ becomes $\hat{Y} = 6.9689 + 0.9862x$.

b. $RSS = b_1 S_{xy} = b_1 \left( \sum xy - n\bar{x}\bar{y} \right) = 0.9862(114.0) = 112.4221$ (the regression sum of squares)
$R^2 = \frac{RSS}{TSS} = \frac{112.4221}{130.0} = 0.8648$, or
$R^2 = \frac{S_{xy}^2}{SS_{xx} SS_{yy}} = \frac{\left( \sum xy - n\bar{x}\bar{y} \right)^2}{\left( \sum x^2 - n\bar{x}^2 \right)\left( \sum y^2 - n\bar{y}^2 \right)} = \frac{(114.0)^2}{(115.6)(130.0)} = .8648$ ($0 \le R^2 \le 1$ always!)

c.
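$ESS = TSS - RSS = 130.0 - 112.4221 = 17.5779$ (the error sum of squares)
$s_e^2 = \frac{ESS}{n-2} = \frac{17.5779}{8} = 2.1972$, or
$s_e^2 = \frac{SS_{yy} - b_1 S_{xy}}{n-2} = \frac{\sum y^2 - n\bar{y}^2 - b_1 \left( \sum xy - n\bar{x}\bar{y} \right)}{n-2} = \frac{130.0 - 0.9862(114.0)}{8} = 2.1972$, or
$s_e^2 = \frac{(1 - R^2)\,TSS}{n-2} = \frac{(1 - R^2)\left( \sum y^2 - n\bar{y}^2 \right)}{n-2}$, or
$s_e^2 = \frac{\sum y^2 - n\bar{y}^2 - b_1^2 \left( \sum x^2 - n\bar{x}^2 \right)}{n-2}$
$s_e = \sqrt{2.1972} = 1.4823$ ($s_e^2$ is always positive!)

6. Continuing the previous problem. ($\alpha = .02$)
a. Compute $s_{b_0}$ and do a significance test on $b_0$. (4)
b. Do a confidence interval for $b_1$. (3)
c. Do a prediction interval for passengers when the expenditure on advertising is $10 (thousand). (5)
d. Using your SST etc., put together the ANOVA table ($\alpha = .05$). (6)

Solution:
a. $s_{b_0}^2 = s_e^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{SS_{xx}} \right) = s_e^2 \left( \frac{1}{n} + \frac{\bar{x}^2}{\sum x^2 - n\bar{x}^2} \right) = 2.1972 \left( \frac{1}{10} + \frac{12.2^2}{115.6} \right) = 3.0488$
$s_{b_0} = \sqrt{3.0488} = 1.7461$
$b_0 = 6.9689$, $H_0: \beta_0 = 0$, $H_1: \beta_0 \ne 0$
$t = \frac{b_0 - \beta_{00}}{s_{b_0}} = \frac{b_0 - 0}{s_{b_0}} = \frac{6.9689}{1.7461} = 3.991$. Since this is not between $\pm t_{\alpha/2}^{(n-2)} = \pm t_{.01}^{(8)} = \pm 2.896$, reject $H_0$ and conclude that $\beta_0$ is significant.

b. $s_{b_1}^2 = \frac{s_e^2}{SS_{xx}} = \frac{s_e^2}{\sum x^2 - n\bar{x}^2} = \frac{2.1972}{115.6} = 0.01900$
$s_{b_1} = \sqrt{0.01900} = 0.1379$, so $\beta_1 = b_1 \pm t_{\alpha/2}^{(n-2)} s_{b_1} = 0.9862 \pm 2.896(0.1379) = 0.986 \pm 0.399$.

c. If $\hat{Y} = 6.9689 + 0.9862x$ and $x_0 = 10$, then $\hat{Y}_0 = 6.9689 + 0.9862(10) = 16.8309$.
$s_{Y_0}^2 = s_e^2 \left( \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum x^2 - n\bar{x}^2} + 1 \right) = 2.1972 \left( \frac{1}{10} + \frac{(10 - 12.2)^2}{115.6} + 1 \right) = 2.5089$
$s_{Y_0} = \sqrt{2.5089} = 1.5840$
So $Y_0 = \hat{Y}_0 \pm t_{\alpha/2}^{(n-2)} s_{Y_0} = 16.8309 \pm 2.896(1.5840) = 16.8 \pm 4.6$.

d. From the previous page, $RSS = 112.4221$, $TSS = 130.0$ and $ESS = 17.5779$. $H_0$ is that there is no relation between $Y$ and $X$.

Source            SS         DF    MS          F         F.05
Regression        112.4221    1    112.4221    51.165    F(1,8) = 5.32   s
Error (Within)     17.5779    8      2.1972
Total             130.0000    9

Since the table F is less than the computed F, reject $H_0$.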
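The arithmetic in problems 5 and 6 can be checked with a short script. What follows is only a minimal sketch in Python, not part of the original solution: the variable names are hypothetical, and the critical value 2.896 is copied from the solution above rather than computed. It works from the ten raw observations, so small differences from the rounded hand calculations are expected.

```python
import math

# Data from problem 5: advertising ($1000) and passengers (1000s).
x = [10, 12, 8, 17, 10, 15, 10, 11, 19, 10]
y = [16, 18, 14, 24, 17, 22, 15, 21, 25, 18]
n = len(x)

# "Spare parts": means and corrected sums of squares and cross products.
x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum(v * v for v in x) - n * x_bar ** 2
s_xy = sum(a * b for a, b in zip(x, y)) - n * x_bar * y_bar
ss_yy = sum(v * v for v in y) - n * y_bar ** 2   # TSS

# Problem 5: slope, intercept, R^2 and the error variance.
b1 = s_xy / ss_xx
b0 = y_bar - b1 * x_bar
reg_ss = b1 * s_xy            # called RSS in the solution
err_ss = ss_yy - reg_ss       # called ESS in the solution
r_sq = reg_ss / ss_yy
se_sq = err_ss / (n - 2)
s_e = math.sqrt(se_sq)

# Problem 6: standard errors, t ratio, intervals and the ANOVA F.
T_CRIT = 2.896                # t(.01, 8 df), taken from the solution text
s_b0 = math.sqrt(se_sq * (1 / n + x_bar ** 2 / ss_xx))
s_b1 = math.sqrt(se_sq / ss_xx)
t_b0 = b0 / s_b0
b1_margin = T_CRIT * s_b1

x0 = 10
y0_hat = b0 + b1 * x0
s_y0 = math.sqrt(se_sq * (1 / n + (x0 - x_bar) ** 2 / ss_xx + 1))
y0_margin = T_CRIT * s_y0
f_stat = reg_ss / se_sq

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, R^2 = {r_sq:.4f}, s_e = {s_e:.4f}")
print(f"t(b0) = {t_b0:.3f}, b1 margin = {b1_margin:.3f}")
print(f"prediction at x0 = {x0}: {y0_hat:.2f} +/- {y0_margin:.2f}")
print(f"ANOVA F = {f_stat:.3f}")
```

Keeping the critical value as a constant keeps the sketch free of third-party dependencies; a statistics library could supply it instead if one is available.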