4/21/98 252y9931

ECO252 QBA2
THIRD HOUR EXAM
April 20, 1999

Name
Hour of Class Registered (Circle)   MWF   TR   10   12   12:30   2:00   night

I. (10+ points) Do all the following:

1. Hand in your computer printouts for problems 2 and 3. (4 points; 3 point penalty for not handing in)

2. Do not do the following unless you handed in both outputs. On the next few pages there are problems very much like the ones you did.

a. The regression relates automobile accidents ('deaths') to tobacco use in pounds per year. Identify the coefficient of 't-cons' in the equation and explain whether it is significant at the 5% level. (2) Does this show that tobacco consumption causes automobile accidents? Explain! (1)

b. Complete the table in the ANOVA that compares the effect of packaging and advertising plans on the sales of a product. Is there a difference between mean sales for the advertising plans at the 5% level? Show what numbers brought you to your conclusion. (4)

Solution:

a. On the previous page 4 it says:

"The regression equation is
Deaths = -42027 + 11317 t-cons

Predictor    Coef      Stdev    t-ratio    p
Constant     -42027    19970    -2.10      0.087
t-cons       11317     2603     4.35       0.007

s = 3227    R-sq = 79.1%    R-sq(adj) = 74.9%"

To test the significance of 11317, the coefficient of 't-cons,' we test the hypotheses $H_0: \beta_1 = 0$ against $H_1: \beta_1 \ne 0$. If $H_0$ is false we say that $\beta_1$ is significant. We can test this coefficient three ways: (i) note that its p-value is 0.007, which is less than our significance level of 5%; (ii) note that the t-ratio, 4.35, is outside the range $\pm t_{\alpha/2}^{n-k-1} = \pm t_{.025}^{5} = \pm 2.571$ (degrees of freedom can also be read from the Error or Within line in the regression analysis of variance); (iii) look at the F-test. In each case we reject $H_0$ and say that $\beta_1$ is significant.

A regression does not prove cause, so we cannot say that significance proves that tobacco consumption causes automobile accidents.

b.
On the previous page 3 it says (including the numbers I added):

"Source         DF    SS        MS        F        F.05
package        2     71.30     35.65     27.85    $F^{2,36} = 3.26$
adplan         3     323.90    107.97    84.35    $F^{3,36} = 2.87$
Interaction    6     8.01      1.33      1.04     $F^{6,36} = 2.36$
Error          36    45.93     1.28
Total          47    449.13"

In this case we are testing $H_0$: No difference between adplan means against $H_1$: Some difference between adplan means. If we look at the F in the adplan row (84.35), we find that it is larger than the table F (2.87), and we reject $H_0$.

The rule on p-value: If the p-value is less than the significance level, reject the null hypothesis; if the p-value is greater than or equal to the significance level, do not reject the null hypothesis.

II. Do at least 4 of the following 6 Problems (at least 10 each) (or do sections adding to at least 40 points. Anything extra you do helps, and grades wrap around). Show your work! State $H_0$ and $H_1$ where applicable.

1. a. In the regression output supplied with this exam:
   (i) Add a regression line to the graph. (1)
   (ii) Do a 90% confidence interval for the constant in the equation. (2)
   (iii) Assuming that there is some sort of valid relationship, what automobile accident rate would you predict for a year in which per capita tobacco consumption was 10 pounds? (2)
b. In the analysis of variance supplied with this exam:
   (i) Test for significant interaction and explain your conclusion. Use a 90% confidence level. (2)
   (ii) Do a 95% confidence interval for the difference between the means of package 1 and 3 that is
        Valid when used alone. (2)
        Valid when used with other possible differences between means. (2)

Solution:

a. (i) Just connect the x's on former page 5.
(ii) $\alpha = .10$, $t_{\alpha/2}^{n-k-1} = t_{\alpha/2}^{n-2} = t_{.05}^{5} = 2.015$ ($k = 1$ is the number of independent variables.) $\beta_0 = b_0 \pm t_{\alpha/2}^{n-k-1} s_{b_0} = -42027 \pm 2.015(19970) = -42027 \pm 40240$, or –82267 to –1787.
(iii) Deaths = -42027 + 11317 t-cons = -42027 + 11317(10) = 71143.

b. (i) $\alpha = .10$. From the previous page.
(New F's provided)

"Source         DF    SS        MS        F        F.10
package        2     71.30     35.65     27.85    $F^{2,36} = 2.47$
adplan         3     323.90    107.97    84.35    $F^{3,36} = 2.26$
Interaction    6     8.01      1.33      1.04     $F^{6,36} = 1.96$
Error          36    45.93     1.28
Total          47    449.13"

All F values are approximate and come from the table on page 1020 of the text. Since the F for interaction (1.04) is less than the table value (1.96), do not reject $H_0$: No interaction.

(ii) We have $C = 4$ columns, $R = 3$ rows, and $P = 4$ observations per cell. From the outline, for individual row means use Bonferroni intervals (with $m = 1$):

$\mu_1 - \mu_3 = (\bar{x}_1 - \bar{x}_3) \pm t_{\alpha/(2m)}^{RC(P-1)} \sqrt{\frac{2MSW}{PC}}$.

As explained in class, the degrees of freedom for the t statistic are the Error (Within) degrees of freedom, so we want $t_{\alpha/(2m)}^{36} = t_{.025}^{36} = 2.02$. From the table of means that appeared above the ANOVA table, the mean for package 1 is 28.700 and the mean for package 3 is 25.737. From the table above, $MSW = MSE = 1.28$. Putting this together:

$\mu_1 - \mu_3 = (28.700 - 25.737) \pm 2.02 \sqrt{\frac{2(1.28)}{4(4)}} = 2.963 \pm 2.02(0.400) = 3.0 \pm 0.8$.

For simultaneously valid row means, use the Scheffé interval

$\mu_1 - \mu_3 = (\bar{x}_1 - \bar{x}_3) \pm \sqrt{(R-1) F_{\alpha}^{R-1,\,RC(P-1)}} \sqrt{\frac{2MSW}{PC}}$.

As explained in class, this amounts to replacing $t_{.025}^{36} = 2.02$ by $\sqrt{2 F_{.05}^{2,36}} = \sqrt{2(3.26)} = 2.553$, where the degrees of freedom are those used in the F-test for rows above. So

$\mu_1 - \mu_3 = 2.963 \pm 2.553(0.400) = 3.0 \pm 1.0$.

2. Three new employees are to be evaluated by the partners in an accounting firm for the number of errors that they make in each of six statements of varying difficulty. The data appear below. Assuming that the underlying distribution is normal, and noting that it is blocked (classified) by the statement number, compare mean error rates. (13)
Note: $\sum x_1 = 10$, $\sum x_1^2 = 26$, $\sum x_2 = 19$, $\sum x_2^2 = 75$, $\sum x_3 = 23$, $\sum x_3^2 = ?$.
Note also: If you wish to ignore that the data is blocked by statement, indicate this now and compare the column means assuming that the data is three random samples from a normal distribution. (10)
             1st employee    2nd employee    3rd employee
statement    $x_1$           $x_2$           $x_3$
1            2               2               3
2            1               3               4
3            0               1               4
4            4               6               5
5            2               3               4
6            1               4               3

Solution:

a) 2-way ANOVA (Blocked by statement). 's' indicates that the null hypothesis is rejected.

Statement             $x_1$      $x_2$       $x_3$       Sum      $n_i$    $\bar{x}_{i\cdot}$    SS ($\sum x_{ij}^2$)    $\bar{x}_{i\cdot}^2$
1                     2          2           3           7        3        2.3333     17       5.4444
2                     1          3           4           8        3        2.6667     26       7.1111
3                     0          1           4           5        3        1.6667     17       2.7778
4                     4          6           5           15       3        5.0000     77       25.0000
5                     2          3           4           9        3        3.0000     29       9.0000
6                     1          4           3           8        3        2.6667     26       7.1111
Sum                   10       + 19        + 23        = 52       18       2.8889     192      56.4444
$n_j$                 6        + 6         + 6         = 18
$\bar{x}_{\cdot j}$   1.6667     3.1667      3.8333      2.8889
SS                    26       + 75        + 91        = 192
$\bar{x}_{\cdot j}^2$ 2.7778   + 10.0278   + 14.6944   = 27.50

Note that $\bar{x}$ is not a sum, but is $\frac{\sum x}{n}$.

$SST = \sum\sum x_{ij}^2 - n\bar{x}^2 = 192 - 18(2.8889)^2 = 41.7778$.
$SSC = n_j \sum \bar{x}_{\cdot j}^2 - n\bar{x}^2 = 6(27.5) - 18(2.8889)^2 = 14.7778$. This is SSB in a one way ANOVA.
$SSR = n_i \sum \bar{x}_{i\cdot}^2 - n\bar{x}^2 = 3(56.4444) - 18(2.8889)^2 = 19.1111$.
($SSW = SST - SSC - SSR = 7.8889$)

Source                 SS         DF    MS        F        F.05                   H0
Rows (Statement)       19.1111    5     3.8222    4.845    $F^{5,10} = 3.33$ s    Row means equal
Columns (Employees)    14.7778    2     7.3889    9.367    $F^{2,10} = 4.10$ s    Column means equal
Within (Error)         7.8889     10    0.7889
Total                  41.7778    17

b) One way ANOVA (Not blocked by statement)

($SSW = SST - SSB = 41.7778 - 14.7778 = 27.0000$)

Source                 SS         DF    MS        F        F.05                   H0
Columns (Employees)    14.7778    2     7.3889    4.105    $F^{2,15} = 3.68$ s    Column means equal
Within (Error)         27.0000    15    1.8000
Total                  41.7778    17

3. Data from problem 2 is repeated below.

             1st employee    2nd employee    3rd employee
statement    $x_1$           $x_2$           $x_3$
1            2               2               3
2            1               3               4
3            0               1               4
4            4               6               5
5            2               3               4
6            1               4               3

a) Assume that the distribution is not normal, but that it is blocked (classified) by statement number, and again compare the distributions represented by the columns. (5)
b) Assume that the distribution is not normal, but that each column is a random sample, and again compare the distributions represented by the columns. (5)

Solution:

a) Friedman Test. $H_0$: Columns from same distribution. Rank within rows.
Statement    $x_1$    $r_1$    $x_2$    $r_2$    $x_3$    $r_3$
1            2        1.5      2        1.5      3        3
2            1        1        3        2        4        3
3            0        1        1        2        4        3
4            4        1        6        3        5        2
5            2        1        3        2        4        3
6            1        1        4        3        3        2
Sum                   6.5               13.5              16

There are $r = 6$ rows and $c = 3$ columns. Check: the rank sums must add to $\frac{rc(c+1)}{2} = \frac{6(3)(4)}{2} = 36$. Since 6.5 + 13.5 + 16 = 36, we are all right.

The Friedman statistic is

$\chi_F^2 = \frac{12}{rc(c+1)} \sum_i SR_i^2 - 3r(c+1) = \frac{12}{6(3)(4)} \left( 6.5^2 + 13.5^2 + 16^2 \right) - 3(6)(4) = \frac{1}{6}(480.5) - 72 = 8.0833$.

According to the Friedman table ($k = 3$, $n = 6$), 7 has a p-value of .029 and 8.333 has a p-value of .012, so this p-value must lie in between. With $\alpha = .05$, the p-value for our null hypothesis is below .05, so we reject $H_0$.

b) Kruskal-Wallis Test. $H_0$: Columns from same distribution. Rank within entire group (1 to 18). Then resolve ties by replacing ranks with average ranks as follows: $x = 1$ is rank 2, 3 and 4, replace with 3; $x = 2$ is rank 5, 6 and 7, replace with 6; $x = 3$ is rank 8, 9, 10 and 11, replace with 9.5; $x = 4$ is rank 12, 13, 14, 15 and 16, replace with 14. These revised ranks are $r^*$.

Statement    $x_1$    $r_1$    $r_1^*$    $x_2$    $r_2$    $r_2^*$    $x_3$    $r_3$    $r_3^*$
1            2        5        6          2        7        6          3        10       9.5
2            1        2        3          3        8        9.5        4        14       14
3            0        1        1          1        4        3          4        15       14
4            4        12       14         6        18       18         5        17       17
5            2        6        6          3        9        9.5        4        16       14
6            1        3        3          4        13       14         3        11       9.5
Sum                            33                           60                           78

Check: If there are $n = 18$ numbers, the three sums of ranks should add to $\frac{n(n+1)}{2} = \frac{18(19)}{2} = 171$, as 33, 60 and 78 do.

The Kruskal-Wallis statistic has the formula

$H = \frac{12}{n(n+1)} \sum_i \frac{SR_i^2}{n_i} - 3(n+1) = \frac{12}{18(19)} \left( \frac{33^2}{6} + \frac{60^2}{6} + \frac{78^2}{6} \right) - 3(19) = \frac{12}{18(19)} \left( \frac{10773}{6} \right) - 57 = 6$.

We cannot use the Kruskal-Wallis table because the problem is too large, so use a 5% value of $\chi^2$ with 2 degrees of freedom. The table value is 5.991, so (barely) reject $H_0$.

Document continues in 252z9931.
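The rank arithmetic in the Friedman and Kruskal-Wallis solutions to problem 3 can be checked with a short script. What follows is only a sketch under stated assumptions: the function names (average_ranks, friedman, kruskal_wallis) are made up for illustration and are not part of the course materials, and the statistics use the uncorrected formulas quoted in the solution, with no adjustment for ties.

```python
# Illustrative check of the problem 3 rank tests (hypothetical helper names).

def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the last position holding the same value as position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def friedman(columns):
    """Rank within each row (block), then apply the Friedman formula."""
    c = len(columns)
    r = len(columns[0])
    rank_sums = [0.0] * c
    for row in zip(*columns):
        for j, rank in enumerate(average_ranks(list(row))):
            rank_sums[j] += rank
    stat = 12 / (r * c * (c + 1)) * sum(s * s for s in rank_sums) - 3 * r * (c + 1)
    return rank_sums, stat


def kruskal_wallis(columns):
    """Rank the pooled data 1..n, then apply the Kruskal-Wallis formula."""
    pooled = [x for col in columns for x in col]
    n = len(pooled)
    ranks = average_ranks(pooled)
    rank_sums = []
    start = 0
    for col in columns:
        rank_sums.append(sum(ranks[start:start + len(col)]))
        start += len(col)
    weighted = sum(s * s / len(col) for s, col in zip(rank_sums, columns))
    stat = 12 / (n * (n + 1)) * weighted - 3 * (n + 1)
    return rank_sums, stat


# Errors per statement (1 to 6) for the three employees, from problem 2.
employees = [
    [2, 1, 0, 4, 2, 1],
    [2, 3, 1, 6, 3, 4],
    [3, 4, 4, 5, 4, 3],
]

friedman_sums, friedman_stat = friedman(employees)
kw_sums, kw_stat = kruskal_wallis(employees)

print("Friedman rank sums:", friedman_sums, "statistic:", round(friedman_stat, 4))
print("Kruskal-Wallis rank sums:", kw_sums, "statistic:", round(kw_stat, 4))
```

Run on the exam data, the rank sums and statistics agree with the hand calculation: 6.5, 13.5 and 16 with a statistic of 8.0833 for Friedman, and 33, 60 and 78 with a statistic of 6 for Kruskal-Wallis. Statistical packages commonly apply a tie correction to these statistics, so their output may differ slightly from these figures.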