Chapter 22 problem key:

22.4: The SAS output for this problem includes the program in addition to the output. Note that I just entered the cell numbers as data - a little easier than fancier programming when the sample sizes are not equal.

Note first that the interaction plots on pages 3 and 4 of the output indicate that interaction may be important. This conclusion is supported by the test for interaction in proc glm, which produces F = 106.82 with p-value = 0.0001. Even though the interaction is highly significant, both main effects are significant as well. The F statistic for ingredient 1 is 1553.11 with p-value = 0.0001, and the F statistic for ingredient 2 is 859.76 with p-value = 0.0001.

To perform Tukey's multiple comparisons for the cell means, the analysis of variance was run again using 'cell' as the only class variable, with the results appearing on page 8 of the output. The family of confidence intervals for the differences in cell means is on pages 9-10. It's a little easier to make sense of this collection of results if we redo them as a line graph, with the means ordered from largest to smallest (means that share a letter are not significantly different):

Grouping   Mean     Ingredient 1   Ingredient 2   Cell #
A          13.25    3              3              9
B          10.275   3              2              8
C           9.125   2              3              6
C           8.9     2              2              5
D           5.975   3              1              7
D           5.45    2              1              4
E           4.60    1              2              2
E           4.575   1              3              3
F           2.533   1              1              1

Conclusions:
1. The most effective relief of hay fever was obtained when both ingredients were at their highest level. This combination gave a mean time until hay fever symptoms returned that was significantly higher than any other combination.
2. Each ingredient generally worked better at higher levels than at lower levels.
3. Ingredient 1 seems to have a little more effect than ingredient 2. The three treatments for which ingredient 1 was at its lowest level produced the lowest 3 mean times of relief.
4.
If there is a reason why the highest level of ingredient 1 can't be used (bad side effects, for example), then setting ingredient 1 at its middle level and ingredient 2 at its middle level produced results that were not significantly different from setting ingredient 1 at its middle level and ingredient 2 at its highest level. This might be a good compromise mixture; it produced roughly 9 hours of relief on average.

22.6: See the SAS output for this problem. First note that the lines in the interaction plots (pages 4 and 5) are roughly parallel, so there is not much suggestion of interaction. The test for interaction has F* = 0.34 with p-value = 0.9104, so there is not a significant interactive effect of subject and degree on earnings.

Each of the two main effects is significant. The effect of subject on earnings has F* = 64.85 with p-value = 0.0001, and the effect of degree on earnings has F* = 189.10 with p-value = 0.0001.

It's useful to follow up the F tests with some multiple comparisons. I used Tukey's multiple comparisons for both subject and degree. (SAS statement: means subject degree/tukey;) All 4 of the subject areas are significantly different from each other, as indicated on page 8 of the output. Looking through these results, the ordering of subjects in terms of earnings is management > engineering > social sciences > humanities. (No surprises here.) Also, the Ph.D. degree adjuncts had significantly different earnings from the other two groups, as indicated on page 9 of the output, with the Ph.D. degree earnings higher than those for the other two groups.

Finally, the residual plots are on pages 10 and 11 of the output. The plot of residuals against predicted earnings does not show any difficulties with either outliers or unequal variances. The normal probability plot is reasonably linear, so there does not appear to be a problem with the normality assumption.
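Going back to problem 22.4: conclusions 2 and 3 and the interaction finding can be checked roughly with a little arithmetic on the nine cell means from the table. The Python sketch below is not part of the SAS output. The variable and function names are my own, and it uses unweighted averages of the cell means, which is an assumption on my part because the cell sample sizes are unequal and are not listed in this key.

```python
# Cell means from the 22.4 table, keyed by (ingredient 1 level, ingredient 2 level).
cell_means = {
    (1, 1): 2.533, (1, 2): 4.60,   (1, 3): 4.575,
    (2, 1): 5.45,  (2, 2): 8.9,    (2, 3): 9.125,
    (3, 1): 5.975, (3, 2): 10.275, (3, 3): 13.25,
}
levels = (1, 2, 3)


def marginal_means(factor):
    """Unweighted average of the cell means at each level of one ingredient.

    factor = 0 selects ingredient 1, factor = 1 selects ingredient 2.
    (Assumption: equal weight per cell, since cell sizes are not given here.)
    """
    return {
        lvl: sum(m for key, m in cell_means.items() if key[factor] == lvl) / len(levels)
        for lvl in levels
    }


ing1 = marginal_means(0)
ing2 = marginal_means(1)

# Spread between the best and worst level of each ingredient.
spread1 = max(ing1.values()) - min(ing1.values())
spread2 = max(ing2.values()) - min(ing2.values())

# Gain from raising ingredient 2 from level 1 to level 3,
# computed separately at each level of ingredient 1.
gain_ing2 = {i: cell_means[(i, 3)] - cell_means[(i, 1)] for i in levels}

print("Ingredient 1 marginal means:", ing1)
print("Ingredient 2 marginal means:", ing2)
print("Spreads (ingredient 1, ingredient 2):", spread1, spread2)
print("Gain from ingredient 2 at each ingredient 1 level:", gain_ing2)
```

The marginal means rise with the level of each ingredient, and the spread is larger for ingredient 1 than for ingredient 2, which agrees with conclusions 2 and 3. The gain from raising ingredient 2 grows as the level of ingredient 1 goes up, so the profiles are not parallel; that is the pattern the interaction plots and the significant interaction test pick up. This is only a descriptive check and does not replace the Tukey comparisons.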