Activity 5: Two-way ANOVA SOLUTIONS

With two factors, there is an additional question beyond whether each factor separately is significant: Is there an interaction between the two factors? That is, does the effect of one factor depend on the level of the other factor?

To illustrate this, let’s consider the effects of ethanolic NaOH concentration and coal type on total acidity. Each of these factors has three levels. Open the data from the website.

1. First, let’s look at an interaction plot. Go to Stat > ANOVA > Interactions Plot…. Enter Acidity as the response and NaOH and Coal Type as the factors (in that order). Copy and paste the interaction plot below. For Morwell coal, what concentration of NaOH gives the highest mean acidity? For Yallourn? For Maddingley?

For all three types of coal, 0.786 N NaOH gives the highest mean acidity. Since the lines (one for each concentration of NaOH) are roughly parallel, it appears that the effect of NaOH concentration does not depend on the type of coal. This suggests that there is NO INTERACTION between the factors. Thus, when we perform a two-way ANOVA, the interaction term will probably not be significant. (Check but do not answer: Do you get a different intuition if you produce an interaction plot with the factors in the other order?)

2. So let’s look at the results. Go to Stat > ANOVA > Two-way…. Enter Acidity as the response, NaOH as the row factor and Coal Type as the column factor. Copy and paste the ANOVA table (and ONLY the table) below. What are the test statistic and p-value for the interaction term? What can you conclude?

Analysis of Variance for Acidity
Source        DF       SS       MS       F      P
NaOH           2   0.1243   0.0622    3.66  0.069
Coal Typ       2   1.0024   0.5012   29.49  0.000
Interaction    4   0.0146   0.0036    0.21  0.924
Error          9   0.1530   0.0170
Total         17   1.2942

The test statistic and p-value for the interaction are F = 0.21 (on 4, 9 df) and p = 0.924. With such a large p-value, we conclude that there is no significant interaction between NaOH and Coal Type.
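As a check on how the table in question 2 fits together, the sketch below recomputes each mean square and F ratio from the printed sums of squares and degrees of freedom. It is only an illustration in Python, not part of the Minitab workflow; the function name f_ratios and the dictionary layout are made up for this example.

```python
# Sums of squares and degrees of freedom copied from the two-way ANOVA table
# for Acidity (the model that includes the interaction term).
terms = {
    "NaOH": (2, 0.1243),
    "Coal Type": (2, 1.0024),
    "Interaction": (4, 0.0146),
}
error_df, error_ss = 9, 0.1530


def f_ratios(terms, error_df, error_ss):
    """Return {term: (MS, F)} with MS = SS / DF and F = MS / MS_error.

    Hypothetical helper written for this illustration only.
    """
    ms_error = error_ss / error_df
    return {
        name: (ss / df, (ss / df) / ms_error)
        for name, (df, ss) in terms.items()
    }


results = f_ratios(terms, error_df, error_ss)
for name, (ms, f) in results.items():
    print(f"{name:<12} MS = {ms:.4f}  F = {f:.2f}")
```

The F ratios should agree with the table up to rounding. Small differences in the last digit (for example, for Coal Type) are expected because Minitab works from unrounded sums of squares, while this sketch starts from the four-decimal values printed in the table.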
If the interaction term is not significant, then we can focus on the main effects, that is, the effects of the individual factors.

3. Since the interaction term isn’t significant, let’s not waste those 4 degrees of freedom on the interaction when they could be used to give a more accurate estimate of the mean square error. Refit the model without the interaction term by clicking the “Fit additive model” box in Stat > ANOVA > Two-way…. Paste the new ANOVA table below. Does NaOH concentration have a significant effect on acidity? What about the type of coal?

Analysis of Variance for Acidity
Source        DF       SS       MS       F      P
NaOH           2   0.1243   0.0622    4.82  0.027
Coal Typ       2   1.0024   0.5012   38.90  0.000
Error         13   0.1675   0.0129
Total         17   1.2942

NaOH has a significant effect on mean acidity (p = 0.027). Type of coal also has a significant effect on mean acidity (p < 0.001; Minitab reports it as 0.000). Note that the interaction’s 4 degrees of freedom and its sum of squares have been absorbed into the error term, which now has 13 degrees of freedom.

Let’s consider a different example: the effects of brand of pen and writing surface on writing lifetime.

4. Obtain an interaction plot (paste it below). Which brand of pen lasted longest on the first surface? The second surface? The third surface?

Pen 1 lasted longest on the first surface; pen 2 lasted longest on the second surface; and pen 3 lasted longest on the third surface. Here, the effect of brand depends on the surface! This suggests that THERE IS AN INTERACTION between the two factors. Notice that the lines cross (intersect) multiple times. Thus, we will most likely find a significant interaction term.

5. Perform a two-way ANOVA. Be sure NOT to fit the additive model so that you can obtain an interaction term. What are the test statistic and p-value for the interaction term? What can you conclude?

Analysis of Variance for Lifetime
Source        DF       SS       MS       F      P
Pen            3     1388      463    0.68  0.583
Surface        2     2888     1444    2.11  0.164
Interaction    6     8100     1350    1.97  0.149
Error         12     8216      685
Total         23    20592

The test statistic and p-value for the interaction are F = 1.97 (on 6, 12 df) and p = 0.149.
We conclude that there is not a significant interaction between pen and surface. This appears to contradict what we saw in question 4. The explanation for this apparent contradiction is that the interaction plot in question 4 does not take the precision (variability) of the mean estimates into account, whereas the ANOVA table does. Even though the lines are clearly not parallel in the interaction plot, the estimates are so variable that we cannot distinguish those lines from lines that are truly parallel.
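
To put a rough number on "so variable", the sketch below uses the mean squares from the Lifetime table to recompute the interaction F ratio and to estimate the standard error of a single cell mean, which is what each point on the interaction plot represents. It is an illustration in Python rather than anything produced by Minitab, and it assumes a balanced design with 2 observations per cell. That figure is inferred from the degrees of freedom (4 pens x 3 surfaces = 12 cells, 24 observations in total), not stated in the activity.

```python
import math

# Mean squares copied from the two-way ANOVA table for Lifetime.
ms_interaction = 1350
ms_error = 685

# Assumption: a balanced design. Pen DF = 3 and Surface DF = 2 imply
# 4 pens x 3 surfaces = 12 cells; Total DF = 23 implies 24 observations,
# so 2 observations per cell.
n_per_cell = 2

# F ratio for the interaction term: MS_interaction / MS_error.
f_interaction = ms_interaction / ms_error

# Approximate standard error of each cell mean shown on the interaction plot.
se_cell_mean = math.sqrt(ms_error / n_per_cell)

print(f"F (interaction)   = {f_interaction:.2f}")
print(f"SE of a cell mean = {se_cell_mean:.1f}")
```

Under that assumption, each plotted mean is uncertain by roughly 18.5 lifetime units (one standard error), so lines that cross by less than a few multiples of that amount are consistent with parallel lines plus noise.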