16 - Two-way ANOVA

Example - Capsule Dissolving Experiment (Capsule.JMP)

In this experiment researchers are interested in studying the effect of two factors on the time it takes a capsule to begin dissolving, recorded as the time until bubbles first appear (in seconds). The factors of interest are juice type, gastric or duodenal (Factor A), and capsule type, C or V (Factor B). To conduct the experiment, 5 capsules of each type are randomly assigned to each juice, giving 5 observations (replicates) for each of the four treatment combinations (Gastric & C, Gastric & V, Duodenal & C, Duodenal & V). The data obtained from the experiment are shown below:

Time until bubbles (seconds)

Type of Digestive Juice    Capsule Type C                      Capsule Type V                      Juice Type Means
Gastric                    39.5  45.7  49.8  50.2  63.8        47.4  43.5  39.8  36.1  41.2
                           $\bar{X}_{11\cdot}$ = 49.8          $\bar{X}_{12\cdot}$ = 41.6          $\bar{X}_{1\cdot\cdot}$ = 45.7
Duodenal                   33.5  36.7  42  38.1  31.2          44  41.2  47.3  45.3  42.7
                           $\bar{X}_{21\cdot}$ = 36.3          $\bar{X}_{22\cdot}$ = 44.1          $\bar{X}_{2\cdot\cdot}$ = 40.2
Capsule Type Means         $\bar{X}_{\cdot 1\cdot}$ = 43.05    $\bar{X}_{\cdot 2\cdot}$ = 42.85    Grand Mean $\bar{X}_{\cdot\cdot\cdot}$ = 42.95

We can construct plots to visualize the effects of each factor.

[Plots: mean time until bubbles by Digestive Juice and by Capsule Type]

By plotting the mean time until bubbles for both digestive juices, we see that the mean dissolution time for duodenal juice is slightly smaller than that for gastric juice (by about 5 seconds). By plotting the mean time until bubbles for both capsule types, we see that the mean dissolution times for the two capsule types are approximately equal.

Our preliminary conclusions would be, first, that fluid type has a small effect on dissolution time, with duodenal juice dissolving capsules about 5 seconds quicker on average, and second, that capsule type has little or no effect. These conclusions are completely WRONG! Why?

When considering the effect of two factors on the response we cannot examine them marginally, i.e. individually. It is possible, for example, that the effect of digestive juice is not the same for both capsule types.
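As a quick numerical check, the following Python sketch recomputes the treatment-combination means and the marginal means from the capsule data, and then computes the juice effect separately for each capsule type. The assignment of observations to cells is inferred from the reported cell means, and the dictionary layout and variable names are illustrative assumptions, not part of Capsule.JMP.

```python
from statistics import mean

# Time until bubbles (seconds), keyed by (juice, capsule type).
# The assignment of observations to cells is inferred from the reported cell means.
data = {
    ("Gastric", "C"): [39.5, 45.7, 49.8, 50.2, 63.8],
    ("Gastric", "V"): [47.4, 43.5, 39.8, 36.1, 41.2],
    ("Duodenal", "C"): [33.5, 36.7, 42.0, 38.1, 31.2],
    ("Duodenal", "V"): [44.0, 41.2, 47.3, 45.3, 42.7],
}
juices = ["Gastric", "Duodenal"]
capsules = ["C", "V"]

# Mean for each treatment combination.
cell_means = {cell: mean(obs) for cell, obs in data.items()}

# Marginal means: pool all observations at one level of a factor.
juice_means = {j: mean(x for c in capsules for x in data[(j, c)]) for j in juices}
capsule_means = {c: mean(x for j in juices for x in data[(j, c)]) for c in capsules}
grand_mean = mean(x for obs in data.values() for x in obs)

# Juice effect (Gastric minus Duodenal) computed separately for each capsule type.
juice_effect_by_capsule = {
    c: cell_means[("Gastric", c)] - cell_means[("Duodenal", c)] for c in capsules
}

print("Cell means:", {k: round(v, 2) for k, v in cell_means.items()})
print("Juice means:", {k: round(v, 2) for k, v in juice_means.items()})
print("Capsule means:", {k: round(v, 2) for k, v in capsule_means.items()})
print("Grand mean:", round(grand_mean, 2))
print("Juice effect by capsule type:",
      {k: round(v, 2) for k, v in juice_effect_by_capsule.items()})
```

The juice effect is about +13.5 seconds for type C capsules and about -2.5 seconds for type V capsules. The marginal juice means average these two opposite effects together, which is why the marginal difference of about 5.5 seconds is misleading.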
If we consider the means for each of the treatment combinations above, we see that for type C capsules the duodenal juice dissolves the capsule quicker, while the exact opposite is true for type V capsules: gastric juice dissolves them faster. A better display shows the means for each treatment combination. Here we have a separate profile for each digestive juice, showing how the capsule effect depends on the type of digestive juice we are using. This is what we call an interaction.

Questions of Interest in Two-way ANOVA:

1) Is there a significant interaction between the two factors being studied? This question needs to be answered first, because if we conclude there is a significant interaction then both factors are important and their effects cannot be discussed individually. If we conclude there isn’t a significant interaction between the factors being studied, then we can test the effects individually.
2) Is there a significant Factor A effect?
3) Is there a significant Factor B effect?

As always, it is important to quantify any significant differences using pair-wise comparisons and CI’s for the differences in the population/treatment means.

Analysis in JMP

To fit the two-way model for these data, select Fit Model from the Analyze menu and put the response Time to Bubbles in the Y box. Then highlight both Fluid & Capsule and select Full Factorial from the Macros pull-down menu as shown below. Then click Run Model to obtain the results.

These sections of output can be shut off, as our interest is primarily in identifying which effects are significant. These results are in the Effect Tests box.

The p-values in the Effect Tests box show that the Fluid*Capsule interaction is significant (p = .0049), so we know that both fluid and capsule type significantly affect the response. A significant interaction also means that the main effect tests for Fluid and Capsule are of little interest.
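For a balanced design such as this one, the quantities behind the Effect Tests can be reproduced approximately by hand. The sketch below computes the sums of squares, the error mean square, and the F ratios for Fluid, Capsule, and Fluid*Capsule using the standard balanced two-way ANOVA formulas. It is not JMP's own computation, and the data layout and variable names are illustrative assumptions. p-values are not computed, since that requires an F-distribution CDF, which the Python standard library does not provide.

```python
from statistics import mean

# Time until bubbles (seconds), keyed by (fluid, capsule type).
# Cell assignment is inferred from the cell means reported in the notes.
data = {
    ("Gastric", "C"): [39.5, 45.7, 49.8, 50.2, 63.8],
    ("Gastric", "V"): [47.4, 43.5, 39.8, 36.1, 41.2],
    ("Duodenal", "C"): [33.5, 36.7, 42.0, 38.1, 31.2],
    ("Duodenal", "V"): [44.0, 41.2, 47.3, 45.3, 42.7],
}
fluids = ["Gastric", "Duodenal"]
capsules = ["C", "V"]
a, b, n = len(fluids), len(capsules), 5  # levels of A, levels of B, replicates

grand = mean(x for obs in data.values() for x in obs)
cell = {k: mean(obs) for k, obs in data.items()}
# With equal replication, a marginal mean is the average of the cell means.
fluid_mean = {f: mean(cell[(f, c)] for c in capsules) for f in fluids}
capsule_mean = {c: mean(cell[(f, c)] for f in fluids) for c in capsules}

ss_fluid = n * b * sum((fluid_mean[f] - grand) ** 2 for f in fluids)
ss_capsule = n * a * sum((capsule_mean[c] - grand) ** 2 for c in capsules)
ss_inter = n * sum(
    (cell[(f, c)] - fluid_mean[f] - capsule_mean[c] + grand) ** 2
    for f in fluids for c in capsules
)
ss_error = sum((x - cell[k]) ** 2 for k, obs in data.items() for x in obs)

ss = {"Fluid": ss_fluid, "Capsule": ss_capsule, "Fluid*Capsule": ss_inter}
df = {"Fluid": a - 1, "Capsule": b - 1, "Fluid*Capsule": (a - 1) * (b - 1)}
df_error = a * b * (n - 1)
ms_error = ss_error / df_error

# F ratio for each effect: effect mean square over error mean square.
f_ratio = {name: (ss[name] / df[name]) / ms_error for name in ss}

for name in ss:
    print(f"{name:14s} SS = {ss[name]:8.2f}  df = {df[name]}  F = {f_ratio[name]:.3f}")
print(f"{'Error':14s} SS = {ss_error:8.2f}  df = {df_error}  MS = {ms_error:.2f}")
```

Under these assumptions the F ratios are roughly 10.63 for Fluid*Capsule, 5.02 for Fluid, and 0.007 for Capsule, each on 1 and 16 degrees of freedom, which matches the ordering of the p-values reported by JMP.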
It is interesting to note that the main effect of Capsule is not significant (p = .9361). This happens because the presence of the Fluid*Capsule interaction "masks" the main effect of capsule, as we saw in the marginal effect plots above. The main effect of fluid is only partially masked by the Fluid*Capsule interaction, and so it still tests as significant.

Because the interaction is significant, we need to quantify the treatment effects by comparing the treatment combinations to one another. To do this, select LSMeans Tukey HSD from the Fluid*Capsule interaction pull-down menu. Results of the treatment mean comparisons are shown below.

Here we see that the Gastric,C and Duodenal,C mean dissolution times differ significantly. In particular, we estimate that type C capsules take between 3.57 and 23.43 seconds longer on average to begin dissolving in gastric fluid than in duodenal fluid. In contrast, type V capsules appear to dissolve equally well in either digestive juice.

If the interaction between the two factors is not significant, we can use Tukey’s procedure to compare the means across the levels of each factor individually.

Checking Two-way ANOVA Assumptions (Normality and constant variance)

Assumptions:
1. The observations between and within the treatment combinations are independent.
2. The response is normally distributed for each treatment combination.
3. The variance of the response is the same for each treatment combination.

To check the constant variance assumption we can examine the residuals plotted vs. the fitted values and each factor. The fitted values are simply the observed mean responses at each of the four treatment combinations, and the residuals are the deviations from the treatment combination means. The spread of the residuals, i.e. the spread of the observed response values about their respective treatment combination means, should be uniform, indicating constant response variation for the different treatment combinations. A plot of the residuals vs.
the fitted values is given each time we fit a model in JMP. The resulting plot is shown below:

There appears to be a potential outlier in this plot; otherwise the plot looks fine.

To examine the normality assumption we assess the normality of the residuals. Save the residuals to the spreadsheet as shown below and use Analyze > Distribution to examine them.

With the exception of two mild outliers, normality seems satisfied.

STATISTICAL DETAILS (FYI)

Two-way ANOVA Model

$$X_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}, \qquad i = 1,\ldots,a, \quad j = 1,\ldots,b, \quad k = 1,\ldots,n$$

where,
$X_{ijk}$ = kth observed response value when level i of Factor A and level j of Factor B are used.
$\mu$ = overall mean response.
$\alpha_i$ = effect due to the fact that level i of Factor A was used.
$\beta_j$ = effect due to the fact that level j of Factor B was used.
$(\alpha\beta)_{ij}$ = effect due to the interaction of the ith level of Factor A and the jth level of Factor B.
$\varepsilon_{ijk}$ = the random error, which represents the variation in the response values when the ith level of Factor A and the jth level of Factor B are used.

We assume that $\varepsilon_{ijk} \sim N(0, \sigma^2)$, i.e. the errors are normal and their variation is constant. See your text for formulae used to estimate these quantities and those used to test the hypotheses.

The three questions of interest in a two-way ANOVA can be formulated in terms of these parameter values.

1. For testing the interaction between Factors A and B we have:
$H_0: (\alpha\beta)_{ij} = 0$ for all treatment combinations
$H_a: (\alpha\beta)_{ij} \neq 0$ for at least one treatment combination

2. For testing the Factor A effect we have:
$H_0: \alpha_i = 0$ for all i
$H_a: \alpha_i \neq 0$ for at least one i

3. For testing the Factor B effect we have:
$H_0: \beta_j = 0$ for all j
$H_a: \beta_j \neq 0$ for at least one j

As in one-way ANOVA, the test procedure decomposes total response variation into components that measure how much variation in the response is due to Factor A, Factor B, the interaction between Factors A & B, and random error.
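To make the model terms concrete, the sketch below estimates the overall mean, the two sets of main effects, and the interaction effects for the capsule data from the treatment-combination means. It assumes the usual sum-to-zero constraints for a balanced design, which these notes do not state (the estimation formulae are left to the text), and it takes Factor A to be digestive juice and Factor B to be capsule type. The names `mu_hat`, `alpha_hat`, `beta_hat`, and `ab_hat` are illustrative.

```python
# Treatment-combination means (seconds) reported for the capsule data.
cell_mean = {
    ("Gastric", "C"): 49.8, ("Gastric", "V"): 41.6,
    ("Duodenal", "C"): 36.3, ("Duodenal", "V"): 44.1,
}
juices = ["Gastric", "Duodenal"]  # Factor A levels
capsules = ["C", "V"]             # Factor B levels


def avg(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Assumed sum-to-zero estimates for a balanced design:
# overall mean, main effects as deviations of marginal means from it,
# and interaction effects as what is left over in each cell.
mu_hat = avg(cell_mean.values())
alpha_hat = {j: avg(cell_mean[(j, c)] for c in capsules) - mu_hat for j in juices}
beta_hat = {c: avg(cell_mean[(j, c)] for j in juices) - mu_hat for c in capsules}
ab_hat = {
    (j, c): cell_mean[(j, c)] - mu_hat - alpha_hat[j] - beta_hat[c]
    for j in juices for c in capsules
}

print("mu_hat    =", round(mu_hat, 2))
print("alpha_hat =", {k: round(v, 2) for k, v in alpha_hat.items()})
print("beta_hat  =", {k: round(v, 2) for k, v in beta_hat.items()})
print("ab_hat    =", {k: round(v, 2) for k, v in ab_hat.items()})
```

In this sketch the interaction estimates (about 4 seconds in magnitude) are larger than either set of main-effect estimates (about 2.75 seconds for juice and 0.1 seconds for capsule type), which is the numerical counterpart of the non-parallel profiles seen in the capsule example.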
Sum of Squares: $SS_{Total} = SS_A + SS_B + SS_{A \times B} + SS_{Error}$
Degrees of Freedom: $N - 1 = (a-1) + (b-1) + (a-1)(b-1) + ab(n-1)$, where $N = abn$ is the total number of observations.

SUM OF SQUARES FORMULAE:

$$\sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (X_{ijk} - \bar{X}_{\cdot\cdot\cdot})^2 = nb\sum_{i=1}^{a} (\bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot})^2 + an\sum_{j=1}^{b} (\bar{X}_{\cdot j\cdot} - \bar{X}_{\cdot\cdot\cdot})^2 + n\sum_{i=1}^{a}\sum_{j=1}^{b} (\bar{X}_{ij\cdot} - \bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot j\cdot} + \bar{X}_{\cdot\cdot\cdot})^2 + \sum_{i=1}^{a}\sum_{j=1}^{b}\sum_{k=1}^{n} (X_{ijk} - \bar{X}_{ij\cdot})^2$$

MEAN SQUARES (measures of variation)

The mean square for an effect is the effect sum of squares divided by its degrees of freedom.

$$MS_{effect} = \frac{SS_{effect}}{df_{effect}}$$

When the null hypothesis of “no effect” is true, the mean squares are all estimates of $\sigma^2$, the common response variance for all treatment combinations. If there is a significant effect then we expect $MS_{effect} > MS_{Error}$ (the within treatment combination variation).

Testing Effect Significance

For testing the main effects (A & B) and the interaction effect (A × B) we simply compare the size of $MS_{effect}$ to $MS_{Error}$. If $MS_{effect} \gg MS_{Error}$ we have evidence that the effect is significant. If $MS_{effect} \approx MS_{Error}$ then we have little evidence that the effect is significant. This is analogous to the comparison of the between group variation to the within group variation in one-way ANOVA. To compare the mean squares we use their ratio, which has an F-distribution when the null hypothesis is true.

$$F_o = \frac{MS_{effect}}{MS_{Error}} \sim F\text{-distribution}$$ (numerator df = df for the effect, denominator df = df for error)

$F_o \gg 1$ will lead to the conclusion that the effect in question significantly impacts the response. Large $F_o$ values lead to small p-values, which support effect significance.

Example 2 – Apple Nutrient Level by Region Grown and Variety

Does the level of a certain nutrient found in apples differ significantly across regions and by variety? The regions are labeled A, B, or C and the varieties are labeled W, X, Y, and Z. The data in JMP are entered as shown on the left. The first column contains the Region grown, the second column contains the apple Variety, and the last column contains the nutrient level.
Data File: Fruit Nutrient

In JMP select Analyze > Fit Model and set up the dialog box as shown below. The output is shown below.

Test results:

Interaction: The interaction plot shows some signs of non-parallelism, and hence interaction; however, the p-value in the ANOVA table provides only weak evidence for its significance (p = .0917).

Region: The mean nutrient levels differ across regions (p = .0065). In particular, the mean nutrient level found in apples grown in region A appears to be significantly higher than that for the other two regions.

Variety: The mean nutrient levels also differ significantly across varieties (p < .0001). It appears that apple varieties Y and Z have higher mean nutrient levels than varieties W and X.

Multiple Comparisons for Region and Variety Effects

We see that the mean nutrient level of apples grown in region A, regardless of variety, is significantly higher than that for apples grown in regions B and C. We estimate that the mean nutrient level for apples grown in region A is between .303 and 2.66 units larger than the mean nutrient level of apples grown in region B, and between .136 and 2.49 units larger than the mean nutrient level of apples grown in region C.

The mean nutrient levels found in varieties Y and Z, regardless of region grown, differ significantly from those found in varieties X and W. We estimate that the mean nutrient level found in variety Z apples exceeds that for variety W by between 3.57 and 8.55 units, etc.
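Because each of these confidence intervals is symmetric about the estimated difference in means, the point estimate and margin of error can be read back from the reported endpoints. The short sketch below does this arithmetic for the three intervals quoted above; the helper name `center_and_margin` and the dictionary labels are illustrative, and the results are only as precise as the rounded endpoints.

```python
def center_and_margin(lower, upper):
    """Return (estimated difference, margin of error) for a symmetric interval."""
    return (lower + upper) / 2, (upper - lower) / 2


# Interval endpoints as quoted in the notes.
intervals = {
    "Region A - Region B": (0.303, 2.66),
    "Region A - Region C": (0.136, 2.49),
    "Variety Z - Variety W": (3.57, 8.55),
}

summary = {name: center_and_margin(lo, hi) for name, (lo, hi) in intervals.items()}

for name, (estimate, margin) in summary.items():
    print(f"{name}: estimate = {estimate:.3f}, margin = {margin:.3f}")
```

Since every interval lies entirely above zero, each estimated difference exceeds its margin of error, which is another way of seeing that these differences are significant.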