11 Multifactor Analysis of Variance

Copyright © Cengage Learning. All rights reserved.

11.3 Three-Factor ANOVA

To indicate the nature of models and analyses when ANOVA experiments involve more than two factors, we will focus here on the case of three fixed factors: A, B, and C. The numbers of levels of these factors will be denoted by I, J, and K, respectively, and $L_{ijk}$ = the number of observations made with factor A at level i, factor B at level j, and factor C at level k. The analysis is quite complicated when the $L_{ijk}$'s are not all equal, so we further specialize to $L_{ijk} = L$.

Then $X_{ijkl}$ and $x_{ijkl}$ denote the observation, before and after the experiment is performed, respectively, for the lth replication (l = 1, 2, ..., L) when the three factors are fixed at levels i, j, and k.

To understand the parameters that will appear in the three-factor ANOVA model, first recall that in two-factor ANOVA with replications,

$$E(X_{ijk}) = \mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij},$$

where the restrictions $\sum_i \alpha_i = \sum_j \beta_j = 0$, $\sum_i \gamma_{ij} = 0$ for every j, and $\sum_j \gamma_{ij} = 0$ for every i were necessary to obtain a unique set of parameters.

If we use dot subscripts on the $\mu_{ij}$'s to denote averaging (rather than summation), then

$$\bar{\mu}_{i\cdot} - \bar{\mu}_{\cdot\cdot} = \frac{1}{J}\sum_j \mu_{ij} - \frac{1}{IJ}\sum_i \sum_j \mu_{ij} = \alpha_i$$

is the effect of factor A at level i averaged over levels of factor B, whereas

$$\mu_{ij} - \bar{\mu}_{\cdot j} = \mu_{ij} - \frac{1}{I}\sum_i \mu_{ij} = \alpha_i + \gamma_{ij}$$

is the effect of factor A at level i specific to factor B at level j.

When the effect of A at level i depends on the level of B, there is interaction between the factors, and the $\gamma_{ij}$'s are not all zero.
In particular,

$$\mu_{ij} - \bar{\mu}_{\cdot j} - \bar{\mu}_{i\cdot} + \bar{\mu}_{\cdot\cdot} = \gamma_{ij} \tag{11.11}$$

The Fixed Effects Model and Test Procedures

The fixed effects model for three-factor ANOVA with $L_{ijk} = L$ is

$$X_{ijkl} = \mu_{ijk} + \epsilon_{ijkl}, \quad i = 1, \dots, I,\; j = 1, \dots, J,\; k = 1, \dots, K,\; l = 1, \dots, L \tag{11.12}$$

where the $\epsilon_{ijkl}$'s are independent and normally distributed with mean 0 and variance $\sigma^2$, and

$$\mu_{ijk} = \mu + \alpha_i + \beta_j + \delta_k + \gamma^{AB}_{ij} + \gamma^{AC}_{ik} + \gamma^{BC}_{jk} + \gamma_{ijk} \tag{11.13}$$

The restrictions necessary to obtain uniquely defined parameters are that the sum over any subscript of any parameter on the right-hand side of (11.13) equal 0.

The parameters $\gamma^{AB}_{ij}$, $\gamma^{AC}_{ik}$, and $\gamma^{BC}_{jk}$ are called two-factor interactions, and $\gamma_{ijk}$ is called a three-factor interaction; the $\alpha_i$'s, $\beta_j$'s, and $\delta_k$'s are the main effects parameters. For any fixed level k of the third factor, analogous to (11.11),

$$\mu_{ijk} - \bar{\mu}_{i\cdot k} - \bar{\mu}_{\cdot jk} + \bar{\mu}_{\cdot\cdot k} = \gamma^{AB}_{ij} + \gamma_{ijk}$$

is the interaction of the ith level of A with the jth level of B specific to the kth level of C, whereas

$$\bar{\mu}_{ij\cdot} - \bar{\mu}_{i\cdot\cdot} - \bar{\mu}_{\cdot j\cdot} + \bar{\mu}_{\cdot\cdot\cdot} = \gamma^{AB}_{ij}$$

is the interaction between A at level i and B at level j averaged over levels of C.

If the interaction of A at level i and B at level j does not depend on k, then all $\gamma_{ijk}$'s equal 0. Thus nonzero $\gamma_{ijk}$'s represent nonadditivity of the two-factor $\gamma^{AB}_{ij}$'s over the various levels of the third factor C. If the experiment included more than three factors, there would be corresponding higher-order interaction terms with analogous interpretations.

Note that in the previous argument, if we had considered fixing the level of either A or B (rather than C, as was done) and examining the $\gamma_{ijk}$'s, their interpretation would be the same: if any of the interactions of two factors depend on the level of the third factor, then there are nonzero $\gamma_{ijk}$'s.

When L > 1, there is a sum of squares for each main effect, each two-factor interaction, and the three-factor interaction.
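As a rough illustration of that decomposition, the following Python sketch estimates each effect by averaging over subscripts and taking differences, then forms the corresponding sums of squares for a balanced layout. The data are synthetic random numbers, and the function name `three_factor_ss` and the nested-list layout `x[i][j][k]` are assumptions made for this example only. The dimensions I = 4, J = 2, K = 4, L = 3 are an arbitrary choice.

```python
import itertools
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def three_factor_ss(x):
    """Sums of squares for a balanced layout; x[i][j][k] holds the L replicates."""
    I, J, K, L = len(x), len(x[0]), len(x[0][0]), len(x[0][0][0])
    cells = list(itertools.product(range(I), range(J), range(K)))

    # Averages over the "dotted" subscripts. The data are balanced, so
    # averaging cell means equals averaging the raw observations.
    m_ijk = {c: mean(x[c[0]][c[1]][c[2]]) for c in cells}
    m_ij = {(i, j): mean(m_ijk[i, j, k] for k in range(K))
            for i in range(I) for j in range(J)}
    m_ik = {(i, k): mean(m_ijk[i, j, k] for j in range(J))
            for i in range(I) for k in range(K)}
    m_jk = {(j, k): mean(m_ijk[i, j, k] for i in range(I))
            for j in range(J) for k in range(K)}
    m_i = [mean(m_ij[i, j] for j in range(J)) for i in range(I)]
    m_j = [mean(m_ij[i, j] for i in range(I)) for j in range(J)]
    m_k = [mean(m_ik[i, k] for i in range(I)) for k in range(K)]
    grand = mean(m_i)

    return {
        "A": J * K * L * sum((m_i[i] - grand) ** 2 for i in range(I)),
        "B": I * K * L * sum((m_j[j] - grand) ** 2 for j in range(J)),
        "C": I * J * L * sum((m_k[k] - grand) ** 2 for k in range(K)),
        "AB": K * L * sum((m_ij[i, j] - m_i[i] - m_j[j] + grand) ** 2
                          for i in range(I) for j in range(J)),
        "AC": J * L * sum((m_ik[i, k] - m_i[i] - m_k[k] + grand) ** 2
                          for i in range(I) for k in range(K)),
        "BC": I * L * sum((m_jk[j, k] - m_j[j] - m_k[k] + grand) ** 2
                          for j in range(J) for k in range(K)),
        "ABC": L * sum((m_ijk[i, j, k] - m_ij[i, j] - m_ik[i, k] - m_jk[j, k]
                        + m_i[i] + m_j[j] + m_k[k] - grand) ** 2
                       for i, j, k in cells),
        "E": sum((v - m_ijk[i, j, k]) ** 2
                 for i, j, k in cells for v in x[i][j][k]),
        "T": sum((v - grand) ** 2 for i, j, k in cells for v in x[i][j][k]),
    }


# Synthetic data only: pure noise, so no effect is truly present.
random.seed(1)
I, J, K, L = 4, 2, 4, 3
x = [[[[random.gauss(0.0, 1.0) for _ in range(L)] for _ in range(K)]
      for _ in range(J)] for _ in range(I)]

ss = three_factor_ss(x)
df = {"A": I - 1, "B": J - 1, "C": K - 1,
      "AB": (I - 1) * (J - 1), "AC": (I - 1) * (K - 1),
      "BC": (J - 1) * (K - 1), "ABC": (I - 1) * (J - 1) * (K - 1),
      "E": I * J * K * (L - 1)}
ms = {name: ss[name] / df[name] for name in df}
f_ratios = {name: ms[name] / ms["E"] for name in df if name != "E"}

for name in df:
    print(name, df[name], round(ss[name], 3), round(ms[name], 3))
```

The eight component sums of squares add up to the total, and the degrees of freedom add up to IJKL − 1. Both identities hold only because every cell has the same number of replicates.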
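To write these sums of squares in a way that indicates how they are defined when there are more than three factors, note that any of the model parameters in (11.13) can be estimated unbiasedly by averaging $X_{ijkl}$ over appropriate subscripts and taking differences. Thus

$$\hat{\mu} = \bar{X}_{\cdot\cdot\cdot\cdot} \qquad \hat{\alpha}_i = \bar{X}_{i\cdot\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot\cdot}$$

$$\hat{\gamma}^{AB}_{ij} = \bar{X}_{ij\cdot\cdot} - \bar{X}_{i\cdot\cdot\cdot} - \bar{X}_{\cdot j\cdot\cdot} + \bar{X}_{\cdot\cdot\cdot\cdot}$$

$$\hat{\gamma}_{ijk} = \bar{X}_{ijk\cdot} - \bar{X}_{ij\cdot\cdot} - \bar{X}_{i\cdot k\cdot} - \bar{X}_{\cdot jk\cdot} + \bar{X}_{i\cdot\cdot\cdot} + \bar{X}_{\cdot j\cdot\cdot} + \bar{X}_{\cdot\cdot k\cdot} - \bar{X}_{\cdot\cdot\cdot\cdot}$$

with other main effects and interaction estimators obtained by symmetry.

Definition

Relevant sums of squares are

$$SST = \sum_i \sum_j \sum_k \sum_l (X_{ijkl} - \bar{X}_{\cdot\cdot\cdot\cdot})^2 \qquad df = IJKL - 1$$

$$SSA = \sum_i \sum_j \sum_k \sum_l \hat{\alpha}_i^2 = JKL \sum_i (\bar{X}_{i\cdot\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot\cdot})^2 \qquad df = I - 1$$

$$SSAB = \sum_i \sum_j \sum_k \sum_l (\hat{\gamma}^{AB}_{ij})^2 = KL \sum_i \sum_j (\bar{X}_{ij\cdot\cdot} - \bar{X}_{i\cdot\cdot\cdot} - \bar{X}_{\cdot j\cdot\cdot} + \bar{X}_{\cdot\cdot\cdot\cdot})^2 \qquad df = (I - 1)(J - 1)$$

$$SSABC = \sum_i \sum_j \sum_k \sum_l \hat{\gamma}_{ijk}^2 = L \sum_i \sum_j \sum_k \hat{\gamma}_{ijk}^2 \qquad df = (I - 1)(J - 1)(K - 1)$$

$$SSE = \sum_i \sum_j \sum_k \sum_l (X_{ijkl} - \bar{X}_{ijk\cdot})^2 \qquad df = IJK(L - 1)$$

with the remaining main effect and two-factor interaction sums of squares obtained by symmetry. SST is the sum of the other eight SSs.

Each sum of squares (excepting SST) when divided by its df gives a mean square. Expected mean squares are

$$E(MSE) = \sigma^2$$

$$E(MSA) = \sigma^2 + \frac{JKL}{I - 1} \sum_i \alpha_i^2$$

$$E(MSAB) = \sigma^2 + \frac{KL}{(I - 1)(J - 1)} \sum_i \sum_j (\gamma^{AB}_{ij})^2$$

$$E(MSABC) = \sigma^2 + \frac{L}{(I - 1)(J - 1)(K - 1)} \sum_i \sum_j \sum_k \gamma_{ijk}^2$$

with similar expressions for the other expected mean squares. Main effect and interaction hypotheses are tested by forming F ratios with MSE in each denominator.

Null Hypothesis | Test Statistic Value | Rejection Region
$H_{0A}$: all $\alpha_i$'s $= 0$ | $f_A = MSA/MSE$ | $f_A \ge F_{\alpha,\, I-1,\, IJK(L-1)}$
$H_{0AB}$: all $\gamma^{AB}_{ij}$'s $= 0$ | $f_{AB} = MSAB/MSE$ | $f_{AB} \ge F_{\alpha,\, (I-1)(J-1),\, IJK(L-1)}$
$H_{0ABC}$: all $\gamma_{ijk}$'s $= 0$ | $f_{ABC} = MSABC/MSE$ | $f_{ABC} \ge F_{\alpha,\, (I-1)(J-1)(K-1),\, IJK(L-1)}$

Usually the main effect hypotheses are tested only if all interactions are judged not significant.

This analysis assumes that $L_{ijk} = L > 1$. If L = 1, then as in the two-factor case, the highest-order interactions must be assumed absent to obtain an MSE that estimates $\sigma^2$. Setting L = 1 and disregarding the fourth subscript summation over l, the foregoing formulas for sums of squares are still valid, and error sum of squares is $SSE = \sum_i \sum_j \sum_k \hat{\gamma}_{ijk}^2$ with $\bar{X}_{ijk\cdot} = X_{ijk}$ in the expression for $\hat{\gamma}_{ijk}$.

Example 10

The following observations (body temperature − 100°F) were reported in an experiment to study heat tolerance of cattle (“The Significance of the Coat in Heat Tolerance of Cattle,” Australian J. Agric. Res., 1959: 744–748).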
Measurements were made at four different periods (factor A, with I = 4) on two different strains of cattle (factor B, with J = 2) having four different types of coat (factor C, with K = 4); L = 3 observations were made for each of the 4 × 2 × 4 = 32 combinations of levels of the three factors. The table of cell totals ($x_{ijk\cdot}$'s) for all combinations of the three factors is

[Table of cell totals missing from source]

Figure 11.8 displays plots of the corresponding cell means $\bar{x}_{ijk\cdot} = x_{ijk\cdot}/3$. We will return to these plots after considering tests of various hypotheses.

Figure 11.8 Plots of $\bar{x}_{ijk\cdot}$ for Example 10

The basis for these tests is the ANOVA table given in Table 11.8.

Table 11.8 ANOVA Table for Example 10 [table body missing from source]

Since $F_{.01,9,64} \approx 2.70$ and $f_{ABC} = MSABC/MSE = .704$ does not exceed 2.70, we conclude that three-factor interactions are not significant. However, although the AB interactions are also not significant, both AC and BC interactions as well as all main effects seem to be necessary in the model. When there are no ABC or AB interactions, a plot of the $\bar{x}_{ijk\cdot}$'s separately for each level of C should reveal no substantial interactions (if only the ABC interactions are zero, plots are more difficult to interpret).

Latin Square Designs

When several factors are to be studied simultaneously, an experiment in which there is at least one observation for every possible combination of levels is referred to as a complete layout. If the factors are A, B, and C with I, J, and K levels, respectively, a complete layout requires at least IJK observations. Frequently an experiment of this size is either impracticable because of cost, time, or space constraints or literally impossible.
For example, if the response variable is sales of a certain product and the factors are different display configurations, different stores, and different time periods, then only one display configuration can realistically be used in a given store during a given time period. A three-factor experiment in which fewer than IJK observations are made is called an incomplete layout. There are some incomplete layouts in which the pattern of combinations of factors is such that the analysis is straightforward.

One such three-factor design is called a Latin square. It is appropriate when I = J = K (e.g., four display configurations, four stores, and four time periods) and all two- and three-factor interaction effects are assumed absent. If the levels of factor A are identified with the rows of a two-way table and the levels of B with the columns of the table, then the defining characteristic of a Latin square design is that every level of factor C appears exactly once in each row and exactly once in each column.

Figure 11.9 shows examples of 3 × 3, 4 × 4, and 5 × 5 Latin squares.

Figure 11.9 Examples of Latin squares

There are 12 different 3 × 3 Latin squares, and the number of different Latin squares increases rapidly with the number of levels (e.g., every permutation of rows of a given Latin square yields a Latin square, and similarly for column permutations).

It is recommended that the square used in an actual experiment be chosen at random from the set of all possible squares of the desired dimension; for further details, consult one of the chapter references. The letter N will denote the common value of I, J, and K. Then a complete layout with one observation per combination would require $N^3$ observations, whereas a Latin square requires only $N^2$ observations.
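The defining property and the count of 3 × 3 squares can be checked with a short Python sketch. The function names here are hypothetical. Note one assumption in particular: shuffling the rows, columns, and labels of a cyclic square is a simple way to randomize, but it is not guaranteed to sample uniformly from all Latin squares of a given size, which is what the recommendation above calls for.

```python
import itertools
import random


def cyclic_latin_square(n):
    """Cyclic n x n square: the entry in row i, column j is (i + j) mod n."""
    return [[(i + j) % n for j in range(n)] for i in range(n)]


def is_latin_square(square):
    """True if every symbol 0..n-1 appears once in each row and each column."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols
                  for j in range(n))
    return rows_ok and cols_ok


def shuffled_latin_square(n, rng):
    """Permute rows, columns, and labels of the cyclic square (assumed scheme)."""
    base = cyclic_latin_square(n)
    rows = rng.sample(range(n), n)
    cols = rng.sample(range(n), n)
    relabel = rng.sample(range(n), n)
    return [[relabel[base[r][c]] for c in cols] for r in rows]


# Brute-force count of 3 x 3 Latin squares: each row must be a permutation,
# so it is enough to check the columns of every triple of permutations.
count_3x3 = sum(
    1
    for rows in itertools.product(itertools.permutations(range(3)), repeat=3)
    if is_latin_square(rows)
)
print("number of 3 x 3 Latin squares:", count_3x3)

rng = random.Random(2024)
square = shuffled_latin_square(4, rng)
for row in square:
    print(" ".join("ABCD"[k] for k in row))
```

Row and column permutations preserve the Latin property, as does relabeling the symbols, so `square` always passes `is_latin_square`.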
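Once a particular square has been chosen, the value of k (the level of factor C) is completely determined by the values of i and j. To emphasize this, we use $x_{ij(k)}$ to denote the observed value when the three factors are at levels i, j, and k, respectively, with k taking on only one value for each i, j pair. The model equation for a Latin square design is

$$X_{ij(k)} = \mu + \alpha_i + \beta_j + \delta_k + \epsilon_{ij(k)} \qquad i, j, k = 1, \dots, N$$

where $\sum_i \alpha_i = \sum_j \beta_j = \sum_k \delta_k = 0$ and the $\epsilon_{ij(k)}$'s are independent and normally distributed with mean 0 and variance $\sigma^2$.

We employ the following notation for totals and averages:

$$X_{i\cdot\cdot} = \sum_j X_{ij(k)} \qquad X_{\cdot j\cdot} = \sum_i X_{ij(k)} \qquad X_{\cdot\cdot k} = \sum_{i,j} X_{ij(k)} \qquad X_{\cdot\cdot\cdot} = \sum_i \sum_j X_{ij(k)}$$

$$\bar{X}_{i\cdot\cdot} = \frac{X_{i\cdot\cdot}}{N} \qquad \bar{X}_{\cdot j\cdot} = \frac{X_{\cdot j\cdot}}{N} \qquad \bar{X}_{\cdot\cdot k} = \frac{X_{\cdot\cdot k}}{N} \qquad \bar{X}_{\cdot\cdot\cdot} = \frac{X_{\cdot\cdot\cdot}}{N^2}$$

Here the sum defining $X_{\cdot\cdot k}$ runs over the N cells (i, j) in which level k of factor C is used. Note that although $X_{i\cdot\cdot}$ previously suggested a double summation, now it corresponds to a single sum over all j (and the associated values of k).

Definition

Sums of squares for a Latin square experiment are

$$SST = \sum_i \sum_j (X_{ij(k)} - \bar{X}_{\cdot\cdot\cdot})^2 \qquad df = N^2 - 1$$

$$SSA = \sum_i \sum_j (\bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot\cdot\cdot})^2 \qquad df = N - 1$$

$$SSB = \sum_i \sum_j (\bar{X}_{\cdot j\cdot} - \bar{X}_{\cdot\cdot\cdot})^2 \qquad df = N - 1$$

$$SSC = \sum_i \sum_j (\bar{X}_{\cdot\cdot k} - \bar{X}_{\cdot\cdot\cdot})^2 \qquad df = N - 1$$

$$SSE = \sum_i \sum_j (X_{ij(k)} - \bar{X}_{i\cdot\cdot} - \bar{X}_{\cdot j\cdot} - \bar{X}_{\cdot\cdot k} + 2\bar{X}_{\cdot\cdot\cdot})^2 \qquad df = (N - 1)(N - 2)$$

$$SST = SSA + SSB + SSC + SSE$$

Each mean square is, of course, the ratio SS/df. For testing $H_{0C}$: $\delta_1 = \delta_2 = \cdots = \delta_N = 0$, the test statistic value is $f_C = MSC/MSE$, with $H_{0C}$ rejected if $f_C \ge F_{\alpha,\, N-1,\, (N-1)(N-2)}$. The other two main effect null hypotheses are also rejected if the corresponding F ratio is at least $F_{\alpha,\, N-1,\, (N-1)(N-2)}$.

If any of the null hypotheses is rejected, significant differences can be identified by using Tukey's procedure. After computing $w = Q_{\alpha,\, N,\, (N-1)(N-2)} \sqrt{MSE/N}$, pairs of sample means (the $\bar{x}_{i\cdot\cdot}$'s, $\bar{x}_{\cdot j\cdot}$'s, or $\bar{x}_{\cdot\cdot k}$'s) differing by more than w correspond to significant differences between associated factor effects (the $\alpha_i$'s, $\beta_j$'s, or $\delta_k$'s).

The hypothesis $H_{0C}$ is frequently the one of central interest. A Latin square design is used to control for extraneous variation in the A and B factors, as was done by a randomized block design for the case of a single extraneous factor.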
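The Latin square sums of squares can be sketched in a few lines of Python. Everything in this example is illustrative: the function name `latin_square_ss`, the cyclic 4 × 4 square, and the row, column, and treatment effects used to simulate data are all assumptions, not values from any experiment discussed here.

```python
import random


def latin_square_ss(x, c):
    """x[i][j]: response in row i, column j; c[i][j]: level of C used there."""
    n = len(x)
    cells = [(i, j) for i in range(n) for j in range(n)]
    grand = sum(x[i][j] for i, j in cells) / n ** 2
    row_mean = [sum(x[i]) / n for i in range(n)]
    col_mean = [sum(x[i][j] for i in range(n)) / n for j in range(n)]
    # Each level of C occurs in exactly n cells of a Latin square.
    trt_mean = [sum(x[i][j] for i, j in cells if c[i][j] == k) / n
                for k in range(n)]
    return {
        "A": n * sum((m - grand) ** 2 for m in row_mean),
        "B": n * sum((m - grand) ** 2 for m in col_mean),
        "C": n * sum((m - grand) ** 2 for m in trt_mean),
        "E": sum((x[i][j] - row_mean[i] - col_mean[j]
                  - trt_mean[c[i][j]] + 2 * grand) ** 2 for i, j in cells),
        "T": sum((x[i][j] - grand) ** 2 for i, j in cells),
    }


# Simulated data under the additive model (all numbers are made up).
random.seed(7)
n = 4
c = [[(i + j) % n for j in range(n)] for i in range(n)]  # cyclic Latin square
row_eff = [-1.0, 0.0, 0.5, 0.5]
col_eff = [0.3, -0.3, 0.6, -0.6]
trt_eff = [2.0, 0.0, -1.0, -1.0]
x = [[10 + row_eff[i] + col_eff[j] + trt_eff[c[i][j]] + random.gauss(0, 0.5)
      for j in range(n)] for i in range(n)]

ss = latin_square_ss(x, c)
df_error = (n - 1) * (n - 2)
f_c = (ss["C"] / (n - 1)) / (ss["E"] / df_error)
print({name: round(value, 3) for name, value in ss.items()})
print("f_C =", round(f_c, 2), "on", n - 1, "and", df_error, "df")
```

The printed `f_C` would be compared with the tabled critical value $F_{\alpha,\, N-1,\, (N-1)(N-2)}$. The additive identity SST = SSA + SSB + SSC + SSE relies on `c` being a genuine Latin square.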
In the product sales example mentioned previously, a Latin square design controls variation due to both stores and time periods, enabling an investigator to test for the presence of effects due to different product-display configurations.

Example 11

In an experiment to investigate the effect of relative humidity on abrasion resistance of leather cut from a rectangular pattern (“The Abrasion of Leather,” J. Inter. Soc. Leather Trades’ Chemists, 1946: 287), a 6 × 6 Latin square was used to control for possible variability due to row and column position in the pattern.

The six levels of relative humidity studied were 1 = 25%, 2 = 37%, 3 = 50%, 4 = 62%, 5 = 75%, and 6 = 87%, with the following results:

[Table of observations missing from source]

Also, $x_{\cdot\cdot 1} = 46.10$, $x_{\cdot\cdot 2} = 40.59$, $x_{\cdot\cdot 3} = 39.56$, $x_{\cdot\cdot 4} = 35.86$, $x_{\cdot\cdot 5} = 32.23$, $x_{\cdot\cdot 6} = 32.64$, $x_{\cdot\cdot\cdot} = 226.98$.

Further computations are summarized in Table 11.9.

Table 11.9 ANOVA Table for Example 11 [table body missing from source]

Since $F_{.05,5,20} = 2.71$ and $26.89 \ge 2.71$, $H_{0C}$ is rejected in favor of the hypothesis that relative humidity does on average affect abrasion resistance.

To apply Tukey's procedure, $w = Q_{.05,6,20}\sqrt{MSE/6}$ is computed with MSE taken from Table 11.9. Ordering the $\bar{x}_{\cdot\cdot k}$'s and underscoring yields

[Ordered means with underscoring missing from source]

In particular, the lowest relative humidity appears to result in a true average abrasion resistance significantly higher than for any other relative humidity studied.
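Part of the arithmetic in Example 11 can be retraced from the humidity totals alone, as in the Python sketch below. Two inputs are assumptions rather than values quoted in the text: MSE is not read from Table 11.9 but backed out from the reported ratio f = 26.89, and the studentized range value $Q_{.05,6,20} = 4.45$ is taken from a standard table.

```python
# Totals x..k for the six humidity levels (percent), as given in the example.
totals = {25: 46.10, 37: 40.59, 50: 39.56, 62: 35.86, 75: 32.23, 87: 32.64}
N = 6

grand_total = sum(totals.values())

# Computing formula for SSC: sum of squared totals over N, minus the
# correction term x...^2 / N^2.
ssc = sum(t ** 2 for t in totals.values()) / N - grand_total ** 2 / N ** 2
msc = ssc / (N - 1)

f_c = 26.89              # F ratio reported in the text
mse_implied = msc / f_c  # assumption: MSE backed out from f_C = MSC / MSE

q = 4.45                 # assumed tabled value of Q_{.05, 6, 20}
w = q * (mse_implied / N) ** 0.5

# Sample means, smallest to largest, paired with their humidity level.
means = sorted((t / N, humidity) for humidity, t in totals.items())

print("SSC =", round(ssc, 2), " MSC =", round(msc, 3))
print("implied MSE =", round(mse_implied, 4), " w =", round(w, 2))
for m, humidity in means:
    print(f"{humidity}%: {m:.2f}")

# Levels whose mean differs from the 25% mean by more than w.
top_mean = totals[25] / N
differs_from_25 = [h for h, t in totals.items()
                   if h != 25 and top_mean - t / N > w]
print("levels significantly below 25%:", sorted(differs_from_25))
```

Under these assumptions the 25% mean exceeds every other mean by more than w, consistent with the conclusion stated in the example.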