ARG/PDW: MCEN4027F00 VI: 1

ANOVA – TWO-FACTOR FACTORIAL EXPERIMENTS

In many experimental situations there are two or more factors (treatments) of interest. For simplicity we will describe the case of two factors: factor A with a levels and factor B with b levels.

We have previously considered ANOVA for p different levels of a single factor incorporated in a completely randomized design and in a randomized block design. A randomized block design is often called a two-way classification of data because it has the following characteristics:

1. It involves two independent variables: one factor and one direction of blocking.
2. Each level of one independent variable occurs with every level of the other independent variable.

A two-way classification of data always permits the display of the data in a two-way table, one containing r rows and c columns.

The treatment selection for a two-factor experiment may also yield a two-way classification of data. For example, suppose we want to relate the mean number of defects on a finished item to two factors: the type of nozzle for a varnish spray gun and the length of spraying time. Suppose further that we want to investigate the mean number of defects for three types of nozzles (three levels) and two lengths of spraying time (two levels). If we choose the treatments for the experiment to include all combinations of the three levels of nozzle type with the two levels of spraying time, we will obtain a two-way classification of data.

This selection of treatments is called a complete 3 x 2 factorial experiment. Note that the design, called a factorial design, will contain 3 x 2 = 6 treatments.

If we were to include a third factor, say, paint type at three levels, then a complete factorial experiment would include all 3 x 2 x 3 = 18 combinations, and the resulting collection of data would be termed a three-way classification of data.
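The treatment counts quoted above can be checked by enumerating the factor-level combinations directly. The following is a minimal sketch; the level labels (N1, short, P1, and so on) and the helper name factorial_treatments are hypothetical, since the notes give only the number of levels per factor.

```python
from itertools import product

# Hypothetical level labels; the notes specify only how many levels each factor has.
nozzle_types = ["N1", "N2", "N3"]   # factor A: 3 levels
spray_times = ["short", "long"]     # factor B: 2 levels
paint_types = ["P1", "P2", "P3"]    # optional third factor: 3 levels


def factorial_treatments(*factors):
    """Return every combination of one level from each factor."""
    return list(product(*factors))


two_factor = factorial_treatments(nozzle_types, spray_times)
three_factor = factorial_treatments(nozzle_types, spray_times, paint_types)

print(len(two_factor))    # number of treatments in the 3 x 2 design
print(len(three_factor))  # number of treatments in the 3 x 2 x 3 design
```

Each tuple in two_factor is one treatment of the complete factorial, so the list can also serve as the set of cells to which experimental runs are randomly assigned.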
Methodology

Suppose a two-way classification represents a two-factor factorial experiment with factor A at a levels and factor B at b levels. Further, assume that the ab treatments of the factorial experiment are replicated r times, so that there are r observations for each of the ab treatment combinations. Then the total number of observations is n = abr, and the total sum of squares, SS(Total), can be partitioned into four parts: SS(A), SS(B), SS(AB), and SSE.

The first two sums of squares are called main effect sums of squares to distinguish them from the interaction sum of squares, SS(AB).

When the number of observations per cell of a two-way factorial experiment is the same for every cell (r per cell), the sums of squares and the degrees of freedom for the analysis of variance are additive:

SS(Total) = SS(A) + SS(B) + SS(AB) + SSE

and

abr - 1 = (a - 1) + (b - 1) + (a - 1)(b - 1) + ab(r - 1)

The corresponding ANOVA table would then appear as follows:

Source   df                SS          MS                                    F
A        a - 1             SS(A)       MS(A) = SS(A)/(a - 1)                 MS(A)/MSE
B        b - 1             SS(B)       MS(B) = SS(B)/(b - 1)                 MS(B)/MSE
AB       (a - 1)(b - 1)    SS(AB)      MS(AB) = SS(AB)/[(a - 1)(b - 1)]      MS(AB)/MSE
Error    ab(r - 1)         SSE         MSE = SSE/[ab(r - 1)]
Total    abr - 1           SS(Total)

Note that for a factorial experiment the number of observations per factor-level combination, r, must always be two or more; otherwise ab(r - 1) = 0 and there are no degrees of freedom for SSE.

If either A or B represents a direction of blocking, then the AB interaction terms are deleted from the ANOVA table, and the degrees of freedom and sums of squares from lines 3 and 4 are combined to form a single source of error variation. This is because the block-treatment interaction is regarded as experimental error. The resulting ANOVA table would then appear as shown below:

Source   df                            SS              MS                 F
A        a - 1                         SS(A)           MS(A)              MS(A)/MSE
B        b - 1                         SS(B)           MS(B)              MS(B)/MSE
Error    (a - 1)(b - 1) + ab(r - 1)    SS(AB) + SSE    MSE = SS/df
Total    abr - 1                       SS(Total)

For reference, the notation used in the formulas for the respective sums of squares and the formulas are given below:

To test a hypothesis for any one of the three sources of variation, we proceed in exactly the same manner as before: we divide the appropriate mean square by the MSE and use the F ratio as the test statistic.
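The partition of SS(Total) described above can be sketched in a few lines of Python. This is an illustrative sketch only: it uses the deviation (definitional) form of each sum of squares, which is algebraically equivalent to the usual computational shortcuts for a balanced design but may not match the notation used in the course formulas. The function name two_factor_anova and the small data set are hypothetical.

```python
def two_factor_anova(y):
    """Partition SS(Total) for a balanced two-factor factorial experiment.

    y[i][j] is the list of r replicate observations for level i of
    factor A and level j of factor B (same r in every cell, r >= 2).
    Returns four dicts: sums of squares, degrees of freedom,
    mean squares, and F ratios (each mean square divided by MSE).
    """
    a, b, r = len(y), len(y[0]), len(y[0][0])
    if a < 2 or b < 2:
        raise ValueError("each factor needs at least two levels")
    if r < 2:
        raise ValueError("need r >= 2 replicates, otherwise SSE has no df")
    if any(len(row) != b or any(len(cell) != r for cell in row) for row in y):
        raise ValueError("design must be balanced (r observations per cell)")
    n = a * b * r

    # Grand mean, cell means, and the marginal means of each factor.
    grand = sum(obs for row in y for cell in row for obs in cell) / n
    cell_mean = [[sum(cell) / r for cell in row] for row in y]
    a_mean = [sum(cell_mean[i]) / b for i in range(a)]
    b_mean = [sum(cell_mean[i][j] for i in range(a)) / a for j in range(b)]

    ss = {
        "A": b * r * sum((m - grand) ** 2 for m in a_mean),
        "B": a * r * sum((m - grand) ** 2 for m in b_mean),
        "AB": r * sum(
            (cell_mean[i][j] - a_mean[i] - b_mean[j] + grand) ** 2
            for i in range(a) for j in range(b)
        ),
        "Error": sum(
            (obs - cell_mean[i][j]) ** 2
            for i in range(a) for j in range(b) for obs in y[i][j]
        ),
        "Total": sum(
            (obs - grand) ** 2 for row in y for cell in row for obs in cell
        ),
    }
    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1),
          "Error": a * b * (r - 1), "Total": n - 1}
    ms = {k: ss[k] / df[k] for k in ("A", "B", "AB", "Error")}
    f = {k: ms[k] / ms["Error"] for k in ("A", "B", "AB")}
    return ss, df, ms, f


# Hypothetical 2 x 2 design with r = 2 replicates per cell (not course data).
y = [[[1, 3], [3, 5]],
     [[5, 7], [3, 5]]]
ss, df, ms, f = two_factor_anova(y)
print(ss, df, f)
```

Each F ratio would then be compared with the tabulated F value for its numerator degrees of freedom and the ab(r - 1) error degrees of freedom. For a blocked design, the same function could be reused by adding ss["AB"] to ss["Error"] and df["AB"] to df["Error"] before forming the mean squares.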
ANOVA Example 3

A company that stamps gaskets out of sheets of rubber, plastic, and other materials wants to compare the mean number of gaskets produced per hour for two different types of stamping machines. Practically, the manufacturer wants to determine whether one machine is more productive than the other and, even more important, whether one machine is more productive in producing rubber gaskets while the other is more productive in producing plastic gaskets.

To answer these questions, the manufacturer decides to conduct a 2 x 3 factorial experiment using three types of gasket materials, B1, B2, and B3, with each of the two types of stamping machines, A1 and A2. Each machine is operated for three one-hour periods for each of the gasket materials, with the 18 one-hour time periods assigned to the six machine-material combinations in random order (to eliminate the possibility that uncontrolled environmental factors might bias the results).

Assume that we have calculated the six treatment means. Two hypothetical plots of the six means are shown below. What does each of these plots imply about the productivity of the two stamping machines?

The first figure suggests that machine A1 produces a larger number of gaskets per hour regardless of the gasket material, and is therefore superior to machine A2. On average, machine A1 stamps out more B1 gaskets per hour than B2 or B3 gaskets, but the difference in the mean numbers of gaskets produced by the two machines remains approximately the same regardless of the material. Thus, the difference in the mean number of gaskets produced by the two machines is independent of the material used in the stamping process.

In contrast, the second figure shows the productivity of machine A1 to be greater than that of machine A2 when the gasket material is B1 or B3, but the means are reversed for B2, where A2 produces more on average than A1. This figure therefore illustrates a situation in which the mean value of the response variable depends on the combination of the factor levels. When this situation occurs, we say that the factors interact.

Thus, one of the most important objectives of a factorial experiment is to detect factor interaction if it exists. Tests for main effects are relevant only when no interaction exists between the factors. Generally, the test for interaction is performed first. If there is evidence of factor interaction, then the tests for main effects are not performed.

In summary: in a factorial experiment, when the difference in the mean response between levels of factor A depends on the level of factor B, we say that factors A and B interact. If the difference between the levels of A is independent of the level of B, then there is no interaction between A and B.

ANOVA Example 4

A manufacturer, whose daily supply of raw materials is variable and limited, can use the material to produce two different products in various proportions. The profit per unit of raw material obtained by producing each of the two products depends on the length of the product's manufacturing run and hence on the amount of raw material assigned to it. Other factors, such as worker productivity and machine breakdown, affect the profit per unit as well, but their net effect on profit is random and uncontrollable.

The manufacturer has conducted an experiment to investigate the effect of the level of supply of raw materials (S) and the ratio of its assignment (R) to the two product manufacturing lines on the profit y per unit of raw material. The ultimate goal is to be able to choose the best ratio R to match each day's supply of raw materials S.

The levels of supply of the raw material chosen for the experiment were 15, 18, and 21 tons; the levels of the ratio of allocation to the two product lines were 0.5, 1, and 2.
The response was the profit ($) per unit of raw material supply obtained from a single day's production. Three replications of a complete 3 x 3 factorial experiment were conducted in a random sequence, and the results are shown below.

The presence of interaction tells you that the mean profit depends on the particular combination of levels of S and R. Consequently, there is little point in checking whether the means differ for the three levels of S or the three levels of R; that is, we will not perform the tests for main effects. For example, the supply level that provides the highest mean profit (averaged over all levels of R) might not be the supply level that appears in the supply-ratio combination producing the largest mean profit per unit of raw material.
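The last point can be illustrated with a small numerical sketch. The cell means below are purely hypothetical values chosen to show an interaction; they are not the experiment's results, which are not reproduced here. The function names are likewise illustrative assumptions.

```python
# HYPOTHETICAL cell means of profit per unit of raw material, indexed as
# cell_means[S][R]. These numbers are invented for illustration only.
cell_means = {
    15: {0.5: 23.0, 1: 20.0, 2: 17.0},
    18: {0.5: 21.0, 1: 23.0, 2: 22.0},
    21: {0.5: 16.0, 1: 20.0, 2: 24.0},
}


def best_ratio_per_supply(means):
    """For each supply level S, return the ratio R with the highest cell mean."""
    return {s: max(row, key=row.get) for s, row in means.items()}


def best_marginal_supply(means):
    """Return the supply level S with the highest mean averaged over all R."""
    return max(means, key=lambda s: sum(means[s].values()) / len(means[s]))


def best_cell(means):
    """Return the (S, R) combination with the single highest cell mean."""
    return max(
        ((s, r) for s in means for r in means[s]),
        key=lambda sr: means[sr[0]][sr[1]],
    )


print(best_ratio_per_supply(cell_means))
print(best_marginal_supply(cell_means))
print(best_cell(cell_means))
```

With these invented means, the best ratio changes from one supply level to the next, and the supply level with the highest marginal mean is not the one appearing in the best single cell. That is the pattern that makes main-effect comparisons uninformative when interaction is present, and it is why the practical question becomes which R to choose for each S.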