Multi-factor experiments*
In Section 11.4, attention was confined to a single experimental factor. In many cases, however,
several factors may potentially affect a process. In this section, we address some of the issues
that arise in multi-factor experiments. The first issue concerns whether factor effects should be
studied one factor at a time or whether several factors should be studied simultaneously. The
"one-factor-at-a-time" approach has been used traditionally in much scientific investigation.
Here, we show that it is inferior to the multi-factor approach, recommended by statisticians, in at
least two ways.
Traditional vs. statistical design
Consider a process that may be affected by two factors, say a chemical manufacturing process
where the yield of the process may be affected by operating pressure and operating temperature.
A choice is to be made between two possible temperature levels, say "Low" and "High", as well
as between low and high levels of pressure. Suppose available resources allow that the process
may be run in experimental mode twelve times. It is easily demonstrated that the two-factor
approach makes more efficient use of the data in determining the best level for each factor.
The traditional approach involves two steps:
1. keep one factor fixed at its standard level, say Temperature at "Low", run the process
   at each level of Pressure and choose the best level,
2. with Pressure set at its best level, run the process at the "High" level of Temperature
   and choose the best level of Temperature.
With twelve experimental runs, it is natural to run the process four times at each factor level
combination, that is
Low Temperature, Low Pressure,
Low Temperature, High Pressure,
High Temperature, Best Pressure.
In the first step, the effect of changing Pressure is assessed by comparing the average of the four
measurements of yield at Low Pressure with the average of the four measurements at High
Pressure, with Temperature at its Low level in each case.
In the second step, the effect of changing Temperature is assessed by comparing the better of
those two averages of four measurements with the average of the four measurements at High
Temperature.
Now, consider the statistically recommended design, which looks at all combinations of levels of
both factors in a single study. This may be illustrated as follows where the subscripted Y's
represent the twelve yield measurements made.
                 Low Temperature    High Temperature
High Pressure    Y7  Y8  Y9         Y10 Y11 Y12
Low Pressure     Y1  Y2  Y3         Y4  Y5  Y6
Figure 11.5.1 Illustration of a full factorial design
* from An Introduction to Statistical Analysis for Business and Industry by Michael Stuart.
Section 11.5, pp. 359-364. Copyright © 2003.
In this design, there are six measurements made at each level of each factor:
Y1, Y2, Y3, Y4, Y5 and Y6 at Low Pressure, compared to
Y7, Y8, Y9, Y10, Y11 and Y12 at High Pressure;
Y1, Y2, Y3, Y7, Y8 and Y9 at Low Temperature, compared to
Y4, Y5, Y6, Y10, Y11 and Y12 at High Temperature.
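As a rough illustration of the layout in Figure 11.5.1, the following Python sketch enumerates the twelve runs and lists which measurements fall at each factor level. The variable names are illustrative assumptions, and the three replicates per combination simply follow from spreading twelve runs over four combinations.

```python
from itertools import product

LEVELS = ["Low", "High"]
REPLICATES = 3  # twelve runs spread evenly over the four combinations

# Pressure varies slowest so that the labels match Figure 11.5.1:
# Y1-Y6 at Low Pressure, Y7-Y12 at High Pressure.
runs = []
for pressure, temperature in product(LEVELS, LEVELS):
    for _ in range(REPLICATES):
        label = f"Y{len(runs) + 1}"
        runs.append((label, pressure, temperature))

# Collect the six measurements available at each level of each factor.
low_pressure = [label for label, p, t in runs if p == "Low"]
high_pressure = [label for label, p, t in runs if p == "High"]
low_temperature = [label for label, p, t in runs if t == "Low"]
high_temperature = [label for label, p, t in runs if t == "High"]

print("Low Pressure:    ", low_pressure)
print("High Pressure:   ", high_pressure)
print("Low Temperature: ", low_temperature)
print("High Temperature:", high_temperature)
```

Every measurement appears in exactly one Pressure group and one Temperature group, which is why each run contributes to both comparisons.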
Thus, the effect of changing the levels of each factor is assessed by comparing an average of six
measurements with an average of six. This represents a considerable improvement on the
comparison of four with four employed in the traditional approach. To achieve the same quality
of comparison with the traditional approach, eighteen measurements, divided into three subsets
of six, would be required. Looking at it in another way, with the two-factor design, all twelve
measurements are used twice in assessing the factor effects whereas, with the one-at-a-time
approach, four measurements are used twice while eight are used only once. Thus, the two-factor
approach makes much more efficient use of the twelve measurements available.
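The gain in precision can be put in numbers with a small sketch. It assumes, beyond what the text states, that the measurements are independent with a common run-to-run standard deviation, so that the standard error of a difference between two averages of n runs each is sigma × sqrt(2/n). The value of sigma used here is purely hypothetical.

```python
import math


def se_of_difference(sigma, n_per_group):
    """Standard error of the difference between two independent averages,
    each based on n_per_group runs with common standard deviation sigma."""
    return sigma * math.sqrt(2.0 / n_per_group)


sigma = 1.0  # hypothetical run-to-run standard deviation (an assumption)

se_one_at_a_time = se_of_difference(sigma, 4)  # four versus four
se_two_factor = se_of_difference(sigma, 6)     # six versus six
ratio = se_two_factor / se_one_at_a_time

# Runs the one-at-a-time scheme needs to compare six with six:
# three factor level combinations, six runs each.
runs_needed = 3 * 6

print(f"one-at-a-time SE: {se_one_at_a_time:.3f}")
print(f"two-factor SE:    {se_two_factor:.3f}")
print(f"ratio:            {ratio:.3f}")
print(f"runs needed for equal precision: {runs_needed}")
```

Under these assumptions the ratio is sqrt(4/6), about 0.82, whatever the value of sigma.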
There is a more subtle difference between the two approaches, which is demonstrated here with
the aid of some hypothetical data. Suppose that, in a study following the traditional approach, the
average response at low and high pressure, with temperature low in both cases, were 65 and 60,
respectively. On this basis, low pressure gives the higher process yield and so the next step is to
keep pressure low and run the process at high temperature. Suppose that the average yield under
these conditions is 70. Assuming that the standard operating conditions are low temperature and
low pressure, the conclusion from this experiment is that an improvement can be achieved by
running the process at high temperature while retaining pressure at its low level.
Figure 11.5.2 Hypothetical results using the traditional approach
There is a potential flaw in this approach, however, arising from the fact that process
performance has not been evaluated with both factors at their high levels. Conceivably, the yield
in this case could be higher than at any other factor level combination, for example, as illustrated
in Figure 11.5.3.
Figure 11.5.3 Hypothetical results using the recommended approach
If, using the one-at-a-time approach, Temperature, rather than Pressure, had been studied first,
then the best combination of levels would have been found; at the first step, high temperature
would have been chosen as best and the second step, comparing low and high pressure at high
temperature, would have led to the best combination. However, it cannot be regarded as
satisfactory that locating the optimum conditions depends on having the good fortune to pick the
right factor to study first. Here, the choice is between two factors. With several factors, there are
very many sequences of factors that might be chosen to study one at a time and, typically, very
few sequences will lead to the optimum conditions. An experimental strategy that has just a
small chance of locating the optimal conditions can hardly be recommended.
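The dependence on the order of study can be traced with a short sketch that applies the one-at-a-time search to the four average yields of Figure 11.5.3 (65, 60, 70 and 75). The function name and data layout are illustrative assumptions, and the averages are treated as exact, ignoring experimental error.

```python
# Average yields from the hypothetical example, keyed by (temperature, pressure).
YIELDS = {
    ("Low", "Low"): 65,
    ("Low", "High"): 60,
    ("High", "Low"): 70,
    ("High", "High"): 75,
}
LEVELS = ("Low", "High")


def one_at_a_time(first_factor):
    """Search one factor at a time, starting from the standard Low/Low setting.

    first_factor is "temperature" or "pressure"; the other factor is studied
    second. Returns the (temperature, pressure) combination finally chosen.
    """
    setting = {"temperature": "Low", "pressure": "Low"}
    order = ["temperature", "pressure"]
    if first_factor == "pressure":
        order.reverse()
    for factor in order:
        def yield_at(level):
            # Yield with this factor at `level` and the other factor as currently set.
            trial = dict(setting, **{factor: level})
            return YIELDS[(trial["temperature"], trial["pressure"])]
        setting[factor] = max(LEVELS, key=yield_at)
    return setting["temperature"], setting["pressure"]


pressure_first = one_at_a_time("pressure")
temperature_first = one_at_a_time("temperature")
best_overall = max(YIELDS, key=YIELDS.get)

print("Pressure first:   ", pressure_first, YIELDS[pressure_first])
print("Temperature first:", temperature_first, YIELDS[temperature_first])
print("Best combination: ", best_overall, YIELDS[best_overall])
```

Studying Pressure first stops at High Temperature with Low Pressure (yield 70), whereas studying Temperature first reaches High Temperature with High Pressure (yield 75).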
An explanation for the possible failure of the one-factor-at-a-time approach in this case may be
found in the pattern of changes illustrated in Figure 11.5.3. Note that process yield increases by
5, from 65 to 70, when Temperature changes from Low to High at the Low level of Pressure.
However, at the High level of Pressure, the effect of changing from Low to High Temperature is
15, that is, from 60 to 75. Correspondingly, at the Low level of Temperature, yield decreases by
5, from 65 to 60, when Pressure is changed from Low to High, whereas, at the High level of
Temperature, yield increases by 5, from 70 to 75.
In short, the effect of changing the level of one factor depends on the level of the other factor. In
statistical terminology, this is referred to as an interaction between the factors.
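The arithmetic behind the interaction can be laid out as a minimal sketch using the four hypothetical average yields. The variable names are illustrative. The final line uses one common convention, not stated in the text, which takes the interaction effect to be half the difference between the effects of one factor at the two levels of the other.

```python
# Average yields from the hypothetical example, keyed by (temperature, pressure).
yields = {
    ("Low", "Low"): 65,
    ("Low", "High"): 60,
    ("High", "Low"): 70,
    ("High", "High"): 75,
}

# Effect of raising Temperature at each level of Pressure.
temp_effect_at_low_pressure = yields[("High", "Low")] - yields[("Low", "Low")]
temp_effect_at_high_pressure = yields[("High", "High")] - yields[("Low", "High")]

# Effect of raising Pressure at each level of Temperature.
pressure_effect_at_low_temp = yields[("Low", "High")] - yields[("Low", "Low")]
pressure_effect_at_high_temp = yields[("High", "High")] - yields[("High", "Low")]

# Assumed convention: interaction = half the difference between the two effects.
interaction = (temp_effect_at_high_pressure - temp_effect_at_low_pressure) / 2

print("Temperature effect at Low / High Pressure:",
      temp_effect_at_low_pressure, temp_effect_at_high_pressure)
print("Pressure effect at Low / High Temperature:",
      pressure_effect_at_low_temp, pressure_effect_at_high_temp)
print("Interaction effect:", interaction)
```

The same interaction value results whichever factor is used to compute it; with no interaction, the two effects of a factor would be equal and the value would be zero.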
Several levels
With two factors, each at just two levels, there are just four possible combinations. If there are more
than two levels per factor, the number of combinations increases. For this reason, it is advisable
to keep the number of levels to a minimum. For many purposes, two levels are adequate. If the
relationship between the response variable and the factors is non-linear, however, three levels
may be advisable. Consider the following example.
Suppose an experimental change of temperature from 50 (Low) to 60 (High) resulted in a yield
improvement from 65 to 69. This may be depicted as in Figure 11.5.4, as commonly seen in
statistical software.
Figure 11.5.4 Effects plot for one factor experiment (axis label: Temperature)
Implicit in choosing the high level of temperature in this case is an assumption that the yield
curve relating yield to temperature is linear, as depicted. Suppose, however, that the process had
been run at a third Temperature level, say 55, intermediate between Low and High, with results
as depicted in Figure 11.5.5.
Figure 11.5.5 Effects plot for one factor experiment with three factor levels (axis label: Temperature)
This clearly shows that the intermediate level is better. Conceivably, there may be other better
levels, possibly as depicted in Figure 11.5.6.
Figure 11.5.6 Effects plot for a one factor experiment and a possible response curve (axis label: Temperature)
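A simple check for non-linearity with three equally spaced levels is to compare the yield at the middle level with the value a straight line between the two end levels would predict. The sketch below uses the yields of 65 at 50 and 69 at 60 from the example. The yield at 55 is not given in the text, so the value of 72 is purely hypothetical, as is the function name.

```python
def curvature(y_low, y_mid, y_high):
    """Middle-level yield minus the straight-line prediction at the midpoint.

    Assumes three equally spaced levels. Zero means the three points lie on a
    straight line; a positive value means the middle level does better than a
    linear relationship would suggest.
    """
    return y_mid - (y_low + y_high) / 2


y_at_50 = 65  # from the example
y_at_60 = 69  # from the example
y_at_55 = 72  # HYPOTHETICAL value, chosen only to illustrate the calculation

linear_prediction_at_55 = (y_at_50 + y_at_60) / 2
print("Straight-line prediction at 55:", linear_prediction_at_55)
print("Curvature:", curvature(y_at_50, y_at_55, y_at_60))
```

With only two levels, this quantity cannot be computed at all, which is the sense in which a two-level experiment assumes linearity.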
With two factors, the response relationship may be depicted as a response surface. With more
than two factors, graphical representation becomes virtually impossible. However, multi-factor
designs with three or more levels may be used to assist in identifying optimal conditions.
Several factors
Two-factor experiments are relatively simple. To allow for interaction, all that is needed is to
ensure that all possible factor level combinations are included in the experiment. The principles
of blocking and randomisation apply just as readily; all that is required is to ensure that each
possible combination of factor levels occurs once within each homogeneous block with random
assignment of combinations to experimental units within a block.
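One way of producing such a run plan is sketched below: each block receives every factor level combination once, in an independently shuffled order. The function name, the number of blocks and the seed are illustrative assumptions. A fixed seed is used only so that the plan can be reproduced.

```python
import random
from itertools import product


def randomised_block_plan(n_blocks, seed=None):
    """Return a list of (block number, run order) pairs.

    Each run order contains every (temperature, pressure) combination exactly
    once, shuffled independently within each block.
    """
    rng = random.Random(seed)
    combinations = list(product(("Low", "High"), repeat=2))
    plan = []
    for block in range(1, n_blocks + 1):
        order = combinations[:]  # copy, so each block is shuffled separately
        rng.shuffle(order)
        plan.append((block, order))
    return plan


plan = randomised_block_plan(n_blocks=3, seed=2003)
for block, order in plan:
    print(f"Block {block}:")
    for temperature, pressure in order:
        print(f"  Temperature {temperature}, Pressure {pressure}")
```

Three blocks of four runs give the twelve runs considered earlier, with block-to-block differences balanced across all four combinations.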
As the number of factors increases, the number of level combinations rapidly increases. With
three factors, each with two levels, the number of level combinations is 2 × 2 × 2 = 8. If blocking
and replication are required, the number of experimental runs required builds up very quickly. In
such circumstances, not only do resources become a problem but also the task of controlling the
experimental environment over the length of time necessary to complete the experiment becomes
increasingly difficult. In addition, there are now three possible two-factor interactions and a
possible three-factor interaction, whose presence will complicate any analysis carried out. With
four two-level factors, the number of possible level combinations is 16; with five, it is 32; with six,
64; and so on. Effectively, so-called full factorial experiments, where the process is run with all
possible level combinations, quickly become impossible. Nevertheless, when a process is subject
to the possible influence of several factors, it is important to be able to distinguish the few (it is
hoped) factors that have substantial effects. Fortunately, suitable designs have been devised for
this purpose, sometimes referred to as screening designs. Carefully selected fractions (a half, a
quarter, an eighth, or less) of the full set of possible combinations may be chosen which give the
necessary information when implemented experimentally. These designs are also referred to as
fractional factorial designs. Their success depends on an assumption that there are no high order
interactions or, in other words, that the response relationship is not too complicated.
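The growth in the number of combinations, and one way of selecting a half fraction, can be sketched as follows. The text does not say how the fractions are chosen. The rule used here, keeping the runs whose coded levels (-1 for low, +1 for high) multiply to +1, is a standard textbook construction and is offered only as an illustration, as are the function names.

```python
from itertools import product
from math import comb, prod


def full_factorial(k):
    """All 2**k combinations of k two-level factors, coded -1 (low) and +1 (high)."""
    return list(product((-1, +1), repeat=k))


def half_fraction(k):
    """Half of the full factorial: the runs whose coded levels multiply to +1.

    In such a fraction the highest-order interaction cannot be estimated, which
    is acceptable only if high-order interactions are assumed to be absent.
    """
    return [run for run in full_factorial(k) if prod(run) == +1]


for k in range(2, 7):
    print(f"{k} factors: {len(full_factorial(k))} combinations, "
          f"{comb(k, 2)} two-factor interactions, "
          f"{len(half_fraction(k))} runs in the half fraction")

print("Half fraction for three factors:")
for run in half_fraction(3):
    print(" ", run)
```

For three factors, the half fraction has four runs instead of eight, and each factor still appears twice at its low level and twice at its high level.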