IE341: Introduction to Design of Experiments
Table of Contents

Single factor ANOVAs with more than 2 levels
Completely randomized designs
Fixed effects models
Decomposition of total sum of squares
ANOVA table
Testing model adequacy
Multiple comparison methods
Random effects model
Repeated measures design
Randomized complete blocks design
Latin Square
Graeco-Latin Square
Regression approach to ANOVA
Factorial designs
Regression model version
2-factor experiments
3-factor experiments
Tests for quadratic effects
Blocking
2^k factorial designs
Single replicate designs
Addition of center points
Fractional factorials
Design resolution
Complementary ½ fractions
¼ fractions
Taguchi approach to experimentation
Random effects model
Mixed models
ANCOVA
Nested designs
2-stage nested designs
3-stage and m-stage nested designs
Split-plot designs
Split-split-plot designs
Transformations
Log transformation of standard deviations
Logit transformations of odds ratios
Kruskal-Wallis rank transformation
Response Surface Methodology
First order
Second order and CCD
Canonical analysis
Response Surface designs
Central Composite Design (CCD)
Box-Behnken Designs
Mixture Experiments
Simplex designs
Last term we talked about testing the difference
between two independent means. For means from a
normal population, the test statistic is
$$t = \frac{\bar{X}_A - \bar{X}_B}{s_{diff}} = \frac{\bar{X}_A - \bar{X}_B}{\sqrt{\dfrac{s_A^2}{n_A} + \dfrac{s_B^2}{n_B}}}$$
where the denominator is the estimated standard
deviation of the difference between two independent
means. This denominator represents the random
variation to be expected with two different samples.
Only if the difference between the sample means is
much greater than the expected random variation do
we declare the means different.
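As a quick illustration (not from the original slides), here is a minimal Python sketch of this statistic; the two samples are made-up numbers, and scipy's unequal-variance t-test is used as a cross-check.

```python
import numpy as np
from scipy import stats

# Two hypothetical independent samples (illustrative values only)
xa = np.array([20.1, 21.4, 19.8, 22.0, 20.6])
xb = np.array([18.9, 19.5, 20.2, 18.4, 19.1])

# Estimated standard deviation of the difference between the two means
s_diff = np.sqrt(xa.var(ddof=1) / len(xa) + xb.var(ddof=1) / len(xb))
t = (xa.mean() - xb.mean()) / s_diff

# scipy's unequal-variance (Welch) t-test computes the same statistic
t_check, p = stats.ttest_ind(xa, xb, equal_var=False)
print(t, t_check, p)
```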
We also covered the case where the
two means are not independent, and
what we must do to account for the fact
that they are dependent.
And finally, we talked about the difference
between two variances, where we used the
F ratio. The F distribution is the ratio of two independent chi-square variables, each divided by its degrees of freedom. So if s₁² and s₂² are independent sample variances based on v₁ and v₂ df, respectively, then (when the population variances are equal)

$$F = \frac{s_1^2}{s_2^2}$$

has the F distribution with v₁ and v₂ df.
All of this is valuable if we are testing only two means. But
what if we want to test to see if there is a difference among
three means, or four, or ten?
What if we want to know whether fertilizer A or fertilizer B or
fertilizer C is best? In this case, fertilizer is called a factor,
which is the condition under test.
A, B, C, the three types of fertilizer under test, are called levels
of the factor fertilizer.
Or what if we want to know if treatment A or treatment B or
treatment C or treatment D is best? In this case, treatment is
called a factor.
A,B,C,D, the four types of treatment under test, are called
levels of the factor treatment.
It should be noted that the factor may be quantitative or
qualitative.
Enter the analysis of variance!
ANOVA, as it is usually called, is a way to test the
differences between means in such situations.
Previously, we tested single-factor experiments with
only two treatment levels. These experiments are
called single-factor because there is only one factor
under test. Single-factor experiments are more
commonly called one-way experiments.
Now we move to single-factor experiments with more
than two treatment levels.
Let’s start with some notation.
Y_ij = the ith observation in the jth level

N = total number of experimental observations

Ȳ = the grand mean of all N experimental observations:

$$\bar{Y} = \frac{\sum_{j=1}^{J}\sum_{i=1}^{n_j} Y_{ij}}{N}$$

Ȳ_j = the mean of the observations in the jth level:

$$\bar{Y}_j = \frac{\sum_{i=1}^{n_j} Y_{ij}}{n_j}$$
nj = number of observations in the jth level; the nj are
called replicates.
Replication of the design refers to using more than
one experimental unit for each level.
If there are the same number n replicates for
each treatment, the design is said to be
balanced. Designs are more powerful if they
are balanced, but balance is not always
possible.
Suppose you are doing an experiment and
the equipment breaks down on one of the
tests. Now, not by design but by
circumstance, you have unequal numbers of
replicates for the levels.
In all the formulas, we used nj as the number
of replicates in treatment j, not n, so there is
no problem.
Notation continued

τ_j = the effect of the jth level:

$$\tau_j = \bar{Y}_j - \bar{Y}$$

J = number of treatment levels

e_ij = the "error" associated with the ith observation in the jth level; the e_ij are assumed to be independent, normally distributed random variables with mean = 0 and variance = σ², which are constant for all levels of the factor.
For all experiments, randomization is
critical. So to draw any conclusions
from the experiment, we must require
that the treatments be run in random
order.
We must also assign the experimental
units to the treatments randomly.
If all this randomization occurs, the
design is called a completely
randomized design.
ANOVA begins with a linear statistical model:

$$Y_{ij} = \bar{Y} + \tau_j + e_{ij}$$

This model is for a one-way or single-factor ANOVA. The goal of the model is to test hypotheses about the treatment effects and to estimate them.
If the treatments have been selected by
the experimenter, the model is called a
fixed-effects model. For fixed-effects
models, we are interested in differences
between treatment means. In this case,
the conclusions will apply only to the
treatments under consideration.
Another type of model is the random
effects model or components of
variance model.
In this situation, the treatments used are a random sample from a large population of treatments. Here the τ_j are random variables, and we are interested in their variability, not in the differences among the means being tested.
First, we will talk about fixed effects,
completely randomized, balanced models.
In the model we showed earlier, the τ_j are defined as deviations from the grand mean, so

$$\sum_{j=1}^{J} \tau_j = 0$$

It follows that the mean of the jth treatment is

$$\bar{Y}_j = \bar{Y} + \tau_j$$
Now the hypothesis under test is:

H₀: μ₁ = μ₂ = μ₃ = ⋯ = μ_J
Hₐ: μ_j ≠ μ_k for at least one j, k pair
The test procedure is ANOVA, which is a decomposition of the total sum of squares into its component parts according to the model.
The total SS is

$$SS_{total} = \sum_{j=1}^{J}\sum_{i=1}^{n_j} (Y_{ij} - \bar{Y})^2$$

and ANOVA is about dividing it into its component parts.

SS_treatments = variability of the differences among the J levels:

$$SS_{treatments} = \sum_{j=1}^{J} n_j (\bar{Y}_j - \bar{Y})^2$$

SS_error = pooled variability of the random error within levels:

$$SS_{error} = \sum_{j=1}^{J}\sum_{i=1}^{n_j} (Y_{ij} - \bar{Y}_j)^2$$
This is easy to see because

$$\sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij}-\bar{Y})^2 = \sum_{j=1}^{J}\sum_{i=1}^{n_j}\left[(\bar{Y}_j-\bar{Y}) + (Y_{ij}-\bar{Y}_j)\right]^2 = \sum_{j=1}^{J} n_j(\bar{Y}_j-\bar{Y})^2 + \sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij}-\bar{Y}_j)^2 + 2\sum_{j=1}^{J}\sum_{i=1}^{n_j}(\bar{Y}_j-\bar{Y})(Y_{ij}-\bar{Y}_j)$$

But the cross-product term vanishes because

$$\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y}_j) = 0$$

So SS_total = SS_treatments + SS_error.
Most of the time, this is called

SS_total = SS_between + SS_within

Each of these terms becomes an MS (mean square) term when divided by the appropriate df:

$$MS_{treatments} = \frac{SS_{treatments}}{df_{treatments}} = \frac{SS_{treatments}}{J-1}$$

$$MS_{error} = \frac{SS_{error}}{df_{error}} = \frac{SS_{error}}{N-J}$$
The df for SS_error is N − J because

$$\sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij}-\bar{Y}_j)^2 = \sum_{j=1}^{J}(n_j-1)s_j^2$$

and

$$\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2 + \cdots + (n_J-1)s_J^2}{(n_1-1) + (n_2-1) + \cdots + (n_J-1)} = \frac{SS_{error}}{N-J}$$

The df for SS_between is J − 1 because there are J levels.
Now the expected values of each of these terms are

$$E(MS_{error}) = \sigma^2$$

$$E(MS_{treatments}) = \sigma^2 + \frac{\sum_{j=1}^{J} n_j \tau_j^2}{J-1}$$
Now if there are no differences among the treatment means, then τ_j = 0 for all j. So we can test for differences with our old friend F:

$$F = \frac{MS_{treatments}}{MS_{error}}$$

with J − 1 and N − J df.

Under H₀, both numerator and denominator are estimates of σ², so the result will not be significant. Under Hₐ, the result should be significant because the numerator is estimating the treatment effects as well as σ².
The results of an ANOVA are presented in an ANOVA table. For this one-way, fixed-effects, balanced model:

Source   SS           df    MS           p
Model    SS_between   J-1   MS_between   p
Error    SS_within    N-J   MS_within
Total    SS_total     N-1
Let’s look at a simple example.
A product engineer is investigating the
tensile strength of a synthetic fiber to
make men’s shirts. He knows from prior
experience that the strength is affected
by the weight percent of cotton in the
material. He also knows that the
percent should range between 10% and
40% so that the shirts can receive
permanent press treatment.
The engineer decides to test 5 levels:
15%, 20%, 25%, 30%, 35%
and to have 5 replicates in this design.
His data are

%     Observations            Ȳ_j
15     7   7  15  11   9       9.8
20    12  17  12  18  18      15.4
25    14  18  18  19  19      17.6
30    19  25  22  19  23      21.6
35     7  10  11  15  11      10.8

Grand mean Ȳ = 15.04.
In this tensile strength example, the ANOVA table is

Source   SS       df   MS       p
Model    475.76    4   118.94   <0.01
Error    161.20   20     8.06
Total    636.96   24
In this case, we would reject Ho and
declare that there is an effect of the
cotton weight percent.
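If you want to verify this table yourself, here is a short Python sketch (assuming numpy and scipy are available) that reproduces the sums of squares and the F test from the raw data above.

```python
import numpy as np
from scipy import stats

# Tensile strength observations for each cotton weight percent
levels = {15: [7, 7, 15, 11, 9],
          20: [12, 17, 12, 18, 18],
          25: [14, 18, 18, 19, 19],
          30: [19, 25, 22, 19, 23],
          35: [7, 10, 11, 15, 11]}

groups = [np.array(v, dtype=float) for v in levels.values()]
grand = np.concatenate(groups).mean()                               # 15.04

ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)  # 475.76
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)        # 161.20

F, p = stats.f_oneway(*groups)   # F = (475.76/4) / (161.20/20) = 14.76
print(ss_between, ss_within, F, p)
```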
We can estimate the treatment
parameters by subtracting the grand
mean from the treatment means. In this
example,
τ₁ = 9.80 – 15.04 = -5.24
τ₂ = 15.40 – 15.04 = +0.36
τ₃ = 17.60 – 15.04 = +2.56
τ₄ = 21.60 – 15.04 = +6.56
τ₅ = 10.80 – 15.04 = -4.24
Clearly, treatment 4 is the best because
it provides the greatest tensile strength.
Now you could have computed these values
from the raw data yourself instead of doing
the ANOVA. You would get the same results,
but you wouldn’t know if treatment 4 was
significantly better.
But if you did a scatter diagram of the
original data, you would see that treatment 4
was best, with no analysis whatsoever.
In fact, you should always look at the original
data to see if the results do make sense. A
scatter diagram of the raw data usually tells
as much as any analysis can.
[Scatter plot of the raw data: tensile strength (0–30) vs. cotton weight percent (10–40).]
How do you test the adequacy of the model?

The model makes certain assumptions that must hold for the ANOVA to be useful. Most importantly, that the errors are distributed normally and independently.
The error for each observation, sometimes called the residual, is

$$e_{ij} = Y_{ij} - \bar{Y}_j$$
A residual check is very important for testing for nonconstant variance. The residuals should be structureless, that is, they should have no pattern whatsoever; in this case, they show none.
[Scatter plot of residuals (−5 to 6) vs. fitted values (9–21).]
These residuals show no extreme
differences in variation because they all
have about the same spread.
They also do not show the presence of any outlier. An outlier is a residual value that is very much larger than any of the others. The presence of an outlier can seriously jeopardize the ANOVA, so if one is found, its cause should be carefully investigated.
A histogram of residuals shows that the
distribution is slightly skewed. Small
departures from symmetry are of less
concern than heavy tails.
[Histogram of residuals, ranging from −6 to 6.]
Another check is for normality. If we do a
normal probability plot of the residuals, we
can see whether normality holds.
[Normal probability plot: cumulative normal probability vs. ordered residuals (−4 to 6).]
A normal probability plot is made with
ascending ordered residuals on the
x-axis and their cumulative probability
points, 100(k-.5)/n, on the y-axis. k is
the order of the residual and n =
number of residuals. There is no
evidence of an outlier here.
The previous slide is not exactly a normal probability plot because the y-axis is not scaled properly. But it does give a pretty good suggestion of linearity.
A plot of residuals vs. run order is useful for detecting correlation between the residuals, a violation of the independence assumption. Runs of positive or of negative residuals indicate correlation. None is observed here.
[Plot of residuals (−5 to 6) vs. run order (0–30).]
One of the goals of the analysis is to choose among the level means. If the results of the ANOVA show that the factor is significant, we know that at least one of the means stands out from the rest. But which one or ones?
The procedures for making these mean
comparisons are called multiple
comparison methods. These methods
use linear combinations called contrasts.
A contrast is a particular linear combination of level means, such as Ȳ₄ − Ȳ₅ to test the difference between level 4 and level 5. Or if one wished to test the average of levels 1 and 3 vs. the average of levels 4 and 5, he would use (Ȳ₁ + Ȳ₃) − (Ȳ₄ + Ȳ₅).

In general,

$$C = \sum_{j=1}^{J} n_j c_j \bar{Y}_j \qquad \text{where} \quad \sum_{j=1}^{J} c_j = 0$$
An important case of contrasts is called orthogonal contrasts. Two contrasts with coefficients c_j and d_j are orthogonal if

$$\sum_{j=1}^{J} n_j c_j d_j = 0$$

or, in a balanced design, if

$$\sum_{j=1}^{J} c_j d_j = 0$$
There are many ways to choose the
orthogonal contrast coefficients for a
set of levels. For example, if level 1 is
a control and levels 2 and 3 are two real
treatments, a logical choice is to
compare the average of the two
treatments with the control:

$$+1\bar{Y}_2 + 1\bar{Y}_3 - 2\bar{Y}_1$$

and then the two treatments against one another:

$$+1\bar{Y}_2 - 1\bar{Y}_3 + 0\bar{Y}_1$$

These two contrasts are orthogonal because

$$\sum_{j=1}^{3} c_j d_j = (1)(1) + (1)(-1) + (-2)(0) = 0$$
Only J-1 orthogonal contrasts may be
chosen because the J levels have only
J-1 df. So for only three levels, the
contrasts chosen exhaust those
available for this experiment.
Contrasts must be chosen before seeing
the data so that experimenters aren’t
tempted to contrast the levels with the
greatest differences.
For the tensile strength experiment with 5
levels and thus 4 df, the 4 contrasts are:
C1= 0(5)(9.8)+0(5)(15.4)+0(5)(17.6)-1(5)(21.6)+1(5)(10.8) =-54
C2= +1(5)(9.8)+0(5)(15.4)+1(5)(17.6)-1(5)(21.6)-1(5)(10.8) =-25
C3= +1(5)(9.8)+0(5)(15.4)-1(5)(17.6)+0(5)(21.6)+0(5)(10.8) =-39
C4= -1(5)(9.8)+4(5)(15.4)-1(5)(17.6)-1(5)(21.6)-1(5)(10.8) = 9
These 4 contrasts completely partition SS_treatments. Then the SS for each contrast is formed:

$$SS_C = \frac{\left(\sum_{j=1}^{J} n_j c_j \bar{Y}_j\right)^2}{\sum_{j=1}^{J} n_j c_j^2}$$
So for the 4 contrasts we have:

$$SS_{C1} = \frac{(-54)^2}{5[0^2 + 0^2 + 0^2 + (-1)^2 + 1^2]} = 291.6$$

$$SS_{C2} = \frac{(-25)^2}{5[1^2 + 0^2 + 1^2 + (-1)^2 + (-1)^2]} = 31.25$$

$$SS_{C3} = \frac{(-39)^2}{5[1^2 + 0^2 + (-1)^2 + 0^2 + 0^2]} = 152.1$$

$$SS_{C4} = \frac{9^2}{5[(-1)^2 + 4^2 + (-1)^2 + (-1)^2 + (-1)^2]} = 0.81$$
Now the revised ANOVA table is

Source     SS       df   MS       p
Weight %   475.76    4   118.94   <0.001
  C1       291.60    1   291.60   <0.001
  C2        31.25    1    31.25    0.06
  C3       152.10    1   152.10   <0.001
  C4         0.81    1     0.81    0.76
Error      161.20   20     8.06
Total      636.96   24
So contrast 1 (level 5 – level 4) and
contrast 3 (level 1 – level 3) are
significant.
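As a check on the contrast arithmetic, here is a small numpy sketch that recomputes each contrast and its single-df SS from the level means given earlier.

```python
import numpy as np

n = 5                                            # replicates per level
ybar = np.array([9.8, 15.4, 17.6, 21.6, 10.8])   # level means

coeffs = {"C1": [0, 0, 0, -1, 1],
          "C2": [1, 0, 1, -1, -1],
          "C3": [1, 0, -1, 0, 0],
          "C4": [-1, 4, -1, -1, -1]}

for name, c in coeffs.items():
    c = np.array(c)
    C = (n * c * ybar).sum()               # contrast value
    ss = C ** 2 / (n * (c ** 2).sum())     # its single-df sum of squares
    print(name, C, ss)
# C1 -54 291.6, C2 -25 31.25, C3 -39 152.1, C4 9 0.81
```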
Although the orthogonal contrast approach is widely used, the experimenter may not know in advance which levels to test, or may be interested in more than J − 1 comparisons. A number of other methods are available for such testing. These methods include:

Scheffé's Method
Least Significant Difference Method
Duncan's Multiple Range Test
Newman-Keuls test
There is some disagreement about
which is the best method, but it is best
if all are applied only after there is
significance in the overall F test.
Now let’s look at the random effects
model.
Suppose there is a factor of interest with
an extremely large number of levels. If
the experimenter selects J of these
levels at random, we have a random
effects model or a components of
variance model.
The linear statistical model is

$$Y_{ij} = \bar{Y} + \tau_j + e_{ij}$$

as before, except that both τ_j and e_ij are random variables instead of just e_ij. Because τ_j and e_ij are independent, the variance of any observation is

$$Var(Y_{ij}) = Var(\tau) + Var(e_{ij}) = \sigma_\tau^2 + \sigma^2$$

These two variances are called variance components, hence the name of the model.

The requirements of this model are that the e_ij are NID(0, σ²), as before, that the τ_j are NID(0, σ_τ²), and that the e_ij and τ_j are independent. (Strictly, the normality assumption is not required for estimating the variance components in the random effects model.)
As before, SS_total = SS_treatments + SS_error, and E(MS_error) = σ². But now

$$E(MS_{treatments}) = \sigma^2 + n\sigma_\tau^2$$

So the estimate of σ_τ² is

$$\hat{\sigma}_\tau^2 = \frac{MS_{treatments} - MS_{error}}{n}$$
The computations and the ANOVA table
are the same as before, but the
conclusions are quite different.
Let’s look at an example.
A textile company uses a large number
of looms. The process engineer
suspects that the looms are of different
strength, and selects 4 looms at
random to investigate this.
The results of the experiment are shown in the table below.

Loom   Observations       Ȳ_j
1      98  97  99  96     97.5
2      91  90  93  92     91.5
3      96  95  97  95     95.75
4      95  96  99  98     97.0

Grand mean Ȳ = 95.44.
The ANOVA table is

Source   SS       df   MS      p
Looms     89.19    3   29.73   <0.001
Error     22.75   12    1.90
Total    111.94   15
In this case, the estimates of the variances are:

$$\hat{\sigma}^2 = 1.90$$

$$\hat{\sigma}_\tau^2 = \frac{29.73 - 1.90}{4} = 6.96$$

$$\hat{\sigma}_Y^2 = \hat{\sigma}^2 + \hat{\sigma}_\tau^2 = 1.90 + 6.96 = 8.86$$
Thus most of the variability in the
observations is due to variability in loom
strength. If you can isolate the causes of this
variability and eliminate them, you can reduce
the variability of the output and increase its
quality.
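The variance-component arithmetic is easy to script; a minimal sketch using the mean squares from the loom table:

```python
# Variance components from the loom ANOVA mean squares
ms_looms, ms_error, n = 29.73, 1.90, 4    # n = observations per loom

sigma2_error = ms_error                        # 1.90
sigma2_looms = (ms_looms - ms_error) / n       # 6.96
total = sigma2_error + sigma2_looms            # 8.86

print(sigma2_looms / total)   # about 0.79: most variability is loom-to-loom
```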
When we studied the differences
between two treatment means, we
considered repeated measures on the
same individual experimental unit.
With three or more treatments, we can
still do this. The result is a repeated
measures design.
Consider a repeated measures ANOVA partitioning of SS_total:

$$\sum_{i=1}^{n}\sum_{j=1}^{J}(Y_{ij}-\bar{Y})^2 = \sum_{i=1}^{n}\sum_{j=1}^{J}(\bar{Y}_i-\bar{Y})^2 + \sum_{i=1}^{n}\sum_{j=1}^{J}(Y_{ij}-\bar{Y}_i)^2$$

This is the same as

SS_total = SS_between subjects + SS_within subjects
The within-subjects SS may be further partitioned into SS_treatment + SS_error. The first term on the RHS below is the differences between treatment effects, and the second term is the random error:

$$\sum_{i=1}^{n}\sum_{j=1}^{J}(Y_{ij}-\bar{Y}_i)^2 = \sum_{i=1}^{n}\sum_{j=1}^{J}(\bar{Y}_j-\bar{Y})^2 + \sum_{i=1}^{n}\sum_{j=1}^{J}(Y_{ij}-\bar{Y}_i-\bar{Y}_j+\bar{Y})^2$$
Now the ANOVA table looks like this.

Source              SS                                df
Between subjects    ΣΣ (Ȳ_i − Ȳ)²                     n−1
Within subjects     ΣΣ (Y_ij − Ȳ_i)²                  n(J−1)
  Treatments        ΣΣ (Ȳ_j − Ȳ)²                     J−1
  Error             ΣΣ (Y_ij − Ȳ_i − Ȳ_j + Ȳ)²        (J−1)(n−1)
Total               ΣΣ (Y_ij − Ȳ)²                    Jn−1
The test for treatment effects is the usual

$$F = \frac{MS_{treatment}}{MS_{error}}$$

but now it is done entirely within subjects.
This design is really a randomized
complete block design with subjects
considered to be the blocks.
Now what is a randomized complete
blocks design?
Blocking is a way to eliminate the effect
of a nuisance factor on the
comparisons of interest. Blocking can
be used only if the nuisance factor is
known and controllable.
Let’s use an illustration. Suppose we
want to test the effect of four different
tips on the readings from a hardness
testing machine.
The tip is pressed into a metal test
coupon, and from the depth of the
depression, the hardness of the coupon
can be measured.
The only factor is tip type and it has four
levels. If 4 replications are desired for each
tip, a completely randomized design would
seem to be appropriate.
This would require assigning each of the
4x4 = 16 runs randomly to 16 different
coupons.
The only problem is that the coupons need to
be all of the same hardness, and if they are
not, then the differences in coupon hardness
will contribute to the variability observed.
Blocking is the way to deal with this problem.
In the block design, only 4 coupons are
used and each tip is tested on each of
the 4 coupons. So the blocking factor
is the coupon, with 4 levels.
In this setup, the block forms a
homogeneous unit on which to test the
tips.
This strategy improves the accuracy of
the tip comparison by eliminating
variability due to coupons.
Because all 4 tips are tested on each coupon,
the design is a complete block design. The
data from this design are shown below.
Tip type   Test coupon
           1     2     3     4
1          9.3   9.4   9.6   10.0
2          9.4   9.3   9.8    9.9
3          9.2   9.4   9.5    9.7
4          9.7   9.6   10.0  10.2
Now we analyze these data the same
way we did for the repeated measures
design. The model is
$$Y_{jk} = \bar{Y} + \tau_j + \beta_k + e_{jk}$$

where β_k is the effect of the kth block and the rest of the terms are those we already know. Since the block effects are deviations from the grand mean,

$$\sum_{k=1}^{K} \beta_k = 0, \qquad \text{just as} \qquad \sum_{j=1}^{J} \tau_j = 0$$
We can express the total SS as

$$\sum_{j=1}^{J}\sum_{k=1}^{K}(Y_{jk}-\bar{Y})^2 = \sum_{j=1}^{J}\sum_{k=1}^{K}\left[(\bar{Y}_j-\bar{Y}) + (\bar{Y}_k-\bar{Y}) + (Y_{jk}-\bar{Y}_j-\bar{Y}_k+\bar{Y})\right]^2$$

$$= \sum_{j=1}^{J}\sum_{k=1}^{K}(\bar{Y}_j-\bar{Y})^2 + \sum_{j=1}^{J}\sum_{k=1}^{K}(\bar{Y}_k-\bar{Y})^2 + \sum_{j=1}^{J}\sum_{k=1}^{K}(Y_{jk}-\bar{Y}_j-\bar{Y}_k+\bar{Y})^2$$

which is equivalent to

SS_total = SS_treatments + SS_blocks + SS_error

with df

N − 1 = (J − 1) + (K − 1) + (J − 1)(K − 1)
The test for equality of treatment means is

$$F = \frac{MS_{treatments}}{MS_{error}}$$

and the ANOVA table is

Source       SS              df           MS              p
Treatments   SS_treatments   J-1          MS_treatments   p
Blocks       SS_blocks       K-1          MS_blocks
Error        SS_error        (J-1)(K-1)   MS_error
Total        SS_total        N-1
For the hardness experiment, the ANOVA table is

Source     SS       df   MS      p
Tip type    38.50    3   12.83   0.0009
Coupons     82.50    3   27.50
Error        8.00    9    0.89
Total      129.00   15
As is obvious, this is the same analysis
as the repeated measures design.
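Here is a numpy sketch of the randomized complete block decomposition applied to the hardness data. Note that the slide's ANOVA table appears to be computed on coded data (the raw-data SS below come out exactly 100 times smaller), but the F ratio is unaffected by such coding.

```python
import numpy as np

# Hardness data: rows = tip types (treatments), columns = coupons (blocks)
y = np.array([[9.3, 9.4,  9.6, 10.0],
              [9.4, 9.3,  9.8,  9.9],
              [9.2, 9.4,  9.5,  9.7],
              [9.7, 9.6, 10.0, 10.2]])
J, K = y.shape
grand = y.mean()

ss_treat = K * ((y.mean(axis=1) - grand) ** 2).sum()   # 0.385 (table: 38.50)
ss_block = J * ((y.mean(axis=0) - grand) ** 2).sum()   # 0.825 (table: 82.50)
ss_total = ((y - grand) ** 2).sum()
ss_error = ss_total - ss_treat - ss_block              # 0.080 (table: 8.00)

F = (ss_treat / (J - 1)) / (ss_error / ((J - 1) * (K - 1)))
print(F)    # about 14.4, the same F as in the table above
```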
Now let’s consider the Latin Square design.
We’ll introduce it with an example.
The object of study is 5 different formulations
of a rocket propellant on the burning rate of
aircraft escape systems.
Each formulation comes from a batch of raw
material large enough for only 5 formulations.
Moreover, the formulations are prepared by 5
different operators, who differ in skill and
experience.
The way to test in this situation is with a 5x5
Latin Square, which allows for double blocking
and therefore the removal of two nuisance
factors. The Latin Square for this example is
Batches of      Operators
raw material    1   2   3   4   5
1               A   B   C   D   E
2               B   C   D   E   A
3               C   D   E   A   B
4               D   E   A   B   C
5               E   A   B   C   D
Note that each row and each column
has all 5 letters, and each letter occurs
exactly once in each row and column.
The statistical model for a Latin Square is

$$Y_{jkl} = \bar{Y} + \tau_j + \beta_k + \gamma_l + e_{jkl}$$

where Y_jkl is the jth treatment observation in the kth row and the lth column. Again we have

SS_total = SS_rows + SS_columns + SS_treatments + SS_error

with df

N − 1 = (R − 1) + (C − 1) + (J − 1) + (R − 2)(C − 1)
The ANOVA table for the propellant data is

Source             SS       df   MS      p
Formulations       330.00    4   82.50   0.0025
Material batches    68.00    4   17.00
Operators          150.00    4   37.50   0.04
Error              128.00   12   10.67
Total              676.00   24
So both the formulations and the
operators were significantly different.
The batches of raw material were not,
but it still is a good idea to block on
them because they often are different.
This design was not replicated, and
Latin Squares often are not, but it is
possible to put n replicates in each cell.
Now if you superimposed one Latin Square on another Latin Square of the same size, you would get a Graeco-Latin Square.
In one Latin Square, the treatments are
designated by roman letters. In the
other Latin Square, the treatments are
designated by Greek letters.
Hence the name Graeco-Latin Square.
A 5x5 Graeco-Latin Square is

Batches of      Operators
raw material    1    2    3    4    5
1               Aα   Bγ   Cε   Dβ   Eδ
2               Bβ   Cδ   Dα   Eγ   Aε
3               Cγ   Dε   Eβ   Aδ   Bα
4               Dδ   Eα   Aγ   Bε   Cβ
5               Eε   Aβ   Bδ   Cα   Dγ
Note that the five Greek treatments appear
exactly once in each row and column, just as
the Latin treatments did.
If Test Assemblies had been added as an additional factor to the original propellant experiment, the ANOVA table would be

Source             SS       df   MS      p
Formulations       330.00    4   82.50   0.0033
Material batches    68.00    4   17.00
Operators          150.00    4   37.50   0.0329
Test Assemblies     62.00    4   15.50
Error               66.00    8    8.25
Total              676.00   24
The test assemblies turned out to be
nonsignificant.
Note that the ANOVA tables for the Latin
Square and the Graeco-Latin Square
designs are identical, except for the
error term.
The SS(error) for the Latin Square design was decomposed into both Test Assemblies and error in the Graeco-Latin Square. This is a good example of how the error term is really a residual. Whatever isn't controlled falls into error.
Before we leave one-way designs, we
should look at the regression approach
to ANOVA. The model is
$$Y_{ij} = \mu + \tau_j + e_{ij}$$

Using the method of least squares, we rewrite this as

$$E = \sum_{j=1}^{J}\sum_{i=1}^{n_j} e_{ij}^2 = \sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij} - \mu - \tau_j)^2$$
Now to find the LS estimates of μ and τ_j, we set

$$\frac{\partial E}{\partial \mu} = 0 \qquad \frac{\partial E}{\partial \tau_j} = 0$$

When we do this differentiation and equate to 0, we obtain

$$-2\sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij} - \hat{\mu} - \hat{\tau}_j) = 0$$

$$-2\sum_{i=1}^{n_j}(Y_{ij} - \hat{\mu} - \hat{\tau}_j) = 0 \qquad \text{for all } j$$
After simplification, these reduce to

$$N\hat{\mu} + n\hat{\tau}_1 + n\hat{\tau}_2 + \cdots + n\hat{\tau}_J = Y_{..}$$
$$n\hat{\mu} + n\hat{\tau}_1 = Y_{.1}$$
$$n\hat{\mu} + n\hat{\tau}_2 = Y_{.2}$$
$$\vdots$$
$$n\hat{\mu} + n\hat{\tau}_J = Y_{.J}$$

In these equations,

$$Y_{..} = N\bar{Y} \qquad Y_{.j} = n\bar{Y}_j$$

These J + 1 equations are called the least squares normal equations. If we add the constraint

$$\sum_{j=1}^{J} \hat{\tau}_j = 0$$

we get a unique solution to these normal equations:

$$\hat{\mu} = \bar{Y} \qquad \hat{\tau}_j = \bar{Y}_j - \bar{Y}$$
It is important to see that ANOVA designs are simply regression models. If we have a one-way design with 3 levels, the regression model is

$$Y_{ij} = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + e_{ij}$$

where X_i1 = 1 if the observation is from level 1 and 0 otherwise, and X_i2 = 1 if from level 2 and 0 otherwise. Although the treatment levels may be qualitative, they are treated as "dummy" variables.

For observations from level 1, X_i1 = 1 and X_i2 = 0, so

$$Y_{i1} = \beta_0 + \beta_1(1) + \beta_2(0) + e_{ij} = \beta_0 + \beta_1 + e_{ij}$$

so

$$\beta_0 + \beta_1 = \bar{Y} + \tau_1$$
Similarly, if the observations are from level 2,

$$Y_{i2} = \beta_0 + \beta_1(0) + \beta_2(1) + e_{ij} = \beta_0 + \beta_2 + e_{ij}$$

so

$$\beta_0 + \beta_2 = \bar{Y} + \tau_2$$

Finally, consider observations from level 3, for which X_i1 = X_i2 = 0. Then the regression model becomes

$$Y_{i3} = \beta_0 + \beta_1(0) + \beta_2(0) + e_{ij} = \beta_0 + e_{ij}$$

so

$$\beta_0 = \bar{Y} + \tau_3$$
Thus in the regression model formulation of this one-way ANOVA with 3 levels, the regression coefficients describe comparisons of the first two level means with the third:

$$\beta_0 = \bar{Y}_3 \qquad \beta_1 = \bar{Y}_1 - \bar{Y}_3 \qquad \beta_2 = \bar{Y}_2 - \bar{Y}_3$$

Thus, testing β₁ = β₂ = 0 provides a test of the equality of the three means.
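A minimal numpy sketch of this dummy-variable formulation, with made-up data for a 3-level one-way design:

```python
import numpy as np

# Hypothetical one-way data: 2 observations in each of 3 levels
y = np.array([10., 12., 15., 17., 20., 22.])
level = np.array([1, 1, 2, 2, 3, 3])

# Dummy coding with level 3 as the baseline
X = np.column_stack([np.ones_like(y),
                     (level == 1).astype(float),
                     (level == 2).astype(float)])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
# beta[0] = Ybar_3, beta[1] = Ybar_1 - Ybar_3, beta[2] = Ybar_2 - Ybar_3
print(beta)   # [21. -10. -5.]
```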
In general, for J levels, the regression model will have J − 1 variables

$$Y_{ij} = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_{J-1} X_{i,J-1} + e_{ij}$$

and

$$\beta_0 = \bar{Y}_J \qquad \beta_j = \bar{Y}_j - \bar{Y}_J$$
Now what if you have two factors under
test? Or three? Or four? Or more?
Here the answer is the factorial design.
A factorial design crosses all factors.
Let’s take a two-way design. If there
are J levels of factor A and K levels of
factor B, then all JK treatment
combinations appear in the experiment.
Most commonly, J = K = 2.
In a two-way design with two levels of each factor, we have the following, where −1 and +1 are codes for the low and high levels, respectively.

Factor A           Factor B           Response
-1 (low level)     -1 (low level)     20
+1 (high level)    -1 (low level)     40
-1 (low level)     +1 (high level)    30
+1 (high level)    +1 (high level)    52
We can have as many replicates as we want in this
design. With n replicates, there are n observations in
each cell of the design.
SStotal = SSA + SSB + SSAB + SSerror
This decomposition should be familiar
by now except for SSAB. What is this
term? Its official name is interaction.
This is the magic of factorial designs.
We find out about not only the effect of
factor A and the effect of factor B, but
the effect of the two factors in
combination.
How do we compute main effects? The main effect of factor A is the difference between the average response at A high and the average response at A low:

$$\frac{40 + 52}{2} - \frac{20 + 30}{2} = 46 - 25 = 21$$

Similarly, the B effect is the difference between the average response at B high and the average response at B low:

$$\frac{30 + 52}{2} - \frac{20 + 40}{2} = 41 - 30 = 11$$
You can always find main effects from
the design matrix. Just multiply the
mean response by the +1 and -1 codes
and divide by the number of +1 codes
in the column.
For example,

$$A_{effect} = \frac{(-1)(20) + (1)(40) + (-1)(30) + (1)(52)}{2} = 21$$

$$B_{effect} = \frac{(-1)(20) + (-1)(40) + (1)(30) + (1)(52)}{2} = 11$$
So the main effect of factor A is 21 and
the main effect of factor B is 11.
That is, changing the level of factor A
from the low level to the high level
brings a response increase of 21 units.
And changing the level of factor B from
the low level to the high level increases
the response by 11 units.
The plots below show the main effects of factors A and B.

[Main effects plots: response (25–50) vs. factor A level, and response (25–50) vs. factor B level.]
Both A and B are significant, which you can
see by the fact that the slope is not 0.
A 0 slope in the effect line that connects the
response at the high level with the response
at the low level indicates that it doesn’t matter
to the response whether the factor is set at
its high value or its low value, so the effect of
such a factor is not significant.
Of course, the p value from the F test gives
the significance of the factors precisely, but it
is usually evident from the effects plots.
Now how do you compute the interaction effect? Interaction occurs when the difference in response between the levels of one factor is different at the different levels of the other factor:

$$AB_{effect} = \frac{(A_2B_2 - A_1B_2) - (A_2B_1 - A_1B_1)}{2}$$

The first term here is the difference between the two levels of factor A at the high level of factor B. That is, 52 − 30 = 22. And the difference between the two levels of factor A at the low level of factor B is 40 − 20 = 20. Then the interaction effect is (22 − 20)/2 = 1.
Of course, you can compute the interaction effect from the interaction column, just as we did with main effects. But how do you get the interaction column +1 and −1 codes? Simply multiply the codes for factor A by those for factor B.

Factor A   Factor B   AB    Response
-1         -1         +1    20
+1         -1         -1    40
-1         +1         -1    30
+1         +1         +1    52
Now you can compute the interaction effect by multiplying the response by the interaction codes and dividing by the number of +1 codes:

$$AB_{effect} = \frac{(1)(20) + (-1)(40) + (-1)(30) + (1)(52)}{2} = 1$$

And, of course, the interaction effect is again 1.
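All three effects can be computed in a few lines from the coded design matrix; a minimal sketch:

```python
import numpy as np

A = np.array([-1, 1, -1, 1])
B = np.array([-1, -1, 1, 1])
y = np.array([20, 40, 30, 52])        # cell means from the table above

for name, col in [("A", A), ("B", B), ("AB", A * B)]:
    effect = (col * y).sum() / 2      # divide by the number of +1 codes
    print(name, effect)               # A 21, B 11, AB 1
```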
Because the interaction effect =1, which is
very small, it is not significant. The
interaction plot below shows almost parallel
lines, which indicates no interaction.
[Interaction plot: response vs. level of factor A, with nearly parallel lines for B low and B high.]
Now suppose the two factors are quantitative, like temperature, pressure, time, etc. Then you could write a regression model version of the design:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_{12} X_1 X_2 + \varepsilon$$

As before, X₁ represents factor A and X₂ represents factor B. X₁X₂ is the interaction term, and ε is the error term. The parameter estimates for this model turn out to be ½ of the effect estimates.
The β estimates are:

$$\beta_0 = \bar{Y} = \frac{20 + 40 + 30 + 52}{4} = 35.5$$

$$\beta_1 = \tfrac{1}{2}(A_{effect}) = \frac{21}{2} = 10.5$$

$$\beta_2 = \tfrac{1}{2}(B_{effect}) = \frac{11}{2} = 5.5$$

$$\beta_{12} = \tfrac{1}{2}(AB_{effect}) = \frac{1}{2} = 0.5$$
So the model is

$$\hat{Y} = 35.5 + 10.5X_1 + 5.5X_2 + 0.5X_1X_2$$

With this equation, you can find all the effects of the design. For example, if you want to know the mean when both A and B are at the high (+1) level, the equation is

$$\hat{Y} = 35.5 + 10.5(+1) + 5.5(+1) + 0.5(+1)(+1) = 52$$

Now if you want the mean when A is at the high level and B is at the low level, the equation is

$$\hat{Y} = 35.5 + 10.5(+1) + 5.5(-1) + 0.5(+1)(-1) = 40$$

All you have to do is fill in the values of X₁ and X₂ with the appropriate codes, +1 or −1.
Now suppose the data in this experiment are:

Factor A   Factor B   AB    Response
-1         -1         +1    20
+1         -1         -1    50
-1         +1         -1    40
+1         +1         +1    12
Now let’s look at the main and interaction
effects.
The main effects are

$$A_{effect} = \frac{(-1)(20) + (1)(50) + (-1)(40) + (1)(12)}{2} = 1$$

$$B_{effect} = \frac{(-1)(20) + (-1)(50) + (1)(40) + (1)(12)}{2} = -9$$

The interaction effect is

$$AB_{effect} = \frac{(1)(20) + (-1)(50) + (-1)(40) + (1)(12)}{2} = -29$$

which is very high and is significant.
Now let’s look at the main effects of the
factors graphically.
[Main effects plots: response vs. factor A level (nearly flat) and response vs. factor B level (sloped).]
Clearly, factor A is not significant, which
you can see by the approximately 0
slope.
Factor B is probably significant
because the slope is not close to 0.
The p value from the F test gives the
actual significance.
Now let’s look at the interaction effect. This
is the effect of factors A and B in combination,
and is often the most important effect.
[Interaction plot: response vs. level of factor A; the B low and B high lines cross.]
Now these two lines are definitely not
parallel, so there is an interaction. It
probably is very significant because the
two lines cross.
Only the p value associated with the
F test can give the actual significance,
but you can see with the naked eye that
there is no question about significance
here.
Interaction of factors is the key to the
East, as we say in the West.
Suppose you wanted the factor levels
that give the lowest possible response.
If you picked by main effects, you
would pick A low and B high.
But look at the interaction plot and it will
tell you to pick A high and B high.
This is why, if the interaction term is
significant, you never, never, never
interpret the corresponding main effects.
They are meaningless in the presence
of interaction.
And it is because factorial designs
provide the ability to test for interactions
that they are so popular and so
successful.
You can get response surface plots for
these regression equations. If there is
no interaction, the response surface is
a plane in the 3rd dimension above the
X1,X2 Cartesian space. The plane may
be tilted, but it is still a plane.
If there is interaction, the response
surface is a twisted plane representing
the curvature in the model.
The simplest factorials are two-factor
experiments.
As an example, a battery must be designed
to be robust to extreme variations in
temperature. The engineer has three possible
choices for the plate material. He decides to
test all three plate materials at three
temperatures.
He tests four batteries at each combination
of material type and temperature. The
response variable is battery life.
Here are the data he got (four batteries per cell, life in hours):

Plate material   Temperature (˚F)
type             -15                70                 125
1                130  74 155 180     34  40  80  75     20  70  82 58
2                150 159 188 126    136 122 106 115     25  70  58 45
3                138 110 168 160    174 120 150 139     96 104  82 60
The model here is

$$Y_{ijk} = \bar{Y} + \tau_j + \beta_k + (\tau\beta)_{jk} + \varepsilon_{ijk}$$

Both factors are fixed, so we have the same constraints as before:

$$\sum_{j=1}^{J} \tau_j = 0 \qquad \sum_{k=1}^{K} \beta_k = 0$$

In addition,

$$\sum_{j=1}^{J} (\tau\beta)_{jk} = \sum_{k=1}^{K} (\tau\beta)_{jk} = 0$$
The experiment has n = 4 replicates, so there are nJK total observations. The means are

$$\bar{Y} = \frac{\sum_{i=1}^{n}\sum_{j=1}^{J}\sum_{k=1}^{K} Y_{ijk}}{nJK} \qquad \bar{Y}_j = \frac{\sum_{i=1}^{n}\sum_{k=1}^{K} Y_{ijk}}{nK}$$

$$\bar{Y}_k = \frac{\sum_{i=1}^{n}\sum_{j=1}^{J} Y_{ijk}}{nJ} \qquad \bar{Y}_{jk} = \frac{\sum_{i=1}^{n} Y_{ijk}}{n}$$
The total sum of squares can be partitioned into four components:

$$\sum_{i=1}^{n}\sum_{j=1}^{J}\sum_{k=1}^{K}(Y_{ijk}-\bar{Y})^2 = nK\sum_{j=1}^{J}(\bar{Y}_j-\bar{Y})^2 + nJ\sum_{k=1}^{K}(\bar{Y}_k-\bar{Y})^2 + n\sum_{j=1}^{J}\sum_{k=1}^{K}(\bar{Y}_{jk}-\bar{Y}_j-\bar{Y}_k+\bar{Y})^2 + \sum_{i=1}^{n}\sum_{j=1}^{J}\sum_{k=1}^{K}(Y_{ijk}-\bar{Y}_{jk})^2$$

That is, SS_total = SS_A + SS_B + SS_AB + SS_e.
The expectation of the MS due to each of these components is

$$E(MS_E) = \sigma^2$$

$$E(MS_{AB}) = \sigma^2 + \frac{n\sum_{j=1}^{J}\sum_{k=1}^{K}(\tau\beta)_{jk}^2}{(J-1)(K-1)}$$

$$E(MS_A) = \sigma^2 + \frac{Kn\sum_{j=1}^{J}\tau_j^2}{J-1}$$

$$E(MS_B) = \sigma^2 + \frac{Jn\sum_{k=1}^{K}\beta_k^2}{K-1}$$
So the appropriate F-ratio for testing each of these effects is

Test of A effect: F = MS_A / MS_E
Test of B effect: F = MS_B / MS_E
Test of AB interaction: F = MS_AB / MS_E

and the ANOVA table is

Source   SS         df           MS
A        SS_A       J-1          MS_A
B        SS_B       K-1          MS_B
AB       SS_AB      (J-1)(K-1)   MS_AB
Error    SS_e       JK(n-1)      MS_e
Total    SS_total   JKn-1
For the battery life experiment, the cell and marginal means are

Material   Temperature (˚F)
type       -15       70        125      Ȳ_j
1          134.75     57.25     57.50    83.17
2          155.75    119.75     49.50   108.33
3          144.00    145.75     85.50   125.08
Ȳ_k        144.83    107.58     64.17   Ȳ = 105.53
The ANOVA table is

Source        SS          df   MS          p
Material      10,683.72    2    5,341.86   0.0020
Temperature   39,118.72    2   19,558.36   0.0001
Interaction    9,613.78    4    2,403.44   0.0186
Error         18,230.75   27      675.21
Total         77,646.97   35
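A numpy sketch that reproduces this decomposition from the raw battery data (indexed as replicate, material, temperature):

```python
import numpy as np

# y[i, j, k]: replicate i, material type j, temperature k (-15, 70, 125)
y = np.array([[[130,  34, 20], [150, 136, 25], [138, 174,  96]],
              [[ 74,  40, 70], [159, 122, 70], [110, 120, 104]],
              [[155,  80, 82], [188, 106, 58], [168, 150,  82]],
              [[180,  75, 58], [126, 115, 45], [160, 139,  60]]], float)
n, J, K = y.shape
g = y.mean()

mat = y.mean(axis=(0, 2))    # material means:    83.17, 108.33, 125.08
tmp = y.mean(axis=(0, 1))    # temperature means: 144.83, 107.58, 64.17
cell = y.mean(axis=0)        # 3x3 cell means

ss_a = n * K * ((mat - g) ** 2).sum()                              # 10683.72
ss_b = n * J * ((tmp - g) ** 2).sum()                              # 39118.72
ss_ab = n * ((cell - mat[:, None] - tmp[None, :] + g) ** 2).sum()  # 9613.78
ss_e = ((y - cell) ** 2).sum()                                     # 18230.75
print(ss_a, ss_b, ss_ab, ss_e)
```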
Because the interaction is significant, the only
plot of interest is the interaction plot.
[Interaction plot: battery life in hours (30–170) vs. temperature (−15, 70, 125 ˚F) for material types 1, 2, and 3.]
Although it is not the best at the lowest
temperature, Type 3 is much better than
the other two at normal and high
temperatures. Its life at the lowest
temperature is just an average of 12
hours less than the life with Type 2.
Type 3 would probably provide the
design most robust to temperature
differences.
Suppose you have a factorial design
with more than two factors. Take, for
example, a three-way factorial design,
where the factors are A, B, and C.
All the theory is the same, except that
now you have three 2-way interactions,
AB, AC, BC, and one 3-way interaction,
ABC.
Consider the problem of soft-drink
bottling. The idea is to get each bottle
filled to a uniform height, but there is
variation around this height. Not every
bottle is filled to the same height.
The process engineer can control three
variables during the filling process:
percent carbonation (A), operating
pressure (B), and number of bottles
produced per minute or line speed (C).
The engineer chooses three levels of
carbonation (factor A), two levels of pressure
(factor B), and two levels for line speed
(factor C). This is a fixed effects design.
He also decides to run two replicates.
The response variable is the average
deviation from the target fill height in a
production run of bottles at each set of
conditions. Positive deviations are above the
target and negative deviations are below the
target.
The data are (two replicates per cell):

Percent           25 psi (B)            30 psi (B)
carbonation (A)   200 (C)   250 (C)     200 (C)   250 (C)
10                -3  -1    -1   0      -1   0     1   1
12                 0   1     2   1       2   3     6   5
14                 5   4     7   6       7   9    10  11
The 3-way means are

Percent           25 psi              30 psi
carbonation (A)   200      250        200      250
10                -2.0     -0.5       -0.5      1.0
12                 0.5      1.5        2.5      5.5
14                 4.5      6.5        8.0     10.5
The 2-way means are

A (carbonation) x B (pressure):
A     25 psi   30 psi
10    -1.25     0.25
12     1.00     4.00
14     5.50     9.25

A (carbonation) x C (line speed):
A     200      250
10    -1.25     0.25
12     1.50     3.50
14     6.25     8.50

B (pressure) x C (line speed):
B        200    250
25 psi   1.00   2.50
30 psi   3.33   5.67
The main effect means are

Factor A   Mean      Factor B   Mean    Factor C   Mean
10%        -0.500    25 psi     1.75    200        2.167
12%         2.500    30 psi     4.50    250        4.083
14%         7.375
The ANOVA table is

Source   SS        df   MS        p
A        252.750    2   126.375   <0.0001
B         45.375    1    45.375   <0.0001
C         22.042    1    22.042    0.0001
AB         5.250    2     2.625    0.0557
AC         0.583    2     0.292    0.6713
BC         1.042    1     1.042    0.2485
ABC        1.083    2     0.542    0.4867
Error      8.500   12     0.708
Total    336.625   23
So the only significant effects are those
for A, B, C, AB. The AB interaction is
barely significant, so interpretation must
be tempered by what we see in the A
and B main effects. The plots are
shown next.
[Main effects plots for factors A, B, and C, and the AB interaction plot (response vs. level of A for B = 25 psi and B = 30 psi).]
Our goal is to minimize the response.
Given the ANOVA table and these plots,
we would choose the low level of factor
A, 10% carbonation, and the low level
of factor B, 25 psi. This is true whether
we look at the two main effects plots or
the interaction plot. This is because the
interaction is barely significant.
We would also choose the slower line
speed, 200 bottles per minute.
Now suppose you do an experiment
where you suspect nonlinearity and
want to test for both linear and
quadratic effects.
Consider a tool life experiment, where
the life of a tool is thought to be a
function of cutting speed and tool angle.
Three levels of each factor are used.
So this is a 2-way factorial fixed effects
design.
The three levels of cutting speed are 125, 150, 175 in/min. The three levels of tool angle are 15˚, 20˚, 25˚. Two replicates are used, and the data are shown below.

Tool angle   Cutting speed (in/min)
(degrees)    125       150       175
15           -2, -1    -3, 0     2, 3
20            0,  2     1, 3     4, 6
25           -1,  0     5, 6     0, -1
The ANOVA table for this experiment is

Source       SS       df   MS      p
Tool Angle    24.33    2   12.17   0.0086
Cut Speed     25.33    2   12.67   0.0076
TC            61.34    4   15.34   0.0018
Error         13.00    9    1.44
Total        124.00   17
The table of cell and marginal means is

Factor T   Factor C
           125     150    175    Ȳ_j
15˚        -1.5    -1.5    2.5   -0.167
20˚         1.0     2.0    5.0    2.667
25˚        -0.5     5.5   -0.5    1.500
Ȳ_k        -0.33    2.0    2.33
[Main effects plots: response vs. level of tool angle (T) and vs. level of cutting speed (C), both showing curvature.]
Clearly there is reason to suspect quadratic effects here. So we can break down each factor's df into linear and quadratic components. We do this by using orthogonal contrasts. The contrast for linear is −1, 0, +1 and the contrast for quadratic is +1, −2, +1.
We need a table of factor totals to proceed. For factor T,

Factor T   Sum of obs
15         -1
20         16
25          9
Now applying the linear and quadratic contrasts to these sums,

Factor T   Sum of obs   Linear   Quadratic
15         -1           -1       +1
20         16            0       -2
25          9           +1       +1
Contrast                 10      -24
Now to find the SS due to these two new contrasts,

$$SS_{lin} = \frac{\left(\sum_{j=1}^{3} c_j T_j\right)^2}{nK\sum_{j=1}^{3} c_j^2} = \frac{10^2}{(2)(3)(2)} = 8.33$$

$$SS_{quad} = \frac{(-24)^2}{(2)(3)(6)} = 16$$

where T_j is the total for level j, and each total sums nK = (2)(3) = 6 observations.
Now we can do the same thing for factor C. The table of sums with the contrasts included is

Factor C   Sum of obs   Linear   Quadratic
125        -2           -1       +1
150        12            0       -2
175        14           +1       +1
Contrast                 16      -12
Now for the SS due to each contrast,

$$SS_{lin} = \frac{\left(\sum_{k=1}^{3} c_k C_k\right)^2}{nJ\sum_{k=1}^{3} c_k^2} = \frac{16^2}{(2)(3)(2)} = 21.33$$

$$SS_{quad} = \frac{(-12)^2}{(2)(3)(6)} = 4.0$$
Now we can write the new ANOVA table.

Source       SS       df   MS      p
Tool angle    24.33    2   12.17   0.0086
  Linear       8.33    1    8.33   0.0396
  Quad        16.00    1   16.00   0.0088
Cut Speed     25.33    2   12.67   0.0076
  Linear      21.33    1   21.33   0.0039
  Quad         4.00    1    4.00   0.1304
TC            61.34    4   15.34   0.0018
Error         13.00    9    1.44
Total        124.00   17
Now see how the df for each of the factors
has been split into its two components, linear
and quadratic. It turns out that everything
except the quadratic for Cutting Speed is
significant.
Now guess what! There are 4 df for the
interaction term and why not split them into
linear and quadratic components as well. It
turns out that you can get TlinClin, TlinCquad,
TquadClin, and TquadCquad.
These 4 components use up the 4 df for the
interaction term.
There is reason to believe the quadratic
component in the interaction, as shown
below, but we’ll pass on this for now.
[Interaction plots of tool angle and cutting speed: response vs. tool angle for each speed, and response vs. cutting speed for each angle.]
Now let’s talk about blocking in a factorial
design. The concept is identical to blocking
in a 1-way design. There is either a nuisance
factor or it is not possible to completely
randomize all the runs in the design.
For example, there simply may not be enough
time to run the entire experiment in one day,
so perhaps the experimenter could run one
complete replicate on one day and another
complete replicate on the second day, etc.
In this case, days would be a blocking factor.
Let’s look at an example. An engineer
is studying ways to improve detecting
targets on a radar scope. The two
factors of importance are background
clutter and the type of filter placed over
the screen.
Three levels of background clutter and
two filter types are selected to be tested.
This is a fixed effects 2 x 3 factorial
design.
To get the response, a signal is
introduced into the scope and its
intensity is increased until an operator
sees it. Intensity at detection is the
response variable.
Because of operator availability, an operator must sit at the scope until all necessary runs have been made. But operators differ in skill and ability to use the scope, so it makes sense to use operators as a blocking variable.
Four operators are selected for use in the experiment. So each operator receives the 2 x 3 = 6 treatment combinations in random order, and the design is a randomized complete block design. The data are:
Ground    Operator 1      Operator 2      Operator 3      Operator 4
clutter   Filter 1   2    Filter 1   2    Filter 1   2    Filter 1   2
Low        90       86     96       84    100       92     92       81
Medium    102       87    106       90    105       97     96       80
High      114       93    112       91    108       95     98       83
Ground
clutter
Since each operator (block) represents the
complete experiment, all effects are within
operators. The ANOVA table is
Source
Within blocks
Ground clutter
Filter type
GF interaction
Between blocks
Error
Total
SS
df
MS
1479.33 5 295.87
335.58
2 167.79
1066.67
1 1066.67
77.08
2
38.54
402.17
3 134.06
166.33 15
11.09
2047.83 23
p
<0.000001
<0.0003
<0.0001
0.0573
<0.0002
The effects of both the background
clutter and the filter type are highly
significant. Their interaction is
marginally significant.
As suspected, the operators are
significantly different in their ability to
detect the signal, so it is good that they
were used as blocks.
Now let's look at the 2^k factorial design. This notation means that there are k factors, each at 2 levels, usually a high and a low level. These factors may be qualitative or quantitative.
This is a very important class of designs
and is widely used in screening
experiments. Because there are only 2
levels, it is assumed that the response is
linear over the range of values chosen.
Let's look at an example of the simplest of these designs, the 2^2 factorial design.
Consider the effect of reactant
concentration (factor A) and amount of
catalyst (factor B) on the yield in a
chemical process.
The 2 levels of factor A are: 15% and
25%. The 2 levels of factor B are: 1
pound and 2 pounds. The experiment
is replicated three times.
Here are the data.

Factor A    Factor B     Rep 1   Rep 2   Rep 3   Ȳ_jk
-1 (low)    -1 (low)     28      25      27      26.67
+1 (high)   -1 (low)     36      32      32      33.33
-1 (low)    +1 (high)    18      19      23      20.00
+1 (high)   +1 (high)    31      30      29      30.00
This design can be pictured as a rectangle, with factor A on the horizontal axis and factor B on the vertical axis. The cell means at the corners are 26.67 at (A−, B−), 33.33 at (A+, B−), 20 at (A−, B+), and 30 at (A+, B+).
The interaction codes can also be derived from this table.

Factor A    Factor B     AB interaction
-1 (low)    -1 (low)     (-1)(-1) = +1
+1 (high)   -1 (low)     (+1)(-1) = -1
-1 (low)    +1 (high)    (-1)(+1) = -1
+1 (high)   +1 (high)    (+1)(+1) = +1
Multiplying the A and B factor level codes
gets the AB interaction codes. This is
always the way interaction codes are
obtained. Now averaging according to the
AB codes gives the interaction effect.
Now we can find the effects easily from the table below.

A    B    AB   Replicate average
-1   -1   +1   26.67
+1   -1   -1   33.33
-1   +1   -1   20
+1   +1   +1   30

$$A_{effect} = \frac{(-1)(26.67) + (1)(33.33) + (-1)(20) + (1)(30)}{2} = \frac{16.66}{2} = 8.33$$

$$B_{effect} = \frac{(-1)(26.67) + (-1)(33.33) + (1)(20) + (1)(30)}{2} = \frac{-10}{2} = -5$$

$$AB_{effect} = \frac{(1)(26.67) + (-1)(33.33) + (-1)(20) + (1)(30)}{2} = \frac{3.34}{2} = 1.67$$
Because there are only first-order
effects, the response surface is a plane.
Yield increases with increasing reactant
concentration (factor A) and decreases
with increasing catalyst amount (factor
B).
The ANOVA table is

Source   SS       df   MS       p
A        208.33    1   208.33   <0.0001
B         75.00    1    75.00    0.0024
AB         8.33    1     8.33    0.1826
Error     31.34    8     3.92
Total    323.00   11
It is clear that both main effects are
significant and that there is no AB
interaction.
The regression model is

$$\hat{Y} = 27.5 + \frac{8.33}{2}X_1 - \frac{5.00}{2}X_2$$

where the β coefficients are ½ the effects, as before, and 27.5 is the grand mean of all 12 observations.
Now let's look at the 2^3 factorial design. In this case, there are three factors, each at 2 levels. The design is

Run   A    B    C    AB   AC   BC   ABC
1     -1   -1   -1   +1   +1   +1   -1
2     +1   -1   -1   -1   -1   +1   +1
3     -1   +1   -1   -1   +1   -1   +1
4     +1   +1   -1   +1   -1   -1   -1
5     -1   -1   +1   +1   -1   -1   +1
6     +1   -1   +1   -1   +1   -1   -1
7     -1   +1   +1   -1   -1   +1   -1
8     +1   +1   +1   +1   +1   +1   +1

Each interaction column is the product of its factor columns; for example, the run-1 AB entry is (-1)(-1) = +1.
Remember the beverage filling study we
talked about earlier? Now assume that
each of the 3 factors has only two
levels.
So we have factor A (% carbonation) at
levels 10% and 12%.
Factor B (operating pressure) is at
levels 25 psi and 30 psi.
Factor C (line speed) is at levels 200
and 250.
Now our experimental matrix becomes

Run   A: % carb   B: pressure   C: speed   Rep 1   Rep 2   Mean
1     10          25            200        -3      -1      -2
2     12          25            200         0       1       0.5
3     10          30            200        -1       0      -0.5
4     12          30            200         2       3       2.5
5     10          25            250        -1       0      -0.5
6     12          25            250         2       1       1.5
7     10          30            250         1       1       1
8     12          30            250         6       5       5.5
And our design matrix is

Run   A    B    C    AB   AC   BC   ABC   Rep 1   Rep 2   Mean
1     -1   -1   -1   +1   +1   +1   -1    -3      -1      -2
2     +1   -1   -1   -1   -1   +1   +1     0       1       0.5
3     -1   +1   -1   -1   +1   -1   +1    -1       0      -0.5
4     +1   +1   -1   +1   -1   -1   -1     2       3       2.5
5     -1   -1   +1   +1   -1   -1   +1    -1       0      -0.5
6     +1   -1   +1   -1   +1   -1   -1     2       1       1.5
7     -1   +1   +1   -1   -1   +1   -1     1       1       1
8     +1   +1   +1   +1   +1   +1   +1     6       5       5.5
From this matrix, we can determine all our effects by applying the ±1 codes to the run means and dividing by 4, the number of +1 codes in each column. For example,

$$A_{effect} = \frac{(-1)(-2) + (1)(0.5) + (-1)(-0.5) + (1)(2.5) + (-1)(-0.5) + (1)(1.5) + (-1)(1) + (1)(5.5)}{4} = \frac{12}{4} = 3$$

Computed the same way, the remaining effects are

B = 9/4 = 2.25, C = 7/4 = 1.75, AB = 3/4 = 0.75, AC = 1/4 = 0.25, BC = 2/4 = 0.5, ABC = 2/4 = 0.5
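The same arithmetic in a few lines of numpy, building the interaction columns as products:

```python
import numpy as np

ybar = np.array([-2, 0.5, -0.5, 2.5, -0.5, 1.5, 1, 5.5])  # run means above
A = np.array([-1, 1, -1, 1, -1, 1, -1, 1])
B = np.array([-1, -1, 1, 1, -1, -1, 1, 1])
C = np.array([-1, -1, -1, -1, 1, 1, 1, 1])

columns = {"A": A, "B": B, "C": C, "AB": A * B, "AC": A * C,
           "BC": B * C, "ABC": A * B * C}
for name, col in columns.items():
    print(name, (col * ybar).sum() / 4)   # divide by number of +1 codes
# A 3.0, B 2.25, C 1.75, AB 0.75, AC 0.25, BC 0.5, ABC 0.5
```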
The ANOVA table is

Source             SS      df   MS      p
A: Percent carb    36.00    1   36.00   <0.0001
B: Op Pressure     20.25    1   20.25    0.0005
C: Line speed      12.25    1   12.25    0.0022
AB                  2.25    1    2.25    0.0943
AC                  0.25    1    0.25    0.5447
BC                  1.00    1    1.00    0.2415
ABC                 1.00    1    1.00    0.2415
Error               5.00    8    0.625
Total              78.00   15
There are only 3 significant effects, factors A, B, and
C. None of the interactions is significant.
The regression model for soft-drink fill height deviation is

$$\hat{Y} = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 = 1.00 + \frac{3}{2}X_1 + \frac{2.25}{2}X_2 + \frac{1.75}{2}X_3$$
Because the interactions are not
significant, they are not included in the
regression model. So the response
surface here is a plane at each level of
line speed.
All along we have had at least 2 replicates for each design so we can get an error term. Without the error term, how do we create the F-ratio to test for significance?

But think about it. A 2^4 design has 16 runs. With 2 replicates, that doubles to 32 runs. The resources needed for so many runs are often not available, so some large designs are run with only 1 replicate. Now what do we do for an error term to test for effects?
The idea is to pool some high-level
interactions under the assumption that
they are not significant anyway and use
them as an error term. If indeed they
are not significant, this is OK. But what
if you pool them as error and they are
significant? This is not OK.
So it would be nice to know before we pool,
which terms are actually poolable. Thanks to
Cuthbert Daniel, we can do this. Daniel’s idea
is to do a normal probability plot of the
effects.
All negligible effects will fall along a line and
those that do not fall along the line are
significant. So we may pool all effects that
are on the line. The reasoning is that the
negligible effects, like error, are normally
distributed with mean 0 and variance σ2 and
so will fall along the line.
Let’s look at an example of a chemical
product. The purpose of this
experiment is to maximize the filtration
rate of this product, and it is thought to
be influenced by 4 factors: temperature
(A), pressure (B), concentration of
formaldehyde (C), and stirring rate (D).
The design matrix and response are:

Run   A   B   C   D   AB  AC  BC  AD  BD  CD  ABC  ABD  ACD  BCD  ABCD   Filt rate
1     -   -   -   -   +   +   +   +   +   +   -    -    -    -    +       45
2     +   -   -   -   -   -   +   -   +   +   +    +    +    -    -       71
3     -   +   -   -   -   +   -   +   -   +   +    +    -    +    -       48
4     +   +   -   -   +   -   -   -   -   +   -    -    +    +    +       65
5     -   -   +   -   +   -   -   +   +   -   +    -    +    +    -       68
6     +   -   +   -   -   +   -   -   +   -   -    +    -    +    +       60
7     -   +   +   -   -   -   +   +   -   -   -    +    +    -    +       80
8     +   +   +   -   +   +   +   -   -   -   +    -    -    -    -       65
9     -   -   -   +   +   +   +   -   -   -   -    +    +    +    -       43
10    +   -   -   +   -   -   +   +   -   -   +    -    -    +    +      100
11    -   +   -   +   -   +   -   -   +   -   +    -    +    -    +       45
12    +   +   -   +   +   -   -   +   +   -   -    +    -    -    -      104
13    -   -   +   +   +   -   -   -   -   +   +    +    -    -    +       75
14    +   -   +   +   -   +   -   +   -   +   -    -    +    -    -       86
15    -   +   +   +   -   -   +   -   +   +   -    -    -    +    -       70
16    +   +   +   +   +   +   +   +   +   +   +    +    +    +    +       96
From this matrix, we can estimate all the effects and then do a normal probability plot of them. The effects are:

A = 21.625     AB = 0.125      ABC = 1.875
B = 3.125      AC = -18.125    ABD = 4.125
C = 9.875      AD = 16.625     ACD = -1.625
D = 14.625     BC = 2.375      BCD = -2.625
               BD = -0.375     ABCD = 1.375
               CD = -1.125
The best stab at a normal probability plot is shown below.

[Normal probability plot of the 15 effect estimates: cumulative probability vs. effect size (−30 to 30).]
There are only 5 effects that are off the
line. These are, in the upper right
corner: C, D, AD, A, and in the lower
left corner, AC. All of the points on the
line are negligible, behaving like
residuals.
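Daniel's plot is easy to produce yourself; a sketch (assuming matplotlib and scipy are available) that plots the ordered effects against normal quantiles:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

effects = {"A": 21.625, "B": 3.125, "C": 9.875, "D": 14.625,
           "AB": 0.125, "AC": -18.125, "AD": 16.625, "BC": 2.375,
           "BD": -0.375, "CD": -1.125, "ABC": 1.875, "ABD": 4.125,
           "ACD": -1.625, "BCD": -2.625, "ABCD": 1.375}

vals = np.sort(np.array(list(effects.values())))
k = np.arange(1, len(vals) + 1)
z = stats.norm.ppf((k - 0.5) / len(vals))   # the 100(k - .5)/n points

plt.scatter(vals, z)
plt.xlabel("effect estimate")
plt.ylabel("normal quantile")
plt.show()   # A, C, D, AD, AC fall off the line; the rest behave like noise
```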
Because we drop factor B and all its interactions, we now get an ANOVA table with the extra observations as error.

Source   SS        df   MS        p
A        1870.56    1   1870.56   <0.0001
C         390.06    1    390.06   <0.0001
D         855.56    1    855.56   <0.0001
AC       1314.06    1   1314.06   <0.0001
AD       1105.56    1   1105.56   <0.0001
CD          5.06    1      5.06
ACD        10.56    1     10.56
Error     179.52    8     22.44
Total    5730.94   15
Essentially, we have changed the design from a 2^4 design with only 1 replicate to a 2^3 design with two replicates.

This is called projecting a higher-level design into a lower-level design. If you start with an unreplicated 2^k design, then drop h of the factors, you can continue with a 2^(k-h) design with 2^h replicates. In this case, we started with a 2^4 design, dropped h = 1 factor, and ended up with a 2^(4-1) design with 2^1 = 2 replicates.
The main effects plots are

[Main effects plots: response vs. levels of A, C, and D.]
The two significant interaction plots are

[Interaction plots: response vs. level of A for C low/high (AC), and for D low/high (AD).]
Now we are going to talk about the addition of center points to a 2^k design. In this case, we are looking for quadratic curvature, so we must have quantitative factors.

The center points are run at 0 for each of the k factors in the design. So now the codes are -1, 0, +1. We have n replicates at the center points.
Now let's go back to the box we used earlier to describe a 2^2 design. At each corner, we have a point of the design: (A-, B-), (A-, B+), (A+, B-), and (A+, B+).
Now we can add center points to this design to see if there is quadratic curvature. In addition to the 4 corner points we had earlier, we have n observations at the center point (A = 0, B = 0).
Now if we average the 4 factorial points to get Ȳ_factorial, and then average the n center points to get Ȳ_center, we can tell if there is a quadratic effect by the size of Ȳ_factorial − Ȳ_center. If this difference is small, then the center points lie on the plane established by the factorial points. If this difference is large, then there is quadratic curvature present.
A single-df SS for pure quadratic curvature is given by

$$SS_{quad} = \frac{n_F n_C (\bar{Y}_{factorial} - \bar{Y}_{center})^2}{n_F + n_C}$$

where n_F is the number of design points in the 2^k design and n_C is the number of replicates at the center point. This SS_quad = MS_quad can be tested for significance against MS_error.
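A tiny sketch of this curvature check, using the numbers from the yield example that follows:

```python
# Pure-quadratic curvature SS for a 2^2 design with center points
nF, nC = 4, 5
ybar_factorial = (39.3 + 40.9 + 40.0 + 41.5) / 4        # 40.425
ybar_center = (40.3 + 40.5 + 40.7 + 40.2 + 40.6) / 5    # 40.46

ss_quad = nF * nC * (ybar_factorial - ybar_center) ** 2 / (nF + nC)
print(ss_quad)   # about 0.0027 -- negligible, so no curvature
```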
Let’s look at an example.
A chemical engineer is studying the
yield of a process. The two factors of
interest are reaction time (A) and
reaction temperature (B).
The engineer decides to run a 2^2 unreplicated factorial and include 5 center points to test for quadratic curvature.
The design then has reaction time at 30, 35, 40 minutes and reaction temperature at 150˚C, 155˚C, 160˚C. The design points and the yield data are:

corner (A-, B-): 39.3      corner (A+, B-): 40.9
corner (A-, B+): 40.0      corner (A+, B+): 41.5
center (0, 0), five replicates: 40.3, 40.5, 40.7, 40.2, 40.6
The ANOVA table for this experiment is

Source      SS       df   MS       p
A (time)    2.4025    1   2.4025   0.0017
B (temp)    0.4225    1   0.4225   0.0350
AB          0.0025    1   0.0025   0.8185
Pure quad   0.0027    1   0.0027   0.8185
Error       0.1720    4   0.0430
Total       3.0022    8
In this design, Ȳ_factorial = 40.425 and Ȳ_center = 40.46. Since the difference is very small, there is no quadratic effect, so the center points may be used to get an error term to test each of the effects.
MS_error = SS_error / df_error = Σ_{c=1}^{5} (Y_c − Ȳ_center)² / (n_c − 1) = Σ_{c=1}^{5} (Y_c − 40.46)² / (5 − 1) = 0.0430
So now this unreplicated design has an error term
from the replicated center points that lie on the
same plane as the factorial points.
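A small sketch of this curvature check in Python, using the factorial and center-point yields from this example:

```python
import numpy as np

# Yields at the four factorial points and the five center points above
y_factorial = np.array([39.3, 40.9, 40.0, 41.5])
y_center = np.array([40.3, 40.5, 40.7, 40.2, 40.6])

nF, nC = len(y_factorial), len(y_center)
diff = y_factorial.mean() - y_center.mean()   # Ybar_factorial - Ybar_center

# Single-df SS for pure quadratic curvature
ss_quad = nF * nC * diff**2 / (nF + nC)

# Pure-error MS from the replicated center points
ms_error = y_center.var(ddof=1)               # SS_center / (nC - 1)

print(round(ss_quad, 4), round(ms_error, 4))  # 0.0027, 0.043
```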
In this experiment, a first-order model is
appropriate because there is no
quadratic effect and no interaction. But
suppose we have a situation where
quadratic terms will be required and we
have the following second-order model
Y   0  1 X 1   2 X 2  12 X 1 X 2  11 X 12   22 X 22  
But this gives 6 values of β to estimate
and the 2^2 design with center points has
only 5 independent runs. So we cannot
estimate the 6 parameters unless we
change the design.
So we augment the design with 4 axial
runs and create a central composite
design to fit the second-order model.
The central composite design for a 2^2 factorial looks like this in our box format:

[CCD diagram: factorial points (-1,-1), (-1,+1), (+1,-1), (+1,+1); center point (0,0); axial points (α,0), (-α,0), (0,α), (0,-α) in the X1–X2 plane]
We’ll talk about central composite
designs later, when we cover response
surface methodology.
Now we’ll move on to fractional 2^k
factorials. A fractional factorial is a ½
fraction, a ¼ fraction, or a 1/8 fraction
of a complete factorial. Fractional
factorials are used when a large number
of factors need to be tested and
higher-order interactions are
considered unlikely.
Fractional factorials are widely used as
screening experiments, where we try to
identify which factors have a major
effect and which factors are not
relevant.
They are often used in the early stages
of a project when the major features of
the project are little understood.
They are often followed by sequential
studies to explore the project further.
A ½ fraction is obtained as follows.
Suppose you have three factors of
interest and need a 2^3 design (8 runs),
but for whatever reason, you cannot
make 8 runs. You can however make 4
runs.
So instead of a 2^3 design, you use a
2^(3-1) design, or 4 runs. This 2^(3-1) design
is called a ½ fraction of the 2^3 design.
To create the 2^(3-1) design, set up a 2^2 design and put the third factor in the AB interaction column.

Run   Factor A   Factor B   Factor C = AB
1     -1         -1         +1
2     +1         -1         -1
3     -1         +1         -1
4     +1         +1         +1
Now factor C is confounded with AB. You
cannot separate the effect of the AB
interaction from the effect of the C factor. In
other words, C is aliased with AB.
What may be the consequences of this
confounding?
1. The AB effect and the C effect may both be large and significant but in opposite directions, so they cancel each other out. You would never know.
2. The AB effect and the C effect may both
be small but in the same direction, so the
effect looks significant, but neither AB nor C
separately is significant. Again you wouldn’t
know.
3. One effect may be significant and the
other may not, but you cannot tell which one
is significant.
But this isn’t all. Now where are the AC and the BC
interactions? Well, multiplying the codes we get
Run   Factor A + BC   Factor B + AC   Factor C + AB
1     -1              -1              +1
2     +1              -1              -1
3     -1              +1              -1
4     +1              +1              +1
So a fractional factorial doesn’t just confound the AB
effect with the C effect, it also confounds all main
effects with 2-way interactions.
When effects are confounded, they are called aliased.
Now since A is aliased with BC, the first column
actually estimates A+BC. Similarly, the second
column estimates B+AC because B and AC are aliases
of one another. The third column estimates the sum
of the two aliases C and AB.
Now there are some better fractional designs,
but you have to look at the generator to see
them.
C=AB is called the generator of the design.
Since C = AB, multiply both sides by C to get
I= ABC, so I= ABC is the defining relation for
the design. The defining relation is the set of
columns equal to I.
ABC is also called a word. The length of the
defining relation word tells you the resolution
of the design. The defining relation ABC is a
3-letter word so this design is of resolution
III.
What does design resolution mean? Design
resolution tells you the degree of
confounding in the design. There are three
levels of resolution.
Resolution III: Main effects are aliased with
2-factor interactions and 2-factor
interactions may be aliased with one another.
Resolution IV: Main effects are not aliased
with 2-factor interactions, but 2-factor
interactions are aliased with one another.
Resolution V: Main effects are not
aliased with 2-factor interactions, and
2-factor interactions are not aliased
with one another. But main effects and
2-factor interactions may be aliased with
higher-order interactions.
Of course, we would like to have the
highest resolution design possible
under the circumstances.
You can also use the defining relation to
get the aliasing. In this example, where
I = ABC, we can get the aliases by
multiplying any column by the defining
relation.
Alias of A: A*ABC = IBC=BC so A is
aliased with BC
Alias of B: B*ABC = AIC= AC so B is
aliased with AC.
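This "multiplication" of effect words is just cancellation of repeated letters, since any column times itself is the identity column I. A minimal sketch of the rule (the helper name is illustrative; it assumes single-letter factor names):

```python
def multiply(word1: str, word2: str) -> str:
    # Symmetric difference cancels squared letters (A*A = I, etc.)
    letters = set(word1) ^ set(word2)
    return "".join(sorted(letters)) or "I"

defining = "ABC"  # defining relation I = ABC for the 2^(3-1) design
for effect in ["A", "B", "C"]:
    print(effect, "is aliased with", multiply(effect, defining))
# A is aliased with BC, B with AC, C with AB
```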
Let’s look at a 2^(4-1) factorial, a resolution IV design. First we create a 2^3 design.

Run   A    B    C    AB   AC   BC   D=ABC
1     -1   -1   -1   +1   +1   +1   -1
2     +1   -1   -1   -1   -1   +1   +1
3     -1   +1   -1   -1   +1   -1   +1
4     +1   +1   -1   +1   -1   -1   -1
5     -1   -1   +1   +1   -1   -1   +1
6     +1   -1   +1   -1   +1   -1   -1
7     -1   +1   +1   -1   -1   +1   -1
8     +1   +1   +1   +1   +1   +1   +1
Then we alias the 4th factor with the highest-order interaction.
The generator for this design is D=ABC.
To get the defining relation, multiply
both sides by D to get I = ABCD.
Since the defining relation word here is
length 4, this is a resolution IV design.
Now let’s look at the aliases for this
design.
Alias for A: A*ABCD = BCD
Alias for B: B*ABCD = ACD
Alias for C: C*ABCD = ABD
Alias for D: D*ABCD = ABC
Alias for AB: AB*ABCD = CD
Alias for AC: AC*ABCD = BD
Alias for BC: BC*ABCD = AD
After all the aliasing, the design is
Run   A+BCD   B+ACD   C+ABD   AB+CD   AC+BD   BC+AD   D+ABC
1     -1      -1      -1      +1      +1      +1      -1
2     +1      -1      -1      -1      -1      +1      +1
3     -1      +1      -1      -1      +1      -1      +1
4     +1      +1      -1      +1      -1      -1      -1
5     -1      -1      +1      +1      -1      -1      +1
6     +1      -1      +1      -1      +1      -1      -1
7     -1      +1      +1      -1      -1      +1      -1
8     +1      +1      +1      +1      +1      +1      +1
Note that the main effects are aliased with
3-factor interactions and the 2-factor
interactions are aliased with one another, so
this is a resolution IV design.
Now let’s look at a 2^(5-1) factorial, a resolution V design. First we create the 2^4 design, and then place the 5th factor in the highest-order interaction column.

Run   A    B    C    D    AB   AC   AD   BC   BD   CD   ABC  ABD  ACD  BCD  E=ABCD
1     -1   -1   -1   -1   +1   +1   +1   +1   +1   +1   -1   -1   -1   -1   +1
2     +1   -1   -1   -1   -1   -1   -1   +1   +1   +1   +1   +1   +1   -1   -1
3     -1   +1   -1   -1   -1   +1   +1   -1   -1   +1   +1   +1   -1   +1   -1
4     +1   +1   -1   -1   +1   -1   -1   -1   -1   +1   -1   -1   +1   +1   +1
5     -1   -1   +1   -1   +1   -1   +1   -1   +1   -1   +1   -1   +1   +1   -1
6     +1   -1   +1   -1   -1   +1   -1   -1   +1   -1   -1   +1   -1   +1   +1
7     -1   +1   +1   -1   -1   -1   +1   +1   -1   -1   -1   +1   +1   -1   +1
8     +1   +1   +1   -1   +1   +1   -1   +1   -1   -1   +1   -1   -1   -1   -1
9     -1   -1   -1   +1   +1   +1   -1   +1   -1   -1   -1   +1   +1   +1   -1
10    +1   -1   -1   +1   -1   -1   +1   +1   -1   -1   +1   -1   -1   +1   +1
11    -1   +1   -1   +1   -1   +1   -1   -1   +1   -1   +1   -1   +1   -1   +1
12    +1   +1   -1   +1   +1   -1   +1   -1   +1   -1   -1   +1   -1   -1   -1
13    -1   -1   +1   +1   +1   -1   -1   -1   -1   +1   +1   +1   -1   -1   +1
14    +1   -1   +1   +1   -1   +1   +1   -1   -1   +1   -1   -1   +1   -1   -1
15    -1   +1   +1   +1   -1   -1   -1   +1   +1   +1   -1   -1   -1   +1   -1
16    +1   +1   +1   +1   +1   +1   +1   +1   +1   +1   +1   +1   +1   +1   +1
This is a resolution V design because
the generator is E=ABCD, which makes
the defining relation I = ABCDE. Now
let’s check the aliases.
Alias of A: A*ABCDE = BCDE
Alias of B: B*ABCDE = ACDE
Alias of C: C*ABCDE = ABDE
Alias of D: D*ABCDE = ABCE
Alias of E: E*ABCDE = ABCD

Alias of AB: AB*ABCDE = CDE
Alias of AC: AC*ABCDE = BDE
Alias of AD: AD*ABCDE = BCE
Alias of BC: BC*ABCDE = ADE
Alias of BD: BD*ABCDE = ACE
Alias of CD: CD*ABCDE = ABE
Alias of AE: AE*ABCDE = BCD
Alias of BE: BE*ABCDE = ACD
Alias of CE: CE*ABCDE = ABD
Alias of DE: DE*ABCDE = ABC
Now the design is
Run   A+     B+     C+     D+     AB+   AC+   AD+   BC+   BD+   CD+   ABC   ABD   ACD   BCD   E+
      BCDE   ACDE   ABDE   ABCE   CDE   BDE   BCE   ADE   ACE   ABE   +DE   +CE   +BE   +AE   ABCD
1     -1     -1     -1     -1     +1    +1    +1    +1    +1    +1    -1    -1    -1    -1    +1
2     +1     -1     -1     -1     -1    -1    -1    +1    +1    +1    +1    +1    +1    -1    -1
3     -1     +1     -1     -1     -1    +1    +1    -1    -1    +1    +1    +1    -1    +1    -1
4     +1     +1     -1     -1     +1    -1    -1    -1    -1    +1    -1    -1    +1    +1    +1
5     -1     -1     +1     -1     +1    -1    +1    -1    +1    -1    +1    -1    +1    +1    -1
6     +1     -1     +1     -1     -1    +1    -1    -1    +1    -1    -1    +1    -1    +1    +1
7     -1     +1     +1     -1     -1    -1    +1    +1    -1    -1    -1    +1    +1    -1    +1
8     +1     +1     +1     -1     +1    +1    -1    +1    -1    -1    +1    -1    -1    -1    -1
9     -1     -1     -1     +1     +1    +1    -1    +1    -1    -1    -1    +1    +1    +1    -1
10    +1     -1     -1     +1     -1    -1    +1    +1    -1    -1    +1    -1    -1    +1    +1
11    -1     +1     -1     +1     -1    +1    -1    -1    +1    -1    +1    -1    +1    -1    +1
12    +1     +1     -1     +1     +1    -1    +1    -1    +1    -1    -1    +1    -1    -1    -1
13    -1     -1     +1     +1     +1    -1    -1    -1    -1    +1    +1    +1    -1    -1    +1
14    +1     -1     +1     +1     -1    +1    +1    -1    -1    +1    -1    -1    +1    -1    -1
15    -1     +1     +1     +1     -1    -1    -1    +1    +1    +1    -1    -1    -1    +1    -1
16    +1     +1     +1     +1     +1    +1    +1    +1    +1    +1    +1    +1    +1    +1    +1
So the main effects are aliased with
4-factor interactions and the 2-factor
interactions are aliased with the
3-factor interactions.
This all seems better than a resolution
III design, on the surface, but
remember that these effects are still
confounded and all the consequences
of confounding are still there.
In all of the cases we have been talking
about, we have had ½ fractions. If
necessary to clear up any ambiguities,
we can always run the other ½ fraction.
If the original ½ fraction has defining
relation I=ABCD, the complementary ½
fraction has defining relation I = -ABCD.
Remember the filtration rate experiment
we talked about earlier which was an
unreplicated 2^4 factorial design. We
used the normal probability plot of
effects to determine that temperature
(A), concentration of formaldehyde (C),
and stirring rate (D) were significant,
along with AC and AD interactions.
Now what would have happened if we
had run a ½ fraction instead of the full
factorial?
Since the original design was a 2^4, we use instead a 2^(4-1) ½ fraction. The design now is

Run   A+BCD   B+ACD   C+ABD   AB+CD   AC+BD   BC+AD   D+ABC   Filt rate
1     -1      -1      -1      +1      +1      +1      -1      45
2     +1      -1      -1      -1      -1      +1      +1      100
3     -1      +1      -1      -1      +1      -1      +1      45
4     +1      +1      -1      +1      -1      -1      -1      65
5     -1      -1      +1      +1      -1      -1      +1      75
6     +1      -1      +1      -1      +1      -1      -1      60
7     -1      +1      +1      -1      -1      +1      -1      80
8     +1      +1      +1      +1      +1      +1      +1      96
and the generator is D = ABC.
The effect of the first column is

A+BCD = [(-1)45 + (+1)100 + (-1)45 + (+1)65 + (-1)75 + (+1)60 + (-1)80 + (+1)96] / 4 = 76/4 = 19

(the column contrast divided by half the number of runs).
The other effects are found in the same way. They are:

B+ACD = 1.5
C+ABD = 14.0
D+ABC = 16.5
AB+CD = -1.0
AC+BD = -18.5
AD+BC = 19.0
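As a check, these column effects can be computed directly from the sign columns and the responses. A minimal sketch, using the run order above:

```python
import numpy as np

# Sign columns for the 2^(4-1) half fraction, with generator D = ABC
A = np.array([-1, +1, -1, +1, -1, +1, -1, +1])
B = np.array([-1, -1, +1, +1, -1, -1, +1, +1])
C = np.array([-1, -1, -1, -1, +1, +1, +1, +1])
D = A * B * C
y = np.array([45, 100, 45, 65, 75, 60, 80, 96])  # filtration rates

# Each effect = (signed sum of responses) / (half the number of runs)
for name, col in [("A+BCD", A), ("B+ACD", B), ("C+ABD", C), ("D+ABC", D),
                  ("AB+CD", A * B), ("AC+BD", A * C), ("AD+BC", A * D)]:
    print(name, (col * y).sum() / 4)
# 19.0, 1.5, 14.0, 16.5, -1.0, -18.5, 19.0
```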
But these effects are all confounded.
The engineer suspects that because the
B column effect is small and the A, C,
and D column effects are large that the
A, C, and D effects are significant.
He also thinks that the significant
interactions are AC and AD, not BD or
BC because the B effect is so small.
He may suspect, but is he right?
Let’s do the complementary ½ fraction and
find out. For the complementary ½ fraction,
the generator is D = -ABC. So the effect
confounding is
Run   A-BCD   B-ACD   C-ABD   AB-CD   AC-BD   BC-AD   D-ABC   Filt rate
1     -1      -1      -1      +1      +1      +1      +1      43
2     +1      -1      -1      -1      -1      +1      -1      71
3     -1      +1      -1      -1      +1      -1      -1      48
4     +1      +1      -1      +1      -1      -1      +1      104
5     -1      -1      +1      +1      -1      -1      -1      68
6     +1      -1      +1      -1      +1      -1      +1      86
7     -1      +1      +1      -1      -1      +1      +1      70
8     +1      +1      +1      +1      +1      +1      -1      65
Now we can find the effects from this
design the same way as from the
original one.
A-BCD = 24.25
B-ACD = 4.75
C-ABD = 5.75
D-ABC = 12.75
AB-CD = 1.25
AC-BD = -17.75
AD-BC = 14.25
Now we can resolve the ambiguities by
combining the effects from the original ½
fraction and those from the complementary ½
fraction.
Since the original ½ fraction estimates
A+BCD and the complementary ½ fraction
estimates A-BCD, we can isolate A by
averaging the two estimates. This gives
[(A+BCD) + (A-BCD)]/2 = 2A/2 = A
Similarly we can isolate the BCD effect by
[(A+BCD) – (A-BCD)]/2 = 2BCD/2 = BCD.
The unconfounded estimates are

Column   Original   Complementary   ½(Orig + Comp)   ½(Orig − Comp)
A        19         24.25           21.63 → A        -2.63 → BCD
B        1.5        4.75            3.13 → B         -1.63 → ACD
C        14         5.75            9.88 → C         4.13 → ABD
D        16.5       12.75           14.63 → D        1.88 → ABC
AB       -1         1.25            0.13 → AB        -1.13 → CD
AC       -18.5      -17.75          -18.13 → AC      -0.38 → BD
AD       19         14.25           16.63 → AD       2.38 → BC
So we can unconfound the effects by doing
the complementary ½ fraction. This should
not be surprising because the complete
factorial has no confounding.
Now let’s look at the ¼ fraction design.
The designation for a ¼ fraction is a 2^(k-2)
fractional design.
To make a ¼ fraction design, say a 2^(6-2),
we first create a 2^4 design and
associate the extra two variables with
the highest-order interactions. This
means that a ¼ fraction will have two
generators.
In the 2^(6-2) example, we may associate
factor E with ABC and factor F with
BCD. The two generators are E=ABC
and F=BCD.
Therefore the two defining relations are
I=ABCE and I=BCDF. To get the
complete defining relation, we use all
columns = I, so the complete defining
relation is the above two and their
interaction: I=ABCE=BCDF=ADEF.
Because the smallest word here is
length 4, this is a resolution IV design.
To find the aliases for each effect, multiply each word in the complete defining relation by that effect. For example,

Aliases of A: A*ABCE = BCE
A*BCDF = ABCDF
A*ADEF = DEF

So A = BCE = ABCDF = DEF.
In ¼ fraction designs, each effect has a
number of aliases. The complete alias
structure for the 2^(6-2) design with
I=ABCE=BCDF=ADEF is
A=BCE=DEF=ABCDF
B=ACE=CDF=ABDEF
C=ABE=BDF=ACDEF
D=BCF=AEF=ABCDE
E=ABC=ADF=BCDEF
F=BCD=ADE=ABCEF
ABD=CDE=ACF=BEF
ACD=BDE=ABF=CEF
AB=CE=ACDF=BDEF
AC=BE=ABDF=CDEF
AD=EF=BCDE=ABCF
AE=BC=DF=ABCDEF
AF=DE=BCEF=ABCD
BD=CF=ACDE=ABEF
BF=CD=ACEF=ABDE
There are three complementary fractions for the
I=ABCE=BCDF=ADEF design. They have defining
relations:
I = ABCE =-BCDF =-ADEF
I =-ABCE = BCDF =-ADEF
I =-ABCE =-BCDF = ADEF
In the first and third complementary fractions, the
expression –BCDF means that F is placed in the BCD
column and all the signs in the BCD column are
reversed.
Similarly, in the second and third complementary
fractions, the expression –ABCE means that E is
placed in the ABC column and all the signs in the
ABC column are reversed.
The alias structure for the effects will
now change. For example, in the first
complementary fraction, the aliases of
A =BCE=-DEF=-ABCDF.
Whole tables of these fractional
factorials exist, where you can find 1/8
fractions, 1/16 fractions, etc. So if you
have a large number of factors to study
and wish a small design and a huge
headache, consult these tables.
In fact, there are 3^k designs, which can
be fractionalized. In these designs,
there are three levels of each factor and
k factors.
These designs work pretty much the
same way as the 2^k fractionals, except
that there are complex alias
relationships in 3^(k-1) designs that require
the assumption of no interaction to be
useful.
In addition, the 3-level designs are quite large
even for a modest number of factors, so they
tend to be used only occasionally. Mostly
they are used for testing quadratic
relationships, but they are not the best way to
do so.
On the other hand, the 2^k designs and their
fractionals are used quite extensively in
industrial experimentation, despite the
confounding in the fractionals.
The most vigorous proponent of
fractional factorials is a Japanese
gentleman named Genichi Taguchi.
Taguchi has been rightly credited with
popularizing experimental design in
manufacturing. He has incorrectly
credited himself with creating what he
calls orthogonal arrays, which really are
fractional factorials. Taguchi calls them
L8, L16, etc. designs, depending on the
number of runs.
Taguchi is a proponent of quality
engineering and manufacturing. His
design philosophy is that all products
and processes should be robust to
various forms of noise during their use.
For example, airplanes should fly as
well in thunderstorms as they do in
clear skies and cars should drive as well
in the rain and snow as they do in good
weather.
In addition to promoting robust design,
Taguchi emphasizes the reduction of
variability in manufacturing and
emphasizes the importance of
minimizing cost. For all of this, he
deserves great credit.
Taguchi has designed his experiments
to cover both controllable factors and
uncontrollable noise.
Taguchi sees each system as

[Block diagram: Signal → System → Response, with Noise factors entering the System and producing Variation in the Response]
Taguchi puts the controllable factors in
an inner array and the noise factors in
the outer array. So his design looks like
Outer array (L4):    D:   1   1   2   2
                     E:   1   2   1   2

Inner array (L8):
Run   A   B   C
1     1   1   1     resp   resp   resp   resp
2     2   1   1     resp   resp   resp   resp
3     1   2   1     resp   resp   resp   resp
4     2   2   1     resp   resp   resp   resp
5     1   1   2     resp   resp   resp   resp
6     2   1   2     resp   resp   resp   resp
7     1   2   2     resp   resp   resp   resp
8     2   2   2     resp   resp   resp   resp

Each inner-array row yields a mean response Ȳ and an S-N ratio computed across the four outer-array (noise) columns.
Taguchi then uses both the mean and a
measure of variation he calls the SN
ratio for each row.
In each case, the combination of
factors represented by the mean is the
average over all noise combinations.
This is what makes it robust.
Taguchi chooses the best combination
of conditions by looking at plots of
factor effects. He does not believe in
significance tests.
Taguchi also proposes analyzing the signal-to-noise ratio for each combination of conditions. His S-N ratio for larger-the-better is

SN_L = −10 log[(1/n) Σ_{i=1}^{n} 1/Y_i²]

If smaller is better, the smaller-the-better SN is

SN_S = −10 log[(1/n) Σ_{i=1}^{n} Y_i²]
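A minimal sketch of these two ratios (assuming base-10 logs, as is conventional for these decibel-style ratios):

```python
import numpy as np

def sn_larger_the_better(y):
    # SN_L = -10 log10( (1/n) * sum(1 / y_i^2) )
    y = np.asarray(y, dtype=float)
    return -10 * np.log10(np.mean(1.0 / y**2))

def sn_smaller_the_better(y):
    # SN_S = -10 log10( (1/n) * sum(y_i^2) )
    y = np.asarray(y, dtype=float)
    return -10 * np.log10(np.mean(y**2))
```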
Taguchi believes that the SN ratios separate
location from variability. When he analyzes
them as response variables in an ANOVA, he
thinks he is both optimizing the response
and reducing the variability around it.
This has been shown to be completely
incorrect. But Taguchi adherents still plot
SN for each effect just as they do for Y.
Taguchi also does not believe in
interactions, although they are
sometimes present in the experiments
he has designed. He claims that if the
engineer is working at the “energy level
of the system,” there are no interactions.
But since Taguchi eyeballs marginal
means plots and SN plots to pick the
winners, he clearly misses out on some
of the best combinations if there are
interactions.
Another criticism of the Taguchi approach is
that his combined inner and outer array setup produces very large designs.
A better strategy might be to use a single
design that has both controllable and noise
factors and look at their interactions, as we
did in the battery life experiment earlier (slide
109) where batteries had to be robust to
extreme temperatures.
Now let’s look further at random effects
factorial experiments.
You already know that a random effects
model has factors with very many levels,
and that the levels used in the
experiment are chosen at random from
all those available.
Let’s take a two-factor factorial where
both factors are random.
In this experiment, the model is

Y_ijk = μ + τ_j + β_k + (τβ)_jk + ε_ijk

where j = 1, 2, ..., J
k = 1, 2, ..., K
i = 1, 2, ..., n replicates

and the model parameters τ_j, β_k, (τβ)_jk, and ε_ijk are all independent normally distributed random variables with mean 0 and variances σ²_τ, σ²_β, σ²_τβ, and σ².
The SS and MS for each factor are
calculated exactly the same as for the
fixed effects model. But for the F ratios,
we must examine the expected MS for
each of the variance components.
E ( MS E )   2
2
E ( MS A )   2  n 
 Kn 2
2
E ( MS B )   2  n 
 Jn 2
2
E ( MS AB )   2  n 
To test each of these effects, we form the following F-ratios:

A effect:           F = MS_A / MS_AB
B effect:           F = MS_B / MS_AB
Interaction effect: F = MS_AB / MS_E
Note that the main effects tests are different
from those in the fixed-effects model.
In the fixed effects model, each of the MS
terms estimates only error variance plus its
own effect, so all effects can be tested by
MSE.
E(MS_E) = σ²
E(MS_AB) = σ² + n Σ_{j=1}^{J} Σ_{k=1}^{K} (τβ)²_jk / ((J−1)(K−1))
E(MS_A) = σ² + Kn Σ_{j=1}^{J} τ²_j / (J−1)
E(MS_B) = σ² + Jn Σ_{k=1}^{K} β²_k / (K−1)
Now most people do random effects
models more to estimate the variance
components than to test for
significance. These estimates are
σ̂² = MS_E
σ̂²_τβ = (MS_AB − MS_E) / n
σ̂²_τ = (MS_A − MS_AB) / (Kn)
σ̂²_β = (MS_B − MS_AB) / (Jn)
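In code these estimators are one-liners. A minimal sketch (the function and argument names are illustrative):

```python
def variance_components(ms_a, ms_b, ms_ab, ms_e, J, K, n):
    # Method-of-moments estimators from the expected mean squares above
    return {
        "error":       ms_e,
        "interaction": (ms_ab - ms_e) / n,
        "A":           (ms_a - ms_ab) / (K * n),
        "B":           (ms_b - ms_ab) / (J * n),
    }
```

The gage R&R example below uses exactly these formulas with J = 20 parts, K = 3 operators, and n = 2 repetitions.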
Gage R&R studies are a common
industrial application of random effects
models to test a measurement system.
In a typical experiment of this sort, there
are J parts to be measured by some
gage and K operators to do the
measurement with n repetitions of the
measurement.
In this example, there are J=20 parts to
be measured, K=3 operators, and n=2
repetitions.
The data are
Part   Operator 1   Operator 2   Operator 3
1      21  20       20  20       19  21
2      24  23       24  24       23  24
3      20  21       19  21       20  22
4      27  27       28  26       27  28
5      19  18       19  18       18  21
6      23  21       24  21       23  22
7      22  21       22  24       22  20
8      19  17       18  20       19  18
9      24  23       25  23       24  24
10     25  23       26  25       24  25
11     21  20       20  20       21  20
12     18  19       17  19       18  19
13     23  25       25  25       25  25
14     24  24       23  25       24  25
15     29  30       30  28       31  30
16     26  26       25  26       25  27
17     20  20       19  20       20  20
18     19  21       19  19       21  23
19     25  26       25  24       25  25
20     19  19       18  17       19  17
The total variability can be divided into that due to
parts, to operators, and to the gage itself.
Y2  2   2  2   2
 2 is the variance component for parts

2

2

2
is the variance component for
operators
is the variance component for interaction of
parts and operators
is the random experimental error variance
A gage R&R study is a repeatability and
reproducibility study.
The repeatability part of this is given by σ², because this reflects variation when the same part is measured by the same operator.

The reproducibility part is given by σ²_β + σ²_τβ, because this reflects the additional variability in the system from different operators using the gage.
The ANOVA table for this study is
Source         SS        df    MS      p
A: Parts       1185.43   19    62.39   <0.00001
B: Operators      2.62    2     1.31   0.1730
AB               27.05   38     0.71   0.8614
Error            59.50   60     0.99
Total          1274.60  119
The estimates of the variance components are

σ̂² = MS_E = 0.99
σ̂²_τβ = (MS_AB − MS_E)/n = (0.71 − 0.99)/2 = −0.14
σ̂²_τ = (MS_A − MS_AB)/(Kn) = (62.39 − 0.71)/((3)(2)) = 10.28
σ̂²_β = (MS_B − MS_AB)/(Jn) = (1.31 − 0.71)/((20)(2)) = 0.015
Notice that one of the variance components is negative, which is impossible because a variance cannot be negative.
What can you do about this?
Well, you could just call it 0 and leave
the other components unchanged.
Or you could notice that the interaction
is insignificant and redo the ANOVA for
a reduced model excluding the
interaction term.
The reduced ANOVA table is
Source         SS        df    MS      p
A: Parts       1185.43   19    62.39   <0.00001
B: Operators      2.62    2     1.31   0.2324
Error            86.55   98     0.88
Total          1274.60  119
and the new variance components are

σ̂² = MS_E = 0.88
σ̂²_τ = (MS_A − MS_E)/(Kn) = (62.39 − 0.88)/((3)(2)) = 10.25
σ̂²_β = (MS_B − MS_E)/(Jn) = (1.31 − 0.88)/((20)(2)) = 0.0108
Then the gage variance is

σ̂²_gage = σ̂² + σ̂²_β = 0.88 + 0.0108 = 0.8908

and the total variance is

σ̂²_total = σ̂² + σ̂²_β + σ̂²_τ = 0.88 + 0.0108 + 10.25 = 11.1408
So most of the total variance is due to variability in
the product. Very little is due to operator variability or
nonrepeatability from part to part.
Of course, it had to come to this. If
factors can be fixed and they can be
random, there are certainly going to be
studies that are a combination of fixed
and random factors. These are called
mixed models.
Let’s look at a simple case where there
is one fixed factor A and one random
factor B.
The model is

Y_ijk = μ + τ_j + β_k + (τβ)_jk + ε_ijk

where τ_j is fixed, so Σ_{j=1}^{J} τ_j = 0.
The other effects are random. However, the interaction components sum to 0 over the levels of the fixed effect. That is,

Σ_{j=1}^{J} (τβ)_jk = 0 for each k.

This restriction implies that some of the interaction elements at different levels of the fixed factor are not independent.
This restriction makes the model a restricted model. The expected MS are

E(MS_E) = σ²
E(MS_AB) = σ² + nσ²_τβ
E(MS_A) = σ² + nσ²_τβ + Kn Σ_{j=1}^{J} τ²_j / (J−1)
E(MS_B) = σ² + Jnσ²_β
This implies that the F-ratio for the fixed factor is

F = MS_A / MS_AB

But the tests for the random factor B and the AB interaction are

F = MS_B / MS_E   and   F = MS_AB / MS_E
Let’s look at a mixed effects ANOVA.
Suppose we still have a gage R&R study,
but now there are only 3 operators who
use this gage.
In this case, Operators is a fixed factor,
not a random factor as we had earlier.
The parts are still random, of course,
because they are chosen from
production randomly.
We still have the same observations, so
the ANOVA table is the same as before
except for the p values. This is
because the F-ratios are different.
The F-ratio for operators is, as before,

F_operators = MS_operators / MS_AB

but the F-ratio for parts is now

F_parts = MS_parts / MS_E
The conclusions are still the same. Only
the parts factor is significant, which is
expected because the parts are
different and should have different
measurements.
The variance estimates are also virtually
identical to those in the complete
random effects model, even with the
negative estimate for the AB interaction.
The reduced model then produces the
same results as before.
So far, we have talked about several
methods for reducing the residual
variance by controlling nuisance
variables.
Remember that nuisance variables are
expected to affect the response, but we
don’t want that effect contaminating the
effect of interest. If the nuisance factors
are known and controllable, we can use
blocking (most common), Latin Squares,
or Graeco-Latin Squares.
But suppose the nuisance variable is known
but uncontrollable. Now we need a new way
to compensate for its effects. This new way
is called analysis of covariance or ANCOVA.
Say we know that our response variable Y is
linearly related to another variable X, which
cannot be controlled but can be observed
along with Y. Then X is called a covariate
and ANCOVA adjusts the response Y for the
effect of the covariate X.
If we don’t make this adjustment, MSE
could be inflated and thus reduce the
power of the test to find real differences
in Y due to treatments.
ANCOVA uses both ANOVA and regression analysis. For a one-way design, the model is

Y_ij = μ + τ_j + β(X_ij − X̄) + ε_ij

As usual, we assume that the errors are normal with constant variance and that Σ_{j=1}^{J} τ_j = 0, because we have a fixed-effect model.
For ANCOVA, we also assume that β ≠ 0,
so there is a linear relationship between X
and Y, and that β is the same for each
treatment level.
The estimate of β is the pooled sum of cross-products of X and Y divided by the pooled sum of squares of the covariate within treatments:

β̂ = Σ_{i=1}^{n} Σ_{j=1}^{J} (X_ij − X̄_j)(Y_ij − Ȳ_j) / Σ_{i=1}^{n} Σ_{j=1}^{J} (X_ij − X̄_j)² = SCP_XY,pooled / SS_X,pooled
The SSE for this model is

SS_error = Σ_i Σ_j (Y_ij − Ȳ_j)² − [Σ_i Σ_j (X_ij − X̄_j)(Y_ij − Ȳ_j)]² / Σ_i Σ_j (X_ij − X̄_j)²

with nJ − J − 1 df.
But if there were no treatment effect, the SSE for the reduced model would be

SS'_error = Σ_i Σ_j (Y_ij − Ȳ)² − [Σ_i Σ_j (X_ij − X̄)(Y_ij − Ȳ)]² / Σ_i Σ_j (X_ij − X̄)²

with nJ − 2 df.
Note that SSE is smaller than the reduced SS’E
because the full model with the treatment effect
contains the additional parameters τj so SS’E – SSE
is the reduction in SS due to the τj.
So to test the treatment effect, we use

F = [(SS'_error − SS_error) / (J − 1)] / [SS_error / (nJ − J − 1)]
The ANCOVA table is

Source       SS                       df           MS   p
Regression   SCP²_XY / SS_X           1
Treatments   SS'_error − SS_error     J − 1
Error        SS_error                 Jn − J − 1
Total        Σ_i Σ_j (Y_ij − Ȳ)²     Jn − 1
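A compact sketch of this computation (names are illustrative; it follows the pooled-within and reduced-model formulas above):

```python
import numpy as np

def ancova_f(groups):
    # groups: list of (x, y) covariate/response arrays, one pair per treatment
    J = len(groups)
    N = sum(len(x) for x, _ in groups)

    # Full model: pool cross-products and SS within treatments
    scp = sum(((x - x.mean()) * (y - y.mean())).sum() for x, y in groups)
    ssx = sum(((x - x.mean()) ** 2).sum() for x, _ in groups)
    ssy = sum(((y - y.mean()) ** 2).sum() for _, y in groups)
    sse = ssy - scp**2 / ssx                      # df = N - J - 1

    # Reduced model: ignore treatments entirely
    x = np.concatenate([x for x, _ in groups])
    y = np.concatenate([y for _, y in groups])
    scp_t = ((x - x.mean()) * (y - y.mean())).sum()
    sse_red = ((y - y.mean()) ** 2).sum() - scp_t**2 / ((x - x.mean()) ** 2).sum()

    return ((sse_red - sse) / (J - 1)) / (sse / (N - J - 1))
```

Run on the fiber data in the example that follows, this returns F ≈ 2.61 (p = 0.118), matching the ANCOVA table there.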
Consider an experiment seeking to
determine if there is a difference in the
breaking strength of a fiber produced
by three different machines.
Clearly the breaking strength of the
fiber is affected by its thickness, so the
thickness of the fiber is recorded along
with its strength measurement.
This is a perfect example for ANCOVA.
The data are
Breaking strength in pounds; diameter in 10^-3 inches

          Machine 1             Machine 2             Machine 3
strength   diameter     strength   diameter     strength   diameter
36         20           40         22           35         21
41         25           48         28           37         23
39         24           39         22           42         26
42         25           45         30           34         21
49         32           44         28           32         15
It is clear that the strength and diameter are
linearly related from this plot:
[Scatterplot of strength vs. diameter for all 15 fibers, showing a clear positive linear relationship]
The ANCOVA table is

Source       SS                                    df   MS       p
Regression   SCP²_XY / SS_X = 305.13               1    305.13
Treatments   SS'_e − SS_e = 41.27 − 27.99 = 13.28  2    6.64     0.118
Error        SS_error = 27.99                      11   2.54
Total        Σ(Y_ij − Ȳ)² = 346.40                14
In this case, the machines have not
been shown to be different.
But suppose you had ignored the
relationship of the diameter to the
breaking strength and done instead a
simple one-way ANOVA.
Source     SS      df   MS      p
Machines   140.4    2   70.20   0.0442
Error      206.0   12   17.17
Total      346.4   14
So if you had ignored the relationship
of breaking strength and diameter, you
would have concluded that the
machines were different.
Then you would have been misled into
spending resources trying to equalize
the strength output of the machines,
when instead you should be trying to
reduce the variability of the diameter of
the fiber. This shows how important it
is to control for nuisances.
Now we are going to talk about nested
designs. A nested design is one in which the
levels of one factor are not identical for
different levels of another factor.
For example, a company purchases its raw
material from 3 different suppliers, and wants
to know if the purity of the material is the
same for all three suppliers.
There are 4 batches of raw material available
from each supplier and three determinations
of purity are made for each batch.
The design has this hierarchical structure.

Supplier:      1                       2                       3
Batch:     1    2    3    4       1    2    3    4       1    2    3    4
1st obs   Y111 Y121 Y131 Y141    Y211 Y221 Y231 Y241    Y311 Y321 Y331 Y341
2nd obs   Y112 Y122 Y132 Y142    Y212 Y222 Y232 Y242    Y312 Y322 Y332 Y342
3rd obs   Y113 Y123 Y133 Y143    Y213 Y223 Y233 Y243    Y313 Y323 Y333 Y343
The batches are nested within supplier. That is,
batch 1 from supplier 1 has nothing to do with
batch 1 from the other two suppliers. The
same is true for the other three batches. So
suppliers and batches are not crossed.
This design is called a two-stage
nested design because there is only
one factor nested within one other
factor. Suppliers are the first stage and
batches are the second stage.
It is possible to have higher-stage
designs. For example, if each batch
had another factor nested in it, this
would become a three-stage
hierarchical or nested design.
The linear model for the two-stage nested design is

Y_ijk = μ + τ_j + β_k(j) + ε_i(jk)

The notation β_k(j) is read "β_k nested in τ_j." There are J levels of factor A, as usual, and K levels of factor B in each level of factor A. There are n replicates for each level of B nested in A.
The above design is a balanced nested
design because there is the same number of
levels of B nested within each level of A and
the same number of replicates.
As always, the total SS can be partitioned:

Σ_i Σ_j Σ_k (Y_ijk − Ȳ)² = Kn Σ_j (Ȳ_j − Ȳ)² + n Σ_j Σ_k (Ȳ_jk − Ȳ_j)² + Σ_i Σ_j Σ_k (Y_ijk − Ȳ_jk)²

SS_T = SS_A + SS_B(A) + SS_E

with df:  JKn − 1 = (J − 1) + J(K − 1) + JK(n − 1)
The appropriate F-ratio for each factor
depends on whether the factor is fixed
or random.
For fixed A and B,

E(MS_A) = σ² + Kn Σ_{j=1}^{J} τ²_j / (J − 1)
E(MS_B(A)) = σ² + n Σ_{j=1}^{J} Σ_{k=1}^{K} β²_k(j) / (J(K − 1))
E(MS_E) = σ²

So the F-ratio for A is MS_A / MS_E and for B(A) it is MS_B(A) / MS_E.
If both factors are random, the expectations are

E(MS_A) = σ² + nσ²_β + Knσ²_τ
E(MS_B(A)) = σ² + nσ²_β
E(MS_E) = σ²

So the F-ratio for A is MS_A / MS_B(A) and for B(A) it is MS_B(A) / MS_E.
If we have a mixed model with A fixed and B random, the expectations are

E(MS_A) = σ² + nσ²_β + Kn Σ_{j=1}^{J} τ²_j / (J − 1)
E(MS_B(A)) = σ² + nσ²_β
E(MS_E) = σ²

So the F-ratio for A is MS_A / MS_B(A) and for B(A) it is MS_B(A) / MS_E, the same as in the fully random model.
This model would be used if the batches were
a random selection from each supplier’s full
set of batches.
Suppose this were a mixed model with the
batches being a random selection from the
total set of batches for each supplier. The
data are
Supplier:      1                  2                  3
Batch:      1   2   3   4      1   2   3   4      1   2   3   4
1st obs     1  -2  -2   1      1   0  -1   0      2  -2   1   3
2nd obs    -1  -3   0   4     -2   4   0   3      4   0  -1   2
3rd obs     0  -4   1   0     -3   2  -2   2      0   2   2   1
The ANOVA table is

Source                       SS       df   MS     p
Suppliers                    15.06     2   7.53   0.42
Batches (within suppliers)   69.92     9   7.77   0.02
Error                        63.33    24   2.64
Total                       148.31    35
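A short sketch verifying this decomposition numerically from the data table above:

```python
import numpy as np

# y[j, k, i]: observation i on batch k within supplier j
y = np.array([
    [[1, -1, 0], [-2, -3, -4], [-2, 0, 1], [1, 4, 0]],   # supplier 1
    [[1, -2, -3], [0, 4, 2], [-1, 0, -2], [0, 3, 2]],    # supplier 2
    [[2, 4, 0], [-2, 0, 2], [1, -1, 2], [3, 2, 1]],      # supplier 3
])
J, K, n = y.shape

grand = y.mean()
supplier_means = y.mean(axis=(1, 2))
batch_means = y.mean(axis=2)

ss_a = K * n * ((supplier_means - grand) ** 2).sum()
ss_b_in_a = n * ((batch_means - supplier_means[:, None]) ** 2).sum()
ss_e = ((y - batch_means[:, :, None]) ** 2).sum()
print(ss_a, ss_b_in_a, ss_e)  # ~15.06, ~69.92, ~63.33
```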
An examination of this table shows that only
batches within suppliers is significant. If this
were a real experiment, this would be an
important conclusion.
If the suppliers had been different in purity of
their raw material, the company could just
pick the best supplier. But since it is the
purity from batch to batch within supplier that
is different, the company has a real problem
and must get the suppliers to reduce their
variability.
For the m-stage nested design, we can
just extend the results from the 2-stage
nested design. For example, suppose a
foundry is studying the hardness of two
metal alloy formulations.
For each alloy formulation, three heats
are prepared and two ingots are
selected at random from each heat.
Two hardness measurements are made
on each ingot.
This design is

[Hierarchical diagram: alloy formulations 1 and 2; within each alloy, heats 1–3; within each heat, ingots 1–2; and on each ingot, Obs 1 and Obs 2]
Obs 1
Obs 2
Note that the ingots (random) are nested within the
heats (fixed) and the heats are nested within the
alloy formulations (fixed). So this is a 3-stage
nested design with 2 replicates.
It is analyzed in the same way as a 2-stage design
except that there is an additional factor to consider.
There are some designs with both
crossed and nested factors. Let’s look
at an example.
An industrial engineer needs to improve
the assembly speed of inserting
electronic components on printed
circuit boards. He has designed three
assembly fixtures and two workplace
layouts and he randomly selects 4
operators for each fixture-layout
combination.
In this experiment, the 4 operators are nested under each layout and the fixtures are crossed with layouts. The design is

            Workplace Layout 1               Workplace Layout 2
Operator    1      2      3      4           1      2      3      4
Fixture 1   Y1111  Y1121  Y1131  Y1141       Y1211  Y1221  Y1231  Y1241
            Y1112  Y1122  Y1132  Y1142       Y1212  Y1222  Y1232  Y1242
Fixture 2   Y2111  Y2121  Y2131  Y2141       Y2211  Y2221  Y2231  Y2241
            Y2112  Y2122  Y2132  Y2142       Y2212  Y2222  Y2232  Y2242
Fixture 3   Y3111  Y3121  Y3131  Y3141       Y3211  Y3221  Y3231  Y3241
            Y3112  Y3122  Y3132  Y3142       Y3212  Y3222  Y3232  Y3242
The model is

Y_ijkl = μ + τ_j + β_k + γ_l(k) + (τβ)_jk + (τγ)_jl(k) + ε_(jkl)i

where τ_j is the effect of the jth fixture,
β_k is the effect of the kth layout,
γ_l(k) is the effect of the lth operator within the kth layout,
(τβ)_jk is the fixture-by-layout interaction,
(τγ)_jl(k) is the fixture-by-operators-within-layout interaction, and
ε_(jkl)i is the error term.
Designs such as this are done occasionally
when it is physically impossible to cross the
factors.
For example, in this design, the workplace
layouts were in different parts of the plant so
the same 4 operators could not be used for
both types of layout.
The fixtures, on the other hand, could be
crossed with the layouts because they could
be installed in both workplace layouts.
Now we’ll look at split-plot designs.
The split-plot design is a generalization
of the randomized block design when
we cannot randomize the order of the
runs within the block.
That is, there is a restriction on
randomization.
For example, a paper manufacturer is
interested in the tensile strength of his
paper.
He has three different pulp preparation
methods and four different cooking
temperatures.
He wants three replicates for the design.
How can the paper manufacturer
design his experiment?
Now the pilot plant where the
experiment is to be run can do only 12
runs a day. So the experimenter
decides to run one replicate each day
for three days. In this case, days is a
block.
On each day, the first method of preparation
is used and when the pulp is produced, it is
divided into four samples, each of which is to
be cooked at one of the four cooking
temperatures.
Then the second method of preparation is
used and when the pulp is produced, it is
divided into four samples, each of which is
cooked at one of the four temperatures.
Finally the third method of preparation is used
and when the pulp is produced, it is divided
into four samples, each of which is cooked at
one of the four temperatures.
The design is

                   Block 1 (day 1)    Block 2 (day 2)    Block 3 (day 3)
Pulp prep method    1    2    3        1    2    3        1    2    3
Temp 200           30   34   29       28   31   31       31   35   32
     225           35   41   26       32   36   30       37   40   34
     250           37   38   33       40   42   32       41   39   39
     275           36   42   36       41   40   40       40   44   45
Each block (day) is divided into three Pulp
Prep methods called main plots.
Then each main plot is further divided into
four cooking temperatures called subplots or
split plots. So this design has three main
plots and four split plots.
The model for the split-plot design is

Y_jkl = μ + τ_j + β_k + (τβ)_jk + γ_l + (τγ)_jl + (βγ)_kl + (τβγ)_jkl + ε_jkl

where the second, third, and fourth terms represent the whole plot:
τ_j is blocks (factor A)
β_k is pulp prep methods (factor B)
(τβ)_jk is the AB interaction, which is the whole plot error.

The rest of the terms represent the split plot:
γ_l is the temperature (factor C)
(τγ)_jl is the AC interaction
(βγ)_kl is the BC interaction
(τβγ)_jkl is the ABC interaction, the subplot error.
The expected MS for this design, with blocks random and pulp prep methods and temperatures fixed, are

Whole plot:
E(MS_A) = σ² + KLσ²_τ
E(MS_B) = σ² + Lσ²_τβ + JL Σ_{k=1}^{K} β²_k / (K − 1)
E(MS_AB) = σ² + Lσ²_τβ

Split plot:
E(MS_C) = σ² + Kσ²_τγ + JK Σ_{l=1}^{L} γ²_l / (L − 1)
E(MS_AC) = σ² + Kσ²_τγ
E(MS_BC) = σ² + σ²_τβγ + J Σ_{k=1}^{K} Σ_{l=1}^{L} (βγ)²_kl / ((K − 1)(L − 1))
E(MS_ABC) = σ² + σ²_τβγ
With these expected mean squares, it is
clear how to form the F-ratio for testing
each effect.
B is tested by AB
C is tested by AC
BC is tested by ABC
A, AB, AC, and ABC would all be tested by MS_E if it were estimable, but it is not, because there is only one observation per prep-method/temperature combination per day.
The ANOVA table for this design is

Source                  SS       df   MS       p
Blocks (A)              77.55     2    38.78
Prep method (B)        128.39     2    64.20   0.05
AB (whole plot error)   36.28     4     9.07
Temperature (C)        434.08     3   144.69   <0.01
AC                      20.67     6     3.45
BC                      75.17     6    12.53   0.05
ABC (subplot error)     50.83    12     4.24
Total                  822.97    35
The split-plot design, due to Fisher, had
its origin in agriculture. There is usually
a very large plot of land called a field,
which is divided into subplots.
If each field is planted with a different
crop and different fertilizers are used in
the field subplots, the crop varieties are
the main treatments and the fertilizers
are the subtreatments.
In applications other than agriculture,
there are often factors whose levels are
more difficult to change than those of
other factors or there are factors that
require larger experimental units than
others.
In these cases, the hard-to-vary
factors form the whole plots and the
easy-to-vary factors are run in the
subplots.
Of course, if there are split-plots, there
would have to be split-split-plots.
These designs occur when there is
more than one restriction on
randomization.
Consider the example of a researcher
studying how fast a drug capsule is
absorbed into the bloodstream. His
study has 3 technicians, 3 dosage
strengths, and four capsule wall
thicknesses.
In addition the researcher wants 4
replicates, so each replicate is run on a
different day. So the days are blocks.
Within each day (block), each technician
runs three dosage strengths at the four
wall thicknesses.
But once a dosage strength is
formulated, all the wall thicknesses
must be run at that dosage level by
each technician. This is a restriction on
randomization.
The first dosage strength is formulated
and the first technician runs it at all four
wall thicknesses.
Then another dosage strength is
formulated and this technician runs it at
all four wall thicknesses.
Then the third dosage strength is
formulated and this technician runs it at
all four wall thicknesses.
On that same day, the other two
technicians are doing the same thing.
This procedure has two randomization
restrictions within a block: technician
and dosage strength.
The whole plot is the technician.
The dosage strengths form three
subplots, and may be randomly
assigned to a subplot.
Within each dosage strength (subplot),
there are four sub-subplots, the
capsule wall thicknesses, which may be
run in random order.
The design for one block (day) is shown below; it is repeated for 2 more blocks (days).

[Layout for one block: Technicians 1–3 are the whole plots; within each technician, dosage strengths 1–3 are the subplots; within each dosage strength, wall thicknesses 1–4 are the sub-subplots]
Now blocks (days) is a random factor and the other factors are fixed. So the expected mean squares are (with J blocks, K technicians, L dosage strengths, and H wall thicknesses):

Whole plot:
A (blocks):  E(MS_A) = σ² + KLHσ²_τ
B (techs):   E(MS_B) = σ² + LHσ²_τβ + JLH Σ_{k=1}^{K} β²_k / (K − 1)
AB:          E(MS_AB) = σ² + LHσ²_τβ

Subplot:
C (dosage):  E(MS_C) = σ² + KHσ²_τγ + JKH Σ_{l=1}^{L} γ²_l / (L − 1)
AC:          E(MS_AC) = σ² + KHσ²_τγ
BC:          E(MS_BC) = σ² + Hσ²_τβγ + JH Σ_k Σ_l (βγ)²_kl / ((K − 1)(L − 1))
ABC:         E(MS_ABC) = σ² + Hσ²_τβγ

Sub-subplot:
D (wall thickness): E(MS_D) = σ² + KLσ²_τδ + JKL Σ_{h=1}^{H} δ²_h / (H − 1)
AD:          E(MS_AD) = σ² + KLσ²_τδ
BD:          E(MS_BD) = σ² + Lσ²_τβδ + JL Σ_k Σ_h (βδ)²_kh / ((K − 1)(H − 1))
ABD:         E(MS_ABD) = σ² + Lσ²_τβδ
CD:          E(MS_CD) = σ² + Kσ²_τγδ + JK Σ_l Σ_h (γδ)²_lh / ((L − 1)(H − 1))
ACD:         E(MS_ACD) = σ² + Kσ²_τγδ
BCD:         E(MS_BCD) = σ² + σ²_τβγδ + J Σ_k Σ_l Σ_h (βγδ)²_klh / ((K − 1)(L − 1)(H − 1))
ABCD:        E(MS_ABCD) = σ² + σ²_τβγδ
From these expectations, the tests would be:
MS(B) / MS(AB)
MS(C) / MS(AC)
MS(BC) / MS(ABC)
MS(D) / MS(AD)
MS(BD) / MS(ABD)
MS(CD) / MS(ACD)
MS(BCD) / MS(ABCD)
Factor A, and all its interactions (AB, AC, AD,
ABC, ABD, ACD, ABCD) would be tested by
MS(Error) if it were estimable, but it is not.
All along in our discussion of fixed
models, we have been interested in
whether the means for different
treatments are different.
What if we are interested in whether the
variances are different in different levels
of a factor? How would we handle this?
We know that variances are not normally
distributed so we cannot do an ANOVA
on raw variances.
Suppose there is an experiment in an
aluminum smelter. Here alumina is added to
a reaction cell with other ingredients. Four
algorithms to maintain the ratio of alumina to
other ingredients in the cell are under test.
The response is related to cell voltage. A
sensor scans cell voltage several times per
second producing thousands of voltage
measurements during each run of the
experiment. The average cell voltage and the
standard deviation of cell voltage for each
ratio control algorithm were recorded for each
run.
The data are:

Means of cell voltage:
Ratio control   Run 1   Run 2   Run 3   Run 4   Run 5   Run 6
algorithm
1               4.93    4.86    4.75    4.95    4.79    4.88
2               4.85    4.91    4.79    4.85    4.75    4.85
3               4.83    4.88    4.90    4.75    4.82    4.90
4               4.89    4.77    4.94    4.86    4.79    4.76

Standard deviations of cell voltage:
Ratio control   Run 1   Run 2   Run 3   Run 4   Run 5   Run 6
algorithm
1               0.05    0.04    0.05    0.06    0.03    0.05
2               0.04    0.02    0.03    0.05    0.03    0.02
3               0.09    0.13    0.11    0.15    0.08    0.12
4               0.03    0.04    0.05    0.05    0.03    0.02
The engineers want to test both means
and standard deviations.
The mean is important because it
impacts cell temperature.
The standard deviation, called “pot
noise” by the engineers, is important
because it affects overall cell efficiency.
But the problem is that standard
deviations are not normally distributed.
The way to get around this is to do a log transformation of the standard deviation and use that for the ANOVA. Because all the standard deviations are less than 1, it is best to use Y = −ln(SD), the negative natural log of pot noise, as the response variable. The ANOVA table for pot noise is
Source         SS      df   MS      p
RC algorithm   6.166    3   2.055   <0.001
Error          1.872   20   0.094
Total          8.038   23
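A minimal sketch of this analysis using scipy's one-way ANOVA on the transformed response:

```python
import numpy as np
from scipy import stats

sd = {  # pot-noise standard deviations from the table above
    1: [0.05, 0.04, 0.05, 0.06, 0.03, 0.05],
    2: [0.04, 0.02, 0.03, 0.05, 0.03, 0.02],
    3: [0.09, 0.13, 0.11, 0.15, 0.08, 0.12],
    4: [0.03, 0.04, 0.05, 0.05, 0.03, 0.02],
}

# Analyze y = -ln(SD) because raw SDs are not normally distributed
y = [-np.log(np.array(v)) for v in sd.values()]

f, p = stats.f_oneway(*y)
print(f, p)  # F ≈ 22 (= 2.055 / 0.094), p < 0.001
```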
It is clear that the ratio control algorithm
affects pot noise. In particular,
algorithm 3 produces greater pot noise
than the other three algorithms.
There are other occasions when it is
appropriate to use transformations of
the response variable, and many
different transformations are available.
Box and Cox have developed a set of
rules for when to use which
transformation.
Leo Goodman introduced logit analysis
to analyze frequency data in an ANOVA.
Consider the problem of car insurance
for teens. In America, teens pay (or
their parents do) about three times as
much for car insurance as adults.
Teens claim that they are better drivers
than adults, but the insurance
companies are interested in accident
rates.
Consider the following frequency data:

           Adults                       Teens
           Accidents   No accidents    Accidents   No accidents   Total
Males      2           98              16          84             200
Females    3           97              12          88             200
Total      5           195             28          172            400
Now look at the odds of a teen having an
accident. The odds of a teen driver having an
accident are 28:172. The odds of an adult
driver having an accident are 5:195. We can
test to see if teens have a higher accident rate
by forming the odds ratio. We can do the
same thing for males vs females. Then we can
even look at the interaction of gender and age.
But we can’t analyze the raw odds
ratios because they are not normally
distributed. When Goodman
considered this, he came up with “let’s
log it.” Such a transformed odds ratio
has ever since been called a logit.
Once you get these logits, you can do
the ANOVA in the usual way.
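A tiny sketch of the teen-vs-adult comparison (the standard-error formula used here is a standard large-sample result, not one given in these notes):

```python
import math

# Odds of an accident, from the frequency table above
teen_odds = 28 / 172
adult_odds = 5 / 195

odds_ratio = teen_odds / adult_odds   # ~6.35: teens' odds are ~6x adults'
logit = math.log(odds_ratio)          # the log odds ratio Goodman analyzes

# Large-sample SE of a log odds ratio: sqrt of summed reciprocal counts
se = math.sqrt(1/28 + 1/172 + 1/5 + 1/195)
print(round(odds_ratio, 2), round(logit, 2), round(se, 2))
```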
Another transformation that is useful
when non-normality is suspected is the
K-W rank transformation, introduced by
Kruskal and Wallis.
To use this technique, put the
observations in ascending order and
assign each observation a rank, Rij.
If observations are tied, assign them the
average rank.
The test statistic is

H = (N − 1) [ Σ_{j=1}^{J} R²_j / n_j − N(N+1)²/4 ] / [ Σ_i Σ_j R²_ij − N(N+1)²/4 ]

where N = total number of observations and R_j = sum of ranks for treatment j. The denominator divided by N − 1 is the variance of the ranks. H is referred to the χ² table (with J − 1 df) to determine the probability.
As an example, consider the following data.

      Level 1          Level 2          Level 3
      Yi1    Ri1       Yi2    Ri2       Yi3    Ri3
      7      1.5       12     5.5       14     7
      7      1.5       17     9         18     11.5
      15     8         12     5.5       18     11.5
      11     4         18     11.5      19     14.5
      9      3         18     11.5      19     14.5
R_j          18               43               59

R_j is the sum of the ranks in treatment j.
How did we get these ranks?

Level   Ordered Yij   Order   Rank
1       7             1       1.5
1       7             2       1.5
1       9             3       3
1       11            4       4
2       12            5       5.5
2       12            6       5.5
3       14            7       7
1       15            8       8
2       17            9       9
2       18            10      11.5
2       18            11      11.5
3       18            12      11.5
3       18            13      11.5
3       19            14      14.5
3       19            15      14.5
Now it’s a simple matter to apply the formula.

H = (N − 1)[Σ_j R²_j/n_j − N(N+1)²/4] / [Σ_i Σ_j R²_ij − N(N+1)²/4]
  = (15 − 1)(1130.8 − 960) / (1233.5 − 960)
  = 2391.2 / 273.5
  = 8.74
The critical χ² value for 3 − 1 = 2 df at α = .025 is 7.38. Since the observed H is greater than 7.38, we can reject H₀ and conclude that the treatments are different.
The Kruskal-Wallis rank procedure is a very powerful nonparametric alternative to ANOVA. It relies on no assumption of normality. Moreover, the rank transformation is not distorted by unusual observations (outliers), so it is very robust to departures from distributional assumptions.
It is equivalent to doing an ANOVA on the
ranks. So if in doubt about whether the
random variable is normal, you should use
the Kruskal-Wallis rank transformation.
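In practice this is a one-liner; scipy's kruskal applies the same rank transformation, including the tie correction:

```python
from scipy import stats

level1 = [7, 7, 15, 11, 9]
level2 = [12, 17, 12, 18, 18]
level3 = [14, 18, 18, 19, 19]

h, p = stats.kruskal(level1, level2, level3)
print(h, p)  # H ≈ 8.74, p ≈ 0.013 — same conclusion as the hand calculation
```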
Now let’s take a look at Response
Surface Methodology or RSM. RSM is
used to analyze problems where there
are several variables influencing the
response and the purpose of the
experiment is to optimize the response.
Let’s say we’re dealing with two factors
that affect the response Y. Then the
model is Y = f(X1, X2) where f(X1, X2) is
a response surface.
In this case, the response surface is in
the third dimension over the X1,X2 plane.
Under the response surface, right on
the X1,X2 plane, we can draw the
contours of the response surface.
These contours are lines of constant
response.
Both the response surface and the
corresponding contour plot are shown
in the handouts.
Generally, the exact nature of the
response surface is unknown and the
model we decide to use is an attempt at
a reasonable approximation to it. If we
use a first-order model, such as
Yˆ = β0 + β1X1 + β2X2 + … + βkXk
we are assuming that the response is a
linear function of the independent
variables.
If there is some curvature in the relationship, we try a second-order polynomial to fit the response:

Ŷ = β₀ + Σ_{k=1}^{K} β_k X_k + Σ_{k=1}^{K} β_kk X²_k + Σ Σ_{j<k} β_jk X_j X_k
No model ever perfectly fits the
relationship, but over a relatively small
region, they seem to work pretty well.
When we use response surface methods, we are
looking for an optimum.
If the optimum is a maximum, then we are hill-climbing
toward it. In this case, we take the path of steepest
ascent in the direction of the maximum increase in the
response.
If the optimum is a minimum, then we are going down
into a valley toward it. In this case, we take the path
of steepest descent in the direction of the maximum
decrease in the response.
So RSM is sequential, continuing along the path of steepest ascent (descent) until no further increase (decrease) in response is observed.
For a first-order model, the path of
steepest ascent looks like
[Contour plot: first-order response contours at 10, 20, 30, 40 in the X1–X2 plane, with the path of steepest ascent running perpendicular to the contours]
Let’s look at an example of the method
of steepest ascent.
A chemical engineer is looking to
improve the yield of his process. He
knows that two variables affect this
yield: reaction time and temperature.
Currently he uses a reaction time of 35
minutes and temperature of 155˚F and
his current yield is 40 percent.
This engineer wants to explore between
30 and 40 minutes of reaction time and
150˚ to 160˚ temperature.
To have the variables coded in the usual way (−1, +1), he uses

X1 = (ξ1 − 35) / 5
X2 = (ξ2 − 155) / 5
He decides to use a 2^2 factorial design with 5 center points. His data are

Natural variables     Coded variables     Response
ξ1      ξ2            X1      X2          Y
30      150           -1      -1          39.3
30      160           -1      +1          40.0
40      150           +1      -1          40.9
40      160           +1      +1          41.5
35      155            0       0          40.3
35      155            0       0          40.5
35      155            0       0          40.7
35      155            0       0          40.2
35      155            0       0          40.6
The fitted model is

Ŷ = 40.44 + 0.775 X1 + 0.325 X2

and the ANOVA table is

Source            SS       df   MS       p
Regression        2.8250    2   1.4125   0.0002
Residual          0.1772    6
  Interaction     0.0025    1   0.0025   0.8215
  Pure quadratic  0.0027    1   0.0027   0.8142
  Pure error      0.1720    4   0.0430
Total             3.0022    8
Note that this ANOVA table finds the two β coefficients significant. The error SS is obtained from the center points in the usual way. The interaction SS is found by computing

β₁₂ = ¼[(+1)(39.3) + (+1)(41.5) + (−1)(40.0) + (−1)(40.9)] = −0.025

SS_interaction = (4 × (−0.025))² / 4 = 0.0025

which was not significant.
The pure quadratic test comes from comparing the average of the four points in the 2^2 factorial design, 40.425, with the average of the center points, 40.46.

Ȳ_f − Ȳ_c = 40.425 − 40.46 = −0.035

SS_quad = n_f n_c (Ȳ_f − Ȳ_c)² / (n_f + n_c) = (4)(5)(−0.035)² / (4 + 5) = 0.0027

This quadratic effect is not significant.
The purpose of testing the interaction and the quadratic effects is to make sure that a first-order model is adequate.
To move away from the design center (0,0)
along the path of steepest ascent, we move
0.775 units in the X1 direction for every 0.325
units in the X2 direction. That is, the slope of
the path of steepest ascent is 0.325 / 0.775 =
0.42.
Now the engineer decides to use 5 minutes
as the basic step size for reaction time. So
when coded, this step size = 1. This changes
the step size for X2 to 0.42, the slope of the
path of steepest ascent.
Now the engineer computes points along the path until the response decreases.

                Coded variables    Natural variables    Response
Step            X1      X2         ξ1      ξ2           Y
Origin          0       0          35      155
Step size Δ     1       0.42        5        2
Origin + 1Δ     1       0.42       40      157          41.0
Origin + 2Δ     2       0.84       45      159          42.9
Origin + 3Δ     3       1.26       50      161          47.1
Origin + 4Δ     4       1.68       55      163          49.7
Origin + 5Δ     5       2.10       60      165          53.8
Origin + 6Δ     6       2.52       65      167          59.9
Origin + 7Δ     7       2.94       70      169          65.0
Origin + 8Δ     8       3.36       75      171          70.4
Origin + 9Δ     9       3.78       80      173          77.6
Origin + 10Δ    10      4.20       85      175          80.3
Origin + 11Δ    11      4.62       90      177          76.2
Origin + 12Δ    12      5.04       95      179          75.1
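A short sketch of how these path points can be generated (the step sizes are those chosen above):

```python
# Direction of steepest ascent is proportional to the fitted coefficients
b1, b2 = 0.775, 0.325
dx1 = 1.0               # chosen basic step: 5 minutes = 1 coded unit of X1
dx2 = dx1 * b2 / b1     # ~0.42 coded units of X2 per step

for step in range(1, 13):
    time = 35 + 5 * (step * dx1)   # decode: xi1 = 35 + 5*X1
    temp = 155 + 5 * (step * dx2)  # decode: xi2 = 155 + 5*X2
    print(step, time, round(temp, 1))
# step 1 -> (40, ~157), step 10 -> (85, ~176); the engineer ran the nearest
# convenient whole-degree temperatures shown in the table
```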
These computed results are shown in the plot
below. Note that from steps 1 through 10,
the response is increasing, but at step 11 it
begins to decrease and continues this in step
12.
[Plot: yield vs. steps along the path of steepest ascent, rising from about 41 at step 1 to about 80 at step 10, then decreasing at steps 11 and 12]
From these computations, it is clear that the maximum is somewhere close to (reaction time 85, temp 175), the natural values of X1 and X2 at step 10.
So the next experiment is designed in
the vicinity of (85, 175). We still retain
the first-order model because there
was nothing to refute it in the first
experiment.
This time the region of exploration for 1
is (80,90) and for  2 , it is (170, 180). So the
coded variables are
X1 
X2 
1  85
5
 2  175
5
Again, the design used is a 2^2 factorial with 5 center points. The data for this second experiment are

Natural variables     Coded variables     Response
ξ1      ξ2            X1      X2          Y
80      170           -1      -1          76.5
80      180           -1      +1          77.0
90      170           +1      -1          78.0
90      180           +1      +1          79.5
85      175            0       0          79.9
85      175            0       0          80.3
85      175            0       0          80.0
85      175            0       0          79.7
85      175            0       0          79.8
The ANOVA table for this design is

Source          SS      df   MS      p
Regression      5.00     2
Residual       11.12     6
  Interaction   0.25     1    0.25   0.0955
  Pure quad    10.66     1   10.66   0.0001
  Pure error    0.21     4    0.05
Total          16.12     8
The first-order model fitted to the coded values is

Ŷ = 78.97 + 1.00 X1 + 0.50 X2
The first-order model is now in question because the pure quadratic term is significant in the region tested. We may be getting the curvature because we are near the optimum. Now we need further analysis to reach the optimum.
To explore further, we need a second-order model. To analyze a second-order model, we need a different kind of design from the one we’ve been using for our first-order model.

A 2^2 factorial with 5 center points doesn’t have enough points to fit a second-order model. We must augment the original design with four axial points to get a central composite design or CCD.
The data for this third (CCD) experiment are
Natural variables     Coded variables      Response
ξ1       ξ2           X1        X2         Y
80       170          -1        -1         76.5
80       180          -1        +1         77.0
90       170          +1        -1         78.0
90       180          +1        +1         79.5
85       175           0         0         79.9
85       175           0         0         80.3
85       175           0         0         80.0
85       175           0         0         79.7
85       175           0         0         79.8
92.07    175           1.414     0         78.4
77.93    175          -1.414     0         75.6
85       182.07        0         1.414     78.5
85       167.93        0        -1.414     77.0
The CCD design is
[CCD diagram: factorial points (-1,-1), (-1,+1), (+1,-1), (+1,+1); center point (0,0); axial points (1.414, 0), (-1.414, 0), (0, 1.414), (0, -1.414) in the X1–X2 plane]
The ANOVA table for this CCD design is

Source          SS      df   MS      p
Regression      28.25    5   5.649   <0.001
  Intercept              1
  A: Time                1           <0.001
  B: Temp                1           <0.001
  A²                     1           <0.001
  B²                     1           <0.001
  AB                     1           0.103
Residual         0.50    7   0.071
  Lack of fit    0.28    3   0.095   0.289
  Pure error     0.22    4   0.053
Total           28.75   12
The quadratic model is significant, but
not the interaction.
The final equation in terms of coded values is

Y = 79.94 + 0.995 X1 + 0.515 X2 − 1.376 X1² − 1.001 X2² + 0.25 X1X2

and in terms of actual values

Yield = −1430.69 + 7.81(time) + 13.27(temp) − 0.055(time²) − 0.04(temp²) + 0.01(time × temp)
The optimum turns out to be very near
175˚F and 85 minutes of reaction time,
where the response is maximized.
When the experiment is relatively close to the optimum, the second-order model is usually sufficient to find it:

Ŷ = β₀ + Σ_{j=1}^{J} β_j X_j + Σ_{j=1}^{J} β_jj X²_j + Σ Σ_{j<k} β_jk X_j X_k
How do we use this model to find the optimum point? This point will be the set of X's for which the partial derivatives are all zero:

∂Ŷ/∂X₁ = ∂Ŷ/∂X₂ = ... = ∂Ŷ/∂X_K = 0

This point is called the stationary point. It could be a maximum, a minimum, or a saddle point.
We write the second-order model in matrix notation:

Ŷ = β̂₀ + X′B + X′QX

where

X = [X₁, X₂, ..., X_K]′ is the K×1 vector of the variables,
B = [β̂₁, β̂₂, ..., β̂_K]′ is the K×1 vector of first-order regression coefficients, and
Q is a K×K symmetric matrix.

In Q, the diagonal elements are the pure quadratic coefficients β̂_jj and the off-diagonal elements are ½ the interaction coefficients, β̂_jk/2.
The derivative of Ŷ with respect to X, equated to 0, is

∂Ŷ/∂X = B + 2QX = 0

The stationary point is the solution

X_s = −½ Q⁻¹ B

and we can find the response at the stationary point by

Ŷ_s = β̂₀ + ½ X_s′ B
After we find the stationary point, we
want to know if the point is a maximum,
a minimum, or a saddle point.
Moreover, we want to know the relative
sensitivity of the response to the
variables X.
The easiest way to do this is to examine
the response surface or the contour plot
of the response surface. With only two
variables, this is easy, but with more
than two, we need another method.
The formal method is called canonical
analysis. It consists of first
transforming the model so that the
stationary point is at the origin.
Then we rotate the axes about this new
origin until they are parallel to the
principal axes of the fitted response
surface.
The result of this transformation and rotation
is the canonical form of the model
Yˆ  Yˆs  1 w12  2 w22  ...  K w K2
where the {wj} are the transformed, rotated
independent variables and the {λj} are the
eigenvalues of the matrix Q .
If all the {λj} are positive, the stationary point
is a minimum. If all the {λj} are negative,
the stationary point is a maximum.
If the {λj} have mixed signs, the stationary
point is a saddle point.
The magnitude of the {λj} is important
as well. The surface is steepest in the
wj direction for which |λj| is greatest.
Continuing in our example, recall that the final equation in terms of coded values is

$$\hat{Y} = 79.94 + 0.995X_1 + 0.515X_2 - 1.376X_1^2 - 1.001X_2^2 + 0.25X_1X_2$$
In this case,

$$\mathbf{X} = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}, \qquad
\mathbf{B} = \begin{bmatrix} 0.995 \\ 0.515 \end{bmatrix}, \qquad
\mathbf{Q} = \begin{bmatrix} -1.376 & 0.125 \\ 0.125 & -1.001 \end{bmatrix}$$
so the stationary point is

$$\mathbf{X}_s = -\tfrac{1}{2}\mathbf{Q}^{-1}\mathbf{B} = -\tfrac{1}{2}\begin{bmatrix} -0.7345 & -0.0917 \\ -0.0917 & -1.0096 \end{bmatrix}\begin{bmatrix} 0.995 \\ 0.515 \end{bmatrix} = \begin{bmatrix} 0.389 \\ 0.306 \end{bmatrix}$$

which is the stationary point in coded values.
In the natural values,

$$0.389 = \frac{\xi_1 - 85}{5} \;\Rightarrow\; \xi_1 = 86.95 \approx 87$$

$$0.306 = \frac{\xi_2 - 175}{5} \;\Rightarrow\; \xi_2 = 176.53 \approx 176.5$$

So the stationary point is 87 minutes of reaction time and a temperature of 176.5˚F.
Now we can use the canonical analysis to see
whether this is a max, min, or saddle point.
We can take the roots of the determinantal equation |Q − λI| = 0, or

$$\begin{vmatrix} -1.3770 - \lambda & 0.1250 \\ 0.1250 & -1.0018 - \lambda \end{vmatrix} = 0$$

which is simply the quadratic equation

$$\lambda^2 + 2.3788\lambda + 1.3639 = 0$$

whose roots are λ1 = −0.9641 and λ2 = −1.4147.
From these roots, we get the canonical form of the fitted model

$$\hat{Y} = 80.21 - 0.9641w_1^2 - 1.4147w_2^2$$
Since λ1 and λ2 are both negative within the region of exploration, we know that the stationary point is a maximum.
The yield is more sensitive to changes
in w2 than to changes in w1.
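The whole canonical analysis can be checked numerically. Here is a minimal numpy sketch (variable names ours) built from the coded-model coefficients; because it uses the rounded coefficients, its output matches the hand calculation above only approximately.

```python
# Canonical analysis: locate the stationary point, predict the response
# there, and classify the point from the eigenvalues of Q.
import numpy as np

b0 = 79.94
B = np.array([0.995, 0.515])            # first-order coefficients
Q = np.array([[-1.376, 0.125],          # diagonals: pure quadratic terms
              [ 0.125, -1.001]])        # off-diagonals: beta12 / 2

Xs = -0.5 * np.linalg.solve(Q, B)       # stationary point, coded units
Ys = b0 + 0.5 * Xs @ B                  # predicted response at Xs
lam = np.linalg.eigvalsh(Q)             # eigenvalues of symmetric Q

print(Xs)    # about [0.389, 0.306]
print(Ys)    # about 80.21
print(lam)   # about [-1.41, -0.96]; both negative -> a maximum
```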
Designs used to fit response surfaces are called response surface designs. It is critical to have the right design if you want the best fitted response surface.
The best designs have the following features:
1. Provide a reasonable distribution of points in the region of interest.
2. Allow investigation of lack of fit.
3. Allow experiments to be performed in blocks.
4. Allow designs of higher order to be built up sequentially.
5. Provide an internal estimate of error.
6. Do not require a large number of runs.
7. Do not require too many levels of the independent variables.
8. Ensure simplicity of calculation of model parameters.
Designs for first-order models include the orthogonal designs, a class of designs that minimize the variance of the β coefficients. A first-order design is orthogonal if the sum of cross-products (or the covariance) of the independent variables is 0. These include replicated 2^k designs and 2^k designs with replicated center points.
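A quick numpy check of this property, using our own illustrative 2² design with three center points: the cross-product of the factor columns is zero, so the design is orthogonal.

```python
# Verify orthogonality of a 2^2 factorial with replicated center points.
import numpy as np

X1 = np.array([-1, -1,  1, 1, 0, 0, 0])   # 4 factorial runs + 3 center runs
X2 = np.array([-1,  1, -1, 1, 0, 0, 0])

print(X1 @ X2)               # 0 -> the sum of cross-products vanishes
print(X1.sum(), X2.sum())    # each column also sums to 0
```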
The most popular type of design for
second-order models is the central
composite design, CCD.
The CCD is a 2k factorial design with nc
center points and 2k axial points added.
This is the design we used in our
example after the first-order design
proved inadequate.
There are two parameters that need to be specified in a CCD:
(1) α = n_f^{1/4}, the distance of the axial points from the design center, where n_f is the number of factorial points in the design.
(2) n_c, the number of center-point runs, which usually equals 3 or 5.
In our example, n_f = 4, so α = 4^{1/4} = 1.414, which is exactly where the axial points were placed.
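Putting the two parameters together, a CCD can be assembled mechanically. This sketch is our own illustration (function name and layout are ours); for k = 2 and n_c = 5 it reproduces the 13-run design used in the example.

```python
# Assemble a CCD: 2^k factorial points, 2k axial points at distance alpha,
# and nc center points.
import numpy as np
from itertools import product

def ccd(k, nc):
    factorial = np.array(list(product([-1, 1], repeat=k)), dtype=float)
    alpha = len(factorial) ** 0.25          # alpha = nf^(1/4)
    axial = np.zeros((2 * k, k))
    for j in range(k):
        axial[2 * j, j] = alpha             # +alpha on axis j
        axial[2 * j + 1, j] = -alpha        # -alpha on axis j
    center = np.zeros((nc, k))
    return np.vstack([factorial, axial, center])

print(ccd(2, 5))   # 13 runs; axial points at +/-1.414, as in the example
```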
Another class of designs used for fitting
response surfaces is the Box-Behnken
design. Such a design for 3 factors is
[Figure: the three-factor Box-Behnken design — runs at the midpoints of the twelve edges of the cube, plus center points; there are no runs at the cube corners.]
Note that there are no points on the corners of the
design.
This is the three-variable Box-Behnken design:

Run   X1   X2   X3
 1    -1   -1    0
 2    -1   +1    0
 3    +1   -1    0
 4    +1   +1    0
 5    -1    0   -1
 6    -1    0   +1
 7    +1    0   -1
 8    +1    0   +1
 9     0   -1   -1
10     0   -1   +1
11     0   +1   -1
12     0   +1   +1
13     0    0    0
14     0    0    0
15     0    0    0
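The construction behind this table pairs up the factors: each pair gets a 2² factorial while the remaining factor is held at 0, and center points are appended. A sketch of that construction (our own code, not from the original source):

```python
# Build a Box-Behnken design: a 2^2 factorial on each pair of factors,
# third factor at 0, plus nc center points.
import numpy as np
from itertools import combinations, product

def box_behnken(k, nc=3):
    runs = []
    for i, j in combinations(range(k), 2):      # each pair of factors
        for a, b in product([-1, 1], repeat=2): # 2^2 factorial on the pair
            row = [0] * k
            row[i], row[j] = a, b
            runs.append(row)
    runs.extend([[0] * k for _ in range(nc)])   # center points
    return np.array(runs)

print(box_behnken(3))   # 12 edge-midpoint runs + 3 center runs = 15 runs
```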
There are other response-surface
designs as well, but the most commonly
used are the CCD and the
Box-Behnken.
In all of these designs, the levels of
each factor are independent of the
levels of other factors.
But what if you have a situation where the levels of the factors are not independent of one another?
For example, in mixture experiments, the factors are components of a mixture, so their levels cannot be independent: together they constitute the entire mixture. If you use more of component A, you must use less of some other component.
That is, X1 + X2 + X3 + … + XK = 1
where there are K components in the
mixture. If K = 2, the factor space
includes all points that lie on the line
segment X1 + X2 = 1, where each
component is bounded by 0 and 1.
[Figure: the K = 2 mixture space — the segment of the line X1 + X2 = 1 running from (1, 0) to (0, 1) in the (X1, X2) plane.]
If K = 3, the mixture space is a triangle,
where the vertices are mixtures of 100%
of one component.
[Figure: the K = 3 mixture space — a triangle whose vertices are the pure blends of X1, X2, and X3.]
With 3 components of the mixture, the
experimental region can be represented on
trilinear coordinate paper as
[Figure: the three-component experimental region on trilinear coordinate paper, with gridlines marked at levels such as 0.2 and 0.8 for each of X1, X2, and X3.]
The type of design used for studying mixtures is called a simplex design. A simplex lattice design for K components uses, for each component, a set of m + 1 equally spaced values from 0 to 1. That is,
Xk = 0, 1/m, 2/m, …, 1 for k = 1, 2, …, K.
So the kth component may be the only
one (Xk =1), may not be used at all
(Xk =0), or may be somewhere in
between.
If there are K = 3 components in the mixture and m = 2, then the three levels of each component are Xk = 0, ½, 1.
Then the simplex lattice consists of the
following six runs:
(1,0,0) (0,1,0) (0,0,1)
(½,½,0) (½,0,½) (0,½,½)
They are shown in the simplex lattice
design on the next slide.
A simplex lattice for K = 3 components is

[Figure: the {3, 2} simplex lattice — the pure blends (1,0,0), (0,1,0), (0,0,1) at the vertices of the triangle and the binary blends (½,½,0), (½,0,½), (0,½,½) at the midpoints of the sides.]
Note that the three pure blends occur at
the vertices and the other three points
occur at the midpoints of the three
sides.
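The lattice can also be enumerated directly: it consists of every point whose coordinates are multiples of 1/m and sum to 1. A short sketch (our own code) that generates the six runs above:

```python
# Enumerate the {K, m} simplex lattice: coordinates in {0, 1/m, ..., 1}
# that sum to exactly 1. Fractions avoid floating-point round-off.
from itertools import product
from fractions import Fraction

def simplex_lattice(K, m):
    levels = [Fraction(i, m) for i in range(m + 1)]
    return [pt for pt in product(levels, repeat=K) if sum(pt) == 1]

for pt in simplex_lattice(3, 2):
    print(pt)   # the six runs: 3 pure blends + 3 binary blends
```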
A criticism of the simplex lattice is that
all of the runs occur on the boundary of
the region and thus include only K-1 of
the K components. When K = 3, the
pure blends include only one of the 3
components and the other runs include
only 2 of the three components.
If you augment the simplex lattice design with
additional points in the interior of the region,
it becomes a simplex centroid design, where
the mixture consists of portions of all K
components. In our K = 3 example, the design is

[Figure: the {3, 2} simplex lattice augmented with the interior centroid point (1/3, 1/3, 1/3).]
In the center point, the mixture is 1/3 of
each component.
The mixture models are constrained by

$$\sum_{k=1}^{K} X_k = 1$$

which makes them different from the usual models for response surfaces.
The linear model is

$$\hat{Y} = \sum_{k=1}^{K} \beta_k X_k$$

where the β coefficients represent the response to the pure blends, where only one component is present.
The quadratic model is

$$\hat{Y} = \sum_{k=1}^{K} \beta_k X_k + \sum_{j<k} \beta_{jk} X_j X_k$$

where the β coefficients in the nonlinear portion represent either synergistic blending (+βjk) or antagonistic blending (−βjk).
Higher-order terms, including cubic terms, are frequently necessary in mixture models because the response is often quite complex, and mixture experiments commonly include numerous points in the interior of the simplex region.
Let’s look at an example of a mixture
experiment with 3 components,
polyethylene, polystyrene, and
polypropylene, which are blended to
make yarn for draperies.
The response is the yarn elongation,
measured as kilograms of force applied.
A simplex lattice design is used, with 2 replicates at each of the pure blends and 3 replicates at each of the binary blends.
The data are

Design Point (X1, X2, X3)    Average Response
(1, 0, 0)                    11.7
(½, ½, 0)                    15.3
(0, 1, 0)                     9.4
(0, ½, ½)                    10.5
(0, 0, 1)                    16.4
(½, 0, ½)                    16.9
The fitted model is

$$\hat{Y} = 11.7X_1 + 9.4X_2 + 16.4X_3 + 19.0X_1X_2 + 11.4X_1X_3 - 9.6X_2X_3$$
The model turns out to be an adequate representation of the response. Since component 3 (polypropylene) has the largest β, it produces the yarn with the highest elongation.
Since β12 and β13 are positive, a blend of components 1 and 2, or of components 1 and 3, produces higher elongation than the pure blends alone would give. This is an example of synergistic effects.
But a blend of components 2 and 3 is
antagonistic (produces less elongation)
because β23 is negative.