Lenth's Analysis of Unreplicated Factorial Experiments

When analysing 2^k factorial experiments with replicated measurements at each treatment
combination, an estimate of σ may be calculated from the replicates at each treatment
combination, and these estimates may be combined using the "root mean square" (RMS)
formula to provide an overall estimate.
When there is no replication, that is, there is just one measurement of the response at each
treatment combination, this is not possible.
The traditional solution to this problem was to construct Normal plots of the estimated effects
and judge as "significant" those that departed from linearity by a substantial distance. Following
such a determination of the "significant" effects, an estimate of σ could be found by fitting a
model corresponding to the "significant" effects, on the basis that the remaining effects reflected
chance variation. A variation on this method may be used in the case that one or more factors
appear inactive, that is, no main effect or interaction involving those factors appears
"significant". In that case, the observations at the different levels of those inactive factors may
be regarded as replicates at each combination of levels of the active factors and a full factorial
analysis conducted.
In 1989, an alternative approach was proposed by Russell Lenth [1]. That approach has gained
popularity and is included as an option in several software packages, including Minitab. The
method is explained here in the context of an example taken from BHH, Chapter 5, Sections
5.13, 5.14, pp. 199-208.
Illustrative example
As part of a chemical process development exercise, the effects of changing four process input
factors on process yield (%) were studied in a 2^4 experiment. The factors included in the
study, along with their experimental levels, were
A: Catalyst Charge (lb), 10 and 15,
B: Temperature (°C), 220 and 240,
C: Reactant Concentration (%), 10 and 12,
D: Pressure (psi), 50 and 80.
A response was measured at each of the 16 reaction conditions, as listed below. The run order
was randomised, as also shown.
Design   Catalyst      Temperature   Concentration   Pressure   Run     Yield
Point    Charge (lb)   (°C)          (%)             (psi)      Order   (%)
  1         10            220            10             50         8      70
  2         15            220            10             50         2      60
  3         10            240            10             50        10      89
  4         15            240            10             50         4      81
  5         10            220            12             50        16      60
  6         15            220            12             50         5      49
  7         10            240            12             50        11      88
  8         15            240            12             50        14      82
  9         10            220            10             80        15      69
 10         15            220            10             80         9      62
 11         10            240            10             80         1      88
 12         15            240            10             80        13      81
 13         10            220            12             80         3      60
 14         15            220            12             80        12      52
 15         10            240            12             80         6      86
 16         15            240            12             80         7      79

[1] Russell V. Lenth (1989), "Quick and Easy Analysis of Unreplicated Factorials", Technometrics, 31(4),
469-473.
Normal plot of effects
The Minitab DOE command may be used to calculate estimates of all 15 main effects and
interactions. The results follow.
Term        Effect
A            -8.00
B            24.00
C            -5.50
D            -0.25
A*B           1.00
A*C           0.00
A*D           0.75
B*C           4.50
B*D          -1.25
C*D          -0.25
A*B*C         0.50
A*B*D        -0.75
A*C*D        -0.25
B*C*D        -0.75
A*B*C*D      -0.25
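The effect estimates can also be reproduced outside Minitab. The following is a minimal Python sketch, not Minitab's own procedure: it assumes the yields are entered in design-point order (1 to 16), codes each factor as -1 (low) or +1 (high), and takes each effect as the mean yield where the term's contrast is +1 minus the mean yield where it is -1. The names `yields`, `runs` and `estimate_effects` are introduced here for illustration only.

```python
from itertools import combinations, product
from math import prod

# Yields for design points 1-16 (A changes fastest, then B, then C, then D).
yields = [70, 60, 89, 81, 60, 49, 88, 82, 69, 62, 88, 81, 60, 52, 86, 79]
factors = "ABCD"

# Coded levels (-1 = low, +1 = high) in the same order as the yields.
# product() varies its last element fastest, so each tuple is reversed
# to make factor A (index 0) the fastest-changing one.
runs = [levels[::-1] for levels in product((-1, 1), repeat=len(factors))]


def estimate_effects(runs, y):
    """Return {term: effect} for all main effects and interactions."""
    half = len(y) / 2
    effects = {}
    for order in range(1, len(factors) + 1):
        for cols in combinations(range(len(factors)), order):
            term = "*".join(factors[c] for c in cols)
            # Contrast column = product of the coded levels of the factors involved.
            contrast = sum(prod(run[c] for c in cols) * yi for run, yi in zip(runs, y))
            effects[term] = contrast / half
    return effects


effects = estimate_effects(runs, yields)
for term, value in effects.items():
    print(f"{term:8s} {value:6.2f}")
```

The terms are generated in the same order as in the list above (main effects first, then two-, three- and four-factor interactions), so the printed values can be compared line by line.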
If all effects were null, these 15 effect estimates would constitute a simple random sample from
a Normal distribution with constant standard deviation. To assess this, a Normal plot may be
drawn, as follows.
[Figure: Normal probability plot of the effects. Vertical axis: Effect; horizontal axis: Score.]
This suggests one highly significant effect and three others that also appear to deviate
substantially from the rest. From the list of effects above, these may be identified as the A, B
and C main effects and the BC interaction.
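Since factor D does not appear in any of these four effects, the variation on the traditional method described earlier could be tried here: the two pressure levels are treated as replicates within each combination of A, B and C, and the within-cell variances are combined by the "root mean square" formula. The sketch below is only an illustration under that assumption (D and all its interactions inactive); the variable names are introduced here and are not part of the original analysis.

```python
from math import sqrt
from statistics import variance

# Yields for design points 1-16. Points 1-8 were run at D = 50 psi and
# points 9-16 repeat the same A, B, C settings at D = 80 psi.
yields = [70, 60, 89, 81, 60, 49, 88, 82, 69, 62, 88, 81, 60, 52, 86, 79]

# Assumption: D is inactive, so each A-B-C combination has two replicates.
cells = [(yields[i], yields[i + 8]) for i in range(8)]

# "Root mean square" pooling: average the cell variances, then take the square root.
cell_variances = [variance(cell) for cell in cells]
pooled_sd = sqrt(sum(cell_variances) / len(cell_variances))

# Each effect is a difference between two averages of 8 yields, so its
# standard error is sqrt(sigma^2/8 + sigma^2/8) = sigma / 2.
effect_se = pooled_sd / 2

print(f"pooled SD = {pooled_sd:.3f}, effect SE = {effect_se:.3f}")
```

With these data the pooled standard deviation is about 1.32, giving an effect standard error of about 0.66, based on 8 degrees of freedom (one per cell).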
Lenth's analysis
Using the Lenth's analysis option in the Minitab DOE command results in the following plots.
[Figure: Normal Plot of the Effects (response is Yield, Alpha = 0.05). Vertical axis: Effect;
horizontal axis: Score. Points labelled as significant: B, BC, C and A. Lenth's PSE = 0.75.]
[Figure: Pareto Chart of the Effects (response is Yield, Alpha = 0.05). Horizontal axis: Effect;
reference line at 1.93. Terms in decreasing order of magnitude: B, A, C, BC, BD, AB, BCD, ABD,
AD, ABC, D, ABCD, CD, ACD, AC. Lenth's PSE = 0.75.]
The first is an enhanced version of the Normal plot of effects, in which the effects that are
significant according to Lenth's analysis are identified.
The second is a Pareto chart, designed to show the separation of the "vital few" from the "trivial
many", in accordance with the "Pareto Principle" popularised by the quality guru Juran. In this
regard there does appear to be a sharp distinction between the four biggest (in magnitude)
effects and the rest. The placement of Lenth's "margin of error" at 1.93 corresponds with this.
The "margin of error" is the product of Lenth's estimated standard error, PSE = 0.75, and a
critical t-value.
Lenth's analysis explained
The basis for Lenth's analysis is that, given several Normal values with mean 0 and common
standard deviation, and given their absolute values (magnitudes, or values without signs), then it
may be shown that
SD(Normal values) ≈ 1.5 × median(Absolute values).
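The multiplier 1.5 can be checked numerically. For a Normal variable with mean 0, the median of the absolute values equals the 75th percentile of the variable itself, which is about 0.6745 standard deviations, so the exact multiplier is about 1.48 and 1.5 is a convenient rounding. The sketch below shows this with the Python standard library; the value of `sigma`, the sample size and the random seed are arbitrary choices made for the illustration.

```python
import random
from statistics import NormalDist, median

# Exact constant: median(|Z|) for standard Normal Z is the 75th percentile of Z.
exact_multiplier = 1 / NormalDist().inv_cdf(0.75)
print(f"exact multiplier = {exact_multiplier:.4f}")

# Simulation check with a hypothetical sigma and a fixed seed for repeatability.
rng = random.Random(1)
sigma = 2.0
sample = [rng.gauss(0, sigma) for _ in range(100_000)]
estimate = 1.5 * median(abs(x) for x in sample)
print(f"1.5 x median(|x|) = {estimate:.3f} for sigma = {sigma}")
```

The simulated estimate should land slightly above the true standard deviation, reflecting the rounding of 1.48 up to 1.5.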
If effects were null, then this could be applied directly to the estimated effects. The list shown
above may be adjusted to show just the absolute values of the effects,
Term        |Effect|
A             8.00
B            24.00
C             5.50
D             0.25
A*B           1.00
A*C           0.00
A*D           0.75
B*C           4.50
B*D           1.25
C*D           0.25
A*B*C         0.50
A*B*D         0.75
A*C*D         0.25
B*C*D         0.75
A*B*C*D       0.25
and then sorted in increasing order of the absolute values,
Term        |Effect|
A*C           0.00
D             0.25
C*D           0.25
A*C*D         0.25
A*B*C*D       0.25
A*B*C         0.50
A*D           0.75
A*B*D         0.75
B*C*D         0.75
A*B           1.00
B*D           1.25
B*C           4.50
C             5.50
A             8.00
B            24.00
In a sorted list of 15, the 8th is the median, here 0.75, so that Lenth's estimate of effect standard
error would be
1.5 × 0.75 = 1.125.
Since it is unlikely that all effects are null, Lenth refines this estimate by excluding all effects that
exceed 2.5 times this estimate (on the basis that the chances of a null estimate exceeding this
value are negligible) and repeating the exercise on the remaining effects.
Since
2.5 × 1.125 = 2.8125,
this entails excluding the last four effects in the ordered list, finding the median of the
remaining 11 absolute effects, which is 0.5 (the 6th in the ordered list), and calculating
PSE = 1.5 × 0.5 = 0.75.
Here, PSE stands for "Pseudo Standard Error".
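The two-step calculation above can be written as a short function. This is a minimal Python sketch of the steps as described in this section, not Minitab's code; `lenth_pse` is a name introduced here, and an effect exactly equal to the cutoff would be retained, which is an assumption consistent with the word "exceed".

```python
from statistics import median

# Effect estimates from the Minitab output for the illustrative example.
effects = {
    "A": -8.00, "B": 24.00, "C": -5.50, "D": -0.25,
    "A*B": 1.00, "A*C": 0.00, "A*D": 0.75,
    "B*C": 4.50, "B*D": -1.25, "C*D": -0.25,
    "A*B*C": 0.50, "A*B*D": -0.75, "A*C*D": -0.25, "B*C*D": -0.75,
    "A*B*C*D": -0.25,
}


def lenth_pse(effect_values):
    """Return (initial estimate, refined PSE) following the two steps in the text."""
    magnitudes = [abs(e) for e in effect_values]
    # Step 1: initial estimate from the median of all absolute effects.
    initial = 1.5 * median(magnitudes)
    # Step 2: drop effects exceeding 2.5 x the initial estimate, then recompute.
    retained = [m for m in magnitudes if m <= 2.5 * initial]
    return initial, 1.5 * median(retained)


initial, pse = lenth_pse(effects.values())
print(initial, pse)
```

For the example data this prints 1.125 and 0.75, matching the hand calculation.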
To convert this into a critical value for determining "significant" effects, Lenth multiplies by a
critical t-value with degrees of freedom equal to the total number of effects divided by 3. This
formula was arrived at by a combination of probabilistic simulation, trial and error and
judgement.
In this case, it leads to
df = 15 / 3 = 5,
and
t(0.05, 5) = 2.57 (the two-sided 5% critical value of t with 5 degrees of freedom),
so that the critical value for effects is
2.57 × 0.75 = 1.9275,
rounded to 1.93 by Minitab.
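The final step can be sketched in Python as well. To keep the example free of third-party packages, the t critical value is found numerically (Simpson's rule for the distribution function, then bisection); the helpers `t_pdf`, `t_cdf` and `t_quantile` are written for this illustration and are an assumption about how one might compute the value, not a description of Minitab's internals. The PSE of 0.75 is taken from the calculation above.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: 0.5 plus Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Quantile for 0.5 < p < 1, found by bisection on the distribution function."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


effects = {
    "A": -8.00, "B": 24.00, "C": -5.50, "D": -0.25,
    "A*B": 1.00, "A*C": 0.00, "A*D": 0.75,
    "B*C": 4.50, "B*D": -1.25, "C*D": -0.25,
    "A*B*C": 0.50, "A*B*D": -0.75, "A*C*D": -0.25, "B*C*D": -0.75,
    "A*B*C*D": -0.25,
}
pse = 0.75                       # Lenth's PSE from the previous step
df = len(effects) / 3            # 15 / 3 = 5 degrees of freedom
t_crit = t_quantile(0.975, df)   # two-sided 5% critical value
margin = t_crit * pse            # Lenth's "margin of error"

significant = [term for term, e in effects.items() if abs(e) > margin]
print(f"t = {t_crit:.2f}, margin of error = {margin:.2f}, significant: {significant}")
```

Because the helpers use the gamma function, they also accept a non-integer value of m/3, which arises whenever the number of effects is not a multiple of 3.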
In summary
Lenth's PSE, or pseudo standard error, is used in calculating a critical value for the effects when
there are no replicates. It is based on the fact that the standard deviation of a sample from a
N(0, σ) distribution may be estimated as 1.5 × median(absolute values). In case some effects are
non-null, a refinement is to delete effects that exceed 2.5 times this estimate and recompute. To
find the critical value for the effects, the PSE is multiplied by the appropriate critical value for t
with m/3 degrees of freedom, where m is the number of effects being assessed.