# FRACTIONAL FACTORIAL ($2^{p-q}$) STUDIES

Motivation: For $p$ factors, even $2^p$ gets big fast: for $p = 10$, $2^p = 1024$.
Example Hendrix 1979 Chemtech

| Factor | Name | Levels |
|---|---|---|
| A | Coating Roll Temp | 115° vs 125° |
| B | Solvent | Recycled vs Refined |
| C | Polymer X-12 Preheat | No vs Yes |
| D | Web Type | LX-14 vs LB-17 |
| E | Coating Roll Tension | 30 vs 40 |
| F | Number of Chill Rolls | 1 vs 2 |
| G | Drying Roll Temp | 75° vs 80° |
| H | Humidity of Air Feed | 75% vs 90% |
| J | Feed Air to Dryer Preheat | Yes vs No |
| K | Dibutylfutile in Formula | 12% vs 15% |
| L | Surfactant in Formula | .5% vs 1% |
| M | Dispersant in Formula | .1% vs .2% |
| N | Wetting Agent in Formula | 1.5% vs 2.5% |
| O | Time Lapse | 10min vs 30min |
| P | Mixer Agitation Speed | 100rpm vs 250rpm |

$y$ = a measure of product cold crack resistance
$2^{15} = 32{,}768$ !!!!!!
"Solution":
Collect data for only some (a fraction) of all possible
combinations of levels of the factors.
Qualitative Points That Ought to be "Obvious" a priori:
• necessary information loss (relative to the full factorial)
• some ambiguity inevitable because of the loss
• careful planning and wise analysis needed to hold this to a minimum
Example (hypothetical) $2^{2-1}$ ... a half fraction of a $2 \times 2$ factorial
[Figure: square with corners (1), a, b and ab, on axes Factor A (−, +) and Factor B (−, +)]
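Example (hypothetical) a $2^{3-1}$ ... Suppose that $2^3$ factorial effects and
combination means are as below:
$\mu_{...} = 10$, $\alpha_2 = 3$, $\beta_2 = 1$, $\gamma_2 = 2$, $\alpha\beta_{22} = 2$, $\alpha\gamma_{22} = 0$, $\beta\gamma_{22} = 0$, $\alpha\beta\gamma_{222} = 0$
[Figure: cube plot of the combination means on axes Factor A, Factor B and Factor C (each −, +)]
$\mu_{(1)} = 6$, $\mu_a = 8$, $\mu_b = 4$, $\mu_{ab} = 14$, $\mu_c = 10$, $\mu_{ac} = 12$, $\mu_{bc} = 8$, $\mu_{abc} = 18$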
Suppose further that one gets data adequate to essentially reveal the mean
responses for combinations a, b, c and abc (the 4 corners circled above) but has
no data on the other combinations.
$\alpha_2$ = "right face average" − "grand average"
A "half-fraction version" of this might be
$\alpha_2^*$ = "available right face average" − "available grand average"
$= 13 - 10$
$= 3$
!!!!! Here $\alpha_2^* = \alpha_2$ !!!
Something for nothing?
A similar calculation for the C main effect however gives:
$\gamma_2^*$ = "available back face average" − "available grand average"
$= 4$
????
$\gamma_2^* \neq \gamma_2$
The general story behind this situation is that for this $2^{3-1}$ fractional factorial
$\alpha_2^* = \alpha_2 + \beta\gamma_{22}$
and
$\gamma_2^* = \gamma_2 + \alpha\beta_{22}$
Confounding/aliasing ... ambiguity
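
The half-fraction arithmetic for this hypothetical example can be checked numerically. The sketch below rebuilds the four available means from the stated effects and recomputes the "available" A and C effects; the variable names and the −1/+1 level coding are illustrative assumptions, not notation from the slides.

```python
# Effects of the hypothetical 2^3 system (values stated in the example).
mu, a2, b2, g2 = 10, 3, 1, 2
ab22, ag22, bg22, abg222 = 2, 0, 0, 0

def mean(a, b, c):
    """Mean response with each factor coded -1 (low) or +1 (high)."""
    return (mu + a * a2 + b * b2 + c * g2
            + a * b * ab22 + a * c * ag22 + b * c * bg22
            + a * b * c * abg222)

# The four combinations assumed to have data: a, b, c and abc.
half = {"a": (1, -1, -1), "b": (-1, 1, -1), "c": (-1, -1, 1), "abc": (1, 1, 1)}
means = {name: mean(*levels) for name, levels in half.items()}

grand = sum(means.values()) / 4
a_star = (means["a"] + means["abc"]) / 2 - grand   # available "high A" average
c_star = (means["c"] + means["abc"]) / 2 - grand   # available "high C" average

print(means)            # the four available combination means
print(a_star, c_star)   # half-fraction versions of the A and C main effects
```

Here `a_star` equals `a2 + bg22` and `c_star` equals `g2 + ab22`, which is the aliasing pattern stated above.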
Issues to be Addressed in Order to Use $2^{p-q}$ Fractional Factorials:
• how to rationally choose $\frac{1}{2^q}$ out of $2^p$ combinations for study
• how to determine the corresponding aliasing/confounding pattern
• how to do data analysis
First consider these in the context of half fractions ... then for general $q$.
Choice of standard half fractions of $2^p$ factorials:
Write out signs for specifying levels for all possible combinations of the
"first" $p - 1$ factors. Then "multiply" these together for a given
combination of the "first" factors to arrive at a corresponding level to use
for the "last" factor.
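Example ($2^{4-1}$)
With 4 two-level factors A, B, C and D one proceeds as per

| A | B | C | Product (used for D) | Combination |
|---|---|---|---|---|
| − | − | − | − | (1) |
| + | − | − | + | ad |
| − | + | − | + | bd |
| + | + | − | − | ab |
| − | − | + | + | cd |
| + | − | + | − | ac |
| − | + | + | − | bc |
| + | + | + | + | abcd |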
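
The construction rule can be sketched in a few lines of Python. This is a minimal illustration under assumed conventions (lower-case letters as run labels, −1/+1 sign coding, the hypothetical function name `half_fraction`), not code from the original notes.

```python
from itertools import product
from math import prod

def half_fraction(letters):
    """Standard half fraction: the last factor's sign is the product of
    the signs of all the other ("first") factors."""
    first, last = letters[:-1], letters[-1]
    runs = []
    for rev in product((-1, 1), repeat=len(first)):
        signs = rev[::-1]                  # first letter changes fastest (Yates order)
        label = "".join(l for l, s in zip(first, signs) if s > 0)
        if prod(signs) > 0:                # product of signs sets the last factor
            label += last
        runs.append(label or "(1)")
    return runs

print(half_fraction("abcd"))
```

Calling `half_fraction("abcde")` under the same conventions gives the 16 labels of a standard half fraction of a $2^5$ factorial.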
Example Snee in 1985 ASQC Technical Supplement

| Factor | Name | (−) vs (+) |
|---|---|---|
| A | Solvent/Reactant | low vs high |
| B | Catalyst/Reactant | .025 vs .035 |
| C | Temperature | 150° vs 160° |
| D | Reactant Purity | 92% vs 96% |
| E | pH of Reactant | 8.0 vs 8.7 |

$y$ = color index

| combination | $y$ |
|---|---|
| e | −.63 |
| a | 2.51 |
| b | −2.68 |
| abe | −1.66 |
| c | 2.06 |
| ace | 1.22 |
| bce | −2.09 |
| abc | 1.93 |
| d | 6.79 |
| ade | 6.47 |
| bde | 3.45 |
| abd | 5.68 |
| cde | 5.22 |
| acd | 9.38 |
| bcd | 4.30 |
| abcde | 4.05 |

These are data from half of all 32 combinations of 2 levels of each of the 5
factors (half of all possible labels of combinations based on the 5 letters
a, b, c, d and e are given above, namely those involving an odd number of
letters).
Snee followed the standard recommendation for choosing the half fraction.
Determining the "alias structure" of the half fraction (the implied pattern
of ambiguities):
Use a method of formal multiplication, beginning from a so-called
"generator" that represents the way in which the half fraction was chosen.
The generator is of the form
name of "last" factor ↔ product of names of "first" factors
The rules of multiplication are that
• letter × I ↔ the same letter
• letter × same letter ↔ I
Example (the $2^{3-1}$ numerical example used above)
The generator here is
C ↔ AB
We can multiply through by C to obtain the so-called "defining relation"
I ↔ ABC
This first says that the ABC 3-factor interaction $\alpha\beta\gamma_{222}$ is aliased with the
grand mean. That is, only
$\mu_{...} + \alpha\beta\gamma_{222}$
can be estimated, not $\alpha\beta\gamma_{222}$ alone.
Multiplying through the defining relation by any set of letters of interest
produces a statement of what effect(s) are "aliased with" the corresponding
effect. For example, we see that
A ↔ BC
(read "the A main effect is aliased with the BC 2-factor interaction"). Similarly
C ↔ AB
as was illustrated earlier. In fact, the whole alias structure is
I ↔ ABC
A ↔ BC
B ↔ AC
C ↔ AB
$2^3$ effects are aliased in 4 pairs.
The technical meaning of aliasing is that only sums of effects can be estimated,
not individual effects.
Example (the $2^{4-1}$ again)
With the generator
D ↔ ABC
the defining relation is
I ↔ ABCD
From this, e.g., we see that the AB 2-factor interaction is aliased with the CD
2-factor interaction.
Example Snee's $2^{5-1}$ study had generator
E ↔ ABCD
and hence defining relation
I ↔ ABCDE
From this one sees, e.g., that the AB 2-factor interaction is aliased with the
CDE 3-factor interaction.
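
The formal multiplication used in these examples amounts to a symmetric difference of letter sets, since any letter appearing twice cancels to I. A minimal sketch, assuming effects are written as upper-case words and using the hypothetical helper name `multiply`:

```python
def multiply(x, y):
    """Formal product of two effect names: letters appearing in both cancel
    (letter x same letter = I), and I acts as the identity."""
    letters = set(x.replace("I", "")) ^ set(y.replace("I", ""))
    return "".join(sorted(letters)) or "I"

# Multiplying an effect through a defining relation I <-> WORD gives its alias.
print(multiply("A", "ABC"))      # alias of A in the 2^(3-1) example
print(multiply("AB", "ABCD"))    # alias of AB in the 2^(4-1) example
print(multiply("AB", "ABCDE"))   # alias of AB in Snee's 2^(5-1) study
```

This relies on there being no factor named I, which matches the labeling used throughout these examples.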
Data Analysis for Standard Half Fractions:
Initially, temporarily ignore the "last" factor and, treating the data as a
full factorial in the "first" $p - 1$ factors, judge the statistical significance
and practical importance of estimates derived from the Yates algorithm.
Then interpret these estimates in light of the alias structure as estimates of
appropriate sums of $2^p$ effects.
Where there is some replication (not all $2^{p-1}$ sample sizes are 1) confidence
intervals can be made for the (sums of) effects.
$$\widehat{\text{effect}} \pm t \cdot s_{\text{pooled}} \cdot \frac{1}{2^{p-1}} \sqrt{\frac{1}{n_{(1)}} + \frac{1}{n_a} + \frac{1}{n_b} + \frac{1}{n_{ab}} + \cdots}$$
where
$$s^2_{\text{pooled}} = \frac{\sum (n_{\text{combination}} - 1)\, s^2_{\text{combination}}}{\sum (n_{\text{combination}} - 1)}$$
and the appropriate degrees of freedom for $t$ are
$$\sum (n_{\text{combination}} - 1) = n - 2^{p-1}$$
Lacking any replication, normal plotting of the output of the Yates algorithm
(ignoring the "last" factor) can be used in judging statistical significance.
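Example (another hypothetical $2^{3-1}$)
Suppose
$n_a = 1$, $y_a = 5$
$n_b = 2$, $\bar{y}_b = 3$, $s_b^2 = 1.5$
$n_c = 1$, $y_c = 2.5$
and $n_{abc} = 3$, $\bar{y}_{abc} = 5.5$, $s_{abc}^2 = 1.8$.
Yates applied to: 2.5, 5, 3, 5.5
$$s^2_{\text{pooled}} = \frac{0 + (2 - 1)1.5 + 0 + (3 - 1)1.8}{0 + (2 - 1) + 0 + (3 - 1)}$$
Intervals (3 degrees of freedom for $t$):
$$\widehat{\text{effect}} \pm t \cdot s_{\text{pooled}} \cdot \frac{1}{2^{3-1}} \sqrt{\frac{1}{1} + \frac{1}{2} + \frac{1}{1} + \frac{1}{3}}$$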
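
The pooled variance and interval half-width for this hypothetical $2^{3-1}$ example can be evaluated directly. In the sketch below the dictionary layout and variable names are illustrative, and the value `t = 3.182` is an assumed table value (upper 2.5% point of the $t$ distribution with 3 degrees of freedom, for 95% intervals); the slides do not state a confidence level for this example.

```python
from math import sqrt

# (sample size, sample variance) per combination; no variance when n == 1.
cells = {"a": (1, None), "b": (2, 1.5), "c": (1, None), "abc": (3, 1.8)}

df = sum(n - 1 for n, _ in cells.values())                     # degrees of freedom
s2_pooled = sum((n - 1) * s2 for n, s2 in cells.values() if n > 1) / df
root = sqrt(sum(1 / n for n, _ in cells.values()))             # sqrt(1/1 + 1/2 + 1/1 + 1/3)

t = 3.182          # ASSUMPTION: t table value for 95% two-sided intervals, 3 df
p = 3
margin = t * sqrt(s2_pooled) * root / 2 ** (p - 1)

print(df, s2_pooled, margin)
```

Any estimate from the Yates algorithm would then be reported as estimate ± `margin`.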
Example Snee's $2^{5-1}$ had no replication. Ignoring factor E temporarily,
Yates can be applied to the 16 responses exactly as listed earlier. The result is
estimates as below.

| combination | $y$ | estimate (16 divisor) |
|---|---|---|
| e | −.63 | 2.875 |
| a | 2.51 | .823 |
| b | −2.68 | −1.253 |
| abe | −1.66 | .055 |
| c | 2.06 | .384 |
| ace | 1.22 | .064 |
| bce | −2.09 | .041 |
| abc | 1.93 | .001 |
| d | 6.79 | 2.793 |
| ade | 6.47 | −.095 |
| bde | 3.45 | −.045 |
| abd | 5.68 | −.288 |
| cde | 5.22 | −.314 |
| acd | 9.38 | .186 |
| bcd | 4.30 | −.306 |
| abcde | 4.05 | −.871 |

Normal plot the (last 15) estimates ...
Tentative engineering conclusion of Snee study: For uniform color index,
attention must be paid to controlling/reducing variation in
1st, Factor D, Reactant Purity
2nd, Factor B, Catalyst/Reactant Ratio
3rd, Factor E, pH of Reactant
4th, Factor A, Solvent/Reactant Ratio
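
The estimates for Snee's data can be reproduced with a short implementation of the Yates algorithm. This is a minimal sketch; the function name `yates` and the list layout are assumptions, and the responses are entered in the order they were listed (Yates standard order for A, B, C, D, ignoring E).

```python
def yates(y):
    """Yates algorithm for 2^k responses in standard order.
    Returns estimates (divisor 2^k) in the order I, A, B, AB, C, AC, ..."""
    n = len(y)
    k = n.bit_length() - 1
    if 2 ** k != n:
        raise ValueError("number of responses must be a power of 2")
    col = list(y)
    for _ in range(k):
        # Each cycle: sums of adjacent pairs, then differences (second - first).
        sums = [col[i] + col[i + 1] for i in range(0, n, 2)]
        diffs = [col[i + 1] - col[i] for i in range(0, n, 2)]
        col = sums + diffs
    return [c / n for c in col]

snee = [-0.63, 2.51, -2.68, -1.66, 2.06, 1.22, -2.09, 1.93,
        6.79, 6.47, 3.45, 5.68, 5.22, 9.38, 4.30, 4.05]
est = yates(snee)

labels = ["I", "A", "B", "AB", "C", "AC", "BC", "ABC",
          "D", "AD", "BD", "ABD", "CD", "ACD", "BCD", "ABCD"]
for name, e in sorted(zip(labels[1:], est[1:]), key=lambda t: -abs(t[1]))[:4]:
    print(name, round(e, 3))
```

The four largest estimates in absolute value sit in the D, B, ABCD and A positions; with generator E ↔ ABCD, the ABCD position carries the E main effect plus its aliases, which matches the ranking in the tentative conclusion.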
Smaller (than half) Fractions of $2^p$ Studies ($2^{p-q}$ Fractional Factorials)
$q = 1$: half fractions
$q = 2$: quarter fractions
$q = 3$: eighth fractions ...
etc.
Issues (still):
• how to rationally choose $\frac{1}{2^q}$ of $2^p$ possible combinations of levels of $p$ 2-level factors
• how to determine the corresponding aliasing/confounding pattern
• how to do data analysis
Answers: the natural generalizations of the half fraction answers just
discussed
Choice of standard $\frac{1}{2^q}$ fractions of $2^p$ factorials:
Write out signs for specifying levels for all possible combinations of the
"first" $p - q$ factors. Pick $q$ different groups of the first $p - q$ factors.
Use the products of the signs corresponding to members of the groups to
specify levels for the "last" $q$ factors.
Example Best and Hanson 1986 ASA Meeting Presentation
development of a catalyst for producing ethyleneamines by the amination of
monoethanolamine ... $p = 5$ factors

| Factor | Name | Levels |
|---|---|---|
| A | Ni/Re Ratio | 2/1 vs 20/1 |
| B | Precipitant | (NH$_4$)$_2$CO$_3$ vs none |
| C | Calcining Temp | 300° vs 500° |
| D | Reduction Temp | 300° vs 500° |
| E | Support Used | alpha-alumina vs silica alumina |

$y$ = % water produced
$q = 2$, i.e. a $\frac{1}{4}$ fraction contemplated ... i.e. $2^{5-2} = 8$ out of the $2^5 = 32$
possible A, B, C, D, E combinations
The (somewhat arbitrary) choice was made to use ABC sign products to
choose levels of D, and BC sign products to choose levels of E.
(Other choices are possible and lead to different aliasing patterns that might for
some other studies be preferred by the engineer in charge.)
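
| A | B | C | ABC Product (for D) | BC Product (for E) | Combination |
|---|---|---|---|---|---|
| − | − | − | − | + | e |
| + | − | − | + | + | ade |
| − | + | − | + | − | bd |
| + | + | − | − | − | ab |
| − | − | + | + | − | cd |
| + | − | + | − | − | ac |
| − | + | + | − | + | bce |
| + | + | + | + | + | abcde |
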
The last column specifies those 8 combinations actually used in the study. The
data obtained were as below.
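
| combination | $y$ | $\bar{y}$ | $s^2$ |
|---|---|---|---|
| e | 8.70, 11.60, 9.00 | 9.767 | 2.543 |
| ade | 26.80 | 26.800 | |
| bd | 24.88 | 24.880 | |
| ab | 33.15 | 33.150 | |
| cd | 28.90, 30.98 | 29.940 | 2.163 |
| ac | 30.20 | 30.200 | |
| bce | 8.00, 8.69 | 8.345 | .238 |
| abcde | 29.30 | 29.300 | |
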
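
The choice of the 8 combinations from the ABC and BC sign products can be sketched as follows. The function name `fraction`, the lower-case labeling and the −1/+1 coding are illustrative assumptions; the generators are those stated for the catalyst study.

```python
from itertools import product
from math import prod

def fraction(first, generators):
    """Run labels for a 2^(p-q) fraction, in Yates order for the `first` factors.

    `generators` maps each "last" factor to the word of "first" factors whose
    sign product sets its level, e.g. {"d": "abc", "e": "bc"}.
    """
    runs = []
    for rev in product((-1, 1), repeat=len(first)):
        sign = dict(zip(first, rev[::-1]))        # first letter changes fastest
        for new, word in generators.items():
            sign[new] = prod(sign[l] for l in word)
        label = "".join(sorted(l for l, s in sign.items() if s > 0))
        runs.append(label or "(1)")
    return runs

print(fraction("abc", {"d": "abc", "e": "bc"}))
```

Swapping in a different pair of generator words yields a different set of 8 runs and, correspondingly, a different aliasing pattern.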
Determining the "alias structure" of the $\frac{1}{2^q}$ fraction:
Use the method of formal multiplication, beginning from $q$ generators that
represent the way in which the $\frac{1}{2^q}$ fraction was chosen. To find the
defining relation (the list of all products "equivalent to" I) first convert the
generators to statements of products equivalent to I, and then multiply
these in pairs, then in triples, then in sets of four, etc. The letter I will
have $2^q - 1$ equivalent products ... i.e. effects are aliased in $2^{p-q}$ different
groups of $2^q$ each.
Example Hanson and Best again
D ↔ ABC so I ↔ ABCD
E ↔ BC so I ↔ BCE
further, multiplying these two we get
I · I ↔ (ABCD) · (BCE)
i.e.
I ↔ ADE
So the defining relation for the catalyst study is
I ↔ ABCD ↔ BCE ↔ ADE
and therefore effects are aliased in 8 groups of 4. For example, multiplying
through the defining relation by A gives
A ↔ BCD ↔ ABCE ↔ DE
and we see that, for example, the A main effect is aliased with the DE 2-factor
interaction.
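
The "multiply in pairs, then triples, ..." recipe for the defining relation can be sketched with `itertools.combinations`. The helper names `multiply` and `defining_relation` are hypothetical; the generator words are the ones from the catalyst study.

```python
from itertools import combinations

def multiply(*words):
    """Formal product of effect names: letters appearing an even number
    of times cancel."""
    letters = set()
    for w in words:
        letters ^= set(w)
    return "".join(sorted(letters)) or "I"

def defining_relation(generator_words):
    """All products of the generator words taken 1, 2, ..., q at a time."""
    words = []
    for r in range(1, len(generator_words) + 1):
        for combo in combinations(generator_words, r):
            words.append(multiply(*combo))
    return words

words = defining_relation(["ABCD", "BCE"])      # from D <-> ABC and E <-> BC
aliases_of_A = [multiply("A", w) for w in words]
print(words, aliases_of_A)
```

With $q$ generators this produces $2^q - 1$ words, so the list grows quickly; it is practical only for small $q$.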
Data analysis for standard $2^{p-q}$ studies:
Initially ignore the "last" $q$ factors, and treating the data as a full factorial
in the "first" $p - q$ factors, judge the statistical significance and practical
importance of estimates produced by the Yates algorithm. Then interpret
these in light of the alias structure as estimates of appropriate sums of $2^p$
effects.
With some replication, confidence intervals can be made for the (sums of)
effects and used in the process of judging statistical significance.
$$\widehat{\text{effect}} \pm t \cdot s_{\text{pooled}} \cdot \frac{1}{2^{p-q}} \sqrt{\frac{1}{n_{(1)}} + \frac{1}{n_a} + \frac{1}{n_b} + \frac{1}{n_{ab}} + \cdots}$$
where (as always)
$$s^2_{\text{pooled}} = \frac{\sum (n_{\text{combination}} - 1)\, s^2_{\text{combination}}}{\sum (n_{\text{combination}} - 1)}$$
and the appropriate degrees of freedom for $t$ are
$$\sum (n_{\text{combination}} - 1) = n - 2^{p-q}$$
Lacking any replication, one can normal plot estimates, looking for ones
clearly of larger order of magnitude than the rest (and therefore larger than
background noise as well).
Example
Hanson and Best catalyst study
The 8 sample means, $\bar{y}$, listed before were in Yates standard order for factors
A, B and C (the "first" $p - q = 3$) ignoring D and E (the "last" $q = 2$). So the
Yates algorithm can be applied to them in the order listed.

| combination | $\bar{y}$ | estimate | sum estimated |
|---|---|---|---|
| e | 9.767 | 24.048 | grand mean + aliases |
| ade | 26.800 | 5.815 | A main effect + aliases |
| bd | 24.880 | −.129 | B main effect + aliases |
| ab | 33.150 | 1.492 | AB interaction + aliases |
| cd | 29.940 | .399 | C main effect + aliases |
| ac | 30.200 | −.511 | AC interaction + aliases |
| bce | 8.345 | −5.495 | BC interaction + aliases |
| abcde | 29.300 | 3.682 | ABC interaction + aliases |

statistical significance/detectability of these?
$$s^2_{\text{pooled}} = \frac{(3 - 1)(2.543) + (2 - 1)(2.163) + (2 - 1)(.238)}{(3 - 1) + (2 - 1) + (2 - 1)} = 1.872$$
So $s_{\text{pooled}} = \sqrt{1.872} = 1.368$, and this can be used as a measure of
background noise and as a basic ingredient of confidence intervals for the sums
of effects.
$s_{\text{pooled}}$ has 4 associated degrees of freedom. So if, e.g., 95% confidence
intervals for the sums of effects are desired, the "+/− part" of the confidence
interval formula becomes
$$\pm\, 2.776\,(1.368)\,\frac{1}{2^3} \sqrt{\frac{1}{3} + \frac{1}{1} + \frac{1}{1} + \frac{1}{1} + \frac{1}{2} + \frac{1}{1} + \frac{1}{2} + \frac{1}{1}}$$
i.e. $\pm 1.195$
We therefore might judge any estimate larger in absolute value than 1.195 to
represent a sum of effects clearly large enough to see above the background
experimental variation.
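
The pooled standard deviation and the ±1.195 margin can be recomputed from the raw catalyst data. This is a sketch using Python's standard `statistics` module; the dictionary layout is an assumption, and `t = 2.776` is the multiplier used above for 4 degrees of freedom.

```python
from math import sqrt
from statistics import variance

# Raw responses per combination, as listed in the data table.
data = {
    "e": [8.70, 11.60, 9.00], "ade": [26.80], "bd": [24.88], "ab": [33.15],
    "cd": [28.90, 30.98], "ac": [30.20], "bce": [8.00, 8.69], "abcde": [29.30],
}

df = sum(len(ys) - 1 for ys in data.values())
# Only replicated combinations contribute a sample variance.
ss = sum((len(ys) - 1) * variance(ys) for ys in data.values() if len(ys) > 1)
s_pooled = sqrt(ss / df)

t = 2.776                                   # value used in the text, 4 df
root = sqrt(sum(1 / len(ys) for ys in data.values()))
margin = t * s_pooled * root / 2 ** 3       # 2^(p-q) with p - q = 3

print(df, round(s_pooled, 3), round(margin, 3))
```

Estimates from the Yates algorithm exceeding `margin` in absolute value are the "detectable" sums of effects.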
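Note the "detectable" sums are (in order of magnitude):

| sum | estimate | Tentative interpretations? |
|---|---|---|
| $\alpha_2 + \beta\gamma\delta_{222} + \alpha\beta\gamma\epsilon_{2222} + \delta\epsilon_{22}$ | 5.815 | A main effect ?? |
| $\beta\gamma_{22} + \alpha\delta_{22} + \epsilon_2 + \alpha\beta\gamma\delta\epsilon_{22222}$ | −5.495 | E main effect ?? |
| $\alpha\beta\gamma_{222} + \delta_2 + \alpha\epsilon_{22} + \beta\gamma\delta\epsilon_{2222}$ | 3.682 | D main effect ?? or AE interaction ?? |
| $\alpha\beta_{22} + \gamma\delta_{22} + \alpha\gamma\epsilon_{222} + \beta\delta\epsilon_{222}$ | 1.492 | ?????? (happily much smaller than the other sums) |

(And there are other equally plausible interpretations of the 3 large sums!)
In fact a follow-up study confirmed the importance of the D main effect.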
If the A (Ni/Re ratio) main effect, the E (Support Type) main effect and the D
(Reduction Temp) main effect are indeed the most important determiners of $y$,
and large $y$ is desirable, the signs of the estimates indicate the need for "high
A" (20/1 Ni/Re ratio), "low E" (alpha-alumina support) and "high D" (500°
reduction temp).
Notice !!!! The larger $q$, the larger the inevitable ambiguity of interpretation of
the fractional factorial results and the more likely the need for follow-up study.
Small fractions are really most useful as screening studies, to pick a few likely
candidates out of many potentially important factors for subsequent more
detailed study.
End with an extreme example of large $q$, i.e. a small fraction.
Example Hendrix Chemtech study mentioned at the beginning
$y$ = cold crack resistance of a product
$p = 15$ factors A, B, C, D, E, F, G, H, J, K, L, M, N, O, P
(factor names and levels given earlier)
$p - q = 4$, i.e., only $2^4 = 16$ combinations were run !!!!!
This was a $\frac{1}{2^{15-4}} = \frac{1}{2048}$ fraction !!!!
The 11 generators used were:
E ↔ ABCD, F ↔ BCD, G ↔ ACD, H ↔ ABC, J ↔ ABD,
K ↔ CD, L ↔ BD, M ↔ AD, N ↔ BC, O ↔ AC,
P ↔ AB
These led to the 16 combinations and (ultimately) the data below:

| combination | $y$ |
|---|---|
| eklmnop | 14.8 |
| aghjkln | 16.3 |
| bfhjkmo | 23.5 |
| abefgkp | 23.9 |
| cfghlmp | 19.6 |
| acefjlo | 18.6 |
| bcegjmn | 22.3 |
| abchnop | 22.2 |
| dfgjnop | 17.8 |
| adefhmn | 18.9 |
| bdeghlo | 23.1 |
| abdjlmp | 21.8 |
| cdehjkp | 16.6 |
| acdgkmo | 16.7 |
| bcdfkln | 23.5 |
| abcdefghjklmnop | 24.9 |

Pretty clearly it isn't sensible to write out the whole defining relation here ...
effects are going to be aliased in 16 groups of $2^{11} = 2048$ effects.
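
Although the defining relation is too long to write out by hand, it is easy to enumerate by machine. The sketch below builds all words equivalent to I from the 11 generators by repeated formal multiplication (symmetric difference of letter sets) and lists, as an illustration, the two-factor interactions aliased with the B main effect. The variable names are assumptions, not from the study.

```python
generators = {"E": "ABCD", "F": "BCD", "G": "ACD", "H": "ABC", "J": "ABD",
              "K": "CD", "L": "BD", "M": "AD", "N": "BC", "O": "AC", "P": "AB"}

# Each generator X <-> W gives a word XW that is equivalent to I.
base = [frozenset(new + word) for new, word in generators.items()]

# Close the set of words under formal multiplication.
words = {frozenset()}                      # start from I (the empty word)
for b in base:
    words |= {w ^ b for w in words}
words.discard(frozenset())                 # drop I itself

shortest = min(len(w) for w in words)
# A three-letter word containing B aliases B with a two-factor interaction.
b_aliases = sorted("".join(sorted(w - {"B"}))
                   for w in words if len(w) == 3 and "B" in w)

print(len(words), shortest, b_aliases)
```

The count of words plus one (for I itself) gives the size of each alias group, and the shortest word length shows how low-order the confounding is.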
But for a most tentative interpretation, let's see what we might glean if the
physical system is so simple that only main effects dominate. (Physically
reasonable ???? Ask the engineer, not the statistician!)
The 16 observations are listed in Yates order for factors A, B, C and D
(ignoring the rest). We therefore begin by running them through the Yates
algorithm, with the results below.

| combination | $y$ | estimate (16 divisor) | sum estimated |
|---|---|---|---|
| eklmnop | 14.8 | 20.28 | grand mean + ⋯ |
| aghjkln | 16.3 | .13 | A + ⋯ |
| bfhjkmo | 23.5 | 2.87 | B + ⋯ |
| abefgkp | 23.9 | −.08 | AB + P + ⋯ |
| cfghlmp | 19.6 | .27 | C + ⋯ |
| acefjlo | 18.6 | −.08 | AC + O + ⋯ |
| bcegjmn | 22.3 | −.19 | BC + N + ⋯ |
| abchnop | 22.2 | .36 | ABC + H + ⋯ |
| dfgjnop | 17.8 | .13 | D + ⋯ |
| adefhmn | 18.9 | .03 | AD + M + ⋯ |
| bdeghlo | 23.1 | .04 | BD + L + ⋯ |
| abdjlmp | 21.8 | −.06 | ABD + J + ⋯ |
| cdehjkp | 16.6 | −.26 | CD + K + ⋯ |
| acdgkmo | 16.7 | .29 | ACD + G + ⋯ |
| bcdfkln | 23.5 | 1.06 | BCD + F + ⋯ |
| abcdefghjklmnop | 24.9 | .11 | ABCD + E + ⋯ |

There is no replication in this data set ... so we're driven to normal plotting in
order to judge statistical significance of these estimates.
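
As a cross-check on the Yates output, each estimate can also be computed directly as a signed average of the 16 responses. The sketch below does this with bit masks (bit 0 for A, bit 1 for B, and so on) and ranks the 15 estimates by absolute size; the function names and the bit-mask encoding are illustrative assumptions.

```python
y = [14.8, 16.3, 23.5, 23.9, 19.6, 18.6, 22.3, 22.2,
     17.8, 18.9, 23.1, 21.8, 16.6, 16.7, 23.5, 24.9]   # Yates order for A, B, C, D
first = "ABCD"

def label(mask):
    """Effect name for a bit mask, e.g. 0b1110 -> 'BCD'."""
    return "".join(l for i, l in enumerate(first) if mask >> i & 1)

def estimate(mask):
    """Signed average of y: the sign for a run is the product of the
    -1/+1 levels of the factors named in `mask`."""
    total = 0.0
    for run, value in enumerate(y):
        lows = bin(mask & ~run).count("1")   # factors in mask at their low level
        total += value if lows % 2 == 0 else -value
    return total / len(y)

estimates = {label(m): estimate(m) for m in range(1, 16)}
ranked = sorted(estimates, key=lambda k: abs(estimates[k]), reverse=True)
print([(k, round(estimates[k], 2)) for k in ranked[:3]])
```

The two largest estimates sit in the B and BCD positions; under the generator F ↔ BCD, the BCD position carries the F main effect together with its aliases.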
A normal plot of the (last 15) estimates is:
[Figure: normal plot of the last 15 estimates]
Tentative interpretation: The most important factors appear to be
B (Solvent) and F (Number of Chill Rolls)
and for large cold crack resistance "high B" (refined solvent) and "high F" (2
chill rolls) appear best.
(Note that the analysis does point out what is in retrospect quite obvious,
namely that it is those combinations in the data set with "high B" and "high F"
that have the largest $y$'s.)