Confounding

In an unreplicated 2^k design there are 2^k treatment combinations (t.c.’s). Consider 3 factors at 2 levels each: 8 t.c.’s.
If each requires 2 hours to run, 16 hours will be required. Over such a long time period there could be, say, a change in personnel. Let’s say we run 8 hours Monday and 8 hours Tuesday. Hence: 4 observations on each of two days
(or 4 observations in each of 2 factories)
(or 4 observations in each of 2 [potentially different] plots of land)
(or two different groups answering different sets of questions in a survey).
Replace one (“large”) block by 2 smaller blocks.
Consider 1, a, b, ab, c, ac, bc, abc, allocated to Monday (M) and Tuesday (T) in three different ways:

      Design 1        Design 2        Design 3
      M     T         M     T         M     T
      1     c         1     a         1     a
      a     ac        ab    b         ab    b
      b     bc        c     ac        ac    c
      ab    abc       abc   bc        bc    abc

Does it matter? Which is preferable? Why?
The block with the “1” observation (everything
at low level) is called the “Principal block”. (It
has equal stature with other blocks, but is useful
to identify.)
Assume all Monday yields are higher than
Tuesday yields by a (near) constant but
unknown amount X. (X is in units of the
dependent variable under study).
What are the consequences of having 2 smaller blocks?
Again consider

      M     T
      1     a
      ab    b
      ac    c
      bc    abc

Usual estimate:
A = (1/4) [-1 + a - b + ab - c + ac - bc + abc]
NOW BECOMES
A = (1/4) [-(1 + X) + a - b + (ab + X) - c + (ac + X) - (bc + X) + abc]
  = (usual estimate)    [X’s cancel out]

Usual ABC = (1/4) [-1 + a + b - ab + c - ac - bc + abc]
NOW BECOMES
ABC = (1/4) [-(1 + X) + a + b - (ab + X) + c - (ac + X) - (bc + X) + abc]
    = (usual estimate) - X
We would find that we estimate
A, B, AB, C, AC, BC, and (ABC - X): only the ABC estimate is shifted.
Switch M & T, and ABC - X becomes ABC + X.
Replacement of one block by 2 smaller blocks
requires the “sacrifice” (confounding) of (at
least) one effect.
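The cancellation can be checked numerically. The sketch below is only an illustration: the eight base yields and the value X = 5 are made-up numbers, and the helper names (sign, estimate, effects) are hypothetical. It adds X to every Monday yield (1, ab, ac, bc) and compares each of the 7 estimates with and without the day effect.

```python
from itertools import combinations

def sign(effect, tc):
    """+1 or -1: the sign with which t.c. `tc` enters the estimate of `effect`."""
    s = 1
    for letter in effect.lower():
        s *= 1 if letter in tc else -1
    return s

def effects(factors="abc"):
    """A, B, C, AB, AC, BC, ABC."""
    return ["".join(c).upper()
            for r in range(1, len(factors) + 1)
            for c in combinations(factors, r)]

def estimate(effect, yields):
    """Usual estimate: signed sum of the yields divided by 2^(k-1)."""
    return sum(sign(effect, tc) * y for tc, y in yields.items()) / (len(yields) / 2)

# Made-up yields, for illustration only (not data from the slides).
base = {"1": 10.0, "a": 12.0, "b": 11.0, "ab": 15.0,
        "c": 9.0, "ac": 14.0, "bc": 13.0, "abc": 18.0}
X = 5.0                                # assumed Monday-minus-Tuesday difference
monday = {"1", "ab", "ac", "bc"}       # the allocation shown above

observed = {tc: y + (X if tc in monday else 0.0) for tc, y in base.items()}
shift = {e: estimate(e, observed) - estimate(e, base) for e in effects()}

for e, s in shift.items():
    print(f"{e:>3}: shift = {s:+.1f}")
```

Under these assumptions only ABC moves, and it moves by exactly -X.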
      M     T         M     T         M     T
      1     c         1     a         1     a
      a     ac        ab    b         ab    b
      b     bc        c     ac        ac    c
      ab    abc       abc   bc        bc    abc

Confounded Effects:
      Only C          Only AB         Only ABC
      M     T
      1     ab
      a     c
      b     bc
      ac    abc

Confounded Effects: B, C, AB, AC
(4 out of 7, instead of 1 out of 7)
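One way to see which effects an allocation sacrifices, sketched here with hypothetical helper names: an effect stays clean only if the t.c.’s in one block carry as many plus signs as minus signs for that effect, so that the block constant cancels. The four allocations below are the Monday blocks of the designs shown above.

```python
from itertools import combinations

def sign(effect, tc):
    """+1 or -1: the sign with which t.c. `tc` enters the estimate of `effect`."""
    s = 1
    for letter in effect.lower():
        s *= 1 if letter in tc else -1
    return s

def confounded_effects(block, factors="abc"):
    """Effects whose signs do not sum to zero within `block` (one of two blocks)."""
    all_effects = ["".join(c).upper()
                   for r in range(1, len(factors) + 1)
                   for c in combinations(factors, r)]
    # Signs sum to zero over all t.c.'s, so checking one block is enough.
    return [e for e in all_effects if sum(sign(e, tc) for tc in block) != 0]

monday_blocks = {
    "design 1": ["1", "a", "b", "ab"],
    "design 2": ["1", "ab", "c", "abc"],
    "design 3": ["1", "ab", "ac", "bc"],
    "careless": ["1", "a", "b", "ac"],
}

for name, block in monday_blocks.items():
    print(name, "->", confounded_effects(block))
```

The first three allocations lose one effect each; the last loses four.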
Plus-Minus Table

            A     B     AB    C     AC    BC    ABC
      1     -     -     +     -     +     +     -
      a     +     -     -     -     -     +     +
      b     -     +     -     -     +     -     +
      ab    +     +     +     -     -     -     -
      c     -     -     +     +     -     -     +
      ac    +     -     -     +     +     -     -
      bc    -     +     -     +     -     +     -
      abc   +     +     +     +     +     +     +
Recall: X is “nearly constant”. If X
varies significantly with t.c.’s, it
interacts with A/B/C, etc., and should
be included as an additional factor.
Basic idea can be viewed as follows:
STUDY IMPORTANT FACTORS UNDER MORE HOMOGENEOUS CONDITIONS, with the influence of some of the heterogeneity in yields caused by unstudied factors confined to one effect (generally the one we’re least interested in estimating, often one we’re willing to assume equals zero, usually the highest order interaction). We reduce experimental error by creating 2 smaller blocks, at the expense of confounding one effect.
All estimates not “lost” can be judged against less variability (and hence we get narrower confidence intervals, smaller β error for a given α error, etc.)
For large k in a 2^k design, confounding is popular. Why?
(1) It is difficult to create large homogeneous blocks.
(2) Loss of one effect is not thought to be important
(e.g., in a 2^7, we give up 1 out of 127 effects, perhaps ABCDEFG).
Partial Confounding
2^3 with 4 replications:

      Repl. 1          Repl. 2          Repl. 3          Repl. 4
      Confound ABC     Confound AB      Confound AC      Confound BC
      1     a          1     a          1     a          1     b
      ab    b          ab    b          b     ab         a     ab
      ac    c          c     ac         ac    c          bc    c
      bc    abc        abc   bc         abc   bc         abc   ac
Can estimate A, B, C from all 4 replicates (32 “units of reliability”).

      AB     from Repl. 1, 3, 4   }
      AC     from Repl. 1, 2, 4   }   24 “units
      BC     from Repl. 1, 2, 3   }   of reliability”
      ABC    from Repl. 2, 3, 4   }
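The bookkeeping can be sketched in a few lines. This assumes, as the counts 32 and 24 suggest, that a “unit of reliability” is one observation in a replicate where the effect is clean (8 per replicate); the names below are hypothetical.

```python
# Replicate number -> effect confounded in that replicate (from the layout above).
CONFOUNDED_IN = {1: "ABC", 2: "AB", 3: "AC", 4: "BC"}
RUNS_PER_REPLICATE = 8  # a full 2^3

def units_of_reliability(effect):
    """Replicates in which `effect` is clean, and the number of observations they supply."""
    clean = [rep for rep, lost in CONFOUNDED_IN.items() if lost != effect]
    return clean, RUNS_PER_REPLICATE * len(clean)

for effect in ["A", "B", "C", "AB", "AC", "BC", "ABC"]:
    reps, units = units_of_reliability(effect)
    print(f"{effect:>3}: replicates {reps}, {units} units")
```

Every effect remains estimable; the interactions simply rest on three replicates instead of four.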
Example:
Dependent Variable: Weight loss of Ceramic Ware
A: Firing Time
B: Firing Temperature
C: Formula of Ingredients
Only 2 weighing mechanisms are available, each able to handle (only) 4 t.c.’s. The 2^3 is replicated twice:

      Repl. 1: Confound ABC          Repl. 2: Confound AB
      Machine 1   Machine 2          Machine 1   Machine 2
      1           a                  1           a
      ab          b                  ab          b
      ac          c                  c           ac
      bc          abc                abc         bc

A, B, C, AC, BC “clean” in both replications.
AB from repl. 1; ABC from repl. 2.
Multiple Confounding
Further blocking: (more than 2 blocks)
Example: 2^4 in 4 blocks

      Block 1   Block 2   Block 3   Block 4
      1         a         b         c
      cd        acd       bcd       d
      abd       bd        ad        abcd
      abc       bc        ac        ab
Imagine that these blocks differ by constants
in terms of the variable being measured; all
yields in the first block are too high (or too
low) by R. Similarly, the other 3 blocks are too
high (or too low) by amounts S, T, U,
respectively. (These letters play the role of X
in 2 - block confounding).
(R + S + T + U = 0
by definition)
Given the allocation of the 16 t.c.’s to the smaller blocks shown above, (lengthy) examination of all 15 effects reveals that these unknown but constant (and systematic) block differences R, S, T, U confound the estimates of AB, BCD, and ACD (the minimum number of confounded estimates is 1 fewer than the number of blocks) but leave UNAFFECTED the 12 remaining estimates in the 2^4 design.
This result is illustrated for ACD (a confounded effect) and D (a “clean” effect).
                     D                           ACD
      t.c.    Sign of      block          Sign of      block
              treatment    effect         treatment    effect
      1       -            -R             -            -R
      a       -            -S             +            +S
      b       -            -T             -            -T
      ab      -            -U             +            +U
      c       -            -U             +            +U
      ac      -            -T             -            -T
      bc      -            -S             +            +S
      abc     -            -R             -            -R
      d       +            +U             +            +U
      ad      +            +T             -            -T
      bd      +            +S             +            +S
      abd     +            +R             -            -R
      cd      +            +R             -            -R
      acd     +            +S             +            +S
      bcd     +            +T             -            -T
      abcd    +            +U             +            +U
In estimating D, block differences
cancel. In estimating ACD, Block
differences DO NOT cancel (the R’s,
S’s, T’s, and U’s accumulate). In
fact, we would estimate not ACD,
but
[ACD - R/2 + S/2 - T/2 + U/2].
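The accumulation can be tallied mechanically. The sketch below (hypothetical helper names) sums, for each block, the signs that its four t.c.’s carry in a given effect and divides by 8, the divisor of a 2^4 estimate; the result is the coefficient of R, S, T, U in that estimate.

```python
from fractions import Fraction

# Block constant -> t.c.'s run in that block (the 2^4 allocation used above).
BLOCKS = {
    "R": ["1", "cd", "abd", "abc"],
    "S": ["a", "acd", "bd", "bc"],
    "T": ["b", "bcd", "ad", "ac"],
    "U": ["c", "d", "abcd", "ab"],
}

def sign(effect, tc):
    """+1 or -1: the sign with which t.c. `tc` enters the estimate of `effect`."""
    s = 1
    for letter in effect.lower():
        s *= 1 if letter in tc else -1
    return s

def block_terms(effect):
    """Coefficient of each block constant in the estimate of `effect`."""
    divisor = 8  # 2^(k-1) with k = 4
    return {name: Fraction(sum(sign(effect, tc) for tc in tcs), divisor)
            for name, tcs in BLOCKS.items()}

d_terms = block_terms("D")      # every coefficient is 0: block effects cancel
acd_terms = block_terms("ACD")  # -1/2, +1/2, -1/2, +1/2: block effects accumulate
print("D:  ", {k: str(v) for k, v in d_terms.items()})
print("ACD:", {k: str(v) for k, v in acd_terms.items()})
```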
The ACD estimate is hopelessly confounded with
block effects.
We began this discussion of multiple
Confounding with 4 treatment combo’s allocated
to each of the four smaller blocks. We then
determined what effects were and were not
confounded.
Sensibly, this is ALWAYS REVERSED.
The experimenter decides what effects
he/she is willing to confound, then
determines the treatments appropriate
to each smaller block. (In our example,
experimenter chose AB, BCD, ACD).
As a consequence of a theorem by Barnard, only two of the three effects can be chosen by the experimenter. The third is then determined by “MOD 2 multiplication” (any squared letter drops out).
Depending on which two effects were selected, the third will be produced as follows:
AB x BCD = AB^2CD = ACD
AB x ACD = A^2BCD = BCD
BCD x ACD = ABC^2D^2 = AB
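MOD 2 multiplication amounts to keeping the letters that appear an odd number of times. A minimal sketch, with a hypothetical function name, using a set symmetric difference:

```python
def mod2_product(*words):
    """Multiply effects letter by letter; a letter appearing an even number of times drops out."""
    letters = set()
    for word in words:
        letters ^= set(word) - {"1"}   # symmetric difference = exponents added mod 2
    return "".join(sorted(letters)) or "1"

print(mod2_product("AB", "BCD"))    # ACD
print(mod2_product("AB", "ACD"))    # BCD
print(mod2_product("BCD", "ACD"))   # AB
```

The same function works on t.c.’s written in lower case, with "1" acting as the identity.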
Need to select with care: in a 2^5 with 4 blocks, each of 8 t.c.’s, we need to confound 3 effects. Suppose we choose ABCDE and ABCD.
(Consequence: the third is ABCDE x ABCD = E, a main effect.)
Better would be to confound more modestly: say ABD, ACE, BCDE. (No main effects nor “2fi’s” lost.)
Once the effects to be confounded are selected, the t.c.’s which go into each block are found as follows:
Those t.c.’s with an even number of letters in common with all confounded effects go into one block (the principal block); t.c.’s for the remaining block(s) are determined by MOD 2 multiplication of the principal block.
Example: 2^5 in 4 blocks of 8.
Confounded: ABD, ACE, [BCDE]
Of the 32 t.c.’s (1, a, b, . . . , abcde), take the 8 with an even number of letters in common with all 3 terms (checking the first two alone is EQUIVALENT):
ABD, ACE, BCDE

      Prin. block *:   1, abc, bd, acd, abe, ce, ade, bcde
      mult. by a:      a, bc, abd, cd, be, ace, de, abcde
      mult. by b:      b, ac, d, abcd, ae, bce, abde, cde
      mult. by e:      e, abce, bde, acde, ab, c, ad, bcd

(Each multiplier can be any thus far “unused” t.c.)
* note: “invariance property”
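The construction can be sketched as follows, assuming the two chosen effects ABD and ACE and hypothetical function names. The multiplier for each new block is simply the first t.c. in standard order not yet used, so the last block comes out multiplied by ab rather than e; it contains the same 8 t.c.’s in a different order.

```python
def treatment_combinations(factors):
    """All 2^k t.c.'s in standard order; '1' means every factor at its low level."""
    tcs = [""]
    for f in factors:
        tcs += [tc + f for tc in tcs]
    return [tc or "1" for tc in tcs]

def mod2_product(x, y):
    """Letters appearing in exactly one of the two t.c.'s."""
    letters = (set(x) ^ set(y)) - {"1"}
    return "".join(sorted(letters)) or "1"

def in_principal_block(tc, confounded):
    """Even number of letters in common with every confounded effect."""
    return all(len(set(tc) & set(word.lower())) % 2 == 0 for word in confounded)

def build_blocks(factors, confounded):
    tcs = treatment_combinations(factors)
    principal = [tc for tc in tcs if in_principal_block(tc, confounded)]
    blocks, used = [principal], set(principal)
    for tc in tcs:
        if tc not in used:  # any thus far unused t.c. will do as multiplier
            block = [mod2_product(tc, p) for p in principal]
            blocks.append(block)
            used.update(block)
    return blocks

blocks = build_blocks("abcde", ["ABD", "ACE"])
for block in blocks:
    print(", ".join(block))
```

The “invariance property” shows up here as closure: multiplying any two members of the principal block gives another member of the principal block.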
Remember that we compute the 31 effects in the usual way. Only ABD, ACE, BCDE are not “clean”. Consider from the 2^5 table of signs: p. 265, Table 9.4.
If the influence of the unknown block
effect, R, is to be removed, it must be
done in Block 1, for R appears only in
Block 1. You can see when it cancels &
when it doesn’t.
(Similarly for S, T, U).
In general: (For 2^k in 2^r blocks)

      2^r          2^r - 1        r                 2^r - 1 - r
      number of    number of      number of         number of
      smaller      confounded     confounded        automatically
      blocks       effects        effects           confounded
                                  experimenter      effects
                                  may choose
      2            1              1                 0
      4            3              2                 1
      8            7              3                 4
      16           15             4                 11
With 8 blocks, we lose 7 effects, 3
chosen independently:
X Y Z
XY XZ YZ XYZ
With 16 blocks, we lose 15 effects, 4
chosen independently:
X Y Z V
XY XZ XV YZ YV ZV
XYZ XYV XZV YZV XYZV
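The full list of lost effects follows from the independently chosen ones by taking every product of one or more of them. A small sketch, with hypothetical function names, that reproduces the X, Y, Z, V patterns above and also accepts real effects such as ABD and ACE:

```python
from itertools import combinations

def mod2_product(words):
    """Product of several effects; letters appearing an even number of times drop out."""
    letters = set()
    for word in words:
        letters ^= set(word)
    return "".join(sorted(letters))

def confounded_set(chosen):
    """All 2^r - 1 confounded effects generated by r independently chosen effects."""
    return [mod2_product(combo)
            for size in range(1, len(chosen) + 1)
            for combo in combinations(chosen, size)]

print(confounded_set(["X", "Y", "Z"]))        # 7 effects for 8 blocks
print(len(confounded_set(list("XYZV"))))      # 15 effects for 16 blocks
print(confounded_set(["ABD", "ACE"]))         # the 2^5 example in 4 blocks
print(confounded_set(["ABCDE", "ABCD"]))      # the careless choice: a main effect appears
```

The sketch assumes the chosen effects really are independent (none is a product of the others); otherwise the list contains repeats.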
It may appear that there would be little interest in designs which confound as many as, say, 7 effects. Wrong! Recall that in, say, a 2^6 there are 63 = 2^6 - 1 effects. Confounding 7 of 63 might well be tolerable.
EXAMPLE OF:
Error reduction through confounding
A = (1/4) • (-1 + a - b + ab - c + ac - bc + abc)
V1(A) = (1/16) • 8σ^2 = σ^2/2
(Suppose σ = 2; then V1(A) = 2)
Now suppose that we ran the experiment in two blocks of four:
      M: t1, t2, t3, t4
      T: t5, t6, t7, t8
Further suppose that (M - T) = X
There are 70 ways to allocate the 8 treatment combinations, 4 on Monday and 4 on Tuesday:
      8!/(4! • 4!) = 70
There are 36 ways that yield us a “clean” (say) A estimate:
      4!/(2! • 2!) • 4!/(2! • 2!) = 6 x 6 = 36
There are 16 ways in which the estimate we get is (A - X/2):
      4!/(1! • 3!) • 4!/(3! • 1!) = 4 x 4 = 16
There are also 16 ways we get an estimate of (A + X/2).
Finally, there is one way each for getting (A + X) and (A - X).
Overall distribution:

      Estimate of A    A - X    A - X/2    A        A + X/2    A + X
      Probability      1/70     16/70      36/70    16/70      1/70
This distribution has a variance,
Vday(A) = (1/70) • (-X)^2 + (16/70) • (-X/2)^2 + (36/70) • 0 + (16/70) • (X/2)^2 + (1/70) • (X)^2
        = (10/70) • X^2
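The counts and the variance can be confirmed by brute force. The sketch below (hypothetical names) enumerates all 70 ways of putting 4 of the 8 t.c.’s on Monday and records, in units of X, how far the A estimate is shifted in each case.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

TCS = ["1", "a", "b", "ab", "c", "ac", "bc", "abc"]

def sign_A(tc):
    """Sign of a t.c. in the estimate of A."""
    return 1 if "a" in tc else -1

def shift_in_units_of_X(monday):
    """Shift of the A estimate when each Monday yield is raised by X (divisor 4)."""
    return Fraction(sum(sign_A(tc) for tc in monday), 4)

counts = Counter(shift_in_units_of_X(m) for m in combinations(TCS, 4))
n_ways = sum(counts.values())

# The mean shift is 0 by symmetry, so the variance is the mean squared shift.
variance_coeff = sum(Fraction(c, n_ways) * s ** 2 for s, c in counts.items())

for s in sorted(counts):
    print(f"shift {str(s):>4} X: {counts[s]:2d} of {n_ways} allocations")
print("Vday(A) =", variance_coeff, "* X^2")
```

Note that Python prints the coefficient in lowest terms, 1/7, which equals 10/70.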
Suppose, for example, that X = 4, i.e. 2σ (twice the standard deviation).
Then Vday(A) = (10/70) • 4^2 = 2.29.
We have
Vtotal(A) = V1(A) + Vday(A) = 2 + 2.29 = 4.29
So, without confounding, Vtotal(A) = 2 + 2.29 = 4.29;
with confounding, Vtotal(A) = 2 + 0 = 2
(with confounding, the standard deviation is 1.41, as opposed to 2.07, a reduction of 32%).
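The arithmetic of this example, written out as a short check. The variable names are hypothetical; the values σ = 2 and X = 4 are the ones assumed in the example.

```python
from math import sqrt

sigma = 2.0   # standard deviation of a single observation (assumed in the example)
X = 4.0       # Monday-minus-Tuesday difference (assumed in the example)

v1 = 8 * sigma ** 2 / 16          # V1(A) = sigma^2 / 2
v_day = (10 / 70) * X ** 2        # extra variance from a random day allocation

v_total_unconfounded = v1 + v_day
v_total_confounded = v1           # the day effect is kept out of the A estimate

sd_unconfounded = sqrt(v_total_unconfounded)
sd_confounded = sqrt(v_total_confounded)
reduction = 1 - sd_confounded / sd_unconfounded

print(f"Vday(A)   = {v_day:.2f}")
print(f"Vtotal(A) = {v_total_unconfounded:.2f} without confounding, {v_total_confounded:.2f} with")
print(f"std. dev. = {sd_unconfounded:.2f} vs {sd_confounded:.2f}, a reduction of {100 * reduction:.0f}%")
```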