CRD

A Latin square is an n by n array of n distinct symbols arranged in such a manner that
each symbol appears in each row and column exactly once. For example, a Latin square of
order 4 might be represented as follows:
A   B   C   D
B   A   D   C
C   D   A   B
D   C   B   A
The particular Latin square above features the symbols in alphabetical order in the first
row and column. This arrangement makes this a standard or reduced Latin square. If you
shuffle around the rows or columns you cannot destroy the Latinization of the square, but
it will not be a reduced one. This ability to permute rows and/or columns is an important
facet of Latin squares.
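The two properties described above (Latin and reduced) can be checked mechanically. The following is a minimal sketch; the function names `is_latin_square` and `is_reduced` are illustrative choices, not taken from the source.

```python
def is_latin_square(square):
    """True when every symbol appears exactly once in each row and each column."""
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n or any(len(row) != n for row in square):
        return False
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok


def is_reduced(square):
    """True when the first row and the first column are in alphabetical order."""
    first_row = list(square[0])
    first_col = [row[0] for row in square]
    return first_row == sorted(first_row) and first_col == sorted(first_col)


SQUARE = [list("ABCD"), list("BADC"), list("CDAB"), list("DCBA")]

# Swapping the last two rows keeps the square Latin but it is no longer reduced.
shuffled = [SQUARE[0], SQUARE[1], SQUARE[3], SQUARE[2]]

print(is_latin_square(SQUARE), is_reduced(SQUARE))      # True True
print(is_latin_square(shuffled), is_reduced(shuffled))  # True False
```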
In order to best understand how these Latin squares provide a good structure for
experimental design, specific applications can be reviewed. For instance, suppose a tire
manufacturer wants to evaluate four different treatments for a new tire tread. The
manufacturer arranges for four different cars and four different drivers which are to be
used to test the different treatments, which we conveniently choose to represent with the
letters A, B, C, & D. Now, we can assign the cars and drivers to the rows and columns
(respectively) of the array above and get, without loss of generality, the following
arrangement1:
           Ed   Hal   Joe   Ken
Ferrari    A    B     C     D
GTO        B    A     D     C
Isuzu      C    D     A     B
LeMans     D    C     B     A
This arrangement spreads out the nuisance factors of the different drivers and different
cars in a manner which should systematically block variability possibly caused by those
factors. To put some numbers with this design for the purpose of demonstrating the
calculations, we’ll use the following set of numbers.
         Ferrari   GTO     Isuzu   LeMans   Total
Ed       A=12      B=18    C=49    D=61     140
Hal      B=24      A=12    D=63    C=57     156
Joe      C=37      D=53    A=20    B=42     152
Ken      D=67      C=49    B=40    A=36     192
Total    140       132     172     196      640
The statistical model for this arrangement is given by the equation
$Y_{ijk} = \mu + R_i + C_j + T_k + \varepsilon_{ijk}$
with the T term measuring the treatment effect, the R and C terms measuring the row and
column effects respectively, and the $\mu$ and $\varepsilon$ terms representing the grand
mean and the residuals. This model leads to the following ANOVA structure.
$SS_T = SS_{treatments} + SS_{rows} + SS_{columns} + SS_{error}$
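Source        SS                                                                        df                MS
Treatments    $\frac{1}{n}\sum_{k=1}^{n} y_{..k}^{2} - \frac{y_{...}^{2}}{n^{2}}$       n - 1             $\frac{SS_{treatments}}{n-1}$
Rows          $\frac{1}{n}\sum_{i=1}^{n} y_{i..}^{2} - \frac{y_{...}^{2}}{n^{2}}$       n - 1             $\frac{SS_{rows}}{n-1}$
Columns       $\frac{1}{n}\sum_{j=1}^{n} y_{.j.}^{2} - \frac{y_{...}^{2}}{n^{2}}$       n - 1             $\frac{SS_{columns}}{n-1}$
Error         by subtraction                                                            (n - 2)(n - 1)    $\frac{SS_{e}}{(n-2)(n-1)}$
Total         $\sum_{i}\sum_{j}\sum_{k} y_{ijk}^{2} - \frac{y_{...}^{2}}{n^{2}}$        n^2 - 1
Here i indexes the rows, j the columns and k the treatments, as in the model, and a dot
denotes summation over the index it replaces.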
The analysis of variance of the n^2 observations gives (n - 1) degrees of freedom to each
of the Row, Column and Treatment factors, which sums to 3(n - 1). Subtracting this from
the total of n^2 - 1, or (n + 1)(n - 1), gives us (n + 1 - 3)(n - 1), or (n - 2)(n - 1),
degrees of freedom for the error term. For the sake of good order, it should be noted that
in general the 3(n - 1) would be (n - 1)(n - 1), so (n + 1 - (n - 1))(n - 1) becomes
2(n - 1), which is consistent with (n - 2)(n - 1) here as n = 4.
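As a worked instance of this decomposition, the sketch below computes the sums of squares and degrees of freedom for the sixteen tire measurements given earlier. The function name `latin_square_anova` and the dictionary keys are illustrative; the blocking factors are labelled by what they are (drivers, cars) rather than as rows and columns, since the data table and the design table list them in transposed positions.

```python
# Treatment letters and observed tire erosion, both indexed [driver][car]
# (drivers: Ed, Hal, Joe, Ken; cars: Ferrari, GTO, Isuzu, LeMans).
DESIGN = ["ABCD", "BADC", "CDAB", "DCBA"]
DATA = [
    [12, 18, 49, 61],
    [24, 12, 63, 57],
    [37, 53, 20, 42],
    [67, 49, 40, 36],
]


def latin_square_anova(design, data):
    """Return (sums of squares, degrees of freedom) for an unreplicated Latin square."""
    n = len(data)
    grand_total = sum(sum(row) for row in data)
    correction = grand_total ** 2 / n ** 2

    driver_totals = [sum(row) for row in data]
    car_totals = [sum(data[i][j] for i in range(n)) for j in range(n)]
    treatment_totals = {}
    for i in range(n):
        for j in range(n):
            letter = design[i][j]
            treatment_totals[letter] = treatment_totals.get(letter, 0) + data[i][j]

    def ss_of(totals):
        # Each total is a sum of n observations.
        return sum(t ** 2 for t in totals) / n - correction

    ss = {
        "treatments": ss_of(treatment_totals.values()),
        "drivers": ss_of(driver_totals),
        "cars": ss_of(car_totals),
        "total": sum(y ** 2 for row in data for y in row) - correction,
    }
    ss["error"] = ss["total"] - ss["treatments"] - ss["drivers"] - ss["cars"]
    df = {"treatments": n - 1, "drivers": n - 1, "cars": n - 1,
          "error": (n - 2) * (n - 1), "total": n ** 2 - 1}
    return ss, df


ss, df = latin_square_anova(DESIGN, DATA)
for source in ("treatments", "drivers", "cars", "error", "total"):
    print(f"{source:11}{ss[source]:8.0f}{df[source]:4}")
```

With these data the treatment sum of squares is 3944, the car and driver sums of squares are 656 and 376, the error is 80 and the total is 5056, on 3, 3, 3, 6 and 15 degrees of freedom.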
We should observe that the row and column Latin restrictions compromise the
randomization of the design, and therefore one must apply some mechanism to counter
this. The standard randomization procedure for Latin square designs was given by Yates
in 1933 and roughly it says that for squares of order 3, 4 or 5, you must start with a
reduced Latin square and then you must permute all rows except the first and all columns
or all columns except the first and all rows, and then assign treatments at random to the
marker symbols. For squares of order 6 or higher, it is satisfactory to permute all rows,
columns and treatments, without the need to start with a reduced Latin square.
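One reading of the procedure for orders 3 to 5 can be sketched as follows. This is an illustrative interpretation of the description above, not a transcription of Yates's original; the function name `randomize_latin_square` and the treatment labels T1 to T4 are hypothetical.

```python
import random


def randomize_latin_square(reduced, treatments, rng=None):
    """Permute rows 2..n and all columns of a reduced square, then relabel at random."""
    rng = rng or random.Random()
    n = len(reduced)

    # Permute all rows except the first.
    tail = list(range(1, n))
    rng.shuffle(tail)
    rows = [reduced[0]] + [reduced[i] for i in tail]

    # Permute all columns.
    cols = list(range(n))
    rng.shuffle(cols)
    square = [[row[j] for j in cols] for row in rows]

    # Assign the treatments at random to the marker symbols.
    markers = sorted(set(reduced[0]))
    shuffled = list(treatments)
    rng.shuffle(shuffled)
    relabel = dict(zip(markers, shuffled))
    return [[relabel[symbol] for symbol in row] for row in square]


plan = randomize_latin_square(["ABCD", "BADC", "CDAB", "DCBA"],
                              ["T1", "T2", "T3", "T4"],
                              random.Random(1933))
for row in plan:
    print(" ".join(row))
```

Passing a seeded `random.Random` makes the layout reproducible, which is convenient when the plan has to be documented before the experiment is run.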
Additionally, it should be pointed out that the model doesn’t allow for interactions and
affords the experimenter information only on the main effects.
For each of the specific designs [differentiated by the exponentiated footnote] for which a
model is articulated there will be a complete ANOVA using a single set of data. The set of
these calculations will be gathered at the end of the paper.
So let’s say that we set this design up and are about to run it when some technician points
out that we have the cars and drivers rented for four days, so we could run the
experiments for all four days and thus have replicates. Just as the cheering dies down
from the realization that we can build in a greater variance of error, someone points out
that there are choices to be made. Specifically, we could maintain the tread, driver, and
car trinity throughout the four days2, or we could rotate either the cars3 or drivers4 or
even both5. In each of the cases the basic model is
$Y_{ijkl} = \mu + R_i + C_j + T_k + P_l + \varepsilon_{ijkl}$,
where the P term accounts for the replication. However, it should be pointed out that the
very specific differences in the designs call for different ANOVA calculations. The
subtleties are best understood when they are viewed together.
This paper will outline the different setup for each of the various cases of the replication,
but again, the computations will be reserved for the end of the paper. In order to maintain
some degree of brevity, we will employ only a pair of replicates.
Case 1 (LSD2): No variation in the tread/car/driver arrangement through the p (in this
case 2) replicates. Here the same reduced and randomized (we’ll assume without loss of
generality that the random permutations left the reduced design intact) Latin square will
be employed.
         Ferrari    GTO        Isuzu      LeMans     Total
Ed       A=12,16    B=18,23    C=49,51    D=61,44    140,134
Hal      B=24,18    A=12,49    D=63,16    C=57,33    156,116
Joe      C=37,41    D=53,33    A=20,59    B=42,82    152,215
Ken      D=67,52    C=49,28    B=40,37    A=36,29    192,146
Total    140,127    132,133    172,163    196,188    640,611
The first number in the ijth entry in the array corresponds to the first replicate, and the
second number corresponds to the second replicate.
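Source        SS                                                                                 df                       MS
Treatments    $\frac{1}{np}\sum_{k=1}^{n} y_{..k.}^{2} - \frac{y_{....}^{2}}{n^{2}p}$            n - 1                    $\frac{SS_{treatments}}{n-1}$
Rows          $\frac{1}{np}\sum_{i=1}^{n} y_{i...}^{2} - \frac{y_{....}^{2}}{n^{2}p}$            n - 1                    $\frac{SS_{rows}}{n-1}$
Columns       $\frac{1}{np}\sum_{j=1}^{n} y_{.j..}^{2} - \frac{y_{....}^{2}}{n^{2}p}$            n - 1                    $\frac{SS_{columns}}{n-1}$
Replicates    $\frac{1}{n^{2}}\sum_{l=1}^{p} y_{...l}^{2} - \frac{y_{....}^{2}}{n^{2}p}$         p - 1                    $\frac{SS_{replicates}}{p-1}$
Error         by subtraction                                                                     (n - 1)(p(n + 1) - 3)    $\frac{SS_{e}}{(n-1)(p(n+1)-3)}$
Total         $\sum_{i}\sum_{j}\sum_{k}\sum_{l} y_{ijkl}^{2} - \frac{y_{....}^{2}}{n^{2}p}$      pn^2 - 1
Here i indexes the rows, j the columns, k the treatments and l the replicates, as in the
model.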
At times the following substitutions may be made E = Ed, H = Hal, J = Joe, K = Ken,
F = Ferrari, G = GTO, I = Isuzu, and L = LeMans.
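The Case 1 decomposition can be applied to the two replicates as in the sketch below. It assumes the usual total-based formulas for a replicated Latin square in which the same square is reused, with the replicate term computed from the two replicate totals; the function name `replicated_anova` is illustrative, and the blocking factors are labelled as drivers and cars.

```python
DESIGN = ["ABCD", "BADC", "CDAB", "DCBA"]
# Two replicates of the tire data, each indexed [driver][car].
REPLICATES = [
    [[12, 18, 49, 61], [24, 12, 63, 57], [37, 53, 20, 42], [67, 49, 40, 36]],
    [[16, 23, 51, 44], [18, 49, 16, 33], [41, 33, 59, 82], [52, 28, 37, 29]],
]


def replicated_anova(design, replicates):
    """Sums of squares and df when the same square is reused in every replicate."""
    n, p = len(design), len(replicates)
    cells = [(i, j, l, replicates[l][i][j])
             for l in range(p) for i in range(n) for j in range(n)]
    correction = sum(y for *_, y in cells) ** 2 / (n * n * p)

    def ss_for(level_of, observations_per_level):
        totals = {}
        for i, j, l, y in cells:
            level = level_of(i, j, l)
            totals[level] = totals.get(level, 0) + y
        return sum(t * t for t in totals.values()) / observations_per_level - correction

    ss = {
        "treatments": ss_for(lambda i, j, l: design[i][j], n * p),
        "drivers": ss_for(lambda i, j, l: i, n * p),
        "cars": ss_for(lambda i, j, l: j, n * p),
        "replicates": ss_for(lambda i, j, l: l, n * n),
    }
    total = sum(y * y for *_, y in cells) - correction
    ss["error"] = total - sum(ss.values())
    ss["total"] = total
    df = {"treatments": n - 1, "drivers": n - 1, "cars": n - 1, "replicates": p - 1,
          "error": (n - 1) * (p * (n + 1) - 3), "total": p * n * n - 1}
    return ss, df


ss, df = replicated_anova(DESIGN, REPLICATES)
for source, value in ss.items():
    print(f"{source:11}{value:12.3f}{df[source]:4}")
```

For n = 4 and p = 2 the error term carries (4 - 1)(2(5) - 3) = 21 of the 31 total degrees of freedom.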
Case 2 (LSD3): The drivers are randomly rearranged in the second replicate, but not the
cars. The first set of numbers will be assigned to the first (reduced) Latin square, but the
second set will be assigned to the following, permuted version of the original.
First replicate:
     E       H       J       K
F    A=12    B=24    C=37    D=67
G    B=18    A=12    D=53    C=49
I    C=49    D=63    A=20    B=40
L    D=61    C=57    B=42    A=36
+
Second replicate:
     K       J       E       H
F    A=16    B=18    C=41    D=52
G    B=23    A=49    D=33    C=28
I    C=51    D=16    A=59    B=37
L    D=44    C=33    B=82    A=29
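Source        SS                                                                                                          df                  MS
Treatments    $\frac{1}{np}\sum_{k=1}^{n} y_{..k.}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                                     n - 1               $\frac{SS_{treatments}}{n-1}$
Rows          $\frac{1}{n}\sum_{l=1}^{p}\sum_{i=1}^{n} y_{i..l}^{2} - \sum_{l=1}^{p}\frac{y_{...l}^{2}}{n^{2}}$           p(n - 1)            $\frac{SS_{rows}}{p(n-1)}$
Columns       $\frac{1}{np}\sum_{j=1}^{n} y_{.j..}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                                     n - 1               $\frac{SS_{columns}}{n-1}$
Replicates    $\frac{1}{n^{2}}\sum_{l=1}^{p} y_{...l}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                                  p - 1               $\frac{SS_{replicates}}{p-1}$
Error         by subtraction                                                                                              (n - 1)(np - 2)     $\frac{SS_{e}}{(n-1)(np-2)}$
Total         $\sum_{i}\sum_{j}\sum_{k}\sum_{l} y_{ijkl}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                               pn^2 - 1
The error degrees of freedom are what remains of pn^2 - 1 after the other four lines are
removed.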

Case 3 (LSD4): The cars are randomly rearranged in the second replicate, but not the
drivers.
First replicate:
     E       H       J       K
F    A=12    B=24    C=37    D=67
G    B=18    A=12    D=53    C=49
I    C=49    D=63    A=20    B=40
L    D=61    C=57    B=42    A=36
+
Second replicate:
     E       H       J       K
I    A=16    B=18    C=41    D=52
F    B=23    A=49    D=33    C=28
L    C=51    D=16    A=59    B=37
G    D=44    C=33    B=82    A=29
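Source        SS                                                                                                          df                  MS
Treatments    $\frac{1}{np}\sum_{k=1}^{n} y_{..k.}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                                     n - 1               $\frac{SS_{treatments}}{n-1}$
Rows          $\frac{1}{np}\sum_{i=1}^{n} y_{i...}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                                     n - 1               $\frac{SS_{rows}}{n-1}$
Columns       $\frac{1}{n}\sum_{l=1}^{p}\sum_{j=1}^{n} y_{.j.l}^{2} - \sum_{l=1}^{p}\frac{y_{...l}^{2}}{n^{2}}$           p(n - 1)            $\frac{SS_{columns}}{p(n-1)}$
Replicates    $\frac{1}{n^{2}}\sum_{l=1}^{p} y_{...l}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                                  p - 1               $\frac{SS_{replicates}}{p-1}$
Error         by subtraction                                                                                              (n - 1)(np - 2)     $\frac{SS_{e}}{(n-1)(np-2)}$
Total         $\sum_{i}\sum_{j}\sum_{k}\sum_{l} y_{ijkl}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                               pn^2 - 1
The error degrees of freedom are what remains of pn^2 - 1 after the other four lines are
removed.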

Case 4 (LSD5): Both the cars and drivers are randomly rearranged in the second replicate.
First replicate:
     E       H       J       K
F    A=12    B=24    C=37    D=67
G    B=18    A=12    D=53    C=49
I    C=49    D=63    A=20    B=40
L    D=61    C=57    B=42    A=36
+
Second replicate:
     K       J       E       H
I    A=16    B=18    C=41    D=52
F    B=23    A=49    D=33    C=28
L    C=51    D=16    A=59    B=37
G    D=44    C=33    B=82    A=29
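Source        SS                                                                                                          df                        MS
Treatments    $\frac{1}{np}\sum_{k=1}^{n} y_{..k.}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                                     n - 1                     $\frac{SS_{treatments}}{n-1}$
Rows          $\frac{1}{n}\sum_{l=1}^{p}\sum_{i=1}^{n} y_{i..l}^{2} - \sum_{l=1}^{p}\frac{y_{...l}^{2}}{n^{2}}$           p(n - 1)                  $\frac{SS_{rows}}{p(n-1)}$
Columns       $\frac{1}{n}\sum_{l=1}^{p}\sum_{j=1}^{n} y_{.j.l}^{2} - \sum_{l=1}^{p}\frac{y_{...l}^{2}}{n^{2}}$           p(n - 1)                  $\frac{SS_{columns}}{p(n-1)}$
Replicates    $\frac{1}{n^{2}}\sum_{l=1}^{p} y_{...l}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                                  p - 1                     $\frac{SS_{replicates}}{p-1}$
Error         by subtraction                                                                                              (n - 1)(p(n - 1) - 1)     $\frac{SS_{e}}{(n-1)(p(n-1)-1)}$
Total         $\sum_{i}\sum_{j}\sum_{k}\sum_{l} y_{ijkl}^{2} - \frac{y_{....}^{2}}{n^{2}p}$                               pn^2 - 1
The error degrees of freedom are what remains of pn^2 - 1 after the other four lines are
removed.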
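The degrees-of-freedom bookkeeping for the four replicated cases can be kept honest with a small helper that obtains the error term by subtraction from the total pn^2 - 1. This is a sketch under that single assumption; the function name `replicated_df` and the flag names are illustrative.

```python
def replicated_df(n, p, rows_nested=False, cols_nested=False):
    """Degrees of freedom for a Latin square of order n replicated p times.

    A blocking factor whose levels are rearranged between replicates is treated
    as nested within replicates and receives p(n - 1) degrees of freedom.
    """
    df = {
        "treatments": n - 1,
        "rows": p * (n - 1) if rows_nested else n - 1,
        "columns": p * (n - 1) if cols_nested else n - 1,
        "replicates": p - 1,
    }
    total = p * n * n - 1
    df["error"] = total - sum(df.values())  # whatever is left over
    df["total"] = total
    return df


CASES = {
    "LSD2": dict(rows_nested=False, cols_nested=False),
    "LSD3": dict(rows_nested=True, cols_nested=False),
    "LSD4": dict(rows_nested=False, cols_nested=True),
    "LSD5": dict(rows_nested=True, cols_nested=True),
}

for name, nesting in CASES.items():
    print(name, replicated_df(4, 2, **nesting))
```

With n = 4 and p = 2 the error term has 21, 18, 18 and 15 degrees of freedom in the four cases, so each additional rearranged factor is paid for out of the error.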

Finally, just as decisions are about to be made, the issue is complicated by the realization
that only a single set of tires can be measured at the end of a day and further, the weather
forecast calls for materially different weather on each of the four days. While the first
bit of news removes the possibility of replicates, the second adds another factor.
Essentially we want to simultaneously consider the three Latin square designs listed
below.
           Clear   Hot   Sleet   Wind
Ferrari    A       B     C       D
GTO        B       A     D       C
Isuzu      C       D     A       B
LeMans     D       C     B       A

           Ed   Hal   Joe   Ken
Clear      A    B     C     D
Hot        B    A     D     C
Sleet      C    D     A     B
Wind       D    C     B     A

           Ed   Hal   Joe   Ken
Ferrari    A    B     C     D
GTO        B    A     D     C
Isuzu      C    D     A     B
LeMans     D    C     B     A
It is through the use of an orthogonal Latin square that this can easily be achieved.
A pair of Latin squares are said to be orthogonal if the ordered pairs formed by
superimposing the two sets of n^2 ijth entries are non-repeating. For example, look at the
two Latin squares below and then their “union”.
A B C D        A B C D        AA BB CC DD
B A D C        C D A B        BC AD DA CB
C D A B        D C B A        CD DC AB BA
D C B A        B A D C        DB CA BD AC
These are commonly represented with different symbols, Latin and Greek, thus avoiding
any visual confusion about similar symbols represented by different fonts. It also gives
rise to the term Graeco-Latin squares.
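The orthogonality condition is easy to test directly: superimpose the two squares and count the distinct ordered pairs. A minimal sketch, with the names `are_orthogonal`, `LATIN` and `GREEK` chosen for illustration:

```python
from itertools import product


def are_orthogonal(first, second):
    """True when superimposing the squares yields n*n distinct ordered pairs."""
    n = len(first)
    pairs = {(first[i][j], second[i][j]) for i, j in product(range(n), repeat=2)}
    return len(pairs) == n * n


LATIN = ["ABCD", "BADC", "CDAB", "DCBA"]
GREEK = ["ABCD", "CDAB", "DCBA", "BADC"]

# The "union" of the two squares, cell by cell.
union = [[a + b for a, b in zip(row_1, row_2)] for row_1, row_2 in zip(LATIN, GREEK)]
for row in union:
    print(" ".join(row))

print(are_orthogonal(LATIN, GREEK))  # True
print(are_orthogonal(LATIN, LATIN))  # False: only AA, BB, CC, DD ever occur
```

Note that AD and DA both appear in the union, which is why the pairs have to be read as ordered.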
To summarize this union and its specific arrangement here, we need to detail the symbols
and demonstrate how the factors are blocked. A, B, C, & D are the treads, A, B, C, & D
represent Ed, Hal, Joe and Ken respectively. Thus the Latin square above can be used to
represent the following design.
        Clear        Hot          Sleet        Wind
Ed      A Ferrari    B GTO        C Isuzu      D LeMans
Hal     B Isuzu      A LeMans     D Ferrari    C GTO
Joe     C LeMans     D Isuzu      A GTO        B Ferrari
Ken     D GTO        C Ferrari    B LeMans     A Isuzu
For future needs, we’ll establish the set of a, b, c & d as symbols for Ferrari, GTO, Isuzu
& LeMans.
This idea of extending the “uniting of squares” can be demonstrated if we can add
another Latin square that is pairwise orthogonal to each of the above Latin squares and
we’ll see their “union”.
a b c d        AAa BBb CCc DDd
d c b a        BCd ADc DAb CBa
b a d c        CDb DCa ABd BAc
c d a b        DBc CAd BDa ACb
(Think tread/driver/car.)
This is a set of three mutually orthogonal Latin squares (MOLS) of size 4, and in fact,
this set is complete. Modulo a change of symbols or a permutation of the rows or columns,
all complete sets of size 4 are identical to that given above. It should be obvious that
the size of a complete set of MOLS of order n is n - 1. What isn’t obvious is that complete
sets are only known to exist for an n which is either a prime or a prime power. All other
numbers have a lower bound of 2 MOLS, except for n = 6, for which there is no orthogonal
pair. This means that additional factors can more easily be accommodated as the number of
treatments climbs.
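The sketch below checks that the three order-4 squares used above are mutually orthogonal, and then builds a complete set for a prime order with the standard modular construction L_k[i][j] = (k*i + j) mod n, k = 1, ..., n - 1. That construction is a textbook one rather than something taken from this paper, and it only applies when n is prime (order 4 is a prime power and needs finite-field arithmetic instead). The function names are illustrative.

```python
from itertools import combinations, product


def are_orthogonal(first, second):
    """True when superimposing the squares yields n*n distinct ordered pairs."""
    n = len(first)
    pairs = {(first[i][j], second[i][j]) for i, j in product(range(n), repeat=2)}
    return len(pairs) == n * n


def mutually_orthogonal(squares):
    """True when every pair of squares in the collection is orthogonal."""
    return all(are_orthogonal(a, b) for a, b in combinations(squares, 2))


def mols_for_prime(n):
    """Complete set of n - 1 MOLS of prime order n: L_k[i][j] = (k*i + j) mod n."""
    return [[[(k * i + j) % n for j in range(n)] for i in range(n)]
            for k in range(1, n)]


TREAD = ["ABCD", "BADC", "CDAB", "DCBA"]
DRIVER = ["ABCD", "CDAB", "DCBA", "BADC"]
CAR = ["abcd", "dcba", "badc", "cdab"]

print(mutually_orthogonal([TREAD, DRIVER, CAR]))   # True

order_5 = mols_for_prime(5)
print(len(order_5), mutually_orthogonal(order_5))  # 4 True
```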
Let us combine our developing tire experiment with the complete set of MOLS above for
a theoretical experiment. In this latest version6 we are not only adding in the blocking of
the factor of the weather, but also we are adding in yet another factor for, say, different
locations for the four tracks.
           Clear            Hot              Sleet            Wind
Track 1    A Ed Ferrari     B Hal GTO        C Joe Isuzu      D Ken LeMans
Track 2    B Ken Isuzu      A Joe LeMans     D Hal Ferrari    C Ed GTO
Track 3    C Hal LeMans     D Ed Isuzu       A Ken GTO        B Joe Ferrari
Track 4    D Joe GTO        C Ken Ferrari    B Ed LeMans      A Hal Isuzu
Here we have the four tire tread types (A, B, C, & D) arranged with the four four-level
factors (location, weather, car & driver). A Latin square design can accommodate one
more factor than the cardinality of the complete set of MOLS for a given n.
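Whether the track-by-weather layout really blocks every factor against every other can be checked by brute force: each tread, driver and car must appear once in every row and column, and every pair of levels from two different factors must occur exactly once. The following is a sketch; the name `is_blocked_design` is illustrative.

```python
from itertools import combinations

# Rows are tracks 1-4, columns are Clear, Hot, Sleet, Wind; cells are (tread, driver, car).
DESIGN = [
    [("A", "Ed", "Ferrari"), ("B", "Hal", "GTO"), ("C", "Joe", "Isuzu"), ("D", "Ken", "LeMans")],
    [("B", "Ken", "Isuzu"), ("A", "Joe", "LeMans"), ("D", "Hal", "Ferrari"), ("C", "Ed", "GTO")],
    [("C", "Hal", "LeMans"), ("D", "Ed", "Isuzu"), ("A", "Ken", "GTO"), ("B", "Joe", "Ferrari")],
    [("D", "Joe", "GTO"), ("C", "Ken", "Ferrari"), ("B", "Ed", "LeMans"), ("A", "Hal", "Isuzu")],
]


def is_blocked_design(design):
    """Check the Latin property of every factor and orthogonality of every factor pair."""
    n = len(design)
    factors = len(design[0][0])
    for f in range(factors):
        levels = {cell[f] for row in design for cell in row}
        if len(levels) != n:
            return False
        if any({cell[f] for cell in row} != levels for row in design):
            return False
        if any({design[i][j][f] for i in range(n)} != levels for j in range(n)):
            return False
    for f, g in combinations(range(factors), 2):
        pairs = {(cell[f], cell[g]) for row in design for cell in row}
        if len(pairs) != n * n:
            return False
    return True


print(is_blocked_design(DESIGN))  # True
```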
While one may defend the stipulation that there are no interactions, it is difficult to ignore
the constraint that the various factors must have an equal number of options. In the case
of continuous parameters of which we have control, we can consider that we are merely
adding center points and we would realize that some factors can easily be forced to fit the
model’s needs. For instance, if one of the factors, say a temperature, had only two
levels, while the other factor and the treatment had a cardinality of four, we could balance
things out by interjecting two interpolated temperature points to achieve the requisite
symmetry of the Latin square design. While it is reassuring to know that such an option
may exist, there are frequent occasions where an imbalance in the parameters does
preclude the use of a Latin square. For example:
Suppose a baker wants to experiment with seven different bread recipes at a variety of
temperatures in three different ovens. It is easy for him to justify seven different
temperature levels (and cooking times), but there is no flexibility in the three distinct
ovens. Clearly a Latin square cannot be used, but there exist fundamental results in
design theory which allow for non-symmetric designs. These are called Balanced
Incomplete Block Designs (BIBD), and they are defined below.
A BIBD with parameters (v, b, r, k, λ) is a pair (X, A) that satisfies the following
properties:
1. X is a set of v elements (called points).
2. A is a family of b subsets of X (called blocks), each with cardinality k.
3. Each point occurs in exactly r blocks.
4. Every pair of distinct points occurs in exactly λ blocks.
A little bit of inspection would reveal the following relationships between the parameters:
vr = bk   &   λ(v - 1) = r(k - 1)
This allows a BIBD to be defined as a (v, k, λ)-BIBD, which gives us all our
information.
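The "little bit of inspection" can be spelled out as a double-counting argument. The following is the standard derivation, written with the same symbols v, b, r, k, λ, X and A; it is not reproduced from the source.

```latex
% Count incidences (x, B) with x \in B, first by points and then by blocks:
\[ \sum_{x \in X} r \;=\; \sum_{B \in A} k \quad\Longrightarrow\quad vr = bk . \]
% Fix a point x and count pairs (y, B) with y \neq x and x, y \in B,
% first over the v - 1 choices of y, then over the r blocks through x:
\[ \lambda (v - 1) = r (k - 1) . \]
% Hence r and b are determined by v, k and \lambda:
\[ r = \frac{\lambda (v - 1)}{k - 1}, \qquad
   b = \frac{vr}{k} = \frac{\lambda v (v - 1)}{k (k - 1)} . \]
```

For instance, v = 7, k = 3 and λ = 1 force r = 3 and b = 7.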
In this case, the experiment can easily be set up as follows:
Temperature    250       275       300       325       350       375       400
Time           60 min    50 min    45 min    42 min    40 min    38 min    37 min
Oven 1         A         B         C         D         E         F         G
Oven 2         B         C         D         E         F         G         A
Oven 3         D         E         F         G         A         B         C

v = number of treatments.
b = number of blocks.
r = number of blocks in which a treatment occurs.
k = size of the blocks.
λ = number of times that two treatments occur together in a block in the overall design.
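Reading each temperature column of the baker's layout as a block, the parameters can be recovered from the layout itself and checked against the two identities. A minimal sketch, with `bibd_parameters` as an illustrative name:

```python
from itertools import combinations

# One string per oven; position t in each string is the recipe baked at temperature t.
OVENS = ["ABCDEFG", "BCDEFGA", "DEFGABC"]

# Each temperature (column) is a block of three recipes.
blocks = [frozenset(column) for column in zip(*OVENS)]


def bibd_parameters(blocks):
    """Return (v, b, r, k, lam) if the blocks form a BIBD, otherwise None."""
    points = sorted(set().union(*blocks))
    v, b = len(points), len(blocks)
    sizes = {len(block) for block in blocks}
    replications = {sum(x in block for block in blocks) for x in points}
    lambdas = {sum(x in block and y in block for block in blocks)
               for x, y in combinations(points, 2)}
    if len(sizes) != 1 or len(replications) != 1 or len(lambdas) != 1:
        return None
    return v, b, replications.pop(), sizes.pop(), lambdas.pop()


v, b, r, k, lam = bibd_parameters(blocks)
print(v, b, r, k, lam)                           # 7 7 3 3 1
print(v * r == b * k, lam * (v - 1) == r * (k - 1))  # True True
```

So the layout is a (7, 3, 1)-BIBD: every pair of recipes meets at exactly one temperature.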
This BIBD, sometimes called a Youden square, loses the two direction heterogeneity as
the columns are orthogonal to the rows and treatments, but the rows are not orthogonal to
the treatments since not every treatment occurs in every row. This means that the
estimates of the treatment effects have to be adjusted for row effects, i.e., treatment
means can no longer be used to estimate treatment effects, and least squares means must
be obtained instead.
A Room square of side n (on a set of n + 1 symbols) is an n by n array which satisfies the
following properties:
1. Every cell is either empty or filled with an unordered pair of symbols.
2. Every symbol occurs in each row and column exactly once.
3. Every unordered pair occurs in exactly one cell.
For example:
07   -    -    15   -    46   23
34   17   -    -    26   -    50
61   45   27   -    -    30   -
-    02   56   37   -    -    41
52   -    13   60   47   -    -
-    63   -    24   01   57   -
-    -    04   -    35   12   67
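The three defining properties can be verified for the example of side 7 with a short check. This is a sketch; empty cells are represented by `None` and the function name `is_room_square` is illustrative.

```python
from itertools import combinations

ROOM = [
    ["07", None, None, "15", None, "46", "23"],
    ["34", "17", None, None, "26", None, "50"],
    ["61", "45", "27", None, None, "30", None],
    [None, "02", "56", "37", None, None, "41"],
    ["52", None, "13", "60", "47", None, None],
    [None, "63", None, "24", "01", "57", None],
    [None, None, "04", None, "35", "12", "67"],
]


def is_room_square(square):
    """Check the Room square properties on an n by n array of 2-symbol strings."""
    n = len(square)
    symbols = {s for row in square for cell in row if cell for s in cell}
    if len(symbols) != n + 1:
        return False

    def line_ok(cells):
        # Every symbol occurs exactly once along a row or column.
        seen = [s for cell in cells if cell for s in cell]
        return sorted(seen) == sorted(symbols)

    rows_ok = all(line_ok(row) for row in square)
    cols_ok = all(line_ok([square[i][j] for i in range(n)]) for j in range(n))

    # Every unordered pair occurs in exactly one cell.
    pairs = [frozenset(cell) for row in square for cell in row if cell]
    all_pairs = {frozenset(pair) for pair in combinations(symbols, 2)}
    pairs_ok = len(pairs) == len(all_pairs) and set(pairs) == all_pairs
    return rows_ok and cols_ok and pairs_ok


print(is_room_square(ROOM))  # True
```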
A common application for Room squares is in the construction of round-robin
tournaments. Think of the rows as the rounds, the columns as the locations and n + 1
symbols as the teams. Notice that the following properties result:
1. Every team plays every other team exactly once.
2. Every team plays exactly once in every round.
3. Every team plays at every location exactly once.
The application for this would be for examining interactions while blocking for other
factors. My question was whether this design, in conjunction with analysis using a Latin
square design to give some measure for the main effects, would be a reasonable platform
for comparison.
To more clearly articulate how this experiment might look, I’ll extend and modify the tire
tread analysis. We’ll assume that there are eight treatments, now A, B, C, D, E, F, G, &
H, which we substitute for 0 to 7 according to their rank. After running the n = 8 Latin
square design, we might look at the following Room square design.
           Quint   Rod   Sam   Tim   Unk   Vic   Will
Isuzu      AH      -     -     BF    -     EG    CD
Jeep       DE      BH    -     -     CG    -     AF
KIA        BG      EF    CH    -     -     AD    -
LeMans     -       AC    FG    DH    -     -     BE
Nova       CF      -     BD    AG    EH    -     -
Olds       -       DG    -     CE    AB    FH    -
Pontiac    -       -     AE    -     DF    BC    GH
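The tread design is the numeric Room square of side 7 with the symbols 0 to 7 replaced by A to H according to rank. A sketch of that substitution follows; which car labels which row and which driver labels which column is an assumption made here for illustration.

```python
ROOM = [
    ["07", None, None, "15", None, "46", "23"],
    ["34", "17", None, None, "26", None, "50"],
    ["61", "45", "27", None, None, "30", None],
    [None, "02", "56", "37", None, None, "41"],
    ["52", None, "13", "60", "47", None, None],
    [None, "63", None, "24", "01", "57", None],
    [None, None, "04", None, "35", "12", "67"],
]
LETTERS = "ABCDEFGH"  # 0 -> A, 1 -> B, ..., 7 -> H

# Assumed labelling: cars on the rows, drivers on the columns.
CARS = ["Isuzu", "Jeep", "KIA", "LeMans", "Nova", "Olds", "Pontiac"]
DRIVERS = ["Quint", "Rod", "Sam", "Tim", "Unk", "Vic", "Will"]


def relabel(cell):
    """Turn a digit pair such as '50' into a sorted letter pair such as 'AF'."""
    if cell is None:
        return "-"
    return "".join(sorted(LETTERS[int(digit)] for digit in cell))


lettered = [[relabel(cell) for cell in row] for row in ROOM]

print(" " * 9 + "".join(f"{driver:7}" for driver in DRIVERS))
for car, row in zip(CARS, lettered):
    print(f"{car:9}" + "".join(f"{cell:7}" for cell in row))
```

Each filled cell then names the two treads that are compared on that car with that driver.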
What I cannot show you very well is that this design can also be extended to n dimensions
through some clever constructions, the main one being the construction of pairwise
orthogonal, symmetric Latin squares of order n.
In fact, the existence of the following is equivalent:
1. A Room d-cube of side n (d = dimension).
2. d pairwise orthogonal-symmetric Latin squares of order n.
3. d pairwise orthogonal one-factorizations of K_{n+1}.
4. v(n) greater than or equal to d.
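For d = 2 the link between items 1 and 3 can be seen directly in the Room square of side 7: its rows form one one-factorization of K_8, its columns form another, and any row factor shares at most one edge with any column factor. The sketch below illustrates only this d = 2 case; the helper names are illustrative.

```python
ROOM = [
    ["07", None, None, "15", None, "46", "23"],
    ["34", "17", None, None, "26", None, "50"],
    ["61", "45", "27", None, None, "30", None],
    [None, "02", "56", "37", None, None, "41"],
    ["52", None, "13", "60", "47", None, None],
    [None, "63", None, "24", "01", "57", None],
    [None, None, "04", None, "35", "12", "67"],
]
VERTICES = set("01234567")


def one_factorizations(square):
    """Edge sets read off the rows and off the columns of a Room square."""
    n = len(square)
    rows = [{frozenset(cell) for cell in row if cell} for row in square]
    cols = [{frozenset(square[i][j]) for i in range(n) if square[i][j]}
            for j in range(n)]
    return rows, cols


def is_one_factor(edges, vertices):
    """True when the edges form a perfect matching on the vertices."""
    covered = [v for edge in edges for v in edge]
    return sorted(covered) == sorted(vertices)


rows, cols = one_factorizations(ROOM)

print(all(is_one_factor(f, VERTICES) for f in rows + cols))  # True
# Orthogonality: a row factor and a column factor meet in at most one edge.
print(all(1 >= len(f & g) for f in rows for g in cols))      # True
```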
I look forward to continuing my enquiry and I am hopeful that I will better understand the
utility of some combinatorial structures for the purpose of designing experiments.
ANOVA tables.
For all the different versions, we will use the following set of measures for the degree of
tire erosion. The numbers in parentheses are for the second replicate (LSD2 - LSD5).

         1           2           3           4           total
1        12(16)      18(23)      49(51)      61(44)      140(134)
2        24(18)      12(49)      63(16)      57(33)      156(116)
3        37(41)      53(33)      20(59)      42(82)      152(215)
4        67(52)      49(28)      40(37)      36(29)      192(146)
total    140(127)    132(133)    172(163)    196(188)    640(611)

LSD1
Source       SS      df    MS    F value
Treatment    3944    3
Row          656     3
Column       376     3
Replicate    na      0
Weather      na      0
Track        na      0
Model        4976    9
Error        80      6
Total        5056    15

LSD2
Source       SS      df    MS    F Value

LSD3
Source       SS      df    MS    F value
Treatment    3944    3
Row          656     3
Column       376     3
Replicate    na      0
Weather      na      0
Track        na      0
Model        4976    9
Error        80      6
Total        5056    15

LSD4
Source       SS      df    MS    F Value

LSD5
Source       SS      df    MS    F value
Treatment    3944    3
Row          656     3
Column       376     3
Replicate    na      0
Weather      na      0
Track        na      0
Model        4976    9
Error        80      6
Total        5056    15

LSD6
Source       SS      df    MS    F Value
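The MS and F value columns follow mechanically from the SS and df entries: each mean square is SS/df, and each F value is the ratio of a mean square to the error mean square. The sketch below applies this to the LSD1 entries only; the function name `mean_squares_and_f` is illustrative.

```python
# (SS, df) pairs taken from the LSD1 table.
LSD1 = {
    "Treatment": (3944, 3),
    "Row": (656, 3),
    "Column": (376, 3),
    "Error": (80, 6),
}


def mean_squares_and_f(table):
    """Return ({source: MS}, {source: F}) with F taken against the error MS."""
    ms = {source: ss / df for source, (ss, df) in table.items()}
    f_values = {source: ms[source] / ms["Error"]
                for source in table if source != "Error"}
    return ms, f_values


ms, f_values = mean_squares_and_f(LSD1)
for source in LSD1:
    f_text = f"{f_values[source]:8.1f}" if source in f_values else ""
    print(f"{source:10}{ms[source]:10.2f}{f_text}")
```

For LSD1 this gives mean squares of about 1314.67, 218.67, 125.33 and 13.33, and F values of 98.6, 16.4 and 9.4 for treatments, rows and columns, each on 3 and 6 degrees of freedom.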