8. Two-way crossed classifications
Example 8.1 Days to germination of three varieties of carrot seed grown in two types of potting soil.

                             Variety
Soil type    1              2              3
    1        Y111 = 6       Y121 = 13      Y131 = 14
             Y112 = 10      Y122 = 15      Y132 = 22
             Y113 = 11
    2        Y211 = 12      Y221 = 31      Y231 = 18
             Y212 = 15                     Y232 = 9
             Y213 = 19                     Y233 = 12
             Y214 = 18

This is called an "unbalanced" factorial experiment.
We will restrict our attention to normal-theory Gauss-Markov models.

"Cell means" model:

$$Y_{ijk} = \mu_{ij} + \epsilon_{ijk}$$

where

$$\epsilon_{ijk} \sim NID(0, \sigma^2), \qquad i = 1, \ldots, a, \quad j = 1, \ldots, b, \quad k = 1, \ldots, n_{ij}.$$

Clearly, $E(Y_{ijk}) = \mu_{ij}$ is estimable if $n_{ij} > 0$.

Sample sizes

                         Variety
Soil type    1            2            3
    1        n11 = 3      n12 = 2      n13 = 2
    2        n21 = 4      n22 = 1      n23 = 3

In general we have
  i = 1, 2, ..., a levels for the first factor
  j = 1, 2, ..., b levels for the second factor
  $n_{ij} > 0$ observations at the i-th level of the first factor and the j-th level of the second factor

Overall mean response:

$$\bar\mu_{\cdot\cdot} = \frac{1}{ab} \sum_{i=1}^{a} \sum_{j=1}^{b} \mu_{ij}$$

Mean response at the i-th level of factor 1, averaging across the levels of factor 2:

$$\bar\mu_{i\cdot} = \frac{1}{b} \sum_{j=1}^{b} \mu_{ij}$$

Mean response at the j-th level of factor 2, averaging across the levels of factor 1:

$$\bar\mu_{\cdot j} = \frac{1}{a} \sum_{i=1}^{a} \mu_{ij}$$
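As a small numerical illustration of these three definitions, the following R sketch computes the overall mean and both sets of marginal means from a table of cell means. The matrix `mu` is a hypothetical stand-in for the unknown $\mu_{ij}$; it is filled with the observed cell averages of Example 8.1 purely as illustrative values. Every cell receives equal weight, regardless of $n_{ij}$.

```r
# Stand-in cell means: rows = soil type i, columns = variety j.
# (Illustrative values only: the observed cell averages of Example 8.1.)
mu <- matrix(c( 9, 14, 18,
               16, 31, 13), nrow = 2, byrow = TRUE,
             dimnames = list(c("1", "2"), c("1", "2", "3")))

mu.overall <- mean(mu)       # (1/ab) * sum over all cells
mu.soil    <- rowMeans(mu)   # mean at each level of factor 1, averaged over factor 2
mu.variety <- colMeans(mu)   # mean at each level of factor 2, averaged over factor 1

mu.overall
mu.soil
mu.variety
```

Because the averages are unweighted, these quantities do not change if the cell sample sizes change.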
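Contrasts of interest

"Main effects" for factor 1:

$$\bar\mu_{i\cdot} - \bar\mu_{\cdot\cdot}, \quad i = 1, 2, \ldots, a \qquad\qquad \bar\mu_{i\cdot} - \bar\mu_{k\cdot}, \quad i \neq k$$

"Main effects" for factor 2:

$$\bar\mu_{\cdot j} - \bar\mu_{\cdot\cdot}, \quad j = 1, 2, \ldots, b \qquad\qquad \bar\mu_{\cdot j} - \bar\mu_{\cdot \ell}, \quad j \neq \ell$$

Conditional Effects

$$\mu_{ij} - \mu_{kj}, \quad i \neq k, \quad j = 1, 2, \ldots, b$$

$$\mu_{ij} - \mu_{i\ell}, \quad j \neq \ell, \quad i = 1, 2, \ldots, a$$

Interaction Contrasts

$$(\mu_{ij} - \mu_{kj}) - (\mu_{i\ell} - \mu_{k\ell}) = (\mu_{ij} - \mu_{i\ell}) - (\mu_{kj} - \mu_{k\ell}) = \mu_{ij} - \mu_{kj} - \mu_{i\ell} + \mu_{k\ell}$$

All of these contrasts are estimable when $n_{ij} > 0$ for all $(i, j)$, because $E(\bar Y_{ij\cdot}) = \mu_{ij}$. Any linear function of estimable functions is estimable.

An "effects" model

$$Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ijk}$$

where

$$\epsilon_{ijk} \sim NID(0, \sigma^2), \qquad i = 1, 2, \ldots, a, \quad j = 1, 2, \ldots, b, \quad k = 1, 2, \ldots, n_{ij} > 0.$$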
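The contrasts listed above are all linear combinations of the cell means, which a short R sketch can make concrete. The matrix `mu` again holds hypothetical stand-in values (the observed cell averages of Example 8.1); the variable names are illustrative only.

```r
# Stand-in cell means: rows = soil type, columns = variety (illustrative values).
mu <- matrix(c( 9, 14, 18,
               16, 31, 13), nrow = 2, byrow = TRUE,
             dimnames = list(c("1", "2"), c("1", "2", "3")))

# Main-effect contrast for factor 1: mu-bar_1. - mu-bar_2.
main1 <- mean(mu[1, ]) - mean(mu[2, ])

# Conditional effects of factor 1 at each level j of factor 2: mu_1j - mu_2j
cond1 <- mu[1, ] - mu[2, ]

# Interaction contrast for (i, k) = (1, 2) and (j, l) = (1, 2), written two ways
int.a <- (mu[1, 1] - mu[2, 1]) - (mu[1, 2] - mu[2, 2])
int.b <- (mu[1, 1] - mu[1, 2]) - (mu[2, 1] - mu[2, 2])

main1
cond1
c(int.a, int.b)
```

The main-effect contrast is the average of the conditional effects, and the two forms of the interaction contrast agree, as the algebra above requires.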
In matrix form:

$$
\begin{bmatrix}
Y_{111}\\ Y_{112}\\ Y_{113}\\ Y_{121}\\ Y_{122}\\ Y_{131}\\ Y_{132}\\ Y_{211}\\ Y_{212}\\ Y_{213}\\ Y_{214}\\ Y_{221}\\ Y_{231}\\ Y_{232}\\ Y_{233}
\end{bmatrix}
=
\left[\begin{array}{cccccccccccc}
1&1&0&1&0&0&1&0&0&0&0&0\\
1&1&0&1&0&0&1&0&0&0&0&0\\
1&1&0&1&0&0&1&0&0&0&0&0\\
1&1&0&0&1&0&0&1&0&0&0&0\\
1&1&0&0&1&0&0&1&0&0&0&0\\
1&1&0&0&0&1&0&0&1&0&0&0\\
1&1&0&0&0&1&0&0&1&0&0&0\\
1&0&1&1&0&0&0&0&0&1&0&0\\
1&0&1&1&0&0&0&0&0&1&0&0\\
1&0&1&1&0&0&0&0&0&1&0&0\\
1&0&1&1&0&0&0&0&0&1&0&0\\
1&0&1&0&1&0&0&0&0&0&1&0\\
1&0&1&0&0&1&0&0&0&0&0&1\\
1&0&1&0&0&1&0&0&0&0&0&1\\
1&0&1&0&0&1&0&0&0&0&0&1
\end{array}\right]
\begin{bmatrix}
\mu\\ \alpha_1\\ \alpha_2\\ \beta_1\\ \beta_2\\ \beta_3\\ \gamma_{11}\\ \gamma_{12}\\ \gamma_{13}\\ \gamma_{21}\\ \gamma_{22}\\ \gamma_{23}
\end{bmatrix}
+
\begin{bmatrix}
\epsilon_{111}\\ \epsilon_{112}\\ \epsilon_{113}\\ \epsilon_{121}\\ \epsilon_{122}\\ \epsilon_{131}\\ \epsilon_{132}\\ \epsilon_{211}\\ \epsilon_{212}\\ \epsilon_{213}\\ \epsilon_{214}\\ \epsilon_{221}\\ \epsilon_{231}\\ \epsilon_{232}\\ \epsilon_{233}
\end{bmatrix}
$$

The twelve columns of this model matrix are linearly dependent, so restrictions are placed on the parameters. The resulting restricted model is

$$Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ijk}$$

where

$$\epsilon_{ijk} \sim NID(0, \sigma^2), \qquad i = 1, \ldots, a, \quad j = 1, \ldots, b, \quad k = 1, \ldots, n_{ij}$$

and

$$\alpha_1 = 0, \qquad \beta_1 = 0, \qquad \gamma_{i1} = 0 \ \text{for all } i = 1, \ldots, a, \qquad \gamma_{1j} = 0 \ \text{for all } j = 1, \ldots, b.$$

We will call these "baseline" restrictions.
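The effect of the baseline restrictions on the model matrix can be checked numerically. The R sketch below builds the 12-column matrix of the unrestricted effects model and the 6-column matrix that results from the baseline restrictions, then compares ranks. It assumes R's `contr.treatment` coding corresponds to the first-level baseline restrictions; the object names (`Cell`, `X`, `Xb`) are illustrative.

```r
# Carrot data from Example 8.1
carrot <- data.frame(
  Soil    = factor(rep(1:2, times = c(7, 8))),
  Variety = factor(c(1,1,1,2,2,3,3, 1,1,1,1,2,3,3,3)),
  Days    = c(6,10,11,13,15,14,22, 12,15,19,18,31,18,9,12))

# Unrestricted effects model: 1 + 2 + 3 + 6 = 12 indicator columns
Cell <- factor(paste0(carrot$Soil, carrot$Variety))   # one level per (i, j) cell
X <- cbind(1,
           model.matrix(~ Soil - 1, carrot),
           model.matrix(~ Variety - 1, carrot),
           model.matrix(~ Cell - 1))

# Baseline restrictions: columns of the restricted parameters are dropped
Xb <- model.matrix(~ Soil * Variety, carrot,
                   contrasts.arg = list(Soil = "contr.treatment",
                                        Variety = "contr.treatment"))

rank.full     <- qr(X)$rank
rank.baseline <- qr(Xb)$rank
rank.both     <- qr(cbind(X, Xb))$rank   # unchanged if the column spaces coincide

c(cols.full = ncol(X), rank.full = rank.full,
  cols.baseline = ncol(Xb), rank.baseline = rank.baseline, rank.both = rank.both)
```

Both matrices have rank 6 = ab, and stacking them side by side does not raise the rank, so they span the same column space.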
Cell means under the baseline restrictions:

$$
\begin{array}{l|ccc|c}
 & \text{Variety 1} & \text{Variety 2} & \text{Variety 3} & \text{Means} \\ \hline
\text{Soil type 1} & \mu_{11} = \mu & \mu_{12} = \mu + \beta_2 & \mu_{13} = \mu + \beta_3 & \mu + \dfrac{\beta_2 + \beta_3}{3} \\[2ex]
\text{Soil type 2} & \mu_{21} = \mu + \alpha_2 & \mu_{22} = \mu + \alpha_2 + \beta_2 + \gamma_{22} & \mu_{23} = \mu + \alpha_2 + \beta_3 + \gamma_{23} & \mu + \alpha_2 + \dfrac{\beta_2 + \beta_3}{3} + \dfrac{\gamma_{22} + \gamma_{23}}{3} \\[2ex] \hline
\text{Means} & \mu + \dfrac{\alpha_2}{2} & \mu + \dfrac{\alpha_2}{2} + \beta_2 + \dfrac{\gamma_{22}}{2} & \mu + \dfrac{\alpha_2}{2} + \beta_3 + \dfrac{\gamma_{23}}{2} &
\end{array}
$$

Interpretation:

$$\mu = \mu_{11} = E(Y_{11k})$$

is the mean response with both factors at the first level.

$$\alpha_i = \mu_{i1} - \mu_{11} = E(Y_{i1k}) - E(Y_{11k})$$

is the difference in mean responses between levels i and 1 of factor 1 when factor 2 is at level 1.

$$\beta_j = \mu_{1j} - \mu_{11} = E(Y_{1jk}) - E(Y_{11k}) \qquad \text{for } j = 1, 2, \ldots, b$$

is the difference in the mean responses for levels j and 1 of factor 2 when factor 1 is at level 1.

Interaction:

$$\gamma_{ij} = (\mu_{ij} - \mu_{i1}) - (\mu_{1j} - \mu_{11}) = (\mu_{ij} - \mu_{1j}) - (\mu_{i1} - \mu_{11})$$
Matrix formulation:

$$
\begin{bmatrix}
Y_{111}\\ Y_{112}\\ Y_{113}\\ Y_{121}\\ Y_{122}\\ Y_{131}\\ Y_{132}\\ Y_{211}\\ Y_{212}\\ Y_{213}\\ Y_{214}\\ Y_{221}\\ Y_{231}\\ Y_{232}\\ Y_{233}
\end{bmatrix}
=
\begin{bmatrix}
1&0&0&0&0&0\\
1&0&0&0&0&0\\
1&0&0&0&0&0\\
1&0&1&0&0&0\\
1&0&1&0&0&0\\
1&0&0&1&0&0\\
1&0&0&1&0&0\\
1&1&0&0&0&0\\
1&1&0&0&0&0\\
1&1&0&0&0&0\\
1&1&0&0&0&0\\
1&1&1&0&1&0\\
1&1&0&1&0&1\\
1&1&0&1&0&1\\
1&1&0&1&0&1
\end{bmatrix}
\begin{bmatrix}
\mu\\ \alpha_2\\ \beta_2\\ \beta_3\\ \gamma_{22}\\ \gamma_{23}
\end{bmatrix}
+
\begin{bmatrix}
\epsilon_{111}\\ \epsilon_{112}\\ \epsilon_{113}\\ \epsilon_{121}\\ \epsilon_{122}\\ \epsilon_{131}\\ \epsilon_{132}\\ \epsilon_{211}\\ \epsilon_{212}\\ \epsilon_{213}\\ \epsilon_{214}\\ \epsilon_{221}\\ \epsilon_{231}\\ \epsilon_{232}\\ \epsilon_{233}
\end{bmatrix}
$$

$$\mathbf{Y} = X\boldsymbol\beta + \boldsymbol\epsilon$$

Note that

$$\gamma_{ij} - \gamma_{i\ell} - \gamma_{kj} + \gamma_{k\ell} = \mu_{ij} - \mu_{i\ell} - \mu_{kj} + \mu_{k\ell}$$

for any $(i, j)$ and $(k, \ell)$.

Least squares estimation:

$$
\mathbf{b} =
\begin{bmatrix}
\hat\mu\\ \hat\alpha_2\\ \hat\beta_2\\ \hat\beta_3\\ \hat\gamma_{22}\\ \hat\gamma_{23}
\end{bmatrix}
= (X^T X)^{-1} X^T \mathbf{Y}
=
\begin{bmatrix}
\bar Y_{11\cdot}\\
\bar Y_{21\cdot} - \bar Y_{11\cdot}\\
\bar Y_{12\cdot} - \bar Y_{11\cdot}\\
\bar Y_{13\cdot} - \bar Y_{11\cdot}\\
\bar Y_{22\cdot} - \bar Y_{21\cdot} - \bar Y_{12\cdot} + \bar Y_{11\cdot}\\
\bar Y_{23\cdot} - \bar Y_{21\cdot} - \bar Y_{13\cdot} + \bar Y_{11\cdot}
\end{bmatrix}
$$

where $\mathbf{Y} \sim N(X\boldsymbol\beta, \sigma^2 I)$.

Restrictions must involve "nonestimable" quantities for the unrestricted "effects" model.

Baseline restrictions: (SAS)

$$\alpha_a = 0, \qquad \beta_b = 0, \qquad \gamma_{ib} = 0 \ \text{for all } i = 1, \ldots, a, \qquad \gamma_{aj} = 0 \ \text{for all } j = 1, \ldots, b$$

Baseline restrictions: (S-PLUS)

$$\alpha_1 = 0, \qquad \beta_1 = 0, \qquad \gamma_{i1} = 0 \ \text{for all } i = 1, \ldots, a, \qquad \gamma_{1j} = 0 \ \text{for all } j = 1, \ldots, b$$
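The closed-form least squares solution under the first-level baseline restrictions can be verified on the carrot data. This R sketch solves the normal equations directly and compares the result with the differences of cell averages given above. It assumes `contr.treatment` in R reproduces the first-level baseline coding; `b.means` and the other object names are illustrative.

```r
# Carrot data from Example 8.1
carrot <- data.frame(
  Soil    = factor(rep(1:2, times = c(7, 8))),
  Variety = factor(c(1,1,1,2,2,3,3, 1,1,1,1,2,3,3,3)),
  Days    = c(6,10,11,13,15,14,22, 12,15,19,18,31,18,9,12))

fit.base <- lm(Days ~ Soil * Variety, data = carrot,
               contrasts = list(Soil = "contr.treatment",
                                Variety = "contr.treatment"))
X <- model.matrix(fit.base)

# b = (X'X)^{-1} X'Y, solved without forming the inverse explicitly
b <- drop(solve(crossprod(X), crossprod(X, carrot$Days)))

# The same estimates written as functions of the cell averages
ybar <- tapply(carrot$Days, list(carrot$Soil, carrot$Variety), mean)
b.means <- c(ybar[1, 1],
             ybar[2, 1] - ybar[1, 1],
             ybar[1, 2] - ybar[1, 1],
             ybar[1, 3] - ybar[1, 1],
             ybar[2, 2] - ybar[2, 1] - ybar[1, 2] + ybar[1, 1],
             ybar[2, 3] - ybar[2, 1] - ybar[1, 3] + ybar[1, 1])

cbind(normal.eq = b, from.means = b.means, lm = coef(fit.base))
```

All three columns agree, which illustrates that each baseline parameter estimate is a simple contrast of cell averages.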
$\Sigma$-restrictions:

$$Y_{ijk} = \omega + \gamma_i + \delta_j + \eta_{ij} + \epsilon_{ijk}, \qquad \mu_{ij} = E(Y_{ijk})$$

where

$$\epsilon_{ijk} \sim NID(0, \sigma^2)$$

$$\sum_{i=1}^{a} \gamma_i = 0, \qquad \sum_{j=1}^{b} \delta_j = 0, \qquad \sum_{i=1}^{a} \eta_{ij} = 0 \ \text{for each } j = 1, \ldots, b, \qquad \sum_{j=1}^{b} \eta_{ij} = 0 \ \text{for each } i = 1, \ldots, a$$

$$
\begin{array}{l|ccc|c}
 & \text{Variety 1} & \text{Variety 2} & \text{Variety 3} & \text{Means} \\ \hline
\text{Soil type 1} & \mu_{11} = \omega + \gamma_1 + \delta_1 + \eta_{11} & \mu_{12} = \omega + \gamma_1 + \delta_2 + \eta_{12} & \mu_{13} = \omega + \gamma_1 + \delta_3 + \eta_{13} & \bar\mu_{1\cdot} = \omega + \gamma_1 \\
\text{Soil type 2} & \mu_{21} = \omega + \gamma_2 + \delta_1 + \eta_{21} & \mu_{22} = \omega + \gamma_2 + \delta_2 + \eta_{22} & \mu_{23} = \omega + \gamma_2 + \delta_3 + \eta_{23} & \bar\mu_{2\cdot} = \omega + \gamma_2 \\ \hline
\text{Means} & \bar\mu_{\cdot 1} = \omega + \delta_1 & \bar\mu_{\cdot 2} = \omega + \delta_2 & \bar\mu_{\cdot 3} = \omega + \delta_3 &
\end{array}
$$

Interpretation:

$$\omega = \frac{1}{ab} \sum_{i=1}^{a} \sum_{j=1}^{b} \mu_{ij} = \bar\mu_{\cdot\cdot}$$

is the overall mean germination time, averaging across all soil types and all varieties used in this study.

Interpretation:

$$\omega + \gamma_i = \bar\mu_{i\cdot}, \qquad \gamma_i = \bar\mu_{i\cdot} - \bar\mu_{\cdot\cdot}$$

and

$$\gamma_1 - \gamma_2 = \bar\mu_{1\cdot} - \bar\mu_{2\cdot}$$

is the difference in the mean germination times for different soil types, averaging across varieties.

Interpretation:

$$\omega + \delta_j = \bar\mu_{\cdot j}, \qquad \delta_j = \bar\mu_{\cdot j} - \bar\mu_{\cdot\cdot}$$

and

$$\delta_j - \delta_k = (\bar\mu_{\cdot j} - \bar\mu_{\cdot\cdot}) - (\bar\mu_{\cdot k} - \bar\mu_{\cdot\cdot}) = \bar\mu_{\cdot j} - \bar\mu_{\cdot k}$$

is the difference between mean germination times for varieties j and k, averaging across soil types.

Interaction:

$$\eta_{ij} = \mu_{ij} - (\omega + \gamma_i + \delta_j)$$

is a deviation from an additive model. Then,

$$\eta_{ij} - \eta_{kj} - \eta_{i\ell} + \eta_{k\ell} = \mu_{ij} - \mu_{kj} - \mu_{i\ell} + \mu_{k\ell}$$
Matrix formulation

$$
\begin{bmatrix}
Y_{111}\\ Y_{112}\\ Y_{113}\\ Y_{121}\\ Y_{122}\\ Y_{131}\\ Y_{132}\\ Y_{211}\\ Y_{212}\\ Y_{213}\\ Y_{214}\\ Y_{221}\\ Y_{231}\\ Y_{232}\\ Y_{233}
\end{bmatrix}
=
\left[\begin{array}{rrrrrr}
1&1&1&0&1&0\\
1&1&1&0&1&0\\
1&1&1&0&1&0\\
1&1&0&1&0&1\\
1&1&0&1&0&1\\
1&1&-1&-1&-1&-1\\
1&1&-1&-1&-1&-1\\
1&-1&1&0&-1&0\\
1&-1&1&0&-1&0\\
1&-1&1&0&-1&0\\
1&-1&1&0&-1&0\\
1&-1&0&1&0&-1\\
1&-1&-1&-1&1&1\\
1&-1&-1&-1&1&1\\
1&-1&-1&-1&1&1
\end{array}\right]
\begin{bmatrix}
\omega\\ \gamma_1\\ \delta_1\\ \delta_2\\ \eta_{11}\\ \eta_{12}
\end{bmatrix}
+
\begin{bmatrix}
\epsilon_{111}\\ \epsilon_{112}\\ \epsilon_{113}\\ \epsilon_{121}\\ \epsilon_{122}\\ \epsilon_{131}\\ \epsilon_{132}\\ \epsilon_{211}\\ \epsilon_{212}\\ \epsilon_{213}\\ \epsilon_{214}\\ \epsilon_{221}\\ \epsilon_{231}\\ \epsilon_{232}\\ \epsilon_{233}
\end{bmatrix}
$$

This uses the $\Sigma$-restrictions to obtain

$$\gamma_2 = -\gamma_1, \qquad \delta_3 = -\delta_1 - \delta_2$$

$$\eta_{21} = -\eta_{11}, \qquad \eta_{13} = -\eta_{11} - \eta_{12}, \qquad \eta_{22} = -\eta_{12}, \qquad \eta_{23} = -\eta_{13} = \eta_{11} + \eta_{12}$$
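Least squares estimation

$$
\mathbf{b} =
\begin{bmatrix}
\hat\omega\\ \hat\gamma_1\\ \hat\delta_1\\ \hat\delta_2\\ \hat\eta_{11}\\ \hat\eta_{12}
\end{bmatrix}
= (X^T X)^{-1} X^T \mathbf{Y}
=
\begin{bmatrix}
\frac{1}{6} \sum_i \sum_j \bar Y_{ij\cdot}\\[1ex]
\frac{1}{3} \sum_j \bar Y_{1j\cdot} - \frac{1}{6} \sum_i \sum_j \bar Y_{ij\cdot}\\[1ex]
\frac{1}{2} \sum_i \bar Y_{i1\cdot} - \frac{1}{6} \sum_i \sum_j \bar Y_{ij\cdot}\\[1ex]
\frac{1}{2} \sum_i \bar Y_{i2\cdot} - \frac{1}{6} \sum_i \sum_j \bar Y_{ij\cdot}\\[1ex]
\bar Y_{11\cdot} - \hat\omega - \hat\gamma_1 - \hat\delta_1\\[1ex]
\bar Y_{12\cdot} - \hat\omega - \hat\gamma_1 - \hat\delta_2
\end{bmatrix}
=
\begin{bmatrix}
16.83\\ -3.17\\ -4.33\\ 5.67\\ -0.33\\ -5.33
\end{bmatrix}
$$

If restrictions are placed on "nonestimable" functions of parameters in the unrestricted "effects" model, then

  The resulting models are reparameterizations of each other.

  The quantities
  $$\hat{\mathbf{Y}} = P_X \mathbf{Y}$$
  $$\mathbf{e} = \mathbf{Y} - \hat{\mathbf{Y}} = (I - P_X)\mathbf{Y}$$
  $$SSE = \mathbf{e}^T \mathbf{e} = \mathbf{Y}^T (I - P_X) \mathbf{Y}$$
  $$\hat{\mathbf{Y}}^T \hat{\mathbf{Y}} = \mathbf{Y}^T P_X \mathbf{Y}$$
  $$SS_{model} = \mathbf{Y}^T (P_X - P_1) \mathbf{Y}$$
  are the same for any set of restrictions.

  The solution to the normal equations
  $$\mathbf{b} = (X^T X)^{-1} X^T \mathbf{Y}$$
  and interpretations of the corresponding parameters will not be the same for all such sets of restrictions.

If you were to place restrictions on estimable functions of parameters in

$$Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ijk}$$

then you would change

  rank(X)
  the space spanned by the columns of X
  $\hat{\mathbf{Y}} = X (X^T X)^{-} X^T \mathbf{Y}$ and OLS estimators of other estimable quantities.

Analysis of variance

$$
\begin{aligned}
\mathbf{Y}^T \mathbf{Y} &= \mathbf{Y}^T P_\mu \mathbf{Y} + \mathbf{Y}^T (P_{\mu,\alpha} - P_\mu) \mathbf{Y} + \mathbf{Y}^T (P_{\mu,\alpha,\beta} - P_{\mu,\alpha}) \mathbf{Y} + \mathbf{Y}^T (P_X - P_{\mu,\alpha,\beta}) \mathbf{Y} + \mathbf{Y}^T (I - P_X) \mathbf{Y} \\
&= R(\mu) + R(\alpha \mid \mu) + R(\beta \mid \mu, \alpha) + R(\gamma \mid \mu, \alpha, \beta) + SSE
\end{aligned}
$$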
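The least squares estimates under the $\Sigma$-restrictions, and the claim that different restrictions give the same fitted values and SSE, can both be checked in R. This sketch assumes that `contr.sum` corresponds to the $\Sigma$-restrictions and `contr.treatment` to the first-level baseline restrictions; the fit names are illustrative.

```r
# Carrot data from Example 8.1
carrot <- data.frame(
  Soil    = factor(rep(1:2, times = c(7, 8))),
  Variety = factor(c(1,1,1,2,2,3,3, 1,1,1,1,2,3,3,3)),
  Days    = c(6,10,11,13,15,14,22, 12,15,19,18,31,18,9,12))

fit.sum  <- lm(Days ~ Soil * Variety, data = carrot,
               contrasts = list(Soil = "contr.sum", Variety = "contr.sum"))
fit.base <- lm(Days ~ Soil * Variety, data = carrot,
               contrasts = list(Soil = "contr.treatment",
                                Variety = "contr.treatment"))

# Parameter estimates differ between the two parameterizations ...
round(coef(fit.sum), 2)
round(coef(fit.base), 2)

# ... but fitted values and SSE do not
max.diff <- max(abs(fitted(fit.sum) - fitted(fit.base)))
sse <- c(sum = deviance(fit.sum), baseline = deviance(fit.base))
max.diff
sse
```

The first set of coefficients reproduces the vector (16.83, -3.17, -4.33, 5.67, -0.33, -5.33) shown above, while the fitted values are the six cell averages under either coding.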
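Partition the model matrix of the unrestricted "effects" model by parameter:

$$
\begin{bmatrix}
Y_{111}\\ Y_{112}\\ Y_{113}\\ Y_{121}\\ Y_{122}\\ Y_{131}\\ Y_{132}\\ Y_{211}\\ Y_{212}\\ Y_{213}\\ Y_{214}\\ Y_{221}\\ Y_{231}\\ Y_{232}\\ Y_{233}
\end{bmatrix}
=
\left[\begin{array}{c|cc|ccc|cccccc}
1&1&0&1&0&0&1&0&0&0&0&0\\
1&1&0&1&0&0&1&0&0&0&0&0\\
1&1&0&1&0&0&1&0&0&0&0&0\\
1&1&0&0&1&0&0&1&0&0&0&0\\
1&1&0&0&1&0&0&1&0&0&0&0\\
1&1&0&0&0&1&0&0&1&0&0&0\\
1&1&0&0&0&1&0&0&1&0&0&0\\
1&0&1&1&0&0&0&0&0&1&0&0\\
1&0&1&1&0&0&0&0&0&1&0&0\\
1&0&1&1&0&0&0&0&0&1&0&0\\
1&0&1&1&0&0&0&0&0&1&0&0\\
1&0&1&0&1&0&0&0&0&0&1&0\\
1&0&1&0&0&1&0&0&0&0&0&1\\
1&0&1&0&0&1&0&0&0&0&0&1\\
1&0&1&0&0&1&0&0&0&0&0&1
\end{array}\right]
\begin{bmatrix}
\mu\\ \alpha_1\\ \alpha_2\\ \beta_1\\ \beta_2\\ \beta_3\\ \gamma_{11}\\ \gamma_{12}\\ \gamma_{13}\\ \gamma_{21}\\ \gamma_{22}\\ \gamma_{23}
\end{bmatrix}
+ \boldsymbol\epsilon
$$

Call the first column $X_\mu$, the next two columns $X_\alpha$, the next three columns $X_\beta$, and the last six columns $X_\gamma$.

Define:

$$X_\mu = \mathbf{1} \qquad\qquad P_\mu = X_\mu (X_\mu^T X_\mu)^{-1} X_\mu^T$$

$$X_{\mu,\alpha} = [X_\mu \mid X_\alpha] \qquad\qquad P_{\mu,\alpha} = X_{\mu,\alpha} (X_{\mu,\alpha}^T X_{\mu,\alpha})^{-} X_{\mu,\alpha}^T$$

$$X_{\mu,\alpha,\beta} = [X_\mu \mid X_\alpha \mid X_\beta] \qquad\qquad P_{\mu,\alpha,\beta} = X_{\mu,\alpha,\beta} (X_{\mu,\alpha,\beta}^T X_{\mu,\alpha,\beta})^{-} X_{\mu,\alpha,\beta}^T$$

$$X = [X_\mu \mid X_\alpha \mid X_\beta \mid X_\gamma] \qquad\qquad P_X = X (X^T X)^{-} X^T$$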
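Each quadratic form $\mathbf{Y}^T P \mathbf{Y}$ in the partition equals the squared length of the projection $P\mathbf{Y}$, that is, the sum of squared fitted values from the corresponding submodel. The R sketch below uses that fact to compute the sequential sums of squares for the carrot data without forming any projection matrix. The helper `ss()` is a hypothetical convenience function, not part of S-PLUS or R.

```r
# Carrot data from Example 8.1
carrot <- data.frame(
  Soil    = factor(rep(1:2, times = c(7, 8))),
  Variety = factor(c(1,1,1,2,2,3,3, 1,1,1,1,2,3,3,3)),
  Days    = c(6,10,11,13,15,14,22, 12,15,19,18,31,18,9,12))

# Y' P Y for the column space of a given model: sum of squared fitted values
ss <- function(f) sum(fitted(lm(f, data = carrot))^2)

R.mu     <- ss(Days ~ 1)                # Y' P_mu Y
R.mu.a   <- ss(Days ~ Soil)             # Y' P_{mu,alpha} Y
R.mu.a.b <- ss(Days ~ Soil + Variety)   # Y' P_{mu,alpha,beta} Y
R.full   <- ss(Days ~ Soil * Variety)   # Y' P_X Y
total    <- sum(carrot$Days^2)          # Y'Y

parts <- c("R(mu)"                  = R.mu,
           "R(alpha|mu)"            = R.mu.a - R.mu,
           "R(beta|mu,alpha)"       = R.mu.a.b - R.mu.a,
           "R(gamma|mu,alpha,beta)" = R.full - R.mu.a.b,
           "SSE"                    = total - R.full)
round(parts, 3)
```

The five pieces add up to $\mathbf{Y}^T\mathbf{Y}$, and the three middle entries are the Type I sums of squares for a fit with soil entered before variety.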
The following three model matrices correspond to reparameterizations of the same model:

Unrestricted "effects" model, with parameters $(\mu, \alpha_1, \alpha_2, \beta_1, \beta_2, \beta_3, \gamma_{11}, \gamma_{12}, \gamma_{13}, \gamma_{21}, \gamma_{22}, \gamma_{23})$:

$$
\left[\begin{array}{c|cc|ccc|cccccc}
1&1&0&1&0&0&1&0&0&0&0&0\\
1&1&0&1&0&0&1&0&0&0&0&0\\
1&1&0&1&0&0&1&0&0&0&0&0\\
1&1&0&0&1&0&0&1&0&0&0&0\\
1&1&0&0&1&0&0&1&0&0&0&0\\
1&1&0&0&0&1&0&0&1&0&0&0\\
1&1&0&0&0&1&0&0&1&0&0&0\\
1&0&1&1&0&0&0&0&0&1&0&0\\
1&0&1&1&0&0&0&0&0&1&0&0\\
1&0&1&1&0&0&0&0&0&1&0&0\\
1&0&1&1&0&0&0&0&0&1&0&0\\
1&0&1&0&1&0&0&0&0&0&1&0\\
1&0&1&0&0&1&0&0&0&0&0&1\\
1&0&1&0&0&1&0&0&0&0&0&1\\
1&0&1&0&0&1&0&0&0&0&0&1
\end{array}\right]
$$

Baseline restrictions (left) and $\Sigma$-restrictions (right):

$$
\left[\begin{array}{c|c|cc|cc}
1&0&0&0&0&0\\
1&0&0&0&0&0\\
1&0&0&0&0&0\\
1&0&1&0&0&0\\
1&0&1&0&0&0\\
1&0&0&1&0&0\\
1&0&0&1&0&0\\
1&1&0&0&0&0\\
1&1&0&0&0&0\\
1&1&0&0&0&0\\
1&1&0&0&0&0\\
1&1&1&0&1&0\\
1&1&0&1&0&1\\
1&1&0&1&0&1\\
1&1&0&1&0&1
\end{array}\right]
\qquad\qquad
\left[\begin{array}{r|r|rr|rr}
1&1&1&0&1&0\\
1&1&1&0&1&0\\
1&1&1&0&1&0\\
1&1&0&1&0&1\\
1&1&0&1&0&1\\
1&1&-1&-1&-1&-1\\
1&1&-1&-1&-1&-1\\
1&-1&1&0&-1&0\\
1&-1&1&0&-1&0\\
1&-1&1&0&-1&0\\
1&-1&1&0&-1&0\\
1&-1&0&1&0&-1\\
1&-1&-1&-1&1&1\\
1&-1&-1&-1&1&1\\
1&-1&-1&-1&1&1
\end{array}\right]
$$
$R(\mu) = \mathbf{Y}^T P_\mu \mathbf{Y}$ is the same for all three models: the first column of each of the three model matrices is the vector of ones.
$R(\mu, \alpha) = \mathbf{Y}^T P_{\mu,\alpha} \mathbf{Y}$ is the same for all three models and so is $R(\alpha \mid \mu) = R(\mu, \alpha) - R(\mu)$: the columns for $\mu$ and $\alpha$ in the unrestricted model matrix span the same space as the first two columns of each restricted model matrix.
$R(\mu, \alpha, \beta) = \mathbf{Y}^T P_{\mu,\alpha,\beta} \mathbf{Y}$ is the same for all three models and so is $R(\beta \mid \mu, \alpha) = R(\mu, \alpha, \beta) - R(\mu, \alpha)$: the columns for $\mu$, $\alpha$ and $\beta$ in the unrestricted model matrix span the same space as the first four columns of each restricted model matrix.
$R(\mu, \alpha, \beta, \gamma) = \mathbf{Y}^T P_X \mathbf{Y}$ is the same for all three models and so is $R(\gamma \mid \mu, \alpha, \beta) = R(\mu, \alpha, \beta, \gamma) - R(\mu, \alpha, \beta)$: the full set of columns of each of the three model matrices spans the same space.
Consequently, the partition

$$
\begin{aligned}
\mathbf{Y}^T \mathbf{Y} &= \mathbf{Y}^T P_\mu \mathbf{Y} + \mathbf{Y}^T (P_{\mu,\alpha} - P_\mu) \mathbf{Y} + \mathbf{Y}^T (P_{\mu,\alpha,\beta} - P_{\mu,\alpha}) \mathbf{Y} + \mathbf{Y}^T (P_X - P_{\mu,\alpha,\beta}) \mathbf{Y} + \mathbf{Y}^T (I - P_X) \mathbf{Y} \\
&= R(\mu) + R(\alpha \mid \mu) + R(\beta \mid \mu, \alpha) + R(\gamma \mid \mu, \alpha, \beta) + SSE
\end{aligned}
$$

is the same for all three models.
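Normal Theory Gauss-Markov Model

$$\epsilon_{ijk} \sim NID(0, \sigma^2)$$

By Cochran's Theorem, these quadratic forms (sums of squares), after division by $\sigma^2$, have independent chi-square distributions with $1$, $a - 1$, $b - 1$, $(a - 1)(b - 1)$, and $n_{\cdot\cdot} - ab$ degrees of freedom, respectively, when $n_{ij} > 0$ for all $(i, j)$.

Using result 4.7, we have also shown earlier that

$$SSE = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n_{ij}} (Y_{ijk} - \bar Y_{ij\cdot})^2 = \mathbf{Y}^T (I - P_X) \mathbf{Y} \sim \sigma^2 \chi^2_{n_{\cdot\cdot} - ab}$$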
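The formula for SSE as a pooled within-cell sum of squares is easy to evaluate directly. The following R sketch computes it for the carrot data using `ave()` to attach each observation's cell average, and compares it with the residual sum of squares from the full interaction model. Object names are illustrative.

```r
# Carrot data from Example 8.1
carrot <- data.frame(
  Soil    = factor(rep(1:2, times = c(7, 8))),
  Variety = factor(c(1,1,1,2,2,3,3, 1,1,1,1,2,3,3,3)),
  Days    = c(6,10,11,13,15,14,22, 12,15,19,18,31,18,9,12))

# Cell average Ybar_ij. repeated for every observation in cell (i, j)
cell.mean <- ave(carrot$Days, carrot$Soil, carrot$Variety)

SSE      <- sum((carrot$Days - cell.mean)^2)
df.error <- nrow(carrot) - nlevels(carrot$Soil) * nlevels(carrot$Variety)  # n.. - ab
MSE      <- SSE / df.error

# Same quantity as Y'(I - P_X)Y from the fitted interaction model
SSE.lm <- deviance(lm(Days ~ Soil * Variety, data = carrot))

c(SSE = SSE, df = df.error, MSE = MSE, SSE.lm = SSE.lm)
```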
Produce ANOVA tables:

The lm( ) function in S-PLUS:

To allow the lm( ) function to fit a model involving classification variables, create factors.

> carrot <- read.table("carrots.dat",
      col.names=c("Soil","Variety","Days"))
> carrot$Soil <- as.factor(carrot$Soil)
> carrot$Variety <- as.factor(carrot$Variety)
> options(contrasts=
      c("contr.sum","contr.poly"))

> lm.out1 <- lm(Days ~ Soil*Variety,
      data=carrot)
> anova(lm.out1)
      $R(\alpha \mid \mu)$
      $R(\beta \mid \mu, \alpha)$
      $R(\gamma \mid \mu, \alpha, \beta)$
      SSE

> lm.out2 <- lm(Days ~ Variety*Soil,
      data=carrot)
> anova(lm.out2)
      $R(\beta \mid \mu)$
      $R(\alpha \mid \mu, \beta)$
      $R(\gamma \mid \mu, \alpha, \beta)$
      SSE

There are four options for creating columns in the model matrix for classification variables:

  contr.helmert
  contr.treatment   sets $\alpha_1 = 0$, $\beta_1 = 0$, $\gamma_{1j} = 0$ for all j, $\gamma_{i1} = 0$ for all i
  contr.sum         $\Sigma$ constraints
  contr.poly        orthogonal polynomial contrasts (equal spacing, equal sample sizes)
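To see what each option does to the model matrix, it helps to print the coding matrices for a three-level factor such as variety. The sketch below uses the contrast functions as they exist in R; S-PLUS output may be laid out differently. Each row is a factor level and each column is one column contributed to the model matrix.

```r
ct <- contr.treatment(3)  # baseline coding: level 1 is a row of zeros
cs <- contr.sum(3)        # sum-to-zero coding: last level is minus the sum of the others
ch <- contr.helmert(3)    # Helmert coding: each level against the mean of earlier levels
cp <- contr.poly(3)       # orthonormal linear and quadratic polynomial columns

ct
cs
ch
round(cp, 4)
```

With `contr.treatment` the parameters of the first level are set to zero, and with `contr.sum` the parameter for the last level is recovered as minus the sum of the others, which matches the two sets of restrictions described above.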
> # This file is posted as carrots.ssc
> carrot <- read.table("c:\\carrots.dat",
+     col.names=c("Soil","Variety","Days"))
> carrot$Soil <- as.factor(carrot$Soil)
> carrot$Variety <- as.factor(carrot$Variety)
> carrot
   Soil Variety Days
 1    1       1    6
 2    1       1   10
 3    1       1   11
 4    1       2   13
 5    1       2   15
 6    1       3   14
 7    1       3   22
 8    2       1   12
 9    2       1   15
10    2       1   19
11    2       1   18
12    2       2   31
13    2       3   18
14    2       3    9
15    2       3   12
# Compute sample means of germination
# times for all combinations of soil
# type and varieties of carrot seeds
# and make a profile plot.
> means <- tapply(carrot$Days,
      list(carrot$Variety,carrot$Soil),mean)
> means
   1  2
1  9 16
2 14 31
3 18 13

# Set up the axes and title of the
# profile plot.
> par(fin=c(7,7),cex=1.2,lwd=3,mex=1.5)
> x.axis <- unique(carrot$Variety)
> matplot(c(1,3,1), c(0,40,10), type="n",
      xlab="Variety", ylab="Mean Time",
      main= "Average Time to Carrot Seed
      Germination")

# Add a profile for each soil type
> matlines(x.axis,means,type='l',
      lty=c(1,3),lwd=3)

# Plot points for the observations
> matpoints(x.axis,means, pch=c(1,16))

# Add a legend to the plot
> legend(2.,38.6,
      legend=c('Soil Type 1','Soil Type 2'),
      lty=c(1,3),bty='n')
[Figure: profile plot "Average Time to Carrot Seed Germination". Mean Time (vertical axis, 0 to 40) against Variety (horizontal axis, 1 to 3), with one profile for Soil Type 1 and one for Soil Type 2.]

# Fit a model with main effects and interaction
# Compute both sets of Type I sums of squares
> options(contrasts=
      c("contr.sum","contr.poly"))
> lm.out1 <- lm(Days~Soil*Variety,data=carrot)
> anova(lm.out1)
Analysis of Variance Table

Response: Days

Terms added sequentially (first to last)
             Df  Sum Sq  Mean Sq  F Value  Pr(F)
        Soil  1  52.500  52.5000 3.937500 0.0785
     Variety  2 124.734  62.3670 4.677527 0.0405
Soil:Variety  2 222.766 111.3830 8.353723 0.0089
   Residuals  9 120.000  13.3333

> lm.out2 <- lm(Days~Variety*Soil,data=carrot)
> anova(lm.out2)
Analysis of Variance Table

Response: Days

Terms added sequentially (first to last)
             Df   Sum Sq  Mean Sq  F Value  Pr(F)
     Variety  2  93.3333  46.6667 3.500000 0.0751
        Soil  1  83.9007  83.9007 6.292553 0.0334
Variety:Soil  2 222.7660 111.3830 8.353723 0.0089
   Residuals  9 120.0000  13.3333

# Create a data frame containing the original
# data and the residuals and estimated means
> data.frame(carrot$Soil,carrot$Variety,
      carrot$Days,
      Pred=lm.out1$fitted,
      Resid=round(lm.out1$resid,3))
   X1 X2 X3 Pred Resid
 1  1  1  6    9    -3
 2  1  1 10    9     1
 3  1  1 11    9     2
 4  1  2 13   14    -1
 5  1  2 15   14     1
 6  1  3 14   18    -4
 7  1  3 22   18     4
 8  2  1 12   16    -4
 9  2  1 15   16    -1
10  2  1 19   16     3
11  2  1 18   16     2
12  2  2 31   31     0
13  2  3 18   13     5
14  2  3  9   13    -4
15  2  3 12   13    -1
# Create residual plots
frame( )
par(cex=1.0,mex=1.0,lwd=3,pch=2,
    mkh=0.1,fig=c(0,1,.51,1), pty='m')
plot(lm.out1$fitted, lm.out1$resid,
    xlab="Estimated Means",
    ylab="Residuals")
abline(h=0, lty=2, lwd=3)

par(fig=c(0, 1, 0, 0.49), pty='s')
qqnorm(lm.out1$resid)
qqline(lm.out1$resid)

[Figure: two panels. Top: Residuals against Estimated Means. Bottom: normal probability plot of lm.out1$resid against Quantiles of Standard Normal.]
# Create plots for studentized residuals
# You must attach the MASS library
# to have access to the studres( )
# function that computes studentized
# residuals in the following code
library(MASS)
frame( )
par(cex=1.0,mex=1.0,lwd=3,pch=2,
    mkh=0.1,fin=c(6.5,6.5))
plot(lm.out1$fitted, studres(lm.out1),
    xlab="Estimated Means",
    ylab="Studentized Residuals",
    main="Studentized Residual Plot")
abline(h=0, lty=2, lwd=3)

[Figure: "Studentized Residual Plot". Studentized Residuals against Estimated Means.]

qqnorm(studres(lm.out1),
    main="Studentized Residuals")
qqline(studres(lm.out1))
[Figure: "Studentized Residuals". Normal probability plot of studres(lm.out1) against Quantiles of Standard Normal.]

# Compute Type III sums of squares and F-tests.
# First create the model matrix for
# the cell means model.
> cb <- as.factor(10*as.numeric(carrot$Soil)
      + as.numeric(carrot$Variety))
> lm.out <- lm(carrot$Days ~ cb - 1)
> D <- model.matrix(lm.out)
> D
   cb11 cb12 cb13 cb21 cb22 cb23
 1    1    0    0    0    0    0
 2    1    0    0    0    0    0
 3    1    0    0    0    0    0
 4    0    1    0    0    0    0
 5    0    1    0    0    0    0
 6    0    0    1    0    0    0
 7    0    0    1    0    0    0
 8    0    0    0    1    0    0
 9    0    0    0    1    0    0
10    0    0    0    1    0    0
11    0    0    0    1    0    0
12    0    0    0    0    1    0
13    0    0    0    0    0    1
14    0    0    0    0    0    1
15    0    0    0    0    0    1
# Compute the sample means
> y <- matrix(carrot$Days,ncol=1)
> b <- solve(crossprod(D)) %*% crossprod(D,y)
> b
     [,1]
cb11    9
cb12   14
cb13   18
cb21   16
cb22   31
cb23   13

# Compute Type III sums of squares and
# related F-tests
> s <- length(unique(carrot$Soil))
> t <- length(unique(carrot$Variety))
> yhat <- D %*% b
> sse <- crossprod(y-yhat)
> df2 <- nrow(y) - s*t

# Generate an identity matrix and
# a vector of ones
> Iden <- function(n) diag(rep(1,n))
> one <- function(n) matrix(rep(1,n),ncol=1)
# Soil main effect
> c1 <- kronecker( cbind(Iden(s-1),-one(s-1)),
      t(one(t)) )
> q1 <- t(b) %*% t(c1)%*%
      solve( c1 %*% solve(crossprod(D))
      %*% t(c1))%*% c1 %*% b
> df1<- s-1
> f <- (q1/df1)/(sse/df2)
> p <- 1-pf(f,df1,df2)
> c1
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    1    1   -1   -1   -1
> data.frame(SS=q1,df=df1,F.stat=f,p.value=p)
        SS df   F.stat    p.value
1 123.7714  1 9.282857 0.01386499

# Variety main effect
> c2 <- kronecker( t(one(s)), cbind(Iden(t-1),
      -one(t-1)) )
> q2 <- t(b) %*% t(c2)%*%
      solve( c2 %*% solve(crossprod(D))
      %*% t(c2))%*% c2 %*% b
> df1<- t-1
> f <- (q2/df1)/(sse/df2)
> p <- 1-pf(f,df1,df2)
> c2
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    0   -1    1    0   -1
[2,]    0    1   -1    0    1   -1
> data.frame(SS=q2,df=df1,F.stat=f,p.value=p)
        SS df   F.stat    p.value
1 192.1277  2 7.204787 0.01354629

# Soil by variety interaction
> c3 <- kronecker( cbind(Iden(s-1),-one(s-1)),
      cbind(Iden(t-1),-one(t-1)) )
> q3 <- t(b) %*% t(c3)%*%
      solve( c3 %*% solve(crossprod(D))
      %*% t(c3))%*% c3 %*% b
> df1<- (s-1)*(t-1)
> f <- (q3/df1)/(sse/df2)
> p <- 1-pf(f,df1,df2)
> c3
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,]    1    0   -1   -1    0    1
[2,]    0    1   -1    0   -1    1
> data.frame(SS=q3,df=df1,F.stat=f,p.value=p)
       SS df   F.stat    p.value
1 222.766  2 8.353723 0.00888845
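The three Type III computations above repeat the same quadratic form, $(C\mathbf{b})^T [C (D^T D)^{-1} C^T]^{-1} (C\mathbf{b})$, with a different contrast matrix $C$ each time. A minimal R sketch of that pattern wraps it in one function. The helper `ss.test()` and the object names are hypothetical; the contrast matrices are typed in directly rather than built with `kronecker()`.

```r
# Carrot data from Example 8.1
carrot <- data.frame(
  Soil    = factor(rep(1:2, times = c(7, 8))),
  Variety = factor(c(1,1,1,2,2,3,3, 1,1,1,1,2,3,3,3)),
  Days    = c(6,10,11,13,15,14,22, 12,15,19,18,31,18,9,12))

# Cell means model: one indicator column per (soil, variety) cell, ordered 11, 12, ..., 23
Cell <- factor(paste0(carrot$Soil, carrot$Variety))
D    <- model.matrix(~ Cell - 1)
y    <- carrot$Days
b    <- solve(crossprod(D), crossprod(D, y))   # sample cell means
sse  <- sum((y - D %*% b)^2)
df2  <- length(y) - ncol(D)

# Sum of squares and F-test for H0: C mu = 0 in the cell means model
ss.test <- function(C) {
  Cb  <- C %*% b
  q   <- drop(t(Cb) %*% solve(C %*% solve(crossprod(D)) %*% t(C), Cb))
  df1 <- nrow(C)
  f   <- (q / df1) / (sse / df2)
  data.frame(SS = q, df = df1, F.stat = f, p.value = 1 - pf(f, df1, df2))
}

c.soil    <- matrix(c(1, 1, 1, -1, -1, -1), nrow = 1)
c.variety <- rbind(c(1, 0, -1, 1, 0, -1), c(0, 1, -1, 0, 1, -1))
c.inter   <- rbind(c(1, 0, -1, -1, 0, 1), c(0, 1, -1, 0, -1, 1))

res <- do.call(rbind, list(ss.test(c.soil), ss.test(c.variety), ss.test(c.inter)))
rownames(res) <- c("Soil", "Variety", "Soil:Variety")
res
```

The resulting table collects the same three sums of squares as the session above (123.7714, 192.1277 and 222.766) in one data frame.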
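What null hypotheses are tested by F-tests derived from such ANOVA tables?

Consider Type I sums of squares:

$$
\begin{aligned}
R(\mu) &= \mathbf{Y}^T P_1 \mathbf{Y} \\
&= \mathbf{Y}^T P_1 P_1 \mathbf{Y} \\
&= (P_1 \mathbf{Y})^T (P_1 \mathbf{Y}) \\
&= (\bar Y_{\cdot\cdot\cdot} \mathbf{1})^T (\bar Y_{\cdot\cdot\cdot} \mathbf{1}) = n_{\cdot\cdot} \bar Y_{\cdot\cdot\cdot}^2
\end{aligned}
$$

$$\frac{1}{\sigma^2} R(\mu) \sim \chi^2_1(\delta^2) \qquad \text{and} \qquad F = \frac{R(\mu)}{SSE / (n_{\cdot\cdot} - ab)} \sim F_{(1,\, n_{\cdot\cdot} - ab)}(\delta^2)$$

where

$$
\begin{aligned}
\delta^2 &= \frac{1}{\sigma^2} \boldsymbol\beta^T X^T P_1 X \boldsymbol\beta \\
&= \frac{1}{\sigma^2} (\boldsymbol\beta^T X^T P_1)(P_1 X \boldsymbol\beta) \\
&= \frac{1}{\sigma^2} (P_1 X \boldsymbol\beta)^T (P_1 X \boldsymbol\beta)
\end{aligned}
$$

For the carrot seed germination study:

$$
\begin{aligned}
P_1 X \boldsymbol\beta &= \frac{1}{n_{\cdot\cdot}} \mathbf{1} \mathbf{1}^T X \boldsymbol\beta \\
&= \frac{1}{n_{\cdot\cdot}} \mathbf{1} \, [\, n_{\cdot\cdot},\ n_{1\cdot},\ n_{2\cdot},\ n_{\cdot 1},\ n_{\cdot 2},\ n_{\cdot 3},\ n_{11},\ n_{12},\ n_{13},\ n_{21},\ n_{22},\ n_{23} \,] \, \boldsymbol\beta \\
&= \frac{1}{n_{\cdot\cdot}} \mathbf{1} \left( n_{\cdot\cdot} \mu + \sum_{i=1}^{a} n_{i\cdot} \alpha_i + \sum_{j=1}^{b} n_{\cdot j} \beta_j + \sum_{i=1}^{a} \sum_{j=1}^{b} n_{ij} \gamma_{ij} \right)
\end{aligned}
$$

The null hypothesis is

$$H_0: \ 0 = n_{\cdot\cdot} \mu + \sum_{i=1}^{a} n_{i\cdot} \alpha_i + \sum_{j=1}^{b} n_{\cdot j} \beta_j + \sum_{i} \sum_{j} n_{ij} \gamma_{ij}$$

With respect to the cell means

$$E(Y_{ijk}) = \mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij}$$

this null hypothesis is

$$H_0: \ 0 = \sum_{i=1}^{a} \sum_{j=1}^{b} \frac{n_{ij}}{n_{\cdot\cdot}} \mu_{ij}$$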
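The hypothesis tested by $R(\mu)$ concerns a weighted average of the cell means, with weights $n_{ij}/n_{\cdot\cdot}$. The R sketch below evaluates the corresponding sample quantity for the carrot data and contrasts it with the unweighted average of the cell averages. Object names are illustrative.

```r
# Carrot data from Example 8.1
carrot <- data.frame(
  Soil    = factor(rep(1:2, times = c(7, 8))),
  Variety = factor(c(1,1,1,2,2,3,3, 1,1,1,1,2,3,3,3)),
  Days    = c(6,10,11,13,15,14,22, 12,15,19,18,31,18,9,12))

cells <- list(carrot$Soil, carrot$Variety)
n     <- tapply(carrot$Days, cells, length)   # n_ij
ybar  <- tapply(carrot$Days, cells, mean)     # Ybar_ij.

w.mean <- sum(n * ybar) / sum(n)   # sum_ij (n_ij / n..) Ybar_ij. = Ybar...
u.mean <- mean(ybar)               # (1/ab) sum_ij Ybar_ij.
R.mu   <- sum(n) * w.mean^2        # n.. * Ybar...^2

c(weighted = w.mean, unweighted = u.mean, R.mu = R.mu)
```

In unbalanced data the two averages differ (15 versus about 16.83 here), and it is the weighted one that the Type I test of $R(\mu)$ refers to.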
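Consider

$$R(\alpha \mid \mu) = \mathbf{Y}^T (P_{\mu,\alpha} - P_\mu) \mathbf{Y}$$

Here,

$$\frac{1}{\sigma^2} R(\alpha \mid \mu) \sim \chi^2_{a-1}(\delta^2) \qquad \text{and} \qquad F = \frac{R(\alpha \mid \mu) / (a - 1)}{MSE} \sim F_{(a-1,\, n_{\cdot\cdot} - ab)}(\delta^2)$$

where $a - 1 = \text{rank}(X_{\mu,\alpha}) - \text{rank}(X_\mu)$ and

$$
\begin{aligned}
\delta^2 &= \frac{1}{\sigma^2} \boldsymbol\beta^T X^T (P_{\mu,\alpha} - P_\mu) X \boldsymbol\beta \\
&= \frac{1}{\sigma^2} [(P_{\mu,\alpha} - P_\mu) X \boldsymbol\beta]^T [(P_{\mu,\alpha} - P_\mu) X \boldsymbol\beta]
\end{aligned}
$$

For the general effects model for the carrot seed germination study:

$$
\begin{aligned}
P_{\mu,\alpha} X &= X_{\mu,\alpha} (X_{\mu,\alpha}^T X_{\mu,\alpha})^{-} X_{\mu,\alpha}^T X \\[1ex]
&= X_{\mu,\alpha}
\begin{bmatrix}
n_{\cdot\cdot} & n_{1\cdot} & n_{2\cdot} \\
n_{1\cdot} & n_{1\cdot} & 0 \\
n_{2\cdot} & 0 & n_{2\cdot}
\end{bmatrix}^{-}
\left[\begin{array}{cccccccccccc}
n_{\cdot\cdot} & n_{1\cdot} & n_{2\cdot} & n_{\cdot 1} & n_{\cdot 2} & n_{\cdot 3} & n_{11} & n_{12} & n_{13} & n_{21} & n_{22} & n_{23} \\
n_{1\cdot} & n_{1\cdot} & 0 & n_{11} & n_{12} & n_{13} & n_{11} & n_{12} & n_{13} & 0 & 0 & 0 \\
n_{2\cdot} & 0 & n_{2\cdot} & n_{21} & n_{22} & n_{23} & 0 & 0 & 0 & n_{21} & n_{22} & n_{23}
\end{array}\right] \\[1ex]
&= X_{\mu,\alpha}
\begin{bmatrix}
0 & 0 & 0 \\
0 & \frac{1}{n_{1\cdot}} & 0 \\
0 & 0 & \frac{1}{n_{2\cdot}}
\end{bmatrix}
\left[\begin{array}{cccccccccccc}
n_{\cdot\cdot} & n_{1\cdot} & n_{2\cdot} & n_{\cdot 1} & n_{\cdot 2} & n_{\cdot 3} & n_{11} & n_{12} & n_{13} & n_{21} & n_{22} & n_{23} \\
n_{1\cdot} & n_{1\cdot} & 0 & n_{11} & n_{12} & n_{13} & n_{11} & n_{12} & n_{13} & 0 & 0 & 0 \\
n_{2\cdot} & 0 & n_{2\cdot} & n_{21} & n_{22} & n_{23} & 0 & 0 & 0 & n_{21} & n_{22} & n_{23}
\end{array}\right] \\[1ex]
&=
\begin{bmatrix}
1 & 1 & 0 \\
\vdots & \vdots & \vdots \\
1 & 1 & 0 \\
1 & 0 & 1 \\
\vdots & \vdots & \vdots \\
1 & 0 & 1
\end{bmatrix}
\left[\begin{array}{cccccccccccc}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & \frac{n_{11}}{n_{1\cdot}} & \frac{n_{12}}{n_{1\cdot}} & \frac{n_{13}}{n_{1\cdot}} & \frac{n_{11}}{n_{1\cdot}} & \frac{n_{12}}{n_{1\cdot}} & \frac{n_{13}}{n_{1\cdot}} & 0 & 0 & 0 \\
1 & 0 & 1 & \frac{n_{21}}{n_{2\cdot}} & \frac{n_{22}}{n_{2\cdot}} & \frac{n_{23}}{n_{2\cdot}} & 0 & 0 & 0 & \frac{n_{21}}{n_{2\cdot}} & \frac{n_{22}}{n_{2\cdot}} & \frac{n_{23}}{n_{2\cdot}}
\end{array}\right]
\end{aligned}
$$

Then, the first seven rows of $(P_{\mu,\alpha} - P_\mu) X \boldsymbol\beta$ are

$$\left[ \mu + \alpha_1 + \sum_{j=1}^{b} \frac{n_{1j}}{n_{1\cdot}} (\beta_j + \gamma_{1j}) \right] - \left[ \mu + \sum_{i=1}^{a} \frac{n_{i\cdot}}{n_{\cdot\cdot}} \alpha_i + \sum_{j=1}^{b} \frac{n_{\cdot j}}{n_{\cdot\cdot}} \beta_j + \sum_{i} \sum_{j} \frac{n_{ij}}{n_{\cdot\cdot}} \gamma_{ij} \right]$$

The last eight rows of $(P_{\mu,\alpha} - P_\mu) X \boldsymbol\beta$ are

$$\left[ \mu + \alpha_2 + \sum_{j=1}^{b} \frac{n_{2j}}{n_{2\cdot}} (\beta_j + \gamma_{2j}) \right] - \left[ \mu + \sum_{i=1}^{a} \frac{n_{i\cdot}}{n_{\cdot\cdot}} \alpha_i + \sum_{j=1}^{b} \frac{n_{\cdot j}}{n_{\cdot\cdot}} \beta_j + \sum_{i} \sum_{j} \frac{n_{ij}}{n_{\cdot\cdot}} \gamma_{ij} \right]$$

The null hypothesis is

$$H_0: \ \alpha_i + \sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}} (\beta_j + \gamma_{ij}) \ \text{ are all equal } (i = 1, \ldots, a)$$

With respect to the cell means model,

$$\mu_{ij} = E(Y_{ijk}) = \mu + \alpha_i + \beta_j + \gamma_{ij},$$

this null hypothesis is

$$H_0: \ \sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}} \mu_{ij} \ \text{ are all equal } (i = 1, \ldots, a).$$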
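The Type I hypothesis for soil compares row averages of the cell means weighted by $n_{ij}/n_{i\cdot}$, which is why $R(\alpha \mid \mu)$ depends on the sample sizes. The R sketch below computes the sample versions of these weighted row averages, compares them with the unweighted row averages, and rebuilds $R(\alpha \mid \mu)$ from them. Object names are illustrative.

```r
# Carrot data from Example 8.1
carrot <- data.frame(
  Soil    = factor(rep(1:2, times = c(7, 8))),
  Variety = factor(c(1,1,1,2,2,3,3, 1,1,1,1,2,3,3,3)),
  Days    = c(6,10,11,13,15,14,22, 12,15,19,18,31,18,9,12))

cells <- list(carrot$Soil, carrot$Variety)
n     <- tapply(carrot$Days, cells, length)   # n_ij
ybar  <- tapply(carrot$Days, cells, mean)     # Ybar_ij.

w.row <- rowSums(n * ybar) / rowSums(n)   # sum_j (n_ij / n_i.) Ybar_ij. = Ybar_i..
u.row <- rowMeans(ybar)                   # (1/b) sum_j Ybar_ij.
grand <- sum(n * ybar) / sum(n)           # Ybar...

# R(alpha | mu) written in terms of the weighted row averages
R.alpha <- sum(rowSums(n) * (w.row - grand)^2)

# The same sum of squares from a sequential fit with soil entered first
R.alpha.lm <- anova(lm(Days ~ Soil, data = carrot))["Soil", "Sum Sq"]

rbind(weighted = w.row, unweighted = u.row)
c(R.alpha = R.alpha, R.alpha.lm = R.alpha.lm)
```

The weighted row averages (13 and 16.75) differ by much less than the unweighted ones (about 13.67 and 20), which helps explain why the Type I sum of squares for soil (52.5) is smaller than the Type III value (123.77) computed earlier.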