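Comments: For $R(\alpha \mid \mu)$,

(i) $\sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}}\,\mu_{ij}$ may not be equal for all $i = 1, \ldots, a$, when $\frac{1}{b}\sum_{j=1}^{b} \mu_{ij}$ are equal for all $i = 1, \ldots, a$.

(ii) $\sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}}\,\mu_{ij}$ may be equal for all $i = 1, \ldots, a$, when $\frac{1}{b}\sum_{j=1}^{b} \mu_{ij}$ are not equal for some $i = 1, \ldots, a$.

Consider
\[ R(\beta \mid \mu, \alpha) = \mathbf{Y}^T (P_{\mu,\alpha,\beta} - P_{\mu,\alpha}) \mathbf{Y} \]
and the corresponding F-statistic
\[ F = \frac{R(\beta \mid \mu, \alpha)/(b-1)}{MSE} \sim F_{(b-1,\; n_{\cdot\cdot}-ab)}(\delta^2). \]
Here,
\[ \frac{1}{\sigma^2} R(\beta \mid \mu, \alpha) \sim \chi^2_{(b-1)}(\delta^2), \]
with
\[ \mathrm{rank}(X_{\mu,\alpha,\beta}) - \mathrm{rank}(X_{\mu,\alpha}) = [1+(a-1)+(b-1)] - [1+(a-1)] = b-1 \]
degrees of freedom, and
\[ \delta^2 = \frac{1}{\sigma^2}\, [(P_{\mu,\alpha,\beta} - P_{\mu,\alpha}) X\boldsymbol{\beta}]^T [(P_{\mu,\alpha,\beta} - P_{\mu,\alpha}) X\boldsymbol{\beta}], \]
where $X\boldsymbol{\beta} = E(\mathbf{Y})$.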
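\[ P_{\mu,\alpha,\beta} X = X_{\mu,\alpha,\beta}\,(X_{\mu,\alpha,\beta}^T X_{\mu,\alpha,\beta})^{-}\,X_{\mu,\alpha,\beta}^T X, \]
where, for $a = 2$ and $b = 3$,
\[
X_{\mu,\alpha,\beta}^T X_{\mu,\alpha,\beta} =
\begin{bmatrix}
n_{\cdot\cdot} & n_{1\cdot} & n_{2\cdot} & n_{\cdot 1} & n_{\cdot 2} & n_{\cdot 3} \\
n_{1\cdot} & n_{1\cdot} & 0 & n_{11} & n_{12} & n_{13} \\
n_{2\cdot} & 0 & n_{2\cdot} & n_{21} & n_{22} & n_{23} \\
n_{\cdot 1} & n_{11} & n_{21} & n_{\cdot 1} & 0 & 0 \\
n_{\cdot 2} & n_{12} & n_{22} & 0 & n_{\cdot 2} & 0 \\
n_{\cdot 3} & n_{13} & n_{23} & 0 & 0 & n_{\cdot 3}
\end{bmatrix}.
\]

For a nonsingular partitioned matrix (call this $\begin{bmatrix} A & B \\ B^T & C \end{bmatrix}$),
\[
\begin{bmatrix} A & B \\ B^T & C \end{bmatrix}^{-1}
=
\begin{bmatrix}
W & -W B C^{-1} \\
-C^{-1} B^T W & C^{-1} + C^{-1} B^T W B C^{-1}
\end{bmatrix},
\qquad \text{where } W = [A - B C^{-1} B^T]^{-1}.
\]
Equivalently,
\[
\begin{bmatrix} A & B \\ B^T & C \end{bmatrix}^{-1}
=
\begin{bmatrix} 0 & 0 \\ 0 & C^{-1} \end{bmatrix}
+
\begin{bmatrix} I \\ -C^{-1} B^T \end{bmatrix}
[A - B C^{-1} B^T]^{-1}
\begin{bmatrix} I \mid -B C^{-1} \end{bmatrix}
\]
\[
=
\begin{bmatrix} A^{-1} & 0 \\ 0 & 0 \end{bmatrix}
+
\begin{bmatrix} -A^{-1} B \\ I \end{bmatrix}
[C - B^T A^{-1} B]^{-1}
\begin{bmatrix} -B^T A^{-1} \mid I \end{bmatrix}.
\]

The null hypothesis is
\[
H_0: \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} (\beta_j + \gamma_{ij})
- \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} \sum_{k=1}^{b} \frac{n_{ik}}{n_{i\cdot}} (\beta_k + \gamma_{ik}) = 0
\quad \text{for all } j = 1, \ldots, b.
\]

With respect to the cell means, $E(Y_{ijk}) = \mu_{ij}$, this null hypothesis is
\[
H_0: \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}}\, \mu_{ij}
- \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} \sum_{k=1}^{b} \frac{n_{ik}}{n_{i\cdot}}\, \mu_{ik} = 0
\quad \text{for all } j = 1, 2, \ldots, b.
\]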
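The partitioned-inverse identity quoted in this part of the notes can be checked numerically. The following is a minimal sketch in R; the 3 x 3 matrix `M` is a hypothetical symmetric positive definite matrix chosen only for illustration and is not taken from the notes.

```r
# Hypothetical symmetric positive definite matrix, partitioned as [A B; t(B) C]
M <- matrix(c(4, 1, 2,
              1, 3, 0,
              2, 0, 5), nrow = 3, byrow = TRUE)
A <- M[1:2, 1:2, drop = FALSE]
B <- M[1:2, 3, drop = FALSE]
C <- M[3, 3, drop = FALSE]

Cinv <- solve(C)
W <- solve(A - B %*% Cinv %*% t(B))        # W = [A - B C^{-1} B^T]^{-1}

# Assemble the inverse block by block
Minv <- rbind(cbind(W, -W %*% B %*% Cinv),
              cbind(-Cinv %*% t(B) %*% W,
                    Cinv + Cinv %*% t(B) %*% W %*% B %*% Cinv))

# Largest absolute difference from the directly computed inverse
max(abs(Minv - solve(M)))
```

The difference should be zero up to rounding error, which confirms the block formula for this example.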
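Consider
\[ R(\gamma \mid \mu, \alpha, \beta) = \mathbf{Y}^T [P_X - P_{\mu,\alpha,\beta}] \mathbf{Y} \]
and the associated F-statistic
\[ F = \frac{R(\gamma \mid \mu, \alpha, \beta)/[(a-1)(b-1)]}{MSE} \sim F_{((a-1)(b-1),\; n_{\cdot\cdot}-ab)}(\delta^2). \]
The null hypothesis is:
\[ H_0: (\mu_{ij} - \mu_{i\ell} - \mu_{kj} + \mu_{k\ell}) = (\gamma_{ij} - \gamma_{i\ell} - \gamma_{kj} + \gamma_{k\ell}) = 0 \quad \text{for all } (i,j) \text{ and } (k,\ell). \]

Type I sums of squares

Source of variation   d.f.                 Sums of squares                               Mean square   F      p-val
Soil types            a-1 = 1              $R(\alpha\mid\mu) = 52.5$                      52.5          3.94   .0785
Var.                  b-1 = 2              $R(\beta\mid\mu,\alpha) = 124.73$              62.4          4.68   .0405
Interaction           (a-1)(b-1) = 2       $R(\gamma\mid\mu,\alpha,\beta) = 222.76$       111.38        8.35   .0089
Resid.                $n_{\cdot\cdot}-ab = 9$    $\mathbf{Y}^T(I-P_X)\mathbf{Y} = 120$    13.33
Corr. total           $n_{\cdot\cdot}-1 = 14$    $\mathbf{Y}^T(I-P_1)\mathbf{Y} = 520$
Corr. for the mean    1                    $R(\mu) = 3375$

Summary:

Sums of squares and associated null hypotheses when factor A is entered first.

$R(\mu)$:
\[ H_0: \mu + \sum_{i=1}^{a} \frac{n_{i\cdot}}{n_{\cdot\cdot}} \alpha_i + \sum_{j=1}^{b} \frac{n_{\cdot j}}{n_{\cdot\cdot}} \beta_j + \sum_{i=1}^{a}\sum_{j=1}^{b} \frac{n_{ij}}{n_{\cdot\cdot}} \gamma_{ij} = 0
\quad \Big(\text{or } H_0: \sum_{i=1}^{a}\sum_{j=1}^{b} \frac{n_{ij}}{n_{\cdot\cdot}} \mu_{ij} = 0 \Big) \]

$R(\alpha \mid \mu)$:
\[ H_0: \alpha_i + \sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}} (\beta_j + \gamma_{ij}) \text{ are equal}
\quad \Big(\text{or } H_0: \sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}} \mu_{ij} \text{ are equal} \Big) \]

$R(\beta \mid \mu, \alpha)$:
\[ H_0: \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} (\beta_j + \gamma_{ij}) = \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} \sum_{k=1}^{b} \frac{n_{ik}}{n_{i\cdot}} (\beta_k + \gamma_{ik}) \quad \text{for all } j = 1, \ldots, b \]
\[ \Big(\text{or } H_0: \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} \mu_{ij} = \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} \sum_{k=1}^{b} \frac{n_{ik}}{n_{i\cdot}} \mu_{ik} \quad \text{for all } j = 1, \ldots, b \Big) \]

$R(\gamma \mid \mu, \alpha, \beta)$:
\[ H_0: \gamma_{ij} - \gamma_{i\ell} - \gamma_{kj} + \gamma_{k\ell} = 0 \text{ for all } (i,j) \text{ and } (k,\ell)
\quad (\text{or } H_0: \mu_{ij} - \mu_{i\ell} - \mu_{kj} + \mu_{k\ell} = 0 \text{ for all } (i,j) \text{ and } (k,\ell)) \]

Sums of squares and associated null hypotheses when factor B is entered first.

$R(\mu)$:
\[ H_0: \mu + \sum_{i=1}^{a} \frac{n_{i\cdot}}{n_{\cdot\cdot}} \alpha_i + \sum_{j=1}^{b} \frac{n_{\cdot j}}{n_{\cdot\cdot}} \beta_j + \sum_{i=1}^{a}\sum_{j=1}^{b} \frac{n_{ij}}{n_{\cdot\cdot}} \gamma_{ij} = 0
\quad \Big(\text{or } H_0: \sum_{i=1}^{a}\sum_{j=1}^{b} \frac{n_{ij}}{n_{\cdot\cdot}} \mu_{ij} = 0 \Big) \]

$R(\beta \mid \mu)$:
\[ H_0: \beta_j + \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} (\alpha_i + \gamma_{ij}) \text{ are equal for all } j = 1, \ldots, b
\quad \Big(\text{or } H_0: \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} \mu_{ij} \text{ are equal for all } j = 1, \ldots, b \Big) \]

$R(\alpha \mid \mu, \beta)$:
\[ H_0: \sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}} (\alpha_i + \gamma_{ij}) = \sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}} \sum_{k=1}^{a} \frac{n_{kj}}{n_{\cdot j}} (\alpha_k + \gamma_{kj}) \quad \text{for all } i = 1, \ldots, a \]
\[ \Big(\text{or } H_0: \sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}} \mu_{ij} = \sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}} \sum_{k=1}^{a} \frac{n_{kj}}{n_{\cdot j}} \mu_{kj} \quad \text{for all } i = 1, \ldots, a \Big) \]

$R(\gamma \mid \mu, \alpha, \beta)$:
\[ H_0: \gamma_{ij} - \gamma_{i\ell} - \gamma_{kj} + \gamma_{k\ell} = 0 \text{ for all } (i,j) \text{ and } (k,\ell)
\quad (\text{or } H_0: \mu_{ij} - \mu_{i\ell} - \mu_{kj} + \mu_{k\ell} = 0 \text{ for all } (i,j) \text{ and } (k,\ell)) \]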
Type I sums of squares

Source of variat.   d.f.                          sums of squares                               Mean square   F      p-val
"Soils"             a-1 = 1                       $R(\alpha\mid\mu) = 52.50$                     52.5          3.94   .0785
"Var."              b-1 = 2                       $R(\beta\mid\mu,\alpha) = 124.73$              62.4          4.68   .0405
Interaction         (a-1)(b-1) = 2                $R(\gamma\mid\mu,\alpha,\beta) = 222.76$       111.38        8.35   .0089
"Res."              $\sum\sum (n_{ij}-1) = 9$     $\mathbf{Y}^T(I-P_X)\mathbf{Y} = 120.00$       13.33
Corr. total         $n_{\cdot\cdot}-1 = 14$       $\mathbf{Y}^T(I-P_1)\mathbf{Y} = 520.00$

Source of variat.   d.f.                          sums of squares                               Mean square   F      p-val
"Var."              b-1 = 2                       $R(\beta\mid\mu) = 93.33$                      46.67         3.50   .0751
"Soils"             a-1 = 1                       $R(\alpha\mid\mu,\beta) = 83.90$               83.90         6.29   .0334
Interaction         (a-1)(b-1) = 2                $R(\gamma\mid\mu,\alpha,\beta) = 222.76$       111.38        8.35   .0089
"Res."              $\sum\sum (n_{ij}-1) = 9$     $\mathbf{Y}^T(I-P_X)\mathbf{Y} = 120.00$       13.33
Corr. total         $n_{\cdot\cdot}-1 = 14$       $\mathbf{Y}^T(I-P_1)\mathbf{Y} = 520.00$

Type II sums of squares:

Source of variat.   d.f.                          sums of squares                               Mean square   F      p-val
"Soils"             a-1 = 1                       $R(\alpha\mid\mu,\beta) = 83.90$               83.90         6.3    .0339
"Var."              b-1 = 2                       $R(\beta\mid\mu,\alpha) = 124.73$              62.37         4.7    .0405
Interaction         (a-1)(b-1) = 2                $R(\gamma\mid\mu,\alpha,\beta) = 222.76$       111.38        8.4    .0089
"Res."              $n_{\cdot\cdot}-ab = 9$       $\mathbf{Y}^T(I-P_X)\mathbf{Y} = 120$          13.33
Corr. total         $n_{\cdot\cdot}-1 = 14$       $\mathbf{Y}^T(I-P_1)\mathbf{Y} = 520$
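The mean squares, F statistics, and p-values in the Type I table (soil types entered first) follow directly from the sums of squares and degrees of freedom. A minimal sketch in R, using only numbers copied from that table; the variable names are illustrative.

```r
# Type I sums of squares (soil types first), copied from the table
ss  <- c(soils = 52.5, varieties = 124.73, interaction = 222.76)
df1 <- c(1, 2, 2)
sse <- 120                      # residual sum of squares
df2 <- 9                        # residual degrees of freedom

mse <- sse / df2                # error mean square
ms  <- ss / df1                 # mean squares for each line of the table
f   <- ms / mse                 # F statistics
p   <- pf(f, df1, df2, lower.tail = FALSE)   # upper-tail p-values

round(cbind(SS = ss, df = df1, MS = ms, F = f, p = p), 4)
```

The same recipe applies to the Type II table and to the second Type I ordering; only the sums of squares change.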
Examine the soil type effect on time to germination for each variety:

Variety   $\bar{Y}_{1j\cdot}$   $S_{\bar{Y}_{1j\cdot}}$   $\bar{Y}_{2j\cdot}$   $S_{\bar{Y}_{2j\cdot}}$   t       p-value
j=1       9.0                   2.11                      16.0                  1.83                      -2.51   .0333
j=2       14.0                  2.58                      31.0                  3.65                      -3.80   .0042
j=3       18.0                  2.58                      13.0                  2.11                      1.50    .1679

Time to germination for variety 2 is shorter in soil type 1.

Time to germination for variety 1 may also be shorter in soil type 1.

For variety 3 there is no significant difference in average germination times for the two soil types.

[Figure: "Average Time to Carrot Seed Germination". Profile plot of Mean Time (0 to 40) against Variety (1 to 3), with one line for Soil Type 1 and one for Soil Type 2.]
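In the previous analysis:
\[ \bar{Y}_{ij\cdot} = \hat{\mu}_{ij} = \hat{\mu} + \hat{\alpha}_i + \hat{\beta}_j + \hat{\gamma}_{ij} \]
is the OLS estimator (b.l.u.e.) for
\[ \mu_{ij} = \mu + \alpha_i + \beta_j + \gamma_{ij}. \]
Also,
\[ S_{\bar{Y}_{ij\cdot}} = \sqrt{\frac{MSE}{n_{ij}}} \quad \text{for } i = 1, \ldots, a \text{ and } j = 1, \ldots, b. \]
Since $\bar{Y}_{1j\cdot}$ is independent of $\bar{Y}_{2j\cdot}$,
\[ t = \frac{\bar{Y}_{1j\cdot} - \bar{Y}_{2j\cdot}}{\sqrt{MSE \left( \frac{1}{n_{1j}} + \frac{1}{n_{2j}} \right)}} \quad \text{for } j = 1, \ldots, b. \]

Method of Unweighted Means

(Type III sums of squares in SAS when $n_{ij} > 0$ for all $(i,j)$.)

Consider the cell means reparameterization of the model:
\[ Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ijk} = \mu_{ij} + \epsilon_{ijk}. \]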
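The t statistics comparing the two soil types within each variety can be reproduced from the cell means and the error mean square. A minimal sketch in R, assuming the cell counts $n_{ij}$ implied by the design matrix of the cell means model in these notes (3, 2, 2 for soil type 1 and 4, 1, 3 for soil type 2) and MSE = 120/9 on 9 degrees of freedom.

```r
# Cell means and cell counts for soil type 1 and soil type 2, varieties j = 1, 2, 3
ybar1 <- c(9, 14, 18);  n1 <- c(3, 2, 2)
ybar2 <- c(16, 31, 13); n2 <- c(4, 1, 3)
mse <- 120 / 9          # error mean square from the ANOVA table
df  <- 9                # error degrees of freedom

se1   <- sqrt(mse / n1)                       # standard errors of the cell means
se2   <- sqrt(mse / n2)
tstat <- (ybar1 - ybar2) / sqrt(mse * (1 / n1 + 1 / n2))
pval  <- 2 * pt(abs(tstat), df, lower.tail = FALSE)   # two-sided p-values

round(cbind(se1, se2, t = tstat, p = pval), 4)
```

The pooled MSE is used for every comparison, so each t statistic has 9 degrees of freedom even when a cell contains a single observation.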
\[
\begin{bmatrix}
Y_{111} \\ Y_{112} \\ Y_{113} \\ Y_{121} \\ Y_{122} \\ Y_{131} \\ Y_{132} \\ Y_{211} \\ Y_{212} \\ Y_{213} \\ Y_{214} \\ Y_{221} \\ Y_{231} \\ Y_{232} \\ Y_{233}
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix}
\mu_{11} \\ \mu_{12} \\ \mu_{13} \\ \mu_{21} \\ \mu_{22} \\ \mu_{23}
\end{bmatrix}
+
\begin{bmatrix}
\epsilon_{111} \\ \epsilon_{112} \\ \epsilon_{113} \\ \epsilon_{121} \\ \epsilon_{122} \\ \epsilon_{131} \\ \epsilon_{132} \\ \epsilon_{211} \\ \epsilon_{212} \\ \epsilon_{213} \\ \epsilon_{214} \\ \epsilon_{221} \\ \epsilon_{231} \\ \epsilon_{232} \\ \epsilon_{233}
\end{bmatrix}
\]

The model is
\[ \mathbf{Y} = D\boldsymbol{\mu} + \boldsymbol{\epsilon}. \]
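The least squares estimator (b.l.u.e.) for $\boldsymbol{\mu}$ is
\[
\hat{\boldsymbol{\mu}} = (D^T D)^{-1} D^T \mathbf{Y}
=
\begin{bmatrix}
n_{11}^{-1} & & & & & \\
& n_{12}^{-1} & & & & \\
& & n_{13}^{-1} & & & \\
& & & n_{21}^{-1} & & \\
& & & & n_{22}^{-1} & \\
& & & & & n_{23}^{-1}
\end{bmatrix}
\begin{bmatrix}
Y_{11\cdot} \\ Y_{12\cdot} \\ Y_{13\cdot} \\ Y_{21\cdot} \\ Y_{22\cdot} \\ Y_{23\cdot}
\end{bmatrix}
=
\begin{bmatrix}
\bar{Y}_{11\cdot} \\ \bar{Y}_{12\cdot} \\ \bar{Y}_{13\cdot} \\ \bar{Y}_{21\cdot} \\ \bar{Y}_{22\cdot} \\ \bar{Y}_{23\cdot}
\end{bmatrix},
\]
where $Y_{ij\cdot} = \sum_{k} Y_{ijk}$ is a cell total and $\bar{Y}_{ij\cdot} = Y_{ij\cdot}/n_{ij}$ is the corresponding cell mean.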
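The claim that least squares in the cell means model simply returns the cell sample means can be checked numerically. A minimal sketch in R, using the cell counts of the design matrix shown for this model (3, 2, 2, 4, 1, 3); the response vector `y` is simulated and purely hypothetical, since the raw germination times are not listed in this section.

```r
# Cell counts n11, n12, n13, n21, n22, n23 from the design matrix D
nij  <- c(3, 2, 2, 4, 1, 3)
cell <- rep(1:6, times = nij)            # cell label for each of the 15 observations

# 15 x 6 indicator matrix: row r has a 1 in the column of its cell
D <- outer(cell, 1:6, "==") * 1

set.seed(1)
y <- rnorm(sum(nij), mean = 15, sd = 4)  # hypothetical responses, for illustration only

# Least squares estimate (D'D)^{-1} D'y
mu.hat <- solve(crossprod(D), crossprod(D, y))

# Compare with the cell sample means
cbind(least.squares = as.vector(mu.hat),
      cell.means    = as.vector(tapply(y, cell, mean)))
```

Because $D^T D$ is the diagonal matrix of cell counts and $D^T \mathbf{Y}$ is the vector of cell totals, the two columns agree.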
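Test the null hypothesis
\[ H_0: \frac{1}{b}\sum_{j=1}^{b} \mu_{1j} = \frac{1}{b}\sum_{j=1}^{b} \mu_{2j} = \cdots = \frac{1}{b}\sum_{j=1}^{b} \mu_{aj} \]
vs.
\[ H_A: \frac{1}{b}\sum_{j=1}^{b} \mu_{ij} \neq \frac{1}{b}\sum_{j=1}^{b} \mu_{kj} \quad \text{for some } i \neq k. \]

The OLS estimator (b.l.u.e.) for $\frac{1}{b}\sum_{j=1}^{b} \mu_{ij}$ is
\[ \tilde{Y}_{i\cdot\cdot} = \frac{1}{b}\sum_{j=1}^{b} \bar{Y}_{ij\cdot} \]
with
\[ Var(\tilde{Y}_{i\cdot\cdot}) = \frac{1}{b^2}\sum_{j=1}^{b} \frac{\sigma^2}{n_{ij}} = \frac{\sigma^2}{b^2}\sum_{j=1}^{b} \frac{1}{n_{ij}}. \]

Express the null hypothesis in matrix form:
\[ H_0: C_1 \boldsymbol{\mu} = \mathbf{0} \]
where
\[
C_1 = [\, I_{a-1} \mid -\mathbf{1}_{a-1} \,] \otimes \mathbf{1}_b^T
=
\begin{bmatrix}
\mathbf{1}_b^T & & & -\mathbf{1}_b^T \\
& \ddots & & \vdots \\
& & \mathbf{1}_b^T & -\mathbf{1}_b^T
\end{bmatrix},
\qquad
\boldsymbol{\mu} = (\mu_{11}, \mu_{12}, \ldots, \mu_{1b}, \mu_{21}, \ldots, \mu_{2b}, \ldots, \mu_{ab})^T,
\]
so that
\[
C_1 \boldsymbol{\mu} =
\begin{bmatrix}
\sum_j \mu_{1j} - \sum_j \mu_{aj} \\
\vdots \\
\sum_j \mu_{a-1,j} - \sum_j \mu_{aj}
\end{bmatrix}.
\]

Compute
\[ SS_{H_0} = (C_1\mathbf{b} - \mathbf{0})^T [C_1 (D^T D)^{-1} C_1^T]^{-1} (C_1\mathbf{b} - \mathbf{0}) \]
\[ = \mathbf{Y}^T D (D^T D)^{-1} C_1^T [C_1 (D^T D)^{-1} C_1^T]^{-1} C_1 (D^T D)^{-1} D^T \mathbf{Y}. \]

Then
\[
C_1 \mathbf{b} = C_1 (D^T D)^{-1} D^T \mathbf{Y} =
\begin{bmatrix}
\sum_j \bar{Y}_{1j\cdot} - \sum_j \bar{Y}_{aj\cdot} \\
\vdots \\
\sum_j \bar{Y}_{a-1,j\cdot} - \sum_j \bar{Y}_{aj\cdot}
\end{bmatrix}
\]
is the OLS estimator (b.l.u.e.) of $C_1 \boldsymbol{\mu}$, and
\[ Var(C_1 \mathbf{b}) = Var(C_1 (D^T D)^{-1} D^T \mathbf{Y}) \]
\[ = C_1 (D^T D)^{-1} D^T (\sigma^2 I) D (D^T D)^{-1} C_1^T \]
\[ = \sigma^2 C_1 (D^T D)^{-1} D^T D (D^T D)^{-1} C_1^T \]
\[ = \sigma^2 C_1 (D^T D)^{-1} C_1^T. \]

Use result 4.7 to show
\[ \frac{1}{\sigma^2} SS_{H_0} \sim \chi^2_{(a-1)}(\delta^2). \]
Check that
\[ A\Sigma = \frac{1}{\sigma^2} D (D^T D)^{-1} C_1^T [C_1 (D^T D)^{-1} C_1^T]^{-1} C_1 (D^T D)^{-1} D^T (\sigma^2 I) \]
is idempotent and that
\[ a - 1 = \mathrm{rank}(C_1 (D^T D)^{-1} C_1^T). \]

Compute:
\[ SSE = \mathbf{Y}^T (I - P_D) \mathbf{Y} \]
where
\[ P_D = D (D^T D)^{-1} D^T. \]
Use result 4.7 to show
\[ \frac{1}{\sigma^2} SSE \sim \chi^2_{\sum\sum (n_{ij}-1)}. \]

Use result 4.8 to show that
\[ SSE = \mathbf{Y}^T (I - P_D) \mathbf{Y} \qquad (\text{call the matrix } I - P_D \text{ } A_1) \]
is distributed independently of
\[ SS_{H_0} = \mathbf{Y}^T D (D^T D)^{-1} C_1^T [C_1 (D^T D)^{-1} C_1^T]^{-1} C_1 (D^T D)^{-1} D^T \mathbf{Y} \qquad (\text{call the matrix of this quadratic form } A_2). \]
Check that
\[ A_1 \Sigma A_2 = A_1 (\sigma^2 I) A_2 = \sigma^2 A_1 A_2 \]
\[ = \sigma^2 (I - P_D)\, D (D^T D)^{-1} C_1^T (C_1 (D^T D)^{-1} C_1^T)^{-1} C_1 (D^T D)^{-1} D^T = 0. \]
This is true because $(I - P_D) D = 0$.

Then
\[ F = \frac{SS_{H_0}/(a-1)}{SSE/(\sum\sum (n_{ij}-1))} \sim F_{(a-1,\; \sum\sum (n_{ij}-1))}(\delta^2) \]
where
\[ \delta^2 = \frac{1}{\sigma^2}\, \boldsymbol{\mu}^T C_1^T [C_1 (D^T D)^{-1} C_1^T]^{-1} C_1 \boldsymbol{\mu}. \]

Reject
\[ H_0: \frac{1}{b}\sum_{j=1}^{b} \mu_{1j} = \frac{1}{b}\sum_{j=1}^{b} \mu_{2j} = \cdots = \frac{1}{b}\sum_{j=1}^{b} \mu_{aj} \]
if
\[ F = \frac{SS_{H_0}/(a-1)}{SSE/(\sum\sum (n_{ij}-1))} > F_{(a-1,\; \sum\sum (n_{ij}-1)),\, \alpha} \]
or, if
\[ \text{p-value} = Pr\{ F_{(a-1,\; \sum\sum (n_{ij}-1))} > F \} < \alpha. \]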
Test
\[ H_0: \frac{1}{a}\sum_{i=1}^{a} \mu_{i1} = \frac{1}{a}\sum_{i=1}^{a} \mu_{i2} = \cdots = \frac{1}{a}\sum_{i=1}^{a} \mu_{ib} \]
vs.
\[ H_A: \frac{1}{a}\sum_{i=1}^{a} \mu_{ij} \neq \frac{1}{a}\sum_{i=1}^{a} \mu_{ik} \quad \text{for some } j \neq k. \]

Write the null hypothesis in matrix form as
\[ H_0: C_2 \boldsymbol{\mu} = \mathbf{0} \]
where
\[ C_2 = \mathbf{1}_a^T \otimes [\, I_{b-1} \mid -\mathbf{1}_{b-1} \,], \]
then
\[
C_2 \boldsymbol{\mu} =
\begin{bmatrix}
\sum_{i=1}^{a} \mu_{i1} - \sum_{i=1}^{a} \mu_{ib} \\
\vdots \\
\sum_{i=1}^{a} \mu_{i,b-1} - \sum_{i=1}^{a} \mu_{ib}
\end{bmatrix}.
\]

Compute
\[ SS_{H_0,2} = \mathbf{Y}^T D (D^T D)^{-1} C_2^T [C_2 (D^T D)^{-1} C_2^T]^{-1} C_2 (D^T D)^{-1} D^T \mathbf{Y} \]
and reject $H_0$ if
\[ F = \frac{SS_{H_0,2}/(b-1)}{SSE/(\sum\sum (n_{ij}-1))} > F_{(b-1,\; \sum\sum (n_{ij}-1)),\, \alpha}. \]
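Test for Interaction:

Test
\[ H_0: \mu_{ij} - \mu_{i\ell} - \mu_{kj} + \mu_{k\ell} = 0 \quad \text{for all } (i,j) \text{ and } (k,\ell) \]
vs.
\[ H_A: \mu_{ij} - \mu_{i\ell} - \mu_{kj} + \mu_{k\ell} \neq 0 \quad \text{for some } (i \neq k) \text{ and } (j \neq \ell). \]

Write the null hypothesis in matrix form as
\[ H_0: C_3 \boldsymbol{\mu} = \mathbf{0} \]
where
\[ C_3 = [\, I_{a-1} \mid -\mathbf{1}_{a-1} \,] \otimes [\, I_{b-1} \mid -\mathbf{1}_{b-1} \,]. \]

Compute
\[ \mathbf{b} = (D^T D)^{-1} D^T \mathbf{Y} = \begin{bmatrix} \bar{Y}_{11\cdot} \\ \vdots \\ \bar{Y}_{ab\cdot} \end{bmatrix} \]
\[ SS_{H_0,3} = (C_3\mathbf{b} - \mathbf{0})^T [C_3 (D^T D)^{-1} C_3^T]^{-1} (C_3\mathbf{b} - \mathbf{0}) \]
\[ = \mathbf{Y}^T D (D^T D)^{-1} C_3^T [C_3 (D^T D)^{-1} C_3^T]^{-1} C_3 (D^T D)^{-1} D^T \mathbf{Y} \]
and reject $H_0$ if
\[ F = \frac{SS_{H_0,3}/((a-1)(b-1))}{SSE/(\sum\sum (n_{ij}-1))} > F_{((a-1)(b-1),\; \sum\sum (n_{ij}-1)),\, \alpha}. \]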
PROC GLM in SAS reports these as Type III sums of squares.

Source of variation   d.f.             Sum of Squares            Mean Square   F      p-val
Soils                 a-1 = 1          $SS_{H_0} = 123.77$       123.77        9.28   .0139
Var.                  b-1 = 2          $SS_{H_0,2} = 192.13$     96.06         7.20   .0135
Inter.                (a-1)(b-1) = 2   $SS_{H_0,3} = 222.76$     111.38        8.35   .0089

Note that
\[ \mathbf{Y}^T P_1 \mathbf{Y} \]
\[ + \; \mathbf{Y}^T D (D^T D)^{-1} C_1^T [C_1 (D^T D)^{-1} C_1^T]^{-1} C_1 (D^T D)^{-1} D^T \mathbf{Y} \]
\[ + \; \mathbf{Y}^T D (D^T D)^{-1} C_2^T [C_2 (D^T D)^{-1} C_2^T]^{-1} C_2 (D^T D)^{-1} D^T \mathbf{Y} \]
\[ + \; \mathbf{Y}^T D (D^T D)^{-1} C_3^T [C_3 (D^T D)^{-1} C_3^T]^{-1} C_3 (D^T D)^{-1} D^T \mathbf{Y} \]
\[ + \; \mathbf{Y}^T (I - P_D) \mathbf{Y} \]
do not necessarily sum to $\mathbf{Y}^T \mathbf{Y}$, nor do the middle three terms $(SS_{H_0}, SS_{H_0,2}, SS_{H_0,3})$ necessarily sum to
\[ SS_{\text{model,corrected}} = \mathbf{Y}^T (P_D - P_1) \mathbf{Y}, \]
nor are $(SS_{H_0}, SS_{H_0,2}, SS_{H_0,3})$ necessarily independent of each other.
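Because $(D^T D)^{-1}$ is the diagonal matrix of reciprocal cell counts and $\mathbf{b}$ is the vector of cell means, the Type III sums of squares for the germination example can be reproduced without the raw data. A minimal sketch in R, assuming the cell means from the soil-by-variety comparison table, the cell counts from the design matrix D, and SSE = 120 on 9 degrees of freedom; the helper `ssh` is an illustrative name, not part of the notes.

```r
# Cell means and counts in the order 11, 12, 13, 21, 22, 23
mu.hat <- c(9, 14, 18, 16, 31, 13)
nij    <- c(3, 2, 2, 4, 1, 3)
a <- 2; b <- 3
mse <- 120 / 9
df2 <- sum(nij - 1)                       # 9 error degrees of freedom

V <- diag(1 / nij)                        # (D'D)^{-1}

# SS for H0: C mu = 0, i.e. (C b)' [C (D'D)^{-1} C']^{-1} (C b)
ssh <- function(C) {
  est <- C %*% mu.hat
  drop(t(est) %*% solve(C %*% V %*% t(C)) %*% est)
}

Ia <- cbind(diag(a - 1), -rep(1, a - 1))  # [I_{a-1} | -1_{a-1}]
Ib <- cbind(diag(b - 1), -rep(1, b - 1))  # [I_{b-1} | -1_{b-1}]
C1 <- kronecker(Ia, t(rep(1, b)))         # soil type main effect
C2 <- kronecker(t(rep(1, a)), Ib)         # variety main effect
C3 <- kronecker(Ia, Ib)                   # interaction

ss  <- c(soils = ssh(C1), varieties = ssh(C2), interaction = ssh(C3))
df1 <- c(a - 1, b - 1, (a - 1) * (b - 1))
f   <- (ss / df1) / mse
p   <- pf(f, df1, df2, lower.tail = FALSE)
round(cbind(SS = ss, df = df1, F = f, p = p), 4)
```

Small differences in the last reported digit can arise because the cell means in the notes are rounded.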
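Note that
\[ SS_{H_0} = \sum_{i=1}^{a} w_i \left[ \tilde{Y}_{i\cdot} - \frac{\sum_{k=1}^{a} w_k \tilde{Y}_{k\cdot}}{\sum_{k=1}^{a} w_k} \right]^2 \]
where
\[ \tilde{Y}_{i\cdot} = \frac{1}{b}\sum_{j=1}^{b} \bar{Y}_{ij\cdot}, \qquad
w_i = \left[ \frac{1}{b^2}\sum_{j=1}^{b} \frac{1}{n_{ij}} \right]^{-1} = \sigma^2 \left[ Var(\tilde{Y}_{i\cdot}) \right]^{-1}, \]
and $\tilde{Y}_{i\cdot}$ is not necessarily equal to
\[ \bar{Y}_{i\cdot\cdot} = \frac{\sum_{j=1}^{b} n_{ij} \bar{Y}_{ij\cdot}}{\sum_{j=1}^{b} n_{ij}} = \frac{\sum_{j=1}^{b}\sum_{k=1}^{n_{ij}} Y_{ijk}}{\sum_{j=1}^{b} n_{ij}}. \]

Furthermore,
\[ SS_{H_0,2} = \sum_{j=1}^{b} w_j \left[ \tilde{Y}_{\cdot j} - \frac{\sum_{\ell=1}^{b} w_\ell \tilde{Y}_{\cdot \ell}}{\sum_{\ell=1}^{b} w_\ell} \right]^2 \]
where
\[ \tilde{Y}_{\cdot j} = \frac{1}{a}\sum_{i=1}^{a} \bar{Y}_{ij\cdot}, \qquad
w_j = \left[ \frac{1}{a^2}\sum_{i=1}^{a} \frac{1}{n_{ij}} \right]^{-1} = \sigma^2 \left[ Var(\tilde{Y}_{\cdot j}) \right]^{-1}, \]
and $\tilde{Y}_{\cdot j}$ is not necessarily equal to
\[ \bar{Y}_{\cdot j \cdot} = \frac{\sum_{i=1}^{a} n_{ij} \bar{Y}_{ij\cdot}}{\sum_{i=1}^{a} n_{ij}} = \frac{\sum_{i=1}^{a}\sum_{k=1}^{n_{ij}} Y_{ijk}}{\sum_{i=1}^{a} n_{ij}}. \]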
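The weighted-squares-of-means form of the main-effect sums of squares can be evaluated directly from the table of cell means. A minimal sketch in R for the germination example, assuming the same cell means and cell counts as above; `weighted.ss` is an illustrative helper name.

```r
# Rows are soil types (i = 1, 2), columns are varieties (j = 1, 2, 3)
ybar <- matrix(c( 9, 14, 18,
                 16, 31, 13), nrow = 2, byrow = TRUE)
nij  <- matrix(c(3, 2, 2,
                 4, 1, 3), nrow = 2, byrow = TRUE)
a <- nrow(ybar); b <- ncol(ybar)

# Weighted sum of squared deviations about the weighted mean
weighted.ss <- function(m, w) sum(w * (m - sum(w * m) / sum(w))^2)

ytilde.i <- rowMeans(ybar)               # unweighted row means
w.i      <- b^2 / rowSums(1 / nij)
ytilde.j <- colMeans(ybar)               # unweighted column means
w.j      <- a^2 / colSums(1 / nij)

ssh0   <- weighted.ss(ytilde.i, w.i)     # soil type main effect
ssh0.2 <- weighted.ss(ytilde.j, w.j)     # variety main effect

# Weighted row means, which differ from the unweighted ones here
ybar.i <- rowSums(nij * ybar) / rowSums(nij)

c(SSH0 = ssh0, SSH0.2 = ssh0.2)
rbind(unweighted = ytilde.i, weighted = ybar.i)
```

The two sums of squares agree with the matrix formulas for $SS_{H_0}$ and $SS_{H_0,2}$, and the last table shows how far the unweighted means are from the ordinary row means when the cell counts are unequal.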
Balanced factorial experiments

$n_{ij} = n$ for $i = 1, \ldots, a$ and $j = 1, \ldots, b$

Example 8.2: Sugar Cane Yields (from Snedecor and Cochran)

                                Nitrogen Level
             150 lb/acre        210 lb/acre        270 lb/acre
Variety 1    Y111 = 70.5        Y121 = 67.3        Y131 = 79.9
             Y112 = 67.5        Y122 = 75.9        Y132 = 72.8
             Y113 = 63.9        Y123 = 72.2        Y133 = 64.8
             Y114 = 64.2        Y124 = 60.5        Y134 = 86.3
Variety 2    Y211 = 58.6        Y221 = 64.3        Y231 = 64.4
             Y212 = 65.2        Y222 = 48.3        Y232 = 67.3
             Y213 = 70.2        Y223 = 74.0        Y233 = 78.0
             Y214 = 51.8        Y224 = 63.6        Y234 = 72.0
Variety 3    Y311 = 65.8        Y321 = 64.1        Y331 = 56.3
             Y312 = 68.3        Y322 = 64.8        Y332 = 54.7
             Y313 = 72.7        Y323 = 70.9        Y333 = 66.2
             Y314 = 67.6        Y324 = 58.3        Y334 = 54.4

For a balanced experiment, Type I, Type II, and Type III sums of squares are the same:
\[ R(\alpha \mid \mu) = R(\alpha \mid \mu, \beta) = SS_{H_0} = n\,b \sum_{i=1}^{a} (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2 \]
\[ R(\beta \mid \mu) = R(\beta \mid \mu, \alpha) = SS_{H_0,2} = n\,a \sum_{j=1}^{b} (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2 \]
\[ R(\gamma \mid \mu, \alpha, \beta) = SS_{H_0,3} = n \sum_{i=1}^{a}\sum_{j=1}^{b} (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot})^2 \]
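For balanced data these closed-form expressions need only the table of cell means. A minimal sketch in R for the sugar cane example, using the nine cell means reported in these notes (rows are varieties, columns are nitrogen levels) and n = 4 observations per cell.

```r
# Cell means: rows = varieties 1..3, columns = nitrogen 150, 210, 270
means <- matrix(c(66.525, 68.975, 75.950,
                  61.450, 62.550, 70.425,
                  68.600, 64.525, 57.900), nrow = 3, byrow = TRUE)
n <- 4; a <- nrow(means); b <- ncol(means)

grand    <- mean(means)        # equals the overall mean because the design is balanced
row.mean <- rowMeans(means)    # variety means
col.mean <- colMeans(means)    # nitrogen means

ss.variety  <- n * b * sum((row.mean - grand)^2)
ss.nitrogen <- n * a * sum((col.mean - grand)^2)

# Interaction: cell mean minus row mean minus column mean plus grand mean
inter    <- means - outer(row.mean, col.mean, "+") + grand
ss.inter <- n * sum(inter^2)

round(c(variety = ss.variety, nitrogen = ss.nitrogen, interaction = ss.inter), 3)
```

These values can be compared with the sums of squares for variety, nitrogen, and their interaction in the S-PLUS and SAS output for this example.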
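Summary

Sum of squares and associated null hypothesis for a balanced experiment:

\[ R(\mu) = \mathbf{Y}^T P_1 \mathbf{Y} = a\,b\,n\,\bar{Y}_{\cdot\cdot\cdot}^2 \]
\[ H_0: \mu + \frac{1}{a}\sum_{i=1}^{a} \alpha_i + \frac{1}{b}\sum_{j=1}^{b} \beta_j + \frac{1}{ab}\sum_{i=1}^{a}\sum_{j=1}^{b} \gamma_{ij} = 0
\qquad \Big( H_0: \frac{1}{ab}\sum_{i=1}^{a}\sum_{j=1}^{b} \mu_{ij} = 0 \Big) \]

\[ R(\alpha \mid \mu) = R(\alpha \mid \mu, \beta) = n\,b \sum_{i=1}^{a} (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2 \]
\[ H_0: \alpha_i + \frac{1}{b}\sum_{j=1}^{b} (\beta_j + \gamma_{ij}) \text{ are equal}
\qquad \Big( H_0: \frac{1}{b}\sum_{j=1}^{b} \mu_{ij} \text{ are equal} \Big) \]

\[ R(\beta \mid \mu) = R(\beta \mid \mu, \alpha) = n\,a \sum_{j=1}^{b} (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2 \]
\[ H_0: \beta_j + \frac{1}{a}\sum_{i=1}^{a} (\alpha_i + \gamma_{ij}) \text{ are equal}
\qquad \Big( H_0: \frac{1}{a}\sum_{i=1}^{a} \mu_{ij} \text{ are equal} \Big) \]

\[ R(\gamma \mid \mu, \alpha, \beta) = n \sum_{i=1}^{a}\sum_{j=1}^{b} (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot})^2 \]
\[ H_0: \gamma_{ij} - \gamma_{kj} - \gamma_{i\ell} + \gamma_{k\ell} = 0 \text{ for all } (i,j) \text{ and } (k,\ell)
\qquad ( H_0: \mu_{ij} - \mu_{kj} - \mu_{i\ell} + \mu_{k\ell} = 0 \text{ for all } (i,j) \text{ and } (k,\ell) ) \]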
# A file with the S-PLUS commands is
# posted as cane.ssc

# Enter the data. Note that the first
# line of this file is a line of data,
# not a line of variable names.

> cane <- read.table("cane.dat",
     col.names=c("Variety","Nitrogen",
     "Yield"))

# Create factors

> cane$V <- as.factor(cane$Variety)
> cane$N <- as.factor(cane$Nitrogen)

# Print the data frame

> cane
   Variety Nitrogen Yield   N V
 1       1      150  70.5 150 1
 2       1      150  67.5 150 1
 3       1      150  63.9 150 1
 4       1      150  64.2 150 1
 5       1      210  67.3 210 1
 6       1      210  75.9 210 1
 7       1      210  72.2 210 1
 8       1      210  60.5 210 1
 9       1      270  79.9 270 1
10       1      270  72.8 270 1
11       1      270  64.8 270 1
12       1      270  86.3 270 1
13       2      150  58.6 150 2
14       2      150  65.2 150 2
15       2      150  70.2 150 2
16       2      150  51.8 150 2
17       2      210  64.3 210 2
18       2      210  48.3 210 2
19       2      210  74.0 210 2
20       2      210  63.6 210 2
21       2      270  64.4 270 2
22       2      270  67.3 270 2
23       2      270  78.0 270 2
24       2      270  72.0 270 2
25       3      150  65.8 150 3
26       3      150  68.3 150 3
27       3      150  72.7 150 3
28       3      150  67.6 150 3
29       3      210  64.1 210 3
30       3      210  64.8 210 3
31       3      210  70.9 210 3
32       3      210  58.3 210 3
33       3      270  56.3 270 3
34       3      270  54.7 270 3
35       3      270  66.2 270 3
36       3      270  54.4 270 3
# Compute mean yields for all combinations
# of nitrogen levels and varieties and
# make a profile plot. At this point
# UNIX users should open a graphics
# window with the motif( ) function.

> means <- tapply(cane$Yield,
     list(cane$Variety,cane$Nitrogen),
     mean)
> means
     150    210    270
1 66.525 68.975 75.950
2 61.450 62.550 70.425
3 68.600 64.525 57.900

# Set up the profile plot

> par(fin=c(7,7),cex=1.2,lwd=3,mex=1.5,mkh=.20)
> x.axis <- unique(cane$Nitrogen)
> matplot(c(130,270), c(50,80),
     type="n", xlab="Nitrogen(lb/acre)",
     ylab="Mean Yield",
     main= "Sugar Cane Yields")

# Plot symbols for the sample means.
# The rows of means are varieties, so
# transpose to get one column (one
# profile) per variety.

> matpoints(x.axis,t(means), pch=c(15,16,18))

# Add a profile for each variety

> matlines(x.axis,t(means),type='l',
     lty=c(1,3,5),lwd=3)

# Add a legend to the plot

> legend(130,60, legend=c('Variety 1',
     'Variety 2','Variety 3'),
     lty=c(1,3,5),bty='n')

[Figure: "Sugar Cane Yields". Profile plot of Mean Yield (50 to 80) against Nitrogen(lb/acre) (140 to 260), with one line each for Variety 1, Variety 2, and Variety 3.]
# Fit a model with main effects and
# interaction effects. Compute both
# sets of Type I sums of squares.

> options(contrasts=c('contr.sum','contr.poly'))

> lm.out1 <- lm(Yield~N*V, data=cane)
> anova(lm.out1)
Analysis of Variance Table

Response: Yield

Terms added sequentially (first to last)
          Df Sum of Sq  Mean Sq F Value    Pr(F)
        N  2    56.541  28.2703 0.60847 0.551478
        V  2   319.374 159.6869 3.43698 0.046797
      N:V  4   559.788 139.9469 3.01211 0.035471
Residuals 27  1254.460  46.4615

> lm.out2 <- lm(Yield~V*N, data=cane)
> anova(lm.out2)
Analysis of Variance Table

Response: Yield

Terms added sequentially (first to last)
          Df Sum of Sq  Mean Sq F Value    Pr(F)
        V  2   319.374 159.6869 3.43698 0.046797
        N  2    56.541  28.2703 0.60847 0.551478
      V:N  4   559.788 139.9469 3.01211 0.035471
Residuals 27  1254.460  46.4615
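The same comparison of the two orderings can be run in current R without the external file `cane.dat`. The following is a sketch under that assumption: the yields are typed in from the data table for Example 8.2, and the column names R prints (`Sum Sq`, `F value`) differ slightly from the S-PLUS labels shown in these notes.

```r
# Yields in the order variety 1..3, and within variety nitrogen 150, 210, 270
yield <- c(70.5, 67.5, 63.9, 64.2,  67.3, 75.9, 72.2, 60.5,  79.9, 72.8, 64.8, 86.3,
           58.6, 65.2, 70.2, 51.8,  64.3, 48.3, 74.0, 63.6,  64.4, 67.3, 78.0, 72.0,
           65.8, 68.3, 72.7, 67.6,  64.1, 64.8, 70.9, 58.3,  56.3, 54.7, 66.2, 54.4)
cane <- data.frame(
  V = factor(rep(1:3, each = 12)),
  N = factor(rep(rep(c(150, 210, 270), each = 4), times = 3)),
  Yield = yield)

# Sequential (Type I) sums of squares for both orderings
tab1 <- anova(lm(Yield ~ N * V, data = cane))
tab2 <- anova(lm(Yield ~ V * N, data = cane))

tab1
tab2
```

Because the design is balanced, the sums of squares for `N`, `V`, and the interaction do not depend on the order in which the terms enter the model.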
> summary(lm.out2, correlation=F)

Call: lm(formula = Yield ~ V * N, data = cane)
Residuals:
    Min     1Q  Median    3Q   Max
 -14.25 -3.131 -0.3625 3.956 11.45

Coefficients:
              Value Std. Error t value Pr(>|t|)
(Intercept) 66.3222     1.1360 58.3800   0.0000
         V1  4.1611     1.6066  2.5900   0.0153
         V2 -1.5139     1.6066 -0.9423   0.3544
         N1 -0.7972     1.6066 -0.4962   0.6238
         N2 -0.9722     1.6066 -0.6051   0.5501
       V1N1 -3.1611     2.2721 -1.3913   0.1755
       V2N1 -2.5611     2.2721 -1.1272   0.2696
       V1N2 -0.5361     2.2721 -0.2360   0.8152
       V2N2 -1.2861     2.2721 -0.5660   0.5760

Residual standard error: 6.816 on 27 df
Multiple R-Squared: 0.4272
F-statistic: 2.517 on 8 and 27 df,
the p-value is 0.03462
> model.matrix(lm.out2)
   (Intercept) V1 V2 N1 N2 V1N1 V2N1 V1N2 V2N2
 1           1  1  0  1  0    1    0    0    0
 2           1  1  0  1  0    1    0    0    0
 3           1  1  0  1  0    1    0    0    0
 4           1  1  0  1  0    1    0    0    0
 5           1  1  0  0  1    0    0    1    0
 6           1  1  0  0  1    0    0    1    0
 7           1  1  0  0  1    0    0    1    0
 8           1  1  0  0  1    0    0    1    0
 9           1  1  0 -1 -1   -1    0   -1    0
10           1  1  0 -1 -1   -1    0   -1    0
11           1  1  0 -1 -1   -1    0   -1    0
12           1  1  0 -1 -1   -1    0   -1    0
13           1  0  1  1  0    0    1    0    0
14           1  0  1  1  0    0    1    0    0
15           1  0  1  1  0    0    1    0    0
16           1  0  1  1  0    0    1    0    0
17           1  0  1  0  1    0    0    0    1
18           1  0  1  0  1    0    0    0    1
19           1  0  1  0  1    0    0    0    1
20           1  0  1  0  1    0    0    0    1
21           1  0  1 -1 -1    0   -1    0   -1
22           1  0  1 -1 -1    0   -1    0   -1
23           1  0  1 -1 -1    0   -1    0   -1
24           1  0  1 -1 -1    0   -1    0   -1
25           1 -1 -1  1  0   -1   -1    0    0
26           1 -1 -1  1  0   -1   -1    0    0
27           1 -1 -1  1  0   -1   -1    0    0
28           1 -1 -1  1  0   -1   -1    0    0
29           1 -1 -1  0  1    0    0   -1   -1
30           1 -1 -1  0  1    0    0   -1   -1
31           1 -1 -1  0  1    0    0   -1   -1
32           1 -1 -1  0  1    0    0   -1   -1
33           1 -1 -1 -1 -1    1    1    1    1
34           1 -1 -1 -1 -1    1    1    1    1
35           1 -1 -1 -1 -1    1    1    1    1
36           1 -1 -1 -1 -1    1    1    1    1

# Create diagnostic plots

> par(mfrow=c(3,2))
> plot(lm.out1)

[Figure: six diagnostic plots for lm.out1: Residuals vs. Fitted : N * V; sqrt(abs(Residuals)) vs. fits; Yield vs. Fitted : N * V; Residuals vs. Quantiles of Standard Normal; Fitted Values and Residuals vs. f-value; Cook's Distance vs. Index. Observations 11, 18, and 19 are labeled in the plots.]
# Create a data frame containing the original
# data and the residuals and estimated means

> data.frame(cane$Nitrogen,cane$Variety,
     cane$Yield,Pred=lm.out1$fitted,
     Resid=round(lm.out1$resid,3))

    X1 X2   X3   Pred   Resid
 1 150  1 70.5 66.525   3.975
 2 150  1 67.5 66.525   0.975
 3 150  1 63.9 66.525  -2.625
 4 150  1 64.2 66.525  -2.325
 5 210  1 67.3 68.975  -1.675
 6 210  1 75.9 68.975   6.925
 7 210  1 72.2 68.975   3.225
 8 210  1 60.5 68.975  -8.475
 9 270  1 79.9 75.950   3.950
10 270  1 72.8 75.950  -3.150
11 270  1 64.8 75.950 -11.150
12 270  1 86.3 75.950  10.350
13 150  2 58.6 61.450  -2.850
14 150  2 65.2 61.450   3.750
15 150  2 70.2 61.450   8.750
16 150  2 51.8 61.450  -9.650
17 210  2 64.3 62.550   1.750
18 210  2 48.3 62.550 -14.250
19 210  2 74.0 62.550  11.450
20 210  2 63.6 62.550   1.050
21 270  2 64.4 70.425  -6.025
22 270  2 67.3 70.425  -3.125
23 270  2 78.0 70.425   7.575
24 270  2 72.0 70.425   1.575
25 150  3 65.8 68.600  -2.800
26 150  3 68.3 68.600  -0.300
27 150  3 72.7 68.600   4.100
28 150  3 67.6 68.600  -1.000
29 210  3 64.1 64.525  -0.425
30 210  3 64.8 64.525   0.275
31 210  3 70.9 64.525   6.375
32 210  3 58.3 64.525  -6.225
33 270  3 56.3 57.900  -1.600
34 270  3 54.7 57.900  -3.200
35 270  3 66.2 57.900   8.300
36 270  3 54.4 57.900  -3.500
# Compute Type III sums of squares and
# corresponding F-tests.

# Generate an identity matrix and a
# vector of ones

> Iden <- function(n) diag(rep(1,n))
> one <- function(n) matrix(rep(1,n),ncol=1)

# Compute the transpose of the model
# matrix for the cell means model

> s <- length(unique(cane$Nitrogen))
> t <- length(unique(cane$Variety))
> st <- s*t
> r <- length(cane$Yield)/(st)
> D <- t(kronecker(Iden(st), t(one(r))))

# Least squares estimation

> y <- matrix(cane$Yield,ncol=1)
> b <- solve(crossprod(D)) %*% crossprod(D,y)
> yhat <- D %*% b
> sse <- crossprod(y-yhat)
> df2 <- nrow(y) - st

# Main effect of the factor that varies
# most slowly in the data file (Variety)

> c1 <- kronecker( cbind(Iden(s-1),-one(s-1)),
     t(one(t)) )
> q1 <- t(b) %*% t(c1)%*% solve( c1 %*%
     solve(crossprod(D)) %*% t(c1))%*%
     c1 %*% b
> df1<- s-1
> f <- (q1/df1)/(sse/df2)
> p <- 1-pf(f,df1,df2)
> c1
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]    1    1    1    0    0    0   -1   -1   -1
[2,]    0    0    0    1    1    1   -1   -1   -1
> data.frame(SS=q1,df=df1,F.stat=f,p.value=p)
        SS df   F.stat    p.value
1 319.3739  2 3.436975 0.04679743

# Main effect of the factor that varies
# most quickly in the data file (Nitrogen)

> c2 <- kronecker( t(one(s)),
     cbind(Iden(t-1),-one(t-1)) )
> q2 <- t(b) %*% t(c2)%*%solve( c2 %*%
     solve(crossprod(D)) %*% t(c2))%*%
     c2 %*% b
> df1<- t-1
> f <- (q2/df1)/(sse/df2)
> p <- 1-pf(f,df1,df2)
> c2
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]    1    0   -1    1    0   -1    1    0   -1
[2,]    0    1   -1    0    1   -1    0    1   -1
> data.frame(SS=q2,df=df1,F.stat=f,p.value=p)
        SS df   F.stat  p.value
1 56.54056  2 0.608467 0.551478

# Interaction

> c3 <- kronecker( cbind(Iden(s-1),-one(s-1)),
     cbind(Iden(t-1),-one(t-1)) )
> q3 <- t(b) %*% t(c3)%*% solve( c3 %*%
     solve(crossprod(D)) %*% t(c3))%*%
     c3 %*% b
> df1<- (s-1)*(t-1)
> f <- (q3/df1)/(sse/df2)
> p <- 1-pf(f,df1,df2)
> c3
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]    1    0   -1    0    0    0   -1    0    1
[2,]    0    1   -1    0    0    0    0   -1    1
[3,]    0    0    0    1    0   -1   -1    0    1
[4,]    0    0    0    0    1   -1    0   -1    1
> data.frame(SS=q3,df=df1,F.stat=f,p.value=p)
        SS df   F.stat    p.value
1 559.7878  4 3.012107 0.03547072
Conclusions:

Variety 1 appears to provide a consistently higher yield than Variety 2, but the difference in these two varieties is not "significant" at the .05 level.

Varieties 1 and 2 exhibit parallel "linear" increasing trends in yield as nitrogen increases from 150 lb/acre to 270 lb/acre.

Variety 3 exhibits a "linear" decrease in yield as nitrogen increases from 150 lb/acre to 270 lb/acre.

Variety 3 seems to do as well as Variety 1 at 150 lb/acre of nitrogen.
/* Analysis of completely randomized
   factorial experiments with an
   application to the sugar cane data
   from Snedecor and Cochran. This
   program is posted as cane.sas */

data set1;
  infile 'cane.dat';
  input variety nitrogen yield;
run;

/* Print the data */

proc print data=set1;
  var yield;
run;

/* Compute an ANOVA table */

proc glm data=set1;
  class variety nitrogen;
  model yield = variety|nitrogen /
        p clm alpha=.05 ss1 ss2
        ss3 ss4 e e1 e2 e3 e4;
  output out=setr r=resid p=yhat;
  lsmeans variety*nitrogen / stderr pdiff;
  means variety nitrogen / tukey;
  contrast 'n-linear' nitrogen -1 0 1;
  contrast 'n-quad' nitrogen -1 2 -1;
  contrast 'v1-v2' variety 1 -1 0;
  contrast '(v1+v2)-v3' variety .5 .5 -1;
  contrast '(v1-v2)*(n-lin)' variety*nitrogen
        -1 0 1 1 0 -1 0 0 0;
  contrast '(v1-v2)*(n-quad)' variety*nitrogen
        -1 2 -1 1 -2 1 0 0 0;
  contrast '(.5(v1+v2)-v3)*(n-lin)'
        variety*nitrogen
        -.5 0 .5 -.5 0 .5 1 0 -1;
  contrast '(.5(v1+v2)-v3)*(n-quad)'
        variety*nitrogen
        -.5 1 -.5 -.5 1 -.5 1 -2 1;
  estimate 'n-linear' nitrogen -1 0 1;
  estimate 'n-quad' nitrogen -1 2 -1;
  estimate 'v1-v2' variety 1 -1 0;
  estimate '(v1+v2)-v3' variety .5 .5 -1;
  estimate '(v1-v2)*(n-lin)' variety*nitrogen
        -1 0 1 1 0 -1 0 0 0;
  estimate '(v1-v2)*(n-quad)' variety*nitrogen
        -1 2 -1 1 -2 1 0 0 0;
  estimate '(.5(v1+v2)-v3)*(n-lin)'
        variety*nitrogen
        -.5 0 .5 -.5 0 .5 1 0 -1;
  estimate '(.5(v1+v2)-v3)*(n-quad)'
        variety*nitrogen
        -.5 1 -.5 -.5 1 -.5 1 -2 1;
run;

/* Make profile plots for the interaction
   between varieties and nitrogen levels */

/* UNIX users can use the following options */

/* goptions cback=white colors=(black)
     targetdevice=ps300 rotate=landscape; */

/* Windows users can use the following */

goptions cback=white colors=black
     device=WIN target=WINPRTC;

proc sort data=set1; by variety nitrogen;
proc means data=set1 noprint;
  by variety nitrogen;
  var yield;
  output out=means mean=my;
run;
axis1 label=(f=swiss h=2.5)
      ORDER = 120 to 300 by 30
      value=(f=swiss h=2.0) w=3.0
      length= 5.5 in;

axis2 label=(f=swiss h=2.0)
      order = 50 to 80 by 10
      value=(f=swiss h=2.0) w= 3.0
      length = 5.5 in;

SYMBOL1 V=CIRCLE H=2.0 w=3 l=1 i=join ;
SYMBOL2 V=DIAMOND H=2.0 w=3 l=3 i=join ;
SYMBOL3 V=square H=2.0 w=3 l=9 i=join ;

PROC GPLOT DATA=means;
  PLOT my*nitrogen=variety /
       vaxis=axis2 haxis=axis1;
  TITLE1 H=3.0 F=swiss "Sugar Cane Yields";
  LABEL my='Mean Yield';
  LABEL nitrogen = 'Nitrogen (lb/acre)';
RUN;

General Form of Estimable Functions

Effect                       Coefficients

Intercept                    L1

variety           1          L2
variety           2          L3
variety           3          L1-L2-L3

nitrogen          150        L5
nitrogen          210        L6
nitrogen          270        L1-L5-L6

variety*nitrogen  1 150      L8
variety*nitrogen  1 210      L9
variety*nitrogen  1 270      L2-L8-L9
variety*nitrogen  2 150      L11
variety*nitrogen  2 210      L12
variety*nitrogen  2 270      L3-L11-L12
variety*nitrogen  3 150      L5-L8-L11
variety*nitrogen  3 210      L6-L9-L12
variety*nitrogen  3 270      L1-L2-L3-L5-L6+L8+L9+L11+L12
Type I Estimable Functions

                             ------------------------Coefficients------------------------
Effect                       variety                  nitrogen                 variety*nitrogen

Intercept                    0                        0                        0

variety           1          L2                       0                        0
variety           2          L3                       0                        0
variety           3          -L2-L3                   0                        0

nitrogen          150        0                        L5                       0
nitrogen          210        0                        L6                       0
nitrogen          270        0                        -L5-L6                   0

variety*nitrogen  1 150      0.3333*L2                0.3333*L5                L8
variety*nitrogen  1 210      0.3333*L2                0.3333*L6                L9
variety*nitrogen  1 270      0.3333*L2                -0.3333*L5-0.3333*L6     -L8-L9
variety*nitrogen  2 150      0.3333*L3                0.3333*L5                L11
variety*nitrogen  2 210      0.3333*L3                0.3333*L6                L12
variety*nitrogen  2 270      0.3333*L3                -0.3333*L5-0.3333*L6     -L11-L12
variety*nitrogen  3 150      -0.3333*L2-0.3333*L3     0.3333*L5                -L8-L11
variety*nitrogen  3 210      -0.3333*L2-0.3333*L3     0.3333*L6                -L9-L12
variety*nitrogen  3 270      -0.3333*L2-0.3333*L3     -0.3333*L5-0.3333*L6     L8+L9+L11+L12

Type II Estimable Functions, Type III Estimable Functions, and Type IV Estimable Functions

For these balanced data, the coefficients that SAS reports for the Type II, Type III, and Type IV estimable functions for variety, nitrogen, and variety*nitrogen are identical to the Type I coefficients in the table above.
Dependent Variable: yield

                     Sum of        Mean
Source      DF       Squares       Square      F       Pr > F
Model        8      935.7022     116.9628    2.52     0.0346
Error       27     1254.4600      46.4615
C. Total    35     2190.1622

                                   Mean
Source      DF     Type I SS       Square      F       Pr > F
variety      2      319.3739     159.6869    3.44     0.0468
nitrogen     2       56.5406      28.2703    0.61     0.5515
var*nit      4      559.7878     139.9469    3.01     0.0355

                                   Mean
Source      DF     Type II SS      Square      F       Pr > F
variety      2      319.3739     159.6869    3.44     0.0468
nitrogen     2       56.5406      28.2703    0.61     0.5515
var*nit      4      559.7878     139.9469    3.01     0.0355

                                   Mean
Source      DF     Type III SS     Square      F       Pr > F
variety      2      319.3739     159.6869    3.44     0.0468
nitrogen     2       56.5406      28.2703    0.61     0.5515
var*nit      4      559.7878     139.9469    3.01     0.0355

                                   Mean
Source      DF     Type IV SS      Square      F       Pr > F
variety      2      319.3739     159.6869    3.44     0.0468
nitrogen     2       56.5406      28.2703    0.61     0.5515
var*nit      4      559.7878     139.9469    3.01     0.0355
Dependent Variable: yield

Least Squares Means

                       yield       Standard                LSMEAN
variety   nitrogen     LSMEAN      Error       Pr > |t|    Number
1         150          66.525      3.408133    <.0001      1
1         210          68.975      3.408133    <.0001      2
1         270          75.950      3.408133    <.0001      3
2         150          61.450      3.408133    <.0001      4
2         210          62.550      3.408133    <.0001      5
2         270          70.425      3.408133    <.0001      6
3         150          68.600      3.408133    <.0001      7
3         210          64.525      3.408133    <.0001      8
3         270          57.900      3.408133    <.0001      9

Least Squares Means for effect variety*nitrogen
Pr > |t| for H0: LSMean(i)=LSMean(j)

i/j       1        2        3        4        5        6        7        8        9
1                0.6154   0.0610   0.3017   0.4168   0.4255   0.6702   0.6815   0.0848
2       0.6154            0.1594   0.1301   0.1937   0.7658   0.9386   0.3640   0.0296
3       0.0610   0.1594            0.0056   0.0098   0.2617   0.1389   0.0252   0.0009
4       0.3017   0.1301   0.0056            0.8212   0.0735   0.1495   0.5289   0.4678
5       0.4168   0.1937   0.0098   0.8212            0.1139   0.2202   0.6852   0.3432
6       0.4255   0.7658   0.2617   0.0735   0.1139            0.7079   0.2315   0.0150
7       0.6702   0.9386   0.1389   0.1495   0.2202   0.7079            0.4053   0.0350
8       0.6815   0.3640   0.0252   0.5289   0.6852   0.2315   0.4053            0.1806
9       0.0848   0.0296   0.0009   0.4678   0.3432   0.0150   0.0350   0.1806

Contrast                    DF    Contrast SS    Mean Square    F Value    Pr > F
n-linear                     1     39.5266667     39.5266667       0.85    0.3645
n-quad                       1     17.0138889     17.0138889       0.37    0.5501
v1-v2                        1    193.2337500    193.2337500       4.16    0.0513
(v1+v2)-v3                   1    126.1401389    126.1401389       2.71    0.1110
(v1-v2)*(n-lin)              1      0.2025000      0.2025000       0.00    0.9478
(v1-v2)*(n-quad)             1      1.6875000      1.6875000       0.04    0.8503
(.5(v1+v2)-v3)*(n-lin)       1    528.0133333    528.0133333      11.36    0.0023
(.5(v1+v2)-v3)*(n-quad)      1     29.8844444     29.8844444       0.64    0.4296

                                            Standard
Parameter                     Estimate      Error         t Value    Pr > |t|
n-linear                     2.5666667      2.7827289        0.92     0.3645
n-quad                      -2.9166667      4.8198279       -0.61     0.5501
v1-v2                        5.6750000      2.7827289        2.04     0.0513
(v1+v2)-v3                   3.9708333      2.4099139        1.65     0.1110
(v1-v2)*(n-lin)              0.4500000      6.8162659        0.07     0.9478
(v1-v2)*(n-quad)             2.2500000     11.8061189        0.19     0.8503
(.5(v1+v2)-v3)*(n-lin)      19.9000000      5.9030595        3.37     0.0023
(.5(v1+v2)-v3)*(n-quad)     -8.2000000     10.2243989       -0.80     0.4296
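Each line of the CONTRAST and ESTIMATE output is a linear combination of the nine cell means, so the estimate, its standard error, and the contrast sum of squares can be reproduced by hand. A minimal sketch in R, assuming the cell means and the error mean square (46.4615 on 27 df) reported for the sugar cane data; the helper `contrast` is an illustrative name, and main-effect contrasts are written as averages over the levels of the other factor.

```r
# Cell means in the order (variety 1: 150, 210, 270), (variety 2: ...), (variety 3: ...)
mu.hat <- c(66.525, 68.975, 75.950,
            61.450, 62.550, 70.425,
            68.600, 64.525, 57.900)
n   <- 4           # observations per cell
mse <- 46.4615     # error mean square
df  <- 27          # error degrees of freedom

# Estimate, standard error, t, p-value and sum of squares for coefficients cc
contrast <- function(cc) {
  est <- sum(cc * mu.hat)
  v   <- sum(cc^2) / n                 # Var(estimate) / sigma^2
  se  <- sqrt(mse * v)
  c(estimate = est, se = se, t = est / se,
    p = 2 * pt(abs(est / se), df, lower.tail = FALSE),
    SS = est^2 / v)
}

# 'v1-v2': variety 1 minus variety 2, averaged over the three nitrogen levels
v1.v2 <- contrast(c(1, 1, 1, -1, -1, -1, 0, 0, 0) / 3)

# '(.5(v1+v2)-v3)*(n-lin)': interaction contrast on the cell means
v12.v3.lin <- contrast(c(-.5, 0, .5, -.5, 0, .5, 1, 0, -1))

round(rbind(v1.v2, v12.v3.lin), 4)
```

With one degree of freedom, the F value in the CONTRAST table is the square of the t value in the ESTIMATE table, which is why the two tables report the same p-values.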
Two factor experiments with empty cells

Data from Littell, Freund, and Spector, 1991, SAS System for Linear Models, 3rd edition, SAS Institute, Cary, N.C.

                              Factor B
Factor A     j=1            j=2            j=3
i=1          Y111 = 5       Y121 = 2       (no data)
             Y112 = 6       Y122 = 3
                            Y123 = 5
                            Y124 = 6
                            Y125 = 7
i=2          Y211 = 2       Y221 = 8       Y231 = 4
             Y212 = 3       Y222 = 8       Y232 = 4
                            Y223 = 9       Y233 = 6
                                           Y234 = 6
                                           Y235 = 7

Sample sizes:

                              Factor B
Factor A     j = 1          j = 2          j = 3
i = 1        n11 = 2        n12 = 5        (empty)
i = 2        n21 = 2        n22 = 3        n23 = 5

Effects model:
\[ Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_{ij} + \epsilon_{ijk} \quad \text{for } (i,j) \neq (1,3) \text{ and } k = 1, \ldots, n_{ij} \]
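$$\mu_{ij} = E(\bar{Y}_{ij\cdot}) = \mu + \alpha_i + \beta_j + \gamma_{ij}$$

is estimable for all $(i, j) \neq (1, 3)$.

Functions of parameters that are not estimable include:

$$\mu_{13} = \mu + \alpha_1 + \beta_3 + \gamma_{13}$$

$$\bar{\mu}_{\cdot\cdot} = \frac{1}{6} \sum_{i=1}^{2} \sum_{j=1}^{3} \mu_{ij}
  = \mu + \frac{1}{2}(\alpha_1 + \alpha_2) + \frac{1}{3}(\beta_1 + \beta_2 + \beta_3)
  + \frac{1}{6}(\gamma_{11} + \gamma_{12} + \gamma_{13} + \gamma_{21} + \gamma_{22} + \gamma_{23})$$

$$\bar{\mu}_{1\cdot} = \frac{1}{3} \sum_{j=1}^{3} \mu_{1j}$$

$$\bar{\mu}_{\cdot 3} = \frac{1}{2}(\mu_{13} + \mu_{23})$$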
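The estimability statements above can be checked numerically. A linear function c'b of the parameters is estimable exactly when c lies in the row space of the design matrix, that is, when appending c as an extra row does not raise the rank. The sketch below is only an illustration of that check for the effects model with cell (1,3) empty; the helper names (`rank`, `cell_row`, `is_estimable`) and the parameter ordering are assumptions made for this example, not part of the original SAS program.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows), by exact Gauss-Jordan elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(m[0])):
        # Find a row at or below position rk with a nonzero entry in this column.
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        # Clear this column in every other row.
        for r in range(len(m)):
            if r != rk and m[r][col] != 0:
                f = m[r][col] / m[rk][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk

# Assumed parameter order:
# mu, a1, a2, b1, b2, b3, g11, g12, g13, g21, g22, g23
CELLS = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3)]

def cell_row(i, j):
    """Coefficient vector of mu_ij = mu + a_i + b_j + g_ij."""
    row = [0] * 12
    row[0] = 1                          # mu
    row[i] = 1                          # a_i sits at index 1 or 2
    row[2 + j] = 1                      # b_j sits at index 3, 4 or 5
    row[6 + CELLS.index((i, j))] = 1    # g_ij
    return row

# Distinct rows of the design matrix: one per cell that has data.
X = [cell_row(i, j) for (i, j) in CELLS if (i, j) != (1, 3)]

def is_estimable(c):
    """True when c is in the row space of X."""
    return rank(X + [c]) == rank(X)

mu_dot3 = [Fraction(a + b, 2) for a, b in zip(cell_row(1, 3), cell_row(2, 3))]
print("mu_11 estimable:   ", is_estimable(cell_row(1, 1)))
print("mu_13 estimable:   ", is_estimable(cell_row(1, 3)))
print("mu_bar.3 estimable:", is_estimable(mu_dot3))
```

Repeated observations within a cell only repeat rows of the design matrix, so using one row per nonempty cell is enough for the rank comparison. Under these assumptions the check reports that mu_11 is estimable while mu_13 and the column average involving mu_13 are not.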
Two factor classifications with empty cells:

- No single "best" or "correct" analysis.
- Analysis of variance
    - Test for interaction is useful.
    - Use SSE to estimate the error variance $\sigma^2$.
    - Tests for "main effects" may not be meaningful, especially in the presence of interaction.
- Compute F-tests and sums of squares for meaningful contrasts.
- Compare estimated means for different combinations of factor levels.
- Consider the combinations of factor levels as levels of a single "combined" factor.
    - one-way ANOVA
    - contrasts
    - compare means
/* SAS code for analyzing data
   from the two factor experiment
   with no data for one combination
   of factors. This code is posted
   as littell.sas */
data set1;
  input A B y;
  cards;
1 1 5
1 1 6
1 2 2
1 2 3
1 2 5
1 2 6
1 2 7
2 1 2
2 1 3
2 2 8
2 2 8
2 2 9
2 3 4
2 3 4
2 3 6
2 3 6
2 3 7
run;

/* Print the data */
proc print data=set1;
run;

/* Compute sample means for all
   factor combinations with data.
   Make a profile plot. */
proc sort data=set1; by a b;
run;
proc means data=set1 noprint; by a b;
  var Y;
  output out=means mean=my;
run;

axis1 label=(f=swiss h=2.0)
      value=(f=swiss h=1.8)
      w=3.0 length= 5.0 in;
axis2 label=(f=swiss h=2.0 a=90 r=0)
      value=(f=swiss h=1.8)
      w= 3.0 length = 5.0 in;
SYMBOL1 V=circle H=2.0 w=3 l=1 i=join;
SYMBOL2 V=diamond H=2.0 w=3 l=3 i=join;
goptions cback=white colors=black
         device=WIN target=WINPRTC;
/*
goptions cback=white colors=(black)
         targetdevice=ps300 rotate=landscape;
*/
proc gplot data=means;
  plot my*b=a / vaxis=axis2 haxis=axis1;
  title ls=0.8in H=3.0 F=swiss "Sample Means";
  label my='Mean';
  label b = 'Factor B';
  footnote ls=0.4in ' ';
run;

/* Perform analysis of variance where
   factor A is entered into the model
   before factor B. Use the LSMEANS
   statement to compare means for
   different combinations of factor A
   and factor B. */
proc glm data=set1;
class A B;
model y = A B A*B / solution ss1 ss2
ss3 ss4 e e1 e2 e3 e4 p;
means A B A*B;
lsmeans A*B / pdiff tdiff stderr;
estimate 'A1-A2' A 1 -1 / e;
contrast 'A1-A2' A 1 -1 / e;
estimate 'A1-A2 within B1' A 1 -1
A*B 1 0 -1 0 0 / e;
estimate 'A1-A2 within B2' A 1 -1
A*B 0 1 0 -1 0 / e;
estimate 'A1-A2 over B' A 1 -1
A*B .5 .5 -.5 -.5 0 / e;
estimate 'B1-B2 over A' B 1 -1 0
A*B .5 -.5 .5 -.5 0 / e;
estimate 'B3-.5(B1+B2) in A2' B -.5 -.5 1
A*B 0 0 -.5 -.5 1 / e;
estimate 'interaction' A*B 1 -1 -1 1 0 / e;
run;
/* Do everything with a one-factor ANOVA by
combining the two factors into a single
factor with 5 categories. */
data set1; set set1;
C=10*A+B;
run;
proc glm data=set1;
class C;
model y = C / solution e e2;
estimate 'C11-C21' C 1 0 -1 0 0;
estimate 'C12-C22' C 0 1 0 -1 0;
estimate '.5(C11+C12-C21-C22)'
C .5 .5 -.5 -.5 0;
estimate '.5(C11-C12+C21-C22)'
C .5 -.5 .5 -.5 0;
estimate 'C23-.5(C21+C22)' C 0 0 -.5 -.5 1;
estimate 'C11-C12-C21+C22' C 1 -1 -1 1 0;
lsmeans C / stderr tdiff pdiff;
run;
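As a cross-check on the one-factor analysis, the error sum of squares and a few of the contrasts can be recomputed by hand from the cell data. The sketch below is not the SAS program. It is a small Python illustration that assumes the 17 observations listed in the data step above, and the names `cells`, `means`, `sizes` and `estimate` are made up for this example. Each contrast uses estimate = sum of c_i times the cell mean and standard error = sqrt(MSE times the sum of c_i^2 / n_i).

```python
import math

# Observations for the five nonempty cells, keyed by C = 10*A + B.
cells = {
    11: [5, 6],
    12: [2, 3, 5, 6, 7],
    21: [2, 3],
    22: [8, 8, 9],
    23: [4, 4, 6, 6, 7],
}

means = {c: sum(y) / len(y) for c, y in cells.items()}
sizes = {c: len(y) for c, y in cells.items()}
n_total = sum(sizes.values())

# Within-cell (error) sum of squares and mean square.
sse = sum((v - means[c]) ** 2 for c, y in cells.items() for v in y)
df_error = n_total - len(cells)
mse = sse / df_error

def estimate(coefs):
    """Return (estimate, standard error, t) for sum_c coefs[c] * mean_c."""
    est = sum(k * means[c] for c, k in coefs.items())
    se = math.sqrt(mse * sum(k * k / sizes[c] for c, k in coefs.items()))
    return est, se, est / se

contrasts = {
    "C11-C21": {11: 1, 21: -1},
    "C12-C22": {12: 1, 22: -1},
    "C11-C12-C21+C22": {11: 1, 12: -1, 21: -1, 22: 1},
}

print(f"SSE = {sse:.4f} on {df_error} d.f., MSE = {mse:.4f}")
for label, coefs in contrasts.items():
    est, se, t = estimate(coefs)
    print(f"{label:18s} {est:8.4f} {se:8.4f} {t:6.2f}")
```

Because the combined factor C has one level per nonempty cell, its least squares means are just the cell means, which is why no matrix algebra is needed here.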
General Form of Estimable Functions

Effect              Coefficients
Intercept           L1

A         1         L2
A         2         L1-L2

B         1         L4
B         2         L5
B         3         L1-L4-L5

A*B       1 1       L7
A*B       1 2       L2-L7
A*B       2 1       L4-L7
A*B       2 2       -L2+L5+L7
A*B       2 3       L1-L4-L5
Type III Estimable Functions

                    ---------------Coefficients---------------
Effect              A           B                      A*B
Intercept           0           0                      0

A         1         L2          0                      0
A         2         -L2         0                      0

B         1         0           L4                     0
B         2         0           L5                     0
B         3         0           -L4-L5                 0

A*B       1 1       0.5*L2      0.25*L4-0.25*L5        L7
A*B       1 2       0.5*L2      -0.25*L4+0.25*L5       -L7
A*B       2 1       -0.5*L2     0.75*L4+0.25*L5        -L7
A*B       2 2       -0.5*L2     0.25*L4+0.75*L5        L7
A*B       2 3       0           -L4-L5                 0

Type IV Estimable Functions

                    ---------Coefficients---------
Effect              A           B           A*B
Intercept           0           0           0

A         1         L2          0           0
A         2         -L2         0           0

B         1         0           L4          0
B         2         0           L5          0
B         3         0           -L4-L5      0

A*B       1 1       0.5*L2      0           L7
A*B       1 2       0.5*L2      0           -L7
A*B       2 1       -0.5*L2     L4          -L7
A*B       2 2       -0.5*L2     L5          L7
A*B       2 3       0           -L4-L5      0

NOTE: Other Type IV estimable functions exist.
General Form of Estimable Functions

Effect              Coefficients
Intercept           L1

C         11        L2
C         12        L3
C         21        L4
C         22        L5
C         23        L1-L2-L3-L4-L5

Dependent Variable: y

                          Sum of        Mean
Source          DF       Squares      Square    F Value    Pr > F
Model            4       45.8157     11.4539       5.27    0.0110
Error           12       26.0667      2.1722
C. Total        16       71.8824

Type II Estimable Functions

                    -Coefficients-
Effect              C
Intercept           0

C         11        L2
C         12        L3
C         21        L4
C         22        L5
C         23        -L2-L3-L4-L5

                                       Standard
Parameter                 Estimate        Error        t    Pr > |t|
C11-C21                     3.0000       1.4738     2.04      0.0645
C12-C22                    -3.7333       1.0763    -3.47      0.0046
.5(C11+C12-C21-C22)        -0.3667       0.9125    -0.40      0.6949
.5(C11-C12+C21-C22)        -2.4667       0.9125    -2.70      0.0192
C23-.5(C21+C22)            -0.0167       0.9418    -0.02      0.9862
C11-C12-C21+C22             6.7333       1.8250     3.69      0.0031

Least Squares Means

                      Standard                  LSMEAN
C       y LSMEAN         Error     Pr > |t|     Number
11        5.5000        1.0421       0.0002          1
12        4.6000        0.6591       <.0001          2
21        2.5000        1.0422       0.0336          3
22        8.3333        0.8509       <.0001          4
23        5.4000        0.6591       <.0001          5

Least Squares Means for Effect C
t for H0: LSMean(i)=LSMean(j) / Pr > |t|
Dependent Variable: y

i/j            1            2            3            4            5
1                      0.7299       2.0355      -2.1059       0.0811
                       0.4795       0.0645       0.0569       0.9367
2        -0.7299                    1.7030      -3.4685      -0.8582
          0.4795                    0.1143       0.0046       0.4076
3        -2.0355     -1.70301                   -4.3357      -2.3518
          0.0645       0.1143                    0.0010       0.0366
4         2.1059      3.46853       4.3357                    2.7253
          0.0569       0.0046       0.0010                    0.0184
5        -0.0811      0.85824       2.3518      -2.7253
          0.9367       0.4076       0.0366       0.0184

Estimable functions for Type IV sums of squares may depend on

- location of empty cells
- ordering of the levels for the row and column factors

Example: Exchange columns 1 and 3 in the previous example.
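                                          Factor 2
Factor 1    A (old j=3)                 B                            C (old j=1)
i = 1       -                           $\bar{Y}_{12\cdot} = 4.6$    $\bar{Y}_{13\cdot} = 5.5$
                                        $n_{12} = 5$                 $n_{13} = 2$
i = 2       $\bar{Y}_{21\cdot} = 5.4$   $\bar{Y}_{22\cdot} = 8.33$   $\bar{Y}_{23\cdot} = 2.5$
            $n_{21} = 5$                $n_{22} = 3$                 $n_{23} = 2$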
Type IV estimable functions for Factor B:

Main Effects

            A      B      C
i = 1       -      0      0
i = 2       1      0     -1

$$\mu_{2A} - \mu_{2C}$$

Interaction

            A      B      C
i = 1       -     .5    -.5
i = 2       0     .5    -.5

$$\frac{1}{2}(\mu_{1B} + \mu_{2B}) - \frac{1}{2}(\mu_{1C} + \mu_{2C})$$

In either case, Type IV sums of squares and testable functions are not the same as Type III sums of squares and testable functions.

Additive model

$$Y_{ijk} = \mu + \alpha_i + \beta_j + \epsilon_{ijk}, \qquad i = 1, \ldots, a, \quad j = 1, \ldots, b, \quad k = 1, \ldots, n_{ij}$$

For this model

$$E(Y_{ijk}) = \mu_{ij} = \mu + \alpha_i + \beta_j$$

may be estimable when $n_{ij} = 0$.
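To see how a cell mean can be estimable under the additive model even though its cell is empty, consider the example with no data in cell (1,3). The following sketch is only an illustration under stated assumptions: the parameter ordering and the helper `additive_row` are invented here, and the cell means are the sample means reported for the example data. It checks that the coefficient vector of mu_13 is a linear combination of coefficient vectors for cells that do have data, and then forms the corresponding unbiased estimate from the observed cell means. This plug-in value is unbiased under the additive model, but it is not claimed to be the least squares estimate.

```python
# Assumed parameter order under the additive model: mu, a1, a2, b1, b2, b3
def additive_row(i, j):
    """Coefficient vector of mu_ij = mu + a_i + b_j."""
    row = [0] * 6
    row[0] = 1          # mu
    row[i] = 1          # a_i sits at index 1 or 2
    row[2 + j] = 1      # b_j sits at index 3, 4 or 5
    return row

# Target: the empty cell (1,3).
target = additive_row(1, 3)

# Combination of nonempty cells: mu_23 - mu_22 + mu_12.
combo = [
    r23 - r22 + r12
    for r23, r22, r12 in zip(additive_row(2, 3), additive_row(2, 2), additive_row(1, 2))
]
print("same coefficient vector:", combo == target)

# Sample means of the nonempty cells used in the combination.
cell_means = {(1, 2): 4.6, (2, 2): 25 / 3, (2, 3): 5.4}
mu13_hat = cell_means[(2, 3)] - cell_means[(2, 2)] + cell_means[(1, 2)]
print(f"unbiased estimate of mu_13: {mu13_hat:.4f}")
```

The same combination fails for the model with interaction, because the interaction parameter for the empty cell never appears in the expectation of any observation.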
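For Example 8.1, $n_{13} = 0$, but

$$\begin{aligned}
\mu_{13} &= \mu + \alpha_1 + \beta_3 \\
         &= (\mu + \alpha_2 + \beta_3) - (\mu + \alpha_2 + \beta_2) + (\mu + \alpha_1 + \beta_2) \\
         &= \mu_{23} + (\mu_{12} - \mu_{22}) \\
         &= E(\bar{Y}_{23\cdot} - \bar{Y}_{22\cdot} + \bar{Y}_{12\cdot})
\end{aligned}$$

Summary

Sum of Squares: $R(\mu)$
Associated null hypothesis:

$$H_0: \mu + \sum_{i=1}^{a} \frac{n_{i\cdot}}{n_{\cdot\cdot}} \alpha_i + \sum_{j=1}^{b} \frac{n_{\cdot j}}{n_{\cdot\cdot}} \beta_j = 0
\quad \text{or} \quad
H_0: \sum_{i=1}^{a} \sum_{j=1}^{b} \frac{n_{ij}}{n_{\cdot\cdot}} \mu_{ij} = 0$$

Sum of Squares: $R(\alpha \mid \mu)$
Associated null hypothesis:

$$H_0: \alpha_i + \sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}} \beta_j \text{ are equal for all } i = 1, \ldots, a
\quad \text{or} \quad
H_0: \sum_{j=1}^{b} \frac{n_{ij}}{n_{i\cdot}} \mu_{ij} \text{ are equal for all } i = 1, \ldots, a$$

Sum of Squares: $R(\beta \mid \mu, \alpha)$
Associated null hypothesis:

$$H_0: \beta_j \text{ are equal for all } j = 1, \ldots, b$$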
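Sum of Squares: $R(\mu)$
Associated null hypothesis:

$$H_0: \mu + \sum_{i=1}^{a} \frac{n_{i\cdot}}{n_{\cdot\cdot}} \alpha_i + \sum_{j=1}^{b} \frac{n_{\cdot j}}{n_{\cdot\cdot}} \beta_j = 0
\quad \text{or} \quad
H_0: \sum_{i=1}^{a} \sum_{j=1}^{b} \frac{n_{ij}}{n_{\cdot\cdot}} \mu_{ij} = 0$$

Sum of Squares: $R(\beta \mid \mu)$
Associated null hypothesis:

$$H_0: \beta_j + \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} \alpha_i \text{ are equal for all } j = 1, \ldots, b
\quad \text{or} \quad
H_0: \sum_{i=1}^{a} \frac{n_{ij}}{n_{\cdot j}} \mu_{ij} \text{ are equal for all } j = 1, \ldots, b$$

Sum of Squares: $R(\alpha \mid \mu, \beta)$
Associated null hypothesis:

$$H_0: \alpha_i \text{ are equal for all } i = 1, \ldots, a$$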