Backup of Notes3.wxb

St 711 (66)
Latin Squares (Chap. 6)
• Block on rows, columns. Standard order of treatments A,B,C,... in first row, column.
  Each letter once per row, once per column.

  Square 1    Square 2    Square 3    Square 4
  A B C D     A B C D     A B C D     A B C D
  B A D C     B C D A     B D A C     B A D C
  C D A B     C D A B     C A D B     C D B A
  D C B A     D A B C     D C B A     D C A B

  All others are permutations of these.
• 4 people (rows), 4 days (columns), 4 drugs (letters)

              Day 1   Day 2   Day 3   Day 4   (total)
  Person 1    A 57    B 59    C 50    D 63    (229)
  Person 2    B 43    D 45    A 59    C 53    (200)
  Person 3    C 45    A 62    D 58    B 63    (228)
  Person 4    D 43    C 52    B 50    A 62    (207)
  (total)     (188)   (218)   (217)   (241)   (864)

  drugs: A 240   B 215   C 200   D 209

  SS(rows) = SS(people) = 229^2/4 + ... + 207^2/4 - 864^2/16 = 162.5
  etc.
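The totals-based sums of squares above can be checked with a few lines of code. This is a minimal Python sketch (not one of the SAS demos); the variable and function names are illustrative only. It computes SS(people), SS(days), SS(drugs) and the error SS from the 4x4 table.

```python
# Drug example: rows = people, columns = days, letter = drug.
square = [
    [("A", 57), ("B", 59), ("C", 50), ("D", 63)],
    [("B", 43), ("D", 45), ("A", 59), ("C", 53)],
    [("C", 45), ("A", 62), ("D", 58), ("B", 63)],
    [("D", 43), ("C", 52), ("B", 50), ("A", 62)],
]
t = len(square)
grand = sum(y for row in square for _, y in row)
cf = grand ** 2 / t ** 2          # correction factor 864^2/16


def ss_from_totals(totals):
    """Sum of squares from totals of t observations each."""
    return sum(x ** 2 for x in totals) / t - cf


row_totals = [sum(y for _, y in row) for row in square]
col_totals = [sum(square[i][j][1] for i in range(t)) for j in range(t)]
drug_totals = {}
for row in square:
    for drug, y in row:
        drug_totals[drug] = drug_totals.get(drug, 0) + y

ss_rows = ss_from_totals(row_totals)
ss_cols = ss_from_totals(col_totals)
ss_drug = ss_from_totals(drug_totals.values())
ss_total = sum(y ** 2 for row in square for _, y in row) - cf
ss_error = ss_total - ss_rows - ss_cols - ss_drug

print(ss_rows, ss_cols, ss_drug, ss_error)
```

The four values should agree with the person, day, drug and error sums of squares in the Demo 19 GLM output.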
** Demo 19 **;
Data drug; person+1;
do day = 1 to 4;
input drug $ Y @;
output; end;
cards;
A 57 B 59 C 50 D 63   (229)  columns 188 218 217 241
B 43 D 45 A 59 C 53   (200)  drug totals 240 215 200 209
C 45 A 62 D 58 B 63   (228)  none of this ( ) is read - why?
D 43 C 52 B 50 A 62   (207)
;
proc print;
proc glm; class person drug day;
model Y = person day drug; run;
Class Level Information
Class     Levels   Values
person    4        1 2 3 4
drug      4        A B C D
day       4        1 2 3 4
Number of observations 16

The GLM Procedure
Dependent Variable: Y
                         Sum of
Source            DF     Squares        Mean Square   F Value   Pr > F
Model              9     736.5000000    81.8333333    7.06      0.0136
Error              6      69.5000000    11.5833333
Corrected Total   15     806.0000000

Source    DF   Type III SS    Mean Square    F Value   Pr > F
               (same as Type I)
person    3    162.5000000     54.1666667     4.68     0.0517
day       3    353.5000000    117.8333333    10.17     0.0091
drug      3    220.5000000     73.5000000     6.35     0.0273
• Y_{ij} = \mu + R_i + C_j + \tau_{k(ij)} + e_{ij}
• For k=1 (k=A) and k=2 (B) we get
  \bar{Y}_A = \mu + \bar{R}_. + \bar{C}_. + \tau_1 + \bar{e}_A
  \bar{Y}_B = \mu + \bar{R}_. + \bar{C}_. + \tau_2 + \bar{e}_B   <= different e's
  Difference does not involve R or C effects.
  E{\bar{Y}_A - \bar{Y}_B} = _____
  Variance{\bar{Y}_A - \bar{Y}_B} = _____
• Balanced: Add 5 to row 1:
  SS(drug), SS(col) do not change etc.
• Square 3 balanced for carryover (A follows each other letter once - same for B, etc.)
• If following A adds 5 to response then difference B mean - C mean
  contains 5/4 - 5/4 = 0 carryover effect.
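The two balance claims can be checked numerically. The sketch below is illustrative Python, not part of the course demos, and `latin_ss` is a hypothetical helper name. It adds 5 to every response in row 1 of the drug example and confirms that SS(drug) and SS(col) are unchanged, then lists which letters precede each letter within the rows of square 3.

```python
y = [
    [57, 59, 50, 63],
    [43, 45, 59, 53],
    [45, 62, 58, 63],
    [43, 52, 50, 62],
]
trt = ["ABCD", "BDAC", "CADB", "DCBA"]   # square 3, the layout of the drug data


def latin_ss(y, trt):
    """Row, column and treatment sums of squares for a t x t Latin square."""
    t = len(y)
    cf = sum(map(sum, y)) ** 2 / t ** 2

    def ss(totals):
        return sum(x * x for x in totals) / t - cf

    rows = [sum(r) for r in y]
    cols = [sum(y[i][j] for i in range(t)) for j in range(t)]
    trts = {}
    for i in range(t):
        for j in range(t):
            trts[trt[i][j]] = trts.get(trt[i][j], 0) + y[i][j]
    return {"rows": ss(rows), "cols": ss(cols), "trt": ss(trts.values())}


base = latin_ss(y, trt)
shifted_y = [[v + 5 for v in y[0]]] + y[1:]      # add 5 to row 1 only
shifted = latin_ss(shifted_y, trt)

# Letters that immediately precede each letter within a row (person).
preceded_by = {
    c: sorted(row[k - 1] for row in trt for k in range(1, len(row)) if row[k] == c)
    for c in "ABCD"
}

print(base, shifted, preceded_by)
```

Only the row sum of squares changes after the shift, and every letter in square 3 is preceded by each of the other three letters exactly once.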
• Multiple squares (Sec. 6.3)
(1) Three four day trials, 4 people each
    Source            df
    Square             2
    Person (square)    9
    Day (square)       9
    Drug               3
    Error            ___   splits into  Square*Drug ___  and  Error ___
(2) One four day trial, 12 people
    Source            df
    Person            11
    Day                3
    Drug               3
    Error            ___   (Can split up - G&G table 6.4)
(3) Repeat same square (same R, C)
    s=8 regional offices, 25 workers each
    5 pulldown menu systems (trt)
• Rows: Monitor types (5) fixed
  Columns: Room lighting (5) fixed
• Analysis depends on assumptions
  Here we assume:
  Squares random, all else fixed
  Re-randomize each square
** demo 20 generates some data then GLM, MIXED **;
** MIXED output not shown here **;
proc glm; class r s c menu;
model Y = S|r S|c menu S*menu;
random S S*r S*c s*menu/test; title "Pooling G&G errors";
proc mixed covtest; class r c menu s;
model Y = menu r c/ddfm=satterthwaite;
random S S*r S*c S*menu;
run;
Dependent Variable: Y   Reaction Time
                         Sum of
Source            DF     Squares        Mean Square   F Value   Pr > F
Model            103     34839.43915    338.24698     410.80    <.0001
Error             96        79.04480      0.82338
Corrected Total  199     34918.48395

Source    DF   Type III SS    Mean Square   F Value   Pr > F
s          7     743.11035     106.15862     128.93   <.0001
r          4     608.34520     152.08630     184.71   <.0001
r*s       28      26.97640       0.96344       1.17   0.2817
c          4    1337.17270     334.29318     406.00   <.0001
s*c       28      31.84090       1.13717       1.38   0.1263
menu       4   31361.23220    7840.30805    9522.06   <.0001
s*menu    28     730.76140      26.09862      31.70   <.0001

Source    Type III Expected Mean Square
s         Var(Error) + 5 Var(s*menu) + 5 Var(s*c) + 5 Var(r*s) + 25 Var(s)
r         Var(Error) + 5 Var(r*s) + Q(r)
r*s       Var(Error) + 5 Var(r*s)
c         Var(Error) + 5 Var(s*c) + Q(c)
s*c       Var(Error) + 5 Var(s*c)
menu      Var(Error) + 5 Var(s*menu) + Q(menu)
s*menu    Var(Error) + 5 Var(s*menu)

The GLM Procedure
Tests of Hypotheses for Mixed Model Analysis of Variance
(partial output)
Dependent Variable: Y   Reaction Time

Source               DF   Type III SS    Mean Square   F Value   Pr > F
r                     4    608.345200    152.086300    157.86    <.0001
Error: MS(r*s)       28     26.976400      0.963443

Source               DF   Type III SS    Mean Square   F Value   Pr > F
r*s                  28     26.976400      0.963443      1.17    0.2817
s*c                  28     31.840900      1.137175      1.38    0.1263
s*menu               28    730.761400     26.098621     31.70    <.0001
Error: MS(Error)     96     79.044800      0.823383

Source               DF   Type III SS    Mean Square   F Value   Pr > F
c                     4   1337.172700    334.293175    293.97    <.0001
Error: MS(s*c)       28     31.840900      1.137175

Source               DF   Type III SS    Mean Square   F Value   Pr > F
menu                  4   31361          7840.308050   300.41    <.0001
Error: MS(s*menu)    28    730.761400     26.098621

• Example 2: Dairy Cows (Sec. 6.3.1)
  Columns: Cows (12)
  Rows: Time (3)
  Trts: Feed Type (A,B,C)
** Demo23.sas **;
Data Lucas; Day+1;
Do Square=1 to 4; Do i=1 to 3; drop i;
Cow = 3*(Square-1)+i; Input trt $ Y @@;
output; end; end;
* <-- square 1 ----> | <-- square 2 ----> | <-- square 3 ----> | <-- square 4 ----> ;
cards;
C 35.3 A 46.1 B 42.7   A 64.2 C 29.8 B 29.8   A 26.8 C 37.2 B 29.8   C 61.5 A 26.7 B 38.6
A 53.5 B 28.6 C 33.9   B 27.9 A 26.7 C 30.8   C 24.1 B 36.4 A 26.4   B 46.4 C 25.0 A 29.6
B 24.9 C 38.5 A 35.5   C 38.0 B 37.7 A 36.4   B 50.5 A 23.4 C 24.5   A 24.9 B 20.6 C 29.2
;
proc print;
proc means noprint; Var Y; class Square Trt Cow Day;
output out=out1 mean=mnY sum=sumY;
proc print data=out1;
proc glm data=lucas;
class day cow square trt;
Model Y = square trt Cow(square) Day trt*square Day*square;
random Square Cow(Square) Day trt*square Day*square;
/* The G&G model pg. 141 assumes squares have some meaning they might be barns. If just 12 cows over 3 days with
Latin Square trts.
Model Y = trt Cow(square) Day ;
*/
Title 'G&G Analysis Pg. 141';
** Need single quotes - why ??? (Macros!) *; run;
(partial output)
Obs   Square   trt   Cow   Day   _TYPE_   _FREQ_   mnY       sumY
  1     .              .     .      0       36     34.4972   1241.9

  5     .              1     .      2        3     37.9000    113.7
  6     .              2     .      2        3     37.7333    113.2
  7     .              3     .      2        3     37.3667    112.1
  8     .              4     .      2        3     43.3667    130.1
  9     .              5     .      2        3     31.4000     94.2
 10     .              6     .      2        3     32.3333     97.0
 11     .              7     .      2        3     33.8000    101.4
 12     .              8     .      2        3     32.3333     97.0
 13     .              9     .      2        3     26.9000     80.7
 14     .             10     .      2        3     44.2667    132.8
 15     .             11     .      2        3     24.1000     72.3
 16     .             12     .      2        3     32.4667     97.4

137     1              .     .      8        9     37.6667    339.0
138     2              .     .      8        9     35.7000    321.3
139     3              .     .      8        9     31.0111    279.1
140     4              .     .      8        9     33.6111    302.5

SS(cow) = (113.7^2 + ... + 97.4^2)/3 - 1241.9^2/36 = 1181.34
SS(Square) = 339^2/9 + ... + 302.5^2/9 - 1241.9^2/36 = 219.87
SS(Cow(Square)) = 1181.34 - 219.87 = 961.47
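The three sums of squares can be reproduced directly from the cow totals in the PROC MEANS output. The following is a small Python sketch with illustrative variable names, not part of Demo23.sas.

```python
# Cow totals (sumY for _TYPE_ = 2), cows 1..12; 3 days per cow.
cow_totals = [113.7, 113.2, 112.1, 130.1, 94.2, 97.0,
              101.4, 97.0, 80.7, 132.8, 72.3, 97.4]
days = 3
cows_per_square = 3

grand = sum(cow_totals)
cf = grand ** 2 / (days * len(cow_totals))        # 1241.9^2 / 36

ss_cow = sum(x * x for x in cow_totals) / days - cf

# Cows 1-3 form square 1, cows 4-6 square 2, and so on.
square_totals = [
    sum(cow_totals[k:k + cows_per_square])
    for k in range(0, len(cow_totals), cows_per_square)
]
ss_square = sum(x * x for x in square_totals) / (days * cows_per_square) - cf

ss_cow_in_square = ss_cow - ss_square

print(round(ss_cow, 2), round(ss_square, 2), round(ss_cow_in_square, 2))
```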
Similarly, you can reproduce this GLM output:
                         Sum of
Source            DF     Squares        Mean Square    F Value   Pr > F
Model             27     3000.565278    111.132047     0.99      0.5490
Error              8      898.904444    112.363056
Corrected Total   35     3899.469722

Source        DF   Type I SS      Mean Square    F Value   Pr > F
                   (Type III = Type I)
Square         3   219.8719444     73.2906481    0.65      0.6036
trt            2     6.4072222      3.2036111    0.03      0.9720
Cow(Square)    8   961.4711111    120.1838889    1.07      0.4633
Day            2   372.8622222    186.4311111    1.66      0.2496
Square*trt     6   969.4638889    161.5773148    1.44      0.3090
Day*Square     6   470.4888889     78.4148148    0.70      0.6600
Graeco-Latin Squares
Exist for every t except t=2 and t=6
(easy to construct when t is prime).
• Superimpose 2 orthogonal Latin Squares

  Latin        Greek
  1 2 3 4      1 2 3 4
  3 4 1 2      2 1 4 3
  4 3 2 1      3 4 1 2
  2 1 4 3      4 3 2 1

orthogonal: each (Latin, Greek) letter pair appears once
(1,1)=(A, \alpha), (1,2), (2,1) etc.
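Orthogonality is easy to verify by listing the superimposed pairs. The sketch below is illustrative Python (the function names `is_latin` and `orthogonal` are made up for this example). It checks that both 4x4 arrays are Latin squares and that all 16 (Latin, Greek) pairs are distinct.

```python
latin = [[1, 2, 3, 4],
         [3, 4, 1, 2],
         [4, 3, 2, 1],
         [2, 1, 4, 3]]
greek = [[1, 2, 3, 4],
         [2, 1, 4, 3],
         [3, 4, 1, 2],
         [4, 3, 2, 1]]


def is_latin(sq):
    """Each symbol once per row and once per column."""
    n = len(sq)
    full = set(range(1, n + 1))
    rows_ok = all(set(row) == full for row in sq)
    cols_ok = all({sq[i][j] for i in range(n)} == full for j in range(n))
    return rows_ok and cols_ok


def orthogonal(a, b):
    """True when every ordered pair (a[i][j], b[i][j]) occurs exactly once."""
    n = len(a)
    pairs = {(a[i][j], b[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n


print(is_latin(latin), is_latin(greek), orthogonal(latin, greek))
print(orthogonal(latin, latin))   # a square is never orthogonal to itself
```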
• Rows: Days        Greek: Cars
  Columns: Drivers  Latin: Fuels
  Y = miles per gallon (mpg)
** Demo24.sas **;
** Graeco-Latin Square **;
Data cars; do Day = 1 to 4; Do driver=1 to 4;
Input Car Fuel MPG @@; output; end; end;
Cards;
1 1 27.5
2 2 26.1
3 3 29.3
4 4 41.8
3 2 30.6
4 1 32.2
1 4 37.4
2 3 37.1
4 3 34.4
3 4 37.5
2 1 33.4
1 2 33.0
2 4 37.8
1 3 31.7
4 2 30.5
3 1 35.9
;
proc glm; class Car Fuel Driver Day;
Model MPG = Car Fuel Driver Day;
Means Fuel/Tukey lines;
run;
Dependent Variable: MPG
                         Sum of
Source            DF     Squares     Mean Square   F Value   Pr > F
Model             12     264.8350    22.06958      15.46     0.0224
Error              3       4.2825     1.42750
Corrected Total   15     269.1175

Source    DF   Type I SS    Mean Square   F Value   Pr > F
               (Type III SS same)
Car        3    11.00250     3.66750       2.57     0.2294
Fuel       3   159.48250    53.16083      37.24     0.0071
driver     3    64.48250    21.49417      15.06     0.0259
Day        3    29.86750     9.95583       6.97     0.0725

Tukey's Studentized Range (HSD) Test for MPG
NOTE: This test controls the Type I experimentwise
error rate, but it generally has a higher Type II
error rate than REGWQ.

Alpha                                   0.05
Error Degrees of Freedom                3
Error Mean Square                       1.4275
Critical Value of Studentized Range     6.82453
Minimum Significant Difference          4.0769

Means with the same letter are not significantly different.

Tukey Grouping   Mean      N   Fuel
      A          38.6250   4   4
      B          33.1250   4   3
      B          32.2500   4   1
      B          30.0500   4   2
• Some squares balanced for carryover
• Each trt follows each other same # of times
• "Complete" Latin squares
• Row complete - rows have each ordered treatment pair
  (adjacent positions) once

  6 1 5 2 4 3     (6,1)
  1 2 6 3 5 4     (2,6) (6,3)
  2 3 1 4 6 5     (4,6) (6,5)
  3 4 2 5 1 6     (1,6)
  4 5 3 6 2 1     (3,6) (6,2)
  5 6 4 1 3 2     (5,6) (6,4)

• Complete => row and column complete
• Advantage - balance for carryover
  Treatment 6 mean:
  6 preceded by each other trt once.
  Compare 5 to 6, carryover from other
  treatments subtracts out of difference.
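Row completeness can be checked by counting the adjacent ordered pairs in each row. This is an illustrative Python sketch; the square is the 6x6 example above, and the variable names are not from the notes.

```python
from collections import Counter

square = [
    [6, 1, 5, 2, 4, 3],
    [1, 2, 6, 3, 5, 4],
    [2, 3, 1, 4, 6, 5],
    [3, 4, 2, 5, 1, 6],
    [4, 5, 3, 6, 2, 1],
    [5, 6, 4, 1, 3, 2],
]

# Ordered pairs (earlier, later) in adjacent positions within a row.
pairs = Counter(
    (row[k], row[k + 1]) for row in square for k in range(len(row) - 1)
)

t = len(square)
row_complete = len(pairs) == t * (t - 1) and set(pairs.values()) == {1}

# Treatments that immediately precede treatment 6.
before_6 = sorted(a for (a, b) in pairs if b == 6)

print(row_complete, before_6)
```

All 30 ordered pairs of distinct treatments occur exactly once, and treatment 6 is preceded once by each of treatments 1 to 5, which is what makes first-order carryover cancel in treatment differences.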
Youden Squares
• Really Rectangles
• Subset of rows of Latin square such that
  each trt pair occurs together in exactly k
  columns.

  A B C D E F G
  B C D E F G A
  D E F G A B C

  G pairs: GA-col 7, GB-6, GC-7, GD-4, GE-4, GF-6

  C D E F G A B
  E F G A B C D
  F G A B C D E
  G A B C D E F
  (each combo appears twice)

• These two - top & bottom of Latin square
• Columns form BIB
  (Balanced Incomplete Block design)
• \bar{Y}_A - \bar{Y}_B =
  \tau_A - \tau_B + (C_1 + C_5 + C_7)/3 - (C_1 + C_2 + C_6)/3 + e's
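The pair-count property of the two rectangles can be confirmed by counting, for every pair of treatments, the number of columns containing both. The Python below is a sketch with illustrative names (`pair_counts` is not from the notes).

```python
from collections import Counter
from itertools import combinations

top = ["ABCDEFG",
       "BCDEFGA",
       "DEFGABC"]
bottom = ["CDEFGAB",
          "EFGABCD",
          "FGABCDE",
          "GABCDEF"]


def pair_counts(rows):
    """Number of columns in which each unordered treatment pair occurs together."""
    counts = Counter()
    for col in zip(*rows):
        for pair in combinations(sorted(col), 2):
            counts[pair] += 1
    return counts


top_counts = pair_counts(top)
bottom_counts = pair_counts(bottom)

print(set(top_counts.values()), set(bottom_counts.values()))

# Columns (1-based) in which A and B appear in the 3-row rectangle.
cols_A = [j + 1 for j, col in enumerate(zip(*top)) if "A" in col]
cols_B = [j + 1 for j, col in enumerate(zip(*top)) if "B" in col]
print(cols_A, cols_B)
```

The column lists for A and B are the subscripts of the column effects in the A minus B comparison: only column 1 is shared, so the remaining column effects do not cancel.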
• Column effects C fixed -> bias
• Column effects C random -> variance
• Can estimate nicely in PROC MIXED.
Example:
3 expensive knitting machines, 7 types
of fiber that I use in cloth, 7 knitting speeds.
Impossible to buy more machines,
expensive to set up new (speed, fiber)
combinations. Y = fabric strength
Goal: Investigate effects of fabric type
on Y.
* demo25.sas **
** Youden square ***;
Data Youden;
do row=1 to 3; do column = 1 to 7; input trt $ Y @@;
output; end; end;
cards;
A 57 B 60 C 39 D 42 E 36 F 27 G 69
B 72 C 39 D 33 E 45 F 45 G 81 A 54
D 21 E 24 F 27 G 63 A 51 B 60 C 27
;
proc glm; class row column trt;
model Y = row column trt/solution;
means trt; lsmeans trt/e pdiff;
label row = "Machine"; label column = "Speed";
label trt = "Fabric";
** Now suppose row=day,
column = technician (both random) **;
proc mixed; class row column trt;
model Y = trt/solution;
random row column;
lsmeans trt/e pdiff; run;
The GLM Procedure
Dependent Variable: Y
                         Sum of
Source            DF     Squares        Mean Square   F Value   Pr > F
Model             14     5724.000000    408.857143    14.24     0.0018
Error              6      172.285714     28.714286
Corrected Total   20     5896.285714

Source    DF   Type I SS      Mean Square   F Value   Pr > F
row        2    666.000000    333.000000    11.60     0.0087
column     6   1036.285714    172.714286     6.01     0.0231
trt        6   4021.714286    670.285714    23.34     0.0007

Source    DF   Type III SS    Mean Square   F Value   Pr > F
row        2    666.000000    333.000000    11.60     0.0087
column     6    199.714286     33.285714     1.16     0.4311
trt        6   4021.714286    670.285714    23.34     0.0007

The GLM Procedure
Level of        --------------Y--------------
trt        N    Mean          Std Dev
A          3    54.0000000     3.0000000
B          3    64.0000000     6.9282032
C          3    35.0000000     6.9282032
D          3    32.0000000    10.5356538
E          3    35.0000000    10.5356538
F          3    33.0000000    10.3923048
G          3    71.0000000     9.1651514
The GLM Procedure
Least Squares Means
Coefficients for trt Least Square Means

                          trt Level
Effect               A             B             C
Intercept            1             1             1
row      1           0.33333333    0.33333333    0.33333333
row      2           0.33333333    0.33333333    0.33333333
row      3           0.33333333    0.33333333    0.33333333
column   1           0.14285714    0.14285714    0.14285714
column   2           0.14285714    0.14285714    0.14285714
column   3           0.14285714    0.14285714    0.14285714
column   4           0.14285714    0.14285714    0.14285714
column   5           0.14285714    0.14285714    0.14285714
column   6           0.14285714    0.14285714    0.14285714
column   7           0.14285714    0.14285714    0.14285714
trt      A           1             0             0
trt      B           0             1             0
trt      C           0             0             1
trt      D           0             0             0
trt      E           0             0             0
trt      F           0             0             0
trt      G           0             0             0

                       LSMEAN
trt    Y LSMEAN        Number
A      54.0000000      1
B      65.5714286      2
C      38.1428571      3
D      30.4285714      4
E      33.4285714      5
F      31.7142857      6
G      70.7142857      7
Least Squares Means for effect trt
Pr > |t| for H0: LSMean(i)=LSMean(j)
Dependent Variable: Y

i/j    1        2        3        4        5        6        7
1               0.0584   0.0187   0.0032   0.0060   0.0041   0.0151
2      0.0584            0.0015   0.0004   0.0006   0.0005   0.3399
3      0.0187   0.0015            0.1710   0.3787   0.2427   0.0006
4      0.0032   0.0004   0.1710            0.5675   0.8042   0.0002
5      0.0060   0.0006   0.3787   0.5675            0.7415   0.0003
6      0.0041   0.0005   0.2427   0.8042   0.7415            0.0002
7      0.0151   0.3399   0.0006   0.0002   0.0003   0.0002

NOTE: To ensure overall protection level, only probabilities
associated with pre-planned comparisons should be used.
-------------------------------------------------------------
The Mixed Procedure
Covariance Parameter Estimates
Cov Parm    Estimate
row         43.4694
column       1.9592
Residual    28.7143

Solution for Fixed Effects
                                Standard
Effect      trt    Estimate     Error     DF   t Value   Pr > |t|
Intercept           70.9608     4.9662     2    14.29    0.0049
trt         A      -16.9608     4.4603     6    -3.80    0.0089
trt         B       -6.7449     4.4603     6    -1.51    0.1812
trt         C      -35.5291     4.4603     6    -7.97    0.0002
trt         D      -39.1766     4.4603     6    -8.78    0.0001
trt         E      -36.1766     4.4603     6    -8.11    0.0002
trt         F      -38.1373     4.4603     6    -8.55    0.0001
trt         G        0          .          .      .      .

Type 3 Tests of Fixed Effects
          Num   Den
Effect    DF    DF    F Value   Pr > F
trt       6     6     27.34     0.0004

Least Squares Means
                             Standard
Effect   trt    Estimate     Error     DF   t Value   Pr > |t|
trt      A      54.0000      4.9662    6    10.87     <.0001
trt      B      64.2158      4.9662    6    12.93     <.0001
trt      C      35.4316      4.9662    6     7.13     0.0004
trt      D      31.7842      4.9662    6     6.40     0.0007
trt      E      34.7842      4.9662    6     7.00     0.0004
trt      F      32.8234      4.9662    6     6.61     0.0006
trt      G      70.9608      4.9662    6    14.29     <.0001

• BIB
  7 technicians, 3 fabrics each. No
  particular order within technician (rows
  have no meaning)
• G&G give other variations on Latin
  Square, e.g. 6x6 square with 3 trts, each
  twice per row, twice per column
Chapter 9 Repeated Treatments (brief)
• Y_{ij} = \mu + \pi_i + u_j + \tau_{d(i,j)} + \gamma_{d(i-1,j)} + e_{ij}
           mean + period + unit + trt + carryover
  d( ) treatment index.
• Balanced (always)
  Each unit in each period
• Strongly balanced (sometimes)
  Each trt follows each trt (including
  itself) equally often.
• Construction (t even)
  Write 0,1,...,t-1 such that successive
  differences (mod t) are 1,2,...,t-1 in some
  order. Ex: t=6
    0   1   5   2   4   3
     \ / \ / \ / \ / \ /
      1   4   3   2   5    <-- (2+6)-5=3
  This is column 1. Add 1 (mod t) repeatedly
  to generate other columns (row=time period)

    0    1    2    3   (4)   5
   (1)   2    3    4    5    0
    5    0    1   (2)   3    4
    2   (3)   4    5    0    1
    4    5   (0)   1    2    3
    3    4    5    0    1    2

  (parentheses mark the treatment immediately preceding 5 in each column)
• Repeat last row to strongly balance.
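The construction can be traced step by step in code. The sketch below is illustrative Python (names are not from the notes). It starts from the first column 0 1 5 2 4 3, checks its successive differences mod 6, develops the other columns by adding 1 mod 6, and counts how often each treatment follows each other treatment, with and without the repeated last row.

```python
from collections import Counter

t = 6
col1 = [0, 1, 5, 2, 4, 3]

# Successive differences (mod t) must be 1..t-1 in some order.
diffs = [(b - a) % t for a, b in zip(col1, col1[1:])]

# Rows are periods, columns are units; column j is column 1 plus j (mod t).
design = [[(x + j) % t for j in range(t)] for x in col1]


def follow_counts(rows):
    """Count ordered pairs (previous trt, current trt) within each unit."""
    return Counter(
        (rows[p][u], rows[p + 1][u])
        for p in range(len(rows) - 1)
        for u in range(len(rows[0]))
    )


balanced = follow_counts(design)
strongly = follow_counts(design + [design[-1]])   # repeat the last period

print(diffs)
for row in design:
    print(row)
print(len(balanced), set(balanced.values()))
print(len(strongly), set(strongly.values()))
```

Without the extra period each treatment follows every other treatment once (30 ordered pairs); repeating the last row adds each treatment following itself once, giving all 36 ordered pairs.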
• Carburetor icing study G&G page 185
  Trts: 4 de-icing additives
  Carryover: throttle plate may have
  residue of previous de-icer.
*** Demo26.sas ***;
Data one; Carb + 1 ;
Do period = 1 to 4; *Use 5 to strongly balance;
input De_icer $ Y @; Lagtrt=lag(De_Icer);
if period=1 then Lagtrt=" ";
XCA=(Lagtrt="A")-(Lagtrt="D");
XCB=(Lagtrt="B")-(Lagtrt="D");
XCC=(Lagtrt="C")-(Lagtrt="D");
output; end;
cards;
A 88   B 76   C 88   D 92   D 96
B 78   D 94   A 90   C 90   C 90
C 87.5 A 95.5 D 95.5 B 87.5 B 75.5
D 90.5 C 78.5 B 82.5 A 94.5 A 98.5
;
proc print;
proc glm; class period Carb De_icer;
model Y = Period Carb XCA--XCC De_icer;
Contrast "Carryover" XCA 1, XCB 1, XCC 1;
LSMEANS De_icer/e stderr Pdiff;
run;
Source     DF   Type III SS    Mean Square   F Value   Pr > F
period      3    72.0000000    24.0000000     3.33     0.1746
Carb        3   101.3090909    33.7696970     4.69     0.1183
XCA         1     2.1333333     2.1333333     0.30     0.6241
XCB         1    26.1333333    26.1333333     3.63     0.1529
XCC         1     2.1333333     2.1333333     0.30     0.6241
De_icer     3   275.8545455    91.9515152    12.77     0.0325

Contrast    DF   Contrast SS    Mean Square   F Value   Pr > F
Carryover    3   42.40000000    14.13333333    1.96     0.2968

Coefficients for De_icer Least Square Means
                       De_icer Level
Effect             A       B       C       D
Intercept          1       1       1       1
period   1         0.25    0.25    0.25    0.25
period   2         0.25    0.25    0.25    0.25
period   3         0.25    0.25    0.25    0.25
period   4         0.25    0.25    0.25    0.25
Carb     1         0.25    0.25    0.25    0.25
Carb     2         0.25    0.25    0.25    0.25
Carb     3         0.25    0.25    0.25    0.25
Carb     4         0.25    0.25    0.25    0.25
XCA                0       0       0       0
XCB                0       0       0       0
XCC                0       0       0       0
De_icer  A         1       0       0       0
De_icer  B         0       1       0       0
De_icer  C         0       0       1       0
De_icer  D         0       0       0       1

                          Standard                LSMEAN
De_icer    Y LSMEAN       Error        Pr > |t|   Number
A          91.8000000     1.3910428    <.0001     1
B          81.7000000     1.3910428    <.0001     2
C          86.2000000     1.3910428    <.0001     3
D          92.3000000     1.3910428    <.0001     4
(note: The online demo 26 also shows a way to check
to see if any arbitrary function is estimable and
if so, it gives the estimate and its standard
error. It uses this to check the LSMEANS here).
Chapter 7 Split and Strip Plots
• See also St 512 notes
• Split plot (Y = vibration in gears)
  8 Motors (whole plot units)
  2 designs, 4 each (whole plot trts)
  i=1,2 k=1,...,4
  3 Axles/motor (split plot units)
  3 diameters (split plot trts)
  j=1,2,3
• Split split plot
  Add 6 runs/axle (one each at 6 speeds)
• Without treatments, "nested design"
  8 "identical" motors
  3 "identical" axles per motor (24 axles)
  6 runs per axle at same speed
• Goal: reduce vibration Y
  Y_{ijk} = \mu + M_i + A(M)_{ij} + e_{ijk}
  variances:  \sigma_M^2 (motor),  \sigma_A^2 (axle),  \sigma^2 (run)
** Demo 27.sas
Nested design ***;
Data Motor Spltplt;
input motor @; motor_design =1+(motor>4);
do axle = 1 to 3; avvib=0;
do run = 1 to 6;
input vib @; avvib=avvib+vib; output motor;
end; avvib= avvib/6; output spltplt; end;
cards;
1 47.0 50.8 47.7 47.3 49.2 51.3 41.4 44.8 42.8
47.0 44.7 42.2 42.0 44.9 42.6
2 47.2 47.7 49.4 48.1 45.3 49.4 54.9 50.9 53.5
39.4 40.9 42.6 42.6 42.4 40.2
3 58.2 60.5 58.8 60.0 56.7 61.6 46.6 51.1 45.8
61.3 58.0 59.0 60.5 59.5 56.9
4 50.7 47.7 53.6 50.8 51.5 51.5 49.3 47.3 46.0
40.3 43.0 44.1 41.4 39.6 39.3
5 52.4 55.0 54.6 57.9 56.5 55.8 40.7 42.6 41.0
38.3 35.1 38.0 38.4 37.9 39.0
6 43.7 43.5 39.1 37.1 41.8 40.9 40.0 38.0 38.7
39.4 37.6 37.7 35.1 35.3 32.5
7 58.3 55.4 55.3 52.0 57.5 51.5 47.6 51.3 51.4
43.6 47.8 40.9
52.8 48.9 52.2
47.5 49.3 48.4
44.5 48.9 48.6
40.4 43.3 43.5
37.0 38.7 37.7
51.3 48.3 47.6
8
43.0 43.3 42.0 49.6 44.4 45.6
36.8 38.4 38.7 34.8 41.0 39.6 41.4 40.7 39.5 41.7 39.1 40.0
47.1 42.9 44.4 41.8 43.7 45.4
;
proc glm data=motor; class motor axle;
model vib= motor axle(motor);
random motor axle(motor); Title "Nested";
The GLM Procedure
Dependent Variable: vib
                         Sum of
Source            DF     Squares        Mean Square   F Value   Pr > F
Model             23     6163.576667    267.981594    70.08     <.0001
Error            120      458.843333      3.823694
Corrected Total  143     6622.420000

Source         DF   Type III SS    Mean Square   F Value   Pr > F
motor           7   3400.385556    485.769365    127.04    <.0001
axle(motor)    16   2763.191111    172.699444     45.17    <.0001

Source         Type III Expected Mean Square
motor          Var(Error) + 6 Var(axle(motor)) + 18 Var(motor)
axle(motor)    Var(Error) + 6 Var(axle(motor))
• Computing variance component estimates:
  \bar{Y}_{i..} = \mu + M_i + \bar{A}(M)_{i.} + \bar{e}_{i..}
  \bar{Y}_{...} = \mu + \bar{M}_. + \bar{A}(M)_{..} + \bar{e}_{...}

  E{ 18 \sum_{i=1}^{8} (\bar{Y}_{i..} - \bar{Y}_{...})^2 / 7 } =
  18 [ E{ \sum_{i=1}^{8} (M_i - \bar{M}_.)^2 }
     + E{ \sum_{i=1}^{8} (\bar{A}(M)_{i.} - \bar{A}(M)_{..})^2 }
     + E{ \sum_{i=1}^{8} (\bar{e}_{i..} - \bar{e}_{...})^2 } ] / 7 =
  18 [ \sigma_M^2 + \sigma_A^2/3 + \sigma^2/18 ] =
  18 \sigma_M^2 + 6 \sigma_A^2 + \sigma^2
• SS for nested effects:
  SS(motors) = 18 \sum_{i=1}^{8} (\bar{Y}_{i..} - \bar{Y}_{...})^2 = 3400
  SS(axles) = 6 \sum_{i=1}^{8} \sum_{j=1}^{3} (\bar{Y}_{ij.} - \bar{Y}_{...})^2 = 6163
  SS(axles(motors)) = difference = 2763.
• Variance Component Estimates:
  "method of moments"
  (1) Write down mean squares, expected mean squares

  Source      MS      Expected mean square
  motor       485.8   Var(Error) + 6 Var(axle(mtr)) + 18 Var(mtr)
  axle(mtr)   172.7   Var(Error) + 6 Var(axle(mtr))
  error       3.82    Var(error)

  (2) Find values of variance components so
  that sample moments = theoretical moments
  \hat{\sigma}^2 = s^2 = 3.82
  \hat{\sigma}_A^2 = s_A^2 = [MS(A(M)) - MSE]/6 = [172.7 - 3.82]/6 = 28.15
  \hat{\sigma}_M^2 = s_M^2 = [485.8 - 172.7]/18 = 17.4
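The method-of-moments step is just solving the expected mean square equations from the bottom up. The Python sketch below uses the unrounded mean squares from the GLM output; the variable names are illustrative.

```python
# Mean squares from the nested GLM fit (motor, axle within motor, error).
ms_motor = 485.769365
ms_axle = 172.699444
ms_error = 3.823694

runs_per_axle = 6
axles_per_motor = 3

# E[MS error] = Var(Error)
var_error = ms_error
# E[MS axle(motor)] = Var(Error) + 6 Var(axle)
var_axle = (ms_axle - ms_error) / runs_per_axle
# E[MS motor] = Var(Error) + 6 Var(axle) + 18 Var(motor)
var_motor = (ms_motor - ms_axle) / (runs_per_axle * axles_per_motor)

print(round(var_motor, 4), round(var_axle, 4), round(var_error, 4))
```

With balanced data these moment estimates should coincide with the REML estimates that PROC MIXED reports for the same model, as long as no estimate is negative.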
proc mixed data=motor; class motor axle;
model vib = ; random motor axle(motor);
The Mixed Procedure
(partial output)
Class Level Information
Class    Levels   Values
motor    8        1 2 3 4 5 6 7 8
axle     3        1 2 3

Covariance Parameter Estimates
Cov Parm       Estimate
motor          17.3928
axle(motor)    28.1460
Residual        3.8237

• Work on axles to reduce vibration.
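• Split plot - like nested but with treatments
  applied. G&G pg. 173 rice paddies
  4 farms (blocks)
  2 paddies/farm: whole plot units
  (C = CO2 = whole plot trt)
  3 samples/paddy: split plot units
  (N = 3 Nitrogen levels = split plot trt)

  Y_{ijk} = \mu + B_i + C_j + \epsilon(1)_{ij} + N_k + (CN)_{jk} + \epsilon(2)_{ijk}
  B_i ~ N(0, \sigma_B^2)   \epsilon(1)_{ij} ~ N(0, \sigma_1^2)
  \epsilon(2)_{ijk} ~ N(0, \sigma_2^2)
  i=1,2,...,r (=4 reps)   j=1,2,...,t (=2 CO2's)
  k=1,2,...,s (=3 levels of N)

  \bar{Y}_{.j.} = \mu + \bar{B}_. + C_j + \bar{\epsilon}(1)_{.j} + \bar{N}_. + \bar{(CN)}_{j.} + \bar{\epsilon}(2)_{.j.}
  \bar{Y}_{...} = \mu + \bar{B}_. + \bar{C}_. + \bar{\epsilon}(1)_{..} + \bar{N}_. + \bar{(CN)}_{..} + \bar{\epsilon}(2)_{...}
  \bar{Y}_{.j.} - \bar{Y}_{...} = C_j - \bar{C}_. + \bar{(CN)}_{j.} - \bar{(CN)}_{..}
      + [ \bar{\epsilon}(1)_{.j} - \bar{\epsilon}(1)_{..} ] + [ \bar{\epsilon}(2)_{.j.} - \bar{\epsilon}(2)_{...} ]
  = (under summing to 0 of fixed effects)
      C_j + [ \bar{\epsilon}(1)_{.j} - \bar{\epsilon}(1)_{..} ] + [ \bar{\epsilon}(2)_{.j.} - \bar{\epsilon}(2)_{...} ]

  MS(CO2) = rs [ \sum_{j=1}^{2} (\bar{Y}_{.j.} - \bar{Y}_{...})^2 ] / (t-1)
  E{MS(CO2)} = rs [ \sum_{j=1}^{2} C_j^2/(t-1) + \sigma_1^2/r + \sigma_2^2/(rs) ]   (why?)
             = rs \sum_{j=1}^{2} C_j^2/(t-1) + s \sigma_1^2 + \sigma_2^2
  E{MS(CO2)} = Q(C) + s \sigma_1^2 + \sigma_2^2
  SAS (no assumptions) writes
      Q(C,NC) + \sigma_2^2 + s \sigma_1^2

  ANOVA
  Source        df            EMS
  r Blocks      r-1           \sigma_2^2 + s \sigma_1^2 + Q(blocks)            (F)
                              \sigma_2^2 + s \sigma_1^2 + ts \sigma_{BLOCK}^2   (R)
  t C           t-1           \sigma_2^2 + s \sigma_1^2 + Q(C)
  Error 1       (r-1)(t-1)    \sigma_2^2 + s \sigma_1^2
  s N           s-1           \sigma_2^2 + Q(N)
  CN            (t-1)(s-1)    \sigma_2^2 + Q(CN)
  Error 2                     \sigma_2^2

  if no blocks:
  Error 1       (r-1)(t)      \sigma_2^2 + s \sigma_1^2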
********** Demo 28.sas ************;
data GG_pg213; array Y(6); title "G&G % Fertile Spikelets";
input block Y1-Y6;
drop y1-y6 i; do N=1 to 3;
i=N;   Pct = Y(i); CO2="C"; output;
i=N+3; Pct = Y(i); CO2="T"; output; end;
cards;
1 95.8 95.2 92.0 97.1 96.2 93.6
2 95.0 93.3 92.0 95.7 94.9 92.6
3 95.5 95.4 92.5 96.3 96.5 89.1
4 95.1 92.9 87.7 96.0 96.3 89.7
;
proc glm; class block CO2 N;
model Pct = block CO2 block*CO2 N CO2*N;
contrast "WP wrong denominator" CO2 -1 1;
contrast "WP - right!" CO2 -1 1 /E=CO2*block;
contrast "N1 vs N2" N -1 1 0;
contrast "N1 vs N2 in high CO2 -(1)"
   N 1 -1 0 CO2*N 0 0 0 1 -1 0;
contrast "N1 vs N2 in high CO2 -(2)"
   N 1 -1 0 CO2*N 0 1 0 -1 0 0 ;
** Key is CLASS statement!! **;
LSMEANS CO2/pdiff; ** option E=block*CO2 will fix this;
random block block*CO2/test;
** must appear after CONTRASTs to produce EMS;
run;
Dependent Variable: Pct
                         Sum of
Source            DF     Squares        Mean Square   F Value   Pr > F
Model             11     126.8616667    11.5328788    7.11      0.0010
Error             12      19.4716667     1.6226389
Corrected Total   23     146.3333333

Source       DF   Type III SS    Mean Square   F Value   Pr > F
                  (Type I = Type III)
block         3    12.7333333     4.2444444     2.62     0.0992
CO2           1     5.6066667     5.6066667     3.46     0.0877
block*CO2     3     5.3200000     1.7733333     1.09     0.3896
N             2   100.7158333    50.3579167    31.03     <.0001
CO2*N         2     2.4858333     1.2429167     0.77     0.4863

Contrast                     DF   Contrast SS   Mean Square   F Value   Pr > F
WP wrong denominator          1   5.60666667    5.60666667    3.46      0.0877

Contrast                     DF   Contrast SS   Mean Square   F Value   Pr > F
WP - right!                   1   5.60666667    5.60666667    3.16      0.1734
N1 vs N2                      1   2.10250000    2.10250000    1.30      0.2772
N1 vs N2 in high CO2 -(1)     1   0.18000000    0.18000000    0.11      0.7448

                     H0:LSMean1=LSMean2
CO2   Pct LSMEAN     Pr > |t|
C     93.5333333     0.0877
T     94.5000000

Source       Type III Expected Mean Square
block        Var(Error) + 3 Var(block*CO2) + 6 Var(block)
CO2          Var(Error) + 3 Var(block*CO2) + Q(CO2,CO2*N)
block*CO2    Var(Error) + 3 Var(block*CO2)
N            Var(Error) + Q(N,CO2*N)
CO2*N        Var(Error) + Q(CO2*N)

Source                  DF   Type III SS   Mean Square   F Value   Pr > F
block                    3   12.733333     4.244444      2.39      0.2461
* CO2                    1    5.606667     5.606667      3.16      0.1734
Error: MS(block*CO2)     3    5.320000     1.773333
* This test assumes one or more other fixed effects are zero.
(more tests follow)

Contrast                     Contrast Expected Mean Square
WP wrong denominator         Var(Error) + 3 Var(block*CO2) + Q(CO2,CO2*N)
WP - right!                  Var(Error) + 3 Var(block*CO2) + Q(CO2,CO2*N)
N1 vs N2                     Var(Error) + Q(N,CO2*N)
N1 vs N2 in high CO2 -(1)    Var(Error) + Q(N,CO2*N)
proc mixed; class block CO2 N;
model Pct = CO2 N CO2*N;
random block block*CO2;
contrast "WP" CO2 -1 1;
contrast "N1 vs N2" N -1 1 0;
contrast "N1 vs N2 in high CO2 -(1)"
N 1 -1 0 CO2*N 0 0 0 1 -1 0;
contrast "N1 vs N2 in high CO2 -(2)"
N 1 -1 0 CO2*N 0 1 0 -1 0 0 ;
LSMEANS CO2/pdiff;
** Key is CLASS statement!! **; run;
Covariance Parameter Estimates
Cov Parm     Estimate
block        0.4119
block*CO2    0.05023
Residual     1.6226

Fit Statistics
-2 Res Log Likelihood        71.3
AIC (smaller is better)      77.3
AICC (smaller is better)     79.0
BIC (smaller is better)      75.4

Type 3 Tests of Fixed Effects
          Num   Den
Effect    DF    DF    F Value   Pr > F
CO2       1      3     3.16     0.1734
N         2     12    31.03     <.0001
CO2*N     2     12     0.77     0.4863

Contrasts
                             Num   Den
Label                        DF    DF    F Value   Pr > F
WP                           1      3    3.16      0.1734
N1 vs N2                     1     12    1.30      0.2772
N1 vs N2 in high CO2 -(1)    1     12    0.11      0.7448
N1 vs N2 in high CO2 -(2)    .      .    .         .

Least Squares Means
                           Standard
Effect   CO2   Estimate    Error     DF   t Value   Pr > |t|
CO2      C     93.5333     0.5007    3    186.79    <.0001
CO2      T     94.5000     0.5007    3    188.72    <.0001

Differences of Least Squares Means
                                   Standard
Effect   CO2   _CO2   Estimate     Error     DF   t Value   Pr > |t|
CO2      C     T      -0.9667      0.5437    3    -1.78     0.1734
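Several numbers in the GLM and MIXED split plot output can be tied together by hand from the mean squares and the expected mean squares. The Python sketch below is illustrative only (variable names are assumptions, and it relies on the balanced layout with r=4 blocks, t=2 CO2 levels, s=3 N levels).

```python
# Mean squares from the GLM split plot fit.
ms_block = 4.2444444
ms_co2 = 5.6066667
ms_wp_error = 1.7733333      # block*CO2, whole plot error
ms_sp_error = 1.6226389      # residual, split plot error
r, t, s = 4, 2, 3

# Whole plot test: wrong (residual) versus right (block*CO2) denominator.
f_wrong = ms_co2 / ms_sp_error
f_right = ms_co2 / ms_wp_error

# Method-of-moments variance components from the expected mean squares.
var_wp = (ms_wp_error - ms_sp_error) / s          # Var(block*CO2)
var_block = (ms_block - ms_wp_error) / (t * s)    # Var(block)

# Standard errors for CO2 means and their difference.
se_diff = (2 * ms_wp_error / (r * s)) ** 0.5
se_mean = ((var_block + var_wp) / r + ms_sp_error / (r * s)) ** 0.5

print(round(f_wrong, 2), round(f_right, 2))
print(round(var_block, 4), round(var_wp, 5))
print(round(se_mean, 4), round(se_diff, 4))
```

The block variance cancels in the C minus T difference, which is why the standard error of the difference involves only the whole plot error mean square while the standard error of a single CO2 mean also includes Var(block).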
• Repeated measures
  Whole plots = subjects
  Whole plot trts = Drugs (for example)
  Split plot trt = time
  problem: Observations in time may
  be autocorrelated
** Demo29.sas Repeated Measures ***; data univariate;
input patient drug $ @; do visit=1 to 10;
input Y @; output; end; cards;
 1 A 16.3 14.4 14.9 16.1 16.0 15.3 16.4 16.0 18.7 16.3
 2 A 14.9 14.1 11.3 11.2 10.5  8.9  9.2  8.8 10.0 10.1
 3 A 15.1 16.4 13.4 14.7 14.7 13.6 12.7 13.8 15.2 15.7
 4 A 17.1 16.2 16.7 17.3 18.9 17.6 17.8 17.9 17.2 18.5
 5 A  7.9  7.4  6.6  7.4  8.7  8.1  9.2  7.9  5.4  8.0
 6 A  6.9  6.0  5.0  4.1  4.4  3.8  5.4  5.7  5.9  6.4
 7 A 11.7 11.9 11.4 13.4 13.2 12.6 12.0 11.8 12.3 12.9
 8 A 18.7 18.9 17.8 16.8 16.5 18.9 17.5 17.8 18.4 17.7
 9 A 19.1 19.3 20.0 21.1 21.4 21.1 21.7 22.6 20.0 17.9
10 A 21.8 21.4 20.0 16.7 17.2 17.7 18.8 19.9 18.5 19.8
11 A 15.1 16.1 16.4 13.8 12.9 13.3 11.8 12.0 11.0 10.7
12 A 23.6 24.2 24.0 25.1 22.7 22.7 23.2 22.8 22.0 22.1
13 A 17.2 17.2 18.5 19.7 20.8 20.5 22.7 21.8 21.0 20.7
14 A  7.6  7.8  7.3  8.0  6.4  8.7 11.2 10.2  9.8  9.6
15 A 15.6 15.5 15.5 15.5 14.7 12.3 12.3 13.3 13.4 14.3
16 A 22.3 22.6 22.4 22.9 21.4 22.6 21.8 21.6 21.7 21.8
17 A 21.9 20.6 20.4 20.8 21.1 21.0 19.2 19.2 19.7 19.8
18 A 12.7 11.7 11.7 11.9 12.4 13.3 14.4 14.0 15.3 14.1
19 B 14.1 14.6 12.2 13.6 14.8 13.5 12.5 13.5 11.9 12.4
20 B  8.3  8.4  8.2  7.7  7.0  7.1  5.2  4.2  4.1  5.1
21 B  6.2  6.2  6.8  7.1  6.4  5.4  5.3  3.6  3.0  2.5
22 B  6.0  4.2  5.8  5.1  6.1  6.6  7.5  7.8  8.3  9.9
23 B 10.1  9.9  8.3  9.6 10.3  9.5 10.3  9.7 10.7 11.1
24 B 17.1 18.5 18.2 17.2 16.7 17.1 15.7 17.3 16.5 16.2
25 B 10.2  9.3  9.1  9.2  9.7 11.1 11.4  9.5  8.2  8.8
26 B 11.8 12.6 10.8  9.7 10.9 12.0 12.2 12.9 12.5 13.4
27 B 13.7 15.4 15.9 15.1 14.4 14.3 13.8 12.1 12.8 13.2
28 B 11.8 11.7 10.0 10.5 12.3 12.9 12.9 12.9 11.3  9.9
29 B 13.2 11.2 13.8 12.1 12.9 12.1 12.7 10.8 13.2 14.2
30 B 17.7 18.0 19.2 20.0 19.9 19.6 18.5 19.4 19.1 17.6
31 B 17.6 18.4 17.0 18.5 20.6 21.3 22.8 21.9 22.5 21.5
32 B  9.5  8.9  9.2 10.1  9.7 12.1 11.9 11.6 10.6 11.8
33 B 14.7 14.2 14.9 15.2 12.1 15.6 14.6 13.7 12.0 12.8
34 B 14.8 14.3 14.6 13.8 13.0 13.5 13.7 15.3 15.8 16.6
35 B  9.5  9.8 11.1 11.6 12.1 11.9 11.1 12.4 11.8 11.7
36 B 12.7 13.1 12.0 12.3 11.0 12.9 13.2 13.3 14.3 14.2
;
proc transpose data = univariate out=multi
prefix=visit;
var Y; by patient drug;
proc print data=univariate (obs=3); run;
proc print noobs data=multi(obs=3); run;
proc glm data=multi; title "GLM - Repeated statement";
class patient drug;
model visit1-visit10 = drug/nouni;
repeated time polynomial / summary printe;
run;
proc glm data=univariate;
title "GLM - Compound Symmetry";
class patient drug visit;
model Y = drug patient(drug) visit drug*visit;
test h=drug e=patient(drug); run;
proc mixed data=univariate; title "Mixed";
class drug patient visit;
model Y=drug|visit;
** Pick one of these: **;
** first two same **;
* random patient(drug); ** split plot **;
* repeated/ subject=patient type=cs;
** compound symmetry**;
* repeated/ subject=patient type=un;
** unstructured**;
* repeated/ subject=patient type=Toeplitz;
random patient(drug);
repeated / subject=patient type=AR(1);
** split plot with AR(1) errors **;
run;
Obs   patient   drug   visit   Y
1     1         A      1       16.3
2     1         A      2       14.4
3     1         A      3       14.9

patient drug _NAME_ visit1 visit2 visit3 visit4 visit5 visit6 visit7 visit8 visit9 visit10
1       A    Y      16.3   14.4   14.9   16.1   16.0   15.3   16.4   16.0   18.7   16.3
2       A    Y      14.9   14.1   11.3   11.2   10.5    8.9    9.2    8.8   10.0   10.1
3       A    Y      15.1   16.4   13.4   14.7   14.7   13.6   12.7   13.8   15.2   15.7

GLM - Repeated statement
Sphericity Tests   (1)
                               Mauchly's
Variables                DF    Criterion    Chi-Square   Pr > ChiSq
Transformed Variates     44    0.0015538    199.16041    <.0001
Orthogonal Components    44    0.0015538    199.16041    <.0001
Manova Test Criteria and Exact F Statistics
for the Hypothesis of no time Effect
H = Type III SSCP Matrix for time
E = Error SSCP Matrix
S=1   M=3.5   N=12

Statistic                 Value        F Value   NumDF   DenDF   Pr>F
Wilks' Lambda             0.83661364   0.56      9       26      0.8135
Pillai's Trace            0.16338636   0.56      9       26      0.8135
Hotelling-Lawley Trace    0.19529487   0.56      9       26      0.8135
Roy's Greatest Root       0.19529487   0.56      9       26      0.8135

Manova Test Criteria and Exact F Statistics
for the Hypothesis of no time*drug Effect
H = Type III SSCP Matrix for time*drug
E = Error SSCP Matrix
S=1   M=3.5   N=12

Statistic                 Value        F Value   NumDF   DenDF   Pr>F
Wilks' Lambda             0.73281701   1.05      9       26      0.4273
Pillai's Trace            0.26718299   1.05      9       26      0.4273
Hotelling-Lawley Trace    0.36459714   1.05      9       26      0.4273
Roy's Greatest Root       0.36459714   1.05      9       26      0.4273

The GLM Procedure
Repeated Measures Analysis of Variance
Tests of Hypotheses for Between Subjects Effects

Source   DF   Type III SS    Mean Square   F Value   Pr > F
drug      1    867.692250    867.692250    4.08      0.0513   (2)
Error    34   7229.275389    212.625747

The GLM Procedure
Repeated Measures Analysis of Variance
Univariate Tests of Hypotheses for Within Subject Effects

                                                                  Adj Pr > F
Source         DF   Type III SS    Mean Square   F Value   Pr > F   G - G    H - F
time            9     4.9529167    0.5503241     0.32      0.9675   0.8065   0.8314
time*drug       9     9.6535833    1.0726204     0.63      0.7733   0.5965   0.6165
Error(time)   306   522.9285000    1.7089167

Greenhouse-Geisser Epsilon   0.3285   (4)
Huynh-Feldt Epsilon          0.3738
The above results treat each patient's vector
of responses as a random vector. The
eigenvalue-based tests are covered in a
multivariate course. Power for within-subject
effects (time, time*drug) is low
because no structure is imposed.
If "compound symmetry" can be assumed,
you can treat this as a split plot. The "test
for sphericity" tests H0: split plot analysis is
OK. The test should be done on orthogonal
components (orthogonal polynomials work,
so Mauchly's criterion test is the same in both
lines above).
One more GLM part that is interesting is
this: Strip out the linear part (linear visit
effect) from each patient. Analyze these 36
numbers. Strip out the quadratic part.
Analyze these 36 numbers, etc. (orthogonal
polynomials).
Here are the results:
time_N represents the nth degree polynomial contrast for
time
Contrast Variable: time_1
Source   DF   Type III SS    Mean Square   F Value   Pr > F
Mean      1     0.0275766    0.0275766     0.00      0.9522
drug      1     3.1517045    3.1517045     0.42      0.5225
Error    34   256.6953552    7.5498634

Contrast Variable: time_2
Source   DF   Type III SS    Mean Square   F Value   Pr > F
Mean      1    0.58444655    0.58444655    0.20      0.6538
drug      1    3.45320076    3.45320076    1.21      0.2791
Error    34   97.06667088    2.85490208
(more polynomials)

Contrast Variable: time_9
Source   DF   Type III SS    Mean Square   F Value   Pr > F
Mean      1    0.17820870    0.17820870    0.38      0.5442
drug      1    0.36995640    0.36995640    0.78      0.3836
Error    34   16.14159446    0.47475278
Our PROC GLM on the univariate dataset is
not justified (we definitely do NOT have
compound symmetry, CS), but here is the test
we would use if CS held. The
initial ANOVA F test with the wrong
denominator is way off, showing p<.0001.
As you can see, this test is exactly the same
as that from the multivariate data set
(unstructured covariance matrix). Thus
concern about the covariance structure has
(little or) no effect on the between-subject
tests.
Tests of Hypotheses Using the Type III
MS for patient(drug) as an Error Term
Source   DF   Type III SS    Mean Square    F Value   Pr > F
drug      1   867.6922500    867.6922500    4.08      0.0513   (2)
We can run several MIXED models with
various within subject covariance structures,
using SBC or AIC to select between them.
AR(1) works well here (we have a "random
subject" piece that usually would be
specified in such an experiment). Any
covariance matrix, like compound
symmetry, that retains the same form when
a constant is added to all entries will not
require this random statement.
Adding a constant to all terms in a
geometric sequence, like that used in AR(1)
does NOT produce another geometric
sequence so we need the random statement
too. (Note the large patient variance
component estimate). With the refined
covariance structure we have evidence of a
drug effect (p-values are almost the same
and this difference is likely due just to the
different estimation schemes-OLS vs
REML, hence the phrase "little or no effect"
above). There seems to be no change over
time, either on average (visit) or differing
from drug to drug (drug*visit).
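The remark about adding a constant to a covariance matrix can be checked numerically. The sketch below is illustrative only: it uses the AR(1), residual, and patient estimates from the MIXED output, the helper `is_geometric` is a hypothetical name, and the compound symmetry values `a` and `b` are made up.

```python
def is_geometric(seq, tol=1e-9):
    """True if all consecutive ratios of seq are equal."""
    ratios = [b / a for a, b in zip(seq, seq[1:])]
    return all(abs(r - ratios[0]) < tol for r in ratios)


rho = 0.8159            # AR(1) estimate from the output
sigma2 = 3.4224         # residual (within patient) variance estimate
patient_var = 18.3299   # patient variance component estimate

# First row of the AR(1) covariance matrix for 4 visits: sigma2 * rho^d
ar1_row = [sigma2 * rho ** d for d in range(4)]
shifted_row = [x + patient_var for x in ar1_row]

print(is_geometric(ar1_row))       # AR(1) covariances are a geometric sequence
print(is_geometric(shifted_row))   # adding the patient variance breaks that

# Compound symmetry: a on the diagonal, b off the diagonal. Adding a constant c
# gives a + c and b + c, which is again compound symmetry.
a, b, c = 5.0, 2.0, patient_var    # a and b are made-up illustration values
cs_shifted = [[(a if i == j else b) + c for j in range(4)] for i in range(4)]
off_diagonal = {cs_shifted[i][j] for i in range(4) for j in range(4) if i != j}
print(len(off_diagonal) == 1)      # still a single common off-diagonal value
```

This is why the AR(1) model gains something from the extra random patient term while a compound symmetry model would simply absorb it.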
For more information on the GLM approach
and multivariate tests, see any multivariate
text under multivariate multiple regression.
For repeated measures (epsilon adjustments,
sphericity etc.), the book Analysis of Messy
Data (Milliken & Johnson) is a good
resource.
The Mixed Procedure
Convergence criteria met.

Covariance Parameter Estimates
Cov Parm         Subject    Estimate
patient(drug)                18.3299
AR(1)            patient      0.8159
Residual                      3.4224

Fit Statistics
-2 Res Log Likelihood        1188.0
AIC (smaller is better)      1194.0
AICC (smaller is better)     1194.1
BIC (smaller is better)      1198.8

Type 3 Tests of Fixed Effects
              Num    Den
Effect         DF     DF    F Value    Pr > F
drug            1     34       4.28    0.0462
visit           9    306       0.58    0.8117
drug*visit      9    306       0.70    0.7062
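The AIC and BIC lines in the Fit Statistics can be reproduced from the -2 Res Log Likelihood. This is a sketch under stated assumptions, not a description of SAS internals: it assumes AIC adds 2 per covariance parameter, BIC adds log(number of subjects) per covariance parameter, 3 covariance parameters, and 36 patients (the count mentioned earlier in these notes). The function names are hypothetical.

```python
import math


def aic(neg2_res_loglik, n_cov_parms):
    """-2 log likelihood plus 2 per covariance parameter (assumed form)."""
    return neg2_res_loglik + 2 * n_cov_parms


def bic(neg2_res_loglik, n_cov_parms, n_subjects):
    """-2 log likelihood plus log(subjects) per covariance parameter (assumed form)."""
    return neg2_res_loglik + n_cov_parms * math.log(n_subjects)


neg2ll = 1188.0    # -2 Res Log Likelihood from the output
q = 3              # patient(drug), AR(1), Residual
subjects = 36      # assumption: 36 patients in the study

print(round(aic(neg2ll, q), 1))
print(round(bic(neg2ll, q, subjects), 1))
```

Under these assumptions the two values agree with the 1194.0 and 1198.8 shown in the output, which is a useful sanity check when comparing covariance structures by AIC or BIC.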
Note 1: We see the patient variance
component is 18.33 and the within patient
variance (visit-to-visit) is 3.422. These
within patient visit-to-visit errors $e_{ij}(t)$ are
not independent. Their correlation is
estimated to be $0.8159^{d}$, where $d$ is the time
difference between visits.
For patient $j$ on drug $i$ at visit time $t$ we have
$$Y_{ij}(t) = \mu + \mathrm{Drug}_i + \mathrm{Patient}_{ij} + \mathrm{Time}(t) + DT_{it} + e_{ij}(t)$$
$\mathrm{Patient}_{ij} \sim N(0, 18.33)$, $e_{ij}(t) \sim N(0, 3.42)$, and
$\mathrm{corr}(e_{ij}(t), e_{ij}(s)) = 0.8159^{|t-s|}$. Here $\sim$ means,
of course, a distribution with the variance
estimated, not known.
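To see what the fitted model in Note 1 implies for the observed responses (not just the errors), one can combine the patient variance with the AR(1) error covariance. The sketch below plugs in the estimates from the output; the function names `cov_y` and `corr_y` are hypothetical.

```python
PATIENT_VAR = 18.3299   # patient variance component estimate
RESID_VAR = 3.4224      # within patient variance estimate
RHO = 0.8159            # AR(1) correlation estimate


def cov_y(t, s):
    """Model-implied covariance of two responses from the same patient."""
    return PATIENT_VAR + RESID_VAR * RHO ** abs(t - s)


def corr_y(t, s):
    """Model-implied correlation of two responses from the same patient."""
    return cov_y(t, s) / cov_y(t, t)


for lag in (1, 2, 5, 9):
    print(lag, round(corr_y(1, 1 + lag), 3))
```

Because the patient variance is large relative to the within patient variance, the implied correlation between visits falls only slowly with lag and can never drop below 18.3299 / (18.3299 + 3.4224).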
Note 2: The unstructured matrix in MIXED
gives the same F test for DRUG as GLM
multivariate and GLM using compound
symmetry (split plot) namely F= 4.08. Here
is the code and partial output:
proc mixed data=univariate; title "Mixed";
class drug patient visit;
model Y=drug|visit;
repeated/ subject=patient type=un; run;
Type 3 Tests of Fixed Effects
              Num    Den
Effect         DF     DF    F Value    Pr > F
drug            1     34       4.08    0.0513   (2)
visit           9     34       0.74    0.6719
drug*visit      9     34       1.38    0.2367
Notice that the within subject tests do not
match those of PROC GLM. An area of
continuing research is the study of what
kinds of finite sample degrees of freedom
calculations give the best performance (the
REML theory is based on asymptotic, i.e.
large sample, considerations). Until recently,
a method that tries to emulate Satterthwaite's
approach has seemed to me to be a good
choice. Using DDFM (denominator degrees
of freedom method) = SATTERTHWAITE
on the MODEL statement invokes this.
Some recent research by Kenward and
Roger takes a slightly different approach.
Some preliminary studies seem to indicate
that it is the best approach. Here is the
Kenward-Roger adjustment; note that it
reproduces the Wilks' Lambda test (an exact F test
that is well known and widely used) from
PROC GLM. More information is available
in the online doc for SAS (click from our
home page).
proc mixed data=univariate; title "Mixed";
class drug patient visit;
model Y=drug|visit / ddfm=KenwardRoger;
repeated/ subject=patient type=un; run;
Type 3 Tests of Fixed Effects
              Num    Den
Effect         DF     DF    F Value    Pr > F
drug            1     34       4.08    0.0513
visit           9     26       0.56    0.8135
drug*visit      9     26       1.05    0.4273
One downside is that Kenward-Roger is slow. It took 14.38 CPU seconds
on a Pentium Windows XP machine for this rather small example.
Fine print:
(1) Test H0: split plot approach is OK -
reject!
(2) MS(drug)/MS(patient(drug)) is OK for multivariate or split plot
(3) Split plot tests - would be OK if split plot were OK
(4) Adjustments based on Box's epsilon correction
Note: GLM also gives multivariate tests assuming a general
unstructured variance matrix within patient. Always justified,
often not very powerful.
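• Some covariance structures:

Unstructured
$$\begin{pmatrix} a & b & c & d \\ b & e & f & g \\ c & f & h & i \\ d & g & i & j \end{pmatrix}$$

Toeplitz
$$\begin{pmatrix} a & b & c & d \\ b & a & b & c \\ c & b & a & b \\ d & c & b & a \end{pmatrix}$$

AR(1)
$$\sigma^2 \begin{pmatrix} 1 & \rho & \rho^2 & \rho^3 \\ \rho & 1 & \rho & \rho^2 \\ \rho^2 & \rho & 1 & \rho \\ \rho^3 & \rho^2 & \rho & 1 \end{pmatrix}$$

Compound Symmetry
(Spherical) implies usual
split plot analysis is OK:
$a = \sigma^2_{\text{whole}} + \sigma^2_{\text{split}}$, $b = \sigma^2_{\text{whole}}$
$$\begin{pmatrix} a & b & b & b \\ b & a & b & b \\ b & b & a & b \\ b & b & b & a \end{pmatrix}$$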
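The structures above differ mainly in how many free parameters they need. A small sketch, with hypothetical function names, builds each structured matrix and counts parameters for a general number of time points; the count of 10 visits is taken from the drug example.

```python
def ar1_matrix(t, sigma2, rho):
    """t x t AR(1) covariance: sigma2 * rho^|i-j|."""
    return [[sigma2 * rho ** abs(i - j) for j in range(t)] for i in range(t)]


def cs_matrix(t, var_whole, var_split):
    """t x t compound symmetry: var_whole + var_split on the diagonal,
    var_whole everywhere else."""
    return [[var_whole + (var_split if i == j else 0.0) for j in range(t)]
            for i in range(t)]


def toeplitz_matrix(bands):
    """Toeplitz covariance from its first row (bands[d] sits at lag d)."""
    t = len(bands)
    return [[bands[abs(i - j)] for j in range(t)] for i in range(t)]


def n_parameters(structure, t):
    """Number of free covariance parameters for t time points."""
    counts = {"un": t * (t + 1) // 2, "toep": t, "ar1": 2, "cs": 2}
    return counts[structure]


for name in ("un", "toep", "ar1", "cs"):
    print(name, n_parameters(name, 10))
```

With 10 visits the unstructured matrix needs 55 parameters against 10 for Toeplitz and 2 each for AR(1) and compound symmetry, which is the trade-off behind choosing a structure with AIC or SBC.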
proc mixed data=univariate;
  title "PROC MIXED";
  class drug patient visit;
  model Y=drug|visit;
  ** Pick one of these (the first two give the same analysis): **;
  * random patient(drug);                      ** split plot **;
  * repeated/ subject=patient type=cs;         ** compound symmetry **;
  * repeated/ subject=patient type=un;         ** unstructured **;
  * repeated/ subject=patient type=Toeplitz;   ** Toeplitz **;
  ** split plot with AR(1) errors **;
  random patient(drug);
  repeated / subject=patient type=AR(1);
run;
The Mixed Procedure
Convergence criteria met.

Covariance Parameter Estimates
Cov Parm         Subject    Estimate
patient(drug)                18.3299
AR(1)            patient      0.8159
Residual                      3.4224

Fit Statistics
-2 Res Log Likelihood        1188.0
AIC (smaller is better)      1194.0
AICC (smaller is better)     1194.1
BIC (smaller is better)      1198.8

Type 3 Tests of Fixed Effects
              Num    Den
Effect         DF     DF    F Value    Pr > F
drug            1     34       4.28    0.0462
visit           9    306       0.58    0.8117
drug*visit      9    306       0.70    0.7062
• "Visit" tests for time effects.
• AR(1) with random patient effects is
restricted Toeplitz.
Toeplitz: $-2\ln(L) = 1181.2$ (10 parms)
AR(1): $-2\ln(L) = 1188.0$ (3 parms: $\sigma^2$, $\rho$, $\sigma_P^2$)
$H_0$: AR(1) is OK
$H_1$: Need more general Toeplitz
Test: $\chi^2_7$ (df $= 10 - 3$) $= 1188.0 - 1181.2 = 6.8$, not significant
• Many more structures (including spatial)
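The AR(1) versus Toeplitz comparison above is a likelihood ratio test, and its arithmetic can be laid out as a short sketch. The -2 ln(L) values and parameter counts come from the notes; the critical value 14.067 is the tabled upper 5% point of a chi-square with 7 df, entered by hand rather than computed.

```python
CHI2_CRIT_05_DF7 = 14.067   # tabled upper 5% point, chi-square with 7 df

neg2ll_toeplitz, parms_toeplitz = 1181.2, 10   # more general model
neg2ll_ar1, parms_ar1 = 1188.0, 3              # restricted model (H0)

# The restricted model can never fit better, so the difference is >= 0.
statistic = neg2ll_ar1 - neg2ll_toeplitz
df = parms_toeplitz - parms_ar1
reject_ar1 = statistic > CHI2_CRIT_05_DF7

print(round(statistic, 1), df, reject_ar1)
```

Since 6.8 is well below 14.067, the simpler AR(1) plus random patient structure is not rejected in favor of the general Toeplitz.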
• Note (stat majors especially): research on
these models is ongoing.
A particularly interesting result (Self and
Liang, 1987, JASA) concerns testing a
parameter on the boundary using this
"likelihood ratio test". The upshot is that if
you are testing that a single variance
component is 0 (on the "boundary") then
you have a 50% chance of hitting the
boundary and getting a 0, and a 50% chance
of getting a $\chi^2_1$ random variable in large
samples.
Thus if you get a calculated $\chi^2_1$ value $C$
and if the "p-value" from a $\chi^2_1$ is $\Pr\{\chi^2_1 > C\}$
= .0780 then the "p-value" of your test
would be 0.0390. For AR(1), $\rho = 0$ is not on
the boundary. Our example above is
unaffected.
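The halving rule for a single variance component on the boundary can be written as a small sketch. The function names are hypothetical and the test statistic 3.84 is a made-up example value; the chi-square tail for 1 df is computed from the standard normal, since a chi-square with 1 df is the square of a standard normal.

```python
import math


def chi2_1_pvalue(c):
    """Pr{chi-square(1 df) > c}, using chi-square(1) = Z^2 for standard normal Z."""
    return math.erfc(math.sqrt(c / 2.0))


def boundary_pvalue(c):
    """p-value under the 50:50 mixture of a point mass at 0 and chi-square(1 df)."""
    if c <= 0:
        return 1.0
    return 0.5 * chi2_1_pvalue(c)


c = 3.84   # hypothetical calculated likelihood ratio statistic
print(round(chi2_1_pvalue(c), 4), round(boundary_pvalue(c), 4))
```

The same halving turns the naive .0780 in the text into 0.0390. It applies only when the null value sits on the boundary of the parameter space, so it is not used for the AR(1) parameter.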
Strip plot
Fields with 4x5 grid of plots
Split plot: randomly assign
4 fertilizers (1,2,3,4) to rows,
5 varieties randomized within rows
Concern: fertility gradient (down)
one block:
1A 1C 1D 1B 1E
4C 4D 4A 4E 4B
3E 3A 3D 3B 3C
2E 2C 2D 2A 2B
<--- row = whole plot
Strip plot: randomly assign
4 fertilizers to rows,
5 varieties to columns
Concern: gradients down and across
one block:
1A 1C 1D 1B 1E
4A 4C 4D 4B 4E
3A 3C 3D 3B 3E
2A 2C 2D 2B 2E
• Analysis (5 blocks)
Source      df
Block        4
F            3
Error A     12   < SAS: block*fert
Variety      4
Error B     16   < block*var
F*V         12
Error C     48
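The degrees of freedom in the strip plot analysis follow directly from the numbers of blocks, fertilizers, and varieties. A minimal sketch, with variable names chosen here for illustration:

```python
blocks, fert, var = 5, 4, 5   # 5 blocks, 4 fertilizers (rows), 5 varieties (columns)

df = {
    "Block": blocks - 1,
    "F": fert - 1,
    "Error A (block*fert)": (blocks - 1) * (fert - 1),
    "Variety": var - 1,
    "Error B (block*var)": (blocks - 1) * (var - 1),
    "F*V": (fert - 1) * (var - 1),
    "Error C": (blocks - 1) * (fert - 1) * (var - 1),
}

total_df = blocks * fert * var - 1   # 100 plots in all
for source, value in df.items():
    print(f"{source:22s}{value:4d}")
print("sum =", sum(df.values()), " total =", total_df)
```

Each treatment factor gets its own error term because fertilizers are applied to whole rows and varieties to whole columns within a block; only the interaction is tested against Error C.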
• Split-split plot
Batches of cotton - 3 from each of 5 suppliers
SS(supplier) = 755                     4 df
SS(batch)    = 1400                   14 df
SS(error A)  = 645                    10 df
---------------------------
4 spinning tensions, 1/4 batch done on
each. Each batch -> 4 bobbins of fiber.
SS(bobbins)    2494                   59 df
SS(tension)     737                    3 df
SS(sup*ten)     114                   12 df
SS(Error B) = 2494-1400-851 = 243     30 df
---------------------------
From each bobbin, fibers woven together
to form threads. Thread of 2, 3, and 4
ply fibers from each bobbin.
Y = breaking strength of thread.
SS(threads)    2707                  179 df   (this is SS(total))
SS(ply)         169                    2 df
SS(ply*sup)       2                    8 df
SS(ply*ten)     0.5                    6 df
SS(P*S*T)       8.5                   24 df
SS(Error C) = 2707-2494-180 = 33      80 df
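The three error terms in the split-split plot are each found by subtraction within a stratum. The sketch below repeats that arithmetic with the rounded sums of squares from the notes; the helper name `stratum_error` is hypothetical.

```python
def stratum_error(ss_stratum, ss_previous_stratum, ss_treatments):
    """Error SS = SS among units in this stratum, minus SS among units in the
    stratum above it, minus the treatment SS tested in this stratum."""
    return ss_stratum - ss_previous_stratum - sum(ss_treatments)


# Batches: supplier is the only treatment term.
error_a = stratum_error(1400, 0, [755])
# Bobbins: tension and supplier*tension.
error_b = stratum_error(2494, 1400, [737, 114])
# Threads: ply and its interactions.
error_c = stratum_error(2707, 2494, [169, 2, 0.5, 8.5])

# Degrees of freedom follow the same subtraction pattern.
df_a = 14 - 4
df_b = 59 - 14 - (3 + 12)
df_c = 179 - 59 - (2 + 8 + 6 + 24)

print(error_a, error_b, error_c)
print(df_a, df_b, df_c)
```

The small differences from the SAS output that follows (for example 243 against 242.57) come only from the rounding of the sums of squares in the notes.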
*** Demo31.sas Split Split Plot*****;
data spin;
  do supp = 1 to 5;
    do batch = 1 to 3; b=10*normal(1827655);
      do tension = 1 to 4; t = 5*normal(1827655);
        do ply = 2 to 4; s = 2*normal(1827655);
          input Y;
          output;
        end;
      end;
    end;
  end;
cards;
6.9
(more data)
16.2
18
;
proc glm; class supp tension ply batch;
  model Y = supp batch(supp)
            tension supp*tension batch*supp*tension
            ply supp*ply tension*ply supp*tension*ply;
  * error A is batch(supp), error B is batch*supp*tension;
  random batch(supp) batch*supp*tension/test;
  contrast "linear tension" tension -3 -1 1 3/
           e=batch*supp*tension;
run;
Dependent Variable: Y
                             Sum of
Source               DF      Squares       Mean Square    F Value
Model                99      2674.482000     27.014970      66.74
Error                80        32.380000      0.404750
Corrected Total     179      2706.862000

Source                DF     Type I SS      Mean Square    F Value
supp                   4     755.7370000    188.9342500     466.79
batch(supp)           10     644.6100000     64.4610000     159.26
tension                3     736.6740000    245.5580000     606.69
supp*tension          12     114.2710000      9.5225833      23.53
supp*tension*batch    30     242.5700000      8.0856667      19.98
ply                    2     169.3480000     84.6740000     209.20
supp*ply               8       2.0420000      0.2552500       0.63
tension*ply            6       0.6000000      0.1000000       0.25
supp*tension*ply      24       8.6300000      0.3595833       0.89
(most of these tests are wrong!)
(there follow "fixed up" tests, e.g.:)
Source           DF    Type III SS    Mean Square    F Value
batch(supp)      10     644.610000      64.461000       7.97
tension           3     736.674000     245.558000      30.37
supp*tension     12     114.271000       9.522583       1.18
Error            30     242.570000       8.085667
Error: MS(supp*tension*batch)
* This test assumes one or more other fixed effects are zero.
*
*
(and finally the requested contrast)
Dependent Variable: Y
Tests of Hypotheses Using the Type III MS
for supp*tension*batch as an Error Term
Contrast          DF    Contrast SS    Mean Square    F Value
linear tension     1    731.1616000    731.1616000      90.43
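$H_0$: No linear effect of tension
Assume: equal spacing
Contrast coefficients for linear: -3 -1 1 3
$Q = -3(\text{tension 1 total}) - 1(\text{tension 2 total}) + 1(\text{tension 3 total}) + 3(\text{tension 4 total})$
Suppose $Q = 811.2$.
denominator $= (9+1+1+9)(45) = 900$, where 45 is the number of observations in each tension total
$SS(\text{Tension linear}) = Q^2/\text{denominator} = 731.16$
$F_{1,30} = (731/1)/(242/30) = 90.43$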
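The contrast calculation above can be reproduced step by step. This sketch uses the value of Q supposed in the notes, the 45 observations behind each tension total (180 threads over 4 tensions), and the Error B sum of squares from the SAS output.

```python
coefficients = [-3, -1, 1, 3]   # linear contrast for 4 equally spaced tensions
n_per_total = 45                # observations in each tension total (180 / 4)
Q = 811.2                       # contrast applied to the tension totals

denominator = sum(c * c for c in coefficients) * n_per_total
ss_linear = Q ** 2 / denominator            # 1 df, so this is also the mean square

ms_error_b = 242.57 / 30                    # MS(supp*tension*batch)
f_value = ss_linear / ms_error_b

print(denominator, round(ss_linear, 4), round(f_value, 2))
```

The linear piece accounts for 731.16 of the 736.67 in SS(tension), so nearly all of the tension effect is linear.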
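proc mixed; class supp tension ply batch;
  model Y = supp|tension|ply;
  * random terms match the covariance parameters listed in the output below;
  random batch(supp) supp*tension*batch;
  contrast "linear tension" tension -3 -1 1 3;
run;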
The Mixed Procedure

Covariance Parameter Estimates
Cov Parm              Estimate
batch(supp)             4.6979
supp*tension*batch      2.5603
Residual                0.4047

Fit Statistics
-2 Res Log Likelihood       438.5
AIC (smaller is better)     444.5
BIC (smaller is better)     446.6

Type 3 Tests of Fixed Effects
                    Num    Den
Effect               DF     DF    F Value    Pr > F
supp                  4     10       2.93    0.0764
tension               3     30      30.37    <.0001
supp*tension         12     30       1.18    0.3418
ply                   2     80     209.20    <.0001
supp*ply              8     80       0.63    0.7497
tension*ply           6     80       0.25    0.9591
supp*tension*ply     24     80       0.89    0.6158

Contrasts
                  Num    Den
Label              DF     DF    F Value    Pr > F
linear tension      1     30      90.43    <.0001

• All tests correct automatically
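Because this design is balanced, the MIXED results can be cross-checked against the GLM mean squares by hand. The sketch below is an illustration, not what PROC MIXED does internally (it uses REML): it takes the mean squares from the GLM output, forms each F with the error term of its own stratum, and gets the variance components by the method of moments, using 3 plies per bobbin and 4 tensions per batch.

```python
# Mean squares copied from the PROC GLM output
ms = {
    "supp": 188.93425,
    "batch(supp)": 64.461,            # error A
    "tension": 245.558,
    "supp*tension": 9.5225833,
    "supp*tension*batch": 8.0856667,  # error B
    "ply": 84.674,
    "error": 0.40475,                 # error C
}
n_ply, n_tension = 3, 4

# Each effect is tested against the error term of the stratum it lives in.
f_supp = ms["supp"] / ms["batch(supp)"]
f_tension = ms["tension"] / ms["supp*tension*batch"]
f_supp_tension = ms["supp*tension"] / ms["supp*tension*batch"]
f_ply = ms["ply"] / ms["error"]

# Method of moments: difference of adjacent error mean squares, divided by
# the number of observations per unit of the upper stratum.
var_bobbin = (ms["supp*tension*batch"] - ms["error"]) / n_ply
var_batch = (ms["batch(supp)"] - ms["supp*tension*batch"]) / (n_ply * n_tension)

print(round(f_supp, 2), round(f_tension, 2), round(f_supp_tension, 2), round(f_ply, 2))
print(round(var_batch, 4), round(var_bobbin, 4))
```

These hand calculations agree with the F values and covariance parameter estimates in the MIXED output, while the raw GLM F column (everything over the residual) does not.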