Mixed Model Analysis

Basic model:

    Y = Xβ + Z u + e

with

    E(e) = 0,    Var(e) = R
    E(u) = 0,    Var(u) = G
    Cov(e, u) = 0

where

    X is an n × p model matrix of known constants
    β is a p × 1 vector of "fixed" unknown parameter values
    Z is an n × q model matrix of known constants
    u is a q × 1 random vector
    e is an n × 1 vector of random errors

Then

    E(Y) = E(Xβ + Z u + e)
         = Xβ + Z E(u) + E(e)
         = Xβ

    Var(Y) = Var(Xβ + Z u + e)
           = Var(Z u) + Var(e)
           = Z G Zᵀ + R
Normal-theory mixed model:

    [u]      ( [0]   [G  0] )
    [e]  ~ N ( [0] , [0  R] )
Example 10.1: Random Blocks

Comparison of four processes for producing penicillin:

    Process A
    Process B     <- levels of a "fixed" treatment factor
    Process C
    Process D

Then,

    Y ~ N(Xβ, Z G Zᵀ + R)
               \-- call this Σ
Blocks correspond to different batches of an important raw material, corn steep liquor.

Here, batch effects are considered as random block effects:

- Random sample of five batches
- Split each batch into four parts:
  - run each process on one part
  - randomize the order in which the processes are run within each batch
- Batches are sampled from a population of many possible batches
- To repeat this experiment you would need to use a different set of batches of raw material

Data Source: Box, Hunter & Hunter (1978), Statistics for Experimenters, Wiley & Sons, New York.

Data file:    penclln.dat
SAS code:     penclln.sas
S-PLUS code:  penclln.ssc
Model:

    Yij = μ + αi + βj + eij

where

    Yij      is the yield for the i-th process applied to the j-th batch
    μ + αi   is the mean yield for the i-th process, averaging across the
             entire population of possible batches
    βj       is a random batch effect:  βj ~ NID(0, σβ²)
    eij      is a random error:         eij ~ NID(0, σe²)

and any eij is independent of any βj.

Here

    μi = E(Yij) = E(μ + αi + βj + eij)
       = μ + αi + E(βj) + E(eij)
       = μ + αi,    i = 1, 2, 3, 4,

represents the mean yield for the i-th process, averaging across all possible batches.

PROC GLM and PROC MIXED in SAS fit a restricted model with α4 = 0. Then

    μ = μ4 is the mean yield for process D
    αi = μi − μ4,   i = 1, 2, 3, 4.
In S-PLUS you could use the "treatment" constraints where α1 = 0. Then

    μ = μ1 is the mean yield for process A
    αi = μi − μ1,   i = 1, 2, 3, 4.

Alternatively, you could choose the solution to the normal equations given by "sum" constraints

    α1 + α2 + α3 + α4 = 0
    μ = (μ1 + μ2 + μ3 + μ4)/4
    αi = μi − μ

Variance-covariance structure:

    Var(Yij) = Var(μ + αi + βj + eij)
             = Var(βj + eij)
             = Var(βj) + Var(eij)
             = σβ² + σe²    for all (i, j)

Different runs on the same batch:

    Cov(Yij, Ykj) = Cov(μ + αi + βj + eij, μ + αk + βj + ekj)
                  = Cov(βj + eij, βj + ekj)
                  = Cov(βj, βj) + Cov(βj, ekj) + Cov(eij, βj) + Cov(eij, ekj)
                  = Var(βj)
                  = σβ²    for all i ≠ k
Correlation among yields for runs on the same batch:

    Corr(Yij, Ykj) = Cov(Yij, Ykj) / sqrt( Var(Yij) Var(Ykj) )
                   = σβ² / (σβ² + σe²)    for i ≠ k
Results from the four runs on a single batch:

        [Y1j]   [σβ² + σe²   σβ²         σβ²         σβ²      ]
    Var [Y2j] = [σβ²         σβ² + σe²   σβ²         σβ²      ]
        [Y3j]   [σβ²         σβ²         σβ² + σe²   σβ²      ]
        [Y4j]   [σβ²         σβ²         σβ²         σβ² + σe²]

              = σβ² J + σe² I
                     ↑       ↑
              matrix of   identity
                ones       matrix

This special type of covariance structure is called compound symmetry.

Results for runs on different batches are uncorrelated (independent):

    Cov(Yij, Ykℓ) = 0    for j ≠ ℓ
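The compound-symmetry structure above is easy to check numerically. A small sketch in Python (not part of the course's SAS/S-PLUS code; the values 11.79 and 18.83 are the variance-component estimates obtained for these data later in the notes):

```python
# Compound symmetry: Var of the four runs on one batch is
# sigma_b^2 * J + sigma_e^2 * I  (J = matrix of ones, I = identity).
sigma_b2 = 11.79   # batch (block) variance component estimate
sigma_e2 = 18.83   # error variance component estimate

a = 4  # four processes run on each batch
V = [[sigma_b2 + (sigma_e2 if i == j else 0.0) for j in range(a)]
     for i in range(a)]

# Every diagonal entry is the common variance sigma_b^2 + sigma_e^2,
var_y = V[0][0]
# every off-diagonal entry is the common covariance sigma_b^2,
cov_y = V[0][1]
# so the intraclass correlation is
rho = cov_y / var_y

print(round(var_y, 2))   # 30.62
print(round(rho, 3))     # 0.385
```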
Write this model as Y = Xβ + Z u + e:

    [Y11]   [1 1 0 0 0]          [1 0 0 0 0]          [e11]
    [Y21]   [1 0 1 0 0]          [1 0 0 0 0]          [e21]
    [Y31]   [1 0 0 1 0]          [1 0 0 0 0]          [e31]
    [Y41]   [1 0 0 0 1]          [1 0 0 0 0]          [e41]
    [Y12]   [1 1 0 0 0]          [0 1 0 0 0]          [e12]
    [Y22]   [1 0 1 0 0] [ μ ]    [0 1 0 0 0]          [e22]
    [Y32]   [1 0 0 1 0] [ α1]    [0 1 0 0 0] [β1]     [e32]
    [Y42]   [1 0 0 0 1] [ α2]    [0 1 0 0 0] [β2]     [e42]
    [Y13]   [1 1 0 0 0] [ α3]    [0 0 1 0 0] [β3]     [e13]
    [Y23] = [1 0 1 0 0] [ α4]  + [0 0 1 0 0] [β4]  +  [e23]
    [Y33]   [1 0 0 1 0]          [0 0 1 0 0] [β5]     [e33]
    [Y43]   [1 0 0 0 1]          [0 0 1 0 0]          [e43]
    [Y14]   [1 1 0 0 0]          [0 0 0 1 0]          [e14]
    [Y24]   [1 0 1 0 0]          [0 0 0 1 0]          [e24]
    [Y34]   [1 0 0 1 0]          [0 0 0 1 0]          [e34]
    [Y44]   [1 0 0 0 1]          [0 0 0 1 0]          [e44]
    [Y15]   [1 1 0 0 0]          [0 0 0 0 1]          [e15]
    [Y25]   [1 0 1 0 0]          [0 0 0 0 1]          [e25]
    [Y35]   [1 0 0 1 0]          [0 0 0 0 1]          [e35]
    [Y45]   [1 0 0 0 1]          [0 0 0 0 1]          [e45]

Here

    G = Var(u) = σβ² I₅ₓ₅    and    R = Var(e) = σe² Iₙₓₙ

and

    Var(Y) = Var(Xβ + Z u + e)
           = Var(Z u) + Var(e)
           = Z G Zᵀ + R
           = σβ² Z Zᵀ + σe² I

             [σβ²J + σe²I                               ]
           = [             σβ²J + σe²I                  ]
             [                          ...             ]
             [                              σβ²J + σe²I ]
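The block-diagonal form of Var(Y) can be verified directly from Z. A sketch in Python, assuming the 20 observations are ordered as in the display above and plugging in illustrative variance-component values:

```python
# Build Z (batch indicators) for a randomized block design with
# a = 4 processes run in each of b = 5 batches, observations ordered
# Y11, Y21, Y31, Y41, Y12, ..., Y45 as in the display above.
a, b = 4, 5
n = a * b
Z = [[1.0 if obs // a == batch else 0.0 for batch in range(b)]
     for obs in range(n)]

sigma_b2, sigma_e2 = 11.79, 18.83  # illustrative variance components

# Var(Y) = sigma_b^2 * Z Z^T + sigma_e^2 * I
V = [[sigma_b2 * sum(Z[r][k] * Z[s][k] for k in range(b))
      + (sigma_e2 if r == s else 0.0)
      for s in range(n)] for r in range(n)]

# Within a batch: compound symmetry; across batches: zero covariance.
assert V[0][0] == sigma_b2 + sigma_e2   # variance
assert V[0][3] == sigma_b2              # same batch, different process
assert V[0][4] == 0.0                   # different batches
```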
Example 10.2: Hierarchical Random Effects Model

Analysis of sources of variation in a process used to monitor the production of a pigment paste.

Current Procedure:
- Sample barrels of pigment paste
- One sample from each barrel
- Send the sample to a lab for determination of moisture content

Problem: Variation in moisture content is too large
- average moisture content is approximately 25 (or 2.5%)
- standard deviation of about 6

Examine sources of variation:

Measured Response: (Y) moisture content of the pigment paste (units of one tenth of 1%).
Data Collection: Hierarchical (or nested) Study Design
- Sample b barrels of pigment paste
- s samples are taken from the content of each barrel
- Each sample is mixed and divided into r parts.
- Each part is sent to the lab.

There are n = (b)(s)(r) observations.

Model:

    Yijk = μ + αi + δij + eijk

where

    Yijk is the moisture content determination for the k-th part of the
         j-th sample from the i-th barrel
    μ    is the mean moisture content
    αi   is a random barrel effect:          αi  ~ NID(0, σα²)
    δij  is a random sample effect:          δij ~ NID(0, σδ²)
    eijk corresponds to random measurement
         error:                              eijk ~ NID(0, σe²)
Covariance Structure

Homogeneous variances:

    Var(Yijk) = Var(μ + αi + δij + eijk)
              = Var(αi) + Var(δij) + Var(eijk)
              = σα² + σδ² + σe²

Two parts of one sample:

    Cov(Yijk, Yijℓ) = Cov(μ + αi + δij + eijk, μ + αi + δij + eijℓ)
                    = Cov(αi, αi) + Cov(δij, δij)
                    = σα² + σδ²    for k ≠ ℓ

Observations on different samples taken from the same barrel:

    Cov(Yijk, Yimℓ) = Cov(μ + αi + δij + eijk, μ + αi + δim + eimℓ)
                    = Cov(αi, αi)
                    = σα²    for j ≠ m

Observations from different barrels:

    Cov(Yijk, Ycmℓ) = 0    for i ≠ c
In this study

    b = 15 barrels were sampled
    s = 2  samples were taken from each barrel
    r = 2  sub-samples were analyzed from each sample taken from each barrel

Data file:    pigment.dat
SAS code:     pigment.sas
S-PLUS code:  pigment.ssc

Write this model in the form:

    Y = Xβ + Z u + e

with X = 1 (a column of ones) and Z = [barrel indicators | sample-within-barrel indicators]:

    [Y111   ]   [1]       [1 0 ... 0 | 1 0 0 0 ... 0 0]            [e111   ]
    [Y112   ]   [1]       [1 0 ... 0 | 1 0 0 0 ... 0 0]  [ α1  ]   [e112   ]
    [Y121   ]   [1]       [1 0 ... 0 | 0 1 0 0 ... 0 0]  [  .  ]   [e121   ]
    [Y122   ]   [1]       [1 0 ... 0 | 0 1 0 0 ... 0 0]  [ α15 ]   [e122   ]
    [Y211   ]   [1]       [0 1 ... 0 | 0 0 1 0 ... 0 0]  [ δ1,1]   [e211   ]
    [Y212   ] = [1] [μ] + [0 1 ... 0 | 0 0 1 0 ... 0 0]  [ δ1,2] + [e212   ]
    [Y221   ]   [1]       [0 1 ... 0 | 0 0 0 1 ... 0 0]  [ δ2,1]   [e221   ]
    [Y222   ]   [1]       [0 1 ... 0 | 0 0 0 1 ... 0 0]  [ δ2,2]   [e222   ]
    [   .   ]   [.]       [    .     |        .        ]  [  .  ]   [   .   ]
    [Y15,1,1]   [1]       [0 0 ... 1 | 0 0 0 0 ... 1 0]  [δ15,1]   [e15,1,1]
    [Y15,1,2]   [1]       [0 0 ... 1 | 0 0 0 0 ... 1 0]  [δ15,2]   [e15,1,2]
    [Y15,2,1]   [1]       [0 0 ... 1 | 0 0 0 0 ... 0 1]            [e15,2,1]
    [Y15,2,2]   [1]       [0 0 ... 1 | 0 0 0 0 ... 0 1]            [e15,2,2]
Here

    R = Var(e) = σe² I

    G = Var(u) = [σα² I    0    ]
                 [0        σδ² I]

Then

    Y = Xβ + Z u + e
    E(Y) = Xβ = 1μ
    Var(Y) = Σ = Z G Zᵀ + R

           = Z [σα² I_b     0        ] Zᵀ + σe² I_bsr
               [0           σδ² I_bs ]

           = σα² (I_b ⊗ J_sr) + σδ² (I_bs ⊗ J_r) + σe² I_bsr

because Z = [ I_b ⊗ 1_sr | I_bs ⊗ 1_r ], where 1_sr is an (sr) × 1 vector of ones and 1_r is an r × 1 vector of ones.

Analysis of Mixed Linear Models

    Y = Xβ + Z u + e

where Xₙₓₚ and Zₙₓq are known model matrices and

    [u]      ( [0]   [G  0] )
    [e]  ~ N ( [0] , [0  R] )

Then

    Y ~ N(Xβ, Σ)    where    Σ = Z G Zᵀ + R
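The identity Σ = σα²(I_b ⊗ J_sr) + σδ²(I_bs ⊗ J_r) + σe²I can be checked numerically from Z = [I_b ⊗ 1_sr | I_bs ⊗ 1_r]. A Python sketch with small illustrative sizes (b = 3 rather than 15, to keep the matrices small; the variance components are the estimates found later for the pigment data):

```python
# Verify Sigma = Z G Z^T + R for the nested design with
# Z = [I_b (x) 1_sr | I_bs (x) 1_r], using small illustrative sizes.
b, s, r = 3, 2, 2          # barrels, samples per barrel, parts per sample
n = b * s * r
sa2, sd2, se2 = 7.128, 28.533, 0.917   # sigma_alpha^2, sigma_delta^2, sigma_e^2

# Index each observation by (barrel, sample, part).
idx = [(i, j, k) for i in range(b) for j in range(s) for k in range(r)]

# Columns of Z: b barrel indicators, then b*s sample indicators.
def z_row(i, j):
    barrel = [1.0 if i == c else 0.0 for c in range(b)]
    sample = [1.0 if (i, j) == (c, d) else 0.0
              for c in range(b) for d in range(s)]
    return barrel + sample

Z = [z_row(i, j) for (i, j, k) in idx]
g_diag = [sa2] * b + [sd2] * (b * s)   # G is diagonal here

# Sigma via Z G Z^T + R ...
Sigma = [[sum(Z[p][c] * g_diag[c] * Z[q][c] for c in range(b + b * s))
          + (se2 if p == q else 0.0) for q in range(n)] for p in range(n)]

# ... must match the Kronecker form
# sa2 (I_b (x) J_sr) + sd2 (I_bs (x) J_r) + se2 I:
for p, (i, j, k) in enumerate(idx):
    for q, (c, d, e) in enumerate(idx):
        want = sa2 * (i == c) + sd2 * ((i, j) == (c, d)) + se2 * (p == q)
        assert abs(Sigma[p][q] - want) < 1e-9
```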
Some objectives

(i)   Inferences about estimable functions of fixed effects:
      point estimates, confidence intervals, tests of hypotheses
(ii)  Estimation of variance components (elements of G and R)
(iii) Predictions of random effects (BLUP)
(iv)  Predictions of future observations

Methods of Estimation

I. Ordinary Least Squares (OLS) Estimation:

Normal equations (estimating equations):

    (XᵀX) b = Xᵀ Y

and solutions have the form

    b = (XᵀX)⁻ Xᵀ Y
The Gauss-Markov Theorem cannot be applied because it requires uncorrelated responses. In these models

    Var(Y) = Z G Zᵀ + R ≠ σ² I

The OLS estimator of Cᵀβ is

    Cᵀ b = Cᵀ (XᵀX)⁻ Xᵀ Y

where b = (XᵀX)⁻ Xᵀ Y is a solution to the normal equations. Hence, the OLS estimator of an estimable function Cᵀβ is not necessarily a best linear unbiased estimator (b.l.u.e.).

The OLS estimator Cᵀ b is a linear function of Y, with

    E(Cᵀ b) = Cᵀβ

    Var(Cᵀ b) = Cᵀ (XᵀX)⁻ Xᵀ (Z G Zᵀ + R) X (XᵀX)⁻ C

If Y ~ N(Xβ, Z G Zᵀ + R), then Cᵀ b has a normal distribution with mean Cᵀβ and covariance matrix Cᵀ (XᵀX)⁻ Xᵀ (Z G Zᵀ + R) X (XᵀX)⁻ C.
II. Generalized Least Squares (GLS) Estimation:

Suppose E(Y) = Xβ, and also suppose

    Σ = Var(Y) = Z G Zᵀ + R

is known. Then a GLS estimator for β is any b that minimizes

    Q(b) = (Y − X b)ᵀ Σ⁻¹ (Y − X b)

The estimating equations are:

    (Xᵀ Σ⁻¹ X) b = Xᵀ Σ⁻¹ Y

and

    b_GLS = (Xᵀ Σ⁻¹ X)⁻ (Xᵀ Σ⁻¹ Y)

is a solution. For any estimable function Cᵀβ, the unique b.l.u.e. is

    Cᵀ b_GLS = Cᵀ (Xᵀ Σ⁻¹ X)⁻ Xᵀ Σ⁻¹ Y

with

    Var(Cᵀ b_GLS) = Cᵀ (Xᵀ Σ⁻¹ X)⁻ C

If Y ~ N(Xβ, Σ), then

    Cᵀ b_GLS ~ N( Cᵀβ, Cᵀ (Xᵀ Σ⁻¹ X)⁻ C )

When G and/or R contain unknown parameters, you could obtain an "approximate BLUE" by replacing the unknown parameters with consistent estimators to obtain

    Σ̂ = Z Ĝ Zᵀ + R̂

and

    Cᵀ b̂_GLS = Cᵀ (Xᵀ Σ̂⁻¹ X)⁻ Xᵀ Σ̂⁻¹ Y

In that case:

- Cᵀ b̂_GLS is not a linear function of Y
- Cᵀ b̂_GLS is not a best linear unbiased estimator (BLUE)
- See Kackar and Harville (1981, 1984) for conditions under which Cᵀ b̂_GLS is an unbiased estimator for Cᵀβ
- Cᵀ (Xᵀ Σ̂⁻¹ X)⁻ C tends to "underestimate" Var(Cᵀ b̂_GLS) (see Eaton (1984))
- For "large" samples,

    Cᵀ b̂_GLS ~̇ N( Cᵀβ, Cᵀ (Xᵀ Σ⁻¹ X)⁻ C )
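A minimal numerical illustration of the GLS estimating equations, assuming a known diagonal Σ and hypothetical data: for the intercept-only model X = 1, (XᵀΣ⁻¹X)b = XᵀΣ⁻¹Y reduces to a precision-weighted mean.

```python
# GLS for the intercept-only model Y = 1*mu + e with known
# Sigma = diag(v_1, ..., v_n): the estimating equation
# (X' Sigma^-1 X) b = X' Sigma^-1 Y gives a precision-weighted mean.
y = [10.0, 12.0, 11.0, 20.0]   # hypothetical responses
v = [1.0, 1.0, 1.0, 25.0]      # known (unequal) variances

xtsx = sum(1.0 / vi for vi in v)              # X' Sigma^-1 X
xtsy = sum(yi / vi for yi, vi in zip(y, v))   # X' Sigma^-1 Y
b_gls = xtsy / xtsx
b_ols = sum(y) / len(y)                       # OLS ignores Sigma

# Var(C' b_GLS) = C' (X' Sigma^-1 X)^- C; here C = 1:
var_gls = 1.0 / xtsx

print(b_ols)              # 13.25 -- pulled up by the noisy observation
print(round(b_gls, 2))    # 11.12 -- GLS downweights it
```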
Variance component estimation

Estimation of parameters in G and R is

- crucial to the estimation of estimable functions of fixed effects (e.g. E(Y) = Xβ)
- of interest in its own right (sources of variation in the pigment paste production example)

Basic Approaches

(i)   ANOVA methods (method of moments): set observed values of mean
      squares equal to their expectations and solve the resulting equations.
(ii)  Maximum likelihood estimation (ML)
(iii) Restricted maximum likelihood estimation (REML)
Example 10.1 Penicillin production

    Yij = μ + αi + βj + eij

where βj ~ NID(0, σβ²) and eij ~ NID(0, σe²).

ANOVA method (Method of Moments):
- Compute an ANOVA table
- Equate mean squares to their expected values
- Solve the resulting equations

Source of
Variation   d.f.             Sums of Squares
Blocks      b − 1 = 4        a Σ_{j=1}^b (Ȳ.j − Ȳ..)² = SS_blocks
Processes   a − 1 = 3        b Σ_{i=1}^a (Ȳi. − Ȳ..)² = SS_processes
Error       (a−1)(b−1) = 12  Σ_{i=1}^a Σ_{j=1}^b (Yij − Ȳi. − Ȳ.j + Ȳ..)² = SS_E
C. total    ab − 1 = 19      Σ_{i=1}^a Σ_{j=1}^b (Yij − Ȳ..)²

Start at the bottom:

    MS_error = SS_E / ((a − 1)(b − 1)),    E(MS_error) = σe²

Then an unbiased estimator for σe² is

    σ̂e² = MS_error

Next, consider the mean square for the random block effects:

    MS_blocks = SS_blocks / (b − 1)

    E(MS_blocks) = σe² + a σβ²
                        ↑
           a = number of observations for each block

Then,

    σβ² = ( E(MS_blocks) − σe² ) / a
        = ( E(MS_blocks) − E(MS_error) ) / a

so an unbiased estimator for σβ² is

    σ̂β² = ( MS_blocks − MS_error ) / a

For the penicillin data

    σ̂e² = MS_error = 18.83

    σ̂β² = ( MS_blocks − MS_error ) / 4 = ( 66.0 − 18.83 ) / 4 = 11.79

    V̂ar(Yij) = σ̂β² + σ̂e² = 11.79 + 18.83 = 30.62
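The method-of-moments arithmetic can be reproduced in a few lines; a Python sketch using the sums of squares from the ANOVA table above:

```python
# ANOVA (method-of-moments) estimates for the penicillin example:
# equate MS_error and MS_blocks to their expectations and solve.
a, b = 4, 5                    # processes, batches
ss_error, ss_blocks = 226.0, 264.0
ms_error = ss_error / ((a - 1) * (b - 1))   # E(MS_error)  = sigma_e^2
ms_blocks = ss_blocks / (b - 1)             # E(MS_blocks) = sigma_e^2 + a*sigma_b^2

sigma_e2 = ms_error                         # estimate of sigma_e^2
sigma_b2 = (ms_blocks - ms_error) / a       # estimate of sigma_b^2
var_y = sigma_b2 + sigma_e2                 # estimated Var(Y_ij)

print(round(sigma_e2, 2))   # 18.83
print(round(sigma_b2, 2))   # 11.79
```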
/* This is a program for analyzing the
   penicillin data from Box, Hunter, and
   Hunter. It is posted in the file
   penclln.sas

   First enter the data */

data set1;
  infile 'penclln.dat';
  input batch process $ yield;
run;

/* Compute the ANOVA table, formulas for
   expectations of mean squares, process
   means and their standard errors */

proc glm data=set1;
  class batch process;
  model yield = batch process / e e3;
  random batch / q test;
  lsmeans process / stderr pdiff tdiff;
  output out=set2 r=resid p=yhat;
run;
/* Compute a normal probability plot for
   the residuals and the Shapiro-Wilk test
   for normality */

proc rank data=set2 normal=blom out=set2;
  var resid; ranks q;
run;

proc univariate data=set2 normal plot;
  var resid;
run;

goptions cback=white colors=(black)
  target=win device=winprtc rotate=portrait;

axis1 label=(h=2.5 r=0 a=90 f=swiss 'Residuals')
  value=(f=swiss h=2.0) w=3.0 length=5.0 in;
axis2 label=(h=2.5 f=swiss 'Standard Normal Quantiles')
  value=(f=swiss h=2.0) w=3.0 length=5.0 in;
axis3 label=(h=2.5 f=swiss 'Production Process')
  value=(f=swiss h=2.0) w=3.0 length=5.0 in;
symbol1 v=circle i=none h=2 w=3 c=black;

proc gplot data=set2;
  plot resid*q / vaxis=axis1 haxis=axis2;
  title h=3.5 f=swiss c=black 'Normal Probability Plot';
run;

proc gplot data=set2;
  plot resid*process / vaxis=axis1 haxis=axis3;
  title h=3.5 f=swiss c=black 'Residual Plot';
run;
General Linear Models Procedure
Class Level Information

Class      Levels   Values
BATCH      5        1 2 3 4 5
PROCESS    4        A B C D

Number of observations in data set = 20

General Form of Estimable Functions

Effect                Coefficients
INTERCEPT             L1
BATCH       1         L2
            2         L3
            3         L4
            4         L5
            5         L1-L2-L3-L4-L5
PROCESS     A         L7
            B         L8
            C         L9
            D         L1-L7-L8-L9

/* Fit the same model using PROC MIXED. Compute
   REML estimates of variance components. Note
   that PROC MIXED provides appropriate standard
   errors for process means. When block effects
   are random, PROC GLM does not provide correct
   standard errors for process means */

proc mixed data=set1;
  class process batch;
  model yield = process / ddfm=satterth solution;
  random batch / type=vc g solution cl alpha=.05;
  lsmeans process / pdiff tdiff;
run;
Type III Estimable Functions for: BATCH

Effect                Coefficients
INTERCEPT             0
BATCH       1         L2
            2         L3
            3         L4
            4         L5
            5         -L2-L3-L4-L5
PROCESS     A         0
            B         0
            C         0
            D         0

Type III Estimable Functions for: PROCESS

Effect                Coefficients
INTERCEPT             0
BATCH       1         0
            2         0
            3         0
            4         0
            5         0
PROCESS     A         L7
            B         L8
            C         L9
            D         -L7-L8-L9
Dependent Variable: YIELD

Source       DF   Sum of Squares   Mean Square   F Value   Pr > F
Model         7   334.00           47.71         2.53      0.0754
Error        12   226.00           18.83
Cor. Total   19   560.00

R-Square   C.V.    Root MSE   YIELD Mean
0.596      5.046   4.3397     86.0

Source       DF   Type III SS   Mean Square   F Value   Pr > F
BATCH         4   264.0         66.0          3.50      0.0407
PROCESS       3    70.0         23.3          1.24      0.3387
Tests of Hypotheses for Mixed Model Analysis of Variance

Dependent Variable: YIELD

Source: BATCH
Error: MS(Error)
  DF   Type III MS   Denominator DF   Denominator MS   F Value   Pr > F
   4   66            12               18.83            3.5044    0.0407

Source: PROCESS
Error: MS(Error)
  DF   Type III MS   Denominator DF   Denominator MS   F Value   Pr > F
   3   23.33         12               18.83            1.2389    0.3387

Quadratic Forms of Fixed Effects in the Expected Mean Squares

Source: Type III Mean Square for PROCESS

             PROCESS A   PROCESS B   PROCESS C   PROCESS D
PROCESS A        3.750      -1.250      -1.250      -1.250
PROCESS B       -1.250       3.750      -1.250      -1.250
PROCESS C       -1.250      -1.250       3.750      -1.250
PROCESS D       -1.250      -1.250      -1.250       3.750
Source      Type III Expected Mean Square
BATCH       Var(Error) + 4 Var(BATCH)
PROCESS     Var(Error) + Q(PROCESS)

Least Squares Means

          YIELD     Std
Process   LSMEAN    Error   Pr > |T|
A         84        1.941   0.0001
B         85        1.941   0.0001
C         89        1.941   0.0001
D         86        1.941   0.0001

t-tests / p-values for pairwise comparisons (t statistic / p-value):

       vs B           vs C           vs D
A      -0.364/0.722   -1.822/0.094   -0.729/0.480
B                     -1.457/0.171   -0.364/0.722
C                                     1.093/0.296
The MIXED Procedure

Model Information
Data Set                      WORK.SET1
Dependent Variable            yield
Covariance Structure          Variance Components
Estimation Method             REML
Residual Variance Method      Profile
Fixed Effects SE Method       Model-Based
Degrees of Freedom Method     Satterthwaite

Class Level Information
Class     Levels   Values
PROCESS   4        A B C D
BATCH     5        1 2 3 4 5

Iteration History
Iteration   Evaluations   -2 Res Log Like   Criterion
0           1             106.59285141
1           1             103.82994387      0.00000000

Convergence criteria met.

Estimated G Matrix
Row   Effect   batch   Col1      Col2      Col3      Col4      Col5
1     batch    1       11.7917
2     batch    2                 11.7917
3     batch    3                           11.7917
4     batch    4                                     11.7917
5     batch    5                                               11.7917

Covariance Parameter Estimates
Cov Parm    Estimate
batch       11.7917
Residual    18.8333
Fit Statistics
Res Log Likelihood                  -51.9
Akaike's Information Criterion      -53.9
Schwarz's Bayesian Criterion        -53.5
-2 Res Log Likelihood               103.8

Solution for Fixed Effects
Effect      process   Estimate   Standard Error   DF     t Value   Pr > |t|
Intercept             86.0000    2.4749           11.1   34.75     <.0001
process     A         -2.0000    2.7447           12     -0.73     0.4802
process     B         -1.0000    2.7447           12     -0.36     0.7219
process     C          3.0000    2.7447           12      1.09     0.2958
process     D          0         .                .      .         .

Solution for Random Effects
Effect   batch   Estimate   Std Err Pred   DF     t       Pr > |t|   Alpha   Lower     Upper
batch    1        4.2879    2.2473         5.29    1.91   0.1115     0.05    -1.3954   9.9712
batch    2       -2.1439    2.2473         5.29   -0.95   0.3816     0.05    -7.8273   3.5394
batch    3       -0.7146    2.2473         5.29   -0.32   0.7627     0.05    -6.3980   4.9687
batch    4        1.4293    2.2473         5.29    0.64   0.5513     0.05    -4.2540   7.1126
batch    5       -2.8586    2.2473         5.29   -1.27   0.2564     0.05    -8.5419   2.8247
Type 3 Tests of Fixed Effects
Effect     Num DF   Den DF   F Value   Pr > F
process    3        12       1.24      0.3387

Least Squares Means
Effect    process   Est.      Standard Error   DF     t Value   Pr > |t|
process   A         84.0000   2.4749           11.1   33.94     <.0001
process   B         85.0000   2.4749           11.1   34.35     <.0001
process   C         89.0000   2.4749           11.1   35.96     <.0001
process   D         86.0000   2.4749           11.1   34.75     <.0001

Inferences about treatment means:

    Yij = μ + αi + βj + eij

Consider the sample mean (one observation for each treatment in each block):

    Ȳi. = (1/b) Σ_{j=1}^b Yij
Differences of Least Squares Means
Effect    Pair    Estimate   Standard Error   DF   t       Pr > |t|
process   A - B   -1.0000    2.7447           12   -0.36   0.7219
process   A - C   -5.0000    2.7447           12   -1.82   0.0935
process   A - D   -2.0000    2.7447           12   -0.73   0.4802
process   B - C   -4.0000    2.7447           12   -1.46   0.1707
process   B - D   -1.0000    2.7447           12   -0.36   0.7219
process   C - D    3.0000    2.7447           12    1.09   0.2958

With βj ~ NID(0, σβ²),

    E(Ȳi.) = { μ + αi                        for random block effects
             { μ + αi + (1/b) Σ_{j=1}^b βj   for fixed block effects
Also,

    Var(Ȳi.) = Var( (1/b) Σ_{j=1}^b Yij )
             = (1/b²) Σ_{j=1}^b Var(Yij)
             = { (1/b)(σβ² + σe²)   for random block effects
               { (1/b) σe²          for fixed block effects

Confidence Intervals

Fixed additive block effects:

    S²_Ȳi. = (1/b) σ̂e² = (1/b) MS_error

The standard error for Ȳi. is

    S_Ȳi. = sqrt( (1/b) MS_error ) = 1.941

A (1 − γ)100% confidence interval for μ + αi + (1/b) Σ_{j=1}^b βj is

    Ȳi. ± t_{(a−1)(b−1), γ/2} sqrt( (1/b) MS_error )
t-tests: Reject H0: μ + αi + (1/b) Σ_{j=1}^b βj = d if

    |t| = |Ȳi. − d| / sqrt( (1/b) MS_error ) > t_{(a−1)(b−1), γ/2}

This is what is done by the LSMEANS option in the GLM procedure in SAS, even when you specify RANDOM BATCH; and it is what is done by the MIXED procedure in SAS when batch effects are not random.
Models with random additive block effects:

    S²_Ȳi. = (1/b)(σ̂e² + σ̂β²)
           = (1/b)( MS_error + (MS_blocks − MS_error)/a )
           = ((a−1)/(ab)) MS_error + (1/(ab)) MS_blocks
           = (1/(ab(b−1))) [ SS_error + SS_blocks ]
                                ↑           ↑
              σe² χ²_{(a−1)(b−1)}      (σe² + a σβ²) χ²_{(b−1)}

Hence, the distribution of S²_Ȳi. is not a multiple of a central chi-square random variable.

Standard error for Ȳi. is

    S_Ȳi. = sqrt( ((a−1)/(ab)) MS_error + (1/(ab)) MS_blocks ) = 2.4749

An approximate (1 − γ)100% confidence interval for μ + αi is

    Ȳi. ± t_{v, γ/2} sqrt( ((a−1)/(ab)) MS_error + (1/(ab)) MS_blocks )

where

        [ ((a−1)/(ab)) MS_error + (1/(ab)) MS_blocks ]²
    v = ────────────────────────────────────────────────────────
        ( ((a−1)/(ab)) MS_error )²    ( (1/(ab)) MS_blocks )²
        ─────────────────────────  +  ───────────────────────
              (a−1)(b−1)                      b − 1
For the penicillin data (a = 4, b = 5, MS_error = 18.83, MS_blocks = 66.0),

        [ (3/20)(18.83) + (1/20)(66.0) ]²
    v = ───────────────────────────────────────── = 11.075
        ( (3/20)(18.83) )²    ( (1/20)(66.0) )²
        ──────────────────  + ─────────────────
               12                    4

Result 10.1: Cochran-Satterthwaite approximation

Suppose MS1, MS2, …, MSk are mean squares with

- independent distributions
- degrees of freedom df_i
- (df_i) MS_i / E(MS_i) ~ χ²_{df_i}

Then, for positive constants a_i > 0, i = 1, 2, …, k, the distribution of

    S² = a1 MS1 + a2 MS2 + … + ak MSk

is approximated by

    v S² / E(S²) ~̇ χ²_v

where

    v = [E(S²)]² / ( [a1 E(MS1)]²/df1 + … + [ak E(MSk)]²/dfk )

is the value for the degrees of freedom. In practice, the degrees of freedom are evaluated as

    v̂ = [S²]² / ( (a1 MS1)²/df1 + … + (ak MSk)²/dfk )

These are called the Cochran-Satterthwaite degrees of freedom.

Cochran, W.G. (1951) Testing a Linear Relation among Variances, Biometrics 7, 17-32.
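The Cochran-Satterthwaite computation for the penicillin example can be sketched in Python, reproducing the standard error 2.4749 and degrees of freedom near the 11.1 reported by PROC MIXED:

```python
# Cochran-Satterthwaite degrees of freedom for
# S^2 = ((a-1)/(ab)) MS_error + (1/(ab)) MS_blocks  (penicillin example)
a, b = 4, 5
ms = [226.0 / 12.0, 66.0]          # MS_error, MS_blocks
df = [(a - 1) * (b - 1), b - 1]    # 12, 4
coef = [(a - 1) / (a * b), 1.0 / (a * b)]   # 3/20, 1/20

s2 = sum(c * m for c, m in zip(coef, ms))
v = s2 ** 2 / sum((c * m) ** 2 / d for c, m, d in zip(coef, ms, df))

print(round(s2 ** 0.5, 4))  # 2.4749 -- standard error of a process mean
print(round(v, 2))          # 11.07  -- Satterthwaite df
```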
Difference between two means:

    E(Ȳi. − Ȳk.) = E( (1/b) Σ_{j=1}^b Yij − (1/b) Σ_{j=1}^b Ykj )
                 = E( (1/b) Σ_{j=1}^b (Yij − Ykj) )
                 = E( (1/b) Σ_{j=1}^b (μ + αi + βj + eij − μ − αk − βj − ekj) )
                 = αi − αk + (1/b) Σ_{j=1}^b E(eij − ekj)    <- this is zero
                 = αi − αk
                 = (μ + αi) − (μ + αk)

whether block effects are fixed or random. Also,

    Var(Ȳi. − Ȳk.) = Var( αi − αk + (1/b) Σ_{j=1}^b (eij − ekj) )
                   = (1/b²) Σ_{j=1}^b Var(eij − ekj)
                   = 2σe²/b

The standard error for Ȳi. − Ȳk. is

    S_{Ȳi.−Ȳk.} = sqrt( 2 MS_error / b )

A (1 − γ)100% confidence interval for μi − μk is

    (Ȳi. − Ȳk.) ± t_{(a−1)(b−1), γ/2} sqrt( 2 MS_error / b )
                    ↑
             d.f. for MS_error

t-test: Reject H0: μi − μk = 0 if

    |t| = |Ȳi. − Ȳk.| / sqrt( 2 MS_error / b ) > t_{(a−1)(b−1), γ/2}
                                                    ↑
                                             d.f. for MS_error
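The standard error and t statistic for a difference of two process means can be checked directly; a Python sketch for process C versus process D, matching the SAS output:

```python
# Standard error and t statistic for a difference of two process means
# in the penicillin example (b = 5 blocks, MS_error = 226/12, 12 df).
b = 5
ms_error = 226.0 / 12.0
se_diff = (2.0 * ms_error / b) ** 0.5    # sqrt(2 MS_error / b)

# Process C vs process D: means 89 and 86
t = (89.0 - 86.0) / se_diff

print(round(se_diff, 4))  # 2.7447 -- matches the SAS standard error
print(round(t, 2))        # 1.09
```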
# Analyze the penicillin data from Box,
# Hunter, and Hunter. This code is
# posted as penclln.ssc

# Enter the data into a data frame and
# change the Batch and Process variables
# into factors

> penclln <- read.table("penclln.dat",
+     col.names=c("Batch","Process","Yield"))
> penclln$Batch <- as.factor(penclln$Batch)
> penclln$Process <- as.factor(penclln$Process)
> penclln
   Batch Process Yield
1      1       1    89
2      1       2    88
3      1       3    97
4      1       4    94
5      2       1    84
6      2       2    77
7      2       3    92
8      2       4    79
9      3       1    81
10     3       2    87
11     3       3    87
12     3       4    85
13     4       1    87
14     4       2    92
15     4       3    89
16     4       4    84
17     5       1    79
18     5       2    81
19     5       3    80
20     5       4    88
# Construct a profile plot. UNIX users
# should use the motif( ) command to open
# a graphics window

> attach(penclln)
> means <- tapply(Yield,list(Process,Batch),mean)
> par(fin=c(6,7),cex=1.2,lwd=3,mex=1.5)
> x.axis <- unique(Process)
> matplot(c(1,4), c(75,100), type="n", xaxt="n",
+     xlab="Process", ylab="Yield",
+     main= "Penicillin Production Results")
> axis(1, at=(1:4)*1, labels=c("A", "B", "C", "D"))
> matlines(x.axis,means,type='l',lty=1:5,lwd=3)
> legend(4.2,95, legend=c('Batch 1','Batch 2',
+     'Batch 3','Batch 4','Batch 5'), lty=1:5,bty='n')
> detach( )
755
# Use the lme( ) function to fit a model
# with additive batch (random) and process
# (fixed) effects and create diagnostic plots.
95
100
Penicillin Production Results
80
85
90
Batch 1
Batch 2
Batch 3
Batch 4
Batch 5
> options(contrasts=c("contr.treatment",
+
"contr.poly"))
> penclln.lme <- lme(Yield ~ Process,
+
random= ~ 1|Batch, data=penclln,
+
method=c("REML"))
> summary(penclln.lme)
75
Yield
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
A
B
C
D
Linear mixed-effects model fit by REML
Data: penclln
AIC
BIC
logLik
83.28607 87.92161 -35.64304
Process
757
758
Random effects:
 Formula: ~ 1 | Batch
        (Intercept) Residual
StdDev:    3.433899 4.339739

Fixed effects: Yield ~ Process
            Value Std.Error DF  t-value p-value
(Intercept)    84  2.474874 12 33.94113  <.0001
Process2        1  2.744692 12  0.36434  0.7219
Process3        5  2.744692 12  1.82170  0.0935
Process4        2  2.744692 12  0.72868  0.4802
 Correlation:
         (Intr) Prcss2 Prcss3
Process2 -0.555
Process3 -0.555  0.500
Process4 -0.555  0.500  0.500

Standardized Within-Group Residuals:
       Min         Q1        Med        Q3     Max
 -1.415158 -0.5017351 -0.1643841 0.6829939 1.28365

Number of Observations: 20
Number of Groups: 5

> names(penclln.lme)
 [1] "modelStruct"  "dims"         "contrasts"
 [4] "coefficients" "varFix"       "sigma"
 [7] "apVar"        "logLik"       "numIter"
[10] "groups"       "call"         "method"
[13] "fitted"       "residuals"    "fixDF"

> # Construct ANOVA table for fixed effects
> anova(penclln.lme)
            numDF denDF  F-value p-value
(Intercept)     1    12 2241.213  <.0001
Process         3    12    1.239  0.3387
> # Estimated parameters for fixed effects
> coef(penclln.lme)
  (Intercept) Process2 Process3 Process4
1    88.28788        1        5        2
2    81.85606        1        5        2
3    83.28535        1        5        2
4    85.42929        1        5        2
5    81.14141        1        5        2

> # BLUP's for random effects
> ranef(penclln.lme)
  (Intercept)
1   4.2878780
2  -2.1439390
3  -0.7146463
4   1.4292927
5  -2.8585854
> # Confidence intervals for fixed effects
> # and estimated standard deviations
> intervals(penclln.lme)
Approximate 95% confidence intervals

 Fixed effects:
                 lower est.    upper
(Intercept) 78.6077137   84 89.39229
Process2    -4.9801701    1  6.98017
Process3    -0.9801701    5 10.98017
Process4    -3.9801701    2  7.98017

 Random Effects:
  Level: Batch
                    lower     est.    upper
sd((Intercept)) 0.8555882 3.433899 13.78193

 Within-group standard error:
   lower     est.   upper
2.464606 4.339739 7.64152
> # Create a listing of the original data
> # residuals and predicted values
> data.frame(penclln$Process,penclln$Batch,
+     penclln$Yield,
+     Pred=penclln.lme$fitted,
+     Resid=round(penclln.lme$resid,3))

   X1 X2 X3 Pred.fixed Pred.Batch Resid.fixed Resid.Batch
1   1  1 89         84   88.28788           5       0.712
2   2  1 88         85   89.28788           3      -1.288
3   3  1 97         89   93.28788           8       3.712
4   4  1 94         86   90.28788           8       3.712
5   1  2 84         84   81.85606           0       2.144
6   2  2 77         85   82.85606          -8      -5.856
7   3  2 92         89   86.85606           3       5.144
8   4  2 79         86   83.85606          -7      -4.856
9   1  3 81         84   83.28535          -3      -2.285
10  2  3 87         85   84.28535           2       2.715
11  3  3 87         89   88.28535          -2      -1.285
12  4  3 85         86   85.28535          -1      -0.285
13  1  4 87         84   85.42929           3       1.571
14  2  4 92         85   86.42929           7       5.571
15  3  4 89         89   90.42929           0      -1.429
16  4  4 84         86   87.42929          -2      -3.429
17  1  5 79         84   81.14141          -5      -2.141
18  2  5 81         85   82.14141          -4      -1.141
19  3  5 80         89   86.14141          -9      -6.141
20  4  5 88         86   83.14141           2       4.859

> # Create residual plots
> frame( )
> par(fin=c(7,7),cex=1.2,lwd=3,mex=1.5)
> plot(penclln.lme$fitted, penclln.lme$resid,
+     xlab="Estimated Means",
+     ylab="Residuals",
+     main="Residual Plot")
> abline(h=0, lty=2, lwd=3)
> qqnorm(penclln.lme$resid)
> qqline(penclln.lme$resid)
[Figures: "Residual Plot" of penclln.lme$resid (-5 to 5) against Estimated
 Means (82 to 92), and a normal Q-Q plot of penclln.lme$resid against
 Quantiles of Standard Normal]
Example 10.2 Pigment production

In this example the main objective is the estimation of the variance components.

Source of
Variation    d.f.            MS       E(MS)
Batches      15-1=14         86.495   σe² + 2σδ² + 4σα²
Samples
in Batches   15(2-1)=15      57.983   σe² + 2σδ²
Tests in
Samples      (30)(2-1)=30     0.917   σe²

Estimates of variance components:

    σ̂e² = MS_tests = 0.917
    σ̂δ² = ( MS_samples − MS_tests ) / 2 = 28.533
    σ̂α² = ( MS_batches − MS_samples ) / 4 = 7.128
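The estimates follow directly from the mean squares; a Python sketch of the method-of-moments arithmetic:

```python
# Method-of-moments variance component estimates for the pigment
# example, from the nested ANOVA mean squares.
r, s = 2, 2                      # parts per sample, samples per barrel
ms_batches = 86.495
ms_samples = 57.983
ms_tests = 0.917

sigma_e2 = ms_tests                              # sigma_e^2
sigma_d2 = (ms_samples - ms_tests) / r           # sigma_delta^2
sigma_a2 = (ms_batches - ms_samples) / (r * s)   # sigma_alpha^2

print(sigma_e2)             # 0.917
print(round(sigma_d2, 3))   # 28.533
print(round(sigma_a2, 3))   # 7.128
```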
/* This is a SAS program for analyzing
   data from a nested or hierarchical
   experiment. This program is posted as
   pigment.sas

   The data are measurements of moisture
   content of a pigment taken from Box,
   Hunter and Hunter (page 574). */

data set1;
  infile 'pigment.dat';
  input batch sample test y;
run;

proc print data=set1;
run;

/* The "random" statement in the following
   GLM procedure prints out formulas for
   expectations of mean squares. These results
   are used in variance component estimation */

proc glm data=set1;
  class batch sample;
  model y = batch sample(batch) / e1;
  random batch sample(batch) / q test;
run;
/* Alternatively, REML estimates of variance
   components are produced by the MIXED
   procedure in SAS. Note that there are
   no terms on the right of the equal sign in
   the model statement because the only
   non-random effect is the intercept. */

proc mixed data=set1;
  class batch sample test;
  model y = ;
  random batch sample(batch);
run;

/* Use the MIXED procedure in SAS to compute
   maximum likelihood estimates of variance
   components */

proc mixed data=set1 method=ml;
  class batch sample test;
  model y = ;
  random batch sample(batch);
run;

OBS   BATCH   SAMPLE   TEST    Y
  1     1       1        1    40
  2     1       1        2    39
  3     1       2        1    30
  4     1       2        2    30
  5     2       1        1    26
  6     2       1        2    28
  7     2       2        1    25
  8     2       2        2    26
  9     3       1        1    29
 10     3       1        2    28
 11     3       2        1    14
 12     3       2        2    15
 13     4       1        1    30
 14     4       1        2    31
 15     4       2        1    24
 16     4       2        2    24
 17     5       1        1    19
 18     5       1        2    20
 19     5       2        1    17
 20     5       2        2    17
 21     6       1        1    33
 22     6       1        2    32
 23     6       2        1    26
 24     6       2        2    24
 25     7       1        1    23
 26     7       1        2    24
 27     7       2        1    32
 28     7       2        2    33
 29     8       1        1    34
 30     8       1        2    34
OBS   BATCH   SAMPLE   TEST    Y
 31     8       2        1    29
 32     8       2        2    29
 33     9       1        1    27
 34     9       1        2    27
 35     9       2        1    31
 36     9       2        2    31
 37    10       1        1    13
 38    10       1        2    16
 39    10       2        1    27
 40    10       2        2    24
 41    11       1        1    25
 42    11       1        2    23
 43    11       2        1    25
 44    11       2        2    27
 45    12       1        1    29
 46    12       1        2    29
 47    12       2        1    31
 48    12       2        2    32
 49    13       1        1    19
 50    13       1        2    20
 51    13       2        1    29
 52    13       2        2    30
 53    14       1        1    23
 54    14       1        2    24
 55    14       2        1    25
 56    14       2        2    25
 57    15       1        1    39
 58    15       1        2    37
 59    15       2        1    26
 60    15       2        2    28
General Linear Models Procedure
Class Level Information

Class    Levels   Values
BATCH    15       1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
SAMPLE   2        1 2

Number of observations in the data set = 60

Dependent Variable: Y

Source     DF   Sum of Squares   Mean Square   F Value   Pr > F
Model      29   2080.68          71.7477       78.27     0.0001
Error      30     27.5000         0.9167
C.Total    59   2108.18

Source          DF   Type I SS   Mean Square   F Value   Pr > F
BATCH           14   1210.93     86.4952       94.36     0.0001
SAMPLE(BATCH)   15    869.75     57.9833       63.25     0.0001
Source          Type I Expected Mean Square
BATCH           Var(Error) + 2 Var(SAMPLE(BATCH)) + 4 Var(BATCH)
SAMPLE(BATCH)   Var(Error) + 2 Var(SAMPLE(BATCH))

Tests of hypotheses using appropriate error terms:

Dependent Variable: y

Source   DF   Type I SS   MS       F      Pr>F
batch    14   1210.933    86.495   1.49   0.2256
Error: MS(sample(batch))
         15    869.750    57.983

Source          DF   Type I SS   MS       F       Pr>F
sample(batch)   15   869.75      57.983   63.25   <.0001
Error: MS(Error)
                30    27.50       0.917

The MIXED Procedure
Class Level Information

Class    Levels   Values
BATCH    15       1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
SAMPLE   2        1 2
TEST     2        1 2

REML Estimation Iteration History

Iteration   Evaluations   Objective       Criterion
0           1             274.08096606
1           1             183.82758851    0.0000000

Convergence criteria met.
Covariance Parameter Estimates (REML)

Cov Parm        Ratio     Estimate   Std Error   Z      Pr > |Z|
BATCH           7.7760    7.1280     9.7373      0.73   0.4642
SAMPLE(BATCH)   31.1273   28.5333    10.5869     2.70   0.0070
Residual        1.0000    0.9167     0.2367      3.87   0.0001

Model Fitting Information for Y

Description                        Value
Observations                       60.0000
Variance Estimate                  0.9167
Standard Deviation Estimate        0.9574
REML Log Likelihood                -146.131
Akaike's Information Criterion     -149.131
Schwarz's Bayesian Criterion       -152.247
-2 REML Log Likelihood             292.2623
ML Estimation Iteration History

Iteration   Evaluations   Objective       Criterion
0           1             273.55423884
1           1             184.15844023    0.00000000

Convergence criteria met.

Covariance Parameter Estimates (MLE)

Cov Parm        Ratio     Estimate   Std Error   Z      Pr > |Z|
BATCH           6.2033    5.68639    9.07341     0.63   0.5309
SAMPLE(BATCH)   31.1273   28.53333   10.58692    2.70   0.0070
Residual        1.0000    0.91667    0.23668     3.87   0.0001

Model Fitting Information for Y

Description                        Value
Observations                       60.0000
Variance Estimate                  0.9167
Standard Deviation Estimate        0.9574
Log Likelihood                     -147.216
Akaike's Information Criterion     -150.216
Schwarz's Bayesian Criterion       -153.357
-2 Log Likelihood                  294.4311

Estimation of μ = E(Yijk):

    μ̂ = Ȳ... = (1/(bsr)) Σ_{i=1}^b Σ_{j=1}^s Σ_{k=1}^r Yijk

    E(Ȳ...) = μ

    Var(Ȳ...) = (1/(bsr)) (σe² + r σδ² + sr σα²)

Standard error:

    S_Ȳ... = sqrt( (1/(bsr)) (σ̂e² + r σ̂δ² + sr σ̂α²) )
           = sqrt( (1/(bsr)) MS_Batches )
           = sqrt( 86.495/60 ) = 1.2007
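A quick check of the arithmetic in Python. Note that MS_Batches/(bsr) = 86.495/60 ≈ 1.4416 estimates the variance of the mean; the standard error is its square root:

```python
# Standard error and 95% confidence interval for the overall mean
# moisture content, mu, in the pigment example.
b, s, r = 15, 2, 2
ms_batches = 86.495
ybar = 26.783        # grand mean of the 60 determinations
t_14 = 2.510         # t quantile with 14 df (df of MS_batches)

var_mean = ms_batches / (b * s * r)   # estimates (sigma_e^2 + r sigma_d^2 + sr sigma_a^2)/(bsr)
se_mean = var_mean ** 0.5

lo = ybar - t_14 * se_mean
hi = ybar + t_14 * se_mean
print(round(se_mean, 4))            # 1.2007
print(round(lo, 2), round(hi, 2))   # 23.77 29.8
```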
A 95% confidence interval for μ is

    Ȳ... ± t_{14, .025} S_Ȳ...
              ↑
       df for MS_Batches

Here, t_{14,.025} = 2.510 and the confidence interval is

    26.783 ± (2.510)(1.2007)  ⇒  (23.77, 29.80)

> # This file is stored as pigment.spl

> pigment <- read.table("pigment.dat",
+     col.names=c("Batch","Sample","Test","Y"))
> pigment$Batch <- as.factor(pigment$Batch)
> pigment$Sample <- as.factor(pigment$Sample)
> pigment

   Batch Sample Test  Y
1      1      1    1 40
2      1      1    2 39
3      1      2    1 30
4      1      2    2 30
5      2      1    1 26
6      2      1    2 28
7      2      2    1 25
8      2      2    2 26
9      3      1    1 29
10     3      1    2 28
11     3      2    1 14
12     3      2    2 15
 .     .      .    .  .
 .     .      .    .  .
> # The function raov() may be used for
> # balanced designs with only random effects,
> # and gives a conventional analysis including
> # the estimation of variance components.
> # The function varcomp() is more general. It
> # may be used to estimate variance components
> # for balanced or unbalanced mixed models.

> # raov(): Random Effects Analysis of Variance
> pigment.raov <- raov(Y ~ Batch/Sample,
+     data=pigment)

> # or you could use
> # pigment.raov <- raov(Y ~ Batch +
> #     Batch:Sample, data=pigment)
> # pigment.raov <- raov(Y ~ Batch +
> #     Sample %in% Batch, data=pigment)

> summary(pigment.raov)
                  Df Sum of Sq  Mean Sq Est. Var.
Batch             14  1210.933 86.49524   7.12798
Sample %in% Batch 15   869.750 57.98333  28.53333
Residuals         30    27.500  0.91667   0.91667

> names(pigment.raov)
 [1] "coefficients"  "residuals"    "fitted.values"
 [4] "effects"       "R"            "rank"
 [7] "assign"        "df.residual"  "contrasts"
[10] "terms"         "call"         "model"
[13] "replications"  "ems.coef"

> pigment.raov$rep
Batch Sample %in% Batch
    4                 2

> pigment.raov$ems.coef
                  Batch Sample %in% Batch Residuals
Batch                 4                 0         0
Sample %in% Batch     2                 2         0
Residuals             1                 1         1
> # The same variance component estimates can be
> # found using varcomp(), but this allows mixed
> # models and we first must declare which factors
> # are random using the is.random() function.
> # All factors in the data frame are established
> # as random effects by the following
>
> is.random(pigment) <- T
> is.random(pigment)
Batch Sample
    T      T
>
> # The possible estimation methods are
> # "minque0": minimum norm quadratic estimators
> #            (the default)
> # "reml"   : residual (or reduced or restricted)
> #            maximum likelihood.
> # "ml"     : maximum likelihood.
> varcomp(Y ~ Batch/Sample, data=pigment,
+         method="reml")$var
  Batch Sample %in% Batch Residuals
7.12866          28.53469  0.916641
> varcomp(Y ~ Batch/Sample, data=pigment,
+         method="ml")$var
  Batch Sample %in% Batch Residuals
5.68638          28.53333 0.9166668
Properties of ANOVA methods for variance component estimation:

(i) Broad applicability
    - easy to compute in balanced cases
    - ANOVA is widely known
    - not required to completely specify distributions for random effects

(ii) Unbiased estimators

(iii) Sampling distribution is not exactly known, even under the usual normality assumptions (except for σ̂_e^2 = MS_error)

(iv) May produce negative estimates of variances

(v) REML estimates have the same values in simple balanced cases when ANOVA estimates of variance components are inside the parameter space

(vi) For unbalanced studies, there may be no "natural" way to choose

    σ̂^2 = ∑_{i=1}^k a_i MS_i
Result 10.2: If MS_1, MS_2, ..., MS_k are distributed independently with

    (df_i) MS_i / E(MS_i) ~ χ²_{df_i}

and constants a_i > 0, i = 1, 2, ..., k, are selected so that

    σ̂^2 = ∑_{i=1}^k a_i MS_i

has expectation σ^2, then

    Var(σ̂^2) = 2 ∑_{i=1}^k a_i^2 [E(MS_i)]^2 / df_i

and an unbiased estimator of this variance is

    V̂ar(σ̂^2) = 2 ∑_{i=1}^k a_i^2 MS_i^2 / (df_i + 2)

Proof: Since (df_i) MS_i / E(MS_i) ~ χ²_{df_i}, and E(χ²_{df_i}) = df_i and Var(χ²_{df_i}) = 2 df_i, it follows that

    Var(MS_i) = Var( E(MS_i) χ²_{df_i} / df_i ) = 2 [E(MS_i)]^2 / df_i
From the independence of the MS_i's, we have

    Var(σ̂^2) = ∑_{i=1}^k a_i^2 Var(MS_i)
             = 2 ∑_{i=1}^k a_i^2 [E(MS_i)]^2 / df_i

Furthermore,

    E(MS_i^2) = Var(MS_i) + [E(MS_i)]^2
              = 2 [E(MS_i)]^2 / df_i + [E(MS_i)]^2
              = ((df_i + 2)/df_i) [E(MS_i)]^2

Consequently,

    E( 2 ∑_{i=1}^k a_i^2 MS_i^2 / (df_i + 2) ) = Var(σ̂^2)

A "standard error" for σ̂^2 = ∑_{i=1}^k a_i MS_i could be reported as

    S_{σ̂^2} = sqrt( 2 ∑_{i=1}^k a_i^2 MS_i^2 / (df_i + 2) )
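The "standard error" formula above translates directly into code. A Python sketch (the function name and example values are mine; note that Result 10.2 formally requires all a_i > 0, so applying it to a difference of mean squares, as in the batch-variance example below, is only an approximation):

```python
import math

def anova_se(a, ms, df):
    """'Standard error' for sigma2_hat = sum(a_i * MS_i):
    sqrt(2 * sum(a_i^2 * MS_i^2 / (df_i + 2)))."""
    return math.sqrt(2 * sum(ai**2 * mi**2 / (di + 2)
                             for ai, mi, di in zip(a, ms, df)))

# Illustration with the pigment data: sigma2_hat_B = (MS_Batch - MS_Sample)/4,
# so a = (1/4, -1/4), with 14 and 15 df (a_2 < 0, so this is approximate).
se_batch = anova_se([0.25, -0.25], [86.49524, 57.98333], [14, 15])
```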
Using the Cochran-Satterthwaite approximation (Result 10.1), an approximate (1 − α)100% confidence interval for σ^2 could be constructed as

    1 − α ≈ Pr{ χ²_{ν,1−α/2} ≤ ν σ̂^2 / σ^2 ≤ χ²_{ν,α/2} }
          = Pr{ ν σ̂^2 / χ²_{ν,α/2} ≤ σ^2 ≤ ν σ̂^2 / χ²_{ν,1−α/2} }

where σ̂^2 = ∑_{i=1}^k a_i MS_i and

    ν = [ ∑_{i=1}^k a_i MS_i ]^2 / ( ∑_{i=1}^k (a_i MS_i)^2 / df_i )

10.3 Likelihood-based methods:

- Maximum Likelihood Estimation
- Restricted Maximum Likelihood Estimation (REML)

Consider the mixed model

    Y_{n×1} = X β_{p×1} + Z u_{q×1} + e_{n×1}

where

    [ u ]     ( [ 0 ]   [ G  0 ] )
    [ e ] ~ N ( [ 0 ] , [ 0  R ] )

Then

    Y_{n×1} ~ N(Xβ, Σ)   where   Σ = Z G Z^T + R
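The Satterthwaite degrees of freedom ν can be computed with a few lines of code. A Python sketch (function name is mine; the χ² quantiles themselves would come from statistical tables or a library):

```python
def satterthwaite_df(a, ms, df):
    """Cochran-Satterthwaite approximate df for sigma2_hat = sum(a_i*MS_i):
    nu = (sum a_i MS_i)^2 / sum((a_i MS_i)^2 / df_i)."""
    num = sum(ai * mi for ai, mi in zip(a, ms)) ** 2
    den = sum((ai * mi) ** 2 / di for ai, mi, di in zip(a, ms, df))
    return num / den

# Sanity check: with a single mean square the formula returns its own df.
nu = satterthwaite_df([1.0], [5.0], [12])   # -> 12.0
```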
Maximum Likelihood Estimation

Multivariate normal likelihood:

    L(β, Σ; Y) = (2π)^{−n/2} |Σ|^{−1/2} exp{ −(1/2)(Y − Xβ)^T Σ^{−1} (Y − Xβ) }

The log-likelihood function is

    ℓ(β, Σ; Y) = −(n/2) log(2π) − (1/2) log(|Σ|) − (1/2)(Y − Xβ)^T Σ^{−1} (Y − Xβ)

Given the values of the observed responses, Y, find values β̂ and Σ̂ that maximize the log-likelihood function. This is a difficult computational problem:

- no analytic solution (except in some balanced cases)
- use iterative numerical methods
    - Need starting values (initial guesses at the values of β̂ and Σ̂ = Z Ĝ Z^T + R̂)
    - local or global maxima?
    - what if Σ̂ becomes singular or is not positive definite?
- Constrained optimization
    - estimates of variances cannot be negative
    - estimated correlations between −1 and 1
    - Σ̂, Ĝ, and R̂ are positive definite (or non-negative definite)
- Large sample distributional properties of estimators
    - consistency
    - normality
    - efficiency
  (not guaranteed for ANOVA methods)
- Estimates of variance components tend to be too small.

Consider a sample Y_1, ..., Y_n from a N(μ, σ^2) distribution. An unbiased estimator for σ^2 is

    S^2 = (1/(n−1)) ∑_{j=1}^n (Y_j − Ȳ)^2

The MLE for σ^2 is

    σ̂^2 = (1/n) ∑_{j=1}^n (Y_j − Ȳ)^2

with

    E(σ̂^2) = ((n−1)/n) σ^2 < σ^2
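The downward bias is an exact algebraic shrinkage, not just an asymptotic statement: σ̂^2 = ((n−1)/n) S^2 for every sample. A Python sketch verifying this identity on arbitrary data (the sample values are mine):

```python
# The MLE shrinks S^2 by the factor (n-1)/n for every sample,
# hence E(sigma2_mle) = ((n-1)/n) * sigma^2 < sigma^2.
y = [4.1, 5.0, 3.7, 6.2, 5.5]            # any sample works
n = len(y)
ybar = sum(y) / n
ss = sum((v - ybar) ** 2 for v in y)

s2 = ss / (n - 1)                        # unbiased estimator
sigma2_mle = ss / n                      # maximum likelihood estimator
assert abs(sigma2_mle - (n - 1) / n * s2) < 1e-12
```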
Note that S^2 and σ̂^2 are based on "error contrasts"

    e_1 = Y_1 − Ȳ = ( (n−1)/n, −1/n, ..., −1/n ) Y
     ⋮
    e_n = Y_n − Ȳ = ( −1/n, −1/n, ..., −1/n, (n−1)/n ) Y

whose distribution does not depend on μ = E(Y_j).

When Y ~ N(μ1, σ^2 I),

    e = (e_1, ..., e_n)^T = (I − P_1)Y ~ N(0, σ^2 (I − P_1))

- The MLE σ̂^2 = (1/n) ∑_{j=1}^n e_j^2 fails to acknowledge that e is restricted to an (n−1)-dimensional space, i.e., ∑_{j=1}^n e_j = 0.
- The MLE fails to make the appropriate adjustment in "degrees of freedom" needed to obtain an unbiased estimator for σ^2.
Example: Suppose n = 4 and Y ~ N(μ1, σ^2 I). Then

    e = ( Y_1 − Ȳ, Y_2 − Ȳ, Y_3 − Ȳ, Y_4 − Ȳ )^T = (I − P_1)Y ~ N(0, σ^2 (I − P_1))

This covariance matrix is singular. Here, m = rank(I − P_1) = n − 1 = 3.

Define

    r = M e = M (I − P_X) Y

where

        [ 1   1  −1  −1 ]
    M = [ 1  −1   1  −1 ]
        [ 1  −1  −1   1 ]

has row rank equal to m = rank(I − P_X). Then

        [ r_1 ]   [ Y_1 + Y_2 − Y_3 − Y_4 ]
    r = [ r_2 ] = [ Y_1 − Y_2 + Y_3 − Y_4 ]
        [ r_3 ]   [ Y_1 − Y_2 − Y_3 + Y_4 ]

      = M (I − P_1) Y ~ N( 0, σ^2 M (I − P_1) M^T )

Call the covariance matrix σ^2 W.
Restricted likelihood function:

    L(σ^2; r) = (2π)^{−m/2} |σ^2 W|^{−1/2} exp{ −(1/(2σ^2)) r^T W^{−1} r }

Restricted log-likelihood:

    ℓ(σ^2; r) = −(m/2) log(2π) − (m/2) log(σ^2) − (1/2) log|W| − (1/(2σ^2)) r^T W^{−1} r

(Note that |σ^2 W| = (σ^2)^m |W|.)

(Restricted) likelihood equation:

    0 = ∂ℓ(σ^2; r)/∂σ^2 = −m/(2σ^2) + (1/(2(σ^2)^2)) r^T W^{−1} r

Solution (REML estimator for σ^2):

    σ̂^2_REML = (1/m) r^T W^{−1} r
             = (1/m) Y^T (I − P_1)^T M^T ( M (I − P_1) M^T )^{−1} M (I − P_1) Y
             = (1/m) Y^T (I − P_1) Y
             = (1/(n−1)) ∑_{j=1}^n (Y_j − Ȳ)^2 = S^2

(The matrix (I − P_1)^T M^T ( M (I − P_1) M^T )^{−1} M (I − P_1) is the projection onto the column space of (M(I − P_1))^T, which is the column space of I − P_1.)
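For the n = 4 example this chain of equalities can be checked numerically without any matrix inversion: the rows of M are orthogonal with squared norm 4 and sum to zero, so M(I − P_1) = M and W = M M^T = 4 I_3. A Python sketch (the sample values are mine):

```python
# Verify that the REML estimator built from the error contrasts
# r = M(I - P1)Y reproduces S^2 in the n = 4 example.
y = [3.0, 7.0, 1.0, 5.0]
n = len(y)
ybar = sum(y) / n

M = [[1,  1, -1, -1],
     [1, -1,  1, -1],
     [1, -1, -1,  1]]   # rows sum to 0, so M(I - P1) = M; row rank m = 3

r = [sum(mij * yj for mij, yj in zip(row, y)) for row in M]

# W = M M^T = 4*I_3 (orthogonal rows, squared norm 4),
# so r^T W^{-1} r = sum(r_i^2)/4.
m = n - 1
sigma2_reml = sum(ri * ri for ri in r) / 4 / m
s2 = sum((v - ybar) ** 2 for v in y) / (n - 1)
assert abs(sigma2_reml - s2) < 1e-12
```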
REML (Restricted Maximum Likelihood) estimation:

- Estimate parameters in Σ = Z G Z^T + R
- Maximize a likelihood function for "error contrasts"
    - linear combinations of the observations that do not depend on Xβ
    - Find a set of n − rank(X) linearly independent "error contrasts" and maximize the part of the likelihood that does not depend on E(Y) = Xβ
Mixed (normal-theory) model:

    Y = Xβ + Z u + e

where

    [ u ]     ( [ 0 ]   [ G  0 ] )
    [ e ] ~ N ( [ 0 ] , [ 0  R ] )

Then

    L Y = L(Xβ + Z u + e) = L Xβ + L Z u + L e

is invariant to β if and only if L X = 0. But L X = 0 if and only if

    L = M (I − P_X)

for some M. (Here P_X = X (X^T X)^− X^T.)
To avoid losing information we must have

    row rank(M) = n − rank(X) = n − p

Then a set of n − p error contrasts is

    r = M (I − P_X) Y ~ N_{n−p}( 0, M (I − P_X) Σ (I − P_X)^T M^T )

Call the covariance matrix W; then rank(W) = n − p and W^{−1} exists.
The "restricted" likelihood is

    L(Σ; r) = (2π)^{−(n−p)/2} |W|^{−1/2} exp{ −(1/2) r^T W^{−1} r }

The resulting log-likelihood is

    ℓ(Σ; r) = −((n−p)/2) log(2π) − (1/2) log|W| − (1/2) r^T W^{−1} r
For any M_{(n−p)×n} with row rank equal to n − p = n − rank(X), the log-likelihood can be expressed in terms of

    ê = ( I − X (X^T Σ^{−1} X)^− X^T Σ^{−1} ) Y

as

    ℓ(Σ; ê) = constant − (1/2) log(|Σ|) − (1/2) log(|X̃^T Σ^{−1} X̃|) − (1/2) ê^T Σ^{−1} ê

where X̃ is any set of p = rank(X) linearly independent columns of X. Denote the resulting REML estimators as Ĝ, R̂, and Σ̂ = Z Ĝ Z^T + R̂.
Estimation of fixed effects:

For any estimable function Cβ, the b.l.u.e. is the generalized least squares estimator

    C β̂_GLS = C (X^T Σ^{−1} X)^− X^T Σ^{−1} Y
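In the simplest special case (X = 1 and a diagonal Σ), the GLS formula collapses to the inverse-variance weighted mean, which makes a compact check. A Python sketch (the data and variance values are mine):

```python
# GLS illustration: with X = 1 and Sigma = diag(var),
# beta_GLS = (X' Sigma^-1 X)^-1 X' Sigma^-1 Y reduces to the
# inverse-variance weighted mean of the observations.
y = [2.0, 4.0, 9.0]
var = [1.0, 1.0, 4.0]            # diagonal of Sigma
w = [1.0 / v for v in var]       # Sigma^-1 applied coordinate-wise
beta_gls = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
```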
Using the REML estimator Σ̂ for Σ, an approximation is

    C β̂ = C (X^T Σ̂^{−1} X)^− X^T Σ̂^{−1} Y

and for "large" samples:

    C β̂ ~̇ N( Cβ, C (X^T Σ^{−1} X)^− C^T )

Prediction of random effects:

Given the observed responses Y, predict the value of u. For our model,

    [ u ]     ( [ 0 ]   [ G  0 ] )
    [ e ] ~ N ( [ 0 ] , [ 0  R ] )

and Σ = Z G Z^T + R. Then (from Result 4.1)

    [ u ]   [ 0  ]   [ I  0 ] [ u ]     ( [ 0  ]   [ G     G Z^T       ] )
    [ Y ] = [ Xβ ] + [ Z  I ] [ e ] ~ N ( [ Xβ ] , [ Z G   Z G Z^T + R ] )
The Best Linear Unbiased Predictor (BLUP) is the b.l.u.e. for

    E(u|Y) = E(u) + G Z^T (Z G Z^T + R)^{−1} (Y − E(Y))
           = 0 + G Z^T (Z G Z^T + R)^{−1} (Y − Xβ)

Substituting the b.l.u.e. for Xβ,

    X β̂_GLS = X (X^T Σ^{−1} X)^− X^T Σ^{−1} Y,

the BLUP for u is

    BLUP(u) = G Z^T Σ^{−1} (Y − X β̂_GLS)
            = G Z^T Σ^{−1} ( I − X (X^T Σ^{−1} X)^− X^T Σ^{−1} ) Y

when G and Σ = Z G Z^T + R are known.

Substituting REML estimators Ĝ and R̂ for G and R, an approximate BLUP for u is

    û = Ĝ Z^T Σ̂^{−1} (Y − X β̂)
      = Ĝ Z^T Σ̂^{−1} ( I − X (X^T Σ̂^{−1} X)^− X^T Σ̂^{−1} ) Y

For "large" samples, the distribution of û is approximately multivariate normal with mean vector 0 and covariance matrix

    G Z^T Σ^{−1} (I − P) Σ (I − P)^T Σ^{−1} Z G

where P = X (X^T Σ^{−1} X)^− X^T Σ^{−1}.
Given estimates Ĝ, R̂, and Σ̂ = Z Ĝ Z^T + R̂, β̂ and û provide a solution to the mixed model equations:

    [ X^T R̂^{−1} X    X^T R̂^{−1} Z            ] [ β̂ ]   [ X^T R̂^{−1} Y ]
    [ Z^T R̂^{−1} X    Z^T R̂^{−1} Z + Ĝ^{−1}  ] [ û ] = [ Z^T R̂^{−1} Y ]

A generalized inverse of

    [ X^T R̂^{−1} X    X^T R̂^{−1} Z           ]
    [ Z^T R̂^{−1} X    Z^T R̂^{−1} Z + Ĝ^{−1} ]

is used to approximate the covariance matrix of (β̂, û).
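The mixed model equations can be verified on a tiny example. The sketch below (layout, data, and solver are mine, not from the source) solves the equations for a balanced one-way random-effects model y_ij = β + u_i + e_ij with 2 groups of 2 observations, G = σ_u^2 I and R = σ_e^2 I, and checks the solution against the closed-form shrinkage BLUP for this design:

```python
# Solve Henderson's mixed model equations for a 2-group, 2-obs layout
# (X = 1, Z = group indicators, sigma2_u = 2, sigma2_e = 1).

def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting (square A)."""
    n = len(A)
    A = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(A[i][c]))
        A[c], A[p] = A[p], A[c]
        for i in range(n):
            if i != c:
                f = A[i][c] / A[c][c]
                A[i] = [a - f * ac for a, ac in zip(A[i], A[c])]
    return [A[i][n] / A[i][i] for i in range(n)]

y = [1.0, 3.0, 6.0, 10.0]
sigma2_u, sigma2_e = 2.0, 1.0

# [ X'R^-1 X   X'R^-1 Z          ] [beta]   [X'R^-1 y]
# [ Z'R^-1 X   Z'R^-1 Z + G^-1   ] [ u  ] = [Z'R^-1 y]   (R = I here)
A = [[4.0, 2.0, 2.0],
     [2.0, 2.0 + 1.0 / sigma2_u, 0.0],
     [2.0, 0.0, 2.0 + 1.0 / sigma2_u]]
b = [sum(y), y[0] + y[1], y[2] + y[3]]
beta, u1, u2 = solve(A, b)

# Closed-form BLUP for this balanced design:
# u_i = (n_i*sigma2_u / (n_i*sigma2_u + sigma2_e)) * (ybar_i - beta)
shrink = 2 * sigma2_u / (2 * sigma2_u + sigma2_e)
assert abs(u1 - shrink * ((y[0] + y[1]) / 2 - beta)) < 1e-9
assert abs(u2 - shrink * ((y[2] + y[3]) / 2 - beta)) < 1e-9
```

Note how the Ĝ^{−1} term in the bottom-right block is what shrinks û toward zero relative to the fixed-effects solution.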