Mixed Model Analysis

Basic model:

  Y = Xβ + Zu + e

where

  X is an n×p model matrix of known constants
  β is a p×1 vector of "fixed" unknown parameter values
  Z is an n×q model matrix of known constants
  u is a q×1 random vector
  e is an n×1 vector of random errors

with

  E(e) = 0,  E(u) = 0,  Cov(e, u) = 0,
  Var(e) = R,  Var(u) = G.

Then

  E(Y) = E(Xβ + Zu + e) = Xβ + Z E(u) + E(e) = Xβ

  Var(Y) = Var(Xβ + Zu + e) = Var(Zu) + Var(e) = Z G Z^T + R
Example 10.1: Random Block Effects

Comparison of four processes for producing penicillin:

  Process A
  Process B
  Process C
  Process D    (levels of a "fixed" treatment factor)

Blocks correspond to different batches of an important raw material,
corn steep liquor:

  - Random sample of five batches
  - Split each batch into four parts: run each process on one part
  - Randomize the order in which the processes are run within each
    batch

Normal-theory mixed model:

  [ u ]     ( [ 0 ]   [ G  0 ] )
  [   ] ~ N ( [   ] , [      ] )
  [ e ]     ( [ 0 ]   [ 0  R ] )

Then,

  Y ~ N(Xβ, ZGZ^T + R)

Here, batch effects are considered as random block effects:

  - Batches are sampled from a population of many possible batches
  - To repeat this experiment you would need to use a different set
    of batches of raw material

Data Source: Box, Hunter & Hunter (1978), Statistics for
Experimenters, Wiley & Sons, New York.

Data file:   penclln.dat
SAS code:    penclln.sas
S-PLUS code: penclln.ssc
Model:

  Y_ij = μ + α_i + β_j + e_ij

where

  Y_ij    is the yield for the i-th process applied to the j-th batch

  μ + α_i is the mean yield for the i-th process, averaging across
          the entire population of possible batches

  β_j     is a random batch effect:  β_j ~ NID(0, σ_β²)

  e_ij    is a random error:  e_ij ~ NID(0, σ_e²)

and any e_ij is independent of any β_j.
Here

  μ_i = E(Y_ij)
      = E(μ + α_i + β_j + e_ij)
      = μ + α_i + E(β_j) + E(e_ij)
      = μ + α_i,    i = 1, 2, 3, 4,

represents the mean yield for the i-th process, averaging across all
possible batches.

In S-PLUS you could use the "treatment" constraints where α_1 = 0.
Then

  μ = μ_1 is the mean yield for process A
  α_i = μ_i − μ_1,    i = 1, 2, 3, 4.

PROC GLM and PROC MIXED in SAS would fit a restricted model with
α_4 = 0. Then

  μ = μ_4 is the mean yield for process D
  α_i = μ_i − μ_4,    i = 1, 2, 3, 4.

Alternatively, you could choose the solution to the normal equations
corresponding to the "sum" constraints where

  α_1 + α_2 + α_3 + α_4 = 0.

Then

  μ = (μ_1 + μ_2 + μ_3 + μ_4)/4 is an overall mean yield

  α_i = μ_i − μ is the difference between the mean yield for the
  i-th process and the overall mean yield.
Variance-covariance structure:

  Var(Y_ij) = Var(μ + α_i + β_j + e_ij)
            = Var(β_j + e_ij)
            = Var(β_j) + Var(e_ij)
            = σ_β² + σ_e²    for all (i, j)

Different runs on the same batch:

  Cov(Y_ij, Y_kj) = Cov(μ + α_i + β_j + e_ij, μ + α_k + β_j + e_kj)
                  = Cov(β_j + e_ij, β_j + e_kj)
                  = Cov(β_j, β_j) + Cov(β_j, e_kj)
                      + Cov(e_ij, β_j) + Cov(e_ij, e_kj)
                  = Var(β_j)
                  = σ_β²    for all i ≠ k

Correlation among yields for runs on the same batch:

  ρ = Cov(Y_ij, Y_kj) / √(Var(Y_ij) Var(Y_kj))
    = σ_β² / (σ_β² + σ_e²)    for i ≠ k

Results for runs on different batches are uncorrelated (independent):

  Cov(Y_ij, Y_kℓ) = 0    for j ≠ ℓ
Write this model in the form Y = Xβ + Zu + e.

Results from the four runs on a single batch:

      [ Y_1j ]   [ σ_β²+σ_e²   σ_β²        σ_β²        σ_β²      ]
  Var [ Y_2j ] = [ σ_β²        σ_β²+σ_e²   σ_β²        σ_β²      ]
      [ Y_3j ]   [ σ_β²        σ_β²        σ_β²+σ_e²   σ_β²      ]
      [ Y_4j ]   [ σ_β²        σ_β²        σ_β²        σ_β²+σ_e² ]

                = σ_β² J + σ_e² I

where J is a 4×4 matrix of ones and I is the 4×4 identity matrix.
"
"
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
4
Y11
Y21
Y31
Y41
Y12
Y22
Y32
Y42
Y13
Y23
Y33
Y43
Y14
Y24
Y34
Y44
Y15
Y25
Y35
Y45
3
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
5
1
1
1
1
1
1
1
1
1
= 11
1
1
1
1
1
1
1
1
1
2
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
4
1
1
1
1
0
0
0
0
0
+ 00
0
0
0
0
0
0
0
0
0
2
matrix identity
of
matrix
ones
This special type of covariance structure is
called compound symmetry
741
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
4
1
0
0
0
1
0
0
0
1
0
0
0
1
0
0
0
1
0
0
0
0
0
0
0
1
1
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
1
0
0
0
1
0
0
0
1
0
0
0
1
0
0
0
1
0
0
0
0
0
0
0
0
0
0
1
1
1
1
0
0
0
0
0
0
0
0
0
0
1
0
0
0
1
0
0
0
1
0
0
0
1
0
0
0
1
0
03
07
07
1 77
07
0 77
0 72 3
17 0 77 6 1 7
0 7 4 2 5
0 7 3
1 77 4
07
07
0 77
17
07
05
0
1
2
e11
0 03
0 07
6 ee21
0 07
6 31
6 e41
0 0 77
6 e12
0 07
6
e
0 0 77
0 0 7 2 3 666 e2232
0 0 7 1 6 e42
0 0 77 6 2 7 66 e13
0 0 7 4 3 5 + 6 e23
0 0 7 4 6 e33
0 0 77 5 66 e43
e
1 07
6 e14
1 07
6 24
1 0 77
6 e34
6 e44
1 07
6 e
0 17
6 15
0 15
4 e25
e35
0 1
e45
0 1
3
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
5
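As a small illustration, the following R/S-PLUS-style sketch builds
this covariance structure with Kronecker products; the numerical
values are simply the ANOVA estimates (σ̂_β² = 11.79, σ̂_e² = 18.83)
obtained later in these notes, and the variable names are ours.

  # Sketch: compound-symmetric covariance for one batch and the full
  # Var(Y) of Example 10.1 (sigma.b2, sigma.e2 are assumed values).
  sigma.b2 <- 11.79                    # sigma_beta^2
  sigma.e2 <- 18.83                    # sigma_e^2
  J4 <- matrix(1, 4, 4)                # 4 x 4 matrix of ones
  V.batch <- sigma.b2 * J4 + sigma.e2 * diag(4)  # one batch
  Z <- kronecker(diag(5), matrix(1, 4, 1))       # Z = I_5 (x) 1_4, 20 x 5
  V <- sigma.b2 * Z %*% t(Z) + sigma.e2 * diag(20)  # block-diagonal Var(Y)
  rho <- sigma.b2 / (sigma.b2 + sigma.e2)        # within-batch correlation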
Here

  G = Var(u) = σ_β² I_{5×5},    R = Var(e) = σ_e² I_{n×n}

and

  Var(Y) = Var(Xβ + Zu + e)
         = Var(Zu) + Var(e)
         = Z G Z^T + R
         = σ_β² Z Z^T + σ_e² I
         = block-diag(σ_β²J + σ_e²I, ..., σ_β²J + σ_e²I),

with one 4×4 block σ_β²J + σ_e²I for each of the five batches.

Example 10.2: Hierarchical Random Effects Model

Analysis of sources of variation in a process used to monitor the
production of a pigment paste.

Current Procedure:

  - Sample barrels of pigment paste
  - Take one sample from each barrel
  - Send the sample to a lab for determination of moisture content

Measured Response: (Y) moisture content of the pigment paste (units
of one tenth of 1%).
Current status: Moisture content varied about a mean of approximately
25 (or 2.5%), with standard deviation of about 6.

Problem: Variation in moisture content was regarded as excessive.

Sources of Variation
[diagram: barrels, samples within barrels, lab tests within samples]

Data Collection: Hierarchical (or nested) Study Design

  - Sampled b barrels of pigment paste
  - s samples are taken from the content of each barrel
  - Each sample is mixed and divided into r parts; each part is
    sent to the lab

There are n = (b)(s)(r) observations.
Model:

  Y_ijk = μ + α_i + δ_ij + e_ijk

where

  Y_ijk is the moisture content determination for the k-th part of
        the j-th sample from the i-th barrel

  μ     is the overall mean moisture content

  α_i   is a random barrel effect:  α_i ~ NID(0, σ_α²)

  δ_ij  is a random sample effect:  δ_ij ~ NID(0, σ_δ²)

  e_ijk corresponds to random measurement (lab analysis) error:
        e_ijk ~ NID(0, σ_e²)

Covariance structure:

  Var(Y_ijk) = Var(μ + α_i + δ_ij + e_ijk)
             = Var(α_i) + Var(δ_ij) + Var(e_ijk)
             = σ_α² + σ_δ² + σ_e²

Two parts of one sample:

  Cov(Y_ijk, Y_ijℓ) = Cov(μ + α_i + δ_ij + e_ijk, μ + α_i + δ_ij + e_ijℓ)
                    = Cov(α_i, α_i) + Cov(δ_ij, δ_ij)
                    = σ_α² + σ_δ²    for k ≠ ℓ

Observations from different samples taken from the same barrel:

  Cov(Y_ijk, Y_imℓ) = Cov(α_i, α_i) = σ_α²    for j ≠ m

Observations from different barrels:

  Cov(Y_ijk, Y_cmℓ) = 0    for i ≠ c
Write this model in the form Y = Xβ + Zu + e.

In this study

  b = 15 barrels were sampled

  s = 2 samples were taken from each barrel

  r = 2 sub-samples were analyzed from each sample taken from each
      barrel

Data file:   pigment.dat
SAS code:    pigment.sas
S-PLUS code: pigment.ssc
With the observations ordered part-within-sample-within-barrel,

  Y = (Y_111, Y_112, Y_121, Y_122, Y_211, Y_212, Y_221, Y_222, ...,
       Y_{15,1,1}, Y_{15,1,2}, Y_{15,2,1}, Y_{15,2,2})^T    (60×1)

  X = 1_60,    β = μ

  Z = [ I_15 ⊗ 1_4    I_30 ⊗ 1_2 ]    (60×45)

  u = (α_1, α_2, ..., α_15, δ_{1,1}, δ_{1,2}, δ_{2,1}, δ_{2,2}, ...,
       δ_{15,1}, δ_{15,2})^T

  e = (e_111, e_112, e_121, e_122, ..., e_{15,2,2})^T

The first 15 columns of Z are barrel indicators (each a block of
sr = 4 ones) and the remaining 30 columns are sample indicators
(each a block of r = 2 ones).
where

  R = Var(e) = σ_e² I

                [ σ_α² I     0      ]
  G = Var(u) =  [                   ]
                [    0    σ_δ² I    ]

Then

  E(Y) = Xβ = 1_n μ

  Var(Y) = Σ = Z G Z^T + R

               [ σ_α² I_b      0       ]
           = Z [                       ] Z^T + σ_e² I_bsr
               [    0      σ_δ² I_bs   ]

           = σ_α² (I_b ⊗ J_sr) + σ_δ² (I_bs ⊗ J_r) + σ_e² I_bsr

because Z = [ I_b ⊗ 1_sr    I_bs ⊗ 1_r ], where 1_sr is an sr×1
vector of ones and 1_r is an r×1 vector of ones.

Analysis of Mixed Linear Models

  Y = Xβ + Zu + e

where X (n×p) and Z (n×q) are known model matrices and

  [ u ]     ( [ 0 ]   [ G  0 ] )
  [   ] ~ N ( [   ] , [      ] )
  [ e ]     ( [ 0 ]   [ 0  R ] )

Then

  Y ~ N(Xβ, Σ)    where    Σ = ZGZ^T + R.
Some objectives:

(i)   Inferences about estimable functions of fixed effects:
        - Point estimates
        - Confidence intervals
        - Tests of hypotheses

(ii)  Estimation of variance components (elements of G and R)

(iii) Predictions of random effects (BLUP)

(iv)  Predictions of future observations

Methods of Estimation

I. Ordinary Least Squares Estimation:

For the linear model

  Y_j = β_0 + β_1 X_1j + ... + β_r X_rj + ε_j,    j = 1, ..., n,

an OLS estimator for β = (β_0, ..., β_r)^T is any
b = (b_0, ..., b_r)^T that minimizes

  Q(b) = Σ_{j=1}^n [Y_j − b_0 − b_1 X_1j − ... − b_r X_rj]².
Normal equations (estimating equations):

  ∂Q(b)/∂b_0 = −2 Σ_{j=1}^n (Y_j − b_0 − b_1 X_1j − ... − b_r X_rj) = 0

  ∂Q(b)/∂b_i = −2 Σ_{j=1}^n X_ij (Y_j − b_0 − b_1 X_1j − ... − b_r X_rj) = 0
               for i = 1, 2, ..., r

The matrix form of these equations is

  (X^T X) b = X^T Y

and solutions have the form

  b = (X^T X)^- X^T Y.

The OLS estimator for C^T β is

  C^T b = C^T (X^T X)^- X^T Y

where b = (X^T X)^- X^T Y is a solution to the normal equations.

The Gauss-Markov Theorem (Result 3.11) cannot be applied because it
requires uncorrelated responses. In these models

  Var(Y) = ZGZ^T + R ≠ σ² I.

Hence, the OLS estimator of an estimable function C^T β is not
necessarily a best linear unbiased estimator (b.l.u.e.).
The OLS estimator C^T b is a linear function of Y, with

  E(C^T b) = C^T β

  Var(C^T b) = C^T (X^T X)^- X^T (ZGZ^T + R) X (X^T X)^- C.

If Y ~ N(Xβ, ZGZ^T + R), then C^T b has a

  N(C^T β, C^T (X^T X)^- X^T (ZGZ^T + R) X (X^T X)^- C)

distribution.

II. Generalized Least Squares (GLS) Estimation:

Suppose

  E(Y) = Xβ    and    Var(Y) = ZGZ^T + R = Σ

is known. Then a GLS estimator for β is any b that minimizes

  Q(b) = (Y − Xb)^T Σ^{-1} (Y − Xb)

(from Defn. 3.8). The estimating equations are

  (X^T Σ^{-1} X) b = X^T Σ^{-1} Y

and

  b_GLS = (X^T Σ^{-1} X)^- X^T Σ^{-1} Y

is a solution.
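A direct computational sketch of these estimating equations in
R/S-PLUS style (the function name is ours; X is assumed to have full
column rank, so an ordinary inverse stands in for a generalized
inverse):

  # GLS sketch: b_GLS minimizes (Y - Xb)' Sigma^{-1} (Y - Xb).
  gls <- function(Y, X, Sigma) {
    Si <- solve(Sigma)                         # Sigma^{-1}
    solve(t(X) %*% Si %*% X, t(X) %*% Si %*% Y)
  }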
For any estimable function C^T β, the unique b.l.u.e. is

  C^T b_GLS = C^T (X^T Σ^{-1} X)^- X^T Σ^{-1} Y

with

  Var(C^T b_GLS) = C^T (X^T Σ^{-1} X)^- X^T Σ^{-1} X [(X^T Σ^{-1} X)^-]^T C
                 = C^T (X^T Σ^{-1} X)^- C

(from Result 3.12). If Y ~ N(Xβ, Σ), then

  C^T b_GLS ~ N(C^T β, C^T (X^T Σ^{-1} X)^- C).

When G and/or R contain unknown parameters, you could obtain an
"approximate BLUE" by replacing the unknown parameters with
consistent estimators to obtain

  Σ̂ = Z Ĝ Z^T + R̂

and

  C^T b̂_GLS = C^T (X^T Σ̂^{-1} X)^- X^T Σ̂^{-1} Y.
Note that C^T b̂_GLS is not a linear function of Y, and it is not a
best linear unbiased estimator (BLUE).

  - See Kackar and Harville (1981, 1984) for conditions under which
    C^T b̂_GLS is an unbiased estimator for C^T β.

  - C^T (X^T Σ̂^{-1} X)^- C tends to "underestimate"
    Var(C^T b̂_GLS) (see Eaton (1984)).

For "large" samples, C^T b̂_GLS is approximately distributed as

  N(C^T β, C^T (X^T Σ^{-1} X)^- C).

Variance component estimation:

  - Estimation of parameters in G and R

  - Crucial to the estimation of estimable functions of fixed
    effects (e.g. E(Y) = Xβ)

  - Of interest in its own right (sources of variation in the
    pigment paste production example)
Basic approaches:

(i)   ANOVA methods (method of moments)
(ii)  Maximum likelihood estimation (ML)
(iii) Restricted maximum likelihood estimation (REML)

ANOVA method (method of moments):

  - Compute an ANOVA table
  - Set observed values of mean squares equal to their expectations
  - Solve the resulting equations

Example 10.1 Penicillin production:

  Y_ij = μ + α_i + β_j + e_ij

where β_j ~ NID(0, σ_β²) and e_ij ~ NID(0, σ_e²).

  Source of    d.f.   Sums of Squares
  Variation
  Blocks         4    SS_blocks = a Σ_{j=1}^b (Ȳ_.j − Ȳ_..)²
  Processes      3    SS_processes = b Σ_{i=1}^a (Ȳ_i. − Ȳ_..)²
  Error         12    SSE = Σ_{i=1}^a Σ_{j=1}^b (Y_ij − Ȳ_i. − Ȳ_.j + Ȳ_..)²
  C. total      19    Σ_{i=1}^a Σ_{j=1}^b (Y_ij − Ȳ_..)²
Start at the bottom:

  MS_error = SSE / ((a − 1)(b − 1)),    E(MS_error) = σ_e²

Then an unbiased estimator for σ_e² is

  σ̂_e² = MS_error.

Next, consider the mean square for the random block effects:

  MS_blocks = SS_blocks / (b − 1),    E(MS_blocks) = σ_e² + a σ_β²

where a is the number of observations for each block. Then

  (E(MS_blocks) − E(MS_error)) / a = σ_β²

and an unbiased estimator for σ_β² is

  σ̂_β² = (MS_blocks − MS_error) / a.

For the penicillin data,

  σ̂_e² = MS_error = 18.83

  σ̂_β² = (MS_blocks − MS_error) / 4 = (66.0 − 18.83) / 4 = 11.79

  V̂ar(Y_ij) = σ̂_β² + σ̂_e² = 11.79 + 18.83 = 30.62
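The same method-of-moments computation can be sketched in R/S-PLUS
style from the penicillin data (assuming a data frame penclln with
factors Batch and Process, as constructed in the code later in these
notes; the variable names are ours):

  # ANOVA (method-of-moments) estimates for Example 10.1.
  fit <- aov(Yield ~ Batch + Process, data = penclln)
  ms  <- summary(fit)[[1]][, "Mean Sq"]  # rows: Batch, Process, Residuals
  a   <- 4                               # observations per batch
  sigma.e2 <- ms[3]                      # MS_error = 18.83
  sigma.b2 <- (ms[1] - ms[3]) / a        # (66.0 - 18.83)/4 = 11.79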
/* This is a program for analyzing the
   penicillin data from Box, Hunter, and
   Hunter. It is posted in the file
   penclln.sas

   First enter the data */

data set1;
  infile 'penclln.dat';
  input batch process $ yield;
run;

/* Compute the ANOVA table, formulas for
   expectations of mean squares, process
   means and their standard errors */

proc glm data=set1;
  class batch process;
  model yield = batch process / e e3;
  random batch / q test;
  lsmeans process / stderr pdiff tdiff;
  output out=set2 r=resid p=yhat;
run;

/* Compute a normal probability plot for
   the residuals and the Shapiro-Wilk test
   for normality */

proc rank data=set2 normal=blom out=set2;
  var resid;
  ranks q;
run;

proc univariate data=set2 normal plot;
  var resid;
run;

goptions cback=white colors=(black)
  target=win device=winprtc rotate=portrait;

axis1 label=(h=2.5 r=0 a=90 f=swiss 'Residuals')
  value=(f=swiss h=2.0) w=3.0 length=5.0 in;
axis2 label=(h=2.5 f=swiss 'Standard Normal Quantiles')
  value=(f=swiss h=2.0) w=3.0 length=5.0 in;
axis3 label=(h=2.5 f=swiss 'Production Process')
  value=(f=swiss h=2.0) w=3.0 length=5.0 in;
symbol1 v=circle i=none h=2 w=3 c=black;

proc gplot data=set2;
  plot resid*q / vaxis=axis1 haxis=axis2;
  title h=3.5 f=swiss c=black 'Normal Probability Plot';
run;

proc gplot data=set2;
  plot resid*process / vaxis=axis1 haxis=axis3;
  title h=3.5 f=swiss c=black 'Residual Plot';
run;
/* Fit the same model using PROC MIXED. Compute
   REML estimates of variance components. Note
   that PROC MIXED provides appropriate standard
   errors for process means when block effects
   are random; PROC GLM does not provide correct
   standard errors for process means */
proc mixed data=set1;
class process batch;
model yield = process / ddfm=satterth solution;
random batch / type=vc g solution cl alpha=.05;
lsmeans process / pdiff tdiff;
run;
General Linear Models Procedure
Class Level Information

  Class     Levels   Values
  BATCH        5     1 2 3 4 5
  PROCESS      4     A B C D

Number of observations in data set = 20

General Form of Estimable Functions

  Effect          Coefficients
  INTERCEPT       L1
  BATCH     1     L2
  BATCH     2     L3
  BATCH     3     L4
  BATCH     4     L5
  BATCH     5     L1-L2-L3-L4-L5
  PROCESS   A     L7
  PROCESS   B     L8
  PROCESS   C     L9
  PROCESS   D     L1-L7-L8-L9

Type III Estimable Functions for: BATCH

  Effect          Coefficients
  INTERCEPT       0
  BATCH     1     L2
  BATCH     2     L3
  BATCH     3     L4
  BATCH     4     L5
  BATCH     5     -L2-L3-L4-L5
  PROCESS   A     0
  PROCESS   B     0
  PROCESS   C     0
  PROCESS   D     0

Type III Estimable Functions for: PROCESS

  Effect          Coefficients
  INTERCEPT       0
  BATCH     1     0
  BATCH     2     0
  BATCH     3     0
  BATCH     4     0
  BATCH     5     0
  PROCESS   A     L7
  PROCESS   B     L8
  PROCESS   C     L9
  PROCESS   D     -L7-L8-L9
Dependent Variable: YIELD

  Source       DF   Sum of Squares   Mean Square   F Value   Pr > F
  Model         7       334.00          47.71        2.53    0.0754
  Error        12       226.00          18.83
  Cor. Total   19       560.00

  R-Square     C.V.    Root MSE    YIELD Mean
    0.596     5.046     4.3397        86.0

  Source       DF   Type III SS   Mean Square   F Value   Pr > F
  BATCH         4      264.0          66.0        3.50    0.0407
  PROCESS       3       70.0          23.3        1.24    0.3387

Quadratic Forms of Fixed Effects in the Expected Mean Squares

Source: Type III Mean Square for PROCESS

               PROCESS A   PROCESS B   PROCESS C   PROCESS D
  PROCESS A      3.750      -1.250      -1.250      -1.250
  PROCESS B     -1.250       3.750      -1.250      -1.250
  PROCESS C     -1.250      -1.250       3.750      -1.250
  PROCESS D     -1.250      -1.250      -1.250       3.750

  Source     Type III Expected Mean Square
  BATCH      Var(Error) + 4 Var(BATCH)
  PROCESS    Var(Error) + Q(PROCESS)
Tests of Hypotheses for Mixed Model Analysis of Variance

Dependent Variable: YIELD

  Source: BATCH
  Error: MS(Error)
    DF   Type III MS   Denominator DF   Denominator MS   F Value   Pr > F
     4        66             12             18.83        3.5044    0.0407

  Source: PROCESS
  Error: MS(Error)
    DF   Type III MS   Denominator DF   Denominator MS   F Value   Pr > F
     3       23.33           12             18.83        1.2389    0.3387

Least Squares Means

  PROCESS   YIELD LSMEAN   Std Error   Pr > |T|
     A           84          1.941      0.0001
     B           85          1.941      0.0001
     C           89          1.941      0.0001
     D           86          1.941      0.0001

  T for H0: LSMEAN(i) = LSMEAN(j) / Pr > |T|

   i/j        1              2              3              4
    1         .         -0.364/0.722   -1.822/0.093   -0.729/0.480
    2    0.364/0.722         .         -1.457/0.171   -0.364/0.722
    3    1.822/0.094    1.457/0.171         .          1.093/0.296
    4    0.729/0.480    0.364/0.723   -1.093/0.296         .

The MIXED Procedure

Model Information

  Data Set                     WORK.SET1
  Dependent Variable           yield
  Covariance Structure         Variance Components
  Estimation Method            REML
  Residual Variance Method     Profile
  Fixed Effects SE Method      Model-Based
  Degrees of Freedom Method    Satterthwaite

Class Level Information

  Class     Levels   Values
  PROCESS      4     A B C D
  BATCH        5     1 2 3 4 5
Iteration History

  Iteration   Evaluations   -2 Res Log Like    Criterion
      0            1         106.59285141
      1            1         103.82994387     0.00000000

Convergence criteria met.

Fit Statistics

  Res Log Likelihood                  -51.9
  Akaike's Information Criterion      -53.9
  Schwarz's Bayesian Criterion        -53.5
  -2 Res Log Likelihood               103.8

Estimated G Matrix

  Row   Effect   batch    Col1      Col2      Col3      Col4      Col5
   1    batch      1     11.7917
   2    batch      2              11.7917
   3    batch      3                        11.7917
   4    batch      4                                  11.7917
   5    batch      5                                            11.7917

Covariance Parameter Estimates

  Cov Parm    Estimate
  batch        11.7917
  Residual     18.8333

Solution for Fixed Effects

  Effect      process   Estimate   Standard Error    DF    t Value   Pr > |t|
  Intercept              86.0000       2.4749       11.1    34.75     <.0001
  process       A        -2.0000       2.7447       12      -0.73     0.4802
  process       B        -1.0000       2.7447       12      -0.36     0.7219
  process       C         3.0000       2.7447       12       1.09     0.2958
  process       D         0            .             .        .        .
Type 3 Tests of Fixed Effects

  Effect    Num DF   Den DF   F Value   Pr > F
  process      3       12       1.24    0.3387

Solution for Random Effects

  Effect   batch   Estimate   Std Err Pred    DF     t Value   Pr > |t|
  batch      1      4.2879       2.2473      5.29     1.91      0.1115
  batch      2     -2.1439       2.2473      5.29    -0.95      0.3816
  batch      3     -0.7146       2.2473      5.29    -0.32      0.7627
  batch      4      1.4293       2.2473      5.29     0.64      0.5513
  batch      5     -2.8586       2.2473      5.29    -1.27      0.2564

  Effect   batch   Alpha    Lower     Upper
  batch      1      0.05   -1.3954    9.9712
  batch      2      0.05   -7.8273    3.5394
  batch      3      0.05   -6.3980    4.9687
  batch      4      0.05   -4.2540    7.1126
  batch      5      0.05   -8.5419    2.8247

Least Squares Means

  Effect    process   Estimate   Standard Error    DF    t Value   Pr > |t|
  process      A       84.0000       2.4749       11.1    33.94     <.0001
  process      B       85.0000       2.4749       11.1    34.35     <.0001
  process      C       89.0000       2.4749       11.1    35.96     <.0001
  process      D       86.0000       2.4749       11.1    34.75     <.0001

Differences of Least Squares Means

  Effect    process   Estimate   Standard Error   DF   t Value   Pr > |t|
  process    A - B     -1.0000       2.7447       12    -0.36     0.7219
  process    A - C     -5.0000       2.7447       12    -1.82     0.0935
  process    A - D     -2.0000       2.7447       12    -0.73     0.4802
  process    B - C     -4.0000       2.7447       12    -1.46     0.1707
  process    B - D     -1.0000       2.7447       12    -0.36     0.7219
  process    C - D      3.0000       2.7447       12     1.09     0.2958
Inferences about treatment means:

  Y_ij = μ + α_i + β_j + e_ij

Consider the sample mean (one observation for each treatment in each
block):

  Ȳ_i. = (1/b) Σ_{j=1}^b Y_ij

Then

            { μ + α_i                         for random block effects,
            {                                 β_j ~ NID(0, σ_β²)
  E(Ȳ_i.) = {
            { μ + α_i + (1/b) Σ_{j=1}^b β_j   for fixed block effects

and

  Var(Ȳ_i.) = Var( (1/b) Σ_{j=1}^b Y_ij )
            = (1/b²) Σ_{j=1}^b Var(Y_ij)

              { (1/b)(σ_e² + σ_β²)   for random block effects
            = {
              { (1/b) σ_e²           for fixed block effects
Variance estimates & confidence intervals:

Fixed additive block effects:

  S²_{Ȳi.} = (1/b) σ̂_e² = (1/b) MS_error

The standard error for Ȳ_i. is

  S_{Ȳi.} = √(MS_error / b) = 1.941

t-tests: Reject H0: μ + α_i + (1/b) Σ_{j=1}^b β_j = d if

  |t| = |Ȳ_i. − d| / √(MS_error / b) > t_{(a−1)(b−1), α/2}

where (a−1)(b−1) is the degrees of freedom for SS_error.

A (1 − α) × 100% confidence interval for
μ + α_i + (1/b) Σ_{j=1}^b β_j is

  Ȳ_i. ± t_{(a−1)(b−1), α/2} √(MS_error / b).

This is what is done by the LSMEANS option in the GLM procedure in
SAS, even when you specify RANDOM BATCH; it is also what is done in
the MIXED procedure in SAS when batch effects are not random.
Models with random additive block effects:

  S²_{Ȳi.} = (1/b)(σ̂_e² + σ̂_β²)

           = (1/b)[MS_error + (MS_blocks − MS_error)/a]

           = ((a−1)/(ab)) MS_error + (1/(ab)) MS_blocks

           = (1/(ab(b−1))) [SS_error + SS_blocks]

Since

  MS_error  ~ σ_e² χ²_{(a−1)(b−1)} / ((a−1)(b−1))

  MS_blocks ~ (σ_e² + a σ_β²) χ²_{b−1} / (b−1),

the distribution of S²_{Ȳi.} is not a multiple of a central
chi-square random variable.

The standard error for Ȳ_i. is

  S_{Ȳi.} = √( ((a−1)/(ab)) MS_error + (1/(ab)) MS_blocks ) = 2.4749

An approximate (1 − α) × 100% confidence interval for μ + α_i is

  Ȳ_i. ± t_{ν, α/2} S_{Ȳi.}

where

  ν = [ ((a−1)/(ab)) MS_error + (1/(ab)) MS_blocks ]²
      / ( [((a−1)/(ab)) MS_error]² / ((a−1)(b−1))
          + [(1/(ab)) MS_blocks]² / (b−1) )

    = 11.075.
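A numerical sketch of this degrees-of-freedom computation for the
penicillin data (variable names are ours):

  # Cochran-Satterthwaite df for the variance of a process mean.
  a <- 4; b <- 5
  ms.error  <- 18.83
  ms.blocks <- 66.0
  w  <- c((a - 1) / (a * b) * ms.error,   # the a_i * MS_i terms
          1 / (a * b) * ms.blocks)
  df <- c((a - 1) * (b - 1), b - 1)       # 12 and 4
  s2 <- sum(w)                            # S^2, so S = sqrt(s2) = 2.4749
  v  <- s2^2 / sum(w^2 / df)              # about 11.07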
Result 10.1: Cochran-Satterthwaite approximation

Suppose MS_1, MS_2, ..., MS_k are mean squares with independent
distributions

  (df_i) MS_i / E(MS_i) ~ χ²_{df_i},

where df_i denotes the degrees of freedom for MS_i. Then, for
positive constants a_i > 0, i = 1, 2, ..., k, the distribution of

  S² = a_1 MS_1 + a_2 MS_2 + ... + a_k MS_k

is approximated by

  V S² / E(S²) ~ χ²_V   (approximately),

where

  V = [E(S²)]² / ( [a_1 E(MS_1)]²/df_1 + ... + [a_k E(MS_k)]²/df_k )

is the value for the degrees of freedom.
In practice, the degrees of freedom are evaluated as

  V = [S²]² / ( (a_1 MS_1)²/df_1 + ... + (a_k MS_k)²/df_k ).

These are called the Cochran-Satterthwaite degrees of freedom.

  Cochran, W.G. (1951) Testing a Linear Relation among Variances,
  Biometrics 7, 17-32.

Difference between two treatment means:

  E(Ȳ_i. − Ȳ_k.) = E( (1/b) Σ_{j=1}^b Y_ij − (1/b) Σ_{j=1}^b Y_kj )

                 = E( (1/b) Σ_{j=1}^b (Y_ij − Y_kj) )

                 = (1/b) Σ_{j=1}^b E( α_i − α_k + e_ij − e_kj )

                 = α_i − α_k            (the β_j terms cancel)

                 = (μ + α_i) − (μ + α_k)

whether block effects are fixed or random.
  Var(Ȳ_i. − Ȳ_k.) = Var( α_i − α_k + (1/b) Σ_{j=1}^b (e_ij − e_kj) )

                   = (1/b²) Σ_{j=1}^b Var(e_ij − e_kj)

                   = 2σ_e² / b

The standard error for Ȳ_i. − Ȳ_k. is

  S_{Ȳi.−Ȳk.} = √(2 MS_error / b).

t-test: Reject H0: α_i − α_k = 0 if

  |t| = |Ȳ_i. − Ȳ_k.| / √(2 MS_error / b) > t_{(a−1)(b−1), α/2}

where (a−1)(b−1) is the d.f. for MS_error.

A (1 − α) × 100% confidence interval for α_i − α_k is

  (Ȳ_i. − Ȳ_k.) ± t_{(a−1)(b−1), α/2} √(2 MS_error / b).
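For example, a sketch of this interval for α_A − α_C (estimate −5)
in the penicillin data:

  # 95% CI for a difference of two process means (Example 10.1).
  a <- 4; b <- 5
  ms.error <- 18.83
  se.diff  <- sqrt(2 * ms.error / b)        # 2.7447
  tcrit    <- qt(0.975, (a - 1) * (b - 1))  # t_{12,.025} = 2.179
  (-5) + c(-1, 1) * tcrit * se.diff         # A vs C: (-10.98, 0.98)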
# Analyze the penicillin data from Box,
# Hunter, and Hunter. This code is
# posted as penclln.ssc

# Enter the data into a data frame and
# change the Batch and Process variables
# into factors

> penclln <- read.table("penclln.dat",
+     col.names=c("Batch","Process","Yield"))
> penclln$Batch <- as.factor(penclln$Batch)
> penclln$Process <- as.factor(penclln$Process)
> penclln
   Batch Process Yield
1      1       1    89
2      1       2    88
3      1       3    97
4      1       4    94
5      2       1    84
6      2       2    77
7      2       3    92
8      2       4    79
9      3       1    81
10     3       2    87
11     3       3    87
12     3       4    85
13     4       1    87
14     4       2    92
15     4       3    89
16     4       4    84
17     5       1    79
18     5       2    81
19     5       3    80
20     5       4    88
# Construct a profile plot. UNIX users
# should use the motif( ) command to open
# a graphics window

> attach(penclln)
> means <- tapply(Yield,list(Process,Batch),mean)
> par(fin=c(6,7),cex=1.2,lwd=3,mex=1.5)
> x.axis <- unique(Process)
> matplot(c(1,4), c(75,100), type="n", xaxt="n",
+     xlab="Process", ylab="Yield",
+     main= "Penicillin Production Results")
> axis(1, at=(1:4)*1, labels=c("A", "B", "C", "D"))
> matlines(x.axis,means,type='l',lty=1:5,lwd=3)
> legend(4.2,95, legend=c('Batch 1','Batch 2',
+     'Batch 3','Batch 4','Batch 5'), lty=1:5,bty='n')
> detach( )

[Figure: "Penicillin Production Results" — profile plot of Yield
(75 to 100) versus Process (A-D), one line per batch, Batches 1-5.]
# Use the lme( ) function to fit a model
# with additive batch (random) and process
# (fixed) effects and create diagnostic plots.

> options(contrasts=c("contr.treatment",
+     "contr.poly"))
> penclln.lme <- lme(Yield ~ Process,
+     random= ~ 1|Batch, data=penclln,
+     method=c("REML"))

> summary(penclln.lme)
Linear mixed-effects model fit by REML
 Data: penclln
      AIC      BIC    logLik
 83.28607 87.92161 -35.64304

Random effects:
 Formula: ~ 1 | Batch
        (Intercept) Residual
StdDev:    3.433899 4.339739

Fixed effects: Yield ~ Process
            Value Std.Error DF  t-value p-value
(Intercept)    84  2.474874 12 33.94113  <.0001
Process2        1  2.744692 12  0.36434  0.7219
Process3        5  2.744692 12  1.82170  0.0935
Process4        2  2.744692 12  0.72868  0.4802
 Correlation:
         (Intr) Prcss2 Prcss3
Process2 -0.555
Process3 -0.555  0.500
Process4 -0.555  0.500  0.500

Standardized Within-Group Residuals:
       Min         Q1         Med        Q3      Max
 -1.415158 -0.5017351 -0.1643841 0.6829939 1.28365

Number of Observations: 20
Number of Groups: 5
> names(penclln.lme)
 [1] "modelStruct"  "dims"         "contrasts"
 [4] "coefficients" "varFix"       "sigma"
 [7] "apVar"        "logLik"       "numIter"
[10] "groups"       "call"         "method"
[13] "fitted"       "residuals"    "fixDF"

> # Construct ANOVA table for fixed effects
> anova(penclln.lme)
            numDF denDF  F-value p-value
(Intercept)     1    12 2241.213  <.0001
Process         3    12    1.239  0.3387

> # Estimated parameters for fixed effects
> coef(penclln.lme)
  (Intercept) Process2 Process3 Process4
1    88.28788        1        5        2
2    81.85606        1        5        2
3    83.28535        1        5        2
4    85.42929        1        5        2
5    81.14141        1        5        2

> # BLUP's for random effects
> ranef(penclln.lme)
  (Intercept)
1   4.2878780
2  -2.1439390
3  -0.7146463
4   1.4292927
5  -2.8585854
> # Confidence intervals for fixed effects
> # and estimated standard deviations
> intervals(penclln.lme)
Approximate 95% confidence intervals

 Fixed effects:
                  lower est.    upper
(Intercept) 78.6077137   84 89.39229
Process2    -4.9801701    1  6.98017
Process3    -0.9801701    5 10.98017
Process4    -3.9801701    2  7.98017

 Random Effects:
  Level: Batch
                    lower     est.    upper
sd((Intercept)) 0.8555882 3.433899 13.78193

 Within-group standard error:
   lower     est.   upper
2.464606 4.339739 7.64152

> # Create a listing of the original data,
> # residuals and predicted values
> data.frame(penclln$Process,penclln$Batch,
+     penclln$Yield,
+     Pred=penclln.lme$fitted,
+     Resid=round(penclln.lme$resid,3))
   X1 X2 X3 Pred.fixed Pred.Batch Resid.fixed Resid.Batch
1   1  1 89         84   88.28788           5       0.712
2   2  1 88         85   89.28788           3      -1.288
3   3  1 97         89   93.28788           8       3.712
4   4  1 94         86   90.28788           8       3.712
5   1  2 84         84   81.85606           0       2.144
6   2  2 77         85   82.85606          -8      -5.856
7   3  2 92         89   86.85606           3       5.144
8   4  2 79         86   83.85606          -7      -4.856
9   1  3 81         84   83.28535          -3      -2.285
10  2  3 87         85   84.28535           2       2.715
11  3  3 87         89   88.28535          -2      -1.285
12  4  3 85         86   85.28535          -1      -0.285
13  1  4 87         84   85.42929           3       1.571
14  2  4 92         85   86.42929           7       5.571
15  3  4 89         89   90.42929           0      -1.429
16  4  4 84         86   87.42929          -2      -3.429
17  1  5 79         84   81.14141          -5      -2.141
18  2  5 81         85   82.14141          -4      -1.141
19  3  5 80         89   86.14141          -9      -6.141
20  4  5 88         86   83.14141           2       4.859
> # Create residual plots

> frame( )
> par(fin=c(7,7),cex=1.2,lwd=3,mex=1.5)
> plot(penclln.lme$fitted, penclln.lme$resid,
+     xlab="Estimated Means",
+     ylab="Residuals",
+     main="Residual Plot")
> abline(h=0, lty=2, lwd=3)

[Figure: "Residual Plot" — residuals versus estimated means (82-92).]

> qqnorm(penclln.lme$resid)
> qqline(penclln.lme$resid)
[Figure: normal Q-Q plot of penclln.lme$resid versus quantiles of the
standard normal.]

Example 10.2 Pigment production

In this example the main objective is the estimation of the variance
components.

  Source of Variation    d.f.           MS       E(MS)
  Batches                15-1=14        86.495   σ_e² + 2σ_δ² + 4σ_α²
  Samples(Batches)       15(2-1)=15     57.983   σ_e² + 2σ_δ²
  Tests(Samples)         (30)(2-1)=30    0.917   σ_e²
/* This is a SAS program for analyzing
   data from a nested or hierarchical
   experiment. This program is posted
   as pigment.sas

   The data are measurements of moisture
   content of a pigment taken from Box,
   Hunter and Hunter (page 574). */

data set1;
  infile 'pigment.dat';
  input batch sample test y;
run;

proc print data=set1;
run;

Estimates of variance components:

  σ̂_e² = MS_tests = 0.917

  σ̂_δ² = (MS_samples − MS_tests) / 2 = 28.533

  σ̂_α² = (MS_batches − MS_samples) / 4 = 7.128
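A sketch of the same moment computations from the nested ANOVA,
assuming the pigment data frame (with factors Batch and Sample)
defined in the S-PLUS code later in this example:

  # Method-of-moments estimates for the nested design of Example 10.2.
  fit <- aov(Y ~ Batch + Batch:Sample, data = pigment)
  ms  <- summary(fit)[[1]][, "Mean Sq"]  # Batch, Batch:Sample, Residuals
  sigma.e2 <- ms[3]                      # 0.917
  sigma.d2 <- (ms[2] - ms[3]) / 2        # (57.983 - 0.917)/2 = 28.533
  sigma.a2 <- (ms[1] - ms[2]) / 4        # (86.495 - 57.983)/4 = 7.128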
/* The "random" statement in the following
   GLM procedure prints formulas for
   expectations of mean squares. These results
   are used in variance component estimation */

proc glm data=set1;
  class batch sample;
  model y = batch sample(batch) / e1;
  random batch sample(batch) / q test;
run;

/* Alternatively, REML estimates of variance
   components are produced by the MIXED
   procedure in SAS. Note that there are
   no terms on the right of the equal sign in
   the model statement because the only
   non-random effect is the intercept. */

proc mixed data=set1;
  class batch sample test;
  model y = ;
  random batch sample(batch);
run;

/* Use the MIXED procedure in SAS to compute
   maximum likelihood estimates of variance
   components */

proc mixed data=set1 method=ml;
  class batch sample test;
  model y = ;
  random batch sample(batch);
run;
OBS   BATCH   SAMPLE   TEST    Y        OBS   BATCH   SAMPLE   TEST    Y
  1     1       1        1    40         31     8       2        1    29
  2     1       1        2    39         32     8       2        2    29
  3     1       2        1    30         33     9       1        1    27
  4     1       2        2    30         34     9       1        2    27
  5     2       1        1    26         35     9       2        1    31
  6     2       1        2    28         36     9       2        2    31
  7     2       2        1    25         37    10       1        1    13
  8     2       2        2    26         38    10       1        2    16
  9     3       1        1    29         39    10       2        1    27
 10     3       1        2    28         40    10       2        2    24
 11     3       2        1    14         41    11       1        1    25
 12     3       2        2    15         42    11       1        2    23
 13     4       1        1    30         43    11       2        1    25
 14     4       1        2    31         44    11       2        2    27
 15     4       2        1    24         45    12       1        1    29
 16     4       2        2    24         46    12       1        2    29
 17     5       1        1    19         47    12       2        1    31
 18     5       1        2    20         48    12       2        2    32
 19     5       2        1    17         49    13       1        1    19
 20     5       2        2    17         50    13       1        2    20
 21     6       1        1    33         51    13       2        1    29
 22     6       1        2    32         52    13       2        2    30
 23     6       2        1    26         53    14       1        1    23
 24     6       2        2    24         54    14       1        2    24
 25     7       1        1    23         55    14       2        1    25
 26     7       1        2    24         56    14       2        2    25
 27     7       2        1    32         57    15       1        1    39
 28     7       2        2    33         58    15       1        2    37
 29     8       1        1    34         59    15       2        1    26
 30     8       1        2    34         60    15       2        2    28
General Linear Models Procedure
Class Level Information

  Class    Levels   Values
  BATCH      15     1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
  SAMPLE      2     1 2

Number of observations in the data set = 60

Dependent Variable: Y

  Source     DF   Sum of Squares    Mean Square   F Value   Pr > F
  Model      29     2080.68           71.7477      78.27    0.0001
  Error      30       27.50            0.9167
  C. Total   59     2108.18333333

  Source          DF   Type I SS   Mean Square   F Value   Pr > F
  BATCH           14    1210.93       86.4952     94.36    0.0001
  SAMPLE(BATCH)   15     869.75       57.9833     63.25    0.0001

  Source          Type I Expected Mean Square
  BATCH           Var(Error) + 2 Var(SAMPLE(BATCH)) + 4 Var(BATCH)
  SAMPLE(BATCH)   Var(Error) + 2 Var(SAMPLE(BATCH))
Tests of Hypotheses Using Appropriate Error Terms

  Source: batch
  Error: MS(sample(batch))
    DF   Type I SS    MS       Denominator DF   Denominator MS    F      Pr > F
    14   1210.933     86.495        15              57.983        1.49   0.2256

  Source: sample(batch)
  Error: MS(Error)
    DF   Type I SS    MS       Denominator DF   Denominator MS    F      Pr > F
    15    869.750     57.983        30               0.917       63.25   <.0001
The MIXED Procedure

Class Level Information

  Class    Levels   Values
  BATCH      15     1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
  SAMPLE      2     1 2
  TEST        2     1 2

REML Estimation Iteration History

  Iteration   Evaluations    Objective       Criterion
      0            1         274.08096606
      1            1         183.82758851    0.00000000

Convergence criteria met.

Covariance Parameter Estimates (REML)

  Cov Parm         Ratio     Estimate   Std Error     Z     Pr > |Z|
  BATCH            7.7760     7.1280      9.7373     0.73    0.4642
  SAMPLE(BATCH)   31.1273    28.5333     10.5869     2.70    0.0070
  Residual         1.0000     0.9167      0.2367     3.87    0.0001

Model Fitting Information for Y

  Description                       Value
  Observations                     60.0000
  Variance Estimate                 0.9167
  Standard Deviation Estimate       0.9574
  REML Log Likelihood            -146.131
  Akaike's Information Criterion -149.131
  Schwarz's Bayesian Criterion   -152.247
  -2 REML Log Likelihood          292.2623
The MIXED Procedure

Class Level Information

  Class    Levels   Values
  BATCH      15     1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
  SAMPLE      2     1 2
  TEST        2     1 2

ML Estimation Iteration History

  Iteration   Evaluations    Objective       Criterion
      0            1         273.55423884
      1            1         184.15844023    0.00000000

Convergence criteria met.

Covariance Parameter Estimates (MLE)

  Cov Parm         Ratio     Estimate   Std Error     Z     Pr > |Z|
  BATCH            6.2033     5.68639     9.07341    0.63    0.5309
  SAMPLE(BATCH)   31.1273    28.53333    10.58692    2.70    0.0070
  Residual         1.0000     0.91667     0.23668    3.87    0.0001

Model Fitting Information for Y

  Description                       Value
  Observations                     60.0000
  Variance Estimate                 0.9167
  Standard Deviation Estimate       0.9574
  Log Likelihood                 -147.216
  Akaike's Information Criterion -150.216
  Schwarz's Bayesian Criterion   -153.357
  -2 Log Likelihood               294.4311
Estimation of μ = E(Y_ijk):

  μ̂ = Ȳ_... = (1/(bsr)) Σ_{i=1}^b Σ_{j=1}^s Σ_{k=1}^r Y_ijk

with

  E(Ȳ_...) = μ

  Var(Ȳ_...) = (1/(bsr)) (σ_e² + r σ_δ² + sr σ_α²)

Standard error:

  S_{Ȳ...} = √( (1/(bsr)) (σ̂_e² + r σ̂_δ² + sr σ̂_α²) )

           = √( MS_Batches / (bsr) )

           = √( 86.495 / 60 ) = 1.20

A 95% confidence interval for μ is

  Ȳ_... ± t_{14, .025} S_{Ȳ...}

where 14 is the d.f. for MS_Batches. Here, t_{14,.025} = 2.145 and
the confidence interval is

  26.783 ± (2.145)(1.20)   ⇒   (24.21, 29.36)
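The same computation, sketched numerically (variable names are ours):

  # Overall mean and its 95% CI in Example 10.2.
  b <- 15; s <- 2; r <- 2
  ms.batches <- 86.495
  ybar <- 26.783
  se   <- sqrt(ms.batches / (b * s * r))    # sqrt(86.495/60) = 1.20
  ybar + c(-1, 1) * qt(0.975, b - 1) * se   # (24.21, 29.36), 14 df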
> # This file is stored as pigment.spl

> pigment <- read.table("pigment.dat",
+     col.names=c("Batch","Sample","Test","Y"))
> pigment$Batch <- as.factor(pigment$Batch)
> pigment$Sample <- as.factor(pigment$Sample)
> pigment
   Batch Sample Test  Y
1      1      1    1 40
2      1      1    2 39
3      1      2    1 30
4      1      2    2 30
5      2      1    1 26
6      2      1    2 28
7      2      2    1 25
8      2      2    2 26
9      3      1    1 29
10     3      1    2 28
11     3      2    1 14
12     3      2    2 15
.      .      .    .  .
.      .      .    .  .
> # The function raov() may be used for
> # balanced designs with only random effects,
> # and gives a conventional analysis including
> # the estimation of variance components.
> # The function varcomp() is more general. It
> # may be used to estimate variance components
> # for balanced or unbalanced mixed models.

> # raov(): Random Effects Analysis of Variance

> pigment.raov <- raov(Y ~ Batch/Sample,
+     data=pigment)

> # or you could use
> # pigment.raov <- raov(Y ~ Batch +
> #     Batch:Sample, data=pigment)
> # pigment.raov <- raov(Y ~ Batch +
> #     Sample %in% Batch, data=pigment)
> summary(pigment.raov)
                  Df Sum of Sq  Mean Sq Est. Var.
            Batch 14  1210.933 86.49524   7.12798
Sample %in% Batch 15   869.750 57.98333  28.53333
        Residuals 30    27.500  0.91667   0.91667

> names(pigment.raov)
 [1] "coefficients"  "residuals"     "fitted.values"
 [4] "effects"       "R"             "rank"
 [7] "assign"        "df.residual"   "contrasts"
[10] "terms"         "call"          "model"
[13] "replications"  "ems.coef"

> pigment.raov$rep
 Batch Sample %in% Batch
     4                 2

> pigment.raov$ems.coef
                  Batch Sample %in% Batch Residuals
Batch                 4                 0         0
Sample %in% Batch     2                 2         0
Residuals             1                 1         1
> # The same variance component estimates can be
> # found using varcomp(), but as this allows mixed
> # models we first must declare which factors are
> # random using the is.random() function.
> # All factors in the data frame are declared to
> # be random effects by

> is.random(pigment) <- T
> is.random(pigment)
 Batch Sample
     T      T

> # The possible estimation methods are
> #   "minque0": minimum norm quadratic estimators
> #              (the default)
> #   "reml"   : residual (or reduced or restricted)
> #              maximum likelihood.
> #   "ml"     : maximum likelihood.
> varcomp(Y ~ Batch/Sample, data=pigment,
+     method="reml")$var
  Batch Sample %in% Batch Residuals
7.12866          28.53469  0.916641

> varcomp(Y ~ Batch/Sample, data=pigment,
+     method="ml")$var
  Batch Sample %in% Batch Residuals
5.68638          28.53333 0.9166668

Properties of ANOVA methods for variance component estimation:

(i)   Broad applicability:
        - easy to compute (balanced cases)
        - ANOVA is widely known
        - not required to completely specify distributions for
          random effects

(ii)  Unbiased estimators

(iii) Sampling distribution is not exactly known, even under the
      usual normality assumptions (except for σ̂_e² = MS_error)

(iv)  May produce negative estimates of variances
(v)   REML estimates have the same values in simple balanced cases
      where ANOVA estimates of variance components are inside the
      parameter space

(vi)  For unbalanced studies, there may be no "natural" way to
      choose the coefficients in

        σ̂² = Σ_{i=1}^k a_i MS_i

Result 10.2: If MS_1, MS_2, ..., MS_k are distributed independently
with

  (df_i) MS_i / E(MS_i) ~ χ²_{df_i}

and constants a_i > 0, i = 1, 2, ..., k, are selected so that

  σ̂² = Σ_{i=1}^k a_i MS_i

has expectation σ², then

  Var(σ̂²) = 2 Σ_{i=1}^k a_i² [E(MS_i)]² / df_i.
Proof: Since (df_i) MS_i / E(MS_i) ~ χ²_{df_i}, and E(χ²_{df_i}) = df_i
and Var(χ²_{df_i}) = 2 df_i, it follows that

  Var(MS_i) = Var( E(MS_i) χ²_{df_i} / df_i ) = 2 [E(MS_i)]² / df_i.

From the independence of the MS_i's, we have

  Var(σ̂²) = Σ_{i=1}^k a_i² Var(MS_i)
           = 2 Σ_{i=1}^k a_i² [E(MS_i)]² / df_i.

Furthermore,

  E(MS_i²) = Var(MS_i) + [E(MS_i)]²
           = 2 [E(MS_i)]² / df_i + [E(MS_i)]²
           = ((df_i + 2) / df_i) [E(MS_i)]².

Consequently,

  E( 2 Σ_{i=1}^k a_i² MS_i² / (df_i + 2) )
      = 2 Σ_{i=1}^k a_i² [E(MS_i)]² / df_i = Var(σ̂²),

so an unbiased estimator of this variance is

  V̂ar(σ̂²) = 2 Σ_{i=1}^k a_i² MS_i² / (df_i + 2).
Using the Cochran-Satterthwaite approximation (Result 10.1), an
approximate (1 − α) × 100% confidence interval for σ² can be
constructed as follows:

  1 − α ≈ Pr{ χ²_{v, 1−α/2} ≤ v σ̂² / σ² ≤ χ²_{v, α/2} }

        = Pr{ v σ̂² / χ²_{v, α/2} ≤ σ² ≤ v σ̂² / χ²_{v, 1−α/2} }

so the interval is

  ( v σ̂² / χ²_{v, α/2} ,  v σ̂² / χ²_{v, 1−α/2} )
       (lower limit)          (upper limit)

where σ̂² = Σ_{i=1}^k a_i MS_i and

  v = [ Σ_{i=1}^k a_i MS_i ]² / ( Σ_{i=1}^k [a_i MS_i]² / df_i ).

A "standard error" for σ̂² = Σ_{i=1}^k a_i MS_i could be reported as

  S_{σ̂²} = √( 2 Σ_{i=1}^k a_i² MS_i² / (df_i + 2) ).
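A small function sketching this interval construction (the function
name is ours; qchisq accepts the fractional Cochran-Satterthwaite
degrees of freedom):

  # Approximate (1 - alpha)100% CI for sigma^2 = sum(a_i MS_i).
  ci.satterthwaite <- function(a, ms, df, level = 0.95) {
    s2    <- sum(a * ms)
    v     <- s2^2 / sum((a * ms)^2 / df)
    alpha <- 1 - level
    c(lower = v * s2 / qchisq(1 - alpha / 2, v),
      upper = v * s2 / qchisq(alpha / 2, v))
  }
  # e.g. the variance of a process mean in Example 10.1:
  ci.satterthwaite(a = c(3/20, 1/20), ms = c(18.83, 66.0), df = c(12, 4))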
10.3 Likelihood-based methods:

  - Maximum Likelihood Estimation (ML)
  - Restricted Maximum Likelihood Estimation (REML)

Maximum Likelihood Estimation

Consider the mixed model

  Y (n×1) = X β (p×1) + Z u (q×1) + e (n×1)

where

  [ u ]     ( [ 0 ]   [ G  0 ] )
  [   ] ~ N ( [   ] , [      ] )
  [ e ]     ( [ 0 ]   [ 0  R ] )

Then,

  Y ~ N(Xβ, Σ)    where    Σ = ZGZ^T + R.

Multivariate normal likelihood:

  L(β, Σ; Y) = (2π)^{-n/2} |Σ|^{-1/2}
                 exp( -(1/2) (Y − Xβ)^T Σ^{-1} (Y − Xβ) )

The log-likelihood function is

  ℓ(β, Σ; Y) = -(n/2) log(2π) − (1/2) log(|Σ|)
                 − (1/2) (Y − Xβ)^T Σ^{-1} (Y − Xβ).

Given the values of the observed responses Y, find values of β and Σ
that maximize the log-likelihood function.
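In computational terms the objective function is simple to write
down; the following R-style sketch (names are ours, not an efficient
implementation) evaluates it for candidate values of β and Σ:

  # Log-likelihood of (beta, Sigma) for the mixed model (sketch).
  mixed.loglik <- function(Y, X, beta, Sigma) {
    n <- length(Y)
    r <- Y - X %*% beta
    as.numeric(-0.5 * (n * log(2 * pi) + log(det(Sigma)) +
                       t(r) %*% solve(Sigma, r)))
  }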
This is a difficult computational problem:

  - Constrained optimization problem:
      - estimates of variances cannot be negative
      - estimated correlations are between -1 and 1
      - Σ̂, Ĝ, and R̂ are positive definite (or non-negative
        definite)

  - No analytic solution (except in some balanced cases):
      - use iterative numerical methods
      - starting values (initial guesses at the values of β̂ and
        Σ̂ = ZĜZ^T + R̂)
      - local or global maxima?
      - what if Σ̂ becomes singular or is not positive definite?

  - Large sample distributional properties of estimators follow
    from the general theory of maximum likelihood estimation:
      - consistency
      - normality
      - efficiency
    These properties are not guaranteed for ANOVA methods.
Estimates of variance components from ML tend to be too small.

Consider a sample Y_1, ..., Y_n from a N(μ, σ²) distribution. An
unbiased estimator for σ² is

  S² = (1/(n−1)) Σ_{j=1}^n (Y_j − Ȳ)².

The MLE for σ² is

  σ̂² = (1/n) Σ_{j=1}^n (Y_j − Ȳ)²

with

  E(σ̂²) = ((n−1)/n) σ² < σ².

Note that S² and σ̂² are based on "error contrasts"

  e_1 = Y_1 − Ȳ = ( (n−1)/n, −1/n, ..., −1/n ) Y
   ...
  e_n = Y_n − Ȳ = ( −1/n, −1/n, ..., −1/n, (n−1)/n ) Y

whose distribution does not depend on μ = E(Y_j). When
Y ~ N(μ1, σ²I),

      [ e_1 ]
  e = [  .  ] = (I − P_1) Y ~ N(0, σ²(I − P_1)).
      [ e_n ]

The MLE σ̂² = (1/n) Σ_{j=1}^n e_j² fails to acknowledge that e is
restricted to an (n−1)-dimensional space, i.e., Σ_{j=1}^n e_j = 0.
The MLE fails to make the appropriate adjustment in "degrees of
freedom" needed to obtain an unbiased estimator for σ².
Example: Suppose n = 4 and Y ~ N(μ1, σ²I). Then

      [ Y_1 − Ȳ ]
  e = [ Y_2 − Ȳ ] = (I − P_1) Y ~ N(0, σ²(I − P_1)).
      [ Y_3 − Ȳ ]
      [ Y_4 − Ȳ ]

This covariance matrix is singular, so the density function does not
exist. Here m = rank(I − P_1) = n − 1 = 3. Define

                          [ Y_1 + Y_2 − Y_3 − Y_4 ]
  r = M e = M(I − P_1)Y = [ Y_1 − Y_2 + Y_3 − Y_4 ]
                          [ Y_1 − Y_2 − Y_3 + Y_4 ]

where

      [ 1   1  -1  -1 ]
  M = [ 1  -1   1  -1 ]
      [ 1  -1  -1   1 ]

has row rank equal to m = rank(I − P_1) = 3. Then

  r = M(I − P_1)Y ~ N(0, σ² M(I − P_1)M^T),

and we call M(I − P_1)M^T = W.

Restricted likelihood function:

  L(σ²; r) = (2π)^{-m/2} |σ²W|^{-1/2} exp( -(1/(2σ²)) r^T W^{-1} r )

(Note that |σ²W| = (σ²)^m |W|.)

Restricted log-likelihood:

  ℓ(σ²; r) = -(m/2) log(2π) − (m/2) log(σ²)
               − (1/2) log|W| − (1/(2σ²)) r^T W^{-1} r

(Restricted) likelihood equation:

  0 = ∂ℓ(σ²; r)/∂σ² = -m/(2σ²) + (1/(2(σ²)²)) r^T W^{-1} r

Solution (REML estimator for σ²):

  σ̂²_REML = (1/m) r^T W^{-1} r

           = (1/m) Y^T (I − P_1)^T M^T (M(I − P_1)M^T)^{-1} M(I − P_1) Y

           = (1/m) Y^T (I − P_1) Y

             (the middle factor is the projection of Y onto the
              column space of M(I − P_1), which is the column space
              of I − P_1)

           = (1/(n−1)) Σ_{j=1}^n (Y_j − Ȳ)² = S².

REML (Restricted Maximum Likelihood) estimation:

  - Estimate parameters in Σ = ZGZ^T + R by maximizing the part of
    the likelihood that does not depend on E(Y) = Xβ.

  - Maximize a likelihood function for "error contrasts" (linear
    combinations of observations that do not depend on Xβ); we will
    need a set of n − rank(X) linearly independent "error
    contrasts." (A code sketch of the n = 4 construction above
    follows.)
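A tiny numerical check of the n = 4 error-contrast construction (a
sketch; the simulated data and names are ours):

  # Check: the REML estimator from the error contrasts equals S^2.
  set.seed(1)
  Y  <- rnorm(4, mean = 10, sd = 2)
  M  <- rbind(c(1,  1, -1, -1),
              c(1, -1,  1, -1),
              c(1, -1, -1,  1))       # rows orthogonal to the 1 vector
  P1 <- matrix(1/4, 4, 4)             # projection onto span(1)
  r  <- M %*% (diag(4) - P1) %*% Y
  W  <- M %*% (diag(4) - P1) %*% t(M) # here W = M M^T = 4 I
  s2.reml <- drop(t(r) %*% solve(W) %*% r) / 3
  all.equal(s2.reml, var(Y))          # TRUE: equals the usual S^2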
Mixed (normal-theory) model:

  Y = Xβ + Zu + e

where

  [ u ]     ( [ 0 ]   [ G  0 ] )
  [   ] ~ N ( [   ] , [      ] )
  [ e ]     ( [ 0 ]   [ 0  R ] )

Then

  LY = L(Xβ + Zu + e) = LXβ + LZu + Le

is invariant to β if and only if LX = 0. But LX = 0 if and only if

  L = M(I − P_X)

for some M. (Here P_X = X(X^T X)^- X^T.)

To avoid losing information we must have

  row rank(M) = n − rank(X) = n − p.

Then a set of n − p error contrasts is

  r = M(I − P_X) Y ~ N_{n−p}(0, M(I − P_X) Σ (I − P_X)^T M^T).

Call M(I − P_X) Σ (I − P_X)^T M^T = W; then rank(W) = n − p and
W^{-1} exists.

The "restricted" likelihood is

  L(Σ; r) = (2π)^{-(n−p)/2} |W|^{-1/2} exp( -(1/2) r^T W^{-1} r )

and the resulting log-likelihood is

  ℓ(Σ; r) = -((n−p)/2) log(2π) − (1/2) log|W| − (1/2) r^T W^{-1} r.
For any M ((n−p)×n) with row rank equal to n − p = n − rank(X), the
log-likelihood can be expressed in terms of

  ẽ = (I − X̃(X̃^T Σ^{-1} X̃)^{-1} X̃^T Σ^{-1}) Y

as

  ℓ(Σ; ẽ) = constant − (1/2) log(|Σ|) − (1/2) log(|X̃^T Σ^{-1} X̃|)
              − (1/2) ẽ^T Σ^{-1} ẽ

where X̃ is any set of p = rank(X) linearly independent columns of X.
Denote the resulting REML estimators as Ĝ, R̂ and

  Σ̂ = Z Ĝ Z^T + R̂.
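A direct (inefficient but transparent) sketch of this REML criterion,
assuming X̃ = X has full column rank (the function name is ours):

  # REML log-likelihood (up to an additive constant) for a
  # candidate Sigma.
  reml.loglik <- function(Y, X, Sigma) {
    Si    <- solve(Sigma)
    XtSiX <- t(X) %*% Si %*% X
    e     <- Y - X %*% solve(XtSiX, t(X) %*% Si %*% Y)
    as.numeric(-0.5 * (log(det(Sigma)) + log(det(XtSiX)) +
                       t(e) %*% Si %*% e))
  }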
Estimation of fixed effects:

For any estimable function Cβ, the b.l.u.e. is the generalized least
squares estimator

  C b_GLS = C (X^T Σ^{-1} X)^- X^T Σ^{-1} Y.

Using the REML estimator for Σ = ZGZ^T + R, an approximation is

  C β̂ = C (X^T Σ̂^{-1} X)^- X^T Σ̂^{-1} Y

and for "large" samples, C β̂ is approximately distributed as

  N( Cβ, C (X^T Σ^{-1} X)^- C^T ).

Prediction of random effects:

Given the observed responses Y, predict the value of u. Here

  [ u ]     ( [ 0 ]   [ G  0 ] )
  [   ] ~ N ( [   ] , [      ] )
  [ e ]     ( [ 0 ]   [ 0  R ] )

For our model, Y = Xβ + Zu + e, so

  [ u ]   [ 0  ]   [ I  0 ] [ u ]
  [   ] = [    ] + [      ] [   ]
  [ Y ]   [ Xβ ]   [ Z  I ] [ e ]

and (from Result 4.1)

  [ u ]     ( [ 0  ]   [ G     G Z^T     ] )
  [   ] ~ N ( [    ] , [                 ] )
  [ Y ]     ( [ Xβ ]   [ ZG   ZGZ^T + R  ] )
The Best Linear Unbiased Predictor (BLUP):

  E(u|Y) = E(u) + GZ^T (ZGZ^T + R)^{-1} (Y − E(Y))
         = 0 + GZ^T (ZGZ^T + R)^{-1} (Y − Xβ)

(from Result 4.5). Substituting the b.l.u.e.

  X b_GLS = X (X^T Σ^{-1} X)^- X^T Σ^{-1} Y

for Xβ, the BLUP for u is

  BLUP(u) = GZ^T Σ^{-1} (Y − X b_GLS)
          = GZ^T Σ^{-1} (I − X (X^T Σ^{-1} X)^- X^T Σ^{-1}) Y

when G and Σ = ZGZ^T + R are known.

Substituting REML estimators Ĝ and R̂ for G and R, an approximate
BLUP for u is

  û = Ĝ Z^T Σ̂^{-1} (I − X (X^T Σ̂^{-1} X)^- X^T Σ̂^{-1}) Y
    = Ĝ Z^T Σ̂^{-1} (Y − X β̂)

where X β̂ is the approximate b.l.u.e. for Xβ.

For "large" samples, the distribution of û is approximately
multivariate normal with mean vector 0 and covariance matrix

  G Z^T Σ^{-1} (I − P) Σ (I − P)^T Σ^{-1} Z G

where P = X (X^T Σ^{-1} X)^- X^T Σ^{-1}.
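A computational sketch of the GLS estimate and the BLUP when G and R
(hence Σ) are treated as known (function name is ours; X assumed of
full column rank):

  # GLS estimate of beta and BLUP of u for known G and R (sketch).
  blup <- function(Y, X, Z, G, R) {
    Sigma <- Z %*% G %*% t(Z) + R
    Si    <- solve(Sigma)
    b     <- solve(t(X) %*% Si %*% X, t(X) %*% Si %*% Y)  # b_GLS
    u     <- G %*% t(Z) %*% Si %*% (Y - X %*% b)          # BLUP(u)
    list(beta = b, u = u)
  }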
Given estimates Ĝ, R̂ and Σ̂ = Z Ĝ Z^T + R̂, β̂ and û provide a
solution to the mixed model equations:

  [ X^T R̂^{-1} X    X^T R̂^{-1} Z            ] [ β̂ ]   [ X^T R̂^{-1} Y ]
  [                                          ] [    ] = [               ]
  [ Z^T R̂^{-1} X    Z^T R̂^{-1} Z + Ĝ^{-1}  ] [ û ]   [ Z^T R̂^{-1} Y ]

A generalized inverse of

  [ X^T R̂^{-1} X    X^T R̂^{-1} Z            ]
  [                                          ]
  [ Z^T R̂^{-1} X    Z^T R̂^{-1} Z + Ĝ^{-1}  ]

is used to approximate the covariance matrix for [ β̂ ; û ].
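Equivalently, a sketch of these equations (often called Henderson's
mixed model equations) as a direct solve; the function name is ours
and a generalized inverse would be needed if the coefficient matrix
is singular:

  # Mixed model equations: joint solution for beta-hat and u-hat.
  mme <- function(Y, X, Z, G, R) {
    Ri <- solve(R); Gi <- solve(G)
    C  <- rbind(cbind(t(X) %*% Ri %*% X, t(X) %*% Ri %*% Z),
                cbind(t(Z) %*% Ri %*% X, t(Z) %*% Ri %*% Z + Gi))
    rhs <- rbind(t(X) %*% Ri %*% Y, t(Z) %*% Ri %*% Y)
    sol <- solve(C, rhs)
    p <- ncol(X)
    list(beta = sol[1:p, , drop = FALSE],
         u    = sol[-(1:p), , drop = FALSE])
  }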
References

Bates, D.M. and Pinheiro, J.C. (1998) "Computational methods for
multilevel models," available in PostScript or PDF formats at
http://franz.stat.wisc.edu/pub/NLME/

Davidian, M. and Giltinan, D.M. (1995) Nonlinear Mixed Effects Models
for Repeated Measurement Data, Chapman and Hall.

Hartley, H.O. and Rao, J.N.K. (1967) Maximum-likelihood estimation
for the mixed analysis of variance model, Biometrika, 54, 93-108.

Harville, D.A. (1977) Maximum likelihood approaches to variance
component estimation and to related problems, Journal of the American
Statistical Association, 72, 320-338.

Jennrich, R.I. and Schluchter, M.D. (1986) Unbalanced
repeated-measures models with structured covariance matrices,
Biometrics, 42, 805-820.

Kackar, R.N. and Harville, D.A. (1984) Approximations for standard
errors of estimators of fixed and random effects in mixed linear
models, Journal of the American Statistical Association, 79, 853-862.

Laird, N.M. and Ware, J.H. (1982) Random-effects models for
longitudinal data, Biometrics, 38, 963-974.

Lindstrom, M.J. and Bates, D.M. (1988) Newton-Raphson and EM
algorithms for linear mixed-effects models for repeated-measures
data, Journal of the American Statistical Association, 83, 1014-1022.

Littell, R.C., Milliken, G.A., Stroup, W.W., and Wolfinger, R.D.
(1996) SAS System for Mixed Models, SAS Institute.

Pinheiro, J.C. and Bates, D.M. (1996) "Unconstrained parametrizations
for variance-covariance matrices," Statistics and Computing, 6,
289-296.

Pinheiro, J.C. and Bates, D.M. (2000) Mixed-Effects Models in S and
S-PLUS, New York, Springer-Verlag.

Robinson, G.K. (1991) That BLUP is a good thing: the estimation of
random effects, Statistical Science, 6, 15-51.

Searle, S.R., Casella, G. and McCulloch, C.E. (1992) Variance
Components, New York, John Wiley & Sons.

Wolfinger, R.D., Tobias, R.D. and Sall, J. (1994) Computing Gaussian
likelihoods and their derivatives for general linear mixed models,
SIAM Journal on Scientific Computing, 15(6), 1294-1310.