Linear Mixed-Effects Models
Copyright 2012 Dan Nettleton (Iowa State University)
Statistics 611
The Linear Mixed-Effects Model
y = Xβ + Zu + e
X is an n × p design matrix of known constants
β ∈ R^p is an unknown parameter vector
Z is an n × q matrix of known constants
u is a q × 1 random vector
e is an n × 1 vector of random errors
The Linear Mixed-Effects Model
y = Xβ + Zu + e
The elements of β are considered to be non-random and are
called “fixed effects.”
The elements of u are random variables and are called “random
effects.”
The elements of the error vector e are always considered to be
random variables.
Because the model includes both fixed and random effects (in
addition to the random errors), it is called a “mixed-effects” model
or, more simply, a “mixed” model.
The model is called a “linear” mixed-effects model because (as we
will soon see)
E(y|u) = Xβ + Zu,
a linear function of fixed and random effects.
We assume that
E(e) = 0
Var(e) = R
E(u) = 0
Var(u) = G
Cov(e, u) = 0.
It follows that
E(y) = E(Xβ + Zu + e)
     = Xβ + Z E(u) + E(e)
     = Xβ.

Var(y) = Var(Xβ + Zu + e)
       = Var(Zu + e)
       = Var(Zu) + Var(e)
       = Z Var(u) Z′ + R
       = ZGZ′ + R ≡ Σ.
We usually consider the special case in which

$$\begin{bmatrix} u \\ e \end{bmatrix} \sim N\!\left( \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} G & 0 \\ 0 & R \end{bmatrix} \right) \implies y \sim N(X\beta,\ ZGZ' + R).$$
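As a quick numerical check, the Python sketch below (using small, arbitrary example matrices that are assumptions for illustration, not taken from the slides) simulates y = Xβ + Zu + e and compares the empirical mean and covariance of y with Xβ and ZGZ′ + R:

```python
import numpy as np

rng = np.random.default_rng(0)

# Small, arbitrary dimensions and matrices (illustrative assumptions only).
n, p, q = 6, 2, 3
X = rng.normal(size=(n, p))        # fixed-effects design matrix
Z = rng.normal(size=(n, q))        # random-effects design matrix
beta = np.array([1.0, -0.5])       # fixed effects
G = 0.5 * np.eye(q)                # Var(u)
R = 0.25 * np.eye(n)               # Var(e)

# Simulate many independent realizations of y = X beta + Z u + e.
m = 200_000
U = rng.multivariate_normal(np.zeros(q), G, size=m)  # each row is a draw of u
E = rng.multivariate_normal(np.zeros(n), R, size=m)  # each row is a draw of e
Y = X @ beta + U @ Z.T + E                           # each row is one y

Sigma = Z @ G @ Z.T + R
print(np.allclose(Y.mean(axis=0), X @ beta, atol=0.02))  # E(y) = X beta
print(np.allclose(np.cov(Y.T), Sigma, atol=0.05))        # Var(y) = ZGZ' + R
```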
The conditional moments, given the random effects, are
E(y|u) = Xβ + Zu
Var(y|u) = R.
Example 1
Suppose a study was conducted to compare two plant genotypes
(genotype 1 and genotype 2). Suppose 10 seeds (5 of genotype 1 and
5 of genotype 2) were planted in a total of 4 pots. Suppose 3 genotype
1 seeds were planted in one pot, and the other 2 genotype 1 seeds
were planted in another pot. Suppose the same planting strategy was
used when planting genotype 2 seeds in the other two pots. The seeds
germinated and emerged from the soil as seedlings. After a four-week
growing period, each seedling was dried and weighed. Let y_{ijk} denote
the weight for genotype i, pot j, seedling k. Provide a linear
mixed-effects model for the dry-weight data. Determine y, X, β, Z, u,
G, R, and Var(y).
Consider the model
y_{ijk} = μ + γ_i + p_{ij} + e_{ijk},

where p_{11}, p_{12}, p_{21}, p_{22} are iid N(0, σ_p²), independent of the e_{ijk} terms, which are assumed to be iid N(0, σ_e²).

This model can be written in the form y = Xβ + Zu + e, where
$$y = \begin{bmatrix} y_{111}\\ y_{112}\\ y_{113}\\ y_{121}\\ y_{122}\\ y_{211}\\ y_{212}\\ y_{213}\\ y_{221}\\ y_{222} \end{bmatrix},\qquad
X = \begin{bmatrix} 1&1&0\\ 1&1&0\\ 1&1&0\\ 1&1&0\\ 1&1&0\\ 1&0&1\\ 1&0&1\\ 1&0&1\\ 1&0&1\\ 1&0&1 \end{bmatrix},\qquad
\beta = \begin{bmatrix} \mu\\ \gamma_1\\ \gamma_2 \end{bmatrix},$$
$$Z = \begin{bmatrix} 1&0&0&0\\ 1&0&0&0\\ 1&0&0&0\\ 0&1&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&1&0\\ 0&0&1&0\\ 0&0&0&1\\ 0&0&0&1 \end{bmatrix},\qquad
u = \begin{bmatrix} p_{11}\\ p_{12}\\ p_{21}\\ p_{22} \end{bmatrix},\qquad
e = \begin{bmatrix} e_{111}\\ e_{112}\\ e_{113}\\ e_{121}\\ e_{122}\\ e_{211}\\ e_{212}\\ e_{213}\\ e_{221}\\ e_{222} \end{bmatrix}.$$
G = Var(u) = Var([p_{11}, p_{12}, p_{21}, p_{22}]′) = σ_p² I_{4×4},
R = Var(e) = σ_e² I_{10×10},
Var(y) = ZGZ′ + R = Z(σ_p² I)Z′ + σ_e² I = σ_p² ZZ′ + σ_e² I.
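A short numpy sketch (a minimal check, with illustrative values of σ_p² and σ_e² that are not from the slides) builds this Z and confirms the form of Var(y):

```python
import numpy as np

# Pot of each seedling in order: 3 and 2 seeds of genotype 1, then 3 and 2 of genotype 2.
pot_of_seedling = [0]*3 + [1]*2 + [2]*3 + [3]*2

# Z has one column per pot; each seedling's row indicates its pot.
Z = np.zeros((10, 4))
for row, pot in enumerate(pot_of_seedling):
    Z[row, pot] = 1.0

sigma2_p, sigma2_e = 2.0, 1.0   # illustrative variance components
G = sigma2_p * np.eye(4)        # Var(u)
R = sigma2_e * np.eye(10)       # Var(e)

V_y = Z @ G @ Z.T + R           # Var(y) = sigma_p^2 ZZ' + sigma_e^2 I
print(V_y[:3, :3])               # first block: sigma_p^2 + sigma_e^2 on the diagonal
print(np.all(V_y[:3, 3:] == 0))  # observations from different pots are uncorrelated
```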

$$ZZ' = \begin{bmatrix}
1&1&1&0&0&0&0&0&0&0\\
1&1&1&0&0&0&0&0&0&0\\
1&1&1&0&0&0&0&0&0&0\\
0&0&0&1&1&0&0&0&0&0\\
0&0&0&1&1&0&0&0&0&0\\
0&0&0&0&0&1&1&1&0&0\\
0&0&0&0&0&1&1&1&0&0\\
0&0&0&0&0&1&1&1&0&0\\
0&0&0&0&0&0&0&0&1&1\\
0&0&0&0&0&0&0&0&1&1
\end{bmatrix}$$
Thus, Var(y) = σ_p² ZZ′ + σ_e² I is a block diagonal matrix. The first block is

$$\mathrm{Var}\begin{bmatrix} y_{111}\\ y_{112}\\ y_{113} \end{bmatrix} =
\begin{bmatrix}
\sigma_p^2+\sigma_e^2 & \sigma_p^2 & \sigma_p^2\\
\sigma_p^2 & \sigma_p^2+\sigma_e^2 & \sigma_p^2\\
\sigma_p^2 & \sigma_p^2 & \sigma_p^2+\sigma_e^2
\end{bmatrix}.$$
Note that

Var(y_{ijk}) = σ_p² + σ_e²  for all i, j, k;
Cov(y_{ijk}, y_{ijk*}) = σ_p²  for all i, j and k ≠ k*;
Cov(y_{ijk}, y_{i*j*k*}) = 0  if i ≠ i* or j ≠ j*.

Any two observations from the same pot have covariance σ_p², and any two observations from different pots are uncorrelated.
Note that Var(y) may be written as σ_e² V, where V is a block diagonal matrix with blocks of the form

$$\begin{bmatrix}
1+\sigma_p^2/\sigma_e^2 & \sigma_p^2/\sigma_e^2 & \cdots & \sigma_p^2/\sigma_e^2\\
\sigma_p^2/\sigma_e^2 & 1+\sigma_p^2/\sigma_e^2 & \cdots & \sigma_p^2/\sigma_e^2\\
\vdots & \vdots & \ddots & \vdots\\
\sigma_p^2/\sigma_e^2 & \sigma_p^2/\sigma_e^2 & \cdots & 1+\sigma_p^2/\sigma_e^2
\end{bmatrix}.$$

Thus, if σ_p²/σ_e² were known, we would have the Aitken model

y = Xβ + ε, where ε = Zu + e ∼ N(0, σ²V) and σ² ≡ σ_e².
Thus, if σ_p²/σ_e² were known, we would use GLS to estimate any estimable Cβ by

$$C\hat{\beta}_{GLS} = C(X'V^{-1}X)^{-}X'V^{-1}y.$$

However, we seldom know σ_p²/σ_e² or, more generally, Σ or V.

For the general problem where Var(y) = Σ is an unknown positive definite matrix, we can rewrite Σ as σ²V, where σ² is an unknown positive variance and V is an unknown positive definite matrix. As in our simple example, each entry of V is usually assumed to be a known function of a few unknown parameters.
Thus, our strategy for estimating an estimable Cβ involves estimating the unknown parameters in V to obtain the estimated GLS estimator

$$C\hat{\beta}_{\widehat{GLS}} = C(X'\hat{V}^{-1}X)^{-}X'\hat{V}^{-1}y.$$

In general, this is a nonlinear estimator that approximates

$$C\hat{\beta}_{GLS} = C(X'V^{-1}X)^{-}X'V^{-1}y,$$

which would be the BLUE of Cβ if V were known.
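The following Python sketch (illustrative only: the plug-in value of σ_p²/σ_e² and the response vector are made up, and in practice V̂ would come from estimated variance components) computes the estimated GLS estimator for the seedling example using a Moore–Penrose generalized inverse:

```python
import numpy as np

# Design matrices for the seedling example (10 observations, 4 pots).
X = np.column_stack([np.ones(10),
                     np.r_[np.ones(5), np.zeros(5)],    # genotype 1 indicator
                     np.r_[np.zeros(5), np.ones(5)]])   # genotype 2 indicator
Z = np.zeros((10, 4))
for row, pot in enumerate([0]*3 + [1]*2 + [2]*3 + [3]*2):
    Z[row, pot] = 1.0

ratio_hat = 1.5                            # assumed estimate of sigma_p^2 / sigma_e^2
V_hat = np.eye(10) + ratio_hat * Z @ Z.T   # V-hat = I + (sigma_p^2/sigma_e^2) ZZ'

rng = np.random.default_rng(1)
y = rng.normal(size=10)                    # placeholder response for illustration

Vinv = np.linalg.inv(V_hat)
# beta-hat = (X' V^-1 X)^- X' V^-1 y; pinv supplies a generalized inverse
# because this X (intercept plus both genotype indicators) is rank deficient.
beta_hat = np.linalg.pinv(X.T @ Vinv @ X) @ X.T @ Vinv @ y

C = np.array([[0.0, 1.0, -1.0]])           # estimable function gamma_1 - gamma_2
print(C @ beta_hat)
```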
In special cases, the estimated GLS estimator may be a linear estimator. For example, if there exists a matrix Q such that VX = XQ, then we know that

$$C\hat{\beta}_{GLS} = C\hat{\beta} \quad\text{and}\quad C\hat{\beta}_{\widehat{GLS}} = C\hat{\beta},$$

which is a linear estimator of Cβ. (Here β̂ denotes an ordinary least squares solution.)
However, even for our simple example involving seedling dry weight, the estimated GLS estimator is a nonlinear estimator of Cβ for

C = [1, 1, 0] ⟺ Cβ = μ + γ₁,
C = [1, 0, 1] ⟺ Cβ = μ + γ₂, and
C = [0, 1, −1] ⟺ Cβ = γ₁ − γ₂.

Confidence intervals and tests for these estimable functions are not exact.
In our simple example involving seedling dry weight, there was only one random factor (pot). When there are m random factors, we can partition Z and u as

$$Z = [Z_1, \ldots, Z_m] \quad\text{and}\quad u = \begin{bmatrix} u_1\\ \vdots\\ u_m \end{bmatrix},$$

where u_j is the vector of random effects associated with factor j (j = 1, …, m).
We can write Zu as

$$[Z_1, \ldots, Z_m]\begin{bmatrix} u_1\\ \vdots\\ u_m \end{bmatrix} = \sum_{j=1}^{m} Z_j u_j.$$

We often assume that all random effects (including random errors) are mutually independent and that the random effects associated with the jth random factor have variance σ_j² (j = 1, …, m). Under these assumptions,

$$\mathrm{Var}(y) = ZGZ' + R = \sum_{j=1}^{m} \sigma_j^2 Z_j Z_j' + \sigma_e^2 I.$$
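A small Python helper (an illustrative sketch with made-up variance values, not code from the course) assembles Var(y) from a list of factor design matrices under these assumptions:

```python
import numpy as np

def mixed_model_variance(Z_list, sigma2_list, sigma2_e):
    """Var(y) = sum_j sigma_j^2 Z_j Z_j' + sigma_e^2 I under mutual independence."""
    n = Z_list[0].shape[0]
    V = sigma2_e * np.eye(n)
    for Zj, s2 in zip(Z_list, sigma2_list):
        V += s2 * Zj @ Zj.T
    return V

# Example: the single-factor (m = 1) seedling design from earlier.
Z_pot = np.zeros((10, 4))
for row, pot in enumerate([0]*3 + [1]*2 + [2]*3 + [3]*2):
    Z_pot[row, pot] = 1.0
print(mixed_model_variance([Z_pot], [2.0], 1.0)[:3, :3])
```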
Example 2
Consider an experiment involving 4 litters of 4 animals each.
Suppose 4 treatments are randomly assigned to the 4 animals in
each litter.
Suppose we obtain two replicate muscle samples from each
animal and measure the response of interest for each muscle
sample.
Let y_{ijk} denote the kth measure of the response for the animal from litter j that received treatment i (i = 1, 2, 3, 4; j = 1, 2, 3, 4; k = 1, 2). Suppose

$$y_{ijk} = \mu + \tau_i + \ell_j + a_{ij} + e_{ijk},$$

where

$$\beta = [\mu, \tau_1, \tau_2, \tau_3, \tau_4]' \in \mathbb{R}^5$$

is an unknown vector of fixed parameters,

$$u = [\ell_1, \ell_2, \ell_3, \ell_4, a_{11}, a_{21}, a_{31}, a_{41}, a_{12}, \ldots, a_{34}, a_{44}]'$$

is a vector of random effects, and
$$e = [e_{111}, e_{112}, e_{211}, e_{212}, \ldots, e_{411}, e_{412}, \ldots, e_{441}, e_{442}]'$$

is a vector of random errors. With

$$y = [y_{111}, y_{112}, y_{211}, y_{212}, \ldots, y_{411}, y_{412}, \ldots, y_{441}, y_{442}]',$$

we can write the model as a linear mixed-effects model y = Xβ + Zu + e, where
the design matrix X consists of the 8 × 5 block

$$B = \begin{bmatrix}
1&1&0&0&0\\
1&1&0&0&0\\
1&0&1&0&0\\
1&0&1&0&0\\
1&0&0&1&0\\
1&0&0&1&0\\
1&0&0&0&1\\
1&0&0&0&1
\end{bmatrix}$$

repeated three more times, once per litter, so that X = [B′, B′, B′, B′]′ is 32 × 5, and Z is the 32 × 20 matrix whose first four columns indicate litters and whose last sixteen columns indicate animals:

$$Z = \begin{bmatrix}
1_8 & 0 & 0 & 0 & A & 0 & 0 & 0\\
0 & 1_8 & 0 & 0 & 0 & A & 0 & 0\\
0 & 0 & 1_8 & 0 & 0 & 0 & A & 0\\
0 & 0 & 0 & 1_8 & 0 & 0 & 0 & A
\end{bmatrix},
\quad\text{where } A = \begin{bmatrix}
1&0&0&0\\ 1&0&0&0\\ 0&1&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&1&0\\ 0&0&0&1\\ 0&0&0&1
\end{bmatrix}$$

and 1₈ is an 8 × 1 vector of ones.
We can write less and be more precise using Kronecker product notation:

$$X = 1_{4\times 1} \otimes [\,1_{8\times 1},\ I_{4\times 4} \otimes 1_{2\times 1}\,],
\qquad
Z = [\,I_{4\times 4} \otimes 1_{8\times 1},\ I_{16\times 16} \otimes 1_{2\times 1}\,].$$
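These expressions translate directly into numpy; the sketch below (a quick structural check, not course code) builds X and Z with np.kron and verifies their shapes:

```python
import numpy as np

def ones_col(k):
    return np.ones((k, 1))

# X = 1_{4x1} kron [ 1_{8x1}, I_4 kron 1_{2x1} ]
block = np.hstack([ones_col(8), np.kron(np.eye(4), ones_col(2))])  # 8 x 5
X = np.kron(ones_col(4), block)                                    # 32 x 5

# Z = [ I_4 kron 1_{8x1}, I_16 kron 1_{2x1} ]
Z_litter = np.kron(np.eye(4), ones_col(8))    # 32 x 4, litter indicators
Z_animal = np.kron(np.eye(16), ones_col(2))   # 32 x 16, animal indicators
Z = np.hstack([Z_litter, Z_animal])

print(X.shape, Z.shape)          # (32, 5) (32, 20)
print(Z_litter.sum(axis=0))      # 8 observations per litter
print(Z_animal.sum(axis=0))      # 2 measurements per animal
```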
In this experiment, we have two random factors: litter and animal.
We can partition our random effects vector u into a vector of litter
effects and a vector of animal effects:
$$u = \begin{bmatrix} \ell\\ a \end{bmatrix},
\qquad
\ell = \begin{bmatrix} \ell_1\\ \ell_2\\ \ell_3\\ \ell_4 \end{bmatrix},
\qquad
a = \begin{bmatrix} a_{11}\\ a_{21}\\ a_{31}\\ a_{41}\\ a_{12}\\ \vdots\\ a_{44} \end{bmatrix}.$$

We make the usual assumption that

$$u = \begin{bmatrix} \ell\\ a \end{bmatrix} \sim N\!\left(\begin{bmatrix} 0\\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_\ell^2 I & 0\\ 0 & \sigma_a^2 I \end{bmatrix}\right),$$

where σ_ℓ², σ_a² ∈ R⁺ are unknown parameters.
We can partition

$$Z = [\,I_{4\times 4} \otimes 1_{8\times 1},\ I_{16\times 16} \otimes 1_{2\times 1}\,] = [Z_\ell, Z_a].$$

We have

$$Zu = [Z_\ell, Z_a]\begin{bmatrix} \ell\\ a \end{bmatrix} = Z_\ell \ell + Z_a a$$

and

$$\begin{aligned}
\mathrm{Var}(Zu) = ZGZ'
&= [Z_\ell, Z_a]\begin{bmatrix} \sigma_\ell^2 I & 0\\ 0 & \sigma_a^2 I \end{bmatrix}\begin{bmatrix} Z_\ell'\\ Z_a' \end{bmatrix}\\
&= Z_\ell(\sigma_\ell^2 I)Z_\ell' + Z_a(\sigma_a^2 I)Z_a'\\
&= \sigma_\ell^2 Z_\ell Z_\ell' + \sigma_a^2 Z_a Z_a'\\
&= \sigma_\ell^2\, I_{4\times 4} \otimes 1_{8\times 1}1_{8\times 1}' + \sigma_a^2\, I_{16\times 16} \otimes 1_{2\times 1}1_{2\times 1}'.
\end{aligned}$$
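A numpy check of this Kronecker identity (with illustrative values of σ_ℓ² and σ_a² chosen only for the example):

```python
import numpy as np

def ones_col(k):
    return np.ones((k, 1))

Z_l = np.kron(np.eye(4), ones_col(8))    # litter design, 32 x 4
Z_a = np.kron(np.eye(16), ones_col(2))   # animal design, 32 x 16

s2_l, s2_a = 3.0, 1.5                    # illustrative variance components
lhs = s2_l * Z_l @ Z_l.T + s2_a * Z_a @ Z_a.T
rhs = (s2_l * np.kron(np.eye(4), np.ones((8, 8)))
       + s2_a * np.kron(np.eye(16), np.ones((2, 2))))
print(np.allclose(lhs, rhs))             # True
```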
We usually assume that all random effects and random errors are
mutually independent and that the errors (like the effects within each
factor) are identically distributed:
$$\begin{bmatrix} \ell\\ a\\ e \end{bmatrix} \sim N\!\left(\begin{bmatrix} 0\\ 0\\ 0 \end{bmatrix}, \begin{bmatrix} \sigma_\ell^2 I & 0 & 0\\ 0 & \sigma_a^2 I & 0\\ 0 & 0 & \sigma_e^2 I \end{bmatrix}\right).$$

The unknown variance parameters σ_ℓ², σ_a², σ_e² ∈ R⁺ are called variance components.
In this case, we have R = Var(e) = σ_e² I. Thus,

Var(y) = ZGZ′ + R = σ_ℓ² Z_ℓ Z_ℓ′ + σ_a² Z_a Z_a′ + σ_e² I.

This is a block diagonal matrix with blocks as follows. (To fit a block on one slide, abbreviate ℓ = σ_ℓ², a = σ_a², e = σ_e².)
$$\begin{bmatrix}
\ell+a+e & \ell+a & \ell & \ell & \ell & \ell & \ell & \ell\\
\ell+a & \ell+a+e & \ell & \ell & \ell & \ell & \ell & \ell\\
\ell & \ell & \ell+a+e & \ell+a & \ell & \ell & \ell & \ell\\
\ell & \ell & \ell+a & \ell+a+e & \ell & \ell & \ell & \ell\\
\ell & \ell & \ell & \ell & \ell+a+e & \ell+a & \ell & \ell\\
\ell & \ell & \ell & \ell & \ell+a & \ell+a+e & \ell & \ell\\
\ell & \ell & \ell & \ell & \ell & \ell & \ell+a+e & \ell+a\\
\ell & \ell & \ell & \ell & \ell & \ell & \ell+a & \ell+a+e
\end{bmatrix}$$
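As a final check, this Python sketch (again with illustrative variance values) builds Var(y) for Example 2 and prints its first 8 × 8 block, which matches the pattern above:

```python
import numpy as np

def ones_col(k):
    return np.ones((k, 1))

Z_l = np.kron(np.eye(4), ones_col(8))    # litter design
Z_a = np.kron(np.eye(16), ones_col(2))   # animal design

l, a, e = 3.0, 1.5, 1.0   # illustrative values of the variance components
V_y = l * Z_l @ Z_l.T + a * Z_a @ Z_a.T + e * np.eye(32)

print(V_y[:8, :8])               # diagonal l+a+e; same animal l+a; same litter l
print(np.all(V_y[:8, 8:] == 0))  # observations from different litters are uncorrelated
```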