EM algorithm for mixed effects model
A linear mixed effects model

- Consider the following linear mixed effects model
  $$Y_{ij} = \beta_0 + u_i + x_{ij}^T \beta + \varepsilon_{ij},$$
  where $\varepsilon_{ij} \overset{iid}{\sim} N(0, \sigma_e^2)$, $u_i \overset{iid}{\sim} N(0, \sigma_u^2)$, and the $\varepsilon_{ij}$ and $u_i$ are independent.
- The unknown parameters are $\delta = (\beta_0, \beta, \sigma_e^2, \sigma_u^2)$.
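As a quick illustration, here is a minimal Python sketch that simulates data from this model; all dimensions, parameter values, and array names are illustrative assumptions, not from the slides.

```python
# Minimal simulation sketch for the random-intercept model above;
# all dimensions and parameter values are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
m, n_i, p = 50, 10, 2                      # clusters, obs per cluster, covariates
beta0, beta = 1.0, np.array([0.5, -0.3])   # fixed effects beta_0 and beta
sigma_u, sigma_e = 0.8, 1.0                # SDs of u_i and eps_ij

x = rng.normal(size=(m, n_i, p))           # covariates x_ij
u = rng.normal(0.0, sigma_u, size=m)       # u_i ~ N(0, sigma_u^2)
eps = rng.normal(0.0, sigma_e, size=(m, n_i))
y = beta0 + u[:, None] + x @ beta + eps    # Y_ij = beta0 + u_i + x_ij'beta + eps_ij
```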
The joint likelihood

The joint likelihood for $\delta$ is
$$
\begin{aligned}
L(\delta) &= \prod_{i=1}^m f(y_i)
= \prod_{i=1}^m \int f(y_i, u_i)\, du_i
= \prod_{i=1}^m \int \prod_{j=1}^{n_i} f(y_{ij} \mid u_i)\, f(u_i)\, du_i \\
&= \prod_{i=1}^m \int \prod_{j=1}^{n_i} \frac{1}{\sqrt{2\pi}\sigma_e}
\exp\Big\{ -\frac{(y_{ij} - \beta_0 - u_i - x_{ij}^T\beta)^2}{2\sigma_e^2} \Big\}
\times \frac{1}{\sqrt{2\pi}\sigma_u} \exp\Big\{ -\frac{u_i^2}{2\sigma_u^2} \Big\}\, du_i.
\end{aligned}
$$
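Each cluster contributes a one-dimensional integral over $u_i$, so the likelihood can be checked numerically. A hedged sketch, reusing the simulated objects from the earlier snippet:

```python
# Sketch: evaluate one cluster's marginal density f(y_i) by numerical
# integration over u_i (reuses y, x, beta0, beta, sigma_u, sigma_e from above).
from scipy.integrate import quad
from scipy.stats import norm

def cluster_marginal(i):
    resid = y[i] - beta0 - x[i] @ beta      # y_ij - beta0 - x_ij'beta
    integrand = lambda ui: (np.prod(norm.pdf(resid - ui, scale=sigma_e))
                            * norm.pdf(ui, scale=sigma_u))
    val, _ = quad(integrand, -10 * sigma_u, 10 * sigma_u)
    return val
```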
Maximum likelihood estimators

The joint likelihood for $\delta$ can then be written as
$$
L(\delta) = \prod_{i=1}^m c_i \sqrt{2\pi a_i}
\exp\Big\{ -\frac{1}{2\sigma_e^2} \sum_{j=1}^{n_i} (y_{ij} - \beta_0 - x_{ij}^T\beta)^2 \Big\}
\exp\Big( \frac{a_i b_i^2}{2} \Big),
$$
where
$$
c_i = \Big(\frac{1}{\sqrt{2\pi}\sigma_e}\Big)^{n_i} \frac{1}{\sqrt{2\pi}\sigma_u}, \qquad
a_i^{-1} = \frac{n_i}{\sigma_e^2} + \frac{1}{\sigma_u^2}, \qquad
b_i = \frac{1}{\sigma_e^2} \sum_{j=1}^{n_i} (y_{ij} - \beta_0 - x_{ij}^T\beta).
$$
Thus, the maximum likelihood estimator of $\delta$ is
$$\hat\delta = \arg\max_\delta L(\delta).$$
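Since the integral has this closed form, the per-cluster marginal can also be computed directly. A sketch, with the same assumed objects as before; it should agree with the quadrature version above:

```python
# Sketch: closed-form marginal f(y_i) via c_i, a_i, b_i from this slide.
def cluster_marginal_closed_form(i):
    resid = y[i] - beta0 - x[i] @ beta
    c_i = (2 * np.pi * sigma_e**2) ** (-n_i / 2) / np.sqrt(2 * np.pi * sigma_u**2)
    a_i = 1.0 / (n_i / sigma_e**2 + 1.0 / sigma_u**2)
    b_i = resid.sum() / sigma_e**2
    return (c_i * np.sqrt(2 * np.pi * a_i)
            * np.exp(-0.5 * (resid**2).sum() / sigma_e**2 + a_i * b_i**2 / 2))

# e.g. np.isclose(cluster_marginal(0), cluster_marginal_closed_form(0)) -> True
```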
EM algorithm

For the above linear mixed effects model,

- Complete data: $(y_i, u_i)$;
- Observed data: $y_i$.

The log-likelihood for the complete data is
$$
\ell(\delta; y, u) = -\frac{1}{2}\Big(\sum_{i=1}^m n_i\Big)\log(2\pi\sigma_e^2)
- \frac{m}{2}\log(\sigma_u^2)
- \frac{1}{2\sigma_e^2}\sum_{i=1}^m \sum_{j=1}^{n_i} (y_{ij} - \beta_0 - u_i - x_{ij}^T\beta)^2
- \frac{1}{2\sigma_u^2}\sum_{i=1}^m u_i^2.
$$
EM algorithm

Step 1: Start with some initial values $\sigma_u^{2(0)}$, $\sigma_e^{2(0)}$, $\beta_0^{(0)}$ and $\beta^{(0)}$.
Step 2: (E-step) Evaluate the expectation of the complete-data log-likelihood given the observed data and the estimate from the last iteration, namely
$$Q(\delta, \delta^{(k-1)}) = E[\ell(\delta; y, u) \mid y;\, \delta^{(k-1)}].$$
Step 3: (M-step) Update $\delta$ by
$$\delta^{(k)} = \arg\max_\delta Q(\delta, \delta^{(k-1)}).$$
Step 4: Repeat Steps 2 and 3 until convergence.
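In Python-flavored pseudocode, the scheme is just a fixed-point loop; `Q_max` below is a hypothetical placeholder for whatever routine maximizes $Q(\cdot, \delta^{(k-1)})$:

```python
# Generic EM skeleton (illustrative): Q_max(delta_prev) performs the E-step
# and M-step together, returning argmax_delta Q(delta, delta_prev).
def em(delta0, Q_max, tol=1e-8, max_iter=500):
    delta = np.asarray(delta0, dtype=float)
    for _ in range(max_iter):
        delta_new = Q_max(delta)                     # one EM update
        if np.max(np.abs(delta_new - delta)) < tol:  # simple convergence check
            return delta_new
        delta = delta_new
    return delta
```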
M-step when $\sigma_u^2$ and $\sigma_e^2$ are known

If $\sigma_u^2$ and $\sigma_e^2$ are known, let $\delta = (\beta_0, \beta)^T$; the M-step then amounts to solving the estimating equations
$$
\frac{\partial Q(\delta, \delta^{(k-1)})}{\partial \delta}
= E\Big[\sum_{i=1}^m \sum_{j=1}^{n_i} \binom{1}{x_{ij}} (y_{ij} - \beta_0 - u_i - x_{ij}^T\beta) \,\Big|\, y;\, \delta^{(k-1)}\Big] = 0.
$$
Therefore, we can update the parameters $(\beta_0, \beta)^T$ by
$$
\binom{\beta_0^{(k)}}{\beta^{(k)}}
= \Big[\sum_{i=1}^m \sum_{j=1}^{n_i} \binom{1}{x_{ij}} (1, x_{ij}^T)\Big]^{-1}
\sum_{i=1}^m \sum_{j=1}^{n_i} \binom{1}{x_{ij}} \{y_{ij} - E(u_i \mid y;\, \delta^{(k-1)})\}.
$$
M-step
Because we know the joint distribution of ui and yi is normal,
specifically



 

ui
0
σu2
σu2 1T

 ∼ N 
,
 ,
yi
β0 1 + Xi β
σu2 1 σe2 Ini + σu2 11T
where Xi = (Xi1 , · · · , Xini )T is an ni × p matrix, it can be shown
that
(k−1)
E(ui |y; δ (k−1) ) = σu2 1T (σe2 Ini +σu2 11T )−1 (yi −β0
1−Xi β (k−1) ).
8/15
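A sketch of this conditional mean, using the simulated arrays from earlier; the matrix solve also collapses to a scalar formula, noted in the comment:

```python
# Sketch: E(u_i | y) from the normal conditional-mean formula above.
def cond_mean_u(i, b0, b):
    resid = y[i] - b0 - x[i] @ b            # y_i - beta0*1 - X_i beta
    ones = np.ones(n_i)
    V = sigma_e**2 * np.eye(n_i) + sigma_u**2 * np.outer(ones, ones)
    return sigma_u**2 * ones @ np.linalg.solve(V, resid)
    # equivalently: sigma_u**2 * resid.sum() / (sigma_e**2 + n_i * sigma_u**2)
```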
EM algorithm when $\sigma_u^2$ and $\sigma_e^2$ are known

Step 1: Start with some initial values $\beta_0^{(0)}$ and $\beta^{(0)}$.
Step 2: At the $k$-th iteration, update $\beta_0, \beta$ by
$$
\binom{\beta_0^{(k)}}{\beta^{(k)}}
= \Big[\sum_{i=1}^m \sum_{j=1}^{n_i} \binom{1}{x_{ij}} (1, x_{ij}^T)\Big]^{-1}
\sum_{i=1}^m \sum_{j=1}^{n_i} \binom{1}{x_{ij}} \{y_{ij} - E(u_i \mid y;\, \delta^{(k-1)})\},
$$
where
$$
E(u_i \mid y;\, \delta^{(k-1)}) = \sigma_u^2 \mathbf{1}^T (\sigma_e^2 I_{n_i} + \sigma_u^2 \mathbf{1}\mathbf{1}^T)^{-1} (y_i - \beta_0^{(k-1)}\mathbf{1} - X_i\beta^{(k-1)}).
$$
Step 3: Repeat Step 2 until convergence.
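Putting the pieces together, a hedged sketch of the whole iteration, reusing `y`, `x`, and `cond_mean_u` from the snippets above; starting values and tolerances are arbitrary choices:

```python
# Sketch: full EM loop for (beta0, beta) when sigma_u^2, sigma_e^2 are known.
X_aug = np.concatenate([np.ones((m, n_i, 1)), x], axis=2).reshape(-1, p + 1)
XtX_inv = np.linalg.inv(X_aug.T @ X_aug)       # [sum_ij (1,x_ij)'(1,x_ij)]^{-1}

b0, b = 0.0, np.zeros(p)
for _ in range(500):
    u_hat = np.array([cond_mean_u(i, b0, b) for i in range(m)])  # E-step
    y_adj = (y - u_hat[:, None]).reshape(-1)   # y_ij - E(u_i | y; delta^(k-1))
    coef = XtX_inv @ (X_aug.T @ y_adj)         # M-step: least-squares update
    if np.max(np.abs(coef - np.r_[b0, b])) < 1e-10:
        break
    b0, b = coef[0], coef[1:]
```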
Logistic regression model with random intercept

Consider the following logistic regression model with a random intercept:
$$
Y_{ij} \mid U_i \overset{\text{independent}}{\sim} \text{Bernoulli}(p_{ij}); \qquad
\log\frac{p_{ij}}{1-p_{ij}} = \gamma_i + x_{ij}^T\beta,
$$
where $\gamma_i = \beta_0 + U_i$ and $U_i \overset{iid}{\sim} N(0, \sigma_u^2)$ for $i = 1, \cdots, m$ and $j = 1, \cdots, n_i$.
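For concreteness, a minimal simulation from this model, reusing the illustrative `x`, `beta0`, `beta`, and `sigma_u` from the first snippet:

```python
# Sketch: simulate binary responses from the random-intercept logistic model.
rng = np.random.default_rng(2)
U = rng.normal(0.0, sigma_u, size=m)                  # U_i ~ N(0, sigma_u^2)
eta = beta0 + U[:, None] + x @ beta                   # gamma_i + x_ij'beta
y_bin = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta)))   # Y_ij | U_i ~ Bernoulli(p_ij)
```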
Likelihood function and MLE

The likelihood function for $\delta = (\beta_0, \beta^T, \sigma_u^2)^T$ is
$$
L(\delta) = \prod_{i=1}^m \int \prod_{j=1}^{n_i}
\exp(\beta_0 + u_i + x_{ij}^T\beta)^{y_{ij}} \big(1 + \exp(\beta_0 + u_i + x_{ij}^T\beta)\big)^{-1}
\times \frac{1}{\sqrt{2\pi}\sigma_u} \exp\Big(-\frac{u_i^2}{2\sigma_u^2}\Big)\, du_i.
$$
Therefore, the maximum likelihood estimator for $\delta$ is
$$\hat\delta = \arg\max_\delta L(\delta).$$
In many cases the likelihood function $L(\delta)$ does not have a closed form, and numerical methods are required to obtain the MLE.
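One standard numerical choice is Gauss-Hermite quadrature, since each cluster contributes a one-dimensional integral against a normal density. A sketch, with illustrative function arguments:

```python
# Sketch: per-cluster likelihood by Gauss-Hermite quadrature (probabilists'
# weights), integrating f(y_i | u) against the N(0, sigma_u^2) density.
nodes, weights = np.polynomial.hermite_e.hermegauss(30)

def cluster_likelihood_logistic(yi, Xi, b0, b, s_u):
    eta = b0 + Xi @ b                                # linear predictor without u
    total = 0.0
    for z, w in zip(nodes, weights):
        pz = 1.0 / (1.0 + np.exp(-(eta + s_u * z)))  # p_ij at u = s_u * z
        total += w * np.prod(pz**yi * (1 - pz)**(1 - yi))
    return total / np.sqrt(2 * np.pi)                # normalize e^{-z^2/2} weight
```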
EM algorithm when $\sigma_u^2$ is known

Step 1: Start with some initial values $\beta_0^{(0)}$ and $\beta^{(0)}$.
Step 2: At the $k$-th iteration, update $\beta_0, \beta$ by solving the estimating equations
$$
\sum_{i=1}^m \sum_{j=1}^{n_i} \binom{1}{x_{ij}} \{y_{ij} - E(p_{ij}(U_i) \mid y;\, \delta^{(k-1)})\} = 0,
$$
where
$$
p_{ij}(u_i) = \frac{\exp(\beta_0 + u_i + x_{ij}^T\beta)}{1 + \exp(\beta_0 + u_i + x_{ij}^T\beta)}.
$$
Step 3: Repeat Step 2 until convergence.
EM algorithm
By the definition of pij (Ui ), in the above algorithm, we need to
evaluate the following
E(pij (Ui )|yi ; δ
(k−1)
Z
)=
exp(β0 + ui + xijT β)
1 + exp(β0 + ui +
xijT β)
f (ui |yi ; δ (k −1) ).
In practice, the above intergration could be evaluated by Monte
Carlo simulation. But it might be still challenging. An alternative
approach is to replace E(pij (Ui )|yi ; δ (k−1) ) by pij (Ûi ) where Ûi is
the BLUP of Ui . This would avoid the use of integration and
results in an algorithm that is similar to IRWLS method.
13/15
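For the Monte Carlo route, one simple (though not necessarily efficient) option is a self-normalized estimator that draws $U_i$ from its prior; the proposal choice and argument names below are assumptions for illustration:

```python
# Sketch: self-normalized Monte Carlo estimate of E(p_ij(U_i) | y_i), with
# U_i drawn from its N(0, sigma_u^2) prior as the proposal distribution.
def mc_cond_p(yi, Xi, b0, b, s_u, n_draws=5000, seed=1):
    rng = np.random.default_rng(seed)
    eta = b0 + Xi @ b
    us = rng.normal(0.0, s_u, size=n_draws)
    p = 1.0 / (1.0 + np.exp(-(eta[None, :] + us[:, None])))  # (draws, n_i)
    lik = np.prod(p**yi * (1 - p)**(1 - yi), axis=1)         # f(y_i | u_s) weights
    return (lik[:, None] * p).sum(axis=0) / lik.sum()        # E(p_ij | y_i)
```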
EM algorithm

- Let $Z_{ij}$ be the surrogate response and $Z_i = (Z_{i1}, \cdots, Z_{in_i})^T$, where
  $$Z_{ij} = \log\Big(\frac{p_{ij}}{1-p_{ij}}\Big) + (y_{ij} - p_{ij})\, \frac{1}{p_{ij}(1-p_{ij})}.$$
- Let $V_i = Q_i + \sigma_u^2 \mathbf{1}\mathbf{1}^T$, where $Q_i = \mathrm{Diag}\Big(\frac{1}{p_{ij}(1-p_{ij})}\Big)$.
EM algorithm

Step 1: Start with some initial values $\beta_0^{(0)}$, $\beta^{(0)}$ and $\sigma_u^{2(0)}$.
Step 2: At the $k$-th iteration, update $\beta_0, \beta$ by
$$
\binom{\beta_0^{(k)}}{\beta^{(k)}}
= \Big(\sum_{i=1}^m X_i^T \{V_i^{(k-1)}\}^{-1} X_i\Big)^{-1} \sum_{i=1}^m X_i^T \{V_i^{(k-1)}\}^{-1} Z_i,
$$
and update $\sigma_u^2$ by
$$
\sigma_u^{2(k)} = \frac{1}{m} \sum_{i=1}^m \Big\{ \hat U_i^2 + \Big(\sum_{j=1}^{n_i} p_{ij}(1-p_{ij}) + \hat\sigma_u^{-2(k-1)}\Big)^{-1} \Big\},
$$
where $\hat U_i = \sigma_u^{2(k-1)} \mathbf{1}^T V_i^{-1} (Z_i - X_i\beta^{(k-1)})$. Here $X_i$ denotes the $n_i \times (p+1)$ matrix whose $j$-th row is $(1, x_{ij}^T)$, so that $\beta_0$ and $\beta$ are updated jointly.
Step 3: Repeat Step 2 until convergence.
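A hedged sketch of one pass of Step 2, under our reading of the update above; `X` includes the leading column of ones, and all names are illustrative:

```python
# Sketch: one iteration of the surrogate-response EM update from this slide.
def one_iteration(y, X, beta_full, s_u2):
    # y: (m, n_i) binary; X: (m, n_i, p+1) with leading 1s; beta_full = (b0, b)
    m = y.shape[0]
    A = 0.0; c = 0.0; u_hat = np.empty(m); var_sum = 0.0
    for i in range(m):
        eta = X[i] @ beta_full
        pi = 1.0 / (1.0 + np.exp(-eta))
        Z = eta + (y[i] - pi) / (pi * (1 - pi))       # surrogate response Z_ij
        V = np.diag(1.0 / (pi * (1 - pi))) + s_u2     # V_i = Q_i + s_u2 * 11'
        A += X[i].T @ np.linalg.solve(V, X[i])        # sum X' V^{-1} X
        c += X[i].T @ np.linalg.solve(V, Z)           # sum X' V^{-1} Z
        u_hat[i] = s_u2 * np.linalg.solve(V, Z - eta).sum()    # BLUP U-hat_i
        var_sum += 1.0 / ((pi * (1 - pi)).sum() + 1.0 / s_u2)  # cond. variance
    beta_new = np.linalg.solve(A, c)                   # update (beta0, beta)
    sigma_u2_new = (u_hat**2).sum() / m + var_sum / m  # update sigma_u^2
    return beta_new, sigma_u2_new
```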