Fractional Imputation for Longitudinal Data with Nonignorable Missing
Shu Yang
Department of Statistics, Iowa State University
February 17, 2012
Outline
- Introduction
  - EM algorithm
  - MCEM algorithm
- Proposed method: Parametric Fractional Imputation
- Approximation: Calibration Fractional Imputation
- Simulation study and results
Introduction

Longitudinal studies are designed to investigate changes over time in a characteristic measured repeatedly for each individual.
- E.g., in medical studies, blood pressure is obtained from each individual at different time points.

A mixed model can be used, since individual effects can be modeled by the inclusion of random effects:

    y_ij = β x_ij + b_i + e_ij,  i = 1, ..., n,  j = 1, ..., m    (1)

- i indexes the individual and j indexes the repeated measurement on individual i.
- b_i ~ iid N(0, τ²), specifying individual effects.
- e_ij ~ iid N(0, σ²), specifying measurement error.
Introduction

Missing data occur frequently and destroy the representativeness of the remaining sample.

Assumptions about the missing mechanism:
- Missing at random (MAR), or ignorable: missingness is unrelated to the unobserved values.
- Not missing at random (NMAR), or nonignorable: missingness depends on the missing values.
- A special case of the nonignorable missing model:

    Pr(r_ij = 1 | x_ij, y_ij) = p(x_ij, b_i; φ)

Solution: imputation assigns values to the missing responses to produce a complete data set.

Pros:
- Facilitates analyses using complete-data analysis methods.
- Ensures different analyses are consistent with one another.
- Reduces nonresponse bias.
Basic Setup

Data model: y_ij = β x_ij + b_i + e_ij,  i = 1, ..., n,  j = 1, ..., m
- x_ij: always observed; y_ij: subject to missingness (scalar); b_i: unobserved.

    r_ij = 1 if y_ij is observed, 0 if y_ij is missing.

Response mechanism:

    r_ij | (x_ij, y_ij) ~ independent Bernoulli(p_ij),  p_ij = p(x_ij, b_i; φ), known up to φ.

Two types of parameters of interest:
1. γ = (β, τ², σ², φ): the parameters in the model.
2. η: the solution to E{U(x, Y; η)} = 0.
   - E.g. 1 (population mean): U(x, Y; η) = (1/nm) Σ_{i,j} (y_ij − η).
   - E.g. 2 (proportion): U(x, Y; η) = (1/nm) Σ_{i,j} {I(y_ij ≤ 2) − η}.
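With complete data, the two example estimating equations have closed-form solutions. A minimal sketch, using assumed toy data rather than the model in the slides:

```python
import numpy as np

# Toy illustration of the two estimating equations: the population mean
# and the proportion are the values of η solving (1/nm) Σ_ij U(x, y_ij; η) = 0.
rng = np.random.default_rng(1)
n, m = 10, 15
y = rng.normal(2.5, 1.0, size=(n, m))     # assumed toy complete data

eta_mean = y.mean()          # solves (1/nm) Σ_ij (y_ij − η) = 0
eta_prop = (y <= 2).mean()   # solves (1/nm) Σ_ij {I(y_ij ≤ 2) − η} = 0
print(eta_mean, eta_prop)
```

When some y_ij are missing, neither solution can be computed directly, which is what motivates the imputation methods that follow.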
Introduction (MLE)

For the "complete" data for individual i: (y_i, b_i, r_i),

    p(y_i, b_i, r_i | β, σ², τ², φ)
      = p(y_i | β, σ², b_i) p(r_i | b_i, φ) p(b_i | τ²)
      = [ Π_{j=1}^m p(y_ij | β, σ², b_i) p(r_ij | b_i, φ) ] p(b_i | τ²)

    l(γ) = Σ_{i=1}^n [ Σ_{j=1}^m log p(y_ij | β, σ², b_i) + Σ_{j=1}^m log p(r_ij | b_i, φ) + log p(b_i | τ²) ]
         = l_1(β, σ²) + l_2(φ) + l_3(τ²)

MLEs of (β, σ²), φ, and τ² can be obtained separately by solving the corresponding score equations:
- S_1(β, σ²) = ∂ l_1(β, σ²) / ∂(β, σ²) = 0
- S_2(φ) = l_2'(φ) = 0
- S_3(τ²) = l_3'(τ²) = 0
Introduction (EM algorithm)

The expectation-maximization (EM) algorithm is an iterative method for finding MLEs of parameters.

Denote the data by (X, Z): X is the observed part of the data, Z is the missing part.

Expectation step (E-step): calculate the expected value of the log-likelihood with respect to the conditional distribution of Z given X under the current parameter estimate γ^(t):

    Q(γ | γ^(t)) = E_{Z|X, γ^(t)} [log L(γ; X, Z)]

or

    S̄(γ | γ^(t)) = E_{Z|X, γ^(t)} [S(γ; X, Z)],  where S(γ; X, Z) = ∂ log L(γ; X, Z) / ∂γ.

Maximization step (M-step): find the parameter value that maximizes this quantity:

    γ^(t+1) = arg max_γ Q(γ | γ^(t))

or take γ^(t+1) as the solution to S̄(γ | γ^(t)) = 0.
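As a generic illustration of the E- and M-steps (an assumed toy problem, not the longitudinal model in these slides), here is EM for a two-component Gaussian mixture with unknown means and unit variances:

```python
import numpy as np

# Toy EM: two-component Gaussian mixture, means unknown, variances fixed at 1.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 200)])

mu = np.array([-1.0, 1.0])          # initial guess mu^(0)
for t in range(50):
    # E-step: responsibilities under the current parameter value
    d = np.exp(-0.5 * (x[:, None] - mu) ** 2)   # ∝ N(x | mu_k, 1)
    r = d / d.sum(axis=1, keepdims=True)
    # M-step: maximize the expected complete-data log-likelihood
    mu = (r * x[:, None]).sum(axis=0) / r.sum(axis=0)

print(mu)  # component means, roughly (-2, 3)
```

The responsibilities play the role of the conditional distribution of Z given X; the weighted mean is the closed-form M-step for this toy model.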
Introduction (EM algorithm)

Apply the EM algorithm. Denote y_i = (y_mis,i, y_obs,i). Since b_i is unobserved, we view it as missing data as well:

    X = (y_obs,i),  Z = (y_mis,i, b_i)

E-step: obtain the conditional mean score equation under the current parameter value (an intractable, high-dimensional integral):

    S̄(γ | γ^(t)) = E_{y_mis, b} [ S_1(β, σ²) + S_2(φ) + S_3(τ²) | y_obs, r, γ^(t) ]

M-step: find the solution to the conditional mean score equations. (No problem.)
Introduction (MC Approximation by Imputation)

Computing the conditional expectation can be a challenging problem. A Monte Carlo approximation of the conditional expectation uses imputation and the strong law of large numbers:

    E{ S_1ij(β, σ²) | y_obs,i, r_i, γ̂ } ≈ (1/M) Σ_{k=1}^M S_1ij^{*(k)}(β, σ²)

    S_1ij^{*(k)}(β, σ²) = ∂ log p(y_ij^{*(k)} | β, σ², b_i^{*(k)}) / ∂(β, σ²)

where (y_ij^{*(k)}, b_i^{*(k)}) ~ p(y_ij, b_i | y_obs,i, r_i, γ̂) and γ̂ is a consistent estimator of γ.
Introduction (MCEM)

Monte Carlo EM (MCEM) algorithm:

Step 1. Using rejection sampling or importance sampling, generate M values for each missing value:

    (y_ij^{*(k)}, b_i^{*(k)}) ~ p(y_ij, b_i | y_obs,i, r_i, γ^(t)),  k = 1, ..., M

Step 2. (E-step) Compute the imputed score functions:

    (1/M) Σ_{k=1}^M S_1ij^{*(k)}(β, σ²) = 0,  (1/M) Σ_{k=1}^M S_2ij^{*(k)}(φ) = 0,  (1/M) Σ_{k=1}^M S_3ij^{*(k)}(τ²) = 0

(M-step) Solve these to update γ̂^(t+1).

MCEM regenerates the imputed values at each EM iteration, so the computation is quite heavy. Moreover, convergence of the MCEM sequence is not guaranteed for fixed M.
Parametric Fractional Imputation (PFI)

Main idea:
- MCEM assigns equal weight 1/M to each imputed value and regenerates the imputed values at each iteration t.
- PFI generates the imputed values once, using importance sampling, and adjusts the weights at each iteration t.

M imputed values for the two missing components: b_i^{*(k)} ~ h_1(·) and y_ij^{*(k)} ~ h_2(· | b_i^{*(k)}).

Create a weighted data set for the missing data:

    { (w_ij1^{*(k)}, b_i^{*(k)}) : k = 1, 2, ..., M; i = 1, ..., n }
    { (w_ij2^{*(k)}, y_ij^{*(k)}) : k = 1, 2, ..., M; j ∈ NR }

with

    Σ_{k=1}^M w_ij1^{*(k)} = 1,  w_ij1^{*(k)} ∝ f(b_i^{*(k)} | y_obs,i, γ̂) / h_1(b_i^{*(k)})
    Σ_{k=1}^M w_ij2^{*(k)} = 1,  w_ij2^{*(k)} ∝ f(y_ij^{*(k)} | b_i^{*(k)}, y_obs,i, γ̂) / { h_2(y_ij^{*(k)}) h_1(b_i^{*(k)}) }

where γ̂ is the MLE of γ.

The weights w_ij1^{*(k)}, w_ij2^{*(k)} are the normalized importance weights and can be called fractional weights.
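A minimal sketch of normalized importance ("fractional") weights, with assumed toy densities (target f = N(0, 1), proposal h = N(0, 2²)) rather than the model densities in the slides:

```python
import numpy as np

# Draws are generated ONCE from the proposal h; the fractional weights
# are proportional to target density over proposal density, normalized
# to sum to one.
rng = np.random.default_rng(0)
M = 10_000
b = rng.normal(0.0, 2.0, size=M)                 # b^{*(k)} ~ h

def npdf(x, mu, sd):
    """Normal density N(mu, sd^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

w = npdf(b, 0.0, 1.0) / npdf(b, 0.0, 2.0)        # w ∝ f / h
w /= w.sum()                                     # normalize: Σ_k w^{(k)} = 1

# Weighted moments approximate moments under the target f
print(np.sum(w * b), np.sum(w * b ** 2))
```

The weighted first and second moments approximate 0 and 1, the moments of the target density, even though the draws came from the wider proposal.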
EM algorithm by fractional imputation

The target distribution: p(y_mis,ij, b_i | y_obs,i, r_i, γ^(t)).

We need to simulate the missing data from the following densities:

    p(b_i | y_obs,i, r_i, γ^(t)) ∝ p(b_i, y_obs,i, r_i | γ^(t))
                                 ∝ p(y_obs,i | b_i, γ^(t)) p(r_i | b_i, γ^(t)) p(b_i | γ^(t))
                                 = [ Π_{j∈AR_i} p(y_ij | b_i, γ^(t)) ] [ Π_{j=1}^m p(r_ij | b_i, γ^(t)) ] p(b_i | γ^(t))

    p(y_mis,ij | y_obs,i, r_i, b_i, γ^(t)) ∝ p(y_mis,ij, y_obs,i, r_i | b_i, γ^(t))
                                           ∝ p(y_mis,ij | b_i, γ^(t)) p(y_obs,i | b_i, γ^(t)) p(r_i | b_i, γ^(t))
                                           = p(y_mis,ij | b_i, γ^(t)) [ Π_{j∈AR_i} p(y_ij | b_i, γ^(t)) ] [ Π_{j=1}^m p(r_ij | b_i, γ^(t)) ]
EM algorithm by fractional imputation

Step 1 (Importance sampling). For k = 1, ..., M, where M is the Monte Carlo size:
1. For i = 1, ..., n, draw b_i^{*(k)} ~ h_1(·).
2. For j ∈ AM_i, draw y_mis,ij^{*(k)} ~ h_2(·).

Step 2 (E-step). Calculate the weights under the current parameter value:

    w_ij(t)^{*(k)} ∝ p(y_obs,i | b_i^{*(k)}, γ^(t)) p(r_i | b_i^{*(k)}, γ^(t)) p(b_i^{*(k)} | γ^(t)) / h_1(b_i^{*(k)}),
    Σ_{k=1}^M w_ij(t)^{*(k)} = 1, ∀ i, j

(M-step) Solve

    Σ_{i=1}^n Σ_{j=1}^m Σ_{k=1}^M w_ij(t)^{*(k)} S_2(φ | r_ij, b_i^{*(k)}, γ^(t)) = 0
    Σ_{i=1}^n Σ_{j=1}^m Σ_{k=1}^M w_ij(t)^{*(k)} S_3(τ² | b_i^{*(k)}, γ^(t)) = 0
EM algorithm by fractional imputation

Continued (E-step). Calculate the weights and update the parameter value:

    w_ij2(t)^{*(k)} ∝ p(y_mis,ij^{*(k)} | b_i^{*(k)}, γ^(t)) p(y_obs,i | b_i^{*(k)}, γ^(t)) p(r_i | b_i^{*(k)}, γ^(t)) p(b_i^{*(k)} | γ^(t))
                      / { h_1(b_i^{*(k)}) h_2(y_mis,ij^{*(k)}) },
    Σ_{k=1}^M w_ij2(t)^{*(k)} = 1, ∀ i, j

(M-step) Solve

    Σ_{i=1}^n Σ_{j=1}^m Σ_{k=1}^M w_ij2(t)^{*(k)} S_1(β, σ² | y_mis,ij^{*(k)}, b_i^{*(k)}, γ^(t)) = 0

Repeat until convergence; this yields γ̂.
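The "generate once, reweight each iteration" idea can be sketched on an assumed toy model (y_i = b_i + e_i with σ² known and τ² to be estimated — not the slides' full longitudinal model): the b_i^{*(k)} are drawn from the proposal a single time, and each EM iteration only recomputes the fractional weights and updates τ².

```python
import numpy as np

# Toy EM by fractional imputation: estimate tau^2 in y_i = b_i + e_i,
# b_i ~ N(0, tau^2), e_i ~ N(0, sigma^2) with sigma^2 known.
rng = np.random.default_rng(0)
n, M = 500, 2000
tau2_true, sigma2 = 0.5, 0.1
y = rng.normal(0, np.sqrt(tau2_true + sigma2), size=n)

# Step 1: draw b^{*(k)} ONCE from a proposal h = N(0, 2^2)
b = rng.normal(0, 2.0, size=(n, M))
h = np.exp(-b ** 2 / 8) / np.sqrt(8 * np.pi)          # proposal density h(b)
lik = np.exp(-(y[:, None] - b) ** 2 / (2 * sigma2))   # ∝ p(y_i | b)

tau2 = 1.0                                            # tau^2 at t = 0
for t in range(50):
    # E-step: fractional weights under the CURRENT tau^2
    prior = np.exp(-b ** 2 / (2 * tau2)) / np.sqrt(2 * np.pi * tau2)
    w = lik * prior / h                               # w ∝ p(y|b) p(b; tau^2) / h
    w /= w.sum(axis=1, keepdims=True)                 # Σ_k w_i^{(k)} = 1
    # M-step: solve the weighted score equation for tau^2
    tau2 = np.mean(np.sum(w * b ** 2, axis=1))

print(tau2)  # roughly tau2_true = 0.5
```

Only the weight vector changes across iterations; no resampling is needed, which is the computational advantage over MCEM.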
Fractional Imputation

Property: the imputed values are generated only once; only the weights are changed at each EM iteration. This is computationally efficient.

The EM sequence {γ̂^(t); t = 1, 2, ...} converges to a stationary point γ̂*. For sufficiently large M, γ̂* is asymptotically equivalent to the MLE:

    Σ_{k=1}^M w_ij^{*(k)} S(x_ij, y_ij^{*(k)}; γ̂)
      ≈ ∫ S(x_ij, y_ij; γ̂) { f(y_ij | x_ij, r_ij = 0; γ̂) / q(y_ij | x_ij) } q(y_ij | x_ij) dy_ij
        / ∫ { f(y_ij | x_ij, r_ij = 0; γ̂) / q(y_ij | x_ij) } q(y_ij | x_ij) dy_ij
      = E{ S(x_ij, y_ij; γ̂) | x_ij, r_ij = 0; γ̂ }
Fractional Imputation

Using the final fractional weights after convergence, we can estimate η by solving the fractionally imputed estimating equation

    Ū*(η) = (1/nm) Σ_{i,j} Σ_{k=1}^M w_ij^{*(k)} U(x_ij, y_ij^{*(k)}; η) = 0

Ū*(η) converges to Ū(η | γ̂) = E{ U(X, Y; η) | X, y_obs, r; γ̂ } almost everywhere for sufficiently large M.

η̂* is consistent, and a central limit theorem can be obtained for it.
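For mean-type parameters the fractionally imputed estimating equation has a closed-form solution. A minimal sketch with assumed toy imputations and weights (not fitted from the slides' model):

```python
import numpy as np

# Solve Ū*(η) = (1/nm) Σ_ij Σ_k w_ij^{(k)} U(x_ij, y_ij^{(k)}; η) = 0
# for the marginal mean (U = y − η) and a proportion (U = I(y ≤ 2) − η).
rng = np.random.default_rng(0)
nm, M = 150, 10
y_imp = rng.normal(2.5, 1.0, size=(nm, M))       # toy imputed y_ij^{*(k)}
w = rng.random((nm, M))
w /= w.sum(axis=1, keepdims=True)                # Σ_k w_ij^{(k)} = 1

eta_mean = (w * y_imp).sum() / nm                # weighted mean
eta_prop = (w * (y_imp <= 2)).sum() / nm         # weighted proportion
print(eta_mean, eta_prop)
```

In practice each cell with an observed y_ij contributes its observed value with weight one; only missing cells carry M fractionally weighted imputations.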
Calibration

The proposed method is based on large-sample theory for Monte Carlo sampling, so it requires M to be large. For public-use files, a small or moderate M is preferred.

To reduce the Monte Carlo sampling error, consider a calibration weighting technique from survey sampling.

Suppose we have an initial imputation set of size M1 = 100: {y_ij^{*(k)}, w_ij0^{*(k)}} (the "population").

Sample a small subset from it, {y_ij^{*(k)}, w_ij^{*(k)}}, k = 1, ..., M = 10, and adjust the weights such that
1. Σ_{k=1}^M w_ij^{*(k)} = 1, ∀ i, j
2. Σ_{i,j,k} w_ij^{*(k)} S_1(β, σ² | y_ij^{*(k)}, b_i^{*(k)}, γ̂) = 0

Using the regression weighting technique, the weights can be calculated as

    w_ij^{*(k)} = w_ij0^{*(k)} − ( Σ_{i,j} s̄_ij* )^T { Σ_{i,j,k} w_ij0^{*(k)} (s_ij^{*(k)} − s̄_ij*)^⊗2 }^{−1} w_ij0^{*(k)} (s_ij^{*(k)} − s̄_ij*)

where s_ij^{*(k)} = S_1(β, σ² | y_ij^{*(k)}, b_i^{*(k)}, γ̂) and s̄_ij* = Σ_{k=1}^M w_ij0^{*(k)} s_ij^{*(k)}.
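A minimal sketch of the regression weighting adjustment for a single cell with scalar scores (assumed toy values, not the slides' S_1 score functions): the adjusted weights keep summing to one while calibrating the weighted score to zero.

```python
import numpy as np

# Regression calibration: adjust initial weights w0 so that the weighted
# score Σ_k w^{(k)} s^{(k)} is pulled exactly to 0 while Σ_k w^{(k)} = 1.
rng = np.random.default_rng(0)
M = 10
w0 = np.full(M, 1.0 / M)                 # initial fractional weights
s = rng.normal(0.5, 1.0, size=M)         # toy scores s^{(k)}

s_bar = np.sum(w0 * s)                   # s̄ = Σ_k w0 s
var = np.sum(w0 * (s - s_bar) ** 2)      # Σ_k w0 (s − s̄)²
# Regression-adjusted weights: w = w0 − s̄ · var⁻¹ · w0 (s − s̄)
w = w0 - s_bar / var * w0 * (s - s_bar)

print(np.sum(w))        # exactly 1 (the adjustment sums to zero)
print(np.sum(w * s))    # exactly 0 (calibration constraint)
```

Both constraints hold exactly by construction: the adjustment term has weighted sum zero, and its weighted cross-product with s equals s̄.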
Simulation Study

B = 2000 Monte Carlo samples of size n × m = 10 × 15 = 150 were generated from

    y_ij = β_0 + β_1 x_ij + b_i + e_ij

where x_ij = j/m, b_i ~ iid N(0, τ²), e_ij ~ iid N(0, σ²), with β_0 = 2, β_1 = 1, σ² = 0.1, τ² = 0.5, and

    r_ij ~ Bernoulli(π_ij),  logit(π_ij) = φ_0 + φ_1 b_i

with φ_0 = 0, φ_1 = 1. Under this model setup, the average response rate is about 50%.
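One simulated sample under this setup can be sketched as follows (a single replicate; the study repeats this B = 2000 times):

```python
import numpy as np

# y_ij = β0 + β1 x_ij + b_i + e_ij, with response indicators
# r_ij ~ Bernoulli(π_ij), logit(π_ij) = φ0 + φ1 b_i.
rng = np.random.default_rng(0)
n, m = 10, 15
beta0, beta1, sigma2, tau2 = 2.0, 1.0, 0.1, 0.5
phi0, phi1 = 0.0, 1.0

x = np.tile(np.arange(1, m + 1) / m, (n, 1))        # x_ij = j/m
b = rng.normal(0, np.sqrt(tau2), size=(n, 1))       # random effects b_i
y = beta0 + beta1 * x + b + rng.normal(0, np.sqrt(sigma2), size=(n, m))

pi = 1 / (1 + np.exp(-(phi0 + phi1 * b)))           # response probabilities
r = rng.random((n, m)) < pi                         # r_ij ~ Bernoulli(π_ij)
print(r.mean())  # average response rate, around 0.5
```

Since the missingness depends on the unobserved b_i (and b_i shifts y), the mechanism is nonignorable: respondents tend to have larger b_i and hence larger y.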
Simulation study: simulation setup

The following parameters are computed:
1. β1, τ², σ²: the slope and the variance components in the linear mixed-effects model.
2. µy: the marginal mean of y.
3. The proportion Pr(Y < 2).

For each parameter, we compute the following estimators:
1. The complete-sample estimator (for comparison).
2. The incomplete-sample estimator (using the respondents only).
3. Parametric fractional imputation (PFI) with M = 100 and M = 10.
4. The CFI method with M = 10.
Results

Table: Mean and variance of the point estimators based on 2000 Monte Carlo samples

Parameter    Method          Mean    Variance
β1           glmer           1.00    0.0086
             Incomplete      0.99    0.0182
             Imputation100   1.00    0.0191
             Imputation10    1.00    0.0202
             Calibration10   1.00    0.0200
µy           glmer           2.54    0.0498
             Incomplete      2.74    0.0548
             Imputation100   2.55    0.0514
             Imputation10    2.56    0.0515
             Calibration10   2.55    0.0514
Pr(y < 2)    glmer           0.25    0.0095
             Incomplete      0.17    0.0072
             Imputation100   0.25    0.0098
             Imputation10    0.24    0.0097
             Calibration10   0.25    0.0097
Results

Table: Mean and variance of the point estimators based on 2000 Monte Carlo samples

Parameter    Method          Mean    Variance
τ²           glmer           0.49    0.0534
             Incomplete      0.48    0.0557
             Imputation100   0.43    0.0469
σ²           glmer           0.10    0.0001
             Incomplete      0.10    0.0003
             Imputation100   0.10    0.0013
φ0           Imputation100   0.06
φ1           Imputation100   0.97
Results

The incomplete-sample estimators are biased for the mean-type parameters. From the response model, individuals with large b_i are more likely to respond.

For point estimation of the fixed effects and the mean-type parameters, the proposed estimators are essentially unbiased.

Compared to the imputed estimator with M = 10, the CFI estimator is more efficient, which is expected.

For point estimation of the variance components, however, the proposed estimator is biased downwards:

    S̄_3*(τ²) = Σ_{i=1}^n Σ_{j=1}^m Σ_{k=1}^M w_ij^{*(k)} { (b_i^{*(k)})² − τ² } = 0

Thus for large M we expect τ̂² ≈ E[(b_i^{*(k)})² | y_obs,i, r_i, γ̂], which would essentially be unbiased:

    E[τ̂²] ≈ E[ E[(b_i^{*(k)})² | y_obs,i, r_i, γ̂] ] = E[(b_i)²] = τ²
Discussion

The MLE of the variance component is biased downwards because it fails to account for the loss of degrees of freedom needed to estimate β.

Experiment: using the true value of β and estimating only the variance components, the proposed method is unbiased! So the problem is indeed the loss of degrees of freedom due to β̂.

Restricted maximum likelihood (REML) estimation can be derived from a hierarchy in which the fixed effects are integrated out first. However, since the fixed effects are still hidden in the weights, this approach is not applicable here.

We could apply the parametric bootstrap for bias correction; however, the objective of fractional imputation is to gain computational efficiency, so the parametric bootstrap is not really the direction we would like to take.
Results

Table: Mean, variance, and asymptotic 95% CI of the point estimators based on 2000 Monte Carlo samples

Parameter    Method       Mean    Variance    Asymptotic 95% CI
β            glmer        5.35    2.26        (5.28, 5.42)
             Imputation   5.29    1.83        (5.22, 5.35)
τ²           glmer        5.41    23.98       (5.19, 5.63)
             Imputation   2.26    2.68        (2.18, 2.33)
Thank You!