Fractional Imputation for Longitudinal Data with Nonignorable Missing
Shu Yang
Department of Statistics, Iowa State University
February 17, 2012

Shu Yang (Dep. of Statistics, ISU), PFI for nonignorable missing, February 17, 2012

Outline
- Introduction
  - EM algorithm
  - MCEM algorithm
- Proposed method: parametric fractional imputation (PFI)
- Approximation: calibration fractional imputation (CFI)
- Simulation study and results

Introduction
Longitudinal studies are designed to investigate changes over time in a characteristic measured repeatedly for each individual.
- E.g., in medical studies, blood pressure is obtained from each individual at different time points.
A mixed model can be used, since individual effects can be modeled by the inclusion of random effects:

    y_ij = β x_ij + b_i + e_ij,   i = 1, ..., n,  j = 1, ..., m    (1)

- i indexes the individual and j indexes the repeated measurement on individual i;
- b_i ~ iid N(0, τ²) specifies the individual effects;
- e_ij ~ iid N(0, σ²) specifies the measurement error.

Introduction
Missing data occur frequently and destroy the representativeness of the remaining sample.
Assumptions about the missingness mechanism:
- Missing at random (MAR), or ignorable: missingness is unrelated to the unobserved values.
- Not missing at random (NMAR), or nonignorable: missingness depends on the missing values.
- A special case of the nonignorable missingness model: Pr(r_ij = 1 | x_ij, y_ij) = p(x_ij, b_i; φ).
Solution: imputation, i.e., assigning values to the missing responses to produce a complete data set.
Pros:
- facilitates analyses that use complete-data analysis methods;
- ensures different analyses are consistent with one another;
- reduces nonresponse bias.

Basic Setup
Data model: y_ij = β x_ij + b_i + e_ij,  i = 1, ..., n,  j = 1, ..., m
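This data model, together with the nonignorable response mechanism above, can be simulated directly. A minimal sketch (ours, not the author's code; numpy assumed, and the parameter values anticipate the simulation study at the end of the talk):

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions and parameters (the values used in the talk's simulation study).
n, m = 10, 15
beta0, beta1 = 2.0, 1.0          # fixed effects
tau2, sigma2 = 0.5, 0.1          # variance components
phi0, phi1 = 0.0, 1.0            # response-model parameters

x = np.tile(np.arange(1, m + 1) / m, (n, 1))       # covariate x_ij = j/m
b = rng.normal(0.0, np.sqrt(tau2), size=(n, 1))    # b_i ~ N(0, tau^2)
e = rng.normal(0.0, np.sqrt(sigma2), size=(n, m))  # e_ij ~ N(0, sigma^2)
y = beta0 + beta1 * x + b + e                      # mixed-model responses

# Nonignorable response mechanism: missingness depends on the latent b_i.
p = 1.0 / (1.0 + np.exp(-(phi0 + phi1 * b)))       # logit(p_ij) = phi0 + phi1 * b_i
r = rng.binomial(1, np.broadcast_to(p, (n, m)))    # r_ij = 1 iff y_ij is observed
```

Because the response probability depends on b_i rather than on x_ij alone, deleting the cells with r_ij = 0 leaves a sample that over-represents individuals with large b_i.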
- x_ij: always observed; y_ij: subject to missingness (scalar); b_i: unobserved.
- r_ij = 1 if y_ij is observed, r_ij = 0 if y_ij is missing.
Response mechanism: r_ij | (x_ij, y_ij) ~ indep. Bernoulli(p_ij), with p_ij = p(x_ij, b_i; φ) known up to φ.
Two types of parameters are of interest:
1. γ = (β, τ², σ², φ): the parameters in the model.
2. η: the solution to E{U(x, Y; η)} = 0.
   - E.g. 1 (population mean): U(x, Y; η) = (1/nm) Σ_{i,j} y_ij − η.
   - E.g. 2 (proportion): U(x, Y; η) = (1/nm) Σ_{i,j} I(y_ij ≤ 2) − η.

Introduction (MLE)
The "complete" data for individual i are (y_i, b_i, r_i):

    p(y_i, b_i, r_i | β, σ², τ², φ) = p(y_i | β, σ², b_i) p(r_i | b_i, φ) p(b_i | τ²)
                                    = [Π_{j=1}^m p(y_ij | β, σ², b_i) p(r_ij | b_i, φ)] p(b_i | τ²)

    l(γ) = Σ_{i=1}^n { Σ_{j=1}^m log p(y_ij | β, σ², b_i) + Σ_{j=1}^m log p(r_ij | b_i, φ) + log p(b_i | τ²) }
         = l1(β, σ²) + l2(φ) + l3(τ²)

The MLEs of (β, σ²), φ, and τ² can be obtained separately by solving the corresponding score equations:
- S1(β, σ²) = ∂l1(β, σ²)/∂(β, σ²) = 0
- S2(φ) = l2′(φ) = 0
- S3(τ²) = l3′(τ²) = 0

Introduction (EM algorithm)
The expectation-maximization (EM) algorithm is an iterative method for finding MLEs of parameters. Write the data as (X, Z), where X is the observed part and Z is the missing part.
Expectation step (E-step): calculate the expected value of the log-likelihood with respect to the conditional distribution of Z given X under the current parameter estimate γ^(t):

    Q(γ | γ^(t)) = E_{Z|X, γ^(t)} [log L(γ; X, Z)]

or, equivalently, the mean score

    S̄(γ | γ^(t)) = E_{Z|X, γ^(t)} [S(γ; X, Z)],   S(γ; X, Z) = ∂ log L(γ; X, Z)/∂γ.

Maximization step (M-step): find the parameter value that maximizes this quantity:

    γ^(t+1) = argmax_γ Q(γ | γ^(t)),   or equivalently γ^(t+1) solves S̄(γ | γ^(t)) = 0.
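As a toy illustration of the E-step/M-step alternation (our example, not from the talk): for N(µ, 1) data with some values missing at random, the E-step has the closed form E[Z | X, µ^(t)] = µ^(t), and the M-step is a sample mean of the "completed" data.

```python
import numpy as np

# Observed part X and count of MAR-missing values Z (toy numbers, ours).
x_obs = np.array([1.2, 0.8, 2.1, 1.5])
n_mis = 3

mu = 0.0                         # initial estimate mu^(0)
for _ in range(50):
    # E-step: under N(mu, 1), E[Z | X, mu^(t)] = mu^(t).
    z_expected = np.full(n_mis, mu)
    # M-step: maximize Q(mu | mu^(t)), i.e. take the mean of the completed data.
    mu = (x_obs.sum() + z_expected.sum()) / (x_obs.size + n_mis)
# Under MAR the fixed point is the observed-data MLE, mean(x_obs).
```

Each iteration shrinks the error by a constant factor, so the sequence converges geometrically to the observed-data MLE, exactly the stationary-point behavior described above.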
Introduction (EM algorithm)
Applying the EM algorithm here: write y_i = (y_mis,i, y_obs,i). Since b_i is unobserved, we view it as missing data as well, so X = (y_obs,i) and Z = (y_mis,i, b_i).
E-step: obtain the conditional mean score equation under the current parameter value (an intractable, high-dimensional integral):

    S̄(γ | γ^(t)) = E_{y_mis, b} [ S1(β, σ²) + S2(φ) + S3(τ²) | y_obs, r, γ^(t) ]

M-step: find the solution to the conditional mean score equations (no difficulty here).

Introduction (Monte Carlo approximation by imputation)
Computing the conditional expectation can be a challenging problem. Monte Carlo approximation of the conditional expectation using imputation relies on the strong law of large numbers:

    E{S1ij(β, σ²) | y_obs,i, r_i, γ̂} ≅ (1/M) Σ_{k=1}^M S1ij^{*(k)}(β, σ²),

where

    S1ij^{*(k)}(β, σ²) = ∂ log p(y_ij^{*(k)} | β, σ², b_i^{*(k)}) / ∂(β, σ²),
    (y_ij^{*(k)}, b_i^{*(k)}) ~ p(y_ij, b_i | y_obs,i, r_i, γ̂),

and γ̂ is a consistent estimator of γ.

Introduction (MCEM)
Monte Carlo EM (MCEM) algorithm:
Step 1: using rejection sampling or importance sampling, generate M values for each missing value:

    (y_ij^{*(k)}, b_i^{*(k)}) ~ p(y_ij, b_i | y_obs,i, r_i, γ^(t))

Step 2 (E-step): compute the imputed score functions

    (1/M) Σ_{k=1}^M S1ij^{*(k)}(β, σ²) = 0,   (1/M) Σ_{k=1}^M S2ij^{*(k)}(φ) = 0,   (1/M) Σ_{k=1}^M S3ij^{*(k)}(τ²) = 0

(M-step): solve them to update γ̂^(t+1).
MCEM regenerates the imputed values at every EM iteration, so the computation is quite heavy, and the convergence of the MCEM sequence is not guaranteed for fixed M.

Parametric Fractional Imputation (PFI)
Main idea: MCEM assigns equal weight 1/M to each imputed value and adjusts the imputed values at each iteration t;
PFI generates the imputed values once, using importance sampling, and adjusts only their weights at each t.
M imputed values for the two types of missing data: b_i^{*(k)} ~ h1(·) and y_ij^{*(k)} ~ h2(· | b_i^{*(k)}).
Create a weighted data set for the missing data:

    {(w_ij1^{*(k)}, b_i^{*(k)}) : k = 1, ..., M; i = 1, ..., n}
    {(w_ij2^{*(k)}, y_ij^{*(k)}) : k = 1, ..., M; j ∈ NR}

with

    w_ij1^{*(k)} ∝ f(b_i^{*(k)} | y_obs,i, γ̂) / h1(b_i^{*(k)}),   Σ_{k=1}^M w_ij1^{*(k)} = 1,
    w_ij2^{*(k)} ∝ f(y_ij^{*(k)}, b_i^{*(k)} | y_obs,i, γ̂) / { h2(y_ij^{*(k)} | b_i^{*(k)}) h1(b_i^{*(k)}) },   Σ_{k=1}^M w_ij2^{*(k)} = 1,

where γ̂ is the MLE of γ. The weights w_ij1^{*(k)}, w_ij2^{*(k)} are normalized importance weights and can be called fractional weights.

EM algorithm by fractional imputation
The target distribution is p(y_mis,ij, b_i | y_obs,i, r_i, γ^(t)). We need to simulate the missing data from the densities below (AR_i denotes the observed indices and AM_i the missing indices for individual i):

    p(b_i | y_obs,i, r_i, γ^(t)) ∝ p(b_i, y_obs,i, r_i | γ^(t))
                                 ∝ p(y_obs,i | b_i, γ^(t)) p(r_i | b_i, γ^(t)) p(b_i | γ^(t))
                                 = [Π_{j∈AR_i} p(y_ij | b_i, γ^(t))] [Π_{j=1}^m p(r_ij | b_i, γ^(t))] p(b_i | γ^(t))

    p(y_mis,ij | y_obs,i, r_i, b_i, γ^(t)) ∝ p(y_mis,ij, y_obs,i, r_i | b_i, γ^(t))
                                           ∝ p(y_mis,ij | b_i, γ^(t)) p(y_obs,i | b_i, γ^(t)) p(r_i | b_i, γ^(t))

EM algorithm by fractional imputation
Step 1 (importance sampling), for k = 1, ..., M, where M is the Monte Carlo size:
1. for i = 1, ..., n, draw b_i^{*(k)} ~ h1(·);
2. for j ∈ AM_i, draw y_mis,ij^{*(k)} ~ h2(·).
Step 2 (E-step): calculate the weights and update the parameter value:

    w_ij(t)^{*(k)} ∝ p(y_obs,i | b_i^{*(k)}, γ^(t)) p(r_i | b_i^{*(k)}, γ^(t)) p(b_i^{*(k)} | γ^(t)) / h1(b_i^{*(k)}),   Σ_{k=1}^M w_ij(t)^{*(k)} = 1 ∀ i, j.

(M-step): solve

    Σ_{i=1}^n Σ_{j=1}^m Σ_{k=1}^M w_ij(t)^{*(k)} S2(φ | r_ij, b_i^{*(k)}, γ^(t)) = 0
    Σ_{i=1}^n Σ_{j=1}^m Σ_{k=1}^M w_ij(t)^{*(k)} S3(τ² | b_i^{*(k)}, γ^(t)) = 0
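The Step 2 weight computation can be sketched for a single individual with a scalar random effect (a minimal illustration under assumed parameter values, with h1 = N(0, 1); all names and numbers here are ours, not the author's):

```python
import numpy as np

rng = np.random.default_rng(1)

def norm_pdf(x, mean, var):
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

# One individual: 3 observed responses out of m = 5, with assumed parameters.
y_obs = np.array([2.3, 2.6, 2.2])
x_obs = np.array([0.2, 0.4, 0.6])          # covariates of the observed cells
r = np.array([1, 1, 1, 0, 0])              # response indicators r_ij
beta, sigma2, tau2 = 1.0, 0.1, 0.5
phi0, phi1 = 0.0, 1.0

M = 1000
b_star = rng.normal(0.0, 1.0, size=M)      # b_i^{*(k)} ~ h1 = N(0, 1)

# Unnormalized fractional weight:
#   w^{(k)} proportional to p(y_obs | b*) p(r | b*) p(b* | tau2) / h1(b*).
mean_y = 2.0 + beta * x_obs[None, :] + b_star[:, None]   # assumed intercept 2.0
lik_y = np.prod(norm_pdf(y_obs[None, :], mean_y, sigma2), axis=1)
p_resp = 1.0 / (1.0 + np.exp(-(phi0 + phi1 * b_star)))
lik_r = np.prod(np.where(r[None, :] == 1,
                         p_resp[:, None], 1.0 - p_resp[:, None]), axis=1)
prior = norm_pdf(b_star, 0.0, tau2)
w = lik_y * lik_r * prior / norm_pdf(b_star, 0.0, 1.0)
w = w / w.sum()                            # normalized: sum_k w^{(k)} = 1
```

At the next EM iteration only the parameter values in `lik_y`, `lik_r`, and `prior` change; the draws `b_star` are reused, which is exactly what makes PFI cheaper than MCEM.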
EM algorithm by fractional imputation
Continued (E-step): calculate the weights and update the parameter value:

    w_ij2(t)^{*(k)} ∝ p(y_mis,ij^{*(k)} | b_i^{*(k)}, γ^(t)) p(y_obs,i | b_i^{*(k)}, γ^(t))
                      × p(r_i | b_i^{*(k)}, γ^(t)) p(b_i^{*(k)} | γ^(t)) / { h1(b_i^{*(k)}) h2(y_mis,ij^{*(k)}) },
    Σ_{k=1}^M w_ij2(t)^{*(k)} = 1 ∀ i, j.

(M-step): solve

    Σ_{i=1}^n Σ_{j=1}^m Σ_{k=1}^M w_ij2(t)^{*(k)} S1(β, σ² | y_mis,ij^{*(k)}, b_i^{*(k)}, γ^(t)) = 0

Repeat until convergence; the limit gives γ̂.

Fractional Imputation
Properties:
- The imputed values are generated only once; only the weights change at each EM iteration. Computationally efficient.
- The EM sequence {γ̂^(t); t = 1, 2, ...} converges to a stationary point γ̂*.
- For sufficiently large M, γ̂* is asymptotically equivalent to the MLE, because the weighted score is a self-normalized importance-sampling estimate:

    Σ_{k=1}^M w_ij^{*(k)} S(x_ij, y_ij^{*(k)}; γ̂)
      ≅ ∫ S(x_ij, y_ij; γ̂) { f(y_ij | x_ij, r_ij = 0; γ̂) / q(y_ij | x_ij) } q(y_ij | x_ij) dy_ij
        / ∫ { f(y_ij | x_ij, r_ij = 0; γ̂) / q(y_ij | x_ij) } q(y_ij | x_ij) dy_ij
      = E{ S(x_ij, y_ij; γ̂) | x_ij, r_ij = 0; γ̂ },

  where q(· | x_ij) is the proposal density.

Fractional Imputation
Using the final fractional weights after convergence, we can estimate η by solving a fractionally imputed estimating equation:

    Ū*(η) = (1/nm) Σ_{i,j} Σ_{k=1}^M w_ij^{*(k)} U(x_ij, y_ij^{*(k)}; η) = 0

Ū*(η) converges to Ū(η | γ̂) = E{U(X, Y; η) | X, y_obs, r; γ̂} almost everywhere for sufficiently large M. The resulting η̂* is consistent, and a central limit theorem can be obtained for it.

Calibration
The proposed method is based on the large-sample theory for Monte Carlo sampling, so it requires M to be large. For public-use data, a small or moderate M is preferred.
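Returning to the fractionally imputed estimating equation Ū*(η) = 0 above: for the marginal mean, U(x, y; η) = y − η, its solution is just a fractionally weighted average. A toy sketch with made-up numbers (ours):

```python
import numpy as np

# Observed cells enter with weight 1; each missing cell contributes its M
# fractionally weighted imputed values. Toy numbers: 3 observed cells,
# 2 missing cells, M = 4 imputations each.
y_obs = np.array([2.1, 2.4, 1.9])
y_star = np.array([[1.8, 2.2, 2.0, 2.4],
                   [2.5, 1.7, 2.1, 2.3]])
w_star = np.array([[0.4, 0.2, 0.3, 0.1],   # fractional weights, each row sums to 1
                   [0.1, 0.5, 0.2, 0.2]])

nm = y_obs.size + y_star.shape[0]          # total number of cells
eta_hat = (y_obs.sum() + (w_star * y_star).sum()) / nm

# eta_hat solves U-bar*(eta) = 0 for U(x, y; eta) = y - eta.
```

The same weighted data set serves any estimating function U, e.g. U(x, y; η) = I(y ≤ 2) − η for the proportion, which is the "consistency across analyses" advantage of imputation noted in the introduction.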
To reduce the Monte Carlo sampling error, consider a calibration weighting technique from survey sampling.
Suppose we have an initial imputation set of size M1 = 100, {y_ij^{*(k)}, w_ij0^{*(k)}}, treated as the "population". Sample a small subset {y_ij^{*(k)}, w_ij^{*(k)}}, k = 1, ..., M = 10, and adjust the weights such that
1. Σ_{k=1}^M w_ij^{*(k)} = 1 ∀ i, j;
2. Σ_{i,j,k} w_ij^{*(k)} S1(β, σ² | y_ij^{*(k)}, b_i^{*(k)}, γ̂) = 0.
Using the regression weighting technique, the weights can be calculated as

    w_ij^{*(k)} = w_ij0^{*(k)} − (Σ_{i,j} s̄_ij*)ᵀ { Σ_{i,j,k} w_ij0^{*(k)} (s_ij^{*(k)} − s̄_ij*)^⊗2 }⁻¹ w_ij0^{*(k)} (s_ij^{*(k)} − s̄_ij*),

where s_ij^{*(k)} = S1(β, σ² | y_ij^{*(k)}, b_i^{*(k)}, γ̂) and s̄_ij* = Σ_{k=1}^M w_ij0^{*(k)} s_ij^{*(k)}.

Simulation Study
B = 2000 Monte Carlo samples of size n × m = 10 × 15 = 150 were generated from

    y_ij = β0 + β1 x_ij + b_i + e_ij

where x_ij = j/m, b_i ~ iid N(0, τ²), and e_ij ~ iid N(0, σ²), with β0 = 2, β1 = 1, σ² = 0.1, τ² = 0.5, and

    r_ij ~ Bernoulli(π_ij),   logit(π_ij) = φ0 + φ1 b_i,

with φ0 = 0, φ1 = 1. Under this model setup, the average response rate is about 50%.

Simulation study: setup
The following parameters are computed:
1. β1, τ², σ²: the slope and the variance components in the linear mixed-effects model;
2. µ_y: the marginal mean of y;
3. the proportion Pr(Y < 2).
For each parameter, the following estimators are compared:
1. the complete-sample estimator (for comparison);
2. the incomplete-sample estimator (using the respondents only);
3. parametric fractional imputation (PFI) with M = 100 and M = 10;
4. the CFI method with M = 10.

Results
Table: mean and variance of the point estimators, based on 2000 Monte Carlo samples.
    Parameter    Method          Mean   Variance
    β1           glmer           1.00   0.0086
                 Incomplete      0.99   0.0182
                 Imputation100   1.00   0.0191
                 Imputation10    1.00   0.0202
                 Calibration10   1.00   0.0200
    µy           glmer           2.54   0.0498
                 Incomplete      2.74   0.0548
                 Imputation100   2.55   0.0514
                 Imputation10    2.56   0.0515
                 Calibration10   2.55   0.0514
    Pr(y < 2)    glmer           0.25   0.0095
                 Incomplete      0.17   0.0072
                 Imputation100   0.25   0.0098
                 Imputation10    0.24   0.0097
                 Calibration10   0.25   0.0097

Results
Table: mean and variance of the point estimators, based on 2000 Monte Carlo samples.

    Parameter    Method          Mean   Variance
    τ²           glmer           0.49   0.0534
                 Incomplete      0.48   0.0557
                 Imputation100   0.43   0.0469
    σ²           glmer           0.10   0.0001
                 Incomplete      0.10   0.0003
                 Imputation100   0.10   0.0013
    φ0           Imputation100   0.06
    φ1           Imputation100   0.97

Results
- The incomplete-sample estimators are biased for the mean-type parameters: under the response model, individuals with large b_i are more likely to respond.
- For point estimation of the fixed effects and the mean-type parameters, the proposed estimators are essentially unbiased.
- Compared with the imputed estimator with M = 10, the CFI estimator is more efficient, which is expected.
- For point estimation of the variance components, the proposed estimator is biased downward. The weighted score equation is

    S̄3*(τ²) = Σ_{i=1}^n Σ_{j=1}^m Σ_{k=1}^M w_ij^{*(k)} { (b_i^{*(k)})² − τ² } = 0,

so for large M we expect τ̂² ≈ E[(b_i^{*(k)})² | y_obs,i, r_i, γ̂], which would essentially be unbiased:

    E[τ̂²] ≈ E[ E[(b_i^{*(k)})² | y_obs,i, r_i, γ̂] ] = E[(b_i^{*(k)})²] = τ².

Discussion
The MLE of a variance component is biased downward because it fails to account for the loss of degrees of freedom needed to estimate β.
Experiment: using the true value of β and estimating only the variance components, the proposed method is unbiased. So the problem is indeed the loss of degrees of freedom due to β̂.
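The weighted score equation for τ² above can be checked numerically: solving it gives τ̂² as the fractionally weighted mean of the squared imputed random effects (toy numbers and names, ours):

```python
import numpy as np

rng = np.random.default_rng(3)

# Imputed random effects b_i^{*(k)} and normalized fractional weights
# for n = 4 individuals, M = 20 imputations each.
b_star = rng.normal(0.0, 1.0, size=(4, 20))
w = rng.random((4, 20))
w = w / w.sum(axis=1, keepdims=True)       # sum_k w_i^{(k)} = 1 per individual

# Solving sum_{i,k} w_i^{(k)} ((b_i^{*(k)})^2 - tau2) = 0 for tau2:
tau2_hat = np.sum(w * b_star ** 2) / w.shape[0]
```

This makes the source of the bias visible: τ̂² inherits whatever bias the conditional distribution of b_i^{*(k)} carries from plugging in β̂, which is the degrees-of-freedom issue discussed above.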
The restricted maximum likelihood (REML) estimator can be derived from a hierarchy in which the fixed effects are integrated out first. However, since the fixed effects are still hidden in the weights, this approach is not applicable here.
We could apply parametric bootstrapping for bias correction; however, the objective of fractional imputation is to gain computational efficiency, so the parametric bootstrap is not the direction we would like to take.

Results
Table: mean, variance, and asymptotic 95% confidence interval of the point estimators, based on 2000 Monte Carlo samples.

    Parameter    Method       Mean   Variance   Asymptotic 95% CI
    β            glmer        5.35   2.26       (5.28, 5.42)
                 Imputation   5.29   1.83       (5.22, 5.35)
    τ²           glmer        5.41   23.98      (5.19, 5.63)
                 Imputation   2.26   2.68       (2.18, 2.33)

Thank You!