Generalized linear mixed effect models

- Assume the response $Y_{ij}$ has the following conditional pdf or pmf given the random effects $U_i$, namely,
  \[
  Y_{ij} \mid U_i \;\overset{\text{ind}}{\sim}\; \exp\Big\{ \frac{y_{ij}\theta_{ij} - b(\theta_{ij})}{a(\phi)} - h(y_{ij}, \phi) \Big\}.
  \]
  Consider the canonical link such that $\theta_{ij} = x_{ij}^T\beta + d_{ij}^T U_i$, where $\beta$ is a $p$-dimensional vector of fixed coefficients and $U_i$ is a $q$-dimensional vector of random effects.
- We are interested in estimating the unknown parameters $\beta$.

Conditional likelihood approach
Treating $U_i$ as fixed effects, the likelihood function for $\beta$ and the $U_i$ is proportional to
\[
\prod_{i=1}^m \prod_{j=1}^{n_i} \exp\{y_{ij}\theta_{ij} - b(\theta_{ij})\}
= \prod_{i=1}^m \prod_{j=1}^{n_i} \exp\{\beta^T x_{ij} y_{ij} + U_i^T d_{ij} y_{ij} - b(\theta_{ij})\}
= \exp\Big\{ \beta^T \sum_{i=1}^m \sum_{j=1}^{n_i} x_{ij} y_{ij} + \sum_{i=1}^m U_i^T \sum_{j=1}^{n_i} d_{ij} y_{ij} - \sum_{i=1}^m \sum_{j=1}^{n_i} b(\theta_{ij}) \Big\}.
\]
This implies that $\sum_{j=1}^{n_i} d_{ij} y_{ij}$ is sufficient for $U_i$ for fixed $\beta$.

Conditional likelihood approach
The conditional probability mass function for $y_i = (y_{i1}, \dots, y_{in_i})^T$ such that $\sum_{j=1}^{n_i} x_{ij} y_{ij} = a_i$ is
\[
f\Big(y_i \,\Big|\, \sum_{j=1}^{n_i} d_{ij} y_{ij} = b_i; \beta\Big)
= \frac{f\big(y_i,\; \sum_{j=1}^{n_i} d_{ij} y_{ij} = b_i;\; \beta, U_i\big)}{f\big(\sum_{j=1}^{n_i} d_{ij} y_{ij} = b_i;\; \beta, U_i\big)}
= \frac{f\big(\sum_{j=1}^{n_i} x_{ij} y_{ij} = a_i,\; \sum_{j=1}^{n_i} d_{ij} y_{ij} = b_i;\; \beta, U_i\big)}{f\big(\sum_{j=1}^{n_i} d_{ij} y_{ij} = b_i;\; \beta, U_i\big)}.
\]

Conditional likelihood approach
Thus, if the $y_{ij}$ are discrete, the above conditional pmf can be written as
\[
\frac{\sum_{R_{i1}} \exp(\beta^T a_i + U_i^T b_i)}{\sum_{R_{i2}} \exp\big(\beta^T \sum_{j=1}^{n_i} x_{ij} y_{ij} + U_i^T b_i\big)},
\]
where $R_{i1} = \big\{y_i : \sum_{j=1}^{n_i} x_{ij} y_{ij} = a_i,\; \sum_{j=1}^{n_i} d_{ij} y_{ij} = b_i\big\}$ and $R_{i2} = \big\{y_i : \sum_{j=1}^{n_i} d_{ij} y_{ij} = b_i\big\}$.

Conditional maximum likelihood estimator
The conditional likelihood estimator for $\beta$ is
\[
\hat\beta = \arg\max_\beta L_c(\beta), \quad\text{where}\quad
L_c(\beta) = \prod_{i=1}^m \frac{\sum_{R_{i1}} \exp(\beta^T a_i)}{\sum_{R_{i2}} \exp\big(\beta^T \sum_{l=1}^{n_i} x_{il} y_{il}\big)}.
\]
For simple cases such as the logistic regression model with a random intercept, the conditional likelihood function is reasonably easy to maximize.

Likelihood method
Let $\delta$ be the collection of the unknown parameters $\beta$ and the elements of $G$. Then the likelihood function for $\delta$ is
\[
L(\delta) = \prod_{i=1}^m \int \prod_{j=1}^{n_i} f(y_{ij} \mid U_i; \beta)\, f(U_i)\, dU_i.
\]
In some cases, such as the linear mixed model with Gaussian noise, the integral above has a closed form.

Numerical evaluation of the likelihood
- Gauss-Hermite quadrature. The Gauss-Hermite quadrature uses a fixed set of $Q$ nodes (quadrature points) and weights $(z_q, w_q)$ to approximate an integral:
  \[
  \int f(z)\,\phi(z)\,dz \approx \sum_{q=1}^{Q} w_q f(z_q),
  \]
  where $\phi$ is the pdf of a normal distribution.
- Quadrature methods do not perform very well for higher-dimensional integration.
- In R, this might be implemented by glmm (repeated): Gauss-Hermite quadrature, models with a random intercept only; see also the sketch below.
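As a concrete illustration of the quadrature formula above, here is a minimal R sketch that approximates the marginal log-likelihood of a random-intercept logistic GLMM (canonical logit link). It is not from the slides: it assumes the statmod package for the Gauss-Hermite nodes and weights, and the arguments y, x, and id are hypothetical (binary responses, fixed-effects design matrix, and cluster labels).

```r
## Sketch: marginal log-likelihood of a random-intercept logistic GLMM,
## approximated by Gauss-Hermite quadrature (assumes the statmod package).
library(statmod)

gh <- gauss.quad(20, kind = "hermite")   # nodes/weights for weight exp(-z^2)

## log L(beta, sigma) = sum_i log  int  prod_j f(y_ij | u_i)  N(u_i; 0, sigma^2) du_i.
## Substituting u_i = sqrt(2) * sigma * z turns each integral into
## (1/sqrt(pi)) * sum_q w_q * prod_j f(y_ij | sqrt(2) * sigma * z_q).
loglik_ghq <- function(beta, sigma, y, x, id) {
  ll <- 0
  for (i in unique(id)) {
    yi <- y[id == i]
    xi <- x[id == i, , drop = FALSE]
    integrand <- sapply(gh$nodes, function(z) {
      eta <- drop(xi %*% beta) + sqrt(2) * sigma * z   # theta_ij = x_ij' beta + u_i
      prod(dbinom(yi, size = 1, prob = plogis(eta)))   # prod_j f(y_ij | u_i)
    })
    ll <- ll + log(sum(gh$weights * integrand) / sqrt(pi))
  }
  ll
}
## The outer maximization over beta and log(sigma) can then be done
## numerically, e.g. with optim(), on a given data set.
```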
EM algorithm
In the M-step of the EM algorithm, we update the parameter values by solving the estimating equations
\[
\sum_{i=1}^m \sum_{j=1}^{n_i} x_{ij}\big[y_{ij} - E\{\mu_{ij}(U_i) \mid y_i; \delta^{(k-1)}\}\big] = 0,
\]
\[
\frac{1}{2}\, G^{-1} \sum_{i=1}^m E\{U_i U_i^T \mid y_i; \delta^{(k-1)}\}\, G^{-1} - \frac{m}{2}\, G^{-1} = 0,
\]
where $\mu_{ij}(U_i) = E(y_{ij} \mid U_i) = h^{-1}(x_{ij}^T\beta + d_{ij}^T U_i)$.

Monte Carlo EM algorithm
- However, the conditional expectations in the EM algorithm are difficult to evaluate.
- To evaluate the conditional expectations, we can draw dependent samples from $f(U \mid y)$ using the Metropolis algorithm, without calculating the marginal distribution $f(y)$.
- Then use the Monte Carlo method to calculate the conditional expectations.

Approximate EM algorithm
An alternative strategy is to approximate the estimating equations so that the integration can be avoided, plugging in the posterior mode or BLUP $\hat U_i$ for $U_i$. Specifically,
- Let $v_{ij} = \mathrm{Var}(y_{ij} \mid U_i)$ and $Q_i = \mathrm{diag}\{v_{ij}\, h'(\mu_{ij})^2\}$.
- Let $z_i$ be the surrogate response defined to have elements $z_{ij} = h(\mu_{ij}) + (y_{ij} - \mu_{ij})\, h'(\mu_{ij})$ for $j = 1, \dots, n_i$.
- Define the $n_i \times n_i$ matrix $V_i = Q_i + D_i G D_i^T$, where $D_i$ is the $n_i \times q$ matrix whose $j$-th row is $d_{ij}$.

Approximate EM algorithm
For a fixed $G$, updated values of $\beta$ and $U$ are obtained by iteratively solving
\[
\hat\beta = \Big(\sum_{i=1}^m X_i^T V_i^{-1} X_i\Big)^{-1} \sum_{i=1}^m X_i^T V_i^{-1} z_i
\quad\text{and}\quad
\hat U_i = G D_i^T V_i^{-1} (z_i - X_i \beta).
\]

Approximate EM algorithm
We estimate $G$ by
\[
\hat G = m^{-1} \sum_{i=1}^m E(U_i U_i^T \mid y_i)
= m^{-1} \sum_{i=1}^m E(U_i \mid y_i)\, E(U_i^T \mid y_i) + m^{-1} \sum_{i=1}^m \mathrm{Var}(U_i \mid y_i).
\]
We then use $\hat U_i$ to estimate $E(U_i \mid y_i)$ and use $(D_i^T Q_i^{-1} D_i + G^{-1})^{-1}$ to estimate the conditional variance $\mathrm{Var}(U_i \mid y_i)$.

Penalized quasi-likelihood (PQL)
- PQL is an approximate likelihood method.
- The central idea is to approximate the conditional distribution of $U_i$ given $y_i$ by a Gaussian distribution with the same mode and curvature.
- In PQL, only the mean and variance of the conditional model of $y \mid U$ need to be specified.

Penalized quasi-likelihood (PQL)
- PQL (Breslow and Clayton, 1993) can be derived via a Laplace approximation to the GLMM likelihood.
- Considering the random effects $U$ as fixed parameters, we can maximize the joint (log-)likelihood with respect to $\beta$ and $U$:
  \[
  \log f(y; \beta, U) - \frac{1}{2} U^T D^{-1} U.
  \]

GLMM algorithms in R
- glmmPQL (MASS): penalized quasi-likelihood; allows the use of an additional correlation structure.
- glmmML (glmmML): maximum likelihood using adaptive Gaussian quadrature or Laplace (default) methods; random-intercept-only models (example calls for both functions are sketched below).

GLMM algorithms in R
- gnlmix (repeated): non-linear regression with mixed random effects for the location parameters. Non-Gaussian mixing distributions are allowed.
- glmm (GLMMGibbs): Gibbs sampling.
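To make the software references above concrete, here is a short sketch of how glmmPQL and glmmML are typically called for a random-intercept binary GLMM. The data frame dat and the variables resp, trt, and subject are made-up names for a long-format data set with one row per observation (i, j); treat the exact arguments as an illustration rather than a definitive recipe.

```r
library(MASS)     # glmmPQL
library(nlme)     # glmmPQL fits via lme()
library(glmmML)   # glmmML

## Penalized quasi-likelihood with a random intercept for each subject;
## an additional correlation structure could be passed via 'correlation ='.
fit_pql <- glmmPQL(resp ~ trt, random = ~ 1 | subject,
                   family = binomial, data = dat)

## Maximum likelihood for a random-intercept-only model; the default method
## is the Laplace approximation, method = "ghq" requests Gauss-Hermite
## quadrature with n.points quadrature points.
fit_ml <- glmmML(resp ~ trt, family = binomial, data = dat,
                 cluster = subject, method = "ghq", n.points = 8)

summary(fit_pql)
summary(fit_ml)
```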