Population-Averaged Model

Recall the general formulation of a PA (marginal) model:

$$E(Y_i \mid x_i) = f_i(x_i, \beta) \quad (n_i \times 1), \qquad \operatorname{var}(Y_i \mid x_i) = V_i(\beta, \xi, x_i) \quad (n_i \times n_i).$$

Below, $x_{i,j}$ contains the among-individual covariates for the $i$th subject, $a_i$, and the within-individual covariates for the $j$th observation on that subject, $z_{i,j}$:

$$x_{i,j} = \begin{pmatrix} z_{i,j} \\ a_i \end{pmatrix}.$$

We assume that

$$V_i(\beta, \xi, x_i) = T_i(\beta, \theta, x_i)^{1/2} \, \Gamma_i(\alpha, x_i) \, T_i(\beta, \theta, x_i)^{1/2},$$

where $T_i(\cdot)$ is the diagonal matrix of variances and $\Gamma_i(\cdot)$ is the correlation matrix.

The variances are specified as in the univariate case:

$$T_i(\beta, \theta, x_i) = \operatorname{diag}\{\operatorname{var}(Y_{i,1} \mid x_i), \operatorname{var}(Y_{i,2} \mid x_i), \ldots, \operatorname{var}(Y_{i,n_i} \mid x_i)\}$$

and

$$\operatorname{var}(Y_{i,j} \mid x_i) = \sigma^2 g(\beta, \theta, x_{i,j})^2.$$

The correlations are specified using one of the standard correlation models, with parameter $\alpha$. The correlation structure is usually viewed as a working model, and inferences are based on variance estimators that remain valid when the working model is wrong (sandwich estimators).

The overall variance parameter vector is

$$\xi = \begin{pmatrix} \theta \\ \alpha \end{pmatrix}.$$

Generalized Estimating Equations (GEE)

The basic principles of GEE are the same as those for a univariate response, even though the equations are more complicated.

Things to come:
- Linear estimating equations for $\beta$
- Quadratic estimating equations for $\xi$
- Quadratic estimating equations for $\beta$
- The folklore theorem and robust covariance matrix
- Implementations in R

Strategy: initially, assume the given mean specification, known variance matrices $V_i$, and the Gaussian distribution.

Linear Estimating Equations for β

Normal log-likelihood:

$$\log L = -\frac{1}{2} \sum_{i=1}^{m} \left[ \log |2\pi V_i| + \{Y_i - f_i(x_i, \beta)\}^T V_i^{-1} \{Y_i - f_i(x_i, \beta)\} \right].$$

Differentiate with respect to $\beta$:

$$\sum_{i=1}^{m} X_i(\beta)^T V_i^{-1} \{Y_i - f_i(x_i, \beta)\} = 0.$$

As always, $X_i(\beta)$ is a gradient matrix:

$$X_i(\beta) = \begin{pmatrix} f_\beta(x_{i,1}, \beta)^T \\ f_\beta(x_{i,2}, \beta)^T \\ \vdots \\ f_\beta(x_{i,n_i}, \beta)^T \end{pmatrix} \quad (n_i \times p).$$

Now use the variance specification

$$V_i = \operatorname{var}(Y_i \mid x_i) = V_i(\beta, \xi, x_i).$$

The estimating equation for $\beta$ becomes

$$\sum_{i=1}^{m} X_i(\beta)^T V_i(\beta, \xi, x_i)^{-1} \{Y_i - f_i(x_i, \beta)\} = 0.$$

In terms of the stacked $X$ ($N \times p$), $Y$ ($N \times 1$), and $f$ ($N \times 1$), with $N = \sum_{i=1}^{m} n_i$:

$$X(\beta)^T V^{-1} \{Y - f(\beta)\} = 0,$$

where $V$ ($N \times N$) is block-diagonal:

$$V = \operatorname{diag}(V_1, V_2, \ldots, V_m).$$

These estimating equations are linear in $\{Y - f(\beta)\}$ and unbiased.

GLS iteration:
- Start from an initial estimator $\hat\beta^{(0)}$, e.g., OLS.
- For each $k \ge 0$, some estimator $\hat\xi^{(k)}$ gives estimated weight matrices $\hat V_i(\hat\beta^{(k)}, \hat\xi^{(k)}, x_i)^{-1}$.
- Re-estimate $\beta$ by solving

$$\sum_{i=1}^{m} X_i(\beta)^T \hat V_i(\hat\beta^{(k)}, \hat\xi^{(k)}, x_i)^{-1} \{Y_i - f_i(x_i, \beta)\} = 0$$

  for $\hat\beta^{(k+1)}$, and iterate $C$ times.

Estimating Variance and Correlation Parameters

Estimation of the $(q+s)$-dimensional vector $\xi = (\theta^T, \alpha^T)^T$.

Pseudo-likelihood (PL): hold $\beta$ fixed in the normal log-likelihood and maximize with respect to $\xi$.

Both involve complicated estimating equations that are quadratic in $\{Y_i - f_i(x_i, \beta)\}$.

General quadratic equation approach:
- univariate case: requires variances of the residuals $\epsilon_i = \{Y_i - f(x_i, \beta)\}$;
- multivariate case: requires variances and covariances of $u_{i,j,k} = \{Y_{i,j} - f(x_{i,j}, \beta)\} \times \{Y_{i,k} - f(x_{i,k}, \beta)\}$.

Two popular working assumptions:
- Independence: assume $\operatorname{cov}(u_{i,j,k}, u_{i,j',k'})$ is the same as if $Y_{i,1}, Y_{i,2}, \ldots, Y_{i,n_i}$ were independent.
- Gaussian: assume $\operatorname{cov}(u_{i,j,k}, u_{i,j',k'})$ is the same as if $Y_{i,1}, Y_{i,2}, \ldots, Y_{i,n_i}$ were normal with the specified mean and variance structure.

Not surprisingly, the Gaussian working assumption leads to the same estimating equation as PL.

Quadratic Estimating Equations for β

As in the univariate case, maximizing the full normal likelihood with respect to $\beta$ leads to estimating equations that are quadratic in $Y_i - f_i(x_i, \beta)$ when $g(\cdot)$ depends on $\beta$.

As when estimating variance parameters, general quadratic estimating equations may be used, based on the covariances $\operatorname{cov}(u_{i,j,k}, u_{i,j',k'})$. Working assumptions (independence or Gaussian) would usually be necessary.

If the working assumptions about the covariance are correct, the quadratic estimator is better than GLS. Otherwise, $\hat\beta_Q$ may be less efficient than GLS and, if the variance is misspecified, inconsistent.
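The GLS iteration for $\beta$ described earlier can be made concrete with a small sketch. It is an illustration under assumptions that are not in the notes: a linear mean $f_i(x_i, \beta) = X_i \beta$ (so each re-estimation step has a closed form), constant variance $\sigma^2$, an exchangeable working correlation with parameter $\alpha$, simple moment estimators standing in for "some estimator" of $\xi$, and simulated data. The function name `gls_iterate` and all variable names are hypothetical.

```r
# Sketch of the GLS iteration: linear mean, exchangeable working correlation.
# Data, names, and the moment estimators for xi are illustrative assumptions.
set.seed(1)
m <- 200; n <- 4; N <- m * n
id <- rep(seq_len(m), each = n)
x  <- rnorm(N)
b  <- rnorm(m)                          # shared subject effect, induces correlation 0.5
y  <- 1 + 2 * x + b[id] + rnorm(N)
X  <- cbind(1, x)                       # stacked N x p gradient matrix (linear mean)

gls_iterate <- function(X, y, id, C = 5) {
  beta <- solve(crossprod(X), crossprod(X, y))         # beta^(0): OLS
  for (k in seq_len(C)) {
    r <- split(as.vector(y - X %*% beta), id)          # residuals by subject
    # Moment estimators for xi = (sigma^2, alpha), given the current beta
    sigma2 <- mean(unlist(r)^2)
    cross  <- sapply(r, function(ri) sum(ri)^2 - sum(ri^2))   # sum over j != k of r_j r_k
    pairs  <- sapply(r, function(ri) length(ri) * (length(ri) - 1))
    alpha  <- sum(cross) / (sigma2 * sum(pairs))
    # Accumulate sum_i X_i^T V_i^{-1} X_i and sum_i X_i^T V_i^{-1} Y_i
    A <- matrix(0, ncol(X), ncol(X))
    bvec <- matrix(0, ncol(X), 1)
    for (i in unique(id)) {
      rows <- which(id == i)
      ni <- length(rows)
      Vi <- sigma2 * ((1 - alpha) * diag(ni) + alpha * matrix(1, ni, ni))
      Xi <- X[rows, , drop = FALSE]
      A    <- A + t(Xi) %*% solve(Vi, Xi)
      bvec <- bvec + t(Xi) %*% solve(Vi, y[rows])
    }
    beta <- solve(A, bvec)                             # beta^(k+1)
  }
  list(beta = as.vector(beta), sigma2 = sigma2, alpha = alpha)
}

fit <- gls_iterate(X, y, id)
fit$beta    # estimates of (intercept, slope)
fit$alpha   # estimated exchangeable correlation
```

With a nonlinear mean, the closed-form update would be replaced by an iterative solve of the estimating equation with the weights held fixed.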
GEE-1: linear equations for $\beta$ and quadratic equations for $\xi$; GEE-2: quadratic equations for $\beta$ and quadratic equations for $\xi$.

The folklore theorem

Large sample properties are usually derived with $n_i$ fixed and $m \to \infty$.

With fixed weight matrices $U_i^{-1}$ and true variance matrices $V_i$,

$$\hat\beta \mathrel{\dot\sim} N\left( \beta_0, \; \left( \sum_{i=1}^{m} X_i^T U_i^{-1} X_i \right)^{-1} \left( \sum_{i=1}^{m} X_i^T U_i^{-1} V_i U_i^{-1} X_i \right) \left( \sum_{i=1}^{m} X_i^T U_i^{-1} X_i \right)^{-1} \right).$$

Folklore theorem: estimated weight matrices generally lead to the same asymptotic distribution as if the weights were known. Misspecified weight matrices lead to inefficiency.

Sandwich Estimator

With misspecified weight matrices, model-based standard errors are generally incorrect, while sandwich estimators of the standard errors remain valid.

The asymptotic variance is estimated by replacing the middle term of the covariance matrix above with

$$\sum_{i=1}^{m} X_i^T U_i^{-1} R_i(\hat\beta) U_i^{-1} X_i,$$

where

$$R_i(b) = \{Y_i - f_i(x_i, b)\} \{Y_i - f_i(x_i, b)\}^T.$$

Note that $E\{R_i(\beta)\} = V_i$, regardless of the correlation structure of $Y_i$.

Implementation in R

Recall the general formulation of a PA (marginal) model:

$$E(Y_i \mid x_i) = f_i(x_i, \beta), \qquad \operatorname{var}(Y_i \mid x_i) = V_i(\beta, \xi, x_i),$$

with

$$V_i(\beta, \xi, x_i) = T_i(\beta, \theta, x_i)^{1/2} \, \Gamma_i(\alpha, x_i) \, T_i(\beta, \theta, x_i)^{1/2},$$

where $T_i(\cdot)$ is the diagonal matrix of variances and $\Gamma_i(\cdot)$ is the correlation matrix.

Packaged software implements only a special case, based on generalized linear models:
- the mean is a nonlinear function of a linear predictor: $E(Y_{i,j} \mid x_i) = f(x_{i,j}^T \beta)$;
- the variance is a function of the mean: $\operatorname{var}(Y_{i,j} \mid x_i) = \sigma^2 g\{f(x_{i,j}^T \beta)\}$;
- the correlation structure is one of the standard patterns.

Note

It is possible that no joint distribution of $Y_i$ exists with marginal distributions for $Y_{i,j}$ in the associated scaled exponential family. That is, the specification should be viewed as defining Generalized Estimating Equations (GEE). Also, the implementations are derived from the Generalized Linear Model point of view and are based on linear estimating equations for $\beta$ (GEE-1).

SAS proc genmod has a repeated statement to specify the correlation structure.
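To see how the sandwich formula plays out in the GLM special case, here is a minimal base-R sketch. It rests on assumptions that are not in the notes: a Poisson-type marginal model with log link, a working independence correlation with $\sigma^2 = 1$, and simulated data. Under working independence the GEE-1 equation for $\beta$ coincides with the Poisson GLM score equation, so `glm()` supplies $\hat\beta$. Here $X_i(\beta) = \operatorname{diag}(\mu_i) X_i$ and $U_i = \operatorname{diag}(\mu_i)$, so that $X_i(\beta)^T U_i^{-1}$ reduces to the transposed design matrix. All variable names are illustrative.

```r
# Sandwich (robust) covariance for a Poisson log-linear marginal model
# fitted under working independence. Simulated data; names are illustrative.
set.seed(2)
m <- 300; n <- 4
id <- rep(seq_len(m), each = n)
x  <- rnorm(m * n)
u  <- rgamma(m, shape = 2, rate = 2)      # shared multiplicative effect with mean 1
y  <- rpois(m * n, lambda = u[id] * exp(0.5 + 0.3 * x))

# Working independence: the GEE-1 equation equals the Poisson GLM score equation
fit <- glm(y ~ x, family = poisson)
X   <- model.matrix(fit)
mu  <- fitted(fit)
r   <- y - mu

# Outer term: sum_i X_i(b)^T U_i^{-1} X_i(b) = X^T diag(mu) X
bread <- crossprod(X, mu * X)
# Row i of S is {X_i(b)^T U_i^{-1} (Y_i - f_i)}^T, summed within subject i
S     <- rowsum(X * r, id)
# Middle term: sum_i X_i(b)^T U_i^{-1} R_i(b) U_i^{-1} X_i(b)
meat  <- crossprod(S)

V_robust  <- solve(bread) %*% meat %*% solve(bread)
se_robust <- sqrt(diag(V_robust))
se_model  <- sqrt(diag(vcov(fit)))        # model-based: inverse of the outer term

cbind(estimate = coef(fit), se_model, se_robust)
```

Because the simulated responses within a subject share the effect `u`, the working independence model is misspecified, and the robust standard error of the intercept should come out noticeably larger than the model-based one.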
In R, use one of the functions:
- gee() (in the gee library);
- geese() or geeglm() (in the geepack library).

The argument corstr specifies the correlation structure.

The gee package

    # Read the seizure data; the file has no header row
    seize <- read.table("seize.dat",
                        col.names = c("id", "y", "time", "trt", "base", "age"))

    library(gee)

    # Unstructured working correlation
    seizeGee.un <- gee(y ~ log(age) + trt * log(base / 4) + (time == 4),
                       id = id, data = seize,
                       corstr = "unstructured", family = poisson)
    summary(seizeGee.un)

    # Exchangeable working correlation
    seizeGee.ex <- update(seizeGee.un, corstr = "exchangeable")
    summary(seizeGee.ex)

    # AR(1) working correlation: "AR-M" with order Mv = 1
    seizeGee.ar1 <- update(seizeGee.un, corstr = "AR-M", Mv = 1)
    summary(seizeGee.ar1)

    # Compare the working correlation structures by QIC
    library(MuMIn)
    QIC(seizeGee.un)
    QIC(seizeGee.ex)
    QIC(seizeGee.ar1)

The geepack package

    library(geepack)

    # Unstructured working correlation
    seizeGeeglm.un <- geeglm(y ~ log(age) + trt * log(base / 4) + (time == 4),
                             id = id, data = seize,
                             corstr = "unstructured", family = poisson)
    summary(seizeGeeglm.un)

    # Exchangeable working correlation
    seizeGeeglm.ex <- update(seizeGeeglm.un, corstr = "exchangeable")
    summary(seizeGeeglm.ex)

    # AR(1) working correlation
    seizeGeeglm.ar1 <- update(seizeGeeglm.un, corstr = "ar1")
    summary(seizeGeeglm.ar1)

    # Compare the working correlation structures by QIC
    library(MuMIn)
    QIC(seizeGeeglm.un)
    QIC(seizeGeeglm.ex)
    QIC(seizeGeeglm.ar1)