Generalized Estimating Equations (GEEs)

Purpose: to introduce GEEs. These are used to model correlated data from
• Longitudinal/repeated measures studies
• Clustered/multilevel studies

Outline
• Examples of correlated data
• Successive generalizations
  – Normal linear model
  – Generalized linear model
  – GEE
• Estimation
• Example: stroke data
  – exploratory analysis
  – modelling

Correlated data
1. Repeated measures: same subjects, same measure, successive times – expect successive measurements to be correlated.
[Diagram: subjects i = 1, …, n are randomized to treatment groups A, B and C and measured at successive times, giving Y_i1, Y_i2, Y_i3, Y_i4.]

2. Clustered/multilevel studies, e.g.
  – Level 3: populations
  – Level 2: age-sex groups
  – Level 1: blood pressure measurements in a sample of people in each age-sex group
We expect correlations within populations and within age-sex groups due to genetic, environmental and measurement effects.

Notation
• Repeated measurements: y_ij, i = 1, …, N subjects; j = 1, …, n_i times for subject i
• Clustered data: y_ij, i = 1, …, N clusters; j = 1, …, n_i measurements within cluster i
• Use "unit" for subject or cluster
• Vector of measurements for unit i: y_i = (y_i1, y_i2, …, y_in_i)ᵀ
• Vector of measurements for all units: y = (y_1ᵀ, y_2ᵀ, …, y_Nᵀ)ᵀ

Normal linear model
For unit i: E(y_i) = μ_i = X_i β; y_i ~ N(μ_i, V_i), where
  X_i is the n_i × p design matrix,
  β is the p × 1 parameter vector,
  V_i is the n_i × n_i variance-covariance matrix, e.g. V_i = σ²I if the measurements are independent.
For all units: E(y) = μ = Xβ, y ~ N(μ, V), where μ = (μ_1ᵀ, …, μ_Nᵀ)ᵀ, X stacks X_1, …, X_N, and V = block-diag(V_1, …, V_N), with zeros off the diagonal blocks. This V is suitable if the units are independent.

Normal linear model: estimation
We want to estimate β and V. The log-likelihood depends on β through (y − μ)ᵀ V⁻¹ (y − μ), so the score is
  U(β) = Xᵀ V⁻¹ (y − μ) = Σ_i X_iᵀ V_i⁻¹ (y_i − X_i β) = 0.
Solve this set of score equations to estimate β.

Generalized linear model (GLM)
The Y_ij (elements of y_i) are not necessarily Normal (e.g. Poisson, binomial).
  E(Y_ij) = μ_ij, with g(μ_ij) = η_ij = x_ijᵀ β, where g is the link function.
Score: U(β) = Σ_i D_iᵀ V_i⁻¹ (y_i − μ_i) = 0,
where D_i is the matrix of derivatives with elements ∂μ_ij/∂β_k, and V_i is diagonal with elements var(Y_ij). (If the link is the identity then D_i = X_i.)

Generalized estimating equations (GEE)
The Y_ij are not necessarily Normal and not necessarily independent. Let R_i be the correlation matrix for the Y_ij. The variance-covariance matrix can be written as
  V_i = φ A_i^(1/2) R_i A_i^(1/2),
where A_i is diagonal with elements var(Y_ij), and φ allows for over-dispersion.
Score: U(β) = Σ_i D_iᵀ V_i⁻¹ (y_i − μ_i) = 0.
In these equations:
• D_i is the matrix of derivatives ∂μ_i/∂β_j
• V_i is the 'working' covariance matrix of Y_i
• A_i = diag{var(Y_ik)} and R_i is the correlation matrix for Y_i
• φ is an overdispersion parameter

Overdispersion parameter
Estimated using the formula
  φ̂ = [1/(N − p)] Σ_i Σ_j (y_ij − μ̂_ij)² / var(μ̂_ij),
where N is the total number of measurements and p is the number of regression parameters. The square root of the overdispersion parameter is called the scale parameter.

Estimation (1)
For the Normal linear model, solve
  U(β) = Σ_i X_iᵀ V_i⁻¹ (y_i − X_i β) = 0;
with V_i = σ²I this gives β̂ = (XᵀX)⁻¹ Xᵀ y, and in general var(β̂) = (Xᵀ V⁻¹ X)⁻¹.
More generally, unless V_i is known, iteration is needed to solve U(β) = Σ_i D_iᵀ V_i⁻¹ (y_i − μ_i) = 0 (a code sketch follows below):
1. Guess V_i and estimate β by b, and hence μ̂
2. Calculate residuals r_ij = y_ij − μ̂_ij
3. Estimate V_i from the residuals
4. Re-estimate b using the new estimate of V_i
Repeat steps 2-4 until convergence.
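As a concrete illustration of this iterative scheme (steps 1-4), here is a minimal NumPy sketch. It is not from the original slides: the function and variable names are my own, it assumes every unit has the same number of measurements, and it uses a single common working covariance estimated from the pooled residuals.

```python
import numpy as np

def iterative_gls(X_units, y_units, n_iter=25, tol=1e-8):
    """Sketch of steps 1-4: alternate between estimating beta for a given
    working covariance V and re-estimating V from the unit-level residuals.
    X_units: list of (n x p) design matrices, one per unit (same n for all).
    y_units: list of length-n response vectors, one per unit."""
    n, p = X_units[0].shape
    V = np.eye(n)                      # step 1: initial guess V_i = I
    beta = np.zeros(p)
    for _ in range(n_iter):
        Vinv = np.linalg.inv(V)
        # solve sum_i X_i' V^{-1} (y_i - X_i beta) = 0 (weighted least squares)
        A = sum(X.T @ Vinv @ X for X in X_units)
        b = sum(X.T @ Vinv @ y for X, y in zip(X_units, y_units))
        beta_new = np.linalg.solve(A, b)
        # steps 2-3: residuals per unit, then V estimated from them
        R = np.array([y - X @ beta_new for X, y in zip(X_units, y_units)])
        V = R.T @ R / (len(X_units) - 1)   # empirical covariance of residual vectors
        if np.max(np.abs(beta_new - beta)) < tol:   # step 4: stop at convergence
            beta = beta_new
            break
        beta = beta_new
    return beta, V

if __name__ == "__main__":
    # Tiny simulated example: 50 units, 4 measurements each, shared unit effect
    rng = np.random.default_rng(0)
    true_beta = np.array([2.0, 0.5])
    X_units = [np.column_stack([np.ones(4), np.arange(4)]) for _ in range(50)]
    y_units = [X @ true_beta + rng.normal(scale=1.0) + rng.normal(scale=0.5, size=4)
               for X in X_units]
    beta_hat, V_hat = iterative_gls(X_units, y_units)
    print(beta_hat)   # should be close to true_beta
```

The shared unit-level noise term in the simulation induces an exchangeable-type correlation, which the estimated V picks up from the residuals.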
Estimation (2) – for GEEs
Liang and Zeger (1986) showed that if R is correctly specified, β̂ is consistent and asymptotically Normal. β̂ is fairly robust, so correct specification of R (the 'working correlation matrix') is not critical. Also, V is estimated, so the 'sandwich estimator' is needed for var(β̂):
  V_s(β̂) = I⁻¹ C I⁻¹,
where I = Σ_i D_iᵀ V̂_i⁻¹ D_i and C = Σ_i D_iᵀ V̂_i⁻¹ (y_i − μ̂_i)(y_i − μ̂_i)ᵀ V̂_i⁻¹ D_i.

Iterative process for GEEs
• Start with R_i = identity (i.e. independence) and φ = 1: estimate β
• Use the estimates to calculate fitted values μ̂_i = g⁻¹(X_i β̂) and residuals y_i − μ̂_i
• These are used to estimate A_i, R_i and φ
• Then the GEEs are solved again to obtain improved estimates of β

Correlation
For unit i the working correlation has the form

  R_i = [ 1      ρ_12   …   ρ_1n
          ρ_21   1      …   …
          …      …      …   …
          ρ_n1   …      …   1   ]

For repeated measures, ρ_lm is the correlation between times l and m; for clustered data, ρ_lm is the correlation between measurements l and m. For all models considered here, V_i is assumed to be the same for all units.

Types of correlation
1. Independent: V_i is diagonal.
2. Exchangeable: all measurements on the same unit are equally correlated, ρ_lm = ρ. Plausible for clustered data. Other terms: spherical and compound symmetry.
3. Correlation depends on the time or distance between measurements l and m: ρ_lm is a function of |l − m|, e.g. ρ_lm = e^(−|l−m|); a first-order autoregressive (AR(1)) model has terms ρ, ρ², ρ³ and so on. Plausible for repeated measures where correlation is known to decline over time.
4. Unstructured: no assumptions about the correlations ρ_lm. Lots of parameters to estimate – the fit may not converge.

Missing data
For missing data, the working correlation can be estimated using the all-available-pairs method, in which all non-missing pairs of data are used in the estimators of the working correlation parameters.

Choosing the best model
Standard regression (GLM): AIC = −2 × log-likelihood + 2 × (number of parameters). Smaller values indicate a better trade-off between fit and parsimony.
GEE:
• QIC(V) is a function of the working covariance V, so it can be used to choose the best correlation structure.
• QIC_u can be used to choose the best subset of covariates for a particular model.
In both cases the best model is the one with the smallest value. (A fitting sketch follows below.)
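To make the choice of working correlation concrete, here is a sketch of fitting GEEs under several working correlation structures with Python's statsmodels. It is an illustration, not the analysis from the slides (which used Stata 8); the data frame, file name and variable names are hypothetical.

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per measurement, with columns
# 'score' (outcome), 'week' (time), 'group' (treatment) and 'patient' (unit id).
df = pd.read_csv("stroke_long.csv")   # assumed file layout, not from the slides

# Working correlation structures to compare; an unstructured working
# correlation (sm.cov_struct.Unstructured) can also be tried, usually with
# a time index supplied through GEE's `time` argument.
structures = {
    "independence": sm.cov_struct.Independence(),
    "exchangeable": sm.cov_struct.Exchangeable(),
    "AR(1)":        sm.cov_struct.Autoregressive(),
}

for name, cov in structures.items():
    model = smf.gee("score ~ group * week", groups="patient", data=df,
                    cov_struct=cov, family=sm.families.Gaussian())
    result = model.fit()
    # Robust ("sandwich") standard errors are reported by default; recent
    # statsmodels versions also expose result.qic() for comparing working
    # correlation structures (smaller is better).
    print(name)
    print(result.summary())
```

The estimates of β should change little across structures, while the model-based standard errors can differ noticeably; the robust standard errors protect against misspecifying R.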
Other approaches – alternatives to GEEs
1. Multivariate modelling – treat all the measurements on the same unit as dependent variables (even though they are measurements of the same variable) and model them simultaneously (Hand and Crowder, 1996). For example, SPSS uses this approach (with exchangeable correlation) for repeated measures ANOVA.
2. Mixed models – fixed and random effects, e.g.
  y = Xβ + Zu + e,
where β are fixed effects, u are random effects ~ N(0, G), and e are error terms ~ N(0, R). Then var(y) = Z G Zᵀ + R, so the correlation between the elements of y is due to the random effects (Verbeke and Molenberghs, 1997).

Example of correlation from random effects
Cluster sampling: randomly select areas (PSUs), then households within areas. Model
  Y_ij = μ + u_i + e_ij,
where Y_ij is the income of household j in area i, μ is the average income for the population, u_i is the random effect of area i ~ N(0, σ_u²), and e_ij is error ~ N(0, σ_e²). Then
  E(Y_ij) = μ; var(Y_ij) = σ_u² + σ_e²; cov(Y_ij, Y_km) = σ_u² if i = k, and 0 otherwise.
So V_i is exchangeable, with off-diagonal elements
  ρ = σ_u² / (σ_u² + σ_e²) = ICC
(ICC: intraclass correlation coefficient).

Numerical example: recovery from stroke
Treatment groups:
  A = new OT intervention
  B = special stroke unit, same hospital
  C = usual care in a different hospital
8 patients per group. Functional ability measured by the Barthel index, weekly for 8 weeks. Y_ijk: patient i, group j, time k.
• Exploratory analyses – plots
• Naïve analyses
• Modelling

Numerical example: time plots
[Figure: Barthel index score (0-100) against week for individual patients, with the overall regression line.]
[Figure: time plots by treatment group – A (blue), B (black), C (red).]

Numerical example: research questions
• Primary question: do the slopes differ (i.e. do the treatments have different effects)?
• Secondary question: do the intercepts differ (i.e. are the groups the same initially)?

Numerical example: scatter plot matrix
[Figure: scatter plot matrix of scores for Weeks 1-8.]

Numerical example: correlation matrix

  week    1     2     3     4     5     6     7
  2     0.93
  3     0.88  0.92
  4     0.83  0.88  0.95
  5     0.79  0.85  0.91  0.92
  6     0.71  0.79  0.85  0.88  0.97
  7     0.62  0.70  0.77  0.83  0.92  0.96
  8     0.55  0.64  0.70  0.77  0.88  0.93  0.98

Numerical example: 1. Pooled analysis, ignoring correlation within patients
  Y_ijk = α_j + β_j k + e_ijk,   j for groups, k for time,
with different intercepts and different slopes for the groups. Assume all Y_ijk are independent with the same variance (i.e. ignore the correlation between observations). Use multiple regression to compare the α_j's and β_j's; to model different slopes, use group × time interaction terms.

Numerical example: 2. Data reduction
Fit a straight line for each patient,
  Y_ijk = α_ij + β_ij k + e_ijk,
assuming independence and constant variance, and use simple linear regression to estimate α_ij and β_ij. Perform an ANOVA using the estimates α̂_ij as data and groups as the levels of a factor, in order to compare the α_j's. Repeat the ANOVA using the β̂_ij's as data to compare the β_j's.

Numerical example: 3. Repeated measures analyses using various variance-covariance structures
Fit
  Y_ijk = α_j + β_j k + e_ijk,
with α_j and β_j as the parameters of interest. Assume Normality for e_ijk but try various forms for the variance-covariance matrix. For the stroke data, the scatter plot matrix and correlations suggest that an autoregressive structure (e.g. AR(1)) is most appropriate. Use GEEs to fit the models.

Numerical example: 4. Mixed/random effects model
Use the model
  Y_ijk = (α_j + a_ij) + (β_j + b_ij) k + e_ijk,
where (i) α_j and β_j are fixed effects for groups, and (ii) the other effects are random: a_ij ~ N(0, σ_a²), b_ij ~ N(0, σ_b²), e_ijk ~ N(0, σ_e²), all independent. Fit the model and use the estimates of the fixed effects to compare the α_j's and β_j's. (A fitting sketch follows below.)
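A random intercept and slope model of this form could, for example, be fitted with statsmodels' MixedLM. This is a sketch under the same hypothetical data layout as above, not the analysis from the slides (which used Stata 8).

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format stroke data as before: 'score', 'week', 'group', 'patient'.
df = pd.read_csv("stroke_long.csv")   # assumed file layout

# Fixed effects: group-specific intercepts and slopes (group * week).
# Random effects: a random intercept and a random slope in week for each
# patient, corresponding to a_ij and b_ij in the model above. Note that by
# default statsmodels allows the random intercept and slope to be correlated,
# whereas the slide's model assumes they are independent.
model = smf.mixedlm("score ~ group * week", data=df,
                    groups="patient", re_formula="~week")
result = model.fit()
print(result.summary())
```

The fixed-effect estimates for the group × week terms correspond to the slope differences of interest (B − A and C − A), which can then be compared with the GEE results.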
Numerical example: results for intercepts (results from Stata 8)

Intercept for group A
  Method              Estimate   Asymptotic SE   Robust SE
  Pooled              29.821     5.772           –
  Data reduction      29.821     7.572           –
  GEE, independent    29.821     5.683           10.395
  GEE, exchangeable   29.821     7.047           10.395
  GEE, AR(1)          33.492     7.624           9.924
  GEE, unstructured   30.703     7.406           10.297
  Random effects      29.821     7.047           –

Intercept difference B − A
  Method              Estimate   Asymptotic SE   Robust SE
  Pooled              3.348      8.166           –
  Data reduction      3.348      10.709          –
  GEE, independent    3.348      8.037           11.884
  GEE, exchangeable   3.348      9.966           11.884
  GEE, AR(1)          −0.270     10.782          11.139
  GEE, unstructured   2.058      10.474          11.564
  Random effects      3.348      9.966           –

Intercept difference C − A
  Method              Estimate   Asymptotic SE   Robust SE
  Pooled              −0.022     8.166           –
  Data reduction      −0.018     10.709          –
  GEE, independent    −0.022     8.037           11.130
  GEE, exchangeable   −0.022     9.966           11.130
  GEE, AR(1)          −6.396     10.782          10.551
  GEE, unstructured   −1.403     10.474          10.906
  Random effects      −0.022     9.966           –

Numerical example: results for slopes (results from Stata 8)

Slope for group A
  Method              Estimate   Asymptotic SE   Robust SE
  Pooled              6.324      1.143           –
  Data reduction      6.324      1.080           –
  GEE, independent    6.324      1.125           1.156
  GEE, exchangeable   6.324      0.463           1.156
  GEE, AR(1)          6.074      0.740           1.057
  GEE, unstructured   7.126      0.879           1.272
  Random effects      6.324      0.463           –

Slope difference B − A
  Method              Estimate   Asymptotic SE   Robust SE
  Pooled              −1.994     1.617           –
  Data reduction      −1.994     1.528           –
  GEE, independent    −1.994     1.592           1.509
  GEE, exchangeable   −1.994     0.655           1.509
  GEE, AR(1)          −2.142     1.047           1.360
  GEE, unstructured   −3.556     1.243           1.563
  Random effects      −1.994     0.655           –

Slope difference C − A
  Method              Estimate   Asymptotic SE   Robust SE
  Pooled              −2.686     1.617           –
  Data reduction      −2.686     1.528           –
  GEE, independent    −2.686     1.592           1.502
  GEE, exchangeable   −2.686     0.655           1.509
  GEE, AR(1)          −2.236     1.047           1.504
  GEE, unstructured   −4.012     1.243           1.598
  Random effects      −2.686     0.655           –

Numerical example: summary of results
• All models produced similar results, leading to the same conclusion: no treatment differences.
• Pooled analysis and data reduction are useful for exploratory analysis – easy to follow and giving good approximations for the estimates, but the variances may be inaccurate.
• Random effects models give very similar results to GEEs
  – no need to specify the variance-covariance matrix
  – model specification may or may not be more natural