Generalized Estimating Equations (GEEs)

Purpose: to introduce GEEs. These are used to model correlated data from
• Longitudinal/repeated measures studies
• Clustered/multilevel studies
Outline
• Examples of correlated data
• Successive generalizations
– Normal linear model
– Generalized linear model
– GEE
• Estimation
• Example: stroke data
– exploratory analysis
– modelling
Correlated data
1. Repeated measures: same subjects, same measure, successive
times – expect successive measurements to be correlated
[Diagram: subjects i = 1,…,n randomized to treatment groups A, B, C;
measurements Yi1, Yi2, Yi3, Yi4 at successive times]
Correlated data
2. Clustered/multilevel studies
E.g., Level 3: populations
Level 2: age–sex groups
Level 1: blood pressure measurements in a sample of people in each
age–sex group
We expect correlations within populations and within age–sex groups due
to genetic, environmental and measurement effects
Notation
• Repeated measurements: yij, i = 1,…,N subjects;
j = 1,…,ni times for subject i
• Clustered data: yij, i = 1,…,N clusters; j = 1,…,ni
measurements within cluster i
• Use “unit” for subject or cluster
Vector of measurements for unit i: yi = (yi1, yi2, …, yini)T
Vector of measurements for all units: y = (y1T, y2T, …, yNT)T
Normal Linear Model
For unit i:
E(yi) = μi = Xiβ;  yi ~ N(μi, Vi)
Xi: ni×p design matrix
β: p×1 parameter vector
Vi: ni×ni variance–covariance matrix,
e.g., Vi = σ²I if measurements are independent
For all units: E(y) = μ = Xβ, y ~ N(μ, V), where
μ = (μ1T, …, μNT)T and X = (X1T, …, XNT)T are stacked and
V = block-diag(V1, …, VN)
This block-diagonal V is suitable if the units are independent
Normal linear model: estimation
We want to estimate β and V
The log-likelihood function contains the term −(y − μ)T V⁻¹ (y − μ)/2, so
Score = U(β) = ∂ℓ/∂β = XT V⁻¹ (y − μ) = Σi XiT Vi⁻¹ (yi − Xiβ) = 0
Solve this set of score equations to estimate β
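As a concrete check of the score equations above, the following numpy sketch solves U(β) = XT V⁻¹ (y − μ) = 0 in closed form, β̂ = (XT V⁻¹ X)⁻¹ XT V⁻¹ y, on simulated data (all names and values here are illustrative, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 200, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, p - 1))])
beta_true = np.array([1.0, 2.0, -0.5])
V = np.eye(N)                        # independent measurements: V = sigma^2 I
y = X @ beta_true + rng.normal(size=N)

Vinv = np.linalg.inv(V)
# solve the score equations X^T V^{-1} (y - X beta) = 0
beta_hat = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)
var_beta = np.linalg.inv(X.T @ Vinv @ X)   # var(beta_hat) = (X^T V^{-1} X)^{-1}
print(beta_hat)
```

With V = σ²I this reduces to ordinary least squares, matching the (XTX)⁻¹XTy formula used later.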
Generalized linear model (GLM)
Yij's (elements of yi) are not necessarily Normal
(e.g., Poisson, binomial)
E(Yij) = μij
g(μij) = ηij = xijTβ; g is the link function
Score = U(β) = Σi DiT Vi⁻¹ (yi − μi) = 0
where Di is the matrix of derivatives with elements
∂μij/∂βk = (∂μij/∂ηij) xijk
and Vi is diagonal with elements var(Yij)
(If the link is identity then Di = Xi)
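The GLM score equations can be solved by Fisher scoring. A minimal sketch for one illustrative case (Poisson with log link, chosen here for concreteness): then μ = exp(Xβ), Di = diag(μ)X and Vi = diag(μ), so DT V⁻¹(y − μ) simplifies to XT(y − μ) and the expected information is XT diag(μ) X.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
beta_true = np.array([0.5, 0.8])
y = rng.poisson(np.exp(X @ beta_true))

beta = np.zeros(2)
for _ in range(25):                      # Fisher scoring iterations
    mu = np.exp(X @ beta)
    score = X.T @ (y - mu)               # D^T V^{-1} (y - mu) for this link
    info = X.T @ (mu[:, None] * X)       # expected information D^T V^{-1} D
    step = np.linalg.solve(info, score)
    beta = beta + step
    if np.max(np.abs(step)) < 1e-10:
        break
print(beta)
```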
Generalized estimating equations (GEE)
Yij's are not necessarily Normal
Yij's are not necessarily independent
Ri is the correlation matrix for the Yij's
The variance–covariance matrix can be written as Ai^(1/2) Ri Ai^(1/2),
where Ai is diagonal with elements var(Yij)
Score = U(β) = Σi DiT Vi⁻¹ (yi − μi) = 0
where Vi = φ Ai^(1/2) Ri Ai^(1/2) (φ allows for over-dispersion)
Generalized estimating equations
Di is the matrix of derivatives ∂μi/∂βj
Vi is the ‘working’ covariance matrix of Yi
Ai = diag{var(Yij)}
Ri is the correlation matrix for Yi
φ is an overdispersion parameter
Overdispersion parameter
Estimated using the formula:
φ̂ = [1/(N − p)] Σi Σj (yij − μ̂ij)² / var(μ̂ij)
where N is the total number of measurements and
p is the number of regression parameters
The square root of the overdispersion parameter
is called the scale parameter
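The φ̂ formula above is a Pearson-statistic average, and is easy to compute directly. A sketch with illustrative Poisson data (variance function var(μ) = μ), where φ̂ should come out near 1 because the data are genuinely Poisson:

```python
import numpy as np

rng = np.random.default_rng(2)
N, p = 1000, 1
mu_hat = np.full(N, 4.0)       # fitted means from some model with p parameters
y = rng.poisson(mu_hat)        # no overdispersion, so phi should be near 1

# phi_hat = (1/(N-p)) * sum (y - mu_hat)^2 / var(mu_hat)
phi_hat = np.sum((y - mu_hat) ** 2 / mu_hat) / (N - p)
scale = np.sqrt(phi_hat)       # the scale parameter is the square root of phi
print(phi_hat)
```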
Estimation (1)
For the Normal linear model with Vi = σ²I,
solve U(β) = Σi XiT (yi − Xiβ) = 0 to get β̂ = (XTX)⁻¹XTy,
with var(β̂) = (XT V⁻¹X)⁻¹
More generally, unless Vi is known, we need iteration to solve
U(β) = Σi DiT Vi⁻¹ (yi − μi) = 0:
1. Guess Vi and estimate β by b, and hence μ̂
2. Calculate residuals, rij = yij − μ̂ij
3. Estimate Vi from the residuals
4. Re-estimate b using the new estimate of Vi
Repeat steps 2–4 until convergence
Estimation (2) – For GEEs
Liang and Zeger (1984) showed if R is correctly
specified, βˆ is consistent and asymptotically Normal.
βˆ is fairly robust, so correct specification of
R ('working correlation matrix') is not critical.
Also V is estimated so need 'sandwich estimator'
for var(βˆ )
Vs (βˆ ) = I -1CI 1 where I = DT Vˆ -1D and
C = DT Vˆ -1(y-μˆ )(y-μˆ )T Vˆ -1D
13
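The sandwich formula is simple to assemble once a model is fitted. A sketch for the easiest case (identity link with working independence, so Di = Xi and V̂i = I): the "bread" I is accumulated as XT X and the "meat" C is accumulated unit by unit from the residuals. Data and dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)
n_units, n_times, p = 50, 4, 2
u = rng.normal(size=n_units)                     # unit effects induce correlation
X = np.column_stack([np.ones(n_units * n_times),
                     rng.normal(size=n_units * n_times)])
groups = np.repeat(np.arange(n_units), n_times)
beta_true = np.array([1.0, 0.5])
y = X @ beta_true + u[groups] + rng.normal(size=n_units * n_times)

beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]  # solves the independence GEE
r = y - X @ beta_hat

I_mat = X.T @ X                                  # bread: sum_i D_i^T V_i^{-1} D_i
C = np.zeros((p, p))                             # meat, accumulated per unit
for g in range(n_units):
    Xg, rg = X[groups == g], r[groups == g]
    C += Xg.T @ np.outer(rg, rg) @ Xg
I_inv = np.linalg.inv(I_mat)
V_sandwich = I_inv @ C @ I_inv                   # Vs = I^{-1} C I^{-1}
robust_se = np.sqrt(np.diag(V_sandwich))
print(robust_se)
```

This is why the robust SEs in the stroke-data tables later can differ substantially from the asymptotic (model-based) SEs: the meat C uses the observed residual cross-products rather than the working covariance.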
Iterative process for GEEs
• Start with Ri = identity (i.e. independence) and φ = 1: estimate β
• Use the estimates to calculate fitted values: μ̂i = g⁻¹(Xiβ̂)
• And residuals: yi − μ̂i
• These are used to estimate Ai, Ri and φ
• Then the GEEs are solved again to obtain
improved estimates of β
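The steps above can be sketched end to end. A minimal, illustrative implementation assuming an identity link, an exchangeable working correlation, and balanced data (ni = t for every unit); the moment estimator for the common correlation ρ uses all within-unit residual pairs:

```python
import numpy as np

rng = np.random.default_rng(4)
n_units, t, p = 60, 5, 2
X = np.column_stack([np.ones(n_units * t), rng.normal(size=n_units * t)])
groups = np.repeat(np.arange(n_units), t)
beta_true = np.array([2.0, 1.0])
y = X @ beta_true + rng.normal(size=n_units)[groups] + rng.normal(size=n_units * t)

beta = np.linalg.lstsq(X, y, rcond=None)[0]   # start from independence, phi = 1
for _ in range(20):
    r = y - X @ beta
    phi = np.sum(r ** 2) / (len(y) - p)       # variance / overdispersion estimate
    # moment estimate of the exchangeable correlation from residual pairs
    rho_num, rho_den = 0.0, 0.0
    for g in range(n_units):
        rg = r[groups == g]
        rho_num += np.sum(rg) ** 2 - np.sum(rg ** 2)   # sum over pairs l != m
        rho_den += t * (t - 1)
    rho = rho_num / (rho_den * phi)
    R = np.full((t, t), rho)
    np.fill_diagonal(R, 1.0)
    Vinv = np.linalg.inv(phi * R)             # same working V_i for every unit
    # re-solve sum_i X_i^T V_i^{-1} (y_i - X_i beta) = 0
    A = np.zeros((p, p))
    b = np.zeros(p)
    for g in range(n_units):
        Xg, yg = X[groups == g], y[groups == g]
        A += Xg.T @ Vinv @ Xg
        b += Xg.T @ Vinv @ yg
    new_beta = np.linalg.solve(A, b)
    if np.max(np.abs(new_beta - beta)) < 1e-8:
        beta = new_beta
        break
    beta = new_beta
print(beta)
```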
Correlation
For unit i,
        [ 1    ρ12  …  ρ1n ]
Vi = σ² [ ρ21  1    …  …   ]
        [ …    …    …  …   ]
        [ ρn1  …    …  1   ]
For repeated measures, ρlm = correlation between times l and m
For clustered data, ρlm = correlation between measures l and m
For all models considered here, Vi is assumed to be the same for
all units
Types of correlation
1. Independent: Vi is diagonal
2. Exchangeable: all measurements on the same
unit are equally correlated: ρlm = ρ for all l ≠ m
Plausible for clustered data
Other terms: spherical and compound symmetry
Types of correlation
3. Correlation depends on time or distance between
measurements l and m:
ρlm is a function of |l − m|, e.g. ρlm = e^(−|l−m|);
e.g. a first-order autoregressive (AR(1)) model has terms ρ,
ρ², ρ³ and so on
Plausible for repeated measures where correlation is
known to decline over time
4. Unstructured correlation: no assumptions about the
correlations ρlm
Lots of parameters to estimate – may not converge
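The parametric structures above are cheap to build explicitly. A sketch for n = 4 measurements with an illustrative ρ = 0.6:

```python
import numpy as np

n, rho = 4, 0.6

R_indep = np.eye(n)                    # 1. independent: identity

R_exch = np.full((n, n), rho)          # 2. exchangeable: constant off-diagonal
np.fill_diagonal(R_exch, 1.0)

idx = np.arange(n)                     # 3. AR(1): rho^|l - m|
R_ar1 = rho ** np.abs(idx[:, None] - idx[None, :])
print(R_ar1)
```

The unstructured case has no formula: all n(n − 1)/2 off-diagonal entries are estimated freely, which is why it may fail to converge.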
Missing Data
For missing data, the working correlation can be
estimated using the all-available-pairs method,
in which all non-missing pairs of data are used
in the estimators of the working correlation parameters.
Choosing the Best Model
Standard Regression (GLM)
AIC = −2·log-likelihood + 2·(number of parameters)
Smaller values indicate better fit
and greater parsimony.
Choosing the Best Model
GEE
QIC(V) – function of V, so can use to
choose best correlation structure.
QICu – measure that can be used to
determine the best subsets of
covariates for a particular model.
the best model is the one with the
smallest value!
20
Other approaches – alternatives to GEEs
1. Multivariate modelling – treat all
measurements on the same unit as dependent
variables (even though they are measurements
of the same variable) and model them
simultaneously
(Hand and Crowder, 1996)
e.g., SPSS uses this approach (with
exchangeable correlation) for repeated
measures ANOVA
Other approaches – alternatives to GEEs
2. Mixed models – fixed and random effects
e.g., y = Xβ + Zu + e
β: fixed effects; u: random effects ~ N(0, G);
e: error terms ~ N(0, R)
var(y) = ZGZT + R,
so correlation between the elements of y is due to
the random effects
(Verbeke and Molenberghs, 1997)
Example of correlation from random effects
Cluster sampling – randomly select areas (PSUs), then
households within areas
Yij = μ + ui + eij
Yij: income of household j in area i
μ: average income for the population
ui: random effect of area i ~ N(0, σu²); eij: error ~ N(0, σe²)
E(Yij) = μ; var(Yij) = σu² + σe²;
cov(Yij, Ykm) = σu² if i = k, and cov(Yij, Ykm) = 0 otherwise.
So Vi is exchangeable with off-diagonal elements
ρ = σu² / (σu² + σe²) = ICC
(ICC: intraclass correlation coefficient)
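The ICC result above can be checked by simulation: generate data from Yij = μ + ui + eij and compare the empirical within-area correlation to σu²/(σu² + σe²). All values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
n_areas, n_households = 2000, 2
sigma_u, sigma_e = 2.0, 1.0

u = rng.normal(0.0, sigma_u, size=n_areas)             # area random effects
e = rng.normal(0.0, sigma_e, size=(n_areas, n_households))
Y = 10.0 + u[:, None] + e                              # Y_ij = mu + u_i + e_ij

icc_theory = sigma_u ** 2 / (sigma_u ** 2 + sigma_e ** 2)   # = 0.8 here
icc_empirical = np.corrcoef(Y[:, 0], Y[:, 1])[0, 1]         # within-area corr
print(icc_theory, icc_empirical)
```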
Numerical example: Recovery from stroke
Treatment groups
A = new OT intervention
B = special stroke unit, same hospital
C = usual care in a different hospital
8 patients per group
Measurements of functional ability – Barthel index,
measured weekly for 8 weeks
Yijk: patient i, group j, time k
• Exploratory analyses – plots
• Naïve analyses
• Modelling
Numerical example: time plots
[Figure: Barthel score (0–100) against week (2–8) for individual
patients, with the overall regression line]
Numerical example: time plots for groups
[Figure: mean Barthel score (30–80) against week (2–8) by group:
A blue, B black, C red]
Numerical example: research questions
• Primary question: do slopes differ
(i.e. do treatments have different effects)?
• Secondary question: do intercepts differ
(i.e. are the groups the same initially)?
Numerical example: scatter plot matrix
[Figure: scatter plot matrix of Barthel scores for Weeks 1–8]
Numerical example
Correlation matrix
week    1     2     3     4     5     6     7
2      0.93
3      0.88  0.92
4      0.83  0.88  0.95
5      0.79  0.85  0.91  0.92
6      0.71  0.79  0.85  0.88  0.97
7      0.62  0.70  0.77  0.83  0.92  0.96
8      0.55  0.64  0.70  0.77  0.88  0.93  0.98
Numerical example
1. Pooled analysis ignoring correlation within patients
Yijk = αj + βjk + eijk; j for groups, k for time
Different intercepts and different slopes for the groups.
Assume all Yijk are independent with the same variance
(i.e. ignore the correlation between observations).
Use multiple regression to compare the αj's and βj's.
To model different slopes, use group × time interaction terms.
Numerical example
2. Data reduction
Fit a straight line for each patient: Yijk = αij + βijk + eijk
Assume independence and constant variance;
use simple linear regression to estimate αij and βij.
Perform ANOVA using the estimates α̂ij as data,
with groups as levels of a factor, to compare the αj's.
Repeat the ANOVA using the β̂ij's as data to compare the βj's.
Numerical example
3. Repeated measures analyses using
various variance–covariance structures
Fit Yijk = αj + βjk + eijk
with αj and βj as the parameters of interest,
assuming Normality for eijk but trying
various forms for the variance–covariance matrix.
For the stroke data, from the scatter plot matrix and
correlations, an autoregressive structure (e.g. AR(1))
seems most appropriate.
Use GEEs to fit the models.
Numerical example
4. Mixed/random effects model
Use the model Yijk = (αj + aij) + (βj + bij)k + eijk
(i) αj and βj are fixed effects for the groups
(ii) the other effects are random:
aij ~ N(0, σa²), bij ~ N(0, σb²), eijk ~ N(0, σe²),
and all are independent
Fit the model and use the estimates of the fixed effects
to compare the αj's and βj's.
Numerical example: Results for intercepts
Intercept A          Estimate   Asymp SE   Robust SE
Pooled               29.821     5.772      –
Data reduction       29.821     7.572      –
GEE, independent     29.821     5.683      10.395
GEE, exchangeable    29.821     7.047      10.395
GEE, AR(1)           33.492     7.624      9.924
GEE, unstructured    30.703     7.406      10.297
Random effects       29.821     7.047      –
Results from Stata 8
Numerical example: Results for intercepts
B − A                Estimate   Asymp SE   Robust SE
Pooled               3.348      8.166      –
Data reduction       3.348      10.709     –
GEE, independent     3.348      8.037      11.884
GEE, exchangeable    3.348      9.966      11.884
GEE, AR(1)           -0.270     10.782     11.139
GEE, unstructured    2.058      10.474     11.564
Random effects       3.348      9.966      –
Results from Stata 8
Numerical example: Results for intercepts
C − A                Estimate   Asymp SE   Robust SE
Pooled               -0.022     8.166      –
Data reduction       -0.018     10.709     –
GEE, independent     -0.022     8.037      11.130
GEE, exchangeable    -0.022     9.966      11.130
GEE, AR(1)           -6.396     10.782     10.551
GEE, unstructured    -1.403     10.474     10.906
Random effects       -0.022     9.966      –
Results from Stata 8
Numerical example: Results for slopes
Slope A              Estimate   Asymp SE   Robust SE
Pooled               6.324      1.143      –
Data reduction       6.324      1.080      –
GEE, independent     6.324      1.125      1.156
GEE, exchangeable    6.324      0.463      1.156
GEE, AR(1)           6.074      0.740      1.057
GEE, unstructured    7.126      0.879      1.272
Random effects       6.324      0.463      –
Results from Stata 8
Numerical example: Results for slopes
B − A                Estimate   Asymp SE   Robust SE
Pooled               -1.994     1.617      –
Data reduction       -1.994     1.528      –
GEE, independent     -1.994     1.592      1.509
GEE, exchangeable    -1.994     0.655      1.509
GEE, AR(1)           -2.142     1.047      1.360
GEE, unstructured    -3.556     1.243      1.563
Random effects       -1.994     0.655      –
Results from Stata 8
Numerical example: Results for slopes
C − A                Estimate   Asymp SE   Robust SE
Pooled               -2.686     1.617      –
Data reduction       -2.686     1.528      –
GEE, independent     -2.686     1.592      1.502
GEE, exchangeable    -2.686     0.655      1.509
GEE, AR(1)           -2.236     1.047      1.504
GEE, unstructured    -4.012     1.243      1.598
Random effects       -2.686     0.655      –
Results from Stata 8
Numerical example: summary of results
• All models produced similar results, leading to the same
conclusion – no treatment differences
• Pooled analysis and data reduction are useful for
exploratory analysis – they are easy to follow and give good
approximations for the estimates, but the variances may be
inaccurate
• Random effects models give very similar results to GEEs
– there is no need to specify the variance–covariance matrix
– the model specification may or may not be more natural