Marginal and Random Effects Models for Discrete Data

• Chapters 7-10 in Diggle, Heagerty, Liang and Zeger (DHLZ)
• Generalized Linear Models
  – Poisson regression (log-linear models)
  – Logistic regression models

• Suppose we have $i = 1, 2, \ldots, m$ groups of observations
• There are $n_i$ observations in the $i$-th group, denoted by
  $Y_{i1}, Y_{i2}, \ldots, Y_{i,n_i}$ for $i = 1, 2, \ldots, m$
• The observations in any group are independent of the observations in any other group
• Within a group, the observations may be correlated
Generalized Linear Models

Each model has a systematic part and a random part.

• Expectation of the response

  $$E(Y_{ij}) = \mu_{ij} = h^{-1}(X_{ij}^T \beta)$$

  where $h$ is a known link function, or

  $$h(\mu_{ij}) = X_{ij}^T \beta$$

• Specify the distribution of $Y_{ij}$
  – $Y_{ij} \sim \text{Poisson}(\mu_{ij})$
  – $Y_{ij} \sim \text{Binomial}(n, \pi)$

Example: Logistic regression

• $Y_{ij} \sim \text{Binomial}(n_{ij}, \pi_{ij})$
• $E(Y_{ij}) = \mu_{ij} = n_{ij} \pi_{ij}$
• Logit link function:

  $$h(\mu_{ij}) = \log\left(\frac{\mu_{ij}}{n_{ij} - \mu_{ij}}\right) = \log\left(\frac{\pi_{ij}}{1 - \pi_{ij}}\right)$$

• $\log\left(\frac{\pi_{ij}}{1 - \pi_{ij}}\right) = \beta_0 + \beta_1 X_{1ij} + \ldots + \beta_k X_{kij}$

  $$\Rightarrow \quad \pi_{ij} = \frac{\exp(X_{ij}^T \beta)}{1 + \exp(X_{ij}^T \beta)}$$

• $\exp(\beta_r)$ is a conditional odds ratio: the odds of success at $X_r + 1$ divided by the odds of success at $X_r$, with all other variables held constant
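The link between the logit model and the odds-ratio interpretation of $\exp(\beta_r)$ can be checked numerically. The sketch below is illustrative only: the coefficient values and the function names `inv_logit` and `odds` are made up for the example and do not come from the course material.

```python
import math

def inv_logit(eta):
    """Inverse logit: pi = exp(eta) / (1 + exp(eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

def odds(p):
    """Odds of success for a success probability p."""
    return p / (1.0 - p)

# Hypothetical coefficients for a model with an intercept and one covariate.
beta0, beta1 = -1.0, 0.5

p_at_x = inv_logit(beta0 + beta1 * 2.0)         # success probability at X = 2
p_at_x_plus_1 = inv_logit(beta0 + beta1 * 3.0)  # success probability at X = 3

# Ratio of the odds at X + 1 to the odds at X; this equals exp(beta1).
odds_ratio = odds(p_at_x_plus_1) / odds(p_at_x)
```

The ratio does not depend on the value of X chosen, which is why $\exp(\beta_r)$ can be reported as a single conditional odds ratio.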
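If the $Y_{ij}$ are mutually independent, the likelihood function is

$$L(\beta) = \prod_{i=1}^{m} \prod_{j=1}^{n_i} \binom{n_{ij}}{Y_{ij}} \pi_{ij}^{Y_{ij}} (1 - \pi_{ij})^{n_{ij} - Y_{ij}}$$

where $\pi_{ij} = \dfrac{\exp(X_{ij}^T \beta)}{1 + \exp(X_{ij}^T \beta)}$,

and the log-likelihood function is

$$\ell(\beta) = \sum_{i=1}^{m} \sum_{j=1}^{n_i} \left[ \log(n_{ij}!) - \log(Y_{ij}!) - \log((n_{ij} - Y_{ij})!) \right] + \sum_{i=1}^{m} \sum_{j=1}^{n_i} Y_{ij} \log(\pi_{ij}) + \sum_{i=1}^{m} \sum_{j=1}^{n_i} (n_{ij} - Y_{ij}) \log(1 - \pi_{ij})$$

$$= \sum_{i=1}^{m} \sum_{j=1}^{n_i} \left[ \log(n_{ij}!) - \log(Y_{ij}!) - \log((n_{ij} - Y_{ij})!) \right] + \sum_{i=1}^{m} \sum_{j=1}^{n_i} Y_{ij} \log\left(\frac{\pi_{ij}}{1 - \pi_{ij}}\right) + \sum_{i=1}^{m} \sum_{j=1}^{n_i} n_{ij} \log(1 - \pi_{ij})$$

$$= \sum_{i=1}^{m} \sum_{j=1}^{n_i} \left[ \log(n_{ij}!) - \log(Y_{ij}!) - \log((n_{ij} - Y_{ij})!) \right] + \sum_{i=1}^{m} \sum_{j=1}^{n_i} Y_{ij} X_{ij}^T \beta - \sum_{i=1}^{m} \sum_{j=1}^{n_i} n_{ij} \log(1 + \exp(X_{ij}^T \beta))$$

Likelihood Equations:

$$0 = \frac{\partial \ell}{\partial \beta_0} = \sum_{i=1}^{m} \sum_{j=1}^{n_i} Y_{ij} - \sum_{i=1}^{m} \sum_{j=1}^{n_i} n_{ij} \frac{\exp(X_{ij}^T \beta)}{1 + \exp(X_{ij}^T \beta)} = \sum_{i=1}^{m} \sum_{j=1}^{n_i} (Y_{ij} - n_{ij} \pi_{ij})$$

$$0 = \frac{\partial \ell}{\partial \beta_r} = \sum_{i=1}^{m} \sum_{j=1}^{n_i} Y_{ij} X_{rij} - \sum_{i=1}^{m} \sum_{j=1}^{n_i} n_{ij} X_{rij} \frac{\exp(X_{ij}^T \beta)}{1 + \exp(X_{ij}^T \beta)} = \sum_{i=1}^{m} \sum_{j=1}^{n_i} X_{rij} (Y_{ij} - n_{ij} \pi_{ij})$$

for $r = 1, 2, \ldots, k$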
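As a rough check on the derivation above, the log-likelihood and the score vector $\sum_i \sum_j X_{rij}(Y_{ij} - n_{ij}\pi_{ij})$ can be coded directly and compared by finite differences. This is a minimal sketch in plain Python; the data, the parameter values, and the function names are hypothetical, and each row of `X` is assumed to start with a 1 for the intercept.

```python
import math

def linear_predictor(x, beta):
    """X_ij^T beta for one covariate row x."""
    return sum(x_r * b_r for x_r, b_r in zip(x, beta))

def log_likelihood(X, y, n, beta):
    """Binomial log-likelihood with logit link, including the constant term."""
    total = 0.0
    for x, y_ij, n_ij in zip(X, y, n):
        eta = linear_predictor(x, beta)
        # log(n!) - log(y!) - log((n - y)!) via the log-gamma function
        total += (math.lgamma(n_ij + 1) - math.lgamma(y_ij + 1)
                  - math.lgamma(n_ij - y_ij + 1))
        # y * X^T beta - n * log(1 + exp(X^T beta))
        total += y_ij * eta - n_ij * math.log(1.0 + math.exp(eta))
    return total

def score(X, y, n, beta):
    """Score vector: r-th element is sum of x_r * (y - n * pi)."""
    q = [0.0] * len(beta)
    for x, y_ij, n_ij in zip(X, y, n):
        pi_ij = 1.0 / (1.0 + math.exp(-linear_predictor(x, beta)))
        for r, x_r in enumerate(x):
            q[r] += x_r * (y_ij - n_ij * pi_ij)
    return q

# Hypothetical toy data: three binomial counts, intercept plus one covariate.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [2, 3, 5]
n = [10, 10, 10]
beta = [-1.0, 0.5]
q = score(X, y, n, beta)
```

Setting every element of `q` to zero gives the likelihood equations; the constant term in `log_likelihood` does not involve `beta` and so does not affect the score.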
For the $i$-th group of responses let

$$X_i = \begin{bmatrix} X_{i1}^T \\ X_{i2}^T \\ \vdots \\ X_{i,n_i}^T \end{bmatrix} \qquad Y_i = \begin{bmatrix} Y_{i1} \\ Y_{i2} \\ \vdots \\ Y_{i,n_i} \end{bmatrix} \qquad \mu_i = \begin{bmatrix} n_{i1} \pi_{i1} \\ n_{i2} \pi_{i2} \\ \vdots \\ n_{i,n_i} \pi_{i,n_i} \end{bmatrix}$$

Stack these on top of each other to form a model matrix and vectors of responses and estimated means for the entire data set:

$$X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_m \end{bmatrix} \qquad Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_m \end{bmatrix} \qquad \mu = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_m \end{bmatrix}$$

The likelihood equations are

$$0 = Q = X^T (Y - \mu) = \sum_{i=1}^{m} X_i^T (Y_i - \mu_i)$$

The negatives of the second partial derivatives of the log-likelihood function are:

$$H = \begin{bmatrix} -\frac{\partial^2 \ell}{\partial \beta_0^2} & -\frac{\partial^2 \ell}{\partial \beta_0 \partial \beta_1} & \cdots & -\frac{\partial^2 \ell}{\partial \beta_0 \partial \beta_k} \\ \vdots & \vdots & \ddots & \vdots \\ -\frac{\partial^2 \ell}{\partial \beta_0 \partial \beta_k} & -\frac{\partial^2 \ell}{\partial \beta_1 \partial \beta_k} & \cdots & -\frac{\partial^2 \ell}{\partial \beta_k^2} \end{bmatrix} = X^T V X = \sum_{i=1}^{m} X_i^T V_i X_i$$

where

$$V_i = \begin{bmatrix} n_{i1} \pi_{i1} (1 - \pi_{i1}) & & \\ & \ddots & \\ & & n_{i,n_i} \pi_{i,n_i} (1 - \pi_{i,n_i}) \end{bmatrix}$$

for $i = 1, 2, \ldots, m$, and

$$V = Var(Y) = \begin{bmatrix} V_1 & & \\ & \ddots & \\ & & V_m \end{bmatrix}$$

$H$ is the Fisher information matrix and it does not depend on the observed counts.
Likelihood equations are solved iteratively

• Usually no closed form solution exists
• Newton-Raphson algorithm is equivalent to Fisher scoring

  $$\hat\beta^{(S+1)} = \hat\beta^{(S)} + \hat H^{-1} \hat Q$$

  where $\hat H$ and $\hat Q$ are evaluated at $\hat\beta^{(S)}$
• Modification (halving):
  – Check if the log-likelihood is larger at $\hat\beta^{(S+1)}$ than at $\hat\beta^{(S)}$
  – If it is, go to the next iteration
  – If not, try $\hat\beta^{(S+1)} = \frac{1}{2} \left( \hat\beta^{(S)} + \hat\beta^{(S+1)} \right)$ and repeat the check.
• Starting values: $\hat\beta_r^{(0)} = 0$, $r = 1, 2, \ldots, k$, and $\hat\beta_0^{(0)} = \log\left(\frac{\hat P}{1 - \hat P}\right)$ where $\hat P = \frac{\sum Y_{ij}}{\sum n_{ij}}$

The likelihood equations can be written as

$$0 = X^T (Y - \mu) = X^T V V^{-1} (Y - \mu) = \sum_{i=1}^{m} X_i^T V_i V_i^{-1} (Y_i - \mu_i) = \sum_{i=1}^{m} D_i V_i^{-1} (Y_i - \mu_i)$$

where

$$D_i = X_i^T V_i = \begin{bmatrix} \frac{\partial \mu_{i1}}{\partial \beta_0} & \cdots & \frac{\partial \mu_{i,n_i}}{\partial \beta_0} \\ \frac{\partial \mu_{i1}}{\partial \beta_1} & \cdots & \frac{\partial \mu_{i,n_i}}{\partial \beta_1} \\ \vdots & & \vdots \\ \frac{\partial \mu_{i1}}{\partial \beta_k} & \cdots & \frac{\partial \mu_{i,n_i}}{\partial \beta_k} \end{bmatrix}$$

and

$$\mu_{ij} = n_{ij} \pi_{ij} = n_{ij} \frac{\exp(X_{ij}^T \beta)}{1 + \exp(X_{ij}^T \beta)}$$

Generalized Estimating Equations (GEE)

To estimate $\beta$ solve

$$0 = \sum_{i=1}^{m} D_i V_i^{-1} (Y_i - \mu_i)$$

where $V_i = Var(Y_i)$, $D_i$ is the $(k+1) \times n_i$ matrix of partial derivatives $\partial \mu_{ij} / \partial \beta_r$ shown above, and

$$\mu_{ij} = n_{ij} \pi_{ij} = n_{ij} \frac{\exp(X_{ij}^T \beta)}{1 + \exp(X_{ij}^T \beta)}$$

Marginal Model for Correlated Binary Responses

Suppose several measurements are taken on the same subject or experimental unit

• Examine a patient for the presence or absence of a rash at different points in time
• Mastitis infections in different lactation periods of a dairy cow
• Survival of ducklings to migration time
• Presence or absence of tumors in different organs
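The Fisher scoring update $\hat\beta^{(S+1)} = \hat\beta^{(S)} + \hat H^{-1}\hat Q$ with the step-halving check and the suggested starting values can be sketched as follows. This is a simplified illustration, not the routine used by any particular package: it handles only an intercept plus one covariate so that the 2 x 2 system can be solved by hand, and the data and the name `fit_logistic` are hypothetical.

```python
import math

def fit_logistic(X, y, n, max_iter=50, tol=1e-8):
    """Fisher scoring with step halving for a two-parameter binomial logit model.

    Each row of X is assumed to be [1, x] (intercept first).
    """
    def eta(x, b):
        return b[0] * x[0] + b[1] * x[1]

    def loglik(b):
        # Log-likelihood without the constant term, which does not involve b.
        return sum(yy * eta(x, b) - nn * math.log(1.0 + math.exp(eta(x, b)))
                   for x, yy, nn in zip(X, y, n))

    # Starting values: slope 0, intercept log(P / (1 - P)) with P = sum(y) / sum(n).
    p_hat = sum(y) / sum(n)
    b = [math.log(p_hat / (1.0 - p_hat)), 0.0]

    for _ in range(max_iter):
        q = [0.0, 0.0]                      # score vector Q = X^T (Y - mu)
        h = [[0.0, 0.0], [0.0, 0.0]]        # information H = X^T V X
        for x, yy, nn in zip(X, y, n):
            p = 1.0 / (1.0 + math.exp(-eta(x, b)))
            for r in range(2):
                q[r] += x[r] * (yy - nn * p)
                for s in range(2):
                    h[r][s] += x[r] * x[s] * nn * p * (1.0 - p)
        # Solve H * step = Q for the 2 x 2 case.
        det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
        step = [(h[1][1] * q[0] - h[0][1] * q[1]) / det,
                (h[0][0] * q[1] - h[1][0] * q[0]) / det]
        new = [b[0] + step[0], b[1] + step[1]]
        # Step halving: move back toward b until the log-likelihood does not decrease.
        halvings = 0
        while loglik(new) < loglik(b) and halvings < 20:
            new = [0.5 * (b[0] + new[0]), 0.5 * (b[1] + new[1])]
            halvings += 1
        converged = max(abs(new[0] - b[0]), abs(new[1] - b[1])) < tol
        b = new
        if converged:
            break
    return b

# Hypothetical toy data: four binomial counts with a single covariate.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1, 3, 6, 8]
n = [10, 10, 10, 10]
beta_hat = fit_logistic(X, y, n)
```

At convergence the score vector evaluated at `beta_hat` is numerically zero, which is exactly the statement of the likelihood equations.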
Suppose patients treated for a recurring rash are examined at a series of time points. For the $i$-th patient at the $j$-th time point

$$Y_{ij} = \begin{cases} 1 & \text{if rash is present} \\ 0 & \text{if rash is absent} \end{cases}$$

Assume $Y_{ij} \sim \text{Binomial}(1, \pi_{ij})$ where

$$\log\left(\frac{\pi_{ij}}{1 - \pi_{ij}}\right) = \beta_0 + \beta_1 X_{1ij} + \cdots + \beta_k X_{kij}$$

Specify a covariance matrix for $Y_i = (Y_{i1}, \ldots, Y_{i,n_i})^T$

$$Var(Y_{ij}) = \phi \, \pi_{ij} (1 - \pi_{ij})$$

$$Cov(Y_{ij}, Y_{ir}) = \phi \, \rho_{jr} \sqrt{\pi_{ij} (1 - \pi_{ij}) \, \pi_{ir} (1 - \pi_{ir})}$$

Some possibilities are:

$$\rho_{jr} = \rho \qquad \rho_{jr} = \rho^{|j - r|} \qquad \rho_{jr} = \rho^{|t_j - t_r|^{\alpha}}$$

• The marginal variance depends on the marginal mean, $Var(Y_{ij}) = \phi \, \nu(\pi_{ij})$, where $\nu$ is a known variance function and $\phi$ is a scale parameter that may have to be estimated,
  e.g. $\nu(\pi_{ij}) = \pi_{ij} (1 - \pi_{ij})$, $\phi = 1$
• The correlation between $Y_{ij}$ and $Y_{ir}$ is $Corr(Y_{ij}, Y_{ir}) = \rho(\pi_{ij}, \pi_{ir}; \alpha)$ where $\rho$ is a known function

You must estimate association parameters (e.g. $\alpha$) so you can evaluate $V_i$ in the GEEs.
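To make the covariance specification concrete, the sketch below builds a working covariance matrix $V_i$ for one subject from the marginal probabilities and a chosen correlation pattern, taking each covariance as the correlation times the product of the two standard deviations. The function names and the numerical values of $\rho$ and $\pi_{ij}$ are assumptions made for illustration.

```python
import math

def working_cov(pi, rho_fn, phi=1.0):
    """Working covariance matrix for binary responses with marginal means pi.

    rho_fn(j, r) returns the assumed correlation between occasions j and r.
    """
    sd = [math.sqrt(phi * p * (1.0 - p)) for p in pi]
    size = len(pi)
    return [[sd[j] * sd[r] * (1.0 if j == r else rho_fn(j, r))
             for r in range(size)]
            for j in range(size)]

def exchangeable(j, r, rho=0.3):
    """rho_jr = rho for every pair of occasions."""
    return rho

def autoregressive(j, r, rho=0.3):
    """rho_jr = rho^|j - r|, so correlation decays with separation."""
    return rho ** abs(j - r)

# Hypothetical marginal probabilities at three occasions.
pi_i = [0.2, 0.5, 0.8]
V_exch = working_cov(pi_i, exchangeable)
V_ar1 = working_cov(pi_i, autoregressive)
```

With the autoregressive choice the covariance between the first and third occasions is smaller than under the exchangeable choice, since $0.3^2 < 0.3$.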
• Get an initial estimate $\hat\beta_{IWM}$ by solving the IWM equations

  $$0 = \sum_{i=1}^{m} D_i V_{IWM,i}^{-1} (Y_i - \mu_i)$$

  where $V_{IWM,i}$ is the diagonal part of $V_i = Var(Y_i)$
• Obtain $\hat\mu_{IWM,i}$ by evaluating $\mu_i$ at $\hat\beta_{IWM}$
• Use values of

  $$(Y_{ij} - \hat\mu_{IWM,ij}) / \sqrt{\widehat{Var}(Y_{ij})}$$

  to estimate association parameters (see page 147 in DHLZ)
• Use the estimates of the association parameters to evaluate $\hat V_i$
• Get an updated estimate $\hat\beta_{GEE}$ by solving

  $$0 = \sum_{i=1}^{m} \hat D_i \hat V_i^{-1} (Y_i - \mu_i)$$

• The large sample (large $m$) distribution of $\hat\beta_{GEE}$ is approximately multivariate normal with expectation $\beta$ and covariance matrix (model estimate)

  $$\left[ \sum_{i=1}^{m} \hat D_i \hat V_i^{-1} \hat D_i^T \right]^{-1}$$

• A robust (empirical) estimate of the covariance matrix for $\hat\beta_{GEE}$ is

  $$\left[ \sum_{i=1}^{m} \hat D_i \hat V_i^{-1} \hat D_i^T \right]^{-1} \left[ \sum_{i=1}^{m} \hat D_i \hat V_i^{-1} [Y_i - \hat\mu_i][Y_i - \hat\mu_i]^T \hat V_i^{-1} \hat D_i^T \right] \left[ \sum_{i=1}^{m} \hat D_i \hat V_i^{-1} \hat D_i^T \right]^{-1}$$
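The difference between the model-based and robust covariance estimates is easiest to see in the simplest possible case. The sketch below assumes an intercept-only logistic model for binary responses with a diagonal (independence) working covariance, so that every matrix in the formulas reduces to a scalar: $D_i V_i^{-1} D_i^T = n_i \hat\pi(1 - \hat\pi)$ and $D_i V_i^{-1}(Y_i - \hat\mu_i) = \sum_j (Y_{ij} - \hat\pi)$. The function name and the data are hypothetical.

```python
import math

def gee_intercept_only(clusters):
    """Intercept-only logit model with an independence working covariance.

    clusters is a list of lists of 0/1 responses, one inner list per subject.
    Returns (beta_hat, model_based_variance, robust_variance).
    """
    total_y = sum(sum(c) for c in clusters)
    total_n = sum(len(c) for c in clusters)
    p = total_y / total_n                    # solves sum_i sum_j (Y_ij - pi) = 0
    beta_hat = math.log(p / (1.0 - p))
    # "Bread": sum over clusters of D_i V_i^{-1} D_i^T = n_i * p * (1 - p).
    bread = sum(len(c) * p * (1.0 - p) for c in clusters)
    # "Meat": sum over clusters of the squared cluster-level residual totals.
    meat = sum((sum(c) - len(c) * p) ** 2 for c in clusters)
    model_based = 1.0 / bread
    robust = meat / bread ** 2
    return beta_hat, model_based, robust

# Hypothetical data: four subjects whose two responses always agree.
clusters = [[1, 1], [0, 0], [1, 1], [0, 0]]
beta_hat, model_based, robust = gee_intercept_only(clusters)
```

For these data the robust variance (1.0) is twice the model-based variance (0.5): the independence working model treats eight responses as independent, while the empirical estimator picks up the perfect within-subject agreement.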
/* Analyze the data from a crossover
   trial on cerebrovascular deficiency
   from example 8.1 in DHLZ
*/

/*
   THE VARIABLES WERE CODED AS FOLLOWS:
   ID    = CLUSTER VARIABLE
   CLASS = 1, NEEDED TO RUN GEE2
   Y     = 1  abnormal electrocardiogram
           0  normal electrocardiogram
   Int   = 1 (INTERCEPT)
   X1    = 1  placebo (treatment B)
           0  active drug A
   X2    = 1  if period 2
           0  if period 1
   X12   = X1*X2
   order = 1  if B before A
           0  if A before B

   REFERENCE:
   JONES AND KENWARD(1989)
   CHAPMAN AND HALL, P.9 */

data set1;
  infile "c:\stat565\dhlz.example8_1.data";
  input patient class y int x1 x2 x12 order ;
  /* Recode the treatment variable */
  x1=abs(1-x1);
  x12=x1*x2;
run;

/* First fit a model with an order effect */

proc genmod data=set1 descending;
  class patient;
  model y = x1 x2 x12 /
        dist=binomial link=logit
        itprint pscale
        converge=1e-8 maxit=50;
  repeated subject=patient / type=un
        modelse covb corrw;
run;
The GENMOD Procedure

Model Information

Data Set              WORK.SET1
Distribution          Binomial
Link Function         Logit
Dependent Variable    y
Observations Used     134

Class Level Information

Class      Values
patient    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
           17 18 19 20 21 22 23 24 25 26 27 28 29
           30 31 32 33 34 35 36 37 38 39 40 41 42
           43 44 45 46 47 48 49 50 51 52 53 54 55
           56 57 58 59 60 61 62 63 64 65 66 67

Response Profile

Ordered              Total
  Value     y    Frequency
      1     1           92
      2     0           42

PROC GENMOD is modeling the probability that y='1'.

Criteria For Assessing Goodness Of Fit

Criterion              DF        Value    Value/DF
Deviance              130     162.0983      1.2469
Scaled Deviance       130     162.0983      1.2469
Pearson Chi-Square    130     133.9999      1.0308
Scaled Pearson X2     130     133.9999      1.0308
Log Likelihood                -81.0491

Analysis Of Initial Parameter Estimates

                          Standard    Wald 95%
Parameter  DF  Estimate      Error    Confidence Limits    ChiSquare  Pr > ChiSq
Intercept   1    0.4308     0.3563    -0.2675    1.1290         1.46      0.2266
x1          1    1.1097     0.5738    -0.0151    2.2344         3.74      0.0531
x2          1    0.1754     0.5057    -0.8158    1.1665         0.12      0.7288
x12         1   -1.0226     0.7710    -2.5338    0.4885         1.76      0.1847
Scale       0    1.0000     0.0000     1.0000    1.0000

NOTE: The scale parameter was held fixed.

GEE Model Information

Correlation Structure           Unstructured
Subject Effect                  patient (67 levels)
Number of Clusters              67
Correlation Matrix Dimension    2
Maximum Cluster Size            2
Minimum Cluster Size            2

Last Evaluation Of The Generalized Gradient And Hessian

                Prm1         Prm2         Prm3         Prm4
Gradient   1.882E-11    9.522E-11    -7.64E-11     -1.3E-15
Prm1       17.376763    5.8342866    10.618293    4.1806177
Prm2       5.8342866    20.797139    5.7067855    12.425128
Prm3       10.618293    5.7067855    25.581146    12.425128
Prm4       4.1806177    12.425128    12.425128    12.425128

Covariance Matrix (Model-Based)

            Prm1        Prm2        Prm3        Prm4
Prm1     0.12692    -0.12692    -0.12692     0.21114
Prm2    -0.12692     0.32930     0.23027    -0.51687
Prm3    -0.12692     0.23027     0.25571    -0.44328
Prm4     0.21114    -0.51687    -0.44328     0.96959

Covariance Matrix (Empirical)

            Prm1        Prm2        Prm3        Prm4
Prm1     0.12692    -0.12692    -0.12692     0.20769
Prm2    -0.12692     0.32930     0.22811    -0.51126
Prm3    -0.12692     0.22811     0.25571    -0.43767
Prm4     0.20769    -0.51126    -0.43767     0.95837

Working Correlation Matrix

          Col1      Col2
Row1    1.0000    0.6402
Row2    0.6402    1.0000
Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates

                       Standard    95% Confidence
Parameter  Estimate       Error    Limits                   Z    Pr>|Z|
Intercept    0.4308      0.3563    -0.2675    1.1290     1.21    0.2266
x1           1.1097      0.5739    -0.0151    2.2344     1.93    0.0531
x2           0.1754      0.5057    -0.8158    1.1665     0.35    0.7288
x12         -1.0227      0.9790    -2.9414    0.8961    -1.04    0.2962

Analysis Of GEE Parameter Estimates
Model-Based Standard Error Estimates

                       Standard    95% Confidence
Parameter  Estimate       Error    Limits                   Z    Pr>|Z|
Intercept    0.4308      0.3563    -0.2675    1.1290     1.21    0.2266
x1           1.1097      0.5739    -0.0151    2.2344     1.93    0.0531
x2           0.1754      0.5057    -0.8158    1.1665     0.35    0.7288
x12         -1.0227      0.9847    -2.9526    0.9073    -1.04    0.2990
Scale        1.0000       .          .         .          .       .

NOTE: The scale parameter was held fixed.

/* Fit a model ignoring any order effect */

proc genmod data=set1 descending;
  class patient;
  model y = x1 x2 /
        dist=binomial link=logit
        itprint pscale;
  repeated subject=patient / type=un
        modelse covb corrw;
run;
Criteria For Assessing Goodness Of Fit

Criterion              DF        Value    Value/DF
Deviance              131     163.8863      1.2510
Scaled Deviance       131     163.8863      1.2510
Pearson Chi-Square    131     133.5123      1.0192
Scaled Pearson X2     131     133.5123      1.0192
Log Likelihood                -81.9432

Analysis Of Initial Parameter Estimates

                          Standard    Wald 95%
Parameter  DF  Estimate      Error    Confidence Limits    ChiSquare  Pr>ChiSq
Intercept   1    0.6604     0.3213     0.0307    1.2901         4.23    0.0398
x1          1    0.5582     0.3784    -0.1835    1.2998         2.18    0.1402
x2          1   -0.2743     0.3768    -1.0129    0.4642         0.53    0.4666
Scale       0    1.0000     0.0000     1.0000    1.0000

Working Correlation Matrix

          Col1      Col2
Row1    1.0000    0.6389
Row2    0.6389    1.0000

Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates

                       Standard    95% Confidence
Parameter  Estimate       Error    Limits                   Z    Pr>|Z|
Intercept    0.6660      0.2879     0.1017    1.2303     2.31    0.0207
x1           0.5690      0.2327     0.1129    1.0252     2.45    0.0145
x2          -0.2953      0.2311    -0.7483    0.1577    -1.28    0.2013

Analysis Of GEE Parameter Estimates
Model-Based Standard Error Estimates

                       Standard    95% Confidence
Parameter  Estimate       Error    Limits                   Z    Pr>|Z|
Intercept    0.6660      0.2842     0.1090    1.2231     2.34    0.0191
x1           0.5690      0.2288     0.1206    1.0174     2.49    0.0129
x2          -0.2953      0.2272    -0.7405    0.1499    -1.30    0.1936
Scale        1.0000       .          .         .          .       .
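The Z statistics, confidence limits, and p-values in the GEE tables follow from the estimate and its standard error under the large-sample normal approximation. As a rough check, the sketch below reproduces the Intercept row of the empirical standard error table for the model without an order effect (estimate 0.6660, standard error 0.2879). The function name `wald_summary` is made up for this example; `statistics.NormalDist` is part of the Python standard library.

```python
from statistics import NormalDist

def wald_summary(estimate, std_err, level=0.95):
    """Wald Z statistic, confidence limits, and two-sided p-value."""
    z = estimate / std_err
    z_crit = NormalDist().inv_cdf(0.5 + level / 2.0)   # about 1.96 for 95%
    lower = estimate - z_crit * std_err
    upper = estimate + z_crit * std_err
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, lower, upper, p_value

# Intercept row from the empirical standard error table above.
z, lower, upper, p_value = wald_summary(0.6660, 0.2879)
```

The same calculation with the model-based standard error (0.2842) gives the slightly larger Z value reported in the second table.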
/* Fit a model assuming completely
   independent responses */

proc genmod data=set1 descending;
  class patient;
  model y = x1 x2 /
        dist=binomial link=logit
        itprint robust
        aggregate=patient;
run;

Criteria For Assessing Goodness Of Fit

Criterion              DF        Value    Value/DF
Deviance              131     163.8863      1.2510
Scaled Deviance       131     163.8863      1.2510
Pearson Chi-Square    131     133.5123      1.0192
Scaled Pearson X2     131     133.5123      1.0192
Log Likelihood                -81.9432

Analysis Of Parameter Estimates

                          Standard    Wald 95%
Parameter  DF  Estimate      Error    Confidence Limits    ChiSquare  Pr>ChiSq
Intercept   1    0.6604     0.3293     0.0151    1.3057         4.02    0.0449
x1          1    0.5582     0.3781    -0.1829    1.2992         2.18    0.1399
x2          1   -0.2743     0.3765    -1.0122    0.4636         0.53    0.4662
Scale       0    1.0000     0.0000     1.0000    1.0000
Respiratory Infection in Indonesian Preschool Children

• Sommer et al. (1984) Amer. Jour. of Clinical Nutrition, 40, 1090-1095.
• DHLZ, Examples 8.4
• Is prevalence of respiratory infection higher among children who suffer xerophthalmia, a manifestation of chronic vitamin A deficiency?
• Does prevalence of respiratory infection change with age?
• 275 preschool children
• Examined in up to six consecutive quarters

Cross-sectional Analysis

• Use only data measured at entry into the study
• Fit a logistic regression model (standard errors in parentheses)

  $$\log\left(\frac{\pi_i}{1 - \pi_i}\right) = \underset{(0.36)}{-1.47} - \underset{(0.44)}{0.66}\,(\text{sex}) - \underset{(0.041)}{0.11}\,(\text{height}) + \underset{(1.15)}{0.44}\,(\text{xerophthalmia}) - \underset{(0.027)}{0.089}\,(\text{age}) - \underset{(0.0011)}{0.0026}\,(\text{age}^2)$$

• The sex and xerophthalmia effects are not significant
Longitudinal and cross-sectional analysis

• Separate time into two components
• $\text{age}_{i1}$ = age at entry into the study
• $(\text{age}_{ij} - \text{age}_{i1})$ = time since entry
• Fit the model (standard errors in parentheses)

  $$\log\left(\frac{\pi_{ij}}{1 - \pi_{ij}}\right) = \underset{(0.32)}{-2.21} - \underset{(0.24)}{0.53}\,(\text{sex}_i) - \underset{(0.024)}{0.048}\,(\text{height}_i) + \underset{(0.45)}{0.53}\,(\text{xerophthalmia}_i)$$

  $$- \underset{(0.013)}{0.053}\,(\text{age}_{i1}) - \underset{(0.0005)}{0.0013}\,(\text{age}_{i1}^2) - \underset{(0.071)}{0.19}\,(\text{age}_{ij} - \text{age}_{i1}) + \underset{(0.004)}{0.013}\,(\text{age}_{ij} - \text{age}_{i1})^2$$

• No significant effect of presence of xerophthalmia on incidence of respiratory infection
• Incidence of respiratory infection increases up to about 20 months, then it declines
• Quadratic follow-up time effect can be explained by seasonality (higher incidence of respiratory infection in summer)
• Association among repeated measures

  $$\log(\hat\gamma) = \underset{(0.26)}{0.49}$$

  where

  $$\gamma = \frac{Pr(Y_j = 1, Y_k = 1) \, Pr(Y_j = 0, Y_k = 0)}{Pr(Y_j = 0, Y_k = 1) \, Pr(Y_j = 1, Y_k = 0)}$$

  for all $(j, k)$
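The association parameter $\gamma$ is a pairwise odds ratio for two binary responses. The sketch below shows how it is computed from the four joint probabilities and how the reported $\log(\hat\gamma) = 0.49$ translates to the odds-ratio scale. The joint probabilities used here are hypothetical and chosen so that the two responses are independent.

```python
import math

def pairwise_odds_ratio(p11, p10, p01, p00):
    """gamma = Pr(1,1) Pr(0,0) / (Pr(0,1) Pr(1,0)) for a pair of binary responses."""
    return (p11 * p00) / (p01 * p10)

# Hypothetical independent pair with Pr(Y_j = 1) = 0.2 and Pr(Y_k = 1) = 0.3.
gamma_independent = pairwise_odds_ratio(p11=0.06, p10=0.14, p01=0.24, p00=0.56)

# Reported estimate on the log scale, converted to an odds ratio.
gamma_hat = math.exp(0.49)
```

An odds ratio of 1 corresponds to no association; the estimate $\exp(0.49)$ is somewhat above 1, indicating positive association between repeated measures on the same child.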
Quail Egg Predation

• Examine effects of local environmental conditions on predation rates
• Sampling units: 136 transects
  – Along Iowa roads
  – Each transect is divided into two sub-units
    ∗ foreslope
    ∗ backslope

Transect factors:

• $x_1 = \begin{cases} 0 & \text{paved road} \\ 1 & \text{unpaved road} \end{cases}$
• $x_2 = \begin{cases} 0 & \text{row crops} \\ 1 & \text{other} \end{cases}$
• $x_3 = \begin{cases} 0 & \text{fence or trees} \\ 1 & \text{no fence or trees} \end{cases}$
• $x_4 = \begin{cases} 0 & \text{trees} \\ 1 & \text{no trees} \end{cases}$

Sub-transect factor:

• $Z = \begin{cases} 0 & \text{foreslope} \\ 1 & \text{backslope} \end{cases}$
Binary Response

$$Y_{ijk} = \begin{cases} 1 & \text{nest was disturbed} \\ 0 & \text{otherwise} \end{cases}$$

for the $k$-th nest in the $j$-th sub-transect of the $i$-th transect.

Conditional expectation given the values of the transect and sub-transect variables is

$$E(Y_{ijk}) = \pi_{ij}$$

the conditional probability that a nest is disturbed. This is the same for all nests in a particular sub-transect.

A logistic regression model

$$\log\left(\frac{\pi_{ij}}{1 - \pi_{ij}}\right) = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \beta_4 X_{4i} + \beta_5 X_{1i} X_{2i} + \beta_6 X_{2i} X_{4i} + \beta_7 X_{1i} X_{3i} + \beta_8 Z$$
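Distributional assumptions

• Number of disturbed nests in the $j$-th sub-transect of the $i$-th transect is

  $$Y_{ij+} = Y_{ij1} + Y_{ij2} + \cdots + Y_{ijn_{ij}}$$

• Transects are far enough apart for

  $$\begin{bmatrix} Y_{i1+} \\ Y_{i2+} \end{bmatrix} \qquad i = 1, 2, \ldots, m$$

  to be independent
• Variance

  $$Var(Y_{ij+}) = \sum_{k=1}^{n_{ij}} Var(Y_{ijk}) + 2 \sum_{k<\ell} Cov(Y_{ijk}, Y_{ij\ell})$$

  $$= n_{ij} \pi_{ij} (1 - \pi_{ij}) + 2 \sum_{k<\ell} \rho_{k\ell} \, \pi_{ij} (1 - \pi_{ij})$$

  $$= n_{ij} \pi_{ij} (1 - \pi_{ij}) \left( 1 + \frac{2}{n_{ij}} \sum_{k<\ell} \rho_{k\ell} \right)$$

  $$= \theta_{ij} \, n_{ij} \pi_{ij} (1 - \pi_{ij}) \ge \text{binomial variance}$$

• Positive correlations
  – If a predator finds one nest it will look for more
  – It will attract the attention of other predators

$$Var\begin{pmatrix} Y_{i1+} \\ Y_{i2+} \end{pmatrix} = V_i^{1/2} \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix} V_i^{1/2}$$

where

$$V_i = \begin{bmatrix} \theta_1 n_{i1} \pi_{i1} (1 - \pi_{i1}) & 0 \\ 0 & \theta_2 n_{i2} \pi_{i2} (1 - \pi_{i2}) \end{bmatrix}$$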
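The variance inflation factor $\theta_{ij} = 1 + \frac{2}{n_{ij}} \sum_{k<\ell} \rho_{k\ell}$ can be illustrated with a short calculation. The sketch below assumes, purely for illustration, a common correlation $\rho$ between every pair of nests in a sub-transect, in which case $\theta = 1 + (n - 1)\rho$; the function names and numerical values are hypothetical.

```python
def variance_inflation(n, rho):
    """theta = 1 + (2 / n) * sum over pairs k < l of rho, with a common rho."""
    pair_sum = sum(rho for k in range(n) for l in range(k + 1, n))
    return 1.0 + 2.0 * pair_sum / n

def var_disturbed_count(n, pi, rho):
    """Variance of the number of disturbed nests: theta * n * pi * (1 - pi)."""
    return variance_inflation(n, rho) * n * pi * (1.0 - pi)

# Hypothetical sub-transect: 5 nests, disturbance probability 0.5.
binomial_var = var_disturbed_count(5, 0.5, 0.0)    # independent nests
inflated_var = var_disturbed_count(5, 0.5, 0.25)   # positively correlated nests
```

With $\rho = 0.25$ and five nests the variance is double the binomial variance, which is the kind of extra-binomial variation the GEE analysis is meant to accommodate.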
/* This program uses features in
   PROC GENMOD in SAS to fit logistic
   regression models to nest predation
   data from a split-plot experiment.
   The code is stored in the file

             nestgee.sas

   The data are stored in a different file
   we currently do not have permission to
   give to you.
*/

data set1;
  infile 'c:\st557\nestall.dat';
  input wshed $ loc $ round roadside $
        roadtype $ adjhab $ rdepth rwidth
        foreback percip mtemp ftotal floss
        btotal bloss ctotal closs;
  if(roadtype='grav') then x1=1; else x1=0;
  if(adjhab='nrc') then x2=1; else x2=0;
  if(roadside='nfen') then x3=1; else x3=0;
  if(roadside='wd') then x4=1; else x4=0;
  x5=x1*x2; x6=x2*x4; x7=x1*x3;
  slope=0; loss=floss; total=ftotal; output;
  slope=1; loss=bloss; total=btotal; output;
  keep wshed loc x1 x2 x3 x4 x5 x6 x7
       slope loss total;
run;

/* Sort data with respect to the
   transects and sub-transects */

proc sort data=set1; by loc slope;
run;

/* Compute parameter estimates and
   the covariance matrix
   for the IWM model using GENMOD */

proc genmod data=set1;
  class loc;
  model loss/total= x1 x2 x3 x4
        x5 x6 x7 slope/
        dist=binomial link=logit
        itprint
        converge=1e-8 maxit=50;
run;
The GENMOD Procedure

Model Information

Data Set                      WORK.SET1
Distribution                  Binomial
Link Function                 Logit
Response Variable (Events)    loss
Response Variable (Trials)    total
Observations Used             272
Number Of Events              307
Number Of Trials              1332

Class Level Information

Class    Levels
loc         136

Class Level Information

Class    Values
loc      1 10 100 101 102 103 104 105 106 107 108 109 11
         110 111 112 113 114 115 116 117 118 119 12 120 121
         122 123 124 125 126 127 128 129 13 130 131 132 133
         134 135 136 14 15 16 17 18 19 2 20 21 22 23 24 25
         26 27 28 29 3 30 31 32 33 34 35 36 37 38 39 4 40
         41 ...

Parameter Information

Parameter    Effect
Prm1         Intercept
Prm2         x1
Prm3         x2
Prm4         x3
Prm5         x4
Prm6         x5
Prm7         x6
Prm8         x7
Prm9         slope

Iteration History For Parameter Estimates

Iter  Ridge  Log Likelihood        Prm1        Prm2        Prm3        Prm4
   0      0      -704.50723   -0.954212   -0.164353    -0.66138   -0.057336
   1      0      -680.23768   -1.468752   0.0036198   -0.890005   0.1193446
   2      0      -679.60783   -1.546383   0.0315938    -0.99817   0.1514569
   3      0      -679.60595   -1.548711   0.0323312   -1.005069    0.152238
   4      0      -679.60595   -1.548714   0.0323321   -1.005088   0.1522385
   5      0      -679.60595   -1.548714   0.0323321   -1.005088   0.1522385

Iter        Prm5        Prm6        Prm7        Prm8        Prm9
   0    0.139895   0.4733128   0.5526403   -0.364248   0.5854818
   1   -0.195381   0.5827854   1.2046592   -0.910456   0.7281776
   2   -0.269986   0.6511028   1.3454385   -1.102685   0.7832247
   3   -0.272661   0.6561795   1.3505862   -1.117141   0.7856011
   4   -0.272664   0.6561945   1.3505943   -1.117218   0.7856059
   5   -0.272664   0.6561945   1.3505943   -1.117218   0.7856059
Criteria For Assessing Goodness Of Fit

Criterion              DF        Value    Value/DF
Deviance              263     546.8136      2.0791
Scaled Deviance       263     546.8136      2.0791
Pearson Chi-Square    263     491.7968      1.8699
Scaled Pearson X2     263     491.7968      1.8699
Log Likelihood               -679.6059

The GENMOD Procedure

Last Evaluation Of The Negative Of The Gradient and Hessian

                Prm1         Prm2         Prm3         Prm4         Prm5
Gradient   3.1443E-8    3.0052E-8    7.2257E-9    2.9764E-8    2.929E-10
Prm1       222.26001    179.74131    81.501383    23.896581    49.417896
Prm2       179.74131    179.74131     70.05557    13.147433    45.679179
Prm3       81.501383     70.05557    81.501383    3.2291792    33.959366
Prm4       23.896581    13.147433    3.2291792    23.896581    -3.55E-15
Prm5       49.417896    45.679179    33.959366    -3.55E-15    49.417896
Prm6        70.05557     70.05557     70.05557    2.1569729    30.220649
Prm7       33.959366    30.220649    33.959366    -1.07E-14    33.959366
Prm8       13.147433    13.147433    2.1569729    13.147433    -1.42E-14
Prm9       132.96838    106.98113     48.22479    14.886677    28.289297

                Prm6         Prm7         Prm8         Prm9
Gradient   5.8884E-9    2.188E-10    2.9604E-8      1.76E-8
Prm1        70.05557    33.959366    13.147433    132.96838
Prm2        70.05557    30.220649    13.147433    106.98113
Analysis Of Parameter Estimates

                          Standard    Wald 95%
Parameter  DF  Estimate      Error    Confidence Limits    ChiSquare  Pr > ChiSq
Intercept   1   -1.5487     0.2277    -1.9950   -1.1024        46.26      <.0001
x1          1    0.0323     0.2380    -0.4342    0.4989         0.02      0.8920
x2          1   -1.0051     0.3618    -1.7143   -0.2959         7.72      0.0055
x3          1    0.1522     0.3621    -0.5574    0.8619         0.18      0.6741
x4          1   -0.2727     0.2766    -0.8147    0.2694         0.97      0.3242
x5          1    0.6562     0.3897    -0.1076    1.4200         2.84      0.0922
x6          1    1.3506     0.3561     0.6527    2.0485        14.39      0.0001
x7          1   -1.1172     0.4646    -2.0279   -0.2066         5.78      0.0162
slope       1    0.7856     0.1371     0.5169    1.0543        32.84      <.0001
Scale       0    1.0000     0.0000     1.0000    1.0000
/* Compute GEE parameter estimates
   for an unstructured covariance
   structure. With only two sub-plots
   this is equivalent to the
   exchangeable correlation model */

proc genmod data=set1;
  class loc ;
  model loss/total= x1 x2 x3 x4 x5
        x6 x7 slope /
        dist=binomial link=logit
        itprint
        converge=1e-8 maxit=50;
  repeated subject=loc / type=un
        modelse covb corrw;
run;

Analysis Of Initial Parameter Estimates

                          Standard    Wald 95%
Parameter  DF  Estimate      Error    Confidence Limits    ChiSquare  Pr > ChiSq
Intercept   1   -1.5487     0.2277    -1.9950   -1.1024        46.26      <.0001
x1          1    0.0323     0.2380    -0.4342    0.4989         0.02      0.8920
x2          1   -1.0051     0.3618    -1.7143   -0.2959         7.72      0.0055
x3          1    0.1522     0.3621    -0.5574    0.8619         0.18      0.6741
x4          1   -0.2727     0.2766    -0.8147    0.2694         0.97      0.3242
x5          1    0.6562     0.3897    -0.1076    1.4200         2.84      0.0922
x6          1    1.3506     0.3561     0.6527    2.0485        14.39      0.0001
x7          1   -1.1172     0.4646    -2.0279   -0.2066         5.78      0.0162
slope       1    0.7856     0.1371     0.5169    1.0543        32.84      <.0001
Scale       0    1.0000     0.0000     1.0000    1.0000

NOTE: The scale parameter was held fixed.
GEE Model Information

Correlation Structure           Unstructured
Subject Effect                  loc (136 levels)
Number of Clusters              136
Correlation Matrix Dimension    2
Maximum Cluster Size            2
Minimum Cluster Size            2

Iteration History For GEE Parameter Estimates

Iter        Prm1        Prm2        Prm3        Prm4        Prm5
   0   -1.548714   0.0323321   -1.005088   0.1522385   -0.272664
   1   -1.548991   0.0334912   -1.033577   0.1479605    -0.24497
   2   -1.548988   0.0334707   -1.033953   0.1479469   -0.245076
   3   -1.548989    0.033471   -1.033956   0.1479475   -0.245078

Iter        Prm6        Prm7        Prm8        Prm9
   0   0.6561945   1.3505943   -1.117218   0.7856059
   1   0.7019773   1.3096132   -1.133245   0.7843777
   2   0.7019642   1.3102313   -1.133288   0.7843802
   3     0.70197   1.3102316   -1.133289   0.7843806

Last Evaluation Of The Generalized Gradient And Hessian

                Prm1         Prm2         Prm3         Prm4         Prm5
Gradient   -9.089E-7    -8.395E-7    -6.848E-7    -6.721E-8    4.9742E-7
Prm1       176.97859    143.34113    64.883073    18.897032    39.396613
Prm2       143.34113    143.34113    55.922266    10.399388    36.484802
Prm3       64.883073    55.922266    64.883073    2.5661978    26.788608
Prm4       18.897032    10.399388    2.5661978    18.897032    7.105E-15
Prm5       39.396613    36.484802    26.788608    7.105E-15    39.396613
Prm6       55.922266    55.922266    55.922266    1.7283638    23.876798
Prm7       26.788608    23.876798    26.788608    1.421E-14    26.788608
Prm8       10.399388    10.399388    1.7283638    10.399388    7.105E-15
Prm9       111.95357    90.062312    40.460711    12.591011    23.549996

                Prm6         Prm7         Prm8         Prm9
Gradient   -6.316E-7    6.9117E-7    -6.623E-8     -6.46E-7
Prm1       55.922266    26.788608    10.399388    111.95357
Prm2       55.922266    23.876798    10.399388    90.062312
Prm3       55.922266    26.788608     1.728363    40.460711
Prm4       1.7283638    1.421E-14    10.399388    12.591011
Prm5       23.876798    26.788608    7.105E-15    23.549996
Prm6       55.922266    23.876798    1.7283638    34.328207
Prm7       23.876798    26.788608    1.421E-14    14.994503
Prm8       1.7283638    1.421E-14    10.399388    7.2474518
Prm9       34.328207    14.994503    7.2474518     143.3098
Covariance Matrix (Model-Based)

             Prm1        Prm2        Prm3        Prm4        Prm5
Prm1      0.11642    -0.10598    -0.09483    -0.09654   0.0009203
Prm2     -0.10598     0.13328     0.09637     0.09610    -0.02771
Prm3     -0.09483     0.09637     0.31102     0.06537   -0.000239
Prm4     -0.09654     0.09610     0.06537     0.30971   0.0000137
Prm5     0.000920    -0.02771   -0.000239   0.0000137     0.17605
Prm6      0.09609    -0.12360    -0.29506    -0.06766     0.02679
Prm7    -0.004322     0.02835    -0.03778    0.005461    -0.17510
Prm8      0.09756    -0.11929    -0.06823    -0.30932     0.02337
Prm9     -0.01674    0.000612   -0.001912    0.000641   -0.001448

             Prm6        Prm7        Prm8        Prm9
Prm1      0.09609   -0.004322     0.09756    -0.01674
Prm2     -0.12360     0.02835    -0.11929    0.000612
Prm3     -0.29506    -0.03778    -0.06823   -0.001912
Prm4     -0.06766    0.005461    -0.30932    0.000641
Prm5      0.02679    -0.17510     0.02337   -0.001448
Prm6      0.36020    -0.03929     0.08361    0.001057
Prm7     -0.03929     0.29445    -0.01955    0.004114
Prm8      0.08361    -0.01955     0.50996   -0.002421
Prm9     0.001057    0.004114   -0.002421     0.02590

Covariance Matrix (Empirical)

             Prm1        Prm2        Prm3        Prm4        Prm5
Prm1      0.13142    -0.12181    -0.11250    -0.10806    0.002421
Prm2     -0.12181     0.14966     0.11228     0.10285    -0.02860
Prm3     -0.11250     0.11228     0.20404     0.06787   0.0003693
Prm4     -0.10806     0.10285     0.06787     0.39579   -0.000521
Prm5     0.002421    -0.02860    0.000369   -0.000521     0.22449
Prm6      0.10950    -0.14174    -0.19443    -0.07099     0.02818
Prm7    -0.003849     0.03128    -0.03371    0.009810    -0.22502
Prm8      0.10795    -0.13426    -0.06493    -0.39647     0.03198
Prm9     -0.01571    0.001040   0.0000132    0.008370   -0.003726

             Prm6        Prm7        Prm8        Prm9
Prm1      0.10950   -0.003849     0.10795    -0.01571
Prm2     -0.14174     0.03128    -0.13426    0.001040
Prm3     -0.19443    -0.03371    -0.06493   0.0000132
Prm4     -0.07099    0.009810    -0.39647    0.008370
Prm5      0.02818    -0.22502     0.03198   -0.003726
Prm6      0.26392    -0.03766     0.10646    0.005864
Prm7     -0.03766     0.35906    -0.04859    0.002353
Prm8      0.10646    -0.04859     0.57248   -0.008068
Prm9     0.005864    0.002353   -0.008068     0.02399
Working Correlation Matrix

          Col1      Col2
Row1    1.0000    0.2681
Row2    0.2681    1.0000

Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates

                       Standard    95% Confidence
Parameter  Estimate       Error    Limits                   Z    Pr>|Z|
Intercept   -1.5490      0.3625    -2.2595   -0.8385    -4.27    <.0001
x1           0.0335      0.3869    -0.7248    0.7917     0.09    0.9311
x2          -1.0340      0.4517    -1.9193   -0.1486    -2.29    0.0221
x3           0.1479      0.6291    -1.0851    1.3810     0.24    0.8141
x4          -0.2451      0.4738    -1.1737    0.6836    -0.52    0.6050
x5           0.7020      0.5137    -0.3049    1.7089     1.37    0.1718
x6           1.3102      0.5992     0.1358    2.4847     2.19    0.0288
x7          -1.1333      0.7566    -2.6162    0.3497    -1.50    0.1342
slope        0.7844      0.1549     0.4808    1.0880     5.06    <.0001

Analysis Of GEE Parameter Estimates
Model-Based Standard Error Estimates

                       Standard    95% Confidence
Parameter  Estimate       Error    Limits                   Z    Pr>|Z|
Intercept   -1.5490      0.3412    -2.2177   -0.8802    -4.54    <.0001
x1           0.0335      0.3651    -0.6821    0.7490     0.09    0.9269
x2          -1.0340      0.5577    -2.1270    0.0591    -1.85    0.0637
x3           0.1479      0.5565    -0.9428    1.2387     0.27    0.7904
x4          -0.2451      0.4196    -1.0675    0.5773    -0.58    0.5592
x5           0.7020      0.6002    -0.4743    1.8783     1.17    0.2421
x6           1.3102      0.5426     0.2467    2.3738     2.41    0.0158
x7          -1.1333      0.7141    -2.5329    0.2663    -1.59    0.1125
slope        0.7844      0.1609     0.4690    1.0998     4.87    <.0001
Scale        1.3673       .          .         .          .       .

NOTE: The scale parameter for GEE estimation was computed as
      the square root of the normalized Pearson's chi-square.
Poisson Regression

Suppose $Y_{ij} \sim \text{Poisson}(\mu_{ij})$ is a count provided by the $i$-th subject at the $j$-th inspection time

• Number of skin tumors on a mouse
• Number of epileptic seizures in successive two week periods (example 1.6 and example 8.4 in DHLZ)
• Number of birds found at different locations along a transect

The standard assumption is that the $Y_{ij}$ are all independent.

In longitudinal studies, counts observed on the same subject at different points in time may be correlated.

Specify a log-linear model for the mean response

$$\log(\mu_{ij}) = \beta_0 + \beta_1 X_{1ij} + \ldots + \beta_k X_{kij} = X_{ij}^T \beta$$

Specify a Poisson distribution for each observed count

$$Y_{ij} \sim \text{Poisson}(\mu_{ij})$$

Then

$$Pr(Y_{ij} = r) = \frac{\mu_{ij}^r e^{-\mu_{ij}}}{r!}, \qquad r = 0, 1, 2, \ldots$$

$$E(Y_{ij}) = \mu_{ij}$$

$$Var(Y_{ij}) = \mu_{ij}$$

where

$$\mu_{ij} = e^{\beta_0 + \beta_1 X_{1ij} + \ldots + \beta_k X_{kij}}$$
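The log-linear mean and the Poisson probabilities can be evaluated directly, which also gives a numerical check that the mean and variance of the distribution both equal $\mu_{ij}$. The sketch below uses hypothetical coefficients and covariate values; the function names are made up for the example.

```python
import math

def poisson_mean(x, beta):
    """mu = exp(X^T beta) under the log link."""
    return math.exp(sum(x_r * b_r for x_r, b_r in zip(x, beta)))

def poisson_pmf(r, mu):
    """Pr(Y = r) = mu^r exp(-mu) / r!, computed on the log scale for stability."""
    return math.exp(r * math.log(mu) - mu - math.lgamma(r + 1))

# Hypothetical coefficients (intercept, slope) and covariate row [1, x].
beta = [0.5, 0.2]
mu = poisson_mean([1.0, 1.0], beta)           # exp(0.7)

# Sum far enough into the tail that the remaining probability is negligible.
support = range(0, 101)
total_prob = sum(poisson_pmf(r, mu) for r in support)
mean = sum(r * poisson_pmf(r, mu) for r in support)
variance = sum((r - mean) ** 2 * poisson_pmf(r, mu) for r in support)
```

The equality of mean and variance is the property that fails when counts on the same subject are overdispersed or correlated, which motivates the GEE analysis of the seizure data.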
Epileptic seizure data

• Thall and Vail (1990), Biometrics 46, 657-671
• Table 1.5 and Example 8.5 in DHLZ
• The subjects were 59 epileptics suffering from partial seizures who were randomized to treatment
  – anti-epileptic drug progabide ($m_1 = 31$)
  – placebo ($m_2 = 28$)

• Explanatory variables
  X1 = 0 for placebo
       1 for progabide treatment
  X2 = (Age in years) - 29
  Time = 0, 2, 4, 6, 8 weeks
• Repeated measures on the $i$-th patient
  $Y_{i0}$ = 0.25 (Number of seizures during 8 weeks prior to start of treatment)
  $Y_{i1}$ = Number of seizures during weeks 1 and 2 after start of treatment
  $Y_{i2}$ = Number of seizures during weeks 3 and 4 after start of treatment
  $Y_{i3}$ = Number of seizures during weeks 5 and 6 after start of treatment
  $Y_{i4}$ = Number of seizures during weeks 7 and 8 after start of treatment
/*
   Use the GEE option in PROC GENMOD
   to fit a Poisson regression model
   to the epileptic seizure
   data from Thall and Vail(1990). */

data set1;
  infile 'c:\stat565\seizures.dat';
  input y1-y4 trt base age;
  patient = _N_;
  age=(age-29);
  y0 = base/4;
  z = 1;
run;

proc format; value trt 0 = 'Placebo'
                       1 = 'Progabide';
run;

/*
   Modify the data file to put
   repeated measures on different lines.
   Also create a time variable. */

data set2; set set1;
  y = y0; time=0; xt=time; output;
  y = y1; time=2; xt=time; output;
  y = y2; time=4; xt=time; output;
  y = y3; time=6; xt=time; output;
  y = y4; time=8; xt=time; output;
run;

proc sort data=set2; by trt time; run;

proc means data=set2 n mean stderr noprint;
  by trt time;
  var y;
  output out=means mean=y;
run;

axis1 label=(f=swiss h=1.2 a=90 r=0 "Seizures per 2 weeks")
      order = 5 to 10 by 1
      length= 4.5in
      value=(f=swiss h=1.0);

axis2 label=(f=swiss h=1.2 "Time(weeks)")
      order = 0 to 8 by 2
      value=(f=swiss h=1.0) w= 4.0
      length = 6. in;

SYMBOL1 V=dot H=1.5 w=4 l=1 i=join cv=black;
SYMBOL2 V=circle H=1.5 w=4 l=3 i=join cv=black;

PROC GPLOT DATA=means;
  PLOT y*time=trt / vaxis=axis1 haxis=axis2;
  TITLE1 ls=0.4 H=2.0 F=swiss "Two-Week Seizure Rates";
  legend frame across=2 down=3;
  footnote h=2 " ";
  format trt trt.;
RUN;
/*
   Use PROC GENMOD in SAS to fit a
   log-linear model with no correlation
   among repeated measurements. Standard
   errors are based on independent
   Poisson counts. */

proc genmod data=set2;
  class patient;
  model y = z time trt time*time age trt*age
        / noint dist=poisson link=log
        covb itprint
        converge=.0000001 maxit=50;
run;

The GENMOD Procedure

Model Information

Data Set              WORK.SET2
Distribution          Poisson
Link Function         Log
Dependent Variable    y
Observations Used     295

Class Level Information

Class      Levels
patient        59

Class      Values
patient    1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
           21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37
           38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54
           55 56 57 58 59

Parameter Information

Parameter    Effect
Prm1         Intercept
Prm2         z
Prm3         time
Prm4         trt
Prm5         time*time
Prm6         age
Prm7         trt*age
Iteration History For Parameter Estimates

Iter  Ridge  Log Likelihood   Prm1        Prm2        Prm3        Prm4
   0      0       2139.1864      0   2.2871859   0.2123583   0.0592947
   1      0      2623.44379      0   2.0638572   0.1298421   -0.017065
   2      0       2654.9526      0   2.0551993   0.0814555   -0.070177
   3      0      2655.22079      0   2.0595697   0.0754608    -0.07897
   4      0      2655.22082      0   2.0596283   0.0754051   -0.079092
   5      0      2655.22082      0   2.0596283   0.0754051   -0.079092

Iter        Prm5        Prm6        Prm7
   0   -0.023962   0.0290219   -0.078664
   1   -0.016116   0.0247138   -0.065618
   2   -0.011475   0.0222975   -0.057241
   3    -0.01089   0.0220365   -0.056049
   4   -0.010885   0.0220349   -0.056035
   5   -0.010885   0.0220349   -0.056035

Criteria For Assessing Goodness Of Fit

Criterion              DF        Value    Value/DF
Deviance              289    2661.2980      9.2086
Scaled Deviance       289    2661.2980      9.2086
Pearson Chi-Square    289    4154.2822     14.3747
Scaled Pearson X2     289    4154.2822     14.3747
Log Likelihood               2655.2208
Analysis Of Parameter Estimates

                          Standard    Wald 95%
Parameter  DF  Estimate      Error    Confidence Limits    ChiSquare  Pr > ChiSq
Intercept   0    0.0000     0.0000     0.0000    0.0000          .         .
z           1    2.0596     0.0488     1.9640    2.1552      1783.69      <.0001
time        1    0.0754     0.0257     0.0251    0.1257         8.62      0.0033
trt         1   -0.0791     0.0422    -0.1618    0.0036         3.52      0.0608
time*time   1   -0.0109     0.0031    -0.0170   -0.0048        12.34      0.0004
age         1    0.0220     0.0050     0.0122    0.0319        19.34      <.0001
trt*age     1   -0.0560     0.0063    -0.0684   -0.0437        79.00      <.0001
Scale       0    1.0000     0.0000     1.0000    1.0000
/*
   Use PROC GENMOD in SAS to obtain
   GEE estimates of coefficients in a
   log-linear model with an exchangeable
   correlation structure for repeated
   measurements. Standard errors are
   based on this correlation structure.
   Results for a robust covariance
   estimator are also provided. */

proc genmod data=set2;
  class patient;
  model y = z time trt time*time age trt*age
        / noint dist=poisson link=log
        covb itprint
        converge=.0000001 maxit=50;
  repeated subject=patient / type=exch
        modelse covb corrw;
run;

Criteria For Assessing Goodness Of Fit

Criterion              DF        Value    Value/DF
Deviance              289    2661.2980      9.2086
Scaled Deviance       289    2661.2980      9.2086
Pearson Chi-Square    289    4154.2822     14.3747
Scaled Pearson X2     289    4154.2822     14.3747
Log Likelihood               2655.2208

Analysis Of Initial Parameter Estimates

                          Standard    Wald 95%
Parameter  DF  Estimate      Error    Confidence Limits    ChiSquare  Pr > ChiSq
Intercept   0    0.0000     0.0000     0.0000    0.0000          .         .
z           1    2.0596     0.0488     1.9640    2.1552      1783.69      <.0001
time        1    0.0754     0.0257     0.0251    0.1257         8.62      0.0033
trt         1   -0.0791     0.0422    -0.1618    0.0036         3.52      0.0608
time*time   1   -0.0109     0.0031    -0.0170   -0.0048        12.34      0.0004
age         1    0.0220     0.0050     0.0122    0.0319        19.34      <.0001
trt*age     1   -0.0560     0.0063    -0.0684   -0.0437        79.00      <.0001
Scale       0    1.0000     0.0000     1.0000    1.0000
The GENMOD Procedure

GEE Model Information

Correlation Structure            Exchangeable
Subject Effect                   patient (59 levels)
Number of Clusters               59
Correlation Matrix Dimension     5
Maximum Cluster Size             5
Minimum Cluster Size             5
The GENMOD Procedure

Iteration History For GEE Parameter Estimates

Iter   Prm1        Prm2        Prm3        Prm4        Prm5        Prm6        Prm7
   0      0   2.0596283   0.0754051   -0.079092   -0.010885   0.0220349   -0.056035
   1      0   2.0570934   0.0754012   -0.079228   -0.010884   0.0270354   -0.062262
   2      0   2.0566815   0.0754012   -0.078869   -0.010884   0.0270001   -0.062231
   3      0   2.0566817   0.0754012   -0.078869   -0.010884   0.0269995    -0.06223

Covariance Matrix (Model-Based)

            Prm2        Prm3        Prm4        Prm5        Prm6        Prm7
Prm2     0.05570   -0.004227    -0.04823   0.0004383   -0.000861   0.0008610
Prm3   -0.004227    0.002676   -3.86E-18   -0.000310   5.325E-19   -4.72E-19
Prm4    -0.04823   -3.86E-18     0.09807   4.374E-19   0.0008610    0.001240
Prm5   0.0004383   -0.000310   4.374E-19   0.0000391   -6.44E-20   5.496E-20
Prm6   -0.000861   5.325E-19   0.0008610   -6.44E-20    0.001362   -0.001362
Prm7   0.0008610   -4.72E-19    0.001240   5.496E-20   -0.001362    0.002173
Covariance Matrix (Empirical)

            Prm2        Prm3        Prm4        Prm5        Prm6        Prm7
Prm2     0.02772   -0.000767    -0.03698   0.0000715   0.0007528   -0.000076
Prm3   -0.000767    0.002171    0.002781   -0.000225   0.0002396   -0.000645
Prm4    -0.03698    0.002781     0.08632   -0.000284   -0.001074   -0.001856
Prm5   0.0000715   -0.000225   -0.000284   0.0000254   -0.000031   0.0000760
Prm6   0.0007528   0.0002396   -0.001074   -0.000031   0.0007463   -0.000746
Prm7   -0.000076   -0.000645   -0.001856   0.0000760   -0.000746    0.001410

Working Correlation Matrix

          Col1     Col2     Col3     Col4     Col5
Row1    1.0000   0.7212   0.7212   0.7212   0.7212
Row2    0.7212   1.0000   0.7212   0.7212   0.7212
Row3    0.7212   0.7212   1.0000   0.7212   0.7212
Row4    0.7212   0.7212   0.7212   1.0000   0.7212
Row5    0.7212   0.7212   0.7212   0.7212   1.0000

Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates

                         Standard     95% Confidence
Parameter    Estimate       Error         Limits              Z    Pr>|Z|
Intercept      0.0000      0.0000     0.0000     0.0000       .     .
z              2.0567      0.1665     1.7304     2.3830   12.35    <.0001
time           0.0754      0.0466    -0.0159     0.1667    1.62    0.1056
trt           -0.0789      0.2938    -0.6547     0.4970   -0.27    0.7884
time*time     -0.0109      0.0050    -0.0208    -0.0010   -2.16    0.0307
age            0.0270      0.0273    -0.0265     0.0805    0.99    0.3230
trt*age       -0.0622      0.0376    -0.1358     0.0114   -1.66    0.0975

Analysis Of GEE Parameter Estimates
Model-Based Standard Error Estimates

                         Standard     95% Confidence
Parameter    Estimate       Error         Limits              Z    Pr>|Z|
Intercept      0.0000      0.0000     0.0000     0.0000       .     .
z              2.0567      0.2360     1.5941     2.5192    8.71    <.0001
time           0.0754      0.0517    -0.0260     0.1768    1.46    0.1450
trt           -0.0789      0.3132    -0.6927     0.5349   -0.25    0.8012
time*time     -0.0109      0.0063    -0.0231     0.0014   -1.74    0.0819
age            0.0270      0.0369    -0.0453     0.0993    0.73    0.4644
trt*age       -0.0622      0.0466    -0.1536     0.0291   -1.34    0.1819
Scale          3.7927       .          .          .          .      .

NOTE: The scale parameter for GEE estimation was computed as
      the square root of the normalized Pearson's chi-square.

/* Use PROC GENMOD in SAS to obtain
   GEE estimates of coefficients in a
   log-linear model with a first-order
   autoregressive, AR(1), correlation
   structure for repeated measurements.
   Standard errors are based on the
   model and also on a robust
   covariance estimator. */

proc genmod data=set2;
  class patient;
  model y = z time trt time*time age trt*age
        / noint dist=poisson link=log
          covb obstats itprint
          converge=.0000001 maxit=50;
  repeated subject=patient / type=ar(1)
           modelse covb corrw;
run;
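The Z and Pr>|Z| columns in the two GEE tables for the exchangeable fit are Wald statistics, estimate divided by standard error, referred to a standard normal distribution. The sketch below recomputes them from the printed estimates and standard errors so the model-based and empirical (robust) results can be compared side by side. The function and variable names are illustrative and not part of the SAS output, and small differences from the listing are expected because the inputs are rounded to four decimals.

```python
import math


def wald_z_test(estimate, std_err):
    """Return (z, two-sided p-value) for H0: coefficient = 0."""
    z = estimate / std_err
    # Two-sided normal tail area: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# (estimate, model-based SE, empirical SE), copied from the exchangeable GEE listing
rows = {
    "z": (2.0567, 0.2360, 0.1665),
    "time": (0.0754, 0.0517, 0.0466),
    "trt": (-0.0789, 0.3132, 0.2938),
    "time*time": (-0.0109, 0.0063, 0.0050),
    "age": (0.0270, 0.0369, 0.0273),
    "trt*age": (-0.0622, 0.0466, 0.0376),
}


def compare(table):
    """Map each parameter to ((z, p) model-based, (z, p) empirical)."""
    return {
        name: (wald_z_test(est, se_model), wald_z_test(est, se_emp))
        for name, (est, se_model, se_emp) in table.items()
    }


if __name__ == "__main__":
    for name, ((zm, pm), (ze, pe)) in compare(rows).items():
        print(f"{name:10s} model-based z={zm:6.2f} p={pm:.4f}   "
              f"empirical z={ze:6.2f} p={pe:.4f}")
```

In this data set the empirical standard errors are smaller than the model-based ones for every coefficient, so the robust Z statistics are larger in magnitude.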
GEE Model Information

Correlation Structure            AR(1)
Subject Effect                   patient (59 levels)
Number of Clusters               59
Correlation Matrix Dimension     5
Maximum Cluster Size             5
Minimum Cluster Size             5
The GENMOD Procedure

Iteration History For GEE Parameter Estimates

Iter   Prm1        Prm2        Prm3        Prm4        Prm5        Prm6        Prm7
   0      0   2.0596283   0.0754051   -0.079092   -0.010885   0.0220349   -0.056035
   1      0    2.044588   0.0836068   -0.068481   -0.011834   0.0174423   -0.051091
   2      0   2.0447496   0.0835014   -0.069229   -0.011822   0.0179254    -0.05163
   3      0   2.0447376   0.0835029   -0.069209   -0.011822   0.0179171   -0.051621
   4      0   2.0447377   0.0835029   -0.069209   -0.011822   0.0179172    -0.05162

Covariance Matrix (Model-Based)

            Prm2        Prm3        Prm4        Prm5        Prm6        Prm7
Prm2     0.05623   -0.004009    -0.04661   0.0002306   -0.000429   0.0004291
Prm3   -0.004009    0.002898   -1.05E-17   -0.000293       1E-19   -1.58E-19
Prm4    -0.04661   -1.05E-17     0.09429   1.179E-18   0.0004291    0.001503
Prm5   0.0002306   -0.000293   1.179E-18   0.0000372   -9.19E-21   1.653E-20
Prm6   -0.000429       1E-19   0.0004291   -9.19E-21    0.001353   -0.001353
Prm7   0.0004291   -1.58E-19    0.001503   1.653E-20   -0.001353    0.002124
Covariance Matrix (Empirical)

            Prm2        Prm3        Prm4        Prm5        Prm6        Prm7
Prm2     0.02811   -0.001517    -0.03616   0.0001492   0.0005895   0.0001351
Prm3   -0.001517    0.002852    0.004495   -0.000303   0.0003495   -0.000852
Prm4    -0.03616    0.004495     0.07461   -0.000415   -0.000902   -0.001433
Prm5   0.0001492   -0.000303   -0.000415   0.0000334   -0.000043   0.0000923
Prm6   0.0005895   0.0003495   -0.000902   -0.000043   0.0006165   -0.000617
Prm7   0.0001351   -0.000852   -0.001433   0.0000923   -0.000617    0.001190

Working Correlation Matrix

          Col1     Col2     Col3     Col4     Col5
Row1    1.0000   0.8122   0.6597   0.5359   0.4353
Row2    0.8122   1.0000   0.8122   0.6597   0.5359
Row3    0.6597   0.8122   1.0000   0.8122   0.6597
Row4    0.5359   0.6597   0.8122   1.0000   0.8122
Row5    0.4353   0.5359   0.6597   0.8122   1.0000

Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates

                         Standard     95% Confidence
Parameter    Estimate       Error         Limits              Z    Pr>|Z|
Intercept      0.0000      0.0000     0.0000     0.0000       .     .
z              2.0447      0.1676     1.7162     2.3733   12.20    <.0001
time           0.0835      0.0534    -0.0212     0.1882    1.56    0.1179
trt           -0.0692      0.2731    -0.6046     0.4662   -0.25    0.8000
time*time     -0.0118      0.0058    -0.0232    -0.0005   -2.04    0.0409
age            0.0179      0.0248    -0.0307     0.0666    0.72    0.4705
trt*age       -0.0516      0.0345    -0.1192     0.0160   -1.50    0.1346
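The two working correlation matrices printed for these fits have simple closed forms: the exchangeable matrix has a single common off-diagonal value (0.7212 in the first fit), and the AR(1) matrix has entries equal to the lag-one correlation (0.8122) raised to the power of the time separation. The following sketch, with illustrative function names that are not part of the SAS output, rebuilds both 5 x 5 matrices from those two numbers.

```python
def exchangeable(n, rho):
    """n x n matrix with 1 on the diagonal and rho everywhere else."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def ar1(n, rho):
    """n x n matrix whose (i, j) entry is rho ** |i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    for label, mat in (("Exchangeable", exchangeable(5, 0.7212)),
                       ("AR(1)", ar1(5, 0.8122))):
        print(label)
        for row in mat:
            print("  " + "  ".join(f"{value:.4f}" for value in row))
```

The AR(1) entries reproduce the printed values 0.6597, 0.5359 and 0.4353 up to rounding of the lag-one estimate.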
Analysis Of GEE Parameter Estimates
Model-Based Standard Error Estimates

                         Standard     95% Confidence
Parameter    Estimate       Error         Limits              Z    Pr>|Z|
Intercept      0.0000      0.0000     0.0000     0.0000       .     .
z              2.0447      0.2371     1.5800     2.5095    8.62    <.0001
time           0.0835      0.0538    -0.0220     0.1890    1.55    0.1208
trt           -0.0692      0.3071    -0.6711     0.5326   -0.23    0.8217
time*time     -0.0118      0.0061    -0.0238     0.0001   -1.94    0.0526
age            0.0179      0.0368    -0.0542     0.0900    0.49    0.6262
trt*age       -0.0516      0.0461    -0.1420     0.0387   -1.12    0.2627
Scale          3.7975       .          .          .          .      .

NOTE: The scale parameter for GEE estimation was computed as
      the square root of the normalized Pearson's chi-square.

Random effects models for
discrete data
• Natural extension of models for
Gaussian continuous responses
• Now, denote random effects as $U_i$
(instead of $b_i$)

General model
• responses $y_{ij}$ have conditional
density
$$f(y_{ij} \mid U_i) = \exp\left[\{y_{ij}\theta_{ij} - \psi(\theta_{ij})\}/\phi + c(y_{ij}, \phi)\right]$$
• with conditional moments
$$\mu_{ij} = E[Y_{ij} \mid U_i] = \psi'(\theta_{ij})$$
and
$$\nu_{ij} = Var(Y_{ij} \mid U_i) = \psi''(\theta_{ij})\,\phi$$
• $h(\mu_{ij}) = x_{ij}^T \beta + d_{ij}^T U_i$ and $\nu_{ij} = \nu(\mu_{ij})\,\phi$
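As a check on the notation of the general model, the two response distributions used in these notes can be written in that exponential-family form. The derivation below is a standard textbook identification rather than something stated on the slides; it shows that differentiating $\psi$ once and twice returns the familiar conditional means and variances for the Poisson and Bernoulli cases.

```latex
% Poisson response, conditional on U_i, with log link
\theta_{ij} = \log \mu_{ij}, \qquad \psi(\theta) = e^{\theta}, \qquad \phi = 1, \qquad
c(y, \phi) = -\log(y!)
\\
\psi'(\theta_{ij}) = e^{\theta_{ij}} = \mu_{ij}, \qquad
\psi''(\theta_{ij})\,\phi = e^{\theta_{ij}} = \mu_{ij}

% Bernoulli response, conditional on U_i, with logit link
\theta_{ij} = \log\!\left(\frac{\mu_{ij}}{1 - \mu_{ij}}\right), \qquad
\psi(\theta) = \log\!\left(1 + e^{\theta}\right), \qquad \phi = 1, \qquad c(y, \phi) = 0
\\
\psi'(\theta_{ij}) = \frac{e^{\theta_{ij}}}{1 + e^{\theta_{ij}}} = \mu_{ij}, \qquad
\psi''(\theta_{ij})\,\phi = \mu_{ij}\,(1 - \mu_{ij})
```

In both cases the link $h$ is the canonical one, so $\theta_{ij}$ equals the linear predictor $x_{ij}^T\beta + d_{ij}^T U_i$.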
• The random effects $U_i$ are
independent with common
distribution F (typically
normal): this accounts for
heterogeneity across individuals
• Can make inferences about individuals
(as opposed to the population)

Example: logistic model for binary data
$$h(\mu_{ij}) = \log\left(\frac{\mu_{ij}}{1 - \mu_{ij}}\right) = \beta_0 + \beta_1 X_{ij} + U_i$$
$$\nu_{ij} = Var(Y_{ij} \mid U_i) = \mu_{ij}(1 - \mu_{ij})$$
Correlation and extra-variation are
introduced by random effects

Interpretation
• For linear links with Gaussian
responses, coefficients in marginal
and random effects models are
interpreted in the same way
• For nonlinear links, this is no longer
the case
• To obtain the marginal model from
the conditional model, the following
integral must be evaluated (for binary
data with random intercept)
$$P(Y_{ij} = 1) = \int P(Y_{ij} = 1 \mid U_i)\, dF(U_i)
= \int \frac{\exp(\beta_{0,re} + U_i + \beta_{1,re} x_{ij})}{1 + \exp(\beta_{0,re} + U_i + \beta_{1,re} x_{ij})}\, f(U_i; v^2)\, dU_i$$
• Let $\beta_m$ and $\beta_{re}$ denote the coefficients
from the marginal and random effects
models, respectively
• $|\beta_{k,m}| \le |\beta_{k,re}|$ for $k = 1, \ldots, p$
• This discrepancy increases with $Var(U_i)$
• Example: logistic regression with a random intercept, $U_i \sim N(0, v^2)$; then
$\beta_m \approx (c^2 v^2 + 1)^{-1/2} \beta_{re}$ where $c^2 \approx .346$
• Alternatives: Marginalized multi-level
models (Heagerty and Zeger, 2000,
Statistical Science; Heagerty, 2000 Biometrics):
marginal regression coefficients with random effects

Model Fitting for random effects models
• can use ML or REML (see Laird and
Ware, 1982)
• Integrate out the random effects to obtain the marginal likelihood: this usually cannot be done analytically
• Typical approaches
– Approximate likelihood [Laplace approximation]: Breslow and Clayton,
1993 JASA; Goldstein and Rasbash,
1996, JRSS-A
– Gibbs sampling: Zeger and Karim,
1991 JASA; Daniels and Gatsonis,
1999 JASA
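The relationship between marginal and random effects coefficients in a random-intercept logistic model can be checked numerically. The sketch below integrates the conditional success probability over a normal random intercept with the trapezoid rule and compares the result with the attenuation approximation that uses $c^2 \approx .346$. The constant $c = 16\sqrt{3}/(15\pi)$ is the usual source of that value and is an assumption here, as are the function names and the chosen integration grid.

```python
import math

# c^2 is about 0.346; c = 16*sqrt(3)/(15*pi) is assumed as the origin of that value
C2 = (16.0 * math.sqrt(3.0) / (15.0 * math.pi)) ** 2


def expit(x):
    """Numerically stable inverse logit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def marginal_prob(eta, v, n_steps=4000, width=8.0):
    """P(Y=1) = integral of expit(eta + u) * N(u; 0, v^2) du (trapezoid rule)."""
    if v == 0.0:
        return expit(eta)
    lo = -width * v
    h = 2.0 * width * v / n_steps
    total = 0.0
    for k in range(n_steps + 1):
        u = lo + k * h
        dens = math.exp(-0.5 * (u / v) ** 2) / (v * math.sqrt(2.0 * math.pi))
        weight = 0.5 if k in (0, n_steps) else 1.0
        total += weight * expit(eta + u) * dens
    return total * h


def approx_marginal_prob(eta, v):
    """Attenuation approximation: shrink the linear predictor by (c^2 v^2 + 1)^(-1/2)."""
    return expit(eta / math.sqrt(C2 * v * v + 1.0))


if __name__ == "__main__":
    eta = 2.0  # conditional linear predictor beta0_re + beta1_re * x (illustrative value)
    for v in (0.0, 0.5, 1.0, 2.0):
        exact = marginal_prob(eta, v)
        approx = approx_marginal_prob(eta, v)
        print(f"v={v:3.1f}  integral={exact:.4f}  approximation={approx:.4f}")
```

As the standard deviation v of the random intercept grows, the marginal probability moves toward 0.5, which is the attenuation of the marginal coefficients described in the bullets.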
/* Use GLIMMIX in SAS to fit a
   Poisson regression model with
   random subject effects to
   the epileptic seizure data
   from Thall and Vail (1990).

   This program is stored in the file
   seizglmm.sas
*/

data set1;
  infile 'c:\stat565\seizures.dat';
  input y1-y4 trt base age;
  patient = _N_;
  age=(age-29);
  y0 = base/4;
  z = 1;
run;

proc format; value trt 0 = 'Placebo'
                       1 = 'Progabide';
proc print data=set1; run;

/* Modify the data file to put
   repeated measures on different lines.
   Also create a time variable. */

data set2; set set1;
  y = y0; time=0; xt=time; output;
  y = y1; time=2; xt=time; output;
  y = y2; time=4; xt=time; output;
  y = y3; time=6; xt=time; output;
  y = y4; time=8; xt=time; output;
run;

/* Use the GLIMMIX macro to include
   random patient effects */

%inc 'c:\stat565\sas\glmm800.sas' / nosource;
run;
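The second data step turns one line per patient into five lines per patient, one for each measurement occasion, with time running 0, 2, 4, 6, 8. For readers who do not use SAS, the sketch below mirrors that restructuring for a single record. The dictionary layout and the function name are illustrative assumptions; the example values are those shown for patient 1 in the listing of the output data set further on.

```python
def wide_to_long(record):
    """Expand one wide record (keys y0..y4) into five long rows, one per occasion."""
    rows = []
    for k in range(5):
        time = 2 * k  # occasions are two weeks apart: 0, 2, 4, 6, 8
        rows.append({
            "patient": record["patient"],
            "trt": record["trt"],
            "age": record["age"],
            "time": time,
            "xt": time,          # copy of time, as in the SAS data step
            "y": record[f"y{k}"],
        })
    return rows


if __name__ == "__main__":
    # Patient 1: y0 = base/4 = 2.75, then the four two-week counts
    patient1 = {"patient": 1, "trt": 0, "age": 2,
                "y0": 2.75, "y1": 5, "y2": 3, "y3": 3, "y4": 3}
    for row in wide_to_long(patient1):
        print(row)
```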
/* Log-linear models with fixed and
   random effects */

%glimmix(data=set2,
  stmts=%str(
    class patient;
    model y = trt time time*time age age*trt /
          solution;
    random intercept / type=un subject=patient;),
  error=poisson, link=log,
  converge=1e-8, maxit=20, out=setp
)
run;

proc print data=setp (obs=5); run;

Two-Week Seizure Rates

The Mixed Procedure

Model Information

Data Set                     WORK._DS
Dependent Variable           _z
Weight Variable              _w
Covariance Structure         Unstructured
Subject Effect               patient
Estimation Method            REML
Residual Variance Method     Profile
Fixed Effects SE Method      Model-Based
Degrees of Freedom Method    Containment

Class Level Information

Class      Levels   Values
patient        59   1 2 3 4 5 6 7 8 9 10 11 12 13
                    14 15 16 17 18 19 20 21 22 23
                    24 25 26 27 28 29 30 31 32 33
                    34 35 36 37 38 39 40 41 42 43
                    44 45 46 47 48 49 50 51 52 53
                    54 55 56 57 58 59
Parameter Search

 CovP1     CovP2    Variance    Res Log Like
0.6749    2.1112      2.1112       -355.0676

Iteration History

Iteration   Evaluations   -2 Res Log Like     Criterion
        1             1      710.13514506    0.00000000

Convergence criteria met.

Covariance Parameter Estimates

Cov Parm    Subject    Estimate
UN(1,1)     patient      0.6749
Residual                 2.1112

Fit Statistics

-2 Res Log Likelihood         710.1
AIC (smaller is better)       714.1
AICC (smaller is better)      714.2
BIC (smaller is better)       718.3

Solution for Fixed Effects

                         Standard
Effect       Estimate       Error     DF   t Value   Pr>|t|
Intercept      1.7740      0.1739     55     10.20   <.0001
trt           -0.1650      0.2282    234     -0.72   0.4706
time          0.07541     0.03732    234      2.02   0.0445
time*time    -0.01088    0.004501    234     -2.42   0.0164
age          0.009338     0.02871    234      0.33   0.7453
trt*age      -0.02939     0.03395    234     -0.87   0.3875

Type 3 Tests of Fixed Effects

              Num    Den
Effect         DF     DF   F Value   Pr > F
trt             1    234      0.52   0.4706
time            1    234      4.08   0.0445
time*time       1    234      5.85   0.0164
age             1    234      0.11   0.7453
trt*age         1    234      0.75   0.3875

GLIMMIX Model Statistics

Description                      Value
Deviance                      556.5463
Scaled Deviance               263.6195
Pearson Chi-Square            507.7219
Scaled Pearson Chi-Square     240.4929
Extra-Dispersion Scale          2.1112
Obs     _y   _offset   _wght   _orig      y   time   age   trt   patient
  1   2.75         0       1       y   2.75      0     2     0         1
  2   5.00         0       1       y   5.00      2     2     0         1
  3   3.00         0       1       y   3.00      4     2     0         1
  4   3.00         0       1       y   3.00      6     2     0         1
  5   3.00         0       1       y   3.00      8     2     0         1

                  StdErr
Obs      Pred       Pred    DF   Alpha     Lower     Upper      Resid
  1   1.26022    0.31910   234    0.05   0.63153   1.88890   -0.22012
  2   1.36749    0.31521   234    0.05   0.74648   1.98850    0.27373
  3   1.38768    0.31591   234    0.05   0.76529   2.01008   -0.25104
  4   1.32080    0.31528   234    0.05   0.69965   1.94195   -0.19923
  5   1.16684    0.31966   234    0.05   0.53706   1.79661   -0.06595

Obs        mu       dmu       eta   stderreta   lowereta   uppereta
  1   3.52619   3.52619   1.26022      0.3191    0.63153    1.88890
  2   3.92548   3.92548   1.36749      0.3152    0.74648    1.98850
  3   4.00556   4.00556   1.38768      0.3159    0.76529    2.01008
  4   3.74641   3.74641   1.32080      0.3153    0.69965    1.94195
  5   3.21182   3.21182   1.16684      0.3197    0.53706    1.79661

Obs   stderrmu   lowermu   uppermu       var     resraw     reschi      deta        _w        _z
  1    1.12522   1.88049   6.61211   3.52619   -0.77619   -0.28448   0.28359   3.52619   1.04010
  2    1.23735   2.10956   7.30457   3.92548    1.07452    0.37325   0.25475   3.92548   1.64122
  3    1.26541   2.14961   7.46391   4.00556   -1.00556   -0.34579   0.24965   4.00556   1.13664
  4    1.18117   2.01305   6.97232   3.74641   -0.74641   -0.26540   0.26692   3.74641   1.12156
  5    1.02668   1.71098   6.02917   3.21182   -0.21182   -0.08134   0.31135   3.21182   1.10089

/* This program uses features in the
   SAS GLIMMIX macro to fit logistic
   regression models to nest predation
   data from a split-plot experiment.
   The code is stored in the file
   nestglmm.sas
*/
data set1;
  infile 'c:\stat565\nestall.dat';
  input wshed $ loc $ round roadside $
        roadtype $ adjhab $ rdepth rwidth
        foreback percip mtemp ftotal floss
        btotal bloss ctotal closs;
  if(roadtype='grav') then x1=1; else x1=0;
  if(adjhab='nrc') then x2=1; else x2=0;
  if(roadside='nfen') then x3=1; else x3=0;
  if(roadside='wd') then x4=1; else x4=0;
  x5=x1*x2; x6=x2*x4; x7=x1*x3;
  slope=0; loss=floss; total=ftotal; output;
  slope=1; loss=bloss; total=btotal; output;
  keep wshed loc x1 x2 x3 x4 x5 x6 x7
       slope loss total;
run;

/* Fit a model using GLIMMIX to include
   random location effects */

%inc 'c:\stat565\sas\glmm800.sas' / nosource;
run;
/* Logistic regression with random
   effects for transect and sub-transects */

%glimmix(data=set1,
  stmts=%str(
    class loc ;
    model loss/total = x1 x2 x3 x4
          x5 x6 x7 slope /
          solution ddfm=kr;
    random intercept slope / subject=loc g gcorr;),
  error=binomial, link=logit,
  converge=1e-8, maxit=20, out=setp
)
run;

proc print data=setp (obs=5); run;

Model Information

Data Set                     WORK._DS
Dependent Variable           _z
Weight Variable              _w
Covariance Structure         Variance Components
Subject Effect               loc
Estimation Method            REML
Residual Variance Method     Profile
Fixed Effects SE Method      Prasad-Rao-Jeske-Kackar-Harville
Degrees of Freedom Method    Kenward-Roger

Class Level Information

Class   Levels   Values
loc        136   1 10 100 101 102 103 104 105
                 106 107 108 109 11 110 111 112
                 . . . . . . . . . .
                 92 93 94 95 96 97 98 99

Dimensions

Covariance Parameters          3
Columns in X                   9
Columns in Z Per Subject       2
Subjects                     136
Max Obs Per Subject            2
Observations Used            272
Observations Not Used          0
Total Observations           272
Parameter Search

 CovP1     CovP2     CovP3   Variance   Res Log Like   -2 Res Log Like
0.9868    1.2259    0.7644     0.7644      -521.3338         1042.6675

Iteration History

Iteration   Evaluations   -2 Res Log Like     Criterion
        1             1     1042.66752512    0.00000000

Convergence criteria met.

Estimated G Matrix

Row   Effect       loc      Col1      Col2
  1   Intercept    1      0.9868
  2   slope        1                1.2259

Covariance Parameter Estimates

Cov Parm     Subject    Estimate
Intercept    loc          0.9868
slope        loc          1.2259
Residual                  0.7644

Fit Statistics

-2 Res Log Likelihood        1042.7
AIC (smaller is better)      1048.7
AICC (smaller is better)     1048.8
BIC (smaller is better)      1057.4

Solution for Fixed Effects

                         Standard
Effect       Estimate       Error     DF   t Value   Pr>|t|
Intercept     -1.7023      0.3948    125     -4.31   <.0001
x1            0.04349      0.4363    113      0.10   0.9208
x2            -0.9033      0.6029    140     -1.50   0.1363
x3             0.1095      0.6688    119      0.16   0.8702
x4            -0.3174      0.4968    126     -0.64   0.5241
x5             0.4538      0.6623    133      0.69   0.4944
x6             1.5878      0.6449    115      2.46   0.0153
x7            -1.0348      0.8064    135     -1.28   0.2016
slope          0.7507      0.1664    144      4.51   <.0001

GLIMMIX Model Statistics

Description                      Value
Deviance                      163.1343
Scaled Deviance               213.4204
Pearson Chi-Square            107.3120
Scaled Pearson Chi-Square     140.3909
Extra-Dispersion Scale          0.7644

Obs    _y   _offset   _wght   _orig   loss   total   x1   x2   x3   x4   x5
  1   0.0         0       5       y      0       5    1    1    1    0    1
  2   0.0         0       5       y      0       5    1    1    1    0    1
  3   0.2         0       5       y      1       5    1    0    1    0    0
  4   0.5         0       4       y      2       4    1    0    1    0    0
  5   0.2         0       5       y      1       5    1    0    0    0    0

                                            StdErr
Obs   x6   x7   slope   loc       Pred        Pred        DF   Alpha      Lower
  1    0    1       0     1   -2.83406     0.50758   263.000    0.05   -3.83350
  2    0    1       1     1   -2.04892     0.50026   263.000    0.05   -3.03393
  3    0    1       0     2   -2.46998     0.44506   263.000    0.05   -3.34632
  4    0    1       1     2   -1.68485     0.43891   263.000    0.05   -2.54907
  5    0    0       0     3   -1.51964     0.18972   257.500    0.05   -1.89324

Obs      Upper      Resid        mu       dmu        eta   stderreta
  1   -1.83461   -1.05877   0.05551   0.05243   -2.83406     0.50758
  2   -1.06390   -1.12887   0.11416   0.10113   -2.04892     0.50026
  3   -1.59365    1.69678   0.07799   0.07191   -2.46998     0.44506
  4   -0.82062    2.60307   0.15645   0.13198   -1.68485     0.43891
  5   -1.14603    0.13908   0.17951   0.14729   -1.51964     0.18972

Obs   stderrmu   lowermu   uppermu   lowereta   uppereta       var     resraw
  1   0.026612   0.02118   0.13769   -3.83350   -1.83461   0.05243   -0.05551
  2   0.050590   0.04592   0.25656   -3.03393   -1.06390   0.10113   -0.11416
  3   0.032003   0.03402   0.16887   -3.34632   -1.59365   0.07191    0.12201
  4   0.057926   0.07249   0.30563   -2.54907   -0.82062   0.13198    0.34355
  5   0.027944   0.13088   0.24121   -1.89324   -1.14603   0.14729    0.02049
/*
Logistic regression with heterogeneous
extra-binomial variance adjustments
for the foreslope and backslope*/
%glimmix(data=set1,
stmts=%str(
class loc ;
model loss/total = x1 x2 x3 x4
x5 x6 x7 slope /
solution ddfm=kr;
repeated / type=un subject=loc r rcorr; ),
error=binomial, link=logit,
converge=1e-8, maxit=20, out=setp
)
run;
proc print data=setp (obs=5); run;
Model Information

Data Set                     WORK._DS
Dependent Variable           _z
Weight Variable              _w
Covariance Structure         Unstructured
Subject Effect               loc
Estimation Method            REML
Residual Variance Method     None
Fixed Effects SE Method      Prasad-Rao-Jeske-Kackar-Harville
Degrees of Freedom Method    Kenward-Roger

Class Level Information

Class   Levels   Values
loc        136   1 10 100 101 102 103 104 105
                 106 107 108 109 11 110 111 112
                 113 114 115 116 117 118 119 12
                 . . . . . . . . . .
                 92 93 94 95 96 97 98 99

Dimensions

Covariance Parameters          3
Columns in X                   9
Columns in Z                   0
Subjects                     136
Max Obs Per Subject            2
Observations Used            272
Observations Not Used          0
Total Observations           272

Parameter Search

 CovP1     CovP2     CovP3   Res Log Like   -2 Res Log Like
1.6352    0.5326    2.1321      -499.0835          998.1670

Iteration History

Iteration   Evaluations   -2 Res Log Like     Criterion
        1             1      998.16698466    0.00000000

Convergence criteria met.

Estimated R Matrix for loc 1/Weighted by _w

Row      Col1      Col2
  1    6.2378    1.4630
  2    1.4630    4.2165

Estimated R Correlation Matrix for loc 1/Weighted by _w

Row      Col1      Col2
  1    1.0000    0.2853
  2    0.2853    1.0000
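The estimated R correlation matrix is obtained from the estimated R matrix by dividing each covariance by the product of the two corresponding standard deviations. The short sketch below, with an illustrative function name, reproduces the printed foreslope/backslope correlation of 0.2853 from the printed covariances.

```python
import math


def cov_to_corr(cov):
    """Convert a covariance matrix (list of lists) to a correlation matrix."""
    n = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(n)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    # Estimated R matrix for loc 1 (weighted by _w), copied from the listing
    r_matrix = [[6.2378, 1.4630],
                [1.4630, 4.2165]]
    for row in cov_to_corr(r_matrix):
        print("  ".join(f"{value:.4f}" for value in row))
```

Applying the same function to the unweighted covariance parameter estimates UN(1,1) = 1.6352, UN(2,1) = 0.5326 and UN(2,2) = 2.1321 gives essentially the same correlation, since the weights rescale the variances but not the correlation.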
Covariance Parameter Estimates

Cov Parm    Subject    Estimate
UN(1,1)     loc          1.6352
UN(2,1)     loc          0.5326
UN(2,2)     loc          2.1321

Fit Statistics

-2 Res Log Likelihood         998.2
AIC (smaller is better)      1004.2
AICC (smaller is better)     1004.3
BIC (smaller is better)      1012.9

Solution for Fixed Effects

                         Standard
Effect       Estimate       Error     DF   t Value   Pr>|t|
Intercept     -1.5433      0.3442    191     -4.48   <.0001
x1             0.0237      0.3741    158      0.06   0.9496
x2            -1.0199      0.5694    251     -1.79   0.0745
x3             0.1615      0.5682    154      0.28   0.7766
x4            -0.2280      0.4304    181     -0.53   0.5969
x5             0.6558      0.6130    222      1.07   0.2858
x6             1.3425      0.5561    165      2.41   0.0169
x7            -1.1119      0.7288    218     -1.53   0.1285
slope          0.7851      0.1572    183      4.99   <.0001

GLIMMIX Model Statistics

Description                      Value
Deviance                      546.8712
Scaled Deviance               546.8712
Pearson Chi-Square            491.4987
Scaled Pearson Chi-Square     491.4987
Extra-Dispersion Scale          1.0000
Obs    _y   _offset   _wght   _orig   loss   total   x1   x2   x3   x4   x5
  1   0.0         0       5       y      0       5    1    1    1    0    1
  2   0.0         0       5       y      0       5    1    1    1    0    1
  3   0.2         0       5       y      1       5    1    0    1    0    0
  4   0.5         0       4       y      2       4    1    0    1    0    0
  5   0.2         0       5       y      1       5    1    0    0    0    0

                                            StdErr
Obs   x6   x7   slope   loc       Pred        Pred        DF   Alpha      Lower
  1    0    1       0     1   -2.83406     0.50758   263.000    0.05   -3.83350
  2    0    1       1     1   -2.04892     0.50026   263.000    0.05   -3.03393
  3    0    1       0     2   -2.46998     0.44506   263.000    0.05   -3.34632
  4    0    1       1     2   -1.68485     0.43891   263.000    0.05   -2.54907
  5    0    0       0     3   -1.51964     0.18972   257.500    0.05   -1.89324

Obs      Upper      Resid        mu       dmu        eta   stderreta
  1   -1.83461   -1.05877   0.05551   0.05243   -2.83406     0.50758
  2   -1.06390   -1.12887   0.11416   0.10113   -2.04892     0.50026
  3   -1.59365    1.69678   0.07799   0.07191   -2.46998     0.44506
  4   -0.82062    2.60307   0.15645   0.13198   -1.68485     0.43891
  5   -1.14603    0.13908   0.17951   0.14729   -1.51964     0.18972

Obs   stderrmu   lowermu   uppermu   lowereta   uppereta       var     resraw
  1   0.026612   0.02118   0.13769   -3.83350   -1.83461   0.05243   -0.05551
  2   0.050590   0.04592   0.25656   -3.03393   -1.06390   0.10113   -0.11416
  3   0.032003   0.03402   0.16887   -3.34632   -1.59365   0.07191    0.12201
  4   0.057926   0.07249   0.30563   -2.54907   -0.82062   0.13198    0.34355
  5   0.027944   0.13088   0.24121   -1.89324   -1.14603   0.14729    0.02049

What if you ignore correlation?
• Incorrect inferences about β's
• Inefficient estimates of β's
• IF YOU DON'T MODEL IT
CORRECTLY, AT LEAST
'ADJUST' FOR IT
Should I spend much effort on
modelling the covariance
structure?
• Situation 1: regression of y on x is the
main focus and m >> n (many groups, few observations per group): invest the majority of time modelling the mean, less
time on correlation (use robust methods)
• Situation 2: correlation is of prime interest, or m is small, or subject-specific
prediction is very important (for example, subject-specific curves): need
mean and covariance models to be approximately correct