Connections between log-linear and logistic regression models

Any log-linear model can be expressed as a logistic regression model.
- Logistic regression requires specification of a response variable.
- Log-linear models address associations among all of the variables.
Several log-linear models may correspond to the same logistic regression model.
Wisconsin Driver Data:
  Sex (S):        i = 1 'Male',     i = 2 'Female'
  Age (A):        j = 1 '16-36',    j = 2 '36-55',    j = 3 'over 55'
  Disease (D):    k = 1 'Disease',  k = 2 'Control'
  Violation (V):  ℓ = 1 'Some',     ℓ = 2 'None'
Conditional independence of Disease status
and Violation status given any combination of
sex and age categories:
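$$
\log(m_{ijk\ell}) = \lambda + \lambda^{S}_{i} + \lambda^{A}_{j} + \lambda^{D}_{k} + \lambda^{V}_{\ell}
 + \lambda^{SA}_{ij} + \lambda^{SD}_{ik} + \lambda^{SV}_{i\ell}
 + \lambda^{AD}_{jk} + \lambda^{AV}_{j\ell}
 + \lambda^{SAD}_{ijk} + \lambda^{SAV}_{ij\ell}
$$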
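One way to see what this model implies for a logistic regression on Violation is to evaluate it numerically. The sketch below assumes arbitrary random values for every parameter (no identifiability constraints are imposed, and the names `lam`, `log_m`, and `violation_log_odds` are illustrative, not from the source). It checks that the log-odds of a violation are the same for both disease levels within every sex-by-age combination.

```python
import itertools
import random

random.seed(0)
S, A, D, V = range(2), range(3), range(2), range(2)  # index sets for i, j, k, l


def draw(*dims):
    """Arbitrary illustrative values for one family of lambda terms."""
    return {idx: random.gauss(0.0, 1.0) for idx in itertools.product(*dims)}


lam0 = random.gauss(0.0, 1.0)
lam = {
    "S": draw(S), "A": draw(A), "D": draw(D), "V": draw(V),
    "SA": draw(S, A), "SD": draw(S, D), "SV": draw(S, V),
    "AD": draw(A, D), "AV": draw(A, V),
    "SAD": draw(S, A, D), "SAV": draw(S, A, V),
}


def log_m(i, j, k, l):
    """Log expected count under the conditional-independence model."""
    return (lam0 + lam["S"][(i,)] + lam["A"][(j,)] + lam["D"][(k,)] + lam["V"][(l,)]
            + lam["SA"][(i, j)] + lam["SD"][(i, k)] + lam["SV"][(i, l)]
            + lam["AD"][(j, k)] + lam["AV"][(j, l)]
            + lam["SAD"][(i, j, k)] + lam["SAV"][(i, j, l)])


def violation_log_odds(i, j, k):
    """log(m_ijk1 / m_ijk2), with 0-based index 0 = 'Some' and 1 = 'None'."""
    return log_m(i, j, k, 0) - log_m(i, j, k, 1)


# Largest difference in violation log-odds between the two disease levels.
max_gap = max(abs(violation_log_odds(i, j, 0) - violation_log_odds(i, j, 1))
              for i in S for j in A)
print(max_gap)  # zero up to floating-point rounding
```

Because no term in the model contains both D and V, every Disease term drops out of the difference, which is the conditional independence of Disease and Violation given sex and age.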
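Conditional log-odds of a traffic violation:

$$
\begin{aligned}
\log\left(\frac{m_{ijk1}}{m_{ijk2}}\right)
 &= \log(m_{ijk1}) - \log(m_{ijk2}) \\
 &= \left[\lambda + \lambda^{S}_{i} + \lambda^{A}_{j} + \lambda^{D}_{k} + \lambda^{V}_{1}
      + \lambda^{SA}_{ij} + \lambda^{SD}_{ik} + \lambda^{SV}_{i1}
      + \lambda^{AD}_{jk} + \lambda^{AV}_{j1} + \lambda^{SAD}_{ijk} + \lambda^{SAV}_{ij1}\right] \\
 &\quad - \left[\lambda + \lambda^{S}_{i} + \lambda^{A}_{j} + \lambda^{D}_{k} + \lambda^{V}_{2}
      + \lambda^{SA}_{ij} + \lambda^{SD}_{ik} + \lambda^{SV}_{i2}
      + \lambda^{AD}_{jk} + \lambda^{AV}_{j2} + \lambda^{SAD}_{ijk} + \lambda^{SAV}_{ij2}\right] \\
 &= \left(\lambda^{V}_{1} - \lambda^{V}_{2}\right)
  + \left(\lambda^{SV}_{i1} - \lambda^{SV}_{i2}\right)
  + \left(\lambda^{AV}_{j1} - \lambda^{AV}_{j2}\right)
  + \left(\lambda^{SAV}_{ij1} - \lambda^{SAV}_{ij2}\right)
\end{aligned}
$$

Every term that does not involve V cancels, so the log-odds of a violation depend on sex and age but not on disease status.

Polychotomous Logistic Regression

There are J > 2 response categories:

$$
\pi_{ji} = \Pr\left\{ \text{response is in the } j\text{-th category} \mid X_{1i}, X_{2i}, \ldots, X_{ki} \right\}
\quad \text{for } j = 1, 2, \ldots, J
$$

Note that

$$
\sum_{j=1}^{J} \pi_{ji} = 1
$$

for each i = 1, 2, ..., n.
There are several ways to construct "simultaneous" logistic regression models for such data.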
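Log-odds with respect to a baseline category (e.g., the last category):

$$
\begin{aligned}
\log\left(\frac{\pi_{1i}}{\pi_{Ji}}\right) &= \beta_{01} + \beta_{11}X_{1i} + \beta_{21}X_{2i} + \cdots + \beta_{k1}X_{ki} \\
\log\left(\frac{\pi_{2i}}{\pi_{Ji}}\right) &= \beta_{02} + \beta_{12}X_{1i} + \beta_{22}X_{2i} + \cdots + \beta_{k2}X_{ki} \\
 &\;\;\vdots \\
\log\left(\frac{\pi_{J-1,i}}{\pi_{Ji}}\right) &= \beta_{0,J-1} + \beta_{1,J-1}X_{1i} + \cdots + \beta_{k,J-1}X_{ki}
\end{aligned}
$$

Then

$$
\pi_{ji} = \frac{\exp(\beta_{0j} + \beta_{1j}X_{1i} + \cdots + \beta_{kj}X_{ki})}
                {1 + \sum_{\ell=1}^{J-1} \exp(\beta_{0\ell} + \beta_{1\ell}X_{1i} + \cdots + \beta_{k\ell}X_{ki})}
$$

for j = 1, 2, ..., J - 1, and

$$
\pi_{Ji} = \frac{1}{1 + \sum_{j=1}^{J-1} \exp(\beta_{0j} + \beta_{1j}X_{1i} + \cdots + \beta_{kj}X_{ki})}
$$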
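The baseline-category formulas can be turned into a few lines of code. This is a minimal sketch, not tied to any package: the function name `baseline_category_probs` and the coefficient values are made up for illustration, with the last category taken as the baseline.

```python
import math


def baseline_category_probs(x, betas):
    """Return [pi_1, ..., pi_J] for one observation.

    x     : covariate values [X_1, ..., X_k]
    betas : J-1 coefficient lists [b0, b1, ..., bk], one for each
            non-baseline category (category J is the baseline)
    """
    etas = [b[0] + sum(bj * xj for bj, xj in zip(b[1:], x)) for b in betas]
    denom = 1.0 + sum(math.exp(eta) for eta in etas)
    probs = [math.exp(eta) / denom for eta in etas]
    probs.append(1.0 / denom)  # baseline category J
    return probs


# Illustrative (hypothetical) coefficients for J = 3 categories, k = 2 covariates.
example_betas = [[0.5, -1.0, 0.2], [-0.3, 0.4, 0.1]]
example_x = [1.5, -2.0]
example_probs = baseline_category_probs(example_x, example_betas)
print(example_probs, sum(example_probs))
```

The probabilities sum to one by construction, and taking log(pi_j / pi_J) of the output returns the j-th linear predictor.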
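PROC CATMOD in SAS uses this baseline-category model when the LOGISTIC response is applied to a variable with more than two categories.
It can be used for nominal response variables.

Other log-odds:

$$
\begin{aligned}
\log\left(\frac{\pi_{ji}}{\pi_{\ell i}}\right)
 &= \log\left(\frac{\pi_{ji}/\pi_{Ji}}{\pi_{\ell i}/\pi_{Ji}}\right) \\
 &= \log\left(\frac{\pi_{ji}}{\pi_{Ji}}\right) - \log\left(\frac{\pi_{\ell i}}{\pi_{Ji}}\right) \\
 &= \left[\beta_{0j} + \beta_{1j}X_{1i} + \cdots + \beta_{kj}X_{ki}\right]
  - \left[\beta_{0\ell} + \beta_{1\ell}X_{1i} + \cdots + \beta_{k\ell}X_{ki}\right] \\
 &= (\beta_{0j} - \beta_{0\ell}) + (\beta_{1j} - \beta_{1\ell})X_{1i} + \cdots + (\beta_{kj} - \beta_{k\ell})X_{ki}
\end{aligned}
$$

Adjacent-categories logits:

$$
\begin{aligned}
\log\left(\frac{\pi_{1i}}{\pi_{2i}}\right) &= \beta_{01} + \beta_{11}X_{1i} + \cdots + \beta_{k1}X_{ki} \\
\log\left(\frac{\pi_{2i}}{\pi_{3i}}\right) &= \beta_{02} + \beta_{12}X_{1i} + \cdots + \beta_{k2}X_{ki} \\
 &\;\;\vdots \\
\log\left(\frac{\pi_{J-1,i}}{\pi_{J,i}}\right) &= \beta_{0,J-1} + \beta_{1,J-1}X_{1i} + \cdots + \beta_{k,J-1}X_{ki}
\end{aligned}
$$

- equivalent to the "baseline-category" logistic model
- sometimes used when the polychotomous response is an ordinal variable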
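The equivalence between the adjacent-categories and baseline-category models comes down to differencing coefficients. The sketch below assumes illustrative baseline-category coefficients (all names and values are hypothetical) and checks that the differenced coefficients reproduce log(pi_j / pi_{j+1}).

```python
import math


def linear_predictor(coefs, x):
    """coefs = [b0, b1, ..., bk]; x = [X_1, ..., X_k]."""
    return coefs[0] + sum(b * xj for b, xj in zip(coefs[1:], x))


def baseline_probs(x, betas):
    """Baseline-category probabilities with category J as the baseline."""
    odds = [math.exp(linear_predictor(b, x)) for b in betas]
    denom = 1.0 + sum(odds)
    return [o / denom for o in odds] + [1.0 / denom]


def adjacent_coefs(betas):
    """Coefficients of log(pi_j / pi_{j+1}) implied by baseline-category betas."""
    # Category J versus itself has all-zero coefficients.
    padded = betas + [[0.0] * len(betas[0])]
    return [[a - b for a, b in zip(padded[j], padded[j + 1])]
            for j in range(len(betas))]


betas = [[0.5, -1.0, 0.2], [-0.3, 0.4, 0.1]]  # illustrative values, J = 3
x = [1.5, -2.0]
probs = baseline_probs(x, betas)
adjacent = adjacent_coefs(betas)

# Difference between each adjacent-category logit and its linear predictor.
gaps = [abs(math.log(probs[j] / probs[j + 1]) - linear_predictor(adjacent[j], x))
        for j in range(len(betas))]
print(adjacent, gaps)
```

The last adjacent-categories logit compares category J - 1 with category J, so its coefficients coincide with the last set of baseline-category coefficients.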
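Cumulative logits:

$$
\begin{aligned}
\log\left(\frac{\pi_{1i}}{\pi_{2i} + \pi_{3i} + \cdots + \pi_{Ji}}\right) &= \beta_{01} + \beta_{11}X_{1i} + \cdots + \beta_{k1}X_{ki} \\
\log\left(\frac{\pi_{1i} + \pi_{2i}}{\pi_{3i} + \cdots + \pi_{Ji}}\right) &= \beta_{02} + \beta_{12}X_{1i} + \cdots + \beta_{k2}X_{ki} \\
 &\;\;\vdots \\
\log\left(\frac{\pi_{1i} + \pi_{2i} + \cdots + \pi_{J-1,i}}{\pi_{Ji}}\right) &= \beta_{0,J-1} + \beta_{1,J-1}X_{1i} + \cdots + \beta_{k,J-1}X_{ki}
\end{aligned}
$$

- used for ordinal response variables
- does not yield the same estimates of $\{\pi_{ji}\}$ as the "baseline-category" logit model
- $\log(\pi_{1i}/\pi_{2i})$, for example, is generally not a linear function of the parameters

Example: (3 response categories)

$$
\begin{aligned}
\log\left(\frac{\pi_{1i}}{\pi_{2i} + \pi_{3i}}\right) &= \beta_{01} + \beta_{11}X_{1i} + \beta_{21}X_{2i} = X_i'\beta_1 \\
\log\left(\frac{\pi_{1i} + \pi_{2i}}{\pi_{3i}}\right) &= \beta_{02} + \beta_{12}X_{1i} + \beta_{22}X_{2i} = X_i'\beta_2
\end{aligned}
$$

Then

$$
\begin{aligned}
\pi_{1i} &= (\pi_{2i} + \pi_{3i})\, e^{X_i'\beta_1} \\
\pi_{1i} + \pi_{2i} &= \pi_{3i}\, e^{X_i'\beta_2} \\
\pi_{1i} + \pi_{2i} + \pi_{3i} &= 1
\end{aligned}
$$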
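For the three-category case, the probabilities can be recovered from the two linear predictors by differencing cumulative probabilities. The following is a small sketch under assumed values: `eta1` and `eta2` stand for $X_i'\beta_1$ and $X_i'\beta_2$, and the numbers are invented for illustration.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def cumulative_logit_probs(eta1, eta2):
    """Three-category probabilities from the two cumulative logits.

    eta1 = log(pi_1 / (pi_2 + pi_3)),  eta2 = log((pi_1 + pi_2) / pi_3).
    Requires eta2 > eta1, otherwise pi_2 would not be positive.
    """
    if eta2 <= eta1:
        raise ValueError("need eta2 > eta1 for a valid set of probabilities")
    c1 = expit(eta1)  # pi_1
    c2 = expit(eta2)  # pi_1 + pi_2
    return c1, c2 - c1, 1.0 - c2


eta1, eta2 = -0.4, 1.1  # illustrative values only
p1, p2, p3 = cumulative_logit_probs(eta1, eta2)

# log(pi_1 / pi_2) written in terms of the two linear predictors.
closed_form = math.log(math.exp(eta1) * (1.0 + math.exp(eta2))
                       / (math.exp(eta2) - math.exp(eta1)))
print(p1, p2, p3, math.log(p1 / p2), closed_form)
```

The guard on `eta2 > eta1` reflects a practical constraint of cumulative-logit models with separate slopes: the fitted cumulative probabilities must stay ordered for every covariate value of interest.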
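Solving the three-category cumulative-logit example for the probabilities gives

$$
\begin{aligned}
\pi_{1i} &= \frac{e^{X_i'\beta_1}}{1 + e^{X_i'\beta_1}} \\
\pi_{2i} &= \frac{e^{X_i'\beta_2} - e^{X_i'\beta_1}}{\left(1 + e^{X_i'\beta_1}\right)\left(1 + e^{X_i'\beta_2}\right)} \\
\pi_{3i} &= \frac{1}{1 + e^{X_i'\beta_2}}
\end{aligned}
$$

Consequently,

$$
\log\left(\frac{\pi_{1i}}{\pi_{2i}}\right)
 = \log\left[\frac{e^{X_i'\beta_1}\left(1 + e^{X_i'\beta_2}\right)}{e^{X_i'\beta_2} - e^{X_i'\beta_1}}\right]
$$

Continuation-ratio logits:
Fit J - 1 separate logistic regression models:

$$
\begin{aligned}
\log\left(\frac{\pi_{1i}}{\pi_{2i} + \pi_{3i} + \cdots + \pi_{Ji}}\right) &= \beta_{01} + \beta_{11}X_{1i} + \cdots + \beta_{k1}X_{ki} \\
\log\left(\frac{\pi_{2i}}{\pi_{3i} + \pi_{4i} + \cdots + \pi_{Ji}}\right) &= \beta_{02} + \beta_{12}X_{1i} + \cdots + \beta_{k2}X_{ki} \\
 &\;\;\vdots \\
\log\left(\frac{\pi_{J-1,i}}{\pi_{Ji}}\right) &= \beta_{0,J-1} + \beta_{1,J-1}X_{1i} + \cdots + \beta_{k,J-1}X_{ki}
\end{aligned}
$$

- used for ordinal response variables
- $\log\left(\dfrac{\pi_{ji}}{\pi_{j+1,i} + \cdots + \pi_{Ji}}\right)$ is the conditional log-odds that a response falls in the j-th category given that it does not fall in a category preceding the j-th category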
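Because each continuation-ratio logit is a conditional log-odds, category probabilities can be rebuilt sequentially from the J - 1 linear predictors. This is a minimal sketch with made-up predictor values; `continuation_ratio_probs` is a hypothetical helper name.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def continuation_ratio_probs(etas):
    """Category probabilities from continuation-ratio logits.

    etas[j] is the log-odds of falling in category j+1 (0-based j)
    given that the response is not in an earlier category.
    """
    probs = []
    remaining = 1.0  # probability of not being in any earlier category
    for eta in etas:
        p = remaining * expit(eta)
        probs.append(p)
        remaining -= p
    probs.append(remaining)  # last category takes what is left
    return probs


etas = [-1.0, 0.5, 2.0]  # illustrative values for J = 4 categories
cr_probs = continuation_ratio_probs(etas)

# Recover each logit from the probabilities as a check.
recovered = [math.log(cr_probs[j] / sum(cr_probs[j + 1:])) for j in range(len(etas))]
print(cr_probs, recovered)
```

Unlike the cumulative-logit case, any set of real-valued predictors yields valid probabilities here, which is one reason the J - 1 models can be fitted separately.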
Example: Toxicity study (Price, 1987)
(from Agresti, pp. 320-321)
Administered diethylene glycol dimethyl ether (DIEGdiME) to pregnant mice
Each mouse was exposed to one of 5 concentrations for exactly 10 days early in the pregnancy
- level 0 is a control group
- each fetus was classified as
  (1) non-live
  (2) malformed
  (3) normal

Concentration        Response rate (%)           Number
(mg/kg/day)     non-live  malformed  normal      exposed
______________________________________________________
    0             5.05      0.34     94.61         297
   62.5           7.02      0        92.98         242
  125             7.05      2.24     90.71         312
  250            12.71     19.73     67.56         299
  500            50.53     46.32      3.16         285
______________________________________________________
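Baseline-logit model (linear in concentration; standard errors in parentheses):

$$
\log\left(\frac{\hat\pi_{\text{non-live}}}{\hat\pi_{\text{normal}}}\right)
 = \underset{(.191)}{-3.969} + \underset{(.0007)}{.0119}\,(\text{conc.})
$$

$$
\log\left(\frac{\hat\pi_{\text{malformed}}}{\hat\pi_{\text{normal}}}\right)
 = \underset{(.249)}{-4.952} + \underset{(.0008)}{.014}\,(\text{conc.})
$$

Lack-of-fit test:
G² = 57.50 with 6 d.f. (p-value < .0001)

Baseline-logit model (quadratic in concentration; standard errors in parentheses):

$$
\log\left(\frac{\hat\pi_{\text{non-live}}}{\hat\pi_{\text{normal}}}\right)
 = \underset{(.2009)}{-2.7824} - \underset{(.00208)}{.00168}\,(\text{conc.}) + \underset{(.0000042)}{.000025}\,(\text{conc.})^2
$$

$$
\log\left(\frac{\hat\pi_{\text{malformed}}}{\hat\pi_{\text{normal}}}\right)
 = \underset{(.7724)}{-6.7156} + \underset{(.00512)}{.0252}\,(\text{conc.}) - \underset{(.0000077)}{.00001}\,(\text{conc.})^2
$$

Lack-of-fit test:
G² = 4.51 with 4 d.f. (p-value = .341)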
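To compare the two baseline-logit fits with the observed response rates, the fitted logits can be converted back to probabilities. The sketch below copies the reported coefficients as printed; because they are rounded (especially the quadratic terms), the fitted values are only approximate. The names `LINEAR`, `QUADRATIC`, and `fitted_probs` are illustrative.

```python
import math

# (intercept, linear, quadratic) coefficients as reported; baseline = normal.
LINEAR = {"non-live": (-3.969, 0.0119, 0.0),
          "malformed": (-4.952, 0.014, 0.0)}
QUADRATIC = {"non-live": (-2.7824, -0.00168, 0.000025),
             "malformed": (-6.7156, 0.0252, -0.00001)}


def fitted_probs(model, conc):
    """Fitted probabilities of the three outcomes at one concentration."""
    odds = {cat: math.exp(b0 + b1 * conc + b2 * conc ** 2)
            for cat, (b0, b1, b2) in model.items()}
    denom = 1.0 + sum(odds.values())
    probs = {cat: o / denom for cat, o in odds.items()}
    probs["normal"] = 1.0 / denom
    return probs


for conc in (0, 62.5, 125, 250, 500):
    lin = fitted_probs(LINEAR, conc)
    quad = fitted_probs(QUADRATIC, conc)
    print(conc,
          {cat: round(100 * p, 2) for cat, p in lin.items()},
          {cat: round(100 * p, 2) for cat, p in quad.items()})
```

Printing the fitted percentages next to the observed table rows gives an informal view of the difference in fit that the two lack-of-fit tests summarize.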
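Continuation-ratio logit model (standard errors in parentheses):

$$
\log\left(\frac{\hat\pi_{\text{non-live}}}{\hat\pi_{\text{malformed}} + \hat\pi_{\text{normal}}}\right)
 = \underset{(.1577)}{-3.2479} + \underset{(.000435)}{.00639}\,(\text{conc.})
$$

$G^2_1 = 5.78$ on 3 d.f. (p-value = .123)

$$
\log\left(\frac{\hat\pi_{\text{malformed}}}{\hat\pi_{\text{normal}}}\right)
 = \underset{(.3322)}{-5.7019} + \underset{(.00123)}{.0174}\,(\text{conc.})
$$

$G^2_2 = 6.06$ on 3 d.f. (p-value = .109)

Overall lack-of-fit test:
$G^2 = G^2_1 + G^2_2 = 11.84$ on 6 d.f. (p-value = 0.066)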
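The lack-of-fit p-values for the continuation-ratio fit can be checked without a statistics package. The sketch below uses the closed-form upper-tail probability of a chi-square distribution with integer degrees of freedom; `chi2_sf` is a hand-written helper, not a library function.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(chi-square with integer df > x), for x >= 0."""
    half = x / 2.0
    if df % 2 == 0:
        # exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{i>=1} (x/2)^(i - 1/2) / Gamma(i + 1/2)
    total = math.erfc(math.sqrt(half))
    term = math.exp(-half) * math.sqrt(half) / math.gamma(1.5)
    for i in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (i + 0.5)
    return total


g1, g2 = 5.78, 6.06          # reported G^2 for the two separate logit models
g_overall = g1 + g2          # the two statistics add, as do their d.f.

p1 = chi2_sf(g1, 3)
p2 = chi2_sf(g2, 3)
p_overall = chi2_sf(g_overall, 6)
print(round(p1, 3), round(p2, 3), round(p_overall, 3))
```

The statistics add because the two logit models are fitted separately to a factorization of the multinomial likelihood, so the overall test simply pools both the deviances and the degrees of freedom.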
exp(.00639 × 100) = 1.9
- the odds of "non-live" increase by a factor of 1.9 for each 100 mg/kg/day increase in concentration (approximate 95% confidence interval: 1.73, 2.07)
exp(0.0174 × 100) = 5.7
- given that a fetus survives, the odds of malformation increase by a factor of 5.7 for every 100 mg/kg/day increase in concentration (4.48, 7.25)
exp(-3.2479) = .039
- odds that a fetus fails to survive at zero concentration (.028, .053)
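These odds factors and their intervals follow from the reported coefficients and standard errors. The sketch below assumes the intervals are Wald intervals, exp(scale × (estimate ± 1.96 × SE)), which reproduces the reported limits to within rounding; `odds_factor` is an illustrative helper name.

```python
import math

Z = 1.959964  # 0.975 quantile of the standard normal (assumed 95% Wald interval)


def odds_factor(coef, se, scale=1.0):
    """Return (estimate, lower, upper) for exp(scale * coef)."""
    estimate = math.exp(scale * coef)
    lower = math.exp(scale * (coef - Z * se))
    upper = math.exp(scale * (coef + Z * se))
    return estimate, lower, upper


# Coefficients and standard errors from the continuation-ratio fit.
non_live_per_100 = odds_factor(0.00639, 0.000435, scale=100)
malformed_per_100 = odds_factor(0.0174, 0.00123, scale=100)
non_live_at_zero = odds_factor(-3.2479, 0.1577)

for label, result in [("non-live, per 100 mg/kg/day", non_live_per_100),
                      ("malformed | live, per 100 mg/kg/day", malformed_per_100),
                      ("odds of non-live at zero concentration", non_live_at_zero)]:
    print(label, [round(v, 3) for v in result])
```

Small differences in the last digit relative to the reported limits are expected, since the printed coefficients and standard errors are themselves rounded.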
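PROC LOGISTIC and PROC GENMOD in SAS fit a special form of the "cumulative-logit" model:
- Walker-Duncan model (Biometrika, 1967, pp. 167-179)
- this model has a "proportional odds" constraint: the slopes are the same in every logit, and only the intercepts differ

$$
\begin{aligned}
\log\left(\frac{\pi_{1i}}{1 - \pi_{1i}}\right) &= \alpha_1 + \beta_1 X_{1i} + \cdots + \beta_k X_{ki} \\
\log\left(\frac{\pi_{1i} + \pi_{2i}}{1 - \pi_{1i} - \pi_{2i}}\right) &= \alpha_2 + \beta_1 X_{1i} + \cdots + \beta_k X_{ki} \\
 &\;\;\vdots \\
\log\left(\frac{\pi_{1i} + \cdots + \pi_{J-1,i}}{\pi_{J,i}}\right) &= \alpha_{J-1} + \beta_1 X_{1i} + \cdots + \beta_k X_{ki}
\end{aligned}
$$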
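For the toxicity data (standard errors in parentheses):

$$
\log\left(\frac{\hat\pi_{\text{non-live}}}{\hat\pi_{\text{malform}} + \hat\pi_{\text{normal}}}\right)
 = \underset{(.1783)}{-4.5311} + \underset{(.00044)}{.00962}\,(\text{conc})
$$

$$
\log\left(\frac{\hat\pi_{\text{non-live}} + \hat\pi_{\text{malform}}}{\hat\pi_{\text{normal}}}\right)
 = \underset{(.1381)}{-3.1533} + \underset{(.00044)}{.00962}\,(\text{conc})
$$

Score test for the "proportional odds" hypothesis:
X² = 267.62 on 1 d.f. (p-value < .0001)

$$
\begin{aligned}
G^2 &= G^2_{\text{Walker-Duncan}} - G^2_{\text{cumulative logit model}} \\
    &= (1640.41) - (1029.54 + 431.28) \\
    &= 179.6 \text{ on 1 d.f.}
\end{aligned}
$$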
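A quick way to see why the proportional odds hypothesis is rejected here is to convert the fitted Walker-Duncan model back to category probabilities and set them beside the observed rates. The sketch below copies the reported estimates and observed percentages; the helper and constant names are illustrative.

```python
import math

ALPHA_NON_LIVE = -4.5311    # intercept: log-odds of non-live
ALPHA_NOT_NORMAL = -3.1533  # intercept: log-odds of non-live or malformed
SLOPE = 0.00962             # common slope (proportional odds constraint)

# Observed response rates (%) from the toxicity table: non-live, malformed, normal.
OBSERVED_PCT = {0: (5.05, 0.34, 94.61), 62.5: (7.02, 0.0, 92.98),
                125: (7.05, 2.24, 90.71), 250: (12.71, 19.73, 67.56),
                500: (50.53, 46.32, 3.16)}


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def proportional_odds_probs(conc):
    """Fitted (non-live, malformed, normal) probabilities at one concentration."""
    c1 = expit(ALPHA_NON_LIVE + SLOPE * conc)    # P(non-live)
    c2 = expit(ALPHA_NOT_NORMAL + SLOPE * conc)  # P(non-live or malformed)
    return c1, c2 - c1, 1.0 - c2


for conc, observed in OBSERVED_PCT.items():
    fitted_pct = [round(100 * p, 2) for p in proportional_odds_probs(conc)]
    print(conc, "fitted:", fitted_pct, "observed:", list(observed))

# Likelihood ratio comparison reported above.
g2_difference = 1640.41 - (1029.54 + 431.28)
print(round(g2_difference, 2))
```

At 500 mg/kg/day the fitted probability of a normal fetus is roughly 16%, well above the observed 3.16%, which is consistent with the large score and likelihood ratio statistics.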