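Connections between log-linear and logistic regression models

- Any log-linear model can be expressed as a logistic regression model
  - Logistic regression requires specification of a response variable
  - Log-linear models address association among all of the variables
- Several log-linear models may correspond to the same logistic regression model

Wisconsin Driver Data:

  Sex (S):        (i=1) 'Male'      (i=2) 'Female'
  Age (A):        (j=1) '16-36'     (j=2) '36-55'     (j=3) 'over 55'
  Disease (D):    (k=1) 'Disease'   (k=2) 'Control'
  Violation (V):  (l=1) 'Some'      (l=2) 'None'

Conditional independence of Disease status and Violation status given any combination of sex and age categories:

\[
\log(m_{ijk\ell}) = \lambda + \lambda^{S}_{i} + \lambda^{A}_{j} + \lambda^{D}_{k} + \lambda^{V}_{\ell}
 + \lambda^{SA}_{ij} + \lambda^{SD}_{ik} + \lambda^{SV}_{i\ell}
 + \lambda^{AD}_{jk} + \lambda^{AV}_{j\ell}
 + \lambda^{SAD}_{ijk} + \lambda^{SAV}_{ij\ell}
\]

Conditional log-odds of a traffic violation:

\[
\begin{aligned}
\log\left(\frac{m_{ijk1}}{m_{ijk2}}\right)
 &= \log(m_{ijk1}) - \log(m_{ijk2}) \\
 &= \left(\lambda + \lambda^{S}_{i} + \lambda^{A}_{j} + \lambda^{D}_{k} + \lambda^{V}_{1}
   + \lambda^{SA}_{ij} + \lambda^{SD}_{ik} + \lambda^{SV}_{i1}
   + \lambda^{AD}_{jk} + \lambda^{AV}_{j1} + \lambda^{SAD}_{ijk} + \lambda^{SAV}_{ij1}\right) \\
 &\quad - \left(\lambda + \lambda^{S}_{i} + \lambda^{A}_{j} + \lambda^{D}_{k} + \lambda^{V}_{2}
   + \lambda^{SA}_{ij} + \lambda^{SD}_{ik} + \lambda^{SV}_{i2}
   + \lambda^{AD}_{jk} + \lambda^{AV}_{j2} + \lambda^{SAD}_{ijk} + \lambda^{SAV}_{ij2}\right) \\
 &= \left(\lambda^{V}_{1} - \lambda^{V}_{2}\right)
   + \left(\lambda^{SV}_{i1} - \lambda^{SV}_{i2}\right)
   + \left(\lambda^{AV}_{j1} - \lambda^{AV}_{j2}\right)
   + \left(\lambda^{SAV}_{ij1} - \lambda^{SAV}_{ij2}\right)
\end{aligned}
\]

Every term involving D cancels, so the log-odds of a violation depend on sex and age but not on disease status: this is a logistic regression of violation on sex, age, and their interaction.

Polychotomous Logistic Regression

There are J > 2 response categories:

\[
\pi_{ji} = \Pr\left\{\text{response is in the } j\text{-th category} \mid X_{1i}, X_{2i}, \ldots, X_{ki}\right\}
\quad \text{for } j = 1, 2, \ldots, J
\]

Note that

\[
\sum_{j=1}^{J} \pi_{ji} = 1 \quad \text{for each } i = 1, 2, \ldots, n.
\]

There are several ways to construct "simultaneous" logistic regression models for such data.

Log-odds with respect to a baseline category (e.g., the last category)

\[
\begin{aligned}
\log\left(\frac{\pi_{1i}}{\pi_{Ji}}\right) &= \beta_{01} + \beta_{11}X_{1i} + \beta_{21}X_{2i} + \cdots + \beta_{k1}X_{ki} \\
\log\left(\frac{\pi_{2i}}{\pi_{Ji}}\right) &= \beta_{02} + \beta_{12}X_{1i} + \beta_{22}X_{2i} + \cdots + \beta_{k2}X_{ki} \\
 &\;\;\vdots \\
\log\left(\frac{\pi_{J-1,i}}{\pi_{Ji}}\right) &= \beta_{0,J-1} + \beta_{1,J-1}X_{1i} + \cdots + \beta_{k,J-1}X_{ki}
\end{aligned}
\]

Then

\[
\pi_{ji} = \frac{\exp(\beta_{0j} + \beta_{1j}X_{1i} + \cdots + \beta_{kj}X_{ki})}
 {1 + \sum_{\ell=1}^{J-1} \exp(\beta_{0\ell} + \beta_{1\ell}X_{1i} + \cdots + \beta_{k\ell}X_{ki})}
\quad \text{for } j = 1, 2, \ldots, J-1,
\]

and

\[
\pi_{Ji} = \frac{1}{1 + \sum_{j=1}^{J-1} \exp(\beta_{0j} + \beta_{1j}X_{1i} + \cdots + \beta_{kj}X_{ki})}
\]

- PROC CATMOD in SAS uses this when the LOGISTIC response is applied to a variable with more than two categories.
- It can be used for nominal response variables.

Other log-odds

\[
\begin{aligned}
\log\left(\frac{\pi_{ji}}{\pi_{\ell i}}\right)
 &= \log\left(\frac{\pi_{ji}/\pi_{Ji}}{\pi_{\ell i}/\pi_{Ji}}\right) \\
 &= \log\left(\frac{\pi_{ji}}{\pi_{Ji}}\right) - \log\left(\frac{\pi_{\ell i}}{\pi_{Ji}}\right) \\
 &= \left[\beta_{0j} + \beta_{1j}X_{1i} + \cdots + \beta_{kj}X_{ki}\right]
  - \left[\beta_{0\ell} + \beta_{1\ell}X_{1i} + \cdots + \beta_{k\ell}X_{ki}\right] \\
 &= (\beta_{0j} - \beta_{0\ell}) + (\beta_{1j} - \beta_{1\ell})X_{1i} + \cdots + (\beta_{kj} - \beta_{k\ell})X_{ki}
\end{aligned}
\]

Adjacent-categories logits:

\[
\begin{aligned}
\log\left(\frac{\pi_{1i}}{\pi_{2i}}\right) &= \beta_{01} + \beta_{11}X_{1i} + \cdots + \beta_{k1}X_{ki} \\
\log\left(\frac{\pi_{2i}}{\pi_{3i}}\right) &= \beta_{02} + \beta_{12}X_{1i} + \cdots + \beta_{k2}X_{ki} \\
 &\;\;\vdots \\
\log\left(\frac{\pi_{J-1,i}}{\pi_{J,i}}\right) &= \beta_{0,J-1} + \beta_{1,J-1}X_{1i} + \cdots + \beta_{k,J-1}X_{ki}
\end{aligned}
\]

- equivalent to the "baseline-category" logistic model
- sometimes used when the polychotomous response is an ordinal variable

Cumulative logits:

\[
\begin{aligned}
\log\left(\frac{\pi_{1i}}{\pi_{2i} + \pi_{3i} + \cdots + \pi_{Ji}}\right) &= \beta_{01} + \beta_{11}X_{1i} + \cdots + \beta_{k1}X_{ki} \\
\log\left(\frac{\pi_{1i} + \pi_{2i}}{\pi_{3i} + \cdots + \pi_{Ji}}\right) &= \beta_{02} + \beta_{12}X_{1i} + \cdots + \beta_{k2}X_{ki} \\
 &\;\;\vdots \\
\log\left(\frac{\pi_{1i} + \pi_{2i} + \cdots + \pi_{J-1,i}}{\pi_{Ji}}\right) &= \beta_{0,J-1} + \beta_{1,J-1}X_{1i} + \cdots + \beta_{k,J-1}X_{ki}
\end{aligned}
\]

- used for ordinal response variables
- Does not yield the same estimates of $\{\pi_{ji}\}$ as the "baseline-category" logit model. $\log(\pi_{1i}/\pi_{2i})$, for example, is generally not a linear function of the parameters.

Example: (3 response categories)

\[
\begin{aligned}
\log\left(\frac{\pi_{1i}}{\pi_{2i} + \pi_{3i}}\right) &= \beta_{01} + \beta_{11}X_{1i} + \beta_{21}X_{2i} = X_i'\beta_1 \\
\log\left(\frac{\pi_{1i} + \pi_{2i}}{\pi_{3i}}\right) &= \beta_{02} + \beta_{12}X_{1i} + \beta_{22}X_{2i} = X_i'\beta_2
\end{aligned}
\]

Then

\[
\begin{aligned}
\pi_{1i} &= (\pi_{2i} + \pi_{3i})\, e^{X_i'\beta_1} \\
\pi_{1i} + \pi_{2i} &= \pi_{3i}\, e^{X_i'\beta_2} \\
\pi_{1i} + \pi_{2i} + \pi_{3i} &= 1
\end{aligned}
\]

and

\[
\pi_{1i} = \frac{e^{X_i'\beta_1}}{1 + e^{X_i'\beta_1}}, \qquad
\pi_{2i} = \frac{e^{X_i'\beta_2} - e^{X_i'\beta_1}}{(1 + e^{X_i'\beta_1})(1 + e^{X_i'\beta_2})}, \qquad
\pi_{3i} = \frac{1}{1 + e^{X_i'\beta_2}}
\]

Consequently,

\[
\log\left(\frac{\pi_{1i}}{\pi_{2i}}\right)
 = \log\left[\frac{e^{X_i'\beta_1}\left(1 + e^{X_i'\beta_2}\right)}{e^{X_i'\beta_2} - e^{X_i'\beta_1}}\right]
\]

Continuation-ratio logits:

Fit J - 1 separate logistic regression models:

\[
\begin{aligned}
\log\left(\frac{\pi_{1i}}{\pi_{2i} + \pi_{3i} + \cdots + \pi_{Ji}}\right) &= \beta_{01} + \beta_{11}X_{1i} + \cdots + \beta_{k1}X_{ki} \\
\log\left(\frac{\pi_{2i}}{\pi_{3i} + \pi_{4i} + \cdots + \pi_{Ji}}\right) &= \beta_{02} + \beta_{12}X_{1i} + \cdots + \beta_{k2}X_{ki} \\
 &\;\;\vdots \\
\log\left(\frac{\pi_{J-1,i}}{\pi_{Ji}}\right) &= \beta_{0,J-1} + \beta_{1,J-1}X_{1i} + \cdots + \beta_{k,J-1}X_{ki}
\end{aligned}
\]

- used for ordinal response variables
- $\log\left(\dfrac{\pi_{ji}}{\pi_{j+1,i} + \cdots + \pi_{J,i}}\right)$ is the conditional log-odds that a response falls in the j-th category given that it does not fall in a category preceding the j-th category

Example: Toxicity study (Price, 1987) (from Agresti, pp. 320-321)

- Administered diethylene-glycoldimethylether (diEGdiME) to pregnant mice
- Each mouse was exposed to one of 5 concentrations for exactly 10 days early in the pregnancy
  - level 0 is a control group
  - each fetus was classified as (1) non-live, (2) malformed, (3) normal

  Concentration         Response rate (%)            Number
  (mg/kg/day)     non-live   malformed   normal     exposed
  ______________________________________________________
      0             5.05       0.34      94.61        297
     62.5           7.02       0         92.98        242
    125             7.05       2.24      90.71        312
    250            12.71      19.73      67.56        299
    500            50.53      46.32       3.16        285
  ______________________________________________________

Baseline-logit model (linear in concentration; standard errors in parentheses below each estimate):

\[
\begin{aligned}
\log\left(\frac{\hat\pi_{\text{non-live}}}{\hat\pi_{\text{normal}}}\right)
 &= \underset{(.191)}{-3.969} + \underset{(.0007)}{.0119}\,(\text{conc.}) \\
\log\left(\frac{\hat\pi_{\text{malformed}}}{\hat\pi_{\text{normal}}}\right)
 &= \underset{(.249)}{-4.952} + \underset{(.0008)}{.014}\,(\text{conc.})
\end{aligned}
\]

Lack-of-fit test: $G^2 = 57.50$ with 6 d.f. (p-value < .0001)

Baseline-logit model (quadratic in concentration):

\[
\begin{aligned}
\log\left(\frac{\hat\pi_{\text{non-live}}}{\hat\pi_{\text{normal}}}\right)
 &= \underset{(.2009)}{-2.7824} - \underset{(.00208)}{.00168}\,(\text{conc.}) + \underset{(.0000042)}{.000025}\,(\text{conc.})^2 \\
\log\left(\frac{\hat\pi_{\text{malformed}}}{\hat\pi_{\text{normal}}}\right)
 &= \underset{(.7724)}{-6.7156} + \underset{(.00512)}{.0252}\,(\text{conc.}) - \underset{(.0000077)}{.00001}\,(\text{conc.})^2
\end{aligned}
\]

Lack-of-fit test: $G^2 = 4.51$ with 4 d.f. (p-value = .341)

Continuation-ratio logit model:

\[
\log\left(\frac{\hat\pi_{\text{non-live}}}{\hat\pi_{\text{malformed}} + \hat\pi_{\text{normal}}}\right)
 = \underset{(.1577)}{-3.2479} + \underset{(.000435)}{.00639}\,(\text{conc.})
\]

$G^2_1 = 5.78$ on 3 d.f. (p-value = .123)

\[
\log\left(\frac{\hat\pi_{\text{malformed}}}{\hat\pi_{\text{normal}}}\right)
 = \underset{(.3322)}{-5.7019} + \underset{(.00123)}{.0174}\,(\text{conc.})
\]

$G^2_2 = 6.06$ on 3 d.f. (p-value = .109)

Overall lack-of-fit test: $G^2 = G^2_1 + G^2_2 = 11.84$ on 6 d.f. (p-value = .066)

Interpretation (approximate 95% confidence intervals in parentheses):

- $\exp(.00639 \times 100) = 1.9$: the odds of "non-live" increase by a factor of 1.9 for each 100 mg/kg/day increase in concentration (1.73, 2.07)
- $\exp(.0174 \times 100) = 5.7$: given that a fetus survives, the odds of malformation increase by a factor of 5.7 for every 100 mg/kg/day increase in concentration (4.48, 7.25)
- $\exp(-3.2479) = .039$: the odds that a fetus fails to survive at zero concentration (.028, .053)

PROC LOGISTIC and PROC GENMOD in SAS fit a special form of the "cumulative-logit" model

- Walker-Duncan model (Biometrika, 1967, pp. 167-179)
- this model has a "proportional odds" constraint: the slopes are common to all J - 1 cumulative logits and only the intercepts differ

\[
\begin{aligned}
\log\left(\frac{\pi_{1i}}{1 - \pi_{1i}}\right) &= \alpha_1 + \beta_1 X_{1i} + \cdots + \beta_k X_{ki} \\
\log\left(\frac{\pi_{1i} + \pi_{2i}}{1 - \pi_{1i} - \pi_{2i}}\right) &= \alpha_2 + \beta_1 X_{1i} + \cdots + \beta_k X_{ki} \\
 &\;\;\vdots \\
\log\left(\frac{\pi_{1i} + \cdots + \pi_{J-1,i}}{\pi_{J,i}}\right) &= \alpha_{J-1} + \beta_1 X_{1i} + \cdots + \beta_k X_{ki}
\end{aligned}
\]

For the toxicity data:

\[
\begin{aligned}
\log\left(\frac{\hat\pi_{\text{non-live}}}{\hat\pi_{\text{malform}} + \hat\pi_{\text{normal}}}\right)
 &= \underset{(.1783)}{-4.5311} + \underset{(.00044)}{.00962}\,(\text{conc}) \\
\log\left(\frac{\hat\pi_{\text{non-live}} + \hat\pi_{\text{malform}}}{\hat\pi_{\text{normal}}}\right)
 &= \underset{(.1381)}{-3.1533} + \underset{(.00044)}{.00962}\,(\text{conc})
\end{aligned}
\]

Score test for the "proportional odds" hypothesis: $X^2 = 267.62$ on 1 d.f. (p-value < .0001)

\[
G^2 = G^2_{\text{Walker-Duncan}} - G^2_{\text{cumulative logit model}}
 = (1640.41) - (1029.54 + 431.28) = 179.6 \text{ on 1 d.f.}
\]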
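The arithmetic behind the proportional-odds comparison for the toxicity data can be checked with a short Python sketch. It is only an illustration under stated assumptions: the function names (`chi2_sf_1df`, `lr_statistic`, `proportional_odds_probs`) are hypothetical helpers, not SAS output or part of the original analysis, and the numeric inputs are copied from the estimates quoted above. The 1 d.f. chi-square tail uses the identity P(X > x) = erfc(sqrt(x/2)), which holds because a chi-square variable with 1 d.f. is a squared standard normal.

```python
import math

# Values copied from the toxicity example in the text.
G2_WALKER_DUNCAN = 1640.41            # fit with the proportional-odds constraint
G2_SEPARATE_FITS = (1029.54, 431.28)  # the two components without the constraint
ALPHAS = (-4.5311, -3.1533)           # intercepts of the two cumulative logits
BETA_CONC = 0.00962                   # common slope for concentration


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 d.f."""
    if x < 0:
        raise ValueError("chi-square statistic must be non-negative")
    return math.erfc(math.sqrt(x / 2.0))


def lr_statistic(constrained, unconstrained_parts):
    """Difference between the constrained fit and the sum of the separate fits."""
    return constrained - sum(unconstrained_parts)


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def proportional_odds_probs(conc, alphas=ALPHAS, beta=BETA_CONC):
    """Fitted (non-live, malformed, normal) probabilities at one concentration."""
    cum_nonlive = expit(alphas[0] + beta * conc)
    cum_nonlive_or_malformed = expit(alphas[1] + beta * conc)
    return (
        cum_nonlive,
        cum_nonlive_or_malformed - cum_nonlive,
        1.0 - cum_nonlive_or_malformed,
    )


if __name__ == "__main__":
    g2 = lr_statistic(G2_WALKER_DUNCAN, G2_SEPARATE_FITS)
    print(f"G^2 = {g2:.2f} on 1 d.f., p-value = {chi2_sf_1df(g2):.3g}")
    for conc in (0, 62.5, 125, 250, 500):
        probs = proportional_odds_probs(conc)
        print(conc, [round(p, 4) for p in probs])
```

At 500 mg/kg/day the constrained model gives a fitted "normal" probability of about 0.16, well above the observed 3.16%, which is in line with the strong rejection of the proportional-odds hypothesis.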