Deviance Plots:
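Assess the overall fit of the model
\[
\log\left(\frac{\pi_i}{1-\pi_i}\right) = \beta_0 + \beta_1 X_{1i} + \cdots + \beta_r X_{ri}
\]
to the binary responses
\[
Y = (Y_1, \ldots, Y_n)'.
\]
Here
\[
\hat{\pi} = (\hat{\pi}_1, \ldots, \hat{\pi}_n)'
\]
contains the estimates of
\[
\pi_i = Pr(Y_i = 1 \mid (X_{1i}, \ldots, X_{ri})), \qquad 0 < \pi_i < 1,
\]
from the fitted model, i.e.,
\[
\hat{\pi}_i = \frac{\exp(b_0 + b_1 X_{1i} + \cdots + b_r X_{ri})}{1 + \exp(b_0 + b_1 X_{1i} + \cdots + b_r X_{ri})}
\]
Global deviance:
\[
D(\hat{\pi}; Y) = \sum_{i=1}^{n} d(\hat{\pi}_i; Y_i)
= -2 \sum_{i=1}^{n} \left[ Y_i \log(\hat{\pi}_i) + (1 - Y_i) \log(1 - \hat{\pi}_i) \right]
\]
= $G^2$ for testing
$H_0$: proposed model vs.
$H_A$: general alternative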
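The global deviance for binary responses can be computed directly from the fitted probabilities. The following is a minimal sketch, assuming the coefficients b have already been estimated by some logistic regression routine; the function names `unit_deviance`, `global_deviance`, and `fitted_probability` are illustrative and do not come from the slides.

```python
import math


def fitted_probability(b, x):
    """Inverse logit of b[0] + b[1]*x[0] + ... + b[r]*x[r-1] (hypothetical helper)."""
    eta = b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))
    return 1.0 / (1.0 + math.exp(-eta))


def unit_deviance(pi_hat, y):
    """d(pi_hat; y) = -2 [y log(pi_hat) + (1 - y) log(1 - pi_hat)] for y in {0, 1}."""
    return -2.0 * math.log(pi_hat if y == 1 else 1.0 - pi_hat)


def global_deviance(pi_hats, ys):
    """D(pi_hat; Y): sum of the unit deviances over all n observations."""
    return sum(unit_deviance(p, y) for p, y in zip(pi_hats, ys))


if __name__ == "__main__":
    # Toy values chosen only for illustration.
    print(global_deviance([0.5, 0.5], [1, 0]))
```

Because each response is 0 or 1, only one of the two log terms contributes for any given unit, which is why `unit_deviance` branches on `y`.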
Deviance plot:
When the model is appropriate,
the global deviance tends to be
small.
When the model is inappropriate,
the data will exhibit systematic
deviations from predicted values
of $\pi_i$ (a lack-of-fit component)
that tend to inflate the global deviance.
$G^2$ does not have a $\chi^2$ distribution
when there is only one observed response for each $X_i$, even if $H_0$ is true.
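One way to see why the global deviance is uninformative with a single binary response per covariate pattern is the following standard identity for maximum likelihood logistic fits. It is added here as a sketch and is not taken from the slides. The symbol $\hat{\eta}_i$ (the fitted linear predictor) is notation introduced for this note. The second equality uses the likelihood equations $\sum_i (Y_i - \hat{\pi}_i) = 0$ and $\sum_i (Y_i - \hat{\pi}_i) X_{ji} = 0$.

```latex
\[
D(\hat{\pi}; Y)
= -2 \sum_{i=1}^{n} \left[ Y_i \hat{\eta}_i + \log(1 - \hat{\pi}_i) \right]
= -2 \sum_{i=1}^{n} \left[ \hat{\pi}_i \hat{\eta}_i + \log(1 - \hat{\pi}_i) \right],
\qquad
\hat{\eta}_i = \log\left(\frac{\hat{\pi}_i}{1 - \hat{\pi}_i}\right)
\]
```

Once the model is fitted, the responses enter the global deviance only through the fitted values, so its size says little about how well each $Y_i$ agrees with $\hat{\pi}_i$.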
1. Partition the n observational units
into K non-overlapping clusters,
with nk units in the k-th cluster.
(a) Ward's method
(b) Complete linkage
(c) Centroid method
2. Create a model matrix Z where the
k-th column has entries
\[
Z_{ik} = \begin{cases} 1 & \text{if unit } i \text{ is in the } k\text{-th cluster} \\ 0 & \text{otherwise} \end{cases}
\]
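3. Fit the model:
\[
\begin{bmatrix} \log\left(\frac{\pi_1}{1-\pi_1}\right) \\ \vdots \\ \log\left(\frac{\pi_n}{1-\pi_n}\right) \end{bmatrix}
= X\beta + Z\gamma
= \begin{bmatrix} X & Z \end{bmatrix} \begin{bmatrix} \beta \\ \gamma \end{bmatrix}
\]
Let $\hat{\pi}_i$ denote the estimate of $\pi_i$ from
the model in Step 3.
4. Use the model fit in Step 3 to compute "local" deviance contributions
by summing deviances within each
cluster.
For the $\ell$-th unit in the k-th cluster,
define
\[
d_{k\ell} = d(\hat{\pi}_{k\ell}; Y_{k\ell})
= -2 \left[ Y_{k\ell} \log \hat{\pi}_{k\ell} + (1 - Y_{k\ell}) \log(1 - \hat{\pi}_{k\ell}) \right]
\]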
5. Order the K clusters so that
\[
S_1 \le S_2 \le \cdots \le S_K
\]
where $S_k$ is a measure of "within"
cluster inhomogeneity, e.g.,
\[
S_k = \frac{1}{n_k} \sum_{i=1}^{n_k} (X_{ik} - \bar{X}_k)'(X_{ik} - \bar{X}_k)
\]
6. Compute running means of local deviances:
\[
\bar{D}_t = \left( \sum_{k=1}^{t} \sum_{\ell=1}^{n_k} d_{k\ell} \right) \Bigg/ \sum_{k=1}^{t} (n_k - 1)
\]
7. Plot $\bar{D}_t$ against the "d.f."
\[
\sum_{k=1}^{t} (n_k - 1) \quad \text{for } t = 1, 2, \ldots, K.
\]
Superimpose a horizontal line representing the global mean deviance
\[
\bar{D} = D(\hat{\pi}; Y)/(n - r - 1)
\]
If the proposed model is appropriate,
the values of $\bar{D}_t$ will approach $\bar{D}$.
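The bookkeeping in Steps 2 and 4 through 7 can be sketched as follows. This is a hedged illustration, not the authors' implementation. It assumes the cluster labels (Step 1) and the fitted probabilities from the Step 3 model are already available from whatever clustering and logistic regression routines are in use. The helper names (`indicator_matrix`, `inhomogeneity`, `running_mean_deviances`, `global_mean_deviance`) are hypothetical, labels are taken to be 0-based, and every cluster is assumed to be non-empty.

```python
import math


def unit_deviance(pi_hat, y):
    """d(pi_hat; y) for a binary response y in {0, 1}."""
    return -2.0 * math.log(pi_hat if y == 1 else 1.0 - pi_hat)


def indicator_matrix(labels, K):
    """Step 2: n x K matrix Z with Z[i][k] = 1 if unit i is in cluster k."""
    return [[1 if lab == k else 0 for k in range(K)] for lab in labels]


def inhomogeneity(rows):
    """Step 5: S_k = (1/n_k) * sum_i (X_ik - Xbar_k)'(X_ik - Xbar_k)."""
    n_k, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n_k for j in range(p)]
    return sum((r[j] - means[j]) ** 2 for r in rows for j in range(p)) / n_k


def running_mean_deviances(X, y, pi_hat, labels, K):
    """Steps 4-7: list of (d.f., running mean deviance), clusters ordered by S_k.

    pi_hat is assumed to hold the fitted probabilities from the Step 3 model.
    """
    members = [[i for i, lab in enumerate(labels) if lab == k] for k in range(K)]
    order = sorted(range(K), key=lambda k: inhomogeneity([X[i] for i in members[k]]))
    points, dev_sum, df = [], 0.0, 0
    for k in order:
        dev_sum += sum(unit_deviance(pi_hat[i], y[i]) for i in members[k])
        df += len(members[k]) - 1
        if df > 0:  # skip points where the d.f. is still zero (singleton clusters)
            points.append((df, dev_sum / df))
    return points


def global_mean_deviance(y, pi_hat_proposed, r):
    """Horizontal reference line: D(pi_hat; Y) / (n - r - 1) for the proposed model."""
    D = sum(unit_deviance(p, yi) for p, yi in zip(pi_hat_proposed, y))
    return D / (len(y) - r - 1)


if __name__ == "__main__":
    # Toy inputs chosen only for illustration.
    X = [[0.0], [0.2], [1.0], [3.0]]
    y = [0, 1, 0, 1]
    labels = [1, 1, 0, 0]
    print(indicator_matrix(labels, 2))
    print(running_mean_deviances(X, y, [0.5, 0.5, 0.8, 0.8], labels, 2))
```

The returned pairs are the points to plot. The value from `global_mean_deviance` gives the height of the superimposed horizontal line.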
Smoothed partial residual plots:
These are used to determine how the
response is related to each particular
covariate after adjusting for the effects
of (other) covariates included in the
proposed model.
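Partial residuals:
With respect to covariate
\[
V = (V_1, \ldots, V_n)',
\]
the partial residuals are
\[
r_{par,i} = \frac{Y_i - \hat{\hat{\pi}}_i}{\hat{\hat{\pi}}_i (1 - \hat{\hat{\pi}}_i)} + \hat{\hat{\gamma}} V_i
\]
where $\hat{\hat{\pi}}_i$ and $\hat{\hat{\gamma}}$ are obtained from fitting
the model:
\[
\begin{bmatrix} \log\left(\frac{\pi_1}{1-\pi_1}\right) \\ \vdots \\ \log\left(\frac{\pi_n}{1-\pi_n}\right) \end{bmatrix}
= \begin{cases}
X\beta + V\gamma & \text{if } V \text{ is not a column of } X \\
X_{-v}\beta_{-v} + V\gamma & \text{otherwise}
\end{cases}
\]
1. Plot $r_{par,i}$ against $V_i$.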
2. Draw a "smooth" curve on the partial
residual plot by using Cleveland's (1979,
JASA, pp. 829-836) method for fitting a
robust locally weighted regression of $r_{par,i}$
on $V_i$ (loess).
(i) A "straight-line" curve suggests that $V_i$
should enter the model as $\beta_i V_i$.
(ii) A horizontal line suggests that $V_i$ is not important (i.e., $\beta_i = 0$).
(iii) A "curved" curve suggests that $V_i$
should enter the model as coefficient *
function($V_i$) + $\cdots$
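The partial residuals themselves are a one-line computation once the augmented model has been fitted. The sketch below assumes the fitted probabilities and the coefficient of V from that fit are supplied by the caller. The function name `partial_residuals` and its argument names are illustrative only.

```python
def partial_residuals(y, pi_hat, gamma_hat, v):
    """r_par,i = (y_i - pi_i) / (pi_i * (1 - pi_i)) + gamma_hat * v_i.

    pi_hat and gamma_hat are assumed to come from the logistic fit that
    includes the covariate V (a hypothetical upstream step, not shown here).
    """
    return [
        (yi - pi) / (pi * (1.0 - pi)) + gamma_hat * vi
        for yi, pi, vi in zip(y, pi_hat, v)
    ]


if __name__ == "__main__":
    # Toy values chosen only for illustration.
    print(partial_residuals([1, 0], [0.5, 0.5], 2.0, [1.0, 3.0]))
```

The resulting values would then be plotted against V and smoothed. One option, assuming statsmodels is installed, is the `lowess` function in `statsmodels.nonparametric.smoothers_lowess`, which implements a robust locally weighted regression of this kind.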
Based on this analysis:
1. Add new variables or functions of variables
to the variable list in the model search.
2. Do a new selection of variables.
3. Look at diagnostics for the new model.
(i) leverage, influence, residuals, etc.
(ii) deviance plots
(iii) partial residual plots
Iterate on these steps until you identify a reasonable model.
True Model: $\log\left(\frac{\pi_i}{1-\pi_i}\right) = -1 + X_{5i} + X_{6i} - 2X_{6i}^2$
Fitted Model: $\log\left(\frac{\pi_i}{1-\pi_i}\right) = \beta_0 + \beta_1 X_{5i} + \beta_2 X_{6i}$
From Landwehr et al. (1984), JASA.