Logistic regression models
Logistic regression model
Suppose we observe binomially distributed data $S_1, \dots, S_m$, which are, respectively, the numbers of successes among $n_1, \dots, n_m$ trials. Assume that $S_i \sim \mathrm{Binomial}(n_i, p_i)$ and
$$\log\frac{p_i}{1 - p_i} = X_i^\top \beta,$$
where $X_i$ is a $p$-dimensional covariate vector.
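As a concrete illustration, here is a minimal numpy sketch of the maximum likelihood fit via Newton-Raphson (equivalently, iteratively reweighted least squares). The names `fit_logistic`, `X`, `S`, and `n` are my own, with `X` the $m \times p$ design matrix and `S`, `n` the vectors of success counts and trial counts:

```python
import numpy as np

def fit_logistic(X, S, n, n_iter=25):
    """MLE of beta in the binomial logistic model
    log(p_i / (1 - p_i)) = X_i^T beta, via Newton-Raphson."""
    m, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):                        # fixed iteration count, for brevity
        prob = 1.0 / (1.0 + np.exp(-(X @ beta)))   # current p_i
        W = n * prob * (1.0 - prob)                # diagonal of V
        score = X.T @ (S - n * prob)               # gradient of the log-likelihood
        info = X.T @ (W[:, None] * X)              # Fisher information X^T V X
        beta = beta + np.linalg.solve(info, score) # Newton update
    return beta
```

In practice one would iterate until the score is near zero rather than for a fixed number of steps; the fixed count keeps the sketch short.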
Asymptotic normality
Using the large-sample property of the maximum likelihood estimator, we know that, approximately,
$$\hat\beta - \beta \sim N\bigl(0, (X^\top V X)^{-1}\bigr),$$
where $V = \mathrm{diag}\{n_1 p_1(1 - p_1), \dots, n_m p_m(1 - p_m)\}$ and $X = (X_1, \dots, X_m)^\top$ is the $m \times p$ design matrix.
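The plug-in estimate of this covariance matrix is easy to form once the fitted probabilities are available; a small sketch, with the helper name `beta_covariance` and argument `p_hat` (the fitted $\hat p_i$) my own:

```python
import numpy as np

def beta_covariance(X, n, p_hat):
    """Estimated covariance of beta_hat: (X^T V X)^{-1}, with
    V = diag(n_i * p_hat_i * (1 - p_hat_i)) evaluated at p_hat."""
    V = n * p_hat * (1.0 - p_hat)
    return np.linalg.inv(X.T @ (V[:, None] * X))
```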
Wald-type inference
A $1 - \alpha$ confidence interval for $\beta_j$ ($j = 1, \dots, p$) is
$$\hat\beta_j \pm z_{\alpha/2}\bigl[(X^\top V X)^{-1}\bigr]_{jj}^{1/2},$$
where $[(X^\top V X)^{-1}]_{jj}$ is the $j$-th diagonal entry of $(X^\top V X)^{-1}$, and $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal distribution.
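A sketch of the interval computation, using `scipy.stats.norm` for the quantile; `wald_ci` is a hypothetical name, and `cov` is the estimated matrix $(X^\top V X)^{-1}$ from the previous slide:

```python
import numpy as np
from scipy.stats import norm

def wald_ci(beta_hat, cov, j, alpha=0.05):
    """1 - alpha Wald confidence interval for beta_j, where cov
    is the estimated covariance matrix (X^T V X)^{-1}."""
    se = np.sqrt(cov[j, j])              # standard error of beta_hat_j
    z = norm.ppf(1.0 - alpha / 2.0)      # upper alpha/2 normal quantile
    return beta_hat[j] - z * se, beta_hat[j] + z * se
```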
Log-likelihood of β
Recall that the log-likelihood of β is
$$\ell(\beta) = \sum_{i=1}^m \log\binom{n_i}{S_i} + \sum_{i=1}^m S_i \log\frac{p_i}{1 - p_i} + \sum_{i=1}^m n_i \log(1 - p_i)
= \sum_{i=1}^m \Bigl[\log\binom{n_i}{S_i} + S_i \log(p_i) + (n_i - S_i) \log(1 - p_i)\Bigr].$$
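To evaluate this numerically, the binomial coefficient is best computed on the log scale; a minimal sketch using `scipy.special.gammaln` (the function name `log_likelihood` is mine, and the probabilities `p` are assumed to lie strictly in $(0,1)$):

```python
import numpy as np
from scipy.special import gammaln

def log_likelihood(S, n, p):
    """Binomial log-likelihood: sum over i of
    log C(n_i, S_i) + S_i log(p_i) + (n_i - S_i) log(1 - p_i)."""
    log_binom = gammaln(n + 1) - gammaln(S + 1) - gammaln(n - S + 1)
    return np.sum(log_binom + S * np.log(p) + (n - S) * np.log(1.0 - p))
```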
Testing the significance of the predictors
- Consider the covariates $X_i = (1, X_{i1}, \dots, X_{i(p-1)})^\top$. The first covariate is set to 1, which corresponds to the intercept in the logistic regression model.
- If none of the predictors is significant, the reduced model is
  $$p_i = \frac{\exp(\beta_1)}{1 + \exp(\beta_1)},$$
  which is a constant-probability model.
- To test the significance of the predictors $X_{i1}, \dots, X_{i(p-1)}$, we test the significance of $\beta_2, \dots, \beta_p$.
Testing the significance of the predictors
To test the significance of the predictors, we test the hypotheses
$$H_0: \beta_2 = \cdots = \beta_p = 0 \quad \text{vs} \quad H_1: \beta_j \ne 0 \text{ for some } j \in \{2, \dots, p\}.$$
Log-likelihood ratio
- Let $\hat\beta$ be the MLE of $\beta$ under the logistic regression model; the corresponding MLE of $p_i$ is $\hat p_i = \exp(X_i^\top \hat\beta)/\{1 + \exp(X_i^\top \hat\beta)\}$.
- The maximized log-likelihood under the alternative is
  $$\ell(\hat\beta) = \sum_{i=1}^m \Bigl[\log\binom{n_i}{S_i} + S_i \log(\hat p_i) + (n_i - S_i) \log(1 - \hat p_i)\Bigr].$$
- Under the null hypothesis, the maximized log-likelihood is
  $$\sum_{i=1}^m \Bigl[\log\binom{n_i}{S_i} + S_i \log(\hat p) + (n_i - S_i) \log(1 - \hat p)\Bigr],$$
  where $\hat p = S/n$, $S = \sum_{i=1}^m S_i$, and $n = \sum_{i=1}^m n_i$.
Rejection region of the likelihood ratio test
For testing the significance of the predictors,
$$H_0: \beta_2 = \cdots = \beta_p = 0 \quad \text{vs} \quad H_1: \beta_j \ne 0 \text{ for some } j \in \{2, \dots, p\},$$
an $\alpha$-level rejection region of the likelihood ratio test is
$$\mathrm{LR} = 2\sum_{i=1}^m \Bigl[S_i \log(\hat p_i) + (n_i - S_i) \log(1 - \hat p_i)\Bigr] - 2S \log(S) - 2(n - S) \log(n - S) + 2n \log(n) > \chi^2_{p-1,\alpha}.$$
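A small sketch of this test, assuming `p_hat` holds the fitted probabilities $\hat p_i$ from the full model, `p_dim` is the number of coefficients $p$, and $0 < S < n$ overall (the names are mine):

```python
import numpy as np
from scipy.stats import chi2

def lr_test(S, n, p_hat, p_dim, alpha=0.05):
    """LR test of H0: beta_2 = ... = beta_p = 0, given fitted
    probabilities p_hat from the full logistic model."""
    S_tot, n_tot = S.sum(), n.sum()
    LR = (2.0 * np.sum(S * np.log(p_hat) + (n - S) * np.log(1.0 - p_hat))
          - 2.0 * S_tot * np.log(S_tot)
          - 2.0 * (n_tot - S_tot) * np.log(n_tot - S_tot)
          + 2.0 * n_tot * np.log(n_tot))
    reject = LR > chi2.ppf(1.0 - alpha, p_dim - 1)   # compare with chi^2_{p-1, alpha}
    return LR, reject
```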
Diagnostic tools: goodness-of-fit test
- To test the goodness-of-fit of the full model, we compare the full model (with $p$ covariates) against a saturated model.
- The saturated model has $m$ unknown parameters; it is a nonparametric model. Specifically, $S_i \sim \mathrm{Binomial}(n_i, p_i)$ with the $p_i$ themselves as the unknown parameters. Note that $m$ is the largest possible number of unknown parameters.
- Clearly, in the saturated model, the maximum likelihood estimator of $p_i$ is $S_i/n_i$.
Deviance
- The log-likelihood ratio statistic for testing goodness-of-fit is
  $$D = 2\sum_{i=1}^m \Bigl[S_i \log(S_i/n_i) + (n_i - S_i) \log(1 - S_i/n_i)\Bigr] - 2\sum_{i=1}^m \Bigl[S_i \log(\hat p_i) + (n_i - S_i) \log(1 - \hat p_i)\Bigr].$$
- This log-likelihood ratio statistic is called the deviance.
- The deviance $D$ approximately follows a $\chi^2_{m-p}$ distribution under the null hypothesis that the full model (with $p$ covariates) is adequate for the data.
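A sketch of the deviance computation; it adopts the usual convention $0 \log 0 = 0$ for groups with $S_i = 0$ or $S_i = n_i$, and the names `deviance` and `p_dim` are mine:

```python
import numpy as np
from scipy.stats import chi2

def deviance(S, n, p_hat, p_dim):
    """Deviance D comparing the fitted model (p_hat) with the
    saturated model (p_i = S_i / n_i)."""
    with np.errstate(divide="ignore", invalid="ignore"):
        # groups with S_i = 0 or S_i = n_i contribute 0 (0 * log 0 := 0)
        t1 = np.where(S > 0, S * np.log(S / (n * p_hat)), 0.0)
        t2 = np.where(n - S > 0,
                      (n - S) * np.log((n - S) / (n * (1.0 - p_hat))), 0.0)
    D = 2.0 * np.sum(t1 + t2)
    return D, chi2.sf(D, len(S) - p_dim)   # p-value against chi^2_{m-p}
```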
Diagnostic tools: checking residuals
- Ordinary residual: $e_i = S_i - \hat S_i = S_i - n_i \hat p_i$ for $i = 1, \dots, m$.
- Deviance residual:
  $$d_i = \mathrm{sign}(e_i)\Bigl[2\Bigl\{S_i \log\frac{S_i}{n_i \hat p_i} + (n_i - S_i) \log\frac{n_i - S_i}{n_i(1 - \hat p_i)}\Bigr\}\Bigr]^{1/2}$$
  for $i = 1, \dots, m$.
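A sketch of both residuals in numpy, again using the $0 \log 0 = 0$ convention (the name `deviance_residuals` is mine):

```python
import numpy as np

def deviance_residuals(S, n, p_hat):
    """Deviance residuals d_i = sign(e_i) * sqrt(2 * {...}),
    with e_i = S_i - n_i p_hat_i the ordinary residuals."""
    e = S - n * p_hat
    with np.errstate(divide="ignore", invalid="ignore"):
        t1 = np.where(S > 0, S * np.log(S / (n * p_hat)), 0.0)
        t2 = np.where(n - S > 0,
                      (n - S) * np.log((n - S) / (n * (1.0 - p_hat))), 0.0)
    return np.sign(e) * np.sqrt(2.0 * (t1 + t2))
```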
Diagnostic tools: checking residuals
- Pearson residual:
  $$r_i = \frac{S_i - n_i \hat p_i}{\sqrt{n_i \hat p_i (1 - \hat p_i)}}.$$
- Standardized Pearson residual:
  $$sr_i = \frac{r_i}{\sqrt{1 - h_{ii}}},$$
  where $h_{ii} = (H)_{ii}$ and $H = V^{1/2} X (X^\top V X)^{-1} X^\top V^{1/2}$.
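A sketch computing both quantities, forming the hat matrix $H$ explicitly (fine for moderate $m$; the function name is mine):

```python
import numpy as np

def standardized_pearson_residuals(X, S, n, p_hat):
    """Pearson residuals r_i standardized by the leverages h_ii of
    H = V^{1/2} X (X^T V X)^{-1} X^T V^{1/2}."""
    V = n * p_hat * (1.0 - p_hat)                    # diagonal of V
    r = (S - n * p_hat) / np.sqrt(V)                 # Pearson residuals
    sqrtV = np.sqrt(V)
    H = (sqrtV[:, None] * X) @ np.linalg.inv(X.T @ (V[:, None] * X)) @ (X.T * sqrtV[None, :])
    h = np.diag(H)                                   # leverages h_ii
    return r / np.sqrt(1.0 - h)
```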