Logistic regression

advertisement
Say good things, think good thoughts,
and do good deeds
1
Categorical
Data Analysis
Chapter 5 (I): Logistic
Regression for
Quantitative Factors
2
Logistic Regression
• Binary response variable: Y ~ Bernoulli(p)
• k quantitative/ordinal factors: x1, … , xk
• model:
 p ( x) 
    1 x1  ...   k xk
log 
 1  p ( x) 
SAS textbook Sec 8.5
3
Interpretation
(for Only One Factor)
•
(Multiplicative effect on the odds)
Increasing x by one unit is estimated to
give the odds of response a increase by
a factor of exp()— Not easy for
investigators to understand
4
Interpretation
(for Only One Factor)
•
Interpretation of the effect of x on Y in
terms of risk (or called response rate):
–
The bigger  (the effect of X on Y) is, the
bigger the slope of a tangent line of the
fitting curve (with respect to X) is:
p ( x)
 p ( x)(1  p ( x))
x
how the risk changes instantly at x
5
LD 50
• LD50 (LD = lethal dose)=
the dose level at which toxicity rate
p(dose) is 50%
• In the logistic regression with dose
being the only x, LD50= -/
• The instant change rate of risk (p)
at LD50 is /4
6
Example: Insecticide vs Beetles
dosage
1.691
# of beetles # of dead
exposed
beetles
59
6
…
…
…
1.884
60
60
See handout for SAS code and output
7
Only One Factor
• The estimate of , 34.27, can be
interpreted as:
Increasing dose by one unit is estimated
to give the odds of death a increase by a
factor of exp(34.27)
• Interpret the effect on the risk of death at
dose 1.70
8
More than 1 Factors
• The estimate of  is 34.27 can be
interpreted as:
Increasing x by one unit, keeping other
factors fixed, is estimated to give the odds
of death a increase by a factor of
exp(34.27)
 1 is called the logistic regression
coefficient for x1 adjusted for other factors
9
Confidence Intervals
Based on Fisher information matrix
and asymptotical results of mle
• Wald C.I. for i : found by SAS
• Wald C. I. for p(x): found by SAS
PROC GENMOD with the OBSTATS
option
10
Significance Tests
• H0: 10 vs. H1: 1 is not zero
• Wald test
• LR test
11
Examining the Fit of the
Logit Model
• Plot fitted and observed rates on the
same plot
• Residuals for logit models
12
Grouped Data
• Grouping data makes overall goodness of fit
test sensible and possible
• Example: Crab data grouped by the width
– Ungrouped:
deviance=191.7, df=165,
p-value=.076
– Grouped:
deviance=6.25, df=6,
p-value=.40
13
Download