Uploaded by Nouran Hamza

lec4module3

Why logistic regression
Let's get to know our data
• plot(Treatment_dummy,outcome_dummy)
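A scatter plot of one 0/1 variable against another can show at most four distinct points, so a cross-tabulation is often easier to read. The sketch below uses simulated data, not the lecture's data set; only the variable names are taken from the slide, and the success probabilities (0.9 and 0.2) are assumptions chosen for illustration.

```r
# Simulated 0/1 data (hypothetical; NOT the lecture's data set)
set.seed(1)
Treatment_dummy <- rbinom(100, 1, 0.5)
outcome_dummy   <- rbinom(100, 1, ifelse(Treatment_dummy == 1, 0.9, 0.2))

# Counts of outcome by treatment group
tab <- table(Treatment = Treatment_dummy, Outcome = outcome_dummy)
print(tab)

# Row proportions: share of failures/successes within each treatment group
print(prop.table(tab, margin = 1))
```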
Logistic Function
$$P(\text{"Success"} \mid X) = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}}$$
[Figure: P("Success" | Newdrug) on the vertical axis, from 0.0 to 1.0, plotted against Newdrug]
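As a minimal sketch, the logistic function can be evaluated directly in R. The coefficient values and the name `p_success` are assumptions made for illustration, not estimates from the lecture; `plogis()` is base R's built-in logistic function and is used here only as a cross-check.

```r
# Illustrative coefficients (assumed, not estimated from the lecture data)
beta0 <- -1
beta1 <- 2

# Logistic function: maps the linear predictor to a probability in (0, 1)
p_success <- function(x) {
  eta <- beta0 + beta1 * x
  exp(eta) / (1 + exp(eta))
}

x <- c(0, 1)          # 0 = Placebo, 1 = Newdrug
print(p_success(x))

# Cross-check against base R's plogis()
print(all.equal(p_success(x), plogis(beta0 + beta1 * x)))
```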
Logit Transformation
The logistic regression model is given by
$$P(Y \mid X) = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}}$$
which is equivalent to
$$\ln\left(\frac{P(Y \mid X)}{1 - P(Y \mid X)}\right) = \beta_0 + \beta_1 X$$
This is called the Logit Transformation.
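The equivalence can be checked numerically: applying the logit to a probability and then the logistic function should return the original probability. In this sketch the helper name `logit` and the probability values are assumptions; `qlogis()` and `plogis()` are the base R logit and inverse logit.

```r
# Logit: log of the odds p / (1 - p)
logit <- function(p) log(p / (1 - p))

p <- c(0.1, 0.5, 0.9)   # arbitrary probabilities for illustration
print(logit(p))

# The hand-written logit agrees with base R's qlogis()
print(all.equal(logit(p), qlogis(p)))

# The logistic function undoes the logit
print(all.equal(plogis(logit(p)), p))
```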
Dichotomous Predictor
Consider a dichotomous predictor (X) which represents receiving Treatment (1 = Newdrug).

                     Treatment (X)
Outcome (Y)          Newdrug (X = 1)          Placebo (X = 0)
Success (Y = 1)      P(Y = 1 | X = 1)         P(Y = 1 | X = 0)
Failure (Y = 0)      1 - P(Y = 1 | X = 1)     1 - P(Y = 1 | X = 0)

$$\frac{P}{1 - P} = e^{\beta_0 + \beta_1 X}$$
Therefore the odds ratio (OR):
$$\text{Odds for Success with Newdrug} = \frac{P(Y = 1 \mid X = 1)}{1 - P(Y = 1 \mid X = 1)} = e^{\beta_0 + \beta_1}$$
$$\text{Odds for Success with Placebo} = \frac{P(Y = 1 \mid X = 0)}{1 - P(Y = 1 \mid X = 0)} = e^{\beta_0}$$
$$\text{OR} = \frac{\text{Odds for Success with Newdrug}}{\text{Odds for Success with Placebo}} = \frac{e^{\beta_0 + \beta_1}}{e^{\beta_0}} = e^{\beta_1}$$
Dichotomous Predictor
• Therefore, for the odds ratio associated with receiving treatment we have
$$\text{OR} = e^{\beta_1}$$
• Taking the natural logarithm we have
$$\ln(\text{OR}) = \beta_1$$
thus the estimated regression coefficient associated with a 0-1 coded dichotomous predictor is the natural log of the OR associated with receiving treatment!!!
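This relationship can be checked with `glm()`. The sketch below fits a logistic regression to a made-up 2x2 table (the counts are invented for illustration and are not the lecture's data) and compares the exponentiated coefficient with the odds ratio computed directly from the counts.

```r
# Hypothetical 2x2 table, one row per cell (counts invented for illustration)
d <- data.frame(
  X = c(1, 1, 0, 0),        # 1 = Newdrug, 0 = Placebo
  Y = c(1, 0, 1, 0),        # 1 = Success, 0 = Failure
  n = c(40, 10, 15, 35)     # number of patients in each cell
)

# Logistic regression with the cell counts as frequency weights
fit <- glm(Y ~ X, family = binomial, weights = n, data = d)

# Odds ratio from the model: exp of the coefficient of X
or_model <- unname(exp(coef(fit)["X"]))

# Odds ratio straight from the table: odds with Newdrug / odds with Placebo
or_table <- (40 / 10) / (15 / 35)

print(c(model = or_model, table = or_table))
```

The two values should agree up to numerical tolerance, because a model with a single 0-1 predictor reproduces the observed odds in each group.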
Logit is Directly Related to Odds
The logistic model can be written
$$\ln\left(\frac{P(Y \mid X)}{1 - P(Y \mid X)}\right) = \ln\left(\frac{P}{1 - P}\right) = \beta_0 + \beta_1 X$$
This implies that the odds for success can be expressed as
$$\frac{P}{1 - P} = e^{\beta_0 + \beta_1 X}$$
Take care
Change the reference value
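One way to read this warning: when the treatment is stored as a factor, R uses the first level (alphabetical by default) as the reference category, so "Newdrug" would be the reference rather than "Placebo". The sketch below, on invented counts, shows how `relevel()` changes the reference and flips the sign of the coefficient. The data frame and its values are hypothetical.

```r
# Hypothetical data (counts invented for illustration)
d <- data.frame(
  Treatment = factor(c("Newdrug", "Newdrug", "Placebo", "Placebo")),
  Y = c(1, 0, 1, 0),
  n = c(40, 10, 15, 35)
)

# Default: levels are alphabetical, so "Newdrug" is the reference level
print(levels(d$Treatment))
fit_default <- glm(Y ~ Treatment, family = binomial, weights = n, data = d)

# Make "Placebo" the reference so the coefficient describes Newdrug vs Placebo
d$Treatment <- relevel(d$Treatment, ref = "Placebo")
fit_placebo <- glm(Y ~ Treatment, family = binomial, weights = n, data = d)

# Same magnitude, opposite sign: ln(OR) versus ln(1 / OR)
print(coef(fit_default))
print(coef(fit_placebo))
```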
How to interpret
• The odds of success for a patient receiving placebo are 0.28
• The odds of success for a patient receiving Newdrug are 10.09
• The probability of success for a patient receiving placebo is 21.9%
• The probability of success for a patient receiving Newdrug is 90.9%
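The probabilities follow from the odds through P = odds / (1 + odds). A small sketch using the rounded odds quoted above; the helper name `odds_to_prob` is an assumption, and because the inputs are rounded the results match the quoted percentages only approximately.

```r
# Odds as quoted on the slide (rounded values)
odds_placebo <- 0.28
odds_newdrug <- 10.09

# Convert odds to probability: P = odds / (1 + odds)
odds_to_prob <- function(odds) odds / (1 + odds)

print(odds_to_prob(odds_placebo))   # probability of success with placebo
print(odds_to_prob(odds_newdrug))   # probability of success with Newdrug

# Odds ratio for Newdrug versus placebo, from the same rounded odds
print(odds_newdrug / odds_placebo)
```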
What about multiple Xs
> library(corrplot)
> correlations <- cor(logistic[,5:8])
> corrplot(correlations, method="circle")
• A dot representation is used, where blue represents positive correlation. The larger the dot, the larger the correlation. The matrix is symmetric, and the diagonal entries show perfect positive correlation because each one is the correlation of a variable with itself.
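The two properties described above (symmetry and a diagonal of ones) can be verified on any correlation matrix. The sketch below uses simulated columns, since the lecture's `logistic` data frame is not available here; the names `logistic_demo` and `correlations_demo` are hypothetical.

```r
# Simulated stand-in for four numeric columns (NOT the lecture's data)
set.seed(42)
logistic_demo <- data.frame(
  x1 = rnorm(50),
  x2 = rnorm(50),
  x3 = rnorm(50),
  x4 = rnorm(50)
)
# Make x2 depend on x1 so that at least one pair is clearly correlated
logistic_demo$x2 <- logistic_demo$x1 + rnorm(50, sd = 0.5)

correlations_demo <- cor(logistic_demo)
print(round(correlations_demo, 2))

# The matrix equals its transpose
print(isSymmetric(correlations_demo))

# Each variable correlates perfectly with itself
print(all.equal(unname(diag(correlations_demo)), rep(1, 4)))
```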
Getting the correlation matrix
> correlations
Take care 