Logistic Regression
single and multiple predictors
Overview

Defined:
A model for predicting one variable from
other variable(s).

Variables:
IV(s): continuous and/or categorical,
DV: dichotomous

Relationship:
Prediction of group membership

Example:
Can we predict bar passage from LSAT score
(and/or GPA, etc)

Assumptions:
No multicollinearity (linearity and normality are not required)
Comparison to Linear Regression:



Because the outcome is dichotomous,
we can’t use linear regression:
the relationship is not linear.
Because the outcome is dichotomous,
we are now talking about
“probabilities” (of scoring 0 or 1).
So logistic regression is about predicting
the probability of the outcome
occurring.
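The contrast above can be illustrated with a minimal Python sketch (the coefficients are made up for illustration): the linear equation can produce “probabilities” outside 0–1, while the logistic transform always stays between 0 and 1.

```python
import math

def linear(b0, b1, x):
    # Ordinary linear prediction: unbounded, can fall outside [0, 1]
    return b0 + b1 * x

def logistic(b0, b1, x):
    # Logistic transform of the same equation: always in (0, 1)
    return 1 / (1 + math.exp(-(b0 + b1 * x)))

# With the same (hypothetical) coefficients, a large predictor value
# pushes the linear "probability" above 1, but not the logistic one.
b0, b1 = 0.1, 0.3
print(linear(b0, b1, x=5))    # 1.6 -- not a valid probability
print(logistic(b0, b1, x=5))  # ~0.83 -- still a valid probability
```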
Comparison to Linear Regression:

Logistic is based upon the “odds ratio,” Exp(b).
The odds of an event are the probability of the event divided by the
probability of the non-event; the odds ratio compares the odds after a
unit change in the predictor to the odds before that change:

Exp(b) = (odds after a unit change in the predictor) /
         (odds before a unit change in the predictor)

For example, if Exp(b) = 2, then a one-unit change would
make the event twice as likely to occur (odds of .67/.33).
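A quick numeric check of the odds-ratio idea, sketched in Python with hypothetical values: starting from a 50/50 probability, a coefficient with Exp(b) = 2 doubles the odds, which lands exactly on the .67/.33 split from the example.

```python
import math

def odds(p):
    # Odds: probability of the event divided by probability of the non-event
    return p / (1 - p)

# Hypothetical coefficient chosen so that Exp(b) is exactly 2
b = math.log(2)

odds_before = odds(0.5)                  # odds of 1 at p = .5
odds_after = math.exp(b) * odds_before   # Exp(b) multiplies the odds: now 2
p_after = odds_after / (1 + odds_after)  # back to a probability: 2/3
print(p_after)  # ~0.667, i.e. the .67/.33 split from the slide
```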
Comparison to Linear Regression:




Single predictor:

P(Y) = 1 / (1 + e^-(b0 + b1X1 + εi))

Multiple predictors:

P(Y) = 1 / (1 + e^-(b0 + b1X1 + b2X2 + ... + bnXn + εi))

Notice the linear regression equation inside the exponent.
e is the base of the natural logarithm (about 2.718).
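The two equations translate directly into code; a sketch in Python (the coefficient and predictor values in the usage lines are invented for illustration):

```python
import math

def p_single(b0, b1, x1):
    # P(Y) = 1 / (1 + e^-(b0 + b1*X1))
    return 1 / (1 + math.exp(-(b0 + b1 * x1)))

def p_multiple(b0, bs, xs):
    # P(Y) = 1 / (1 + e^-(b0 + b1*X1 + b2*X2 + ... + bn*Xn))
    z = b0 + sum(b * x for b, x in zip(bs, xs))
    return 1 / (1 + math.exp(-z))

# Usage with hypothetical coefficients: both return probabilities in (0, 1)
print(p_single(0, 1, 0))                     # exponent is 0, so P(Y) = .5
print(p_multiple(0.5, [1, -2], [0.3, 0.1]))  # ~0.65
```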
Comparison to Linear Regression:

Linear = measure of fit is the sum of squares

Summing the squared differences between the line and the actual
outcomes
Logistic = measure of fit is the log-likelihood

Summing the log of the probabilities the model assigns to the
actual outcomes
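The log-likelihood measure can be sketched as follows (Python; the outcomes and predicted probabilities are invented for illustration). Better-fitting probabilities give a log-likelihood closer to 0; worse fits give a more negative value.

```python
import math

def log_likelihood(y, p):
    # Sum the log of the probability the model assigns to each
    # actual outcome (yi is 0 or 1, pi is the predicted P(Y = 1))
    return sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
               for yi, pi in zip(y, p))

actual = [1, 0, 1]
good_preds = [0.9, 0.1, 0.8]  # confident and correct
poor_preds = [0.5, 0.5, 0.5]  # no better than guessing
print(log_likelihood(actual, good_preds))  # ~-0.43, closer to 0 = better fit
print(log_likelihood(actual, poor_preds))  # ~-2.08, worse fit
```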
Comparison to Linear Regression:

Linear = overall variance explained by R2
Logistic = overall “variance explained” by…



-2LL (the log-likelihood multiplied by -2; higher means worse fit)
R2cs (Cox and Snell’s statistic for comparison to baseline)
R2n (Nagelkerke’s statistic variation of R2cs)
NOTE:

There is no direct analog of R2 in logistic analysis.




This is because an R2 measure seeks to make a statement about the "percent of
variance explained," but the variance of a dichotomous or categorical
dependent variable depends on the frequency distribution of that variable.
For a dichotomous dependent variable, for instance, variance is at a maximum
for a 50-50 split, and the more lopsided the split, the lower the variance.
This means that R2 measures for logistic analysis with differing marginal
distributions of their respective dependent variables cannot be compared
directly, and comparison of logistic R2 measures with R2 from OLS regression
is also problematic.
Nonetheless, a number of logistic “pseudo” R2 measures
have been proposed, all of which should be reported as
approximations to OLS R2, BUT NOT as actual percent
of variance explained.
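Both pseudo R² statistics can be computed from the null-model and fitted-model log-likelihoods; a sketch in Python using the standard Cox and Snell and Nagelkerke formulas (the LL values and sample size are invented for illustration):

```python
import math

def cox_snell_r2(ll_null, ll_model, n):
    # R^2_CS = 1 - exp((2/n) * (LL_null - LL_model))
    # Compares the fitted model to the baseline (intercept-only) model
    return 1 - math.exp((2 / n) * (ll_null - ll_model))

def nagelkerke_r2(ll_null, ll_model, n):
    # Rescales Cox and Snell's statistic so its maximum possible value is 1
    max_r2 = 1 - math.exp((2 / n) * ll_null)
    return cox_snell_r2(ll_null, ll_model, n) / max_r2

# Hypothetical log-likelihoods for a sample of n = 200
print(cox_snell_r2(-100, -80, 200))   # ~0.18
print(nagelkerke_r2(-100, -80, 200))  # ~0.29, always >= Cox-Snell
```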
Comparison to Linear Regression:

Linear = unique contributions of variable by...



unstandardized b (for the regression equation)
standardized b (for interpretation, similar to r)
significance level (t-test)
Logistic = unique contributions of variable by...

unstandardized b (for the logistic equation)
exp(b) (for interpretation, as odds ratio)
significance level (Wald, using chi-square test)
Wald = b / SE(b)
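A sketch of the Wald test in Python (the b and SE values are invented for illustration). Note that SPSS-style output reports the squared ratio, (b/SE)², which is compared to a chi-square distribution with 1 df; the two-tailed p-value below is computed equivalently from the standard normal.

```python
import math

def wald_test(b, se):
    # z = b / SE(b); the reported Wald chi-square is z squared
    z = b / se
    wald = z ** 2
    # Two-tailed p-value from the standard normal, equivalent to
    # testing the Wald statistic against chi-square with 1 df
    p = math.erfc(abs(z) / math.sqrt(2))
    return wald, p

# Hypothetical coefficient and standard error
wald, p = wald_test(b=0.668, se=0.2)
print(wald)  # ~11.2, well above the 3.84 critical value (1 df, alpha = .05)
print(p)     # well below .05, so the predictor is significant
```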
Comparison to Linear Regression:
Example output (predicting graduate admission from gre, gpa, and topnotch):
(1) Both gre and gpa are significant predictors, while topnotch is not.
(2) For a one-unit increase in gpa, the log odds of being admitted to graduate
school (vs. not being admitted) increase by .668.
(3) For a one-unit increase in gpa, the odds of being admitted to graduate
school (vs. not being admitted) increase by a factor of 1.949.
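Point (3) follows from point (2) by exponentiating the coefficient; a quick check in Python:

```python
import math

# The example's gpa coefficient, b = .668, is on the log-odds scale
b_gpa = 0.668

# exp(b) converts the log-odds change into an odds ratio
odds_ratio = math.exp(b_gpa)
print(round(odds_ratio, 2))  # ~1.95, matching the reported factor of 1.949
```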
Comparison to Linear Regression:

Linear = each variable (without controlling)…

Bivariate correlation
Logistic = each variable (without controlling)…

Shown directly in the logistic output for each predictor.
Comparison to Linear Regression:

Linear = different methods…



Entry
Hierarchical
Stepwise
Logistic = different methods…



Entry (same as with linear regression)
Hierarchical (same as with linear regression)
Stepwise (see Field’s textbook page 226)