Odds Ratios

POL 682 Stata/lecture notes
Some Odds and Ends on Odds Ratios
Let’s estimate a garden-variety logit model where the dependent
variable is 1 if the incumbent loses the general or primary election
and 0 if he/she wins the election. The “event” is electoral defeat.
Here is the model:
. logit winlose leader priorm scandal
Iteration 0:   log likelihood = -1347.2829
Iteration 1:   log likelihood = -1262.5742
Iteration 2:   log likelihood = -1217.3586
Iteration 3:   log likelihood = -1214.3361
Iteration 4:   log likelihood = -1214.2961
Iteration 5:   log likelihood = -1214.2961

Logit estimates                                   Number of obs   =       5036
                                                  LR chi2(3)      =     265.97
                                                  Prob > chi2     =     0.0000
Log likelihood = -1214.2961                       Pseudo R2       =     0.0987

------------------------------------------------------------------------------
     winlose |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      leader |  -1.029759    .516245    -1.99   0.046     -2.04158   -.0179374
      priorm |  -.0420528   .0035401   -11.88   0.000    -.0489913   -.0351143
     scandal |   2.839299   .3194128     8.89   0.000     2.213261    3.465337
       _cons |  -1.487179   .0859136   -17.31   0.000    -1.655566   -1.318791
------------------------------------------------------------------------------
Interpretation is standard. The negative coefficients say the log-odds
of defeat are decreasing as a function of being in a leadership
position and prior electoral margin (the more votes you got last time,
the less likely you are to lose this time [why might this be the
case?]). The positive coefficient says they are increasing as a
function of scandal: if embroiled in a scandal, you’re more likely to
lose.
The log-odds interpretation is a function of the logistic distribution.
Note that we can motivate the logit model in terms of the odds of
success vs. failure, which are given by p_i/(1-p_i). The logistic
transformation (which gives rise to the logit model) is the log of
these odds, log[p_i/(1-p_i)]. Hence, model estimates from logit are
properly referred to as “log-odds” estimates.
But since no one thinks in terms of the log of the odds, it is more
natural to interpret the logit model differently. We have learned the
probability interpretation (refer to other handouts), but there are
other ways. One popular way is through the ODDS RATIO.
This quantity isn’t difficult to compute. HINT: if logit gives you
log-odds ratios, how do you convert a logged coefficient back into
natural units? You exponentiate it! Hence, if we exponentiate the
coefficients from above, then we obtain odds ratios.
Stata will do this automatically. If you type “logistic” instead of
logit, the model estimates are presented as odds ratios, not log-odds
ratios:
. logistic winlose leader priorm scandal
Logit estimates                                   Number of obs   =       5036
                                                  LR chi2(3)      =     265.97
                                                  Prob > chi2     =     0.0000
Log likelihood = -1214.2961                       Pseudo R2       =     0.0987

------------------------------------------------------------------------------
     winlose | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      leader |    .357093   .1843475    -1.99   0.046     .1298234    .9822225
      priorm |   .9588192   .0033943   -11.88   0.000     .9521895     .965495
     scandal |   17.10377   5.463164     8.89   0.000     9.145496    31.98723
------------------------------------------------------------------------------
Of course we could also use Stata as a handy dandy calculator. Let’s
first rerun the logit model:
. logit winlose leader priorm scandal
(We’ll suppress the output).
Now if we were to type
. display exp(_b[leader])
This command would return
.35709305
which is what the “logistic” procedure told us before. Thus, the
odds ratio (NOT log odds) of an incumbent who is in the House
leadership losing is .36 to 1. What does this odds ratio mean? It
says that the odds of an incumbent who is in the leadership losing
are only about 36 percent of the odds of a non-leadership incumbent
losing. The odds are lower.
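The rest of the leader row in the logistic table can be recovered in the same calculator style. The lines below are a minimal sketch: the estimates are typed in as literals copied from the logit table (the scalar names b_leader and se_leader are just labels chosen here), so the lines run without the data in memory. The sketch assumes the usual Stata conventions, namely that the odds-ratio standard error is the delta-method value exp(b)*se(b) and that the interval is the exponentiated log-odds interval.

```stata
* Leader estimates copied from the logit table (literals, not _b[] or _se[])
scalar b_leader  = -1.029759
scalar se_leader = .516245

* Odds ratio and its delta-method standard error, exp(b)*se(b)
display exp(b_leader)
display exp(b_leader)*se_leader

* 95% interval for the odds ratio: exponentiate the log-odds bounds
display exp(-2.04158)
display exp(-.0179374)
```

The four results should agree, up to rounding, with the leader row of the logistic table (.357093, .1843475, .1298234, .9822225). The z statistic and p-value are the same in both tables because the test is carried out on the log-odds scale. Adding the or option to logit (logit winlose leader priorm scandal, or) reports the same odds ratios.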
Let’s work our way to this result using these data as our guide. Let’s
say that the “probability” of a defeat solely as a function of the
leadership variable is exp(b1)/(1+exp(b1)), where b1 is the coefficient
for the leadership variable (i.e. –1.03). This probability would be
. display exp(_b[leader])/(1+exp(_b[leader]))
.26313085
Let’s call this number p. Now let’s call the probability of winning q.
The laws of probability say that p+q=1; thus p=1-q and q=1-p. Hence,
for these data, the “probability” of winning (i.e. q) would be what?
It would be 1-.263=.737. More accurately, it would be:
. display 1-exp(_b[leader])/(1+exp(_b[leader]))
.73686915
If, as we did previously, we define the odds of success vs. failure as
p_i/(1-p_i), then it is easy to see this ratio is p_i/q_i or simply p/q.
Thus, the odds ratio is the ratio Pr(Y=1|d=1)/Pr(Y=0|d=1),
which is
. display (exp(_b[leader])/(1+exp(_b[leader])))/(1-(exp(_b[leader])/(1+exp(_b[leader]))))
.35709305
More simply, p/q=.263/.737=.357. This .357 looks like what? It is
exactly equal to the odds ratio we computed above (or what Stata’s
logistic procedure gave us)!
Expressed in this way, it is a little easier to see what is going on
with the odds ratio. When the probability of a one (“success”) is less
than the probability of a 0 (“failure”), then the odds ratio will be
less than 1. When the probability of a one is greater than the
probability of a 0, the odds ratio will be greater than 1. When the
odds ratio is exactly 1, this says the odds of success and failure are
even---an “even money bet.”
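The three cases are easy to check with the calculator. The probabilities below (.25, .50, .75) are made-up values chosen only for illustration; they are not estimates from the model.

```stata
* Odds p/(1-p) when a "1" is less likely than, as likely as,
* and more likely than a "0"
display .25/(1-.25)
display .50/(1-.50)
display .75/(1-.75)
```

The results should come out to about .33, exactly 1, and 3: below 1 when a one is less likely than a zero, equal to 1 for the even money bet, and above 1 when a one is more likely.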
For incumbents in the leadership, the odds of defeat are
p/q=.263/.737=.357, or .357 to 1. This says that the odds of an
incumbent in the leadership losing are only .357 (or 36 percent) of
the odds of an incumbent not in the leadership losing. This may seem
awkward sounding (and it is), but there is another, equivalent way to
interpret the odds. If the odds of losing are .357, then what are the
odds of winning?
From the results above, it must be the case that q/p=.737/.263=2.802.
What does this number mean? It tells us that the odds of an incumbent
in the leadership winning election are 2.802 to 1. That is, the odds of
winning reelection are nearly 3 times higher for incumbents in the
leadership than for incumbents who are not in the leadership.
An important fact about odds ratios is seen by this result:
1/.357≈2.802 and 1/2.802≈.357.
That is, the odds of success and failure are reciprocals of one
another. This means that either odds conveys the same information; it
just depends on which feature of the data you want to describe
(successes [“1s”] or failures [“0s”]).
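The reciprocal relationship can be checked directly from the coefficient. This is a small sketch using the leader coefficient typed in as a literal; the scalar name or_lose is just a label chosen here.

```stata
* Odds ratio of losing for leaders, from the logit coefficient
scalar or_lose = exp(-1.029759)

* Its reciprocal is the odds ratio of winning ...
display 1/or_lose

* ... which is the same as exponentiating the coefficient with its sign flipped
display exp(1.029759)
```

Both lines should return about 2.80. The 2.802 in the text is slightly larger only because it was computed from p and q rounded to three digits. The sign flip is also what you would see in the logit coefficients if the dependent variable were recoded so that 1 meant winning.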
Note this result. Since the logistic transformation is given by the
log of the odds ratio, log(p/q), then if we take the log of the odds
ratio for the leadership variable we obtain
. display log((exp(_b[leader])/(1+exp(_b[leader])))/(1-(exp(_b[leader])/(1+exp(_b[leader])))))
-1.0297589
which in simpler notation is log(.263/.737)≈-1.03. This is the
log-odds ratio and it is the number produced by Stata when you estimate
a logit model. Now if we exponentiate this coefficient, we return to
the odds ratio. Hence, we see how we can easily jump between log odds
and odds ratios (as well as probabilities).
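Stata also has built-in functions that do this jumping with less typing: invlogit(x) returns exp(x)/(1+exp(x)), and logit(p) returns log(p/(1-p)). The sketch below uses the leader coefficient as a literal so it runs without the data; the scalar names are just labels chosen here.

```stata
* Leader coefficient copied from the logit table
scalar b_leader = -1.029759

* log-odds -> "probability"
scalar p_lose = invlogit(b_leader)
display p_lose

* "probability" -> odds
display p_lose/(1 - p_lose)

* "probability" -> log-odds (back where we started)
display logit(p_lose)
```

The three results should match the numbers worked out above: about .263, .357, and -1.03.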
Now consider the scandal covariate. We see the coefficient estimate is
2.839. The odds ratio is easily computed by taking advantage of the
results in the previous paragraph. Thus, exponentiating this
coefficient gives us
. display exp(2.839299)
17.103772
which says the odds of an incumbent losing who is embroiled in a scandal
are about 17 times the odds of an incumbent not embroiled in a scandal.
Repeat the exercises from above to verify this.
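One way to carry out that exercise is sketched below, with the scandal coefficient typed in as a literal; the scalar names are just labels chosen here. As before, p is a “probability” built from this one coefficient alone, not a predicted probability from the full model.

```stata
* Scandal coefficient copied from the logit table
scalar b_scandal = 2.839299

* "Probability" of defeat and of victory as a function of scandal alone
scalar p_sc = invlogit(b_scandal)
scalar q_sc = 1 - p_sc
display p_sc
display q_sc

* Odds ratio p/q and its log, which should return the coefficient
display p_sc/q_sc
display log(p_sc/q_sc)
```

Here p should be roughly .945 and q roughly .055, so p/q is about 17.1 and its log is about 2.839.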
Odds ratios for continuous variables do not have quite as nice an
interpretation, mainly because there is no natural baseline group
against which to compare the odds. Nevertheless, odds ratio
interpretations are still useful for these variables.
Consider the coefficient for the prior electoral margin variable. The
odds ratio is given by
. display exp(_b[priorm])
.95881916
which says that the odds of losing are decreasing as prior electoral
margin increases.
Huh?
This isn’t so hard to see. What is the odds ratio when prior electoral
margin is 100 percent?
. display exp(_b[priorm]*100)
.01491663
The odds are very low, as we might predict they would be: compared to
someone whose political party got 0 percent in the previous election,
the odds of losing for an incumbent whose party got 100 percent in the
previous election are only about 1.5 percent as large. But this doesn’t
help us see where the numbers come from.
What is the odds ratio when prior electoral margin is 99 percent (i.e.
1 unit lower than 100)?
. display exp(_b[priorm]*99)
.01555729
This too is a small number. Note, however, that the ratio of these two
odds is
. display exp(_b[priorm]*100)/exp(_b[priorm]*99)
.95881916
What is this number equal to? It is the odds ratio for the prior margin
coefficient that we saw above. Hence, for continuous variables,
the odds ratio tells us the factor by which the odds change for a unit
change in x. Any adjacent pair of values for prior margin will give us
this ratio.
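To check the adjacent-pair claim, and to see what happens for a change of more than one unit, here is a short sketch with the prior margin coefficient typed in as a literal. The pairs (50, 51) and (10, 11) and the 10-point change are arbitrary values picked for illustration.

```stata
* Prior margin coefficient copied from the logit table
scalar b_priorm = -.0420528

* Any adjacent pair of values gives back the same ratio, exp(b)
display exp(b_priorm*51)/exp(b_priorm*50)
display exp(b_priorm*11)/exp(b_priorm*10)
display exp(b_priorm)

* A 10-point increase multiplies the odds by exp(10*b), i.e. exp(b)^10
display exp(10*b_priorm)
display exp(b_priorm)^10
```

The first three lines should all return about .9588. The last two should agree with each other at roughly .657, so a 10-point increase in prior margin cuts the odds of losing by about a third.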
Frequently, the following formula is useful in order to compute the
percentage change in the odds:
%Δ(Odds)=[exp(b*x_i)-exp(b*x_j)]/exp(b*x_j)*100%
where x_i and x_j are the two values of x being compared.
For these data, we see that the percentage change in the odds in “going
from” 99 percent of the vote to 100 percent of the vote yields
. display (exp(_b[priorm]*100)-exp(_b[priorm]*99))/exp(_b[priorm]*99)*100
-4.1180836
about a 4 percent drop in the odds of losing. Verify that this change
is constant for all adjacent pairs of values on the prior margin
variable. Note the close connection between our –4.12 estimate and the
odds ratio of the prior margin variable (i.e. .9588). One minus this
ratio gives us
. display 1-exp(_b[priorm])
.04118084
which when multiplied by 100 gives us the percentage drop. Hence, the
.0412 gives us the proportional drop in the odds of defeat for a unit
increase in x.
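For a one-unit increase, the percentage-change formula collapses to 100*(exp(b)-1), because exp(b*(x+1))/exp(b*x) is just exp(b). The sketch below applies that shortcut to all three coefficients, typed in as literals from the logit table; the scalar names are just labels chosen here.

```stata
* Coefficients copied from the logit table
scalar b_priorm  = -.0420528
scalar b_leader  = -1.029759
scalar b_scandal = 2.839299

* Percentage change in the odds of defeat for a one-unit increase in x
display 100*(exp(b_priorm) - 1)
display 100*(exp(b_leader) - 1)
display 100*(exp(b_scandal) - 1)
```

The results should be about -4.1, -64.3, and 1610: a 4 percent drop per point of prior margin, a 64 percent drop for leaders, and a roughly 1,600 percent increase for scandal (the same thing as odds about 17 times as large).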
To conclude, odds ratios are very simple quantities to interpret.
Unfortunately, researchers do not often interpret their results in terms of
odds ratios. As we will see, this interpretation will come in handy
with other kinds of logit models.