POL 682 Stata/lecture notes

Some Odds and Ends on Odds Ratios

Let’s estimate a garden-variety logit model where the dependent variable is 1 if the incumbent loses the general or primary election and 0 if he/she wins the election. The “event” is electoral defeat. Here is the model:

. logit winlose leader priorm scandal

Iteration 0:   log likelihood = -1347.2829
Iteration 1:   log likelihood = -1262.5742
Iteration 2:   log likelihood = -1217.3586
Iteration 3:   log likelihood = -1214.3361
Iteration 4:   log likelihood = -1214.2961
Iteration 5:   log likelihood = -1214.2961

Logit estimates                                   Number of obs   =       5036
                                                  LR chi2(3)      =     265.97
                                                  Prob > chi2     =     0.0000
Log likelihood = -1214.2961                       Pseudo R2       =     0.0987

------------------------------------------------------------------------------
     winlose |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      leader |  -1.029759    .516245    -1.99   0.046     -2.04158   -.0179374
      priorm |  -.0420528   .0035401   -11.88   0.000    -.0489913   -.0351143
     scandal |   2.839299   .3194128     8.89   0.000     2.213261    3.465337
       _cons |  -1.487179   .0859136   -17.31   0.000    -1.655566   -1.318791
------------------------------------------------------------------------------

Interpretation is standard. A negative coefficient says the log-odds of defeat are decreasing: here they decrease as a function of being in a leadership position and of prior electoral margin (the more votes you got last time, the less likely you are to lose this time [why might this be the case?]). A positive coefficient says the log-odds are increasing: here they increase as a function of scandal. If embroiled in a scandal, you’re more likely to lose.

The log-odds interpretation follows from the logistic distribution that underlies the logit model. Note that we can motivate the logit model in terms of the odds of success vs. failure, which are given by p_i/(1-p_i). The logistic transformation (which gives rise to the logit model) is the log of these odds. Hence, model estimates from logit are properly referred to as “log-odds” estimates.
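The link between a probability, its odds, and its log-odds can be checked directly with Stata’s calculator. The sketch below uses an arbitrary probability of .2 (an assumed value chosen for illustration, not a quantity from the model) together with Stata’s built-in logit() and invlogit() functions:

```stata
* Illustrative probability only (assumed value, not a model estimate)
scalar p = .2
* odds of the event: p/(1-p)
display scalar(p)/(1-scalar(p))
* log of the odds, computed by hand
display ln(scalar(p)/(1-scalar(p)))
* the same log-odds from Stata's built-in logit() function
display logit(scalar(p))
* invlogit() maps the log-odds back to the probability
display invlogit(logit(scalar(p)))
```

The first display should return .25 and the last should return the original .2; the two middle lines should agree with each other.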
But since no one thinks in terms of the log of the odds, it is more natural to interpret the logit model differently. We have learned the probability interpretation (refer to other handouts), but there are other ways. One popular way is through the ODDS RATIO. This quantity isn’t difficult to compute. HINT: if logit gives you log-odds ratios, how do you convert a logged coefficient back into natural units? You exponentiate it! Hence, if we exponentiate the coefficients from above, we obtain odds ratios. Stata will do this automatically. If you type “logistic” instead of “logit”, the model estimates are presented as odds ratios, not log-odds ratios:

. logistic winlose leader priorm scandal

Logit estimates                                   Number of obs   =       5036
                                                  LR chi2(3)      =     265.97
                                                  Prob > chi2     =     0.0000
Log likelihood = -1214.2961                       Pseudo R2       =     0.0987

------------------------------------------------------------------------------
     winlose | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      leader |    .357093   .1843475    -1.99   0.046     .1298234    .9822225
      priorm |   .9588192   .0033943   -11.88   0.000     .9521895     .965495
     scandal |   17.10377   5.463164     8.89   0.000     9.145496    31.98723
------------------------------------------------------------------------------

Of course we could also use Stata as a handy-dandy calculator. Let’s first rerun the logit model:

. logit winlose leader priorm scandal

(We’ll suppress the output.) Now if we were to type

. display exp(_b[leader])

this command would return

.35709305

which is what the “logistic” procedure told us before. Thus, the odds ratio (NOT log odds) for an incumbent in the House leadership losing is .36 to 1. What does this odds ratio mean? It says that the odds of an incumbent who is in the leadership losing are only about 36 percent of the odds of a non-leadership incumbent losing. The odds are lower. Let’s work our way to this result using these data as our guide.
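As a side check before doing so, the remaining columns of the “logistic” table can also be recovered by hand. The sketch below types in the leader row of the logit output as literal numbers, so it runs without the data in memory. It exponentiates the confidence bounds and multiplies the odds ratio by the coefficient’s standard error, which is the usual delta-method approximation:

```stata
* Numbers typed in from the leader row of the logit output above
* odds ratio for leader
display exp(-1.029759)
* 95% confidence bounds, moved to the odds-ratio scale
display exp(-2.04158)
display exp(-.0179374)
* standard error of the odds ratio: odds ratio times the coefficient's standard error
display exp(-1.029759)*.516245
```

These should reproduce the leader row of the “logistic” table (about .357, .130, .982, and .184) up to rounding. The z statistic and p-value carry over from the logit output unchanged. The same odds-ratio table can also be requested by adding the “or” option to the logit command.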
Let’s say that the “probability” of a defeat solely as a function of the leadership variable is exp(b1)/(1+exp(b1)), where b1 is the coefficient for the leadership variable (i.e. -1.03). (The quotation marks are deliberate: this quantity ignores the constant and the other covariates, so it is a device for seeing where the odds ratio comes from, not a predicted probability.) This “probability” would be

. display exp(_b[leader])/(1+exp(_b[leader]))
.26313085

Let’s call this number p. Now let’s call the probability of winning q. The laws of probability say that p+q=1; thus p=1-q and q=1-p. Hence, for these data, the “probability” of winning (i.e. q) would be what? It would be 1-.263=.737. More accurately, it would be:

. display 1-exp(_b[leader])/(1+exp(_b[leader]))
.73686915

If, as we previously did, we define the odds of success vs. failure as p_i/(1-p_i), then it is easy to see this ratio is p_i/q_i, or simply p/q. Thus, the odds ratio is the ratio Pr(Y=1|d=1)/Pr(Y=0|d=1), which is

. display (exp(_b[leader])/(1+exp(_b[leader])))/(1-exp(_b[leader])/(1+exp(_b[leader])))
.35709305

More simply, p/q=.263/.737=.357. This .357 looks like what? It is exactly equal to the odds ratio we computed above (or what Stata’s logistic procedure gave us)! Expressed in this way, it is a little easier to see what is going on with the odds ratio. When the probability of a one (“success”) is less than the probability of a 0 (“failure”), the odds ratio will be less than 1. When the probability of a one is greater than the probability of a 0, the odds ratio will be greater than 1. When the odds ratio is exactly 1, the odds of success and failure are even: an “even money bet.”

For incumbents in the leadership, the odds of defeat are p/q=.263/.737=.357, or .357 to 1. This says that the odds that an incumbent in the leadership will lose are only .357 (or 36 percent) of the odds that an incumbent not in the leadership will lose. This may sound awkward (and it is), but there is another, equivalent way to interpret the odds. If the odds of losing are .357, then what are the odds of winning? From the results above, it must be the case that q/p=.737/.263=2.802.
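The reason p/q reproduces the odds ratio exactly, and q/p its reciprocal, can be seen with a little algebra. A short derivation, writing b_1 for the leadership coefficient as above:

```latex
\begin{align*}
p &= \frac{e^{b_1}}{1+e^{b_1}}, \qquad
q = 1-p = \frac{1}{1+e^{b_1}}, \\
\frac{p}{q} &= \frac{e^{b_1}/(1+e^{b_1})}{1/(1+e^{b_1})} = e^{b_1} \approx e^{-1.03} \approx .357, \\
\frac{q}{p} &= e^{-b_1} \approx e^{1.03} \approx 2.80 .
\end{align*}
```

The denominators cancel, so p/q is simply the exponentiated coefficient and q/p is its reciprocal. The 2.802 in the text differs from 2.80 only because p and q were rounded to three digits before dividing.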
What does this number mean? It tells us that the odds of an incumbent in the leadership winning election are 2.802 to 1. That is, the odds of winning reelection are nearly 3 times higher for incumbents in the leadership than for incumbents who are not in the leadership. An important fact about odds ratios is seen by this result: 1/.357≈2.802 and 1/2.802≈.357. That is, the odds of success and failure are reciprocals of one another. This means that either odds conveys the same information; it just depends on which feature of the data you want to describe (successes [“1s”] or failures [“0s”]).

Note this result. Since the logistic transformation is given by the log of the odds, log(p/q), if we take the log of the odds ratio for the leadership variable we obtain

. display log((exp(_b[leader])/(1+exp(_b[leader])))/(1-(exp(_b[leader])/(1+exp(_b[leader])))))
-1.0297589

which in simpler notation is log(.263/.737)≈-1.03. This is the log-odds ratio, and it is the number produced by Stata when you estimate a logit model. Now if we exponentiate this coefficient, we return to the odds ratio. Hence, we see how we can easily jump between log odds and odds ratios (as well as probabilities).

Now consider the scandal covariate. We see the coefficient estimate is 2.839. The odds ratio is easily computed by taking advantage of the results in the previous paragraph. Thus, exponentiating this coefficient gives us

. display exp(2.839299)
17.103772

which says the odds of losing for an incumbent who is embroiled in a scandal are about 17 times higher than those for an incumbent not embroiled in a scandal. Repeat the exercises from above to verify this.

Odds ratios for continuous variables do not have quite as nice an interpretation, mainly because there is no natural baseline group against which to compare the odds. Nevertheless, odds ratio interpretations are still useful for these variables. Consider the coefficient for the prior electoral margin variable. The odds ratio is given by
. display exp(_b[priorm])
.95881916

which says that the odds of losing are decreasing as prior electoral margin increases. Huh? This isn’t so hard to see. What is the odds ratio when prior electoral margin is 100 percent?

. display exp(_b[priorm]*100)
.01491663

The odds are very low, as we might predict they would be: compared to someone whose political party got 0 percent in the previous election, the odds of losing for an incumbent whose party got 100 percent in the previous election are only about 1.5 percent as large. But this doesn’t help us see where the numbers come from. What is the odds ratio when prior electoral margin is 99 percent (i.e. 1 unit lower than 100)?

. display exp(_b[priorm]*99)
.01555729

This too is a small number. Note, however, that the ratio of these two odds is

. display exp(_b[priorm]*100)/exp(_b[priorm]*99)
.95881916

What is this number equal to? It is the odds ratio for the prior margin coefficient that we saw above. Hence, for continuous variables, the odds ratio tells us the factor by which the odds change for a unit change in x. Any adjacent pair of values for prior margin will give us this ratio.

Frequently, the following formula is useful in order to compute the percentage change in the odds:

%Δ(Odds) = {[exp(b*x_i) - exp(b*x_j)] / exp(b*x_j)} * 100%

where x_i and x_j are the two values of x being compared. For these data, we see that the percentage change in the odds in “going from” 99 percent of the vote to 100 percent of the vote is

. display (exp(_b[priorm]*100)-exp(_b[priorm]*99))/exp(_b[priorm]*99)*100
-4.1180836

about a 4 percent drop in the odds of losing. Verify that this change is constant for all adjacent pairs of values on the prior margin variable. Note the close connection between our -4.12 estimate and the odds ratio of the prior margin variable (i.e. .9588). One minus this ratio gives us

. display 1-exp(_b[priorm])
.04118084

which when multiplied by 100 gives us the percentage drop. Hence, the .0412 gives us the proportional drop in the odds of defeat for a unit increase in x.
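Because exp(b*x_i)/exp(b*x_j) = exp(b*(x_i - x_j)), the percentage change in the odds depends only on the distance between the two values of x, not on where they sit. The sketch below checks this with the priorm coefficient typed in as a literal number, so no data need be in memory. The scalar name bpriorm, the 49-to-50 pair, and the 10-unit change are arbitrary illustrative choices:

```stata
* Coefficient on priorm, typed in from the logit output above
scalar bpriorm = -.0420528
* percentage change in the odds for a one-unit increase in prior margin
display 100*(exp(scalar(bpriorm)) - 1)
* the same quantity computed from another adjacent pair (49 to 50)
display 100*(exp(scalar(bpriorm)*50) - exp(scalar(bpriorm)*49))/exp(scalar(bpriorm)*49)
* percentage change for a 10-unit increase (10 is an arbitrary illustrative choice)
display 100*(exp(scalar(bpriorm)*10) - 1)
```

The first two lines should both return roughly -4.12, matching the result in the text. The last should show a drop of roughly 34 percent, which is less than ten times the one-unit drop because the changes compound multiplicatively.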
To conclude, odds ratios are very simple quantities to interpret. Unfortunately, researchers do not often interpret their results in terms of odds ratios. As we will see, this interpretation will come in handy with other kinds of logit models.