Chapter 13: Dummy Dependent Variable Techniques
Copyright © 2011 Pearson Addison-Wesley. All rights reserved.
Slides by Niels-Hugo Blunch, Washington and Lee University

The Linear Probability Model

• The linear probability model is simply running OLS for a regression where the dependent variable is a dummy (i.e. binary) variable:

  Di = β0 + β1X1i + β2X2i + εi    (13.1)

  where Di is a dummy variable, and the Xs, βs, and ε are typical independent variables, regression coefficients, and an error term, respectively
• The term "linear probability model" comes from the fact that the right side of the equation is linear while the expected value of the left side measures the probability that Di = 1

Problems with the Linear Probability Model

1. R2 is not an accurate measure of overall fit:
   – Di can equal only 1 or 0, but the fitted value D̂i must move in a continuous fashion from one extreme to the other (as also illustrated in Figure 13.1)
   – Hence, D̂i is likely to be quite different from Di for some range of Xi
   – Thus, R2 is likely to be much lower than 1 even if the model actually does an exceptional job of explaining the choices involved
   – As an alternative, one can instead use R²p, a measure based on the percentage of the observations in the sample that a particular estimated equation explains correctly
   – To use this approach, take a D̂i > .5 to predict that Di = 1 and a D̂i < .5 to predict that Di = 0, and then simply compare these predictions with the actual Di
2. D̂i is not bounded by 0 and 1:
   – The alternative binomial logit model, presented in Section 13.2, will address this issue

Figure 13.1 A Linear Probability Model

The Binomial Logit Model

• The binomial logit is an estimation technique for equations with dummy dependent variables that avoids the unboundedness problem of the linear probability model
• It does so by using a variant of the cumulative logistic function:

  Di = 1 / (1 + e^–(β0 + β1X1i + β2X2i + εi))    (13.7)

• Logits cannot be estimated using OLS but are instead estimated by maximum likelihood (ML), an iterative estimation technique that is especially useful for equations that are nonlinear in the coefficients
• Again, for the logit model, D̂i is bounded by 0 and 1
• This is illustrated by Figure 13.2

Figure 13.2 D̂i Is Bounded by 0 and 1 in a Binomial Logit Model

Interpreting Estimated Logit Coefficients

• The signs of the coefficients in the logit model have the same meaning as in the linear probability (i.e. OLS) model
• The interpretation of the magnitude of the coefficients differs, though, because the dependent variable has changed dramatically
• That the "marginal effects" are not constant can be seen from Figure 13.2: the slope (i.e. the change in probability) of the graph of the logit changes as D̂i moves from 0 to 1
• We'll consider three ways of interpreting logit coefficients meaningfully (a worked sketch follows the list):

Interpreting Estimated Logit Coefficients (cont.)

1. Change an average observation:
   – Create an "average" observation by plugging the means of all the independent variables into the estimated logit equation and then calculating an "average" D̂i
   – Then increase the independent variable of interest by one unit and recalculate the D̂i
   – The difference between the two D̂is then gives the marginal effect
2. Use a partial derivative:
   – Taking the derivative of the logit yields the result that the change in the expected value of Di caused by a one-unit increase in Xki, holding constant the other independent variables in the equation, equals βk · D̂i · (1 − D̂i)
   – To use this formula, simply plug in your estimates of βk and D̂i
   – From this, again, the marginal impact of X does indeed depend on the value of D̂i
3. Use a rough estimate of 0.25:
   – Plugging D̂i = 0.5 into the previous formula, we get the (more handy!) result that multiplying a logit coefficient by 0.25 (or dividing by 4) yields an approximately equivalent linear probability model coefficient
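A Worked Sketch in Python (not from the text)

• The following is a minimal, illustrative sketch of the ideas above, assuming simulated data, hypothetical regressor names X1 and X2, and the statsmodels library; it fits a linear probability model by OLS and a binomial logit by maximum likelihood, computes R²p with a .5 cutoff, and applies the three interpretation approaches to the coefficient on X1

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    n = 500
    X1 = rng.normal(size=n)                        # hypothetical regressor 1
    X2 = rng.normal(size=n)                        # hypothetical regressor 2
    D = (0.5 + 1.0 * X1 - 0.8 * X2 + rng.logistic(size=n) > 0).astype(int)  # dummy Di
    X = sm.add_constant(np.column_stack([X1, X2]))

    # Linear probability model: plain OLS with the dummy on the left-hand side
    lpm = sm.OLS(D, X).fit()
    D_hat_lpm = lpm.predict(X)                     # fitted values can fall outside [0, 1]

    # Binomial logit: estimated by maximum likelihood; predictions stay between 0 and 1
    logit = sm.Logit(D, X).fit(disp=0)
    D_hat_logit = logit.predict(X)

    # R²p: share of observations predicted correctly using a .5 cutoff
    Rp2 = ((D_hat_logit > 0.5).astype(int) == D).mean()

    # Interpreting the logit coefficient on X1 (beta[1]):
    beta = logit.params
    x_bar = X.mean(axis=0).reshape(1, -1)          # 1) the "average" observation
    D_bar = logit.predict(x_bar)[0]
    x_plus = x_bar.copy()
    x_plus[0, 1] += 1.0                            #    raise X1 by one unit
    delta = logit.predict(x_plus)[0] - D_bar       #    change in D-hat
    deriv = beta[1] * D_bar * (1 - D_bar)          # 2) partial-derivative formula
    rough = beta[1] * 0.25                         # 3) quick 0.25 rule of thumb

    print(lpm.params, beta)
    print("LPM fitted values outside [0, 1]:", int(((D_hat_lpm < 0) | (D_hat_lpm > 1)).sum()))
    print("R²p =", Rp2, "| marginal effect of X1:", delta, deriv, rough)

• The three marginal-effect numbers printed last should be of broadly similar magnitude, and roughly comparable to the LPM coefficient on X1, which is the point of the interpretation rules above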
Other Dummy Dependent Variable Techniques

• The Binomial Probit Model:
  – Similar to the logit model, this is an estimation technique for equations with dummy dependent variables that avoids the unboundedness problem of the linear probability model
  – However, rather than the logistic function, this model uses a variant of the cumulative normal distribution
• The Multinomial Logit Model:
  – Sometimes there are more than two qualitative choices available
  – The sequential binary model estimates such choices as a series of binary decisions
  – If the choices are made simultaneously, however, this approach is not appropriate
  – The multinomial logit is developed specifically for the case in which there are more than two qualitative choices and the choice is made simultaneously (see the sketch after the key terms list)

Key Terms from Chapter 13

• Linear probability model
• R²p
• Binomial logit model
• The interpretation of an estimated logit coefficient
• Binomial probit model
• Sequential binary model
• Multinomial logit model
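A Companion Sketch: Probit and Multinomial Logit (not from the text)

• Continuing the simulated, purely illustrative example above (again assuming hypothetical data and the statsmodels library), the sketch below fits a binomial probit, which replaces the logistic function with the cumulative normal, and a multinomial logit for a simultaneous choice among three alternatives

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(1)
    n = 500
    X = sm.add_constant(rng.normal(size=(n, 2)))   # constant plus two hypothetical regressors

    # Binomial probit: same setup as the logit, but based on the cumulative normal
    D = (X @ np.array([0.5, 1.0, -0.8]) + rng.normal(size=n) > 0).astype(int)
    probit = sm.Probit(D, X).fit(disp=0)

    # Multinomial logit: three unordered alternatives (0, 1, 2) chosen simultaneously
    utilities = X @ np.array([[0.0, 0.3, -0.5],
                              [0.0, 1.0,  0.4],
                              [0.0, -0.6, 0.9]]) + rng.gumbel(size=(n, 3))
    choice = utilities.argmax(axis=1)              # each observation picks its best alternative
    mnl = sm.MNLogit(choice, X).fit(disp=0)

    print(probit.params)                           # signs read as in the logit
    print(mnl.params)                              # one coefficient vector per non-base alternative
    print(mnl.predict(X)[:3])                      # each row of predicted probabilities sums to one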