Qualitative and Limited Dependent Variable Models

Prepared by Vera Tabakova, East Carolina University

16.1 Models with Binary Dependent Variables
16.2 The Logit Model for Binary Choice
16.3 Multinomial Logit
16.4 Conditional Logit
16.5 Ordered Choice Models
16.6 Models for Count Data
16.7 Limited Dependent Variables

Principles of Econometrics, 3rd Edition Slide 16-2

16.1 Models with Binary Dependent Variables

Examples:
- An economic model explaining why some states in the United States have ratified the Equal Rights Amendment, and others have not.
- An economic model explaining why some individuals take a second, or third, job and engage in "moonlighting."
- An economic model of why some legislators in the U.S. House of Representatives vote for a particular bill and others do not.
- An economic model of why the federal government awards development grants to some large cities and not others.
- An economic model explaining why some loan applications are accepted and others are not at a large metropolitan bank.
- An economic model explaining why some individuals vote "yes" for increased spending in a school board election and others vote "no."
- An economic model explaining why some female college students decide to study engineering and others do not.

$$y = \begin{cases} 1 & \text{individual drives to work} \\ 0 & \text{individual takes bus to work} \end{cases} \tag{16.1}$$

If the probability that an individual drives to work is $p$, then $P[y = 1] = p$. It follows that the probability that a person uses public transportation is $P[y = 0] = 1 - p$.

$$f(y) = p^{y}(1 - p)^{1 - y}, \quad y = 0, 1 \tag{16.2}$$

$$E(y) = p; \quad \operatorname{var}(y) = p(1 - p)$$

$$y = E(y) + e = p + e \tag{16.3}$$

$$E(y) = p = \beta_1 + \beta_2 x \tag{16.4}$$

$$y = E(y) + e = \beta_1 + \beta_2 x + e \tag{16.5}$$

Equation (16.5) is the linear probability model. One problem with the linear probability model is that the error term is heteroskedastic; the variance of the error term $e$ varies from one observation to another.
y value    e value                          Probability
1          $1 - (\beta_1 + \beta_2 x)$      $p = \beta_1 + \beta_2 x$
0          $-(\beta_1 + \beta_2 x)$         $1 - p = 1 - (\beta_1 + \beta_2 x)$

$$\operatorname{var}(e) = (\beta_1 + \beta_2 x)(1 - \beta_1 - \beta_2 x)$$

Using generalized least squares, the estimated variance is:

$$\hat\sigma_i^2 = \widehat{\operatorname{var}}(e_i) = (b_1 + b_2 x_i)(1 - b_1 - b_2 x_i) \tag{16.6}$$

The transformed variables and the transformed model are:

$$y_i^* = y_i / \hat\sigma_i, \qquad x_i^* = x_i / \hat\sigma_i$$

$$y_i^* = \beta_1 \hat\sigma_i^{-1} + \beta_2 x_i^* + e_i^*$$

$$\hat p = b_1 + b_2 x \tag{16.7}$$

$$\frac{dp}{dx} = \beta_2 \tag{16.8}$$

Problems:
- We can easily obtain values of $\hat p$ that are less than 0 or greater than 1.
- Some of the estimated variances in (16.6) may be negative.

Figure 16.1 (a) Standard normal cumulative distribution function (b) Standard normal probability density function

The standard normal probability density function is

$$\phi(z) = \frac{1}{\sqrt{2\pi}} e^{-.5 z^2}$$

and the cumulative distribution function is

$$\Phi(z) = P[Z \le z] = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-.5 u^2} \, du \tag{16.9}$$

The probit model is

$$p = P[Z \le \beta_1 + \beta_2 x] = \Phi(\beta_1 + \beta_2 x) \tag{16.10}$$

$$\frac{dp}{dx} = \frac{d\Phi(t)}{dt} \frac{dt}{dx} = \phi(\beta_1 + \beta_2 x)\beta_2 \tag{16.11}$$

where $t = \beta_1 + \beta_2 x$ and $\phi(\beta_1 + \beta_2 x)$ is the standard normal probability density function evaluated at $\beta_1 + \beta_2 x$.

Equation (16.11) has the following implications:

1. Since $\phi(\beta_1 + \beta_2 x)$ is a probability density function, its value is always positive. Consequently the sign of $dp/dx$ is determined by the sign of $\beta_2$. In the transportation problem we expect $\beta_2$ to be positive so that $dp/dx > 0$; as $x$ increases we expect $p$ to increase.

2. As $x$ changes, the value of the function $\phi(\beta_1 + \beta_2 x)$ changes. The standard normal probability density function reaches its maximum when $z = 0$, or when $\beta_1 + \beta_2 x = 0$. In this case $p = \Phi(0) = .5$ and an individual is equally likely to choose car or bus transportation. The slope of the probit function $p = \Phi(z)$ is at its maximum when $z = 0$, the borderline case.

3. On the other hand, if $\beta_1 + \beta_2 x$ is large, say near 3, then the probability that the individual chooses to drive is very large and close to 1.
In this case a change in $x$ will have relatively little effect, since $\phi(\beta_1 + \beta_2 x)$ will be nearly 0. The same is true if $\beta_1 + \beta_2 x$ is a large negative value, say near $-3$. These results are consistent with the notion that if an individual is "set" in their ways, with $p$ near 0 or 1, the effect of a small change in commuting time will be negligible.

Predicting the probability that an individual chooses the alternative $y = 1$:

$$\hat p = \Phi(\tilde\beta_1 + \tilde\beta_2 x) \tag{16.12}$$

$$\hat y = \begin{cases} 1 & \hat p > 0.5 \\ 0 & \hat p \le 0.5 \end{cases}$$

Maximum likelihood estimation of the probit model starts from the probability function of each observation:

$$f(y_i) = [\Phi(\beta_1 + \beta_2 x_i)]^{y_i} [1 - \Phi(\beta_1 + \beta_2 x_i)]^{1 - y_i}, \quad y_i = 0, 1 \tag{16.13}$$

$$f(y_1, y_2, y_3) = f(y_1) f(y_2) f(y_3)$$

Suppose that $y_1 = 1$, $y_2 = 1$ and $y_3 = 0$. Suppose that the values of $x$, in minutes, are $x_1 = 15$, $x_2 = 20$ and $x_3 = 5$.

$$P[y_1 = 1, y_2 = 1, y_3 = 0] = f(1, 1, 0) = f(1) f(1) f(0)$$

$$P[y_1 = 1, y_2 = 1, y_3 = 0] = \Phi[\beta_1 + \beta_2(15)] \times \Phi[\beta_1 + \beta_2(20)] \times \{1 - \Phi[\beta_1 + \beta_2(5)]\} \tag{16.14}$$

In large samples the maximum likelihood estimator is normally distributed, consistent and best, in the sense that no competing estimator has smaller variance.

The estimated probit model for the transportation data is

$$\tilde\beta_1 + \tilde\beta_2 DTIME_i = -.0644 + .0299\, DTIME_i \tag{16.15}$$
$$\text{(se)} \qquad (.3992) \quad (.0103)$$

The marginal effect of increasing $DTIME$, evaluated at $DTIME = 20$, is

$$\frac{dp}{dDTIME} = \phi(\tilde\beta_1 + \tilde\beta_2 DTIME)\tilde\beta_2 = \phi(-0.0644 + 0.0299 \times 20)(0.0299) = \phi(.5355)(0.0299) = 0.3456 \times 0.0299 = 0.0104$$

If an individual is faced with the situation that it takes 30 minutes longer to take public transportation than to drive to work, then the estimated probability that auto transportation will be selected is

$$\hat p = \Phi(\tilde\beta_1 + \tilde\beta_2 DTIME) = \Phi(-0.0644 + 0.0299 \times 30) = .798$$

Since the estimated probability that the individual will choose to drive to work is 0.798, which is greater than 0.5, we "predict" that when public transportation takes 30 minutes longer than driving to work, the individual will choose to drive.
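The probit calculations above can be checked numerically. The following is a minimal sketch, assuming the rounded estimates reported in (16.15); the function names are illustrative and not part of the source, and because the coefficients are rounded the results agree with the slide values only to about three decimals.

```python
import math

def norm_pdf(z):
    """Standard normal probability density function, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cumulative distribution function, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Rounded probit estimates from (16.15); DTIME is bus time minus auto time in minutes.
B1, B2 = -0.0644, 0.0299

def probit_prob(dtime):
    """Estimated probability of driving, as in (16.12)."""
    return norm_cdf(B1 + B2 * dtime)

def probit_marginal_effect(dtime):
    """Estimated dp/dDTIME = phi(b1 + b2*DTIME) * b2, as in (16.11)."""
    return norm_pdf(B1 + B2 * dtime) * B2

def predict_choice(dtime):
    """Predict y = 1 (drive) when the estimated probability exceeds 0.5."""
    return 1 if probit_prob(dtime) > 0.5 else 0

p30 = probit_prob(30.0)              # close to the reported .798
me20 = probit_marginal_effect(20.0)  # close to the reported 0.0104
print(round(p30, 3), round(me20, 4), predict_choice(30.0))
```

The marginal effect is evaluated at a single value of DTIME; at other values the density term changes, so the effect is not constant as it is in the linear probability model.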
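16.2 The Logit Model for Binary Choice

The probability density function of a logistic random variable $L$ is

$$\lambda(l) = \frac{e^{-l}}{(1 + e^{-l})^2}, \quad -\infty < l < \infty \tag{16.16}$$

and its cumulative distribution function is

$$\Lambda(l) = P[L \le l] = \frac{1}{1 + e^{-l}} \tag{16.17}$$

$$p = P[L \le \beta_1 + \beta_2 x] = \Lambda(\beta_1 + \beta_2 x) = \frac{1}{1 + e^{-(\beta_1 + \beta_2 x)}} \tag{16.18}$$

$$p = \frac{1}{1 + e^{-(\beta_1 + \beta_2 x)}} = \frac{\exp(\beta_1 + \beta_2 x)}{1 + \exp(\beta_1 + \beta_2 x)}$$

$$1 - p = \frac{1}{1 + \exp(\beta_1 + \beta_2 x)}$$

16.3 Multinomial Logit

Examples of multinomial choice situations:

1. Choice of a laundry detergent: Tide, Cheer, Arm & Hammer, Wisk, etc.
2. Choice of a major: economics, marketing, management, finance or accounting.
3. Choices after graduating from high school: not going to college, going to a private 4-year college, a public 4-year college, or a 2-year college.

The explanatory variable $x_i$ is individual specific, but does not change across alternatives.

$$p_{ij} = P[\text{individual } i \text{ chooses alternative } j]$$

$$p_{i1} = \frac{1}{1 + \exp(\beta_{12} + \beta_{22} x_i) + \exp(\beta_{13} + \beta_{23} x_i)}, \quad j = 1 \tag{16.19a}$$

$$p_{i2} = \frac{\exp(\beta_{12} + \beta_{22} x_i)}{1 + \exp(\beta_{12} + \beta_{22} x_i) + \exp(\beta_{13} + \beta_{23} x_i)}, \quad j = 2 \tag{16.19b}$$

$$p_{i3} = \frac{\exp(\beta_{13} + \beta_{23} x_i)}{1 + \exp(\beta_{12} + \beta_{22} x_i) + \exp(\beta_{13} + \beta_{23} x_i)}, \quad j = 3 \tag{16.19c}$$

Suppose individual 1 chooses alternative 1, individual 2 chooses alternative 2 and individual 3 chooses alternative 3. The likelihood function is

$$P(y_{11} = 1, y_{22} = 1, y_{33} = 1) = p_{11} \times p_{22} \times p_{33}$$

$$= \frac{1}{1 + \exp(\beta_{12} + \beta_{22} x_1) + \exp(\beta_{13} + \beta_{23} x_1)} \times \frac{\exp(\beta_{12} + \beta_{22} x_2)}{1 + \exp(\beta_{12} + \beta_{22} x_2) + \exp(\beta_{13} + \beta_{23} x_2)} \times \frac{\exp(\beta_{13} + \beta_{23} x_3)}{1 + \exp(\beta_{12} + \beta_{22} x_3) + \exp(\beta_{13} + \beta_{23} x_3)}$$

$$= L(\beta_{12}, \beta_{22}, \beta_{13}, \beta_{23})$$

The predicted probability of alternative 1 for an individual with $x = x_0$ is

$$\tilde p_{01} = \frac{1}{1 + \exp(\tilde\beta_{12} + \tilde\beta_{22} x_0) + \exp(\tilde\beta_{13} + \tilde\beta_{23} x_0)}$$

The marginal effect on the probability of alternative $m$ is

$$\left. \frac{\Delta p_{im}}{\Delta x_i} \right|_{\text{all else constant}} = \frac{\partial p_{im}}{\partial x_i} = p_{im} \left[ \beta_{2m} - \sum_{j=1}^{3} \beta_{2j} p_{ij} \right] \tag{16.20}$$

The change in the probability of alternative 1 when $x$ changes from $x_a$ to $x_b$ is

$$\Delta \tilde p_1 = \tilde p_{b1} - \tilde p_{a1} = \frac{1}{1 + \exp(\tilde\beta_{12} + \tilde\beta_{22} x_b) + \exp(\tilde\beta_{13} + \tilde\beta_{23} x_b)} - \frac{1}{1 + \exp(\tilde\beta_{12} + \tilde\beta_{22} x_a) + \exp(\tilde\beta_{13} + \tilde\beta_{23} x_a)}$$

$$\frac{P(y_i = j)}{P(y_i = 1)} = \frac{p_{ij}}{p_{i1}} = \exp(\beta_{1j} + \beta_{2j} x_i), \quad j = 2, 3 \tag{16.21}$$

$$\frac{\partial (p_{ij} / p_{i1})}{\partial x_i} = \beta_{2j} \exp(\beta_{1j} + \beta_{2j} x_i), \quad j = 2, 3 \tag{16.22}$$

An interesting feature of the odds ratio (16.21) is that the odds of choosing alternative $j$ rather than alternative 1 do not depend on how many alternatives there are in total. There is the implicit assumption in logit models that the odds between any pair of alternatives are independent of irrelevant alternatives (IIA).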
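The multinomial logit formulas (16.19) to (16.21) can be illustrated with a short sketch. The coefficient values and the value of x below are hypothetical, chosen only for illustration, and the function names are not from the source; the coefficients of alternative 1 are fixed at zero, which is the normalization implied by (16.19a).

```python
import math

def mnl_probs(x, b12, b22, b13, b23):
    """Choice probabilities (p_i1, p_i2, p_i3) from (16.19a)-(16.19c)."""
    e2 = math.exp(b12 + b22 * x)
    e3 = math.exp(b13 + b23 * x)
    denom = 1.0 + e2 + e3
    return (1.0 / denom, e2 / denom, e3 / denom)

def mnl_marginal_effects(x, b12, b22, b13, b23):
    """Marginal effects p_im * (beta_2m - sum_j beta_2j * p_ij) from (16.20)."""
    p = mnl_probs(x, b12, b22, b13, b23)
    slopes = (0.0, b22, b23)  # beta_21 = 0 for the base alternative
    weighted = sum(b * pj for b, pj in zip(slopes, p))
    return tuple(pm * (bm - weighted) for pm, bm in zip(p, slopes))

# Hypothetical parameter values and regressor value (assumptions, not estimates).
params = dict(b12=0.5, b22=-0.2, b13=1.0, b23=-0.4)
x0 = 3.0

probs = mnl_probs(x0, **params)
effects = mnl_marginal_effects(x0, **params)
odds_2_vs_1 = probs[1] / probs[0]  # equals exp(b12 + b22 * x0), as in (16.21)
print(probs, effects, odds_2_vs_1)
```

Because the probabilities sum to one, the marginal effects across the three alternatives sum to zero: a gain in one probability must be offset by losses in the others.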
16.4 Conditional Logit

Example: choice between three types ($J = 3$) of soft drinks, say Pepsi, 7-Up and Coke Classic.

Let $y_{i1}$, $y_{i2}$ and $y_{i3}$ be dummy variables that indicate the choice made by individual $i$. The price facing individual $i$ for brand $j$ is $PRICE_{ij}$.

Variables like price are individual and alternative specific, because they vary from individual to individual and are different for each choice the consumer might make.

$$p_{ij} = P[\text{individual } i \text{ chooses alternative } j]$$

$$p_{ij} = \frac{\exp(\beta_{1j} + \beta_2 PRICE_{ij})}{\exp(\beta_{11} + \beta_2 PRICE_{i1}) + \exp(\beta_{12} + \beta_2 PRICE_{i2}) + \exp(\beta_{13} + \beta_2 PRICE_{i3})} \tag{16.23}$$

With the normalization $\beta_{13} = 0$, the likelihood function for three individuals choosing alternatives 1, 2 and 3 respectively is

$$P(y_{11} = 1, y_{22} = 1, y_{33} = 1) = p_{11} \times p_{22} \times p_{33}$$

$$= \frac{\exp(\beta_{11} + \beta_2 PRICE_{11})}{\exp(\beta_{11} + \beta_2 PRICE_{11}) + \exp(\beta_{12} + \beta_2 PRICE_{12}) + \exp(\beta_2 PRICE_{13})} \times \frac{\exp(\beta_{12} + \beta_2 PRICE_{22})}{\exp(\beta_{11} + \beta_2 PRICE_{21}) + \exp(\beta_{12} + \beta_2 PRICE_{22}) + \exp(\beta_2 PRICE_{23})} \times \frac{\exp(\beta_2 PRICE_{33})}{\exp(\beta_{11} + \beta_2 PRICE_{31}) + \exp(\beta_{12} + \beta_2 PRICE_{32}) + \exp(\beta_2 PRICE_{33})}$$

$$= L(\beta_{11}, \beta_{12}, \beta_2)$$

The own price effect is:

$$\frac{\partial p_{ij}}{\partial PRICE_{ij}} = p_{ij}(1 - p_{ij})\beta_2 \tag{16.24}$$

The cross price effect is:

$$\frac{\partial p_{ij}}{\partial PRICE_{ik}} = -p_{ij}\, p_{ik}\, \beta_2 \tag{16.25}$$

The odds ratio is

$$\frac{p_{ij}}{p_{ik}} = \frac{\exp(\beta_{1j} + \beta_2 PRICE_{ij})}{\exp(\beta_{1k} + \beta_2 PRICE_{ik})} = \exp[(\beta_{1j} - \beta_{1k}) + \beta_2 (PRICE_{ij} - PRICE_{ik})]$$

The odds ratio depends on the difference in prices, but not on the prices themselves. As in the multinomial logit model, this ratio does not depend on the total number of alternatives, and there is the implicit assumption of the independence of irrelevant alternatives (IIA).
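A small sketch can make the conditional logit probability (16.23) and the price effects (16.24) and (16.25) concrete. The intercepts, the price coefficient and the prices used here are hypothetical values picked for illustration, not estimates from the source, and the function names are likewise assumptions.

```python
import math

def clogit_probs(prices, intercepts, beta2):
    """Choice probabilities from (16.23); the last intercept is normalized to 0."""
    utils = [a + beta2 * p for a, p in zip(intercepts, prices)]
    top = max(utils)  # subtract the maximum for numerical stability
    expu = [math.exp(u - top) for u in utils]
    total = sum(expu)
    return [e / total for e in expu]

def price_effect(prices, intercepts, beta2, j, k):
    """Effect on p_j of a change in the price of brand k, (16.24) and (16.25)."""
    p = clogit_probs(prices, intercepts, beta2)
    if j == k:
        return p[j] * (1.0 - p[j]) * beta2  # own price effect
    return -p[j] * p[k] * beta2             # cross price effect

# Hypothetical values: brands ordered Pepsi, 7-Up, Coke, with Coke as the base.
INTERCEPTS = [0.3, 0.1, 0.0]
BETA2 = -2.0
PRICES = [1.00, 1.25, 1.10]

probs = clogit_probs(PRICES, INTERCEPTS, BETA2)
own_pepsi = price_effect(PRICES, INTERCEPTS, BETA2, 0, 0)
cross_7up = price_effect(PRICES, INTERCEPTS, BETA2, 1, 0)
cross_coke = price_effect(PRICES, INTERCEPTS, BETA2, 2, 0)
print(probs, own_pepsi, cross_7up, cross_coke)
```

Raising every price by the same amount leaves the probabilities unchanged in this sketch, which reflects the statement that only price differences matter for the odds.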
The predicted probability of a Pepsi purchase, given that the price of Pepsi is $1, the price of 7-Up is $1.25 and the price of Coke is $1.10, is:

$$\hat p_{i1} = \frac{\exp(\tilde\beta_{11} + \tilde\beta_2 \times 1.00)}{\exp(\tilde\beta_{11} + \tilde\beta_2 \times 1.00) + \exp(\tilde\beta_{12} + \tilde\beta_2 \times 1.25) + \exp(\tilde\beta_2 \times 1.10)} = .4832$$

16.5 Ordered Choice Models

The choice options in multinomial and conditional logit models have no natural ordering or arrangement. However, in some cases choices are ordered in a specific way. Examples include:

1. Results of opinion surveys in which responses can be strongly disagree, disagree, neutral, agree or strongly agree.
2. Assignment of grades or work performance ratings. Students receive grades A, B, C, D, F which are ordered on the basis of a teacher's evaluation of their performance. Employees are often given evaluations on scales such as Outstanding, Very Good, Good, Fair and Poor which are similar in spirit.
3. Standard and Poor's rates bonds as AAA, AA, A, BBB and so on, as a judgment about the creditworthiness of the company or country issuing a bond, and how risky the investment might be.
4. Levels of employment are unemployed, part-time, or full-time.

When modeling these types of outcomes, numerical values are assigned to the outcomes, but the numerical values are ordinal, and reflect only the ranking of the outcomes.

Example:

$$y = \begin{cases} 1 & \text{strongly disagree} \\ 2 & \text{disagree} \\ 3 & \text{neutral} \\ 4 & \text{agree} \\ 5 & \text{strongly agree} \end{cases}$$

$$y = \begin{cases} 3 & \text{4-year college (the full college experience)} \\ 2 & \text{2-year college (a partial college experience)} \\ 1 & \text{no college} \end{cases} \tag{16.26}$$

The usual linear regression model is not appropriate for such data, because in regression we would treat the $y$ values as having some numerical meaning when they do not.
The latent variable model is

$$y_i^* = \beta\, GRADES_i + e_i$$

$$y = \begin{cases} 3 \text{ (4-year college)} & \text{if } y_i^* > \mu_2 \\ 2 \text{ (2-year college)} & \text{if } \mu_1 < y_i^* \le \mu_2 \\ 1 \text{ (no college)} & \text{if } y_i^* \le \mu_1 \end{cases}$$

Figure 16.2 Ordinal Choices Relation to Thresholds

$$P[y_i = 1] = P[y_i^* \le \mu_1] = P[\beta\, GRADES_i + e_i \le \mu_1] = P[e_i \le \mu_1 - \beta\, GRADES_i] = \Phi(\mu_1 - \beta\, GRADES_i)$$

$$P[y_i = 2] = P[\mu_1 < y_i^* \le \mu_2] = P[\mu_1 < \beta\, GRADES_i + e_i \le \mu_2] = P[\mu_1 - \beta\, GRADES_i < e_i \le \mu_2 - \beta\, GRADES_i] = \Phi(\mu_2 - \beta\, GRADES_i) - \Phi(\mu_1 - \beta\, GRADES_i)$$

$$P[y_i = 3] = P[y_i^* > \mu_2] = P[\beta\, GRADES_i + e_i > \mu_2] = P[e_i > \mu_2 - \beta\, GRADES_i] = 1 - \Phi(\mu_2 - \beta\, GRADES_i)$$

For a sample of three individuals with $y_1 = 1$, $y_2 = 2$ and $y_3 = 3$, the likelihood function is

$$L(\beta, \mu_1, \mu_2) = P[y_1 = 1] \times P[y_2 = 2] \times P[y_3 = 3]$$

The parameters are obtained by maximizing the log-likelihood function using numerical methods. Most software includes options for both ordered probit, which depends on the errors being standard normal, and ordered logit, which depends on the assumption that the random errors follow a logistic distribution.

The types of questions we can answer with this model are:

1. What is the probability that a high-school graduate with GRADES = 2.5 (on a 13-point scale, with 1 being the highest) will attend a 2-year college? The answer is obtained by plugging the specific value of GRADES into the predicted probability based on the maximum likelihood estimates of the parameters,

$$\hat P[y = 2 \mid GRADES = 2.5] = \Phi(\tilde\mu_2 - \tilde\beta \times 2.5) - \Phi(\tilde\mu_1 - \tilde\beta \times 2.5)$$

2. What is the difference in probability of attending a 4-year college for two students, one with GRADES = 2.5 and another with GRADES = 4.5? Since a 4-year college is the outcome $y = 3$ in (16.26), the difference in the probabilities is calculated directly as

$$\hat P[y = 3 \mid GRADES = 4.5] - \hat P[y = 3 \mid GRADES = 2.5]$$

3. If we treat GRADES as a continuous variable, what is the marginal effect on the probability of each outcome, given a 1-unit change in GRADES?
These derivatives are:

$$\frac{\partial P[y = 1]}{\partial GRADES} = -\phi(\mu_1 - \beta\, GRADES) \times \beta$$

$$\frac{\partial P[y = 2]}{\partial GRADES} = [\phi(\mu_1 - \beta\, GRADES) - \phi(\mu_2 - \beta\, GRADES)] \times \beta$$

$$\frac{\partial P[y = 3]}{\partial GRADES} = \phi(\mu_2 - \beta\, GRADES) \times \beta$$

16.6 Models for Count Data

When the dependent variable in a regression model is a count of the number of occurrences of an event, the outcome variable is $y = 0, 1, 2, 3, \ldots$ These numbers are actual counts, and thus different from the ordinal numbers of the previous section. Examples include:

- The number of trips to a physician a person makes during a year.
- The number of fishing trips taken by a person during the previous year.
- The number of children in a household.
- The number of automobile accidents at a particular intersection during a month.
- The number of televisions in a household.
- The number of alcoholic drinks a college student takes in a week.

If $Y$ is a Poisson random variable, then its probability function is

$$f(y) = P(Y = y) = \frac{e^{-\lambda} \lambda^{y}}{y!}, \quad y = 0, 1, 2, \ldots \tag{16.27}$$

$$y! = y \times (y - 1) \times (y - 2) \times \cdots \times 1$$

$$E(Y) = \lambda = \exp(\beta_1 + \beta_2 x) \tag{16.28}$$

This choice defines the Poisson regression model for count data.

For a sample of three observations with $y_1 = 0$, $y_2 = 2$ and $y_3 = 2$, the likelihood and log-likelihood functions are

$$L(\beta_1, \beta_2) = P(Y = 0) \times P(Y = 2) \times P(Y = 2)$$

$$\ln L(\beta_1, \beta_2) = \ln P(Y = 0) + \ln P(Y = 2) + \ln P(Y = 2)$$

$$\ln P(Y = y) = \ln\left[\frac{e^{-\lambda} \lambda^{y}}{y!}\right] = -\lambda + y \ln(\lambda) - \ln(y!) = -\exp(\beta_1 + \beta_2 x) + y(\beta_1 + \beta_2 x) - \ln(y!)$$

$$\ln L(\beta_1, \beta_2) = \sum_{i=1}^{N} \{ -\exp(\beta_1 + \beta_2 x_i) + y_i(\beta_1 + \beta_2 x_i) - \ln(y_i!) \}$$

$$\widehat{E(y_0)} = \tilde\lambda_0 = \exp(\tilde\beta_1 + \tilde\beta_2 x_0)$$

$$\widehat{\Pr}(Y = y) = \frac{\exp(-\tilde\lambda_0)\, \tilde\lambda_0^{\,y}}{y!}$$
for $y = 0, 1, 2, \ldots$

The marginal effect in the Poisson regression model is

$$\frac{\partial E(y_i)}{\partial x_i} = \lambda_i \beta_2 \tag{16.29}$$

$$\% \Delta E(y) = 100\, \frac{\partial E(y_i) / E(y_i)}{\partial x_i} = 100 \beta_2 \%$$

With a dummy variable $D_i$ in the model:

$$E(y_i) = \lambda_i = \exp(\beta_1 + \beta_2 x_i + \delta D_i)$$

$$E(y_i \mid D_i = 0) = \exp(\beta_1 + \beta_2 x_i)$$

$$E(y_i \mid D_i = 1) = \exp(\beta_1 + \beta_2 x_i + \delta)$$

$$100 \left[ \frac{\exp(\beta_1 + \beta_2 x_i + \delta) - \exp(\beta_1 + \beta_2 x_i)}{\exp(\beta_1 + \beta_2 x_i)} \right] \% = 100 \left[ e^{\delta} - 1 \right] \%$$

16.7 Limited Dependent Variables

16.7.1 Censored Data

Figure 16.3 Histogram of Wife's Hours of Work in 1975

Having censored data means that a substantial fraction of the observations on the dependent variable take a limit value. The regression function is no longer given by (16.30).

$$E(y \mid x) = \beta_1 + \beta_2 x \tag{16.30}$$

The least squares estimators of the regression parameters obtained by running a regression of $y$ on $x$ are biased and inconsistent; least squares estimation fails.

We give the parameters the specific values $\beta_1 = -9$ and $\beta_2 = 1$.

$$y_i^* = \beta_1 + \beta_2 x_i + e_i = -9 + x_i + e_i \tag{16.31}$$

Assume $e_i \sim N(0, \sigma^2 = 16)$.

$$y_i = 0 \text{ if } y_i^* \le 0; \qquad y_i = y_i^* \text{ if } y_i^* > 0.$$

The steps of the Monte Carlo simulation are:

- Create $N = 200$ random values of $x_i$ that are spread evenly (or uniformly) over the interval [0, 20]. These we will keep fixed in further simulations.
- Obtain $N = 200$ random values $e_i$ from a normal distribution with mean 0 and variance 16.
- Create $N = 200$ values of the latent variable.
- Obtain $N = 200$ values of the observed $y_i$ using

$$y_i = \begin{cases} 0 & \text{if } y_i^* \le 0 \\ y_i^* & \text{if } y_i^* > 0 \end{cases}$$

Figure 16.4 Uncensored Sample Data and Regression Function

Figure 16.5 Censored Sample Data, and Latent Regression Function and Least Squares Fitted Line

Least squares using all the observations gives

$$\hat y_i = -2.1477 + .5161 x_i \tag{16.32a}$$
$$\text{(se)} \qquad (.3706) \quad (.0326)$$

Least squares using only the observations with positive $y$ gives

$$\hat y_i = -3.1399 + .6388 x_i \tag{16.32b}$$
$$\text{(se)} \qquad (1.2055) \quad (.0827)$$

The Monte Carlo average of the estimates over $NSAM$ samples is

$$E_{MC}(b_k) = \frac{1}{NSAM} \sum_{m=1}^{NSAM} b_{k(m)} \tag{16.33}$$

The maximum likelihood procedure is called Tobit in honor of James Tobin, winner of the 1981 Nobel Prize in Economics, who first studied this model.

The probit probability that $y_i = 0$ is:

$$P(y_i = 0) = P[y_i^* \le 0] = 1 - \Phi[(\beta_1 + \beta_2 x_i) / \sigma]$$

$$L(\beta_1, \beta_2, \sigma) = \prod_{y_i = 0} \left\{ 1 - \Phi\left( \frac{\beta_1 + \beta_2 x_i}{\sigma} \right) \right\} \times \prod_{y_i > 0} \left\{ (2\pi\sigma^2)^{-1/2} \exp\left( -\frac{1}{2\sigma^2} (y_i - \beta_1 - \beta_2 x_i)^2 \right) \right\}$$

The maximum likelihood estimator is consistent and asymptotically normal, with a known covariance matrix.

Using the artificial data the fitted values are:

$$\tilde y_i = -10.2773 + 1.0487 x_i \tag{16.34}$$
$$\text{(se)} \qquad (1.0970) \quad (.0790)$$

The marginal effect in the Tobit model is

$$\frac{\partial E(y \mid x)}{\partial x} = \beta_2\, \Phi\left( \frac{\beta_1 + \beta_2 x}{\sigma} \right) \tag{16.35}$$

Because the cdf values are positive, the sign of the coefficient does tell the direction of the marginal effect, just not its magnitude. If $\beta_2 > 0$, as $x$ increases the cdf function approaches 1, and the slope of the regression function approaches that of the latent variable model.

Figure 16.6 Censored Sample Data, and Regression Functions for Observed and Positive y values

$$HOURS = \beta_1 + \beta_2 EDUC + \beta_3 EXPER + \beta_4 AGE + \beta_5 KIDSL6 + e \tag{16.36}$$

$$\frac{\partial E(HOURS)}{\partial EDUC} = \tilde\beta_2 \tilde\Phi = 73.29 \times .3638 = 26.34$$

Problem: our sample is not a random sample.
The data we observe are "selected" by a systematic process for which we do not account.

Solution: a technique called Heckit, named after its developer, Nobel Prize winning econometrician James Heckman.

The econometric model describing the situation is composed of two equations. The first is the selection equation that determines whether the variable of interest is observed.

$$z_i^* = \gamma_1 + \gamma_2 w_i + u_i, \quad i = 1, \ldots, N \tag{16.37}$$

$$z_i = \begin{cases} 1 & z_i^* > 0 \\ 0 & \text{otherwise} \end{cases} \tag{16.38}$$

The second equation is the linear model of interest. It is

$$y_i = \beta_1 + \beta_2 x_i + e_i, \quad i = 1, \ldots, n, \quad N > n \tag{16.39}$$

$$E[y_i \mid z_i^* > 0] = \beta_1 + \beta_2 x_i + \beta_\lambda \lambda_i, \quad i = 1, \ldots, n \tag{16.40}$$

$$\lambda_i = \frac{\phi(\gamma_1 + \gamma_2 w_i)}{\Phi(\gamma_1 + \gamma_2 w_i)} \tag{16.41}$$

The estimated "Inverse Mills Ratio" is

$$\tilde\lambda_i = \frac{\phi(\tilde\gamma_1 + \tilde\gamma_2 w_i)}{\Phi(\tilde\gamma_1 + \tilde\gamma_2 w_i)}$$

The estimating equation is

$$y_i = \beta_1 + \beta_2 x_i + \beta_\lambda \tilde\lambda_i + v_i, \quad i = 1, \ldots, n \tag{16.42}$$

Least squares applied to the wage equation gives

$$\ln(WAGE) = -.4002 + .1095\, EDUC + .0157\, EXPER, \qquad R^2 = .1484 \tag{16.43}$$
$$\text{(t-stat)} \qquad (-2.10) \quad (7.73) \quad (3.90)$$

The estimated probit model of labor force participation is

$$\hat P(LFP = 1) = \Phi(1.1923 - .0206\, AGE + .0838\, EDUC - .3139\, KIDS - 1.3939\, MTR)$$
$$\text{(t-stat)} \qquad (-2.93) \quad (3.61) \quad (-2.54) \quad (-2.26)$$

$$IMR = \tilde\lambda = \frac{\phi(1.1923 - .0206\, AGE + .0838\, EDUC - .3139\, KIDS - 1.3939\, MTR)}{\Phi(1.1923 - .0206\, AGE + .0838\, EDUC - .3139\, KIDS - 1.3939\, MTR)}$$

The two-step (Heckit) estimates of the wage equation are

$$\ln(WAGE) = .8105 + .0585\, EDUC + .0163\, EXPER - .8664\, IMR \tag{16.44}$$
$$\text{(t-stat)} \qquad (1.64) \quad (2.45) \quad (4.08) \quad (-2.65)$$
$$\text{(t-stat-adj)} \qquad (1.33) \quad (1.97) \quad (3.88) \quad (-2.17)$$

The maximum likelihood estimated wage equation is

$$\ln(WAGE) = .6686 + .0658\, EDUC + .0118\, EXPER$$
$$\text{(t-stat)} \qquad (2.84) \quad (3.96) \quad (2.87)$$

The standard errors based on the full information maximum likelihood procedure are smaller than those yielded by the two-step estimation method.
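The first step of the Heckit procedure, computing the inverse Mills ratio from the probit index, can be sketched as follows. This is an illustration under assumptions: the coefficient signs are those implied by the reported t-statistics, the characteristics of the example individual are hypothetical, and the function names are not from the source.

```python
import math

def norm_pdf(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def inverse_mills_ratio(index):
    """lambda = phi(index) / Phi(index), as in (16.41).

    Adequate for moderate index values; for very negative values the
    denominator underflows and a more careful formula would be needed.
    """
    return norm_pdf(index) / norm_cdf(index)

# Probit coefficients of the selection equation (signs are an assumption
# inferred from the reported t-statistics).
GAMMA = {"const": 1.1923, "AGE": -0.0206, "EDUC": 0.0838,
         "KIDS": -0.3139, "MTR": -1.3939}

def selection_index(age, educ, kids, mtr):
    """Linear index inside Phi(.) of the participation probit."""
    return (GAMMA["const"] + GAMMA["AGE"] * age + GAMMA["EDUC"] * educ
            + GAMMA["KIDS"] * kids + GAMMA["MTR"] * mtr)

# Hypothetical individual: 40 years old, 12 years of education,
# children present (KIDS = 1), marginal tax rate 0.70.
index = selection_index(age=40, educ=12, kids=1, mtr=0.70)
prob_lfp = norm_cdf(index)
imr = inverse_mills_ratio(index)
print(round(index, 4), round(prob_lfp, 4), round(imr, 4))
```

In the second step the computed ratio enters the wage equation as an additional regressor, as in (16.42); the ratio falls as the participation index rises, so individuals who are very likely to participate receive a small correction.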
Keywords

binary choice models
censored data
conditional logit
count data models
feasible generalized least squares
Heckit
identification problem
independence of irrelevant alternatives (IIA)
index models
individual and alternative specific variables
individual specific variables
latent variables
likelihood function
limited dependent variables
linear probability model
logistic random variable
logit
log-likelihood function
marginal effect
maximum likelihood estimation
multinomial choice models
multinomial logit
odds ratio
ordered choice models
ordered probit
ordinal variables
Poisson random variable
Poisson regression model
probit
selection bias
tobit model
truncated data