Discrete choice models: models for qualitative dependent variables
Ragnar Nymoen, Department of Economics, UiO
26 March 2009
ECON 4610: Lecture 11

Overview
Large cross-section data sets (thousands of observations) with information about individuals' choice behaviour are now common. Often, the variable to be explained in models that utilize these data is of a discrete or qualitative nature:
- Whether to work 50% of the normal working week or zero.
- Whether to buy a car or not.
The linear regression model is not relevant when the dependent variable is qualitative.
Syllabus: G Ch 23.1-23.3, B Note DC, K Ch 15.1.

Regression approach to binary dependent variables
Assume that we have a sample of n observations of individuals' choice between two alternatives, A and B ("not A"). We can represent this as n variables:

    y_i = 1 if individual i chooses A
    y_i = 0 if individual i chooses B,    i = 1, 2, ..., n

Clearly, this is a qualitative variable: a binary variable, or a dummy variable. In earlier models we have used dummy variables as part of the set of explanatory variables in a linear regression. Now the situation is that the y_i are dependent variables, which can no longer be treated as deterministic.

The regression approach would be

    y_i = \beta x_i + \varepsilon_i,    i = 1, 2, ..., n    (1)

where x_i is a single explanatory variable, since the multivariate setting does not raise any interesting new issues.

The distribution function of y
We need to take seriously that y_i is a random variable. Its distribution function is

    P(y_i = 1) = P_i
    P(y_i = 0) = 1 - P_i

where 0 < P_i < 1 is the probability of event A. In light of the linear model, we have

    \varepsilon_i = 1 - \beta x_i,    with probability P_i
    \varepsilon_i = -\beta x_i,       with probability 1 - P_i

For E(\varepsilon_i) = 0 we need (the conditioning on x_i is understood):

    E(\varepsilon_i) = P_i (1 - \beta x_i) + (1 - P_i)(-\beta x_i) = P_i - \beta x_i = 0,

giving P_i = \beta x_i.

Why is the linear probability model inappropriate?
Since P_i = \beta x_i, equation (1) can be written as

    y_i = P_i + \varepsilon_i    (2)

which explains why the linear regression model is referred to as the linear probability model in this context. The linear probability model has two main drawbacks:
- The disturbance \varepsilon_i is heteroskedastic (see Biørn's note DC).
- Since P_i = \beta x_i, the model does not guarantee that P_i logically behaves as a probability, i.e. that 0 < P_i < 1.

Two solutions: Logit and Probit
P_i is called the response probability. As a model of P_i, the linear regression model is not relevant. The solution is to model P_i as a probability from the outset. If we let F(x_i, \beta) denote a cumulative distribution function with \beta as a parameter, we can set

    P_i = F(x_i, \beta)

and choose a specific distribution, with \beta \in R:

    P_i = e^{\beta x_i} / (1 + e^{\beta x_i}) = 1 / (1 + e^{-\beta x_i}),            logistic
    P_i = \int_{-\infty}^{\beta x_i} (1/\sqrt{2\pi}) e^{-u^2/2} du,                  standard normal N(0,1)

Choosing the logistic distribution gives the Logit model, while choosing the standard normal distribution gives the Probit model.

Likelihood function of the Logit model
The likelihood function L is the joint probability of the endogenous variables conditional on the exogenous variables. For one of our binary endogenous variables we have

    L_i = P_i^{y_i} (1 - P_i)^{1 - y_i} = P_i for y_i = 1, and 1 - P_i for y_i = 0

and therefore the joint likelihood for n independent observations is

    L = \prod_{i=1}^{n} L_i = \prod_{i=1}^{n} P_i^{y_i} (1 - P_i)^{1 - y_i}    (3)

    ln L = \sum_{i=1}^{n} { y_i ln P_i + (1 - y_i) ln(1 - P_i) }    (4)

The derivative of the likelihood function

    ln P_i = \beta x_i - ln(1 + e^{\beta x_i})
    ln(1 - P_i) = -ln(1 + e^{\beta x_i})

    \partial ln P_i / \partial \beta = x_i - [e^{\beta x_i} / (1 + e^{\beta x_i})] x_i = x_i (1 - P_i)
    \partial ln(1 - P_i) / \partial \beta = -[e^{\beta x_i} / (1 + e^{\beta x_i})] x_i = -P_i x_i

    \partial ln L / \partial \beta = \sum_{i=1}^{n} { y_i x_i (1 - P_i) + (1 - y_i)(-P_i x_i) }

The maximum likelihood estimator in the Logit model
We choose the parameter value that maximizes the likelihood of the sample, i.e., the \beta under which the model would have been most likely to generate the observed sample. Let \hat{\beta} denote the ML estimator and \hat{P}_i the associated response probability.
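As a small sketch (not part of the lecture notes), the two response probabilities and the Logit log-likelihood in (4) can be computed with only the Python standard library; the function names and toy values below are my own, the Probit CDF is written via the error function:

```python
import math

def logistic_prob(x, beta):
    """Logit response probability: P = e^{beta x} / (1 + e^{beta x})."""
    return 1.0 / (1.0 + math.exp(-beta * x))

def probit_prob(x, beta):
    """Probit response probability: Phi(beta x), the standard normal CDF,
    expressed via erf so no external library is needed."""
    return 0.5 * (1.0 + math.erf(beta * x / math.sqrt(2.0)))

def logit_loglik(beta, y, x):
    """Equation (4): ln L = sum_i [ y_i ln P_i + (1 - y_i) ln(1 - P_i) ]."""
    ll = 0.0
    for yi, xi in zip(y, x):
        p = logistic_prob(xi, beta)
        ll += yi * math.log(p) + (1.0 - yi) * math.log(1.0 - p)
    return ll
```

Both CDFs are strictly increasing and map \beta x_i into the interval (0, 1), which is exactly the property the linear probability model lacks.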
From the previous slide, if we set \partial ln L / \partial \beta = 0 at \hat{\beta}:

    \sum_{i=1}^{n} { y_i x_i (1 - \hat{P}_i) + (1 - y_i)(-\hat{P}_i x_i) } = 0

which simplifies to

    \sum_{i=1}^{n} y_i x_i = \sum_{i=1}^{n} \hat{P}_i x_i,    or

    \sum_{i=1}^{n} y_i x_i = \sum_{i=1}^{n} [e^{x_i \hat{\beta}} / (1 + e^{x_i \hat{\beta}})] x_i    (5)

This non-linear equation defines the ML estimator. With k explanatory variables, we obtain k similar equations.
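Equation (5) has no closed-form solution, but since the score reduces to \sum_i (y_i - P_i) x_i and the Logit log-likelihood is globally concave in \beta, it can be solved reliably by Newton's method. The following is a minimal sketch for the single-regressor case; the function name fit_logit and the toy data in the usage note are illustrative assumptions, not from the lecture:

```python
import math

def logistic(z):
    """Logistic CDF, so that P_i = logistic(beta * x_i)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(y, x, beta=0.0, tol=1e-10, max_iter=50):
    """Solve the first-order condition sum_i (y_i - P_i) x_i = 0,
    equivalent to equation (5), for a single beta by Newton's method."""
    for _ in range(max_iter):
        p = [logistic(beta * xi) for xi in x]
        # Score: derivative of ln L with respect to beta.
        score = sum((yi - pi) * xi for yi, pi, xi in zip(y, p, x))
        # Minus the score's derivative: sum_i P_i (1 - P_i) x_i^2 > 0,
        # which makes ln L concave, so Newton steps are well behaved.
        info = sum(pi * (1.0 - pi) * xi * xi for pi, xi in zip(p, x))
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta
```

At the solution, the fitted probabilities reproduce the data in the sense of (5): the sum of y_i x_i equals the sum of \hat{P}_i x_i. With k explanatory variables, the same iteration is carried out on the k-dimensional score vector and the k-by-k information matrix.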