Instructor: R. Makoto
UZ Econ313 Lecture notes

Lecture 7: Qualitative response regression models

In these models the dependent variable is qualitative rather than continuous. Qualitative response regression models can have dependent variables with either two categories or more than two categories. Those with two categories are known as binary dependent variable models. Those with more than two categories are referred to as polychotomous or multi-response dependent variable models, e.g. multinomial logit and probit models, ordered probit models, and Poisson models (used for count data), etc.

Lecture 7: Binary dependent variable models

Binary dependent variable models are also known as dichotomous dependent variable models. There are several types of such models, including the Linear Probability Model (LPM), the Probit model, the Logit model, latent regressions, random utility models, etc. At this level, however, we will concentrate on the first three, namely the LPM, the Logit and the Probit models.

Example: An application of the binary dependent variable model

Two people, identical but for their race, walk into a bank and each applies for a mortgage, a large loan, so that each can buy an identical house. Does the bank treat them the same way? Are they both equally likely to have their mortgage applications accepted? By law they must receive identical treatment, but whether they actually do is a matter of great concern among bank regulators. We can model the factors influencing the outcome of a loan application as follows.

The dependent variable takes two values: Y = 1 if the mortgage application is denied and Y = 0 otherwise. The model is therefore in the form of a probability model, i.e. a model of

$$\Pr(Y = 1 \mid X_1, X_2, \dots, X_k)$$
The Xs are the explanatory variables such as race, gender, wealth, previous loans, loan repayment record, age, and many other socio-economic factors. Several approaches can be used to estimate binary models.

Lecture 7: The Linear Probability Model

The LPM is a multiple regression model whose dependent variable is binary rather than continuous. Because the dependent variable Y is binary, the population regression function corresponds to the probability that the dependent variable equals 1 given the explanatory variables, i.e.

$$\Pr(Y = 1 \mid X_1, \dots, X_k) = \beta_0 + \beta_1 X_1 + \dots + \beta_k X_k$$

$\beta_j$ is the change in the probability that Y = 1 associated with a unit change in $X_j$, holding the other regressors constant, i.e.

$$\beta_j = \frac{\Delta \Pr(Y = 1 \mid X_1, \dots, X_k)}{\Delta X_j}$$

The regression coefficients in the LPM are estimated by OLS. The usual heteroscedasticity-robust OLS standard errors can be used to construct confidence intervals and hypothesis tests.

Let $P_i$ be the probability that $Y_i = 1$ (probability of success); then $1 - P_i$ is the probability that $Y_i = 0$ (probability of failure). Therefore:

Y_i    Probability
0      1 - P_i
1      P_i

Hence $E(Y_i) = 0(1 - P_i) + 1(P_i) = P_i$.

Lecture 7: Weaknesses of the LPM

The disturbances are not normally distributed. Because $Y_i$ takes the value 1 with probability $P_i$ and 0 with probability $1 - P_i$, it follows the Bernoulli distribution, a special case of the binomial distribution with only one draw. For given Xs the disturbance $u_i$ therefore takes only two values and cannot be normal, a violation of one of the assumptions of the CLRM.

The disturbances are heteroscedastic. The variance of the error, $\mathrm{Var}(u_i) = P_i(1 - P_i)$, depends on the values of the explanatory variables (Xs) through $P_i$, hence it is not homoscedastic. The problem of heteroscedastic variances can be corrected by applying the Weighted Least Squares (WLS) estimation technique.

R-squared, or the coefficient of determination, is of limited use. It cannot be relied on to measure the goodness of fit of the model.
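The LPM described above can be illustrated with a minimal sketch. The data, the variable names and the helper `ols_simple` are all hypothetical and chosen only for illustration; a single regressor is used so that the OLS formulas fit in a few lines of plain Python. The sketch fits the line by OLS, reads the fitted values as estimated probabilities, and computes the implied error variance P(1 - P) for each observation.

```python
# Linear probability model (LPM) fitted by OLS on invented data.
# Everything here is hypothetical: the numbers are not from any real bank.

def ols_simple(x, y):
    """Return (intercept, slope) of the OLS line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical regressor (e.g. a debt-to-income ratio) and binary outcome
# (1 = application denied, 0 = otherwise).
x = [0.10, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.60, 0.70]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

b0, b1 = ols_simple(x, y)

# In the LPM the fitted value is the estimated Pr(Y = 1 | X).
fitted = [b0 + b1 * xi for xi in x]

# Implied error variance P(1 - P); it changes with x, so it is not constant.
implied_var = [p * (1 - p) for p in fitted]

print("intercept:", round(b0, 4), "slope:", round(b1, 4))
for xi, p, v in zip(x, fitted, implied_var):
    flag = "" if 0.0 <= p <= 1.0 else "  <- outside [0, 1]"
    print(f"x={xi:.2f}  fitted P={p:.3f}  P(1-P)={v:.3f}{flag}")
```

The slope is read as the change in the estimated probability of denial per unit change in the regressor. Nothing in OLS restricts the fitted values to the unit interval, so the printout flags any that fall outside it.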
The major weakness of the LPM is its failure to guarantee that the estimated probabilities always lie between zero and one. Probabilities generated by the LPM sometimes exceed one or fall below zero, which is nonsensical. It is this weakness that motivates better methods of estimating binary dependent variable models.

Lecture 8: The Logit model

A logit model is a probability model derived from the logistic distribution function; it ensures that $P_i$ lies between 0 and 1 whatever the value of $Z_i$ is. As $Z_i$ approaches positive infinity, $P_i$ approaches 1, and as $Z_i$ approaches negative infinity, $P_i$ approaches 0.

Consider a binary dependent variable model, say

$$Y_i = \beta_0 + \beta_1 X_i + u_i$$

where $Y_i$ takes only two values, 1 or 0. We can use the logistic distribution function in dealing with such models. Such a function can be expressed as:

$$P_i = \Pr(Y_i = 1 \mid X_i) = \frac{1}{1 + e^{-Z_i}} = \frac{e^{Z_i}}{1 + e^{Z_i}} \tag{1}$$

where $Z_i = \beta_0 + \beta_1 X_i$, $P_i$ is the probability that $Y_i = 1$ (the probability of success) and $1 - P_i$ is the probability that $Y_i = 0$ (the probability of failure). It can be verified that as $Z_i$ ranges from $-\infty$ to $+\infty$, $P_i$ ranges between zero and one and is not linearly related to $Z_i$.

Equation (1) is non-linear in both $X_i$ and the $\beta$s, therefore we cannot apply OLS to estimate the parameters of this equation. To proceed in estimating equation (1), we can linearize it as follows. If $P_i$ is the probability of success, then the probability of failure is given by:

$$1 - P_i = \frac{1}{1 + e^{Z_i}} \tag{2}$$

The odds ratio in favour of success is given by:

$$\frac{P_i}{1 - P_i} = \frac{1 + e^{Z_i}}{1 + e^{-Z_i}} = e^{Z_i} \tag{3}$$

Expressing (3) in natural logarithms, we obtain:

$$L_i = \ln\left(\frac{P_i}{1 - P_i}\right) = Z_i = \beta_0 + \beta_1 X_i \tag{4}$$

$L_i$ is known as the logit, or the natural log of the odds ratio, hence the name "Logit model".

Lecture 8: Properties of the logit model

1. The logit is not bounded between 0 and 1, although the probabilities lie between 0 and 1.
2. Although the logit is linear in X, the probabilities are not linear in X.
3. Z can contain many regressors.
4. If the logit is positive, then as the value of the regressor(s) increases, the odds that the regressand equals 1 increase.
5. Estimating the probability can be done directly from equation (1) as long as Z is known.
6. Unlike the LPM, the logit model assumes that it is the log of the odds ratio, not the probability, that is linearly related to X.

Lecture 8: Interpretation of slope coefficients in the logit model

The slope coefficient, $\beta_1$, measures the change in the logit (log of the odds ratio) resulting from a marginal change in the regressor, $X_i$. Mathematically we have:

$$\frac{dL_i}{dX_i} = \beta_1$$

The Maximum Likelihood (ML) method is used to estimate the logit model in (4).

Lecture 9: The Probit model

Unlike the Logit model, which is derived from the cumulative logistic distribution function, the probit model uses the cumulative normal distribution function, hence it is sometimes referred to as the Normit model. The probit model is similar to the logit model except that the logistic function is replaced by the normal distribution function:

$$P_i = \Pr(Y_i = 1 \mid X_i) = \Phi(Z_i)$$

where $Z_i = \beta_0 + \beta_1 X_i$ and $\Phi$ is the cumulative standard normal distribution function.

Lecture 9: Interpretation of the probit model

The probit model coefficients are not easy to interpret. The rate of change of the probability with respect to the regressor is:

$$\frac{dP_i}{dX_i} = \phi(Z_i)\,\beta_1$$

where $\phi(Z_i)$ is the standard normal probability density function (PDF) evaluated at $Z_i = \beta_0 + \beta_1 X_i$. The evaluation will depend on the particular value of the X variables, so in the probit model the rate of change is not constant.

Lecture 9: Logit and Probit Models

Logit and Probit models give qualitatively similar results. In most applications the models are similar, the main difference being that the logistic distribution has slightly fatter tails than the normal distribution.
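The relationships above can be checked numerically with a short sketch. The coefficient values `b0` and `b1` and all function names are hypothetical, and the same index Z is fed to both functions purely to compare their shapes; in practice, estimated logit and probit coefficients differ in scale. The logit marginal effect used here, b1 * P * (1 - P), is the standard result for the logistic function and is added as an assumption, since the notes only state the effect on the log odds.

```python
import math

def logistic_cdf(z):
    """Logistic distribution function: P = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_pdf(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def logit(p):
    """Natural log of the odds ratio, ln(P / (1 - P))."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients of the index Z = b0 + b1 * X.
b0, b1 = -2.0, 0.8

for x in (0.0, 2.5, 5.0):
    z = b0 + b1 * x
    p_logit = logistic_cdf(z)
    p_probit = normal_cdf(z)
    odds = p_logit / (1.0 - p_logit)         # equals exp(z)
    me_logit = b1 * p_logit * (1 - p_logit)  # dP/dX under the logistic function
    me_probit = b1 * normal_pdf(z)           # dP/dX under the normal function
    print(f"x={x:.1f}  z={z:+.2f}  logit P={p_logit:.4f}  probit P={p_probit:.4f}  "
          f"odds={odds:.4f}  log-odds={logit(p_logit):+.4f}  "
          f"dP/dX logit={me_logit:.4f}  probit={me_probit:.4f}")

# Tail comparison: rescale the logistic to unit variance (its variance is
# pi^2 / 3) and compare the probability mass below z = -4.
scale = math.pi / math.sqrt(3.0)
tail_logistic = logistic_cdf(-4.0 * scale)
tail_normal = normal_cdf(-4.0)
print("P(Z < -4): logistic (unit variance) =", tail_logistic, " normal =", tail_normal)
```

In both models the marginal effect on the probability changes with X, while the effect on the log odds in the logit model stays equal to b1. The last line compares how much probability each distribution leaves in the far tail once both have unit variance.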
One can use either of the two models and obtain similar results.

Tutorial questions on binary dependent variable models

1. Explain the weaknesses of the LPM.
2. Why is the Logit or the Probit a better model than the LPM?
3. Derive the Logit model from the logistic function and explain the meaning of its slope coefficients.
4. Do all the questions on the tutorial sheet to be provided.

Lab session 3: Binary dependent variable models

Work in groups of 5 students. Identify a hypothetical economic problem that requires the use of qualitative dependent variable modelling (it may be micro or macro in nature). The dependent variable must be binary. Enter data for at least 50 individuals in Excel. The saved data must be ready for use in the first week of November.