lecture notes Limited Dependent Econometrics II

Instructor: R. Makoto
richard makoto UZ Econ313 Lecture notes
Lecture 7: Qualitative response
regression models
 The dependent variable is qualitative rather than quantitative.
 Qualitative response regression models can have
dependent variables with either two categories or more
than two categories.
 Those with two categories are known as binary
dependent variable models
 Those with more than two categories are referred to as
polychotomous or multi-response dependent variable
models, e.g. Poisson models, multinomial logit and probit
models, ordered probit models, etc.
Lecture 7: Binary dependent variable
 There are several types of such models.
 Some of them include the Linear Probability Model
(LPM), the Probit model, the Logit model, latent
regressions, random utility models, etc.
 Binary dependent variable models are also known as
dichotomous dependent variable models.
 At this level, however, we will concentrate on the first
three models, namely the LPM, the Logit and the Probit.
Example: An application of the binary
dependent variable model
 Two people, identical but for their race, walk into a bank
and apply for a mortgage, a large loan, so that each can
buy an identical house.
 Does the bank treat them the same way? Are they both
equally likely to have their mortgage applications
accepted? By law they must receive identical treatment.
 But whether they actually do is a matter of great concern
among bank regulators.
 We can model the factors influencing loan application
acceptance as follows:
Example: An application of the binary
dependent variable model continued ….
 The dependent variable takes two values: Y=1 if the
mortgage application is denied and Y=0 otherwise.
 The model is therefore in the form of a probability model: it
describes $\Pr(Y = 1 \mid X_1, \dots, X_k)$.
 The Xs are the explanatory variables such as race,
gender, wealth, previous loans, loan payment record,
age, and many other socio-economic factors.
 Several approaches can be used to estimate binary
dependent variable models.
Lecture 7: The Linear Probability Model
 It is a multiple regression model whose dependent
variable is binary rather than continuous.
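 Because the dependent variable Y is binary, the
population regression function corresponds to the
probability that the dependent variable equals 1 given
the explanatory variables, Xs, i.e.
$$E(Y \mid X_1, \dots, X_k) = \Pr(Y = 1 \mid X_1, \dots, X_k) = \beta_0 + \beta_1 X_1 + \dots + \beta_k X_k$$
 $\beta_j$ is the change in the probability that Y=1 associated
with a unit change in $X_j$, holding the other regressors constant, i.e.
$$\beta_j = \frac{\partial \Pr(Y = 1 \mid X_1, \dots, X_k)}{\partial X_j}$$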
Lecture 7: The LPM continued…………….
 The regression coefficients in the LPM are estimated by
OLS.
 The usual (heteroscedasticity-robust) OLS standard errors
can be used to construct confidence intervals and
hypothesis tests.
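As a rough illustration of the two points above, the sketch below fits an LPM with a single regressor by OLS and computes a White (HC0) heteroscedasticity-robust standard error for the slope. The data are invented for illustration (a hypothetical payment-to-income ratio and a loan-denial indicator), the function name `lpm_ols` is not from the notes, and the 1.96 critical value is a large-sample approximation that is only indicative with ten observations.

```python
import math

def lpm_ols(x, y):
    """OLS fit of Y = b0 + b1*X with a White (HC0) robust SE for b1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                 # slope: change in Pr(Y=1) per unit of X
    b0 = y_bar - b1 * x_bar        # intercept
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # HC0 variance of the slope: sum((x - x_bar)^2 * u^2) / sxx^2
    var_b1 = sum((xi - x_bar) ** 2 * ui ** 2 for xi, ui in zip(x, resid)) / sxx ** 2
    return b0, b1, math.sqrt(var_b1)

# Hypothetical data: x = payment-to-income ratio, y = 1 if the loan is denied
x = [0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.55]
y = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

b0, b1, se_b1 = lpm_ols(x, y)
ci_low, ci_high = b1 - 1.96 * se_b1, b1 + 1.96 * se_b1
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, robust SE(b1) = {se_b1:.4f}")
print(f"approximate 95% CI for b1: [{ci_low:.4f}, {ci_high:.4f}]")
```

With more than one regressor the same idea applies, but the algebra is usually left to a regression package.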
 Let $P_i$ be the probability that Y=1 (probability of success) and
$1 - P_i$ the probability that Y=0 (probability of failure).
 Therefore;
$$E(Y_i \mid X_i) = 1 \cdot P_i + 0 \cdot (1 - P_i) = P_i$$
Lecture 7: Weaknesses of the LPM
 The disturbances are not normally distributed.
Like $Y_i$, the disturbance $u_i$ takes only two values for given Xs, so it
follows the Bernoulli distribution, a special type of the
binomial distribution with only one draw; a violation of
one of the assumptions of the CLRM.
Lecture 7: Weaknesses of the LPM
 The disturbances are heteroscedastic.
 The variance of the error, $\operatorname{var}(u_i) = P_i(1 - P_i)$, depends on the
values of the explanatory variables (Xs) through $P_i$, hence it is not
constant.
 The problem of heteroscedastic variances can be
corrected by applying the Weighted Least Squares
(WLS) estimation technique.
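A minimal two-step sketch of the WLS correction, assuming a single regressor and the usual weights $1 / [\hat{P}_i (1 - \hat{P}_i)]$. The data and the helper name `wls` are hypothetical, and clipping fitted probabilities to [0.01, 0.99] is one common workaround for fitted values outside the unit interval, not something prescribed in these notes.

```python
def wls(x, y, w):
    """Weighted least squares fit of Y = b0 + b1*X with weights w."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical data: x = payment-to-income ratio, y = 1 if the loan is denied
x = [0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.55]
y = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

# Step 1: OLS (WLS with unit weights) to obtain fitted probabilities
b0_ols, b1_ols = wls(x, y, [1.0] * len(x))
p_hat = [b0_ols + b1_ols * xi for xi in x]

# Step 2: clip fitted values (assumption) and weight by 1 / var(u_i)
p_clip = [min(max(p, 0.01), 0.99) for p in p_hat]
weights = [1.0 / (p * (1.0 - p)) for p in p_clip]

# Step 3: re-estimate by WLS
b0_wls, b1_wls = wls(x, y, weights)
print(f"OLS: b0 = {b0_ols:.4f}, b1 = {b1_ols:.4f}")
print(f"WLS: b0 = {b0_wls:.4f}, b1 = {b1_wls:.4f}")
```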
Lecture 7: Weaknesses of the LPM
 R-squared or the coefficient of determination is of limited
use. It cannot be used to measure the goodness of fit of
the model.
 The major weakness of the LPM is its failure to
guarantee that the estimated probabilities always lie
between zero and one.
 Probabilities generated in the LPM sometimes exceed
one or fall short of zero which is nonsensical.
 It is this weakness that gives rise to better methods of
estimating binary dependent variable models.
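The out-of-range problem is easy to reproduce. The sketch below fits an LPM by OLS to a small invented data set (the variable names and values are hypothetical) and lists the fitted "probabilities" that fall below zero or above one.

```python
# Hypothetical data: x = payment-to-income ratio, y = 1 if the loan is denied
x = [0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.55]
y = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

# Closed-form OLS for a single regressor
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar

# Fitted values are meant to be probabilities, but nothing bounds them
fitted = [b0 + b1 * xi for xi in x]
out_of_range = [(xi, p) for xi, p in zip(x, fitted) if p < 0.0 or p > 1.0]

for xi, p in out_of_range:
    print(f"x = {xi:.2f}: fitted probability = {p:.4f}")
```

In this example the smallest and largest values of x produce fitted values outside the unit interval.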
Lecture 8: The Logit model
 A logit model is a probability econometric model derived
from the logistic distribution function;
$$F(Z) = \frac{1}{1 + e^{-Z}} = \frac{e^{Z}}{1 + e^{Z}}$$
 It ensures that $0 < F(Z) < 1$ whatever the value of $Z$ is. As $Z$
approaches positive infinity, $F(Z)$ approaches 1 and as $Z$
approaches negative infinity, $F(Z)$ approaches 0.
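The limiting behaviour can be checked numerically. This is a small sketch in which the function name `logistic_cdf` is an assumption; the two branches are only there to avoid overflow in `math.exp` for large |Z|.

```python
import math

def logistic_cdf(z):
    """Logistic distribution function F(z) = 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)          # safe for very negative z
    return ez / (1.0 + ez)

for z in [-10.0, -2.0, 0.0, 2.0, 10.0]:
    print(f"Z = {z:6.1f}  F(Z) = {logistic_cdf(z):.6f}")
```

F(0) is exactly one half, and the values move towards 0 and 1 as Z moves towards negative and positive infinity.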
 Consider a binary dependent variable model, say;
$$Y_i = \beta_0 + \beta_1 X_i + u_i$$
 where $Y_i$ takes only two values, 1 or 0,
Lecture 8: The Logit model continued…….
 we can use the logistic distribution function in dealing
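with such models. Such a function can be expressed as:
$$P_i = \frac{1}{1 + e^{-Z_i}} = \frac{e^{Z_i}}{1 + e^{Z_i}} \qquad (1)$$
 Where $Z_i = \beta_0 + \beta_1 X_i$, $P_i = \Pr(Y_i = 1 \mid X_i)$ is the
probability that $Y_i = 1$ (the probability of success) and
$1 - P_i$ is the probability that $Y_i = 0$, or the probability of failure.
 It can be verified that as $Z_i$ ranges from -∞ to +∞, $P_i$
ranges between zero and one and is not linearly related
to $Z_i$.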
Lecture 8: The Logit model continued…….
 Equation (1) is non-linear in both $X_i$ and the $\beta$s, therefore we
cannot apply OLS to estimate the parameters of this
model.
 To proceed in estimating equation (1), we can linearize it
as follows:
 If $P_i$ is the probability of success, then the probability of
failure is given by:
$$1 - P_i = \frac{1}{1 + e^{Z_i}} \qquad (2)$$
Lecture 8: The Logit model continued…….
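 The odds ratio in favour of success is given by:
$$\frac{P_i}{1 - P_i} = \frac{1 + e^{Z_i}}{1 + e^{-Z_i}} = e^{Z_i} \qquad (3)$$
 Expressing (3) in natural logarithms, we obtain:
$$L_i = \ln\left(\frac{P_i}{1 - P_i}\right) = Z_i = \beta_0 + \beta_1 X_i \qquad (4)$$
$L_i$ is known as the Logit or the natural log of the
odds ratio, hence the name, “Logit model”.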
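The linearization can be verified numerically: starting from a probability generated by the logistic function, the odds equal e^Z and the log of the odds returns Z, which is linear in X. The coefficient values below are hypothetical and chosen only for illustration.

```python
import math

beta0, beta1 = -2.0, 0.5   # hypothetical coefficients, not estimates

rows = []
for x in [0.0, 2.0, 4.0, 6.0, 8.0]:
    z = beta0 + beta1 * x                 # linear index Z
    p = 1.0 / (1.0 + math.exp(-z))        # probability of success
    odds = p / (1.0 - p)                  # odds ratio in favour of success
    logit = math.log(odds)                # natural log of the odds ratio
    rows.append((x, z, p, odds, logit))
    print(f"X = {x:3.0f}  Z = {z:5.2f}  P = {p:.4f}  odds = {odds:.4f}  logit = {logit:5.2f}")
```

Each unit increase in X raises the logit by the same amount (here 0.5), while the probability changes by a different amount at each X.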
Lecture 8: Properties of the logit model
1. The logit is not bounded between 0 and 1 although the
probabilities lie between 0 and 1.
2. Although the logit is linear in X, the probabilities are not
linear in X.
3. Z can contain many regressors.
4. If the logit is positive, it means that when the value of
the regressors increases, the odds that the regressand
equals 1 increase.
5. Estimating the probability can be done directly from
equation (1) as long as Z is known.
6. Unlike the LPM, the logit model assumes that it is the log of
the odds ratio, not the probability, that is linearly related to X.
Lecture 8: Interpretation of slope
coefficients in the logit model
 The slope coefficient, $\beta_1$, measures the change in the
Logit (log of the odds ratio) resulting from a marginal change
in the regressor, $X_i$. Mathematically we have:
$$\frac{dL_i}{dX_i} = \beta_1$$
 The Maximum Likelihood (ML) method is used to
estimate the logit model in (4).
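One way to see what ML estimation involves is a small Newton-Raphson routine for a logit model with one regressor. This is a teaching sketch under stated assumptions: the data are invented, the function name `fit_logit` is hypothetical, and in practice a statistical package would normally be used. At each step the coefficients move by the inverse information matrix times the score (the gradient of the log-likelihood).

```python
import math

def fit_logit(x, y, max_iter=50, tol=1e-10):
    """Maximum likelihood for P(Y=1|X) = 1/(1+exp(-(b0+b1*X))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        # Score vector: derivative of the log-likelihood w.r.t. (b0, b1)
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix (negative Hessian), a symmetric 2x2 matrix
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system with the closed-form inverse
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1

# Hypothetical data: x = payment-to-income ratio, y = 1 if the loan is denied
x = [0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.50, 0.55]
y = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

b0_hat, b1_hat = fit_logit(x, y)
print(f"ML estimates: b0 = {b0_hat:.4f}, b1 = {b1_hat:.4f}")
```

The routine assumes the data are not perfectly separated by X; with perfect separation the ML estimates do not exist and the iterations would not settle.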
Lecture 9: The Probit model
 Unlike the Logit model, which is derived from the
cumulative logistic distribution function, the probit model
uses the cumulative normal distribution function, hence it is
sometimes referred to as the Normit model.
 The probit model is similar to the logit model except that
the logistic function is replaced by the normal distribution
function:
$$P_i = \Pr(Y_i = 1 \mid X_i) = \Phi(\beta_0 + \beta_1 X_i)$$
 Where $\Phi(\cdot)$ is the cumulative standard
normal distribution function.
Lecture 9: Interpretation of the probit
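 The probit model coefficients are not easy to interpret. The rate of
change of the probability with respect to the regressor is
$$\frac{dP_i}{dX_i} = \phi(\beta_0 + \beta_1 X_i)\,\beta_1$$
 Where $\phi(\cdot)$ is the standard normal
probability density function (PDF) evaluated at
$\beta_0 + \beta_1 X_i$. The evaluation will depend on the
particular value of the X variables.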
 In the probit model the rate of change is complicated and
is given as explained above.
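To make the dependence on X concrete, the sketch below evaluates the probit rate of change, the standard normal density at the linear index multiplied by the slope coefficient, at several values of X. The coefficients are hypothetical and the function names are not from the notes.

```python
import math

def normal_pdf(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def probit_marginal_effect(beta0, beta1, x):
    """dP/dX for P = Phi(beta0 + beta1 * X)."""
    return normal_pdf(beta0 + beta1 * x) * beta1

beta0, beta1 = -1.0, 0.5   # hypothetical coefficients, not estimates

effects = {x: probit_marginal_effect(beta0, beta1, x) for x in [0.0, 2.0, 6.0]}
for x, me in effects.items():
    print(f"X = {x:3.0f}  dP/dX = {me:.5f}")
```

The same coefficient implies a different change in probability at each X; the effect is largest where the index is zero, that is, where the predicted probability is one half.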
Lecture 9: Logit and Probit Models
 Logit and Probit models give qualitatively similar results.
 In most applications the models are similar, the main
difference being that the logistic distribution has slightly
fatter tails than the normal distribution.
 One can use either of the two models and obtain
similar results.
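The tail comparison can be illustrated by putting the two distributions on the same scale. The sketch assumes a logistic distribution rescaled to unit variance (scale parameter √3/π, since the standard logistic has variance π²/3) and compares upper-tail probabilities with the standard normal; the function names are hypothetical.

```python
import math

def normal_upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def logistic_upper_tail(z, scale=math.sqrt(3.0) / math.pi):
    """P(Z > z) for a logistic variable rescaled to unit variance."""
    return 1.0 / (1.0 + math.exp(z / scale))

for z in [0.0, 3.0, 4.0]:
    print(
        f"z = {z:.0f}  normal tail = {normal_upper_tail(z):.6f}  "
        f"logistic tail = {logistic_upper_tail(z):.6f}"
    )
```

Both tails equal one half at zero, but far from the centre the logistic tail probability is several times larger than the normal one.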
Tutorial questions on binary dependent
variable models.
1. Explain the weaknesses of the LPM.
2. Why is the Logit or the Probit a better model than the
LPM?
3. Derive the Logit model from the Logistic function and
explain the meaning of its slope coefficients.
4. Do all the questions on the tutorial sheet to be provided.
Lab session 3: Binary dependent variable
 Groups of 5 students
 Identify a hypothetical economic problem that requires
the use of qualitative dependent variable modeling (it
might be micro or macro in nature).
 The dependent variable must be binary.
 Enter data for at least 50 individuals in Excel.
 Saved data must be ready for use in the first week of