lecture notes Limited Dependent Econometrics II

Instructor: R. Makoto
richard makoto UZ Econ313 Lecture notes
Lecture 7: Qualitative response
regression models
 The dependent variable is qualitative rather than
continuous.
 Qualitative response regression models can have
dependent variables with either two categories or more
than two categories.
 Those with two categories are known as binary
dependent variable models.
 Those with more than two categories are referred to as
polychotomous or multi-response dependent variable
models, e.g. Poisson models, multinomial logit and probit
models, ordered probit models, etc.
Lecture 7: Binary dependent variable
models
 There are several types of such models.
 They include the Linear Probability Model
(LPM), the Probit model, the Logit model, latent
regressions, random utility models, etc.
 Binary dependent variable models are also known as
dichotomous dependent variable models.
 At this level, however, we will concentrate on the first
three models, namely the LPM, the Logit and the Probit
models.
Example: An application of the binary
dependent variable model
 Two people, identical but for their race, walk into a bank
and apply for a mortgage, a large loan, so that each can
buy an identical house.
 Does the bank treat them the same way? Are they both
equally likely to have their mortgage applications
accepted? By law they must receive identical treatment.
 But whether they actually do is a matter of great concern
among bank regulators.
 We can model the factors influencing the outcome of a
loan application as follows:
Example: An application of the binary
dependent variable model continued ….
 The dependent variable takes two values: Y=1 if the
mortgage application is denied and Y=0 otherwise.
 The model is therefore a probability model,
i.e. it describes $\Pr(Y = 1 \mid X_1, \dots, X_k)$ as a function of the Xs.
 The Xs are the explanatory variables such as race,
gender, wealth, previous loans, loan payment record,
age, and many other socio-economic factors.
 Several approaches can be used to estimate binary
models.
Lecture 7: The Linear Probability Model
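 It is a multiple regression model whose dependent
variable is binary rather than continuous.
 Because the dependent variable Y is binary, the
population regression function corresponds to the
probability that the dependent variable equals 1 given
the explanatory variables (the Xs), i.e.
$\Pr(Y = 1 \mid X_1, \dots, X_k) = \beta_0 + \beta_1 X_1 + \dots + \beta_k X_k$
 $\beta_j$ is the change in the probability that Y=1 associated
with a unit change in $X_j$, holding the other regressors constant, i.e.
$\beta_j = \partial \Pr(Y = 1 \mid X_1, \dots, X_k) / \partial X_j$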
Lecture 7: The LPM continued…………….
 The regression coefficients in the LPM are estimated by
OLS.
 The usual (heteroscedasticity-robust) OLS standard errors
can be used to construct confidence intervals and
hypothesis tests.
 Let $P_i$ be the probability that $Y_i = 1$ (probability of success);
then $1 - P_i$ is the probability that $Y_i = 0$ (probability of failure).
 Therefore:
$Y_i$    Probability
0        $1 - P_i$
1        $P_i$
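As a short worked step, writing $P_i$ for the probability that $Y_i = 1$, the expected value of the binary variable follows directly from its two outcomes and their probabilities:

```latex
E(Y_i \mid X_i) = 0 \cdot (1 - P_i) + 1 \cdot P_i = P_i
```

This is why the regression function of a binary dependent variable can be read as the probability of success.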
Lecture 7: Weaknesses of the LPM
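 The disturbances are not normally distributed.
 Since $u_i = Y_i - P_i$, where $P_i$ is the probability that $Y_i = 1$
given the Xs, the disturbance takes only two values:
$Y_i$    $u_i$        Probability
0        $-P_i$       $1 - P_i$
1        $1 - P_i$    $P_i$
 $u_i$, like $Y_i$, follows a two-point (Bernoulli-type) distribution,
the special case of the binomial distribution with only one draw,
not the normal distribution. This violates the normality assumption
of the classical normal linear regression model.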
Lecture 7: Weaknesses of the LPM
continued…………………..
 The disturbances are heteroscedastic.
 The variance of the error, $\operatorname{var}(u_i) = P_i(1 - P_i)$, depends on the
values of the explanatory variables (Xs) through $P_i$, the probability
that $Y_i = 1$; hence it is not homoscedastic.
 The problem of heteroscedastic variances can be
corrected by applying the Weighted Least Squares
(WLS) estimation technique.
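To see why the error variance depends on the Xs, write $u_i = Y_i - P_i$, where $P_i$ is the probability that $Y_i = 1$ given the Xs (notation assumed here). Since $E(u_i) = 0$, the calculation can be sketched as:

```latex
\begin{aligned}
\operatorname{var}(u_i) &= E(u_i^2) \\
  &= (1 - P_i)^2 \, P_i + (-P_i)^2 \, (1 - P_i) \\
  &= P_i (1 - P_i) \, [(1 - P_i) + P_i] \\
  &= P_i (1 - P_i)
\end{aligned}
```

Because $P_i$ changes with the Xs, so does the variance. A WLS correction would therefore divide each observation by $\sqrt{P_i(1 - P_i)}$, with $P_i$ replaced by its fitted value from a first-stage OLS regression.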
Lecture 7: Weaknesses of the LPM
continued………
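 R-squared, the coefficient of determination, is of limited
use as a measure of goodness of fit: the observed Y values are all
0 or 1 while the fitted values lie along a line, so R-squared is
typically well below 1 even when the model is adequate.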
 The major weakness of the LPM is its failure to
guarantee that the estimated probabilities always lie
between zero and one.
 Probabilities generated by the LPM sometimes exceed
one or fall below zero, which is nonsensical.
 It is this weakness that gives rise to better methods of
estimating binary dependent variable models.
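The out-of-range problem is easy to reproduce. The sketch below fits an LPM by OLS to a small made-up data set (the values of `x` and `y` are hypothetical, chosen only for illustration) and lists the fitted "probabilities" that fall outside the unit interval.

```python
# Hypothetical data: x is some applicant characteristic, y = 1 if denied.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0, 0, 0, 1, 0, 1, 1, 1]


def ols_simple(x, y):
    """Return (intercept, slope) of the OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


b0, b1 = ols_simple(x, y)

# Fitted values of the LPM are read as estimated probabilities.
fitted = [b0 + b1 * xi for xi in x]

# Collect the observations whose fitted probability is below 0 or above 1.
out_of_range = [(xi, round(p, 3)) for xi, p in zip(x, fitted) if p < 0 or p > 1]

print("intercept, slope:", round(b0, 3), round(b1, 3))
print("fitted values outside [0, 1]:", out_of_range)
```

With these numbers the smallest and largest values of `x` produce fitted values below zero and above one respectively, which is the weakness the logit and probit models are designed to avoid.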
Lecture 8: The Logit model
 A logit model is an econometric probability model derived
from the logistic distribution function:
$P = \dfrac{1}{1 + e^{-Z}}$
 It ensures that $0 < P < 1$
whatever the value of $Z$ is. As $Z$
approaches positive infinity, $P$
approaches 1, and as $Z$
approaches negative infinity, $P$
approaches 0.
 Consider a binary dependent variable model, say
$Y_i = \beta_0 + \beta_1 X_i + u_i$
 where $Y_i$
takes only two values, 1 or 0,
Lecture 8: The Logit model continued…….
 we can use the logistic distribution function in dealing
with such models. Such a function can be expressed as:
$P_i = \dfrac{1}{1 + e^{-Z_i}} = \dfrac{e^{Z_i}}{1 + e^{Z_i}}$ (1)
 where $Z_i = \beta_0 + \beta_1 X_i$, $P_i$ is the
probability that $Y_i = 1$ (the probability of success) and
$1 - P_i$ is the probability that $Y_i = 0$, or the probability of failure.
 It can be verified that as $Z_i$ ranges from -∞ to +∞, $P_i$
ranges between zero and one and is not linearly related
to $Z_i$.
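The bounds on the logistic function can be checked numerically. This is a minimal sketch; the function name `logistic` and the grid of Z values are illustrative choices, not part of the notes.

```python
import math


def logistic(z):
    """Logistic CDF, P = 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)          # safe: z < 0, so exp(z) < 1
    return ez / (1.0 + ez)    # algebraically the same as 1 / (1 + exp(-z))


# Equal steps in Z do not give equal steps in P: the relation is non-linear.
for z in (-10.0, -4.0, -2.0, 0.0, 2.0, 4.0, 10.0):
    print(f"Z = {z:6.1f}   P = {logistic(z):.5f}")
```

The printed probabilities stay strictly between 0 and 1 over this grid and flatten out at both ends, which is the S-shape that the LPM's straight line cannot reproduce.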
Lecture 8: The Logit model continued…….
 Equation (1) is non-linear in both $X_i$ and the parameters
($\beta_0$, $\beta_1$); therefore we
cannot apply OLS to estimate the parameters of this
equation.
 To proceed in estimating equation (1), we can linearize it
as follows:
 If $P_i$ is the probability of success, then the probability of
failure is given by:
$1 - P_i = \dfrac{1}{1 + e^{Z_i}}$ (2)
Lecture 8: The Logit model continued…….
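 The odds in favour of success (often loosely called the odds ratio) are given by:
$\dfrac{P_i}{1 - P_i} = \dfrac{1 + e^{Z_i}}{1 + e^{-Z_i}} = e^{Z_i}$ (3)
 Expressing (3) in natural logarithms, we obtain:
$L_i = \ln\left(\dfrac{P_i}{1 - P_i}\right) = Z_i = \beta_0 + \beta_1 X_i$ (4)
 $L_i$ is known as the Logit, the natural log of the
odds, hence the name “Logit model”.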
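The chain from probability to odds to logit can be traced numerically. In the sketch below the coefficient values `b0` and `b1` are hypothetical, and the helper names are illustrative; the point is only that taking the log of the odds recovers the linear index Z.

```python
import math


def logistic(z):
    """Probability of success implied by the index z."""
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    """Odds in favour of success, P / (1 - P)."""
    return p / (1.0 - p)


def logit(p):
    """Natural log of the odds."""
    return math.log(odds(p))


b0, b1 = -1.5, 0.8  # hypothetical intercept and slope

for x in (0.0, 1.0, 2.0, 3.0):
    z = b0 + b1 * x
    p = logistic(z)
    # The last column reproduces z: the logit is linear in x, p is not.
    print(f"X = {x:.0f}  Z = {z:5.2f}  P = {p:.4f}  "
          f"odds = {odds(p):.4f}  logit = {logit(p):5.2f}")
```

Each unit increase in X adds `b1` to the logit and multiplies the odds by `exp(b1)`, while the change in P itself differs from row to row.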
Lecture 8: Properties of the logit model
1. The logit is not bounded between 0 and 1 although the
probabilities lie between 0 and 1.
2. Although the logit is linear in X, the probabilities are not
linear in X.
3. Z can contain many regressors.
4. If the slope coefficient is positive, an increase in the
regressor raises the logit and therefore the odds that the
regressand equals 1; if it is negative, the odds fall.
5. Estimating the probability can be done directly from
equation (1) as long as Z is known.
6. Unlike the LPM, the logit model assumes that it is the log of
the odds, not the probability, which is linearly related to X.
Lecture 8: Interpretation of slope
coefficients in the logit model
 The slope coefficient, $\beta_1$, measures the change in the
Logit (log of the odds) resulting from a marginal change
in the regressor, $X_i$. Mathematically we have:
$\dfrac{dL_i}{dX_i} = \beta_1$
 The Maximum Likelihood (ML) method is used to
estimate the logit model in (4).
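To make the ML step concrete, here is a minimal sketch of Newton-Raphson maximisation of the logit log-likelihood for a single regressor. The data are made up, the function name `fit_logit` is hypothetical, and no standard errors are computed; in practice a statistical package would be used.

```python
import math


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(x, y, iterations=25):
    """Newton-Raphson ML for P(Y=1) = logistic(b0 + b1*x); returns (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        p = [logistic(b0 + b1 * xi) for xi in x]
        # Score vector: gradient of the log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))
        # Information matrix X'WX with weights w = p(1 - p).
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + (X'WX)^{-1} * score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data; the 0s and 1s overlap in x, so the ML estimate exists.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0, 0, 0, 1, 0, 1, 1, 1]

b0_hat, b1_hat = fit_logit(x, y)
print("ML estimates (b0, b1):", round(b0_hat, 4), round(b1_hat, 4))
```

The estimated slope is the change in the log of the odds per unit change in X, as described above. If the 0s and 1s were perfectly separated by X, the likelihood would have no finite maximum and this iteration would not settle.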
Lecture 9: The Probit model
 Unlike the Logit model, which is derived from the
cumulative logistic distribution function, the probit model
uses the cumulative normal distribution function; hence it is
sometimes referred to as the Normit model.
 The probit model is similar to the logit model except that
the logistic function is replaced by the normal distribution
function:
$P_i = \Pr(Y_i = 1 \mid X_i) = \Phi(Z_i)$
 where $Z_i = \beta_0 + \beta_1 X_i$ and $\Phi$ is the cumulative
standard normal distribution function.
Lecture 9: Interpretation of the probit
model
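 The probit model coefficients are not easy to interpret. The rate of
change of the probability with respect to the regressor is
$\dfrac{dP_i}{dX_i} = \phi(\beta_0 + \beta_1 X_i)\,\beta_1$
 where $\phi(\beta_0 + \beta_1 X_i)$ is the standard normal
probability density function (PDF) evaluated at
$\beta_0 + \beta_1 X_i$. The evaluation will depend on the
particular value of the X variables.
 In the probit model the rate of change is therefore not constant:
it varies with the point at which it is evaluated.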
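The dependence of the probit marginal effect on the evaluation point can be illustrated with a few lines of code. The coefficients `b0` and `b1` below are hypothetical, and the helper names are illustrative; only the Python standard library is used.

```python
import math


def norm_pdf(z):
    """Standard normal density, phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal CDF, Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_marginal_effect(b0, b1, x):
    """dP/dX = phi(b0 + b1*x) * b1 for a single-regressor probit."""
    return norm_pdf(b0 + b1 * x) * b1


b0, b1 = -1.0, 0.5  # hypothetical intercept and slope

for x in (0.0, 2.0, 4.0):
    p = norm_cdf(b0 + b1 * x)
    me = probit_marginal_effect(b0, b1, x)
    print(f"X = {x:.0f}  P = {p:.4f}  dP/dX = {me:.4f}")
```

The same slope coefficient yields a different change in probability at each X: the effect is largest where the index is zero (P = 0.5) and shrinks toward the tails.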
Lecture 9: Logit and Probit Models
 Logit and Probit models give qualitatively similar results.
 In most applications the models are similar, the main
difference being that the logistic distribution has slightly
fatter tails than the normal distribution.
 Either of the two models can be used, with
similar results.
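The similarity of the two distributions, and the heavier logistic tails, can be checked directly. The sketch below rescales the logistic CDF to unit variance (the standard logistic has variance π²/3) so that it is comparable with the standard normal; the rescaling and the grid of points are choices made for this illustration.

```python
import math


def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Standard deviation of the standard logistic distribution.
SCALE = math.pi / math.sqrt(3.0)

for z in (-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0):
    p_logit = logistic_cdf(SCALE * z)   # unit-variance logistic CDF at z
    p_probit = norm_cdf(z)              # standard normal CDF at z
    print(f"z = {z:5.1f}  logistic = {p_logit:.5f}  normal = {p_probit:.5f}")
```

Over the middle of the range the two columns stay close, while far from zero the logistic column approaches 0 and 1 more slowly, which is the fatter-tail property. One practical consequence of the different scales is that logit and probit coefficients are not directly comparable in magnitude.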
Tutorial questions on binary dependent
variable models.
1. Explain the weaknesses of the LPM.
2. Why is the Logit or the Probit a better model than the
LPM?
3. Derive the Logit model from the Logistic function and
explain the meaning of its slope coefficients.
4. Do all the questions on the tutorial sheet to be provided.
Lab session 3: Binary dependent variable
models.
 Groups of 5 students
 Identify a hypothetical economic problem that requires
the use of qualitative dependent variable modeling (it
might be micro or macro in nature).
 The dependent variable must be binary.
 Enter data on at least 50 individuals in Excel.
 Saved data must be ready for use in the first week of
November.