Ch16

Qualitative and Limited Dependent Variable
Models
Prepared by Vera Tabakova, East Carolina University

16.1 Models with Binary Dependent Variables

16.2 The Logit Model for Binary Choice

16.3 Multinomial Logit

16.4 Conditional Logit

16.5 Ordered Choice Models

16.6 Models for Count Data

16.7 Limited Dependent Variables
Principles of Econometrics, 3rd Edition
Slide 16-2

Examples:
- An economic model explaining why some states in the United
  States have ratified the Equal Rights Amendment, and others have
  not.
- An economic model explaining why some individuals take a
  second, or third, job and engage in “moonlighting.”
- An economic model of why some legislators in the U.S. House of
  Representatives vote for a particular bill and others do not.
- An economic model of why the federal government awards
  development grants to some large cities and not others.
- An economic model explaining why some loan applications are
  accepted and others not at a large metropolitan bank.
- An economic model explaining why some individuals vote “yes”
  for increased spending in a school board election and others vote
  “no.”
- An economic model explaining why some female college students
  decide to study engineering and others do not.
$$y = \begin{cases} 1 & \text{individual drives to work} \\ 0 & \text{individual takes bus to work} \end{cases} \tag{16.1}$$
If the probability that an individual drives to work is $p$, then
$P(y = 1) = p$. It follows that the probability that a person uses public
transportation is $P(y = 0) = 1 - p$.
$$f(y) = p^{y} (1 - p)^{1 - y}, \quad y = 0, 1 \tag{16.2}$$
$$E(y) = p; \quad \operatorname{var}(y) = p(1 - p)$$
$$y = E(y) + e = p + e \tag{16.3}$$
$$E(y) = p = \beta_1 + \beta_2 x \tag{16.4}$$
$$y = E(y) + e = \beta_1 + \beta_2 x + e \tag{16.5}$$
One problem with the linear probability model is that the error term is
heteroskedastic; the variance of the error term e varies from one
observation to another.
y value   e value                        Probability
1         $1 - (\beta_1 + \beta_2 x)$    $p = \beta_1 + \beta_2 x$
0         $-(\beta_1 + \beta_2 x)$       $1 - p = 1 - (\beta_1 + \beta_2 x)$
$$\operatorname{var}(e) = (\beta_1 + \beta_2 x)(1 - \beta_1 - \beta_2 x)$$
Using generalized least squares, the estimated variance is:
$$\hat\sigma_i^2 = \widehat{\operatorname{var}}(e_i) = (b_1 + b_2 x_i)(1 - b_1 - b_2 x_i) \tag{16.6}$$
$$y_i^* = y_i / \hat\sigma_i, \qquad x_i^* = x_i / \hat\sigma_i$$
$$y_i^* = \beta_1 \hat\sigma_i^{-1} + \beta_2 x_i^* + e_i^*$$
$$\hat p = b_1 + b_2 x \tag{16.7}$$
$$\frac{dp}{dx} = \beta_2 \tag{16.8}$$
Problems:

We can easily obtain values of p̂ that are less than 0 or greater than 1.

Some of the estimated variances in (16.6) may be negative.
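Both problems can be seen in a minimal Python sketch. The coefficient values `b1` and `b2` are hypothetical, chosen only for illustration; they are not estimates from the text.

```python
def lpm_fitted_probability(b1, b2, x):
    """Fitted probability p_hat = b1 + b2*x from the linear probability model."""
    return b1 + b2 * x


def lpm_estimated_variance(b1, b2, x):
    """Estimated error variance p_hat * (1 - p_hat), as in (16.6)."""
    p_hat = lpm_fitted_probability(b1, b2, x)
    return p_hat * (1.0 - p_hat)


# Hypothetical estimates chosen only to illustrate the problem.
b1, b2 = 0.48, 0.007

for x in (-90, 0, 90):
    p_hat = lpm_fitted_probability(b1, b2, x)
    var_hat = lpm_estimated_variance(b1, b2, x)
    print(f"x = {x:4d}  p_hat = {p_hat:6.3f}  var_hat = {var_hat:7.4f}")
```

Whenever the fitted probability falls outside the unit interval, the product p_hat(1 − p_hat) is negative, so the weights needed for generalized least squares cannot be computed for that observation.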
Figure 16.1 (a) Standard normal cumulative distribution function (b) Standard normal
probability density function
$$\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-.5 z^2}$$
$$\Phi(z) = P[Z \le z] = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-.5 u^2}\, du \tag{16.9}$$
$$p = P[Z \le \beta_1 + \beta_2 x] = \Phi(\beta_1 + \beta_2 x) \tag{16.10}$$
$$\frac{dp}{dx} = \frac{d\Phi(t)}{dt}\,\frac{dt}{dx} = \phi(\beta_1 + \beta_2 x)\,\beta_2 \tag{16.11}$$
where $t = \beta_1 + \beta_2 x$ and $\phi(\beta_1 + \beta_2 x)$ is the standard normal probability
density function evaluated at $\beta_1 + \beta_2 x$.
Equation (16.11) has the following implications:
1. Since φ(β1 + β2x) is a probability density function its value is always
positive. Consequently the sign of dp/dx is determined by the sign of
β2. In the transportation problem we expect β2 to be positive so that
dp/dx > 0; as x increases we expect p to increase.
2. As x changes the value of the function φ(β1 + β2x) changes. The
standard normal probability density function reaches its maximum
when z = 0, or when β1 + β2x = 0. In this case p = Φ(0) = .5 and an
individual is equally likely to choose car or bus transportation.
The slope of the probit function p = Φ(z) is at its maximum when
z = 0, the borderline case.
3. On the other hand, if β1 + β2x is large, say near 3, then the
probability that the individual chooses to drive is very large and
close to 1. In this case a change in x will have relatively little effect
since φ(β1 + β2x) will be nearly 0. The same is true if β1 + β2x is a
large negative value, say near −3. These results are consistent with
the notion that if an individual is “set” in their ways, with p near 0 or
1, the effect of a small change in commuting time will be negligible.
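The three implications of (16.11) can be checked numerically. The sketch below uses only the Python standard library; the coefficient values are hypothetical and serve only to place the index β1 + β2x at −3, 0 and 3.

```python
import math


def norm_pdf(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_marginal_effect(beta1, beta2, x):
    """dp/dx = phi(beta1 + beta2*x) * beta2, equation (16.11)."""
    return norm_pdf(beta1 + beta2 * x) * beta2


# Hypothetical coefficients: the index equals -3, 0, 3 at x = -100, 0, 100.
beta1, beta2 = 0.0, 0.03

for x in (-100, 0, 100):
    p = norm_cdf(beta1 + beta2 * x)
    effect = probit_marginal_effect(beta1, beta2, x)
    print(f"x = {x:5d}  p = {p:.4f}  dp/dx = {effect:.6f}")
```

The marginal effect is largest at the borderline case p = 0.5 and shrinks toward zero as p approaches 0 or 1, while its sign always matches the sign of β2.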
Predicting the probability that an individual chooses the alternative
y = 1:
$$\hat p = \Phi(\tilde\beta_1 + \tilde\beta_2 x)$$
$$\hat y = \begin{cases} 1 & \hat p > 0.5 \\ 0 & \hat p \le 0.5 \end{cases} \tag{16.12}$$
$$f(y_i) = [\Phi(\beta_1 + \beta_2 x_i)]^{y_i}\,[1 - \Phi(\beta_1 + \beta_2 x_i)]^{1 - y_i}, \quad y_i = 0, 1 \tag{16.13}$$
$$f(y_1, y_2, y_3) = f(y_1)\, f(y_2)\, f(y_3)$$
Suppose that y1 = 1, y2 = 1 and y3 = 0.
Suppose that the values of x, in minutes, are x1 = 15, x2 = 20 and x3 = 5.
$$P[y_1 = 1, y_2 = 1, y_3 = 0] = f(1, 1, 0) = f(1)\, f(1)\, f(0)$$
$$P[y_1 = 1, y_2 = 1, y_3 = 0] = \Phi[\beta_1 + \beta_2(15)] \times \Phi[\beta_1 + \beta_2(20)] \times \{1 - \Phi[\beta_1 + \beta_2(5)]\} \tag{16.14}$$
In large samples the maximum likelihood estimator is normally
distributed, consistent and best, in the sense that no competing
estimator has smaller variance.
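As a rough sketch of what maximum likelihood does, the likelihood (16.14) for the three observations can be evaluated at different trial parameter values. The trial values below are hypothetical; a numerical optimizer would search over β1 and β2 for the pair giving the largest value.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_likelihood(beta1, beta2, ys, xs):
    """Product of f(y_i) from (16.13) over all observations."""
    likelihood = 1.0
    for y, x in zip(ys, xs):
        p = norm_cdf(beta1 + beta2 * x)
        likelihood *= p if y == 1 else 1.0 - p
    return likelihood


# The three observations used in (16.14).
ys = [1, 1, 0]
xs = [15, 20, 5]

# Hypothetical trial values of (beta1, beta2), for illustration only.
for beta1, beta2 in [(0.0, 0.0), (-1.0, 0.1)]:
    value = probit_likelihood(beta1, beta2, ys, xs)
    print(f"beta1 = {beta1:5.2f}  beta2 = {beta2:5.2f}  L = {value:.4f}")
```

With β1 = β2 = 0 every probability is 0.5, so the likelihood is 0.5³ = 0.125; trial values that assign higher probability to the observed choices give a larger likelihood.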
$$\tilde\beta_1 + \tilde\beta_2\, DTIME_i = -.0644 + .0299\, DTIME_i \tag{16.15}$$
(se)      (.3992)   (.0103)
$$\frac{dp}{dDTIME} = \phi(\tilde\beta_1 + \tilde\beta_2\, DTIME)\,\tilde\beta_2 = \phi(-0.0644 + 0.0299 \times 20)(0.0299)$$
$$= \phi(.5355)(0.0299) = 0.3456 \times 0.0299 = 0.0104$$
If an individual is faced with the situation that it takes 30 minutes
longer to take public transportation than to drive to work, then the
estimated probability that auto transportation will be selected is
$$\hat p = \Phi(\tilde\beta_1 + \tilde\beta_2\, DTIME) = \Phi(-0.0644 + 0.0299 \times 30) = .798$$
Since the estimated probability that the individual will choose to drive
to work is 0.798, which is greater than 0.5, we “predict” that when
public transportation takes 30 minutes longer than driving to work,
the individual will choose to drive.
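The calculations for the transportation example can be reproduced with a short Python sketch. It uses the rounded coefficients reported in (16.15), so the results may differ from the slide values in the last digit; the function names are illustrative.

```python
import math


def norm_pdf(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Rounded probit estimates from (16.15).
B1, B2 = -0.0644, 0.0299


def predicted_probability(dtime):
    """Estimated probability of driving when transit takes dtime minutes longer."""
    return norm_cdf(B1 + B2 * dtime)


def marginal_effect(dtime):
    """Estimated change in the probability of driving per extra minute of dtime."""
    return norm_pdf(B1 + B2 * dtime) * B2


def predicted_choice(dtime):
    """Prediction rule (16.12): 1 (drive) if p_hat > 0.5, otherwise 0 (bus)."""
    return 1 if predicted_probability(dtime) > 0.5 else 0


print("marginal effect at DTIME = 20:", marginal_effect(20))
print("probability at DTIME = 30:", predicted_probability(30))
print("predicted choice at DTIME = 30:", predicted_choice(30))
```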
$$\lambda(l) = \frac{e^{-l}}{(1 + e^{-l})^2}, \quad -\infty < l < \infty \tag{16.16}$$
$$\Lambda(l) = P[L \le l] = \frac{1}{1 + e^{-l}} \tag{16.17}$$
$$p = P[L \le \beta_1 + \beta_2 x] = \Lambda(\beta_1 + \beta_2 x) = \frac{1}{1 + e^{-(\beta_1 + \beta_2 x)}} \tag{16.18}$$
$$p = \frac{1}{1 + e^{-(\beta_1 + \beta_2 x)}} = \frac{\exp(\beta_1 + \beta_2 x)}{1 + \exp(\beta_1 + \beta_2 x)}$$
$$1 - p = \frac{1}{1 + \exp(\beta_1 + \beta_2 x)}$$
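A small Python sketch of the logistic functions in (16.16) to (16.18) is given here. The coefficient values in the usage lines are hypothetical.

```python
import math


def logistic_pdf(l):
    """Logistic probability density function, equation (16.16)."""
    return math.exp(-l) / (1.0 + math.exp(-l)) ** 2


def logistic_cdf(l):
    """Logistic cumulative distribution function, equation (16.17)."""
    return 1.0 / (1.0 + math.exp(-l))


def logit_probability(beta1, beta2, x):
    """p = Lambda(beta1 + beta2*x), equation (16.18)."""
    return logistic_cdf(beta1 + beta2 * x)


# Hypothetical coefficients, for illustration only.
beta1, beta2, x = -0.2, 0.05, 20.0
index = beta1 + beta2 * x

p = logit_probability(beta1, beta2, x)
p_alt = math.exp(index) / (1.0 + math.exp(index))  # equivalent second form
print(p, p_alt, 1.0 - p)
```

Unlike the probit model, the logit probability has a closed form, which is why no numerical integration is needed to evaluate it.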
Examples of multinomial choice situations:
1. Choice of a laundry detergent: Tide, Cheer, Arm & Hammer, Wisk,
etc.
2. Choice of a major: economics, marketing, management, finance or
accounting.
3. Choices after graduating from high school: not going to college,
going to a private 4-year college, a public 4-year college, or a 2-year
college.
The explanatory variable $x_i$ is individual specific, but does not
change across alternatives.
$$p_{ij} = P[\text{individual } i \text{ chooses alternative } j]$$
$$p_{i1} = \frac{1}{1 + \exp(\beta_{12} + \beta_{22} x_i) + \exp(\beta_{13} + \beta_{23} x_i)}, \quad j = 1 \tag{16.19a}$$
$$p_{i2} = \frac{\exp(\beta_{12} + \beta_{22} x_i)}{1 + \exp(\beta_{12} + \beta_{22} x_i) + \exp(\beta_{13} + \beta_{23} x_i)}, \quad j = 2 \tag{16.19b}$$
$$p_{i3} = \frac{\exp(\beta_{13} + \beta_{23} x_i)}{1 + \exp(\beta_{12} + \beta_{22} x_i) + \exp(\beta_{13} + \beta_{23} x_i)}, \quad j = 3 \tag{16.19c}$$
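The probabilities in (16.19a) to (16.19c) can be computed with one small function. This is a sketch under the normalization used above, where the first alternative has both coefficients set to zero; the coefficient values are hypothetical.

```python
import math


def mnl_probs(x, betas):
    """Multinomial logit probabilities for one individual.

    betas holds (beta_1j, beta_2j) for alternatives j = 2, ..., J.
    Alternative 1 is the base, with coefficients normalized to zero.
    """
    indexes = [0.0] + [b1 + b2 * x for b1, b2 in betas]
    denominator = sum(math.exp(v) for v in indexes)
    return [math.exp(v) / denominator for v in indexes]


# Hypothetical coefficients for alternatives 2 and 3.
betas = [(0.5, -0.1), (1.0, -0.3)]

probs = mnl_probs(4.0, betas)
print([round(p, 4) for p in probs], "sum =", sum(probs))
```

Because exp(0) = 1 for the base alternative, the leading 1 in each denominator of (16.19) is just the base alternative's term.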
$$P(y_{11} = 1, y_{22} = 1, y_{33} = 1) = p_{11} \times p_{22} \times p_{33}$$
$$= \frac{1}{1 + \exp(\beta_{12} + \beta_{22} x_1) + \exp(\beta_{13} + \beta_{23} x_1)} \times \frac{\exp(\beta_{12} + \beta_{22} x_2)}{1 + \exp(\beta_{12} + \beta_{22} x_2) + \exp(\beta_{13} + \beta_{23} x_2)} \times \frac{\exp(\beta_{13} + \beta_{23} x_3)}{1 + \exp(\beta_{12} + \beta_{22} x_3) + \exp(\beta_{13} + \beta_{23} x_3)}$$
$$= L(\beta_{12}, \beta_{22}, \beta_{13}, \beta_{23})$$
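$$\tilde p_{01} = \frac{1}{1 + \exp(\tilde\beta_{12} + \tilde\beta_{22} x_0) + \exp(\tilde\beta_{13} + \tilde\beta_{23} x_0)}$$
$$\left.\frac{\Delta p_{im}}{\Delta x_i}\right|_{\text{all else constant}} = \frac{\partial p_{im}}{\partial x_i} = p_{im}\left[\beta_{2m} - \sum_{j=1}^{3} \beta_{2j}\, p_{ij}\right] \tag{16.20}$$
$$\Delta \tilde p_1 = \tilde p_{b1} - \tilde p_{a1} = \frac{1}{1 + \exp(\tilde\beta_{12} + \tilde\beta_{22} x_b) + \exp(\tilde\beta_{13} + \tilde\beta_{23} x_b)} - \frac{1}{1 + \exp(\tilde\beta_{12} + \tilde\beta_{22} x_a) + \exp(\tilde\beta_{13} + \tilde\beta_{23} x_a)}$$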

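$$\frac{P(y_i = j)}{P(y_i = 1)} = \frac{p_{ij}}{p_{i1}} = \exp(\beta_{1j} + \beta_{2j} x_i), \quad j = 2, 3 \tag{16.21}$$
$$\frac{\partial (p_{ij} / p_{i1})}{\partial x_i} = \beta_{2j} \exp(\beta_{1j} + \beta_{2j} x_i), \quad j = 2, 3 \tag{16.22}$$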
An interesting feature of the odds ratio (16.21) is that the odds of choosing
alternative j rather than alternative 1 do not depend on how many alternatives
there are in total. There is the implicit assumption in logit models that the odds
between any pair of alternatives are independent of irrelevant alternatives (IIA).
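The marginal effect (16.20) and the IIA property of the odds ratio (16.21) can both be checked numerically. In this sketch the coefficients are hypothetical, alternative 1 is the base with zero coefficients, and the analytic derivative is compared with a finite-difference approximation.

```python
import math


def mnl_probs(x, betas):
    """Multinomial logit probabilities; alternative 1 is the base (zero coefficients)."""
    indexes = [0.0] + [b1 + b2 * x for b1, b2 in betas]
    denominator = sum(math.exp(v) for v in indexes)
    return [math.exp(v) / denominator for v in indexes]


def mnl_marginal_effects(x, betas):
    """Equation (16.20): p_m * (beta_2m - sum_j beta_2j * p_j) for each alternative m."""
    probs = mnl_probs(x, betas)
    slopes = [0.0] + [b2 for _, b2 in betas]
    weighted_slope = sum(b * p for b, p in zip(slopes, probs))
    return [p * (b - weighted_slope) for p, b in zip(probs, slopes)]


# Hypothetical coefficients for alternatives 2 and 3.
betas = [(0.5, -0.1), (1.0, -0.3)]
x0, h = 4.0, 1e-6

analytic = mnl_marginal_effects(x0, betas)
numeric = [(hi - lo) / (2 * h)
           for hi, lo in zip(mnl_probs(x0 + h, betas), mnl_probs(x0 - h, betas))]
print("analytic:", analytic)
print("numeric: ", numeric)

# IIA: the odds of alternative 2 versus 1 are unchanged when alternative 3 is removed.
three = mnl_probs(x0, betas)
two = mnl_probs(x0, betas[:1])
print("odds with 3 alternatives:", three[1] / three[0])
print("odds with 2 alternatives:", two[1] / two[0])
```

The marginal effects sum to zero across alternatives, since the probabilities must always sum to one.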
Example: choice between three types (J = 3) of soft drinks, say Pepsi,
7-Up and Coke Classic.
Let yi1, yi2 and yi3 be dummy variables that indicate the choice made
by individual i. The price facing individual i for brand j is PRICEij.
Variables like price are said to be individual and alternative specific,
because they vary from individual to individual and are different for
each choice the consumer might make.
$$p_{ij} = P[\text{individual } i \text{ chooses alternative } j]$$
$$p_{ij} = \frac{\exp(\beta_{1j} + \beta_2\, PRICE_{ij})}{\exp(\beta_{11} + \beta_2\, PRICE_{i1}) + \exp(\beta_{12} + \beta_2\, PRICE_{i2}) + \exp(\beta_{13} + \beta_2\, PRICE_{i3})} \tag{16.23}$$
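Setting $\beta_{13} = 0$, the likelihood for three individuals choosing alternatives 1, 2 and 3, respectively, is
$$P(y_{11} = 1, y_{22} = 1, y_{33} = 1) = p_{11} \times p_{22} \times p_{33}$$
$$= \frac{\exp(\beta_{11} + \beta_2\, PRICE_{11})}{\exp(\beta_{11} + \beta_2\, PRICE_{11}) + \exp(\beta_{12} + \beta_2\, PRICE_{12}) + \exp(\beta_2\, PRICE_{13})}$$
$$\times \frac{\exp(\beta_{12} + \beta_2\, PRICE_{22})}{\exp(\beta_{11} + \beta_2\, PRICE_{21}) + \exp(\beta_{12} + \beta_2\, PRICE_{22}) + \exp(\beta_2\, PRICE_{23})}$$
$$\times \frac{\exp(\beta_2\, PRICE_{33})}{\exp(\beta_{11} + \beta_2\, PRICE_{31}) + \exp(\beta_{12} + \beta_2\, PRICE_{32}) + \exp(\beta_2\, PRICE_{33})}$$
$$= L(\beta_{11}, \beta_{12}, \beta_2)$$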

The own price effect is:
$$\frac{\partial p_{ij}}{\partial PRICE_{ij}} = p_{ij}(1 - p_{ij})\,\beta_2 \tag{16.24}$$
The cross price effect is:
$$\frac{\partial p_{ij}}{\partial PRICE_{ik}} = -p_{ij}\, p_{ik}\, \beta_2 \tag{16.25}$$
$$\frac{p_{ij}}{p_{ik}} = \frac{\exp(\beta_{1j} + \beta_2\, PRICE_{ij})}{\exp(\beta_{1k} + \beta_2\, PRICE_{ik})} = \exp\left[(\beta_{1j} - \beta_{1k}) + \beta_2 (PRICE_{ij} - PRICE_{ik})\right]$$
The odds ratio depends on the difference in prices, but not on the prices
themselves. As in the multinomial logit model this ratio does not depend on
the total number of alternatives, and there is the implicit assumption of the
independence of irrelevant alternatives (IIA).
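The conditional logit probabilities (16.23), the price effects (16.24) and (16.25), and the odds ratio can be sketched in a few lines of Python. The coefficient values below are hypothetical, with the third intercept normalized to zero, and the prices are example values only.

```python
import math


def clogit_probs(prices, intercepts, beta2):
    """Conditional logit probabilities (16.23) for one individual."""
    terms = [math.exp(a + beta2 * p) for a, p in zip(intercepts, prices)]
    total = sum(terms)
    return [t / total for t in terms]


def price_effect(prices, intercepts, beta2, j, k):
    """Derivative of p_j with respect to the price of alternative k."""
    probs = clogit_probs(prices, intercepts, beta2)
    if j == k:
        return probs[j] * (1.0 - probs[j]) * beta2   # own price effect (16.24)
    return -probs[j] * probs[k] * beta2              # cross price effect (16.25)


# Hypothetical coefficients; the third intercept is normalized to zero.
intercepts = [0.3, 0.1, 0.0]
beta2 = -2.0
prices = [1.00, 1.25, 1.10]

probs = clogit_probs(prices, intercepts, beta2)
own = price_effect(prices, intercepts, beta2, 0, 0)
cross = price_effect(prices, intercepts, beta2, 1, 0)
print("probabilities:", probs)
print("own price effect:", own, " cross price effect:", cross)

# The odds of alternative 1 versus 2 depend only on the price difference.
shifted = clogit_probs([1.50, 1.75, 3.00], intercepts, beta2)
print(probs[0] / probs[1], shifted[0] / shifted[1])
```

With a negative price coefficient the own price effect is negative and the cross price effects are positive, and for any one price the effects across all alternatives sum to zero.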
The predicted probability of a Pepsi purchase, given that the price of
Pepsi is $1, the price of 7-Up is $1.25 and the price of Coke is $1.10
is:
$$\hat p_{i1} = \frac{\exp(\tilde\beta_{11} + \tilde\beta_2 \times 1.00)}{\exp(\tilde\beta_{11} + \tilde\beta_2 \times 1.00) + \exp(\tilde\beta_{12} + \tilde\beta_2 \times 1.25) + \exp(\tilde\beta_2 \times 1.10)} = .4832$$
The choice options in multinomial and conditional logit models have
no natural ordering or arrangement. However, in some cases choices
are ordered in a specific way. Examples include:
1. Results of opinion surveys in which responses can be strongly
disagree, disagree, neutral, agree or strongly agree.
2. Assignment of grades or work performance ratings. Students receive
grades A, B, C, D, F which are ordered on the basis of a teacher’s
evaluation of their performance. Employees are often given
evaluations on scales such as Outstanding, Very Good, Good, Fair
and Poor which are similar in spirit.
3. Standard and Poor’s rates bonds as AAA, AA, A, BBB and so on, as
a judgment about the credit worthiness of the company or country
issuing a bond, and how risky the investment might be.
4. Levels of employment are unemployed, part-time, or full-time.
When modeling these types of outcomes numerical values are
assigned to the outcomes, but the numerical values are ordinal, and
reflect only the ranking of the outcomes.
Example:
$$y = \begin{cases} 1 & \text{strongly disagree} \\ 2 & \text{disagree} \\ 3 & \text{neutral} \\ 4 & \text{agree} \\ 5 & \text{strongly agree} \end{cases}$$
$$y = \begin{cases} 3 & \text{4-year college (the full college experience)} \\ 2 & \text{2-year college (a partial college experience)} \\ 1 & \text{no college} \end{cases} \tag{16.26}$$
The usual linear regression model is not appropriate for such data, because
in regression we would treat the y values as having some numerical
meaning when they do not.
$$y_i^* = \beta\, GRADES_i + e_i$$
$$y = \begin{cases} 3 \text{ (4-year college)} & \text{if } y_i^* > \mu_2 \\ 2 \text{ (2-year college)} & \text{if } \mu_1 < y_i^* \le \mu_2 \\ 1 \text{ (no college)} & \text{if } y_i^* \le \mu_1 \end{cases}$$
Figure 16.2 Ordinal Choices Relation to Thresholds
$$P(y_i = 1) = P(y_i^* \le \mu_1) = P(\beta\, GRADES_i + e_i \le \mu_1) = P(e_i \le \mu_1 - \beta\, GRADES_i) = \Phi(\mu_1 - \beta\, GRADES_i)$$
$$P(y_i = 2) = P(\mu_1 < y_i^* \le \mu_2) = P(\mu_1 < \beta\, GRADES_i + e_i \le \mu_2) = P(\mu_1 - \beta\, GRADES_i < e_i \le \mu_2 - \beta\, GRADES_i) = \Phi(\mu_2 - \beta\, GRADES_i) - \Phi(\mu_1 - \beta\, GRADES_i)$$
$$P(y_i = 3) = P(y_i^* > \mu_2) = P(\beta\, GRADES_i + e_i > \mu_2) = P(e_i > \mu_2 - \beta\, GRADES_i) = 1 - \Phi(\mu_2 - \beta\, GRADES_i)$$
$$L(\beta, \mu_1, \mu_2) = P(y_1 = 1) \times P(y_2 = 2) \times P(y_3 = 3)$$
The parameters are obtained by maximizing the log-likelihood
function using numerical methods. Most software includes options for
both ordered probit, which depends on the errors being standard
normal, and ordered logit, which depends on the assumption that the
random errors follow a logistic distribution.
The types of questions we can answer with this model are:
1. What is the probability that a high-school graduate with GRADES =
2.5 (on a 13 point scale, with 1 being the highest) will attend a
2-year college? The answer is obtained by plugging in the specific
value of GRADES into the predicted probability based on the
maximum likelihood estimates of the parameters,
$$\hat P(y = 2 \mid GRADES = 2.5) = \Phi(\tilde\mu_2 - \tilde\beta \times 2.5) - \Phi(\tilde\mu_1 - \tilde\beta \times 2.5)$$
2. What is the difference in probability of attending a 4-year college for
two students, one with GRADES = 2.5 and another with GRADES =
4.5? Since a 4-year college is outcome y = 3 in (16.26), the difference
in the probabilities is calculated directly as
$$\hat P(y = 3 \mid GRADES = 4.5) - \hat P(y = 3 \mid GRADES = 2.5)$$
3. If we treat GRADES as a continuous variable, what is the marginal
effect on the probability of each outcome, given a 1-unit change in
GRADES? These derivatives are:
$$\frac{\partial P(y = 1)}{\partial GRADES} = -\phi(\mu_1 - \beta\, GRADES) \times \beta$$
$$\frac{\partial P(y = 2)}{\partial GRADES} = \left[\phi(\mu_1 - \beta\, GRADES) - \phi(\mu_2 - \beta\, GRADES)\right] \times \beta$$
$$\frac{\partial P(y = 3)}{\partial GRADES} = \phi(\mu_2 - \beta\, GRADES) \times \beta$$
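The three questions above can be answered with a short Python sketch of the ordered probit probabilities and their derivatives. The parameter values `BETA`, `MU1` and `MU2` are hypothetical placeholders, not the estimates from the text.

```python
import math


def norm_pdf(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical parameter values, for illustration only.
BETA, MU1, MU2 = -0.3, -2.9, -2.1


def ordered_probs(grades):
    """P(y = 1), P(y = 2), P(y = 3) for a given value of GRADES."""
    c1 = norm_cdf(MU1 - BETA * grades)
    c2 = norm_cdf(MU2 - BETA * grades)
    return [c1, c2 - c1, 1.0 - c2]


def ordered_marginal_effects(grades):
    """Derivatives of the three probabilities with respect to GRADES."""
    d1 = norm_pdf(MU1 - BETA * grades)
    d2 = norm_pdf(MU2 - BETA * grades)
    return [-d1 * BETA, (d1 - d2) * BETA, d2 * BETA]


# Question 1: probability of a 2-year college at GRADES = 2.5.
print("P(y = 2 | 2.5):", ordered_probs(2.5)[1])

# Question 2: change in the 4-year college probability between GRADES 2.5 and 4.5.
print("difference:", ordered_probs(4.5)[2] - ordered_probs(2.5)[2])

# Question 3: marginal effects at GRADES = 2.5.
print("marginal effects:", ordered_marginal_effects(2.5))
```

With a negative β, a larger value of GRADES (a worse record on this scale) lowers the probability of the highest outcome, and the three marginal effects sum to zero.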
When the dependent variable in a regression model is a count of the number
of occurrences of an event, the outcome variable is y = 0, 1, 2, 3, … These
numbers are actual counts, and thus different from the ordinal numbers of
the previous section. Examples include:

The number of trips to a physician a person makes during a year.

The number of fishing trips taken by a person during the previous year.

The number of children in a household.

The number of automobile accidents at a particular intersection during a
month.

The number of televisions in a household.

The number of alcoholic drinks a college student takes in a week.
If Y is a Poisson random variable, then its probability function is
$$f(y) = P(Y = y) = \frac{e^{-\lambda} \lambda^{y}}{y!}, \quad y = 0, 1, 2, \ldots \tag{16.27}$$
$$y! = y \times (y - 1) \times (y - 2) \times \cdots \times 1$$
$$E(Y) = \lambda = \exp(\beta_1 + \beta_2 x) \tag{16.28}$$
This choice defines the Poisson regression model for count data.
$$L(\beta_1, \beta_2) = P(Y = 0) \times P(Y = 2) \times P(Y = 2)$$
$$\ln L(\beta_1, \beta_2) = \ln P(Y = 0) + \ln P(Y = 2) + \ln P(Y = 2)$$
$$\ln[P(Y = y)] = \ln\left[\frac{e^{-\lambda} \lambda^{y}}{y!}\right] = -\lambda + y \ln(\lambda) - \ln(y!) = -\exp(\beta_1 + \beta_2 x) + y(\beta_1 + \beta_2 x) - \ln(y!)$$
$$\ln L(\beta_1, \beta_2) = \sum_{i=1}^{N} \left\{ -\exp(\beta_1 + \beta_2 x_i) + y_i(\beta_1 + \beta_2 x_i) - \ln(y_i!) \right\}$$
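The log-likelihood above translates directly into code. In this sketch the counts 0, 2, 2 match the example, while the x values and the trial parameters are hypothetical; `math.lgamma(y + 1)` is used for ln(y!).

```python
import math


def poisson_loglik(beta1, beta2, ys, xs):
    """Poisson regression log-likelihood: sum of -lambda_i + y_i*ln(lambda_i) - ln(y_i!)."""
    total = 0.0
    for y, x in zip(ys, xs):
        index = beta1 + beta2 * x            # ln(lambda_i)
        total += -math.exp(index) + y * index - math.lgamma(y + 1)
    return total


# Counts from the example; the x values and trial parameters are hypothetical.
ys = [0, 2, 2]
xs = [1.0, 2.0, 3.0]
beta1, beta2 = -0.5, 0.4

loglik = poisson_loglik(beta1, beta2, ys, xs)

# Cross-check against the log of the probability function (16.27) itself.
direct = 0.0
for y, x in zip(ys, xs):
    lam = math.exp(beta1 + beta2 * x)
    direct += math.log(math.exp(-lam) * lam ** y / math.factorial(y))

print(loglik, direct)
```

Maximum likelihood estimation would pass a function like this to a numerical optimizer; the term ln(y!) does not involve the parameters and does not affect the location of the maximum.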

$$\widehat{E(y_0)} = \tilde\lambda_0 = \exp(\tilde\beta_1 + \tilde\beta_2 x_0)$$
$$\widehat{\Pr}(Y = y) = \frac{\exp(-\tilde\lambda_0)\, \tilde\lambda_0^{\,y}}{y!}, \quad y = 0, 1, 2, \ldots$$
$$\frac{\partial E(y_i)}{\partial x_i} = \lambda_i \beta_2 \tag{16.29}$$
$$\frac{\%\Delta E(y_i)}{\Delta x_i} = 100\, \frac{\partial E(y_i) / E(y_i)}{\partial x_i} = 100\,\beta_2\, \%$$
$$E(y_i) = \lambda_i = \exp(\beta_1 + \beta_2 x_i + \delta D_i)$$
$$E(y_i \mid D_i = 0) = \exp(\beta_1 + \beta_2 x_i)$$
$$E(y_i \mid D_i = 1) = \exp(\beta_1 + \beta_2 x_i + \delta)$$
$$100 \left[ \frac{\exp(\beta_1 + \beta_2 x_i + \delta) - \exp(\beta_1 + \beta_2 x_i)}{\exp(\beta_1 + \beta_2 x_i)} \right] \% = 100 \left[ e^{\delta} - 1 \right] \%$$
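The marginal effect (16.29) and the percentage effect of a dummy variable can be verified numerically. All parameter values in this sketch are hypothetical.

```python
import math


def poisson_mean(beta1, beta2, delta, x, d):
    """E(y) = exp(beta1 + beta2*x + delta*d) for a dummy variable d in {0, 1}."""
    return math.exp(beta1 + beta2 * x + delta * d)


def dummy_percent_effect(delta):
    """Percentage change in E(y) when the dummy switches from 0 to 1: 100*(e^delta - 1)."""
    return 100.0 * (math.exp(delta) - 1.0)


# Hypothetical parameter values, for illustration only.
beta1, beta2, delta, x = 0.2, 0.05, 0.2, 10.0

# Marginal effect of x, equation (16.29), compared with a finite difference.
lam = poisson_mean(beta1, beta2, delta, x, 0)
analytic = lam * beta2
h = 1e-6
numeric = (poisson_mean(beta1, beta2, delta, x + h, 0)
           - poisson_mean(beta1, beta2, delta, x - h, 0)) / (2 * h)

# Percentage effect of the dummy, computed directly from the two conditional means.
mean0 = poisson_mean(beta1, beta2, delta, x, 0)
mean1 = poisson_mean(beta1, beta2, delta, x, 1)
direct = 100.0 * (mean1 - mean0) / mean0

print(analytic, numeric)
print(direct, dummy_percent_effect(delta))
```

The percentage effect of the dummy does not depend on x, because the common factor exp(β1 + β2x) cancels in the ratio.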

16.7.1 Censored Data
Figure 16.3 Histogram of Wife’s Hours of Work in 1975
Having censored data means that a substantial fraction of the
observations on the dependent variable take a limit value. The
regression function is no longer given by (16.30).
$$E(y \mid x) = \beta_1 + \beta_2 x \tag{16.30}$$
The least squares estimators of the regression parameters obtained by
running a regression of y on x are biased and inconsistent—least
squares estimation fails.
We give the parameters the specific values $\beta_1 = -9$ and $\beta_2 = 1$.
$$y_i^* = \beta_1 + \beta_2 x_i + e_i = -9 + x_i + e_i \tag{16.31}$$
Assume $e_i \sim N(0, \sigma^2 = 16)$.
$$y_i = 0 \text{ if } y_i^* \le 0; \qquad y_i = y_i^* \text{ if } y_i^* > 0.$$

- Create N = 200 random values of $x_i$ that are spread evenly (or
  uniformly) over the interval [0, 20]. These we will keep fixed in
  further simulations.
- Obtain N = 200 random values $e_i$ from a normal distribution with
  mean 0 and variance 16.
- Create N = 200 values of the latent variable.
- Obtain N = 200 values of the observed $y_i$ using
$$y_i = \begin{cases} 0 & \text{if } y_i^* \le 0 \\ y_i^* & \text{if } y_i^* > 0 \end{cases}$$
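The simulation steps can be sketched with the Python standard library. The seed and the helper `ols` are illustrative choices, so the numbers produced will not match the sample used in the slides.

```python
import random


def ols(xs, ys):
    """Least squares intercept and slope for a simple regression of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


rng = random.Random(12345)   # arbitrary seed so the sketch is reproducible
N = 200

xs = [rng.uniform(0.0, 20.0) for _ in range(N)]          # x uniform on [0, 20]
es = [rng.gauss(0.0, 4.0) for _ in range(N)]             # sd 4, so variance 16
y_latent = [-9.0 + x + e for x, e in zip(xs, es)]        # latent variable
ys_observed = [max(0.0, y) for y in y_latent]            # censor at zero

b1_latent, b2_latent = ols(xs, y_latent)
b1_censored, b2_censored = ols(xs, ys_observed)

print("latent data:   intercept", b1_latent, "slope", b2_latent)
print("censored data: intercept", b1_censored, "slope", b2_censored)
```

Comparing the two fits shows the point of the exercise: least squares on the latent data recovers values near the true −9 and 1, while least squares on the censored data gives a flatter line.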
Figure 16.4 Uncensored Sample Data and Regression Function
Figure 16.5 Censored Sample Data, and Latent Regression Function and
Least Squares Fitted Line
$$\hat y_i = -2.1477 + .5161\, x_i \tag{16.32a}$$
(se)     (.3706)   (.0326)
$$\hat y_i = -3.1399 + .6388\, x_i \tag{16.32b}$$
(se)     (1.2055)  (.0827)
$$E_{MC}(b_k) = \frac{1}{NSAM} \sum_{m=1}^{NSAM} b_{k(m)} \tag{16.33}$$
The maximum likelihood procedure is called Tobit in honor of James
Tobin, winner of the 1981 Nobel Prize in Economics, who first
studied this model.
The probit probability that yi = 0 is:
$$P(y_i = 0) = P[y_i^* \le 0] = 1 - \Phi\left[\frac{\beta_1 + \beta_2 x_i}{\sigma}\right]$$
$$L(\beta_1, \beta_2, \sigma) = \prod_{y_i = 0} \left\{ 1 - \Phi\left(\frac{\beta_1 + \beta_2 x_i}{\sigma}\right) \right\} \times \prod_{y_i > 0} \left\{ (2\pi\sigma^2)^{-1/2} \exp\left( -\frac{1}{2\sigma^2} (y_i - \beta_1 - \beta_2 x_i)^2 \right) \right\}$$
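A minimal sketch of the Tobit log-likelihood follows. It simulates its own censored sample from the design in (16.31), with an arbitrary seed, so the sample differs from the one in the slides; the second evaluation uses the least squares values from (16.32a) only as a comparison point, with σ held at 4.

```python
import math
import random


def upper_tail(z):
    """1 - Phi(z), computed with erfc to stay accurate when z is large."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def tobit_loglik(beta1, beta2, sigma, ys, xs):
    """Log of the Tobit likelihood: probit-type terms for y = 0, normal density for y > 0."""
    loglik = 0.0
    for y, x in zip(ys, xs):
        mean = beta1 + beta2 * x
        if y <= 0:
            loglik += math.log(upper_tail(mean / sigma))
        else:
            loglik += (-0.5 * math.log(2.0 * math.pi * sigma ** 2)
                       - (y - mean) ** 2 / (2.0 * sigma ** 2))
    return loglik


# Simulated censored sample following (16.31); the seed is arbitrary.
rng = random.Random(2024)
xs = [rng.uniform(0.0, 20.0) for _ in range(200)]
ys = [max(0.0, -9.0 + x + rng.gauss(0.0, 4.0)) for x in xs]

ll_true = tobit_loglik(-9.0, 1.0, 4.0, ys, xs)
ll_ols = tobit_loglik(-2.1477, 0.5161, 4.0, ys, xs)
print("log-likelihood at the true parameters:  ", ll_true)
print("log-likelihood at the least squares fit:", ll_ols)
```

A numerical optimizer applied to `tobit_loglik` over all three parameters would give the Tobit estimates; the comparison here only illustrates that the likelihood favors the latent-model parameters over the least squares values.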
The maximum likelihood estimator is consistent and asymptotically
normal, with a known covariance matrix.
Using the artificial data the fitted values are:
$$\tilde y_i = -10.2773 + 1.0487\, x_i \tag{16.34}$$
(se)     (1.0970)  (.0790)
$$\frac{\partial E(y \mid x)}{\partial x} = \beta_2\, \Phi\left(\frac{\beta_1 + \beta_2 x}{\sigma}\right) \tag{16.35}$$
Because the cdf values are positive, the sign of the coefficient does
tell the direction of the marginal effect, just not its magnitude. If
β2 > 0, as x increases the cdf function approaches 1, and the slope of
the regression function approaches that of the latent variable model.
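The scaling in (16.35) can be illustrated with the parameter values of the simulation design (β1 = −9, β2 = 1, σ = 4); the function name is an illustrative choice.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_marginal_effect(beta1, beta2, sigma, x):
    """Equation (16.35): slope of E(y|x) in the censored regression model."""
    return beta2 * norm_cdf((beta1 + beta2 * x) / sigma)


# Parameter values from the simulation design in (16.31).
beta1, beta2, sigma = -9.0, 1.0, 4.0

for x in (0.0, 9.0, 20.0):
    effect = tobit_marginal_effect(beta1, beta2, sigma, x)
    print(f"x = {x:4.1f}  marginal effect = {effect:.4f}")
```

At x = 9 the latent mean is zero, so the marginal effect is exactly half of β2; for large x almost no observations are censored and the effect approaches β2.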
Figure 16.6 Censored Sample Data, and Regression Functions
for Observed and Positive y values
$$HOURS = \beta_1 + \beta_2\, EDUC + \beta_3\, EXPER + \beta_4\, AGE + \beta_5\, KIDSL6 + e \tag{16.36}$$
$$\frac{\partial E(HOURS)}{\partial EDUC} = \tilde\beta_2\, \tilde\Phi = 73.29 \times .3638 = 26.34$$

Problem: our sample is not a random sample. The data we observe are
“selected” by a systematic process for which we do not account.

Solution: a technique called Heckit, named after its developer, Nobel
Prize winning econometrician James Heckman.

The econometric model describing the situation is composed of two
equations. The first is the selection equation that determines
whether the variable of interest is observed.
$$z_i^* = \gamma_1 + \gamma_2 w_i + u_i, \quad i = 1, \ldots, N \tag{16.37}$$
$$z_i = \begin{cases} 1 & z_i^* > 0 \\ 0 & \text{otherwise} \end{cases} \tag{16.38}$$

The second equation is the linear model of interest. It is
$$y_i = \beta_1 + \beta_2 x_i + e_i, \quad i = 1, \ldots, n, \quad N > n \tag{16.39}$$
$$E[y_i \mid z_i^* > 0] = \beta_1 + \beta_2 x_i + \beta_\lambda \lambda_i, \quad i = 1, \ldots, n \tag{16.40}$$
$$\lambda_i = \frac{\phi(\gamma_1 + \gamma_2 w_i)}{\Phi(\gamma_1 + \gamma_2 w_i)} \tag{16.41}$$

The estimated “Inverse Mills Ratio” is
$$\tilde\lambda_i = \frac{\phi(\tilde\gamma_1 + \tilde\gamma_2 w_i)}{\Phi(\tilde\gamma_1 + \tilde\gamma_2 w_i)}$$

The estimating equation is
$$y_i = \beta_1 + \beta_2 x_i + \beta_\lambda \tilde\lambda_i + v_i, \quad i = 1, \ldots, n \tag{16.42}$$
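The first-step calculation of the inverse Mills ratio is simple to sketch. The probit coefficients `gamma1` and `gamma2` and the values of w below are hypothetical; in practice they come from a probit estimation of the selection equation.

```python
import math


def norm_pdf(z):
    """Standard normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills_ratio(index):
    """lambda = phi(index) / Phi(index), evaluated at the probit index."""
    return norm_pdf(index) / norm_cdf(index)


# Hypothetical first-step probit estimates and selection variable values.
gamma1, gamma2 = 0.5, -0.25
ws = [0.0, 2.0, 6.0]

imr_values = [inverse_mills_ratio(gamma1 + gamma2 * w) for w in ws]
for w, imr in zip(ws, imr_values):
    print(f"w = {w:3.1f}  IMR = {imr:.4f}")
```

The resulting values would be added as an extra regressor in the estimating equation (16.42) for the selected observations. The ratio is larger when the probability of being selected is smaller.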
$$\ln(WAGE) = -.4002 + .1095\, EDUC + .0157\, EXPER, \quad R^2 = .1484 \tag{16.43}$$
(t-stat)   (−2.10)   (7.73)   (3.90)
$$P(LFP = 1) = \Phi(1.1923 - .0206\, AGE + .0838\, EDUC - .3139\, KIDS - 1.3939\, MTR)$$
(t-stat)   (−2.93)   (3.61)   (−2.54)   (−2.26)
$$\tilde\lambda = IMR = \frac{\phi(1.1923 - .0206\, AGE + .0838\, EDUC - .3139\, KIDS - 1.3939\, MTR)}{\Phi(1.1923 - .0206\, AGE + .0838\, EDUC - .3139\, KIDS - 1.3939\, MTR)}$$
$$\ln(WAGE) = .8105 + .0585\, EDUC + .0163\, EXPER - .8664\, IMR \tag{16.44}$$
(t-stat)       (1.64)   (2.45)   (4.08)   (−2.65)
(t-stat-adj)   (1.33)   (1.97)   (3.88)   (−2.17)
The maximum likelihood estimated wage equation is
$$\ln(WAGE) = .6686 + .0658\, EDUC + .0118\, EXPER$$
(t-stat)   (2.84)   (3.96)   (2.87)
The standard errors based on the full information maximum likelihood
procedure are smaller than those yielded by the two-step estimation method.















binary choice models
censored data
conditional logit
count data models
feasible generalized least squares
Heckit
identification problem
independence of irrelevant alternatives (IIA)
index models
individual and alternative specific variables
individual specific variables
latent variables
likelihood function
limited dependent variables
linear probability model
logistic random variable
logit
log-likelihood function
marginal effect
maximum likelihood estimation
multinomial choice models
multinomial logit
odds ratio
ordered choice models
ordered probit
ordinal variables
Poisson random variable
Poisson regression model
probit
selection bias
tobit model
truncated data