Chapter 14 powerpoint slides for fixed effects and time trends

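[Figure: traffic fatality rate over time for Nebraska and Colorado; a marker shows when Colorado implements the policy.]
Difference-in-difference model with two states and two time periods. The estimating equation is:
$y_{st} = \alpha_0 + \delta\,(\text{Treatment}_s \times \text{Post}_t) + \varepsilon_{st}$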
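[Figure: crime rate over time for Nebraska and Colorado; a marker shows when Colorado implements the policy. Annotations: one level is labeled "Average $y_{st}$ is $\alpha_0$", the other "Average $y_{st}$ is $\alpha_0 + \delta$" (note that $\delta$ is negative here); the gap between them is $\delta$.]
Suppose the estimating equation is:
$y_{st} = \alpha_0 + \delta\,\text{Treated}_{st} + \varepsilon_{st}$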
[Figure: traffic fatality rate over time for Nebraska and Colorado; a marker shows when Colorado implements the policy. Annotations: Nebraska is at $\alpha_0$; Colorado is at $\alpha_0 + \alpha_1$ before the policy and at $\alpha_0 + \alpha_1 + \delta$ after it, so the change in Colorado is $\delta$.]
Difference-in-difference model with two states and two time periods. The estimating equation is:
$y_{st} = \alpha_0 + \alpha_1\,\text{Treatment}_s + \delta\,(\text{Treatment}_s \times \text{Post}_t) + \varepsilon_{st}$
[Figure: traffic fatality rate over time for Nebraska and Colorado; a marker shows when Colorado implements the policy. Annotations: before the policy Nebraska is at $\alpha_0$ and Colorado at $\alpha_0 + \alpha_1$; afterward Nebraska changes by $\alpha_2$ and Colorado by $\alpha_2 + \delta$, so $\delta$ is the change in Colorado beyond the common change $\alpha_2$.]
Difference-in-difference model with two states and two time periods. The estimating equation is:
$y_{st} = \alpha_0 + \alpha_1\,\text{Treatment}_s + \alpha_2\,\text{Post}_t + \delta\,(\text{Treatment}_s \times \text{Post}_t) + \varepsilon_{st}$
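[Figure: traffic fatality rate over time for Nebraska and Colorado, with the common change $\alpha_2$ and Colorado's additional change $\delta$ marked; a marker shows when Colorado implements the policy.]
This estimating equation is exactly the same as:
$y_{ist} = \alpha_0 + \alpha_1\,dCO_s + \alpha_2\,d2_t + \delta\,\text{Policy}_{st} + \varepsilon_{ist}$
where $dCO_s$ is a dummy for Colorado, $d2_t$ is a dummy for year 2, and $\text{Policy}_{st}$ is a dummy equal to 1 in Colorado after the policy is passed.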
[Figure: traffic fatality rate over time for Colorado and Nebraska after subtracting each state's mean; annotations mark $\alpha_2$ and $\delta$.]
The estimate of $\delta$ is exactly the same as that obtained after subtracting the mean for each state:
$y_{ist} - \bar{y}_s = \alpha_2\,d2_t + \delta\,\text{Policy}_{st} + \varepsilon_{ist} - \bar{\varepsilon}_s$
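[Figure: traffic fatality rate over time for Colorado and Nebraska after also subtracting each period's mean; the remaining gap is $\delta$.]
And $\delta$ is exactly the same after subtracting the mean for each time period:
$y_{ist} - \bar{y}_s - \bar{y}_t = \delta\,\text{Policy}_{st} + \varepsilon_{ist} - \bar{\varepsilon}_s - \bar{\varepsilon}_t$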
When we have lots of states and years, an author typically writes
$y_{st} = \beta_0 + \delta\,\text{Policy}_{st} + v_s + z_t + \varepsilon_{st}$
and then might say that the equation is estimated including state and year fixed effects.
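A minimal sketch of what "including state and year fixed effects" can mean in practice, assuming a small balanced panel with simulated data. The panel size, adoption years and true $\delta$ are all made up for illustration. It estimates $\delta$ two ways: with explicit state and year dummies, and after subtracting state and year means.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical balanced panel: 5 states x 6 years (all values simulated).
S, T = 5, 6
s_idx = np.repeat(np.arange(S), T)          # state index for each row
t_idx = np.tile(np.arange(T), S)            # year index for each row

# Assumed adoption years; a value of T means the state never adopts.
adopt = np.array([T, 2, 4, 3, T])
policy = (t_idx >= adopt[s_idx]).astype(float)

true_delta = -2.0
v = rng.normal(size=S)                      # state effects v_s
z = rng.normal(size=T)                      # year effects z_t
y = (10 + true_delta * policy + v[s_idx] + z[t_idx]
     + rng.normal(scale=0.1, size=S * T))

# (a) Dummy-variable regression: constant, Policy, S-1 state and T-1 year dummies.
state_d = (s_idx[:, None] == np.arange(1, S)).astype(float)
year_d = (t_idx[:, None] == np.arange(1, T)).astype(float)
X = np.column_stack([np.ones(S * T), policy, state_d, year_d])
delta_lsdv = np.linalg.lstsq(X, y, rcond=None)[0][1]

# (b) Subtract state means and year means (adding back the grand mean only
#     shifts the constant), then regress demeaned y on demeaned Policy.
def demean_two_way(a):
    grid = a.reshape(S, T)                  # grid[s, t]
    return (grid - grid.mean(axis=1, keepdims=True)
                 - grid.mean(axis=0, keepdims=True) + grid.mean()).ravel()

y_dm, p_dm = demean_two_way(y), demean_two_way(policy)
delta_within = float(p_dm @ y_dm / (p_dm @ p_dm))

print(delta_lsdv, delta_within)
```

In a balanced panel like this one the two estimates agree up to rounding error. With unbalanced data the simple double-demeaning shortcut generally does not match the dummy-variable regression.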
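Start with state fixed effects, common time trend
[Figure: traffic fatality rate over time for Nebraska and Colorado along a common time trend; a marker shows when Colorado implements the policy. Annotations: intercepts $\alpha_0$ (Nebraska) and $\alpha_0 + \alpha_1$ (Colorado).]
If we’re using state-level data, then each state contributes one observation per year. The estimating equation with a common time trend is
$y_{ist} = \alpha_0 + \alpha_1\,dCO_s + \alpha_2\,t + \delta\,\text{Policy}_{st} + \varepsilon_{ist}$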
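[Figure: traffic fatality rate over time for Colorado and Nebraska after removing state-specific means; a marker shows when Colorado implements the policy.]
Remove state-specific means → $\delta$ will still be the vertical change:
$y_{ist} - \bar{y}_s = \alpha_2\,t + \delta\,\text{Policy}_{st} + \varepsilon_{ist} - \bar{\varepsilon}_s$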
[Figure: traffic fatality rate over time for Nebraska, Colorado, and New Mexico; markers show when Colorado and New Mexico each implement the policy.]
With multiple states, $\delta$ will be the average vertical change across the states:
$y_{st} = \alpha_0 + \alpha_2\,t + \delta\,\text{Policy}_{st} + v_s + \varepsilon_{st}$
[Figure: traffic fatality rate over time for Nebraska, Colorado, and New Mexico; markers show when Colorado and New Mexico each implement the policy.]
If we’re using repeated cross-sectional data at the individual level, then each state contributes multiple observations per year. The estimating equation is:
$y_{ist} = \alpha_0 + \delta\,\text{Policy}_{st} + \alpha_2\,t + v_s + \varepsilon_{ist}$
Pooled model: $MR_{it} = \beta_0 + \beta_1\,unem_{it} + e_{it}$
id   state   year   murder rate   unem
19   LA      87     11.1          12
19   LA      90     17.2          6.2
19   LA      93     20.3          7.4
5    CA      87     10.6          5.8
5    CA      90     11.9          5.6
5    CA      93     13.1          9.2
FD model: $\Delta MR_{it} = \beta_1\,\Delta unem_{it} + \Delta e_{it}$
id   state   year   murder rate   ΔMR    unem   ΔUN
19   LA      87     11.1          -      12     -
19   LA      90     17.2          6.1    6.2    -5.8
19   LA      93     20.3          3.1    7.4    1.2
5    CA      87     10.6          -      5.8    -
5    CA      90     11.9          1.3    5.6    -0.2
5    CA      93     13.1          1.2    9.2    3.6
FE model: $MR_{it} - \overline{MR}_i = \beta_1\,(unem_{it} - \overline{unem}_i) + e_{it}$
id   state   year   murder rate   MR mean   MR − mean   unem   UN mean   unem − mean
19   LA      87     11.1          16.2      -5.1        12     8.5       3.5
19   LA      90     17.2          16.2      1           6.2    8.5       -2.3
19   LA      93     20.3          16.2      4.1         7.4    8.5       -1.1
5    CA      87     10.6          11.9      -1.3        5.8    6.9       -1.1
5    CA      90     11.9          11.9      0           5.6    6.9       -1.3
5    CA      93     13.1          11.9      1.2         9.2    6.9       2.3
The estimated coefficient on unemployment from the FD and FE models will be the same if there are only 2 years.
Otherwise, both are consistent estimators of $\beta_1$, but their exact values will differ.
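A short check of this claim using the LA and CA numbers from the tables above. This is only a sketch: the helper name `slope_no_intercept` is made up here, and both regressions are run without a constant, as in the FD and FE equations on the slides.

```python
import numpy as np

# Data from the slides: murder rate (MR) and unemployment for years 87, 90, 93.
mr = {"LA": np.array([11.1, 17.2, 20.3]), "CA": np.array([10.6, 11.9, 13.1])}
un = {"LA": np.array([12.0, 6.2, 7.4]), "CA": np.array([5.8, 5.6, 9.2])}

def slope_no_intercept(x, y):
    """OLS slope of y on x with no constant term."""
    return float(x @ y / (x @ x))

# First differences within each state, pooled across states (FD).
d_mr = np.concatenate([np.diff(mr[s]) for s in mr])
d_un = np.concatenate([np.diff(un[s]) for s in un])
beta_fd = slope_no_intercept(d_un, d_mr)

# Deviations from each state's mean, pooled across states (FE / within).
w_mr = np.concatenate([mr[s] - mr[s].mean() for s in mr])
w_un = np.concatenate([un[s] - un[s].mean() for s in un])
beta_fe = slope_no_intercept(w_un, w_mr)

# Same two estimators using only the first two years (87 and 90).
d2_mr = np.concatenate([np.diff(mr[s][:2]) for s in mr])
d2_un = np.concatenate([np.diff(un[s][:2]) for s in un])
w2_mr = np.concatenate([mr[s][:2] - mr[s][:2].mean() for s in mr])
w2_un = np.concatenate([un[s][:2] - un[s][:2].mean() for s in un])
beta_fd_2yr = slope_no_intercept(d2_un, d2_mr)
beta_fe_2yr = slope_no_intercept(w2_un, w2_mr)

print("3 years:", beta_fd, beta_fe)
print("2 years:", beta_fd_2yr, beta_fe_2yr)
```

With all three years the two slopes differ. With only two years each deviation from the mean is half the first difference, so the two slopes coincide.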
FE model: $MR_{it} = \alpha_0 + \beta_1\,unem_{it} + \alpha_1\,State_i + e_{it}$
id   state   year   murder rate   unem   State
19   LA      87     11.1          12     1
19   LA      90     17.2          6.2    1
19   LA      93     20.3          7.4    1
5    CA      87     10.6          5.8    0
5    CA      90     11.9          5.6    0
5    CA      93     13.1          9.2    0
FE model: $MR_{it} = \alpha_0 + \beta_1\,unem_{it} + \alpha_1\,State_i + e_{it}$
[Table: same data as on the previous slide, with State = 1 for LA and 0 for CA.]
• The constant term in this model and the previous version of the FE model will be different (how?)
• The estimated coefficient on unemployment WILL BE EXACTLY THE SAME
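The equality of the two unemployment coefficients can be checked on the six observations from the slides. This is a minimal sketch. The variable names are made up here, and the demeaned regression is run without a constant.

```python
import numpy as np

# Data from the slides (LA rows first, then CA); State = 1 for LA, 0 for CA.
mr = np.array([11.1, 17.2, 20.3, 10.6, 11.9, 13.1])
unem = np.array([12.0, 6.2, 7.4, 5.8, 5.6, 9.2])
state = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])

# Dummy-variable version: MR = a0 + b1 * unem + a1 * State + e
X = np.column_stack([np.ones(6), unem, state])
a0, b1_dummy, a1 = np.linalg.lstsq(X, mr, rcond=None)[0]

# Demeaned version: subtract each state's own mean from MR and unem.
def within(a):
    out = a.copy()
    for g in (0.0, 1.0):
        out[state == g] -= a[state == g].mean()
    return out

mr_w, un_w = within(mr), within(unem)
b1_within = float(un_w @ mr_w / (un_w @ un_w))

print(b1_dummy, b1_within)   # the two slopes should match
print(a0, a1)                # constant and state shift from the dummy version
```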
Now look at time shocks.
What if all states experience a common shock in a given year?
What if the means vary by time as well as by state?
[Figure: crime rate over time for Jefferson and Barber, with markers where each legalizes by-the-drink sales. Common year effects $z_3$ and $z_8$ are marked on both lines and labeled as a positive shock and a negative shock to crime.]
Common time fixed effects: $\delta$ will still be the average vertical change from before to after the policy.
$\text{Violent Crime}_{ct} = \pi_0 + \delta\,\text{Wet Law}_{ct} + X'_{ct}\pi_2 + v_c + z_t + \varepsilon_{ct}$
[Figure: crime rate over time for Jefferson and Barber, with markers where each legalizes by-the-drink sales and labels for a positive shock and a negative shock to crime.]
But note that, in the figure, Barber is increasing faster than Jefferson:
$\text{Violent Crime}_{ct} = \pi_0 + \pi_1\,\text{Wet Law}_{ct} + X'_{ct}\pi_2 + v_c + z_t + \varepsilon_{ct}$
[Figure: crime rate over time for Jefferson, Franklin, and Barber, with markers where each legalizes by-the-drink sales and labels for a positive shock and a negative shock to crime.]
Adding state-specific time trends:
$\text{Violent Crime}_{ct} = \pi_0 + \delta\,\text{Wet Law}_{ct} + X'_{ct}\pi_2 + v_c + z_t + \Theta_c \cdot t + \varepsilon_{ct}$
Meyer et al.
• Workers’ compensation
• State run insurance program
• Compensate workers for medical expenses and lost work due to on the job
accident
• Premiums
• Paid by firms
• Function of previous claims and wages paid
• Benefits -- % of income w/ cap
• Typical benefits schedule
• Benefit = min(pY, C)
• p = percent replacement
• Y = earnings
• C = cap
• e.g., 65% of earnings up to $400/month
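The schedule above can be written as a two-line function. This sketch uses the 65% and $400/month figures from the example. The function names are made up here, and the earnings levels in the loop are arbitrary.

```python
def benefit(earnings: float, p: float = 0.65, cap: float = 400.0) -> float:
    """Monthly benefit under the schedule min(p * Y, C)."""
    return min(p * earnings, cap)


def replacement_rate(earnings: float, p: float = 0.65, cap: float = 400.0) -> float:
    """Share of earnings actually replaced, R = benefit / Y."""
    return benefit(earnings, p, cap) / earnings


# Arbitrary earnings levels, chosen to fall on both sides of the cap.
for y in (300.0, 500.0, 1000.0, 2000.0):
    print(y, benefit(y), round(replacement_rate(y), 3))
```

Once earnings are high enough for the cap to bind, the benefit stops growing and the replacement rate falls as earnings rise. That is why replaced wages reveal a lot about the worker.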
• Concern:
• Moral hazard. Benefits will discourage return to work
• Empirical question: duration/benefits gradient
• Previous estimates
• Regress duration (y) on replaced wages (x)
• Problem:
• given progressive nature of benefits, replaced wages
reveal a lot about the workers
• Replacement rates higher in higher wage states
• $Y_i = X_i\beta + \alpha R_i + \varepsilon_i$
• Y (duration)
• R (replacement rate)
• Expect $\alpha > 0$
• Expect $\text{Cov}(R_i, \varepsilon_i) \neq 0$
• Higher wage workers have lower R and higher duration (understate)
• Higher wage states have longer duration and higher R (overstate)
Solution
• Quasi experiment in KY and MI
• Increased the earnings cap
• Increased benefit for high-wage workers
• (Treatment)
• Did nothing to those already below original cap (comparison)
• Compare change in duration of spell before and after change for
these two groups
Model
• $Y_{it}$ = duration of spell on WC
• $A_{it}$ = period after benefits hike
• $H_{it}$ = high earnings group (Income > E3)
• $Y_{it} = \beta_0 + \beta_1 H_{it} + \beta_2 A_{it} + \beta_3 A_{it}H_{it} + X_{it}'\beta_4 + \varepsilon_{it}$
• Diff-in-diff estimate is $\beta_3$
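A minimal sketch of this regression on simulated data, leaving out the covariates X. All numbers, including the "true" $\beta_3$, are invented and are not from Meyer et al. Without covariates, the coefficient on the interaction equals the difference-in-differences of the four group means.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated claims: H = 1 for the high-earnings group, A = 1 after the benefit hike.
n = 2000
H = rng.integers(0, 2, n).astype(float)
A = rng.integers(0, 2, n).astype(float)
beta3_true = 0.3                                  # assumed effect, for illustration
y = 1.0 + 0.5 * H + 0.2 * A + beta3_true * A * H + rng.normal(size=n)

# OLS of y on a constant, H, A and the interaction A*H.
X = np.column_stack([np.ones(n), H, A, A * H])
beta = np.linalg.lstsq(X, y, rcond=None)[0]

# Difference-in-differences of the four cell means.
def cell_mean(h, a):
    return y[(H == h) & (A == a)].mean()

dd = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

print(beta[3], dd)   # identical up to rounding error
```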
Questions to ask?
• What parameter is identified by the quasi-experiment? Is this an economically meaningful parameter?
• What assumptions must be true in order for the model to provide an unbiased estimate of $\beta_3$?
• Do the authors provide any evidence supporting these assumptions?
Almond et al.
• Neonatal mortality: death in first 28 days
• Infant mortality: death in first year
• Babies born w/ low birth weight (< 2500 grams) are more prone to
• Die early in life
• Have health problems later in life
• Have educational difficulties
• Evidence generated from cross-sectional regressions
• 6% of babies in US are low weight
• Highest rate in the developed world
• Let $Y_{it}$ be outcome for baby t from mother i
• e.g., mortality
• $Y_{it} = \alpha + bw_{it}\beta + X_i\gamma + \alpha_i + \varepsilon_{it}$
• bw is birth weight (grams)
• $X_i$ observed characteristics of moms
• $\alpha_i$ unobserved characteristics of moms
• Cross sectional model is of the form
• $Y_{it} = \alpha + bw_{it}\beta + X_i\gamma + u_{it}$
• where $u_{it} = \alpha_i + \varepsilon_{it}$
• Many observed factors that might explain health (Y) of an infant
• Prenatal care, substance abuse, smoking, weight gain (or lack of it)
• Some unobserved as well
• Quality of diet, exercise, genetic predisposition
• $\alpha_i$ not included in model
• $\text{Cov}(bw_{it}, u_{it}) < 0$
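• Solution: Twins
• Possess same mother, same environmental characteristics
• $Y_{i1} = \alpha + bw_{i1}\beta + X_i\gamma + \alpha_i + \varepsilon_{i1}$
• $Y_{i2} = \alpha + bw_{i2}\beta + X_i\gamma + \alpha_i + \varepsilon_{i2}$
• $\Delta Y = Y_{i2} - Y_{i1} = (bw_{i2} - bw_{i1})\beta + (\varepsilon_{i2} - \varepsilon_{i1})$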
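A small simulation sketch of why differencing within twin pairs helps. Every number below is made up and the units are arbitrary. It only illustrates the algebra above and is not the authors' data or estimates. The mother effect $\alpha_i$ is built to be negatively correlated with birth weight, so the cross-sectional regression is biased while the twin difference is not.

```python
import numpy as np

rng = np.random.default_rng(2)

n = 5000                       # number of twin pairs (simulated)
beta_true = -0.10              # assumed effect of birth weight on the outcome

alpha = rng.normal(size=n)     # unobserved mother effect alpha_i
# Birth weights of both twins share a component that falls as alpha_i rises.
common = -0.8 * alpha + rng.normal(size=n)
bw1 = common + rng.normal(size=n)
bw2 = common + rng.normal(size=n)
y1 = beta_true * bw1 + alpha + rng.normal(scale=0.5, size=n)
y2 = beta_true * bw2 + alpha + rng.normal(scale=0.5, size=n)

# Cross-sectional regression: pool all babies, alpha_i left in the error term.
bw = np.concatenate([bw1, bw2])
y = np.concatenate([y1, y2])
X = np.column_stack([np.ones(2 * n), bw])
beta_cs = np.linalg.lstsq(X, y, rcond=None)[0][1]

# Twin difference: alpha_i (and anything else fixed within mother) drops out.
d_bw, d_y = bw2 - bw1, y2 - y1
beta_twin = float(d_bw @ d_y / (d_bw @ d_bw))

print(beta_cs, beta_twin)
```

In this setup the cross-sectional coefficient is much larger in magnitude than the within-twin coefficient, which stays close to the assumed value.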
Questions to consider?
• What are the conditions under which this will generate unbiased
estimate of β?
• What impact (treatment effect) does the model identify?
[Results table: callouts note a big drop in the coefficient on birth weight and a large change in $R^2$.]
More general model
• Many within-group estimators that do not have the nice discrete treatments outlined above are also called difference-in-difference models
• Cook and Tauchen: examine impact of alcohol taxes on heavy drinking
• States tax alcohol
• Examine impact on consumption and on a result of heavy consumption: death due to liver cirrhosis
• $Y_{it} = \beta_0 + \beta_1 INC_{it} + \beta_2 INC_{it-1} + \beta_3 TAX_{it} + \beta_4 TAX_{it-1} + u_i + v_t + \varepsilon_{it}$
• i is state, t is year
• $Y_{it}$ is per capita alcohol consumption
• INC is per capita income
• TAX is tax paid per gallon of alcohol
• Model requires that untreated groups provide an estimate of what the baseline trend would have been in the absence of intervention
• Key: find adequate comparisons
• If trends are not aligned, $\text{cov}(T_{it}A_{it}, \varepsilon_{it}) \neq 0$
• Omitted variables bias
• How do you know you have an adequate comparison sample?
• Concern: suppose that the intervention is more likely in a state with
a different trend
• Do the pre-treatment samples look similar?
• Tricky. D-in-D model does not require means match – only trends.
• If means match, no guarantee trends will
• However, if means differ, aren’t you suspicious that trends will as well?
Add state-specific time trends
• $Y_{it} = \beta_0 + \beta_1 INC_{it} + \beta_2 INC_{it-1} + \beta_3 TAX_{it} + \beta_4 TAX_{it-1} + \beta_i T + u_i + v_t + \varepsilon_{it}$
• i is state, t is year
• $Y_{it}$ is per capita alcohol consumption
• INC is per capita income
• TAX is tax paid per gallon of alcohol
• $\beta_i$ gives state-specific time trend
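A simulation sketch of what state-specific trends buy you. It assumes a single policy dummy in place of the tax and income terms, and the panel size, slopes and true effect are all invented. Treated states are put on a steeper trend, so the two-way fixed effects estimate is biased until state-by-time trend terms are added.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical panel: 8 states x 20 years; the first 4 states adopt in year 10.
S, T = 8, 20
s_idx = np.repeat(np.arange(S), T)
t_idx = np.tile(np.arange(T), S)
t = t_idx.astype(float)

treated_state = np.arange(S) < S // 2
policy = (treated_state[s_idx] & (t_idx >= 10)).astype(float)
slope = np.where(treated_state, 0.5, 0.0)         # treated states trend upward
delta_true = 1.0

y = (delta_true * policy + rng.normal(size=S)[s_idx] + rng.normal(size=T)[t_idx]
     + slope[s_idx] * t + rng.normal(scale=0.1, size=S * T))

# Two-way fixed effects: constant, policy, state dummies, year dummies.
state_d = (s_idx[:, None] == np.arange(1, S)).astype(float)
year_d = (t_idx[:, None] == np.arange(1, T)).astype(float)
base = np.column_stack([np.ones(S * T), policy, state_d, year_d])
delta_fe = np.linalg.lstsq(base, y, rcond=None)[0][1]

# Add state-specific linear trends: state dummy times t. One state's trend is
# omitted because the year dummies already absorb any common trend.
trend = state_d * t[:, None]
with_trends = np.column_stack([base, trend])
delta_trend = np.linalg.lstsq(with_trends, y, rcond=None)[0][1]

print(delta_fe, delta_trend)
```

Here the estimate without trends absorbs the treated states' steeper growth and lands far above the assumed effect, while the version with state trends stays close to it.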
First-Stage Estimates: Wet Laws and On-Premises Alcohol Licenses, 1977-2011
                                 On-Premises Licenses   On-Premises Licenses
Wet Law                          .177***                .143***
                                 (.025)                 (.022)
Mean of the dependent variable   .617                   .617
N                                3,352                  3,352
R2                               .893                   .942
F-Statistic                      51.9                   41.9
Year FEs                         Yes                    Yes
County FEs                       Yes                    Yes
Covariates                       Yes                    Yes
County linear trends             No                     Yes
Notes: Regressions are weighted by county population and standard errors are corrected for clustering at the county level. The dependent variable is equal to the number of active on-premises liquor licenses per 1,000 population in county c and year t. The years 1995, 1996, and 1999 are excluded because of missing crime data.
On-Premises Alcohol Licenses and Violent Crime, 1977-2011
                        OLS             OLS             2SLS            2SLS
                        Violent Crime   Violent Crime   Violent Crime   Violent Crime
On-Premises Licenses    .853*           1.24*           4.31**          5.00***
                        (.500)          (.667)          (1.77)          (1.87)
N                       3,352           3,352           3,352           3,352
R2                      .758            .847            .740            .836
Year FEs                Yes             Yes             Yes             Yes
County FEs              Yes             Yes             Yes             Yes
Covariates              Yes             Yes             Yes             Yes
County linear trends    No              Yes             No              Yes
Notes: Regressions are weighted by county population and standard errors are corrected for clustering at the county level. The dependent variable is equal to the number of violent crimes per 1,000 population in county c and year t. The years 1995, 1996, and 1999 are excluded because of missing crime data.
Falsification tests
• Add “leads” to the model for the treatment
• Intervention should not change outcomes before it appears
• If it does, then be suspicious that there is covariance between trends and the intervention
• $Y_{it} = \beta_0 + \beta_3 A_{it} + \alpha_1 A_{i,t+1} + \alpha_2 A_{i,t+2} + \alpha_3 A_{i,t+3} + u_i + v_t + \varepsilon_{it}$
• Three “leads”
• Test null $H_0: \alpha_1 = \alpha_2 = \alpha_3 = 0$
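A rough sketch of the leads test on simulated data. It takes "leads" to mean future values of the treatment dummy, $A_{i,t+k}$, and builds in no anticipation, so the lead coefficients should be close to zero. The panel, adoption years and effect size are all invented. The F statistic is computed by hand from restricted and unrestricted sums of squares; comparing it with a critical value is left out.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical panel: 20 states, 15 years, K = 3 leads (all values simulated).
S, T, K = 20, 15, 3
adopt = rng.integers(5, 12, size=S)          # assumed adoption years (5..11)
adopt[: S // 2] = T + K + 1                  # first half of the states never adopt
years = np.arange(T)
A = (years[None, :] >= adopt[:, None]).astype(float)   # A[i, t] = 1 once treated

# Outcome with state effects, year effects, and no anticipation of the policy.
y = (2.0 * A + rng.normal(size=(S, 1)) + rng.normal(size=(1, T))
     + rng.normal(scale=0.2, size=(S, T)))

# Leads A[i, t+k] need K future years, so keep only the first T-K years.
keep = T - K
n = S * keep
s_idx = np.repeat(np.arange(S), keep)
t_idx = np.tile(np.arange(keep), S)
state_d = (s_idx[:, None] == np.arange(1, S)).astype(float)
year_d = (t_idx[:, None] == np.arange(1, keep)).astype(float)
leads = np.column_stack([A[:, k:keep + k].ravel() for k in range(1, K + 1)])

X_r = np.column_stack([np.ones(n), A[:, :keep].ravel(), state_d, year_d])
X_u = np.column_stack([X_r, leads])          # unrestricted model adds the leads
y_k = y[:, :keep].ravel()

def fit(X, yv):
    """Return OLS coefficients and the residual sum of squares."""
    b = np.linalg.lstsq(X, yv, rcond=None)[0]
    resid = yv - X @ b
    return b, float(resid @ resid)

b_r, rss_r = fit(X_r, y_k)
b_u, rss_u = fit(X_u, y_k)
lead_coefs = b_u[-K:]

# F statistic for H0: all K lead coefficients are zero.
F = ((rss_r - rss_u) / K) / (rss_u / (n - X_u.shape[1]))
print(lead_coefs, F)
```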
Pick control groups that have similar
pre-treatment trends
• Most studies pick all untreated data as controls
• Example: Some states raise cigarette taxes. Use states that do not change
taxes as controls
• Example: Some states adopt welfare reform prior to TANF. Use all non-reform states as controls
• Can also use econometric procedure to pick controls
• Appealing if interventions are discrete and few in number
• Easy to identify pre-post