Difference-in-Difference Model with Two States and Two Time Periods

[Figure: traffic fatality rate over time for Nebraska and Colorado, with the date Colorado implements the policy marked.]

The estimating equation is:
$y_{st} = \alpha_0 + \delta\,(\text{Treatment}_s \times \text{Post}_t) + \varepsilon_{st}$

Suppose the estimating equation is:
$y_{st} = \alpha_0 + \delta\,\text{Treated}_{st} + \varepsilon_{st}$

[Figure: crime rate over time for Nebraska and Colorado, policy date marked. Labels: average $y_{st}$ is $\alpha_0$; average $y_{st}$ is $\alpha_0 + \delta$; the gap between the two is $\delta$. Note that $\delta$ is negative here.]

Adding a dummy for the treatment state, the estimating equation is:
$y_{st} = \alpha_0 + \alpha_1 \text{Treatment}_s + \delta\,(\text{Treatment}_s \times \text{Post}_t) + \varepsilon_{st}$

[Figure: traffic fatality rate over time, policy date marked. Labels: Nebraska at $\alpha_0$; Colorado at $\alpha_0 + \alpha_1$ before the policy and $\alpha_0 + \alpha_1 + \delta$ after, so Colorado's vertical change is $\delta$.]

Adding a dummy for the post period, the estimating equation is:
$y_{st} = \alpha_0 + \alpha_1 \text{Treatment}_s + \alpha_2 \text{Post}_t + \delta\,(\text{Treatment}_s \times \text{Post}_t) + \varepsilon_{st}$

[Figure: traffic fatality rate over time, policy date marked. Labels: Nebraska starts at $\alpha_0$ and changes by $\alpha_2$; Colorado starts at $\alpha_0 + \alpha_1$ and changes by $\alpha_2 + \delta$, of which $\alpha_2$ is the common change and $\delta$ is the policy effect.]

This estimating equation is exactly the same as:
$y_{ist} = \alpha_0 + \alpha_1 dCO_s + \alpha_2 d2_t + \delta\,\text{Policy}_{st} + \varepsilon_{ist}$
dCO is a dummy for CO; d2 is a dummy for year 2; Policy is a dummy equal to 1 in CO after the policy is passed.

The estimate of $\delta$ is exactly the same as the one obtained after subtracting the mean for each state:
$y_{ist} - \bar{y}_s = \alpha_2 d2_t + \delta\,\text{Policy}_{st} + \varepsilon_{ist} - \bar{\varepsilon}_s$
(The regressors are demeaned in the same way as the outcome; the bars on them are omitted to keep the notation short.)

[Figure: demeaned traffic fatality rate over time for Colorado and Nebraska. Labels: $\alpha_2$ and $\delta$ are the same vertical changes as before.]

And $\delta$ is exactly the same after also subtracting the mean for each time period:
$y_{ist} - \bar{y}_s - \bar{y}_t = \delta\,\text{Policy}_{st} + \varepsilon_{ist} - \bar{\varepsilon}_s - \bar{\varepsilon}_t$

When we have lots of states and years, an author typically writes
$y_{st} = \beta_0 + \delta\,\text{Policy}_{st} + v_s + z_t + \varepsilon_{st}$
and might then say that the equation is estimated including state and year fixed effects.

Start with state fixed effects, common time trend

[Figure: traffic fatality rate over time for Nebraska and Colorado, policy date marked. Labels: $\alpha_0$ and $\alpha_0 + \alpha_1$ are the two states' starting levels.]

If we're using state-level data, then each state contributes one observation per year. The estimating equation with a common time trend is
$y_{ist} = \alpha_0 + \alpha_1 dCO_s + \alpha_2 t + \delta\,\text{Policy}_{st} + \varepsilon_{ist}$
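The two-state, two-period claims above can be checked numerically. The sketch below is a minimal illustration in plain Python, not part of the original slides. The four outcome values are made up, and `ols` is a small hypothetical helper that solves the normal equations. It compares three routes to δ: the difference in differences of the four cell means, the regression with Treatment, Post and their interaction, and the regression after subtracting state means from the outcome and from both regressors.

```python
def ols(X, y):
    """Least squares via the normal equations, solved by Gauss-Jordan elimination.

    Assumes X has full column rank (hypothetical helper for this sketch).
    """
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Made-up outcomes: (state, post-period flag) -> traffic fatality rate.
cells = {("NE", 0): 20.0, ("NE", 1): 18.0, ("CO", 0): 25.0, ("CO", 1): 19.0}

# 1) Difference in differences of the four cell means.
did_means = (cells["CO", 1] - cells["CO", 0]) - (cells["NE", 1] - cells["NE", 0])

# 2) Regression on a constant, Treatment, Post and Treatment x Post.
X, y = [], []
for (state, post), outcome in cells.items():
    treat = 1.0 if state == "CO" else 0.0
    X.append([1.0, treat, float(post), treat * post])
    y.append(outcome)
alpha0, alpha1, alpha2, delta_ols = ols(X, y)


# 3) Within estimator: subtract state means from y, d2 and Policy,
#    then regress without a constant.
def demean_by_state(values):
    out = []
    for (state, _), v in zip(cells, values):
        group = [w for (s, _), w in zip(cells, values) if s == state]
        out.append(v - sum(group) / len(group))
    return out


d2 = [row[2] for row in X]
policy = [row[3] for row in X]
Xw = [list(pair) for pair in zip(demean_by_state(d2), demean_by_state(policy))]
alpha2_within, delta_within = ols(Xw, demean_by_state(y))

print(did_means, delta_ols, delta_within)
```

All three numbers agree. Note that the within version only matches because the regressors are demeaned along with the outcome.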
[Figure: traffic fatality rate over time for Colorado and Nebraska, with the date Colorado implements the policy marked.]

Remove state-specific means: $\delta$ will still be the vertical change
$y_{ist} - \bar{y}_s = \alpha_2 t + \delta\,\text{Policy}_{st} + \varepsilon_{ist} - \bar{\varepsilon}_s$

[Figure: traffic fatality rate over time for Nebraska, Colorado, and New Mexico, with the dates Colorado and New Mexico implement the policy marked.]

With multiple states, $\delta$ will be the average vertical change across the states
$y_{st} = \alpha_0 + \alpha_2 t + \delta\,\text{Policy}_{st} + v_s + \varepsilon_{st}$

If we're using repeated cross-sectional data at the individual level, then each state contributes multiple observations per year. The estimating equation is:
$y_{ist} = \alpha_0 + \delta\,\text{Policy}_{st} + \alpha_2 t + v_s + \varepsilon_{ist}$

Pooled model: $MR_{it} = \beta_0 + \beta_1 unem_{it} + e_{it}$

id   state   year   murder rate   unem
19   LA      87     11.1          12
19   LA      90     17.2          6.2
19   LA      93     20.3          7.4
5    CA      87     10.6          5.8
5    CA      90     11.9          5.6
5    CA      93     13.1          9.2

FD model: $\Delta MR_{it} = \beta_1 \Delta unem_{it} + \Delta e_{it}$

id   state   year   murder rate   ΔMR    unem   ΔUN
19   LA      87     11.1          n/a    12     n/a
19   LA      90     17.2          6.1    6.2    -5.8
19   LA      93     20.3          3.1    7.4    1.2
5    CA      87     10.6          n/a    5.8    n/a
5    CA      90     11.9          1.3    5.6    -0.2
5    CA      93     13.1          1.2    9.2    3.6

FE model: $MR_{it} - \overline{MR}_i = \beta_1 (unem_{it} - \overline{unem}_i) + (e_{it} - \bar{e}_i)$

id   state   year   murder rate   MR mean   MR - mean   unem   UN mean   unem - mean
19   LA      87     11.1          16.2      -5.1        12     8.5       3.5
19   LA      90     17.2          16.2      1.0         6.2    8.5       -2.3
19   LA      93     20.3          16.2      4.1         7.4    8.5       -1.1
5    CA      87     10.6          11.9      -1.3        5.8    6.9       -1.1
5    CA      90     11.9          11.9      0           5.6    6.9       -1.3
5    CA      93     13.1          11.9      1.2         9.2    6.9       2.3

The estimated coefficient on unemployment from the FD and FE models will be the same if there are only 2 years.
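The comparison between first differences and fixed effects can be checked on the six LA/CA observations from the tables above. This is a minimal sketch in plain Python, not code from the slides, and the function names are illustrative. Both models are fitted as written above: FD regresses ΔMR on Δunem without a constant, and FE regresses the demeaned murder rate on demeaned unemployment.

```python
# Panel from the tables above: (state, year, murder rate, unemployment rate).
panel = [
    ("LA", 87, 11.1, 12.0), ("LA", 90, 17.2, 6.2), ("LA", 93, 20.3, 7.4),
    ("CA", 87, 10.6, 5.8), ("CA", 90, 11.9, 5.6), ("CA", 93, 13.1, 9.2),
]


def slope_through_origin(x, y):
    """OLS slope of y on x with no constant term."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def by_state(rows):
    """Group (murder rate, unem) pairs by state, ordered by year."""
    groups = {}
    for state, _, mr, unem in sorted(rows, key=lambda r: (r[0], r[1])):
        groups.setdefault(state, []).append((mr, unem))
    return groups


def fd_slope(rows):
    """First-difference estimate: regress year-to-year changes on each other."""
    d_mr, d_un = [], []
    for obs in by_state(rows).values():
        for (mr0, un0), (mr1, un1) in zip(obs, obs[1:]):
            d_mr.append(mr1 - mr0)
            d_un.append(un1 - un0)
    return slope_through_origin(d_un, d_mr)


def fe_slope(rows):
    """Fixed-effects (within) estimate: regress deviations from state means."""
    mr_dev, un_dev = [], []
    for obs in by_state(rows).values():
        mr_mean = sum(mr for mr, _ in obs) / len(obs)
        un_mean = sum(un for _, un in obs) / len(obs)
        for mr, un in obs:
            mr_dev.append(mr - mr_mean)
            un_dev.append(un - un_mean)
    return slope_through_origin(un_dev, mr_dev)


two_years = [r for r in panel if r[1] in (87, 90)]
fd_two, fe_two = fd_slope(two_years), fe_slope(two_years)
fd_all, fe_all = fd_slope(panel), fe_slope(panel)

print("two years:  ", fd_two, fe_two)
print("three years:", fd_all, fe_all)
```

With only 1987 and 1990 the two estimates coincide, because each state's deviations from its mean are just plus and minus half of its first difference. With all three years they no longer match.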
With more than two years, both FD and FE are consistent estimators of $\beta_1$, but their exact values will differ.

FE model: $MR_{it} = \alpha_0 + \beta_1 unem_{it} + \alpha_1 State_i + e_{it}$

id   state   year   murder rate   unem   State
19   LA      87     11.1          12     1
19   LA      90     17.2          6.2    1
19   LA      93     20.3          7.4    1
5    CA      87     10.6          5.8    0
5    CA      90     11.9          5.6    0
5    CA      93     13.1          9.2    0

The constant term in this model and the previous version of the FE model will be different (how?)
The estimated coefficient on unemployment WILL BE EXACTLY THE SAME

Now look at time shocks. What if all states experience a common shock in a given year? What if the means vary by time as well as by state?

[Figure: crime rate over time for Jefferson and Barber, with the dates each county legalizes by-the-drink sales marked. Common shocks $z_3$ and $z_8$, one a positive and one a negative shock to crime, shift both counties.]

Common time fixed effects: $\delta$ will still be the average vertical change from before to after the policy
$\text{Violent Crime}_{ct} = \pi_0 + \delta\,\text{Wet Law}_{ct} + X'_{ct}\pi_2 + v_c + z_t + \varepsilon_{ct}$

[Figure: crime rate over time for Jefferson and Barber, with legalization dates and the positive and negative shocks to crime marked.]

But note, if you look at it, Barber is increasing faster than Jefferson:
$\text{Violent Crime}_{ct} = \pi_0 + \delta\,\text{Wet Law}_{ct} + X'_{ct}\pi_2 + v_c + z_t + \varepsilon_{ct}$
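To see why a county that is trending faster matters, the sketch below builds a noise-free example with the two county names from the figure. It is an illustration under stated assumptions, not the authors' data: the adoption years, levels, slopes, year shocks and the true effect `delta_true` are all made up, and `ols` is a small hypothetical helper. The model with county and year fixed effects is fitted twice, once as written above and once with an added county-specific linear trend.

```python
def ols(X, y):
    """Least squares via the normal equations, solved by Gauss-Jordan elimination.

    Assumes X has full column rank (hypothetical helper for this sketch).
    """
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# All numbers below are made up for illustration.
years = list(range(1, 11))
adopt = {"Jefferson": 3, "Barber": 8}          # year each county goes wet
level = {"Jefferson": 10.0, "Barber": 6.0}     # county fixed effects v_c
slope = {"Jefferson": 0.0, "Barber": 0.5}      # Barber is rising faster
shock = [0.3, -0.2, 1.0, 0.1, -0.4, 0.2, 0.0, -1.0, 0.5, -0.1]  # z_t
delta_true = 2.0


def wet_law_estimate(with_county_trend):
    X, y = [], []
    for county in adopt:
        barber = 1.0 if county == "Barber" else 0.0
        for t in years:
            wet = 1.0 if t >= adopt[county] else 0.0
            # Constant, county dummy, year dummies (year 1 omitted), Wet Law.
            row = [1.0, barber]
            row += [1.0 if t == s else 0.0 for s in years[1:]]
            row.append(wet)
            if with_county_trend:
                # Barber-specific linear trend; the common trend is already
                # absorbed by the year dummies.
                row.append(barber * t)
            X.append(row)
            y.append(level[county] + shock[t - 1]
                     + slope[county] * t + delta_true * wet)
    wet_index = 2 + len(years) - 1
    return ols(X, y)[wet_index]


delta_fe = wet_law_estimate(with_county_trend=False)
delta_trend = wet_law_estimate(with_county_trend=True)
print(delta_fe, delta_trend)
```

In this construction the county and year fixed effects alone attribute Barber's steeper trend to the policy, so the first estimate misses `delta_true` by the difference in slopes. Adding the county-specific trend recovers it.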
[Figure: crime rate over time for Jefferson, Franklin, and Barber, with the dates each county legalizes by-the-drink sales and the positive and negative shocks to crime marked.]

Adding state-specific time trends
$\text{Violent Crime}_{ct} = \pi_0 + \delta\,\text{Wet Law}_{ct} + X'_{ct}\pi_2 + v_c + z_t + \Theta_c \cdot t + \varepsilon_{ct}$

Meyer et al.
• Workers' compensation
  • State-run insurance program
  • Compensates workers for medical expenses and lost work due to on-the-job accidents
• Premiums
  • Paid by firms
  • Function of previous claims and wages paid
• Benefits: % of income w/ cap

• Typical benefits schedule
  • min(pY, C)
  • p = percent replacement
  • Y = earnings
  • C = cap
  • e.g., 65% of earnings up to $400/month

• Concern:
  • Moral hazard. Benefits will discourage return to work
• Empirical question: duration/benefits gradient
• Previous estimates
  • Regress duration (y) on replaced wages (x)
• Problem:
  • Given the progressive nature of benefits, replaced wages reveal a lot about the workers
  • Replacement rates are higher in higher-wage states

• $Y_i = X_i\beta + \alpha R_i + \varepsilon_i$
• Y (duration)
• R (replacement rate)
• Expect $\alpha > 0$
• Expect $\text{Cov}(R_i, \varepsilon_i) \neq 0$
  • Higher-wage workers have lower R and higher duration (understate)
  • Higher-wage states have longer duration and higher R (overstate)

Solution
• Quasi-experiment in KY and MI
• Increased the earnings cap
  • Increased benefit for high-wage workers (treatment)
  • Did nothing to those already below the original cap (comparison)
• Compare change in duration of spell before and after the change for these two groups

Model
• $Y_{it}$ = duration of spell on WC
• $A_{it}$ = period after benefits hike
• $H_{it}$ = high-earnings group (Income > E3)
• $Y_{it} = \beta_0 + \beta_1 H_{it} + \beta_2 A_{it} + \beta_3 A_{it} H_{it} + X'_{it}\beta_4 + \varepsilon_{it}$
• Diff-in-diff estimate is $\beta_3$

Questions to ask
• What parameter is identified by the quasi-experiment? Is this an economically meaningful parameter?
• What assumptions must be true in order for the model to provide an unbiased estimate of $\beta_3$?
• Do the authors provide any evidence supporting these assumptions?

Almond et al.
• Neonatal mortality: dies in first 28 days
• Infant mortality: dies in first year
• Babies born with low birth weight (< 2500 grams) are more prone to
  • Die early in life
  • Have health problems later in life
  • Have educational difficulties
• These findings are generated from cross-sectional regressions
• 6% of babies in US are low weight
  • Highest rate in the developed world

• Let $Y_{it}$ be the outcome for baby t from mother i
  • e.g., mortality
• $Y_{it} = \alpha + bw_{it}\beta + X_i\gamma + \alpha_i + \varepsilon_{it}$
• bw is birth weight (grams)
• $X_i$: observed characteristics of moms
• $\alpha_i$: unobserved characteristics of moms

• Cross-sectional model is of the form
  • $Y_{it} = \alpha + bw_{it}\beta + X_i\gamma + u_{it}$
  • where $u_{it} = \alpha_i + \varepsilon_{it}$
• Many observed factors that might explain health (Y) of an infant
  • Prenatal care, substance abuse, smoking, weight gain (or lack of it)
• Some unobserved as well
  • Quality of diet, exercise, genetic predisposition
• $\alpha_i$ not included in model
• $\text{Cov}(bw_{it}, u_{it}) < 0$

• Solution: Twins
• Possess same mother, same environmental characteristics
• $Y_{i1} = \alpha + bw_{i1}\beta + X_i\gamma + \alpha_i + \varepsilon_{i1}$
• $Y_{i2} = \alpha + bw_{i2}\beta + X_i\gamma + \alpha_i + \varepsilon_{i2}$
• $\Delta Y = Y_{i2} - Y_{i1} = (bw_{i2} - bw_{i1})\beta + (\varepsilon_{i2} - \varepsilon_{i1})$

Questions to consider
• What are the conditions under which this will generate an unbiased estimate of $\beta$?
• What impact (treatment effect) does the model identify?

[Results table not recovered. Slide annotations: big drop in coefficient on birth weight; large change in $R^2$.]

More general model
• Many within-group estimators that do not have the nice discrete treatments outlined above are also called difference-in-difference models
• Cook and Tauchen.
Examine the impact of alcohol taxes on heavy drinking
• States tax alcohol
• Examine impact on consumption and on a result of heavy consumption: death due to liver cirrhosis

• $Y_{it} = \beta_0 + \beta_1 INC_{it} + \beta_2 INC_{it-1} + \beta_3 TAX_{it} + \beta_4 TAX_{it-1} + u_i + v_t + \varepsilon_{it}$
• i is state, t is year
• $Y_{it}$ is per capita alcohol consumption
• INC is per capita income
• TAX is tax paid per gallon of alcohol

• Model requires that untreated groups provide an estimate of what the baseline trend would have been in the absence of the intervention
• Key: find adequate comparisons
• If trends are not aligned, $\text{cov}(T_{it}A_{it}, \varepsilon_{it}) \neq 0$
  • Omitted variables bias
• How do you know you have an adequate comparison sample?

• Concern: suppose that the intervention is more likely in a state with a different trend
• Do the pre-treatment samples look similar?
  • Tricky. The D-in-D model does not require that means match, only trends.
  • If means match, there is no guarantee trends will
  • However, if means differ, aren't you suspicious that trends will as well?

Add state-specific time trends
• $Y_{it} = \beta_0 + \beta_1 INC_{it} + \beta_2 INC_{it-1} + \beta_3 TAX_{it} + \beta_4 TAX_{it-1} + \theta_i T + u_i + v_t + \varepsilon_{it}$
• i is state, t is year
• $Y_{it}$ is per capita alcohol consumption
• INC is per capita income
• TAX is tax paid per gallon of alcohol
• $\theta_i$ gives the state-specific time trend

First-Stage Estimates: Wet Laws and On-Premises Alcohol Licenses, 1977-2011

                                  On-Premises Licenses   On-Premises Licenses
Wet Law                           .177*** (.025)         .143*** (.022)
Mean of the dependent variable    .617                   .617
N                                 3,352                  3,352
R2                                .893                   .942
F-Statistic                       51.9                   41.9
Year FEs                          Yes                    Yes
County FEs                        Yes                    Yes
Covariates                        Yes                    Yes
County linear trends              No                     Yes

Notes: Regressions are weighted by county population and standard errors are corrected for clustering at the county level. The dependent variable is equal to the number of active on-premises liquor licenses per 1,000 population in county c and year t. The years 1995, 1996, and 1999 are excluded because of missing crime data.
On-Premises Alcohol Licenses and Violent Crime, 1977-2011

                        OLS             OLS             2SLS            2SLS
                        Violent Crime   Violent Crime   Violent Crime   Violent Crime
On-Premises Licenses    .853* (.500)    1.24* (.667)    4.31** (1.77)   5.00*** (1.87)
N                       3,352           3,352           3,352           3,352
R2                      .758            .847            .740            .836
Year FEs                Yes             Yes             Yes             Yes
County FEs              Yes             Yes             Yes             Yes
Covariates              Yes             Yes             Yes             Yes
County linear trends    No              Yes             No              Yes

Notes: Regressions are weighted by county population and standard errors are corrected for clustering at the county level. The dependent variable is equal to the number of violent crimes per 1,000 population in county c and year t. The years 1995, 1996, and 1999 are excluded because of missing crime data.

Falsification tests
• Add "leads" of the treatment to the model
• The intervention should not change outcomes before it appears
• If it does, suspect covariance between trends and the intervention
• $Y_{it} = \beta_0 + \beta_3 A_{it} + \alpha_1 A_{i,t+1} + \alpha_2 A_{i,t+2} + \alpha_3 A_{i,t+3} + u_i + v_t + \varepsilon_{it}$
• Three "leads"
• Test null: $H_0: \alpha_1 = \alpha_2 = \alpha_3 = 0$

Pick control groups that have similar pre-treatment trends
• Most studies pick all untreated data as controls
  • Example: Some states raise cigarette taxes. Use states that do not change taxes as controls
  • Example: Some states adopt welfare reform prior to TANF. Use all non-reform states as controls
• Can also use an econometric procedure to pick controls
• Appealing if interventions are discrete and few in number
  • Easy to identify pre-post
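The falsification test with leads described above can be illustrated with a noise-free example. This is a sketch under stated assumptions rather than anyone's actual specification: the four units, their adoption years, the fixed effects, `beta3` and the size of the anticipation effect are all made up, and `ols` is a small hypothetical helper. The leads are the policy indicator one, two and three years ahead, and the policy is assumed to stay in force once adopted. A real application would add noise, cluster the standard errors, and test the three lead coefficients jointly.

```python
def ols(X, y):
    """Least squares via the normal equations, solved by Gauss-Jordan elimination.

    Assumes X has full column rank (hypothetical helper for this sketch).
    """
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# All numbers below are made up for illustration.
T = 12
adopt = {"A": 5, "B": 8, "C": None, "D": None}   # None = never treated
unit_effect = {"A": 3.0, "B": 1.0, "C": 2.0, "D": 0.5}
year_effect = [0.1 * ((7 * t) % 5) for t in range(1, T + 1)]
beta3 = 1.5


def policy_and_lead_coefficients(anticipation):
    """Return estimates of (beta3, alpha1, alpha2, alpha3).

    `anticipation` is an outcome shift that starts one year before adoption.
    """
    units = list(adopt)
    X, y = [], []
    for u in units:
        for t in range(1, T + 1):
            start = adopt[u]
            # a[k] is the policy indicator k years ahead: A_{i,t+k}, k = 0..3.
            a = [1.0 if start is not None and t + k >= start else 0.0
                 for k in range(4)]
            unit_dummies = [1.0 if u == v else 0.0 for v in units[1:]]
            year_dummies = [1.0 if t == s else 0.0 for s in range(2, T + 1)]
            X.append(a + [1.0] + unit_dummies + year_dummies)
            y.append(unit_effect[u] + year_effect[t - 1]
                     + beta3 * a[0] + anticipation * a[1])
    return ols(X, y)[:4]


clean = policy_and_lead_coefficients(anticipation=0.0)
early = policy_and_lead_coefficients(anticipation=0.7)
print("no pre-adoption movement:  ", clean)
print("movement one year early:   ", early)
```

When outcomes only move at adoption, all three lead coefficients come out at zero. When outcomes start moving a year early, the first lead picks it up, which is the pattern the falsification test is designed to flag.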