Chapter 3 - the University of California, Davis

Chapter 3: Empirical Tools of Public Finance
 Empirical public finance is the use of data and statistical methodologies
to measure the impact of government policy on individuals and markets.
 For example, it helps us determine the magnitude of the labor supply
response to a TANF benefit cut.
 A key issue in empirical public finance is separating causation from
correlation.
 Correlated means that two economic variables move together.
 Causal means that one of the variables is causing the movement in
the other.
 This lesson overviews the kinds of methods that economists rely on to
learn about the causal effects of government policy.
 There are many examples where causation and correlation get confused.
 It is critical for government policy to understand the difference;
otherwise policy may not have the intended impact.
 One interesting example is about Russian peasants.
 There was a cholera epidemic. Government sent doctors to the
worst-affected areas to help.
 Peasants observed that in areas with lots of doctors, there was lots
of cholera.
 Peasants concluded doctors were making things worse.
 Based on this insight, they murdered the doctors.
 Another example concerns SAT preparation courses.
 In 1988, Harvard interviewed its freshmen and found those who
took SAT “coaching” courses scored 63 points lower than those
who did not.
 One dean concluded that the SAT courses were unhelpful and “the
coaching industry is playing on parental anxiety.”
The Problem
 In both examples, there is a common problem: an attempt to interpret a
correlation as a causal relationship, without sufficient thought to the
underlying data generating process.
 For any correlation between two variables A and B, there are four
possible explanations:
1. A is causing B.
2. B is causing A.
3. Some other factor is causing both (X → A and X → B).
4. Some other factors are causing A and B (X → A and Y → B).
 Ex1: In the case of the Russian peasants, the possibilities might be:
1. Doctors cause peasants to die from cholera through incompetent
treatment.
more doctors → more peasant deaths
2. Higher incidence of death caused more physicians to be present.
more peasant deaths → more doctors
 Peasants thought possibility (1) was correct.
 Ex2: In the Harvard SAT case, the possibilities could be:
(1) SAT prep courses → worse preparation for the SAT.
(2) Those with poorer test-taking ability take prep courses to try to
catch up. Put differently, would-be poor performers took the prep courses.
Poor performers → take SAT prep
(3) Those who are generally nervous both like to take prep courses
and do the worst on standardized exams.
Nervousness → taking SAT prep
Nervousness → perform poorly
 Harvard dean thought possibility (1) was correct.
 Although the peasants or the Harvard dean could actually be correct,
odds are they are misinterpreting the underlying process at work.
 For policy purposes, what we care about is causation.
 Knowing that two factors are correlated does not tell you what will happen to one of them when policy changes the other.
 The “gold standard” of causality is a randomized trial.
 The trial proceeds by taking a group of volunteers and randomly
assigning them to either a “treatment” group that gets the intervention,
or a “control” (or comparison) group that is denied the intervention.
 With random assignment, the assignment of the intervention is not
determined by anything about the subjects.
 As a result, the treatment group is identical to the control group in every
facet but one: the treatment group gets the intervention.
 In the SAT example,
o the “treatment” group members are those who took the coaching
courses.
o the “control” group members are those who did not.
 In the Russian peasant example,
o the “treatment” group were communities where doctors were
sent.
o the “control” group were communities where doctors were not
sent.
The Problem of Bias
 In both cases, the assignment of the intervention was not random.
 This means the treatment and control groups are not identical.
 Non-random assignment, in turn, could cause bias.
 Bias represents any source of difference between treatment and control
groups that is correlated with the treatment, but not due to the treatment.
o In the SAT example, the impact of SAT courses is biased by the
fact that those who take the prep course are likely to do worse on
the SAT for other reasons.
o In the Russian peasant example, the estimates are biased by the
fact that the government assigned doctors to the worst-off
areas.
 By definition, such differences do not exist in a randomized trial, since
the groups are not different in any consistent fashion.
 As a result, randomized trials have no bias, and it is for this reason they
are the “gold standard” for empirically estimating causal effects.
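The contrast between self-selected and randomly assigned treatment can be sketched with a small simulation. Everything in it is a hypothetical assumption made for illustration (the score formula, the 20-point true coaching effect, the selection rule); none of it comes from the Harvard data. It only shows how the same true effect can look negative when weaker test takers select into coaching, and is recovered when a coin flip assigns coaching.

```python
import random
from statistics import mean


def simulate(n=20000, true_effect=20.0, seed=0):
    """Compare treated-minus-control score gaps under two assignment rules."""
    rng = random.Random(seed)
    # Hypothetical unobserved test-taking ability for each student.
    ability = [rng.gauss(0.0, 1.0) for _ in range(n)]

    def score(a, coached):
        # Assumed outcome model: ability raises the score and
        # coaching adds `true_effect` points.
        bonus = true_effect if coached else 0.0
        return 1400 + 50 * a + bonus + rng.gauss(0.0, 10.0)

    # Self-selection: weaker test takers are more likely to buy coaching.
    self_selected = [a + rng.gauss(0.0, 1.0) < 0 for a in ability]
    # Random assignment: a coin flip that ignores ability.
    randomized = [rng.random() < 0.5 for _ in range(n)]

    def difference_in_means(assignment):
        scores = [score(a, c) for a, c in zip(ability, assignment)]
        treated = [s for s, c in zip(scores, assignment) if c]
        control = [s for s, c in zip(scores, assignment) if not c]
        return mean(treated) - mean(control)

    return difference_in_means(self_selected), difference_in_means(randomized)


observational, experimental = simulate()
print(f"self-selected gap: {observational:.1f} points")
print(f"randomized gap:    {experimental:.1f} points")
```

Under these assumptions the self-selected comparison is contaminated by the ability gap between the groups, while the randomized comparison lands close to the assumed true effect.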
Randomized Trials in the TANF Context
 In the last lesson, we learned that economic theory predicts increases in
labor supply when TANF benefits are cut, but the magnitude of the effect
is unclear.
 One could design a randomized trial to learn about the elasticity of
employment with respect to TANF benefits.
 Imagine a large group (say, 2000) of single mothers were randomly
assigned to one of two groups with a coin flip:
o The “control” group continues to receive a guarantee of $5,000.
o The “treatment” group now has their TANF benefit cut to $3,000.
 Follow groups for a period of time, and measure the work effort.
 In an experiment like this in California in 1992,
o the elasticity of employment w.r.t. welfare benefits was estimated
to be -0.67.
o Thus, a 10% decrease in benefits resulted in a 6.7% increase in
employment.
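The arithmetic behind an elasticity statement can be written out as a short sketch. The function name is hypothetical, and applying the -0.67 estimate to the $5,000 to $3,000 cut from the thought experiment assumes the elasticity stays constant over a much larger change, which the original experiment does not establish.

```python
def implied_percent_change(elasticity, percent_change_in_benefits):
    """Percent change in employment implied by a constant elasticity."""
    return elasticity * percent_change_in_benefits


ELASTICITY = -0.67  # estimate quoted for the 1992 California experiment

# A 10% benefit cut, as in the text.
change_for_10pct_cut = implied_percent_change(ELASTICITY, -10.0)

# Illustrative extrapolation only: the $5,000 -> $3,000 cut is a 40% cut.
pct_cut = (3000 - 5000) / 5000 * 100
change_for_trial_cut = implied_percent_change(ELASTICITY, pct_cut)

print(f"10% cut -> {change_for_10pct_cut:.1f}% change in employment")
print(f"{pct_cut:.0f}% change in benefits -> {change_for_trial_cut:.1f}% change in employment")
```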
Why We Need to Go Beyond Randomized Trials
 Randomized trials present some problems:
1. They can be expensive.
2. They can take a long time to complete.
3. They may raise ethical issues (especially in the context of medical
trials).
4. The inferences from them may not generalize to the population as
a whole (they are valid only for the people who volunteered).
5. Subjects may drop out of the experiment for non-random reasons,
a problem known as attrition.
 For these reasons (especially the first one about randomized trials being
expensive), economists often take different approaches to try to assess
causal relationships in empirical research.
 Often researchers are faced with observational data, data generated from
individual behavior observed in the real world.
 For example, the data from the SAT example would consist of data on
which Harvard freshmen took the coaching course, along with their SAT
scores.
 The goal of this section is to review the kinds of approaches researchers
use to estimate causal effects with data like this.
 There are five main approaches:
a. Time series analysis
b. Cross-sectional regression analysis
c. Quasi-experiments
d. Structural modeling
e. Panel data
Time Series Analysis
 Time series analysis documents the correlation between the variables of
interest over time.
 For example, one could gather data over time on the TANF income
guarantee and compare that to the labor supply of single mothers over
time.
 Figure 1 illustrates these trends.
 Figure 1 reveals that real benefits have declined dramatically over time,
while average hours have risen substantially.
 This apparently supports the theory that TANF benefit cuts should increase
labor supply.
 Problems:
1. Two sub-periods (1968-1976 and 1978-1983) show a negative effect on
labor supply, or zero effect.
o This highlights that when there is a slow-moving trend
(benefit declines), it is very difficult to infer its causal effect on
another variable.
2. There are many potential explanations for the changes, too, such as:
o Greater acceptance of women in workplace.
o Better child care options.
o Changes in social norms about working.
o Other government programs, such as the Earned Income Tax Credit (EITC).
 The EITC is a federal wage subsidy to low-income people.
o Economic growth.
 Thus, in many cases time series analysis is not all that useful.
 But if there are sharp changes in a policy variable over time, then there
may be some room for valid inference.
o Cigarette price war in April 1993.
o Tobacco settlement in 1998.
 Figure 2 illustrates these trends.
Cross-Sectional Regression Analysis
 Cross-sectional regression analysis is a statistical method for assessing
the relationship between two variables while holding other factors
constant.
 “Cross-sectional” means comparing many individuals at one point in
time.
 Bivariate regression is a means of quantifying the extent to which two
series covary.
 For example, Figure 3 shows this with TANF benefit cuts and labor
supply.
[Figure 3: Labor supply (y-axis) against TANF benefits (x-axis), with two data points. One person works zero hours when faced with $5,000 in benefits; the other works 90 hours when faced with $4,550 in benefits. The downward-sloping line shows the negative correlation between benefits and labor supply. The slope of the line is ∆L/∆G = 90/(-450) = -0.2, meaning that a $1 reduction in TANF leads to +0.2 hours of work.]
 Regression analysis takes the correlation further by finding the line that
best fits the data.
 It describes the relationship between the dependent variable (in this case,
labor supply) and the independent variables (in this case, TANF
benefits).
 With 2 data points, the line fits perfectly.
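As a minimal sketch of why two points are fitted perfectly, the line through the two Figure 3 observations can be computed directly. The function and variable names are hypothetical.

```python
def line_through(p1, p2):
    """Return (intercept, slope) of the line through two (benefit, hours) points."""
    (g1, l1), (g2, l2) = p1, p2
    slope = (l2 - l1) / (g2 - g1)
    intercept = l1 - slope * g1
    return intercept, slope


# The two observations described for Figure 3: (TANF benefit in $, hours worked).
a, b = line_through((5000, 0), (4550, 90))

# With only two points, the fitted line reproduces both observations exactly,
# so there is nothing left over for an error term to explain.
predicted_hours = [a + b * g for g in (5000, 4550)]
print(a, b, predicted_hours)
```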
 In real data, there will be many more individuals.
 For example, the Current Population Survey (CPS) collects information
on sources of income, hours of work, and health insurance.
 Figure 4 graphs hours of work per year on the y-axis for all single
mothers in the CPS data set against TANF benefits on the x-axis.
 The “best fitting” line has a slope of -110.
 We can convert this into an elasticity using the specification L = a + b ln G:
o b = dL/d ln G = ∆L/%∆G, so b/L = %∆L/%∆G
o elasticity = %∆L/%∆G = b/L
o L = average work effort = 748 hours per year, b = -110
o elasticity = -110/748 = -0.147, so a 100% rise in TANF leads to a
15% reduction in hours worked.
o Thus the elasticity of work with respect to benefits is -0.15, a
fairly inelastic response.
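The conversion from the regression coefficient to an elasticity is a single division. This sketch assumes the semi-log specification L = a + b ln G stated above and evaluates the elasticity at average hours; the variable names are hypothetical.

```python
b = -110.0          # coefficient on ln(G) from the best-fitting line
mean_hours = 748.0  # average work effort in hours per year

# In L = a + b*ln(G), b is the change in hours per 100% change in benefits,
# so dividing by average hours turns it into a percent-per-percent elasticity.
elasticity = b / mean_hours
print(f"elasticity of hours w.r.t. benefits: {elasticity:.3f}")
```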
 This line corresponds to the regression:
\[ \text{HOURS}_i = \alpha + \beta\,\text{TANF}_i + \varepsilon_i \]
 Where there is one observation for each mother “i”. In the regression, α
is the constant term, β is the slope coefficient, and ε is the error term.
 ε represents the difference for each observation between its actual value
and its predicted value on the regression line.
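A bivariate regression of this form can be estimated with the standard least-squares formulas. The sketch below uses five made-up observations, not CPS data, and a hypothetical helper named `ols`; it returns the constant term, the slope coefficient, and the residual for each observation.

```python
from statistics import mean


def ols(x, y):
    """Least-squares fit of y = alpha + beta * x; returns (alpha, beta, residuals)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    # Each residual is the actual value minus the value predicted by the line.
    residuals = [yi - (alpha + beta * xi) for xi, yi in zip(x, y)]
    return alpha, beta, residuals


# Hypothetical data for illustration only: TANF benefit ($) and annual hours.
tanf = [3000, 3500, 4000, 4500, 5000]
hours = [1100, 900, 950, 700, 600]

alpha, beta, residuals = ols(tanf, hours)
print(alpha, beta)
print(residuals)
```

With a constant term in the regression, the residuals sum to zero by construction, which is one quick check that the fit was computed correctly.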
 Interpreting the results is potentially problematic.
1. One interpretation is that higher TANF benefits “cause” lower
labor supply.
2. Another interpretation is that single mothers with a greater “taste”
for leisure get higher TANF benefits due to the tax rate.
- These people wouldn’t work much even if TANF were not
available.
- Working mothers automatically get lower benefits.
=> Higher taste for leisure => TANF↑ and not the other way around
 Figure 3, for example, showed that the relationship may not be causal.
Instead, preferences may differ.
 This is less obvious in Figure 4, since we do not know the underlying utility
functions of the CPS respondents.
 One advantage of regression analysis is the ability to include control
variables, that is, other independent variables that may affect the
dependent variable (in addition to TANF benefits).
 Control variables are included to account for differences between
treatment and control groups that can lead to bias.
- i.e., we might be able to include a control variable for “taste for
leisure.”
- Including these control variables allows us to reduce the
systematic differences between different groups.
 In reality, however, the kinds of control variables that exist in typical data
sets like the CPS are crude at best.
 “Taste for leisure” might be proxied with education, number of kids, age,
 Adding control variables changes the regression:
\[ \text{HOURS}_i = \alpha + \beta\,\text{TANF}_i + \delta\,\text{CONTROL}_i + \varepsilon_i \]
 Where the control variables account for race, education, age, and
 As the Appendix shows, many of these control variables are “yes/no”
indicator variables.
 In most applications, including this one, it is unlikely that control
variables will ever completely solve the problem.
 Thus, it’s difficult to get rid of the bias totally.
 Economists typically cannot set up randomized trials for many public
policy discussions. Yet, the time-series and cross-sectional approaches
are often unsatisfactory.
Quasi (or natural)-Experiments
 A quasi-experiment treats a policy reform itself as an experiment and
tries to find a naturally occurring comparison group that can mimic the
properties of the control group in a properly designed experiment.
 Quasi-experiments are changes in the economic environment that create
roughly identical treatment and control groups for studying the effect of
that environmental change.
o This allows researchers to take advantage of randomization created
by external forces.
o Find a naturally occurring comparison group that can mimic the
properties of the control group.
 Let outside forces do the randomization for us. In some cases, the
situation happens naturally.
o Suppose, for example, that Arkansas cut its TANF benefit by 20%
in 1997, and that we had a large sample of single mothers in
Arkansas in 1996 and 1998.
o At the same time, imagine that Louisiana’s benefits remained
unchanged.
 In principle, the alteration in the states’ policies has essentially performed
our randomization for us.
o The women in Arkansas who experienced the decrease in benefits are the
treatment group.
o The women in Louisiana whose benefits were unchanged are the control
group.
o By computing the change in labor supply across these groups, and
then examining the difference between treatment (Arkansas) and
control (Louisiana), we can obtain an estimate of the impact of
benefits on labor supply that is free from bias.
 Imagine we simply studied single mothers in Arkansas alone.
 In the “experiment” single mothers in 1996 are the control group, and
those in 1998 are the treatment group.
\[ \text{HOURS}_{AR,1998} - \text{HOURS}_{AR,1996} = \text{Treatment effect} + \text{Bias from economic boom} \]
 In practice, this comparison runs into the criticisms that confront us with
time series analysis.
o i.e., the national economy was growing exceptionally fast during
this period, so the change in hours contains both the treatment effect
and the bias from the economic boom.
To fix this problem we compare the treatment group for whom the policy
changed to a control group for whom it did not.
 Single mothers in Louisiana did not experience the TANF cut, yet benefited
from the growth in the economy.
 The treatment group for whom the policy changed: Arkansas
\[ \text{HOURS}_{AR,1998} - \text{HOURS}_{AR,1996} = \text{Treatment effect} + \text{Bias from economic boom} \]
 The control group for whom the policy did not change: Louisiana
\[ \text{HOURS}_{LA,1998} - \text{HOURS}_{LA,1996} = \text{Bias from economic boom} \]
 By subtracting the change in hours of work in Louisiana from that in
Arkansas, we control for the bias caused by the economic boom.
\[ (\text{HOURS}_{AR,1998} - \text{HOURS}_{AR,1996}) - (\text{HOURS}_{LA,1998} - \text{HOURS}_{LA,1996}) = \text{Treatment effect} \]
 This assumes that, in the absence of the reform, hours of work of single
mothers in Arkansas would have changed by the same amount as in
Louisiana. The method relies on two important assumptions:
o Common time effects across groups
o No systematic composition changes in each group
 We examine single mothers in the neighboring state of Louisiana, in the
bottom panel of Table 1.
Table 1: Using Quasi-Experimental Variation

Arkansas                    1996      1998      Difference
Benefit Guarantee           $5,000    $4,000    -$1,000
Hours of Work Per Year      1,000     1,200     200

Louisiana                   1996      1998      Difference
Benefit Guarantee           (unchanged)
Hours of Work Per Year      1,050     1,100     50

We can gather the same kind of data for Louisiana. Benefit levels did not fall in Louisiana, but labor supply still increased, perhaps due to the growing economy. It appears that 50 hours of the 200-hour increase was due to economic conditions, not TANF.
 In Arkansas, while benefits fell by 20%, hours of work increased by 20%.
\[ \text{HOURS}_{AR,1998} - \text{HOURS}_{AR,1996} = 1200 - 1000 = 200 \quad \text{(a 20\% rise)} \]
o => elasticity of labor supply w.r.t. benefit levels = -1.
o This is larger in magnitude than the -0.67 elasticity estimate found in the
randomized trial in California.
 There is likely to be bias in this “first-difference,” because there was
major economic growth during this period.
o Thus, single mothers in Arkansas may have increased their work
effort even if TANF benefits had not fallen.
 This approach yields the difference-in-difference (DID) estimator – the
difference between the changes in outcomes for the treatment group that
experiences an intervention and a control group that does not.
 The difference-in-difference estimator is:
\[ (\text{HOURS}_{AR,1998} - \text{HOURS}_{AR,1996}) - (\text{HOURS}_{LA,1998} - \text{HOURS}_{LA,1996}) = (1200 - 1000) - (1100 - 1050) = 150 \text{ hours} \]
This nets out the bias from the growing economy.
o Thus, the causal effect of TANF benefit cuts would be a 150-hour
increase in labor supply.
o The implied elasticity of hours w.r.t. welfare benefits = -0.75
 elasticity = %∆L/%∆G = (150/1000)/(-1000/5000) = 15%/(-20%) = -0.75
 Similar to the -0.67 estimate from the California experiment
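The first-difference and difference-in-difference calculations can be sketched with the hours and benefit figures quoted in this section. The dictionary layout and variable names are hypothetical choices for illustration.

```python
# Average annual hours of work for single mothers, by (state, year).
hours = {
    ("AR", 1996): 1000,
    ("AR", 1998): 1200,
    ("LA", 1996): 1050,
    ("LA", 1998): 1100,
}
# Arkansas benefit guarantee in dollars; Louisiana's did not change.
benefit_ar = {1996: 5000, 1998: 4000}

# First difference: treatment effect plus bias from the economic boom.
first_difference = hours[("AR", 1998)] - hours[("AR", 1996)]
# Change in the control state: the boom alone, under the common-trend assumption.
control_change = hours[("LA", 1998)] - hours[("LA", 1996)]
# Difference-in-difference: treatment effect with the boom netted out.
did = first_difference - control_change

pct_benefit_change = (benefit_ar[1998] - benefit_ar[1996]) / benefit_ar[1996]
naive_elasticity = (first_difference / hours[("AR", 1996)]) / pct_benefit_change
did_elasticity = (did / hours[("AR", 1996)]) / pct_benefit_change

print(first_difference, control_change, did)
print(f"naive elasticity: {naive_elasticity:.2f}, DID elasticity: {did_elasticity:.2f}")
```

The elasticities come out negative because hours rise when benefits fall.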
 Note that a cross-sectional comparison in 1998 alone (1,200 hours in Arkansas versus 1,100 hours in Louisiana) would suggest that the reduction in
welfare benefits leads to only a 100-hour increase in work.
 The analogous difference-in-difference regression would be:
\[ H_{ijt} = \alpha + \beta_1 B_{ijt} + \beta_2 Y_{ijt} + \beta_3 (B_{ijt} \times Y_{ijt}) + \varepsilon_{ijt} \]
 In this case, H is hours of work, B is a dummy for the state with a TANF
benefits cut, and Y is a dummy for post-1997.
 The coefficient β3 represents the causal effect of the policy change.
 The subscripts i, j, and t indicate individual i in state j in time period t.
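Because this regression has four coefficients and only four state-by-period cells, least squares fits the four cell means exactly, so the coefficients can be read off those means. The sketch below relies on that property instead of calling a regression library. The individual observations are made up so that the cell means match the hours figures used in this section, and the function name is hypothetical.

```python
from statistics import mean


def did_coefficients(rows):
    """rows: (hours, B, Y) tuples with B = 1 for the benefit-cut state and
    Y = 1 for the post-1997 period. Returns (alpha, beta1, beta2, beta3)."""
    cells = {}
    for h, b, y in rows:
        cells.setdefault((b, y), []).append(h)
    m = {key: mean(values) for key, values in cells.items()}
    alpha = m[(0, 0)]          # control state, before
    beta1 = m[(1, 0)] - alpha  # pre-existing gap between the states
    beta2 = m[(0, 1)] - alpha  # common time effect
    # Interaction term: change in the treated state minus change in the control state.
    beta3 = (m[(1, 1)] - m[(1, 0)]) - (m[(0, 1)] - m[(0, 0)])
    return alpha, beta1, beta2, beta3


# Hypothetical individuals; only the cell means are taken from the text.
rows = [
    (950, 1, 0), (1050, 1, 0),   # Arkansas, 1996 (mean 1000)
    (1150, 1, 1), (1250, 1, 1),  # Arkansas, 1998 (mean 1200)
    (1000, 0, 0), (1100, 0, 0),  # Louisiana, 1996 (mean 1050)
    (1050, 0, 1), (1150, 0, 1),  # Louisiana, 1998 (mean 1100)
]

alpha, beta1, beta2, beta3 = did_coefficients(rows)
print(alpha, beta1, beta2, beta3)
```

Once control variables are added, the shortcut through cell means no longer applies and the coefficients have to be estimated by regression.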
Problems with quasi-experimental analysis
 It is possible that the two states experienced different time effects
- i.e., different economic growth rates (this problem can be dealt
with DDD estimator see HW problem “A question in Quasi
- the economic boom can affect the two regions differently
 More generally, single mothers may be different across states.
 We can never be completely certain that we have purged the treatment-control comparisons of bias.
Structural Modeling
 Both randomized trials and quasi-experiments suffer from two
limitations:
1) They only provide an estimate of the causal impact of a particular
treatment. It is difficult to extrapolate beyond the changes in
2) The approaches often do not tell us why the outcomes change. For
example, the approaches do not separate out income and
substitution effects in the TANF example.
 The former approaches provide reduced-form estimates only, which
leaves much of the economic theory in a “black box.”
 Structural estimation attempts to estimate the underlying parameters of
the utility function.
 This allows a more thorough exploration of economic responses.
 These structural models are more difficult to estimate, and tend to rely on
the same set of limited information as the reduced form models.
 Structural models essentially assume an explicit form for the utility
function, but if the form is incorrect, the policy conclusions could be
wrong.
 The approach of the text will be to rely on reduced-form estimates
because they are easier to think about and explain.