
Conference:
“New Methods For Cohesion Policy Evaluation: Promoting Accountability And Learning”
Warsaw, 30 November-1 December 2009
Five Directions to Improve the Applicability of Counterfactual Impact
Evaluations to Cohesion Policies
Discussion notes presented at the workshop
“Rigorous impact evaluation using counterfactuals”
Daniele Bondonio
University Piemonte Orientale (Italy)
Research Center for Evaluation Studies in the Public Sector
(Ph.D Carnegie Mellon University)
I) The data for Counterfactual Impact Evaluations (CIEs) of cohesion policies
For enterprise support and R&D support policies, undertaking credible CIEs ideally requires
firm-level data with unique firm identifiers that can be merged with the data on program
activity (number and value of the incentives paid to each assisted firm); see, for example,
Bondonio (2007, 2009). When available, establishment-level data are an optimal choice, as they
avoid the measurement errors that can arise with multi-establishment firms in which only certain
establishments, and not others, were eligible for program assistance.
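The linkage described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical firm register keyed by a unique identifier and a hypothetical payments file with one row per incentive; the field names (`firm_id`, `employees`, `amount`) and all values are invented for the example.

```python
from collections import defaultdict

# Hypothetical firm-level register: one record per firm, keyed by a unique identifier.
firms = [
    {"firm_id": "F001", "employees": 12},
    {"firm_id": "F002", "employees": 40},
    {"firm_id": "F003", "employees": 7},
]

# Hypothetical program-activity file: one record per incentive payment.
payments = [
    {"firm_id": "F001", "amount": 10000.0},
    {"firm_id": "F001", "amount": 5000.0},
    {"firm_id": "F003", "amount": 2500.0},
]


def link_program_activity(firms, payments):
    """Attach the number and total value of incentives to each firm record."""
    totals = defaultdict(lambda: {"n_incentives": 0, "incentive_value": 0.0})
    for p in payments:
        totals[p["firm_id"]]["n_incentives"] += 1
        totals[p["firm_id"]]["incentive_value"] += p["amount"]

    linked = []
    for f in firms:
        # Firms with no payment record get zero incentives and are flagged untreated.
        t = totals.get(f["firm_id"], {"n_incentives": 0, "incentive_value": 0.0})
        linked.append({**f, **t, "treated": t["n_incentives"] > 0})
    return linked


linked_firms = link_program_activity(firms, payments)
```

With establishment-level data, the same join would be keyed on an establishment identifier instead, so that eligibility is recorded for the unit that was actually assisted.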
Besides the availability of firm-level or establishment-level data offered by national or regional
statistical services, it is crucial that the data on program activities include not only the
enterprise support or R&D support policies co-funded by the Structural Funds but also the entire
spectrum of alternative enterprise support policies offered under national and/or regional
legislation. This is because, in many regions where enterprise support is available through
programs co-funded by the Structural Funds, a large number of alternative enterprise support
policies are also available under national or regional legislation (in Objective 2 areas such as
the Piemonte Region of Italy, more than twenty enterprise support programs are available under
national and regional legislation in addition to those co-funded by the Structural Funds).
As a result, without the data on the incentive payments made by the coexisting national or
regional enterprise support programs, any counterfactual impact evaluation of enterprise support
policies co-funded by the Structural Funds would be at high risk of being unreliable (as
non-treated firms in the comparison group may have been assisted by other enterprise support
programs not observed in the analysis).
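The contamination risk can be illustrated with a short sketch. It assumes a hypothetical payments file that covers all coexisting programs; the program labels and firm identifiers are invented. Setting contaminated firms aside is only one possible response, chosen here for simplicity; treating the other programs as additional treatment categories is another.

```python
# Hypothetical payment records covering ALL coexisting programs, not only the
# one co-funded by the Structural Funds. Program labels are illustrative.
payments = [
    {"firm_id": "F001", "program": "SF_cofunded"},
    {"firm_id": "F002", "program": "regional_law_A"},
    {"firm_id": "F004", "program": "national_law_B"},
]
all_firms = ["F001", "F002", "F003", "F004", "F005"]


def split_groups(all_firms, payments, evaluated_program):
    """Return (treated, clean_controls, contaminated) lists of firm ids."""
    treated = {p["firm_id"] for p in payments if p["program"] == evaluated_program}
    other = {p["firm_id"] for p in payments if p["program"] != evaluated_program}
    # Clean controls received no assistance from any observed program.
    clean = [f for f in all_firms if f not in treated and f not in other]
    # Contaminated firms look untreated only if the other programs are unobserved.
    contaminated = [f for f in all_firms if f not in treated and f in other]
    return sorted(treated), clean, contaminated


treated, clean_controls, contaminated = split_groups(all_firms, payments, "SF_cofunded")
```

Without the records of the other programs, firms F002 and F004 in this example would have entered the comparison group as if they were unassisted.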
To evaluate human capital investment policies, program activity data should ideally be linked with
individual-level data maintained by national social security or revenue service agencies. The
availability of such data would make it possible to build credible control groups of individuals
who are not affected by the policies to be evaluated but have characteristics similar to those of
the treated individuals.
To evaluate geographically targeted programs of any type (for example, urban renewal policies),
the smallest geographical units for which data are currently easily available in the large
majority of member states are NUTS III regions, which do not match the boundaries of cities or of
important assisted areas such as the Objective 2 areas.
Applying CIE to urban renewal policies would require reliable data, stable over time, for smaller
geographic units. While in the US such data are easily available for Census Tracts, in the EU
stable-over-time data at smaller geographical levels (such as NUTS V) are at present very
difficult to obtain and very unreliable for comparisons over time, as they are based on city
administration boundaries that change too often.
For these reasons, applying CIEs to geographically targeted policies requires planning the
evaluation design carefully at the same time as the policy intervention is designed, so that the
boundaries of the target areas can be covered by the existing statistical sources in the different
EU regions in which the programs are implemented.
II) The importance of estimating different impact evaluation parameters when undertaking
CIEs
When both the characteristics of the incentives and the pre-intervention observable
characteristics of the treated units are fairly homogeneous, estimating the Average Treatment
Effect on the Treated (ATT) is sufficient to obtain policy-relevant empirical evidence. The ATT
yields the average program impact of a single homogeneous binary treatment variable.
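In the standard potential-outcomes notation (an assumption of this note, since the text does not fix a notation), let $D_i$ be the binary treatment indicator and $Y_i(1)$, $Y_i(0)$ the outcomes of unit $i$ with and without the treatment. The parameter can then be written as:

```latex
\mathrm{ATT} = E\left[\, Y_i(1) - Y_i(0) \mid D_i = 1 \,\right]
             = E\left[\, Y_i(1) \mid D_i = 1 \,\right] - E\left[\, Y_i(0) \mid D_i = 1 \,\right]
```

The first term is observed for the treated units; the second is the counterfactual term that has to be estimated from a comparison group.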
When, instead, the policy treatment has quite different economic values across the population of
treated units, or when the treatment impact is expected to differ according to the
pre-intervention observable characteristics of the treated units, a single average program impact
estimate such as the ATT is often of less policy relevance. In such cases, recent methodological
developments in the CIE literature (see footnote 1) make it possible to retrieve policy-relevant
empirical evidence by estimating different ATTs for different subpopulations of the treated units
and/or for different treatment categories.
Estimating such categorical impacts for important subgroups of the treated units is more feasible
when the sample of treated units is sufficiently large (e.g. in the case of enterprise support and
R&D support policies). In such cases, as also argued by other authors (e.g. Bartik 2004),
statistical CIE can offer some insights into why and how a program works, provided that sufficient
variation in program designs is observed and measured.
Applied to enterprise support policies, such categorical impacts can be estimated for different
ranges of the economic value of the incentives and for the different types of assistance offered
to firms (capital grants, soft loans, technical assistance or infrastructure improvements). See
Bondonio 2007 for an empirical estimation with firm-level data from the Piemonte region of Italy.
Footnote 1: For example, categorical propensity score matching (PSM) or the use of PSM as a
first-stage processing step for reducing model dependence. See Bondonio 2009.
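The idea of category-specific ATTs can be sketched as below. This is a simplified illustration, not the estimator used in Bondonio (2007, 2009): it assumes the matching step has already been done elsewhere, so that each treated firm comes paired with the outcome of its matched comparison firm. Field names and values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical matched sample: each treated firm is paired with the outcome of
# its matched comparison firm (the matching step itself is not shown).
matched_pairs = [
    {"assistance": "capital_grant", "y_treated": 5.0, "y_matched_control": 3.0},
    {"assistance": "capital_grant", "y_treated": 4.0, "y_matched_control": 3.0},
    {"assistance": "soft_loan", "y_treated": 2.0, "y_matched_control": 2.5},
    {"assistance": "soft_loan", "y_treated": 3.0, "y_matched_control": 2.5},
]


def att_by_category(pairs, key):
    """Average treated-minus-matched-control outcome gap within each category."""
    gaps = defaultdict(list)
    for p in pairs:
        gaps[p[key]].append(p["y_treated"] - p["y_matched_control"])
    return {category: mean(values) for category, values in gaps.items()}


category_atts = att_by_category(matched_pairs, "assistance")
overall_att = mean(p["y_treated"] - p["y_matched_control"] for p in matched_pairs)
```

In this invented sample the pooled estimate hides the fact that the whole effect comes from one type of assistance, which is the situation in which a single ATT loses policy relevance. The same function could be keyed on a hypothetical incentive-value band instead of the assistance type.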
For human capital investment policies, it is often of great interest to estimate the distribution
of counterfactual treatment effects, represented, for example, by the proportion of treated units
(considering either the entire population of the treated or each policy-relevant category of
treated units) for whom the outcome with the treatment is greater than or equal to the
counterfactual outcome.
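Using potential-outcomes notation (assumed here for illustration: $D_i$ is the treatment indicator, $Y_i(1)$ and $Y_i(0)$ the outcomes with and without the treatment), the proportion just described can be written as:

```latex
\pi = \Pr\left( Y_i(1) \ge Y_i(0) \mid D_i = 1 \right)
```

Unlike the ATT, this quantity depends on the joint distribution of the two potential outcomes, which are never observed together for the same unit, so recovering it requires stronger identifying assumptions than a mean effect does.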
A recent stream of the methodological literature has been devoted to impact identification
strategies for such distributional effects (often expressed as quantile treatment effects) and has
underlined the policy relevance of estimating impact results beyond mean effects on the treated
(see, for example, Abadie et al. 2002, Chesher 2003, Carneiro et al. 2003, Chernozhukov and Hansen
2005, 2006).
When feasible, estimating such distributions of treatment effects over different categories of
individuals based on their relevant pre-intervention characteristics would considerably enhance the
quality of the results that CIEs can offer.
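A rough sketch of what looking beyond the mean involves is given below. It simply compares empirical quantiles of the outcome between treated and comparison individuals, which has a causal reading only if the comparison group is valid (for example under randomization or credible matching); the instrumental-variable quantile estimators cited above are more involved and are not reproduced here. The outcome values are invented.

```python
def empirical_quantile(values, q):
    """Quantile by linear interpolation between order statistics (0 <= q <= 1)."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] * (1 - weight) + ordered[upper] * weight


def quantile_gaps(y_treated, y_comparison, quantiles=(0.25, 0.5, 0.75)):
    """Difference between treated and comparison outcome quantiles."""
    return {
        q: empirical_quantile(y_treated, q) - empirical_quantile(y_comparison, q)
        for q in quantiles
    }


# Hypothetical post-program earnings for treated and comparison individuals.
y_treated = [10.0, 12.0, 14.0, 20.0, 30.0]
y_comparison = [10.0, 11.0, 13.0, 15.0, 16.0]

gaps = quantile_gaps(y_treated, y_comparison)
```

In this example the gap is small at the lower quartile and the median and much larger at the upper quartile, a pattern that a single mean difference would not reveal.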
III) Tailoring CIEs to the specific features of the programs and of the available data
CIE works best when the empirical models used to retrieve treatment effects are tailored to the
specific features of the policies to be evaluated, the data available for the analysis and the
plausible causal links between the program intervention and the desired social outcomes (the most
convincing CIE models are often developed by selecting the control variables on the basis of a
credible theory outlining the causal links between the intervention and changes in the outcome
variable of the evaluation). No single predetermined empirical model for CIE should be advocated
for each of the programs within the different sectors of cohesion policies.
In evaluating enterprise support policies, for example, the appropriate empirical models for CIE
can vary quite sharply depending on whether or not eligible firms sell their goods and/or services
predominantly within the local markets in which they are located (e.g. craft enterprises). When
firms serve predominantly local markets, the major threats to the validity of the analysis come
from changes (exogenous to the program incentives) that may occur in the different local economies
in which the assisted and non-assisted firms are located. As a result, searching for geographical
natural experiment conditions, in which homogeneous communities are crossed by administrative
borders that determine the treatment status of firms, would provide a great advantage over other
established quasi-experimental techniques (see footnote 2), which would be more appropriate for
cases in which the assisted firms sell their goods and/or services on national or international
markets.
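As an illustration of one estimator that could be applied to such a border setting, the sketch below computes a simple difference-in-differences contrast between firms on the two sides of an administrative border. It assumes that, absent the program, outcomes on both sides would have followed the same trend; the data and the function name are hypothetical.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group's mean outcome minus the change in the controls'."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical employment levels of firms located on either side of a border
# that separates an assisted area from an otherwise similar non-assisted one.
assisted_pre = [10.0, 12.0]
assisted_post = [13.0, 15.0]
non_assisted_pre = [11.0, 9.0]
non_assisted_post = [12.0, 10.0]

did_estimate = diff_in_diff(assisted_pre, assisted_post, non_assisted_pre, non_assisted_post)
```

Because both groups of firms share the same local economy, a shock to that economy moves both changes by a similar amount and largely cancels out of the contrast.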
Footnote 2: Such as propensity score matching, which would have to rely either on the conditional
independence assumption (CIA, i.e. selection into treatment is based on observable characteristics
of firms' local markets/communities) or on the hypothesis that all (or part of) the unobserved
heterogeneity of firms' local markets/communities consists of fixed effects (in the case of
difference-in-differences schemes applied to comparisons of firms' outcomes), or at least of fixed
linear growth trends (in the case of triple-differencing schemes; see Bondonio 2009 for further
details).
IV) A word of caution on applying CIE to macroeconomic and long-term-effects evaluations
of enterprise support policies
In many cases, policy makers are interested in knowing program impacts on regional macroeconomic
outcomes. In principle, a program of any sort is somehow capable of affecting distant outcomes,
such as macroeconomic or long-run indicators of the well-being of residents measured at the level
of the entire provinces or regions in which eligible firms are located. However, when the economic
importance of the group of assisted firms is very small compared to the size of the
province/region/state economy in which they are located, CIEs should be applied with caution. This
is because, in such cases, any actual program impact (in the form of a positive impulse given to
the province/region/state economy) would become virtually indistinguishable from the changes in
the outcome variable of the evaluation caused by the many confounding factors (including, in many
cases, the presence of other business incentive programs) that are of much greater importance than
the possible program-induced improvements in the economic activity of the assisted firms.
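A back-of-the-envelope calculation shows the order of magnitude involved. All figures below are hypothetical and chosen only to illustrate the argument: a region with one million employees, assisted firms employing 5,000 of them, a program that raises their employment by 10 percent, and ordinary year-to-year movements in regional employment of about 1 percent.

```python
def impact_share_of_regional_outcome(assisted_employment, program_effect_rate,
                                     regional_employment):
    """Program-induced jobs as a share of total regional employment."""
    induced_jobs = assisted_employment * program_effect_rate
    return induced_jobs / regional_employment


# Hypothetical figures, for illustration only.
share = impact_share_of_regional_outcome(
    assisted_employment=5_000,
    program_effect_rate=0.10,
    regional_employment=1_000_000,
)
typical_regional_fluctuation = 0.01  # assumed size of unrelated regional changes

# Ratio of the program-induced change to the background variation.
signal_to_noise = share / typical_regional_fluctuation
```

Under these assumptions even a sizeable firm-level effect moves regional employment by 0.05 percent, about one twentieth of the assumed background variation.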
Using rigorous CIE designs to assess whether or not business incentives had long-lasting impacts
on the employment or economic activity of assisted firms should also be done very cautiously.
Assisted firms are economic units embedded in many ways in a network of economic transactions with
one another. In the medium/long run, a positive program impulse on the assisted firms' employment
or economic activity is likely to have enough time to generate subsequent impacts on non-assisted
firms as well. In this case, outcome data from non-assisted firms can no longer be considered
unaffected by the program incentives and used to retrieve counterfactual estimates.
V) Can randomized experiments have a role in cohesion policies?
Pilot programs with random assignment are difficult to implement for cohesion policies because of
the ethical and political difficulties of excluding some eligible firms or target areas from the
incentives.
For some types of policies (e.g. enterprise support policies and R&D support policies), however,
such difficulties could be eased if the experimentation takes the form of a random selection of
firms for targeted marketing of the program. If such randomly assigned marketing efforts are
strong enough, the result should be a sharp difference in the usage of the program incentives
between the treated firms (those receiving the marketing efforts) and the control-group firms
(those not receiving them). This would provide a source of variation in program usage that does
not directly affect the outcome variable of the evaluation.
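One common way to exploit this kind of randomized encouragement is a Wald (instrumental-variable) ratio, sketched below. The text does not prescribe this estimator, so it is offered as an assumption: the ratio is informative about firms whose program usage is changed by the marketing, provided the marketing affects outcomes only through usage and does not discourage any firm from using the program. The data are hypothetical.

```python
from statistics import mean

# Hypothetical firms: "marketed" is the randomly assigned marketing effort,
# "used_program" records actual take-up (1.0 or 0.0), "outcome" is the
# outcome variable of the evaluation.
firms = [
    {"marketed": True, "used_program": 1.0, "outcome": 12.0},
    {"marketed": True, "used_program": 1.0, "outcome": 12.0},
    {"marketed": True, "used_program": 1.0, "outcome": 12.0},
    {"marketed": True, "used_program": 0.0, "outcome": 10.0},
    {"marketed": False, "used_program": 1.0, "outcome": 12.0},
    {"marketed": False, "used_program": 0.0, "outcome": 10.0},
    {"marketed": False, "used_program": 0.0, "outcome": 10.0},
    {"marketed": False, "used_program": 0.0, "outcome": 10.0},
]


def wald_estimate(firms):
    """Outcome gap between marketed and non-marketed firms, scaled by the usage gap."""
    marketed = [f for f in firms if f["marketed"]]
    not_marketed = [f for f in firms if not f["marketed"]]
    outcome_gap = (mean([f["outcome"] for f in marketed])
                   - mean([f["outcome"] for f in not_marketed]))
    usage_gap = (mean([f["used_program"] for f in marketed])
                 - mean([f["used_program"] for f in not_marketed]))
    if usage_gap == 0:
        raise ValueError("marketing did not change program usage; ratio undefined")
    return outcome_gap / usage_gap


usage_effect = wald_estimate(firms)
```

The stronger the marketing effort, the larger the usage gap in the denominator and the more stable the ratio, which is why the marketing needs to produce a sharp difference in usage.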
A second possibility for easing the ethical and political difficulties associated with experiments
is offered by delay-of-treatment randomization schemes, in which the program intervention in the
comparison group is delayed rather than denied.
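A delay-of-treatment assignment can be sketched in a few lines. The function name, the number of phases and the unit identifiers are hypothetical; the fixed seed is used only to make the example reproducible.

```python
import random


def assign_phases(unit_ids, n_phases=2, seed=0):
    """Randomly assign each eligible unit to a start phase (1 = treated first)."""
    rng = random.Random(seed)
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)
    # Dealing the shuffled units out in turn keeps the phases balanced in size.
    return {unit: (i % n_phases) + 1 for i, unit in enumerate(shuffled)}


# Hypothetical eligible target areas.
area_ids = ["A", "B", "C", "D", "E", "F"]
phases = assign_phases(area_ids)
```

Units assigned to later phases serve as the comparison group only until their own treatment starts, so outcomes have to be measured within that window.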
References
Abadie A., Angrist J., Imbens G. (2002), Instrumental Variables estimates of the Effect of
Subsidized Training on the Quantiles of Trainee Earnings, Econometrica 70, 91-117.
Bartik T.J. (2004), Evaluating the impacts of local economic development policies on local
economic outcomes: What has been done and what is doable, in: Evaluating local economic and
employment development: How to assess what works among programmes and policies, OECD,
Paris, 113-141.
Bondonio D. (2009), Impact identification strategies for evaluating business incentive programs,
POLIS Working Papers n. 145/2009 University Piemonte Orientale, Italy
(http://polis.unipmn.it/pubbl/index.php?paper=2440).
Bondonio D. (2007), The Employment Impact of Business Incentive Policies: a Comparative
Evaluation of Different Forms of Assistance, POLIS Working Papers n. 101/2007 University
Piemonte Orientale, Italy (http://polis.unipmn.it/pubbl/index.php?paper=2052).
Carneiro P., Hansen K. T., Heckman J.J. (2003), Estimating Distributions of Treatment Effects with
an Application to the Returns to Schooling and Measurement of the Effects of Uncertainty on
College Choice, IZA Working Papers n. 767.
Chernozhukov V., Hansen C. (2005), An IV Model of Quantile Treatment Effects, Econometrica
73, 245-261.
Chernozhukov V., Hansen C. (2006), Instrumental Quantile Regression Inference for Structural and
Treatment Effect Models, Journal of Econometrics 132, 491-525.
Chesher A. (2003), Identification in Nonseparable Models, Econometrica 71, 1405-1441.