OVE’s Experience with Impact Evaluations
Paris, June 2005

Impact Evaluations
Alternative definitional models:
– time elapsed since intervention
– counterfactual comparison
OVE adopted the counterfactual approach, and further limited the initial sample to programs with partial coverage. Partial coverage allows observation of treatment effects through comparison of treated and untreated groups.

Policy
The general evaluative question posed by the IDB’s ex post policy is “…the extent to which the development objectives of IDB-financed projects have been attained.” This question is most convincingly answered through treatment effect evaluations.

Selecting Projects
– Random selection is appropriate for accountability-oriented evaluations.
– Purposive selection of projects of similar design across countries is better for generating learning about the model underlying the interventions.
– Clusters of like projects permit meta-evaluations of models.

Projects Selected
A purposive cluster sampling strategy was chosen, supplemented by some stand-alone projects. A total of 16 projects were selected.
Clusters: (i) Neighborhood Improvement projects and (ii) Land Titling projects.
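The partial-coverage logic above can be made concrete with a minimal sketch: when both treated and untreated units are observed, the simplest treatment effect estimate is a difference in mean outcomes. All numbers below are synthetic and purely illustrative.

```python
# Counterfactual comparison under partial coverage: estimate the
# average treatment effect as a difference in mean outcomes between
# treated and untreated units. Synthetic, illustrative data only.
from statistics import mean

# (outcome, treated?) pairs for a hypothetical partial-coverage program
households = [
    (12.0, True), (11.5, True), (13.0, True),
    (10.0, False), (9.5, False), (10.5, False),
]

def naive_treatment_effect(data):
    """Difference in mean outcomes, treated minus untreated.

    Unbiased only if assignment is as good as random; otherwise
    matching or other adjustments are needed.
    """
    treated = [y for y, t in data if t]
    control = [y for y, t in data if not t]
    return mean(treated) - mean(control)

print(naive_treatment_effect(households))  # ≈ 2.17
```

This difference-in-means is the baseline that the more careful designs discussed below (randomization, matching, regression discontinuity) refine when assignment is not random.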
Stand-alone: Cash Transfer (Argentina); Potable Water (Ecuador); Agricultural Subsidies and Cash Transfer programs (Mexico’s Procampo and Oportunidades); Social Investment Fund (Panama). Stand-alones serve as pilots for future clusters.

Both Performance Monitoring and Treatment Effect Evaluation Are Required

Performance Monitoring versus Treatment Effect Evaluation
Performance monitoring:
– Primary goal: accountability to stakeholders and resolution of execution problems; cost-efficiency.
– Analyzes outputs and gross outcome effects to improve implementation.
– Data collection is ongoing, relying on readily accessible and regularly collected data.
Treatment effect evaluation:
– Primary goal: knowledge creation (understanding and improving program treatment effects); cost-effectiveness and cost-benefit analysis.
– Analyzes net effects (treatment effects) on development outcomes to improve project design (concurrently or for future similar projects).
– Data collection is periodic, more intensive, and requires information on both beneficiaries and non-beneficiaries over time.
Treatment effect methods include randomized designs, propensity score matching, controlled comparisons, and regression discontinuity.

Limits to Treatment Effect Evaluations

Comparison of Alternative Approaches to Program Evaluation
Range of questions addressed:
– Treatment effect approach: evaluates the treatment effect of an existing program, one program in one environment; cannot predict the effects of a new program.
– Structural econometric approach: evaluates the treatment effect of an existing program; forecasts the program’s effect in a new environment; predicts the effects of a program never tried before.
Range of programs that can be evaluated:
– Treatment effect approach: programs with partial coverage (treatment and control groups).
– Structural econometric approach: programs with either partial or universal coverage, depending on variation in the data (prices and endowments).
Comparability across studies:
– Treatment effect approach: not generally comparable unless evaluations are designed for a meta-evaluation of similar programs.
– Structural econometric approach: high comparability across evaluations (program-invariant parameters).
Source: modified from Table V in “Structural Equations, Treatment Effects and Econometric Policy Evaluations” by James J. Heckman and Edward Vytlacil, NBER Working Paper No. 11259, March 2005. Note that this article proposes a synthesis of the two approaches, which is ignored in this modified table.

Experience
The required information supposedly generated through standardized performance monitoring is absent in a large majority of the IDB projects examined:
– 10 of the 16 selected projects had inadequate data for treatment effect evaluation.
– 6 of the 16 could be retrofitted with sufficient data to attempt a treatment effect evaluation.
– Retrofitting implied significant data collection costs, costs that could have been avoided had adequate performance monitoring been in place over the life of the project.

The Bank’s Current Portfolio
Of 593 active projects in mid-2004:
– 97 (16%) claim existence of information for at least one development outcome,
– of which 27 have the information in electronic form,
– of which 5 have the information held in the Bank,
– of which 2 appear to be collecting data for treatment effect evaluation.

Experience: Limits to Retrofitting
– The questions answered depend on the information found rather than on the relevance and usefulness of the hypotheses being tested: the tail wagging the dog.
– Retrofitting severely limits the set of available control variables, reducing the reliability of treatment effect findings.
– Retrofitted data may not correspond to the development outcomes declared by the projects.
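Among the methods listed above, propensity score matching pairs each treated unit with the most comparable untreated unit. The sketch below matches on a single score (a stand-in for an estimated propensity score, which in practice would come from, e.g., a logistic regression); all numbers are synthetic.

```python
# Illustrative nearest-neighbour matching: each treated unit is paired
# (with replacement) with the untreated unit whose score is closest.
# In a real evaluation the score would be an estimated propensity
# score. Synthetic, illustrative data only.

treated = [(0.8, 15.0), (0.6, 12.0), (0.4, 10.0)]               # (score, outcome)
control = [(0.75, 11.0), (0.55, 10.0), (0.3, 9.0), (0.9, 13.0)]  # (score, outcome)

def matched_effect(treated, control):
    """Average treated-minus-matched-control outcome difference."""
    diffs = []
    for score, outcome in treated:
        # nearest-neighbour match on the score, with replacement
        _, match_outcome = min(control, key=lambda c: abs(c[0] - score))
        diffs.append(outcome - match_outcome)
    return sum(diffs) / len(diffs)

print(round(matched_effect(treated, control), 2))  # 2.33
```

Matching on the score rather than on raw covariates is what makes the approach feasible when treated and untreated groups differ on many dimensions at once.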
A project can be evaluated on both intended and unintended effects, but should at a minimum consider the intended ones.

Experience Confirms the Value of Treatment Effect Evaluation
In just one project (Neighborhood Improvement, Rio, Brazil), comparing naïve and treatment effect estimates, every configuration occurred: positive/negative, negative/positive, greater/smaller, smaller/greater, and the same.

[Figure: naïve versus treatment effect estimates, before and after, for water, sewer, rubbish, illiteracy, income, rent, child mortality, and homicide rate.]

So pictures need to be interpreted with caution.

Experience
“…the six treatment effect evaluations undertaken during 2004 do show that the Bank’s interventions have a significant development effect for at least one declared development objective. These findings suggest that the Bank may be currently understating its contribution to development.”

EXPERIENCE: Findings: Land Titling
“Beneficiaries of Land Regularization projects saw property values for their land increase…. However, for the other purported development effects (greater productivity, increased investment, and greater access to credit), no unambiguous treatment effects were found.”
Ramifications for project design: for small and poor producers to benefit from a pro-market regime, titling alone is not sufficient. Transaction costs and market distortions that limit access to credit must also be addressed simultaneously.

EXPERIENCE: Findings: Potable Water
Heterogeneity of results is important: there was a regressive relationship between treatment effect and income, where more educated (and wealthier) households did better than less educated (and poorer) households.

[Figure: proportional change in infant mortality by expenditure quartile (bottom 25%, 25%–50%, 50%–75%, top 25%), for the full sample and for households with at least primary education.]

Ramification for project design: projects should include, or be coordinated with, a health education component together with potable water expansion, as a hypothesis to be tested.

EXPERIENCE: Findings from Cash Transfer and Agricultural Subsidy Programs
Issue: do conditions attached to cash transfers produce more change than the transfers alone? The income effect alone may be substantial, and conditionalities are costly to administer and monitor. A comparison between two Mexican programs, with and without conditionality, yielded the following ramifications for project design:
– Conditionality (school and clinic attendance) does produce an effect over and above the income effect of the transfer.
– Transfers to the mother rather than the father matter: effects are greater when the transfer goes to the mother.

EXPERIENCE: Low Costs
– Treatment effect evaluations can be done inexpensively if attention is paid to data at the time of design and during implementation.
– Data collection costs can be substantial if retrofitted, but still within reasonable limits.
– Costs ranged from $28,000 to $92,000 per evaluation, much lower than the “norm”: small budget, high returns.

Evaluation                                          Program Value   Direct Costs   Total Costs   Total Costs as %
                                                    (US$ million)   (US$)          (US$)         of Program Value
Land Titling (Peru)                                            83         91,544       141,837              0.17%
Neighbourhood Improvement (Brazil)                            600         83,606       133,899              0.02%
Potable Water (Ecuador)                                       280         89,934       140,227              0.05%
Cash Transfer (Argentina)                                     637         37,545        87,838              0.01%
Agricultural Subsidies and Cash Transfer (Mexico)           1,881         50,000       100,293              0.01%
Social Investment Fund (Panama)                                38         27,777        78,070              0.21%
Total                                                       3,518        380,406       682,164              0.08%

Summary
– Initial experience with treatment effect impact evaluations provided considerable knowledge relevant for future project design.
– Costs were moderate, and can be expected to be lower in the future if the performance monitoring system is improved.
– The data have value to researchers, and cost-sharing in data collection was possible in several cases.
– Treatment effect evaluation provides the only convincing basis for asserting development effectiveness.
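The naïve-versus-treatment-effect contrast that runs through these findings, such as the Rio comparison where the two estimates even differed in sign, can be sketched with a difference-in-differences calculation on hypothetical numbers chosen for illustration.

```python
# Why naïve before/after comparisons can mislead: a
# difference-in-differences estimate nets out the trend observed in
# the untreated group. Numbers are hypothetical, chosen so the two
# estimates differ in sign.

treated_before, treated_after = 10.0, 11.0   # treated group outcome
control_before, control_after = 10.0, 12.0   # untreated group outcome

# naïve estimate: change in the treated group alone
naive = treated_after - treated_before

# difference-in-differences: treated change minus control change
did = (treated_after - treated_before) - (control_after - control_before)

print(naive, did)  # 1.0 -1.0
```

Here the naïve estimate suggests a positive effect, while netting out the control group’s trend reverses the sign, which is exactly why “pictures need to be interpreted with caution.”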