The Intersection of Performance Measurement and Program Evaluation: Searching for the Counterfactual
Moscow 2011
Douglas J. Besharov
School of Public Policy, University of Maryland

Performance Management
Efficiency studies ("outputs")
• How much does the program cost? (Monetary, nonmonetary, and opportunity costs)
• Could it be delivered more efficiently?
Effectiveness studies ("outcomes" and "impacts")
• Does the program achieve its goals?
• Could it be more effective?
Both require a comparison, or a "counterfactual."

Appearances Can Be Deceiving
Giving children a "Head Start"

It Matters How Children Are Raised

Ineffective Early Childhood Education Programs
IHDP 1985-1988
• Low-birth-weight, preterm infants and their parents
• Home visits, parenting education, and early childhood education
• $20,400 per child per year
• No significant impacts; initial IQ gains fade
CCDP 1990-1995
• Poor children under age 1 and their parents
• Case management, parenting education, early childhood education, and referrals to community-based services
• $19,000 per family per year ($60 million annually)
• No significant impacts
Early Head Start 1996-2008
• Poor children ages 0-2 and their parents
• Child development, parenting education, child care, and family support services
• $18,500 per child per year ($700 million annually)
• No significant impacts

Program Improvement, not Program Dismantling
"The closest thing to immortality on this Earth is a federal government program." – Ronald Reagan

Performance Management
Leadership, Management, and Measurement

Point #1
Counterfactuals are needed for accurate performance measurement.

Impact Evaluations Take Too Long to Manage Performance
• Head Start Impact Study (2010): 10 years and running
• Moving to Opportunity Study (1994): 17 years and running
• Employment Retention and Advancement evaluation (1998): 13 years and running
• Building Strong Families Project (2002): 9 years and running
• National Job Corps Study (1993): 15 years to complete
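To make Point #1 concrete, here is a minimal sketch of why a counterfactual matters, using entirely hypothetical earnings numbers (not drawn from any of the studies above): a naive pre-post gain credits the program with earnings growth that would have happened anyway, while a randomized control group recovers something close to the program's actual effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical quarterly earnings for 2n job seekers before the program.
pre = rng.normal(3000, 800, size=2 * n)

# Random assignment: half receive the (hypothetical) training program.
treated = np.zeros(2 * n, dtype=bool)
treated[rng.choice(2 * n, size=n, replace=False)] = True

trend = 400        # earnings growth everyone experiences, program or not
true_impact = 150  # the program's actual effect (unknown in practice)

post = pre + trend + rng.normal(0, 800, size=2 * n)
post[treated] += true_impact

# Naive performance measure: participants' own pre-post earnings gain.
naive_gain = (post[treated] - pre[treated]).mean()

# Counterfactual-based estimate: treatment-control difference after the program.
experimental_impact = post[treated].mean() - post[~treated].mean()

print(f"naive pre-post gain: {naive_gain:6.0f}")           # roughly trend + impact
print(f"experimental impact: {experimental_impact:6.0f}")  # roughly true_impact
```

The naive measure overstates the program by roughly the size of the background trend; the control group supplies the counterfactual that removes it.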
Logic Model for Job Training Programs
Problem: Some of the unemployed do not have the skills necessary to obtain and keep well-paying employment, leading to lower income, greater use of government benefits, and a weaker economy.
Theory: If government provides job training to the unemployed, then the unemployed will gain the job skills necessary for good jobs, increased earnings, and a stronger economy.
Design: (1) Job search/job readiness training, (2) skills training, (3) delivered in a classroom.
Inputs: Training facilities; staff; funding; client characteristics
Activities: Job search/job readiness training; classroom instruction; job skills training
Outputs: Hours of training instruction; hours of practice; staff admin; skill certificates
Outcomes: Job search skills; technical job skills; interpersonal skills
Proximal impacts: Earnings; employment; UI/welfare receipt
Distal impacts: Higher lifetime earnings/employment; lower poverty and crime
External (community and societal context): Stronger economy

Point #2
Carefully applied, a measured outcome, coupled with a logic model's theory of change and often buttressed by other evidence, can serve as a more timely and more useful performance measure than a formal evaluation of long-term impacts.

When "Outputs" Imply "Outcomes"
• There is no output, so no positive outcome can reasonably be predicted.
• The output itself is sufficiently suggestive of a likely outcome.
• The output is produced at such a prohibitively high cost that, regardless of its likely outcome, it does not meet cost-effectiveness or cost-benefit tests.

Feasible "Outcome" Evaluations
Evaluations of ongoing programs
• Rolling randomized experiments
• Pre-post studies (with an embedded counterfactual)
• Regression-discontinuity designs
Evaluations of specific program "improvements"
• Randomized experiments
• Pipeline studies (or rolling implementation)
• Interrupted time series studies (a minimal regression sketch appears at the end of this section)

A Clear Interrupted Time Series

Circling the Wagons

Accountability Systems
Top-down administrative and funding incentives, together with bottom-up voucher-like programs

When "Outcomes" Imply "Impacts"
When the desired impact is reasonably predicted to follow from the measured outcome

Teen Pregnancy in Anson County, NC, 2001-2008
Figure: Adolescent Parenting Program scale (0-100) and number of pregnancies (rate per 1,000), by year, 2001-2008.

Ineffective Job Training Programs
Job Corps
• Low-income youth
JTPA 1987-1994
• Low-income adults, dislocated workers, and out-of-school youth
WIA (dislocated) 2003-2005
• Classroom training, on-the-job training, job search assistance, adult basic education, and other services
• $2,400 per participant for 3-4 months ($60 million annually)
• Women: Small initial gains in earnings, employment, and GED receipt fade by 5 years
• Men: Small initial gains in earnings fade by 5 years, no
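As flagged under the feasible "outcome" evaluations above, an interrupted time series can often be run on data a program already collects. Below is a minimal segmented-regression sketch on simulated monthly data (hypothetical numbers, not the Anson County series), assuming numpy and statsmodels are available; the coefficient on the post-intervention indicator estimates the level change against the projected pre-period trend.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

# 48 months of a hypothetical outcome, with a program change at month 24.
months = np.arange(48)
cut = 24
post = (months >= cut).astype(float)
months_since = np.where(months >= cut, months - cut, 0)

# Simulated series: mild pre-existing downward trend, an extra level drop
# and slope change after the intervention, plus noise.
y = 80 - 0.3 * months - 8.0 * post - 0.2 * months_since + rng.normal(0, 2, size=48)

# Segmented regression: intercept, pre-trend, level change, post-slope change.
X = sm.add_constant(np.column_stack([months, post, months_since]))
fit = sm.OLS(y, X).fit()
print(fit.params.round(2))      # [intercept, trend, level change, slope change]
print(fit.conf_int().round(2))  # confidence intervals for each coefficient
```

In practice the pre-intervention trend serves as the embedded counterfactual, so a long, stable pre-period series and checks for seasonality and autocorrelation matter at least as much as the regression itself.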