May, 2014

MILLIONS SAVED 3RD EDITION
CRITERIA FOR CASE STUDY SELECTION [1]

1. IMPORTANCE

The case study must address a problem or problems of public health significance [2] in LMICs.

o Standardized measures of health status will be used to gauge the significance of the health problem(s), mainly using data on mortality, morbidity and disability-adjusted life years (DALYs) as reported in global burden of disease studies and WHO global health estimates; other measures such as QALYs, YLL, and related can also be used to assess importance as relevant. [3]

While potentially minor in terms of contribution to total current burden of disease, MS3 also considers the following to be of public health significance:
- neglected tropical diseases, which continue to disproportionately plague the poorest populations despite the existence of low-cost preventative measures.
- prevention to keep old epidemic threats from re-emerging, e.g., vaccines.
- responses to virulent new epidemic threats, increasingly important in the globalized world.

MS3 views the burden of disease related to malnutrition as a health issue.

ISSUES:
o This approach may exclude scaled programs with documented benefits on outputs known to be essential to improving health outcomes, e.g., skilled birth attendance, immunization.
o National/regional burden of disease may not reflect important health threats that are only beginning to emerge as noted above.

2. IMPACT

The program must demonstrate a positive, statistically significant, attributable impact on a population health outcome using an experimental or quasi-experimental study design. Programs that demonstrate a population-level change that is directly attributable to the intervention will be considered in the absence of experimental or quasi-experimental studies, e.g., smallpox eradication. Demonstration of impact should be quantitative. Cases that demonstrate impact on related outcomes, especially equity and financial protection, will be prioritized as outlined below.
Two “deficient at scale” cases will be included that demonstrate cost-effectiveness on a small scale but failure to achieve health impact following scale-up (see below*).

ISSUES:
o May exclude nascent health programs addressing emerging BoD; too early for measurable health impact.
o The meaning of global health “success” depends on one’s perspective: at various stages the concept has been driven by differing conceptions of vertical versus horizontal, technical versus social, centrally driven versus locally defined, disease-based versus health-based, individually versus collectively-oriented, doctor-centered versus healer-centered versus community-centered, and so on. [4]
o Health impact results can be controversial/contested, e.g., JSY.
o Average program effects may reflect the program’s impact on some subpopulations and not others, masking important diversity. Less evidence exists on effective approaches to reaching the most marginalized and disadvantaged.

[1] Information on candidates that fail to meet our criteria will be available in MS3 web-based supplemental material.
[2] Although a term frequently employed in the literature, there is no standard definition of “public health significance”.
[3] The burden of disease will be described at the time the program was operational.

3. SCALE

Programs should be implemented on a significant scale. In some cases this will be nationwide, but regional or other relevant population scales are also acceptable. Programs may be characterized as “national” if they represent a national-level commitment, even if they target a health problem that affects only a limited geographic area or sub-group. Given the nascent state of chronic and NCD programming in the developing world, consideration will be given to smaller-scale programs to address these challenges.
Where there are no cases that address the major causes of BoD meeting MS3’s scale criteria, special consideration will be given to small-scale programs or trials as “promising” cases. “Promising” cases should adhere to the other criteria.

ISSUES:
o May exclude programs with limited coverage that yield important lessons for wide-scale programming, e.g., on MDR-TB.
o Few programs operating ‘at scale’ have undergone robust study/evaluation.

4. ECONOMIC EVALUATION

Interventions should be highly cost-effective as determined by a standardized measure of cost-effectiveness, e.g., costing less than a GDP per capita for each DALY averted. [5] A context-specific measure of cost-effectiveness (e.g., $ per case averted; financial or agricultural return on investment) may also be included. Additional economic analysis, including cost-benefit and assessments of impact on financial protection and equity, may also be included in conjunction with DCP3’s economic analysis (TBD in consultation with DCP3).

ISSUES:
o MS3 will report on the limitations of the cost-effectiveness analyses used, and point to specific areas for improvements.
o May preclude consideration of scaled programs with demonstrated effectiveness that lack sufficient cost information to make the calculation. Scaled up programs are less likely to have undergone cost-effectiveness analysis than small-scale projects or trials.
o Attribution of impact to specific input will be a challenge in programming on financial protection and universal health coverage; similarly, many programs to address the structural determinants of health do not lend themselves to cost-effectiveness analysis.

[4] See for example, Anne-Emmanuelle Birn: The stages of international (global) health: histories of success or successes of history? http://www.ncbi.nlm.nih.gov/pubmed/19153930
[5] http://www.who.int/choice/en/

5. DURATION

It is preferable that interventions have functioned “at scale” for at least five years.
It is preferable that the health impact be sustained over time, with sufficient follow-up results to determine medium to long term benefits.

ISSUES:
o Excludes consideration of recent programs addressing the emerging BoD or application of major new research findings, e.g., male circumcision for HIV prevention.

6. RELEVANCE

Case information should be of interest and programmatically relevant in other settings. It is preferable that information about the program’s characteristics and success factors be available to give a sense of the broader applicability, with greater weight given to programs with transparency and credible documentation.

ISSUES:
o True generalizability may be at odds with the scientific rigor required by MS3; hence MS3 should exercise caution when making claims about generalizability. [6] External validity is not a formal selection criterion.
o Context matters – cases will include information on the factors specific to the setting that hindered and promoted their success to shed light on generalizability.

7. EQUITY

For measurement purposes, we consider equity in health “the absence of systematic disparities in health (or in the major social determinants of health) between groups with different levels of underlying social advantage/disadvantage – that is, wealth, power, or prestige.” [7] MS3 will look for programs that are pro-poor and include specific measures to reduce the barriers that prevent specific sub-populations – those disadvantaged by low SES, gender inequality, geography, ethnicity – from accessing health benefits. We will make the most of indirect measures of equity, e.g., from DHS and other national survey data. The equity criterion will be applied on a scale rather than a ‘yes/no’. Consideration may be given to programs that meet other criteria but are unable to demonstrate impact on health equity.

ISSUES:
o It may not be possible to quantify the program’s impact on equity in the absence of results disaggregated by SES, sex, etc.

[6] Lant Pritchett and Justin Sandefur. 2013. “Context Matters for Size: Why External Validity Claims and Development Practice Don’t Mix.” CGD Working Paper 336. Washington, DC: Center for Global Development.
[7] Defining equity in health, Braveman and Gruskin, J Epidemiol Community Health 2003;57:254-258, doi:10.1136/jech.57.4.254

8. FINANCIAL PROTECTION

MS3 will look for programs that aim to reduce the financial hardship and impoverishment associated with health problems. Financial protection can be quantified by a reduction in out-of-pocket spending on health (prevention, diagnosis, treatment). [8] Other indications of financial protection also may be used, bearing in mind concerns about a narrow definition of financial protection [9] (in discussion with DCP3). We will make the most of indirect measures of financial protection, e.g., from DHS and other national survey data (including benefit incidence). Programs that include measures to improve health service delivery and health system responsiveness, e.g., results-based financing, also will be considered in the context of Universal Health Coverage. The financial protection criterion will be applied on a scale rather than a ‘yes/no’. Consideration may be given to programs that meet other criteria but are unable to demonstrate impact on financial protection.

ISSUES:
o Information needed to quantify financial protection may not be available, particularly in older case study candidates.

* CONSIDERATIONS FOR DEFICIENT AT SCALE CASES

Four cases will help readers learn from failure. As with the other programs, these cases will highlight the large-scale application of interventions with proven success in a controlled setting that failed to deliver at scale. These cases will illustrate the importance of considering factors beyond cost-effectiveness, particularly delivery and demand-side risks.
To qualify as a “deficient at scale” case:
o The intervention should be proven as effective and cost-effective in a pilot or trial.
o The program will adhere to the MS3 criteria of importance and scale as outlined above; however, it will fail to achieve impact at scale.
o Failure to achieve widespread impact can be related to insufficient attention to contextual factors such as the barriers to health service access, gender inequality, political economy, and demands on the health system, as well as the multi-factorial nature of health risks, a theory of change, etc.
o Information on the reasons for failure should be available in the published literature, supported by grey literature and interviews with key actors as feasible.

[8] http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1106043
[9] E.g., Ruger alternative framework at: http://www.plosmedicine.org/article/info%3Adoi%2F10.1371%2Fjournal.pmed.1001294
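
ILLUSTRATION: COST-EFFECTIVENESS THRESHOLD

As a rough illustration of the example threshold named in criterion 4 (costing less than a GDP per capita for each DALY averted), the check could be sketched in Python as below. The function names and all figures are hypothetical and chosen only for illustration; they are not drawn from any MS3 candidate case or from DCP3's economic analysis, and a real assessment would also need to state how costs and DALYs were estimated.

```python
def cost_per_daly_averted(incremental_cost: float, dalys_averted: float) -> float:
    """Return the cost per DALY averted (same currency as incremental_cost)."""
    if dalys_averted <= 0:
        raise ValueError("dalys_averted must be positive")
    return incremental_cost / dalys_averted


def is_highly_cost_effective(incremental_cost: float,
                             dalys_averted: float,
                             gdp_per_capita: float) -> bool:
    """Apply the example threshold from criterion 4:
    cost per DALY averted below one GDP per capita."""
    return cost_per_daly_averted(incremental_cost, dalys_averted) < gdp_per_capita


# Hypothetical program: all three numbers are invented for illustration.
program_cost_usd = 2_000_000     # incremental program cost
program_dalys_averted = 5_000    # estimated DALYs averted
country_gdp_per_capita = 1_200   # GDP per capita, same currency and year

ratio = cost_per_daly_averted(program_cost_usd, program_dalys_averted)
meets_threshold = is_highly_cost_effective(
    program_cost_usd, program_dalys_averted, country_gdp_per_capita
)
print(f"Cost per DALY averted: {ratio:.0f} USD; meets threshold: {meets_threshold}")
```

The sketch covers only the standardized measure. Context-specific measures (e.g., $ per case averted) and the financial protection and equity assessments described in criteria 7 and 8 would need separate treatment.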