This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this site. Copyright 2006, The Johns Hopkins University and Jane Bertrand. All rights reserved. Use of these materials is permitted only in accordance with the license rights granted. Materials are provided "AS IS"; no representations or warranties are provided. The user assumes all responsibility for use, and all liability related thereto, and must independently review all materials for accuracy and efficacy. Materials may contain content owned by others; the user is responsible for obtaining permissions for use from third parties as needed.

Fundamentals of Program Evaluation
Course 380.611
Experimental, Non-Experimental and Quasi-Experimental Designs

Topics to cover: study designs to measure impact
- Non-experimental designs
- Experimental designs
- Quasi-experimental designs
- Observational studies with advanced multivariate analysis

Which study designs (next class) control for threats to validity?
- Experimental (randomized controlled trials): the "gold standard" for assessing impact; controls threats to validity
- Quasi-experimental: controls some threats to validity
- Non-experimental: does not control threats to validity
- Observational with statistical controls: controls some threats to validity

Notation in study designs
- X = intervention or program
- O = observation (data collection point)
- RA = random allocation (assignment)
- RS = random selection
How would we interpret the following?  O X O O O

Sources of data for "O"
- Routine health information (service statistics)
- Survey data
- Other types of quantitative data collection

Non-experimental designs
- Posttest-only
- Pretest-posttest
- Static-group comparison
Sources: Fisher and Foreit, 2002; Cook & Campbell; Campbell & Stanley

Posttest-Only Design
  Experimental group:  X  O1
Example: % of youth who used a condom at last sex
  Experimental group:  X  15%
Note: "X" is the campaign. What are the threats to validity?

Pretest-Posttest Design
  Experimental group:  O1  X  O2
Example: % of youth who used a condom at last sex
  Experimental group:  3%  X  15%
Threats to validity?

Static-Group Comparison
  Experimental group:  X  O1
  Comparison group:       O2
Example: % of youth who used a condom at last sex
  Experimental group:  X  15%
  Comparison group:        5%
Threats to validity?

Questions
- Do evaluators ever use non-experimental designs? If so, why?

Experimental designs
- Pretest-posttest control group design
- Posttest-only control group design
- Randomized controlled trials (RCTs) are a subset of experimental designs; stay tuned for Ron Gray's lecture (Lecture 12)

Pretest-Posttest Control Group Design
  RA  Experimental group:  O1  X  O2
      Control group:       O3     O4
Example: % of youth who used a condom at last sex
                    Time 1        Time 2
  Experimental       10%     X     24%
  Control            11%           12%
Note: Youth aren't randomized to use or not use condoms. This design implies that they are randomly allocated to a group that RECEIVES or DOES NOT RECEIVE the program.
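One common way to summarize results from a pretest-posttest control group design is a simple difference-in-differences calculation. The minimal Python sketch below applies that calculation to the illustrative condom-use figures above; the course slides do not prescribe this particular computation, so treat it as one possible reading of the example.

```python
# A minimal sketch (not from the course materials): difference-in-differences
# applied to the illustrative condom-use figures above.
pre_exp, post_exp = 0.10, 0.24   # experimental group at Time 1 and Time 2
pre_ctl, post_ctl = 0.11, 0.12   # control group at Time 1 and Time 2

change_exp = post_exp - pre_exp  # change with the program: ~14 points
change_ctl = post_ctl - pre_ctl  # change without the program: ~1 point

# Estimated program effect: the experimental group's change beyond what
# the randomized control group experienced over the same period.
effect = change_exp - change_ctl
print(f"Estimated program effect: {effect:.2%}")  # about 13 percentage points
```

Because allocation was random, the control group's change stands in for "what would have happened anyway," which is exactly the counterfactual the non-experimental designs above cannot supply.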
Posttest-Only Control Group Design
  RA  Experimental group:  X  O1
      Control group:          O2
Example: % of pregnant women who deliver with a skilled attendant
                    Time 1       Time 2
  RA  Experimental    X           40%
      Control                     25%
Note: Women aren't randomized to deliver with or without a skilled attendant. This design implies that they are randomly allocated to a group that RECEIVES or DOES NOT RECEIVE the program.

Multiple Treatment Designs
Although the practical difficulties of conducting research increase with design complexity, experiments that examine multiple treatments are frequently used in operations research (OR) studies. One advantage is that multiple treatment designs increase the alternatives available for comparison.
  RA  Experimental group 1:  O1  X  O2
      Experimental group 2:  O3  Y  O4
      Control group:         O5     O6

Limitations of the experimental design
- Difficult to use with full-coverage programs
- Generalizability (external validity) is low
- Politically unpopular
- (In some cases) unethical

Quasi-experimental designs
- Used when randomization is unfeasible or unethical
- The treatment must still be manipulable
- Threats to validity must be explicitly controlled

Three most common quasi-experimental designs
- Time series
- Pretest-posttest non-equivalent control group design
- Separate-sample pretest-posttest design

Quasi-experimental designs: time series
- Multiple observations before and after X
- Can be used prospectively or retrospectively

Time Series Design
  Experimental group:  O1  O2  O3  X  O4  O5  O6

[Figure: Example 1 of time series - millions of condoms distributed; sudden increase at the program intervention (X)]
[Figure: Example 2 of time series - millions of condoms distributed; steady increase regardless of the intervention]
[Figure: Example 3 of time series - millions of condoms distributed; increases and decreases both before and after the intervention]
[Figure: Example 4 of time series - millions of condoms distributed; temporary impact of the intervention]

Using service statistics to demonstrate probable effect
Advantages:
- Lowers costs
- Takes advantage of existing information
- Constitutes a natural experiment
Disadvantages:
- Falls short of demonstrating causality
- Can't know "what would have happened in the absence of the intervention"

[Figure: Time series analysis of condom sales in Ghana, early 1996 through late 2001 (millions of condoms distributed); program intervention in February 2000. Sales and distribution figures from MOH, GSMF, and PPAG.]

[Figure: Average monthly clinic visits in Nepal before and after radio programming - no intervention, radio spots on air, radio serial on air]
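To show how a time series like the condom-distribution examples above might be analyzed statistically, here is a minimal segmented-regression (interrupted time series) sketch in Python. The data, variable names, and model specification are hypothetical illustrations, not the analysis actually used for the Ghana or Nepal series.

```python
# Minimal interrupted time-series (segmented regression) sketch using
# hypothetical monthly condom-distribution figures; not the actual
# analysis behind the Ghana series shown above.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "time": range(12),                      # 6 periods before X, 6 after
    "condoms_millions": [2.1, 2.3, 2.2, 2.4, 2.5, 2.6,
                         4.0, 4.2, 4.3, 4.4, 4.5, 4.6],
})
df["post"] = (df["time"] >= 6).astype(int)  # 1 after the intervention (X)
df["time_since_x"] = np.where(df["post"] == 1, df["time"] - 6, 0)

# 'post' captures an immediate level change at X; 'time_since_x'
# captures a change in slope after X, relative to the pre-X trend.
model = smf.ols("condoms_millions ~ time + post + time_since_x", data=df).fit()
print(model.params)
```

In terms of the four patterns above, a large 'post' coefficient corresponds to Example 1 (a sudden increase at the intervention), while coefficients near zero for 'post' and 'time_since_x' correspond to Example 2 (a steady increase regardless of the intervention).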
Example: Mass Media Vasectomy Promotion in Brazil
Reasons it makes a good case study:
- The topic is almost exclusive to the media campaign; there is little "naturally occurring" communication on the subject
- Service statistics are available, routinely collected, and of high quality

Key media events that promoted Pro-Pater
- 1983: 3-minute broadcast on vasectomy and Pro-Pater; resulted in a 50% increase in the number of vasectomies
- 1985: 10-week newspaper and magazine promotion (with the Population Council); 54% increase in the number of vasectomies

1989 campaign
- Vasectomy promotion in 3 cities (Sao Paulo, Salvador, and Fortaleza)
- Objectives of the campaign:
  - Increase knowledge/awareness (and eliminate misconceptions)
  - Increase the number of vasectomies among lower-middle-class men aged 25-49

4 phases of the 1989-90 campaign
(1) Pre-campaign P.R.: press releases, press conference
(2) Campaign, "vasectomy is an act of love": TV spots (May-June 1989), radio, pamphlets, billboards, magazine ads
(3) Rebroadcasting of TV spots: September 1989, 2-5 times daily in the evenings
(4) Mini-campaign (January-March 1990): ad in Veja magazine, electronic billboard in Sao Paulo, direct mailing of a pamphlet to Veja subscribers

[Figure: Mean number of daily calls to the Pro-Pater clinic, 1989-90 - before campaign, campaign, after campaign, mini-campaign, after mini-campaign]

Results of the Brazil study
Number of vasectomies:
- Poisson regression measured slopes based on monthly data points (see the illustrative sketch below)
- Immediate, substantial increase after the start of the campaign
- Gradual decline through 1990
- After the last campaign, the downward trend became even steeper
Source: D. Lawrence Kincaid, Alice Payne Merritt, Liza Nickerson, Sandra de Castro Buffington, Marcos Paulo P. de Castro, and Bernadete Martin de Castro, "Impact of a Mass Media Vasectomy Promotion Campaign in Brazil," International Family Planning Perspectives, Vol. 22, No. 4 (Dec. 1996), pp. 169-175.

Possible hypotheses for the decline
- The cost of the operation increased
- Competition from other doctors (whom Pro-Pater had trained) increased

Take-home messages re: time series
- "What was the impact of the program?" Combining service statistics with special studies is useful for understanding program dynamics
- A time series (quasi-experimental design) can't definitively demonstrate causality: what would have happened in the absence of the program?
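As a rough illustration of the segmented Poisson regression described in the Brazil results (slopes estimated from monthly counts), here is a Python sketch using invented monthly vasectomy counts. It shows the general form of such a model only; it does not reproduce the published Kincaid et al. (1996) specification or data.

```python
# Rough sketch of a segmented Poisson regression on monthly counts, with
# invented data; not the published Kincaid et al. (1996) model.
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "month": range(12),
    "vasectomies": [60, 62, 58, 63, 110, 104, 99, 95, 90, 86, 82, 79],
})
df["campaign"] = (df["month"] >= 4).astype(int)       # campaign launch in month 4
df["months_since"] = (df["month"] - 4).clip(lower=0)  # time since launch

# 'campaign' estimates the immediate jump at launch; 'months_since'
# estimates the post-launch slope (e.g., the gradual decline through 1990).
model = smf.glm("vasectomies ~ month + campaign + months_since",
                data=df, family=sm.families.Poisson()).fit()
print(model.summary())
```

A Poisson model is a natural choice here because the outcome is a monthly count of procedures rather than a proportion or a continuous measure.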
Non-Equivalent Control Group Design
  Experimental group:  O1  X  O2
  Control group:       O3     O4

Example: vasectomy promotion in Guatemala (1983-84)
- 3 communication strategies: radio and promoter; radio only; promoter only
- One community per strategy (all on the southern coast)
- Before/after surveys, 400 men per community in each survey

Results of vasectomy promotion: % interested in vasectomy
                      Pre   Post
  Radio & promoter     16     22
  Radio only           32     32
  Promoter only        33     37
  Comparison           23     27

Results of vasectomy promotion: % operated
                      Pre   Post
  Radio & promoter    0.5    0.8
  Radio only          1.5    2.0
  Promoter only       1.0    3.0*
  Comparison          0.3    1.0

Separate-Sample Pretest-Posttest Design
  RS  Target population:  O1  X
  RS  Target population:      X  O2

[Figure: Findings from the Mayan Birthspacing Project - % who know of, approve of, have ever used, and currently use family planning, before vs. after the intervention]

Alternative to experimental designs
Application of multi-level, multivariate statistical analysis to "observational" (cross-sectional) data.

Example: measuring the effects of exposure (dose effect)
- Did increased exposure to Stop AIDS Love Life in Ghana relate to the desired outcomes?
- Controlling for socio-economic factors and access to media

[Figure: % aware of at least one way to avoid AIDS, by level of exposure (none, low, high), males and females]
[Figure: % who believe their friends approve of condoms (HIV/AIDS), by level of exposure, males and females]
[Figure: ABCs, abstinence - mean age at first sex, by level of exposure, males and females]
[Figure: ABCs, being faithful - % with 1 partner in the past year, by level of exposure, males and females]
[Figure: ABCs, condom use - % who used a condom at last sex, by level of exposure, males and females]

Example of estimating the effects of exposure to media
- Example from Tsha Tsha (South Africa)
- Uses data on exposure to the communication program as a determinant of the outcome
(Credits to CADRE in South Africa, and to the authors Kevin Kelly, Warren Parker, and Larry Kincaid)

[Figure: Perceived qualities of Boniswa among females (N=269) - honest, courageous, concerned, confident, loyal; roughly 61-69% each]

Example: relating the outcome to identification with characters
- Does identification with specific characters in a soap opera affect outcomes?
- Controlling for SES

Multiple regression to estimate the independent effect of exposure, controlling for other influences
[Figure: regression diagram - AIDS attitude predicted by recall of the drama, learning about AIDS on TV, lagged AIDS attitude (wave 1), sex (female), education, frequency of TV viewing, and KwaZulu province residence]
* Including the lagged attitude means that the impact of the other variables is on change in attitude.
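The Python sketch below illustrates the general form of a lagged-outcome multiple regression like the one diagrammed above, using simulated data. The variable names and data are hypothetical and do not reproduce the actual Tsha Tsha analysis or its coefficients.

```python
# Illustrative lagged-outcome regression on simulated data; variable names
# are hypothetical and do not reproduce the Tsha Tsha analysis.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "attitude_w1": rng.normal(size=n),           # lagged AIDS attitude (wave 1)
    "recall_drama": rng.integers(0, 2, size=n),  # recalled the drama (0/1)
    "female": rng.integers(0, 2, size=n),
    "education": rng.integers(0, 13, size=n),    # years of schooling
    "tv_frequency": rng.integers(0, 8, size=n),  # days per week watching TV
})
# Simulated wave-3 attitude influenced by the lagged attitude and recall.
df["attitude_w3"] = (0.5 * df["attitude_w1"] + 0.3 * df["recall_drama"]
                     + rng.normal(scale=0.5, size=n))

# With the wave-1 attitude included, the other coefficients are read as
# effects on *change* in attitude, as noted on the slide above.
model = smf.ols("attitude_w3 ~ attitude_w1 + recall_drama + female"
                " + education + tv_frequency", data=df).fit()
print(model.params)
```

Controlling for the lagged outcome and background covariates is what allows an observational design like this to approximate the "independent effect of exposure," though it still cannot rule out unmeasured confounding.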
[Figure: Path model of the effects of recall of the drama and identification with Boniswa on AIDS attitudes (Wave 3; South Africa, 2004) - predictors include watching "Days of Our Lives," abstaining for a month or more, learning about AIDS on TV, lagged AIDS attitude, frequency of TV viewing, sex (female), education, KwaZulu province residence, recall of the drama, and identification with Boniswa]

Observational studies with statistical controls
- Widespread use, especially among academics
- Advantages:
  - Don't require an experimental design
  - Allow greater understanding of the pathways to change (test the conceptual framework)
- Limitations:
  - Difficult to rule out confounding factors

Why are the Victora et al. and Habicht et al. articles important?
- They challenge the RCT as the gold standard for evaluating ALL interventions
- They suggest different levels of evidence:
  - Probability (RCTs)
  - Plausibility (includes a comparison group and addresses confounding factors)
  - Adequacy (based on trends in the expected direction following the intervention)

Reasons why RCTs may have low external validity (Victora)
- The causal link between intervention and outcome can vary depending on external factors:
  - The actual dose delivered to the target population varies
  - Institutional, provider, and recipient behaviors
- The dose-response relationship varies by site:
  - Nutritional interventions (health of the target population)

Trade-off between internal and external validity
Example: evaluating a communication campaign
- Pilot projects don't deliver the full impact of a national full-coverage program; results are not generalizable
- Mass media: lower internal validity; can't control for all confounding factors

Take-home message from Victora et al./Habicht et al.
- RCTs are the gold standard, but may not be appropriate for evaluating large-scale public health interventions
- Evidence-based public health must draw on studies with designs other than RCTs (plausibility, adequacy)