This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike License. Your use of this material constitutes acceptance of that license and the conditions of use of materials on this site. Copyright 2006, The Johns Hopkins University and Jane Bertrand. All rights reserved. Use of these materials permitted only in accordance with license rights granted. Materials provided "AS IS"; no representations or warranties provided. User assumes all responsibility for use, and all liability related thereto, and must independently review all materials for accuracy and efficacy. May contain materials owned by others. User is responsible for obtaining permissions for use from third parties as needed.
Fundamentals of Program Evaluation
Course 380.611
Experimental, Non-experimental and
Quasi-Experimental Designs
Topics to cover:
- Study designs to measure impact
- Non-experimental designs
- Experimental designs
- Quasi-experimental designs
- Observational studies with advanced multivariate analysis
Which study designs control threats to validity? (next class)
- Experimental (randomized controlled trials)
  - "Gold standard" for assessing impact
  - Controls threats to validity
- Quasi-experimental
  - Controls some threats to validity
- Non-experimental
  - Does not control threats to validity
- Observational with statistical controls
  - Controls some threats to validity
Notation in study designs
- X = intervention or program
- O = observation (data collection point)
- RA = random allocation (assignment)
- RS = random selection

How would we interpret?
1) O X O
2) O
   O
Source of data for "O"
- Routine health information (service statistics)
- Survey data
- Other types of quantitative data collection
Non-experimental designs
- Post-test only
- Pretest post-test
- Static group comparison

Sources:
- Fisher and Foreit, 2002
- Cook & Campbell; Campbell & Stanley
Posttest-Only Design

Time
Experimental Group:   X   O1

Example: % of youth that used condom at last sex

Experimental Group:   X   15%

Note: "X" is the campaign.
What are the threats to validity?
Pretest-Posttest Design

Time
Experimental Group:   O1   X   O2

Example: % of youth that used condoms at last sex

Exp.:   3%   X   15%

Threats to validity?
Static-Group Comparison

Time
Experimental Group:   X   O1
Comparison Group:          O2

Example: % of youth that used condoms at last sex

Exp. group:       X   15%
Control group:         5%

Threats to validity?
Questions
- Do evaluators ever use non-experimental designs?
- If so, why?
Experimental designs
- Pretest post-test control group design
- Post-test only control group design
- Randomized controlled trials (RCTs):
  - Are a subset of experimental designs
  - Stay tuned for Ron Gray's lecture (Lecture 12)
Pretest-Posttest Control Group Design

Time
RA   Experimental Group:   O1   X   O2
     Control Group:        O3        O4

Example: % of youth that used condom at last sex

                  Time 1        Time 2
RA   Exp:          10%      X    24%
     Control:      11%           12%

Note: Youth aren't randomized to use/not use condoms. This design implies that they are randomly allocated to a group that RECEIVES/DOES NOT RECEIVE the program.
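A worked reading of the example above: the experimental group rose 14 points (10% to 24%) while the control group rose only 1 point (11% to 12%), so the estimated program effect is about 13 percentage points. A minimal sketch of that difference-in-differences arithmetic in Python (the function name is ours; the numbers come from the slide):

def difference_in_differences(exp_pre, exp_post, ctl_pre, ctl_post):
    """Program effect = change in experimental group minus change in control group."""
    return (exp_post - exp_pre) - (ctl_post - ctl_pre)

# Values from the slide: % of youth who used a condom at last sex
effect = difference_in_differences(exp_pre=10, exp_post=24, ctl_pre=11, ctl_post=12)
print(effect)  # 13 percentage points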
Posttest-Only Control Group Design

Time
RA   Experimental Group:   X   O1
     Control Group:            O2

Example: % of pregnant women who deliver with skilled attendant

                  Time 1      Time 2
RA   Exp.:          X           40%
     Control:                   25%

Note: Women aren't randomized to deliver with/without a skilled attendant. This design implies that they are randomly allocated to a group that RECEIVES/DOES NOT RECEIVE the program.
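Because allocation is random, the two posttest percentages can be compared directly. A minimal sketch of that comparison as a two-proportion z-test (the sample sizes are hypothetical; the slide reports only the percentages):

from statsmodels.stats.proportion import proportions_ztest

# Hypothetical sample sizes; the slide gives only 40% vs. 25%.
n_exp, n_ctl = 400, 400
skilled = [int(0.40 * n_exp), int(0.25 * n_ctl)]  # deliveries with a skilled attendant
z, p = proportions_ztest(count=skilled, nobs=[n_exp, n_ctl])
print(z, p)  # a small p-value would indicate a real difference between the groups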
Multiple Treatment Designs

Although practical difficulties in conducting research increase with design complexity, experiments that examine multiple treatments are frequently used in OR studies. One advantage is that using multiple treatment designs increases the available alternatives.

Time
RA   Experimental Group 1:   O1   X   O2
     Experimental Group 2:   O3   Y   O4
     Control Group:          O5        O6
Limitations of the experimental design
- Difficult to use with full-coverage programs
- Generalizability (external validity) is low
- Politically unpopular
- (In some cases) unethical
Quasi-experimental designs
- Used when randomization is unfeasible or unethical
- Treatment must still be manipulable
- Threats to validity must be explicitly controlled
Three most common quasi-experimental designs:
- Time series
- Pretest post-test non-equivalent control group design
- Separate sample pre-test post-test design
Quasi-experimental designs: Time Series
- Multiple observations before and after X
- Can be used prospectively or retrospectively

Time Series Design

Time
Experimental Group:   O1   O2   O3   X   O4   O5   O6
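A common way to analyze a time series design is segmented (interrupted time-series) regression: fit a level and trend to the pre-intervention observations and test whether either shifts after X. A minimal sketch, with illustrative numbers and variable names of our own choosing:

import numpy as np
import statsmodels.api as sm

# Illustrative values for O1-O6 (millions of condoms distributed); O1-O3 precede X.
y = np.array([2.0, 2.2, 2.4, 4.8, 5.1, 5.3])
t = np.arange(len(y))                       # time trend
post = (t >= 3).astype(int)                 # 1 for observations after X
exog = sm.add_constant(np.column_stack([t, post, post * (t - 3)]))
fit = sm.OLS(y, exog).fit()
print(fit.params)  # intercept, pre-trend, level change at X, trend change after X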
Example 1 of Time Series

[Figure: millions of condoms distributed at O1-O6, with the program intervention (X) between O3 and O4; a sudden increase at the intervention.]
Example 2 of Time Series

[Figure: millions of condoms distributed at O1-O6; a steady increase regardless of the intervention.]
Example 3 of Time Series

[Figure: millions of condoms distributed at O1-O6; increases and decreases both before and after the intervention.]
Example 4 of Time Series

[Figure: millions of condoms distributed at O1-O6; a temporary impact of the intervention.]
Using service statistics to demonstrate probable effect
- Advantages
  - Lowers costs
  - Takes advantage of existing information
  - Constitutes a natural experiment
- Disadvantages
  - Falls short of demonstrating causality
  - Can't know "what would have happened in the absence of the intervention"
Time Series Analysis: Condom Sales in Ghana

[Figure: millions of condoms distributed, semiannually from early 1996 through late 2001; program intervention in February 2000. Sales and distribution figures from MOH, GSMF, and PPAG.]
Clinic Visits in Nepal Before & After Radio Programming

[Figure: average monthly clinic visits under three conditions: no intervention, radio spots on air, and radio serial on air.]
Example: Mass Media Vasectomy Promotion in Brazil
- Reasons that it's a good case study:
  - Topic is almost exclusive to the media campaign
    - Little "naturally occurring" communication on the subject
  - Available service statistics
    - Routinely collected
    - High quality
Key media events that promoted Pro-Pater
- 1983:
  - 3-minute broadcast on vasectomy and Pro-Pater
  - Resulted in a 50% increase in the number of vasectomies
- 1985:
  - 10-week newspaper and magazine promotion (with the Population Council)
  - 54% increase in the number of vasectomies
1989 campaign
- Vasectomy promotion in 3 cities (Sao Paulo, Salvador, and Fortaleza)
- Objectives of the campaign:
  - Increase knowledge/awareness (and eliminate misconceptions)
  - Increase the number of vasectomies among lower-middle-class men aged 25-49
4 phases of the 1989-90 campaign
- (1) Pre-campaign P.R.
  - Press releases, press conference
- (2) Campaign: "vasectomy is an act of love"
  - TV spots (May, June 1989)
  - Radio
  - Pamphlets, billboards, magazine ads
4 phases of the 1989-90 campaign (continued)
- (3) Rebroadcasting of TV spots
  - September 1989
  - 2-5 times daily in the evenings
- (4) Mini-campaign (Jan-March 1990)
  - Ad in Veja magazine
  - Electronic billboard in Sao Paulo
  - Direct mailing of pamphlet to Veja subscribers
Mean number of daily calls to Pro-Pater clinic: 1989-90

[Figure: mean daily calls (scale 0-120) across five periods: before campaign, campaign, after campaign, mini-campaign, and after mini-campaign.]
Results of Brazil Study

Source: D. Lawrence Kincaid, Alice Payne Merritt, Liza Nickerson, Sandra de Castro Buffington, Marcos Paulo P. de Castro, and Bernadete Martin de Castro, "Impact of a Mass Media Vasectomy Promotion Campaign in Brazil," International Family Planning Perspectives, Vol. 22, No. 4 (Dec. 1996), pp. 169-175.
Number of vasectomies
- Poisson regression: measured slopes based on monthly points
- Immediate/substantial increase after the start of the campaign
- Gradual decline through 1990
- After the last campaign, the downward trend became even greater
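A minimal sketch of the kind of Poisson regression described above, fitting monthly counts with a campaign indicator so the pre- and post-campaign slopes can differ (the counts and variable names are illustrative, not the study's data):

import numpy as np
import statsmodels.api as sm

# Illustrative monthly vasectomy counts; months 0-2 precede the campaign.
counts = np.array([60, 62, 58, 110, 104, 98, 92, 85, 80])
month = np.arange(len(counts))
post = (month >= 3).astype(int)  # 1 once the campaign starts
exog = sm.add_constant(np.column_stack([month, post, post * (month - 3)]))
fit = sm.GLM(counts, exog, family=sm.families.Poisson()).fit()
print(fit.params)  # log-scale: baseline trend, level jump at the campaign, slope change after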
Possible hypotheses (for the post-campaign decline)
- Cost of the operation increased
- Competition from other doctors (whom Pro-Pater had trained) increased
Take-home messages re: time series
- "What was the impact of the program?"
- Combining service stats and special studies is useful for understanding program dynamics
- Time series (a quasi-experimental design) can't definitively demonstrate causality
  - "What would have happened in the absence of the program?"
Non-Equivalent Control Group

Time
Experimental Group:   O1   X   O2
Control Group:        O3        O4

(No RA: the groups are not randomly assigned.)
Example: vasectomy promotion in Guatemala (1983-84)
- 3 communication strategies:
  - Radio and promoter
  - Radio only
  - Promoter only
- One community per strategy (all on the southern coast)
- Before/after survey
  - 400 men per community, each survey
Results of Vasectomy Promotion: % interested in vasectomy

                     Pre    Post
Radio & Promoter     16     22
Radio only           32     32
Promoter only        33     37
Comparison           23     27
Results of Vasectomy Promotion: % operated

                     Pre    Post
Radio & Promoter     0.5    0.8
Radio only           1.5    2.0
Promoter only        1.0    3.0*
Comparison           0.3    1.0
Separate Sample Pretest-Posttest Design

Time
RS   Target population:   O1   X
RS   Target population:        X   O2
Findings from Mayan Birthspacing Project

[Figure: before/after percentages (scale 0-100) for four indicators: knows FP, approves FP, ever used FP, and uses FP.]
Alternative to experimental designs

Application of multi-level, multivariate statistical analysis to "observational" (cross-sectional) data.
Example: Measuring the effects of exposure (dose effect)
- Did increased exposure to Stop AIDS Love Life in Ghana relate to desired outcomes?
  - Controlling for socio-economic factors and access to media
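A minimal sketch of this kind of dose-effect analysis: a logistic regression of an outcome on exposure level with socio-economic and media-access controls (the dataset and variable names are hypothetical, not the Ghana study's):

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical survey extract; column names are ours.
df = pd.read_csv("survey.csv")  # used_condom (0/1), exposure (none/low/high), age, educ, owns_radio

fit = smf.logit(
    "used_condom ~ C(exposure, Treatment('none')) + age + educ + owns_radio",
    data=df,
).fit()
print(fit.summary())  # the exposure coefficients are the dose effect, net of the controls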
% aware of at least one way to avoid AIDS, by level of exposure

[Figure: percentages (scale 0-100) for males and females at three exposure levels: none, low, and high.]
% who believe their friends approve of condoms (HIV/AIDS)

[Figure: percentages (scale 0-80) for males and females at three exposure levels: none, low, and high.]
ABC's: Abstinence (mean age at first sex)

[Figure: mean age at first sex (scale 16-20.5) for males and females at three exposure levels: none, low, and high.]
ABC's: Being faithful (% with 1 partner in past year)

[Figure: percentages (scale 0-30) for males and females at three exposure levels: none, low, and high.]
ABC's: Condom use (% used condom at last sex)

[Figure: percentages (scale 0-35) for males and females at three exposure levels: none, low, and high.]
Example of estimating the effects of exposure to media
- Example from Tsha Tsha (South Africa)
- Uses data on exposure to the communication program as a determinant of outcome
- (Credits to CADRE in S. Africa, and to the authors: Kevin Kelly, Warren Parker, and Larry Kincaid)
Perceived Qualities of Boniswa among Females

[Figure: percentage of female viewers attributing each quality to Boniswa: honest, courageous, concerned, confident, and loyal, ranging from 61% to 69%. N=269.]
Example: relating outcome to identification with characters
- Does identification with specific characters in a soap opera affect outcomes?
- Controlling for SES
Multiple regression to estimate the independent effect of exposure, controlling for other influences*

[Path diagram: AIDS attitude regressed on recall of the drama (path -.01) together with controls: learned about AIDS on TV, female, lagged AIDS attitude (wave 1), education, frequency of TV viewing, and KwaZulu province residence; the control paths range from .09 to .33.]

* Including the lagged attitude means that the impact of the other variables is on change in attitude.
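A minimal sketch of a regression with the lagged outcome included, so the remaining coefficients reflect change in attitude between waves (the dataset and variable names are hypothetical, not the study's):

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical two-wave panel extract; aids_attitude_w1 is the wave-1 (lagged) attitude.
df = pd.read_csv("panel.csv")

fit = smf.ols(
    "aids_attitude ~ aids_attitude_w1 + recall_drama + learned_aids_tv"
    " + female + education + tv_frequency + kwazulu",
    data=df,
).fit()
print(fit.params)  # with the lag in the model, the other effects are on attitude change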
Path model: Effects of recall of the drama and identification with Boniswa on AIDS attitudes

[Path diagram: extends the previous model with identification with Boniswa, watching "Days of Our Lives", and abstaining for a month or more; path coefficients range from -.01 to .34. Wave 3; South Africa, 2004.]
Observational studies with statistical controls
- Widespread use, especially among academics
- Advantages:
  - Doesn't require an experimental design
  - Allows greater understanding of the pathways to change (tests the conceptual framework)
- Limitations:
  - Difficult to rule out confounding factors
Why are the Victora et al. and Habicht et al. articles important?
- Challenge the RCT as the gold standard for evaluating ALL interventions
- Suggest different levels of evidence:
  - Probability (RCTs)
  - Plausibility (includes a comparison group and addresses confounding factors)
  - Adequacy (based on trends in the expected direction following the intervention)
Reasons why RCTs may have low external validity (Victora)
- The causal link between intervention and outcome can vary depending on external factors:
  - The actual dose delivered to the target population varies
    - Institutional, provider, and recipient behaviors
  - The dose-response relationship varies by site:
    - Nutritional interventions (health of the target population)
Trade-off between internal and external validity
- Example: evaluating a communication campaign
- Pilot projects don't deliver the full impact of a national full-coverage program
  - Results are not generalizable
- Mass media: lower internal validity
  - Can't control for all confounding factors
Take-home message from Victora et al. / Habicht et al.
- RCTs are the gold standard:
  - But may not be appropriate for evaluating large-scale public health interventions
- Evidence-based public health must draw on studies with designs other than RCTs:
  - Plausibility
  - Adequacy