POWERPOINT ch9

Quasi-Experimental
Approaches to
Outcome Evaluation
Presented By: Lana, Kasia & Catherine
Concept Map: Let us Explain
Quasi-Experiments: How to Increase the Validity of Interpretations
Making Observations a Greater # of Times
– Time-Series Designs (information across many time intervals)
– Analysis of Time-Series Designs: ABAB Design; When the Intervention Cannot Be Removed, but the Effect Is Large
Observing Other Groups
– Nonequivalent Control Group Designs
– Problems in Selecting Comparison Groups
– Regression-Discontinuity Design
Observing Other Dependent Variables
Combining Designs to Increase Internal Validity
– Time-Series and Nonequivalent Control Groups
– Selective Control Design
How the Feminist/Critical Research Paradigm Applies
Quasi?




The word quasi means “as if” or “almost,” so a
quasi-experiment is almost a true experiment.
What makes a true experiment is random
assignment of groups or people to treatments.
But in quasi-experiments you have only partial
control over the independent variables because
assignment to conditions is not random.
This is useful when random assignment is
impossible or unethical (males vs. females, high
vs. low self-esteem).


It’s not just having intact groups that creates a quasi-experiment.
Individuals who are not in intact groups could enter
treatment levels through self-selection (e.g., because
of a particular performance category) or because the
researcher has “paired” individuals who are believed
to be similar.
How To Tell If Something Caused
Something Else…3 STEPS




1) The cause comes before the expected effect
2) The cause co-varies with the effect (the more of
the cause, the more of the effect)
Now are you ready??? Cause this next step is the
focus of this chapter!!!
3) No possible explanation of the effect can be
found EXCEPT for the assumed cause
Quasi-Experiments




Use three methods to increase validity:
1) Observing participants at additional times both
before and after the program
2) Observing additional natural groups of people
who were not involved in the program
3) Using a variety of variables, some expected to
be affected by the program and some not
Keep in Mind…
• These methods do not achieve the
airtight control of true experiments
• However, quasi-experiments do control
for many biases and can yield highly
interpretable evaluations

Making Observations A Greater
Number of Times

This can help with the following problems:
• It helps separate real change from random changes
from time period to time period. All variables will
show variation over time.
Ex. # of crimes on a given day…not consistent.
• Without info on day-to-day variation, one cannot
know whether there is anything that needs to be
explained.
• Ignoring random variation and threats to internal
validity can lead to erroneous interpretations of the
causal connection between two events.
Time Series Designs

• This is a longitudinal design, meaning it runs over
time.
• Participants are tested at different times
during the course of the study.
• Strategy: obtain a baseline measurement
before an intervention and document both the
change and the maintenance of change
• This is a way of meeting some of the internal
validity challenges that were described in
chapter 8.
How can it help with internal
validity?

• Maturation: the effects can be traced during
the time periods before and after the
intervention
• History: effects are more easily detected. By
relating changes in dependent variables to
historical events, it is possible to distinguish
the effects of the program from the impact of
major non-program influences.

Time series designs can also be of two types:
1) interrupted, 2) non-interrupted
Interrupted Vs. Non-interrupted

• Both types examine changes in the
dependent variable over time
• However, the interrupted time series design
involves before and after measurement.
• Program evaluators use interrupted designs
almost exclusively when a definite
intervention has occurred at a specific time.
• The evaluator’s job is to learn whether the
interruption (that is, the introduction of a
program) had an impact.
Characteristics Of A Time Series
Design
1) A single unit is defined
2) Measurements are made
3) Over a number of time intervals
4) That precede and follow some
controlled or natural intervention


The unit observed (person, group, etc)
serves as its own control.
Possible patterns:
• There are a number of possible patterns
observable in graphs of a program’s
outcome plotted over time periods.
• Now, we will explain two of the
possibilities.
• Note: the dashed line in the pictures
represents the time of the
program/intervention.

No Effect of the Intervention
There appears to be no out-of-the-ordinary
change in the observations after the program.
[Figure: criterion plotted over time intervals]
Most Hoped for Finding
This shows a marked increase from a
fairly stable level before the
intervention, and the criterion remains
fairly stable afterwards.
[Figure: criterion plotted over time intervals]
Analysis of Time-Series Designs

1) ABAB Design
– When the intervention is implemented and then removed.
– After establishing a baseline, an intervention is introduced
that is supposed to reduce the frequency of a problem
behavior.
– Suppose the intervention is effective: the problem behavior
decreases.
– After several observation periods the intervention is
removed.
– If the rate of the problem behavior increases, it appears as if
the intervention had an effect (that is, the change was not
due to just maturation or history).
– If the intervention is reintroduced and the problem behavior
is again decreased, it is quite safe to say the intervention is
effective!
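A minimal sketch of how ABAB observations might be summarized, assuming made-up counts of a problem behavior per observation period. The phase labels (A1, B1, A2, B2), the numbers, and the helpers `phase_means` and `follows_abab_pattern` are illustrative assumptions, not data or methods from the book.

```python
from statistics import mean

# Hypothetical counts of a problem behavior per observation period.
# A1 = baseline, B1 = intervention, A2 = removal, B2 = reintroduction.
observations = {
    "A1": [9, 10, 8, 9],
    "B1": [5, 4, 4, 3],
    "A2": [8, 9, 8, 9],
    "B2": [4, 3, 4, 3],
}

def phase_means(data):
    """Return the mean count for each phase."""
    return {phase: mean(values) for phase, values in data.items()}

def follows_abab_pattern(means):
    """True if behavior drops in both B phases and rebounds on removal."""
    return (means["B1"] < means["A1"]
            and means["A2"] > means["B1"]
            and means["B2"] < means["A2"])

means = phase_means(observations)
print(means)
print("Consistent with an intervention effect:", follows_abab_pattern(means))
```

Comparing phase means is only a first look. A real evaluation would also plot the periods and consider the variation within each phase.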
Continued…

2) When the intervention cannot be removed,
but the effect is large.
– In this case the impact of the intervention may still
be obvious because of the large effect.
– There is a good example in your book, on page 182,
that you can make a note to look over. The
example uses smoothing.
– Smoothing is a method used to reduce the
effect of random variation and show trends.
– Smoothing a graph is no different from finding the
mean of a set of numbers in order to identify the
general pattern.
– A smoothed graph can reveal a pattern in a graph
better than a graph of the raw data can, just as a
mean reveals a general trend better than a list of
original data points.
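Smoothing can be sketched with a simple moving average. This is one common smoothing method, assumed here for illustration; the book's example may use a different one. The function name `moving_average`, the window size, and the monthly counts are all made up.

```python
def moving_average(values, window=3):
    """Smooth a series by averaging each run of `window` consecutive points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

# Hypothetical monthly counts: noisy, with a jump after the sixth month.
raw = [12, 15, 11, 14, 12, 14, 21, 19, 23, 20, 22, 21]
smoothed = moving_average(raw, window=3)
print(smoothed)
```

The smoothed list is shorter than the raw one because each value summarizes three neighboring periods. A wider window gives a flatter line but blurs the moment of change.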
Observing Other Groups
Nonequivalent Control Group Designs
• This is another way of increasing the
interpretability of an evaluation.

– Why non-equivalent? Because we did not
use random assignment to place subjects
in treatment groups, so we cannot assume
that on the average the groups are the
same, or equivalent to begin with.
Continued.

So what does it involve?
– It increases the number of groups observed
– Here we have experimental and control groups
that are designated before the treatment occurs
and are not randomly assigned.
– If the pretest-posttest design could be duplicated
with another group that did not receive the
program, a potentially strong research design
would result.
– As long as the groups are comparable, this
design addresses nearly all the threats to
internal validity.
What is Expected?
• A larger improvement between the
pretest and posttest for the program
group than for the comparison group.
• Ex:
[Figure: dependent variable for the program group and the comparison group, before and after the program (x-axis: time of observation)]
How To Analyze this Data

• Use a two groups by two time periods
analysis of variance, with repeated
measurements over time periods (Mixed
Design)
• Remember, Analysis of Variance tells us if the
group means differ….
• If the program is successful and the group
means follow the picture you just saw, the
analysis of variance would reveal a significant
interaction between group and testing period.
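With just two groups and two testing periods, the group-by-period interaction comes down to whether the program group gained more than the comparison group. In that 2 x 2 case, a t test on the gain scores gives the same answer as the interaction test (the interaction F equals t squared). Here is a minimal sketch with made-up scores; the helper names `gains` and `pooled_t` are assumptions for illustration, and no p-value is computed.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical (pretest, posttest) scores; all numbers are made up.
program = [(50, 62), (48, 59), (52, 66), (47, 58), (53, 65)]
comparison = [(51, 54), (49, 51), (50, 55), (48, 50), (52, 55)]

def gains(pairs):
    """Posttest minus pretest for each participant."""
    return [post - pre for pre, post in pairs]

def pooled_t(a, b):
    """Independent-samples t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))

program_gains = gains(program)
comparison_gains = gains(comparison)

# The interaction contrast: how much more the program group improved.
interaction = mean(program_gains) - mean(comparison_gains)
t_stat = pooled_t(program_gains, comparison_gains)
print(f"difference in mean gains = {interaction:.1f}, t = {t_stat:.2f}")
```

A full mixed-design ANOVA would normally be run in a statistics package. This sketch only shows what the interaction term is measuring.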
Continued

• Before beginning a statistical analysis it is
always wise to inspect the data carefully.
• Ideally we would want to find that the
standard deviations associated with the
means in the example are similar
• In this example the means before the
program were nearly equal. If you have a
case where the pretest means are quite
different, analyses of nonequivalent control
group designs may be misleading.
Positive Features of This Design

Including the comparison group permits the
isolation of the threats to internal validity
– Because both groups were tested at the same
time, they had the same amount of time to mature.
– Historical forces have presumably affected the
groups equally
– Because both groups are tested twice, testing
effects should be equivalent
– Finally, the rates of participant loss between
pretest and posttest can be examined to be sure
that they are similar
Useful?
• Nonequivalent control groups are
especially useful when part of an
organization is exposed to the program
while other parts are not.
• Since selection to the program is not in
the hands of the participants, and since
the participants’ level of need does not
determine eligibility, the comparability of
the groups is quite good.

Why Comparison Groups?

• A no-treatment comparison group is chosen
when one seeks to learn if there is an effect
of a program
• This would not be appropriate if a comparison
is needed on different ways to offer a service;
here you would use the comparison of
different programs
• If there is a suspicion that attention alone
could affect an outcome, then the comparison
group would be a placebo group (a group that
experiences a program not expected to affect
the outcome variable).
Problems in Selecting Comparison
Groups

Major weakness: finding a comparison
group sufficiently similar to the
treatment group to permit drawing valid
interpretations.
– Ex. Parents who seek out special
programs for their children may also be
devoting more attention to their children at
home than are parents who do not seek
out special programs.
Matching Gone Wrong
• While matching is often used to select
comparison groups (on income level,
test score level, rated adjustment,
locality of residence, etc.) there are
situations where it can go wrong.
• Example pg. 187 (2nd Paragraph)

The Moral Is Clear

The nonequivalent control group design
is especially sensitive to regression
effects when groups are systematically
different on some dimensions.
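Regression effects can be seen in a small simulation, sketched here under stated assumptions: every score is a stable true score plus random measurement error, and no program is given to anyone. The score scale, the error size, the 10% selection rule, and the seed are all made up for illustration.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Each simulated person has a stable true score plus independent
# measurement error on each testing occasion. No program is applied.
true_scores = [rng.gauss(100, 10) for _ in range(10_000)]
test1 = [t + rng.gauss(0, 10) for t in true_scores]
test2 = [t + rng.gauss(0, 10) for t in true_scores]

# Select the lowest-scoring 10% on the first test, as a program
# aimed at the neediest might.
cutoff = sorted(test1)[len(test1) // 10]
selected = [i for i, score in enumerate(test1) if score < cutoff]

first_mean = mean(test1[i] for i in selected)
retest_mean = mean(test2[i] for i in selected)
print(f"selected group: first test {first_mean:.1f}, retest {retest_mean:.1f}")
```

The selected group scores higher on the retest even though nothing was done for them. If a comparison group is drawn from a different part of the distribution, this apparent gain can be mistaken for a program effect.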
Regression Is Not The Only
Weakness.

Other reasons why groups may differ
from each other:
– Most examples come down to a lack
of consistency (a teacher using different
methods for different classes, or one
physician promoting a brochure while
others do not, etc.)
Regression-Discontinuity Design
• Seen as a useful method for
determining whether a program or
treatment is effective.
• This actually refers to a set of design
variations
• In its simplest, most traditional form, this
is a pretest-posttest program-comparison group strategy.

How Is It Different?
• The unique characteristic of this design,
which sets it apart from other pretest-posttest
group designs, is the method by
which research participants are
assigned to conditions.
• Participants here are assigned to
program or comparison groups solely on
the basis of a cutoff score on a pre-program measure.

Major Advantage…

This cutoff criterion implies the major
advantage of these designs:
– They are appropriate when we wish to
target a program or treatment to those who
most need or deserve it.
– So unlike the other quasi-experimental
alternatives, this design does not require
us to assign potentially needy individuals to
a no-program comparison group in order to
evaluate the effectiveness of a program.
Regression-Discontinuity Design

• Used when eligibility for a service is based on
a continuous variable (income, achievement,
level of disability, etc.)
• Example: You have 300 students who are
being tested for reading achievement. Those
scoring the lowest are defined as those
needing the most assistance. If the program
has facilities for 100 students, it seems
reasonable and fair to take the 100 with the
lowest scores into the program.
• If all 300 are retested at the end of the school
year, what would be expected?
• We would not expect the 100 to
outperform the 200 regular class
students.
• If the program was effective, we would
expect that the treated children would
have gained more than they would have
had they stayed in their regular
classrooms.
• The regression discontinuity design
enables the evaluator to measure such
effects.
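One way to picture the analysis, as a rough sketch: fit a straight line relating pretest to posttest on each side of the cutoff, then look at the gap between the two lines at the cutoff score. The data below are made up (40 students rather than 300), and `CUTOFF`, `TRUE_EFFECT`, and `fit_line` are hypothetical names chosen for illustration. A real analysis would also check whether a straight line is a reasonable fit.

```python
def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

CUTOFF = 60      # hypothetical pretest cutoff; scores below it get the program
TRUE_EFFECT = 8  # effect built into the made-up data

pretest = list(range(40, 80))
posttest = [
    x + 5 + (TRUE_EFFECT if x < CUTOFF else 0) + (x % 3 - 1)  # small wobble
    for x in pretest
]

program = [(x, y) for x, y in zip(pretest, posttest) if x < CUTOFF]
regular = [(x, y) for x, y in zip(pretest, posttest) if x >= CUTOFF]

a_prog, b_prog = fit_line(*zip(*program))
a_reg, b_reg = fit_line(*zip(*regular))

# The discontinuity: gap between the two fitted lines at the cutoff.
effect = (a_prog + b_prog * CUTOFF) - (a_reg + b_reg * CUTOFF)
print(f"estimated jump at cutoff: {effect:.2f}")
```

Note that the program students still score lower on average than the regular class students. The evidence for an effect is the jump in the line at the cutoff, not which group has the higher mean.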

Observing Other Dependent
Variables

• Used to increase the validity of interpretations
• These variables are not expected to be
changed by the program, or, at most,
changed only marginally.
• AKA control construct design.
– The added dependent measures must be similar
to the major dependent variable WITHOUT being
strongly influenced by the program.
Back to the Example…(We’ll put it
into perspective for you!)

• If the children do read better after a special
program, it might be expected that they would
do better on a social sciences test than they
did before the reading program
• Consequently, a social sciences test would
NOT be a good control construct measure for
a reading program
• However, a test of math skills requiring only
minimal reading levels might be an excellent
additional dependent variable.
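A minimal sketch of the comparison a control construct design sets up, assuming made-up pre and post scores for the same five children on a reading test and on a math test that needs little reading. The variable names and the helper `mean_change` are illustrative assumptions.

```python
from statistics import mean

# Hypothetical pre/post scores for the same children on two measures.
reading = {"pre": [40, 42, 38, 45, 41], "post": [50, 53, 47, 56, 52]}
math_low_reading = {"pre": [60, 58, 62, 59, 61], "post": [61, 58, 63, 60, 61]}

def mean_change(scores):
    """Average posttest-minus-pretest change on one measure."""
    return mean(post - pre for pre, post in zip(scores["pre"], scores["post"]))

reading_change = mean_change(reading)
control_change = mean_change(math_low_reading)
print(f"reading change: {reading_change}, control measure change: {control_change}")
```

A large change on the target measure alongside little change on the control measure makes explanations like maturation or general test practice less plausible, since those would be expected to move both measures.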
Activity

• In your group you were given a type of quasi-experiment.
• What you need to do with it:
– List one benefit and one drawback of this type
– Give a program example that your quasi-experiment would be best suited to evaluate
– At what point within the program would it be most
helpful?
Combining Designs to Increase
Internal Validity

1) Time-Series and Nonequivalent Control
Groups
– If a group similar to the program participants can
be observed, the simple interrupted time-series
design is strengthened considerably.
– A key to drawing valid interpretations from
observations lies in being able to repeat the
observations
– If a finding can be replicated, one can be more
sure of conclusions than if conditions make
replication impossible.
Interrupted Time Series with
Switching Replications

This is a refinement in which there are
two groups, each serving as either the
treatment or comparison group on an
alternating basis, through multiple
replications of treatment and removal
Continued
• This requires an even higher level of
control over subjects by the researcher
but is a particularly strong design in
ruling out threats to validity.
• Not useful in studies where the
treatment intervention has been
gradual, or when the treatment effect does
not decay well.

How About An Example???


Let’s say you’re seeing if a new alcoholism treatment
program works.
Steps of Interrupted Time Series with Switching
Replications:
– Form two groups of patients, the experimental group and the
comparison group.
– Pre-test both groups with an instrument that would provide a
baseline for the groups, such as the Alcohol Dependence
Scale.
– Apply the treatment to the experimental group and withhold it
from the comparison group.
– Measure the experimental group many times (e.g., every two
weeks) to see if it responded to your treatment. If it did, you
would apply the same treatment to the comparison group,
measure it many times, and see if you got the same results.
If you did, you can safely assume that this new program is
promising.
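The logic of these steps can be sketched with made-up biweekly scores, where lower scores are better. The two series, the treatment start periods, and the helper `change_at` are assumptions for illustration only and do not come from a real study or from the Alcohol Dependence Scale.

```python
from statistics import mean

# Hypothetical biweekly scores (lower is better); all numbers are made up.
# Group 1 starts treatment after period 3; group 2 starts after period 6.
group1 = [30, 31, 29, 22, 21, 20, 20, 19, 20]
group2 = [31, 30, 30, 31, 29, 30, 22, 21, 20]

def change_at(series, start):
    """Mean of the periods from `start` onward minus the mean of those before."""
    return mean(series[start:]) - mean(series[:start])

# Effect in the first group when it receives the treatment.
first_effect = change_at(group1, 3)
# Over the same periods, group 2 is still untreated and acts as the comparison.
comparison_drift = change_at(group2[:6], 3)
# Replication: group 2 receives the treatment later.
replicated_effect = change_at(group2, 6)

print(f"group 1 change: {first_effect:.2f}")
print(f"group 2 change while untreated: {comparison_drift:.2f}")
print(f"group 2 change once treated: {replicated_effect:.2f}")
```

A drop in each group only after its own treatment begins, with little drift in the untreated group meanwhile, is the pattern that makes history and maturation less likely explanations.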
Still Combining Designs…

2) Selective Control Design
– By understanding the context of a program,
evaluators may be able to identify the
threats to internal validity that are most
likely to affect an evaluation
– Evaluators may then choose comparison
groups to control for specific threats to
internal validity so that the most plausible
alternative interpretations are eliminated.
– When the appropriate non-equivalent
control groups are available, the selective
control design can be a powerful approach.
Feminist/Critical Research Paradigm

• Ontology (Nature of reality): The
apprehended world makes a material
difference in terms of race, gender and class
• Epistemology (Viewpoint/perception of
knowledge base): Knowledge as subjective
and political; researchers’ values frame
inquiry
• Methodology (How knowledge is gained):
Transformative inquiry; changing the
questions
• Products (Forms of knowledge produced):
Value-mediated critiques that challenge
existing power structures and promote
resistance.
How Does This Apply To Quasi-Experiments???

• Quasi Experiments- Contextual Analyses
• True Experiments- Criticized by feminist
researchers
– No one ‘true’ reality
– All research is biased based upon the
perception/experience of the researcher
– Feminist research: contains a background about
the researcher(s)
Continued.
• Criticism of traditional methods
• So…feminist stance about quasi-experiments
would be one that is
favorable because there is no ‘random
variation’

Who uses Quasi Experiments in Our
Community? (A few examples)
• Well-come Centre for Human Potential
• SACC
• Hiatus House

THANK YOU
Time For The Test!!!