The Nonequivalent Groups Design

Quasi-Experiments
The Basic Nonequivalent Groups Design (NEGD)
N  O  X  O
N  O     O
Key Feature: Nonequivalent assignment
What Does Nonequivalent Mean?




Assignment is nonrandom.
Researcher didn’t control assignment.
Groups may be different.
Group differences may affect outcomes.
Equivalence


“Equivalent” groups are not necessarily
identical on any pre-test measure.
Merely implies that if the random
assignment procedure was repeated,
the groups would tend toward
equivalence.
Non-Equivalence



Non-equivalent groups do not necessarily
differ on any pre-test measure.
Merely implies that if the same nonrandom assignment procedure was
repeated, the groups would tend toward
non-equivalence.
If assignment to groups was based partly
on income, then groups would tend to
have different expected mean levels of
income – but any two groups you picked
might well be similar in income levels.
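The income example can be sketched as a small simulation. This is a minimal illustration, not part of the original slides: the income distribution, the assignment rule in `income_rule`, and the sample sizes are all hypothetical choices made for the sketch.

```python
import random
import statistics

def random_rule(income, rng):
    """Random assignment: income plays no role."""
    return rng.random() < 0.5

def income_rule(income, rng):
    """Hypothetical nonrandom rule: higher income raises the chance of treatment."""
    p = 0.5 + 0.01 * (income - 50)
    return rng.random() < min(max(p, 0.05), 0.95)

def income_gaps(rule, n_units=200, n_reps=500, seed=0):
    """Repeat the assignment procedure; return program-minus-comparison income gaps."""
    rng = random.Random(seed)
    gaps = []
    for _rep in range(n_reps):
        program, comparison = [], []
        for _unit in range(n_units):
            income = rng.gauss(50, 10)  # assumed income distribution
            (program if rule(income, rng) else comparison).append(income)
        if program and comparison:  # guard against an empty group
            gaps.append(statistics.mean(program) - statistics.mean(comparison))
    return gaps

random_gaps = income_gaps(random_rule)
income_based_gaps = income_gaps(income_rule)

print("random rule, mean gap over repetitions:", round(statistics.mean(random_gaps), 2))
print("income rule, mean gap over repetitions:", round(statistics.mean(income_based_gaps), 2))
print("income rule, smallest single gap:", round(min(income_based_gaps), 2))
```

Under these assumptions the random rule tends toward a zero gap across repetitions, while the income-based rule tends toward a positive gap, even though some individual repetitions of the income-based rule produce groups with similar income levels.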
The Point


Equivalence or non-equivalence is
defined by the selection procedure.
Even if the difference in pre-test
means across groups is “small,”
this does not imply that the groups
are equivalent.
– Small differences can introduce big threats.
Quasi- vs. Natural vs. Experiment

In a true experiment, the researcher
performs the random assignment
– Can be in a lab or the field


In a natural experiment, someone else
assigns through a “random” process.
In a quasi-experiment, assignment is not
random, introducing selection threats.
– Much stronger if the selection is not done by
the cases themselves (exogenous sorting).
What is a Natural Experiment?

Strict Definition:
– Some truly natural process, such as rainfall
or weather patterns, assigns IV.

Definition we all use in our own work:
– Some exogenous process, rather than our
cases, ourselves, or a causal process
relevant to our theory, assigns IV.
Genres of Natural Experiments
The natural border or natural disaster
– Jared Diamond’s islands
– Dan Posner’s rivers
– Caroline Hoxby’s streams
– Settler mortality (Acemoglu, Johnson, and Robinson)
– Hurricane Katrina
– Strength is that nature doesn’t care about your cases or IV
The Rule Change
– House seniority system (Crooks and Hibbing)
– GAVEL amendment in Colorado
– Connecticut speeding law
– New Zealand electoral reform
– Propositions
– Relatively easy to spot, hard to defend
Genres of Natural Experiments
The Court Decision
– Roe v. Wade for Levitt and Donohue
– Iowa item veto decision
– Strength is that court is not a blatant political actor responding to societal shifts or societal pressures
The Lottery
– James Fowler’s use of Canadian bill introduction privilege
– US House Clerk conducts a randomization of the order in which members choose office
– Strength is true randomness in first step, but human action in 2nd
Genres of Natural Experiments
Staged Implementation
– Two-step reapportionment revolution in the United States
– Lots of program evaluations in development
– Helps to rule out history and maturation threats
The Threshold
– Mail ballot assignment in precincts with <250 voters
– Need to make the threshold unrelated to DV, or else use Trochim-style regression discontinuity
What Makes a Convincing Natural
Experiment?




You can show that the process of selection
was not related to characteristics of the cases
that are relevant to your DV
In a cross-sectional experiment, demonstrate
that the two groups are quite similar
In a time-series experiment, demonstrate that
little else changed when the treatment took
place.
In a word, show equivalence
Any purported causal test needs to take
into consideration all of the two-group
threats to validity.
R  X  O
R     O
Can be a valid causal test.
N  X  O
N     O
Fully exposed to threats.
NEGD Design has Multiple Groups AND Multiple Measures
N  O  X  O
N  O     O
This helps rule out (or at least recognize) threats.
Pre-Tests v. Covariates
Pre-Post-Test Design:
N  O  X  O
N  O     O
Observations are tests you administer.
Proxy Pre-Test Design:
N  O1  X  O2
N  O1     O2
First observations are covariates on which you collect data.
Problems of Internal Validity in NEGDs
Internal Validity
N  O  X  O
N  O     O
All designs suffer from threats to validity.
In addition to all the single group threats,
quasi-experiments are particularly likely to suffer
from multi-group threats.
Selection-history
Selection-maturation
Selection-testing
Selection-instrumentation
Selection-regression
Selection-mortality
The Bivariate Distribution
[Figure: scatterplot of Posttest (vertical axis, 30 to 90) against Pretest (horizontal axis, 30 to 80).]
The Bivariate Distribution
[Figure: scatterplot of Posttest (vertical axis, 30 to 90) against Pretest (horizontal axis, 30 to 80), annotated: Program group has a 5-point pretest advantage.]
The Bivariate Distribution
[Figure: scatterplot of Posttest (vertical axis, 30 to 90) against Pretest (horizontal axis, 30 to 80), annotated: Program group has a 5-point pretest advantage; Program group scores 15 points higher on Posttest.]
Graph of Means
[Figure: Pretest and Posttest means for the Comparison and Program groups; vertical axis 30 to 80.]
        pretest   posttest   pretest    posttest
        MEAN      MEAN       STD DEV    STD DEV
Comp    49.991    50.008     6.985      7.549
Prog    54.513    64.121     7.037      7.381
ALL     52.252    57.064     7.360      10.272
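The group means reported on this slide can be turned into a quick descriptive check. The sketch below only does arithmetic on the four means; the variable names are illustrative, and the difference in gains is a simple descriptive comparison, not the ANCOVA estimate discussed later in the deck.

```python
# Group means as reported on the Graph of Means slide
means = {
    "Comp": {"pretest": 49.991, "posttest": 50.008},
    "Prog": {"pretest": 54.513, "posttest": 64.121},
}

# Gap between groups at each measurement occasion
pretest_gap = means["Prog"]["pretest"] - means["Comp"]["pretest"]
posttest_gap = means["Prog"]["posttest"] - means["Comp"]["posttest"]

# Pre-to-post gain within each group
gain_prog = means["Prog"]["posttest"] - means["Prog"]["pretest"]
gain_comp = means["Comp"]["posttest"] - means["Comp"]["pretest"]

# Difference in gains (equivalently, posttest gap minus pretest gap)
diff_in_gains = gain_prog - gain_comp

print("pretest gap:", round(pretest_gap, 3))
print("posttest gap:", round(posttest_gap, 3))
print("program gain:", round(gain_prog, 3))
print("comparison gain:", round(gain_comp, 3))
print("difference in gains:", round(diff_in_gains, 3))
```

The pretest gap of about 4.5 points and the posttest gap of about 14.1 points correspond to the rounded "5-point" and "15-point" annotations on the scatterplot slides.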
Possible Outcome #1
[Figure: Pretest and Posttest means for the Comparison and Program groups; vertical axis 40 to 70.]
Selection-history            Possible: local event
Selection-maturation         Possible: PG initially higher
Selection-testing            Unlikely: no change in CG
Selection-instrumentation    Possible: scale effects
Selection-regression         Unlikely: expect change in CG
Selection-mortality          Possible: PG loses low scorers
Possible Outcome #2
[Figure: Pretest and Posttest means for the Comparison and Program groups; vertical axis 40 to 70.]
Selection-history            Likely: PG initially higher
Selection-maturation         Likely: PG initially higher
Selection-testing            Possible
Selection-instrumentation    Possible
Selection-regression         Unlikely: expect change in CG
Selection-mortality          Possible: both lose low scorers
Possible Outcome #3
[Figure: Pretest and Posttest means for the Comparison and Program groups; vertical axis 40 to 70.]
Selection-history            Possible: local event
Selection-maturation         Unlikely: no change in CG
Selection-testing            Unlikely: no change in CG
Selection-instrumentation    Possible: scale effects
Selection-regression         Likely
Selection-mortality          Possible: PG loses high scorers
Possible Outcome #4
[Figure: Pretest and Posttest means for the Comparison and Program groups; vertical axis 40 to 70.]
Selection-history            Possible: local event
Selection-maturation         Unlikely: no change in CG
Selection-testing            Unlikely: no change in CG
Selection-instrumentation    Possible: scale effects
Selection-regression         Very Likely
Selection-mortality          Possible: PG loses low scorers
Possible Outcome #5
[Figure: Pretest and Posttest means for the Comparison and Program groups; vertical axis 40 to 70.]
“And you should be so lucky…”
Selection-history
Selection-maturation
Selection-testing
Selection-instrumentation
Selection-regression
Selection-mortality
Analysis Requirements
N  O  X  O
N  O     O
Pre-post (or covariates)
Two-group
Treatment-control (dummy = 0, 1)
Analysis of Covariance (ANCOVA)
yi = 0 + 1Xi + 2Zi + ei
where:
outcome score for the ith unit
coefficient for the intercept
pretest coefficient
mean difference for treatment
covariate
dummy variable for treatment(0 = control, 1=
treatment)
ei = residual for the ith unit
yi
0
1
2
Xi
Zi
=
=
=
=
=
=
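The ANCOVA model above can be fit by ordinary least squares. The following is a minimal sketch on simulated data, not the analysis behind the slides: the helper `ols`, the sample size, the noise level, and the true effect of 10 points are all assumptions chosen so that the program group has a 5-point pretest advantage and scores about 15 points higher on the posttest.

```python
import random

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X)b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(row[i] * row[j] for row in rows) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for other in range(k):
            if other != col:
                factor = a[other][col] / a[col][col]
                a[other] = [v - factor * p for v, p in zip(a[other], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

TRUE_EFFECT = 10.0  # assumed treatment effect used to generate the data
rng = random.Random(1)
rows, y = [], []
for z in (0, 1):  # Z: 0 = control, 1 = treatment
    for _ in range(500):
        pre = rng.gauss(50 + 5 * z, 7)  # program group starts 5 points higher
        post = pre + TRUE_EFFECT * z + rng.gauss(0, 3)
        rows.append([1.0, pre, float(z)])  # columns: intercept, X (pretest), Z (dummy)
        y.append(post)

b0, b1, b2 = ols(rows, y)
print("intercept b0:", round(b0, 2))
print("pretest slope b1:", round(b1, 2))
print("treatment effect b2:", round(b2, 2))
```

In this sketch the coefficient on the dummy recovers roughly the 10-point effect rather than the raw 15-point posttest gap, because the pretest term adjusts for the initial 5-point difference. The simulated pretest here has no measurement error, which is a simplification.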
The Bivariate Distribution
[Figure: scatterplot of posttest (vertical axis, 30 to 90) against pretest (horizontal axis, 30 to 80), annotated: Program group has a 5-point pretest advantage; Program group scores 15 points higher on Posttest.]
The Bivariate Distribution
[Figure: scatterplot of posttest (vertical axis, 30 to 90) against pretest (horizontal axis, 30 to 80) with a fitted line for each group, annotated: Slope is B1; Vertical Distance is Mean Treatment Effect, or B2.]
Why Add Covariates to Analysis?




ANCOVA can include more than one
pretest or “control” variable.
Additional pretests further adjust for
initial group differences.
Ideally, in the absence of any treatment
effect, the covariates would perfectly
predict the posttest.
Additional covariates will often improve
the accuracy of the estimate of the
treatment effect.
Irrelevant Covariates



Adding pretests that are completely
unrelated to the posttest, however,
actually decreases precision.
“Irrelevant covariates” contribute
nothing to the analysis, but subtract a
degree of freedom from the error term.
This reduces the efficiency of the
estimate.
Omitted Covariates



Covariates that are related to the posttest but
not to the treatment can be ignored without
biasing the estimate of the treatment effect.
Covariates that are related to the posttest and
the treatment but that are omitted will bias the
estimate of the treatment effect.
We can safely omit control variables even if
they are highly correlated with the posttest as
long as they do not correlate with the
treatment.
Omitted Variables Bias


Omitted (relevant) covariates that are
positively correlated with the treatment
will lead us to overestimate the
treatment effect (assuming the covariate
is positively related to the posttest).
Omitted (relevant) covariates that are
negatively correlated with the treatment
will lead us to underestimate the
treatment effect (under the same assumption).
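The direction of omitted variables bias can be checked with a small simulation. This is a sketch under stated assumptions, not the author's analysis: the covariate `w` is hypothetical, it is positively related to the posttest (coefficient 5), and `shift` controls whether it is positively or negatively correlated with the treatment. The helper `ols` and all numeric settings are illustrative.

```python
import random

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X)b = X'y."""
    k = len(rows[0])
    a = [[sum(row[i] * row[j] for row in rows) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(rows, y))]
         for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for other in range(k):
            if other != col:
                factor = a[other][col] / a[col][col]
                a[other] = [v - factor * p for v, p in zip(a[other], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

TRUE_EFFECT = 10.0  # assumed treatment effect

def treatment_estimates(shift, n_per_group=2000, seed=2):
    """Return (estimate with w included, estimate with w omitted)."""
    rng = random.Random(seed)
    rows, y = [], []
    for z in (0, 1):
        for _ in range(n_per_group):
            w = rng.gauss(shift * z, 1)  # covariate mean differs by group
            post = 50 + TRUE_EFFECT * z + 5 * w + rng.gauss(0, 3)
            rows.append([1.0, float(z), w])
            y.append(post)
    full = ols(rows, y)[1]                      # coefficient on z, w included
    short = ols([r[:2] for r in rows], y)[1]    # coefficient on z, w omitted
    return full, short

full_pos, short_pos = treatment_estimates(shift=+1)
full_neg, short_neg = treatment_estimates(shift=-1)
print("w positively correlated with treatment:", round(full_pos, 2), round(short_pos, 2))
print("w negatively correlated with treatment:", round(full_neg, 2), round(short_neg, 2))
```

Under these assumptions, omitting `w` pushes the estimate above the true effect when `w` is positively correlated with the treatment and below it when the correlation is negative, while including `w` recovers roughly the true effect in both cases.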
Bottom Line



We should always try to include omitted
relevant covariates, except
when the omitted covariate is itself a
consequence of the treatment.
If we cannot include a relevant covariate,
we can at least predict the direction, if
not the magnitude, of the likely bias.
But…What about measurement error?



With multiple covariates, measurement
error does not always lead to a pseudo-effect.
As measurement error in any single
variable increases, it becomes “as if”
the variable is not included in the
ANCOVA.
This then mimics an omitted variables
problem, and the direction of bias
depends upon the relationship between
the “noisy” covariate and the treatment.
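The "as if omitted" behavior can be sketched for the simplest case of a single noisy pretest. Everything here is an illustrative assumption rather than material from the slides: the helper `ols`, the true effect of 10, the 5-point pretest advantage for the program group, and the three noise levels.

```python
import random

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X)b = X'y."""
    k = len(rows[0])
    a = [[sum(row[i] * row[j] for row in rows) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(rows, y))]
         for i in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for other in range(k):
            if other != col:
                factor = a[other][col] / a[col][col]
                a[other] = [v - factor * p for v, p in zip(a[other], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

TRUE_EFFECT = 10.0  # assumed treatment effect

def treatment_estimate(noise_sd, n_per_group=2000, seed=3):
    """ANCOVA treatment coefficient when the pretest is observed with noise."""
    rng = random.Random(seed)
    rows, y = [], []
    for z in (0, 1):
        for _ in range(n_per_group):
            true_pre = rng.gauss(50 + 5 * z, 7)
            post = true_pre + TRUE_EFFECT * z + rng.gauss(0, 3)
            observed_pre = true_pre + rng.gauss(0, noise_sd)  # measurement error
            rows.append([1.0, observed_pre, float(z)])
            y.append(post)
    return ols(rows, y)[2]

est_clean = treatment_estimate(noise_sd=0)
est_noisy = treatment_estimate(noise_sd=7)
est_very_noisy = treatment_estimate(noise_sd=70)
print("no measurement error:", round(est_clean, 2))
print("moderate measurement error:", round(est_noisy, 2))
print("heavy measurement error:", round(est_very_noisy, 2))
```

In this single-covariate sketch the pretest is positively related to both the treatment and the posttest, so as its measurement error grows the estimate drifts upward from about 10 toward the raw 15-point gap, which is what omitting the pretest entirely would give.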
Other Quasi-Experimental Designs
Separate Pre-Post Samples
N1  O
N1     X  O
N2  O
N2        O
Groups with the same subscript come from the same
context.
Here, N1 might be people who were in the program at
Agency 1 last year, with those in N2 at Agency 2 last year.
This is like having a proxy pretest on a different group.
Separate Pre-Post Samples
N  R1  O
   R1     X  O
N  R2  O
   R2        O
Take random samples at two times of people at two
nonequivalent agencies.
Useful when you routinely measure with surveys.
You can assume that the pre and post samples are
equivalent, but the two agencies may not be.
Double-Pretest Design
N  O  O  X  O
N  O  O     O
Strong in internal validity
Helps address selection-maturation
Switching Replications
N  O  X  O     O
N  O     O  X  O
Strong design for both internal and
external validity
Strong against social threats to internal
validity
Strong ethically
Nonequivalent Dependent Variables Design (NEDV)
N  O1  X  O1
   O2     O2
The variables have to be similar enough that
they are affected the same way by all threats.
The program has to target one variable and
not the other.
In simple form, weak internal validity.
NEDV Example
[Figure: Pre and Post scores for Algebra and Geometry; vertical axis 40 to 80.]
Only works if we can assume that geometry scores
show what would have happened to algebra if
untreated.
The variable is the control.
Note that there is no control group here.
NEDV Pattern Matching




Have many outcome variables.
Have theory that tells how affected
(from most to least) each variable will
be by the program.
Match observed gains with predicted
ones.
With pattern, NEDV can be extremely
powerful.
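The matching step can be sketched as a correlation between predicted and observed gains. The numbers below are hypothetical, invented only to show the computation; they are not the data behind any figure in this deck, and `pearson_r` is an illustrative helper.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical outcome variables: (expected gain from theory, observed gain)
gains = {
    "Algebra": (10, 9.5),
    "Geometry": (9, 9.2),
    "Arithmetic": (8, 7.8),
    "Reasoning": (7, 7.1),
    "Analogies": (6, 5.5),
    "Grammar": (5, 5.2),
    "Punctuation": (4, 4.1),
    "Spelling": (3, 2.6),
    "Comprehension": (2, 2.3),
    "Creativity": (1, 0.8),
}

expected = [exp for exp, _ in gains.values()]
observed = [obs for _, obs in gains.values()]
r = pearson_r(expected, observed)
print("pattern-match correlation:", round(r, 3))
```

A correlation near 1 means the observed ordering of gains follows the theoretical ordering closely; a treatment that affected all variables equally, or in a different order, would not produce that pattern.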
NEDV Pattern Matching
[Figure: a “ladder” graph matching expected (Exp) to observed (Obs) values, on a 0 to 80 scale, for Algebra, Geometry, Arithmetic, Reasoning, Analogies, Grammar, Punctuation, Spelling, Comprehension, and Creativity; r = .997.]
NEDV: Lake and O’Mahony 2006
A Simple Pattern-Matching Design
[Figure: Issues that Generated Interstate Wars (Percent), vertical axis 0 to 60, by Period (1815-1914, 1918-1941, 1945-1989), with lines for Territory-Related, Foreign Interests, Economic Interests, and Realpolitik.]
Hypothesis: As territory declines in value in the 20th century (measured by average state size), wars fought over territory should decline in frequency. There should be no pattern in other issues.