Quasi-Experiments

The Basic Nonequivalent Groups Design (NEGD)
    N   O   X   O
    N   O       O
Key Feature: Nonequivalent assignment

What Does Nonequivalent Mean?
Assignment is nonrandom.
Researcher didn’t control assignment.
Groups may be different.
Group differences may affect outcomes.

Equivalence
“Equivalent” groups are not necessarily identical on any pre-test measure.
Equivalence merely implies that if the random assignment procedure were repeated, the groups would tend toward equivalence.

Non-Equivalence
Non-equivalent groups do not necessarily differ on any pre-test measure.
Non-equivalence merely implies that if the same nonrandom assignment procedure were repeated, the groups would tend toward non-equivalence.
If assignment to groups was based partly on income, then the groups would tend to have different expected mean levels of income, but any two groups you picked might well be similar in income levels.

The Point
Equivalence or non-equivalence is defined by the selection procedure.
Even if the difference in pre-test means across groups is “small,” this does not imply that the groups are equivalent.
    – Small differences can introduce big threats.

Quasi- vs. Natural vs. Experiment
In a true experiment, the researcher performs the random assignment.
    – Can be in a lab or the field
In a natural experiment, someone else assigns through a “random” process.
In a quasi-experiment, assignment is not random, introducing selection threats.
    – Much stronger if the selection is not done by the cases themselves (exogenous sorting).

What is a Natural Experiment?
Strict Definition:
    – Some truly natural process, such as rainfall or weather patterns, assigns the IV.
Definition we all use in our own work:
    – Some exogenous process, rather than our cases, ourselves, or a causal process relevant to our theory, assigns the IV.
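The earlier point that equivalence is a property of the assignment procedure, not of any one pair of groups, can be illustrated with a small simulation. This is only a sketch under stated assumptions: the income distribution (mean 50, SD 10), the group size of 40, and the 70/30 income-based rule are all hypothetical values chosen for illustration, not taken from the slides.

```python
import random
import statistics


def income_gaps(assign, n_units=40, n_reps=2000, seed=1):
    """Repeat an assignment procedure many times.

    assign(income, rng) returns True when a unit joins the program group.
    Returns one pre-test income gap (program mean - comparison mean)
    per repetition.
    """
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_reps):
        # Hypothetical incomes: normal with mean 50 and SD 10.
        incomes = [rng.gauss(50, 10) for _ in range(n_units)]
        flags = [assign(x, rng) for x in incomes]
        program = [x for x, f in zip(incomes, flags) if f]
        comparison = [x for x, f in zip(incomes, flags) if not f]
        if program and comparison:  # skip the rare draw with an empty group
            gaps.append(statistics.mean(program) - statistics.mean(comparison))
    return gaps


def random_rule(income, rng):
    # Random assignment: a coin flip that ignores income.
    return rng.random() < 0.5


def income_rule(income, rng):
    # Hypothetical nonrandom rule: above-average income raises the
    # chance of ending up in the program group.
    return rng.random() < (0.7 if income > 50 else 0.3)


random_gaps = income_gaps(random_rule)
nonrandom_gaps = income_gaps(income_rule)

expected_random_gap = statistics.mean(random_gaps)
expected_nonrandom_gap = statistics.mean(nonrandom_gaps)
# Share of nonrandom draws whose groups still look similar (gap under 2 points).
share_similar = sum(abs(g) < 2 for g in nonrandom_gaps) / len(nonrandom_gaps)

print(f"mean gap, random rule:  {expected_random_gap:6.2f}")
print(f"mean gap, income rule:  {expected_nonrandom_gap:6.2f}")
print(f"income-rule draws with |gap| < 2: {share_similar:.1%}")
```

Under these assumptions the random rule should average out to a gap near zero, while the income rule produces a systematic gap across repetitions even though some individual draws yield two groups with similar incomes.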
Genres of Natural Experiments

The natural border or natural disaster
    – Jared Diamond’s islands
    – Dan Posner’s rivers
    – Caroline Hoxby’s streams
    – Settler mortality (Acemoglu, Johnson, and Robinson)
    – Hurricane Katrina
    – Strength is that nature doesn’t care about your cases or IV

The Rule Change
    – House seniority system (Crook and Hibbing)
    – GAVEL amendment in Colorado
    – Connecticut speeding law
    – New Zealand electoral reform
    – Propositions
    – Relatively easy to spot, hard to defend

Genres of Natural Experiments

The Court Decision
    – Roe v. Wade for Levitt and Donohue
    – Iowa item veto decision
    – Strength is that the court is not a blatant political actor responding to societal shifts or societal pressures

The Lottery
    – James Fowler’s use of Canadian bill introduction privilege
    – US House Clerk conducts a randomization of the order in which members choose offices
    – Strength is true randomness in the first step, but human action in the second

Genres of Natural Experiments

Staged Implementation
    – Two-step reapportionment revolution in the United States
    – Lots of program evaluations in development
    – Helps to rule out history and maturation threats

The Threshold
    – Mail ballot assignment in precincts with <250 voters
    – Need to make the threshold unrelated to the DV, or else use Trochim-style regression discontinuity

What Makes a Convincing Natural Experiment?
You can show that the process of selection was not related to characteristics of the cases that are relevant to your DV.
In a cross-sectional experiment, demonstrate that the two groups are quite similar.
In a time-series experiment, demonstrate that little else changed when the treatment took place.
In a word, show equivalence.

Any purported causal test needs to take into consideration all of the two-group threats to validity.
    R   X   O
    R       O      Can be a valid causal test.

    N   X   O
    N       O      Fully exposed to threats.

NEGD Design has Multiple Groups AND Multiple Measures
    N   O   X   O
    N   O       O
This helps rule out (or at least recognize) threats.

Pre-Tests v. Covariates
    N   O   X   O
    N   O       O
Pre-Post-Test Design: Observations are tests you administer.

    N   O1  X   O2
    N   O1      O2
Proxy Pre-Test Design: First observations are covariates on which you collect data.

Problems of Internal Validity in NEGDs

Internal Validity
    N   O   X   O
    N   O       O
All designs suffer from threats to validity.
In addition to all the single-group threats, quasi-experiments are particularly likely to suffer from multi-group threats:
    – Selection-history
    – Selection-maturation
    – Selection-testing
    – Selection-instrumentation
    – Selection-regression
    – Selection-mortality

The Bivariate Distribution
[Figure: scatterplot of posttest (vertical axis, 30 to 90) against pretest (horizontal axis, 30 to 80).]
Program group has a 5-point pretest advantage.
Program group scores 15 points higher on the posttest.

Graph of Means
[Figure: pretest and posttest means (vertical axis, 30 to 80) for the comparison and program groups.]

                        Comp      Prog      ALL
    pretest MEAN        49.991    54.513    52.252
    posttest MEAN       50.008    64.121    57.064
    pretest STD DEV      6.985     7.037     7.360
    posttest STD DEV     7.549     7.381    10.272

Possible Outcome #1
[Figure: pretest and posttest means (vertical axis, 40 to 70) for the comparison and program groups.]
    Selection-history            Possible: local event
    Selection-maturation         Possible: PG initially higher
    Selection-testing            Unlikely: no change in CG
    Selection-instrumentation    Possible: scale effects
    Selection-regression         Unlikely: expect change in CG
    Selection-mortality          Possible: PG loses low scorers

Possible Outcome #2
[Figure: pretest and posttest means (vertical axis, 40 to 70) for the comparison and program groups.]
    Selection-history            Likely: PG initially higher
    Selection-maturation         Likely: PG initially higher
    Selection-testing            Possible
    Selection-instrumentation    Possible
    Selection-regression         Unlikely: expect change in CG
    Selection-mortality          Possible: both lose low scorers

Possible Outcome #3
[Figure: pretest and posttest means (vertical axis, 40 to 70) for the comparison and program groups.]
    Selection-history            Possible: local event
    Selection-maturation         Unlikely: no change in CG
    Selection-testing            Unlikely: no change in CG
    Selection-instrumentation    Possible: scale effects
    Selection-regression         Likely
    Selection-mortality          Possible: PG loses high scorers

Possible Outcome #4
[Figure: pretest and posttest means (vertical axis, 40 to 70) for the comparison and program groups.]
    Selection-history            Possible: local event
    Selection-maturation         Unlikely: no change in CG
    Selection-testing            Unlikely: no change in CG
    Selection-instrumentation    Possible: scale effects
    Selection-regression         Very Likely
    Selection-mortality          Possible: PG loses low scorers

Possible Outcome #5
[Figure: pretest and posttest means (vertical axis, 40 to 70) for the comparison and program groups.]
    Selection-history
    Selection-maturation
    Selection-testing
    Selection-instrumentation
    Selection-regression
    Selection-mortality
“And you should be so lucky…”

Analysis Requirements
    N   O   X   O
    N   O       O
Pre-post (or covariates)
Two-group
Treatment-control (dummy = 0, 1)

Analysis of Covariance (ANCOVA)
    $y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + e_i$
where:
    $y_i$ = outcome score for the ith unit
    $\beta_0$ = coefficient for the intercept
    $\beta_1$ = pretest coefficient
    $\beta_2$ = mean difference for treatment
    $X_i$ = covariate
    $Z_i$ = dummy variable for treatment (0 = control, 1 = treatment)
    $e_i$ = residual for the ith unit

The Bivariate Distribution
[Figure: scatterplot of posttest (vertical axis, 30 to 90) against pretest (horizontal axis, 30 to 80).]
The program group has a 5-point pretest advantage and scores 15 points higher on the posttest.
Slope is $\beta_1$.
Vertical distance is the mean treatment effect, or $\beta_2$.

Why Add Covariates to Analysis?
ANCOVA can include more than one pretest or “control” variable.
Additional pretests further adjust for initial group differences.
Ideally, in the absence of any treatment effect, the covariates would perfectly predict the posttest.
Additional covariates will often improve the accuracy of the estimate of the treatment effect.
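The ANCOVA model above can be estimated by ordinary least squares. The following is a minimal sketch, not the analysis behind the slides: the data are simulated, and the true intercept (10), pretest slope (0.8), treatment effect (10), and noise levels are hypothetical values picked so that the group means land near those in the Graph of Means table. It also assumes the pretest is measured without error, which is the favorable case for ANCOVA. The helper names `ols`, `pretest`, `treat`, and `posttest` are illustrative.

```python
import random
import statistics


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(k):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [v - factor * p for v, p in zip(a[row], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


rng = random.Random(42)
pretest, treat, posttest = [], [], []
for i in range(1000):
    z = i % 2                                   # Z_i: 0 = control, 1 = treatment
    x = rng.gauss(50 + 5 * z, 7)                # X_i: program group starts ~5 points higher
    y = 10 + 0.8 * x + 10 * z + rng.gauss(0, 3)  # hypothetical true effect = 10
    pretest.append(x)
    treat.append(z)
    posttest.append(y)

# Design matrix columns: intercept, pretest X_i, treatment dummy Z_i.
design = [[1.0, x, float(z)] for x, z in zip(pretest, treat)]
b0, b1, b2 = ols(design, posttest)

# Unadjusted comparison: simple difference in posttest means.
raw_diff = (statistics.mean(y for y, z in zip(posttest, treat) if z == 1)
            - statistics.mean(y for y, z in zip(posttest, treat) if z == 0))

print(f"pretest slope (beta_1):         {b1:.2f}")
print(f"adjusted treatment effect (beta_2): {b2:.2f}")
print(f"raw posttest difference:        {raw_diff:.2f}")
```

In this simulated setup the raw posttest difference mixes the treatment effect with the program group's pretest advantage, while the coefficient on the dummy recovers something close to the assumed effect. That recovery depends on the simulation's assumptions and would not hold automatically with real nonequivalent groups.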
Irrelevant Covariates
Adding pretests that are completely unrelated to the posttest, however, actually decreases precision.
“Irrelevant covariates” contribute nothing to the analysis, but subtract a degree of freedom from the error term.
This reduces the efficiency of the estimate.

Omitted Covariates
Covariates that are related to the posttest but not to the treatment can be omitted without biasing the estimate of the treatment effect.
Covariates that are related to both the posttest and the treatment will, if omitted, bias the estimate of the treatment effect.
We can safely omit control variables even if they are highly correlated with the posttest, as long as they do not correlate with the treatment.

Omitted Variables Bias
Omitted (relevant) covariates that are positively correlated with the treatment will lead us to overestimate the treatment effect (assuming the covariate is positively related to the posttest).
Omitted (relevant) covariates that are negatively correlated with the treatment will lead us to underestimate the treatment effect (under the same assumption).

Bottom Line
We should always try to include relevant covariates, except when the covariate is itself a consequence of the treatment.
If we cannot include a relevant covariate, we can at least predict the direction, if not the magnitude, of the likely bias.

But…What about measurement error?
With multiple covariates, measurement error does not always lead to a pseudoeffect.
As measurement error in any single variable increases, it becomes “as if” the variable is not included in the ANCOVA.
This then mimics an omitted variables problem, and the direction of bias depends upon the relationship between the “noisy” covariate and the treatment.

Other Quasi-Experimental Designs

Separate Pre-Post Samples
    N1  O
    N1      X   O
    N2  O
    N2          O
Groups with the same subscript come from the same context.
Here, N1 might be people who were in the program at Agency 1 last year, with those in N2 at Agency 2 last year.
This is like having a proxy pretest on a different group.
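Returning briefly to the Omitted Covariates and Omitted Variables Bias slides: their claims about the direction of bias can be checked with a small simulation. This is a sketch under hypothetical assumptions, not a result from the slides: the model has a true treatment effect of 5 and a single covariate W with a positive coefficient of 2, there is no pretest, and the "analysis" is a plain difference in posttest means that leaves W out. The function name `estimated_effect` and its `shift` parameter are invented for illustration.

```python
import random
import statistics


def estimated_effect(shift, n=4000, seed=7):
    """Difference in posttest means when the covariate W is left out.

    Hypothetical model: y = 5*Z + 2*W + noise, where W has mean shift*Z.
    shift = 0: W is unrelated to treatment.
    shift > 0: treated units tend to have higher W.
    shift < 0: treated units tend to have lower W.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for i in range(n):
        z = i % 2                     # alternate treatment and control units
        w = rng.gauss(shift * z, 1)   # covariate omitted from the analysis
        y = 5 * z + 2 * w + rng.gauss(0, 1)
        (treated if z else control).append(y)
    return statistics.mean(treated) - statistics.mean(control)


unrelated = estimated_effect(shift=0.0)
positive = estimated_effect(shift=1.0)
negative = estimated_effect(shift=-1.0)

print(f"W unrelated to treatment:            {unrelated:.2f}")
print(f"W positively related to treatment:   {positive:.2f}")
print(f"W negatively related to treatment:   {negative:.2f}")
```

With W unrelated to treatment, the estimate should sit near the assumed effect of 5 even though W strongly predicts the posttest. With a positive relationship it is pushed upward, and with a negative relationship downward. If W's coefficient were negative instead, the directions would reverse.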
Separate Pre-Post Samples
    N   R1  O
    N   R1      X   O
    N   R2  O
    N   R2          O
Take random samples at two times of people at two nonequivalent agencies.
Useful when you routinely measure with surveys.
You can assume that the pre and post samples are equivalent, but the two agencies may not be.

Double-Pretest Design
    N   O   O   X   O
    N   O   O       O
Strong in internal validity
Helps address selection-maturation

Switching Replications
    N   O   X   O       O
    N   O       O   X   O
Strong design for both internal and external validity
Strong against social threats to internal validity
Strong ethically

Nonequivalent Dependent Variables Design (NEDV)
    N   O1  X   O1
        O2      O2
The variables have to be similar enough that they are affected the same way by all threats.
The program has to target one variable and not the other.
In simple form, weak internal validity.

NEDV Example
[Figure: pre and post scores (vertical axis, 40 to 80) for Algebra and Geometry.]
Only works if we can assume that geometry scores show what would have happened to algebra if untreated.
The variable is the control.
Note that there is no control group here.

NEDV Pattern Matching
Have many outcome variables.
Have a theory that tells how affected (from most to least) each variable will be by the program.
Match observed gains with predicted ones.
With a pattern, NEDV can be extremely powerful.

NEDV Pattern Matching
[Figure: a “ladder” graph linking expected (Exp) to observed (Obs) values (vertical axis, 0 to 80) for ten outcomes: Algebra, Geometry, Arithmetic, Reasoning, Analogies, Grammar, Punctuation, Spelling, Comprehension, Creativity.]
r = .997

NEDV: Lake and O’Mahony 2006
A Simple Pattern-Matching Design
[Figure: Issues that Generated Interstate Wars (Percent), vertical axis 0 to 60, by period (1815-1914, 1918-1941, 1945-1989), for four issue types: Territory-Related, Foreign Interests, Economic Interests, Realpolitik.]
Hypothesis: As territory declines in value in the 20th century (measured by average state size), wars fought over territory should decline in frequency. There should be no pattern in other issues.
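The pattern-matching step ("match observed gains with predicted ones") can be sketched as a correlation between the theory's predicted ordering and the observed gains. The outcome names below follow the ladder-graph slide, but every number is hypothetical and made up for illustration, so the resulting correlation is not the r = .997 reported on the slide. The `pearson` helper is written out by hand to keep the example self-contained.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


outcomes = ["Algebra", "Geometry", "Arithmetic", "Reasoning", "Analogies",
            "Grammar", "Punctuation", "Spelling", "Comprehension", "Creativity"]

# Hypothetical theory-based prediction: 10 = most affected, 1 = least affected.
expected = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
# Hypothetical observed pre-to-post gains on each outcome.
observed = [28, 25, 24, 19, 17, 15, 10, 9, 5, 2]

pattern_match = pearson(expected, observed)

for name, exp, obs in zip(outcomes, expected, observed):
    print(f"{name:<14} expected rank score {exp:>2}   observed gain {obs:>2}")
print(f"pattern-match correlation: {pattern_match:.3f}")
```

A high correlation means the observed gains line up with the predicted ordering. The more outcomes there are and the more detailed the prediction, the harder it is for a rival explanation to reproduce the same pattern, which is the sense in which the design becomes powerful.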