The Classic Experiment (and Its Limitations)

Class 6
Stages of the Research Process
• Research process begins with a hypothesis
about a presumed (causal?) relationship
between an independent and a dependent
variable
– We also might assume that there are conditioning
variables, as well
• The elements of a test of this hypothesis are:
– Research design to assess the relationships between
the variables
– Recruiting subjects for testing the hypotheses
– Valid and reliable measurement of the variables
– Appropriate methods of statistical analysis that permit
inferential conclusions about the hypothesis
Research Designs
• Today, we discuss research designs, focusing on
experiments.
– Contrast this with an epidemiological model, where we infer that
group differences are attributable to the hypothesized effect in a
population.
– In an experiment, we attempt to control for those differences
between groups, so that any differences we observe between
groups are attributable to the test, and not to the group differences
• This is why experiments are considered a “gold
standard” in identifying a causal relationship between a
dependent and an independent variable.
– Obviously, experiments are not always feasible
– Their strengths and limitations fuel endless debates, and have
become a battleground for litigants seeking to assess a pattern
of facts
– Examples from video games, alcohol and car crashes
Types of Research Designs
• Case studies
– Good for generating hypotheses, for understanding and
illustrating causal linkages
– Not good for testing hypotheses, or for generalizing to other
populations
• Correlational studies
– Studies that assess simultaneous changes in independent and
dependent variables.
• Example: income levels and voter preferences on surveys
• Example: diet and disease (epi causation model)
– You can still make predictions from correlational studies if you
have ruled out other causes, but you cannot achieve “control”
without understanding directionality of effect.
• True experiments
– Random assignment of subjects to groups, unequal treatment of
similarly situated people….. ‘but for…’ causation
• Examples: Perry Pre-School, clinical drug trials
• Quasi-experiments
– Nonrandom assignment, with approximations and control for
between-group differences.
Why are experiments the gold standard?
– An experiment is a design for testing
hypotheses regarding the empirical
relationship between an independent and a
dependent variable
– It is the most efficient and reliable way to rule
out spurious causation (rival hypotheses)
through random assignment of individuals to
test conditions, and therefore to establish
conditions for causal inference.
– Causality is critical for the scientific goals of
“explanation,” “prediction” and “control.”
Why Random Assignment?
• RA assigns units to conditions based on chance
– Not the same as random sampling; we get to this later, as an
example of a validity threat or strength
• Avoids correlation of causes with treatment conditions
• When is randomization feasible? ETHICAL DECISION
– When demand outstrips supply
– When supply of X is short
– When isolation or separation of experimental group is possible
– Mandatory change (legislation)
– No preferences
– No advantages (denial of possibly beneficial service)
– New organizations are created
– Lotteries
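To make chance-based assignment concrete, here is a minimal Python sketch. The function name `randomly_assign`, the applicant labels, and the seed are illustrative assumptions, not part of the lecture. It shuffles the units and deals them alternately into conditions, so group membership depends only on chance and cannot correlate with any cause.

```python
import random

def randomly_assign(units, conditions=("treatment", "control"), seed=None):
    """Assign each unit to a condition by chance, keeping group sizes balanced.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)      # seeded so the assignment can be audited
    shuffled = list(units)
    rng.shuffle(shuffled)          # chance alone decides the order
    groups = {c: [] for c in conditions}
    for i, unit in enumerate(shuffled):
        # Deal units to conditions in turn, like dealing cards.
        groups[conditions[i % len(conditions)]].append(unit)
    return groups

# Hypothetical pool of 20 applicants for a program with limited slots.
applicants = [f"applicant_{i}" for i in range(1, 21)]
groups = randomly_assign(applicants, seed=6)
print(len(groups["treatment"]), len(groups["control"]))
```

Note that this assigns an existing pool to conditions; it does not draw a random sample from a population, which is a separate step.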
Types of Experiments
• The Classic Experimental Design
• The Post-test Only Experimental Design
– Strengths -- No test effects, no desensitization
– Weaknesses -- Problems in attribution of effects, does
not eliminate rival causal factors such as history or
test effects, introduces test effects (!)
• The Solomon Four-Group Design (Fig 8.5)
– Provides estimates of test effects, avoids reactivity
and test effects.
– Expensive, difficult to implement, especially under
field conditions
• Nested, or Hierarchical Designs
– Allows for identification of contextual effects
– Common in school research
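As a rough illustration of how the Solomon four-group design yields an estimate of test effects, the sketch below treats the four post-test means as a 2 x 2 layout (pretested or not, treated or not). The function name, the dictionary keys, and the numbers are made up for illustration; they are not taken from Fig 8.5 or any real study.

```python
def solomon_effects(post_means):
    """Estimate effects from the four post-test means of a Solomon design.

    Hypothetical helper; keys name the four groups of the design.
    """
    t_pre = post_means["pretest_treated"]
    c_pre = post_means["pretest_control"]
    t_no = post_means["nopretest_treated"]
    c_no = post_means["nopretest_control"]
    return {
        # Average treated-minus-control gap across pretested and unpretested groups.
        "treatment_effect": ((t_pre - c_pre) + (t_no - c_no)) / 2,
        # Average pretested-minus-unpretested gap: the effect of testing itself.
        "testing_effect": ((t_pre - t_no) + (c_pre - c_no)) / 2,
        # Does the pretest change how well the treatment works?
        "pretest_x_treatment": (t_pre - c_pre) - (t_no - c_no),
    }

# Made-up post-test means, for illustration only.
hypothetical_means = {
    "pretest_treated": 14.0,
    "pretest_control": 10.0,
    "nopretest_treated": 12.0,
    "nopretest_control": 9.0,
}
effects = solomon_effects(hypothetical_means)
print(effects)
```

A nonzero testing effect or interaction signals that taking the pretest itself moved the outcome, which a two-group design cannot separate from the treatment.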
Natural Experiments
• Natural Disasters, Policy or Legislative
Changes
• Examples
– Flipping Coins in the Courtroom
– Damage Caps
– Disaster Research: Highway 880
– Waiver Laws in Adjacent Areas
Some Limitations to Experiments
• Generalizability of X -- complex realities vs.
single variables
• Representations of theory -- e.g., the meaning of
arrest
• Period effects -- problems of the day, factors
related to crimes or behaviors at one time may
not be salient at another time (e.g., Drug eras,
drug-crime relationships)
• Political Limitations (e.g., over-rides)
• Organizational resistance
When You Can’t Randomize:
Quasi-Experiments
• Theory and Logic
– Adjusting for selection differences
– This can be done either by design controls or statistical controls
or both
• No-Control Quasi-Experimental Designs
– Time series before and after an intervention
– Removed TX (satisfies the essentialist view of causation)
• Critiques of multiple pretest observations
– Test effects (sensitization, et al.) – works best if the pretest
observations are unobtrusive
– Change over time in status of subject vis-à-vis the preconditions
for treatment
Quasi-Experimental Designs
That Use Control Groups
• Matched Strategies
– Matched Cases (Case Control Designs): Housing Discrimination
– Matched Samples -- Bishop Waiver Study
– Weaknesses and Strengths (omitted variable biases)
• Difficulties and Problems with Matching
– Endogeneity of Cause and Effect
• Strategies for Better Matches
– Use stable variables (avoid measurement errors)
– Avoid confounding of matching variables with dependent variables
(outcomes)
– Use “deep” matches: longitudinally measured or stable variables,
for example, rather than single-state variables
• Statistical Solutions
– Instrumental variables approach
– “Propensity score matching”: try to model the underlying differences
between experimental and control groups
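The idea behind propensity score matching can be sketched on simulated data. Everything below is an assumption for illustration: a single covariate `x` that drives both selection into treatment and the outcome, a true treatment effect of 1.0, a hand-rolled logistic fit, and nearest-neighbor matching with replacement. Applied work would normally use a statistics package and many covariates.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_propensity(xs, treated, steps=500, lr=0.5):
    """Fit P(treated | x) = sigmoid(a + b*x) by gradient ascent; return scores."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = sum(t - sigmoid(a + b * x) for x, t in zip(xs, treated))
        grad_b = sum((t - sigmoid(a + b * x)) * x for x, t in zip(xs, treated))
        a += lr * grad_a / n
        b += lr * grad_b / n
    return [sigmoid(a + b * x) for x in xs]

def matched_effect(scores, treated, outcomes):
    """Pair each treated unit with the control whose score is closest."""
    controls = [i for i, t in enumerate(treated) if t == 0]
    diffs = []
    for i, t in enumerate(treated):
        if t == 1:
            j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
            diffs.append(outcomes[i] - outcomes[j])
    return sum(diffs) / len(diffs)

# Simulated data (assumption): high-x units select into treatment more often,
# and x also raises the outcome, so the raw group comparison is confounded.
rng = random.Random(1)
n = 500
x = [rng.gauss(0, 1) for _ in range(n)]
treated = [1 if rng.random() < sigmoid(xi) else 0 for xi in x]
y = [2.0 * xi + 1.0 * t + rng.gauss(0, 1) for xi, t in zip(x, treated)]

n_treated = sum(treated)
naive = (sum(yi for yi, t in zip(y, treated) if t) / n_treated
         - sum(yi for yi, t in zip(y, treated) if not t) / (n - n_treated))
scores = fit_propensity(x, treated)
matched = matched_effect(scores, treated, y)
print(f"naive difference: {naive:.2f}, matched estimate: {matched:.2f}")
```

The matched estimate adjusts only for what the score model includes. An omitted variable that affects both selection and outcome would still bias it, which is the weakness noted above.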
Experimental Validity
• Validity - whether an experiment produces “true”
or “accurate” answers
• Threats to internal validity
– Threats posed by the design of the experiment itself -- whether the observational procedures may have
produced the results. Internal validity refers to the
soundness of the design to justify the conclusions
reached.
• Threats to external validity
– Threats due to the limitations of the sample -- whether the research is generalizable or applicable
only to the population studied. In other words, it
refers to the extent to which the results can be
generalized.
Internal Validity Threats
• History – local factors
• Maturation of subjects – they change
• Test Effects – subjects figure out test
• Instrumentation – biased instruments
• Regression to the Mean – “what goes up…”
• Selection Bias I – non-equivalent groups
• Mortality – subjects leave experiment
• Testing Effects – you know you’re being studied
• Reactivity – reactions to the researcher rather
than the stimulus
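Regression to the mean is easy to see in a small simulation. The sketch below assumes a simple model that is not from the lecture: each observed score is a stable "true" level plus independent measurement noise, and no treatment is given at all. Subjects selected for extreme pretest scores still score closer to average on the post-test.

```python
import random
import statistics

rng = random.Random(0)

# Assumed model: observed score = stable true level + independent noise.
true_level = [rng.gauss(50, 10) for _ in range(2000)]
pretest = [a + rng.gauss(0, 10) for a in true_level]
posttest = [a + rng.gauss(0, 10) for a in true_level]   # no treatment applied

# Select the 200 highest pretest scorers (top 10%), as a program might.
cutoff = sorted(pretest)[-200]
selected = [i for i, s in enumerate(pretest) if s >= cutoff]

pre_mean = statistics.mean(pretest[i] for i in selected)
post_mean = statistics.mean(posttest[i] for i in selected)
print(f"selected group: pretest mean {pre_mean:.1f}, post-test mean {post_mean:.1f}")
```

A design without a randomized control group could mistake this drop for a treatment effect.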
External Validity Threats
• Selection Bias II -- groups are unrepresentative of
general populations
• Multiple treatment interference -- more than one
independent variable operating
• Halo effects -- conferring status or label that influences
behavior
• Local history – changing contexts
• Diffusion of treatment -- controls imitate experimental
subjects
• Compensatory equalization of treatment -- controls want
to receive experimental treatment
• Decay -- erosion of treatment
• Contamination -- C's receive some of E treatment
Tradeoffs
• Must we trade internal validity for external
validity in experiments?