The Classic Experiment (and Its Limitations)
Class 6

Stages of the Research Process
• The research process begins with a hypothesis about a presumed (causal?) relationship between an independent and a dependent variable
  – We might also assume that there are conditioning variables
• The elements of a test of this hypothesis are:
  – A research design to assess the relationships between the variables
  – Recruiting subjects for testing the hypothesis
  – Valid and reliable measurement of the variables
  – Appropriate methods of statistical analysis that permit inferential conclusions about the hypothesis

Research Designs
• Today, we discuss research designs, focusing on experiments.
  – Contrast this with an epidemiological model, where we infer that group differences are attributable to the hypothesized effect in a population.
  – In an experiment, we attempt to control for pre-existing differences between groups, so that any differences we observe in outcomes are attributable to the treatment being tested, and not to the group differences
• This is why experiments are considered a “gold standard” in identifying a causal relationship between an independent and a dependent variable.
  – Obviously, experiments are not always feasible
  – Their strengths and limitations fuel endless debates, and have become a battleground for litigants seeking to assess a pattern of facts
  – Examples from video games, alcohol and car crashes

Types of Research Designs
• Case studies
  – Good for generating hypotheses, for understanding and illustrating causal linkages
  – Not good for testing hypotheses, or for generalizing to other populations
• Correlational studies
  – Studies that assess simultaneous changes in independent and dependent variables.
    • Example: income levels and voter preferences on surveys
    • Example: diet and disease (epi causation model)
  – You can still make predictions from correlational studies if you have ruled out other causes, but you cannot achieve “control” without understanding the directionality of the effect.
• True experiments
  – Random assignment of subjects to groups, unequal treatment of similarly situated people: “but for” causation
    • Examples: Perry Pre-School, clinical drug trials
• Quasi-experiments
  – Nonrandom assignment, with approximations and control for between-group differences.
• Why are experiments the gold standard?
  – An experiment is a design for testing hypotheses regarding the empirical relationship between an independent and a dependent variable
  – It is the most efficient and reliable way to rule out spurious causation (rival hypotheses) through random assignment of individuals to test conditions, and therefore to establish conditions for causal inference.
  – Causality is critical for the scientific goals of “explanation,” “prediction” and “control.”

Why Random Assignment?
• Random assignment (RA) assigns units to conditions based on chance
  – Not the same as random sampling; we get to this later, as an example of a validity threat or strength
• Avoids correlation of causes with treatment conditions
• When is randomization feasible? ETHICAL DECISION
  – When demand outstrips supply
  – When supply of X is short
  – When isolation or separation of experimental group is possible
  – Mandatory change (legislation)
  – No preferences
  – No advantages (denial of possibly beneficial service)
  – New organizations are created
  – Lotteries

Types of Experiments
• The Classic Experimental Design
• The Post-test Only Experimental Design
  – Strengths: No test effects, no desensitization
  – Weaknesses: Problems in attribution of effects, does not eliminate rival causal factors such as history or test effects, introduces test effects (!)
• The Solomon Four-Group Design (Fig 8.5)
  – Provides estimates of test effects, avoids reactivity and test effects.
  – Expensive, difficult to implement, especially under field conditions
• Nested, or Hierarchical Designs
  – Allow for identification of contextual effects
  – Common in school research

Natural Experiments
• Natural Disasters, Policy or Legislative Changes
• Examples
  – Flipping Coins in the Courtroom
  – Damage Caps
  – Disaster Research: Highway 880
  – Waiver Laws in Adjacent Areas

Some Limitations to Experiments
• Generalizability of X: complex realities vs. single variables
• Representations of theory: e.g., the meaning of arrest
• Period effects: problems of the day; factors related to crimes or behaviors at one time may not be salient at another time (e.g., drug eras, drug-crime relationships)
• Political limitations (e.g., over-rides)
• Organizational resistance

When You Can’t Randomize: Quasi-Experiments
• Theory and Logic
  – Adjusting for selection differences
  – This can be done either by design controls or statistical controls or both
• No-Control Quasi-Experimental Designs
  – Time series before and after an intervention
  – Removed treatment (TX) (satisfies the essentialist view of causation)
• Critiques of multiple pretest observations
  – Test effects (sensitization, etc.)
  – Works best if the pretest observations are unobtrusive
  – Change over time in status of subject vis-à-vis the preconditions for treatment

Quasi-Experimental Designs That Use Control Groups
• Matched Strategies
  – Matched Cases (Case Control Designs): Housing Discrimination
  – Matched Samples: Bishop Waiver Study
  – Weaknesses and Strengths (omitted variable biases)
• Difficulties and Problems with Matching
  – Endogeneity of Cause and Effect
• Strategies for Better Matches
  – Use stable variables (avoid measurement errors)
  – Avoid confounding of matching variables with dependent variables (outcomes)
  – Use “deep” matches: longitudinally measured or stable variables, for example, rather than single-state variables
• Statistical Solutions
  – Instrumental variables approach
  – “Propensity score matching”: try to model the underlying differences between experimental and control groups

Experimental Validity
• Validity: whether an experiment produces “true” or “accurate” answers
• Threats to internal validity
  – Threats posed by the design of the experiment itself: whether the observational procedures may have produced the results. Internal validity refers to whether the design is sound enough to justify the conclusions reached.
• Threats to external validity
  – Threats due to the limitations of the sample: whether the research is generalizable or applicable only to the population studied. In other words, external validity refers to the extent to which the results can be generalized.
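
To make the internal validity question concrete, here is a minimal simulation sketch contrasting random assignment with self-selection into treatment. Everything in it is an illustrative assumption rather than material from the lecture: the unobserved "motivation" variable, the effect size `TRUE_EFFECT`, and the helper names `simulate`, `random_assignment`, and `self_selection` are all hypothetical.

```python
import random
import statistics

TRUE_EFFECT = 2.0  # assumed treatment effect, chosen only for illustration


def simulate(n, assign, seed):
    """Return the treated-minus-control difference in mean outcomes."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        motivation = rng.gauss(0.0, 1.0)  # hypothetical unobserved rival cause
        in_treatment = assign(motivation, rng)
        outcome = 3.0 * motivation + rng.gauss(0.0, 1.0)
        if in_treatment:
            treated.append(outcome + TRUE_EFFECT)
        else:
            control.append(outcome)
    return statistics.mean(treated) - statistics.mean(control)


def random_assignment(motivation, rng):
    # A coin flip: assignment ignores every characteristic of the unit.
    return rng.random() < 0.5


def self_selection(motivation, rng):
    # Nonequivalent groups: the more motivated units opt into treatment.
    return motivation > 0.0


randomized_estimate = simulate(20_000, random_assignment, seed=1)
selected_estimate = simulate(20_000, self_selection, seed=1)
print(f"random assignment: {randomized_estimate:.2f}")
print(f"self-selection:    {selected_estimate:.2f}")
```

Under these assumptions the randomized estimate should land close to the assumed effect, while the self-selected comparison is inflated because motivation is correlated with treatment status. That inflation is the spurious causation that random assignment is meant to rule out.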

Internal Validity Threats
• History: local factors
• Maturation of subjects: they change
• Test Effects: subjects figure out the test
• Instrumentation: biased instruments
• Regression to the Mean: “what goes up…”
• Selection Bias I: non-equivalent groups
• Mortality: subjects leave the experiment
• Testing Effects: you know you’re being studied
• Reactivity: reactions to the researcher rather than the stimulus

External Validity Threats
• Selection Bias II: groups are unrepresentative of general populations
• Multiple treatment interference: more than one independent variable operating
• Halo effects: conferring status or label that influences behavior
• Local history: changing contexts
• Diffusion of treatment: controls imitate experimental subjects
• Compensatory equalization of treatment: controls want to receive experimental treatment
• Decay: erosion of treatment
• Contamination: C's receive some of E treatment

Tradeoffs
• Must we trade internal validity for external validity in experiments?
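
Returning to one of the internal validity threats listed above, regression to the mean can be illustrated with a short simulation in which no treatment is applied at all. This is a sketch under assumed numbers: the score distribution, the noise level, and the top-10% selection rule are hypothetical choices, not figures from the lecture.

```python
import random
import statistics

rng = random.Random(0)
N = 10_000

# Each subject has a stable true score; each test adds independent noise.
true_scores = [rng.gauss(50.0, 10.0) for _ in range(N)]
pretest = [t + rng.gauss(0.0, 10.0) for t in true_scores]
posttest = [t + rng.gauss(0.0, 10.0) for t in true_scores]

# Select the top 10% on the pretest, as a program aimed at extreme cases might.
cutoff = sorted(pretest)[int(0.9 * N)]
selected = [i for i in range(N) if pretest[i] >= cutoff]

pre_mean = statistics.mean(pretest[i] for i in selected)
post_mean = statistics.mean(posttest[i] for i in selected)
print(f"pretest mean of selected group:  {pre_mean:.1f}")
print(f"posttest mean of selected group: {post_mean:.1f}")
```

The selected group's posttest mean falls back toward the population mean even though nothing was done to them. In a design without a randomized control group, that drop could be mistaken for a treatment effect.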