Quasi-experimental Designs
Cozby Ch. 11
Intro to Experimental Psychology, Psych 185
Dr. Isonio, Golden West College

Recap: Designs
To date, we have discussed:
True experimental designs (Chapters 8 and 10)
What remains is today’s topic:
Quasi-experimental designs
These are “intermediate” designs: they lack the randomization and full control of extraneous variables that true experiments have, but they can nevertheless yield reasonably reliable conclusions
For completeness we will also mention a couple of non-experimental designs (what Campbell and Stanley call “pre-experiments”) that are fraught with confounds

Internal Validity
Remember: Internal validity refers to the extent to which we can clearly attribute the observed differences on the dependent variable to the effects of the independent variable.
Thus, true experimental designs have high internal validity, and quasi-experimental designs have an intermediate level of internal validity

Threats to Internal Validity
Factors which, if uncontrolled, can confound (i.e., “mess up”) a study
The classic list comes from Campbell and Stanley

Threats to Internal Validity
History
Maturation
Repeated Testing
Instrument Decay
Statistical Regression
Selection
Mortality

Threats to Internal Validity
History
Outside events (beyond the context of the study itself) influence the participants during the study, possibly affecting their behavior
This threat is relevant in any type of repeated measures design
Pretest → IV → Posttest

Threats . . . History
Example of history operating as an internal validity threat:
A study assesses how attitudes toward a welfare program are affected by having a personal discussion with recipients
Pretest measure of attitudes on Monday; discussions occur Tuesday through Thursday; posttest measure of attitudes on Friday
BUT, a special TV documentary reporting on widespread welfare fraud was aired on Wednesday evening.
Results show no difference between pre- and posttest attitude assessments. Is that due to a true lack of change, or to the confounding effect of the TV show?

Threats to Internal Validity
Maturation
Any change in the participants (beyond the direct effect of the IV) during the course of the study.
The changes can be permanent (e.g., physical growth) or temporary (e.g., fatigue)
Like history, this threat arises in pre-post designs

Threats . . . Maturation
Example of maturation operating as a threat to internal validity:
A study assesses the effectiveness of a training program to increase the capacity of short-term memory (STM); the pretest is given at noon; the program is held from 1 to 5 pm; the posttest is given at 5:30 pm
Results indicate no difference between pre- and posttest memory scores. Is that due to an ineffective program, or . . . fatigue?

Threats to Internal Validity
Repeated Testing
This threat occurs when participants are measured multiple times. The threat is that they might be changed by the measurement process itself, which could then affect their behavior.
One common outcome of repeated testing: becoming “Test Wise”
Experience with tests affects measurements in sometimes dramatic ways:
Classic IQ effect: the benefit of being test wise is estimated to be about 3 to 5 points
Personality tests: test wise persons appear to be better adjusted

Threats . . . Repeated testing
Example of repeated testing operating as an internal validity threat:
Does a special program teaching mental math increase the speed of doing everyday calculations?
A pretest (10 arithmetic problems) is given; one hour of mental math instruction follows; then the posttest is administered (the same problems, in order to see “improvement”); results indicate dramatic improvement in calculation speed
Do you sell the program to school districts, OR is the “improvement” a reflection of the fact that the participants had seen the problems only an hour earlier?
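
The mental math example can be made concrete with a small simulation. This is a minimal sketch in Python and is not part of the lecture materials: the function name, the score scale (problems solved per minute), and the size of the practice gain are all hypothetical assumptions chosen for illustration. It shows that a one-group pretest-posttest study can report a clear "improvement" even when the program itself contributes nothing.

```python
import random
from statistics import mean


def simulate_pre_post(n=200, practice_gain=2.0, treatment_gain=0.0, seed=0):
    """Simulate a one-group pretest-posttest study (all values hypothetical).

    Scores stand for problems solved per minute. Returns the pretest mean
    and the posttest mean.
    """
    rng = random.Random(seed)
    ability = [rng.gauss(10.0, 2.0) for _ in range(n)]
    # Pretest: true ability plus measurement noise.
    pre = [a + rng.gauss(0.0, 1.0) for a in ability]
    # Posttest on the SAME problems: ability + practice + treatment + noise.
    post = [
        a + practice_gain + treatment_gain + rng.gauss(0.0, 1.0)
        for a in ability
    ]
    return mean(pre), mean(post)


# The program does nothing here (treatment_gain=0.0), yet scores still rise.
pre_mean, post_mean = simulate_pre_post()
apparent_gain = post_mean - pre_mean
print(f"pretest mean:         {pre_mean:.2f}")
print(f"posttest mean:        {post_mean:.2f}")
print(f"apparent improvement: {apparent_gain:.2f}")
```

Because the practice gain and the treatment gain are simply added together in the posttest, the observed change cannot tell them apart. Separating the two would take a comparison group that is retested without the instruction, or a posttest built from different problems.
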
Threats to Internal Validity
Instrument decay
This threat operates whenever there is a change in the measurement instrument itself
The instrument can be a device (e.g., batteries run down) or a human (e.g., an observer changes standards over time or becomes fatigued)

Threats . . . Instrument decay
A good example of how instrument decay can threaten internal validity is the evaluation of writing samples
English professors designate a whole day to read and evaluate the samples; late in the day, as they get tired, they begin to miss errors (or they begin to grade more stringently, or . . . )
The analogy to springs in a scale becoming stretched out over time fits well

Threats to Internal Validity
Statistical Regression (regression toward the mean)
Refers to the natural tendency for extreme values to move closer to average upon subsequent measurement
“Extremes moderate over time”
Any superlative performance by an individual (e.g., in sports) is likely to be followed by a more ordinary one from that same individual

Threats . . . Statistical Regression
Example: A new therapy for severe depression is evaluated. Twenty severely depressed persons go through a two-week “just smile” program; after the program, their mood is re-assessed. Results show a marked improvement: the average depression level is only mild to moderate.
Do you conclude the program is indeed wonderful, or might it just be that extremely depressed people will naturally be (at least slightly) less depressed two weeks later?

Threats to Internal Validity
Selection
Operates whenever the groups used in a comparison are not equivalent at the outset of a study
We have talked about this threat previously
The confound is that we don’t know whether group differences seen on the dependent variable are due to the effects of the independent variable or to pre-existing differences

Threats . . . Selection
EXAMPLE: Two psychology professors argue over the merits of providing students with class notes.
Professor A thinks doing so will send the wrong message to students: you have the notes, so there is no need to attend class. Professor B believes that students can understand that the notes are an aid to help them get more out of class lecture/discussion.
They decide to do a study: professor A makes notes available to his students for the remaining 8 weeks of the semester; professor B does not do so for the same time period. They compare performance on the final exam and find that students in professor B’s class get better exam scores.
What do you conclude and why?

Threats to Internal Validity
Mortality
Refers to the effects of the loss of participants over time in a longitudinal study. Those remaining in the study may not be representative of the original group or of the population.
Participant mortality can result from a variety of factors: literal mortality, moving out of the area, losing contact, no longer being committed to the project, etc.

Threats . . . Mortality
An example of mortality operating as an internal validity threat: a school district is evaluating the effects of mandatory school uniforms on discipline problems (the expectation is that uniforms will cause a reduction in problems). However, during the course of the study, the greatest trouble-makers leave the district. Results indicate a significantly lower level of problems at the end of the first year of uniforms.
What do you conclude?
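
The school uniform example can be sketched as a short simulation. This is an illustrative sketch only, not part of the lecture or handout: the function name, the "discipline-problem score" scale, and the 10% dropout rate are hypothetical assumptions. Uniforms are given no effect at all in the simulated data, so any drop in the year-end average comes purely from who is left in the sample.

```python
import random
from statistics import mean


def simulate_attrition(n=500, drop_fraction=0.10, seed=1):
    """Simulate selective dropout with NO treatment effect (values hypothetical)."""
    rng = random.Random(seed)
    # Hypothetical discipline-problem score for each student at the start.
    start = [rng.gauss(3.0, 2.0) for _ in range(n)]
    # Year-end score: same student, small random fluctuation, no uniform effect.
    end = [s + rng.gauss(0.0, 0.5) for s in start]
    # The students with the highest starting scores leave the district.
    ranked = sorted(range(n), key=lambda i: start[i])
    stayers = ranked[: int(n * (1 - drop_fraction))]
    return {
        "start_all": mean(start),
        "end_all": mean(end),
        "end_stayers": mean(end[i] for i in stayers),
    }


result = simulate_attrition()
print(f"start of year, everyone:   {result['start_all']:.2f}")
print(f"end of year, everyone:     {result['end_all']:.2f}")
print(f"end of year, stayers only: {result['end_stayers']:.2f}")
```

When everyone is retained, the start and end averages are nearly identical. The apparent "reduction" shows up only when the year-end average is computed on the students who remained, which is exactly the comparison the district would be making.
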
Now to the designs
They differ in format (structure, design)
Because of their design, they handle (or fail to handle) potential threats to internal validity in different ways
See page 1 of the “Non-experimental, quasi-experimental and true experimental designs” Handout

True experimental Designs
(We discussed these previously)
Posttest only control group design and pretest-posttest control group design (bottom half of page) were covered in chapter 8; factorial designs were introduced in chapter 10

Non-experimental designs
These designs are inherently flawed and fraught with internal validity threats:
One-shot case study
One-group pretest-posttest design
Non-equivalent control group design

Quasi-experimental Designs
Non-equivalent control group pretest-posttest
Interrupted time series
Control series
Retrospective
Reversal
Multiple-baseline across participants
Multiple-baseline across target behaviors
Equivalent time samples
Equivalent materials samples
Proxy pretest
Double pretest

Nonequivalent Control Group Pretest-Posttest Design
This design salvages the non-equivalent control group posttest only design
This is a strong quasi-experimental design
A good choice when it is not possible to randomly assign participants to conditions.
Provides good control for history, maturation, testing, instrumentation, and mortality as threats to internal validity
Example: see handout and “Notel” discussion in Cozby (p. 202)

Interrupted Time Series Design
A series of measures taken before the implementation of the treatment condition is compared with a series taken after it.
See biofeedback example in Handout and Connecticut traffic fatalities (Cozby, Figure 11.2, pg. 203)
Its main weakness is the failure to control for history effects

Control Series Design
Like the interrupted time series, multiple pre- and post measures are taken and compared.
However, a control group is also tracked, allowing for a “between groups” comparison
Example: Handout, DARE program; Cozby Figure 11.3

Retrospective Study
Existing, archival data are compiled to test the effects of some past treatment (IV).
Most serious threat to validity = history
Example: Effect of establishing Outreach Office at GWC

Reversal Designs
Comparison of multiple periods, designated by letters (A, B, C), in which the treatment is alternately absent or present.
The design attempts to establish the link between the administration of the treatment and a specific behavior change
Example: see Handout, tooth brushing in developmentally disabled men
ABA; ABAB

Multiple Baseline Designs
Multiple baseline designs are similar to plain reversal designs, except that the treatment is introduced at different times for different subjects, or at different times for different target behaviors in a single subject
Across Subjects: see Handout, token system to reinforce toothbrushing in subjects A, B, C
Across Behaviors: see Handout, token system applied to three self-care behaviors across three phases

Equivalent Time Samples Design
Useful when the treatment effect is short lasting and no control group is readily available
Example: see Handout, effect of music on industrial production

Equivalent Materials Samples Design
Various sets of materials deemed “equivalent” are used in a repeated measures format, alternating levels of treatment. Different sets are needed because repeated exposure to the same set of materials is likely to cause some “carry over”
Example: see Handout, distributed vs. massed practice with different sets of to-be-remembered words

Proxy Pretest Design
This design is similar to the standard pre-post design, but in this case the “pretest” measure is collected at the end of the study.
More precisely, a proxy variable is used to establish (after the fact) where the groups would have been on the pretest
Two variations:
Recollection proxy pretest design
Archived proxy pretest design

Double Pretest Design
This design builds on the moderately strong nonequivalent control group pretest-posttest design by adding a second pretest for both groups.
This allows for an assessment of how the two groups might be changing prior to the start of the program
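
The logic of the double pretest can be sketched with simple arithmetic on group means. This is a minimal illustration, not taken from Cozby or the handout: the function name and all of the scores are hypothetical, and the sketch assumes the three measurements are equally spaced in time. It compares how much each group changed between the two pretests with how much each changed from the last pretest to the posttest.

```python
def double_pretest_summary(treatment, control):
    """Summarize a double pretest design from group means (hypothetical data).

    Each argument is a tuple: (pretest 1 mean, pretest 2 mean, posttest mean).
    """
    # Change between the two pretests, before any program is delivered.
    treat_pre_change = treatment[1] - treatment[0]
    control_pre_change = control[1] - control[0]
    # Change from the last pretest to the posttest.
    treat_post_change = treatment[2] - treatment[1]
    control_post_change = control[2] - control[1]
    return {
        # Near zero: the groups were changing at similar rates beforehand.
        "pre_trend_gap": treat_pre_change - control_pre_change,
        # How much more the treatment group changed once the program began.
        "post_change_gap": treat_post_change - control_post_change,
    }


summary = double_pretest_summary(
    treatment=(50.0, 52.0, 60.0),
    control=(48.0, 50.0, 52.0),
)
print(summary)
```

In this made-up case both groups gain 2 points between the pretests, so the pre-program trend gap is 0, while the treatment group gains 6 points more than the control group afterward. A nonzero pre-program gap would be a warning that the groups were already maturing at different rates, which a single pretest could not reveal.
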