Proxy Pretest Design

Quasi-experimental Designs
Cozby Ch. 11
Intro to Experimental Psychology
Psych 185
Dr. Isonio
Golden West College
Recap--Designs

To date, we have discussed:


True experimental designs (Chapters 8 and 10)
What remains is today’s topic:



Quasi-experimental designs
These are “intermediate” designs—they lack the randomization and full control of
extraneous variables that true experiments have but can nevertheless yield
reasonably reliable conclusions
For completeness we will also mention a couple of non-experimental designs (what
Campbell and Stanley call “pre-experiments”) that are fraught with confounds
Internal Validity


Remember: Internal validity refers to the extent to which we
are able to clearly attribute the observed differences on the
dependent variable to the effects of the independent variable.
Thus, true experimental designs have high internal validity and
quasi-experimental designs have some intermediate level of
internal validity
Threats to Internal Validity


Factors which, if uncontrolled, can confound (i.e., “mess up”) a
study
The classic list comes from Campbell and Stanley
Threats to Internal Validity







History
Maturation
Repeated Testing
Instrument Decay
Statistical Regression
Selection
Mortality
Threats to Internal Validity

History –


Outside events (beyond the context of the study itself) influence the
participants during the study, possibly affecting their behavior
This threat is relevant in any type of repeated measures design

Pretest → IV → Posttest
Threats . . . History


Example of History operating as an internal validity threat:
Study to assess how attitudes toward welfare programs are affected by
having a personal discussion with recipients



Pretest measure of attitudes on Monday; discussions occur Tuesday through
Thursday; posttest measure of attitudes on Friday
BUT, a special TV documentary reporting on widespread welfare fraud was aired
on Wednesday evening.
Results show no difference between pre- and posttest attitude assessments; due to
true lack of difference, or the confounding effect of the TV show???
Threats to Internal Validity

Maturation


Any change in the participants (beyond the direct effect of the IV)
during the course of the study. The changes can be permanent (e.g.,
physical growth) or temporary (e.g., fatigue)
Like History, this threat arises in pre-post designs
Threats . . . Maturation



Example of maturation operating as a threat to internal validity
Study to assess effectiveness of training program to increase capacity of
STM; pretest given at noon; program held from 1 to 5 pm; posttest given
at 5:30 pm
Results indicate no difference between pre- and posttest memory
scores; due to ineffective program, or . . . fatigue ????
Threats to Internal Validity

Repeated Testing

This threat occurs when participants are measured multiple times. The
threat is that they might be somehow changed by the measurement process
itself. This could then affect their behavior.
One common outcome of repeated testing:
becoming “Test Wise”

Experience with tests affects measurements in sometimes dramatic
ways:

Classic IQ effect: benefit of being test wise is estimated to be about 3 to 5 points
Personality tests—test wise persons appear to be better adjusted
Threats . . . Repeated testing


Example of repeated testing operating as an internal validity threat:
Does a special program teaching mental math increase the speed of
doing everyday calculations?


Pretest (10 arithmetic problems) is given; one hour of mental math instruction
follows; then, the posttest is administered (same problems, in order to see
“improvement”); results indicate dramatic improvement in calculation speed
Do you sell the program to school districts OR is the “improvement” a reflection of
the fact that the participants had seen the problems previously, only an hour earlier
???
Threats to Internal Validity

Instrument decay


This threat operates whenever there is a change in the measurement
instrument itself
The instrument can be a device (e.g., batteries run down) or a human (e.g.,
changes standards over time or becomes fatigued)
Threats . . . Instrument decay

A good example of how instrument decay can threaten internal
validity is with the evaluations of writing samples

English professors designate a whole day to read and evaluate the
samples; late in the day as they get tired, they begin to miss errors
(or they begin to grade more stringently or . . . )

The analogy to springs in a scale becoming stretched out over time
fits well
Threats to Internal Validity


Statistical Regression (regression toward the mean) refers to
the natural tendency for extreme values to move closer to
average upon subsequent measurement
“Extremes moderate over time”

Any superlative performance by an individual (e.g., in sports) is likely
to be followed by a more ordinary one from that same individual
Threats . . . Statistical Regression

Example:


A new therapy for severe depression is evaluated. Twenty severely
depressed persons go through a two-week “just smile” program; after
the program, their mood is re-assessed. Results show a marked
improvement—average depression level is only mild to moderate.
Do you conclude the program is indeed wonderful, or might it just be
that extremely depressed people will naturally be (at least slightly) less
depressed two weeks later ????
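The regression effect in this example can be illustrated with a small simulation. This is only a sketch under stated assumptions: the score scale, the noise level, and the function name simulate_regression_to_mean are hypothetical, and no treatment of any kind is applied between the two measurements.

```python
import random
import statistics


def simulate_regression_to_mean(n_people=10_000, n_selected=20, seed=0):
    """Select the most extreme scorers at time 1 and re-measure them at time 2.

    Each observed score is a stable "true" level plus independent
    measurement noise. All numbers are hypothetical, and nothing is
    done to the participants between the two measurements.
    """
    rng = random.Random(seed)
    true_levels = [rng.gauss(50, 10) for _ in range(n_people)]
    time1 = [t + rng.gauss(0, 10) for t in true_levels]
    time2 = [t + rng.gauss(0, 10) for t in true_levels]

    # Pick the n_selected people with the highest (most severe) time-1 scores.
    order = sorted(range(n_people), key=lambda i: time1[i], reverse=True)
    selected = order[:n_selected]

    mean1 = statistics.mean(time1[i] for i in selected)
    mean2 = statistics.mean(time2[i] for i in selected)
    return mean1, mean2


before, after = simulate_regression_to_mean()
print(f"Selected group at time 1: {before:.1f}")
print(f"Same group at time 2:     {after:.1f}")
```

Because the group was chosen for its extreme first scores, its second average sits closer to the overall mean of 50 even though nobody was treated. A one-group pre-post study of a therapy would show the same pattern whether or not the therapy worked.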
Threats to Internal Validity



Selection operates whenever groups used in a comparison are
not equivalent at the outset of a study
We have talked about this threat previously
The confound is that we don’t know whether group differences
seen on the dependent variable are due to the effects of the
independent variable or to pre-existing differences
Threats . . . selection


EXAMPLE: Two psychology professors argue over the merits of
providing students with class notes. Professor A thinks doing so will
send the wrong message to students—you have the notes, so there is
no need to attend class. The other believes that students can
understand that the notes are an aid to help them get more out of class
lecture/discussion. They decide to do a study—professor A makes
notes available to his students for the remaining 8 weeks of the
semester; professor B does not do so for the same time period. They
compare performance on the final exam and find that students in
professor B’s class get better exam scores.
What do you conclude and why?
Threats to Internal Validity


Mortality refers to the effects of the loss of participants over
time in a longitudinal study.
Those remaining in the study, then, may not be representative
of the original group nor of the population.

Participant mortality can result from a variety of factors—literal
mortality, moving out of the area, losing contact, no longer
committed to the project, etc.
Threats . . . mortality

An example of mortality operating as an internal validity threat:
a school district is evaluating the effects of mandatory school
uniforms on discipline problems (expectation is that uniforms
will cause a reduction in problems). However, during the
course of the study, the greatest trouble-makers leave the
district. Results indicate a significantly lower level of problems
at the end of the first year of uniforms. What do you conclude?
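A quick arithmetic sketch shows why the uniform result is ambiguous. The referral counts below are made up for illustration, and the helper mean_after_attrition is hypothetical; the point is only that the district average can fall when nobody's behavior changes.

```python
from statistics import mean


def mean_after_attrition(referrals, n_leavers):
    """Drop the n_leavers students with the most referrals, then average the rest."""
    stayers = sorted(referrals)[: len(referrals) - n_leavers]
    return mean(stayers)


# Hypothetical discipline referrals per student (ten students).
referrals = [0, 0, 1, 1, 2, 2, 3, 9, 12, 15]

start_mean = mean(referrals)                   # everyone enrolled
end_mean = mean_after_attrition(referrals, 3)  # same behavior, top three gone

print(f"Start of year: {start_mean:.2f} referrals per student")
print(f"End of year:   {end_mean:.2f} referrals per student")
```

In this sketch every remaining student has exactly the same number of referrals as before, yet the average drops from 4.50 to about 1.29 simply because the three students with the most referrals left.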
Now to the designs



They differ in format (structure, design)
Because of their design they handle (or fail to handle) potential
threats to internal validity in different ways
See page 1 of the “Non-experimental, quasi-experimental and
true experimental designs” Handout
True experimental Designs


(We discussed these previously)
Posttest only control group design and pretest-posttest control
group design (bottom half of page) were covered in chapter 8;
factorial designs were introduced in chapter 10
Non-experimental designs

These designs are inherently flawed and fraught with internal
validity threats:



One-shot case study
One-group pretest-posttest design
Non-equivalent control group design
Non-experimental designs

One-shot case study

One-group pretest-posttest design

Non-equivalent control group design
Quasi-experimental Designs











Non-equivalent control group pretest-posttest
Interrupted time series
Control series
Retrospective
Reversal
Multiple-baseline across participants
Multiple-baseline across target behaviors
Equivalent time samples
Equivalent materials samples
Proxy pretest
Double pre-test
Nonequivalent Control Group Pretest-Posttest
Design





This design salvages the non-equivalent control group posttest only
design
This is a strong quasi-experimental design
A good choice when it is not possible to randomly assign to conditions.
Provides good control for history, maturation, testing, instrumentation,
and mortality as threats to internal validity
Example—see handout and “Notel” discussion in Cozby (p. 202)
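One common way to summarize data from this design is to compare the pre-to-post gain in the treatment group with the gain in the comparison group. The sketch below assumes hypothetical scores and a hypothetical helper named gain_difference; it is not taken from the handout or from Cozby.

```python
from statistics import mean


def gain_difference(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Treatment-group gain minus comparison-group gain (pre to post)."""
    treat_gain = mean(treat_post) - mean(treat_pre)
    ctrl_gain = mean(ctrl_post) - mean(ctrl_pre)
    return treat_gain - ctrl_gain


# Hypothetical scores; the groups are NOT equivalent at pretest.
treat_pre = [10, 12, 11, 13]
treat_post = [15, 17, 16, 18]
ctrl_pre = [14, 15, 13, 14]
ctrl_post = [16, 17, 15, 16]

extra_gain = gain_difference(treat_pre, treat_post, ctrl_pre, ctrl_post)
print(f"Gain beyond the comparison group: {extra_gain}")
```

Here the treatment group gains 5 points and the comparison group gains 2, so 3 points of gain remain after subtracting whatever both groups experienced (history, maturation, testing). The pretest is what makes this subtraction possible; with a posttest alone, the starting gap between the groups could not be separated from a treatment effect.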
Interrupted Time Series Design



A series of measures both before and after the implementation
of the treatment condition are compared.
See biofeedback example in Handout and Connecticut traffic
fatalities (Cozby, Figure 11.2, pg. 203)
Weakness is definitely failure to control for history effects
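The value of a whole series, rather than one before/after pair, can be sketched by comparing the change at the interruption with the ordinary fluctuation seen before it. The data and the function name summarize_interruption below are hypothetical and are not the biofeedback or Connecticut figures.

```python
from statistics import mean


def summarize_interruption(series, n_pre):
    """Summarize a time series whose first n_pre points precede the treatment."""
    pre, post = series[:n_pre], series[n_pre:]
    # Typical size of period-to-period fluctuation before the treatment.
    pre_changes = [abs(b - a) for a, b in zip(pre, pre[1:])]
    return {
        "pre_mean": mean(pre),
        "post_mean": mean(post),
        "jump_at_interruption": post[0] - pre[-1],
        "typical_pre_change": mean(pre_changes),
    }


# Hypothetical yearly counts: six years before the treatment, four after.
counts = [30, 34, 31, 35, 32, 36, 28, 27, 26, 27]
summary = summarize_interruption(counts, n_pre=6)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this made-up series the drop at the interruption (8) is more than twice the typical year-to-year change beforehand (3.6), and the lower level persists. A single pre-post comparison could not show either of those things, although an outside event occurring at the same time would still produce the same picture.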
Control Series Design


Like the interrupted time series, multiple pre- and post measures
are taken and compared. However, a control group is also
tracked, allowing for a “between groups” comparison
Example—Handout: DARE program; Cozby Figure 11.3
Retrospective Study
Existing, archival data are compiled to test the effects of
some past treatment (IV).
Most serious threat to validity = history
Example: Effect of establishing Outreach Office at GWC
Reversal Designs




Comparison of multiple periods, designated by letters (A, B, C), in which
treatment is alternately absent or present.
The design attempts to establish the link between the administration of
the treatment and a specific behavior change
Example: see Handout (tooth brushing in developmentally disabled men)
ABA; ABAB
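The logic of an ABAB design can be sketched by averaging the behavior within each phase and checking that it rises and falls with the treatment. The observations and the helper phase_means below are hypothetical, with A meaning treatment absent and B meaning treatment present.

```python
from itertools import groupby
from statistics import mean


def phase_means(observations):
    """Average the values within each consecutive phase.

    observations: list of (phase_label, value) pairs in time order.
    """
    return [
        (label, mean(value for _, value in group))
        for label, group in groupby(observations, key=lambda obs: obs[0])
    ]


# Hypothetical daily counts of the target behavior across an ABAB sequence.
observations = [
    ("A", 2), ("A", 3), ("A", 1),
    ("B", 6), ("B", 7), ("B", 8),
    ("A", 3), ("A", 2), ("A", 4),
    ("B", 7), ("B", 8), ("B", 9),
]

for label, average in phase_means(observations):
    print(label, average)
```

In this sketch the phase averages are 2, 7, 3, and 8: the behavior goes up each time the treatment is introduced and falls back when it is withdrawn, which is the pattern that makes a history or maturation explanation less plausible.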
Multiple Baseline Designs



Multiple baseline designs are similar to plain reversal designs, except
that the treatment is introduced at different times for different subjects
or targeting different behaviors for a single subject
Across Subjects – see Handout: token system to reinforce toothbrushing in subjects A, B, C
Across Behaviors -- see Handout: token system applied to three
self-care behaviors across three phases
Equivalent Time Samples Design


Useful when the treatment effect is short lasting and no
control group is readily available
Example: see Handout: effect of music on industrial
production
Equivalent Materials Samples Design


Various sets of materials deemed “equivalent” are used in a repeated
measures format with alternating levels of treatment. Different sets are
needed because exposure to a set of materials is likely to cause some “carry over”
Example: see Handout—distributed vs. massed practice with different
sets of to-be-remembered words
Proxy Pretest Design


This design is similar to the standard pre-post design, but in this case
the “pretest” measure is collected at the end of the study. Actually, a
proxy variable is used to establish (after the fact) where the groups
would have been on the pretest
Two variations:


Recollection proxy pretest design
Archived proxy pretest design
Double Pretest Design


This design builds on the moderately strong nonequivalent
control group pretest-posttest design by adding a second pretest for both
groups.
This allows for an assessment of how the two groups might be
changing prior to the start of the program
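What the extra pretest buys can be sketched by computing each group's change between the two pretests, before any treatment is given. The scores and the helper pre_treatment_trends are hypothetical.

```python
from statistics import mean


def pre_treatment_trends(treat_pre1, treat_pre2, ctrl_pre1, ctrl_pre2):
    """Change from the first pretest to the second pretest, for each group."""
    treat_trend = mean(treat_pre2) - mean(treat_pre1)
    ctrl_trend = mean(ctrl_pre2) - mean(ctrl_pre1)
    return treat_trend, ctrl_trend


# Hypothetical scores on two pretests, both taken before the program starts.
treat_pre1, treat_pre2 = [20, 22, 24], [23, 25, 27]
ctrl_pre1, ctrl_pre2 = [21, 23, 25], [22, 24, 26]

treat_trend, ctrl_trend = pre_treatment_trends(
    treat_pre1, treat_pre2, ctrl_pre1, ctrl_pre2
)
print(f"Treatment group pre-program change:  {treat_trend}")
print(f"Comparison group pre-program change: {ctrl_trend}")
```

In this sketch the treatment group was already improving by 3 points between pretests while the comparison group improved by only 1. A larger posttest gain in the treatment group could therefore reflect that head start in growth rather than the program, which a single pretest would not reveal.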