Experimental Designs: True, Quasi, Pre

HDFS 8200: Research Methods
in HDFS (MANFRA)
Fall 2011
Recall:
• IV => Independent variable
• DV => Dependent variable
– The IV influences the DV
• Discrete vs. Continuous variables
• Active vs. Attribute IV
• External validity
• Sampling
• Random selection
Example
• The researcher is interested in the effects of parental
separation on adolescent academic motivation
• Participants of interest: adolescents
• Constructs of interest:
– Parental separation
– Academic motivation
• Variables (operational definition):
– Separation (IV, Attribute): One parent moves out of adolescent’s
primary residence for a reason related to marital disharmony
(trial separation, pre-divorce, divorce, etc.)
– Academic motivation (DV): Adolescent’s self-report of the
importance and desire to do well (in terms of grades) in school
and continue his/her education post-graduation
Example
• Research question: Does the separation of an
adolescent’s parents affect his/her motivation to do well
academically in school and pursue education after
graduating High School?
• Hypothesis: Adolescents whose parents separated
within 2 years will have lower self-reported motivation to
get good grades in High School and will also have lower
self-reported motivation to attend college compared to
adolescents whose parents are not separated.
– NOTE: “separated” implies a current state of the sample not
what is about to or may come in the future; clear comparison of
two groups on two DVs (motivation for grades and college);
limitation of separation event to the past 2 years
Example
• Measures:
– Adolescent report of whether one parent has moved out of their
primary residence within last 2 years (Discrete, Dichotomous)
– Adolescent report of importance and desire for good classroom
grades (Continuous, Based on multiple Likert scale questions)
– Adolescent report of importance and desire for going to college
(Continuous, Based on multiple Likert scale questions)
• NOTE: “importance and desire” is operational definition of
motivation
• Sample: volunteers from 2 local high schools
– Parents provide consent; adolescent provides written assent
– Adolescent fills out anonymous questionnaire and returns it to
the researcher via mail
– Adolescents who live with only one parent and the other parent
did not move out within 2 years are excluded from the study
Study Designs
Design Characteristics
• The following design characteristics, together with the
type of IV (attribute or active), determine the type of
study design (e.g., true experiment, quasi-experiment)
• The researcher does not always have every option: the
nature of the study topic and ethics constrain the choice,
and some options cost too much
• Making the decision to go with one over the other will
result in a give-and-take between internal validity and
external validity
– One-group vs. Multi-group
– Equivalent vs. Nonequivalent
– Randomized vs. Nonrandomized
– Matched vs. Nonmatched
– Posttest-only vs. Pretest-posttest vs. Time series
– Temporary treatment vs. Continuous treatment
One-group vs. Multi-group
• One-group => everyone in sample experiences
the IV event
• Multi-group => sample is divided into 2 or more
groups
– Typically 2 groups:
• 1 group experiences IV event (level 1)
• 1 group does not (level 2)
– Allows experimenter to compare outcome (DV) at
different levels of IV (event, no event)
• If groups do not differ on DV, no support for hypothesis
One-group vs. Multi-group
[Figure: DV from Pre-Event to Post-Event, with the IV event in
between. One-group panel: experimental group (E) only. Multi-group
panel: experimental (E) and control (C) groups.]
One-group vs. Multi-group
• The more groups in the study, the more
confidence the researcher has that the IV event is causing or influencing the DV
– Internal validity => the extent to which one can infer
that the IV caused the DV
• With only 1 group, it is nearly impossible for the
researcher to distinguish between the effects of
time or practice from the effects of the IV
• Control or comparison groups are used to
remove the ambiguity from the study
One-group vs. Multi-group
• Example: A researcher is interested in
knowing if copying numerals in order will
increase preschool children’s numeric
knowledge.
– One-group design:
• All children are assessed on numeric knowledge
(Pretest); then, copy numerals for 10 minutes a
day for two weeks; then, all children are assessed
on numeric knowledge (Posttest)
• Children increase from pre to post on numeric
knowledge. Aside from the activity, why might this
have happened?
One-group vs. Multi-group
• Example: A researcher is interested in knowing if
copying numerals in order will increase
preschool children’s numeric knowledge.
– Two-group design:
• All children are assessed on numeric knowledge (Pretest);
then, half the children copy numerals for 10 minutes a day
for two weeks (E) and half copy letters for 10 min/day for 2
weeks (C); then, all children are assessed on numeric
knowledge (Posttest)
• All children increase from pre to post on numeric knowledge.
Clearly, it is not the copying of numerals that increases
numeric knowledge. So, why did all of the children increase
in numeric knowledge?
One-group vs. Multi-group
• Example: A researcher is interested in knowing if copying
numerals in order will increase preschool children’s
numeric knowledge.
– Four-group design:
• All children are assessed on numeric knowledge (Pretest)
• Then:
– Quarter the children copy numerals for 10 min/day for two weeks (E)
– Quarter copy letters for 10 min/day for 2 weeks (C1)
– Quarter color for 10 min/day for 2 weeks (C2)
– Quarter do not do any added activity for 2 weeks (Control)
• Then, all children are assessed on numeric knowledge (Posttest)
• Children copying numerals and children copying letters increase from
pre to post on numeric knowledge. Children coloring increase very
slightly. Children with no added activity show no increase in numeric
knowledge.
• How confident will the researcher be in the results (internal validity)?
• Why did the children copying numerals and letters increase in
numeric knowledge?
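NOTE: One simple way to compare the four groups is to compute each group's mean gain (posttest minus pretest). The sketch below is only an illustration: the scores are made up, and `mean_gain` is a hypothetical helper name, not something from the course materials.

```python
from statistics import mean

# Hypothetical (pretest, posttest) numeric-knowledge scores; not real data.
scores = {
    "E (numerals)": [(4, 8), (5, 9), (3, 7)],
    "C1 (letters)": [(4, 8), (5, 8), (3, 7)],
    "C2 (coloring)": [(4, 5), (5, 5), (3, 4)],
    "Control": [(4, 4), (5, 5), (3, 3)],
}

def mean_gain(pairs):
    """Average posttest-minus-pretest change for one group."""
    return mean(post - pre for pre, post in pairs)

gains = {group: mean_gain(pairs) for group, pairs in scores.items()}
for group, gain in gains.items():
    print(f"{group}: mean gain = {gain:.2f}")
```

With only the E group, the gain of 4 points could not be separated from time or practice effects; the comparison groups show how much of that gain also appears without copying numerals.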
Equivalent vs. Nonequivalent
• In multi-group designs, are the groups functionally
similar (equivalent) or different (nonequivalent)?
• How does the researcher make two groups equivalent?
– Random assignment to groups
– When participants are not randomly assigned to groups, the
groups are considered nonequivalent
• Typically happens with existing groups (e.g., classroom)
• Randomized vs. Nonrandomized
– Randomized = equivalent
– Nonrandomized = nonequivalent
• FYI: most study designs are worded as Randomized
vs. Nonequivalent primarily because the word
“equivalent” has too strong of an implication and
nonequivalent groups can have some randomization
Randomized vs. Nonequivalent
• Why is randomization important?
– If participants are randomly assigned to groups, the
researcher can be far more confident that the
outcome of the study is related to the variable of
interest (IV) rather than another (unmeasured) factor
that resulted in the nonequivalent groups
– Randomization dramatically increases internal
validity
• Separation Example: what other (unmeasured)
factor might have a negative effect on
adolescents’ motivation for doing well in school
and going to college that also resulted in the
separation?
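NOTE: Mechanically, random assignment is simple. The sketch below shuffles a list of participant IDs and deals them into groups; the IDs, the seed value, and the function name `randomly_assign` are all hypothetical. It applies only to an active IV. An attribute IV such as parental separation cannot be assigned by the researcher.

```python
import random

def randomly_assign(participants, groups=("E", "C"), seed=None):
    """Shuffle participants, then deal them into groups round-robin."""
    rng = random.Random(seed)          # seeded only so the example is repeatable
    shuffled = list(participants)
    rng.shuffle(shuffled)
    assignment = {g: [] for g in groups}
    for i, person in enumerate(shuffled):
        assignment[groups[i % len(groups)]].append(person)
    return assignment

participants = [f"P{i:02d}" for i in range(1, 21)]  # 20 hypothetical IDs
assignment = randomly_assign(participants, seed=8200)
print({g: len(members) for g, members in assignment.items()})
```

Because group membership depends only on the shuffle, no participant characteristic (measured or unmeasured) can systematically determine who ends up in E or C.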
Randomized vs. Nonequivalent
• What can change during a study?
• It is possible that randomized groups
(equivalent) become nonequivalent
during the course of the study
– For example, attrition or mortality
– Systematic attrition – something about the
study causes a specific subset of participants
to drop out
Matched vs. Nonmatched
• Pertains to studies with more than one group
• Matched sample => prior to assignment, individuals are matched into
sets (one member per group, e.g., pairs for two groups) on an
important factor known to influence the DV; the members of each
set are then assigned to different groups
– “Guarantees” that each group will have nearly the same
statistical mean and variance of the matching factor and thereby
have “equivalent” covariation with the DV
• Nonmatched sample => participants are simply
assigned to a group
– With small samples, it is possible to end up with nonequivalent
groups even with random assignment
– Matching the sample ahead of time will decrease this possibility
provided each half of the participants has a close match in the
other half
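NOTE: A minimal sketch of matching followed by random assignment for two groups, assuming a single numeric matching factor. The pretest scores and the function name `matched_assignment` are hypothetical, and pairing adjacent ranks is just one possible matching rule.

```python
import random
from statistics import mean

def matched_assignment(scores, seed=None):
    """Pair participants with adjacent matching-factor scores, then
    randomly send one member of each pair to E and the other to C.
    With an odd number of participants, the highest scorer is left out."""
    rng = random.Random(seed)
    ranked = sorted(scores, key=scores.get)  # IDs ordered by matching factor
    groups = {"E": [], "C": []}
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)                    # random assignment within the pair
        groups["E"].append(pair[0])
        groups["C"].append(pair[1])
    return groups

# Hypothetical scores on the matching factor (e.g., a pretest).
pretest = {"P1": 12, "P2": 30, "P3": 14, "P4": 29, "P5": 21, "P6": 20}
groups = matched_assignment(pretest, seed=1)
for name, members in groups.items():
    print(name, members, mean(pretest[m] for m in members))
```

Whichever way each pair is split, the two group means on the matching factor stay close, which is the point of matching in a small sample.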
Posttest-only vs. Pretest-posttest
vs. Time series
• These characteristics have to do with the number of times the DV
is measured
– NOTE: This is not related to the length or number of implementations of
the IV event
• Posttest-only => the DV is measured only once, after the IV event
• Pretest-posttest => the DV is measured twice, before and after the
IV event
• Time series => the DV is measured multiple times before and after
the IV event
– Before the IV event provides baseline data—what is the change that is
expected between time points with no IV
– After the IV event provides the lasting effect of the IV—did the IV act
as an intervention causing lasting change or a curriculum causing
temporary change?
– Typically, time series have 3 or more time points of data collection
both before and after the IV event (6 or more in total)
• However, it should be noted that more than 1 measure of the DV before (and
after) the IV event is better than only 1 (sometimes called multiple-pretest
multiple-posttest rather than time series)
Posttest-only vs. Pretest-posttest
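[Figure: DV from Pre-Event to Post-Event, with the IV event in
between. Posttest-only panel: the pre-event level of the DV is
unknown (?). Pretest-Posttest panel: the DV is measured before and
after the IV event.]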
• Again, the primary consideration here is related to internal validity
• Which design option provides the researcher more confidence that the IV
caused the DV?
• With the posttest-only design option, the researcher only knows how the
sample measures on the DV after the IV event—there is no way to be 100%
confident about how they measured on the DV before the IV event
• Sometimes this is all one can do (e.g., exploring the effects of a tornado on families)
• The pretest-posttest design option provides the researcher with information
about the DV before and after the IV event
Pretest-posttest vs. Time series
[Figure: DV from Pre-Event to Post-Event, with the IV event in
between. Panels: Time Series and Pretest-Posttest.]
• With the time series design, the researcher is more confident that the “natural”
progression (increase) of the DV between time points is nearly null (0)
• Therefore, the increase in slope from before to after the IV event is more likely
caused by the IV event itself and not “natural growth”
• What is another possible reason for the growth after the IV event?
• The pretest-posttest design does not provide the researcher with information
about the “natural” change of the DV.
• It is possible that the “natural growth” of the DV is all the researcher sees; in
other words, the IV event has no effect. With a pretest-posttest design, it might
be erroneously concluded that the IV caused the DV when it did not.
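NOTE: A minimal sketch of how baseline time points separate “natural growth” from change after the IV event. It fits a straight line to the pre-event measures, projects it forward, and compares the projection with the observed post-event measures. The scores are hypothetical, and a linear baseline trend is an assumption of this sketch.

```python
def slope_intercept(xs, ys):
    """Ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical DV scores at 3 time points before and 3 after the IV event.
pre_times, pre_dv = [1, 2, 3], [10, 11, 12]
post_times, post_dv = [4, 5, 6], [16, 17, 18]

slope, intercept = slope_intercept(pre_times, pre_dv)
expected = [slope * t + intercept for t in post_times]   # "natural growth" only
beyond_trend = [obs - exp for obs, exp in zip(post_dv, expected)]

# A pretest-posttest design sees only the last pretest and first posttest.
naive_change = post_dv[0] - pre_dv[-1]

print("baseline slope per time point:", slope)
print("change beyond baseline trend:", beyond_trend)
print("change seen by pretest-posttest:", naive_change)
```

In these made-up data the pretest-posttest change is 4 points, but 1 point per interval was already occurring before the IV event, so only 3 points lie beyond the baseline trend.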
Temporary treatment vs.
Continuous treatment
• Temporary treatment => IV event occurs during an isolated time
point or time interval—there is no measure of the DV during this time
– It is possible for another unmeasured or unknown factor that occurs
concomitantly with the IV to cause the change in the DV (e.g., having
outside researchers in a childcare center causes teachers to “do more”
with children, while the researchers' program actually has no impact)
• Continuous treatment => IV event occurs on a regular basis and
measures of the DV occur within the time frame of IV event
implementation
– The continuous treatment option can limit the researcher's ability to
explore lasting effects of an intervention, or to determine whether a
program is an intervention (changes the trajectory after event) or a
curriculum (changes the trajectory during event)
• Intermittent treatment => IV event occurs during a time period,
then stops for an equivalent time period, then starts again, etc.
– Provides evidence that a “fluke” didn’t result in the change in DV if the
change occurs after every “IV event start” interval
Temporary vs. Intermittent Treatment
[Figure: DV over time under temporary treatment and under
intermittent treatment]
Summary
• “Best” design?
– Theoretically: a matched, randomized, multi-group time series
design with intermittent treatment!
– Practically: it depends on the question, the situation, the
environment, the participants, etc. The goal is to increase the
internal validity of the study as much as possible within the
confounds and constraints raised by the topic
• NOTE: in your book (like most texts), true experimental
designs often do not collect DV data at multiple time
points. This is mostly because it is deemed unnecessary
with a randomized sample—there is little threat to
internal validity when studies have high control of an
active IV and the participants are randomly sampled and
randomly assigned to groups
Choosing a Research Design
• Best addresses the problem
• Ethics
• Cost in time and money
• Validity (internal & external)
Specific Designs
Quasi-Experimental Designs
• By definition what do we have to rule out?
– One-group vs. Multi-group
– Equivalent vs. Nonequivalent
– Randomized vs. Nonrandomized
– Matched vs. Nonmatched
– Posttest-only vs. Pretest-posttest vs. Time
series
– Temporary treatment vs. Continuous
treatment
Quasi-experimental Designs
• One group posttest-only design
• One group pretest-posttest design
• Non-equivalent control group design
• Non-equivalent control group pretest-posttest design
• Time series
Experimental Design
• Advantages
– Best establishes cause-and-effect
relationships
• Disadvantages
– Artificiality of experiments
– Feasibility
– Sometimes unethical
Causality
• Temporal precedence
• Covariation between IV and DV
• Eliminate alternative explanations
Types of Experimental Designs
• Simple True Experimental
• Complex True Experimental
• Quasi-Experimental
Characteristics of True Designs
• Manipulation (treatment)
• Randomization
• Control group
Characteristics of simple true designs
• One IV with 2 levels (T, C)
• One DV
Types
• Randomized posttest control group design
R    T    Post
R    C    Post
• Randomized pretest-posttest control group design
R    Pre    T    Post
R    Pre    C    Post
Advantages & Disadvantages
• Advantages of pretest design
– Equivalency of groups
– Can measure extent of change
– Determine inclusion
– Assess reasons for and effects of mortality
• Disadvantages of pretest design
– Time-consuming
– Sensitization to pre-test
Solomon four-group design
R    Pre    T    Post
R    Pre    C    Post
R           T    Post
R           C    Post
• Can measure whether the PRE assessment has an effect on POST by
comparing the POST scores of the first two groups to the POST
scores of the last two groups.
• If they are respectively similar, then the PRE assessment did not
impact the POST scores.
• If they are respectively different, then the PRE assessment did
impact the POST scores.
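NOTE: The comparison described above can be sketched by subtracting mean POST scores of the unpretested groups from those of the pretested groups, separately for T and C. The scores and the helper name `pretest_effect` are hypothetical. Judging whether two means are "similar" needs a statistical test (commonly a 2 x 2 analysis of variance); this sketch shows only the raw differences.

```python
from statistics import mean

# Hypothetical POST scores for the four Solomon groups; not real data.
post = {
    ("Pre", "T"): [18, 20, 22],
    ("Pre", "C"): [12, 14, 13],
    ("NoPre", "T"): [17, 19, 21],
    ("NoPre", "C"): [10, 12, 11],
}

def pretest_effect(condition):
    """Mean POST of the pretested group minus the unpretested group."""
    return mean(post[("Pre", condition)]) - mean(post[("NoPre", condition)])

effect_T = pretest_effect("T")  # groups 1 vs. 3
effect_C = pretest_effect("C")  # groups 2 vs. 4
print("PRE effect within T:", effect_T)
print("PRE effect within C:", effect_C)
```

Differences near zero in both comparisons would suggest that taking the PRE assessment did not change POST scores.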
Complex True Experimental
• Randomized matched control group
design
• Increased levels of IV
• Factorial design
• Multiple DVs
Randomized matched control
group design
M    R    T    Post
M    R    C    Post
• Used in small samples
• Increased cost in time & money
Increased Levels of IV
• Provides more complete information about
the relationship between the IV & DV
• Detects curvilinear relationships
• Examines effects of multiple treatments
[Figures: Performance level (% complete; DV) plotted against
amount of reward promised ($0, $1, $2, $3; IV)]
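NOTE: A small sketch of why more than two levels of the IV are needed to detect a curvilinear relationship. The performance values are made up, and the reward levels are assumed to be equally spaced.

```python
# Hypothetical mean performance (% complete) at each reward level ($).
performance = {0: 40, 1: 60, 2: 70, 3: 65}

levels = sorted(performance)

# Change in the DV for each one-step increase in the IV.
steps = [performance[b] - performance[a] for a, b in zip(levels, levels[1:])]

# A two-level design ($0 vs. $3) would see only this single difference.
two_level_view = performance[levels[-1]] - performance[levels[0]]

# With equally spaced levels, a linear relationship has identical steps.
is_linear = len(set(steps)) == 1

print("step changes:", steps)
print("two-level difference:", two_level_view)
print("linear?", is_linear)
```

The two-level comparison shows only an overall increase, while the four-level data show gains that shrink and then reverse.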
Factorial Design
• >1 IV (factor)
• Simultaneously determine effects of 2 or
more factors on the DV (real world)
• Between Factor vs. Within Factor
• Described by # of factors and levels of
factors
– E.g., 2 x 3 Between Subjects Factorial Design
Do different exercise regimens (hi, med, lo
intensity) have the same effect on men as
they do on women?
• 3 X 2 (Exercise Regimen X Gender)
– 2 factors
– Exercise Regimen – 3 levels
– Gender – 2 levels
– Between factors
– DV?
– Active or attribute IVs?
“Cells” of a Factorial Design
Rows: Gender; columns: Exercise Intensity (6 cells)
            High    Medium    Low
Male
Female
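NOTE: The cells of a factorial design are every combination of factor levels. The sketch below enumerates them for the exercise example; the dictionary layout is just one convenient representation.

```python
from itertools import product

# Factors and levels from the exercise example.
factors = {
    "Exercise Regimen": ["High", "Medium", "Low"],
    "Gender": ["Male", "Female"],
}

# Design label: number of levels per factor, e.g. "3 x 2".
design_label = " x ".join(str(len(levels)) for levels in factors.values())

# One cell per combination of levels.
cells = list(product(*factors.values()))

print(design_label, "design with", len(cells), "cells")
for cell in cells:
    print(dict(zip(factors, cell)))
```

In a fully between-subjects design, each cell is a separate group of participants, so adding a factor or a level multiplies the number of groups needed.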
Do strength gains occur at the same rate in men
as they do in women over a 6 mo. training period?
Measurements are taken at 0, 2, 4, 6 mo.
• 2 X 4 (Gender X Time)
– ? factors
– Time – 4 levels
– Gender – 2 levels
– Between or within factors?
– DV?
– Active or attribute IVs?
“Cells” of a Factorial Design
Rows: Gender; columns: Time (8 cells)
            0 mo.    2 mo.    4 mo.    6 mo.
Male
Female