IMBOTQ Module 8 answers

Module 8: Experimental validity
“IT MIGHT BE ON THE QUIZ”
1. What are “threats to internal validity” and “threats to external validity”? How are they
different?
'Threats to internal validity' are things that might go wrong in trying to answer questions about
cause and effect. They are ways a study might fail to provide convincing or valid evidence
(handout packet).
'Threats to external validity' are threats to the generalizability of results, "that is, the extent to
which the results of a study could be expected to apply to other individuals or settings" (handout
packet).
They are different in that 'threats to internal validity' threaten to make the study itself,
and thus its results, invalid, whereas 'threats to external validity' are threats to how the
results are applied or interpreted to generalize to the population and/or other contexts. I see the
difference as an inward threat that could potentially destroy the study versus outward threats that
could potentially make a study useless for my purposes.
Random selection improves the external validity of an experiment and random assignment
improves the internal validity. This is why “randomization” is so important in research.
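The two kinds of randomization can be sketched in a few lines of Python. This is only an illustration: the population of 1000 students, the sample size of 40, and the function name `select_and_assign` are hypothetical and do not come from the handout.

```python
import random

def select_and_assign(population, sample_size, seed=0):
    """Randomly select a sample, then randomly assign it to two groups."""
    rng = random.Random(seed)
    # Random selection: every member of the population has an equal chance
    # of entering the study, which supports external validity.
    sample = rng.sample(population, sample_size)
    # Random assignment: shuffle the sample and split it in half, so group
    # membership is unrelated to subject characteristics, which supports
    # internal validity.
    shuffled = sample[:]
    rng.shuffle(shuffled)
    half = sample_size // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical population of 1000 students.
population = [f"student_{i}" for i in range(1000)]
treatment, control = select_and_assign(population, sample_size=40)
print(len(treatment), len(control))  # 20 20
```

A study can use one step without the other: a convenience sample that is randomly assigned still protects internal validity, but says less about the wider population.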
External validity is a way to determine to whom the results are applicable. This speaks to the
generalizability of the results. In other words, from what population were the subjects selected,
and how confident are we that the sample is representative of the population?
External validity can be viewed in terms of units (subjects), treatments, outcomes (measures), and settings.
Subjects: Does the effect hold with a different group?
Treatments: Does the effect hold when there are variations in treatment implementation?
Measures: Does the effect hold on different measures?
Settings: Does the effect found in one setting hold in other settings?
With internal validity, what you are really interested in is the quality of the research design. Did the
researcher plan ahead effectively? Did he or she take issues into consideration that will provide
the strongest evidence about cause and effect? Many issues that could be compromising to a
research design, and thus its results, can be addressed by careful planning, but that still doesn’t
guarantee that things won’t go wrong. How confident can we be that the independent variable,
rather than other extraneous or confounding variables, produced the observed effect?
“Threats to internal validity” refers to categories of ways that the internal validity of a study
might be called into question.
2. Gall, Gall, and Borg (2003) list the Hawthorne effect as a threat to external validity, whereas
McMillan (2003) uses it as an example of a threat to internal validity. How would you
resolve this difference? Under what conditions is the Hawthorne effect a matter of internal
validity? External validity?
McMillan treats it as an internal validity problem because the subjects respond in a
particular way when they know they are participating in a study, so one is unable to
conclude that the treatment caused the observed effect. This is an example of “subject effects” (the
subjects' awareness of their involvement in the study may affect their performance). Gall,
Gall, & Borg treat the Hawthorne effect as a threat to external validity because the
effect operates only inside the study. That is, the results cannot be generalized outside
of this group because new subjects will not be experiencing the “Hawthorne effect” (i.e.,
being in a study and receiving special treatment).
3. For items 4-11, choose from the following list the threat to internal validity that is likely to be
the greatest, or most obvious. Give a justification for your choice.
history
selection
maturation
pretesting
instrumentation
treatment replications
subject attrition
statistical regression
diffusion of treatment
experimenter effects
subject effects
4. Dr. Jones finds a large school district in which half of the elementary schools have just
adopted the Accelerated Reader program. At the end of the school year (ex post facto), he
gets standardized test scores for students from all the elementary schools in the district, and
finds that those with the Accelerated Reader program have significantly higher test scores.
He concludes that the Accelerated Reader produces gains in standardized test scores. Which
threat to internal validity most directly and obviously calls Dr. Jones’ conclusion into
question? Selection
5. Dr. Jones found two first grade teachers who were willing to take part in an experiment
comparing two different approaches to reading instruction. By conducting tests at the very
beginning of the year, he found that the two classes were almost perfectly matched in terms
of students’ verbal IQ and reading ability. By flipping a coin, he randomly assigned Miss
Murphy to the experimental condition, Jolly Phonics, and Mrs. Butler, her friend in the
classroom across the hall, to the control group, continuing her normal instruction, using the
basal reading series adopted by the district. What threat to internal validity can Dr. Jones
anticipate? Diffusion of treatment or history, but I think history is the greater
threat. On page 215, the textbook says "history is the category of threat in studies where
different teachers are responsible for implementing each intervention."
6. Dr. Jones wanted to examine the effectiveness of his new critical thinking skills program for
gifted students. He gave a test of critical thinking to all the 9th graders at a local middle
school, and selected the top ten percent of the students for a trial of his program. After 3
weeks of critical thinking lessons, he gave the critical thinking test to the participants, and
found that their posttest scores were significantly lower than their pretest scores. He
concluded that his critical thinking program had a negative effect on gifted students’ critical
thinking skills. What threat to internal validity calls his conclusion into question? Statistical
regression
7. Dr. Jones takes over the math period in two fifth grade classrooms. In one classroom he uses
his Jones’ Mighty Math textbook, and in the other he uses the math textbook adopted by the
district. The two classrooms were equivalent on a test of math knowledge at the beginning of
the year, but an end-of-year test shows that the students using Dr. Jones’ textbook learned
significantly more. Dr. Jones claims that this result demonstrates the superiority of his
textbook. A critic of Dr. Jones’ claim is most likely to mention which kind of threat to
internal validity? Experimenter effects
8. A longitudinal panel study of students at a certain private school found that in the year the
study began, 22% of the first grade students were diagnosed as having ADHD. Five years
later, those same students were contacted. Of those still at this private school, only 14%
were now categorized as having ADHD. The principal concluded that the strict discipline
program used at that school was improving students’ ability to concentrate, and therefore
reducing the number showing signs of ADHD. What threat to internal validity offers a
different explanation of the change observed? subject attrition
9. Dr. Jones gives a group of 100 students a spelling test containing words of Greek origin. He
then gives these students 15 minutes of daily instruction for a week on the history of Greece
and its contribution to Western civilization. At the end of the week he tests them on the same
words, and finds that their accuracy in spelling has improved. He attributes the gain in
spelling accuracy to his intervention. What threat to internal validity calls this interpretation
into question? Pretesting
10. Two math teachers in a large middle school each taught algebra to 3 classes of 7th graders,
and there were 25 students in each of these classes, so a total of 150 students were involved
in the study. Dr. Jones got each of the teachers to try one of two new instructional
approaches. The mean performance of students who had experienced instructional method
A was higher than the mean performance of students who had experienced instructional
method B. Conducting a t-test with an N of 150, Dr. Jones found the difference in means to
be statistically significant, and concluded that instructional method A was superior. What
threat to internal validity calls this interpretation into question? Treatment replications
11. Dr. Jones was investigating the effects of two different behavior management techniques.
Ten classrooms were randomly assigned to the Assertive Discipline condition, and 10 to the
Positive Humor condition. Dr. Jones, having received lots of funding, had 20 graduate
assistants, and assigned a different graduate assistant to observe in each classroom. By
chance, all the graduate students who were observing in the Assertive Discipline classrooms
were from the counseling program, and all the graduate students who were observing in
Positive Humor classrooms were from the educational leadership program. The frequency
of disruptive behaviors was found to be significantly higher in the classrooms using
Assertive Discipline. Dr. Jones wants to publish an article using this data to prove that
Positive Humor is superior to Assertive Discipline. What threat to internal validity calls this
claim into question? Instrumentation
12. What is the most serious threat to internal validity of an ex post facto design?
The most serious threat to internal validity of an ex post facto design is subject or group
selection. The researcher needs to be very careful that extraneous variables have not influenced
the study and that differences between the groups being compared have been controlled.
13. Which threats to internal validity are controlled for by a true experimental design? Which
are not?
This question confused me a bit, because I was not sure what was meant by a "true experiment," but after
reading that the randomized-groups designs are the true experiments it became a bit easier.
Controlled threats would be: selection, maturation, and statistical regression.
Not controlled, or threats that could have an effect on the experiment would be all the others:
history, pretesting, instrumentation, treatment replications, subject attrition, diffusion of
treatment, experimenter effects, and subject effects.
I used the chart on page 232 to help me with this.
Internal Validity threats:
History: uncontrolled events that might impact the dependent variable. Example: half the control
group gets the flu, or a big news story appears on the topic being studied in the experiment.
Selection: A difference between the treatment groups (other than the treatment). Example:
schools in the treatment condition have more experienced teachers, students in the control
condition have higher verbal IQs. Selection is almost always a serious potential problem for ex
post facto studies.
Maturation: any change over time (whether long term or short term) that might impact the
dependent variable. Example: subjects are older and wiser after a year of schooling, subjects are
tired and bored after 20 minutes of testing. Maturation is a threat to internal validity in single
group designs, or if it differentially affects the treatment groups.
Pretesting: taking the pretest can impact performance on the posttest. Example: taking the pretest
can alert subjects to the purpose of the experiment, taking the pretest can increase subjects’
awareness of, and hence subsequent learning about, the topic of the test. Pretesting is a threat to
internal validity only in single group designs, or if it differentially affects the treatment groups.
Instrumentation: unreliability or bias in the measures or procedures used to obtain data. Example:
using different observers for different treatment conditions, using a dependent measure that is not
sensitive to the treatment. Instrumentation is a threat to internal validity only in single group
designs or if it differentially affects the treatment groups.
Treatment replications: inferential statistical tests assume that each observation (the value of the
dependent variable for each subject) is independent. Example: a teacher teaching a class of 25
students with a new method should count as only one instance of the treatment, not 25.
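The cost of counting students rather than classes or teachers can be shown with simple arithmetic. The sketch below uses the numbers from the two-teacher algebra item (25 students per class, 3 classes per teacher, one teacher per method) and assumes an independent-samples t-test with equal group sizes, where degrees of freedom are n1 + n2 - 2. The helper name `two_group_df` is hypothetical.

```python
def two_group_df(units_per_group):
    """Degrees of freedom for a pooled two-group t-test with equal groups."""
    return 2 * units_per_group - 2

students_per_class = 25
classes_per_teacher = 3
teachers_per_method = 1

# Treating each student as an independent observation (what Dr. Jones did).
df_students = two_group_df(
    students_per_class * classes_per_teacher * teachers_per_method
)
# Treating each class as one replication of the treatment.
df_classes = two_group_df(classes_per_teacher * teachers_per_method)
# Treating each teacher as one replication of the treatment.
df_teachers = two_group_df(teachers_per_method)

print(df_students, df_classes, df_teachers)  # 148 4 0
```

With zero degrees of freedom at the teacher level, method and teacher are completely confounded: there is no way to separate the effect of method A from the effect of the one teacher who used it.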
Subject Attrition: subjects drop out of the study. Example: in a longitudinal study of high school
students, poor readers are more likely to drop out. Subject attrition is a threat to internal validity
only in single group designs, or if it differentially affects the treatment groups.
Statistical regression: due to measurement error, subjects with extreme high or low scores on the
pretest are likely to score closer to the mean on the posttest. Example: If you use a reading test to
identify the lowest 20% of readers in your class, these students will (on the average) do better on
the posttest, whether or not there has been any actual improvement in their reading ability.
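A small simulation can make the regression effect concrete. This is a sketch under stated assumptions, not data from any study: each simulated student has a fixed "true" ability (mean 100, SD 15), each test score adds independent measurement error (SD 10), and no instruction of any kind takes place between the two tests. The function name and all parameter values are hypothetical.

```python
import random
import statistics

def simulate_regression(n=1000, noise_sd=10.0, seed=1):
    """Return (pretest mean, posttest mean) for the lowest 20% on the pretest."""
    rng = random.Random(seed)
    # True ability never changes between the two test occasions.
    true_ability = [rng.gauss(100, 15) for _ in range(n)]
    # Each observed score is true ability plus independent measurement error.
    pretest = [t + rng.gauss(0, noise_sd) for t in true_ability]
    posttest = [t + rng.gauss(0, noise_sd) for t in true_ability]
    # Select the students who scored in the lowest 20% on the pretest.
    cutoff = sorted(pretest)[n // 5]
    low = [i for i in range(n) if pretest[i] < cutoff]
    pre_mean = statistics.mean(pretest[i] for i in low)
    post_mean = statistics.mean(posttest[i] for i in low)
    return pre_mean, post_mean

pre_mean, post_mean = simulate_regression()
print(f"lowest 20%: pretest mean {pre_mean:.1f}, posttest mean {post_mean:.1f}")
```

The selected group's posttest mean comes out higher than its pretest mean even though nothing was taught, because part of what put those students in the bottom 20% was bad luck on the pretest. Selecting the top scorers instead, as in the gifted-students item, would show the mirror-image drop.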
Diffusion of treatment: the control group may find out about, and imitate, the treatment received
by the experimental group. Example: the control group teachers find out what the experimental
group teachers are doing, and try it out.
Experimenter effects: The influence of the experimenter's behavior, personality traits, or
expectancies on the results of his or her own research. Expectations or attributes of the
experimenter differentially affect the treatment groups. Example: the experimenter implements
two treatments, and unconsciously favors the one he/she hopes will do better.
Subject effects: changes in behavior generated by the subject by virtue of being in a study. The
behavior of subjects is influenced by their knowledge that they are taking part in the study.
Examples: Hawthorne effects: the subjects perform better because they know they are in an
experiment. Compensatory rivalry: control group subjects try harder. Resentful demoralization:
control group subjects resent not getting treatment and give up.