Pelham Ch 8 answers

Ideal Answers to Chapter 8 (Repeated Measures) Questions
QUESTION 8.1. The means in this hypothetical study were in the predicted direction (negative trait M = 5.7,
positive trait M = 6.9). However, an independent samples t-test revealed that this difference was not
significant, t (18) = -1.59, p = .128. Thus, if the only data we had at our disposal were these data, we
would not be able to claim any support for the hypothesis that people consider previously unheard-of
traits more self-descriptive if they are positive.
QUESTION 8.2. The results for each hypothetical study are summarized below, including the results of
the same paired samples t-test for each study. All three studies yielded results in the direction predicted
by the main hypothesis. However, only Study 3 yielded significant results:
Study     negative trait   positive trait   difference   t       df   p
Study 1   5.9              6.9              +1.0         -1.50   19   .149
Study 2   5.9              6.5              +0.6         -1.71   19   .104
Study 3   5.9              6.4              +0.5         -2.24   19   .038
In most parametric statistical tests, power (the ability to detect an effect when one exists) depends
heavily on three things: (a) sample size, (b) between-condition variability, and (c) within-condition
variability. It is clear that neither of the first two factors can possibly explain the differences in
statistical significance between these three studies. First, the sample sizes are identical in each of the
three studies (n = 20). Second, the between-condition variability (i.e., the size of the mean differences in
the two experimental conditions) gets smaller, rather than larger, as the p values become smaller. At
first blush, it might also look like the amount of variability in scores is about the same in the three
studies. After all, the within-condition standard deviations hover around two points in all three studies,
and get slightly larger as we move from Study 1 to Study 3. However, within-subjects studies
focus on the consistency of the difference between two or more experimental conditions. The
variability that really counts is therefore the variability in the difference scores between the negative
and positive trait conditions, and it gets much smaller as we move from Study 1 to Study 3.
The reason why this happens is that as we move from Study 1 to Study 3, the correlation between
whether people endorsed the negative trait and whether they endorsed the positive trait becomes
much larger. When two sets of scores on the same kind of scale are highly correlated, this guarantees
that there will be very little variability in the difference scores that are generated from the two scores –
because high and low values for the two different scores will tend to go hand in hand. Incidentally, if we
conducted a one-sample t-test on the difference scores, we’d get the same result that we get for the
paired samples t-tests. Thus, what a paired samples t-test does is the logical and mathematical
equivalent of conducting a one-sample t-test on a single set of difference scores. This means, by the
way, that a paired samples t-test will only increase your power to detect an effect (relative to a
between-subjects test) when two conceptually related sets of scores are, in fact, positively correlated.
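The equivalence described above can be checked with a minimal Python sketch. The ratings below are made up for illustration (they are not the exercise data), and the function names are assumptions of this sketch rather than anything from the textbook. The sketch computes a paired samples t directly from the two columns, a one-sample t on the difference scores, and a pooled-variance independent samples t on the same numbers.

```python
import math
import statistics as st

# Hypothetical ratings (NOT the exercise data): each position is one
# participant who rated both the negative and the positive bogus trait.
negative = [4, 6, 5, 8, 3, 7, 6, 5]
positive = [5, 6, 6, 9, 4, 7, 7, 6]

def one_sample_t(scores, mu=0.0):
    """t for H0: mean(scores) == mu, with df = n - 1."""
    n = len(scores)
    standard_error = st.stdev(scores) / math.sqrt(n)
    return (st.mean(scores) - mu) / standard_error, n - 1

def sample_covariance(x, y):
    """Sample covariance with the usual n - 1 denominator."""
    mx, my = st.mean(x), st.mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def paired_t(x, y):
    """Paired samples t computed from the two columns, df = n - 1."""
    n = len(x)
    # Var(x - y) = Var(x) + Var(y) - 2 * Cov(x, y): a larger positive
    # covariance (correlation) means a smaller variance of the differences.
    var_diff = st.variance(x) + st.variance(y) - 2 * sample_covariance(x, y)
    return (st.mean(x) - st.mean(y)) / math.sqrt(var_diff / n), n - 1

def independent_t(x, y):
    """Pooled-variance independent samples t, df = n1 + n2 - 2."""
    n1, n2 = len(x), len(y)
    pooled = ((n1 - 1) * st.variance(x) + (n2 - 1) * st.variance(y)) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (st.mean(x) - st.mean(y)) / standard_error, n1 + n2 - 2

diffs = [a - b for a, b in zip(negative, positive)]
t_paired, df_paired = paired_t(negative, positive)
t_one, df_one = one_sample_t(diffs)
t_ind, df_ind = independent_t(negative, positive)

print(f"paired t({df_paired}) = {t_paired:.2f}")
print(f"one-sample t on differences, t({df_one}) = {t_one:.2f}")
print(f"independent t({df_ind}) = {t_ind:.2f}")
```

Because the two made-up columns are strongly positively correlated, the paired and one-sample results match and are much larger in absolute value than the independent samples t, even though the mean difference is the same in all three computations.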
QUESTION 8.3. The proper analysis for these data is a mixed model ANOVA in which trait favorability is
the within-subjects variable and self-esteem is the between-subjects variable. In addition to a significant
main effect of trait favorability, F (1, 18) = 7.63, p = .013, this analysis yielded a significant trait
favorability x self-esteem interaction, F (1, 18) = 10.98, p = .004. Simple effects tests reveal that among
participants low in self-esteem, there was no effect of the within-subjects trait favorability
manipulation, t (9) = 0.32, p = .758. In fact, low self-esteem participants were ever so slightly more likely
to endorse the bogus trait when it was negative (negative trait M = 5.7, positive trait M = 5.6). In contrast,
participants high in self-esteem were clearly more likely to endorse the positive as opposed to the
negative bogus trait (negative trait M = 6.1, positive trait M = 7.2), t (9) = -6.13, p < .001. Although the overall main
effect of trait favorability is consistent with self-enhancement theories, the interaction effect is more
consistent with self-consistency theories such as self-verification theory.
QUESTION 8.4. The self-enhancement index was simply a difference score that consisted of people’s
ratings for the positive bogus trait minus their ratings for the negative bogus trait. This difference score
was significantly correlated with self-esteem, r (18) = .62, p = .004. High self-esteem people generally
gave more self-enhancing responses. As it turns out, the p value associated with the correlation
involving this difference score is exactly the same as the p value observed for the self-esteem x trait
favorability interaction term in the mixed model ANOVA. This is good because the goal of the ANOVA
was to see if the within-subjects self-enhancement effect was any stronger (or weaker) than usual
among people high in self-esteem. Although these two methods yielded identical p values, the
advantage of the mixed model ANOVA, in this particular case, is that it revealed more clearly exactly
what was going on in the study. In the case of the simple correlation, it would be impossible to know
whether either self-esteem group was self-enhancing in an absolute sense. In principle, for example,
the high self-esteem group could have been self-denigrating (by indicating that the negative bogus trait
described them better than the positive bogus trait) while the low self-esteem group was simply much
more self-denigrating. In contrast, simple effects tests in the ANOVA told us exactly what was happening
in each self-esteem group.
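The matching p values follow from a general identity, sketched here in Python under one assumption: self-esteem is coded as the same two groups (low vs. high) in both analyses. In that case the correlation between group and the difference score, tested with df error degrees of freedom, converts to an F (1, df) statistic for the interaction through t = r * sqrt(df) / sqrt(1 - r^2) and F = t^2. The function names are hypothetical.

```python
import math

def r_to_f(r, df_error):
    """Convert a correlation tested with df_error degrees of freedom
    into the equivalent F(1, df_error) statistic (F = t squared)."""
    t = r * math.sqrt(df_error) / math.sqrt(1 - r ** 2)
    return t ** 2

def f_to_r(f, df_error):
    """Inverse conversion: absolute value of r implied by F(1, df_error)."""
    return math.sqrt(f / (f + df_error))

# Values reported in the answers above: r (18) = .62 and F (1, 18) = 10.98.
f_from_r = r_to_f(0.62, 18)
r_from_f = f_to_r(10.98, 18)
print(round(f_from_r, 2), round(r_from_f, 2))
```

Starting from the rounded r = .62 gives an F of about 11.24 rather than 10.98. The gap reflects rounding of r to two decimals: working backward from F = 10.98 gives an r that rounds to .62.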
QUESTION 8.5. On average (i.e., in the typical country), people thought their lives in five years (M = 6.8)
would be better than their current lives (M = 5.4), F (1, 153) = 470.8, p < .001. In fact, there was no
country in the entire set of 154 countries in which perceptions of the future were meaningfully more
negative than perceptions of the present. There were two countries (Japan and El Salvador) in which
future ratings were slightly less positive than present ratings, but the difference in each country was
less than 0.10 scale points. On the whole, this constitutes highly robust support for the optimistic bias.
On average, human beings seem to believe that their future lives will be better than their present lives.
QUESTION 8.6. A mixed model ANOVA revealed the already documented repeated measures main
effect of time period, F (1,149) = 851.68, p < .001, as well as a significant between-subjects main effect
of region, F (4,149) = 14.76, p < .001, indicating merely that well-being is higher in some world regions
than in others. However, the same analysis also revealed a significant Region x Time Period interaction,
F (4,149) = 44.39, p < .001, suggesting that the magnitude of the optimistic bias does vary across world
regions. A quick glance at the means, however, revealed that the difference between present and
perceived future well-being did not fit the expected pattern. In fact, the gap between perceived present
and future well-being was larger in Asia, Africa, and Latin America than in Europe and the group of four
Western (English speaking) countries outside of Europe.
To simplify this analysis, I compared only the “European plus” region and Latin America, that is, the
world’s most and least individualistic regions, respectively. First, a mixed model ANOVA revealed that
there was an interaction between region and time period, F (1,60) = 25.57, p < .001. Separate repeated
measures analyses in each of these two regions revealed a robust effect of time period in both regions,
both ps < .001. However, the effect of time period was actually larger in Latin America (respective
present and future Ms = 5.9 and 7.1, partial η² = .86) than in the Europe-plus region (respective
present and future Ms = 6.3 and 7.0, partial η² = .67). To put this differently, although Westerners
and non-Westerners view their present lives a bit differently, they have much more similar views of their
future lives. From a different perspective, whereas the optimistic gap between present and predicted
future well-being among Westerners is 0.7 points, the same gap among Latin Americans is 1.2 points.
Thus, these results are highly inconsistent with the original predictions.
[Note to instructors: There are obviously many other ways to chop up these regions and many
possible post hoc tests one could do, but because these results are clearly in the opposite direction of
predictions, there is no way to divide the regions that would yield any kind of support for the original
predictions.]
QUESTION 8.7. I analyzed these AMP data using a mixed model ANOVA in which candidate preference
was the between-subjects independent variable and the nature of the primes (Bush faces or Kerry faces)
that preceded the rated Chinese characters was the within-subjects variable. This analysis yielded no
main effect of candidate preference, F (1, 37) = 0.02, p = .900, and no main effect of prime type, F (1, 37)
= 0.33, p = .570. However, consistent with the model of Payne et al., the analysis did reveal a significant
Preference x Prime Type interaction, F (1, 37) = 21.57, p < .001.
To follow up on the significant interaction, I conducted paired samples t-tests separately for those who
said they would vote for Bush and those who said they would vote for Kerry (by using the Split file
command). Participants who said they would vote for Bush judged the ambiguous Chinese characters
more favorably when they were primed with Bush’s photos (M = .72) than when primed with Kerry’s
photos (M = .48), t (19) = 4.47, p < .001. In contrast, those who said they would vote for Kerry judged
the ambiguous Chinese characters more favorably when primed with Kerry’s photos (M = .70) than
when primed with Bush’s photos (M = .51), t (19) = -2.48, p = .023.
These findings provide preliminary evidence for the validity of the AMP. Evidence that these effects are
implicit is based on the fact that the authors of the real study explicitly warned participants to try to
avoid any biasing effect of the priming stimuli when judging the neutral target stimuli. Despite these
admonitions, this study showed a robust, presumably unconscious, misattribution effect.
QUESTION 8.8. Although within-subjects designs have many advantages, many within-subjects designs
(especially lab experiments) are subject to worrisome effects such as carryover effects, fatigue, or
interference effects. Such effects mean that people may sometimes respond differently to stimuli to
which they are exposed later in a study than to stimuli to which they are exposed earlier in a study – for
reasons that have nothing to do with the variable that is being manipulated. By counterbalancing the
order in which different participants experience different within-subjects conditions, researchers can
often minimize or balance out such sources of bias.
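As a rough illustration of counterbalancing, the Python sketch below assigns participants to every possible order of the within-subjects conditions in equal numbers. The function name, the participant IDs, and the fixed random seed are assumptions made for this example, not part of the textbook exercise.

```python
import random
from collections import Counter
from itertools import permutations

def counterbalanced_orders(participant_ids, conditions, seed=0):
    """Assign each participant one order of the conditions so that every
    possible order is used (as nearly as possible) equally often."""
    orders = list(permutations(conditions))
    shuffled = list(participant_ids)
    # Shuffle so that order assignment is unrelated to sign-up sequence.
    random.Random(seed).shuffle(shuffled)
    # Cycle through the orders: with n divisible by len(orders), each
    # order is assigned to exactly n / len(orders) participants.
    return {pid: orders[i % len(orders)] for i, pid in enumerate(shuffled)}

# Hypothetical study: 20 participants, two within-subjects conditions.
assignment = counterbalanced_orders(range(1, 21), ["negative trait", "positive trait"])
order_counts = Counter(assignment.values())
for order, count in order_counts.items():
    print(order, count)
```

Full counterbalancing of this kind is practical only with a few conditions, because the number of possible orders grows factorially. With many conditions, researchers often use a subset of orders such as a Latin square.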
However, there are sometimes limits to counterbalancing. Some physical manipulations, for example,
simply cannot be reversed. Thus, it is not possible to counterbalance brain lesions in rats. A more subtle
example has to do with mood. It is possible, in principle, to change a person’s mood in a short period
(and to do so in counterbalanced order). However, doing so and then making a second set of similar
measurements might alert some participants to the purpose of a study and open it up to demand
characteristics (i.e., people might do what they think the experimenter wants them to do) or to
reactance (i.e., people might do the opposite of what they think the experimenter expects them to do).
Of course, it is also the case that some variables (e.g., tornado strikes) cannot be ethically manipulated
at all (at either a between-subjects or a within-subjects level). Such variables can only be measured, and
even if researchers make pre- and post- measurements (e.g., pre-tornado and post-tornado PTSD
measurements) they obviously cannot counterbalance the order of the two tornado conditions.