Pelham Ch 5 Answers

Ideal Answers to Chapter 5 Questions
QUESTION 5.1. I decided to name this factor “traditionalism” because all of the value items that load
heavily on this factor seem to involve traditional values, such as politeness, religiosity and respect for
American power.
QUESTION 5.2. I ran a principal components factor analysis. I chose the six survey items that loaded
most highly on the factor with the highest eigenvalue. In so doing, I made sure that none of the six
items loaded very highly on the other two factors that had eigenvalues greater than 1.0.
The reliability of this measure of traditional values was high, especially for a measure with only six items
(α = .83). This means that people who agreed with any one item on the scale tended, on average, to
agree with the other five items. Further, deleting any one of these six items would have reduced the
overall reliability of the scale.
QUESTION 5.3. People who scored high in traditionalism tended to be older (r = .23), more politically
conservative (r = .27), and more opposed to abortion (r = .29), all ps < .01. These associations are pretty
intuitive, and they support my interpretation of “trism” as “traditionalism.” Older people tend to be
more traditional, either because people tend to become more traditional as they age (e.g., because they
become better served by the status quo), or because earlier U.S. cohorts were simply more traditional. It
is also highly intuitive that conservative political attitudes go hand in hand with traditional values.
Finally, more traditional people tend to be more opposed to abortion, perhaps because, historically
speaking, abortion is a recent newcomer to the human scene and perhaps because religious
conservatives tend to be strongly opposed to abortion.
QUESTION 5.4. Because there were only 2 items in this scale, the item-total correlation for the (recoded)
shyness item and the outgoing item had to be identical. That is, shyness correlates with outgoingness in
exactly the same way that outgoingness correlates with shyness.
QUESTION 5.5a. 2b. The Spearman-Brown formula says r = .650 becomes a reliability estimate of .788 for
this extraversion scale. Two items with individual estimated reliabilities of .65 combine to form a scale
whose overall estimated reliability is .788.
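The Spearman-Brown conversion described here can be sketched in a few lines of Python. This is a minimal illustration rather than the output of any statistics package; the function name `spearman_brown` and the lengthening factor `k` are names chosen for this example.

```python
def spearman_brown(r, k=2):
    """Predicted reliability of a scale k times as long as a component
    whose reliability (or inter-item correlation) is r."""
    return (k * r) / (1 + (k - 1) * r)

# Two items that correlate .65 with each other (the value reported above).
print(round(spearman_brown(0.65), 3))  # 0.788
```

The default `k=2` matches the two-item extraversion scale; a larger `k` would estimate the reliability of a longer scale built from comparable items.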
QUESTION 5.5b. Cronbach’s alpha for this scale is also exactly .788. That makes sense because alpha is
based on the average correlation of every item in a scale with every other item in the scale. In the
simplest possible case of a 2-item scale, the average of a single correlation is that correlation.
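One way to see why the two estimates agree is to write out the standardized form of alpha. The symbols k (number of items) and r-bar (average inter-item correlation) are notation introduced for this sketch. For a 2-item scale the match is exact for standardized alpha; alpha computed from raw covariances coincides with it only when the two items have equal variances.

```latex
\alpha_{\text{std}} = \frac{k\,\bar{r}}{1 + (k-1)\,\bar{r}}
\qquad\Longrightarrow\qquad
\alpha_{\text{std}}\Big|_{k=2,\ \bar{r}=.65} = \frac{2(.65)}{1 + .65} \approx .788
```

With k = 2 the expression is identical to the Spearman-Brown formula for doubling the length of a test.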
QUESTION 5.6. The average correlation between the 10 individual items in the self-esteem scale and the
outspoken variable was .26, with a range of r = -.07 to r = .43. It certainly makes sense that, on average,
these correlations were positive. People high in self-esteem should be more confident in social
situations and thus more willing to speak up.
QUESTION 5.7. Yes, the scale was internally consistent, α = .80 (standardized item α = .83). This is a high
reliability score for a 10-item scale. I didn’t delete any items from the scale because they all had good
item-total correlations. Most important, as shown by the statistics under “alpha if item deleted,” each
individual item contributed positively to the scale’s reliability. Deleting any one item would have
reduced the reliability of the scale to at least a small degree.
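The “alpha if item deleted” check can be sketched as follows. This is an illustration of the general idea, assuming the usual variance-based formula for Cronbach's alpha. The function names and the tiny data set are hypothetical and are not the self-esteem data analyzed here.

```python
from statistics import variance

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def alpha_if_item_deleted(rows):
    """Alpha recomputed k times, each time leaving out one item."""
    k = len(rows[0])
    return [
        cronbach_alpha([row[:i] + row[i + 1:] for row in rows])
        for i in range(k)
    ]

# Hypothetical responses (4 people x 3 items), invented for illustration only.
data = [
    [1, 2, 1],
    [2, 3, 3],
    [4, 4, 5],
    [5, 5, 4],
]
full = cronbach_alpha(data)
for i, a in enumerate(alpha_if_item_deleted(data)):
    verdict = "lowers" if a < full else "does not lower"
    print(f"dropping item {i + 1} {verdict} alpha ({a:.2f} vs {full:.2f})")
```

An item is worth keeping, by this criterion, when dropping it lowers alpha relative to the full scale.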
QUESTION 5.8a. The correlation between the composite self-esteem scale and the outspoken variable
was r = .46, p < .05. This was much higher than the average correlation for the 10 individual items (r =
.26). Presumably, this happened because when you average multiple items together, the noise or error
associated with any one specific item is reduced or negated by the unique source of error associated
with a different specific item. Thus, noise or error tends to average out in the composite score, making
that composite score superior. In this case, reliability seems to contribute to validity in the sense that the
validity correlation between self-esteem and outspokenness increased from an average of r = .26 to r =
.46 for the total self-esteem score.
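The size of this gain can be approximated from numbers already reported. The sketch below assumes standardized items, treats every item as having the same item-criterion and inter-item correlations (the averages), and backs the average inter-item correlation out of the standardized alpha of .83 from Question 5.7. The function names are hypothetical, and the result is a rough check, not a reanalysis of the data.

```python
import math

def mean_r_from_alpha(alpha, k):
    """Average inter-item correlation implied by a standardized alpha."""
    return alpha / (k - (k - 1) * alpha)

def composite_validity(mean_r_xy, mean_r_xx, k):
    """Correlation between the sum of k standardized items and a criterion,
    given the average item-criterion and inter-item correlations."""
    return mean_r_xy * math.sqrt(k) / math.sqrt(1 + (k - 1) * mean_r_xx)

k = 10
r_xx = mean_r_from_alpha(0.83, k)              # standardized alpha, Question 5.7
predicted = composite_validity(0.26, r_xx, k)  # .26 = average item validity
print(round(r_xx, 2), round(predicted, 2))
```

With these rounded inputs the predicted composite correlation comes out in the low .40s, close to but not identical with the observed r = .46. A gap of that size is unsurprising given the simplifying assumptions.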
QUESTION 5.8b. This self-esteem scale is temporally stable, that is, it has a high test-retest correlation, r
= .76 (which converts to a Spearman-Brown reliability estimate of .86). People who say that they are
high (or low) in self-esteem at time 1 tend to say the same thing at time 2. If self-esteem is an individual
difference variable, we would hope this would be true.
QUESTION 5.8c. When I averaged together the self-esteem scores from time 1 and time 2, the
correlation between self-esteem and outspokenness went up even further (r = .50). This is another
example of how multiple measurement increases reliability. For example, if someone were having a bad
week when they filled out the time 1 self-esteem scale then this wouldn’t likely be true again at time 2,
and the average score would better reflect the person’s true, chronic level of self-esteem. All else being
equal, reliability does seem to contribute to validity.
QUESTION 5.9. I ran a reliability analysis and found that the items “gregarious,” “pleasant,” and
“convivial” each reduced the overall reliability of the scale. Thus, I deleted them. I’m guessing that
many people in this pilot study simply did not know the meaning of obscure terms such as gregarious
and convivial. Presumably, they all knew what pleasant meant, but being pleasant would seem to have
little to do with extraversion. It might load better on “agreeableness,” which is a different Big 5
personality trait. Alternatively, this item might have little validity on any scale because it is so vague and
socially desirable that everyone feels it describes them. In other words, “pleasant” may have also been
a bad item because of a ceiling effect. The mean score of 8.7 (on a 9-point scale) for the “pleasant” item
is highly consistent with this idea! After deleting the three bad items, the resulting 4-item scale was
highly reliable for such a brief measure; α = .91. To reduce any concerns about a “yay-saying” bias
inflating the reliability of this scale, I could simply add a few items that indicate a lack of extraversion
(e.g., shy, quiet), and then recode these items. People who tend to say yes to everything would
presumably tend to say yes to these items. If we had as many negative as positive indicators of
extraversion, we could thus balance out this yay-saying bias.
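Recoding and its effect on a yay-sayer can be sketched as follows. The 9-point scale comes from the text, but the endpoints of 1 and 9, the function name `reverse_code`, and the ratings are assumptions made for illustration.

```python
from statistics import mean

def reverse_code(score, low=1, high=9):
    """Flip a rating so that strong agreement with a negatively worded item
    (e.g., shy) counts as low extraversion. Endpoints 1 and 9 are assumed."""
    if not low <= score <= high:
        raise ValueError("score outside the rating scale")
    return low + high - score

# Hypothetical ratings on a "shy" item, invented for illustration.
shy = [9, 7, 2]
print([reverse_code(s) for s in shy])  # [1, 3, 8]

# A yay-sayer who answers 9 to two positive and two negative items
# ends up at the scale midpoint once the negative items are recoded.
yay_sayer = [9, 9, reverse_code(9), reverse_code(9)]
print(mean(yay_sayer))  # 5
```

Because the recoded items pull a uniform "yes" pattern toward the midpoint, agreement that reflects response style rather than extraversion no longer inflates the total score.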
QUESTION 5.10. Jessica’s ratings are quite a bit higher than those of the other five raters. Specifically,
she seems to be more generous than anyone else – in the sense that she is giving higher average delay
of gratification ratings. Without looking more closely, though, it’s hard to say whether she is doing a
poor or a stellar job.
QUESTION 5.11a. Jessica is arguably the most reliable rater in the group. Her corrected “item-total
correlation” (actually a rater-total correlation) was the highest of any rater. That is, her individual
ratings for each child agreed very well with the average group rating received by each child (they agreed
better, in fact, than those of any other rater). Deleting her ratings would reduce the reliability of the
total EQM score. The alpha for the six raters was .87, but if we were to delete Jessica, the alpha would
drop to .81. In contrast, Jolie seems to have been doing a very poor job. Her corrected item-total
correlation was only .15, and deleting her ratings from the group would substantially improve the
reliability of the ratings (the alpha would increase to .91!).
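A corrected rater-total correlation can be sketched as the correlation between one rater's scores and the sum of everyone else's. The function names and the small ratings table below are hypothetical and are not the quiet mouse data.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def corrected_rater_total(ratings):
    """ratings: one row per child, one column per rater.
    Returns each rater's correlation with the sum of the OTHER raters."""
    n_raters = len(ratings[0])
    result = []
    for j in range(n_raters):
        own = [row[j] for row in ratings]
        rest = [sum(row) - row[j] for row in ratings]
        result.append(pearson(own, rest))
    return result

# Hypothetical ratings (5 children x 3 raters), invented for illustration.
# Rater 3 is deliberately out of step with the other two.
ratings = [
    [2, 3, 5],
    [4, 5, 2],
    [5, 6, 4],
    [7, 8, 3],
    [8, 9, 6],
]
print([round(r, 2) for r in corrected_rater_total(ratings)])
```

The rater's own scores are left out of the total so that the correlation is not inflated by a rater agreeing with herself.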
QUESTION 5.11b. Although Jessica’s ratings correlated very well with those of the other raters, there
was a problem with her ratings. Specifically, she seems to be overly generous in her ratings across the
board. Her rater mean was clearly the highest in the group. She should undergo a short re-training
session, the goal of which would be to cut about 1 point off her average rating -- without influencing the
rank order of her ratings. Because I am guessing that the absolute score (not merely the relative score)
for the quiet mouse game has a lot of meaning, I would be concerned about the elevation in Jessica’s
ratings. Thus, although elevation won’t change the correlation between the QM ratings and some
other outcome we might care about, elevation could be a big problem if we want to say something
about the mean level of these kids’ quiet mouse scores and compare them with kids in other samples.
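The point that elevation shifts the mean without touching correlations can be checked with a short sketch. All numbers below are hypothetical, and `retrained` simply stands for the same ratings lowered by 1 point.

```python
import math
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical ratings and outcome scores, invented for illustration.
jessica = [6, 7, 5, 8, 9]
outcome = [3, 4, 2, 6, 5]
retrained = [score - 1 for score in jessica]  # same rank order, 1 point lower

# The mean drops by exactly 1 point...
print(mean(jessica) - mean(retrained))
# ...but the correlation with the outcome is unchanged.
print(math.isclose(pearson(jessica, outcome), pearson(retrained, outcome)))
```

Subtracting a constant leaves every deviation from the mean intact, which is why the correlation stays put while comparisons of mean levels across samples would be affected.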
QUESTION 5.11c. At first glance, it looks like GS won. However, if you delete Jolie’s ratings (as you
should because she is highly unreliable) the winner would be IB, with a mean score of 5.38.
QUESTION 5.12. Although a few of these items had very low item-total correlations, Cronbach’s alpha
for this 27-item scale was quite acceptable, α = .82. Based on this reliability score, few people would
presumably be inclined to question the internal consistency of the scale.
QUESTION 5.13. Despite the high Cronbach’s alpha level of this scale, a principal components analysis of
these 27 items clearly suggested that the data are more consistent with a 2-factor than with a 1-factor
solution. Specifically, there were two factors with eigenvalues much higher than 1.0. Component 1 had
an eigenvalue of 8.81 and accounted for 32.62% of the variance in the items. Component 2 had an
eigenvalue of 4.96, and accounted for an additional 18.36% of the variance. Collectively, then, these two
factors accounted for slightly more than half of the variance in the 27 items. The item loadings for the
two factors make it extremely clear that the first 15 items load more heavily on the first factor than on
any other. These 15 items seem to reflect a hedonistic form of life satisfaction that has to do with
enjoying oneself and getting pleasure from life. The last 12 items actually seem to load negatively on
this factor. Unlike the first set of items, these items seem to have to do with feeling that one’s life is
meaningful or serves an important purpose. I thus labeled the two factors, pleasure and meaning,
respectively.
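The link between an eigenvalue and its percentage of variance can be sketched as below. This assumes the principal components analysis was run on standardized items (a correlation matrix), so that total variance equals the number of items. The function name is chosen for this example.

```python
def percent_variance(eigenvalue, n_items):
    """With standardized items, total variance equals the number of items,
    so each component's share is its eigenvalue divided by that number."""
    return 100 * eigenvalue / n_items

n_items = 27
shares = [percent_variance(ev, n_items) for ev in (8.81, 4.96)]
print([round(s, 2) for s in shares], round(sum(shares), 2))
```

The results differ from the reported 32.62% and 18.36% only in the second decimal place, which is what one would expect when starting from eigenvalues that have already been rounded.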
QUESTION 5.14. The first 15 items map very well onto what I called pleasure, and the last 12 items map
very well onto what I called meaning. These two factors actually correlate negatively with one another,
r(58) = -.27, p < .05. This is certainly not what you’d expect of two subsets of the 27 items that seem to
make up an internally consistent scale. Further, this suggests very strongly that a solid Cronbach’s alpha
score for a scale does not guarantee that the scale is unidimensional. Finally, these results also suggest
quite clearly that there are at least two very different kinds of life satisfaction. In fact, they are so
different that people who tend to score high on one kind of life satisfaction tend to score low on the
other kind.
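The significance of r(58) = -.27 can be checked by hand with the usual t test for a correlation. In this sketch the 58 degrees of freedom imply a sample of 60, and the critical value of 2.00 is an approximation for a two-tailed test at the .05 level. The function name is chosen for this example.

```python
import math

def t_from_r(r, df):
    """t statistic for testing a Pearson r against zero, with df = n - 2."""
    return r * math.sqrt(df) / math.sqrt(1 - r ** 2)

t = t_from_r(-0.27, 58)
# 2.00 is an approximate two-tailed .05 critical value for df = 58;
# an exact value would come from a t table or a statistics package.
print(round(t, 2), abs(t) > 2.00)  # -2.14 True
```

The statistic clears the approximate cutoff by a modest margin, consistent with the reported p < .05.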