“I'll remember this!” Effects of emotionality on memory predictions versus memory performance

Journal of Memory and Language 62 (2010) 240–253
Carissa A. Zimmerman *, Colleen M. Kelley
Department of Psychology, Florida State University, USA
Article info
Article history:
Received 19 December 2008
revision received 6 November 2009
Available online 19 January 2010
Keywords:
Memory
Metacognition
Emotion
Free recall
Cued recall
Abstract
Emotionality is a key component of subjective experience that influences memory. We
tested how the emotionality of words affects memory monitoring, specifically, judgments
of learning, in both cued recall and free recall paradigms. In both tasks, people predicted
that positive and negative emotional words would be recalled better than neutral words.
That prediction was valid for free recall of positive, negative, and neutral words, but invalid
for cued recall of negative word pairs compared to neutral and positive pairs; only positive
emotional pairs showed enhanced recall relative to neutral pairs. Consequently, people
exhibited extreme overconfidence for cued recall of negative word pairs on the first
study-test trial. We demonstrate that emotionality does not globally enhance memory,
but rather has specific effects depending on the valence and task. Results are discussed
in terms of this complex relationship between emotionality and memory performance
and the subsequent variations in diagnosticity of emotionality as a cue for memory
monitoring.
© 2009 Elsevier Inc. All rights reserved.
Introduction
The emotional quality of events plays an important role
in online monitoring, drawing attention to both aversive
and appetitive signals (Calvo & Lang, 2004; Öhman, Flykt,
& Esteves, 2001; Stormark, Nordby, & Hugdahl, 1995). Given that emotion plays a role in how we monitor our external environment, it may also affect how we monitor the
internal environment of our thoughts and memories. However, little research has examined the role that emotionality plays in memory monitoring, and most has focused on
monitoring at retrieval via confidence judgments (Talarico
& Rubin, 2003) and remember/know judgments (Kensinger
& Corkin, 2003; Ochsner, 2000; Sharot, Delgado, & Phelps,
2004). Surprisingly, the emotionality of events tends to
be associated with a stronger feeling of remembering and
enhanced confidence at retrieval even when that confidence is not warranted. For example, people are highly
confident in their memory for the details accompanying
emotional flashbulb memories despite the fact that these
details are no more likely to be accurately remembered
than details of less emotional events (Talarico & Rubin,
2003).
* Corresponding author. Address: Department of Psychology, Florida
State University, Tallahassee, FL 32306, USA.
E-mail address: zimmerman@psy.fsu.edu (C.A. Zimmerman).
0749-596X/$ - see front matter © 2009 Elsevier Inc. All rights reserved.
doi:10.1016/j.jml.2009.11.004
In this paper, we ask how the emotional quality of
words affects memory monitoring during encoding, in particular, the subjective feeling that one will remember in
the future. This subjective memorability is assessed via
judgments of learning (JOLs). Knowing the degree to which
an experience is memorable is particularly important for
memory control processes, such as investing time and effort in encoding when one wants to guarantee that one will
remember. For example, if it is important to remember a
person’s name or a conversation with one’s boss, memory
monitoring allows one to assess whether that name or conversation will be remembered, or if extra steps, such as
elaborative encoding or the creation of a physical record,
are needed in order to guarantee later availability of the
information. There is no research, however, investigating
such memory monitoring of emotional material.
Following Koriat’s (1997) cue-utilization framework,
we propose that people’s ability to monitor their learning
of emotional versus neutral material might depend upon
two general classes of information. The first is a theory-based analytic inference. Through experience, people may
have acquired the knowledge that emotional events tend
to be memorable, in the same way that they tend to believe
that cued recall for a related word pair (tea–coffee) will be
better than cued recall for an unrelated word pair (tea–
skater). If this is the case, during study emotionality operates as a theory-based intrinsic cue to memorability, a cue
that is inherent to the word, picture, or other material to be
remembered. Indeed, a recent survey of adults found that
people believe that “dramatic” events are more memorable
than daily events (Magnusson et al., 2006), a finding consistent with the ideas that emotion is a salient component
of lay theories of memory and that, based on such a theory,
people will generally predict better memory for emotional
compared to neutral material.
People also use experience-based or nonanalytic heuristics to monitor memorability (Kelley & Jacoby, 1996; Koriat, 1997); these are subjective, internal cues such as ease of
processing (Begg, Duft, Lalonde, Melnick, & Sanvito, 1989;
Benjamin, Bjork, & Schwartz, 1998; Rhodes & Castel,
2008). The attention-grabbing nature of emotional material or the emotional reaction itself may be taken as an
indication that the item is memorable. In this case, emotion would be acting as an experience-based intrinsic
cue. For example, Monin (2003) found that liking is misinterpreted as familiarity on a memory test (see also Claypool, Hall, Mackie, & Garcia-Marques, 2007; Garcia-Marques, Mackie, Claypool, & Garcia-Marques, 2004); similarly, emotional reactions during encoding could be subjectively experienced as indicators of memorability.
The validity of the cues that are the basis for memory
monitoring determines the accuracy of the monitoring
process. The accuracy of JOLs is measured in two different
ways. Calibration is a measure of absolute accuracy, or how
well on average participants’ JOLs match their actual level
of recall. Resolution is a measure of relative accuracy, or
the extent to which JOLs distinguish between later recalled
and non-recalled items. With regard to emotional materials, free recall is better for both positive and negative stimuli, including words, pictures, film clips, and narrated slide
shows, than for their neutral counterparts (Bradley, Greenwald, Petry, & Lang, 1992; Cahill et al., 1996; Doerksen &
Shimamura, 2001; Harris & Pashler, 2005; Hertel & Parks,
2002; Kensinger & Corkin, 2003; Nagae & Moscovitch,
2002; Ochsner, 2000; Rubin & Friendly, 1986). Enhanced
free recall has been attributed to greater distinctiveness
(Schmidt, 1991; Schmidt & Saari, 2007; Talmi, Luk, McGarry, & Moscovitch, 2007), increased attention during encoding (Anderson & Phelps, 2001; Calvo & Lang, 2004; Öhman
et al., 2001), and, at longer-term retention intervals, to
greater consolidation of memories for emotionally arousing events (LaBar & Cabeza, 2006; LaBar & Phelps, 1998;
McGaugh, 2004; Phelps, 2004). Thus, emotionality ought
to be a valid cue for judging the memorability of individual
words in a free recall paradigm.
In contrast, it is difficult to predict the validity of emotionality as an indicator of memorability in cued recall,
because there is little research on emotional material and
intentional associative memory. Indeed, we found no prior
studies of cued recall using emotional words. We found a
single study of associative recognition of negative compared to neutral word pairs, which reported worse recognition of negative pairs compared to neutral pairs (Onoda,
Okamoto, & Yamawaki, 2009; however, other characteristics of the items that might affect memorability were not
equated for emotional versus neutral pairs). Immediate
cued recall of prior free associations to emotional versus
neutral words (not encoded intentionally) also reveals a
disadvantage for emotional words (Levinger & Clark,
1961; McDowal, 1994). In contrast, Tsukiura and Cabeza
(2008) found that, after studying pairs of names and faces,
cued recall of the facial expression was more accurate for
names that had been paired with happy rather than neutral faces.
Studies of memory for incidentally encoded source
characteristics, such as spatial and temporal information,
show no advantage for emotional words or pictures compared to their neutral counterparts. Maddock and Frein
(2009) found equivalent memory for the spatial and temporal context of neutral and positive words, and worse
context memory for negative words. Recollection of
whether a picture appeared in a first versus second list
was lower for emotional pictures compared to neutral pictures (Aupée, 2007), and memory for whether a word was
read versus heard was lower for emotional than for neutral
words (Cook, Hicks, & Marsh, 2007; see Mather, 2007, for a
review of incidental binding of emotional material and related information). Even intentional encoding of source
characteristics such as font color seems not to be enhanced
for emotional material (Dougal, Phelps, & Davachi, 2007),
and was worse for negative compared to neutral words
(Maddock & Frein, 2009, Experiment 4). Thus, if people
judge emotional material as more memorable in preparation for a task that requires binding two items together,
such as cued recall, it is not at all clear that they will be
correct, particularly for negative material.
The current experiments explore whether people judge
the memorability of emotional versus neutral words differently by asking for JOLs prior to a cued recall test (Experiments 1, 3, and 4) and a free recall test (Experiments 2 and
3). We predict that people use the emotionality of words as
a cue to memorability on both types of memory test, and
so will give higher JOLs to emotional than to neutral words
and word pairs. Given that emotionality is a valid cue for
free recall, memory monitoring should be relatively effective; however, emotionality of word pairs may be a misleading cue to memorability in cued recall, resulting in
overconfidence for emotional pairs.
Experiment 1
In Experiment 1, we assessed whether people used the
emotionality of words as a cue for JOLs in a paired associates paradigm across two study-test trials. Participants
studied pairs of negative or neutral words and rated the
likelihood that they would be able to recall the second
word when presented with the first. We predict that
participants will use emotionality of items as a cue in making JOLs, and so will give higher JOLs to negative compared
to neutral word pairs on Trial 1. However, given that some
types of associative memory are not better for negative
compared to neutral materials, as reviewed above, we predict that this will lead to marked overconfidence.
After repeated study-test trials, particular experience-based cues such as familiarity of an item and memory for
a prior test become available; because these mnemonic
cues are so diagnostic of later memory performance, with
repeated study-test cycles people typically abandon other
cues intrinsic to the studied material, such as word pair
relatedness, and shift to using the mnemonic cues as a basis for JOLs (Koriat, 1997). In the current study, we anticipate that emotionality will be a cue to memorability on
the first study trial, but that participants will shift to more
diagnostic mnemonic cues (e.g., memory for whether they
recalled a target on the first test) as a basis for JOLs on the
second study trial. Additionally, we explore the effect of
emotionality on memory control processes by including
self-paced as well as experimenter-paced study time conditions. If people believe that emotional material is highly
memorable, they may study it for less time than neutral
material.
Method
Participants
Forty-eight undergraduate students enrolled in an
introductory psychology course at Florida State University
participated in exchange for partial course credit. Students
were randomly assigned to either the self-paced study
time or experimenter-paced study time condition and
tested individually.
Materials
Eighty-four nouns were chosen from the Affective
Norms for English Words (ANEW; Bradley & Lang, 1999);
40 of the words were neutral (M = 5.2 on ANEW’s scale
from 1 [unpleasant] to 9 [pleasant]) and 44 were negative
(M = 2.5). Four additional neutral words were taken from
Rubin and Friendly (1986), who used a rating scale from
1 (bad) to 7 (good); these four neutral words all received
ratings in the range of 4–6. Both valence and arousal are
important components of emotionality, so in addition to
being more negative, emotional words were also selected
to be more arousing according to the ANEW (M = 5.8 on
ANEW’s scale from 1 [calm] to 9 [aroused]) than neutral
words (M = 4.2), F(1, 82) = 73.79, MSE = .73, p < .001. All
negative and neutral words were equated for frequency
(Kučera & Francis, 1967; M = 26.3 for negative, M = 34.0
for neutral), F(1, 82) = 1.18, MSE = 1048.23, p = .28, length
(M = 5.6 for both), F < 1, concreteness (Friendly, Franklin,
& Rubin, 1982; Nelson, McEvoy, & Schreiber, 1998;
M = 4.8 for negative, M = 4.9 for neutral), F < 1, imageability
(Wilson, 1988; M = 514.1 for negative, M = 519.8 for neutral), F(1, 72) = 1.04, MSE = 10234.78, p = .31, and familiarity (Wilson; M = 527.0 for negative, M = 503.0 for
neutral), F < 1.
From these nouns, two types of word pairs were
formed: 22 negative pairs (e.g., prison–cancer) and 22 neutral pairs (e.g., violin–avenue). A pair-wise latent semantic
analysis (LSA; Landauer, Foltz, & Laham, 1998) on the individual word pairs ensured that the words comprising each
pair were unrelated to one another. The average pair-wise
similarity score for all word pairs was .04; no pair had a
similarity score greater than .1 (LSA similarity scores are
on a scale from −1.0 to +1.0). Additionally, neutral noun
pairs were no more related overall than negative noun
pairs (mean similarity score = .04 for both), F < 1. Words
were further assessed for relatedness using free association
norms (Nelson et al., 1998). No cue was listed as an associate of its paired target word; additionally, the highest
probability of producing any word in the list in response
to a cue word in the list was .13. There was no significant
difference in the forward association strengths of negative
(M = .05) and neutral word sets (M = .04), F < 1.
Procedure
Participants studied the same list of words for two
study-test trials. Word pair order in both the study and test
phases was newly randomized for each participant. In the
study phase of each trial, participants saw the 44 pairs presented one at a time on a computer screen. In the experimenter-paced condition, each pair was presented for 5 s;
in the self-paced condition, participants were able to study
the pair for as long as they chose. For participants in the
self-paced condition, study time was measured from presentation of the pair to when the participant pressed the
spacebar to bring up the next pair. In both conditions,
the inter-stimulus interval between the pair and the JOL
prompt was 250 ms. Participants in both conditions were
instructed to study each pair such that they would later
be able to recall the second word (target word) when presented with the first (cue word). Immediately after each
pair was presented, participants estimated the probability
of recalling the target word when prompted by the cue
word on a scale from 0 (certain will not recall the target
word) to 100 (certain will recall the target word).
Immediately after each study phase, participants
moved on to a test phase. In the test phase, each of the
44 cue words appeared on the screen one at a time. Participants were instructed to type in the target word that was
paired with that cue word during the study phase; they
were given a maximum of 15 s to do so. The next cue word
was presented immediately after the participant entered
his or her response.
Results and discussion
There were no significant interactions with study time
condition, so we collapsed across this factor in all of the
following analyses. Contrary to our prediction, negative
pairs were not studied for significantly less time
(M = 7077.5 ms) than neutral pairs (M = 7557.1 ms) on
Trial 1, F(1, 18) = 3.20, MSE = 683381.4, p = .09.
Recall
Misspelled items and plurals were counted as correct in
this and subsequent experiments. A Trial (1, 2) × Emotionality (negative, neutral) repeated-measures ANOVA on the
percentage of words recalled found no main effect of
emotionality; in fact, there was a trend towards recall
being worse for negative (M = 33.5) compared to neutral
pairs (M = 36.3), F(1, 47) = 2.65, MSE = 141.20, p = .11,
η² = .05. There was a significant main effect of trial such
that recall improved for both item types from the first to
the second trial, F(1, 47) = 204.27, MSE = 167.36, p < .001,
η² = .81 (see Fig. 1).
Commission errors
We also examined the possibility that participants
tended to make more commission errors for negative items
than for neutral items. A commission error was the production of any word other than the correct target in response to the cue. A repeated-measures ANOVA
performed on the number of commission errors revealed
a main effect of emotionality, F(1, 47) = 53.93, MSE = 9.34,
p < .001, η² = .53; overall, participants made more
commission errors for negative cues (M = 8.7 out of 22)
compared to neutral cues (M = 5.5 out of 22).
Calibration
Overall. Calibration is the average correspondence between JOLs and recall. Following Koriat (1997), we included JOL and recall percentages as two levels of one
independent variable designated “measure” (see also Koriat & Bjork, 2005, 2006; Koriat, Bjork, Sheffer, & Bar, 2004;
Koriat, Ma’ayan, Sheffer, & Bjork, 2006; Koriat, Sheffer, &
Ma’ayan, 2002; Van Overschelde & Nelson, 2006). A Trial
(1, 2) × Emotionality (negative, neutral) × Measure (JOL,
recall) repeated-measures ANOVA revealed a significant
3-way interaction, F(1, 46) = 8.65, MSE = 31.44, p < .01,
η² = .16, illustrated in Fig. 1. We followed up on the interaction with separate analyses of Trial 1 and Trial 2 because
we predicted that emotionality of the word pair would be a
cue for JOLs on Trial 1, whereas mnemonic information
such as memory for the prior test would be the predominant cue for JOLs on Trial 2.
Fig. 1. Experiment 1 mean judgment of learning (JOL) and recall as a
function of trial and pair emotionality. Error bars represent standard
errors.
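The calibration comparison described above can be illustrated with a minimal sketch. It assumes calibration is summarized as a signed difference between the mean JOL and the percentage of targets recalled; the analyses reported here instead treat JOL and recall as two levels of a "measure" factor in an ANOVA, so this is only a descriptive simplification. The function name and the data are hypothetical.

```python
from statistics import mean


def calibration_bias(jols, recalled):
    """Return mean JOL (0-100) minus percentage recalled.

    Positive values indicate overconfidence, negative values
    underconfidence. This signed-difference summary is an
    illustrative assumption, not the ANOVA used in the paper.
    """
    if not jols or len(jols) != len(recalled):
        raise ValueError("jols and recalled must be non-empty and equal in length")
    percent_recalled = 100.0 * sum(1 for r in recalled if r) / len(recalled)
    return mean(jols) - percent_recalled


# Hypothetical data for one participant and one item type.
example_jols = [80, 60, 70, 90]
example_recalled = [True, False, False, True]
example_bias = calibration_bias(example_jols, example_recalled)
```

Computing this value separately for negative and neutral pairs on each trial would give the four descriptive quantities that the Emotionality × Measure interaction compares.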
Trial 1. An Emotionality (negative, neutral) × Measure
(JOL, recall) repeated-measures ANOVA revealed a significant interaction, F(1, 47) = 40.99, MSE = 44.42, p < .001,
η² = .47. Participants were overconfident in their memory
for negative pairs on Trial 1, F(1, 47) = 43.50,
MSE = 288.48, p < .001, η² = .48; in fact, participants believed that they would recall almost twice as many negative pairs as they actually did. Participants were also
overconfident in their memory for neutral pairs, F(1,
47) = 8.71, MSE = 306.63, p < .01, η² = .16, although to a lesser extent; this finding of some overconfidence for neutral
pairs is common in the metacognition literature (Koriat &
Bjork, 2005; Koriat, Ma’ayan, Sheffer, & Bjork, 2006; Koriat,
Sheffer, & Ma’ayan, 2002).
Trial 2. Results from the second trial also revealed a significant Emotionality × Measure interaction, F(1, 47) = 10.19, MSE = 36.75, p < .01, η² = .18. The typical
underconfidence-with-practice effect (Koriat, Sheffer, &
Ma’ayan, 2002) was found for both negative, F(1,
47) = 8.78, MSE = 267.15, p < .01, η² = .16, and neutral pairs,
F(1, 47) = 19.60, MSE = 293.15, p < .001, η² = .29, although it
was less pronounced for negative pairs.
Resolution
Resolution is a measure of the relative accuracy of JOLs
and is commonly indexed by within-subject Goodman–
Kruskal gamma correlations between JOLs and recall accuracy (Nelson, 1984). JOL–recall gammas were calculated
separately for negative and neutral pairs. Twelve participants were excluded because they had no variability in recall in one of the conditions (one participant recalled all
targets on Trial 2, three recalled all targets in the neutral
condition on Trial 2, seven recalled no neutral targets on
Trial 1, and one recalled no negative targets on Trial 1).
One participant was excluded for giving all negative pairs
a JOL of 20 on Trial 2. Because gammas could not be computed for these participants, only the remaining 35 participants were included in the following analyses (see
Schwartz & Metcalfe, 1992, Experiment 3, for a similar
number of exclusions).
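The Goodman–Kruskal gamma described above can be sketched as follows. This is a minimal illustration of the standard definition, (concordant − discordant) / (concordant + discordant), computed over all item pairs for one participant; the function name and the data are hypothetical. Returning `None` when no pair is untied on both variables mirrors the reason for the exclusions noted above: with no variability in recall (or in JOLs), gamma is undefined.

```python
from itertools import combinations


def goodman_kruskal_gamma(jols, recalled):
    """Gamma between JOLs and recall outcomes for one participant.

    Pairs tied on either variable are ignored. Returns None when
    every pair is tied, in which case gamma cannot be computed.
    """
    concordant = 0
    discordant = 0
    for (jol_a, rec_a), (jol_b, rec_b) in combinations(zip(jols, recalled), 2):
        product = (jol_a - jol_b) * (int(rec_a) - int(rec_b))
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    if concordant + discordant == 0:
        return None
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical data: four items, one recalled item given a low JOL.
example_gamma = goodman_kruskal_gamma([90, 70, 40, 20], [1, 1, 0, 1])

# A participant who recalled every target has no variability in recall.
undefined_gamma = goodman_kruskal_gamma([90, 70, 40, 20], [1, 1, 1, 1])
```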
Consistent with previous findings (Koriat, 1997), JOL–
recall gammas increased across trials, F(1, 34) = 8.52,
MSE = .16, p < .01, η² = .20. This increase in gammas is typically interpreted as a shift away from the use of intrinsic
cues, which would include the emotional quality of word
pairs, to make JOLs and toward the use of mnemonic cues,
which are typically quite diagnostic of memory performance. There was also a significant main effect of emotionality, F(1, 33) = 15.04, MSE = .13, p < .001, η² = .31. JOL–
recall gammas were lower overall for negative pairs
(M = .34) compared to neutral pairs (M = .53); participants
were less able to distinguish between pairs that they
would and would not recall when those pairs were negative rather than neutral. However, to preview, this finding
did not replicate in the subsequent experiments and will
not be discussed further. The interaction between trial
and emotionality was not significant, F < 1.
To measure whether people were reliant on emotionality as a cue on Trial 1, but less reliant on emotionality and
more reliant on mnemonic cues on Trial 2, we calculated
the gamma correlation between emotionality (negative
versus neutral) and JOLs (see Koriat, 1997 for a similar
analysis in the case of pair relatedness as a cue). The gamma correlation between JOL and emotionality reflects the
extent to which high JOLs are assigned to negative items
and low JOLs are assigned to neutral items; in this case, a
high JOL–emotionality gamma would indicate that negative emotionality is used as a cue for JOLs. The JOL–emotionality gamma decreased significantly from the first to
second trial, F(1, 45) = 31.09, MSE = .04, p < .001, η² = .41
(see Fig. 2). This pattern indicates that emotionality of
the pairs is a cue used to make JOLs on Trial 1, but it is
abandoned in favor of more diagnostic mnemonic cues
on Trial 2 (e.g., Koriat, 1997).
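Because emotionality takes only two values here, the JOL–emotionality gamma reduces to a comparison of every negative item's JOL with every neutral item's JOL. The sketch below illustrates that special case under the assumption that negative items are coded as the higher category; the function name and the data are hypothetical.

```python
def jol_emotionality_gamma(negative_jols, neutral_jols):
    """Gamma between a binary emotionality code and JOLs.

    Only negative-neutral item pairs differ on emotionality, so only
    those pairs can be concordant (negative JOL higher) or discordant
    (negative JOL lower). Pairs with equal JOLs are ignored.
    Returns None if every cross-category pair is tied.
    """
    higher = sum(1 for neg in negative_jols for neu in neutral_jols if neg > neu)
    lower = sum(1 for neg in negative_jols for neu in neutral_jols if neg < neu)
    if higher + lower == 0:
        return None
    return (higher - lower) / (higher + lower)


# Hypothetical JOLs from one participant on one trial.
example = jol_emotionality_gamma([80, 60], [50, 70])
```

A value near +1 would indicate that negative items consistently receive higher JOLs than neutral items, and a value near 0 would indicate that emotionality is unrelated to the JOLs.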
Summary
As predicted, emotionally negative word pairs were
judged as far more memorable than neutral word pairs;
however, negative word pairs did not live up to their
apparent memorability. Cued recall was not better for negative word pairs relative to neutral word pairs, resulting in
overconfidence on Trial 1. On Trial 2, participants shifted
away from reliance on emotionality to make their JOLs,
presumably to more mnemonic cues; calibration shifted
to underconfidence for both negative and neutral pairs,
with more underconfidence for neutral pairs.
Fig. 2. Experiment 1 mean JOL–recall and JOL–emotionality gammas as a
function of trial. Error bars represent standard errors.
Experiment 2
In Experiment 2, we explored whether the negative
emotionality of words enhanced their subjective memorability in a free recall paradigm. Most experiments find better recall of negative words and pictures relative to neutral
materials, and so we predicted that the use of emotionality
as a cue for JOLs would produce equivalent monitoring for
negative and neutral words. The exceptions to this pattern
of superior recall of emotional items are found in experiments that attempt to equate negative and neutral words
on subjective measures of relatedness via ratings, based
on the claim that relatively short-term emotional memory
enhancement is due to the categorical nature of emotional
items themselves (Schmidt & Saari, 2007; Talmi, Luk, et al.,
2007; Talmi & Moscovitch, 2004; Talmi, Schimmack, Paterson, & Moscovitch, 2007). In contrast, we use the same
method as in Experiment 1 of equating materials on LSA
and free association norms (see also Dougal & Rotello,
2007). Additionally, we examine this semantic-cohesion
explanation of emotional memory enhancement by measuring the extent to which negative and neutral words
cluster together in recall. If emotionally negative items
are recalled better by virtue of more relational processing
among negative items, then we would expect to see clustering by emotion category in participants’ recall output.
Method
Participants
Thirty-six undergraduates enrolled in an introductory
psychology course at Florida State University participated
in exchange for partial course credit. All participants were
tested individually.
Materials
Forty-four nouns were chosen from the Experiment 1
materials; half of the words were neutral (M = 5.17 on ANEW’s scale from 1 [unpleasant] to 9 [pleasant]) and half
were negative (M = 2.45). Negative emotional words
(M = 5.9) were also selected to be more arousing than neutral words (M = 4.2), F(1, 42) = 65.55, MSE = .50, p < .001.
Negative and neutral words were equated for frequency
(Kučera & Francis, 1967; M = 27.9 for negative, M = 30.3
for neutral), length (M = 5.8 for negative, M = 5.4 for neutral), concreteness (Friendly et al., 1982; Nelson et al.,
1998; M = 5.1 for both), familiarity (Wilson, 1988;
M = 530.7 for negative, M = 519.2 for neutral), and imageability (Wilson; M = 543.1 for negative, M = 521.8 for neutral), all F < 1. A matrix LSA (Landauer et al., 1998) on the
individual words ensured that negative words were no
more inter-related than neutral words, F < 1; the average
similarity score was .08 for both negative and neutral word
sets. Words were also examined using free association
norms (Nelson et al., 1998); only seven words had associates on the list. The mean probability of producing any
word in response to another was .02 for negative words
and .08 for neutral words, F < 1.
Procedure
The experiment again consisted of two study-test trials.
Word order in both the study and test phases was newly
randomized for each participant. In the study phase, each
of the 44 words was presented on a computer screen for
5 s, with an inter-stimulus interval of 250 ms. Participants
were instructed to study each word such that they would
later be able to recall it on their own. After the 5 s study
period for each word, the following question appeared on
the screen: “What are the chances that you will recall this
word?” Participants responded by typing in their answer
on a scale from 0 (certain will not recall the word) to 100
(certain will recall the word). Immediately after each study
phase, participants moved on to a test phase. During the
test phase, participants were told to type in as many of
the words as they could recall; they were given as much
time as they needed to do so.
Results and discussion
Recall
A Trial (1, 2) × Emotionality (negative, neutral) repeated-measures ANOVA on the percentage of words recalled revealed that, in contrast to the results for cued
recall in Experiment 1, negative words were recalled better
than neutral words, F(1, 35) = 20.71, MSE = 155.42, p < .001,
η² = .37 (see Fig. 3). Higher free recall of negative words is
consistent with previous research (Bradley et al., 1992; Cahill et al., 1996; Harris & Pashler, 2005; Nagae & Moscovitch, 2002; Rubin & Friendly, 1986), and occurred even
though emotional and neutral words were equated for
relatedness using LSA and free association norms. Recall
improved from Trial 1 to Trial 2 equally for both word
types, F(1, 35) = 205.01, MSE = 79.12, p < .05, η² = .85
(F < 1 for the interaction). These results support our prediction that emotionality is a valid cue for JOLs in a free recall
test.
Calibration
A Trial (1, 2) × Emotionality (negative, neutral) × Measure (JOL, recall) repeated-measures ANOVA yielded a significant main effect of trial, F(1, 35) = 55.92, MSE = 172.55,
p < .001, η² = .62. There was also a significant main effect of
emotionality, F(1, 35) = 41.75, MSE = 167.55, p < .001,
η² = .54. Not only were negative words better recalled than
neutral words, but JOLs were also higher for negative
words than neutral words, which is consistent with the
use of emotionality as a cue for JOLs. We also found a significant Trial × Measure interaction, F(1, 35) = 51.82,
MSE = 129.38, p < .001, η² = .60, as shown in Fig. 3. On Trial
1, participants were equally overconfident in their judgments of negative and neutral words, as revealed by the
simple effect of measure across emotionality, F(1,
35) = 10.58, MSE = 491.50, p < .01, η² = .23. On Trial 2, JOLs
shifted to equal underconfidence for both negative and
neutral words, F(1, 35) = 5.29, MSE = 360.44, p < .05,
η² = .13. No other interactions were significant, all F < 1.
Fig. 3. Experiment 2 mean judgment of learning (JOL) and recall as a
function of trial and pair emotionality. Error bars represent standard
errors.
Resolution
Gammas could not be calculated for one participant because he had no variability in JOLs in one of the conditions
(he gave all neutral words a JOL of 50 on Trial 2); this participant was excluded from the analysis. Consistent with
previous findings (Koriat, 1997), there was a main effect
of trial, F(1, 34) = 7.63, MSE = .15, p < .01, η² = .18; JOL–recall gammas increased across trials for both negative and
neutral words (see Fig. 4). Monitoring resolution was
equally effective for negative (M = .37) and neutral words
(M = .38), F < 1.
We again examined whether emotionality was a cue for
Trial 1 JOLs, and whether that cue was abandoned in favor
of more valid mnemonic cues on Trial 2 by computing JOL–
emotionality gammas for each trial. The JOL–emotionality
gammas were significantly greater than zero on both Trial
1, t(34) = 6.01, p < .001, d = 1.02, and Trial 2, t(34) = 6.49,
p < .001, d = 1.10 (see Fig. 4). This positive relationship between emotionality and JOLs suggests that participants
were in fact differentially assigning high and low JOLs
based on a given word’s emotionality such that negative
words received higher JOLs than neutral words. In contrast
to Experiment 1, the JOL–emotionality gammas did not
drop from Trial 1 to Trial 2 as mnemonic cues became
available, F < 1, as illustrated in Fig. 4. One explanation
for this pattern is that emotionality could serve as an
intrinsic cue on both trials. Alternatively, the intrinsic cue
of emotionality could be used on Trial 1, and mnemonic
cues used on Trial 2, as in Experiment 1, but a high correlation between emotionality and mnemonic cues in the
free recall task would produce no change in the JOL–emotionality gamma on Trial 2.
Fig. 4. Experiment 2 mean JOL–recall and JOL–emotionality gammas as a
function of trial. Error bars represent standard errors.
Semantic clustering
We calculated Adjusted Ratio of Clustering (ARC; Roenker, Thompson, & Brown, 1971) scores to examine clustering according to the emotional category of negative versus
neutral. The ARC score is the ratio of observed category
repetitions above chance to the total possible category repetitions above chance. ARC scores were significantly different from zero on the second trial only (M = .12),
t(35) = 3.08, p < .01, d = .5. The lack of clustering according
to emotional category on the first recall trial (M = .04) indicates little within-emotion relational processing, which
undermines the argument that superior recall of negative
items is based on greater semantic cohesion among emotional items (Buchanan, Etzel, Adolphs, & Tranel, 2006;
Maratos, Allan, & Rugg, 2000; Talmi, Luk, et al., 2007; Talmi
& Moscovitch, 2004; Talmi, Schimmack, et al., 2007), at
least when the relatedness is equated for negative and
neutral words using LSA and free association norms, as in
the current experiment. Although ARC scores were significantly different from zero by the second trial, they were
still numerically quite small, only 12% above chance.
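The ARC computation described above can be sketched as follows, assuming the standard formulation ARC = (R − E(R)) / (max R − E(R)), where R is the observed number of adjacent same-category repetitions in the recall output, E(R) = Σ nᵢ²/N − 1 is the chance expectation, and max R = N − k for N recalled items from k recalled categories. The function name and the recall sequences are hypothetical.

```python
from collections import Counter


def arc_score(recall_categories):
    """Adjusted Ratio of Clustering for one recall protocol.

    recall_categories lists the category label of each recalled item
    in output order. Returns None when the score is undefined (nothing
    recalled, or chance and maximum repetitions coincide).
    """
    n = len(recall_categories)
    if n == 0:
        return None
    counts = Counter(recall_categories)
    k = len(counts)
    # Observed repetitions: adjacent recalled items from the same category.
    observed = sum(a == b for a, b in zip(recall_categories, recall_categories[1:]))
    expected = sum(c * c for c in counts.values()) / n - 1
    maximum = n - k
    if maximum == expected:
        return None
    return (observed - expected) / (maximum - expected)


# Hypothetical output orders for six recalled words.
perfect = arc_score(["neg", "neg", "neg", "neu", "neu", "neu"])
chance = arc_score(["neg", "neg", "neu", "neg", "neu", "neu"])
alternating = arc_score(["neg", "neu", "neg", "neu", "neg", "neu"])
```

Under this formulation a score of 1 indicates perfect clustering, 0 indicates chance-level clustering, and negative values indicate less clustering than expected by chance.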
Summary
Participants again judged negative emotional items as
more memorable than neutral items. However, in the current free recall task, negative words were, in fact, recalled
more often than neutral words, which made emotionality a
valid cue for predicting recall. Thus, accuracy of monitoring
performance did not differ based on emotionality; both
calibration and resolution were equivalent for negative
and neutral words.
Lastly, clustering scores were not significantly different
from zero on Trial 1, indicating that participants did not
organize their recall output by emotionality. There are a
number of theories as to why free recall is better for emotionally charged words than neutral words. In the long
term, consolidation processes may be modulated by the
amygdala, producing less forgetting for emotional items.
However, in the shorter study-test intervals used in the
current experiment, one major theory is that emotional
items constitute a semantic category that encourages more
relational processing among like items (e.g. Talmi & Moscovitch, 2004); however, the lack of clustering in our free
recall data does not support this semantic-cohesion theory
of emotional memory enhancement.
Experiment 3
Thus far we have demonstrated that, although individual negative emotional words are indeed better recalled
than neutral words, pairs of negative words are not better
recalled than their neutral counterparts. However, participants predicted that they would remember both individual
negative words and negative word pairs better than neutral words and pairs. This prediction was accurate in the
case of free recall, but led to marked overconfidence in
the case of cued recall. Because the cued recall results, as
well as the monitoring results, for negative emotional
words are novel, Experiment 3 aims to replicate these findings. In addition, we tested the nature of memory and
monitoring performance for positive words and word
pairs. In Experiments 1 and 2, negative and neutral pairs
differed in both valence and arousal, and either dimension
could be driving the higher judgments of memorability. In
Experiment 3, we equate the positive words and negative
words on arousal, with both being more arousing than
the neutral words.
Experiment 3 also controls for subjective relatedness of
the word pairs. When an independent set of 15 participants rated the relatedness of the two words forming a pair
in Experiment 1, they perceived the negative pairs as more
related, even though relatedness was controlled via free
association norms and LSA. In Experiment 3, we controlled
for both objective and subjective relatedness of positive,
negative, and neutral word pairs; thus Experiment 3 tests
whether emotionality per se, independent of perceived
and actual relatedness, affects both JOLs and memory performance in cued recall. We also include free recall versus
cued recall as a between-subjects manipulation.
As noted in the introduction, to our knowledge Experiment 1 is the first study to measure cued recall of words
varying in emotionality, and we found no better recall of
negative than neutral words. The extension to positive
word pairs is also a first, and therefore we are unable to
predict whether memory will be better for positive word
pairs than neutral pairs. However, we do predict that people will judge positive word pairs as more memorable than
neutral pairs, regardless of the accuracy of that prediction.
Experiment 3 used a single study-test trial, in order to focus on the use of emotionality as a cue to memorability.
Method
Participants
Forty undergraduate students enrolled in an introductory psychology course at Florida State University participated in exchange for partial course credit. All
participants were randomly assigned to either the cued recall or free recall condition and were tested individually.
Materials
Eighty-one nouns were chosen from the Affective
Norms for English Words (ANEW; Bradley & Lang, 1999);
27 of the words were neutral (M = 5.2 on ANEW’s scale
from 1 [unpleasant] to 9 [pleasant]), 26 were negative
(M = 2.6), and 28 were positive (M = 7.4). Three additional
words (two negative and one neutral) were taken from Rubin and Friendly (1986), who used a rating scale from 1
(bad) to 7 (good); the negative words both had ratings below three and the neutral word had a rating of five. Both
positive and negative words were also more arousing than
neutral words according to the ANEW (M = 5.6 on ANEW’s
scale from 1 [calm] to 9 [aroused] for positive, M = 5.8 for
negative, M = 4.1 for neutral), F(2, 79) = 32.00, MSE = .74,
p < .001; positive and negative words did not differ from
each other in arousal, F < 1. All positive, negative, and neutral words were equated for frequency (Kučera & Francis,
1967; M = 26.5 for positive, M = 24.1 for negative,
M = 27.7 for neutral), F < 1, length (M = 5.5 for positive,
M = 5.6 for negative, M = 5.8 for neutral), F < 1, concreteness (Friendly et al., 1982; Nelson et al., 1998; M = 4.9 for
positive, M = 5.2 for negative, M = 5.1 for neutral), F < 1,
and imageability (Wilson, 1988; M = 560.9 for positive,
M = 538.8 for negative, M = 517.9 for neutral), F(2,
62) = 1.53, MSE = 6447.47, p = .22. A matrix LSA (Landauer
et al., 1998) on each word set ensured that, overall, negative and positive words were no more related than neutral
words (M = .13 for all), F < 1. Words were also assessed for
relatedness using free association norms (Nelson et al.,
1998). No cue was listed as an associate of its paired target
word; additionally, the highest probability of producing
any word in the list in response to a cue word was .04.
There was no significant difference in the forward association strengths of positive (M = .05), negative (M = .03), or
neutral word sets (M = .03), F < 1.
From these nouns, three types of word pairs were
formed for the cued recall condition: 14 positive pairs, 14
negative pairs, and 14 neutral pairs. All pairs were rated
for relatedness by an independent set of participants
(n = 18). Overall, negative (M = 2.6 on a scale from 1 [not
related] to 7 [very related]) and positive pairs (M = 2.5)
were no more subjectively related than neutral pairs
(M = 2.3), F < 1. A pair-wise LSA (Landauer et al., 1998) on
the individual word pairs further ensured that the words
comprising each pair were unrelated to one another. The
average pair-wise similarity score for all word pairs was
.051; no pair had a similarity score greater than .2. Additionally, neither negative noun pairs (M = .036) nor positive noun pairs (M = .065) were more related overall than
neutral noun pairs (M = .053), F < 1.
The targets from the cued recall condition served as the
stimuli in the free recall condition. Both positive and negative targets were more arousing (M = 5.9 for positive;
M = 6.2 for negative) than neutral targets (M = 4.1), F(1,
80) = 33.43, MSE = .73, p < .001; positive and negative targets did not differ from each other in arousal, F < 1. All positive, negative, and neutral targets were equated for
frequency (Kučera & Francis, 1967; M = 23.3 for positive,
M = 25.5 for negative and neutral), length (M = 6.1 for positive, M = 5.5 for negative, M = 5.9 for neutral), concreteness
(Friendly et al., 1982; Nelson et al., 1998; M = 4.8 for positive, M = 5.1 for negative, M = 5.2 for neutral), and imageability (Wilson, 1988; M = 543.6 for positive, M = 549.4
for negative, M = 533.8 for neutral), all F < 1. A matrix LSA
(Landauer et al., 1998) on each target set ensured that,
overall, negative (M = .17) and positive targets (M = .18)
were no more related than neutral targets (M = .15), F(2,
39) = 1.50, MSE = .002, p = .24. Targets were also assessed
for relatedness using free association norms (Nelson
et al., 1998). The highest probability of producing any target in the list in response to another target was .06. There
was no significant difference in the forward association
strengths of positive (M = .05) and negative (M = .04) target
sets, F < 1; no neutral targets had associates in the list.
Procedure
The procedure in the free recall condition was as in
Experiment 2, and the procedure for the cued recall condition was as in the experimenter-controlled study condition
of Experiment 1. However, in Experiment 3, word pairs
(cued recall condition) or single words (free recall condition) were presented for a single study-test trial to examine the role of emotionality as an intrinsic cue to
memorability.
Results and discussion
A Condition (free, cued) × Emotionality (positive, negative, neutral) × Measure (JOL, recall) mixed-measures ANOVA revealed a significant 3-way interaction, F(2,
76) = 13.94, MSE = 68.62, p < .001, η² = .27. To follow up
on this interaction, we examined the cued and free recall
conditions separately.
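The effect sizes reported with the F tests in this section are numerically consistent with partial eta squared recovered from the F ratio and its degrees of freedom. A minimal sketch, assuming that is the statistic being reported (the function name is an illustrative assumption):

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Three-way interaction reported above: F(2, 76) = 13.94.
interaction_effect_size = round(partial_eta_squared(13.94, 2, 76), 2)
```

With these inputs the sketch reproduces the effect size of .27 reported for the three-way interaction.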
Cued recall condition
Recall. A repeated-measures ANOVA on the percentage of
words correctly recalled revealed a significant main effect
of emotionality, F(2, 38) = 18.35, MSE = 130.95, p < .001,
η² = .49 (see Fig. 5). Follow-up tests confirmed that positive
word pairs were recalled significantly better than both
negative, F(1, 19) = 21.60, MSE = 150.87, p < .001, η² = .53,
and neutral pairs, F(1, 19) = 32.62, MSE = 120.19, p < .001,
η² = .63; recall of negative and neutral pairs did not differ
significantly, F < 1.
Commission errors. A repeated-measures ANOVA on the
number of commission errors revealed a significant main
effect of emotionality, F(2, 38) = 3.55, MSE = 1.86, p < .05,
η² = .16. Participants made significantly more commission
errors for negative pairs (M = 2.7 out of 14) compared to
positive pairs (M = 1.5 out of 14), F(1, 19) = 5.32,
MSE = 2.49, p < .05, η² = .22, with neutral pairs falling in between (M = 2.1).
Fig. 5. Experiment 3 mean judgment of learning (JOL) and recall as a
function of test type and pair emotionality. Error bars represent standard
errors.
Calibration. An Emotionality (positive, negative, neutral) × Measure (JOL, recall) repeated-measures ANOVA revealed a main effect of emotionality, F(2, 38) = 21.76,
MSE = 95.02, p < .001, η² = .53, which was qualified by a
significant interaction, F(2, 38) = 14.10, MSE = 72.13,
p < .001, η² = .43, illustrated in the left pane of Fig. 5. To follow up on this interaction, we examined calibration for
each emotion category separately. For neutral and positive
pairs, participants were accurate in their predictions overall; JOLs did not differ significantly from actual level of recall for either neutral, F < 1, or positive pairs, F(1,
19) = 1.44, MSE = 254.35, p = .25. For negative pairs, however, participants were again significantly overconfident
in their memory, F(1, 19) = 6.55, MSE = 303.36, p < .05,
η² = .26, replicating the findings of Experiment 1.
Resolution. Four participants were excluded because they
had no variability in recall in one of the conditions (one recalled no negative pairs, three recalled no neutral pairs). A
repeated-measures ANOVA on the JOL–recall gammas revealed no differences in relative accuracy among the emotion categories, F < 1. As previewed in Experiment 1, JOL–
recall gammas were not lower for negative (M = .19) compared to neutral (M = .21) or positive pairs (M = .19).
JOL–emotionality gammas were computed by collapsing across positive and negative valence; we calculated
the gamma correlation between emotionality (positive/
negative versus neutral) and JOLs. The average JOL–emotionality gamma was significantly different from zero
(M = .32), t(19) = 8.47, p < .001, d = 1.88, indicating that
participants again relied on the emotionality of word pairs
as a cue to make their JOLs; higher JOLs were given to both
positive and negative pairs compared to neutral pairs.
Free recall condition
Recall. A repeated-measures ANOVA on the percentage of
words correctly recalled revealed a main effect of emotionality, F(2, 38) = 8.73, p < .001, η² = .32 (see Fig. 5). Both negative and positive words were recalled better than neutral
words, F(1, 19) = 27.09, MSE = 79.15, p < .001, η² = .59, and
F(1, 19) = 5.93, MSE = 145.26, p < .05, η² = .24, respectively.
Positive and negative words did not differ significantly
from each other, F(1, 19) = 1.88, MSE = 152.73, p = .19.
Calibration. An Emotionality (positive, negative, neutral) × Measure (JOL, recall) repeated-measures ANOVA revealed a main effect of measure, F(1, 19) = 14.49,
MSE = 532.45, p < .001, η² = .37; participants were equally
overconfident in their predictions for positive, negative,
and neutral words, as illustrated in the right pane of
Fig. 5. There was also a significant main effect of emotionality, F(2, 38) = 17.50, MSE = 94.95, p < .001, η² = .48. Indeed, paralleling the pattern for cued recall, JOLs were
higher for negative, F(1, 19) = 24.20, MSE = 34.39, p < .001,
η² = .56, and positive words, F(1, 19) = 35.14, MSE = 36.06,
p < .001, η² = .65, compared to neutral words; in contrast
to cued recall, however, recall was also higher for both negative and positive words, as noted above. JOLs for positive
and negative words were not significantly different, F(1,
19) = 1.40, MSE = 32.60, p = .25.
Resolution. One participant was excluded because he had
no variability in recall in one of the conditions (he recalled
no neutral words). A repeated-measures ANOVA on the
JOL–recall gammas revealed only a marginal difference
among emotion categories (M = .10 for negative, M = .22
for neutral, M = .39 for positive), F(2, 36) = 2.45, MSE = .17,
p = .10, η² = .12. JOL–emotionality gammas were again significantly different from zero (M = .32), t(19) = 6.52,
p < .001, d = 1.45; participants used emotionality as a cue
for making their JOLs, as suggested by the JOL analyses presented above.
Semantic clustering. As in Experiment 2, we tested the
semantic cohesion theory of enhanced free recall of emotional words by calculating ARC scores, using the categories ‘‘positive,” ‘‘negative,” and ‘‘neutral.” ARC scores were
not significantly different from zero (M = .04), t < 1, indicating that participants did not cluster their recall responses by emotionality.
Summary
Experiment 3 investigated memory monitoring of positive words in cued and free recall tasks, and replicated the
results of Experiments 1 and 2 regarding memory monitoring of negative words. In cued recall, we found no enhanced recall of negative pairs relative to neutral pairs.
Surprisingly, however, the equally arousing positive pairs
were remembered significantly better than either negative
or neutral pairs. Participants gave high JOLs to both types
of emotional words, resulting in good calibration for positive pairs, but overconfidence for negative pairs. In free recall, both positive and negative individual words were
recalled better than neutral words; calibration, in terms
of some overconfidence, was equal for all word types;
monitoring resolution also did not vary as a function of
emotionality. As in Experiment 2, there was little evidence
for output clustering by the emotional category of the
words, which does not support the idea that emotional
words receive more relational processing by virtue of their
categorical nature, as predicted by the semantic cohesion
theory of emotional memory enhancement.
Experiment 4
Experiment 3 is the first to show that positive word
pairs are remembered significantly better than neutral
word pairs in cued recall, whereas equally arousing negative word pairs are not. The purpose of Experiment 4 was
to replicate this novel result with a different set of
materials.
Method
Participants
Twenty-four undergraduate students enrolled in an
introductory psychology course at Florida State University
participated in exchange for partial course credit. All participants were tested individually.
Materials
Eighty-four new words were chosen from the ANEW
(Bradley & Lang, 1999); 28 of the words were neutral
(M = 5.2 on ANEW’s scale from 1 [unpleasant] to 9 [pleasant]), 28 were negative (M = 2.6), and 28 were positive
(M = 7.2). Positive emotional words were more arousing
(M = 5.4 on ANEW’s scale from 1 [calm] to 9 [aroused])
than neutral words (M = 4.0), F(1, 54) = 53.23, MSE = .53,
p < .001, as were negative emotional words (M = 5.7), F(1,
54) = 60.34, MSE = .66, p < .001; negative and positive
words did not differ from each other in terms of arousal,
F(1, 54) = 1.11, MSE = .90, p = .30. All positive, negative,
and neutral words were equated for frequency (Kučera &
Francis, 1967; M = 16.9 for positive, M = 18.1 for negative,
M = 19.7 for neutral), concreteness (Friendly et al., 1982;
Nelson et al., 1998; Wilson, 1988; M = 5.0 for positive,
M = 5.2 for negative, M = 5.4 for neutral), familiarity (Wilson; M = 499.4 for positive, M = 492.2 for negative,
M = 512.5 for neutral), and imageability (Wilson;
M = 536.7 for positive, M = 531.2 for negative, M = 529.6
for neutral), all F < 1. Words were assessed for relatedness
using free association norms (Nelson et al., 1998). No cue
was listed as an associate of its paired target word and
no cue elicited another cue word’s target as a response.
Additionally, only three words had associates on the list;
the highest probability of producing any word in the list
in response to any other word was .04. Additionally, positive, negative, and neutral words all elicited approximately
the same number of associates according to Nelson et al.
(1998) (M = 13.8 for positive, M = 13.9 for negative,
M = 11.4 for neutral), F(2, 56) = 1.02, MSE = 35.8, p = .37. A
matrix LSA ensured that negative (M = .13) and positive
word sets (M = .12) were no more related overall than
the neutral word set (M = .11), F(2, 85) = 1.70, MSE = .001,
p = .19.
From these nouns, three types of word pairs were
formed: 14 positive pairs, 14 negative pairs, and 14 neutral
pairs. A pair-wise LSA (Landauer et al., 1998) on the individual word pairs ensured that the words comprising each
pair were unrelated to one another. The average pair-wise
similarity score for all word pairs was .05; no pair had a
similarity score greater than .11. Additionally, neither negative noun pairs (M = .04) nor positive noun pairs (M = .06)
were more related overall than neutral noun pairs
(M = .05), F < 1.
Procedure
The general procedure was the same as the experimenter-paced condition of Experiment 1, with the exception that participants studied words for only one study-test trial. Participants saw the 42 pairs presented one at a
time on a computer screen for 5 s and were instructed to
study each pair such that they would later be able to recall
the target word when presented with the cue word. Immediately after each pair was presented, participants estimated the probability of recalling the target word when
prompted by the cue word on a scale from 0 (certain will
not recall the word) to 100 (certain will recall the word).
Immediately following the study phase, each of the 42
cue words appeared on the screen one at a time and participants were given 15 s to type in the target word that was
paired with that cue word during the study phase. The next
cue word was presented immediately after the participant
entered his or her response.
Results and discussion
Recall
A repeated-measures ANOVA on the percentage of
words recalled revealed a main effect of emotionality,
F(2, 46) = 48.77, MSE = 79.01, p < .001, η² = .68 (see Fig. 6).
As in Experiment 3, cued recall was significantly higher
for positive pairs compared to both neutral, F(1,
23) = 68.15, MSE = 60.26, p < .001, η² = .75, and negative
pairs, F(1, 23) = 81.22, MSE = 86.88, p < .001, η² = .78. Additionally, participants recalled significantly fewer negative
compared to neutral pairs, F(1, 23) = 4.41, MSE = 89.88,
p < .05, η² = .16. We note that, in Experiment 1, recall of
negative pairs trended toward being worse than recall of
neutral pairs, although this difference was not significant;
in Experiment 3, recall was equal for negative and neutral
pairs. Therefore, we can confidently conclude that recall of
negative pairs is not enhanced relative to neutral pairs,
although it is sometimes significantly worse whereas other
times the two are equivalent.
Commission errors
A repeated-measures ANOVA performed on the number
of commission errors revealed a significant main effect of
emotionality, F(2, 46) = 5.69, MSE = 2.0, p < .01, η² = .20.
Follow-up tests confirmed that participants made significantly more commission errors for negative (M = 3.7 out
of 14) compared to positive pairs (M = 2.3), F(1,
23) = 14.18, MSE = 1.60, p < .01, η² = .38. There was no significant difference in the number of commission errors for
negative compared to neutral pairs, F(1, 23) = 2.08,
MSE = 2.26, p = .16, η² = .08, although there was a trend toward participants making fewer commission errors for positive (M = 2.3) compared to neutral pairs (M = 3.1), F(1,
23) = 3.15, MSE = 2.14, p = .09, η² = .12.
Fig. 6. Experiment 4 mean judgment of learning (JOL) and recall as a
function of pair emotionality. Error bars represent standard errors.
Calibration
An Emotionality (positive, negative, neutral) × Measure
(JOL, recall) repeated-measures ANOVA showed main effects of both emotionality, F(2, 46) = 59.22, MSE = 60.47,
p < .001, η² = .72, and measure, F(1, 23) = 8.15,
MSE = 274.52, p < .05, η² = .26. However, these main effects
were qualified by a significant interaction, F(2, 46) = 18.71,
MSE = 56.64, p < .001, η² = .45. Specifically, participants
were overconfident in their memory for negative, F(1,
23) = 24.66, MSE = 156.67, p < .001, η² = .52, and neutral
pairs, F(1, 23) = 4.73, MSE = 103.19, p < .05, η² = .17, but
accurate in their memory predictions for positive pairs,
F < 1 (see Fig. 6).
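Calibration in these analyses compares mean JOLs with the percentage of items actually recalled. As a simple illustration, assuming JOLs on the 0 to 100 scale described in the Procedure and recall coded as 0 or 1, a signed bias score for one participant and one emotion category could be computed as below. The function name and the data are hypothetical; the tests reported here are ANOVAs on the two measures, not on this difference score.

```python
from statistics import mean


def calibration_bias(jols, recalled):
    """Mean JOL (0-100) minus percentage recalled.

    Positive values indicate overconfidence, negative values
    underconfidence, and zero indicates perfect calibration.
    """
    return mean(jols) - 100 * mean(recalled)


# Hypothetical data: four negative pairs for one participant.
negative_jols = [70, 60, 80, 50]
negative_recalled = [1, 0, 0, 0]
negative_bias = calibration_bias(negative_jols, negative_recalled)
```

In this made-up example the mean JOL is 65 and recall is 25%, so the bias score of 40 points would indicate overconfidence.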
Resolution
One participant was excluded from this analysis because he had no variability in his JOLs. A repeated-measures ANOVA on the JOL–recall gammas revealed no
significant effect of emotionality (M = .28 for positive,
M = .46 for negative, M = .30 for neutral), F < 1; participants
were equally able to distinguish between later recalled and
non-recalled positive, negative, and neutral word pairs
when making their JOLs.
JOL–emotionality gammas were computed collapsing
across positive and negative emotionality, as in Experiment 3. The average JOL–emotionality gammas were again
significantly different from zero (M = .28), t(22) = 5.91,
p < .001, d = 1.23, indicating that participants gave higher
JOLs to both positive and negative pairs than to neutral
pairs.
Summary
Experiment 4 sought to replicate the novel finding in
Experiment 3 that positive word pairs were remembered
significantly better than negative and neutral word pairs.
Using a different set of materials, we again found that cued
recall of positive pairs was significantly better than cued
recall of negative and neutral pairs. Again, participants
gave high JOLs to both negative and positive word pairs,
resulting in calibration accuracy for positive pairs, as in
Experiment 3, but overconfidence for negative pairs, as in
Experiments 1 and 3. We again found no monitoring differences in relative accuracy among the different emotion
categories, replicating the findings of Experiment 3.
General discussion
The current experiments are the first to investigate the
role of the emotionality of words in memory monitoring.
We found that emotionality is a cue that people rely on
when monitoring their learning of emotional and neutral
materials; in both free and cued recall, participants predicted that emotional words would be more memorable
than neutral words. In free recall (Experiments 2 and 3),
participants recalled more positive and negative words,
just as they predicted. Additionally, monitoring of free recall was equally accurate for emotional versus neutral
words, both in terms of calibration and resolution. In cued
recall (Experiments 1, 3 and 4), however, negative word
pairs were not more memorable, resulting in marked overconfidence for negative pairs on the first trial, when emotionality was used as an intrinsic cue. Positive pairs, in
contrast, were consistently recalled better than both negative and neutral pairs (Experiments 3 and 4), and participants were accurate in their absolute memory
predictions for these pairs.
Emotion as a cue for memory monitoring
The most straightforward explanation for variations in
the effectiveness of memory monitoring in free recall versus cued recall is that people use emotionality as a cue to
predict better future memory based on a simple theory
that memory for emotional events is better than memory
for non-emotional events (e.g. Magnusson et al., 2006).
Alternatively, the subjective experience initiated by both
positive and negative words, such as the arousal they induce, may be taken as a signal of memorability. In reality,
we found that the effects of emotionality on memory depend upon the valence of the word and the requirements
of the task. Cued recall of negative words did not exceed
that of neutral words whereas cued recall of positive words
was almost twice that of negative and neutral words, two
novel findings that we will address in the next section.
Immediate affective reactions are the basis for many
judgments such as how satisfied one is with one’s life, even
when the true source of the reaction is something more
transient, such as the weather (Pham, 2004; Schwarz &
Clore, 1983, 2007). Options can be quickly evaluated based
on one’s feelings and choices can be made based on affective reactions to anticipated outcomes (Johnson & Tversky,
1983; Slovic, Finucane, Peters, & MacGregor, 2002). Affective reactions are also used as a basis for predicting the future likelihood of events, with positive reactions leading to
higher likelihood estimates, and negative reactions leading
to lower likelihood estimates (Lench, 2009). If affective
reactions were similarly used as a direct basis for predicting future likelihood of recall, one would see higher estimates for positive words and lower estimates for
negative words in Experiments 3 and 4. However, both positive and negative words garnered higher estimates of future recall, so valence is clearly not influencing likelihood
estimates of memory in the same way that it influences
other likelihood estimates (e.g. Lench, 2009). According
to the affect-as-information approach, valence signals good
versus bad, while arousal signals importance (Clore & Storbeck, 2006). In Experiments 3 and 4, both positive and negative words were more arousing than the neutral words;
given that arousal drives a subjective feeling of importance, arousal may also be a cue for memorability (see also
Castel, 2007; Rhodes & Castel, 2008). Although arousal or
the subjective experience of emotionality could be the
cue for judgments of learning, it cannot be the sole mechanism accounting for emotionality effects in cued recall, as
we will discuss in the next section.
Emotionality and memory for relations
The current memory results have important implications for theories of emotion and memory in that they
demonstrate dissociations in memory performance
depending upon the valence of the studied emotional
materials. Specifically, the finding that relational memory
as assessed by cued recall is enhanced for positive, but
not negative, word pairs indicates that we cannot rely on
a general theory of emotional effects on memory; rather,
any viable theory must be able to account for differences
between memory for positive and negative items. The theory that emotional memory enhancement is driven solely
by arousal, for example, does not satisfy this requirement.
In Experiments 3 and 4, positive and negative words were
equated for arousal, yet memory for positive word pairs
was enhanced while memory for negative word pairs
was equal to, or even below, memory for low arousal neutral word pairs.
The theory that emotional memory enhancement in
free recall is due to greater semantic cohesion or relatedness among emotional items is also not supported by the
current results. We consistently found differences in memory performance despite equating positive, negative, and
neutral words and word pairs for both normative (free
association norms and latent semantic analysis) and subjective relatedness; furthermore, clustering scores in free
recall indicated that participants did not organize their recall output along emotional lines.
Why is cued recall of emotionally positive word pairs
enhanced relative to neutral pairs, but not cued recall of
emotionally negative word pairs? Indeed, there are few
theories that make differential predictions for memory
for negative versus positive materials. One explanation
for the enhanced cued recall of positive words is that positive emotional material is intrinsically rewarding, and rewards enhance memory processes. For example, when
photos are preceded by small versus large monetary rewards for later recognition, there is correlated activity in
the hippocampus and reward pathways that predicts
which items will be later remembered (Adcock, Thangavel,
Whitfield-Gabrieli, Knutson, & Gabrieli, 2006). The reward
hypothesis has also been invoked to account for the
rewarding effects of a smile (Tsukiura & Cabeza, 2008);
namely, smiling faces lead to better memory for the facial
expression even when it is later cued with a name. In contrast, people are less likely to remember earlier facial
expressions of surprise, anger, or fear (Shimamura, Ross,
& Bennett, 2006). Similarly, the rewarding nature of positive word pairs may have caused them to be better remembered than either neutral or negative pairs. However, this
theory does not make differential predictions for words
versus word pairs; namely, it cannot explain why positive
word pairs, but not individual positive words, were recalled better than their negative counterparts.
A second possible explanation arises from research on
the influence of positive mood on creativity. Isen and colleagues (Ashby, Isen, & Turken, 1999; Isen & Daubman,
1984; Isen, Daubman, & Nowicki, 1987; Isen, Johnson,
Mertz, & Robinson, 1985) argue that positive mood states
enhance creative thinking such that people may be better
able to relate distinct concepts to one another when they
are in a positive mood or when the materials themselves
are positive (see also Fredrickson, 2001). Participants in
a positive mood perform better on the Remote Associates
Test than participants in a neutral or negative mood (Isen
et al., 1987) and participants give more unusual first associates to positive words than to either negative or neutral
words in a free association task (Isen et al., 1985). This ability to see relationships between seemingly unrelated positive words may have helped our participants form unique
and memorable relations among the arbitrary words comprising a to-be-remembered pair when that pair was positive rather than negative or neutral.
Conclusions
Affective reactions serve as a quick heuristic
for a variety of judgments, including attitudes toward
people, outcomes in choice situations, and the likelihood of
future events. The current results illustrate that emotionality is a key cue for memorability judgments, such that people predict greater memorability for positive and negative
words and word pairs relative to neutral words and word
pairs. Performance, and thereby the diagnosticity of emotionality as a cue, however, differed as a function of the
type of memory test and the emotional tone of the materials. In a cued recall test, participants remembered more
positive, but not more negative, pairs than neutral pairs
and were thus more overconfident in their memory predictions for negative word pairs. Just as confidence at retrieval
can be inflated by the emotionality of the retrieved flashbulb event (Talarico & Rubin, 2003), judgments of cued recall memorability at encoding were inflated by negative
emotionality. Emotionality signals, “I’ll remember this!”
In some cases, such as memory for individual items, this
signal is correct; however, in other important cases, such
as memory for interconnected elements of negative events,
this signal is misleading.
Acknowledgments
The authors would like to thank Heather Ashley, Laura
Bauerband, Jasmina Diaz, Annie Honigfort, Amanda Hultgren, Nicholas Karr, Becki Kielaszek, Alexis Flores, Lina Gomez, Matt Hester, Jessica Lopez, Richard Molina, Iraida
Neira, Laura Olivos, and David Schell, the undergraduate
research assistants who helped with data collection for
these experiments. We would also like to thank Robert
Greene, Matthew Rhodes, and three anonymous reviewers
for their insightful comments on an earlier version of this
paper.
References
Adcock, R. A., Thangavel, A., Whitfield-Gabrieli, S., Knutson, B., & Gabrieli,
J. D. E. (2006). Reward-motivated learning: Mesolimbic activation
precedes memory formation. Neuron, 50, 507–517.
Anderson, A. K., & Phelps, E. A. (2001). Lesions of the human amygdala
impair enhanced perception of emotionally salient events. Nature,
411, 305–309.
Ashby, F. G., Isen, A. M., & Turken, A. U. (1999). A neuropsychological
theory of positive affect and its influence on cognition. Psychological
Review, 106, 529–550.
Aupée, A. M. (2007). A detrimental effect of emotion on picture
recollection. Scandinavian Journal of Psychology, 48, 7–11.
Begg, I., Duft, S., Lalonde, P., Melnick, R., & Sanvito, J. (1989). Memory
predictions are based on ease of processing. Journal of Memory and
Language, 28, 610–632.
Benjamin, A. S., Bjork, R. A., & Schwartz, B. L. (1998). The mismeasure of
memory: When retrieval fluency is misleading as a metamnemonic
index. Journal of Experimental Psychology: General, 127, 55–68.
Bradley, M. M., Greenwald, M. K., Petry, M. C., & Lang, P. J. (1992).
Remembering pictures: Pleasure and arousal in memory. Journal of
Experimental Psychology: Learning, Memory, and Cognition, 18,
379–390.
Bradley, M. M., & Lang, P. J. (1999). Affective norms for English words
(ANEW): Instruction manual and affective ratings. Technical report C1, The Center for Research in Psychophysiology, University of Florida.
Buchanan, T. W., Etzel, J. A., Adolphs, R., & Tranel, D. (2006). The influence
of autonomic arousal and semantic relatedness on memory for
emotional words. International Journal of Psychophysiology, 61, 26–33.
Cahill, L., Haier, R. J., Fallon, J., Alkire, M. T., Tang, C., Keator, D., et al.
(1996). Amygdala activity at encoding correlated with long-term free
recall of emotional information. Proceedings of the National Academy of
Sciences of the United States of America, 93, 8016–8021.
Calvo, M. G., & Lang, P. J. (2004). Gaze patterns when looking at emotional
pictures: Motivationally biased attention. Motivation and Emotion, 28,
221–243.
Castel, A. D. (2007). The adaptive and strategic use of memory by older
adults: Evaluative processing and value-directed remembering. In A.
S. Benjamin & B. H. Ross (Eds.). The psychology of learning and
motivation (Vol. 48, pp. 225–270). London: Academic Press.
Claypool, H. M., Hall, C. E., Mackie, D. M., & Garcia-Marques, T. (2007).
Positive mood, attribution, and the illusion of familiarity. Journal of
Experimental Social Psychology, 44, 721–728.
Clore, G. L., & Storbeck, J. (2006). Affect as information about liking,
efficacy, and importance. In J. Forgas (Ed.), Affect in social thinking and
behavior (pp. 123–142). New York: Psychology Press.
Cook, G. I., Hicks, J. L., & Marsh, R. L. (2007). Source monitoring is not
always enhanced for valenced material. Memory & Cognition, 35,
222–230.
Doerksen, S., & Shimamura, A. P. (2001). Source memory enhancement for
emotional words. Emotion, 1, 5–11.
Dougal, S., Phelps, E. A., & Davachi, L. (2007). The role of medial temporal
lobe in item recognition and source recollection of emotional stimuli.
Cognitive, Affective, & Behavioral Neuroscience, 7, 233–242.
Dougal, S., & Rotello, C. M. (2007). “Remembering” emotional words is
based on response bias, not recollection. Psychonomic Bulletin &
Review, 14, 423–429.
Fredrickson, B. L. (2001). The role of positive emotions in positive
psychology: The broaden-and-build theory of positive emotions.
American Psychologist, 56, 218–226.
Friendly, M., Franklin, P. E., & Rubin, D. C. (1982). The Toronto Word Pool:
Norms for imagery, concreteness, orthographic variables, and
grammatical usage for 1080 words. Behavior Research Methods &
Instrumentation, 14, 375–399.
Garcia-Marques, T., Mackie, D. M., Claypool, H. M., & Garcia-Marques, L.
(2004). Positivity can cue familiarity. Personality and Social Psychology
Bulletin, 30, 585–593.
Harris, C. R., & Pashler, H. (2005). Enhanced memory for negatively
emotionally charged pictures without selective rumination. Emotion,
5, 191–199.
Hertel, P. T., & Parks, C. (2002). Emotional episodes facilitate word recall.
Cognition & Emotion, 16, 685–694.
Isen, A. M., & Daubman, K. A. (1984). The influence of affect on
categorization. Journal of Personality and Social Psychology, 47,
1206–1217.
Isen, A. M., Daubman, K. A., & Nowicki, G. P. (1987). Positive affect
facilitates creative problem solving. Journal of Personality and Social
Psychology, 52, 1122–1131.
Isen, A. M., Johnson, M. M., Mertz, E., & Robinson, G. F. (1985). The
influence of positive affect on the unusualness of word associations.
Journal of Personality and Social Psychology, 48, 1413–1426.
Johnson, E. J., & Tversky, A. (1983). Affect, generalization, and the
perception of risk. Journal of Personality and Social Psychology, 45,
20–31.
Kelley, C. M., & Jacoby, L. L. (1996). Adult egocentrism: Subjective
experience versus analytic bases for judgment. Journal of Memory and
Language, 35, 157–175.
Kensinger, E. A., & Corkin, S. (2003). Memory enhancement for emotional
words: Are emotional words more vividly remembered than neutral
words? Memory & Cognition, 31, 1169–1180.
Koriat, A. (1997). Monitoring one’s knowledge during study: A cue-utilization approach to judgments of learning. Journal of Experimental
Psychology: General, 126, 349–370.
Koriat, A., & Bjork, R. A. (2005). Illusions of competence in monitoring
one’s knowledge during study. Journal of Experimental Psychology:
Learning, Memory, and Cognition, 31, 187–194.
Koriat, A., & Bjork, R. A. (2006). Mending metacognitive illusions: A
comparison of mnemonic-based and theory-based predictions.
Journal of Experimental Psychology: Learning, Memory, and Cognition,
32, 1133–1145.
Koriat, A., Bjork, R. A., Sheffer, L., & Bar, S. K. (2004). Predicting one’s own
forgetting: The role of experience-based and theory-based processes.
Journal of Experimental Psychology: General, 133, 643–656.
Koriat, A., Ma’ayan, H., Sheffer, L., & Bjork, R. A. (2006). Exploring a
mnemonic debiasing account of the underconfidence-with-practice
effect. Journal of Experimental Psychology: Learning, Memory, and
Cognition, 32, 595–608.
Koriat, A., Sheffer, L., & Ma’ayan, H. (2002). Comparing objective and
subjective learning curves: Judgments of learning exhibit increased
underconfidence with practice. Journal of Experimental Psychology:
General, 131, 147–162.
Kučera, H., & Francis, W. N. (1967). Computational analysis of present-day
American English. Providence, RI: Brown University.
LaBar, K. S., & Cabeza, R. (2006). Cognitive neuroscience of emotional
memory. Nature Reviews Neuroscience, 7, 54–64.
LaBar, K. S., & Phelps, E. A. (1998). Arousal-mediated memory
consolidation: Role of the medial temporal lobe in humans.
Psychological Science, 9, 490–493.
Landauer, T. K., Foltz, P. W., & Laham, D. (1998). Introduction to latent
semantic analysis. Discourse Processes, 25, 259–284.
Lench, H. C. (2009). Automatic optimism: The affective basis of judgment
about the future likelihood of events. Journal of Experimental
Psychology: General, 138, 187–200.
Levinger, G., & Clark, J. (1961). Emotional factors in the forgetting of word
associations. Journal of Abnormal and Social Psychology, 62, 99–105.
Maddock, R. J., & Frein, S. T. (2009). Reduced memory for the spatial and
temporal context of unpleasant words. Cognition & Emotion, 23,
96–117.
Magnusson, S., Andersson, J., Cornoldi, C., Da Beni, R., Endestad, T.,
Goodman, G. S., et al. (2006). What people believe about memory.
Memory, 14, 595–613.
Maratos, E. J., Allan, K., & Rugg, M. D. (2000). Recognition memory for
emotionally negative and neutral words: An ERP study.
Neuropsychologia, 38, 1452–1465.
Mather, M. (2007). Emotional arousal and memory binding: An object-based framework. Perspectives on Psychological Science, 2, 33–52.
McDowal, J. (1994). Recall of associates generated to emotionally toned
stimulus words. Canadian Journal of Experimental Psychology, 48,
82–94.
McGaugh, J. L. (2004). The amygdala modulates the consolidation of
memories of emotionally arousing experiences. Annual Review of
Neuroscience, 27, 1–28.
Monin, B. (2003). The warm glow heuristic: When liking leads to
familiarity. Journal of Personality and Social Psychology, 85, 1035–1048.
Nagae, S., & Moscovitch, M. (2002). Cerebral hemispheric differences in
memory of emotional and non-emotional words in normal
individuals. Neuropsychologia, 40, 1601–1607.
Nelson, T. O. (1984). A comparison of current measures of the accuracy of
feeling-of-knowing predictions. Psychological Bulletin, 95, 109–133.
Nelson, D. L., McEvoy, C. L., & Schreiber, T. A. (1998). The University of
South Florida word association, rhyme, and word fragment norms.
<http://www.usf.edu/FreeAssociation/>.
Ochsner, K. N. (2000). Are affective experiences richly recollected or
simply familiar? The experience and process of recognizing feelings
past. Journal of Experimental Psychology: General, 129, 242–261.
Öhman, A., Flykt, A., & Esteves, F. (2001). Emotion drives attention:
Detecting the snake in the grass. Journal of Experimental Psychology:
General, 130, 466–478.
Onoda, K., Okamoto, Y., & Yamawaki, S. (2009). Neural correlates of
associative memory: The effects of negative emotion. Neuroscience
Research, 64, 50–55.
Pham, M. T. (2004). The logic of feeling. Journal of Consumer Psychology, 14,
360–369.
Phelps, E. A. (2004). Human emotion and memory: Interactions of the
amygdala and hippocampal complex. Current Opinion in Neurobiology,
14, 198–202.
Rhodes, M. G., & Castel, A. D. (2008). Memory predictions are influenced
by perceptual information: Evidence for metacognitive illusions.
Journal of Experimental Psychology: General, 137, 615–625.
Roenker, D. L., Thompson, C. P., & Brown, S. C. (1971). Comparison of
measures for the estimation of clustering in free recall. Psychological
Bulletin, 76, 45–48.
Rubin, D. C., & Friendly, M. (1986). Predicting which words get recalled:
Measures of free recall, availability, goodness, emotionality, and
pronunciability for 925 nouns. Memory & Cognition, 14, 79–94.
Schmidt, S. (1991). Can we have a distinctive theory of memory? Memory
& Cognition, 19, 523–542.
Schmidt, S., & Saari, B. (2007). The emotional memory effect: Differential
processing or item distinctiveness? Memory & Cognition, 35,
1905–1916.
Schwartz, B. L., & Metcalfe, J. (1992). Cue familiarity but not target
retrievability enhances feeling-of-knowing judgments. Journal of
Experimental Psychology: Learning, Memory, and Cognition, 18,
1074–1083.
Schwarz, N., & Clore, G. L. (1983). Mood, misattribution, and judgments of
well-being: Informative and directive functions of affective states.
Journal of Personality and Social Psychology, 45, 513–523.
Schwarz, N., & Clore, G. L. (2007). Feelings and phenomenal experiences.
In A. W. Kruglanski & E. T. Higgins (Eds.), Social psychology: Handbook
of basic principles (2nd ed., pp. 385–407). New York, NY: Guilford
Press.
Sharot, T., Delgado, M. R., & Phelps, E. A. (2004). How emotion enhances
the feeling of remembering. Nature Neuroscience, 7, 1376–1380.
Shimamura, A. P., Ross, J. G., & Bennett, H. D. (2006). Memory for facial
expressions: The power of a smile. Psychonomic Bulletin & Review, 13,
217–222.
Slovic, P., Finucane, M., Peters, E., & MacGregor, D. G. (2002). Rational
actors or rational fools: Implications of the affect heuristic for
behavioral economics. Journal of Socio-Economics, 31, 329–342.
Stormark, K. M., Nordby, H., & Hugdahl, K. (1995). Attentional shifts to
emotionally charged cues: Behavioural and ERP data. Cognition &
Emotion, 9, 507–523.
Talarico, J. M., & Rubin, D. C. (2003). Confidence, not consistency,
characterizes flashbulb memories. Psychological Science, 14, 455–461.
Talmi, D., Luk, B. T. C., McGarry, L. M., & Moscovitch, M. (2007). The
contribution of relatedness and distinctiveness to emotionally-enhanced memory. Journal of Memory and Language, 56, 555–574.
Talmi, D., & Moscovitch, M. (2004). Can semantic relatedness explain the
enhancement of memory for emotional words? Memory & Cognition,
32, 742–751.
Talmi, D., Schimmack, U., Paterson, T., & Moscovitch, M. (2007). The role
of attention and relatedness in emotionally enhanced memory.
Emotion, 7, 89–102.
Tsukiura, T., & Cabeza, R. (2008). Orbitofrontal and hippocampal
contributions to memory for face-name associations: The rewarding
power of a smile. Neuropsychologia, 46, 2310–2319.
Van Overschelde, J. P., & Nelson, T. O. (2006). Delayed judgments of
learning cause both a decrease in absolute accuracy (calibration) and
an increase in relative accuracy (resolution). Memory & Cognition, 34,
1527–1538.
Wilson, M. D. (1988). The MRC psycholinguistic database: Machine
readable dictionary, version 2. Behavior Research Methods,
Instruments, & Computers, 20, 6–11.