Journal of Memory and Language 62 (2010) 240–253

Journal homepage: www.elsevier.com/locate/jml

“I’ll remember this!” Effects of emotionality on memory predictions versus memory performance

Carissa A. Zimmerman *, Colleen M. Kelley
Department of Psychology, Florida State University, USA

Article info

Article history: Received 19 December 2008; revision received 6 November 2009; available online 19 January 2010

Keywords: Memory; Metacognition; Emotion; Free recall; Cued recall

Abstract

Emotionality is a key component of subjective experience that influences memory. We tested how the emotionality of words affects memory monitoring, specifically, judgments of learning, in both cued recall and free recall paradigms. In both tasks, people predicted that positive and negative emotional words would be recalled better than neutral words. That prediction was valid for free recall of positive, negative, and neutral words, but invalid for cued recall of negative word pairs compared to neutral and positive pairs; only positive emotional pairs showed enhanced recall relative to neutral pairs. Consequently, people exhibited extreme overconfidence for cued recall of negative word pairs on the first study-test trial. We demonstrate that emotionality does not globally enhance memory, but rather has specific effects depending on the valence and task. Results are discussed in terms of this complex relationship between emotionality and memory performance and the subsequent variations in diagnosticity of emotionality as a cue for memory monitoring.

© 2009 Elsevier Inc. All rights reserved.

Introduction

The emotional quality of events plays an important role in online monitoring, drawing attention to both aversive and appetitive signals (Calvo & Lang, 2004; Öhman, Flykt, & Esteves, 2001; Stormark, Nordby, & Hugdahl, 1995).
Given that emotion plays a role in how we monitor our external environment, it may also affect how we monitor the internal environment of our thoughts and memories. However, little research has examined the role that emotionality plays in memory monitoring, and most has focused on monitoring at retrieval via confidence judgments (Talarico & Rubin, 2003) and remember/know judgments (Kensinger & Corkin, 2003; Ochsner, 2000; Sharot, Delgado, & Phelps, 2004). Surprisingly, the emotionality of events tends to be associated with a stronger feeling of remembering and enhanced confidence at retrieval even when that confidence is not warranted. For example, people are highly confident in their memory for the details accompanying emotional flashbulb memories despite the fact that these details are no more likely to be accurately remembered than details of less emotional events (Talarico & Rubin, 2003).

* Corresponding author. Address: Department of Psychology, Florida State University, Tallahassee, FL 32306, USA. E-mail address: zimmerman@psy.fsu.edu (C.A. Zimmerman).
0749-596X/$ - see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.jml.2009.11.004

In this paper, we ask how the emotional quality of words affects memory monitoring during encoding, in particular, the subjective feeling that one will remember in the future. This subjective memorability is assessed via judgments of learning (JOLs). Knowing the degree to which an experience is memorable is particularly important for memory control processes, such as investing time and effort in encoding when one wants to guarantee that one will remember. For example, if it is important to remember a person’s name or a conversation with one’s boss, memory monitoring allows one to assess whether that name or conversation will be remembered, or if extra steps, such as elaborative encoding or the creation of a physical record, are needed in order to guarantee later availability of the information.
There is no research, however, investigating such memory monitoring of emotional material.

Following Koriat’s (1997) cue-utilization framework, we propose that people’s ability to monitor their learning of emotional versus neutral material might depend upon two general classes of information. The first is a theory-based analytic inference. Through experience, people may have acquired the knowledge that emotional events tend to be memorable, in the same way that they tend to believe that cued recall for a related word pair (tea–coffee) will be better than cued recall for an unrelated word pair (tea–skater). If this is the case, during study emotionality operates as a theory-based intrinsic cue to memorability, a cue that is inherent to the word, picture, or other material to be remembered. Indeed, a recent survey of adults found that people believe that “dramatic” events are more memorable than daily events (Magnusson et al., 2006), a finding consistent with the ideas that emotion is a salient component of lay theories of memory and that, based on such a theory, people will generally predict better memory for emotional compared to neutral material.

People also use experience-based or nonanalytic heuristics to monitor memorability (Kelley & Jacoby, 1996; Koriat, 1997); these are subjective, internal cues such as ease of processing (Begg, Duft, Lalonde, Melnick, & Sanvito, 1989; Benjamin, Bjork, & Schwartz, 1998; Rhodes & Castel, 2008). The attention-grabbing nature of emotional material or the emotional reaction itself may be taken as an indication that the item is memorable. In this case, emotion would be acting as an experience-based intrinsic cue.
For example, Monin (2003) found that liking is misinterpreted as familiarity on a memory test (see also Claypool, Hall, Mackie, & Garcia-Marques, 2007; Garcia-Marques, Mackie, Claypool, & Garcia-Marques, 2004); similarly, emotional reactions during encoding could be subjectively experienced as indicators of memorability.

The validity of the cues that are the basis for memory monitoring determines the accuracy of the monitoring process. The accuracy of JOLs is measured in two different ways. Calibration is a measure of absolute accuracy, or how well on average participants’ JOLs match their actual level of recall. Resolution is a measure of relative accuracy, or the extent to which JOLs distinguish between later recalled and non-recalled items.

With regard to emotional materials, free recall is better for both positive and negative stimuli, including words, pictures, film clips, and narrated slide shows, than for their neutral counterparts (Bradley, Greenwald, Petry, & Lang, 1992; Cahill et al., 1996; Doerksen & Shimamura, 2001; Harris & Pashler, 2005; Hertel & Parks, 2002; Kensinger & Corkin, 2003; Nagae & Moscovitch, 2002; Ochsner, 2000; Rubin & Friendly, 1986). Enhanced free recall has been attributed to greater distinctiveness (Schmidt, 1991; Schmidt & Saari, 2007; Talmi, Luk, McGarry, & Moscovitch, 2007), increased attention during encoding (Anderson & Phelps, 2001; Calvo & Lang, 2004; Öhman et al., 2001), and, at longer-term retention intervals, to greater consolidation of memories for emotionally arousing events (LaBar & Cabeza, 2006; LaBar & Phelps, 1998; McGaugh, 2004; Phelps, 2004). Thus, emotionality ought to be a valid cue for judging the memorability of individual words in a free recall paradigm.

In contrast, it is difficult to predict the validity of emotionality as an indicator of memorability in cued recall, because there is little research on emotional material and intentional associative memory.
Indeed, we found no prior studies of cued recall using emotional words. We found a single study of associative recognition of negative compared to neutral word pairs, which reported worse recognition of negative pairs compared to neutral pairs (Onoda, Okamoto, & Yamawaki, 2009; however, other characteristics of the items that might affect memorability were not equated for emotional versus neutral pairs). Immediate cued recall of prior free associations to emotional versus neutral words (not encoded intentionally) also reveals a disadvantage for emotional words (Levinger & Clark, 1961; McDowal, 1994). In contrast, Tsukiura and Cabeza (2008) found that, after studying pairs of names and faces, cued recall of the facial expression was more accurate for names that had been paired with happy rather than neutral faces. Studies of memory for incidentally encoded source characteristics, such as spatial and temporal information, show no advantage for emotional words or pictures compared to their neutral counterparts. Maddock and Frein (2009) found equivalent memory for the spatial and temporal context of neutral and positive words, and worse context memory for negative words. Recollection of whether a picture appeared in a first versus second list was lower for emotional pictures compared to neutral pictures (Aupée, 2007), and memory for whether a word was read versus heard was lower for emotional than for neutral words (Cook, Hicks, & Marsh, 2007; see Mather, 2007, for a review of incidental binding of emotional material and related information). Even intentional encoding of source characteristics such as font color seems not to be enhanced for emotional material (Dougal, Phelps, & Davachi, 2007), and was worse for negative compared to neutral words (Maddock & Frein, 2009, Experiment 4). 
Thus, if people judge emotional material as more memorable in preparation for a task that requires binding two items together, such as cued recall, it is not at all clear that they will be correct, particularly for negative material.

The current experiments explore whether people judge the memorability of emotional versus neutral words differently by asking for JOLs prior to a cued recall test (Experiments 1, 3, and 4) and a free recall test (Experiments 2 and 3). We predict that people use the emotionality of words as a cue to memorability on both types of memory test, and so will give higher JOLs to emotional than to neutral words and word pairs. Given that emotionality is a valid cue for free recall, memory monitoring should be relatively effective; however, emotionality of word pairs may be a misleading cue to memorability in cued recall, resulting in overconfidence for emotional pairs.

Experiment 1

In Experiment 1, we assessed whether people used the emotionality of words as a cue for JOLs in a paired associates paradigm across two study-test trials. Participants studied pairs of negative or neutral words and rated the likelihood that they would be able to recall the second word when presented with the first. We predict that participants will use emotionality of items as a cue in making JOLs, and so will give higher JOLs to negative compared to neutral word pairs on Trial 1. However, given that some types of associative memory are not better for negative compared to neutral materials, as reviewed above, we predict that this will lead to marked overconfidence.
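The overconfidence predicted here can be made concrete as a signed difference between the mean JOL and the percentage of targets recalled. The following Python sketch is only an illustration under stated assumptions: the function name `calibration_bias` and the sample values are hypothetical, and the analyses reported in this paper compare JOLs and recall within an ANOVA rather than computing this difference score.

```python
from statistics import mean


def calibration_bias(jols, recalled):
    """Return mean JOL (0-100) minus percentage recalled (0-100).

    Positive values indicate overconfidence, negative values indicate
    underconfidence. This is a hypothetical helper for illustration.
    """
    if not jols or len(jols) != len(recalled):
        raise ValueError("jols and recalled must be non-empty and equal in length")
    mean_jol = mean(jols)
    percent_recalled = 100.0 * sum(bool(r) for r in recalled) / len(recalled)
    return mean_jol - percent_recalled


# Made-up data for four word pairs: JOLs average 65, but only 50% are recalled.
example_jols = [80, 60, 70, 50]
example_recalled = [True, False, False, True]
example_bias = calibration_bias(example_jols, example_recalled)
```

With these made-up values the mean JOL is 65 and recall is 50%, so the score is +15, which would count as overconfidence.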
After repeated study-test trials, particular experience-based cues such as familiarity of an item and memory for a prior test become available; because these mnemonic cues are so diagnostic of later memory performance, with repeated study-test cycles people typically abandon other cues intrinsic to the studied material, such as word pair relatedness, and shift to using the mnemonic cues as a basis for JOLs (Koriat, 1997). In the current study, we anticipate that emotionality will be a cue to memorability on the first study trial, but that participants will shift to more diagnostic mnemonic cues (e.g., memory for whether they recalled a target on the first test) as a basis for JOLs on the second study trial. Additionally, we explore the effect of emotionality on memory control processes by including self-paced as well as experimenter-paced study time conditions. If people believe that emotional material is highly memorable, they may study it for less time than neutral material.

Method

Participants

Forty-eight undergraduate students enrolled in an introductory psychology course at Florida State University participated in exchange for partial course credit. Students were randomly assigned to either the self-paced study time or experimenter-paced study time condition and tested individually.

Materials

Eighty-four nouns were chosen from the Affective Norms for English Words (ANEW; Bradley & Lang, 1999); 40 of the words were neutral (M = 5.2 on ANEW’s scale from 1 [unpleasant] to 9 [pleasant]) and 44 were negative (M = 2.5). Four additional neutral words were taken from Rubin and Friendly (1986), who used a rating scale from 1 (bad) to 7 (good); these four neutral words all received ratings in the range of 4–6.
Both valence and arousal are important components of emotionality, so in addition to being more negative, emotional words were also selected to be more arousing according to the ANEW (M = 5.8 on ANEW’s scale from 1 [calm] to 9 [aroused]) than neutral words (M = 4.2), F(1, 82) = 73.79, MSE = .73, p < .001. All negative and neutral words were equated for frequency (Kučera & Francis, 1967; M = 26.3 for negative, M = 34.0 for neutral), F(1, 82) = 1.18, MSE = 1048.23, p = .28, length (M = 5.6 for both), F < 1, concreteness (Friendly, Franklin, & Rubin, 1982; Nelson, McEvoy, & Schreiber, 1998; M = 4.8 for negative, M = 4.9 for neutral), F < 1, imageability (Wilson, 1988; M = 514.1 for negative, M = 519.8 for neutral), F(1, 72) = 1.04, MSE = 10234.78, p = .31, and familiarity (Wilson; M = 527.0 for negative, M = 503.0 for neutral), F < 1.

From these nouns, two types of word pairs were formed: 22 negative pairs (e.g., prison–cancer) and 22 neutral pairs (e.g., violin–avenue). A pair-wise latent semantic analysis (LSA; Landauer, Foltz, & Laham, 1998) on the individual word pairs ensured that the words comprising each pair were unrelated to one another. The average pair-wise similarity score for all word pairs was .04; no pair had a similarity score greater than .1 (LSA similarity scores are on a scale from −1.0 to +1.0). Additionally, neutral noun pairs were no more related overall than negative noun pairs (mean similarity score = .04 for both), F < 1. Words were further assessed for relatedness using free association norms (Nelson et al., 1998). No cue was listed as an associate of its paired target word; additionally, the highest probability of producing any word in the list in response to a cue word in the list was .13. There was no significant difference in the forward association strengths of negative (M = .05) and neutral word sets (M = .04), F < 1.

Procedure

Participants studied the same list of words for two study-test trials.
Word pair order in both the study and test phases was newly randomized for each participant. In the study phase of each trial, participants saw the 44 pairs presented one at a time on a computer screen. In the experimenter-paced condition, each pair was presented for 5 s; in the self-paced condition, participants were able to study the pair for as long as they chose. For participants in the self-paced condition, study time was measured from presentation of the pair to when the participant pressed the spacebar to bring up the next pair. In both conditions, the inter-stimulus interval between the pair and the JOL prompt was 250 ms. Participants in both conditions were instructed to study each pair such that they would later be able to recall the second word (target word) when presented with the first (cue word). Immediately after each pair was presented, participants estimated the probability of recalling the target word when prompted by the cue word on a scale from 0 (certain will not recall the target word) to 100 (certain will recall the target word).

Immediately after each study phase, participants moved on to a test phase. In the test phase, each of the 44 cue words appeared on the screen one at a time. Participants were instructed to type in the target word that was paired with that cue word during the study phase; they were given a maximum of 15 s to do so. The next cue word was presented immediately after the participant entered his or her response.

Results and discussion

There were no significant interactions with study time condition, so we collapsed across this factor in all of the following analyses. Contrary to our prediction, negative pairs were not studied for significantly less time (M = 7077.5 ms) than neutral pairs (M = 7557.1 ms) on Trial 1, F(1, 18) = 3.20, MSE = 683381.4, p = .09.

Recall

Misspelled items and plurals were counted as correct in this and subsequent experiments.
A Trial (1, 2) × Emotionality (negative, neutral) repeated-measures ANOVA on the percentage of words recalled found no main effect of emotionality; in fact, there was a trend towards recall being worse for negative (M = 33.5) compared to neutral pairs (M = 36.3), F(1, 47) = 2.65, MSE = 141.20, p = .11, η² = .05. There was a significant main effect of trial such that recall improved for both item types from the first to the second trial, F(1, 47) = 204.27, MSE = 167.36, p < .001, η² = .81 (see Fig. 1).

Commission errors

We also examined the possibility that participants tended to make more commission errors for negative items than for neutral items. A commission error was the production of any word other than the correct target in response to the cue. A repeated-measures ANOVA performed on the number of commission errors revealed a main effect of emotionality, F(1, 47) = 53.93, MSE = 9.34, p < .001, η² = .53; overall, participants committed more commission errors for negative cues (M = 8.7 out of 22) compared to neutral cues (M = 5.5 out of 22).

Calibration

Overall. Calibration is the average correspondence between JOLs and recall. Following Koriat (1997), we included JOL and recall percentages as two levels of one independent variable designated “measure” (see also Koriat & Bjork, 2005, 2006; Koriat, Bjork, Sheffer, & Bar, 2004; Koriat, Ma’ayan, Sheffer, & Bjork, 2006; Koriat, Sheffer, & Ma’ayan, 2002; Van Overschelde & Nelson, 2006). A Trial (1, 2) × Emotionality (negative, neutral) × Measure (JOL, recall) repeated-measures ANOVA revealed a significant 3-way interaction, F(1, 46) = 8.65, MSE = 31.44, p < .01, η² = .16, illustrated in Fig. 1. We followed up on the interaction with separate analyses of Trial 1 and Trial 2 because we predicted that emotionality of the word pair would be a cue for JOLs on Trial 1, whereas mnemonic information such as memory for the prior test would be the predominant cue for JOLs on Trial 2.

Fig. 1. Experiment 1 mean judgment of learning (JOL) and recall as a function of trial and pair emotionality. Error bars represent standard errors.

Trial 1. An Emotionality (negative, neutral) × Measure (JOL, recall) repeated-measures ANOVA revealed a significant interaction, F(1, 47) = 40.99, MSE = 44.42, p < .001, η² = .47. Participants were overconfident in their memory for negative pairs on Trial 1, F(1, 47) = 43.50, MSE = 288.48, p < .001, η² = .48; in fact, participants believed that they would recall almost twice as many negative pairs as they actually did. Participants were also overconfident in their memory for neutral pairs, F(1, 47) = 8.71, MSE = 306.63, p < .01, η² = .16, although to a lesser extent; this finding of some overconfidence for neutral pairs is common in the metacognition literature (Koriat & Bjork, 2005; Koriat, Ma’ayan, Sheffer, & Bjork, 2006; Koriat, Sheffer, & Ma’ayan, 2002).

Trial 2. Results from the second trial also revealed a significant Emotionality × Measure interaction, F(1, 47) = 10.19, MSE = 36.75, p < .01, η² = .18. The typical underconfidence with practice effect (Koriat, Sheffer, & Ma’ayan, 2002) was found for both negative, F(1, 47) = 8.78, MSE = 267.15, p < .01, η² = .16, and neutral pairs, F(1, 47) = 19.60, MSE = 293.15, p < .001, η² = .29, although it was less pronounced for negative pairs.

Resolution

Resolution is a measure of the relative accuracy of JOLs and is commonly indexed by within-subject Goodman–Kruskal gamma correlations between JOLs and recall accuracy (Nelson, 1984). JOL–recall gammas were calculated separately for negative and neutral pairs.
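As a rough illustration of how such a gamma can be computed for one participant in one condition, the Python sketch below counts concordant and discordant item pairs and ignores pairs tied on either variable, which is the standard definition of the Goodman–Kruskal statistic. The function name `goodman_kruskal_gamma` and the sample data are hypothetical and are not taken from the experiment.

```python
from itertools import combinations


def goodman_kruskal_gamma(jols, recalled):
    """Gamma = (C - D) / (C + D) over all item pairs.

    C counts pairs where the item with the higher JOL was recalled and the
    other was not; D counts the reverse. Pairs tied on JOL or on recall
    outcome are ignored. Returns None when C + D == 0, for example when
    every item has the same recall outcome, because gamma is then undefined.
    """
    concordant = 0
    discordant = 0
    for (jol_a, rec_a), (jol_b, rec_b) in combinations(zip(jols, recalled), 2):
        product = (jol_a - jol_b) * (int(rec_a) - int(rec_b))
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    if concordant + discordant == 0:
        return None
    return (concordant - discordant) / (concordant + discordant)


# Made-up data: four items with JOLs on the 0-100 scale and recall outcomes.
example_gamma = goodman_kruskal_gamma([90, 70, 40, 20], [1, 1, 0, 1])
```

In this made-up case two usable pairs are concordant and one is discordant, so gamma is 1/3. A participant whose items all share the same recall outcome yields no usable pairs, and the function returns `None`.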
Twelve participants were excluded because they had no variability in recall in one of the conditions (one participant recalled all targets on Trial 2, three recalled all targets in the neutral condition on Trial 2, seven recalled no neutral targets on Trial 1, and one recalled no negative targets on Trial 1). One participant was excluded for giving all negative pairs a JOL of 20 on Trial 2. Because gammas could not be computed for these participants, only the remaining 35 participants were included in the following analyses (see Schwartz & Metcalfe, 1992, Experiment 3, for a similar number of exclusions).

Consistent with previous findings (Koriat, 1997), JOL–recall gammas increased across trials, F(1, 34) = 8.52, MSE = .16, p < .01, η² = .20. This increase in gammas is typically interpreted as a shift away from the use of intrinsic cues, which would include the emotional quality of word pairs, to make JOLs and toward the use of mnemonic cues, which are typically quite diagnostic of memory performance. There was also a significant main effect of emotionality, F(1, 33) = 15.04, MSE = .13, p < .001, η² = .31. JOL–recall gammas were worse overall for negative pairs (M = .34) compared to neutral pairs (M = .53); participants were less able to distinguish between pairs that they would and would not recall when those pairs were negative rather than neutral. However, to preview, this finding did not replicate in the subsequent experiments and will not be discussed further. The interaction between trial and emotionality was not significant, F < 1.

To measure whether people were reliant on emotionality as a cue on Trial 1, but less reliant on emotionality and more reliant on mnemonic cues on Trial 2, we calculated the gamma correlation between emotionality (negative versus neutral) and JOLs (see Koriat, 1997 for a similar analysis in the case of pair relatedness as a cue).
The gamma correlation between JOL and emotionality reflects the extent to which high JOLs are assigned to negative items and low JOLs are assigned to neutral items; in this case, a high JOL–emotionality gamma would indicate that negative emotionality is used as a cue for JOLs. The JOL–emotionality gamma decreased significantly from the first to second trial, F(1, 45) = 31.09, MSE = .04, p < .001, η² = .41 (see Fig. 2). This pattern indicates that emotionality of the pairs is a cue used to make JOLs on Trial 1, but it is abandoned in favor of more diagnostic mnemonic cues on Trial 2 (e.g., Koriat, 1997).

Fig. 2. Experiment 1 mean JOL–recall and JOL–emotionality gammas as a function of trial. Error bars represent standard errors.

Summary

As predicted, emotionally negative word pairs were judged as far more memorable than neutral word pairs; however, negative word pairs did not live up to their apparent memorability. Cued recall was not better for negative word pairs relative to neutral word pairs, resulting in overconfidence on Trial 1. On Trial 2, participants shifted away from reliance on emotionality to make their JOLs, presumably to more mnemonic cues; calibration shifted to underconfidence for both negative and neutral pairs, with more underconfidence for neutral pairs.

Experiment 2

In Experiment 2, we explored whether the negative emotionality of words enhanced their subjective memorability in a free recall paradigm. Most experiments find better recall of negative words and pictures relative to neutral materials, and so we predicted that the use of emotionality as a cue for JOLs would produce equivalent monitoring for negative and neutral words.
The exceptions to this pattern of superior recall of emotional items are found in experiments that attempt to equate negative and neutral words on subjective measures of relatedness via ratings, based on the claim that relatively short-term emotional memory enhancement is due to the categorical nature of emotional items themselves (Schmidt & Saari, 2007; Talmi, Luk, et al., 2007; Talmi & Moscovitch, 2004; Talmi, Schimmack, Paterson, & Moscovitch, 2007). In contrast, we use the same method as in Experiment 1 of equating materials on LSA and free association norms (see also Dougal & Rotello, 2007). Additionally, we examine this semantic-cohesion explanation of emotional memory enhancement by measuring the extent to which negative and neutral words cluster together in recall. If emotionally negative items are recalled better by virtue of more relational processing among negative items, then we would expect to see clustering by emotion category in participants’ recall output.

Method

Participants

Thirty-six undergraduates enrolled in an introductory psychology course at Florida State University participated in exchange for partial course credit. All participants were tested individually.

Materials

Forty-four nouns were chosen from the Experiment 1 materials; half of the words were neutral (M = 5.17 on ANEW’s scale from 1 [unpleasant] to 9 [pleasant]) and half were negative (M = 2.45). Negative emotional words (M = 5.9) were also selected to be more arousing than neutral words (M = 4.2), F(1, 42) = 65.55, MSE = .50, p < .001. Negative and neutral words were equated for frequency (Kučera & Francis, 1967; M = 27.9 for negative, M = 30.3 for neutral), length (M = 5.8 for negative, M = 5.4 for neutral), concreteness (Friendly et al., 1982; Nelson et al., 1998; M = 5.1 for both), familiarity (Wilson, 1988; M = 530.7 for negative, M = 519.2 for neutral), and imageability (Wilson; M = 543.1 for negative, M = 521.8 for neutral), all F < 1.
A matrix LSA (Landauer et al., 1998) on the individual words ensured that negative words were no more inter-related than neutral words, F < 1; the average similarity score was .08 for both negative and neutral word sets. Words were also examined using free association norms (Nelson et al., 1998); only seven words had associates on the list. The mean probability of producing any word in response to another was .02 for negative words and .08 for neutral words, F < 1.

Procedure

The experiment again consisted of two study-test trials. Word order in both the study and test phases was newly randomized for each participant. In the study phase, each of the 44 words was presented on a computer screen for 5 s, with an inter-stimulus interval of 250 ms. Participants were instructed to study each word such that they would later be able to recall it on their own. After the 5 s study period for each word, the following question appeared on the screen: “What are the chances that you will recall this word?” Participants responded by typing in their answer on a scale from 0 (certain will not recall the word) to 100 (certain will recall the word). Immediately after each study phase, participants moved on to a test phase. During the test phase, participants were told to type in as many of the words as they could recall; they were given as much time as they needed to do so.

Results and discussion

Recall

A Trial (1, 2) × Emotionality (negative, neutral) repeated-measures ANOVA on the percentage of words recalled revealed that, in contrast to the results for cued recall in Experiment 1, negative words were recalled better than neutral words, F(1, 35) = 20.71, MSE = 155.42, p < .001, η² = .37 (see Fig. 3).
Higher free recall of negative words is consistent with previous research (Bradley et al., 1992; Cahill et al., 1996; Harris & Pashler, 2005; Nagae & Moscovitch, 2002; Rubin & Friendly, 1986), and occurred even though emotional and neutral words were equated for relatedness using LSA and free association norms. Recall improved from Trial 1 to Trial 2 equally for both word types, F(1, 35) = 205.01, MSE = 79.12, p < .05, η² = .85 (F < 1 for the interaction). These results support our prediction that emotionality is a valid cue for JOLs in a free recall test.

Calibration

A Trial (1, 2) × Emotionality (negative, neutral) × Measure (JOL, recall) repeated-measures ANOVA yielded a significant main effect of trial, F(1, 35) = 55.92, MSE = 172.55, p < .001, η² = .62. There was also a significant main effect of emotionality, F(1, 35) = 41.75, MSE = 167.55, p < .001, η² = .54. Not only were negative words better recalled than neutral words, but JOLs were also higher for negative words than neutral words, which is consistent with the use of emotionality as a cue for JOLs. We also found a significant Trial × Measure interaction, F(1, 35) = 51.82, MSE = 129.38, p < .001, η² = .60, as shown in Fig. 3. On Trial 1, participants were equally overconfident in their judgments of negative and neutral words, as revealed by the simple effect of measure across emotionality, F(1, 35) = 10.58, MSE = 491.50, p < .01, η² = .23. On Trial 2, JOLs shifted to equal underconfidence for both negative and neutral words, F(1, 35) = 5.29, MSE = 360.44, p < .05, η² = .13. No other interactions were significant, all F < 1.

Fig. 3. Experiment 2 mean judgment of learning (JOL) and recall as a function of trial and pair emotionality. Error bars represent standard errors.

Resolution

Gammas could not be calculated for one participant because he had no variability in JOLs in one of the conditions (he gave all neutral words a JOL of 50 on Trial 2); this participant was excluded from the analysis.
Consistent with previous findings (Koriat, 1997), there was a main effect of trial, F(1, 34) = 7.63, MSE = .15, p < .01, η² = .18; JOL–recall gammas increased across trials for both negative and neutral words (see Fig. 4). Monitoring resolution was equally effective for negative (M = .37) and neutral words (M = .38), F < 1.

We again examined whether emotionality was a cue for Trial 1 JOLs, and whether that cue was abandoned in favor of more valid mnemonic cues on Trial 2 by computing JOL–emotionality gammas for each trial. The JOL–emotionality gammas were significantly greater than zero on both Trial 1, t(34) = 6.01, p < .001, d = 1.02, and Trial 2, t(34) = 6.49, p < .001, d = 1.10 (see Fig. 4). This positive relationship between emotionality and JOLs suggests that participants were in fact differentially assigning high and low JOLs based on a given word’s emotionality such that negative words received higher JOLs than neutral words. In contrast to Experiment 1, the JOL–emotionality gammas did not drop from Trial 1 to Trial 2 as mnemonic cues became available, F < 1, as illustrated in Fig. 4. One explanation for this pattern is that emotionality could serve as an intrinsic cue on both trials. Alternatively, the intrinsic cue of emotionality could be used on Trial 1, and mnemonic cues used on Trial 2, as in Experiment 1, but a high correlation between emotionality and mnemonic cues in the free recall task would produce no change in the JOL–emotionality gamma on Trial 2.

Fig. 4. Experiment 2 mean JOL–recall and JOL–emotionality gammas as a function of trial. Error bars represent standard errors.

Semantic clustering

We calculated Adjusted Ratio of Clustering (ARC; Roenker, Thompson, & Brown, 1971) scores to examine clustering according to the emotional category of negative versus neutral.
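For readers unfamiliar with the measure, the Python sketch below shows one way an ARC score can be computed from the order in which a participant recalls items. It assumes the usual formulation attributed to Roenker et al. (1971), ARC = (R − E(R)) / (maxR − E(R)), where R is the number of adjacent same-category recalls, E(R) = Σnᵢ²/N − 1, and maxR = N − k for N recalled items drawn from k categories. The function name `arc_score` and the sample recall sequences are hypothetical.

```python
from collections import Counter


def arc_score(recall_categories):
    """Adjusted Ratio of Clustering for one recall protocol.

    recall_categories lists the category label of each recalled item in
    output order. Returns None when the score is undefined, for example
    when nothing is recalled or all recalled items share one category.
    """
    n_total = len(recall_categories)
    if n_total == 0:
        return None
    counts = Counter(recall_categories)
    # R: adjacent recalls that come from the same category.
    observed = sum(a == b for a, b in zip(recall_categories, recall_categories[1:]))
    # E(R): repetitions expected by chance given the category counts.
    expected = sum(n * n for n in counts.values()) / n_total - 1
    # maxR: repetitions if recall were perfectly grouped by category.
    maximum = n_total - len(counts)
    if maximum == expected:
        return None
    return (observed - expected) / (maximum - expected)


# Made-up protocols: perfectly clustered versus strictly alternating recall.
clustered = arc_score(["neg", "neg", "neg", "neu", "neu", "neu"])
alternating = arc_score(["neg", "neu", "neg", "neu"])
```

Under these assumptions, perfectly clustered output gives a score of 1, chance-level clustering gives 0, and output that alternates between categories gives a negative score.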
The ARC score is the ratio of actual category repetitions above chance to the total possible category repetitions above chance. ARC scores were significantly different from zero on the second trial only (M = .12), t(35) = 3.08, p < .01, d = .5. The lack of clustering according to emotional category on the first recall trial (M = .04) indicates little within-emotion relational processing, which undermines the argument that superior recall of negative items is based on greater semantic cohesion among emotional items (Buchanan, Etzel, Adolphs, & Tranel, 2006; Maratos, Allan, & Rugg, 2000; Talmi, Luk, et al., 2007; Talmi & Moscovitch, 2004; Talmi, Schimmack, et al., 2007), at least when the relatedness is equated for negative and neutral words using LSA and free association norms, as in the current experiment. Although ARC scores were significantly different from zero by the second trial, they were still numerically quite small, only 12% above chance.

Summary

Participants again judged negative emotional items as more memorable than neutral items. However, in the current free recall task, negative words were, in fact, recalled more often than neutral words, which made emotionality a valid cue for predicting recall. Thus, accuracy of monitoring performance did not differ based on emotionality; both calibration and resolution were equivalent for negative and neutral words. Lastly, clustering scores were not significantly different from zero on Trial 1, indicating that participants did not organize their recall output by emotionality.

There are a number of theories as to why free recall is better for emotionally charged words than neutral words. In the long term, consolidation processes may be modulated by the amygdala, producing less forgetting for emotional items.
At the shorter study-test intervals used in the current experiment, by contrast, one major theory is that emotional items constitute a semantic category that encourages more relational processing among like items (e.g. Talmi & Moscovitch, 2004); however, the lack of clustering in our free recall data does not support this semantic-cohesion theory of emotional memory enhancement.

Experiment 3

Thus far we have demonstrated that, although individual negative emotional words are indeed better recalled than neutral words, pairs of negative words are not better recalled than their neutral counterparts. However, participants predicted that they would remember both individual negative words and negative word pairs better than neutral words and pairs. This prediction was accurate in the case of free recall, but led to marked overconfidence in the case of cued recall. Because the cued recall results, as well as the monitoring results, for negative emotional words are novel, Experiment 3 aims to replicate these findings. In addition, we tested the nature of memory and monitoring performance for positive words and word pairs. In Experiments 1 and 2, negative and neutral pairs differed in both valence and arousal, and either dimension could be driving the higher judgments of memorability. In Experiment 3, we equate the positive words and negative words on arousal, with both being more arousing than the neutral words. Experiment 3 also controls for subjective relatedness of the word pairs. When an independent set of 15 participants rated the relatedness of the two words forming a pair in Experiment 1, they perceived the negative pairs as more related, even though relatedness was controlled via free association norms and LSA.
In Experiment 3, we controlled for both objective and subjective relatedness of positive, negative, and neutral word pairs; thus Experiment 3 tests whether emotionality per se, independent of perceived and actual relatedness, affects both JOLs and memory performance in cued recall. We also include free recall versus cued recall as a between-subjects manipulation. As noted in the introduction, to our knowledge Experiment 1 is the first study to measure cued recall of words varying in emotionality, and we found no better recall of negative than neutral words. The extension to positive word pairs is also a first, and therefore we are unable to predict whether memory will be better for positive word pairs than neutral pairs. However, we do predict that people will judge positive word pairs as more memorable than neutral pairs, regardless of the accuracy of that prediction. Experiment 3 used a single study-test trial, in order to focus on the use of emotionality as a cue to memorability.

Method

Participants

Forty undergraduate students enrolled in an introductory psychology course at Florida State University participated in exchange for partial course credit. All participants were randomly assigned to either the cued recall or free recall condition and were tested individually.

Materials

Eighty-one nouns were chosen from the Affective Norms for English Words (ANEW; Bradley & Lang, 1999); 27 of the words were neutral (M = 5.2 on ANEW’s scale from 1 [unpleasant] to 9 [pleasant]), 26 were negative (M = 2.6), and 28 were positive (M = 7.4). Three additional words (two negative and one neutral) were taken from Rubin and Friendly (1986), who used a rating scale from 1 (bad) to 7 (good); the negative words both had ratings below three and the neutral word had a rating of five. Both positive and negative words were also more arousing than neutral words according to the ANEW (M = 5.6 on ANEW’s scale from 1 [calm] to 9 [aroused] for positive, M = 5.8 for negative, M = 4.1 for neutral), F(2, 79) = 32.00, MSE = .74, p < .001; positive and negative words did not differ from each other in arousal, F < 1. All positive, negative, and neutral words were equated for frequency (Kuçera & Francis, 1967; M = 26.5 for positive, M = 24.1 for negative, M = 27.7 for neutral), F < 1, length (M = 5.5 for positive, M = 5.6 for negative, M = 5.8 for neutral), F < 1, concreteness (Friendly et al., 1982; Nelson et al., 1998; M = 4.9 for positive, M = 5.2 for negative, M = 5.1 for neutral), F < 1, and imageability (Wilson, 1988; M = 560.9 for positive, M = 538.8 for negative, M = 517.9 for neutral), F(2, 62) = 1.53, MSE = 6447.47, p = .22. A matrix LSA (Landauer et al., 1998) on each word set ensured that, overall, negative and positive words were no more related than neutral words (M = .13 for all), F < 1. Words were also assessed for relatedness using free association norms (Nelson et al., 1998). No cue was listed as an associate of its paired target word; additionally, the highest probability of producing any word in the list in response to a cue word was .04. There was no significant difference in the forward association strengths of positive (M = .05), negative (M = .03), or neutral word sets (M = .03), F < 1. From these nouns, three types of word pairs were formed for the cued recall condition: 14 positive pairs, 14 negative pairs, and 14 neutral pairs. All pairs were rated for relatedness by an independent set of participants (n = 18). Overall, negative (M = 2.6 on a scale from 1 [not related] to 7 [very related]) and positive pairs (M = 2.5) were no more subjectively related than neutral pairs (M = 2.3), F < 1. A pair-wise LSA (Landauer et al., 1998) on the individual word pairs further ensured that the words comprising each pair were unrelated to one another.
The average pair-wise similarity score for all word pairs was .051; no pair had a similarity score greater than .2. Additionally, neither negative noun pairs (M = .036) nor positive noun pairs (M = .065) were more related overall than neutral noun pairs (M = .053), F < 1. The targets from the cued recall condition served as the stimuli in the free recall condition. Both positive and negative targets were more arousing (M = 5.9 for positive; M = 6.2 for negative) than neutral targets (M = 4.1), F(1, 80) = 33.43, MSE = .73, p < .001; positive and negative targets did not differ from each other in arousal, F < 1. All positive, negative, and neutral targets were equated for frequency (Kuçera & Francis, 1967; M = 23.3 for positive, M = 25.5 for negative and neutral), length (M = 6.1 for positive, M = 5.5 for negative, M = 5.9 for neutral), concreteness (Friendly et al., 1982; Nelson et al., 1998; M = 4.8 for positive, M = 5.1 for negative, M = 5.2 for neutral), and imageability (Wilson, 1988; M = 543.6 for positive, M = 549.4 for negative, M = 533.8 for neutral), all F < 1. A matrix LSA (Landauer et al., 1998) on each target set ensured that, overall, negative (M = .17) and positive targets (M = .18) were no more related than neutral targets (M = .15), F(2, 39) = 1.50, MSE = .002, p = .24. Targets were also assessed for relatedness using free association norms (Nelson et al., 1998). The highest probability of producing any target in the list in response to another target was .06. There was no significant difference in the forward association strengths of positive (M = .05) and negative (M = .04) target sets, F < 1; no neutral targets had associates in the list.

Procedure

The procedure in the free recall condition was as in Experiment 2, and the procedure for the cued recall condition was as in the experimenter-controlled study condition of Experiment 1.
However, in Experiment 3, word pairs (cued recall condition) or single words (free recall condition) were presented for a single study-test trial to examine the role of emotionality as an intrinsic cue to memorability.

Results and discussion

A Condition (free, cued) × Emotionality (positive, negative, neutral) × Measure (JOL, recall) mixed-measures ANOVA revealed a significant 3-way interaction, F(2, 76) = 13.94, MSE = 68.62, p < .001, η² = .27. To follow up on this interaction, we examined the cued and free recall conditions separately.

Cued recall condition

Recall. A repeated-measures ANOVA on the percentage of words correctly recalled revealed a significant main effect of emotionality, F(2, 38) = 18.35, MSE = 130.95, p < .001, η² = .49 (see Fig. 5). Follow-up tests confirmed that positive word pairs were recalled significantly better than both negative, F(1, 19) = 21.60, MSE = 150.87, p < .001, η² = .53, and neutral pairs, F(1, 19) = 32.62, MSE = 120.19, p < .001, η² = .63; recall of negative and neutral pairs did not differ significantly, F < 1.

Commission errors. A repeated-measures ANOVA on the number of commission errors revealed a significant main effect of emotionality, F(2, 38) = 3.55, MSE = 1.86, p < .05, η² = .16. Participants made significantly more commission errors for negative pairs (M = 2.7 out of 14) compared to positive pairs (M = 1.5 out of 14), F(1, 19) = 5.32, MSE = 2.49, p < .05, η² = .22, with neutral pairs falling in between (M = 2.1).

Fig. 5. Experiment 3 mean judgment of learning (JOL) and recall as a function of test type and pair emotionality. Error bars represent standard errors.

Calibration. An Emotionality (positive, negative, neutral) × Measure (JOL, recall) repeated-measures ANOVA revealed a main effect of emotionality, F(2, 38) = 21.76, MSE = 95.02, p < .001, η² = .53, which was qualified by a significant interaction, F(2, 38) = 14.10, MSE = 72.13, p < .001, η² = .43, illustrated in the left pane of Fig. 5. To follow up on this interaction, we examined calibration for each emotion category separately. For neutral and positive pairs, participants were accurate in their predictions overall; JOLs did not differ significantly from actual level of recall for either neutral, F < 1, or positive pairs, F(1, 19) = 1.44, MSE = 254.35, p = .25. For negative pairs, however, participants were again significantly overconfident in their memory, F(1, 19) = 6.55, MSE = 303.36, p < .05, η² = .26, replicating the findings of Experiment 1.

Resolution. Four participants were excluded because they had no variability in recall in one of the conditions (one recalled no negative pairs, three recalled no neutral pairs). A repeated-measures ANOVA on the JOL–recall gammas revealed no differences in relative accuracy among the emotion categories, F < 1. As previewed in Experiment 1, JOL–recall gammas were not lower for negative (M = .19) compared to neutral (M = .21) or positive pairs (M = .19). JOL–emotionality gammas were computed by collapsing across positive and negative valence; we calculated the gamma correlation between emotionality (positive/negative versus neutral) and JOLs. The average JOL–emotionality gamma was significantly different from zero (M = .32), t(19) = 8.47, p < .001, d = 1.88, indicating that participants again relied on the emotionality of word pairs as a cue to make their JOLs; higher JOLs were given to both positive and negative pairs compared to neutral pairs.

Free recall condition

Recall. A repeated-measures ANOVA on the percentage of words correctly recalled revealed a main effect of emotionality, F(2, 38) = 8.73, p < .001, η² = .32 (see Fig. 5). Both negative and positive words were recalled better than neutral words, F(1, 19) = 27.09, MSE = 79.15, p < .001, η² = .59, and F(1, 19) = 5.93, MSE = 145.26, p < .05, η² = .24, respectively. Positive and negative words did not differ significantly from each other, F(1, 19) = 1.88, MSE = 152.73, p = .19.

Calibration. An Emotionality (positive, negative, neutral) × Measure (JOL, recall) repeated-measures ANOVA revealed a main effect of measure, F(1, 19) = 14.49, MSE = 532.45, p < .001, η² = .37; participants were equally overconfident in their predictions for positive, negative, and neutral words, as illustrated in the right pane of Fig. 5. There was also a significant main effect of emotionality, F(2, 38) = 17.50, MSE = 94.95, p < .001, η² = .48. Indeed, paralleling the pattern for cued recall, JOLs were higher for negative, F(1, 19) = 24.20, MSE = 34.39, p < .001, η² = .56, and positive words, F(1, 19) = 35.14, MSE = 36.06, p < .001, η² = .65, compared to neutral words; in contrast to cued recall, however, recall was also higher for both negative and positive words, as noted above. JOLs for positive and negative words were not significantly different, F(1, 19) = 1.40, MSE = 32.60, p = .25.

Resolution. One participant was excluded because he had no variability in recall in one of the conditions (he recalled no neutral words). A repeated-measures ANOVA on the JOL–recall gammas revealed only a marginal difference among emotion categories (M = .10 for negative, M = .22 for neutral, M = .39 for positive), F(2, 36) = 2.45, MSE = .17, p = .10, η² = .12. JOL–emotionality gammas were again significantly different from zero (M = .32), t(19) = 6.52, p < .001, d = 1.45; participants used emotionality as a cue for making their JOLs, as suggested by the JOL analyses presented above.

Semantic clustering.
As in Experiment 2, we tested the semantic cohesion theory of enhanced free recall of emotional words by calculating ARC scores, using the categories “positive,” “negative,” and “neutral.” ARC scores were not significantly different from zero (M = .04), t < 1, indicating that participants did not cluster their recall responses by emotionality.

Summary

Experiment 3 investigated memory monitoring of positive words in cued and free recall tasks, and replicated the results of Experiments 1 and 2 regarding memory monitoring of negative words. In cued recall, we found no enhanced recall of negative pairs relative to neutral pairs. Surprisingly, however, the equally arousing positive pairs were remembered significantly better than either negative or neutral pairs. Participants gave high JOLs to both types of emotional words, resulting in good calibration for positive pairs, but overconfidence for negative pairs. In free recall, both positive and negative individual words were recalled better than neutral words; calibration, in terms of some overconfidence, was equal for all word types; monitoring resolution also did not vary as a function of emotionality. As in Experiment 2, there was little evidence for output clustering by the emotional category of the words, which does not support the idea that emotional words receive more relational processing by virtue of their categorical nature, as predicted by the semantic cohesion theory of emotional memory enhancement.

Experiment 4

Experiment 3 is the first to show that positive word pairs are remembered significantly better than neutral word pairs in cued recall, whereas equally arousing negative word pairs are not. The purpose of Experiment 4 was to replicate this novel result with a different set of materials.

Method

Participants

Twenty-four undergraduate students enrolled in an introductory psychology course at Florida State University participated in exchange for partial course credit. All participants were tested individually.

Materials

Eighty-four new words were chosen from the ANEW (Bradley & Lang, 1999); 28 of the words were neutral (M = 5.2 on ANEW’s scale from 1 [unpleasant] to 9 [pleasant]), 28 were negative (M = 2.6), and 28 were positive (M = 7.2). Positive emotional words were more arousing (M = 5.4 on ANEW’s scale from 1 [calm] to 9 [aroused]) than neutral words (M = 4.0), F(1, 54) = 53.23, MSE = .53, p < .001, as were negative emotional words (M = 5.7), F(1, 54) = 60.34, MSE = .66, p < .001; negative and positive words did not differ from each other in terms of arousal, F(1, 54) = 1.11, MSE = .90, p = .30. All positive, negative, and neutral words were equated for frequency (Kuçera & Francis, 1967; M = 16.9 for positive, M = 18.1 for negative, M = 19.7 for neutral), concreteness (Friendly et al., 1982; Nelson et al., 1998; Wilson, 1988; M = 5.0 for positive, M = 5.2 for negative, M = 5.4 for neutral), familiarity (Wilson; M = 499.4 for positive, M = 492.2 for negative, M = 512.5 for neutral), and imageability (Wilson; M = 536.7 for positive, M = 531.2 for negative, M = 529.6 for neutral), all F < 1. Words were assessed for relatedness using free association norms (Nelson et al., 1998). No cue was listed as an associate of its paired target word and no cue elicited another cue word’s target as a response. Additionally, only three words had associates on the list; the highest probability of producing any word in the list in response to any other word was .04. Additionally, positive, negative, and neutral words all elicited approximately the same number of associates according to Nelson et al. (1998) (M = 13.8 for positive, M = 13.9 for negative, M = 11.4 for neutral), F(2, 56) = 1.02, MSE = 35.8, p = .37.
A matrix LSA ensured that negative (M = .13) and positive word sets (M = .12) were no more related overall than the neutral word set (M = .11), F(2, 85) = 1.70, MSE = .001, p = .19. From these nouns, three types of word pairs were formed: 14 positive pairs, 14 negative pairs, and 14 neutral pairs. A pair-wise LSA (Landauer et al., 1998) on the individual word pairs ensured that the words comprising each pair were unrelated to one another. The average pair-wise similarity score for all word pairs was .05; no pair had a similarity score greater than .11. Additionally, neither negative noun pairs (M = .04) nor positive noun pairs (M = .06) were more related overall than neutral noun pairs (M = .05), F < 1.

Procedure

The general procedure was the same as the experimenter-paced condition of Experiment 1, with the exception that participants studied words for only one study-test trial. Participants saw the 42 pairs presented one at a time on a computer screen for 5 s and were instructed to study each pair such that they would later be able to recall the target word when presented with the cue word. Immediately after each pair was presented, participants estimated the probability of recalling the target word when prompted by the cue word on a scale from 0 (certain will not recall the word) to 100 (certain will recall the word). Immediately following the study phase, each of the 42 cue words appeared on the screen one at a time and participants were given 15 s to type in the target word that was paired with that cue word during the study phase. The next cue word was presented immediately after the participant entered his or her response.

Results and discussion

Recall

A repeated-measures ANOVA on the percentage of words recalled revealed a main effect of emotionality, F(2, 46) = 48.77, MSE = 79.01, p < .001, η² = .68 (see Fig. 6).
As in Experiment 3, cued recall was significantly higher for positive pairs compared to both neutral, F(1, 23) = 68.15, MSE = 60.26, p < .001, η² = .75, and negative pairs, F(1, 23) = 81.22, MSE = 86.88, p < .001, η² = .78. Additionally, participants recalled significantly fewer negative compared to neutral pairs, F(1, 23) = 4.41, MSE = 89.88, p < .05, η² = .16. We note that, in Experiment 1, recall of negative pairs trended toward being worse than recall of neutral pairs, although this difference was not significant; in Experiment 3, recall was equal for negative and neutral pairs. Therefore, we can confidently conclude that recall of negative pairs is not enhanced relative to neutral pairs; it is sometimes significantly worse and at other times equivalent.

Fig. 6. Experiment 4 mean judgment of learning (JOL) and recall as a function of pair emotionality. Error bars represent standard errors.

Commission errors

A repeated-measures ANOVA performed on the number of commission errors revealed a significant main effect of emotionality, F(2, 46) = 5.69, MSE = 2.0, p < .01, η² = .20. Follow-up tests confirmed that participants made significantly more commission errors for negative (M = 3.7 out of 14) compared to positive pairs (M = 2.3), F(1, 23) = 14.18, MSE = 1.60, p < .01, η² = .38. There was no significant difference in the number of commission errors for negative compared to neutral pairs, F(1, 23) = 2.08, MSE = 2.26, p = .16, η² = .08, although there was a trend toward participants making fewer commission errors for positive (M = 2.3) compared to neutral pairs (M = 3.1), F(1, 23) = 3.15, MSE = 2.14, p = .09, η² = .12.
Calibration

An Emotionality (positive, negative, neutral) × Measure (JOL, recall) repeated-measures ANOVA showed main effects of both emotionality, F(2, 46) = 59.22, MSE = 60.47, p < .001, η² = .72, and measure, F(1, 23) = 8.15, MSE = 274.52, p < .05, η² = .26. However, these main effects were qualified by a significant interaction, F(2, 46) = 18.71, MSE = 56.64, p < .001, η² = .45. Specifically, participants were overconfident in their memory for negative, F(1, 23) = 24.66, MSE = 156.67, p < .001, η² = .52, and neutral pairs, F(1, 23) = 4.73, MSE = 103.19, p < .05, η² = .17, but accurate in their memory predictions for positive pairs, F < 1 (see Fig. 6).

Resolution

One participant was excluded from this analysis because he had no variability in his JOLs. A repeated-measures ANOVA on the JOL–recall gammas revealed no significant effect of emotionality (M = .28 for positive, M = .46 for negative, M = .30 for neutral), F < 1; participants were equally able to distinguish between later recalled and non-recalled positive, negative, and neutral word pairs when making their JOLs. JOL–emotionality gammas were computed collapsing across positive and negative emotionality, as in Experiment 3. The average JOL–emotionality gammas were again significantly different from zero (M = .28), t(22) = 5.91, p < .001, d = 1.23, indicating that participants gave higher JOLs to both positive and negative pairs than to neutral pairs.

Summary

Experiment 4 sought to replicate the novel finding in Experiment 3 that positive word pairs were remembered significantly better than negative and neutral word pairs. Using a different set of materials, we again found that cued recall of positive pairs was significantly better than cued recall of negative and neutral pairs. Again, participants gave high JOLs to both negative and positive word pairs, resulting in calibration accuracy for positive pairs, as in Experiment 3, but overconfidence for negative pairs, as in Experiments 1 and 3.
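The relative-accuracy (resolution) measure reported throughout, the Goodman-Kruskal gamma between item-by-item JOLs and recall outcomes, can be illustrated with a short sketch. It assumes the standard definition, gamma = (C − D) / (C + D), where C and D are the numbers of concordant and discordant item pairs and tied pairs are ignored. The function name `goodman_kruskal_gamma` and the example values are hypothetical and are not the data or analysis code of these experiments.

```python
from itertools import combinations


def goodman_kruskal_gamma(jols, recalled):
    """Goodman-Kruskal gamma between JOLs (0-100) and recall (1 = recalled, 0 = not).

    Returns None when gamma is undefined, i.e. when every item pair is tied
    on at least one variable (for example, when no items were recalled).
    """
    concordant = 0
    discordant = 0
    for (jol_a, rec_a), (jol_b, rec_b) in combinations(zip(jols, recalled), 2):
        product = (jol_a - jol_b) * (rec_a - rec_b)
        if product > 0:
            concordant += 1  # higher JOL went with the recalled item
        elif product < 0:
            discordant += 1  # higher JOL went with the non-recalled item
        # product == 0: tied on JOL or on recall, so the pair is ignored
    untied = concordant + discordant
    if untied == 0:
        return None
    return (concordant - discordant) / untied


# Hypothetical participant: four items with JOLs and recall outcomes.
print(goodman_kruskal_gamma([80, 60, 40, 20], [1, 1, 0, 0]))  # JOLs track recall
print(goodman_kruskal_gamma([80, 20, 60, 40], [1, 1, 0, 0]))  # JOLs unrelated to recall
```

The undefined case in the sketch corresponds to the exclusions reported above: a participant with no variability in recall or in JOLs within a condition yields no untied pairs, so no gamma can be computed for that condition.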
We again found no monitoring differences in relative accuracy among the different emotion categories, replicating the findings of Experiment 3. General discussion The current experiments are the first to investigate the role of the emotionality of words in memory monitoring. We found that emotionality is a cue that people rely on when monitoring their learning of emotional and neutral materials; in both free and cued recall, participants predicted that emotional words would be more memorable than neutral words. In free recall (Experiments 2 and 3), participants recalled more positive and negative words, just as they predicted. Additionally, monitoring of free recall was equally accurate for emotional versus neutral words, both in terms of calibration and resolution. In cued recall (Experiments 1, 3 and 4), however, negative word pairs were not more memorable, resulting in marked overconfidence for negative pairs on the first trial, when emotionality was used as an intrinsic cue. Positive pairs, in contrast, were consistently recalled better than both negative and neutral pairs (Experiments 3 and 4), and participants were accurate in their absolute memory predictions for these pairs. Emotion as a cue for memory monitoring The most straightforward explanation for variations in the effectiveness of memory monitoring in free recall versus cued recall is that people use emotionality as a cue to predict better future memory based on a simple theory that memory for emotional events is better than memory for non-emotional events (e.g. Magnusson et al., 2006). Alternatively, the subjective experience initiated by both positive and negative words, such as the arousal they induce, may be taken as a signal of memorability. In reality, we found that the effects of emotionality on memory depend upon the valence of the word and the requirements of the task. 
Cued recall of negative words did not exceed that of neutral words whereas cued recall of positive words was almost twice that of negative and neutral words, two novel findings that we will address in the next section. Immediate affective reactions are the basis for many judgments such as how satisfied one is with one’s life, even when the true source of the reaction is something more transient, such as the weather (Pham, 2004; Schwarz & Clore, 1983, 2007). Options can be quickly evaluated based on one’s feelings and choices can be made based on affective reactions to anticipated outcomes (Johnson & Tversky, 1983; Slovic, Finucane, Peters, & MacGregor, 2002). Affective reactions are also used as a basis for predicting the future likelihood of events, with positive reactions leading to higher likelihood estimates, and negative reactions leading to lower likelihood estimates (Lench, 2009). If affective reactions were similarly used as a direct basis for predicting future likelihood of recall, one would see higher estimates for positive words and lower estimates for negative words in Experiments 3 and 4. However, both positive and negative words garnered higher estimates of future recall, so valence is clearly not influencing likelihood estimates of memory in the same way that it influences other likelihood estimates (e.g. Lench, 2009). According to the affect-as-information approach, valence signals good versus bad, while arousal signals importance (Clore & Storbeck, 2006). In Experiments 3 and 4, both positive and negative words were more arousing than the neutral words; given that arousal drives a subjective feeling of importance, arousal may also be a cue for memorability (see also Castel, 2007; Rhodes & Castel, 2008). Although arousal or the subjective experience of emotionality could be the cue for judgments of learning, it cannot be the sole mechanism accounting for emotionality effects in cued recall, as we will discuss in the next section.

Emotionality and memory for relations

The current memory results have important implications for theories of emotion and memory in that they demonstrate dissociations in memory performance depending upon the valence of the studied emotional materials. Specifically, the finding that relational memory as assessed by cued recall is enhanced for positive, but not negative, word pairs indicates that we cannot rely on a general theory of emotional effects on memory; rather, any viable theory must be able to account for differences between memory for positive and negative items. The theory that emotional memory enhancement is driven solely by arousal, for example, does not satisfy this requirement. In Experiments 3 and 4, positive and negative words were equated for arousal, yet memory for positive word pairs was enhanced while memory for negative word pairs was equal to, or even below, memory for low arousal neutral word pairs. The theory that emotional memory enhancement in free recall is due to greater semantic cohesion or relatedness among emotional items is also not supported by the current results. We consistently found differences in memory performance despite equating positive, negative, and neutral words and word pairs for both normative (free association norms and latent semantic analysis) and subjective relatedness; furthermore, clustering scores in free recall indicated that participants did not organize their recall output along emotional lines. Why is cued recall of emotionally positive word pairs enhanced relative to neutral pairs, but not cued recall of emotionally negative word pairs? Indeed, there are few theories that make differential predictions for memory for negative versus positive materials.
One explanation for the enhanced cued recall of positive words is that positive emotional material is intrinsically rewarding, and rewards enhance memory processes. For example, when photos are preceded by small versus large monetary rewards for later recognition, there is correlated activity in the hippocampus and reward pathways that predicts which items will be later remembered (Adcock, Thangavel, Whitfield-Gabrieli, Knutson, & Gabrieli, 2006). The reward hypothesis has also been invoked to account for the rewarding effects of a smile (Tsukiura & Cabeza, 2008); namely, smiling faces lead to better memory for the facial expression even when it is later cued with a name. In contrast, people are less likely to remember earlier facial expressions of surprise, anger, or fear (Shimamura, Ross, & Bennett, 2006). Similarly, the rewarding nature of positive word pairs may have caused them to be better remembered than either neutral or negative pairs. However, this theory does not make differential predictions for words versus word pairs; namely, it cannot explain why positive word pairs, but not individual positive words, were recalled better than their negative counterparts. A second possible explanation arises from research on the influence of positive mood on creativity. Isen and colleagues (Ashby, Isen, & Turken, 1999; Isen & Daubman, 1984; Isen, Daubman, & Nowicki, 1987; Isen, Johnson, Mertz, & Robinson, 1985) argue that positive mood states enhance creative thinking such that people may be better able to relate distinct concepts to one another when they are in a positive mood or when the materials themselves are positive (see also Frederickson, 2001). Participants in a positive mood perform better on the Remote Associates Test than participants in a neutral or negative mood (Isen et al., 1987) and participants give more unusual first associates to positive words than to either negative or neutral words in a free association task (Isen et al., 1985).
This ability to see relationships between seemingly unrelated positive words may have helped our participants form unique and memorable relations among the arbitrary words comprising a to-be-remembered pair when that pair was positive rather than negative or neutral. Conclusions Affective reactions are a quick heuristic that is relied upon for a variety of judgments, including attitudes toward people, outcomes in choice situations, and the likelihood of future events. The current results illustrate that emotionality is a key cue for memorability judgments, such that people predict greater memorability for positive and negative words and word pairs relative to neutral words and word pairs. Performance, and thereby the diagnosticity of emotionality as a cue, however, differed as a function of the type of memory test and the emotional tone of the materials. In a cued recall test, participants remembered more positive, but not more negative, pairs than neutral pairs and were thus more overconfident in their memory predictions for negative word pairs. Just as confidence at retrieval can be inflated by the emotionality of the retrieved flashbulb event (Talarico & Rubin, 2003), judgments of cued recall memorability at encoding were inflated by negative emotionality. Emotionality signals, ‘‘I’ll remember this!” In some cases, such as memory for individual items, this signal is correct; however, in other important cases, such as memory for interconnected elements of negative events, this signal is misleading. Acknowledgments The authors would like to thank Heather Ashley, Laura Bauerband, Jasmina Diaz, Annie Honigfort, Amanda Hultgren, Nicholas Karr, Becki Kielaszek, Alexis Flores, Lina Gomez, Matt Hester, Jessica Lopez, Richard Molina, Iraida Neira, Laura Olivos, and David Schell, the undergraduate research assistants who helped with data collection for these experiments. 
We would also like to thank Robert Greene, Matthew Rhodes, and three anonymous reviewers for their insightful comments on an earlier version of this paper.

References

Adcock, R. A., Thangavel, A., Whitfield-Gabrieli, S., Knutson, B., & Gabrieli, J. D. E. (2006). Reward-motivated learning: Mesolimbic activation precedes memory formation. Neuron, 50, 507–517.
Anderson, A. K., & Phelps, E. A. (2001). Lesions of the human amygdala impair enhanced perception of emotionally salient events. Nature, 411, 305–309.
Ashby, F. G., Isen, A. M., & Turken, A. U. (1999). A neuropsychological theory of positive affect and its influence on cognition. Psychological Review, 106, 529–550.
Aupée, A. M. (2007). A detrimental effect of emotion on picture recollection. Scandinavian Journal of Psychology, 48, 7–11.
Begg, I., Duft, S., Lalonde, P., Melnick, R., & Sanvito, J. (1989). Memory predictions are based on ease of processing. Journal of Memory and Language, 28, 610–632.
Benjamin, A. S., Bjork, R. A., & Schwartz, B. L. (1998). The mismeasure of memory: When retrieval fluency is misleading as a metamnemonic index. Journal of Experimental Psychology: General, 127, 55–68.
Bradley, M. M., Greenwald, M. K., Petry, M. C., & Lang, P. J. (1992). Remembering pictures: Pleasure and arousal in memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 379–390.
Bradley, M. M., & Lang, P. J. (1999). Affective norms for English words (ANEW): Instruction manual and affective ratings. Technical report C1, The Center for Research in Psychophysiology, University of Florida.
Buchanan, T. W., Etzel, J. A., Adolphs, R., & Tranel, D. (2006). The influence of autonomic arousal and semantic relatedness on memory for emotional words. International Journal of Psychophysiology, 61, 26–33.
Cahill, L., Haier, R. J., Fallon, J., Alkire, M. T., Tang, C., Keator, D., et al. (1996). Amygdala activity at encoding correlated with long-term free recall of emotional information. Proceedings of the National Academy of Sciences of the United States of America, 93, 8016–8021.
Calvo, M. G., & Lang, P. J. (2004). Gaze patterns when looking at emotional pictures: Motivationally biased attention. Motivation and Emotion, 28, 221–243.
Castel, A. D. (2007). The adaptive and strategic use of memory by older adults: Evaluative processing and value-directed remembering. In A. S. Benjamin & B. H. Ross (Eds.), The psychology of learning and motivation (Vol. 48, pp. 225–270). London: Academic Press.
Claypool, H. M., Hall, C. E., Mackie, D. M., & Garcia-Marques, T. (2007). Positive mood, attribution, and the illusion of familiarity. Journal of Experimental Social Psychology, 44, 721–728.
Clore, G. L., & Storbeck, J. (2006). Affect as information about liking, efficacy, and importance. In J. Forgas (Ed.), Affect in social thinking and behavior (pp. 123–142). New York: Psychology Press.
Cook, G. I., Hicks, J. L., & Marsh, R. L. (2007). Source monitoring is not always enhanced for valenced material. Memory & Cognition, 35, 222–230.
Doerksen, S., & Shimamura, A. P. (2001). Source memory enhancement for emotional words. Emotion, 1, 5–11.
Dougal, S., Phelps, E. A., & Davachi, L. (2007). The role of medial temporal lobe in item recognition and source recollection of emotional stimuli. Cognitive, Affective, & Behavioral Neuroscience, 7, 233–242.
Dougal, S., & Rotello, C. M. (2007). “Remembering” emotional words is based on response bias, not recollection. Psychonomic Bulletin & Review, 14, 423–429.
Frederickson, B. L. (2001). The role of positive emotions in positive psychology: The broaden-and-build theory of positive emotions. American Psychologist, 56, 218–226.
Friendly, M., Franklin, P. E., & Rubin, D. C. (1982). The Toronto Word Pool: Norms for imagery, concreteness, orthographic variables, and grammatical usage for 1080 words. Behavior Research Methods & Instrumentation, 14, 375–399.
Garcia-Marques, T., Mackie, D. M., Claypool, H. M., & Garcia-Marques, L. (2004). Positivity can cue familiarity. Personality and Social Psychology Bulletin, 30, 585–593.
Harris, C. R., & Pashler, H. (2005). Enhanced memory for negatively emotionally charged pictures without selective rumination. Emotion, 5, 191–199.
Hertel, P. T., & Parks, C. (2002). Emotional episodes facilitate word recall. Cognition & Emotion, 16, 685–694.
Isen, A. M., & Daubman, K. A. (1984). The influence of affect on categorization. Journal of Personality and Social Psychology, 47, 1206–1217.
Isen, A. M., Daubman, K. A., & Nowicki, G. P. (1987). Positive affect facilitates creative problem solving. Journal of Personality and Social Psychology, 52, 1122–1131.
Isen, A. M., Johnson, M. M., Mertz, E., & Robinson, G. F. (1985). The influence of positive affect on the unusualness of word associations. Journal of Personality and Social Psychology, 48, 1413–1426.
Johnson, E. J., & Tversky, A. (1983). Affect, generalization, and the perception of risk. Journal of Personality and Social Psychology, 45, 20–31.
Kelley, C. M., & Jacoby, L. L. (1996). Adult egocentrism: Subjective experience versus analytic bases for judgment. Journal of Memory and Language, 35, 157–175.
Kensinger, E. A., & Corkin, S. (2003). Memory enhancement for emotional words: Are emotional words more vividly remembered than neutral words? Memory & Cognition, 31, 1169–1180.
Koriat, A. (1997). Monitoring one’s knowledge during study: A cue-utilization approach to judgments of learning. Journal of Experimental Psychology: General, 126, 349–370.
Koriat, A., & Bjork, R. A. (2005). Illusions of competence in monitoring one’s knowledge during study. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 187–194.
Koriat, A., & Bjork, R. A. (2006). Mending metacognitive illusions: A comparison of mnemonic-based and theory-based predictions. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32, 1133–1145.
Koriat, A., Bjork, R. A., Sheffer, L., & Bar, S. K. (2004). Predicting one’s own forgetting: The role of experience-based and theory-based processes. Journal of Experimental Psychology: General, 133, 643–656.
Koriat, A., Ma’ayan, H., Sheffer, L., & Bjork, R. A. (2006). Exploring a mnemonic debiasing account of the underconfidence-with-practice effect. Journal of Experimental Psychology: Learning, Memory, and Cognition, 32, 595–608.
Koriat, A., Sheffer, L., & Ma’ayan, H. (2002). Comparing objective and subjective learning curves: Judgments of learning exhibit increased underconfidence with practice. Journal of Experimental Psychology: General, 131, 147–162.
Kučera, H., & Francis, W. N. (1967). Computational analysis of present-day American English. Providence, RI: Brown University.
LaBar, K. S., & Cabeza, R. (2006). Cognitive neuroscience of emotional memory. Nature Reviews Neuroscience, 7, 54–64.
LaBar, K. S., & Phelps, E. A. (1998). Arousal-mediated memory consolidation: Role of the medial temporal lobe in humans. Psychological Science, 9, 490–493.
Landauer, T. K., Foltz, P. W., & Laham, D. (1998). Introduction to latent semantic analysis. Discourse Processes, 25, 259–284.
Lench, H. C. (2009). Automatic optimism: The affective basis of judgment about the future likelihood of events. Journal of Experimental Psychology: General, 138, 187–200.
Levinger, G., & Clark, J. (1961). Emotional factors in the forgetting of word associations. Journal of Abnormal and Social Psychology, 62, 99–105.
Maddock, R. J., & Frein, S. T. (2009). Reduced memory for the spatial and temporal context of unpleasant words. Cognition & Emotion, 23, 96–117.
Magnusson, S., Andersson, J., Cornoldi, C., Da Beni, R., Endestad, T., Goodman, G. S., et al. (2006). What people believe about memory. Memory, 14, 595–613.
Maratos, E. J., Allan, K., & Rugg, M. D. (2000). Recognition memory for emotionally negative and neutral words: An ERP study. Neuropsychologia, 38, 1452–1465.
Mather, M. (2007). Emotional arousal and memory binding: An object-based framework. Perspectives on Psychological Science, 2, 33–52.
McDowal, J. (1994). Recall of associates generated to emotionally toned stimulus words. Canadian Journal of Experimental Psychology, 48, 82–94.
McGaugh, J. L. (2004). The amygdala modulates the consolidation of memories of emotionally arousing experiences. Annual Review of Neuroscience, 27, 1–28.
Monin, B. (2003). The warm glow heuristic: When liking leads to familiarity. Journal of Personality and Social Psychology, 85, 1035–1048.
Nagae, S., & Moscovitch, M. (2002). Cerebral hemispheric differences in memory of emotional and non-emotional words in normal individuals. Neuropsychologia, 40, 1601–1607.
Nelson, T. O. (1984). A comparison of current measures of the accuracy of feeling-of-knowing predictions. Psychological Bulletin, 95, 109–133.
Nelson, D. L., McEvoy, C. L., & Schreiber, T. A. (1998). The University of South Florida word association, rhyme, and word fragment norms. <http://www.usf.edu/FreeAssociation/>.
Ochsner, K. N. (2000). Are affective experiences richly recollected or simply familiar? The experience and process of recognizing feelings past. Journal of Experimental Psychology: General, 129, 242–261.
Öhman, A., Flykt, A., & Esteves, F. (2001). Emotion drives attention: Detecting the snake in the grass. Journal of Experimental Psychology: General, 130, 466–478.
Onoda, K., Okamoto, Y., & Yamawaki, S. (2009). Neural correlates of associative memory: The effects of negative emotion. Neuroscience Research, 64, 50–55.
Pham, M. T. (2004). The logic of feeling. Journal of Consumer Psychology, 14, 360–369.
Phelps, E. A. (2004). Human emotion and memory: Interactions of the amygdala and hippocampal complex. Current Opinion in Neurobiology, 14, 198–202.
Rhodes, M. G., & Castel, A. D. (2008). Memory predictions are influenced by perceptual information: Evidence for metacognitive illusions. Journal of Experimental Psychology: General, 137, 615–625.
Roenker, D. L., Thompson, C. P., & Brown, S. C. (1971). Comparison of measures for the estimation of clustering in free recall. Psychological Bulletin, 76, 45–48.
Rubin, D. C., & Friendly, M. (1986). Predicting which words get recalled: Measures of free recall, availability, goodness, emotionality, and pronunciability for 925 nouns. Memory & Cognition, 14, 79–94.
Schmidt, S. (1991). Can we have a distinctive theory of memory? Memory & Cognition, 19, 523–542.
Schmidt, S., & Saari, B. (2007). The emotional memory effect: Differential processing or item distinctiveness? Memory & Cognition, 35, 1905–1916.
Schwartz, B. L., & Metcalfe, J. (1992). Cue familiarity but not target retrievability enhances feeling-of-knowing judgments. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 1074–1083.
Schwarz, N., & Clore, G. L. (1983). Mood, misattribution, and judgments of well-being: Informative and directive functions of affective states. Journal of Personality and Social Psychology, 45, 513–523.
Schwarz, N., & Clore, G. L. (2007). Feelings and phenomenal experiences. In A. W. Kruglanski & E. T. Higgins (Eds.), Social psychology: Handbook of basic principles (2nd ed., pp. 385–407). New York, NY: Guilford Press.
Sharot, T., Delgado, M. R., & Phelps, E. A. (2004). How emotion enhances the feeling of remembering. Nature Neuroscience, 7, 1376–1380.
Shimamura, A. P., Ross, J. G., & Bennett, H. D. (2006). Memory for facial expressions: The power of a smile. Psychonomic Bulletin & Review, 13, 217–222.
Slovic, P., Finucane, M., Peters, E., & MacGregor, D. G. (2002). Rational actors or rational fools: Implications of the affect heuristic for behavioral economics. Journal of Socio-Economics, 31, 329–342.
Stormark, K. M., Nordby, H., & Hugdahl, K. (1995). Attentional shifts to emotionally charged cues: Behavioural and ERP data. Cognition & Emotion, 9, 507–523.
Talarico, J. M., & Rubin, D. C. (2003). Confidence, not consistency, characterizes flashbulb memories. Psychological Science, 14, 455–461.
Talmi, D., Luk, B. T. C., McGarry, L. M., & Moscovitch, M. (2007). The contribution of relatedness and distinctiveness to emotionally-enhanced memory. Journal of Memory and Language, 56, 555–574.
Talmi, D., & Moscovitch, M. (2004). Can semantic relatedness explain the enhancement of memory for emotional words? Memory & Cognition, 32, 742–751.
Talmi, D., Schimmack, U., Paterson, T., & Moscovitch, M. (2007). The role of attention and relatedness in emotionally enhanced memory. Emotion, 7, 89–102.
Tsukiura, T., & Cabeza, R. (2008). Orbitofrontal and hippocampal contributions to memory for face-name associations: The rewarding power of a smile. Neuropsychologia, 46, 2310–2319.
Van Overschelde, J. P., & Nelson, T. O. (2006). Delayed judgments of learning cause both a decrease in absolute accuracy (calibration) and an increase in relative accuracy (resolution). Memory & Cognition, 34, 1527–1538.
Wilson, M. D. (1988). The MRC psycholinguistic database: Machine readable dictionary, version 2. Behavioral Research Methods, Instruments and Computers, 20, 6–11.