Modality Based Working Memory

James Sulzen
School of Education
Stanford University
April 1, 2001

Abstract

This study tested a hypothesis that working memory is primarily modally organized. This experiment performed a free-recall task which presented randomized stimuli sequences in seven presentation modalities (visual (V), auditory (A), haptic (H), kinesthetic (K), linguistic-auditory (LA), linguistic-visual (LV), and spatial-auditory (SA)). The same number of stimuli was presented in each modality on any given trial run. Results showed recall was linearly dependent upon the number of items in each modality up to a limit of about three items presented for a modality and then leveled out thereafter. Recency and primacy effects indicated that at least several of the modality recall sequences operated with differing underlying processes indicating further support for the independent modalities memory hypothesis.

Table of Contents

INTRODUCTION
MODALITY-ORGANIZED COGNITION
PREDICTIONS
METHOD
RESULTS
DISCUSSION
CONCLUSION
REFERENCES
APPENDICES
APPENDIX A – LIST OF ALL STIMULI
STIMULUS CODES
APPENDIX B – ORGANIZATION OF TRIAL RUNS
APPENDIX C – PROCEDURES
APPENDIX D – ORGANIZATION AND ADMINISTRATION OF THE STIMULI
APPENDIX E – TABULATED RECALL DATA
APPENDIX F – RAW RECALL DATA
APPENDIX G – NORMALIZATION FORMULA FOR CALCULATING SERIAL RECALL PROBABILITIES
APPENDIX H – TABULATED RECENCY AND PRIMACY DATA
APPENDIX I – RAW RECENCY AND PRIMACY DATA
APPENDIX J – CLUSTERING OF SUBJECT RESPONSES

Introduction

"The study of models of memory often seems like a backwater in the overall study of memory. Models do not have a prominent place in experimental studies of memory and they are not used or examined by most researchers in the field...
Recent development of models of long-term memory has proceeded relatively independently of other areas of memory research." (Ratcliff & McKoon, 2000, p. 571) Studies of human short term and working memory have a very rich and long history (Ebbinghaus, 1885; James, 1890; Miller, 1956; and for surveys: Crowder, 1993; Bower, 2000; Baddeley, 2000 ). A number of models of human memory and working memory have been proposed and tested over time, especially those involving verbal or visual elements. There have also been a number of studies demonstrating various modal forms of short term memory (STM) such as for haptic and olfactory capacities (Schurman, 1973; White, 1997). Baddeley and Hitch’s (1974) classic modal model of working memory combining a spatio-visual, phonological, and executive control system was an initial attempt to articulate perceived modal-related sub-components of working memory. Since then, it seems reasonable to suppose that working memory is in fact fractionated among a number of modular systems as evidence accumulates for the existence of more and more different components (Weiskrantz, 1987; Baddeley, 2000). Recently, fMRI evidence has started to accumulate for a neurological basis for the phonological loop (Paulesu, Frith, & Frackowiak, 1993; Awh et al., 1996) and even for a modal basis of representing categories of objects such as living things (Schill-Thompson et al., 1999). In addition to the mounting evidence that both working memory and perhaps long term memory (LTM) are organized along modal lines, there is strong evidence to indicate that the modal systems highly interact with each other. In the Schill-Thompson study (1999), it appears that visual centers are always activated whenever a subject is asked to think about any aspect of a living thing (even such as parts of or the food of living things – i.e., “are snails edible”). 
This is taken to indicate that the category of living things seems to have a primary visual element which seems principally responsible for triggering other modalities, and brain damage to a modal visual area might therefore well impair retrieval of the associated memories in the other modalities.

Cross-modal priming is a fairly clear example of interaction. McKone and Dennis (2000) found that auditory or visual stimuli acted to prime stimuli in the other modality. Perhaps of more interest for the present paper, they found that same-modality priming has a greater effect than cross-modality priming, and that visual and auditory priming of nonwords differ (auditory performs better). They interpret these results as indicating a perceptual locus for priming, with some form of weak re-encoding occurring to effect the cross-modal priming.

There is also evidence for non-sensory based, but modal storage. Penney (1989) reviewed the literature on auditory and visual modality effects and concluded that auditory and visually presented words were re-encoded in a phonological store accessible from either, and that the auditory and visual channels represent two separate processing streams. Her argument is based upon five points:
1) Improved ability to perform two concurrent verbal tasks when different input modalities are employed relative to the single-mode situation;
2) Improved memory when different items are presented to two sensory modalities rather than one;
3) Selective interference effects within, but less so across, modalities;
4) Subjects' preference for, and greater efficiency of, recall organized by modality than by time of presentation; and
5) The presence of short-term memory deficits that appear to be specific to the auditory or visual modalities.
Additionally, Penney showed that bilingual speakers prefer to organize recall tasks by modality of presentation, as opposed to organizing recall by language of presentation, time of presentation, or category of item. Another bilingual study (Dehaene et al., 1999) showed that precise arithmetic calculations are carried out in one’s native language (i.e., the language in which arithmetic was presumably learned), whereas approximate arithmetic calculations are carried out via visual and spatial means. This finding, in conjunction with the concept of the independent phonological store, implies that language, or perhaps rather a linguistic capability, exists independently of any of the standard modalities.

On an informal, but perhaps intuitively satisfying basis, William James (1890) provided an elegant example of cross-modal encoding of knowledge as far back as 1890. Holding open the lips prior to thinking of any word with labials or dentals such as "bubble" or "toddle" distinctly affects most people's recall process. (“Is your image under these conditions distinct? To most people the image is at first ‘thick’ as the sound would be if they tried to pronounce it with lips parted.” p. 63). This would seem to be an example of interfering with a cross-modal retrieval across at least the haptic (touch), kinesthetic (sensomotor), visual, and verbal systems.

Given the evidence for some sort of modularized sub-specialization of working memory, some of which certainly seems to organize along modal lines, it seems reasonable to suppose that each modal sensory system may have its own working memory component. Goldstone and Barsalou (1998) have argued that there are many reasons to believe that much of cognition is perceptually based and proceeds via perceptual representation processes.
They argue along the following lines:
1) Many, if not all, of the properties associated with amodal symbol systems (such as productivity) can be achieved with perceptually-based systems;
2) Raw perceptual processing is often much more powerful for certain tasks than an equivalent amodal system;
3) Perception naturally supports similarity;
4) Perception can be readily tuned to conceptual demands;
5) Perceptual simulation occurs in conceptual tasks which have no explicit perceptual demands (for example, Maxwell’s imagining microscopic spinning spheres in dielectrics when developing his Electrodynamics equations (Nersessian, unpublished), or Einstein utilizing his visualizations of space-time when developing relativity).

Countering these claims and conjectures are theories of episodic, semantic, and other memory organization (Baddeley, 2000). There is also strong evidence that people can organize working memory around categories. That is to say, structuring items by category in effect seems to create something of a “separate” short term memory for each category, leading to a two to three fold improvement in working memory capacity (Watkins & Peynircioglu, 1983; Bower et al., 1969). These category effects even show recency and primacy effects. We will address these issues of categorization and non-modal organization in the discussion section.

Modality-Organized Cognition

The evidence for multiple, modality-related working memory components leads to a supposition that perhaps each modality has its own working memory and some level of cognitive processing capability. If each modality has its own working memory and processing capability, then why not its own long term memory and its own deeper cognitive processing capability? Following these conjectures to some sort of logical conclusion leads to a possible memory and cognitive functional organization as illustrated by Figure 1.
Figure 1 illustrates that a certain number of modal units interact with each other to create the experience of cognition. Some of these are “first-level” modal processing loci, each directly connected to its own sensory system via the sensory registers. There are also a number of “second-level” modal loci, each with its own specialization. In this model, every modality locus (hereafter referred to as a modality) is connected to and capable of stimulating or receiving stimuli from any other modality. This interaction probably operates through or in conjunction with the type of centralized switching network referred to as a “central executive” (Baddeley & Hitch, 1974). The second-level modalities have no direct connection to external sensory registers and so must receive their sensory inputs only by first-level restimulation.

The set of modalities represented here was selected because experimental evidence indicated a functional nexus for each and because they seem to represent a minimal set that spans many cognitive phenomena. There may also be “tertiary” or other modalities serving to organize social cognition, personality or other functions, but the above model does not address such possibilities. The model provides an organizing framework for representing relatively low levels of cognition involving perception and knowledge representation.

Figure 1 – Modality-organized cognition
[Diagram showing the Sensory Registers, the First Level Modalities, and the Second Level Modalities.]
Legend: A - Auditory; G - Gustatory; H - Haptic; K - Kinesthetic; O - Olfactory; V - Visual; E - Emotional/affective; L - Linguistic; S - Spatial

The rest of this writing will use the single-letter abbreviations listed in Figure 1 to identify each of the modal systems. When it is necessary or useful to distinguish which first level modality is interacting with a given second level one, the two letters are combined, so “LV” means a visually presented linguistic item, while “SA” means an auditory spatial stimulus.
Figure 1 should be interpreted in light of the following:

- Representational Systems: Each modality should be thought of as a “representational system” which represents processes, knowledge, perceptions, and sensory experience in its own particular way. V represents knowledge in pictures and images, A in sounds, and so on. K is the kinesthetic sensomotor system. L is a pure linguistic system that represents knowledge and does its processing in terms of sequenced and syntactically ordered symbols. S is a system that represents spatial knowledge and performs spatial processing. E controls our affective memories and processing. The other modalities should be self-explanatory.

- Completeness of each modality: In this model, each modality is a complete cognitive processing system with its own working memory, long term memory, and processing capabilities. The type and manner of internal organization is probably very specific to the given modality (e.g., S is probably organized very differently from E or V). This helps explain some of the modality differences observed in the literature, such as the slight superiority of recalling auditorily presented words as opposed to visually presented ones during free recall tasks.

- Cross-stimulation and multi-modal representations: Each modal system is constantly stimulating each other system with its outputs, including stimulating itself with its own outputs (i.e., feedback). This cross stimulation probably provides a capacity for feedback loops and re-encoding of stimuli, as well as higher level organizations of cognition.

The question arises as to how these separate systems combine or interact, and why it is not more obvious that such separate systems exist. Following evidence from Schill-Thompson (1999), it seems probable that it may often require several cross-stimulating modal systems to meaningfully represent concepts and various sorts of knowledge.
Consider the category of 'living things', which, according to their data, appears to have a necessary visual component, but which also has elements in other modalities to define its representation. If the visual portion of the 'living things' representation were impaired, via a lesion for example, then the other elements that make up the 'living things' representation would still be intact, but would not be capable of being stimulated. Therefore the person loses knowledge of what a 'living thing' is, even though most of the knowledge is still available (and indeed may be accessible via other cue paths). The concept of 'living things' cannot be kicked into gear because the necessary visual element is missing from the stimulus chain.

In a similar vein, James’ (1890) example with 'bubble' and 'toddle' could therefore be understood as indicating that the meaning or knowledge of these words is encoded across the L, H, K, and V modalities, and that interfering with one modality (K, when the lips are parted) interferes with the retrieval process, so that the associated V image gets changed.

As for it not being more obvious that these hypothesized internal systems have a distinct existence, the explanation might be that the extent of interactivity makes the whole seem like a monolithic entity, making it tremendously difficult to discern the individual elements. Consider, as an analogy, aborigines trying to discern the internal structure of an automobile by being able to examine only its external appearance and perhaps drive it only in very limited and controlled circumstances. With neither the concepts nor useful tools for investigating internal combustion engines, they would have little chance of deducing internal electrical, carburetion, fuel, cooling, exhaust and other internal systems (although they might be able to deduce the existence of some systems, such as steering and brakes, that have relatively easily observed external correlates).
Similarly, with the human cognitive system there is a tendency to regard memory as one large undifferentiated system with perhaps some salient subsystems such as vision, auditory, or spatial processing.

Predictions

Given the above hypothesis regarding the modal basis for cognition, the following predictions seem likely:

1) Single-modal presentations: A set of simple stimuli limited to a single modality is more likely to be primarily encoded in that modality rather than being re-encoded and cross-stored. However, it is necessary to remember that there are probably significant exceptions to this, for example the fact that people seem to be particularly capable of recoding linguistic items among the V, L, and A modalities.

2) Total working memory capacity: Assuming that each modality has an amount of independent working memory, and that each one functions similarly to how we currently consider working memory to function, then the total working memory capacity of a person should approximate the sum of the capacities of the individual modal working memories. This prediction ignores duplicate encoding effects and the apparent need to cross-modally encode some types of items (e.g., the 'living things' category), which would require using capacity from several modalities and thereby reduce the apparent total capacity of the system.

3) Testing working memory capacity: If the above predictions hold, it should be possible to present a tuned set of stimuli to fill each particular modal working memory. This should lead to an apparent increased memory capacity compared to presenting a more randomly chosen set of stimuli. In fact, there should be a linear increase in items recalled as the number of items presented in each modality is increased. At some point there should be a leveling off of the number of items recalled despite a continuing increase in the number of items presented.
4) Seven plus or minus two: Much of the free recall literature has focused on presenting either images or words as the basic stimuli (with the words having either auditory or visual presentation). In terms of the cognitive model presented here, this consists of stimuli in the V, LA, and LV modalities. As mentioned above, it seems likely that people have a facile ability to cross-encode between the L, V, and A modalities. If this is so, in terms of the free recall literature, the V, LA, and LV modalities might be thought of as one common store that should show the familiar “seven plus or minus two” total capacity limitation characteristic of unordered free recall tasks.

5) Primacy / recency effects: If each modality has its own working memory, then it seems likely each should show some of the normal primacy and recency effects (remembering initial and most recent stimuli the best).

These predictions are the basis for the experiment described in this study.

Method

Subjects

Subjects were nine volunteers, all acquainted with the author, ranging in age from mid-twenties to early sixties (mean age of 40 years); five female and four male. All subjects had either graduate degrees or were engaged in postgraduate study at major institutions.

Design

Subjects performed a free recall task in which the basic manipulation varied the number of items presented in each of seven modalities (A, V, LA, LV, H, K, and SA). These particular seven modalities were chosen because it was relatively easy to create a suitable stimulus sequence for each one to test the conjectures.

Trial run organization. Presentation sequences of stimuli (“trial runs”) were set up so each trial run had the same number of stimuli from each of the seven modalities. For example, a given trial run had two stimuli in each modality, another had three in each modality, and so on. The number of items in each modality will be referred to as the IM count (items per modality).
Within a given trial run, stimuli were randomly mixed in presentation order. (For example, for a block with two stimuli from each of the seven modalities, the order of presentation might be: A, LV, H, K, A, SA, K, LA, H, V, SA, LV, V, LA.) At the completion of each presentation sequence, subjects were asked to recall as many items as they could. Each stimulus was presented only once to any subject. Stimuli were carefully screened to avoid inadvertent redundancies (e.g., saying the word “pig” and showing a picture of a pig).

Fill sequence. As the working memory of the V, A, and L modalities is experimentally well established (Penney, 1989), it is important to establish the independent existence of other modal systems. Therefore a fill sequence of at least eight items in the V, LA, and LV modalities was included at the end of each trial run, to minimize the likelihood that subjects used these three modalities for cross-storage from the H, A, K, and SA modalities.

Materials

Table 1 gives the total number of stimuli presented in each trial run, the number of modality-specific stimuli in each run, and the length and make-up of each fill sequence. A set of cards was prepared for the experimenter to use in administering stimuli. Each card listed one stimulus and identified both the modality and the stimulus to be presented to a subject. In the case of the SA modality, a card indicated a direction relative to the subject, and the experimenter clicked a staple remover twice at the appropriate location to generate a directional sound. See Appendix A for a complete list of stimuli and Appendix D for how stimuli were organized and administered in the experiment.
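The trial-run organization and fill sequence described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure actually used (Appendix C describes that): the function name `build_trial_run`, the use of modality labels in place of actual stimuli, and the random ordering inside the fill sequence are all assumptions made here.

```python
import random

# The seven presentation modalities used in the experiment.
MODALITIES = ["A", "V", "LA", "LV", "H", "K", "SA"]
# The fill sequence draws only on these three modalities.
FILL_MODALITIES = ["V", "LA", "LV"]


def build_trial_run(im_count, fill_counts, rng=None):
    """Return a presentation order as a list of modality labels.

    im_count    -- items per modality (the IM count) in the mixed part
    fill_counts -- dict such as {"V": 2, "LA": 3, "LV": 3} for the fill sequence
    rng         -- optional random.Random instance (hypothetical parameter)
    """
    rng = rng or random.Random()

    # Mixed part: the same number of stimuli from every modality, shuffled.
    mixed = [m for m in MODALITIES for _ in range(im_count)]
    rng.shuffle(mixed)

    # Fill part: V, LA and LV items appended at the end of the run.
    # Assumption: the fill items are also presented in random order.
    fill = [m for m in FILL_MODALITIES for _ in range(fill_counts.get(m, 0))]
    rng.shuffle(fill)

    return mixed + fill


if __name__ == "__main__":
    run = build_trial_run(2, {"V": 2, "LA": 3, "LV": 3}, random.Random(0))
    print(len(run), run)
```

With an IM count of 2 and a fill sequence of 2/3/3, the sketch produces 14 mixed items followed by 8 fill items, 22 in all.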
Table 1 – Length and make-up of each trial run

                                   Number of items in each of the various modalities
Series   Trial Run   Total number  IM Count   Fill Sequence     Number of
                     of stimuli               (V / LA / LV)     Subjects
A        T0          16            -          16 (3/6/7)        4
A        T1          16            -          16 (4/5/7)        3
A        T2          22            2          8 (2/3/3)         3
A        T3          29            3          8 (2/3/3)         3
A        T6          29            3          8 (2/3/3)         3
A        T7          30            2          16 (4/6/6)        3
B        T8          29            3          8 (2/3/3)         6
B        T9          36            4          8 (2/3/3)         6
B        T10         44            5          8 (2/3/3)         5

The stimuli assigned to each trial run are listed in Appendix B. In assembling the trial runs, procedures were followed to minimize the chance that any stimulus or stimulus sequence within a trial run had an undue recall bias (see Appendix C for the details of how trial runs were assembled). Note that, in Table 1, the length of each trial run is given by IM*7 + (length of fill sequence).

Procedure

Subjects were each presented with several trial runs. Each trial run consisted of a sequence of stimuli, which, without interruption, was immediately followed by each subject’s attempt to recall as many items out of the presentation as possible in whatever manner they chose. The experimenter recorded the recalled items in the order recalled. A cassette recorder was used as a back-up to check that each subject’s recalled list was accurately recorded.

Subjects were run in two separate series (identified as series A and series B), where each series had a separate set of trial runs and its own separate group of subjects. Subjects within each series received exactly the same treatment (i.e., the same sequence of trial runs organized in exactly the same way from subject to subject). The number of subjects was too small to try balancing the order of trial runs. It simplified performing the experiment and analysis to give all subjects in a series the same treatment.
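Returning to the length relation noted under Table 1 (IM*7 + length of fill sequence), the following sketch applies it to the rows as listed. The tuple layout, the names `expected_length` and `length_differences`, and the treatment of the "-" entries for T0 and T1 as an IM count of 0 are assumptions made for illustration.

```python
# Rows transcribed from Table 1: (trial run, listed total, IM count, fill length).
# Assumption: the "-" entries for T0 and T1 are treated as an IM count of 0.
TABLE_1 = [
    ("T0", 16, 0, 16),
    ("T1", 16, 0, 16),
    ("T2", 22, 2, 8),
    ("T3", 29, 3, 8),
    ("T6", 29, 3, 8),
    ("T7", 30, 2, 16),
    ("T8", 29, 3, 8),
    ("T9", 36, 4, 8),
    ("T10", 44, 5, 8),
]

N_MODALITIES = 7  # A, V, LA, LV, H, K, SA


def expected_length(im_count, fill_length):
    """Length relation stated under Table 1: IM*7 + length of fill sequence."""
    return im_count * N_MODALITIES + fill_length


def length_differences(rows):
    """Return (trial run, listed total, computed total) where the two differ."""
    differences = []
    for name, listed_total, im_count, fill_length in rows:
        computed = expected_length(im_count, fill_length)
        if computed != listed_total:
            differences.append((name, listed_total, computed))
    return differences


if __name__ == "__main__":
    for name, listed, computed in length_differences(TABLE_1):
        print(f"{name}: listed {listed}, relation gives {computed}")
```

For the rows as transcribed here, the relation reproduces every listed total except T10, where it gives 43 against a listed 44; which of the two is off cannot be determined from the text alone.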
Two series were used because subjects were all unpaid volunteers and the initial pilot testing indicated that it was too taxing on their time to run subjects through more than three or four trial runs (especially with the longer sequences). This necessitated using two distinct groups of subjects with two separate sets of trial runs to span the necessary range of IM values. Some unfortunate differences cropped up in performance between the two series, which will be addressed in the discussion section.

No trial runs were conducted at IM=1 because of limited subject availability. It seemed superfluous to use a trial run at such a simple level because subjects would likely have a high success rate, and the IM=1 values would just show strong linearity in the data. In the interests of expediency, this linearity is assumed herein.

Results

Coding of Answers

During recall, subjects typically identified stimuli via a single word or a simple description (e.g., “pig”, “rain”, “touched me on the shoulder, back of head, and arm”, etc.). Since stimuli in the V, LV, and LA modalities had been carefully chosen to avoid any redundancies across modalities, it was easy to match the great majority of subject responses to stimulus items in these modalities. Coding the H, K, A, and SA modalities was a bit more challenging, as presentations were done non-verbally but recall was verbal, necessitating a cross-modal translation on the part of the subject to produce the recall. This meant that subjects gave a variety of descriptions, and it required some judgment in a number of cases to match a recalled item to a particular stimulus. In the case of the H, K and SA modalities, a number of stimuli were combined to simplify the coding and increase the ability to accurately distinguish subject responses. (See Appendix A for the differences between the encoding versus stimulus lists.)
Double checking was achieved by coding each recall sequence twice (with at least a few weeks between codings); there were only five items out of a total of about 370 whose coding was changed on the second go-through.

Analysis

Chart 1 shows the number of items correctly recalled by each subject for every trial run. There is a distinct linear trend (r=0.86, p<.001), but the data are rather dispersed, especially for IM>2 values. An analysis of variance showed significance, F(4,30)=4.3, p<.01. However, only the IM=0 group showed any significant difference (p<.01), while the IM=2 group showed a near significant difference from the IM=5 group (p<.08).

Chart 1 – Number of correctly recalled items (each data point represents one subject’s successful recall count for one trial run)
[Chart. Vertical axis: # items successfully recalled. Horizontal axis: IM Count (number of items presented per modality). Series: Total Recalled (V+LV+LA+A+H+K+SA); V + LV + LA.]

Also graphed on Chart 1 are the totals for the visual and linguistic items (i.e., V+LA+LV). These correspond to traditional free recall tests where subjects are presented either pictures or word items for recall. These modalities, when summed together, have no particular sensitivity to the IM count, with a correlation of r=0.04; they show a fairly constant sum of about 7.5 recalled items (SD=1.4) across all values of the IM count.

Given that V+LV+LA is a near constant, Chart 2 shows that a large fraction of the variability of the Total Recalled value can be attributed to the sum of the recalled items in the four A, H, K, and SA modalities. The two data sets correlate at r=0.89 (p < .01). The sum of A+H+K+SA also has a very linear dependency upon the IM Count, correlating with r=0.90 (p < .01).
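As a rough consistency check on these figures, the two trend lines printed on Chart 2 (y = 2.5x + 8.3 with R2 = 0.75, and y = 2.3x + 1.4 with R2 = 0.79) can be evaluated directly. The sketch below is illustrative only: pairing the first line with Total Recalled and the second with A+H+K+SA is an assumption based on the reported values, and the dictionary and function names are hypothetical.

```python
import math

# Trend lines as printed on Chart 2: (slope, intercept, R squared).
# Assumption: the first belongs to Total Recalled, the second to A+H+K+SA.
TREND_LINES = {
    "Total Recalled": (2.5, 8.3, 0.75),
    "A+H+K+SA": (2.3, 1.4, 0.79),
}


def predicted_recall(series, im_count):
    """Evaluate a trend line at a given IM count."""
    slope, intercept, _ = TREND_LINES[series]
    return slope * im_count + intercept


def implied_r(series):
    """Magnitude of the correlation implied by R squared for a one-predictor fit."""
    _, _, r_squared = TREND_LINES[series]
    return math.sqrt(r_squared)


if __name__ == "__main__":
    for im_count in range(2, 6):
        total = predicted_recall("Total Recalled", im_count)
        other = predicted_recall("A+H+K+SA", im_count)
        # The gap between the two lines estimates the V+LV+LA contribution.
        print(im_count, round(total, 1), round(other, 1), round(total - other, 1))
    for series in TREND_LINES:
        print(series, round(implied_r(series), 2))
```

Under that pairing, the gap between the two lines is 0.2*IM + 6.9, or roughly 7.3 to 7.9 items over IM=2 to IM=5, in line with the near-constant 7.5 reported for V+LV+LA. The implied correlations (about 0.87 and 0.89) are likewise close to the reported r=0.86 and r=0.90.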
Chart 2 – Graph of Total Recalled and Total of A+H+K+SA
[Chart. Vertical axis: # items. Horizontal axis: IM Count (# items per modality). Series: Total Recalled; A+H+K+SA; V + LV + LA. Trend lines shown: y = 2.5x + 8.3 (R2 = 0.75) and y = 2.3x + 1.4 (R2 = 0.79).]

Chart 3 shows the number of correctly recalled items for all IM Counts, broken out by the Series A and Series B data. (Series A and B are the two sets of trial runs and of their corresponding subjects.) It also shows the totals for the A, H, K, and SA modalities, labeled as line D in the chart. The trend lines for the A and B series data are rather distinctly different from each other and appear to break right at IM=3. This discontinuity between the two series will be covered more in the discussion section. An analysis of variance showed possible significance between the Total Recalled IM=2 group and the Series B data, with F(2,17)=2.8, p<.09.

The line labeled “C” in Chart 3 shows the average of the percentage of total items in each trial run which were correctly recalled. This shows a peak value at IM=2 and declines thereafter, indicating that subjects were recalling a smaller and smaller proportion of the total number of items presented to them. An analysis of variance showed significance, F(4,29)=3.5, p=0.05.

Chart 3 – Correctly recalled items for series A and B, and per cent recalled
[Chart. Left axis: # Correct Items. Right axis: % Correct (% correct minus fill sequence). Horizontal axis: IM Count (# of items per modality). Series #A and Series #B are shown with error bars of +/- 1 SD; plotted lines include Total Recalled (A, B), per cent correct (C), and A+H+K+SA.]

Recalled Items in Each Modality

Chart 4 – Averages of V, LV, and LA Recalled Items
[Chart. Vertical axis: # of items. Horizontal axis: IM Count. Series: V+LA+LV; LA+LV; LA; V; A; LV.]

Chart 4 shows the graph of the means of the recalled items for V, LV, LA, and A. Of particular note is the slight negative correlation between the LA+LV and V plots (one goes down where the other goes up and vice versa), r = -0.33 (p<.07).
Given the near constant value of the V+LA+LV plot, this seems like indirect evidence for recoding going on between the V and L modalities. That is, an item received in, say, LA gets recoded and stored in V, thereby lowering the number of V items that can be recalled, but apparently increasing the number of LA items that have been remembered. There is no such systematic variation between the LA and LV curves, indicating no seeming relationship; one might hope for counter-correlation since, by the memory model used here, they are using a shared resource. However, the possible occurrence of recoding into V and/or A may have obscured this relationship. Of note also is that the LA and A modalities are slightly monotonic in opposing directions (r = -0.9 of the means), perhaps indicating that the introduction of the A modality items starting at IM=2 starts to place a slight burden on the auditory system, lowering, ever so slightly, its efficiency in passing LA items through to the L modality. This is just conjecture, however, as there is no significance to any of these measures (except perhaps the V vs. LV+LA measure above).

Chart 5 – Averages for recalled items in A, H, K, and SA
[Chart. Vertical axis: # of items. Horizontal axis: IM Count. Series: SA; K; A; H.]

Chart 5 shows the plot of the recalled items from the other four modalities under study. There is a general monotonic increase for all but the H modality, which itself has a slight, puzzling downward trend after IM=3. In particular, the H, A, and K plots are distinctly similar and linear at the IM=2 and IM=3 values (varying from 1.5 to 2.4 recalled items, respectively). Since at IM=2 a maximum of two items can be recalled (and three items at IM=3, and so on), no curve can increase at a slope greater than one.
The very linear relationship in the range IM=2 to IM=3 seems to provide additional evidence that a linear process is occurring in these modalities. At IM=4 there seem to be distinct nonlinearities introduced into all but the K curve, indicative perhaps of some sort of internal effects, such as interference or capacity limitations, starting to occur within those modalities.

Of further interest might be the fact that virtually none of these curves correlate with each other on a within-subject basis (the highest correlation, on a subject-by-subject basis, is between SA and K, r=0.46, p<.05). This means that a subject performing well or poorly in a given modality seems to have no bearing on how the same subject does in other modalities for a given trial run.

Recency and Primacy Curves

Chart 6 – Probability of recall of V, LV, and LA as a function of item’s original presentation position
[Chart. Vertical axis: Probability of Recall. Horizontal axis: Serial Position of Recall (normalized for trial run length). Series: V; LA; LV; LA+LV; V+LV+LA.]

Chart 6 shows the probability of an item’s being recalled as a function of where the item occurred within the presentation sequence within its own modality. The data across trial runs and across modalities had to be normalized relative to each other to compensate for the fact that the number of items presented in any given modality varied considerably depending upon the trial run and the modality. For example, in trial run T9, there were seven LA and seven LV items, six V items, and four for each of the other modalities. This meant that each modality had a varying number of items presented, which also differed across trial runs. Calculating the probability of recall for the ith item presented in modality m therefore required normalizing the length of all modality sequences. See Appendix G for details of how the normalization was performed and the primacy/recency curves were calculated.
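One way such a normalization could work is sketched below. It is not the Appendix G formula, which is not reproduced in this section; it simply assumes that the ith of n items in a modality is mapped to the midpoint (i - 0.5)/n and pooled into ten equal-width bins, an assumption suggested by the tick marks at 0.05, 0.15, ..., 0.95 on the charts. All function names are hypothetical.

```python
from collections import defaultdict


def normalized_position(index, sequence_length):
    """Map the 1-based position of an item within its modality sequence
    onto the interval (0, 1), using the midpoint of its slot (assumption)."""
    return (index - 0.5) / sequence_length


def bin_center(position, n_bins=10):
    """Return the center of the equal-width bin that contains `position`."""
    bin_index = min(int(position * n_bins), n_bins - 1)
    return (bin_index + 0.5) / n_bins


def recall_curve(trials, n_bins=10):
    """Estimate probability of recall per normalized serial position.

    trials -- iterable of (sequence_length, recalled_positions) pairs for one
              modality, where recalled_positions holds the 1-based positions
              of the items a subject recalled on that trial run.
    Returns a dict mapping bin center to recalled/presented in that bin.
    """
    presented = defaultdict(int)
    recalled = defaultdict(int)
    for sequence_length, recalled_positions in trials:
        for index in range(1, sequence_length + 1):
            center = bin_center(normalized_position(index, sequence_length), n_bins)
            presented[center] += 1
            if index in recalled_positions:
                recalled[center] += 1
    return {center: recalled[center] / presented[center] for center in sorted(presented)}
```

Under this scheme a four-item sequence populates only four of the ten bins (centered at 0.15, 0.35, 0.65, and 0.85), whereas a seven-item sequence spreads across more of them, so pooled curves built from short sequences will look coarse.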
As to be expected, Chart 6 shows fairly solid recency and primacy effects for virtually all combinations of the V, LV, and LA modalities, one of the marks of short term memory. Note how the LA+LV plot shows that the LV and LA curves tend to cancel out their individual swings, especially in the somewhat wild LA swings of the latter third of the curve. Such behavior seems indicative of the existence of a single underlying shared resource that they are both making use of.

Chart 7 – Probability of recall of A, H, K, and SA for trial runs T9 and T10
[Chart 7: y-axis: Probability of Recall (0 to 1.2); x-axis: Serial Position of Recall (normalized to trial run length), 5% to 95%; curves: K, H, A, SA, and Total (= combined K, H, A, & SA)]

Chart 7 shows the probability of an item in the A, H, K, or SA modalities being correctly recalled, depending upon the order in which it was presented to a subject. This chart is organized identically to Chart 6, except that it only shows data for trial runs where each modality had at least four items (trial runs T9 and T10). The data was restricted to these trial runs because the other ones had modality sequences of only two or three items per modality. Such sequences are just too short to show much in the way of recency or primacy effect, especially considering the small number of subjects. (As it is, because the recall sequences are so short, there are noticeable quantization effects in the chart from the normalizing operation.)

In Chart 7, only the H curve really shows much of the normal primacy/recency curve. The other modalities seem to have no readily discernible overall pattern. Notable is the near 100% recall of the initial sound (in either A or SA) and similarly the near 100% recall of the last kinesthetic action performed across all subjects and all trial runs. There certainly generally seems to be a primacy effect (except for K).
The lack of recency may be due to the small number of items presented in each modality and to the relatively long time from the end of each sequence until recall actually started. The sequences were intermixed with each other and with the V, LV, and LA sequences. In addition, there was the fill sequence (in V, LV, and LA), which further delayed getting to the recall of these four modalities. So much delay may have limited the recency effect. Additionally, the sudden dip of the SA curve at the end may be due to a coding artifact from the last SA item in trial run T9; this stimulus seemed to have been confounded with another stimulus in that trial run. This may have artificially lowered the end of the SA curve. It would also explain the unexpectedly flat response at IM=4 for the SA curve in Chart 5.

Discussion

The modality cognition model (Figure 1) was used to make a number of predictions, which were tested in this experiment.

Total Working Memory Capacity

As predicted, the total working memory capacity seemed to be increased by using modal specific stimuli. Chart 1 shows subjects recalled about 7 to 8 items in the V, LV, and LA modalities. Chart 2 shows subjects were able to reliably recall some eight to 15 additional items by use of additional modalities. Similarly organized free recall experiments typically report recall lengths of 5 to 10 items for arbitrary-length lists of unordered stimuli presented in either V or A (Miller, 1956). In this experiment, the 15-23 items recalled by subjects is some 250% to 300% higher than those other reported rates, and similarly higher than the base rate of some 7.5 words for the V+LV+LA levels recorded in this study. The much higher recall rate in this study certainly seems indicative of some additional memory aid being employed.

Other techniques, such as categorizing the stimuli, have been used to increase recall rates.
If free recall lists are organized into categories and presented with items blocked together by category, then the recall rate seems to be improved by approximately 15% to 70%, depending upon the study (Dallett, 1964; Cofer, Bruce, & Reicher, 1966). This is still well short of what was found here (which had random presentation of stimuli).

As predicted, and as illustrated by Charts 1 through 3, there appears to be a linear increase in recalled items as IM is increased. As for the predicted leveling off of working memory capacity as IM continues to increase, the discontinuity in Chart 3 between the Series A and Series B data at IM=3 is unfortunate. Both the subjects and the organization of the trial runs changed right at this juncture when switching from Series A to Series B. This makes it difficult to discern whether the apparent leveling off is due to the increasing value of IM, the use of a different set of subjects, or the differing set of cards used in the trial runs. One reasonable interpretation of the difference in slopes of the two series is indeed that Series B represents a leveling off of the linear increase shown by the Series A trial runs. Unfortunately, the same data can also be reasonably interpreted as a noisy linear increase all the way from IM=2 to IM=5 (see Chart 1). As such, it remains for future work to establish whether the predicted leveling off of total working memory capacity does indeed occur as expected.

Seven Plus or Minus Two

The sum of the V+LV+LA values across all trial runs shows a very consistent average of seven to eight items being recalled from these three presentation modalities. Indeed, only two out of 33 trial runs with subjects had any V+LV+LA score higher than nine. The strong consistency reconfirms the seeming limits of these three modalities operating together.
The items recalled above and beyond these three modalities are strongly indicative that some other memory mechanism is operating in addition to the usual visual/verbal one as tested by traditional free recall tasks.

Primacy / Recency Effects

There is a clear primacy / recency effect for the V, LV, and LA modalities, as expected and as shown in Chart 6. Chart 7 is not nearly so clear for the A, H, K, and SA modalities. At best, there either seems to be some evidence (the primacy effect seems strong, H shows clear recency, and the sum of the curves shows primacy/recency) or there are explanations for the lack of a clearly discernible effect. It seems more data needs to be collected to resolve this. However, there seems to be reason to cautiously expect that this effect exists, especially considering the small numbers of trials and of subjects in this study.

Other Evidence

Other evidence for the modality cognition model, mentioned earlier, can be found on Charts 4 and 6. On Chart 4, the countervailing swings of the V and LV+LA curves indicate that either a shared resource or significant cross-encoding and cross-storage is occurring. Similarly, on Chart 6, the countervailing swings of the LV and LA curves indicate a shared resource constraining their individual capacities. This shared resource has often been referred to as verbal memory (Baddeley, 2000) or phonological memory (Penney, 1989). This shared construct is here characterized as a linguistic representational system which can readily take its input (with literate individuals) from either V or A. Note that this linguistic model also nicely integrates with other linguistic inputs such as Braille or American Sign Language (KL and VL).

There is other evidence developed during the course of the study, but not reported in detail due to lack of space. Independently of presentation order, subjects frequently clustered stimuli from the same modality into sequences during recall.
According to Penney (1989), when subjects are presented stimuli in varying modalities that are also organized by category, during recall they have a strong preference for clustering items via modality as opposed to category. This certainly indicates that modality is a stronger associative bond than category seems to be, and that perhaps modality association occurs because it is a deeper underlying mechanism than category. The fMRI evidence to date seems to indicate this with the demonstration that the category of “living things” is encoded across several modalities (Schill-Thompson et al., 1999).

It has been suggested that the modality enhancement effect found here is nothing other than a fancy form of categorization (Greeno, 2001). Appropriate categorization can improve free recall by 50% to 70% (Dallett, 1964), but this is only a fraction of the improvement that using multiple modalities seems to have offered here. Bower et al. (1969) showed a 150% to 350% improvement on recall when stimuli were carefully categorized and hierarchically organized, as opposed to the same items being presented in random order. Bower presented all items at once (via a printed card) in a visually, spatially, and semantically organized hierarchy and gave subjects approximately four minutes to study the hierarchy (of up to 112 items divided into some 30-40 categories). Test subjects had a visual display whose items were carefully associated as to semantic content and spatial grouping so as to make conceptual sense, whereas controls had items randomly organized into the same spatial groupings. Additionally, each subject had four opportunities to see the display and write their recall. The present study of course employed an entirely different presentation technique. However, it is notable that random presentation caused a tremendous detriment in Bower, but random organization in the present study still led to superior recall performance.
One can only conjecture what improvement it would have been to the present study’s subjects to have had all stimuli grouped by modality during presentation.

Another way of looking at Bower’s results is through the lens of the current modality model. Bower’s subjects were presented stimuli in V, LV, and SV, and had ample time and cause during the several-minute study period to recode to L and A, and possibly into other modalities. Additionally, the items were grouped into small categories of two to three items per category. As such, Bower’s results might reasonably be interpreted as comparable to the current study’s: Bower’s subjects were simultaneously using multiple modalities, just as the current study’s subjects did, and achieved similar results, and in addition had the benefit of semantic categorization. It is also clear, in the case of Bower, the present study, and other category-related studies (Dallett, 1964; Pollio, Richards, and Lucas, 1969), that organizing stimuli and hierarchically grouping them is of enormous benefit in recall.

Other studies have demonstrated greatly enhanced recall capabilities and/or recency effects via categorization or similar organization, but these too can be interpreted in terms of utilizing modalities beyond the V, LV, and LA of traditional free recall tasks. Watkins & Peynircioglu (1983) used six categories (riddles, sounds, objects, favorites, quiz questions, and drawings), which were run in two groupings of riddles-sounds-objects and favorites-quiz-drawings. The first grouping can easily be interpreted as presentations in LV-A-V and the second as E-L-K/V. They showed subjects were able to recall about five items from each of three categories; this is a performance comparable to the present study’s. As a counter-argument, studies which try to use purely taxonomic categories (except for Dallett (1964)) tend to show little benefit from categories (Cofer, Bruce, & Reicher, 1966).
The net result is that the modality utilization of the current study could be interpreted as a form of categorization. However, categorization only seems to succeed where multiple modalities are employed. This makes it seem more likely that successful categorization used to enhance recall should be interpreted as a special case of the use of multiple modalities.

Conclusion

A number of working memory phenomena were consistently reproduced in this experiment (the various well-known effects such as fixed capacity, primacy, and recency) with other phenomena occurring on top of these (greatly increased working memory capacity, for one). Various sources of evidence seem supportive of the modality cognition model proposed here. While these modality effects can be explained as a form of categorization, it seems as likely, or possibly even more likely, that categorization effects can be explained as a form of modal cross-encoding. In particular support of this viewpoint is the recent fMRI evidence of the modal organization of working memory and likely cross-modal encoding of categories.

In conclusion, this study has presented a model for how certain aspects of relatively low-level human cognition occur via a number of distinct modal loci processing centers ordered into at least two layers. Evidence was developed to show support for linguistic and spatial modalities as well as for the sensory-based ones. In fact, one reasonable interpretation of the present study’s data is that each of the modalities seems to have a working memory capacity of about three items in a free recall task. If this interpretation is valid, then Miller’s famous seven plus or minus two dictum might really be something more akin to, say, seven = 3+2+2 (V+LV+LA).

References

Baddeley, Alan D. & Hitch, Graham (1993). The recency effect: Implicit learning with explicit retrieval? Memory and Cognition, 21(2), 146-155.
Baddeley, Alan (2000). Short-term and working memory. In E. Tulving & F. I. M. Craik (Eds.), The Oxford Handbook of Memory. Oxford: NY.
Bower, Gordon H., Clark, Michal C., Lesgold, Alan M., and Winzenz, David (1969). Hierarchical retrieval schemes in recall of categorized word lists. Journal of Verbal Learning & Verbal Behavior, 8(3), 323-343.
Bower, Gordon H. (2000). A brief history of memory research. In E. Tulving & F. I. M. Craik (Eds.), The Oxford Handbook of Memory. Oxford: NY.
Cofer, C. N., Bruce, D. R., and Reicher, G. M. (1966). Clustering in free recall as a function of certain methodological variables. Journal of Experimental Psychology, 71, 858-866.
Crowder, Robert G. (1993). Short-term memory: Where do we stand? Memory and Cognition, 21, 142-146.
Dallett, Kent M. (1964). Number of categories and category information in free recall. Journal of Experimental Psychology, 68(1), 1-12.
Dehaene, S., Spelke, E., Pinel, P., Stanescu, R., Tsivkin, S. (1999). Sources of mathematical thinking: Behavioral and brain-imaging evidence. Science, 284, 970-974.
Duis, Sandra S., Dean, Raymond S., Derks, Peter (1994). The modality effect: A result of methodology?
Elman, Jeffrey L. (1990). Finding structure in time. Cognitive Science, 14, 179-211.
Goldstone, Robert L., Barsalou, Lawrence W. (1998). Reuniting perception and conception. Cognition, 65, 231-262.
Greeno (2001). Personal conversation.
Groeger, John A., Field, David, Hammond, Sean M. (1999). Measuring memory span. Quebec 98 Conference on Short-Term Memory (Jun: Quebec City, PQ, Canada).
James, William (1890). The Principles of Psychology. Dover Publications: NY.
McKone, Elinor & Dennis, Christopher (2000). Short-term implicit memory: Visual, auditory, and cross-modality priming. Psychonomic Bulletin & Review, 341-346.
Miller, George A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review, 63(2), 81-97.
Nairne, James S. (1990). A feature model of immediate memory. Memory and Cognition, 18(3), 251-269.
Nersessian, Nancy J. (unpublished). Abstraction via Generic Modeling in Concept Formation in Science.
Penney, Catherine (1989). Modality effects and the structure of short-term verbal memory. Memory and Cognition, 17(4), 398-422.
Pollio, Howard R., Richards, Steven, and Lucas, Richard (1969). Temporal properties of category recall. Journal of Verbal Learning & Verbal Behavior, 8, 529-536.
Ratcliff, Roger & McKoon, Gail (2000). Memory models. In Endel Tulving & Fergus I. M. Craik (Eds.), The Oxford Handbook of Memory. Oxford: NY.
Schill-Thompson, S. L., Aguirre, G. K., D'Esposito, M. & Farah, M. J. (1999). A neural basis for category and modality specificity of semantic knowledge. Neuropsychologia, 37, 671-676.
Schurman, D. L., Bernstein, Ira H., Proctor, Robert W. (1973). Modality-specific short-term storage for pressure. Bulletin of the Psychonomic Society, 1(1B), 70-75.
Schweickert, Richard (1993). A multinomial processing tree model for degradation and redintegration in immediate recall. Memory and Cognition, 21, 168-175.
Shiffrin, Richard M. (1993). Short-term memory: A brief commentary. Memory and Cognition, 21, 193-197.
Watkins, M. J. & Peynircioglu, Z. F. (1983). Three recency effects at the same time. Journal of Verbal Learning & Verbal Behavior, 22, 375-384.
Weiskrantz, L. (1987). Neuroanatomy of memory and amnesia: A case for multiple memory systems.
White, Theresa L. & Treisman, Michel (1997). A comparison of the encoding of content and order in olfactory memory and in memory for visually presented verbal materials.