Modality-Organized Cognition - Learning, Design and Technology

Modality Based Working Memory
James Sulzen
School of Education
Stanford University
April 1, 2001
Abstract
This study tested a hypothesis that working memory is primarily modally organized. This experiment performed a
free-recall task which presented randomized stimuli sequences in seven presentation modalities (visual (V),
auditory (A), haptic (H), kinesthetic (K), linguistic-auditory (LA), linguistic-visual (LV), and spatial-auditory (SA)).
The same number of stimuli was presented in each modality on any given trial run. Results showed recall was
linearly dependent upon the number of items in each modality up to a limit of about three items presented for a
modality and then leveled out thereafter. Recency and primacy effects indicated that at least several of the
modality recall sequences operated with differing underlying processes indicating further support for the
independent modalities memory hypothesis.
Appendices
Table of Contents
INTRODUCTION
MODALITY-ORGANIZED COGNITION
PREDICTIONS
METHOD
RESULTS
DISCUSSION
CONCLUSION
REFERENCES
APPENDICES
APPENDIX A – LIST OF ALL STIMULI
STIMULUS CODES
APPENDIX B – ORGANIZATION OF TRIAL RUNS
APPENDIX C – PROCEDURES
APPENDIX D – ORGANIZATION AND ADMINISTRATION OF THE STIMULI
APPENDIX E – TABULATED RECALL DATA
APPENDIX F – RAW RECALL DATA
APPENDIX G – NORMALIZATION FORMULA FOR CALCULATING SERIAL RECALL PROBABILITIES
APPENDIX H – TABULATED RECENCY AND PRIMACY DATA
APPENDIX I – RAW RECENCY AND PRIMACY DATA
APPENDIX J – CLUSTERING OF SUBJECT RESPONSES
Introduction
"The study of models of memory often seems like a backwater in the overall study of memory.
Models do not have a prominent place in experimental studies of memory and they are not used
or examined by most researchers in the field... Recent development of models of long-term
memory has proceeded relatively independently of other areas of memory research." (Ratcliff &
McKoon, 2000, p. 571)
Studies of human short term and working memory have a very rich and long history (Ebbinghaus, 1885; James,
1890; Miller, 1956; and for surveys: Crowder, 1993; Bower, 2000; Baddeley, 2000). A number of models of
human memory and working memory have been proposed and tested over time, especially those involving verbal
or visual elements. There have also been a number of studies demonstrating various modal forms of short term
memory (STM) such as for haptic and olfactory capacities (Schurman, 1973; White, 1997). Baddeley and Hitch’s
(1974) classic modal model of working memory combining a spatio-visual, phonological, and executive control
system was an initial attempt to articulate perceived modal-related sub-components of working memory. Since
then, it seems reasonable to suppose that working memory is in fact fractionated among a number of modular
systems as evidence accumulates for the existence of more and more different components (Weiskrantz, 1987;
Baddeley, 2000). Recently, fMRI evidence has started to accumulate for a neurological basis for the phonological
loop (Paulesu, Frith, & Frackowiak, 1993; Awh et al., 1996) and even for a modal basis of representing categories
of objects such as living things (Schill-Thompson et al., 1999).
In addition to the mounting evidence that both working memory and perhaps long term memory (LTM) are
organized along modal lines, there is strong evidence to indicate that the modal systems highly interact with each
other. In the Schill-Thompson study (1999), it appears that visual centers are always activated whenever a subject
is asked to think about any aspect of a living thing (even such as parts of or the food of living things – i.e., “are
snails edible”). This is taken to indicate that the category of living things seems to have a primary visual element
which seems principally responsible for triggering other modalities, and brain damage to a modal visual area might
therefore well impair retrieval of the associated memories in the other modalities. Cross-modal priming is a fairly
clear example of interaction. McKone (McKone & Dennis, 2000) found that auditory or visual stimuli acted to prime
stimuli in the other modality. Perhaps of more interest in terms of the current writing, they found that same
modality priming has a greater effect than cross-modality priming, and that visual and auditory priming of nonwords
differ (auditory performs better). McKone interprets these results as indicating a perceptual locus
for priming, with some form of weak re-encoding occurring to effect the cross-modal priming.
There is also evidence for non-sensory based, but modal storage. Penney (Penney, 1989) reviewed the literature
on auditory and visual modality effects and concluded that auditory and visually presented words were re-encoded
in a phonological store accessible from either, and that the auditory and visual channels represent two separate
processing streams. Her argument is based upon five points:
1) Improved ability to perform two concurrent verbal tasks when different input modalities are employed
relative to the single-mode situation;
2) Improved memory when different items are presented to two sensory modalities rather than one;
3) Selective interference effects within, but less so across, modalities;
4) Subjects' preference for, and greater efficiency of, recall organized by modality than by time of
presentation; and
5) The presence of short-term memory deficits that appear to be specific to the auditory or visual modalities.
Additionally, Penney showed that bilingual speakers prefer to organize recall tasks by modality of presentation, as
opposed to organizing recall by language of presentation, time of presentation, or category of item.
Another bilingual study (Dehaene et al., 1999) showed that precise arithmetic calculations are carried out in one’s
native language (i.e., the language in which arithmetic was presumably learned), whereas approximate arithmetic
calculations are carried out via visual and spatial means. This finding in conjunction with the concept of the
independent phonological store, leads to an implication of language, or perhaps rather a linguistic capability,
existing independently of any of the standard modalities.
On an informal, but perhaps intuitively satisfying basis, as far back as 1890 William James (James, 1890) provides
an elegant example of cross-modal encoding of knowledge. Holding open the lips prior to thinking of any word with
labials or dentals such as "bubble" or "toddle" distinctly affects most people's recall process. (“Is your image under
these conditions distinct? To most people the image is at first ‘thick’ as the sound would be if they tried to
pronounce it with lips parted.” p. 63). This would seem to be an example of interfering with a cross-modal retrieval
across at least the haptic (touch), kinesthetic (sensomotor), visual, and verbal systems.
Given the evidence for some sort of modularized sub-specialization of working memory, some of which
certainly seems to organize along modal lines, it seems reasonable to suppose that each modal sensory system
may have its own working memory component. Goldstone and Barsalou (1998) have argued that there are many
reasons to believe that much of cognition is perceptually based and proceeds via perceptual representation
processes. They argue along the following lines:
1) That many if not in fact all of the properties associated with amodal symbol systems can be achieved with
perceptually-based systems (such as productivity);
2) Raw perceptual processing is often much more powerful for certain tasks than an equivalent amodal
system;
3) Perception naturally supports similarity;
4) Perception can be readily tuned to conceptual demands;
5) Perceptual simulation occurs even in conceptual tasks that have no explicit perceptual demands (for
example, Maxwell’s imagining microscopic spinning spheres in dielectrics when developing his
Electrodynamics equations (Nersessian, unpublished), or Einstein utilizing his visualizations of space-time
when developing relativity).
Countering these claims and conjectures have been theories of episodic, semantic, and other memory
organization (Baddeley, 2000). There is also strong evidence that people can organize working memory around
categories – that is to say that structuring items by category in effect seems to create something of a “separate”
short term memory for each category leading to a two to three fold improvement in working memory capacity
(Watkins & Peynircioglu, 1983; Bower, et al., 1969). These category effects even show recency and primacy
effects. We will address these issues of categorization and non-modal organization in the discussion section.
Modality-Organized Cognition
The evidence for multiple, modality-related working memory components leads to a supposition that perhaps each
modality has its own working memory and some level of cognitive processing capability. If each modality has its
own working memory and processing capability, then why not its own long term memory and its own deeper
cognitive processing capability? Following these conjectures to some sort of logical conclusion leads to a possible
memory and cognitive functional organization as illustrated by Figure 1.
Figure 1 illustrates that a certain number of modal units interact with each other to create the experience of
cognition. Some of these are “first-level” modal processing loci, each directly connected to its own sensory system
via the sensory registers. There are also a number of “second-level modal loci” each with its own specialization. In
this model, every modality locus (hereafter referred to as a modality) is connected to and capable of stimulating or
receiving stimuli from any other modality. This interaction probably operates through or in conjunction with the type
of centralized switching network referred to as a “central executive” (Baddeley & Hitch, 1974). The second-level
modalities have no direct connection to external sensory registers and so receive their sensory inputs only through
restimulation by first-level modalities.
The set of modalities represented here were selected because experimental evidence indicated a functional nexus
for each and because they seem to represent a minimal set that spans many cognitive phenomena. There may
also be “tertiary” or other modalities serving to organize social cognition, personality or other functions, but the
above model does not address such possibilities. The model provides an organizing framework for representing
relatively low-levels of cognition involving perception and knowledge representation.
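Figure 1 – Modality-organized cognition
[Figure: the sensory registers feed the first-level modalities (A, G, H, K, O, V); the second-level modalities (E, L, S)
have no sensory register of their own and are connected to the first-level modalities.]
Key to abbreviations:
A - Auditory
G - Gustatory
H - Haptic
K - Kinesthetic
O - Olfactory
V - Visual
E - Emotional/affective
L - Linguistic
S - Spatial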
The rest of this writing will use the single-letter abbreviations listed in Figure 1 to identify each of the modal
systems. When it is necessary or useful to distinguish which first level modality is interacting with a given second
level one, the two letters are combined, so “LV” means a visually presented linguistic item, while “SA” means an
auditory spatial stimulus.
Figure 1 should be interpreted in light of the following:
- Representational Systems: Each modality should be thought of as a “representational system” which
represents processes, knowledge, perceptions, and sensory experience in its own particular way. V
represents knowledge in pictures and images, A in sounds, and so on. K is the kinesthetic sensomotor
system. L is a pure linguistic system that represents knowledge and does its processing in terms of
sequenced and syntactically ordered symbols. S is a system that represents spatial knowledge and
performs spatial processing. E controls our affective memories and processing. The other modalities
should be self-explanatory.
- Completeness of each modality: In this model, each modality is a complete cognitive processing
system with its own working memory, long term memory, and processing capabilities. The type and
manner of internal organization is probably very specific to the given modality (e.g., S is probably
organized very differently from E or V). This helps explain some of the modality
differences observed in the literature, such as the slight superiority of recalling auditorily presented
words as opposed to visually presented ones during free recall tasks.
- Cross-stimulation and multi-modal representations: Each modal system is constantly stimulating every
other system with its outputs, including stimulating itself with its own outputs (i.e., feedback). This
cross-stimulation probably provides a capacity for feedback loops and re-encoding of stimuli, as well as
higher level organizations of cognition.
The question arises as to how these separate systems combine or interact, and why is it not more obvious that
such separate systems exist? Following evidence from Schill-Thompson (1999), it seems probable that it may
often require several cross-stimulating modal systems to meaningfully represent concepts and various sorts of
knowledge. Consider the category of 'living things', which, according to their data, appears to have a necessary
visual component, but which also has elements in other modalities to define its representation. If the visual portion
of the ‘living things’ representation were impaired via a lesion for example, then the other elements that make up
the 'living things' representation would still be intact, but not be capable of being stimulated. Therefore the person
loses knowledge of what a 'living thing' is, even though most of the knowledge is still available (and indeed may be
accessible via other cue paths). The concept of ‘living things’ cannot be kicked into gear because the necessary
visual element is missing from the stimulus chain. In a similar vein, James’ (1890) example with ‘bubble’ and
‘toddle’ could therefore be understood as indicating that the meaning or knowledge of these words is encoded
across the L, H, K, and V modalities, and that interfering with one modality (K, when the lips are parted) interferes
with the retrieval process and the associated V image gets changed.
As for it not being more obvious that these hypothesized internal systems have a distinct existence, the explanation
might be that the extent of interactivity makes the whole seem like a monolithic entity making it tremendously
difficult to discern the individual elements. Consider, as an analogy, aborigines trying to discern the internal
structure of an automobile by being able to examine only its external appearance and perhaps drive it only in very
limited and controlled circumstances. With neither the concepts nor useful tools for investigating internal
combustion engines, they would have little chance of deducing internal electrical, carburetion, fuel, cooling, exhaust
and other internal systems (although they might be able to deduce the existence of some systems such as steering
and brakes that have relatively easily observed external correlates). Similarly, with the human cognitive system there is a
tendency to regard memory as one large undifferentiated system with perhaps some salient subsystems
such as visual, auditory, or spatial processing.
Predictions
Given the above hypothesis regarding the modal basis for cognition, the following predictions seem likely:
1) Single-modal presentations: A set of simple stimuli limited to a single modality is more likely to be
primarily encoded in that modality rather than being re-encoded and cross-stored. However, it is
necessary to remember that there are probably significant exceptions to this, for example the fact that
people seem to be particularly capable of recoding linguistic items among the V, L, and A modalities.
2) Total working memory capacity: Assuming that each modality has an amount of independent working
memory, and that each one functions similarly to how we currently consider working memory to
function, then the total working memory capacity of a person should approximate the sum of the
capacities of the individual modal working memories. This prediction ignores duplicate encoding
effects and the apparent need to cross-modally encode some types of items (e.g., the ‘living things’
category), which would require using capacity from several modalities and thereby reduce the
seeming total capacity of the system.
3) Testing working memory capacity: If the above predictions hold, it should be possible to present a
tuned set of stimuli to fill each particular modal working memory. This should lead to an apparent
increased memory capacity compared to presenting a more randomly chosen set of stimuli. In fact,
there should be a linear increase in items recalled as the number of items presented in each modality is
increased. At some point there should be a leveling off of the number of items recalled despite a
continuing increase in the number of items presented.
4) Seven plus or minus two: Much of the free recall literature has focused on presenting either images or
words as the basic stimuli (with the words having either auditory or visual presentation). In terms of
the cognitive model presented here, this consists of stimuli in the V, LA, and LV modalities. As
mentioned above, it seems likely that people have a facile ability to cross-encode between the L, V,
and A modalities. If this is so, in terms of free recall literature, the V, LA, and LV modalities might be
thought of as one common store that should show the familiar “seven plus or minus two” total capacity
limitation characteristic of unordered free recall tasks.
5) Primacy / recency effects: If each modality has its own working memory, then it seems likely each
should show some of the normal primacy and recency effects (remembering initial and most recent
stimuli the best).
These predictions are the basis for the experiment described in this study.
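Predictions 2 and 3 can be made concrete with a small sketch. It assumes, purely for illustration, that every modality has the same working memory capacity and that recall within a modality equals the number of items presented until that capacity is reached. The function name, the parameter names, and the default capacity of three items are hypothetical choices for this example, and the sketch ignores both the fill sequence and the possible common store for V, LA, and LV described in prediction 4.

```python
def predicted_recall(im_count, n_modalities=7, capacity_per_modality=3):
    """Total items recalled under a simple independent-stores model.

    Assumption (not from the study's data): each modality recalls every
    item presented to it until a fixed capacity is reached, after which
    recall in that modality stays flat.
    """
    per_modality = min(im_count, capacity_per_modality)
    return n_modalities * per_modality


# Linear growth up to the assumed capacity, then a plateau.
for im in range(0, 7):
    print(im, predicted_recall(im))
```

Under these assumptions the predicted total rises by seven items for each additional item per modality and then stays constant, which is the linear-then-level shape that prediction 3 describes.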
Method
Subjects
Subjects were nine volunteers, all acquainted with the author, ranging in age from mid-twenties to early sixties (mean
age of 40 years); five female and four male. All subjects had either graduate degrees or were engaged in postgraduate study at major institutions.
Design
Subjects performed a free recall task in which the basic manipulation varied the number of items presented in each
of seven modalities (A, V, LA, LV, H, K, and SA). These particular seven modalities were chosen because it was
relatively easy to create a suitable stimulus sequence for each one to test the conjectures.
Trial run organization. Presentation sequences of stimuli (“trial runs”) were set up so each trial run had the same
number of stimuli from each of the seven modalities. For example, a given trial run had two stimuli in each
modality, another had three in each modality, and so on. The number of items in each modality will be referred to
as the IM count (items per modality). Within a given trial run, stimuli were randomly mixed in presentation order.
(For example, for a trial run with two stimuli from each of the seven modalities, the order of presentation might be: A,
LV, H, K, A, SA, K, LA, H, V, SA, LV, V, LA.) At the completion of each presentation sequence, subjects were
asked to recall as many items as they could. Each stimulus was presented only once to any subject. Stimuli were
carefully screened to avoid inadvertent redundancies (e.g., saying the word “pig” and showing a picture of a pig).
Fill sequence. As working memory in the V, A, and L modalities is experimentally well established (Penney,
1989), it is important to establish the independent existence of the other modal systems. Therefore, each trial run
ended with a fill sequence of at least eight items in the V, LA, and LV modalities, to minimize the likelihood that
subjects used these three modalities for cross-storage of items from the H, A, K, and SA modalities.
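The structure of a trial run described above can be sketched as follows. This is only an illustration of the stated design (equal numbers of stimuli per modality in random order, followed by a fill sequence of V, LA, and LV items). The function name and arguments are hypothetical, modality labels stand in for the actual stimuli, and the random ordering within the fill sequence is an assumption, since the text does not say how fill items were ordered.

```python
import random

MODALITIES = ["A", "V", "LA", "LV", "H", "K", "SA"]


def build_trial_run(im_count, fill, rng):
    """Return the modality labels of one trial run in presentation order.

    im_count: number of stimuli per modality in the main sequence.
    fill: mapping of fill-sequence modality to item count,
          e.g. {"V": 2, "LA": 3, "LV": 3}.
    rng: a random.Random instance, passed in so runs are reproducible.
    """
    # Main sequence: im_count items from each modality, randomly mixed.
    main = [m for m in MODALITIES for _ in range(im_count)]
    rng.shuffle(main)
    # Fill sequence at the end of the run (ordering is an assumption).
    tail = [m for m, count in fill.items() for _ in range(count)]
    rng.shuffle(tail)
    return main + tail


run = build_trial_run(2, {"V": 2, "LA": 3, "LV": 3}, random.Random(0))
print(len(run), run)
```

Passing the random generator in explicitly keeps the sketch reproducible; the study's own procedure for assembling trial runs is described in Appendix C.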
Materials
Table 1 gives the total number of stimuli presented in each trial run, the number of modality-specific stimuli in each
run, and the length and make-up of each fill sequence. A set of cards was prepared for the experimenter to use in
administering stimuli. Each card listed one stimulus and identified both the modality and the stimulus to be
presented to a subject. In the case of the SA modality, a card indicated a direction relative to the subject, and the
experimenter clicked a staple remover twice at the appropriate location to generate a directional sound. See
Appendix A for a complete list of stimuli and Appendix D for how stimuli were organized and administered in the
experiment.
Table 1 – Length and make-up of each trial run

Series  Trial run  Total number  IM Count (items  Fill sequence  Fill make-up   Number of
                   of stimuli    per modality)    length         (V / LA / LV)  subjects
A       T0         16            -                16             (3/6/7)        4
A       T1         16            -                16             (4/5/7)        3
A       T2         22            2                8              (2/3/3)        3
A       T3         29            3                8              (2/3/3)        3
A       T6         29            3                8              (2/3/3)        3
A       T7         30            2                16             (4/6/6)        3
B       T8         29            3                8              (2/3/3)        6
B       T9         36            4                8              (2/3/3)        6
B       T10        44            5                8              (2/3/3)        5

The stimuli assigned to each trial run are listed in Appendix B. In assembling the trial runs, procedures were
followed to minimize the chance that any stimulus or stimulus sequence within a trial run had an undue recall bias
(see Appendix C for the details of how trial runs were assembled). Note that, in Table 1, the length of each trial run
is given by IM*7 + (length of fill sequence).
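As a quick check of this formula, the following sketch recomputes the length of a few trial runs from their IM count and fill sequence length. The (IM count, fill length) pairs are taken from Table 1; the function and constant names are hypothetical.

```python
N_MODALITIES = 7  # A, V, LA, LV, H, K, SA


def trial_run_length(im_count, fill_length):
    """Length of a trial run: IM*7 + (length of fill sequence)."""
    return im_count * N_MODALITIES + fill_length


# (IM count, fill sequence length) for selected trial runs in Table 1
examples = {"T2": (2, 8), "T3": (3, 8), "T7": (2, 16), "T9": (4, 8)}
for name, (im_count, fill_length) in examples.items():
    print(name, trial_run_length(im_count, fill_length))
```

For these four runs the computed lengths are 22, 29, 30, and 36, matching the totals listed in Table 1.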
Procedure
Subjects were each presented with several trial runs. Each trial run consisted of a sequence of stimuli, which,
without interruption, was immediately followed by each subject’s attempt to recall as many items out of the
presentation as possible in whatever manner they chose. The experimenter recorded the recalled items in the
order recalled. A cassette recorder was used as a back-up to check that each subject’s recalled list was accurately
recorded.
Subjects were run in two separate series (identified as series A and series B), where each series had a separate
set of trial runs and its own separate group of subjects. Subjects within each series received exactly the same
treatment (i.e. same sequence of trial runs organized in exactly the same way from subject to subject). The
number of subjects was too small to try balancing the order of trial runs. It simplified performing the experiment and
analysis to give all subjects in a series the same treatment. Two series were used because subjects were all
unpaid volunteers and the initial pilot testing indicated that it was too time-consuming to run subjects through more than
three or four trial runs (especially with the longer sequences). This necessitated using two distinct groups of
subjects with two separate sets of trial runs to span the necessary range of IM values. Some unfortunate
differences cropped up in performance between the two series, which will be addressed in the discussion section.
No trial runs were conducted at IM=1 because of limited subject availability. It seemed superfluous to use a trial
run at such a simple level because subjects would likely have a high success rate at such a simple level, and the
IM=1 values would just show strong linearity in the data. In the interests of expediency, this linearity is assumed
herein.
Results
Coding of Answers
During recall, subjects typically identified stimuli via a single word or a simple description (e.g., “pig”, “rain”, “touched
me on the shoulder, back of head, and arm”, etc.). Since stimuli in the V, LV, and LA modalities had been carefully
chosen to avoid any redundancies across modalities, it was easy to match the great majority of subject responses
to stimulus items in these modalities. Coding the H, K, A, and SA modalities was a bit more challenging as
presentations were done non-verbally, but recall was verbal, necessitating a cross-modal translation on the part of
the subject to produce the recall. This meant that subjects gave a variety of descriptions and it required some
judgment in a number of cases to match a recalled item to a particular stimulus. In the case of the H, K and SA
modalities, a number of stimuli were combined to simplify the coding and increase the ability to accurately
distinguish subject responses. (See Appendix A for the differences between the encoding versus stimulus lists.)
Double checking was achieved by coding each recall sequence twice (with at least a few weeks between codings);
there were only five items out of about 370 whose coding was changed on the second pass.
Analysis
Chart 1 shows the number of items correctly recalled by each subject for every trial run. There is a distinct linear
trend (r=0.86, p<.001), but the data are rather dispersed, especially for IM>2 values. An analysis of variance showed
significance, F(4,30)=4.3, p<.01. However, only the IM=0 group showed any significant difference (p<.01), while
the IM=2 group showed a near-significant difference from the IM=5 group (p<.08).
Chart 1 – Number of correctly recalled items (each data point represents one subject’s successful recall count for
one trial run)
[Chart: x-axis, IM Count (number of items presented per modality); y-axis, # items successfully recalled (0 to 25);
plotted series: Total Recalled (V+LV+LA+A+H+K+SA) and V + LV + LA.]
Also graphed on Chart 1 are the totals for the visual and linguistic items (i.e., V+LA+LV). These correspond to
traditional free recall tests where subjects are presented either pictures or word items for recall. These modalities,
when summed together, have no particular sensitivity to the IM count with a correlation of r=0.04; they show a fairly
constant sum of about 7.5 recalled items (SD=1.4) across all values of the IM count.
Given that V+LV+LA is a near constant, Chart 2 shows that a large fraction of the variability of the Total
Recalled value can be attributed to the sum of the recalled items in the four A, H, K, and SA modalities. The two
data sets correlate at r=0.89 (p < .01). The sum of A+H+K+SA also has a very linear dependency upon the IM
Count, correlating with r=0.90 (p < .01).
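The two fitted lines reported in Chart 2 (y = 2.5x + 8.3 and y = 2.3x + 1.4) can be compared directly. The sketch below assumes that the first line belongs to Total Recalled and the second to A+H+K+SA, which is how they appear to be placed in the chart; the function names are hypothetical. If that reading is right, the gap between the two lines should stay close to the roughly constant V+LV+LA sum of about 7.5 items.

```python
def total_recalled_trend(im_count):
    """Fitted line from Chart 2, assumed to describe Total Recalled."""
    return 2.5 * im_count + 8.3


def ahksa_trend(im_count):
    """Fitted line from Chart 2, assumed to describe A+H+K+SA."""
    return 2.3 * im_count + 1.4


# The difference between the lines estimates the V+LV+LA contribution.
for im in (2, 3, 4, 5):
    total = total_recalled_trend(im)
    ahksa = ahksa_trend(im)
    print(im, round(total, 1), round(ahksa, 1), round(total - ahksa, 1))
```

Over the tested range IM=2 to IM=5 the difference runs from 7.3 to 7.9 items, consistent with the near-constant V+LV+LA total reported above.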
Chart 2 – Graph of Total Recalled and Total of A+H+K+SA
[Chart: x-axis, IM Count (# items per modality), 0 to 6; y-axis, # items (0 to 25); plotted series: Total Recalled
(trend line y = 2.5x + 8.3, R2 = 0.75), A+H+K+SA (trend line y = 2.3x + 1.4, R2 = 0.79), and V + LV + LA.]
Chart 3 shows the number of correctly recalled items for all IM Counts, broken out by the Series A and Series B
data. (Series A and B are the two sets of trial runs and of their corresponding subjects.) It also shows the totals for
the A, H, K, and SA modalities labeled as line D in the chart. The trend lines for the A and B series data are rather
distinctly different from each other and appear to break right at IM=3. This discontinuity between the two series will
be covered more in the discussion section. An analysis of variance showed possible significance between the
Total Recalled IM=2 group and the Series B data, with F(2,17)=2.8, p<.09.
The line labeled “C” in Chart 3 shows the average of the percentage of total items in each trial run which were
correctly recalled. This shows a peak value at IM=2 and a decline thereafter. This indicates that subjects
were recalling a smaller and smaller proportion of the total number of items presented to them. An analysis of
variance showed significance F(4,29)=3.5, p=0.05.
Chart 3 – Correctly recalled items for series A and B, and per cent recalled
[Chart: x-axis, IM Count (# of items per modality), 0 to 6; left axis, # Correct Items (0.0 to 25.0); right axis,
% Correct (0% to 120%); plotted: Total Recalled for Series A (line A) and Series B (line B), % correct minus fill
sequence (line C, right axis), and A+H+K+SA; error bars +/- 1 SD.]
Recalled Items in Each Modality
Chart 4 – Averages of V, LV, and LA Recalled Items
[Chart: x-axis, IM Count (0 to 6); y-axis, # of items (0.0 to 9.0); plotted series: V+LA+LV, LA+LV, LA, LV, V, and A.]
Chart 4 shows the graph of the means of the recalled items for V, LV, LA, and A. Of particular note is the slight
negative correlation between the LA+LV versus V plots (one goes down where the other goes up and vice versa),
r= -0.33 (p<.07). Given the near constant value of the V+LA+LV plot, this seems like indirect evidence for
recoding going on between the V and L modalities. That is, an item received in, say, LA gets recoded and stored in
V, thereby lowering the number of V items that can be recalled, but apparently increasing the number of LA items
that have been remembered. There is no such systematic variation between the LA and LV curves, indicating no
seeming relationship; one might hope for counter-correlation since, by the memory model used here, they are using
a shared resource. However, the possible occurrence of recoding into V and/or A may have obscured this
relationship. Of note also is that the LA and A modalities are slightly monotonic in opposing directions (r= -0.9 of
the means), perhaps indicating that the introduction of the A modality items starting at IM=2 starts to place a slight
burden on the auditory system, lowering, ever so slightly, its efficiency in passing LA items through to the L
modality. This is just conjecture, however, as there is no significance to any of these measures (except perhaps the
V vs. LV+LA measure above).
Chart 5 – Averages for recalled items in A, H, K, and SA
[Chart 5 plot: # of items recalled (vertical axis, 0.0 to 4.0) versus IM Count (horizontal axis, 0 to 6); curves for
SA, K, A, and H.]
Chart 5 shows the plot of the recalled items from the other four modalities under study. There is a general
monotonic increase for all but the H modality, which itself has a slight puzzling downward trend after IM=3. In
particular, the H, A, and K plots are distinctly similar and linear at the IM=2 and IM=3 values (varying from 1.5 to
2.4 recalled items, respectively). Since at IM=2 a maximum of two items can be recalled (and three items at IM=3,
and so on), no curve can perforce increase at a slope of faster than one. The very linear relationship in the range
IM=2 to IM=3 seems to provide additional evidence that a linear process is occurring in these modalities. At IM=4
there seem to be distinct nonlinearities introduced into all but the K curve, indicative perhaps of some sort of
internal effects, such as interference or capacity limitations, starting to occur within those modalities. Of further
interest might be the fact that virtually none of these curves correlate with each other on a within-subject basis
(the highest correlation is between SA and K on a subject-by-subject basis, r = 0.46 (p < .05)). This means that a
subject performing well or poorly in a given modality seems to have no bearing on how the same subject does in
other modalities for a given trial run.
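As a worked illustration of the ceiling argument, the Python sketch below computes the slope between the two
approximate values quoted in the text (about 1.5 items at IM=2 and 2.4 at IM=3) and checks that neither value
exceeds the number of items presented. The helper names are hypothetical, and the points are read from the prose
rather than from the underlying data.

```python
def segment_slopes(points):
    """Slopes between consecutive (im, mean_recalled) points, ordered by im."""
    ordered = sorted(points)
    return [
        (y1 - y0) / (x1 - x0)
        for (x0, y0), (x1, y1) in zip(ordered, ordered[1:])
    ]


def within_ceiling(points):
    """True if no mean recall exceeds the number of items presented (IM)."""
    return all(recalled <= im for im, recalled in points)


# Approximate values quoted in the text for the H, A, and K curves.
quoted = [(2, 1.5), (3, 2.4)]

slopes = segment_slopes(quoted)  # a single segment between IM=2 and IM=3
print(f"slope = {slopes[0]:.2f}, within ceiling = {within_ceiling(quoted)}")
```

For the quoted values the slope is about 0.9 items recalled per additional item presented, just under the limit of
one that the text describes.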
Recency and Primacy Curves
Chart 6 – Probability of recall of V, LV, and LA as a function of item’s original presentation position
[Chart 6 plot: Probability of Recall (vertical axis, 0 to 0.9) versus Serial Position of Recall, normalized for trial
run length (horizontal axis, ticks from 0.05 to 0.95); curves for V, LA, V+LV+LA, LA+LV, and LV.]
Chart 6 shows the probability of an item’s being recalled as a function of where the item occurred within the
presentation sequence within its own modality. The data across trial runs and across modalities had to be
normalized relative to each other to compensate for the fact that the number of items presented in any given
modality varied considerably depending upon the trial run and the modality. For example, in trial run T9, there
were seven LA and seven LV items, six V items, and four for each of the other modalities. This meant that each
modality had a varying number of items presented which differed across trial runs. Calculating the probability that
the ith item presented for modality m was recalled required normalizing the length of all modality sequences. See
Appendix G for details of how the normalization was performed and the primacy/recency curves were calculated.
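Appendix G is not reproduced here, so the following Python sketch shows only one plausible way to normalize
sequences of different lengths onto a common serial-position axis. It assumes that the i-th of n items is mapped to
the midpoint (i + 0.5)/n and pooled into ten equal-width bins (an assumption suggested by the ten tick labels on the
chart's horizontal axis); the function names and the sample records are hypothetical and are not the study's data.

```python
def normalized_position(index, length):
    """Map the 0-based index of an item in a sequence of `length` items into (0, 1)."""
    return (index + 0.5) / length


def recall_curve(sequences, bins=10):
    """Estimate probability of recall per normalized serial-position bin.

    `sequences` is a list of per-modality sequences; each sequence is a list of
    booleans in presentation order (True = the item was recalled).
    Returns `bins` probabilities, with None for bins that received no items.
    """
    hits = [0] * bins
    counts = [0] * bins
    for seq in sequences:
        for i, recalled in enumerate(seq):
            b = min(int(normalized_position(i, len(seq)) * bins), bins - 1)
            counts[b] += 1
            if recalled:
                hits[b] += 1
    return [h / c if c else None for h, c in zip(hits, counts)]


# Hypothetical recall records: one 7-item and one 4-item sequence.
example = [
    [True, True, False, False, True, False, True],
    [True, False, False, True],
]
curve = recall_curve(example)
print(curve)
```

Note that a four-item sequence can only fall into four of the ten bins, so short sequences leave gaps or steps in
the pooled curve.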
As to be expected, Chart 6 shows fairly solid recency and primacy effects for virtually all combinations of the V, LV,
and LA modalities, one of the marks of short term memory. Note how the LA+LV plot shows that the LV and LA
curves tend to cancel out their individual swings, especially in the somewhat wild LA swings of the latter third of the
curve. Such behavior seems indicative of the existence of a single underlying shared resource that they are both
making use of.
Chart 7 – Probability of recall of A, H, K, and SA for trial runs T9 and T10
[Chart 7 plot: Probability of Recall (vertical axis, 0 to 1.2) versus Serial Position of Recall, normalized to trial
run length (horizontal axis, ticks from 5% to 95%); curves for K, H, A, SA, and Total, where Total = combined K, H,
A, & SA.]
Chart 7 shows the probability of an item in the A, H, K, or SA modalities being correctly recalled, depending upon
the order it was presented to a subject. This chart is organized identically to Chart 6, except that it only shows
data for trial runs where each modality had at least four items (trial runs T9 and T10). The data was restricted to
these trial runs because the other ones had modality sequences of only two or three items per modality. Such
sequences are just too short to show much in the way of recency or primacy effect, especially considering the small
number of subjects. (As it is, because the recall sequences are so short, there are noticeable quantization effects
in the chart from the normalizing operation.)
In Chart 7, only the H curve really shows much of the normal primacy/recency curve. The other modalities seem to
have no readily discernible overall pattern. Notable is the near 100% recall of the initial sound (in either A or SA)
and similarly the near 100% recall of the last kinesthetic action performed across all subjects and all trial runs.
There generally seems to be a primacy effect (except for K). The lack of recency may be due to the small
number of items presented in each modality and to the relatively long time from the end of each sequence until
recall actually started. The sequences were intermixed with each other and with the V, LV, and LA sequences. In
addition there was the fill sequence (in V, LV, and LA) which further delayed getting to the recall of these four
modalities. So much delay may have limited the recency effect. Additionally, the sudden dip of the SA curve at the
end may be due to a coding artifact from the last SA item in trial run T9; this stimulus seemed to have been
confounded with another stimulus in that trial run. This may have artificially lowered the end of the SA curve. It
would also explain the unexpectedly flat response at IM=4 for the SA curve in Chart 5.
Discussion
The modality cognition model (Figure 1) was used to make a number of predictions, which were tested in this
experiment.
Total Working Memory Capacity
As predicted, the total working memory capacity seemed to be increased by using modality-specific stimuli. Chart 1
shows subjects recalled about 7 to 8 items in the V, LV, and LA modalities. Chart 2 shows subjects were able to
reliably recall some eight to 15 additional items by use of additional modalities. Similarly organized free recall
experiments typically report recall lengths of 5 to 10 items for arbitrary-length lists of unordered stimuli
presented in either V or A (Miller, 1956). In this experiment, the 15-23 items recalled by subjects are some 250% to
300% higher than those other reported rates, and similarly higher than the base rate of some 7.5 words for the
V+LV+LA levels recorded in this study. The much higher recall rate in this study certainly seems indicative of some
additional memory aid being employed.
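The comparison above can be restated as simple ratios. The Python sketch below divides the quoted range of 15 to 23
recalled items by the quoted base rate of about 7.5 items; the paper does not say how its percentage figures were
derived, so this is only one way of reading the numbers, and the function name is hypothetical.

```python
def ratio_to_baseline(recalled, baseline):
    """Recall expressed as a multiple of a baseline recall level."""
    return recalled / baseline


baseline = 7.5      # approximate V+LV+LA base rate quoted in the text
low, high = 15, 23  # range of total items recalled quoted in the text

low_ratio = ratio_to_baseline(low, baseline)
high_ratio = ratio_to_baseline(high, baseline)
print(f"{low_ratio:.2f}x to {high_ratio:.2f}x the base rate")
```

On this reading, total recall was roughly two to three times the V+LV+LA base rate.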
Other techniques, such as categorizing the stimuli, have been used to increase recall rates. If free recall lists are
organized into categories and presented with items blocked together by category, then the recall rate seems to be
improved by approximately 15% to 70%, depending upon the study (Dallett, 1964; Cofer, Bruce, & Reicher, 1966).
This is still well short of what was found here (which had random presentation of stimuli).
As predicted, and as illustrated by Charts 1 through 3, there appears to be a linear increase in recalled items as IM
is increased.
As for the predicted leveling off of working memory capacity as IM continues to increase, the discontinuity in
Chart 3 between the Series A and Series B data at IM=3 is unfortunate. Both the subjects and the organization
of the trial runs changed right at this juncture when switching from Series A to Series B. This makes it difficult to
discern whether the apparent leveling off is due to the increasing value of IM, the use of a different set of subjects,
or the differing set of cards used in the trial runs. One reasonable interpretation of the difference in slopes of the
two series is indeed that Series B represents a leveling off of the linear increase shown by the Series A trial runs.
Unfortunately, the same data can also be reasonably interpreted as a noisy linear increase all the way from IM=2 to
IM=5 (see Chart 1). As such, it remains for future work to establish if the predicted leveling off of total working
memory capacity does indeed occur as expected.
Seven Plus or Minus Two
The sum of the V+LV+LA values across all trial runs shows a very consistent average of seven to eight items being
recalled from these three presentation modalities. Indeed, only two out of 33 trial runs with subjects had any
V+LV+LA score higher than nine. The strong consistency reconfirms the seeming limits of these three modalities
operating together. The items recalled above and beyond these three modalities are strongly indicative that some
other memory mechanism is operating in addition to the usual visual/verbal one as tested by traditional free recall
tasks.
Primacy / Recency Effects
There is a clear primacy / recency effect for the V, LV, and LA modalities as expected and as shown in Chart 6.
Chart 7 is not nearly so clear for the A, H, K, and SA modalities. Either there is some evidence of the effect
(the primacy effect seems strong, H shows clear recency, and the sum of the curves shows primacy/recency) or
there are explanations for the lack of a clearly discernible effect. More data needs to be collected to resolve
this. However, there is reason to cautiously expect that this effect exists, especially considering the
small numbers of trials and of subjects in this study.
Other Evidence
Other evidence for the modality cognition model, mentioned earlier, can be found on Charts 4 and 6. On Chart 4,
the countervailing swings of the V and LV+LA curves indicate that either a shared resource or significant
cross-encoding and cross-storage is at work. Similarly, on Chart 6, the countervailing swings of the LV and LA
curves indicate a shared resource constraining their individual capacities. This shared resource has often been
referred to as verbal memory (Baddeley, 2000) or phonological memory (Penney, 1989). This shared construct is here
characterized as a linguistic representational system which can readily take its input (with literate individuals)
from either V or A. Note that this linguistic model also nicely integrates with other linguistic inputs such as with
Braille or American Sign Language (KL and VL).
There is other evidence developed during the course of the study, but not reported in detail due to lack of space.
Independently of presentation order, subjects frequently clustered stimuli from the same modality into sequences
during recall. According to Penney (1989), when subjects are presented with stimuli that vary in modality and are
also organized by category, they show a strong preference during recall for clustering items by modality as
opposed to category. This certainly indicates that modality is a stronger associative bond than category seems to
be, and that perhaps modality association occurs because it is a deeper underlying mechanism than category. The
fMRI evidence to date seems to indicate this with the demonstration that the category of “living things” is encoded
across several modalities (Schill-Thompson, et al., 1999).
It has been suggested that the modality enhancement effect found here is nothing other than a fancy form of
categorization (Greeno, 2001). Appropriate categorization can improve free recall by 50% to 70% (Dallett, 1964),
but this is only a fraction of the improvement that using multiple modalities seems to have offered here.
Bower, et al. (1969) showed a 150% to 350% improvement on recall when stimuli were carefully categorized and
hierarchically organized, as opposed to the same items being presented in random order. Bower presented all
items at once (via a printed card) in a visually, spatially, and semantically organized hierarchy and gave subjects
approximately four minutes to study the hierarchy (of up to 112 items divided into some 30-40 categories). Test
subjects had a visual display whose items were carefully associated as to semantic content and spatial grouping so
as to make conceptual sense, whereas controls had items randomly organized into the same spatial groupings.
Additionally, each subject had four opportunities to see the display and write their recall. The present study of
course employed an entirely different presentation technique. However, it is notable that random presentation
caused a tremendous detriment in Bower, but random organization in the present study still led to superior recall
performance. One can only conjecture what improvement it would have been to the present study’s subjects to
have had all stimuli grouped by modality during presentation.
Another way of looking at Bower’s results is through the lens of the current modality model. Bower’s subjects were
presented stimuli in V, LV, and SV, and had ample time and cause during the several minute study period to recode
to L and A, and possibly into other modalities. Additionally, the items were grouped into small categories of two to
three items per category. As such, Bower’s results might possibly be reasonably interpreted as comparable to the
current study’s: Bower’s subjects were simultaneously using multiple modalities, just as the current study’s subjects
did, and achieved similar results, and in addition had the benefit of semantic categorization. It is also clear, in
the case of Bower, the present study, and other category-related studies (Dallett, 1964; Pollio, Richards, and Lucas,
1969), that organizing stimuli and hierarchically grouping them is of enormous benefit in recall.
Other studies have demonstrated greatly enhanced recall capabilities and/or recency effects via categorization or
similar organization, but these too can be interpreted in terms of utilizing modalities beyond the V, LV, and LA of
traditional free recall tasks. Watkins & Peynircioglu (1983) used six categories (riddles, sounds, objects, favorites,
quiz questions, and drawings), which were run in two groupings of riddle-sound-object and favorites-quiz-drawing.
The first grouping can easily be interpreted as presentations in LV-A-V and the second as E-L-K/V.
They showed subjects were able to recall about five items from each of three categories; this is a performance
comparable to the present study’s. As a counter-argument, studies which try to use purely taxonomic categories
(except for Dallett (1964)) tend to show little benefit from categories (Cofer, Bruce, & Reicher, 1966).
The net result is that the modality utilization of the current study could be interpreted as a form of categorization.
However, categorization only seems to succeed where multiple modalities are employed. This makes it seem more
likely that successful categorization used to enhance recall should be interpreted as a special case of use of
multiple modalities.
Conclusion
A number of working memory phenomena were consistently reproduced in this experiment (the various well-known
effects such as fixed capacity, primacy, and recency) with other phenomena occurring on top of these (greatly
increased working memory capacity for one). Various sources of evidence seem supportive of the modality
cognition model proposed here. While these modality effects can be explained as a form of categorization, it
seems as likely, or possibly even more likely, that categorization effects can be explained as a form of modal
cross-encoding. In particular support of this viewpoint is the recent fMRI evidence of the modal organization of
working memory and likely cross-modal encoding of categories.
In conclusion, this study has presented a model for how certain aspects of relatively low-level human cognition
occur via a number of distinct modal loci processing centers ordered into at least two layers. Evidence was
developed to show support for linguistic and spatial modalities as well as for the sensory-based ones. In fact, one
reasonable interpretation of the present study’s data is that each of the modalities seems to have a working memory
capacity of about three items in a free recall task. If this interpretation is valid, then Miller’s famous seven plus or
minus two dictum might really be something more akin to, say, seven = 3+2+2 (V+LV+LA).
References
Baddeley, Alan D. & Hitch, Graham (1993). The recency effect: Implicit learning with explicit retrieval? Memory
and Cognition, 21(2), 146-155.
Baddeley, Alan (2000). Short-Term and Working Memory. In E. Tulving & F. I. M. Craik (eds.) The Oxford
Handbook of Memory. Oxford : NY.
Bower, Gordon H., Clark, Michal C., Lesgold, Alan M., and Winzenz, David (1969). Hierarchical retrieval schemes
in recall of categorized word lists. Journal of Verbal Learning & Verbal Behavior, 8(3), 323-343.
Bower, Gordon H. (2000). A Brief History of Memory Research. In E. Tulving and F. I. M. Craik (eds.) The
Oxford Handbook of Memory. Oxford : NY.
Cofer, C. N., Bruce, D. R., and Reicher, G. M. (1966). Clustering in free recall as a function of certain
methodological variables. Journal of Experimental Psychology, 71, 858-866.
Crowder, Robert G. (1993). Short-term memory: Where do we stand? Memory and Cognition, 21, 142-146.
Dallett, Kent M. (1964). Number of categories and category information in free recall. Journal of Experimental
Psychology, 68(1), 1-12.
Dehaene, S., Spelke, E., Pinel, P., Stanescu, R., Tsivkin, S. (1999). Sources of Mathematical Thinking: Behavioral
and Brain-Imaging Evidence. Science, 284, 970-974.
Duis, Sandra S., Dean, Raymond S., Derks, Peter (1994). The modality effect: A result of methodology?
Elman, Jeffrey L. (1990). Finding structure in time. Cognitive Science, 14, 179-211.
Goldstone, Robert L., Barsalou, Lawrence W. (1998). Reuniting perception and conception. Cognition, 65, 231-262.
Greeno, 2001. Personal conversation.
Groeger, John A., Field, David, Hammond, Sean M. (1999). Measuring memory span. Quebec 98 Conference
on Short-Term Memory (Jun: Quebec City, PQ, Canada).
James, William (1890). The Principles of Psychology. Dover Publications: NY.
McKone, Elinor & Dennis, Christopher (2000). Short-term implicit memory: Visual, auditory, and cross-modality
priming. Psychonomic Bulletin & Review, 341-346.
Miller, George A. (1956). The magical number seven, plus or minus two: Some limits on our capacity for
processing information. Psychological Review, 63(2), 81-97.
Nairne, James S. (1990). A feature model of immediate memory. Memory and Cognition, 18(3), 251-269.
Nersessian, Nancy J. (unpublished). Abstraction via Generic Modeling in Concept Formation in Science.
Penney, Catherine (1989). Modality effects and the structure of short-term verbal memory. Memory and Cognition,
17(4), 398-422.
Pollio, Howard R., Richards, Steven, and Lucas, Richard (1969). Temporal properties of category recall. Journal of
Verbal Learning & Verbal Behavior, 8, 529-536.
Ratcliff, Roger & McKoon, Gail (2000). Memory Models. In Endel Tulving & Fergus I. M. Craik (eds.) The
Oxford Handbook of Memory. Oxford : NY.
Schill-Thompson, S. L., Aguirre, G. K., D'Esposito, M. & Farah, M. J. (1999). A neural basis for category and
modality specificity of semantic knowledge. Neuropsychologia, 37, 671-676.
Schurman, D. L., Bernstein, Ira H., Proctor, Robert W. (1973). Modality-specific short-term storage for pressure.
Bulletin of the Psychonomic Society, 1(1B), 70-75.
Schweickert, Richard (1993). A multinomial processing tree model for degradation and redintegration in immediate
recall. Memory and Cognition, 21, 168-175.
Shiffrin, Richard M. (1993). Short-term memory: A brief commentary. Memory and Cognition, 21, 193-197.
Watkins, M. J. & Peynircioglu, Z. F. (1983). Three recency effects at the same time. Journal of Verbal Learning &
Verbal Behavior, 22, 375-384.
Weiskrantz, L. (1987). Neuroanatomy of memory and amnesia: A case for multiple memory systems.
White, Theresa L. & Treisman, Michel (1997). A comparison of the encoding of content and order in olfactory
memory and in memory for visually presented verbal materials.