Mental imagery - Center for Cognitive Science

Pylyshyn: Pictures in the brain?
Return of the mental image:
Are there pictures in the brain?
Zenon Pylyshyn
Rutgers Center for Cognitive Science
Abstract
In the past few years there appears to have been a revival of interest in the study of
mental imagery. Emboldened by new findings from Neuroscience, some people have
returned to the idea that mental imagery involves a special format of thought, one
that is more like a picture than a sentence or a logical calculus. But the evidence
and the arguments that caused both conceptual and empirical problems for the
picture theory in the past 30 years have not gone away and the new evidence does
little to justify the recidivist trend.
The format of thought
There is widespread agreement (among both psychologists and laypeople) that thought
comes in at least two flavors, pictorial (or visual) and verbal. This idea, which derives from our
own very persuasive subjective experience, has been enshrined in what is called the “dual code”
view of thinking and memory [1]. Yet neither the claim that thought is expressed in what we
experience as visual images nor the claim that it is expressed in what we experience as inner
dialogue can be correct because it is easy to show that most (perhaps all) of our thoughts are
beyond the reach of our conscious experience.
The words we appear to think with carry very little, if any, of our thoughts, since they
presuppose much that is not expressed in what we experience as “inner dialogue”. Just as in our
discourse with one another most of what we communicate is unstated, so in our inner dialogue
we rely on ellipses, demonstratives (like “this” or “that”), deixis (like “me” or “now”),
indirect references, presuppositions, entailments and so on. I think to myself, “I’d better finish
this part of the paper or I will be late for my meeting.” But what is “this part” and how big a part
is it? What meeting would I be late for? How late would I have to be for it to count as being “late
for a meeting”? And who does “I” refer to? The thought also presupposes that I want to go to
some meeting, otherwise I would not have thought about being late for it – though that is unsaid.
In our inner dialogue we follow the same rules of discourse that we follow in conversation,
where the principle is that we should not state what we believe the hearer already knows. But in
inner conversation if we leave things unsaid then the unsaid part of our thoughts must have been
in some other format that we did not experience, so at least that part of the thought was actually
not expressed in language after all. Similarly when we think in pictures what the pictures
represent, how we interpret them and what they mean cannot be found in the pictures
themselves. All pictures are deeply ambiguous. As the philosopher Wittgenstein pointed out, a
picture of a person walking up a hill is identical to a picture of a person walking down a hill
backwards. But our images are never ambiguous since we know what we intend them to
represent and this knowledge itself is neither verbal nor pictorial.
More recently people have focused primarily on the pictorial aspect of thought. Even if most
of our thoughts are unconscious, it could still be that those that are experienced as visual images
might play an important part in our mental life. Famous thinkers are always being quoted as
saying that their ideas did not come to them logically but appeared to them in mental pictures [2].
What exactly this means, other than that the person had a vision-like experience, is far from
clear. The experimental evidence for the assumption that we think using a picture-like format
(or, as it is sometimes referred to, a depictive format) is far from compelling, even if what is
being claimed could be made clear. I have argued that the differences between pictorial and
other (verbal, logical?) forms of reasoning that are observed in experiments, are more likely due
to what our thoughts are about than to the form that these thoughts take.
Discussions of mental imagery very often confound the form (or format) of thoughts with
their content, or what they are about. There is clearly a difference between thinking about how
something looks and carrying on an inner dialogue about abstract ideas. But this is a difference
in topic, much like the difference between discussing the newest fashions in clothing and having
a discussion about freedom of the will. This difference, by itself, might well account for such
experimental findings as that vision interferes with mental imagery, since one might just as
reasonably expect that thinking about some topic (say arithmetic) might be more difficult if one
were trying to ignore an irrelevant discussion on the radio on that same topic. Because thinking
about how something looks is different in so many ways from other kinds of thoughts (e.g., it
focuses strongly on quantitative spatial relations) it does seem reasonable that there might be a
different format for visual information than for nonvisual information. Yet if there is something
special about the format in which we think when we have the experience of “seeing an image” in
our mind, science has not yet revealed what it is, despite some 30 years of very active research.
The picture theory of mental images
Despite the uncertainty about how we might distinguish different formats of thought, there
continue to be persistent claims that mental images have a special picture-like format that has
been referred to as “depictive”. One of the few explicit statements concerning what this means
([3], p. 5) states that a depictive representation is “… a type of picture, which specifies the
locations and values of configurations of points in a space. … [in which] each part of an object is
represented by a pattern of points, and the spatial relation among these patterns … correspond to
the spatial relations among the parts themselves. … not only is the shape of the represented parts
immediately available to appropriate processes, but so is the shape of the empty space … one
cannot represent a shape in a depictive representation without also specifying a size and
orientation….” For obvious reasons I refer to theories like this as “picture theories” since their
main emphasis is on the pictorial or iconic aspect of the format of mental images.
Over the past three decades a very large number of experiments have been carried out and
cited in support of the picture theory. Among the most widely cited is the finding that it takes
longer to scan one’s attention a greater distance on an image [4] (see Box A). Other experiments
appear to show that it takes longer to “see” some visual detail in a “small” image than in a
“large” image (e.g., it takes less time to report seeing whiskers on a very large image of a mouse
than in a small image of a mouse). Related experiments show that when people are asked to turn
their head and judge from their image when a pair of points would become indistinguishable, or
when they are asked to make judgments about the identity of imagined vertical, horizontal or
oblique gratings, they give very nearly the same results as those they would have produced if
they had been presented with the patterns visually. Other experiments appear to show that
people can only remember certain things (e.g., which side of your front door the knob is on) by
first recalling an image and “reading off” the result, and that they tend to commit themselves to
details that would be gratuitous if they were not constructing a picture-like representation (e.g.,
when asked to imagine a printed word, you tend to select either upper case or lower case letters
and not something indeterminate). There are also a very large number of studies that examine
whether entertaining an image impairs or enhances performance on certain visual tasks, on
visuo-motor tasks, on recall tasks, on adaptation tasks, and so on. In each case researchers
concluded that the results supported the assumption that images are pictorial entities that are
examined with the visual system [see the extensive review in 5,6].
Although some version of the picture theory of mental imagery is very nearly universally
accepted, it has turned out to be rife with both conceptual and empirical problems. One of the
main difficulties that picture theories run into is that every experimental finding cited in support
of the “pictorial” aspect of mental imagery can be more easily and more naturally explained by
the simple hypothesis that when asked to imagine something, people ask themselves what it
would be like for them to see it, and then they simulate as many aspects of this staged event as
seem relevant, as they know about, and as they are able to mimic. Thus for example, when asked
to imagine a printed word we choose whether to imagine printing in lower or upper case because
a real printed word would be one or the other (on the other hand since we cannot represent every
property of a printed word, we most likely do not choose a particular font or type of paper).
I have looked at a very large number of experiments in some detail and have found very few
(see next paragraph) that do not succumb to the explanation that when we have a visual image of
something, we are simply thinking about what it would look like if we saw it. We do this not for
any perverse desire to please the picture-theorists, but for many different reasons; we do it
because recall of past events is better when we go through a scenario of recalling things in the
order in which they occurred, because imagining things usually results in recall of similar past
experiences, because when we think about how to solve a particular problem we think about the
sequence of events we would go through, and perhaps even more relevant in the context of an
experiment, we do it because the instruction to “imagine” something entails imaging what would
happen in a real situation. But none of these reasons exposes a particular mental mechanism or a
particular format; we could equally have visualized the problem differently, or not at all. In
other words, it may well be that the experiments simply do not reveal anything about the format of
our thoughts. I consider this alternative explanation (or closely related ones) to be the “null
hypothesis” against which one ought to compare imagery theories, because it makes no
assumptions about format – all the explanation resides in the tacit knowledge that people have
about how things tend to happen in the world (regardless of how this knowledge is represented),
together with their use of well-known psychophysical and cognitive skills. These skills include
the ability to mark and track elements in a visual field, perhaps using the visual indexing or
FINST mechanism (described in [7]), to mark and recall directions in proprioceptive space, to
compute time-to-collision, to generate time intervals proportional to known quantities, and to
recall things better when in a situation resembling the one we were in when we saw them, and so
on.
Although a few experimental findings do not fit the simulation-from-tacit-knowledge
explanation, none of them fit the picture theory either. Take for example the classical “mental
rotation” finding, where the time it takes to determine that two figures are identical except for
orientation has been shown to be a linear function of the difference in orientation between the
figures’ principal axes. This result is obtained even if observers do not attempt to “rotate” their
image and even when they do not experience the phenomenology of a rotating image. But this
result does not support the claim that the “image” behaves as though it were bound to a surface
that is rotated in a rigid and holistic manner. It is clear that some articulated process is taking
place which, at the very least, involves consulting each of the figures in an iterative process. Eye
tracking records and the finding that the apparent rate of rotation depends on both the complexity
of the figure and the comparison task, make it clear that the figure’s shape does not move rigidly
through intermediate orientations as many had thought. There has been at least one proposal that
attributes the rotation effect to the necessity of solving this problem within certain computational
resource constraints [8]. Similarly the ability of mental imagery to yield perceptual-motor
adaptation effects [9] appears to be due not to tacit knowledge, but to the fact that imagining
one’s hand position superimposed on a visual scene provides all the required conditions for
adaptation without having to assume that a pictorial object is created. These and many other
cases that are frequently cited in support of the picture theory are reviewed in [5,6].
The basic problem with any appeal to inherent properties of a mental image is this: Since it
is your image you can make it have very nearly any property, or exhibit any behavior you wish
(and so in many circumstances you make it recreate whatever you think would happen if you
were seeing some real event occur – such as the example of superposition of color filters
illustrated in Box B). Consequently, in such cases nothing at all is gained by assuming that
images are pictorial in form. Those who believe that various phenomena, such as the increased
time it takes to scan greater distances in one’s image, or to distinguish details in a smaller image,
are due to the form of the image, reply that the phenomena could not have been “faked” since
people often cannot predict what such experiments will show, nor can they articulate answers
to certain questions when they do not use a mental image. But there is no question of “faking”
answers or of being disingenuous – people simply do what they are supposed to do when asked
to “image” something: They think about what would happen if they saw it and they use their tacit
knowledge to generate some appropriate sequence of events in their mind (and thus to exhibit
appropriate response times). Similarly subjects answer questions about their image by
consulting their tacit knowledge and their memory of the appearance of the relevant situation.
The concept of tacit knowledge is central in cognitive science where it is clear that access to such
tacit knowledge depends on how the question is put and what the subject believes the task to be.
Those who believe that imaginal thinking is essentially pictorial often explicitly deny the
claim that there are literally pictures in the brain. Yet the way that certain classical behavioral
findings (e.g., increased time to scan greater distances or report details in small images) are
explained requires such a literal picture. Something referred to as a “functional space” (such as a
matrix data structure) will not do since such a space, being a fiction, can have any properties we
like. That we find certain properties “natural” (e.g., that to scan a greater distance in a matrix
requires that we pass through more empty cells) simply shows that the theorists tacitly assume
the matrix to be a simulation of real space, since otherwise one would not need to “scan” over its
cells in any particular order.
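
The point about a “functional space” can be made concrete with a minimal sketch. The grid, the cell coordinates and the two access functions below are hypothetical illustrations, not a model taken from the imagery literature. The same matrix supports a cell-by-cell traversal, whose step count grows with distance, and a direct lookup, whose cost does not; nothing in the data structure itself selects one over the other.

```python
# Hypothetical "functional space": landmark names mapped to (row, col)
# cells of an imagined grid. All coordinates are illustrative assumptions.
LANDMARKS = {"lighthouse": (0, 0), "ship": (0, 3), "arch": (0, 9)}


def scan_steps(start, goal):
    """Number of single-cell moves when attention is *stipulated* to pass
    through every intermediate cell (diagonal moves allowed)."""
    (r0, c0), (r1, c1) = LANDMARKS[start], LANDMARKS[goal]
    return max(abs(r1 - r0), abs(c1 - c0))


def jump_steps(start, goal):
    """Number of operations when the goal cell is addressed directly:
    one lookup, whatever the distance between the two landmarks."""
    _ = LANDMARKS[start], LANDMARKS[goal]  # both cells are directly addressable
    return 1


for goal in ("ship", "arch"):
    print(goal, scan_steps("lighthouse", goal), jump_steps("lighthouse", goal))
```

Only the first function yields more steps for greater distances, and it does so because the traversal rule was written to mimic movement through real space, not because the matrix requires it.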
It is important to see that the literal picture is essential if the format or the medium of the
image, rather than something else (e.g., tacit knowledge), is to explain the standard imagery
results. For example, in order to explain why it takes longer to mentally scan a greater distance
on a mental image, something about the image format or medium would have to make the
following equation come out to be true: time = distance / speed. This can be either a literal
distance or some set of brain properties that are related by the same equation and are used to
compute time. While the latter is a logical possibility, the odds against it are immense when you
consider the very large number of constraints such an analog medium would have to satisfy, for
example it would have to satisfy the Euclidean axioms, Pythagoras’ theorem, many physical
laws governing acceleration, bouncing and recoil, and so on for all the properties that are often
true of the behavior of our images (which includes momentum, see [10]).
It is much more plausible that most of these phenomena have little to do with the format of
mental images, but rather with how people understand the task of examining an imagined map.
For example, no picture format account can explain why the mental scanning effect disappears if
subjects believe that what they are to imagine is a process that would not take longer for larger
distances (or one in which something other than scanning is emphasized – such as computing
relative directions from the image). While it seems obvious that this is the case (try imagining
some scene and switch your attention from one place to another without moving continuously
through the void in between), we have gone to the trouble of showing this in several
experiments [11]. Such demonstrations illustrate what I have referred to as the cognitive
penetrability of phenomena alleged to be constrained by the nature of the format or medium (i.e.,
it shows that the observed phenomena change in a rational way as your beliefs or goals change)
which provides strong evidence that the effect was not due to a property of the image format (or
the way the image is displayed in the brain).
Has neuroscience resolved the imagery debate?
The argument over the picture-theory of mental imagery has gone on for at least 30 years
(not counting the earlier arguments between Locke and Berkeley) and many have grown tired of
the claims and counterclaims, as well as the personal tug-of-war between one’s strong and
persistent intuitions (“How can you seriously doubt that your vivid mental images have
inherent pictorial or spatial properties?”) and the equally strong case that can be made against
specific claims concerning the inherent property of images. I suspect that were it not for some
recent developments in neuroscience, the debate might have waned – not because anyone was
convinced to change their mind (though a few have) and certainly not because the issue is in
principle undecidable (at least no more so than any scientific claim) – but because many people
(including the author) are convinced that the debate cannot finally be settled without some
radically new formulations of the research questions and some detailed proposals. That the
argument has not died out is due largely to the fact that the debate has moved into a new arena as
people claim to have found neuroscience evidence that shows what would otherwise have been a
grotesque and unthinkable idea; that when we engage in visual imagery we construct two-dimensional pictures on the surface of our visual cortex, which we then perceive visually. In this
opinion piece I will not review the huge body of behavioral evidence about what images are like,
but rather will sketch a number of reasons why the debate has not been illuminated in the least by
the new evidence.
In recent years, picture theorists have received an unexpected boost from the following kinds
of findings from neuroscience.
1. It has been shown (most clearly in monkeys) that when a visual pattern is presented to the
eye, a homomorphic (continuously deformed) pattern of activity occurs on the visual cortex.
2. When people are asked to create a mental image, activity is found to occur in primary visual
cortex (this remains controversial, but let’s accept it for the sake of argument). Thus imagery
appears to lead to activity in the region that is known to be retinotopically mapped.
3. When people are told to create a large image, there is more activity in the anterior parts of the
medial occipital region, a pattern that is similar to the activation produced by large retinal
images. This is interpreted as showing that image size is a neurologically instantiated
property, and thus not merely phenomenological.
From these findings people have concluded that images not only occur in the visual cortex,
but that they are retinotopically displayed there, much as visual information is thought to be
displayed there on its way to being interpreted. This idea suggests that the main difference
between vision and visual imagery is that the former is caused by light on the retina whereas the
latter is caused by the higher cognitive system.
In addition to the evidence referred to in (1) – (3) above, which comes primarily from neural
imaging studies (though there is also evidence from evoked potential studies), there is evidence
from clinical neurology that has also been interpreted as supporting the idea that a neural display
is involved in mental imaging. Martha Farah [12] showed that a patient who developed tunnel
vision after unilateral occipital lobectomy also developed “tunnel imagery,” and Edoardo Bisiach
[13] reported that patients who exhibited hemispatial visual neglect also provided fewer details
about the corresponding neglected side of their mental image. Thus the case for there being a 2D
spatial display in the brain seemed quite strong and led people like Stephen Kosslyn to declare
the imagery debate to be resolved [3].
If one looks at the evidence and at the form of the arguments, however, the case is far from
clear. Consider first the evidence that during imagery the primary visual cortex is activated in
the same manner as it is activated in visual perception. The involvement of some part of the
visual system in visual imagery is an interesting finding, and not too surprising given the
similarity of the two experiences. But that tells us nothing about the form of the representation in
the two cases; it could well be that the representation in both cases takes the form of a data
structure such as occurs in artificial intelligence systems (and which I described in the earliest
critique of mental imagery theories [14]). But more importantly, this argument ignores the very
significant differences between a retinal image and a mental image.
The pattern of activation on the visual cortex is in retinal coordinates and thus changes with
each saccade; it is limited in resolution and color sensitivity essentially to a region of about 2
degrees of visual angle, and exhibits a wide variety of phenomena not shared by mental images,
some of which I will describe below. In contrast, the mental image is in allocentric or
environmental coordinates (and therefore is stable despite movements not only of the real
eyes and the mind’s eye, but also of the body in space). It is also panoramic in scope (both
phenomenologically and behaviorally), reportedly even extending behind the head [15]. Of
course a mental image may not be exactly like a retinal image. But if one is to take comfort from
the finding that images activate a retinotopically mapped area of the cortex, then one had better
take into account the difference between such a pattern of activity and the mental images that
have been studied for 30 years, and that correspond so satisfyingly to what we experience when
we entertain a mental image.
The disanalogies between the retinal-cortical image and the mental image are considerable and cover a wide range of
properties such as the apparent size of mental images and the distance between places on the
image. While large mental images may activate different locations in the cortex (corresponding
to where parafoveal fibers project in the visual cortex), this in itself tells us only that mental
images have a neural basis – and who, but a dualist, would have doubted that? But this sort of
parallel between retinal size and brain locus does not help to explain why it takes longer to scan
greater distances or why it takes less time to report details from larger images. Such facts require
more than evidence of a different locus in the brain for large than for small images; they require a
literally greater size – because that’s what does the explaining in the picture theory (bigger
picture = easier to see = faster report).
The difference between patterns on the retina – which, as Roger Tootell and his colleagues
have shown [16], are mapped onto the primary visual cortex – and the mental images that we
experience, which have been studied so thoroughly, is even deeper than indicated in the above
discussion. Images on the retina have yet to be interpreted. But mental images are the very
interpretation itself; not only are they not in need of interpretation (after all, you knew what they
were when you created them!) but there is every reason to believe that they cannot be visually
reinterpreted. Of course one can think about them and figure out what would happen if we did
things to them, like rotate them or combine them with other patterns, but we can only do so when
the combinations are easy to infer, not when they involve a clearly visual (re)perception, such as
the example shown in Box C.
Similarly, if you construct a mental image from a description, the result, however much it
might feel like a vivid picture, does not have the signature properties of visual perception.
Consider the example in Box D. Try it. The result is a figure known as a Necker Cube which is
seen as a three-dimensional wire frame cube that spontaneously reverses as you watch it. Here is
another such example from Geoff Hinton [17]. Imagine a solid cube about one foot on each side.
Pick it up and hold it with one vertex on the table and your finger on the diagonally opposite
vertex, so its diagonal axis from the table to your finger is vertical. Now with your free hand
point to the remaining vertices (the ones not in contact with either the table or your finger). As
you do so, count them. Most people point to 4 vertices lying on a plane, which would give the
cube a total of only 6 vertices, whereas of course there must be 8.
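
The geometry of this example can be checked with a short sketch. It assumes a unit cube with one corner at the origin (the coordinates and variable names are illustrative, not part of Hinton's description) and groups the free vertices by their height along the vertical diagonal.

```python
from itertools import product
from collections import Counter

# Unit cube with one corner at the origin. Standing the cube on that corner
# with the opposite corner (1, 1, 1) directly above it makes the body
# diagonal vertical, so a vertex's height is proportional to x + y + z.
vertices = list(product((0, 1), repeat=3))

bottom, top = (0, 0, 0), (1, 1, 1)
remaining = [v for v in vertices if v not in (bottom, top)]

# Group the remaining vertices by height along the diagonal
# (in units of one third of the diagonal's length).
levels = Counter(sum(v) for v in remaining)

print(len(vertices))   # total vertices of the cube
print(len(remaining))  # vertices touching neither table nor finger
print(dict(levels))    # number of vertices at each height level
```

Under these assumptions the six free vertices fall into two rings of three, one third and two thirds of the way up the diagonal, rather than four in a single plane.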
Not surprisingly, objects in the mind do not have the properties of real objects. Nor is
accessing information from them similar to how we visually access information from a scene.
Imagine your name written on a wall before you. Read the letters backwards. It is very difficult
to read information off an image in any order with anything like the freedom you have to read it
off a display, which should not be the case if it were a display. Another revealing way that mental
images are different from the images displayed on the retina or on the cortex is that they are not
subject to Emmert’s law. If you have an image on your retina, such as an after-image (which
presumably also projects to your retinotopically-mapped cortex), and you look at a wall or other
surface, the apparent size of the image depends on how far away the surface is: the further away
it is, the larger the apparent size of the retinal image. But this is not true of a mental image (of
course not: it’s your image and you know how big the object is that you are imagining!). This in
itself provides strong evidence against the cortical picture account of mental imagery.
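
The size-distance relation behind Emmert's law can be sketched as follows. The function name and the 2-degree after-image are assumptions chosen for illustration; the formula is the ordinary geometric relation between visual angle, viewing distance and linear extent.

```python
import math


def apparent_size(visual_angle_deg, distance):
    """Linear size of a surface patch that subtends the given visual angle
    at the given viewing distance (result is in the units of distance)."""
    return 2.0 * distance * math.tan(math.radians(visual_angle_deg) / 2.0)


# A fixed retinal extent (an assumed 2-degree after-image) projected
# onto surfaces at increasing distances.
for d in (1.0, 2.0, 4.0):
    print(d, round(apparent_size(2.0, d), 4))
```

For a fixed retinal extent the computed size doubles each time the distance doubles, which is the behavior reported for after-images and not for mental images.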
Your mental image is also of a colored, dynamic, three-dimensional world and there is no
cortical mapping corresponding to all these properties. The only topographical layouts in cortex
that map perceived space are the two-dimensional retinotopic ones in V1. Yet mental imagery
findings apply equally in three dimensions as in two, so if the two-dimensional phenomena (e.g.
scanning, mental rotation, size effects) are to be explained by appeal to a projection on a two-dimensional cortical surface, how are the corresponding otherwise-indistinguishable three-dimensional image results to be explained? Similarly, if the two-dimensional cortical display is
the front end of both the visual and the imagery system, the displays should interact with the
motor system in the same way. Yet they do not. Reaching for an imagined object does not show
the signature properties that reaching for a perceived object does and many visuomotor
phenomena such as smooth pursuit are not obtained for imagined motion [reviewed in detail in
18] (indeed there is some doubt that imagined objects can be made to move smoothly through
imagined space [19]).
As for the clinical neurological findings, there are many more findings that contradict the
cortical display theory than there are that support it. If mental imagery and vision both used a
cortical display, it is hard to explain why vision and imagery capacities appear to be doubly
dissociated: There are cases of cortically blind, cortically color-blind, visually agnosic people
who have intact mental imagery and people with no mental imagery who have normal vision
[20]. Indeed, virtually all the experimental findings on mental imagery have been found in blind
people. The Farah et al. finding of tunnel imagery following surgery that produced tunnel
vision might well be due to the fact that the patient had nearly a year of experience with her
tunnel vision before being tested, sufficient time to have a good idea of what things looked like to
her. So long as “imaging” something means to simulate what it would look like, this subject was
doing precisely what she was asked to do. As for the imaginal-visual neglect parallel reported by
Bisiach, there are now many other accounts that show that visual neglect and imaginal neglect
are dissociated. Moreover, when they do work in tandem, both types of neglect may be due to
a failure to orient attention to space – not to space in a cortical display, but space in the world,
as I will suggest below.
How are images “spatial”?
Tell me where is fancy bred,
Or in the world or in the head?
[The Merchant of Venice]
One property that virtually everyone ascribes to images is spatiality. It certainly appears to
be appropriate to say of images that they have an up-down and a right-left direction: It seems
intuitively obvious that some things in an image are to the right of others. Indeed, it is not
surprising when hemispatial neglect occurs in both vision and imagery since in some important
sense left-right in an image is the same as left-right in perception. Nor is it surprising that we
can point at some things in our image (and can show stimulus-response compatibility effects
when we have to respond with the left hand to things on the right side of the image). These
phenomena appear to be puzzling for someone who does not believe in a picture-theory – for in
what sense does a symbolic representation have a left or a right? I now believe that the answer
to this question is quite simple, but we have been looking for it in the wrong place. What has
spatial properties is real space, and our perception of real space provides the framework
from which mental images inherit their spatiality.
Consider first the images you have with your eyes open (we might call these projected
images since we seem to project them onto a visible scene). It is easy to see in this case how
perceived space can serve as the framework that imparts spatial properties to an image. All you
need to do (for example in mental scanning experiments such as illustrated in Box A) is think
such thoughts as, “the lighthouse is located here, the ship is located here, the arch is located
here…” and so on where the locatives “here” pick out any sort of individual elements in a visual
scene (which may be nothing more than specks in a uniform texture). Once you have bound
thoughts of individual items to particular elements in a scene, the rest of the phenomena come
from vision (how you do this “binding” was discussed in an earlier TICS article [21]): you can
scan between them, move your attention or your gaze to them, and so on. For this reason many
imagery phenomena are more robust when carried out with eyes open, and some can only be
carried out that way.
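
One way to picture this proposal is the following minimal sketch. Everything in it is a hypothetical illustration: the scene coordinates stand in for what vision supplies, and the bindings stand in for thoughts of the form “the lighthouse is located here”. The representation of the imagined items contains no coordinates of its own; distances are read off the perceived scene.

```python
import math

# Positions of visible elements in the real scene (e.g. specks in a texture),
# in arbitrary units. These are assumed values standing in for perception.
SCENE = {"speck_a": (0.0, 0.0), "speck_b": (3.0, 4.0), "speck_c": (6.0, 8.0)}

# The "image" holds only bindings from imagined items to scene elements.
BINDINGS = {"lighthouse": "speck_a", "ship": "speck_b", "arch": "speck_c"}


def scan_distance(item_from, item_to):
    """Distance attention or gaze must travel between two imagined items,
    taken from the scene elements to which they are bound."""
    p = SCENE[BINDINGS[item_from]]
    q = SCENE[BINDINGS[item_to]]
    return math.dist(p, q)


print(scan_distance("lighthouse", "ship"))
print(scan_distance("lighthouse", "arch"))
```

In this sketch the longer scan between the lighthouse and the arch follows from the layout of the scene, not from any spatial property of the bindings themselves.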
But what about the images you have in the dark or with your eyes closed? Here we have to
appeal to another well-known cognitive capacity, the capacity to accurately orient in space using
nonvisual information, chiefly proprioception, but also audition and other modalities. As Fred
Attneave showed [15], people are very good at orienting, even to objects located behind them.
Our spatial sense has also been shown to take into account self-motion. We accurately update our
coordinates (both location and orientation) as we move, and we cannot update them by
imagining self-motion [22]. If we can orient to a small number of landmark places in our
immediately perceived space (the space we sense through proprioceptive, kinesthetic, tactile
and auditory inputs, and other senses that contribute to our exquisite sense of where we are in
relation to our bodies and our environment), then we can do exactly what I claimed we do
when we project images onto visual scenes: We can bind objects of thought to perceived
locations in space. None of this requires that we assume a spatial display in the head; the one in
the world will do!
This way of looking at the spatiality of images may explain why blind people exhibit most of
the phenomena of mental imagery. Moreover, the finding that patients with hemispatial
neglect also exhibit neglect in their visual imagery is consistent with this account, since there is
good reason to believe that neglect involves a failure to orient attention in space (real, not
represented space) [23].
Is there something missing in this way of viewing imagery?
I have been defending the view that the results of hundreds of imagery experiments do not tell
us anything about the format of mental images and in particular they do not support the intuitive
idea that images are some sort of picture. Why is that so hard to accept? And why do we persist
in searching for pictures, first through behavioral studies that purport to show that images have
metrical properties, and now through neural imaging and clinical neurology studies aimed at
revealing a neural display on which images are displayed? Perhaps it is because I am not
offering a detailed theory of mental imagery. What I offer is, rather, a framework for a theory or, if you
prefer, a set of desiderata that should be met by such a theory. But even that does not explain the
downright hostility that many feel towards such an account. Perhaps it is because something
important is missing from such a story: it does nothing to account for the very strong intuitions
we all share that when we experience a mental image we are looking at something that is, at least
in some way, pictorial. But if all that is going on when we image is that we are constructing a
representation that is no different in form from representations underlying other kinds of
knowledge and memory – perhaps some sort of symbolic description – then why on earth should
it feel like we are looking at something that resembles what we are thinking about? This is not
an easy question because it concerns the connection between brain processes, information
processes and conscious experience – nothing less than the ubiquitous and enigmatic mind-body
problem.
But here is a way to look at, if not to resolve, this mystery. When we perceive a visual scene,
something goes on in our brains that is in some respects similar to what goes on when we imagine the
same scene – in what respects it is similar remains an open empirical question. But if I am right,
then neither of these representations takes the form of a two-dimensional display. So why does it
look to us like it does? This sort of question – why something looks the way it does – may not
be a scientifically answerable question because it is a question about what it’s like to be in a
certain conscious brain state. Wittgenstein gives the following story that puts this question in
perspective. Two philosophers meet in the hall and one says to the other, “Why do you suppose
people always thought that the sun went around the earth, rather than the other way around?”
The second philosopher replies, “Obviously because it looks like the sun goes around the earth.”
To which the first philosopher replies, “But what would it look like if it looked like the earth
went around the sun?” There is much we don’t know about why something “looks like” what it
does.
References
1 Paivio, A. (1991) Dual coding theory: Retrospect and current status. Canadian Journal of
Psychology 45 (3), 255-287
2 Shepard, R.N. (1978) Externalization of mental images and the act of creation. In Visual
Learning, Thinking and Communication (Randhawa, B.S. and Coffman, W.E., eds.), pp.
133-189, Academic Press
3 Kosslyn, S.M. (1994) Image and Brain: The resolution of the imagery debate, MIT Press
4 Denis, M. and Kosslyn, S.M. (1999) Scanning visual mental images: A window on the mind.
Cahiers de Psychologie Cognitive / Current Psychology of Cognition 18 (4), 409-465
5 Pylyshyn, Z.W. (2002) Mental Imagery: In search of a theory. Behavioral and Brain
Sciences 25 (2), xxx-xxx
6 Pylyshyn, Z.W. (in press) Seeing and visualizing: It's not what you think, MIT
Press/Bradford Books
7 Pylyshyn, Z.W. (2001) Visual indexes, preconceptual objects, and situated vision. Cognition
80 (1/2), 127-158
8 Marr, D. and Nishihara, H.K. (1976) Representation and recognition of spatial organization
of three-dimensional shapes. (MIT A. I. Memo 377)
9 Finke, R.A. (1980) Levels of Equivalence in Imagery and Perception. Psychological Review
87, 113-132
10 Freyd, J.J. and Finke, R.A. (1984) Representational momentum. Journal of Experimental
Psychology: Learning, Memory, & Cognition 10 (1), 126-132
11 Pylyshyn, Z.W. (1981) The imagery debate: Analogue media versus tacit knowledge.
Psychological Review 88, 16-45
12 Farah, M.J. et al. (1992) Visual angle of the mind's eye before and after unilateral occipital
lobectomy. J Exp Psychol Hum Percept Perform 18 (1), 241-246
13 Bisiach, E. and Luzzatti, C. (1978) Unilateral neglect of representational space. Cortex 14
(1), 129-133
14 Pylyshyn, Z.W. (1973) What the Mind's Eye Tells the Mind's Brain: A Critique of Mental
Imagery. Psychological Bulletin 80, 1-24
15 Attneave, F. and Farrar, P. (1977) The visual world behind the head. American Journal of
Psychology 90 (4), 549-563
16 Tootell, R.B. et al. (1982) Deoxyglucose analysis of retinotopic organization in primate
striate cortex. Science 218 (4575), 902-904.
17 Hinton, G.E. (1987) The horizontal-vertical delusion. Perception 16 (5), 677-680
18 Milner, A.D. and Goodale, M.A. (1995) The Visual Brain in Action, Oxford University Press
19 Pylyshyn, Z.W. and Cohen, J. (1999) Imagined extrapolation of uniform motion is not
continuous. In Annual Conference of the Association for Research in Vision and
Ophthalmology, May 1999, Investigative Ophthalmology and Visual Science
20 Beschin, N. et al. (2000) Perceiving left and imagining right: Dissociation in neglect. Cortex
36 (3), 401-414
21 Pylyshyn, Z.W. (2000) Situating vision in the world. Trends in Cognitive Sciences 4 (5),
197-207
22 Rieser, J.J. (1989) Access to knowledge of spatial structure at novel points of observation.
Journal of Experimental Psychology: Human Perception and Performance 15, 1157-1165
23 Bartolomeo, P. and Chokron, S. (2002) Orienting of attention in left unilateral neglect.
Neuroscience and Biobehavioral Reviews 26 (2), 217-234
Box A: Mental scanning
In a large number of experiments, reviewed in a recent publication [a], observers learned a map,
such as the one shown below. They were then asked to imagine the map, to fix their attention on a
given landmark, and then to indicate when they could “see” a second named place on the map. In
every case a linear relation between reaction time and distance on the map was found.
[Figure: Visual and imaginal scanning. Reaction time (vertical axis) plotted against distance
(short, medium, long) for three conditions: visual inspection, mental image, and judge orientation.]
In our experiments [b,c] observers learn a map as well. But the map they learn is one that has lights
at each of the landmarks. The lights can be turned off by operating a switch, but this results in
another light immediately coming on at another landmark. As before, observers are asked to
imagine the map, to fix their attention on a landmark, and then to imagine that the light switch has
been operated. Observers then indicate when they can “see” the light come on at another landmark
on their image. In another experiment they make a judgment about the compass direction of the
first landmark from the one to which they have moved their attention. Distances are grouped and a
correlation computed. Reaction time correlated significantly with distance only in the visual case.
There is no significant effect of distance on the time it takes to switch attention on the image under
these conditions.
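To make the phrase "distances are grouped and a correlation computed" concrete, here is a minimal sketch of that kind of computation. It assumes a plain Pearson correlation between distance group and mean reaction time; the box does not say which statistic was actually used. The function name `pearson_r` and every number below are hypothetical, invented purely for illustration, and are not data from any of the experiments cited.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Distance groups coded as ranks: 1 = short, 2 = medium, 3 = long.
distance_group = [1, 2, 3]

# HYPOTHETICAL mean reaction times (arbitrary units), made up for illustration.
rt_increasing = [1.0, 2.0, 3.0]  # pattern in which time grows with distance
rt_flat = [1.5, 1.4, 1.5]        # pattern with no systematic distance effect

r_increasing = pearson_r(distance_group, rt_increasing)
r_flat = pearson_r(distance_group, rt_flat)

print(f"increasing pattern: r = {r_increasing:.2f}")
print(f"flat pattern:       r = {r_flat:.2f}")
```

With only three grouped points such a coefficient says nothing about statistical significance; the sketch is meant only to show what "reaction time correlated with distance" amounts to computationally.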
a Denis, M. and Kosslyn, S.M. (1999) Scanning visual mental images: A window on the mind.
Cahiers de Psychologie Cognitive / Current Psychology of Cognition 18 (4), 409-465.
b Bannon, L.J. (1981) An Investigation of Image Scanning: Theoretical Claims and Empirical
Evidence. Ph.D. dissertation, University of Western Ontario. University Microfilms, no. 81-50,
599.
c Pylyshyn, Z.W. (1981) The imagery debate: Analogue media versus tacit knowledge.
Psychological Review 88, 16-45.
Box B: Mixing colors in your mind
Think of these circles as colored filters and imagine that they are moved closer together until they
overlap. What color do you see in your “mind’s eye” at the overlapping part? The interesting
question is: Why do you see that color rather than some other? Can you make the overlap be some
other color? The interest in this question turns on why your image took on that color. It has been
reported that people give different answers when they are asked outright what color would appear
than when they are asked to imagine the above sequence [a]. What does that tell us about properties of
image formats (or image mechanisms)? There is a fundamental difference between empirical
phenomena that reveal properties of mind (of the “cognitive architecture” [b]) and ones that reveal
what a subject knows or remembers [c]. The fact that you may give different answers depending
on how you are asked is a very general property of mind that has nothing to do with the properties
of the imagery system.
The chances are quite good that the color you “saw” in your mind is different from the one you
would have seen if actual filters had been moved against a white background. Few people are
aware of the difference between additive and subtractive color mixing. If these were filters, then
they would mix to form white light, but if they were pigments they would mix to form an orange
pigment.
a Kosslyn, S.M. (1981) The Medium and the Message in Mental Imagery: A Theory.
Psychological Review 88, 46-66
b Pylyshyn, Z.W. (1996) The study of cognitive architecture. In Mind Matters: Contributions to
Cognitive Science in honor of Allen Newell (Steier, D. and Mitchell, T., eds.), Lawrence
Erlbaum Associates.
c Pylyshyn, Z.W. (1984) Computation and cognition: Toward a foundation for cognitive
science, MIT Press (Chapter 5).
Box C: Can images be visually reinterpreted?
Peter Slezak [a,b] showed the above pictures of animals to subjects and asked them to memorize
what they looked like. Then he asked them to rotate each picture 90 degrees clockwise and report
what they saw. None of his subjects reported seeing the very clear interpretations that could easily
be seen if one rotated the pictures themselves. But even more telling, subjects could get the
intended interpretation if they were allowed to first sketch the image from memory and then rotate
it. Thus it appears that they could recall enough of the image to allow reinterpretation, but were
unable to do it in their mind alone. These figures are reasonably complex and perhaps difficult to
rotate mentally, but they are arguably not as complex as ones that subjects rotated in various
“mental rotation” experiments [c-e].
An argument about whether mental images can be visually reinterpreted has been carried on in the
literature, with one of the most detailed analyses carried out by Mary Peterson [f]. While Peterson
argued that images could be reconstrued in some ways, her data showed that reinterpreting mental
images was very different from reinterpreting real displays – and that bistable images such as the
Necker cube and figure-ground reversals did not reverse with mental images.
a Slezak, P. (1991) Can images be rotated and inspected? A test of the pictorial medium theory.
In Thirteenth Annual meeting of the Cognitive Science Society, pp. 55-60, Lawrence Erlbaum
Associates
b Slezak, P. (1992) When can images be reinterpreted: Non-chronometric tests of pictorialism. In
Fourteenth Conference of the Cognitive Science Society, pp. 124-129, Lawrence Erlbaum
Associates
c Shepard, R.N. and Cooper, L.A. (1982) Mental Images and Their Transformations, MIT Press,
a Bradford Book
d Shepard, R.N. and Metzler, J. (1971) Mental rotation of three dimensional objects. Science 171,
701-703
e Pylyshyn, Z.W. (1979) The Rate of 'Mental Rotation' of Images: A Test of a Holistic Analogue
Hypothesis. Memory and Cognition 7, 19-28
f Peterson, M.A. et al. (1992) Mental images can be ambiguous: Reconstruals and reference-frame
reversals. Memory and Cognition 20 (2), 107-123
Box D: What happens when you “see” a mental image based on a description
Imagine a parallelogram such as this one:
Now imagine another identical one directly below it like this:
Now imagine connecting each vertex of the top figure to its matching vertex in the bottom figure.
What do you see as you look at it with your “mind’s eye”?
Draw the vertical lines and look at it again. Your mind’s eye does not see the way your real eye
does. That’s because what you imagine is already interpreted and does not undergo further visual
interpretation.