YaleTalk - Center for Cognitive Science

The Illusion of Mental Pictures
Zenon Pylyshyn
Rutgers University,
Center for Cognitive Science
http://ruccs.rutgers.edu/faculty/pylyshyn.html
The illusion of mental pictures
● There is no question that we (all but about 2% of us)
experience mental images and, in some sense, use
them to recall, anticipate and enjoy life in the absence
of the things and people that we imagine.
● Not only are we able to “picture” some object or scene
in our “mind’s eye” but it seems that we must do so in
order to solve certain kinds of problems.
● Books are full of examples of how images helped
people to make discoveries in science and create works
of art – none of which would have happened without
the capacity to use mental imagery. I will not rehearse
all the examples, but they include Einstein, Kekulé, …
The illusion about the causal
role of mental pictures in thought
● It is important that as scientists we consider what is assumed
when we speak of creating, recalling, examining and
transforming mental images.
● I have argued that there is a powerful illusion behind not only our
folk understanding of mental imagery, but also behind our
attempts to build scientific theories of it, and this illusion is not
just a way of speaking or a handy metaphor. It is an essential
part of our understanding of imagery. In fact the very term
“imagery” betrays an assumption about what it is like.
● The illusion is that when we engage in what we call imaging or
visualizing, there is, somewhere in our head, something that we
see more or less the way we see the world, and which resembles a
possible or actual visual scene about which we are thinking.
Some common mistakes in
thinking about mental imagery
1.
The intentional or phenomenological fallacy:
Confusing properties of the imagined world with
properties of our images.
 Examples: size, distance, and especially temporal duration
2.
Task demands. The rational interpretation of the task
of imagining something is to pretend you are
perceiving it.
 “Imagine X” => “Pretend that you are seeing X happening”
Intuitions about which property in the world maps
onto the same property in its representation
Temperature, weight…
Brightness
3D depth
?
Shape, color
Size
Motion
Duration
Metrical properties such as distance
?
Metrical axioms, Euclidean properties (Pythagoras’ theorem)
Examples to probe your intuition and your tacit knowledge
Imagine seeing these events unfolding…
● You hit a baseball. What shape trajectory does it trace? It is
coming towards you: Where would you run to catch it? If you
have ever played baseball you would have a great deal of “tacit
knowledge” of what to do in such well-studied cases.
● You drop a rubber ball on the pavement. Tap a button every time
it hits the ground and bounces. Plot bounce height vs time.
● Suppose you get this pattern:
[Figure: bounce height plotted against time since first drop]
What is responsible for this pattern in your image?
● Drop a heavy steel ball at the same time as you drop a light ball
from, say, the leaning tower of Pisa. Indicate when they hit the
ground. Repeat for different heights and weights. (It turns out that
people are Aristotelian rather than Galilean).
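For comparison with what an image "does", the two physical cases above can be worked numerically. This is a minimal sketch under idealized assumptions (no air resistance, a constant coefficient of restitution). The function names and numeric values are illustrative and are not taken from the talk.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (standard approximate value)


def fall_time(height_m: float) -> float:
    """Time for an object dropped from rest to fall height_m, ignoring air resistance.

    Mass does not appear: under Galilean physics a heavy and a light ball
    released together land together.
    """
    return math.sqrt(2.0 * height_m / G)


def bounce_peaks(drop_height_m: float, restitution: float, bounces: int) -> list[float]:
    """Peak height after each bounce, assuming a constant coefficient of restitution.

    Rebound speed is restitution * impact speed, so each peak is
    restitution**2 times the previous one (a geometric decay).
    """
    peaks = []
    height = drop_height_m
    for _ in range(bounces):
        height *= restitution ** 2
        peaks.append(height)
    return peaks


if __name__ == "__main__":
    # Illustrative values only; the mass of the ball never enters the calculation.
    print(f"fall time from 50 m: {fall_time(50.0):.2f} s")
    print("bounce peaks:", [round(h, 3) for h in bounce_peaks(1.0, 0.8, 4)])
```

If an imagined ball departs from these curves, the departure reflects what the imaginer believes about falling and bouncing, which is the point of the examples.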
What color do you see when two
colored light beams overlap?
Two complementary colored light beams => white
Two complementary colored filters or paint => black
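The difference between the two outcomes can be sketched in a few lines. This assumes an idealized 8-bit RGB model in which beams add channel intensities and stacked filters multiply transmittances. Real lights and pigments are messier, and the function names are hypothetical.

```python
def mix_light(rgb_a, rgb_b):
    """Additive mixing: overlapping beams add their intensities per channel (clipped to 255)."""
    return tuple(min(a + b, 255) for a, b in zip(rgb_a, rgb_b))


def mix_filters(rgb_a, rgb_b):
    """Subtractive mixing: each ideal filter passes only a fraction of every channel."""
    return tuple(a * b // 255 for a, b in zip(rgb_a, rgb_b))


YELLOW = (255, 255, 0)
BLUE = (0, 0, 255)  # complement of yellow in this idealized RGB model

if __name__ == "__main__":
    print("beams:  ", mix_light(YELLOW, BLUE))
    print("filters:", mix_filters(YELLOW, BLUE))
```

Which of the two rules an imagined overlap follows depends on which rule the imaginer happens to know.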
Where would the water go if you poured it
into a beaker full of sugar?
Is there conservation of volume in
your image? If not, why not?
What do these image behaviors have in common?
● Objects in your image do whatever you believe those
objects would have done had you watched them under
the same set of circumstances in reality.
● Finding that your image mimics nature is not a
discovery about images. It is a discovery about your
tacit beliefs of what would happen in the world under
conditions similar to those of the imagery experiment.
The most interesting questions about mental imagery come
together in the problem of representing spatial patterns
Representation of Space in Mental Images
This is the issue I am most interested in because it
bears on some questions about how visual information
is encoded as well as the vexing question of the role of
conscious experience in cognitive science
Spatial character of mental images
● Some of the more impressive experimental results on
mental imagery (mental rotation, mental scanning,
mental size effects) appear to suggest that images have
spatial properties.
● It is no accident that we can reason by imagining things
laid out in space and then examining the layout pattern
to see the solution. Yet there have been few attempts to
say exactly what “being laid out in space” means, either
formally or physically.
● One of the most explicit has been a statement by Steve
Kosslyn about what he calls the depictive nature of
mental images. Since it shows the intimate connection
of images with spatial cognition I begin with this quote.
Images as depictive representations (Kosslyn, 1994, p. 5)
“A depictive representation is a type of picture, which
specifies the locations and values of configurations of
points in a space. … In a depictive representation, not
only is the shape of the represented parts immediately
available to appropriate processes, but so is the shape of
the empty space … Moreover, one cannot represent a
shape in a depictive representation without also
specifying a size and orientation….”
 This is the claim that the form of representation of
images compels certain properties to be represented.
The reason for this assumption goes back to what
“image” means to many people – and to the underlying
mental picture assumption.
Images as displayed in “functional space”

“The space in which the points appear need not be
physical…, but can be like an array in a computer, which
specifies spatial relations purely functionally. That is, the
physical locations in the computer of each point in an array
are not themselves arranged in an array; it is only by virtue
of how this information is ‘read’ and processed that it
comes to function as if it were arranged into an array (with
some points being close, some far, some falling along a
diagonal, etc).” (Kosslyn, 1994, p. 5)
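The kind of "array in a computer" the quote describes can be sketched as follows. This is a minimal illustration with hypothetical names, not Kosslyn's model: cells are scattered through a flat store in shuffled order, and only the read/write mapping makes them behave like a grid.

```python
import random


class FunctionalArray:
    """A 2-D 'array' whose cells live in a flat store in arbitrary order.

    Storage order carries no spatial meaning; only the mapping
    (row, col) -> slot makes the contents function like a grid.
    """

    def __init__(self, rows: int, cols: int, seed: int = 0):
        self.rows, self.cols = rows, cols
        slots = list(range(rows * cols))
        random.Random(seed).shuffle(slots)  # scatter cells arbitrarily in storage
        self._slot = {
            (r, c): slots[r * cols + c] for r in range(rows) for c in range(cols)
        }
        self._store = [None] * (rows * cols)

    def write(self, r: int, c: int, value) -> None:
        self._store[self._slot[(r, c)]] = value

    def read(self, r: int, c: int):
        return self._store[self._slot[(r, c)]]

    def diagonal(self) -> list:
        """'Falling along a diagonal' is defined by the index mapping, not by storage."""
        return [self.read(i, i) for i in range(min(self.rows, self.cols))]


if __name__ == "__main__":
    grid = FunctionalArray(3, 3)
    for r in range(3):
        for c in range(3):
            grid.write(r, c, (r, c))
    print(grid.diagonal())
```

Nothing in the store itself makes one cell "near" another; a different mapping would yield different neighbors from the same stored values.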


But it is important why the information is ‘read’ in one way
rather than in another since that is what gives the account the
appearance of being principled and explanatory.
To understand why the picture theory does not offer an
explanation one needs to understand the functional space
proposal and its assumptions.
Do images have (or just represent) size?
● There are many studies showing that when subjects imagine
something small it takes them longer to detect small features
(e.g., a mouse’s whiskers) than when they imagine them as
large. What does this tell us about the representation of size?
● There are two possibilities: The “size” is either the size of the
image or it is the size of the thing imagined.
 The first needs either a physical size or some still-unknown
variables that obey the law Time = Distance/Speed.
 The second can yield the observed result simply because people
know what it would be like to view the object: if it is small,
the details will not be as clear, or you will need to ‘zoom’ in on the
object to see the details. (Ask yourself: What if it were faster for the
small image? What would you conclude?)
● Suppose, instead, the experiment asked you to report details in
a large blurred or low-definition image as opposed to a small
high definition image? Why do you predict that?
One of the least controversial examples of
image transformation: Mental rotation
Time to judge whether (a)-(b) or (b)-(c) are the
same except for orientation increases linearly
with the angle between them (Shepard & Metzler, 1971)
What do you do to judge whether these
two figures are the same shape?
Is this how the process looked to you?
When you make it rotate in your mind, does it seem
to retain its rigid 3D shape without re-computing it?
The important distinction between
architecture and represented content
● It is only obligatory that a certain pattern must occur if
the pattern is caused by fixed properties of the
architecture as opposed to being due to properties of
what is represented (i.e., what the observer tacitly
knows about the behavior of that which is represented)
 If it is obligatory only because the theorist says it is, then
score that as a free empirical parameter (a wild card).
 The important consequence is that if we allow one theory to
stipulate what is obligatory without there being a principle
that mandates it, then any other theory can stipulate the same
thing. Such theories are unconstrained and explain nothing.
 This failure of image theories is quite general – all picture
theories suffer from the same lack of principled constraints.
How are these ‘obligatory’ constraints realized?
● Image properties, such as size and rigidity are assumed to
be inherent in the architecture (of the ‘display’)
● That raises the question of what kind of architecture
could possibly enforce rigidity of shape?
 Notice that neither a spatial display nor a functional space makes
it obligatory that shape be rigidly maintained as orientation is
changed. Only certain physical properties can explain rigidity.
 Such rigidity could not be part of the architecture of an imagery
system because we can easily imagine objects for which rigidity
does not hold (e.g. imagine a rotating snake!).
 There is also evidence that ‘mental rotation’ is incremental, not
holistic, and the speed of rotation depends on the conceptual
complexity of the shape and the comparison task.
Example 2: Mental Scanning
● Hundreds of experiments have now been done
demonstrating that it takes longer to scan attention
between places that are further apart in the imagined
scene. In fact the time-distance relation is linear.
● These have been reviewed and described in:
 Denis, M., & Kosslyn, S. M. (1999). Scanning visual mental
images: A window on the mind. Cahiers de Psychologie Cognitive /
Current Psychology of Cognition, 18(4), 409-465.
 Rarely cited are experiments by Liam Bannon and me
(described in Pylyshyn, 1981) which I will summarize for you.
Studies of mental scanning
Does it show that images have metrical space?
[Figure: Latency (secs, 0 to 2) plotted against relative distance on image (1 to 4) for three conditions: scan image, imagine lights, show direction]
(Pylyshyn & Bannon. Described in Pylyshyn, 1981)
Conclusion: The image scanning effect is Cognitively Penetrable
 i.e., it depends on Tacit Knowledge.
 The central problem with imagistic explanations…
What is assumed in the mental picture
explanations of mental scanning?
● In actual vision, it takes longer to scan a greater distance because
real distance, real motion, and real time are involved; therefore this
equation holds due to natural law:
Time = distance / speed
But what ensures that a corresponding relation holds in an image?
The obvious answer is: Because the image is laid out in real space!
 But what if that option is closed for empirical reasons? Well you
might appeal to a “Functional Space” which imagists liken to a
matrix data structure in which some pairs of cells are closer and
others further away, and to move from one to another it is natural
that you pass through intermediate cells.
● Question: What makes these sorts of properties “natural” in a
matrix data structure?
What warrants the ‘obligatory’ constraint?
To use Prinz’s term, it is not obligatory that the well-known
relation between distance, speed and time hold
in functional space or in a matrix. There is no natural
law or principle that requires it. You could imagine an
object moving instantly or according to any motion
relation you like, and the functional space would then
be made to comply with that since it has no constraints
of its own.
 So why is it natural to imagine a moving object traversing
intermediate empty space when getting from A to B?
 Because that’s how real objects move through real space!
Why is it ‘natural’ to assume that
functional space is like real space?
There are at least two possible reasons why a functional
space, such as a matrix data structure, appears to have natural
spatial properties (e.g., distances, size, empty places):
1. Because when we think of incarnations of functional space,
such as a matrix, we think of how we picture them on paper.
 In fact a matrix does not intrinsically have distance, empty
places, direction or any other such property, except in the mind
of the person who draws it or uses it!
 Moving from one cell to another does not require passing
through intermediate cells unless we stipulate that it does. A
computer is quite happy to go directly from one cell to any other
cell. The same goes for the very concept of ‘intermediate cell’.
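The point about stipulation can be made concrete with a small sketch. The function names are hypothetical and the grid is an ordinary nested list. Both functions return the same cell contents. Only the second "takes longer" for more distant cells, and only because its code was written to visit the cells in between.

```python
def direct_jump(grid, start, goal):
    """Read the goal cell in a single access, whatever the 'distance' from start."""
    # start is accepted only to mirror stipulated_scan; nothing requires using it.
    return grid[goal[0]][goal[1]], 1


def stipulated_scan(grid, start, goal):
    """Read every cell on a path along the row, then along the column.

    The intermediate reads happen because this function demands them,
    not because the data structure does.
    """
    r, c = start
    accesses = 0
    while (r, c) != goal:
        if c != goal[1]:
            c += 1 if goal[1] > c else -1
        else:
            r += 1 if goal[0] > r else -1
        _ = grid[r][c]  # touch the intermediate (or final) cell
        accesses += 1
    return grid[r][c], accesses


if __name__ == "__main__":
    grid = [[(r, c) for c in range(5)] for r in range(5)]
    for goal in [(0, 1), (4, 4)]:
        jump = direct_jump(grid, (0, 0), goal)[1]
        scan = stipulated_scan(grid, (0, 0), goal)[1]
        print(goal, "direct accesses:", jump, "stipulated accesses:", scan)
```

A linear time-distance relation falls out of the second function only because it was built in, which is why the matrix by itself explains nothing about scanning times.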
Why is it ‘natural’ to assume a matrix …
2. Because when we think of a functional space, such as a
matrix, we think of it as being a way of simulating real
(cortical) space – making it more convenient to think about
the consequences of the cortical space assumption.
 This is why we think of some cells as being ‘between’ others, some
being farther away, etc. This makes properties like distances seem
natural because we interpret the matrix as standing in for real space.
 In that case we are not appealing to a functional space in explaining
the scanning effect, the size effect, etc. The explanatory force of the
explanation comes from the real space that we are assuming.

This is just another way of assuming a real space (in the brain) where
representations of objects are located in neural space.

We will see that all the reasons for the failure of the assumption that
images are laid out on the surface of visual cortex apply equally to this
‘functional space.’
What next?
● We turn now to the only way in which we might be
able to explain the experimental imagery results in
terms of pictorial properties, as assumed by picture
theorists. That’s to locate the picture in the brain –
because it is the only place where there is a literal
physical space that could underwrite such operations
as scanning or rotation or properties such as size or
shape in the terms assumed by picture theorists.
The good news for picture theories
What are some plausible reasons why we might
find a mechanism of imagery in visual cortex?
● There is neuroanatomical evidence for a retinotopic layout in
the earliest visual area of the brain (V1).
● Neural imaging data shows that V1 is more active during
mental imagery than during other forms of thought.
● Transcranial magnetic stimulation (TMS) of visual areas
interferes more with imagery than with other forms of thought.
● Clinical cases of visual agnosia show that some impairments
of vision have associated impairments of imagery (Bisiach, Farah).
● Recent psychophysical observations of imagery show parallels
with corresponding observations of vision, and these can be
related in both cases to certain cells in V1 (e.g., oblique effect).
Neuroscience evidence shows that the retinal pattern
of activation is displayed on the surface of the cortex
There is a topographical projection
of retinal activity on the visual
cortex of the cat and monkey.
Tootell, R. B., Silverman, M. S., Switkes, E., & De Valois, R. L.
(1982). Deoxyglucose analysis of retinotopic organization in
primate striate cortex. Science, 218, 902-904.
The bad news for picture theories
Drawing conclusions about the form of visual
images from neuroscience data faces many hurdles
1. The capacities for imagery and for vision are independent. All
imagery results are observed in the blind as well as in patients
with no visual cortex. So there is nothing visual about them.
2. Cortical topography is 2-D, but mental images are 3-D – all
phenomena (e.g. rotation) occur in depth as well as in the plane.
3. Patterns in the visual cortex are in retinal coordinates whereas
images are primarily in world-coordinates
Unless you make a special effort, your image of parts of the
room stays fixed in room coordinates when you move your eyes
or turn your head or walk around the room.
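The contrast in point 3 can be illustrated with a deliberately simplified sketch. It assumes a flat 2-D scene and treats an eye movement as a pure translation of the gaze point, ignoring head rotation, depth and cortical magnification. The function names are hypothetical.

```python
def retinal_position(world_xy, gaze_xy):
    """Where a fixed world point lands relative to the current gaze point."""
    return (world_xy[0] - gaze_xy[0], world_xy[1] - gaze_xy[1])


def world_position(retinal_xy, gaze_xy):
    """Recover the world location from a retinal location and the gaze point."""
    return (retinal_xy[0] + gaze_xy[0], retinal_xy[1] + gaze_xy[1])


if __name__ == "__main__":
    lamp = (10, 5)  # a fixed location in room coordinates (illustrative)
    for gaze in [(0, 0), (4, -2)]:
        on_retina = retinal_position(lamp, gaze)
        print("gaze", gaze, "retinal", on_retina, "world", world_position(on_retina, gaze))
```

The retinal value changes with every gaze shift while the recovered world value does not. A pattern held in retinal coordinates would therefore have to be remapped on each eye movement to stay fixed in the room the way an image does.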
…Problems with drawing conclusions about mental imagery from neuroscience data
4. Accessing and manipulating information in an image is
very different from accessing it from the perceived world.
Order of access from images is highly constrained.
 Some have tried to explain this by postulating rapid decay of
images, but the times involved in these demonstrations are not
consistent with the data (e.g., times for reporting letters are
comparable to those involving size or mental scanning).
 Conceptual rather than graphical properties are relevant to image
complexity (e.g., mental rotation) suggesting that image
representations are conceptual.
 If images consist in patterns on visual cortex, then they behave
differently from the same patterns when acquired from vision. For
example the important Emmert’s law applies to retinal and
cortical images but not to mental images, a fact largely unnoticed.
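Emmert's law says that an afterimage of fixed retinal extent looks larger the farther away the surface on which it is seen. Here is a minimal sketch of that geometric relation. The function name and the numbers are illustrative assumptions, not values from the talk.

```python
import math


def apparent_size(visual_angle_deg: float, surface_distance_m: float) -> float:
    """Linear size attributed to a pattern of fixed visual angle seen on a surface.

    For a fixed angle, apparent size grows in proportion to the distance
    of the surface.
    """
    half_angle = math.radians(visual_angle_deg) / 2.0
    return 2.0 * surface_distance_m * math.tan(half_angle)


if __name__ == "__main__":
    for distance in (1.0, 2.0, 4.0):
        print(f"{distance} m -> {apparent_size(2.0, distance):.3f} m")
```

A mental image "projected" onto a near or a far wall shows no such obligatory scaling, which is the disanalogy noted above.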
…Problems with drawing conclusions about mental imagery from neuroscience data
5. The signature properties of vision (e.g., spontaneous 3D
interpretation, automatic reversals, apparent motion, motion
aftereffects, etc) are absent in images;
6. A cortical display account of most imagery findings is
incompatible with the cognitive penetrability of mental
imagery phenomena, such as scanning and image size effects;
7. The fact that the Mind’s Eye is so much like a real eye (e.g.,
oblique effect, resolution fall-off) should serve to warn us
that we may be studying what observers know about how the
world looks to them, rather than what form their images take
(unless the Mind’s eye is exactly the same as the real eye!).

I will consider a possible neural explanation of the oblique effect later.
…Problems with drawing conclusions about mental imagery from neuroscience data
8. Many clinical cases cited by image theorists can be
explained by appeal to tacit knowledge and attention.
 The ‘tunnel effect’ found in vision and imagery (Farah) is
plausibly due to the patient knowing how things looked to
her post-surgery (the experiments were done a year after).
 Hemispatial neglect seems to be an attention deficit, which
explains the neglect in imagery reported by Bisiach. A
recent study shows that image neglect does not appear if
patients have their eyes closed (Bartolomeo & Chokron,
2002). This fits well with the account I have offered in
which the spatial character of mental images derives from
concurrently perceived space (I will give examples later).
A more detailed look at two examples
where neuroscience evidence is used
● Claims that fMRI and PET evidence supports the
assumption that larger mental images have
correspondingly larger regions of cortical excitation.
● Claims that the Oblique Effect in imagery supports
the assumption that images are laid out on the visual
cortex.
1. Image size and the visual cortex
● There is evidence that when imagining “large” objects that
overflow one’s phenomenal image, a different pattern of
activation in visual cortex occurs than when imagining a
small object.

This in itself is not remarkable since all scientists accept that
a difference in mental experience must be accompanied by
some difference in the neural state – this is called the
supervenience assumption: no mental differences without
physical differences. This also follows from materialism.
Image size and neural encoding
● In vision: cells in the parafoveal area of the retina project
onto the more frontal parts of the visual cortex. Thus when
objects are large enough so that they fall onto the parafovea,
they will activate frontal parts of the visual cortex.
● In imagery: it is claimed that imagining large objects (which
fill the visual field) leads to increased activity in the frontal
part of the visual cortex. Some have taken this as prima facie
evidence that perceived (large) size is neurally encoded the
same way as imagined (large) size.
Image size and the visual cortex…
● But the explanation for why large visual objects activate more
frontal parts of the visual cortex depends on the fact that fibers
from parafoveal cells connect to these frontal areas. This can’t be
the case with mental images unless they are also on the retina!
 And anyway, how does the fact that large mental images activate
frontal parts of the visual cortex explain why small details are easier
to detect in large mental images? Or how does it explain why
scanning across a large image takes longer just because it happens
to lie in the more frontal visual cortex? All picture-theory
explanations make essential reference to distances and sizes.
● Many neuroscience explanations for imagery findings make
exactly the same mistake of citing activation patterns that arise
from connections to the retina, and which therefore do not work
unless mental images are projected onto the retina. I will give
just one more example of such a neural explanation because the
error in that case is particularly egregious.
2. The oblique effect and visual cortex
● In vision, when a set of lines is to be discriminated
(distinguished from a single blur) the discrimination is better
when the lines are vertical or horizontal than when they are at
a 45° angle. This is called the Oblique Effect. It is a
low-level effect that occurs in the early vision module.

Does the Oblique effect occur with mental images?
Do images have low-level visual properties?
● Imagine a grating in which the bars are:
1. Horizontal
2. Vertical
3. Oblique (45°)
[Figure: three gratings labeled (1), (2) and (3)]
● Imagine the bars getting closer and closer together. In which of
these displays do the bars blur together first?


 In vision, the oblique bars blur sooner (called the oblique effect).
 In imagery, a similar result was reported by Kosslyn et al.
Neurological explanations for both cases?
● An accepted explanation of the psychophysical case (where lines
are seen) is that in primary visual cortex (V1) there are more
cells tuned to horizontal and vertical orientations than to oblique
orientations, so horizontal and vertical discrimination is more
sensitive. Can this fact also explain why imagined bars show the
same pattern? Kosslyn et al. claim that it does and that this
provides further support for the view that images are laid out in
visual cortex.
● But this argument rests on a misunderstanding of how the
orientation-specific cells are tuned to specific orientations: the
tuning comes from the way they are connected to photoreceptive
cells on the retina. Vertical cells are more often connected to
columns of photocells while horizontal cells are more often
connected to rows of photocells (relative to the retina).
Neurological explanations for both cases?
● If patterns of bars were activated on the surface of
cortex by mental imagery, as assumed by picture theorists, then no
overall bias toward vertical-horizontal
bars would occur. Horizontal cells would be no more
likely to be activated by horizontal patterns on the
surface of the visual cortex than by vertical patterns.
The only way that images of horizontal bars would
preferentially activate horizontal cells is if the images
were on the retina!
What happens when horizontal/vertical cells are
activated by means other than retinal patterns?
[Figure labels: 9 vertical, 9 horizontal, 5 oblique]
The proportion of Vertical, Horizontal & Oblique cells remains the same
in all cases – they are located at random on the surface of visual cortex!
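The argument can be checked with a toy simulation. It takes the talk's premise at face value: cells of each tuning are scattered at random over the cortical surface, and it uses the slide's 9 : 9 : 5 mix as an assumed population. It is a sketch of the logic with hypothetical names, not a model of V1.

```python
import random
from collections import Counter

# Assumed population mix, echoing the 9 : 9 : 5 counts shown on the slide.
POPULATION_WEIGHTS = {"vertical": 9, "horizontal": 9, "oblique": 5}


def make_cortex(n_cells: int, seed: int = 0):
    """Scatter orientation-tuned cells at random positions on a unit-square 'cortex'."""
    rng = random.Random(seed)
    kinds = list(POPULATION_WEIGHTS)
    weights = list(POPULATION_WEIGHTS.values())
    return [
        (rng.random(), rng.random(), rng.choices(kinds, weights)[0])
        for _ in range(n_cells)
    ]


def activated_mix(cortex, stripes: str, n_bars: int = 10):
    """Proportion of each cell type lying under bars drawn directly on the cortex.

    stripes="horizontal" makes bands that depend on y; any other value
    makes vertical bands that depend on x.
    """
    counts = Counter()
    for x, y, kind in cortex:
        coord = y if stripes == "horizontal" else x
        if int(coord * 2 * n_bars) % 2 == 0:  # inside a bar rather than a gap
            counts[kind] += 1
    total = sum(counts.values())
    return {kind: counts[kind] / total for kind in POPULATION_WEIGHTS}


if __name__ == "__main__":
    cortex = make_cortex(20000)
    print("horizontal bars:", activated_mix(cortex, "horizontal"))
    print("vertical bars:  ", activated_mix(cortex, "vertical"))
```

Under this premise the mix of activated cells tracks the population mix whichever way the bars run on the cortical surface, so the pattern's orientation on the cortex cannot by itself favor "horizontal" cells.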
An overarching consideration:
What if colored three-dimensional images were
found in visual cortex? What would that tell you
about the role of mental images in reasoning?
Would this require a homunculus?
Should we welcome back the homunculus?
● In the limit, if the visual cortex mapped the contents of
one’s conscious images, we would need an interpreter to
“see” this display in visual cortex.
● But we will never have to face this prospect because
experiments show that the contents of mental images are
already conceptual (or, as Kosslyn puts it, are
‘predigested’) and therefore unlike any picture.
● Finally, you can make your image do whatever you want,
and to have whatever properties you wish. There are no
known constraints on mental images that cannot be
attributed to lack of knowledge of the imagined situation
(e.g., imagining a 4-dimensional object).
What is the alternative to a picture in V1?
● Even accepting the tacit knowledge explanation of the
scanning result, there remains an open question: How
is the right amount of time computed in the scanning
experiment? I don’t claim that observers just stand by
idly until the right amount of time has passed and
then click the button indicating that the scan has
reached its goal (even though psychophysical studies
show that they are capable of doing so).
● I think there is something to the scanning explanation,
except that the space being scanned is not in the head
but in the concurrently perceived real space.
Are there any ways of representing spatial
layouts that are possible, given these problems?
● Maybe we have been looking in the wrong place for
things that fall under the formal requirements of being
spatial. Maybe they are not in the head after all.
● I have sketched a way of looking at this problem that
locates the spatial character of thought in the
concurrently-perceived world (see Chapter 5 of
Things and Places). I will end with just a hint of this
approach. It relies on findings from the study of the
interaction among perceptual modalities and imagery
as well as with motor actions and also neuroscience
findings concerned with coordinate transformation
mechanisms in the brain.
Another chapter in the imagery debate:
The relation of images to vision and motor control
● It has always seemed to me that one of the properties of
mental images that makes them appear spatial is that they
connect in certain ways not only with vision, but also with
the motor system:
 We can point to things in our image! <example>
● We can “project” our images onto perceived space – even
space perceived in different modalities. I believe that this
observation is the key to the spatial character of images.
● This projection does not require a picture to be projected,
only the location of a small number of features. Over the
past few decades I have been studying a mechanism called a
visual index, or a FINST, that is well suited for this task.
Using a concurrently perceived room to
anchor FINSTs tagged with map labels
Studies of mental scanning
Does it show that images have metrical space?
[Figure: Latency (secs, 0 to 1.8) plotted against relative distance on image for the scan image condition]
The image scanning effect was shown to be Cognitively Penetrable.
But what allows a smooth scan across the image is the perceptual
display. Without the perceived map scanning would not be smooth and
continuous and the timing would not be accurate (Pylyshyn & Cohen, 1999).
Where do we stand?
● It seems that a literal picture-in-the-brain theory is
untenable for many reasons – including the major
empirical differences between mental images and
cortical images. A serious problem with any format-based
explanation of mental imagery is the cognitive
penetrability of many of the imagery demonstrations.
● The pictorial quality of images may be an illusion that
arises from the similarity of the experience of imaging
and of seeing.
So how do we explain the similarity of the experience
of imagining and of seeing – the fact that they both
seem to involve a pictorial panoramic display?
It is very likely that neither experience directly reveals
the form of the representation.
This is what our conscious experience suggests
goes on in vision…
This is what the demands of explanation suggest
must be going on in vision…

For a copy of these slides see:

http://ruccs.rutgers.edu/faculty/pylyshyn/Imagery2011.pptx