Imagery slides

Imagery and Memory
● Memory Examples: Dual Code Theory
To recall Y you must first recall X
Windows, doorknob, glasses, other facial features, global-to-local
But: Something like the same thing happens in recall of alphabet letters and many other memorized lists
● Imageability ratings are more effective than frequency of occurrence or frequency of co-occurrence in paired-associates learning.

Vision is clearly involved when images are superimposed onto vision
Many experiments show that when you project an image onto a display the image acts very much like a superimposed display
• Shepard & Podgorny (paper folding task…)
• Interference effects (Brooks)

Controversial Perky effect: Perception or response bias?
Project an image onto a perceived form

Brooks’ spatial interference study
Respond by pointing to symbols in a table or by saying the words left or right

Perception or attention effects?
● Many impressive imagery effects can be plausibly attributed to attention
● Bisiach’s widely cited finding on visual neglect
Bartolomeo, P., & Chokron, S. (2002). Orienting of attention in left unilateral neglect. Neuroscience and Biobehavioral Reviews, 26(2), 217-234.
Dulin, D., Hatwell, Y., Pylyshyn, Z. W., & Chokron, S. (2008). Effects of peripheral and central visual impairment on mental imagery capacity. Neuroscience and Biobehavioral Reviews, 32(8), 1396-1408.

Does neglect require vision?
Chokron, S., Colliot, P., & Bartolomeo, P. (2004). The role of vision in spatial representations. Cortex, 40, 281-290.

We can to some extent control our attended region
Is an image being projected onto a percept, or is it just selective attention?
Farah, M. J. (1989). Mechanisms of imagery-perception interaction. Journal of Experimental Psychology: Human Perception and Performance, 15, 203-211.
Shepard & Podgorny experiment
Both when the displays are seen and when the F is imagined, RT to detect whether the dot was on the F is fastest when the dot is at the vertex of the F, then when on an arm of the F, then when far away from the F – and slowest when one square off the F.

Similarities between perception of visual scenes and ‘perception’ of mental images
● Judgments from mental images
Shape comparisons (of states: Shepard & Metzler)
Size comparisons (Weber fraction or ratio effect)
• What do they tell us about the format of images?
• But this applies to nonvisual properties (e.g., price, taste)

More demonstrations of the relation between vision, imagery (and later action)
● Images constructed from descriptions
The D-J example(s)
Perception or inference/guessing
● But there are even more persuasive counterexamples we will see later
The two-parallelogram example
• Amodal completion
• Reconstruals: Slezak
• Dynamic imagery
Imagining actions: Paper Folding

Mental rotation
Time to judge whether (a)-(b) or (b)-(c) are the same except for orientation. Time increases linearly with the angle between them (Shepard & Metzler, 1971)
What do you do to judge whether these two figures are the same shape? Is this how the process looked to you? When you make it rotate in your mind, does it seem to retain its rigid 3D shape without re-computing it?

Mental rotation – the real story
In mental rotation the phenomenology motivates the theory of “rotation” – but what the data actually show is that:
Mental rotation is only found when the comparison figures are enantiomorphs or if the difference between figure pairs can only be expressed in figure-centric coordinates (e.g., they are 3D mirror images).
No rotation occurs if the figures have landmarks that can be used to identify the relations among their parts.
Records of eye movements show that mental rotation is done incrementally: it is not a holistic rotation as often reported.
In fact even the phenomenology is not of a smooth continuous rotation.
The “rate of rotation” depends on the conceptual complexity of both the figure and the comparison task, so that, at least, is not a result of the architecture (Pylyshyn, 1979). There are even demonstrations that it depends on how the subject interprets the figure (Kosslyn, 1994).

Mental Scanning
● Hundreds of experiments have now been done demonstrating that it takes longer to scan attention between places that are further apart in the imagined scene. In fact the relation between time and distance is linear.
● These have been reviewed and described in: Denis, M., & Kosslyn, S. M. (1999). Scanning visual mental images: A window on the mind. Cahiers de Psychologie Cognitive / Current Psychology of Cognition, 18(4), 409-465.

Studies of mental scanning
Does it show that images have metrical space?
[Figure: latency (secs) plotted against relative distance on image]
Does this show that images are spatial, or have spatial properties, or that they “preserve metrical spatial properties”?
(Kosslyn, S. M., T. M. Ball, et al. (1978). "Visual images preserve metric spatial information: Evidence from studies of image scanning." Journal of Experimental Psychology: Human Perception and Performance 4: 46-60.)

The idea of images being in some sense spatial is an interesting and important claim
● I will discuss this claim at some length later because it reveals a deep and all-consuming error that runs through all imagery theorizing – by psychologists, neuroscientists and philosophers.
● This is in addition to the errors I discussed earlier: ignoring the fact that subjects understand the task of imagining something to be the task of pretending they are seeing it, and assuming that certain properties of the world are properties of the image (the intentional fallacy).

Constructing an image
● What determines what the image is like when it is constructed from memory or from knowledge?
● After constructing an image can you see novel aspects of the imagined situation?
● Examples

Examples to probe your intuition and your tacit knowledge
Imagine seeing these events unfolding…
● You hit a baseball. What shape trajectory does it trace? It is coming towards you: Where would you run to catch it? If you have ever played baseball you would have a great deal of “tacit knowledge” of what to do in such (well studied) cases.
● You drop a rubber ball on the pavement. Tap a button every time it hits the ground and bounces. Plot height vs time.
[Figure: height plotted against time since drop]
What is responsible for the pattern shown here?
● Drop a heavy steel ball at the same time as you drop a light ball (a tennis ball), e.g., from the leaning tower of Pisa. Indicate when they hit the ground. Repeat for different heights.
● Take a clear glass containing a colored liquid. Tilt it 45° to the left (counter-clockwise). What is the orientation of the liquid?

What color do you see when two color filters overlap?

Where would the water go if you poured it over a full beaker of sugar?
Is there conservation of volume in your image? If not, why not?

Seeing Mental Images
● Do images have size?
● Can we say that one image is larger than another?
● If so, what properties do we expect the smaller/larger image to have?

Do mental images have size?
Imagine a very small mouse. Can you see its whiskers?
Now imagine a huge mouse. Can you see its whiskers?

Do this imagery exercise:
Imagine a parallelogram like this one.
Now imagine an identical parallelogram directly below this one.
Connect each corner of the top parallelogram with the corresponding corner of the bottom parallelogram.
What do you see when you imagine the connections?
Did the imagined shape look (and change) like the one you see now?

Slezak figures
Pick one (or two) of these animals and memorize what they look like. Now rotate it in your mind by 90 degrees clockwise and see what it looks like.
Slezak figures rotated 90°

Space

Images and the representation of spatial properties
● We need to understand what it could mean for a representation to be spatial.
● At the very least it must mean that there are constraints placed on the form of the representation that do not apply when the representation is not spatial.

The idea that images are in some sense spatial is an interesting and important claim
● I will return to this claim later because it reveals a deep and ubiquitous error that runs through most (all?) imagery theorizing – by psychologists, neuroscientists and philosophers. This is the error of mistaking descriptive adequacy for explanatory adequacy. Let’s call this conflation the missing constraint error.
● This is in addition to the two errors I discussed earlier:
Ignoring the fact that the task of imagining something is actually the task of pretending you are seeing it, and
The mistaken assumption that certain properties of the world are properties of the image (the intentional fallacy)

Both vision and visual imagery have some connection to the motor system
● There are a number of experiments showing the close connection between images and motor control*
You can get Stimulus-Response compatibility effects between the location of an imagined stimulus in space and the location of the response button in space.
Ronald Finke showed that you could get adaptation to the imagined, misperceived position of the hand that was similar to adaptation to displacing prism goggles.
Both these findings provide support for the view that the spatial character of images comes from something being projected onto a concurrently perceived scene and then functioning much as objects of perception. (This is the main new idea in Chapter 5 of Things & Places.)

Recall the studies of mental scanning…
Does this result show that images have metrical spatial properties?
[Figure: latency (secs) to scan an image plotted against relative distance on image]

We showed that the image scanning effect is Cognitively Penetrable
But the way we compute the time it takes to scan across an image is by imagining something moving across the real perceived display. Without this display, we could not use our time-to-collision computation to compute the time to cross various distances on the image because there are no actual distances on the image! (Pylyshyn & Cohen, 1999)

Using a concurrently perceived room to anchor FINSTs tagged with map labels

The Spatial character of images
What does it mean to say that images are spatial?
● It means that certain constraints hold among spatial measures (e.g., axioms of geometry and measure theory, such as triangle inequality, symmetry of distances, Euclidean axioms, Pythagoras’ theorem…)
● That certain constraints hold among “distances”, that certain relations can be defined among these distances (e.g., ‘between’, ‘farther than’),
● that Newtonian Physics holds between the terms that are used in explanations (e.g., distances and time).
● That mental images and motor control interact with one another to some degree – so you can “point to” objects in your image.

Certain visual-motor ‘reflexes’ are automatic or preconceptual
They are computed within the encapsulated Visual Module
Preconceptual motor control is not sensitive to visual illusions, relative to control that is computed by the cognitive (‘seeing as’) system.

Mental images as “depictive” representations
● “A depictive representation is a type of picture, which specifies the locations and values of configurations of points in a space.
● The space in which the points appear need not be physical but can be like an array in a computer, which specifies spatial relations purely functionally.
That is, the physical locations in the computer of each point in an array are not themselves arranged in an array; it is only by virtue of how this information is “read” and processed that it comes to function as if it were arranged into an array….
● Depictive representations convey meaning via their resemblance to an object.
● When a depictive representation is used, not only is the shape of the represented parts immediately available to appropriate processes, but so is the shape of the empty space … [and] one cannot represent a shape in a depictive representation without also specifying a size and orientation….”

Form vs Content of images
● As in earlier discussion, one must be careful in distinguishing form from content. We know that there is a difference between the content of images and the content of other (nonimaginal) thought: Images concern sensory appearances while ‘propositions’ can express most* other contents.
● In attributing a special form of representation to images one should ask whether some symbolic system (e.g., sentences of LOT) would not do. Simplicity (Occam’s Razor) would then prefer a single format over two, especially if the one format is essential for representing thoughts and inferences [Fodor, J. A. and Z. W. Pylyshyn (1988). "Connectionism and cognitive architecture: A critical analysis." Cognition 28: 3-71.]
● The most promising contents that might require different forms of representation are those that essentially represent magnitudes. The magnitudes most often associated with images are spatial ones.
* There has been a long-standing debate in Artificial Intelligence concerning the advantages of logical formats vs other symbol systems vs something completely different (procedures).

Thou shalt not cheat
● There is no natural law that requires the representations of time, distance and speed to be related according to the motion equation.
You could just as easily imagine an object moving instantly or with constant acceleration or with any motion relation you like, since it is your image!
● There are two possible reasons why the observed relation
$\text{Actual Time} = \dfrac{\text{Representation of distance}}{\text{Representation of speed}}$
typically holds in an image-scanning task:
1. Because subjects have tacit knowledge that this is what would happen if they viewed a real display, or
2. Because the matrix is taken to be a simulation of a real physical display, as it often is in computer science.
Notice that in the second case the explanation for the Reaction Time comes from the simulated real display and not from the matrix.

The missing constraint in appeals to “space” in both scanning and mental rotation
• What is assumed about the format or architecture of the mental representation in the examples of mental rotation?
• According to philosopher Jesse Prinz (2002, p. 118), “If visual-image rotation uses a spatial medium of the kind Kosslyn envisions, then images must traverse intermediate positions when they rotate from one position to another. The propositional [i.e., symbolic] system can be designed to represent intermediate positions during rotation, but that is not obligatory.”
• This is a very important observation, but it is incomplete. One still needs to answer the question: What makes it obligatory that the object must ‘pass through intermediate positions’ when rotating in ‘functional space’, and what constitutes an ‘intermediate position’? These terms apply to the represented world, not to the representation!
The important distinction between architecture and represented content
● It is only obligatory that a certain pattern must occur if the pattern is caused by fixed properties of the architecture as opposed to being due to properties of what is represented (i.e., what the observer tacitly knows about the behavior of what is represented).
If it is obligatory only because the theorist says it is, score that as a free empirical parameter that any theory can assume.
This failure of image theories is quite general – all picture theories suffer from the same lack of principled constraints.

The important distinction between descriptive and explanatory adequacy
● It is important to recognize that if we allow one theory to stipulate what is obligatory without there being a principle that mandates it, then any other theory can stipulate the same thing. Such theories are unconstrained, so they can fit any possible observation – i.e., they are able to describe anything but explain nothing.
● A theory that does not explain why some pattern is obligatory can still be useful the way an organized catalog is useful. It may even list the features according to which it is organized. But it does not give an account of why it is organized that way rather than some other way. To do that it needs to appeal to something constant such as a law of nature or a fixed property of the architecture.

How are these ‘obligatory’ constraints realized?
● Image properties, such as size and rigidity, are assumed to be inherent in the architecture (e.g., of the ‘display’).
● That raises the question of what kind of architecture could possibly enforce rigidity of shape. Notice that there is nothing about a spatial display, let alone a functional space, that makes it obligatory that shape be rigidly maintained as orientation is changed. Such rigidity could not be a necessary property of the architecture of an image system because we can easily imagine that rigidity does not hold (e.g.
imagine a rotating snake!). There is also evidence that ‘mental rotation’ is incremental, not holistic, and the speed of rotation depends on the conceptual complexity of the shape and the comparison task.

What makes some properties seem “natural” in a matrix but not so natural in a symbolic data structure?
1. A matrix is generally viewed as a two-dimensional structure in which specifying the x and y values (rows and columns) specifies the location of any cell. But that’s just the way it is conventionally viewed. Rows, columns and cells are not actually spatial locations.
2. In a computer there is no requirement that in getting from one cell to another one must pass through any other specified cells nor is there any requirement that there be empty cells between any pairs of cells.

What makes some properties “natural” in a matrix while not so natural in a symbolic data structure?
3. The main reason it is natural to view a matrix as having spatial constraints is that one is tacitly assuming that it represents some space. Then it is the represented space that has the constraints, not the matrix. Notice the subtle succumbing to the intentional fallacy again!
4. Any constraints that the functional space exhibits are constraints extrinsic to the format. Such constraints reside in the external world which the ‘functional space’ represents. But such extrinsic constraints can be added to any model of scanning, including a propositional one.

What warrants the ‘obligatory’ constraint?
But it is no more obligatory that the relation between distance, speed and time hold in functional space than in a symbolic (propositional) representation. There is no natural law or principle that requires it. You could imagine an object moving instantly or according to any motion relation you like, and the functional space would then comply with that motion since it has no constraints of its own.
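The point about matrices can be made concrete with a minimal Python sketch. Everything in it is an assumption introduced for illustration: the flat-list storage and the names `make_matrix`, `read_cell`, `steps_through_cells` and `steps_by_jump` are hypothetical and are not taken from any imagery model. The sketch shows that reading a cell visits no other cell, and that "scanning" can be counted under two equally available policies, neither of which is imposed by the data structure.

```python
# Illustrative sketch only: names and policies are hypothetical.
from typing import List, Tuple

Cell = Tuple[int, int]  # (row, column) labels: a reading convention, not a location


def make_matrix(rows: int, cols: int) -> List[str]:
    """Store a rows x cols 'matrix' as one flat list of cells."""
    return ["." for _ in range(rows * cols)]


def read_cell(flat: List[str], cols: int, cell: Cell) -> str:
    """Fetch a cell directly from its index; no other cell is visited on the way."""
    row, col = cell
    return flat[row * cols + col]


def steps_through_cells(start: Cell, end: Cell) -> int:
    """Stipulated policy A: move one neighbouring cell at a time (diagonals allowed)."""
    return max(abs(end[0] - start[0]), abs(end[1] - start[1]))


def steps_by_jump(start: Cell, end: Cell) -> int:
    """Stipulated policy B: go straight to the target cell in a single step."""
    return 0 if start == end else 1


if __name__ == "__main__":
    grid = make_matrix(10, 10)
    grid[9 * 10 + 9] = "X"
    print(read_cell(grid, 10, (9, 9)))  # direct access to the 'far corner'
    for target in [(0, 1), (0, 5), (9, 9)]:
        print(target, steps_through_cells((0, 0), target), steps_by_jump((0, 0), target))
```

Under policy A the step count grows with the separation of the labels; under policy B it is constant. Choosing A over B is a further stipulation, typically justified by treating the matrix as a stand-in for real space.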
So why does it seem more natural for imagined moving objects to traverse a ‘functional space’ than a sequence of symbolic representations of locations?
There are at least two reasons why a ‘functional space’ might seem more natural than a symbolic representation of space, and both depend on (1) subjective experience and (2) the intentional fallacy.

Where does the obligatory constraint come from?
There are at least two reasons why the following equation holds in the mental image scanning task, even though, unlike in the real vision case, it does not follow from a natural law.
$\text{Actual Time} = \dfrac{\text{Representation of distance}}{\text{Representation of speed}}$
1. Because subjects have tacit knowledge that this is what would happen if they viewed a real display, and they understand the task to be one of reproducing properties of this viewing, or
2. Because the matrix is taken to be a simulation of real space. In that case the reason that the equation holds is that it is supposed to be simulating real space and the equation holds in real space. In that case it is not something about the form of the representation that provides the principled constraint; it is the fact that it is supposed to be simulating real space, which is where the obligation comes from. But the same thing can be done for any form of representation.

Why is it ‘natural’ to assume that functional space is like real space?
There are several reasons why a functional space, such as a matrix data structure, appears to have natural spatial properties (e.g., distances, size, empty places):
1. Because when we think of functional space, such as a matrix, we think of how we usually interpret it. A matrix does not intrinsically have distance, empty places, direction or any other such property, except in the mind of the person who draws it or uses it! Moving from one cell to another does not require passing through intermediate cells unless we stipulate that it does. The same goes for the concept of ‘intermediate cell’ itself.
Why is it ‘natural’ to assume that functional space is like real space?
2. Because when we think of a functional space, such as a matrix, we think of it as being a way of simulating real space in the model – making it more convenient to build the model which otherwise would require special hardware.
This is why we think of some cells as being ‘between’ others and some being farther away. This makes properties like distances seem natural because we interpret the matrix as simulating real space.
In that case we are not appealing to a functional space in explaining the scanning effect, the size effect, etc. The explanatory force of the explanation comes from the real space that we are simulating.
• This is just another way of assuming a real space (in the brain) where representations of objects are located in neural space
• All the reasons why the assumption of real brain space cannot be sustained in explanations of mental imagery phenomena apply to this version of ‘functional space.’

Why is it ‘natural’ to assume that functional space is like real space?
2. Because what we really want to claim is that images are displayed on a real spatial surface – a blackboard. But to model this we would need to build a hardware display. (An easier way to do this is simply to claim explicitly that there is a display or even simulate one using software, such as Kosslyn, et al. (1979) claim to have done*.)
This allows us to view some cells as being ‘between’ others and some being farther away. This makes properties like distances seem natural because we interpret the matrix as simulating or standing in for a real spatial display board or screen.
In that case we are not appealing to a functional space in explaining the scanning effect, the size effect, etc. The explanatory force of the explanation comes from the real space that we are claiming and simulating.
This is just another way of assuming a real space (in the brain) where representations of objects are located in neural space.
Functional space and explanatory power
● There is a notion of explanatory power that needs to be kept in mind. It is best illustrated in terms of models that contain empirical parameters, as in fitting a polynomial curve to data.
● The general fact about fitting a model to data is that the fewer parameters that need to be estimated from the data to be fitted, the more powerful the explanation. The most powerful explanation is one that does not have to use the to-be-fitted data to tune the model.
● In terms of the current example of explaining results of experiments involving mental imagery, appealing to a “functional space” leaves open an indeterminate number of empirical parameters, so it provides a very weak (or vacuous) explanation.
● A literal (brain) space, on the other hand, is highly constrained since it must conform to Euclidean axioms and Newtonian physics – otherwise it would not be the space of natural science. But that kind of space implies that images are displayed on a surface in the brain and while that is a logical possibility it is not an empirical one.

Explanation and Description
● Another way to look at what is going on is to think about the difference between a description and an explanation. The two ways of characterizing a set of phenomena appear similar – they both speak of how things are and how they change (think of the Code Box example).
● But a description of a system’s behavior can apply to many different types of system with different mechanisms and different causal properties. And the same mechanisms can also produce very different behaviors under different circumstances.
Although a general statement of what constitutes scientific explanation and how it differs from description has a long and controversial history, the simple Code Box example will suffice to suggest the distinction I have in mind.

Cognitive Penetrability again
● A description states the observed generalizations (the observed patterns of behavior).
The explanation goes beyond this. The difference is related to the question of how mutable a set of generalizations is and what types of effects can lead to changes in these generalizations.
● Causal accounts tend to have a longer time scale and when they change they tend to change according to different sorts of principles than those that describe the patterns.*
Notice that we have come back to the criterion of cognitive penetrability. According to this way of looking at the question of explanatory adequacy, a theory meets the criteria of explanatory adequacy if it describes the architecture of the system and its operation.

A note about time scales and types of changes
* “Causal accounts tend to have a longer time scale and when they change they tend to change according to different sorts of principles than those that describe the patterns.”
Consider the Code Box example. Changes that are not architectural tend to occur rapidly – different patterns are observed simply because different topics or words or even languages might be transmitted.
Changes that are architectural require altering which letters or other symbols are transmitted (e.g., they may be numerals) or changing whether the outputs consist of short and long pulses that are interpretable as Morse Code. They require what we might think of as “rewiring”.
A different way of approaching the question of spatial representation
I offer a provisional proposal that preserves some of the advantages of the global spatial display, but assumes that the relevant spatial properties are in the perceived world and can be accessed if we have the right access mechanisms for selecting and indexing objects in the perceived world.
Let’s call this the Index Projection Hypothesis because it suggests that mental objects are somehow projected onto and associated with perceived objects in real space.
But this proposal is very different from image-projection because only a few object-labels are projected – not the rich visual properties suggested by the phenomenology.

The Index Projection Hypothesis
This projection hypothesis relies on the spatial locations of objects in the concurrently perceived world to meet the conditions outlined earlier. It rests on two assumptions:
1. We have a system of “pointers” (viz., the FINST perceptual index mechanism to be described) by which a small number (n≤4) of objects in the world can be selected and indexed. Indexes provide demonstrative references to individual targets qua individuals, and they keep referring to these objects despite changes in their location or any other properties.
2. When we perceive a scene that contains indexed objects, our perceptual system is able to treat those objects as though they were assigned unique labels. Thus our perceptual system is able to detect configurational properties among the indexed objects.

The index projection hypothesis (2)
The hypothesis claims that the subjective impression that we have access to a panorama of detailed high-resolution perceptual information is illusory. What we have access to is only information about selected or indexed objects. We have the potential to obtain other information from more of the scene through the use of our system of perceptual indexes.
This is the basic insight expressed in the “world as external memory” slogan or the “situated cognition” approach.
In reasoning using mental images we may assign indexes to perceived objects based on our memory of (or our assumptions about) where certain ‘mental’ objects are located.
But notice that the memory representation is itself not used in spatial reasoning and therefore need not meet the spatial constraints listed earlier – it can be in some general LOT.

Examples of the projection hypothesis
To illustrate how the projection hypothesis works, first consider index-based projection in the visual modality, where indexes can convert some apparently mental-space phenomena into perceived-space phenomena (more on the non-visual case later).
Examples from some ‘mental imagery’ experiments:
Mental scanning (Kosslyn, 1973)
Mental image superposition (Podgorny & Shepard, 1978)
Visual-motor adaptation (Finke, 1979)
S-R compatibility to imagined locations (Tlauka, 1998)

Studies of mental scanning
Often cited to suggest that spatial representations are literally spatial and have metrical properties
[Figure: memorized map with marked places (tower, windmill, steeple, tree, beach); plot of time to “see” feature on image against distance on image]

Brain image or index-based projection?
A way to do this task:
Associate places on the memorized map with objects located in the same relative locations in the world that you perceive (e.g., the room you are in)
Move your attention or gaze from one place to another as they are named

Using a perceived room to anchor FINSTs tagged with map labels

Using vision with selected ‘labeled’ objects
If you ‘project’ the pattern of map places by indexing objects in the room in front of you that correspond to the memorized relative locations, then you can scan attention from one such indexed object to another.
The relation $\text{time} = \dfrac{\text{distance}}{\text{speed}}$ holds because the space you are scanning is the real physical space in the room.
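The index-based account of scanning just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the room coordinates (in metres), the attention-shift speed, and the names `room_objects`, `index_bindings` and `scan_time` are all invented for the example. Map labels are bound to a few objects in the perceived room, and scan time is computed from the real distances between those objects rather than from any distance "in the image".

```python
# Illustrative sketch: all coordinates and the speed value are invented for the example.
import math
from typing import Dict, Tuple

Point = Tuple[float, float]

# Positions of a few objects in the perceived room (metres, hypothetical values).
room_objects: Dict[str, Point] = {
    "lamp": (0.0, 0.0),
    "chair": (3.0, 4.0),
    "door": (6.0, 8.0),
}

# Map labels bound (indexed) to room objects; only a few bindings are held at once.
index_bindings: Dict[str, str] = {
    "tower": "lamp",
    "windmill": "chair",
    "beach": "door",
}


def scan_time(label_a: str, label_b: str, speed: float) -> float:
    """Time to shift attention between two indexed objects: real distance / speed."""
    pa = room_objects[index_bindings[label_a]]
    pb = room_objects[index_bindings[label_b]]
    return math.dist(pa, pb) / speed


if __name__ == "__main__":
    for a, b in [("tower", "windmill"), ("tower", "beach")]:
        print(a, b, scan_time(a, b, speed=2.5))
```

In this sketch the linear relation between time and distance is inherited from the geometry of the room; the memory that supplies the bindings is just a lookup table and carries no metric of its own.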
You can also use the indexed objects to infer configurational properties you may not have noticed, despite memorizing the location of objects, e.g.:
Which 3 or more places on the map are collinear?
Which place on the map is furthest North (or South, East, West)?
Which 3 places form an isosceles triangle?
Such configurational consequences can be detected, as opposed to logically inferred, so long as they involve only a few places, because the visual system can examine the indexed objects in the scene.

Connecting Images and Motor actions

Images and visual-motor phenomena
S-R Compatibility / Simon effect
Finke’s imagined wedge goggles
Harry’s subitizing-by-pointing

Both vision and visual imagery have some connection to the motor system
There are a number of experiments showing the close connection between images and motor control*
You can get Stimulus-Response compatibility effects between the location of an imagined stimulus in space and the location of the response button in space.
Ronald Finke showed that you could get adaptation to the imagined, misperceived position of the hand that was similar to adaptation to displacing prism goggles.
Both these findings provide support for the view that the spatial character of images comes from something being projected onto a concurrently perceived scene and then functioning much as objects of perception. (This is the main new idea in Chapter 5 of Things & Places.)

This story is plausible for visual cases, but how does it work without vision (e.g., in the dark)?
We must rely on our remarkable capacity to orient to (point to, navigate towards, …) perceived or recalled objects (including proprioceptive ‘objects’) in space without vision.
Call this general capacity our spatial sense.
How can the projection hypothesis account for this apparently world-centered spatial sense without assuming a global allocentric frame of reference?
Answer: Just as it does with vision, by anchoring represented objects to (non-visually) perceived objects in the world

The spatial sense and the projection hypothesis
Indexing non-visual ‘objects’ must exploit auditory and proprioceptive signals, and perhaps even preparatory motor programs (the ‘intentional’ frame of reference proposed by Andersen & Buneo, 2002; Duhamel, Colby & Goldberg, 1992).
Is there some special problem about proprioceptive inputs that makes them different from visual inputs?

Is there a problem with proprioceptive inputs indexing objects the way visual indexes do?
Unlike visual objects, proprioceptive “objects” are not fixed in an allocentric frame of reference – or are all objects the same?
Notice that in vision and audition, even though static objects are fixed in an allocentric frame of reference, they nonetheless move relative to sensors, so their location in an allocentric frame must be updated as the proximal pattern moves (Andersen, 1999; Stricanne, Andersen & Mazzoni, 1996).
The neural implementation of FINST indexes in vision requires an active updating process of some kind.
Maybe the same updating operation can also yield the sense of “same location in space” for proprioceptive ‘objects’.
There are good reasons to think that proprioceptive signals may also be given in an allocentric frame of reference! (Yves Rossetti)

What is the real problem of our sense of space?
In order to solve the problem of how we index objects in the world using proprioceptive inputs we need to solve the problem of how we recognize two such inputs as corresponding to actions (e.g., reaching) towards the same object in the world.
This is the problem of the equivalence of movements, or of proprioceptive inputs, corresponding to the same object – it is the problem that Henri Poincaré recognized as the central problem of understanding our sense of space (in Poincaré, “Why space has three dimensions”, Dernières Pensées, 1913).
Solving the equivalence problem would solve the problem of coordinating signals across frames of reference.
That’s why mechanisms of coordinate transformation are of central importance – they generate the relevant equivalences!

Assumption: Coordinate transformations are the basis for the illusory “global frame of reference”
A coordinate transformation operation takes a representation of an object relative to one coordinate system – say retinal coordinates – and produces a representation of that object relative to another frame of reference – say relative to the location of a hand in proprioceptive or kinematic coordinates.
Coordinate transformations define equivalence classes of proprioceptive inputs that correspond to actions (e.g., reaching, eye movements) towards the same object in space.
Such transformations are well-known and ubiquitous in the brain (especially in posterior parietal cortex and superior colliculus).
A consequence of these mechanisms is that, as Colby & Goldberg (1999) put it, “Direct sensory-to-motor coordinate transformation obviates the need for a single representation of space in environmental coordinates” (p. 319).

Coordinate transformations need not transform all points in a given frame of reference
Coordinate transformations need not transform all points (including points in empty space) or all sensory objects: Only a few selected objects need to be transformed at any one time.
The computational complexity of
coordinate transformations can be made tractable by only transforming selected objects (as is done by matrix operations in computer graphics).
This idea is closely related to the conversion-on-demand (COD) hypothesis of Henriques et al. (1998) and Crawford et al. (2004). In the Henriques et al. COD proposal, visual information about object locations is held in a gaze-centered frame of reference and objects are converted to motor coordinates when needed.

Coordinate transformations define equivalence classes of gestures which individuate proprioceptive objects just the way that FINST indexes do in vision
Coordinate transformations compute equivalence classes of proprioceptive signals {s} corresponding to distinct motor actions to individual objects in real space. The equivalence class is given by: s ≡ s′ iff there is a coordinate transformation between s and s′ (s ↔ s′).
As in the visual case, only a few such equivalence classes are computed, corresponding to a few distal objects that were selected and assigned an index, as postulated in FINST Theory.
We can thus bind several objects of thought to objects in real space (including sensory ‘objects’ perceived in proprioceptive modalities).
This can explain the ‘spatial’ character of spatial representations, just as FINST indexes did in the purely visual cases illustrated earlier.

Mental imagery and neuroscience
● Neuroanatomical evidence for a retinotopic display in the earliest visual area of the brain (V1)
● Neural imaging data showing V1 is more active during mental imagery than during other forms of thought. The form of activity differs for small vs large images in the way that it differs when viewing small and large displays
● Transcranial magnetic stimulation of visual areas interferes more with imagery than other forms of thought
● Clinical cases show that visual and image impairment tend to be similar (Bisiach, Farah)
● More recently psychophysical measures of images show parallels with comparable measures of vision, and these can be
related to the receptive cells in V1

Status of different types of evidence in the debate about the form of mental images
● Phenomenology. Is it epiphenomenal?
● Neuroscience evidence for:
• Role of vision
• Type and location of neural structures underlying images

Are the neural mechanisms for early vision used in imagery?
Does neuroanatomy provide evidence for the nature of “depictive” representations?

Neuroscience has shown that the retinal pattern of activation is displayed on the surface of the cortex
There is a topographical projection of retinal activity on the visual cortex of the cat and monkey.
Tootell, R. B., Silverman, M. S., Switkes, E., & De Valois, R. L. (1982). Deoxyglucose analysis of retinotopic organization in primate striate cortex. Science, 218, 902-904.

Problems with drawing conclusions about the nature of mental images from neuroscience data
1. The capacities for imagery and for vision are known to be independent. Also all imagery results are observed in the blind.
2. Cortical topography is 2-D, but mental images are 3-D – all phenomena (e.g. rotation) occur in depth as well as in the plane.
3. Patterns in the visual cortex are in retinal coordinates whereas images are in world-coordinates. Your image stays fixed in the room when you move your eyes or turn your head or even walk around the room.
4. Accessing information from an image is very different from accessing it from the perceived world. Order of access from images is highly constrained. Conceptual rather than graphical properties are relevant to image complexity (e.g., mental rotation).
5. Retinal and cortical images are subject to Emmert’s Law, whereas mental images are not.
6. The signature properties of vision (e.g. spontaneous 3D interpretation, automatic reversals, apparent motion, motion aftereffects, and many other phenomena) are absent in images.
7. A cortical display account of most imagery findings is incompatible with the cognitive penetrability of mental imagery phenomena, such as scanning and image size effects.
8. The fact that the Mind’s Eye is so much like a real eye (e.g., oblique effect, resolution fall-off) should serve to warn us that we may be studying what observers know about how the world looks to them, rather than what form their images take.
9. Many clinical cases can be explained by appeal to tacit knowledge and attention.
• The ‘tunnel effect’ found in vision and imagery (Farah) is likely due to the patient knowing what things now looked like to her post-surgery.
• Hemispatial neglect seems to be a deficit in attention, which also explains the “representational neglect” in imagery reported by Bisiach.
• A recent study shows that imaginal neglect does not appear if patients have their eyes closed. This fits well with the account I will offer, in which the spatial character of mental images derives from concurrently perceived space.
10. What if colored three-dimensional images were found in visual cortex? What would that tell you about the role of mental images in reasoning? Would this require a homunculus?

Should we welcome back the homunculus?
● In the limit, if the visual cortex contained the contents of one’s conscious experience in imagery we would need an interpreter to “see” this display in visual cortex.
● But we will never have to face this prospect because many experiments (including ones by Kosslyn) show that the contents of mental images are conceptual (or, as Kosslyn puts it, contain “predigested information”).
● And finally, it is clear to anyone who thinks about it for a few seconds that you can make your image do whatever you want and have whatever properties you wish.
There are no known constraints on mental images that cannot be attributed to lack of knowledge of the imagined situation (e.g., imagining a 4D cube). All currently claimed properties of mental images are cognitively penetrable.

Explaining mental scanning, mental rotation and image size effects in terms of “functional space”
● When people are faced with the natural conclusion that the “iconic” position entails space (as in scanning and size effects) they appeal to “functional space”
● A matrix in a computer is often cited as an example
● Consider a functional space account of scanning or of mental rotation: Why does it take longer to scan a greater distance in a functional space? Why does it take longer to rotate a mental image through a greater angle?

Why do conscious contents misguide us?
The contents that appear in our conscious experience almost always concern what we are thinking about and not what we are thinking with – with content rather than form.
The processes that we see unfolding in our mind are almost always attributable to what we know about how the things we are thinking about would unfold, rather than being due to laws that apply to our cognitive architecture. cf Code Box
We should take seriously the possibility that (almost) all constraints and law-like behaviors of objects of our experience are constraints due to our knowledge rather than to the mental architecture. Notice the mental rotation example and the mistake that Jesse Prinz makes.

This is what our conscious experience suggests goes on in vision… Kliban
This is what the demands of explanation suggest must be going on in vision…

Imagine this shape rotating slowly
Is this how it looked to you? When you make it rotate in your mind, does it retain its rigid 3D shape without re-computing it?
Would you expect to ‘see’ this kind of information process?
Does the experience in this case reassure you that the rotation was smooth? Are you sure something rotated?
What about the evidence of conscious experience? Is it irrelevant?
I have often been accused of relegating conscious experience to the category of epiphenomena – something that accompanies a process but does not itself have a role in its causation. But images are not illusory or unnatural, they are quite real.
The problem is that people have theories of the causal or information processing that underlies these phenomena, and these theories are almost always false because they assume a simple and obvious mapping from the experience to the computational or brain states, so that we can see the form of the representation.
The connection between conscious experience and information processing is deeply mysterious (it’s the mind-body problem). But one thing we do know is that the sequence of events that unfolds when we imagine something does not reveal causal laws, because there are no causal laws of conscious states as conscious states.

The important distinction between architecture and represented content
It is only obligatory that a certain pattern must occur if the pattern is caused by fixed properties of the architecture, as opposed to being due to properties of what is represented (i.e., what the observer tacitly knows about the behavior of what is represented).
If it is obligatory only because the theorist says it is, then score that as a free empirical parameter (a wild card).
If we allow one theory to stipulate what is obligatory without there being a principle that mandates it, then any other theory can stipulate the same thing. Such theories are unconstrained and explain nothing.
This failure of image theories is quite general – all picture theories suffer from the same lack of principled constraints.

How are these ‘obligatory’ constraints realized?
Image properties, such as size and rigidity, are assumed to be inherent in the system of representation (its architecture).
That raises the question of what kind of architecture could possibly enforce rigidity of shape.
Notice that neither a spatial display nor a functional space makes it obligatory that shape be rigidly maintained as orientation is changed. Only certain physical properties can explain rigidity.
Such rigidity could not be part of the architecture of an imagery system because we can easily imagine situations in which rigidity does not hold (e.g. imagine a rotating snake!).
There is also evidence that ‘mental rotation’ is incremental, not holistic, and the speed of rotation depends on the conceptual complexity of the shape and the comparison task.

Aside: What can we conclude from the contents of conscious experience?
What should we conclude about the role of conscious appearance in cognitive science?
We can’t do without it: When we ask which line appears longest or which version of an ambiguous figure we see, we are asking for a report of conscious content. Much of what we know of how vision works depends on such evidence.
But we can’t accept it at face value: If you ask yourself “what am I thinking …?” you raise one of the most mysterious problems in the philosophy of mind. You could not possibly give the correct, or at least an unproblematic, answer because it is quite possible that you don’t know what you are thinking!
Your experience when thinking is as of speaking or of perceiving – what else could it be?
Every imagined sentence is infinitely ambiguous and understanding it presupposes a huge amount about the context of its utterance.
Your experience of speaking cannot be the same as your thought; your thought precedes your imagined speech and is just one of the contents that are expressed.
The sentences you imagine, like the sentences you speak, follow Gricean conversational maxims – e.g., don’t state what your listener already knows, state only what is relevant, state only as much as necessary to convey your intentions. Something similar is true of imaging.

Examples of conscious evidence that leads to false conclusions
• Sentence example
• Example of mental diagram and what it must assume

Do we (or can we) experience our thoughts?
There is much that can be said on this topic, but there is no time for it here. But see:
Grice, H. P. (1975). "Logic and Conversation." Syntax and Semantics 3: 41-58.
Hurlburt, R. T., & Schwitzgebel, E. (2007). Describing Inner Experience? Cambridge, MA: MIT Press.
Schwitzgebel, E. (2011). Perplexities of Consciousness. Cambridge, MA: MIT Press.

But there are examples of solving geometry problems easily with imagery
• There are many problems that you can solve much more easily when you imagine a layout than when you do not.
• In fact there are many instances of solving problems by imagining a layout that seem very similar to how one would solve them with pencil and paper.
• The question of how pictures, graphs, diagrams, etc. help in reasoning is very closely related to the question of how imagined layouts function in reasoning. That is not in question. What is in question is what happens in either the visual or imagined cases and how images can benefit from these processes even though there is no real diagram.

How do real visual displays help thinking?
Why do diagrams, graphs, charts, maps, icons and other visual objects help us to reason and to solve problems?
The question of why visual aids help is nontrivial, and Seeing & Visualizing, chapter 8, contains some speculative discussion; e.g., they allow the visual system to:
• make certain kinds of visual inferences
• make use of visual demonstratives to offload some of the memory load
• capitalize on the fact that the displays embody the axioms of measure theory and of geometry (which are then inherited by thought)
The big question is whether any of these advantages carry over to imaginal thinking!

Do mental images have some (or any) of the critical properties that make diagrams helpful in reasoning?

Visual inferences?
● If we recall a visual display it is because we have encoded enough information about its visual-geometrical properties that we can meet some criteria, e.g., we can draw it. But there are innumerably many ways to encode this information that are sufficient for the task (e.g. by encoding pairwise spatial relations, global spatial relations, and so on). For many properties the task of translating from one form to another is much more difficult than the task of visually encoding it – the translation constitutes visual inference.
● The visual system generalizes from particular instances as part of its object-recognition skill (all recognition is recognition-as and therefore assumes generalization from tokens to types). It is also very good at noticing certain properties (e.g., relative sizes, deviations from square or circle, collinearity, inside, and so on). These capabilities can be exploited in graphical layouts.

Memorize this map so you can draw it accurately
From your memory:
• Which groups of 3 or more locations are collinear?
• Which locations are midway between two others?
• Which locations are closest to the center of the island?
• Which pairs of locations are at the same latitude?
• Which is the top-most (bottom-most) location?
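A minimal sketch of the point behind these queries, assuming a made-up map: the place names and coordinates below are hypothetical and do not come from the slide. Only locations are stored, which stands in for "what was encoded". The relations asked about are then read off the layout, not retrieved from memory.

```python
from itertools import combinations

# Hypothetical map: names and (x, y) positions are invented for illustration.
# Only locations are stored; no relation between places is encoded.
PLACES = {
    "beach": (0, 0),
    "well": (2, 2),
    "tower": (4, 4),
    "church": (4, 0),
    "windmill": (2, 6),
}

def collinear(p, q, r):
    """True if three points lie on one line (zero cross product)."""
    return (q[0] - p[0]) * (r[1] - p[1]) == (q[1] - p[1]) * (r[0] - p[0])

def collinear_triples(places):
    """All name triples whose locations are collinear."""
    return [t for t in combinations(sorted(places), 3)
            if collinear(*(places[n] for n in t))]

def midway(places):
    """(middle, end1, end2) triples where middle is the midpoint of the ends."""
    found = []
    for a, b in combinations(sorted(places), 2):
        (ax, ay), (bx, by) = places[a], places[b]
        for m, (mx, my) in places.items():
            if m not in (a, b) and (2 * mx, 2 * my) == (ax + bx, ay + by):
                found.append((m, a, b))
    return found

def same_latitude(places):
    """Pairs of places that share a y coordinate."""
    return [(a, b) for a, b in combinations(sorted(places), 2)
            if places[a][1] == places[b][1]]

def top_most(places):
    """Name of the place with the largest y coordinate."""
    return max(places, key=lambda n: places[n][1])

print(collinear_triples(PLACES))
print(midway(PLACES))
print(same_latitude(PLACES))
print(top_most(PLACES))
```

None of the four answers is stored in `PLACES`; each is derived from the layout on demand. The slides raise the question of whether a mental image supports this kind of read-out, and the sketch takes no position on that.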
If you could draw the map from memory using whatever properties you noticed and encoded, you could easily answer the questions by looking at your drawing – even if you had not encoded the relations in the queries.

Draw a rectangle. Draw a line from each of the bottom corners to a point on the opposite vertical side. Do these two lines intersect? Is the point of intersection of the two lines below or above the midpoint? Does it depend on the particular rectangle you drew?
[Figure labels: A, B, x, y, m, m′, D, C]

Which properties of a real diagram also hold for a mental diagram?
● A mental “diagram” does not have any of the properties that a real diagram gets from being on a rigid 2D surface.
● When you imagine 3 points on a line, labeled A, B, and C, must B be between A and C? What makes that so? Is the distance AC greater than the distance AB or BC?
● When you imagine drawing point C after having drawn points A and B, must the relation between A and B remain unchanged (e.g., the distance between them, their qualitative relation such as above or below)? Why?
● These questions raise what is known as the frame problem in Artificial Intelligence. If you plan a sequence of actions, how do you know which properties of the world a particular action will change and which it will not, given that there are an unlimited number of properties and connections in the world?
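Returning to the rectangle exercise above, here is a hedged worked example of what a real drawing settles for you. The original figure is not available, so the layout is an assumption: D = (0, 0) and C = (w, 0) are the bottom corners, x = (0, a) lies on the left side and y = (w, b) on the right side, with 0 < a, b ≤ h. Under that assumption the lines D→y and C→x always cross, at height ab/(a + b). This height never exceeds h/2, reaches h/2 only when both lines are the diagonals, and does not depend on the width w.

```python
from fractions import Fraction

def crossing_point(width, height, a, b):
    """Intersection of line D->y with line C->x, in exact arithmetic.

    Assumed layout (hypothetical; the slide's figure is missing):
    D = (0, 0), C = (width, 0), x = (0, a), y = (width, b),
    with 0 < a <= height and 0 < b <= height.
    """
    w, h, a, b = (Fraction(v) for v in (width, height, a, b))
    if not (0 < a <= h and 0 < b <= h):
        raise ValueError("x and y must lie on the vertical sides, above the base")
    s = b / (a + b)              # fraction of the way from C to x where the lines meet
    return (w * (1 - s), a * s)  # the point C + s * (x - C)

# The crossing height a*b/(a+b) is the same for any width.
print(crossing_point(4, 2, 1, 2))
print(crossing_point(10, 2, 1, 2))
# Both lines are diagonals here, so they cross exactly at mid-height.
print(crossing_point(4, 2, 2, 2))
```

On paper the answer is visible at a glance. Getting it without a drawing takes the algebra above, which is the contrast between real and mental diagrams that the slide is probing.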