Different modes of visual organization for perception and for action

Melvyn A. Goodale¹ and Tzvi Ganel²

To appear in: Oxford Handbook of Perceptual Organization, Oxford University Press. Edited by Johan Wagemans.

¹ The Brain and Mind Institute, The University of Western Ontario, London, Ontario, Canada N6A 5C2
² Department of Psychology, Ben-Gurion University of the Negev, Be’er-Sheva, Israel

1. Introduction

We depend on vision, more than on any other sense, to perceive the world of objects and events beyond our bodies. We also use vision to move around that world and to guide our goal-directed actions. Over the last twenty-five years, it has become increasingly clear that the visual pathways in the brain that mediate our perception of the world are quite distinct from those that mediate the control of our actions. This distinction between ‘vision-for-perception’ and ‘vision-for-action’ has emerged as one of the major organizing principles of the visual brain, particularly with respect to the visual pathways in the cerebral cortex (Goodale and Milner, 1992; Milner and Goodale, 2006).

According to Goodale and Milner’s (1992) account, the ventral stream of visual processing, which arises in early visual areas and projects to inferotemporal cortex, constructs the rich and detailed representation of the world that serves as a perceptual foundation for cognitive operations, allowing us to recognize objects, events, and scenes, attach meaning and significance to them, and infer their causal relations. Such operations are essential for accumulating a knowledge base about the world. In contrast, the dorsal stream, which also arises in early visual areas but projects instead to the posterior parietal cortex, provides the necessary visual control of skilled actions, such as manual prehension. Even though the two streams have different functions and operating principles, in everyday life they have to work together.
The perceptual networks of the ventral stream interact with various high-level cognitive mechanisms and enable an organism to select a goal and an associated course of action, while the visuomotor networks in the dorsal stream (and their associated cortical and subcortical pathways) are responsible for the programming and on-line control of the particular movements that action entails. Of course, the dorsal and ventral streams have other roles to play as well. For example, the dorsal stream, together with areas in the ventral stream, plays a role in spatial navigation – and areas in the dorsal stream appear to be involved in some aspects of working memory (Kravitz, Saleem, Baker, and Mishkin, 2011). This review, however, will focus on the respective roles of the two streams in perception and action – and will concentrate largely on the implications of the theory for the principles governing perceptual organization and visuomotor control.

2. Different neural computations for perception and action

Evidence from a broad range of empirical studies, from human neuropsychology to single-unit recording in non-human primates (for reviews, see Culham and Valyear, 2006; Goodale, 2011; Kravitz et al., 2011), supports the idea of two cortical visual systems. Yet the question remains as to why two separate systems evolved in the first place. Why couldn’t one “general purpose” visual system handle both vision-for-perception and vision-for-action? The answer to this question lies in the differences in the computational requirements of vision-for-perception on the one hand and vision-for-action on the other. To be able to grasp an object successfully, for example, the visuomotor system has to deal with the actual size of the object and its orientation and position with respect to the hand you intend to use to pick it up.
These computations need to reflect the real metrics of the world, or at the very least, make use of learned “look-up tables” that link neurons coding a particular set of sensory inputs with neurons that code the desired state of the limb (Thaler and Goodale, 2010). The time at which these computations are performed is equally critical. Observers and goal objects rarely stay in a static relationship with one another and, as a consequence, the egocentric location of a target object can often change radically from moment-to-moment. In other words, the required coordinates for action need to be computed at the very moment the movements are performed.

In contrast to vision-for-action, vision-for-perception does not need to deal with the absolute size of objects or their egocentric locations. In fact, very often such computations would be counter-productive because our viewpoint with respect to objects does not remain constant – even though our perceptual representations of those objects do show constancy. Indeed, one can argue that it would be better to encode the size, orientation, and location of objects relative to each other. Such a scene-based frame of reference permits a perceptual representation of objects that transcends particular viewpoints, while preserving information about spatial relationships (as well as relative size and orientation) as the observer moves around. The products of perception also need to be available over a much longer time scale than the visual information used in the control of action. By working with perceptual representations that are object- or scene-based, we are able to maintain the constancies of size, shape, color, lightness, and relative location, over time and across different viewing conditions.
The differences between the relative frames of reference required for vision-for-perception and the absolute frames of reference required for vision-for-action lead in turn to clear differences in the way in which visual information about objects and their spatial relations is organized and represented. These differences can be most readily seen in the way in which the two visual systems deal with visual illusions.

2.1. Studies of visual illusions

The most intriguing – yet also the most controversial – evidence for dissociations between action and perception in healthy subjects has come from studies of visual illusions of size (for a review, see Goodale, 2011). In such illusions, an object is typically embedded within the context of other objects or of other pictorial cues that distort its perceived size. Visual illusions, by definition, have robust effects on perceptual judgments. Surprisingly, the same illusions can have little or no effect on visuomotor tasks such as grasping. Thus, even though people might perceive an object embedded within an illusion to be larger or smaller than it really is, when they reach out to pick up the object, the opening of their grasping hand is often unaffected by the illusion. In other words, the grip aperture is scaled to the real, not the apparent, size of the goal object. This result has been interpreted as evidence for the idea that vision-for-action makes use of real-world metrics while vision-for-perception uses relative or scene-based metrics (Goodale and Milner, 2005). This interpretation, however, has been vigorously challenged over the past decade by studies claiming that when attention and other factors are taken into account, there is no difference between the effects of size-contrast illusions on grip scaling and perceptual reports of size (for a review, see Franz and Gegenfurtner, 2008).

Figure 1. The effect of a size-contrast illusion on perception and action. A.
The traditional Ebbinghaus illusion, in which the central circle in the annulus of larger circles is typically seen as smaller than the central circle in the annulus of smaller circles, even though both central circles are actually the same size. B. The same display, except that the central circle in the annulus of larger circles has been made slightly larger. As a consequence, the two central circles now appear to be the same size. C. A 3D version of the Ebbinghaus illusion. Participants are instructed to pick up one of the two 3D disks placed either on the display shown in panel A or the display shown in panel B. D. Two trials with the display shown in panel B, in which the participant picked up the small disk on one trial and the large disk on another. Even though the two central disks were perceived as being the same size, the grip aperture in flight reflected the real, not the apparent, size of the disks. Adapted with permission from Aglioti et al. (1995).

A representative example of such conflicting results comes from studies that have compared the effects of the Ebbinghaus illusion on action and perception. In this illusion, a circle surrounded by an annulus of smaller circles appears to be larger than the same circle surrounded by an annulus of larger circles (see Figure 1A). It is thought that the illusion arises because of an obligatory comparison between the size of the central circle and the size of the surrounding circles, with one circle looking relatively smaller than the other (Coren and Girgus, 1978). It is also possible that the central circle within the annulus of smaller circles will be perceived as more distant (and therefore larger) than the circle of equivalent retinal-image size within the array of larger circles. In other words, the illusion may be simply a consequence of the perceptual system's attempt to make size-constancy judgments on the basis of an analysis of the entire visual array (Gregory, 1963).
In addition, the distance between the surrounding circles and the central circle may also play a role; if the surrounding circles are close to the central circle, then the central circle appears larger, but if they are further away, the central circle appears smaller (Roberts, Harris, and Yates, 2005). In many experiments, the size of the surrounding circles and the distance between them and the central circle are confounded. But whatever the critical factors might be in any particular Ebbinghaus display, it is clear that the apparent size of the central circle is influenced by the context in which it is embedded. These contextual effects are remarkably resistant to cognitive information about the real size of the circles. Thus, even when people are told that the two circles are identical in size (and this fact is demonstrated to them), they continue to experience a robust illusion of size.

The first demonstration that grasping might be refractory to the Ebbinghaus illusion was carried out by Aglioti et al. (1995). These investigators constructed a 3-D version of the Ebbinghaus illusion, in which a poker-chip-type disk was placed in the centre of a 2-D annulus made up of either smaller or larger circles (Figure 1C). Two versions of the Ebbinghaus display were used. In one case, the two central disks were physically identical in size but one appeared to be larger than the other (Figure 1A). In the second case, the size of one of the disks was adjusted so that the two disks were now perceptually identical but had different physical sizes (Figure 1B). Despite the fact that the participants in this experiment experienced a powerful illusion of size, their anticipatory grip aperture was unaffected by the illusion when they reached out to pick up each of the central disks.
In other words, even though their perceptual estimates of the size of the target disk were affected by the presence of the surrounding annulus, maximum grip aperture between the index finger and thumb of the grasping hand, which was reached about 70% of the way through the movement, was scaled to the real, not the apparent, size of the central disk (Figure 1D).

The findings of Aglioti et al. (1995) have been replicated in a number of other studies (for reviews, see Carey, 2001; Goodale, 2011). Nevertheless, other studies using the Ebbinghaus illusion have failed to replicate these findings. Franz et al. (2000, 2001), for example, used a modified version of the illusion and found similar (and significant) illusory effects on both vision-for-action and vision-for-perception, arguing that the two systems are not dissociable from one another, at least in healthy participants. These authors argued that the difference between their findings and those of Aglioti et al. resulted from different task demands. In particular, in the Aglioti study (as well as in a number of other studies showing that visuomotor control is resistant to visual illusions), subjects were asked to attend to both central disks in the illusory display in the perceptual task, but to grasp only one object at a time in the action task. Franz and colleagues argued that this difference in attention in the perceptual and action tasks could have accounted for the pattern of results in the Aglioti et al. study. In the experiments by Franz and colleagues, participants were presented with only a single disk surrounded by an annulus of either smaller or larger circles. Under these conditions, Franz and colleagues found that both grip aperture and perceptual reports were affected by the presence of the surrounding annulus.
The force of this demonstration, however, was undercut in later experiments by Haffenden and Goodale (1998), who asked participants either to estimate the size of one of the central disks manually by opening their finger and thumb a matching amount or to pick it up. Even though in both cases participants were arguably directing their attention to only one of the disks, there was a clear difference in the effect of the illusion: the manual estimates, but not the grasping movements, were affected by the size of the circles in the surrounding annulus.

Franz (2003) later argued that the slope of the function describing the relationship between manual estimates and the real size of the target object was far steeper than that of more ‘conventional’ psychophysical measures – and that, when one adjusted for the difference in slope, both action and perception were affected to the same degree by the Ebbinghaus and by other illusions. Although this explanation, at least on the face of it, is a compelling one, it cannot explain why Aglioti et al. (1995) and Haffenden and Goodale (1998) found that when the relative sizes of the two target objects in the Ebbinghaus display were adjusted so that they appeared to be perceptually identical, the grip aperture that participants used to pick up the two targets continued to reflect the physical difference in their size. Nor can it explain the findings of a recent study by Stöttinger and colleagues (2012), who showed that even when slopes were adjusted, manual estimates of object size were much more affected by the illusion (in this case, the Diagonal illusion) than were grasping movements.

Recently, several studies have suggested that online visual feedback during grasping could be a relevant factor accounting for some of the conflicting results in the domain of visual illusions and grasping.
For example, Bruno and Franz (2009) performed a meta-analysis of studies that looked at the effects of the Müller-Lyer illusion on perception and action and concluded that the dissociation between the effects of this illusion on grasping and perception is most pronounced when online visual feedback is available. According to this account, feedback from the fingers and the target object during the grasp can be effectively used by the visuomotor system to counteract the effect of visual illusions on grip aperture. Further support for this proposal comes from studies showing that visual illusions, such as the Ebbinghaus illusion, affect grasping trajectories only during the initial stages of the movement, but not in later stages, in which visual feedback can be used to allow the visuomotor system to compensate for the effects of the illusory context (Glover and Dixon, 2002). However, other studies that manipulated the availability of visual feedback during grasp failed to find an effect of visual feedback on grasping performance in the context of visual illusions (Ganel, Tanzer, and Goodale, 2008a; Westwood and Goodale, 2003).

The majority of studies that have claimed that action escapes the effects of pictorial illusions have demonstrated this by finding a null effect of the illusory context on grasping movements. In other words, they have found that perception (by definition) was affected by the illusion but peak grip aperture of the grasping movement was not. But null effects like this are never as compelling as double dissociations between action and perception.

Figure 2. The effect of the Ponzo illusion on grasping and manual estimates. A. Two objects embedded in the Ponzo illusion used in Ganel et al.'s (2008a) study. Although the right object is perceived as larger, it is actually smaller in size. B.
Maximum grip apertures and perceptual estimation data show that the fingers' aperture was not affected by the perceived sizes of the objects but rather was tuned to their actual sizes. Perceptual estimations, on the other hand, were affected by the Ponzo illusory context. Adapted with permission from Ganel et al. (2008a).

As it turns out, a more recent study has in fact demonstrated a double dissociation between perception and action. Ganel and colleagues (2008a) used the well-known Ponzo illusion, in which the perceived size of an object is affected by its location within pictorial depth cues. Objects located at the diverging end of the display appear to be larger than those located at the converging end. To dissociate the effects of real size from those of illusory size, Ganel and colleagues manipulated the real sizes of two objects that were embedded in a Ponzo display so that the object that was perceived as larger was actually the smaller one of the pair (see Figure 2A). When participants were asked to make a perceptual judgment of the size of the objects, their perceptual estimates reflected the illusory Ponzo effect. In contrast, when they picked up the objects, the aperture between the finger and thumb of their grasping hand was tuned to their actual size. In short, the difference in their perceptual estimates of size for the two objects, which reflected the apparent difference in size, went in the opposite direction from the difference in their peak grip aperture, which reflected the real difference in size (Figure 2B). This double dissociation between the effects of apparent and real size differences on perception and action respectively cannot be explained away by appealing to differences in attention or differences in slope (Franz et al., 2001; Franz, Gegenfurtner, Bülthoff, and Fahle, 2000; Franz, 2003).
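The logic of Franz's slope argument, and of the double-dissociation counter-argument, can be made concrete with a small numeric sketch. Slope correction amounts to dividing each raw illusion effect by how strongly that measure tracks physical size; the numbers below are invented for illustration and are not data from any of the studies discussed here.

```python
# A toy illustration of the slope-correction argument (Franz, 2003).
# All numbers are invented for illustration only.

def corrected_effect(raw_effect_mm, slope):
    """Normalize a raw illusion effect by the responsiveness of the
    measure, i.e. the slope of response size regressed on physical size."""
    return raw_effect_mm / slope

# Suppose manual estimation exaggerates size differences (slope 1.6)
# while grip aperture tracks them less steeply (slope 0.8).  A 1.6 mm
# illusion effect on estimation and a 0.8 mm effect on grasping then
# look identical once each is divided by its slope:
print(corrected_effect(1.6, 1.6))  # 1.0
print(corrected_effect(0.8, 0.8))  # 1.0
```

Note that this correction is a division by a positive scalar, so it can shrink or inflate an effect but can never reverse its sign. This is why a double dissociation in which perceptual and grip effects go in opposite directions, as in the Ponzo study described above, cannot be explained away by any choice of slopes.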
In a series of experiments that used both the Ebbinghaus and the Ponzo illusions, Gonzalez and her colleagues provided a deeper understanding of the conditions under which grasping can escape the effects of visual illusions (Gonzalez, Ganel, and Goodale, 2006). They argued that many of the earlier studies showing that actions are sensitive to the effects of pictorial illusions required participants to perform movements requiring different degrees of skill under different degrees of deliberate control and with different degrees of practice. If one accepts the idea that high-level conscious processing of visual information is mediated by the ventral stream (Milner and Goodale, 2006), then it is perhaps not surprising that the less skilled, the less practiced, and thus the more deliberate an action, the greater the chances that the control of this action would be affected by ventral-stream perceptual mechanisms.

Gonzalez et al. (2006) provided support for this conjecture by demonstrating that awkward, unpractised grasping movements, in contrast to familiar precision grips, were sensitive to the Ponzo and Ebbinghaus illusions. In a follow-up experiment, they showed that the effects of these illusions on initially awkward grasps diminished with practice (Gonzalez et al., 2008). Interestingly, similar effects of practice were not obtained for right-handed subjects grasping with their left hand. Even more intriguing is the finding that grasping with the left hand, even for many left-handed participants, was affected to a larger degree by pictorial illusions than grasping with the right hand (Gonzalez et al., 2006). Gonzalez and colleagues have interpreted these results as suggesting that the dorsal-stream mechanisms that mediate visuomotor control may have evolved preferentially in the left hemisphere, which primarily controls right-handed grasping.
Additional support for this latter idea comes from work with patients with optic ataxia from unilateral lesions of the dorsal stream (Perenin and Vighetto, 1988). Patients with left-hemisphere lesions typically show what is often called a ‘hand effect’: they exhibit a deficit in their ability to visually direct reaching and grasping movements to targets situated in both the contralesional and the ipsilesional visual field. In contrast, patients with right-hemisphere lesions are impaired only when they reach out to grasp objects in the contralesional field.

Although the debate about whether or not action escapes the effects of perceptual illusions is far from resolved (for recent findings, see Foster, Kleinholdermann, Leifheit, and Franz, 2012; Heed et al., 2011; van der Kamp, de Wit, and Masters, 2012), the focus on this issue has directed attention away from the more general question of the nature of the computations underlying visuomotor control in more natural situations. One example of an issue that has received only minimal attention from researchers is the role of information about object shape in visuomotor control (but see Cuijpers, Brenner, and Smeets, 2006; Cuijpers, Smeets, and Brenner, 2004; Goodale et al., 1994; Lee, Crabtree, Norman, and Bingham, 2008) – and how that information might differ in its organization from conventional perceptual accounts of shape processing.

2.2. Studies of configural processing of shape

The idea that vision treats the shape of an object in a holistic manner has been a basic theme running through theoretical accounts of perception from early Gestalt psychology (Koffka, 1935) to more contemporary cognitive neuroscience (e.g. Duncan, 1984; O’Craven, Downing and Kanwisher, 1999).
Encoding an object holistically permits a representation of the object that preserves the relations between object parts and other objects in the visual array without requiring precise information about the absolute size of each of the object’s dimensions (see Behrmann, Richler, and Avidan, 2013; Pomerantz and Cragin, 2013). In fact, as discussed earlier, calculating the exact size, distance, and orientation of every aspect of every object in a visual scene carries a huge computational load. Holistic (or configural) processing is much more efficient for constructing perceptual representations of objects. When we interact with an object, however, it is imperative that the visual processes controlling the action take into account the absolute metrics of the most relevant dimension of the object without being influenced by other dimensions or features. In other words, rather than being holistic, the visual processing mediating action should be analytical.

Empirical support for the idea that the visual control of action is analytical rather than configural comes from experiments using a variant of the Garner speeded classification task (Ganel and Goodale, 2003). In these experiments, participants were required either to make perceptual judgments of the width of rectangles or to grasp them across their width – while in both cases trying to ignore the length. As expected, participants could not ignore the length of a rectangle when making judgements of its width. Thus, when the length of a rectangle was varied randomly from trial to trial, participants took longer to discriminate a wide rectangle from a narrow one than when the length did not change. In sharp contrast, however, participants appeared to completely ignore the length of an object when grasping it across its width. Thus, participants took no longer to initiate (or to complete) their grasping movement when the length of the object varied than when its length did not change.
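The dependent measure in this paradigm, often called Garner interference, reduces to a simple comparison: the slowing of responses when the irrelevant dimension varies ('filtering' block) relative to when it is held constant ('baseline' block). A minimal sketch, with invented reaction times rather than data from the studies discussed here:

```python
from statistics import mean

def garner_interference(baseline_rts, filtering_rts):
    """Garner interference: the slowing of responses when an irrelevant
    dimension varies from trial to trial ('filtering' block) relative to
    when it is held constant ('baseline' block).  A clearly positive
    value indicates that the two dimensions are processed holistically."""
    return mean(filtering_rts) - mean(baseline_rts)

# Invented reaction times (ms): judging width perceptually slows down
# when length varies, whereas grasping across the width does not.
perceptual_cost = garner_interference([420, 435, 410], [470, 490, 465])
grasping_cost = garner_interference([310, 325, 300], [312, 322, 303])
```

On this logic, a large `perceptual_cost` alongside a near-zero `grasping_cost` is the signature of holistic perception coexisting with analytical visuomotor control.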
These findings show that the holistic processing that characterizes perception does not apply to the visual control of skilled actions such as grasping. Instead, the visuomotor mechanisms underlying this behaviour deal with the basic dimensions of objects as independent features. This dissociation between holistic and analytical processing for perception and action respectively using Garner's paradigm has been replicated by several other studies (Janczyk and Kunde, 2012; Kunde, Landgraf, Paelecke, and Kiesel, 2007) and more recently has been reported in young children (Schum, Franz, Jovanovic, and Schwarzer, 2012).

Figure 3. An example of a within-object illusion of shape. Although the two rectangles have an equal width, the shorter rectangle is perceived as wider than the taller rectangle (see Ganel and Goodale, 2003; Ben-Shalom and Ganel, 2012).

Beyond reflecting configural processing, subjects' inability to ignore information about an irrelevant dimension when estimating the size of a relevant dimension often leads to a directional distortion in their size perception. In particular, because a rectangle's width is always perceived relative to its length, longer rectangles will always be perceived as narrower, even in cases in which their actual width is kept constant (see Figure 3). This type of illusion – in which the perceived size of one dimension is affected by irrelevant dimensions belonging to the same object – has been termed a within-object illusion (see Ben-Shalom and Ganel, 2012). Interestingly, it has recently been argued that within-object illusions and between-objects illusions (discussed in the previous section) rely on different cognitive mechanisms; for example, it has been shown that representations in iconic memory are affected by the latter type of illusion, but not by within-object illusions.
More relevant to the present discussion, it has been shown that within-object illusions, like between-objects illusions, do not affect visuomotor control. That is, unlike perceptual estimates of a rectangle's width, which are affected by its length, the aperture of the fingers when grasping the rectangle across its width was shown to be unaffected by length (Ganel and Goodale, 2003). Taken together, all these findings point to the same conclusion: unlike visual perception, which is always affected by relative frames of reference, the visual control of action is more analytical and is therefore immune to the effects of both within-object and between-objects pictorial illusions.

Figure 4. There appear to be fewer circles on the right than on the left – even though in both cases there are 22 individual circles. Connecting the circles with short lines creates the illusion of fewer circles. Even so, when our brain plans actions to these targets it computes the actual number of targets. In the task used by Milne et al. (2013) far fewer circles were used, but the effect was still present in perceptual judgments but not in the biasing of rapid reaching movements. In the action task, it was the actual, not the apparent, number of circles that affected performance. Adapted with permission from Milne et al. (2013).

Recent work also suggests that there are fundamental differences in scene segmentation for perception and action planning. It is well established that our perceptual system parses complex scenes into discrete objects, but what is less well known is that parsing is also required for planning visually guided movements, particularly when more than one potential target is present. In a recent study, Milne et al. (2013) explored whether perception and motor planning use the same or different parsing strategies – and whether perception is more sensitive to contextual effects than is motor planning.
To do this, they used the ‘connectedness illusion’, in which observers typically report seeing fewer targets if pairs of targets are connected by short lines (Franconeri, Bemis, and Alvarez, 2009; He, Zhang, Zhou, and Chen, 2009; see Figure 4). Milne et al. (2013) tested participants in a rapid reaching paradigm they had developed that requires subjects to initiate speeded arm movements toward multiple potential targets before one of the targets is cued for action (Chapman et al., 2010). In their earlier work, they had shown that when there were an equal number of targets on each side of a display, participants aimed their initial trajectories toward a midpoint between the two target locations. Furthermore, when the distribution of targets on each side of a display was not equal (but each potential target had an equal probability of becoming the goal target), initial trajectories were biased toward the side of the display that contained a greater number of targets. They argued that this behavior maximizes the chances of success on the task because movements are directed toward the most probable location of the eventual goal, thereby minimizing the ‘cost’ of correcting the movement in flight. Because it provides a behavioral ‘read-out’ of rapid comparisons of target numerosity for motor planning, the paradigm is an ideal way to measure object segmentation in action in the context of the connectedness illusion.

When participants made speeded reaches towards targets that were sometimes connected by lines, their reaches were completely unaffected by the presence of the connecting lines. Instead, their movement plans, as revealed by their movement trajectories, were influenced only by the difference in the number of targets present on each side of the display, irrespective of whether connecting lines were there or not.
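The planning strategy described above has a simple formal core: if every potential target is equally likely to be cued and the cost of an in-flight correction grows with its squared size, the expected-cost-minimizing initial aim point is just the mean of the target positions. The sketch below makes this explicit; the one-dimensional positions and squared-error cost are simplifying assumptions for illustration, not details of Chapman et al.'s (2010) paradigm.

```python
def initial_aim(target_positions):
    """Optimal initial aim under equal target probabilities: the mean of
    the potential target positions, which minimizes the expected squared
    correction.  (1-D positions and squared-error cost are simplifying
    assumptions made here for illustration.)"""
    return sum(target_positions) / len(target_positions)

# Equal numbers of targets left and right: aim at the midpoint.
print(initial_aim([-10, 10]))          # 0.0
# Three targets on the right, one on the left: aim is biased rightward.
# Note that lines drawn between targets would not change this sum at
# all -- the computation counts actual, not apparent, targets.
print(initial_aim([-10, 10, 10, 10]))  # 5.0
```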
Not unexpectedly, however, when they were asked to report whether there were fewer targets present on one side as compared to the other, their reports were biased by the connecting lines between the targets.

The work by Milne et al. (2013) suggests that scene segmentation for perception depends on mechanisms that are distinct from those that allow humans to plan rapid and efficient target-directed movements in situations where there are multiple potential targets. While the perception of object numerosity can be dramatically influenced by manipulations of object grouping, such as the connectedness illusion, the visuomotor system is able to ignore such manipulations and to parse individual objects and accurately plan, execute, and control rapid reaching movements to multiple goals. These results are especially compelling considering that initial goal selection is undoubtedly based on a perceptual representation of the goal (for a discussion of this issue, see Milner and Goodale, 2006). The planning of the final movement, however, is able to effectively by-pass the contextual biases of perception, particularly in situations where rapid planning and execution of the movement is paramount.

2.3. Studies of object size resolution

The 19th-century German physician and scientist Ernst Heinrich Weber is usually credited with the observation that our sensitivity to changes in any physical property or dimension of an object or sensory stimulus decreases as the magnitude of that property or dimension increases. For example, if a bag of sugar weighs only 50 grams, then we will notice a change in weight if only a few grams of sugar are added or taken away. But if the bag weighs 500 grams, much more sugar must be added or taken away before we notice the difference. Typically, if the weight of something is doubled, then the smallest difference in weight that can be perceived is also doubled.
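This proportionality is usually written ΔI = k·I, where I is the stimulus magnitude, ΔI the just-noticeable difference, and k the 'Weber fraction' for the dimension in question. A minimal sketch of the sugar-bag example (the 2% Weber fraction is an illustrative value, not one measured for weight):

```python
def weber_jnd(intensity, weber_fraction=0.02):
    """Weber's law: the just-noticeable difference (JND) is a constant
    fraction of stimulus magnitude.  The default 2% Weber fraction is
    an illustrative assumption."""
    return weber_fraction * intensity

print(weber_jnd(50))   # roughly 1 gram must change in a 50 g bag
print(weber_jnd(500))  # roughly 10 grams must change in a 500 g bag
```

Doubling the magnitude doubles the JND, which is exactly the doubling described in the text.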
Similar, but not identical, functions have been demonstrated for the loudness of sounds, the brightness of visual stimuli, and a broad range of other sensory experiences. Imagine, for example, that you are riding on an express train on your way to an important meeting. As the train accelerates from 220 to 250 km an hour, you might scarcely notice the change in velocity – even though the same change in velocity was easily noticed as the train left the station earlier and began to accelerate. In short, the magnitude of the ‘just-noticeable difference’ (JND) increases with the magnitude or intensity of the stimulus. The German physicist-turned-philosopher Gustav Fechner later formalized this basic psychophysical principle mathematically and called it Weber’s Law.

Figure 5. Effects of object size on visual resolution (Just Noticeable Difference: JND). Left panel: The effect of object size on JNDs for Maximum Grip Apertures (MGAs) during grasping. Right panel: The effect of object size on JNDs during perceptual estimations. Note that JNDs for the perceptual condition increased linearly with length, following Weber's law, whereas the JNDs for grasping were unaffected by size. Adapted with permission from Ganel et al. (2008b).

Weber's law is one of the most fundamental features of human perception. It is not clear, however, whether the visual control of action is subject to the same universal psychophysical function. To investigate this possibility, Ganel and colleagues (Ganel, Chajut, and Algom, 2008b) carried out a series of psychophysical and visuomotor experiments in which participants were asked either to grasp or to make perceptual estimations of the length of rectangular objects. The JNDs were defined in this study by using the standard deviation of the mean grip aperture and the standard deviation of the mean perceptual judgement for a given stimulus.
This is akin to the classical Method of Adjustment, in which the amount of variation in the responses for a given stimulus size reflects an "area of uncertainty" within which participants are not sensitive to fluctuations in size. Not surprisingly, Ganel and colleagues found that the JNDs for the perceptual estimations of the object’s length showed a linear increase with length, as Weber’s law would predict. The JNDs for grip aperture, however, showed no such increase and remained constant as the length of the object increased (see Figure 5). In other words, the standard deviation for grip aperture remained the same despite increases in the length of the object. Simply put, visually guided actions appear to violate Weber’s law, reflecting a fundamental difference in the way that object size is computed for action and for perception (Ganel et al., 2008a; 2008b). This fundamental difference in the psychophysics of perception and action has been found to emerge in children as young as 5 years of age (Hadad, Avidan, and Ganel, 2012; see Figure 6). Figure 6. JNDs for perceptual estimations (Panel A) and for grasping (Panel B) in different age groups. In all age groups, JNDs for the perceptual condition increased with object size, following Weber's law. Importantly, however, the JNDs for grasping in all groups were unaffected by changes in the size of the target. Adapted with permission from Hadad et al. (2012). This difference in the psychophysics of perception and action can be observed in other contexts as well. In a recent study (Ganel, Freud, Chajut, and Algom, 2012), for example, participants were asked to grasp or to make perceptual comparisons between pairs of circular disks. Importantly, the actual difference in size between the members of each pair was set below the perceptual JND. Again, a dissociation was observed between perceptual judgements of size and the kinematic measures of the aperture of the grasping hand.
Regardless of whether or not participants were accurate in their judgments of the difference in size between the two disks, the maximum opening between the thumb and forefinger of their grasping hand in flight reflected the actual difference in size between the two disks (see Figure 7). These findings provide additional evidence for the idea that the computations underlying the perception of objects are different from those underlying the visual control of action. They also suggest that people can show differences in the tuning of grasping movements directed to objects of different sizes even when they are not conscious of those differences in size. Figure 7. Grasping objects that are perceptually indistinguishable. A) The set-up with examples of the stimuli that were used. Participants were asked on each trial to report which object of the two was the larger and then to grasp the object in each pair that was in the centre of the table (task order was counterbalanced between subjects). B) MGAs for correct and for incorrect perceptual size classifications. MGAs reflected the real size differences between the two objects even in trials in which subjects erroneously judged the larger object in the pair as the smaller one. Adapted with permission from Ganel et al. (2012). The demonstrations showing that the visual control of grasping does not obey Weber’s law resonate with Milner and Goodale’s (2006) proposal that there is a fundamental difference in the frames of reference and metrics used by vision-for-perception and vision-for-action (Ganel et al., 2008b). These findings also converge with the results of imaging studies suggesting that the ventral and the dorsal streams represent objects in different ways (James, Humphrey, Gati, Menon, and Goodale, 2002; Konen and Kastner, 2008; Lehky and Sereno, 2007).
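The JND-from-variability measure used in these studies can be sketched with invented response data. The numbers below are made up for illustration; only the pattern they produce – variability that grows with size for perceptual estimation but stays flat for grasping – mirrors the reported findings:

```python
# Sketch of the JND measure used by Ganel and colleagues: the JND for a
# given object size is taken as the standard deviation of the repeated
# responses (perceptual estimates or grip apertures) to that size.
# All response values below are invented for illustration.

import statistics

# Hypothetical repeated responses (mm) for three object lengths (mm).
perceptual = {
    20: [19, 21, 20, 22, 18],       # variability grows with size...
    40: [37, 43, 40, 44, 36],
    60: [54, 66, 60, 66, 54],
}
grasping = {
    20: [68, 70, 69, 71, 67],       # ...but grip-aperture variability stays flat
    40: [88, 90, 89, 91, 87],
    60: [108, 110, 109, 111, 107],
}

def jnds(responses):
    """JND per object size, estimated as the SD of the responses."""
    return {size: statistics.stdev(vals) for size, vals in responses.items()}

print("perceptual JNDs:", jnds(perceptual))  # increase with length (Weber's law)
print("grasping JNDs:  ", jnds(grasping))    # roughly constant across lengths
```

In real data the test is whether the JNDs increase linearly with object size (adherence to Weber's law) or remain constant across sizes, as was found for grip aperture.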
Yet the interpretation of these results has not gone unchallenged (Heath, Holmes, Mulla, and Binsted, 2012; Heath, Mulla, Holmes, and Smuskowitz, 2011; Holmes, Mulla, Binsted, and Heath, 2011; Smeets and Brenner, 2008). For example, in a series of papers, Heath and his colleagues (Heath et al., 2011, 2012; Holmes et al., 2011) have examined the effects of Weber's law on grip aperture throughout the entire movement trajectory and found an apparent adherence to Weber's law early but not later in the trajectory of the movement. A recent paper by Foster and Franz (2013), however, has suggested that these effects are confounded by movement velocity. In particular, because the task requires subjects to hold their finger and thumb together prior to each grasp, subjects tend to open their fingers faster for larger than for smaller objects, a feature that characterizes only the early stages of the grasping trajectory. Therefore, the increased grip variability for larger compared to smaller objects during the early portion of the trajectories could be attributed to velocity differences in the opening of the fingers rather than to the effects of Weber's law. In their commentary on Ganel et al.'s (2008b) paper, Smeets and Brenner (2008) argue that the results can be more efficiently accommodated by a ‘double-pointing’ account of grasping. According to this model, the movements of each digit of a grasping hand are controlled independently – each digit being simultaneously directed to a different location on the goal object (Smeets and Brenner, 1999, 2001). Thus, when people reach out to pick up an object with a precision grip, for example, the index finger is directed to one side of the object and the thumb to the other. No computation of object size is required, only the computation of two separate locations on the object, one for the finger and the other for the thumb.
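For concreteness, the by-product nature of grip scaling on this view can be caricatured in a few lines of code. This is a deliberately simplified toy with straight-line digit paths, not Smeets and Brenner's actual model:

```python
# Toy caricature of the double-pointing idea: thumb and index finger are
# each steered to a point on opposite sides of the object, and "grip
# aperture" is simply the distance between them at each moment.
# No size variable is ever computed.

def digit_position(start, target, t):
    """Linear interpolation from start to target at normalized time t in [0, 1]."""
    return start + (target - start) * t

def aperture_over_time(object_width, steps=5):
    thumb_start, finger_start = 0.0, 0.0     # digits start pinched together
    thumb_target = -object_width / 2         # thumb heads to the left edge
    finger_target = object_width / 2         # finger heads to the right edge
    return [
        digit_position(finger_start, finger_target, t / steps)
        - digit_position(thumb_start, thumb_target, t / steps)
        for t in range(steps + 1)
    ]

# Final aperture matches object width purely as a by-product of two
# independent pointing movements.
print(aperture_over_time(40.0))  # ends at 40.0
print(aperture_over_time(60.0))  # ends at 60.0
```

In this caricature the aperture ends up scaled to object width even though each digit only ever computed its own target location, which is exactly the sense in which grip scaling is claimed to be a by-product.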
The apparent scaling of the grip to object size is nothing more than a by-product of the fact that the index finger and thumb are moving towards their respective end points. Smeets and Brenner go on to argue that because size is not computed for grasping, and only location matters, Weber's law would not apply. In other words, because location, unlike size, is a discrete rather than a continuous dimension, Weber’s law is irrelevant for grasping. Smeets and Brenner’s account also comfortably explains why grasping escapes the effects of pictorial illusions such as the Ebbinghaus and Ponzo illusions. In fact, more generally, their double-pointing or position-based account of grasping would appear to offer a more parsimonious account of a broad range of apparent dissociations between vision-for-perception and vision-for-action than appealing to a two-visual-systems model. Although Smeets and Brenner's (1999, 2001) interpretation is appealing, several lines of evidence show that finger trajectories during grasping are tuned to object size rather than location. For example, van de Kamp and Zaal (2007) have shown that when one side of a target object but not the other is suddenly pushed in or out (with a hidden compressed-air device) as people are reaching out to grasp it, the trajectories of both digits are adjusted in flight. In other words, the trajectories of both the finger and the thumb change to reflect the change in size of the target object. Smeets and Brenner’s model would not predict this. According to their double-pointing hypothesis, only the digit going to the perturbed side of the goal object should change course. But the fact that the trajectories of both digits show an adjustment is entirely consistent with the idea that the visuomotor system is computing the size of the target object. In other words, as the object changes size, so does the grip.
Another line of evidence that goes against Smeets and Brenner’s double-pointing hypothesis comes from the neuropsychological literature. Damage to the ventral stream in the human occipitotemporal cortex can result in visual form agnosia, a deficit in visual object recognition. The best-documented example of such a case is patient DF, who has bilateral lesions to the lateral occipital area rendering her unable to recognize or discriminate between even simple geometric shapes such as a rectangle and a square. Despite her profound deficit in form perception, she is able to scale her grasp to the dimensions of the very objects she cannot describe or recognize, presumably using visuomotor mechanisms in her dorsal stream. As is often the case for neurological patients, DF is able to (partially) compensate for her deficits by relying on non-natural strategies based on her residual intact abilities. Schenk and Milner (2006), for example, found that under certain circumstances DF could use her intact visuomotor skills to compensate for her marked impairment in shape recognition. When DF was asked to make simple shape classifications (rectangle/square classifications), her performance was at chance. Yet her shape classifications markedly improved when performed concurrently with grasping movements toward the target objects she was being asked to discriminate. Interestingly, this improvement appeared not to depend on afferent feedback from the grasping fingers, because it was observed even while DF was merely planning her actions, just before the fingers actually started to move. Schenk and Milner therefore concluded that information about an object’s dimensions is available at some level via visuomotor activity in DF's intact dorsal stream, and this in turn improves her shape-discrimination performance.
For this to happen, the dorsal-stream mechanisms would have to be computing the relevant dimension of the object to be grasped and not simply the locations on that object to which the finger and thumb are being directed (for similar evidence in healthy individuals, see Linnell, Humphreys, McIntyre, Laitinen, and Wing, 2005). Again, these findings are clearly not in line with Smeets and Brenner's double-pointing hypothesis and suggest that the dorsal stream uses information about object size (more particularly, the relevant dimension of the target object) when engaged in visuomotor control. Parenthetically, it is interesting to note that the results of one of the experiments in the Schenk and Milner study also provide indirect evidence that grip aperture is not affected by the irrelevant dimension of the object to be grasped (Ganel and Goodale, 2003). When DF was asked to grasp objects across a dimension that was not informative of shape (i.e., grasp across rectangles of constant width that varied in length), no grasping-induced perceptual improvements in distinguishing between the different rectangles were found. This finding not only shows that shape per se was not being used in the earlier tasks where she did show some enhancement in her ability to discriminate between objects of different widths, but it also provides additional evidence for the idea that visuomotor control is carried out in an analytical manner (e.g. concentrating entirely on object width) without being influenced by differences in the configural aspects of the objects. As mentioned at the beginning of the chapter, Milner and Goodale (2006) have argued that visuomotor mechanisms in the dorsal stream tend to operate in real time. 
If the target object is no longer visible when the imperative to begin the movement is given, then any object-directed action would have to be based on a memory of the target object, a memory that is necessarily dependent on earlier processing by perceptual mechanisms in the ventral stream. Thus, DF is unable to scale her grasp for objects that she saw only seconds earlier, presumably because of the damage to her ventral stream (Goodale, Jakobson, and Keillor, 1994). Similarly, when neurologically intact participants are asked to base their grasping on memory representations of the target object, rather than on direct vision, the kinematics of their grasping movements are affected by Weber's law and by pictorial illusions (Ganel et al., 2008b; for review, see Goodale, 2011). Again, without significant modification, Smeets and Brenner's double-pointing model does not provide a parsimonious account of why memory-based action control should be affected by size whereas real-time actions should not. But as we have already seen, according to the two-visual-systems account, when vision is not available and memory-based actions are performed, such actions have to rely on earlier perceptual processing of the visual scene, processing that in principle is subject to Weber’s law and pictorial illusions of size. 3. Conclusions The visual control of skilled actions, unlike visual perception, operates in real time and reflects the metrics of the real world. This means that many actions, such as reaching and grasping, are immune to the effects of a range of pictorial illusions, which by definition affect perceptual judgments. Only when the actions are deliberate and cognitively ‘supervised’, or are initiated after the target is no longer in view, do the effects of illusions emerge.
All of this suggests that our perceptual representations of objects are organized in a fundamentally different way from the visual information underlying the control of skilled actions directed at those objects. As we have seen, the visual perception of objects and their relations tends to be holistic and contextual, with relatively poor real-world metrics, whereas the visual control of skilled actions is more analytical, circumscribed, and metrically accurate. Of course, in everyday life, vision-for-perception and vision-for-action work together in the production of purposive behaviour – vision-for-perception, together with other cognitive systems, selects the goal object from the visual array, while vision-for-action, working with associated motor networks, carries out the required computations for the goal-directed action. In a very real sense, then, the strengths and weaknesses of these two kinds of vision complement each other in the production of adaptive behaviour. 4. References Aglioti, S., DeSouza, J.F., and Goodale, M.A. (1995). Size-contrast illusions deceive the eye but not the hand. Current Biology, 5(6), 679–685. Behrmann, M., Richler, J., and Avidan, G. (2013). Holistic face perception. In J. Wagemans (Ed.), Oxford Handbook of Perceptual Organization. Oxford, UK: Oxford University Press. Ben-Shalom, A., and Ganel, T. (2012). Object representations in visual memory: evidence from visual illusions. Journal of Vision, 12(7). Bruno, N., and Franz, V.H. (2009). When is grasping affected by the Müller-Lyer illusion? A quantitative review. Neuropsychologia, 47(6), 1421–1433. Carey, D.P. (2001). Do action systems resist visual illusions? Trends in Cognitive Sciences, 5(3), 109–113. Chapman, C.S., Gallivan, J.P., Wood, D.K., Milne, J.L., Culham, J.C., and Goodale, M.A. (2010a). Reaching for the unknown: Multiple target encoding and real-time decision making in a rapid reach task. Cognition, 116, 168–176. Coren, S., and Girgus, J.S. (1978).
Seeing is deceiving: the psychology of visual illusions. Lawrence Erlbaum Associates. Cuijpers, R.H., Brenner, E., and Smeets, J.B.J. (2006). Grasping reveals visual misjudgements of shape. Experimental Brain Research, 175(1), 32–44. Cuijpers, R.H., Smeets, J.B.J., and Brenner, E. (2004). On the relation between object shape and grasping kinematics. Journal of Neurophysiology, 91(6), 2598–2606. Culham, J.C., and Valyear, K.F. (2006). Human parietal cortex in action. Current Opinion in Neurobiology, 16(2), 205–212. Duncan, J. (1984). Selective attention and the organization of visual information. Journal of Experimental Psychology: General, 113(4), 501–517. Foster, R.M., and Franz, V.H. (2013). Inferences about time course of Weber’s Law violate statistical principles. Vision Research, 78, 56–60. Foster, R.M., Kleinholdermann, U., Leifheit, S., and Franz, V.H. (2012). Does bimanual grasping of the Müller-Lyer illusion provide evidence for a functional segregation of dorsal and ventral streams? Neuropsychologia, 50(14), 3392–3402. Franconeri, S.L., Bemis, D.K., and Alvarez, G.A. (2009). Number estimation relies on a set of segmented objects. Cognition, 113, 1–13. Franz, V.H. (2003). Manual size estimation: a neuropsychological measure of perception? Experimental Brain Research, 151(4), 471–477. Franz, V.H., Fahle, M., Bülthoff, H.H., and Gegenfurtner, K.R. (2001). Effects of visual illusions on grasping. Journal of Experimental Psychology: Human Perception and Performance, 27(5), 1124–1144. Franz, V.H., Gegenfurtner, K.R., Bülthoff, H.H., and Fahle, M. (2000). Grasping visual illusions: no evidence for a dissociation between perception and action. Psychological Science, 11(1), 20–25. Franz, V.H., and Gegenfurtner, K.R. (2008).
Grasping visual illusions: consistent data and no dissociation. Cognitive Neuropsychology, 25(7-8), 920–950. Ganel, T., Chajut, E., and Algom, D. (2008b). Visual coding for action violates fundamental psychophysical principles. Current Biology, 18(14), R599–601. Ganel, T., Freud, E., Chajut, E., and Algom, D. (2012). Accurate visuomotor control below the perceptual threshold of size discrimination. PloS One, 7(4), e36253. Ganel, T., and Goodale, M.A. (2003). Visual control of action but not perception requires analytical processing of object shape. Nature, 426(6967), 664–667. Ganel, T., Tanzer, M., and Goodale, M.A. (2008a). A double dissociation between action and perception in the context of visual illusions: opposite effects of real and illusory size. Psychological Science, 19(3), 221–225. Glover, S., and Dixon, P. (2002). Dynamic effects of the Ebbinghaus illusion in grasping: support for a planning/control model of action. Perception and Psychophysics, 64(2), 266–278. Gonzalez, C.L.R., Ganel, T., Whitwell, R.L., Morrissey, B., and Goodale, M.A. (2008). Practice makes perfect, but only with the right hand: sensitivity to perceptual illusions with awkward grasps decreases with practice in the right but not the left hand. Neuropsychologia, 46(2), 624–631. Gonzalez, C.L.R., Ganel, T., and Goodale, M.A. (2006). Hemispheric specialization for the visual control of action is independent of handedness. Journal of Neurophysiology, 95(6), 3496–3501. Goodale, M.A., and Milner, A.D. (2005). Sight Unseen: An Exploration of Conscious and Unconscious Vision. Oxford University Press, USA. Goodale, M.A., Jakobson, L.S., and Keillor, J.M. (1994). Differences in the visual control of pantomimed and natural grasping movements. Neuropsychologia, 32(10), 1159–1178. Goodale, M.A., Meenan, J.P., Bülthoff, H.H., Nicolle, D.A., Murphy, K.J., and Racicot, C.I. (1994). Separate neural pathways for the visual analysis of object shape in perception and prehension.
Current Biology, 4(7), 604–610. Goodale, M.A., and Milner, A.D. (1992). Separate visual pathways for perception and action. Trends in Neurosciences, 15(1), 20–25. Goodale, M.A. (2011). Transforming vision into action. Vision Research, 51(13), 1567–1587. Gregory, R.L. (1963). Distortion of visual space as inappropriate constancy scaling. Nature, 199, 678–691. Hadad, B.-S., Avidan, G., and Ganel, T. (2012). Functional dissociation between perception and action is evident early in life. Developmental Science, 15(5), 653–658. Haffenden, A.M., and Goodale, M.A. (1998). The effect of pictorial illusion on prehension and perception. Journal of Cognitive Neuroscience, 10(1), 122–136. He, L., Zhang, J., Zhou, T., and Chen, L. (2009). Connectedness affects dot numerosity judgment: Implications for configural processing. Psychonomic Bulletin and Review, 16, 509–517. Heath, M., Holmes, S.A., Mulla, A., and Binsted, G. (2012). Grasping time does not influence the early adherence of aperture shaping to Weber’s law. Frontiers in Human Neuroscience, 6, 332. Heath, M., Mulla, A., Holmes, S.A., and Smuskowitz, L.R. (2011). The visual coding of grip aperture shows an early but not late adherence to Weber’s law. Neuroscience Letters, 490(3), 200–204. Heed, T., Gründler, M., Rinkleib, J., Rudzik, F.H., Collins, T., Cooke, E., and O’Regan, J.K. (2011). Visual information and rubber hand embodiment differentially affect reach-to-grasp actions. Acta Psychologica, 138(1), 263–271. Holmes, S.A., Mulla, A., Binsted, G., and Heath, M. (2011). Visually and memory-guided grasping: aperture shaping exhibits a time-dependent scaling to Weber’s law. Vision Research, 51(17), 1941–1948. James, T.W., Humphrey, G.K., Gati, J.S., Menon, R.S., and Goodale, M.A. (2002). Differential effects of viewpoint on object-driven activation in dorsal and ventral streams. Neuron, 35(4), 793–801. Janczyk, M., and Kunde, W. (2012).
Visual processing for action resists similarity of relevant and irrelevant object features. Psychonomic Bulletin and Review, 19(3), 412–417. Koffka, K. (1999). Principles of Gestalt Psychology. Routledge. Konen, C.S., and Kastner, S. (2008). Two hierarchically organized neural systems for object information in human visual cortex. Nature Neuroscience, 11(2), 224–231. Kravitz, D.J., Saleem, K.S., Baker, C.I., and Mishkin, M. (2011). A new neural framework for visuospatial processing. Nature Reviews Neuroscience, 12(4), 217–230. Kunde, W., Landgraf, F., Paelecke, M., and Kiesel, A. (2007). Dorsal and ventral processing under dual-task conditions. Psychological Science, 18(2), 100–104. Lee, Y.-L., Crabtree, C.E., Norman, J.F., and Bingham, G.P. (2008). Poor shape perception is the reason reaches-to-grasp are visually guided online. Perception and Psychophysics, 70(6), 1032–1046. Lehky, S.R., and Sereno, A.B. (2007). Comparison of shape encoding in primate dorsal and ventral visual pathways. Journal of Neurophysiology, 97(1), 307–319. Linnell, K.J., Humphreys, G.W., McIntyre, D.B., Laitinen, S., and Wing, A.M. (2005). Action modulates object-based selection. Vision Research, 45(17), 2268–2286. Milne, J.L., Chapman, C.S., Gallivan, J.P., Wood, D.K., Culham, J.C., and Goodale, M.A. (2013). Connecting the dots: Object connectedness deceives perception but not movement planning. Psychological Science, in press. Milner, A.D., and Goodale, M.A. (2006). The Visual Brain in Action (2nd ed.). Oxford University Press, USA. O’Craven, K.M., Downing, P.E., and Kanwisher, N. (1999). fMRI evidence for objects as the units of attentional selection. Nature, 401(6753), 584–587. Perenin, M.T., and Vighetto, A. (1988). Optic ataxia: a specific disruption in visuomotor mechanisms. I. Different aspects of the deficit in reaching for objects. Brain, 111(Pt 3), 643–674. Pomerantz, J.R., and Cragin, A.I. (2013). Emergent features and feature combination. In J.
Wagemans (Ed.), Oxford Handbook of Perceptual Organization. Oxford, UK: Oxford University Press. Roberts, B., Harris, M.G., and Yates, T.A. (2005). The roles of inducer size and distance in the Ebbinghaus illusion (Titchener circles). Perception, 34(7), 847–856. Schenk, T., and Milner, A.D. (2006). Concurrent visuomotor behaviour improves form discrimination in a patient with visual form agnosia. The European Journal of Neuroscience, 24(5), 1495–1503. Schum, N., Franz, V.H., Jovanovic, B., and Schwarzer, G. (2012). Object processing in visual perception and action in children and adults. Journal of Experimental Child Psychology, 112(2), 161–177. Smeets, J.B., and Brenner, E. (1999). A new view on grasping. Motor Control, 3(3), 237–271. Smeets, J.B., and Brenner, E. (2001). Independent movements of the digits in grasping. Experimental Brain Research, 139(1), 92–100. Smeets, J.B., and Brenner, E. (2008). Grasping Weber’s law. Current Biology, 18(23), R1089–1090; author reply R1090–1091. Stöttinger, E., Pfusterschmied, J., Wagner, H., Danckert, J., Anderson, B., and Perner, J. (2012). Getting a grip on illusions: replicating Stöttinger et al [Exp Brain Res (2010) 202:79-88] results with 3-D objects. Experimental Brain Research, 216(1), 155–157. Thaler, L., and Goodale, M.A. (2010). Beyond distance and direction: the brain represents target locations non-metrically. Journal of Vision, 10(3), 3.1–27. Van de Kamp, C., and Zaal, F.T. (2007). Prehension is really reaching and grasping. Experimental Brain Research, 182(1), 27–34. Van der Kamp, J., De Wit, M.M., and Masters, R.S.W. (2012). Left, right, left, right, eyes to the front! Müller-Lyer bias in grasping is not a function of hand used, hand preferred or visual hemifield, but foveation does matter. Experimental Brain Research, 218(1), 91–98. Westwood, D.A., and Goodale, M.A. (2003). Perceptual illusion and the real-time control of action. Spatial Vision, 16(3-4), 243–254.