Different modes of visual organization for perception and for action
Melvyn A. Goodale1 and Tzvi Ganel2
To appear in:
Oxford Handbook of Perceptual Organization
Oxford University Press
Edited by Johan Wagemans
1 The Brain and Mind Institute, The University of Western Ontario, London, Ontario, Canada N6A 5C2
2 Department of Psychology, Ben-Gurion University of the Negev, Be’er-Sheva, Israel
1. Introduction
We depend on vision, more than on any other sense, to perceive the world of objects and events
beyond our bodies. We also use vision to move around that world and to guide our goal-directed
actions. Over the last twenty-five years, it has become increasingly clear that the visual pathways in the
brain that mediate our perception of the world are quite distinct from those that mediate the control of
our actions. This distinction between ‘vision-for-perception’ and ‘vision-for-action’ has emerged as one
of the major organizing principles of the visual brain, particularly with respect to the visual pathways in
the cerebral cortex (Goodale and Milner, 1992; Milner and Goodale, 2006).
According to Goodale and Milner’s (1992) account, the ventral stream of visual processing, which arises
in early visual areas and projects to inferotemporal cortex, constructs the rich and detailed
representation of the world that serves as a perceptual foundation for cognitive operations, allowing us
to recognize objects, events and scenes, attach meaning and significance to them, and infer their causal
relations. Such operations are essential for accumulating a knowledge-base about the world. In
contrast, the dorsal stream, which also arises in early visual areas but projects instead to the posterior
parietal cortex, provides the necessary visual control of skilled actions, such as manual prehension. Even
though the two streams have different functions and operating principles, in everyday life they have to
work together. The perceptual networks of the ventral stream interact with various high-level cognitive
mechanisms and enable an organism to select a goal and an associated course of action, while the
visuomotor networks in the dorsal stream (and their associated cortical and subcortical pathways) are
responsible for the programming and on-line control of the particular movements that action entails. Of
course, the dorsal and ventral streams have other roles to play as well. For example, the dorsal stream,
together with areas in the ventral stream, plays a role in spatial navigation – and areas in the dorsal
stream appear to be involved in some aspects of working memory (Kravitz, Saleem, Baker, and Mishkin,
2011). This review, however, will focus on the respective roles of the two streams in perception and
action – and will concentrate largely on the implications of the theory for the principles governing
perceptual organization and visuomotor control.
2. Different neural computations for perception and action
Evidence from a broad range of empirical studies from human neuropsychology to single-unit recording
in non-human primates (for reviews, see Culham and Valyear, 2006; Goodale, 2011; Kravitz et al., 2011)
supports the idea of two cortical visual systems. Yet the question remains as to why two separate
systems evolved in the first place. Why couldn’t one “general purpose” visual system handle both
vision-for-perception and vision-for-action? The answer to this question lies in the differences in the
computational requirements of vision-for-perception on the one hand and vision-for-action on the
other. To be able to grasp an object successfully, for example, the visuomotor system has to deal with
the actual size of the object and its orientation and position with respect to the hand you intend to use
to pick it up. These computations need to reflect the real metrics of the world, or at the very least,
make use of learned “look-up tables” that link neurons coding a particular set of sensory inputs with
neurons that code the desired state of the limb (Thaler and Goodale, 2010). The time at which these
computations are performed is equally critical. Observers and goal objects rarely stay in a static
relationship with one another and, as a consequence, the egocentric location of a target object can
often change radically from moment to moment. In other words, the required coordinates for action
need to be computed at the very moment the movements are performed.
In contrast to vision-for-action, vision-for-perception does not need to deal with the absolute size of
objects or their egocentric locations. In fact, very often such computations would be counter-productive
because our viewpoint with respect to objects does not remain constant – even though our perceptual
representations of those objects do show constancy. Indeed, one can argue that it would be better to
encode the size, orientation, and location of objects relative to each other. Such a scene-based frame of
reference permits a perceptual representation of objects that transcends particular viewpoints, while
preserving information about spatial relationships (as well as relative size and orientation) as the
observer moves around. The products of perception also need to be available over a much longer time
scale than the visual information used in the control of action. By working with perceptual
representations that are object- or scene-based, we are able to maintain the constancies of size, shape,
color, lightness, and relative location, over time and across different viewing conditions.
The differences between the relative frames of reference required for vision-for-perception and
absolute frames of reference required for vision-for-action lead in turn to clear differences in the way in
which visual information about objects and their spatial relations is organized and represented. These
differences can be most readily seen in the way in which the two visual systems deal with visual
illusions.
2.1. Studies of visual illusions
The most intriguing – yet also the most controversial – evidence for dissociations between action and
perception in healthy subjects has come from studies of visual illusions of size (for a review see Goodale,
2011). In visual illusions of size an object is typically embedded within the context of other objects or of
other pictorial cues that distort its perceived size. Visual illusions, by definition, have robust effects on
perceptual judgments. Surprisingly, the same illusions can have little or no effect on visuomotor tasks
such as grasping. Thus, even though a person might perceive an object embedded within an illusion to
be larger or smaller than it really is, when they reach out to pick up the object, the opening of their
grasping hand is often unaffected by the illusion. In other words, the grip aperture is scaled to the real,
not the apparent size of the goal object. This result has been interpreted as evidence for the idea that
vision-for-action makes use of real-world metrics while vision-for-perception uses relative or scene-based metrics (Goodale and Milner, 2005). This interpretation, however, has been vigorously
challenged over the past decade by studies claiming that when attention and other factors are taken
into account, there is no difference between the effects of size-contrast illusions on grip scaling and
perceptual reports of size (for a review, see Franz and Gegenfurtner, 2008).
Figure 1. The effect of a size-contrast illusion on perception and action. A. The traditional
Ebbinghaus illusion in which the central circle in the annulus of larger circles is typically
seen as smaller than the central circle in the annulus of smaller circles, even though both
central circles are actually the same size. B. The same display, except that the central circle
in the annulus of larger circles has been made slightly larger. As a consequence, the two
central circles now appear to be the same size. C. A 3D version of the Ebbinghaus illusion.
Participants are instructed to pick up one of the two 3D disks placed either on the display
shown in panel A or the display shown in panel B. D. Two trials with the display shown in
panel B, in which the participant picked up the small disk on one trial and the large disk on
another. Even though the two central disks were perceived as being the same size, the grip
aperture in flight reflected the real not the apparent size of the disks. Adapted with
permission from Aglioti et al. (1995).
A representative example of such conflicting results comes from studies that have compared the effects
of the Ebbinghaus illusion on action and perception. In this illusion, a circle surrounded by an annulus of
smaller circles appears to be larger than the same circle surrounded by an annulus of larger circles (see
Figure 1A). It is thought that the illusion arises because of an obligatory comparison between the size of
the central circle and the size of the surrounding circles, with one circle looking relatively smaller than
the other (Coren and Girgus, 1978). It is also possible that the central circle within the annulus of
smaller circles will be perceived as more distant (and therefore larger) than the circle of equivalent
retinal-image size within the array of larger circles. In other words, the illusion may be simply a
consequence of the perceptual system's attempt to make size-constancy judgments on the basis of an
analysis of the entire visual array (Gregory, 1963). In addition, the distance between the surrounding
circles and the central circle may also play a role; if the surrounding circles are close to the central circle,
then the central circle appears larger, but if they are further away, the central circle appears smaller
(Roberts, Harris, and Yates, 2005). In many experiments, the size of the surrounding circles and the
distance between them and the central circle are confounded. But whatever the critical factors might
be in any particular Ebbinghaus display, it is clear that the apparent size of the central circle is influenced by
the context in which it is embedded. These contextual effects are remarkably resistant to cognitive
information about the real size of the circles. Thus, even when people are told that the two circles are
identical in size (and this fact is demonstrated to them), they continue to experience a robust illusion of
size.
The first demonstration that grasping might be refractory to the Ebbinghaus illusion was carried out by
Aglioti et al. (1995). These investigators constructed a 3-D version of the Ebbinghaus illusion, in which a
poker-chip type disk was placed in the centre of a 2-D annulus made up of either smaller or larger circles
(Figure 1C). Two versions of the Ebbinghaus display were used. In one case, the two central disks were
physically identical in size but one appeared to be larger than the other (Figure 1A). In the second case,
the size of one of the disks was adjusted so that the two disks were now perceptually identical but had
different physical sizes (Figure 1B). Despite the fact that the participants in this experiment experienced
a powerful illusion of size, their anticipatory grip aperture was unaffected by the illusion when they
reached out to pick up each of the central disks. In other words, even though their perceptual estimates
of the size of the target disk were affected by the presence of the surrounding annulus, maximum grip
aperture between the index finger and thumb of the grasping hand, which was reached about 70% of
the way through the movement, was scaled to the real not the apparent size of the central disk (Figure
1D).
The findings of Aglioti et al. (1995) have been replicated in a number of other studies (for a review, see
Carey, 2001; Goodale, 2011). Nevertheless, other studies using the Ebbinghaus illusion have failed to
replicate these findings. Franz et al. (2000, 2001), for example, used a modified version of the illusion
and found similar (and significant) illusory effects on both vision-for-action and vision-for-perception,
arguing that the two systems are not dissociable from one another, at least in healthy participants.
These authors argued that the difference between their findings and those of Aglioti et al. resulted from
different task demands. In particular, in the Aglioti study (as well as in a number of other studies
showing that visuomotor control is resistant to visual illusions), subjects were asked to attend to both
central disks in the illusory display in the perceptual task, but to grasp only one object at a time in the
action task. Franz and colleagues argued that this difference in attention in the perceptual and action
tasks could have accounted for the pattern of results in the Aglioti et al. study. In the experiments by
Franz and colleagues, participants were presented with only a single disk surrounded by an annulus of
either smaller or larger circles. Under these conditions, Franz and colleagues found that both grip
aperture and perceptual reports were affected by the presence of the surrounding annulus. The force of
this demonstration, however, was undercut by experiments by Haffenden and Goodale (1998), who asked participants either to estimate the size of one of the central disks manually by opening their finger and thumb a matching amount or to pick it up. Even though in both cases participants were
arguably directing their attention to only one of the disks, there was a clear difference in the effect of
the illusion: the manual estimates but not the grasping movements were affected by the size of the
circles in the surrounding annulus.
Franz (2003) later argued that the slope of the function describing the relationship between manual estimates and the real size of the target object was far steeper than that of more ‘conventional’ psychophysical
measures – and that, when one adjusted for the difference in slope, both action and perception were
affected to the same degree by the Ebbinghaus and by other illusions. Although this explanation, at
least on the face of it, is a compelling one, it cannot explain why Aglioti et al. (1995) and Haffenden and
Goodale (1998) found that when the relative sizes of the two target objects in the Ebbinghaus display
were adjusted so that they appeared to be perceptually identical, the grip aperture that participants
used to pick up the two targets continued to reflect the physical difference in their size. Nor can it
explain the findings of a recent study by Stöttinger and colleagues (2012) who showed that even when
slopes were adjusted, manual estimates of object size were much more affected by the illusion (in this
case, the Diagonal illusion) than were grasping movements.
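The slope-correction argument can be made concrete with a small sketch: if a measure changes by s mm for every 1 mm change in real size, a raw illusion effect of e mm in that measure corresponds to e/s mm of 'physical' size. The numbers below are purely illustrative assumptions, not data from the studies discussed.

```python
# A sketch of the slope-correction logic described by Franz (2003): divide each
# measure's raw illusion effect by that measure's slope (response change per mm
# of real size change) to express effects in common physical-size units.
# All numbers are hypothetical, chosen only to illustrate the arithmetic.

def corrected_effect(raw_effect_mm, slope):
    """Convert a measure's raw illusion effect into physical-size units
    by dividing by the measure's slope (mm of response per mm of size)."""
    return raw_effect_mm / slope

# A steep manual-estimation slope and a shallower grip-aperture slope can
# yield different raw effects that are identical after correction:
raw_estimation, slope_estimation = 1.6, 1.6
raw_grasping, slope_grasping = 0.8, 0.8

print(corrected_effect(raw_estimation, slope_estimation))  # 1.0
print(corrected_effect(raw_grasping, slope_grasping))      # 1.0
```

On this logic, apparently different raw effects can become identical once each measure's responsiveness is taken into account; the findings of Stöttinger and colleagues (2012) are notable precisely because a difference between estimation and grasping survived this correction.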
Recently, several studies have suggested that online visual feedback during grasping could be a relevant
factor accounting for some of the conflicting results in the domain of visual illusions and grasping. For
example, Bruno and Franz (2009) have performed a meta-analysis of studies that looked at the effects of
the Müller-Lyer illusion on perception and action and concluded that the dissociation between the
effects of this illusion on grasping and perception is most pronounced when online visual feedback is
available. According to this account, feedback from the fingers and the target object during grasp can be
effectively used by the visuomotor system to counteract the effect of visual illusions on grip aperture.
Further support for this proposal comes from studies that showed that visual illusions, such as the
Ebbinghaus illusion, affect grasping trajectories only during initial stages of the movement, but not in
later stages, in which visual feedback can be used to allow the visuomotor system to
compensate for the effects of the illusory context (Glover and Dixon, 2002). However, other studies that
manipulated the availability of visual feedback during grasping failed to find any effect of visual feedback on
grasping performance in the context of visual illusions (Ganel, Tanzer, and Goodale, 2008a; Westwood
and Goodale, 2003).
The majority of studies that have claimed that action escapes the effects of pictorial illusions have
demonstrated this by finding a null effect of the illusory context on grasping movements. In other
words, they have found that perception (by definition) was affected by the illusion but peak grip
aperture of the grasping movement was not. But null effects like this are never as compelling as double
dissociations between action and perception.
Figure 2. The effect of the Ponzo illusion on grasping and manual estimates. A. Two
objects embedded in the Ponzo illusion used in Ganel et al.'s (2008a) study. Although the
right object is perceived as larger, it is actually smaller in size. B. Maximum grip apertures
and perceptual estimation data show that the fingers’ aperture was tuned not to the perceived but to the actual sizes of the objects. Perceptual estimations, on the
other hand, were affected by the Ponzo illusory context. Adapted with permission from
Ganel et al. (2008a).
As it turns out, a more recent study has in fact demonstrated a double dissociation between perception
and action. Ganel and colleagues (2008a) used the well-known Ponzo illusion in which the perceived
size of an object is affected by its location within pictorial depth cues. Objects located at the diverging
end of the display appear to be larger than those located at the converging end. To dissociate the effects
of real size from those of illusory size, Ganel and colleagues manipulated the real sizes of two objects
that were embedded in a Ponzo display so that the object that was perceived as larger was actually the
smaller one of the pair (see Figure 2A). When participants were asked to make a perceptual judgment of
the size of the objects, their perceptual estimates reflected the illusory Ponzo effect. In contrast, when
they picked up the objects, the aperture between the finger and thumb of their grasping hand was
tuned to their actual size. In short, the difference in their perceptual estimates of size for the two
objects, which reflected the apparent difference in size, went in the opposite direction from the
difference in their peak grip aperture, which reflected the real difference in size (Figure 2B). This double
dissociation between the effects of apparent and real size differences on perception and action
respectively cannot be explained away by appealing to differences in attention or differences in slope
(Franz et al., 2001; Franz, Gegenfurtner, Bülthoff, and Fahle, 2000; Franz, 2003).
In a series of experiments that used both the Ebbinghaus and the Ponzo illusions, Gonzalez and her
colleagues provided a deeper understanding of the conditions under which grasping can escape the
effects of visual illusions (Gonzalez, Ganel, and Goodale, 2006). They argued that many of the earlier
studies showing that actions are sensitive to the effects of pictorial illusions required participants to
perform movements requiring different degrees of skill under different degrees of deliberate control
and with different degrees of practice. If one accepts the idea that high-level conscious processing of
visual information is mediated by the ventral stream (Milner and Goodale, 2006), then it is perhaps not
surprising that the less skilled, the less practiced, and thus the more deliberate an action, the greater
the chances that the control of this action would be affected by ventral stream perceptual mechanisms.
Gonzalez et al. (2006) provided support for this conjecture by demonstrating that awkward, unpractised
grasping movements, in contrast to familiar precision grips, were sensitive to the Ponzo and Ebbinghaus
illusions. In a follow-up experiment, they showed that the effects of these illusions on initially awkward
grasps diminished with practice (Gonzalez et al., 2008). Interestingly, similar effects of practice were not
obtained for right-handed subjects grasping with their left hand. Even more intriguing is the finding that
grasping with the left hand, even for many left-handed participants, was affected to a larger degree by
pictorial illusions than grasping with the right hand (Gonzalez et al., 2006). Gonzalez and colleagues
have interpreted these results as suggesting that the dorsal-stream mechanisms that mediate
visuomotor control may have evolved preferentially in the left hemisphere, which primarily controls
right-handed grasping. Additional support for this latter idea comes from work with patients with optic ataxia resulting from unilateral lesions of the dorsal stream (Perenin and Vighetto, 1988). Patients with left-hemisphere lesions typically show what is often called a ‘hand effect’: they exhibit a deficit in their
ability to visually direct reaching and grasping movements to targets situated in both the contralesional
and the ipsilesional visual field. In contrast, patients with right-hemisphere lesions are impaired only
when they reach out to grasp objects in the contralesional field.
Although the debate over whether or not action escapes the effects of perceptual illusions is far from
being resolved (for recent findings, see Foster, Kleinholdermann, Leifheit, and Franz, 2012; Heed et al.,
2011; van der Kamp, de Wit, and Masters, 2012), the focus on this issue has directed attention away
from the more general question of the nature of the computations underlying visuomotor control in
more natural situations. One example of an issue that has received only minimal attention from
researchers is the role of information about object shape in visuomotor control (but see Cuijpers,
Brenner, and Smeets, 2006; Cuijpers, Smeets, and Brenner, 2004; Goodale et al., 1994; Lee, Crabtree,
Norman, and Bingham, 2008) – and how that information might differ in its organization from
conventional perceptual accounts of shape processing.
2.2. Studies of configural processing of shape
The idea that vision treats the shape of an object in a holistic manner has been a basic theme running
through theoretical accounts of perception from early Gestalt psychology (Koffka, 1935) to more
contemporary cognitive neuroscience (e.g. Duncan, 1984; O’Craven, Downing and Kanwisher, 1999).
Encoding an object holistically permits a representation of the object that preserves the relations
between object parts and other objects in the visual array without requiring precise information about
the absolute size of each of the object’s dimensions (see Behrmann, Richler, and Avidan, 2013;
Pomerantz and Cragin, 2013). In fact, as discussed earlier, calculating the exact size, distance, and
orientation of every aspect of every object in a visual scene carries a huge computational load. Holistic
(or configural) processing is much more efficient for constructing perceptual representations of objects.
When we interact with an object, however, it is imperative that the visual processes controlling the
action take into account the absolute metrics of the most relevant dimension of the object without
being influenced by other dimensions or features. In other words, rather than being holistic, the visual
processing mediating action should be analytical.
Empirical support for the idea that the visual control of action is analytical rather than configural comes
from experiments using a variant of the Garner speeded classification task (Ganel and Goodale, 2003). In
these experiments, participants were required to either make perceptual judgments of the width of
rectangles or to grasp them across their width – while in both cases trying to ignore the length. As
expected, participants could not ignore the length of a rectangle when making judgements of its width.
Thus, when the length of a rectangle was varied randomly from trial to trial, participants took longer to
discriminate a wide rectangle from a narrow one than when the length did not change. In sharp
contrast, however, participants appeared to completely ignore the length of an object when grasping it
across its width. Thus, participants took no longer to initiate (or to complete) their grasping movement
when the length of the object varied than when its length did not change. These findings show that the
holistic processing that characterizes perceptual processing does not apply to the visual control of skilled
actions such as grasping. Instead, the visuomotor mechanisms underlying this behaviour deal with the
basic dimensions of objects as independent features. This finding of a dissociation between holistic and
analytical processing for perception and action respectively using Garner's paradigm has been replicated
by several other studies (Janczyk and Kunde, 2012; Kunde, Landgraf, Paelecke, and Kiesel,
2007) and more recently has been reported in young children (Schum, Franz, Jovanovic, and Schwarzer,
2012).
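The logic of the Garner speeded classification task can be illustrated with a short sketch. Garner interference is simply the difference between mean response times in 'filtering' blocks (where the irrelevant dimension varies from trial to trial) and 'baseline' blocks (where it is fixed). The reaction times below are hypothetical, chosen only to mirror the pattern described for perception versus grasping, not the actual measurements from Ganel and Goodale (2003).

```python
# A minimal sketch of the Garner-interference computation (hypothetical data).

from statistics import mean

def garner_interference(baseline_rts, filtering_rts):
    """Interference = mean RT when the irrelevant dimension varies (filtering)
    minus mean RT when it is held constant (baseline)."""
    return mean(filtering_rts) - mean(baseline_rts)

# Illustrative reaction times in ms for width judgments (perception):
perc_baseline  = [452, 461, 448, 455]
perc_filtering = [489, 495, 483, 491]   # slower: length cannot be ignored

# Illustrative movement-initiation times for grasping across the width:
grasp_baseline  = [321, 318, 325, 322]
grasp_filtering = [320, 323, 319, 324]  # no cost: length is ignored

print(garner_interference(perc_baseline, perc_filtering))   # positive interference
print(garner_interference(grasp_baseline, grasp_filtering)) # near zero
```

A positive interference score indicates holistic (configural) processing of the two dimensions; an interference score near zero is the signature of the analytical processing attributed to the visuomotor system.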
Figure 3. An example of a within-object illusion of shape. Although the two rectangles are equal in width, the shorter rectangle is perceived as wider than the taller one (see Ganel and Goodale, 2003; Ben-Shalom and Ganel, 2012).
Beyond reflecting configural processing, subjects' inability to ignore information about an irrelevant dimension when estimating the size of a relevant dimension often leads to a directional distortion in their size perception. In particular, because a rectangle's width is always perceived relative to its length, longer rectangles will always be perceived as narrower, even in cases in which their actual width is kept constant (see Figure 3). This type of illusion, in which the perceived element is affected by irrelevant dimensions belonging to the same object, has been termed a within-object illusion (see Ben-Shalom and Ganel, 2012). Interestingly, it has recently been argued that within-object illusions and between-object illusions (discussed in the previous section) rely on different cognitive mechanisms; for example, it has been shown that representations in iconic memory are affected by the latter type of illusion, but not by within-object illusions.
More relevant to the present discussion, it has been shown that within-object illusions, like between-object illusions, do not affect visuomotor control. That is, unlike perceptual estimations of a rectangle's width, which are affected by its length, the aperture of the fingers when grasping the rectangle across its width was shown to be unaffected by length (Ganel and Goodale, 2003). Taken together, all these findings point to the same conclusion: unlike visual perception, which is always affected by relative frames of reference, the visual control of action is more analytical and is therefore immune to the effects of both within-object and between-object pictorial illusions.
Figure 4. There appear to be fewer circles on the right than on the left – even though in
both cases there are 22 individual circles. Connecting the circles with short lines creates
the illusion of fewer circles. Even so, when our brain plans actions to these targets it
computes the actual number of targets. In the task used by Milne et al. (2013) far fewer
circles were used, but the effect was still present in perceptual judgments but not in the
biasing of rapid reaching movements. In the action task, it was the actual not the apparent
number of circles that affected performance. Adapted with permission from Milne et al.
(2013).
Recent work also suggests that there are fundamental differences in scene segmentation for perception
and action planning. It is well-established that our perceptual system parses complex scenes into
discrete objects, but what is less well known is that parsing is also required for planning visually-guided
movements, particularly when more than one potential target is present. In a recent study, Milne et al.
(2013) explored whether perception and motor planning use the same or different parsing strategies –
and whether perception is more sensitive to contextual effects than is motor planning. To do this, they
used the ‘connectedness illusion’, in which observers typically report seeing fewer targets if pairs of
targets are connected by short lines (Franconeri, Bemis, and Alvarez, 2009; He, Zhang, Zhou, and Chen,
2009; see Figure 4).
Milne et al. (2013) tested participants in a rapid reaching paradigm they had developed that requires
subjects to initiate speeded arm movements toward multiple potential targets before one of the targets
is cued for action (Chapman et al., 2010). In their earlier work, they had shown that when there were an
equal number of targets on each side of a display, participants aimed their initial trajectories toward a
midpoint between the two target locations. Furthermore, when the distribution of targets on each side
of a display was not equal (but each potential target had an equal probability of becoming the goal
target), initial trajectories were biased toward the side of the display that contained a greater number of
targets. They argued that this behavior maximizes the chances of success on the task because
movements are directed toward the most probable location of the eventual goal, thereby minimizing
the ‘cost’ of correcting the movement in-flight. Because it provides a behavioral ‘read-out’ of rapid
comparisons of target numerosity for motor planning, the paradigm is an ideal way to measure object
segmentation in action in the context of the connectedness illusion. When participants were asked to
make speeded reaches towards displays in which the targets were sometimes connected by lines, their
reaches were completely unaffected by the presence of the connecting lines. Instead, their movement
plans, as revealed by their movement trajectories, were influenced only by the difference in the number
of targets present on each side of the display, irrespective of whether connecting lines were there or
not. Not unexpectedly, however, when they were asked to report whether there were fewer targets
present on one side as compared to the other, their reports were biased by the connecting lines
between the targets.
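One way to picture the planning strategy described above is as aiming the initial trajectory at the probability-weighted average of the potential target locations. The sketch below is an illustrative model only; the coordinates and the simple averaging rule are assumptions, not the analysis actually used by Chapman et al. (2010) or Milne et al. (2013).

```python
# A sketch of the idea that initial reach trajectories aim toward the
# expected location of the eventual goal when all potential targets are
# equally probable (illustrative model, hypothetical coordinates).

def initial_aim(target_positions):
    """target_positions: x-coordinates of equally probable potential targets.
    Returns the expected goal location, i.e. the mean of the positions."""
    return sum(target_positions) / len(target_positions)

# Equal numbers of targets on the left (negative x) and right (positive x):
print(initial_aim([-10.0, 10.0]))              # midpoint between the sides

# Three targets on the right, one on the left: the aim point shifts
# rightward, minimizing the expected cost of an in-flight correction.
print(initial_aim([-10.0, 10.0, 10.0, 10.0]))  # biased toward the right
```

Because connecting lines change perceived numerosity but not the actual target positions, a plan computed this way is unaffected by the connectedness illusion, which is the pattern Milne et al. (2013) observed in the reaching trajectories.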
The work by Milne et al. (2013) suggests that scene segmentation for perception depends on
mechanisms that are distinct from those that allow humans to plan rapid and efficient target-directed
movements in situations where there are multiple potential targets. While the perception of object
numerosity can be dramatically influenced by manipulations of object grouping, such as the connectedness illusion, the visuomotor system is able to ignore such manipulations and to parse individual objects and
accurately plan, execute, and control rapid reaching movements to multiple goals. These results are
especially compelling considering that initial goal selection is undoubtedly based on a perceptual
representation of the goal (for a discussion of this issue, see Milner and Goodale, 2006). The planning of
the final movement, however, is able to effectively by-pass the contextual biases of perception,
particularly in situations where rapid planning and execution of the movement is paramount.
2.3. Studies of object size resolution
The 19th-century German physician and scientist Ernst Heinrich Weber is usually credited with the observation that our sensitivity to changes in any physical property or dimension of an object or sensory stimulus decreases as the magnitude of that property or dimension increases. For example, if a bag of sugar weighs
only 50 grams, then we will notice a change in weight if only a few grams of sugar are added or taken
away. But if the bag weighs 500 grams, much more sugar must be added or taken away before we
notice the difference. Typically, if the weight of something is doubled, then the smallest difference in
weight that can be perceived is also doubled. Similar, but not identical, functions have been
demonstrated for the loudness of sounds, the brightness of visual stimuli, and a broad range of other
sensory experiences. Imagine, for example, that you are riding on an express train on your way to an
important meeting. As the train accelerates from 220 to 250 km an hour, you might scarcely notice the
change in velocity – even though the same change in velocity was easily noticed as the train left the
station earlier and began to accelerate. In short, the magnitude of the ‘just-noticeable difference’ (JND)
increases with the magnitude or intensity of the stimulus. The German physicist-turned-philosopher
Gustav Fechner later formalized this basic psychophysical principle mathematically and called it Weber’s
Law.
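Fechner's formalization can be stated compactly. What follows is the standard textbook rendering (the symbols k, c, and I_0 are conventional notation, not symbols taken from the studies discussed in this chapter): for a stimulus of magnitude I, the just-noticeable difference ΔI satisfies

```latex
% Weber's law: the JND grows in proportion to stimulus magnitude,
% with a constant "Weber fraction" k.
\frac{\Delta I}{I} = k
% Integrating this relation yields Fechner's logarithmic law of
% sensation S, where I_0 is the absolute detection threshold.
S = c \, \log \frac{I}{I_0}
```

On this account, doubling the weight of the bag of sugar doubles the smallest detectable change in weight, exactly as in the example above.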
Figure 5. Effects of object size on visual resolution (Just Noticeable Difference: JND). Left
panel: The effect of object size on JNDs for Maximum Grip Apertures (MGAs) during
grasping. Right panel: The effect of object size on JNDs during perceptual estimations. Note
that JNDs for the perceptual condition increased linearly with length, following Weber's
law, whereas the JNDs for grasping were unaffected by size. Adapted with permission from
Ganel et al. (2008b).
Weber's law is one of the most fundamental features of human perception. It is not clear, however, whether
the visual control of action is subject to the same universal psychophysical function. To investigate this
possibility, Ganel and colleagues (Ganel, Chajut, and Algom, 2008b) carried out a series of
psychophysical and visuomotor experiments in which participants were asked either to grasp or to make
perceptual estimations of the length of rectangular objects. In this study, the JNDs were defined as the
standard deviation of the grip apertures and the standard deviation of the perceptual judgements for a
given stimulus. This is akin to the classical Method of Adjustment, in which the amount
of variation in the responses for a given size of a stimulus reflects an "area of uncertainty" in which
participants are not sensitive to fluctuations in size. Not surprisingly, Ganel and colleagues found that
the JNDs for the perceptual estimations of the object’s length showed a linear increase with length, as
Weber’s law would predict. The JNDs for grip aperture, however, showed no such increase with object
length and remained constant as the length of the object increased (see Figure 5). In other words, the
standard deviation for grip aperture remained the same despite increases in the length of the object.
Simply put, visually guided actions appear to violate Weber's law, reflecting a fundamental difference in
the way that object size is computed for action and for perception (Ganel et al., 2008a; 2008b). This
fundamental difference in the psychophysics of perception and action has been found to emerge in
children as young as 5 years of age (Hadad, Avidan, and Ganel, 2012, see Figure 6).
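The variability-based estimate of the JND described above can be illustrated with a short sketch. The numbers below are invented for illustration (they are not data from Ganel et al., 2008b); the pattern they encode is the one reported in the study: perceptual JNDs grow with object length, while JNDs for maximum grip aperture stay flat.

```python
import statistics

# Hypothetical response distributions (mm) for three object lengths (mm).
# All values are illustrative only, not data from Ganel et al. (2008b).
perceptual_estimates = {
    20: [19.2, 20.5, 19.8, 20.9, 19.6],
    40: [38.4, 41.0, 39.6, 41.8, 39.2],
    60: [57.6, 61.5, 59.4, 62.7, 58.8],
}
grip_apertures = {  # maximum grip apertures when grasping the same objects
    20: [29.2, 30.5, 29.8, 30.9, 29.6],
    40: [49.2, 50.5, 49.8, 50.9, 49.6],
    60: [69.2, 70.5, 69.8, 70.9, 69.6],
}

def jnd(responses):
    """Approximate the JND for one stimulus by the standard deviation
    of the responses, as in the classical Method of Adjustment."""
    return statistics.stdev(responses)

# Weber's law predicts a constant JND/size ratio (the Weber fraction);
# here the perceptual JNDs scale with size while the grip JNDs do not.
for size in perceptual_estimates:
    p, g = jnd(perceptual_estimates[size]), jnd(grip_apertures[size])
    print(f"size {size} mm: perceptual JND {p:.2f}, grip JND {g:.2f}")
```

With these made-up numbers, the perceptual JNDs rise roughly threefold as object length triples (a constant Weber fraction), whereas the grip JNDs remain unchanged, mirroring the dissociation shown in Figure 5.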
Figure 6. JNDs for perceptual estimations (Panel A) and for grasping (Panel B) in different
age groups. In all age groups, JNDs for the perceptual condition increased with object size,
following Weber's law. Importantly, however, the JNDs for grasping in all groups were
unaffected by changes in the size of the target. Adapted with permission from Hadad et al.
(2012).
This difference in the psychophysics of perception and action can be observed in other contexts as well.
In a recent study (Ganel, Freud, Chajut, and Algom, 2012), for example, participants were asked to grasp
or to make perceptual comparisons between pairs of circular disks. Importantly, the actual difference in
size between the members of the pairs was set below the perceptual JND. Again, a dissociation was
observed between perceptual judgements of the size and the kinematic measures of the aperture of the
grasping hand. Regardless of whether or not participants were accurate in their judgments of the
difference in size between the two disks, the maximum opening between the thumb and forefinger of
their grasping hand in flight reflected the actual difference in size between the two disks (see Figure 7).
These findings provide additional evidence for the idea that the computations underlying the perception
of objects are different from those underlying the visual control of action. They also suggest that people
can show differences in the tuning of grasping movements directed to objects of different sizes even
when they are not conscious of those differences in size.
Figure 7. Grasping objects that are perceptually indistinguishable. A) The set-up with
examples of the stimuli that were used. Participants were asked on each trial to report
which object of the two was the larger and then to grasp the object in each pair that was in
the centre of the table (task order was counterbalanced between subjects). B) MGAs for
correct and for incorrect perceptual size classifications. MGAs reflected the real size
differences between the two objects even in trials in which subjects erroneously judged the
larger object in the pair as the smaller one. Adapted with permission from Ganel et al.
(2012).
The demonstrations showing that the visual control of grasping does not obey Weber's law resonate
with Milner and Goodale's (2006) proposal that there is a fundamental difference in the frames of
reference and metrics used by vision-for-perception and vision-for-action (Ganel et al. 2008b). These
findings also converge with the results of imaging studies that suggest that the ventral and the dorsal
streams represent objects in different ways (James, Humphrey, Gati, Menon, and Goodale, 2002; Konen
and Kastner, 2008; Lehky and Sereno, 2007). Yet, the interpretation of these results has not gone
unchallenged (Heath, Holmes, Mulla, and Binsted, 2012; Heath, Mulla, Holmes, and Smuskowitz, 2011;
Holmes, Mulla, Binsted, and Heath, 2011; Smeets and Brenner, 2008). For example, in a series of papers,
Heath and his colleagues (Heath et al., 2011, 2012; Holmes et al., 2011) have examined the effects of
Weber's law on grip aperture throughout the entire movement trajectory and found an apparent
adherence to Weber's law early but not later in the trajectory of the movement. A recent paper by
Foster and Franz (2013), however, has suggested that these effects are confounded by movement
velocity. In particular, due to task demands that require subjects to hold their finger and thumb together
prior to each grasp, subjects tend to open their fingers faster for larger compared to smaller objects, a
feature that characterizes only early stages of the grasping trajectory. Therefore, the increased grip
variability for larger compared to smaller objects during the early portion of the trajectories could be
attributed to velocity differences in the opening of the fingers rather than to the effects of Weber's law.
In their commentary on Ganel et al.'s (2008b) paper, Smeets and Brenner (2008) argue that the results
can be more efficiently accommodated by a ‘double-pointing’ account of grasping. According to this
model, the movements of each finger of a grasping hand are controlled independently – each digit being
simultaneously directed to a different location on the goal object (Smeets and Brenner, 1999, 2001).
Thus, when people reach out to pick up an object with a precision grip, for example, the index finger is
directed to one side of the object and the thumb to the other. No computation of object size is
required, only the computation of two separate locations on the object, one for the finger and the other
for the thumb. The apparent scaling of the grip to object size is nothing more than a by-product of the
fact that the index finger and thumb are moving towards their respective end points. Smeets and
Brenner go on to argue that because size is not computed for grasping, and only location matters,
Weber's law would not apply. In other words, because location, unlike size, is a discrete rather than a
continuous dimension, Weber’s law is irrelevant for grasping. Smeets and Brenner’s account also
comfortably explains why grasping escapes the effects of pictorial illusions such as the Ebbinghaus and
Ponzo illusions. In fact, more generally, their double-pointing or position-based account of grasping
would appear to offer a more parsimonious account of a broad range of apparent dissociations between
vision-for-perception and vision-for-action than appealing to a two-visual-systems model.
Although Smeets and Brenner's (1999, 2001) interpretation is appealing, there are several lines of
evidence showing that finger trajectories during grasping are tuned to object size rather than to location.
For example, van de Kamp and Zaal (2007) have shown that when one side of a target object but not the
other is suddenly pushed in or out (with a hidden compressed-air device) as people are reaching out to
grasp it, the trajectories of both digits are adjusted in flight. In other words, the trajectories of both
the finger and the thumb change to reflect the change in size of the target object. Smeets and
Brenner’s model would not predict this. According to their double-pointing hypothesis, only the digit
going to the perturbed side of the goal object should change course. But the fact that the trajectories of
both digits show an adjustment is entirely consistent with the idea that the visuomotor system is
computing the size of the target object. In other words, as the object changes size, so does the grip.
Another line of evidence that goes against Smeets and Brenner’s double-pointing hypothesis comes
from the neuropsychological literature. Damage to the ventral stream in the human occipitotemporal
cortex can result in visual form agnosia, a deficit in visual object recognition. The best-documented
example of such a case is patient DF, who has bilateral lesions to the lateral occipital area rendering her
unable to recognize or discriminate between even simple geometric shapes such as a rectangle and a
square. Despite her profound deficit in form perception, she is able to scale her grasp to the dimensions
of the very objects she cannot describe or recognize, presumably using visuomotor mechanisms in her
dorsal stream. As is often the case for neurological patients, DF is able to partially compensate for her
deficits by relying on non-natural strategies based on her residual intact abilities. Schenk and Milner
(2006), for example, found that under certain circumstances DF could use her intact visuomotor skills to
compensate for her marked impairment in shape recognition. When DF was asked to make simple shape
classifications (rectangle/square classifications), her performance was at chance. Yet, her shape
classifications markedly improved when performed concurrently with grasping movements toward the
target objects she was being asked to discriminate. Interestingly, this improvement appeared not to
depend on afferent feedback from the grasping fingers, because it emerged even when DF was merely
planning her actions, before the fingers actually started to move. Schenk and Milner therefore
concluded that information about an object’s dimensions is available at some level via visuomotor
activity in DF's intact dorsal stream and this in turn improves her shape-discrimination performance. For
this to happen, the dorsal-stream mechanisms would have to be computing the relevant dimension of
the object to be grasped and not simply the locations on that object to which the finger and thumb are
being directed (for similar evidence in healthy individuals, see Linnell, Humphreys, McIntyre, Laitinen,
and Wing, 2005). Again, these findings are clearly not in line with Smeets and Brenner's double-pointing
hypothesis and suggest that the dorsal stream uses information about object size (more particularly, the
relevant dimension of the target object) when engaged in visuomotor control. Parenthetically, it is
interesting to note that the results of one of the experiments in the Schenk and Milner study also
provide indirect evidence that grip aperture is not affected by the irrelevant dimension of the object to
be grasped (Ganel and Goodale, 2003). When DF was asked to grasp objects across a dimension that
was not informative of shape (i.e., grasp across rectangles of constant width that varied in length), no
grasping-induced perceptual improvements in distinguishing between the different rectangles were
found. This finding not only shows that shape per se was not being used in the earlier tasks where she
did show some enhancement in her ability to discriminate between objects of different widths, but it
also provides additional evidence for the idea that visuomotor control is carried out in an analytical
manner (e.g. concentrating entirely on object width) without being influenced by differences in the
configural aspects of the objects.
As mentioned at the beginning of the chapter, Milner and Goodale (2006) have argued that visuomotor
mechanisms in the dorsal stream tend to operate in real time. If the target object is no longer visible
when the imperative to begin the movement is given, then any object-directed action would have to be
based on a memory of the target object, a memory that is necessarily dependent on earlier processing
by perceptual mechanisms in the ventral stream. Thus, DF is unable to scale her grasp for objects that
she saw only seconds earlier, presumably because of the damage to her ventral stream (Goodale,
Jakobson, and Keillor, 1994). Similarly, when neurologically intact participants are asked to base their
grasping on memory representations of the target object, rather than on direct vision, the kinematics of
their grasping movements are affected by Weber's law and by pictorial illusions (Ganel et al. 2008b; for
review, see Goodale 2011). Again, without significant modification, Smeets and Brenner's double-pointing model does not provide a parsimonious account for why memory-based action control should
be affected by size whereas real-time actions should not. But as we have already seen, according to the
two-visual systems account, when vision is not allowed and memory-based actions are performed, such
actions have to rely on earlier perceptual processing of the visual scene, processing that in principle is
subject to Weber’s law and pictorial illusions of size.
3. Conclusions
The visual control of skilled actions, unlike visual perception, operates in real time and reflects the
metrics of the real world. This means that many actions, such as reaching and grasping, are immune to
the effects of a range of pictorial illusions, which by definition affect perceptual judgments. Only when
the actions are deliberate and cognitively ‘supervised’ or are initiated after the target is no longer in
view do the effects of illusions emerge. All of this suggests that our perceptual representations of
objects are organized in a fundamentally different way from the visual information underlying the
control of skilled actions directed at those objects. As we have seen, the visual perception of objects
and their relations tends to be holistic and contextual, with relatively poor real-world metrics, whereas the
visual control of skilled actions is more analytical, circumscribed, and metrically accurate. Of course, in
everyday life, vision-for-perception and vision-for-action work together in the production of purposive
behaviour – vision-for-perception, together with other cognitive systems, selects the goal object from
the visual array, while vision-for-action, working with associated motor networks, carries out the required
computations for the goal-directed action. In a very real sense, then, the strengths and weaknesses of
these two kinds of vision complement each other in the production of adaptive behaviour.
4. References
Aglioti, S., DeSouza, J.F., and Goodale, M.A. (1995). Size-contrast illusions deceive the eye but not the
hand. Current Biology, 5(6), 679–685.
Behrmann, M., Richler, J., and Avidan, G. (2013). Holistic face perception. In J. Wagemans (Ed.), Oxford
Handbook of Perceptual Organization. Oxford, UK: Oxford University Press.
Ben-Shalom, A., and Ganel, T. (2012). Object representations in visual memory: evidence from visual
illusions. Journal of Vision, 12(7).
Bruno, N., and Franz, V.H. (2009). When is grasping affected by the Müller-Lyer illusion? A quantitative
review. Neuropsychologia, 47(6), 1421–1433.
Carey, D.P. (2001). Do action systems resist visual illusions? Trends in Cognitive Sciences, 5(3), 109–113.
Chapman, C.S., Gallivan, J.P., Wood, D.K., Milne, J.L., Culham, J.C., and Goodale, M.A. (2010a). Reaching
for the unknown: Multiple target encoding and real-time decision making in a rapid reach task.
Cognition, 116, 168-176.
Coren, S., and Girgus, J.S. (1978). Seeing is deceiving: the psychology of visual illusions. Lawrence
Erlbaum Associates.
Cuijpers, R.H., Brenner, E., and Smeets, J.B.J. (2006). Grasping reveals visual misjudgements of shape.
Experimental Brain Research, 175(1), 32–44.
Cuijpers, R.H., Smeets, J.B.J., and Brenner, E. (2004). On the relation between object shape and grasping
kinematics. Journal of Neurophysiology, 91(6), 2598–2606.
Culham, J.C., and Valyear, K.F. (2006). Human parietal cortex in action. Current Opinion in Neurobiology,
16(2), 205–212.
Duncan, J. (1984). Selective attention and the organization of visual information. Journal of Experimental
Psychology: General, 113(4), 501–517.
Foster, R.M., and Franz, V.H. (2013). Inferences about time course of Weber’s Law violate statistical
principles. Vision Research, 78, 56–60.
Foster, R.M., Kleinholdermann, U., Leifheit, S., and Franz, V.H. (2012). Does bimanual grasping of the
Müller-Lyer illusion provide evidence for a functional segregation of dorsal and ventral streams?
Neuropsychologia, 50(14), 3392–3402.
Franconeri, S.L., Bemis, D.K., and Alvarez, G.A. (2009). Number estimation relies on a set of segmented
objects. Cognition, 113, 1-13.
Franz, V.H. (2003). Manual size estimation: a neuropsychological measure of perception? Experimental
Brain Research, 151(4), 471–477.
Franz, V.H, Fahle, M., Bülthoff, H.H., and Gegenfurtner, K.R. (2001). Effects of visual illusions on grasping.
Journal of Experimental Psychology: Human Perception and Performance, 27(5), 1124–1144.
Franz, V.H, Gegenfurtner, K.R., Bülthoff, H.H., and Fahle, M. (2000a). Grasping visual illusions: no
evidence for a dissociation between perception and action. Psychological Science, 11(1), 20–25.
Franz, V.H, and Gegenfurtner, K.R. (2008). Grasping visual illusions: consistent data and no dissociation.
Cognitive Neuropsychology, 25(7-8), 920–950.
Ganel, T., Chajut, E., and Algom, D. (2008b). Visual coding for action violates fundamental
psychophysical principles. Current Biology, 18(14), R599–601.
Ganel, T., Freud, E., Chajut, E., and Algom, D. (2012). Accurate visuomotor control below the perceptual
threshold of size discrimination. PloS One, 7(4), e36253.
Ganel, T., and Goodale, M.A. (2003). Visual control of action but not perception requires analytical
processing of object shape. Nature, 426(6967), 664–667.
Ganel, T., Tanzer, M., and Goodale, M.A. (2008a). A double dissociation between action and perception
in the context of visual illusions: opposite effects of real and illusory size. Psychological Science,
19(3), 221–225.
Glover, S., and Dixon, P. (2002). Dynamic effects of the Ebbinghaus illusion in grasping: support for a
planning/control model of action. Perception and Psychophysics, 64(2), 266–278.
Gonzalez, C.L.R, Ganel, T., Whitwell, R.L., Morrissey, B., and Goodale, M.A. (2008). Practice makes
perfect, but only with the right hand: sensitivity to perceptual illusions with awkward grasps
decreases with practice in the right but not the left hand. Neuropsychologia, 46(2), 624–631.
Gonzalez, C.L.R, Ganel, T., and Goodale, M.A. (2006). Hemispheric specialization for the visual control of
action is independent of handedness. Journal of Neurophysiology, 95(6), 3496–3501.
Goodale, M.A., and Milner, A.D. (2005). Sight Unseen: An Exploration of Conscious and Unconscious
Vision. Oxford University Press, USA.
Goodale, M.A., Jakobson, L.S., and Keillor, J.M. (1994). Differences in the visual control of pantomimed
and natural grasping movements. Neuropsychologia, 32(10), 1159–1178.
Goodale, M.A., Meenan, J.P., Bülthoff, H.H., Nicolle, D.A., Murphy, K.J., and Racicot, C.I. (1994). Separate
neural pathways for the visual analysis of object shape in perception and prehension. Current
Biology, 4(7), 604–610.
Goodale, M.A., and Milner, A.D. (1992). Separate visual pathways for perception and action. Trends in
Neurosciences, 15(1), 20–25.
Goodale, M.A. (2011). Transforming vision into action. Vision Research, 51(13), 1567–1587.
Gregory, R.L. (1963). Distortion of visual space as inappropriate constancy scaling. Nature, 199,
678–680.
Hadad, B.-S., Avidan, G., and Ganel, T. (2012). Functional dissociation between perception and action is
evident early in life. Developmental Science, 15(5), 653–658.
Haffenden, A.M., and Goodale, M.A. (1998). The effect of pictorial illusion on prehension and
perception. Journal of Cognitive Neuroscience, 10(1), 122–136.
He, L., Zhang, J., Zhou, T., and Chen, L. (2009). Connectedness affects dot numerosity judgment:
Implications for configural processing. Psychonomic Bulletin and Review, 16, 509-517.
Heath, M., Holmes, S.A., Mulla, A., and Binsted, G. (2012). Grasping time does not influence the early
adherence of aperture shaping to Weber’s law. Frontiers in Human Neuroscience, 6, 332.
Heath, M., Mulla, A., Holmes, S.A., and Smuskowitz, L.R. (2011). The visual coding of grip aperture shows
an early but not late adherence to Weber’s law. Neuroscience Letters, 490(3), 200–204.
Heed, T., Gründler, M., Rinkleib, J., Rudzik, F.H., Collins, T., Cooke, E., and O’Regan, J.K. (2011). Visual
information and rubber hand embodiment differentially affect reach-to-grasp actions. Acta
Psychologica, 138(1), 263–271.
Holmes, S.A., Mulla, A., Binsted, G., and Heath, M. (2011). Visually and memory-guided grasping:
aperture shaping exhibits a time-dependent scaling to Weber’s law. Vision Research, 51(17),
1941–1948.
James, T.W., Humphrey, G.K., Gati, J.S., Menon, R.S., and Goodale, M.A. (2002). Differential effects of
viewpoint on object-driven activation in dorsal and ventral streams. Neuron, 35(4), 793–801.
Janczyk, M., and Kunde, W. (2012). Visual processing for action resists similarity of relevant and
irrelevant object features. Psychonomic Bulletin and Review, 19(3), 412–417.
Koffka, K. (1999). Principles of Gestalt Psychology. Routledge.
Konen, C.S., and Kastner, S. (2008). Two hierarchically organized neural systems for object information
in human visual cortex. Nature Neuroscience, 11(2), 224–231.
Kravitz, D.J., Saleem, K.S., Baker, C.I., and Mishkin, M. (2011). A new neural framework for visuospatial
processing. Nature Reviews. Neuroscience, 12(4), 217–230.
Kunde, W., Landgraf, F., Paelecke, M., and Kiesel, A. (2007). Dorsal and ventral processing under
dual-task conditions. Psychological Science, 18(2), 100–104.
Lee, Y.-L., Crabtree, C.E., Norman, J.F., and Bingham, G.P. (2008). Poor shape perception is the reason
reaches-to-grasp are visually guided online. Perception and Psychophysics, 70(6), 1032–1046.
Lehky, S.R., and Sereno, A.B. (2007). Comparison of shape encoding in primate dorsal and ventral visual
pathways. Journal of Neurophysiology, 97(1), 307–319.
Linnell, K. J., Humphreys, G.W., McIntyre, D.B., Laitinen, S., and Wing, A.M. (2005). Action modulates
object-based selection. Vision Research, 45(17), 2268–2286.
Milne, J.L., Chapman, C.S., Gallivan, J.P., Wood, D.K., Culham, J.C., and Goodale, M.A. (2013). Connecting
the dots: Object connectedness deceives perception but not movement planning. Psychological
Science, in press.
Milner, A.D., and Goodale, M.A. (2006). The Visual Brain in Action (2nd ed.). Oxford University Press,
USA.
O’Craven, K.M., Downing, P.E., and Kanwisher, N. (1999). fMRI evidence for objects as the units of
attentional selection. Nature, 401(6753), 584–587.
Perenin, M.T., and Vighetto, A. (1988). Optic ataxia: a specific disruption in visuomotor mechanisms. I.
Different aspects of the deficit in reaching for objects. Brain, 111(Pt 3), 643–674.
Pomerantz, J.R., and Cragin, A.I. (2013). Emergent features and feature combination. In J. Wagemans
(Ed.), Oxford Handbook of Perceptual Organization. Oxford, UK: Oxford University Press.
Roberts, B., Harris, M.G., and Yates, T.A. (2005). The roles of inducer size and distance in the Ebbinghaus
illusion (Titchener circles). Perception, 34(7), 847–856.
Schenk, T., and Milner, A.D. (2006). Concurrent visuomotor behaviour improves form discrimination in a
patient with visual form agnosia. The European Journal of Neuroscience, 24(5), 1495–1503.
Schum, N., Franz, V.H., Jovanovic, B., and Schwarzer, G. (2012). Object processing in visual perception
and action in children and adults. Journal of Experimental Child Psychology, 112(2), 161–177.
Smeets, J.B., and Brenner, E. (1999). A new view on grasping. Motor Control, 3(3), 237–271.
Smeets, J.B., and Brenner, E. (2001). Independent movements of the digits in grasping. Experimental
Brain Research, 139(1), 92–100.
Smeets, J. B., and Brenner, E. (2008). Grasping Weber’s law. Current Biology, 18(23), R1089–1090;
author reply R1090–1091.
Stöttinger, E., Pfusterschmied, J., Wagner, H., Danckert, J., Anderson, B., and Perner, J. (2012). Getting a
grip on illusions: replicating Stöttinger et al [Exp Brain Res (2010) 202:79-88] results with 3-D
objects. Experimental Brain Research, 216(1), 155–157.
Thaler, L., and Goodale, M.A. (2010). Beyond distance and direction: the brain represents target
locations non-metrically. Journal of Vision, 10(3), 3.1–27.
Van de Kamp, C., and Zaal, F.T. (2007). Prehension is really reaching and grasping. Experimental Brain
Research, 182(1), 27–34.
Van der Kamp, J., De Wit, M.M., and Masters, R.S.W. (2012). Left, right, left, right, eyes to the front!
Müller-Lyer bias in grasping is not a function of hand used, hand preferred or visual hemifield,
but foveation does matter. Experimental Brain Research, 218(1), 91–98.
Westwood, D.A., and Goodale, M.A. (2003). Perceptual illusion and the real-time control of action.
Spatial Vision, 16(3-4), 243–254.