Cognition 102 (2007) 299–310
www.elsevier.com/locate/COGNIT

Brief article

Perceptual and decisional contributions to audiovisual interactions in the perception of apparent motion: A signal detection study

Daniel Sanabria a,*, Charles Spence a, Salvador Soto-Faraco b

a Department of Experimental Psychology, University of Oxford, UK
b ICREA and Parc Científic de Barcelona – Universitat de Barcelona, Spain

Received 30 May 2005; revised 23 November 2005; accepted 8 January 2006

This manuscript was accepted under the editorship of Jacques Mehler.
* Corresponding author. Tel.: +44 1865 271307; fax: +44 1865 310447. E-mail address: daniel.sanabria@psy.ox.ac.uk (D. Sanabria).
0010-0277/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.cognition.2006.01.003

Abstract

Motion information available to different sensory modalities can interact at both perceptual and post-perceptual (i.e., decisional) stages of processing. However, to date, researchers have only been able to demonstrate the influence of one of these components at any given time, hence the relationship between them remains uncertain. We addressed the interplay between the perceptual and post-perceptual components of information processing by assessing their influence on performance within the same experimental paradigm. We used signal detection theory to discriminate changes in perceptual sensitivity (d′) from shifts in response criterion (c) in performance on a detection (Experiment 1) and a classification (Experiment 2) task regarding the direction of auditory apparent motion streams presented in noise. In the critical conditions, a visual motion distractor moving either leftward or rightward was presented together with the auditory motion. The results demonstrated a significant decrease in sensitivity to the direction of the auditory targets in the crossmodal conditions as compared to the unimodal baseline conditions that was independent of the relative direction of the visual distractor. In addition, we also observed significant shifts in response criterion, which were dependent on the relative direction of the distractor apparent motion. These results support the view that the perceptual and decisional components involved in audiovisual interactions in motion processing can coexist but are largely independent of one another.

© 2006 Elsevier B.V. All rights reserved.

Keywords: Crossmodal interactions; Motion perception; Apparent motion; Audition; Vision

1. Introduction

Recent research has revealed substantial crossmodal links in the processing of motion information (Soto-Faraco, Kingstone, & Spence, 2003). Multisensory interactions in motion perception have often been addressed using intersensory conflict situations (Welch & Warren, 1980) where participants judge a particular feature of motion in one sensory modality (e.g., sound direction) while trying to ignore motion information presented in another modality (e.g., vision). Given that the irrelevant distractors can potentially affect perceptual (Soto-Faraco, Spence, & Kingstone, 2005) as well as decisional (post-perceptual) mechanisms (Wuerger, Hofbauer, & Meyer, 2003), the level of processing at which these interactions take place remains somewhat controversial (Bertelson & de Gelder, 2004; Soto-Faraco et al., 2003).

Dissociating perceptual from post-perceptual processes is critical to many areas of crossmodal research (Aschersleben, Bachmann, & Müsseler, 1999; Bertelson & de Gelder, 2004; de Gelder & Bertelson, 2003). Perceptual processes affect the combination of multisensory cues prior to response selection/execution, whereas post-perceptual processes influence general response selection and/or execution mechanisms instead.
Several studies addressing audiovisual interactions in motion processing have reported post-perceptual influences in the absence of any perceptual effects, supporting the claim that motion cues are processed independently in each sensory modality at a perceptual level, with any interactions occurring only at decisional stages (Alais & Burr, 2004; Meyer & Wuerger, 2001; Wuerger et al., 2003). By contrast, other findings support the notion that auditory and visual motion information can interact at a perceptual level (Kitagawa & Ichihara, 2002; Mateeff, Hohnsbein, & Noack, 1985; Soto-Faraco et al., 2005; Vroomen & de Gelder, 2003). These latter studies have used methodologies, such as psychophysical staircases (Soto-Faraco et al., 2005) or adaptation after-effects (Kitagawa & Ichihara, 2002; Vroomen & de Gelder, 2003), that are designed to minimize potential cognitive and/or response biases that affect decisional stages of processing. Overall, this pattern of results suggests that both perceptual and post-perceptual influences might play a significant role in explaining the interactions between auditory and visual motion.

We investigated the relationship between these two components within the same experimental paradigm by using signal detection theory (SDT; Macmillan & Creelman, 1991). Participants had to detect auditory apparent motion streams moving in a pre-defined (target) direction (left or right). In one condition, auditory streams (that could move in the target or the non-target direction) were combined with visual distractor motion that always moved in the direction of the pre-defined target (target-compatible block). In the other condition, the visual distractors always moved in the direction opposite to the pre-defined target auditory motion (target-incompatible block).
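The structure of the two crossmodal blocks just described can be made concrete with a short sketch. It is only an illustration under stated assumptions: the function name `make_crossmodal_block`, the dictionary keys, the default of 60 trials per block (taken from the Methods of Experiment 1) and the random ordering of trials are ours and are not specified at this point in the article.

```python
import random

OPPOSITE = {"left": "right", "right": "left"}


def make_crossmodal_block(target_direction, compatible, n_trials=60, seed=None):
    """Build one crossmodal block as a list of trial dictionaries.

    Hypothetical helper: the visual distractor direction is fixed for the
    whole block, while the auditory stream moves in the target direction on
    half of the trials. Shuffling the trial order is an assumption.
    """
    # Visual direction is constant within a block.
    visual = target_direction if compatible else OPPOSITE[target_direction]
    n_target = n_trials // 2
    auditory = [target_direction] * n_target
    auditory += [OPPOSITE[target_direction]] * (n_trials - n_target)
    random.Random(seed).shuffle(auditory)
    return [
        {
            "auditory": a,
            "visual": visual,
            "is_target": a == target_direction,  # sound moves in target direction
            "congruent": a == visual,            # sound and light move together
        }
        for a in auditory
    ]


if __name__ == "__main__":
    for compatible in (True, False):
        block = make_crossmodal_block("right", compatible, seed=0)
        n_congruent = sum(trial["congruent"] for trial in block)
        print(compatible, n_congruent, len(block) - n_congruent)
```

In this sketch both block types contain the same number of congruent and incongruent trials; what differs is whether the congruent trials are the target trials (target-compatible block) or the non-target trials (target-incompatible block).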
Both conditions were physically equivalent (with directionally congruent and incongruent audiovisual motion displays being equiprobable), the only difference being that the direction of the visual distractor was either compatible or incompatible with the pre-defined auditory target direction (see Fig. 1). Participants also performed a unimodal block where they detected the direction of sound streams in the absence of visual distractors.

If visual motion can capture the perceived direction of auditory motion (Soto-Faraco et al., 2005), then we should find considerably lower sensitivity in crossmodal blocks than in the unimodal baseline. That is, participants should find it more difficult to discern the direction of sounds because of the influence exerted by the visual distractors, which always move in one direction. Since target-compatible and target-incompatible blocks contain physically equivalent trials, they should be no different in terms of participants' sensitivity. Any influence of the visual distractor on participants' decisions about sound direction should manifest itself as a change in the response criterion, which would shift as a function of the compatibility of the direction of visual motion.

To ensure that participants in Experiment 1 performed a pure detection task (as opposed to a discrimination task), we used a go/no-go procedure. In Experiment 2, we used a two-alternative forced-choice (2AFC) procedure to assess the influence of the task at hand on participants' response criterion, as the processes involved in the decision and response execution have sometimes been shown to differ between these two tasks (Chmiel, 1989; Perea, Rosa, & Gómez, 2002).

Fig. 1. Example of the type of audiovisual (directionally congruent and incongruent) trials present in the two crossmodal blocks (target-compatible and target-incompatible). In the example shown here, the direction of target auditory motion was toward the right.
Note that when participants were instructed to detect leftward auditory motion, the auditory and visual streams moved in opposite directions to those depicted in the figure. AM and VM refer to the direction of auditory and visual motion, respectively.

2. Experiment 1

2.1. Methods

2.1.1. Participants

Forty-eight participants (34 women, 19–40 years) took part. All reported normal hearing and normal or corrected-to-normal vision.

2.1.2. Apparatus and stimuli

Four loudspeakers (10 cm in diameter) positioned at eye-level were used to present the auditory stimuli. The visual stimuli consisted of the illumination of four orange LEDs (1 cm in diameter), one situated directly above each loudspeaker (see Fig. 2). The apparent motion streams consisted of four events (white noise bursts at 65 dB[A] for the sounds, and LED flashes for the lights) sequentially presented (50 ms on, 50 ms off) from each of four consecutive spatial locations from left-to-right or vice versa. An auditory mask consisting of four 50 ms white noise bursts (65 dB[A], 50 ms off time) originating from the centre (elicited by simultaneously presenting sounds to the loudspeakers located on either side) before and after the auditory stimulus was used to lower performance.

Fig. 2. Illustration of the experimental set-up used in the present study.

2.1.3. Procedure

Participants sat in a dark room and were asked to detect either leftward or rightward auditory apparent motion by pressing a key on a keyboard, and to withhold their response when the sound moved in the non-target direction (target direction was manipulated across participants). The sounds moved in the target direction on 50% of trials and in the opposite direction on the remaining trials. The next trial commenced 1500 ms after response or after 3500 ms on trials where no response was made.

The session began with a practice block in which 12 auditory streams were presented in the absence of visual distractors (this was repeated if performance was below 83%). Then a threshold block was run to establish the mask-target interval (MTI) at which the participant's performance without visual distractors fell between 65% and 85% (using the method of limits). The MTI was continually adjusted according to the performance in the last 10 trials throughout the rest of the experiment. Next, in the baseline block participants completed 60 trials of auditory streams (50% in the target direction) presented in the absence of visual distractors. Finally, two crossmodal blocks (order counterbalanced) were run. Each consisted of 60 trials (50% in the target direction) in which a visual apparent motion stream was presented in synchrony with the auditory stream, and in a fixed direction. In the “target-compatible” block, the visual distractor direction matched that of the pre-defined target auditory motion (if the task was to detect leftward auditory motion, the visual motion always moved toward the left). In the “target-incompatible” block, the direction of the visual distractor was always opposite to that of the pre-defined target auditory motion. Note that crossmodal blocks contained 50% audiovisually congruent and 50% audiovisually incongruent trials, regardless of the auditory target direction to which participants were instructed to respond (see Fig. 1) and the compatibility of the visual distractor with this target direction.

2.2. Results

Sensitivity (d′) and criterion (c) were assessed in the unimodal baseline, and in the target-compatible and target-incompatible crossmodal blocks for each participant based on their hit and false-alarm rate. Repeated-measures analyses of variance (ANOVA) revealed no significant differences in either d′ or c (both Fs < 1) as a function of absolute target direction (left or right), thus the data were pooled across this variable in subsequent analyses.
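For readers unfamiliar with these indices, the standard equal-variance SDT definitions (Macmillan & Creelman, 1991) are d′ = z(H) − z(F) and c = −[z(H) + z(F)]/2, where H and F are the hit and false-alarm rates and z is the inverse of the standard normal distribution function. The sketch below shows one way to compute them per participant and block. The function name `sdt_indices` and the log-linear correction for extreme rates are our assumptions; the article does not state which correction, if any, was applied.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, SD 1)


def sdt_indices(hits, misses, false_alarms, correct_rejections):
    """Return (d_prime, criterion) for one participant in one block.

    Assumption: 0.5 is added to every cell (log-linear correction) so that
    hit and false-alarm rates of exactly 0 or 1 do not give infinite z scores.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1.0)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1.0)
    z_hit = _STD_NORMAL.inv_cdf(hit_rate)
    z_fa = _STD_NORMAL.inv_cdf(fa_rate)
    d_prime = z_hit - z_fa               # sensitivity
    criterion = -0.5 * (z_hit + z_fa)    # negative values: bias toward responding
    return d_prime, criterion


if __name__ == "__main__":
    # Hypothetical counts for a 60-trial block (30 target, 30 non-target trials).
    print(sdt_indices(hits=27, misses=3, false_alarms=12, correct_rejections=18))
    print(sdt_indices(hits=18, misses=12, false_alarms=3, correct_rejections=27))
```

The two hypothetical participants in the usage example have the same sensitivity but criteria of opposite sign, which is the kind of dissociation between d′ and c that the analyses reported here rely on.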
d′ was significantly higher in the baseline unimodal block (M = 1.76) than in either crossmodal block (target-compatible, M = 1.24, |t|(47) = 4.10, p < .001; target-incompatible, M = 1.36, |t|(47) = 2.95, p = .004). The difference in d′ between the two crossmodal blocks was not significant, |t| < 1. For the criterion data, in the baseline condition c was virtually zero (there was no preference toward responding or not), whereas c was negative in the target-compatible block (M = −0.24, revealing a bias toward responding) and positive in the target-incompatible block (M = 0.22; people tended not to respond). The differences in c between the baseline block and the two crossmodal blocks, and the difference between the two crossmodal blocks themselves, were all significant (all ps < .01; see Fig. 3).

Fig. 3. (a) Mean d′ (+SE) as a function of the experiment (1 or 2) and condition (baseline, target-compatible, and target-incompatible). (b) Mean c (+SE) as a function of the experiment and condition.

The results of Experiment 1 revealed that visual motion information had a significant influence on the perceived direction of auditory motion (i.e., on the ability of participants to distinguish a leftward from a rightward moving sound). This decrease in d′ was independent of the compatibility effect arising between the direction of the to-be-detected auditory target and the direction of the visual distractor. However, response criterion shifted as a function of the compatibility between the target direction and the distractor direction. This provides a dissociation between the perceptual and decisional components responsible for audiovisual interactions in motion perception.

In Experiment 1, we used a go/no-go task to ensure that participants performed a detection task.
However, it has been argued that different response-related processes might be involved in 2AFC discrimination tasks (Perea et al., 2002), thus potentially affecting the conclusions of our study. In Experiment 2, we used a 2AFC task in which participants responded to both the target direction (“yes” response) and non-target direction (“no” response) auditory streams.

Another factor that might have affected the criterion measures in Experiment 1 was the particular auditory masking procedure used. As the MTI was adjusted continually on the basis of each participant's performance, it is possible that the difficulty of the task changed within each experimental block, perhaps eliciting shifts in response criterion and interacting with the distracting effects induced by the irrelevant visual information. In Experiment 2, constant white noise was presented instead of the auditory mask to maintain a constant level of difficulty throughout the experimental session.

3. Experiment 2

3.1. Methods

3.1.1. Participants

Twenty-eight new participants (18 women, 16–32 years) took part. The data from two participants were removed as they performed at chance levels in the baseline condition.

3.1.2. Apparatus and stimuli

Participants were instructed to press “h” or “b” on the keyboard whenever the sound appeared to move in the target or the non-target direction (respectively; response mapping reversed for half of the participants). The next trial began only after participants had responded (1500 ms intertrial interval). White noise (77 dB[A]) was presented continuously from a loudspeaker placed 20 cm behind the participant to bring performance off ceiling (performance in a pilot study with this noise level was below 80%).

3.2.
Results

d′ was significantly higher in the baseline block (M = 1.26) than in either the target-compatible, |t|(25) = 2.55, p = .02, or target-incompatible crossmodal blocks, |t|(25) = 2.32, p = .03 (M = 0.77 and 0.88, respectively), whereas the two crossmodal blocks did not differ, |t| < 1. Criterion (c) was close to zero in the unimodal baseline block (M = 0.02), negative in the target-compatible block (M = −0.33) and positive in the target-incompatible block (M = 0.29). All comparisons reached statistical significance (all ps < .001; see Fig. 3).

We performed a between-experiments analysis with experiment (1 and 2) and condition (unimodal, target-compatible, and target-incompatible) as factors. There was no interaction between condition and experiment in d′, F(1, 72) < 1, though participants performed worse in Experiment 2 (M = 0.97) than in Experiment 1 (M = 1.45), F(1, 72) = 17.51, p < .0001, indicating that the white noise mask (Experiment 2) proved more effective than the mask procedure used in Experiment 1. Critically, the main effect of condition, F(2, 72) = 13.91, p < .0001, revealed the same pattern observed in each of the individual experiments. The analysis of c only showed a significant main effect of condition, F(2, 72) = 46.33, p < .0001, again supporting the results of each individual experiment.

In Experiment 2, participants performed a 2AFC task regarding the direction of auditory apparent motion (Soto-Faraco, Lyons, Gazzaniga, Spence, & Kingstone, 2002), allowing us to analyse the data from the crossmodal blocks in terms of the directional congruency between the two modalities. d′ was higher for directionally congruent trials (M = 1.46) than for incongruent ones (M = 0.19; t(25) = 6.39, p < .001). Participants were also more sensitive on congruent trials than in the unimodal baseline (M = 1.26), although this difference was not statistically significant, t(25) = 1.09, p = .28.
These results suggest that the decrement in d′ reported in the crossmodal blocks was primarily caused by a decrease in participants' sensitivity caused by directional incongruence between visual and acoustic cues to motion, supporting our earlier prediction. The results regarding participants' sensitivity in Experiments 1 and 2 cannot be solely accounted for by an interference effect due to the presence of irrelevant visual information (which was present in both the congruent and incongruent trials). On the contrary, it appears that the perceived direction of the auditory apparent motion stream was captured by the direction of the visual apparent motion stream on incongruent trials, resulting in an overall reduction of perceptual sensitivity.

The results of Experiment 2 reinforce those of Experiment 1 in showing that the perceptual and post-perceptual influences on audiovisual interactions are independent. Importantly, there were no significant changes in criterion between the two experiments, suggesting that the same response mechanisms were invoked by participants in the 2AFC and go/no-go tasks used here (Gómez, Perea, & Ratcliff, submitted). Furthermore, Experiment 2 confirmed that the shifts in c were caused by the direct influence of the visual information and not by the particular masking procedure used in Experiment 1 (see footnote 1).

4. General discussion

These findings provide the first empirical evidence for the independent co-existence of both perceptual and post-perceptual influences on audiovisual interactions in motion processing (see footnote 2). Our results confirm the existence of interactions at a perceptual level in the processing of motion information (Soto-Faraco et al., 2005), consistent with previous data obtained using static events (Bertelson & de Gelder, 2004).
Additionally, the reported shifts in c, independent of changes in d′, confirmed that the congruence of visual motion information can also bias decisions about the direction of sounds.

Footnote 1. A further control experiment (N = 16) was run with constant white noise (as in Experiment 2) and a go/no-go task (as in Experiment 1). The results were equivalent to those of Experiments 1 and 2. Namely, we found significant differences in d′ between the unimodal baseline block (M = 2.41) and both crossmodal blocks (M = 1.47; p < .01, for target-compatible; and M = 1.62; p < .01, for target-incompatible), whereas the two crossmodal blocks did not differ (|t| < 1). The difference in c between the target-compatible and target-incompatible crossmodal blocks was significant (M = 0.10 and 0.63, respectively, p < .01). There were also significant differences in c between the unimodal baseline block (M = 0.39) and both crossmodal blocks (ps < .05).

Footnote 2. It is worth noting that audiovisual interactions in the perception of motion are asymmetrical. While vision has been shown to exert a considerable influence over the perception of moving auditory stimuli, visual motion perception is usually little influenced by audition (Soto-Faraco, Spence, & Kingstone, 2004). Therefore, one might expect that, unless the visual signal is presented at or near threshold (cf. Meyer & Wuerger, 2001), visual motion processing would not be influenced by distracting auditory information.

Previous studies of audiovisual interactions in motion perception have arrived at apparently divergent conclusions regarding the level of processing at which these interactions occur. A number of studies have suggested that auditory and visual motion information can interact prior to response selection/execution (Soto-Faraco et al., 2005), whereas others have argued against the existence of perceptual interactions in the perception of audiovisual motion (Wuerger et al., 2003).
The present study demonstrates the existence of both perceptual and post-perceptual influences on audiovisual interactions in the perception of motion within the same experimental protocol. Importantly, these influences were shown to be largely independent of one another. Meyer and Wuerger (2001) studied the effect of irrelevant auditory motion information on visual motion processing, and reported large shifts in response criterion consistent with the direction of distractors, together with a small decrement in perceptual sensitivity to visual motion in congruent audiovisual displays. This sensitivity shift is an intriguing result, given that it contrasts with other studies that have demonstrated perceptual enhancement (and not inhibition) for co-localized (and congruent) audiovisual signals (Meyer, Wuerger, Röhrbein, & Zetzsche, 2005). Indeed, as noted by the authors themselves, it is unlikely that the changes in the sensitivity index in their study were caused by interactions taking place at a perceptual level of processing (cf. Soto-Faraco & Kingstone, 2004). It seems likely that the decrement in d 0 reported in the crossmodal blocks (as compared to the unimodal baseline block) of Experiments 1 and 2 reflects the existence of a perceptual illusion whereby the perceived direction of the auditory stream was, in many trials, ‘‘captured’’ by the direction of motion of the visual stream. That is, on trials where the auditory and visual streams moved in opposite directions, observers may have perceived the sounds moving in the direction of the visual stream, thereby resulting in a reduced ability to distinguish these incongruent trials from congruent ones (when sounds actually moved in the direction of visual distractors). This interpretation is supported by phenomenological reports from early studies (Zapparoli & Reatto, 1969) as well as by recent psychophysical data (Soto-Faraco et al., 2005). 
However, the decrease in d′ reported in the crossmodal blocks might also reflect a cost of dividing attention between the visual and the acoustic information (Bonnel & Hafter, 1998). We do not favour this interpretation because, using a similar stimulus set-up to that described here, Soto-Faraco et al. (2002, Experiment 3) failed to observe any decrement in performance when the visual and auditory streams moved in orthogonal directions (i.e., the mere presence of visual motion information seemed insufficient in and of itself to influence the perception of auditory motion in the manner observed here). Moreover, a decrement in d′ was only found on incongruent audiovisual trials, while a numerical (albeit not statistically significant) increase in perceptual sensitivity was found on congruent audiovisual trials, supporting the ‘capture’ account rather than the distraction account.

Visual motion information not only affected the perceptual processing of auditory motion, but also participants' decisions regarding the direction of the auditory streams. The fact that the feature to be judged (left–right direction) was the same as the dimension in which the visual distractors moved introduced potential stimulus-response compatibility effects (Simon, 1990; see Bosbach, Prinz, & Kerzel, 2004, 2005, for examples of such effects elicited by moving stimuli). At decisional stages of information processing, the response code for the direction of motion would have been primed by the direction of the salient visual distractor stimulus, accounting for the tendency toward “yes” responses in target-compatible blocks and toward “no” responses in target-incompatible blocks.
The results reported here go beyond previous research that has investigated multisensory motion processing by demonstrating the independent co-existence of both perceptual and post-perceptual influences on performance, while at the same time providing solid evidence regarding the perceptual nature of these crossmodal interactions. Thus, the present results stand in contrast with recent studies claiming that visual and auditory motion information interact only at decisional stages (Alais & Burr, 2004; Meyer & Wuerger, 2001; Wuerger et al., 2003). We believe that methodological factors that might have promoted sensory segregation rather than sensory integration (see Sanabria, Soto-Faraco, Chan, & Spence, 2005, for empirical evidence) may account for the null results reported in these earlier studies. For example, the use of random-dot kinematograms and inter-aural differences for the visual and auditory stimuli (respectively) in these previous studies means that the auditory and visual cues to motion were different in number and were not necessarily spatially co-localised. In contrast, we used discrete visual and auditory stimuli that were both spatially and temporally matched (Soto-Faraco et al., 2002, 2004), and were of an equal number (Sanabria et al., 2005; Spence, Sanabria, & Soto-Faraco, in press). As has been shown for the case of static events (Morein-Zamir, Soto-Faraco, & Kingstone, 2003; Spence & Driver, 2004; see also Welch, 1999), it would seem likely that in order to find the most pronounced crossmodal interactions at early levels of processing, auditory and visual inputs need to be spatiotemporally aligned and matched in number as far as possible.

Acknowledgements

This study was supported by a Network Grant from the McDonnell-Pew Centre for Cognitive Neuroscience in Oxford to Salvador Soto-Faraco and Charles Spence.

References

Alais, D., & Burr, D. (2004). No direction-specific bimodal facilitation for audiovisual motion detection.
Cognitive Brain Research, 19, 185–194.
Aschersleben, G., Bachmann, T., & Müsseler, J. (Eds.). (1999). Cognitive contributions to the perception of spatial and temporal events. Amsterdam: Elsevier Science.
Bertelson, P., & de Gelder, B. (2004). The psychology of multimodal perception. In C. Spence & J. Driver (Eds.), Crossmodal space and crossmodal attention (pp. 141–179). Oxford: Oxford University Press.
Bonnel, A., & Hafter, E. (1998). Divided attention between simultaneous auditory and visual signals. Perception & Psychophysics, 60, 179–190.
Bosbach, S., Prinz, W., & Kerzel, D. (2004). A Simon effect with stationary moving stimuli. Journal of Experimental Psychology: Human Perception and Performance, 30, 39–55.
Bosbach, S., Prinz, W., & Kerzel, D. (2005). Is direction position? Position and direction-based correspondence effects in tasks with moving stimuli. Quarterly Journal of Experimental Psychology, 58A, 467–506.
Chmiel, N. (1989). Response effect in the perception of conjunctions of colour and form. Psychological Research, 51, 117–122.
de Gelder, B., & Bertelson, P. (2003). Multisensory integration, perception, and ecological validity. Trends in Cognitive Sciences, 7, 460–467.
Gómez, P., Perea, M., & Ratcliff, R. (submitted for publication). A model of the go/no-go lexical decision task. Journal of Experimental Psychology: General.
Kitagawa, N., & Ichihara, S. (2002). Hearing visual motion in depth. Nature, 416, 172–174.
Macmillan, N. A., & Creelman, C. D. (1991). Detection theory: A user's guide. Cambridge: Cambridge University Press.
Mateeff, S., Hohnsbein, J., & Noack, T. (1985). Dynamic visual capture: Apparent auditory motion induced by a moving visual target. Perception, 14, 721–727.
Meyer, G. F., & Wuerger, S. M. (2001). Cross-modal integration of auditory and visual motion signals. Neuroreport, 12, 2557–2560.
Meyer, G. F., Wuerger, S. M., Röhrbein, F., & Zetzsche, C. (2005).
Low-level integration of auditory and visual motion signals requires spatial co-localisation. Experimental Brain Research, 166, 538–547.
Morein-Zamir, S., Soto-Faraco, S., & Kingstone, A. (2003). Auditory capture of vision: Examining temporal ventriloquism. Cognitive Brain Research, 17, 154–163.
Perea, M., Rosa, E., & Gómez, C. (2002). Is the go/no-go lexical decision task an alternative to the yes/no lexical decision task? Memory & Cognition, 30, 34–45.
Sanabria, D., Soto-Faraco, S., Chan, J., & Spence, C. (2005). Intramodal perceptual grouping modulates multisensory integration: Evidence from the crossmodal dynamic capture task. Neuroscience Letters, 377, 59–64.
Simon, J. R. (1990). The effects of an irrelevant directional cue on human information processing. In R. W. Proctor & T. G. Reeve (Eds.), Stimulus-response compatibility (pp. 31–86). Amsterdam: Elsevier Science.
Soto-Faraco, S., & Kingstone, A. (2004). Multisensory integration of dynamic information. In G. A. Calvert, C. Spence, & B. E. Stein (Eds.), The handbook of multisensory processes (pp. 49–68). Cambridge, MA: MIT Press.
Soto-Faraco, S., Kingstone, A., & Spence, C. (2003). Multisensory contributions to the perception of motion. Neuropsychologia, 41, 1847–1862.
Soto-Faraco, S., Lyons, J., Gazzaniga, M., Spence, C., & Kingstone, A. (2002). The ventriloquist in motion: Illusory capture of dynamic information across sensory modalities. Cognitive Brain Research, 14, 139–146.
Soto-Faraco, S., Spence, C., & Kingstone, A. (2004). Cross-modal dynamic capture: Congruency effects in the perception of motion across sensory modalities. Journal of Experimental Psychology: Human Perception and Performance, 30, 330–345.
Soto-Faraco, S., Spence, C., & Kingstone, A. (2005). Assessing automaticity in the audio-visual integration of motion. Acta Psychologica, 118, 71–92.
Spence, C., & Driver, J. (Eds.). (2004). Crossmodal space and crossmodal attention. Oxford: Oxford University Press.
Spence, C., Sanabria, D., & Soto-Faraco, S. (in press). Intersensory Gestalten and crossmodal scene perception. In K. Noguchi (Ed.). The psychology of beauty and Kansei: New horizon of Gestalt perception (to appear).
Vroomen, J., & de Gelder, B. (2003). Visual motion influences the contingent auditory motion aftereffect. Psychological Science, 14, 357–361.
Welch, R. B. (1999). Meaning, attention, and the “unity assumption” in the intersensory bias of spatial and temporal perceptions. In G. Aschersleben, T. Bachmann, & J. Müsseler (Eds.), Cognitive contributions to the perception of spatial and temporal events (pp. 371–387). Amsterdam: Elsevier Science.
Welch, R. B., & Warren, D. H. (1980). Immediate perceptual response to intersensory discrepancy. Psychological Bulletin, 88, 638–667.
Wuerger, S. M., Hofbauer, M., & Meyer, G. F. (2003). The integration of auditory and visual motion signals at threshold. Perception & Psychophysics, 65, 1188–1196.
Zapparoli, G. C., & Reatto, L. L. (1969). The apparent movement between visual and acoustic stimulus and the problem of intermodal relations. Acta Psychologica, 29, 256–267.