Intermodal Perception, Development of
Intermediate article
Lorraine E Bahrick, Florida International University, Miami, Florida, USA

CONTENTS
Introduction
Integration versus differentiation
Amodal invariant relations
Auditory–visual correspondence
Bimodal perception of speech
Visual–tactile correspondence
Visual–motor correspondence and the self
Neural bases of intermodal perception
Conclusion

Intermodal perception is the perception of unitary objects and events through spatially and temporally coordinated stimulation from multiple sense modalities. Research suggests that the senses are united in early infancy, fostering the rapid development of intermodal perception.

INTRODUCTION

Intermodal perception is the perception of an object or event that makes information available to two or more sensory systems simultaneously. Most objects and events are multimodal in that they can be experienced through multiple sense modalities. For example, a person talking, a fire, or a bouncing ball can all be seen as well as heard and felt. Intermodal perception is thus one of the most fundamental human capabilities and forms the basis for most of what we perceive, learn, and remember.

One of the questions developmental psychologists have asked is how and when the child comes to perceive multimodal events as single, unitary events, in the way adults do. For example, without prior experience with objects and events, how does the infant learn that certain patterns of auditory and visual stimulation, such as the sight of the mother's face and the sound of her voice, belong together and constitute a unitary event, whereas other concurrent patterns of sensory stimulation are unrelated? How does the child acquire intermodal knowledge such that the sound of footsteps in the hallway will elicit the expectation of seeing a person in the doorway?

INTEGRATION VERSUS DIFFERENTIATION

Researchers have discovered that intermodal perception develops rapidly during infancy. Infants are intrinsically motivated to pick up new information. Some researchers (e.g., Piaget, 1954) have characterized the development of intermodal perception as a process of integration. According to this view, the senses are separate at birth and the infant must gradually learn to put together or 'integrate' stimulation from the different sense modalities in order to perceive a unitary multimodal event. This 'integration' may occur through associating concurrent information across different modalities. Thus, before integration takes place, infants would perceive only unrelated streams of light, sound, or tactile impressions.

A contrasting position is the 'differentiation' view of development (e.g., Gibson, 1969). According to this view, the senses are unified at birth, and perceptual development is characterized by a progressive process of 'differentiation' of increasingly finer levels of stimulation. Thus, in early infancy, information from the different senses must be gradually separated from the global, undifferentiated perceptual array. From this perspective, intermodal perception of some kinds of information is possible at birth, and infants continue to show perceptual learning of more complex multimodal relations throughout infancy and early childhood. Recent evidence has demonstrated that young infants are adept at perceiving a wide array of multimodal objects and events, and that they do so by detecting information that is common, or invariant, across the senses.
This body of research has weakened the integration position, especially because intermodal abilities have been discovered in very young infants who have had little opportunity to learn to associate or integrate information across the senses. Much infant research supports the differentiation view, particularly the large body of research on young infants' detection of 'amodal invariants', suggesting that the senses are unified in early infancy.

AMODAL INVARIANT RELATIONS

Amodal information is information that is not specific to a particular sense modality but is completely redundant, or invariant, across two or more senses. For example, the sights and sounds of hands clapping share a synchrony relation, a common tempo of action, and a common rhythm. The same rhythm and tempo can be detected by watching or by hearing the hands clap. Thus, synchrony, rhythm, and tempo are 'amodal invariant relations' in that this information can be perceived across different sense modalities. Most amodal information characterizes how events are distributed in space and time, two of the most fundamental dimensions of our experience.

According to the differentiation view, detection of amodal relations focuses attention on meaningful, unitary events and buffers infants from making incongruent, inappropriate associations (e.g., Bahrick and Pickens, 1994). For example, if the infant detects synchrony, shared rhythm, and common tempo between the sight of a person's moving face and the sound of the person's voice, the infant would necessarily be attending to a unitary event: the person talking. In this way, unrelated sounds and movements would not be merged with the event. In support of the differentiation view, research has found that young infants detect a wide array of amodal invariant relations in multimodal events.

In contrast to amodal relations, information can also be nonredundant and arbitrarily related across the sense modalities (e.g., speech sounds and the objects they refer to; particular faces and voices). Information such as color, pattern, timbre, or pitch is 'modality-specific' and can be perceived only through a single sense modality. Research suggests that infants detect amodal relations (such as temporal synchrony) developmentally prior to arbitrary relations, and that detection of amodal relations can then guide and constrain learning about arbitrary relations.

AUDITORY–VISUAL CORRESPONDENCE

Intermodal perception of auditory–visual relations is often assessed with an intermodal preference method. In this method, infants view two filmed events side by side while the soundtrack to one of them plays from a central speaker. If the infant detects the intermodal relations, he or she is expected to look longer at the film that matches the soundtrack (a schematic example of how such data might be scored appears below). Research using these and similar procedures has demonstrated that young infants display a wide array of intersensory abilities in the area of audiovisual perception (see Gibson and Pick, 2000; Lewkowicz, 2000; Lewkowicz and Lickliter, 1994).

Neonates turn their eyes in the direction of a sound, demonstrating a basic coordination of auditory and visual space. In the first month of life, infants detect the temporal synchrony between the sights and sounds of an object striking a surface, and the spatial location common to the sights and sounds of a moving object.
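The logic of the preference method can be made concrete with a small scoring sketch. The Python fragment below is purely illustrative: the function name, the looking times, and the number of infants are hypothetical, not data from the studies reviewed here.

# Illustrative scoring of an intermodal preference trial.
# All names and numbers are hypothetical.

def preference_proportion(matched_s, mismatched_s):
    """Proportion of total looking time directed at the film that
    matches the soundtrack; chance level is 0.5."""
    total = matched_s + mismatched_s
    return matched_s / total if total > 0 else 0.5

# Hypothetical looking times in seconds (matched film, mismatched film)
# for six infants:
trials = [(14.2, 8.1), (11.5, 9.8), (16.0, 7.3),
          (9.9, 10.4), (13.1, 6.7), (12.8, 8.9)]

scores = [preference_proportion(m, mm) for m, mm in trials]
print(f"mean proportion looking to matched film: {sum(scores) / len(scores):.2f}")

A group mean reliably above the 0.5 chance level is taken as evidence that the intermodal relation was detected; actual studies also counterbalance the side of the matching film and apply inferential statistics.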
By three to five months, infants can match films and soundtracks of moving objects on the basis of their substance (rigid versus elastic) or their composition (single versus multiple objects), as well as the rhythm and tempo of their impact sounds. Further, by four to six months, infants can match faces and voices on the basis of affective expression, including happy, sad, neutral, and angry expressions (Walker-Andrews, 1997). They can also match faces and voices on the basis of the age (adult versus child) and gender of the speaker. All these relations are amodal and invariant across vision and audition.

BIMODAL PERCEPTION OF SPEECH

The perception of speech, an auditory–visual event, has traditionally been studied as a unimodal, auditory event. However, speech is produced by a speaker who can be heard and seen, and who typically gestures as well. It turns out that the multimodal nature of speech is salient to infants and facilitates its perception (e.g., Meltzoff and Kuhl, 1994). By at least two months of age, infants are sensitive to voice–lip synchrony during speech. By four months, infants can detect the voice–lip correspondence between speech sounds such as 'a' and 'i'. When one of these sounds is played in synchrony with two films, shown side by side, of a speaker's face intoning each sound, infants look more at the face with the matching lip movements.

The McGurk effect, an auditory–visual illusion, also illustrates how infants and adults merge information for speech across the senses. When we view the face of a person articulating one speech sound, such as 'ga', while hearing a different speech sound, such as 'ba', we perceive a third sound, 'da', a blend of the two. Infants show evidence of this effect in the first half-year of life. Visual input thus has significant auditory consequences.

Amodal information during speech is also important for learning the arbitrary relation between speech sounds and the objects they denote (Gogate et al., 2001). By 14 months of age, infants can learn to pair a speech sound with an object during a brief familiarization. However, if amodal synchrony unites the sounds and object movements, for example when the object is shown and named simultaneously, infants can learn the relation as early as seven months of age. Adults even match their teaching style to the infant's needs: they use more synchronous movement with labeling, highlighting object–sound relations, when first teaching the names of new objects to their young infants, and their use of synchrony decreases as infants become more linguistically competent. Further evidence for the importance of visual information in perceiving speech lies in the success of teaching speech to deaf individuals using a visual depiction of the lip and tongue movements involved in different speech sounds.

VISUAL–TACTILE CORRESPONDENCE

Amodal invariant relations also unite perception across vision and touch. Information about shape, texture, substance, and size is invariant across visual and tactile stimulation (Rose and Ruff, 1987). One method for investigating perception of visual–tactile correspondence is the cross-modal transfer method: an object is presented to one sense modality alone, and a preference test is then given in another sense modality to determine whether the information transfers across modalities.
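As a companion to the preference-method sketch above, the fragment below illustrates one way a cross-modal transfer test might be scored. Again, the function name and looking times are hypothetical, not taken from the studies discussed.

# Illustrative scoring of a tactile-to-visual transfer test.
# All names and numbers are hypothetical.

def transfer_index(familiar_s, novel_s):
    """Signed departure from chance (0.5) of looking at the visual
    replica of the tactually familiarized object. A reliable departure
    in either direction implies that object information transferred
    from touch to vision."""
    total = familiar_s + novel_s
    return familiar_s / total - 0.5

print(transfer_index(12.4, 7.6))   # ~0.12 -> familiarity preference
print(transfer_index(6.0, 14.0))   # ~-0.20 -> novelty preference

Because younger infants often show a familiarity preference and older infants a novelty preference, the reliability of the departure from chance matters more than its direction.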
Using this method, research has shown that by one month of age infants can perceive the correspondence between an object they have experienced tactually (orally, on a pacifier) and a visual replica of the object: infants looked longer at the object with the shape and texture they had previously explored orally. Infants are also able to transfer information about the substance of an object (rigid versus deforming) between touch and sight.

Evidence also shows that infants can transfer information obtained through manual exploration to vision, and this ability develops across the first year. One factor determining the extent to which manual information is perceived is whether exploration is active or passive. Tactile exploration itself develops over the first year: young infants tend simply to grasp objects, whereas older infants become more adept at obtaining tactile feedback by moving the hand relative to the object's surface. By four months, infants can perceive whether two parts of an object are connected or separate from the type of motion they produce during haptic exploration. By six months, infants can visually recognize the shape of an object they have manually explored, as long as the exploration was active.

VISUAL–MOTOR CORRESPONDENCE AND THE SELF

Infants are also able to perceive information specifying the self by detecting amodal invariant relations (Rochat, 1995). Even in the first weeks of life, infants can imitate facial expressions. To do this, they must relate the visual appearance of the adult's facial expression to their own production of the expression. This is most probably guided by proprioception, the information about self-movement provided by feedback from the muscles, joints, and vestibular system. Facial imitation thus reveals early intermodal coordination between visual information and motor behavior, and this coordination continues to develop over the first year (Meltzoff and Moore, 1995).

Infants also show evidence of self-perception by detecting amodal invariant relations in a procedure where they view their own body moving live on a video display (Bahrick, 1995). By three to five months, infants can distinguish a live video of their own legs kicking from a video of another infant's legs kicking, from a pre-recorded video of their own legs, and from a spatially incongruent video of their own legs. They do this by detecting the amodal temporal synchrony and spatial relations common to the visual display of their motion and the proprioceptive experience of their motion.

Infants also demonstrate 'visually guided reaching', which develops rapidly during the first year: they continuously adjust their reaching and manual behavior as a function of visual input about the size, shape, and position of objects. Infants are even able to contact a moving object by aiming their reach ahead of it, taking into account the speed and direction of its movement as well as that of their own arm. Later, infants show an ability to adapt their crawling and exploratory behavior as a function of visual information about the slant and solidity of a surface. These examples illustrate a close coupling between vision and motor behavior, and an understanding of the self in relation to objects (Gibson and Pick, 2000).
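The predictive component of reaching for a moving object can be appreciated with a small worked example. The toy model below is an illustration of the underlying geometry only, not a model of the infant's actual computation: it assumes the hand and object are points moving at constant speeds and solves for the earliest contact point.

import math

# Toy model of the geometry of predictive reaching. Hypothetical
# throughout: hand and object are idealized as constant-speed points.

def interception(p0, v, hand_speed):
    """Object starts at p0 (metres; the hand is at the origin) and moves
    with constant velocity v (m/s); the hand travels in a straight line
    at hand_speed (m/s). Returns (time, contact point) for the earliest
    possible contact, or None if the object cannot be reached."""
    # Contact requires |p0 + v*t| = hand_speed * t, a quadratic in t.
    a = v[0]**2 + v[1]**2 - hand_speed**2
    b = 2 * (p0[0]*v[0] + p0[1]*v[1])
    c = p0[0]**2 + p0[1]**2
    if abs(a) < 1e-12:                       # hand and object equally fast
        candidates = [-c / b] if abs(b) > 1e-12 else []
    else:
        disc = b*b - 4*a*c
        if disc < 0:
            return None
        root = math.sqrt(disc)
        candidates = [(-b - root) / (2*a), (-b + root) / (2*a)]
    times = [t for t in candidates if t > 0]
    if not times:
        return None
    t = min(times)
    return t, (p0[0] + v[0]*t, p0[1] + v[1]*t)

# A toy rolling at 0.3 m/s, starting 0.4 m to the left and 0.2 m ahead,
# with the hand moving at 0.5 m/s:
print(interception((-0.4, 0.2), (0.3, 0.0), 0.5))

The computed contact point lies ahead of the object's current position, which is precisely the 'aiming ahead' that the behavioral studies describe.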
NEURAL BASES OF INTERMODAL PERCEPTION

Behavioral research on the rapid development of intermodal perception during infancy is consistent with research findings from the neurosciences. Some areas of the brain (cortex, superior colliculus) contain 'multimodal neurons' that respond to inputs from multiple sense modalities, providing a biological basis for the early integration of the senses (e.g., Stein and Meredith, 1993). Further, some cells of the superior colliculus (a structure involved in attention and orienting) are activated much more strongly by simultaneous auditory and visual inputs than by either auditory or visual input alone. Other cells are modality-specific but can have receptive fields that become spatially coordinated across the sense modalities as a result of experience with multimodal events; thus, auditory and visual input from the same spatial location can be related. Neurophysiological findings suggest that if input to one modality is modified, the receptive fields of cells in the superior colliculus can compensate and realign with those of the other modality to maintain a coherent multimodal spatial mapping. The early plasticity of the brain, its sensitivity to multimodal inputs, and its reliance on experience in the multimodal world to guide neuronal development appear well tailored to the behavioral findings on the early development of intermodal perception.

CONCLUSION

Infants demonstrate a diverse array of intermodal abilities. These abilities illustrate the close connection between the senses during early development and the rapid growth in intersensory abilities across the first year of life. Development appears to be guided by the detection of amodal, invariant relations, and this promotes accurate and unitary perception of multimodal events.

References

Bahrick LE (1995) Intermodal origins of self-perception. In: Rochat P (ed.) The Self in Infancy: Theory and Research, pp. 349–373. New York, NY: Elsevier.

Bahrick LE and Pickens JN (1994) Amodal relations: the basis for intermodal perception and learning. In: Lewkowicz D and Lickliter R (eds) The Development of Intersensory Perception: Comparative Perspectives, pp. 205–233. Hillsdale, NJ: Lawrence Erlbaum.

Gibson EJ (1969) Principles of Perceptual Learning and Development. New York, NY: Appleton-Century-Crofts.

Gibson EJ and Pick AD (2000) An Ecological Approach to Perceptual Learning and Development. New York, NY: Oxford University Press.

Gogate LJ, Walker-Andrews AS and Bahrick LE (2001) The intersensory origins of word comprehension: an ecological-dynamic systems view. Developmental Science 4: 1–37.

Lewkowicz DJ (2000) The development of intersensory temporal perception: an epigenetic systems/limitations view. Psychological Bulletin 126: 281–308.

Lewkowicz DJ and Lickliter R (eds) (1994) The Development of Intersensory Perception: Comparative Perspectives. Hillsdale, NJ: Lawrence Erlbaum.

Meltzoff AN and Kuhl PK (1994) Faces and speech: intermodal processing of biologically relevant signals in infants and adults. In: Lewkowicz D and Lickliter R (eds) The Development of Intersensory Perception: Comparative Perspectives, pp. 335–369. Hillsdale, NJ: Lawrence Erlbaum.

Meltzoff AN and Moore MK (1995) A theory of the role of imitation in the emergence of self. In: Rochat P (ed.) The Self in Infancy: Theory and Research, pp. 73–93. New York, NY: Elsevier.

Piaget J (1954) The Construction of Reality in the Child. New York, NY: Basic Books.

Rochat P (ed.) (1995) The Self in Infancy: Theory and Research. New York, NY: Elsevier.
Rose SA and Ruff HA (1987) Cross-modal abilities in human infants. In: Osofsky J (ed.) Handbook of Infant Development, 2nd edn, pp. 338–362. New York, NY: John Wiley.

Stein BE and Meredith MA (1993) The Merging of the Senses. Cambridge, MA: MIT Press.

Walker-Andrews AS (1997) Infants' perception of expressive behaviors: differentiation of multimodal information. Psychological Bulletin 121: 437–456.

Further Reading

Kellman PJ and Arterberry ME (1998) The Cradle of Knowledge: Development of Perception in Infancy. Cambridge, MA: MIT Press.

Lickliter R and Bahrick LE (2000) The development of infant intersensory perception: advantages of a comparative convergent-operations approach. Psychological Bulletin 126: 260–280.

Massaro DW (1998) Perceiving Talking Faces: From Speech Perception to a Behavioral Principle. Cambridge, MA: MIT Press.

Thelen E and Smith LB (1994) A Dynamic Systems Approach to the Development of Cognition and Action. Cambridge, MA: MIT Press.