Chapter 12 – Auditory Localization and Organization

Localization of sounds

Localizing visual stimuli in the vertical and horizontal dimensions – too easy! Visual stimuli at different places activate different locations on the retina, and each place on the retina is connected to a different area of the cortex. So it's easy to see how the brain can locate different visual stimuli – by noting where the activity occurs on the retina, the LGN, and area V1. Localizing in the third dimension – depth – is more difficult. It requires "neural deduction" and a comparison of the images in the two eyes – binocular disparity.

The issues involved in localizing sound are like those involved in using binocular disparity to perceive depth or distance. A sound originating from the left activates ALL the receptors on both the left cochlea and the right cochlea. Similarly, a sound originating from the right activates ALL the receptors on both cochleas. So perceiving location requires a comparison of the input to the two ears.

Some terms . . .
Azimuth – location on a horizontal circle around the listener. 0° is immediately in front of us, 90° is directly to one side (equivalently, 90° left to 90° right), and 180° is directly behind us.
Elevation – location on a vertical circle around the listener – above or below the head.
(Original figure: a circle around the head marked 0°, 45°, 90°, 180°, and 270°.)

Cues for localization

Sound localization involves comparison of the sounds at the two ears. The primary cues are binaural cues: specific differences in the sound between the two ears.

1. Interaural Time Differences (ITDs) – G9 p 291
Also called time-of-arrival differences. A sound originating on the left will reach the left ear slightly sooner than it reaches the right ear, and vice versa for sounds originating on the right. The maximum difference in time of arrival occurs for a sound immediately to the left of the left ear or to the right of the right ear: about 600 µsec. 600 µsec = .000600 seconds – 600 millionths of a second.
Such time-of-arrival differences are by themselves enough to allow precise localization of sounds. Careful experiments in which the only cue for location was time of arrival have demonstrated that people can accurately locate sounds based on this cue alone. The threshold for ITD is about 100 µsec – sounds that differ in ITD by less than 100 µsec are perceived to be directly in front of us 50% of the time.
Interaural time differences (ITDs) are most important for the localization of low frequency sounds.
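To make the time-of-arrival geometry concrete, here is a minimal Python sketch using the classic Woodworth spherical-head approximation, ITD ≈ (a/c)(θ + sin θ). The approximation is standard, but it is not from G9, and the head radius and speed of sound below are assumed typical values, not measurements from the notes.

```python
import math

def itd_microseconds(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Woodworth spherical-head estimate of the interaural time difference
    for a source at the given azimuth (0 deg = straight ahead, 90 deg =
    directly opposite one ear). Head radius and speed of sound are assumed
    typical values."""
    theta = math.radians(min(abs(azimuth_deg), 90))
    # Path difference = arc around the head (a*theta) + straight segment (a*sin(theta))
    itd_seconds = (head_radius_m / speed_of_sound) * (theta + math.sin(theta))
    return itd_seconds * 1e6  # microseconds

for az in (0, 15, 45, 90):
    print(f"{az:3d} deg azimuth -> ITD ~ {itd_microseconds(az):5.0f} usec")
```

At 90° azimuth this gives roughly 660 µsec, in line with the ~600 µsec maximum quoted above.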
2. Interaural Level Differences (ILDs) – G9 p 291
Differences in level (intensity) between the ears. Sounds that originate on the left will tend to be more intense at the left ear; sounds that originate on the right will tend to be more intense at the right ear. Our perception corresponds to reality: all other things equal, a sound localized to the left will be more intense in the left ear, and a sound on the right will be more intense in the right ear.
But how different the intensities are at the two ears depends on the frequency of the sound. A sound originating on the left has two ways of getting to the right ear (assuming no reflective surfaces):
a. Bending around the head. Low frequency sounds bend easily around the head.
b. Going through the head. High frequency sounds don't bend, so they must go through the head, leaving the far ear in a "shadow" and reducing intensity there.

The result is the following:

Frequency     Maximum intensity difference between the ears
300 Hz        about 1 dB (barely noticeable)
4,000 Hz      5–10 dB (noticeable)
10,000 Hz     10–20 dB (quite noticeable)

Low frequency sounds easily bend around the head; high frequency sounds are blocked by it. (Original figure: ILD plotted as a function of frequency.)

So interaural level differences provide good information about the location of high frequency sounds, but little information about the location of low frequency sounds. This is one reason that some high quality sound systems have only one speaker designed for very low frequency sounds: we are less able to identify the location of such sounds, so there is no need for two in a stereo system.

Perceiving elevation – G9 p 293

The cone of confusion
ITDs and ILDs give great information about localization in the azimuth. Alas, they don't provide much information about elevation. The collection of locations for which ITD and ILD are equal is called the cone of confusion. The locations on the cone differ in elevation, so ITD and ILD are of little use in determining elevation.

Spectral cues used to judge elevation and whether a sound is in front of us or behind us
The shape of the pinna differentially attenuates different frequency components of the sound stimulus. This shaping of the sound creates information about the location of the sound called the spectral shape cue. This cue is particularly important for perceiving the elevation of the sound stimulus.
The shaping of the sound spectrum by the pinna is specific to each individual ear. This means that using the shape of the sound spectrum to perceive elevation must be learned. It also implies that if the shape of an adult's pinna were changed, that adult would have difficulty locating the elevation of sounds they could previously locate very easily.
Note that sound from above is generally more intense in the ear canal, and that sound from below lacks intensity in the 6–8 kHz frequency range relative to sound from above.

Influence of the shape of the pinna on judging elevation
Figure 12.9, G9 p 295, shows the effects of wearing an artificial pinna on the ability to localize sounds. The crossings of the blue grid are where sounds were presented; the red dots are where the sounds were reported to be located.
On the first day that observers wore an implant, their reports were horribly inaccurate in the vertical dimension, although nearly correct in the horizontal dimension. After 5 days of wearing the implant, observers' reported locations in the vertical dimension became more accurate. After 19 days of wearing the implant, observers' reports of both vertical and horizontal location were quite accurate. Immediately after removing the implant, observers' reports were still quite accurate, suggesting that they had not "lost" an ability, but simply gained a new one.
This study suggests that the timbre (the experienced frequency spectrum) of a sound is a cue to that sound's elevation.
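One way to think about the spectral shape cue is as a comparison of energy in a pinna-affected band against energy in a band the pinna leaves alone. The Python sketch below is only an illustration of that idea, keyed to the 6–8 kHz observation above: the notch depth, band edges, and broadband noise source are all assumptions, not measurements of a real ear.

```python
import numpy as np

fs = 44100                       # sample rate (Hz)
rng = np.random.default_rng(0)
noise = rng.standard_normal(fs)  # one second of broadband "source" sound

def band_energy(signal, lo, hi):
    """Energy of the signal in the [lo, hi] Hz band, computed via the FFT."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), 1 / fs)
    band = (freqs >= lo) & (freqs <= hi)
    return np.sum(np.abs(spectrum[band]) ** 2)

# Crude stand-in for pinna filtering: the "from below" version loses
# energy in the 6-8 kHz band; the "from above" version is left intact.
spectrum = np.fft.rfft(noise)
freqs = np.fft.rfftfreq(len(noise), 1 / fs)
notch = (freqs >= 6000) & (freqs <= 8000)
below_spectrum = spectrum.copy()
below_spectrum[notch] *= 0.1     # assumed ~20 dB notch, illustration only
sound_below = np.fft.irfft(below_spectrum)

ratio_above = band_energy(noise, 6000, 8000) / band_energy(noise, 1000, 3000)
ratio_below = band_energy(sound_below, 6000, 8000) / band_energy(sound_below, 1000, 3000)
print(f"6-8 kHz vs 1-3 kHz energy ratio: above ~ {ratio_above:.2f}, below ~ {ratio_below:.2f}")
```

The "above" ratio comes out near 1 and the "below" ratio near 0.01, so a listener (or neuron) comparing band energies has usable information about elevation.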
Using timbre to determine whether a sound is in front of us or behind us
The pinna attenuates the high frequency components of sounds located behind the head. This means that we can tell when a familiar sound is behind us because it sounds slightly muffled compared to the way it sounds when it's in front of us.

A study demonstrating the effect of the ear flap: Batteau (1967), as reported in L&S, p. 424.
Batteau created artificial pinnas out of plaster and put them on an artificial head, with microphones placed where the eardrums would be.
Condition 1: Recorded sounds played at different locations around the head in which the microphones had been embedded in the plaster pinnas.
Condition 2: Recorded the same sounds played at different locations around a head in which the microphones had been embedded without plaster pinnas. That is, the artificial head had no ears, just ear canals with microphones at the end of the canals.
The recordings of those sounds were then played back to human observers.
Results:
Condition 1: Participants were able to identify the locations of the sounds in front of and behind the artificial head.
Condition 2: Participants were not able to localize the sounds correctly.
Conclusion: The artificial plaster pinnas muffled the sounds played behind the head in a way that was similar to how sounds are muffled by the real pinna.

Perceiving distance (not in G9; from K&S p 296)
Familiarity – comparison of the perceived sound with memories of similar sounds at different distances.
Spectral analysis – high frequency sounds decrease in intensity over distance more than low frequency sounds. So we hear the bass guitar across the lake, but not the singer; we hear the low rumble of thunder but not the high frequency crack of the lightning strike.
Ratio of direct to reflected sound – nearer sounds have a higher ratio of direct to reflected sound than far sounds.

Possible neural circuits for ITD and ILD – G9 p 295
Study of barn owls has provided evidence for the existence of neurons that respond only to sounds presented at a particular location. The superior olivary complex contains the medial superior olive (MSO), the collection of neurons that has been identified as responding to ITDs. Neurons have been found that respond only when one ear receives sound slightly before the other – that is, to time-of-arrival differences, ITDs. (Figure from the web site of David Heeger, NYU.)

A circuit for neurons responding to interaural time differences – G9 p 297
Issue: design a neural circuit so that a neuron responds most actively when a sound arrives at the right ear before it arrives at the left ear.
(Original figure: a sound on the right strikes the two ears; a short axon runs from the left cochlear nucleus to an MSO neuron, and a long axon runs from the right cochlear nucleus to the same neuron.)
A simple way to create time-of-arrival signaling is to have a longer path from the neurons originating in one ear to the postsynaptic neuron than from the other ear. This is analogous to the motion detectors we studied a couple of chapters ago.

Sounds in front of us
If a sound strikes both L and R at the same time, L's action potential will arrive at the end of its axon first, followed by R's, because the action potential has farther to travel from R than from L. Thus the total release of neurotransmitter at the MSO neuron will be spread out over time but low in intensity, resulting in a low probability of the MSO neuron responding.

Sounds to the right of us
The sound strikes the right ear first, but the path from the right ear to the MSO is longer. So R's action potential will reach the end of its axon at about the same time as L's, causing a massive release of neurotransmitter – enough to cause the MSO neuron to fire.

Sounds to the left of us
But if a sound strikes L first and then strikes R shortly afterward, L's action potential will reach the end of its axon first and, much much later (in neural time), R's action potential will reach the end of its axon, resulting in insufficient neurotransmitter to cause the MSO neuron to respond.

Note that in such circuits, the long path is from the ear closest to the sound. (In case you need to construct such a neural circuit.)

Play ..\..\..\MDBT\P312\Ch14AuditoryPhenomena\ITD Neuron.pptx
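Here is a minimal sketch of the delay-line logic just described, with the long axon from the right ear. All delay values and the coincidence window are made-up illustrative numbers, and real MSO neurons integrate graded neurotransmitter release rather than comparing two spike times – this only shows the coincidence idea.

```python
def mso_fires(arrival_left_us, arrival_right_us,
              axon_delay_left_us=100, axon_delay_right_us=500,
              coincidence_window_us=50):
    """Return True if spikes from the two ears reach the MSO neuron within
    the coincidence window, i.e. the neuron fires. The right ear has the
    longer axon (larger delay), so this neuron fires best when the sound
    reaches the RIGHT ear first. All numbers are illustrative assumptions."""
    at_mso_left = arrival_left_us + axon_delay_left_us
    at_mso_right = arrival_right_us + axon_delay_right_us
    return abs(at_mso_left - at_mso_right) <= coincidence_window_us

# Sound on the right: reaches the right ear 400 usec before the left ear.
print(mso_fires(arrival_left_us=400, arrival_right_us=0))   # True: coincidence, fires
# Sound straight ahead: both ears at the same time.
print(mso_fires(arrival_left_us=0, arrival_right_us=0))     # False: spikes 400 usec apart
# Sound on the left: the left ear leads.
print(mso_fires(arrival_left_us=0, arrival_right_us=400))   # False: spikes 800 usec apart
```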
The Jeffress neural coincidence model – combining many of the above circuits
Figure 12.12, G9 p 297. Each neuron – 1, 2, 3, etc., except 5 – has a different length path to each ear. Neuron 5 has equal length paths from the two ears, so it will respond best to sounds from the front. Neuron 3 has a long path from the right ear and a short path from the left ear, so it will respond best to sounds that reach the right ear first.
Imagine thousands of neurons like 1 through 9 above, each with a different length path to each ear. Each of those neurons will respond best to a sound at a particular location on the azimuth.
Mike – Explain what each curve means.

Mammalian neurons
It appears that the tuning curves of neurons in mammals are not as sharp as those of owls. The gerbil curve is about twice as wide as the owl curve in Figure 12.14, p 298. It has been suggested that in mammals there are more broadly tuned neurons – some that fire over a wide range of locations to the right, and others that fire over a wide range of locations to the left. The blue curve is of a neuron that responds above its base rate to sounds from the right and below its base rate to sounds from the left; the other curve has the opposite response characteristics. Note that each of these curves is like the response curve of an opponent-process neuron in the LGN.
In this model, the location of a sound is signaled by the pattern of responses of these two types of neurons. Sound 1 is on the left, resulting in a large response of the L neuron but a small response of the R neuron. Sound 3 is on the right, resulting in a small response of the L neuron but a large response of the R neuron.

A neural circuit responding to interaural level differences
(This one is almost too easy.) Below is a simple-minded circuit to process intensity differences. (Original figure: inputs from L and R converging on a single LSO neuron – excitatory from R, inhibitory from L.)

Sounds in front of us: Neurotransmitter from L and R counteract each other, resulting in base-rate activity of the LSO neuron. So a base rate of activity signals that a sound is in front of us.
Sounds to the right: Since the sound is more intense at R, R releases more excitatory neurotransmitter than L, making the LSO neuron more active. Through its heightened activity, the LSO neuron signals "sound to the right."
Sounds to the left: Since the sound is more intense at L, L releases more inhibitory neurotransmitter than R, reducing the LSO neuron's activity. Through its lowered rate of activity, the LSO neuron signals "sound to the left."
Obviously, this single neuron by itself could not completely identify the location of a sound, but its activity, in conjunction with that of other neurons responding to other aspects of the auditory scene, would help signal sound location in the azimuth.
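The excitation/inhibition arithmetic of this circuit can be summarized in a few lines of Python. The base rate and gain below are arbitrary illustrative values, not physiological measurements.

```python
def lso_rate(level_left_db, level_right_db, base_rate=50.0, gain=2.0):
    """Toy opponent-style LSO neuron: excitation from the right ear,
    inhibition from the left. Firing rate rises above base rate when the
    right ear is more intense and falls below it when the left ear is
    more intense. Base rate and gain are assumed values."""
    ild = level_right_db - level_left_db   # interaural level difference
    return max(0.0, base_rate + gain * ild)

print(lso_rate(60, 60))  # 50.0 -> base rate: sound straight ahead
print(lso_rate(55, 65))  # 70.0 -> above base rate: sound to the right
print(lso_rate(65, 55))  # 30.0 -> below base rate: sound to the left
```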
The auditory pathway to the auditory cortex
Note that each primary auditory projection area (A1) is buried in a sulcus. An important point to take from this is that there is a LOT of processing of the auditory signal in each of these structures. A natural question is: what exactly are they doing? As Yantis, p 325, says, "The exact functions served by the different subcortical structures and the different response patterns of their neurons are the subject of considerable ongoing research and debate, but it's clear that these structures and the neural signal processing that takes place within them along the auditory pathways play a critical role in encoding, with exquisitely high fidelity, the rapidly changing stimuli that typically make up the auditory environment."

An image of the initial processing of sound that emphasizes the tonotopic maps in the various areas . . .
(Original figure, from journal.frontiersin.org: red represents low frequency sounds, blue represents high frequency sounds. AN: auditory nerve. CN: cochlear nucleus. LSO: lateral superior olive. MSO: medial superior olive. MNTB: medial nucleus of the trapezoid body – not mentioned in Y1.)
The point is that every area through which the auditory signal passes is organized tonotopically – by frequency.

The auditory cortex – Y1 p 327
The auditory cortex is located on the upper surface of the temporal lobe, tucked into the lateral sulcus – the sulcus separating the temporal lobe from the frontal and parietal lobes. The primary discovery associated with the auditory cortex is the tonotopic map: the neurons are organized by the frequency of sound to which they're most sensitive. All of the subcortical structures shown on the previous page and the cortical structures shown on this page are organized as tonotopic maps. Note that the maps reverse at the dividing line between the rostral core and the rostrotemporal core.
The region is separated into three subregions – the core, the belt, and the parabelt. Neurons in the core respond best to pure tones; neurons in the belt and parabelt respond to more complex sound stimuli. This is analogous to the relationship of visual neuron responses in V1 vs. V4, MT, and other cortical areas: the complexity of the stimuli to which neurons respond becomes greater the farther you go from the initial projection area.

What and where pathways
Auditory system neurons that carry information about sound localization send their axons to the parietal lobe. Auditory system neurons that carry information that might help in identifying what is in the external environment send their axons to the frontal lobe. The auditory pathways meet the visual "what" and "where" pathways: the "where" pathways meet in the parietal lobe (in the monkey), and the "what" pathways meet in the prefrontal cortex (again, in the monkey). This, of course, means that there are neurons that receive both visual and auditory information – in the blue areas in the figure. (Images of brain activity of a human performing either identification of sounds or localization of sounds, from Goldstein's 8th edition.)

Architectural acoustics – G9 p 302
Reverberation time – the delay and strength of echoes.
Definition: the time it takes a sound to decrease in level by 60 decibels – that is, the time it takes for the reflections of the sound to become 60 decibels less intense than the original sound.
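The notes define reverberation time but don't give a way to compute it. A standard estimate (not covered in G9) is Sabine's formula, RT60 = 0.161 · V / A, where V is the room volume in cubic meters and A is the total absorption (surface area × absorption coefficient, summed over the room's surfaces). The room dimensions and coefficients below are hypothetical.

```python
def rt60_sabine(volume_m3, surfaces):
    """Sabine's reverberation-time estimate.
    surfaces: list of (area_m2, absorption_coefficient) pairs."""
    total_absorption = sum(area * alpha for area, alpha in surfaces)
    return 0.161 * volume_m3 / total_absorption

# Hypothetical classroom, 10 m x 8 m x 3 m; coefficients are assumed values.
rt60 = rt60_sabine(
    volume_m3=240,
    surfaces=[(80, 0.30),    # acoustic-tile ceiling
              (80, 0.05),    # hard floor
              (108, 0.10)],  # walls
)
print(f"Estimated RT60: {rt60:.2f} s")  # ~1.0 s, vs the 0.4-0.6 s classroom target below
```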
Play VL 12.14
The best reverberation time is about 2 seconds for concert halls and about 1.5 seconds for opera houses; about 0.4–0.6 seconds for a classroom and about 1.0–1.5 seconds for an auditorium.

Intimacy time: the time between the arrival of the sound directly from the source and the arrival of the first reflection. Like reverberation time, but focusing on the time of the first reflection rather than the total amount of reflected sound. Best is about 20 milliseconds (.020 sec).
Bass ratio: the ratio of low frequencies to middle frequencies in the reflected sound. Best is a high ratio.
Spaciousness factor: the fraction of all the sound reaching a listener that is reflected, as opposed to direct. Best is high.
Signal-to-noise ratio: the decibel difference between the amplitude of the signal (e.g., a voice in a classroom) and the amplitude of the sounds without the signal. We need about a 10–15 dB S/N ratio, but higher is better.

Auditory scene analysis – Y1 p 341
Identifying the objects in the auditory scene – the different pieces of the auditory experience.
Example – musical scenes . . .
American Woman by The Guess Who (play from 1:11 on):
..\..\..\MDBT\P312\Ch14AuditoryPhenomena\American Woman Whole Version by the Guess Who.mp4 (taken from YouTube)
Turtle Blues by Janis Joplin – breaking glass at 2:56:
..\..\..\MDBT\P312\Ch14AuditoryPhenomena\Turtle Blues from Cheap Thrills by Janis Joplin.mp4
The problem: what we hear has multiple sources, each broadcasting its own waveform, but the sound arrives at the ear as a single waveform – a mixture of all of the original waveforms. How do we break the composite waveform into its pieces, identifying the different sources so that we hear different things simultaneously?

Factors affecting how we identify auditory "objects" – these are analogous to the Gestalt laws of grouping:
1. Location – aka spatial segregation. Sounds from the same location are more likely to be perceived as coming from the same source.
2. Onset time – aka temporal segregation. Sounds that begin at the same time are grouped together.
3. Pitch and timbre similarities – spectral segregation. Play VL 12.3 and VL 12.6.
Harmonic coherence – most sounds are a fundamental plus harmonics, so a component that is not one of the harmonics seems to come from a separate source. Sigview: load the 200, 400, 500, 800 Hz vs. 200, 400, 600, 800 Hz workspaces and play each (see the sketch after this list).
4. Auditory continuity. VL 12.9 / Web 12.10.
5. Experience. The two melodies – Three Blind Mice and Mary Had a Little Lamb – are more easily identified by those familiar with the two melodies. Web 12.11 / VL 12.11.
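Here is a small Python sketch of the Sigview demo in item 3: two four-component complexes, one consisting entirely of harmonics of 200 Hz and one in which a 500 Hz component does not fit the 200 Hz series. Writing the tones to WAV files (scipy is assumed to be available) lets you hear the mistuned component pop out as a separate sound.

```python
import numpy as np
from scipy.io import wavfile

fs = 44100                         # sample rate (Hz)
t = np.arange(int(0.5 * fs)) / fs  # half a second of signal

def complex_tone(freqs_hz):
    """Sum of equal-amplitude sine components at the given frequencies."""
    tone = sum(np.sin(2 * np.pi * f * t) for f in freqs_hz)
    return tone / len(freqs_hz)    # keep amplitude within [-1, 1]

harmonic = complex_tone([200, 400, 600, 800])  # all multiples of 200 Hz
mistuned = complex_tone([200, 400, 500, 800])  # 500 Hz breaks the 200 Hz series

# 500 Hz is not a harmonic of 200 Hz, so listeners tend to hear it
# segregate from the rest of the complex rather than fuse with it.
wavfile.write("harmonic.wav", fs, (harmonic * 32767).astype(np.int16))
wavfile.write("mistuned.wav", fs, (mistuned * 32767).astype(np.int16))
```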
Hearing and vision – sound influences what we see
Play VL 12.15 / Web 12.12. First play it with the sound turned down, then with the sound turned on. (I'm not thrilled with the other two Web VLs.)

Areas of the brain activated by both vision and sound – G9 p 312, Figure 12.34
Neurons in the monkey parietal lobe were recorded during auditory and visual stimulation. The red areas are areas in which stimulation – auditory in (a), visual in (b) – resulted in activity well above the base rate. The rightmost panel (c) shows that the two areas overlap considerably.

Echolocation by persons who are blind – the Thaler et al. (2011) study
The researchers recorded from the ear canals of experienced echolocators, capturing both the clicking sounds made by the echolocators and the echoes of those clicks as they reached the ears. They then played the recordings of the clicks and their echoes back to the echolocators and to normally sighted volunteers. They found:
1. Activity in the auditory cortexes of both groups.
2. Activity in the visual cortex of the echolocators, but not of the sighted volunteers.
(Original figures: visual cortex activity in an echolocator listening to the clicks and their echoes; absence of visual cortex activity in a normally sighted volunteer.)