Chapter 12 – Auditory Localization and Organization
Localization of sounds
Localizing visual stimuli in the vertical and horizontal dimensions – too easy!! Visual stimuli at
different places activate different locations on the retina, and each place on the retina is
connected to a different area of the cortex. So it's easy to see how the brain can locate
different visual stimuli – by noting where the activity occurs on the retina, in the LGN, and in area V1.
Localizing in the third dimension – depth – is more difficult. It requires "neural deduction":
comparison of the images in the two eyes – binocular disparity.
The issues involved in localizing sound are like those involved in using binocular disparity to
perceive depth or distance . . .
A sound originating from the left activates ALL the receptors in both the left cochlea and the
right cochlea.
Similarly, a sound originating from the right activates ALL the receptors in both cochleas.
So, perceiving location requires a comparison of the input to the two ears.
Some terms . . .
Azimuth – location in a horizontal circle around the listener. 0° is immediately in front of us,
90° right is directly to the right, 180° is directly behind, and 270° (or, equivalently, 90° left)
is directly to the left.
Elevation – location in a vertical circle around the listener – above or below the head.
[Figure: azimuth circle around a listener's head, marked at 0°, 45°, 90° right, 180°, and
270° (90° left).]
Cues for localization
Sound localization involves comparison of the sounds in the two ears.
The primary cues are binaural cues: specific differences in the sound reaching the two ears.
1. Interaural Time Differences (ITDs): G9 p 291
Also called time of arrival differences
A sound originating on the left will reach the left ear slightly sooner than it reaches the right ear.
Vice versa for sounds originating on the right.
The maximum difference in time of arrival occurs for a sound immediately to the left of the left
ear or immediately to the right of the right ear . . .
That maximum difference is about 600 µsec: .000600 seconds, 600 millionths of a second, or 600
microseconds.
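To see roughly where the 600 µsec figure comes from, here is a minimal back-of-the-envelope sketch. The ear-to-ear distance (~0.21 m) and the speed of sound (~343 m/s) are values I'm supplying for the sketch, not values from the text:

```python
# A minimal check of the ~600-microsecond maximum ITD.
# Assumptions (not from the notes): ear-to-ear distance ~0.21 m,
# speed of sound in air ~343 m/s at room temperature.

EAR_SEPARATION_M = 0.21
SPEED_OF_SOUND_M_S = 343

# For a sound directly to one side, the extra path to the far ear is
# roughly the full ear separation, so the maximum ITD is:
max_itd_sec = EAR_SEPARATION_M / SPEED_OF_SOUND_M_S
print(f"Maximum ITD ~ {max_itd_sec * 1e6:.0f} microseconds")  # ~612 µs, close to 600
```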
Such time-of-arrival differences are by themselves enough to allow precise localization of
sounds. Careful experiments in which the only cue for location was time of arrival have
demonstrated that people can accurately locate sounds based on this cue alone.
The threshold for ITD is about 100 µsec – sounds with an ITD of less than 100 µsec are
perceived to be directly in front of us 50% of the time.
Interaural time differences (ITDs) are most important for localization of low frequency
sounds.
2. Interaural Level Differences (ILDs) G9 p 291
Differences in Level (intensity) between the ears
Sounds which originate on the left will tend to be more intense at the left ear.
Sounds which originate on the right will tend to be more intense at the right ear.
Our perception corresponds to reality . . .
All other things equal, a sound located to the left will be more intense at the left ear, and a
sound located to the right will be more intense at the right ear.
But: How different the intensities will be in the two ears depends on the frequency of the sound.
A sound originating on the left has two ways of getting to the right ear (assuming no
reflective surfaces):
a. Bending around the head. Low frequency sounds bend easily around the head.
b. Going through the head. High frequency sounds don't bend, so they must pass through the
head, which reduces their intensity – the far ear sits in an acoustic "shadow."
The result is the following:

Frequency    Maximum intensity difference between ears
300 Hz       1 dB (barely noticeable) – low frequency sounds easily bend around the head
4,000 Hz     5-10 dB (noticeable)
10,000 Hz    10-20 dB (quite noticeable) – high frequency sounds are blocked by the head
Graphically . . . [Figure: maximum interaural level difference plotted as a function of frequency.]
So interaural level differences provide good information on location of high frequency
sounds.
But interaural level differences provide little information about the location of low frequency
sounds.
This is one reason that some high quality sound systems have only one speaker (a subwoofer) for
very low frequency sounds: we are poor at identifying the location of such sounds, so there is no
need for a stereo pair.
Perceiving Elevation G9 p 293
The Cone of Confusion
ITDs and ILDs give great information about localization in the azimuth.
Alas, they don’t provide much information about elevation.
The collection of locations for which ITD and ILD are equal is called the cone of confusion.
The locations on the cone differ in elevation.
This means that ITD and ILD are of little use in determining elevation.
Spectrum cues used to judge elevation and whether sound is in front of us or in back
The shape of the pinna differentially attenuates different frequency components of the sound
stimulus.
This shaping of the sound stimulus creates information for the location of the sound called the
spectral shape cue. This cue is particularly important for perception of the elevation of the
sound stimulus.
The shaping of the sound spectrum by the pinna is specific to each individual ear.
This means that using the shape of the sound spectrum to perceive elevation must be learned.
It also implies that if the shape of an adult’s pinna were changed, that adult would have difficulty
locating the elevation of sounds they could previously locate very easily.
[Figure: spectra recorded in the ear canal for a sound from above vs. a sound from below. Note
that the sound from above is generally more intense in the ear canal, and that the sound from
below lacks intensity in the 6-8 kHz frequency range relative to the sound from above.]
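To get a feel for the spectral shape cue, here is a toy sketch that mimics the "sound from below" case by notching the 6-8 kHz band out of white noise. This is not the real pinna transfer function – the filter design and all numeric values are illustrative choices:

```python
# Toy illustration of the spectral shape cue: the notes say sounds from below
# lack energy around 6-8 kHz relative to sounds from above. Here we fake that
# by band-stop filtering white noise (requires NumPy and SciPy).
import numpy as np
from scipy.signal import butter, lfilter

fs = 44100                    # sample rate (Hz)
noise = np.random.randn(fs)   # 1 second of white noise: stand-in for "sound from above"

# Band-stop filter removing roughly 6-8 kHz: stand-in for "sound from below"
b, a = butter(4, [6000, 8000], btype="bandstop", fs=fs)
below = lfilter(b, a, noise)

# Compare energy in the 6-8 kHz band before and after filtering
freqs = np.fft.rfftfreq(len(noise), 1 / fs)
band = (freqs >= 6000) & (freqs <= 8000)
above_energy = np.abs(np.fft.rfft(noise))[band].sum()
below_energy = np.abs(np.fft.rfft(below))[band].sum()
print("6-8 kHz energy, above vs. below:", above_energy, below_energy)
```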
Influence of shape of the pinna in judging elevation
Figure 12.9 G9 p 295 shows the effects of wearing an artificial pinna on ability to localize
sounds.
The crossings of the blue grid are where sounds were presented.
The red dots are where the sounds were reported to be located.
On the first day that observers wore an implant, their reports
were horribly inaccurate in the vertical dimension, although
pretty nearly correct in the horizontal dimension.
After 5 days of wearing the implant, observers’ reported
locations in the vertical dimension became more accurate.
After 19 days wearing the implant, observers’ reports of both
vertical and horizontal location were pretty accurate.
Immediately after removing the implant, observers' reports were still quite accurate,
suggesting that they had not "lost" their original ability, but had simply gained a new one.
This study suggests that the timbre (the experienced frequency spectrum) of a sound is a cue
for that sound's elevation.
Using timbre to determine whether a sound is in front of us or in back.
The pinna attenuates the high frequency components of sounds located in back of the head.
This means that we can tell when a familiar sound is behind us because it sounds slightly
muffled compared to the way it sounds when it’s in front of us.
A study demonstrating the effect of the ear flap . . .
Batteau (1967) as reported in L&S, p. 424.
Batteau created artificial pinnas out of plaster and put them on an artificial head, with
microphones placed in the hole of each artificial pinna where the eardrum would be.
Condition 1: Recorded sounds played in different locations around the head in which
microphones had been embedded in the plaster pinnas.
Condition 2: Recorded the same sounds played in different locations around a head in which
microphones had been embedded without being inside plaster pinnas. That is, the artificial
head had no ears, just ear canals with microphones at the end of the canals.
Played back the recordings of those sounds to human observers.
Results:
Condition 1: Participants were able to identify the locations of the sounds in front of and
behind the artificial head.
Condition 2: Participants were not able to localize the sounds correctly.
Conclusion: The plaster artificial pinnas muffled the sounds played behind the head in a way
similar to how sounds are muffled by the real pinna.
Perceiving distance (Not in G9, from K&S p 296)
Familiarity – comparison of perceived sound with memories of similar sounds at different
distances
Spectral analysis – High frequency sounds decrease in intensity over distance more than low
frequency sounds. So we hear the bass guitar across the lake, but not the singer. We hear the
rumble of thunder but not the high frequency crack of lightning.
Ratio of direct to reflected sound – Nearer sounds have a higher ratio of direct to reflected
sound than farther sounds.
Possible neural circuits for ITD and ILD – G9 p 295
Study of barn owls has provided evidence for existence of neurons that
respond only to sounds presented in a particular location.
The superior olivary complex contains the medial superior olive (MSO), the collection of
neurons that have been identified as responding to ITDs.
Neurons have been found which respond only when one ear receives sound slightly before the
other – that is, to time-of-arrival differences, or ITDs.
[Figure from the web site of David Heeger, NYU.]
A circuit for neurons responding to Interaural Time Differences – G9 p 297
Issue: Design a neural circuit so that a neuron responds most actively when a sound arrives at
the right ear before it arrives at the left ear.
[Figure: the circuit viewed from the back of the head. A sound on the right reaches ears L and
R; a short axon runs from the left cochlear nucleus to an MSO neuron, and a long axon runs from
the right cochlear nucleus to the same MSO neuron.]
A simple way to create time-of-arrival signaling is to make the path from the neurons
originating in one ear to the postsynaptic neuron longer than the path from the other ear. This
is analogous to the motion detectors we studied a couple of chapters ago.
Sounds in front of us
If a sound strikes both L and R at the same time, L's action potential will arrive at the end of
its axon first, followed later by R's, because the action potential has farther to travel in R
than in L. The release of neurotransmitter at the MSO neuron is therefore spread out over time
at low momentary concentration, resulting in a low probability of the MSO neuron responding.
Sounds to the right of us
Sound strikes the right ear first, but the path from the right ear to the MSO is longer.
So R’s action potential will reach the end of its axon at about the same time as L’s, causing a
massive release of neurotransmitter, enough to cause the MSO neuron to fire.
Sounds to the left of us
But if a sound strikes L first, then strikes R shortly afterward, L’s action potential will reach the
end of its axon first and much much later (in neural time) R’s action potential will reach the end
of its axon, resulting in insufficient neurotransmitter to cause the MSO neuron to respond.
Note that in such circuits, the long path is from the ear closest to the sound. (In case you need
to construct such a neural circuit.)
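Here is a minimal sketch of the delay-line logic just described, with the convention that a positive ITD means the sound reached the right ear first. The particular delay values and the coincidence window are invented for illustration; only the principle – a longer axon from the near ear equalizes arrival times at the MSO neuron – comes from the notes:

```python
# Sketch of one MSO coincidence detector with asymmetric axonal delays.
# All numeric values are illustrative, not measured.

AXON_DELAY_FROM_LEFT_US = 100    # short axon from the left cochlear nucleus
AXON_DELAY_FROM_RIGHT_US = 400   # long axon from the right cochlear nucleus
COINCIDENCE_WINDOW_US = 50       # fires only on near-simultaneous arrivals

def mso_fires(itd_us: float) -> bool:
    """itd_us > 0 means the sound reached the RIGHT ear first (sound on the right)."""
    # Measure time from the moment the sound reaches the right ear.
    left_arrival = itd_us + AXON_DELAY_FROM_LEFT_US   # left-ear input lags by itd_us
    right_arrival = AXON_DELAY_FROM_RIGHT_US          # right-ear input, long axon
    return abs(left_arrival - right_arrival) <= COINCIDENCE_WINDOW_US

print(mso_fires(300))   # True:  sound to the right; arrivals coincide at the MSO
print(mso_fires(0))     # False: sound straight ahead; right input arrives late
print(mso_fires(-300))  # False: sound to the left; inputs even further apart
```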
Play ..\..\..\MDBT\P312\Ch14AuditoryPhenomena\ITD Neuron.pptx
The Jeffress Neural Coincidence Model – combining many of the above circuits.
Figure 12.12, G9 p 297
Each neuron – 1, 2, 3, and so on – receives input over a different pair of path lengths from
the two ears.
Neuron 5 has equal length paths from the two ears, so it will respond best to sounds from
directly in front.
Neuron 3 has a long path from the right ear and a short path from the left ear, so it will
respond best to sounds that reach the right ear first.
Imagine 1000s of neurons like 1,2,3,4,5,6,7,8,9 above,
each with a different length path to each ear. Each of
those neurons will respond best to a sound at a particular
location on the azimuth.
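Extending the single circuit above, here is a sketch of the Jeffress array. It assumes, purely for illustration, that each detector's response falls off as a Gaussian around its preferred ITD, with sign conventions chosen so that neuron 5 prefers sounds from the front and neuron 3 prefers sounds that reach the right ear first:

```python
# Sketch of the Jeffress place code: many coincidence detectors, each with a
# different internal delay and hence a different "best ITD". Values invented.
import numpy as np

# Nine detectors; neuron 5 (index 4) prefers ITD = 0 (straight ahead),
# neuron 3 prefers right-ear-leading ITDs, matching the figure description.
best_itds_us = np.linspace(600, -600, 9)

def place_code(itd_us: float, sigma_us: float = 100.0) -> np.ndarray:
    """Response of each detector: largest when the sound's ITD matches its best ITD."""
    return np.exp(-((itd_us - best_itds_us) ** 2) / (2 * sigma_us ** 2))

responses = place_code(300.0)           # ITD > 0: sound reached the right ear first
winner = int(np.argmax(responses)) + 1  # neuron number, 1-indexed as in the figure
print(f"Most active detector: neuron {winner}")  # prints 3, as described above
```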
Mike – Explain what each curve means.
Mammalian Neurons
It appears that the tuning curves of neurons in mammals are not as sharp as those of owls.
The gerbil curve is about twice as wide as the owl curve in Figure 12.14 p 298.
It’s been suggested that in mammals, there are more broadly tuned neurons, ones that fire over a
wide range of locations to the right and others that fire over a wide range of locations to the left.
The blue curve is of a neuron that responds above its
base rate to sounds from the right and below its base rate
to sounds from the left.
The other curve has the opposite response characteristics.
Note – each of these curves is like the response curve of an opponent-process neuron in the
LGN.
In this model, the location of a sound is signaled by the pattern of responses of these two types of
neurons . . .
Sound 1 is on the left – resulting in a large response of the L neuron, but a small response of
the R neuron.
Sound 3 is on the right – resulting in a small response of
the L neuron but a large response of the R neuron.
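A sketch of the two-broad-channel idea, using sigmoid tuning curves as a stand-in for the real curves; the slopes and ranges below are arbitrary choices for illustration:

```python
# Two broadly tuned channels; location is read from the pattern across them.
import numpy as np

def right_channel(azimuth_deg: float) -> float:
    """Fires above its base rate (0.5) for sounds on the right, below it on the left."""
    return 1.0 / (1.0 + np.exp(-azimuth_deg / 20.0))

def left_channel(azimuth_deg: float) -> float:
    """Mirror image of the right channel."""
    return 1.0 - right_channel(azimuth_deg)

# Sound on the left, straight ahead, and on the right:
for az in (-60, 0, 60):
    print(az, round(left_channel(az), 2), round(right_channel(az), 2))
# -60: L large, R small; 0: both at base rate; 60: L small, R large
```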
A Neural Circuit responding to Interaural Level Differences
(This one is almost too easy.)
Below is a simple-minded circuit to process intensity differences.
[Figure: inputs from L and R converging on a neuron labeled ILD; the synapse from L is
inhibitory and the synapse from R is excitatory.]
Sounds in front of us
Neurotransmitter from L and R counteract each other, so the ILD neuron fires at its base rate.
A base rate of activity thus signals that a sound is in front of us.
Sounds to the right
Since the sound is more intense at R, R releases more excitatory neurotransmitter than L
releases inhibitory neurotransmitter, making the ILD neuron more active. Through its heightened
activity, the ILD neuron signals "sound to the right."
Sounds to the left
Since the sound is more intense at L, L releases more inhibitory neurotransmitter than R
releases excitatory neurotransmitter, reducing the ILD neuron's activity. Through its lowered
rate of activity, the ILD neuron signals "sound to the left."
Obviously, the ILD neuron by itself could not completely identify the location of a sound, but
its activity, in conjunction with that of other neurons responding to other aspects of the
auditory scene, would help signal sound location in the azimuth.
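The arithmetic of this circuit is simple enough to write down directly. The base rate and gain below are invented numbers; the sign conventions follow the description above:

```python
# Sketch of the excitation/inhibition arithmetic: the ILD neuron is excited
# in proportion to right-ear level and inhibited in proportion to left-ear
# level. BASE_RATE and GAIN are illustrative values, not measurements.

BASE_RATE = 50.0   # spikes/sec when excitation and inhibition cancel
GAIN = 2.0         # spikes/sec per dB of interaural level difference

def ild_rate(left_db: float, right_db: float) -> float:
    return BASE_RATE + GAIN * (right_db - left_db)

print(ild_rate(60, 60))  # 50.0 -> base rate: sound straight ahead
print(ild_rate(55, 65))  # 70.0 -> above base rate: sound on the right
print(ild_rate(65, 55))  # 30.0 -> below base rate: sound on the left
```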
The auditory pathway to the auditory cortex –
Note that each primary auditory projection area (A1) is buried in a sulcus.
An important point to get from this is that there is a LOT of processing of the auditory signal in
each of these structures.
A natural question is: What exactly are they doing?
As Yantis (p. 325) says, "The exact functions served by the different subcortical structures and the
different response patterns of their neurons are the subject of considerable ongoing research
and debate, but it’s clear that these structures and the neural signal processing that takes place
within them along the auditory pathways play a critical role in encoding, with exquisitely high
fidelity, the rapidly changing stimuli that typically make up the auditory environment.”
An image of the initial processing of sound that emphasizes the tonotopic maps in the
various areas . . .
[Figure from journal.frontiersin.org: tonotopic maps along the ascending auditory pathway. Red
represents low frequency sounds; blue represents high frequency sounds.]
AN: Auditory Nerve
CN: Cochlear Nucleus
LSO: Lateral Superior Olive
MSO: Medial Superior Olive
MNTB: Medial Nucleus of the Trapezoidal Body – not mentioned in Y1
The point is that every area through which the auditory signal passes is organized tonotopically –
by frequency.
The Auditory Cortex – Y1 p 327
The auditory cortex is located on the upper surface of the temporal lobe tucked into the lateral
sulcus – the sulcus separating the temporal lobe from the frontal and parietal lobes.
The primary discovery associated with the auditory cortex is the tonotopic map – the neurons are
organized by the frequency of sound to which they’re most sensitive.
All of the subcortical structures shown on the previous page and the cortical structures shown on
this page are organized as tonotopic maps.
Note that the maps reverse at the dividing line between the Rostral core and the Rostrotemporal
core.
The region is separated into three subregions – the core, the belt, and the parabelt.
Neurons in the core respond best to pure tones.
Neurons in the belt and parabelt respond to more complex sound stimuli.
This is analogous to the relationships of visual neuron responses in V1 vs. V4, MT, and other
cortical areas. The complexity of the stimuli to which neurons respond becomes greater, the
farther you go from the initial projection area.
What and where pathways
Auditory system neurons that carry information about sound localization send their axons to the
parietal lobe.
Auditory system neurons that carry information that might help in identifying what is in the
external environment send their axons to the frontal lobe.
The auditory pathways meet the visual “what” and “where” pathways.
The “where” pathways meet in the parietal lobe (in the monkey).
The “what” pathways meet in the prefrontal cortex (again, in the monkey).
This, of course, means that there are neurons that receive both visual and auditory information –
in the blue areas in the above figure.
[Figure from Goldstein's 8th edition: images of brain activity of a human performing either
identification of sounds or localization of sounds.]
Architectural Acoustics G9 p 302
Reverberation time – the delay and strength of echoes.
Definition: the time it takes a sound to decrease in level by 60 decibels – that is, the time it
takes for the reflections of the sound to become 60 decibels less intense than the original
sound.
Play VL 12.14
Best is about 2 seconds for concert halls and about 1.5 seconds for opera houses.
Best is 0.4-0.6 seconds for a classroom and about 1.0-1.5 seconds for an auditorium.
Intimacy time: The time between arrival of the sound directly from the source and the arrival of
the first reflection. Like reverberation, but focusing on the time of the first reflection, not the
total amount of reflected sound.
Best is about 20 milliseconds. (.020 sec).
Bass ratio: The ratio of low frequencies to middle frequencies of reflected sounds.
Best is a high ratio.
Spaciousness factor: The fraction of all the sound received by a listener that is reflected, as
opposed to direct.
Best is high.
Signal to Noise Ratio: The decibel difference between the amplitude of the signal (e.g., a voice
in a classroom) and the amplitude of the background noise.
Need an S/N ratio of about 10-15 dB – but higher is better.
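Two of these quantities are easy to compute. For reverberation time, a standard estimate (not given in the notes) is Sabine's formula, RT60 ≈ 0.161·V/A, with V the room volume in cubic meters and A the total absorption in square-meter sabins; the room values below are made up. The S/N calculation just converts an amplitude ratio to decibels:

```python
# Hedged calculations for this section (values are illustrative).
import math

def rt60_sabine(volume_m3: float, absorption_sabins: float) -> float:
    """Sabine's approximation for reverberation time, in seconds."""
    return 0.161 * volume_m3 / absorption_sabins

# A made-up classroom: 200 cubic meters, 60 sabins of absorption.
print(round(rt60_sabine(200, 60), 2))  # ~0.54 s, within the 0.4-0.6 s classroom range

def snr_db(signal_amplitude: float, noise_amplitude: float) -> float:
    """Signal-to-noise ratio in decibels from two sound-pressure amplitudes."""
    return 20 * math.log10(signal_amplitude / noise_amplitude)

print(round(snr_db(4.0, 1.0), 1))  # ~12 dB, inside the 10-15 dB classroom target
```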
Auditory Scene Analysis – Y1 p 341
Identifying the objects in the auditory scene.
The different pieces of the auditory experience.
Example – Musical scenes . . .
American Woman by the Guess Who – (Play from 1:11 on . . .)
..\..\..\MDBT\P312\Ch14AuditoryPhenomena\American Woman Whole Version by the Guess
Who.mp4
Taken from YouTube.
Turtle Blues – 2:56 – breaking glass
..\..\..\MDBT\P312\Ch14AuditoryPhenomena\Turtle Blues from Cheap Thrills by Janis
Joplin.mp4
The problem . . .
What we hear has multiple sources, each broadcasting its own waveform, but the sound arrives at
the ear as a single waveform – a mixture of all of the original waveforms.
How do we break the composite waveform into its pieces, identifying the different sources, so
that we hear different things simultaneously?
Factors affecting how we identify auditory "objects"
These are analogous to the Gestalt laws of grouping.
1. Location – aka spatial segregation.
Sounds from the same location will be more likely to be perceived as coming from the same
source.
2. Onset Time – aka temporal segregation.
Sounds which begin at the same time are grouped together.
3. Pitch and timbre similarities – spectral segregation.
Play VL 12.3
Play VL 12.6
Harmonic coherence – Most sounds consist of a fundamental plus its harmonics, so a component
which is not one of the harmonics seems to be a separate sound. (A synthesis sketch follows
this list.)
Sigview – Load 200,400,500,800 vs 200,400,600,800 workspaces. Play each.
4. Auditory Continuity.
VL 12.9 / Web 12.10
5. Experience.
The two melodies – Three Blind Mice and Mary had a Little Lamb – are more easily identified
by those familiar with the two melodies.
Web 12.11 / VL 12.11
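As promised under factor 3, here is a minimal sketch that synthesizes the two complexes from the Sigview demo – one all harmonics of 200 Hz (200, 400, 600, 800) and one with 500 Hz replacing the 600 Hz harmonic, which tends to pop out as a separate sound. Writing .wav files with Python's standard wave module is my choice, not part of the original demo:

```python
# Synthesize the harmonic vs. inharmonic four-component complexes.
import wave
import numpy as np

FS = 44100  # sample rate in Hz

def complex_tone(freqs_hz, duration_s=1.0):
    """Sum of equal-amplitude sinusoids, scaled to 16-bit integer samples."""
    t = np.arange(int(FS * duration_s)) / FS
    signal = sum(np.sin(2 * np.pi * f * t) for f in freqs_hz)
    return (signal / len(freqs_hz) * 32767 * 0.9).astype(np.int16)

for name, freqs in [("harmonic.wav", [200, 400, 600, 800]),     # all harmonics of 200 Hz
                    ("inharmonic.wav", [200, 400, 500, 800])]:  # 500 Hz is not a harmonic
    with wave.open(name, "wb") as w:
        w.setnchannels(1)   # mono
        w.setsampwidth(2)   # 16-bit samples
        w.setframerate(FS)
        w.writeframes(complex_tone(freqs).tobytes())
```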
Hearing and Vision.
Sound influences what we see
Play VL 12.15 / Web 12.12
First, play with sound turned down.
Then, play with sound turned on.
I’m not thrilled with the other two Web VLs.
Areas of the brain activated by both vision and sound. G9 p 312, Figure 12.34
Neurons in the monkey parietal lobe were recorded during auditory and visual stimulation.
The red areas are areas in which stimulation – auditory in (a), visual in (b) – resulted in high
activity above the base rate.
The rightmost panel (c) shows that the two areas overlap considerably.
Echolocation by persons who are blind.
Thaler et al. (2011) study
Recorded from the ear canals of experienced echolocators, capturing both the clicking sounds
made by the echolocators and the echoes of those clicks as they reached the ears.
They then played the recordings of the clicks and their echoes back to the echolocators and to
normally sighted volunteers.
They found . . .
1. Activity in the auditory cortexes of both.
2. Activity in the visual cortex of the echolocators, but not of the sighted volunteers.
[Figure: visual cortex activity in an echolocator listening to the clicks and their echoes, and
absence of visual cortex activity in a normally sighted volunteer.]