
Sound Source Localization in the Real World Requires Multisensory Processing

William A. Yost

Speech and Hearing Science

Arizona State University

Outline of Presentation

• Brief Review of Sound Source Localization Cues

• Brief Overview of the Collection of Parametric Data on Sound Source Localization

• Summary of Experiments Measuring Sound Source Localization when Listeners and Sound Sources Rotate (Move)

Acoustic Cues that are Neurally Calculated to Account for Sound Source Localization

HRTF

Reflection-Reduced Listening Room

RT60 = 102 ms; ambient background = 32 dBA

Experiments controlled by equipment in adjacent Control Room.

Head and Eye Tracker also used.

36 loudspeakers on a 5-foot sphere: 24 in one circle at the height of the listener’s pinna, 8 loudspeakers on a circle at 33° elevation, and 4 loudspeakers on a circle at 67° elevation.

Computer-Controlled Rotating Chair

Sound Source Localization Identification Accuracy as a Function of Bandwidth

MAJOR FINDINGS

1) Accuracy improves (errors decrease) with increasing bandwidth

2) For narrow bandwidths (<1 octave), accuracy depends on spectral content and bandwidth: low-frequencies highest accuracy, mid-frequencies worst accuracy, high-frequencies intermediate accuracy.

Sound Source Localization Identification Accuracy: Clicks

[Figure: Mean ± one standard deviation for two-octave-wide, 200-ms noise bands (Yost et al., 2013)]

Repeating Clicks at Different Rates

Major Findings: Bandwidth, not duration or envelope, affects sound source localization accuracy.

Sound Source Localization of Amplitude Modulated, 4000-Hz Tonal Carrier

Major Finding: For narrow-band, high-frequency sounds, providing an envelope might lead to a small improvement in sound source localization accuracy.

Almost all Studies of Spatial Hearing Have Involved Only Stationary Listeners.

However, All Acoustic Cues for Sound Source Localization are RELATIVE CUES.

They change when the source moves relative to the listener or when the listener moves relative to the source.

We perceive a moving source as moving and a stationary source as stationary whether we move or not.

If the acoustic cues for sound source localization are relative, shouldn’t our perception of sound source location vary when we move??!!

Relative to What?

The spatial cues used for sound source localization change relative to Head Position when either the source or the listener moves. That is, the spatial cues represent the location of sources in a “HEAD CENTRIC” reference system.

But the sound sources themselves can be referenced to their location in their Environment (e.g., relative to where in a room the source is located). In this case the reference is a “WORLD CENTRIC” reference system.

[Figure: World versus Head Centric Reference. Two cases, each shown in WORLD CENTRIC and HEAD CENTRIC views: (1) Stationary Sound Source, Rotating Listener; (2) Rotating Sound Source, Rotating Listener.]

Relative to What?

Why do we localize sound sources in a World-Centric reference system when the acoustic spatial cues vary in a Head-Centric reference system? For example, why do we not perceive a stationary sound source as moving when we move, given that the acoustic cues used for sound source localization change?

HYPOTHESIZED ANSWER: IN THE EVERYDAY WORLD THE SPATIAL BRAIN PROCESSES BOTH THE ACOUSTIC SPATIAL CUES AND THE LOCATION OF THE HEAD, AND AS A RESULT CALCULATES THE WORLD-CENTRIC LOCATION OF THE SOUND SOURCE BASED ON THESE TWO PIECES OF INFORMATION.

World Centric Sound Source Localization Perception

[Figure: a Sound Source with its World Centric angle θ_wc, its Head Centric angle θ_hc, and the Body Position angle θ_bp.]

$\theta_{wc} = \theta_{bp} - \theta_{hc}$
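As a minimal sketch of the relation above, the world-centric angle can be computed from the other two angles. The function names, the use of degrees, and the wrapping to [-180, 180) are assumptions made for illustration. The sign convention simply follows the formula as written on the slide, since the figure that defines the angle directions is not available here.

```python
def wrap_degrees(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def world_centric_angle(theta_bp, theta_hc):
    """World-centric angle from body position and head-centric angles.

    Implements theta_wc = theta_bp - theta_hc exactly as written on the
    slide; all angles are assumed to be in degrees.
    """
    return wrap_degrees(theta_bp - theta_hc)


print(world_centric_angle(30.0, 10.0))    # 20.0
print(world_centric_angle(170.0, -40.0))  # -150.0 (210 degrees wrapped)
```

The point of the sketch is that neither angle alone fixes the world-centric location: the same head-centric angle gives a different result whenever the body position angle changes.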

World-Centric Sound Source Perception

Sound Source/Head Centric Cues: Auditory processing of ITD, ILD, and HRTF cues provides the spatial brain with information about the head-centric acoustic cues required for localizing the position of a sound source.

Body/Head Position Cues: Visual cues and most probably vestibular, proprioceptive, somatosensory, auditory, and cognitive cues provide the spatial brain with information about body/head position.

Sound Source Localization is based on a combination of these two pieces of information.

Where is the Evidence???!!

Logic of the Research Approach

Listeners are rotated in the chair and presented sounds from sources that are fixed in location or that change position by rotating from loudspeaker to loudspeaker around the 24-loudspeaker circular array. In one set of conditions the listener has full use of vision. In the other set of conditions visual input is denied (listener is blindfolded, asked to close their eyes, and the room is entirely dark).

In order to investigate vestibular function, the listener’s rotation is either accelerating, decelerating, or at constant velocity. The main vestibular output for sensing rotation is provided by the semi-circular canals, which are acceleration sensitive, i.e., there is no semi-circular canal output under constant velocity rotation.

To the best of our ability we try to eliminate, or significantly reduce, proprioceptive, somatosensory, eye movement, other auditory cues, and cognitive cues.

Listeners make perceptual judgments about the rotation of sound around the 24-loudspeaker circular array when visual cues are present or absent and when vestibular rotational cues are available (acceleration or deceleration) or are not available (constant velocity).

Logic of the Research Approach

Predictions:

With vision, sound sources are perceived in a world-centric reference system.

And

Without vision, the perception of the location of the sound sources would be in a head-centric reference system.

And

Under certain rotation conditions, vestibular outputs could change the perception of the relative location of the sound sources from head centric to world centric.
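The two reference systems make different predictions whenever the listener rotates, which can be sketched as follows. This is an illustration under stated assumptions, not the analysis used in the experiments: clockwise is taken as positive, velocities are in degrees per second, the head-centric percept is assumed to follow the velocity of the source relative to the listener, and the function names are hypothetical.

```python
def percept(angular_velocity, tol=1e-9):
    """Label a rotation rate (deg/s, clockwise positive) as a percept."""
    if abs(angular_velocity) <= tol:
        return "Stationary"
    return "Clockwise" if angular_velocity > 0 else "Counterclockwise"


def predict(source_velocity, listener_velocity):
    """Predicted percept of the source under each reference system.

    World centric: the source's own rotation in the room.
    Head centric: the source's rotation relative to the listener's head
    (assumed here to be source velocity minus listener velocity).
    """
    return {
        "world_centric": percept(source_velocity),
        "head_centric": percept(source_velocity - listener_velocity),
    }


# Stationary source, listener rotating clockwise at 45 deg/s.
print(predict(0.0, 45.0))
# {'world_centric': 'Stationary', 'head_centric': 'Counterclockwise'}

# Source and listener rotating clockwise at the same rate.
print(predict(45.0, 45.0))
# {'world_centric': 'Clockwise', 'head_centric': 'Stationary'}
```

In both examples the two systems disagree, so a listener's report of "Stationary" or "Rotating" indicates which reference system the percept follows.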

Experiment 1

Eight listeners judge whether the sound appears to rotate clockwise around the 24-loudspeaker array (“Rotating”) or is perceived as stationary (“Stationary”).

Sound and Listeners are either Stationary or Rotate (Accelerating, Decelerating, or rotating at Constant Velocity).

When Sound Source and Listeners rotate, both Listener and Sound Source rotate at the same rate in the clockwise direction:

Constant Velocity: 45°/s, or

Accelerating: 1°/s² (0 to 45°/s), or

Decelerating: 1°/s² (45°/s to 0)
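The three rotation conditions can be written as simple velocity profiles. This sketch assumes the listed values are taken literally (1°/s² between 0 and 45°/s) and that the 24 loudspeakers are evenly spaced around the circle, which is not stated explicitly. The function and constant names are hypothetical.

```python
TERMINAL_VELOCITY = 45.0  # deg/s, from the listed conditions
ACCELERATION = 1.0        # deg/s^2, from the listed conditions
N_LOUDSPEAKERS = 24       # loudspeakers in the azimuth circle


def angular_velocity(condition, t):
    """Rotation rate in deg/s at time t (s) after the rotation starts."""
    if condition == "constant":
        return TERMINAL_VELOCITY
    if condition == "accelerating":
        return min(ACCELERATION * t, TERMINAL_VELOCITY)
    if condition == "decelerating":
        return max(TERMINAL_VELOCITY - ACCELERATION * t, 0.0)
    raise ValueError(f"unknown condition: {condition}")


# Assumed even spacing: 360 / 24 = 15 degrees between loudspeakers.
spacing = 360.0 / N_LOUDSPEAKERS
# Time to sweep one loudspeaker spacing at the constant rate.
seconds_per_loudspeaker = spacing / TERMINAL_VELOCITY

print(angular_velocity("accelerating", 10.0))  # 10.0
print(angular_velocity("decelerating", 10.0))  # 35.0
print(spacing)                                 # 15.0
```

Only the accelerating and decelerating profiles have a nonzero rate of change, which is what makes them the conditions in which the semi-circular canals can signal rotation.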

Sound: 200-ms noise burst (shaped with 20-ms rise/fall times, filtered between 125 Hz and 15 kHz) presented at 65 dBA. Same noise burst (frozen) for each run.

Each run is 55 s (5-s ramp up to or down from terminal velocity which lasts 45 s).

Listeners report: no hearing loss, no vestibular problems, and that they were not prone to motion sickness. Two listeners left the experiment after they started, because they began to feel uneasy during rotation (with their eyes closed).

Localization of Sound Sources: Sources & Listeners Change Position

Sound Source is Stationary, Eyes Open (number of responses)

Listener             "Stationary"   "Rotating"
Stationary                 8             0
Accelerating              48             0
Decelerating              48             0
Constant Velocity         48             0

Sound Source is Rotating, Eyes Open (number of responses)

Listener             "Stationary"   "Rotating"
Stationary                 0            24
Accelerating               0            48
Decelerating               0            48
Constant Velocity          0            48

Sound Source is Stationary, Eyes Closed (number of responses)

Listener             "Stationary"   "Rotating"
Stationary                24             0
Accelerating              14            34
Decelerating              16            32
Constant Velocity          2            46

Sound Source is Rotating, Eyes Closed (number of responses)

Listener             "Stationary"   "Rotating"
Stationary                 0            24
Accelerating              27            21
Decelerating              29            19
Constant Velocity         39             9

World-Centric Head-Centric

8 Listeners: 1 run when Eyes Open-Stationary Listener & Source, 3 runs for all other Stationary Listener conditions (1 run for each Rotation condition), and 6 runs for all Rotating Listener conditions.

Each run lasted ~45 s, after which Listeners were asked about the status (“Source Stationary” or “Source Rotating”) of the sound presented from the loudspeakers.
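The Experiment 1 response counts can be converted to percentages to make the eyes-open versus eyes-closed contrast easier to read. The sketch below copies the counts listed above, assuming that in each block the first four numbers are the "Stationary" responses and the last four are the "Rotating" responses (this reading gives row totals of 8, 24, or 48, matching the stated numbers of runs). The data layout and function name are hypothetical.

```python
# (number of "Stationary" responses, number of "Rotating" responses),
# keyed by (eyes, source state) and then by listener condition.
COUNTS = {
    ("open", "stationary"): {
        "Stationary": (8, 0), "Accelerating": (48, 0),
        "Decelerating": (48, 0), "Constant Velocity": (48, 0),
    },
    ("open", "rotating"): {
        "Stationary": (0, 24), "Accelerating": (0, 48),
        "Decelerating": (0, 48), "Constant Velocity": (0, 48),
    },
    ("closed", "stationary"): {
        "Stationary": (24, 0), "Accelerating": (14, 34),
        "Decelerating": (16, 32), "Constant Velocity": (2, 46),
    },
    ("closed", "rotating"): {
        "Stationary": (0, 24), "Accelerating": (27, 21),
        "Decelerating": (29, 19), "Constant Velocity": (39, 9),
    },
}


def percent_rotating(eyes, source, listener):
    """Percentage of responses that were "Rotating" for one condition."""
    n_stationary, n_rotating = COUNTS[(eyes, source)][listener]
    return 100.0 * n_rotating / (n_stationary + n_rotating)


for (eyes, source), rows in COUNTS.items():
    for listener in rows:
        pct = percent_rotating(eyes, source, listener)
        print(f"eyes {eyes}, source {source}, listener {listener}: {pct:.1f}% Rotating")
```

For example, with eyes closed and the listener rotating at constant velocity, a stationary source was reported as "Rotating" on 46 of 48 runs (about 95.8%), while a source rotating with the listener was reported as "Rotating" on only 9 of 48 runs (18.75%).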

Experiment 2

Two Parts to Experiment 2: Exp. 2a and 2b.

Both parts: Listener and Sound Sources moving at different velocities

Sounds: Same as Experiment 1

Listeners: 17 listeners with reported normal hearing, vestibular function, and no motion sickness.

Three of the original 20 Listeners dropped out when they felt uneasy in rotation conditions.

[Figure: Experiments 2a & 2b. Listener and Source both rotate counterclockwise, plotted against Time (s); axis ticks at 0, 45, and 90. Head-centric positions: Interaural Cues Clockwise or Interaural Cues Counterclockwise. World-centric positions: always Counterclockwise.]

Experimental Conditions for Experiments 2a and 2b

[Figure: Listener and Source traces plotted against Time (s), with directions labeled Counterclockwise and Clockwise and axis ticks at 0, 45, and 90. Regions are labeled Interaural Cues Clockwise and Interaural Cues Counterclockwise.]

Questions asked:

1) At the beginning, in which direction was the sound rotating?

2) Did the direction of sound rotation change during the run?

3) How many times did the direction of rotation change during a run?

Results for Experiment 2a

[Figure: results plotted for Listener and Sound Source.]

Results for Experiment 2b: Summary Data

Experiment 3

Twelve Listeners; same noise stimuli as in Experiments 1 and 2.

Interval 1: Listener and Sound Source rotate at the same rate in the clockwise direction.

Interval 2: Trial Conditions are SAME, CLOCKWISE, or COUNTERCLOCKWISE.

[Figure: timeline (Time, s) showing Interval 1 followed by Interval 2.]

Possible Responses:

Trial Condition     SAME    CLOCKWISE   COUNTERCLOCKWISE
World Centric       “S”     “C”         “CC”
Head Centric        “CC”    “S”         “CC”

(Head Centric responses follow the Interaural Cues, comparing Interval 2 to Interval 1.)

Results of Experiment 3

[Figure: two panels, Eyes Open and Eyes Closed.]

CONCLUSIONS

The location of sound sources in the everyday world, where listeners and sound sources change position, is based on a world-centric reference system, i.e., the sources of sounds are located relative to their position in the world (environment).

The spatial cues used for sound source localization lead to the perception of the location of sound sources in a head-centric reference system.

In order for our perception of the location of the sources of sounds to be world-centric, the spatial brain must have information about both sound source localization cues and where the head is at any moment.

Visual and, in some cases, vestibular (and probably other) cues must be combined with auditory cues to account for the ability to localize the sources of sounds in the real world.

Sound Source Localization is a Multisensory Process!!
