Slides of Keynote Lecture #1 by Jens Blauert, D-Bochum

The Human Auditory System
(figure labels: cochlea, superior olivary complex, inferior colliculus, medial geniculate nucleus, primary auditory cortex)
Prominent Features of Binaural Hearing
• Localization
– Formation of positions of the auditory events
(i.e., azimuth, elevation & distance)
– Spatial extent of auditory events
• Suppression of
– Directional information coming from reflections
(e.g., precedence effect, summing localization)
– Reverberation, coloration and noise
• Identification & Segregation of
– Auditory streams
(e.g., concurrent musical instruments,
cocktail-party effect, warning signals)
Binaural Models and their
Technological Application
Jens Blauert, D-Bochum
Application Area # 1
as Identified by AABBA
Spatial Scanning and Mapping of Auditory Scenes
Estimation of the positions and the spatial extents of the
auditory events that form an auditory scene
– be it a natural scene as in room acoustics,
or a virtual scene as in virtual-reality applications,
or at the play-back end of audio systems –
including spatially diffuse auditory events,
often perceived as components of reverberance
The AABBA grouping, 2009
What is AABBA? An international grouping of 14 laboratories (since 2009),
dealing with Aural Assessment By Means of Binaural Algorithms
Application Area # 2
as Identified by AABBA
Analysis of Auditory Scenes with the Aim of
Deriving Parametric Representations
at the Signal Level
Estimates of these parameters may be used, for instance,
● For coding and/or re-synthesis of auditory scenes
● For speech-enhancement in complex acoustic environments,
incl. hearing aids
● For systems to enhance the spatial perception in sound fields,
such as better localization and/or a better sense of
envelopment; further, decolouration and de-reverberation
● For the identification of perceptual invariances of auditory
scenes
The AABBA grouping, 2009
Application Area # 3
as Identified by AABBA
Analysis of Auditory Scenes with the Aim of
Deriving Parametric Representations
at the Symbolic Level
For example,
● Identification of determinants of meaning contained in
binaural-activity maps
● Assignment of meaningful symbols to the output
of binaural models
The AABBA grouping, 2009
Application Area # 4
as Identified by AABBA
Evaluation of Auditory Scenes in Terms of Quality
Where quality will strictly be judged from the
user’s point of view, for example,
● Quality of “the acoustics” of spaces for musical performances
● Quality of systems for holophonic representation of auditory
scenes, such as auditory displays and virtual-reality generators
● Spatial quality of audio-systems (for recording, transmission
and play-back), incl. systems that employ perceptual coding
● Quality of speech-enhancement systems, incl. hearing aids
The AABBA grouping, 2009
Binaural Models and their
Technological Application
● Introduction
− Prominent Features of Binaural Hearing
− Generic Application Areas of Binaural Models
● Model Architectures
− The Periphery & Bottom-up Processing
ILDs, ITDs, interaural coherence
− Binaural-activity Mapping & Interpretation
Cocktail-party processing, sound-quality assessment
− Higher-order Processes
Precedence effect, auditory-scene analysis, Franssen effect
− Interactive Listening
Architecture for a comprehensive model of binaural hearing
● Conclusion and Outlook
− The Periphery & Bottom-up Processing
ILDs, ITDs, interaural coherence
− Binaural-activity Mapping & Interpretation
Cocktail-party processing, quality assessment
− Higher-order Processes
Precedence effect, auditory-scene analysis, Franssen effect
− Interactive Listening
Architecture for a comprehensive model of binaural hearing
Block Diagram of the Binaural-Analysis System of IKA Bochum
(figure labels: interaural-time-difference analysis, interaural-level-difference analysis, binaural-activity map; 2006)
pure bottom-up processing, signal driven!
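The interaural-level-difference analysis named in the block diagram can be illustrated with a minimal sketch. It assumes that the ILD of one frequency band is the energy ratio of the two ear signals in decibels, with positive values meaning that the left ear is louder. The function name `ild_db`, the sign convention and the small `floor` constant are illustrative assumptions and are not taken from the slides.

```python
import math

def ild_db(left, right, floor=1e-12):
    """Interaural level difference of one band in dB (positive: left louder).

    `floor` is an assumed small constant that avoids log(0) for silent input.
    """
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    return 10.0 * math.log10((energy_left + floor) / (energy_right + floor))

# Usage: the right-ear signal has half the amplitude of the left-ear signal,
# so the level difference corresponds to a factor of two in amplitude.
left_band = [1.0] * 10
right_band = [0.5] * 10
print(round(ild_db(left_band, right_band), 2))
```

In a full model this value would be computed per critical band and per time frame; the sketch covers a single band and a single frame only.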
Ear-Adequate Band-Pass-Filter Bank
spectral decomposition
envelope extraction
compressive A/D conversion
compressive!
Probabilistic Spike Generator
A Simplified Functional Model of the Hair Cells
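The stages named on this slide (envelope extraction, compression, probabilistic spike generation) can be sketched for a single band-pass channel. The slide only names the stages, so the concrete operations are assumptions made for illustration: half-wave rectification, a one-pole low-pass filter with an 800-Hz cutoff, a power-law compression with exponent 0.4, and a Bernoulli spike draw per sample. All function and parameter names are hypothetical.

```python
import math
import random

def hair_cell_envelope(band_signal, sample_rate, cutoff_hz=800.0, exponent=0.4):
    """Rectify, smooth and compress one band-pass channel (assumed operations)."""
    # One-pole low-pass coefficient for the assumed envelope smoother.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    state = 0.0
    out = []
    for x in band_signal:
        rectified = max(x, 0.0)                # half-wave rectification
        state += alpha * (rectified - state)   # envelope extraction (low-pass)
        out.append(state ** exponent)          # compressive nonlinearity
    return out

def spike_train(drive, gain, rng):
    """Draw one 0/1 spike per sample with probability min(1, gain * drive)."""
    return [1 if rng.random() < min(1.0, gain * d) else 0 for d in drive]

# Usage: a 500-Hz tone sampled at 16 kHz, 100 ms long.
fs = 16000
tone = [math.sin(2.0 * math.pi * 500.0 * n / fs) for n in range(1600)]
envelope = hair_cell_envelope(tone, fs)
spikes = spike_train(envelope, 0.5, random.Random(1))
```

Because the compression is applied after the linear stages, doubling the input amplitude raises the output only by a factor of 2 to the power 0.4, which is the kind of behaviour the label "compressive!" points at.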
The Jeffress Processor
Estimates Interaural Cross Correlation
$\Psi_y(\tau) = \frac{1}{t_1 - t_0} \sum_{t=t_0}^{t_1} y_l(t)\, y_r(t+\tau)$
similarity to Jeffress‘ coincidence model:
τk τk+1 τk+2 τk+3 τk+4 τk+5 τk+6 τk+7 τk+8
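The cross-correlation estimate on this slide can be sketched in discrete time, with each lag value playing the role of one coincidence tap τk. This is a minimal illustration, not the processor used in the lecture: the normalization divides by the number of overlapping samples instead of a fixed window length t1 - t0, and the function names and the lag range are assumptions.

```python
def interaural_cross_correlation(left, right, max_lag):
    """Return {lag: mean of left[t] * right[t + lag]} for |lag| <= max_lag."""
    n = len(left)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        count = 0
        for t in range(n):
            if 0 <= t + lag < n:
                acc += left[t] * right[t + lag]
                count += 1
        result[lag] = acc / count if count else 0.0
    return result

def estimate_itd_samples(left, right, max_lag):
    """Lag (in samples) at which the cross-correlation estimate peaks."""
    icc = interaural_cross_correlation(left, right, max_lag)
    return max(icc, key=icc.get)

# Usage: a click that reaches the right ear 5 samples after the left ear.
left_ear = [0.0] * 100
right_ear = [0.0] * 100
left_ear[50] = 1.0
right_ear[55] = 1.0
print(estimate_itd_samples(left_ear, right_ear, 20))
```

With this sign convention a positive lag means that the right-ear signal is delayed relative to the left-ear signal.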
Braasch 2002
contralaterally-inhibited IACC
IACC
Contralateral Inhibition after Lindemann
Cross-Correlation vs. Lindemann
impulsive input signals with no ITD, but with an ILD,
looked at in an auditory „critical“ band about 800 Hz
schematic plot
Gaik 1988
„Natural Combinations“ of ITDs and ILDs
Gaik‘s Extension: How to learn the ears of its owner?
Weighted Contralateral Inhibition
center frequency of critical band
Gaik 1988
left < lateral deviation > right
Output of the Jeffress-Lindemann-Gaik Model
frontal broad-band sound source
A „Figurative“ Plot of the Output of the
Gaik Processor
estimated angle
presented angle
natural hearing
Blumlein stereophony
loudspeakers 60° apart
loudspeakers 45° apart
Binaural-Model Estimation of ITDs
for Natural Hearing and Amplitude-Difference
Stereophony (Blumlein)
Blauert & Braasch 2008
− The Periphery & Bottom-up Processing
(ILDs, ITDs, interaural coherence)
− Binaural-activity Mapping & Interpretation
Cocktail-party processing, quality assessment
− Higher-order Processes
Precedence effect, auditory-scene analysis, Franssen effect
− Interactive Listening
An architecture for a comprehensive model of binaural hearing
recognition, interpretation and judgement
binaural-activity mapping
Schematic Plot of a Typical Binaural Model
(2 concurrent talkers)
„dry“ talker
reverberant talker
de-reverbed talker
Examples for Binaural Activity Maps of Speech
direction finder
weight assignment
Binaural Model
Control Unit
source selection
Wiener Filter
spectral decomposition
speech cleansing
Architecture of a Cocktail-Party Processor,
Based on Binaural Modelling
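The Wiener-filter block of this architecture can be illustrated with a per-band gain rule. The sketch assumes that the binaural model and the weight-assignment stage deliver, for every frequency band, an estimate of the target power and of the interference power; how these estimates are obtained is not shown here. The gain formula S / (S + N) is the textbook Wiener gain, and the function names are hypothetical.

```python
def wiener_gains(target_power, interference_power, floor=1e-12):
    """Per-band Wiener gain S / (S + N); `floor` avoids division by zero."""
    return [s / (s + n + floor) for s, n in zip(target_power, interference_power)]

def apply_band_gains(band_signals, gains):
    """Scale every band signal by its gain (the 'speech cleansing' step)."""
    return [[g * x for x in band] for band, g in zip(band_signals, gains)]

# Usage: three bands, dominated by the target, mixed, and dominated by the interferer.
gains = wiener_gains([1.0, 1.0, 0.0], [0.0, 1.0, 1.0])
cleaned = apply_band_gains([[1.0, -1.0], [2.0, 2.0], [3.0, 0.0]], gains)
```

Bands in which the selected source dominates pass almost unchanged, while bands dominated by the competing talker are attenuated.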
one noise source
two concurrent talkers
A human listener would
not localize the reflection!
two moving talkers
talker plus wall reflection
Source Tracking with the Binaural Model
Bodden 1993
Spatial Selectivity of a Model of Binaural Hearing
Bodden 1993
taken at an instant t = t0 from a running correlogram
Sample Output of the Binaural Model
one frontal sound source, sending out a musical chord
Impulse Response and
Running Interaural Cross-correlation
in a Room with Reflecting Walls
prediction:
left < center > right
Model Output with Incoherent Ear-Input Signals
parameter: degree of interaural coherence, k
pink noise, 12 subjects
after Blauert & Lindemann, 1985
(plot enhanced for contrast)
Area Covered by Auditory Events as a Function of
the Amount of Interaural Coherence, k
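The degree of interaural coherence, k, can be illustrated with a short sketch. It assumes the common definition of k as the maximum absolute value of the normalized interaural cross-correlation function within a limited lag range; the slides do not state the definition used in the experiment, so this choice, the function name and the lag range are assumptions.

```python
import math

def interaural_coherence(left, right, max_lag):
    """Maximum absolute normalized cross-correlation for |lag| <= max_lag."""
    n = len(left)
    norm = math.sqrt(sum(x * x for x in left) * sum(x * x for x in right))
    if norm == 0.0:
        return 0.0
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        # Only sample pairs for which both indices are valid enter the sum.
        acc = sum(left[t] * right[t + lag]
                  for t in range(max(0, -lag), min(n, n - lag)))
        best = max(best, abs(acc) / norm)
    return best

# Usage: identical ear signals versus ear signals that share only one component.
n = 200
shared = [math.sin(2.0 * math.pi * 5.0 * t / n) for t in range(n)]
extra = [math.sin(2.0 * math.pi * 13.0 * t / n) for t in range(n)]
k_identical = interaural_coherence(shared, shared, 10)
k_partial = interaural_coherence(shared, [a + b for a, b in zip(shared, extra)], 10)
```

Identical ear signals give a value close to 1, and adding an uncorrelated component to one ear lowers the value, which matches the direction of the parameter k on this slide.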
critical band
about 700 Hz
reflected sounds
direct sound
left
lateral position
right
Binaural-Activity Map of the
Impulse Response of a Concert Hall
rendered by the model of binaural signal processing of Lindemann & Gaik 1986, 1990
plotted by Okabe 1997
critical band at 500 Hz
2 noises from Φ = 30° & 330°
90% correlated
Binaural Model – Fast PC Version
snapshots across bands
Hess 2004
Application to Sound Quality
─ The Layer Model ─
(layers ordered from higher to lower abstraction)
Aural-communication Quality, “sound of quality” – communication sciences
ideas, concepts, functionalities, content, sound as a sign carrier
Acoustic Quality, “quality of realization” – physics
physical attributes & properties, absence of acoustic distortions, physical form
Aural-scene Quality, “quality of presentation” – perceptual psychology
aural gestalt, authenticity, aural perspective, enhancement, sensory consistency, perceptual form
Auditive Quality, “quality of sound” – classical psychoacoustics
auditive attributes & properties
Blauert & Jekosch 2007, 2012
top-down processing
The „brain“ of the system contains explicit knowledge,
e.g., data bases, rule system, semantic networks, transition
probabilities, domain models, the history of the situation
object building
Gestalt rules
Blauert 1999, 2005
binaural-activity mapping
Intelligent Evaluation of
Binaural-Activity Maps
− The Periphery & Bottom-up Processing
ILDs, ITDs, interaural coherence
− Binaural-activity Mapping & Interpretation
Cocktail-party processing, quality assessment
− Higher-order Processes
Precedence effect, auditory-scene analysis,
Franssen effect
− Interactive Listening
An architecture for a comprehensive model of
binaural hearing
summing-localization region
localization-dominance region
region with two auditory events
lead
lag
inter-stimulus interval, ISI
Blauert: Spatial Hearing, 1983
The Precedence Effect
Do not mix up with the Haas Effect!
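The three regions along the inter-stimulus-interval axis can be expressed as a small classifier. The boundary values are assumptions and are not read from the plot: an upper limit of roughly 1 ms for summing localization is used as a default, and the echo threshold is left as a required parameter because it depends strongly on the signal. The function name is hypothetical.

```python
def precedence_region(isi_ms, echo_threshold_ms, summing_limit_ms=1.0):
    """Name the perceptual region for a lead-lag pair with the given ISI.

    `summing_limit_ms` and `echo_threshold_ms` are assumed boundaries;
    the echo threshold must be supplied for the signal at hand.
    """
    if isi_ms < 0:
        raise ValueError("ISI must be non-negative")
    if isi_ms < summing_limit_ms:
        return "summing localization"
    if isi_ms < echo_threshold_ms:
        return "localization dominance"
    return "two auditory events"

# Usage with an assumed echo threshold of 10 ms.
for isi in (0.3, 4.0, 30.0):
    print(isi, precedence_region(isi, echo_threshold_ms=10.0))
```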
Output of the Binaural Model for a Frontal Sound Plus One Lateral Reflection
left panel: cross correlation only
right panel: cross correlation plus contralateral inhibition
Short Clicks!
Psycho-Acoustic Results
no localization dominance
band-pass-noise bursts
200-ms duration, 500-Hz center frequency
Ongoing Sounds!
lag
about 0.7 kHz
100-ms noise
ISI ...10 ms
estimate of the interaural cross-correlation function
Output of the Jeffress Coincidence Processor
– Lag Frontally-Fixed, Lead Moving About –
Braasch 2005
lag
about 0.7 kHz
100-ms noise
ISI ...10 ms
estimate of the inhibited interaural cross-correlation function
Output of Lindemann’s Processor
with Lateral Inhibition
– Lag Frontally-Fixed, Lead Moving About –
Braasch 2005
broad band
100-ms noise
ISI ... 10 ms
estimate of the interaural cross-correlation function
Output of the Jeffress Coincidence Processor
– Lag Frontally Fixed, Lead Moving About –
Braasch 2005
broad band
100-ms noise
ISI = 10 ms
estimate of the inhibited interaural cross-correlation function
Output of Lindemann’s Coincidence
Processor
with Lateral Inhibition
Braasch 2005
Modified Lindemann Model
inhibited IACC
x…type I
+…type II
level diff
Dizon’s experimental paradigm
left-ear
signal
right-ear
signal
observation window
Dizon 2001, Dizon & Colburn 2006
Time Course
unusual approach using a monaural lead/lag pair
lead
stronger
lag
weaker
monaural!
Autocorrelation Evaluation of Filtered Signal
lead only
lead & lag
lead, after lag removal
error
monaural!
current model: matched filter
prior models: simple subtraction
Echo Canceller for a Simple Lead-Lag Pair
Braasch 2008
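The idea of removing a single lag from a monaural lead-lag pair can be sketched as follows. This is not the matched-filter model named on the slide; it is a simpler stand-in that estimates the delay and the relative amplitude of the lag from the autocorrelation function and then applies a recursive inverse filter. It assumes a mixture of the form x(t) = s(t) + a·s(t − d) with a lag weaker than the lead (a < 1) and a lead whose own autocorrelation is negligible at non-zero lags, as for a click or white noise. All names are hypothetical.

```python
import math

def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation of x for lags 0..max_lag."""
    n = len(x)
    return [sum(x[t] * x[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)]

def estimate_lag(x, max_lag):
    """Estimate delay d (samples) and relative amplitude a of a single lag."""
    r = autocorrelation(x, max_lag)
    if r[0] == 0.0:
        return 0, 0.0
    delay = max(range(1, max_lag + 1), key=lambda lag: r[lag])
    # For a white lead, r[d] / r[0] equals a / (1 + a^2); clamp to a valid range.
    ratio = min(max(r[delay] / r[0], 0.0), 0.5)
    if ratio == 0.0:
        return delay, 0.0
    gain = (1.0 - math.sqrt(1.0 - 4.0 * ratio * ratio)) / (2.0 * ratio)
    return delay, gain

def remove_lag(x, delay, gain):
    """Recursive inverse filter y(t) = x(t) - a * y(t - d)."""
    y = []
    for t, value in enumerate(x):
        echo = gain * y[t - delay] if delay > 0 and t >= delay else 0.0
        y.append(value - echo)
    return y

# Usage: a click at sample 10 plus a lag of half the amplitude 8 samples later.
lead = [0.0] * 64
lead[10] = 1.0
mixture = list(lead)
mixture[18] += 0.5
d, a = estimate_lag(mixture, 20)
recovered = remove_lag(mixture, d, a)
```

For signals with a coloured spectrum the autocorrelation peak is less distinct, so the delay and amplitude estimates of this sketch degrade.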
Cross-correlation Analysis for Binaural Signals
lead only
recovered
lead & lag
error
binaural!
model data
lead position
psychoacoustic data
very narrow band
lag position
Model Data with Echo Canceller
adaptive filters take time!
Binaural-Model Architecture with Monaural Echo Cancellers
Fast & Slow Melodies – Trumpet
(plot: F0 over time)
Echo Thresholds
Fast & Slow Melodies
(bar chart: echo threshold ET in ms, axis from 0 to 100, for L1 to L4)
F…fast, FM…fast in mixture, S…slow, SM…slow in mixture
Wolf‘s Experiment II
alternative plot
The Clifton Effect
Clifton 1987
Sound Sources Alternating in Space
Blauert & Col 1989
Virtual Test Room
Test-Room Switching
The Franssen Effect
Franssen 1960
− The Periphery & Bottom-up Processing
ILDs, ITDs, interaural coherence
− Binaural-activity Mapping & Interpretation
Quality assessment, cocktail-party processing
− Higher-order Processes
Precedence effect, Haas effect, auditory-scene analysis, Franssen effect
− Interactive Listening
An architecture for a comprehensive model of binaural hearing
Feed-back Paths to be Considered
● Feedback from the binaural-mapping stage, that is, the output of
auditory signal processing, to head-position control – for instance,
the so-called “turn-to” reflex
● Feedback from the cognitive stage to head-position control for
exploratory head movements
● Feedback from the segmentation stage to the signal-processing stage
to solve ambiguities by activating additional pre-processing routines,
for example, cocktail-party effect and/or precedence effect processing
● Feedback from the cognitive stage to the signal-processing stage,
to model efferent/reafferent effects of attention, such as by modifying
filter characteristics and/or concentrating on dominant spectral regions
● Feedback from the cognitive stage to the segmentation stage,
for instance, to request task-specific and/or action-specific information
on particular features
Blauert 1999, 2011
Blauert & Obermaier 2012
symbols
signals
An Architecture for a Model of Interactive Binaural Listening
indicating possible feedback loops, cross-modal-data input and
various expert modules
NB.: This is the architecture as
currently discussed in AABBA
Potential Applications for Binaural Algorithms
– looking back in time –
Blauert 1988
AABBA projects as of 2012
[1] Berlin Project (Raake): Application of binaural models
to multi-channel-reproduction assessment
QUALITY
[2] Bochum Project (Kolossa): Understanding of aural
scenes with a multi-feedback system and graphical
models
SCENE ANALYSIS
[3] Boston Project (Colburn): A binaural pre-processor
for speech recognition in complex environments
ENHANCEMENT
[4] Cardiff/Lyon Project (Culling): Predicting binaural
intelligibility for architectural acoustics
ENHANCEMENT
[5] Copenhagen Project (Dau): Binaural analysis of
complex listening environments
SCENE ANALYSIS
[6] Dresden Project (Jekosch/Altinsoy): Assessment of
human–machine interfaces employing binaural/tactile
interaction
MULTIMODAL
QUALITY
MEANING
AABBA
[7] Eindhoven Project (Kohlrausch): Binaural analysis of
acoustic scenes
ENHANCEMENT
[8] Helsinki Project (Pulkki): Assessment of parametric
spatial-audio coding with binaural models
QUALITY
[9] Munich Project (Hemmert): Evaluation of binaural cues
in electric hearing
QUALITY
[10] Oldenburg Project (Hohmann): Analysis of aural scenes with
multi-dimensional statistical filters
SCENE ANALYSIS
[11] Paris/Toulouse Project (Gas): Binaural audition in robotics
ROBOT AUDITION
[12] Patras Project (Mourjopoulos): Assessment and reduction
of reverberance and colouration using binaural models
QUALITY
ENHANCEMENT
[13] Troy–NY Project (Braasch): Binaural analysis of
telematic music
SCENE ANALYSIS
Further: A MATLAB Toolbox for Binaural Modeling will be provided
AABBA
What does the sound mean to me?
recognition, interpretation
FUNCTION
FORM
Where and how is the sound?
detection, perception
Audition and Cognition Come in Couples
Thank you!
Questions?
jens.blauert@rub.de
http://www.ruhr-uni-bochum.de/ika
©
Copyright note:
This material is not in the public domain.
The author(s) claim(s) all applicable rights.
However, permission to copy it is granted
under the condition that proper reference is
given to the author(s).
Corresponding author:
------------------------------------------------------------
Jens Blauert, Emeritus Professor of Acoustics
Institute of Communication Acoustics
Ruhr-Universitaet Bochum
D-44780 Bochum, Germany
Tel.: +49 234 322 2496 (direct: 3480)
Fax: +49 234 321 4165
e-mail: jens.blauert@rub.de
http://www.rub.de/ika
------------------------------------------------------------