Slides of Keynote Lecture #1 by Jens Blauert, D-Bochum

The Human Auditory System
Figure labels: medial geniculate nucleus, primary auditory cortex, inferior colliculus, cochlea, superior olivary complex

Prominent Features of Binaural Hearing
• Localization
  – Formation of the positions of auditory events (i.e., azimuth, elevation & distance)
  – Spatial extent of auditory events
• Suppression of
  – Directional information coming from reflections (e.g., precedence effect, summing localization)
  – Reverberation, coloration and noise
• Identification & Segregation of
  – Auditory streams (e.g., concurrent musical instruments, cocktail-party effect, warning signals)

Binaural Models and their Technological Application
Jens Blauert, D-Bochum

Application Area # 1 as Identified by AABBA
Spatial Scanning and Mapping of Auditory Scenes
Estimation of the positions and spatial extents of the auditory events that form an auditory scene (be it a natural scene, as in room acoustics, or a virtual scene, as in virtual-reality applications or at the play-back end of audio systems), including spatially diffuse auditory events, often perceived as components of reverberance
The AABBA grouping, 2009

What is AABBA?
An international grouping of 14 laboratories (since 2009), dealing with Aural Assessment By Means of Binaural Algorithms

Application Area # 2 as Identified by AABBA
Analysis of Auditory Scenes with the Aim of Deriving Parametric Representations at the Signal Level
Estimates of these parameters may be used, for instance,
● For coding and/or re-synthesis of auditory scenes
● For speech-enhancement in complex acoustic environments, incl.
hearing aids
● For systems to enhance the spatial perception in sound fields, such as better localization and/or a better sense of envelopment; further, decolouration and de-reverberation
● For the identification of perceptual invariances of auditory scenes
The AABBA grouping, 2009

Application Area # 3 as Identified by AABBA
Analysis of Auditory Scenes with the Aim of Deriving Parametric Representations at the Symbolic Level
For example,
● Identification of determinants of meaning contained in binaural-activity maps
● Assignment of meaningful symbols to the output of binaural models
The AABBA grouping, 2009

Application Area # 4 as Identified by AABBA
Evaluation of Auditory Scenes in Terms of Quality
Where quality will strictly be judged from the user’s point of view, for example,
● Quality of “the acoustics” of spaces for musical performances
● Quality of systems for holophonic representation of auditory scenes, such as auditory displays and virtual-reality generators
● Spatial quality of audio-systems (for recording, transmission and play-back), incl. systems that employ perceptual coding
● Quality of speech-enhancement systems, incl.
hearing aids
The AABBA grouping, 2009

Binaural Models and their Technological Application
● Introduction
  − Prominent Features of Binaural Hearing
  − Generic Application Areas of Binaural Models
● Model Architectures
  − The Periphery & Bottom-up Processing: ILDs, ITDs, interaural coherence
  − Binaural-activity Mapping & Interpretation: cocktail-party processing, sound-quality assessment
  − Higher-order Processes: precedence effect, auditory-scene analysis, Franssen effect
  − Interactive Listening: architecture for a comprehensive model of binaural hearing
● Conclusion and Outlook

Model Architectures
− The Periphery & Bottom-up Processing: ILDs, ITDs, interaural coherence
− Binaural-activity Mapping & Interpretation: cocktail-party processing, quality assessment
− Higher-order Processes: precedence effect, auditory-scene analysis, Franssen effect
− Interactive Listening: architecture for a comprehensive model of binaural hearing

Block Diagram of the Binaural-Analysis System of IKA Bochum
Figure labels: binaural-activity map, interaural-level-difference analysis, interaural-time-difference analysis (2006)
Pure bottom-up processing, signal driven!

Ear-Adequate Band-Pass-Filter Bank
Figure label: spectral decomposition

A Simplified Functional Model of the Hair Cells
Figure labels: envelope extraction, compressive A/D conversion (compressive!), Probabilistic Spike Generator

The Jeffress Processor Estimates Interaural Cross Correlation
$\Psi_y(\tau) = \frac{1}{t_1 - t_0} \sum_{t = t_0}^{t_1} y_l(t)\, y_r(t + \tau)$
Similarity to Jeffress’ coincidence model: delay taps $\tau_k, \tau_{k+1}, \ldots, \tau_{k+8}$ (Braasch 2002)

Contralateral Inhibition after Lindemann
Figure labels: IACC, contralaterally-inhibited IACC

Cross-Correlation vs. Lindemann
Impulsive input signals with no ITD, but with an ILD, looked at in an auditory “critical” band at about 800 Hz (schematic plot)

“Natural Combinations” of ITDs and ILDs (Gaik 1988)

Gaik’s Extension: How to learn the ears of its owner?
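As a rough illustration of the cross-correlation estimate on the Jeffress slide above, the following Python sketch evaluates Ψ(τ) over a range of lags and takes the lag of the maximum as an ITD estimate. The function names, the sampling rate, the normalization by the window length in samples, and the synthetic test signal are assumptions made for this example and do not come from the slides. Lindemann's contralateral inhibition and Gaik's weighting are not modelled here.

```python
import numpy as np

def interaural_cross_correlation(y_l, y_r, max_lag):
    """Estimate Psi(tau) = (1/N) * sum_t y_l(t) * y_r(t + tau) for |tau| <= max_lag.

    N is the window length in samples (an assumed stand-in for t1 - t0).
    """
    y_l = np.asarray(y_l, dtype=float)
    y_r = np.asarray(y_r, dtype=float)
    n = len(y_l)
    lags = np.arange(-max_lag, max_lag + 1)
    psi = np.empty(len(lags))
    for i, tau in enumerate(lags):
        if tau >= 0:
            # Pair y_l(t) with y_r(t + tau) for all t where both samples exist.
            psi[i] = np.dot(y_l[:n - tau], y_r[tau:]) / n
        else:
            psi[i] = np.dot(y_l[-tau:], y_r[:n + tau]) / n
    return lags, psi

def estimate_itd(y_l, y_r, fs, max_lag):
    """Return the lag (in samples and in seconds) at which Psi(tau) peaks."""
    lags, psi = interaural_cross_correlation(y_l, y_r, max_lag)
    best_lag = int(lags[np.argmax(psi)])
    return best_lag, best_lag / fs

# Hypothetical test signal: white noise that reaches the right ear 5 samples late.
rng = np.random.default_rng(0)
fs = 48_000      # assumed sampling rate in Hz
delay = 5        # assumed interaural delay in samples
y_l = rng.standard_normal(2000)
y_r = np.concatenate([np.zeros(delay), y_l[:-delay]])

best_lag, itd_seconds = estimate_itd(y_l, y_r, fs, max_lag=40)
print(best_lag, itd_seconds)
```

With this sign convention a positive lag means that the right-ear signal lags the left-ear signal. In a model of the kind outlined in the slides, such an estimate would presumably be computed per critical band, after the filter-bank and hair-cell stages, rather than on the broadband waveform as done here.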
Weighted Contralateral Inhibition (Gaik 1988)
Figure label: center frequency of critical band

Output of the Jeffress-Lindemann-Gaik Model
Frontal broad-band sound source
Figure axis: left < lateral deviation > right

A “Figurative” Plot of the Output of the Gaik Processor

Binaural-Model Estimation of ITDs for Natural Hearing and Amplitude-Difference Stereophony (Blumlein) (Blauert & Braasch 2008)
Figure axes: presented angle, estimated angle
Figure labels: natural hearing, Blumlein stereophony, loudspeakers 60° apart, loudspeakers 45° apart

Model Architectures
− The Periphery & Bottom-up Processing (ILDs, ITDs, interaural coherence)
− Binaural-activity Mapping & Interpretation: cocktail-party processing, quality assessment
− Higher-order Processes: precedence effect, auditory-scene analysis, Franssen effect
− Interactive Listening: an architecture for a comprehensive model of binaural hearing

Schematic Plot of a Typical Binaural Model
Figure labels: binaural-activity mapping; recognition, interpretation and judgement

Examples for Binaural Activity Maps of Speech
Figure panels: 2 concurrent talkers; “dry” talker; reverberant talker; de-reverbed talker

Architecture of a Cocktail-Party Processor, Based on Binaural Modelling
Block labels: direction finder, weight assignment, Binaural Model, Control Unit, source selection, Wiener Filter, spectral decomposition, speech cleansing

Figure panels: one noise source; two concurrent talkers
Annotation: A human listener would not localize the reflection!
Figure panels: two moving talkers; talker plus wall reflection
Source Tracking with the Binaural Model (Bodden 1993)

Spatial Selectivity of a Model of Binaural Hearing (Bodden 1993)

Sample Output of the Binaural Model
Taken at an instant t = t0 from a running correlogram

Impulse Response and Running Interaural Cross-correlation in a Room with Reflecting Walls
One frontal sound source, sending out a musical chord

Model Output with Incoherent Ear-Input Signals
Prediction; figure axis: left < center > right
Parameter: degree of interaural coherence, k

Area Covered by Auditory Events as a Function of the Amount of Interaural Coherence, k
Pink noise, 12 subjects; after Blauert & Lindemann, 1985 (plot enhanced for contrast)

Binaural-Activity Map of the Impulse Response of a Concert Hall
Rendered by the model of binaural signal processing of Lindemann & Gaik 1986, 1990; plotted by Okabe 1997
Critical band at about 700 Hz
Figure labels: direct sound, reflected sounds; axis: left, lateral position, right

Binaural Model: Fast PC Version (Hess 2004)
Critical band at 500 Hz; 2 noises from Φ = 30 & 3300, 90% correlated
Snapshots across bands

Application to Sound Quality: The Layer Model (Blauert & Jekosch 2007, 2012)
Layers, ordered from higher to lower abstraction:
● Aural-communication Quality, “sound of quality” (communication sciences): ideas, concepts, functionalities, content, sound as a sign carrier
● Acoustic Quality, “quality of realization” (physics): physical attributes & properties, absence of acoustic distortions, physical form
● Aural-scene Quality, “quality of presentation” (perceptual psychology): aural gestalt, authenticity, aural perspective, enhancement, sensory consistency, perceptual form
● Auditive Quality, “quality of sound” (classical psychoacoustics): auditive attributes & properties

Figure labels: brain of the system, top-down processing
The “brain” of the system contains explicit knowledge: e.g., data bases, rule system, semantic networks, transition probabilities, domain models, the history of the situation
Figure labels: object building, Gestalt rules
Blauert 1999, 2005
Figure label: binaural-activity mapping
Intelligent Evaluation of Binaural-Activity Maps

Model Architectures
− The Periphery & Bottom-up Processing: ILDs, ITDs, interaural coherence
− Binaural-activity Mapping & Interpretation: cocktail-party processing, quality assessment
− Higher-order Processes: precedence effect, auditory-scene analysis, Franssen effect
− Interactive Listening: an architecture for a comprehensive model of binaural hearing

The Precedence Effect (Blauert: Spatial Hearing, 1983)
Figure regions: summing-localization region; localization-dominance region; region with two auditory events
Figure labels: lead, lag; axis: inter-stimulus interval, ISI
Do not mix up with the Haas Effect!

Output of the Binaural Model for a Frontal Sound Plus One Lateral Reflection
Short clicks!
Left panel: cross correlation only
Right panel: cross correlation plus contralateral inhibition

Psycho-Acoustic Results
Ongoing sounds!
Band-pass-noise bursts, 200-ms duration, 500-Hz center frequency
No localization dominance

Output of the Jeffress Coincidence Processor, Lag Frontally Fixed, Lead Moving About (Braasch 2005)
Estimate of the interaural cross-correlation function
Figure labels: lag; about 0.7 kHz; 100-ms noise; ISI ...10 ms

Output of Lindemann’s Processor with Lateral Inhibition, Lag Frontally Fixed, Lead Moving About (Braasch 2005)
Estimate of the inhibited interaural cross-correlation function
Figure labels: lag; about 0.7 kHz; 100-ms noise; ISI ...10 ms

Figure labels: broad band; 100-ms noise; ISI ...
10 ms; estimate of the interaural cross-correlation function
Output of the Jeffress Coincidence Processor, Lag Frontally Fixed, Lead Moving About (Braasch 2005)

Output of Lindemann’s Coincidence Processor with Lateral Inhibition (Braasch 2005)
Estimate of the inhibited interaural cross-correlation function
Figure labels: broad band; 100-ms noise; ISI = 10 ms

Modified Lindemann Model
Figure labels: inhibited IACC; level diff; x…type I; +…type II

Dizon’s experimental paradigm (Dizon 2001, Dizon & Colburn 2006)
Figure labels: left-ear signal, right-ear signal, observation window

Time Course
Unusual approach using a monaural lead/lag pair
Figure labels: lead stronger, lag weaker (monaural!)

Autocorrelation Evaluation of Filtered Signal
Figure panels: lead only; lead & lag; lead, after lag removal; error (monaural!)

Echo Canceller for a Simple Lead-Lag Pair (Braasch 2008)
Current model: matched filter
Prior models: simple subtraction

Cross-correlation Analysis for Binaural Signals
Figure labels: lead only, recovered, lead & lag, error (binaural!)

Model Data with Echo Canceller
Figure labels: model data, psychoacoustic data, lead position, lag position, very narrow band
Adaptive filters take time!
Binaural-Model Architecture with Monaural Echo Cancellers

Fast & Slow Melodies: Trumpet
Figure axes: F0, time

Echo Thresholds, Fast & Slow Melodies
Figure: echo threshold (ET, in ms) for conditions L1 to L4; bar values not recoverable from the extracted text
F…fast, FM…fast in mixture, S…slow, SM…slow in mixture

Wolf’s Experiment II (alternative plot)

The Clifton Effect (Clifton 1987)

Sound Sources Alternating in Space (Blauert & Col 1989)

Virtual Test Room

Test-Room Switching

The Franssen Effect (Franssen 1960)

Model Architectures
− The Periphery & Bottom-up Processing: ILDs, ITDs, interaural coherence
− Binaural-activity Mapping & Interpretation: quality assessment, cocktail-party processing
− Higher-order Processes: precedence effect, Haas effect, auditory-scene analysis, Franssen effect
− Interactive Listening: an architecture for a comprehensive model of binaural hearing

Feed-back Paths to be Considered
● Feedback from the binaural-mapping stage, that is, the output of auditory signal processing, to head-position control, for instance, the so-called “turn-to” reflex
● Feedback from the cognitive stage to head-position control for exploratory head movements
● Feedback from the segmentation stage to the signal-processing stage to solve ambiguities by activating additional pre-processing routines, for example, cocktail-party effect and/or precedence effect processing
● Feedback from the cognitive stage to the signal-processing stage, to model efferent/reafferent effects of attention, such as by modifying filter characteristics and/or concentrating on dominant spectral regions
● Feedback from the cognitive stage to the segmentation stage, for instance, to request task-specific and/or action-specific information on particular features

An Architecture for a Model of Interactive Binaural Listening (Blauert 1999, 2011; Blauert & Obermaier 2012)
Indicating possible feedback loops, cross-modal-data input and various expert modules
Figure labels: symbols, signals
NB: This is the architecture as currently discussed in AABBA
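The first feedback path listed above (binaural-mapping stage to head-position control, the “turn-to” reflex) can be sketched as a simple closed loop. This is only a toy illustration under strong assumptions: the binaural-mapping stage is replaced by a stub that reports the exact source azimuth relative to the head, and the function names, the proportional gain, the tolerance and the step limit are all hypothetical and not taken from the slides or from any AABBA software.

```python
def wrap_deg(angle):
    """Wrap an angle in degrees to the interval [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def turn_to(source_az, head_az=0.0, gain=0.5, tol=1.0, max_steps=50):
    """Rotate the head toward a source until it is within `tol` degrees of frontal.

    Returns the list of head azimuths visited, starting with the initial one.
    """
    trajectory = [head_az]
    for _ in range(max_steps):
        # Stub for the binaural-mapping stage (assumed ideal): the lateral
        # position of the auditory event relative to the current head direction.
        relative_az = wrap_deg(source_az - head_az)
        if abs(relative_az) <= tol:
            break
        # Feedback to head-position control: turn a fraction of the way.
        head_az = wrap_deg(head_az + gain * relative_az)
        trajectory.append(head_az)
    return trajectory

trajectory = turn_to(source_az=60.0)
print(trajectory)
```

A real system would have to cope with estimation noise and front-back confusions in the binaural map, which is one reason why the slides list exploratory head movements driven by the cognitive stage as a separate feedback path.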
Potential Applications for Binaural Algorithms, looking back in time (Blauert 1988!)

AABBA projects as of 2012
[1] Berlin Project (Raake): Application of binaural models to multi-channel-reproduction assessment (QUALITY)
[2] Bochum Project (Kolossa): Understanding of aural scenes with a multi-feedback system and graphical models (SCENE ANALYSIS)
[3] Boston Project (Colburn): A binaural pre-processor for speech recognition in complex environments (ENHANCEMENT)
[4] Cardiff/Lyon Project (Culling): Predicting binaural intelligibility for architectural acoustics (ENHANCEMENT)
[5] Copenhagen Project (Dau): Binaural analysis of complex listening environments (SCENE ANALYSIS)
[6] Dresden Project (Jekosch/Altinsoy): Assessment of human–machine interfaces employing binaural/tactile interaction (MULTIMODAL QUALITY MEANING)
[7] Eindhoven Project (Kohlrausch): Binaural analysis of acoustic scenes (ENHANCEMENT)
[8] Helsinki Project (Pulkki): Assessment of parametric spatial-audio coding with binaural models (QUALITY)
[9] Munich Project (Hemmert): Evaluation of binaural cues in electric hearing (QUALITY)
[10] Oldenburg Project (Hohmann): Analysis of aural scenes with multi-dimensional statistical filters (SCENE ANALYSIS)
[11] Paris/Toulouse Project (Gas): Binaural audition in robotics (ROBOT AUDITION)
[12] Patras Project (Mourjopoulos): Assessment and reduction of reverberance and colouration using binaural models (QUALITY ENHANCEMENT)
[13] Troy–NY Project (Braasch): Binaural analysis of telematic music (SCENE ANALYSIS)
Further: A MATLAB Toolbox for Binaural Modeling will be provided

Audition and Cognition Come in Couples
● FUNCTION: What does the sound mean to me? (recognition, interpretation)
● FORM: Where and how is the sound? (detection, perception)

Thank you! Questions?
jens.blauert@rub.de
http://www.ruhr-uni-bochum.de/ika

© Copyright note: This material is not in the public domain. The author(s) claim(s) all applicable rights.
However, permission to copy it is granted under the condition that proper reference is given to the author(s).

Corresponding author:
Jens Blauert, Emeritus Professor of Acoustics
Institute of Communication Acoustics
Ruhr-Universitaet Bochum
D-44780 Bochum, Germany
Tel.: +49 234 322 2496 (direct: 3480)
Fax: +49 234 321 4165
e-mail: jens.blauert@rub.de
http://www.rub.de/ika