International Telecommunication Union

Speech Enhancement Methods for Vehicle Applications
Tim Haulick, TEMIC Speech Dialog Systems
"The Fully Networked Car, A Workshop on ICT in Vehicles"
ITU-T, Geneva, 2-4 March 2005

Contents
o Overview
o Beamforming for vehicle applications
  • Principle
  • Examples
  • Adaptive self-calibration
  • Wind-noise suppression
  • Compact dual-microphone array
o Bandwidth extension
o In-car communication
  • Concept
  • Results of speech quality and intelligibility tests

Integrated Hands-Free System
[Block diagram: microphone array, EC blocks, adaptive beamforming, noise reduction, speech recognizer, phone, bandwidth extension, speech output]

Beamforming: Structure of a generalized beamformer
[Figure: a target signal and an interference arrive at a microphone array m0 ... mM-1; each channel passes a steering delay T0 ... TM-1 (alignment) and a filter (beampattern); the filtered channels are summed to form the output signal]

Beamforming
[Figure: microphone array with speech signal and interference directions; patterns of the adaptive beamformer and the fixed beamformer; scale -20, -10, 0]

Beamforming: Beampattern at 1500 Hz for a rotating noise source
4 microphones, d = 5 cm
[Figure: attenuation [dB], scale -20 to 0, with the speech signal direction marked; blue: fixed beamformer, red: adaptive beamformer]

Beamforming: Microphone Array Integration
4-Microphone Array Integrated in Interior Mirror
o Cost-efficient integration due to integrated microphone module
o A fixed steered beamformer could be used, as the driver's direction of arrival (DOA) varies only within a small range (62°-75°)
o Microphone array could be used by driver and co-driver
[Figure: mirror with microphone module, 5 cm]

Beamforming Examples: Driving Situation 120 km/h
[Figure: attenuation [dB] over time [s] and frequency [Hz]]
Significantly higher noise suppression in the low frequency range with the adaptive beamformer compared to the delay & sum beamformer.

Beamforming Examples: Interfering Co-Driver
[Figure: signals over time [s], 0 to 6, for single microphone, fixed beamformer, and adaptive beamformer; segments labeled co-driver and driver/co-driver]
Suppression of interferer > 15 dB by adaptive beamformer.

Beamforming Examples: Speech Recognition Tests
50 speakers, 1000 digit strings with 9000 digits in total per situation
[Figure: bar chart, reduction of word error rate in % (axis 0 to 70) relative to a single microphone, for 130 km/h, 160 km/h, and 100 km/h with defroster on]
Reduction of word error rate by adaptive beamformer is significantly higher than 50%!

Beamforming: Adaptive Self-Calibration
Problem: Beamformers are very sensitive to a mismatch of the microphones. Mutual deviations of the individual microphones may cause significant distortion of the beamformer output signal. Deviations inevitably occur due to fabrication tolerances and aging of the microphones.
Solution: The mutual deviations of the microphones are compensated in a preprocessing unit which adjusts itself adaptively without being noticed by the driver.
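The adaptive compensation described in the solution can be pictured with a small sketch. The slides do not say how the preprocessing unit works, so everything below is an assumption for illustration: the mismatch is modeled as a pure gain error per microphone, the reference is the mean of all channels, and each channel gets one gain that is adapted with a normalized LMS step. The names `self_calibrate`, `mu`, and `true_sens` are hypothetical.

```python
import numpy as np

def self_calibrate(x, mu=0.05, eps=1e-8):
    """Adapt one gain per microphone so all channels match their common mean.

    x: array of shape (n_samples, n_mics).
    Assumption: microphones differ only by a frequency-independent gain.
    """
    n_samples, n_mics = x.shape
    gains = np.ones(n_mics)
    calibrated = np.empty_like(x, dtype=float)
    for n in range(n_samples):
        reference = x[n].mean()           # common target for all channels
        calibrated[n] = gains * x[n]      # current calibrated output
        error = reference - calibrated[n]
        # Normalized LMS update of the single-tap gain of each channel
        gains += mu * error * x[n] / (x[n] ** 2 + eps)
    return gains, calibrated

# Synthetic check: one source picked up with different sensitivities (made-up values)
rng = np.random.default_rng(0)
source = rng.standard_normal(5000)
true_sens = np.array([1.0, 0.8, 1.25, 0.9])
mics = source[:, None] * true_sens

gains, calibrated = self_calibrate(mics)
print("effective sensitivities:", np.round(gains * true_sens, 4))
```

After adaptation the product of learned gain and true sensitivity is the same for every channel, which is the condition a subsequent beamformer relies on. A real system would more likely need frequency-dependent filters and adaptation control during speech pauses; neither is shown here.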
Benefits:
o A costly calibration of the beamformer can be avoided
o Aging effects are tracked
[Figure: self-calibration block followed by beamformer]

Beamforming: Adaptive Self-Calibration
[Figure: signals over time [s], 0 to 16, for microphone signal, adaptive beamformer, and adaptive beamformer with self-calibration]

Beamforming: Wind Noise Suppression
Problem: Wind noise can cause strong pulse-like disturbances of the microphone signals. In cars, this problem is mainly caused by the fan or the open top of a convertible. For design reasons or lack of space, the standard wind shield of the microphones is often insufficient.
Solution: Suppression of wind buffets by a (multi-channel) wind-noise suppression algorithm
[Figure: microphone signal (Mic 1) and beamformer output signal over time [s], 0 to 10; amplitude scale x 10^4, -3 to 3]

Compact Dual-Microphone-Array
[Figure: housing with optimized wind protection, holding one microphone directed to the driver and one directed to the co-driver; signals over time [s], 0 to 8, amplitude scale x 10^4, for microphone signal (driver) (Mic), processed: hands-free mode (HF), and processed: recognizer mode (Rec); segments labeled driver and co-driver]

Bandwidth Extension
Problem: Degradation of speech quality due to the bandwidth limitation of the telephone network
Solution: Extrapolation of missing frequency components from the received speech signal
[Figure: telephone network followed by bandwidth extension]

In-Car Communication
Current Situation:
o Communication between passengers is difficult because of the acoustic loss (especially front to back)
o Front passengers have to speak louder than normal; longer conversations become tiring
o Driver turns around, which reduces road safety
Solution:
o Improve the speech quality and intelligibility by means of an intercom system
Application:
o Mid and high class automobiles, which are already equipped with the necessary audio and signal processing
o Vans, etc. → systems with reduced quality
[Figure: passenger compartment, acoustic loss -5…-15 dB*]
*Acoustic loss (referred to the ear of the driver)

In-Car Communication: Implementation
One-Way System
o 2-4 microphones
o 2-4 loudspeakers
Two-Way System
o 4-8 microphones
o 6-8 loudspeakers
[Figure: intercom system for each configuration]

In-Car Communication: Signal Processing Components
Algorithmic Structure for One Direction (Front → Rear):
[Block diagram with the blocks: echo cancellation, feedback cancellation, front loudspeakers, front microphones, preprocessing, beamforming, rear loudspeakers, feedback suppression, loss control, postprocessing]
Problems and Challenges:
o Stability
o System delay
o Correlation of excitation and distortion

In-Car Communication: Demo System
4 microphones within the front top control unit
2x2 microphones (integrated within the rear grab handles)

In-Car Communication: Subjective Tests
o Prerecorded speech examples with different Lombard levels were played back via an artificial mouth
o Binaural recordings were made by means of a HEADacoustics NoiseBook on the seat behind the driver
o Driving situations
  • 0 km/h beside motorway
  • 130 km/h on motorway

In-Car Communication: Results of Speech Quality Test (CMOS Test)
25 signal pairs per driving situation (intercom on/off), 15 listeners per scenario
o 0 km/h, vehicle parked close to a motorway
  • 19.7% prefer the system to be switched off
  • 29.7% have no preference
  • 50.7% prefer the system to be switched on
o 130 km/h, motorway
  • 4.3% prefer the system to be switched off
  • 7.1% have no preference
  • 88.6% prefer the system to be switched on

In-Car Communication: Results of Speech Intelligibility Test (MRT)
48 utterances were presented to each listener per driving situation
o 0 km/h, vehicle parked close to a motorway
  • No significant difference (95.2% correct answers for system on versus 95.0% for system off)
  • Due to the automatic gain adjustment, the intercom system operates with only very small gain at these noise levels
o 130 km/h, motorway
  • Significant improvement of speech intelligibility by the intercom
  • Nearly 50% error reduction

Conference Calls / In-Car Communication
System Functionality:
o Multi-channel hands-free system for driver and co-driver or passengers on the back seats
o Conference calls with up to 4 partners
o Intercom functionality between passengers in the front and in the back
o Speech recognition capabilities available for all seats
[Figure: GSM conference call ("Hello...") with intercom functionality from the front to the back]

Thank you for your attention!
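Returning to the beamforming slides: the fixed (delay & sum) case of the generalized beamformer can be sketched numerically for the array quoted there (4 microphones, d = 5 cm, 1500 Hz). This is a minimal illustration, not the presented system. The uniform linear geometry, the far-field plane-wave model, the angle measured from the array axis, the speed of sound of 343 m/s, and the steering angle of 70° (picked from the 62°-75° driver range) are all assumptions, and the adaptive beamformer is not modeled.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed value

def steering_vector(angle_deg, n_mics=4, spacing=0.05, freq=1500.0):
    """Far-field phase terms of a uniform linear array (angle from array axis)."""
    positions = np.arange(n_mics) * spacing
    delays = positions * np.cos(np.deg2rad(angle_deg)) / SPEED_OF_SOUND
    return np.exp(-2j * np.pi * freq * delays)

def delay_and_sum_db(source_deg, steer_deg, **array):
    """Level in dB of a plane wave from source_deg when steered to steer_deg."""
    weights = steering_vector(steer_deg, **array)
    wave = steering_vector(source_deg, **array)
    # vdot conjugates the weights, i.e. it undoes the steering delays, then sums
    gain = np.abs(np.vdot(weights, wave)) / len(weights)
    return 20.0 * np.log10(max(gain, 1e-12))

STEER_DEG = 70.0  # hypothetical driver direction
pattern = {angle: delay_and_sum_db(angle, STEER_DEG) for angle in range(0, 181, 30)}
for angle, level in pattern.items():
    print(f"{angle:3d} deg: {level:6.1f} dB")
```

With only 15 cm of aperture at 1500 Hz the fixed pattern is broad, so sources away from the steering direction are attenuated by a few dB at most in many directions. This is consistent with the slides' observation that the adaptive beamformer suppresses noise and interferers far more than the delay & sum beamformer.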