ITU Workshop on Making Media Accessible to all: the Options and the Economics
(Geneva, Switzerland, 24 – 25 October 2013)

Clean Audio

Harald Fuchs, Fraunhofer IIS
harald.fuchs@iis.fraunhofer.de

What is “Clean Audio”?

- The term “Clean Audio” is defined in the DVB specification for Audio and Video coding as follows:
  “Clean Audio refers to audio providing improved intelligibility. It is targeted for viewers with hearing impairments, but can as well serve as improvement for listening in noisy environments like airplanes.”

Challenge: Finding the Right Mix

- Audio mix with fixed balance between dialogue and background is always a compromise
- Hearing impaired people require a higher loudness of the dialogue
- Non-native speakers need about 3 dB higher S/N
- The listening environment has an influence on the preferred setting of the mix

Options for Clean Audio

(1) Transmit several different mixes
(2) Transmit separate audio sources
(3) Receiver-side Post-Processing
(4) Object-based Dialogue Enhancement

(1) Transmit several different mixes

- Pro
  - Backwards-compatible with existing receivers
- Cons
  - Only limited number of mixes possible
    - Number of parallel audio streams in production workflow is one limiting factor
  - High additional data-rate
    - one complete audio stream per additional mix
  - No individual mix of speech/background

(2) Transmit separate audio sources

- Pro
  - Individual mix of speech vs. background
- Cons
  - Only backwards compatible if default mix is sent in addition
  - High additional data-rate
    - -> 2 or 3 complete audio streams
  - Separate sources have to be available in production workflow

(3) Receiver-side Post-Processing

- Processing to identify and enhance speech parts of a mixed audio signal
- Pros
  - No additional data-rate
  - Requires no changes in production
- Cons
  - Maximum enhancement is limited
  - Depends on signal characteristics
  - Depends on available CPU resources

(4) Object-based Dialogue Enhancement

- Goal
  - Enable similar flexibility as with separate sources
  - Less bitrate overhead compared to several mixes or separate sources
- Solution: Parametric, object-based side-info
  - Transmission of only one audio mix
  - Parameters sent with the mix, used in receiver to change the balance of mix

Signal Flow Dialogue Enhancement

[Figure: signal flow diagram]

Object-based Dialogue Enhancement

- Pros
  - Flexibility: Individual mix of speech vs. background is possible
  - Bitrate efficient: less overhead compared to separate source delivery
  - Backward compatible: Default mix is always present
- Cons
  - Separate dialogue source in production workflow, encoder update necessary

Object-based Dialogue Enhancement

- Technology
  - Based on MPEG Spatial Audio Object Coding (SAOC) standard
  - Definition of Dialogue Enhancement Profile (SAOC-DE) on-going
- Standardization process on-going in DVB
  - Add SAOC-DE as Advanced Clean Audio to the DVB specification for audio and video coding

Listening Test

Speech Intelligibility Listening Test

- Test groups:
  - 10 people with medium age related hearing impairments
  - Group of 10 people with normal hearing as reference group
- German Sentence Test to measure speech intelligibility
  - Sentences of five words with 10 options for each word
  - Count correctly identified words

Speech Intelligibility Listening Test

- Two different noise signals
  - Speech-shaped noise (SSN)
    - similar frequency characteristic as speech
  - Applause
- Default mix of speech and noise
  - Set to a target level of 50% intelligibility for the hearing-impaired group
  - In the test a slightly lower level was achieved (46% and 34%)

Listening Test Results

- Enhancement of 12 dB
  - intelligibility for both noise signals up to 80-90%
  - Similar to the intelligibility of normal hearing listeners at the default mix

[Figure: Speech Intelligibility / % correct (0% to 100%) for SSN; conditions: Normal-hearing listeners, Default mix, 6 dB enhancement, 12 dB enhancement]

Listening Test Results (2)

- Comparison of results for both noise signals
  - Reference group: lower for Applause
  - Bigger difference for 6 dB enhancement between SSN and Applause compared to 12 dB

[Figure: Speech Intelligibility / % correct (0% to 100%) for Applause; conditions: Normal-hearing listeners, Default mix, 6 dB enhancement, 12 dB enhancement]

Conclusions

- Personalization for improved intelligibility
  - Enable the audience to change the balance of dialogue vs. background
- Several options available for Clean Audio
  - with different advantages and disadvantages
- New advanced Clean Audio solution
  - currently under standardization at DVB
- Test Result: enhancement of 6 to 12 dB
  - useful range for adaptation to personal preferences and listening environment
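
Illustration: what a 6 or 12 dB dialogue enhancement means for the mix

The enhancement values in the listening test are level changes of the dialogue relative to the background. The sketch below shows the arithmetic only. It assumes that dialogue and background are available as two separate sample lists, which is a simplification: an SAOC-DE receiver works from a single mix plus parametric side information, and that processing is not modelled here. The function names (db_to_linear, enhance_dialogue, snr_db) and the toy sine signals are hypothetical and chosen for illustration.

```python
import math


def db_to_linear(gain_db: float) -> float:
    """Convert an amplitude gain in dB to a linear scale factor."""
    return 10.0 ** (gain_db / 20.0)


def rms(samples) -> float:
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(dialogue, background) -> float:
    """Dialogue-to-background level ratio in dB."""
    return 20.0 * math.log10(rms(dialogue) / rms(background))


def enhance_dialogue(dialogue, background, enhancement_db: float):
    """Return (boosted dialogue, new mix) for a given enhancement in dB.

    Assumption: separate dialogue and background signals are available.
    """
    gain = db_to_linear(enhancement_db)
    boosted = [gain * d for d in dialogue]
    mix = [d + b for d, b in zip(boosted, background)]
    return boosted, mix


# Toy signals (assumption): two sine waves of equal level standing in
# for dialogue and background.
N = 480
dialogue = [0.1 * math.sin(2 * math.pi * 5 * i / N) for i in range(N)]
background = [0.1 * math.sin(2 * math.pi * 37 * i / N) for i in range(N)]

for enhancement in (0.0, 6.0, 12.0):
    boosted, mix = enhance_dialogue(dialogue, background, enhancement)
    print(
        f"{enhancement:4.1f} dB enhancement: "
        f"dialogue gain x{db_to_linear(enhancement):.2f}, "
        f"dialogue/background ratio {snr_db(boosted, background):.1f} dB"
    )
```

In amplitude terms, 6 dB corresponds to roughly doubling the dialogue and 12 dB to roughly quadrupling it, while the background stays untouched.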