ITU Workshop on
“Making Media Accessible to all:
The options and the economics”
(Geneva, Switzerland, 24 (p.m.) – 25 October 2013)
Speech signal processing for media accessibility
Takayuki Ito, Dr. Eng.
Executive Research Engineer,
NHK Engineering System, Inc.
itou.takayuki@nes.or.jp
Geneva, Switzerland, 24 October 2013
Ageing: A Global Issue
The population of elderly persons is increasing globally because of declining fertility rates.
Japan, aged 65 and over: 23% in 2010, 36% in 2040
Elderly persons need to be provided with the opportunity to continue contributing to society.
(UN 2002 Madrid International Plan of Action on Ageing)
From “supported” to “supporting”
Ageing: degradation of hearing
Hearing loss especially in higher frequencies
Hearing aids are available.
Background sound interferes with understanding speech.
Better mixing balance for TV programs is needed.
Degradation of cognitive speed
Slower speech rate is preferable.
Compensating for these degradations makes social participation easier for the elderly.
Speech rate conversion technology
Speech rate conversion for elderly people
Elderly viewers sometimes complain, “Speech in recent TV programs is too fast for me to understand.”
There is a need to slow down the speech rate without degrading sound quality.
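[Figure: waveform segments ① to ⑩ on a time axis. Original: ①②③④⑤⑥⑦⑧⑨⑩. Faster: segments ③ and ⑧ are removed (①②④⑤⑥⑦⑨⑩). Slower: segments ③ and ⑧ are repeated (①②③③④⑤⑥⑦⑧⑧⑨⑩). For comparison, analog elongation stretches every segment along the time axis.]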
TV and radio sets with a “Slow button”
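The segment picture above can be sketched in a few lines of Python. This is only an illustration of the repeat/drop idea under simple assumptions: the function name convert_rate and the choice of which segments to pick are hypothetical, and a practical converter would also have to choose segment boundaries and smooth the joins so that sound quality is not degraded.

```python
def convert_rate(segments, picked, mode):
    """Return a new segment list with the picked segments repeated or dropped.

    segments: list of waveform segments (any objects, e.g. lists of samples)
    picked:   set of 0-based positions to repeat or drop (hypothetical choice)
    mode:     "slower" repeats the picked segments, "faster" drops them
    """
    if mode not in ("slower", "faster"):
        raise ValueError("mode must be 'slower' or 'faster'")
    out = []
    for position, segment in enumerate(segments):
        if position not in picked:
            out.append(segment)
        elif mode == "slower":
            out.extend([segment, segment])
        # mode == "faster": the picked segment is simply skipped
    return out


original = list(range(1, 11))   # stands in for segments 1..10 of the figure
picked = {2, 7}                 # 0-based positions of segments 3 and 8
slower = convert_rate(original, picked, "slower")
faster = convert_rate(original, picked, "faster")
```

Here slower contains segments 3 and 8 twice, and faster omits them, matching the two rows of the figure.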
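Speech rate conversion without changing the length of streaming data
[Figure: original and converted speech on a common timeline, each running until “Stop”. The starts of the original and converted speech coincide at the blue line positions. In between, the starts do not coincide, but they coincide again at a later blue line.]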
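One way to keep the overall length unchanged while slowing the speech is to take the extra time out of the pauses. The sketch below assumes exactly that; the slide does not state how the converted speech catches up, so the pause-shortening rule, the function name fit_to_original_length, and the (kind, seconds) data layout are all assumptions made for illustration.

```python
def fit_to_original_length(parts, stretch):
    """Stretch speech parts and shrink pauses so the total length is unchanged.

    parts:   list of (kind, seconds) with kind "speech" or "pause"
    stretch: factor applied to speech durations (e.g. 1.25 = 25% slower)
    """
    speech_total = sum(d for kind, d in parts if kind == "speech")
    pause_total = sum(d for kind, d in parts if kind == "pause")
    extra = speech_total * (stretch - 1.0)   # time added by slowing the speech
    if extra > pause_total:
        raise ValueError("not enough pause time to absorb the slowed speech")
    # Assumption: every pause is shortened by the same proportion.
    pause_scale = (pause_total - extra) / pause_total if pause_total > 0 else 1.0
    return [
        (kind, d * stretch if kind == "speech" else d * pause_scale)
        for kind, d in parts
    ]


original_parts = [("speech", 4.0), ("pause", 2.0), ("speech", 4.0), ("pause", 2.0)]
converted_parts = fit_to_original_length(original_parts, 1.25)
```

With these example numbers, 8 seconds of speech become 10 seconds and 4 seconds of pause become 2 seconds, so the converted stream still lasts 12 seconds.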
Intelligible high speed speech for visually impaired people
Visually impaired people use fast replay to find the main idea in audio books or web pages (audio skimming).
[Figure: original recorded data replayed at n times speed, made up of important speech parts, BGM, silent parts, and other speech. In the converted output, which has the same length, the important speech parts are made slower so that they are easier to understand.]
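The time budget behind this kind of skimming can be worked through with a small calculation. The sketch below assumes that the parts marked important are replayed at a chosen, slower speed and that all remaining parts share one faster speed so that the total still equals the original length divided by n. The function name skimming_speeds and the (is_important, seconds) layout are hypothetical, and how the important parts are detected is outside this sketch.

```python
def skimming_speeds(parts, n, important_speed):
    """Return one playback speed per part so the total equals (original / n).

    parts:           list of (is_important, seconds)
    n:               overall speed-up factor of the fast replay
    important_speed: playback speed used for the important parts
    """
    total = sum(d for _, d in parts)
    important = sum(d for is_imp, d in parts if is_imp)
    rest = total - important
    # Output time left for the non-important parts.
    budget = total / n - important / important_speed
    if rest > 0 and budget <= 0:
        raise ValueError("important parts alone exceed the target length")
    other_speed = rest / budget if rest > 0 else important_speed
    return [important_speed if is_imp else other_speed for is_imp, _ in parts]


parts = [(True, 10.0), (False, 30.0)]
speeds = skimming_speeds(parts, n=2.0, important_speed=1.0)
output_length = sum(d / s for (_, d), s in zip(parts, speeds))
```

In this example 40 seconds are replayed in 20 seconds overall: the 10 important seconds stay at normal speed and the remaining 30 seconds are replayed at 3 times speed.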
Applications of speech rate conversion
Slower:
For elderly people
Learning a foreign language
For people with learning disabilities
Faster:
Quick news Internet service
Audio skimming for visually impaired people
A TV receiver with a clean audio dial
There are various ways to realize this.
For detailed information, please see FG AVA TR Part 12.
Receiver-side re-mixing for the elderly (Clean Audio)
Separate speech from background sound by stereo correlation.
Estimated speech component is enhanced for clearer speech.
Speech and background (BG) sound are re-mixed at the listener's preferred ratio.
No changes are needed in production or transmission.
[Figure: block diagram of receiver-side re-mixing. Labels: broadcast sound (stereo signal), adaptive filter, estimated speech, estimated BG sound, spectrum emphasizer, gains α, β, γ, η, voice detector (speech / non-speech flag), re-mixing of speech and BG with the specified ratio, output sound.]
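The separate-then-re-mix idea can be illustrated with a much simpler stand-in than the adaptive filter shown in the diagram. The sketch below uses a plain mid/side split: it assumes that speech is mixed equally into both stereo channels, so the part common to left and right (mid) is treated as the speech estimate and the difference (side) as the background estimate. The function name remix and the parameters speech_gain and bg_gain are hypothetical and are not the α, β, γ, η of the diagram.

```python
def remix(left, right, speech_gain, bg_gain):
    """Re-mix a stereo signal with separate gains for speech and background.

    Stand-in separation: mid = (L + R) / 2 is taken as the speech estimate,
    side = (L - R) / 2 as the background estimate (an assumption, not the
    adaptive-filter method of the slide).
    """
    out_left, out_right = [], []
    for l, r in zip(left, right):
        mid = (l + r) / 2.0    # correlated (centre) component
        side = (l - r) / 2.0   # uncorrelated (difference) component
        out_left.append(speech_gain * mid + bg_gain * side)
        out_right.append(speech_gain * mid - bg_gain * side)
    return out_left, out_right


left = [1.0, 0.5]
right = [1.0, -0.5]
unchanged = remix(left, right, 1.0, 1.0)        # gains of 1 reproduce the input
clearer = remix(left, right, 2.0, 0.5)          # speech up, background down
```

Because the processing uses only the received stereo signal, a sketch like this runs entirely in the receiver, which is the point made on the slide about production and transmission.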
Demonstration of the receiver-side clear audio
Conclusions and Recommendations
Compensating degraded functions of the elderly helps their social participation.
Speech rate conversion and re-mixing of foreground/background (F/B) sounds are promising technologies for this purpose.
Broadcasters and TV manufacturers are encouraged to provide services and devices with these functions.
Refer to FG AVA Tech. Report Part 12 for more information.
Clear audio in studio: Mixing balance meter
Mixing balance meter
Indicates the loudness-based mixing balance
“Elderly emulation mode” indicates a better mix for the elderly.
Young mixing engineers can produce better balanced audio for the elderly.
[Figure: in the studio, speech (narration etc.) and background sounds are combined into the mixed sound. The mixing balance meter calculates the loudness of each and estimates the favorability of the mix level.]
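A rough idea of what such a meter computes is the level difference between the speech and the background before they are mixed. The sketch below uses a plain RMS level in decibels as a stand-in for loudness; a real meter would use a perceptual loudness measure, and the “Elderly emulation mode” is not modelled here. The function names level_db and mixing_balance_db are hypothetical.

```python
import math


def level_db(samples):
    """RMS level of a block of samples in dB (a stand-in for loudness)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")   # digital silence
    return 10.0 * math.log10(mean_square)


def mixing_balance_db(speech, background):
    """Speech level minus background level, in dB."""
    return level_db(speech) - level_db(background)


speech = [0.5, -0.5, 0.5, -0.5]
background = [0.05, -0.05, 0.05, -0.05]
balance = mixing_balance_db(speech, background)
```

In this example the speech is ten times larger in amplitude than the background, so the balance is about 20 dB. A larger value means the speech stands further above the background.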