Predicting the Intelligibility of Cochlear-implant Vocoded
Speech from Objective Quality Measure(1)
Department of Electrical Engineering, The University of Texas at Dallas, Richardson, Texas 75083, USA
Received 8 Jan 2011; Accepted 27 May 2011; doi: 10.5405/jmbe.885
Chairman: Hung-Chi Yang
Presenter: Yu-Kai Wang
Advisor: Dr. Yeou-Jiunn Chen
Date: 2012.12.12
Introduction (1)
Cochlear implants (CIs)
Restore partial hearing to patients with severe to profound deafness.
A number of factors may affect performance.
Electrode insertion depth and placement
Quiet and noisy conditions
Electric-acoustic stimulation (EAS)
An electrode array
It is implanted only partially into the cochlea so as to preserve the residual acoustic hearing.
(20-60 dB hearing loss (HL) up to 750 Hz and severe-to-profound hearing loss at 1000 Hz and above)
Introduction (2)
A speech intelligibility index
To predict the intelligibility of vocoded speech.
To guide development of new speech processing strategies for cochlear implants.
It is highly correlated with the perceptual evaluation of speech quality (PESQ) measure.
Originally designed for predicting subjective speech quality.
Purposes
Assess the performance of conventional objective measures
Applied to predicting the intelligibility of vocoded speech.
Material and Methods
Speech intelligibility data were collected from three listening experiments using NH listeners as subjects.
Vocoded English
Vocoded Mandarin Chinese
Experiments 1 and 2 (by Chen and Loizou)
Assessed the contribution of weak consonants to vocoded English speech intelligibility in noisy environments.
Experiment 3
Predicting the intelligibility of vocoded Mandarin sentences.
Material and Methods
Details of the subjects and test conditions for the three experiments are given in Table 1.
[Table 1 not reproduced here. Legend: SSN, steady-state noise; EAS, electric-acoustic stimulation.]
Material and Methods
The English and Chinese sentences contained 8 and 7 words on average, respectively.
Two types of masker were used to corrupt the sentences.
Continuous steady-state noise (SSN)
Its long-term spectrum was the same as that of the test sentences.
Two equal-level interfering female talkers (2-talker).
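One common way to obtain noise whose long-term spectrum matches a set of sentences is to keep the magnitude spectrum of the speech and randomize its phase. The slides do not say how the SSN masker was actually generated, so the sketch below is only an illustration under that assumption; the function name `speech_shaped_noise` and the stand-in signal are hypothetical.

```python
import numpy as np

def speech_shaped_noise(speech, rng):
    """Noise with the same magnitude spectrum as `speech` (assumed method:
    phase randomization; not necessarily what the study used)."""
    speech = np.asarray(speech, dtype=float)
    spectrum = np.fft.rfft(speech)
    phase = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.shape)
    shaped = np.abs(spectrum) * np.exp(1j * phase)
    shaped[0] = spectrum[0]            # DC bin must stay real
    if len(speech) % 2 == 0:
        shaped[-1] = spectrum[-1]      # Nyquist bin must stay real
    return np.fft.irfft(shaped, n=len(speech))

# Stand-in for concatenated test sentences (hypothetical data).
rng = np.random.default_rng(0)
sentences = rng.standard_normal(8000) * np.hanning(8000)
ssn = speech_shaped_noise(sentences, rng)
```

In practice the spectrum would be estimated from all test sentences concatenated, so that the masker reflects the long-term average rather than a single utterance.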
Material and Methods
In the vocoded English test, the sentences were corrupted at:
-5, 0, and 5 dB SNR levels
In the Mandarin Chinese test, the sentences were corrupted at:
-4, 0, 4, 8, and 12 dB SNR levels
In the EAS-vocoder condition, the sentences were corrupted as follows:
The SSN and 2-talker maskers
-4, -2, 0, 2, and 4 dB SNR levels
The above SNR levels
They were selected to avoid ceiling/floor effects in the speech intelligibility data.
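Corrupting a sentence at a target SNR amounts to scaling the masker so that the speech-to-masker power ratio hits the desired value before adding the two. The sketch below shows this under simple assumptions: power is measured over the whole signal, the masker is at least as long as the speech, and the helper name `mix_at_snr` is hypothetical.

```python
import numpy as np

def mix_at_snr(speech, masker, snr_db):
    """Add `masker` to `speech` after scaling it to the requested SNR (dB)."""
    speech = np.asarray(speech, dtype=float)
    masker = np.asarray(masker, dtype=float)[: len(speech)]
    p_speech = np.mean(speech ** 2)
    p_masker = np.mean(masker ** 2)
    # SNR_dB = 10*log10(p_speech / (gain^2 * p_masker))  ->  solve for gain
    gain = np.sqrt(p_speech / (p_masker * 10.0 ** (snr_db / 10.0)))
    return speech + gain * masker

# Stand-in signals (hypothetical): a tone for the sentence, white noise for the masker.
fs = 16000
speech = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)
masker = np.random.default_rng(0).standard_normal(fs)
mixtures = {snr: mix_at_snr(speech, masker, snr) for snr in (-5, 0, 5)}
```

The same function covers all three tests; only the list of SNR values and the choice of masker change.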
Material and Methods
The stimuli were presented in two signal processing conditions.
Tone-vocoder.
EAS-vocoder.
The first processing condition (tone-vocoder).
To simulate eight-channel electrical stimulation.
Eight-channel sinewave-excited vocoder.
Signals were first passed through a pre-emphasis filter (2000-Hz cutoff) with a 3 dB/octave roll-off.
They were then band-passed into eight frequency bands between 80 and 6000 Hz.
Sixth-order Butterworth filters were used.
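The band-splitting stage can be sketched with SciPy as below. Several details are assumptions, since the slides do not give them: a 16 kHz sampling rate, logarithmically spaced band edges, and the helper names. The pre-emphasis filter is left out. Note that `butter(3, ..., btype="bandpass")` produces a sixth-order band-pass filter, because the band-pass transformation doubles the prototype order.

```python
import numpy as np
from scipy.signal import butter, sosfilt

FS = 16000  # assumed sampling rate (Hz); not stated in the slides

def band_edges(n_bands=8, f_lo=80.0, f_hi=6000.0):
    """Band edges between f_lo and f_hi (log spacing is an assumption)."""
    return np.geomspace(f_lo, f_hi, n_bands + 1)

def make_filterbank(edges, fs=FS):
    """One sixth-order Butterworth band-pass filter per band, as SOS sections."""
    return [butter(3, (lo, hi), btype="bandpass", fs=fs, output="sos")
            for lo, hi in zip(edges[:-1], edges[1:])]

def split_bands(x, bank):
    """Filter x through every band; returns an array of shape (n_bands, len(x))."""
    return np.stack([sosfilt(sos, x) for sos in bank])

edges = band_edges()
bank = make_filterbank(edges)
tone = np.sin(2 * np.pi * 1000 * np.arange(FS) / FS)  # 1 kHz test tone
bands = split_bands(tone, bank)
```

Second-order sections are used instead of transfer-function coefficients because the lowest bands are narrow relative to the sampling rate, where `b, a` coefficients tend to be numerically fragile.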
Material and Methods
The envelope of the signal was extracted by:
Full-wave rectification
Low-pass filtering using a second-order Butterworth filter (400-Hz cutoff).
Sinusoids were generated with amplitudes equal to the root-mean-square of the envelopes (computed every 4 ms)
and frequencies equal to the center frequencies of the bandpass filters.
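For a single channel, the envelope extraction and sinusoid generation described above might look like the following sketch. It assumes a 16 kHz sampling rate and non-overlapping 4 ms frames, and it takes an already band-passed signal as input; all function names are hypothetical.

```python
import numpy as np
from scipy.signal import butter, sosfilt

FS = 16000  # assumed sampling rate (Hz); not stated in the slides

def extract_envelope(band_signal, fs=FS, cutoff_hz=400.0):
    """Full-wave rectify, then low-pass with a second-order Butterworth filter."""
    rectified = np.abs(band_signal)
    sos = butter(2, cutoff_hz, btype="lowpass", fs=fs, output="sos")
    return sosfilt(sos, rectified)

def frame_rms(envelope, fs=FS, frame_ms=4.0):
    """RMS of the envelope in consecutive frames (non-overlapping: an assumption)."""
    frame_len = int(round(fs * frame_ms / 1000.0))
    n_frames = len(envelope) // frame_len
    frames = envelope[: n_frames * frame_len].reshape(n_frames, frame_len)
    return np.sqrt(np.mean(frames ** 2, axis=1)), frame_len

def sine_carrier(rms, frame_len, center_hz, fs=FS):
    """Sinusoid at the band's center frequency, scaled frame by frame."""
    amplitude = np.repeat(rms, frame_len)
    t = np.arange(len(amplitude)) / fs
    return amplitude * np.sin(2 * np.pi * center_hz * t)

# Stand-in for one band-passed channel: a steady 1 kHz tone.
band = np.sin(2 * np.pi * 1000 * np.arange(FS) / FS)
rms, frame_len = frame_rms(extract_envelope(band))
channel_out = sine_carrier(rms, frame_len, center_hz=1000.0)
```

Summing the outputs of all eight channels would give the tone-vocoded stimulus.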
Material and Methods
The second processing condition (EAS-vocoder).
Simulated combined electric-acoustic stimulation.
The signal was first low-pass (LP)-filtered to 600 Hz using a sixth-order Butterworth filter.
To simulate the effects of EAS for patients with residual hearing below 600 Hz.
The LP stimulus was combined with the upper five channels of the eight-channel tone-vocoder.
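A compact sketch of the EAS condition follows: the low-passed acoustic signal is added to the upper five tone-vocoder channels. The sampling rate, the logarithmic band spacing, and the use of geometric band centers as carrier frequencies are assumptions, so the exact channel frequencies here may differ from those used in the study. Pre-emphasis is omitted, and the function names are hypothetical.

```python
import numpy as np
from scipy.signal import butter, sosfilt

FS = 16000  # assumed sampling rate (Hz); not stated in the slides

def tone_vocoder_channels(x, fs=FS, n_bands=8, f_lo=80.0, f_hi=6000.0, frame_ms=4.0):
    """Return one sine-carrier signal per band (rows), lowest band first."""
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)      # assumed log spacing
    frame_len = int(round(fs * frame_ms / 1000.0))
    n = (len(x) // frame_len) * frame_len              # whole frames only
    t = np.arange(n) / fs
    env_lp = butter(2, 400.0, btype="lowpass", fs=fs, output="sos")
    out = np.zeros((n_bands, n))
    for k, (lo, hi) in enumerate(zip(edges[:-1], edges[1:])):
        bp = butter(3, (lo, hi), btype="bandpass", fs=fs, output="sos")  # sixth order
        env = sosfilt(env_lp, np.abs(sosfilt(bp, x[:n])))
        rms = np.sqrt(np.mean(env.reshape(-1, frame_len) ** 2, axis=1))
        center = np.sqrt(lo * hi)                      # assumed geometric center
        out[k] = np.repeat(rms, frame_len) * np.sin(2 * np.pi * center * t)
    return out

def eas_vocoder(x, fs=FS):
    """Low-passed acoustic part (600 Hz, sixth order) plus the upper five channels."""
    channels = tone_vocoder_channels(x, fs)
    lp = butter(6, 600.0, btype="lowpass", fs=fs, output="sos")
    acoustic = sosfilt(lp, x[: channels.shape[1]])
    return acoustic + channels[3:].sum(axis=0)

# Stand-in input: a low (200 Hz) plus a high (3000 Hz) component.
t = np.arange(FS) / FS
x = np.sin(2 * np.pi * 200 * t) + np.sin(2 * np.pi * 3000 * t)
y = eas_vocoder(x)
```

With this input, the 200 Hz component should pass through the acoustic path largely unchanged, while the 3000 Hz component is represented only by a vocoder carrier.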
Material and Methods