Predicting the Intelligibility of Cochlear-implant Vocoded Speech from Objective Quality Measure


Department of Electrical Engineering, The University of Texas at Dallas, Richardson, Texas 75083, USA

Received 8 Jan 2011; Accepted 27 May 2011; doi: 10.5405/jmbe.885

Chairman: Hung-Chi Yang

Presenter: Yu-Kai Wang

Advisor: Dr. Yeou-Jiunn Chen

Date: 2012.12.12

Outline

Introduction

Purposes

Material and Methods

Introduction (1)

Cochlear implants (CIs)

Restore partial hearing to patients with severe to profound deafness.

A number of factors may affect performance.

Electrode insertion depth and placement

Quiet and noisy conditions

Electric-acoustic stimulation (EAS)

An electrode array

It is implanted only partially into the cochlea so as to preserve the residual acoustic hearing.

(20-60 dB hearing loss (HL) up to 750 Hz and severe-to-profound hearing loss at 1000 Hz and above)

Introduction (2)

A speech intelligibility index

To predict the intelligibility of vocoded speech.

To guide development of new speech processing strategies for cochlear implants.

It is highly correlated with the perceptual evaluation of speech quality (PESQ) measure (a usage sketch follows this slide).

Originally designed for predicting subjective speech quality.
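As a rough illustration of how a PESQ score can be computed for a vocoded sentence, the sketch below uses the third-party Python packages pesq and soundfile; these tools and the file names are illustrative assumptions, not part of the original study.

    import soundfile as sf          # assumed helper package for reading audio
    from pesq import pesq           # assumed third-party PESQ implementation (pip install pesq)

    # Hypothetical file names: a clean reference sentence and its vocoded,
    # noise-corrupted counterpart.
    clean, fs = sf.read("clean_sentence.wav")
    degraded, _ = sf.read("vocoded_sentence.wav")

    # PESQ is defined for 8 kHz (narrow-band) and 16 kHz (wide-band) signals.
    mode = "wb" if fs == 16000 else "nb"
    score = pesq(fs, clean, degraded, mode)
    print(f"PESQ score: {score:.2f}")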

Purposes

Assess the performance of conventional objective measures

Applied to predicting the intelligibility of vocoded speech.

Material and Methods

2.1 Subjects

Speech intelligibility data were collected from three listening experiments using normal-hearing (NH) listeners as subjects.

Vocoded English

Vocoded Mandarin Chinese

Experiments 1 and 2 (by Chen and Loizou)

Assessed the contribution of weak consonants to vocoded English speech intelligibility in noisy environments.

Experiment 3

Predicting the intelligibility of vocoded Mandarin sentences.

Material and Methods

Details of the subjects and test conditions for the three experiments are given in Table 1.

(Table 1 is not reproduced here; it lists the numbers of subjects and the test conditions, including steady-state noise and electric-acoustic stimulation.)

Material and Methods

2.2 Stimuli

The English and Chinese sentences contained 8 and 7 words on average, respectively.

Two types of masker were used to corrupt the sentences.

Continuous steady-state noise (SSN)

Its long-term spectrum matched that of the test sentences (a noise-generation sketch follows this slide).

Two equal-level interfering female talkers (2-talker).
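One common way to obtain such speech-shaped noise is to impose the average magnitude spectrum of the concatenated clean sentences on random-phase noise. The sketch below illustrates that idea in NumPy; it is an assumption made for illustration, not the study's actual masker-generation code.

    import numpy as np

    def speech_shaped_noise(speech, n_samples, n_fft=1024, seed=0):
        """Generate noise whose long-term spectrum roughly matches that of `speech`
        (the concatenated clean sentences), via overlap-add of random-phase frames."""
        hop = n_fft // 2
        win = np.hanning(n_fft)

        # Average magnitude spectrum of the clean material.
        frames = np.lib.stride_tricks.sliding_window_view(speech, n_fft)[::hop]
        target_mag = np.abs(np.fft.rfft(frames * win, axis=1)).mean(axis=0)

        # Shape random-phase frames with the target magnitude and overlap-add them.
        rng = np.random.default_rng(seed)
        noise = np.zeros(n_samples + n_fft)
        for start in range(0, n_samples, hop):
            phase = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi, target_mag.size))
            noise[start:start + n_fft] += np.fft.irfft(target_mag * phase, n_fft) * win
        return noise[:n_samples] / np.max(np.abs(noise))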

Material and Methods

For the vocoded English test, the sentences were corrupted at:

-5, 0, and 5 dB SNR levels

For the vocoded Mandarin Chinese test, the sentences were corrupted at:

-4, 0, 4, 8, and 12 dB SNR levels

For the EAS-vocoder test, the sentences were corrupted with:

The SSN and 2-talker maskers

-4, -2, 0, 2, and 4 dB SNR levels

The above SNR levels were selected to avoid ceiling/floor effects in the speech intelligibility data (a mixing sketch follows).
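For reference, mixing a sentence with a masker at a prescribed SNR can be done by scaling the masker relative to the speech power, roughly as sketched below (a minimal illustration; the exact level-setting conventions are not given on these slides).

    import numpy as np

    def mix_at_snr(speech, masker, snr_db):
        """Scale `masker` so that the speech-to-masker power ratio equals `snr_db`,
        then add it to the speech."""
        masker = masker[:len(speech)]             # truncate masker to the sentence length
        p_speech = np.mean(speech ** 2)
        p_masker = np.mean(masker ** 2)
        gain = np.sqrt(p_speech / (p_masker * 10.0 ** (snr_db / 10.0)))
        return speech + gain * masker

    # e.g. the three SNRs used for the vocoded-English test:
    # corrupted = [mix_at_snr(sentence, ssn, snr) for snr in (-5, 0, 5)]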

Material and Methods

2.3 Signal processing

The stimuli were presented in two signal processing conditions.

Tone-vocoder.

EAS-vocoder.

The first processing condition (tone-vocoder).

To simulate eight-channel electrical stimulation.

Eight-channel sinewave-excited vocoder.

Passed through a pre-emphasis filter (2000-Hz cutoff) with 3 dB/octave roll-off.

Band-passed into eight frequency bands between 80 and 6000 Hz (a filterbank sketch follows this slide).

Sixth-order Butterworth filter.
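A minimal SciPy sketch of this pre-emphasis and band-pass analysis stage is given below. The sampling rate, the log-spaced band edges, and the first-order high-pass stand-in for the 3 dB/octave pre-emphasis are all assumptions made for illustration; they are not specified on these slides.

    import numpy as np
    from scipy.signal import butter, sosfilt

    FS = 16000  # assumed sampling rate

    def preemphasize(x, cutoff=2000.0):
        # Stand-in for the 3 dB/octave pre-emphasis (2000-Hz cutoff): a first-order
        # Butterworth high-pass (6 dB/octave) is used here only for illustration.
        sos = butter(1, cutoff, btype="highpass", fs=FS, output="sos")
        return sosfilt(sos, x)

    def bandpass_bank(x, n_bands=8, lo=80.0, hi=6000.0):
        """Split x into eight bands between 80 and 6000 Hz with sixth-order
        Butterworth band-pass filters (log-spaced edges are an assumption)."""
        edges = np.geomspace(lo, hi, n_bands + 1)
        centers = np.sqrt(edges[:-1] * edges[1:])   # geometric-mean center frequencies
        bands = []
        for f_lo, f_hi in zip(edges[:-1], edges[1:]):
            # SciPy doubles the order for band-pass designs, so N=3 yields a
            # sixth-order filter.
            sos = butter(3, [f_lo, f_hi], btype="bandpass", fs=FS, output="sos")
            bands.append(sosfilt(sos, x))
        return bands, centers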

Material and Methods

The envelope of the signal was extracted by:

Full-wave rectification

Low-pass filtering using a second-order Butterworth filter (400-Hz cutoff).

Sinusoids were generated with amplitudes equal to the root-mean-square of the envelopes (computed every 4 ms).

Frequencies equal to the center frequencies of the bandpass filters.
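The per-channel processing just described might look roughly like the sketch below, assuming the 16-kHz sampling rate and the bandpass_bank output from the previous sketch; the function and variable names are illustrative.

    import numpy as np
    from scipy.signal import butter, sosfilt

    FS = 16000  # assumed sampling rate

    def tone_vocode_channel(band, center_freq, frame_ms=4):
        """One tone-vocoder channel: full-wave rectification, 400-Hz second-order
        Butterworth smoothing, frame-wise (4-ms) RMS, and a sine carrier at the
        band's center frequency."""
        # Envelope: full-wave rectification followed by 400-Hz low-pass filtering.
        sos = butter(2, 400.0, btype="lowpass", fs=FS, output="sos")
        env = sosfilt(sos, np.abs(band))

        # RMS of the envelope computed every 4 ms, held constant within each frame.
        frame = int(FS * frame_ms / 1000)
        n_frames = len(env) // frame
        rms = np.sqrt(np.mean(env[:n_frames * frame].reshape(n_frames, frame) ** 2, axis=1))
        amp = np.repeat(rms, frame)

        # Sinusoid at the band's center frequency, scaled by the RMS envelope.
        t = np.arange(len(amp)) / FS
        return amp * np.sin(2.0 * np.pi * center_freq * t)

    # Summing all eight modulated sinusoids gives the tone-vocoded stimulus:
    # bands, centers = bandpass_bank(preemphasize(x))
    # vocoded = sum(tone_vocode_channel(b, fc) for b, fc in zip(bands, centers))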

Material and Methods

The second processing condition (EAS-vocoder).

Simulated combined electric-acoustic stimulation.

The signal was first low-pass (LP)-filtered to 600 Hz using a sixth-order Butterworth filter.

To simulate the effects of EAS for patients with residual hearing below 600 Hz.

The LP stimulus was combined with the upper five channels of the eight-channel tone-vocoder.
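Reusing the helpers from the earlier sketches (preemphasize, bandpass_bank, tone_vocode_channel), the EAS condition might be simulated roughly as below; this is a sketch under the same assumed sampling rate and band layout, not the study's actual code.

    from scipy.signal import butter, sosfilt

    FS = 16000  # assumed sampling rate

    def eas_vocode(x):
        """EAS-vocoder sketch: 600-Hz low-pass (sixth-order Butterworth) acoustic
        portion plus the upper five channels of the eight-channel tone-vocoder."""
        # Simulated residual acoustic hearing below 600 Hz.
        sos = butter(6, 600.0, btype="lowpass", fs=FS, output="sos")
        lp_acoustic = sosfilt(sos, x)

        # Simulated electric stimulation: channels 4-8 of the tone-vocoder
        # (bandpass_bank and tone_vocode_channel come from the earlier sketches).
        bands, centers = bandpass_bank(preemphasize(x))
        upper = sum(tone_vocode_channel(b, fc) for b, fc in zip(bands[3:], centers[3:]))

        n = min(len(lp_acoustic), len(upper))
        return lp_acoustic[:n] + upper[:n]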
