
Citation: Wee, Keng Hoong, Lorenzo Turicchia, and Rahul Sarpeshkar. "A Speech Locked Loop for Cochlear Implants and Speech Prostheses." 3rd International Symposium on Applied Sciences in Biomedical and Communication Technologies (ISABEL 2010), pp. 1-2. © 2010 IEEE.
As Published: http://dx.doi.org/10.1109/ISABEL.2010.5702864
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Version: Final published version
Accessed: Thu May 26 10:32:18 EDT 2016
Citable Link: http://hdl.handle.net/1721.1/72672
Terms of Use: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.

A Speech Locked Loop for Cochlear Implants and Speech Prostheses

Keng Hoong Wee
Electrical and Computer Engineering, National University of Singapore, Singapore
Email: elewkh@nus.edu.sg

Lorenzo Turicchia, Rahul Sarpeshkar
Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA, USA
Email: {turic, rahuls}@mit.edu

Abstract — We have previously described a feedback loop that combines an auditory processor with a low-power analog integrated-circuit vocal tract to create a speech locked loop (SLL). Here, we describe how the SLL can improve speech recognition in noise by re-synthesizing clean speech from noisy speech, making it potentially useful in cochlear implants and other hearing applications. We also show that it can produce good-quality speech from articulatory parameters, which are inherently low-dimensional and robust, making it potentially useful in brain-machine-based speech prostheses as well.

Keywords — speech prosthesis; hearing prosthesis; speech locked loop (SLL)

I. INTRODUCTION

Models of biology provide inspiration for improving the performance of engineering systems [1]. Feedback loops play a critical role in ensuring that biological systems function in a robust manner. We have combined a model of the cochlea and a vocal tract in a bio-inspired feedback configuration [1, 2, 3, 4]. We call this feedback system a speech locked loop (SLL), in analogy with phase-locked loops (PLLs), which are widely used in communication systems. Specifically, the cochlea and vocal tract in the SLL are analogous to the phase detector and voltage-controlled oscillator of a PLL, respectively.

Figure 1 is a block diagram representation of the SLL that illustrates the dual relationship between analysis and synthesis: analysis is needed to fine-tune synthesis and conversely, synthesis can be exploited for better analysis.
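As a rough illustration of this analysis-by-synthesis feedback, the Python sketch below adjusts a toy synthesizer's parameters until its output "locks" onto the auditory features of a (possibly noisy) input. The auditory_features and synthesize functions here are simple placeholders assumed for illustration only; they are not the cochlear processor or the analog vocal tract of [1, 2, 3], and only the feedback structure itself is meant to be conveyed.

```python
# Minimal analysis-by-synthesis sketch of the speech-locked-loop idea.
import numpy as np

def auditory_features(signal, n_bands=16):
    """Crude stand-in for a cochlear filter bank: mean band energies of the FFT."""
    spectrum = np.abs(np.fft.rfft(signal))
    bands = np.array_split(spectrum, n_bands)
    return np.array([b.mean() for b in bands])

def synthesize(params, n_samples=1024, fs=8000.0):
    """Toy 'vocal tract': damped-free sum of sinusoids whose frequencies and
    amplitudes are set by an articulatory-like parameter vector in [0, 1]."""
    t = np.arange(n_samples) / fs
    freqs = 200.0 + 1800.0 * params[: len(params) // 2]
    amps = params[len(params) // 2 :]
    return sum(a * np.sin(2 * np.pi * f * t) for f, a in zip(freqs, amps))

def speech_locked_loop(noisy, params, steps=200, lr=0.05, eps=1e-3):
    """Nudge synthesizer parameters until the synthesized speech 'locks' onto
    the auditory features of the input, then return the clean re-synthesis."""
    target = auditory_features(noisy)
    for _ in range(steps):
        err = np.linalg.norm(auditory_features(synthesize(params)) - target)
        # Finite-difference gradient estimate on each parameter.
        grad = np.zeros_like(params)
        for i in range(len(params)):
            trial = params.copy()
            trial[i] += eps
            e2 = np.linalg.norm(auditory_features(synthesize(trial)) - target)
            grad[i] = (e2 - err) / eps
        params = np.clip(params - lr * grad, 0.0, 1.0)
    return params, synthesize(params)

# Example use: params, clean = speech_locked_loop(noisy_signal, np.full(8, 0.5))
```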


In this abstract, we describe how SLLs can improve hearing in noise and can therefore be useful in cochlear implants (CIs). Because cochlear-implant and hearing-impaired patients have extreme difficulty understanding speech in noisy environments, such noise-cancellation techniques are very helpful. We also discuss the application of the SLL to brain-machine-interface (BMI) based speech prostheses and propose a speech-coding strategy well suited to such prostheses.

II. APPLICATIONS IN SPEECH AND HEARING PROSTHESES

The SLL analyzes recordings of speech and extracts what we term an articulogram, which is a 3-D plot of the articulatory parameter trajectories as a function of time [2, 3, 4]. Because the SLL is based on a physiological model of the human vocal tract, it inherently synthesizes all and only speech signals. Consequently, it has the ability to restore speech that has been corrupted by noise. Such signal-restorative properties are particularly important for the hearing impaired, who often have difficulty understanding speech in noisy environments.

Figure 2(a) shows the spectrogram of a recording of the word "hid" produced by a male speaker. Figure 2(b) shows the same recording degraded by white noise at -2 dB SNR. Figure 2(c) shows the spectrogram of the word "hid" re-synthesized by our SLL from the noise-degraded version of Figure 2(b). In Figure 2(c), the ability of the SLL to re-synthesize even formants immersed in noise is clearly seen.
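For concreteness, the sketch below shows one standard way to produce such a test signal: scaling additive white Gaussian noise so that the mixture has a specified SNR (here -2 dB). The paper does not state how its noisy stimuli were generated, so this is only an illustrative assumption, not the exact procedure behind Figure 2(b).

```python
# Degrade a recording to a target SNR with additive white Gaussian noise.
import numpy as np

def add_white_noise(speech, snr_db=-2.0, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.standard_normal(len(speech))
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Scale the noise so that 10*log10(p_speech / p_noise_scaled) equals snr_db.
    scale = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + scale * noise
```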

The articulogram utilized by the SLL can also be used in a speech prosthesis. For speech synthesis, we want independent, precise, and reliable control signals, which are very difficult to obtain using state-of-the-art brain-machine interfaces. In the SLL, speech can be generated by driving a set of control variables corresponding to articulatory movements. This strategy reduces the high dimensionality of the speech control space to a few parameters.

Our vocal-tract synthesizer, which includes an articulatory model, can be used to produce good-quality speech [2]. It is implemented on a chip as an analog integrated circuit [1, 2]. In our circuit model, the vocal tract is approximated as a spatially varying acoustic tube using a cascade of tunable two-port electrical circuit elements that corresponds to a concatenation of short cylindrical acoustic tubes with varying cross sections.
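The sketch below gives a conventional digital analogue of that tube-concatenation idea: a Kelly-Lochbaum-style scattering simulation in which adjacent cylindrical sections are coupled by reflection coefficients computed from their cross-sectional areas, with one sample of delay per section. This is not the analog two-port circuit of [1, 2], and the area values are arbitrary; it is only meant to illustrate how a cascade of varying-area sections shapes an excitation.

```python
# Digital sketch of a vocal tract as concatenated cylindrical tube sections.
import numpy as np

def reflection_coefficients(areas):
    """k[i] couples section i to section i+1 (pressure-wave sign convention)."""
    a = np.asarray(areas, dtype=float)
    return (a[:-1] - a[1:]) / (a[:-1] + a[1:])

def tube_cascade(excitation, areas, lip_reflection=-0.9, glottis_reflection=0.9):
    k = reflection_coefficients(areas)
    n = len(areas)
    fwd = np.zeros(n)   # right-travelling pressure waves, one per section
    bwd = np.zeros(n)   # left-travelling pressure waves
    out = np.zeros(len(excitation))
    for t, x in enumerate(excitation):
        fwd_new = np.empty(n)
        bwd_new = np.empty(n)
        # Glottis end: inject excitation plus partial reflection of the returning wave.
        fwd_new[0] = x + glottis_reflection * bwd[0]
        # Interior junctions: scatter the waves arriving from both sides.
        for i in range(n - 1):
            fwd_new[i + 1] = (1 + k[i]) * fwd[i] - k[i] * bwd[i + 1]
            bwd_new[i] = k[i] * fwd[i] + (1 - k[i]) * bwd[i + 1]
        # Lip end: partial reflection; the transmitted part is the output.
        bwd_new[n - 1] = lip_reflection * fwd[n - 1]
        out[t] = (1 + lip_reflection) * fwd[n - 1]
        fwd, bwd = fwd_new, bwd_new
    return out

# Example: impulse response of an eight-section tube with varying areas.
areas = [1.0, 1.2, 1.8, 2.5, 2.0, 1.3, 0.9, 0.6]
impulse = np.zeros(512)
impulse[0] = 1.0
response = tube_cascade(impulse, areas)
```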

Figure 1. Concept of a speech locked loop.



Figure 3. (a) Articulogram extracted from a recording of the vowel sequence /aei/ by our SLL. (b) Spectrogram of the same vowel sequence re-synthesized by our integrated circuit.


III. CONCLUSION

We discussed the application of a speech locked loop to cochlear implants and to brain-machine speech prostheses. The SLL is based on a physiological model of the human vocal tract, such that it inherently synthesizes all and only speech signals. We showed how to exploit this signal-restorative property to improve hearing in noise. We also examined the requirements of speech coding in the context of brain-machine-based speech prostheses and illustrated that coding speech with articulatory parameters is efficient and robust.

Figure 2. (a) Spectrogram of a recording of the word "hid." (b) Spectrogram of the same recording degraded by white noise at -2 dB SNR. (c) Spectrogram of the word "hid" re-synthesized by our SLL from the noise-degraded version.

Figure 3 shows how the speech synthesizer can generate speech from just a few slowly varying, low-resolution articulatory parameters. Using a vocal-tract model to produce speech has many benefits [4]: speech formants are automatically generated by vocal-tract resonances, leading to high-quality, natural-sounding speech. Articulatory parameters may be linearly interpolated, so transitions between speech sounds do not require frequent abrupt changes in the control signals. In addition, because the control-variable sequence from phoneme to phoneme can be linearly interpolated, the learning process is simplified. Speech synthesis based on articulatory parameters is also more robust to noise than conventional synthesizers, whose control parameters are known to be very sensitive to small perturbations.
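As a small illustration of that interpolation strategy, the sketch below turns a handful of phoneme-level parameter targets into smooth frame-rate control trajectories. The parameter values, target times, and the 100 frames/s rate are invented for illustration; the resulting frames would drive a synthesizer such as the tube cascade sketched earlier, not the integrated circuit itself.

```python
# Linear interpolation of sparse articulatory targets into control frames.
import numpy as np

def articulatory_trajectory(targets, times, frame_rate=100.0):
    """targets: (n_phonemes, n_params) array; times: target instants in seconds.
    Returns an (n_frames, n_params) array of interpolated control frames."""
    times = np.asarray(times, dtype=float)
    targets = np.asarray(targets, dtype=float)
    frame_times = np.arange(times[0], times[-1], 1.0 / frame_rate)
    return np.stack(
        [np.interp(frame_times, times, targets[:, j])
         for j in range(targets.shape[1])],
        axis=1,
    )

# Example: three vowel targets (/a/, /e/, /i/) described by four hypothetical
# articulatory parameters, interpolated into 100 frames/s control signals.
targets = np.array([[0.8, 0.2, 0.5, 0.4],    # /a/
                    [0.5, 0.5, 0.6, 0.3],    # /e/
                    [0.2, 0.8, 0.7, 0.2]])   # /i/
trajectory = articulatory_trajectory(targets, times=[0.0, 0.3, 0.6])
```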

REFERENCES

[1] R. Sarpeshkar, Ultra Low Power Bioelectronics: Fundamentals, Biomedical Applications, and Bio-inspired Systems, Cambridge University Press, Cambridge, U.K., February 2010.
[2] K. H. Wee, L. Turicchia, and R. Sarpeshkar, "An Analog Integrated-Circuit Vocal Tract," IEEE Transactions on Biomedical Circuits and Systems, vol. 2, no. 4, pp. 316-327, 2008.
[3] R. Sarpeshkar et al., "An Ultra-Low-Power Programmable Analog Bionic Ear Processor," IEEE Transactions on Biomedical Engineering, vol. 52, no. 4, pp. 711-727, 2005.
[4] K. H. Wee, L. Turicchia, and R. Sarpeshkar, "An Articulatory Speech-Prosthesis System," Proceedings of the IEEE International Conference on Body Sensor Networks (BSN 2010), pp. 133-138, 7-9 June 2010.

This work was supported in part by grants NIH NS-056140 and ONR N00014-09-1-1015.

