Title : Non-linear speech feature extraction for speaker recognition

M. Chetouani, M. Faundez-Zanuy(*), B. Gas, J.L. Zarader
Laboratoire des Instruments et Systèmes d’Ile-De-France (LISIF)
Université Pierre et Marie Curie (UPMC)
Paris, France
(*) Escola Universitària Politècnica de Mataró, Barcelona, Spain
Title : Non-linear speech feature extraction for phoneme classification and speaker
recognition.
Abstract :
Feature extraction is an important stage in pattern classification. Its main objective is to
extract characteristics relevant to the next stage, the classification stage. It is usually
done in the same way for phoneme classification and for speaker recognition, even though the
final purpose of the two tasks is different.
In this talk, we will present a feature extractor based on neural networks: the Neural
Predictive Coding (NPC). It is a non-linear extension of the well-known Linear Predictive
Coding (LPC) method.
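For readers unfamiliar with the linear baseline, here is a minimal sketch of textbook LPC
analysis on one signal frame (autocorrelation method with the Levinson-Durbin recursion).
It is a generic illustration, not the authors' implementation; the function names
`autocorrelation`, `levinson_durbin` and `lpc` are hypothetical and chosen for this example
only.

```python
def autocorrelation(frame, max_lag):
    """Return autocorrelation values r[0..max_lag] of one signal frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t + k] for t in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (coeffs, error), where coeffs[k-1] is a_k in the linear
    predictor x[n] ~ a_1*x[n-1] + ... + a_p*x[n-p], and error is the
    remaining prediction-error energy.
    """
    a = [0.0] * (order + 1)  # a[0] is unused padding
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for predictor order i
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err
        # Update the lower-order coefficients, then append the new one
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def lpc(frame, order):
    """LPC coefficients of a frame, usable as a simple feature vector."""
    coeffs, _ = levinson_durbin(autocorrelation(frame, order), order)
    return coeffs
```

In LPC, each sample is predicted as a weighted sum of the previous samples, and the weights
serve as features. As described in this abstract, NPC replaces that linear predictor with a
neural network; the sketch above covers only the linear case.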
In phoneme classification, we will discuss the importance of integrating data-driven methods
for the task. A good feature extractor has to model non-linear characteristics and also to
integrate discrimination between these characteristics. As we will see, the NPC model is
well suited to performing these two tasks simultaneously.
In speaker recognition, we will show how to extract speaker-dependent characteristics. One
way is to allocate one NPC model to each speaker. The whole recognizer (feature extractor and
classifier) is then dedicated to that speaker. The best results that we will present were
obtained using this NPC structure.
A general presentation of the NPC model will be given, and some results obtained on speaker
recognition and phoneme classification tasks will be discussed. One of our goals will be to
show the importance of the coding stage and, more particularly, the value of taking
non-linear features into account.