PRONUNCIATION VARIATION MODELING FOR ARABIC ASRs: A DIRECT DATA-DRIVEN
APPROACH
By: Abuzeina, D (Abuzeina, Dia) [1]; Elshafei, M (Elshafei, Moustafa) [1]
Book Group Author(s): ASME
THIRD INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING AND
TECHNOLOGY (ICCET 2011)
Pages: 325-330
Published: 2011
Abstract
Pronunciation variation is a major obstacle to improving the performance of Arabic
automatic speech recognition (ASR) systems. This phenomenon alters the pronunciation
of words beyond their listed forms in the phonetic dictionary, leading to a number
of out-of-vocabulary (OOV) word forms. This paper presents a direct data-driven approach
to modeling pronunciation variation, in which the pronunciation variants are distilled from
the training speech. The proposed method adds the pronunciation variants to
the ASR pronunciation dictionary as well as to the language model. We started with a
baseline Arabic speech recognition system built on the Carnegie Mellon University (CMU)
Sphinx3 speech recognition engine. The baseline is based on a 5.4-hour speech corpus of
Modern Standard Arabic (MSA) broadcast news, with a phonetic dictionary of 14,234
canonical pronunciations. The baseline system achieves a word error rate (WER) of 13.39%.
The proposed method identifies the variations in the phonetic transcription of the spoken
words. The phonemic variants of words are then filtered and added, with distinctive letter
spellings, to an expanded phonetic dictionary. In our experiment, 554 variants were added
to the basic phonetic dictionary as new words. The artificially added words are also used,
together with their sentences, in the language model. Our results show that while the
expanded dictionary alone did not yield appreciable improvement, the WER is reduced by
2.04% when the variants are also represented in the language model.
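The dictionary-expansion idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the observed phoneme sequences are obtained, how variants are filtered, or how the distinctive spellings are formed, so the sketch makes three assumptions: the observed pronunciations are given as input, filtering is a simple count threshold (`min_count`), and each variant is spelled with a hypothetical `_v<i>` suffix. All function names and the toy data are illustrative.

```python
from collections import Counter, defaultdict


def collect_variants(canonical, observations, min_count=2):
    """Collect non-canonical pronunciations seen at least min_count times.

    canonical:    dict mapping word -> tuple of phonemes (dictionary form)
    observations: iterable of (word, phoneme sequence) pairs taken from the
                  training speech (how they are obtained is assumed here)
    min_count:    hypothetical filtering threshold, not from the paper
    """
    counts = defaultdict(Counter)
    for word, phones in observations:
        phones = tuple(phones)
        # Only words already in the dictionary can have "variants".
        if word in canonical and phones != canonical[word]:
            counts[word][phones] += 1

    variants = {}
    for word, counter in counts.items():
        kept = [p for p, n in counter.most_common() if n >= min_count]
        if kept:
            variants[word] = kept
    return variants


def expand_dictionary(canonical, variants):
    """Add each variant as a new dictionary word with a distinct spelling."""
    expanded = dict(canonical)
    spelling_of = {}
    for word, prons in variants.items():
        for i, phones in enumerate(prons, start=1):
            new_word = f"{word}_v{i}"  # hypothetical spelling scheme
            expanded[new_word] = phones
            spelling_of[(word, phones)] = new_word
    return expanded, spelling_of


def relabel_transcript(tokens, spelling_of):
    """Rewrite a transcript so variant tokens use their new spellings.

    The relabeled sentences can then serve as language-model training text.
    """
    return [spelling_of.get((w, tuple(p)), w) for w, p in tokens]


# Toy data with placeholder words and phoneme labels (not real Arabic).
canonical = {"w1": ("A", "B", "C"), "w2": ("D", "E")}
observations = [
    ("w1", ["A", "B", "C"]),
    ("w1", ["A", "C"]),
    ("w1", ["A", "C"]),
    ("w1", ["A", "B"]),  # seen once only, so it is filtered out
    ("w2", ["D", "E"]),
]

variants = collect_variants(canonical, observations, min_count=2)
expanded, spelling_of = expand_dictionary(canonical, variants)
sentence = relabel_transcript([("w1", ["A", "C"]), ("w2", ["D", "E"])], spelling_of)
```

Relabeling the training transcripts is what lets the language model see the variants as ordinary words, which corresponds to the second configuration the abstract reports.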
Keywords
Author Keywords: Speech recognition; pronunciation variation; direct data-driven
approach; pronunciation dictionary adaptation; Modern Standard Arabic
Author Information
Reprint Address: Abuzeina, D (reprint author)
King Fahd Univ Petr & Minerals, Dhahran, Saudi Arabia.
Organization-Enhanced Name(s)
King Fahd University of Petroleum & Minerals
Addresses:
[1] King Fahd Univ Petr & Minerals, Dhahran, Saudi Arabia
E-mail Addresses: abuzeina@kfupm.edu.sa; shafei@mit.edu
Document Information
Document Type: Proceedings Paper
Language: English
Accession Number: WOS:000320340300052
ISBN: 978-0-7918-5973-5