TY - GEN
T1 - Small-word pronunciation modeling for Arabic speech recognition
T2 - A data-driven approach
AU - Abuzeina, Dia
AU - Al-Khatib, Wasfi
AU - Elshafei, Moustafa
PY - 2011
Y1 - 2011
AB - Incorrect recognition of adjacent small words is one of the obstacles to improving the performance of automatic continuous speech recognition systems. Pronunciation variation in the phonemes of adjacent words introduces ambiguity into the triphones of the acoustic model and adds confusion for the speech recognition decoder. Small words are more likely to be affected by this ambiguity than longer words. In this paper, we present a data-driven approach to modeling the small-word problem. The proposed method identifies adjacent small words in the corpus transcription and generates compound words from them. The unique compound words are then added to the expanded pronunciation dictionary, as well as to the language model as a new sentence. Results show a significant improvement of 2.16% in word error rate compared to the baseline on a speech corpus of Modern Standard Arabic broadcast news.
KW - Modern Standard Arabic
KW - Speech recognition
KW - language model
KW - phonetic dictionary
KW - pronunciation variation
KW - small-word
UR - http://www.scopus.com/inward/record.url?scp=84255178458&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-25631-8_48
DO - 10.1007/978-3-642-25631-8_48
M3 - Conference contribution
AN - SCOPUS:84255178458
SN - 9783642256301
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 529
EP - 537
BT - Information Retrieval Technology - 7th Asia Information Retrieval Societies Conference, AIRS 2011, Proceedings
ER -