Arabic phonemes transcription using data driven approach

Khalid Nahar, Husni Al-Muhtaseb, Wasfi Al-Khatib, Moustafa Elshafei, Mansour Alghamdi

Research output: Contribution to journal › Article › peer-review

14 Scopus citations

Abstract

The efficiency and correctness of continuous Arabic Speech Recognition Systems (ARS) hinge on the accuracy of the language phoneme set. The main goal of this research is to recognize and transcribe Arabic phonemes using a data-driven approach. We used the Hidden Markov Model Toolkit (HTK) to develop a phoneme recognizer and carried out several experiments with different parameters, such as varying the number of Hidden Markov Model (HMM) states and Gaussian mixtures, to model the Arabic phonemes and find the best configuration. We used a corpus of about 4000 files, representing 5 recorded hours of Modern Standard Arabic (MSA) TV news. A statistical analysis of phoneme length, frequency, and mode was carried out to determine the best number of states to represent each phoneme. A phoneme recognition accuracy of 56.79% was reached without a language model. The recognition accuracy increased to 96.3% with a bigram language model.
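
As a rough illustration of the duration analysis mentioned in the abstract, the sketch below computes per-phoneme count, mean length, and modal length in frames from time-aligned segments. It is not the authors' procedure: the 10 ms frame shift, the `(phoneme, start_ms, end_ms)` input layout, the function names, and the state-selection rule in `suggest_states` are all assumptions made for the example.

```python
from collections import Counter, defaultdict

FRAME_SHIFT_MS = 10  # assumed frame shift; the abstract does not state one


def duration_stats(segments):
    """Per-phoneme count, mean, and modal length in frames.

    `segments` is an iterable of (phoneme, start_ms, end_ms) tuples with
    integer millisecond times (a hypothetical input layout).
    """
    frames = defaultdict(list)
    for phoneme, start_ms, end_ms in segments:
        frames[phoneme].append((end_ms - start_ms) // FRAME_SHIFT_MS)
    stats = {}
    for phoneme, lengths in frames.items():
        stats[phoneme] = {
            "count": len(lengths),
            "mean": sum(lengths) / len(lengths),
            # Most frequent length; ties go to the first length seen.
            "mode": Counter(lengths).most_common(1)[0][0],
        }
    return stats


def suggest_states(mode_frames, lo=3, hi=7):
    """Hypothetical rule: about one emitting state per two frames of the
    modal length, clamped to the range [lo, hi]."""
    return min(hi, max(lo, mode_frames // 2))


if __name__ == "__main__":
    demo = [("a", 0, 50), ("a", 50, 100), ("a", 100, 180), ("b", 0, 30)]
    for phoneme, s in duration_stats(demo).items():
        print(phoneme, s, "states:", suggest_states(s["mode"]))
```

Longer phonemes receive more states under this rule, which matches the general idea of fitting model topology to observed durations; the actual thresholds would need to be tuned against recognition results.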

Original language: English
Pages (from-to): 237-245
Number of pages: 9
Journal: International Arab Journal of Information Technology
Volume: 12
Issue number: 3
State: Published - 2015

Bibliographical note

Publisher Copyright:
© 2015, Zarka Private Univ. All rights reserved.

Keywords

  • Arabic speech corpus
  • Data-driven
  • Network lattices
  • Phoneme transcription
  • Speech recognition

ASJC Scopus subject areas

  • General Computer Science
