Arabic Phonemes Transcription Using Learning Vector Quantization: 'Towards the Development of Fast Quranic Text Transcription'

Khalid M.O. Nahar, Wasfi G. Al-Khatib, Moustafa Elshafei, Husni Al-Muhtaseb, Mansour M. Alghamdi

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

1 Scopus citations

Abstract

In this paper, we investigated the use of Learning Vector Quantization (LVQ) for phoneme transcription in Arabic speech recognition systems. We used an Arabic speech corpus of TV news clips. We employed feature vectors that embed the frame-neighboring correlation information between adjacent phonemes, replacing the traditional triphone models. Next, we generated the phoneme codebooks using the K-means splitting algorithm, and then trained the generated codebooks using the LVQ algorithm. When the trained LVQ codebooks were used for utterance phoneme transcription of an open-vocabulary test corpus, the phoneme recognition rate was 72% without the use of any added phoneme bigrams or HMM models. The results of this research, if improved, could be used to support transcription of the Holy Quran without any phoneme bigrams (that is, without a phoneme language model). This would increase the speed of Quranic speech-to-text transcription and create the infrastructure for a high-speed automatic system for the recognition and translation of Quranic sounds.
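The two codebook stages named in the abstract (K-means splitting, then LVQ training) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the feature extraction, codebook sizes, LVQ variant, or learning rate, so the sketch assumes generic feature vectors, the basic LVQ1 update rule, and a linearly decaying learning rate. All function names and parameter values (`kmeans_split`, `build_codebooks`, `lvq1_train`, `classify`, `per_class`, `lr`, `epochs`) are hypothetical.

```python
import numpy as np

def kmeans_split(X, n_codewords, n_iter=10, eps=0.01):
    """Grow a codebook by repeated splitting (LBG style), refining with k-means.

    The codebook doubles on each split, so the returned size is the smallest
    power of two that is >= n_codewords.
    """
    codebook = X.mean(axis=0, keepdims=True)
    delta = eps * X.std(axis=0)  # small additive perturbation used for splitting
    while len(codebook) < n_codewords:
        codebook = np.vstack([codebook + delta, codebook - delta])
        for _ in range(n_iter):
            dist = ((X[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
            nearest = dist.argmin(axis=1)
            for k in range(len(codebook)):
                members = X[nearest == k]
                if len(members):  # leave empty cells unchanged
                    codebook[k] = members.mean(axis=0)
    return codebook

def build_codebooks(X, y, per_class=2):
    """Build one codebook per class label and stack them with their labels."""
    books, labels = [], []
    for c in np.unique(y):
        cb = kmeans_split(X[y == c], per_class)
        books.append(cb)
        labels.extend([c] * len(cb))
    return np.vstack(books), np.array(labels)

def lvq1_train(X, y, codebook, labels, lr=0.05, epochs=20, seed=0):
    """LVQ1: pull the nearest codeword toward a sample of the same class,
    push it away from a sample of a different class."""
    rng = np.random.default_rng(seed)
    codebook = codebook.copy()
    for epoch in range(epochs):
        alpha = lr * (1 - epoch / epochs)  # assumed linear decay schedule
        for i in rng.permutation(len(X)):
            w = ((codebook - X[i]) ** 2).sum(axis=1).argmin()
            sign = 1.0 if labels[w] == y[i] else -1.0
            codebook[w] += sign * alpha * (X[i] - codebook[w])
    return codebook

def classify(X, codebook, labels):
    """Assign each vector the label of its nearest codeword."""
    dist = ((X[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return labels[dist.argmin(axis=1)]

if __name__ == "__main__":
    # Synthetic stand-in for per-frame feature vectors of two phoneme classes.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 0.5, (100, 2)), rng.normal(5.0, 0.5, (100, 2))])
    y = np.array([0] * 100 + [1] * 100)
    codebook, labels = build_codebooks(X, y, per_class=2)
    trained = lvq1_train(X, y, codebook, labels)
    print("training accuracy:", (classify(X, trained, labels) == y).mean())
```

In this sketch, transcription reduces to a nearest-codeword lookup per feature vector, with no bigram or HMM decoding step, which is the property the abstract relies on for speed.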

Original language: English
Title of host publication: Proceedings - 2013 Taibah University International Conference on Advances in Information Technology for the Holy Quran and Its Sciences, NOORIC 2013
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 407-412
Number of pages: 6
ISBN (Electronic): 9781479928231
State: Published - 25 Sep 2015

Publication series

Name: Proceedings - 2013 Taibah University International Conference on Advances in Information Technology for the Holy Quran and Its Sciences, NOORIC 2013

Bibliographical note

Publisher Copyright:
© 2015 IEEE.

Keywords

  • K-means
  • LVQ
  • Quranic Speech Recognition
  • codebooks
  • phoneme bigrams

ASJC Scopus subject areas

  • Information Systems
