Abstract
In an attempt to increase the recognition rate of Arabic phonemes, we introduce a novel hybrid recognition algorithm that combines learning vector quantization (LVQ) with a hidden Markov model (HMM). The hybrid algorithm is used to recognize Arabic phonemes in continuous, open-vocabulary speech. A recorded corpus of Modern Standard Arabic, drawn from different TV news broadcasts, was used for training and testing. We employ a data-driven approach to generate training feature vectors that embed the correlation information of neighboring frames. Next, we generate the phoneme codebooks using the K-means splitting algorithm, and then train the generated codebooks using the LVQ algorithm. We achieved a performance of 98.49% during independent classification training and 90% during dependent classification training. When the trained LVQ codebooks were used for Arabic utterance transcription, the phoneme recognition rate was 72% with LVQ alone. We then combined the LVQ codebooks with a single-state HMM using an enhanced Viterbi algorithm that incorporates phoneme bigrams. The hybrid LVQ/HMM algorithm achieved an Arabic phoneme recognition rate of 89%.
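
To make the codebook-training step concrete, the following is a minimal sketch of the basic LVQ1 update rule. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the LVQ variant, the learning rate, or the feature dimensionality, so the two-dimensional toy vectors, the labels `"a"` and `"b"`, and the helper names `nearest`, `lvq1_epoch`, and `classify` are all hypothetical.

```python
import math


def nearest(codebook, x):
    """Return the index of the codeword closest to x (squared Euclidean distance)."""
    best, best_d = 0, math.inf
    for i, (vec, _label) in enumerate(codebook):
        d = sum((a - b) ** 2 for a, b in zip(vec, x))
        if d < best_d:
            best, best_d = i, d
    return best


def lvq1_epoch(codebook, samples, lr=0.1):
    """One LVQ1 pass over the samples.

    The winning codeword is pulled toward the sample when their labels
    match and pushed away from it when they differ.
    """
    for x, label in samples:
        i = nearest(codebook, x)
        vec, cw_label = codebook[i]
        sign = 1.0 if cw_label == label else -1.0
        codebook[i] = ([w + sign * lr * (a - w) for w, a in zip(vec, x)], cw_label)
    return codebook


def classify(codebook, x):
    """Label x with the class of its nearest codeword."""
    return codebook[nearest(codebook, x)][1]


# Toy data (assumption): one codeword per class; in the paper the initial
# codebooks come from K-means splitting over real speech feature vectors.
codebook = [([0.0, 0.0], "a"), ([1.0, 1.0], "b")]
samples = [
    ([0.2, 0.1], "a"),
    ([0.9, 0.8], "b"),
    ([0.1, 0.3], "a"),
    ([0.8, 1.1], "b"),
]
for _ in range(10):
    lvq1_epoch(codebook, samples, lr=0.1)
```

After training, `classify(codebook, frame)` assigns a frame the label of its nearest codeword. This corresponds to the LVQ-only decision. The hybrid described in the abstract additionally decodes the frame sequence with an HMM and phoneme bigrams, which this sketch does not cover.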
Original language | English |
---|---|
Pages (from-to) | 495-508 |
Number of pages | 14 |
Journal | International Journal of Speech Technology |
Volume | 19 |
Issue number | 3 |
DOIs | |
State | Published - 1 Sep 2016 |
Bibliographical note
Publisher Copyright: © 2016, Springer Science+Business Media New York.
Keywords
- Codebooks
- Hidden Markov model (HMM)
- Hybrid LVQ/HMM model
- K-means algorithm
- Learning vector quantization (LVQ)
- Phonemes transcription
ASJC Scopus subject areas
- Software
- Language and Linguistics
- Human-Computer Interaction
- Linguistics and Language
- Computer Vision and Pattern Recognition