Abstract
Improving primary school students’ reading skills supports their academic growth and communication abilities. Pronunciation accuracy is central to reading, especially in Arabic, where small diacritic changes can alter meaning. Building pronunciation tools for Arabic is further complicated by its status as a low-resource language. This study developed a Mispronunciation Detection and Diagnosis (MDD) system for Arabic learners, allowing teachers and learners to use Computer-Assisted Pronunciation Training (CAPT) for improved instruction and assessment. A pretrained self-supervised learning (SSL) model was fine-tuned to detect phoneme-level pronunciation errors in Modern Standard Arabic using a unique dataset of primary school learner speech from Saudi Arabia. The dataset consists of continuous speech from young native speakers (ages 8–11) recorded in uncontrolled environments. The data were structured, preprocessed, normalized, and aligned to phoneme sequences. The system showed improved phoneme recognition and reached an F1 score of 71.42%, approaching the performance of a human expert.
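As an illustration of how an F1 score of this kind is commonly computed for mispronunciation detection, the following is a minimal sketch and not the authors' evaluation code. It assumes the canonical, human-annotated, and model-predicted phoneme sequences have already been aligned to equal length, and it treats "mispronounced" as the positive class. The function name `mdd_f1` and the phoneme symbols are hypothetical.

```python
def mdd_f1(canonical, annotated, predicted):
    """Precision, recall, and F1 for mispronunciation detection.

    Assumption: all three phoneme sequences are pre-aligned and equal in length.
    """
    if not (len(canonical) == len(annotated) == len(predicted)):
        raise ValueError("sequences must be aligned to the same length")

    tp = fp = fn = 0
    for ref, heard, hyp in zip(canonical, annotated, predicted):
        is_error = heard != ref  # annotator heard something other than the target
        flagged = hyp != ref     # model output deviates from the target
        if is_error and flagged:
            tp += 1              # mispronunciation correctly detected
        elif flagged:
            fp += 1              # correct phoneme wrongly flagged
        elif is_error:
            fn += 1              # mispronunciation missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Hypothetical example: one real error (index 3) and one false alarm (index 5).
canonical = ["k", "a", "t", "a", "b", "a"]
annotated = ["k", "a", "t", "u", "b", "a"]
predicted = ["k", "a", "t", "u", "b", "i"]
precision, recall, f1 = mdd_f1(canonical, annotated, predicted)
```

In this toy case the model catches the single real error but also raises one false alarm, so recall is perfect while precision is halved. The sketch covers detection only; diagnosis (whether the model names the correct substituted phoneme) would need a separate comparison of `predicted` against `annotated`.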
| Original language | English |
|---|---|
| Pages (from-to) | 175047-175068 |
| Number of pages | 22 |
| Journal | IEEE Access |
| Volume | 13 |
| State | Published - 2025 |
Bibliographical note
Publisher Copyright: © 2013 IEEE.
Keywords
- Arabic mispronunciation dataset
- Arabic mispronunciation detection and diagnosis (MDD)
- artificial intelligence (AI) in education
- computer-assisted pronunciation training (CAPT)
- phoneme alignment
- self-supervised learning (SSL)
- speech technology for underrepresented languages
- transfer learning
ASJC Scopus subject areas
- General Computer Science
- General Materials Science
- General Engineering