Handwritten Arabic text recognition using multi-stage sub-core-shape HMMs

Irfan Ahmad*, Gernot A. Fink

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

23 Scopus citations

Abstract

In this paper, we present a multi-stage HMM-based text recognition system for handwritten Arabic. The system employs a novel representation of Arabic characters: the core shapes are separated from the diacritics, and the core shapes are then represented by smaller units, which we term sub-core shapes. This greatly reduces the number of models that need to be trained for the text recognition task. Further, we present contextual HMM modeling based on these sub-core shapes and show that using sub-core shapes as models improves the contextual HMM system compared with one that uses the standard Arabic character shapes as models, while also yielding a significantly more compact recognizer. Furthermore, we present multi-stream contextual sub-core-shape HMMs in which the features computed from a sliding window form one stream and their horizontal derivatives form a second stream, with the two streams assigned different weights. The system is evaluated on two publicly available databases for different text recognition tasks, including conditions where little training data is available. The presented system outperforms the standard character-shape system on all text recognition tasks on both databases.
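The two-stream idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the zone-density features, the window width and shift, the diagonal-Gaussian emission model, the stream weights, and all function names are assumptions chosen for illustration only. The sketch computes one feature stream from a sliding window, derives a second stream as its horizontal derivative, and combines per-stream log-likelihoods as a weighted sum, which is the usual way stream weights enter a multi-stream HMM emission score.

```python
import numpy as np

def sliding_window_features(image, width=4, shift=2, zones=3):
    """Return one feature vector per window position (frame).

    Hypothetical features: mean ink density in `zones` horizontal bands.
    The scan direction here is simply increasing column index.
    """
    _, n_cols = image.shape
    frames = []
    for start in range(0, n_cols - width + 1, shift):
        window = image[:, start:start + width]
        bands = np.array_split(window, zones, axis=0)
        frames.append([band.mean() for band in bands])
    return np.asarray(frames, dtype=float)

def horizontal_derivative(features):
    """Second stream: finite differences along the frame axis."""
    return np.gradient(features, axis=0)

def diag_gaussian_loglik(x, mean, var):
    """Log-density of x under a diagonal-covariance Gaussian."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var, axis=-1)

def multistream_loglik(stream_obs, stream_params, weights):
    """Weighted sum of per-stream log-likelihoods for one HMM state."""
    return sum(
        w * diag_gaussian_loglik(obs, mean, var)
        for obs, (mean, var), w in zip(stream_obs, stream_params, weights)
    )

# Toy line image: a single vertical stroke two pixels wide.
image = np.zeros((6, 10))
image[:, 4:6] = 1.0

static = sliding_window_features(image)   # stream 1
delta = horizontal_derivative(static)     # stream 2

# Hypothetical state parameters (mean, variance) and stream weights.
params = [(np.zeros(3), np.ones(3)), (np.zeros(3), np.ones(3))]
weights = (1.0, 0.5)
frame_scores = multistream_loglik((static, delta), params, weights)
```

Here `frame_scores` holds one emission score per frame. In a full recognizer these scores would feed HMM decoding, and the stream weights would typically be tuned on held-out data.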

Original language: English
Pages (from-to): 329-349
Number of pages: 21
Journal: International Journal on Document Analysis and Recognition
Volume: 22
Issue number: 3
State: Published - 1 Sep 2019

Bibliographical note

Publisher Copyright:
© 2019, Springer-Verlag GmbH Germany, part of Springer Nature.

Keywords

  • Arabic sub-core shapes
  • Arabic text recognition
  • Handwritten text recognition
  • Hidden Markov models
  • Multi-stage text recognition
  • Separating core shapes and diacritics

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
