Improving Handwritten Arabic Text Recognition Using an Adaptive Data-Augmentation Algorithm

Mohamed Eltay, Abdelmalek Zidouri*, Irfan Ahmad, Yousef Elarian

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Deep learning has increased the performance of classification and object detection, but it generally requires large amounts of labeled data for training. In this paper, we introduce a new data augmentation algorithm that promotes diversity between classes, representing the characters of the Arabic script, and can balance samples between different classes. This algorithm gives each word in the lexicon a weight. The weight of a word is based on the occurrence probabilities of the characters constituting the word. Minority classes are given higher weight as compared to the classes frequently occurring in the text. The data augmentation technique was evaluated on a handwritten word recognition task using the publicly available IFN/ENIT and AHDB datasets. We see significant improvement in results by employing our data augmentation technique, and we achieve state-of-the-art results on both datasets.
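The weighting idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything here is an assumption made for illustration: the function names (`char_probabilities`, `word_weights`, `augmentation_budget`), the use of the mean inverse character probability as a word's raw weight, and the normalization to a sum of one. It is not the authors' implementation; it only shows how words containing rare characters could receive a larger share of augmented samples.

```python
from collections import Counter


def char_probabilities(lexicon):
    """Estimate each character's occurrence probability over the lexicon."""
    counts = Counter(ch for word in lexicon for ch in word)
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}


def word_weights(lexicon):
    """Assign each word a weight that grows as its characters get rarer.

    Assumption (not from the paper): raw weight = mean of 1 / p(char)
    over the word's characters, then normalized so all weights sum to 1.
    """
    words = [w for w in dict.fromkeys(lexicon) if w]  # unique, non-empty
    probs = char_probabilities(words)
    raw = {w: sum(1.0 / probs[ch] for ch in w) / len(w) for w in words}
    norm = sum(raw.values())
    return {w: r / norm for w, r in raw.items()}


def augmentation_budget(lexicon, total_new_samples):
    """Split a hypothetical budget of synthetic samples by word weight."""
    weights = word_weights(lexicon)
    return {w: round(total_new_samples * wt) for w, wt in weights.items()}
```

With a toy lexicon such as `["aaa", "aab", "abc"]`, the word `"abc"` receives the largest weight because it contains the rarest characters, while `"aaa"` receives the smallest. A real system would apply the same idea to Arabic character shapes and would likely use a different weighting function.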

Original language: English
Title of host publication: Document Analysis and Recognition – ICDAR 2021 Workshops - Proceedings
Editors: Elisa H. Barney Smith, Umapada Pal
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 322-335
Number of pages: 14
ISBN (Print): 9783030861971
State: Published - 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12916 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Bibliographical note

Funding Information:
Acknowledgment. This research was supported by the King Fahd University of Petroleum and Minerals (KFUPM).

Publisher Copyright:
© 2021, Springer Nature Switzerland AG.

Keywords

  • Connectionist temporal classification
  • Data augmentation
  • Deep Learning Neural Network
  • Handwriting recognition
  • Recurrent Neural Network

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)
