Oran: A basis for an Arabic OCR system

  • Abdelmalek Zidouri*
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

In this paper we present a system called ORAN (Offline Recognition of Arabic characters and Numerals). The system is based on a modified MCR (Minimum Covering Run) expression for document images. Using the correspondence between binary images and bipartite graphs, the MCR expression can be found by constructing a minimum covering or maximum matching in the corresponding graph. We use the structural information obtained from this expression to describe the strokes of characters in terms of extracted features. These features are obtained after a zoning scheme in which the baseline is detected and the line of text is divided into four zones. Reference prototypes for the system are built from a structural description of characters in a set of model documents. This method overcomes the segmentation problem that is inherent to Arabic script, even when the text is machine printed or typed. Candidate characters are then recognized by simple matching against the reference prototypes. A recognition rate of more than 97% is achieved.
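The correspondence between binary images and bipartite graphs mentioned in the abstract can be illustrated with a minimal sketch. It assumes the plain (unmodified) MCR formulation: horizontal runs of black pixels form one vertex set, vertical runs form the other, and each black pixel is an edge joining the two runs that contain it. A minimum vertex cover of this graph is then a smallest set of runs covering every black pixel, and by König's theorem it can be derived from a maximum matching. The function names (`label_runs`, `mcr_runs`) and the run tuple layout are hypothetical and are not taken from the paper; the paper's modified MCR is not reproduced here.

```python
from typing import List, Tuple

# Hypothetical run layout: (orientation "H"/"V", row or column index, start, end inclusive)
Run = Tuple[str, int, int, int]


def label_runs(img: List[List[int]], horizontal: bool):
    """Return (runs, ids), where ids[r][c] is the run index of black pixel (r, c)."""
    h, w = len(img), len(img[0])
    ids = [[-1] * w for _ in range(h)]
    runs: List[Run] = []
    outer, inner = (h, w) if horizontal else (w, h)
    for a in range(outer):
        b = 0
        while b < inner:
            r, c = (a, b) if horizontal else (b, a)
            if not img[r][c]:
                b += 1
                continue
            start = b
            while b < inner:
                r, c = (a, b) if horizontal else (b, a)
                if not img[r][c]:
                    break
                ids[r][c] = len(runs)
                b += 1
            runs.append(("H" if horizontal else "V", a, start, b - 1))
    return runs, ids


def mcr_runs(img: List[List[int]]) -> List[Run]:
    """Smallest set of runs covering all black pixels (assumed plain MCR)."""
    hruns, hid = label_runs(img, True)
    vruns, vid = label_runs(img, False)

    # Each black pixel is an edge between its horizontal and its vertical run.
    edges = [set() for _ in hruns]
    for r, row in enumerate(img):
        for c, px in enumerate(row):
            if px:
                edges[hid[r][c]].add(vid[r][c])
    adj = [sorted(s) for s in edges]

    # Maximum matching by augmenting paths (recursive, so small images only).
    match_l = [-1] * len(hruns)
    match_r = [-1] * len(vruns)

    def augment(u: int, seen: set) -> bool:
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            if match_r[v] == -1 or augment(match_r[v], seen):
                match_l[u], match_r[v] = v, u
                return True
        return False

    for u in range(len(hruns)):
        augment(u, set())

    # König's theorem: walk alternating paths from unmatched horizontal runs.
    reach_l = {u for u in range(len(hruns)) if match_l[u] == -1}
    reach_r = set()
    stack = list(reach_l)
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in reach_r:
                reach_r.add(v)
                w = match_r[v]
                if w != -1 and w not in reach_l:
                    reach_l.add(w)
                    stack.append(w)

    # Cover = unreached horizontal runs + reached vertical runs.
    cover = [hruns[u] for u in range(len(hruns)) if u not in reach_l]
    cover += [vruns[v] for v in sorted(reach_r)]
    return cover


if __name__ == "__main__":
    # An L-shaped stroke: one vertical bar and one horizontal bar.
    l_shape = [[1, 0, 0],
               [1, 0, 0],
               [1, 1, 1]]
    print(mcr_runs(l_shape))
```

For the L-shaped example, the cover consists of two runs: the horizontal run along the bottom row and the vertical run down the first column. This matches the intuition that covering runs correspond to strokes. A production system would likely use a non-recursive matching algorithm such as Hopcroft-Karp for full document images.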

Original language: English
Title of host publication: 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004
Pages: 703-706
Number of pages: 4
State: Published - 2004

Publication series

Name: 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, ISIMP 2004

ASJC Scopus subject areas

  • General Engineering
