Document layout analysis: A comprehensive survey

Galal M. Binmakhashen, Sabri A. Mahmoud

Research output: Contribution to journal › Review article › peer-review

127 Scopus citations

Abstract

Document layout analysis (DLA) is a preprocessing step of document understanding systems. It is responsible for detecting and annotating the physical structure of documents. DLA has several important applications, such as document retrieval, content categorization, and text recognition. The objective of DLA is to ease the subsequent analysis and recognition phases by identifying homogeneous document blocks and determining their relationships. The DLA pipeline consists of several phases that can vary among DLA methods, depending on the documents' layouts and the final analysis objectives. A universal DLA algorithm that fits all types of document layouts or satisfies all analysis objectives has not yet been developed. In this survey paper, we present a critical study of different document layout analysis techniques. The study highlights the motivations for pursuing DLA and comprehensively discusses the different phases of DLA algorithms based on a general framework formed as an outcome of reviewing the research in the field. The DLA framework consists of preprocessing, layout analysis strategies, post-processing, and performance evaluation phases. Overall, the article delivers an essential baseline for pursuing further research in document layout analysis.

Original language: English
Article number: 109
Journal: ACM Computing Surveys
Volume: 52
Issue number: 6
State: Published - Oct 2019

Bibliographical note

Publisher Copyright:
© 2019 Association for Computing Machinery.

Keywords

  • Document image retrieval
  • Document image understanding
  • Document segmentation
  • Document structure analysis
  • Layout analysis
  • Physical document structure

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
