Abstract
Layout analysis is a core component of a typical Document Image Analysis (DIA) system and plays an important role in pre-processing. Pashto document images, however, have not been explored so far. This research examines Pashto text together with graphics for the first time and proposes a deep learning-based classifier that detects Pashto text and graphics in each document. A further contribution is a real-world dataset of more than 1,000 camera-captured images of Pashto documents. On this dataset, we applied a convolutional neural network (CNN). Our proposed method is based on the Single-Shot Detector (SSD), a single-stage CNN-based object detector that serves as a faster alternative to the classical Faster R-CNN. Evaluation on 300 test-set images yielded a mean average precision (mAP) of 84.90%.
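For readers unfamiliar with the reported metric, the following is a minimal sketch of how mean average precision can be computed for two classes such as text and graphics. It is not the authors' evaluation code: the abstract does not state the matching protocol, so the IoU threshold of 0.5, the all-point interpolation, the single-image simplification, and every function name here are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Detection = Tuple[float, Box]            # (confidence score, predicted box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections: List[Detection], truths: List[Box],
                      iou_threshold: float = 0.5) -> float:
    """AP for one class on one image (assumed threshold and interpolation)."""
    if not truths:
        return 0.0
    matched = [False] * len(truths)
    tp_flags: List[bool] = []
    # Visit detections from most to least confident.
    for _score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for i, truth in enumerate(truths):
            if matched[i]:
                continue  # each ground-truth box can be matched only once
            overlap = iou(box, truth)
            if overlap > best_iou:
                best_iou, best_idx = overlap, i
        is_tp = best_idx >= 0 and best_iou >= iou_threshold
        if is_tp:
            matched[best_idx] = True
        tp_flags.append(is_tp)

    # Precision and recall after each ranked detection.
    precisions, recalls, tp = [], [], 0
    for rank, flag in enumerate(tp_flags, start=1):
        tp += int(flag)
        precisions.append(tp / rank)
        recalls.append(tp / len(truths))

    # Make precision non-increasing from right to left, then integrate over recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(
        per_class: Dict[str, Tuple[List[Detection], List[Box]]]) -> float:
    """Mean of the per-class AP values."""
    aps = [average_precision(dets, gts) for dets, gts in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0
```

A full evaluation would match detections to ground truth per image across the whole test set before ranking them; the sketch collapses that step to keep the core computation visible.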
| Original language | English |
|---|---|
| Article number | e2089 |
| Journal | PeerJ Computer Science |
| Volume | 10 |
| State | Published - 2024 |
| Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2024 Bahadar et al.
Keywords
- Deep learning
- Document images
- Graphic detection
- Script detection
ASJC Scopus subject areas
- General Computer Science
Title
Pashto script and graphics detection in camera captured Pashto document images using deep learning model