TY - GEN
T1 - A database for offline Arabic handwritten text recognition
AU - Mahmoud, Sabri A.
AU - Ahmad, Irfan
AU - Alshayeb, Mohammed
AU - Al-Khatib, Wasfi G.
PY - 2011
Y1 - 2011
AB - Arabic handwritten text recognition has not received the same attention as that directed towards Latin script-based languages. In this paper, we present our efforts to develop a comprehensive Arabic Handwritten Text Database (AHTD). At this stage, the database is planned to consist of text written by 1000 writers from different countries; it currently has data from over 300 writers. It is composed of an image database containing images of the written text at various resolutions, and a ground truth database that contains metadata describing the written text at the page, paragraph, and line levels. Tools to extract paragraphs from pages and to segment paragraphs into lines have also been developed. Segmentation of lines into words will follow. The database will be made freely available to researchers worldwide. It is hoped that the AHTD will stimulate research on various handwriting-related problems such as text recognition and writer identification and verification.
KW - Arabic Handwritten Text Database
KW - Arabic OCR
KW - Document Analysis
KW - Form Processing
UR - http://www.scopus.com/inward/record.url?scp=79960309203&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-21596-4_40
DO - 10.1007/978-3-642-21596-4_40
M3 - Conference contribution
AN - SCOPUS:79960309203
SN - 9783642215957
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 397
EP - 406
BT - Image Analysis and Recognition - 8th International Conference, ICIAR 2011, Proceedings
ER -