Utilizing motion and spatial features for sign language gesture recognition using cascaded CNN and LSTM models

Hamzah Luqman*, El Sayed M. El-Alfy

*Corresponding author for this work

Research output: Contribution to journal > Article > peer-review

10 Scopus citations

Abstract

Sign language is a language produced through gestures of body parts and facial expressions. The aim of an automatic sign language recognition system is to assign meaning to each sign gesture. Recently, several computer vision systems have been proposed for sign language recognition using a variety of recognition techniques, sign languages, and gesture modalities. However, a challenging problem remains: image preprocessing, segmentation, extraction, and tracking of the relevant static and dynamic features of manual and nonmanual gestures across the images in a sequence. In this paper, we studied the efficiency, scalability, and computation time of three cascaded architectures of convolutional neural network (CNN) and long short-term memory (LSTM) models for the recognition of dynamic sign language gestures. The spatial features of dynamic signs are captured using a CNN and fed into a multilayer stacked LSTM for temporal information learning. To track the motion in video frames, the absolute temporal differences between consecutive frames are computed and fed into the recognition system. Several experiments were conducted on three benchmark datasets of two sign languages to evaluate the proposed models. We also compared the proposed models with other techniques. The results show that our models capture better spatio-temporal features for recognizing various sign language gestures and consistently outperform other techniques, with over 99% accuracy.
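The motion cue mentioned in the abstract, the absolute temporal difference between consecutive frames, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `absolute_frame_differences`, the `(T, H, W)` or `(T, H, W, C)` frame layout, and the cast to `float32` are assumptions made for illustration only.

```python
import numpy as np


def absolute_frame_differences(frames: np.ndarray) -> np.ndarray:
    """Return |frame[t+1] - frame[t]| for every pair of consecutive frames.

    `frames` is assumed to have shape (T, H, W) or (T, H, W, C), with the
    time axis first. The result has T - 1 entries along the first axis.
    """
    if frames.shape[0] < 2:
        raise ValueError("at least two frames are needed to compute a difference")
    # Cast to float first: subtracting uint8 arrays directly would wrap around
    # instead of producing negative values.
    as_float = frames.astype(np.float32)
    # np.diff along axis 0 yields frame[t+1] - frame[t]; abs removes the sign.
    return np.abs(np.diff(as_float, axis=0))


if __name__ == "__main__":
    # Three tiny 1x2 grayscale "frames" standing in for a video clip.
    clip = np.array([[[0, 10]], [[5, 3]], [[5, 200]]], dtype=np.uint8)
    print(absolute_frame_differences(clip).shape)
```

Under these assumptions, the resulting difference frames could be passed to a CNN in place of, or alongside, the raw frames. How the paper combines the two inputs is not specified in the abstract.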

Original language: English
Pages (from-to): 2508-2525
Number of pages: 18
Journal: Turkish Journal of Electrical Engineering and Computer Sciences
Volume: 30
Issue number: 7
State: Published - 2022

Bibliographical note

Publisher Copyright:
© TÜBİTAK.

Keywords

  • Arabic sign language recognition
  • CNN-LSTM
  • Sign language recognition
  • action recognition
  • gesture recognition
  • sign language translation

ASJC Scopus subject areas

  • General Computer Science
  • Electrical and Electronic Engineering
