FUSION-BASED CONVOLUTIONAL RECURRENT NEURAL NETWORK FOR IMPROVED DYNAMIC THAI FINGERSPELLING RECOGNITION

  • Teerapong Sungsri
  • Teerasak Sungsri
  • Emmanuel Okafor
  • Olarik Surinta*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Sign language recognition (SLR) has been an active research area because hand and upper-body movements are difficult to interpret in real-life settings. Dynamic fingerspelling recognition is particularly challenging because algorithms must infer the meaning of fingerspelling from real-time video. In this research, we propose a fusion-based convolutional recurrent neural network (CRNN) that fuses a three-dimensional convolutional neural network (3D-CNN) and a CNN model to extract robust spatiotemporal features from the sequential images in a video. The fusion-based CRNN framework is divided into deep feature extraction and sequence learning modules. In the deep feature extraction module, frames were extracted from the video and only 32 frames were selected. Additionally, we trained a YOLOv5 model to detect and localize the human upper body, which was designated as the region of interest (ROI). Once the ROI was computed, it was passed to the 3D-CNN and the CNN to extract robust sequential features. An addition operator was then used to merge the sequential features, and the resulting features were passed to a sequence learning mechanism (bidirectional long short-term memory) to build a robust model for recognizing dynamic fingerspelling. In the experiments, we evaluated the fusion-based CRNN on the dynamic Thai fingerspelling dataset, which comprises 3,025 short videos across 42 classes. The experimental results indicated that the fusion-based CRNN achieved an accuracy of 91.73% on the dynamic Thai fingerspelling dataset and outperformed the existing method.
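
Two steps named in the abstract, selecting 32 frames per video and merging two feature sequences with an addition operator, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the 32 frames were chosen, so evenly spaced sampling is an assumption, and the function names `sample_frame_indices` and `fuse_by_addition` and the toy feature values are hypothetical.

```python
from typing import List


def sample_frame_indices(num_frames: int, num_samples: int = 32) -> List[int]:
    """Return num_samples evenly spaced frame indices in [0, num_frames - 1].

    Assumption: uniform sampling. The abstract only states that 32 frames
    were selected, not the selection rule. Short videos repeat indices.
    """
    if num_frames <= 0:
        raise ValueError("video has no frames")
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    if num_samples == 1:
        return [0]
    step = (num_frames - 1) / (num_samples - 1)
    # Clamp to guard against floating-point rounding past the last frame.
    return [min(round(i * step), num_frames - 1) for i in range(num_samples)]


def fuse_by_addition(
    a: List[List[float]], b: List[List[float]]
) -> List[List[float]]:
    """Element-wise sum of two feature sequences of identical shape.

    Each argument is a list of per-time-step feature vectors, standing in
    for the outputs of the two branches (hypothetical toy representation).
    """
    if len(a) != len(b) or any(len(x) != len(y) for x, y in zip(a, b)):
        raise ValueError("feature sequences must have the same shape")
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


if __name__ == "__main__":
    indices = sample_frame_indices(100)
    print(len(indices), indices[0], indices[-1])

    branch_a = [[1.0, 2.0], [3.0, 4.0]]  # toy features from one branch
    branch_b = [[0.5, 0.5], [1.0, 1.0]]  # toy features from the other branch
    print(fuse_by_addition(branch_a, branch_b))
```

Addition requires both branches to produce features of the same shape, which is why the sketch checks shapes before summing; in the described framework the fused sequence would then be passed to the bidirectional long short-term memory module.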

Original language: English
Pages (from-to): 201-210
Number of pages: 10
Journal: ICIC Express Letters, Part B: Applications
Volume: 16
Issue number: 2
State: Published - Feb 2025

Bibliographical note

Publisher Copyright:
© 2025 ICIC International.

Keywords

  • 3D convolutional neural network
  • Bidirectional long short-term memory
  • Dynamic fingerspelling recognition
  • Fusion strategy
  • Spatiotemporal feature

ASJC Scopus subject areas

  • General Computer Science
