Pedestrian Crossing Intent Prediction Using Vision Transformers

  • Ahmed Elgazwy*
  • Somayya Elmoghazy*
  • Khalid Elgazzar*
  • Alaa Khamis*
  • *Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Predicting pedestrian intention is crucial for self-driving vehicles and remains one of their most challenging problems. A fast, efficient, and robust vision-based model is therefore required to predict pedestrian crossing as early as possible and to prevent serious injuries or casualties. Transformers have rapidly replaced recurrent neural network (RNN)-based architectures owing to their better generalization and fast performance. The vision transformer (ViT) is a transformer variant that has also proven efficient in image classification and has outperformed state-of-the-art convolutional neural networks (CNNs) when trained on large datasets. In this paper, a fully transformer-based architecture is presented to predict pedestrian intention efficiently with minimum latency. The proposed architecture is composed of two branches: the first handles the non-visual features, while the second handles the visual features. The model is trained on the Joint Attention in Autonomous Driving (JAAD) dataset, and different variants of the architecture are tested to find the optimal model. Experimental analysis shows that the proposed model outperforms all previous state-of-the-art techniques, achieving the highest accuracy (83%) and F1 score (64%) on the testing dataset while maintaining the lowest processing time.
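The abstract reports accuracy and F1 score as separate figures. As a minimal sketch of how the two metrics relate, the snippet below computes both from binary confusion-matrix counts. It assumes "crossing" is treated as the positive class. The function name and the counts are hypothetical and chosen only for illustration; they are not the paper's results or evaluation code.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, F1 of the positive class) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Precision and recall are defined for the positive class only.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


# Hypothetical, imbalanced counts (illustration only, not taken from the paper).
acc, f1 = accuracy_and_f1(tp=30, fp=20, fn=20, tn=130)
print(f"accuracy={acc:.2f}, f1={f1:.2f}")
```

Because F1 ignores true negatives, it can sit well below accuracy when the negative class dominates, which is one reason both numbers are commonly reported together.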

Original language: English
Title of host publication: 2024 IEEE 27th International Conference on Intelligent Transportation Systems, ITSC 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1211-1216
Number of pages: 6
ISBN (Electronic): 9798331505929
State: Published - 2024
Externally published: Yes
Event: 27th IEEE International Conference on Intelligent Transportation Systems, ITSC 2024 - Edmonton, Canada
Duration: 24 Sep 2024 to 27 Sep 2024

Publication series

Name: IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC
ISSN (Print): 2153-0009
ISSN (Electronic): 2153-0017

Conference

Conference: 27th IEEE International Conference on Intelligent Transportation Systems, ITSC 2024
Country/Territory: Canada
City: Edmonton
Period: 24/09/24 to 27/09/24

Bibliographical note

Publisher Copyright:
© 2024 IEEE.

Keywords

  • Pedestrian Crossing Intent Prediction
  • Self-driving Vehicles
  • Transformers
  • Vision Transformers

ASJC Scopus subject areas

  • Automotive Engineering
  • Mechanical Engineering
  • Computer Science Applications
