Disentangling 3D/4D Facial Affect Recognition with Faster Multi-View Transformer

Muzammil Behzad, Xiaobai Li, Guoying Zhao*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

12 Scopus citations

Abstract

In this paper, we propose MiT, a novel multi-view transformer model for 3D/4D facial affect recognition. MiT incorporates patch and position embeddings from the patches of multiple views and uses them to learn various facial muscle movements, yielding effective recognition performance. We also propose a multi-view loss function that is gradient-friendly, and hence speeds up gradient computation during back-propagation, while also leveraging the correlation among the underlying facial patterns across views. Additionally, we introduce trainable multi-view weights that substantially aid training. Finally, we equip our model with distributed training for faster learning and computational convenience. Through extensive experiments, we show that our model outperforms existing methods on widely used datasets for 3D/4D FER.
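The abstract describes combining per-view information through trainable multi-view weights inside the loss. As a minimal sketch of that idea (not the authors' implementation; the softmax normalization, view names, and function signature are illustrative assumptions), one could weight each view's loss by a learnable parameter normalized to sum to one:

```python
import numpy as np

def combine_view_losses(view_losses, view_logits):
    """Hedged sketch: weight per-view losses by softmax-normalized
    trainable view weights, one logit per view.

    view_losses : array of shape (V,), loss from each of V views
    view_logits : array of shape (V,), trainable parameters
    """
    # Numerically stable softmax turns raw logits into weights summing to 1
    w = np.exp(view_logits - view_logits.max())
    w /= w.sum()
    # Weighted sum of the per-view losses
    return float(np.dot(w, view_losses))

# Example with three hypothetical views (e.g. frontal, left, right);
# equal logits give each view weight 1/3.
losses = np.array([0.9, 1.1, 1.0])
total = combine_view_losses(losses, np.zeros(3))  # → 1.0
```

In a real training loop these logits would be registered as model parameters and updated by back-propagation along with the transformer weights, letting the model learn which views are most informative.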

Original language: English
Pages (from-to): 1913-1917
Number of pages: 5
Journal: IEEE Signal Processing Letters
Volume: 28
DOIs
State: Published - 2021
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 1994-2012 IEEE.

Keywords

  • 3D/4D faces
  • Affect
  • emotion recognition
  • multi-views
  • transformer

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Applied Mathematics

