Knowledge distillation with predicted depth for robust and lightweight face presentation attack detection

Research output: Contribution to journal › Article › peer-review

Abstract

Face Presentation Attack Detection (FacePAD) is critical for safeguarding face recognition systems against spoofing attempts, including printed photos, video replays, and 3D masks. However, many existing approaches struggle to generalize across diverse attack types and real-world conditions. In this study, we propose a dual-branch deep learning framework that leverages both RGB images and synthetically predicted depth maps to improve anti-spoofing robustness and accuracy. A monocular depth estimation network generates depth cues from a single RGB image, which are then processed in parallel with the original image through two distinct branches of a convolutional neural network. The extracted features (texture-based from RGB and structure-aware from depth) are fused via concatenation to facilitate more discriminative spoof detection. Extensive experiments on four benchmark datasets demonstrate that our method achieves state-of-the-art performance, reducing HTER to 0% on Replay-Attack and Replay-Mobile and to 1.023% on ROSE-Youtu. It also achieves an ACER of 0.56% on OULU-NPU while maintaining computational efficiency. Furthermore, we introduce a knowledge distillation scheme to compress the dual-branch model into a lightweight single-branch variant suitable for real-time deployment in mobile authentication, surveillance, and biometric access control scenarios.
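
The abstract does not specify the form of the distillation objective. As a minimal sketch only, assuming the classic soft-label formulation (a temperature-scaled KL term between teacher and student outputs plus a cross-entropy term on the ground-truth label), the loss for one sample could look like the following. The function names, the two-class logit layout (index 0 = live, index 1 = spoof), and the `temperature` and `alpha` defaults are illustrative assumptions, not values from the article.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Assumed soft-label distillation loss for a single sample.

    loss = alpha * T^2 * KL(teacher || student) + (1 - alpha) * CE(student, label)
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL divergence between the softened teacher and student distributions
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Standard cross-entropy of the student against the hard label (T = 1)
    hard = -math.log(softmax(student_logits)[label])
    return alpha * temperature ** 2 * kl + (1 - alpha) * hard


# Hypothetical logits: the dual-branch teacher is confident the face is live,
# while the single-branch student is still uncertain.
teacher = [3.0, -2.0]
student = [1.0, 0.5]
print(distillation_loss(student, teacher, label=0))
```

In this formulation the `temperature ** 2` factor keeps the soft-label gradient on a scale comparable to the hard-label term, and `alpha` trades off imitation of the teacher against fitting the ground truth.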

Original language: English
Article number: 114325
Journal: Knowledge-Based Systems
Volume: 329
State: Published - 4 Nov 2025

Bibliographical note

Publisher Copyright:
© 2025 Elsevier B.V.

Keywords

  • Deep learning
  • Face anti-spoofing
  • Face presentation attack detection
  • MobileNetV3
  • Spatio-temporal

ASJC Scopus subject areas

  • Management Information Systems
  • Software
  • Information Systems and Management
  • Artificial Intelligence
