Abstract
This paper presents a multimodal biometric system that fuses photoplethysmography (PPG) signals and fingerprints for robust human verification. Instead of relying on heterogeneous biosensors, the system obtains both PPG signals and fingerprints from video recordings captured by a smartphone camera as users place their fingers on the lens. To capture the unique characteristics of each user, we propose a homogeneous neural network consisting of two structured state space model (SSM) encoders, one per modality. Specifically, the fingerprint images are flattened into sequences of pixels, which, along with segmented PPG beat waveforms, are fed into the encoders. A cross-modal attention mechanism then learns more nuanced feature representations. Furthermore, the feature distributions of the two modalities are aligned within a unified latent space using a distribution-oriented contrastive loss. This alignment facilitates the learning of intrinsic and transferable intermodal relationships, thereby improving performance on unseen data. Experimental results on the datasets collected for this study demonstrate the superiority of the proposed approach across a broad range of evaluation metrics in both single-session and two-session authentication scenarios: the system achieved 100% accuracy and an equal error rate (EER) of 0.1% on single-session data, and 94.3% accuracy and a 6.9% EER on two-session data.
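The pipeline the abstract describes (flatten a fingerprint image into a pixel sequence, encode each modality as a token sequence, then let one modality attend over the other) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the input sizes, the linear projections standing in for the SSM encoders, and all variable names are assumptions made for the example; a real system would use trained S4-style encoders and much larger inputs.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # shared embedding dimension (illustrative choice)

# Hypothetical inputs: an 8x8 fingerprint patch and one segmented PPG beat.
fingerprint = rng.random((8, 8))   # grayscale patch
ppg_beat = rng.random(50)          # one beat waveform, 50 samples

# Flatten the fingerprint into a sequence of pixels, as the abstract describes;
# the PPG beat is already a 1-D sample sequence.
fp_seq = fingerprint.reshape(-1, 1)   # (64, 1)
ppg_seq = ppg_beat.reshape(-1, 1)     # (50, 1)

# Stand-in "encoders": random linear projections into a shared d-dim token
# space. (The paper uses SSM encoders; a full state-space model is omitted.)
W_fp = rng.standard_normal((1, d))
W_ppg = rng.standard_normal((1, d))
fp_tokens = fp_seq @ W_fp     # (64, d)
ppg_tokens = ppg_seq @ W_ppg  # (50, d)

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: queries attend over another modality."""
    scores = queries @ keys.T / np.sqrt(queries.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # rows sum to 1
    return weights @ values

# Fingerprint tokens attend to PPG tokens; a full model would also do the
# symmetric direction (PPG attending to fingerprint).
fused = cross_attention(fp_tokens, ppg_tokens, ppg_tokens)
print(fused.shape)  # (64, 16): one PPG-informed vector per fingerprint pixel
```

The fused tokens would then feed the verification head; the distribution-oriented contrastive loss mentioned in the abstract would additionally pull the `fp_tokens` and `ppg_tokens` distributions together in the shared latent space.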
| Original language | English |
|---|---|
| Pages (from-to) | 1-7 |
| Number of pages | 7 |
| Journal | Pattern Recognition Letters |
| Volume | 197 |
| DOIs | |
| State | Published - Nov 2025 |
Bibliographical note
Publisher Copyright: © 2025 Elsevier B.V.
Keywords
- Biometrics
- Distribution alignment
- Fingerprints
- Multimodal deep learning
- PPG
- SSM
ASJC Scopus subject areas
- Software
- Signal Processing
- Computer Vision and Pattern Recognition
- Artificial Intelligence