Abstract
Continuous sign language recognition (CSLR) focuses on interpreting and transcribing sequences of sign language gestures in videos. In this work, we propose CLIP sign language adaptation (CLIP-SLA), a novel CSLR framework that adapts the powerful pre-trained visual encoder from the CLIP model to sign language tasks through parameter-efficient fine-tuning (PEFT). We introduce two variants, SLA-Adapter and SLA-LoRA, which integrate PEFT modules into the CLIP visual encoder, enabling fine-tuning with minimal trainable parameters. The effectiveness of the proposed frameworks is validated on four datasets: Phoenix2014, Phoenix2014-T, CSL-Daily, and Isharah500, where both CLIP-SLA variants outperform several state-of-the-art (SOTA) models while using fewer trainable parameters. Extensive ablation studies demonstrate the effectiveness and flexibility of the proposed methods with different vision-language models for CSLR. These findings showcase the potential of adapting large-scale pre-trained models for scalable and efficient CSLR, paving the way for future advancements in sign language understanding. Code is available at https://github.com/snalyami/CLIP-SLA.
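To make the idea of fine-tuning with few trainable parameters concrete, the following is a minimal sketch of a generic LoRA-style linear layer in plain Python. It is not the CLIP-SLA implementation: the class `LoRALinear`, the helpers `matvec` and `lora_param_counts`, and all dimensions are hypothetical and chosen only for illustration. The frozen weight stays untouched, and only two small low-rank factors would be trained.

```python
# Illustrative only: a generic LoRA-style linear layer in plain Python.
# This is NOT the CLIP-SLA code; all names and sizes are hypothetical.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update scale * (B @ A)."""

    def __init__(self, frozen_weight, rank, alpha=1.0):
        self.weight = frozen_weight  # frozen, shape (d_out, d_in)
        d_out, d_in = len(frozen_weight), len(frozen_weight[0])
        self.scale = alpha / rank
        # Trainable factors. B starts at zero, so the layer initially
        # reproduces the frozen layer exactly.
        self.A = [[0.01] * d_in for _ in range(rank)]   # shape (rank, d_in)
        self.B = [[0.0] * rank for _ in range(d_out)]   # shape (d_out, rank)

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        """Count only the entries of A and B; the frozen weight is excluded."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


def lora_param_counts(d_in, d_out, rank):
    """Return (full fine-tuning params, LoRA params) for one weight matrix."""
    return d_in * d_out, rank * (d_in + d_out)


layer = LoRALinear([[1.0, 2.0], [3.0, 4.0]], rank=1)
output = layer.forward([1.0, 1.0])
full, lora = lora_param_counts(768, 768, 8)  # 768 and 8 are assumed sizes
```

Under these assumed sizes, the low-rank factors hold `rank * (d_in + d_out)` values instead of the `d_in * d_out` values of the full matrix, which is where the parameter savings of this family of methods come from.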
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2025 |
| Publisher | IEEE Computer Society |
| Pages | 4098-4108 |
| Number of pages | 11 |
| ISBN (Electronic) | 9798331599942 |
| DOIs | |
| State | Published - 2025 |
| Event | 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2025 - Nashville, United States. Duration: 11 Jun 2025 → 12 Jun 2025 |
Publication series
| Name | IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops |
|---|---|
| ISSN (Print) | 2160-7508 |
| ISSN (Electronic) | 2160-7516 |
Conference
| Conference | 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2025 |
|---|---|
| Country/Territory | United States |
| City | Nashville |
| Period | 11/06/25 → 12/06/25 |
Bibliographical note
Publisher Copyright: © 2025 IEEE.
Keywords
- continuous sign language recognition
- gesture recognition
- parameter-efficient fine-tuning
- sign language recognition
- vision-language models
ASJC Scopus subject areas
- Computer Vision and Pattern Recognition
- Electrical and Electronic Engineering