Abstract
Audio-visual speech enhancement is the task of improving the quality of a speech signal when video of the speaker is available. It opens up the opportunity to improve speech intelligibility in adverse listening scenarios that are currently too challenging for audio-only speech enhancement models. The Audio-Visual Speech Enhancement (AVSE) challenge aims to set the first benchmark in this area. We provide participants with datasets and scripts to test their audio-visual speech enhancement models under a common framework for both training and evaluation. The data is derived from real-world videos and comprises noisy mixes in which audio from a target speaker is mixed with either a competing speaker or a noise signal. The submitted systems are evaluated by conducting AV intelligibility tests involving human participants. We expect this challenge to be a platform for advancing the field of audio-visual speech enhancement and to provide further insight into the scope and limitations of current AV speech enhancement approaches.
| Original language | English |
|---|---|
| Title of host publication | 2022 IEEE Spoken Language Technology Workshop, SLT 2022 - Proceedings |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 465-471 |
| Number of pages | 7 |
| ISBN (Electronic) | 9798350396904 |
| State | Published - 2023 |
| Externally published | Yes |
| Event | 2022 IEEE Spoken Language Technology Workshop, SLT 2022 - Doha, Qatar; Duration: 9 Jan 2023 → 12 Jan 2023 |
Publication series
| Name | 2022 IEEE Spoken Language Technology Workshop, SLT 2022 - Proceedings |
|---|---|
Conference
| Conference | 2022 IEEE Spoken Language Technology Workshop, SLT 2022 |
|---|---|
| Country/Territory | Qatar |
| City | Doha |
| Period | 9/01/23 → 12/01/23 |
Bibliographical note
Publisher Copyright: © 2023 IEEE.
Keywords
- Audio-visual speech enhancement
- LRS3 dataset
- subjective intelligibility
ASJC Scopus subject areas
- Computer Vision and Pattern Recognition
- Hardware and Architecture
- Media Technology
- Instrumentation
- Linguistics and Language
Fingerprint
Dive into the research topics of 'AVSE Challenge: Audio-Visual Speech Enhancement Challenge'. Together they form a unique fingerprint.