AVSE Challenge: Audio-Visual Speech Enhancement Challenge

  • Andrea Lorena Aldana Blanco
  • Cassia Valentini-Botinhao
  • Ondrej Klejch
  • Mandar Gogate
  • Kia Dashtipour
  • Amir Hussain
  • Peter Bell

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

21 Scopus citations

Abstract

Audio-visual speech enhancement is the task of improving the quality of a speech signal when video of the speaker is available. It opens up the opportunity of improving speech intelligibility in adverse listening scenarios that are currently too challenging for audio-only speech enhancement models. The Audio-Visual Speech Enhancement (AVSE) challenge aims to set the first benchmark in this area. We provide participants with datasets and scripts to test their audio-visual speech enhancement models under a common framework for both training and evaluation. The data is derived from real-world videos and comprises noisy mixes in which audio from the target speaker is mixed with either a competing speaker or a noise signal. The submitted systems are evaluated through AV intelligibility tests involving human participants. We expect this challenge to be a platform for advancing the field of audio-visual speech enhancement and to provide further insight into the scope and limitations of current AV speech enhancement approaches.

Original language: English
Title of host publication: 2022 IEEE Spoken Language Technology Workshop, SLT 2022 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 465-471
Number of pages: 7
ISBN (Electronic): 9798350396904
State: Published - 2023
Externally published: Yes
Event: 2022 IEEE Spoken Language Technology Workshop, SLT 2022 - Doha, Qatar
Duration: 9 Jan 2023 to 12 Jan 2023

Publication series

Name: 2022 IEEE Spoken Language Technology Workshop, SLT 2022 - Proceedings

Conference

Conference: 2022 IEEE Spoken Language Technology Workshop, SLT 2022
Country/Territory: Qatar
City: Doha
Period: 9/01/23 to 12/01/23

Bibliographical note

Publisher Copyright:
© 2023 IEEE.

Keywords

  • Audio-visual speech enhancement
  • LRS3 dataset
  • subjective intelligibility

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Hardware and Architecture
  • Media Technology
  • Instrumentation
  • Linguistics and Language