Abstract
Keyword spotting (KWS) is important in numerous trigger, trigger-command, and command-and-control applications on embedded platforms. However, the embedded platforms currently used in the fast-growing Internet of Things (IoT) market and in standalone systems still have considerable processing power, memory, and battery constraints. In IoT and smart-device applications, speakers are usually far from the microphone, which results in severe distortion, considerable noise, and noticeable reverberation. Speech enhancement can be used as a front-end or pre-processing module to improve KWS performance. However, denoisers and dereverberators used as front-end processing modules add to the complexity of the keyword spotting system and to the computing, memory, and battery requirements of the embedded platform. In this paper, a noise-robust keyword spotting engine with a small memory footprint is presented. Multi-condition utterance training of a deep neural network model is developed to increase the noise robustness of keyword spotting. A comparative study is conducted to compare the deep learning approach with a Gaussian mixture model. Experimental results show that deep learning outperforms the Gaussian approach in both clean and noisy conditions. Moreover, a deep learning model trained on partially noisy data removes the need for a speech enhancement module or denoiser for front-end processing.
| Original language | English |
|---|---|
| Title of host publication | Image Analysis and Recognition - 16th International Conference, ICIAR 2019, Proceedings |
| Editors | Fakhri Karray, Alfred Yu, Aurélio Campilho |
| Publisher | Springer Verlag |
| Pages | 134-146 |
| Number of pages | 13 |
| ISBN (Print) | 9783030272715 |
| State | Published - 2019 |
| Externally published | Yes |
| Event | 16th International Conference on Image Analysis and Recognition, ICIAR 2019 - Waterloo, Canada; Duration: 27 Aug 2019 → 29 Aug 2019 |
Publication series
| Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
|---|---|
| Volume | 11663 LNCS |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Conference
| Conference | 16th International Conference on Image Analysis and Recognition, ICIAR 2019 |
|---|---|
| Country/Territory | Canada |
| City | Waterloo |
| Period | 27/08/19 → 29/08/19 |
Bibliographical note
Publisher Copyright: © Springer Nature Switzerland AG 2019.
Keywords
- Deep belief network
- Deep learning
- Embedded platform
- Keyword spotting
- Noisy speech
- Phoneme classification
ASJC Scopus subject areas
- Theoretical Computer Science
- General Computer Science
Fingerprint
Dive into the research topics of 'A deep learning-based noise-resilient keyword spotting engine for embedded platforms'. Together they form a unique fingerprint.