Abstract
Speech emotion recognition remains a challenging problem and an active area of research, especially in mixed-language scenarios. In this paper, we show that emotion and language type are dependent and that emotion recognition accuracy improves when the nature of the different languages is taken into account. Recognizing emotion across such language diversity can be challenging and may result in a very large number of classes (the product of the number of emotion types and the number of languages). Here, we propose a cross-language emotion recognition system using Deep Learning Neural Networks (DNN). We show that the system accurately recognizes six common types of emotion: Neutral, Happy, Angry, Sad, Fear and Bored. Although our experiments considered two languages (from the Berlin and Polish databases), the work can be extended to a larger pool of languages. The overall recognition accuracy obtained with the proposed technique exceeded 93%. Notably, the proposed algorithm far outperforms systems that do not consider the nature of specific languages, whose recognition accuracy is only around 50%.
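The class-count remark in the abstract can be illustrated with a minimal sketch. The abstract does not describe the network architecture, so the two-stage routing below (pick the language first, then apply that language's emotion scores) is only one assumed way of "taking the language into account", not the authors' method. The language labels, the `route` function, and all scores are hypothetical placeholders standing in for DNN outputs.

```python
from itertools import product

EMOTIONS = ["Neutral", "Happy", "Angry", "Sad", "Fear", "Bored"]
# Assumed labels for the languages of the Berlin and Polish databases.
LANGUAGES = ["German", "Polish"]

# Flat formulation: one class per (language, emotion) pair,
# i.e. number of languages times number of emotion types.
joint_classes = [f"{lang}:{emo}" for lang, emo in product(LANGUAGES, EMOTIONS)]


def route(language_scores, emotion_scores_by_language):
    """Hypothetical two-stage decision: choose the most likely language,
    then the most likely emotion under that language's scores."""
    language = max(language_scores, key=language_scores.get)
    emotion_scores = emotion_scores_by_language[language]
    emotion = max(emotion_scores, key=emotion_scores.get)
    return language, emotion


# Made-up scores standing in for classifier outputs.
language_scores = {"German": 0.2, "Polish": 0.8}
emotion_scores_by_language = {
    "German": {e: 1.0 / len(EMOTIONS) for e in EMOTIONS},
    "Polish": {"Neutral": 0.1, "Happy": 0.05, "Angry": 0.6,
               "Sad": 0.1, "Fear": 0.1, "Bored": 0.05},
}

result = route(language_scores, emotion_scores_by_language)
print(len(joint_classes), result)
```

With six emotions and two languages, the flat formulation yields 12 classes, whereas the routed variant only ever decides among 2 languages and then 6 emotions.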
| Original language | English |
|---|---|
| Title of host publication | 2018 15th International Multi-Conference on Systems, Signals and Devices, SSD 2018 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 1241-1245 |
| Number of pages | 5 |
| ISBN (Electronic) | 9781538653050 |
| State | Published - 7 Dec 2018 |
Publication series
| Name | 2018 15th International Multi-Conference on Systems, Signals and Devices, SSD 2018 |
|---|---|
Bibliographical note
Publisher Copyright: © 2018 IEEE.
Keywords
- deep learning
- language recognition
- speech emotion recognition
ASJC Scopus subject areas
- Computer Networks and Communications
- Signal Processing
- Control and Optimization
- Instrumentation