
Fusing audio, visual and textual clues for sentiment analysis from multimodal content

  • Soujanya Poria
  • Erik Cambria*
  • Newton Howard
  • Guang Bin Huang
  • Amir Hussain

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

473 Scopus citations

Abstract

A huge number of videos are posted every day on social media platforms such as Facebook and YouTube. This makes the Internet an unlimited source of information. In the coming decades, coping with such information and mining useful knowledge from it will be an increasingly difficult task. In this paper, we propose a novel methodology for multimodal sentiment analysis, which harvests sentiments from Web videos using a model that draws on audio, visual and textual modalities as sources of information. We use both feature- and decision-level fusion methods to merge affective information extracted from multiple modalities. A thorough comparison with existing works in this area is carried out throughout the paper, which demonstrates the novelty of our approach. Preliminary comparative experiments with the YouTube dataset show that the proposed multimodal system achieves an accuracy of nearly 80%, outperforming all state-of-the-art systems by more than 20%.
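The two fusion strategies named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy feature values, and the weighted-average rule for decision-level fusion are all assumptions chosen for illustration. Feature-level fusion joins the per-modality feature vectors before classification, while decision-level fusion combines the outputs of separate per-modality classifiers.

```python
from typing import Dict, List, Optional


def feature_level_fusion(features: Dict[str, List[float]]) -> List[float]:
    """Concatenate per-modality feature vectors into one joint vector.

    Hypothetical helper: a single classifier would then be trained on the result.
    """
    fused: List[float] = []
    for modality in sorted(features):  # fixed order keeps the layout reproducible
        fused.extend(features[modality])
    return fused


def decision_level_fusion(
    probs: Dict[str, List[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Combine per-modality class probabilities by a weighted average.

    Assumption: each modality has its own classifier that outputs a
    probability per sentiment class; equal weights are used by default.
    """
    if weights is None:
        weights = {modality: 1.0 for modality in probs}
    total = sum(weights[modality] for modality in probs)
    n_classes = len(next(iter(probs.values())))
    fused = [0.0] * n_classes
    for modality, p in probs.items():
        for i, value in enumerate(p):
            fused[i] += weights[modality] / total * value
    return fused


# Toy inputs (made-up numbers, not taken from the paper).
toy_features = {"audio": [0.1, 0.2], "visual": [0.3], "text": [0.4, 0.5]}
toy_probs = {"audio": [0.6, 0.4], "visual": [0.2, 0.8], "text": [0.7, 0.3]}

joint_vector = feature_level_fusion(toy_features)
fused_probs = decision_level_fusion(toy_probs)
```

In this sketch `joint_vector` has one entry per input feature across all three modalities, and `fused_probs` remains a valid probability distribution as long as each per-modality input is one.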

Original language: English
Pages (from-to): 50-59
Number of pages: 10
Journal: Neurocomputing
Volume: 174
State: Published - 22 Jan 2016
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2015 Elsevier B.V.

Keywords

  • Big social data analysis
  • Multimodal fusion
  • Multimodal sentiment analysis
  • Opinion mining
  • Sentic computing

ASJC Scopus subject areas

  • Computer Science Applications
  • Cognitive Neuroscience
  • Artificial Intelligence
