Abstract
Sentiment analysis is a growing research area that analyzes people’s opinions towards a specific target using posts shared on social media. However, spammers can inject false opinions to skew sentiment-oriented decisions, e.g., by promoting low-quality products or policies over others. Therefore, identifying and removing spam posts in social media is a crucial data cleaning operation for text mining tasks, including sentiment analysis. An inherent problem in spam detection is class imbalance. In this paper, we explore the impact of the imbalance ratio on the performance of Twitter spam detection using several single and ensemble classifiers. Besides ensemble-based learning (Bagging and Random Forest), we apply the SMOTE oversampling technique to improve detection performance, especially for classifiers sensitive to imbalanced datasets.
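As a rough illustration of the oversampling idea named in the abstract, the sketch below generates synthetic minority-class (spam) points by interpolating between a minority sample and one of its nearest minority neighbours, which is the core step of SMOTE. It is not the authors' implementation: the function names `smote` and `imbalance_ratio`, the toy feature vectors, and the choice of `k` are assumptions made only for this example.

```python
import random


def imbalance_ratio(n_majority, n_minority):
    """Majority-to-minority class size ratio (1.0 means balanced)."""
    return n_majority / n_minority


def smote(minority, n_synthetic, k=3, seed=0):
    """Create n_synthetic points, each on the line segment between a random
    minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        x = minority[i]
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)))
        neighbour = rng.choice(others[:k])
        u = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + u * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


# Hypothetical toy data: 4 spam feature vectors against 20 non-spam posts.
spam = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.2], [0.9, 1.9]]
n_ham = 20
new_spam = smote(spam, n_synthetic=n_ham - len(spam))
balanced_spam = spam + new_spam
```

In this toy setup the imbalance ratio drops from 5.0 to 1.0 once the synthetic points are added. In practice the oversampling would be applied to the training split only, so that evaluation still reflects the original class distribution.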
| Original language | English |
|---|---|
| Title of host publication | Security in Computing and Communications - 6th International Symposium, SSCC 2018, Revised Selected Papers |
| Editors | Sabu M. Thampi, Danda B. Rawat, Jose M. Alcaraz Calero, Sanjay Madria, Guojun Wang |
| Publisher | Springer Verlag |
| Pages | 157-167 |
| Number of pages | 11 |
| ISBN (Print) | 9789811358258 |
| State | Published - 2019 |
Publication series
| Name | Communications in Computer and Information Science |
|---|---|
| Volume | 969 |
| ISSN (Print) | 1865-0929 |
Bibliographical note
Publisher Copyright: © Springer Nature Singapore Pte Ltd. 2019.
Keywords
- Imbalanced dataset
- Opinion spam detection
- SMOTE
- Sentiment analysis
- Social big data
- Social media security
ASJC Scopus subject areas
- General Computer Science
- General Mathematics