An ensemble approach to detect review spam using hybrid machine learning technique

M. N. Istiaq Ahsan, Tamzid Nahian, Abdullah All Kafi, Md Ismail Hossain, Faisal Muhammad Shah

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

14 Scopus citations

Abstract

Online reviews have become a vital component of e-commerce in recent years, as many people seek out other opinions before buying a product or using a service online. In the era of Web 2.0, people understandably rely on online reviews more than ever when making a decision. However, the authenticity of this sensitive and valuable information is rarely guaranteed. To gain unethical advantages, many people post fake reviews or fabricated opinions to promote or devalue a particular product or service, which obscures the genuine facts. Many methods have been introduced to detect fake reviews, drawing on content features, rating consistency, empirical conditions, helpfulness votes, and so on. Most of them are supervised models that rely largely on pseudo fake reviews, and the scarcity of good-quality, large-scale labeled datasets remains a hindrance. In this paper, we introduce an ensemble learning approach that combines two different types of learning methods (active and supervised) by creating a hybrid dataset of both real-life and pseudo reviews. The model has three filtering phases, based on KL and JS distance, TF-IDF features, and n-gram features of the review content. It achieves strong results on almost 3600 reviews from different domains. In the best case, the precision, recall, and F-score are above 95%, and the accuracy is slightly above 88%. In the process, about 2000 reviews were manually labeled. Evaluation and comparison with other successful methods indicate that this detection method is efficient and very promising.
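The abstract names KL and JS distance over review content as the basis of one filtering phase but does not say how they are computed. The following is a minimal sketch, not the authors' implementation. It assumes reviews are compared as smoothed unigram (word-frequency) distributions over a shared vocabulary, and that "distance" refers to the Kullback-Leibler and Jensen-Shannon divergences. The function names, the add-alpha smoothing, and the two sample reviews are all hypothetical.

```python
import math
from collections import Counter


def word_distribution(text, vocab, alpha=1.0):
    """Add-alpha smoothed unigram distribution of `text` over `vocab` (assumed setup)."""
    counts = Counter(text.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return [(counts[w] + alpha) / total for w in vocab]


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; not symmetric in p and q."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; symmetric and bounded by 1."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical reviews, used only to exercise the functions.
review_a = "great phone great battery fast delivery"
review_b = "terrible phone poor battery slow delivery"
vocab = sorted(set(review_a.split()) | set(review_b.split()))

p = word_distribution(review_a, vocab)
q = word_distribution(review_b, vocab)

print("KL(a || b):", kl_divergence(p, q))
print("KL(b || a):", kl_divergence(q, p))
print("JS(a, b):  ", js_divergence(p, q))
```

Smoothing keeps every probability positive, so the KL term never divides by zero. The JS divergence averages two KL terms against the mixture distribution, which makes it symmetric and keeps it between 0 and 1 when logarithms are taken base 2.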

Original language: English
Title of host publication: 19th International Conference on Computer and Information Technology, ICCIT 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 388-394
Number of pages: 7
ISBN (Electronic): 9781509040896
State: Published - 21 Feb 2017
Externally published: Yes
Event: 19th International Conference on Computer and Information Technology, ICCIT 2016 - Dhaka, Bangladesh
Duration: 18 Dec 2016 → 20 Dec 2016

Publication series

Name: 19th International Conference on Computer and Information Technology, ICCIT 2016

Conference

Conference: 19th International Conference on Computer and Information Technology, ICCIT 2016
Country/Territory: Bangladesh
City: Dhaka
Period: 18/12/16 → 20/12/16

Bibliographical note

Publisher Copyright:
© 2016 IEEE.

Keywords

  • Fake review
  • Machine learning
  • Review spam detection
  • Spam detection
  • Spam review

ASJC Scopus subject areas

  • General Computer Science
