Review spam detection using active learning

M. N. Istiaq Ahsan, Tamzid Nahian, Abdullah All Kafi, Md Ismail Hossain, Faisal Muhammad Shah

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

25 Scopus citations

Abstract

As Internet access has become much easier over the last decade, people are using online applications more than ever. Online marketing, and e-commerce as a whole, is growing day by day. Online reviews play a very important role in this field and have proven valuable for decision making from a customer's point of view. Although this information is sensitive and significant, the authenticity of user-generated content (reviews, forums, blogs, discussion groups, etc.) is rarely ensured. That is why spamming, fake reviews, and fabricated opinions are on the rise. In effect, review spam has become a profitable business that undermines the genuineness of the information customers rely on. Several techniques have been introduced for this problem, but they depend mostly on empirical conditions, rating consistency, obvious content features, helpfulness voting, and the like, which limits their effectiveness. Most existing work uses supervised models, yet good-quality large-scale datasets are still very scarce, and most models use pseudo fake reviews instead of real fake reviews. In this research, we introduce an active learning approach to detect review spam using the TF-IDF features of the review content. Our model achieves substantial improvements in performance measures, working on almost 3600 reviews from different domains. In the best case, it achieves up to 88% accuracy, and precision, recall, and F-scores are above 85% in most cases. Additionally, about 2000 reviews were manually labeled during the process. The evaluation results indicate that this is a promising methodology for detecting review spam.
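The abstract does not say which classifier or query strategy the authors used, so the following is only a minimal sketch of the general idea: pool-based active learning over TF-IDF features. The nearest-centroid scorer, the uncertainty-sampling rule, and all names (`active_learning`, `oracle`, `budget`, and so on) are assumptions made for illustration, not the paper's method.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberately simple assumption)."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency for every term seen in docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(doc, idf):
    """L2-normalised TF-IDF vector as a sparse dict; unknown terms are dropped."""
    tf = Counter(t for t in tokenize(doc) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {t: v / norm for t, v in vec.items()}


def centroid(vectors):
    """Mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)  # Counter.update adds values term by term
    return {t: s / len(vectors) for t, s in total.items()}


def dot(a, b):
    return sum(v * b.get(t, 0.0) for t, v in a.items())


def centroids(labeled):
    """Class centroids for spam (label 1) and genuine (label 0) reviews."""
    spam = centroid([v for v, y in labeled if y == 1])
    genuine = centroid([v for v, y in labeled if y == 0])
    return spam, genuine


def active_learning(pool, seed, oracle, budget):
    """Query the `budget` most uncertain reviews from `pool` for manual labels.

    pool:   list of unlabeled review texts
    seed:   list of (text, label) pairs; must contain both labels 0 and 1
    oracle: callable text -> label, standing in for the human annotator
    Returns a classify(text) function and the list of queried (text, label).
    """
    idf = fit_idf(pool + [text for text, _ in seed])
    labeled = [(tfidf(text, idf), y) for text, y in seed]
    unlabeled = list(pool)
    queried = []
    for _ in range(min(budget, len(unlabeled))):
        spam, genuine = centroids(labeled)

        def margin(text):
            vec = tfidf(text, idf)
            return abs(dot(vec, spam) - dot(vec, genuine))

        # Uncertainty sampling: the smallest margin is the least certain review.
        pick = min(unlabeled, key=margin)
        unlabeled.remove(pick)
        label = oracle(pick)
        labeled.append((tfidf(pick, idf), label))
        queried.append((pick, label))

    spam, genuine = centroids(labeled)

    def classify(text):
        vec = tfidf(text, idf)
        return int(dot(vec, spam) - dot(vec, genuine) > 0)

    return classify, queried
```

In this sketch the `oracle` callback plays the role of the manual labeling step mentioned in the abstract: only the reviews the current model is least sure about are sent for annotation, which is what keeps the labeling effort below the size of the full pool.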

Original language: English
Title of host publication: 7th IEEE Annual Information Technology, Electronics and Mobile Communication Conference, IEEE IEMCON 2016
Editors: Himadri Nath Saha, Satyajit Chakrabarti
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781509009961
State: Published - 16 Nov 2016
Externally published: Yes
Event: 7th IEEE Annual Information Technology, Electronics and Mobile Communication Conference, IEEE IEMCON 2016 - Vancouver, Canada
Duration: 13 Oct 2016 to 15 Oct 2016

Publication series

Name: 7th IEEE Annual Information Technology, Electronics and Mobile Communication Conference, IEEE IEMCON 2016

Conference

Conference: 7th IEEE Annual Information Technology, Electronics and Mobile Communication Conference, IEEE IEMCON 2016
Country/Territory: Canada
City: Vancouver
Period: 13/10/16 to 15/10/16

Bibliographical note

Publisher Copyright:
© 2016 IEEE.

Keywords

  • Fake Review
  • Opinion Mining
  • Review spam detection
  • Spam Detection
  • Spam Review

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Instrumentation
  • Information Systems
