Sentiment analysis of tweets from airlines in the gulf region using machine learning

Mazen M. Hrazi, Abdulrahman M. Althagafi, Abdullah T. Aljuhani, Jenifar Rahman, Md Mahfuzur Rahman, Mohammad Shorfuzzaman

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

1 Scopus citations

Abstract

Twitter is one of the most popular social networking services in the world. Over the past few years, people have been using Twitter messages daily to express their views and share their feelings. As a result, the volume of data is increasing dramatically, creating an opportunity for researchers to use these tweets as sources for data mining and to extract valuable information. Because Twitter is popular in Saudi Arabia, we believe that Twitter messages (tweets) are a good source for capturing people's sentiment. These messages can mostly be divided into two classes: positive or negative. Our goal is to design and implement a sentiment analyzer that classifies real tweets collected through the Twitter API into one of these categories. We used a machine-learning-based sentiment analysis method and applied several supervised learning algorithms: Logistic Regression, Naïve Bayes, Support Vector Machine, and Decision Trees. To accomplish this, we preprocess the dataset chosen to train our classifier. We used Bag-of-words and TF-IDF techniques to extract features from the preprocessed tweets. We also used uni-grams, bi-grams, and tri-grams to rank our features and identify the best predictive accuracy for the classifier. Among the classification techniques that we used, Logistic Regression performs best in terms of accuracy on the validation and test data when using tri-gram features with stop words and the TF-IDF feature extraction technique.
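The feature extraction step named in the abstract (n-grams up to tri-grams, stop words retained, TF-IDF weighting) can be illustrated with a minimal standard-library sketch. It is not the authors' implementation: the abstract does not state the tokenizer, the exact TF-IDF formula, or the dataset, so the whitespace tokenization, the plain `tf * log(N / df)` weighting, the helper names `ngrams` and `tfidf`, and the two sample tweets below are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=3):
    """Return all contiguous n-grams of length 1..n_max as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf(docs, n_max=3):
    """Compute one {n-gram: weight} dict per document.

    docs is a list of token lists. The weight is the assumed textbook form
    tf * log(N / df), with tf normalized by the document's n-gram count.
    """
    counts_per_doc = [Counter(ngrams(doc, n_max)) for doc in docs]

    # Document frequency: in how many documents each n-gram appears.
    df = Counter()
    for counts in counts_per_doc:
        df.update(counts.keys())

    n_docs = len(docs)
    weights = []
    for counts in counts_per_doc:
        total = sum(counts.values())
        weights.append({
            gram: (count / total) * math.log(n_docs / df[gram])
            for gram, count in counts.items()
        })
    return weights


# Hypothetical sample tweets; stop words such as "the" are kept on purpose.
tweets = [
    "the flight was great".split(),
    "the flight was delayed".split(),
]
weights = tfidf(tweets)
```

In this sketch, n-grams shared by every tweet (for example "the flight") receive a weight of zero, while n-grams unique to one tweet (for example "was delayed") receive a positive weight. The resulting vectors would then be passed to a classifier such as Logistic Regression.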

Original language: English
Title of host publication: 2021 International Conference of Women in Data Science at Taif University, WiDSTaif 2021
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781665449489
State: Published - 30 Mar 2021

Publication series

Name: 2021 International Conference of Women in Data Science at Taif University, WiDSTaif 2021

Bibliographical note

Publisher Copyright:
© 2021 IEEE.

Keywords

  • Bag-of-words model
  • Data Mining
  • Machine Learning
  • Sentiment Analysis
  • Stemming
  • Stop Words
  • TF-IDF model
  • Text Preprocessor

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Computer Science Applications
