Transfer Learning-Based Automatic Sentiment Annotation of a Twitter-Based Arabic Mental Illness (AMI) Dataset

  • Arwa Diwali*
  • Kawther Saeedi
  • Kia Dashtipour
  • Mandar Gogate
  • Zain Hussain
  • Adam Howard
  • Amir Hussain
  • *Corresponding author for this work

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Sentiment analysis, crucial for discerning emotional tones in text, relies on manually annotated data to train machine learning models, and manual annotation is considered the gold standard for creating annotated corpora. However, this process is time-consuming, labour-intensive, and prone to biases. This paper proposes an automatic annotation approach for the Twitter-based Arabic Mental Illness (AMI) dataset, which encompasses both Modern Standard Arabic and Dialectal Arabic. The approach leverages transfer learning with existing manually annotated datasets and three advanced Arabic language models to automate annotation, thereby enriching Arabic, a low-resource language, with labelled sentiment data. Validation was conducted by comparing the automatically generated annotations with manual annotations of the same dataset, achieving strong inter-annotator agreement with a Cohen's kappa of κ = 0.8457. Additionally, various baseline models were evaluated on the AMI dataset, identifying AraBERT as the top performer with the highest F1 score and accuracy.
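For readers unfamiliar with the agreement statistic reported in the abstract, the following is a minimal sketch of how Cohen's kappa can be computed between two label sequences. It is not the authors' code: the function name `cohen_kappa`, the label names, and the toy data are all illustrative assumptions, and the figure it prints has no connection to the reported κ = 0.8457.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: sum over labels of the product of each annotator's
    # marginal label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy data (assumed for illustration only): manual vs. automatic sentiment labels.
manual = ["pos", "neg", "neu", "neg", "pos", "neg", "neu", "pos"]
automatic = ["pos", "neg", "neu", "neg", "neg", "neg", "neu", "pos"]

print(round(cohen_kappa(manual, automatic), 4))  # prints 0.8095
```

Kappa corrects the raw agreement rate (7 of 8 items here) for the agreement expected by chance given each annotator's label distribution, which is why it is lower than the raw rate of 0.875.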

Original language: English
Article number: e70128
Journal: Expert Systems
Volume: 42
Issue number: 10
State: Published - Oct 2025
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2025 The Author(s). Expert Systems published by John Wiley & Sons Ltd.

Keywords

  • Arabic sentiment analysis
  • Corpus
  • data annotation
  • language models

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Theoretical Computer Science
  • Computational Theory and Mathematics
  • Artificial Intelligence
