
Ma’aks: manually-curated parallel dataset for Arabic text sentiment swap

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

NLP research has made significant strides in sentiment style transfer, the task of modifying the linguistic style of a text while preserving its content. However, most existing datasets are non-parallel and focus on English, neglecting low-resource languages like Arabic. The lack of comprehensive Arabic parallel datasets has hindered the development and evaluation of robust sentiment transfer models for Arabic. To address this, we introduce MA’AKS, a novel Arabic parallel dataset for sentiment style transfer. MA’AKS consists of 5k sentences in Modern Standard Arabic with positive/negative sentiments. Each sentence is meticulously annotated to ensure high-quality parallel sentiment pairs, supporting both supervised and unsupervised learning. To benchmark the dataset, we evaluated the AceGPT, JAIS, and Llama-3 LLMs on Arabic sentiment transfer under different learning settings, including zero-shot, few-shot, and fine-tuning. By publicly releasing MA’AKS, its annotation guidelines, and the experiment code, we aim to advance research on Arabic sentiment transfer and contribute to the NLP community.

Original language: English
Article number: 1
Journal: Language Resources and Evaluation
Volume: 60
Issue number: 1
DOIs
State: Published - Mar 2026

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive licence to Springer Nature B.V. 2025.

Keywords

  • Arabic NLP
  • Few-shot learning
  • Fine-tuning learning
  • LLMs
  • Parallel dataset
  • Sentiment Swap
  • Style transfer
  • Text generation
  • Zero-shot learning

ASJC Scopus subject areas

  • Language and Linguistics
  • Education
  • Linguistics and Language
  • Library and Information Sciences
