MAWQIF: A Multi-label Arabic Dataset for Target-specific Stance Detection

Nora Alturayeif, Hamzah Luqman, Moataz Ahmed

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

25 Scopus citations

Abstract

Social media platforms are becoming inherent parts of people's daily life to express opinions and stances toward topics of varying polarities. Stance detection determines the viewpoint expressed in a text toward a target. While communication on social media (e.g., Twitter) takes place in more than 40 languages, the majority of stance detection research has been focused on English. Although some efforts have recently been made to develop stance detection datasets in other languages, no similar efforts seem to have considered the Arabic language. In this paper, we present MAWQIF, the first Arabic dataset for target-specific stance detection, composed of 4,121 tweets annotated with stance, sentiment, and sarcasm polarities. MAWQIF, as a multi-label dataset, can provide more opportunities for studying the interaction between different opinion dimensions and evaluating a multi-task model. We provide a detailed description of the dataset, present an analysis of the produced annotation, and evaluate four BERT-based models on it. Our best model achieves a macro-F1 of 78.89%, which shows that there is ample room for improvement on this challenging task. We publicly release our dataset, the annotation guidelines, and the code of the experiments.

Original language: English
Title of host publication: WANLP 2022 - 7th Arabic Natural Language Processing - Proceedings of the Workshop
Publisher: Association for Computational Linguistics (ACL)
Pages: 174-184
Number of pages: 11
ISBN (Electronic): 9781959429272
State: Published - 2022
Event: 7th Arabic Natural Language Processing Workshop, WANLP 2022 held with EMNLP 2022 - Abu Dhabi, United Arab Emirates
Duration: 8 Dec 2022 → …

Publication series

Name: WANLP 2022 - 7th Arabic Natural Language Processing - Proceedings of the Workshop

Conference

Conference: 7th Arabic Natural Language Processing Workshop, WANLP 2022 held with EMNLP 2022
Country/Territory: United Arab Emirates
City: Abu Dhabi
Period: 8/12/22 → …

Bibliographical note

Publisher Copyright:
© 2022 Association for Computational Linguistics.

ASJC Scopus subject areas

  • Language and Linguistics
  • Computational Theory and Mathematics
  • Software
  • Linguistics and Language
