Arabic Keyphrase Extraction: Enhancing Deep Learning Models with Pre-trained Contextual Embedding and External Features

Randah Alharbi, Husni Al-Muhtaseb

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Keyphrase extraction is essential to many information retrieval (IR) and natural language processing (NLP) tasks, such as summarization and indexing. This study investigates deep learning approaches to Arabic keyphrase extraction. We frame the problem as sequence classification and build a Bi-LSTM model that classifies each token in the sequence as either part of a keyphrase or outside of it. We extract word embeddings from two pre-trained models, Word2Vec and BERT. We also investigate how combining linguistic, positional, and statistical features with the word embeddings affects performance. Our best-performing model achieves an F1-score of 0.45 on the ArabicKPE dataset when linguistic and positional features are combined with BERT embeddings.
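The token-level framing and the F1 metric mentioned in the abstract can be illustrated with a small sketch. It is not the authors' code: the binary inside/outside labels, the exact-match phrase scoring, the English toy tokens, and the helper names `label_tokens`, `decode_phrases`, and `f1_score` are all assumptions made for illustration, and the paper's actual tagging scheme and evaluation protocol may differ.

```python
from typing import List, Sequence, Set, Tuple

Phrase = Tuple[str, ...]


def label_tokens(tokens: Sequence[str], keyphrases: Sequence[Phrase]) -> List[int]:
    """Label each token 1 if it lies inside any gold keyphrase, else 0."""
    labels = [0] * len(tokens)
    for phrase in keyphrases:
        n = len(phrase)
        if n == 0:
            continue
        # Slide a window over the tokens and mark every exact match.
        for start in range(len(tokens) - n + 1):
            if tuple(tokens[start:start + n]) == tuple(phrase):
                for i in range(start, start + n):
                    labels[i] = 1
    return labels


def decode_phrases(tokens: Sequence[str], labels: Sequence[int]) -> Set[Phrase]:
    """Turn maximal runs of tokens labelled 1 back into candidate keyphrases."""
    phrases: Set[Phrase] = set()
    current: List[str] = []
    for token, label in zip(tokens, labels):
        if label == 1:
            current.append(token)
        elif current:
            phrases.add(tuple(current))
            current = []
    if current:
        phrases.add(tuple(current))
    return phrases


def f1_score(predicted: Set[Phrase], gold: Set[Phrase]) -> float:
    """Exact-match F1: harmonic mean of precision and recall over phrases."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy example (hypothetical data, not taken from ArabicKPE).
tokens = ["deep", "learning", "improves", "keyphrase", "extraction", "for", "Arabic"]
gold = [("deep", "learning"), ("keyphrase", "extraction")]

gold_labels = label_tokens(tokens, gold)       # training targets for a tagger
model_labels = [1, 1, 0, 0, 1, 0, 0]           # pretend output of a tagger
predicted = decode_phrases(tokens, model_labels)
score = f1_score(predicted, set(gold))
```

In this toy case the pretend tagger recovers one of the two gold phrases and proposes one wrong phrase, so precision and recall are both 0.5. A binary scheme like this merges adjacent keyphrases into a single run, which is one reason a tagger might use a richer label set instead.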

Original language: English
Title of host publication: WANLP 2022 - 7th Arabic Natural Language Processing - Proceedings of the Workshop
Publisher: Association for Computational Linguistics (ACL)
Pages: 320-330
Number of pages: 11
ISBN (Electronic): 9781959429272
State: Published - 2022
Event: 7th Arabic Natural Language Processing Workshop, WANLP 2022 held with EMNLP 2022 - Abu Dhabi, United Arab Emirates
Duration: 8 Dec 2022 → …

Publication series

Name: WANLP 2022 - 7th Arabic Natural Language Processing - Proceedings of the Workshop

Conference

Conference: 7th Arabic Natural Language Processing Workshop, WANLP 2022 held with EMNLP 2022
Country/Territory: United Arab Emirates
City: Abu Dhabi
Period: 8/12/22 → …

Bibliographical note

Publisher Copyright:
© 2022 Association for Computational Linguistics.

ASJC Scopus subject areas

  • Language and Linguistics
  • Computational Theory and Mathematics
  • Software
  • Linguistics and Language
