A novel approach to Arabic keyphrase extraction

  • Dhiaa Musleh
  • Rashad Ahmed
  • Atta-Ur-rahman*
  • Fahd Alhaidari

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

12 Scopus citations

Abstract

Keyword extraction is one of the most important research areas of information retrieval. The task is challenging and has attracted researchers' attention over the last decade. Its importance stems from the fact that extracted keywords can be used in many fields, such as document indexing, clustering, classification, summarization, metadata generation, topic identification, and information visualization. In addition, recent years have witnessed a dramatic growth in the number of documents available online with no keyphrases assigned. Assigning keyphrases to such documents manually is impractical, so automatic keyphrase extraction is needed. Several approaches have been proposed in the literature, using techniques borrowed from areas such as machine learning, computational linguistics, and statistical analysis. In this paper, a keyphrase extraction system is developed for Arabic documents. A new boosting factor is proposed by which the occurrence count of a compound term is boosted based on the occurrences of its constituent words. This is motivated by the fact that longer phrases are preferred as keywords over single words. The performance of the proposed keyphrase extraction method is evaluated on three Arabic datasets, and the results show that it performs comparably to KP-Miner.
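
The abstract does not give the formula for the boosting factor, so the following is only a minimal sketch of one possible reading of "occurrence of compound terms is boosted based on occurrences of their words". The function names, the `alpha` weight, and the additive form of the boost are all assumptions made for illustration and are not taken from the paper. Stop-word filtering and stemming, which a real Arabic pipeline would need, are omitted.

```python
from collections import Counter


def ngram_counts(tokens, max_len=3):
    """Count every contiguous n-gram of 1..max_len tokens."""
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


def boosted_counts(tokens, max_len=3, alpha=0.5):
    """Hypothetical boost (an assumption, not the paper's formula):
    a compound term's count is increased by alpha times the mean
    count of its constituent words; single words are left unchanged."""
    counts = ngram_counts(tokens, max_len)
    boosted = {}
    for phrase, freq in counts.items():
        if len(phrase) == 1:
            boosted[phrase] = float(freq)
        else:
            mean_word_freq = sum(counts[(w,)] for w in phrase) / len(phrase)
            boosted[phrase] = freq + alpha * mean_word_freq
    return boosted


# Toy token list, assumed to be already tokenized and normalized.
tokens = ["استخراج", "الكلمات", "المفتاحية", "استخراج", "الكلمات"]
scores = boosted_counts(tokens)

# Rank candidates by boosted count, highest first.
ranking = sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Under this assumed form, a two-word phrase that occurs as often as each of its words ends up ranked above those words, which matches the stated preference for longer phrases.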

Original language: English
Pages (from-to): 875-884
Number of pages: 10
Journal: ICIC Express Letters, Part B: Applications
Volume: 10
Issue number: 10
State: Published - Oct 2019

Bibliographical note

Publisher Copyright:
© 2019 ICIC International.

Keywords

  • Arabic documents
  • Arabic keyphrase extraction
  • KP-Miner

ASJC Scopus subject areas

  • General Computer Science
