Abstract
Keyword extraction is one of the most important research areas in information retrieval. The task is challenging and has received sustained attention from researchers over the last decade. Its importance stems from the fact that extracted keywords can be used in many applications, such as document indexing, clustering, classification, summarization, metadata generation, topic identification, and information visualization. In addition, recent years have witnessed dramatic growth in the number of documents available online with no keyphrases assigned. Assigning keyphrases to such documents manually is impractical, which calls for automatic keyphrase extraction. Several approaches have been proposed in the literature, using techniques borrowed from areas such as machine learning, computational linguistics, and statistical analysis. In this paper, a keyphrase extraction system is developed for Arabic documents. A new boosting factor is proposed by which the occurrence of compound terms is boosted based on the occurrences of their constituent words. This is motivated by the fact that longer phrases are preferred over single words as keywords. The performance of the proposed keyphrase extraction method is evaluated on three Arabic datasets, and the results show that its performance is comparable to that of KP-Miner.
| Original language | English |
|---|---|
| Pages (from-to) | 875-884 |
| Number of pages | 10 |
| Journal | ICIC Express Letters, Part B: Applications |
| Volume | 10 |
| Issue number | 10 |
| DOIs | |
| State | Published - Oct 2019 |
Bibliographical note
Publisher Copyright: © 2019 ICIC International.
Keywords
- Arabic documents
- Arabic keyphrase extraction
- KP-Miner
ASJC Scopus subject areas
- General Computer Science
Fingerprint
Dive into the research topics of 'A novel approach to Arabic keyphrase extraction'. Together they form a unique fingerprint.