Highly cited articles attract the attention of significant contributors in the research community, who see them as an opportunity to improve their knowledge, a source of ideas or solutions, and a way to advance their research in general. Typically, these articles are authored by a large number of scientists working in international collaboration. However, this may not be the only reason an article becomes highly cited; several other characteristics may make an article more attractive to researchers and readers. In other words, a few other characteristics make some articles more likely than others to appear in search engines or to grab readers’ attention. In this study, we trained several machine-learning models on a set of article and journal characteristics, including author count, title characteristics, abstract length, international collaboration, number of keywords, and funding information, among others. We extracted 20 characteristics and developed multiple machine-learning models to automatically distinguish highly cited papers (HCPs) from regular papers. In experiments conducted with an ensemble machine-learning algorithm, 97% recognition accuracy was achieved. Other algorithms, including a deep-learning method using long short-term memory (LSTM) networks, also achieved high recognition accuracy. Such performance suggests that an automatic HCP detection system is a promising direction for future work.
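To make the kind of characteristics listed in the abstract concrete, the following is a minimal sketch of how a few of them might be computed from article metadata. The `Article` record, the `extract_features` function, and the feature names are hypothetical illustrations; the abstract does not define the study's 20 characteristics, so this should not be read as the authors' feature set or method.

```python
from dataclasses import dataclass, field


@dataclass
class Article:
    """Hypothetical metadata record; field names are assumptions."""
    title: str
    abstract: str
    authors: list[str]
    countries: list[str]  # affiliation country of each author
    keywords: list[str] = field(default_factory=list)
    funding: str = ""


def extract_features(article: Article) -> dict[str, float]:
    """Turn article metadata into numeric features (illustrative subset)."""
    return {
        "authors_count": float(len(article.authors)),
        "title_word_count": float(len(article.title.split())),
        "abstract_word_count": float(len(article.abstract.split())),
        # More than one distinct country is treated as international collaboration.
        "international_collaboration": float(len(set(article.countries)) > 1),
        "keywords_count": float(len(article.keywords)),
        # Any non-blank funding statement counts as funded.
        "has_funding": float(bool(article.funding.strip())),
    }


example = Article(
    title="Deep learning: a survey",
    abstract="We review recent work on deep learning.",
    authors=["A. Author", "B. Author", "C. Author"],
    countries=["US", "DE", "US"],
    keywords=["deep learning", "survey"],
)
features = extract_features(example)
```

A feature dictionary of this kind could then be assembled into a feature matrix and passed to whichever classifier is being evaluated, such as an ensemble model.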
Bibliographical note
Publisher Copyright:
Copyright Author(s) 2023.
Keywords
- Artificial Intelligence
- Bibliometric Analysis
- Digital Libraries
- Highly Cited Paper Indicators
- Machine Learning
ASJC Scopus subject areas
- Information Systems
- Computer Science Applications
- Library and Information Sciences