Abstract
App stores typically let users leave reviews and ratings, which developers use to resolve issues and plan future work on their apps. As a result, app stores accumulate large amounts of data for analysis. However, several challenges related to redundancy and the volume of data must first be addressed using machine learning. This study performs experiments on a dataset of reviews for Shopify apps. To address these challenges, we first categorize user reviews into two groups, happy and unhappy, and then preprocess the reviews to clean the data. Next, several feature engineering techniques, namely bag-of-words, term frequency-inverse document frequency (TF-IDF), and chi-square (Chi2), are applied individually and in combination to preserve meaningful information. Finally, random forest, AdaBoost, and logistic regression models are used to classify the reviews as happy or unhappy. The proposed pipeline is evaluated using average accuracy, precision, recall, and $F_{1}$ score. The experiments show that combining features can improve the performance of machine learning models; in this study, logistic regression outperforms the other models and achieves an 83% true acceptance rate when combined with TF-IDF and Chi2.
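The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes a binary setting in which "happy" is treated as the positive class; the function name `binary_metrics`, the label strings, and the toy predictions are hypothetical and do not come from the paper. The abstract reports averaged scores without stating the averaging scheme, so this sketch computes the metrics for a single positive class only.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    positive: str = "happy",  # assumed positive label, not taken from the paper
) -> Dict[str, float]:
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive class.
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only; these are not data from the study.
y_true = ["happy", "happy", "happy", "unhappy", "unhappy", "unhappy"]
y_pred = ["happy", "happy", "unhappy", "unhappy", "unhappy", "happy"]
metrics = binary_metrics(y_true, y_pred)
```

To obtain an averaged score across both classes, one option would be to call the function once per class (with `positive="happy"` and `positive="unhappy"`) and take the mean of each metric, although whether this matches the paper's averaging is not stated in the abstract.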
| Original language | English |
|---|---|
| Article number | 8988264 |
| Pages (from-to) | 30234-30244 |
| Number of pages | 11 |
| Journal | IEEE Access |
| Volume | 8 |
| DOIs | |
| State | Published - 2020 |
| Externally published | Yes |
Bibliographical note
Publisher Copyright: © 2013 IEEE.
Keywords
- Feature engineering
- feature extraction
- feature selection
- machine learning
- review classification
- text mining
ASJC Scopus subject areas
- General Computer Science
- General Materials Science
- General Engineering
Fingerprint
Dive into the research topics of 'Classification of Shopify App User Reviews Using Novel Multi Text Features'. Together they form a unique fingerprint.