Dynamic stacking ensemble for cross-language code smell detection

Hamoud Aljamaan*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Code smells are poor design and implementation choices by software engineers that might affect overall software quality. Code smell detection using machine learning has become a popular research area, with the aim of building effective models capable of detecting different code smells in multiple programming languages. However, the process of building such models has not yet stabilized, and most existing research focuses on detecting Java code smells. The main objective of this article is to propose dynamic ensembles built using two strategies, greedy search and backward elimination, that can accurately detect code smells in two programming languages (Java and Python) and that are less complex than full stacking ensembles. The detection performance of the dynamic ensembles was investigated on four Java and two Python code smells. The greedy search and backward elimination strategies yielded different lists of base models for building the dynamic ensembles. Compared with full stacking ensembles, dynamic ensembles yielded less complex models for most of the investigated Java and Python code smells, with backward elimination resulting in the less complex models. Dynamic ensembles performed comparably to full stacking ensembles, with no significant loss in detection performance. This article concludes that dynamic stacking ensembles facilitated effective and stable detection of Java and Python code smells over all base models, with less complexity than full stacking ensembles.
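To illustrate the two selection strategies named in the abstract, the following is a minimal, generic sketch of greedy (forward) search and backward elimination over a pool of candidate base models. It is not the article's procedure: the abstract does not state the scoring metric, stopping rules, or model pool, so the `score` callback, the model names, and the small complexity penalty in `toy_score` are all hypothetical stand-ins for "validation performance of a stacking ensemble built from this subset".

```python
from typing import Callable, FrozenSet, List, Sequence

# Hypothetical: returns a validation score for a stack built from these models.
Score = Callable[[FrozenSet[str]], float]


def greedy_search(candidates: Sequence[str], score: Score) -> List[str]:
    """Forward selection: start empty, add the most helpful model each round."""
    selected: List[str] = []
    best = float("-inf")
    remaining = list(candidates)
    while remaining:
        # Score every one-model extension of the current selection.
        trials = [(score(frozenset(selected + [m])), m) for m in remaining]
        trial_score, trial_model = max(trials)
        if trial_score <= best:
            break  # no remaining candidate improves the ensemble
        best = trial_score
        selected.append(trial_model)
        remaining.remove(trial_model)
    return selected


def backward_elimination(candidates: Sequence[str], score: Score) -> List[str]:
    """Start from the full stack, drop models while the score does not fall."""
    selected = list(candidates)
    best = score(frozenset(selected))
    while len(selected) > 1:
        # Score every subset obtained by removing exactly one model.
        trials = [
            (score(frozenset(k for k in selected if k != m)), m)
            for m in selected
        ]
        trial_score, dropped = max(trials)
        if trial_score < best:
            break  # every removal hurts, keep the current subset
        best = trial_score
        selected.remove(dropped)
    return selected


def toy_score(models: FrozenSet[str]) -> float:
    """Assumed toy objective: two 'useful' models, small penalty per model."""
    useful = {"rf", "svm"}
    return len(models & useful) - 0.01 * len(models)


pool = ["rf", "svm", "knn", "nb"]  # hypothetical base model names
forward = greedy_search(pool, toy_score)
backward = backward_elimination(pool, toy_score)
print(sorted(forward), sorted(backward))
```

In practice the `score` callback would train and validate a stacking ensemble on the chosen subset, which is the expensive step. With a real objective the two strategies can return different subsets, consistent with the abstract's observation that they yielded different base model lists.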

Original language: English
Article number: e2254
Journal: PeerJ Computer Science
Volume: 10
State: Published - 2024

Bibliographical note

Publisher Copyright:
Copyright 2024 Aljamaan

Keywords

  • Code smell
  • Detection
  • Dynamic ensemble
  • Ensemble learning
  • Machine learning
  • Stacking ensemble

ASJC Scopus subject areas

  • General Computer Science
