Enhancing Python Code Smell Detection with Heterogeneous Ensembles

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Code smells indicate potential issues in software design that can impact maintainability, testing and overall quality. Detecting them early is crucial for improving system reliability. While machine learning has been used for code smell detection, most studies have focused on Java, with limited research on other languages. In this study, we empirically investigated the effectiveness of both deep learning and heterogeneous ensemble models in detecting multiple Python code smells, including Large Class, Long Method, Long Scope Chaining, Long Parameter List and Long Base Class List. We evaluated three heterogeneous ensemble models (Stacking, Hard Voting and Soft Voting) alongside three deep learning models (Convolutional Neural Networks, Long Short-Term Memory and Gated Recurrent Units). Each ensemble was built using eight base models, and the Wilcoxon test was used to assess performance differences. Results indicated that Stacking consistently outperformed the other models, with superior stability and detection performance. Convolutional Neural Networks performed well on some smells but struggled with complex nested structures, where ensemble models offered more stability. Hard and Soft Voting ensembles were competitive but less stable than Stacking. These findings highlight the potential of ensemble and deep learning models in enhancing Python code smell detection.
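
As an illustration of how the two voting ensembles named in the abstract differ, the following is a minimal sketch and not the authors' implementation. It assumes a binary task (0 = clean, 1 = smelly) and uses three hypothetical base-model outputs rather than the eight base models used in the study; the function names `hard_vote` and `soft_vote` and all probability values are invented for the example.

```python
from collections import Counter


def hard_vote(labels):
    """Majority vote over predicted class labels (ties go to the label seen first)."""
    return Counter(labels).most_common(1)[0][0]


def soft_vote(probabilities):
    """Average per-class probabilities across models and return the class with the highest mean."""
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    means = [sum(p[c] for p in probabilities) / n_models for c in range(n_classes)]
    return max(range(n_classes), key=means.__getitem__)


# Hypothetical outputs of three base models for one code sample.
# Each row is [P(clean), P(smelly)]; the values are made up for illustration.
probs = [[0.55, 0.45], [0.60, 0.40], [0.05, 0.95]]

# Each model's hard label is its most probable class.
labels = [max(range(2), key=p.__getitem__) for p in probs]

hard = hard_vote(labels)  # two weakly confident models outvote the third
soft = soft_vote(probs)   # the one highly confident model dominates the average
```

In this example the two schemes disagree: hard voting counts only the labels, while soft voting weighs each model's confidence. Stacking goes one step further by training a meta-model on the base models' outputs instead of applying a fixed combination rule.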

Original language: English
Pages (from-to): 963-986
Number of pages: 24
Journal: International Journal of Software Engineering and Knowledge Engineering
Volume: 35
Issue number: 7
State: Published - 1 Jul 2025

Bibliographical note

Publisher Copyright:
© 2025 World Scientific Publishing Company.

Keywords

  • Python
  • code smell
  • deep learning
  • ensemble learning
  • stacking
  • voting

ASJC Scopus subject areas

  • Software
  • Computer Networks and Communications
  • Computer Graphics and Computer-Aided Design
  • Artificial Intelligence
