Enhancing Software Defect Prediction: A Stacking Ensemble Framework with Advanced Feature Engineering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Software defect prediction is a critical aspect of software quality assurance, aiming to identify faulty modules early in the development lifecycle to mitigate potential risks and reduce maintenance costs. This paper presents a comprehensive stacking ensemble framework that amalgamates multiple machine learning techniques to enhance defect prediction accuracy. The methodology encompasses meticulous data preprocessing, a two-stage feature selection process involving Minimum Redundancy Maximum Relevance (mRMR) and polynomial feature expansion, followed by dimensionality reduction using Principal Component Analysis (PCA). Four diverse base learners - Extreme Learning Machine (ELM), Support Vector Machine (SVM), Random Forest, and XGBoost - are trained on the transformed feature set, and their outputs are integrated through a Logistic Regression meta-learner. Empirical evaluations conducted on five benchmark NASA datasets - MC1, CM1, KC2, KC3, and PC1 - demonstrate the robustness of the proposed framework, achieving accuracies up to 93.80%. These results underscore the efficacy of leveraging ensemble learning and sophisticated feature engineering in capturing intricate data patterns for improved software defect detection.
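
The mRMR stage named in the abstract can be illustrated with a small greedy selector. This is a minimal sketch, not the authors' implementation: mRMR is usually defined with mutual information, and absolute Pearson correlation is swapped in here so the example needs only the standard library. The metric names (`loc`, `statements`, `churn`), the toy values, and the helper names `pearson` and `mrmr_select` are hypothetical and do not come from the paper or the NASA datasets.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def mrmr_select(columns, labels, k):
    """Greedily pick up to k feature names by relevance minus mean redundancy.

    Assumption: |Pearson r| stands in for mutual information.
    columns maps a feature name to its list of values, one per module.
    """
    relevance = {name: abs(pearson(values, labels))
                 for name, values in columns.items()}
    selected = []
    remaining = list(columns)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(abs(pearson(columns[name], columns[s]))
                             for s in selected) / len(selected)
            return relevance[name] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: eight modules, three static metrics, defect labels.
metrics = {
    "loc":        [10, 20, 30, 40, 50, 60, 70, 80],
    "statements": [9, 21, 29, 41, 49, 61, 69, 81],   # near-duplicate of loc
    "churn":      [3, 2, 1, 0, 6, 5, 4, 3],
}
defective = [0, 0, 0, 0, 1, 1, 1, 1]

selected = mrmr_select(metrics, defective, k=2)
print(selected)
```

On this toy data `statements` is individually more relevant than `churn`, but it is almost perfectly correlated with `loc`, so the redundancy penalty leads the selector to pass over it in the second round. That trade-off between relevance and redundancy is the idea behind the selection stage described above.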

Original language: English
Title of host publication: Proceedings of the 29th International Conference on Evaluation and Assessment in Software Engineering, EASE, 2025 edition, EASE Companion 2025
Editors: Muhammad Ali Babar, Ayse Tosun, Stefan Wagner, Viktoria Stray
Publisher: Association for Computing Machinery, Inc
Pages: 28-34
Number of pages: 7
ISBN (Electronic): 9798400718328
DOIs
State: Published - 23 Dec 2025
Event: 29th International Conference on Evaluation and Assessment in Software Engineering, EASE 2025 - Istanbul, Turkey
Duration: 17 Jun 2025 → 20 Jun 2025

Publication series

Name: Proceedings of the 29th International Conference on Evaluation and Assessment in Software Engineering, EASE, 2025 edition, EASE Companion 2025

Conference

Conference: 29th International Conference on Evaluation and Assessment in Software Engineering, EASE 2025
Country/Territory: Turkey
City: Istanbul
Period: 17/06/25 → 20/06/25

Bibliographical note

Publisher Copyright:
© 2025 Copyright held by the owner/author(s).

Keywords

  • Extreme Learning Machine
  • NASA datasets
  • PCA
  • Software defect prediction
  • mRMR
  • stacking ensemble

ASJC Scopus subject areas

  • Software
