Testing Machine Learning and Deep Learning Systems: Achievements and Challenges

  • Salma Albelali*
  • Moataz Ahmed
  • *Corresponding author for this work

Research output: Contribution to journal › Review article › peer-review

Abstract

Rapid advancements in artificial intelligence have driven the integration of learning algorithms, namely machine learning (ML) and deep learning (DL) models, across various industries, posing new challenges for testing these complex systems. Rigorous testing of ML/DL-based systems (MLSs) is especially critical in high-stakes domains like autonomous driving, healthcare diagnostics, and financial forecasting, where system reliability is paramount. Unlike the quality of traditional software, MLS quality relies not only on model architecture and development processes but also significantly on the quality of the training data. This study offers a comprehensive review of MLS testing methodologies, with a focus on the emerging role of Data-Box testing, alongside established Black-Box and White-Box techniques. Data-Box testing assesses training data quality to ensure it meets criteria such as sufficiency and adequacy, bridging Black-Box and White-Box methods to enhance system reliability. The study further addresses the increasing use of mutation testing (MT) in DL, exploring MT techniques and mutation operators to ensure adequate coverage. By synthesizing recent advances, we propose an integrated MLS testing framework that encapsulates these critical aspects, offering insights and highlighting areas for future research to refine MLS testing practices.

Original language: English
Pages (from-to): 11433-11484
Number of pages: 52
Journal: Arabian Journal for Science and Engineering
Volume: 50
Issue number: 15
State: Published - Aug 2025

Bibliographical note

Publisher Copyright:
© King Fahd University of Petroleum & Minerals 2025.

Keywords

  • Black-box testing
  • Data quality
  • Deep learning
  • Machine learning
  • Model testing
  • Mutation testing

ASJC Scopus subject areas

  • General
