Abstract
Rapid advancements in artificial intelligence have driven the integration of learning algorithms, namely machine learning (ML) and deep learning (DL) models, across various industries, posing new challenges for testing these complex systems. Rigorous testing of ML/DL-based systems (MLSs) is especially critical in high-stakes domains like autonomous driving, healthcare diagnostics, and financial forecasting, where system reliability is paramount. Unlike traditional software, MLS quality relies not only on model architecture and development processes but also significantly on the quality of the training data. This study offers a comprehensive review of MLS testing methodologies, with a focus on the emerging role of Data-Box testing, alongside established Black-Box and White-Box techniques. Data-Box testing assesses training data quality to ensure it meets criteria such as sufficiency and adequacy, bridging Black-Box and White-Box methods to enhance system reliability. The study further addresses the increasing use of mutation testing (MT) in DL, exploring MT techniques and mutation operators to ensure adequate coverage. By synthesizing recent advances, we propose an integrated MLS testing framework that encapsulates these critical aspects, offering insights and highlighting areas for future research to refine MLS testing practices.
| Original language | English |
|---|---|
| Pages (from-to) | 11433-11484 |
| Number of pages | 52 |
| Journal | Arabian Journal for Science and Engineering |
| Volume | 50 |
| Issue number | 15 |
| State | Published - Aug 2025 |
Bibliographical note
Publisher Copyright: © King Fahd University of Petroleum & Minerals 2025.
Keywords
- Black-box testing
- Data quality
- Deep learning
- Machine learning
- Model testing
- Mutation testing
ASJC Scopus subject areas
- General