Abstract
Given the critical role of data in the development of Machine Learning (ML)-based systems, it has become increasingly important to assess the quality of data attributes and to ensure that data meets specific requirements before it is used. This work proposes an approach that uses goal modeling to guide non-experts in identifying data requirements for ML systems. We first develop the Data Requirement Goal Model (DRGM) by surveying the scientific literature to identify and categorize the issues and challenges faced by data scientists and requirements engineers working on ML-related projects. The initial DRGM accommodates common tasks that generalize across projects. A customization mechanism then helps adjust the tasks, KPIs, and goal importance of elements within the DRGM. Users can apply the resulting model to evaluate different datasets using Goal-oriented Requirement Language (GRL) evaluation strategies. We validate the approach through two illustrative examples based on real-world projects. The results show that the data requirements identified by the proposed approach align with the requirements of the real-world projects, which supports the practicality and effectiveness of the proposed framework. For future work, we recommend further evaluation of the approach across more ML problem types and contexts, as well as tool support for generating the DRGM via a chatbot interface.
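The dataset comparison mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' DRGM or tooling: the goal names, contribution weights, KPI thresholds, and the linear KPI mapping are all hypothetical, and the propagation rule is a simplified weighted sum in the spirit of quantitative GRL evaluation, assuming satisfaction values on a -100 to 100 scale.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    """A node in a toy goal model (hypothetical, not the published DRGM)."""
    name: str
    # (child goal, contribution weight in [-100, 100]) pairs; empty for leaves.
    contributions: list = field(default_factory=list)


def clamp(value, low=-100.0, high=100.0):
    """Keep a satisfaction value inside the assumed [-100, 100] scale."""
    return max(low, min(high, value))


def kpi_to_satisfaction(value, worst, target):
    """Linearly map a KPI reading onto [-100, 100] (assumed mapping)."""
    ratio = (value - worst) / (target - worst)
    return clamp(200.0 * ratio - 100.0)


def evaluate(goal, strategy):
    """Bottom-up propagation: weighted sum of child satisfactions, clamped.

    `strategy` maps leaf goal names to their initial satisfaction values.
    """
    if not goal.contributions:
        return strategy[goal.name]
    total = sum(
        weight * evaluate(child, strategy)
        for child, weight in goal.contributions
    )
    return clamp(total / 100.0)


# Hypothetical leaf goals and weights chosen purely for illustration.
completeness = Goal("completeness")
balance = Goal("class balance")
data_quality = Goal("data quality", [(completeness, 60), (balance, 40)])

# Two made-up datasets, each described by illustrative KPI readings.
dataset_a = {
    "completeness": kpi_to_satisfaction(0.95, worst=0.5, target=1.0),
    "class balance": kpi_to_satisfaction(0.60, worst=0.0, target=1.0),
}
dataset_b = {
    "completeness": kpi_to_satisfaction(0.75, worst=0.5, target=1.0),
    "class balance": kpi_to_satisfaction(0.90, worst=0.0, target=1.0),
}

score_a = evaluate(data_quality, dataset_a)
score_b = evaluate(data_quality, dataset_b)
print(f"dataset A: {score_a:.1f}, dataset B: {score_b:.1f}")
```

With these illustrative weights, the more complete dataset A scores higher than the better-balanced dataset B; changing the contribution weights is a rough analogue of customizing goal importance before comparing datasets.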
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2025 IEEE 33rd International Requirements Engineering Conference Workshops, REW 2025 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 62-73 |
| Number of pages | 12 |
| ISBN (Electronic) | 9798331538347 |
| DOIs | |
| State | Published - 2025 |
| Event | 33rd IEEE International Requirements Engineering Conference Workshops, REW 2025 - Valencia, Spain. Duration: 1 Sep 2025 → 5 Sep 2025 |
Publication series
| Name | Proceedings - 2025 IEEE 33rd International Requirements Engineering Conference Workshops, REW 2025 |
|---|---|
Conference
| Conference | 33rd IEEE International Requirements Engineering Conference Workshops, REW 2025 |
|---|---|
| Country/Territory | Spain |
| City | Valencia |
| Period | 1/09/25 → 5/09/25 |
Bibliographical note
Publisher Copyright: © 2025 IEEE.
Keywords
- data quality
- data requirements
- goal model
ASJC Scopus subject areas
- Artificial Intelligence
- Software
- Safety, Risk, Reliability and Quality
- Modeling and Simulation