Abstract
This study addresses the need to generate reliable synthetic data in radioactive waste management, specifically on disused sealed radioactive sources (DSRS), which will be integrated into a machine learning-based data management system focusing on Indonesia's Radioactive Waste Treatment Installation. Five synthetic data generation methods, namely Monte Carlo Gaussian, Data Augmentation, Copula Models, Bayesian Networks, and Variational Autoencoders (VAEs), are evaluated for their efficacy in replicating the statistical characteristics of confidential DSRS data. The evaluation criteria are each method's ability to emulate the original data distribution, its handling of outliers, and its implications for predictive modelling in DSRS management. Bayesian Networks match the original dataset most closely (MRE = 17.60 %, Kolmogorov-Smirnov Dn = 0.03, p-value = 0.31), making them the most suitable of the five methods for generating synthetic data with high mean consistency. These findings demonstrate that synthetic data can improve DSRS management, guiding future research and regulatory compliance.
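As a rough illustration of the kind of comparison the abstract reports, the sketch below fits a Gaussian to a toy numeric column, draws Monte Carlo samples from it, and scores the result with a relative error of the mean and a two-sample Kolmogorov-Smirnov statistic. It is not the authors' implementation: the toy data, the function names, and in particular the definition of MRE used here (relative error of the synthetic mean against the original mean) are assumptions made for illustration only.

```python
import bisect
import random
import statistics


def gaussian_monte_carlo(original, n_samples, rng):
    """Draw synthetic values from a normal distribution fitted to the original sample."""
    mu = statistics.fmean(original)
    sigma = statistics.stdev(original)
    return [rng.gauss(mu, sigma) for _ in range(n_samples)]


def mean_relative_error(original, synthetic):
    """Relative error of the synthetic mean (assumed definition, not taken from the paper)."""
    mu_orig = statistics.fmean(original)
    return abs(statistics.fmean(synthetic) - mu_orig) / abs(mu_orig)


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The largest gap is attained at one of the observed values.
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


rng = random.Random(0)
# Hypothetical stand-in for one confidential numeric column; not real DSRS data.
original = [rng.lognormvariate(3.0, 0.5) for _ in range(500)]
synthetic = gaussian_monte_carlo(original, 500, rng)

mre = mean_relative_error(original, synthetic)
dn = ks_statistic(original, synthetic)
print(f"MRE = {mre:.2%}, KS Dn = {dn:.3f}")
```

A smaller Dn means the synthetic and original empirical distributions lie closer together. Because the toy column is skewed, a fitted Gaussian can reproduce the mean well while still showing a visible Dn, which is one reason to report both a mean-based and a distribution-based score.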
| Original language | English |
|---|---|
| Article number | 103524 |
| Journal | Nuclear Engineering and Technology |
| Volume | 57 |
| Issue number | 7 |
| DOIs | |
| State | Published - Jul 2025 |
Bibliographical note
Publisher Copyright: © 2025 Korean Nuclear Society
Keywords
- Data privacy
- Disused sealed radioactive sources
- Radioactive waste management
- Statistical modelling
- Synthetic data generation
ASJC Scopus subject areas
- Nuclear Energy and Engineering