
A Novel Machine Learning Framework for Predicting 232Th Distribution in Radionuclide-Contaminated Soils Using Physicochemical Environmental Factors

  • Sati Lubis
  • Haruna Adamu*
  • Jamilu Usman
  • Abdullahi Garba Usman
  • Sani Isah Abba

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

This study investigates how soil chemistry, specifically pH, organic carbon (OC), organic matter (OM), and cation exchange capacity (CEC), influences the mobility and distribution of 232Th radionuclides in abandoned mine soils, using advanced machine learning (ML) models. Soil samples were collected from multiple locations across different seasons. Gaussian Process Regression (GPR), Long Short-Term Memory (LSTM) networks, Adaptive Neuro-Fuzzy Inference System (ANFIS), and Random Forest (RF) models were employed to predict 232Th distribution, with feature selection identifying optimal model combinations (C1, C2, and C3). Model performance differed markedly across algorithms and combinations. GPR-C1 exhibited the highest predictive accuracy, with MAPE improving from 8.9909 to 3.0468 and MAE falling from 3.5236 to 1.6044 during the verification phase. GPR-C1 was also the top-performing model during both training (RMSE = 7.0851, DC = 0.6482) and testing (RMSE = 4.5808, DC = 0.5848), demonstrating its robustness in capturing non-linear relationships between soil properties (pH, OC, OM, CEC) and 232Th mobility. In contrast, the RF models (RF-C1, RF-C3) performed worst (training RMSE > 11.5123; testing RMSE > 7.6855), likely because they could not resolve complex geochemical interactions, as evidenced by their low DC (< 0.2) and PCC (< 0.3) values. Notably, several models exhibited lower RMSE in the testing set than in calibration, reflecting the reduced variance within the held-out site-season blocks; however, nested cross-validation and a leave-site-out analysis consistently identified GPR-C1 as the most reliable and accurate model. This aligns with field data showing higher 232Th mobility during wet seasons due to leaching and runoff transport (p < 0.05). For instance, the testing RMSE of GPR-C1 (4.5808) was well below its training RMSE (7.0851), reinforcing the role of seasonal dynamics in 232Th redistribution. The model therefore shows significant potential for accurately predicting 232Th behaviour and distribution, which is crucial for environmental risk assessments. Such predictions can guide targeted remediation efforts and inform land management practices, mitigating risks associated with 232Th exposure.
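
The five scores quoted in the abstract (RMSE, MAE, MAPE, DC, PCC) can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not confirm: DC is taken to be the determination coefficient in its Nash-Sutcliffe form (1 minus the ratio of squared error to the variance of the observations), PCC the Pearson correlation coefficient, and MAPE a percentage. The function name `regression_metrics` and the sample numbers in the usage note are hypothetical and are not data from the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return RMSE, MAE, MAPE (%), DC and PCC for paired observations and predictions.

    Assumed definitions (not confirmed by the source abstract):
    DC  = 1 - sum((o - p)^2) / sum((o - mean(o))^2)
    PCC = Pearson correlation between observed and predicted values
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("observed and predicted must be non-empty and of equal length")
    if any(o == 0 for o in observed):
        raise ValueError("MAPE is undefined when an observed value is zero")

    errors = [o - p for o, p in zip(observed, predicted)]
    sse = sum(e * e for e in errors)  # sum of squared errors

    rmse = math.sqrt(sse / n)
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n

    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    sst = sum((o - mean_o) ** 2 for o in observed)  # spread of the observations
    spp = sum((p - mean_p) ** 2 for p in predicted)  # spread of the predictions
    if sst == 0 or spp == 0:
        raise ValueError("DC and PCC are undefined for constant series")

    dc = 1.0 - sse / sst
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    pcc = cov / math.sqrt(sst * spp)

    return {"RMSE": rmse, "MAE": mae, "MAPE": mape, "DC": dc, "PCC": pcc}
```

For example, `regression_metrics([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 39.0])` scores four made-up activity values. Under these definitions RMSE and MAE carry the units of the target, while DC divides the squared error by the spread of the observations, so a held-out block with lower variance can show a smaller RMSE together with a lower DC, the pattern the abstract reports for GPR-C1.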

Original language: English
Pages (from-to): 327-341
Number of pages: 15
Journal: Pollution
Volume: 12
Issue number: 1
State: Published - Jan 2026

Bibliographical note

Publisher Copyright:
© The Author(s).

Keywords

  • Land Management
  • Machine Learning
  • Pollution
  • Radionuclides
  • Soil Chemistry

ASJC Scopus subject areas

  • General Environmental Science
