TY - JOUR
T1 - Machine learning predictive insight of water pollution and groundwater quality in the Eastern Province of Saudi Arabia
AU - Jibrin, Abdulhayat M.
AU - Al-Suwaiyan, Mohammad
AU - Aldrees, Ali
AU - Dan’azumi, Salisu
AU - Usman, Jamilu
AU - Abba, Sani I.
AU - Yassin, Mohamed A.
AU - Scholz, Miklas
AU - Sammen, Saad Sh
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024/12
Y1 - 2024/12
AB - This study presents an innovative approach for predicting water and groundwater quality indices (WQI and GWQI) in the Eastern Province of Saudi Arabia, addressing critical challenges of scarcity and pollution in arid regions. Recent literature shows increasing attention to the WQI based on the water pollution index (WPI) and to the GWQI as essential tools for simplifying complex hydrogeological data, thereby facilitating effective groundwater management and protection. Unlike previous works, the present research introduces a novel hybrid method that integrates non-parametric kernel-based Gaussian process regression (GPR), an adaptive neuro-fuzzy inference system (ANFIS), and decision tree (DT) algorithms. This approach marks the first application of a non-parametric kernel for groundwater quality pollution index prediction in Saudi Arabia, offering a significant advancement in the field. Through laboratory analysis and the combination of various machine learning (ML) techniques, this study enhances prediction capabilities, particularly for unmonitored sites in arid and semi-arid regions. The study’s objectives include feature engineering based on dependency sensitivity analysis to identify the most influential variables affecting WQI and GWQI, and the development of predictive models using ANFIS, GPR, and DT for both indices. Furthermore, it aims to assess the impact of different data partitions on WQI and GWQI predictions, exploring training/testing splits of 70%/30%, 60%/40%, and 80%/20%. By filling a critical gap in water resource management, this research offers significant implications for the prediction of water quality in regions facing similar environmental challenges. Through its innovative methodology and comprehensive analysis, this study contributes to the broader effort of managing and protecting water resources in arid and semi-arid areas.
The results showed that GPR-M1 achieved high testing-phase accuracy for GWQI (RMSE = 0.0169). Similarly, for WPI, ANFIS-M1 achieved high testing-phase predictive skill (RMSE = 0.0401). The results emphasize the critical role of training data quality and quantity in enhancing model robustness and prediction precision in water quality assessment.
KW - Eastern Province
KW - Environmental monitoring
KW - Groundwater quality
KW - Machine learning
KW - Saudi Arabia
KW - Water pollution
UR - http://www.scopus.com/inward/record.url?scp=85202653371&partnerID=8YFLogxK
U2 - 10.1038/s41598-024-70610-4
DO - 10.1038/s41598-024-70610-4
M3 - Article
C2 - 39198674
AN - SCOPUS:85202653371
SN - 2045-2322
VL - 14
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 20031
ER -