Abstract
Accurate mapping and assessment of the groundwater vulnerability index are crucial for protecting groundwater resources from possible contamination. In this research, novel intelligent predictive machine learning (ML) regression models, namely k-nearest neighbors (KNN), ensemble extremely randomized trees (ERT), and ensemble bagging regression (BA), were applied at two levels of modeling to improve the DRASTIC-LU model for the Miryang aquifer in South Korea. The predicted outputs from level 1 (the KNN and ERT models) were used as inputs for the ensemble BA model in level 2. The groundwater pollution vulnerability index (GPVI) derived from the DRASTIC-LU model was adjusted using NO3–N data and used as the target data for the ML models. Hyperparameters for all models were tuned with a grid search to determine the most effective model structures. Various statistical metrics and graphical representations were used to compare the predictive performance of the ML models. The ensemble BA model in level 2 predicted GPVI values more precisely than the standalone KNN and ensemble ERT models in level 1. Furthermore, the ensemble BA model produced suitable outcomes for unseen data in the testing phase, which suggests that it was not overfitted. Therefore, ML modeling at two levels could be an excellent approach for the proactive management of groundwater resources against contamination.
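The two-level arrangement described in the abstract can be illustrated with a minimal scikit-learn sketch. It is not the authors' implementation: the synthetic data, the eight hypothetical rating columns, the illustrative weights, the small parameter grids, and the use of out-of-fold level-1 predictions to train level 2 are all assumptions made only for this example.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor, ExtraTreesRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, cross_val_predict, train_test_split
from sklearn.neighbors import KNeighborsRegressor


def fit_two_level(X_train, y_train, random_state=0):
    """Fit KNN and ERT (level 1), then a bagging regressor on their predictions (level 2)."""
    # Level 1: tune each base model with a small, illustrative grid (assumed values).
    knn = GridSearchCV(KNeighborsRegressor(), {"n_neighbors": [3, 5, 7]}, cv=3)
    ert = GridSearchCV(
        ExtraTreesRegressor(random_state=random_state),
        {"n_estimators": [50, 100], "max_depth": [None, 8]},
        cv=3,
    )
    knn.fit(X_train, y_train)
    ert.fit(X_train, y_train)
    level1 = [knn.best_estimator_, ert.best_estimator_]

    # Out-of-fold level-1 predictions become the level-2 training inputs.
    # This choice is an assumption; the abstract does not state how the
    # level-2 inputs were generated.
    Z_train = np.column_stack(
        [cross_val_predict(model, X_train, y_train, cv=3) for model in level1]
    )

    # Level 2: bagging regressor (default decision-tree base learner), also grid-searched.
    ba = GridSearchCV(
        BaggingRegressor(random_state=random_state),
        {"n_estimators": [10, 30]},
        cv=3,
    )
    ba.fit(Z_train, y_train)
    return level1, ba.best_estimator_


def predict_two_level(level1, level2, X):
    """Stack the level-1 predictions column-wise and pass them to the level-2 model."""
    Z = np.column_stack([model.predict(X) for model in level1])
    return level2.predict(Z)


# Synthetic stand-in data: 8 hypothetical ratings (D, R, A, S, T, I, C, LU)
# and a noisy weighted-sum target playing the role of an adjusted GPVI.
rng = np.random.default_rng(0)
X = rng.uniform(1, 10, size=(300, 8))
weights = np.array([5, 4, 3, 2, 1, 5, 3, 5])  # illustrative weights (assumed)
y = X @ weights + rng.normal(0, 5, size=300)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
level1, level2 = fit_two_level(X_train, y_train)
y_pred = predict_two_level(level1, level2, X_test)
print("Level-2 R^2 on held-out data:", r2_score(y_test, y_pred))
```

Scoring the level-2 model on a held-out split mirrors the abstract's point about unseen data; with real inputs, the synthetic arrays would be replaced by the DRASTIC-LU ratings and the NO3–N-adjusted GPVI.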
| Original language | English |
|---|---|
| Article number | 137671 |
| Journal | Chemosphere |
| Volume | 314 |
| State | Published - Feb 2023 |
Bibliographical note
Publisher Copyright: © 2022 Elsevier Ltd
Keywords
- BA
- ERT
- GPVI
- KNN
- Modeling at two levels
ASJC Scopus subject areas
- Environmental Engineering
- Environmental Chemistry
- General Chemistry
- Pollution
- Public Health, Environmental and Occupational Health
- Health, Toxicology and Mutagenesis