Monthly sodium adsorption ratio forecasting in rivers using a dual interpretable glass-box complementary intelligent system: Hybridization of ensemble TVF-EMD-VMD, Boruta-SHAP, and eXplainable GPR

Mehdi Jamei*, Mumtaz Ali, Masoud Karbasi, Bakhtiar Karimi, Neshat Jahannemaei, Aitazaz Ahsan Farooque, Zaher Mundher Yaseen

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

The sodium adsorption ratio (SAR) is the most crucial irrigation water quality indicator for diagnosing the suitability of agricultural water resources. For this reason, accurately forecasting monthly SAR from limited input sequences, in the absence of its own time series, has recently been recognized as a challenging environmental problem. This research developed, for the first time, a dual eXplainable multivariate expert framework to forecast monthly SAR in the Zayanderud River, Iran. The framework (i.e., BS-GPR-E.TVF-EMD-VMD) consists of Boruta coupled with SHapley Additive exPlanations (Boruta-SHAP) feature selection, an ensemble of time-varying filter-based empirical mode decomposition (TVF-EMD) and variational mode decomposition (VMD), denoted E.TVF-EMD-VMD, and eXplainable Gaussian process regression (GPR). The main novelty of this framework is converting the “black-box” nature of the forecasting model into a dual interpretable “glass box” before and during the learning process. For this purpose, the significant two-month antecedent information (lag) signals were extracted with Boruta-SHAP feature selection from nine hydrometric and water quality parameters recorded at two stations on the Zayanderud River (Regulating dam and Zaman Khan) over the period 1969 to 2016. Afterwards, the optimal input signal lags for each station were decomposed into sub-components, using three pre-processing techniques (i.e., E.TVF-EMD-VMD, TVF-EMD, and VMD), to reduce the complexity and non-stationarity of the original signals. The decomposed predictors were employed as inputs to the multilayer perceptron neural network (MLP), Random Forest (RF), Elman recurrent neural network (ERNN), and eXplainable GPR approaches.
Statistical validation and infographic tools revealed that BS-GPR-E.TVF-EMD-VMD achieved the best performance at the Regulating dam (R = 0.9817, RMSE = 0.1431, and NSE = 0.8866) and Zaman Khan (R = 0.9632, RMSE = 0.0610, and NSE = 0.9233) stations, outperforming the other complementary and standalone counterpart frameworks, followed by BS-GPR-TVF-EMD and BS-ERNN-E.TVF-EMD-VMD, respectively. The SHAP explainer, applied through the GPR model, clearly interpreted the effect of the lagged sub-components of each predictor and showed the impact of each decomposition technique on the input signals through E.TVF-EMD-VMD when forecasting SAR in standalone and complementary frameworks.
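
As a rough illustration of two ideas the abstract relies on, the sketch below builds lagged predictors from a monthly series and computes the three reported scores (R, RMSE, NSE) using their standard textbook definitions. The function names (`make_lagged`, `pearson_r`, `rmse`, `nse`) are hypothetical, the choice of lags 1 and 2 is an assumption based on the "two-month antecedent information" wording, and the abstract does not state the exact formulas or code the authors used.

```python
import math


def make_lagged(series, max_lag=2):
    """Build lagged predictors: each row holds the values 1..max_lag steps
    before the target (assumed lags; the paper's exact lag set is not given)."""
    rows = [
        [series[t - k] for k in range(1, max_lag + 1)]
        for t in range(max_lag, len(series))
    ]
    targets = list(series[max_lag:])
    return rows, targets


def pearson_r(obs, sim):
    """Pearson correlation coefficient (R) between observed and simulated values."""
    mean_o = sum(obs) / len(obs)
    mean_s = sum(sim) / len(sim)
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov / math.sqrt(var_o * var_s)


def rmse(obs, sim):
    """Root mean square error."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus the squared-error sum divided by
    the squared deviations of the observations from their mean."""
    mean_o = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - num / den
```

An NSE of 1 indicates a perfect match, while an NSE of 0 means the forecast is no better than predicting the observed mean. Because R measures only linear association while NSE also penalizes bias, the two scores can differ noticeably for the same model.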

Original language: English
Article number: 121512
Journal: Expert Systems with Applications
Volume: 237
State: Published - 1 Mar 2024

Bibliographical note

Publisher Copyright:
© 2023 Elsevier Ltd

Keywords

  • Boruta-SHAP
  • E.TVF-EMD-VMD
  • Glass-box
  • SHapley Additive exPlanations
  • Sodium adsorption ratio
  • Surface water quality

ASJC Scopus subject areas

  • General Engineering
  • Computer Science Applications
  • Artificial Intelligence
