Abstract
In the present study, nine deep learning models were tested for modeling coagulant dosage in an Algerian water treatment plant. Seven daily measured raw water quality variables were collected over the period from 01 February 2018 to 02 August 2023 (1345 samples), and the dose of aluminum sulfate (Al2(SO4)3·18H2O) was the coagulant dose to be modeled. The raw water quality variables were: (i) water temperature (Tw), (ii) water pH, (iii) specific conductance (SC), (iv) water turbidity (TU), (v) water dissolved oxygen (DO), (vi) water ultraviolet absorption (UV254), and (vii) water color (COU). The objective was to provide robust deep learning models for predicting coagulant dosage by comparing single deep learning models, i.e., long short-term memory (LSTM), bidirectional long short-term memory (BiLSTM), gated recurrent unit (GRU), bidirectional gated recurrent unit (BiGRU), and convolutional neural network (CNN), with CNN-based hybrid models, i.e., LSTM-CNN, BiLSTM-CNN, GRU-CNN, and BiGRU-CNN. The models were developed for various input combinations under two scenarios: without periodicity (scenario 01) and with periodicity (scenario 02). Model performance was evaluated using the root-mean-square error (RMSE), mean absolute error (MAE), Nash-Sutcliffe efficiency (NSE), and correlation coefficient (R). Furthermore, the SHapley Additive exPlanations (SHAP) algorithm was used for global and local model interpretability. In both scenarios, the CNN-based hybrid models significantly surpassed the single models, and the predictive accuracy of scenario 02 was superior to that of scenario 01, showing the importance of including periodicity.
For scenario 01 (without periodicity), the LSTM-CNN model exhibited the best performance, with RMSE, MAE, R, and NSE values of 2.739 mg/L, 1.824 mg/L, 0.934, and 0.869, respectively, while the BiLSTM model was the poorest, with RMSE, MAE, R, and NSE values of 4.428 mg/L, 3.267 mg/L, 0.831, and 0.658, respectively. When periodicity was included, i.e., the day (DD), month (MM), and year (YY) numbers, the correlation between measured and predicted coagulant dosage was strong and all models improved markedly; the LSTM-CNN model again exhibited the best performance, with RMSE, MAE, R, and NSE values of 2.000 mg/L, 1.212 mg/L, 0.966, and 0.930, respectively. Results obtained using the SHAP algorithm revealed that, in most cases, specific conductance and color were the most influential variables for coagulant dosage.
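For readers unfamiliar with the indices named above, the following is a minimal sketch of how RMSE, MAE, NSE, and R are commonly computed, and of how a sampling date could be split into the DD, MM, and YY periodicity inputs. The function names and the sample values are hypothetical and purely illustrative; they are not the study's code or data.

```python
from datetime import date

import numpy as np


def rmse(obs, pred):
    """Root-mean-square error, in the units of the dose (mg/L)."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    return float(np.sqrt(np.mean((obs - pred) ** 2)))


def mae(obs, pred):
    """Mean absolute error, in the units of the dose (mg/L)."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    return float(np.mean(np.abs(obs - pred)))


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    return float(1.0 - np.sum((obs - pred) ** 2) / np.sum((obs - obs.mean()) ** 2))


def corr(obs, pred):
    """Pearson correlation coefficient R between measured and predicted doses."""
    obs, pred = np.asarray(obs, dtype=float), np.asarray(pred, dtype=float)
    return float(np.corrcoef(obs, pred)[0, 1])


def periodicity_inputs(sampling_date):
    """Split a date into (DD, MM, YY) numbers (hypothetical helper)."""
    return (sampling_date.day, sampling_date.month, sampling_date.year)


# Made-up measured and predicted doses (mg/L), for illustration only.
measured = [20.0, 25.0, 30.0, 35.0, 40.0]
predicted = [22.0, 24.0, 31.0, 33.0, 41.0]

scores = {
    "RMSE": rmse(measured, predicted),
    "MAE": mae(measured, predicted),
    "NSE": nse(measured, predicted),
    "R": corr(measured, predicted),
}
for name, value in scores.items():
    print(f"{name}: {value:.3f}")

# First sampling day reported in the abstract, expressed as periodicity inputs.
print(periodicity_inputs(date(2018, 2, 1)))
```

RMSE and MAE carry the units of the dose, whereas NSE and R are dimensionless, which is why only the first two are reported in mg/L.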
| Original language | English |
|---|---|
| Article number | 109 |
| Journal | Water Conservation Science and Engineering |
| Volume | 10 |
| Issue number | 3 |
| DOIs | |
| State | Published - Dec 2025 |
Bibliographical note
Publisher Copyright: © The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd. 2025.
Keywords
- Coagulant dose
- Deep learning
- Modeling
- SHAP
- Water treatment
ASJC Scopus subject areas
- Environmental Engineering
- Water Science and Technology
- Ocean Engineering
- Waste Management and Disposal