River water quality index prediction and uncertainty analysis: A comparative study of machine learning models

  • Seyed Babak Haji Seyed Asadollah
  • Ahmad Sharafati*
  • Davide Motta
  • Zaher Mundher Yaseen

  * Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

321 Scopus citations

Abstract

The Water Quality Index (WQI) is the most common indicator to characterize surface water quality. This study introduces a new ensemble machine learning model called Extra Tree Regression (ETR) for predicting monthly WQI values at the Lam Tsuen River in Hong Kong. The ETR model performance is compared with that of the classic standalone models, Support Vector Regression (SVR) and Decision Tree Regression (DTR). The monthly input water quality data including Biochemical Oxygen Demand (BOD), Chemical Oxygen Demand (COD), Dissolved Oxygen (DO), Electrical Conductivity (EC), Nitrate-Nitrogen (NO₃-N), Nitrite-Nitrogen (NO₂-N), Phosphate (PO₄³⁻), potential for Hydrogen (pH), Temperature (T) and Turbidity (TUR) are used for building the prediction models. Various input data combinations are investigated and assessed in terms of prediction performance, using numerical indices and graphical comparisons. The analysis shows that the ETR model generally produces more accurate WQI predictions for both training and testing phases. Although including all the ten input variables achieves the highest prediction performance (testing R² = 0.98, RMSE = 2.99), a combination of input parameters including only BOD, Turbidity and Phosphate concentration provides the second highest prediction accuracy (testing R² = 0.97, RMSE = 3.74). The uncertainty analysis relative to model structure and input parameters highlights a higher sensitivity of the prediction results to the former. In general, the ETR model represents an improvement on previous approaches for WQI prediction, in terms of prediction performance and reduction in the number of input parameters.
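The two numerical indices quoted in the abstract, R² and RMSE, can be illustrated with a minimal sketch. The abstract does not say which definition of R² the study uses; the code below assumes the coefficient of determination, 1 − SS_res/SS_tot. The function names and the WQI values are hypothetical and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1 - ss_res / ss_tot


# Hypothetical monthly WQI values, made up for illustration only.
observed_wqi = [50.0, 60.0, 70.0, 80.0]
predicted_wqi = [53.0, 58.0, 71.0, 78.0]

print("RMSE:", rmse(observed_wqi, predicted_wqi))
print("R^2:", r_squared(observed_wqi, predicted_wqi))
```

RMSE is expressed in the same units as the WQI, whereas R² is dimensionless, which is why the two are usually reported together when comparing models.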

Original language: English
Article number: 104599
Journal: Journal of Environmental Chemical Engineering
Volume: 9
Issue number: 1
State: Published - Feb 2021
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2020 Elsevier Ltd.

Keywords

  • Ensemble machine learning
  • Lam Tsuen river
  • River water quality
  • Water quality index

ASJC Scopus subject areas

  • Chemical Engineering (miscellaneous)
  • Waste Management and Disposal
  • Pollution
  • Process Chemistry and Technology
