Intercomparison and Assessment of Stand-Alone and Wavelet-Coupled Machine Learning Models for Simulating Rainfall-Runoff Process in Four Basins of Pothohar Region, Pakistan



Khan M. T.; Shoaib M.; Albano R.; Inam M. A.; Salahudin H.; Hammad M.; Ahmad S.; Ali M. U.; Hashim S.; Ullah M. K.

2023-01-01

Abstract

The science of hydrological modeling has continuously evolved under the influence of rapid advancements in software and hardware technologies. Starting from simple rational formulae for estimating peak discharge and developing into sophisticated univariate predictive models, accurate conversion of rainfall into runoff and the assessment of inherent uncertainty have been a prime focus for researchers. Therefore, alternative data-driven methods have gained widespread attention in hydrology. Moreover, scientists often couple conventional machine learning models with data pre-processing techniques, such as wavelet transformation (WT), to enhance modeling accuracy. In this context, this research work attempts to explore the latent linkage between rainfall and runoff in the Pothohar region of Pakistan by developing a novel coupling of five machine learning techniques, namely single decision tree (SDT), decision tree forest (DTF), tree boost (TB), multilayer perceptron (MLP), and gene expression programming (GEP), with a more sophisticated variant of WT, the maximal overlap discrete wavelet transformation (MODWT), for boundary correction of the transformed components of time series data. This study also implements these machine learning models in a stand-alone mode for a more comprehensive comparative analysis of performance. Furthermore, the study uses a combined-basin approach that divides the Pothohar region into two basins to compensate for the complex topographic division of the study area. The results indicate that MODWT-based DTF outperformed the other stand-alone and hybrid models in terms of modeling accuracy. In the first scenario, considering the Bunha-Kahan River basin, MODWT-DTF yielded the highest NSE (0.86) and R² (0.92) and the lowest RMSE (220.45 mm) at lag order 3 (Lo3) when transformed with Daubechies 4 (db4) at level three. In the Soan-Haro River basin, MODWT-DTF produced the highest modeling accuracy at lag order 4 (Lo4) (NSE = 0.88, RMSE = 21.72 m³/s, and R² = 0.91). The strong performance of the 3- and 4-day lagged models reflects the temporal consistency in the hydrological response of the study area. The comparison of stand-alone and hybrid model performance indicates up to a 55% increase in modeling accuracy due to data pre-processing with wavelet transformation.
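
As an illustration of the evaluation metrics quoted in the abstract, the following is a minimal Python sketch of RMSE and NSE (Nash-Sutcliffe efficiency), together with one possible way of building lagged inputs. It is not the authors' code. The function names are hypothetical, and the reading of "lag order k" as "the k preceding days of rainfall" is an assumption; the study may define its lagged inputs differently.

```python
import math


def rmse(observed, simulated):
    """Root mean square error, in the same units as the input series."""
    n = len(observed)
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(observed, simulated)) / n)


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 is no better than the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def lagged_inputs(rainfall, runoff, lag):
    """Pair each runoff value with the `lag` preceding rainfall values.

    Assumption: lag order k uses rainfall from days t-k .. t-1 to predict runoff on day t.
    """
    return [(rainfall[t - lag:t], runoff[t]) for t in range(lag, len(rainfall))]
```

A model that always predicts the observed mean scores NSE = 0, so the reported values of 0.86 and 0.88 indicate that most of the variance in observed runoff is reproduced. RMSE carries the units of the runoff series, which is why it cannot be compared directly between basins reported in different units.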


Publication year

2023

Appears in categories:

1.1 Journal Article

Files in this item:

File	Size	Format
atmosphere-14-00452.pdf (open access; type: post-print document; license: not defined)	5.62 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11563/166734
