An Explainable Machine Learning Framework for Flood Damage Mapping Using Remote Sensing and Ground-Based Data: Application to the Basilicata Ionian Coast (Italy)

Dal Sasso, Silvano Fortunato; Aung, Htay Htay; Telesca, Vito
2026-01-01

Abstract

Flood damage assessment remains challenging because conventional flood risk management relies mainly on hydraulic hazard maps, which do not explicitly reproduce observed damage patterns. Recent advances in remote sensing and machine learning (ML) allow environmental and socio-economic data to be integrated with historical impact information to improve flood damage modeling. This study proposes an explainable ML framework for flood damage susceptibility mapping that combines observed institutional damage records from the 2011 and 2013 flood events with 17 geospatial flood risk factors (FRFs) representing hazard, exposure, and vulnerability. This approach captures non-linear relationships between flood damage and the FRFs. For comparison, the same framework was also applied to hydraulically modeled flood extents for return periods of 30, 200, and 500 years. The framework was tested along the Basilicata Ionian coast in southern Italy, a Mediterranean region characterized by complex geomorphology, intense rainfall events, and recurrent flood impacts. An eXtreme Gradient Boosting (XGBoost) model was trained on the 17 FRFs at a spatial resolution of 20 m. The model achieved an accuracy of 0.988, an F1-score of 0.860 for the minority class, and a test ROC-AUC of 0.996. High to very high flood damage probability was predicted in approximately 4.1% of the study area, mainly in low-lying floodplains near river corridors and infrastructure. SHAP-based explainability analysis showed that damage susceptibility was driven predominantly by hazard and exposure factors, namely Drainage density (17.10%), Railway distance (16.33%), Elevation (15.42%), extreme precipitation (Max rainfall, 10.66%), and Street distance (7.51%), whereas socio-economic vulnerability contributed less than 4%.
The observed damage target exhibited clear threshold-like patterns (e.g., sharp risk increases below ~25/35 m elevation or within ~150/200 m of road infrastructure), in contrast to the smoother, continuous gradients produced by the hydraulic scenarios. The analysis identified the most influential predictors and their response ranges. The proposed framework complements hydraulic hazard mapping by explicitly modeling observed flood damage, thereby supporting flood risk assessment in flood-prone coastal regions.
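The two kinds of figures quoted in the abstract, a minority-class F1-score and per-factor percentage contributions, can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes that the percentages are mean absolute SHAP values normalized to sum to 100 (a common convention that the abstract does not confirm), and every function name, feature name, and number below is hypothetical toy data.

```python
from typing import Dict, List, Sequence


def minority_f1(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1-score of the `positive` class (here: the rarer 'damaged' class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
    return 2 * tp / denom if denom else 0.0


def shap_shares(shap_rows: List[Dict[str, float]]) -> Dict[str, float]:
    """Percentage share per feature: mean |SHAP| normalized to sum to 100.

    ASSUMPTION: this normalization is illustrative, not taken from the paper.
    """
    features = list(shap_rows[0].keys())
    n = len(shap_rows)
    mean_abs = {f: sum(abs(row[f]) for row in shap_rows) / n for f in features}
    total = sum(mean_abs.values())
    return {f: 100.0 * v / total for f, v in mean_abs.items()}


# Hypothetical toy labels: 1 = damaged cell (minority), 0 = undamaged cell.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]
print("minority-class F1:", round(minority_f1(y_true, y_pred), 3))

# Hypothetical per-cell SHAP values for two made-up factors.
rows = [
    {"elevation": -0.4, "drainage_density": 0.6},
    {"elevation": 0.2, "drainage_density": -0.2},
]
for name, share in shap_shares(rows).items():
    print(f"{name}: {share:.2f}%")
```

Reporting F1 for the minority class alongside accuracy matters when damaged cells are rare, because a classifier can reach high accuracy while missing most of them.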
Files in this item:
remotesensing-18-01257.pdf (open access; type: publisher's PDF; license: publisher's version; 6.4 MB, Adobe PDF)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11563/214336