Machine Learning in Hydrology
Data-driven models for flood susceptibility mapping, environmental monitoring, and hydrological prediction—bridging process-based understanding with the power of large, heterogeneous datasets.
Data-Driven Flood Modelling
Machine learning techniques are increasingly used to model complex hydrological processes by integrating large and heterogeneous datasets. In the context of flood hazard assessment, data-driven approaches complement traditional physically-based models by efficiently capturing non-linear relationships between environmental predictors and observed flood patterns.
This research line applies Random Forest (RF), deep neural networks (DNN), and other ensemble methods to map flood susceptibility at regional to national scales. A distinctive element is the integration of satellite observations (SAR, multispectral) with geomorphic terrain descriptors such as the Geomorphic Flood Index (GFI), elevation, slope, and contributing area—grounding purely data-driven predictions in physical terrain attributes.
A novel feature selection framework, the Average Merit of Information (AMI), has been developed to maximise the information content of flood conditioning factors (FCFs) while controlling for redundancy through Pearson correlation and Variance Inflation Factor (VIF) analysis. Results demonstrate that as few as 10 well-chosen predictors can yield high-accuracy susceptibility maps, and that the GFI substantially reduces overestimation near river corridors.
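The redundancy control mentioned above can be illustrated with a minimal sketch. It does not reproduce the AMI ranking, whose formulation is not given here; it only shows a generic Pearson/VIF screen. The function names, the toy predictor values, and the cut-off `max_vif=5.0` (a common rule of thumb) are assumptions made for illustration, not values taken from the study.

```python
import math

def pearson_matrix(columns):
    """Pearson correlation matrix for a list of equal-length numeric columns."""
    n = len(columns[0])
    centred = []
    for col in columns:
        mean = sum(col) / n
        centred.append([v - mean for v in col])
    norms = [math.sqrt(sum(v * v for v in c)) for c in centred]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centred[i], centred[j])) / (norms[i] * norms[j])
             for j in range(k)] for i in range(k)]

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    k = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular correlation matrix (perfectly collinear predictors)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def vif(columns):
    """VIF of each predictor, read off the diagonal of the inverse correlation matrix."""
    inv = invert(pearson_matrix(columns))
    return [inv[i][i] for i in range(len(columns))]

def screen(names, columns, max_vif=5.0):
    """Drop the predictor with the highest VIF, one at a time, until all are <= max_vif."""
    names, columns = list(names), list(columns)
    while len(names) > 1:
        scores = vif(columns)
        worst = max(range(len(names)), key=lambda i: scores[i])
        if scores[worst] <= max_vif:
            break
        del names[worst], columns[worst]
    return names

# Hypothetical toy predictors: "hand" is nearly a copy of "elevation".
names = ["elevation", "hand", "slope", "rain"]
columns = [
    [10, 20, 30, 40, 50, 60, 70, 80],
    [11, 19, 31, 39, 51, 59, 71, 79],
    [3, 1, 4, 1, 5, 9, 2, 6],
    [5, 3, 8, 1, 9, 2, 7, 4],
]
kept = screen(names, columns)
print(kept)
```

On this toy input one of the two near-duplicate terrain predictors is removed and the other three are kept. A real workflow would rank the candidates by information content first (the role AMI plays in the study) and apply a redundancy screen like this one to the ranked list.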
From Data to Susceptibility Map
Data Collection
Assemble multi-source predictors: DEM derivatives (slope, aspect, curvature, GFI, HAND), satellite imagery (SAR backscatter, spectral indices), land use, soil, rainfall, and drainage network features.
Flood Inventory
Compile observed flood extents from satellite observations, regional records, and official hazard maps to build training and validation datasets.
Feature Selection
Apply AMI ranking, Pearson correlation, and VIF analysis to identify the most informative and non-redundant subset of flood conditioning factors.
Model Training & Calibration
Train Random Forest, DNN, or other ensemble classifiers. Optimise hyperparameters via cross-validation. Calibrate probability thresholds using ROC analysis.
Validation & Map Production
Validate against independent flood extents and official hazard maps. Generate continuous susceptibility maps and binary flood-prone/safe rasters with performance metrics.
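The threshold calibration in the workflow above can be sketched as follows. The source says only that probability thresholds are calibrated using ROC analysis; choosing the threshold that maximises Youden's J statistic (TPR minus FPR) is one common option and is an assumption here, as are the function names and the toy labels and scores.

```python
def roc_points(y_true, scores):
    """Return (threshold, FPR, TPR) for each distinct score, highest threshold first.

    A cell is classified as flood-prone when its score is >= threshold.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    points = []
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        points.append((t, fp / neg, tp / pos))
    return points

def youden_threshold(y_true, scores):
    """Threshold that maximises Youden's J = TPR - FPR."""
    return max(roc_points(y_true, scores), key=lambda p: p[2] - p[1])[0]

def auc(y_true, scores):
    """Area under the ROC curve by the trapezoidal rule, starting from (0, 0)."""
    pts = [(0.0, 0.0)] + [(f, t) for _, f, t in roc_points(y_true, scores)]
    return sum((x1 - x0) * (y1 + y0) / 2 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

# Hypothetical validation sample: 1 = observed flooded cell, 0 = non-flooded cell.
labels = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]
scores = [0.10, 0.20, 0.30, 0.40, 0.45, 0.50, 0.60, 0.70, 0.80, 0.90]

threshold = youden_threshold(labels, scores)
area = auc(labels, scores)
binary_map = [1 if s >= threshold else 0 for s in scores]
print(threshold, area, binary_map)
```

The continuous scores stand in for the classifier's susceptibility output, and `binary_map` corresponds to the flood-prone/safe raster derived from it. Other criteria, such as a fixed true-positive rate, would give a different threshold from the same ROC curve.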
Key Findings
From predictor selection to national-scale susceptibility mapping.
Publications
Peer-reviewed articles on machine learning methods for flood susceptibility and hydrological prediction.
This study applies a Random Forest model to assess flood susceptibility across Italy using 26 potential flood conditioning factors (FCFs). A novel feature selection framework called Average Merit of Information (AMI) maximises the information content of predictors while controlling redundancy. Results show that 10 well-chosen predictors are sufficient for accurate flood representation, and that the Geomorphic Flood Index (GFI) substantially reduces overestimation near flooded river corridors. Satellite observations and regional historical flood records were used to calibrate the model. Official flood hazard maps were used for generalisation assessment.
@article{saavedra2025rf,
title = {{Mapping Flood Susceptibility Using Random Forest Exploiting
Satellite Observations and Geomorphic Features}},
author = {{Saavedra Navarro}, Jorge and Zhuang, Ruodan and
Albertini, Cinzia and Manfreda, Salvatore},
journal = {Science of the Total Environment},
volume = {1002},
pages = {180592},
year = {2025},
doi = {10.1016/j.scitotenv.2025.180592}
}
@article{albertini2024rf,
title = {{Assessing Multi-source Random Forest Classification and
Robustness of Predictor Variables in Flooded Areas Mapping}},
author = {Albertini, C. and Gioia, A. and Iacobellis, V. and
Petropoulos, G. P. and Manfreda, Salvatore},
journal = {Remote Sensing Applications: Society and Environment},
volume = {35},
pages = {101239},
year = {2024},
doi = {10.1016/j.rsase.2024.101239}
}
@inproceedings{balestra2022dnn,
title = {{Flood Susceptibility Mapping Using a Deep Neural
Network Model: The Case Study of Southern Italy}},
author = {Balestra, F. and {Del Vecchio}, M. and Pirone, D. and
Pedone, M. A. and Spina, D. and Manfreda, S. and
Menduni, G. and Bignami, D. F.},
booktitle = {Environmental Sciences Proceedings},
volume = {21},
pages = {36},
year = {2022},
doi = {10.3390/environsciproc2022021036}
}
This study develops predictive models of envelope flood extents from geomorphic and climatic-hydrologic catchment characteristics. It shows that terrain-derived features, combined with catchment-scale hydrological and climatic descriptors, can generalise flood extent predictions across diverse physiographic settings.
@article{tavarescosta2020,
title = {{Predictive Modelling of Envelope Flood Extents Using
Geomorphic and Climatic-Hydrologic Catchment Characteristics}},
author = {{Tavares da Costa}, R. and Zanardo, S. and Bagli, S. and
Hilberts, A. G. J. and Manfreda, S. and Samela, C. and
Castellarin, A.},
journal = {Water Resources Research},
year = {2020},
doi = {10.1029/2019WR026453}
}
Downloads
PDFs freely available.

