Estimating surface-level NO2 concentrations in the Madrid region using Sentinel-5P observations and ground-based meteorological data with machine learning approaches

Giosa R.; Serio C.; Masiello G.; Liuzzi G.
2026-01-01

Abstract

Reliable estimation of surface-level nitrogen dioxide (NO2) concentrations is critical for air quality assessment in urban regions, where ground-based monitoring networks often provide limited spatial coverage. Satellite observations from Sentinel-5P offer valuable information on tropospheric NO2 columns, but translating them into surface concentrations remains challenging because of strong spatial heterogeneity and meteorological influences. This study investigates the estimation of daily surface-level NO2 concentrations across the Community of Madrid (Spain) by combining Sentinel-5P observations with routinely available ground-based meteorological data using machine learning approaches. Four modelling paradigms were evaluated: Random Forest, Support Vector Machine, hybrid ensemble models based on Random Forest and XGBoost, and Artificial Neural Networks. The analysis systematically assessed the influence of temporal and spatial preprocessing by comparing two temporal aggregation strategies (satellite overpass conditions alone, and a dual-time window that also includes antecedent atmospheric conditions) and four spatial configurations ranging from regional aggregation to environmentally coherent clustering. Results indicate that incorporating historical meteorological information through an extended temporal window consistently improves predictive performance across all models. Spatial stratification proved equally critical: grouping monitoring stations into environmentally coherent clusters substantially outperformed both purely geometric grids and region-wide aggregation. The best-performing configuration, an Artificial Neural Network combined with simplified coherent spatial clusters, achieved an RMSE of 2.44 μg/m³, an R² of 0.87, and an MAE of 1.61 μg/m³. These findings demonstrate that high predictive accuracy can be achieved through informed temporal and spatial design choices without increasing model complexity or relying on auxiliary emission inventories or chemical transport models, providing a transferable framework for urban air quality monitoring.
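
For readers unfamiliar with the three scores quoted in the abstract, the following is a minimal sketch of how RMSE, MAE, and R² are conventionally computed from paired observed and predicted concentrations. It assumes the standard textbook definitions; the abstract does not state which R² variant the authors used. The function name `regression_scores` and the toy values are hypothetical and are not taken from the study.

```python
import math


def regression_scores(observed, predicted):
    """Return (rmse, mae, r2) for paired observed/predicted values.

    Assumes the conventional definitions; R2 is the coefficient of
    determination, 1 - SS_res / SS_tot.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    rmse = math.sqrt(ss_res / n)                  # root mean squared error
    mae = sum(abs(e) for e in errors) / n         # mean absolute error
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    # R2 is undefined when all observations are identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return rmse, mae, r2


# Toy values in ug/m3, invented purely for illustration (not data from the study).
obs = [10.0, 20.0, 30.0, 40.0]
pred = [12.0, 18.0, 33.0, 39.0]
rmse, mae, r2 = regression_scores(obs, pred)
print(f"RMSE={rmse:.2f}  MAE={mae:.2f}  R2={r2:.3f}")
```

RMSE and MAE carry the units of the target (here μg/m³), while R² is dimensionless, which is why the abstract reports units for the first two only.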
Files in this item:
1-s2.0-S1352231026001330-main (1).pdf
Access: open access
Type: publisher's PDF
License: Creative Commons
Size: 4.47 MB
Format: Adobe PDF


Use this identifier to cite or link to this document: https://hdl.handle.net/11563/212977