This study proposes an ensemble machine learning approach to analyse the association between respiratory Emergency Room (ER) admissions and environmental factors, namely air pollution and meteorological conditions. The analysed variables include minimum, maximum, and average air temperature (Tmin, Tmax, Taverage), atmospheric pressure (P), relative humidity (RH), and concentrations of CO, O₃, PM₁₀, and NO₂. All variables were processed as daily averages to ensure consistency and comparability across the analyses. The ER data, provided by the Policlinico of Bari, cover the period from 2013 to 2023. Three ensemble regression models were applied: Random Forest, XGBoost, and AdaBoost. The models were trained on a dataset pre-processed with a 7-day exponential moving average (EMA7) to obtain a more stable time series, and their hyperparameters were tuned through Bayesian optimization. Among the analysed models, XGBoost showed high predictive capacity on the test set, with an R² of 0.772 and an MAE of 0.049 cases/day. Applying SHAP (SHapley Additive exPlanations) analysis to the XGBoost model made it possible to identify the variables that most influence hospital admissions and their related patterns. The most relevant features, ranked by importance, were low values of average air temperature, low values of atmospheric pressure, and high values of CO. These global results were obtained with the SHAP method, in particular through beeswarm plots.
Furthermore, to determine the values of the most important features that are associated with an increase in ER admissions for respiratory diseases, a local analysis was carried out with LIME. This analysis indicated that a higher incidence of respiratory admissions is associated with average temperatures lower than 12.28 °C, atmospheric pressure values lower than or equal to 1006.81 hPa, and CO concentrations greater than 0.84 mg/m³.
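The EMA7 pre-processing step mentioned above can be illustrated with a minimal sketch. The abstract does not state the smoothing factor or how the series is seeded, so the code below assumes the common convention alpha = 2 / (span + 1) and seeds the average with the first observation; the function name `ema` and the sample values are hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def ema(values: Sequence[float], span: int = 7) -> List[float]:
    """Exponential moving average of a daily series.

    Assumption (not stated in the abstract): alpha = 2 / (span + 1),
    and the first observation seeds the recursion.
    """
    if span < 1:
        raise ValueError("span must be >= 1")
    alpha = 2.0 / (span + 1)
    smoothed: List[float] = []
    for x in values:
        if not smoothed:
            # Seed the series with the first observation.
            smoothed.append(float(x))
        else:
            # New value = weighted mix of today's value and yesterday's average.
            smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical daily admission counts, for illustration only.
daily_admissions = [0, 8, 8, 0]
print(ema(daily_admissions))
```

Under these assumptions the recursion should match what pandas computes with `Series.ewm(span=7, adjust=False).mean()`, which may be a more convenient choice when the data are already in a DataFrame.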

The International Conference on Computational Science and its Applications - ICCSA 2025

Vito Telesca;Marica Rondinone
2025-01-01

ISBN: 978-3-031-97657-5

Use this identifier to cite or link to this document: https://hdl.handle.net/11563/204216