A Decadal Study of PM2.5 Concentrations over Delhi using MERRA-2 and Ground Measurements: Predictive Insights via Machine Learning

Authors

  • Sumit Singh, Department of Civil Engineering, Institute of Engineering and Technology, Lucknow, UP 226 021, India
  • Vikash Singh, Civil Engineering Department, Institute of Engineering and Technology, Lucknow
  • Ajay Kumar, Department of Civil Engineering, Institute of Engineering and Technology, Lucknow, UP 226 021, India
  • Amarendra Singh, Centre for Atmospheric Sciences, Indian Institute of Technology, Hauz Khas, New Delhi 110 016, India
  • Atul Kumar Srivastava, Indian Institute of Tropical Meteorology, Ministry of Earth Sciences, New Delhi 110 060, India
  • Virendra Pathak, Department of Civil Engineering, Institute of Engineering and Technology, Lucknow, UP 226 021, India

DOI:

https://doi.org/10.56042/ijpap.v62i9.11443

Keywords:

PM2.5 concentrations, Delhi, Machine learning models, air pollution, MERRA-2

Abstract

This study investigates the spatial and temporal variations of PM2.5 concentrations in Delhi from 2014 to 2023, using ground-based measurements from the Central Pollution Control Board (CPCB) and MERRA-2 reanalysis data. The analysis reveals strong positive correlations (r > 0.90) across all districts, pointing to city-wide factors that influence PM2.5 levels, such as vehicular emissions, industrial activities, and regional weather patterns. Seasonally, PM2.5 concentrations peak during winter, which is attributed to lower temperatures, reduced wind speeds, and increased emissions from heating sources.

To improve the accuracy of PM2.5 predictions, several machine learning (ML) models were employed: an Extra Trees Regressor, a Random Forest Regressor, a Light Gradient Boosting Machine (LGBM) Regressor, and a Stacking Regressor. These models used MERRA-2 sub-parameters such as Dust, Organic Carbon, Black Carbon, Sea Salt, and Sulfate as inputs. The Stacking Regressor performed best, achieving an R² value of 0.67 and a markedly improved correlation with CPCB measurements (r = 0.86). Compared with the original MERRA-2 data, the ML models changed the Mean Bias from -39.4 µg/m³ to around 10.4 µg/m³ and reduced the Root Mean Squared Error (RMSE) from 71.1 µg/m³ to below 40 µg/m³. Additionally, the fraction of predictions within a factor of 2 increased from 0.61 for MERRA-2 to over 0.89 for all ML models.

These findings underscore the effectiveness of combining machine learning models with MERRA-2 sub-parameters to estimate PM2.5 concentrations accurately. The approach provides more reliable predictions of air quality, which are essential for developing targeted and effective air quality management strategies in Delhi.
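The evaluation statistics quoted in the abstract (Mean Bias, RMSE, fraction within a factor of 2, and correlation r) can be illustrated with a minimal Python sketch. The abstract does not state the exact formulas, so the definitions below are the commonly used ones and are assumptions: bias is taken as prediction minus observation (so a negative value means underestimation), and the factor-of-2 criterion is 0.5 ≤ prediction/observation ≤ 2. The function names and the sample values are hypothetical and are not data from the study.

```python
import math


def mean_bias(pred, obs):
    """Mean of (prediction - observation); negative means underestimation."""
    return sum(p - o for p, o in zip(pred, obs)) / len(obs)


def rmse(pred, obs):
    """Root mean squared error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def fac2(pred, obs):
    """Fraction of pairs with 0.5 <= pred/obs <= 2 (assumed definition)."""
    hits = sum(1 for p, o in zip(pred, obs) if o > 0 and 0.5 <= p / o <= 2.0)
    return hits / len(obs)


def pearson_r(pred, obs):
    """Pearson correlation coefficient between predictions and observations."""
    n = len(obs)
    mean_p = sum(pred) / n
    mean_o = sum(obs) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_p * var_o)


# Hypothetical daily PM2.5 values in µg/m³ (illustrative only).
observed = [100.0, 200.0, 50.0, 150.0]   # stand-in for ground measurements
estimated = [60.0, 90.0, 40.0, 100.0]    # stand-in for a model estimate

print("Mean Bias:", mean_bias(estimated, observed))
print("RMSE:", rmse(estimated, observed))
print("FAC2:", fac2(estimated, observed))
print("r:", pearson_r(estimated, observed))
```

Applying the same four functions to the reanalysis estimate and to each ML model's output against the ground measurements gives the kind of side-by-side comparison the abstract reports.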

Published

2024-09-03