A Decadal Study of PM2.5 Concentrations over Delhi using MERRA-2 and Ground Measurements: Predictive Insights via Machine Learning

Sumit Singh; Vikash Singh; Ajay Kumar; Amarendra Singh; Atul Kumar Srivastava; Virendra Pathak

doi:10.56042/ijpap.v62i9.11443

Authors

Sumit Singh Department of Civil Engineering, Institute of Engineering and Technology, Lucknow, UP 226 021, India
Vikash Singh Civil Engineering Department, Institute of Engineering and Technology, Lucknow
Ajay Kumar Department of Civil Engineering, Institute of Engineering and Technology, Lucknow, UP 226 021, India
Amarendra Singh Centre for Atmospheric Sciences, Indian Institute of Technology, Hauz Khas, New Delhi 110 016, India
Atul Kumar Srivastava Indian Institute of Tropical Meteorology, Ministry of Earth Sciences, New Delhi 110 060, India
Virendra Pathak Department of Civil Engineering, Institute of Engineering and Technology, Lucknow, UP 226 021, India

DOI:

https://doi.org/10.56042/ijpap.v62i9.11443

Keywords:

PM2.5 concentrations, Delhi, Machine learning models, air pollution, MERRA-2

Abstract

This study investigates the spatial and temporal variations of PM2.5 concentrations in Delhi from 2014 to 2023, using ground-based measurements from the Central Pollution Control Board (CPCB) and MERRA-2 reanalysis data. The analysis reveals strong positive correlations (r > 0.90) across all districts, highlighting city-wide factors influencing PM2.5 levels, such as vehicular emissions, industrial activities, and regional weather patterns. Seasonal patterns show PM2.5 concentrations peaking during winter, attributed to lower temperatures, reduced wind speeds, and increased emissions from heating sources. To enhance the accuracy of PM2.5 predictions, various machine learning (ML) models were employed, including Extra Trees Regressor, Random Forest Regressor, Light Gradient Boosting Machine (LGBM) Regressor, and a Stacking Regressor. These models used MERRA-2 sub-parameters such as Dust, Organic Carbon, Black Carbon, Sea Salt, and Sulfate. The Stacking Regressor demonstrated the best performance, achieving an R² value of 0.67 and a significant improvement in correlation with CPCB measurements (r = 0.86). The ML models significantly improved the prediction accuracy of PM2.5 concentrations compared to the original MERRA-2 data, reducing the Mean Bias from -39.4 µg/m³ to around 10.4 µg/m³ and the Root Mean Squared Error (RMSE) from 71.1 µg/m³ to below 40 µg/m³. Additionally, the fraction of predictions within a factor of 2 increased from 0.61 for MERRA-2 to over 0.89 for all ML models. These findings underscore the effectiveness of integrating machine learning models with MERRA-2 sub-parameters to accurately estimate PM2.5 concentrations. This approach provides more reliable predictions of air quality, essential for developing targeted and effective air quality management strategies in Delhi.
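
The abstract compares predictions against CPCB measurements using Mean Bias, RMSE, the fraction of predictions within a factor of 2, and the correlation coefficient r. As a minimal sketch of how such scores are commonly computed for paired samples, the Python function below may help. The function name `evaluation_metrics`, its argument names, the sign convention for Mean Bias (predicted minus observed), and the inclusive factor-of-2 bounds are assumptions for illustration; the abstract does not state the exact formulas the authors used.

```python
import math


def evaluation_metrics(observed, predicted):
    """Score paired PM2.5 values (hypothetical helper, not the authors' code).

    observed  -- ground-based concentrations, e.g. in µg/m³
    predicted -- model or reanalysis estimates in the same units
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally sized samples with at least 2 points")

    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    # Mean Bias: average signed error (assumed convention: predicted - observed),
    # so a negative value means the predictions run low.
    mean_bias = sum(errors) / n

    # RMSE: square root of the mean squared error.
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Fraction within a factor of 2: share of pairs with
    # 0.5 <= predicted / observed <= 2.0 (pairs with observed <= 0 are skipped).
    ratios = [p / o for o, p in zip(observed, predicted) if o > 0]
    fac2 = sum(1 for r in ratios if 0.5 <= r <= 2.0) / len(ratios) if ratios else float("nan")

    # Pearson correlation coefficient r.
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    pearson_r = cov / math.sqrt(var_o * var_p) if var_o > 0 and var_p > 0 else float("nan")

    return {"mean_bias": mean_bias, "rmse": rmse, "fac2": fac2, "r": pearson_r}
```

Calling `evaluation_metrics(cpcb_values, merra2_values)` and then `evaluation_metrics(cpcb_values, ml_values)` on the same observations (all three list names are placeholders) would give the kind of before-and-after comparison the abstract reports.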
