Source apportionment of air pollution burden using geometric non-negative matrix factorization and high-throughput multi-pollutant air sensor data in Curtis Bay, Baltimore, USA
Air sensor networks provide hyperlocal, high temporal resolution data on multiple pollutants that can support credible identification of common pollution sources. Source apportionment using least squares-based non-negative matrix factorization is non-unique and often does not scale. A recent geometric source apportionment framework focuses inference on the source attribution matrix, which is shown to remain identifiable even when the factorization is not. Recognizing that the method scales with and benefits from large data volumes, we use this geometric method to analyze 451,946 one-minute air sensor records from Curtis Bay, collected from October 21, 2022 to June 16, 2023, covering size-resolved particulate matter (PM), black carbon (BC), carbon monoxide (CO), nitric oxide (NO), and nitrogen dioxide (NO2). The analysis identifies three stable sources. Source 1 explains > 70% of fine and coarse PM and ~30% of BC. Source 2 dominates CO and contributes ~70% of BC, NO, and NO2. Source 3 is specific to the larger PM fractions, PM10 to PM40. Regression analyses show Source 1 and Source 3 rise during bulldozer activity at a nearby coal terminal and under winds from the terminal, indicating a direct coal terminal influence, while Source 2 exhibits diurnal patterns consistent with traffic. A case-study on the day with a known bulldozer incident at the coal terminal further confirms the association of terminal activities with Sources 1 and 3. Extreme episodes identified from Source 1 intensity affected ~33 minutes per day at the study site nearest the coal terminal, with impacts attenuating at locations farther from the terminal. The results are stable under sensitivity analyses. The analysis demonstrates that geometric source apportionment, paired with high temporal resolution data from multi-pollutant air sensor networks, delivers scalable and reliable evidence to inform mitigation strategies.
💡 Research Summary
This study presents a novel application of geometric non-negative matrix factorization (NMF) for source apportionment of air pollution, using high-throughput, multi-pollutant sensor data from the Curtis Bay neighborhood in Baltimore, USA. The research addresses the well-known limitations of traditional least-squares-based NMF methods, such as non-unique solutions and poor scalability with large datasets. Instead, it employs a recently developed geometric framework that focuses inference on the identifiable “source attribution matrix,” which quantifies the percentage contribution of each latent source to each pollutant’s concentration.
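To make the "source attribution matrix" concrete, here is a minimal sketch, assuming a factorization of the data into per-record source intensities `W` and source profiles `H`. The function name `attribution_matrix`, the toy numbers, and the exact formula (time-summed intensity times profile, normalized per pollutant) are illustrative assumptions, not the paper's implementation. The sketch only demonstrates invariance to rescaling of the factors, which is the simplest form of non-uniqueness; the paper's identifiability result is broader than this.

```python
def attribution_matrix(W, H):
    """Percent of each pollutant's modeled total attributed to each source.

    W: per-record source intensities (n records x K sources).
    H: source profiles (K sources x J pollutants).
    """
    K, J = len(H), len(H[0])
    # Total intensity of each source over all records.
    totals = [sum(row[k] for row in W) for k in range(K)]
    # contribution[k][j]: modeled amount of pollutant j coming from source k.
    contribution = [[totals[k] * H[k][j] for j in range(J)] for k in range(K)]
    col_sums = [sum(contribution[k][j] for k in range(K)) for j in range(J)]
    return [[100.0 * contribution[k][j] / col_sums[j] for j in range(J)]
            for k in range(K)]

# Hypothetical toy factorization: 4 records, 2 sources, 3 pollutants.
W = [[1.0, 0.5], [2.0, 0.1], [0.5, 3.0], [1.5, 1.0]]
H = [[4.0, 1.0, 0.2], [0.5, 2.0, 3.0]]
phi = attribution_matrix(W, H)

# Rescale source 1 by 10 and source 2 by 0.25. This gives different factors
# (W * D^-1 and D * H) that reproduce exactly the same data.
d = [10.0, 0.25]
W2 = [[row[k] / d[k] for k in range(2)] for row in W]
H2 = [[d[k] * h for h in H[k]] for k in range(2)]
phi2 = attribution_matrix(W2, H2)
```

Each column of `phi` sums to 100, and `phi2` matches `phi` up to floating-point error even though `W2` and `H2` differ from `W` and `H`. This is why percentages of this kind can be reported even when the individual factors cannot be pinned down.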
The analysis used 451,946 one-minute records collected from October 21, 2022 to June 16, 2023 across four monitoring sites. Measurements included size-resolved particulate matter (PM1, PM2.5, PM10, TSP), black carbon (BC), carbon monoxide (CO), nitric oxide (NO), and nitrogen dioxide (NO2). The geometric NMF algorithm treated each one-minute record as a point in a multi-dimensional space with one coordinate per pollutant, estimated the convex hull of this point cloud, and identified its extreme vertices as the source profiles.
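The convex-hull idea described above can be illustrated with a toy example. This is a sketch under strong assumptions, not the paper's estimator: the data are noiseless, each source appears once as a "pure" record, there are only three hypothetical pollutants, and records are normalized to sum to one (an assumed preprocessing step) so that the points fall inside a triangle in two dimensions. The hull is computed with Andrew's monotone chain algorithm; all names and numbers are made up for illustration.

```python
import random

def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def normalize(record):
    """Scale a record so its pollutant values sum to one."""
    total = sum(record)
    return tuple(v / total for v in record)

# Hypothetical source profiles over three pollutants.
profiles = [(8.0, 1.0, 1.0), (1.0, 6.0, 3.0), (2.0, 2.0, 16.0)]

random.seed(0)
records = list(profiles)  # one pure-source record per source
for _ in range(2000):
    # Every source contributes a strictly positive amount to mixed records.
    w = [random.uniform(0.05, 1.0) for _ in profiles]
    records.append(tuple(sum(w[k] * profiles[k][j] for k in range(3))
                         for j in range(3)))

# After normalization the third coordinate is redundant, so keep two.
points = [normalize(r)[:2] for r in records]
vertices = convex_hull(points)
```

In this idealized setting the hull has exactly three vertices, and they coincide with the normalized source profiles. Real sensor data are noisy and higher-dimensional, so a practical method has to estimate the hull rather than read it off directly.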
The model identified three stable pollution sources. Source 1 was responsible for over 70% of fine and coarse PM and approximately 30% of BC. Source 2 dominated CO and contributed about 70% of BC, NO, and NO2. Source 3 was specific to the larger PM fractions, from PM10 to PM40. Subsequent regression analyses showed that Sources 1 and 3 rose during visible bulldozer activity at a nearby open-air coal export terminal and under winds from the terminal direction, and a case study of a day with a known bulldozer incident at the terminal supported the same association. In contrast, Source 2 exhibited diurnal patterns consistent with local traffic emissions. Extreme episodes identified from Source 1 intensity affected the site nearest the coal terminal for approximately 33 minutes per day on average, with impacts attenuating at sites farther from the terminal.
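A figure such as "minutes per day" can be computed from a one-minute source-intensity series roughly as follows. This is a minimal sketch that assumes an extreme minute is simply one whose intensity exceeds a fixed threshold; the summary does not state how the study defines extreme episodes, and the function name, threshold, and data below are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def mean_extreme_minutes_per_day(timestamps, intensity, threshold):
    """Average count of one-minute records above threshold per observed day."""
    per_day = defaultdict(int)
    for t, value in zip(timestamps, intensity):
        # Adds 1 for an extreme minute, 0 otherwise; days with no extreme
        # minutes still get an entry and count toward the average.
        per_day[t.date()] += value > threshold
    return sum(per_day.values()) / len(per_day)

# Hypothetical two-day series: 20 extreme minutes on day 1, 40 on day 2.
start = datetime(2022, 10, 21)
timestamps = [start + timedelta(minutes=m) for m in range(2 * 1440)]
intensity = [0.0] * (2 * 1440)
for m in range(600, 620):
    intensity[m] = 5.0
for m in range(1440 + 100, 1440 + 140):
    intensity[m] = 5.0

avg = mean_extreme_minutes_per_day(timestamps, intensity, threshold=1.0)
```

For scale, 33 of the 1,440 minutes in a day is about 2.3% of the time.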
The paper concludes that the geometric NMF approach, when paired with high temporal resolution data from multi-pollutant sensor networks, provides a scalable, reliable, and interpretable method for pollution source apportionment. It offers scientific evidence supporting long-standing community concerns about fugitive dust from the coal terminal and can inform environmental regulation and public health mitigation strategies in Curtis Bay and in similar communities at the industrial-urban interface.