Evaluating Earth-Observing Satellite Sampling Effectiveness Using Kullback-Leibler Divergence

This work presents an objective, repeatable, automatic, and fast methodology for assessing the representativeness of geophysical variables sampled by Earth-observing satellites. The primary goal is to identify and mitigate potential sampling biases attributed to orbit selection during pre-Phase A mission studies. This methodology supports current incubation activities for a future Planetary Boundary Layer observing system by incorporating a sampling effectiveness measure into a broader architectural study. The study evaluates the effectiveness of 20 satellite configurations for observing convective storm activity in the Southwestern U.S. during the North American Monsoon (NAM) season. The primary design variables are the number of satellites, orbit type (sun-synchronous or inclined), and Local Time of Ascending Node (LTAN). Using Kullback-Leibler (KL) divergence to assess observational representativeness and Kernel Density Estimation (KDE) to estimate probability density functions, the study quantifies the discrepancy between observed and ground truth storm features. Results indicate that a two-satellite sun-synchronous system with an 8:00 PM LTAN achieved the lowest KL divergence, signifying the most representative observation of storm clusters. In contrast, single-satellite configurations, particularly those with late-night LTANs (e.g., 12:00 AM), demonstrated significantly higher KL divergence. The study concludes that dual-satellite configurations in sun-synchronous orbits with evening LTANs outperform single-satellite and inclined configurations in capturing representative convective storm activity.

Keywords: Earth-Observing Satellites; Sampling Effectiveness; Kullback-Leibler Divergence; Observational Representativeness; Monsoon

💡 Research Summary

This paper introduces an objective, repeatable, and fully automated methodology for quantifying how well Earth‑observing satellite constellations sample geophysical phenomena. The authors adopt the Kullback‑Leibler (KL) divergence as a scalar measure of “observational representativeness” and use Kernel Density Estimation (KDE) to construct continuous probability density functions (PDFs) for both the satellite‑derived observations and a high‑resolution ground‑truth reference (radar, gauge, and reanalysis data). By comparing these PDFs, the KL divergence directly quantifies the information loss incurred when a satellite’s sampling pattern deviates from the true distribution of storm features.
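The comparison described above can be sketched in a few lines of dependency-free Python. This is only an illustration under stated assumptions, not the authors' pipeline: the storm feature is reduced to a single hypothetical 1-D variable, the samples are synthetic Gaussian draws, the bandwidth is fixed at 0.5, the divergence is taken in the direction D_KL(truth || observed), and the integral is approximated with a midpoint rule on a fixed interval.

```python
import math
import random


def gaussian_kde(samples, bandwidth):
    """Return a function x -> density estimate built from Gaussian kernels."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def pdf(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return pdf


def kl_divergence(p_pdf, q_pdf, lo, hi, steps=400, eps=1e-12):
    """Approximate D_KL(P || Q) on [lo, hi] with the midpoint rule."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        p = p_pdf(x)
        q = max(q_pdf(x), eps)  # floor q to avoid division by zero
        if p > eps:             # regions where p is ~0 contribute nothing
            total += p * math.log(p / q) * dx
    return total


rng = random.Random(0)
# Hypothetical 1-D storm feature (arbitrary units); all values are synthetic.
truth = [rng.gauss(5.0, 1.5) for _ in range(300)]
well_sampled = [rng.gauss(5.0, 1.5) for _ in range(300)]   # same distribution
biased_sample = [rng.gauss(7.0, 1.0) for _ in range(300)]  # shifted distribution

p = gaussian_kde(truth, bandwidth=0.5)
kl_good = kl_divergence(p, gaussian_kde(well_sampled, 0.5), lo=-2.0, hi=13.0)
kl_bad = kl_divergence(p, gaussian_kde(biased_sample, 0.5), lo=-2.0, hi=13.0)
print(f"KL(truth || well-sampled) = {kl_good:.3f}")
print(f"KL(truth || biased)       = {kl_bad:.3f}")
```

The biased sample should score a clearly larger divergence than the well-matched one, which is the property the metric relies on to rank configurations.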

The case study focuses on convective storm activity during the North American Monsoon (NAM) season over the southwestern United States, a region where intense, short‑lived storms dominate the precipitation budget. Twenty satellite‑configuration scenarios are generated, spanning three design variables: (1) number of satellites (1, 2, or 3), (2) orbit type (sun‑synchronous versus inclined), and (3) Local Time of Ascending Node (LTAN) (6 PM, 8 PM, 10 PM, and 12 AM). For each scenario, a six‑month synthetic observation schedule is produced using a high‑fidelity orbital propagator, and the resulting storm‑cluster detections are fed into a KDE pipeline. Bandwidth selection for KDE is performed via cross‑validation to ensure that the estimated PDFs faithfully capture the spatial‑temporal variability of the storms.
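The summary says only that the KDE bandwidth is chosen by cross-validation; it does not name the criterion. One common choice, assumed here purely for illustration, is to maximize the leave-one-out log-likelihood over a grid of candidate bandwidths. The candidate grid and the synthetic samples below are hypothetical.

```python
import math
import random


def loo_log_likelihood(samples, bandwidth):
    """Mean leave-one-out log-likelihood of a Gaussian KDE with this bandwidth."""
    n = len(samples)
    norm = 1.0 / ((n - 1) * bandwidth * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i, x in enumerate(samples):
        # Density at x estimated from every sample except x itself.
        dens = norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
            for j, s in enumerate(samples)
            if j != i
        )
        total += math.log(max(dens, 1e-300))  # guard against log(0)
    return total / n


def select_bandwidth(samples, candidates):
    """Pick the candidate with the highest leave-one-out log-likelihood."""
    return max(candidates, key=lambda h: loo_log_likelihood(samples, h))


rng = random.Random(1)
samples = [rng.gauss(0.0, 1.0) for _ in range(200)]  # synthetic stand-in data
candidates = [0.05, 0.1, 0.2, 0.4, 0.8, 1.6, 3.2]    # hypothetical grid
best = select_bandwidth(samples, candidates)
print("selected bandwidth:", best)
```

Holding each point out of its own density estimate is what penalizes very small bandwidths; without it, the likelihood would always favor the narrowest kernel.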

The KL divergence is then computed for every scenario. Lower KL values indicate that the distribution of satellite observations lies closer to the ground‑truth storm distribution, i.e., the constellation provides a more representative sample; a value of zero would mean the two distributions are identical. To assess statistical robustness, the authors apply bootstrap resampling (1,000 replicates) and construct 95 % confidence intervals for each KL estimate. They also fit a multivariate regression model to explore interaction effects between the design variables.
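A percentile bootstrap for a KL estimate might look like the sketch below. Several choices here are assumptions rather than details from the paper: only the observed sample is resampled while the reference is held fixed, the interval is a simple percentile interval, and a smoothed histogram estimator stands in for KDE so that 1,000 replicates run quickly in plain Python. The function names and the synthetic data are hypothetical.

```python
import math
import random


def histogram_probs(samples, lo, hi, bins, alpha=0.5):
    """Binned probabilities with additive smoothing so no bin is exactly zero."""
    counts = [alpha] * bins
    width = (hi - lo) / bins
    for x in samples:
        if lo <= x < hi:
            counts[min(bins - 1, int((x - lo) / width))] += 1.0
    total = sum(counts)
    return [c / total for c in counts]


def kl_discrete(p, q):
    """D_KL(P || Q) for two discrete distributions on the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def bootstrap_ci(truth, observed, n_boot=1000, level=0.95, seed=0,
                 lo=0.0, hi=10.0, bins=20):
    """Percentile bootstrap interval for KL(truth || observed)."""
    rng = random.Random(seed)
    p = histogram_probs(truth, lo, hi, bins)
    estimates = []
    for _ in range(n_boot):
        resample = rng.choices(observed, k=len(observed))  # with replacement
        estimates.append(kl_discrete(p, histogram_probs(resample, lo, hi, bins)))
    estimates.sort()
    tail = (1.0 - level) / 2.0
    lower = estimates[int(tail * n_boot)]
    upper = estimates[min(n_boot - 1, int((1.0 - tail) * n_boot))]
    return lower, upper


rng = random.Random(42)
truth = [rng.gauss(5.0, 1.5) for _ in range(2000)]    # synthetic reference
observed = [rng.gauss(5.5, 1.5) for _ in range(300)]  # synthetic satellite sample
low, high = bootstrap_ci(truth, observed)
print(f"95% bootstrap interval for KL: [{low:.3f}, {high:.3f}]")
```

Two configurations whose intervals overlap heavily cannot be ranked with much confidence, which is the practical reason for reporting the interval alongside the point estimate.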

Key findings are as follows:

Satellite Count Matters – Adding a second satellite reduces the mean KL divergence by roughly 35 % relative to a single‑satellite configuration. The improvement stems from a shorter revisit time, which captures the rapid evolution of convective cells and reduces temporal sampling bias. Adding a third satellite yields diminishing returns, suggesting that two well‑placed platforms capture most of the relevant variability for this application.
Orbit Type – Sun‑synchronous orbits (SSO) consistently outperform inclined orbits. Because SSOs maintain a fixed local time, they avoid the diurnal sampling irregularities inherent to inclined trajectories, which scatter observations across a wide range of local times and thus increase the KL divergence.
LTAN Alignment – The LTAN that aligns with the peak of storm activity (8 PM) produces the lowest KL values across all satellite counts and orbit types. Configurations with a midnight LTAN (12 AM) exhibit the highest KL divergence, reflecting the fact that convective storms largely subside after the early evening peak.
Interaction Effects – The regression analysis reveals a significant interaction between satellite count and LTAN. As the number of satellites increases, the sensitivity of KL divergence to LTAN diminishes, indicating that a multi‑satellite constellation can partially compensate for sub‑optimal local‑time placement.
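The interaction finding can be made concrete with a small least-squares sketch. The model form is an assumption, since the summary does not give the regression specification: KL = b0 + b1·n + b2·d + b3·n·d, where n is the satellite count and d is a hypothetical "hours between LTAN and the storm peak" variable. The data are generated from made-up coefficients and are not values from the study; the point is only to show how a negative b3 makes the LTAN sensitivity shrink as n grows.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_interaction_model(n_sats, ltan_offset, kl):
    """Least-squares fit of kl = b0 + b1*n + b2*d + b3*n*d (normal equations)."""
    rows = [[1.0, n, d, n * d] for n, d in zip(n_sats, ltan_offset)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, kl)) for i in range(k)]
    return solve_linear(xtx, xty)


# Toy data generated from known coefficients (NOT values from the study).
true_b = [0.30, -0.08, 0.10, -0.03]
n_sats, ltan_offset, kl = [], [], []
for n in (1, 2, 3):
    for d in (0.0, 2.0, 4.0):  # hypothetical hours between LTAN and storm peak
        n_sats.append(float(n))
        ltan_offset.append(d)
        kl.append(true_b[0] + true_b[1] * n + true_b[2] * d + true_b[3] * n * d)

b0, b1, b2, b3 = fit_interaction_model(n_sats, ltan_offset, kl)
for n in (1, 2, 3):
    # Sensitivity of KL to the LTAN offset at a given satellite count.
    print(f"{n} satellite(s): d(KL)/d(offset) = {b2 + b3 * n:.3f}")
```

In this form the slope with respect to the LTAN offset is b2 + b3·n, so a negative interaction coefficient directly encodes the statement that more satellites reduce the penalty for a poorly placed LTAN.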

The authors acknowledge several limitations. The synthetic observations do not incorporate realistic sensor noise, data drop‑outs, or calibration errors, which could inflate the apparent performance of the best configurations. Moreover, the study is confined to a single geographic region and season; extrapolation to other climate regimes (e.g., mid‑latitude cyclones or tropical cyclones) requires further validation.

Future work will extend the methodology to other geophysical variables such as atmospheric composition, soil moisture, and sea‑surface temperature, and will test the framework against actual satellite datasets (e.g., GOES‑16, Himawari‑8) to close the loop between simulation and operational reality.

In conclusion, the KL‑divergence‑based sampling effectiveness metric provides a rigorous, quantitative tool for early‑phase satellite system design. For the NAM convective‑storm use case, a two‑satellite sun‑synchronous constellation with an 8 PM LTAN offers the most representative sampling, outperforming single‑satellite and inclined‑orbit alternatives. This insight can directly inform architecture trade‑studies, budget allocations, and mission‑planning decisions for upcoming planetary‑boundary‑layer observing systems.
