Replicating weak-lensing summary-statistic covariances with normalizing flows

We explore the ability of normalizing flow (NF) generative models to reproduce weak-lensing summary statistics when trained on a set of cosmological simulations. Our analysis focuses on how accurately NF models recover the mean, standard deviation, and covariance of key statistics derived from weak-lensing convergence ($\kappa$) maps: the angular power spectrum $C_{\ell}$, the one-point probability density function, and the Minkowski functionals. We test two training scenarios: (1) on the data vectors and (2) on the full $\kappa$-maps. In both cases, the NF models reproduce the mean and variance of the target statistics to percent-level accuracy. However, the off-diagonal elements of the covariance matrix are underestimated by up to $\sim 25\%$. We study several mitigation strategies and find that data augmentation and training with noisy fields improve covariance recovery to $\mathcal{O}(10\%)$. Our study demonstrates that while the means and variances of weak-lensing statistics can be well modeled by NFs, covariances can be significantly underestimated if mitigation strategies are not applied.


💡 Research Summary

This paper investigates the capability of normalizing flow (NF) generative models to faithfully reproduce the statistical properties of weak‑lensing convergence (κ) maps, focusing on three key summary statistics: the angular power spectrum (Cℓ), the one‑point probability density function (PDF), and the Minkowski functionals (MFs). Using the publicly available SLICS suite of N‑body simulations, the authors train NF models on 954 independent 10 deg² κ‑maps. Two training regimes are explored: (1) direct learning on the low‑dimensional data vectors (the summary statistics themselves) and (2) learning on the full 256 × 256 pixel κ‑maps. Both regimes employ relatively compact NF architectures, neural spline flows (NSF) and multiscale flows (MS), with 3–4 × 10⁵ trainable parameters, optimized with a Kullback‑Leibler divergence loss; because the flows are invertible with tractable Jacobians, this is equivalent to maximizing the exact likelihood of the training data.
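As a concrete illustration of the first training regime (learning directly on the data vectors), the sketch below fits a small neural spline flow to an ensemble of summary‑statistic vectors by minimizing the negative log‑likelihood, i.e. the forward Kullback‑Leibler objective. This is a minimal sketch, not the authors' exact configuration: it assumes the `nflows` library, and the layer count, hidden widths, and other hyperparameters are illustrative choices only.

```python
import torch
from torch.optim import Adam
from nflows.flows.base import Flow
from nflows.distributions.normal import StandardNormal
from nflows.transforms.base import CompositeTransform
from nflows.transforms.permutations import ReversePermutation
from nflows.transforms.autoregressive import (
    MaskedPiecewiseRationalQuadraticAutoregressiveTransform,
)

def build_nsf(dim, num_layers=4, hidden_features=64, num_bins=8):
    """Small neural spline flow over a `dim`-dimensional data vector."""
    transforms = []
    for _ in range(num_layers):
        transforms.append(ReversePermutation(features=dim))
        transforms.append(
            MaskedPiecewiseRationalQuadraticAutoregressiveTransform(
                features=dim,
                hidden_features=hidden_features,
                num_bins=num_bins,
                tails="linear",
                tail_bound=4.0,
            )
        )
    return Flow(CompositeTransform(transforms), StandardNormal(shape=[dim]))

def train(flow, x_train, epochs=500, batch_size=64, lr=1e-3):
    """Fit the flow to standardized data vectors of shape (n_sims, dim),
    e.g. binned C_ell, PDF, and Minkowski-functional measurements per kappa-map."""
    opt = Adam(flow.parameters(), lr=lr)
    data = torch.as_tensor(x_train, dtype=torch.float32)
    for _ in range(epochs):
        perm = torch.randperm(len(data))
        for i in range(0, len(data), batch_size):
            batch = data[perm[i:i + batch_size]]
            loss = -flow.log_prob(batch).mean()  # exact NLL via invertibility
            opt.zero_grad()
            loss.backward()
            opt.step()
    return flow
```

Once trained, `flow.sample(n)` yields new synthetic data vectors whose sample mean, variance, and covariance can be compared against the simulation ensemble.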

The results show that NF models can recover the means and standard deviations of all three statistics to within about 1 % of the ground‑truth values, demonstrating that the flows capture the non‑Gaussian features of the convergence field effectively. However, the off‑diagonal elements of the covariance matrices are systematically underestimated, with errors reaching up to ~25 %. This shortfall is attributed to the difficulty of learning high‑dimensional correlation structures from a limited number of training samples.
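The covariance comparison itself is straightforward once ensembles of data vectors are available from both the simulations and the trained flow. The helper below (a hypothetical name, plain NumPy) contrasts the variances and the correlation structure of the two ensembles, which is one way to expose the off‑diagonal underestimation described above.

```python
import numpy as np

def compare_covariances(x_sim, x_nf):
    """Compare sample covariances of simulated vs. NF-generated data vectors.

    x_sim, x_nf : arrays of shape (n_samples, n_bins), e.g. concatenated
    C_ell, PDF, and Minkowski-functional measurements per realization.
    """
    cov_sim = np.cov(x_sim, rowvar=False)
    cov_nf = np.cov(x_nf, rowvar=False)

    # Relative error on the variances (diagonal elements).
    var_err = np.diag(cov_nf) / np.diag(cov_sim) - 1.0

    # Compare correlation matrices so that variance mismatches do not
    # leak into the off-diagonal comparison.
    corr_sim = cov_sim / np.sqrt(np.outer(np.diag(cov_sim), np.diag(cov_sim)))
    corr_nf = cov_nf / np.sqrt(np.outer(np.diag(cov_nf), np.diag(cov_nf)))
    off = ~np.eye(cov_sim.shape[0], dtype=bool)
    corr_diff = corr_nf[off] - corr_sim[off]

    return var_err, corr_diff
```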

To mitigate this issue, the authors test two strategies: (i) data augmentation through rotations, flips, and scalings, which increases the effective size of the training set, and (ii) adding Gaussian noise to the κ‑maps during training, thereby exposing the model to a broader distribution of field realizations. Both approaches improve the fidelity of the recovered covariances, reducing the error on off‑diagonal terms to roughly 10 %.
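Both mitigation strategies can be implemented as simple preprocessing of the training maps. The sketch below shows rotations and flips plus Gaussian pixel noise; it is illustrative only, and the noise level (and the omission of the rescaling augmentation) are assumptions, not the values used in the paper.

```python
import numpy as np

def augment_kappa_maps(maps, noise_sigma=0.02, rng=None):
    """Expand a set of kappa-maps with symmetry transforms and noisy copies.

    maps : array of shape (n_maps, npix, npix), e.g. (954, 256, 256).
    noise_sigma : std of the Gaussian pixel noise added during training
                  (illustrative value).
    """
    if rng is None:
        rng = np.random.default_rng()

    augmented = []
    for m in maps:
        for k in range(4):                    # 0/90/180/270 degree rotations
            rot = np.rot90(m, k)
            augmented.append(rot)
            augmented.append(np.fliplr(rot))  # plus mirror images
    augmented = np.array(augmented)

    # Noisy-field training: expose the flow to perturbed realizations.
    return augmented + rng.normal(0.0, noise_sigma, size=augmented.shape)
```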

A comparative analysis of the two NF architectures reveals that adding more multiscale layers (increasing the parameter count from ~518 k to ~665 k) yields only marginal gains, indicating that even relatively lightweight flows are sufficient for accurate mean and variance reconstruction. The study also highlights that while NFs provide an exact, invertible mapping and avoid the mode collapse typical of GANs, their ability to capture subtle inter‑statistic correlations is limited without additional training tricks.

In conclusion, normalizing flows are promising tools for generating large ensembles of weak‑lensing maps at negligible computational cost, accurately reproducing the means and variances of the summary statistics. Nevertheless, careful treatment, such as data augmentation and noisy‑field training, is essential to obtain reliable covariance estimates, which are crucial for cosmological parameter inference in upcoming surveys like LSST, Euclid, and the Roman Space Telescope. Future work may extend this framework to conditional flows that directly emulate the dependence of summary‑statistic covariances on cosmological parameters, further enhancing the utility of NFs in precision cosmology.

