On the inadequacy of N-point correlation functions to describe nonlinear cosmological fields: explicit examples and connection to simulations

Motivated by recent results on lognormal statistics showing that the moment hierarchy of a lognormal variable completely fails to capture its information content in the large-variance regime, we discuss in this work the inadequacy of the hierarchy of correlation functions to describe a correlated lognormal field, which provides a roughly accurate description of the non-linear cosmological matter density field. We present families of fields having the same hierarchy of correlation functions as the lognormal field at all orders. This explicitly demonstrates the little-studied though known fact that the correlation function hierarchy never provides a complete description of a lognormal field, and that it fails to capture information in the non-linear regime, where other simple observables are left totally unconstrained. We discuss why perturbative, Edgeworth-like approaches to statistics in the non-linear regime, common in cosmology, can never reproduce or predict that effect, and why it is nevertheless generic for tailed fields, hinting at a breakdown of the perturbation theory based on the field fluctuations. We make a rough but successful quantitative connection to N-body simulation results, which showed that the spectrum of the log-density field carries more information than the spectrum of the field entering the non-linear regime.

💡 Research Summary

The paper addresses a fundamental limitation of using the hierarchy of n‑point correlation functions to describe the nonlinear matter density field in cosmology. It focuses on the lognormal model, which has long been employed as an approximate statistical description of the evolved density field because the logarithm of the density, δ → ln(1+δ), appears nearly Gaussian in the highly nonlinear regime. The authors demonstrate that, despite its popularity, the lognormal field cannot be fully characterized by its entire set of correlation functions.

First, they construct explicit families of random fields that share exactly the same n‑point correlation functions as a lognormal field at every order, yet possess different probability‑density functions (PDFs). Two construction methods are presented. The “moment‑equivalent transformation” adds a carefully chosen non‑Gaussian perturbation to the log‑density field that leaves the moments unchanged at every order while altering the PDF itself. The “tail‑modification” approach reshapes the extreme‑value tail of the PDF (e.g., by raising the density to a power or applying an exponential cutoff) while preserving the correlation hierarchy. Both families illustrate that the correlation hierarchy is not a complete statistical descriptor for fields with heavy tails.
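The idea that different PDFs can share every moment can be checked numerically in the simplest one‑point setting. The sketch below uses the textbook perturbation usually attributed to Heyde, in which the Gaussian PDF of Y = ln X is multiplied by 1 + ε sin(2πy); this is an illustrative stand‑in, not necessarily the construction used in the paper. The function names, the value ε = 0.5, and the integration grid are assumptions made for the example.

```python
import numpy as np

def log_density_pdf(y, eps):
    """PDF of Y = ln X; eps = 0 is the standard normal, so X is lognormal."""
    gauss = np.exp(-0.5 * y**2) / np.sqrt(2.0 * np.pi)
    # |eps| <= 1 keeps the perturbed density non-negative.
    return gauss * (1.0 + eps * np.sin(2.0 * np.pi * y))

def moment(n, eps, y_max=25.0, dy=1e-3):
    """E[X^n] = E[exp(n Y)], evaluated by a plain Riemann sum over y."""
    y = np.arange(-y_max, y_max, dy)
    return np.sum(np.exp(n * y) * log_density_pdf(y, eps)) * dy

orders = list(range(6))
lognormal_moments = np.array([moment(n, 0.0) for n in orders])
perturbed_moments = np.array([moment(n, 0.5) for n in orders])
# Closed form for a lognormal with mu = 0, sigma = 1: E[X^n] = exp(n^2 / 2).
exact_moments = np.exp(0.5 * np.array(orders) ** 2)

# Largest pointwise difference between the two log-space PDFs.
y_grid = np.linspace(-3.0, 3.0, 6001)
max_pdf_gap = np.max(
    np.abs(log_density_pdf(y_grid, 0.5) - log_density_pdf(y_grid, 0.0))
)

for n, a, b in zip(orders, lognormal_moments, perturbed_moments):
    print(f"n={n}: lognormal={a:.6e}  perturbed={b:.6e}")
print(f"max PDF difference: {max_pdf_gap:.3f}")
```

The moments agree because shifting the Gaussian by an integer n leaves sin(2πy) unchanged, so the perturbation integrates to zero against every integer power of X, while the PDFs themselves differ visibly.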

Second, the authors explain why perturbative approaches based on Edgeworth‑type expansions inevitably miss this effect. Edgeworth expansions assume small fluctuations around a Gaussian baseline and rely on a well‑behaved series of cumulants. For lognormal‑like fields the variance is large and the tails are heavy, so the high‑order cumulants, although finite, grow so rapidly that they dominate the series. Consequently, any finite‑order Edgeworth series provides a poor approximation of the true PDF and cannot capture the information residing in the tail. This argument is generalized to suggest that any perturbation theory built on field fluctuations will break down for strongly non‑Gaussian, tailed distributions.
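How quickly the cumulants of a lognormal variable leave the "small correction" regime can be seen from its standardized third and fourth cumulants. The following sketch, a one‑point illustration rather than the paper's field‑level argument, builds them from the raw moments E[X^n] = exp(n²σ²/2) (taking μ = 0) and compares them with the standard closed forms for lognormal skewness and excess kurtosis. The function names and the sample values of σ are assumptions made for the example.

```python
import numpy as np

def raw_moment(n, sigma):
    """E[X^n] for ln X ~ N(0, sigma^2)."""
    return np.exp(0.5 * (n * sigma) ** 2)

def standardized_cumulants(sigma):
    """Skewness and excess kurtosis built from raw moments."""
    m1, m2, m3, m4 = (raw_moment(n, sigma) for n in (1, 2, 3, 4))
    k2 = m2 - m1**2
    k3 = m3 - 3.0 * m2 * m1 + 2.0 * m1**3
    k4 = m4 - 4.0 * m3 * m1 - 3.0 * m2**2 + 12.0 * m2 * m1**2 - 6.0 * m1**4
    return k3 / k2**1.5, k4 / k2**2

def closed_form(sigma):
    """Standard closed forms for the lognormal distribution."""
    w = np.exp(sigma**2)
    skew = (w + 2.0) * np.sqrt(w - 1.0)
    excess_kurt = w**4 + 2.0 * w**3 + 3.0 * w**2 - 6.0
    return skew, excess_kurt

sigmas = [0.1, 0.5, 1.0, 2.0]
results = [standardized_cumulants(s) for s in sigmas]
for s, (skew, kurt) in zip(sigmas, results):
    print(f"sigma={s}: skewness={skew:.4g}, excess kurtosis={kurt:.4g}")
```

For small σ both quantities are small, which is the regime an Edgeworth series is designed for; by σ of order unity they are already far larger than one, so each added cumulant term is bigger than the last.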

Third, the paper connects these theoretical insights to results from N‑body simulations. By measuring both the original density field ρ and its log‑transformed counterpart ϕ = ln ρ, the authors compare the Fisher information contained in their power spectra. They find that the log‑density power spectrum carries substantially more information about cosmological parameters than the power spectrum of the raw density field, confirming earlier simulation studies. However, higher‑order statistics such as the bispectrum, trispectrum, and kurtosis differ markedly between the two fields, reflecting the existence of alternative fields with identical lower‑order correlations but distinct higher‑order structure.
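A one‑variable toy calculation gives a feel for why a statistic of the log field can retain more information than the same statistic of the raw field. The sketch below is not the paper's power‑spectrum analysis: it assumes a single lognormal variable with ln X ~ N(μ, σ²), σ known, and asks how much Fisher information on μ the sample mean captures, using the usual estimate (∂⟨t⟩/∂μ)² / Var(t) for a statistic t. The mean of ln X recovers the full information 1/σ², whereas the mean of X recovers only 1/(e^{σ²} − 1). The function names and the sample variances are assumptions made for the example.

```python
import numpy as np

def info_total(sigma2):
    """Fisher information on mu per sample; equals that of the mean of ln X."""
    return 1.0 / sigma2

def info_raw_mean(sigma2):
    """Information on mu captured by the mean of X alone.

    d<X>/dmu = <X> and Var(X) = <X>^2 (exp(sigma^2) - 1),
    so the ratio (d<X>/dmu)^2 / Var(X) is 1 / (exp(sigma^2) - 1).
    """
    return 1.0 / np.expm1(sigma2)

def captured_fraction(sigma2):
    """Share of the total information retained by the mean of X."""
    return info_raw_mean(sigma2) / info_total(sigma2)

variances = [1e-4, 0.25, 1.0, 4.0]
fractions = [captured_fraction(v) for v in variances]
for v, f in zip(variances, fractions):
    print(f"sigma^2={v}: fraction captured by mean of X = {f:.4f}")
```

In the small‑variance limit the two statistics are equivalent, and the fraction falls off exponentially once σ² exceeds unity, which mirrors qualitatively the behaviour described for the two power spectra.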

The conclusions are threefold. (1) Relying solely on the n‑point correlation hierarchy in the nonlinear regime leads to a severe under‑estimation of the available information, because many distinct fields can share the same hierarchy. (2) Standard perturbative statistical tools (Edgeworth, Gram‑Charlier, etc.) are intrinsically incapable of capturing the missing information for heavy‑tailed fields, highlighting a conceptual limitation of perturbation theory in this context. (3) While the log‑density transformation does improve information recovery in practice, its success stems from a more favorable redistribution of information across scales, not from a complete statistical description.

The paper therefore calls for new statistical frameworks that go beyond correlation functions—such as direct PDF modeling, information‑theoretic measures, or non‑parametric Bayesian approaches—to fully exploit the rich information content of the nonlinear cosmological density field.