Towards an optimal extraction of cosmological parameters from galaxy cluster surveys using convolutional neural networks

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the [Original Paper Viewer] below or the original arXiv source.

The possibility of constraining cosmological parameters from galaxy surveys using field-level machine learning methods that bypass traditional summary-statistics analyses depends crucially on our ability to generate simulated training sets. These need to be both realistic, so as to reproduce the key features of the real data, and produced in large numbers, so as to refine the precision of the training process. The analysis presented in this paper responds to these needs by (a) using clusters of galaxies as tracers of large-scale structure, and (b) adopting a 3LPT code (Pinocchio) to generate a large training set of $32,768$ mock X-ray cluster catalogues. X-ray luminosities are stochastically assigned to dark matter haloes using an empirical $M-L_X$ scaling relation. Using this training set, we test the ability and performance of a 3D convolutional neural network (CNN) to predict the cosmological parameters from an input overdensity field derived from the cluster distribution. We compare against a neural network trained on traditional summary statistics, namely the abundance of clusters and their power spectrum. Our results show that the field-level analysis combined with the cluster abundance yields a mean absolute relative error on the predicted values of $Ω_{\rm m}$ and $σ_8$ that is a factor of $\sim 10\%$ and $\sim 20\%$ better than that obtained from the summary statistics. Furthermore, when information about the individual luminosity of each cluster is passed to the CNN, the gain in precision exceeds $50\%$.


💡 Research Summary

This paper addresses the challenge of extracting cosmological parameters from galaxy‑cluster surveys by combining fast approximate simulations with deep learning. The authors generate a large training set of 32,768 mock X‑ray cluster catalogues using the Pinocchio code, which implements third‑order Lagrangian perturbation theory (3LPT) to produce dark‑matter halo fields at a fraction of the computational cost of full N‑body runs. They sample the five‑dimensional ΛCDM parameter space (Ω_m, σ₈, h, n_s, Ω_b) with a Sobol sequence, creating 4,096 distinct cosmologies; each cosmology is realized with a different random seed, and stochastic X‑ray luminosities are assigned to halos via an empirical mass‑luminosity (M‑L_X) scaling relation. The resulting catalogues reproduce the observed REFLEX‑II X‑ray luminosity function, confirming their realism.
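The Sobol-sequence sampling of the five-dimensional parameter space and the stochastic luminosity assignment described above can be sketched as follows. This is an illustrative sketch only: the parameter bounds, the power-law slope, normalization, and scatter of the M‑L_X relation are placeholder values, not those used in the paper.

```python
import numpy as np
from scipy.stats import qmc

# Illustrative prior ranges for (Omega_m, sigma_8, h, n_s, Omega_b);
# the paper's actual bounds may differ.
l_bounds = [0.1, 0.6, 0.5, 0.8, 0.03]
u_bounds = [0.5, 1.0, 0.9, 1.2, 0.07]

# Sobol sequence: 4096 quasi-random points filling [0,1)^5, then rescaled.
sampler = qmc.Sobol(d=5, scramble=True, seed=42)
unit_sample = sampler.random(n=4096)               # 4096 is a power of 2
cosmologies = qmc.scale(unit_sample, l_bounds, u_bounds)

def assign_luminosity(mass, alpha=1.6, logL0=44.0, scatter_dex=0.25, rng=None):
    """Stochastic X-ray luminosity from a power-law M-L_X relation with
    log-normal scatter (all parameter values here are hypothetical)."""
    rng = rng or np.random.default_rng()
    logL = logL0 + alpha * np.log10(np.asarray(mass) / 1e14)
    return 10.0 ** (logL + rng.normal(0.0, scatter_dex, size=np.shape(mass)))
```

Each of the 4,096 rows of `cosmologies` would then seed one (or more) Pinocchio runs, with `assign_luminosity` applied to the resulting halo masses.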

Two machine‑learning approaches are compared. The first is a three‑dimensional convolutional neural network (CNN) that ingests the overdensity field derived from the spatial distribution of clusters. The network consists of multiple 3D convolutional and pooling layers followed by fully‑connected layers that jointly regress Ω_m and σ₈. A second version of the CNN includes an additional input channel containing the average X‑ray luminosity per voxel, allowing the network to learn directly from the M‑L_X scatter. The benchmark model is a fully‑connected neural network that receives compressed summary statistics: the total number of clusters (abundance) and the power spectrum P(k).
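The CNN's input described above is a gridded overdensity field built from the cluster positions. A minimal sketch of that preprocessing step, assuming a simple nearest-grid-point assignment (the paper's gridding scheme and voxel resolution may differ):

```python
import numpy as np

def overdensity_field(positions, box_size, n_grid=64):
    """Assign cluster positions to a 3D mesh with nearest-grid-point
    counting and return the overdensity delta = n/<n> - 1 that the
    3D CNN ingests. Illustrative sketch, not the paper's exact pipeline."""
    edges = [np.linspace(0.0, box_size, n_grid + 1)] * 3
    counts, _ = np.histogramdd(positions, bins=edges)
    return counts / counts.mean() - 1.0

# Mock catalogue: uniform random cluster positions in a 500 Mpc/h box.
rng = np.random.default_rng(0)
pos = rng.uniform(0.0, 500.0, size=(10_000, 3))
delta = overdensity_field(pos, box_size=500.0)   # shape (64, 64, 64)
```

The luminosity-augmented variant of the network would stack a second channel on this grid, e.g. the mean X-ray luminosity of the clusters in each voxel, giving an input tensor of shape `(2, 64, 64, 64)`.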

Training uses an 80/10/10 split for training, validation, and testing, with mean‑squared error loss optimized by Adam. Hyper‑parameters (number of filters, depth, batch size) are tuned via Bayesian optimization. On the held‑out test set, the field‑level CNN achieves mean absolute relative errors (MARE) of ≈1.8 % for Ω_m and ≈2.5 % for σ₈, improving upon the summary‑statistics network by roughly 10 % and 20 % respectively. When the luminosity channel is added, the errors drop further to ≈0.9 % (Ω_m) and ≈1.2 % (σ₈), a gain exceeding 50 % relative to the baseline CNN. Feature‑map visualizations indicate that the CNN captures higher‑order spatial correlations beyond what is encoded in the power spectrum.
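The mean absolute relative error quoted above is straightforward to compute; a short reference implementation:

```python
import numpy as np

def mare(y_true, y_pred):
    """Mean absolute relative error: mean of |pred - true| / |true|,
    the evaluation metric reported in the paper."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_pred - y_true) / np.abs(y_true))

# Toy check: predictions that are uniformly 2% high give MARE = 0.02.
truth = np.array([0.3, 0.8])   # e.g. (Omega_m, sigma_8)
pred = truth * 1.02
err = mare(truth, pred)        # ~0.02
```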

The authors discuss limitations: 3LPT does not fully resolve non‑linear small‑scale dynamics, so the current gains stem mainly from large‑scale clustering information. Real surveys will introduce selection functions, measurement noise, and other systematics that are not yet modeled. Future work will incorporate hydrodynamical effects, explore graph neural networks to exploit explicit cluster‑cluster connections, and develop a fully Bayesian inference pipeline that outputs posterior distributions rather than point estimates.

Overall, the study demonstrates that (1) fast approximate simulations can produce sufficiently realistic, large‑scale training data for cosmological machine learning, and (2) a field‑level CNN can extract more information from the same data than traditional summary statistics, especially when individual cluster luminosities are supplied. This approach promises substantial improvements for upcoming all‑sky X‑ray and SZ surveys such as eROSITA and Athena, potentially delivering tighter constraints on Ω_m, σ₈, and other cosmological parameters than conventional analyses.

