From Redshift to Real Space: Combining Linear Theory With Neural Networks
Spectroscopic redshift surveys are key tools to trace the large-scale structure (LSS) of the Universe and test the ΛCDM model. However, using redshifts as distance proxies introduces distortions in the 3D galaxy distribution. If uncorrected, these distortions lead to systematic errors in LSS analyses and cosmological parameter estimation. We present a new method that combines linear theory (LT) and a neural network (NN) to mitigate redshift space distortions (RSDs). The hybrid LT+NN approach is trained and validated on dark matter halo fields from z = 1 snapshots of the Quijote N-body simulations. LT corrects large-scale distortions in the linear regime, while the NN learns quasi-linear and small-scale features. The LT correction is applied first, then the NN is trained on the resulting fields to improve accuracy across scales. The method uses a mean squared error (MSE) loss and yields significant performance gains: approximately 50% improvement over LT alone and 12% over NN alone. The reconstructed fields from the LT+NN method show stronger correlations with the true real-space fields than either LT or NN separately. The hybrid method also improves clustering statistics such as halo-halo and halo-void correlations, with benefits extending to BAO scales. Compared to NN-only, it provides better suppression of spurious anisotropies on large and quasi-linear scales, as measured by the quadrupole moments of correlation functions. This work shows that combining a physically motivated dynamical model with a machine learning algorithm leverages the strengths of both approaches. The LT+NN method achieves high accuracy with modest training data and computational cost, making it a promising tool for future applications to more realistic galaxy surveys.
💡 Research Summary
This paper introduces a hybrid reconstruction technique that combines linear perturbation theory (LT) with a three-dimensional convolutional neural network (CNN) to mitigate redshift-space distortions (RSD) in spectroscopic galaxy surveys. The authors train and validate the method on dark-matter halo fields extracted from the Quijote N-body simulations at redshift z = 1. The workflow proceeds in two stages. First, the observed redshift-space overdensity field is smoothed with a Gaussian kernel of radius R_s = 10 h⁻¹ Mpc, and a linear-theory algorithm solves the continuity equation for the smoothed field in Fourier space. This yields a displacement field Ψ, which is used to shift halos back toward their real-space positions, correcting the large-scale coherent flows responsible for the Kaiser effect. The linear growth rate f and halo bias b required by this step are estimated from the simulations (f ≈ 0.88; b from the low-k halo-matter power-spectrum ratio).
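The first stage can be illustrated with a minimal NumPy sketch. It is not the authors' code: it assumes a plane-parallel line of sight along the last grid axis, a periodic cubic box, and the commonly used linear reconstruction equation ∇·Ψ + β ∇·[(Ψ·ẑ)ẑ] = −δ_s/b with β = f/b. The function name `lt_displacement`, the box size of 1000 h⁻¹ Mpc, and the bias value b = 2 are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def lt_displacement(delta_s, box_size, f, b, r_smooth):
    """Linear-theory displacement field from a redshift-space overdensity grid.

    Assumes a periodic cube with the line of sight along the last axis.
    Returns an array of shape (3, n, n, n) holding (Psi_x, Psi_y, Psi_z).
    """
    n = delta_s.shape[0]
    cell = box_size / n
    k_full = 2.0 * np.pi * np.fft.fftfreq(n, d=cell)
    k_half = 2.0 * np.pi * np.fft.rfftfreq(n, d=cell)  # last axis of rfftn
    kx, ky, kz = np.meshgrid(k_full, k_full, k_half, indexing="ij")
    k2 = kx**2 + ky**2 + kz**2
    k2[0, 0, 0] = 1.0  # placeholder to avoid 0/0; the k = 0 mode is zeroed below
    mu2 = kz**2 / k2
    beta = f / b

    # Gaussian smoothing applied in Fourier space
    delta_k = np.fft.rfftn(delta_s) * np.exp(-0.5 * k2 * r_smooth**2)

    # Psi = grad(phi), so the reconstruction equation becomes
    # -k^2 (1 + beta mu^2) phi = -delta_s / b
    phi_k = delta_k / (b * k2 * (1.0 + beta * mu2))
    phi_k[0, 0, 0] = 0.0

    psi = [np.fft.irfftn(1j * ki * phi_k, s=delta_s.shape) for ki in (kx, ky, kz)]
    return np.stack(psi)

# Toy usage on a random field (illustrative values only)
rng = np.random.default_rng(0)
delta_s = 0.1 * rng.standard_normal((32, 32, 32))
psi = lt_displacement(delta_s, box_size=1000.0, f=0.88, b=2.0, r_smooth=10.0)
# In linear theory s = x + f * Psi_z along the line of sight,
# so the RSD correction shifts each tracer by -f * Psi_z.
rsd_shift_z = -0.88 * psi[2]
```

In practice the shift would be interpolated from the grid to each halo position before moving the halos; that step is omitted here.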
Second, the LT‑corrected density field serves as input to a U‑Net‑style autoencoder. The encoder consists of five convolution‑pooling blocks; each block contains two 3D convolutional layers with ReLU activation followed by a 2×2×2 max‑pooling layer. The number of filters grows from 16 to 256, allowing the network to capture increasingly abstract features. The decoder mirrors the encoder, using transposed convolutions and skip connections to reconstruct the field at the original resolution. Training minimizes a mean‑squared‑error (MSE) loss between the network output and the true real‑space halo density field. The network therefore learns to model quasi‑linear and non‑linear small‑scale features that LT alone cannot capture.
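To make the encoder geometry concrete, the short sketch below traces the feature-map size through the five blocks. It assumes a 128³ input grid (the resolution quoted among the limitations), convolutions that preserve the spatial size, and a filter count that doubles at each block (16, 32, 64, 128, 256). The summary states only the endpoints 16 and 256, so the doubling is an assumption, and `encoder_shapes` is a hypothetical helper.

```python
def encoder_shapes(grid=128, first_filters=16, n_blocks=5):
    """Return (channels, side length) of the feature map after each encoder block."""
    shapes = []
    side, filters = grid, first_filters
    for _ in range(n_blocks):
        # Two size-preserving 3D convolutions set the channel count,
        # then 2x2x2 max-pooling halves every spatial axis.
        side //= 2
        shapes.append((filters, side))
        filters *= 2  # assumed doubling between blocks
    return shapes

for channels, side in encoder_shapes():
    print(f"{channels:4d} channels on a {side}^3 grid")
```

Under these assumptions the bottleneck is a 4³ grid with 256 channels, and the decoder reverses the sequence back to 128³.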
Performance is evaluated using several metrics. Compared with LT alone, the hybrid LT+NN reduces the MSE by roughly 50%; compared with a NN trained directly on redshift-space data, it improves the MSE by about 12%. Correlation coefficients between reconstructed and true fields increase from ~0.92 (LT) and ~0.94 (NN) to ~0.96 for the hybrid. Two-point statistics, namely the real-space halo-halo correlation function ξ(r) and power spectrum P(k), show that the hybrid method recovers the baryon acoustic oscillation (BAO) peak position and amplitude as accurately as LT on large scales while preserving small-scale power better than LT. The quadrupole moment of the correlation function, a sensitive probe of anisotropic RSD, is significantly suppressed on both large (≳30 h⁻¹ Mpc) and quasi-linear (10–30 h⁻¹ Mpc) scales, indicating that the hybrid approach avoids the spurious anisotropies sometimes introduced by a pure NN.
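The field-level metrics can be sketched as follows. This is a minimal illustration, assuming the correlation coefficient is the Pearson coefficient computed over grid cells (the summary does not specify the estimator) and that the quoted percentages are relative MSE reductions. The function names and the toy data are hypothetical.

```python
import numpy as np

def mse(recon, truth):
    """Mean squared error between two density grids."""
    return float(np.mean((recon - truth) ** 2))

def correlation(recon, truth):
    """Pearson correlation coefficient computed over all grid cells."""
    a = recon.ravel() - recon.mean()
    b = truth.ravel() - truth.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

def relative_gain(mse_ref, mse_new):
    """Fractional MSE reduction of a new method relative to a reference."""
    return (mse_ref - mse_new) / mse_ref

# Toy data: a "true" field and a noisy reconstruction of it
rng = np.random.default_rng(1)
truth = rng.standard_normal((16, 16, 16))
recon = truth + 0.3 * rng.standard_normal(truth.shape)
print(mse(recon, truth), correlation(recon, truth))
```

With this definition, a relative gain of 0.5 corresponds to the roughly 50% improvement over LT quoted above.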
Additional analyses explore the dependence on the smoothing radius R_s, confirming that R_s = 10 h⁻¹ Mpc yields optimal performance for the statistics considered. Halo-void cross-correlations are also examined; the hybrid reconstruction reproduces the undistorted void density profiles more faithfully than either LT or NN alone.
The authors acknowledge limitations: the study assumes the same ΛCDM cosmology for training and reconstruction, so potential biases from an incorrect cosmological model are not quantified. Accurate knowledge of halo bias and growth rate is required for the LT step, which may be non-trivial for real galaxy samples. The grid resolution (128³ cells, ~7.8 h⁻¹ Mpc per cell) limits the ability to resolve very small scales (< 5 h⁻¹ Mpc), where shot noise dominates. Nevertheless, the method achieves high accuracy with a small training set (100 simulations) and modest computational cost, making it attractive for upcoming large surveys such as DESI, Euclid, and the Roman Space Telescope.
In conclusion, the paper demonstrates that integrating a physically motivated linear reconstruction with a data‑driven neural network leverages the strengths of both: LT provides a robust large‑scale correction grounded in theory, while the NN captures complex small‑scale dynamics. This synergy yields a more accurate, less anisotropic reconstruction of the real‑space density field than either component alone, paving the way for improved cosmological analyses that rely on precise RSD mitigation.