Modeling Non-Gaussianities in Pulsar Timing Array data analysis using Gaussian Mixture Models
In Pulsar Timing Array (PTA) data analysis, noise is typically assumed to be Gaussian, and the marginalized likelihood has a well-established analytical form derived within the framework of Gaussian processes. However, this Gaussianity assumption may break down for certain classes of astrophysical and cosmological signals, particularly for a gravitational wave background (GWB) generated by a population of supermassive black hole binaries (SMBHBs). In this work, we present a new method for testing the presence of non-Gaussian features in PTA data. We go beyond the Gaussian assumption by modeling the noise or signal statistics using a Gaussian mixture model (GMM). An advantage of this approach is that the marginalization of the likelihood remains fully analytical, expressed as a linear combination of Gaussian PTA likelihoods. This makes the method straightforward to implement within existing data analysis tools. Moreover, this method extends beyond the free spectrum analysis by producing posterior probability distributions of higher-order moments inferred from the data, which can be incorporated into spectral refitting techniques. We validate the model using simulations and demonstrate the sensitivity of PTAs to non-Gaussianity by computing the Bayes factor in favor of the GMM as a function of the injected excess moments. We apply the method to a more astrophysically motivated scenario where a single SMBHB is resolved on top of a Gaussian GWB and show that significant non-Gaussianities are introduced by the individual source. Finally, we test our model on a realistic GWB generated from a simulated population of SMBHBs.
💡 Research Summary
Pulsar Timing Arrays (PTAs) are emerging as a powerful tool for detecting nanohertz gravitational waves, with recent collaborations reporting a common-spectrum process consistent with a stochastic gravitational‑wave background (GWB) and the Hellings–Downs spatial correlations. Standard PTA analyses assume that both signal and noise are Gaussian random processes, which leads to a closed‑form marginalized likelihood expressed in terms of the Fourier coefficients of the timing residuals. However, this Gaussian assumption can break down when only a few bright supermassive black‑hole binaries (SMBHBs) dominate the signal or when deterministic continuous gravitational‑wave (CGW) sources are superimposed on the background. In such cases the distribution of the Fourier coefficients acquires heavy tails or asymmetries, and higher‑order moments (skewness, kurtosis) become relevant.
The authors propose to model the prior distribution of the Fourier coefficients with a Gaussian Mixture Model (GMM). A GMM is a weighted sum of Gaussian components, each characterized by a weight α_i, a scale factor c_i (relative variance), and a mean µ_i. By fixing the means to zero the mixture remains symmetric, and non‑Gaussianity is controlled solely by the mixture weight α and the variance scaling c. Because each component is Gaussian, the marginalized likelihood remains a linear combination of the standard Gaussian PTA likelihoods, preserving analytical tractability.
The paper derives explicit expressions for the central moments of a two‑component mixture (Table I) and shows that the even‑order moments follow a simple formula M_{2q} = (2q−1)!! Φ^q
Comments & Academic Discussion
Loading comments...
Leave a Comment