Extreme deconvolution: Inferring complete distribution functions from noisy, heterogeneous and incomplete observations
We generalize the well-known mixtures of Gaussians approach to density estimation and the accompanying Expectation–Maximization technique for finding the maximum likelihood parameters of the mixture to the case where each data point carries an individual $d$-dimensional uncertainty covariance and has unique missing data properties. This algorithm reconstructs the error-deconvolved or “underlying” distribution function common to all samples, even when the individual data points are samples from different distributions, obtained by convolving the underlying distribution with the heteroskedastic uncertainty distribution of the data point and projecting out the missing data directions. We show how this basic algorithm can be extended with conjugate priors on all of the model parameters and a “split-and-merge” procedure designed to avoid local maxima of the likelihood. We demonstrate the full method by applying it to the problem of inferring the three-dimensional velocity distribution of stars near the Sun from noisy two-dimensional, transverse velocity measurements from the Hipparcos satellite.
💡 Research Summary
The paper introduces “Extreme Deconvolution” (ED), an extension of the classic Gaussian‑Mixture‑Model (GMM) density‑estimation framework that can handle data points with individually varying measurement uncertainties and arbitrary patterns of missing dimensions. In the standard GMM, all observations are assumed to be drawn directly from a mixture of multivariate Gaussians with a common covariance structure; consequently, the method fails when each datum is a noisy, projected version of the underlying latent variable. ED models each observation x_i as a linear projection M_i of a latent vector z_i plus additive Gaussian noise ε_i with its own covariance Σ_i: x_i = M_i z_i + ε_i, where ε_i ∼ N(0, Σ_i). The latent variable itself follows a K‑component mixture p(z) = ∑_k α_k N(μ_k, Λ_k).
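The generative model above can be made concrete with a short sketch: sampling a datum x_i = M_i z_i + ε_i, and evaluating its marginal likelihood, which is again a Gaussian mixture because a linear map of a Gaussian plus Gaussian noise stays Gaussian, N(x_i | M_i μ_k, M_i Λ_k M_iᵀ + Σ_i). All parameter values and function names here are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters (ours, for illustration): a 2-component
# mixture over a d = 3 latent space.
alpha = np.array([0.6, 0.4])                       # weights α_k
mu = np.array([[0.0, 0.0, 0.0],
               [3.0, 3.0, 3.0]])                   # means μ_k
Lam = np.array([np.eye(3), 2.0 * np.eye(3)])       # covariances Λ_k

def sample_observation(M, Sigma):
    """Draw x_i = M_i z_i + ε_i, with z_i from the latent mixture."""
    k = rng.choice(len(alpha), p=alpha)            # pick a component
    z = rng.multivariate_normal(mu[k], Lam[k])     # latent vector z_i
    eps = rng.multivariate_normal(np.zeros(M.shape[0]), Sigma)
    return M @ z + eps

def log_marginal(x, M, Sigma):
    """log p(x): mixture of N(x | M μ_k, M Λ_k Mᵀ + Σ) terms."""
    terms = []
    for k in range(len(alpha)):
        T = M @ Lam[k] @ M.T + Sigma               # observed-space covariance
        r = x - M @ mu[k]
        _, logdet = np.linalg.slogdet(T)
        terms.append(np.log(alpha[k]) - 0.5 * (len(x) * np.log(2 * np.pi)
                     + logdet + r @ np.linalg.solve(T, r)))
    m = max(terms)                                 # log-sum-exp for stability
    return m + np.log(sum(np.exp(t - m) for t in terms))

# A datum that observes only two of the three latent coordinates
# (one missing dimension), with its own noise covariance Σ_i.
M_i = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0]])
Sigma_i = 0.1 * np.eye(2)
x_i = sample_observation(M_i, Sigma_i)
print(log_marginal(x_i, M_i, Sigma_i))
```

Note that the projection M_i and noise covariance Σ_i are arguments per datum, which is exactly what lets each observation have its own uncertainty and missing-data pattern.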
To estimate the mixture parameters (weights α_k, means μ_k, covariances Λ_k), the authors derive an Expectation‑Maximization algorithm that explicitly incorporates the per‑sample noise and projection operators. In the E‑step, for each component k and datum i, they compute the posterior responsibility γ_{ik} = p(k|x_i) ∝ α_k N(x_i | M_i μ_k, T_{ik}), where T_{ik} = M_i Λ_k M_i^T + Σ_i, together with the conditional moments of the latent variable: the mean ẑ_{ik} = E[z_i | x_i, k] = μ_k + Λ_k M_i^T T_{ik}^{-1} (x_i − M_i μ_k) and the conditional covariance Ĉ_{ik} = Λ_k − Λ_k M_i^T T_{ik}^{-1} M_i Λ_k. The M‑step then has closed‑form updates: α_k ← (1/N) ∑_i γ_{ik}, μ_k ← ∑_i γ_{ik} ẑ_{ik} / ∑_i γ_{ik}, and Λ_k ← ∑_i γ_{ik} [(ẑ_{ik} − μ_k)(ẑ_{ik} − μ_k)^T + Ĉ_{ik}] / ∑_i γ_{ik}. Because each datum's noise covariance Σ_i and projection M_i enter only through T_{ik}, they are handled exactly, so the fitted mixture approximates the deconvolved underlying distribution rather than the noise‑broadened observed one.
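One full EM iteration for this model can be sketched as follows. This is our own minimal NumPy implementation of the standard extreme-deconvolution updates, not the authors' code; the function and variable names are ours, and it favors clarity over efficiency (it inverts T_{ik} per datum and component).

```python
import numpy as np

def log_gauss(x, m, C):
    """log N(x | m, C) for a single point x."""
    r = x - m
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (len(m) * np.log(2 * np.pi) + logdet
                   + r @ np.linalg.solve(C, r))

def em_step(X, Ms, Sigmas, alpha, mu, Lam):
    """One EM iteration of extreme deconvolution (illustrative sketch)."""
    N, K = len(X), len(alpha)
    d = mu.shape[1]
    logR = np.zeros((N, K))          # unnormalized log responsibilities
    zhat = np.zeros((N, K, d))       # conditional means E[z_i | x_i, k]
    Chat = np.zeros((N, K, d, d))    # conditional covariances
    for i in range(N):
        M, S, x = Ms[i], Sigmas[i], X[i]
        for k in range(K):
            T = M @ Lam[k] @ M.T + S               # cov of x_i under comp. k
            logR[i, k] = np.log(alpha[k]) + log_gauss(x, M @ mu[k], T)
            G = Lam[k] @ M.T @ np.linalg.inv(T)    # gain matrix
            zhat[i, k] = mu[k] + G @ (x - M @ mu[k])
            Chat[i, k] = Lam[k] - G @ M @ Lam[k]
    # responsibilities γ_{ik} via log-sum-exp normalization
    logR -= logR.max(axis=1, keepdims=True)
    R = np.exp(logR)
    R /= R.sum(axis=1, keepdims=True)
    # M-step: closed-form updates
    q = R.sum(axis=0)
    alpha_new = q / N
    mu_new = np.einsum('ik,ikd->kd', R, zhat) / q[:, None]
    Lam_new = np.zeros_like(Lam)
    for k in range(K):
        dz = zhat[:, k] - mu_new[k]
        Lam_new[k] = (np.einsum('i,id,ie->de', R[:, k], dz, dz)
                      + np.einsum('i,ide->de', R[:, k], Chat[:, k])) / q[k]
    return alpha_new, mu_new, Lam_new
```

Iterating `em_step` to convergence yields the maximum-likelihood mixture; the paper's full method additionally supports conjugate priors and a split-and-merge procedure, neither of which is sketched here.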