Bayesian finite mixtures: a note on prior specification and posterior computation

Bayesian finite mixtures: a note on prior specification and posterior   computation
Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

A new method for the computation of the posterior distribution of the number k of components in a finite mixture is presented. Two aspects of prior specification are also studied: an argument is made for the use of a Poisson(1) distribution as the prior for k; and methods are given for the selection of hyperparameter values in the mixture of normals model, with natural conjugate priors on the components parameters.


💡 Research Summary

The paper addresses two long‑standing practical issues in Bayesian finite mixture modeling: (1) how to compute the posterior distribution of the unknown number of mixture components k in a computationally efficient way, and (2) how to choose sensible prior distributions for k and for the component‑specific parameters in a normal mixture.
The authors first argue that a Poisson(1) prior for k is both theoretically justified and practically advantageous. Because the Poisson(1) mass decays rapidly for values larger than one, it automatically penalizes overly complex models without the need for ad‑hoc penalty terms. This choice is motivated by the principle of minimal information and by empirical comparisons showing that it yields lower over‑fitting rates than uniform or Poisson(λ > 1) priors.
For the normal mixture, the component means μ_j and variances σ_j² are given conjugate normal‑inverse‑gamma (N‑IG) priors. The paper provides a data‑driven recipe for the hyper‑parameters (m₀, κ₀, α₀, β₀). The overall sample mean and variance are used to set m₀ = (\bar{x}) and κ₀ proportional to the sample size, ensuring that the prior mean of μ_j matches the empirical location. The variance hyper‑parameters are calibrated so that the prior expected variance β₀/(α₀ − 1) coincides with the observed sample variance, while α₀ is kept modest (typically 2–5) to give a reasonably heavy‑tailed prior. This systematic calibration reduces the subjectivity that often plagues prior specification in mixture models.
The methodological core is a closed‑form expression for the marginal likelihood m_k(data) of a model with k components, obtained by analytically integrating out the component parameters under the N‑IG prior. The posterior probability of each k is then
\


Comments & Academic Discussion

Loading comments...

Leave a Comment