Simulation of the matrix Bingham-von Mises-Fisher distribution, with applications to multivariate and relational data
Orthonormal matrices play an important role in reduced-rank matrix approximations and the analysis of matrix-valued data. A matrix Bingham-von Mises-Fisher distribution is a probability distribution on the set of orthonormal matrices that includes linear and quadratic terms, and arises as a posterior distribution in latent factor models for multivariate and relational data. This article describes rejection and Gibbs sampling algorithms for sampling from this family of distributions, and illustrates their use in the analysis of a protein-protein interaction network.
💡 Research Summary
The paper addresses the problem of generating random orthonormal matrices from the matrix Bingham‑von Mises‑Fisher (BvMF) distribution, a flexible family that incorporates both linear and quadratic terms in its exponent. The BvMF density on the Stiefel manifold of p × r orthonormal matrices has the form
p(X | U, A) ∝ exp{ tr(UᵀX) + tr(XᵀAX) },
where U is a p × r matrix and A is a symmetric p × p matrix (so that XᵀAX is well defined). This structure naturally arises as the posterior distribution in Bayesian latent‑factor models for multivariate data (e.g., factor analysis, probabilistic PCA) and for relational data such as network embeddings, where the observed data enter the likelihood through statistics that are linear and quadratic in the latent orthonormal factor X.
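As a concrete reference point, the unnormalised log-density is straightforward to evaluate. The sketch below uses a hypothetical helper name (`bmf_log_density_unnorm` is not from the paper) and assumes X is a p × r matrix with orthonormal columns:

```python
import numpy as np

def bmf_log_density_unnorm(X, U, A):
    """Unnormalised log-density of the matrix Bingham-von Mises-Fisher law:
    log p(X | U, A) + const = tr(U^T X) + tr(X^T A X).

    X : (p, r) matrix with orthonormal columns (X.T @ X = I_r)
    U : (p, r) linear-term parameter
    A : (p, p) symmetric quadratic-term parameter
    """
    return np.trace(U.T @ X) + np.trace(X.T @ A @ X)
```

The normalising constant (a hypergeometric function of matrix argument) is intractable, which is precisely why direct sampling requires the algorithms described below.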
Existing work on the Bingham or von Mises‑Fisher distributions treats them separately; a joint BvMF sampler has been lacking because the normalising constant is intractable and the support is a curved manifold. The authors propose two complementary Monte‑Carlo strategies.
- Rejection sampling – They construct a proposal distribution that is itself a Bingham or von Mises‑Fisher law, with parameters tuned to the target’s U and A. By exploiting the eigendecomposition of A and aligning the proposal’s concentration with the dominant eigendirections, they obtain a tight envelope and thus a reasonable acceptance probability even in moderate dimensions. The method is exact, but its efficiency deteriorates as p or r grows unless the spectral gap of A is large.
- Gibbs sampling – The authors decompose X column‑wise: conditional on all the other columns, each column follows a vector‑valued BvMF distribution whose parameters are simple functions of U, A, and the other columns. This reduces the high‑dimensional problem to a sequence of low‑dimensional updates for which well‑established vector samplers can be reused. The Gibbs scheme is shown to mix rapidly, especially when the quadratic term is moderate, and its per‑iteration cost scales as O(pr).
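The column-wise Gibbs idea can be sketched end to end under the simplified two-parameter density above. The code below is an illustrative reconstruction, not the paper's algorithm: each full conditional is a vector BvMF law on a unit sphere inside the orthogonal complement of the remaining columns, and here it is drawn with a naive uniform-proposal rejection step (valid because cᵀz + zᵀMz ≤ ‖c‖ + λ_max(M) on the sphere, but far less efficient than the samplers the authors develop). The function names `sample_sphere_bmf` and `gibbs_step` are hypothetical.

```python
import numpy as np

def sample_sphere_bmf(c, M, rng, max_tries=200000):
    """Rejection sampler for p(z) ∝ exp(cᵀz + zᵀMz) on the unit sphere,
    using a uniform proposal (illustrative; practical only for mild
    concentration or low dimension)."""
    # Upper bound of the exponent on the sphere: ‖c‖ + largest eigenvalue of M.
    bound = np.linalg.norm(c) + np.linalg.eigvalsh(M)[-1]
    for _ in range(max_tries):
        z = rng.standard_normal(len(c))
        z /= np.linalg.norm(z)           # uniform draw on the sphere
        if np.log(rng.uniform()) < c @ z + z @ M @ z - bound:
            return z
    raise RuntimeError("acceptance rate too low for the uniform proposal")

def gibbs_step(X, U, A, rng):
    """One column-wise Gibbs sweep targeting p(X) ∝ exp{tr(UᵀX) + tr(XᵀAX)}
    on the Stiefel manifold (a sketch of the column-conditional idea)."""
    p, r = X.shape
    for j in range(r):
        others = np.delete(X, j, axis=1)
        # Orthonormal basis N for the orthogonal complement of the other
        # columns, taken from the trailing left singular vectors.
        N = np.linalg.svd(others)[0][:, r - 1:]
        # Writing column j as x_j = N z, its conditional density in z is a
        # vector BvMF law with linear term NᵀU[:, j] and quadratic term NᵀAN.
        z = sample_sphere_bmf(N.T @ U[:, j], N.T @ A @ N, rng)
        X[:, j] = N @ z
    return X
```

Each sweep leaves the columns exactly orthonormal by construction, since every update stays on the unit sphere within the complement of the other columns.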
Theoretical analysis includes bounds on the acceptance rate of the rejection algorithm and a proof of geometric ergodicity for the Gibbs chain under mild conditions on A. Extensive simulation studies vary (p, r) from (10, 3) to (100, 10), comparing acceptance probabilities, effective sample sizes, and wall‑clock times. Results confirm that Gibbs sampling dominates in high‑dimensional regimes, while rejection sampling can be competitive when A is nearly diagonal or when a single high‑quality draw is needed.
To demonstrate practical relevance, the authors apply the Gibbs sampler to a protein‑protein interaction (PPI) network comprising roughly 1,000 proteins and 5,000 edges. A latent‑factor model with orthonormal factors of dimension r = 2–5 is fitted, and posterior samples of X are used to embed proteins in a low‑dimensional Euclidean space. Visualisation and k‑means clustering reveal biologically meaningful modules that align with known functional categories, outperforming standard singular‑value decomposition and Laplacian eigenmap embeddings. Moreover, posterior uncertainty quantifies the confidence of predicted interactions, offering a principled way to prioritise experimental validation.
In summary, the paper delivers the first comprehensive suite of exact and efficient samplers for the matrix Bingham‑von Mises‑Fisher distribution, bridging a methodological gap that has limited the use of orthogonal‑matrix priors in Bayesian multivariate and relational models. By providing both a theoretically sound rejection scheme and a highly scalable Gibbs algorithm, the work enables practitioners to incorporate rich directional priors, improve inference quality, and extract more nuanced structure from complex data such as biological networks. Future directions suggested include variational approximations for even larger problems and extensions to dynamic manifolds for time‑evolving relational data.