The Infinite Hierarchical Factor Regression Model
We propose a nonparametric Bayesian factor regression model that accounts for uncertainty in the number of factors, and the relationship between factors. To accomplish this, we propose a sparse variant of the Indian Buffet Process and couple this with a hierarchical model over factors, based on Kingman’s coalescent. We apply this model to two problems (factor analysis and factor regression) in gene-expression data analysis.
💡 Research Summary
The paper introduces an Infinite Hierarchical Factor Regression Model that simultaneously addresses two major sources of uncertainty in factor analysis and regression: the unknown number of latent factors and the potentially hierarchical relationships among those factors. To achieve this, the authors combine a sparsity‑enhanced variant of the Indian Buffet Process (IBP) with a hierarchical prior over factors based on Kingman’s coalescent.
In the proposed construction, the binary factor‑feature matrix (Z) is drawn from a sparse IBP. Sparsity is enforced by placing a Beta‑Bernoulli mixture prior on each entry, allowing the model to automatically prune irrelevant factor‑feature connections in high‑dimensional settings such as gene‑expression data. The real‑valued weight matrix (A) is given a Gaussian prior, and the elementwise product (Z \odot A) constitutes the sparse factor loading matrix.
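The masked-loading construction can be illustrated with a small NumPy sketch. This draws Z from the *standard* IBP via its sequential (customer) scheme, not the paper's sparse Beta‑Bernoulli variant, and the function name and parameters are illustrative, not from the paper:

```python
import numpy as np

def sample_ibp(n, alpha, rng):
    """Draw a binary matrix Z from the standard Indian Buffet Process.

    Row i is customer/factor i; columns (dishes/features) are opened lazily.
    Existing dish k is taken with probability m_k / (i + 1), where m_k is
    its popularity so far; Poisson(alpha / (i + 1)) new dishes are opened.
    """
    Z = np.zeros((n, 0), dtype=int)
    for i in range(n):
        m = Z[:i].sum(axis=0)                              # dish popularities
        row = (rng.random(Z.shape[1]) < m / (i + 1)).astype(int)
        k_new = rng.poisson(alpha / (i + 1))               # brand-new dishes
        Z = np.hstack([Z, np.zeros((n, k_new), dtype=int)])
        Z[i] = np.concatenate([row, np.ones(k_new, dtype=int)])
    return Z

rng = np.random.default_rng(0)
Z = sample_ibp(6, alpha=2.0, rng=rng)
A = rng.normal(size=Z.shape)    # Gaussian real-valued weights
loadings = Z * A                # Z ⊙ A: entries survive only where Z is 1
```

Note that the number of columns (factors) is not fixed in advance; it is determined by the Poisson draws, which is exactly what lets the model infer the number of factors.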
The hierarchical component is introduced by treating the set of factors as a population that evolves backward in time according to Kingman’s coalescent. This stochastic process generates a random binary tree in which internal nodes represent “ancestor” factors that capture shared variation among descendant (leaf) factors. The coalescent tree therefore encodes a prior over factor covariances without requiring an explicit covariance matrix; the depth and branch lengths of the tree dictate how strongly factors are correlated.
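The generative mechanism of Kingman's coalescent is easy to simulate: with m lineages active, the next merge occurs after an Exponential waiting time with rate (m \choose 2), and a uniformly random pair coalesces into an ancestor. A minimal sketch (function name and return format are illustrative, not the paper's inference code):

```python
import numpy as np

def simulate_coalescent(k, rng):
    """Simulate Kingman's coalescent over k leaf factors.

    Returns a list of merge events (time, left_id, right_id, parent_id);
    leaves are ids 0..k-1, internal "ancestor" nodes get fresh ids.
    """
    active = list(range(k))          # currently un-merged lineages
    events, t, next_id = [], 0.0, k
    while len(active) > 1:
        m = len(active)
        t += rng.exponential(2.0 / (m * (m - 1)))     # rate = m*(m-1)/2
        i, j = sorted(rng.choice(m, size=2, replace=False))
        left, right = active[i], active[j]
        del active[j]; del active[i]                  # larger index first
        active.append(next_id)                        # new ancestor node
        events.append((t, left, right, next_id))
        next_id += 1
    return events

rng = np.random.default_rng(0)
events = simulate_coalescent(5, rng)
```

The resulting tree has k − 1 internal nodes; factors that coalesce early (short branch lengths to a common ancestor) are modeled as more strongly correlated.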
The observation model is a standard linear‑Gaussian factor model:

X = (Z \odot A) F + E,

where X is the (P \times N) data matrix (e.g., genes × samples), (Z \odot A) is the sparse (P \times K) loading matrix, F is the (K \times N) matrix of factor scores governed by the coalescent prior, and E is i.i.d. Gaussian observation noise.
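Putting the pieces together, forward simulation from the model is a few lines of NumPy. For brevity this sketch uses i.i.d. stand-ins for the IBP draw of Z and the coalescent prior on F; all dimensions and parameter values here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
P, K, N = 50, 4, 100                            # features (genes), factors, samples

Z = (rng.random((P, K)) < 0.3).astype(float)    # binary mask (stand-in for an IBP draw)
A = rng.normal(size=(P, K))                     # Gaussian real-valued weights
F = rng.normal(size=(K, N))                     # factor scores (i.i.d. stand-in
                                                #  for the coalescent prior)
E = 0.1 * rng.normal(size=(P, N))               # observation noise
X = (Z * A) @ F + E                             # X = (Z ⊙ A) F + E
```

Inference in the actual model reverses this process, recovering Z, A, F, and the coalescent tree from the observed X.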