The Infinite Hierarchical Factor Regression Model
We propose a nonparametric Bayesian factor regression model that accounts for uncertainty in the number of factors, and the relationship between factors. To accomplish this, we propose a sparse variant of the Indian Buffet Process and couple this with a hierarchical model over factors, based on Kingman’s coalescent. We apply this model to two problems (factor analysis and factor regression) in gene-expression data analysis.
💡 Research Summary
The paper introduces an Infinite Hierarchical Factor Regression Model that simultaneously addresses two major sources of uncertainty in factor analysis and regression: the unknown number of latent factors and the potentially hierarchical relationships among those factors. To achieve this, the authors combine a sparsity‑enhanced variant of the Indian Buffet Process (IBP) with a hierarchical prior over factors based on Kingman’s coalescent.
In the proposed construction, the binary factor‑feature matrix (Z) is drawn from a sparse IBP. Sparsity is enforced by placing a Beta‑Bernoulli mixture prior on each entry, allowing the model to automatically prune irrelevant factor‑feature connections in high‑dimensional settings such as gene‑expression data. The real‑valued weight matrix (A) is given a Gaussian prior, and the elementwise product (Z \odot A) constitutes the sparse factor loading matrix.
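The masked-loading construction can be illustrated with a small NumPy sketch. This draws Z from the *standard* IBP via its sequential (customer) scheme, not the paper's sparse Beta‑Bernoulli variant, and the function name and parameters are illustrative, not from the paper:

```python
import numpy as np

def sample_ibp(n, alpha, rng):
    """Draw a binary matrix Z from the standard Indian Buffet Process.

    Row i is customer/factor i; columns (dishes/features) are opened lazily.
    Existing dish k is taken with probability m_k / (i + 1), where m_k is
    its popularity so far; Poisson(alpha / (i + 1)) new dishes are opened.
    """
    Z = np.zeros((n, 0), dtype=int)
    for i in range(n):
        m = Z[:i].sum(axis=0)                              # dish popularities
        row = (rng.random(Z.shape[1]) < m / (i + 1)).astype(int)
        k_new = rng.poisson(alpha / (i + 1))               # brand-new dishes
        Z = np.hstack([Z, np.zeros((n, k_new), dtype=int)])
        Z[i] = np.concatenate([row, np.ones(k_new, dtype=int)])
    return Z

rng = np.random.default_rng(0)
Z = sample_ibp(6, alpha=2.0, rng=rng)
A = rng.normal(size=Z.shape)    # Gaussian real-valued weights
loadings = Z * A                # Z ⊙ A: entries survive only where Z is 1
```

Note that the number of columns (factors) is not fixed in advance; it is determined by the Poisson draws, which is exactly what lets the model infer the number of factors.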
The hierarchical component is introduced by treating the set of factors as a population that evolves backward in time according to Kingman’s coalescent. This stochastic process generates a random binary tree in which internal nodes represent “ancestor” factors that capture shared variation among descendant (leaf) factors. The coalescent tree therefore encodes a prior over factor covariances without requiring an explicit covariance matrix; the depth and branch lengths of the tree dictate how strongly factors are correlated.
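The generative mechanism of Kingman's coalescent is easy to simulate: with m lineages active, the next merge occurs after an Exponential waiting time with rate (m \choose 2), and a uniformly random pair coalesces into an ancestor. A minimal sketch (function name and return format are illustrative, not the paper's inference code):

```python
import numpy as np

def simulate_coalescent(k, rng):
    """Simulate Kingman's coalescent over k leaf factors.

    Returns a list of merge events (time, left_id, right_id, parent_id);
    leaves are ids 0..k-1, internal "ancestor" nodes get fresh ids.
    """
    active = list(range(k))          # currently un-merged lineages
    events, t, next_id = [], 0.0, k
    while len(active) > 1:
        m = len(active)
        t += rng.exponential(2.0 / (m * (m - 1)))     # rate = m*(m-1)/2
        i, j = sorted(rng.choice(m, size=2, replace=False))
        left, right = active[i], active[j]
        del active[j]; del active[i]                  # larger index first
        active.append(next_id)                        # new ancestor node
        events.append((t, left, right, next_id))
        next_id += 1
    return events

rng = np.random.default_rng(0)
events = simulate_coalescent(5, rng)
```

The resulting tree has k − 1 internal nodes; factors that coalesce early (short branch lengths to a common ancestor) are modeled as more strongly correlated.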
The observation model is a standard linear‑Gaussian factor model:

X = (Z \odot A) F + E,

where X is the (P \times N) data matrix (e.g., genes × samples), (Z \odot A) is the sparse (P \times K) loading matrix, F is the (K \times N) matrix of factor scores governed by the coalescent prior, and E is i.i.d. Gaussian observation noise.
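Putting the pieces together, forward simulation from the model is a few lines of NumPy. For brevity this sketch uses i.i.d. stand-ins for the IBP draw of Z and the coalescent prior on F; all dimensions and parameter values here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
P, K, N = 50, 4, 100                            # features (genes), factors, samples

Z = (rng.random((P, K)) < 0.3).astype(float)    # binary mask (stand-in for an IBP draw)
A = rng.normal(size=(P, K))                     # Gaussian real-valued weights
F = rng.normal(size=(K, N))                     # factor scores (i.i.d. stand-in
                                                #  for the coalescent prior)
E = 0.1 * rng.normal(size=(P, N))               # observation noise
X = (Z * A) @ F + E                             # X = (Z ⊙ A) F + E
```

Inference in the actual model reverses this process, recovering Z, A, F, and the coalescent tree from the observed X.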