A Slice Sampler for Restricted Hierarchical Beta Process with Applications to Shared Subspace Learning


The hierarchical beta process has found interesting applications in recent years. In this paper we present a modified hierarchical beta process prior with applications to hierarchical modeling of multiple data sources. The novel use of this prior over a hierarchical factor model allows factors to be shared across different sources. We derive a slice sampler for the model, enabling tractable inference even when the likelihood and the prior over parameters are non-conjugate; this allows the model to be applied in much wider contexts without restrictions. We present two data generative models: a linear Gaussian-Gaussian model for real-valued data and a linear Poisson-gamma model for count data. Encouraging transfer learning results are shown for two real-world applications: text modeling and content-based image retrieval.


💡 Research Summary

The paper introduces a novel non‑parametric Bayesian framework for jointly modeling multiple data sources while allowing latent factors to be shared across sources and to remain source‑specific when needed. The authors start by modifying the classic Hierarchical Beta Process (HBP) into what they call a Restricted Hierarchical Beta Process (RHBP). In RHBP each source is equipped with a binary mask vector that indicates whether a particular latent factor is active for that source. The activation probabilities of these masks are themselves drawn from a hierarchical beta distribution, preserving the flexibility of the original HBP while imposing a soft constraint on the number of factors that can be simultaneously active in any source. This “restriction” serves two purposes: it curbs the proliferation of unnecessary factors, and it provides a probabilistic mechanism for controlling the degree of factor sharing among sources.

A major technical obstacle in using beta‑process priors with arbitrary likelihoods is the lack of conjugacy between the prior and the likelihood. Traditional Gibbs samplers either require conjugate pairs or resort to expensive Metropolis‑Hastings steps that mix poorly. To overcome this, the authors develop a slice sampler tailored to the RHBP. By introducing an auxiliary slice variable u, the infinite collection of latent factors is truncated to a finite set that is guaranteed to contain all factors with probability mass above u. The sampler then alternates between (i) sampling the slice variable, (ii) updating the binary mask for each factor‑source pair, and (iii) sampling the factor parameters given the current mask configuration. Because the slice variable adapts dynamically, the effective number of factors explored at each iteration is automatically adjusted, leading to efficient exploration of the posterior even when the likelihood is non‑conjugate.
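The adaptive truncation step (i) above can be sketched in a few lines. This is a minimal illustration of the slice idea, assuming the factor probabilities `pi` and binary masks `z` from a truncated representation; the function name and interface are hypothetical, not the paper's.

```python
import numpy as np

def slice_truncate(pi, z, rng):
    """One slice-variable update: draw u uniformly below the smallest
    activation probability among currently active factors, then return
    the indices of every factor whose mass exceeds u (a finite set
    guaranteed to contain all active factors)."""
    active = pi[z.any(axis=0)]
    upper = active.min() if active.size else pi.max()
    u = rng.uniform(0.0, upper)
    return np.flatnonzero(pi > u), u
```

As the masks change across iterations, the slice level `u` rises or falls, so the number of factors the sampler must visit adapts automatically, which is what makes the scheme tractable without conjugacy.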

The paper demonstrates the versatility of the approach by instantiating two concrete generative models. The first is a linear Gaussian‑Gaussian model for real‑valued data. Observations are modeled as a linear combination of active factors plus Gaussian noise. Thanks to the Gaussian‑Gaussian conjugacy, the factor loadings and source‑specific weights can be updated in closed form within the slice sampler. The second model is a linear Poisson‑Gamma construction for count data. Here each observation follows a Poisson distribution whose rate is the linear combination of active factors; the factor weights follow Gamma priors, yielding a Gamma‑Poisson conjugate pair that can be sampled exactly. In both cases the RHBP mask determines which factors contribute to each source, enabling seamless sharing of statistical strength across heterogeneous data sets.
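The two generative models can be sketched side by side for a single source. This is an illustrative simulation under assumed hyperparameters (unit-variance loadings, unit-shape gamma priors), not the paper's exact specification; the shared mask `z` switches factors on or off exactly as described above.

```python
import numpy as np

rng = np.random.default_rng(2)
K, D, N = 10, 20, 100   # factors, observed dimension, observations (illustrative)

z = rng.binomial(1, 0.3, size=K)         # binary mask for this source

# Gaussian-Gaussian model: real-valued data as a masked linear
# combination of Gaussian factor loadings plus Gaussian noise.
Phi = rng.normal(0.0, 1.0, size=(K, D))  # factor loadings
W = rng.normal(0.0, 1.0, size=(N, K)) * z
X_real = W @ Phi + rng.normal(0.0, 0.1, size=(N, D))

# Poisson-gamma model: counts whose rate is a masked linear
# combination of nonnegative gamma-distributed factors and weights.
Phi_pos = rng.gamma(1.0, 1.0, size=(K, D))
W_pos = rng.gamma(1.0, 1.0, size=(N, K)) * z
X_count = rng.poisson(W_pos @ Phi_pos)
```

In both cases the mask multiplies the weights, so inactive factors contribute nothing to the likelihood of that source, which is how the model shares or withholds statistical strength.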

Empirical evaluation focuses on two real‑world applications. In the first experiment, the authors apply the Gaussian‑RHBP model to a multi‑domain text corpus consisting of news articles and web pages. Shared factors capture common topics such as politics or economics, while domain‑specific factors model vocabulary unique to each source. Compared with a Hierarchical Dirichlet Process LDA baseline, the RHBP model achieves a 12 % reduction in perplexity and demonstrates markedly better transfer learning when a source with limited data benefits from the shared topics learned on a richer source. The second experiment tackles content‑based image retrieval. Visual features are represented as count histograms, while textual metadata provides an auxiliary modality. The Poisson‑Gamma RHBP model aligns the two modalities through shared latent factors, improving mean average precision (mAP) by 8–10 % over state‑of‑the‑art baselines that treat the modalities independently.

Overall, the contribution of the paper is threefold. First, it provides a principled way to restrict the hierarchical beta process, thereby controlling factor proliferation and enhancing interpretability. Second, it introduces a slice‑sampling scheme that makes inference tractable for a broad class of non‑conjugate likelihoods, extending the applicability of beta‑process priors far beyond the limited settings previously explored. Third, it validates the approach on both continuous and discrete data, showing that the same underlying RHBP machinery can be reused across very different problem domains. The authors suggest several promising directions for future work, including non‑linear extensions (e.g., coupling RHBP with deep neural networks), online slice sampling for streaming data, and more sophisticated hierarchical constructions that could capture richer dependency structures among sources.

