Default Bayesian analysis for multi-way tables: a data-augmentation approach


This paper proposes a strategy for regularized estimation in multi-way contingency tables, which are common in meta-analyses and multi-center clinical trials. Our approach is based on data augmentation, and appeals heavily to a novel class of Polya-Gamma distributions. Our main contributions are to build up the relevant distributional theory and to demonstrate three useful features of this data-augmentation scheme. First, it leads to simple EM and Gibbs-sampling algorithms for posterior inference, circumventing the need for analytic approximations, numerical integration, Metropolis–Hastings, or variational methods. Second, it allows modelers much more flexibility when choosing priors, which have traditionally come from the Dirichlet or logistic-normal family. For example, our approach allows users to incorporate Bayesian analogues of classical penalized-likelihood techniques (e.g. the lasso or bridge) in computing regularized estimates for log-odds ratios. Finally, our data-augmentation scheme naturally suggests a default strategy for prior selection based on the logistic-Z model, which is strongly related to Jeffreys’ prior for a binomial proportion. To illustrate the method we focus primarily on the particular case of a meta-analysis/multi-center study (i.e. a J × K × N table). But the general approach encompasses many other common situations, of which we will provide examples.


💡 Research Summary

The paper introduces a novel data‑augmentation framework for Bayesian analysis of multi‑way contingency tables, focusing on the common J × K × N structure encountered in meta‑analyses and multi‑center clinical trials. The authors’ key innovation is the use of a newly defined class of Polya‑Gamma (PG) distributions to represent the logistic likelihood as a mixture of normal distributions. This representation eliminates the non‑conjugacy that traditionally forces analysts to rely on numerical integration, Laplace approximations, Metropolis–Hastings, or variational methods.
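As a quick sanity check (not taken from the paper's code), the mixture-of-normals claim can be verified numerically: integrating the Gaussian kernel e^(−ωψ²/2) against a PG(n, 0) density has the closed form cosh(ψ/2)^(−n) (the Polya-Gamma Laplace transform), so the binomial likelihood in the log-odds ψ should factor exactly as follows. Here a denotes the success count and n the number of trials; the specific values of ψ, a, n are arbitrary test points.

```python
import numpy as np

def binomial_likelihood(psi, a, n):
    # Binomial likelihood written in the log-odds psi:
    # (e^psi)^a / (1 + e^psi)^n
    return np.exp(a * psi) / (1.0 + np.exp(psi))**n

def pg_mixture_form(psi, a, n):
    # PG mixture representation: 2^{-n} e^{kappa psi} E[e^{-omega psi^2 / 2}]
    # with kappa = a - n/2 and omega ~ PG(n, 0), whose Laplace transform
    # gives E[e^{-omega psi^2 / 2}] = cosh(psi/2)^{-n} in closed form.
    kappa = a - n / 2.0
    return 2.0**(-n) * np.exp(kappa * psi) * np.cosh(psi / 2.0)**(-n)

# The two expressions agree to machine precision at arbitrary test points.
for psi in [-1.5, 0.3, 2.0]:
    print(binomial_likelihood(psi, 3, 10), pg_mixture_form(psi, 3, 10))
```

The agreement of the two columns is exactly the non-conjugacy-removing trick: conditional on ω, the likelihood in ψ is Gaussian.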

The development begins with the simplest case: a single 2 × 2 table. Let ψ₁ and ψ₂ denote the log‑odds for treatment and control groups, respectively, and assume a bivariate normal prior N(μ, Σ). The posterior density, originally a product of binomial terms and a Gaussian prior, is shown (Theorem 1) to be conditionally normal given latent precision variables ω₁ and ω₂, where Ω = diag(ω₁, ω₂). Each ωⱼ follows a Polya‑Gamma distribution PG(nⱼ, 0). Moreover, the conditional posterior of ωⱼ given ψⱼ is again PG(nⱼ, ψⱼ). This two‑part result provides a complete hierarchical representation: ψ | Ω ∼ N(m_Ω, V_Ω) and ω | ψ ∼ PG.
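Concretely, Theorem 1 yields a two-step Gibbs sampler for each log-odds under a normal prior. The sketch below is illustrative, not the authors' code: it handles a single binomial arm (hypothetical data y = 7 successes in n = 20 trials, N(0, 4) prior on ψ), and it approximates PG(b, c) draws by truncating the distribution's infinite sum-of-Gammas representation, where in practice a dedicated exact sampler would be used.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_pg(b, c, K=200):
    """Approximate draw from PG(b, c) by truncating its representation as an
    infinite weighted sum of independent Gamma(b, 1) variables at K terms."""
    k = np.arange(1, K + 1)
    g = rng.gamma(b, 1.0, size=K)
    return np.sum(g / ((k - 0.5)**2 + c**2 / (4.0 * np.pi**2))) / (2.0 * np.pi**2)

# Hypothetical data for one arm of a 2x2 table: y successes out of n trials.
y, n = 7, 20
mu, sigma2 = 0.0, 4.0          # N(mu, sigma2) prior on the log-odds psi
kappa = y - n / 2.0            # the "tilting" term from the PG identity

psi, draws = 0.0, []
for it in range(3000):
    omega = sample_pg(n, psi)               # omega | psi ~ PG(n, psi)
    V = 1.0 / (omega + 1.0 / sigma2)        # conditional posterior variance
    m = V * (kappa + mu / sigma2)           # conditional posterior mean
    psi = rng.normal(m, np.sqrt(V))         # psi | omega ~ N(m, V)
    if it >= 500:                           # discard burn-in
        draws.append(psi)

print(np.mean(draws))   # posterior mean of the log-odds
```

Both conditionals are sampled exactly (up to the PG truncation), so no Metropolis–Hastings step or tuning is needed, which is the first advantage claimed in the abstract.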

Extending to N independent centers yields a 2 × 2 × N table. For each center i, the same latent structure is introduced, leading to ψ_i | Ω_i ∼ N(m_i, V_i) and ω_{ij} | ψ_{ij} ∼ PG(n_{ij}, ψ_{ij}). The authors exploit this structure to derive two computational algorithms.

  1. EM Algorithm – In the E‑step, the expectation of ω_{ij} under its conditional PG distribution is computed analytically using the identity E[ω_{ij} | ψ_{ij}] = (n_{ij} / (2ψ_{ij})) tanh(ψ_{ij}/2); the M‑step then maximizes the resulting Gaussian complete‑data objective, a weighted least‑squares problem in the ψ_{ij}.
  2. Gibbs Sampler – Exact draws alternate between the two conditionals of the hierarchical representation, ψ | Ω ∼ N(m_Ω, V_Ω) and ω_{ij} | ψ_{ij} ∼ PG(n_{ij}, ψ_{ij}), so no Metropolis–Hastings proposals or tuning parameters are required.
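The EM iteration can be sketched for a single cell (a simplification of the paper's multi-center setting; the data y = 7 out of n = 20 and the N(0, 4) prior are made up for illustration). The E-step plugs in E[ω | ψ] = (n/(2ψ)) tanh(ψ/2), and the M-step solves the resulting quadratic problem in closed form; the fixed point is the posterior mode of the log-odds.

```python
import math

def e_omega(n, psi):
    # E-step: E[omega | psi] for omega ~ PG(n, psi).
    if abs(psi) < 1e-8:
        return n / 4.0            # limit of (n/(2 psi)) tanh(psi/2) as psi -> 0
    return (n / (2.0 * psi)) * math.tanh(psi / 2.0)

y, n = 7, 20
mu, sigma2 = 0.0, 4.0             # N(mu, sigma2) prior on the log-odds
kappa = y - n / 2.0

psi = 0.0
for _ in range(50):
    w = e_omega(n, psi)                                # E-step
    psi = (kappa + mu / sigma2) / (w + 1.0 / sigma2)   # M-step (quadratic max)
print(psi)   # converges to the posterior mode of the log-odds
```

Because tanh(ψ/2) = 2σ(ψ) − 1 for the logistic function σ, the fixed-point condition reduces to the usual score equation y − nσ(ψ) − (ψ − μ)/σ² = 0, confirming that the iteration targets the exact posterior mode.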
