Asymptotic Model Selection for Naive Bayesian Networks

We develop a closed form asymptotic formula to compute the marginal likelihood of data given a naive Bayesian network model with two hidden states and binary features. This formula deviates from the standard BIC score. Our work provides a concrete example that the BIC score is generally not valid for statistical models that belong to a stratified exponential family. This stands in contrast to linear and curved exponential families, where the BIC score has been proven to provide a correct approximation for the marginal likelihood.


💡 Research Summary

The paper tackles the problem of Bayesian model selection for a class of naive Bayes networks that contain a single hidden categorical variable with two states and a set of binary observable features. While the Bayesian Information Criterion (BIC) is a widely used asymptotic approximation to the marginal likelihood for regular statistical models, the authors demonstrate that naive Bayes models with hidden variables belong to a stratified exponential family rather than a linear or curved exponential family. This structural difference creates singularities in the parameter space—points where the Fisher information matrix becomes degenerate—so that the standard Laplace approximation underlying BIC fails.
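The degeneracy described above can be checked numerically on a small instance. The following is a minimal sketch, not the authors' procedure: it assumes three binary features, writes the hidden-class prior as `pi`, and uses the hypothetical names `a` and `b` for the feature probabilities under the two hidden states. It builds the Fisher information matrix from the analytic gradient of the outcome probabilities and reports its numerical rank.

```python
import itertools
import numpy as np

def outcome_terms(a, b):
    """Yield (x, P(x | state 0), P(x | state 1)) for every binary vector x."""
    for bits in itertools.product([0, 1], repeat=len(a)):
        x = np.array(bits)
        pa = np.prod(np.where(x == 1, a, 1.0 - a))
        pb = np.prod(np.where(x == 1, b, 1.0 - b))
        yield x, pa, pb

def fisher_information(pi, a, b):
    """Fisher information w.r.t. theta = (pi, a_1..a_n, b_1..b_n).

    Notation is an assumption of this sketch: P(x) = pi*P_a(x) + (1-pi)*P_b(x).
    """
    a, b = np.asarray(a, float), np.asarray(b, float)
    grads, probs = [], []
    for x, pa, pb in outcome_terms(a, b):
        # Derivative of log P_a(x) w.r.t. a_i: 1/a_i if x_i = 1, else -1/(1 - a_i).
        sa = np.where(x == 1, 1.0 / a, -1.0 / (1.0 - a))
        sb = np.where(x == 1, 1.0 / b, -1.0 / (1.0 - b))
        grads.append(np.concatenate(([pa - pb], pi * pa * sa, (1.0 - pi) * pb * sb)))
        probs.append(pi * pa + (1.0 - pi) * pb)
    J = np.array(grads)                      # shape (2^n, 2n + 1)
    return J.T @ (J / np.array(probs)[:, None])

def fisher_rank(pi, a, b, tol=1e-10):
    """Numerical rank of the Fisher information (tol is an arbitrary cutoff)."""
    return int(np.linalg.matrix_rank(fisher_information(pi, a, b), tol=tol))

if __name__ == "__main__":
    print("generic point :", fisher_rank(0.3, [0.2, 0.5, 0.7], [0.6, 0.1, 0.9]))
    print("a == b        :", fisher_rank(0.3, [0.2, 0.5, 0.7], [0.2, 0.5, 0.7]))
```

At a generic point the rank should equal the parameter count 2n + 1. When both hidden states induce the same feature distribution, the gradient with respect to `pi` is zero and the gradients with respect to `a` and `b` are collinear, so the rank falls to n. Near such points the quadratic (Laplace) approximation has no invertible Hessian to work with.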

To address this, the authors adopt Watanabe’s algebraic‑geometric framework for singular learning models. They decompose the parameter vector into the prior probability of the hidden class (π) and the conditional probabilities of each binary feature given the hidden class (α₁,…,αₙ). The log‑likelihood can be expressed as –N·KL(p‖q_θ)+O(1), where KL denotes the Kullback–Leibler divergence between the true data distribution p and the model distribution q_θ. At a singular point the Hessian of the KL term is degenerate, so the usual quadratic term in the Taylor expansion no longer controls the integral of the likelihood over the parameter space; higher‑order terms dominate instead.
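A one-dimensional toy integral illustrates why the order of the leading term matters. This sketch is not the paper's model: it assumes a stand-in "divergence" of θ² (regular case) or θ⁴ (quadratic term absent) and estimates how the log of the integral of exp(−N·K(θ)) scales with log N. The function names and grid settings are illustrative choices.

```python
import numpy as np

def log_integral(n_samples, power, grid_size=200_001):
    """log of the integral of exp(-N * |theta|^power) over [-1, 1] (Riemann sum)."""
    theta = np.linspace(-1.0, 1.0, grid_size)
    dx = theta[1] - theta[0]
    return np.log(np.sum(np.exp(-n_samples * np.abs(theta) ** power)) * dx)

def log_n_slope(power, n_small=1e3, n_large=1e5):
    """Estimated coefficient of log N in the log-integral."""
    rise = log_integral(n_large, power) - log_integral(n_small, power)
    return rise / (np.log(n_large) - np.log(n_small))

if __name__ == "__main__":
    print("quadratic K:", log_n_slope(2))   # analytically -1/2
    print("quartic K  :", log_n_slope(4))   # analytically -1/4
```

With a quadratic leading term the coefficient is −1/2 per parameter, which is the BIC penalty. With a quartic leading term it is −1/4, so a BIC-style penalty would overstate the complexity cost in this toy case.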

By performing a resolution of singularities, the authors derive a generalized asymptotic expansion for the marginal likelihood: \