Bayesian Parameter Estimation for Latent Markov Random Fields and Social Networks

Undirected graphical models are widely used in statistics, physics and machine vision. However, Bayesian parameter estimation for undirected models is extremely challenging, since evaluation of the posterior typically involves the calculation of an intractable normalising constant. This problem has received much attention, but very little of this has focussed on the important practical case where the data consist of noisy or incomplete observations of the underlying hidden structure. This paper specifically addresses this problem, comparing two alternative methodologies. In the first of these approaches particle Markov chain Monte Carlo (Andrieu et al., 2010) is used to efficiently explore the parameter space, combined with the exchange algorithm (Murray et al., 2006) for avoiding the calculation of the intractable normalising constant (a proof showing that this combination targets the correct distribution is found in a supplementary appendix online). This approach is compared with approximate Bayesian computation (Pritchard et al., 1999). Applications to estimating the parameters of Ising models and exponential random graphs from noisy data are presented. Each algorithm used in the paper targets an approximation to the true posterior due to the use of MCMC to simulate from the latent graphical model, in lieu of being able to do this exactly in general. The supplementary appendix also describes the nature of the resulting approximation.


💡 Research Summary

The paper tackles a notoriously difficult problem in Bayesian inference for undirected graphical models: the presence of an intractable normalising constant (the partition function) makes direct evaluation of the posterior impossible. While much of the existing literature focuses on fully observed models, this work concentrates on the practically important case where the data are noisy or incomplete observations of a hidden structure, i.e., latent Markov random fields (LMRFs) such as Ising models and exponential random graph models (ERGMs). The authors compare two distinct strategies for approximating the posterior distribution of the model parameters.
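To make the "doubly intractable" structure explicit, the posterior in this setting can be written as follows (the symbols q, Z, g and the discrete sums below are generic notation assumed for illustration, not taken from the paper):

```latex
\[
  p(y \mid \theta) = \frac{q(y \mid \theta)}{Z(\theta)},
  \qquad
  Z(\theta) = \sum_{y} q(y \mid \theta),
\]
where the partition function $Z(\theta)$ sums over all configurations of the
field and is intractable for realistically sized models. With noisy
observations $x$ of the latent field $y$ entering through a noise model
$g(x \mid y)$, the posterior requires a further intractable sum over $y$:
\[
  \pi(\theta \mid x) \;\propto\; p(\theta)
  \sum_{y} g(x \mid y)\,\frac{q(y \mid \theta)}{Z(\theta)}.
\]
```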

The first strategy combines the exchange algorithm (Murray, Ghahramani & MacKay, 2006) with particle Markov chain Monte Carlo (pMCMC) (Andrieu et al., 2010). The exchange algorithm eliminates the need to compute the partition function by introducing an auxiliary data set drawn from the model at the proposed parameter value; the ratio of likelihoods for the observed and auxiliary data cancels the normalising constants. However, drawing exact samples from a latent field is itself intractable. To overcome this, the authors embed a particle filter inside the exchange step: a set of N particles approximates the conditional distribution of the latent variables given the observed noisy data and a candidate parameter vector. The particle filter provides an unbiased estimate of the auxiliary likelihood, preserving the exactness of the exchange move. The paper supplies a formal proof (in the supplementary material) that the resulting Markov chain targets the correct posterior despite the use of particle approximations. The algorithm proceeds by (i) proposing a new parameter θ′, (ii) running a particle filter to obtain an auxiliary latent configuration y∗ and an unbiased estimate of the likelihood under θ′, (iii) computing the Metropolis–Hastings acceptance probability using the exchange ratio, and (iv) updating the particle system for the next iteration. The authors discuss practical choices such as the number of particles, resampling schemes, and proposal distributions, and they provide pseudo‑code for the full procedure.
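The cancellation at the heart of the exchange step can be illustrated on a toy model where exact auxiliary simulation is possible. This is only a minimal sketch, assuming a Gaussian precision model y ~ N(0, 1/θ) with a flat prior on θ > 0 and a symmetric reflected random-walk proposal; it is not the paper's latent-field setting, where the particle filter replaces the exact auxiliary draw:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_q(y, theta):
    """Unnormalised log-density: q(y | theta) = exp(-theta * y**2 / 2)."""
    return -0.5 * theta * np.sum(y ** 2)

# Synthetic observed data; theta is the (unknown) precision.
theta_true = 2.0
y_obs = rng.normal(0.0, 1.0 / np.sqrt(theta_true), size=200)

theta = 1.0  # initial state of the chain
samples = []
for _ in range(5000):
    # (i) propose theta' via a symmetric random walk, reflected to stay > 0
    theta_prop = abs(theta + 0.3 * rng.normal())
    # (ii) draw an auxiliary data set exactly under theta'
    #      (in the paper this exact draw is replaced by a particle filter)
    y_aux = rng.normal(0.0, 1.0 / np.sqrt(theta_prop), size=y_obs.size)
    # (iii) exchange acceptance ratio: Z(theta) and Z(theta') cancel, so
    #       only the unnormalised densities are evaluated
    log_alpha = (log_q(y_obs, theta_prop) + log_q(y_aux, theta)
                 - log_q(y_obs, theta) - log_q(y_aux, theta_prop))
    # (iv) Metropolis-Hastings accept/reject
    if np.log(rng.uniform()) < log_alpha:
        theta = theta_prop
    samples.append(theta)

posterior_mean = np.mean(samples[1000:])  # discard burn-in
```

Because `log_q` is unnormalised, the partition function never appears in the code: the auxiliary draw under θ′ contributes exactly the factor needed to cancel Z(θ′) in the acceptance ratio.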

The second strategy is Approximate Bayesian Computation (ABC). ABC sidesteps the partition function entirely by comparing simulated data to the observed data through low‑dimensional summary statistics. For the Ising model the authors use the total spin‑pair product and the overall magnetisation; for ERGMs they employ edge count and triangle count. An SMC‑ABC scheme is employed: a population of parameter particles is propagated through a sequence of decreasing tolerance levels ε, with weights updated according to the proportion of simulated summaries that fall within ε of the observed summaries. The authors explore different distance metrics, kernel functions, and adaptive ε‑schedules, and they highlight the sensitivity of the results to the choice of summaries.
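A minimal sketch of the ABC idea follows, using plain rejection ABC rather than the SMC-ABC scheme described above, and an Erdős–Rényi edge probability standing in for an ERGM; the graph size, prior, tolerance, and summary here are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

n_nodes = 30
n_pairs = n_nodes * (n_nodes - 1) // 2   # number of possible edges

# "Observed" network: Erdos-Renyi with unknown edge probability p_true.
p_true = 0.15
s_obs = rng.binomial(n_pairs, p_true)    # summary statistic: edge count

# ABC rejection: draw p from the prior, simulate a graph, and keep p
# whenever the simulated summary falls within epsilon of the observed one.
epsilon = 5
accepted = []
for _ in range(20000):
    p = rng.uniform(0.0, 1.0)            # uniform prior on [0, 1]
    s_sim = rng.binomial(n_pairs, p)     # simulated edge count
    if abs(s_sim - s_obs) <= epsilon:
        accepted.append(p)

accepted = np.array(accepted)
posterior_mean = accepted.mean()
```

Shrinking `epsilon` tightens the approximation but lowers the acceptance rate, which is exactly the bias-versus-cost trade-off the authors highlight; the SMC-ABC scheme manages this by moving a particle population through a decreasing ε-schedule instead of rejecting from scratch.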

Empirical evaluation is carried out on synthetic data and on data corrupted with realistic noise. In the Ising experiments a 30×30 lattice is generated with interaction strengths β∈{0.3,0.6,0.9} and external field h=0.1; Gaussian noise (σ=0.5) is added to the spins before observation. In the ERGM experiments a 100‑node network is generated with parameters governing edge density and triangle propensity; observation noise consists of random edge deletions (10 %) and false‑positive edges (5 %). For each model the authors run both algorithms multiple times, reporting posterior means, 95 % credible intervals, effective sample sizes (ESS), and wall‑clock times.
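The two corruption mechanisms described here might be sketched as follows; the grid size, noise level, and edge-flip rates are taken from the text, but the latent field and graph are random stand-ins, not draws from an actual Ising model or ERGM:

```python
import numpy as np

rng = np.random.default_rng(2)

# --- Ising-style corruption: Gaussian noise added to +/-1 spins ----------
spins = rng.choice([-1, 1], size=(30, 30))        # stand-in latent field
noisy_spins = spins + rng.normal(0.0, 0.5, size=spins.shape)

# --- Network corruption: delete 10% of edges, add 5% false positives -----
n = 100
adj = np.triu(rng.random((n, n)) < 0.05, k=1)     # stand-in latent graph
adj = adj | adj.T                                  # symmetrise

keep = rng.random(adj.shape) >= 0.10               # true edge kept w.p. 0.9
flip_in = rng.random(adj.shape) < 0.05             # non-edge added w.p. 0.05
keep = np.triu(keep, k=1); keep = keep | keep.T    # keep masks symmetric
flip_in = np.triu(flip_in, k=1); flip_in = flip_in | flip_in.T

observed = (adj & keep) | (~adj & flip_in)         # observed noisy network
np.fill_diagonal(observed, False)                  # no self-loops
```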

Results show that the exchange‑pMCMC method consistently recovers the true parameters with narrow credible intervals, even in regimes of strong interaction where the posterior is highly multimodal. The particle filter with N≈300–500 particles yields ESS values above 200 and converges within a few thousand MCMC iterations. ABC, while able to locate the high‑probability region of the parameter space, exhibits larger posterior variance and a systematic bias when the tolerance ε is not extremely small. Reducing ε improves accuracy but at a steep computational cost, as the number of required simulations grows dramatically. In terms of runtime, a well‑tuned exchange‑pMCMC run takes roughly 1.5–2 hours on a standard workstation, comparable to the best‑case ABC run (≈1.2 hours) but with substantially higher statistical efficiency.

The discussion balances the two approaches. The exchange‑pMCMC framework retains the theoretical guarantees of exact Bayesian inference (up to Monte‑Carlo error from the particle filter) and can be applied to any latent undirected model where a particle approximation is feasible. Its drawbacks are the need to tune particle numbers, resampling thresholds, and proposal kernels, which can be computationally demanding for very large graphs. ABC’s main advantage is simplicity and the complete avoidance of the partition function, but it relies heavily on the choice of informative summaries and on an ε‑schedule that trades bias against computational effort. The authors suggest several avenues for future work: adaptive particle numbers based on online ESS diagnostics, automatic learning of summary statistics (e.g., via neural networks or mutual information criteria), and distributed implementations to scale the methods to networks with thousands of nodes.

In conclusion, the paper provides a thorough comparative study of two state‑of‑the‑art Bayesian techniques for latent Markov random fields and social network models under noisy observation. By demonstrating that the exchange algorithm, when coupled with particle MCMC, yields accurate and efficient posterior estimates, the authors establish a practical pathway for fully Bayesian analysis of complex undirected models. At the same time, they acknowledge the utility of ABC as a fast exploratory tool, especially when computational resources are limited or when a quick approximation suffices. This work therefore offers valuable methodological guidance for statisticians, physicists, and machine‑vision researchers facing intractable normalising constants in realistic, partially observed settings.