Non-linear regression models for Approximate Bayesian Computation
Approximate Bayesian inference on the basis of summary statistics is well-suited to complex problems for which the likelihood is either mathematically or computationally intractable. However, methods based on rejection suffer from the curse of dimensionality as the number of summary statistics increases. Here we propose a machine-learning approach to the estimation of the posterior density by introducing two innovations. The new method fits a nonlinear conditional heteroscedastic regression of the parameter on the summary statistics, and then adaptively improves estimation using importance sampling. The new algorithm is compared with state-of-the-art approximate Bayesian methods and achieves a considerable reduction of the computational burden in two examples: inference in statistical genetics and in a queueing model.
💡 Research Summary
Approximate Bayesian Computation (ABC) has become a cornerstone for Bayesian inference when the likelihood is intractable, but traditional rejection‑ABC suffers dramatically from the curse of dimensionality as the number of summary statistics grows. This paper introduces a novel machine‑learning‑driven framework that replaces the simple distance‑based acceptance rule with a full probabilistic model of the relationship between parameters θ and summary statistics s. Specifically, the authors propose a nonlinear conditional heteroscedastic regression of the form θ = m(s) + σ(s)·ε, where both the conditional mean m(s) and the conditional variance σ²(s) are estimated using flexible nonlinear learners such as neural networks, random forests, or Gaussian processes. The error term ε is assumed standard normal, turning the regression into a conditional density estimator q(θ|s) = N(m(s), σ²(s)).
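The two-stage fit described above (estimate the conditional mean m(s), then model the conditional variance σ²(s) from the squared residuals) can be sketched in a few lines. This is a minimal numpy illustration, not the authors' implementation: the toy simulator, the cubic-feature least-squares learner (standing in for the flexible nonlinear learners mentioned above), and all constants are assumptions for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy simulator: summary s = theta plus heteroscedastic noise.
def simulate(theta):
    return theta + 0.1 * (1 + theta**2) * rng.standard_normal(theta.shape)

theta = rng.uniform(-2, 2, size=5000)   # draws from a uniform prior
s = simulate(theta)

# Simple nonlinear learner stand-in: cubic polynomial features + least squares.
def design(x):
    return np.column_stack([np.ones_like(x), x, x**2, x**3])

# Step 1: fit the conditional mean m(s) of theta given s.
X = design(s)
beta_m, *_ = np.linalg.lstsq(X, theta, rcond=None)
m = X @ beta_m

# Step 2: fit log squared residuals to obtain sigma^2(s), the
# heteroscedastic part of theta = m(s) + sigma(s) * eps.
resid2 = (theta - m) ** 2
beta_v, *_ = np.linalg.lstsq(X, np.log(resid2 + 1e-12), rcond=None)

def cond_density_params(s_obs):
    """Return (mean, sd) of the learned Gaussian q(theta | s_obs)."""
    x = design(np.atleast_1d(float(s_obs)))
    return (x @ beta_m)[0], float(np.sqrt(np.exp(x @ beta_v))[0])

mu, sigma = cond_density_params(1.0)
```

Plugging the observed summary into `cond_density_params` yields the mean and standard deviation of the Gaussian conditional density q(θ|s_obs) used as a proposal in the next step.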
Once this conditional density is learned from a set of simulated (θ_i, s_i) pairs drawn from the prior, the observed summary statistics s_obs are fed into the model to obtain a proposal distribution q(θ|s_obs). The second key component is an adaptive importance‑sampling step that re‑weights draws from q(θ|s_obs) to correct for the approximation error. For each proposal θ′, a new simulation generates s′, and the importance weight is computed as w(θ′) ∝ p(θ′)·K(‖s′‑s_obs‖)/q(θ′), where K is a kernel measuring the distance between simulated and observed summaries. Normalising the weights yields a consistent (self‑normalised) importance‑sampling approximation to the posterior. The procedure can be iterated: after each importance‑sampling round the regression model is updated with the newly weighted samples, yielding progressively better proposals.
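The re-weighting step above can be sketched as follows. This is a hedged illustration under simplifying assumptions, not the paper's code: the Gaussian proposal parameters `mu_q, sigma_q` are taken as given from a previously fitted regression, the prior is uniform (so its constant density drops out of the weight ratio inside its support), and both the toy simulator and the Gaussian kernel bandwidth `h` are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical simulator: summary s = theta plus Gaussian noise.
def simulate(theta):
    return theta + 0.3 * rng.standard_normal(theta.shape)

s_obs = 0.5
mu_q, sigma_q = 0.5, 0.4         # learned proposal q(theta | s_obs), assumed given
prior_lo, prior_hi = -2.0, 2.0   # uniform prior support
h = 0.2                          # kernel bandwidth (assumption)

# Draw proposals from q and simulate a fresh summary for each.
theta_p = rng.normal(mu_q, sigma_q, size=2000)
s_p = simulate(theta_p)

# w(theta') ∝ p(theta') * K(|s' - s_obs|) / q(theta'); uniform prior
# contributes only its support indicator. Work in log space for stability.
log_q = -0.5 * ((theta_p - mu_q) / sigma_q) ** 2 - np.log(sigma_q)
log_K = -0.5 * ((s_p - s_obs) / h) ** 2
in_prior = (theta_p >= prior_lo) & (theta_p <= prior_hi)
log_w = np.where(in_prior, log_K - log_q, -np.inf)

w = np.exp(log_w - log_w.max())  # subtract max before exponentiating
w /= w.sum()                     # self-normalised weights

post_mean = np.sum(w * theta_p)  # weighted posterior summary
ess = 1.0 / np.sum(w ** 2)       # effective sample size diagnostic
```

The weighted draws `(theta_p, w)` approximate the posterior; iterating would refit the regression on these weighted samples to sharpen the proposal.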
The authors evaluate the method on two realistic problems. The first is a statistical‑genetics scenario involving multi‑locus association mapping, where dozens of summary statistics (allele frequencies, linkage‑disequilibrium measures, test statistics) are used. The second is a classic M/M/1 queueing model where the goal is to infer arrival and service rates from observed waiting‑time moments. In both cases, the proposed nonlinear‑regression‑plus‑importance‑sampling algorithm (NR‑IS) is benchmarked against state‑of‑the‑art ABC approaches: ABC‑SMC, ABC‑MCMC, and the linear‑regression correction of Beaumont et al. (2002). Results show that NR‑IS attains comparable posterior accuracy—measured by credible‑interval coverage and mean squared error—while reducing the required number of model simulations by a factor of 5 to 10. Notably, the acceptance rate in high‑dimensional summary spaces improves dramatically (rejection rates drop from >90 % to <20 %), confirming that the learned conditional density captures the complex, possibly heteroscedastic relationship between θ and s.
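For the queueing example, the forward simulation that produces waiting-time summaries can be written with the standard Lindley recursion. This is a generic M/M/1 sketch under assumed rates and summary choices (mean, standard deviation, and 90th percentile of waiting times), not the authors' experimental setup.

```python
import numpy as np

rng = np.random.default_rng(2)

def mm1_waiting_summaries(lam, mu, n=2000):
    """Simulate n customers of an M/M/1 queue via the Lindley recursion
    W_i = max(0, W_{i-1} + S_{i-1} - A_i) and return summary statistics
    of the waiting times (moments and an upper quantile)."""
    inter = rng.exponential(1.0 / lam, size=n)  # inter-arrival times A_i
    serv = rng.exponential(1.0 / mu, size=n)    # service times S_i
    w = np.empty(n)
    w[0] = 0.0
    for i in range(1, n):
        w[i] = max(0.0, w[i - 1] + serv[i - 1] - inter[i])
    return np.array([w.mean(), w.std(), np.quantile(w, 0.9)])

# Illustrative rates: traffic intensity rho = lam / mu = 0.5.
s = mm1_waiting_summaries(lam=0.5, mu=1.0)
```

In an ABC run this simulator would be called once per prior draw of (λ, μ) to build the (θ_i, s_i) training pairs for the regression step.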
A thorough discussion highlights several practical considerations. The choice of regression learner and its hyper‑parameters strongly influences performance; overly complex models may overfit the simulated training data, inflating the variance of importance weights. The authors recommend cross‑validation, regularisation, or ensemble methods to mitigate overfitting. They also note that the framework is agnostic to the specific summary statistics and can be combined with automatic statistic selection or dimensionality‑reduction techniques. Extensions such as Bayesian neural networks for a fully Bayesian conditional density, or parallel/distributed implementations to exploit modern high‑performance computing resources, are suggested as future work.
In summary, this paper makes a substantial contribution to the ABC literature by (1) introducing a flexible, heteroscedastic, nonlinear conditional regression to model the posterior directly from simulated data, and (2) coupling this model with adaptive importance sampling to correct residual bias. The resulting algorithm dramatically alleviates the curse of dimensionality, cuts computational cost, and retains high inferential fidelity, thereby expanding the practical applicability of ABC to complex scientific models in genetics, queueing theory, and beyond.