Constructing Summary Statistics for Approximate Bayesian Computation: Semi-automatic ABC


Many modern statistical applications involve inference for complex stochastic models, where it is easy to simulate from the models, but impossible to calculate likelihoods. Approximate Bayesian computation (ABC) is a method of inference for such models. It replaces calculation of the likelihood by a step which involves simulating artificial data for different parameter values, and comparing summary statistics of the simulated data to summary statistics of the observed data. Here we show how to construct appropriate summary statistics for ABC in a semi-automatic manner. We aim for summary statistics which will enable inference about certain parameters of interest to be as accurate as possible. Theoretical results show that optimal summary statistics are the posterior means of the parameters. While these cannot be calculated analytically, we use an extra stage of simulation to estimate how the posterior means vary as a function of the data; and then use these estimates of our summary statistics within ABC. Empirical results show that our approach is a robust method for choosing summary statistics, that can result in substantially more accurate ABC analyses than the ad-hoc choices of summary statistics proposed in the literature. We also demonstrate advantages over two alternative methods of simulation-based inference.


💡 Research Summary

Approximate Bayesian Computation (ABC) has become a standard tool for inference when likelihoods are intractable but forward simulation is feasible. The central difficulty in ABC is the choice of summary statistics: poor summaries discard information and lead to biased or imprecise posterior approximations. This paper tackles that problem by proposing a semi‑automatic method that constructs near‑optimal summaries directly from simulated data.
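The procedure outlined in the abstract (a pilot stage of simulation, an estimate of how the posterior mean varies with the data, and then ABC using that estimate as the summary statistic) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy Normal(theta, 1) simulator, the Uniform(-5, 5) prior, the linear regression on the raw data and their squares, and the choice to keep the closest 1% of simulations. The helper names `simulate`, `features`, `fit_summary`, and `rejection_abc` are hypothetical.

```python
import numpy as np

PRIOR_LOW, PRIOR_HIGH = -5.0, 5.0  # assumed uniform prior bounds (illustrative)


def simulate(theta, n_obs, rng):
    """Toy simulator (assumption): one row of n_obs i.i.d. Normal(theta, 1) draws per theta."""
    theta = np.atleast_1d(theta)
    return rng.normal(loc=theta[:, None], scale=1.0, size=(theta.size, n_obs))


def features(y):
    """Regression features f(y): an intercept, the raw data, and their squares."""
    y = np.atleast_2d(y)
    return np.hstack([np.ones((y.shape[0], 1)), y, y**2])


def fit_summary(n_pilot, n_obs, rng):
    """Pilot stage: regress theta on f(y) so the fitted value estimates the posterior mean."""
    theta = rng.uniform(PRIOR_LOW, PRIOR_HIGH, size=n_pilot)
    y = simulate(theta, n_obs, rng)
    beta, *_ = np.linalg.lstsq(features(y), theta, rcond=None)
    return lambda data: features(data) @ beta


def rejection_abc(y_obs, summary, n_sims, keep_frac, rng):
    """Rejection ABC: keep the prior draws whose simulated summary is closest to the observed one."""
    theta = rng.uniform(PRIOR_LOW, PRIOR_HIGH, size=n_sims)
    s_sim = summary(simulate(theta, y_obs.size, rng))
    s_obs = summary(y_obs)[0]
    distance = np.abs(s_sim - s_obs)
    n_keep = int(keep_frac * n_sims)
    return theta[np.argsort(distance)[:n_keep]]


rng = np.random.default_rng(0)
y_obs = simulate(1.5, 20, rng)[0]           # "observed" data with true theta = 1.5
summary = fit_summary(5000, 20, rng)         # learned summary statistic S(y)
posterior_sample = rejection_abc(y_obs, summary, 20000, 0.01, rng)
```

In this toy model the exact posterior is available, so the accepted draws in `posterior_sample` can be checked against it; in the intractable-likelihood settings the paper targets, only the simulator would be available and the regression features would need to be chosen for the problem at hand.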

The authors first establish a theoretical benchmark: the posterior means of the parameters are the optimal summary statistics.

