On the computability of conditional probability

Reading time: 5 minutes

📝 Original Info

  • Title: On the computability of conditional probability
  • ArXiv ID: 1005.3014
  • Date: 2019-11-19
  • Authors: Nathanael L. Ackerman, Cameron E. Freer, and Daniel M. Roy

📝 Abstract

As inductive inference and machine learning methods in computer science see continued success, researchers are aiming to describe ever more complex probabilistic models and inference algorithms. It is natural to ask whether there is a universal computational procedure for probabilistic inference. We investigate the computability of conditional probability, a fundamental notion in probability theory and a cornerstone of Bayesian statistics. We show that there are computable joint distributions with noncomputable conditional distributions, ruling out the prospect of general inference algorithms, even inefficient ones. Specifically, we construct a pair of computable random variables in the unit interval such that the conditional distribution of the first variable given the second encodes the halting problem. Nevertheless, probabilistic inference is possible in many common modeling settings, and we prove several results giving broadly applicable conditions under which conditional distributions are computable. In particular, conditional distributions become computable when measurements are corrupted by independent computable noise with a sufficiently smooth bounded density.

📄 Full Content

The use of probability to reason about uncertainty has wide-ranging applications in science and engineering, and some of the most important computational problems relate to conditioning, which is used to perform Bayesian inductive reasoning in probabilistic models. As researchers have faced more complex phenomena, their representations have also increased in complexity, which in turn has led to more complicated inference algorithms. It is natural to ask whether there is a universal inference algorithm: in other words, whether it is possible to automate probabilistic reasoning via a general procedure that can compute conditional probabilities for an arbitrary computable joint distribution.

We demonstrate that there are computable joint distributions with noncomputable conditional distributions. As a consequence, no general algorithm for computing conditional probabilities can exist. Of course, the fact that generic algorithms cannot exist for computing conditional probabilities does not rule out the possibility that large classes of distributions may be amenable to automated inference. The challenge for mathematical theory is to explain the widespread success of probabilistic methods and characterize the circumstances when conditioning is possible. In this vein, we describe broadly applicable conditions under which conditional probabilities are computable.
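
To make the positive direction concrete: when an observation is a computable random variable corrupted by independent noise whose density is computable and bounded, conditioning can be carried out by rejection sampling against the noise density. The sketch below (in Python, with helper names of my own; an illustration in the spirit of the paper's positive result, not its construction) accepts a prior draw x with probability proportional to the likelihood of the observed corruption:

```python
import math
import random

def sample_posterior(sample_prior, noise_density, density_bound, y, n=1000):
    """Rejection sampler: draw x from the prior and accept it with
    probability noise_density(y - x) / density_bound, where density_bound
    upper-bounds the noise density.  Accepted draws are samples from the
    conditional distribution of X given X + noise = y, since acceptance
    reweights the prior by the likelihood of the observed corruption."""
    accepted = []
    while len(accepted) < n:
        x = sample_prior()
        if random.random() * density_bound <= noise_density(y - x):
            accepted.append(x)
    return accepted

# Toy instance: X ~ Uniform[0, 1] observed through Gaussian noise (sd 0.1).
sd = 0.1
gauss = lambda u: math.exp(-u * u / (2 * sd * sd)) / (sd * math.sqrt(2 * math.pi))
post = sample_posterior(random.random, gauss, gauss(0.0), y=0.3)
print(sum(post) / len(post))  # posterior mean estimate, concentrated near 0.3
```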

We begin by describing a setting, probabilistic programming, that motivates the search for these results. We proceed to describe the technical frameworks for our results, computable probability theory and the modern formulation of conditional probability. We then highlight related work, and end the introduction with a summary of the paper's results.

Within probabilistic artificial intelligence and machine learning, probabilistic programming provides formal languages and algorithms for describing and computing answers from probabilistic models. Probabilistic programming languages themselves build on modern programming languages and their facilities for recursion, abstraction, modularity, etc., to enable practitioners to define intricate, in some cases infinite-dimensional, models by implementing a generative process that produces an exact sample from the model’s joint distribution. Probabilistic programming languages have been the focus of a long tradition of research within programming languages, model checking, and formal methods. For some of the early approaches within the AI and machine learning community, see, e.g., the languages PHA [Poole 1991], IBAL [Pfeffer 2001], Markov Logic [Richardson and Domingos 2006], λ⊙ [Park et al. 2008], Church [Goodman et al. 2008], HANSEI [Kiselyov and Shan 2009], and Infer.NET [Minka et al. 2010].
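
As a small illustration (a hypothetical example, not drawn from the paper), a model in such a language is simply a program whose execution yields an exact sample from the joint distribution:

```python
import random

def joint_sample():
    """A generative probabilistic program: running it produces one exact
    sample from the joint distribution of (bias, flips).  Hypothetical
    Beta-Bernoulli example."""
    bias = random.betavariate(1, 1)                      # latent coin bias
    flips = [random.random() < bias for _ in range(10)]  # flips, i.i.d. given bias
    return bias, flips

print(joint_sample())
```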

In many of these languages, one can easily represent the higher-order stochastic processes (e.g., distributions on data structures, distributions on functions, and distributions on distributions) that are essential building blocks in modern nonparametric Bayesian statistics. In fact, the most expressive such languages are each capable of describing the same robust class as the others: the class of computable distributions, which delineates those from which a probabilistic Turing machine can sample to arbitrary accuracy.
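
In this sense, a distribution is computable when a randomized program, given an accuracy parameter k, outputs a point within 2^-k of an exact sample. A minimal sketch for the uniform distribution, under my own conventions:

```python
import random

def sample_uniform(k):
    """Sample Uniform[0, 1] to accuracy 2^-k: emit k fair bits and return
    the dyadic rational they encode, which is within 2^-k of the exact
    sample whose binary expansion begins with those bits."""
    return sum(random.getrandbits(1) * 2 ** -(i + 1) for i in range(k))

# A faithful probabilistic Turing machine would reuse one random bit stream
# across accuracy requests; this sketch draws fresh bits on each call.
print(sample_uniform(10), sample_uniform(30))
```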

Traditionally, inference algorithms for probabilistic models have been derived and implemented by hand. In contrast, probabilistic programming systems have introduced varying degrees of support for computing conditional distributions. Given the rate of progress toward broadening the scope of these algorithms, one might hope that there would eventually be a generic algorithm supporting the entire class of computable distributions.

Despite recent progress towards such a general algorithm, support for conditioning on continuous random variables has remained incomplete. Our results explain why this is necessarily the case.

In order to study computable probability theory and the computability of conditioning, we work within the framework of Type-2 Theory of Effectivity (TTE) and use appropriate representations for topological and measurable objects such as distributions, random variables, and maps between them. This framework builds upon and contains as a special case ordinary Turing computation on discrete spaces, and gives us a basis for precisely describing the operations that probabilistic programming languages are capable of performing.
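
For intuition, a TTE-style representation of a computable real is a program that, on input k, returns a rational within 2^-k of it, and computable maps operate on such programs. The encoding below is an illustrative choice of mine, not the paper's:

```python
from fractions import Fraction

# Illustrative representation: a computable real is a function
# k -> rational q_k with |q_k - x| <= 2^-k.

def sqrt2(k):
    """Approximate sqrt(2) to within 2^-k by bisection over rationals."""
    lo, hi = Fraction(1), Fraction(2)
    while hi - lo > Fraction(1, 2 ** k):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if mid * mid < 2 else (lo, mid)
    return lo

def add(x, y):
    """Addition is computable on representations: to answer at accuracy
    2^-k, query both arguments at accuracy 2^-(k+1)."""
    return lambda k: x(k + 1) + y(k + 1)

print(float(add(sqrt2, sqrt2)(20)))  # ~2.8284271, i.e. 2 * sqrt(2)
```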

In particular, we study the computability of distributions on computable Polish spaces including, e.g., certain spaces of distributions on distributions. In Section 2 we present the necessary definitions and results from computable probability theory.

For an experiment with a discrete set of outcomes, computing conditional probabilities is, in principle, straightforward: it is simply a ratio of probabilities. However, when conditioning on the value of a continuous random variable, this ratio is undefined. Furthermore, in modern Bayesian statistics…

…(Full text truncated)…
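
The contrast in the truncated passage above can be made concrete. In the discrete case the conditional probability is a computable ratio; in the continuous case the same ratio degenerates to 0/0 (a toy sketch, with a hypothetical dice example):

```python
from fractions import Fraction

# Discrete case: conditioning is a computable ratio of probabilities.
outcomes = [(d1, d2) for d1 in range(1, 7) for d2 in range(1, 7)]  # two dice

def prob(event):
    return sum(Fraction(1, 36) for o in outcomes if event(o))

# P(first die = 6 | sum = 8) = P(first = 6 and sum = 8) / P(sum = 8) = 1/5
print(prob(lambda o: o[0] == 6 and sum(o) == 8) / prob(lambda o: sum(o) == 8))

# Continuous case: if Y has a continuous distribution then P(Y = y) = 0, so
# the ratio P(A and Y = y) / P(Y = y) is 0/0 and the elementary definition
# fails; conditional distributions must instead be defined indirectly (e.g.,
# via disintegration), which is where the computability questions arise.
```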

Reference

This content is AI-processed based on ArXiv data.
