Statistical exponential families: A digest with flash cards

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

This document concisely describes the ubiquitous class of exponential family distributions met in statistics. The first part recalls definitions and summarizes the main properties and the duality with Bregman divergences (all proofs are skipped). The second part lists decompositions and related formulas of common exponential family distributions. We recall the Fisher-Rao-Riemannian geometries and the dual affine connection information geometries of statistical manifolds. We intend to maintain and update this document and catalog by adding new distribution items.

💡 Research Summary

The paper presents a compact yet comprehensive reference on the exponential family of probability distributions, a class that underlies many of the statistical models and machine learning algorithms in common use. It is divided into two main parts. The first part revisits the canonical definition
\[ p(x;\theta) = h(x)\exp\left(\langle\theta, T(x)\rangle - A(\theta)\right) \]
where \(h(x)\) is the base measure, \(\theta\) the natural (canonical) parameter, \(T(x)\) the sufficient statistic, and \(A(\theta)\) the log‑partition (cumulant) function. By emphasizing this formulation, the authors quickly remind the reader of three fundamental facts: (i) the log‑likelihood is an affine function of \(\theta\) minus the convex term \(A(\theta)\), hence concave in \(\theta\), which makes maximum‑likelihood estimation a convex optimization problem under regularity; (ii) the statistic \(T(x)\) captures all the information about \(\theta\) contained in the data, that is, it is sufficient, as follows from the Neyman–Fisher factorization theorem; and (iii) the gradient of \(A\) yields the expectation of the sufficient statistic, \(\eta=\mathbb{E}_{\theta}[T(x)]=\nabla A(\theta)\).
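
As a small illustration of the canonical form and of fact (iii), the sketch below writes the Poisson distribution as an exponential family member and checks numerically that the derivative of the log-partition function matches the expected sufficient statistic. The choice of the Poisson family, the function names, the truncation point `x_max`, and the finite-difference step are all assumptions made for this example and are not taken from the paper. For the Poisson distribution with rate \(\lambda\), the standard decomposition is \(\theta=\log\lambda\), \(T(x)=x\), \(A(\theta)=e^{\theta}\), and \(h(x)=1/x!\).

```python
import math


def log_partition(theta):
    """Log-partition function A(theta) = exp(theta) of the Poisson family."""
    return math.exp(theta)


def poisson_pmf(x, theta):
    """Poisson pmf in canonical form: h(x) * exp(theta * T(x) - A(theta))."""
    log_h = -math.lgamma(x + 1)  # log of the base measure h(x) = 1/x!
    t = x                        # sufficient statistic T(x) = x
    return math.exp(log_h + theta * t - log_partition(theta))


def grad_log_partition(theta, eps=1e-6):
    """Central finite-difference approximation of dA/dtheta (illustrative only)."""
    return (log_partition(theta + eps) - log_partition(theta - eps)) / (2 * eps)


def expected_sufficient_statistic(theta, x_max=100):
    """E_theta[T(x)] by direct summation, truncated at x_max (an assumption)."""
    return sum(x * poisson_pmf(x, theta) for x in range(x_max))


lam = 3.5                  # hypothetical rate parameter
theta = math.log(lam)      # natural parameter

eta_from_gradient = grad_log_partition(theta)
eta_from_expectation = expected_sufficient_statistic(theta)

print("dA/dtheta      :", eta_from_gradient)
print("E_theta[T(x)]  :", eta_from_expectation)
```

Both printed values should agree with the rate `lam` up to numerical error, since the Poisson mean equals \(\lambda = e^{\theta}\). The finite difference stands in for an exact gradient only to keep the sketch dependency-free.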
