On the Structure of Information
We characterize information as risk reduction between knowledge states represented by partitions of the underlying probability space. Entropy corresponds to risk reduction from no (or partial) knowledge to full knowledge about a random variable, while information corresponds to risk reduction from no (or partial) knowledge to partial knowledge. This applies to any information measure that is based on expected loss minimization, such as Bregman information, with Shannon information and variance as prominent examples. In each case, fundamental properties like the chain rule, non-negativity, and the relationship between information and divergence are preserved. Because partitions form a lattice under refinement, our general treatment reveals how information can be decomposed into redundant, unique, and synergistic contributions, a question important in applications from neuroscience to machine learning, yet one for which existing formulations lack consensus on foundational definitions and can violate basic properties such as the chain rule or non-negativity. Redundancy corresponds to Aumann’s common knowledge, synergy to the gap between separately and jointly observed sources, and unique information is necessarily path-dependent, taking different values depending on what is already known. The resulting partial information decomposition is grounded directly in probability theory, avoids treating scalar information quantities as primitive compositional objects, yields non-negative terms by construction, and offers a more fine-grained credit assignment.
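The abstract's lattice-of-partitions picture can be made concrete with a small sketch. The toy sample space, partition encoding, and helper names below are illustrative assumptions, not from the paper: partitions are sets of blocks, refinement means every fine block sits inside a coarse block, the join is the common refinement (what two observers know by pooling observations), and the meet is the finest common coarsening, which the abstract identifies with Aumann's common knowledge.

```python
from itertools import product

# Hypothetical six-point sample space; a partition is a frozenset of
# disjoint frozenset blocks covering OMEGA.
OMEGA = frozenset(range(6))
P = frozenset({frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})})
Q = frozenset({frozenset({0, 1, 2}), frozenset({3, 4, 5})})

def is_refinement(fine, coarse):
    """True if every block of `fine` lies inside some block of `coarse`."""
    return all(any(b <= c for c in coarse) for b in fine)

def join(p, q):
    """Common refinement: intersect blocks pairwise, dropping empties."""
    return frozenset(b & c for b, c in product(p, q) if b & c)

def meet(p, q):
    """Finest common coarsening (the common-knowledge partition):
    transitively merge any blocks of p and q that overlap."""
    blocks = [set(b) for b in p] + [set(b) for b in q]
    merged = True
    while merged:
        merged = False
        for i in range(len(blocks)):
            for j in range(i + 1, len(blocks)):
                if blocks[i] & blocks[j]:
                    blocks[i] |= blocks[j]
                    del blocks[j]
                    merged = True
                    break
            if merged:
                break
    return frozenset(frozenset(b) for b in blocks)

JOIN = join(P, Q)   # refines both P and Q
MEET = meet(P, Q)   # here the only common knowledge is the trivial partition
```

For these two partitions the join is {{0,1}, {2}, {3}, {4,5}}, while the overlapping blocks chain all the way around, so the meet collapses to the single block Ω: the observers share no nontrivial common knowledge.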
💡 Research Summary
The paper “On the Structure of Information” proposes a unifying framework that treats information as the reduction of expected loss (risk) when moving from one knowledge state to another, where knowledge states are represented by partitions of an underlying probability space. By modeling observations as partitions, the authors capture the granularity of information: a finer partition corresponds to a more informative measurement. They introduce a generic loss function L and define the information associated with a transition from partition π to a finer partition π′ as the expected decrease in loss, ΔL(π→π′) = E[L(π)] − E[L(π′)], i.e., the drop in minimal achievable risk between the two knowledge states.
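The risk-reduction definition can be sketched for the squared-error loss, where (as the abstract notes for variance as a Bregman example) the optimal prediction on each block is the conditional mean. The toy distribution, variable, and function names below are illustrative assumptions, not the paper's notation: `risk_sq(π)` is the minimal expected squared loss given knowledge π, so entropy-like total uncertainty is the risk drop from the trivial partition to singletons (the variance of X), and the information in a partial observation is the risk drop from trivial to that partition.

```python
# Hypothetical finite distribution and real-valued random variable.
prob = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}
X = {0: 0.0, 1: 1.0, 2: 1.0, 3: 2.0}

def risk_sq(partition):
    """Minimal expected squared loss when predicting X knowing only
    which block occurred: predict the conditional mean per block."""
    total = 0.0
    for block in partition:
        pb = sum(prob[w] for w in block)
        mean = sum(prob[w] * X[w] for w in block) / pb
        total += sum(prob[w] * (X[w] - mean) ** 2 for w in block)
    return total

trivial = [list(prob)]        # no knowledge: one block
full = [[w] for w in prob]    # full knowledge: singleton blocks
partial = [[0, 1], [2, 3]]    # a coarse partial observation

# "Entropy" under squared loss: risk drop from no to full knowledge = Var(X).
entropy_like = risk_sq(trivial) - risk_sq(full)
# Information of the partial observation: risk drop from trivial to partial.
info = risk_sq(trivial) - risk_sq(partial)
```

Because refining a partition can only shrink the minimal risk, `info` is non-negative by construction, and the two legs trivial→partial and partial→full telescope to the total, mirroring the chain rule the abstract highlights.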