On the Structure of Information
We characterize information as risk reduction between knowledge states represented by partitions of the underlying probability space. Entropy corresponds to risk reduction from no (or partial) knowledge to full knowledge about a random variable, while information corresponds to risk reduction from no (or partial) knowledge to partial knowledge. This applies to any information measure that is based on expected loss minimization, such as Bregman information, with Shannon information and variance as prominent examples. In each case, fundamental properties like the chain rule, non-negativity, and the relationship between information and divergence are preserved. Because partitions form a lattice under refinement, our general treatment reveals how information can be decomposed into redundant, unique, and synergistic contributions, a question important in applications from neuroscience to machine learning, yet one for which existing formulations lack consensus on foundational definitions and can violate basic properties such as the chain rule or non-negativity. Redundancy corresponds to Aumann’s common knowledge, synergy to the gap between separately and jointly observed sources, and unique information is necessarily path-dependent, taking different values depending on what is already known. The resulting partial information decomposition is grounded directly in probability theory, avoids treating scalar information quantities as primitive compositional objects, yields non-negative terms by construction, and offers a more fine-grained credit assignment.
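The abstract's lattice-of-partitions picture can be made concrete with a small sketch. The toy sample space, partition encoding, and helper names below are illustrative assumptions, not from the paper: partitions are sets of blocks, refinement means every fine block sits inside a coarse block, the join is the common refinement (what two observers know by pooling observations), and the meet is the finest common coarsening, which the abstract identifies with Aumann's common knowledge.

```python
from itertools import product

# Hypothetical six-point sample space; a partition is a frozenset of
# disjoint frozenset blocks covering OMEGA.
OMEGA = frozenset(range(6))
P = frozenset({frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})})
Q = frozenset({frozenset({0, 1, 2}), frozenset({3, 4, 5})})

def is_refinement(fine, coarse):
    """True if every block of `fine` lies inside some block of `coarse`."""
    return all(any(b <= c for c in coarse) for b in fine)

def join(p, q):
    """Common refinement: intersect blocks pairwise, dropping empties."""
    return frozenset(b & c for b, c in product(p, q) if b & c)

def meet(p, q):
    """Finest common coarsening (the common-knowledge partition):
    transitively merge any blocks of p and q that overlap."""
    blocks = [set(b) for b in p] + [set(b) for b in q]
    merged = True
    while merged:
        merged = False
        for i in range(len(blocks)):
            for j in range(i + 1, len(blocks)):
                if blocks[i] & blocks[j]:
                    blocks[i] |= blocks[j]
                    del blocks[j]
                    merged = True
                    break
            if merged:
                break
    return frozenset(frozenset(b) for b in blocks)

JOIN = join(P, Q)   # refines both P and Q
MEET = meet(P, Q)   # here the only common knowledge is the trivial partition
```

For these two partitions the join is {{0,1}, {2}, {3}, {4,5}}, while the overlapping blocks chain all the way around, so the meet collapses to the single block Ω: the observers share no nontrivial common knowledge.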
💡 Research Summary
The paper “On the Structure of Information” proposes a unifying framework that treats information as the reduction of expected loss (risk) when moving from one knowledge state to another, where knowledge states are represented by partitions of an underlying probability space. By modeling observations as partitions, the authors capture the granularity of information: a finer partition corresponds to a more informative measurement. They introduce a generic loss function L and define the information associated with a transition from partition π to a finer partition π′ as the expected decrease in loss, ΔL(π→π′) = E[L(π)] − E[L(π′)], i.e., the drop in minimal achievable risk between the two knowledge states.
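The risk-reduction definition can be sketched for the squared-error loss, where (as the abstract notes for variance as a Bregman example) the optimal prediction on each block is the conditional mean. The toy distribution, variable, and function names below are illustrative assumptions, not the paper's notation: `risk_sq(π)` is the minimal expected squared loss given knowledge π, so entropy-like total uncertainty is the risk drop from the trivial partition to singletons (the variance of X), and the information in a partial observation is the risk drop from trivial to that partition.

```python
# Hypothetical finite distribution and real-valued random variable.
prob = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}
X = {0: 0.0, 1: 1.0, 2: 1.0, 3: 2.0}

def risk_sq(partition):
    """Minimal expected squared loss when predicting X knowing only
    which block occurred: predict the conditional mean per block."""
    total = 0.0
    for block in partition:
        pb = sum(prob[w] for w in block)
        mean = sum(prob[w] * X[w] for w in block) / pb
        total += sum(prob[w] * (X[w] - mean) ** 2 for w in block)
    return total

trivial = [list(prob)]        # no knowledge: one block
full = [[w] for w in prob]    # full knowledge: singleton blocks
partial = [[0, 1], [2, 3]]    # a coarse partial observation

# "Entropy" under squared loss: risk drop from no to full knowledge = Var(X).
entropy_like = risk_sq(trivial) - risk_sq(full)
# Information of the partial observation: risk drop from trivial to partial.
info = risk_sq(trivial) - risk_sq(partial)
```

Because refining a partition can only shrink the minimal risk, `info` is non-negative by construction, and the two legs trivial→partial and partial→full telescope to the total, mirroring the chain rule the abstract highlights.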