Multivariate Partial Information Decomposition: Constructions, Inconsistencies, and Alternative Measures
While mutual information effectively quantifies dependence between two variables, it does not by itself reveal the complex, fine-grained interactions among variables, i.e., how multiple sources contribute redundantly, uniquely, or synergistically to a target in multivariate settings. The Partial Information Decomposition (PID) framework was introduced to address this by decomposing the mutual information between a set of source variables and a target variable into fine-grained information atoms such as redundant, unique, and synergistic components. In this work, we review the axiomatic system and desired properties of the PID framework and make three main contributions. First, we resolve the two-source PID case by providing explicit closed-form formulas for all information atoms that satisfy the full set of axioms and desirable properties. Second, we prove that for three or more sources, PID suffers from fundamental inconsistencies: we review the known three-variable counterexample where the sum of atoms exceeds the total information, and extend it to a comprehensive impossibility theorem showing that no lattice-based decomposition can be consistent for all subsets when the number of sources exceeds three. Finally, we deviate from the PID lattice approach to avoid its inconsistencies, and present explicit measures of multivariate unique and synergistic information. Our proposed measures, which rely on new systems of random variables that eliminate higher-order dependencies, satisfy key axioms such as additivity and continuity, provide a robust theoretical explanation of high-order relations, and show strong numerical performance in comprehensive experiments on the Ising model. Our findings highlight the need for a new framework for studying multivariate information decomposition.
💡 Research Summary
The paper provides a comprehensive critique of the Partial Information Decomposition (PID) framework and introduces new measures that avoid the inherent inconsistencies of the traditional lattice‑based approach. After a brief historical motivation, the authors review the axiomatic foundation of PID: the original three axioms (self‑redundancy, non‑negativity, monotonicity) together with later‑added properties such as commutativity, subsystem consistency, and the “whole equals the sum of its parts” (WESP) principle.
The first major contribution is a closed‑form, optimization‑free solution for the two‑source case (S₁, S₂ → T). By defining redundancy as the minimum of the two marginal mutual informations (I_min), unique information as the excess of each marginal over the redundancy, and synergy as the residual after subtracting redundancy and both unique components from the joint mutual information, the authors obtain formulas that satisfy all PID axioms simultaneously. This is the first explicit construction for the two‑source scenario that is both mathematically rigorous and computationally trivial.
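The two‑source formulas above can be sketched directly for discrete variables given as a joint probability table. This is a minimal illustration, not the paper's code; the function names and the AND‑gate example are our own choices.

```python
import numpy as np
from itertools import product

def mutual_info(pxy):
    """I(X;Y) in bits from a 2-D joint probability table."""
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return float((pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])).sum())

def two_source_pid(p):
    """p[s1, s2, t]: joint distribution. Returns (redundancy, unique1, unique2, synergy)."""
    i1 = mutual_info(p.sum(axis=1))                  # I(S1; T), marginalize S2
    i2 = mutual_info(p.sum(axis=0))                  # I(S2; T), marginalize S1
    joint = mutual_info(p.reshape(-1, p.shape[2]))   # I(S1, S2; T)
    r = min(i1, i2)                                  # redundancy = I_min
    u1, u2 = i1 - r, i2 - r                          # unique informations
    syn = joint - r - u1 - u2                        # synergy as the residual
    return r, u1, u2, syn

# AND gate: T = S1 AND S2, uniform binary sources
p = np.zeros((2, 2, 2))
for s1, s2 in product((0, 1), repeat=2):
    p[s1, s2, s1 & s2] = 0.25
print(two_source_pid(p))
```

For the AND gate this yields the classic I_min allocation: both unique terms vanish, redundancy is I(S₁;T) = I(S₂;T) ≈ 0.311 bits, and the remaining 0.5 bits are assigned to synergy, so the four atoms sum exactly to the joint mutual information.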
The second contribution addresses the long‑standing problem that PID breaks down when three or more sources are involved. The authors revisit a known three‑variable counterexample and demonstrate how the lattice‑based allocation of PI‑atoms inevitably violates the WESP principle: the sum of atoms can exceed the total mutual information, leading to negative synergy or redundancy terms. Building on this, they prove a general impossibility theorem: for any system with four or more sources, no decomposition based on the redundancy lattice can satisfy the full set of axioms. The proof leverages subsystem consistency and monotonicity to show that any attempt to assign atoms consistently across all subsets leads to contradictory inequalities.
To overcome these structural limitations, the paper proposes a new framework that does not rely on the redundancy lattice. The key idea is to construct auxiliary random variables that strip away higher‑order dependencies. For any source subset A, an “independent copy” Â is defined such that the joint distribution of (Â, T) preserves only the information that is uniquely attributable to A. Unique information is then measured as I(Â; T), while multivariate synergy is defined as the remainder:
Synergy(S₁,…,S_N → T) = I(S₁,…,S_N; T) – Σ_{A⊂S} Unique(A).
Because the construction explicitly removes overlapping contributions, the resulting measures satisfy additivity (information from disjoint source groups adds up) and continuity (small changes in the distribution cause small changes in the measures). Importantly, they never produce negative values or exceed the total mutual information, thereby respecting the core desiderata that PID struggled with.
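The bookkeeping behind the synergy formula, and the boundedness property just described, can be sketched as follows. This takes the per‑subset unique values as given inputs (the paper's construction of the auxiliary variables Â, which produces those values, is not reproduced here); the function name and the illustrative numbers are ours.

```python
def synergy_from_uniques(total_mi, unique):
    """Synergy(S1,...,SN -> T) = I(S1,...,SN; T) - sum over subsets A of Unique(A).

    `unique` maps each source subset (as a frozenset of source indices) to its
    unique information I(A_hat; T). By construction the unique values should
    not over-count, so the residual stays within [0, total_mi].
    """
    syn = total_mi - sum(unique.values())
    assert 0.0 <= syn <= total_mi, "measures should stay within [0, total MI]"
    return syn

# Hypothetical values (in bits) for a three-source system, illustrative only
unique = {
    frozenset({1}): 0.20, frozenset({2}): 0.10, frozenset({3}): 0.00,
    frozenset({1, 2}): 0.05, frozenset({1, 3}): 0.00, frozenset({2, 3}): 0.05,
}
print(synergy_from_uniques(1.0, unique))  # residual assigned to synergy
```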
The authors validate their proposals on synthetic data and on the two‑dimensional Ising model across a range of temperatures. Near the critical point, redundant and unique information peak, reflecting the emergence of local correlations, while synergy is suppressed. In contrast, when the global magnetization is taken as the target, synergy dominates after the phase transition, capturing genuinely collective behavior. These trends correlate strongly with physical observables such as magnetic susceptibility and specific heat, demonstrating that the new measures have meaningful interpretations in statistical‑physics contexts.
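To give a concrete sense of the kind of quantity estimated in these experiments, here is a minimal sketch (not the authors' experimental code) that samples a 2‑D Ising model with single‑flip Metropolis updates and estimates the mutual information between nearest‑neighbor spins. Lattice size, sweep counts, and temperatures are arbitrary illustrative choices; the pairwise MI is expected to be larger near the critical temperature (T_c ≈ 2.27) than deep in the disordered phase.

```python
import numpy as np

def metropolis_sweep(spins, beta, rng):
    """One Metropolis sweep over an L x L periodic Ising lattice (J = 1)."""
    L = spins.shape[0]
    for _ in range(L * L):
        i, j = rng.integers(0, L, size=2)
        nb = (spins[(i + 1) % L, j] + spins[(i - 1) % L, j]
              + spins[i, (j + 1) % L] + spins[i, (j - 1) % L])
        dE = 2 * spins[i, j] * nb           # energy change of flipping spin (i, j)
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins[i, j] *= -1

def neighbor_mi(T, L=16, equil=400, samples=400, seed=0):
    """Estimate I(s_i; s_j) in bits, pooled over vertical nearest-neighbor bonds."""
    rng = np.random.default_rng(seed)
    spins = rng.choice([-1, 1], size=(L, L))
    beta = 1.0 / T
    for _ in range(equil):                  # equilibration
        metropolis_sweep(spins, beta, rng)
    counts = np.zeros((2, 2))
    for _ in range(samples):                # accumulate pair statistics
        metropolis_sweep(spins, beta, rng)
        a = (spins > 0).astype(int)
        b = np.roll(a, 1, axis=0)           # spin one step below (periodic)
        for x in (0, 1):
            for y in (0, 1):
                counts[x, y] += np.sum((a == x) & (b == y))
    p = counts / counts.sum()
    px, py = p.sum(1, keepdims=True), p.sum(0, keepdims=True)
    nz = p > 0
    return float((p[nz] * np.log2(p[nz] / (px @ py)[nz])).sum())
```

A measure such as the paper's redundancy or unique information would replace this raw pairwise MI, but the sampling pipeline (equilibrate, sample, estimate an information quantity, sweep over temperature) is the same.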
Comparisons with existing PID implementations (I_min, I_ccs, I_broja, etc.) show that the lattice‑free measures consistently satisfy all axioms, avoid pathological negative values, and provide more stable numerical estimates. The paper therefore makes three clear contributions: (1) a complete, axiom‑satisfying solution for the two‑source PID; (2) a rigorous impossibility theorem showing that lattice‑based PID cannot work for three or more sources; and (3) a novel, lattice‑independent definition of multivariate unique and synergistic information that is both theoretically sound and empirically effective. These results have broad implications for neuroscience, causal inference, privacy analysis, and any field that requires a fine‑grained decomposition of information among multiple interacting variables.