On the information-theoretic structure of distributed measurements
The internal structure of a measuring device, which depends on what its components are and how they are organized, determines how it categorizes its inputs. This paper presents a geometric approach to studying the internal structure of measurements performed by distributed systems such as probabilistic cellular automata. It constructs the quale, a family of sections of a suitably defined presheaf, whose elements correspond to the measurements performed by all subsystems of a distributed system. Using the quale we quantify (i) the information generated by a measurement; (ii) the extent to which a measurement is context-dependent; and (iii) whether a measurement is decomposable into independent submeasurements, which turns out to be equivalent to context-dependence. Finally, we show that only indecomposable measurements are more informative than the sum of their submeasurements.
💡 Research Summary
The paper develops a rigorous, geometry‑inspired framework for analyzing measurements performed by distributed systems such as probabilistic cellular automata, Hopfield networks, and, by extension, neural substrates of cognition. Classical measurement theory treats a device as a single function f : X → ℝ, but this collapses the rich internal organization of real devices into a monolithic map. To capture that internal structure, the authors replace deterministic functions with stochastic maps (Markov matrices) between function spaces V X = {ϕ : X → ℝ}. Objects of the category Stoch are these finite‑dimensional vector spaces; arrows are column‑stochastic matrices, i.e., linear maps whose columns sum to one. Deterministic functions embed faithfully via a functor V that sends a set‑map f to the stochastic matrix V f taking each Dirac basis vector δₓ to δ_{f(x)}.
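To make the embedding concrete, here is a minimal NumPy sketch (our illustration; the helper name `V` and the parity example are not from the paper):

```python
import numpy as np

def V(f, X, Y):
    """Embed a set-map f : X -> Y as a column-stochastic matrix.

    Column x is the Dirac vector delta_{f(x)}, so V(f) sends the basis
    vector delta_x of V X to delta_{f(x)} in V Y.
    """
    M = np.zeros((len(Y), len(X)))
    for j, x in enumerate(X):
        M[Y.index(f(x)), j] = 1.0
    return M

# Example: parity map collapsing {0, 1, 2} onto {0, 1}.
X, Y = [0, 1, 2], [0, 1]
Vf = V(lambda x: x % 2, X, Y)
assert np.allclose(Vf.sum(axis=0), 1.0)  # every column sums to one
```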
A crucial operation is the dual (denoted T), defined as the transpose followed by column‑wise renormalization. The dual implements Bayes’ rule with respect to a uniform prior, thereby generalizing the ordinary inverse image f⁻¹ to the probabilistic setting: T(V f) sends each output y to the posterior distribution p(x | y), which is uniform over the pre‑image f⁻¹(y). The dual preserves stochasticity, and applying it twice to an embedded deterministic map recovers the original: T(T(V f)) = V f.
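Continuing the sketch above, the dual is two lines of NumPy (again our illustration, reusing `Vf`); note how it recovers the uniform posterior over a pre‑image:

```python
def dual(M):
    """Dual of a stochastic matrix: transpose, then renormalize each column.

    Under a uniform prior this is Bayes' rule: column y of dual(V(f)) is
    the uniform distribution over the pre-image f^{-1}(y). Assumes every
    output is attainable, so no column of the transpose sums to zero.
    """
    Mt = M.T
    return Mt / Mt.sum(axis=0, keepdims=True)

# The pre-image of output 0 under the parity map is {0, 2}:
print(dual(Vf)[:, 0])                    # [0.5, 0.0, 0.5]
assert np.allclose(dual(dual(Vf)), Vf)   # involutive on embedded set-maps
```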
The authors then introduce distributed dynamical systems. Such a system D consists of a finite directed graph whose vertices (called “occasions”) represent space‑time points: a cell of a cellular automaton, or a neural unit, at a particular time. Each vertex vₗ carries an input alphabet Sₗ (the product of the output alphabets of its source vertices) and an output alphabet Aₗ, together with a stochastic mechanism mₗ : V Sₗ → V Aₗ. Unrolling a cellular automaton over a finite time window yields a concrete instance of D, as sketched below.
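As a rough data‑structure sketch (the `Occasion` container and the two‑cell XOR example are our scaffolding, not the paper’s definitions):

```python
from dataclasses import dataclass

@dataclass
class Occasion:
    """One space-time vertex: its sources, output alphabet, and mechanism."""
    name: str
    sources: list          # names of the source occasions
    alphabet: list         # output alphabet A_l
    mechanism: np.ndarray  # column-stochastic map V S_l -> V A_l

# Unrolling one step of a two-cell automaton: x@t0 and y@t0 feed z@t1,
# whose input alphabet is the product of its sources' output alphabets.
bits = [0, 1]
pairs = [(a, b) for a in bits for b in bits]       # S_z = A_x × A_y
xor_mech = V(lambda s: s[0] ^ s[1], pairs, bits)   # m_z : V S_z -> V A_z
z = Occasion("z@t1", sources=["x@t0", "y@t0"], alphabet=bits,
             mechanism=xor_mech)
```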
To study how measurements decompose, the paper defines a Boolean‑lattice category Sys_D of subsystems. An object C ⊆ V_D × V_D is a set of ordered pairs of vertices; the source and target projections of C determine its overall input alphabet S_C and output alphabet A_C. The category Meas_D of measuring devices has objects Hom_Stoch(V A_C, V S_C), and its morphisms are marginalization (restriction) maps built from projections and their duals.
The central construction is the structure presheaf F : Sys_D^{op} → Meas_D, which assigns to each subsystem C the space of stochastic maps that describe its measurement. Restriction along inclusions C₁ ⊆ C₂ corresponds to marginalizing out the extra inputs/outputs. Theorem 4 shows that F satisfies the gluing axiom (any compatible family of local measurements can be glued to a global one) but fails uniqueness of descent: many joint distributions share the same set of marginals. This non‑uniqueness is precisely the source of contextual effects.
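The failure of uniqueness is visible already for two bits: the two joints below (our toy example) have identical marginals, so a compatible family of marginal measurements does not pin down the global one:

```python
# Perfectly correlated vs. independent pairs of bits on {0,1} x {0,1}.
correlated  = np.array([[0.5, 0.0], [0.0, 0.5]])
independent = np.array([[0.25, 0.25], [0.25, 0.25]])
for joint in (correlated, independent):
    print(joint.sum(axis=1), joint.sum(axis=0))  # both: [0.5 0.5] [0.5 0.5]
```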
Two quantitative notions are built on top of the presheaf. First, effective information (Proposition 5) measures how much a measurement reduces uncertainty relative to a baseline (e.g., the null system or a coarser subsystem): it is the Kullback–Leibler divergence between the posterior over inputs produced by the actual measurement and the posterior produced by the baseline, and can be read as the precision of the measurement. Second, entanglement (Section 5, Theorem 9) quantifies the obstruction, in bits, to decomposing a measurement into independent submeasurements. Formally, it is the divergence between the joint posterior and the product of the constituent subsystems’ posteriors — a mutual information between the parts — computed under the uniform prior. Zero entanglement implies that the joint measurement factorizes into a product of its parts; a positive value indicates genuine contextual dependence: information supplied by one part is needed to interpret another.
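Both quantities reduce to small Kullback–Leibler computations. Continuing the XOR example under a uniform prior (our numerical illustration, not the paper’s own calculation):

```python
def kl(p, q):
    """KL divergence in bits, with the 0 * log 0 = 0 convention."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

# Effective information on observing XOR output 0: the posterior (via the
# dual) is uniform on the pre-image {(0,0), (1,1)}, while the baseline
# posterior is uniform on all four input pairs.
posterior = dual(xor_mech)[:, 0]
prior = np.full(4, 0.25)
print(kl(posterior, prior))                # 1.0 bit of effective information

# Entanglement: divergence between the joint posterior over (x, y) and the
# product of its marginals; nonzero means the measurement is indecomposable.
joint = posterior.reshape(2, 2)            # rows: x, columns: y
product = np.outer(joint.sum(axis=1), joint.sum(axis=0))
print(kl(joint.ravel(), product.ravel()))  # 1.0 bit of entanglement
```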
The authors prove that a measurement can be more informative than the sum of its parts only when its entanglement is non‑zero (Theorem 9). Thus, context‑dependence and indecomposability are inseparable: a fully decomposable measurement never exceeds the additive information of its components, whereas an indecomposable (entangled) measurement can. This formal result mirrors the intuition that cortical processing integrates diverse contextual signals into a unified representation that is richer than any isolated neuron or local assembly could provide.
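For contrast, a device that reads each bit independently is fully decomposable; in the sketch below (again our toy construction) entanglement vanishes and effective information is exactly additive:

```python
# Identity mechanism: the output reports both input bits separately.
ident_mech = V(lambda s: s, pairs, pairs)
post = dual(ident_mech)[:, 0]              # observing output (0, 0)
joint = post.reshape(2, 2)
product = np.outer(joint.sum(axis=1), joint.sum(axis=0))
print(kl(post, np.full(4, 0.25)))          # 2.0 bits = 1 + 1 from the parts
print(kl(joint.ravel(), product.ravel()))  # 0.0 bits of entanglement
```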
Illustrative examples include deterministic functions f : X → Y and a binary operation g : X × Y → Z, which demonstrate how the dual, effective information, and entanglement are computed. Detailed calculations for cellular automata and Hopfield networks are deferred to companion papers.