Absolute abstraction: a renormalisation group approach
Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Abstraction is the process of extracting the essential features from raw data while ignoring irrelevant details. It is well known that abstraction emerges with depth in neural networks, where deep layers capture abstract characteristics of data by combining lower-level features encoded in shallow layers (e.g. edges). Yet we argue that depth alone is not enough to develop truly abstract representations. We advocate that the level of abstraction crucially depends on how broad the training set is. We address the issue within a renormalisation group approach where a representation is expanded to encompass a broader set of data. We take the unique fixed point of this transformation – the Hierarchical Feature Model – as a candidate for a representation which is absolutely abstract. This theoretical picture is tested in numerical experiments based on Deep Belief Networks and auto-encoders trained on data of different breadth. These show that representations in neural networks approach the Hierarchical Feature Model as the data get broader and as depth increases, in agreement with theoretical predictions.


💡 Research Summary

The paper argues that abstraction in neural networks is not solely a function of depth; instead, the breadth of the training data—its diversity and coverage—plays an equally crucial role. To formalize this claim, the authors map the learning process onto a renormalisation‑group (RG) framework borrowed from statistical physics. In RG, a system undergoes coarse‑graining (removing fine‑scale details) followed by rescaling (restoring the original size). The authors interpret coarse‑graining as the emergence of higher‑level, more abstract features, while rescaling corresponds to expanding the data domain, i.e., increasing breadth.

Mathematically, the internal representation is a probability distribution p(s) over a binary hidden layer s = (s₁,…,sₙ). When the training set is enlarged, a transformation ℛ↑ is applied: first a new random feature s₀ with maximal entropy (p(s₀=1)=p(s₀=0)=½) is added, thereby extending the representation to a wider universe; second, the existing features are reorganised under a principle of informational efficiency. The authors invoke two complementary information‑theoretic principles: maximal relevance (the most informative representation) and maximal entropy (the least biased distribution given constraints). The unique fixed point of this transformation is identified as the Hierarchical Feature Model (HFM).
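The first half of this transformation, adjoining a maximal-entropy feature s₀, can be sketched concretely. The following is a toy illustration, not the paper's implementation: it only performs the extension step p′(s₀, s) = ½·p(s) on a discrete distribution, and omits the subsequent reorganisation of features under maximal relevance and maximal entropy. All function names here are hypothetical.

```python
import math

def extend_representation(p):
    """Adjoin a new maximal-entropy binary feature s0 to a distribution
    p over binary tuples, i.e. p'(s0, s) = 0.5 * p(s).
    NOTE: this sketches only the extension step of the R-up
    transformation; the reorganisation of features is omitted."""
    p_new = {}
    for s, prob in p.items():
        for s0 in (0, 1):
            p_new[(s0,) + s] = 0.5 * prob
    return p_new

def entropy_bits(p):
    """Shannon entropy of a discrete distribution, in bits."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)

# Toy distribution over two binary features.
p = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.15, (1, 1): 0.1}
p_ext = extend_representation(p)
```

Because s₀ is independent and unbiased, the extension adds exactly one bit of entropy to the representation, consistent with "extending the representation to a wider universe".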

The HFM is a maximum‑entropy model characterised by a single sufficient statistic—the average level of detail across features. Because it depends only on this statistic, it is independent of the specific data content and thus embodies “absolute abstraction”: a representation that is universal and data‑independent.
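A minimal sketch of an HFM-style distribution follows. It assumes, as one concrete reading, that the "level of detail" of a configuration s is the index of its highest active feature (0 if no feature is active), and that p(s) ∝ exp(−g·mₛ) for a coupling g; the paper's exact definitions may differ, so treat this as illustrative only.

```python
import itertools
import math

def hfm(n, g):
    """Sketch of a Hierarchical Feature Model-style maximum-entropy
    distribution over n binary features.
    ASSUMPTION: the level of detail m_s is taken as the index of the
    highest active feature (0 for the all-zero state), and
    p(s) is proportional to exp(-g * m_s)."""
    states = list(itertools.product((0, 1), repeat=n))

    def level(s):
        active = [k + 1 for k, bit in enumerate(s) if bit]
        return max(active) if active else 0

    weights = {s: math.exp(-g * level(s)) for s in states}
    Z = sum(weights.values())  # normalisation constant
    return {s: w / Z for s, w in weights.items()}

p = hfm(3, 1.0)
```

Note that p(s) depends on s only through its level of detail, which is what makes this a single-sufficient-statistic, data-independent family: two configurations at the same level are equally probable regardless of which features are active.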

To test the theory, the authors conduct experiments with Deep Belief Networks and auto-encoders, both of limited capacity. They systematically vary two axes: (i) data breadth, by progressively adding new samples from broader domains, and (ii) depth, by adding hidden layers. Results show that increasing depth alone yields richer representations but does not drive the distribution toward the HFM. Expanding breadth alone shifts the representation toward higher-level features, with reduced entropy and increased relevance. Crucially, when depth and breadth are increased together, the hidden-layer distribution converges to the HFM: the KL divergence between the hidden-layer distribution and the HFM becomes negligible, and relevance reaches its theoretical maximum.
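The convergence diagnostic described above can be sketched as an empirical KL-divergence estimate between sampled hidden-layer states and a target model distribution. The helper names below are hypothetical and the paper's actual estimator may differ; this is only the standard plug-in construction.

```python
import itertools
import math
from collections import Counter

def empirical_distribution(samples):
    """Estimate p(s) from observed hidden-layer states (tuples of 0/1)."""
    counts = Counter(samples)
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}

def kl_bits(p, q, eps=1e-12):
    """D_KL(p || q) in bits; eps guards against states with q(s) = 0.
    Small values mean the empirical hidden-layer distribution p is
    close to the model distribution q."""
    return sum(pi * math.log2(pi / max(q.get(s, 0.0), eps))
               for s, pi in p.items() if pi > 0)

# Toy usage: four sampled hidden states over two units.
samples = [(0, 0), (0, 0), (1, 0), (0, 1)]
p_emp = empirical_distribution(samples)
q_uniform = {s: 0.25 for s in itertools.product((0, 1), repeat=2)}
```

In the experiments, q would be the HFM fitted to the same number of hidden units, and the reported result is that kl_bits(p_emp, q_HFM) shrinks toward zero as depth and breadth grow together.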

These findings support the view that true abstraction emerges from the combined effect of depth (hierarchical processing) and breadth (exposure to diverse data). The HFM provides a concrete target for designing abstract representations in artificial systems. The authors suggest future work extending the HFM framework to supervised tasks, multimodal data, and comparisons with hierarchical representations observed in the mammalian cortex, thereby bridging machine learning and cognitive neuroscience.

