A Thermodynamic Theory of Learning Part II: Critical Period Closure and Continual Learning Failure


Learning performed over finite time is inherently irreversible. In Part I of this series, we modeled learning as a transport process in the space of parameter distributions and derived the Epistemic Speed Limit (ESL), which lower-bounds entropy production under finite-time dynamics. In this work (Part II), we show that irreversibility imposes a geometric restriction on future adaptability through the compositional structure of learning dynamics. Successive learning phases compose multiplicatively as transport maps, and their Jacobians form a semigroup whose rank and singular values are submultiplicative. As a result, the dynamically usable degrees of reconfiguration can only decrease under composition. We formalize the remaining adaptability of a model as its compatible effective rank, defined as the log-volume of task-preserving directions that remain dynamically accessible. Although task performance may remain unchanged, finite-time learning can progressively reduce this reconfiguration capacity. We prove a capacity-threshold criterion for continual learning: let m_B denote the stable rank of the Hessian of a new task B restricted to the task-preserving manifold of a previously learned task A. If m_B exceeds the residual compatible effective rank, then task B is trajectory-level incompatible with task A: any adaptation sufficient to learn B necessarily induces forgetting of A. Catastrophic forgetting thus arises not from the absence of multi-task solutions, but from irreversible loss of reconfiguration capacity under compositional learning dynamics. This establishes a trajectory-level capacity limit for continual learning.
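The composition argument in the abstract rests on two standard singular-value facts, sketched below. The symbols follow the abstract (J_A, J_B for phase Jacobians, m_B for the stable rank of task B's restricted Hessian); writing the residual compatible effective rank as R_compat is notation introduced here for readability, not the paper's.

```latex
% Two learning phases compose multiplicatively as transport maps,
% so their Jacobians multiply:
J_{A \to B} \;=\; J_B \, J_A .

% Rank and singular values are submultiplicative, so dynamically
% usable directions can be lost under composition but never regained:
\operatorname{rank}(J_B J_A) \;\le\; \min\bigl\{\operatorname{rank} J_A,\; \operatorname{rank} J_B\bigr\},
\qquad
\sigma_i(J_B J_A) \;\le\; \sigma_1(J_B)\,\sigma_i(J_A).

% Capacity-threshold criterion: if the stable rank m_B of task B's
% Hessian, restricted to the task-A-preserving manifold, exceeds the
% residual compatible effective rank R_compat, sufficient adaptation
% to B must leave the task-preserving manifold of A:
m_B \;>\; R_{\mathrm{compat}}
\;\Longrightarrow\;
\text{task $B$ is trajectory-level incompatible with task $A$.}
```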


💡 Research Summary

The paper extends the thermodynamic framework for machine learning introduced in Part I, where learning was modeled as a transport of a probability distribution over parameters and bounded by the Epistemic Speed Limit (ESL). In Part II the authors shift focus from endpoint constraints to the geometry of the entire learning trajectory and show that finite‑time learning inevitably produces irreversible contraction of the transport map.

Each learning phase is represented by a random transport map Ψₜ(·; ω) with Jacobian Jₜ. The authors define an "effective rank" R(t) = exp H(σ̄(Jₜ)), the exponential of the Shannon entropy of the normalized singular values σ̄ of Jₜ, so that log R(t) measures the log-volume of directions that remain dynamically accessible. Because singular values are submultiplicative under composition, R(t) can only decrease along the learning trajectory; the compatible effective rank restricts this count to task-preserving directions and supplies the threshold against which the stable rank m_B of a new task's Hessian is compared.
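A minimal numerical sketch of this contraction, assuming the entropy-based (Roy–Vetterli) form of the effective rank and a toy sequence of damping phases; the dimensions, damping factors, and the projector Hessian standing in for task B are illustrative choices, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def effective_rank(J):
    """exp of the Shannon entropy of the normalized singular values
    (Roy-Vetterli effective rank; assumed form of R(t))."""
    s = np.linalg.svd(J, compute_uv=False)
    p = s / s.sum()
    return float(np.exp(-np.sum(p * np.log(p + 1e-12))))

def stable_rank(H):
    """||H||_F^2 / ||H||_2^2, a smooth rank proxy (m_B in the text)."""
    s = np.linalg.svd(H, compute_uv=False)
    return float(np.sum(s**2) / s[0]**2)

d = 60
J = np.eye(d)
for phase in range(1, 5):
    # Toy finite-time learning phase: a near-isometry that irreversibly
    # damps a random 10-dimensional subspace of directions.
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
    damp = np.ones(d)
    damp[:10] = 0.05
    P = Q @ np.diag(damp) @ Q.T
    J = P @ J  # phases compose multiplicatively; R can only shrink
    print(f"after phase {phase}: R = {effective_rank(J):5.1f}")

# Idealized Hessian of a new task B: a projector onto a 40-dimensional
# subspace of the (toy) task-A-preserving manifold, so m_B = 40.
V, _ = np.linalg.qr(rng.normal(size=(d, 40)))
m_B = stable_rank(V @ V.T)
R_residual = effective_rank(J)
verdict = "incompatible: forgetting" if m_B > R_residual else "compatible"
print(f"m_B = {m_B:.1f} vs residual R = {R_residual:.1f} -> {verdict}")
```

Printing R after each phase shows the monotone contraction; the final comparison is the capacity-threshold check, with m_B exceeding the residual effective rank in this toy setup.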

