A persistent paradox in continual learning (CL) is that neural networks often retain linearly separable representations of past tasks even when their output predictions fail. We formalize this distinction as the gap between deep (feature-space) and shallow (classifier-level) forgetting. We reveal a critical asymmetry in Experience Replay: while minimal buffers successfully anchor feature geometry and prevent deep forgetting, mitigating shallow forgetting typically requires substantially larger buffer capacities. To explain this, we extend the Neural Collapse framework to the sequential setting. We characterize deep forgetting as a geometric drift toward out-of-distribution subspaces and prove that any non-zero replay fraction asymptotically guarantees the retention of linear separability. Conversely, we identify that the "strong collapse" induced by small buffers leads to rank-deficient covariances and inflated class means, effectively blinding the classifier to true population boundaries. By unifying CL with out-of-distribution detection, our work challenges the prevailing reliance on large buffers, suggesting that explicitly correcting these statistical artifacts could unlock robust performance with minimal replay.
Figure 1: Evolution of decision boundaries and feature separability. PCA evolution of two CIFAR-10 classes (1% replay). Replay samples are highlighted with a black edge. Panel annotations mark task onsets, the buffer-optimal versus population decision boundaries, regions where the data is OOD (no class information in the features), and stages where the classes remain separable but the decision boundary is misaligned. While features retain separability across tasks (low deep forgetting), the classifier optimization becomes underdetermined: multiple "buffer-optimal" boundaries (dashed brown) perfectly classify the stored samples but largely fail to align with the true population boundary (dashed green), resulting in shallow forgetting.
The Asymmetry of Deep and Shallow Forgetting in Experience Replay: Small Buffers Preserve the Feature Space but Distort Decision Boundaries
Continual learning (Hadsell et al., 2020) aims to train neural networks on a sequence of tasks without catastrophic forgetting. It holds particular promise for adaptive AI systems, such as autonomous agents that must integrate new information without full retraining or centralized data access. The theoretical understanding of optimization in non-stationary environments remains limited, particularly regarding the mechanisms that govern the retention and loss of learned representations.
A persistent observation in the literature is that neural networks retain substantially more information about past tasks in their internal representations than in their output predictions. This phenomenon,
first demonstrated through linear probe evaluations, shows that a linear classifier trained on frozen last-layer representations achieves markedly higher accuracy on old tasks than the network's own output layer (Murata et al., 2020; Hess et al., 2023). In other words, past-task data remain linearly separable in feature space, even when the classifier fails to exploit this structure. This motivates a distinction between two levels of forgetting: shallow forgetting, corresponding to output-level degradation recoverable by a linear probe, and deep forgetting, corresponding to irreversible loss of feature-space separability.
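The linear-probe protocol can be sketched in a few lines of NumPy. The sketch below is illustrative only: synthetic Gaussian "features" stand in for frozen last-layer activations, and the "stale head" is a hypothetical construction simulating a classifier whose class assignments drifted after later tasks. A least-squares probe recovers the classes from the same features on which the stale head fails.

```python
import numpy as np

def linear_probe_accuracy(feats, labels, n_classes):
    """Fit a least-squares linear probe on frozen features and report its
    accuracy: high probe accuracy means the classes remain linearly
    separable in feature space (low deep forgetting)."""
    # One-hot targets, bias column appended to the features.
    X = np.hstack([feats, np.ones((len(feats), 1))])
    Y = np.eye(n_classes)[labels]
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    preds = (X @ W).argmax(axis=1)
    return (preds == labels).mean()

# Synthetic "frozen features": two well-separated Gaussian classes.
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(-2, 1, (100, 8)), rng.normal(2, 1, (100, 8))])
labels = np.repeat([0, 1], 100)

probe_acc = linear_probe_accuracy(feats, labels, n_classes=2)

# Hypothetical "stale" head: class templates swapped, mimicking a head
# misaligned by subsequent training (shallow forgetting).
mu0, mu1 = feats[:100].mean(0), feats[100:].mean(0)
stale_W = np.stack([mu1, mu0], axis=1)   # columns point at the wrong classes
stale_acc = ((feats @ stale_W).argmax(axis=1) == labels).mean()
```

Here `probe_acc` is near 1.0 while `stale_acc` collapses, reproducing the probe-vs-head gap described above on toy data.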
In this work, we show that replay buffers affect these two forms of forgetting in systematically different ways. Replay, the practice of storing a small subset of past samples for joint training with new data, is among the most effective and widely adopted strategies in continual learning. However, the requirement to store and repeatedly process substantial amounts of past data limits its scalability. Our analysis reveals a critical efficiency gap: even small buffers are sufficient to preserve feature separability and prevent deep forgetting, whereas mitigating shallow forgetting requires substantially larger buffers. Thus, while replay robustly preserves representational geometry, it often fails to maintain alignment between the learned head and the true data distribution.
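As an illustration of the replay mechanics (a generic sketch, not the paper's exact implementation), a fixed-capacity buffer filled by reservoir sampling is a common choice, since it keeps every sample seen so far with equal probability:

```python
import random

class ReservoirBuffer:
    """Fixed-capacity replay buffer using reservoir sampling: every sample
    in the stream has the same probability of residing in the buffer."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []
        self.n_seen = 0

    def add(self, sample):
        self.n_seen += 1
        if len(self.data) < self.capacity:
            self.data.append(sample)
        else:
            # Replace a stored item with probability capacity / n_seen.
            j = random.randrange(self.n_seen)
            if j < self.capacity:
                self.data[j] = sample

    def sample(self, k):
        return random.sample(self.data, min(k, len(self.data)))

# Stream 10,000 samples through a buffer holding 1% of them.
random.seed(0)
buf = ReservoirBuffer(capacity=100)
for x in range(10_000):
    buf.add(x)
replay_batch = buf.sample(32)  # mixed with the current task's batch during training
```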
To explain this phenomenon, we turn to the geometry of deep network representations. Recent work has shown that, at convergence, standard architectures often exhibit highly structured, low-dimensional feature organization. In particular, the Neural Collapse (NC) phenomenon (Papyan et al., 2020) describes a regime in which within-class variability vanishes, class means form an equiangular tight frame (ETF), and classifier weights align with these means. Originally observed in simplified settings, NC has now been documented across architectures, training regimes, and even large-scale language models (Súkeník et al., 2025; Wu & Papyan, 2025), making it a powerful framework to analyze feature-head interactions.
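For concreteness, the simplex ETF that NC predicts for the centered class means can be constructed and checked numerically. This sketch uses the standard K-vertex simplex construction, in which every pair of distinct class means has cosine similarity -1/(K-1):

```python
import numpy as np

def simplex_etf(K, d):
    """Vertices of a K-class simplex equiangular tight frame embedded in
    R^d (requires d >= K). Under Neural Collapse, the centered class
    means converge to this geometry up to rotation and scaling."""
    M = np.sqrt(K / (K - 1)) * (np.eye(K) - np.ones((K, K)) / K)
    return np.pad(M, ((0, 0), (0, d - K)))  # zero-pad into d dimensions

K, d = 4, 10
V = simplex_etf(K, d)
V = V / np.linalg.norm(V, axis=1, keepdims=True)
cos = V @ V.T
# Off-diagonal entries: pairwise cosines between distinct class means.
off_diag = cos[~np.eye(K, dtype=bool)]
```

Every entry of `off_diag` equals -1/(K-1) = -1/3, the maximal equiangular separation achievable by K unit vectors.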
In this work, we extend the NC framework to continual learning, providing a characterization of the geometry of features and heads under extended training. Our analysis covers task-, class-, and domain-incremental settings and explicitly accounts for replay. To model historical data absent from the buffer, we propose two governing hypotheses: (1) topologically, forgotten samples behave as out-of-distribution (OOD) entities, and (2) enlarging the replay buffer induces a smooth interpolation between this OOD regime and fully collapsed representations. These insights allow us to construct a simple yet predictive theory of feature-space forgetting that lower-bounds separability and captures the influence of weight decay, feature-norm scaling, and buffer size.
In summary, this paper makes the following distinct contributions:
The replay efficiency gap. We identify an intrinsic asymmetry in replay-based continual learning: minimal buffers are sufficient to anchor feature geometry (preventing deep forgetting), whereas mitigating classifier misalignment (shallow forgetting) requires disproportionately large capacities.
Asymptotic framework for continual learning. We extend Neural Collapse theory to continual learning, characterizing the asymptotic geometry of both single-head and multi-head architectures and identifying unique phenomena like rank reduction in taskincremental learning.
Effects of replay on feature geometry. We demonstrate that shallow forgetting arises because classifier optimization on buffers is under-determined, a condition structurally exacerbated by Neural Collapse. The resulting geometric simplification (covariance deficiency and norm inflation) blinds the classifier to the true population boundaries.
Connection to OOD detection. We reconceptualize deep forgetting as a geometric drift toward out-of-distribution subspaces. This perspective bridges the gap between CL and OOD literature, offering a rigorous geometric definition of “forgetting” beyond simple accuracy loss.
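The under-determination behind shallow forgetting can be illustrated numerically: with only a handful of stored samples, many linear heads classify the buffer perfectly while diverging sharply on the population. The NumPy sketch below is a toy construction (synthetic 2-D Gaussian classes, hand-picked buffer points, and a hypothetical tilted head), not the paper's experiment:

```python
import numpy as np

rng = np.random.default_rng(1)
# Population: two Gaussian classes separated along the first axis.
pop = np.vstack([rng.normal([-2, 0], 1, (500, 2)),
                 rng.normal([2, 0], 1, (500, 2))])
y = np.repeat([-1, 1], 500)

# Tiny replay buffer: 3 stored samples per class (0.6% of the data).
buf = np.array([[-2.0, 0.5], [-2.5, -0.3], [-1.8, 0.1],
                [2.1, -0.4], [2.4, 0.2], [1.9, 0.0]])
buf_y = np.repeat([-1, 1], 3)

def accuracy(w, X, labels):
    return (np.sign(X @ w) == labels).mean()

# Two candidate linear heads: the population-aligned one, and a tilted
# "buffer-optimal" one that also separates the six stored points.
w_pop = np.array([1.0, 0.0])
w_tilt = np.array([1.0, 3.0])

buf_accs = [accuracy(w, buf, buf_y) for w in (w_pop, w_tilt)]
pop_accs = [accuracy(w, pop, y) for w in (w_pop, w_tilt)]
# Both heads are perfect on the buffer, yet the tilted head drops
# substantially on the full population.
```

Both entries of `buf_accs` are 1.0, while `pop_accs` differ markedly, mirroring the multiple "buffer-optimal" boundaries of Figure 1.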
We adopt the standard compositional formulation of a neural network, decomposing it into a feature map and a classifier
…(Full text truncated)…