Why Some Models Resist Unlearning: A Linear Stability Perspective
Machine unlearning, the ability to erase the effect of specific training samples without retraining from scratch, is critical for privacy, regulation, and efficiency. However, most progress in unlearning has been empirical, with little theoretical understanding of when and why unlearning works. We tackle this gap by framing unlearning through the lens of asymptotic linear stability, which captures the interaction between optimization dynamics and data geometry. The key quantity in our analysis is data coherence: the cross-sample alignment of loss-surface directions near the optimum. We decompose coherence along three axes (within the retain set, within the forget set, and between them) and prove tight stability thresholds that separate convergence from divergence. To further link data properties to forgettability, we study a two-layer ReLU CNN under a signal-plus-noise model and show that stronger memorization makes forgetting easier: when the signal-to-noise ratio (SNR) is lower, cross-sample alignment is weaker, reducing coherence and making unlearning easier; conversely, high-SNR, highly aligned models resist unlearning. For empirical verification, we show that Hessian tests and CNN heatmaps align closely with the predicted boundary, mapping the stability frontier of gradient-based unlearning as a function of batching, mixing, and data/model alignment. Our analysis is grounded in random-matrix-theory tools and provides the first principled account of the trade-offs between memorization, coherence, and unlearning.
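The abstract's notion of data coherence, cross-sample alignment of loss-surface directions, can be illustrated with a simple proxy. The sketch below (an assumption for illustration, not the paper's exact definition) measures coherence of a set of per-sample gradients as their average pairwise cosine alignment, and shows why a shared "signal" direction (high SNR) raises coherence while pure "noise" gradients stay nearly orthogonal:

```python
import numpy as np

def coherence(grads: np.ndarray) -> float:
    """Average pairwise cosine alignment of per-sample gradients.

    grads: (n_samples, n_params) array, one gradient per sample.
    """
    unit = grads / np.linalg.norm(grads, axis=1, keepdims=True)
    gram = unit @ unit.T                        # pairwise cosine similarities
    n = gram.shape[0]
    off_diag = gram[~np.eye(n, dtype=bool)]     # drop self-alignment on the diagonal
    return float(off_diag.mean())

rng = np.random.default_rng(0)
# High-SNR case: every gradient shares a common signal direction plus small noise.
aligned = np.ones(50) + 0.1 * rng.normal(size=(8, 50))
# Low-SNR case: gradients are pure noise, nearly orthogonal in high dimension.
noisy = rng.normal(size=(8, 50))

print(coherence(aligned) > coherence(noisy))    # True: signal raises coherence
```

In this toy setup the noise-dominated gradients have coherence near zero, matching the abstract's claim that low SNR weakens cross-sample alignment and makes unlearning easier.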
💡 Research Summary
The paper tackles the largely empirical problem of machine unlearning—removing the influence of specific training samples without retraining from scratch—by introducing a principled theoretical framework based on asymptotic linear stability. The authors argue that, because unlearning starts from a pre‑trained local optimum, the dynamics of the optimizer can be linearized around that point: the gradient is approximated by the Hessian acting on the parameter perturbation. This yields a simple linear update w_{t+1}= (I‑ηH_t) w_t for stochastic gradient descent (SGD).
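The linearized update above can be checked numerically. This sketch (synthetic Hessian, illustrative names) iterates w_{t+1} = (I − ηH) w_t and confirms the classical stability condition: the dynamics contract exactly when the spectral radius of I − ηH is below 1, i.e. 0 < η < 2/λ_max(H) for a positive-definite Hessian:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
H = A @ A.T + 0.1 * np.eye(5)          # synthetic positive-definite "Hessian"
lam_max = np.linalg.eigvalsh(H).max()   # largest eigenvalue of H

def spectral_radius(eta: float, H: np.ndarray) -> float:
    """Spectral radius of the linearized update matrix I - eta*H."""
    M = np.eye(H.shape[0]) - eta * H
    return np.abs(np.linalg.eigvals(M)).max()

print(spectral_radius(0.5 / lam_max, H) < 1)   # True: step size inside 2/lam_max
print(spectral_radius(2.5 / lam_max, H) < 1)   # False: step size too large, divergence
```

The same spectral-radius test underlies the stability thresholds in the paper's analysis, with the Hessian's spectrum shaped by the coherence structure of the data.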
In the unlearning setting, the update simultaneously descends on the retain set and ascends on the forget set, with the relative weight of the ascent controlled by a hyper‑parameter α.