Certified Per-Instance Unlearning Using Individual Sensitivity Bounds

Reading time: 5 minutes

📝 Original Info

  • Title: Certified Per-Instance Unlearning Using Individual Sensitivity Bounds
  • ArXiv ID: 2602.15602
  • Date: 2026-02-17
  • Authors: Not listed in the provided source; see the original paper.

📝 Abstract

Certified machine unlearning can be achieved via noise injection leading to differential privacy guarantees, where noise is calibrated to worst-case sensitivity. Such conservative calibration often results in performance degradation, limiting practical applicability. In this work, we investigate an alternative approach based on adaptive per-instance noise calibration tailored to the individual contribution of each data point to the learned solution. This raises the following challenge: how can one establish formal unlearning guarantees when the mechanism depends on the specific point to be removed? To define individual data point sensitivities in noisy gradient dynamics, we consider the use of per-instance differential privacy. For ridge regression trained via Langevin dynamics, we derive high-probability per-instance sensitivity bounds, yielding certified unlearning with substantially less noise injection. We corroborate our theoretical findings through experiments in linear settings and provide further empirical evidence on the relevance of the approach in deep learning settings.

💡 Deep Analysis

📄 Full Content

Modern machine learning systems increasingly operate under regulatory and contractual constraints that require the post-hoc removal of individual training examples, for instance in the context of the "right to be forgotten" (Marino et al., 2025). The gold standard is to retrain the model from scratch on the dataset with the target example removed, but such retraining is often computationally infeasible in practice. Machine unlearning seeks to approximate the effect of retraining at a substantially lower cost, either by enforcing exact equivalence with retraining (exact unlearning) (Cao and Yang, 2015) or by matching the retrained model only in distribution (approximate unlearning) (Ginart et al., 2019; Sekhari et al., 2021). We focus on approximate unlearning.

We further restrict attention to certified unlearning, where deletion procedures are accompanied by explicit, provable guarantees. Such guarantees are naturally connected to differential privacy (DP), which quantifies how the output distribution of a randomized mechanism changes when a single training point is removed. While DP provides a natural language for reasoning about certified deletion, its guarantees do not directly transfer to the unlearning setting.
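For readers less familiar with the definition, the removal-adjacency form of (ε, δ)-differential privacy invoked here can be stated as follows; this is the standard formulation, included for orientation rather than taken from the paper:

```latex
% (\epsilon, \delta)-differential privacy under removal adjacency (standard definition).
% A randomized mechanism M satisfies the guarantee if, for every dataset D,
% every point z in D, and every measurable output set S,
\Pr\big[M(D) \in S\big] \;\le\; e^{\epsilon}\,\Pr\big[M(D \setminus \{z\}) \in S\big] + \delta,
% and symmetrically with D and D \setminus \{z\} exchanged.
```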

A first mismatch concerns the notion of adjacency. Differential privacy allows either removing or substituting a data point, whereas unlearning compares training to retraining the model as if a given point had never been part of the dataset. Substitution-based adjacency therefore lacks a natural interpretation for unlearning, although it is sometimes used for technical convenience (e.g., Chien et al., 2024).

The most important difference, however, lies in the nature of the guarantee itself. Differential privacy certifies a mechanism a priori, uniformly over all datasets and all data points. Unlearning, by contrast, is inherently post-hoc: it targets a specific trained model, a specific dataset, and a specific data point whose deletion is requested.

First, this post-hoc nature affects how unlearning guarantees should be defined. While many existing approaches compare unlearning to a full retraining oracle (Ginart et al., 2019; Neel et al., 2021; Guo et al., 2020; Koloskova et al., 2025; Mu and Klabjan, 2025), we follow several recent works (Lu et al., 2025; Basaran et al., 2025; Waerebeke et al., 2025) and adopt a self-referenced notion of unlearning, which compares two executions of the same unlearning procedure on adjacent datasets. This isolates what is certified intrinsically by the unlearning mechanism.
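One plausible way to write down such a self-referenced guarantee, with notation that is ours rather than the paper's: letting A denote the (noisy) learning algorithm and U the unlearning procedure that processes a deletion request for a point z, the guarantee compares two runs of U on adjacent datasets instead of comparing U against full retraining:

```latex
% Illustrative self-referenced (\epsilon, \delta)-unlearning guarantee (notation is ours;
% the paper's exact definition may differ). With D' = D \setminus \{z\}:
\Pr\big[U\big(A(D), D, z\big) \in S\big]
  \;\le\; e^{\epsilon}\,\Pr\big[U\big(A(D'), D', z\big) \in S\big] + \delta
% for all measurable S (and symmetrically). In the second run z is already absent
% from the data, so U essentially only replays the noisy unlearning updates.
```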

Second, this conceptual difference affects how guarantees should be calibrated across data points. Uniform DP guarantees, while stronger in generality, enforce a homogeneous treatment of all data points, regardless of their actual influence on the learned model. This regime is implicitly adopted by some certified unlearning approaches, including Langevin-based methods (Chien et al., 2024). In contrast, we argue that certified unlearning should explicitly exploit the fact that deletion targets a specific dataset and a specific data point. This naturally calls for per-instance certified unlearning, where guarantees are calibrated to the actual influence of the point being removed, avoiding unnecessary noise addition and loss of utility. This view is supported by recent work on individualized guarantees (Sepahvand et al., 2025) and by empirical evidence showing that worst-case sensitivity bounds often overestimate the influence of typical data points (Thudi et al., 2024).
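Per-instance differential privacy, which the abstract names as the tool for defining individual sensitivities, makes this calibration explicit by letting the privacy parameter depend on the fixed dataset and the fixed point to be removed. The statement below is the common per-instance DP formulation, given for orientation rather than as the paper's exact definition:

```latex
% Per-instance (\epsilon(D, z), \delta)-DP: the privacy loss is measured for a fixed
% dataset D and a fixed point z, so \epsilon may vary from one point to another.
\Pr\big[M(D) \in S\big] \;\le\; e^{\epsilon(D, z)}\,\Pr\big[M(D \setminus \{z\}) \in S\big] + \delta
% for all measurable S (and symmetrically). Low-influence points admit a small
% \epsilon(D, z), which is what allows less noise to be injected for them.
```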

We now turn to the practical question of how to derive certified unlearning guarantees within this framework. As in Chien et al. (2024), we consider unlearning procedures based on Langevin dynamics. Our analysis builds on recent results by Bok et al. (2024), who show how privacy loss can be tracked along noisy optimization trajectories during learning. We extend this reasoning to the learn-then-unlearn setting, where the learning noise is fixed and certified deletion guarantees are obtained by calibrating the additional noise injected during unlearning to reach a target privacy level.
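A minimal sketch of this learn-then-unlearn pattern, assuming plain noisy gradient descent on a ridge objective. All hyperparameters, and in particular the extra unlearning noise `extra_noise`, are placeholders; the paper's contribution is precisely the principled calibration of that quantity, which is not reproduced here.

```python
import numpy as np

def ridge_grad(theta, X, y, lam):
    """Gradient of 0.5 * ||X @ theta - y||^2 / n + 0.5 * lam * ||theta||^2."""
    n = X.shape[0]
    return X.T @ (X @ theta - y) / n + lam * theta

def noisy_gd(theta, X, y, lam, steps, lr, noise_std, rng):
    """Langevin-style updates: a gradient step plus isotropic Gaussian noise."""
    for _ in range(steps):
        theta = (theta - lr * ridge_grad(theta, X, y, lam)
                 + noise_std * rng.normal(size=theta.shape))
    return theta

def learn_then_unlearn(X, y, remove_idx, lam=0.1, lr=0.1,
                       learn_steps=200, learn_noise=0.01,
                       unlearn_steps=20, extra_noise=0.05, seed=0):
    """Train with fixed learning noise, then continue noisy GD on the retained data
    with additional noise. 'extra_noise' stands in for the calibrated noise level
    needed to reach a target (epsilon, delta); the calibration rule is not shown."""
    rng = np.random.default_rng(seed)
    theta = noisy_gd(np.zeros(X.shape[1]), X, y, lam, learn_steps, lr, learn_noise, rng)

    keep = np.ones(len(y), dtype=bool)
    keep[remove_idx] = False
    theta_unlearned = noisy_gd(theta, X[keep], y[keep], lam,
                               unlearn_steps, lr, extra_noise, rng)
    return theta, theta_unlearned
```

On a toy dataset, one can sanity-check the sketch by comparing `theta_unlearned` to a model retrained from scratch on the retained data with the same noise schedule.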

We then examine how this analysis can be refined at the level of individual data points. To this end, we introduce per-instance sensitivity as the central quantity and use it to derive certified guarantees for approximate unlearning. In the case of ridge regression trained with Langevin dynamics, we show that although sensitivities are formally unbounded due to the Gaussian nature of the iterates, they can be sharply controlled with high probability and evaluated efficiently. This yields certified unlearning guarantees that adapt to the actual influence of the removed data point, rather than relying on a worst-case bound.
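To make the central quantity concrete, the sketch below computes an exact per-instance influence measure in the (non-noisy) ridge setting: the distance between the ridge solution on the full data and the solution with one point removed, evaluated cheaply for every point via a Sherman-Morrison rank-one downdate. This illustrates the kind of per-instance quantity the noise is adapted to; it is not the paper's high-probability sensitivity bound for the Langevin iterates.

```python
import numpy as np

def per_instance_influence(X, y, lam):
    """For each point i, the L2 distance between
    argmin_theta ||X @ theta - y||^2 + lam * ||theta||^2 on the full data
    and the same problem with point i removed. Uses a Sherman-Morrison rank-one
    downdate of (X^T X + lam I)^{-1}, so the loop costs O(d^2) per point."""
    n, d = X.shape
    A_inv = np.linalg.inv(X.T @ X + lam * np.eye(d))
    b = X.T @ y
    theta = A_inv @ b                       # full-data ridge solution

    influence = np.empty(n)
    for i in range(n):
        x_i, y_i = X[i], y[i]
        Ax = A_inv @ x_i
        A_inv_i = A_inv + np.outer(Ax, Ax) / (1.0 - x_i @ Ax)  # (A - x_i x_i^T)^{-1}
        theta_i = A_inv_i @ (b - y_i * x_i)  # leave-one-out ridge solution
        influence[i] = np.linalg.norm(theta - theta_i)
    return influence
```

Points with small influence values are exactly those for which, in the noisy setting, less additional noise should suffice to certify deletion.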

Overall, our results show that the difficulty of unlearning is not uniform. Data points that exert little influence on the training dynamics are inherently easier to forget and require less noise to certify their deletion; per-instance sensitivity bounds make this difference explicit.

Reference

This content is AI-processed based on open access ArXiv data.
