Accumulative SGD Influence Estimation for Data Attribution


📝 Original Info

  • Title: Accumulative SGD Influence Estimation for Data Attribution
  • ArXiv ID: 2510.26185
  • Date: 2025-10-30
  • Authors: Not listed in the source metadata. (Check the paper PDF or the official ArXiv page.)

📝 Abstract

Modern data-centric AI needs precise per-sample influence. Standard SGD-IE approximates leave-one-out effects by summing per-epoch surrogates and ignores cross-epoch compounding, which misranks critical examples. We propose ACC-SGD-IE, a trajectory-aware estimator that propagates the leave-one-out perturbation across training and updates an accumulative influence state at each step. In smooth strongly convex settings it achieves geometric error contraction and, in smooth non-convex regimes, it tightens error bounds; larger mini-batches further reduce constants. Empirically, on Adult, 20 Newsgroups, and MNIST under clean and corrupted data and both convex and non-convex training, ACC-SGD-IE yields more accurate influence estimates, especially over long epochs. For downstream data cleansing it more reliably flags noisy samples, producing models trained on ACC-SGD-IE cleaned data that outperform those cleaned with SGD-IE.
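The abstract's core idea, propagating the leave-one-out perturbation through every SGD step rather than summing independent per-epoch surrogates, can be illustrated with a minimal sketch. This is a hypothetical reconstruction based only on the abstract, not the paper's actual algorithm: the logistic-regression setting, the function names, and the linearized update `u ← (I − ηH)u + (η/|B|)∇ℓ_z` (the standard SGD-IE-style recursion, here carried accumulatively across epochs) are all assumptions.

```python
import numpy as np

def grad(w, x, y):
    # Logistic-loss gradient for a single sample (x, y).
    p = 1.0 / (1.0 + np.exp(-x @ w))
    return (p - y) * x

def hvp(w, X, Y, v):
    # Hessian-vector product of the mean logistic loss over (X, Y) at w.
    p = 1.0 / (1.0 + np.exp(-X @ w))
    s = p * (1.0 - p)
    return X.T @ (s * (X @ v)) / len(Y)

def acc_sgd_ie(w0, X, Y, z_idx, lr, epochs, batches):
    """Hypothetical sketch: carry the leave-one-out perturbation u for
    sample z_idx across the *entire* trajectory (all epochs), instead of
    restarting it each epoch as a per-epoch surrogate would."""
    w = w0.copy()
    u = np.zeros_like(w0)          # accumulative influence state
    for _ in range(epochs):
        for B in batches:          # B is a list of sample indices
            g = np.mean([grad(w, X[i], Y[i]) for i in B], axis=0)
            # Propagate u through the linearized SGD update: u ← (I − ηH_B) u.
            u = u - lr * hvp(w, X[B], Y[B], u)
            if z_idx in B:
                # Removing z deletes its contribution to this step's
                # minibatch gradient, injecting +η ∇ℓ_z / |B| into u.
                u = u + lr * grad(w, X[z_idx], Y[z_idx]) / len(B)
            w = w - lr * g
    return w, u
```

The estimated influence of sample `z_idx` on a test point is then the dot product `grad(w, x_test, y_test) @ u`, i.e. the first-order change in test loss from retraining without that sample.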


Reference

This content is AI-processed based on open access ArXiv data.
