Statistical Inference for Differentially Private Stochastic Gradient Descent
Privacy preservation in machine learning, particularly through Differentially Private Stochastic Gradient Descent (DP-SGD), is critical for sensitive data analysis. However, existing statistical inference methods for SGD predominantly focus on cyclic subsampling, whereas DP-SGD requires randomized subsampling. This paper bridges this gap by first establishing the asymptotic properties of SGD under randomized subsampling and then extending these results to DP-SGD. For the output of DP-SGD, we show that the asymptotic variance decomposes into statistical, sampling, and privacy-induced components. Two methods are proposed for constructing valid confidence intervals: the plug-in method and the random scaling method. Extensive numerical analysis shows that the proposed confidence intervals achieve nominal coverage rates while maintaining privacy.
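As a rough illustration of the plug-in idea (not the authors' exact estimator), the asymptotic variance of the averaged DP-SGD iterate has a sandwich form H⁻¹SH⁻¹, where the middle matrix S sums a statistical component and a privacy-noise component. The sketch below uses made-up values for the estimated Hessian, gradient-noise covariance, and privacy-noise scale; all names and numbers are illustrative assumptions.

```python
import numpy as np

# Illustrative plug-in confidence interval for averaged DP-SGD output.
# All matrices below are placeholders, not estimates from real data.
d = 3
H_hat = np.array([[2.0, 0.2, 0.0],
                  [0.2, 1.5, 0.1],
                  [0.0, 0.1, 1.0]])        # estimated Hessian at the optimum (assumed)
S_stat = 0.04 * np.eye(d)                  # statistical gradient-noise covariance (assumed)
sigma_dp = 0.05                            # scale of injected Gaussian privacy noise (assumed)
S_priv = sigma_dp ** 2 * np.eye(d)         # privacy-induced covariance component
S_hat = S_stat + S_priv                    # combined middle matrix of the sandwich

H_inv = np.linalg.inv(H_hat)
V_hat = H_inv @ S_hat @ H_inv              # plug-in sandwich variance estimate

T = 10_000                                 # number of iterations
theta_bar = np.array([1.0, -2.0, 0.5])     # averaged iterate (placeholder value)
z = 1.96                                   # 95% standard-normal quantile
half_width = z * np.sqrt(np.diag(V_hat) / T)
ci = np.stack([theta_bar - half_width, theta_bar + half_width], axis=1)
```

Each row of `ci` is a per-coordinate 95% confidence interval; the privacy term `S_priv` widens the intervals relative to the non-private case.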
💡 Research Summary
The paper tackles a fundamental gap in the statistical inference literature for stochastic gradient descent (SGD) when the algorithm must satisfy differential privacy (DP). Existing asymptotic results for SGD are limited to the cyclic (deterministic) subsampling scheme, which is incompatible with the random subsampling required for privacy amplification. The authors first develop a rigorous asymptotic theory for averaged SGD under the randomized rule. Assuming strong convexity, a well‑defined Hessian at the optimum, bounded fourth moments of the stochastic gradient, and a learning rate ηₜ = η·t^{‑α} with α∈(½,1), they show that when the total number of iterations T equals k·n (k≥c₀) and each mini‑batch contains m examples, the averaged iterate θ̄_T satisfies a central limit theorem of the sandwich form

√T (θ̄_T − θ*) →_d N(0, H⁻¹ S H⁻¹),

where θ* is the optimum, H is the Hessian of the objective at θ*, and S is the covariance of the stochastic gradient noise.
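The mechanics behind this result can be sketched on a toy strongly convex problem. The snippet below is a minimal DP-SGD loop, assuming illustrative values for the clipping norm, noise multiplier, and step-size schedule ηₜ = η·t^{‑α}: each step draws a random mini-batch, clips per-example gradients, adds Gaussian noise, and maintains the running Polyak average θ̄_T.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy strongly convex problem: linear regression with known optimum.
n, d = 2000, 3
theta_star = np.array([1.0, -2.0, 0.5])
X = rng.normal(size=(n, d))
y = X @ theta_star + 0.1 * rng.normal(size=n)

def dp_sgd(T, m, clip=5.0, sigma=0.8, eta0=0.5, alpha=0.7, seed=1):
    """DP-SGD with randomized subsampling and Polyak averaging (illustrative parameters)."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(d)
    avg = np.zeros(d)
    for t in range(1, T + 1):
        idx = rng.choice(n, size=m, replace=False)          # randomized subsampling
        # Per-example gradients of the squared loss.
        grads = (X[idx] @ theta - y[idx])[:, None] * X[idx]
        norms = np.linalg.norm(grads, axis=1, keepdims=True)
        grads = grads * np.minimum(1.0, clip / norms)       # clip each gradient to norm <= clip
        # Average clipped gradients and add Gaussian privacy noise.
        noisy = grads.mean(axis=0) + (sigma * clip / m) * rng.normal(size=d)
        theta = theta - eta0 * t ** (-alpha) * noisy        # eta_t = eta * t^{-alpha}
        avg += (theta - avg) / t                            # running Polyak average theta_bar
    return avg

theta_bar = dp_sgd(T=20_000, m=50)
```

Despite the injected privacy noise, the averaged iterate `theta_bar` concentrates around `theta_star` at the rate given by the CLT above, with the noise inflating the limiting covariance through its privacy-induced component.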