Stochastic versus Deterministic in Stochastic Gradient Descent

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the original arXiv source.

This paper considers mini-batch stochastic gradient descent (SGD) for a structured minimization problem whose objective is the sum of a finite-sum function, whose gradient is approximated stochastically, and an independent term, whose gradient is computed deterministically. We focus on the stochastic versus deterministic behavior of mini-batch SGD in this setting and provide a convergence analysis that captures the different roles of the two parts. Under smoothness and convexity assumptions, linear convergence of the algorithm to a neighborhood of the minimizer is established. The step size, the convergence rate, and the radius of the convergence region depend asymmetrically on the characteristics of the two components, which reveals the distinct impacts of stochastic approximation versus deterministic computation in mini-batch SGD. Moreover, a better convergence rate is obtained when the independent term endows the objective function with sufficient strong convexity. In addition, the convergence rate of the algorithm in expectation approaches that of classic gradient descent as the batch size increases. Numerical experiments are conducted to support the theoretical analysis.


💡 Research Summary

This paper investigates the behavior of mini‑batch stochastic gradient descent (SGD) when applied to a structured optimization problem of the form

\[
\min_{x \in \mathbb{R}^d} \; F(x) = \frac{1}{n}\sum_{i=1}^{n} f_i(x) + h(x),
\]

where the gradient of the finite‑sum part \(\frac{1}{n}\sum_{i=1}^{n} f_i(x)\) is approximated stochastically via mini‑batches, while the gradient of the independent term \(h(x)\) is computed deterministically at every iteration.
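As a concrete illustration of this setting, the update below combines a mini-batch estimate of the finite-sum gradient with the exact gradient of the independent term. This is a minimal sketch, not the paper's implementation: the quadratic components \(f_i\) and the regularizer \(h(x) = \tfrac{\mu}{2}\|x\|^2\) are hypothetical choices made so that the minimizer has a closed form for comparison.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem instance (not from the paper):
#   f(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2   -- finite-sum part,
#       gradient estimated from mini-batches
#   h(x) = 0.5 * mu * ||x||^2                      -- independent term,
#       gradient computed deterministically
n, d, mu = 200, 5, 0.1
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

def grad_f_batch(x, idx):
    """Mini-batch gradient estimate of the finite-sum part."""
    Ai = A[idx]
    return Ai.T @ (Ai @ x - b[idx]) / len(idx)

def grad_h(x):
    """Exact (deterministic) gradient of the independent term."""
    return mu * x

def minibatch_sgd(x0, step=0.05, batch=20, iters=500):
    x = x0.copy()
    for _ in range(iters):
        idx = rng.choice(n, size=batch, replace=False)  # sample a mini-batch
        g = grad_f_batch(x, idx) + grad_h(x)  # stochastic + deterministic parts
        x -= step * g
    return x

# Closed-form minimizer of the full objective, for reference.
x_star = np.linalg.solve(A.T @ A / n + mu * np.eye(d), A.T @ b / n)
x_hat = minibatch_sgd(np.zeros(d))
print(np.linalg.norm(x_hat - x_star))  # small: the iterate reaches a neighborhood of x_star
```

Consistent with the analysis summarized above, the iterate does not converge exactly to the minimizer but settles in a neighborhood of it, with radius governed by the step size and the mini-batch gradient variance; increasing `batch` shrinks this neighborhood.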

