On the Finite Time Convergence of Cyclic Coordinate Descent Methods

Cyclic coordinate descent is a classic optimization method that has witnessed a resurgence of interest in machine learning. Reasons for this include its simplicity, speed, and stability, as well as its competitive performance on $\ell_1$ regularized smooth optimization problems. Surprisingly, very little is known about its finite time convergence behavior on these problems. Most existing results either just prove convergence or provide asymptotic rates. We fill this gap in the literature by proving $O(1/k)$ convergence rates (where $k$ is the iteration counter) for two variants of cyclic coordinate descent under an isotonicity assumption. Our analysis proceeds by comparing the objective values attained by the two variants with each other, as well as with the gradient descent algorithm. We show that, in terms of objective value, the iterates generated by the cyclic coordinate descent methods remain better than those of gradient descent uniformly over time.


💡 Research Summary

The paper addresses a notable gap in the theoretical understanding of cyclic coordinate descent (CCD) methods applied to smooth optimization problems with ℓ₁ regularization, a class of problems that frequently arise in modern machine learning (e.g., Lasso, sparse logistic regression). While CCD is celebrated for its simplicity, low per‑iteration cost, and empirical robustness, prior work has largely offered only asymptotic convergence guarantees or generic “converges” statements, leaving practitioners without clear expectations about finite‑time performance.
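
To make the method concrete, here is a minimal sketch of cyclic coordinate descent for the Lasso, the prototypical $\ell_1$ regularized smooth problem mentioned above. This is an illustration under standard assumptions (squared-error loss, exact minimization in each coordinate), not the specific variants analyzed in the paper; the function names and bookkeeping are ours.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding: the proximal operator of t * |.|."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def cyclic_cd_lasso(X, y, lam, n_sweeps=100):
    """Cyclic coordinate descent for min_w 0.5 * ||y - X w||^2 + lam * ||w||_1.

    One sweep updates coordinates 0..d-1 in a fixed cyclic order,
    minimizing the objective exactly in each coordinate.
    """
    n, d = X.shape
    w = np.zeros(d)
    residual = y.astype(float)        # r = y - X w, maintained incrementally
    col_sq = (X ** 2).sum(axis=0)     # per-coordinate curvature ||X[:, j]||^2
    for _ in range(n_sweeps):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue
            # correlation of column j with the residual, with j's own contribution restored
            rho = X[:, j] @ residual + col_sq[j] * w[j]
            w_new = soft_threshold(rho, lam) / col_sq[j]
            residual += X[:, j] * (w[j] - w_new)  # keep r = y - X w in sync
            w[j] = w_new
    return w
```

Because the residual $y - Xw$ is updated incrementally, each coordinate update costs only $O(n)$, which is the low per-iteration cost the summary refers to.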

Problem setting and assumptions
The authors consider the composite objective

$$\min_{x \in \mathbb{R}^d} \; F(x) = f(x) + \lambda \|x\|_1,$$

where $f$ is a smooth convex function and $\lambda > 0$ is the regularization parameter; the finite-time $O(1/k)$ rates are established under the additional isotonicity assumption mentioned in the abstract.
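
As an illustrative sketch (the paper's two variants may differ in details such as step-size choices), a cyclic coordinate step for this objective applies a soft-thresholded gradient update to one coordinate at a time:

$$x_j \;\leftarrow\; \operatorname{soft}_{\lambda/L_j}\!\left(x_j - \tfrac{1}{L_j}\,\nabla_j f(x)\right), \qquad \operatorname{soft}_t(z) = \operatorname{sign}(z)\,\max(|z| - t,\ 0),$$

where $L_j$ is a Lipschitz constant for the partial derivative $\nabla_j f$; sweeping $j = 1, \dots, d$ in a fixed order constitutes one full pass of CCD.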

