Sharp worst-case evaluation complexity bounds for arbitrary-order nonconvex optimization with inexpensive constraints
We provide sharp worst-case evaluation complexity bounds for nonconvex minimization problems with general inexpensive constraints, i.e.\ problems where the cost of evaluating/enforcing the (possibly nonconvex or even disconnected) constraints, if any, is negligible compared to that of evaluating the objective function. These bounds unify, extend or improve all known upper and lower complexity bounds for unconstrained and convexly-constrained problems. It is shown that, given an accuracy level $\epsilon$, the degree $p$ of the highest available Lipschitz continuous derivative and a desired optimality order $q$ between one and $p$, a conceptual regularization algorithm requires no more than $O(\epsilon^{-\frac{p+1}{p-q+1}})$ evaluations of the objective function and its derivatives to compute a suitably approximate $q$-th order minimizer. With an appropriate choice of the regularization, a similar result also holds if the $p$-th derivative is merely Hölder rather than Lipschitz continuous. We provide an example showing that the above complexity bound is sharp for unconstrained and a wide class of constrained problems, and we also give reasons for the optimality of regularization methods from a worst-case complexity point of view, within a large class of algorithms that use the same derivative information.
💡 Research Summary
The paper addresses the worst‑case evaluation (oracle) complexity of nonconvex optimization problems in which the cost of evaluating or enforcing constraints is negligible compared to the cost of evaluating the objective function. Such “inexpensive‑constraint” problems include bound‑constrained, projection‑cheap, or even disconnected feasible sets, as long as the projection onto the feasible set does not dominate the computational budget.
Problem setting and assumptions
The authors consider the set‑constrained problem
min_{x∈F} f(x)
where F⊂ℝⁿ is closed and non‑empty, but no convexity or connectivity is required. The objective f belongs to C^{p,β}(ℝⁿ) for some integer p≥1 and β∈(0,1]; that is, f is p‑times continuously differentiable and its p‑th derivative is globally Hölder continuous with exponent β (β=1 corresponds to the usual Lipschitz continuity). The smoothness constants L and β are assumed known.
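Concretely, membership in C^{p,β} with β = 1 (an L-Lipschitz p-th derivative) yields the classical Taylor-remainder bound |f(x+s) − T_p(x,s)| ≤ L‖s‖^{p+1}/(p+1)!. The following minimal numerical sanity check (not from the paper) illustrates this for f = sin with p = 2 and L = 1, since |f'''| ≤ 1:

```python
import math

def taylor_error_bound_check():
    """Verify |f(x+s) - T_2(x,s)| <= L|s|^3 / 3! on a grid, for f = sin, L = 1.

    Returns the worst violation (error minus bound); a non-positive value,
    up to rounding, confirms the bound.
    """
    worst = float("-inf")
    for i in range(-50, 51):            # sample base points x in [-5, 5]
        x = i * 0.1
        for j in range(-100, 101):      # sample steps s in [-1, 1]
            s = j * 0.01
            # second-order Taylor model of sin at x
            T2 = math.sin(x) + math.cos(x) * s - 0.5 * math.sin(x) * s * s
            err = abs(math.sin(x + s) - T2)
            bound = abs(s) ** 3 / math.factorial(3)
            worst = max(worst, err - bound)
    return worst
```

Running `taylor_error_bound_check()` returns a non-positive number, confirming that the Hölder/Lipschitz smoothness class controls the quality of Taylor models — the property the complexity analysis rests on.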
High‑order optimality measure
Extending the classical first‑ and second‑order necessary conditions, the authors adopt the (ε,δ)‑q‑necessary optimality concept introduced in Cartis‑Gould‑Toint (2018). For a given radius δ∈(0,1] and order q (1≤q≤p), they define
φ_δ^{(j)}(f,x) = f(x) – min_{‖d‖≤δ, x+d∈F} T_j(x,d),
where T_j(x,d) is the j‑th order Taylor expansion of f at x. The quantity φ_δ^{(j)} measures the maximal decrease achievable by the j‑th order model inside the intersection of the feasible set and a ball of radius δ. An (ε,δ)‑q‑necessary minimizer satisfies
φ_δ^{(q)}(f,x) ≤ ε χ_q(δ), χ_q(δ)=∑_{ℓ=1}^{q} δ^{ℓ}/ℓ!.
When ε→0 this condition reduces to the path‑based necessary optimality conditions of order q.
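For intuition, φ_δ^{(q)} and χ_q can be evaluated directly in one dimension, taking F = ℝ so the feasibility constraint drops out. The sketch below (an illustration under these simplifying assumptions, not the paper's method) approximates the inner minimization by a grid search:

```python
import math
import numpy as np

def phi(f_derivs, x, q, delta, n_grid=10001):
    """phi_delta^{(q)}(f, x): largest decrease of the order-q Taylor model
    of f at x over the ball |d| <= delta (1-D, F = R for illustration).

    f_derivs[l] is the l-th derivative of f (f_derivs[0] is f itself).
    """
    d = np.linspace(-delta, delta, n_grid)
    # order-q Taylor model  T_q(x, d) = f(x) + sum_{l=1}^q f^(l)(x) d^l / l!
    T = f_derivs[0](x) + sum(f_derivs[l](x) * d**l / math.factorial(l)
                             for l in range(1, q + 1))
    return f_derivs[0](x) - T.min()

def chi(q, delta):
    """Scaling factor chi_q(delta) = sum_{l=1}^q delta^l / l!."""
    return sum(delta**l / math.factorial(l) for l in range(1, q + 1))
```

For f(x) = x⁴ with q = 2 and δ = 0.1, the point x = 0 gives φ = 0 (it is a genuine minimizer, so the condition holds for any ε), while x = 0.5 gives φ = 0.035, far above ε·χ_2(0.1) = 0.105·ε for small ε, so that point fails the test.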
Algorithm: Adaptive Regularization of order p (AR p)
The core algorithm builds at each iteration k a p‑th order Taylor model
T_p(x_k,s) = f(x_k) + Σ_{ℓ=1}^{p} (1/ℓ!) ∇^{ℓ}f(x_k)[s]^{ℓ},
and approximately minimizes, over steps s with x_k+s∈F, the regularized model m_k(s) = T_p(x_k,s) + (σ_k/(p+β)) ‖s‖^{p+β}, where σ_k > 0 is a regularization parameter adapted from iteration to iteration: increased after unsuccessful steps and decreased (or kept) after successful ones.
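The adaptive-regularization loop can be sketched as follows for p = 2, β = 1 in one unconstrained dimension. This is an illustrative toy, not the paper's AR p algorithm: a grid search stands in for the model-minimization subproblem, a plain first-order test |f'(x)| ≤ ε stands in for the φ-based (ε,δ)-q criterion, and the parameters η, γ and the σ-update rules are conventional choices, not the paper's:

```python
import numpy as np

def ar2(f, g, H, x0, sigma0=1.0, eps=1e-3, eta=0.1, gamma=2.0, max_iter=200):
    """Toy 1-D adaptive cubic regularization (AR p with p = 2, beta = 1)."""
    x, sigma = x0, sigma0
    for _ in range(max_iter):
        if abs(g(x)) <= eps:          # simplified stopping test
            break
        # approximately minimize the regularized model
        #   m(s) = f(x) + g(x) s + H(x) s^2/2 + (sigma/3)|s|^3
        # by grid search (a real solver would be used in practice)
        s = np.linspace(-2.0, 2.0, 4001)
        m = f(x) + g(x)*s + 0.5*H(x)*s**2 + (sigma/3.0)*np.abs(s)**3
        sk = s[np.argmin(m)]
        # predicted decrease of the (unregularized) Taylor model
        pred = -(g(x)*sk + 0.5*H(x)*sk**2)
        rho = (f(x) - f(x + sk)) / max(pred, 1e-16)
        if rho >= eta:                # successful: accept step, relax sigma
            x, sigma = x + sk, max(sigma / gamma, 1e-8)
        else:                         # unsuccessful: increase regularization
            sigma *= gamma
    return x
```

Applied to the nonconvex f(x) = x⁴ − 2x² from x₀ = 2, the loop drives the iterates to the nearby minimizer x ≈ 1, with σ_k growing exactly when the Taylor model over-promises decrease.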