BayeSQP: Bayesian Optimization through Sequential Quadratic Programming
We introduce BayeSQP, a novel algorithm for general black-box optimization that merges the structure of sequential quadratic programming with concepts from Bayesian optimization. BayeSQP employs second-order Gaussian process surrogates for both the objective and constraints to jointly model function values, gradients, and Hessians from only zero-order information. At each iteration, a local subproblem is constructed from the GP posterior estimates and solved to obtain a search direction. Crucially, the subproblem formulation explicitly incorporates uncertainty in both the function and derivative estimates, yielding a tractable second-order cone program that seeks high-probability improvement under model uncertainty. A subsequent one-dimensional line search via constrained Thompson sampling selects the next evaluation point. Empirical results show that BayeSQP outperforms state-of-the-art methods in specific high-dimensional settings. Our algorithm offers a principled and flexible framework that bridges classical optimization techniques with modern approaches to black-box optimization.
💡 Research Summary
The paper introduces BayeSQP, a novel algorithm that blends the classical sequential quadratic programming (SQP) framework with modern Bayesian optimization (BO) techniques to tackle constrained black‑box optimization problems in high‑dimensional spaces. The key innovation lies in employing second‑order Gaussian process (GP) surrogates for both the objective and any constraints. By exploiting the fact that a GP is closed under linear operators such as differentiation, the authors are able to infer not only the function value but also the gradient and Hessian from purely zero‑order (function‑value) observations. The GP provides posterior means and covariances for these quantities, which are then used to construct a probabilistic model of the local landscape.
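Because differentiation is a linear operator, the cross-covariance between a derivative of the latent function and an observed function value is simply a derivative of the kernel, so gradient estimates fall out of standard GP regression on zero-order data. A minimal numpy sketch of this idea (illustrative RBF kernel, lengthscale, and test function chosen here for demonstration; not the paper's implementation):

```python
import numpy as np

def rbf(a, b, ell=1.0):
    # squared-exponential kernel between the rows of a and b
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

def posterior_grad_mean(X, y, x_star, ell=1.0, jitter=1e-6):
    """Posterior mean of the gradient at x_star from zero-order data (X, y).

    Uses cov(df/dx*, f(x_i)) = d k(x*, x_i) / d x*, i.e. a kernel derivative,
    in place of the usual cross-covariance vector.
    """
    K = rbf(X, X, ell) + jitter * np.eye(len(X))
    k_star = rbf(x_star[None, :], X, ell)[0]                 # (n,)
    # derivative of the RBF kernel w.r.t. its first argument
    dk = -(x_star[None, :] - X) / ell**2 * k_star[:, None]   # (n, d)
    alpha = np.linalg.solve(K, y)
    return dk.T @ alpha                                      # (d,)

# sanity check on f(x) = x0^2 + x1, whose gradient at x* is (2*x0, 1)
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(80, 2))
y = X[:, 0] ** 2 + X[:, 1]
g = posterior_grad_mean(X, y, np.array([0.3, -0.2]), ell=0.5)
```

The same construction with second kernel derivatives yields Hessian estimates, which is how a second-order surrogate can be built without ever observing derivatives.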
At each iteration, BayeSQP builds a local quadratic subproblem reminiscent of standard SQP. However, instead of using exact derivatives, the subproblem incorporates the GP posterior means as point estimates and explicitly accounts for uncertainty through a value‑at‑risk (VaR) formulation for the objective and a high‑probability feasibility constraint for each inequality constraint. By assuming joint Gaussianity of the function, gradient, and Hessian, the probabilistic constraints can be transformed into deterministic second‑order cone constraints. Consequently, the entire subproblem becomes a second‑order cone program (SOCP), which can be solved efficiently with off‑the‑shelf solvers.
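Under joint Gaussianity, a chance constraint P(c + g^T p <= 0) >= 1 - delta with g ~ N(mu, Sigma) collapses to the deterministic cone constraint c + mu^T p + z_{1-delta} * ||Sigma^{1/2} p|| <= 0. The following numpy snippet checks this equivalence by Monte Carlo for a single linearized constraint (all numbers below are made up for illustration):

```python
import numpy as np

# Illustrative posterior for one linearized constraint c(x) + grad_c(x)^T p,
# with grad_c ~ N(mu_g, Sigma). These values are invented for the demo.
rng = np.random.default_rng(1)
c, mu_g = -2.0, np.array([0.5, -0.3])
Sigma = np.array([[0.20, 0.05],
                  [0.05, 0.10]])
p = np.array([1.0, 0.8])
z = 1.6448536269514722            # Phi^{-1}(0.95), i.e. delta = 0.05

# deterministic second-order cone form of the chance constraint
L = np.linalg.cholesky(Sigma)
socp_lhs = c + mu_g @ p + z * np.linalg.norm(L.T @ p)

# Monte Carlo estimate of the feasibility probability under the same model
grads = mu_g + rng.standard_normal((200_000, 2)) @ L.T
feas_prob = np.mean(c + grads @ p <= 0.0)
```

Here `socp_lhs <= 0` certifies that the sampled feasibility probability exceeds 95%, which is exactly why the whole subproblem can be handed to an off-the-shelf SOCP solver.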
The solution of the SOCP yields a search direction p_t. To determine a step length, the authors replace the traditional line search with a constrained Thompson sampling procedure that samples candidate points along the line x_t + α p_t while enforcing the constraints. This stochastic line search respects the modeled uncertainty and ensures that the next evaluation point lies within the feasible region with high probability.
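One way to realize such a stochastic line search is to draw one joint posterior sample of the objective and one of the constraint along the ray, keep the step sizes whose sampled constraint is nonpositive, and return the feasible step minimizing the sampled objective. A hedged numpy sketch with a made-up one-dimensional posterior (not the paper's exact procedure):

```python
import numpy as np

def ts_line_search(alphas, f_mean, f_cov, c_mean, c_cov, rng):
    """Constrained Thompson-sampling line search (illustrative sketch).

    Draws a joint sample of the objective and of the constraint along the
    candidate step sizes, keeps the feasible ones, and returns the feasible
    step with the smallest sampled objective (falling back to the smallest
    sampled violation when nothing is feasible).
    """
    f_s = rng.multivariate_normal(f_mean, f_cov)
    c_s = rng.multivariate_normal(c_mean, c_cov)
    feasible = c_s <= 0.0
    if feasible.any():
        idx = np.flatnonzero(feasible)[np.argmin(f_s[feasible])]
    else:
        idx = int(np.argmin(c_s))
    return alphas[idx]

# toy posterior along the line x_t + alpha * p_t (values invented for the demo)
rng = np.random.default_rng(2)
alphas = np.linspace(0.0, 1.0, 50)
f_mean = (alphas - 0.7) ** 2        # objective dips near alpha = 0.7
c_mean = alphas - 0.9               # constraint becomes active past alpha = 0.9
cov = 1e-6 * np.exp(-0.5 * (alphas[:, None] - alphas[None, :]) ** 2 / 0.1**2)
cov += 1e-10 * np.eye(len(alphas))  # jitter for numerical stability
step = ts_line_search(alphas, f_mean, cov, c_mean, cov, rng)
```

Because the candidates are scored under a sampled realization of the model rather than its mean, the selected step naturally trades off exploiting the predicted dip against exploring where the posterior is uncertain.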
The algorithm proceeds iteratively: (1) collect zero‑order observations, (2) update the GP surrogates for the objective and constraints, (3) compute the expected Hessian of the Lagrangian, (4) solve the uncertainty‑aware SOCP for p_t, (5) perform constrained Thompson sampling to pick the next point, and (6) repeat until the evaluation budget is exhausted.
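The loop above can be sketched as a short Python skeleton. Everything below is a toy stand-in (central finite differences in place of GP derivative inference, a plain descent direction in place of the SOCP step, random step sizes in place of constrained Thompson sampling); it only illustrates the control flow of steps (1)–(6), not the paper's actual components:

```python
import numpy as np

def grad_estimate(f, x, h=1e-5):
    # central finite differences as a stand-in for GP gradient inference
    e = np.eye(len(x))
    return np.array([(f(x + h * e[i]) - f(x - h * e[i])) / (2 * h)
                     for i in range(len(x))])

def bayesqp_loop(f, x0, budget, rng):
    """High-level BayeSQP-style iteration skeleton (toy stand-ins throughout)."""
    x = x0.copy()
    data = []
    for _ in range(budget):
        y = f(x)                                  # (1) zero-order observation
        data.append((x.copy(), y))                # (2) would update GP surrogates
        g = grad_estimate(f, x)                   # (2)-(3) surrogate derivatives
        p = -g                                    # (4) toy subproblem: descent step
        alphas = rng.uniform(0.0, 0.5, size=8)    # (5) randomized line search
        x = min((x + a * p for a in alphas), key=f)
    return x                                      # (6) budget exhausted

rng = np.random.default_rng(3)
x_star = bayesqp_loop(lambda x: ((x - 1.0) ** 2).sum(), np.zeros(2), 30, rng)
```

On the smooth quadratic above the skeleton contracts toward the minimizer at (1, 1); the real algorithm replaces each stand-in with the uncertainty-aware machinery described earlier.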
Experimental evaluation focuses on high‑dimensional constrained benchmark problems (typically >50 dimensions). BayeSQP is compared against state‑of‑the‑art high‑dimensional BO methods such as GIBO, TuRBO, and SCBO. Results show that BayeSQP achieves faster convergence, higher success rates, and better final objective values under the same evaluation budget, especially when constraints are present. The authors attribute these gains to the use of second‑order information and the explicit handling of uncertainty in both objective and constraints.
Strengths of the work include: (i) a principled way to obtain gradient and curvature information from only function evaluations, (ii) a robust subproblem formulation that guarantees high‑probability improvement, (iii) tractable conversion to an SOCP, and (iv) a unified treatment of constrained and unconstrained problems. Limitations are also acknowledged: the full covariance of the Hessian is omitted due to its prohibitive O(d^4) storage, potentially under‑representing curvature uncertainty in very high dimensions; the approach relies on smooth kernels and may struggle with non‑smooth or highly irregular functions; and the empirical study, while promising, is limited to a modest set of synthetic benchmarks without large‑scale real‑world applications. Moreover, the constrained Thompson sampling line search introduces additional hyper‑parameters (e.g., confidence levels, number of samples) that may require tuning.
In summary, BayeSQP presents a compelling integration of SQP and BO, offering a new local optimization paradigm that leverages second‑order GP models and uncertainty‑aware optimization. Future work could explore efficient approximations of Hessian covariance, alternative kernels for non‑smooth problems, scalability to thousands of dimensions and constraints, and broader validation on practical engineering design tasks.