COBRA++: Enhanced COBRA Optimizer with Augmented Surrogate Pool and Reinforced Surrogate Selection

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the original arXiv source.

Real-world optimization problems pose significant challenges to optimization algorithms, such as expensive function evaluations and complex constraints. The COBRA optimizer (including its up-to-date variants) is a representative and effective tool for such problems: it introduces 1) RBF surrogates to reduce the number of online evaluations and 2) a bi-stage optimization process that alternates between searching for a feasible solution and searching for an optimal one. Though promising, its design space, i.e., the surrogate model pool and the selection criterion, is still decided manually by human experts, resulting in labor-intensive fine-tuning for novel tasks. In this paper, we propose a learning-based adaptive strategy (COBRA++) that enhances COBRA in two aspects: 1) an augmented surrogate pool that breaks the tie to a single RBF-like surrogate and thereby improves model diversity and approximation capability; 2) a reinforcement learning-based online model selection policy that enables an efficient and accurate optimization process. The selection policy is trained to maximize the overall performance of COBRA++ across a distribution of constrained optimization problems with diverse properties. Multi-dimensional validation experiments demonstrate that COBRA++ achieves substantial performance improvements over vanilla COBRA and its adaptive variants. Ablation studies confirm the contribution of each design component in COBRA++.


💡 Research Summary

The paper introduces COBRA++, an enhanced version of the Constrained Optimization by Radial Basis Function Approximation (COBRA) framework, targeting expensive black‑box constrained optimization problems where function evaluations are costly and the evaluation budget is severely limited. While the original COBRA relies on a single Radial Basis Function (RBF) surrogate for both objective and constraints and uses a deterministic bi‑stage search (first feasibility, then objective improvement), it suffers from two major drawbacks: (1) a fixed surrogate may be ill‑suited for diverse problem landscapes, leading to poor approximation and sub‑optimal search, and (2) the surrogate‑selection mechanism in adaptive variants such as A‑SACOBRA is handcrafted and lacks learning‑driven adaptability.

COBRA++ addresses these issues through two complementary innovations. First, it expands the surrogate pool from a single RBF to eleven distinct RBF models: one cubic kernel, plus multiquadric and Gaussian kernels each instantiated with five width scaling factors (w = 0.01, 0.2, 0.5, 1, 5). This diversified pool increases expressive power, allowing the algorithm to capture smooth, highly nonlinear, and locally varying behaviors of both objective and constraint functions. All models are retrained at every iteration using the accumulated dataset, ensuring that each surrogate reflects the most recent information.
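The eleven-model pool can be sketched in a few lines of NumPy. The kernel forms below are the standard cubic, multiquadric, and Gaussian RBFs; exactly how the width factor w enters each kernel, and the class and function names (`make_kernel`, `RBFSurrogate`, `POOL`), are our own illustrative assumptions rather than the paper's implementation:

```python
import numpy as np

def make_kernel(name, w=1.0):
    """Return an RBF kernel phi(r); w is the assumed width scaling factor."""
    if name == "cubic":
        return lambda r: r ** 3
    if name == "multiquadric":
        return lambda r: np.sqrt(1.0 + (w * r) ** 2)
    if name == "gaussian":
        return lambda r: np.exp(-((w * r) ** 2))
    raise ValueError(f"unknown kernel: {name}")

# The 11-model pool: 1 cubic + 5 multiquadric + 5 Gaussian kernels,
# the latter two families instantiated with the five width factors.
WIDTHS = [0.01, 0.2, 0.5, 1.0, 5.0]
POOL = ([("cubic", None)]
        + [("multiquadric", w) for w in WIDTHS]
        + [("gaussian", w) for w in WIDTHS])

class RBFSurrogate:
    """Interpolating RBF surrogate, refitted each iteration to all data."""
    def __init__(self, kernel):
        self.kernel = kernel

    def fit(self, X, y):
        self.X = np.asarray(X, float)
        # Pairwise distances between training points.
        r = np.linalg.norm(self.X[:, None] - self.X[None, :], axis=-1)
        # Solve Phi @ coeffs = y (lstsq for numerical robustness).
        self.coeffs = np.linalg.lstsq(self.kernel(r), y, rcond=None)[0]
        return self

    def predict(self, Xq):
        r = np.linalg.norm(np.asarray(Xq, float)[:, None] - self.X[None, :],
                           axis=-1)
        return self.kernel(r) @ self.coeffs
```

In a full COBRA-style loop, one such surrogate would be fitted per objective and per constraint at every iteration, with the policy choosing which pool entry to trust for the next candidate point.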

Second, COBRA++ replaces heuristic surrogate selection with a reinforcement‑learning (RL) policy trained via a Deep Q‑Network (DQN). The surrogate‑selection process is formalized as a Markov Decision Process (MDP). At time step t the state sₜ consists of (i) per‑model features (eight dimensions each) that encode average prediction error normalized by objective range, a binary history of model usage over the last five steps, the count of successful feasibility steps, and a normalized cumulative contribution metric; and (ii) two global optimization features: the standard deviation of the last five objective values and the ratio of remaining function‑evaluation budget. The action space contains eleven discrete actions, each corresponding to selecting one surrogate from the pool. The reward rₜ is binary: it equals 1 if the new solution xₜ₊₁ improves the true objective and satisfies all constraints, otherwise 0. This direct linkage forces the policy to learn selections that lead to feasible, improving moves.
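The binary reward and the two global state features follow directly from this description. A minimal sketch (function names and signatures are our own; constraints are assumed in the standard g_i(x) ≤ 0 form):

```python
import numpy as np

def step_reward(f_new, g_new, f_best):
    """Binary MDP reward: 1 iff the new point both satisfies every
    inequality constraint g_i(x) <= 0 and improves the incumbent
    objective value, else 0."""
    feasible = all(g <= 0.0 for g in g_new)
    return 1 if (feasible and f_new < f_best) else 0

def global_features(obj_history, evals_used, budget):
    """Two global state features: the standard deviation of the last
    five objective values and the fraction of budget remaining."""
    recent = obj_history[-5:]
    return np.array([np.std(recent), (budget - evals_used) / budget])
```

Tying the reward to the true (not surrogate) objective and constraints is what forces the policy toward selections that produce feasible, improving evaluations.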

The DQN architecture comprises two parallel multilayer perceptrons (MLPs) that process the per‑model and global feature vectors separately, followed by concatenation and a Q‑value head. Training employs ε‑greedy exploration, experience replay, and a target‑network update scheme. The policy is trained on a distribution of benchmark problems varying in dimensionality (30, 50, 100), number of inequality constraints, and landscape complexity, thereby encouraging generalization across problem families.
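A forward-pass sketch of this two-branch Q-network in plain NumPy. The hidden-layer sizes, the flattening of the 11 × 8 per-model features into one vector, and all names here are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_init(sizes):
    """Small random weights and zero biases for an MLP with given layer sizes."""
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp_forward(params, x):
    """Forward pass with ReLU on hidden layers, linear output layer."""
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x

# Two parallel branches: per-model features (11 models x 8 features,
# flattened) and the 2 global features; hidden sizes are assumptions.
model_branch  = mlp_init([11 * 8, 64, 32])
global_branch = mlp_init([2, 16, 8])
q_head        = mlp_init([32 + 8, 32, 11])  # 11 Q-values, one per surrogate

def q_values(per_model_feats, global_feats):
    """Encode each branch, concatenate, and map to Q-values."""
    h = np.concatenate([mlp_forward(model_branch, per_model_feats),
                        mlp_forward(global_branch, global_feats)])
    return mlp_forward(q_head, h)
```

At each iteration the agent would act ε-greedily on these Q-values (`np.argmax(q_values(...))` with probability 1 − ε) to pick the surrogate, with experience replay and a target network handling the DQN training loop itself.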

Empirical evaluation compares COBRA++ against vanilla COBRA, SACOBRA, and A‑SACOBRA on synthetic benchmarks and real‑world engineering design tasks, all under strict evaluation budgets (200–500 true function calls). Performance metrics include final objective value, feasibility success rate, and convergence speed. Results show that COBRA++ consistently achieves lower final objective values—typically a 15–30 % reduction relative to the best baseline—and higher feasibility rates, especially on problems with highly nonlinear constraints where the improvement can exceed 20 %. Ablation studies isolate the contributions of the enlarged surrogate pool and the RL selection policy; each component alone yields modest gains, but their combination produces the largest performance boost, confirming a synergistic effect. Further analysis of the learned policy reveals dynamic adaptation: early in the search, wide‑bandwidth Gaussian surrogates dominate to promote exploration, while later stages favor narrow‑bandwidth multiquadric kernels that provide finer local approximations.

The authors conclude that (1) diversifying surrogate models substantially mitigates approximation errors across heterogeneous landscapes, and (2) learning‑driven surrogate selection outperforms static heuristics, offering better generalization to unseen problems. They suggest future extensions such as incorporating non‑RBF surrogates (e.g., Gaussian Processes, deep neural networks), handling multi‑objective or multi‑task settings, and developing adaptive budget allocation strategies. Overall, COBRA++ demonstrates that integrating a rich surrogate pool with reinforcement‑learning‑based model management can significantly advance the state of the art in expensive constrained optimization.

