New Techniques for Algorithm Portfolio Design

We present and evaluate new techniques for designing algorithm portfolios. In our view, the problem has both a scheduling aspect and a machine learning aspect. Prior work has largely addressed one of the two aspects in isolation. Building on recent work on the scheduling aspect of the problem, we present a technique that addresses both aspects simultaneously and has attractive theoretical guarantees. Experimentally, we show that this technique can be used to improve the performance of state-of-the-art algorithms for Boolean satisfiability, zero-one integer programming, and A.I. planning.


💡 Research Summary

The paper tackles the long‑standing problem of algorithm portfolio design by treating it as a joint scheduling‑and‑machine‑learning task. Traditional approaches have either focused on the scheduling side—deciding how to allocate limited computational resources among a set of candidate algorithms—or on the learning side—building predictive models that select the most promising algorithm for a given problem instance. These two aspects have been addressed in isolation, leading to sub‑optimal portfolios that cannot fully exploit the synergy between resource allocation and instance‑specific algorithm performance.

Building on recent work that formalized the scheduling component as a constrained optimization problem, the authors propose a unified framework that simultaneously learns instance‑specific performance models and computes an optimal execution schedule. The method proceeds in four stages:

1. Data collection: a comprehensive dataset is gathered by running each candidate algorithm on a large benchmark of instances, recording runtime, success/failure, and resource consumption, while also extracting a rich set of instance features (e.g., variable counts, clause density, structural motifs).

2. Model fitting: probabilistic models, using Bayesian regression or Gaussian processes, capture the conditional distribution of each algorithm's runtime and success probability given the extracted features. This step explicitly models uncertainty, which is crucial for downstream decision making.

3. Schedule optimization: an expected‑loss minimization problem is formulated; given a total time budget T, decide how much time to allocate to each algorithm so that the weighted sum of expected failures (or costs) is minimized. The resulting optimization is a linear program with Lagrange multipliers that can be solved efficiently, and the authors prove a (1 − ε) approximation guarantee relative to the optimal (NP‑hard) schedule.

4. Deployment: for a new instance, the system extracts features, queries the learned probabilistic models, re‑solves the expected‑loss objective, and produces a schedule that can be executed in real time. Because the schedule is recomputed per instance, the approach adapts dynamically to changing workloads.
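The per‑instance scheduling step can be illustrated with a minimal sketch. Note this is not the paper's Lagrangian linear program: it substitutes a simple greedy allocation over discretized time slices, and it assumes each algorithm's success probability is a known, independent function of its allotted time. The solver names and exponential success curves below are toy stand‑ins for the learned probabilistic models, not anything from the paper.

```python
import math

def greedy_schedule(success_prob, budget, step):
    """Allocate `budget` seconds in increments of `step` across algorithms.

    success_prob: dict mapping algorithm name -> f(t), the probability the
        algorithm solves the instance within t seconds.
    Returns (allocation dict, overall success probability), assuming
    independent failures, so P(solve) = 1 - prod_i (1 - f_i(t_i)).
    """
    alloc = {name: 0.0 for name in success_prob}
    for _ in range(int(round(budget / step))):
        # Give the next time slice to the algorithm with the largest
        # marginal gain in success probability at its current allocation.
        def gain(name):
            f = success_prob[name]
            return f(alloc[name] + step) - f(alloc[name])
        best = max(alloc, key=gain)
        alloc[best] += step
    p_fail = 1.0
    for name, f in success_prob.items():
        p_fail *= 1.0 - f(alloc[name])
    return alloc, 1.0 - p_fail

# Toy "learned models": exponential success-vs-time curves with
# different rates, standing in for per-instance runtime predictions.
models = {
    "solver_a": lambda t: 1 - math.exp(-0.8 * t),
    "solver_b": lambda t: 1 - math.exp(-0.3 * t),
}
alloc, p = greedy_schedule(models, budget=5.0, step=0.5)
```

Because the schedule is just a function of the predicted curves, it can be recomputed cheaply for every new instance, which mirrors the deployment stage described above.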

The theoretical contributions are twofold. First, the paper establishes that the Lagrangian‑based scheduler attains a provable approximation bound, bridging the gap between heuristic schedulers and exact combinatorial optimization. Second, it derives an upper bound on the impact of prediction error: if the learning model’s error is bounded by δ, the total expected loss of the portfolio grows at most linearly with δ and the number of algorithms, providing a clear link between model quality and overall portfolio performance.
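The second result can be sketched in symbols as follows; the notation here is illustrative rather than taken from the paper.

```latex
% Sketch of the robustness bound (notation ours). Let f_i be the true
% success-probability curves, \hat{f}_i the learned models with
% \max_i \|\hat{f}_i - f_i\|_\infty \le \delta, S the schedule optimized
% against the \hat{f}_i, and S^* the schedule optimal for the true f_i.
% Then, for a portfolio of k algorithms,
\[
  \mathbb{E}\bigl[\mathrm{loss}(S)\bigr]
  \;\le\;
  \mathbb{E}\bigl[\mathrm{loss}(S^{*})\bigr] \;+\; O(k\,\delta),
\]
% i.e., the excess expected loss degrades at most linearly in the
% prediction error \delta and the number of algorithms k.
```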

Empirical evaluation is conducted on three representative domains: Boolean satisfiability (SAT), 0‑1 integer programming (IP), and AI planning. For each domain the authors assemble a state‑of‑the‑art portfolio of solvers (e.g., CDCL‑based SAT solvers, branch‑and‑bound IP solvers, heuristic planners) and compare the unified method against three baselines: (1) a pure scheduling approach that ignores instance features, (2) a pure learning approach that selects a single algorithm per instance, and (3) a naïve combination of the two. Using over 10 000 benchmark instances per domain, the unified method consistently outperforms all baselines. In SAT, average runtime drops by more than 12 % and success rate improves by 3 %; in IP, runtime falls by 9 % and optimal‑solution discovery rises by 4 %; in planning, runtime falls by 11 % with a 5 % boost in goal reachability. Notably, the gains are most pronounced under tight time budgets (e.g., ≤5 seconds), highlighting the method's suitability for real‑time or embedded settings.

The paper concludes with several promising research directions. Extending the framework to multi‑core and distributed environments would require richer resource models (CPU cores, memory, network bandwidth). Incorporating graph‑neural‑network embeddings could allow the system to learn directly from raw, non‑vectorial instance representations, potentially improving prediction for highly structured problems. Finally, the authors envision a meta‑portfolio that can dynamically add or retire constituent algorithms, enabling long‑term adaptation as new solvers emerge.

In summary, this work delivers a principled, theoretically grounded, and empirically validated solution to algorithm portfolio design. By unifying scheduling and learning into a single optimization problem, it achieves both provable performance guarantees and practical speed‑ups across diverse combinatorial domains, marking a significant step forward for both research and applied algorithm engineering.