The Configurable SAT Solver Challenge (CSSC)
It is well known that different solution strategies work well for different types of instances of hard combinatorial problems. As a consequence, most solvers for the propositional satisfiability problem (SAT) expose parameters that allow them to be customized to a particular family of instances. In the international SAT competition series, these parameters are ignored: solvers are run using a single default parameter setting (supplied by the authors) for all benchmark instances in a given track. While this competition format rewards solvers with robust default settings, it does not reflect the situation faced by a practitioner who only cares about performance on one particular application and can invest some time into tuning solver parameters for this application. The new Configurable SAT Solver Competition (CSSC) compares solvers in this latter setting, scoring each solver by the performance it achieved after a fully automated configuration step. This article describes the CSSC in more detail, and reports the results obtained in its two instantiations so far, CSSC 2013 and 2014.
💡 Research Summary
The Configurable SAT Solver Competition (CSSC) introduces a novel evaluation paradigm for SAT solvers that reflects real‑world practice: solvers are not judged by their default configuration alone, but by the performance they achieve after an automated, time‑bounded parameter tuning phase. The authors motivate this shift by pointing out that most modern SAT solvers expose a rich set of command‑line parameters whose settings can dramatically affect runtime, yet the traditional SAT Competition ignores this flexibility and evaluates every solver with a single default setting.
CSSC therefore consists of two stages. In the training stage, each submitted solver is handed to three state‑of‑the‑art algorithm configurators (ParamILS, GGA, and SMAC), each of which searches the solver's parameter space on a set of training instances. The parameter space may contain real‑valued, integer‑valued, and categorical parameters, with conditional dependencies and forbidden combinations. The configurators run for a fixed budget (two days on 4–5 cores) and optimize the PAR‑10 metric (penalized average runtime, counting each time‑out as ten times the cutoff), with a 300‑second per‑run timeout and a 3 GB memory limit. After all configurator runs complete, the configuration with the best average training performance is selected.
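The PAR-10 metric described above can be sketched in a few lines of Python. This is a minimal illustration under the stated 300-second cutoff, not the competition's actual scoring code:

```python
CUTOFF = 300.0  # per-run timeout in seconds, as used in the CSSC

def par10(runtimes, cutoff=CUTOFF):
    """Penalized average runtime: each time-out counts as 10x the cutoff."""
    penalized = [t if t < cutoff else 10 * cutoff for t in runtimes]
    return sum(penalized) / len(penalized)

# Three runs: two solved (12.5 s and 250 s), one timed out (penalized to 3000 s).
print(par10([12.5, 250.0, 300.0]))  # (12.5 + 250.0 + 3000.0) / 3 = 1087.5
```

The heavy time-out penalty is what pushes configurators toward parameter settings that solve more instances, rather than settings that are merely fast on the instances already solved.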
In the testing stage, the chosen configuration is applied to a disjoint set of test instances, and the solver’s score is computed exactly as in the SAT Competition: the number of instances solved, with ties broken by average runtime. The competition retains the same track structure as the SAT Competition (Industrial, Crafted, Random, and in 2014 also Random SAT‑UNSAT), the same input/output formats, and the same scoring function, but uses a relatively short per‑run timeout (5 minutes) to emulate practitioner workloads that involve many instances with modest time budgets.
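The test-stage scoring rule can be sketched as follows. This is one plausible reading of "instances solved, with ties broken by average runtime" (here averaging over solved runs only), offered as an illustration rather than the competition's actual implementation:

```python
def score(runtimes, cutoff=300.0):
    """Return a sort key: more instances solved wins; among equals,
    lower average runtime on solved instances wins.
    Time-outs are recorded as runtimes >= cutoff."""
    solved = [t for t in runtimes if t < cutoff]
    avg_runtime = sum(solved) / len(solved) if solved else float("inf")
    return (-len(solved), avg_runtime)  # smaller tuple = better rank

# Hypothetical results: A solves two instances, B solves only one.
solvers = {"A": [10.0, 20.0, 300.0], "B": [5.0, 300.0, 300.0]}
ranking = sorted(solvers, key=lambda name: score(solvers[name]))
print(ranking)  # ['A', 'B']
```

Encoding the rule as a tuple `(-solved, avg_runtime)` lets Python's lexicographic tuple comparison apply the tie-break automatically.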
The paper reports on the two editions of CSSC held in 2013 and 2014. For each edition, the authors describe the benchmark suites (each consisting of at least 500 instances, split evenly into 250 training and 250 test instances), the participating solvers (eleven in 2013, a similar set in 2014, ranging from classic CDCL solvers to stochastic local‑search solvers), and the configuration pipelines. They emphasize that all solvers were subjected to the same configurators to avoid bias toward any particular configuration method.
Key findings are: (1) Automated configuration yields substantial speed‑ups over default settings—often an order of magnitude, and in some cases several orders of magnitude—demonstrating that parameter tuning is essential for high performance on specific instance families. (2) The benefit varies widely across solvers; solvers with large, expressive parameter spaces gain the most, while solvers with few parameters improve only modestly. Consequently, the ranking of solvers after configuration can differ dramatically from the ranking observed in the traditional SAT Competition. The authors also note that even a parameter‑free but very strong solver could win a track, though in practice the gains from tuning were large enough that tuned solvers usually prevailed.
The authors discuss broader implications: the CSSC framework can be transferred to other competition settings (e.g., planning, optimization), encouraging the community to evaluate solvers under realistic, application‑specific tuning conditions. They argue that future SAT solver development should place greater emphasis on exposing well‑structured, tunable parameters and on providing robust interfaces for automated configuration tools. Moreover, the competition highlights the importance of algorithm configuration research itself, as the three configurators used (ParamILS, GGA, SMAC) each contributed complementary strengths.
In summary, the paper presents a thorough design, implementation, and empirical evaluation of a competition that measures SAT solver performance after automated, budget‑constrained parameter optimization. The results convincingly show that such a methodology yields far more informative assessments of solver capabilities for targeted applications than the traditional default‑only approach, and it paves the way for more nuanced, practitioner‑oriented benchmarking in the SAT community and beyond.