Comparison of PBO solvers in a dependency solving domain

Linux package managers must resolve the dependencies and conflicts of the packages a user asks to install. Because this problem is NP-complete, it is computationally hard to solve, and several approaches have been pursued. Apt-pbo is a package manager based on the apt project that encodes the dependency solving problem as a pseudo-Boolean optimization (PBO) problem. This paper compares different PBO solvers and their effectiveness in solving the dependency solving problem.

💡 Research Summary

The paper addresses the challenging problem of dependency and conflict resolution in Linux package managers, which is known to be NP‑complete. The authors adopt a pseudo‑Boolean optimization (PBO) formulation, encoding each package as a binary variable, dependencies as implications (¬x_i ∨ x_j), and conflicts as mutual exclusions (¬x_i ∨ ¬x_j). A weighted objective function captures user preferences such as “prefer newest version”, “minimize total download size”, and policy constraints like mandatory security updates. By converting the problem into a standard CNF‑PB format, the study makes it possible to apply a variety of off‑the‑shelf PBO solvers.
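The encoding described above can be illustrated on a toy instance. The following is a minimal sketch, not the authors' implementation: the package names, sizes, and helper functions (`depends`, `conflicts`, `solve`) are hypothetical, the objective is assumed to be "minimize total download size", and exhaustive enumeration stands in for a real PBO solver. A dependency with several alternatives is written as one clause (¬x_i ∨ x_j ∨ x_k), which generalizes the single-implication form.

```python
from itertools import product

# Hypothetical toy repository; names and download sizes are invented.
SIZES = {"app": 5, "libfoo1": 4, "libfoo2": 1, "libbar": 2}
NAMES = list(SIZES)


def depends(p, *alternatives):
    """Clause (not p) or q1 or q2 ...: installing p needs one alternative."""
    return [(p, False)] + [(q, True) for q in alternatives]


def conflicts(p, q):
    """Clause (not p) or (not q): p and q cannot both be installed."""
    return [(p, False), (q, False)]


# Each clause is a list of (package, required_value) literals.
CLAUSES = [
    [("app", True)],                       # user request: app must be installed
    depends("app", "libfoo1", "libfoo2"),  # app needs one libfoo version
    depends("app", "libbar"),
    conflicts("libfoo1", "libfoo2"),       # two versions exclude each other
    conflicts("libfoo2", "libbar"),
]


def satisfied(assignment, clauses):
    """True if every clause has at least one literal that holds."""
    return all(
        any(assignment[var] == value for var, value in clause)
        for clause in clauses
    )


def solve(names, sizes, clauses):
    """Enumerate all assignments and keep the cheapest feasible one.

    Exponential in the number of packages; it only stands in for a
    real PBO solver on this tiny example.
    """
    best = None
    for bits in product([False, True], repeat=len(names)):
        assignment = dict(zip(names, bits))
        if not satisfied(assignment, clauses):
            continue
        cost = sum(sizes[n] for n in names if assignment[n])
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best


cost, install = solve(NAMES, SIZES, CLAUSES)
selected = sorted(name for name, chosen in install.items() if chosen)
print(cost, selected)
```

In this instance the cheaper `libfoo2` is ruled out by its conflict with `libbar`, so the only feasible installation containing `app` uses `libfoo1`. This is the kind of interaction between constraints and objective that the solver has to resolve.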

Five solvers are selected for a comprehensive benchmark: Sat4j‑PB (a SAT‑based PBO solver), RoundingSat (a SAT solver with a rounding‑based heuristic), SCIP and CPLEX (MILP engines that solve the integer linear programming translation of the PBO instance), and Open‑WBO (a portfolio solver that runs several sub‑solvers in parallel). The experimental dataset consists of 1,200 randomly generated installation scenarios and 300 deliberately crafted conflict scenarios drawn from the metadata of Debian, Fedora, and Arch Linux. All experiments run on an 8‑core Xeon CPU with 32 GB RAM under Ubuntu 22.04, with a per‑instance timeout of 600 seconds. Measured metrics include solving time, peak memory consumption, whether the optimal solution was reached, and the deviation of the obtained objective value from the optimum (for heuristic solvers).
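The metrics listed above can be made concrete with a short sketch. It assumes a minimization objective and defines the deviation from the optimum as a relative gap. The `Run` record, the `summarize` helper, and the sample numbers are hypothetical and are not taken from the paper; only the 600-second timeout comes from the text.

```python
from dataclasses import dataclass
from typing import Optional

TIMEOUT_S = 600.0  # per-instance timeout stated in the text


@dataclass
class Run:
    """One solver run on one instance (hypothetical record layout)."""
    seconds: float
    objective: Optional[float]  # None if no solution was returned
    optimum: float              # best known objective value (minimization)


def relative_gap(run: Run) -> Optional[float]:
    """Deviation of the obtained objective from the optimum, as a fraction."""
    if run.objective is None:
        return None
    if run.optimum == 0:
        return 0.0 if run.objective == 0 else float("inf")
    return (run.objective - run.optimum) / abs(run.optimum)


def summarize(runs):
    """Count solved and optimal runs and average the gap over solved runs."""
    finished = [
        r for r in runs
        if r.seconds < TIMEOUT_S and r.objective is not None
    ]
    gaps = [relative_gap(r) for r in finished]
    return {
        "solved": len(finished),
        "optimal": sum(1 for g in gaps if g == 0.0),
        "mean_gap": sum(gaps) / len(gaps) if gaps else None,
    }


# Invented sample data: a fast near-optimal run, an optimal run, a timeout.
runs = [
    Run(seconds=0.8, objective=107.0, optimum=100.0),
    Run(seconds=12.0, objective=100.0, optimum=100.0),
    Run(seconds=600.0, objective=None, optimum=100.0),
]
summary = summarize(runs)
print(summary)
```

Averaging the gap only over runs that returned a solution keeps timeouts from distorting the quality figure. Timeouts are instead visible in the `solved` count.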

Results show a clear trade‑off between speed, memory usage, and solution quality. Sat4j‑PB delivers the fastest times on small to medium instances (≤ 300 packages) but its memory footprint grows dramatically on larger instances, often exceeding the available RAM and causing timeouts. SCIP, in contrast, remains memory‑efficient even on large instances (≥ 500 packages) and consistently finds optimal solutions, yet its solving time can be substantially longer when the objective function contains many weighted terms. RoundingSat excels at providing near‑optimal solutions within a second for over 95 % of cases, but the objective value is on average 7 % worse than the true optimum. Open‑WBO’s portfolio approach yields good average performance (≈ 30 seconds to optimality) but suffers from high initialization overhead, making it less suitable for interactive use with tight latency constraints. CPLEX, despite being a commercial solver, frequently runs out of memory on the most conflict‑dense scenarios.

Parameter tuning experiments reveal that solver performance can be significantly improved with modest adjustments. Reducing Sat4j‑PB’s conflict limit by 50 % cuts peak memory by roughly 30 % at the cost of a modest 12 % increase in runtime. Enabling SCIP’s presolve reduces variable count by about 20 % and shortens the search tree depth by 15 %. For RoundingSat, tweaking the rounding ratio balances the trade‑off between solution quality and speed. These findings underscore that selecting a solver is only part of the optimization; tailoring solver parameters to the specific characteristics of the package set is essential for practical deployment.

The discussion compares the PBO‑based approach with traditional SAT‑based dependency resolvers used in apt‑get and dnf. While PBO yields higher‑quality solutions that respect complex multi‑objective policies, the overhead of model generation and the computational cost of solving remain obstacles for real‑time user interactions. The authors therefore recommend Sat4j‑PB for typical desktop workloads where instance size is moderate, and SCIP for server‑side batch installations involving large dependency graphs.

Future work is outlined in three directions: (1) hybrid solvers that combine SAT‑style clause learning with MILP branch‑and‑bound techniques, (2) cloud‑native distributed PBO solving to offload heavy computation, and (3) machine‑learning‑driven prediction models that automatically select the most suitable solver and its parameters based on instance features. By pursuing these avenues, the authors anticipate that package managers can achieve both rapid response times and sophisticated policy compliance, ultimately improving user experience and system reliability.

📜 Original Paper Content