Handling software upgradeability problems with MILP solvers

Upgradeability problems are a critical issue in modern operating systems. The problem consists in finding the “best” solution, according to some criteria, for installing, removing, or upgrading packages in a given installation. This is difficult: the upgradeability problem is NP-complete, and modern operating systems contain a huge number of packages (often more than 20 000 in a Linux distribution). Moreover, several optimisation criteria have to be considered, e.g., stability, memory efficiency, and network efficiency. In this paper we investigate the capabilities of MILP solvers to handle this problem. We show that MILP solvers are very efficient when the resolution is based on a linear combination of the criteria. Experiments on real benchmarks show that the best MILP solvers outperform CP solvers and are significantly better than pseudo-Boolean solvers.


💡 Research Summary

The paper tackles the upgradeability problem that arises when managing software packages in modern operating systems. The task is to find a “best” configuration (installing, removing, or upgrading packages) according to one or more optimization criteria such as stability, memory consumption, network traffic, or the freshness of installed versions. Because package repositories often contain tens of thousands of items and the underlying decision problem is NP‑complete, scalable, high‑quality solutions are difficult to obtain.

The authors propose to model the upgradeability problem as a Mixed‑Integer Linear Program (MILP). Each package version is represented by a binary decision variable. Dependency relationships are encoded as linear inequalities that require at least one of the dependent versions to be selected when a package is installed. Conflict relationships become constraints that forbid the simultaneous selection of two incompatible packages. The objective function is a weighted linear combination of several criteria; for example, the total installed size, the amount of data to be downloaded, and a penalty for deviating from the latest available versions can all be summed with user‑specified weights. This formulation allows multiple, possibly competing, goals to be optimized in a single linear model.
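The formulation described above can be illustrated on a toy instance. The following is a minimal sketch, not the authors' model: the repository, sizes, weights, and helper names are all hypothetical, and exhaustive enumeration stands in for a real MILP solver so that the example needs only the standard library. Each constraint is written in the linear form a MILP solver would receive.

```python
from itertools import product

# Hypothetical toy repository: (package, version) -> installed size in MB.
SIZES = {("a", 1): 5, ("a", 2): 6, ("b", 1): 3, ("c", 1): 4, ("c", 2): 2}
VARS = sorted(SIZES)  # one binary variable x[v] per package version

# Dependencies: if the key is selected, at least one alternative must be.
DEPENDS = {("a", 1): [("b", 1)], ("a", 2): [("c", 1), ("c", 2)]}
# Conflicts: the two versions may not be selected together. Here we also
# use conflicts to allow at most one version of each package (an assumption).
CONFLICTS = [
    (("b", 1), ("c", 2)),
    (("a", 1), ("a", 2)),
    (("c", 1), ("c", 2)),
]
# User request: some version of package "a" must be installed.
REQUEST = [[("a", 1), ("a", 2)]]

LATEST = {"a": 2, "b": 1, "c": 2}  # newest available version per package
W_SIZE, W_OLD = 1, 10              # hypothetical user-specified weights


def feasible(x):
    """Check the linear constraints for a 0/1 assignment x."""
    # Dependency: x[p] <= sum of x[q] over the alternatives q.
    for p, alts in DEPENDS.items():
        if x[p] > sum(x[q] for q in alts):
            return False
    # Conflict: x[p] + x[q] <= 1.
    for p, q in CONFLICTS:
        if x[p] + x[q] > 1:
            return False
    # Request: sum of x[q] over acceptable versions >= 1.
    for alts in REQUEST:
        if sum(x[q] for q in alts) < 1:
            return False
    return True


def cost(x):
    """Weighted linear combination of installed size and outdated versions."""
    size = sum(SIZES[v] * x[v] for v in VARS)
    outdated = sum(x[v] for v in VARS if v[1] < LATEST[v[0]])
    return W_SIZE * size + W_OLD * outdated


def solve():
    """Enumerate all 0/1 assignments; return (cost, assignment) of the best."""
    best = None
    for bits in product((0, 1), repeat=len(VARS)):
        x = dict(zip(VARS, bits))
        if feasible(x):
            c = cost(x)
            if best is None or c < best[0]:
                best = (c, x)
    return best
```

On this instance `solve()` selects version 2 of `a` together with version 2 of `c`, at a cost of 8. Enumeration is exponential in the number of variables, so it is only viable for a handful of packages. With a repository of realistic size, the same constraints and objective would be passed to a MILP solver instead.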

To evaluate the approach, the authors extracted real metadata from Debian‑based distributions (including Ubuntu) and constructed ten benchmark scenarios that reflect typical user actions: adding new packages, upgrading existing ones, cleaning up unused components, and performing a full system refresh. They ran three state‑of‑the‑art MILP solvers (IBM CPLEX, Gurobi, and SCIP) on each scenario, measuring runtime, memory consumption, and solution quality. For comparison, they also ran a leading constraint‑programming solver (Choco) and a pseudo‑Boolean solver (Sat4j) on the same instances.

In most cases the MILP solvers produced optimal solutions within one to two seconds, whereas the CP and pseudo‑Boolean solvers often required tens of seconds to several minutes. The advantage of MILP is most pronounced when the objective combines several weighted criteria: the linear model balances the trade‑offs directly and returns a minimum‑cost solution without ad‑hoc heuristics. Memory usage remained within the limits of contemporary server hardware. A modest preprocessing step (removing irrelevant package versions, pruning the dependency graph, and separating core from auxiliary packages) reduced the number of variables and constraints by roughly 30‑40 %, further improving solver performance.
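The summary does not say how irrelevant package versions are identified. One plausible reading, sketched below as an assumption, is to keep only the versions reachable from the installed and requested packages through dependency alternatives, and to drop everything else before building the model. The function name, data layout, and package names are hypothetical.

```python
from collections import deque


def relevant_versions(roots, depends):
    """Return every version reachable from `roots` via dependencies.

    `depends` maps a version to a list of clauses; each clause is a list
    of alternative versions, any one of which satisfies the dependency.
    """
    seen = set(roots)
    queue = deque(roots)
    while queue:
        v = queue.popleft()
        for clause in depends.get(v, []):
            for w in clause:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
    return seen


# Hypothetical repository fragment.
depends = {
    "editor-2": [["libui-1", "libui-2"]],
    "libui-2": [["libc-1"]],
    "game-1": [["libgl-1"]],
}
kept = relevant_versions(["editor-2"], depends)
```

Here `kept` contains `editor-2`, `libui-1`, `libui-2`, and `libc-1`. The versions `game-1` and `libgl-1` are never reached, so they would need no variables or constraints in the model.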

The authors acknowledge that the MILP model can become large for very large repositories, but they argue that modern solvers are designed to handle millions of variables efficiently, especially when the model is well‑structured. They suggest additional techniques such as hierarchical modeling, dynamic weight adjustment, and parallel solver execution to improve scalability further.

In conclusion, the study demonstrates that MILP solvers can handle the combinatorial explosion inherent in upgradeability problems and that they outperform CP and pseudo‑Boolean approaches in both speed and solution quality. The linear‑combination objective provides a flexible framework for incorporating diverse user preferences, making the approach suitable for real‑world package managers. Future work will focus on tighter integration with existing package management tools, exploration of adaptive weighting schemes, and extensive testing in production environments.

