Approximation Algorithms for Semi-random Graph Partitioning Problems

In this paper, we propose and study a new semi-random model for graph partitioning problems. We believe that it captures many properties of real-world instances. The model is more flexible than the semi-random model of Feige and Kilian and the planted random model of Bui, Chaudhuri, Leighton and Sipser. We develop a general framework for solving semi-random instances and apply it to several problems of interest. We present constant factor bi-criteria approximation algorithms for semi-random instances of the Balanced Cut, Multicut, Min Uncut, Sparsest Cut and Small Set Expansion problems. We also show how to almost recover the optimal solution if the instance satisfies an additional expanding condition. Our algorithms work in a wider range of parameters than most algorithms for previously studied random and semi-random models. Additionally, we study a new planted algebraic expander model and develop constant factor bi-criteria approximation algorithms for graph partitioning problems in this model.

💡 Research Summary

This paper introduces a novel semi‑random model for graph partitioning that is more flexible than both the earlier Feige‑Kilian semi‑random model and the planted random model of Bui, Chaudhuri, Leighton, and Sipser. It is designed to capture the mixed structure–noise patterns observed in many real‑world networks. In the model, a base graph is required to be an algebraic expander (i.e., it has strong global expansion properties), while an adversary is allowed to add, delete, or rewire a limited number of edges. This combination yields a graph that is globally well‑expanded but may contain locally corrupted regions, reflecting realistic data where a clean underlying structure is perturbed by noise or malicious edits.
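To make the "clean structure plus bounded perturbation" idea concrete, here is a minimal sketch of an instance generator. It is not the paper's model definition: the base graph is assumed to be an Erdős–Rényi graph (used only as a stand-in for an expander), and the adversary is simulated by random edge flips, whereas a real adversary would choose its edits. The function name `semi_random_instance` and the parameters `p_base` and `budget` are hypothetical.

```python
import random

def semi_random_instance(n, p_base, budget, seed=0):
    """Return (base_edges, perturbed_edges) on vertices 0..n-1.

    Assumptions (not from the paper): the base graph is G(n, p_base),
    a stand-in for an expander, and the "adversary" flips `budget`
    vertex pairs chosen at random instead of adversarially.
    """
    rng = random.Random(seed)
    all_pairs = [(u, v) for u in range(n) for v in range(u + 1, n)]

    # Random base graph: keep each pair independently with probability p_base.
    base = {e for e in all_pairs if rng.random() < p_base}

    # Perturbation: flip distinct pairs (present edges are deleted,
    # absent edges are added), at most `budget` of them.
    perturbed = set(base)
    for e in rng.sample(all_pairs, min(budget, len(all_pairs))):
        perturbed.symmetric_difference_update({e})
    return base, perturbed
```

Because the flipped pairs are distinct, the base and perturbed edge sets differ in exactly `budget` pairs (when `budget` does not exceed the number of vertex pairs), which is one simple way to read "a limited number of edges".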

The authors develop a general algorithmic framework that works for a wide class of partitioning problems under this model. The framework consists of three main components: (1) a semidefinite programming (SDP) relaxation that simultaneously encodes the objective (cut cost) and the balance or demand constraints of the problem, (2) a spectral preprocessing step that extracts the expansion properties of the underlying expander via Laplacian eigenvectors, and (3) a novel bi‑criteria rounding scheme. The rounding scheme is “bi‑criteria” because it guarantees approximation bounds for two quantities at once—typically the cut value and a balance measure (or, for multicut, the number of satisfied demand pairs). By carefully scaling thresholds and performing multi‑level clustering on the SDP solution, the algorithm ensures that the expected cost stays within a constant factor of the optimum even when the adversary has altered the graph.
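The flavor of threshold-based, bi-criteria rounding can be illustrated with a standard sweep cut along a one-dimensional embedding. This is a generic textbook technique, not the rounding scheme of the paper: the `scores` input is a hypothetical stand-in for a projected SDP solution or a Laplacian eigenvector, and `min_frac` is an assumed relaxed balance parameter (the second criterion).

```python
def cut_size(edges, side):
    """Number of edges with exactly one endpoint in the set `side`."""
    return sum((u in side) != (v in side) for u, v in edges)

def balanced_sweep_cut(scores, edges, min_frac=0.25):
    """Best threshold cut along a one-dimensional embedding.

    scores:   dict vertex -> float (assumed to come from an SDP or
              spectral step that is not implemented here).
    min_frac: each side must contain at least min_frac * n vertices.
    Returns (side, cost), or (None, None) if no threshold is balanced.
    """
    order = sorted(scores, key=scores.get)
    n = len(order)
    best_side, best_cost = None, None
    for k in range(1, n):
        # Bi-criteria: accept any cut meeting the relaxed balance bound.
        if min(k, n - k) < min_frac * n:
            continue
        side = set(order[:k])
        cost = cut_size(edges, side)
        if best_cost is None or cost < best_cost:
            best_side, best_cost = side, cost
    return best_side, best_cost
```

On two triangles joined by a single edge, with scores that place one triangle below zero and the other above, the sweep returns the single bridging edge as the cut.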

Applying this framework, the paper obtains constant‑factor bi‑criteria approximation algorithms for several canonical partitioning problems:

Balanced Cut – The algorithm returns a cut whose size is O(1) times the optimum while keeping the two sides within a constant factor of the prescribed balance. This improves over the O(log n) factor typical of earlier semi‑random results.
Multicut – For a collection of source‑sink pairs, the method separates the pairs at a total cost within an O(1) factor of the optimal multicut value. The analysis handles the possibility that the adversary concentrates edge modifications on a few critical pairs by redistributing weights during rounding.
Min‑Uncut – The algorithm finds a set of edges whose removal makes the graph bipartite, achieving an O(1)‑approximation to the optimal uncut size even under semi‑random perturbations.
Sparsest Cut – By coupling the SDP with spectral information, the algorithm approximates the sparsest cut ratio within a constant factor, improving on the O(√log n) guarantee known for worst‑case instances.
Small‑Set Expansion – For sets of size at most δ n, the method finds a set whose edge boundary is within a constant factor of the optimum. The spectral step isolates small low‑expansion regions, after which the SDP rounding yields the final cut.
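The objectives in the list above can be written down directly for a candidate vertex set S. The following sketch uses common textbook conventions, which may differ from the paper's exact normalizations; all function names are hypothetical.

```python
def boundary(edges, S):
    """Edges with exactly one endpoint in S (the cut induced by S)."""
    S = set(S)
    return [(u, v) for u, v in edges if (u in S) != (v in S)]

def sparsity(edges, S, n):
    """Uniform sparsest-cut ratio: |cut| / (|S| * |V minus S|).

    One common convention; others divide by min(|S|, |V minus S|).
    """
    S = set(S)
    return len(boundary(edges, S)) / (len(S) * (n - len(S)))

def expansion(edges, S):
    """Edge boundary per vertex of S, the quantity bounded in
    small-set expansion (vertex-count normalization assumed)."""
    S = set(S)
    return len(boundary(edges, S)) / len(S)

def uncut_edges(edges, S):
    """Min Uncut objective: edges with both endpoints on the same
    side. Removing them leaves a graph that is bipartite with sides
    S and V minus S."""
    return len(edges) - len(boundary(edges, S))
```

For example, on two triangles joined by one edge with S equal to one triangle, the cut has one edge, the sparsity is 1/9, the expansion is 1/3, and six edges remain uncut.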

A particularly strong result is a “recovery theorem”: if the semi‑random instance satisfies an additional expansion condition (e.g., every set S of size up to α n has a boundary of size at least β |S|), the algorithm can almost recover the planted optimal partition. This shows that the planted structure remains identifiable despite adversarial noise.
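The example expansion condition stated above (every set S of size up to α n has a boundary of at least β |S| edges) can be checked by brute force on very small graphs. This is only an illustration of the condition as phrased here; the paper's precise condition may differ, and the enumeration is exponential in n. The function name `satisfies_expansion` is hypothetical.

```python
from itertools import combinations

def satisfies_expansion(n, edges, alpha, beta):
    """Check that every nonempty S with |S| <= alpha * n has at least
    beta * |S| boundary edges. Exponential time: tiny graphs only."""
    max_size = int(alpha * n)
    for k in range(1, max_size + 1):
        for subset in combinations(range(n), k):
            S = set(subset)
            # Count edges with exactly one endpoint in S.
            cut = sum((u in S) != (v in S) for u, v in edges)
            if cut < beta * k:
                return False
    return True
```

On a 6‑cycle with α = 0.5, every set of at most three vertices has at least two boundary edges, so the condition holds for β = 0.5 but fails for β = 1 (three consecutive vertices have only two boundary edges).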

Beyond the general model, the authors propose a new “planted algebraic expander” model. Here a genuine algebraic expander is taken as a substrate, and a small, well‑structured subgraph (the “plant”) is embedded. The same SDP‑spectral‑bi‑criteria framework applies unchanged, yielding constant‑factor bi‑criteria approximations for all the aforementioned problems in this setting as well. The planted expander model is especially relevant for applications where a strong global connectivity is known (e.g., communication networks) but a hidden community or functional module is to be discovered.

Experimental evaluation on synthetic data and on real‑world graphs (social networks, image segmentation benchmarks) confirms the theoretical claims. The algorithms consistently outperform prior semi‑random and purely random approaches, achieving 2–3× better cut quality while maintaining robustness across a broad range of the model parameters α and β.

In summary, the paper makes three major contributions: (1) a more realistic semi‑random graph model that captures both global expansion and local adversarial perturbations, (2) a unified SDP‑based algorithmic framework that delivers constant‑factor bi‑criteria approximations for a suite of classic partitioning problems, and (3) a new planted algebraic expander model together with matching algorithmic guarantees. The work opens several avenues for future research, including extending the techniques to dynamic graphs, hypergraphs, or to settings where the expansion parameters are unknown and must be estimated from data.