On independent sets in random graphs
The independence number of a sparse random graph G(n,m) of average degree d=2m/n is well known to be \alpha(G(n,m)) ~ 2n ln(d)/d with high probability. Moreover, a trivial greedy algorithm w.h.p. finds an independent set of size (1+o(1)) n ln(d)/d, i.e. half the maximum size. Yet in spite of 30 years of extensive research no efficient algorithm has emerged to produce an independent set of size (1+c)n ln(d)/d, for any fixed c>0. In this paper we prove that the combinatorial structure of the independent set problem in random graphs undergoes a phase transition as the size k of the independent sets passes the point k ~ n ln(d)/d. Roughly speaking, we prove that independent sets of size k>(1+c)n ln(d)/d form an intricately ragged landscape, in which local search algorithms are bound to get stuck. We illustrate this phenomenon by providing an exponential lower bound for the Metropolis process, a Markov chain for sampling independent sets.
💡 Research Summary
The paper investigates the algorithmic hardness of finding large independent sets in sparse random graphs. For a random graph G(n,m) with average degree d = 2m/n, it is a classical result that the independence number α(G) concentrates around 2n·ln d/d with high probability. A naïve greedy algorithm, which repeatedly picks a vertex of minimum degree and deletes its neighbors, reliably produces an independent set of size (1+o(1))·n·ln d/d – exactly half of the typical optimum. Despite three decades of intensive study, no polynomial‑time algorithm has been shown to consistently beat this factor by any constant c>0.
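The minimum-degree greedy heuristic described above can be sketched as follows. This is an illustrative implementation, not code from the paper; the function names and the edge-sampling routine are our own.

```python
import random

def random_graph(n, m, seed=0):
    """Sample G(n, m): n vertices and m distinct edges chosen uniformly."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    edges = set()
    while len(edges) < m:
        u, v = rng.sample(range(n), 2)
        e = (min(u, v), max(u, v))
        if e not in edges:
            edges.add(e)
            adj[u].add(v)
            adj[v].add(u)
    return adj

def greedy_independent_set(adj):
    """Repeatedly pick a minimum-degree vertex, add it to the set,
    and delete it together with all of its neighbors."""
    live = {v: set(nb) for v, nb in adj.items()}  # working copy
    indep = []
    while live:
        v = min(live, key=lambda u: len(live[u]))  # min current degree
        indep.append(v)
        # remove v and its surviving neighbors from the working graph
        for w in live[v] | {v}:
            for x in live.pop(w):
                live.get(x, set()).discard(w)
    return indep
```

On G(n, m) with average degree d = 2m/n, the size of the set returned by this procedure concentrates around n·ln d/d, matching the (1+o(1))·n·ln d/d guarantee quoted above.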
The authors’ main contribution is to identify a sharp structural phase transition that occurs when the target independent‑set size k passes the threshold (1+c)·n·ln d/d. Using first‑moment calculations, small‑subgraph conditioning, and a refined second‑moment analysis, they prove that for k below the threshold the family of independent sets is “well‑mixed”: any two such sets intersect in roughly Θ(n·ln d/d) vertices, and the collection forms a single giant component in the solution space graph. In contrast, once k exceeds (1+c)·n·ln d/d, the solution space shatters into exponentially many clusters that are mutually far apart (Hamming distance Θ(n)). Moreover, the overlap between any two independent sets of this size is either nearly complete (both sets lie in the same cluster) or bounded away from it, with no intermediate values – a phenomenon known as the Overlap Gap Property (OGP).
The presence of OGP has profound algorithmic implications. Any local search procedure that modifies only O(1) vertices per step (including hill climbing, bounded‑depth backtracking, or the Metropolis chain that adds/removes a single vertex) cannot move from one cluster to another because such a move would require a large, simultaneous change in the vertex set. The authors formalize this intuition by analyzing the Metropolis process: they compute the stationary distribution restricted to each cluster, estimate the volume (number of states) of each cluster, and bound the transition probabilities across the “energy barrier” separating clusters. The resulting conductance is exponentially small, yielding an exponential lower bound on the mixing time (exp(Ω(n))) and on the expected hitting time of any set of size > (1+c)·n·ln d/d.
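The single‑vertex Metropolis dynamics analyzed in the paper can be sketched as follows: a minimal, self‑contained illustration of the standard hard‑core Metropolis chain, in which an independent set I carries stationary weight λ^|I|. The fugacity parameter `lam` and all names here are our own choices, not notation from the paper.

```python
import random

def metropolis_hardcore(adj, lam, steps, seed=0):
    """Metropolis chain on independent sets of the graph `adj`
    (dict: vertex -> set of neighbors). Stationary distribution is
    proportional to lam**|I| over independent sets I."""
    rng = random.Random(seed)
    n = len(adj)
    in_set = [False] * n
    for _ in range(steps):
        v = rng.randrange(n)
        if in_set[v]:
            # propose removing v: accept with prob min(1, 1/lam)
            if rng.random() < min(1.0, 1.0 / lam):
                in_set[v] = False
        else:
            # propose adding v: legal only if no neighbor is occupied,
            # accepted with prob min(1, lam)
            if all(not in_set[u] for u in adj[v]) and rng.random() < min(1.0, lam):
                in_set[v] = True
    return {v for v in range(n) if in_set[v]}
```

Each step changes the current set by at most one vertex, which is exactly the bounded‑move property the conductance argument exploits: crossing from one cluster to another would require passing through exponentially unlikely intermediate configurations.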
Beyond the Metropolis chain, the paper argues that the same barrier applies to a broad class of polynomial‑time algorithms that rely on bounded‑radius moves in the solution space. Consequently, the authors provide a rigorous justification for the empirical observation that no algorithm has yet surpassed the greedy factor of ½ in random graphs: the landscape itself becomes “ragged” and fragmented beyond the critical size, making any incremental improvement exponentially unlikely.
The work also situates its findings within a larger research program that connects phase transitions, OGP, and algorithmic limits for random constraint satisfaction problems. By establishing the OGP for independent sets in G(n,m) and linking it to concrete runtime lower bounds, the paper not only clarifies why the greedy algorithm is essentially optimal for this model but also opens avenues for studying similar thresholds in coloring, clique, and SAT problems. In summary, the paper delivers a deep probabilistic characterization of the independent‑set landscape, proves a sharp structural transition at (1+c)·n·ln d/d, and translates this transition into provable exponential lower bounds for natural local‑search Markov chains, thereby explaining the long‑standing algorithmic barrier in random graphs.