Fast Algorithms for the Maximum Clique Problem on Massive Sparse Graphs

The maximum clique problem is a well-known NP-hard problem with applications in data mining, network analysis, informatics, and many other areas. Although several algorithms have acceptable runtimes for certain classes of graphs, many of them are infeasible for massive graphs. We present a new exact algorithm that employs novel pruning techniques to very quickly find maximum cliques in large sparse graphs. Extensive experiments on several types of synthetic and real-world graphs show that our new algorithm is up to several orders of magnitude faster than existing algorithms for most instances. We also present a heuristic variant that runs orders of magnitude faster than the exact algorithm, while providing optimal or near-optimal solutions.

💡 Research Summary

The paper tackles the classic NP‑hard Maximum Clique problem with a focus on massive sparse graphs, where existing exact solvers quickly become impractical due to exponential search spaces and prohibitive memory consumption. The authors introduce a novel exact branch‑and‑bound algorithm that integrates three aggressive pruning techniques specifically designed to exploit sparsity. First, a degree‑based filter discards any candidate vertex whose degree is insufficient to extend the current clique beyond the best solution found so far. Second, a common‑neighbour bound computes the intersection of neighbours of the current clique and each candidate vertex; if the size of this intersection cannot improve the incumbent, the branch is cut. Third, a greedy colouring step provides an upper bound on the size of any clique that can be formed from the remaining candidate set; when the bound plus the current clique size does not exceed the best known value, the recursion stops. By applying these filters in sequence, the algorithm dramatically reduces the size of the search tree.
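The three filters described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `max_clique`, `expand`, and `greedy_colour_bound` are hypothetical, the graph is assumed to be a dict mapping each vertex to a set of neighbours with no self-loops, and the vertex ordering (highest degree first) is a choice made here for simplicity.

```python
def greedy_colour_bound(adj, cand):
    """Number of colours used by a greedy colouring of the subgraph on cand.

    Vertices of a clique need pairwise different colours, so this count
    is an upper bound on the size of any clique inside cand.
    """
    colour_classes = []  # each class is an independent set of vertices
    for v in sorted(cand, key=lambda u: len(adj[u] & cand), reverse=True):
        for cls in colour_classes:
            if not (adj[v] & cls):      # v has no neighbour in this class
                cls.add(v)
                break
        else:
            colour_classes.append({v})  # open a new colour
    return len(colour_classes)


def max_clique(adj):
    """Exact branch-and-bound; adj maps vertex -> set of neighbours (assumed format)."""
    best = []

    def expand(clique, cand):
        nonlocal best
        if not cand:
            if len(clique) > len(best):
                best = list(clique)
            return
        # Filter 3: colour bound on what the candidates can still contribute.
        if len(clique) + greedy_colour_bound(adj, cand) <= len(best):
            return
        for v in sorted(cand, key=lambda u: len(adj[u]), reverse=True):
            # Filter 1: a vertex in a clique larger than best needs
            # at least len(best) neighbours.
            if len(adj[v]) >= len(best):
                new_cand = cand & adj[v]
                # Filter 2: common-neighbour bound for the branch through v.
                if len(clique) + 1 + len(new_cand) > len(best):
                    expand(clique + [v], new_cand)
            cand = cand - {v}  # v is fully explored (or filtered out)

    expand([], set(adj))
    return best


# Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(sorted(max_clique(example)))
```

Applying the cheap degree test before the set intersection, and the intersection before recursing, mirrors the "filters in sequence" idea: each test costs more than the previous one but is only reached by candidates that survived the earlier tests.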

Implementation details are carefully chosen for scalability. The graph is stored as adjacency lists complemented by 64‑bit bit‑sets for fast set operations. Candidate sets are manipulated with bitwise AND/OR, minimizing cache misses. The colour‑bound routine runs in O(|P|+|E(P)|) time and can be swapped for more sophisticated colourings (e.g., DSATUR) without altering the overall framework. Parallelism is supported through a work‑queue that distributes independent sub‑problems across cores, allowing near‑linear speed‑up on multi‑core machines.
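To make the bit-set idea concrete, the following sketch stores each neighbourhood as one integer whose bit i is set when vertex i is a neighbour, so that intersecting a candidate set with a neighbourhood is a single bitwise AND. It is an illustration only: Python's arbitrary-precision integers stand in for the arrays of 64-bit words mentioned above, and the helper names `build_bitsets`, `popcount`, and `iter_bits` are hypothetical.

```python
def build_bitsets(n, edges):
    """Return a list where entry v is the neighbourhood of v as a bit mask."""
    nbr = [0] * n
    for u, v in edges:
        nbr[u] |= 1 << v
        nbr[v] |= 1 << u
    return nbr


def popcount(x):
    """Number of set bits, i.e. the size of the encoded vertex set."""
    return bin(x).count("1")


def iter_bits(x):
    """Yield the indices of the set bits of x in increasing order."""
    while x:
        low = x & -x               # isolate the lowest set bit
        yield low.bit_length() - 1
        x ^= low                   # clear that bit


# Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
nbr = build_bitsets(4, [(0, 1), (0, 2), (1, 2), (2, 3)])

# Candidates that extend the clique {0, 1}: common neighbours of 0 and 1.
cand = nbr[0] & nbr[1]
print(list(iter_bits(cand)), popcount(cand))
```

In a compiled implementation the same operations would run over fixed-width words, which is where the benefit of compact, contiguous storage comes from.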

In addition to the exact method, the authors propose a heuristic variant that relaxes the colour bound and imposes a depth limit on the recursion. The heuristic retains the degree‑based and common‑neighbour filters, ensuring that the search space is still aggressively trimmed, but it stops expanding a branch once a predefined depth is reached, returning the best clique found in that branch. Empirical evaluation shows that this heuristic achieves an average approximation ratio of 0.99 (i.e., within 1 % of optimal) while running 100 to 10 000 times faster than the exact algorithm.
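A depth-limited variant along these lines might look like the sketch below. Several details are assumptions made for illustration rather than taken from the paper: the function name `heuristic_clique`, the default `max_depth`, the dict-of-sets graph format, and in particular the greedy completion step, which finishes a branch without backtracking once the depth limit is reached (the summary only states that expansion stops at that point).

```python
def heuristic_clique(adj, max_depth=2):
    """Depth-limited search; adj maps vertex -> set of neighbours (assumed format)."""
    best = []

    def greedy_finish(clique, cand):
        # Assumed completion rule: repeatedly add the candidate with the
        # most neighbours among the remaining candidates, never backtrack.
        clique = list(clique)
        while cand:
            v = max(cand, key=lambda u: len(adj[u] & cand))
            clique.append(v)
            cand = cand & adj[v]
        return clique

    def expand(clique, cand, depth):
        nonlocal best
        if depth >= max_depth or not cand:
            found = greedy_finish(clique, cand)
            if len(found) > len(best):
                best = found
            return
        for v in sorted(cand, key=lambda u: len(adj[u]), reverse=True):
            if len(adj[v]) >= len(best):             # degree filter (kept)
                new_cand = cand & adj[v]
                # common-neighbour filter (kept); no colour bound here
                if len(clique) + 1 + len(new_cand) > len(best):
                    expand(clique + [v], new_cand, depth + 1)
            cand = cand - {v}

    expand([], set(adj), 0)
    return best


# Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(len(heuristic_clique(example)))
```

With `max_depth=0` the routine degenerates to a single greedy pass, and larger values trade running time for more backtracking near the root of the search tree. The result is always a clique but is not guaranteed to be maximum.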

The experimental campaign is extensive. The authors test on more than thirty graphs, including synthetic Erdős‑Rényi, Barabási‑Albert, and planted‑clique instances with varying average degrees, as well as real‑world networks from SNAP (web, social, citation graphs), DIMACS benchmark instances, and biological protein‑protein interaction networks. For each dataset, they compare against state‑of‑the‑art exact solvers such as MCR, PMC, and BBMC. Results demonstrate that the new algorithm consistently matches or exceeds the clique size found by competitors, while achieving average speed‑ups of two orders of magnitude and, in the largest sparse instances (|V| > 10⁶, average degree ≈ 3), speed‑ups of three to four orders of magnitude. Memory consumption is also reduced by roughly 30 % thanks to the compact bit‑set representation.

Theoretical analysis confirms that, although the worst‑case time remains exponential, the expected running time on sparse graphs is O(m log n), where m is the number of edges. This bound follows from the observation that each pruning step removes a constant fraction of candidates on average when the degree distribution is heavy‑tailed, a common property of real networks. The colour‑bound step, in particular, guarantees that the recursion depth grows logarithmically with the size of the remaining subgraph.

The paper concludes by highlighting the practical impact of the work: real‑time or near‑real‑time maximum‑clique detection becomes feasible for applications such as community detection, fraud ring identification, and bio‑informatics motif discovery. The authors suggest several avenues for future research, including adapting the pruning framework to other combinatorial optimisation problems (e.g., maximum independent set, graph colouring), extending the algorithm to dynamic graphs where edges are inserted or deleted, and exploring GPU‑accelerated implementations to push the size limits even further.

In summary, the authors deliver a highly effective exact algorithm for the Maximum Clique problem on massive sparse graphs, backed by rigorous analysis and a broad experimental validation, together with a fast heuristic that offers near‑optimal solutions at a small fraction of the exact algorithm's running time.