Multi-level algorithms for modularity clustering
Modularity is one of the most widely used quality measures for graph clusterings. Maximizing modularity is NP-hard, and the runtime of exact algorithms is prohibitive for large graphs. A simple and effective class of heuristics coarsens the graph by iteratively merging clusters (starting from singletons), and optionally refines the resulting clustering by iteratively moving individual vertices between clusters. Several heuristics of this type have been proposed in the literature, but little is known about their relative performance. This paper experimentally compares existing and new coarsening- and refinement-based heuristics with respect to their effectiveness (achieved modularity) and efficiency (runtime). Concerning coarsening, it turns out that the most widely used criterion for merging clusters (modularity increase) is outperformed by other simple criteria, and that a recent algorithm by Schuetz and Caflisch is no improvement over simple greedy coarsening for these criteria. Concerning refinement, a new multi-level algorithm is shown to produce significantly better clusterings than conventional single-level algorithms. A comparison with published benchmark results and algorithm implementations shows that combinations of coarsening and multi-level refinement are competitive with the best algorithms in the literature.
💡 Research Summary
The paper investigates practical heuristics for maximizing modularity, a widely used quality measure for graph clustering, acknowledging that exact optimization is NP‑hard and infeasible for large networks. The authors focus on the two‑stage paradigm that first coarsens the graph by merging clusters (starting from singletons) and then refines the resulting partition by moving individual vertices. While many algorithms in the literature adopt this framework, systematic comparisons of the various merging criteria and refinement strategies have been lacking.
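Since everything in the paper revolves around modularity, it may help to recall how it is computed: for each cluster, the fraction of edges inside it minus the squared fraction of degree it holds. A minimal sketch (the function name and input representation are this summary's own choices, not the authors' code):

```python
from collections import defaultdict

def modularity(edges, cluster):
    """Compute Newman's modularity Q for an undirected graph.

    edges:   iterable of (u, v) pairs (self-loops not handled)
    cluster: dict mapping each vertex to its cluster id
    """
    m = 0                           # total number of edges
    intra = defaultdict(int)        # edges inside each cluster
    degree_sum = defaultdict(int)   # total degree per cluster
    for u, v in edges:
        m += 1
        degree_sum[cluster[u]] += 1
        degree_sum[cluster[v]] += 1
        if cluster[u] == cluster[v]:
            intra[cluster[u]] += 1
    # Q = sum over clusters of (internal edge fraction - expected fraction)
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)
```

For example, two triangles joined by a single bridge edge, clustered as the two triangles, score Q = 5/14 ≈ 0.357, while merging everything into one cluster gives Q = 0.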
In the coarsening phase, the most common criterion is to select the pair of clusters whose merger yields the greatest increase in modularity. The authors evaluate this “modularity‑gain” rule against two much simpler alternatives: (1) the edge‑count criterion, which merges the pair with the largest number of inter‑cluster edges, and (2) the density‑ratio criterion, which computes the ratio of internal to external edge density for each candidate pair and prefers the highest ratio. Both alternatives can be evaluated in constant time per candidate and preserve the overall O(m log n) complexity of greedy coarsening.
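With e_ij denoting the number of edges between clusters i and j and deg_i the total degree inside cluster i, the three criteria might be scored as below. This is an illustrative sketch only; in particular, the density‑ratio formula used here (observed inter‑cluster edges over the count expected under the degree‑based null model) is an assumed concrete form, not necessarily the paper's exact definition:

```python
def merge_scores(e_ij, deg_i, deg_j, m):
    """Score a candidate merge of clusters i and j under three criteria.

    e_ij  : number of edges between clusters i and j
    deg_i : total degree of vertices in cluster i (deg_j likewise)
    m     : total number of edges in the graph
    """
    # Modularity gain: the increase in Q if i and j are merged.
    delta_q = e_ij / m - (deg_i * deg_j) / (2 * m * m)
    # Edge count: simply the number of inter-cluster edges.
    edge_count = e_ij
    # Density ratio (assumed form): observed inter-cluster edges
    # relative to the number expected under the random null model
    # with the same degrees.
    density_ratio = (2 * m * e_ij) / (deg_i * deg_j)
    return delta_q, edge_count, density_ratio
```

Only `delta_q` requires the global quantities; the coarsening loop picks the candidate pair maximizing whichever score is in use.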
A recent algorithm by Schuetz and Caflisch introduces a weighted version of the modularity‑gain rule, hoping to improve the ordering of merges. Empirical results on synthetic LFR benchmarks and a suite of real‑world networks (social, citation, web graphs) show that this weighted approach does not outperform the simple greedy strategies; in fact, it often yields lower final modularity while incurring higher computational overhead.
For refinement, the classic approach is a single‑level vertex‑move loop: repeatedly consider moving each vertex to a neighboring cluster if the move improves modularity, stopping when no improvement is possible. This method can become trapped in local optima, especially when the initial coarsening has produced a highly distorted partition. The authors propose a multi‑level refinement scheme that leverages the hierarchy built during coarsening. Starting from the coarsest level, they perform vertex moves, then project the improved partition to the next finer level and repeat the process down to the original graph. This hierarchical refinement enables large‑scale restructuring at coarse levels and fine‑grained adjustments at lower levels, dramatically expanding the search space.
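The control flow of the hierarchical scheme described above can be sketched as follows. This is a simplified illustration, not the authors' implementation; the hierarchy representation (a list of merge maps, coarsest first) and the `refine_level` callback are hypothetical:

```python
def multilevel_refine(hierarchy, refine_level):
    """Refine a clustering down a coarsening hierarchy.

    hierarchy:    list of merge maps, coarsest first; hierarchy[k] maps
                  each vertex of the next finer level to the coarse
                  vertex it was merged into. (Hypothetical format.)
    refine_level: callable performing local vertex moves on one level;
                  takes and returns a {vertex: cluster} dict.
    """
    # Start at the coarsest level: every coarse vertex is its own cluster.
    coarse_vertices = set(hierarchy[0].values())
    clustering = {v: v for v in coarse_vertices}
    clustering = refine_level(clustering)
    for merge_map in hierarchy:
        # Project: each finer vertex inherits its coarse vertex's cluster.
        clustering = {v: clustering[c] for v, c in merge_map.items()}
        # Refine at this finer level before descending further.
        clustering = refine_level(clustering)
    return clustering
```

Coarse levels let whole groups of vertices change cluster in one move, which is exactly the restructuring a single‑level vertex‑move loop cannot perform.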
The experimental protocol includes: (i) synthetic graphs generated with varying community size distributions, mixing parameters, and average degrees; (ii) real‑world datasets ranging from a few thousand to several million vertices; (iii) evaluation metrics of final modularity, runtime, and memory consumption; and (iv) comparison against state‑of‑the‑art methods such as Louvain, Leiden, and Infomap.
Key findings are:
- Coarsening criteria – Both edge‑count and density‑ratio criteria consistently achieve higher modularity than the modularity‑gain rule, with improvements of roughly 1.2 %–2.5 % on average. They also run 10 %–30 % faster because they avoid the costly modularity‑gain computation. The Schuetz‑Caflisch weighted rule does not provide a measurable benefit.
- Refinement strategies – Multi‑level refinement outperforms single‑level vertex moves by 3 %–5 % in modularity, with the gap widening on larger graphs (exceeding 5 % for graphs with more than 10⁶ vertices). The hierarchical approach reduces the likelihood of getting stuck in poor local optima and yields more stable partitions across multiple runs.
- Combined approach – The best overall configuration—density‑ratio coarsening followed by multi‑level refinement—matches or slightly exceeds the modularity scores of the best published algorithms (Louvain, Leiden) while using comparable or less memory and only modestly higher runtime (≈1.2×).
The authors conclude that (1) simple, computationally cheap merging criteria are preferable to sophisticated modularity‑gain calculations for large‑scale clustering, and (2) exploiting the coarsening hierarchy during refinement provides a powerful mechanism for escaping local optima and achieving higher‑quality partitions. They suggest future work on dynamic graphs, parallel implementations, and extending the analysis to other quality measures such as NMI or ARI.