Community detection and graph partitioning
Many methods have been proposed for community detection in networks. Some of the most promising are methods based on statistical inference, which rest on solid mathematical foundations and return excellent results in practice. In this paper we show that two of the most widely used inference methods can be mapped directly onto versions of the standard minimum-cut graph partitioning problem, which allows us to apply any of the many well-understood partitioning algorithms to the solution of community detection problems. We illustrate the approach by adapting the Laplacian spectral partitioning method to perform community inference, testing the resulting algorithm on a range of examples, including computer-generated and real-world networks. Both the quality of the results and the running time rival the best previous methods.
💡 Research Summary
The paper establishes a direct correspondence between two widely used statistical inference methods for community detection—namely the standard stochastic block model (SBM) and its degree‑corrected variant—and the classic minimum‑cut graph partitioning problem. By showing that maximizing the likelihood of a given block‑model assignment is equivalent to minimizing a cut size with an additional penalty term that favors balanced groups, the authors reduce the otherwise combinatorial community‑detection task to a well‑studied graph‑partitioning problem.
For the standard SBM, with the edge‑probability parameters ω_in and ω_out treated as given, the log‑likelihood reduces (up to an additive constant and a positive scale factor) to L = − m_out + γ n₁n₂, where m_out is the number of edges running between the two groups, n₁ and n₂ are the group sizes, and γ > 0 depends on the ω’s. For fixed group sizes the term γ n₁n₂ is constant, so maximizing L amounts to minimizing the cut size m_out. Because γ is unknown in practice, the authors fix n₁ at each possible value (0 ≤ n₁ ≤ n) and solve a pure minimum‑cut problem for each, which yields n + 1 candidate partitions. They then evaluate the full profile likelihood Q (the likelihood after re‑optimizing the ω’s) for each candidate and select the partition with the highest Q.
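The reduction to a cut size plus a balance term can be reconstructed in a few lines. The derivation below is a sketch that assumes the Poisson formulation of the block model with ω_in > ω_out and ignores self‑edges; the paper's own normalization of the ω's may differ, which would change γ by a constant factor but not the form of the result. It uses m = m_in + m_out for the total number of edges.

```latex
\begin{align}
\log P(A \mid \omega, g)
  &= \sum_{i<j} \bigl[ A_{ij} \log \omega_{g_i g_j} - \omega_{g_i g_j} \bigr] + \text{const} \\
  &= m_\text{in} \log \omega_\text{in} + m_\text{out} \log \omega_\text{out}
     - \omega_\text{in} \Bigl[ \tbinom{n}{2} - n_1 n_2 \Bigr]
     - \omega_\text{out}\, n_1 n_2 + \text{const} \\
  &= -\, m_\text{out} \log \frac{\omega_\text{in}}{\omega_\text{out}}
     + (\omega_\text{in} - \omega_\text{out})\, n_1 n_2 + \text{const} .
\end{align}
% Dividing by \log(\omega_\text{in}/\omega_\text{out}) > 0 and dropping constants:
\begin{equation}
L = -\, m_\text{out} + \gamma\, n_1 n_2 ,
\qquad
\gamma = \frac{\omega_\text{in} - \omega_\text{out}}
              {\log \omega_\text{in} - \log \omega_\text{out}} > 0 .
\end{equation}
```

The second line uses the fact that the number of within‑group vertex pairs is n(n − 1)/2 − n₁n₂, and the third substitutes m_in = m − m_out and absorbs terms that do not depend on the partition into the constant.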
The minimum‑cut subproblem is tackled with a spectral algorithm based on the graph Laplacian L = D − A, where D is the diagonal degree matrix and A is the adjacency matrix. The Fiedler vector, i.e. the eigenvector belonging to the second‑smallest eigenvalue of the Laplacian, is computed once; sorting its components provides a natural ordering of the vertices. Cutting the sorted list at each possible position generates the entire family of n + 1 candidate partitions efficiently. The authors emphasize that this approach gives a good approximation to the global optimum while requiring only a single eigenvector computation, which for sparse graphs runs in near‑linear time.
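The sort‑and‑cut procedure can be sketched in a few lines of Python with NumPy. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the graph is assumed connected and given as a dense symmetric 0/1 adjacency matrix, and the score is the profile log‑likelihood of a Poisson two‑group block model with partition‑independent constants dropped.

```python
import numpy as np

def fiedler_order(A):
    """Vertex indices sorted by the Fiedler vector of L = D - A."""
    laplacian = np.diag(A.sum(axis=1)) - A
    # eigh returns eigenvalues in ascending order, so column 1 belongs to
    # the second-smallest eigenvalue. Dense solver: for demonstration only.
    _, vectors = np.linalg.eigh(laplacian)
    return np.argsort(vectors[:, 1])

def profile_log_likelihood(A, in_group1):
    """Assumed score: Poisson SBM log-likelihood at the best-fit omegas."""
    n = len(in_group1)
    n1 = int(in_group1.sum())
    m = A.sum() / 2                                   # total edges
    m_out = A[np.ix_(in_group1, ~in_group1)].sum()    # edges across the cut
    m_in = m - m_out
    pairs_out = n1 * (n - n1)
    pairs_in = n * (n - 1) // 2 - pairs_out

    def term(edges, pairs):
        # Convention 0 * log(0) = 0; edges > 0 implies pairs > 0.
        return edges * np.log(edges / pairs) if edges > 0 else 0.0

    return term(m_in, pairs_in) + term(m_out, pairs_out)

def best_sweep_cut(A):
    """Try all n + 1 cuts of the Fiedler ordering, keep the best-scoring one."""
    order = fiedler_order(A)
    n = len(order)
    best_q, best_split = -np.inf, None
    for n1 in range(n + 1):
        in_group1 = np.zeros(n, dtype=bool)
        in_group1[order[:n1]] = True
        q = profile_log_likelihood(A, in_group1)
        if q > best_q:
            best_q, best_split = q, in_group1
    return best_split, best_q

# Toy example: two 4-cliques joined by a single edge between vertices 3 and 4.
A = np.zeros((8, 8))
A[:4, :4] = 1
A[4:, 4:] = 1
np.fill_diagonal(A, 0)
A[3, 4] = A[4, 3] = 1
split, q = best_sweep_cut(A)
print(split.astype(int), q)
```

For large sparse networks one would replace the dense `np.linalg.eigh` call with a sparse iterative eigensolver and update m_out incrementally as each vertex crosses the cut, rather than recomputing it from scratch as this sketch does.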
The degree‑corrected SBM extends the model by scaling the expected number of edges between vertices i and j by the product of their degrees, k_i k_j. The log‑likelihood becomes L = − m_out + γ κ₁κ₂, where κ₁ and κ₂ are the sums of degrees in the two groups. The same fixing trick applies, but the balance term now involves κ₁κ₂ rather than n₁n₂, so the degree sums take over the role of the group sizes. The corresponding spectral relaxation uses the generalized eigenproblem L v = λ D v, where L = D − A is again the graph Laplacian and D is the diagonal degree matrix. Solving it yields a vector analogous to the Fiedler vector, and the same sorting‑and‑cut procedure provides the candidate partitions.
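The generalized eigenproblem can be solved with an ordinary symmetric eigensolver through a standard change of variables: if u is an eigenvector of D^(−1/2) L D^(−1/2), then v = D^(−1/2) u satisfies L v = λ D v. The sketch below uses that identity. The function name is hypothetical, and the code assumes a dense symmetric adjacency matrix in which every vertex has at least one edge.

```python
import numpy as np

def degree_corrected_order(A):
    """Order vertices by the second generalized eigenvector of L v = lambda D v."""
    k = A.sum(axis=1)
    if np.any(k == 0):
        raise ValueError("every vertex needs degree > 0 for D to be invertible")
    d_inv_sqrt = 1.0 / np.sqrt(k)
    laplacian = np.diag(k) - A
    # Symmetric normalized Laplacian D^(-1/2) L D^(-1/2).
    normalized = d_inv_sqrt[:, None] * laplacian * d_inv_sqrt[None, :]
    values, vectors = np.linalg.eigh(normalized)   # ascending eigenvalues
    lam = values[1]
    v = d_inv_sqrt * vectors[:, 1]                 # map back: v = D^(-1/2) u
    return np.argsort(v), v, lam

# Toy example: two 4-cliques joined by a single edge between vertices 3 and 4.
A = np.zeros((8, 8))
A[:4, :4] = 1
A[4:, 4:] = 1
np.fill_diagonal(A, 0)
A[3, 4] = A[4, 3] = 1
order, v, lam = degree_corrected_order(A)

# Degree sums kappa_1, kappa_2 for the cut after the first four vertices.
k = A.sum(axis=1)
kappa1 = k[order[:4]].sum()
kappa2 = k.sum() - kappa1
print(order, kappa1, kappa2)
```

The returned ordering can then be swept exactly as in the uncorrected case, with candidate cuts scored by a degree‑corrected likelihood instead.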
Empirical evaluation includes synthetic networks generated by the standard SBM (10 000 vertices, varying intra‑ and inter‑community edge densities) and two well‑known real‑world datasets: Zachary’s karate club and a political‑blog network. For the synthetic data, the profile likelihood curves display clear peaks at the planted community sizes, and the fraction of correctly classified vertices remains near 1 across a wide range of parameters, dropping only as the parameters approach the theoretical detectability threshold identified in prior work. The real‑world examples produce the expected bipartitions, matching previously reported community assignments.
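The synthetic test can be mimicked at small scale with the standard library alone. The sketch below is an assumption‑laden stand‑in for the paper's setup: it uses two equal‑sized groups, independent Bernoulli edges with hypothetical parameters `p_in` and `p_out`, and scores an inferred bipartition by the fraction of vertices that match the planted one after allowing for a swap of the two labels.

```python
import random

def planted_two_groups(n, p_in, p_out, seed=0):
    """Sample an undirected two-group block-model graph as an edge list."""
    rng = random.Random(seed)
    groups = [0 if i < n // 2 else 1 for i in range(n)]
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            p = p_in if groups[i] == groups[j] else p_out
            if rng.random() < p:
                edges.append((i, j))
    return groups, edges

def fraction_correct(inferred, planted):
    """Fraction of matching labels, maximized over swapping labels 0 and 1."""
    agree = sum(a == b for a, b in zip(inferred, planted))
    return max(agree, len(planted) - agree) / len(planted)

groups, edges = planted_two_groups(200, p_in=0.10, p_out=0.01)
print(len(edges), fraction_correct(groups, groups))
```

The quadratic loop over vertex pairs is fine for a few thousand vertices; networks as large as those in the paper would call for a sparse sampling scheme.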
Overall, the paper demonstrates that maximum‑likelihood community detection can be reframed as a minimum‑cut problem plus a simple one‑dimensional search over group sizes. This insight allows practitioners to leverage the extensive toolbox of graph‑partitioning algorithms—particularly fast spectral methods—while retaining the statistical rigor of SBM inference. The approach achieves competitive accuracy and speed compared with state‑of‑the‑art inference techniques, and it naturally extends to degree‑corrected models. Future directions suggested include handling more than two communities, incorporating edge weights, and applying multilevel partitioning heuristics to further improve scalability and solution quality.