Overlapping Community Detection in Networks: the State of the Art and Comparative Study
This paper reviews the state of the art in overlapping community detection algorithms, quality measures, and benchmarks. A thorough comparison of different algorithms (a total of fourteen) is provided. In addition to community-level evaluation, we propose a framework for evaluating algorithms' ability to detect overlapping nodes, which helps to assess over-detection and under-detection. After considering community-level detection performance, measured by Normalized Mutual Information and the Omega index, and node-level detection performance, measured by F-score, we reached the following conclusions. For low overlapping density networks, SLPA, OSLOM, Game and COPRA offer better performance than the other tested algorithms. For networks with high overlapping density and high overlapping diversity, both SLPA and Game provide relatively stable performance. However, test results also suggest that detection in such networks is not yet fully resolved. A common feature observed by various algorithms in real-world networks is the relatively small fraction of overlapping nodes (typically less than 30%), each of which belongs to only 2 or 3 communities.
Research Summary
The paper provides a comprehensive survey and empirical comparison of overlapping community detection methods. Fourteen representative algorithms are grouped into five methodological families: clique‑percolation, line‑graph/link‑partitioning, local‑expansion/optimization, label‑propagation, and statistical/game‑theoretic approaches. For each family the authors describe the underlying principles, computational complexities, and typical parameter sensitivities.
A novel contribution is a node‑level evaluation framework that complements traditional community‑level metrics (Normalized Mutual Information and Omega index) with an F‑score based assessment of how accurately algorithms identify overlapping nodes. This allows the detection of both over‑detection (assigning too many memberships) and under‑detection (missing true overlaps).
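The node-level evaluation can be illustrated with a minimal Python sketch. It assumes a cover is given as a list of node sets and treats "overlapping node" detection as a binary classification problem; the function names `overlapping_nodes` and `overlap_f_score` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def overlapping_nodes(cover):
    """Return the set of nodes that belong to two or more communities."""
    counts = Counter(node for community in cover for node in set(community))
    return {node for node, k in counts.items() if k >= 2}


def overlap_f_score(true_cover, detected_cover):
    """Precision, recall, and F-score for identifying overlapping nodes.

    Low precision points to over-detection (too many nodes flagged as
    overlapping); low recall points to under-detection (true overlaps missed).
    """
    true_ov = overlapping_nodes(true_cover)
    found_ov = overlapping_nodes(detected_cover)
    hits = len(true_ov & found_ov)
    precision = hits / len(found_ov) if found_ov else 0.0
    recall = hits / len(true_ov) if true_ov else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy example: the true overlapping nodes are {4, 6}; the detected ones are {3, 4}.
true_cover = [{1, 2, 3, 4}, {4, 5, 6}, {6, 7, 8}]
detected_cover = [{1, 2, 3, 4}, {3, 4, 5, 6}, {7, 8}]
print(overlap_f_score(true_cover, detected_cover))
```

In the toy example, node 3 is flagged incorrectly (over-detection) and node 6 is missed (under-detection), so precision and recall both drop to one half.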
Experiments are conducted on synthetic LFR benchmarks with varying overlapping density (10%–50%) and diversity (2–5 memberships per node), as well as on several real‑world networks (social, citation, and protein‑protein interaction graphs). The results show that for low overlapping density, SLPA, OSLOM, Game and COPRA achieve the highest NMI/Omega scores, while for high density and high diversity SLPA and Game maintain relatively stable F‑scores. Nevertheless, performance degrades noticeably in the most challenging settings, indicating that overlap detection in dense, highly overlapping networks remains an open problem.
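For readers unfamiliar with the Omega index used in the community-level scores, the following Python sketch shows one way to compute it. It follows the standard definition (observed agreement on how many communities each node pair shares, corrected for the agreement expected by chance); the function names and the convention of returning 1.0 when the expected agreement is already 1 are assumptions made for this illustration, not details from the paper.

```python
from collections import Counter
from itertools import combinations


def pair_counts(cover):
    """Count, for every node pair, how many communities contain both nodes."""
    counts = Counter()
    for community in cover:
        for pair in combinations(sorted(set(community)), 2):
            counts[pair] += 1
    return counts  # pairs that never co-occur are implicitly 0


def omega_index(cover_a, cover_b, nodes):
    """Omega index between two covers defined over the same node set."""
    nodes = sorted(nodes)
    total_pairs = len(nodes) * (len(nodes) - 1) // 2
    a, b = pair_counts(cover_a), pair_counts(cover_b)
    agree = 0
    hist_a, hist_b = Counter(), Counter()
    for pair in combinations(nodes, 2):
        ja, jb = a[pair], b[pair]
        hist_a[ja] += 1
        hist_b[jb] += 1
        if ja == jb:
            agree += 1  # both covers place this pair together equally often
    observed = agree / total_pairs
    expected = sum(hist_a[j] * hist_b[j] for j in hist_a) / total_pairs ** 2
    if expected == 1.0:
        return 1.0  # assumed convention: degenerate case, covers trivially agree
    return (observed - expected) / (1.0 - expected)


nodes = range(4)
two_groups = [{0, 1}, {2, 3}]
one_group = [{0, 1, 2, 3}]
print(omega_index(two_groups, two_groups, nodes))  # identical covers
print(omega_index(two_groups, one_group, nodes))   # agreement at chance level
```

The sketch enumerates all node pairs, so it is quadratic in the number of nodes and is meant for small illustrative graphs rather than large benchmarks.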
Analysis of real networks reveals a consistent pattern: overlapping nodes typically constitute less than 30% of the vertices, and most overlapping nodes belong to only two or three communities. This empirical observation supports the design assumption of many algorithms that overlaps are sparse.
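The two quantities in this observation (the fraction of overlapping nodes and the number of communities each one belongs to) can be read off a detected cover directly. The sketch below is a hypothetical helper, assuming the cover is a list of node sets and that the total node count is known; the toy data is invented for illustration only.

```python
from collections import Counter


def overlap_statistics(cover, num_nodes):
    """Return (fraction of overlapping nodes, histogram of membership counts).

    The histogram maps "number of communities" to "number of overlapping
    nodes with that many memberships".
    """
    memberships = Counter(node for community in cover for node in set(community))
    overlapping = {node: k for node, k in memberships.items() if k >= 2}
    fraction = len(overlapping) / num_nodes
    histogram = dict(Counter(overlapping.values()))
    return fraction, histogram


# Toy cover on a 10-node graph: node 3 is in three communities, node 5 in two.
toy_cover = [{0, 1, 2, 3}, {3, 4, 5}, {3, 5, 6, 7}]
fraction, histogram = overlap_statistics(toy_cover, num_nodes=10)
print(fraction, histogram)
```

Nodes that appear in no community (here nodes 8 and 9) still count toward the denominator, which is why `num_nodes` is passed explicitly.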
The authors conclude by highlighting current limitations (high computational cost for clique‑based methods, parameter dependence for local‑expansion techniques, and scalability issues for line‑graph approaches), and they call for future work on scalable algorithms, dynamic‑network extensions, and more robust benchmark standards.