Evaluating Local Community Methods in Networks
We present a new benchmarking procedure that is unambiguous and specific to local community-finding methods, allowing one to compare the accuracy of various methods. We apply this to new and existing algorithms. A simple class of synthetic benchmark networks is also developed, capable of testing properties specific to these local methods.
💡 Research Summary
The paper addresses a long‑standing gap in network science: the lack of a rigorous, reproducible benchmark specifically designed for local community‑finding algorithms. While many studies evaluate global community detection methods using standard metrics such as modularity, precision, and recall, local methods—those that start from a seed node and expand a subgraph until a stopping criterion is met—have been assessed in an ad‑hoc manner, often conflating algorithmic differences with experimental settings. To remedy this, the authors propose a two‑stage benchmarking framework and a synthetic network generator that together enable fair, quantitative comparison of any local algorithm.
In the first stage, the authors formalize the typical local community discovery pipeline into three deterministic steps: (1) seed selection, (2) expansion rule, and (3) termination condition. By fixing the random seed, the maximum expansion depth, and the conductance threshold across all experiments, they eliminate sources of variance that previously made cross‑method comparisons unreliable. The second stage introduces a parametrized stochastic block model (SBM) that allows independent control of intra‑community edge probability (α) and inter‑community edge probability (β). This flexibility creates a spectrum of test graphs ranging from clearly separated communities (high α, low β) to ambiguous, fuzzy boundaries (α≈β). Moreover, the authors deliberately place seed nodes either at the geometric center of a community or on its boundary, thereby isolating the effect of seed location on algorithmic performance.
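The two-parameter block model described above can be sketched in a few lines. This is a minimal illustration, not the authors' generator: the function name `planted_partition`, the `block_sizes` argument, and the adjacency-dict representation are assumptions made for this example, and the seed-placement step (center versus boundary) is not modeled.

```python
import random
from itertools import combinations


def planted_partition(block_sizes, alpha, beta, seed=0):
    """Sample an undirected graph with planted communities.

    alpha: probability of an edge between two nodes in the same block.
    beta:  probability of an edge between nodes in different blocks.
    Returns (adj, block): adj maps node -> set of neighbours, and
    block[v] is the index of the community containing node v.
    (Hypothetical helper, written for illustration only.)
    """
    rng = random.Random(seed)  # fixed seed keeps every run reproducible
    block = [b for b, size in enumerate(block_sizes) for _ in range(size)]
    adj = {v: set() for v in range(len(block))}
    for u, v in combinations(range(len(block)), 2):
        p = alpha if block[u] == block[v] else beta
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj, block


# Two communities of 20 nodes each, fairly well separated.
adj, block = planted_partition([20, 20], alpha=0.5, beta=0.05, seed=42)
```

Raising `beta` toward `alpha` blurs the community boundary, which is the regime where local methods are expected to struggle.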
Beyond the classic precision/recall pair, the paper defines two novel evaluation metrics tailored to local methods: (i) Expansion Efficiency, the ratio of truly internal nodes added to the total number of nodes incorporated during expansion, and (ii) Boundary Accuracy, the proportion of external nodes mistakenly included when the algorithm stops (as defined, this is an error rate, so lower values are better). These metrics capture two essential qualities of a local method: its ability to grow quickly without over‑reaching, and its capacity to recognize when the community boundary has been reached, especially in graphs where the boundary is noisy.
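One possible reading of these metrics is sketched below. The summary does not give exact formulas, so the denominators are assumptions: Expansion Efficiency is computed over the nodes added after the seed, and Boundary Accuracy is computed as the share of the returned set that lies outside the ground-truth community. The function name `local_metrics` and its arguments are hypothetical.

```python
def local_metrics(found, truth, seed):
    """Score one local-community result against a ground-truth community.

    found: nodes returned by the local method (including the seed).
    truth: nodes of the ground-truth community containing the seed.
    The denominators below are assumptions, not the paper's definitions.
    """
    found, truth = set(found), set(truth)
    added = found - {seed}   # nodes incorporated during expansion
    hits = found & truth     # correctly recovered nodes
    return {
        "precision": len(hits) / len(found) if found else 0.0,
        "recall": len(hits) / len(truth) if truth else 0.0,
        # internal nodes added / all nodes added during expansion
        "expansion_efficiency": len(added & truth) / len(added) if added else 0.0,
        # external nodes mistakenly included / size of the returned set
        "boundary_accuracy": len(found - truth) / len(found) if found else 0.0,
    }


# Seed 0 expands to four more nodes, one of which (7) is external.
scores = local_metrics(found={0, 1, 2, 3, 7}, truth={0, 1, 2, 3, 4}, seed=0)
```

In this toy case three of the four added nodes are internal, so the expansion efficiency is 0.75, and one of the five returned nodes is external, so the boundary measure is 0.2.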
The experimental suite compares three representative algorithms: Local Spectral Clustering (LSC), which greedily minimizes conductance using eigenvectors; Personalized PageRank (PPR) based expansion, which ranks nodes by a random‑walk bias from the seed; and a newly proposed Adaptive Conductance Greedy (ACG) that dynamically adjusts the conductance reduction threshold during growth. All three are run under identical seed‑selection policies, depth limits, and termination thresholds on a collection of synthetic graphs spanning α∈