Detectability threshold in weighted modular networks
We study the necessary condition to detect, by means of spectral modularity optimization, the ground-truth partition in networks generated according to the weighted planted-partition model with two equally sized communities. We analytically derive a general expression for the maximum level of mixing tolerated by the algorithm to retrieve community structure, showing that the value of this detectability threshold depends on the first two moments of the distributions of node degree and edge weight. We focus on the standard case of Poisson-distributed node degrees and compare the detectability thresholds of five edge-weight distributions: Dirac, Poisson, exponential, geometric, and signed Bernoulli. We show that Dirac-distributed weights yield the smallest detectability threshold, while exponentially distributed weights increase the threshold by a factor of $\sqrt{2}$, with other distributions exhibiting distinct behaviors that depend on the average degree, the average weight, or both. Our results indicate that larger variability in edge weights can make communities less detectable. In cases where edge weights carry no information about community structure, incorporating weights in community detection is detrimental.
💡 Research Summary
The paper investigates the fundamental limits of community detection in weighted networks using spectral modularity optimization. The authors consider a weighted planted‑partition model (WPPM) with two equally sized groups. Edges are first drawn according to Bernoulli trials with probabilities p_in (within the same group) and p_out (between groups). Then each existing edge receives a weight drawn independently from a distribution W_in for intra‑group edges and W_out for inter‑group edges. Node degree k(i) and strength s(i) are defined as the sum of adjacency variables and the sum of weighted adjacency variables, respectively.
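The generative procedure described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `sample_wppm` and the weight-sampler callables are hypothetical, and degrees are computed from the adjacency matrix rather than from nonzero weights so that zero-valued weights (possible for Poisson weights) are handled.

```python
import numpy as np

def sample_wppm(n, p_in, p_out, draw_w_in, draw_w_out, rng):
    """Sample a two-community weighted planted-partition network.

    n must be even. draw_w_in / draw_w_out are hypothetical callables
    (rng, size) -> array of weights for intra / inter-group edges.
    Returns (adjacency, weights, labels).
    """
    labels = np.repeat([0, 1], n // 2)            # two equally sized groups
    iu, ju = np.triu_indices(n, k=1)              # each unordered pair once
    same_pair = labels[iu] == labels[ju]
    prob = np.where(same_pair, p_in, p_out)
    present = rng.random(iu.size) < prob          # Bernoulli edge trials
    w = np.where(same_pair,
                 draw_w_in(rng, iu.size),
                 draw_w_out(rng, iu.size))        # weight depends on edge type
    adjacency = np.zeros((n, n))
    weights = np.zeros((n, n))
    adjacency[iu, ju] = present
    weights[iu, ju] = present * w
    adjacency += adjacency.T                      # undirected: symmetrize
    weights += weights.T
    return adjacency, weights, labels

rng = np.random.default_rng(0)
exp_w = lambda rng, size: rng.exponential(1.0, size)   # same law in and out
A, W_mat, labels = sample_wppm(200, 0.10, 0.02, exp_w, exp_w, rng)
degree = A.sum(axis=1)        # k(i): sum of adjacency variables
strength = W_mat.sum(axis=1)  # s(i): sum of weighted adjacency variables
```

Passing the same sampler for both edge types, as in the usage lines, corresponds to the case where weights carry no information about the communities.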
The modularity matrix is defined as q_ij = a_ij w_ij – s(i)s(j)/(2m), where 2m = Σ_r s(r). Spectral modularity maximization looks at the leading eigenpair (λ, v) of this matrix. The authors introduce an order parameter P, the normalized sum of the components of the leading eigenvector belonging to each community. When P > 0 the algorithm can recover the planted partition; P ≈ 0 indicates that the communities are undetectable.
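A small sketch of the spectral step may help make this concrete. The modularity matrix follows the definition above; the exact normalization of the order parameter is not given in this summary, so the form used in `order_parameter` (absolute difference of the eigenvector sums over the two groups, divided by √n) is an assumption chosen so that a unit-norm eigenvector perfectly aligned with the partition gives P = 1.

```python
import numpy as np

def spectral_bipartition(weights):
    """Leading eigenpair of the weighted modularity matrix and its sign split."""
    s = weights.sum(axis=1)                  # node strengths s(i)
    two_m = s.sum()                          # 2m = sum_r s(r)
    q = weights - np.outer(s, s) / two_m     # q_ij = a_ij w_ij - s(i)s(j)/(2m)
    eigvals, eigvecs = np.linalg.eigh(q)     # symmetric matrix, ascending order
    lam, v = eigvals[-1], eigvecs[:, -1]     # leading eigenpair
    return lam, v, (v > 0).astype(int)

def order_parameter(v, labels):
    """Assumed form of P: |sum_{i in A} v_i - sum_{i in B} v_i| / sqrt(n)."""
    signed = np.where(labels == 0, 1.0, -1.0)
    return abs(v @ signed) / np.sqrt(len(v))

# Toy check: two disconnected 4-cliques with unit weights.
block = np.ones((4, 4)) - np.eye(4)
zeros = np.zeros((4, 4))
weights = np.block([[block, zeros], [zeros, block]])
labels = np.repeat([0, 1], 4)
lam, v, split = spectral_bipartition(weights)
P = order_parameter(v, labels)
```

In this toy case the leading eigenvector is constant on each clique with opposite signs, so the sign split reproduces the two cliques.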
A key analytical step is the approximation λ ≈ ⟨Δs²⟩/⟨Δs⟩, where Δs = s_in – s_out is the difference between a node’s intra‑ and inter‑community strength. By expressing the first two moments of Δs in terms of the moments of the degree and weight distributions, the authors obtain closed‑form expressions for ⟨Δs⟩ and ⟨Δs²⟩. For Poisson‑distributed degrees (large N limit) the moments simplify to ⟨s_in⟩ = ⟨w_in⟩⟨k_in⟩ and ⟨s_in²⟩ = ⟨w_in²⟩⟨k_in⟩ + ⟨w_in⟩²⟨k_in⟩², with analogous formulas for the out‑community quantities.
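The compound-sum moments quoted above can be checked with a quick Monte Carlo experiment. The sketch below assumes exponential weights and arbitrary illustrative parameter values; it draws a Poisson number of weights per node, sums them into strengths, and compares the empirical moments with ⟨w⟩⟨k⟩ and ⟨w²⟩⟨k⟩ + ⟨w⟩²⟨k⟩².

```python
import numpy as np

rng = np.random.default_rng(1)
mean_k, mean_w, n_nodes = 6.0, 1.5, 200_000    # illustrative values only

k = rng.poisson(mean_k, n_nodes)               # Poisson degrees
w = rng.exponential(mean_w, k.sum())           # one weight per incident edge
owner = np.repeat(np.arange(n_nodes), k)       # node that each weight belongs to
s = np.bincount(owner, weights=w, minlength=n_nodes)  # strength of each node

w2 = 2 * mean_w**2                             # second moment of an exponential
pred_mean = mean_w * mean_k                    # <s>   = <w><k>
pred_second = w2 * mean_k + mean_w**2 * mean_k**2   # <s^2> = <w^2><k> + <w>^2<k>^2

emp_mean = s.mean()
emp_second = (s**2).mean()
print(emp_mean, pred_mean)
print(emp_second, pred_second)
```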
To capture a wide range of realistic weight distributions, the second moment of the weight distribution is parameterized as a quadratic function of its mean: ⟨w²⟩ = α₀ + α₁⟨w⟩ + α₂⟨w⟩². The five distributions studied—Dirac (deterministic), Poisson, geometric, exponential, and signed Bernoulli—are represented by specific (α₀,α₁,α₂) triples: (0,0,1), (0,1,1), (0,−1,2), (0,0,2), and (1,0,0), respectively.
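The triples can be sanity-checked by sampling. The sketch below assumes the geometric distribution is supported on 1, 2, ... (the convention for which ⟨w²⟩ = 2⟨w⟩² − ⟨w⟩) and that signed-Bernoulli weights take the values ±1; the dictionary and helper names are illustrative.

```python
import numpy as np

ALPHAS = {                       # (alpha0, alpha1, alpha2) as listed above
    "dirac": (0, 0, 1),
    "poisson": (0, 1, 1),
    "geometric": (0, -1, 2),
    "exponential": (0, 0, 2),
    "signed_bernoulli": (1, 0, 0),
}

def second_moment(name, mean):
    """<w^2> = alpha0 + alpha1 <w> + alpha2 <w>^2."""
    a0, a1, a2 = ALPHAS[name]
    return a0 + a1 * mean + a2 * mean**2

rng = np.random.default_rng(2)
mean, size = 2.5, 400_000
samples = {
    "dirac": np.full(size, mean),
    "poisson": rng.poisson(mean, size),
    "geometric": rng.geometric(1.0 / mean, size),   # support 1, 2, ...; mean 1/p
    "exponential": rng.exponential(mean, size),
}
empirical = {name: float(np.mean(x.astype(float) ** 2))
             for name, x in samples.items()}
predicted = {name: second_moment(name, mean) for name in samples}

# Signed Bernoulli needs a mean in [-1, 1]: P(+1) = 0.7 gives mean 0.4.
sb = rng.choice([-1.0, 1.0], size=size, p=[0.3, 0.7])
sb_empirical = float(np.mean(sb**2))
sb_predicted = second_moment("signed_bernoulli", 0.4)
```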
Substituting these moments into the eigenvalue expression yields a compact formula for the leading eigenvalue when the average intra‑ and inter‑community weights are equal (⟨w_in⟩ = ⟨w_out⟩ = W/2):
λ = W·Δk/2 + (K·C)/(2W·Δk)
where K = ⟨k_in⟩ + ⟨k_out⟩, Δk = ⟨k_in⟩ − ⟨k_out⟩, and C = 4α₀ + 2Wα₁ + W²α₂, so that ⟨w²⟩ = C/4. This follows from ⟨Δs⟩ = W·Δk/2 and ⟨Δs²⟩ = (K·C + W²Δk²)/4. The first term grows linearly with Δk, while the second term decays as 1/Δk. As the mixing between communities increases (Δk decreases), λ first drops, reaches a minimum, and then artificially rises because the approximation breaks down. The point where the derivative dλ/dΔk = 0 defines the detectability threshold:
Δk* = √(K·C)/W.
For the Dirac case (α₀=α₁=0, α₂=1) we have C = W², and the threshold reduces to the classic unweighted result Δk* = √K. Other weight distributions modify C, thus shifting the threshold. Exponential weights (α₂=2) double C, giving Δk* = √2 · √K; i.e., the threshold is larger by a factor √2 regardless of W. Poisson weights give C = W² + 2W and signed‑Bernoulli weights give C = 4, so in both cases the threshold depends on W: it exceeds the Dirac value for small total weight and approaches it as W grows (for signed‑Bernoulli weights, as W approaches its maximum value of 2). Geometric weights give C = 2W² − 2W and lie in between, approaching the exponential behavior for large W.
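As a numerical cross-check, the sketch below evaluates λ ≈ ⟨Δs²⟩/⟨Δs⟩ directly from the Poisson-degree moment expressions, locates its minimum over Δk on a grid, and compares the result with the closed form √(K·C)/W that those moments imply. The values of K and W are arbitrary illustrations; signed-Bernoulli weights are given a smaller W because their mean W/2 cannot exceed 1.

```python
import numpy as np

ALPHAS = {
    "dirac": (0, 0, 1),
    "poisson": (0, 1, 1),
    "geometric": (0, -1, 2),
    "exponential": (0, 0, 2),
    "signed_bernoulli": (1, 0, 0),
}

def leading_eigenvalue(dk, K, W, alphas):
    """lambda ~ <ds^2>/<ds> for Poisson degrees and <w_in> = <w_out> = W/2."""
    a0, a1, a2 = alphas
    w1 = W / 2                                   # mean weight
    w2 = a0 + a1 * w1 + a2 * w1**2               # second moment of the weight
    k_in, k_out = (K + dk) / 2, (K - dk) / 2
    s_in, s_out = w1 * k_in, w1 * k_out          # <s_in>, <s_out>
    s_in2 = w2 * k_in + w1**2 * k_in**2          # <s_in^2>
    s_out2 = w2 * k_out + w1**2 * k_out**2       # <s_out^2>
    second = s_in2 + s_out2 - 2 * s_in * s_out   # <ds^2>, independent in/out parts
    return second / (s_in - s_out)

def threshold_numeric(K, W, alphas, points=200_001):
    grid = np.linspace(1e-3, K, points)          # candidate values of dk
    return grid[np.argmin(leading_eigenvalue(grid, K, W, alphas))]

def threshold_closed(K, W, alphas):
    a0, a1, a2 = alphas
    C = 4 * a0 + 2 * W * a1 + W**2 * a2
    return np.sqrt(K * C) / W

K = 16.0
total_weight = {"dirac": 4.0, "poisson": 4.0, "geometric": 4.0,
                "exponential": 4.0, "signed_bernoulli": 1.0}
results = {name: (threshold_numeric(K, W, ALPHAS[name]),
                  threshold_closed(K, W, ALPHAS[name]))
           for name, W in total_weight.items()}
for name, (num, closed) in results.items():
    print(f"{name:17s} numeric={num:.4f} closed={closed:.4f}")
```

The grid search is deliberately crude; because λ has the form aΔk + b/Δk, which is convex for Δk > 0, the grid minimizer lies within one grid step of the analytical minimum.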
The authors also analyze the complementary scenario where the topology is homogeneous (Erdős–Rényi with ⟨k_in⟩ = ⟨k_out⟩ = K/2) and community structure is encoded solely in weight differences (Δw = ⟨w_in⟩ − ⟨w_out⟩, with W = ⟨w_in⟩ + ⟨w_out⟩ held fixed). Under the same approximation, the moment expressions above give λ = C/(2Δw) + (K + α₂)Δw/2, whose minimum defines the threshold Δw* = √(C/(K + α₂)), approximately √(C/K) for large K. A larger second moment of the weights (larger C) again makes detection harder. Crucially, when weights are generated independently of the community assignment (i.e., ⟨w_in⟩ = ⟨w_out⟩), they act as noise and degrade performance; ignoring them expands the detectable region.
Numerical experiments confirm the analytical predictions. The authors compute λ and the order parameter P for synthetic networks generated with each weight distribution, varying the mixing parameter and the average degree. The observed phase transitions match the theoretical Δk* values. Moreover, they repeat the analysis with the Leiden algorithm on multi-community networks (different numbers of groups) and find the same hierarchy of thresholds, suggesting that the ordering of the weight distributions is not specific to the spectral method.
In summary, the paper provides a unified theoretical framework that links the detectability of community structure in weighted graphs to the first two moments of both degree and weight distributions. It demonstrates that weight variability generally raises the detectability threshold, making communities harder to recover, and that incorporating edge weights is beneficial only when those weights carry information about the underlying community structure. The derived expression Δk* = √(K·C)/W generalizes the classic unweighted threshold Δk* = √K and offers a practical tool for assessing whether weighted information will help or hinder community detection in real-world networks.