A Well-Behaved Alternative to the Modularity Index
This paper reviews the modularity index and suggests an alternative index of the quality of a division of a network into subsets.
💡 Research Summary
The paper begins by framing the problem of community detection in networks as the identification of “cohesive subsets” – groups of vertices that are densely connected internally and sparsely connected to the rest of the graph. It stresses that any quality measure for a partition must satisfy three intuitive criteria: (1) it should decrease as the number of external edges (edges crossing between different subsets) grows, (2) it should increase as the number of internal edges (edges staying within a subset) grows, and (3) it should be immune to extraneous factors such as the sheer number of subsets or unequal edge counts among them.
The authors then turn to the most widely used metric, Newman and Girvan’s modularity Q. Q is defined as the sum over all subsets of the difference between the observed fraction of edges that are internal to the subset (e_i) and the fraction expected under a random-graph null model that preserves the degree sequence (E(d_i)^2, the squared share of total degree held by the subset's vertices). While Q correctly penalizes added external edges, dropping in value as cross-cutting edges are introduced, it fails to respond to the removal of internal edges. In a series of illustrative figures the authors show that deleting one internal edge from each of two perfect cliques leaves Q unchanged at 0.50, and that Q still reports 0.50 even after edges are repeatedly deleted until only a single edge remains in each cluster. Thus Q is blind to internal density.
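The behaviour described above can be checked with a minimal sketch of Q for an undirected, unweighted graph. The function and variable names (`modularity`, `clique`, `membership`) are illustrative assumptions, and the two 4-vertex cliques are a hypothetical example rather than the graphs drawn in the paper's figures.

```python
from itertools import combinations


def modularity(edges, membership):
    """Q = sum over subsets of (internal edge fraction - squared degree share)."""
    m = len(edges)
    groups = set(membership.values())
    internal = {g: 0 for g in groups}  # edges with both endpoints in g
    degree = {g: 0 for g in groups}    # edge endpoints attached to g
    for u, v in edges:
        gu, gv = membership[u], membership[v]
        degree[gu] += 1
        degree[gv] += 1
        if gu == gv:
            internal[gu] += 1
    return sum(internal[g] / m - (degree[g] / (2 * m)) ** 2 for g in groups)


def clique(nodes):
    """All unordered pairs among the given nodes."""
    return list(combinations(nodes, 2))


# Hypothetical example: vertices 0-3 form subset A, vertices 4-7 form subset B.
membership = {v: ("A" if v < 4 else "B") for v in range(8)}
full = clique(range(0, 4)) + clique(range(4, 8))
thinned = [e for e in full if e not in {(0, 1), (4, 5)}]  # one edge removed per clique
sparse = [(0, 1), (4, 5)]                                 # a single edge left per subset
bridged = full + [(0, 4)]                                 # one external edge added

print(round(modularity(full, membership), 3))     # 0.5
print(round(modularity(thinned, membership), 3))  # 0.5
print(round(modularity(sparse, membership), 3))   # 0.5
print(round(modularity(bridged, membership), 3))  # 0.423
```

In this sketch Q stays at 0.50 because edges are removed evenly from both subsets, so each subset's share of edges and of total degree is unchanged; only the added external edge moves the score.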
Beyond this insensitivity, Q suffers from two additional distortions. First, its maximum attainable value is bounded by (m‑1)/m, where m is the number of subsets evaluated. Consequently, a perfect partition of two cliques can never exceed 0.50, whereas a perfect partition of 100 cliques can approach 0.99, even though the structural quality is identical. Second, Q is confounded by edge‑inequality across subsets: two graphs with the same number of subsets and no external edges can yield dramatically different Q values (e.g., 0.50 versus 0.14) simply because one subset contains many more internal edges than the other. These properties make Q an unreliable indicator when the number of communities varies or when community sizes are heterogeneous.
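Both distortions follow from a simple special case. When a partition has no external edges, each subset's share of edges equals its share of total degree, so Q reduces to one minus the sum of squared edge shares. The sketch below assumes that special case; the function name `q_without_external_edges` and the edge counts passed to it are hypothetical illustrations, not values taken from the paper.

```python
def q_without_external_edges(internal_edge_counts):
    """Q for a partition with no external edges: 1 - sum of squared edge shares."""
    total = sum(internal_edge_counts)
    return 1.0 - sum((count / total) ** 2 for count in internal_edge_counts)


# Equal subsets: Q equals (m - 1) / m, so it depends only on the number of subsets.
print(round(q_without_external_edges([6, 6]), 3))     # 0.5
print(round(q_without_external_edges([6] * 100), 3))  # 0.99

# Two subsets with very unequal edge counts (a made-up 1-versus-12 split).
print(round(q_without_external_edges([1, 12]), 3))    # 0.142
```

With m equal subsets every share is 1/m, which gives 1 - m(1/m)^2 = (m - 1)/m. With unequal shares the sum of squares grows, so Q falls even though no external edge exists.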
To address these shortcomings, the paper introduces Borgatti’s η index, which is implemented in the UCINET software package. η is the Pearson correlation between two binary matrices: X, where X_{jk}=1 if vertices j and k belong to the same subset, and Y, where Y_{jk}=1 if an edge actually exists between j and k. By construction η ranges from -1 (no internal edges, all possible external edges present) to +1 (perfect internal connectivity, no external edges). The authors demonstrate that η behaves as desired: it declines smoothly as external edges are added, and, unlike Q, it also declines when internal edges are removed. Moreover, η is invariant to the number of subsets and to disparities in edge counts among subsets: a perfect partition into cliques yields η=1 whether there are two, ten, or a hundred cliques, and a perfect partition whose subsets contain very different numbers of internal edges still scores η=1. The only trade-off is that η does not incorporate a null-model expectation based on the degree distribution, so it lacks the statistical baseline that Q provides. The authors deem this a minor cost compared with the ambiguities introduced by Q.
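The definition of η can be sketched in a few lines. This is an illustrative reading of the description above, not UCINET's implementation: it assumes an undirected graph, computes the correlation over unordered vertex pairs j < k (so the diagonal is excluded), and raises an error when either matrix is constant, since the Pearson correlation is then undefined. The names `eta`, `clique`, and the example graphs are hypothetical.

```python
from itertools import combinations
from math import sqrt


def eta(vertices, edges, membership):
    """Pearson correlation between same-subset indicators X and edge indicators Y."""
    edge_set = {frozenset(e) for e in edges}
    x, y = [], []
    for j, k in combinations(vertices, 2):  # assumption: off-diagonal pairs only
        x.append(1 if membership[j] == membership[k] else 0)
        y.append(1 if frozenset((j, k)) in edge_set else 0)
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("eta is undefined when X or Y is constant")
    return cov / sqrt(var_x * var_y)


def clique(nodes):
    """All unordered pairs among the given nodes."""
    return list(combinations(nodes, 2))


vertices = list(range(8))
membership = {v: ("A" if v < 4 else "B") for v in vertices}
full = clique(range(0, 4)) + clique(range(4, 8))
thinned = [e for e in full if e not in {(0, 1), (4, 5)}]  # internal edges removed
bridged = full + [(0, 4)]                                 # external edge added
bipartite = [(j, k) for j in range(0, 4) for k in range(4, 8)]

print(round(eta(vertices, full, membership), 3))       # 1.0
print(eta(vertices, thinned, membership) < 1.0)        # True
print(eta(vertices, bridged, membership) < 1.0)        # True
print(round(eta(vertices, bipartite, membership), 3))  # -1.0
```

Under these assumptions the score reaches +1 only when X and Y coincide on every pair, which is why removing an internal edge lowers η just as adding an external edge does.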
In summary, the paper provides a thorough critique of modularity, exposing its insensitivity to internal edge loss, its dependence on the number of communities, and its susceptibility to edge‑inequality across groups. It then presents η as a well‑behaved alternative that satisfies the three fundamental criteria for a community‑quality metric. The authors recommend that researchers adopt η as the primary evaluation tool for community detection, using modularity only as a supplementary statistic when a degree‑preserving null model is specifically required.