Weighted Label Propagation Algorithm based on Local Edge Betweenness


Complex networks, especially social networks, can often be divided into disjoint partitions in which the ratio of internal edges (edges between vertices in the same partition) to outer edges (edges between vertices in different partitions) is high. These partitions are generally called communities. Detecting them helps data scientists extract meaningful information from graphs and analyze them. Over the last decades, various algorithms have been proposed to detect communities in graphs, each approaching the problem from a different perspective. However, most of these algorithms have high time complexity and costly calculations, which makes them unsuitable for detecting communities in large graphs with millions of edges and nodes. In this paper, we improve the Label Propagation Algorithm by using an edge betweenness metric, so that it can identify distinct communities in both real-world and artificial networks in near-linear time with acceptable accuracy. The proposed algorithm can also detect communities in weighted graphs. Empirical experiments show that the accuracy and speed of the proposed algorithm are acceptable and that the algorithm is scalable.


💡 Research Summary

The paper introduces a novel community-detection method that augments the classic Label Propagation Algorithm (LPA) with a locally computed edge-betweenness metric, termed Local Edge Betweenness (LEB). Traditional LPA is prized for its simplicity and near-linear time complexity (O(m)), but it is unstable: its results depend on the random order in which vertices are updated and on random tie-breaking, and labels tend to oscillate at community boundaries, which reduces detection accuracy. Global edge betweenness, while informative, requires all-pairs shortest-path calculations and thus incurs a prohibitive cost of O(n m) or higher on large graphs.
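
For reference, the classic LPA baseline can be sketched in a few lines of Python. This is a minimal illustration of the standard majority-vote rule, not the paper's code; the function name `label_propagation`, the adjacency-dictionary input, and the `seed` parameter are assumptions made for this example.

```python
import random
from collections import Counter

def label_propagation(adj, max_iters=100, seed=0):
    """Asynchronous LPA on an undirected graph given as {node: set_of_neighbors}.

    Illustrative baseline only; names and parameters are assumptions.
    """
    rng = random.Random(seed)
    labels = {v: v for v in adj}      # every vertex starts with its own label
    nodes = list(adj)
    for _ in range(max_iters):
        rng.shuffle(nodes)            # random visiting order (a source of instability)
        changed = False
        for v in nodes:
            if not adj[v]:
                continue              # isolated vertices keep their label
            counts = Counter(labels[u] for u in adj[v])
            top = max(counts.values())
            candidates = [lab for lab, c in counts.items() if c == top]
            if labels[v] in candidates:
                continue              # already holds a most frequent label
            labels[v] = rng.choice(candidates)  # random tie-breaking
            changed = True
        if not changed:
            break                     # every vertex agrees with its neighborhood
    return labels

# Two disconnected triangles: each should end up with one label of its own.
triangles = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {4, 5}, 4: {3, 5}, 5: {3, 4}}
result = label_propagation(triangles)
```

The shuffle and the random choice among tied labels are the two places where different runs can diverge, which is the instability the summary refers to.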

LEB addresses this bottleneck by restricting betweenness computation to the 2-hop neighborhood of each vertex. For every vertex, the algorithm enumerates all shortest paths that stay within its immediate and secondary neighbors, counting how many of those paths traverse each incident edge. This localized approach yields an edge importance score that is high for "bridge" edges connecting different communities and low for intra-community edges, yet it can be computed in O(m) time on sparse graphs, because each vertex processes only a small local subgraph whose size depends on its degree rather than on the size of the whole network.
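
One way to read this description is as a depth-limited variant of Brandes-style edge betweenness, where shortest paths are only counted up to two hops from each source vertex. The sketch below follows that reading for unweighted graphs. It is an assumption about the details, not the paper's exact definition, and the name `local_edge_betweenness` and the `radius` parameter are hypothetical.

```python
from collections import deque, defaultdict

def local_edge_betweenness(adj, radius=2):
    """Depth-limited edge betweenness for an undirected, unweighted graph.

    adj is {node: set_of_neighbors}. Only shortest paths of length <= radius
    are counted. One possible reading of LEB, not the paper's definition.
    """
    score = defaultdict(float)
    for s in adj:
        dist = {s: 0}
        sigma = defaultdict(float)    # number of shortest paths from s
        sigma[s] = 1.0
        preds = defaultdict(list)     # shortest-path predecessors
        order = []                    # vertices in BFS (non-decreasing distance) order
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            if dist[v] == radius:
                continue              # do not explore beyond the local radius
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = defaultdict(float)
        for w in reversed(order):     # accumulate path credit back toward s
            for v in preds[w]:
                credit = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += credit
                delta[v] += credit
    # Each vertex pair was counted once from each endpoint.
    return {edge: value / 2.0 for edge, value in score.items()}

# Two triangles joined by the bridge 2-3.
barbell = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
leb = local_edge_betweenness(barbell)
bridge = max(leb, key=leb.get)        # frozenset({2, 3}), with score 5.0
```

On this small graph the bridge receives the highest score because every local path that crosses between the triangles has to use it, while edges inside a triangle share their traffic with alternative routes.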

The proposed Weighted Label Propagation based on Local Edge Betweenness (W‑LPA‑LEB) proceeds in two phases. In the first phase, each vertex adopts the label of the neighbor linked by the edge with the smallest LEB value. Because low‑LEB edges are typically internal to a community, this step preferentially spreads labels within dense substructures before any cross‑community diffusion occurs. In the second phase, the algorithm falls back to the standard LPA majority‑vote rule, but now the label distribution is already biased toward coherent clusters, leading to faster convergence and reduced label flipping.
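
The two phases can be sketched as follows. This is a minimal illustration under several assumptions that the summary does not settle: vertices are visited in sorted order, ties are broken deterministically so the example is reproducible, and the LEB scores are passed in as a precomputed dictionary keyed by `frozenset((u, v))`. The function name `two_phase_propagation` and the score values in the example are hypothetical.

```python
from collections import Counter

def two_phase_propagation(adj, leb, max_iters=100):
    """Seed labels along low-LEB edges, then refine with majority voting.

    adj: {node: set_of_neighbors}; leb: {frozenset((u, v)): score}.
    Visiting order and tie-breaking are assumptions made for this sketch.
    """
    labels = {v: v for v in adj}
    order = sorted(adj)

    # Phase 1: each vertex copies the label of the neighbor reached through
    # its smallest-LEB edge (ties go to the smallest neighbor id).
    for v in order:
        if adj[v]:
            u = min(sorted(adj[v]), key=lambda n: leb[frozenset((v, n))])
            labels[v] = labels[u]

    # Phase 2: standard majority vote until no label changes.
    for _ in range(max_iters):
        changed = False
        for v in order:
            if not adj[v]:
                continue
            counts = Counter(labels[u] for u in adj[v])
            top = max(counts.values())
            best = sorted(lab for lab, c in counts.items() if c == top)
            if labels[v] not in best:
                labels[v] = best[0]   # deterministic tie-break for the example
                changed = True
        if not changed:
            break
    return labels

# Two triangles joined by the bridge 2-3; illustrative scores, high on the bridge.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
scores = {frozenset(e): s for e, s in [((0, 1), 1.0), ((0, 2), 2.0), ((1, 2), 2.0),
                                       ((2, 3), 5.0), ((3, 4), 2.0), ((3, 5), 2.0),
                                       ((4, 5), 1.0)]}
communities = two_phase_propagation(graph, scores)
```

In this example the first phase already groups most vertices inside their own triangle, so the majority-vote phase needs only one corrective update before it stops.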

Weighted graphs are handled naturally: edge weights are transformed into inverse costs (e.g., cost = 1/weight) when computing shortest paths for LEB. Consequently, high‑weight edges—representing strong similarity or interaction—receive low LEB scores and dominate the label‑propagation process, while low‑weight edges are treated as weaker ties. This eliminates the need for separate preprocessing or heuristic weighting schemes that many LPA extensions require.
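
The inverse-cost transformation can be illustrated with a small Dijkstra search in which every edge costs 1/weight. This is a sketch of the general idea only; the function name `inverse_weight_distances` and the nested-dictionary graph format are assumptions, and the paper may restrict the search to the local neighborhood in ways not shown here.

```python
import heapq

def inverse_weight_distances(wadj, source):
    """Shortest-path distances where each edge costs 1 / weight.

    wadj is {node: {neighbor: weight}} with strictly positive weights.
    Illustrative sketch; not the paper's implementation.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, float("inf")):
            continue                  # stale heap entry
        for u, weight in wadj[v].items():
            if weight <= 0:
                raise ValueError("edge weights must be positive")
            nd = d + 1.0 / weight     # strong ties are cheap to traverse
            if nd < dist.get(u, float("inf")):
                dist[u] = nd
                heapq.heappush(heap, (nd, u))
    return dist

# The weak direct tie a-c (weight 0.5, cost 2.0) loses to the strong
# two-hop route a-b-c (cost 0.25 + 0.25 = 0.5).
weighted = {
    "a": {"b": 4.0, "c": 0.5},
    "b": {"a": 4.0, "c": 4.0},
    "c": {"a": 0.5, "b": 4.0},
}
distances = inverse_weight_distances(weighted, "a")
```

Because shortest paths route around weak ties in this way, the inverse transformation changes which edges the local paths use, and therefore which edges accumulate betweenness.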

Complexity analysis shows that LEB calculation per vertex is O(k²) where k is the average degree, yielding an overall O(m) cost for sparse networks. Both propagation phases also run in O(m), so the total runtime remains near‑linear (practically O(m) to O(m log n)). Memory consumption stays modest because only local adjacency information and LEB scores are stored, unlike global betweenness methods that need all‑pairs distance matrices.
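
The step from O(k²) per vertex to O(m) overall can be made explicit. The short derivation below assumes an undirected graph with n vertices and m edges, treats every vertex as having degree close to the average k, and introduces a hypothetical constant c for the per-vertex work; none of these symbols beyond k, n, and m come from the summary.

```latex
\begin{align*}
T_{\mathrm{LEB}} &\approx \sum_{v \in V} c\,k^{2} = c\,n\,k^{2} \\
                 &= c\,k\,(n k) = 2c\,k\,m
                 && \text{since } n k = \sum_{v \in V} \deg(v) = 2m \\
                 &= O(m) && \text{when } k \text{ is bounded by a constant.}
\end{align*}
```

The bound relies on degrees being roughly uniform. In graphs with a heavily skewed degree distribution, the sum of squared degrees can be much larger than n k², so the O(m) estimate should be read as holding for sparse networks without very large hubs.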

Empirical evaluation covers synthetic LFR benchmarks and several real‑world networks: Zachary’s Karate Club, Dolphin social network, American college football, and the Amazon product co‑purchase graph. Accuracy is measured using Normalized Mutual Information (NMI) and modularity. On LFR graphs with mixing parameter μ ranging from 0.1 to 0.6, W‑LPA‑LEB consistently outperforms vanilla LPA by 10–15 % in NMI and matches or slightly exceeds the performance of LPAc (the LPA‑global‑betweenness hybrid). In weighted scenarios such as Amazon, incorporating edge weights into LEB raises modularity from 0.68 (standard LPA) to 0.74, demonstrating the method’s ability to exploit strength of connections.
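
As a reminder of what the NMI score measures, the sketch below computes it for two flat partitions given as label lists. It assumes the normalization 2·I(A;B) / (H(A) + H(B)), which is common in community-detection benchmarks; the summary does not say which normalization the paper uses, and the function name `nmi` is chosen for this example.

```python
import math
from collections import Counter

def nmi(labels_a, labels_b):
    """Normalized mutual information between two partitions of the same items.

    Uses 2 * I(A;B) / (H(A) + H(B)); the normalization is an assumption.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("partitions must label the same items")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    def entropy(counts):
        return -sum(c / n * math.log(c / n) for c in counts.values())

    h_a, h_b = entropy(count_a), entropy(count_b)
    mutual = sum(
        c / n * math.log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    if h_a + h_b == 0:
        return 1.0                    # both partitions are a single community
    return 2.0 * mutual / (h_a + h_b)

# Label names do not matter, only the grouping does.
same_grouping = nmi([0, 0, 1, 1], [5, 5, 7, 7])
unrelated = nmi([0, 0, 1, 1], [0, 1, 0, 1])
```

A score of 1 means the detected communities match the ground truth up to relabeling, and a score of 0 means the two partitions carry no information about each other.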

Scalability tests on graphs with up to one million vertices and several million edges reveal that the algorithm converges within 10 minutes on a commodity workstation, using less than 3 GB of RAM. Compared to LPAc, runtime is reduced by 30–50 % while maintaining comparable detection quality.

In summary, the paper’s contributions are threefold: (1) a locally computed edge‑betweenness metric that preserves the discriminative power of global betweenness without its computational burden; (2) a two‑stage label propagation scheme that first reinforces intra‑community cohesion and then finalizes community assignments; (3) seamless support for weighted graphs, enabling more accurate community delineation in networks where edge strength carries semantic meaning. The authors suggest future work on adaptive neighborhood radii for LEB, incremental updates for dynamic graphs, and integration with overlapping‑community frameworks.

