Detecting network communities by propagating labels under constraints
We investigate the recently proposed label-propagation algorithm (LPA) for identifying network communities. We reformulate the LPA as an equivalent optimization problem, giving an objective function whose maxima correspond to community solutions. By considering properties of the objective function, we identify conceptual and practical drawbacks of the label propagation approach, most importantly the disparity between increasing the value of the objective function and improving the quality of communities found. To address the drawbacks, we modify the objective function in the optimization problem, producing a variety of algorithms that propagate labels subject to constraints; of particular interest is a variant that maximizes the modularity measure of community quality. Performance properties and implementation details of the proposed algorithms are discussed. Bipartite as well as unipartite networks are considered.
💡 Research Summary
The paper revisits the label-propagation algorithm (LPA), a popular method for community detection that is prized for its simplicity and near-linear time complexity. The authors first cast LPA as an explicit discrete optimization problem. By assigning a label c_i to each vertex i and defining the objective function Q(L) = ∑_{i,j} A_{ij} δ(c_i, c_j) (where A is the adjacency matrix and δ is the Kronecker delta), they show that each LPA update step is a greedy move that locally increases Q. This reformulation makes it clear that LPA is essentially trying to maximize the number of adjacent vertex pairs sharing the same label.
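The greedy reading of LPA described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the adjacency-list representation, the function names (objective, lpa_step, label_propagation), and the rule that a vertex keeps its current label when that label is tied for most frequent are all assumptions made for this example. Under those assumptions, and for an undirected network without self-loops, each asynchronous update leaves Q unchanged or raises it.

```python
import random


def objective(adj, labels):
    """Q = sum over ordered pairs (i, j) of A_ij * delta(c_i, c_j)."""
    return sum(1 for i in adj for j in adj[i] if labels[i] == labels[j])


def lpa_step(adj, labels, i, rng):
    """Give vertex i the most frequent label among its neighbors."""
    counts = {}
    for j in adj[i]:
        counts[labels[j]] = counts.get(labels[j], 0) + 1
    if not counts:
        return  # isolated vertex: nothing to propagate
    best = max(counts.values())
    # Assumption: keep the current label if it is already among the best.
    if counts.get(labels[i], 0) == best:
        return
    candidates = sorted(lab for lab, c in counts.items() if c == best)
    labels[i] = rng.choice(candidates)  # random tie-breaking


def label_propagation(adj, seed=0, max_sweeps=100):
    """Asynchronous LPA: sweep vertices in random order until no label changes."""
    rng = random.Random(seed)
    labels = {v: v for v in adj}  # start with one unique label per vertex
    order = list(adj)
    for _ in range(max_sweeps):
        rng.shuffle(order)
        before = dict(labels)
        for i in order:
            lpa_step(adj, labels, i, rng)
        if labels == before:
            break
    return labels


# Hypothetical toy network: two triangles joined by the edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
labels = label_propagation(adj, seed=0)
print(labels, objective(adj, labels))
```

The labels returned depend on the random seed and the visiting order, so different runs can stop at different local maxima of Q.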
However, the paper demonstrates that maximizing Q does not guarantee high-quality communities. In any connected graph, the global maximum of Q assigns a single label to all vertices, which is a trivial and useless partition. Moreover, Q is biased toward large clusters, penalizes small but meaningful groups, and is highly sensitive to the number of distinct labels. Consequently, LPA’s outcomes can vary dramatically with different random initializations and with synchronous versus asynchronous update schedules.
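The gap between a high value of Q and a useful partition can be checked by brute force on a small connected network. The sketch below is illustrative only: the toy edge list (two triangles joined by one edge) and the names edges, objective, two_groups, and one_group are assumptions for this example and do not come from the paper.

```python
from itertools import product

# Hypothetical toy network: two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
n = 6


def objective(labels):
    """Q = sum_{i,j} A_ij * delta(c_i, c_j); each undirected edge counts twice."""
    return 2 * sum(1 for i, j in edges if labels[i] == labels[j])


two_groups = (0, 0, 0, 1, 1, 1)  # the intuitive split into two triangles
one_group = (0,) * n             # the trivial single-label assignment

# Exhaustive search over all 6**6 labelings of this small network.
best = max(objective(c) for c in product(range(n), repeat=n))

print(objective(two_groups), objective(one_group), best)  # prints "12 14 14"
```

On this network the trivial single-label assignment reaches the global maximum of Q, while the intuitive two-triangle split scores lower, which is the disparity the summary describes.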
To overcome these conceptual shortcomings, the authors propose a constrained‑optimization framework. They augment the original objective with additional terms and hard constraints that encode desirable properties of a community partition. The most important variant incorporates the modularity measure M = (1/2m)∑_{i,j}