Finding missing edges in networks based on their community structure

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Many edge prediction methods have been proposed, based on various local or global properties of the structure of an incomplete network. Community structure is another significant feature of networks: Vertices in a community are more densely connected than average. It is often true that vertices in the same community have “similar” properties, which suggests that missing edges are more likely to be found within communities than elsewhere. We use this insight to propose a strategy for edge prediction that combines existing edge prediction methods with community detection. We show that this method gives better prediction accuracy than existing edge prediction methods alone.


💡 Research Summary

The paper addresses the problem of predicting missing edges in incomplete networks by leveraging community structure. The authors propose a two‑stage framework. First, they detect communities using efficient algorithms (Infomap for non‑overlapping and OSLOM for overlapping communities). Second, they apply local vertex similarity measures, namely Common Neighbors (CN), Adamic‑Adar (AA), and Resource Allocation (RA), to score potential edges, handling intra‑community and inter‑community pairs separately. Intra‑community pairs are ranked first, reflecting the assumption that vertices within the same community are more likely to be connected.
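The two-stage idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the community labels are hard-coded here instead of coming from Infomap or OSLOM, communities are assumed to be disjoint, and the function names `adamic_adar` and `rank_candidate_edges` are hypothetical. The sketch scores every non-adjacent pair with Adamic‑Adar and sorts intra-community pairs ahead of inter-community pairs.

```python
import math
from itertools import combinations


def adamic_adar(adj, u, v):
    """Adamic-Adar score: sum of 1/log(degree) over the common neighbours of u and v."""
    # A common neighbour is adjacent to both u and v, so its degree is at least 2
    # and log(degree) is strictly positive.
    return sum(1.0 / math.log(len(adj[w])) for w in adj[u] & adj[v])


def rank_candidate_edges(adj, community):
    """Rank non-adjacent pairs: intra-community pairs first, then by descending score.

    adj       -- dict mapping each vertex to the set of its neighbours
    community -- dict mapping each vertex to one community label (assumed disjoint)
    """
    candidates = []
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # the edge already exists, so it is not a candidate
        same_community = community[u] == community[v]
        candidates.append((same_community, adamic_adar(adj, u, v), (u, v)))
    # True sorts above False, so intra-community pairs come first.
    candidates.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [pair for _, _, pair in candidates]


# Toy graph: two 4-vertex groups joined by one bridge edge (3, 4).
# Each group is one edge short of a clique: (0, 3) and (4, 7) are absent.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3),
         (4, 5), (4, 6), (5, 6), (5, 7), (6, 7),
         (3, 4)]
adj = {n: set() for n in range(8)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

# Hypothetical labels standing in for the output of a community detection step.
community = {0: "A", 1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B", 7: "B"}

ranking = rank_candidate_edges(adj, community)
print(ranking[:2])
```

In this toy graph the only intra-community candidates are (0, 3) and (4, 7), so they head the ranking regardless of how the inter-community pairs score. Swapping `adamic_adar` for a Common Neighbors or Resource Allocation function would leave the ranking logic unchanged.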

To evaluate the approach, the authors generate synthetic LFR benchmark networks with varying mixing parameters and overlapping settings, and they also use several real‑world datasets (Southern women, football, Fb160, Netscience, Email, Blogs). Missing edges are simulated by randomly deleting a fraction of existing edges, and performance is measured using the area under the ROC curve (AUC). The proposed methods (Infomap+AA, OSLOM+AA, etc.) are compared against baseline similarity scores (CN, AA, RA), Preferential Attachment (PA), and the more computationally intensive Hierarchical Random Graph (HRG) model.
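The evaluation protocol can be sketched in the same spirit. The code below assumes the AUC formulation commonly used in link prediction: the probability that a hidden (deleted) edge receives a higher score than a pair that was never connected, with ties counted as one half. The summary does not state how the authors computed AUC, so this formulation, the helper names `hide_edges` and `auc`, and the fixed random seed are all assumptions.

```python
import random


def hide_edges(edges, fraction, seed=0):
    """Randomly split edges into (observed, hidden), hiding roughly `fraction` of them."""
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    k = int(round(fraction * len(shuffled)))
    return shuffled[k:], shuffled[:k]


def auc(score, hidden, nonexistent):
    """Fraction of (hidden, nonexistent) pairs in which the hidden edge scores higher.

    Ties contribute 0.5. `score` is any callable mapping a vertex pair to a number.
    """
    wins = 0.0
    for h in hidden:
        for n in nonexistent:
            s_h, s_n = score(h), score(n)
            if s_h > s_n:
                wins += 1.0
            elif s_h == s_n:
                wins += 0.5
    return wins / (len(hidden) * len(nonexistent))


# Usage with made-up scores, purely to show the arithmetic.
toy_scores = {("a", "b"): 3, ("c", "d"): 1,                # hidden edges
              ("a", "c"): 2, ("a", "d"): 1, ("b", "d"): 0}  # never-connected pairs
hidden = [("a", "b"), ("c", "d")]
nonexistent = [("a", "c"), ("a", "d"), ("b", "d")]
toy_auc = auc(toy_scores.get, hidden, nonexistent)
print(toy_auc)  # (3 + 1.5) / 6 = 0.75

observed, removed = hide_edges([(i, i + 1) for i in range(10)], fraction=0.2)
print(len(observed), len(removed))  # 8 2
```

An AUC of 0.5 corresponds to random guessing, so values well above 0.5 indicate that a scoring method separates hidden edges from non-edges. The exhaustive double loop is only practical for small graphs; larger experiments would typically sample pairs instead.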

Results show that the community‑aware methods consistently outperform the baseline similarity measures across both synthetic and real networks. The advantage is especially pronounced when the network exhibits clear assortative mixing. Even with overlapping communities, OSLOM successfully captures the structure, and Infomap still yields strong performance by extracting the disjoint core of overlapping groups. HRG can surpass the proposed method only when a very large proportion of edges is missing, which hampers reliable community detection; however, HRG’s cubic time complexity limits its applicability to small graphs. In low‑clustering, power‑law networks (Email, Blogs), PA occasionally yields comparable AUC values, but overall the community‑based approach remains more robust.

The authors also report execution times, demonstrating that their pipeline runs in near‑linear time and is orders of magnitude faster than HRG, making it suitable for large‑scale applications. They conclude that combining community detection with simple local similarity scores provides a practical, accurate, and scalable solution for edge prediction. Future work includes extending the method to weighted, directed, and multipartite networks, and exploring how edge prediction can, in turn, improve community detection algorithms.

