Geometric protean graphs
We study the link structure of on-line social networks (OSNs), and introduce a new model for such networks which may help infer their hidden underlying reality. In the geo-protean (GEO-P) model for OSNs, nodes are identified with points in Euclidean space, and edges are stochastically generated by a mixture of the relative distance of nodes and a ranking function. With high probability, the GEO-P model generates graphs satisfying many observed properties of OSNs, such as power law degree distributions, the small world property, a densification power law, and bad spectral expansion. We introduce the dimension of an OSN based on our model, and examine this new parameter using actual OSN data. We discuss how the geo-protean model may eventually be used as a tool to group users with similar attributes using only the link structure of the network.
💡 Research Summary
The paper introduces the Geometric-Protean (GEO-P) model as a novel stochastic framework for online social networks (OSNs). Nodes are embedded as points in an m-dimensional unit hypercube equipped with a torus L∞ metric, and each node receives a unique rank from 1 (highest) to n (lowest). A node's rank determines its "influence region": the ball centered at the node with volume r(v,t)^{-α} · n^{-β}, where r(v,t) is the rank of node v at time t, α ∈ (0,1) controls rank dependence, and β ∈ (0,1-α) controls overall density. At each discrete time step a new node is placed uniformly at random; for every existing node whose influence region contains the newcomer, an undirected edge is added with probability p ∈ (0,1]. In the same step a uniformly random existing node is deleted, keeping the total number of nodes constant. Ranks are updated after each step, yielding an ergodic Markov chain that converges to a stationary distribution.
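The step rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the initial graph is taken to be n isolated nodes, and the newcomer is assumed to receive a uniformly random rank (the summary does not specify the ranking scheme). An L∞ ball of volume V in m dimensions is a cube of side V^{1/m}, so its radius is V^{1/m} / 2.

```python
import random


def torus_linf(x, y):
    """L-infinity distance between two points on the unit torus [0, 1)^m."""
    return max(min(abs(a - b), 1.0 - abs(a - b)) for a, b in zip(x, y))


def influence_radius(rank, n, m, alpha, beta):
    """Radius of the L-infinity ball whose volume is rank^-alpha * n^-beta."""
    volume = rank ** (-alpha) * n ** (-beta)
    return 0.5 * volume ** (1.0 / m)


def simulate_geo_p(n, m, alpha, beta, p, steps, seed=0):
    """Run `steps` add-one/delete-one rounds starting from n isolated nodes."""
    rng = random.Random(seed)
    pos = {v: tuple(rng.random() for _ in range(m)) for v in range(n)}
    adj = {v: set() for v in range(n)}
    order = list(range(n))  # order[i] is the node holding rank i + 1
    rng.shuffle(order)
    next_id = n
    for _ in range(steps):
        new_pos = tuple(rng.random() for _ in range(m))
        adj[next_id] = set()
        # Link the newcomer to each node whose influence region contains it.
        for i, u in enumerate(order):
            radius = influence_radius(i + 1, n, m, alpha, beta)
            if torus_linf(pos[u], new_pos) <= radius and rng.random() < p:
                adj[u].add(next_id)
                adj[next_id].add(u)
        # Delete a uniformly random existing node together with its edges.
        victim = order.pop(rng.randrange(len(order)))
        for w in adj.pop(victim):
            adj[w].discard(victim)
        del pos[victim]
        # Assumption: the newcomer takes a uniformly random rank; inserting
        # into the list shifts the ranks of all lower-ranked nodes by one.
        pos[next_id] = new_pos
        order.insert(rng.randrange(len(order) + 1), next_id)
        next_id += 1
    return order, pos, adj
```

Calling `simulate_geo_p(n=500, m=2, alpha=0.6, beta=0.2, p=0.8, steps=5000)` returns the rank order, positions, and adjacency sets after 5000 rounds. The per-step cost is linear in n because every existing node is checked; a spatial index would be needed for large n.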
The authors prove four central theorems about graphs generated by GEO-P. First, the degree distribution follows a power law N_k ∝ k^{-(1+1/α)}, so the exponent b = 1 + 1/α can be tuned to match empirical OSN values (typically 2 < b < 3, which corresponds to 1/2 < α < 1). Second, the average degree grows as d ≈ p · n^{1-α-β}, implying a densification power law (average degree diverges with network size, since 1-α-β > 0). Third, the graph diameter satisfies D = Θ( n^{β(1-α)/m} · log^{O(1)} n ); with embedding dimension m = Θ(log n) the polynomial factor n^{β(1-α)/m} is bounded by a constant, so the diameter stays small and the model reproduces the small-world phenomenon while still allowing a controllable growth rate. Fourth, the clustering coefficient exceeds that of an Erdős-Rényi graph with the same average degree: c(G) ≥ (3/4) p · (1-α)^{(1-α)/(1+α)} · m^{-(1-α)/(1+α)} · exp(-Θ(m)). For realistic parameter ranges (e.g., m ≤ (1-α-β) log n log log n) this bound is tight, showing that GEO-P naturally generates high clustering.
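As a quick numerical check of the first two formulas, the sketch below inverts b = 1 + 1/α to recover α from an observed exponent and evaluates the leading-order average degree p · n^{1-α-β}. The function names are hypothetical, and the degree formula is used exactly as quoted in this summary, so any constant factors or lower-order terms in the original theorem are ignored.

```python
def alpha_from_exponent(b):
    """Invert b = 1 + 1/alpha; alpha lies in (0, 1) only when b > 2."""
    if b <= 2:
        raise ValueError("power-law exponent must exceed 2 for alpha in (0, 1)")
    return 1.0 / (b - 1.0)


def predicted_average_degree(n, alpha, beta, p):
    """Leading-order average degree p * n^(1 - alpha - beta) as quoted above."""
    if not (0 < alpha < 1 and 0 < beta < 1 - alpha and 0 < p <= 1):
        raise ValueError("need alpha in (0,1), beta in (0,1-alpha), p in (0,1]")
    return p * n ** (1.0 - alpha - beta)
```

For example, an exponent b = 3 gives α = 0.5, and with β = 0.25, p = 1, and n = 10,000 the predicted average degree is 10,000^{0.25} = 10.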
Spectral analysis reveals a small gap between the first and second eigenvalues of the adjacency matrix, indicating poor spectral expansion and strong community structure, another hallmark of real OSNs. The model therefore simultaneously captures five key empirical properties: scale-free degree distribution, small world, densification, high clustering, and weak spectral expansion.
Beyond theoretical results, the authors propose a “network dimension” metric. By fitting observed OSN statistics (node count, average degree, diameter, power‑law exponent) to the analytical formulas, one can infer the minimal embedding dimension m* required for the GEO‑P model to reproduce the data. This dimension can be interpreted as the number of latent user attributes (geographic, topical, demographic) needed to embed the network in Euclidean space. The paper demonstrates the procedure on real datasets such as Twitter, Flickr, and YouTube, obtaining plausible dimension estimates.
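One way to read the fitting step: if the polylogarithmic factor in the diameter bound is ignored and the diameter is modeled as D ≈ n^{c/m}, then solving for m gives m ≈ c · log n / log D. The sketch below encodes only that rearrangement. The function name and the parameter `c` are hypothetical; under the scaling quoted in this summary c would equal β(1-α), and the default c = 1 is a crude normalization, not a value taken from the paper.

```python
import math


def dimension_from_diameter(n, diameter, c=1.0):
    """Solve D = n^(c/m) for m, ignoring polylogarithmic factors.

    `c` stands in for the constant in the exponent of the diameter bound;
    c = 1.0 is an assumed normalization, not a value from the paper.
    """
    if n <= 1 or diameter <= 1:
        raise ValueError("need n > 1 and diameter > 1")
    return c * math.log(n) / math.log(diameter)
```

With made-up inputs n = 10^6 and D = 10, this gives m ≈ 6 for c = 1. The estimate is sensitive to c and to the measured diameter, so it is best treated as an order-of-magnitude indicator.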
In summary, GEO-P offers a unified, analytically tractable model that bridges geometric and rank-based mechanisms, reproduces the observed OSN characteristics listed above, and introduces a practical tool for reverse-engineering latent user attributes from pure link structure. This work both advances the theoretical understanding of social network formation and provides a foundation for applications in community detection, attribute inference, and network-driven recommendation systems.