Random graphs containing arbitrary distributions of subgraphs

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Traditional random graph models of networks generate networks that are locally tree-like, meaning that all local neighborhoods take the form of trees. In this respect such models are highly unrealistic, most real networks having strongly non-tree-like neighborhoods that contain short loops, cliques, or other biconnected subgraphs. In this paper we propose and analyze a new class of random graph models that incorporates general subgraphs, allowing for non-tree-like neighborhoods while still remaining solvable for many fundamental network properties. Among other things we give solutions for the size of the giant component, the position of the phase transition at which the giant component appears, and percolation properties for both site and bond percolation on networks generated by the model.


💡 Research Summary

The paper addresses a fundamental limitation of classic random‑graph models such as Erdős–Rényi and the configuration model: they generate networks that are locally tree‑like, whereas real‑world systems frequently exhibit dense, non‑tree substructures such as short cycles, cliques, and other biconnected motifs. To bridge this gap, the authors introduce a new class of random‑graph ensembles in which arbitrary subgraphs are treated as elementary building blocks. Each subgraph type (for example, a triangle, a 4‑clique, or a 5‑node cycle) is assigned a prescribed occurrence distribution, typically a Poisson law with a type‑specific mean. The network is then assembled by independently sampling the required number of copies of each subgraph and embedding them at random onto the vertex set. Overlaps are allowed, so a single vertex may belong to several different motifs.
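The assembly procedure described above can be sketched in a few lines of Python. This is a minimal illustration of the summary's description, not necessarily the paper's exact construction. The names `clique`, `cycle`, `poisson`, and `sample_motif_graph`, and the convention of giving each motif type a mean number of copies per vertex, are assumptions made for this example.

```python
import math
import random
from itertools import combinations


def clique(k):
    """Motif template: complete graph on k vertices, as (size, edge list)."""
    return k, list(combinations(range(k), 2))


def cycle(k):
    """Motif template: simple cycle on k vertices, as (size, edge list)."""
    return k, [(i, (i + 1) % k) for i in range(k)]


def poisson(lam, rng):
    """Draw one Poisson(lam) variate (Knuth's method, suitable for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def sample_motif_graph(n, motif_rates, seed=None):
    """Place a Poisson number of copies of each motif on random vertices.

    motif_rates: list of ((size, edges), rate) pairs, where `rate` is the
    assumed mean number of copies per vertex. Returns an adjacency dict.
    """
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for (k, edges), rate in motif_rates:
        # A sum of n Poisson(rate) draws is Poisson(n * rate).
        copies = sum(poisson(rate, rng) for _ in range(n))
        for _ in range(copies):
            nodes = rng.sample(range(n), k)  # k distinct vertices
            for a, b in edges:
                adj[nodes[a]].add(nodes[b])
                adj[nodes[b]].add(nodes[a])
    return adj


if __name__ == "__main__":
    g = sample_motif_graph(1000, [(clique(2), 0.5), (clique(3), 0.3)], seed=1)
    print(sum(len(nbrs) for nbrs in g.values()) / len(g))  # mean degree
```

Storing neighbors in sets means that two motifs which happen to land on the same vertex pair share a single edge. That is a simplification of this sketch, and such coincidences become rare as the number of vertices grows.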

The analytical framework relies on multivariate generating functions. For a given set of subgraph types, the authors define G₀(𝑥) as the generating function for the joint distribution of the numbers of incident subgraphs at a randomly chosen vertex, and G₁(𝑥) for the excess distribution when following a randomly selected “edge‑slot” inside a subgraph. From these functions they derive an “effective degree” that captures not only the number of incident edges but also the contribution of higher‑order connections within motifs. The existence of a giant component is governed by the spectral radius λ_max of a connectivity matrix M whose entries encode the probability that traversing a stub of one subgraph leads to a stub of another. The condition λ_max > 1 generalizes the classic ⟨k⟩ > 1 threshold of the Erdős–Rényi model: when it holds, a macroscopic component emerges. The size S of the giant component satisfies a fixed‑point equation S = 1 − G₀(1 − S), where G₀ now incorporates the full subgraph statistics. Numerical solutions show that networks rich in large cliques experience a rapid increase in S as the clique density grows.
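The fixed-point equation S = 1 − G₀(1 − S) can be solved numerically by simple iteration. The sketch below does this for two illustrative choices of G₀, both of which are assumptions for the example rather than formulas quoted from the paper: Poisson-distributed single edges with mean c, and independent Poisson numbers of single edges and triangles per vertex with means `mu_s` and `mu_t`. In the second case a triangle fails to connect a vertex to the giant component only if both of its other corners fail, which gives the squared term.

```python
import math


def giant_component_size(G0, tol=1e-12, max_iter=10_000):
    """Iterate S <- 1 - G0(1 - S) starting from S = 1 until it settles."""
    S = 1.0
    for _ in range(max_iter):
        S_new = 1.0 - G0(1.0 - S)
        if abs(S_new - S) < tol:
            return S_new
        S = S_new
    return S  # may not have converged close to the transition


def poisson_edges(c):
    """G0 for Poisson single edges with mean degree c (Erdos-Renyi case)."""
    return lambda x: math.exp(c * (x - 1.0))


def poisson_edges_triangles(mu_s, mu_t):
    """G0 for independent Poisson counts of single edges and triangles.

    Assumed illustrative special case; each triangle contributes x**2
    because it attaches the vertex to two other corners.
    """
    return lambda x: math.exp(mu_s * (x - 1.0) + mu_t * (x * x - 1.0))


if __name__ == "__main__":
    print(giant_component_size(poisson_edges(2.0)))
    print(giant_component_size(poisson_edges_triangles(0.5, 0.5)))
```

Starting from S = 1 makes the iteration approach the non-trivial solution from above when one exists. Convergence slows considerably near the transition, so `max_iter` may need to be raised there.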

Percolation is treated in two parallel settings. In site percolation each vertex is retained with probability p; the authors recompute the effective degree distribution by counting surviving vertices inside each motif, leading to a p‑dependent matrix M(p). The critical occupation probability p_c is the value at which the leading eigenvalue of M(p) drops to one. Bond percolation proceeds similarly: each edge inside any subgraph is kept with probability q, yielding a q‑dependent matrix M(q) and a critical bond probability q_c. Because dense motifs provide multiple redundant paths, networks with many high‑order subgraphs exhibit markedly lower p_c and q_c, indicating enhanced robustness.
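As a rough illustration of the eigenvalue criterion for bond percolation, the sketch below builds a hypothetical 2×2 matrix M(q) for a network made of Poisson-distributed single edges and triangles, then bisects for the q at which its leading eigenvalue equals one. The matrix form, the parameter names `mu_s` and `mu_t`, and the helper functions are assumptions made for this example. Only the triangle reach probability q + q²(1 − q), which combines the direct edge and the two-step path, is a standard result.

```python
def spectral_radius(M, iters=200):
    """Leading eigenvalue of a non-negative square matrix (power iteration)."""
    n = len(M)
    v = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(w)  # v sums to 1, so sum(M v) estimates the eigenvalue
        if lam == 0.0:
            return 0.0
        v = [x / lam for x in w]
    return lam


def bond_matrix(q, mu_s, mu_t):
    """Hypothetical M(q): row = stub type followed, column = stub type reached.

    Types are (single-edge stub, triangle corner). Each vertex is assumed to
    carry independent Poisson numbers of both, with means mu_s and mu_t.
    """
    reach_edge = q  # one neighbor, reached if the edge is kept
    # Two other corners, each reached directly or via the third corner.
    reach_tri = 2.0 * (q + q * q * (1.0 - q))
    return [[reach_edge * mu_s, reach_edge * mu_t],
            [reach_tri * mu_s, reach_tri * mu_t]]


def critical_probability(matrix_of, lo=0.0, hi=1.0, tol=1e-10):
    """Bisect for the probability at which the spectral radius crosses 1.

    Assumes the spectral radius increases with the probability. Returns None
    if there is no giant component even at full occupation.
    """
    if spectral_radius(matrix_of(hi)) <= 1.0:
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if spectral_radius(matrix_of(mid)) > 1.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(critical_probability(lambda q: bond_matrix(q, 1.0, 1.0)))
```

The same bisection applies to site percolation once a p-dependent matrix is supplied in place of `bond_matrix`. For richer motif sets the matrix would have one row and column per stub type.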

To validate the theory, the authors fit the model to empirical data from a social friendship network and a protein‑protein interaction network. They extract the empirical distribution of clique sizes and cycle lengths, use these as the λ_i parameters of the Poisson subgraph counts, and generate synthetic graphs. The synthetic ensembles reproduce the original networks’ clustering coefficients, average shortest‑path lengths, and percolation thresholds with high fidelity, demonstrating that the proposed model captures essential structural features absent from tree‑based models.
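One of the quantities compared in such a validation, the clustering coefficient, is straightforward to compute for any graph stored as an adjacency dictionary. The sketch below uses the global definition (three times the number of triangles divided by the number of connected triples). The function name `transitivity` and the input format are assumptions for this example, and the source does not state which clustering definition was used.

```python
from itertools import combinations


def transitivity(adj):
    """Global clustering coefficient of an undirected graph.

    adj maps each vertex to the set of its neighbors. Every pair of
    neighbors of a vertex forms a connected triple centered on it; the
    triple is closed if the two neighbors are themselves adjacent.
    """
    triples = 0
    closed = 0
    for nbrs in adj.values():
        for a, b in combinations(nbrs, 2):
            triples += 1
            if b in adj[a]:
                closed += 1  # each triangle is counted once per corner
    return closed / triples if triples else 0.0


if __name__ == "__main__":
    # A triangle 0-1-2 with one pendant vertex 3 attached to vertex 2.
    g = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    print(transitivity(g))
```

Applying the same function to an empirical network and to synthetic graphs drawn from the fitted model gives a direct comparison of their clustering.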

In summary, the paper provides a mathematically tractable yet highly expressive random‑graph framework that incorporates arbitrary subgraph motifs. It delivers closed‑form criteria for the emergence of a giant component, explicit expressions for its size, and exact percolation thresholds for both site and bond processes. By bridging the gap between analytically solvable models and the richly clustered reality of many complex systems, the work opens avenues for studying dynamical processes—such as epidemic spreading, information diffusion, and cascade failures—on networks whose local architecture is dominated by non‑tree motifs. Future extensions may include temporal evolution of motif distributions, degree‑correlated placement of subgraphs, and coupling with spatial constraints.

