Non-Conservative Diffusion and its Application to Social Network Analysis
The random walk is fundamental to modeling dynamic processes on networks. Metrics based on the random walk have been used in many applications from image processing to Web page ranking. However, how appropriate are random walks for modeling and analyzing social networks? We argue that unlike a random walk, which conserves the quantity diffusing on a network, many interesting social phenomena, such as the spread of information or disease on a social network, are fundamentally non-conservative. When an individual infects her neighbor with a virus, the total amount of infection increases. We classify diffusion processes as conservative and non-conservative and show how these differences impact the choice of metrics used for network analysis, as well as our understanding of network structure and behavior. We show that Alpha-Centrality, which mathematically describes non-conservative diffusion, leads to new insights into the behavior of spreading processes on networks. We give a scalable approximate algorithm for computing the Alpha-Centrality in a massive graph. We validate our approach on the real-world online social network Digg. We show that a non-conservative metric, such as Alpha-Centrality, produces better agreement with empirical measures of influence than conservative metrics, such as PageRank. We hope that our investigation will inspire further exploration into the realms of conservative and non-conservative metrics in social network analysis.
💡 Research Summary
The paper challenges the widespread assumption that random‑walk‑based metrics are universally appropriate for analyzing social networks. It argues that many social processes—information diffusion, viral marketing, epidemic spread—are fundamentally non‑conservative: the quantity being propagated (ideas, infections, memes) can increase over time rather than being merely redistributed. To formalize this distinction, the authors define two classes of diffusion on a directed, weighted graph: conservative diffusion, where the total weight (or probability mass) is preserved at each step, and non‑conservative diffusion, where the total weight may grow.
For conservative diffusion they introduce a linear model in which, at each step, every node distributes a fraction \(\alpha\) of its weight uniformly among its out‑neighbors, while the remaining fraction \(1-\alpha\) is reinjected according to the starting vector \(\mathbf{x}_0\). The resulting update equation \(\mathbf{x}_{t+1} = (1-\alpha)\mathbf{x}_0 + \alpha \mathbf{x}_t W_c\) (with \(W_c = D^{-1}A\) the row‑stochastic transition matrix and \(\mathbf{x}_t\) a row vector) converges to \(\mathbf{x}_\infty = (1-\alpha)\mathbf{x}_0 (I-\alpha W_c)^{-1}\) for \(0 \le \alpha < 1\). This is mathematically identical to the PageRank formulation, establishing PageRank as a metric derived from conservative diffusion.
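The conservative case can be illustrated with a minimal sketch, which is not the authors' code. It iterates the standard PageRank-with-restart recursion \(\mathbf{x} \leftarrow (1-\alpha)\mathbf{x}_0 + \alpha \mathbf{x} W_c\), whose fixed point is the closed form quoted above, and prints the total weight to show that it is preserved. The function name, the dense list-of-lists representation, and the 3-node graph are assumptions made for illustration only.

```python
def conservative_diffusion(adj, x0, alpha=0.85, steps=200):
    """Iterate x <- (1 - alpha) * x0 + alpha * x @ W_c, with W_c = D^{-1} A.

    adj is a dense 0/1 adjacency matrix (list of lists). Every node is
    assumed to have at least one out-link, so W_c is row-stochastic.
    """
    n = len(adj)
    out_deg = [sum(row) for row in adj]
    x = list(x0)
    for _ in range(steps):
        # Restart term: a fraction (1 - alpha) of the weight returns to x0.
        nxt = [(1 - alpha) * x0[j] for j in range(n)]
        for i in range(n):
            # Weight that node i passes to each of its out-neighbors.
            share = alpha * x[i] / out_deg[i]
            for j in range(n):
                if adj[i][j]:
                    nxt[j] += share
        x = nxt
    return x


# Hypothetical directed graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
adj = [[0, 1, 1],
       [0, 0, 1],
       [1, 0, 0]]
x0 = [1 / 3, 1 / 3, 1 / 3]
scores = conservative_diffusion(adj, x0)
print(scores, sum(scores))  # the total stays at 1, up to rounding
```

Because each row of \(W_c\) sums to one, every step only moves weight around; none is created or destroyed.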
In contrast, non‑conservative diffusion is modeled by allowing each node to “print” additional weight for its neighbors. The update becomes \(\mathbf{x}_{t+1} = \mathbf{x}_0 + \alpha \mathbf{x}_t A\), where \(A\) is the adjacency matrix and \(\alpha\) is the replication factor. Unrolling the recursion gives the series \(\mathbf{x}_0 \sum_{k \ge 0} (\alpha A)^k\), which converges to \(\mathbf{x}_\infty = \mathbf{x}_0 (I-\alpha A)^{-1}\) provided \(\alpha < 1/\lambda_1(A)\) (the inverse of the largest eigenvalue). This fixed‑point equation is precisely the definition of Alpha‑Centrality, a metric originally proposed for influence scoring. The authors further connect this formulation to classic epidemic models (SIS), showing that the infection probability vector evolves as \(\mathbf{p}_{t+1} = ((1-\beta)I + \mu A)\mathbf{p}_t\), with \(\mu\) the infection rate and \(\beta\) the recovery rate. The epidemic threshold \(\tau = 1/|\lambda_1|\) emerges naturally: if the effective infection rate \(\mu/\beta\) exceeds \(\tau\), the process spreads to a macroscopic fraction of nodes; otherwise it dies out.
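A small sketch of the non-conservative case follows, again as an illustration under stated assumptions and not the paper's implementation. It estimates \(\lambda_1(A)\) by power iteration, picks \(\alpha\) safely below \(1/\lambda_1\), and iterates \(\mathbf{x} \leftarrow \mathbf{x}_0 + \alpha \mathbf{x} A\), whose limit is \(\mathbf{x}_0 (I-\alpha A)^{-1}\). The helper names, the example graph, and the choice \(\alpha = 0.5/\lambda_1\) are hypothetical.

```python
def dominant_eigenvalue(adj, iters=500):
    """Estimate lambda_1(A) by power iteration.

    Assumes a non-negative, strongly connected, aperiodic graph so that
    the iteration converges to the largest eigenvalue.
    """
    n = len(adj)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(v[i] * adj[i][j] for i in range(n)) for j in range(n)]
        lam = max(w)               # v is scaled so that its largest entry is 1
        v = [wj / lam for wj in w]
    return lam


def non_conservative_diffusion(adj, x0, alpha, steps=500):
    """Iterate x <- x0 + alpha * x @ A, i.e. x0 * sum_k (alpha * A)^k."""
    n = len(adj)
    x = list(x0)
    for _ in range(steps):
        x = [x0[j] + alpha * sum(x[i] * adj[i][j] for i in range(n))
             for j in range(n)]
    return x


# Hypothetical directed graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
adj = [[0, 1, 1],
       [0, 0, 1],
       [1, 0, 0]]
x0 = [1 / 3, 1 / 3, 1 / 3]
lam1 = dominant_eigenvalue(adj)
alpha = 0.5 / lam1  # illustrative choice, below the threshold 1 / lambda_1
scores = non_conservative_diffusion(adj, x0, alpha)
print(lam1, scores, sum(scores))  # the total now exceeds the initial 1
```

Unlike the conservative case, the total weight grows with every step; it stays finite only because \(\alpha\) is below \(1/\lambda_1\).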
To assess the practical relevance of these theoretical insights, the authors conduct experiments on a real‑world online social network, Digg. They construct the follower graph, compute PageRank and Alpha‑Centrality scores for each user, and compare the rankings against two empirical influence measures derived from user activity: (1) the number of votes a user’s submitted stories receive, and (2) the cascade size of stories the user initiates. Using Kendall’s \(\tau\) rank correlation, Alpha‑Centrality consistently outperforms PageRank, especially for highly viral content where the diffusion is clearly non‑conservative.
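For readers unfamiliar with the comparison, the sketch below shows how a Kendall rank correlation between a centrality score and an empirical influence measure could be computed. It implements the simple tau-a variant with no tie correction, which may differ from the variant used in the paper. All numbers are made up for illustration and are not Digg data; `metric_a` and `metric_b` are hypothetical placeholders for any two centrality scores.

```python
from itertools import combinations


def kendall_tau(a, b):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Tied pairs count as neither concordant nor discordant, and no tie
    correction is applied.
    """
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        s = (a[i] - a[j]) * (b[i] - b[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / n_pairs


# Made-up values for five users (illustration only, not measured data).
empirical = [120, 45, 300, 10, 80]           # e.g. votes received
metric_a = [0.30, 0.18, 0.35, 0.05, 0.12]    # hypothetical centrality scores
metric_b = [0.20, 0.25, 0.15, 0.30, 0.10]    # hypothetical centrality scores

print(kendall_tau(empirical, metric_a))  # closer to 1: rankings mostly agree
print(kendall_tau(empirical, metric_b))  # negative: rankings mostly disagree
```

A metric whose \(\tau\) is closer to 1 orders users more like the empirical measure does, which is the sense in which one metric "outperforms" another here.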
Because exact computation of Alpha‑Centrality requires inverting \(I-\alpha A\), which can be prohibitive for massive graphs, the paper proposes a scalable approximation. The method truncates the Neumann series \(\sum_{k=0}^{K} (\alpha A)^k\) after a modest number of terms (typically \(K=10\) to \(20\)), exploiting the sparsity of \(A\) so that each term costs time linear in the number of edges. The authors prove an error bound that decays geometrically with \(K\) and demonstrate empirically that the approximate scores retain the ranking quality of the exact solution while reducing computation time by orders of magnitude.
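The truncation idea can be sketched as follows. This is a minimal illustration, assuming an edge-list representation and a hypothetical function name; it is not the authors' algorithm as published. Each of the \(K\) rounds touches every edge once, so the cost is \(O(K\,|E|)\), and no matrix is ever inverted.

```python
def approx_alpha_centrality(edges, n, x0, alpha, K=20):
    """Return x0 * sum_{k=0}^{K} (alpha * A)^k using sparse edge passes.

    edges is a list of (source, target) pairs for a graph on n nodes.
    """
    term = list(x0)    # current term x0 * (alpha * A)^k, starting at k = 0
    total = list(x0)   # running partial sum of the series
    for _ in range(K):
        nxt = [0.0] * n
        for i, j in edges:
            nxt[j] += alpha * term[i]   # push weight along edge i -> j
        term = nxt
        total = [t + s for t, s in zip(total, term)]
    return total


# Hypothetical directed graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
edges = [(0, 1), (0, 2), (1, 2), (2, 0)]
x0 = [1 / 3, 1 / 3, 1 / 3]
alpha = 0.3  # below 1 / lambda_1 (about 0.75 for this graph)

for K in (5, 10, 20):
    print(K, approx_alpha_centrality(edges, 3, x0, alpha, K))
```

On this toy graph the printed scores stabilize quickly as \(K\) grows, consistent with a truncation error that shrinks geometrically when \(\alpha \lambda_1 < 1\).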
In summary, the contributions are threefold: (1) a rigorous classification of diffusion processes and their mapping to well‑known centrality measures; (2) empirical evidence that a non‑conservative metric (Alpha‑Centrality) better captures influence in networks where information spreads via broadcasting; and (3) an efficient algorithm for computing Alpha‑Centrality on large‑scale graphs. The work underscores that the choice of network analysis tools should be guided by the underlying dynamics of the system under study. It opens avenues for future research on hybrid diffusion models, dynamic networks, and applications beyond social media such as financial contagion or power‑grid failures.