A Note on the PageRank of Undirected Graphs
PageRank is a widely used scoring function for networks in general and for the World Wide Web graph in particular. It is defined for directed graphs, but in some applications it is also applied to undirected graphs. The literature widely notes that the PageRank of an undirected graph is proportional to the degrees of its vertices. We prove that statement for a particular personalization vector in the definition of PageRank, and we also show that, in general, the PageRank of an undirected graph is not exactly proportional to the degree distribution of the graph: our main theorem gives upper and lower bounds on the L_1 norm of the difference between the PageRank and degree distribution vectors.
💡 Research Summary
The paper investigates the behavior of the PageRank algorithm when applied to undirected graphs, a setting that frequently appears in social, biological, and collaboration networks. While PageRank was originally defined for directed graphs, many practitioners simply run the same algorithm on undirected data and then assume that the resulting scores are proportional to the vertex degrees. The authors show that this proportionality holds only under a very specific choice of the personalization vector, and they provide a rigorous quantitative analysis of how far the PageRank vector can deviate from the normalized degree distribution in the general case.
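The dependence on the personalization vector can be checked numerically on a toy graph. The sketch below is illustrative only and is not taken from the paper: it assumes the common formulation π = α·πP + (1 − α)·v with damping factor α = 0.85, a four-vertex path graph chosen for convenience, and a hypothetical helper named `pagerank`. It takes the degree-proportional vector v = d / Σd as the "specific choice" of personalization vector, which is also an assumption about which vector is meant. It compares that choice with the uniform vector and measures the L_1 distance from the normalized degree distribution in each case.

```python
def pagerank(adj, v, alpha=0.85, iters=200):
    """Power iteration for pi = alpha * pi P + (1 - alpha) * v, with P = D^{-1} A.

    adj is a symmetric 0/1 adjacency matrix (list of lists) with no isolated
    vertices; v is the personalization vector (non-negative, sums to 1).
    alpha = 0.85 is an assumed, conventional damping factor.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    pi = list(v)
    for _ in range(iters):
        nxt = [(1 - alpha) * v[j] for j in range(n)]
        for i in range(n):
            for j in range(n):
                if adj[i][j]:
                    # Vertex i spreads its score evenly over its neighbours.
                    nxt[j] += alpha * pi[i] * adj[i][j] / deg[i]
        pi = nxt
    return pi


# Toy example (an assumption for illustration): path graph 0 - 1 - 2 - 3.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
deg = [sum(row) for row in adj]
degree_dist = [d / sum(deg) for d in deg]   # normalized degree distribution
uniform = [1 / len(adj)] * len(adj)         # uniform personalization vector

pr_degree = pagerank(adj, degree_dist)
pr_uniform = pagerank(adj, uniform)

# L_1 distance between each PageRank vector and the degree distribution.
l1_degree = sum(abs(p - q) for p, q in zip(pr_degree, degree_dist))
l1_uniform = sum(abs(p - q) for p, q in zip(pr_uniform, degree_dist))

print("degree-proportional v: L1 distance =", l1_degree)
print("uniform v:             L1 distance =", l1_uniform)
```

With the degree-proportional vector the distance vanishes up to floating-point error, because d/Σd is a fixed point of the iteration. With the uniform vector a clearly nonzero gap remains, which is the kind of deviation the paper's bounds quantify.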
Model and Notation
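Let \(G=(V,E)\) be a simple undirected graph with \(n=|V|\) vertices; we assume it has no isolated vertices, so that every degree is positive. The adjacency matrix \(A\) is symmetric, the degree vector is \(d\) with entries \(d_i\), and \(\mathbf{D}=\operatorname{diag}(d)\). The random‑walk transition matrix is \(P=\mathbf{D}^{-1}A\). It is row-stochastic but in general not symmetric: the symmetry of \(A\) gives only the detailed-balance relation \(d_i P_{ij} = d_j P_{ji}\) (both sides equal \(A_{ij}\)), so \(P^{\top}=P\) holds only when adjacent vertices always have equal degrees, as in a regular graph. The PageRank vector \(\pi\) is defined as the unique solution of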
\