PageRank Optimization by Edge Selection
The importance of a node in a directed graph can be measured by its PageRank. The PageRank of a node is used in a number of application contexts - including ranking websites - and can be interpreted as the average portion of time spent at the node by an infinite random walk. We consider the problem of maximizing the PageRank of a node by selecting some of the edges from a set of edges that are under our control. By applying results from Markov decision theory, we show that an optimal solution to this problem can be found in polynomial time. Our core solution results in a linear programming formulation, but we also provide an alternative greedy algorithm, a variant of policy iteration, which runs in polynomial time, as well. Finally, we show that, under the slight modification for which we are given mutually exclusive pairs of edges, the problem of PageRank optimization becomes NP-hard.
💡 Research Summary
The paper tackles a novel optimization problem: how to maximize the PageRank of a designated node by selectively activating a subset of the edges under the practitioner’s control. Standard PageRank assumes a static directed graph and measures node importance as the stationary distribution of a random walk with teleportation. In many real-world settings (web site administrators adding or removing hyperlinks, social-network managers curating follow relationships, transportation planners opening new routes), some edges can be chosen deliberately. The authors formalize this with a “controlled edge set” \(E_c\) and a “fixed edge set” \(E_f\). For each controllable edge \(e \in E_c\), a binary decision variable \(x_e \in \{0,1\}\) indicates whether the edge is present. The transition matrix of the underlying Markov chain is then determined by these variables, and the PageRank vector \(\pi\) satisfies the standard balance equation for the resulting chain.
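To make the objective concrete, here is a minimal Python sketch of the problem statement on a toy graph. It is not the paper's method: it enumerates every subset of the controlled edges, which takes exponential time, whereas the paper's linear programming and policy iteration approaches run in polynomial time. The function names `pagerank` and `best_selection`, the damping factor 0.85, uniform teleportation, and the uniform handling of dangling nodes are all assumptions made for illustration.

```python
from itertools import combinations


def pagerank(n, edges, damping=0.85, iters=200):
    """Approximate PageRank on nodes 0..n-1 by power iteration.

    Assumed form: pi = damping * pi * P + (1 - damping) / n, with P the
    row-normalized adjacency matrix of the selected edges.
    """
    out = [[] for _ in range(n)]
    for u, v in edges:
        out[u].append(v)
    pi = [1.0 / n] * n
    for _ in range(iters):
        nxt = [(1.0 - damping) / n] * n  # teleportation mass
        for u in range(n):
            if out[u]:
                share = damping * pi[u] / len(out[u])
                for v in out[u]:
                    nxt[v] += share
            else:
                # Dangling node: spread its mass uniformly (an assumption).
                for v in range(n):
                    nxt[v] += damping * pi[u] / n
        pi = nxt
    return pi


def best_selection(n, fixed, controlled, target):
    """Try every subset of controlled edges; feasible for toy sizes only."""
    best_score, best_subset = -1.0, ()
    for k in range(len(controlled) + 1):
        for subset in combinations(controlled, k):
            score = pagerank(n, list(fixed) + list(subset))[target]
            if score > best_score:
                best_score, best_subset = score, subset
    return best_score, best_subset


# Hypothetical instance: a fixed 3-cycle plus three optional edges.
fixed = [(0, 1), (1, 2), (2, 0)]       # plays the role of E_f
controlled = [(0, 2), (1, 0), (2, 1)]  # plays the role of E_c
score, chosen = best_selection(3, fixed, controlled, target=0)
print(chosen, round(score, 4))
```

In this instance the search keeps the edges that shorten the walk's return to node 0 and drops the edge that diverts flow away from it, which matches the interpretation of PageRank as the average portion of time the walk spends at the node.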