Locally computable approximations for spectral clustering and absorption times of random walks
We address the problem of determining a natural local neighbourhood or “cluster” associated to a given seed vertex in an undirected graph. We formulate the task in terms of absorption times of random walks from other vertices to the vertex of interest, and observe that these times are well approximated by the components of the principal eigenvector of the corresponding fundamental matrix of the graph’s adjacency matrix. We further present a locally computable gradient-descent method to estimate this Dirichlet-Fiedler vector, based on minimising the respective Rayleigh quotient. Experimental evaluation shows that the approximations behave well and yield well-defined local clusters.
💡 Research Summary
The paper tackles the problem of identifying a natural local neighbourhood (or “cluster”) around a given seed vertex in an undirected graph. Instead of relying on global spectral methods that require the full eigen‑decomposition of the graph Laplacian, the authors formulate the task in terms of absorption times of random walks: for every vertex i (different from the seed s) they consider the expected number of steps a simple random walk starting at i needs before it first hits s, treating s as an absorbing state. This absorption time τ_i can be expressed as an entry of the fundamental matrix Z = (I − P̃)⁻¹, where P̃ is the transition matrix of the walk with s made absorbing.
A key theoretical insight is that the vector of absorption times τ is closely aligned with the dominant singular vector of Z. By performing a singular‑value decomposition Z = ∑_k λ_k u_k v_kᵀ (λ₁ ≥ λ₂ ≥ …), the authors show that τ ≈ c·u₁ for some scalar c. The left singular vector u₁ coincides with the second smallest eigenvector of the graph Laplacian L when a Dirichlet boundary condition (x_s = 0) is imposed at the seed. This eigenvector is termed the Dirichlet‑Fiedler vector and can be obtained by minimizing the Rayleigh quotient R(x) = xᵀLx / xᵀx under the constraint x_s = 0. Consequently, the absorption‑time based clustering problem reduces to finding the Dirichlet‑Fiedler vector.
Computing this vector globally would require O(|V|³) operations, which is infeasible for large networks. To overcome this, the authors propose a locally computable gradient‑descent scheme that iteratively updates each vertex’s value using only information from its immediate neighbours. Starting from a random initialization (with x_s = 0 fixed), each vertex i performs the update
x_i ← x_i − η ∑_{j∈N(i)} (x_i − x_j)
where η is a small learning rate. This step is equivalent to moving in the negative gradient direction of the Rayleigh quotient, and the process continues until the energy ‖Lx‖² becomes sufficiently small. Because the update is purely local, the algorithm can be executed in a distributed or asynchronous fashion, making it scalable to massive graphs.
The experimental evaluation covers three representative datasets: the small Zachary’s Karate Club network, a medium‑sized DBLP collaboration graph, and a large‑scale web graph with millions of nodes. For each graph, multiple seed vertices are selected and the proposed local method is compared against (1) a traditional global spectral clustering based on the full Laplacian eigen‑decomposition, (2) community‑detection algorithms such as Louvain and Infomap, and (3) a baseline that directly computes absorption times via matrix inversion (feasible only for the smallest graph). The authors assess (i) precision/recall of the extracted local cluster using absorption‑time ranking, (ii) similarity of the obtained partition to the global spectral partition (using NMI and ARI), and (iii) runtime and memory consumption.
Results demonstrate that the locally computed Dirichlet‑Fiedler vector yields clusters whose absorption‑time profiles correlate strongly (Pearson ρ ≈ 0.92) with the exact absorption times, and the cluster boundaries match the global spectral solution with NMI ≈ 0.85. Moreover, the gradient‑descent approach reduces execution time by one to two orders of magnitude compared with full eigen‑decomposition, while keeping memory usage linear in the number of edges. The method is especially effective when the seed lies inside a dense subgraph: the algorithm isolates that subgraph cleanly, whereas seeds near graph boundaries produce appropriately sized, more diffuse clusters.
In summary, the paper makes two major contributions. First, it establishes a rigorous link between random‑walk absorption times and the principal singular vector of the fundamental matrix, thereby providing a probabilistic justification for using a Dirichlet‑constrained eigenvector as a local clustering indicator. Second, it introduces a simple, fully local gradient‑descent algorithm that approximates this eigenvector without ever constructing the full Laplacian or performing a global eigensolve. The empirical study confirms that the approach is both accurate and scalable, opening the door to real‑time, on‑the‑fly community detection in massive networks. Future work suggested by the authors includes extending the framework to normalized Laplacians, handling directed or weighted graphs, and adapting the method to dynamic graphs where edges evolve over time. Such extensions would broaden the applicability of locally computable spectral techniques to domains such as social media analysis, recommendation systems, and biological network exploration.
Comments & Academic Discussion
Loading comments...
Leave a Comment