Node embeddings map graph vertices into low-dimensional Euclidean spaces while preserving structural information. They are central to tasks such as node classification, link prediction, and signal reconstruction. A key goal is to design node embeddings whose dot products capture meaningful notions of node similarity induced by the graph. Graph kernels offer a principled way to define such similarities, but their direct computation is often prohibitive for large networks. Inspired by random feature methods for kernel approximation in Euclidean spaces, we introduce randomized spectral node embeddings whose dot products estimate a low-rank approximation of a given graph kernel. We provide theoretical and empirical results showing that our embeddings achieve more accurate kernel approximations than existing methods, particularly for spectrally localized kernels. These results demonstrate the effectiveness of randomized spectral constructions for scalable and principled graph representation learning.
Kernel machines are a class of machine learning algorithms designed to handle non-linear problems by implicitly mapping data into a high-dimensional feature space. Given a data space X, this is achieved through a positive semi-definite (psd) kernel Γ : X × X → R. Through the kernel trick, this kernel can be interpreted as a dot product in a (possibly infinite-dimensional) feature (Hilbert) space H, i.e., there exists a mapping ϕ : X → H such that Γ(x, y) = ⟨ϕ(x), ϕ(y)⟩_H. In this context, algorithms perform linear computations in this enriched space without explicitly computing the feature map ϕ, thereby enabling more expressive linear models.
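As a concrete illustration (ours, not from the original text), the degree-2 polynomial kernel on R^2 admits an explicit three-dimensional feature map, so the identity Γ(x, y) = ⟨ϕ(x), ϕ(y)⟩_H can be checked numerically; the minimal sketch below uses our own function names.

```python
import numpy as np

def poly_kernel(x, y):
    # Degree-2 polynomial kernel Γ(x, y) = (x·y)^2 on R^2.
    return float(np.dot(x, y)) ** 2

def feature_map(x):
    # Explicit map ϕ : R^2 -> R^3 such that Γ(x, y) = <ϕ(x), ϕ(y)>.
    return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2.0) * x[0] * x[1]])

x, y = np.array([1.0, 2.0]), np.array([3.0, -1.0])
assert np.isclose(poly_kernel(x, y), feature_map(x) @ feature_map(y))
```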
While kernels are often introduced for data in continuous domains such as X = R^d, many practical problems involve data supported on graphs, where the underlying relational structure between data samples plays a central role. It is therefore essential to define kernels that respect and exploit this structure. Graph kernels [1,2] provide such a way to quantify similarity between pairs of nodes. For a graph G with N nodes, such a kernel can be represented as an N × N kernel matrix, most often a function of the graph Laplacian. In practice, however, computing this matrix is often prohibitive, with a typical cost scaling as O(N^3), which limits the applicability of many existing graph kernel methods.
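To make this cost concrete, the sketch below (our example; the paper does not commit to this particular kernel) computes one classical Laplacian-based kernel, the heat kernel exp(−tL), through a dense eigendecomposition, which is exactly the O(N^3) step that becomes prohibitive for large N.

```python
import numpy as np

def heat_kernel(L, t=1.0):
    # Dense Laplacian-based graph kernel K = exp(-t L).
    # The full eigendecomposition below costs O(N^3) time and O(N^2) memory,
    # which is what limits direct graph kernel computation on large networks.
    lam, V = np.linalg.eigh(L)                      # L = V diag(lam) V^T
    return V @ np.diag(np.exp(-t * lam)) @ V.T
```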
In R^d, kernel methods scale poorly to large datasets, as computing and storing full kernel matrices is costly. To address this, Rahimi and Recht [3] proposed random (Fourier) features. These explicitly map data into a low-dimensional space via a randomized feature map ϕ : R^d → R^D so that, for specific kernels Γ such as the Gaussian or Laplacian kernels, Γ(x, y) = E[⟨ϕ(x), ϕ(y)⟩] ≈ ⟨ϕ(x), ϕ(y)⟩ for all x, y ∈ R^d. This allows fast linear learning on the transformed points, reducing training and evaluation time as well as memory usage. It is thus natural to ask whether graph kernel computations can be similarly accelerated using random features.
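For reference, here is a minimal sketch of random Fourier features for the Gaussian kernel exp(−‖x − y‖²/(2σ²)), following the construction of [3]; the parameter names and the numerical check at the end are ours.

```python
import numpy as np

def rff_map(X, D=2000, sigma=1.0, rng=np.random.default_rng(0)):
    # Random Fourier features: draw frequencies w ~ N(0, sigma^{-2} I) and
    # phases b ~ U[0, 2π]; then ϕ(x) = sqrt(2/D) cos(Wx + b) satisfies
    # E[<ϕ(x), ϕ(y)>] = exp(-||x - y||^2 / (2 sigma^2)).
    d = X.shape[1]
    W = rng.normal(scale=1.0 / sigma, size=(D, d))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W.T + b)

X = np.random.default_rng(1).normal(size=(100, 5))
K_exact = np.exp(-np.sum((X[:, None] - X[None]) ** 2, axis=-1) / 2.0)
Phi = rff_map(X)
print(np.abs(Phi @ Phi.T - K_exact).max())   # error decays as O(1/sqrt(D))
```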
We propose to use tools from Graph Signal Processing (GSP) [4] for randomized graph kernel approximation. GSP leverages the graph Laplacian and the derived graph Fourier transform to extend the notions of frequency, filtering, and smoothness to graph signals while respecting the graph structure. Specifically, we leverage the graph wavelet transform [5] applied to random input signals to construct embeddings that provide scalable approximations of graph kernels. Contributions: Building on random feature methods for kernel approximation in Euclidean spaces, we propose randomized spectral embeddings for graphs, where the dot product between embeddings estimates a low-rank approximation of a Laplacian-based graph kernel. We provide both theoretical and empirical evidence that these embeddings yield more accurate kernel approximations than existing approaches, especially when the kernel is spectrally localized. Finally, we discuss connections to randomized singular value decomposition (RSVD) [6], highlighting links between our method and classical numerical linear algebra techniques.
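To fix ideas, the following sketch illustrates the general principle behind such randomized spectral embeddings, under our own simplifying assumptions: the target kernel is written as a spectral filter g(L), its square-root filter is applied to i.i.d. Gaussian random signals, and the filtering is performed through an exact eigendecomposition. This is an illustration of the idea, not the authors' exact construction, which relies on graph wavelet transforms to avoid the eigendecomposition.

```python
import numpy as np

def random_spectral_embedding(L, g, D=200, rng=np.random.default_rng(0)):
    # Filter D i.i.d. Gaussian signals with sqrt(g)(L): with Psi = sqrt(g)(L) R
    # and R having N(0, 1/D) entries, E[Psi Psi^T] = g(L), so each row of Psi
    # is a D-dimensional randomized embedding of the corresponding node.
    # NOTE: the exact eigendecomposition is for illustration only; it costs
    # O(N^3) and would be replaced by fast (e.g., wavelet/polynomial) filtering.
    lam, V = np.linalg.eigh(L)
    sqrt_filter = V @ np.diag(np.sqrt(np.maximum(g(lam), 0.0))) @ V.T
    R = rng.normal(size=(L.shape[0], D)) / np.sqrt(D)
    return sqrt_filter @ R
```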
In [7], the graph wavelet transform of random signals is used to accelerate the computation of graph matrix functions. Their use case differs slightly from ours, as they approximate correlation matrices defined explicitly in terms of wavelets rather than general graph kernels. In [8], applying graph wavelet transforms to random signals yields embeddings that approximate the distance matrix of classical spectral clustering. We extend these results to general graph kernels.
Other works have considered random feature methods for graph kernel approximation, see e.g., [9,10]. Their random features follow a spatial design that relies on random walks to assess similarities between nodes and produces unbiased kernel approximations. We show that convergence to the target kernel is slow for bandlimited kernels, which are well localized in the spectral domain but spread out in the spatial domain.
3 Mathematical tools

3.A. Spectral graph theory: We consider an undirected weighted graph G = (V, E, w) made of N vertices (or nodes) V identified, up to a suitable ordering, with the N = |V| integers [N] := {1, . . . , N}. These nodes are connected by edges in E ⊆ V × V, with a total of E = |E| edges, and w : E → R_+ is a weight function assigning a positive weight to each edge. Any vector f ∈ R^N can be seen as a function V → R defined on each node of V. The weight matrix W ∈ R^{N×N} of the graph has entries W_ij equal to w(i, j) if (i, j) ∈ E, and to 0 otherwise; it is symmetric for undirected graphs. The degree matrix D ∈ R^{N×N} is diagonal with entries D_ii := Σ_{j=1}^{N} W_ij equal to the degrees of the nodes. Finally, the normalized graph Laplacian L is defined as L := I_N − D^{−1/2} W D^{−1/2}.
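These definitions translate directly into a few lines of NumPy; the toy graph below is our own example.

```python
import numpy as np

# Toy undirected weighted graph (our example): a weighted triangle plus a pendant node.
edges = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 0.5), (2, 3, 1.0)]
N = 4

W = np.zeros((N, N))
for i, j, w_ij in edges:                      # symmetric weight matrix
    W[i, j] = W[j, i] = w_ij

deg = W.sum(axis=1)                           # node degrees D_ii = sum_j W_ij
D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
L = np.eye(N) - D_inv_sqrt @ W @ D_inv_sqrt   # normalized graph Laplacian
```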
The graph Laplacian L plays a central role in GSP and spectral graph theory. Since L is symmetric, it factorizes as L = V Λ V^⊤, where V = (v_1, . . . , v_N) is an orthogonal matrix whose columns are the eigenvectors of L, and Λ = diag(λ_1, . . . , λ_N) collects the associated eigenvalues.
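As a self-contained illustration of this decomposition and of the graph Fourier transform it induces (the small path graph and the filter below are our own choices):

```python
import numpy as np

W = np.diag(np.ones(3), 1) + np.diag(np.ones(3), -1)   # 4-node path graph
deg = W.sum(axis=1)
L = np.eye(4) - W / np.sqrt(np.outer(deg, deg))        # normalized Laplacian

lam, V = np.linalg.eigh(L)                  # L = V diag(lam) V^T, V orthogonal
f = np.random.default_rng(0).normal(size=4) # a graph signal
f_hat = V.T @ f                             # graph Fourier transform of f
g = lambda x: np.exp(-5.0 * x)              # an example low-pass spectral filter
f_smooth = V @ (g(lam) * f_hat)             # spectral filtering: g(L) f = V g(Λ) V^T f
```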