Manifold Random Features
We present a new paradigm for creating random features to approximate bi-variate functions (in particular, kernels) defined on general manifolds. This new mechanism of Manifold Random Features (MRFs) leverages discretization of the manifold and the recently introduced technique of Graph Random Features (GRFs) to learn continuous fields on manifolds. Those fields are used to construct continuous approximation mechanisms that, in general scenarios, cannot otherwise be derived analytically. MRFs provide positive and bounded features, a key property for accurate, low-variance approximation. We show a deep asymptotic connection between GRFs, defined on discrete graph objects, and continuous random features used for regular kernels. As a by-product of our method, we re-discover a recently introduced mechanism of Gaussian kernel approximation applied in particular to improve linear-attention Transformers, considering simple random walks on graphs and by-passing the original complex mathematical computations. We complement our algorithm with a rigorous theoretical analysis and verify it in thorough experimental studies.
💡 Research Summary
The paper introduces Manifold Random Features (MRF), a novel framework for approximating bivariate functions—most notably positive‑definite kernels—defined on arbitrary Riemannian manifolds. Traditional random feature (RF) methods, such as those pioneered by Rahimi and Recht, rely on Fourier or other analytic expansions that are only available in Euclidean spaces. Extending these ideas to manifolds is non‑trivial because the Laplace–Beltrami operator’s eigenfunctions rarely admit closed‑form expressions, and spectral computations become cubic in the number of sampled points.
MRF circumvents these obstacles through a two‑stage pipeline. First, a target manifold (M) is discretized into a finite point set (V_N={x_1,\dots,x_N}) using a mesh, quasi‑uniform sampling, or volume‑measure draws. A weighted graph (G_N=(V_N,E_N,W_N)) is built by connecting each point to its (k) nearest neighbours with Gaussian‑type edge weights (W_{ij}= \exp(-|x_i-x_j|^2/\sigma^2)). This graph encodes the local geometry and serves as a combinatorial surrogate for the continuous Laplace–Beltrami operator.
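The discretization step above can be sketched in a few lines of numpy. This is an illustrative implementation, not the paper's code: the point sample, the choice of (k), and the bandwidth (\sigma) are hypothetical, and a real pipeline would use a proper nearest-neighbour index rather than a dense distance matrix.

```python
import numpy as np

def knn_gaussian_graph(points, k=8, sigma=0.5):
    """Build a weighted k-NN adjacency matrix with Gaussian edge weights:
    W_ij = exp(-||x_i - x_j||^2 / sigma^2) if j is among i's k nearest
    neighbours, 0 otherwise. A sketch of the manifold-discretization step.
    """
    n = len(points)
    # Dense pairwise squared Euclidean distances (fine for small N).
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    W = np.zeros((n, n))
    for i in range(n):
        # argsort[0] is the point itself (distance 0); take the next k.
        nbrs = np.argsort(d2[i])[1:k + 1]
        W[i, nbrs] = np.exp(-d2[i, nbrs] / sigma**2)
    # Symmetrize so the graph is undirected.
    return np.maximum(W, W.T)

# Example: quasi-uniform sample of N = 200 points on the unit sphere S^2.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)
W = knn_gaussian_graph(X, k=8, sigma=0.5)
```

The resulting weighted graph (G_N) is then handed to the GRF machinery as the combinatorial surrogate for the Laplace–Beltrami operator.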
Second, the authors apply Graph Random Features (GRF), a recent technique that constructs for each node (i) a “signature vector” (\phi_f(i)\in\mathbb{R}^N) via multiple random walks. During a walk the algorithm maintains a scalar “load” that is multiplied by the node degree, a halting probability, and the edge weight; the load is added to the current node’s entry after being modulated by a function (f) that encodes the coefficients (\alpha_k) of a graph kernel (K^\alpha(W)=\sum_{k=0}^\infty \alpha_k W^k). The expectation of the outer product of two independent signature matrices reproduces the kernel matrix, providing an unbiased low‑rank factorization.
To obtain continuous random features on the original manifold, the discrete signatures are treated as training data for a neural network (g_{\theta}(x,\omega)) that maps a pair of points ((x,\omega)) to a non-negative scalar. The loss minimizes the discrepancy between (g_{\theta}(x,\omega)) and the corresponding entry of the signature vector (\phi_f(x)), so that the learned network interpolates the discrete features into a continuous field over the whole manifold.
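The fitting stage can be sketched as a tiny regression problem. The paper's actual architecture and loss are not specified here, so everything below is a hypothetical minimal setup: an MSE loss, synthetic placeholder targets standing in for signature entries, and a softplus output layer to enforce the non-negativity of (g_{\theta}).

```python
import numpy as np

def softplus(z):
    return np.logaddexp(0.0, z)          # numerically stable log(1 + e^z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

rng = np.random.default_rng(0)
# Hypothetical training set: each input is a concatenated pair (x, omega),
# each target is the (non-negative) discrete signature entry to match.
X = rng.normal(size=(256, 6))            # x in R^3 stacked with omega in R^3
y = np.abs(rng.normal(size=(256, 1)))    # placeholder non-negative targets

# One hidden layer; softplus output guarantees g_theta(x, omega) >= 0.
W1 = rng.normal(size=(6, 32)) * 0.3; b1 = np.zeros(32)
W2 = rng.normal(size=(32, 1)) * 0.3; b2 = np.zeros(1)

lr, losses = 0.05, []
for _ in range(200):
    h = np.tanh(X @ W1 + b1)
    z = h @ W2 + b2
    pred = softplus(z)
    err = pred - y
    losses.append(float((err ** 2).mean()))
    # Manual backprop for the MSE loss through softplus and tanh.
    dz = 2 * err * sigmoid(z) / len(X)
    dW2 = h.T @ dz; db2 = dz.sum(0)
    dh = dz @ W2.T * (1 - h ** 2)
    dW1 = X.T @ dh; db1 = dh.sum(0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1
```

Once trained, evaluating the network at fresh ((x,\omega)) pairs yields continuous, non-negative random features at manifold points that were never part of the discretization.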