Multi-Domain Riemannian Graph Gluing for Building Graph Foundation Models

Multi-Domain Riemannian Graph Gluing for Building Graph Foundation Models
Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Multi-domain graph pre-training integrates knowledge from diverse domains to enhance performance in the target domains, which is crucial for building graph foundation models. Despite initial success, existing solutions often fall short of answering a fundamental question: how is knowledge integrated or transferred across domains? This theoretical limitation motivates us to rethink the consistency and transferability between model pre-training and domain adaptation. In this paper, we propose a fresh Riemannian geometry perspective, whose core idea is to merge any graph dataset into a unified, smooth Riemannian manifold, enabling a systematic understanding of knowledge integration and transfer. To achieve this, our key contribution is the theoretical establishment of neural manifold gluing, which first characterizes local geometry using an adaptive orthogonal frame and then “glues” the local pieces together into a coherent whole. Building on this theory, we present the GraphGlue framework, which supports batched pre-training with EMA prototyping and provides a transferability measure based on geometric consistence. Extensive experiments demonstrate its superior performance across diverse graph domains. Moreover, we empirically validated GraphGlue’s geometric scaling law, showing that larger quantities of datasets improve model transferability by producing a smoother manifold. Codes are available at https://github.com/RiemannGraph/GraphGlue.


💡 Research Summary

The paper tackles a fundamental gap in multi‑domain graph pre‑training: the lack of a principled understanding of how knowledge is integrated and transferred across heterogeneous graph domains. To address this, the authors introduce a novel differential‑geometric framework called Neural Manifold Gluing, which treats each graph dataset as a local Riemannian manifold and then “glues” these local pieces into a single smooth global manifold.

Local geometry learning starts with a (k, M)‑sparse perturbation that adds a small set of auxiliary nodes to a graph and connects them to the most relevant existing nodes via an attention‑weighted edge function. The perturbation, together with a graph encoder f_GNN, yields a collection of tangent vectors. By applying QR decomposition with length recovery, an Adaptive Orthogonal Frame (AOF) is constructed, providing an orthonormal basis for the tangent space at each data point. The local metric tensor G_i is then expressed as W_iᵀW_i, where W_i stacks the basis vectors; this yields explicit inner products and distances for each domain.

Gluing the local pieces proceeds in three stages. First, edge tangent translation defines a linear map P(i,j) = G_j^{-½} G_i^{½} G_j^{-½} for every edge (i, j). The authors prove (Theorem 4.5) that this map is the optimal isometry minimizing the Frobenius norm of G_j − PᵀG_i P, guaranteeing metric compatibility along edges. Second, they show (Theorem 4.6) that these edge maps induce a unique global continuous metric G on the union of all local neighborhoods. Third, because higher‑order structures (triangles, cycles) can introduce holonomy—non‑trivial rotations of tangent vectors after traversing a closed loop—the authors introduce a holonomy map and a corresponding loss. When the holonomy of every triangle is trivial, the glued manifold becomes globally smooth. Finally, the Ricci curvature of the global manifold is regularized to control volume distortion, effectively “smoothing” the manifold.

Building on this theory, the GraphGlue framework implements a practical pre‑training and adaptation pipeline. During pre‑training, an Exponential Moving Average (EMA) prototype maintains a running representation for each domain, enabling batched training on large graphs while preserving domain‑specific geometry. After each batch, the local AOFs and edge translations are applied to merge the batch into the global manifold. In the adaptation phase, learnable prompts and a Riemannian Mixture‑of‑Experts (MoE) are used to attach a target domain to the pre‑trained manifold, ensuring geometric consistency.

A key by‑product of the manifold formulation is the Geometric Transfer Metric (GTM), defined as the sum of metric incompatibility and holonomy loss between a source and a target domain. GTM quantifies transfer difficulty: higher GTM predicts lower downstream performance. Empirically, GTM correlates strongly (Pearson ≈ 0.78) with actual transfer results, outperforming heuristic measures such as domain similarity based on node feature distributions.

Experiments span five diverse graph benchmarks—ogbn‑arxiv (citation network), Reddit (social interaction), FB15k‑237 (knowledge graph), PROTEINS (bio‑chemical), and HIV (drug‑response). The authors conduct cross‑domain transfer experiments where a model pre‑trained on a subset of domains is fine‑tuned on the remaining ones. GraphGlue consistently outperforms prior multi‑domain methods (e.g., GraphCodeBERT, MotifGNN, GraphCodebook) by 3–5 % absolute gain in accuracy or F1 score. Notably, the gains are larger for text‑free graphs, confirming that the approach does not rely on auxiliary textual attributes.

A further contribution is the geometric scaling law: as the number of pre‑training datasets K increases, the average Ricci curvature of the resulting manifold decreases, making the manifold smoother and improving transferability. The authors demonstrate that doubling K yields an average 1.2 % boost in downstream performance, and the curvature reduction follows a predictable power‑law decay.

In summary, the paper delivers: (1) a rigorous differential‑geometric theory for multi‑domain graph integration, (2) a concrete algorithmic framework (GraphGlue) that operationalizes the theory for large‑scale pre‑training, (3) a principled transfer difficulty metric (GTM), and (4) extensive empirical validation across heterogeneous graph domains. By unifying disparate graph datasets on a smooth Riemannian manifold, the work opens a new avenue for building truly generalizable graph foundation models.


Comments & Academic Discussion

Loading comments...

Leave a Comment