A Brain Graph Foundation Model: Pre-Training and Prompt-Tuning across Broad Atlases and Disorders

Notice: This research summary and analysis were automatically generated using AI. For full accuracy, please refer to the Original ArXiv Source.

As large language models (LLMs) continue to revolutionize AI research, there is a growing interest in building large-scale brain foundation models to advance neuroscience. While most existing brain foundation models are pre-trained on time-series signals or connectome features, we propose a novel graph-based pre-training paradigm for constructing a brain graph foundation model. In this paper, we introduce the Brain Graph Foundation Model, termed BrainGFM, a unified framework that leverages graph contrastive learning and graph masked autoencoders for large-scale fMRI-based pre-training. BrainGFM is pre-trained on a diverse mixture of brain atlases with varying parcellations, significantly expanding the pre-training corpus and enhancing the model’s ability to generalize across heterogeneous fMRI-derived brain representations. To support efficient and versatile downstream transfer, we integrate both graph prompts and language prompts into the model design, enabling BrainGFM to flexibly adapt to a wide range of atlases, neurological and psychiatric disorders, and task settings. Furthermore, we employ meta-learning to optimize the graph prompts, facilitating strong generalization to previously unseen disorders under both few-shot and zero-shot learning conditions via language-guided prompting. BrainGFM is pre-trained on 27 neuroimaging datasets spanning 25 common neurological and psychiatric disorders, encompassing 2 types of brain atlases (functional and anatomical) across 8 widely-used parcellations, and covering over 25,000 subjects, 60,000 fMRI scans, and a total of 400,000 graph samples aggregated across all atlases and parcellations.


💡 Research Summary

The paper introduces BrainGFM, the first graph‑based foundation model for functional MRI (fMRI) data. Unlike prior brain foundation models that rely on raw time‑series or region‑of‑interest (ROI) functional connectivity matrices, BrainGFM converts each fMRI scan into a brain graph where nodes correspond to ROIs and edges encode pairwise Pearson correlations. To address data heterogeneity, the authors aggregate 27 public fMRI datasets covering 25 neurological and psychiatric disorders, yielding over 25,000 subjects and 60,000 scans. Crucially, every scan is processed with eight different atlases/parcellations (Schaefer‑100/200/300, AAL‑116/3v1, SHEN‑268, Power‑264, Gordon‑333), expanding the pre‑training corpus eight‑fold and providing complementary spatial‑functional representations.
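The graph construction described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function name, the use of `numpy.corrcoef`, and the edge threshold are all illustrative assumptions; the paper does not specify how (or whether) correlations are thresholded.

```python
import numpy as np

def build_brain_graph(roi_timeseries, threshold=0.3):
    """Build a brain graph from an ROI time-series matrix.

    roi_timeseries: array of shape (n_rois, n_timepoints), one row per
    region of a parcellation (e.g. 100 rows for Schaefer-100).
    Returns (adjacency, corr): a binary adjacency matrix and the pairwise
    Pearson correlation matrix used as edge weights.
    """
    # Pairwise Pearson correlations between ROI time series.
    corr = np.corrcoef(roi_timeseries)
    np.fill_diagonal(corr, 0.0)  # no self-loops
    # Keep only sufficiently strong connections (threshold is illustrative).
    adjacency = (np.abs(corr) > threshold).astype(np.float32)
    return adjacency, corr

# Toy example: 100 ROIs, 200 time points of synthetic "BOLD" signal.
rng = np.random.default_rng(0)
ts = rng.standard_normal((100, 200))
adj, corr = build_brain_graph(ts)
```

Because each scan is parcellated with eight different atlases, the same subject yields eight such graphs with different node counts (100 to 333 ROIs), which is what expands the corpus eight-fold.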

The backbone is a Graph Transformer equipped with Random Walk Structural Encoding (RWSE), which captures relative node positions without the heavy computation of Laplacian eigenvectors. Special tokens are also inserted into the node sequence.
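The RWSE positional encoding mentioned above can be sketched as follows: for each node, collect the probability that a k-step random walk returns to that node, for k = 1..K, giving a K-dimensional structural feature per node. This is a generic sketch of the standard RWSE construction, assuming an unweighted adjacency matrix; the number of steps and any normalization details in BrainGFM are not specified here.

```python
import numpy as np

def rwse(adjacency, num_steps=8):
    """Random Walk Structural Encoding.

    For each node i, returns [P^1[i,i], ..., P^K[i,i]], the k-step
    return probabilities of a simple random walk, where P = D^{-1} A.
    """
    deg = adjacency.sum(axis=1, keepdims=True)
    P = adjacency / np.maximum(deg, 1e-12)  # row-stochastic transition matrix
    n = adjacency.shape[0]
    enc = np.empty((n, num_steps))
    Pk = P.copy()
    for k in range(num_steps):
        enc[:, k] = np.diag(Pk)  # (k+1)-step return probabilities
        Pk = Pk @ P
    return enc

# Toy check on a 3-node triangle graph (no self-loops).
A = np.ones((3, 3)) - np.eye(3)
enc = rwse(A, num_steps=4)
```

On the triangle, one-step return probabilities are 0 (no self-loops) and two-step return probabilities are 0.5, which is the kind of local structural signature RWSE feeds to the Transformer in place of eigenvector-based positional encodings.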

