BrainExplore: Large-Scale Discovery of Interpretable Visual Representations in the Human Brain
Understanding how the human brain represents visual concepts, and in which brain regions these representations are encoded, remains a long-standing challenge. Decades of work have advanced our understanding of visual representations, yet brain signals remain large and complex, and the space of possible visual concepts is vast. As a result, most studies remain small-scale, rely on manual inspection, focus on specific regions and properties, and rarely include systematic validation. We present a large-scale, automated framework for discovering and explaining visual representations across the human cortex. Our method comprises two main stages. First, we discover candidate interpretable patterns in fMRI activity through unsupervised, data-driven decomposition methods. Next, we explain each pattern by identifying the set of natural images that most strongly elicit it and generating a natural-language description of their shared visual meaning. To scale this process, we introduce an automated pipeline that tests multiple candidate explanations, assigns quantitative reliability scores, and selects the most consistent description for each voxel pattern. Our framework reveals thousands of interpretable patterns spanning many distinct visual concepts, including fine-grained representations previously unreported.
💡 Research Summary
BrainExplore introduces a large‑scale, fully automated framework for discovering and interpreting visual representations across the human cortex using fMRI data. The authors first decompose voxel‑wise activity within each predefined visual ROI (e.g., EBA, PPA, V4) with four unsupervised methods: Principal Component Analysis (PCA), Non‑negative Matrix Factorization (NMF), Independent Component Analysis (ICA), and a Sparse Autoencoder (SAE). Crucially, these decompositions are learned solely from brain signals—no image features or semantic labels are injected—ensuring that the resulting components reflect intrinsic neural structure.
To overcome the limited size of measured fMRI datasets (≈10 k images per subject in the Natural Scenes Dataset), the authors augment the training set with ≈120 k synthetic image‑fMRI pairs generated by an existing image‑to‑fMRI encoder (Beliy et al.). This augmentation dramatically improves component quality, especially for the high‑capacity SAE, which benefits from a richer sampling of the stimulus space while maintaining strong sparsity constraints.
For each component, the pipeline retrieves the top‑N images whose responses (measured or predicted) yield the strongest component coefficients. These image sets are then fed into a vision‑language model (VLM) coupled with a large language model (LLM), which proposes multiple natural‑language candidate labels describing the shared visual theme. An alignment score, combining image‑label similarity, label consistency across images, and reconstruction fidelity, is computed for each candidate, and the highest‑scoring label is assigned to the component. This scoring enables systematic ranking of all components across ROIs and decomposition methods, letting the authors isolate the most interpretable patterns without manual inspection.
The results are striking. While traditional PCA, NMF, and ICA primarily surface low‑level features (color, shape) or broad semantic categories (faces, places), the SAE uncovers fine‑grained, multi‑concept representations such as “hands holding objects,” “bent knees,” “open‑mouth actions,” and nuanced indoor/outdoor scenes (e.g., “kitchen vs. street”). Thousands of components receive high alignment scores, many of which correspond to previously unreported visual concepts. Importantly, all interpretations are validated on held‑out measured fMRI data, demonstrating that the discovered patterns generalize beyond the training set.
The authors also demonstrate a bidirectional use of the alignment scores: given a hypothesized concept (e.g., “water”), the framework can retrieve the brain component with the strongest evidence for that concept, effectively linking semantic hypotheses to neural substrates.
Limitations include reliance on the accuracy of the image‑to‑fMRI predictor—synthetic responses may introduce bias—and dependence on a pre‑constructed concept dictionary, which may miss truly novel neural representations. Future work could integrate human expert validation, expand the concept lexicon using open‑ended language models, and apply the pipeline to other sensory modalities or clinical populations.
Overall, BrainExplore provides a reproducible benchmark (code, concept dictionary, and large‑scale image‑fMRI‑explanation dataset) and a methodological blueprint for scaling interpretability from artificial neural networks to the human brain, opening new avenues for neuroscience, brain‑computer interfaces, and interdisciplinary AI research.