
📝 Original Info

  • Title:
  • ArXiv ID: 2512.19928
  • Date:
  • Authors: Unknown

📝 Abstract

Accurate registration of brain MRI scans is fundamental for cross-subject analysis in neuroscientific studies. This involves aligning both the cortical surface of the brain and the interior volume. Traditional methods treat volumetric and surface-based registration separately, which often leads to inconsistencies that limit downstream analyses. We propose a deep learning framework, NeurAlign, that registers 3D brain MRI images by jointly aligning both cortical and subcortical regions through a unified volume-and-surface-based representation. Our approach leverages an intermediate spherical coordinate space to bridge anatomical surface topology with volumetric anatomy, enabling consistent and anatomically accurate alignment. By integrating spherical registration into the learning, our method ensures geometric coherence between volume and surface domains. In a series of experiments on both in-domain and out-of-domain datasets, our method consistently outperforms both classical and machine learning-based registration methods, improving the Dice score by up to 7 points while maintaining regular deformation fields. Additionally, it is orders of magnitude faster than the standard method for this task, and is simpler to use because it requires no additional inputs beyond an MRI scan. With its superior accuracy, fast inference, and ease of use, NeurAlign sets a new standard for joint cortical and subcortical registration.

📄 Full Content

Accurate alignment of both cortical and subcortical regions is essential for comprehensive and consistent whole-brain analysis in neuroimaging studies. For example, functional MRI (fMRI) measures brain activity by detecting blood oxygenation changes, with a key goal of localizing functional regions in the cerebral cortex. Mapping activation patterns to brain structure reveals functional organization and is essential for quantifying how neurodegenerative diseases affect functions such as cognition (Fischl et al., 2008; Thompson et al., 2007; Ghosh et al., 2010).

Traditional deformable volumetric registration methods compute a three-dimensional (3D) displacement field to align brain images by maximizing intensity similarity (Avants et al., 2008;Balakrishnan et al., 2019;Rueckert et al., 1999). While effective for aligning subcortical structures and global anatomy, volumetric deformable registration often fails in the cortex. The cortex is a thin, highly curved surface with significant inter-subject variability in folding patterns that is difficult to align in Euclidean space (Fischl et al., 1999). Accurate alignment is critical, since the cortex encodes many of the brain’s high-level functions such as memory and language, and serves as a primary target for structural and functional neuroimaging studies (Fischl, 2012). Surface-based methods address this challenge by projecting the cortex onto a sphere and aligning the cortex using pre-computed anatomical features such as sulcal depth and/or curvature. This facilitates registration by providing a common geometric space to preserve the topology of the cortex (Fischl et al., 1999;Yeo et al., 2009).

Using fundamentally different representations for surface- and volume-based registration forces neuroscience researchers to tackle two disjoint problems and combine solutions ad hoc (Tucholka et al., 2012). This typically involves solving an elastic partial differential equation to propagate the surface registration to the interior brain. However, interpolating the surface registration to the interior is not guaranteed to be consistent with the original volumetric registration. This sequential approach thus prevents computing a single coherent registration that simultaneously satisfies both cortical and subcortical alignment objectives. This introduces errors, undermining whole-brain analyses by potentially misaligning anatomically and functionally connected areas (Joshi et al., 2009; Postelnicu et al., 2008; Ahmad et al., 2019).

In this work, we propose NeurAlign, a machine learning framework for unified registration of cortical and subcortical structures. NeurAlign simultaneously trains a volumetric registration network and a spherical registration network using a shared framework that combines both domains. Unlike existing approaches, our model explicitly couples the two through a loss function that encourages the volumetric deformation field on the cortical ribbon to match the spherical registration field. This promotes topologically correct and geometrically consistent alignment across cortical and subcortical regions. Crucially, because our method performs registration in a single forward pass, it enables efficient and scalable whole-brain registration for large population studies. Further, NeurAlign requires neither cortical meshes nor segmentations at inference, avoiding the computationally intensive preprocessing required by other methods. NeurAlign achieves significantly improved cortical and subcortical alignment compared to the standard joint registration framework, CVS (Postelnicu et al., 2008), while reducing computation time by several orders of magnitude. In addition, NeurAlign substantially outperforms state-of-the-art volumetric machine learning-based registration approaches. To summarize our contributions:

• We propose a unified machine learning model that performs consistent registration of both cortical and subcortical structures using coupled spherical and volumetric networks.
• We derive a cortical consistency loss that explicitly encourages agreement between the volumetric deformation field in the cortex and the spherical registration field, promoting anatomically coherent alignment across the entire brain.
• Our proposed method leverages cortical meshes during training, but does not require them during inference, enabling accessible, rapid, and accurate registration.
• We validate our method on four clinical neuroimaging datasets, showing substantially improved and rapid cortical and subcortical alignment compared to standard baselines.

Our code and trained model weights are available at https://github.com/mabulnaga/neuralign.

Volumetric Registration. Deformable registration estimates a dense spatial mapping between a pair of images. Classical methods (Rueckert et al., 1999;Ashburner, 2007;Rohr et al., 2001;Modat et al., 2010) solve an optimization problem for each pair of images. This balances an image-similarity objective, often mean-squared error (MSE) or normalized cross correlation (NCC) (Avants et al., 2008), between the warped image and the fixed image, with a regularization objective that encourages a smooth and plausible solution. Methods differ in their optimization strategy, the choice of the objective function, and in how they parameterize the deformation field.

Learning-based methods fit the parameters of a neural network to a training dataset. The network learns a nonlinear function that maps an input image pair to an output transform, generally enabling much faster inference than classical methods. Supervised approaches (Sokooti et al., 2017;Rohé et al., 2017;Yang et al., 2017) reproduce known simulated deformations or fields estimated by classical methods, whereas unsupervised approaches (De Vos et al., 2017;Balakrishnan et al., 2019;Mok et al., 2023;Grzech et al., 2022;Abulnaga et al., 2025;Hoffmann et al., 2021) optimize an image-similarity loss term and a regularization loss term at training, analogous to classical registration.

Both optimization-based and machine learning-based volumetric approaches are highly effective at aligning subcortical structures. However, they struggle with cortical registration, since the optimization landscape in Euclidean space often leads to suboptimal local minima. This can lead to implausible solutions that, for example, move voxels from one gyrus to the next instead of following the cortical fold. Further, relying only on image-based features cannot align the functional regions of the cortex, so geometric features must be employed from geometric representations (Fischl et al., 1999;Ségonne et al., 2007).

Spherical Registration. Cortical alignment in the spherical domain is the standard approach in neuroscientific studies (Fischl, 2012; Ghosh et al., 2010; Tucholka et al., 2012). Mapping the cortex to the sphere simplifies the optimization, since the registration is guaranteed to preserve topology. The registration is driven by geometric descriptors that measure the cortical shape and folding patterns (Fischl et al., 1999; Fischl, 2012; Yeo et al., 2009; Conroy et al., 2013; Zhao et al., 2019). Similar to volumetric registration, spherical registration is posed as an optimization problem that balances a geometric similarity objective with a regularization objective. Many of these methods require solving an optimization problem for each pair of images. Recent methods use deep learning to improve inference efficiency (Li et al., 2024; Zhao et al., 2019; Cheng et al., 2020). The spherical mesh is parameterized to the 2D plane using a stereographic projection. The registration is performed in 2D space using convolutional neural networks (Zhao et al., 2019). Icosahedral CNNs (Zhao et al., 2021) learn directly on the sphere but require a rigid tessellation that is not produced natively by standard neuroimaging pipelines. These methods have achieved accuracy comparable to classical techniques with a significant improvement in speed. We build on this kind of parameterization to construct a spherical registration network and jointly model it with a volumetric one.

Surface Mesh Registration. Surface registration aligns two meshes or point clouds in 3D Euclidean space. Classical approaches typically frame the task as a deformation problem, aiming to align two shapes by estimating a rigid or non-rigid transformation that minimizes a distance metric (Besl & McKay, 1992; Rusinkiewicz & Levoy, 2001; Sorkine & Alexa, 2007; Weischedel, 2012; Abulnaga et al., 2023). An alternative paradigm treats surface registration as a shape correspondence problem, focusing on establishing point-to-point correspondences based on intrinsic geometry (Deng et al., 2022). Functional maps (Ovsjanikov et al., 2012) and extensions (Sharp et al., 2022; Donati et al., 2020; Cao et al., 2023) have emerged as a powerful representation for such correspondences, enabling compact linear descriptions of dense mappings between shapes. However, these methods, developed for graphics applications, target moderate-complexity meshes and scale poorly to high-curvature, high-resolution cortical surfaces, despite some efforts toward memory-efficient variants (Magnet & Ovsjanikov, 2024). Furthermore, because of the high curvature and topological consistency constraints of cortical surfaces, Euclidean 3D embeddings become unstable and less informative. Finally, these methods are unsuitable for registering volumetric structures as they are only defined for 2-dimensional surfaces. These reasons further motivate the use of spherical registration to align cortical surfaces.

Combined Volume and Surface Registration. Existing neuroscientific studies treat volumetric and cortical registration as two separate problems, often combining their outputs through ad hoc procedures. The Combined Volume and Surface framework (CVS) (Postelnicu et al., 2008; Zöllei et al., 2010) is the standard method for jointly aligning cortical and subcortical anatomy in adult brains. CVS first aligns the cortical surface, then propagates the resulting deformation into the volume using a nonlinear elastic model, followed by intensity-based refinement. While effective, this sequential formulation decouples the surface and volumetric objectives, can introduce inconsistencies near the cortical-subcortical boundary, and requires surface meshes and long runtimes. Related work has also used cortical surface registration to supervise volumetric registration (Ahmad et al., 2019), though focused on the infant-brain domain. A strength of these variational approaches is that they offer a principled way to propagate surface constraints into the volume using well-understood physical models. However, they remain impractical for large-scale datasets, as they require several hours of computation per pair and depend on extracting cortical surfaces and segmentation labelmaps.

To address these limitations, we introduce NeurAlign, which incorporates the established idea of surface-guided alignment into a single, unified learning-based diffeomorphic framework. NeurAlign jointly aligns cortical and subcortical structures through a single forward pass, requires only 3D MRI at inference time (without surface meshes), and provides the speed and simplicity needed for high-throughput neuroimaging studies.

We develop NeurAlign, a method for jointly aligning cortical and subcortical structures in 3D brain MRI. We jointly optimize a volumetric and a spherical deformation field and propose a soft constraint to encourage consistency between the two registration fields. We first propose the continuous formulation of the problem before describing the discrete case.

Let $M_1, M_2 \subset \mathbb{R}^3$ be bounded brain volumes, with associated smooth cortical surfaces $\partial M_1, \partial M_2$. We seek a bijective map $\varphi : M_1 \to M_2$ that aligns both the cortical and subcortical structures. Each volume is equipped with an intensity sampling function $I_i : \mathbb{R}^3 \to \mathbb{R}$. The surfaces are also equipped with $d$ geometric descriptor functions $g_i : \partial M_i \to \mathbb{R}^d$, such as sulcal depth or mean curvature.

Finding φ amounts to finding a solution to the following optimization problem,

with $\mathcal{P}$ denoting the feasible set of maps, enforcing for instance $\varphi(\partial M_1) \subset \partial M_2$.

Here, $f_I$ measures image similarity, $f_S$ compares surface descriptors, $dV$ and $dS$ denote the volume and surface measures, and $R[\cdot]$ denotes a set of regularization functionals.
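For orientation, one form of Eq. (1) that is consistent with the symbols defined here can be sketched as below. This is a reconstruction under assumptions, not necessarily the authors' exact expression: the pairing of arguments inside $f_I$ and $f_S$ and the absence of explicit weights between the terms are both assumed.

```latex
\min_{\varphi \in \mathcal{P}} \;
  \int_{M_1} f_I\big(I_1(x),\, I_2(\varphi(x))\big)\, dV
  \;+\; \int_{\partial M_1} f_S\big(g_1(x),\, g_2(\varphi(x))\big)\, dS
  \;+\; R[\varphi]
```

The first integral drives intensity matching over the volume, the second drives descriptor matching on the cortical surface, and the feasible set carries the boundary-to-boundary requirement.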

Solving Eq. (1) is difficult: the joint matching of intensities and geometric features has a nonconvex objective with degenerate solutions. Enforcing $\varphi(\partial M_1) \subseteq \partial M_2$ is especially challenging as the cortex is thin, highly folded, and has variable structure across subjects. Fine-grained boundary correspondence is poorly captured in a Euclidean volumetric grid, and optimizers often only align subsets of folds while blurring or misaligning others (Ségonne et al., 2007). Finally, finding a unified representation that captures both the volumetric image and cortical surface is difficult.

To address this, we use a joint representation, where cortical alignment is performed on the sphere, $S^2$, using topology-preserving maps and volumetric registration is performed in 3D. By enforcing consistency between the two, this learning-based formulation efficiently registers both the cortical and subcortical structures.

We revise the optimization problem in Eq. (1) by introducing a spherical intermediate domain to help enforce the constraint set $\mathcal{P}$. Let $\tau_i : \partial M_i \to S^2$ denote the (fixed) invertible spherical maps obtained via cortical inflation (Fischl et al., 1999; Hoopes et al., 2022) for each surface. The cortical alignment problem then computes a map $\psi : S^2 \to S^2$ that displaces spherical coordinates $(\theta, \phi)$ to align geometric descriptors. By optimizing on the sphere, we guarantee satisfaction of the constraint $\mathcal{P}$. The revised optimization problem is then:

(2)

We decouple this optimization into two separate optimization problems: volumetric registration of subcortical regions using $\varphi$, and cortical registration on the sphere using $\psi$. We introduce a constraint set $Q$ to encourage a consistent mapping below.

Cortical Alignment Consistency Constraint. Optimizing Eq. (2) independently can estimate maps $\varphi$ and $\psi$ that map the cortex to different locations in the volume. We therefore require consistency in Euclidean space by composing maps to the sphere and back:

To keep the problem tractable, we soften this constraint using a coupling energy that encourages consistent alignment:

where $f$ measures dissimilarity in the mapping by the squared error. We replace the hard constraint $Q$ with the coupling energy (Eq. (3)) to achieve a consistent mapping that accurately aligns both cortical and subcortical structures. In the continuous setting, this constraint acts only on a set of Lebesgue measure zero in $\mathbb{R}^3$ (the cortical surface $\partial M_1$) and is therefore weak. In practice, however, our discrete implementation avoids this issue: the consistency loss samples the volumetric deformation at mesh vertex locations via trilinear interpolation, distributing gradients to neighboring voxels. We also impose smoothness priors on the map, allowing the surface-derived updates to naturally propagate throughout the volume.
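The hard constraint and its softened counterpart described in this paragraph can be sketched in LaTeX as follows. The composition follows the text ("composing maps to the sphere and back"). The name $E_{\mathrm{couple}}$ and the exact integration domain are assumptions made for illustration.

```latex
% Hard constraint Q: on the cortex, the volumetric map agrees with the
% route through the sphere (tau_1, then psi, then the inverse of tau_2).
\varphi(x) \;=\; \tau_2^{-1}\big(\psi(\tau_1(x))\big)
  \qquad \forall\, x \in \partial M_1

% Soft coupling energy (cf. Eq. (3)); f is the squared error.
E_{\mathrm{couple}}(\varphi, \psi) \;=\;
  \int_{\partial M_1}
    f\Big(\varphi(x),\; \tau_2^{-1}\big(\psi(\tau_1(x))\big)\Big)\, dS
```

Written this way, the energy is zero exactly when the constraint holds everywhere on $\partial M_1$, which is why it can stand in for the hard constraint.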

The remainder of this section focuses on the discretization and use of neural networks to model the registration problem.

We discretize the optimization problem in Eq. (2) and use neural networks to parameterize the volumetric and spherical maps. Specifically, we formulate an unsupervised learning problem where we jointly learn 3D and 2D U-Net convolutional neural networks (CNNs) (Ronneberger et al., 2015).

Given an image and a surface, the networks output the 3D and 2D deformation fields φ, ψ, respectively. We propose a consistency loss to obtain an accurate cortical surface registration in the volume. Figure 1 presents an overview of our framework.

Notation. Let $I_1, I_2 : \mathbb{R}^3 \to \mathbb{R}$ denote two 3D brain MRI scans, where $I_i(x)$ gives the intensity at spatial location $x$. Each image used in training the networks also has an associated triangular surface mesh $C_1, C_2 \subset \mathbb{R}^3$ representing the cortical surface extracted from the respective anatomy. Vertex $j$ of mesh $i$, $v^i_j$, is a 3D coordinate in the image space of $I_i$. Each cortical surface mesh has an associated mesh in the spherical domain denoted $S_1, S_2$, where $S_1 = \tau_1(C_1)$, with the same triangle connectivity. The vertices $v$ are mapped to the sphere by a standard spherical inflation procedure and have an associated $S^2$ representation (Fischl et al., 1999).

Volumetric Registration Network. We employ a neural network $F_v(I_1, I_2; \omega_v) = \varphi$ to predict $\varphi$ as a function of the input images, where $F_v$ is parameterized by $\omega_v$. A core 3D convolutional network first outputs a stationary velocity field (SVF), which we then integrate to obtain a diffeomorphic displacement field $\varphi$ (Ashburner, 2007; Dalca et al., 2019a). In the deformation field $\varphi : \mathbb{R}^3 \to \mathbb{R}^3$, each voxel in $\mathbb{R}^3$ is assigned a displacement vector $u$: $\varphi(x) = x + u(x)$.
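To make the SVF integration step concrete, here is a minimal 1D sketch of scaling and squaring, a common way to integrate a stationary velocity field. It is an illustration only: the function name `integrate_svf_1d`, the number of steps, and the toy velocity field are assumptions, and the actual network operates on 3D fields.

```python
import numpy as np

def integrate_svf_1d(v, x, n_steps=7):
    """Integrate a stationary velocity field v(x) by scaling and squaring.

    Returns the displacement u such that phi(x) = x + u(x).
    """
    u = v / (2 ** n_steps)              # scale: start from a very small displacement
    for _ in range(n_steps):            # square: compose the current map with itself
        u = u + np.interp(x + u, x, u)  # u(x) + u(x + u(x)), linear interpolation
    return u

# Toy example (hypothetical values): a smooth velocity field on [0, 1].
x = np.linspace(0.0, 1.0, 101)
v = 0.3 * np.sin(2 * np.pi * x)
phi = x + integrate_svf_1d(v, x)

# A map with no folds is strictly increasing in 1D.
print("monotone:", bool(np.all(np.diff(phi) > 0)))
```

Each squaring step doubles the integration time, so `n_steps` iterations recover the unit-time flow from a displacement that starts out small enough to be safely invertible.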

Spherical Registration Network. We construct a 2D CNN registration network for the sphere. We parameterize the spherical meshes to the 2D Euclidean plane using a stereographic projection $\pi : S^2 \to \mathbb{R}^2$. By mapping to the plane, we can use a standard 2D CNN to learn a 2D diffeomorphic displacement field in the space of spherical angles $(\theta, \phi)$.

To account for the nonuniform sampling of the spherical parameterization, where regions near the poles are stretched, we apply distortion correction in all surface-based losses by weighting samples by $\sin(\theta)$, where $\theta$ is the elevation angle (Cheng et al., 2020; Li et al., 2024). We also handle the discontinuities at the image boundaries using the padding strategy of Li et al. (2024), employing circular padding along the left-right axis and a 180° circular shift with reflection padding at the poles.
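The weighting and padding described above can be sketched in NumPy as follows. This is a hedged illustration, not the referenced implementation: it assumes a regular grid whose rows index $\theta$ (measured from the pole, sampled at pixel centres) and whose columns index longitude, and the helper names `sin_theta_weights`, `weighted_mse`, and `pad_sphere` are hypothetical.

```python
import numpy as np

def sin_theta_weights(height, width):
    """Per-pixel area weights sin(theta) on a regular (theta, longitude) grid."""
    # Assumption: theta runs from 0 to pi down the rows, sampled at pixel centres.
    theta = (np.arange(height) + 0.5) * np.pi / height
    return np.repeat(np.sin(theta)[:, None], width, axis=1)

def weighted_mse(a, b, w):
    """Mean squared error with each pixel weighted by w."""
    return float(np.sum(w * (a - b) ** 2) / np.sum(w))

def pad_sphere(img, p):
    """Pad a 2D spherical grid: wrap in longitude, shift and reflect at the poles."""
    h, w = img.shape
    # Crossing a pole lands on the opposite meridian: reflect the rows and
    # shift them by half the width (180 degrees).
    top = np.roll(img[:p][::-1], w // 2, axis=1)
    bottom = np.roll(img[-p:][::-1], w // 2, axis=1)
    tall = np.concatenate([top, img, bottom], axis=0)
    # Longitude is periodic, so the left and right borders wrap around.
    return np.pad(tall, ((0, 0), (p, p)), mode="wrap")

img = np.arange(6 * 8, dtype=float).reshape(6, 8)
padded = pad_sphere(img, 2)
weights = sin_theta_weights(6, 8)
print(padded.shape)
```

With this padding, a convolution kernel centred near a pole or a grid edge sees values that are true neighbours on the sphere rather than zeros.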

We use a neural network $F_s(\pi(S_1), \pi(S_2); \omega_s) = \psi$ to predict $\psi$, by first predicting an SVF that is then integrated. The network outputs a displacement field $u_s$ such that $\psi(\rho) = \rho + u_s(\rho)$, where $\rho = (\theta, \phi)$ is a discrete location in pixel coordinates on the parameterized grid.

Cortical Consistency Loss. We maximize alignment of the cortex and subcortical structures between a pair of images while maintaining a smooth map.

We discretize the proposed consistency loss in Eq. (3) and use it to jointly train the volumetric and spherical registration networks. The consistency loss penalizes disagreement between the predicted 3D displacement on the cortex and the 3D displacement computed from the predicted 2D warp:

where $N_v$ is the number of vertices in $C_1$, $\pi(v)$ is the projection of $v$ to the plane, and $f(\cdot, \cdot)$ is a similarity metric; we use the mean squared error in our experiments. We implement $\pi^{-1}(\cdot)$ with trilinear interpolation over the mapped planar parameterizations of $v$. Maps $\varphi$ and $\psi$ are the outputs of the 3D and 2D CNN, respectively.
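The following sketch illustrates the volumetric half of this loss: sampling a 3D displacement field at mesh vertex positions with trilinear interpolation and comparing the displaced vertices with target positions. It assumes the surface-derived targets (the positions implied by the spherical warp) are already available as an array; `trilinear_sample`, `consistency_loss`, and `surf_targets` are illustrative names, not the authors' code.

```python
import numpy as np

def trilinear_sample(field, pts):
    """Sample field of shape (X, Y, Z, C) at points pts of shape (N, 3), voxel coords."""
    shape = np.array(field.shape[:3])
    pts = np.clip(pts, 0, shape - 1)
    i0 = np.minimum(np.floor(pts).astype(int), shape - 2)  # lower corner index
    t = pts - i0                                           # fractional offsets in [0, 1]
    out = np.zeros((pts.shape[0], field.shape[3]))
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                wx = t[:, 0] if dx else 1.0 - t[:, 0]
                wy = t[:, 1] if dy else 1.0 - t[:, 1]
                wz = t[:, 2] if dz else 1.0 - t[:, 2]
                corner = field[i0[:, 0] + dx, i0[:, 1] + dy, i0[:, 2] + dz]
                out += (wx * wy * wz)[:, None] * corner
    return out

def consistency_loss(u_vol, verts, surf_targets):
    """Mean squared distance between phi(v) = v + u(v) and surface-derived targets."""
    pred = verts + trilinear_sample(u_vol, verts)
    return float(np.mean(np.sum((pred - surf_targets) ** 2, axis=1)))

# Toy check (hypothetical data): a constant shift of one voxel along x.
rng = np.random.default_rng(0)
u_vol = np.zeros((4, 4, 4, 3))
u_vol[..., 0] = 1.0
verts = rng.uniform(0, 3, size=(50, 3))
print(consistency_loss(u_vol, verts, verts + [1.0, 0.0, 0.0]))
```

Because each sampled value is a weighted sum of the eight surrounding voxels, a gradient on a vertex residual spreads to all eight, which is the mechanism the text relies on to make the surface constraint felt in the volume.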

Image Similarity Loss. Both networks are trained by minimizing an unsupervised loss that balances image similarity and field regularity. The similarity loss on the volume matches MRI intensities and the loss on the sphere matches geometric descriptors. We pre-compute geometric descriptors on the cortical meshes, $g : C \to \mathbb{R}^2$. As is standard in spherical registration, we use the sulcal depth and mean curvature (Fischl et al., 1999; Cheng et al., 2020). The similarity loss is $\mathcal{L}_{\mathrm{sim}}(\varphi, \psi) = \mathcal{L}_{\mathrm{sim}}(I_2, \varphi(I_1)) + \mathcal{L}_{\mathrm{sim}}\big(g(\pi(S_2)),\, \psi(g(\pi(S_1)));\, w(\rho)\big)$. Here, $w(\rho)$ is a weighting to correct for area distortion introduced by the stereographic projection. The similarity loss measures pairwise similarity between the fixed image and sphere, $I_2$ and $S_2$, and the mapped image and sphere, $\varphi(I_1)$ and $\psi(S_1)$.

We use the local normalized cross correlation function as our similarity measure (Avants et al., 2010).
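As a rough illustration of local normalized cross correlation, the sketch below computes a squared NCC in every sliding window of a 2D image pair and averages the result. The squared form, the window size, and the function name `local_ncc` are assumptions chosen for brevity; a training loss would typically minimize the negative of this quantity.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_ncc(a, b, win=5, eps=1e-8):
    """Mean squared local normalized cross correlation between two 2D images."""
    wa = sliding_window_view(a, (win, win))  # (H-win+1, W-win+1, win, win)
    wb = sliding_window_view(b, (win, win))
    da = wa - wa.mean(axis=(-2, -1), keepdims=True)  # remove the local mean
    db = wb - wb.mean(axis=(-2, -1), keepdims=True)
    cross = (da * db).sum(axis=(-2, -1))
    var_a = (da ** 2).sum(axis=(-2, -1))
    var_b = (db ** 2).sum(axis=(-2, -1))
    return float(np.mean(cross ** 2 / (var_a * var_b + eps)))

rng = np.random.default_rng(0)
img_a = rng.random((16, 16))
img_b = rng.random((16, 16))
print(local_ncc(img_a, img_a), local_ncc(img_a, img_b))
```

Subtracting the local mean and dividing by the local variances makes the score insensitive to smooth brightness and contrast changes between scans.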

Regularization. We regularize the displacement fields to be smooth to encourage a well-behaved map. Our regularization loss is $\mathcal{L}_{\mathrm{reg}}(\varphi, \psi) = \|\nabla \varphi\|^2 + \|\nabla \psi\|^2$, where $\nabla \varphi$ is the Jacobian of map $\varphi$.

Auxiliary structural loss. The use of anatomical label maps during training of learning-based registration often improves substructure alignment (Balakrishnan et al., 2019). When segmentation maps are available for some pairs of images, we use this information to supervise structural alignment. Let $\mathrm{seg}[I_1]$ indicate the segmentation labelmap of the $K$ brain substructures for image $I_1$. We add an additional loss to align these structures,

In practice, we use the soft Dice loss function, which captures the volume overlap between the warped structures of one image and the corresponding structures of the second.
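A soft Dice loss over $K$ label channels can be sketched as below. The channel-first layout, the smoothing constant, and the name `soft_dice_loss` are assumptions for illustration; the inputs are taken to be one-hot or probabilistic label maps, one for the warped moving image and one for the fixed image.

```python
import numpy as np

def soft_dice_loss(p, q, eps=1e-6):
    """1 - mean Dice over K label channels; p and q have shape (K, ...)."""
    axes = tuple(range(1, p.ndim))               # sum over all spatial axes
    inter = np.sum(p * q, axis=axes)
    denom = np.sum(p, axis=axes) + np.sum(q, axis=axes)
    dice = (2.0 * inter + eps) / (denom + eps)   # eps guards empty structures
    return float(1.0 - dice.mean())

def one_hot(labels, k):
    """Stack k binary masks from an integer label map."""
    return np.stack([(labels == c) for c in range(k)]).astype(float)

# Toy 2D label maps (hypothetical): two structures, partially overlapping.
warped = one_hot(np.array([[0, 0, 1, 1], [0, 0, 1, 1]]), 2)
fixed = one_hot(np.array([[0, 1, 1, 1], [0, 1, 1, 1]]), 2)
print(soft_dice_loss(warped, fixed))
```

Averaging Dice per structure rather than pooling all voxels keeps small structures from being dominated by large ones.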

Joint training loss. Combining all loss terms provides our joint optimization problem:

where λ, γ, and κ are hyperparameters.
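A form of the joint objective consistent with the loss terms and hyperparameter roles described in this paper (λ for field regularization, κ for the segmentation loss, γ for the cortical consistency loss) can be sketched as follows. The subscripts `seg` and `cons` are labels introduced here for readability, and the unit weight on the similarity term is an assumption.

```latex
\mathcal{L}(\varphi, \psi) \;=\;
    \mathcal{L}_{\mathrm{sim}}(\varphi, \psi)
  \;+\; \lambda\, \mathcal{L}_{\mathrm{reg}}(\varphi, \psi)
  \;+\; \kappa\, \mathcal{L}_{\mathrm{seg}}(\varphi)
  \;+\; \gamma\, \mathcal{L}_{\mathrm{cons}}(\varphi, \psi)
```

Only the consistency term and the shared training loop tie the two networks together; the similarity and regularization terms separate into a volumetric part and a spherical part.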

We evaluate NeurAlign using 3D brain MRI scans from multiple datasets comprising healthy subjects and subjects with cognitive disorders. We compare our method against CVS, the state-of-the-art combined cortical surface-volume registration framework (Postelnicu et al., 2008), and against learning-based methods that emphasize structural alignment. We test the ability of our method to generalize to new datasets unseen in training.

Data. We train our model using two public 3D brain MRI datasets: OASIS-1 (Marcus et al., 2007) and ADNI (Mueller et al., 2005). Both datasets are obtained from large population studies focused on Alzheimer’s disease (AD). OASIS-1 (Marcus et al., 2007) includes T1-weighted (T1w) scans of 416 subjects aged 18-96. We use a random subset of 100 ADNI subjects aged 56-90, half of whom were diagnosed with AD. For each ADNI subject, we use data from two sessions, two years apart.

For each dataset, we hold out 20% of the subjects for testing, stratified by gender, age, and disease state. The remaining subjects are split into 85% for training and 15% for validation using the same stratification. In training, we randomly sample pairs from both datasets. In our test set, we create 123 non-overlapping pairs for registration: 83 pairs aligning OASIS-1 and OASIS-1, 20 pairs aligning ADNI and ADNI, and 20 pairs aligning OASIS-1 and ADNI.

Held-out datasets. We use two additional held-out datasets for testing model generalization. The IXI dataset contains MRI scans from healthy subjects acquired in 3 London hospitals (IXI Consortium).

We randomly select T1w scans from 115 subjects for evaluation. We also hold out 80 T1w scans from the Mindboggle-101 dataset (Klein et al., 2017), a collection of manually labeled brain MRI volumes and surfaces. We exclude all OASIS-1 subjects from the Mindboggle dataset.

All evaluations are performed on T1w MRI, which is the modality that enables reliable cortical surface extraction. T1w MRI has the strongest gray-white matter contrast and supports reliable delineation of white and pial surfaces (Fischl et al., 1999;Fischl, 2012;Hoopes et al., 2022).

Processing. We use FreeSurfer (Fischl, 2012) to perform all image preprocessing. FreeSurfer is an open-source, widely used toolkit in neuroimaging that supports cortical surface analysis, longitudinal workflows, and population-scale MRI studies. We first perform affine alignment of each image to the Talairach template (Talairach & Tournoux, 1990). We crop all images to a roughly 10 mm margin around the brain. To prepare our training data, we generate anatomical segmentations, white matter (internal) and pial (external) cortical surface reconstructions, and spherically inflated surfaces with parameterized curvature maps. For evaluation, we use both volumetric segmentations of subcortical brain regions, computed on the volume, and parcellations of the cortical ribbon, computed on the surface mesh. Surface parcellations are rasterized as volumetric segmentations. We emphasize that our method requires no surface meshes, spheres, or segmentation labelmaps at inference.

Implementation Details. We train using the Adam optimizer (Kingma & Ba, 2014) with a learning rate of $10^{-4}$. We set the field regularization hyperparameter to λ = 1.0, the segmentation loss weight to κ = 10.0, and the cortical consistency weight to γ = 0.05. These were chosen using a grid search over the validation set. We train for 300,000 iterations and select the best-performing model on the validation set. All models are trained on a single A100 GPU. All CVS experiments are run on an Intel(R) Xeon(R) Gold 5218 CPU, as there is no GPU implementation.

Baselines. We select state-of-the-art baselines that emphasize structural alignment. We evaluate CVS (Postelnicu et al., 2008;Zöllei et al., 2010), the most widely used joint volume-cortical alignment registration method, which is integrated in the FreeSurfer software suite. CVS performs sequential spherical to volumetric alignment using a biomechanical model and image intensity registration.

We evaluate uGradICON (Tian et al., 2024), a modern foundation model for medical image registration that generalizes well to unseen data. uGradICON employs test-time optimization to improve pairwise registration accuracy. We also use the uGradICON-seg variant, which performs test-time optimization using segmentation labelmaps to improve cortical structure alignment.

We test SynthMorph (Hoffmann et al., 2023, 2024), a state-of-the-art registration tool trained exclusively on images synthesized from label maps. Since this tool optimized the overlap of standard FreeSurfer labels in training, which include only two large cortical structures, we also retrain it with extended label sets, optimizing overlap of the “a2009s” set of 196 labels that include finer Destrieux parcels of the cortex (Destrieux et al., 2010) (SynthMorph-wm).

We also evaluate FireANTs (Jena et al., 2025), a library for fast medical image registration using Riemannian optimization. We use the SyN transformation for symmetric diffeomorphic registration.

Evaluation. To evaluate registration accuracy, we compute Dice scores between anatomical structures in the warped and target images. We report overlap for 43 subcortical structures and for cortical regions obtained by parcellating the cortical ribbon into 34 regions per hemisphere. We assess field regularity and topology by computing the determinant of the Jacobian of the map, $\det J_\varphi(p) = \det(\nabla \varphi(p))$, at each voxel $p$. Locations where $\det J_\varphi(p) < 0$ represent folded regions that break local injectivity. We report the percentage of such voxels (% folds). We measure the smoothness of the warp fields using the standard deviation of $\log |\det J|$ (SDlogJ) (Leow et al., 2007).
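The two regularity metrics can be sketched in NumPy as below, assuming a displacement field `u` of shape `(X, Y, Z, 3)` in voxel units so that φ(x) = x + u(x). Finite differences via `np.gradient`, the clipping constant, and the function names are illustrative choices, not the evaluation code used for the reported numbers.

```python
import numpy as np

def jacobian_determinant(u):
    """det(I + grad u) at every voxel for a displacement field u of shape (X, Y, Z, 3)."""
    jac = np.zeros(u.shape[:3] + (3, 3))
    for i in range(3):        # displacement component
        for j in range(3):    # spatial axis of the derivative
            jac[..., i, j] = np.gradient(u[..., i], axis=j)
    jac += np.eye(3)          # phi(x) = x + u(x), so J = I + grad u
    return np.linalg.det(jac)

def fold_stats(u, eps=1e-9):
    """Return (% folds, SDlogJ) for a displacement field."""
    det = jacobian_determinant(u)
    pct_folds = 100.0 * float(np.mean(det < 0))
    # Clip |det| away from zero so the logarithm stays finite.
    sd_log_j = float(np.std(np.log(np.maximum(np.abs(det), eps))))
    return pct_folds, sd_log_j

# Toy fields (hypothetical): a mild stretch along x and a reflection along x.
grid = np.arange(5.0)[:, None, None] * np.ones((5, 5, 5))
stretch = np.zeros((5, 5, 5, 3))
stretch[..., 0] = 0.1 * grid
flip = np.zeros((5, 5, 5, 3))
flip[..., 0] = -2.0 * grid
print(fold_stats(stretch), fold_stats(flip))
```

A uniform stretch has a constant positive determinant, so it scores zero on both metrics, while the reflection has a negative determinant everywhere and is counted as fully folded.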

Table 1 and Figure 2 present results for all methods on the 3 datasets. Our method, NeurAlign, consistently achieves the highest cortical Dice score, often by a large margin. For example, it improves by up to 7 Dice points on the held-out Mindboggle dataset. NeurAlign also achieves the highest subcortical Dice score on OASIS-1 & ADNI and Mindboggle. We obtain slightly lower subcortical Dice on IXI (1.5 points at worst), though the difference is not substantial. Across all datasets, NeurAlign achieves statistically better cortical Dice scores (p < 0.01) and in many cases a substantial increase (up to 7.5 points over the next best method). We also outperform baselines on whole cerebral cortex alignment (see Suppl. A.2). Finally, NeurAlign produces regular deformation fields with negligible folding. These results demonstrate that we are able to greatly improve cortical structure alignment while maintaining strong performance in substructure alignment and field regularity.

In terms of computational time, CVS was slowest, requiring 2.5 ± 0.5 hours to register a pair of images. uGradICON required on the order of minutes to perform test-time optimization. NeurAlign and all other baselines converged in seconds or milliseconds. NeurAlign effectively aligns both cortical and subcortical structures more accurately than other modern methods, while being more accurate and faster than the state-of-the-art classical method.

Figure 3 visualizes example registrations and corresponding warps. Our method yields accurate alignments with smooth and regular deformation fields.

To study important aspects of NeurAlign, we conduct two ablation experiments. In the first, we quantify the effect of different model components: label map supervision in training and including the spherical alignment loss. In the second, we quantify the effect of κ, the Dice loss weight parameter.

Results. Table 2 presents ablations of model components, trained with κ = 1.0 and γ = 0.05, on the IXI dataset. We train a base VoxelMorph model (Dalca et al., 2019a), called Base, and extensions testing components of NeurAlign. We compare the performance when training with supervision of segmentations of subcortical structures, Base+Dice(subcort), training with segmentations of all structures, Base+Dice(all), training with only the spherical consistency loss, Base+Sphere, and training with all proposed components, Base+Dice(all)+Sphere. We selected κ = 1.0, as it achieved the best performance for Base+Dice(subcort). As expected, training with subcortical supervision improved the Dice (subcortical) score compared to the Base model. Interestingly, however, training with supervision of all brain structures did not lead to a substantial improvement in the Dice (cortical) score. Only when including our proposed spherical consistency loss (Base+Dice(all)+Sphere) did we observe a substantial increase in the Dice (cortical) score while maintaining the Dice (subcortical) score. Training with just the spherical loss without structural supervision did not improve results, indicating that we also require volumetric structure supervision. This ablation study demonstrates the benefit of our spherical-volumetric consistency constraint in obtaining accurate alignment of fine-grained cortical structures.

We assess the impact of κ, the weighting hyperparameter for the Dice loss term (see Suppl. Table A.1, Fig. A.1). As expected, increasing κ improves Dice performance at test time, yielding higher structural overlap. In some regions, however, this comes at the expense of deformation field regularity. Overall, setting κ > 1 produces models that outperform baselines across most metrics. Consistent with prior work (Dalca et al., 2019b; Hoopes et al., 2021), the optimal choice of κ depends on the intended downstream application: for atlas-based segmentation propagation, higher Dice weighting is advantageous, whereas for longitudinal studies, smoother deformation fields may be preferable. Hypernetworks can be considered to select hyperparameters at test time (Hoopes et al., 2021).

Limitations and Future Work. We presented NeurAlign, a model for unified cortical and subcortical neuroimage registration, evaluated primarily on adult T1w MRI from healthy and Alzheimer’s disease populations. As with any diffeomorphic registration framework, NeurAlign cannot accurately accommodate topology-altering pathologies. This limitation can be partially mitigated by incorporating pathology masks during training (Brett et al., 2001). It also remains unclear how well the model can generalize to lower-quality clinical scans or to pediatric populations. Extending NeurAlign to these more challenging settings is an important direction for future work.

Training our model requires cortical surface extraction and spherical inflation. This preprocessing pipeline may fail on challenging scans. A simple approach to quality control is to discard failed cases, for example surfaces with many self-intersecting faces, or to use robust, learning-based extraction methods (Hoopes et al., 2022; Bongratz et al., 2022). In practice, we did not observe any failures on our training data.

Our model was also only tested on the T1w modality. T1w is the most widely used imaging modality for structural imaging studies (Bhalerao et al., 2024), and it is currently the primary modality from which cortical surfaces can be reliably reconstructed (Fischl, 2012; Hoopes et al., 2022). An interesting future direction is extending NeurAlign to multimodal acquisitions. Additional contrasts such as T2w, FLAIR, or DWI could provide complementary information for subcortical tissue segmentation or characterization. Incorporating these modalities would require multimodal training with all contrasts aligned to the T1w image so that the cortical surfaces remain consistent across channels. Importantly, at inference time, the surfaces would no longer be required, potentially reducing the current dependence on T1w imaging for surface extraction.

A benefit of our formulation is that it can be easily extended to include additional structures with genus-0 topology. For example, this can improve alignment of the hippocampus by providing additional supervision through surface alignment. Further, our consistency loss in Eq. (3) can be adopted for non-spherical surface registration, implicit surfaces, or to perform cortical alignment directly on the surface mesh using frameworks such as Deep Functional Maps (Donati et al., 2020).

Representations beyond spheres can be used, provided there is a one-to-one map to the cortical mesh.

We introduced NeurAlign, a deformable registration method that achieves accurate alignment of both subcortical and cortical structures. The approach parameterizes the deformation field through coupled spherical and volumetric alignment, promoting topological consistency across cortical surfaces. Unlike the state-of-the-art iterative method for combined volume and surface registration, CVS (Postelnicu et al., 2008), we do not require additional inputs (cortical meshes, segmentations) at inference. Our method only requires structural MRI images.

Our model outperforms state-of-the-art deep learning and foundational models trained or fine-tuned using cortical segmentation label maps. Compared to CVS, the most widely used method in large-scale neuroimaging pipelines, our method achieves substantially better alignment and deformation


This content is AI-processed based on open access ArXiv data.
