Multi-modal Imputation for Alzheimer's Disease Classification
Deep learning has been successful in predicting neurodegenerative disorders, such as Alzheimer’s disease, from magnetic resonance imaging (MRI). Combining multiple imaging modalities, such as T1-weighted (T1) and diffusion-weighted imaging (DWI) scans, can increase diagnostic performance. However, complete multimodal datasets are not always available. We use a conditional denoising diffusion probabilistic model to impute missing DWI scans from T1 scans. We perform extensive experiments to evaluate whether such imputation improves the accuracy of uni-modal and bi-modal deep learning models for 3-way Alzheimer’s disease classification: cognitively normal, mild cognitive impairment, and Alzheimer’s disease. We observe improvements in several metrics, particularly those sensitive to minority classes, for several imputation configurations.
💡 Research Summary
This paper tackles the pervasive problem of missing imaging modalities in Alzheimer’s disease (AD) research by using a conditional denoising diffusion probabilistic model (DDPM) to synthesize diffusion‑weighted imaging (DWI) from readily available T1‑weighted MRI. The authors argue that multimodal MRI—combining structural T1 with microstructural DWI (specifically the fractional anisotropy, FA, map)—offers richer biomarkers for AD, mild cognitive impairment (MCI), and cognitively normal (CN) classification, yet DWI acquisition is often incomplete due to longer scan times and susceptibility to motion artifacts. Traditional linear or mean‑imputation methods cannot capture the complex non‑linear relationship between T1 and DWI, motivating a generative approach.
Methodology
A 3‑D conditional DDPM is built on a MONAI‑based 3‑D U‑Net architecture. The network receives two channels: the clean T1 volume and a noisy version of the target DWI at a given diffusion timestep, and predicts the added noise. Temporal conditioning (1,000 timesteps, linear β schedule from 5×10⁻⁴ to 1.95×10⁻²) and self‑attention layers enable the model to learn the conditional distribution P(DWI|T1). Training uses paired T1‑DWI scans from ADNI phases 1‑3: 642 subjects for training, 137 for validation, and 137 for testing. The model achieves a mean 3‑D SSIM of 0.36, PSNR of 23.61, L1 error of 0.043, and MSE of 0.0057 on the held‑out test set, indicating moderate anatomical fidelity.
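The core of the method is the standard DDPM forward (noising) process with the linear β schedule stated above; the U-Net then learns to predict the added noise given the clean T1 as a conditioning channel. The following is a minimal NumPy sketch of that forward process, not the authors' MONAI implementation; the `noise_dwi` helper and the toy volume shapes are illustrative assumptions.

```python
import numpy as np

# Linear beta schedule as described: 1,000 timesteps, from 5e-4 to 1.95e-2.
T = 1000
betas = np.linspace(5e-4, 1.95e-2, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)  # cumulative product alpha-bar_t

def noise_dwi(dwi, t, rng):
    """Forward diffusion q(x_t | x_0): noise a clean DWI volume to timestep t.

    During training, the 3-D U-Net receives [T1, x_t] as two input channels
    plus the timestep t, and is trained to regress the noise `eps`
    (epsilon prediction); this is the conditional DDPM objective.
    """
    eps = rng.standard_normal(dwi.shape)
    x_t = np.sqrt(alpha_bars[t]) * dwi + np.sqrt(1.0 - alpha_bars[t]) * eps
    return x_t, eps

rng = np.random.default_rng(0)
dwi = rng.standard_normal((8, 8, 8))  # toy stand-in for a 3-D FA volume
x_t, eps = noise_dwi(dwi, t=500, rng=rng)
print(x_t.shape)  # noisy volume keeps the input shape
```

At inference time, sampling runs this process in reverse from pure noise, conditioned on the subject's T1, to impute the missing DWI.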
Downstream Classification Experiments
Three downstream classifiers are evaluated:
- DWI‑only 3‑D CNN – a standard 5‑block 3‑D convolutional network. Adding synthetic DWI scans for MCI and AD (equal numbers) raises overall accuracy from 62.19 % ± 1.63 to 65.40 % ± 2.19, balanced accuracy from 42.28 % ± 3.61 to 51.68 % ± 7.06, and Macro‑F1 from 41.00 % ± 5.20 to 50.79 % ± 5.89. Precision also improves. Stratified synthetic augmentation (preserving original class ratios) yields no clear benefit, and naive baselines (zero‑filled or diagnosis‑average DWI) either stagnate or degrade performance.
- T1‑only 3‑D CNN – the same architecture applied to T1 data. Because the T1 dataset is large (3,901 training scans), no imputation is needed. Training on the full set yields high accuracy and AUC on the small test split (n = 137). However, when evaluated on a larger, more heterogeneous test set (n = 859), performance drops, highlighting the challenge of generalization across diverse cohorts.
- Bimodal (T1 + DWI) 3‑D CNN – two parallel unimodal branches whose features are concatenated in a late‑fusion step. Adding synthetic MCI and AD DWI scans improves accuracy from 68.03 % ± 2.33 to 70.36 % ± 2.38 and micro‑AUC from 84.99 % ± 1.43 to 87.04 % ± 1.92. When synthetic scans are added equally across all three diagnostic groups, balanced accuracy jumps from 46.13 % ± 2.26 to 59.96 % ± 4.41 and F1 from 43.92 % ± 2.20 to 58.67 % ± 7.55. Notably, a “blank” imputation (zero‑filled DWI) yields comparable gains, suggesting that the performance boost primarily stems from the increased exposure to real T1 data rather than the quality of the synthetic DWI. Diagnosis‑average imputation does not consistently help.
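The minority‑class‑sensitive metrics reported above (balanced accuracy, macro‑F1) weight each class equally regardless of how rare it is, which is why they move more than plain accuracy when synthetic MCI and AD scans are added. A small NumPy sketch of the two metrics (our illustration, not the paper's evaluation code; the toy label counts are invented):

```python
import numpy as np

def balanced_accuracy(y_true, y_pred, classes):
    """Mean per-class recall: each class counts equally, however rare."""
    recalls = [np.mean(y_pred[y_true == c] == c) for c in classes]
    return float(np.mean(recalls))

def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of the per-class F1 scores."""
    f1s = []
    for c in classes:
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return float(np.mean(f1s))

# Hypothetical imbalanced labels: 0 = CN (majority), 1 = MCI, 2 = AD (minority).
y_true = np.array([0] * 8 + [1] * 3 + [2] * 1)
y_maj = np.zeros_like(y_true)  # a classifier that always predicts CN
print(np.mean(y_maj == y_true))                       # plain accuracy ~0.667
print(balanced_accuracy(y_true, y_maj, [0, 1, 2]))    # balanced accuracy ~0.333
```

The always-majority classifier looks acceptable under plain accuracy but collapses under balanced accuracy and macro‑F1, mirroring why augmenting the under-represented MCI and AD classes moves those metrics most.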
Interpretation
The results indicate that conditional DDPM‑generated DWI can modestly aid classification, especially for under‑represented classes, but the overall impact is limited. The dominant factor appears to be the sheer increase in T1 training samples, which already carry strong discriminative information for AD versus MCI versus CN. The synthetic DWI, while anatomically plausible, does not add enough complementary signal to outweigh the added noise or potential bias introduced by imperfect generation.
Discussion and Future Work
The authors acknowledge that the paired T1‑DWI training set (≈800 subjects) is relatively small for training high‑capacity diffusion models. They propose three avenues for improvement: (1) retraining the DDPM on larger, newly released paired datasets; (2) exploring newer diffusion bridge architectures that may better capture cross‑modal relationships; and (3) extending evaluation to simpler binary AD classification tasks and more challenging problems such as Parkinson’s disease detection or continuous cognitive score prediction, to assess whether task difficulty modulates the utility of synthetic modalities.
Conclusion
Conditional diffusion models offer a promising route to impute missing DWI data from T1 MRI, enabling the construction of larger multimodal cohorts. While the current study demonstrates some performance gains—particularly in metrics sensitive to minority classes—the benefits are not uniform and are often eclipsed by the informational weight of T1 alone. Further scaling of paired data, methodological refinements, and broader downstream testing are needed to fully realize the potential of generative imputation in neuroimaging‑based disease classification.