Simple Image Processing and Similarity Measures Can Link Data Samples across Databases through Brain MRI
Head Magnetic Resonance Imaging (MRI) is routinely collected and shared for research under strict regulatory frameworks that require removing potential identifiers before sharing. Yet even after skull stripping, the brain parenchyma contains unique signatures that can be matched to other MRIs from the same participants across databases, posing a privacy risk if additional data features are available. Current regulatory frameworks often mandate evaluating such risks against a standard of reasonableness. Prior studies have suggested that a brain MRI could enable participant linkage, but they relied on training-based or computationally intensive methods. Here, we demonstrate that linking an individual’s skull-stripped T1-weighted MRI, which may lead to re-identification if other identifiers are available, is possible using standard preprocessing followed by image-similarity computation. Nearly perfect linkage accuracy was achieved in matching data samples across various time intervals, scanner types, spatial resolutions, and acquisition protocols, despite potential cognitive decline, simulating MRI matching across databases. These results are intended to inform the development of thoughtful, forward-looking policies in medical data sharing.
💡 Research Summary
The authors investigate whether a simple, unsupervised pipeline based on standard neuro‑imaging preprocessing and conventional image‑similarity metrics can reliably link skull‑stripped T1‑weighted brain MRIs belonging to the same individual across different databases. Their motivation stems from privacy regulations such as the GDPR and HIPAA, which require an assessment of “reasonable likelihood” of re‑identification when sharing de‑identified imaging data. Prior work has shown that brain MRIs retain subject‑specific patterns, but most studies relied on supervised deep‑learning models that need multiple labeled scans per subject and are often limited to the same scanner or short inter‑scan intervals.
The proposed workflow consists of two stages. First, each scan undergoes open‑source preprocessing: skull stripping, affine registration to a common stereotactic template, bias‑field correction, and histogram matching to align intensity distributions. This harmonization step standardizes both anatomy and intensity, making voxel‑wise comparisons meaningful. Second, for every possible pair of images (excluding self‑pairs) the authors compute eleven similarity measures—Mutual Information (MI), Normalized Mutual Information (NMI), Negative Mean Squared Error (NMSE), Peak Signal‑to‑Noise Ratio (PSNR), Pearson Correlation Coefficient (PCC), Cosine Similarity (CosSim), Gradient Similarity (GradSim), Structural Similarity Index (SSIM), Four‑Component Gradient‑Regularized SSIM (4‑G‑R‑SSIM), Multi‑Scale SSIM (MS‑SSIM), and Negative Fréchet Inception Distance (NFID). An unsupervised kernel‑density‑estimation (KDE) clustering of the similarity scores yields a threshold τ that separates intra‑subject from inter‑subject pairs; the ground‑truth labels are used only for evaluation.
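To make the second stage concrete, here is a minimal NumPy-only sketch of three of the eleven measures (PCC, negative MSE, and histogram-based NMI) applied to harmonized volumes. The function names and the 32-bin histogram are my own illustrative choices, not the paper's implementation; in practice measures such as SSIM and MS-SSIM would come from a library like scikit-image.

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation coefficient between two volumes."""
    return float(np.corrcoef(a.ravel(), b.ravel())[0, 1])

def neg_mse(a, b):
    """Negative mean squared error (higher = more similar)."""
    return float(-np.mean((a - b) ** 2))

def nmi(a, b, bins=32):
    """Normalized mutual information from a joint intensity histogram:
    NMI = (H(A) + H(B)) / H(A, B); ~1 for independent, 2 for identical."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = joint / joint.sum()
    px, py = p.sum(axis=1), p.sum(axis=0)
    entropy = lambda q: -np.sum(q[q > 0] * np.log(q[q > 0]))
    return float((entropy(px) + entropy(py)) / entropy(p))
```

Because harmonization aligns anatomy and intensity beforehand, even these simple voxel-wise scores separate intra-subject from inter-subject pairs.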
The pipeline is first validated on two simulated datasets: a synthetic 100 K MRI collection (SLDM) and the HCP young‑adult cohort (SHCP). After applying the harmonization steps, similarity distributions for intra‑subject pairs become clearly distinct from inter‑subject pairs. SSIM and 4‑G‑R‑SSIM achieve perfect separation (AUROC = 1.0), while NMI, MS‑SSIM, GradSim, and PCC also show near‑perfect performance. NFID performs poorly, confirming that not all modern perceptual metrics are suitable for this task. Computationally, harmonizing a single image on a standard CPU takes ~87 s (10 s for registration, 77 s for bias correction and histogram matching), and calculating all eleven metrics for one pair takes ~6 s; parallel processing with 15 cores reduces total runtime substantially.
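The unsupervised threshold selection described above can be sketched as picking τ at the deepest density valley between the two modes of the pooled similarity scores. This is an illustrative reconstruction under my own assumptions (Gaussian KDE with default bandwidth, a fixed evaluation grid), not the authors' exact procedure.

```python
import numpy as np
from scipy.stats import gaussian_kde

def kde_threshold(scores, grid_size=512):
    """Estimate the score density with a Gaussian KDE and return tau
    at the deepest interior local minimum (the valley between the
    inter-subject and intra-subject modes)."""
    scores = np.asarray(scores, dtype=float)
    kde = gaussian_kde(scores)
    grid = np.linspace(scores.min(), scores.max(), grid_size)
    dens = kde(grid)
    # interior local minima of the estimated density
    is_min = (dens[1:-1] < dens[:-2]) & (dens[1:-1] < dens[2:])
    minima = np.where(is_min)[0] + 1
    if len(minima) == 0:
        return float(np.median(scores))  # fallback for unimodal scores
    return float(grid[minima[np.argmin(dens[minima])]])
```

Pairs scoring above τ are declared intra-subject; the ground-truth labels are then used only to evaluate how well this unsupervised split worked.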
The authors then apply the method to five real‑world datasets that span a wide range of acquisition conditions: (1) Hormonal Health Study (30 women, 3 T Siemens), (2) a running‑intervention study (21 men, 3 T Siemens), (3) the Traveling Human Phantom (5 participants scanned at eight sites on Siemens, GE, and Philips 3 T scanners), (4) SDSU‑TS (9 participants, two sites, GE Discovery and Siemens Prisma), and (5) the Alzheimer’s Disease Neuroimaging Initiative (ADNI) comprising 227 participants with 454 scans across 1.5 T and 3 T scanners, multiple protocols, and three cognitive status groups (cognitively normal, MCI, dementia). Across all datasets, the selected similarity measures (SSIM, PCC, NMI, GradSim) consistently yield AUROC, sensitivity, and specificity above 0.96, often reaching 1.0. Notably, in ADNI the pipeline correctly links scans despite up to 108 months between acquisitions, scanner upgrades from 1.5 T to 3 T, and substantial brain atrophy associated with disease progression.
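The reported evaluation metrics are standard and easy to reproduce: AUROC equals the probability that a randomly chosen intra-subject pair outscores a randomly chosen inter-subject pair (the Mann-Whitney formulation), while sensitivity and specificity follow from the threshold τ. A small sketch, with hypothetical function names:

```python
import numpy as np

def auroc(intra_scores, inter_scores):
    """P(random intra-subject pair scores above a random inter-subject
    pair), counting ties as one half."""
    intra = np.asarray(intra_scores, dtype=float)[:, None]
    inter = np.asarray(inter_scores, dtype=float)[None, :]
    return float((intra > inter).mean() + 0.5 * (intra == inter).mean())

def sensitivity_specificity(intra_scores, inter_scores, tau):
    """Sensitivity: intra pairs at/above tau; specificity: inter below."""
    sens = float(np.mean(np.asarray(intra_scores) >= tau))
    spec = float(np.mean(np.asarray(inter_scores) < tau))
    return sens, spec
```

An AUROC of 1.0, as reported for SSIM on the simulated data, means every intra-subject pair scored strictly higher than every inter-subject pair.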
The study demonstrates that even without any machine‑learning training, simple image‑processing combined with well‑chosen similarity metrics can achieve near‑perfect participant linkage. This has direct implications for data‑sharing policies: skull‑stripping alone does not guarantee anonymity, and regulators should consider quantitative re‑identification risk assessments such as the one presented here. The authors suggest that their pipeline can be used as a practical tool for “reasonableness” assessments under GDPR, HIPAA, and similar frameworks, guiding institutions on whether additional de‑identification steps (e.g., adding noise, further defacing) are required before releasing neuroimaging data.