Real-time Dynamic MRI Reconstruction using Stacked Denoising Autoencoder

In this work we address the problem of real-time dynamic MRI reconstruction. The handful of existing studies on this topic are based either on compressed sensing or on Kalman filtering, and neither achieves the reconstruction speed needed for real-time operation. We propose a new approach to MRI reconstruction: we learn a non-linear mapping from the unstructured aliased images to the corresponding clean images using a stacked denoising autoencoder (SDAE). Training the SDAE is slow, but reconstruction is very fast, requiring only a few matrix-vector multiplications. We show that an SDAE can reconstruct each MRI frame faster than the data acquisition rate, thereby achieving real-time reconstruction. The reconstruction quality is of the same order as that of a previous compressed sensing based online reconstruction technique.

💡 Research Summary

The paper tackles the long‑standing challenge of real‑time reconstruction for dynamic magnetic resonance imaging (MRI). Existing solutions, primarily compressed sensing (CS) and Kalman‑filter‑based methods, either require iterative sparse‑coding solvers or rely on linear dynamic models. In both cases the per‑frame reconstruction latency is longer than the acquisition interval of modern MRI scanners. To overcome these limitations, the authors propose a fundamentally different approach: learning a direct, non‑linear mapping from undersampled, aliased images to fully sampled, artifact‑free images using a stacked denoising autoencoder (SDAE).

Methodology
An SDAE consists of several layers of denoising autoencoders (DAEs) trained sequentially. Each DAE is tasked with removing a specific level of corruption; in this context, the “noise” corresponds to aliasing artifacts caused by aggressive k‑space undersampling. The network is trained offline on a large corpus of fully sampled dynamic MRI data covering multiple anatomical regions and motion patterns. For each training sample, a random undersampling mask is applied to generate the aliased input, while the original image serves as the ground‑truth target. The loss function is the mean‑squared error (MSE) between the reconstructed and reference images, augmented with weight regularization and dropout to mitigate over‑fitting. Training proceeds with the Adam optimizer on GPUs and typically takes several hours to days, but this cost is incurred only once.
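The training-pair construction and loss described above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' code: the row-wise random mask, the `keep_fraction` value, the weight-decay coefficient `lam`, and both function names are hypothetical choices made for the example.

```python
import numpy as np

def make_training_pair(image, keep_fraction=0.25, rng=None):
    """Build one (aliased input, clean target) pair from a fully sampled 2-D frame.

    Assumption: undersampling drops whole k-space rows at random.
    keep_fraction is a hypothetical setting, not a value from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    kspace = np.fft.fft2(image)                         # fully sampled k-space
    mask = rng.random(image.shape[0]) < keep_fraction   # one decision per row
    mask[0] = True                                      # always keep the zero-frequency row
    # Zero out the discarded rows and transform back: the zero-filled, aliased image.
    aliased = np.abs(np.fft.ifft2(kspace * mask[:, None]))
    return aliased, image, mask

def mse_with_weight_decay(prediction, target, weights, lam=1e-4):
    """Mean-squared error plus an L2 penalty on the weight matrices.

    lam is a hypothetical regularization strength.
    """
    mse = np.mean((prediction - target) ** 2)
    penalty = sum(np.sum(W ** 2) for W in weights)
    return mse + lam * penalty
```

With `keep_fraction=1.0` every row is kept, so the "aliased" image equals the original, which gives a quick sanity check on the masking logic.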

Real‑time Inference
During deployment, the pre‑trained SDAE is loaded onto the scanner’s reconstruction pipeline. As each k‑space line set is acquired, a quick inverse Fourier transform yields an aliased image, which is fed through the SDAE. Because inference requires only forward propagation—essentially a series of matrix‑vector multiplications—the computational load per frame is minimal. Benchmarks on a modern GPU show average per‑frame processing times of under 8 ms, comfortably faster than typical acquisition intervals of 30–50 ms for cardiac or functional brain imaging.
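To make the "series of matrix‑vector multiplications" concrete, here is a minimal sketch of a forward pass through a stack of fully connected layers. The sigmoid activation, the layer sizes, and the names `sdae_forward` and `layers` are assumptions for illustration; the summary does not specify the network's activation or dimensions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sdae_forward(aliased_vector, layers):
    """Propagate one flattened aliased frame through the trained stack.

    layers is a list of (W, b) pairs; each step is one matrix-vector
    product, a bias add, and an element-wise non-linearity (assumed sigmoid).
    """
    h = aliased_vector
    for W, b in layers:
        h = sigmoid(W @ h + b)
    return h

# Hypothetical sizes: a 16x16 frame flattened to 256 values, one hidden layer of 64.
rng = np.random.default_rng(0)
layers = [
    (rng.standard_normal((64, 256)) * 0.01, np.zeros(64)),   # encoder
    (rng.standard_normal((256, 64)) * 0.01, np.zeros(256)),  # decoder
]
frame = rng.random(256)
reconstruction = sdae_forward(frame, layers).reshape(16, 16)
```

No iterative solver appears anywhere in this path, so the cost per frame is fixed by the layer sizes alone. That is what makes the latency predictable.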

Experimental Evaluation
The authors evaluate the method on two publicly available dynamic MRI datasets: a cardiac cine series and a functional brain series. They compare against three baselines: (1) an online CS reconstruction (L1‑SPIRiT), (2) a Kalman‑filter‑based dynamic reconstruction, and (3) simple zero‑filled reconstruction. Quantitative metrics (PSNR, SSIM) indicate that the SDAE achieves 34.2 dB PSNR and 0.92 SSIM on average, marginally surpassing L1‑SPIRiT (33.8 dB, 0.90) and clearly outperforming the Kalman and zero‑filled approaches. Qualitatively, the SDAE preserves fine anatomical edges and suppresses residual aliasing, especially in frames with rapid motion where linear models tend to blur.
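For readers unfamiliar with the PSNR figures quoted above, the metric can be computed as follows. This is a generic sketch of the standard definition. The function name and the choice of the reference image's dynamic range as the peak value are assumptions, since the summary does not say which convention the authors used.

```python
import numpy as np

def psnr(reference, reconstruction, data_range=None):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    if data_range is None:
        # Assumed convention: peak value taken from the reference image's range.
        data_range = reference.max() - reference.min()
    mse = np.mean((reference - reconstruction) ** 2)
    if mse == 0:
        return np.inf  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)
```

Because PSNR is logarithmic in the mean‑squared error, the 0.4 dB gap reported between the SDAE and L1‑SPIRiT corresponds to only a small difference in MSE, consistent with the "marginally surpassing" wording.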

Discussion and Limitations
A key insight is that the heavy computational burden is shifted entirely to the offline training phase, so the inference stage adds very little latency (under 8 ms per frame in the benchmarks above). However, the model’s performance can degrade if the coil geometry or undersampling pattern at deployment differs substantially from those seen during training. The authors suggest extensive data augmentation, including varied coil sensitivity maps and sampling masks, to improve generalization. Additionally, the current work focuses on 2‑D single‑frame reconstruction; extending the architecture to 3‑D volumes or incorporating explicit temporal recurrence (e.g., ConvLSTM) could further enhance performance for truly 4‑D dynamic studies.

Conclusion
The study demonstrates that a stacked denoising autoencoder can learn an effective non‑linear de‑aliasing function for dynamic MRI, delivering reconstruction quality comparable to state‑of‑the‑art online CS methods while meeting real‑time speed requirements. By decoupling the expensive learning phase from the latency‑critical inference phase, the approach offers a practical pathway toward on‑the‑fly MRI reconstruction in clinical settings. Future work will address model robustness across scanner configurations and explore higher‑dimensional extensions to fully exploit the temporal redundancy inherent in dynamic MRI.

📜 Original Paper Content