A Diffusion-Based Generative Prior Approach to Sparse-view Computed Tomography
The reconstruction of X-ray CT images from sparse or limited-angle geometries is a highly challenging task. The lack of data typically results in artifacts in the reconstructed image and may even distort the imaged object. For this reason, deep generative models are of great interest and potential in this context. In the Deep Generative Prior (DGP) framework, a diffusion-based generative model is combined with an iterative optimization algorithm to reconstruct CT images from sinograms acquired under sparse geometries, preserving the explainability of a model-based approach while introducing the generative power of a neural network. Several aspects of this framework can be further investigated to improve reconstruction quality, such as image generation, the model itself, and the iterative algorithm used to solve the minimization problem; we propose modifications to existing approaches in each of these directions. The results obtained even under highly sparse geometries are very promising, although further research is clearly needed in this direction.
💡 Research Summary
The paper tackles the challenging problem of reconstructing X‑ray computed tomography (CT) images from highly sparse or limited‑angle projection data, a scenario that typically leads to severe artifacts and loss of structural fidelity. To address this, the authors extend the Deep Generative Prior (DGP) framework by employing a diffusion‑based generative model—specifically a Denoising Diffusion Implicit Model (DDIM)—as the image prior. Their method, named Regularized Diffusion‑based Deep Generative Prior (RD‑DGP), combines three key innovations: (1) a physics‑informed initialization that leverages a coarse filtered back‑projection (FBP) reconstruction and maps it into the diffusion latent space via deterministic DDIM inversion, thereby providing a data‑consistent starting point that avoids poor local minima; (2) a regularized latent‑space optimization problem that augments the standard data‑fidelity term with a MAP‑style ℓ₂ prior on the latent code and a Total Variation (TV) regularizer on the generated image, balancing statistical consistency with anatomical smoothness; and (3) a cosine‑annealed learning‑rate schedule for the Adam optimizer, which enables aggressive early exploration of the latent space followed by fine‑grained refinement.
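The physics-informed initialization described above, which maps a coarse FBP reconstruction into the diffusion latent space via deterministic DDIM inversion, can be sketched as follows. This is a minimal illustration, not the authors' implementation: `eps_model` stands for a pre-trained noise predictor and `alpha_bar` for the cumulative noise schedule, both of which are assumptions here.

```python
import numpy as np

def ddim_invert(x0, eps_model, alpha_bar):
    """Deterministic (eta = 0) DDIM inversion: run an image x0 (e.g. a
    coarse FBP reconstruction) forward through the diffusion time steps
    to obtain a latent code z = x_T that serves as a data-consistent
    starting point for latent-space optimization.

    alpha_bar: cumulative products of the noise schedule, indexed by
    t = 0 .. T-1, with alpha_bar[0] close to 1 (almost no noise).
    """
    x = x0.copy()
    T = len(alpha_bar)
    for t in range(T - 1):
        eps = eps_model(x, t)  # predicted noise at the current step
        # Clean-image estimate implied by the current state x_t.
        x0_hat = (x - np.sqrt(1.0 - alpha_bar[t]) * eps) / np.sqrt(alpha_bar[t])
        # Deterministic step from t to t + 1 (reverse of the DDIM update).
        x = (np.sqrt(alpha_bar[t + 1]) * x0_hat
             + np.sqrt(1.0 - alpha_bar[t + 1]) * eps)
    return x
```

With a trained model, running the deterministic DDIM sampler from the returned latent approximately reproduces the input image, which is what makes the FBP-based initialization data-consistent.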
The authors detail the mathematical formulation of the forward and reverse diffusion processes, the deterministic DDIM mapping G(z), and the composite loss function F(z)=½‖K·G(z)−y‖²+λ₁‖z‖²+λ₂TV(G(z)). Optimization proceeds by back‑propagating gradients through G(z) and updating the latent vector z with the adaptive step size. The paper also provides a clear algorithmic pseudocode and a schematic illustration of the pipeline.
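The composite objective F(z) and the cosine-annealed step size can be sketched numerically as below. This is a hedged illustration under simplifying assumptions: `G` and `K` are passed in as callables standing for the DDIM generator and the forward (projection) operator, TV is taken in its anisotropic form, and the schedule formula is the standard cosine annealing rather than a detail confirmed by the paper.

```python
import numpy as np

def total_variation(img):
    """Anisotropic TV: sum of absolute finite differences along both axes."""
    return (np.abs(np.diff(img, axis=0)).sum()
            + np.abs(np.diff(img, axis=1)).sum())

def composite_loss(z, G, K, y, lam1, lam2):
    """F(z) = 1/2 ||K G(z) - y||^2 + lam1 ||z||^2 + lam2 TV(G(z))."""
    x = G(z)                       # image generated from the latent code
    residual = K(x) - y            # data-fidelity residual in sinogram space
    return (0.5 * np.sum(residual ** 2)
            + lam1 * np.sum(z ** 2)
            + lam2 * total_variation(x))

def cosine_lr(step, total_steps, lr_max, lr_min=0.0):
    """Cosine-annealed learning rate: starts at lr_max (aggressive early
    exploration) and decays to lr_min (fine-grained refinement)."""
    return lr_min + 0.5 * (lr_max - lr_min) * (
        1.0 + np.cos(np.pi * step / total_steps))
```

In practice the gradient of `composite_loss` with respect to z is obtained by back-propagating through G (e.g. with an autodiff framework) and fed to Adam, with `cosine_lr` supplying the step size at each iteration.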
Experimental validation uses an augmented version of the publicly available Mayo Clinic chest CT dataset. The diffusion model is pre‑trained on 2‑D slices, and reconstruction performance is evaluated under 16, 32, and 64 view sparse‑view settings. Baselines include conventional FBP, TV‑regularized FBP, a VAE‑based DGP, and the recent DMPlug method that also integrates diffusion models with DGP. Quantitative metrics (PSNR, SSIM) show that RD‑DGP consistently outperforms all baselines, with gains of up to 3 dB in PSNR and 0.07 in SSIM in the most extreme 16‑view case. Qualitative visual inspection confirms reduced streaking artifacts, sharper anatomical edges, and better preservation of fine structures. Moreover, the cosine‑annealing schedule accelerates convergence by roughly 30 % compared to a fixed learning‑rate scheme.
The authors acknowledge limitations: the study is confined to 2‑D reconstructions, and extending the approach to full 3‑D volumes may pose memory and computational challenges. Additionally, the diffusion model is trained on a relatively modest medical dataset, raising questions about generalization to other anatomies or imaging modalities. Future work is suggested in the directions of multi‑scale or conditional diffusion models, hardware‑accelerated 3‑D implementations, and broader clinical validation across diverse datasets.
Importantly, the authors release the pre‑trained diffusion weights and the full source code (https://github.com/devangelista2/RD‑DGP) to promote reproducibility and foster further research. In summary, by integrating a physics‑driven initialization, dual regularization, and an adaptive learning‑rate strategy, the paper presents a robust and effective diffusion‑based DGP method that markedly improves sparse‑view CT reconstruction quality while preserving the interpretability and flexibility of model‑based approaches.