GroupCDL: Interpretable Denoising and Compressed Sensing MRI via Learned Group-Sparsity and Circulant Attention
Nonlocal self-similarity within images has become an increasingly popular prior in deep-learning models. Despite their successful image restoration performance, such models remain largely uninterpretable due to their black-box construction. Our previous studies have shown that interpretable construction of a fully convolutional denoiser (CDLNet), with performance on par with state-of-the-art black-box counterparts, is achievable by unrolling a convolutional dictionary learning algorithm. In this manuscript, we seek an interpretable construction of a convolutional network with a nonlocal self-similarity prior that performs on par with black-box nonlocal models. We show that such an architecture can be effectively achieved by upgrading the L1 sparsity prior (soft-thresholding) of CDLNet to an image-adaptive group-sparsity prior (group-thresholding). The proposed learned group-thresholding makes use of nonlocal attention to perform spatially varying soft-thresholding on the latent representation. To enable effective training and inference on large images with global artifacts, we propose a novel circulant-sparse attention. We achieve competitive natural-image denoising performance compared to black-box nonlocal DNNs and transformers. The interpretable construction of our network allows for a straightforward extension to Compressed Sensing MRI (CS-MRI), yielding state-of-the-art performance. Lastly, we show robustness to noise-level mismatches between training and inference for denoising and CS-MRI reconstruction.
💡 Research Summary
This paper introduces GroupCDL, an interpretable convolutional neural network that incorporates non‑local self‑similarity (NLSS) priors for image denoising and compressed‑sensing MRI (CS‑MRI). Building on the authors’ earlier CDLNet, which unrolled a convolutional dictionary learning (CDL) algorithm with an ℓ₁ sparsity prior, the new architecture upgrades the soft‑thresholding step to an image‑adaptive group‑thresholding (GT) operation. GT implements a group‑sparsity regularizer defined by an adjacency matrix Γ that captures pixel‑wise similarity in the latent representation. Instead of using a fixed binary Γ, the network learns Γ on the fly, effectively turning the proximal operator into an attention‑driven, spatially varying thresholding mechanism.
To compute Γ efficiently for large images, the authors propose circulant‑sparse attention (CircAtt). CircAtt constructs a block‑circulant sparse matrix that encodes sliding‑window similarities across the whole image, reducing the quadratic (in image size) cost of traditional patch‑based dense attention (PbDA) to a cost linear in image size, scaled by the window size, while preserving shift‑invariance. This design enables end‑to‑end training on full‑resolution images and mitigates the redundancy and boundary artifacts inherent in overlapping‑patch processing.
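A minimal 1‑D sketch of the idea behind CircAtt (the function name and the negative‑squared‑distance similarity are illustrative assumptions, not the paper's exact construction): similarities are stored only for pixel pairs within a sliding window, with circular indexing, so the adjacency is a sparse banded/circulant matrix instead of a dense N × N one.

```python
import numpy as np
from scipy.sparse import coo_matrix

def circulant_sparse_attention(z, window=3):
    """Illustrative 1-D circulant-sparse attention: build a row-normalized
    sparse adjacency over a sliding window of radius `window`.
    z: 1-D feature signal of length n."""
    n = z.shape[0]
    rows, cols, vals = [], [], []
    for offset in range(-window, window + 1):
        i = np.arange(n)
        j = (i + offset) % n               # circular indexing -> shift-invariance
        sim = -(z[i] - z[j]) ** 2          # assumed similarity: neg. sq. distance
        rows.append(i); cols.append(j); vals.append(sim)
    rows = np.concatenate(rows); cols = np.concatenate(cols)
    vals = np.concatenate(vals)
    # only (2*window + 1) * n entries are ever stored, vs. n^2 for dense attention
    A = coo_matrix((np.exp(vals), (rows, cols)), shape=(n, n)).tocsr()
    row_sums = np.asarray(A.sum(axis=1)).ravel()
    return A.multiply(1.0 / row_sums[:, None]).tocsr()  # rows sum to 1 (softmax)
```

The key point is the storage pattern: the nonzeros lie on 2·window + 1 circulant diagonals, so both memory and matrix-vector cost scale linearly with image size.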
Mathematically, the denoising problem is cast as a Basis Pursuit Denoising (BPDN) formulation solved by a proximal‑gradient method (PGM). The regularizer ψ is the group‑sparsity term ψ(z) = ‖√((Iₘ ⊗ Γ) z²)‖₁, where the square and square root act elementwise, and its proximal operator is approximated by GT:
GT_τ(z; Γ) = z ⊙ max(1 − τ ⊘ √((Iₘ ⊗ Γ) z²), 0),
with ⊙ and ⊘ denoting elementwise multiplication and division.
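The GT operator above can be sketched in a few lines of NumPy (a simplified sketch: z is stored as an M × N array of M channels over N pixels, and Γ is a dense N × N adjacency rather than the sparse one used in practice):

```python
import numpy as np

def group_threshold(z, gamma, tau):
    """GT_tau(z; Gamma): spatially varying soft-thresholding.
    z: (M, N) latent representation; gamma: (N, N) adjacency; tau: threshold."""
    # [(I_M ⊗ Γ) z²]^(1/2): apply gamma to each channel's squared coefficients
    group_norm = np.sqrt(z**2 @ gamma.T)
    # shrinkage factor max(1 - tau ⊘ group_norm, 0), guarded against divide-by-zero
    scale = np.maximum(1.0 - tau / np.maximum(group_norm, 1e-12), 0.0)
    return z * scale
```

A useful sanity check: with Γ equal to the identity, the group norm reduces to |z| and GT reduces to ordinary elementwise soft‑thresholding, sign(z) · max(|z| − τ, 0), recovering CDLNet's ℓ₁ prior.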
Each unrolled layer performs a gradient step on the data‑fidelity term followed by GT, with both the convolutional dictionary D and Γ learned jointly. This yields a fully interpretable pipeline where every operation corresponds to a known step in classical sparse coding.
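One unrolled layer can be sketched as follows (a toy vector-form sketch: D is a dense matrix standing in for the learned convolutional dictionary, `gt` restates the group-thresholding prox in vector form, and η is the gradient step size; all names are illustrative):

```python
import numpy as np

def gt(z, gamma, tau):
    # group-thresholding prox: soft-thresholding scaled by a
    # gamma-smoothed group norm (z: length-K code, gamma: K x K adjacency)
    norm = np.sqrt(gamma @ z**2)
    return z * np.maximum(1.0 - tau / np.maximum(norm, 1e-12), 0.0)

def groupcdl_layer(z, y, D, gamma, tau, eta):
    """One unrolled PGM layer: gradient step on the data-fidelity term
    0.5 * ||D z - y||^2, followed by group-thresholding."""
    z = z - eta * D.T @ (D @ z - y)   # gradient step on data fidelity
    return gt(z, gamma, tau)          # proximal step on the group-sparsity prior
```

Stacking such layers, with D, τ, η, and the similarity network producing Γ all trained end‑to‑end, gives the interpretable pipeline described above: every learned operation still corresponds to a step of classical proximal-gradient sparse coding.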
Extensive experiments demonstrate that GroupCDL matches or exceeds state‑of‑the‑art black‑box models on standard denoising benchmarks (Set12, BSD68, Urban100). Notably, the network exhibits strong robustness to mismatches between training and test noise levels, a property inherited from CDLNet’s noise‑adaptive thresholding. In CS‑MRI, GroupCDL outperforms recent deep learning reconstructions on fastMRI and Stanford Knee datasets across multiple acceleration factors (4×, 8×). The circulant attention efficiently handles global aliasing artifacts, and the overall parameter count is substantially lower than that of transformer‑based baselines.
The authors also release a Julia implementation, emphasizing reproducibility and computational efficiency. They argue that the combination of learned group‑sparsity and circulant attention provides a principled, interpretable alternative to black‑box non‑local networks, with potential extensions to other inverse problems such as super‑resolution and JPEG artifact removal. In summary, GroupCDL delivers a transparent, mathematically grounded architecture that achieves competitive performance on both denoising and CS‑MRI while offering superior robustness and efficiency.