VHU-Net: Variational Hadamard U-Net for Body MRI Bias Field Correction
Bias field artifacts in magnetic resonance imaging (MRI) scans introduce spatially smooth intensity inhomogeneities that degrade image quality and hinder downstream analysis. To address this challenge, we propose a novel variational Hadamard U-Net (VHU-Net) for effective body MRI bias field correction. The encoder comprises multiple convolutional Hadamard transform blocks (ConvHTBlocks), each integrating convolutional layers with a Hadamard transform (HT) layer. Specifically, the HT layer performs channel-wise frequency decomposition to isolate low-frequency components, while a subsequent scaling layer and semi-soft thresholding mechanism suppress redundant high-frequency noise. To compensate for the HT layer’s inability to model inter-channel dependencies, the decoder incorporates an inverse HT-reconstructed transformer block, enabling global, frequency-aware attention for the recovery of spatially consistent bias fields. The stacked decoder ConvHTBlocks further enhance the capacity to reconstruct the underlying ground-truth bias field. Building on the principles of variational inference, we formulate a new evidence lower bound (ELBO) as the training objective, promoting sparsity in the latent space while ensuring accurate bias field estimation. Comprehensive experiments on body MRI datasets demonstrate the superiority of VHU-Net over existing state-of-the-art methods in terms of intensity uniformity. Moreover, the corrected images yield substantial downstream improvements in segmentation accuracy. Our framework offers computational efficiency, interpretability, and robust performance across multi-center datasets, making it suitable for clinical deployment.
💡 Research Summary
The paper introduces VHU‑Net, a novel deep‑learning framework designed to correct bias field artifacts in body magnetic resonance imaging (MRI). Bias fields are smooth, low‑frequency multiplicative distortions that cause spatial intensity inhomogeneities, severely degrading image quality and hampering downstream tasks such as tissue segmentation. Traditional correction methods like N4ITK rely on strong assumptions of intensity uniformity across tissues, which often fail in heterogeneous body regions (abdomen, prostate, breast). Recent deep‑learning approaches improve speed and flexibility but typically depend on multiple annotations, synthetic bias generation, or solely on spatial smoothness, limiting their robustness across diverse anatomies.
VHU‑Net addresses these gaps by integrating three key innovations into a U‑Net‑style encoder‑decoder architecture. First, the encoder uses Convolutional Hadamard Transform blocks (ConvHTBlocks). Each block applies a standard 2‑D convolution followed by a channel‑wise Hadamard transform (HT). Because the HT matrix consists only of +1 and –1 entries, it is computationally cheap and concentrates most signal energy into a few low‑frequency coefficients. After the HT, a learnable scaling layer and a semi‑soft thresholding operator adaptively suppress high‑frequency noise while preserving the low‑frequency components that correspond to the bias field. This joint learning of convolutional filters and HT parameters enables the network to tailor frequency decomposition to the data.
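The frequency-domain machinery of a ConvHTBlock can be sketched in a few lines. The following NumPy snippet is an illustrative reconstruction, not the paper's implementation: it builds a Sylvester Hadamard matrix (entries ±1), applies it channel-wise with orthonormal scaling, and applies a semi-soft threshold whose interpolation parameter `alpha` (a name introduced here) blends soft and hard thresholding. In the paper, the scaling layer and threshold are learned; here they are fixed for clarity.

```python
import numpy as np

def hadamard_matrix(n: int) -> np.ndarray:
    """Sylvester construction: n must be a power of two; entries are +1/-1."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def channelwise_ht(x: np.ndarray) -> np.ndarray:
    """Apply the orthonormal Hadamard transform along the channel axis.

    x: array of shape (C, H, W) with C a power of two.
    The orthonormal HT is its own inverse, so applying it twice recovers x.
    """
    C = x.shape[0]
    H = hadamard_matrix(C) / np.sqrt(C)  # orthonormal scaling
    return np.tensordot(H, x, axes=([1], [0]))

def semi_soft_threshold(y: np.ndarray, t: float, alpha: float = 0.5) -> np.ndarray:
    """Illustrative semi-soft thresholding: zero magnitudes below t, shrink
    the rest by alpha*t (alpha=1 gives soft, alpha=0 gives hard thresholding)."""
    mag = np.abs(y)
    return np.where(mag <= t, 0.0, np.sign(y) * (mag - alpha * t))
```

Because the orthonormal HT is self-inverse, the same `channelwise_ht` call serves for both the forward transform in the encoder and the inverse transform in the decoder.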
Second, the decoder compensates for the HT’s inability to model inter‑channel dependencies by inserting an Inverse‑HT‑Reconstructed Transformer Block (IHTR‑TB). The IHTR‑TB first performs an inverse HT to return to the spatial domain, then employs a transformer layer that provides global, frequency‑aware attention across all positions. This mechanism captures long‑range relationships among the low‑frequency coefficients, ensuring that the reconstructed bias field is spatially coherent. A second semi‑soft thresholding step further refines any residual high‑frequency artifacts.
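The attention ingredient of the IHTR-TB can be illustrated with a minimal single-head self-attention over flattened spatial positions. This is a simplified sketch, assuming one head and omitting the multi-head projections, layer normalization, and feed-forward sublayers that a full transformer layer (and presumably the paper's block) would include; the function names are introduced here for illustration.

```python
import numpy as np

def softmax(a: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax."""
    e = np.exp(a - a.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_self_attention(x: np.ndarray, Wq, Wk, Wv) -> np.ndarray:
    """Single-head scaled dot-product attention over all positions.

    x: (N, d) array of N flattened spatial positions with d channels.
    Every output position attends to every input position, which is what
    gives the block its global receptive field.
    """
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    A = softmax(Q @ K.T / np.sqrt(Q.shape[-1]))  # (N, N) attention weights
    return A @ V
```

The key point the sketch captures is that each reconstructed position is a weighted mixture of all positions, which is how the decoder enforces spatial coherence in the estimated bias field.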
Third, the training objective is derived from variational inference. The authors formulate bias‑field correction as a latent variable model: given an observed image x, they aim to maximize the marginal log‑likelihood of the true bias field z. Because direct marginalization is intractable, they introduce an approximate posterior q(y|x) modeled by the encoder, where y is a latent representation in Hadamard space. The Evidence Lower Bound (ELBO) becomes
ELBO = 𝔼_{q(y|x)}[log p(z|y)] − KL(q(y|x) ‖ p(y)),

where the expectation term rewards accurate reconstruction of the bias field from the Hadamard-space latent y, and the KL term regularizes the approximate posterior toward the prior p(y). Maximizing this bound therefore trains the network to estimate the bias field accurately while, per the paper's design, promoting sparsity in the latent representation.
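As a concrete (negative) ELBO training loss, one illustrative instantiation uses a Gaussian reconstruction term (mean squared error against the ground-truth bias field) and the closed-form KL between a diagonal-Gaussian posterior and a standard normal prior. This is the textbook VAE form, offered only as a sketch; the paper's actual prior is presumably chosen to promote sparsity and may differ.

```python
import numpy as np

def neg_elbo_loss(z_hat: np.ndarray, z_true: np.ndarray,
                  mu: np.ndarray, logvar: np.ndarray) -> float:
    """Negative ELBO under illustrative assumptions (not the paper's exact
    objective): MSE reconstruction term plus KL(q(y|x) || N(0, I)) for a
    diagonal-Gaussian posterior with mean mu and log-variance logvar."""
    recon = np.mean((z_hat - z_true) ** 2)
    kl = -0.5 * np.mean(1 + logvar - mu**2 - np.exp(logvar))
    return recon + kl
```

Minimizing this loss is equivalent to maximizing the ELBO: the reconstruction term drives accurate bias field estimation while the KL term keeps the latent code close to the prior.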