WALINET: A water and lipid identification convolutional Neural Network for nuisance signal removal in 1H MR Spectroscopic Imaging
Purpose. Proton Magnetic Resonance Spectroscopic Imaging (1H-MRSI) provides non-invasive spectral-spatial mapping of metabolism. However, long-standing problems in whole-brain 1H-MRSI are spectral overlap of metabolite peaks with large lipid signal from scalp, and overwhelming water signal that distorts spectra. Fast and effective methods are needed for high-resolution 1H-MRSI to accurately remove lipid and water signals while preserving the metabolite signal. The potential of supervised neural networks for this task remains unexplored, despite their success for other MRSI processing. Methods. We introduce a deep-learning method based on a modified Y-NET network for water and lipid removal in whole-brain 1H-MRSI. The WALINET (WAter and LIpid neural NETwork) was compared to conventional methods such as the state-of-the-art lipid L2 regularization and Hankel-Lanczos singular value decomposition (HLSVD) water suppression. Methods were evaluated on simulated and in-vivo whole-brain MRSI using NRMSE, SNR, CRLB, and FWHM metrics. Results. WALINET is significantly faster and needs 8 s for high-resolution whole-brain MRSI, compared to 42 minutes for conventional HLSVD+L2. Quantitative analysis shows WALINET has better performance than HLSVD+L2: 1) more lipid removal with 41% lower NRMSE, 2) better metabolite signal preservation with 71% lower NRMSE in simulated data, 155% higher SNR and 50% lower CRLB in in-vivo data. Metabolic maps obtained by WALINET in healthy subjects and patients show better gray/white-matter contrast with more visible structural details. Conclusions. WALINET has superior performance for nuisance signal removal and metabolite quantification on whole-brain 1H-MRSI compared to conventional state-of-the-art techniques. This represents a new application of deep-learning for MRSI processing, with potential for automated high-throughput workflow.
💡 Research Summary
The paper introduces WALINET, a deep‑learning framework designed to simultaneously suppress the dominant water and lipid signals that contaminate whole‑brain proton magnetic resonance spectroscopic imaging (¹H‑MRSI). Conventional post‑processing pipelines treat water and lipid removal separately: water is typically eliminated with Hankel‑Lanczos singular value decomposition (HLSVD), while lipid contamination is reduced using an L2‑regularized linear projection (1‑L). Both approaches require careful parameter tuning, are computationally intensive, and often fail to remove both nuisance components in a single step, especially at ultra‑high field (7 T) where B₀/B₁⁺ inhomogeneities exacerbate the problem.
WALINET builds on a modified Y‑Net architecture that incorporates two parallel encoders. Encoder E₁ receives the original spectrum x₁ = m + l + w (metabolite + lipid + water), while Encoder E₂ receives the same spectrum after the L2 lipid projection operator has been applied, x₂ = (1‑L) x₁. The two feature streams are concatenated and fed into a decoder that outputs an estimate of the nuisance signal, y ≈ l + w. Subtracting y from the original spectrum yields the cleaned metabolite spectrum m̃ = x₁ − y. This dual‑input strategy enables the network to learn the distinct spectral signatures of water, lipids, and metabolites while exploiting contextual information from both inputs.
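The data flow described above can be illustrated with a small NumPy sketch. Everything here is a toy assumption rather than the authors' code: the signals are random stand-ins, `lipid_basis` and `beta` are hypothetical, the operator form (I + β·B·Bᴴ)⁻¹ is one common way to write L2 lipid regularization and may differ from the paper's exact operator, and `network` is a placeholder that returns the ideal nuisance estimate instead of a trained model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64  # hypothetical number of spectral points

# Toy signal components (random stand-ins, not real spectra).
m = rng.normal(size=n) + 1j * rng.normal(size=n)                 # metabolites
lipid_basis = rng.normal(size=(n, 3)) + 1j * rng.normal(size=(n, 3))
l = lipid_basis @ np.array([40.0, -25.0, 30.0])                  # large lipid signal
w = 100.0 * np.exp(-0.5 * ((np.arange(n) - 10) / 2.0) ** 2)      # large water peak

x1 = m + l + w  # input to encoder E1

# Assumed form of the L2 lipid operator: (1 - L) = (I + beta * B B^H)^-1
beta = 10.0
one_minus_L = np.linalg.inv(np.eye(n) + beta * lipid_basis @ lipid_basis.conj().T)
x2 = one_minus_L @ x1  # input to encoder E2

def network(x1, x2):
    """Placeholder for the trained Y-Net: returns the ideal estimate y = l + w."""
    return l + w

y = network(x1, x2)   # predicted nuisance signal
m_tilde = x1 - y      # cleaned metabolite spectrum
```

With a perfect nuisance estimate the subtraction recovers `m` exactly; in practice the quality of `m_tilde` depends entirely on how well the trained network predicts `y`.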
Training data were generated by combining 1.9 × 10⁶ simulated metabolite spectra (25 common brain metabolites, varied concentrations, linewidths, noise levels, and baselines) with experimentally measured water and lipid spectra extracted from 19 subjects (including two glioma patients). This hybrid approach captures realistic variability that pure simulation cannot reproduce, such as phase distortions, B₀/B₁⁺ inhomogeneity, and scalp‑origin lipid broadening. Spectra were augmented online by random complex phase multiplication and normalized by the estimated energy of the underlying metabolite signal. Real and imaginary parts were treated as separate channels. The network was trained for 400 epochs using the Adam optimizer (initial LR = 0.01, halved every 50 epochs) with a mean‑squared‑error loss computed on both channels.
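As a rough illustration of the training details above, the following sketch shows the step learning-rate schedule (0.01, halved every 50 epochs) and the phase augmentation with real/imaginary channel packing. The function names are hypothetical, the phase is assumed to be drawn uniformly from [0, 2π), and the normalization (division by the square root of the metabolite signal energy) is one plausible reading of the summary, not a confirmed detail.

```python
import numpy as np

def learning_rate(epoch, initial_lr=0.01, step=50, factor=0.5):
    """Step schedule: start at initial_lr and halve every `step` epochs."""
    return initial_lr * factor ** (epoch // step)

def augment_and_pack(spectrum, metabolite, rng):
    """Apply a random global phase, normalize, and split into two channels."""
    phase = rng.uniform(0.0, 2.0 * np.pi)        # assumed phase distribution
    rotated = spectrum * np.exp(1j * phase)
    # Assumed normalization: square root of the metabolite signal energy.
    scale = np.sqrt(np.sum(np.abs(metabolite) ** 2))
    normalized = rotated / scale
    # Channel 0 holds the real part, channel 1 the imaginary part.
    return np.stack([normalized.real, normalized.imag])

rng = np.random.default_rng(1)
metabolite = rng.normal(size=32) + 1j * rng.normal(size=32)
spectrum = metabolite + 50.0  # toy nuisance offset
packed = augment_and_pack(spectrum, metabolite, rng)
```

Because the phase factor has unit modulus, the augmentation changes only the split between the two channels and leaves the magnitude spectrum untouched.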
Architecturally, each encoder and the decoder consist of four convolutional blocks (kernel = 7, two Conv layers per block, PReLU activation, dropout = 0.01, MaxPooling/Up‑sampling with factor 2). The number of feature maps starts at 16 and doubles after each down‑sampling step. Additional convolutional layers are placed at the bottleneck and after the decoder to enhance representation capacity. The Y‑Net design, compared with a standard U‑Net, yields superior contextual integration, which is crucial for disentangling overlapping spectral components.
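To make the channel and length progression concrete, here is a small shape walk-through for one encoder. It assumes length-preserving ("same") padding in the convolutions and pooling by a factor of 2 between blocks; the input length of 1024 points is hypothetical, and `encoder_shapes` is an illustrative helper, not part of any published code.

```python
def encoder_shapes(length, n_blocks=4, base_channels=16, pool=2):
    """Return (channels, length) at the output of each encoder block.

    Assumes convolutions preserve length and that pooling by `pool`
    follows each block, with the channel count doubling afterwards.
    """
    shapes = []
    channels = base_channels
    for _ in range(n_blocks):
        shapes.append((channels, length))
        length //= pool
        channels *= 2
    return shapes

# Hypothetical 1024-point spectrum with 2 input channels (real, imaginary).
shapes = encoder_shapes(1024)
for block, (channels, length) in enumerate(shapes, start=1):
    print(f"block {block}: {channels} feature maps x {length} points")
```

Under these assumptions the four blocks produce 16, 32, 64, and 128 feature maps, and the decoder would mirror this progression in reverse through up-sampling.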
Performance was evaluated on both simulated test sets and in‑vivo 2D Cartesian and 3D ECCENTRIC MRSI acquisitions at 7 T. In simulations, WALINET achieved an interquartile NRMSE of 0.86‑2.69 % for lipid removal versus 3.68‑6.45 % for L2 regularization, and 0.62‑1.45 % for metabolite preservation versus 1.04‑4.11 % for L2. In real 3D data, WALINET processed a full‑brain volume in ~8 seconds, whereas the conventional HLSVD + L2 pipeline required ~42 minutes. Quantitatively, metabolite signal‑to‑noise ratio (SNR) improved by 155 % and Cramér‑Rao lower bounds (CRLB) decreased by 50 % relative to the traditional approach, indicating more reliable LCModel fitting. Metabolic maps (NAA, Cr, Cho, etc.) displayed enhanced gray‑matter/white‑matter contrast and finer structural detail, underscoring the clinical relevance of the method.
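The reported figures can be put in context with a short calculation. The speed-up follows directly from the quoted times (42 minutes versus 8 seconds). The NRMSE function below uses one common convention (error norm divided by reference norm, in percent); the summary does not state the paper's exact definition, so treat it as an assumption.

```python
import numpy as np

def nrmse_percent(estimate, reference):
    """NRMSE in percent, assuming normalization by the reference norm."""
    return 100.0 * np.linalg.norm(estimate - reference) / np.linalg.norm(reference)

# Speed-up implied by the reported processing times.
hlsvd_l2_seconds = 42 * 60   # about 42 minutes
walinet_seconds = 8          # about 8 seconds
speedup = hlsvd_l2_seconds / walinet_seconds

# Toy check: a uniform 1 % amplitude error gives an NRMSE of 1 %.
reference = np.linspace(1.0, 2.0, 100)
estimate = 1.01 * reference
toy_error = nrmse_percent(estimate, reference)
```

On these numbers the network is roughly 315 times faster than the conventional pipeline for a full-brain 3D volume.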
The authors also introduced LIPNET, a variant trained only for lipid removal (water omitted), demonstrating that the Y‑Net framework can be flexibly adapted to single‑nuisance tasks. Generalization was tested by applying the 3D‑trained model to 2D data, where it still outperformed conventional lipid suppression, confirming robustness across acquisition schemes.
Limitations include the current focus on 7 T data with a 1Tx/32Rx head coil; transfer to other field strengths or coil configurations would likely require retraining. Moreover, while water and lipid are the dominant contaminants, other artifacts (e.g., motion, eddy currents) remain unaddressed. Future work could extend the dual‑input concept to incorporate additional priors or multi‑modal inputs (e.g., anatomical masks) to achieve comprehensive artifact mitigation.
In summary, WALINET represents a significant advance in MRSI post‑processing: it replaces labor‑intensive, parameter‑heavy linear methods with a fast (seconds), fully automated deep‑learning solution that simultaneously suppresses water and lipid signals while preserving metabolite fidelity. This enables high‑throughput, high‑resolution whole‑brain spectroscopic imaging, paving the way for broader clinical adoption of ¹H‑MRSI in neuro‑oncology, psychiatry, and other brain disorders.