VAMOS-OCTA: Vessel-Aware Multi-Axis Orthogonal Supervision for Inpainting Motion-Corrupted OCT Angiography Volumes

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

Handheld Optical Coherence Tomography Angiography (OCTA) enables noninvasive retinal imaging in uncooperative or pediatric subjects, but is highly susceptible to motion artifacts that severely degrade volumetric image quality. Sudden motion during 3D acquisition can lead to unsampled retinal regions across entire B-scans (cross-sectional slices), resulting in blank bands in en face projections. We propose VAMOS-OCTA, a deep learning framework for inpainting motion-corrupted B-scans using vessel-aware multi-axis supervision. We employ a 2.5D U-Net architecture that takes a stack of neighboring B-scans as input to reconstruct a corrupted center B-scan, guided by a novel Vessel-Aware Multi-Axis Orthogonal Supervision (VAMOS) loss. This loss combines vessel-weighted intensity reconstruction with axial and lateral projection consistency, encouraging vascular continuity in native B-scans and across orthogonal planes. Unlike prior work that focuses primarily on restoring the en face MIP, VAMOS-OCTA jointly enhances both cross-sectional B-scan sharpness and volumetric projection accuracy, even under severe motion corruptions. We trained our model on both synthetic and real-world corrupted volumes and evaluated its performance using both perceptual quality and pixel-wise accuracy metrics. VAMOS-OCTA consistently outperforms prior methods, producing reconstructions with sharp capillaries, restored vessel continuity, and clean en face projections. These results demonstrate that multi-axis supervision offers a powerful constraint for restoring motion-degraded 3D OCTA data. Our source code is available at https://github.com/MedICL-VU/VAMOS-OCTA.


💡 Research Summary

Handheld optical coherence tomography angiography (OCTA) enables high‑resolution, dye‑free imaging of retinal microvasculature in uncooperative or pediatric patients, but its slice‑by‑slice acquisition makes it extremely vulnerable to sudden bulk motion. When motion occurs, entire B‑scans can be unsampled, producing blank bands in the en‑face maximum‑intensity projection (MIP) and severely degrading diagnostic utility. Existing post‑acquisition remedies either rely on 2‑D generative models that lack inter‑slice consistency, or focus solely on improving the en‑face projection while neglecting the fidelity of individual cross‑sectional B‑scans. Consequently, reconstructions often appear over‑smoothed, lose fine capillary detail, or exhibit horizontal banding artifacts.

The authors introduce VAMOS‑OCTA, a deep‑learning framework that jointly restores the missing B‑scan and enforces vascular continuity across orthogonal planes. The core network is a 2.5‑D U‑Net that receives a stack of nine neighboring B‑scans (including potentially corrupted slices) and predicts the central, motion‑corrupted B‑scan. By operating in 2.5‑D, the model preserves depth information while exploiting contextual cues from adjacent slices without requiring a full 3‑D convolutional backbone.
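The 2.5‑D input described above can be sketched as follows. This is a minimal illustration only: the channel‑stacking of the nine slices and the edge‑clamping behavior at the volume boundary are assumptions, not details taken from the paper.

```python
import numpy as np

def make_25d_input(volume: np.ndarray, target: int, context: int = 4) -> np.ndarray:
    """Stack 2*context+1 neighboring B-scans around `target` as channels.

    volume: (num_bscans, depth, lateral) OCTA volume.
    Returns a (2*context+1, depth, lateral) array. Indices falling outside
    the volume are clamped to the nearest valid slice (an edge-handling
    choice assumed here for illustration).
    """
    n = volume.shape[0]
    idx = np.clip(np.arange(target - context, target + context + 1), 0, n - 1)
    return volume[idx]

# Toy example: 20 B-scans, each 145 (depth) x 400 (lateral) pixels
vol = np.random.rand(20, 145, 400).astype(np.float32)
x = make_25d_input(vol, target=10)   # shape (9, 145, 400)
```

The nine-channel stack is then fed to the U‑Net as a multi-channel 2‑D image, which is what lets the model exploit inter-slice context without 3‑D convolutions.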

Training data consist of seven high‑resolution retinal volumes (1000 B‑scans each, 400 × 145 pixels) acquired from awake patients using a custom handheld OCT/OCT‑A probe. To obtain paired corrupted‑clean examples, the authors simulate bulk motion by randomly removing the target slice and a geometrically‑distributed number (p = 0.4, max = 6) of neighboring slices, thereby creating contiguous gaps of varying length. This corruption is applied dynamically at each epoch, ensuring the network sees a wide variety of motion patterns; fixed masks are used for validation and testing to guarantee fair comparisons.
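The on-the-fly corruption scheme can be sketched roughly as below. The paper specifies only the geometric distribution (p = 0.4, capped at 6) of dropped neighbors; the zero-filling of dropped slices and the random split of the gap before/after the target are assumptions made for illustration.

```python
import numpy as np

def simulate_dropout(volume: np.ndarray, target: int, rng: np.random.Generator,
                     p: float = 0.4, max_extra: int = 6):
    """Zero out the target B-scan plus a geometric number of neighbors.

    Returns the corrupted copy and the (start, stop) range of the
    contiguous gap. Gap placement is an assumed detail.
    """
    corrupted = volume.copy()
    extra = min(int(rng.geometric(p)), max_extra)   # extra slices to drop
    before = int(rng.integers(0, extra + 1))        # dropped slices preceding target
    start = max(0, target - before)
    stop = min(volume.shape[0], target + (extra - before) + 1)
    corrupted[start:stop] = 0.0                     # simulate unsampled B-scans
    return corrupted, (start, stop)
```

Because the mask is resampled every epoch, the network sees gaps of varying length and position during training, while fixed masks at validation/test time keep comparisons fair.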

The novel VAMOS loss combines three complementary terms:

  1. Vessel‑weighted MSE (wMSE) – a pixel‑wise L2 loss multiplied by a spatial weight that emphasizes true vessel intensities (target‑based term) while penalizing hallucinated bright spots (prediction‑based term). This follows the SOAD approach but adds a sub‑linear exponent to temper outliers.

  2. Axial projection losses – L1 losses on both the maximum‑intensity projection (MIP) and average‑intensity projection (AIP) computed along the depth axis. These constraints directly shape the en‑face view, encouraging realistic vessel density and peak intensities.

  3. Lateral projection losses – analogous MIP and AIP losses computed across the lateral dimension of each B‑scan. By regularizing the vessel distribution within individual slices, these terms suppress the horizontal banding that arises when only axial supervision is used.

The total loss is \(L_{\mathrm{VAMOS}} = L_{\mathrm{wMSE}} + \lambda_{\mathrm{proj}}\,(L_{\mathrm{axial}}^{\mathrm{MIP}} + L_{\mathrm{axial}}^{\mathrm{AIP}} + L_{\mathrm{lateral}}^{\mathrm{MIP}} + L_{\mathrm{lateral}}^{\mathrm{AIP}})\) with \(\lambda_{\mathrm{proj}} = 3\), giving higher priority to projection consistency.
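The combined loss can be sketched for a single predicted B-scan as below. The sub-linear exponent value (0.5) and the exact form of the vessel weight are illustrative assumptions capturing the spirit of the wMSE term, not the paper's precise formulation; intensities are assumed non-negative.

```python
import numpy as np

def vamos_loss(pred: np.ndarray, tgt: np.ndarray, lam_proj: float = 3.0) -> float:
    """Sketch of the VAMOS loss for one (depth x lateral) B-scan.

    Combines a vessel-weighted MSE with L1 losses on MIP and AIP
    projections along the axial (depth) and lateral axes.
    """
    # Vessel-aware weight: emphasize bright (vessel) pixels in either
    # target or prediction; 0.5 is an assumed sub-linear exponent.
    w = np.maximum(tgt, pred) ** 0.5
    wmse = np.mean(w * (pred - tgt) ** 2)

    def proj_l1(axis: int) -> float:
        mip = np.abs(pred.max(axis=axis) - tgt.max(axis=axis)).mean()
        aip = np.abs(pred.mean(axis=axis) - tgt.mean(axis=axis)).mean()
        return mip + aip

    # axis 0 = depth (axial -> en-face row), axis 1 = lateral
    return float(wmse + lam_proj * (proj_l1(0) + proj_l1(1)))
```

A perfect reconstruction drives every term to zero, while the projection terms penalize any prediction whose en-face or within-slice vessel profile drifts from the target even when pixel-wise error is small.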

Quantitative evaluation uses perceptual metrics (LPIPS, Laplacian blur difference, Sobel edge preservation) for B‑scan quality and pixel‑wise metrics (L1, Mean Intensity Error, SSIM, NCC, PSNR) for the en‑face MIP. Compared to the baseline MSE, SOAD wMSE, and a version with only axial projection, VAMOS‑OCTA achieves statistically significant improvements across all measures (e.g., LPIPS reduced from 0.608 to 0.510, Sobel edge preservation increased from 0.313 to 0.427, SSIM of MIP improved from 0.888 to 0.895). Qualitative examples show that VAMOS‑OCTA restores sharp capillary contrast, eliminates horizontal banding, and produces smooth, anatomically coherent en‑face projections even when up to six consecutive B‑scans are missing.
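Two of the en-face metrics above (NCC and PSNR) are straightforward to state; a minimal numpy sketch, assuming intensities normalized to [0, 1], is given below (the small epsilon guards against division by zero and is an implementation convenience, not from the paper).

```python
import numpy as np

def ncc(a: np.ndarray, b: np.ndarray) -> float:
    """Normalized cross-correlation between two en-face MIPs."""
    a0, b0 = a - a.mean(), b - b.mean()
    return float((a0 * b0).sum() / (np.linalg.norm(a0) * np.linalg.norm(b0) + 1e-12))

def psnr(a: np.ndarray, b: np.ndarray, data_range: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((a - b) ** 2)
    return float(10 * np.log10(data_range ** 2 / (mse + 1e-12)))

# En-face MIP of a reconstructed volume: max-project over the depth axis
vol = np.random.rand(32, 16, 64)
mip = vol.max(axis=1)        # (n_bscans, lateral)
```

Identical images give NCC of 1 and a PSNR capped only by the epsilon floor; the reported SSIM and LPIPS values come from standard implementations of those metrics.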

The authors also demonstrate robustness on real‑world motion‑corrupted volumes without ground truth, where the method successfully fills large dropout regions and preserves vascular topology. Importantly, the approach does not require explicit 3‑D modeling or volumetric convolutions, making it computationally efficient while still delivering volumetric consistency through orthogonal projection constraints.

In discussion, the paper highlights that multi‑axis orthogonal supervision offers a more effective regularizer than intensity‑only losses, and that the 2.5‑D design balances context utilization with memory efficiency. Limitations include the relatively small subject pool and reliance on synthetic motion for training; future work is suggested on scaling to larger clinical datasets, integrating vessel segmentation downstream, and optimizing the model for real‑time handheld deployment.

Overall, VAMOS‑OCTA presents a compelling solution to the longstanding problem of motion‑induced B‑scan loss in handheld OCTA, delivering both perceptually sharp cross‑sectional images and accurate en‑face vascular maps through a well‑designed multi‑axis loss framework.

