Deep learning models for SAR oil spill segmentation often fail to generalize across regions due to differences in sea state, backscatter statistics, and slick morphology, a limitation that is particularly severe along the Peruvian coast, where labeled Sentinel-1 data remain scarce. To address this problem, we propose MORP-Synth, a two-stage synthetic augmentation framework designed to improve transfer from Mediterranean to Peruvian conditions. Stage A applies Morphological Region Perturbation, a curvature-guided, label-space method that generates realistic geometric variations of oil and look-alike regions. Stage B renders SAR-like textures from the edited masks using a conditional generative INADE model. We compile a Peruvian dataset of 2112 labeled 512×512 patches from 40 Sentinel-1 scenes (2014-2024), harmonized with the Mediterranean CleanSeaNet benchmark, and evaluate seven segmentation architectures. Models pretrained on Mediterranean data degrade from 67.8% to 51.8% mIoU on the Peruvian domain; MORP-Synth improves performance by up to 6 mIoU points and boosts minority-class IoU (+10.8 oil, +14.6 look-alike).
Deep Dive into Enhancing Cross Domain SAR Oil Spill Segmentation via Morphological Region Perturbation and Synthetic Label-to-SAR Generation.
Marine oil spills are critical environmental hazards with long-lasting impacts on marine ecosystems, coastal livelihoods, and the economy. Rapid and accurate detection is essential for timely containment and mitigation measures. Among remote sensing techniques, Synthetic Aperture Radar (SAR), particularly Sentinel-1 imagery, has proven highly effective for oil spill monitoring due to its all-weather, day and night imaging capability [1,2]. However, interpreting SAR images remains challenging: oil slicks appear as low-backscatter regions that are often indistinguishable from look-alike phenomena [3,4] such as low-wind areas, biogenic films, or naturally calm waters. This ambiguity demands automated segmentation techniques capable of reliably separating true oil spills from false positives.
Over the past decade, deep learning has transformed oil spill analysis, outperforming classical thresholding and texture-based approaches. A major milestone was the release of a benchmark dataset containing 1,112 Sentinel-1 scenes with pixel-level labels for oil, look-alikes, sea, land, and ships [5,6]. This benchmark enabled systematic evaluation of convolutional architectures such as U-Net [7] and DeepLabV3+ [8]. Subsequent work introduced multiscale contextual modeling and attention mechanisms to further reduce false positives [9,10,11,12]. These studies consistently demonstrate that modern deep neural networks surpass traditional SAR-based oil detection algorithms.
Despite these advances, geographic domain shift remains a major obstacle. Most segmentation models are trained on regional datasets primarily from European waters and struggle to generalize to distinct environmental regimes [11,13,14]. The Southeast Pacific (Peruvian coast) illustrates this challenge: strong upwelling, the Humboldt Current, and unique wind conditions produce SAR backscatter textures and slick morphologies markedly different from those in the Mediterranean or North Atlantic (Table 1). This mismatch causes foreign-trained models to degrade when applied locally. Furthermore, annotated SAR datasets for South America remain scarce, limiting the applicability of state-of-the-art models. The 2022 Ventanilla/REPSOL spill underscored the urgent need for region-specific monitoring tools [15].
Graphical Abstract. MORP-Synth framework overview.
Another limitation of these models is the scarcity of representative training data in the target domain. Variations in sea surface roughness, oil properties, and incidence angles significantly impact model transferability. In practice, models pretrained on Mediterranean data often confuse Peruvian look-alikes with oil and fail to recognize slicks exhibiting different shapes or scattering signatures. While small-scale fine-tuning can mitigate this [16,17], the limited amount of labeled Peruvian SAR data fails to capture the full variability required for robust generalization.
To overcome data scarcity, recent studies have explored generative augmentation. Approaches such as diffusion-based models and conditional GANs [12,18,19,20] have been used to simulate SAR imagery. However, these methods typically operate in image space or lack explicit control over object morphology. They often fail to generate the specific, irregular slick geometries, such as thin, fragmented trails driven by the Humboldt Current, that are critical for improving generalization under severe domain shift.
Our approach. To address cross-domain generalization, we combine transfer learning with synthetic data augmentation. First, we curate a new SAR dataset of 40 oil spill events along the Peruvian coast (2014-2024), annotated in full alignment with the Mediterranean benchmark. Second, we introduce MORP-Synth, a two-stage augmentation pipeline tailored for data-scarce domains. Stage A applies a Morphological Region Perturbation (MORP) algorithm that modifies oil and look-alike shapes via controlled geometric edits: shifting, rotating, and smoothly warping connected components (Figure 2). Stage B employs a conditional Generative Adversarial Network (cGAN) based on Instance Adaptive De-Normalization (INADE) [21,22] to generate realistic SAR-like textures aligned with the edited labels (Figure 3). By conditioning feature normalization on instance-level statistics, the model achieves high-fidelity synthesis that preserves spatial coherence and label-image consistency. Synthetic samples are then mixed with real Peruvian patches using a controlled synthetic-to-real weighting strategy.
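To make the idea of a label-space edit concrete, the following is a minimal sketch of perturbing connected components in a segmentation mask. It is not the MORP algorithm itself: the curvature guidance and smooth warping described above are omitted, rotation is restricted to multiples of 90° so that the pixel mapping stays exact, and the class ids (`SEA`, `OIL`, `LOOKALIKE`), the function names, and the `max_shift` parameter are hypothetical choices made for illustration.

```python
from collections import deque

import numpy as np

# Hypothetical class ids; the paper's actual label encoding is not given here.
SEA, OIL, LOOKALIKE = 0, 1, 2


def connected_components(binary):
    """Return one (n, 2) array of (row, col) pixels per 4-connected component."""
    h, w = binary.shape
    seen = np.zeros((h, w), dtype=bool)
    comps = []
    for r0, c0 in zip(*np.nonzero(binary)):
        if seen[r0, c0]:
            continue
        seen[r0, c0] = True
        queue, pix = deque([(r0, c0)]), []
        while queue:  # breadth-first flood fill
            r, c = queue.popleft()
            pix.append((r, c))
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w and binary[rr, cc] and not seen[rr, cc]:
                    seen[rr, cc] = True
                    queue.append((rr, cc))
        comps.append(np.array(pix))
    return comps


def perturb_mask(mask, max_shift=8, rotate=True, rng=None):
    """Randomly shift (and optionally rotate by k*90 degrees) each oil/look-alike region."""
    rng = np.random.default_rng() if rng is None else rng
    out = mask.copy()
    h, w = mask.shape
    for cls in (OIL, LOOKALIKE):
        for pix in connected_components(mask == cls):
            dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
            k = int(rng.integers(0, 4)) if rotate else 0
            center = np.rint(pix.mean(axis=0)).astype(int)
            rel = pix - center
            for _ in range(k):  # exact 90-degree rotation about the centroid
                rel = np.stack([-rel[:, 1], rel[:, 0]], axis=1)
            new = rel + center + np.array([dy, dx])
            # Erase the original region, then paint the moved one onto sea only,
            # so other regions in the mask are never overwritten.
            out[pix[:, 0], pix[:, 1]] = SEA
            valid = (new[:, 0] >= 0) & (new[:, 0] < h) & (new[:, 1] >= 0) & (new[:, 1] < w)
            rows, cols = new[valid].T
            free = out[rows, cols] == SEA
            out[rows[free], cols[free]] = cls
    return out
```

Pixels that move outside the patch or land on a non-sea pixel are dropped, so a region can shrink but never grow. In a pipeline like the one described, the edited mask would then be passed to the label-to-SAR generator of Stage B.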
Additionally, we adopt a training strategy tailored for severe class imbalance, combining class-balanced cross-entropy and focal Tversky losses [23,24,25,26], confusion-aware penalties, hard negative mining, and multi-scale patch sampling to improve robustness in operational monitoring scenarios.
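As an illustration of one ingredient of this strategy, the sketch below computes a focal Tversky loss in NumPy. It follows one common formulation, in which the per-class Tversky index is TP / (TP + alpha·FN + beta·FP) and the loss is (1 − index) raised to a power gamma; conventions for the exponent differ between papers. The values of `alpha`, `beta`, and `gamma` are illustrative defaults, not the settings used in this work, and the function name is hypothetical.

```python
import numpy as np


def focal_tversky_loss(probs, target, alpha=0.7, beta=0.3, gamma=0.75, eps=1e-7):
    """Mean focal Tversky loss over classes.

    probs:  (C, H, W) array of per-class probabilities (e.g. softmax output).
    target: (H, W) integer array of class labels in [0, C).
    alpha weights false negatives, beta weights false positives (assumed convention).
    """
    num_classes = probs.shape[0]
    onehot = np.eye(num_classes)[target].transpose(2, 0, 1)  # (C, H, W)
    tp = (probs * onehot).sum(axis=(1, 2))
    fn = ((1.0 - probs) * onehot).sum(axis=(1, 2))
    fp = (probs * (1.0 - onehot)).sum(axis=(1, 2))
    tversky = (tp + eps) / (tp + alpha * fn + beta * fp + eps)
    # Clip guards against tiny negative values from floating-point rounding.
    return float(np.mean(np.clip(1.0 - tversky, 0.0, 1.0) ** gamma))
```

Setting `alpha` above `beta` penalizes missed pixels more than false alarms, which is one way to favor recall on rare classes such as oil and look-alikes.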
Contributions. The main contributions of this work are:
• Cross-domain benchmark and analysis: A new Peruvian Sentinel-1 oil spill dataset harmonized with the Medite
…(Full text truncated)…