Enhancing deep learning performance on burned area delineation from SPOT-6/7 imagery for emergency management

After a wildfire, delineating burned areas (BAs) is crucial for quantifying damages and supporting ecosystem recovery. Current BA mapping approaches rely on computer vision models trained on post-event remote sensing imagery, but often overlook their applicability to time-constrained emergency management scenarios. This study introduces a supervised semantic segmentation workflow aimed at boosting both the performance and efficiency of BA delineation. It targets SPOT-6/7 imagery due to its very high resolution and on-demand availability. Experiments are evaluated based on Dice score, Intersection over Union, and inference time. The results show that U-Net and SegFormer models perform similarly with limited training data. However, SegFormer requires more resources, challenging its practical use in emergencies. Incorporating land cover data as an auxiliary task enhances model robustness without increasing inference time. Lastly, Test-Time Augmentation improves BA delineation performance but raises inference time, which can be mitigated with optimization methods like Mixed Precision.


💡 Research Summary

This paper presents a comprehensive study aimed at enhancing both the accuracy and operational efficiency of deep learning-based burned area (BA) delineation from very high-resolution SPOT-6/7 satellite imagery, specifically for time-critical emergency management scenarios following wildfires.

The research is grounded in the practical need for rapid and accurate damage assessment. It utilizes a dataset comprising ten wildfire events (nine in Greece for training/validation/testing and one in Spain for generalization assessment) provided by the Copernicus Emergency Management Service. The data includes post-event SPOT-6/7 imagery, binary BA masks, cloud masks, and land cover (LC) maps from ESA WorldCover, all processed into 512x512 pixel patches.
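The paper states only that the imagery and masks are processed into 512x512 patches; the tiling details below (non-overlapping stride, zero-padding of the border) are illustrative assumptions, sketched in NumPy:

```python
import numpy as np

def extract_patches(image, patch_size=512, stride=512):
    """Tile a (H, W, C) image into fixed-size patches.

    Zero-pads the right/bottom border so H and W become multiples
    of the stride, then cuts non-overlapping patch_size windows.
    """
    h, w, c = image.shape
    ph = int(np.ceil(h / stride)) * stride
    pw = int(np.ceil(w / stride)) * stride
    padded = np.zeros((ph, pw, c), dtype=image.dtype)
    padded[:h, :w] = image
    patches = [
        padded[y:y + patch_size, x:x + patch_size]
        for y in range(0, ph, stride)
        for x in range(0, pw, stride)
    ]
    return np.stack(patches)

# e.g. a 600x1000 4-band SPOT-6/7 tile yields a (4, 512, 512, 4) patch stack
```

The same routine would be applied to the binary BA masks, cloud masks, and LC maps so that every patch stays co-registered with its labels.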

Methodologically, the study evaluates two main frameworks: Single-Task Learning (STL), which focuses solely on BA segmentation, and Multi-Task Learning (MTL), which incorporates an auxiliary LC segmentation task to provide additional contextual information. Two neural network architectures with comparable parameter counts are compared: a CNN-based U-Net with a ResNet34 backbone and a Transformer-based SegFormer with an MiT-B2 backbone. Performance is measured using Dice score and Intersection over Union (IoU) to handle class imbalance, alongside inference time as a critical metric for emergency response. The study also investigates the impact of Test-Time Augmentation (TTA) to boost accuracy and Mixed Precision (MP) training/inference to improve computational efficiency.
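Dice and IoU are standard overlap metrics for imbalanced binary segmentation; a minimal NumPy version (the exact smoothing/averaging conventions used in the paper are not specified, so the epsilon term here is an assumption) looks like:

```python
import numpy as np

def dice_and_iou(pred, target, eps=1e-7):
    """Binary Dice score and Intersection over Union from boolean masks.

    Dice = 2|P∩T| / (|P| + |T|),  IoU = |P∩T| / |P∪T|.
    eps avoids division by zero when both masks are empty.
    """
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    dice = (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)
    iou = (inter + eps) / (union + eps)
    return float(dice), float(iou)
```

Both metrics ignore the dominant unburned background, which is why they are preferred over plain pixel accuracy for BA delineation.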

Key findings reveal that MTL consistently outperforms STL in segmentation metrics, as the auxiliary LC task acts as a regularizer and provides richer context without increasing inference time (since the auxiliary head is used only during training). While SegFormer and U-Net achieve similar Dice and IoU scores, SegFormer demands significantly more computational resources—approximately 2.5x longer training time and 2.4x higher GPU memory usage—making U-Net the more resource-efficient choice for emergency applications. TTA successfully improves model accuracy but at the cost of a ~60% increase in inference time. Crucially, implementing MP mitigates this cost, reducing training time by 30% and inference time by approximately 51%, enabling a combination of TTA and MP to yield models that are both more accurate and 31% faster at inference than the baseline STL U-Net.
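The paper does not specify which augmentations its TTA uses; a common minimal choice, sketched here framework-agnostically in NumPy, is to average predictions over horizontal and vertical flips (each extra view adds a forward pass, which is the source of the ~60% inference-time overhead; in practice the model calls would run under a mixed-precision context such as PyTorch's autocast to recover that cost):

```python
import numpy as np

def predict_tta(model, image):
    """Test-Time Augmentation by flip averaging.

    model: callable mapping an (H, W, C) image to an (H, W) probability map.
    Each flipped input is predicted, the prediction is flipped back to the
    original orientation, and the three views are averaged.
    """
    probs = model(image)
    probs = probs + np.flip(model(np.flip(image, axis=1)), axis=1)  # horizontal
    probs = probs + np.flip(model(np.flip(image, axis=0)), axis=0)  # vertical
    return probs / 3.0
```

Because segmentation is (approximately) equivariant to flips, averaging the re-aligned views smooths out orientation-dependent errors at the patch borders without any retraining.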

The generalization test on the unseen Spanish site showed slightly better performance for MTL, but also highlighted a fundamental limitation: model errors often correlated with inaccuracies in the ground truth labels, underscoring the dependency of supervised learning on label quality. The final optimized workflow demonstrates that a U-Net-based MTL framework, enhanced with TTA and MP, offers an effective balance of performance and speed suitable for emergency mapping. The paper concludes by suggesting future work towards reducing label dependency through advanced MTL strategies, self-supervised learning, and domain adaptation techniques to improve model generalization across diverse geographical regions.
