MMLSv2: A Multimodal Dataset for Martian Landslide Detection in Remote Sensing Imagery

Notice: This research summary and analysis were automatically generated using AI. For authoritative details, please refer to the original arXiv source.

We present MMLSv2, a dataset for landslide segmentation on Martian surfaces. MMLSv2 consists of multimodal imagery with seven bands: RGB, digital elevation model, slope, thermal inertia, and grayscale channels. MMLSv2 comprises 664 images distributed across training, validation, and test splits. In addition, an isolated test set of 276 images, drawn from a region geographically disjoint from the base dataset, is released to evaluate spatial generalization. Experiments conducted with multiple segmentation models show that the dataset supports stable training and achieves competitive performance, while still posing challenges in fragmented, elongated, and small-scale landslide regions. Evaluation on the isolated test set leads to a noticeable performance drop, indicating increased difficulty and highlighting its value for assessing model robustness and generalization beyond standard in-distribution settings. The dataset will be available at: https://github.com/MAIN-Lab/MMLS_v2


💡 Research Summary

The paper introduces MMLSv2, a multimodal remote‑sensing dataset designed for binary semantic segmentation of landslides on the Martian surface, specifically in the Valles Marineris canyon system. The dataset expands upon its predecessor (MMLSv1) by adding more than one hundred new image tiles, correcting annotation errors, and providing a rigorously defined train/validation/test split along with a completely isolated test set that originates from a geographically disjoint region. Each image tile is a 128 × 128‑pixel patch that contains seven aligned channels: red, green, and blue (RGB) optical bands from the Context Camera (CTX), a Digital Elevation Model (DEM) derived from MOLA/HRSC data, a slope map computed from the DEM, a thermal‑inertia layer sourced from THEMIS nighttime infrared mosaics, and a grayscale representation, completing the seven‑channel stack. All modalities are resampled to a common spatial resolution and share identical coordinate reference systems, ensuring pixel‑level correspondence across channels.
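Assembling such a channels-first tensor can be sketched as below. This is a minimal illustration, not the authors' pipeline: the arrays are random stand-ins for what would in practice be read from co-registered GeoTIFF tiles (e.g. via rasterio), and the `build_stack` helper is a hypothetical name.

```python
import numpy as np

H = W = 128  # patch size reported in the paper

# Hypothetical per-modality arrays; real tiles are resampled to a common grid
# so that pixel (i, j) refers to the same ground location in every layer.
rgb = np.random.rand(H, W, 3).astype(np.float32)      # CTX optical bands
dem = np.random.rand(H, W).astype(np.float32)         # MOLA/HRSC elevation
slope = np.random.rand(H, W).astype(np.float32)       # derived from the DEM
thermal = np.random.rand(H, W).astype(np.float32)     # THEMIS thermal inertia
gray = rgb.mean(axis=-1)                              # grayscale representation

def build_stack(rgb, dem, slope, thermal, gray):
    """Stack all modalities into one (7, H, W) channels-first tensor."""
    layers = [rgb[..., 0], rgb[..., 1], rgb[..., 2], dem, slope, thermal, gray]
    # Pixel-level correspondence requires every layer to share one grid.
    assert all(l.shape == layers[0].shape for l in layers)
    return np.stack(layers, axis=0)

stack = build_stack(rgb, dem, slope, thermal, gray)
print(stack.shape)  # (7, 128, 128)
```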

Data acquisition involved fusing high‑resolution optical imagery (≈6 m), thermal inertia maps (100 m), and topographic products (200 m) using ESRI ArcGIS. Landslide polygons were manually digitized following established geomorphological criteria, then rasterized into binary masks (1 = landslide, 0 = background). The authors refined the original annotation set by incorporating previously omitted landslides that lack clear depositional zones and by removing ambiguous depositional regions that could introduce label noise. This improves supervision quality and reduces spurious correlations that might otherwise bias deep‑learning models.
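The polygon-to-mask step can be sketched without GIS dependencies. A production pipeline would typically use a library routine such as `rasterio.features.rasterize`, but a plain even-odd (ray-casting) point-in-polygon test shows what "rasterized into binary masks" means; the function name and toy polygon below are illustrative only.

```python
import numpy as np

def rasterize_polygon(vertices, height, width):
    """Rasterize one polygon, given as (row, col) vertices, into a binary
    mask via the even-odd rule: 1 = landslide, 0 = background."""
    mask = np.zeros((height, width), dtype=np.uint8)
    n = len(vertices)
    for r in range(height):
        for c in range(width):
            inside = False
            j = n - 1
            for i in range(n):
                yi, xi = vertices[i]
                yj, xj = vertices[j]
                # Does a horizontal ray from (r, c) cross edge j -> i?
                if (yi > r) != (yj > r):
                    x_cross = xi + (r - yi) * (xj - xi) / (yj - yi)
                    if c < x_cross:
                        inside = not inside
                j = i
            mask[r, c] = 1 if inside else 0
    return mask

# Toy example: a square "landslide" footprint on a 16 x 16 patch.
square = [(4, 4), (4, 12), (12, 12), (12, 4)]
mask = rasterize_polygon(square, 16, 16)
print(mask.sum())  # number of landslide pixels
```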

A key contribution of the work is its spatially aware partitioning strategy. Instead of random patch allocation, which can cause severe spatial leakage because neighboring patches share highly correlated visual patterns, the authors group patches into 2 × 2 spatial blocks based on their grid indices. For each block they compute a foreground ratio – the proportion of landslide pixels – and then perform quantile‑based stratified sampling to preserve a balanced distribution of easy (low foreground) and hard (high foreground) examples across the training, validation, and standard test sets. This yields 465 training patches, 66 validation patches, and 133 standard test patches, each with comparable foreground statistics (average foreground ratios around 30‑35 %). The isolated test set consists of 276 patches drawn from the opposite side of the Valles Marineris region, with a lower average foreground ratio (≈22 %) and reduced variance, intentionally presenting a sparser and structurally different distribution.
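The block-then-stratify strategy above can be sketched as follows. All numbers and the patch table are synthetic (smaller 32 px masks keep the sketch light), and the 70/10/20 proportions are an assumption chosen to roughly match the reported split sizes, not the authors' exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical patch table: grid position plus a binary landslide mask.
# In the real dataset the indices come from each tile's mosaic position.
n_rows, n_cols = 20, 20
patches = [
    {"row": r, "col": c,
     "mask": (rng.random((32, 32)) < rng.random()).astype(np.uint8)}
    for r in range(n_rows) for c in range(n_cols)
]

# 1) Group patches into 2x2 spatial blocks so neighbours travel together,
#    preventing near-duplicate patches from leaking across splits.
blocks = {}
for p in patches:
    blocks.setdefault((p["row"] // 2, p["col"] // 2), []).append(p)

# 2) Foreground ratio per block (proportion of landslide pixels).
keys = list(blocks)
ratios = np.array([np.mean([b["mask"].mean() for b in blocks[k]]) for k in keys])

# 3) Quantile-based stratification: bin blocks by difficulty, then split
#    each bin ~70/10/20 so every split sees easy and hard examples.
bins = np.digitize(ratios, np.quantile(ratios, [0.25, 0.5, 0.75]))
splits = {"train": [], "val": [], "test": []}
for b in range(4):
    idx = [k for k, q in zip(keys, bins) if q == b]
    rng.shuffle(idx)
    n = len(idx)
    splits["train"] += idx[: int(0.7 * n)]
    splits["val"] += idx[int(0.7 * n): int(0.8 * n)]
    splits["test"] += idx[int(0.8 * n):]

print({k: len(v) for k, v in splits.items()})
```

Splitting at the block level rather than the patch level is what removes spatial leakage: two halves of the same landslide can never end up on opposite sides of the train/test boundary.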

Benchmark experiments evaluated six state‑of‑the‑art segmentation architectures: U‑Net, U‑Net++, PSPNet, DeepLabV3, DeepLabV3+, and SegFormer. All models were trained under a unified configuration (Adam optimizer, learning rate = 0.001, step decay every 30 epochs, 100 epochs total, batch size = 128 on a single NVIDIA A100 GPU) using the full seven‑band input and basic data augmentations (random flips and rotations). Performance metrics included precision, recall, F1‑score, foreground IoU, background IoU, and mean IoU. On the standard test split, all models converged reliably and achieved mean IoU values between 0.68 and 0.73, demonstrating that the dataset supports stable training and competitive segmentation quality. However, when evaluated on the isolated test set, mean IoU dropped dramatically to the 0.45–0.52 range, indicating a substantial generalization gap. The authors attribute this decline to the isolated set’s distinct geomorphological context, sparser landslide occurrences, and more fragmented or elongated failure geometries that differ from patterns seen during training.
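The reported metrics all derive from the pixel-level confusion matrix of two binary masks. A minimal numpy version (the function name and toy masks are ours, not the paper's code):

```python
import numpy as np

def segmentation_metrics(pred, target):
    """Precision, recall, F1, foreground/background IoU, and mean IoU
    from two binary masks with values in {0, 1}."""
    pred, target = pred.astype(bool), target.astype(bool)
    tp = np.logical_and(pred, target).sum()
    fp = np.logical_and(pred, ~target).sum()
    fn = np.logical_and(~pred, target).sum()
    tn = np.logical_and(~pred, ~target).sum()
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    iou_fg = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    iou_bg = tn / (tn + fp + fn) if tn + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1,
            "iou_fg": iou_fg, "iou_bg": iou_bg, "miou": (iou_fg + iou_bg) / 2}

# Toy check: a prediction that overlaps half of a square target region.
target = np.zeros((8, 8), dtype=np.uint8); target[2:6, 2:6] = 1
pred = np.zeros((8, 8), dtype=np.uint8); pred[2:6, 4:8] = 1
m = segmentation_metrics(pred, target)
print(round(m["precision"], 3), round(m["miou"], 3))
```

Note that mean IoU averages a usually high background IoU with a much lower foreground IoU, which is why the fragmented, small-scale landslide regions mentioned above drag the aggregate score down.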

The paper’s contributions can be summarized as follows: (1) provision of a high‑quality, multimodal Martian landslide dataset with seven aligned channels; (2) introduction of a geographically isolated test set explicitly designed for assessing spatial generalization; (3) a rigorous, reproducible partitioning methodology that mitigates spatial leakage and balances sample difficulty; (4) baseline performance results for several leading segmentation models, highlighting both the dataset’s utility and the challenges it poses; and (5) open release of the dataset and code to foster reproducible research.

In conclusion, MMLSv2 represents a significant step forward for planetary remote‑sensing and machine‑learning research. By combining optical, topographic, and thermophysical information, correcting annotation noise, and enforcing spatially independent splits, the dataset offers a realistic benchmark for developing models that can generalize beyond the training region. Future work may explore advanced multimodal fusion techniques, class‑imbalance mitigation strategies (e.g., focal loss, oversampling), and domain adaptation approaches to bridge the performance gap observed on the isolated test set.
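Of the mitigation strategies mentioned above, focal loss is the most mechanical to state. A numpy sketch of the binary form (Lin et al.); the `gamma` and `alpha` values are illustrative defaults, not tuned for this dataset:

```python
import numpy as np

def focal_loss(probs, target, gamma=2.0, alpha=0.75, eps=1e-7):
    """Binary focal loss: down-weights easy pixels so that the sparse
    landslide foreground contributes more to the gradient.
    `probs` holds predicted foreground probabilities, `target` the binary mask."""
    probs = np.clip(probs, eps, 1 - eps)
    p_t = np.where(target == 1, probs, 1 - probs)      # prob. of the true class
    alpha_t = np.where(target == 1, alpha, 1 - alpha)  # class weighting
    return float(np.mean(-alpha_t * (1 - p_t) ** gamma * np.log(p_t)))

# A confidently correct pixel is penalized far less than a confidently wrong one.
target = np.array([1.0, 1.0])
easy = focal_loss(np.array([0.9, 0.9]), target)
hard = focal_loss(np.array([0.1, 0.1]), target)
print(easy < hard)  # True
```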

