MineInsight: A Multi-sensor Dataset for Humanitarian Demining Robotics in Off-Road Environments


The use of robotics in humanitarian demining increasingly involves computer vision techniques to improve landmine detection capabilities. However, in the absence of diverse and realistic datasets, the reliable validation of algorithms remains a challenge for the research community. In this paper, we introduce MineInsight, a publicly available multi-sensor, multi-spectral dataset designed for off-road landmine detection. The dataset features 35 different targets (15 landmines and 20 commonly found objects) distributed along three distinct tracks, providing a diverse and realistic testing environment. MineInsight is, to the best of our knowledge, the first dataset to integrate dual-view sensor scans from both an Unmanned Ground Vehicle and its robotic arm, offering multiple viewpoints to mitigate occlusions and improve spatial awareness. It features two LiDARs, as well as images captured at diverse spectral ranges, including visible (RGB, monochrome), visible short-wave infrared (VIS-SWIR), and long-wave infrared (LWIR). Additionally, the dataset provides bounding boxes generated by an automated pipeline and refined with human supervision. We recorded approximately one hour of data in both daylight and nighttime conditions, resulting in around 38,000 RGB frames, 53,000 VIS-SWIR frames, and 108,000 LWIR frames. MineInsight serves as a benchmark for developing and evaluating landmine detection algorithms. Our dataset is available at https://github.com/mariomlz99/MineInsight.


💡 Research Summary

The paper addresses a critical gap in humanitarian demining research: the lack of realistic, diverse, and multi‑modal datasets for validating computer‑vision‑based landmine detection algorithms. Existing datasets are largely UAV‑centric, limited to a single sensor modality, and often lack comprehensive annotations, making it difficult to assess algorithm performance under the complex conditions encountered by ground‑based demining robots. To fill this void, the authors introduce MineInsight, a publicly available dataset collected with an Unmanned Ground Vehicle (UGV) equipped with a robotic arm, offering dual‑view sensor streams and a rich set of spectral modalities.

Hardware configuration: The UGV platform hosts two LiDAR units (a 360° LiDAR on the mobile base and a 70° LiDAR on the arm) and four cameras—monochrome, RGB, visible short‑wave infrared (VIS‑SWIR), and long‑wave infrared (LWIR). The sensors are split between the base and the arm, providing two distinct viewpoints that help mitigate occlusions caused by vegetation, debris, or soil. Data acquisition is performed using two onboard computers (an industrial PC for vehicle control and an NVIDIA Jetson Orin for high‑throughput sensor processing) synchronized via hardware Precision Time Protocol (PTP). Calibration of intrinsics and extrinsics follows established toolkits (Kalibr for RGB/VIS‑SWIR, MR‑T for LWIR) and a target‑less method for LiDAR‑camera alignment, ensuring accurate sensor fusion.
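With calibrated intrinsics and extrinsics of the kind these toolkits produce, fusing the two modalities reduces to the standard pinhole-camera projection of LiDAR points into image coordinates. The sketch below is a minimal, self-contained illustration of that step; the rotation, translation, and intrinsic values are illustrative placeholders, not calibration results from the dataset.

```python
# Minimal sketch: projecting a 3D LiDAR point into camera pixel coordinates
# given extrinsics (R, t) and a pinhole intrinsic matrix K, as produced by
# calibration toolkits such as Kalibr. All numeric values are illustrative.

def project_point(point_lidar, R, t, K):
    """Transform a LiDAR-frame point into the camera frame and project it.

    Returns (u, v) pixel coordinates, or None if the point lies behind
    the camera.
    """
    # Camera-frame coordinates: X_cam = R @ X_lidar + t
    x = [sum(R[i][j] * point_lidar[j] for j in range(3)) + t[i]
         for i in range(3)]
    if x[2] <= 0:
        return None  # behind the camera, not visible
    # Pinhole projection with K = [[fx, 0, cx], [0, fy, cy], [0, 0, 1]]
    u = K[0][0] * x[0] / x[2] + K[0][2]
    v = K[1][1] * x[1] / x[2] + K[1][2]
    return (u, v)

# Illustrative extrinsics (identity rotation, small offset) and intrinsics.
R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
t = [0.0, 0.0, 0.1]
K = [[600.0, 0.0, 320.0], [0.0, 600.0, 240.0], [0.0, 0.0, 1.0]]

print(project_point([0.5, 0.2, 2.0], R, t, K))
```

In practice the same transform is applied to every point in a scan, after which points can be colored by the pixel they land on, or image detections can be lifted to 3D.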

Dataset composition: Three test tracks were prepared, each populated with 35 objects—15 inert landmines (balanced between antipersonnel and antitank types) and 20 everyday items (metal cans, plastic bottles, etc.) that serve as realistic false‑positive distractors. Objects were left overnight to reach thermal equilibrium, enabling meaningful LWIR contrast during night recordings. Data were captured under twelve environmental scenarios combining day/night cycles, low to high vegetation density, and varying soil moisture. The collection yields approximately 38,000 RGB frames, 53,000 VIS‑SWIR frames, and 108,000 LWIR frames, together with synchronized point clouds from both LiDARs.
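Because the streams share a common clock but run at different frame rates (the LWIR camera yields roughly three times as many frames as the RGB camera), a typical preprocessing step is to associate each frame of one modality with the temporally nearest frame of another. A minimal sketch, with illustrative timestamps and assumed sensor rates:

```python
import bisect

# Sketch: match each LWIR frame (higher rate) to the nearest RGB frame by
# timestamp, rejecting pairs whose gap exceeds a tolerance. Timestamps are
# illustrative, in seconds; the actual rates are assumptions.

def nearest_match(query_ts, ref_ts, max_dt=0.05):
    """Index of the reference timestamp closest to query_ts, or None
    if the smallest gap exceeds max_dt. ref_ts must be sorted."""
    i = bisect.bisect_left(ref_ts, query_ts)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(ref_ts)]
    best = min(candidates, key=lambda j: abs(ref_ts[j] - query_ts))
    return best if abs(ref_ts[best] - query_ts) <= max_dt else None

rgb_ts  = [0.00, 0.10, 0.20, 0.30]    # ~10 Hz (illustrative)
lwir_ts = [0.00, 0.033, 0.066, 0.10]  # ~30 Hz (illustrative)

pairs = [(k, nearest_match(t, rgb_ts)) for k, t in enumerate(lwir_ts)]
print(pairs)  # -> [(0, 0), (1, 0), (2, 1), (3, 1)]
```

The binary search keeps the association O(log n) per frame, which matters at the scale of ~200,000 frames across the three image streams.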

Annotation pipeline: An automated detection and clustering stage first generates provisional 2D bounding boxes using multi‑modal cues. Human annotators then review and refine these boxes, producing high‑quality labels for each frame. The final annotation set includes class labels (landmine vs. common object), unique instance IDs, and precise bounding box coordinates, facilitating both object detection and instance‑level analysis.
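A consumer of the final annotations might parse them along the following lines. The field layout here (frame index, instance ID, class label, box coordinates) is an assumed illustration; the dataset's actual label format should be checked against the repository.

```python
# Hypothetical sketch of loading per-frame bounding-box annotations.
# The CSV-style layout (frame, instance_id, class, x, y, w, h) is an
# assumption for illustration, not the documented MineInsight format.

from dataclasses import dataclass

@dataclass
class Box:
    frame: int          # frame index the box belongs to
    instance_id: int    # unique per-object ID across frames
    label: str          # e.g. "landmine" or "common_object"
    x: float            # top-left corner, pixels
    y: float
    w: float            # box width and height, pixels
    h: float

def parse_annotations(lines):
    """Parse one Box per comma-separated annotation line."""
    boxes = []
    for line in lines:
        frame, inst, label, x, y, w, h = line.strip().split(",")
        boxes.append(Box(int(frame), int(inst), label,
                         float(x), float(y), float(w), float(h)))
    return boxes

sample = ["120,3,landmine,412.0,288.5,64.0,48.0"]
print(parse_annotations(sample)[0].label)  # -> landmine
```

Keeping the instance ID alongside the class label supports both plain object detection and the instance-level analysis mentioned above, such as tracking one target across viewpoints.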

Contribution and impact: MineInsight is the first dataset to combine dual‑view UGV/arm sensor streams, multi‑spectral imaging, and depth information for humanitarian demining. Its breadth—spanning visible, short‑wave infrared, and long‑wave infrared spectra—enables research on sensor fusion, multimodal deep learning, and robustness to illumination changes. The dual‑view design directly addresses occlusion challenges by providing alternative perspectives when a target is hidden from one camera. Moreover, the standardized ROS 2 bag format and comprehensive metadata promote reproducibility and ease of integration into existing robotics pipelines.

The authors benchmark the dataset against prior works, highlighting that earlier collections either lack LiDAR data, are UAV‑only, or provide limited spectral coverage. By releasing MineInsight on GitHub (https://github.com/mariomlz99/MineInsight) with open‑source tools for loading and visualizing the data, the paper invites the community to develop, compare, and improve landmine detection algorithms under realistic off‑road conditions. Future extensions may include additional tracks, dynamic weather, and expanded object sets, further solidifying MineInsight as a foundational resource for safe, efficient humanitarian demining robotics.

