TACK Tunnel Data (TTD): A Benchmark Dataset for Deep Learning-Based Defect Detection in Tunnels

December 16, 2025

Reading time: 5 minutes

📝 Original Info

Title: TACK Tunnel Data (TTD): A Benchmark Dataset for Deep Learning-Based Defect Detection in Tunnels
ArXiv ID: 2512.14477
Date: 2025-12-16
Authors: Andreas Sjölander, Valeria Belloni, Robel Fekadu, Andrea Nascetti

📝 Abstract

Tunnels are essential elements of transportation infrastructure, but are increasingly affected by ageing and deterioration mechanisms such as cracking. Regular inspections are required to ensure their safety, yet traditional manual procedures are time-consuming, subjective, and costly. Recent advances in mobile mapping systems and Deep Learning (DL) enable automated visual inspections. However, their effectiveness is limited by the scarcity of tunnel datasets. This paper introduces a new publicly available dataset containing annotated images of three different tunnel linings, capturing typical defects: cracks, leaching, and water infiltration. The dataset is designed to support supervised, semi-supervised, and unsupervised DL methods for defect detection and segmentation. Its diversity in texture and construction techniques also enables investigation of model generalization and transferability across tunnel types. By addressing the critical lack of domain-specific data, this dataset contributes to advancing automated tunnel inspection and promoting safer, more efficient infrastructure maintenance strategies.

📄 Full Content

TACK Tunnel Data (TTD): A Benchmark Dataset for Deep Learning-Based Defect Detection in Tunnels

Andreas Sjölander (1,*), Valeria Belloni (2), Robel Fekadu (1), and Andrea Nascetti (1)

(1) Civil and Architectural Engineering, KTH Royal Institute of Technology, Stockholm, Sweden
(2) Department of Civil, Building and Environmental Engineering, Sapienza University of Rome, Rome, Italy
(*) Corresponding author: Andreas Sjölander (asjola@kth.se)

ABSTRACT

Tunnels are essential elements of transportation infrastructure, but are increasingly affected by ageing and deterioration mechanisms such as cracking. Regular inspections are required to ensure their safety, yet traditional manual procedures are time-consuming, subjective, and costly. Recent advances in mobile mapping systems and Deep Learning (DL) enable automated visual inspections. However, their effectiveness is limited by the scarcity of tunnel datasets. This paper introduces a new publicly available dataset containing annotated images of three different tunnel linings, capturing typical defects: cracks, leaching, and water infiltration. The dataset is designed to support supervised, semi-supervised, and unsupervised DL methods for defect detection and segmentation. Its diversity in texture and construction techniques also enables investigation of model generalization and transferability across tunnel types. By addressing the critical lack of domain-specific data, this dataset contributes to advancing automated tunnel inspection and promoting safer, more efficient infrastructure maintenance strategies.

Background & Summary

Tunnels are critical components of infrastructure systems and are typically designed for a technical lifespan of 100 years or more. However, in many countries, existing tunnels are ageing, as most construction materials degrade over time due to natural and mechanical deterioration mechanisms such as cracking and water infiltration.
To maintain structural integrity throughout their intended lifespan, regular inspection and maintenance are essential. Consequently, infrastructure owners must continuously plan and optimize monitoring and maintenance strategies. The aim is to minimize the risk of failures with catastrophic consequences, including loss of life, while also limiting downtime that disrupts transportation networks and causes significant economic losses. Traditionally, inspections are performed on-site by experts with basic tools such as hammers and headlamps. For detailed inspections of the concrete lining, skylifts are often required to access the tunnel surface closely. This approach is time-consuming, labor-intensive, prone to human error, and inherently subjective [1]. Today, sensors can be readily mounted on Mobile Mapping Systems (MMS), enabling automatic collection of infrastructure data. Specifically, high-resolution cameras mounted on an MMS can scan the tunnel and collect a large number of images of the tunnel lining. This allows the tunnel to be inspected remotely from the office, which reduces tunnel closure time and improves inspection protocols [1–3]. However, the inspection is still performed mainly visually. Therefore, the key challenge is to detect damage, mainly cracks, with sufficient accuracy using images collected with high-resolution cameras and automatic image processing techniques. Automatic damage detection methods can be categorized into two main approaches: traditional image processing [4] and Machine Learning/Deep Learning (ML/DL) techniques [5]. Among these, DL methods have recently gained increasing attention due to their superior capabilities in representation learning. DL techniques are generally categorized into three groups: supervised, semi-supervised, and unsupervised learning.
Supervised learning typically relies on large annotated datasets to train models such as Convolutional Neural Networks (CNNs) and Transformers for classification, object detection, or segmentation tasks. Training and testing supervised models effectively requires extensive, labor-intensive labeling, since the quality and quantity of labeled data strongly affect model performance. Semi-supervised learning seeks to overcome these limitations by combining a small set of labeled data with a larger volume of unlabeled data. Techniques such as consistency regularization, pseudo-labeling, and self-training help propagate label information from annotated samples to unlabeled ones. Finally, unsupervised methods such as autoencoders, clustering algorithms, diffusion models, and generative adversarial networks aim to eliminate the need for labeled data, with the goal of enhancing generalization across diverse datasets. Supervised learning has been extensively explored for crack detection, primarily using CNNs [5–13], Transformers [14–16], or hybrid architectures that combine both [17]. In contrast, semi-supervised [18,19] and unsupervised [20,21] methods have only recently begun to receive attention.
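
To make the pseudo-labeling idea mentioned above more concrete, the following is a minimal sketch in plain Python, not the method used by the paper's authors. It assumes that a model trained on the small labeled set has already produced class probabilities for each unlabeled image patch; the function name `pseudo_label`, the 0.9 confidence threshold, and the three example classes are all hypothetical choices made for illustration.

```python
from typing import List, Tuple


def pseudo_label(
    probabilities: List[List[float]],
    threshold: float = 0.9,
) -> List[Tuple[int, int]]:
    """Select confident predictions on unlabeled samples.

    Returns (sample_index, class_index) pairs for every sample whose
    highest predicted probability is at least `threshold`.
    """
    selected = []
    for index, probs in enumerate(probabilities):
        # Index of the most probable class for this sample.
        best_class = max(range(len(probs)), key=probs.__getitem__)
        if probs[best_class] >= threshold:
            selected.append((index, best_class))
    return selected


# Hypothetical model outputs for four unlabeled image patches.
# Assumed classes: 0 = intact lining, 1 = crack, 2 = water infiltration.
unlabeled_probs = [
    [0.97, 0.02, 0.01],  # confident: intact lining
    [0.40, 0.35, 0.25],  # ambiguous: skipped
    [0.05, 0.93, 0.02],  # confident: crack
    [0.10, 0.10, 0.80],  # below the threshold: skipped
]

pseudo_labels = pseudo_label(unlabeled_probs, threshold=0.9)
print(pseudo_labels)  # [(0, 0), (2, 1)]
```

In a self-training loop, the selected samples would be added to the labeled set with their pseudo-labels and the model retrained. The threshold trades label coverage against label noise: a lower value admits more samples but also more incorrect labels.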
