Autoencoder-based Denoising Defense against Adversarial Attacks on Object Detection

Reading time: 5 minutes

📝 Abstract

Deep learning-based object detection models play a critical role in real-world applications such as autonomous driving and security surveillance systems, yet they remain vulnerable to adversarial examples. In this work, we propose an autoencoder-based denoising defense to recover object detection performance degraded by adversarial perturbations. We conduct adversarial attacks using Perlin noise on vehicle-related images from the COCO dataset, apply a single-layer convolutional autoencoder to remove the perturbations, and evaluate detection performance using YOLOv5. Our experiments demonstrate that adversarial attacks reduce bbox mAP from 0.2890 to 0.1640, representing a 43.3% performance degradation. After applying the proposed autoencoder defense, bbox mAP improves to 0.1700 (3.7% recovery) and bbox mAP@50 increases from 0.2780 to 0.3080 (10.8% improvement). These results indicate that autoencoder-based denoising can provide partial defense against adversarial attacks without requiring model retraining.
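The bbox mAP@50 figures quoted above score a predicted box as correct when its intersection-over-union (IoU) with a ground-truth box is at least 0.5. A minimal sketch of the IoU computation, assuming corner-format `(x1, y1, x2, y2)` boxes (the paper does not specify its box format):

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes in (x1, y1, x2, y2) form."""
    # corners of the intersection rectangle
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

mAP then averages, over classes and confidence thresholds, the precision of detections that clear this IoU threshold; mAP (without a suffix) in COCO evaluation averages over IoU thresholds from 0.5 to 0.95.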

📄 Content

AUTOENCODER-BASED DENOISING DEFENSE AGAINST ADVERSARIAL ATTACKS ON OBJECT DETECTION

School of Cybersecurity, Korea University, Hacking and Countermeasure Research Lab
Min Geun Song, Gang Min Kim, Woonmin Kim, Yongsik Kim, Jeonghyun Sim, Sangbeom Park, Huy Kang Kim

December 19, 2025

ABSTRACT

Deep learning-based object detection models play a critical role in real-world applications such as autonomous driving and security surveillance systems, yet they remain vulnerable to adversarial examples. In this work, we propose an autoencoder-based denoising defense to recover object detection performance degraded by adversarial perturbations. We conduct adversarial attacks using Perlin noise on vehicle-related images from the COCO dataset, apply a single-layer convolutional autoencoder to remove the perturbations, and evaluate detection performance using YOLOv5. Our experiments demonstrate that adversarial attacks reduce bbox mAP from 0.2890 to 0.1640, representing a 43.3% performance degradation. After applying the proposed autoencoder defense, bbox mAP improves to 0.1700 (3.7% recovery) and bbox mAP@50 increases from 0.2780 to 0.3080 (10.8% improvement). These results indicate that autoencoder-based denoising can provide partial defense against adversarial attacks without requiring model retraining.

Index Terms—Adversarial Attack, Autoencoder, Object Detection, YOLO

Acknowledgment

This work was supported by Institute for Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2020-0-00374, Development of Security Primitives for Unmanned Vehicles).

1 Introduction

Deep learning-based convolutional neural networks (CNNs) have achieved remarkable performance across various computer vision tasks, including image classification, object detection, and semantic segmentation [1].
In particular, real-time object detection models such as You Only Look Once (YOLO) serve as critical components in safety-critical applications including autonomous vehicles, security surveillance systems, and robotic vision. However, the discovery that deep learning models are vulnerable to adversarial examples has raised significant security concerns for real-world deployment [2].

Adversarial examples are inputs crafted by adding imperceptible perturbations to original images, causing deep learning models to produce incorrect predictions. This phenomenon, first discovered by Szegedy et al. [2], has led to the development of various attack methods. Goodfellow et al. [2] proposed the Fast Gradient Sign Method (FGSM) based on the linear nature of neural networks, while Madry et al. [3] introduced Projected Gradient Descent (PGD), an iterative extension that generates more powerful adversarial examples. These attacks have been extended beyond image classification to object detection and semantic segmentation [4], with demonstrated transferability across different detection models [5].

(arXiv:2512.16123v1 [cs.CR], 18 Dec 2025)

Various defense mechanisms have been proposed to counter adversarial attacks. Adversarial training incorporates adversarial examples into the training data to enhance model robustness [3], but it is effective only against specific attacks and incurs high computational costs. Input transformation-based defenses apply image transformations such as bit-depth reduction, JPEG compression, and total variance minimization to remove adversarial perturbations [6]. The non-differentiable nature of these transformations makes them difficult for attackers to circumvent with gradient-based methods. Xie et al. [7] demonstrated that random resizing and padding at inference time can mitigate adversarial effects without additional training. Autoencoder-based defenses have also gained attention.
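The FGSM and PGD update rules described above can be sketched in NumPy. The attacked model's loss gradient with respect to the input is abstracted here as `grad_fn`, a placeholder; this is an illustrative sketch, not the paper's implementation:

```python
import numpy as np

def fgsm(x, grad, eps):
    """One FGSM step: move each pixel by eps in the sign of the loss gradient."""
    return np.clip(x + eps * np.sign(grad), 0.0, 1.0)

def pgd(x, grad_fn, eps, alpha, steps):
    """PGD: iterated FGSM with projection back into the L-inf eps-ball around x."""
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad_fn(x_adv))  # gradient-sign step
        x_adv = np.clip(x_adv, x - eps, x + eps)         # project into the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)                 # keep a valid pixel range
    return x_adv
```

With a real model, `grad_fn` would backpropagate the detection loss to the input; PGD reduces to FGSM when `steps=1` and `alpha=eps`.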
MagNet [8] proposed using autoencoders to learn the manifold of normal data and reform adversarial examples toward this manifold. Liao et al. [9] addressed the error amplification effect through a High-level representation Guided Denoiser (HGD), achieving first place in the NIPS adversarial defense competition.

However, existing research has primarily focused on image classification tasks, and the effectiveness of autoencoder-based defenses for object detection has not been systematically evaluated. In particular, quantitative analysis of how denoising-based defenses can recover detection performance in real-time detection models like YOLO remains limited.

In this study, we propose an autoencoder-based denoising defense to recover object detection performance on adversarially perturbed images. We apply adversarial attacks using Perlin noise [10] to vehicle images from the COCO dataset [11], remove the noise through a single-layer convolutional autoencoder, and evaluate detection performance using YOLOv5.

2 Related Work

2.1 Adversarial Attacks

Adversarial examples were first discovered by Szegedy et al., demonstrating that deep learning models are vulnerable to imperceptible perturbations [2]. Goodfellow
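The study's attack adds Perlin noise to COCO vehicle images, but its noise parameters are not given here. As a simplified stand-in, the sketch below generates value noise (bilinear interpolation of a coarse random grid; true Perlin noise interpolates gradients rather than values) and shows how such a structured low-frequency perturbation is applied to an image in [0, 1]:

```python
import numpy as np

def value_noise(h, w, grid=8, seed=0):
    """Smooth 2-D noise in [-1, 1]: random values on a coarse (grid+1)^2 lattice,
    bilinearly upsampled to h x w. A simplified stand-in for Perlin noise."""
    rng = np.random.default_rng(seed)
    coarse = rng.uniform(-1.0, 1.0, size=(grid + 1, grid + 1))
    ys = np.linspace(0, grid, h, endpoint=False)
    xs = np.linspace(0, grid, w, endpoint=False)
    y0, x0 = ys.astype(int), xs.astype(int)
    fy, fx = ys - y0, xs - x0
    # bilinear interpolation between the four surrounding lattice values
    top = coarse[y0[:, None], x0] * (1 - fx) + coarse[y0[:, None], x0 + 1] * fx
    bot = coarse[y0[:, None] + 1, x0] * (1 - fx) + coarse[y0[:, None] + 1, x0 + 1] * fx
    return top * (1 - fy[:, None]) + bot * fy[:, None]

def perturb(img, eps=0.1, seed=0):
    """Add low-frequency noise of magnitude eps to a grayscale image and clip."""
    noise = value_noise(img.shape[0], img.shape[1], seed=seed)
    return np.clip(img + eps * noise, 0.0, 1.0)
```

Unlike per-pixel i.i.d. noise, this perturbation is spatially correlated, which is what makes Perlin-style attacks look like natural texture rather than static. The defense side (the single-layer convolutional autoencoder) would then be trained to map `perturb(img)` back to `img`.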

This content is AI-processed based on ArXiv data.
