Detection of Adversarial Attacks in Robotic Perception

Ziad Sharawy, Mohammad Nakshbandi, and Sorin Mihai Grigorescu

Department of Mechatronics and Robotics, Faculty of Electrical Engineering and Computer Science, Transilvania University of Brașov, Romania
ahmed.sharawy@unitbv.ro, mohammed.nakshbandi@unitbv.ro, s.grigorescu@unitbv.ro

Ziad Sharawy, Mohammad-Maher Nakshbandi and Sorin Grigorescu are with the Robotics, Vision and Control Laboratory (RovisLab, https://www.rovislab.com), Transilvania University of Brașov, Romania. The GitHub code is available at https://github.com/RovisLab/CyberAI_Ziad.

Abstract. Deep Neural Networks (DNNs) achieve strong performance in semantic segmentation for robotic perception but remain vulnerable to adversarial attacks, threatening safety-critical applications. While robustness has been studied for image classification, semantic segmentation in robotic contexts requires specialized architectures and detection strategies. We propose a framework for detecting adversarial attacks using pre-trained ResNet-18 and ResNet-50 models. Our method leverages advanced feature extraction and statistical metrics to distinguish clean from adversarial inputs. Experiments demonstrate its effectiveness across various attacks, offering insights into model robustness. Additionally, we compare network architectures to identify factors that enhance resilience. This work supports the development of secure autonomous systems by providing practical detection tools and guidance for selecting robust segmentation models.

1 Introduction

The field of computer vision has transformed over the past decade, driven by deep learning advances [9, 16].
Semantic pixel-wise segmentation, which assigns a class label to every pixel [17], is critical for applications such as autonomous driving [4], medical imaging [23], urban planning, and robotics, where accurate visual understanding impacts decision-making. Traditional segmentation based on hand-crafted features often fails in complex scenarios [13], whereas deep CNNs enable hierarchical feature extraction, significantly improving performance [14, 15].

In safety-critical systems like autonomous vehicles and robotics, robust perception must withstand adversarial attacks: carefully crafted perturbations designed to mislead models [10, 25]. Such attacks can cause misclassifications, potentially leading to failures and eroding trust [3, 21]. Detecting adversarial inputs is therefore crucial for safety and reliability.

This work targets adversarial detection in robotic visual perception by extending pre-trained ResNet-18 and ResNet-50 [14] for dense semantic feature extraction. By combining advanced feature extraction with novel detection strategies [19, 26], our framework distinguishes original from adversarial images, enhancing segmentation reliability and mitigating adversarial risks. Overall, this contribution promotes secure and trustworthy autonomous systems, emphasizing proactive adversarial detection for safer AI deployment in critical applications.

Keywords: Adversarial Attacks, Semantic Segmentation, Robotic Perception, Deep Learning Security, Adversarial Detection Methods, ResNet, Computer Vision

2 Related Work

Adversarial-example detection complements robustness-based defenses by identifying maliciously perturbed inputs rather than directly making models attack-resistant.
Early statistical approaches detect adversarial inputs via distributional shifts in network activations [12], while anomaly-detection frameworks treat adversarial examples as outliers in feature space [19]. Model-uncertainty-based methods use predictive entropy and mutual information to flag inputs in low-confidence regions [24]. Robust detection strategies combining input transformations and feature squeezing compare predictions across transformed inputs to detect inconsistencies caused by adversarial perturbations [18].

Graph-based methods model relationships in the network's latent space. Latent neighborhood graphs of activations enable detection through disruptions in the manifold structure of benign examples [1]. Frequency-domain analyses exploit high-frequency perturbations, allowing reconstruction-based anomaly detection [29]. Similarly, universal detectors compare deep features across layers against reference clean examples [20], while semantic-aware approaches, such as semantic graph matching, detect perturbations disrupting contextual object relationships [28]. In text, embedding-level and syntactic consistency checks identify adversarial manipulations [27]. Despite these advances, many methods remain vulnerable to adaptive attacks [2], motivating our work to improve reliability against both known and unseen attack types.

3 Method

3.1 Problem definition

This paper addresses separating original and adversarial images using pre-trained ResNet-18 and ResNet-50 for semantic segmentation. Our contribution is a specialized adversarial detection method for robotic perception, leveraging feature extraction tailored for segmentation tasks and evaluating multiple attack types to assess model resilience in safety-critical applications.

Semantic segmentation assigns a class label to each pixel.
For an input I ∈ R^{H×W×C}, a deep network extracts low-level (edges, textures) and high-level (object shapes, context) features, producing per-pixel class scores in R^{H×W×K} and a label map L, where L_{i,j} gives the predicted class of pixel I_{i,j}. The resulting dense classification can be visualized using a custom color map, enabling precise scene understanding in autonomous driving, medical imaging, environmental monitoring, and robotic perception.

Adversarial detection complements robustness defenses by flagging likely perturbed inputs. Early methods used predictive entropy and mutual information [24], but adaptive attacks can bypass many detectors [2]. More advanced strategies include graph-based analysis of latent activations [1] and denoising comparisons [12]. For a broader overview, see [7].

General Approach. Detection identifies whether an input x is adversarial for a classifier F(x), enabling intervention in semi-autonomous systems.

Metrics.
- Confidence Score: F(x)_ŷ, the probability assigned to the predicted class ŷ; often unreliable [18].
- Non-Maximal Entropy (non-ME): non-ME(x) = Σ_{i≠ŷ} F(x)_i log F(x)_i [18].
- Kernel Density (K-density): KD(x) = Σ_{x_i ∈ X_ŷ} k(z_i, z), integrating confidence and non-ME [18].

Thresholding. Inputs whose metric value lies above a threshold T are classified as normal; otherwise they are flagged as adversarial [18].

Challenges and Solutions. Adversarial detection is difficult due to the prevalence of attacks and the potential degradation of accuracy on normal inputs. Ma et al. [18] combine thresholding with K-density to detect attacks effectively with minimal extra training.

3.2 Adversarial Attack Detector

Pre-processing. Input images are randomly cropped and resized to 224 × 224, flipped horizontally with 50% probability, and normalized using ImageNet's mean and standard deviation:

μ = [0.485, 0.456, 0.406], σ = [0.229, 0.224, 0.225]

These transformations standardize the inputs and improve model robustness.
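As a concrete illustration, the three detection metrics and the threshold rule above can be sketched in NumPy. The Gaussian kernel, its bandwidth, and all function names below are our own illustrative choices, not the paper's implementation:

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

def confidence_score(probs):
    # F(x)_yhat: probability of the predicted class (often unreliable [18])
    return float(probs.max())

def non_max_entropy(probs):
    # non-ME(x) = sum over i != yhat of F(x)_i * log F(x)_i
    rest = np.delete(probs, probs.argmax())
    rest = rest[rest > 0]  # guard against log(0)
    return float(np.sum(rest * np.log(rest)))

def kernel_density(z, class_feats, bandwidth=1.0):
    # KD(x) = sum of k(z_i, z) over training features z_i of the predicted
    # class; the Gaussian kernel here is an assumed choice
    d2 = np.sum((class_feats - z) ** 2, axis=1)
    return float(np.sum(np.exp(-d2 / (2.0 * bandwidth ** 2))))

def flag_adversarial(metric_value, threshold):
    # metric above threshold T -> treated as normal; otherwise adversarial
    return metric_value <= threshold

# sanity check: a feature vector inside its class cluster scores higher
# K-density than an outlier
rng = np.random.default_rng(0)
cluster = rng.normal(0.0, 0.1, size=(50, 8))
in_cluster, outlier = np.zeros(8), np.full(8, 5.0)
assert kernel_density(in_cluster, cluster) > kernel_density(outlier, cluster)
```

Note that K-density requires storing training-set feature embeddings grouped by class, so it trades memory for the extra robustness it adds over the confidence score alone.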
Model Setup. The Adversarial Attack Detector is trained on 100 classes using pre-trained ResNet-18 and ResNet-50. The final fully connected layer is replaced with a 102-class output, while all other layers are frozen. After training, the model is saved and later reloaded for inference, with visual examples of ResNet-18 predictions shown in Figure 1. A secondary 2-class model is then created by slicing weights from the 102-class model. Finally, inference and visualization are performed on unseen images to display predictions.

Fig. 1: Classifier predictions on clean and adversarial images. (a) ResNet-18 shows unstable predictions with frequent misclassifications. (b) ResNet-50 demonstrates robust, consistent performance against adversarial perturbations.

3.3 Training

Loss & Optimization. Cross-Entropy Loss measures the discrepancy between predicted probabilities and true labels. Model parameters are updated via Stochastic Gradient Descent (SGD) with learning rate 0.001 and momentum 0.9, applied only to the classifier.

Inference. For unseen images, predicted class labels are obtained using the argmax of output probabilities.

Model Architecture. A ResNet backbone (ResNet-18 or ResNet-50) with optional ImageNet-pretrained weights is used. The original fully-connected head is replaced with a linear classifier for the target number of classes (default 100), freezing all backbone parameters. Inputs are pre-processed with random resized crop (224 × 224), center crop for validation, and normalized with ImageNet mean and std. Batch size is 4. For binary adversarial detection, a 2-logit head is created by copying the first two rows of the trained classifier.
This performance difference is further visualized in the classifier predictions for ResNet-18 (Figure 1) and ResNet-50 (Figure 2), demonstrating the models' responses to perturbed inputs.

4 Performance Evaluation

Evaluation Metrics. Cross-Entropy Loss (L_CE) measures the discrepancy between predicted probabilities and true labels, ignoring "ignore" labels. Intersection over Union (IoU) computes the ratio of true positives to the union of predicted and ground-truth pixels, with mean IoU (mIoU) averaging over all classes. The Dice/F1 Score accounts for true positives, false positives, and false negatives. Pixel Accuracy (PA) gives the proportion of correctly classified pixels, excluding "ignore" labels.

Lost Classes. Lost Classes are identified by comparing baseline class sets with adversarial predictions, highlighting classes that were misclassified or omitted during attacks, providing insight into model robustness under perturbations.

The impact of increasing FGSM strength (ε) on segmentation metrics for the detector is quantified in Table 1. This degradation, where accuracy and mIoU decrease as ε increases, is visually represented in Figure 2. The trends in ResNet-50's validation performance metrics, including accuracy, loss, and F1-score, are graphically represented in Figure 3.

ResNet-50 outperforms ResNet-18 in all metrics (gains of 0.04–0.06) and shows lower loss, as its greater depth enables extraction of more complex features, reducing generalization error and improving overall accuracy. Some individual (epoch, phase) rows show large gaps (up to 0.6 in accuracy), including:

- Epoch 43, train: ResNet-18 = 0.2 vs. ResNet-50 = 0.8 (∆ = −0.6)
- Epoch 1, train: ResNet-18 = 0.2 vs. ResNet-50 = 0.8 (∆ = −0.6)
- Epoch 29, train: ResNet-18 = 0.4 vs. ResNet-50 = 1.0 (∆ = −0.6)
- Epoch 16, train: ResNet-18 = 0.4 vs. ResNet-50 = 1.0 (∆ = −0.6)
- Epoch 40, train: ResNet-18 = 0.6 vs. ResNet-50 = 1.0 (∆ = −0.4)

Table 1: FGSM Attack Metrics

ε     pixel acc  mIoU  PA    mAcc  mIoU (agg)  mF1
0.00  1.00       1.00  1.00  1.00  1.00        1.00
0.02  0.78       0.48  0.78  0.56  0.48        0.59
0.04  0.65       0.26  0.65  0.33  0.26        0.32
0.05  0.61       0.19  0.61  0.29  0.19        0.24
0.06  0.59       0.19  0.59  0.26  0.19        0.25
0.07  0.57       0.17  0.57  0.23  0.17        0.22
0.08  0.54       0.13  0.54  0.19  0.13        0.18
0.09  0.52       0.11  0.52  0.18  0.11        0.15
0.10  0.49       0.10  0.49  0.16  0.10        0.13

Fig. 2: Effect of increasing FGSM attack strength (ε) on segmentation performance. As ε rises, accuracy and mIoU drop, and several classes are completely lost.

Table 1 and Figure 2 show that FGSM attacks sharply reduce performance: mIoU drops from 1.00 to 0.48 at ε = 0.02 and to 0.10 at ε = 0.10, highlighting the need for the proposed detection framework.

Table 2: ResNet-18 Validation Metrics (Epochs 1–10)

Epoch  Phase  Loss  Accuracy  Precision (Macro)  Recall (Macro)  F1-Score (Macro)
1      val    0.5   1.0       1.0                1.0             1.0
2      val    0.6   0.7       0.7                0.5             0.8
3      val    0.5   0.7       0.7                0.5             0.8
4      val    0.9   0.7       0.7                0.5             0.8
5      val    1.1   0.7       0.7                0.5             0.8
6      val    1.0   0.7       0.7                0.5             0.8
7      val    0.5   0.7       0.7                0.5             0.8
8      val    0.4   0.7       0.7                0.5             0.8
9      val    0.5   1.0       1.0                1.0             1.0
10     val    0.3   1.0       1.0                1.0             1.0

Fig. 3: ResNet-18 validation performance across epochs 1–10, including accuracy, precision, recall, F1-score, and loss.

The validation metrics for the ResNet-18 model across the initial ten epochs are detailed in Table 2. These validation results are also plotted for visualization, as shown in Figure 3. In contrast, the validation performance for the ResNet-50 model during epochs 1–10 is provided in Table 3.
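The segmentation metrics and the Lost Classes analysis used in this section can be sketched as follows. The confusion-matrix route and all names are our own illustrative choices, assuming integer label maps with an "ignore" label of 255:

```python
import numpy as np

def confusion_matrix(pred, gt, num_classes, ignore_label=255):
    # pixel-level confusion matrix, skipping "ignore" labels
    mask = gt != ignore_label
    idx = num_classes * gt[mask].astype(int) + pred[mask].astype(int)
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)

def pixel_accuracy(cm):
    # proportion of correctly classified (non-ignored) pixels
    return float(np.diag(cm).sum() / cm.sum())

def mean_iou(cm):
    # per-class intersection over union, averaged over classes that occur
    inter = np.diag(cm).astype(float)
    union = cm.sum(0) + cm.sum(1) - inter
    valid = union > 0
    return float(np.mean(inter[valid] / union[valid]))

def lost_classes(clean_pred, adv_pred):
    # classes present in the clean prediction but missing after the attack
    return sorted(set(np.unique(clean_pred)) - set(np.unique(adv_pred)))

# toy example: the attack wipes out class 1 entirely
gt       = np.array([[0, 0], [1, 1]])
clean    = np.array([[0, 0], [1, 1]])
attacked = np.array([[0, 0], [0, 0]])
cm = confusion_matrix(attacked, gt, num_classes=2)
assert pixel_accuracy(cm) == 0.5
assert lost_classes(clean, attacked) == [1]
```

The same confusion matrix also yields per-class precision and recall, so the Dice/F1 score can be derived from it without a second pass over the pixels.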
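For reference, the FGSM perturbation swept in these experiments follows Goodfellow et al. [11]: x_adv = clip(x + ε · sign(∇_x L(x, y))). A minimal PyTorch sketch, using a stand-in linear model rather than the paper's segmentation network:

```python
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps):
    # single-step FGSM: move each input value eps along the sign of the
    # loss gradient, then clip back to the valid input range [0, 1]
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()

torch.manual_seed(0)
model = torch.nn.Linear(4, 3)   # stand-in for the segmentation model
x = torch.rand(2, 4)
y = torch.tensor([0, 2])
x_adv = fgsm(model, x, y, eps=0.05)
# the perturbation is bounded by eps in the L-infinity norm
assert torch.all((x_adv - x).abs() <= 0.05 + 1e-6)
```

Sweeping eps over the values in Table 1 (0.02–0.10) and re-scoring the perturbed inputs reproduces the degradation study in outline.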
Table 3: ResNet-50 Validation Metrics (Epochs 1–10)

Epoch  Phase  Loss  Accuracy  Precision (Macro)  Recall (Macro)  F1-Score (Macro)
1      val    0.7   0.7       0.7                0.5             0.8
2      val    0.8   0.7       0.7                0.5             0.8
3      val    1.2   0.7       0.7                0.5             0.8
4      val    1.3   0.7       0.7                0.5             0.8
5      val    0.9   0.7       0.7                0.5             0.8
6      val    0.6   0.7       0.7                0.5             0.8
7      val    0.7   0.3       0.3                0.5             0.5
8      val    0.6   0.7       0.8                0.8             0.7
9      val    0.5   0.7       0.7                0.5             0.8
10     val    0.7   0.7       0.7                0.5             0.8

Fig. 4: ResNet-50 validation performance across epochs 1–10, including accuracy, precision, recall, F1-score, and loss.

Large accuracy swings (up to 0.6) reveal training instability in ResNet-18, with brief peaks followed by drops, indicating convergence issues. ResNet-50, in contrast, shows stable, consistent learning, offering greater robustness and better resilience to adversarial perturbations for robotic perception tasks. Further analysis includes the visualization of FGSM adversarial examples, illustrating clean and perturbed inputs across increasing ε values (Figure 5).

Fig. 5: Visualization of FGSM adversarial examples [11] on a DeepLabV3+ [5] model with a ResNet-18 backbone [14] [8] [6], showing clean and adversarial inputs with increasing ε [22].

Architectural differences affect adversarial detection. ResNet-50 offers better baseline segmentation but may be more vulnerable, requiring a stricter detection threshold than ResNet-18. Figure 5 shows that small increases in attack strength ε quickly degrade segmentation, with mIoU dropping from 1.00 to 0.10 (Table 1). Despite minimal visual changes, this highlights the need for detection based on deep feature analysis rather than visual inspection.

5 Conclusions

This paper addresses the challenge of detecting adversarial attacks on robotic perception systems, focusing on semantic pixel-wise segmentation with pre-trained ResNet-18 and ResNet-50 models.
We demonstrate a method that distinguishes original from adversarially manipulated images, showing through extensive experiments that it enhances segmentation model robustness. The results highlight the importance of incorporating security into automated systems and provide a foundation for improving resilience against adaptive adversarial attacks.

As AI expands in safety-critical applications, ensuring reliability and trustworthiness is essential. By advancing adversarial detection and countermeasures, this work aims to build confidence in intelligent systems and maximize their societal benefits. In conclusion, the study lays groundwork for secure robotic perception systems and emphasizes the need for ongoing research to improve model performance while safeguarding against adversarial vulnerabilities for safe real-world deployment.

References

1. Abusnaina, H., et al.: Adversarial example detection using latent neighborhood graph. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). pp. 7687–7696 (2021)
2. Carlini, N., Wagner, D.: Adversarial examples are not easily detected: Bypassing ten detection methods. In: Proceedings of the 2017 ACM Workshop on Artificial Intelligence and Security (AISec). pp. 3–14 (2017)
3. Carlini, N., Wagner, D.: Towards evaluating the robustness of neural networks. In: IEEE Symposium on Security and Privacy (SP). pp. 39–57. IEEE (2017)
4. Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. In: ECCV. pp. 801–818 (2018)
5. Chen, L.C., Zhu, Y., Papandreou, G., Schroff, F., Adam, H.: Encoder-decoder with atrous separable convolution for semantic image segmentation. arXiv preprint arXiv:1802.02611 (2018), https://arxiv.org/abs/1802.02611
6. Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S., Schiele, B.: The Cityscapes dataset for semantic urban scene understanding. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 3213–3223 (2016), https://www.cityscapes-dataset.com/
7. Costa, J.C., Roxo, T., Proença, H., Inácio, P.R.: How deep learning sees the world: A survey on adversarial attacks & defenses. IEEE Access 12, 61113–61136 (2024). https://doi.org/10.1109/ACCESS.2024.3395118
8. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: A large-scale hierarchical image database. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 248–255 (2009), http://www.image-net.org/
9. Goodfellow, I., Bengio, Y., Courville, A.: Deep learning. MIT Press (2016)
10. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. In: ICLR (2015)
11. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572 (2015)
12. Grosse, K., Manoharan, P., Papernot, N., Backes, M., McDaniel, P.: On the (statistical) detection of adversarial examples. arXiv preprint arXiv:1702.06280 (2017)
13. Hariharan, B., Arbeláez, P., Girshick, R., Malik, J.: Hypercolumns for object segmentation and fine-grained localization. In: CVPR. pp. 447–456 (2015)
14. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 770–778 (2016), https://www.cv-foundation.org/openaccess/content_cvpr_2016/html/He_Deep_Residual_Learning_CVPR_2016_paper.html
15. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: NIPS. pp. 1097–1105 (2012)
16. LeCun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)
17. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: CVPR. pp. 3431–3440 (2015)
18. Ma, X., Li, B., Liu, Y., Zhang, Y., Gong, B., Ng, H., Metaxas, D.N.: Towards robust detection of adversarial examples. In: Advances in Neural Information Processing Systems (NeurIPS). pp. 9034–9045 (2019)
19. Miller, D.J., Wang, Y., Kesidis, G.: When not to classify: Anomaly detection of attacks (ADA) on DNN classifiers at test time. arXiv preprint arXiv:1712.06646 (2017)
20. Mumcu, F., Yilmaz, Y.: Detecting adversarial examples. arXiv preprint arXiv:2410.17442 (2024)
21. Papernot, N., McDaniel, P., Goodfellow, I., Jha, S., Celik, Z.B., Swami, A.: The limitations of deep learning in adversarial settings. In: IEEE European Symposium on Security and Privacy (EuroS&P). pp. 372–387. IEEE (2016)
22. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., et al.: PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems (NeurIPS) 32 (2019), https://pytorch.org/
23. Ronneberger, O., Fischer, P., Brox, T.: U-Net: Convolutional networks for biomedical image segmentation. In: MICCAI. pp. 234–241. Springer (2015)
24. Smith, L., Gal, Y.: Understanding measures of uncertainty for adversarial example detection. In: Proceedings of the 34th Conference on Uncertainty in Artificial Intelligence (UAI). pp. 624–633 (2018)
25. Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., Fergus, R.: Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199 (2013)
26. Xu, W., Evans, D., Qi, Y.: Feature squeezing: Detecting adversarial examples in deep neural networks. In: NDSS (2018)
27. …: Detecting adversarial examples in text classification. In: Findings of the Association for Computational Linguistics (ACL). p. … (2022)
28. …: Adversarial example detection using semantic graph matching. Information Sciences (2023)
29. …: A novel adversarial example detection method based on frequency domain reconstruction. Sensors 24(17), 5507 (2024)
