JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2021
Semantic Mismatch and Perceptual Degradation: A New Perspective on Image Editing Immunity
Shuai Dong, Jie Zhang, Member, IEEE, Guoying Zhao, Fellow, IEEE, Shiguang Shan, Fellow, IEEE, Xilin Chen, Fellow, IEEE
Abstract—Text-guided image editing via diffusion models, while powerful, raises significant concerns about misuse, motivating efforts to immunize images against unauthorized edits using imperceptible perturbations. Prevailing metrics for evaluating immunization success typically rely on measuring the visual dissimilarity between the output generated from a protected image and a reference output generated from the unprotected original. This approach fundamentally overlooks the core requirement of image immunization, which is to disrupt semantic alignment with attacker intent, regardless of deviation from any specific output. We argue that immunization success should instead be defined by the edited output either semantically mismatching the prompt or suffering substantial perceptual degradation, both of which thwart malicious intent. To operationalize this principle, we propose Synergistic Intermediate Feature Manipulation (SIFM), a method that strategically perturbs intermediate diffusion features through dual synergistic objectives: (1) maximizing feature divergence from the original edit trajectory to disrupt semantic alignment with the expected edit, and (2) minimizing feature norms to induce perceptual degradation. Furthermore, we introduce the Immunization Success Rate (ISR), a metric designed to rigorously quantify true immunization efficacy for the first time. ISR measures the proportion of edits in which immunization induces either semantic failure relative to the prompt or significant perceptual degradation, assessed via Multimodal Large Language Models (MLLMs). Extensive experiments show that SIFM achieves state-of-the-art performance in safeguarding visual content against malicious diffusion-based manipulation.
Index Terms—Diffusion Models, Image Editing, Image Immunization.
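To make the two objectives in the abstract concrete, the following is a minimal numerical sketch of a dual-objective adversarial loss. The feature shapes, the weight `lam`, the step size, and the closed form of the loss are illustrative assumptions, not the paper's implementation; in SIFM the perturbation lives in pixel space and acts on intermediate diffusion features through the model, whereas here it is applied to mock features directly for brevity.

```python
import numpy as np

def sifm_loss(f_adv, f_ref, lam=0.1):
    """Toy stand-in for SIFM's dual synergistic objectives.

    Minimizing this loss simultaneously
      (1) maximizes divergence of the perturbed intermediate features
          f_adv from the reference edit-trajectory features f_ref, and
      (2) keeps the feature norm of f_adv small.
    """
    divergence = np.sum((f_adv - f_ref) ** 2)  # objective (1): push away
    norm_term = np.sum(f_adv ** 2)             # objective (2): shrink norms
    return -divergence + lam * norm_term

rng = np.random.default_rng(0)
f_ref = rng.normal(size=(4, 8))   # mock intermediate diffusion features

# Projected gradient descent on the loss, keeping the perturbation in an
# L-infinity ball of radius eps (mirroring an imperceptibility budget).
eps, lr, lam = 0.5, 0.05, 0.1
delta = np.zeros_like(f_ref)
for _ in range(50):
    f_adv = f_ref + delta
    grad = -2.0 * (f_adv - f_ref) + 2.0 * lam * f_adv  # d(loss)/d(f_adv)
    delta = np.clip(delta - lr * grad, -eps, eps)      # projected step
f_adv = f_ref + delta
```

After optimization, `f_adv` diverges from `f_ref` within the budget while the norm penalty keeps its magnitudes from growing, which is the synergy the two objectives are meant to achieve.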
I. INTRODUCTION
Recent breakthroughs in diffusion models (DMs) have revolutionized text-to-image synthesis and manipulation [1]–[9], enabling high-fidelity generation and fine-grained editing through natural language guidance [10]–[21]. While these capabilities unlock transformative creative tools, they also introduce severe ethical risks. Malicious actors could exploit DMs to generate deepfakes or forged content for disinformation campaigns, privacy violations, or manipulation of public discourse [22]. Such threats erode trust in digital media and risk destabilizing socio-political systems, making the development of robust safeguards against harmful edits a critical priority. One promising defense paradigm is image immunization, which addresses this challenge by embedding imperceptible perturbations into images to proactively disrupt unauthorized edits [23]–[31].

Jie Zhang, Shiguang Shan, and Xilin Chen are with the State Key Laboratory of AI Safety, Institute of Computing Technology, Chinese Academy of Sciences (CAS), Beijing 100190, China, and also with the University of Chinese Academy of Sciences, Beijing 100049, China (e-mail: zhangjie@ict.ac.cn; sgshan@ict.ac.cn; xlchen@ict.ac.cn).
Guoying Zhao is with the Center for Machine Vision and Signal Analysis, University of Oulu, 90014 Oulu, Finland (e-mail: guoying.zhao@oulu.fi).
Shuai Dong is with the School of Computer Science, China University of Geosciences, Wuhan 430074, China (e-mail: dongshuai iu@163.com).
Despite notable advancements in mitigating malicious edits [27]–[32], the prevailing standard for defining immunization success remains superficial, predominantly relying on visual dissimilarity between the edited output of the immunized image and the specific edited output of the unprotected original. This approach is fundamentally limited because such a reference edit represents merely one possible outcome among a spectrum of valid edits for a given prompt, especially given the inherent variability of diffusion models. Consequently, deviation from such a non-unique reference does not inherently signify successful immunization. Critically, these evaluations overlook the core requirement that effective protection must disrupt semantic alignment with the attacker's intent, irrespective of comparison to any specific reference instance. For example, with the prompt "make the hairstyles look more gothic" as shown in Fig. 1, various "gothic" interpretations can be semantically valid yet look very different from the edited original. Therefore, significant visual deviation alone does not mean a malicious edit has been prevented. This inadequacy of current standards to genuinely assess immunization raises a critical question: what constitutes an accurate standard for immunization success?
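The ISR metric introduced in the abstract operationalizes such a standard as a simple proportion over per-edit verdicts. The sketch below mocks those verdicts as boolean pairs; in the paper they are produced by an MLLM judge, and the function name and input format here are illustrative assumptions.

```python
def immunization_success_rate(judgments):
    """Immunization Success Rate: the fraction of edits whose output
    either semantically mismatches the prompt OR shows significant
    perceptual degradation. Each judgment is a boolean pair
    (semantic_mismatch, perceptual_degradation); here the verdicts are
    given directly rather than obtained from an MLLM judge.
    """
    if not judgments:
        raise ValueError("need at least one edit to score")
    hits = sum(1 for mismatch, degraded in judgments if mismatch or degraded)
    return hits / len(judgments)

# Four hypothetical edits: one mismatched, one both mismatched and
# degraded, one degraded only, and one edit that slipped through.
verdicts = [(True, False), (True, True), (False, True), (False, False)]
isr = immunization_success_rate(verdicts)  # 3 of 4 edits thwarted -> 0.75
```

Note that either failure mode alone counts as a success: the metric scores whether the attacker's intent was thwarted, not distance from any particular reference output.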
To address this question, we redefine successful image immunization through an adversarial lens