📝 Original Info
- Title: Who Can See Through You? Adversarial Shielding Against VLM-Based Attribute Inference Attacks
- ArXiv ID: 2512.18264
- Date: 2025-12-20
- Authors: Yucheng Fan¹, Jiawei Chen¹,², Yu Tian³, Zhaoxia Yin¹* (¹East China Normal University, Shanghai, China; ²Zhongguancun Academy, Beijing, China; ³Dept. of Computer Science and Technology, Institute for AI, Tsinghua University, Beijing, China)
- Corresponding author: zxyin@cee.ecnu.edu.cn
📝 Abstract
As vision-language models (VLMs) become widely adopted, VLM-based attribute inference attacks have emerged as a serious privacy concern, enabling adversaries to infer private attributes from images shared on social media. This escalating threat calls for dedicated protection methods to safeguard user privacy. However, existing methods often degrade the visual quality of images or interfere with vision-based functions on social media, thereby failing to achieve a desirable balance between privacy protection and user experience. To address this challenge, we propose a novel protection method that jointly optimizes privacy suppression and utility preservation under a visual consistency constraint. While our method is conceptually effective, fair comparisons between methods remain challenging due to the lack of publicly available evaluation datasets. To fill this gap, we introduce VPI-COCO, a publicly available benchmark comprising 522 images with hierarchically structured privacy questions and corresponding non-private counterparts, enabling fine-grained and joint evaluation of protection methods in terms of privacy preservation and user experience. Building upon this benchmark, experiments on multiple VLMs demonstrate that our method effectively reduces PAR below 25%, keeps NPAR above 88%, maintains high visual consistency, and generalizes well to unseen and paraphrased privacy questions, demonstrating its strong practical applicability for real-world VLM deployments.
📄 Full Content
Who Can See Through You? Adversarial Shielding Against VLM-Based Attribute Inference Attacks
Yucheng Fan¹, Jiawei Chen¹,², Yu Tian³, Zhaoxia Yin¹*
¹East China Normal University, Shanghai, China
²Zhongguancun Academy, Beijing, China
³Dept. of Comp. Sci. and Tech., Institute for AI, Tsinghua University, Beijing, China
zxyin@cee.ecnu.edu.cn
Abstract
As vision-language models (VLMs) become widely adopted, VLM-based attribute inference attacks have emerged as a serious privacy concern, enabling adversaries to infer private attributes from images shared on social media. This escalating threat calls for dedicated protection methods to safeguard user privacy. However, existing methods often degrade the visual quality of images or interfere with vision-based functions on social media, thereby failing to achieve a desirable balance between privacy protection and user experience. To address this challenge, we propose a novel protection method that jointly optimizes privacy suppression and utility preservation under a visual consistency constraint. While our method is conceptually effective, fair comparisons between methods remain challenging due to the lack of publicly available evaluation datasets. To fill this gap, we introduce VPI-COCO, a publicly available benchmark comprising 522 images with hierarchically structured privacy questions and corresponding non-private counterparts, enabling fine-grained and joint evaluation of protection methods in terms of privacy preservation and user experience. Building upon this benchmark, experiments on multiple VLMs demonstrate that our method effectively reduces PAR below 25%, keeps NPAR above 88%, maintains high visual consistency, and generalizes well to unseen and paraphrased privacy questions, demonstrating its strong practical applicability for real-world VLM deployments.
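The abstract reports PAR and NPAR without expanding the acronyms in this excerpt. Reading them, as a working assumption, as the answer rates on privacy questions (lower is better for protection) and on non-privacy questions (higher is better for utility), a minimal scoring sketch over a benchmark such as VPI-COCO could look like the following; the refusal heuristic and record layout are hypothetical, not the paper's evaluation code.

```python
# Hedged sketch: scoring PAR / NPAR over paired privacy / non-privacy questions.
# Assumption (not defined in this excerpt): PAR = fraction of privacy questions
# the VLM still answers; NPAR = fraction of non-privacy questions it answers.
# `is_refusal` and the record layout are illustrative placeholders.

REFUSAL_MARKERS = ("cannot", "can't", "unable to", "i'm sorry", "not able to")

def is_refusal(response: str) -> bool:
    """Crude keyword check for refusals; a real evaluation would be more careful."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

def answer_rate(responses) -> float:
    """Fraction of responses that are substantive answers rather than refusals."""
    if not responses:
        return 0.0
    answered = sum(0 if is_refusal(r) else 1 for r in responses)
    return answered / len(responses)

def score_benchmark(records) -> dict:
    """records: dicts holding VLM responses on protected images, e.g.
    {"privacy_response": "...", "non_privacy_response": "..."}."""
    par = answer_rate([r["privacy_response"] for r in records])       # lower is better
    npar = answer_rate([r["non_privacy_response"] for r in records])  # higher is better
    return {"PAR": par, "NPAR": npar}
```

Under this reading, the abstract's numbers mean the protected images push the privacy answer rate below 25% while keeping the non-privacy answer rate above 88%.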
1. Introduction
In recent years, the rapid progress of Vision-Language Models (VLMs) has emerged as one of the most remarkable advances in Artificial Intelligence. By integrating visual and linguistic modalities, VLMs achieve powerful cross-modal understanding and generalization across real-world scenarios. However, their growing deployment also raises increasingly diverse and complex concerns about privacy and regulatory compliance [5, 20, 29, 31].
Figure 1. Illustration of VLM-based attribute inference attack. Users often share daily-life photos on social media. Attackers can exploit VLMs to infer personal privacy attributes from visual cues, even when such attributes are never explicitly disclosed.
In particular, VLM-based attribute inference attacks have attracted increasing attention. Specifically, [6, 8, 14, 16] demonstrated that VLMs can infer an image's geographic location from subtle visual cues without geo-annotations, whereas [13, 22, 28] further showed that they can uncover private attributes from ordinary images through semantic cues. An illustrative example of such attacks is shown in Fig. 1: users on social platforms share volumes of daily-life images. By automatically collecting these images, attackers can leverage VLMs to analyze the visual content and uncover latent private attributes about the users. Such attacks are highly stealthy, low-cost, and scalable, posing an emerging threat to online privacy and security.
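To make the attack surface concrete, the probing an attacker could automate looks roughly like the sketch below. It assumes an off-the-shelf open VLM served through Hugging Face transformers (LLaVA-1.5 is an arbitrary illustrative choice, not a model named in the paper) and a hypothetical privacy question about a user's place of residence.

```python
# Illustrative attribute inference query against an open VLM (not the paper's setup).
# Requires the `transformers` and `Pillow` packages and local weights for the model.
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # arbitrary open VLM, for illustration only
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id)

image = Image.open("shared_photo.jpg")  # an ordinary photo scraped from social media
# A privacy question phrased the way an attacker might pose it.
prompt = ("USER: <image>\n"
          "Based on the visual cues, which city does this person most likely live in? "
          "ASSISTANT:")

inputs = processor(images=image, text=prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output[0], skip_special_tokens=True))
```

The point of the sketch is that such inference needs no fine-tuning and no annotations: one scraped image and one natural-language question suffice, which is exactly what makes the attack cheap, stealthy, and scalable.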
To address these concerns, prior efforts have focused on enhancing model-level safety alignment and content filtering [18]. However, these safeguards can be easily circumvented by simple evasion techniques [2, 15], rendering them ineffective against these attacks. This calls for an input-level privacy protection mechanism to defend against such threats, which poses two key challenges:
(1) Trade-off between privacy protection and user experience. An effective privacy protection method is expected to induce the model to refuse privacy questions, thereby preventing potential privacy leakage. Meanwhile, it should also minimize unintended refusals of non-privacy questions that arise from the protection process, to ensure that VLM-powered functions on social platforms, such as tagging and content recommendation, remain responsive for a satisfactory user experience. Existing input-level image privacy protection methods mainly fall into two categories: anonymization-based protection [17, 19, 23, 27] and encryption-based protection [4, 21, 30, 32]. The former anonymizes images by masking or replacing privacy-sensitive regions, while the latter enhances privacy by adding noise or applying image transformations to the input. However, these approaches often cause significant degradation in visual quality and overlook their potential negative impact on non-privacy questions, thereby failing to achieve a trade-off between privacy protection and user experience.
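Although the excerpt stops before the method is formalized, the trade-off just described can be pictured as a perturbation-based joint objective with three terms: suppress answers to privacy questions, preserve answers to non-privacy questions, and keep the perturbation visually imperceptible. The sketch below is a generic PGD-style instantiation under those assumptions; the loss callables, weights, and update rule are illustrative placeholders, not the authors' method.

```python
import torch

def protect_image(image, priv_loss, util_loss, eps=8 / 255, alpha=1 / 255,
                  steps=50, lambda_util=1.0, lambda_vis=0.1):
    """Generic PGD-style joint optimization (illustrative, not the paper's method).

    priv_loss(x): hypothetical callable, high when the VLM still answers
                  privacy questions on image x.
    util_loss(x): hypothetical callable, high when the VLM starts refusing
                  non-privacy questions on image x.
    """
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        x_adv = (image + delta).clamp(0, 1)
        # Joint objective: privacy suppression + utility preservation
        # + visual consistency (penalty on the perturbation magnitude).
        loss = (priv_loss(x_adv)
                + lambda_util * util_loss(x_adv)
                + lambda_vis * delta.pow(2).mean())
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()  # descend on the joint loss
            delta.clamp_(-eps, eps)             # keep the perturbation small
        delta.grad.zero_()
    return (image + delta).detach().clamp(0, 1)
```

In use, the two loss callables would wrap VLM forward passes, e.g. maximizing the refusal likelihood on privacy questions and the answer likelihood on their non-private counterparts, while the L-infinity bound and the visual-consistency term keep the protected image looking essentially unchanged to other users.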
(2) Lack of publicly available targeted evaluation datasets. Existing relate
This content is AI-processed based on open access ArXiv data.