Concept-Aware Privacy Mechanisms for Defending Embedding Inversion Attacks


Text embeddings enable numerous NLP applications but face severe privacy risks from embedding inversion attacks, which can expose sensitive attributes or reconstruct raw text. Existing differential privacy defenses assume uniform sensitivity across embedding dimensions, leading to excessive noise and degraded utility. We propose SPARSE, a user-centric framework for concept-specific privacy protection in text embeddings. SPARSE combines (1) differentiable mask learning to identify privacy-sensitive dimensions for user-defined concepts, and (2) the Mahalanobis mechanism that applies elliptical noise calibrated by dimension sensitivity. Unlike traditional spherical noise injection, SPARSE selectively perturbs privacy-sensitive dimensions while preserving non-sensitive semantics. Evaluated across six datasets with three embedding models and attack scenarios, SPARSE consistently reduces privacy leakage while achieving superior downstream performance compared to state-of-the-art DP methods.


💡 Research Summary

The paper addresses a pressing privacy problem in modern natural‑language processing: embedding inversion attacks that can reconstruct raw text or expose sensitive attributes from high‑dimensional text embeddings. Existing differential‑privacy (DP) defenses treat every embedding dimension as equally sensitive and therefore add isotropic (spherical) noise across the whole vector. This uniform perturbation either fails to protect user‑specified concepts adequately or severely degrades downstream utility. To overcome these limitations, the authors propose SPARSE (Sensitivity‑guided Privacy‑Aware Representations for better Semantic‑preserving), a two‑stage framework that (1) discovers which dimensions of an embedding are most informative about a user‑defined privacy concept and (2) injects anisotropic noise calibrated to those sensitivities.
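The anisotropic noise idea can be illustrated with a minimal sketch. This is not the paper's calibrated Mahalanobis mechanism (which ties the noise distribution to a formal DP guarantee); it only shows the geometric intuition of stretching otherwise spherical noise along privacy-sensitive dimensions. The function name and the `epsilon` scaling rule are illustrative assumptions.

```python
import numpy as np

def anisotropic_noise(embedding, sensitivity, epsilon, rng=None):
    """Perturb an embedding with elliptical (anisotropic) Gaussian noise.

    Illustrative sketch only, not the paper's exact mechanism:
    `sensitivity` is a nonnegative per-dimension weight vector, and
    `epsilon` loosely controls the overall noise scale. Dimensions with
    higher sensitivity receive proportionally more perturbation, while
    dimensions with zero sensitivity are left untouched.
    """
    rng = np.random.default_rng(rng)
    # Spherical (isotropic) baseline noise ...
    z = rng.standard_normal(embedding.shape)
    # ... stretched along sensitive dimensions (elliptical shape).
    scale = sensitivity / epsilon
    return embedding + scale * z
```

With a uniform sensitivity vector this degenerates to the traditional spherical noise that the paper criticizes; a sparse sensitivity vector concentrates the perturbation on concept-carrying dimensions and leaves the rest of the semantics intact.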

Stage 1 – Differentiable Mask Learning.
Given a privacy concept C (e.g., names, diseases), the authors construct a positive set D⁺ of sentences containing C and a negative set D⁻ in which the concept tokens are removed. For each sentence s, the embedding Φ(s) is element‑wise multiplied by a mask vector m ∈ [0,1]^d, where d is the embedding dimension.
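The mechanics of such a differentiable mask can be sketched as follows. This toy version (in PyTorch) relaxes a binary mask to m = sigmoid(w) and trains it to highlight the dimensions where embeddings of concept-bearing sentences differ from their concept-removed counterparts, with an L1 penalty encouraging sparsity. The objective, hyperparameters, and function name are illustrative assumptions; the paper's actual training loss may differ.

```python
import torch

def learn_concept_mask(pos_emb, neg_emb, dim, steps=200, lr=0.1, lam=0.01):
    """Toy sketch of differentiable mask learning (assumed objective).

    pos_emb: embeddings of sentences containing the concept (D+).
    neg_emb: embeddings of the same sentences with concept tokens removed (D-).
    Returns a soft mask in [0,1]^dim; large entries mark concept-sensitive
    dimensions.
    """
    w = torch.zeros(dim, requires_grad=True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        m = torch.sigmoid(w)                      # relaxed binary mask in [0,1]^dim
        # Dimensions where masked pos/neg embeddings differ most carry the
        # concept; reward the mask for keeping them.
        diff = ((pos_emb - neg_emb) * m).pow(2).mean()
        loss = -diff + lam * m.abs().sum()        # maximize separation, keep mask sparse
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.sigmoid(w).detach()
```

On synthetic data where D⁺ and D⁻ differ only in a few dimensions, the learned mask concentrates its mass on exactly those dimensions, which is the behavior the framework needs before calibrating noise in Stage 2.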

