Taipan: A Query-free Transfer-based Multiple Sensitive Attribute Inference Attack Solely from Publicly Released Graphs

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Graph-structured data underpin a wide spectrum of modern applications. However, complex graph topologies and homophilic patterns can facilitate attribute inference attacks (AIAs) by enabling sensitive information leakage to propagate across local neighborhoods. Existing AIAs predominantly assume that adversaries can probe sensitive attributes through repeated model queries. Such assumptions are often impractical in real-world settings due to stringent data protection regulations, prohibitive query budgets, and heightened detection risks, especially when inferring multiple sensitive attributes. More critically, this model-centric perspective obscures a pervasive blind spot: \textbf{intrinsic multiple sensitive information leakage arising solely from publicly released graphs.} To exploit this unexplored vulnerability, we introduce a new attack paradigm and propose \textbf{Taipan, the first query-free transfer-based attack framework for multiple sensitive attribute inference attacks on graphs (G-MSAIAs).} Taipan integrates \emph{Hierarchical Attack Knowledge Routing} to capture intricate inter-attribute correlations, and \emph{Prompt-guided Attack Prototype Refinement} to mitigate negative transfer and performance degradation. We further present a systematic evaluation framework tailored to G-MSAIAs. Extensive experiments on diverse real-world graph datasets demonstrate that Taipan consistently achieves strong attack performance across same-distribution settings and heterogeneous similar- and out-of-distribution settings with mismatched feature dimensionalities, and remains effective even under rigorous differential privacy guarantees. Our findings underscore the urgent need for more robust multi-attribute privacy-preserving graph publishing methods and data-sharing practices.

💡 Research Summary

The paper “Taipan: A Query‑free Transfer‑based Multiple Sensitive Attribute Inference Attack Solely from Publicly Released Graphs” introduces a novel attack paradigm that abandons the traditional query‑driven approach to attribute inference attacks (AIAs) on graph‑structured data. Existing AIAs rely on repeatedly probing a victim model to maximize posterior confidence for candidate sensitive attribute values. This paradigm is increasingly impractical because real‑world deployments are constrained by strict data‑protection regulations (e.g., GDPR), limited query budgets, and sophisticated anomaly‑detection systems. Moreover, prior work focuses on a single sensitive attribute and does not scale to the multi‑attribute setting that is common in social, financial, or health‑related graphs.

Taipan addresses these gaps by operating entirely query‑free: it assumes the attacker only has access to an auxiliary graph that is publicly released (or otherwise obtainable) and a pre‑trained attack model built on that auxiliary data. The attacker then transfers the learned knowledge to a target graph whose sensitive attributes have been masked or sanitized, possibly under strong differential‑privacy (DP) guarantees. The attack is framed as an unsupervised domain‑adaptation problem rather than a black‑box model‑exploitation problem.

Core Technical Contributions

Hierarchical Attack Knowledge Routing (HAKR) – The authors treat each sensitive attribute as a separate task within a multi‑task learning (MTL) framework. Instead of the conventional shared‑bottom MTL, they adopt a Multi‑gate Mixture‑of‑Experts (MMoE) architecture and organize the experts hierarchically according to an “attack hierarchy tree” derived from correlation analysis on the auxiliary graph. This hierarchy routes similar attributes to shared experts (promoting positive transfer) while separating conflicting attributes via distinct gates (mitigating negative transfer). Learnable pre‑text tokens serve as task identifiers, allowing the model to capture both shared and task‑specific knowledge.
Prompt‑guided Attack Prototype Refinement (PAPR) – After pre‑training on the auxiliary graph, Taipan freezes all model parameters and only fine‑tunes a lightweight set of learnable tokens (prompts). These prompts act as adapters that extract structural and semantic cues from the target graph and align them with the source knowledge. Because the target graph’s sensitive attributes are hidden, the method employs pseudo‑labeling: high‑confidence predictions from the frozen model are treated as provisional labels, and only those nodes are retained for further adaptation. The approach also re‑uses auxiliary samples during adaptation to alleviate data scarcity.
Systematic Evaluation Framework – Recognizing that existing AIA metrics target single attributes, the authors propose three families of metrics: (a) Attack Utility (overall multi‑attribute accuracy, macro‑F1, etc.), (b) Task Deviation (variance across attributes, measuring interference), and (c) Semantic Knowledge Preservation (how well non‑sensitive features and graph topology are retained). This comprehensive suite enables fair comparison across same‑distribution, similar‑distribution, and out‑of‑distribution (OOD) scenarios.

Experimental Findings

Datasets: Multiple real‑world graphs (Cora, Pubmed, Reddit, ogbn‑arxiv) with varying sizes, feature dimensions, and attribute sets.
Settings: (i) Same‑distribution transfer (auxiliary and target graphs share distribution), (ii) Similar‑distribution transfer (different but related domains), (iii) Heterogeneous OOD transfer (mismatched feature dimensionalities, structural properties), and (iv) Strong DP protection (ε‑DP with ε = 1 and ε = 0.5).
Baselines: Query‑based attacks (e.g., CSMIA, LOMIA), imputation‑based methods, and recent graph‑specific attacks that rely on node embeddings or shadow models.
Results: Taipan consistently outperforms all baselines. In same‑distribution cases it achieves up to 92% multi‑attribute accuracy, a 15‑20% absolute gain over the best query‑based competitor. Under OOD conditions, the hierarchical routing mitigates performance loss, yielding 10‑12% higher accuracy than naïve transfer. Even with ε‑DP noise injected, Taipan retains >70% accuracy, whereas baselines drop below 40%. The prompt‑only fine‑tuning requires updating less than 1% of total parameters, demonstrating high efficiency.

Significance and Limitations

Taipan’s query‑free nature eliminates the need for any interaction with the victim model, making detection virtually impossible with current monitoring tools. By leveraging only publicly released graphs, it reveals a previously under‑explored privacy leakage vector that persists even when models are sanitized or protected by DP. The hierarchical MMoE design and prompt‑based adaptation constitute the first successful combination of multi‑task expert routing and lightweight adapters in the graph domain.

However, the approach depends on the existence of a sufficiently similar auxiliary graph; if the auxiliary data are too dissimilar, the hierarchical routing may misguide the transfer. The pseudo‑labeling step also assumes that high‑confidence predictions are reliable, which may not hold for extremely noisy or heavily perturbed target graphs. Moreover, the current prompt design is manual; automated prompt search or meta‑learning could further improve robustness.

Future Directions

Automated Prompt Optimization: Employ meta‑learning or reinforcement learning to discover optimal prompt tokens for each target domain.
Adversarial Domain Alignment: Integrate adversarial discriminators to enforce distribution alignment between source and target embeddings, reducing reliance on pseudo‑labels.
Defensive Countermeasures: Explore graph‑level perturbations, edge‑level randomization, or attribute‑correlation masking to disrupt the hierarchical knowledge that Taipan exploits.
Broader Applicability: Extend the framework to heterogeneous graphs, temporal networks, and multimodal data where multiple sensitive attributes co‑occur.

In summary, the paper presents a groundbreaking shift from query‑dependent to query‑free, transfer‑based attribute inference attacks on graphs. Taipan demonstrates that merely publishing a graph, even under strong privacy guarantees, can leak multiple sensitive attributes when an adversary leverages sophisticated multi‑task learning and prompt‑based domain adaptation. The work calls for a re‑evaluation of current graph‑publishing practices, privacy‑preserving mechanisms, and regulatory policies, emphasizing that privacy risks extend far beyond direct model access.