Optimal Transport-Guided Adversarial Attacks on Graph Neural Network-Based Bot Detection
The rise of bot accounts on social media poses significant risks to public discourse. To address this threat, modern bot detectors increasingly rely on Graph Neural Networks (GNNs). However, the effectiveness of these GNN-based detectors in real-world settings remains poorly understood. In practice, attackers continuously adapt their strategies and must operate under domain-specific and temporal constraints, which can fundamentally limit the applicability of existing attack methods. As a result, there is a critical need for robust GNN-based bot detection methods under realistic, constraint-aware attack scenarios. To address this gap, we introduce BOCLOAK to systematically evaluate the robustness of GNN-based social bot detection via both edge-editing and node-injection adversarial attacks under realistic constraints. BOCLOAK constructs a probability measure over spatio-temporal neighbor features and learns an optimal transport geometry that separates human and bot behaviors. It then decodes transport plans into sparse, plausible edge edits that evade detection while obeying real-world constraints. We evaluate BOCLOAK on three social bot datasets, five state-of-the-art bot detectors, and three adversarial defenses, and compare it against four leading graph adversarial attack baselines. BOCLOAK achieves up to 80.13% higher attack success rates while using 99.80% less GPU memory under realistic constraints. Most importantly, BOCLOAK shows that optimal transport provides a lightweight, principled framework for bridging the gap between adversarial attacks and real-world bot detection.
💡 Research Summary
The paper introduces BOCLOAK, a novel adversarial attack framework that targets Graph Neural Network (GNN)‑based social bot detectors under realistic, constraint‑aware settings. Recognizing that existing graph attacks (e.g., Nettack, FGA, GOttack, PR‑BCD) assume unrestricted edge manipulation and often ignore temporal or domain‑specific limits, the authors propose a geometry‑first approach grounded in optimal transport (OT) theory.
Key methodological steps:
- Neighborhood Representation – For each account v, the k‑hop ego‑network is encoded as a set of feature vectors ϕ_v(η), one per neighbor η, capturing static attributes, interaction roles, content signals, and temporal cues. Each neighbor receives an importance score a_v(η) = g_str(s(η))·g_temp(t(η)), where s(η) and t(η) are its structural and temporal statistics, respectively. The weighted collection forms an empirical probability measure μ_v = Σ_{η∈N(v)} w_v(η) δ_{ϕ_v(η)}, where the weights w_v(η) are presumably the scores a_v(η) normalized to sum to one over N(v).
- Learnable OT Geometry – Two distributions, μ_human and μ_bot, are compared via an OT distance W_c(μ_human, μ_bot). The ground cost c(z_i, z_j) is parameterized and learned so that the OT distance reflects the true separability of human and bot neighborhoods. Entropy‑regularized Sinkhorn iterations provide a scalable solution, yielding an optimal transport plan P*.
- Attack Generation – The optimal plan indicates how to transform a bot’s neighborhood distribution toward the human distribution with minimal cost. BOCLOAK decodes P* into concrete graph modifications: (a) Edge‑editing – add or delete a limited number B of incident edges to an existing bot, respecting directionality and a plausibility penalty Ψ(ΔE) that encodes temporal alignment and API limits; (b) Node‑injection – create a new bot node v_t, freely choose outgoing follow edges, and selectively acquire incoming follow‑backs only when they satisfy the same plausibility constraints. The attacker operates in a black‑box setting: it knows the training graph, node labels, and feature schema but has no access to model parameters, gradients, or logits.
- Constraint Modeling – Real‑world constraints are explicitly modeled: (i) a strict edge budget B, (ii) partial observability (the attacker cannot probe the entire graph), (iii) temporal consistency (edges must respect observed activity windows), and (iv) behavioral plausibility (incoming edges must resemble natural follow‑back patterns).
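The first two steps can be illustrated with a minimal NumPy sketch, under several assumptions that are not taken from the paper: the importance functions `g_str` and `g_temp` are given placeholder forms, a fixed normalized squared-Euclidean cost stands in for the learned ground cost c, and `top_b_pairs` is a naive way of reading off the B heaviest plan entries, not BOCLOAK's actual decoder. All function names here are hypothetical.

```python
import numpy as np

def neighborhood_measure(features, s_stats, t_stats):
    """Build an empirical measure: support points plus normalized weights."""
    # g_str and g_temp below are placeholder forms (assumed, not from the paper).
    g_str = np.log1p(np.asarray(s_stats, dtype=float))   # damped structural score
    g_temp = np.exp(-np.asarray(t_stats, dtype=float))   # recency decay
    a = g_str * g_temp                                   # importance a_v(eta)
    return np.asarray(features, dtype=float), a / a.sum()

def sq_euclidean_cost(x, y):
    """Fixed stand-in for the learned ground cost, scaled to [0, 1]."""
    diff = x[:, None, :] - y[None, :, :]
    cost = (diff ** 2).sum(axis=-1)
    return cost / cost.max()

def sinkhorn(mu, nu, cost, eps=0.5, n_iters=200):
    """Entropy-regularized OT via Sinkhorn scaling; returns (plan, transport cost)."""
    K = np.exp(-cost / eps)            # Gibbs kernel
    u = np.ones_like(mu)
    for _ in range(n_iters):
        v = nu / (K.T @ u)             # match column marginals
        u = mu / (K @ v)               # match row marginals
    plan = u[:, None] * K * v[None, :]
    return plan, float((plan * cost).sum())

def top_b_pairs(plan, budget):
    """Naive decoding: the `budget` (source, target) pairs carrying the most mass."""
    flat = np.argsort(plan, axis=None)[::-1][:budget]
    return [tuple(int(i) for i in np.unravel_index(k, plan.shape)) for k in flat]

# Toy data: 4 bot-side neighbors and 3 human-side neighbors with 2-D features.
rng = np.random.default_rng(0)
bot_x, mu_bot = neighborhood_measure(
    rng.normal(size=(4, 2)), [3, 10, 1, 7], [0.1, 2.0, 0.5, 1.0])
human_x, mu_human = neighborhood_measure(
    rng.normal(loc=1.0, size=(3, 2)), [5, 2, 8], [0.3, 0.2, 1.5])

cost = sq_euclidean_cost(bot_x, human_x)
plan, dist = sinkhorn(mu_bot, mu_human, cost)
edits = top_b_pairs(plan, budget=2)    # candidate pairs under an edge budget B = 2
```

The rows of `plan` sum to the bot-side weights and the columns to the human-side weights, so each entry says how much neighborhood mass would move from one bot-side feature to one human-side feature. A realistic decoder would additionally filter candidates through the plausibility penalty Ψ(ΔE) and the temporal constraints described above, which this sketch omits.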
Experimental evaluation:
- Datasets: TwiBot‑20, TwiBot‑22, and BotSim‑24 (the latter contains LLM‑generated bots).
- Victim Models: Five state‑of‑the‑art GNN detectors (BotRGCN, S‑HGN, RGT, etc.) in both vanilla and defended variants (adversarial training, graph sanitization, meta‑defense).
- Baselines: Nettack, Fast Gradient Attack (FGA), GOttack, PR‑BCD.
- Metrics: Attack success rate (percentage of injected/edited bots misclassified as human), GPU memory consumption, runtime.
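As a small illustration of the headline metric, attack success rate can be computed as below. This is a sketch assuming that every targeted account is a true bot and that success is judged solely from the detector's post-attack prediction; the function name and the label encoding (0 for human) are hypothetical.

```python
def attack_success_rate(post_attack_preds, human_label=0):
    """Percentage of targeted bots that the detector labels as human after the attack."""
    preds = list(post_attack_preds)
    if not preds:
        raise ValueError("need at least one targeted bot")
    fooled = sum(1 for p in preds if p == human_label)
    return 100.0 * fooled / len(preds)

# Four attacked bots, three of which are now predicted human (label 0).
asr = attack_success_rate([0, 1, 0, 0])   # 75.0
```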
Results show that BOCLOAK consistently outperforms the baselines, achieving up to 80.13% higher attack success rates while using 99.80% less GPU memory and running up to 20× faster than the strongest baselines. Even against defended detectors, BOCLOAK maintains high evasion rates, indicating that current defenses (which are generally model‑agnostic and non‑adaptive) are insufficient against OT‑guided, constraint‑aware attacks.
Contributions highlighted by the authors:
- First application of optimal transport to the security domain of social bot detection, providing a principled metric space that separates human and bot neighborhoods.
- Development of a learnable OT ground cost that aligns with realistic domain constraints, enabling attacks that are both effective and plausible.
- Demonstration that OT plans can be efficiently decoded into sparse, feasible edge edits or node‑injection strategies, dramatically reducing computational overhead compared to existing graph attacks.
- Extensive empirical validation across multiple datasets, detectors, and defenses, establishing BOCLOAK as a strong baseline for future research on both attacks and robust defenses.
The paper’s broader impact lies in shifting the adversarial graph literature from unconstrained, gradient‑driven perturbations toward geometry‑driven, constraint‑aware formulations. By showing that a lightweight OT framework can generate realistic bot camouflage, the work suggests new defensive directions, such as incorporating OT distance regularization into training or designing detection mechanisms that monitor deviations in neighborhood transport costs.