A Survey on Agentic Security: Applications, Threats and Defenses
In this work we present the first holistic survey of the agentic security landscape, structuring the field around three fundamental pillars: Applications, Threats, and Defenses. We provide a comprehensive taxonomy of over 160 papers, explaining how agents are used in downstream cybersecurity applications, inherent threats to agentic systems, and countermeasures designed to protect them. A detailed cross-cutting analysis shows emerging trends in agent architecture while revealing critical research gaps in model and modality coverage. A complete and continuously updated list of all surveyed papers is publicly available at https://github.com/kagnlp/Awesome-Agentic-Security.
💡 Research Summary
This paper delivers the first comprehensive survey of the emerging field of agentic security, organizing the rapidly expanding literature around three fundamental pillars: Applications, Threats, and Defenses. The authors examined more than 160 papers published in 2024‑2025, a period that alone saw a surge of research on large‑language‑model (LLM) agents applied to cybersecurity. By defining an “LLM Agent” as a system whose core decision‑making module is an LLM that plans, invokes tools/APIs, interacts with an external environment, observes feedback, and adapts its actions, the authors establish a clear conceptual baseline for the survey.
Applications (Red‑Team and Blue‑Team)
The survey first maps how agents are employed across the security lifecycle. In offensive (red‑team) settings, autonomous penetration‑testing frameworks such as PentestGPT, Incalmo, and AutoPentest demonstrate end‑to‑end exploit discovery, fuzzing, and exploit adaptation. Multi‑agent pipelines like RA‑G and task‑graph coordination push the state of the art in zero‑day discovery (e.g., Locus, CVE‑Genie). Defensive (blue‑team) agents include autonomous threat‑detection and incident‑response systems (IRCopilot, COR‑TEX), intelligent threat hunting (ProvSEEK, LLM‑CloudHunter), automated forensics and root‑cause analysis (RepoAudit, GALA), and autonomous patch synthesis (RepairAgent, IaC‑Agent). Domain‑specific agents address cloud and infrastructure hardening (KubeIntellect, BAR‑TPredict), web application security (DMAPTA, AIOS), and specialized sectors such as blockchain (LISA), healthcare (HIPAA‑Agent), and privacy (OneShield). Each application area is linked to representative benchmarks (AutoPenBench, ExCyTInBench, AI‑Pentest‑Benchmark) that enable quantitative comparison.
Threat Landscape
Transitioning from stand‑alone LLMs to autonomous agents dramatically expands the attack surface. The authors categorize threats into five major groups:
- Injection Attacks – Prompt injection, system‑prompt leakage, and split‑payload attacks that coerce agents into unintended actions. Benchmarks such as AgentDojo and PromptInject reveal a fundamental trade‑off between security hardening and task performance.
- Poisoning & Extraction – Manipulation of training data, tool outputs, or external knowledge bases to bias agent reasoning.
- Jailbreak Attacks – Techniques that bypass LLM safety alignment (e.g., BrowserArt, LLMFuzzer).
- Agent Manipulation – Inter‑agent message tampering, trust abuse, and “Agent‑in‑the‑Middle” attacks that corrupt collaborative workflows.
- Pre‑execution (Backdoor) Threats – Embedding malicious code or hidden triggers during agent initialization (e.g., ScamAgents, RSP).
The survey also documents red‑team specific attacks (SentinetAgent, AiTM) and evaluates them using adversarial benchmarks (DoomArena, SafeArena, STWebAgentBench) and execution environments (AgentDojo, CVE‑Bench).
Defensive Countermeasures
Defenses are organized along the software lifecycle:
- Secure‑by‑Design (ACE, Task Shield) embed safety constraints directly into the agent’s planning and tool‑selection modules.
- Multi‑Agent Security (D‑CIPHER, PhishDebate) focus on trust management, consensus protocols, and inter‑agent verification.
- Runtime Protection (R2‑Guard/GuardRail, AgentSpec with Human‑in‑the‑Loop, SentinelAgent behavioral monitoring) provides real‑time detection and mitigation of anomalous actions.
- Security Operations (IRIS formal verification, AutoBNB incident response, COR‑TEX SOC triage, ExCyTIn‑Bench threat hunting, GALA forensics) integrate agents into existing SOC pipelines.
- Evaluation Frameworks (ASB, RAS‑Eval, AI Agents Under Threat, Safety at Scale) standardize benchmarking of both attacks and defenses.
Cross‑Cutting Analysis and Gaps
The authors identify several systemic issues: a heavy reliance on proprietary GPT‑style models, a pronounced modality bias toward text (limited image, audio, or code integration), and a fragmented benchmark ecosystem that hampers reproducible comparison. Critical research gaps include (1) development of multimodal, multi‑agent collaboration frameworks, (2) real‑time safety verification mechanisms that do not degrade utility, and (3) open‑source, model‑agnostic defense solutions.
Conclusions and Outlook
By unifying applications, threats, and defenses under a single taxonomy, the survey provides a holistic map of the agentic security landscape. It highlights the transformative potential of LLM agents for both offensive and defensive cybersecurity while warning that current research is overly concentrated on specific models and narrow tasks. The paper calls for broader, standardized evaluation practices, richer multimodal capabilities, and robust, layered defenses to ensure that the next generation of autonomous agents can be deployed safely in high‑stakes security environments.
Comments & Academic Discussion
Loading comments...
Leave a Comment