CyberRAG: An Agentic RAG cyber attack classification and reporting tool
Intrusion Detection and Prevention Systems (IDS/IPS) in large enterprises can generate hundreds of thousands of alerts per hour, overwhelming analysts with logs requiring rapidly evolving expertise. Conventional machine-learning detectors reduce alert volume but still yield many false positives, while standard Retrieval-Augmented Generation (RAG) pipelines often retrieve irrelevant context and fail to justify predictions. We present CyberRAG, a modular agent-based RAG framework that delivers real-time classification, explanation, and structured reporting for cyber-attacks. A central LLM agent orchestrates: (i) fine-tuned classifiers specialized by attack family; (ii) tool adapters for enrichment and alerting; and (iii) an iterative retrieval-and-reason loop that queries a domain-specific knowledge base until evidence is relevant and self-consistent. Unlike traditional RAG, CyberRAG adopts an agentic design that enables dynamic control flow and adaptive reasoning. This architecture autonomously refines threat labels and natural-language justifications, reducing false positives and enhancing interpretability. It is also extensible: new attack types can be supported by adding classifiers without retraining the core agent. CyberRAG was evaluated on SQL Injection, XSS, and SSTI, achieving over 94% accuracy per class and a final classification accuracy of 94.92% through semantic orchestration. Generated explanations reached 0.94 in BERTScore and 4.9/5 in GPT-4-based expert evaluation, with robustness preserved against adversarial and unseen payloads. These results show that agentic, specialist-oriented RAG can combine high detection accuracy with trustworthy, SOC-ready prose, offering a flexible path toward partially automated cyber-defense workflows.
💡 Research Summary
CyberRAG introduces a novel, agent‑centric Retrieval‑Augmented Generation (RAG) framework designed to address the chronic overload of alerts generated by modern Intrusion Detection and Prevention Systems (IDS/IPS). The authors identify two core shortcomings in existing solutions: (1) conventional machine‑learning detectors, while reducing alert volume, still suffer from high false‑positive rates and limited interpretability; and (2) standard RAG pipelines retrieve context only once before generation, which often yields irrelevant evidence and provides no justification for the model’s decisions.
The proposed architecture consists of three tightly coupled components. First, a suite of fine‑tuned encoder‑based classifiers (BERT/RoBERTa) is trained on attack‑specific corpora, delivering high‑precision labels for each incoming alert (e.g., SQL Injection, Cross‑Site Scripting, Server‑Side Template Injection). Second, a central large language model (LLM) acts as an autonomous agent that orchestrates the workflow. Upon receiving a classification, the agent formulates an initial query to a domain‑specific knowledge base (constructed from public CVE entries, OWASP guides, MITRE ATT&CK matrices, and internal policy documents). The agent then enters an iterative Retrieval‑and‑Reason loop: it evaluates the retrieved documents for relevance and self‑consistency, refines the query if evidence is insufficient, and repeats until a coherent evidence set is assembled. Third, tool adapters enrich the evidence with additional metadata (severity scores, remediation steps) and the agent produces a structured report in a JSON‑like schema together with a natural‑language narrative. An interactive LLM assistant allows analysts to ask follow‑up questions in real time, turning the system into a collaborative analyst‑agent pair.
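The Retrieval‑and‑Reason loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the keyword‑tag knowledge base, the relevance check, and the query‑refinement step are hypothetical stand‑ins for what would in practice be a vector store and an LLM‑based judge.

```python
# Minimal sketch of an iterative retrieval-and-reason loop.
# The tag-based retrieval and consistency check below are illustrative
# heuristics standing in for embedding search and LLM self-evaluation.

from dataclasses import dataclass

@dataclass(frozen=True)
class Document:
    text: str
    tags: frozenset

KNOWLEDGE_BASE = [
    Document("OWASP guide: SQL injection via unsanitized input",
             frozenset({"sqli", "owasp"})),
    Document("CVE entry: template engine allows remote code execution",
             frozenset({"ssti", "cve"})),
    Document("MITRE ATT&CK: script injection into rendered web pages",
             frozenset({"xss", "mitre"})),
]

def retrieve(query_tags):
    """Return every document sharing at least one tag with the query."""
    return [d for d in KNOWLEDGE_BASE if d.tags & query_tags]

def is_consistent(docs, label):
    """Evidence is self-consistent when every retrieved doc matches the label."""
    return bool(docs) and all(label in d.tags for d in docs)

def retrieve_and_reason(label, max_rounds=3):
    """Refine the query until the evidence set is relevant and consistent."""
    query = frozenset({label, "owasp", "cve", "mitre"})  # broad first query
    for _ in range(max_rounds):
        docs = retrieve(query)
        if is_consistent(docs, label):
            return docs
        query = frozenset({label})  # refine: drop noisy expansion terms
    return []
```

The key point the loop captures is that retrieval is not one‑shot: a broad initial query pulls in off‑topic context, the consistency check rejects it, and a narrowed query is retried before the agent commits to an explanation.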
Experimental evaluation focuses on three prevalent web‑based attacks. Using a curated dataset of over 10k labeled IDS alerts, the individual classifiers achieve >94% accuracy, and the end‑to‑end system reaches a final classification accuracy of 94.92%. Generated explanations are measured with BERTScore (0.94) and ROUGE‑L (0.88), and a GPT‑4‑based expert panel rates the prose at an average of 4.9 out of 5, indicating strong trustworthiness. Robustness tests with adversarial payload obfuscations and previously unseen variants show only marginal performance degradation (<2%). The average end‑to‑end latency is 1.8 seconds per alert, satisfying real‑time SOC requirements.
A key strength of CyberRAG is its extensibility. Adding a new attack family merely requires training an additional fine‑tuned classifier; the agent and the iterative RAG loop remain unchanged, eliminating the need for costly retraining of the whole system. The structured output can be directly ingested by SIEM/SOAR platforms, facilitating seamless integration into existing security operations.
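The extensibility claim can be illustrated with a simple classifier registry: supporting a new attack family means registering one more detector, while the orchestration code and report schema stay untouched. The detector logic and JSON field names below are illustrative placeholders assumed for this sketch, not the paper's exact classifiers or schema.

```python
# Sketch of plug-in extensibility: each attack family contributes one
# classifier via a registry; the report generator never changes.

import json

CLASSIFIERS = {}

def register(attack_family):
    """Decorator that adds a detector to the registry under its family name."""
    def wrap(fn):
        CLASSIFIERS[attack_family] = fn
        return fn
    return wrap

@register("sql_injection")
def detect_sqli(payload):
    # Toy heuristic standing in for a fine-tuned BERT/RoBERTa classifier.
    p = payload.lower()
    return "' or" in p or "union select" in p

@register("xss")
def detect_xss(payload):
    return "<script" in payload.lower()

def classify_and_report(payload):
    """Run every registered classifier and emit a JSON report for SIEM ingestion."""
    hits = [name for name, fn in CLASSIFIERS.items() if fn(payload)]
    return json.dumps({
        "payload": payload,
        "predicted_family": hits[0] if hits else "benign",
        "all_matches": hits,
    })
```

Adding SSTI support would be one more `@register("ssti")` function; `classify_and_report` and its downstream consumers are unaffected, which mirrors the paper's claim that new attack types require no retraining of the core agent.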
The authors acknowledge limitations: reliance on open‑source LLMs may yield lower generation quality compared with proprietary models like GPT‑4; maintaining an up‑to‑date knowledge base demands continuous ingestion pipelines for new CVE entries and internal documentation. Future work proposes reinforcement‑learning‑based query optimization to reduce retrieval cost, incorporation of multimodal evidence (packet captures, log screenshots), and knowledge‑distillation techniques to improve open‑source LLM performance.
In summary, CyberRAG demonstrates that coupling specialist classifiers with an autonomous, iterative RAG agent can simultaneously boost detection accuracy and produce trustworthy, human‑readable explanations. This agentic RAG paradigm offers a practical pathway toward partially automated, explainable cyber‑defense workflows, potentially raising the automation maturity of modern Security Operations Centers.