Hallucination-Resistant Security Planning with a Large Language Model
Large language models (LLMs) are promising tools for supporting security management tasks, such as incident response planning. However, their unreliability and tendency to hallucinate remain significant challenges. In this paper, we address these challenges by introducing a principled framework for using an LLM as decision support in security management. Our framework integrates the LLM in an iterative loop where it generates candidate actions that are checked for consistency with system constraints and lookahead predictions. When consistency is low, we abstain from the generated actions and instead collect external feedback, e.g., by evaluating actions in a digital twin. This feedback is then used to refine the candidate actions through in-context learning (ICL). We prove that this design makes it possible to control the hallucination risk by tuning the consistency threshold. Moreover, we establish a bound on the regret of ICL under certain assumptions. To evaluate our framework, we apply it to an incident response use case where the goal is to generate a response and recovery plan based on system logs. Experiments on four public datasets show that our framework reduces recovery times by up to 30% compared to frontier LLMs.
💡 Research Summary
The paper tackles the problem of using large language models (LLMs) as decision‑support tools in security management, with a focus on incident‑response planning, while explicitly addressing the notorious “hallucination” issue—outputs that appear plausible but are factually incorrect or counter‑productive. The authors formulate the security‑management task as an open‑ended sequential decision problem: a series of textual actions a₀,…,a_{T‑1} must be selected to minimize the total completion time T, without any explicit simulator of the environment.
The core contribution is a principled, iterative framework that couples three mechanisms: (1) generation of multiple candidate actions by the LLM, (2) a consistency check based on look‑ahead predictions, and (3) in‑context learning (ICL) driven by external feedback (e.g., a digital twin or expert evaluation). The workflow proceeds as follows: at each step t, the LLM receives a prompt containing the current logs, alerts, and the history of previously taken actions, and it returns N candidate actions A_t = {a₁,…,a_N}. For each candidate, the LLM also predicts the expected remaining time to finish the task after executing that action (T_i^{t+1}). The authors define a consistency function λ(A_t) = exp(−β·∑_{i=1}^N (T_i^{t+1} − T̄^{t+1})²), where T̄^{t+1} is the average predicted remaining time and β > 0 controls the decay rate. λ ranges from 0 (high disagreement) to 1 (perfect agreement).
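The consistency function above can be sketched in a few lines of Python; this is a minimal illustration of the formula λ(A_t) = exp(−β·∑(T_i − T̄)²), not the authors' implementation:

```python
import math

def consistency(predicted_times, beta=1.0):
    """Consistency score lambda(A_t) = exp(-beta * sum_i (T_i - T_bar)^2).

    predicted_times: the LLM's predicted remaining completion times
    T_i^{t+1}, one per candidate action. Returns a value in (0, 1]:
    close to 1 when the predictions agree, close to 0 when they diverge.
    """
    t_bar = sum(predicted_times) / len(predicted_times)
    dispersion = sum((t - t_bar) ** 2 for t in predicted_times)
    return math.exp(-beta * dispersion)
```

Note that larger β makes the score more sensitive to disagreement among the candidates, so β effectively sets how strict the downstream threshold check is.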
A decision policy π_γ uses a threshold γ ∈ [0, 1]: if λ(A_t) ≥ γ, the framework commits to one of the candidate actions; otherwise it abstains, collects external feedback (e.g., by evaluating the candidates in a digital twin), and feeds that feedback back to the LLM as in‑context examples to refine the next round of candidates. Tuning γ thus directly controls the hallucination risk, which is the property the authors prove formally.