Security Games with Decision and Observation Errors


We study two-player security games which can be viewed as sequences of nonzero-sum matrix games played by an Attacker and a Defender. The evolution of the game is based on a stochastic fictitious play process. Players do not have access to each other’s payoff matrix. Each has to observe the other’s actions up to the present and play the action generated by the best response to these observations. However, when the game is played over a communication network, several practical issues need to be taken into account: First, the players may make random decision errors from time to time. Second, the players’ observations of each other’s previous actions may be incorrect. The players will try to compensate for these errors based on the information they have. We examine the convergence properties of the game in such scenarios, and establish convergence to the equilibrium point under some mild assumptions when both players are restricted to two actions.


💡 Research Summary

The paper investigates repeated two‑player security games—an Attacker (P₁) and a Defender (P₂)—under the framework of stochastic fictitious play (FP). In a classic FP setting each player observes the opponent’s past actions, forms an empirical frequency of those actions, and at every stage selects a best‑response mixed strategy against this estimate. The utility of player i is defined as Uᵢ(pᵢ,p_{‑i}) = pᵢᵀMᵢp_{‑i} + τᵢH(pᵢ), where H is the Shannon entropy and τᵢ≥0 controls the degree of randomization. When τᵢ>0 the best‑response mapping is unique and given by the soft‑max function σ(Mᵢp_{‑i}/τᵢ).
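The entropy-regularized best response above can be sketched in a few lines. This is our own illustration, not code from the paper; the 2×2 payoff matrix and the opponent frequency below are hypothetical values:

```python
import numpy as np

def softmax_best_response(M, p_opp, tau):
    """Soft-max best response sigma(M @ p_opp / tau).

    M:     the player's payoff matrix
    p_opp: empirical frequency of the opponent's actions
    tau:   temperature; tau > 0 makes the best response unique
    """
    v = M @ p_opp / tau
    v -= v.max()              # subtract max for numerical stability
    e = np.exp(v)
    return e / e.sum()

# Hypothetical 2x2 payoff matrix and opponent frequency
M = np.array([[2.0, 0.0],
              [1.0, 3.0]])
p_opp = np.array([0.5, 0.5])

br = softmax_best_response(M, p_opp, tau=0.5)
# br is a fully mixed strategy: positive entries summing to 1
```

As τ → 0 the soft-max sharpens toward the pure best response; larger τ spreads probability mass across actions via the entropy bonus τᵢH(pᵢ).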

The novelty lies in incorporating two realistic sources of imperfection that arise when the game is played over a communication network:

  1. Decision Errors – A player may intend to play action i but, due to hardware/software glitches or channel disturbances, actually executes action j with probability α_{ij} (for P₁) or ε_{ij} (for P₂). These probabilities form stochastic matrices D₁ and D₂. Assuming the matrices are invertible, the true empirical frequency qᵢ of executed actions relates to the intended frequency pᵢ by qᵢ = Dᵢpᵢ.

  2. Observation Errors – The opponent’s actions are observed through a noisy channel. The observed empirical frequency \tilde qᵢ equals Fᵢqᵢ, where Fᵢ is another stochastic matrix. If Fᵢ is known, a player can reconstruct an unbiased estimate of qᵢ by applying Fᵢ⁻¹.

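The two error channels compose linearly, so a known, invertible observation matrix can be undone exactly. A minimal numerical check of the relations qᵢ = Dᵢpᵢ and \tilde qᵢ = Fᵢqᵢ, using hypothetical error matrices of our own choosing:

```python
import numpy as np

# Hypothetical error matrices for a 2-action player.
# Column j gives the distribution of executed/observed actions
# when the intended/true action is j (columns sum to 1).
D = np.array([[0.9, 0.2],
              [0.1, 0.8]])    # decision errors:    q = D @ p
F = np.array([[0.95, 0.05],
              [0.05, 0.95]])  # observation errors: q_tilde = F @ q

p = np.array([0.7, 0.3])      # intended mixed strategy
q = D @ p                     # frequency of actually executed actions
q_tilde = F @ q               # frequency seen through the noisy channel

# If F is known and invertible, the observer recovers q exactly:
q_hat = np.linalg.inv(F) @ q_tilde
```

Note that the observer recovers the *executed* frequency q, not the intended p; undoing the decision errors as well would additionally require knowledge of D.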
Two algorithmic variants are presented. Algorithm II‑C.1 assumes perfect observations and follows the standard stochastic FP steps: update empirical frequencies, compute the soft‑max best response, and sample an action. Algorithm II‑C.2 adds an explicit correction step for observation errors: after updating the observed frequency, the player applies the inverse observation matrix before computing the best response.
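One stage of the error-corrected variant can be sketched as follows. This is our reading of the steps summarized above (update the observed frequency, apply the inverse observation matrix, compute the soft-max best response, sample an action); function and variable names are ours, and the matrices in the usage example are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def fp_step(obs_freq, k, observed_action, F_inv, M, tau):
    """One stage of stochastic FP with observation-error correction.

    obs_freq:        running empirical frequency of *observed* opponent actions
    k:               number of stages observed so far
    observed_action: index of the opponent action observed this stage
    F_inv:           inverse of the known observation-error matrix F
    """
    e = np.zeros_like(obs_freq)
    e[observed_action] = 1.0
    obs_freq = obs_freq + (e - obs_freq) / (k + 1)  # empirical-frequency update
    q_hat = F_inv @ obs_freq                        # undo observation errors
    q_hat = np.clip(q_hat, 0.0, None)
    q_hat /= q_hat.sum()                            # project back onto the simplex
    v = M @ q_hat / tau
    v -= v.max()
    p = np.exp(v); p /= p.sum()                     # soft-max best response
    action = rng.choice(len(p), p=p)                # sample the intended action
    return obs_freq, p, action

# Hypothetical setup: 2 actions, known observation-error matrix F
F = np.array([[0.9, 0.1],
              [0.1, 0.9]])
M = np.array([[2.0, 0.0],
              [1.0, 3.0]])
freq = np.array([0.5, 0.5])
freq, p, a = fp_step(freq, k=10, observed_action=0,
                     F_inv=np.linalg.inv(F), M=M, tau=0.5)
```

The clipping/renormalization step is a practical guard we added: with finitely many samples, F⁻¹ applied to the empirical frequency can momentarily leave the probability simplex even though it is unbiased in expectation.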

The core analytical contributions focus on the case where each player has exactly two actions (m = n = 2). Under this restriction the authors prove several convergence results:

  • Theorem 1 (extension of prior work) – For any τ₁,τ₂>0, the continuous‑time FP dynamics \dot pᵢ = βᵢ(p_{‑i}) − pᵢ converge to the set where each player’s mixed strategy equals its best response to the opponent’s strategy, provided the non‑degeneracy condition (LᵀM₁L)(LᵀM₂L) ≠ 0 holds for a suitable constant vector L.
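The convergence behavior in Theorem 1 can be checked numerically with a forward-Euler integration of the continuous-time dynamics. This is a sketch under our own assumptions: the 2×2 payoff matrices below are hypothetical (chosen to satisfy the non-degeneracy condition for L = (1, −1)ᵀ), and the step size and horizon are ours:

```python
import numpy as np

def beta(M, p_opp, tau):
    """Soft-max best response sigma(M @ p_opp / tau)."""
    v = M @ p_opp / tau
    v -= v.max()                     # numerical stability
    e = np.exp(v)
    return e / e.sum()

def simulate_ct_fp(M1, M2, tau1, tau2, dt=0.01, steps=20_000):
    """Forward-Euler integration of p_i' = beta_i(p_{-i}) - p_i."""
    p1 = np.array([0.9, 0.1])        # arbitrary interior initial conditions
    p2 = np.array([0.2, 0.8])
    for _ in range(steps):
        b1 = beta(M1, p2, tau1)
        b2 = beta(M2, p1, tau2)
        p1 = p1 + dt * (b1 - p1)
        p2 = p2 + dt * (b2 - p2)
    return p1, p2

# Hypothetical payoff matrices; (L^T M1 L)(L^T M2 L) = 4 * (-4) != 0
M1 = np.array([[2.0, 0.0], [1.0, 3.0]])
M2 = np.array([[1.0, 2.0], [3.0, 0.0]])
p1, p2 = simulate_ct_fp(M1, M2, tau1=0.5, tau2=0.5)
# at a rest point each strategy equals the best response to the opponent's
```

At any Euler fixed point p = p + dt(β − p) forces β(p_{−i}) = pᵢ exactly, so the discretization does not shift the equilibrium, only the path toward it.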
