Linking Correlated Network Flows through Packet Timing: a Game-Theoretic Approach
Deciding whether two network flows are essentially the same is an important problem in intrusion detection and in tracing anonymous connections. A stepping stone or an anonymity network may try to prevent flow correlation by delaying the packets, introducing chaff traffic, or even splitting the flow into several subflows. We introduce a game-theoretic framework for this problem. The framework is used to derive the Nash equilibrium under two different adversary models: the first, in which the adversary is limited to delaying packets, and the second, in which the adversary also adds dummy packets and removes packets from the flow. As the optimal decoder is not computationally feasible, we restrict the possible decoders to one that estimates and compensates for the attack. Our analysis can be used for understanding the limits of flow correlation based on packet timings under an active attacker.
💡 Research Summary
The paper addresses the fundamental security problem of determining whether two observed network flows originate from the same source, a task that underpins intrusion detection, stepping‑stone discovery, and de‑anonymization of low‑latency anonymity networks such as Tor. Recognizing that an adversary positioned between the source and the observer can deliberately manipulate the flow—by adding delays, injecting dummy (chaff) packets, or even splitting the flow—the authors formalize the interaction as a two‑player zero‑sum game between a Traffic Analyst (TA) and an Adversary (AD).
The game is defined as follows: the TA selects an acceptance region Λ₁(xⁿ) for hypothesis H₁ (the flows are linked) subject to a false‑positive constraint η; the AD selects an attack from a set A_AD that depends on the assumed adversary model; the utility u(Λ₁, A) is the probability that the TA correctly accepts H₁ when it is true. The equilibrium solution is min_{A∈A_AD} max_{Λ∈A_TA} u(Λ, A).
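The min-max structure can be illustrated on a toy problem. The sketch below is not the paper's derivation: it assumes both players have only a handful of pure strategies, so the utility becomes a small table of detection probabilities. The table values and the name `minimax_value` are hypothetical, and the strategy spaces in the paper are far larger.

```python
from typing import Sequence, Tuple


def minimax_value(utility: Sequence[Sequence[float]]) -> Tuple[int, int, float]:
    """Return (attack index, best-response detector index, value).

    utility[a][d] is an assumed detection probability when the AD plays
    attack a and the TA answers with detector d. The AD moves first and
    minimizes; the TA best-responds and maximizes.
    """
    best = None
    for a, row in enumerate(utility):
        # TA's best response to attack a: the detector with the highest utility.
        d = max(range(len(row)), key=lambda j: row[j])
        # AD keeps the attack whose best-response utility is lowest.
        if best is None or row[d] < best[2]:
            best = (a, d, row[d])
    return best


# Hypothetical utilities: 3 attacks (rows) x 2 detectors (columns).
toy = [
    [0.90, 0.60],
    [0.40, 0.70],
    [0.55, 0.50],
]
attack, detector, value = minimax_value(toy)
print(attack, detector, value)
```

In this toy table the AD picks the third attack, because the TA's best answer to it yields the lowest detection probability of the three rows.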
Two adversary models are examined.
- Delay‑only model – The AD may delay each packet by at most A_max seconds, preserving a one‑to‑one correspondence between the original flow xⁿ and the observed flow wⁿ (so the observed length m equals n). The TA bases detection on first‑order statistics of inter‑packet delays (IPDs). By the Neyman‑Pearson lemma, the optimal detector is a likelihood‑ratio test, which requires the joint probability density of the delay vector aⁿ. Computing this joint density is intractable, so the authors propose a practical approximation: they first estimate the most likely delay sequence âⁿ and then evaluate a simplified likelihood ratio (Equation 3). The AD’s optimal strategy, derived from the game, reduces to making the delayed flow look as typical of unrelated traffic as possible, which is approximated by maximizing ∑ᵢ log f_ΔY(Δx_i + Δa_i). Simulations on real SSH and HTTP traces (two scenarios: AWS stepping‑stone and Tor web access) show that an AD that chooses delays optimally reduces the detection probability dramatically compared with an AD that selects delays uniformly at random.
- Delay + chaff + splitting model – The AD can additionally insert up to P_A·n dummy packets and delete up to P_L·n packets. Consequently, the mapping between xⁿ and wᵐ is no longer bijective. The TA first performs a matching between original and observed packets using a synchronization offset ρ and a loss threshold γ, producing matched subsequences x^{n₂} and w^{n₂} (n₂ ≤ n). The likelihood‑ratio test is then adapted to account for the reduced length and the probabilities of packet loss (Equation 11). The AD’s action space now includes the delay vector aⁿ, a binary deletion vector ℓⁿ, and a dummy‑packet timing vector cⁿ, all subject to the P_A and P_L constraints (Equation 12).
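The matching step in the second model can be pictured with a simple greedy pass over the two timestamp sequences. The summary does not spell out the exact rule, so the sketch below rests on assumptions: ρ is treated as a fixed time offset subtracted from the observed timestamps, γ as the largest lag at which an original packet still counts as received, and the function name `match_flows` is hypothetical.

```python
from typing import List, Sequence, Tuple


def match_flows(
    x: Sequence[float], w: Sequence[float], rho: float, gamma: float
) -> Tuple[List[float], List[float]]:
    """Greedily pair original timestamps x with observed timestamps w.

    Assumed rule (not taken from the paper): an observed packet can match
    an original one only if 0 <= (w_j - rho) - x_i <= gamma. Observed
    packets that match nothing are treated as chaff; original packets
    that match nothing are treated as lost.
    """
    matched_x: List[float] = []
    matched_w: List[float] = []
    j = 0
    for t in x:
        # Skip observed packets that arrive before t after offset removal.
        while j < len(w) and w[j] - rho < t:
            j += 1
        if j < len(w) and (w[j] - rho) - t <= gamma:
            matched_x.append(t)
            matched_w.append(w[j])
            j += 1
        # Otherwise the original packet at time t is declared lost.
    return matched_x, matched_w


x = [0.0, 1.0, 2.0, 3.0]
w = [0.5, 0.7, 1.6, 3.9, 4.6]  # contains chaff; the packet sent at 2.0 is missing
mx, mw = match_flows(x, w, rho=0.5, gamma=0.5)
print(mx, mw)
```

Here the matched subsequences have length 3: the packet sent at 2.0 is declared lost, and the observed packets at 0.7 and 4.6 are left unmatched as chaff.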
The authors evaluate the robust detector in the same two traffic scenarios, comparing three AD strategies: (i) a fully random attack (uniform delays, random deletions, random dummy timings), (ii) a hybrid attack where only the delay component is optimized (the “NON2” strategy), and (iii) the fully optimal attack derived from the game. Results (Figure 3) demonstrate that the optimal AD significantly degrades detection performance, while the presence or placement of chaff packets has a relatively minor effect when there are no losses.
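The gap between a random and an optimized delay attack can be reproduced in miniature with the objective ∑ log f_ΔY(Δx_i + Δa_i) from the delay‑only model. The sketch below is only illustrative: the Laplace density standing in for f_ΔY, its parameters, the coordinate‑ascent optimizer, and the sample timestamps are all assumptions rather than the paper's method or data. It also does not enforce that packets stay in their original order.

```python
import math
import random


def log_f_unrelated(ipd, mu=1.0, b=0.2):
    """Hypothetical Laplace log-density for IPDs of unrelated traffic."""
    return -abs(ipd - mu) / b - math.log(2 * b)


def objective(x, a):
    """Sum of log f(dx_i + da_i) over consecutive packets of the delayed flow."""
    w = [t + d for t, d in zip(x, a)]
    return sum(log_f_unrelated(w[i + 1] - w[i]) for i in range(len(w) - 1))


def optimize_delays(x, a_init, a_max, grid=21, sweeps=5):
    """Coordinate ascent: re-tune one delay at a time over a grid in [0, a_max]."""
    a = list(a_init)
    candidates = [a_max * k / (grid - 1) for k in range(grid)]
    for _ in range(sweeps):
        for i in range(len(a)):
            best_d, best_val = a[i], objective(x, a)
            for d in candidates:
                a[i] = d
                val = objective(x, a)
                if val > best_val:
                    best_d, best_val = d, val
            a[i] = best_d  # keep the best delay found for packet i
    return a


random.seed(0)
x = [0.0, 0.3, 1.9, 2.2, 2.4, 4.0]  # made-up packet timestamps in seconds
a_max = 0.5
a_random = [random.uniform(0, a_max) for _ in x]  # random baseline attack
a_opt = optimize_delays(x, a_random, a_max)       # optimized attack
print(objective(x, a_random), objective(x, a_opt))
```

Because each coordinate update keeps the previous delay unless a grid value strictly improves the objective, the optimized delays never score below the random starting point.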
Key contributions of the work are:
- A formal game‑theoretic framework that captures the strategic interaction between flow correlation and active countermeasures.
- Derivation of Nash equilibria for two realistic adversary capabilities, revealing the optimal defensive and offensive strategies.
- Practical approximations for the optimal detector that remain computationally feasible while preserving most of the theoretical advantage.
- An extended matching‑based detection method that remains effective even when the adversary can add or drop packets.
The paper concludes that while exact optimal strategies are computationally prohibitive, the proposed approximations provide a solid basis for designing robust flow‑linking systems. Future work is suggested to incorporate active watermarks into the source flow and to explore real‑time implementations.