Attacking and Defending Covert Channels and Behavioral Models
In this paper we present methods for attacking and defending $k$-gram statistical analysis techniques that are used, for example, in network traffic analysis and covert channel detection. The main new result is our demonstration of how to use a behavior’s or process’ $k$-order statistics to build a stochastic process that has those same $k$-order stationary statistics but possesses different, deliberately designed, $(k+1)$-order statistics if desired. Such a model realizes a “complexification” of the process or behavior which a defender can use to monitor whether an attacker is shaping the behavior. By deliberately introducing designed $(k+1)$-order behaviors, the defender can check whether those behaviors are present in the data. We also develop constructs for source codes that respect the $k$-order statistics of a process while encoding covert information. One fundamental consequence of these results is that certain types of behavior analysis techniques come down to an {\em arms race} in the sense that the advantage goes to the party that has more computing resources applied to the problem.
💡 Research Summary
The paper introduces a novel framework for both attacking and defending k‑gram‑based statistical analysis techniques commonly used in network traffic monitoring and covert channel detection. The authors show how to construct a stochastic process that exactly matches a given k‑order stationary distribution while allowing the designer to freely choose the (k+1)‑order statistics. This is achieved by building a Probabilistic Deterministic Finite Automaton (PDFA) whose transition matrix satisfies linear constraints derived from the observed k‑grams: its entries must be non‑negative, each row must sum to one, and the prescribed stationary distribution must remain stationary under the matrix. Because many transition matrices satisfy these constraints, the designer can embed a range of (k+1)‑order behaviors without altering the original k‑order profile.
Two complementary applications arise. First, an attacker can encode covert information in a traffic stream while preserving the k‑order statistics that defenders normally monitor, thereby evading detection. The paper connects this to source‑coding theory, showing how entropy‑optimal codes (e.g., Huffman or arithmetic coding) can be adapted to respect any prescribed k‑gram distribution. Second, a defender can deliberately embed a “reference signal” or carrier that has specially crafted (k+1)‑order statistics. By monitoring for deviations from these higher‑order patterns, the defender can detect when an adversary attempts to shape the traffic, even if the lower‑order statistics remain unchanged.
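The defender's check can be pictured with a small counting experiment. The following is a minimal sketch, not the authors' detector: the helper name `kgram_counts` and the two toy binary strings are assumptions made up for illustration. It shows two sequences whose 1-gram counts are identical while their 2-gram counts differ, which is the kind of gap a first-order monitor cannot see.

```python
from collections import Counter


def kgram_counts(seq, k):
    """Count every length-k window (k-gram) in seq. Hypothetical helper."""
    return Counter(tuple(seq[i:i + k]) for i in range(len(seq) - k + 1))


# Toy sequences (invented for illustration, not data from the paper):
# both contain five 0s and five 1s, but in a different order.
alternating = "0101010101"
blocked = "0000011111"

# First-order (1-gram) counts agree ...
same_first_order = kgram_counts(alternating, 1) == kgram_counts(blocked, 1)
# ... while second-order (2-gram) counts do not.
same_second_order = kgram_counts(alternating, 2) == kgram_counts(blocked, 2)

print(same_first_order, same_second_order)
```

A monitor that compares only symbol frequencies would treat the two strings as equivalent; comparing 2-gram counts separates them. A practical detector would compare estimated distributions with a statistical test rather than exact counts.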
A detailed binary‑symbol example illustrates the concept: given first‑order probabilities r₀ and r₁, the authors solve for transition probabilities p₀₀, p₀₁, p₁₀, p₁₁ that keep the marginal distribution fixed while producing desired second‑order joint probabilities. The solution space is a convex set, showing the flexibility of the approach.
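For the two-symbol case the constraint can be written out directly. Stationarity of (r₀, r₁) under the transition matrix reduces to the single balance condition r₀·p₀₁ = r₁·p₁₀, which leaves one free parameter. The sketch below is an illustration under that standard Markov-chain identity, not the authors' code; the function names and the choice of p₀₁ as the free parameter `t` are assumptions.

```python
def binary_chain(r0, t):
    """Return a 2x2 transition matrix whose stationary distribution is (r0, 1 - r0).

    t is the free parameter p01 (hypothetical naming). The balance condition
    r0 * p01 = r1 * p10 then fixes p10.
    """
    if not 0.0 < r0 < 1.0:
        raise ValueError("r0 must lie strictly between 0 and 1")
    r1 = 1.0 - r0
    t_max = min(1.0, r1 / r0)  # keeps p10 = r0 * t / r1 inside [0, 1]
    if not 0.0 <= t <= t_max:
        raise ValueError("t must lie in [0, min(1, r1 / r0)]")
    p01 = t
    p10 = r0 * t / r1
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


def stationarity_gap(r, P):
    """Largest absolute entry of r P - r; zero means r is stationary for P."""
    n = len(r)
    return max(abs(sum(r[i] * P[i][j] for i in range(n)) - r[j]) for j in range(n))


def joint_pairs(r, P):
    """Second-order joint probabilities Pr(x_n = i, x_{n+1} = j) = r_i * p_ij."""
    return [[r[i] * P[i][j] for j in range(len(r))] for i in range(len(r))]


r = [0.7, 0.3]                 # illustrative marginals, not values from the paper
P_a = binary_chain(r[0], 0.2)  # one admissible choice of p01
P_b = binary_chain(r[0], 0.3)  # p01 = r1 gives the i.i.d. process
print(stationarity_gap(r, P_a), stationarity_gap(r, P_b))
print(joint_pairs(r, P_a)[0][1], joint_pairs(r, P_b)[0][1])
```

Both matrices leave the marginals (0.7, 0.3) stationary, yet they assign different probability to the pair (0, 1), so the second-order statistics differ while the first-order profile is unchanged.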
The methodology is then applied to real packet‑inter‑arrival data. The authors quantize delays into 13 bins, compute the empirical first‑order distribution R, and solve for a Markov transition matrix P that makes R a stationary vector. Multiple feasible P matrices exist, confirming that a traffic generator can mimic the observed first‑order statistics while embedding arbitrary higher‑order structure.
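The non-uniqueness of P can be seen with a simple one-parameter family. The sketch below is an assumption-laden illustration, not the authors' procedure: it mixes the identity matrix with the matrix whose rows all equal R, and every mixture keeps R stationary. The 13-entry vector `R` is a hypothetical placeholder, not the measured inter-arrival distribution, and the names `mixture_chain` and `stationarity_gap` are invented here.

```python
def mixture_chain(R, a):
    """Return P = (1 - a) * I + a * (matrix whose every row is R), for 0 <= a <= 1.

    Every such P has non-negative entries, rows summing to one, and R as a
    stationary vector, since R P = (1 - a) R + a * sum(R) * R = R.
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must lie in [0, 1]")
    n = len(R)
    return [[(1.0 - a) * (1.0 if i == j else 0.0) + a * R[j] for j in range(n)]
            for i in range(n)]


def stationarity_gap(R, P):
    """Largest absolute entry of R P - R; zero means R is stationary for P."""
    n = len(R)
    return max(abs(sum(R[i] * P[i][j] for i in range(n)) - R[j]) for j in range(n))


# Hypothetical 13-bin distribution (placeholder values, not the paper's data).
weights = list(range(13, 0, -1))
R = [w / sum(weights) for w in weights]

P_iid = mixture_chain(R, 1.0)     # independent draws from R
P_sticky = mixture_chain(R, 0.2)  # tends to repeat the previous bin

print(stationarity_gap(R, P_iid), stationarity_gap(R, P_sticky))
```

This family is only a thin slice of the feasible set described by the linear constraints; it is enough to show that two generators with visibly different bin-to-bin dynamics can share the same first-order distribution.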
Overall, the work frames statistical covert‑channel detection as an “arms race”: the side that can estimate and reproduce higher‑order k‑grams more accurately gains the advantage. Consequently, practical defenses require the ability to learn and monitor high‑order statistics, which in turn demands substantial computational resources and sophisticated algorithms for PDFA construction and real‑time analysis. Future directions suggested include extending the technique to multi‑dimensional alphabets (e.g., packet sizes, header fields), integrating machine‑learning models for high‑order estimation, and developing real‑time detection mechanisms for dynamically changing traffic patterns.