Capacity of Steganographic Channels
This work investigates a central problem in steganography: how much data can safely be hidden without being detected? To answer this question, a formal definition of steganographic capacity is presented, and a general formula for this capacity is then developed. Thanks to an information-spectrum approach, the formula applies to a very broad class of channels, permitting the analysis of arbitrary steganalyzers as well as non-stationary, non-ergodic encoder and attack channels. The general formula is then specialized in several ways to gain insight into example hiding and detection methodologies. Finally, the context and applications of the work are summarized in a general discussion.
💡 Research Summary
The paper tackles the fundamental question of how much information can be hidden in a cover medium without being detected, by formally defining “steganographic capacity” and deriving a general capacity formula that applies to a wide variety of channels and detectors. The authors model a steganographic system as a triple (W, g, A): an encoder‑noise channel W that captures natural distortions (compression, quantization, etc.), a steganalyzer g that decides whether a received signal contains hidden data, and an attack channel A that represents an active adversary’s manipulations (cropping, additive noise, re‑compression). The combined effect is expressed as an encoder‑attack channel Q = A ∘ W, with transition probabilities Qₙ(z|x)=∑_y Aₙ(z|y)Wₙ(y|x).
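The composition Q = A ∘ W is just a product of transition matrices. The following sketch (my own toy illustration, not code from the paper) composes two binary channels represented as row-stochastic matrices; the specific crossover probabilities are arbitrary placeholders:

```python
def compose(W, A):
    """Encoder-attack channel Q = A ∘ W: Q[x][z] = sum_y W[x][y] * A[y][z].

    W and A are row-stochastic transition matrices (lists of rows that
    each sum to 1), with W[x][y] = W(y|x) and A[y][z] = A(z|y).
    """
    ny, nz = len(A), len(A[0])
    return [[sum(W[x][y] * A[y][z] for y in range(ny)) for z in range(nz)]
            for x in range(len(W))]

# Toy binary example: W models encoder noise as a BSC(0.1), A models an
# active attacker adding noise as a BSC(0.2). Both values are assumptions.
W = [[0.9, 0.1], [0.1, 0.9]]
A = [[0.8, 0.2], [0.2, 0.8]]
Q = compose(W, A)
# Q is again a BSC, with crossover 0.1*0.8 + 0.9*0.2 = 0.26
```

Composing two binary symmetric channels yields another binary symmetric channel, which is why memoryless attacks of this kind are easy to fold into a single effective channel.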
A code is defined as an (n, Mₙ, εₙ, δₙ)‑tuple, where εₙ is the average decoding error probability and δₙ is the probability that the steganalyzer raises an alarm. Secure capacity C(W,g,A) is the supremum of rates R for which there exists a sequence of codes with εₙ→0, δₙ→0 and (1/n)log Mₙ→R. The authors also introduce (ε, δ)‑secure capacity to handle non‑zero error/detection tolerances.
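To make the (n, Mₙ, εₙ, δₙ) bookkeeping concrete, here is a hedged toy simulation (my own construction, not the paper's): a length-n repetition code with Mₙ = 2 messages sent over a BSC(0.1), decoded by majority vote, facing a naive steganalyzer that alarms when the received sequence is implausibly uniform. All thresholds are arbitrary assumptions:

```python
import math
import random

random.seed(0)

def bsc(bits, p):
    """Pass a bit sequence through a binary symmetric channel BSC(p)."""
    return [b ^ (random.random() < p) for b in bits]

def estimate(n, trials=2000, p=0.1):
    """Monte Carlo estimates of (error rate, alarm rate, rate) for the toy code."""
    errors = alarms = 0
    for _ in range(trials):
        msg = random.randrange(2)           # Mn = 2 messages
        x = [msg] * n                       # repetition encoding
        y = bsc(x, p)                       # encoder-noise channel W
        if (sum(y) * 2 > n) != bool(msg):   # majority-vote decoding error
            errors += 1
        frac = sum(y) / n                   # toy steganalyzer: alarm if the
        if min(frac, 1 - frac) < 0.02:      # sequence is near-constant
            alarms += 1
    rate = math.log2(2) / n                 # (1/n) log Mn
    return errors / trials, alarms / trials, rate

eps_n, delta_n, R_n = estimate(100)
```

For this code εₙ and δₙ both vanish as n grows, but so does the rate (1/n) log Mₙ; the capacity question is precisely what the supremum of rates achievable with vanishing εₙ and δₙ is.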
Because steganalyzers are often non‑ergodic and non‑stationary, the paper adopts the information‑spectrum framework. General sources X = {Xⁿ} and outputs Z = {Zⁿ} are considered without any consistency assumptions. Spectral entropy rates H̅(X), H̲(X) and spectral mutual‑information rates I̅(X;Z), I̲(X;Z) are defined via the p‑limsup and p‑liminf of normalized log‑likelihood ratios. Fundamental inequalities such as I̅(X;Z) ≤ H̅(Z) − H̲(Z|X) are recalled.
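For well-behaved channels the spectral rates collapse to the classical ones. The sketch below (my illustration, not from the paper) samples the normalized information density (1/n) log [Qₙ(Zⁿ|Xⁿ)/P_{Zⁿ}(Zⁿ)] for a uniform i.i.d. input through a memoryless BSC(p): the samples concentrate at I(X;Z) = 1 − h(p), so I̲(X;Z) = I̅(X;Z) here, whereas for non-ergodic channels the two can differ:

```python
import math
import random

random.seed(1)

def h(p):
    """Binary entropy in bits."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def info_density(n, p):
    """One sample of (1/n) log [Q(Z^n|X^n) / P(Z^n)] for uniform X, BSC(p).

    Uniform input makes Z^n uniform, so P(z^n) = 2^{-n}, while
    Q(z^n|x^n) = p^k (1-p)^{n-k} with k the number of flipped bits.
    """
    k = sum(random.random() < p for _ in range(n))   # number of flips
    return (k * math.log2(p) + (n - k) * math.log2(1 - p) + n) / n

p, n = 0.1, 10000
samples = [info_density(n, p) for _ in range(200)]
# All samples land near 1 - h(0.1) ≈ 0.531; the spread shrinks as n grows.
```

The p‑liminf and p‑limsup of this sequence of random variables are exactly what the spectral rates measure, without assuming the concentration seen in this i.i.d. example.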
A key contribution is the definition of δ‑secure input and output sets, S_δ and T_δ, consisting of sources whose induced Y‑sequences satisfy lim sup Pr{gₙ(Yⁿ)=1} ≤ δ. The secure input set S₀ (δ = 0) captures sources that never trigger the detector in the limit.
The main capacity theorem states that the secure capacity is the supremum, over the secure input set, of the spectral inf‑mutual‑information rate through the encoder‑attack channel: C(W,g,A) = sup_{X∈S₀} I̲(X;Z), where I̲(X;Z) = p‑liminf_{n→∞} (1/n) log [Qₙ(Zⁿ|Xⁿ)/P_{Zⁿ}(Zⁿ)]. This has the form of the general information‑spectrum channel capacity formula, with the optimization restricted to inputs that evade the steganalyzer.
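As a hedged numeric illustration (mine, not the paper's example): if a memoryless steganalyzer only accepts i.i.d. covers with a fixed bias q, so that the secure input set pins the input law, and the encoder‑attack channel is a BSC(p), the general formula collapses to the single‑letter mutual information I(X;Z) evaluated at that forced input:

```python
import math

def h(p):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def mutual_information_bsc(q, p):
    """I(X;Z) for X ~ Bernoulli(q) through a BSC(p): H(Z) - H(Z|X)."""
    z1 = q * (1 - p) + (1 - q) * p      # P(Z = 1)
    return h(z1) - h(p)

# If the steganalyzer forces q = 0.5, the constraint is harmless and the
# secure capacity is the unconstrained BSC capacity 1 - h(p); forcing a
# biased cover distribution (e.g. q = 0.2) strictly lowers it.
unconstrained = mutual_information_bsc(0.5, 0.1)
constrained = mutual_information_bsc(0.2, 0.1)
```

This matches the intuition the paper formalizes: the detector's acceptance region, not the channel alone, determines how much of the channel's capacity remains usable for hiding.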