Lost Audio Packets Steganography: The First Practical Evaluation

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

This paper presents the first experimental results for an IP telephony-based steganographic method called LACK (Lost Audio PaCKets steganography). The method exploits the fact that in typical multimedia communication protocols such as RTP (Real-Time Transport Protocol), excessively delayed packets are not used to reconstruct the transmitted data at the receiver; they are considered useless and discarded. The results were obtained using a functional LACK prototype and show the method’s impact on the quality of voice transmission. The achievable steganographic bandwidth for different IP telephony codecs is also calculated.

💡 Research Summary

The paper presents the first practical evaluation of a VoIP‑based steganographic technique called LACK (Lost Audio Packets steganography). LACK exploits the fact that in RTP‑based real‑time communication, packets that arrive later than the jitter buffer’s tolerance are discarded and never used for voice reconstruction. The authors implement a prototype that selects an RTP packet from the voice stream, replaces its payload with bits of a secret message, and deliberately delays the packet so that a normal (non‑aware) receiver treats it as a lost packet. A receiver that knows the LACK protocol extracts the hidden payload instead of dropping the packet. Because no extra packets are generated, the overall traffic pattern remains unchanged, making detection difficult.
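The embed-and-delay idea described above can be illustrated with a small simulation. This is a minimal sketch, not the authors' prototype: the `RtpPacket` fields, the `lack_embed` and `receive` helpers, the fixed "every n-th packet" selection rule, and the zero-byte padding are all assumptions made for illustration. It also assumes, as a simplification, that every excessively delayed packet carries hidden data.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class RtpPacket:
    seq: int          # RTP sequence number
    payload: bytes    # voice frame, or hidden data after embedding
    delay_ms: float   # delay with which the packet reaches the receiver


def lack_embed(stream: List[RtpPacket], secret: bytes,
               every_n: int, extra_delay_ms: float) -> List[RtpPacket]:
    """Replace the payload of every n-th packet with secret bytes and delay it.

    The number of packets is unchanged: no extra packets are generated.
    """
    out = []
    offset = 0
    for i, pkt in enumerate(stream):
        if offset < len(secret) and (i + 1) % every_n == 0:
            size = len(pkt.payload)
            # Pad the last chunk with zero bytes so the payload size is preserved.
            chunk = secret[offset:offset + size].ljust(size, b"\x00")
            offset += size
            pkt = replace(pkt, payload=chunk,
                          delay_ms=pkt.delay_ms + extra_delay_ms)
        out.append(pkt)
    return out


def receive(stream: List[RtpPacket], buffer_ms: float) -> Tuple[List[RtpPacket], bytes]:
    """Split a stream into packets played out and bytes recovered from late packets.

    A standard receiver would use only the first element and drop the rest;
    a LACK-aware receiver (hypothetical here) also reads the late payloads.
    """
    voice = [p for p in stream if p.delay_ms <= buffer_ms]
    late = sorted((p for p in stream if p.delay_ms > buffer_ms), key=lambda p: p.seq)
    hidden = b"".join(p.payload for p in late)
    return voice, hidden


if __name__ == "__main__":
    call = [RtpPacket(seq=i, payload=b"\x55" * 20, delay_ms=30.0) for i in range(100)]
    sent = lack_embed(call, b"meet at dawn", every_n=50, extra_delay_ms=200.0)
    voice, hidden = receive(sent, buffer_ms=60.0)
    print(len(sent), len(voice), hidden.rstrip(b"\x00"))
```

In this sketch the standard receiver simply sees one missing voice frame, while the aware receiver recovers the message from the same packet.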

The paper first surveys existing VoIP steganography, which mainly modifies audio samples (LSB methods) or abuses unused header fields in SIP, RTP, or RTCP. LACK differs by using intentional packet loss as the covert channel. The authors identify three groups of factors influencing LACK performance: (1) endpoint‑related – codec type, RTP payload size, jitter buffer size (fixed or adaptive); (2) network‑related – round‑trip delay, packet loss probability, jitter; and (3) LACK‑specific – number of intentionally delayed packets, the amount of added delay, and the insertion rate (bits per second).

A theoretical model is derived for the required intentional delay dL(t) to guarantee that a packet exceeds the receiver’s jitter buffer size tB(t). The model incorporates fixed processing delays (DSP, codec, encapsulation) and network delay dN(t). For fixed jitter buffers the transmitter can compute a static delay; for adaptive buffers the transmitter must receive buffer‑size updates during the call or use a conservative delay that exceeds the maximum possible buffer size. The codec’s tolerance to packet loss (e.g., 1 % for G.723.1, 2 % for G.729A, 3 % for G.711, up to 5 % with packet‑loss concealment) directly limits the amount of covert data that can be inserted without degrading voice quality beyond acceptable MOS levels.
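The two constraints in this paragraph can be sketched in code. The summary does not reproduce the paper's exact expression for dL(t), so the following is a deliberately simplified assumption: the intentional delay is chosen to exceed the jitter buffer size (or the largest size an adaptive buffer may reach) by a safety margin, and the share of packets that may be sacrificed is whatever remains of the codec's loss tolerance after natural network loss. The function names, the `margin_ms` parameter, and the example values for buffer size and network loss are hypothetical.

```python
from typing import Optional


def intentional_delay_ms(buffer_ms: float,
                         margin_ms: float = 10.0,
                         max_adaptive_buffer_ms: Optional[float] = None) -> float:
    """Pick a delay that a packet cannot survive in the jitter buffer.

    Fixed buffer: exceed its size by a margin.
    Adaptive buffer without live updates: conservatively exceed the largest
    size the buffer could grow to.
    """
    bound = buffer_ms
    if max_adaptive_buffer_ms is not None:
        bound = max(buffer_ms, max_adaptive_buffer_ms)
    return bound + margin_ms


def lack_loss_budget(codec_tolerance: float, network_loss: float) -> float:
    """Fraction of packets LACK may consume without exceeding the codec's tolerance."""
    return max(0.0, codec_tolerance - network_loss)


if __name__ == "__main__":
    # Fixed 60 ms buffer (assumed value).
    print(intentional_delay_ms(60.0))
    # Adaptive buffer that may grow to 200 ms (assumed value).
    print(intentional_delay_ms(60.0, max_adaptive_buffer_ms=200.0))
    # 3 % tolerance (G.711 figure quoted above) with 0.5 % natural loss (assumed).
    print(lack_loss_budget(0.03, 0.005))
```

When natural loss already meets or exceeds the codec's tolerance, the budget is zero and no packets should be used for hidden data.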

The prototype was built on an open‑source VoIP stack, intercepting RTP packets, swapping payloads, and inserting delays. Experiments were conducted with three widely used codecs (G.711, G.729A, G.723.1) under controlled network conditions with induced loss rates ranging from 0.5 % to 5 %. The authors measured (i) steganographic bandwidth, (ii) MOS‑based voice quality, (iii) packet‑loss statistics, and (iv) detectability using statistical analysis and an “active warden” that inspects RTP sequence numbers and timestamps.

Results show that, while staying within each codec’s loss tolerance, LACK can achieve 1.2–2.5 kbps of hidden throughput. For G.711, a 2 % induced loss yields about 2.3 kbps; for G.729A, a 1 % loss yields about 1.5 kbps. Voice quality remains high (MOS ≥ 4.0) when loss is ≤ 2 %; it drops below MOS 3.5 when loss approaches 5 %, confirming the trade‑off between bandwidth and QoS. The detectability analysis indicates that passive statistical monitoring of loss rates can flag anomalously high loss but suffers from high false‑positive rates, because natural loss is common (≈0.5 % of packets are lost in 80 % of Internet calls). An active warden that drops all delayed packets can eliminate the covert channel, but it also harms legitimate traffic, effectively causing a denial of service. Consequently, reliable detection would require correlating loss statistics with call duration anomalies and possibly leveraging RTCP reports.
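The trade-off between induced loss and hidden throughput can be made concrete with a back-of-the-envelope estimate. This is a rough sketch under stated assumptions, not the paper's calculation: it assumes the whole payload of each sacrificed packet carries hidden data, and the packet rate and payload size in the example are hypothetical inputs rather than values taken from the paper, so the output is not expected to match the figures reported above.

```python
def stego_bandwidth_bps(packets_per_second: float,
                        payload_bytes: int,
                        lack_fraction: float) -> float:
    """Estimate hidden throughput in bits per second.

    packets_per_second: RTP packet rate of the voice stream
    payload_bytes:      payload size of one RTP packet
    lack_fraction:      share of packets intentionally delayed (0.0 to 1.0)
    """
    if not 0.0 <= lack_fraction <= 1.0:
        raise ValueError("lack_fraction must be between 0 and 1")
    return packets_per_second * lack_fraction * payload_bytes * 8


if __name__ == "__main__":
    # Hypothetical stream: 50 packets/s with 160-byte payloads, 2 % of packets used.
    print(stego_bandwidth_bps(50, 160, 0.02))
```

The estimate scales linearly with the induced loss, which is why the codec's loss tolerance acts as the ceiling on hidden throughput.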

The authors conclude that LACK is a practical, low‑overhead steganographic method for VoIP. Its effectiveness depends on careful selection of codec, jitter‑buffer management, and real‑time network monitoring to keep intentional loss within acceptable bounds. The technique remains usable when the stream is encrypted (SRTP): the warden cannot reconstruct the voice payload to check it for hidden data, although encryption does not hide the fact that packets are being delayed. The authors suggest future work on adaptive control algorithms that dynamically adjust the delay based on live network feedback, and on integrating LACK with cryptographic protection of the hidden payload. Overall, the paper demonstrates that intentional packet loss can be harnessed as a viable covert channel without noticeably altering traffic patterns, provided that QoS constraints are respected.
