A nearly tight memory-redundancy trade-off for one-pass compression
Let $s$ be a string of length $n$ over an alphabet of constant size $\sigma$, and let $c$ and $\epsilon$ be constants with $0 \leq c \leq 1$ and $\epsilon > 0$. Using $O(n)$ time, $O(n^c)$ bits of memory and one pass, we can always encode $s$ in $n H_k(s) + O(\sigma^k n^{1 - c + \epsilon})$ bits for all integers $k \geq 0$ simultaneously. On the other hand, even with unlimited time, using $O(n^c)$ bits of memory and one pass we cannot always encode $s$ in $O(n H_k(s) + \sigma^k n^{1 - c - \epsilon})$ bits for, e.g., $k = \lceil (c + \epsilon / 2) \log_\sigma n \rceil$.
💡 Research Summary
The paper investigates the fundamental trade‑off between memory usage and redundancy in the one‑pass (streaming) compression model. Given a string s of length n over an alphabet of constant size σ, the authors take the k‑th order empirical entropy H_k(s), for any integer k ≥ 0, as the information‑theoretic baseline for compression. They introduce a parameter c with 0 ≤ c ≤ 1 that determines the amount of working memory available: the algorithm may use only O(n^c) bits of memory, while still being allowed O(n) time and a single pass over the input.
Upper bound.
The authors present a constructive algorithm that, for any constants c and ε > 0, achieves the following coding length for every k:
|C(s)| ≤ n·H_k(s) + O(σ^k·n^{1−c+ε}) bits.
The algorithm works in two phases. First, it scans the input once, maintaining a compact data structure that approximates the frequencies of all observed k‑grams. Because the memory budget is limited to O(n^c) bits, the structure cannot store a full table of size σ^k; instead it uses hash‑based sampling, count‑sketches, or other sublinear‑space frequency estimators. The second phase performs arithmetic (or range) coding using the estimated probabilities. The error introduced by the approximate model translates directly into the additive term O(σ^k·n^{1‑c+ε}). The analysis shows that each update and query takes expected constant time, so the total running time remains linear in n.
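The two‑phase structure can be illustrated with a minimal sketch. This is not the paper's actual estimator (whose accuracy guarantees are stronger); it simply collapses k‑gram contexts into a fixed number of hash buckets during the single scan, then reports the ideal code length an arithmetic coder driven by that approximate model would achieve. The bucket count and hash function are hypothetical choices.

```python
import math
import zlib
from collections import Counter

def build_model_one_pass(s, k, num_buckets):
    # Phase 1: one pass over s, keeping per-bucket next-symbol counts.
    # Hashing contexts into num_buckets buckets keeps memory sublinear
    # instead of storing a full table over all sigma^k contexts.
    counts = [Counter() for _ in range(num_buckets)]
    for i in range(k, len(s)):
        b = zlib.crc32(s[i - k:i].encode()) % num_buckets
        counts[b][s[i]] += 1
    return counts

def ideal_code_length(s, k, counts, num_buckets):
    # Phase 2: the ideal output size, in bits, of an arithmetic coder
    # that codes each symbol with its estimated conditional probability.
    bits = 0.0
    for i in range(k, len(s)):
        b = zlib.crc32(s[i - k:i].encode()) % num_buckets
        bucket = counts[b]
        bits += -math.log2(bucket[s[i]] / sum(bucket.values()))
    return bits
```

Hash collisions merge contexts and can only blur the model, so shrinking num_buckets (the memory budget) lengthens the output; that loss is what the additive redundancy term quantifies.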
Lower bound.
To complement the upper bound, the paper proves that the memory‑redundancy trade‑off is essentially tight. For any fixed c and ε, define
k = ⌈(c + ε/2)·log_σ n⌉,
so that σ^k ≈ n^{c+ε/2}. The authors construct a family of adversarial strings in which the n positions are covered by σ^k distinct k‑gram patterns, each appearing roughly n/σ^k times and arranged uniformly at random. With only O(n^c) bits of memory, a one‑pass algorithm cannot keep track of all σ^k patterns; an information‑theoretic counting argument (via the pigeonhole principle) shows that the algorithm must incur at least
Ω(σ^k·n^{1−c−ε})
additional bits beyond the entropy term, regardless of the amount of computation time allowed. This lower bound matches the upper bound up to the factor n^{2ε}, establishing that the presented trade‑off curve is nearly optimal.
Context and significance.
Previous work on streaming compression either assumed unlimited memory (focusing solely on achieving n·H_k(s) plus lower‑order terms) or considered only constant‑memory regimes without quantifying the precise redundancy penalty. This paper bridges the gap by providing a unified analysis that holds for the whole spectrum of memory budgets, parameterized by c. The result is particularly relevant for real‑time systems where memory is at a premium: network routers, log‑aggregation services, and IoT edge devices can now reason about how much extra space is needed to obtain a desired compression quality.
Methodological contributions.
- A concrete one‑pass algorithm that combines sublinear‑space frequency estimation with adaptive arithmetic coding, achieving linear time and provable redundancy guarantees.
- A tight lower‑bound construction that leverages a carefully chosen k value tied to the memory exponent c, demonstrating that any algorithm constrained to O(n^c) bits must suffer the stated redundancy.
- A clean analytical framework that isolates the three parameters—memory exponent c, entropy order k, and slack ε—and shows how they interact in the final bound.
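The interaction of the three parameters can be made concrete by writing the redundancy as a power of n (the numeric values below are hypothetical): the upper and lower bounds differ only in the sign of ε, so the gap between them is exactly n^{2ε}.

```python
import math

def redundancy_exponents(sigma, n, k, c, eps):
    # Exponents of n in the redundancy terms sigma^k * n^(1-c±eps):
    # the upper bound uses +eps, the lower bound -eps.
    klog = k * math.log(sigma, n)   # sigma^k = n^klog
    upper = klog + 1 - c + eps
    lower = klog + 1 - c - eps
    return upper, lower

up, lo = redundancy_exponents(sigma=2, n=2**20, k=11, c=0.5, eps=0.1)
assert math.isclose(up - lo, 2 * 0.1)   # a gap of n^(2*eps)
```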
Conclusions and future directions.
The paper establishes that, in the one‑pass streaming model, the redundancy term scales as σ^k·n^{1‑c±ε}. This near‑tight characterization provides a benchmark for evaluating existing compressors and for designing new ones that operate under strict memory constraints. Future research could explore extensions to non‑constant alphabets, adaptive memory allocation (e.g., allowing the algorithm to trade memory for redundancy dynamically), multi‑pass variants, or empirical evaluations on real‑world data streams to assess practical constants hidden in the O‑notation.