Asymptotic Optimality of the Static Frequency Caching in the Presence of Correlated Requests

It is well known that the static caching algorithm that keeps the most frequently requested documents in the cache is optimal when documents are of the same size and requests are independent and identically distributed. However, it is hard to develop explicit and provably optimal caching algorithms when requests are statistically correlated. In this paper, we show that keeping the most frequently requested documents in the cache is still optimal for large cache sizes even if the requests are strongly correlated.

💡 Research Summary

The paper revisits the classic static frequency caching (SFC) policy—keeping the most frequently requested items in a cache—and asks whether its optimality survives when request arrivals are not independent but exhibit strong statistical correlation. The authors model the request stream as an ergodic stochastic process, specifically a Markov chain or a mixture of such chains, which captures both long‑term popularity (steady‑state request probabilities) and short‑term dependence among successive requests. The analysis assumes uniform item sizes, an ergodic request process that converges to a stationary distribution, and focuses on the asymptotic regime where the cache capacity C grows without bound.
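The static rule described above can be illustrated with a minimal sketch. It assumes the simplest case of the model, in which each state of the Markov chain is the requested item itself; the transition matrix `P` and the function names are hypothetical and chosen only for illustration. The sketch approximates the stationary distribution by power iteration, picks the C most popular items, and evaluates the long-run hit rate of that fixed cache as the stationary mass it covers.

```python
from typing import List, Set


def stationary_distribution(P: List[List[float]],
                            iters: int = 10_000,
                            tol: float = 1e-12) -> List[float]:
    """Approximate the stationary distribution of an ergodic chain by power iteration."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        # One step of pi <- pi P (row vector times transition matrix).
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


def static_frequency_cache(pi: List[float], capacity: int) -> Set[int]:
    """Return the indices of the `capacity` items with the largest stationary probability."""
    ranked = sorted(range(len(pi)), key=lambda i: pi[i], reverse=True)
    return set(ranked[:capacity])


def long_run_hit_rate(pi: List[float], cache: Set[int]) -> float:
    """Long-run hit rate of a fixed cache: the stationary mass of the cached items."""
    return sum(pi[i] for i in cache)


# Hypothetical 3-item chain: row i gives the distribution of the next request
# given that the current request is item i (an illustrative assumption).
P = [
    [0.5, 0.3, 0.2],
    [0.4, 0.4, 0.2],
    [0.6, 0.3, 0.1],
]
pi = stationary_distribution(P)
cache = static_frequency_cache(pi, capacity=2)
rate = long_run_hit_rate(pi, cache)
```

For this toy chain the stationary probabilities are 16/33, 11/33 and 6/33, so a cache of size 2 keeps items 0 and 1 and covers 27/33 of the long-run requests. Successive requests are correlated, but the hit rate of a fixed cache depends only on the stationary probabilities.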

The main theoretical contribution is a theorem stating that, in the limit C → ∞, the SFC policy—selecting the C items with the highest stationary request probabilities—is optimal among all possible caching policies, even under arbitrary correlation structures permitted by the model. The proof proceeds in two parts. First, an upper bound on the achievable long‑run hit rate is derived for any policy π by expressing the hit rate as an expectation over the conditional request distribution given the current cache state. Using the transition matrix of the underlying Markov chain and its stationary eigenvector, the authors show that allocating cache space to items with lower stationary probabilities inevitably reduces the hit rate, establishing a universal performance ceiling. Second, they demonstrate that SFC attains this ceiling: as the cache becomes large, the probability that a low‑popularity item occupies any cache slot vanishes, and the fraction of cache devoted to each item converges to its stationary request probability. Consequently, SFC’s hit rate matches the derived upper bound, proving asymptotic optimality.
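The quantities in this argument can be written down explicitly. The notation below ($R_t$ for the request at time $t$, $q_i$ for the stationary probability of item $i$, $H$ for the long-run hit rate, $M$ for the miss rate) is an assumption introduced here for illustration and is not taken from the paper. The first two lines follow from the ergodic theorem for a fixed cache set; the last line is only one plausible way to formalize "asymptotically optimal", not necessarily the paper's exact statement.

```latex
% Long-run hit rate of a fixed cache set S with |S| = C (ergodic theorem):
H(S) \;=\; \lim_{n \to \infty} \frac{1}{n} \sum_{t=1}^{n} \mathbf{1}\{R_t \in S\}
      \;=\; \sum_{i \in S} q_i \quad \text{almost surely.}

% Among static sets of size C, the maximizer keeps the C largest q_i:
S^{*}_{C} \;=\; \operatorname*{arg\,max}_{|S| = C} \sum_{i \in S} q_i .

% One possible formalization of asymptotic optimality (an assumption),
% with M_{\pi}(C) the long-run miss rate of policy \pi at cache size C:
\lim_{C \to \infty} \frac{M_{\mathrm{SFC}}(C)}{\inf_{\pi} M_{\pi}(C)} \;=\; 1 .
```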

To validate the theory, extensive simulations are presented. Experiments with power‑law correlated request processes show that even when the cache holds only 10–20 % of the total catalog, the hit‑rate gap between SFC and the (unknown) optimal policy is negligible (less than 0.5 %). Additional tests with periodic traffic patterns (daily or weekly cycles) confirm that SFC remains near‑optimal despite pronounced temporal correlations. The authors also explore heterogeneous item sizes, proposing a natural extension that selects items based on the ratio of stationary popularity to size; this variant exhibits similar asymptotic behavior.
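The size-aware variant mentioned above can be sketched as a greedy fill by popularity-to-size ratio. The summary only says that items are selected by this ratio; the choice to skip an item that does not fit and continue down the ranking is an assumption of this sketch, as are the function name and the example numbers.

```python
from typing import List, Sequence


def density_cache(popularity: Sequence[float],
                  size: Sequence[float],
                  capacity: float) -> List[int]:
    """Greedily pick items in decreasing popularity/size order until the cache is full.

    Items that do not fit in the remaining space are skipped (an assumption of
    this sketch; the source only specifies ranking by the ratio).
    """
    order = sorted(range(len(popularity)),
                   key=lambda i: popularity[i] / size[i],
                   reverse=True)
    chosen: List[int] = []
    used = 0.0
    for i in order:
        if used + size[i] <= capacity:
            chosen.append(i)
            used += size[i]
    return chosen


# Hypothetical catalog: stationary popularities and item sizes.
popularity = [0.50, 0.25, 0.15, 0.10]
size = [4, 1, 2, 1]
selected = density_cache(popularity, size, capacity=5)
covered = sum(popularity[i] for i in selected)
```

Here the ratios are 0.125, 0.25, 0.075 and 0.1, so the ranking is item 1, then item 0, then item 3, then item 2. With capacity 5 the cache holds items 1 and 0 and covers 0.75 of the request mass. With uniform sizes the rule reduces to the plain static frequency policy.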

The discussion emphasizes practical implications. SFC requires only a one‑time estimation of long‑run popularities and no per‑request state updates, making it computationally cheap and memory‑efficient compared to dynamic policies such as LRU, LFU, or ARC, which need continuous bookkeeping and can suffer from cache pollution under correlated arrivals. The results suggest that for large‑scale content delivery networks, edge caches, or mobile data offloading systems—where traffic often displays strong correlation—operators can safely deploy the simple static frequency rule without sacrificing performance.
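The difference in bookkeeping can be made concrete with a small simulation sketch. It assumes the same simplified setting as before, where the request sequence is itself a Markov chain over item indices; the transition matrix, seed, and function names are hypothetical. The static cache is a fixed set that is never modified, whereas LRU reorders or evicts entries on every request.

```python
import random
from collections import OrderedDict
from typing import List, Sequence, Set


def simulate_requests(P: Sequence[Sequence[float]], steps: int, seed: int = 0) -> List[int]:
    """Sample a correlated request sequence from a Markov chain over item indices."""
    rng = random.Random(seed)
    state = 0
    out: List[int] = []
    for _ in range(steps):
        u, acc = rng.random(), 0.0
        nxt = len(P) - 1  # fallback guards against rounding in the row sum
        for j, p in enumerate(P[state]):
            acc += p
            if u < acc:
                nxt = j
                break
        state = nxt
        out.append(state)
    return out


def static_hit_rate(requests: Sequence[int], cache: Set[int]) -> float:
    """Empirical hit rate of a fixed cache: a membership test and no state updates."""
    return sum(r in cache for r in requests) / len(requests)


def lru_hit_rate(requests: Sequence[int], capacity: int) -> float:
    """Empirical hit rate of LRU, which updates its recency order on every request."""
    cache: "OrderedDict[int, None]" = OrderedDict()
    hits = 0
    for r in requests:
        if r in cache:
            hits += 1
            cache.move_to_end(r)           # mark as most recently used
        else:
            cache[r] = None
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used item
    return hits / len(requests)


# Hypothetical 3-item chain; items 0 and 1 have the largest stationary probabilities.
P = [
    [0.5, 0.3, 0.2],
    [0.4, 0.4, 0.2],
    [0.6, 0.3, 0.1],
]
requests = simulate_requests(P, steps=50_000)
static_rate = static_hit_rate(requests, cache={0, 1})
lru_rate = lru_hit_rate(requests, capacity=2)
```

On this toy chain the empirical static hit rate should settle near the stationary mass of items 0 and 1, which is 27/33. The LRU figure depends on the correlation structure, so it is best read as a point of comparison for this one example rather than a general result.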

In summary, the paper provides the first rigorous proof that static frequency caching retains its optimality in the asymptotic large‑cache regime even when request arrivals are strongly correlated. This bridges a gap between classical caching theory (which assumes independence) and real‑world traffic characteristics, offering both a solid theoretical foundation and actionable guidance for system designers.