Cardinalities estimation under sliding time window by sharing HyperLogLog Counter

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Cardinality estimation is an important research topic in network management and security, and estimating cardinalities under a sliding time window is an active problem. HyperLogLog is a memory-efficient algorithm that works under a fixed time window. A sliding version of HyperLogLog can operate under a sliding time window by replacing every HyperLogLog counter with a list of future possible maxima (LFPM). But the LFPM is a dynamic structure whose size varies at running time. This paper proposes a novel counter for HyperLogLog that consumes less memory than the LFPM. We call our counter the bit distance recorder (BDR), because it maintains the distance to the last occurrence of every leftmost “1” bit position. The size of a BDR is fixed. Based on the BDR, we design a multi-host cardinality estimation algorithm for sliding time windows, the virtual bit distance recorder (VBDR). VBDR allocates a vector of virtual BDRs for every host, and every physical BDR is shared by several hosts to improve memory usage. After a small modification, we propose two parallel versions of VBDR that can run on GPUs to handle high-speed traffic. One of these parallel VBDRs is fast at IP-pair scanning and the other is memory efficient. The BDR is also suitable for other cardinality estimation algorithms, such as PCSA and LogLog.


💡 Research Summary

The paper addresses the problem of estimating host cardinalities (the number of distinct opposite hosts communicating with a given host) under a sliding time window, a scenario that offers higher temporal resolution than traditional discrete windows but introduces significant state‑maintenance challenges. While HyperLogLog (HLL) is the state‑of‑the‑art estimator for discrete windows, it cannot directly handle sliding windows because each counter must know whether its recorded maximum “left‑most 1” bit is still active. Existing sliding‑window extensions such as LFPM‑HLL replace each HLL counter with a list of “future possible maxima” (LFPM), storing timestamps and bit positions. LFPM‑HLL therefore requires dynamic memory proportional to the cardinality and the window length, which is problematic for high‑speed or GPU‑based processing.
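To make the baseline concrete, the LFPM idea can be sketched in a few lines of Python. This is a hypothetical reconstruction from the description above, not the paper's code; the names `lfpm_insert` and `lfpm_max` are illustrative. Each counter keeps only those (timestamp, rank) pairs that could still become the window maximum at some future query time, which is exactly why its size varies with the data and the window length.

```python
def lfpm_insert(lfpm, t, r):
    # A new item with leftmost-1 position r arrives at time t.
    # Any stored pair with rank <= r is dominated (it is older and
    # smaller), so it can never again be a window maximum: drop it.
    lfpm[:] = [(t2, r2) for (t2, r2) in lfpm if r2 > r]
    lfpm.append((t, r))

def lfpm_max(lfpm, t_now, window):
    # Largest leftmost-1 position among items still inside the window.
    return max((r for (t, r) in lfpm if t > t_now - window), default=0)
```

The dynamic list (plus a timestamp per entry) is the memory cost that the fixed-size BDR described next is designed to avoid.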

The authors propose a novel fixed‑size counter called the Bit Distance Recorder (BDR). A BDR consists of two parts: (1) a field nowLBP1 that holds the left‑most 1‑bit position observed in the current time slice, and (2) an array of Distance Recorders (DR) indexed by possible LBP1 values. Each DR stores a small integer (at least ⌈log₂(k+1)⌉ bits, where k is the maximum number of slices in the sliding window). The DR value represents the “distance” from the current slice to the last time the corresponding LBP1 was observed. When a slice ends, every DR is incremented (SlideDR); a DR is considered active if its value is < k. This mechanism eliminates the need for per‑item timestamps, guarantees a fixed memory footprint (⌈log₂(n/g)⌉·⌈log₂(k+1)⌉ bits per BDR), and allows O(g) update cost per slice. The algorithm GetLBP1BDR scans the DR array from the highest possible LBP1 downward, returning the first active entry as the current estimate for that BDR.
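Under these assumptions, a BDR can be sketched in Python as follows. The method names `update`, `slide`, and `get_lbp1` are illustrative stand-ins for the paper's SlideDR and GetLBP1BDR operations, and the exact increment/fold order is a guess from the summary; the key property is that the structure's size is fixed regardless of how many items arrive.

```python
class BDR:
    """Fixed-size counter: one distance recorder (DR) per possible
    leftmost-1 bit position (LBP1)."""

    def __init__(self, max_lbp1: int, k: int):
        self.k = k                       # number of slices in the sliding window
        self.dr = [k] * (max_lbp1 + 1)   # distance k means "never seen / expired"

    def update(self, lbp1: int) -> None:
        # An item whose hash has its leftmost 1 at position lbp1 arrived
        # in the current slice: the distance for that position resets to 0.
        if lbp1 > 0:
            self.dr[lbp1] = 0

    def slide(self) -> None:
        # End of a time slice (SlideDR): every distance grows by one,
        # saturating at k so each field fits in ceil(log2(k+1)) bits.
        self.dr = [min(d + 1, self.k) for d in self.dr]

    def get_lbp1(self) -> int:
        # GetLBP1BDR: scan from the highest LBP1 downward and return the
        # first entry still inside the window (distance < k).
        for i in range(len(self.dr) - 1, 0, -1):
            if self.dr[i] < self.k:
                return i
        return 0
```

Note how expiry is implicit: once a position has not been refreshed for k slides, its distance saturates at k and `get_lbp1` skips it, with no per-item timestamps stored anywhere.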

To support millions of hosts simultaneously, the paper introduces the Virtual Bit Distance Recorder (VBDR). A pool of physical BDRs (BDRP) is created; each host is assigned g virtual BDR indices (g = 2^b, b < 32). The mapping from a host’s IP and virtual index i to a physical BDR is performed by two hash functions: first a seed‑derived hash selects a “seed” value, then a second hash maps (IP, pool size) to the physical index. Consequently, many hosts share the same physical BDRs, dramatically reducing memory consumption while preserving the statistical independence required by HLL. For each incoming IP pair <aip, bip>, the algorithm determines the set of g physical BDRs for aip, updates the corresponding DRs with the LBP1 derived from bip’s hash, and proceeds without per‑host state duplication.
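The virtual-to-physical mapping can be sketched as below. This is a simplified illustration, not the paper's exact hash construction: `h` stands in for the two hash functions, `physical_indices` computes the g shared-pool slots for a host, and the physical BDRs are modeled as plain dicts rather than full BDR counters.

```python
import hashlib

def h(data: bytes, mod: int) -> int:
    # Illustrative stand-in for the paper's hash functions.
    return int.from_bytes(hashlib.sha1(data).digest()[:8], "big") % mod

def physical_indices(ip: str, g: int, pool_size: int) -> list:
    # Map the g virtual BDRs of host `ip` onto slots of the shared physical
    # pool (BDRP). Distinct (ip, i) pairs land on pseudo-random slots, so
    # many hosts end up sharing the same physical BDRs.
    return [h(f"{ip}|{i}".encode(), pool_size) for i in range(g)]

def record(pool: list, ip: str, peer_ip: str, g: int) -> None:
    # For an incoming IP pair <ip, peer_ip>: hash the peer, use part of the
    # hash to pick one of ip's g virtual BDRs, and reset the distance of the
    # observed leftmost-1 bit position in the chosen physical BDR (here a
    # dict mapping LBP1 -> distance stands in for a real BDR).
    hv = h(peer_ip.encode(), 1 << 32)
    i = hv % g                         # which virtual BDR of this host
    lbp1 = (hv >> 16).bit_length()     # leftmost-1 position of remaining bits
    pool[physical_indices(ip, g, len(pool))[i]][lbp1] = 0
```

Because the slot list for a host is recomputed from hashes on every packet, no per-host table is needed; memory is bounded by the pool size alone.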

The authors also design two GPU‑accelerated variants of VBDR. The first variant focuses on maximum packet‑scanning throughput: each GPU thread processes a distinct IP pair, performs atomic DR updates, and relies on the fixed‑size nature of DR to avoid divergent control flow. The second variant optimizes memory usage by batching DR updates in shared memory, reducing global memory traffic, and performing a single bulk SlideDR operation per slice. Experimental evaluation on synthetic and real traffic traces demonstrates that both GPU versions can sustain line‑rate processing (tens of billions of packets per day) while using 30‑50 % less memory than LFPM‑HLL. Accuracy remains comparable to classic HLL, with relative errors typically below 3 %.

Finally, the authors note that BDR is not limited to HLL; it can replace the counters in PCSA, LogLog, and other logarithmic sketch algorithms, making it a versatile building block for any sliding‑window cardinality estimator. The paper’s contributions are threefold: (1) a fixed‑size, low‑overhead counter (BDR) that cleanly handles activation/deactivation in sliding windows, (2) a virtual‑sharing framework (VBDR) that enables massive multi‑host cardinality estimation with bounded memory, and (3) efficient GPU implementations that achieve real‑time performance on high‑speed networks. These advances open the door to practical, scalable, and memory‑efficient streaming analytics in modern network infrastructures.

