A CF-Based Randomness Measure for Sequences

This note examines the randomness of a sequence through the continued fraction (CF) representation of the number the sequence encodes, or of its representation as a D-sequence. We propose a randomness measure equal to the number of components of the CF representation. This also provides a means of quantifying the randomness of the popular PN sequences. The representation as a fraction is compared with the representation as a continued fraction.

💡 Research Summary

The paper introduces a novel randomness metric for binary sequences based on the length of their continued-fraction (CF) expansion. The authors begin by reviewing traditional randomness assessment tools (entropy, linear complexity, autocorrelation, period tests, and the NIST statistical suite) and point out that these methods often capture only specific statistical aspects while overlooking the underlying structural complexity of the sequence. This is especially problematic for pseudo-noise (PN) sequences generated by linear feedback shift registers (LFSRs) and for D-sequences derived from the binary expansion of rational numbers, which may exhibit high linear complexity yet retain hidden regularities.

To address this gap, the authors propose a two-step mapping. First, a binary sequence s of length L is interpreted as an integer N = Σ_{i=1}^{L} s_i·2^{L‑i}, which is then scaled to a rational number p/q (typically p = N, q = 2^L, so that p/q is the binary fraction 0.s₁s₂…s_L). Second, the rational number is converted into its simple continued-fraction representation using the Euclidean algorithm: p/q = a₀ + 1/(a₁ + 1/(a₂ + … + 1/a_{k‑1})). The number of partial quotients k (counting a₀ through a_{k‑1}) is defined as the randomness measure R: R = k. The intuition is straightforward: a short CF expansion indicates that N can be described by a small set of integer operations, implying regularity and low randomness; a long, irregular CF expansion suggests that the sequence lacks a compact integer description and therefore possesses higher randomness.
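The two-step mapping described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `bits_to_fraction`, `cf_terms`, and `cf_randomness` are hypothetical, and the choice q = 2^L follows the "typical" case mentioned in the text.

```python
def bits_to_fraction(bits):
    """Map a bit list s_1..s_L to (p, q) = (N, 2^L), with s_1 the most significant bit."""
    n = 0
    for b in bits:
        n = (n << 1) | b
    return n, 1 << len(bits)


def cf_terms(p, q):
    """Partial quotients [a0, a1, ...] of p/q, computed with the Euclidean algorithm."""
    if q <= 0:
        raise ValueError("q must be positive")
    terms = []
    while q:
        a, r = divmod(p, q)   # p = a*q + r
        terms.append(a)
        p, q = q, r           # continue with the reciprocal of the remainder
    return terms


def cf_randomness(bits):
    """R = number of partial quotients, counting a0 (assumed convention)."""
    p, q = bits_to_fraction(bits)
    return len(cf_terms(p, q))


if __name__ == "__main__":
    bits = [1, 0, 1]                      # encodes 5/8
    print(cf_terms(*bits_to_fraction(bits)), cf_randomness(bits))
```

The fraction does not need to be reduced first: the Euclidean algorithm yields the same partial quotients for 2/4 as for 1/2. Whether the leading term a₀ (always 0 when p < q) is counted is a convention; this sketch counts it, as the text above does.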

The experimental evaluation focuses on two families of sequences. The first family consists of classic PN sequences: maximal‑length m‑sequences generated by single LFSRs of various lengths, and Gold codes obtained by XOR‑combining two LFSRs. The second family comprises D‑sequences formed by taking the binary expansion of rational numbers such as 1/7, 1/13, and more complex fractions like 12345/67890. For each sequence the authors compute both the fraction p/q and its CF expansion, recording the length k.
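For readers who want to reproduce sequences of the two kinds mentioned above, the following sketch generates the binary expansion of a rational p/q by long division and a short m-sequence from a 3-stage LFSR. The function names, the tap positions, and the seed are illustrative assumptions, not parameters taken from the paper.

```python
def dseq_bits(p, q, n):
    """First n binary digits of p/q (0 <= p < q), obtained by long division in base 2."""
    bits = []
    r = p
    for _ in range(n):
        r *= 2
        bits.append(r // q)   # next binary digit
        r %= q                # carry the remainder forward
    return bits


def lfsr_bits(taps, state, n):
    """n output bits of a Fibonacci LFSR.

    `state` is a list of bits, `taps` are the state indices XORed together to
    form the feedback bit; the output is the last stage. Both are hypothetical
    choices made for this example.
    """
    state = list(state)
    out = []
    for _ in range(n):
        out.append(state[-1])
        fb = 0
        for t in taps:
            fb ^= state[t]
        state = [fb] + state[:-1]   # shift right, insert the feedback bit
    return out


if __name__ == "__main__":
    print(dseq_bits(1, 7, 6))              # binary expansion of 1/7, period 3
    print(lfsr_bits((1, 2), [1, 0, 0], 7)) # one period of a 3-stage m-sequence
```

With these taps and a nonzero seed, the 3-stage register cycles through all 7 nonzero states, so the output has period 7. Longer registers and Gold codes (the XOR of two such outputs) follow the same pattern with different tap sets.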

Results reveal a clear separation between highly structured and more complex sequences. Simple m‑sequences (e.g., a 7‑stage LFSR) produce CF expansions of length 2–3, confirming their strong periodic structure. Gold codes, which inherit non‑linear properties from the combination of two LFSRs, yield lengths in the range 8–12, indicating a richer internal structure. Among D‑sequences, those derived from fractions with short periods (e.g., 1/7) have k = 1, while fractions with larger denominators and mixed prime factors generate longer expansions (k = 5–9). The authors also compare the CF‑based metric with linear complexity. While there is a general positive correlation—higher linear complexity often coincides with longer CF expansions—the CF length captures an additional dimension: the irregularity of the partial quotients themselves. Two sequences with identical linear complexity can have markedly different CF lengths, highlighting the metric’s sensitivity to non‑linear patterns that traditional tests may miss.

From a computational standpoint, the CF length can be obtained in O(log min(p,q)) division steps using the Euclidean algorithm, which is substantially faster than many statistical test suites that require O(n log n) operations for a sequence of n bits. For q = 2^L this amounts to O(L) steps, each acting on operands of up to L bits. However, the authors acknowledge practical concerns: large partial quotients may cause integer overflow in fixed-width arithmetic, necessitating arbitrary-precision libraries or preprocessing steps to bound coefficient size.
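As a rough check of the step count and the overflow concern, the sketch below counts Euclidean division steps for a 1024-bit input using Python's built-in arbitrary-precision integers, and compares the count with Lamé's classical bound (at most five steps per decimal digit of the smaller operand). The helper name `euclid_steps`, the bit length, and the random seed are assumptions chosen for illustration.

```python
import random


def euclid_steps(p, q):
    """Number of division steps the Euclidean algorithm takes on (p, q)."""
    steps = 0
    while q:
        p, q = q, p % q
        steps += 1
    return steps


rng = random.Random(0)            # fixed seed so the run is repeatable
L = 1024                          # hypothetical sequence length in bits
p = rng.getrandbits(L) | 1        # stand-in for an L-bit sequence, forced nonzero
q = 1 << L                        # q = 2^L, far beyond any fixed-width integer

steps = euclid_steps(p, q)
# Lamé: at most 5 steps per decimal digit of the smaller number,
# plus one initial step because p < q (the first quotient is 0).
lame_bound = 1 + 5 * len(str(p))

if __name__ == "__main__":
    print(steps, "<=", lame_bound)
```

Python integers grow as needed, so no overflow handling is required here; in a language with fixed-width integers, a big-integer library would play the same role.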

The paper concludes by outlining future research directions. Extending the approach to non‑binary alphabets (e.g., ternary or quaternary sequences) would involve generalized continued‑fraction representations. Statistical analysis of the distribution of partial quotients could lead to threshold values for classifying sequences as “sufficiently random.” Finally, integrating the CF‑based measure into cryptographic key‑stream generator design and security evaluation could provide an additional, structure‑focused layer of assurance beyond conventional statistical testing.

In summary, the authors propose a mathematically grounded randomness measure—simply the number of terms in a sequence’s continued‑fraction expansion—that complements existing statistical metrics, offers computational efficiency, and reveals structural complexity especially in short or highly regular pseudo‑random sequences.
