Some long-period random number generators using shifts and xors
Marsaglia recently introduced a class of xorshift random number generators (RNGs) with periods 2^n-1 for n = 32, 64, etc. Here we give a generalisation of Marsaglia’s xorshift generators in order to obtain fast and high-quality RNGs with extremely long periods. RNGs based on primitive trinomials may be unsatisfactory because a trinomial has very small weight. In contrast, our generators can be chosen so that their minimal polynomials have large weight (number of nonzero terms). A computer search using Magma has found good generators for n a power of two up to 4096. These have been implemented in a free software package xorgens.
💡 Research Summary
The paper revisits Marsaglia’s 2003 xorshift family of random number generators, which are prized for their speed because they use only bit‑shift and XOR operations. Xorshift generators achieve the maximal period of 2ⁿ‑1, and the paper seeks the same property at much larger n. Long‑period generators are often built instead on a linear recurrence defined by a primitive trinomial. Because a trinomial contains only three non‑zero terms, the associated minimal polynomial has very low weight, making the generator susceptible to linear dependencies that can be exposed by high‑dimensional statistical tests such as those in TestU01’s Crush and BigCrush suites.
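For reference, the single-word generator that this summary starts from can be sketched in a few lines. This is a minimal illustration of one 32-bit xorshift step using the shift triple (13, 17, 5) from Marsaglia's 2003 paper; the function name `xorshift32` and the Python rendering are ours and are not taken from xorgens.

```python
MASK32 = 0xFFFFFFFF  # keep the state within one 32-bit word


def xorshift32(state: int) -> int:
    """Advance a 32-bit xorshift state by one step.

    The state must be nonzero: zero is a fixed point of the update.
    """
    state ^= (state << 13) & MASK32  # left shift, truncated to 32 bits
    state ^= state >> 17             # right shift
    state ^= (state << 5) & MASK32   # left shift, truncated to 32 bits
    return state
```

Starting from the state 1, a single step gives `xorshift32(1) == 270369`. Python integers are unbounded, so the explicit mask stands in for the wraparound that a fixed-width machine word provides for free.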
To overcome this intrinsic weakness, the authors propose a generalized “multiple‑shift‑XOR” construction. Instead of a single shift‑XOR pair, the state variable S is subjected to a sequence of k distinct left‑shift (<< a_i) and right‑shift (>> b_i) operations, each followed by an XOR with the current intermediate value. Formally the update rule can be written as
S_{t+1} = (((S_t ⊕ (S_t << a₁)) ⊕ (S_t >> b₁)) ⊕ … ⊕ (S_t << a_k)).
The parameters a_i and b_i are chosen to be pairwise distinct positive integers, and k is typically between 3 and 5. This cascade spreads bits far more aggressively across the word, producing a transformation matrix with far higher Hamming weight. Consequently the minimal polynomial of the resulting linear recurrence has many more non‑zero coefficients (often 20–30 or more), which increases its linear complexity and its resistance to statistical defects.
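One literal reading of the update rule above, in which every shifted term is taken from S_t itself, can be sketched as follows. The function name `multi_shift_xor`, its argument layout, and the shift values in the usage note are hypothetical choices for illustration; this is not the xorgens implementation, and nothing here guarantees that a given parameter set is invertible or reaches period 2ⁿ‑1.

```python
def multi_shift_xor(state, left_shifts, right_shifts, n=32):
    """XOR an n-bit state with left- and right-shifted copies of itself.

    Transcribes S ^ (S << a_1) ^ (S >> b_1) ^ ... as written in the
    summary. The shift amounts are caller-supplied assumptions.
    """
    mask = (1 << n) - 1
    state &= mask
    result = state
    for a in left_shifts:
        result ^= (state << a) & mask  # bits shifted past bit n-1 are dropped
    for b in right_shifts:
        result ^= state >> b
    return result
```

For example, `multi_shift_xor(1, [1], [1], n=8)` returns 3, since 1 ^ 2 ^ 0 = 3. The map is linear over GF(2), meaning f(x ^ y) = f(x) ^ f(y), which is what lets the generator be described by a matrix and a minimal polynomial in the first place.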
Finding suitable parameter sets is non‑trivial because the generator must still be primitive (i.e., have period 2ⁿ‑1) while also possessing a high‑weight minimal polynomial. The authors employed the computer algebra system Magma to exhaustively search the space of (a_i, b_i) tuples for word sizes n that are powers of two up to 4096. For each candidate they computed the characteristic polynomial, verified primitivity, and measured its weight. The search criteria were: (1) period exactly 2ⁿ‑1, (2) weight ≥ 20 (empirically sufficient for passing stringent tests), and (3) a modest computational cost (few shift/XOR operations). The outcome is a table of optimal parameter sets for n = 32, 64, 128, 256, 512, 1024, 2048, 4096, each providing several alternatives. These configurations have been incorporated into the open‑source library “xorgens”.
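The two acceptance criteria, primitivity and weight, can be illustrated on very small polynomials over GF(2). The sketch below is a brute-force stand-in that is only feasible for tiny degrees; it is not the Magma procedure used in the paper. Encoding a polynomial as an integer bitmask (bit i is the coefficient of x^i) and the names `weight`, `order_of_x`, and `is_primitive` are assumptions made for this example.

```python
def weight(poly: int) -> int:
    """Number of nonzero terms of a GF(2) polynomial stored as a bitmask."""
    return bin(poly).count("1")


def order_of_x(poly: int) -> int:
    """Smallest k > 0 with x^k = 1 modulo poly, found by repeated multiplication."""
    degree = poly.bit_length() - 1
    if degree < 1 or not (poly & 1):
        raise ValueError("need degree >= 1 and a nonzero constant term")
    state, steps = 1, 0
    while True:
        state <<= 1                 # multiply by x
        if (state >> degree) & 1:   # reduce modulo poly when degree is reached
            state ^= poly
        steps += 1
        if state == 1:
            return steps


def is_primitive(poly: int) -> bool:
    """True when x has the maximal order 2^degree - 1 modulo poly."""
    degree = poly.bit_length() - 1
    return order_of_x(poly) == (1 << degree) - 1
```

The trinomial x^4 + x + 1 (`0b10011`) has weight 3 and is primitive, with x of order 15. By contrast x^4 + x^3 + x^2 + x + 1 (`0b11111`) has weight 5 but is not primitive, because x has order 5. The brute-force loop takes up to 2^degree - 1 steps, so at n = 4096 a computer algebra system is needed instead.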
Performance evaluation used the widely respected TestU01 framework. All selected generators passed the full Crush and BigCrush batteries, with particular improvements in linear‑complexity tests (e.g., LinearComp, MatrixRank) where classic xorshift often fails. Moreover, the average number of CPU cycles per generated 32‑ or 64‑bit word remained comparable to the original xorshift, sometimes even slightly lower, because modern CPUs execute shift and XOR instructions in a single cycle and can pipeline the cascade efficiently.
The paper also discusses parallel suitability. By initializing each thread or process with a distinct seed but the same (a_i, b_i) parameters, independent streams are produced without cross‑correlation, a property essential for large‑scale Monte‑Carlo simulations, stochastic modeling, and parallel machine‑learning workloads. The deterministic nature of the parameter set also guarantees reproducibility across runs and platforms.
In conclusion, the authors demonstrate that a modest extension of the xorshift paradigm (adding a few extra shift‑XOR stages) yields generators with dramatically longer periods, high‑weight minimal polynomials, and superior statistical quality, while preserving the original’s hallmark speed and simplicity. Future work suggested includes further weight optimization, exploration of non‑binary fields, and adaptation to GPU or SIMD‑heavy architectures.