C-Codes: Cyclic Lowest-Density MDS Array Codes Constructed Using Starters for RAID 6
The distance-3 cyclic lowest-density MDS array code (called the C-Code) is a good candidate for RAID 6 because of its optimal storage efficiency, optimal update complexity, optimal length, and cyclic symmetry. In this paper, the underlying connections between C-Codes (or quasi-C-Codes) and starters in group theory are revealed. It is shown that each C-Code (or quasi-C-Code) of length $2n$ can be constructed using an even starter (or even multi-starter) in $(Z_{2n},+)$. It is also shown that each C-Code (or quasi-C-Code) has a twin C-Code (or quasi-C-Code). Then, four infinite families (three of which are new) of C-Codes of length $p-1$ are constructed, where $p$ is a prime. Besides the family of length $p-1$, C-Codes for some sporadic even lengths are also presented. Even so, there are still some even lengths (such as 8) for which C-Codes do not exist. To cover this limitation, two infinite families (one of which is new) of quasi-C-Codes of length $2(p-1)$ are constructed for these even lengths.
💡 Research Summary
The paper investigates a class of distance‑3 cyclic lowest‑density Maximum‑Distance‑Separable (MDS) array codes, called C‑Codes, which are especially suitable for RAID 6 systems. RAID 6 requires protection against any two simultaneous disk failures; therefore, an MDS code of distance three is needed. Traditional Reed‑Solomon based RAID 6 solutions rely on finite‑field arithmetic, which is computationally heavy. In contrast, array codes operate solely with XOR operations, making them attractive for high‑performance storage. Among array codes, the cyclic lowest‑density MDS codes achieve four optimal properties simultaneously: (i) MDS (optimal storage efficiency), (ii) minimum update complexity (each data bit touches exactly two parity bits), (iii) maximal possible length for a distance‑3 MDS code with that update complexity, and (iv) cyclic symmetry, which simplifies hardware implementation.
The authors first formalize C‑Codes as a special case of B‑Codes with an added cyclic symmetry. A C‑Code of length 2n consists of an n × 2n binary matrix H₂ₙ that satisfies four algebraic conditions: the last column of each sub‑matrix H_k equals the k‑th column of the identity matrix, each H_k is obtained by cyclically shifting H₀, each row has weight 2n − 1, and any pair of distinct columns yields a nonsingular 2 × 2 sub‑matrix. This definition guarantees the MDS property and the optimal update complexity of 2.
To understand the combinatorial structure, the paper adopts a graph‑theoretic viewpoint. Each parity bit corresponds to a vertex, and each information bit (which is protected by exactly two parity bits) corresponds to an edge connecting the two vertices. Consequently, a C‑Code of length 2n can be represented by a (2n − 2)-regular graph on 2n vertices. The cyclic symmetry translates into a simple rule: the edge set for column i is obtained by adding i (mod 2n) to every endpoint of the edge set for column 0.
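The cyclic rule can be sketched in a few lines of Python. The base edge set below is a hypothetical example in ℤ₆ (2n = 6), not the column-0 edge set of any code from the paper, and the function name is an assumption made for illustration. The sketch only shows that shifting a base set through all 2n positions can produce a (2n − 2)-regular graph on 2n vertices.

```python
from collections import Counter

def shift_edges(base_edges, i, order):
    """Add i (mod order) to both endpoints of every edge; store edges sorted."""
    return [tuple(sorted(((a + i) % order, (b + i) % order)))
            for a, b in base_edges]

order = 6                       # 2n with n = 3
base_edges = [(1, 2), (3, 5)]   # hypothetical edge set for column 0

# Collect the edge sets of all columns 0 .. 2n-1.
all_edges = [e for i in range(order) for e in shift_edges(base_edges, i, order)]

# Degree of each vertex in the union of all shifted edge sets.
degree = Counter(v for edge in all_edges for v in edge)
print(sorted(degree.items()))
```

With this base set every vertex ends up with degree 4 = 2n − 2, and the 12 edges are exactly those of the complete graph on ℤ₆ with the three edges of difference 3 removed.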
The central theoretical contribution is the equivalence between constructing a C‑Code of length 2n and constructing a bipyramidal perfect one‑factorization (P1F) of a 2n‑regular graph on 2n + 2 vertices (the two extra vertices are denoted ∞₁ and ∞₂ and are never adjacent). A one‑factor is a perfect matching; a P1F of this graph is a set of 2n edge‑disjoint one‑factors such that the union of any two distinct one‑factors forms a Hamiltonian cycle. The authors prove that a C‑Code of length 2n exists if and only if a bipyramidal P1F of the described graph exists.
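The perfectness condition is easy to test mechanically. The following Python sketch checks that every factor is a perfect matching and that the union of any two factors is a single Hamiltonian cycle; the function names and the small test graphs are assumptions chosen for illustration, and the bipyramidal symmetry itself is not checked.

```python
from itertools import combinations

def is_one_factor(factor, vertices):
    """A one-factor (perfect matching) covers every vertex exactly once."""
    covered = [v for edge in factor for v in edge]
    return len(covered) == len(vertices) and set(covered) == set(vertices)

def union_is_hamiltonian_cycle(f1, f2, vertices):
    """Walk the union of two one-factors, alternating f1 and f2 edges."""
    mate1, mate2 = {}, {}
    for a, b in f1:
        mate1[a], mate1[b] = b, a
    for a, b in f2:
        mate2[a], mate2[b] = b, a
    start = next(iter(vertices))
    current, steps = start, 0
    while True:
        current = mate2[mate1[current]]  # one f1 edge, then one f2 edge
        steps += 2
        if current == start:
            break
    # The cycle through `start` is Hamiltonian iff it visits all vertices.
    return steps == len(vertices)

def is_perfect_one_factorization(factors, vertices):
    if not all(is_one_factor(f, vertices) for f in factors):
        return False
    return all(union_is_hamiltonian_cycle(f1, f2, vertices)
               for f1, f2 in combinations(factors, 2))

# The one-factorization of the complete graph K4 is perfect.
k4 = [[(0, 1), (2, 3)], [(0, 2), (1, 3)], [(0, 3), (1, 2)]]
print(is_perfect_one_factorization(k4, range(4)))  # prints True
```

Two one-factors whose union splits into shorter cycles, such as [(0,1),(2,3),(4,5),(6,7)] and [(1,2),(3,0),(5,6),(7,4)] on eight vertices, fail the check.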
The next step links bipyramidal P1Fs to even starters in the additive group (ℤ₂ₙ,+). An even starter is a set of n − 1 unordered pairs {a_i,b_i} of non‑zero elements of ℤ₂ₙ, with no element used twice, such that the 2n − 2 differences ±(a_i − b_i) (mod 2n) are all distinct and cover every non‑zero element except n. The paper shows that each C‑Code of length 2n can be constructed using an even starter: the starter induces a bipyramidal one‑factorization, and a C‑Code results when that one‑factorization is perfect. This correspondence provides a concrete algebraic method for constructing C‑Codes.
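A minimal Python sketch of a checker for the even‑starter conditions follows. It assumes the standard definition, in which the n − 1 pairs use distinct non‑zero elements and their differences ±(a − b) cover every non‑zero element of ℤ₂ₙ except n. The function name and the example in ℤ₆ are illustrative and not taken from the paper; the check covers only the starter conditions, not the MDS property of any resulting code.

```python
def is_even_starter(pairs, order):
    """Check the even-starter conditions in (Z_order, +), with order = 2n."""
    if order % 2 != 0:
        return False
    n = order // 2
    if len(pairs) != n - 1:
        return False
    elements = [x % order for pair in pairs for x in pair]
    # The n-1 pairs must use 2n-2 distinct non-zero elements.
    if 0 in elements or len(set(elements)) != 2 * (n - 1):
        return False
    diffs = set()
    for a, b in pairs:
        diffs.add((a - b) % order)
        diffs.add((b - a) % order)
    # The +/- differences must be exactly the non-zero elements other than n.
    return diffs == set(range(1, order)) - {n}

# Hypothetical example in Z_6: differences are +/-1 and +/-2, element 4 is unused.
print(is_even_starter([(1, 2), (3, 5)], 6))  # prints True
```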
A notable corollary is the existence of twin C‑Codes. By swapping each pair in the starter (or equivalently applying a certain involution on the one‑factorization), one obtains a distinct C‑Code with the same parameters, called the twin code. The twin code offers flexibility in disk layout without sacrificing performance.
Using the starter framework, the authors enumerate all C‑Codes for many even lengths. They list explicit constructions for lengths 4, 6, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 50. Eight of these (14, 20, 24, 26, 32, 34, 38, 50) are not of the form p − 1 for a prime p and therefore fall outside the infinite families of length p − 1. C‑Codes do not exist for every even length, however: length 8 is a concrete counterexample for which no C‑Code can be built.
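The lengths listed above can be split into those of the form p − 1 (p prime) and the remaining sporadic lengths with a short Python check. This is only an arithmetic sanity check on the list as quoted in this summary; the variable names are illustrative.

```python
def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

listed_lengths = [4, 6, 10, 12, 14, 16, 18, 20, 22, 24,
                  26, 28, 30, 32, 34, 36, 38, 50]

# A length L belongs to the p - 1 family exactly when L + 1 is prime.
prime_family = [L for L in listed_lengths if is_prime(L + 1)]
sporadic = [L for L in listed_lengths if not is_prime(L + 1)]

print(prime_family)
print(sporadic)  # prints [14, 20, 24, 26, 32, 34, 38, 50]
```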
To address lengths where C‑Codes are impossible, the paper introduces quasi‑C‑Codes. A quasi‑C‑Code of length 2n is defined to be κ‑quasi‑cyclic (κ divides 2n): the columns can be partitioned into κ groups, each of which is cyclically symmetric. The construction uses even multi‑starters, which are collections of several even starters whose difference sets together cover all non‑zero elements. By employing even multi‑starters, the authors construct infinite families of quasi‑C‑Codes of length 2(p − 1) for any prime p, thereby covering the previously missing lengths such as 8.
The paper further contributes four infinite families of C‑Codes of length p − 1 (p prime). These families are derived from two infinite families of even starters in the multiplicative group ℤ*_p. Three of the four families are new; the fourth coincides with the construction by Cassuto and Bruck (2010). Moreover, the authors demonstrate that any non‑cyclic B‑Code of length p − 1 previously reported in the literature can be transformed into a cyclic C‑Code via the starter correspondence.
Similarly, for length 2(p − 1), they present two infinite families of quasi‑C‑Codes based on an infinite family of even 2‑starters in ℤ_{2(p‑1)}. They also show that all known non‑cyclic B‑Codes of length 2(p − 1) can be turned into quasi‑C‑Codes.
In summary, the paper establishes a deep algebraic bridge between RAID 6‑oriented array codes and classic combinatorial objects (starters, perfect one‑factorizations). This bridge yields systematic construction methods, explains the existence (or non‑existence) of codes for specific lengths, introduces twin and quasi‑variants for flexibility, and expands the catalog of optimal RAID 6 codes with several new infinite families and sporadic examples. The results not only advance the theory of low‑density MDS array codes but also have immediate practical relevance for designing storage systems that demand minimal update overhead, maximal storage efficiency, and straightforward hardware implementation.