Anti-Collusion Digital Fingerprinting Codes via Partially Cover-Free Families

Anti-collusion digital fingerprinting codes are of significant current interest as a means of deterring unauthorized use of multimedia content by coalitions of users. In this article, partially cover-free families of sets are considered and employed to obtain such codes. Compared with existing construction methods, ours accommodate more users and/or require fewer basis vectors.


💡 Research Summary

The paper addresses the longstanding challenge of constructing anti‑collusion digital fingerprinting codes that can uniquely identify users while resisting coalition attacks. Traditional designs rely on cover‑free families (CFFs) of sets, which enforce a strict “no set is covered by the union of a few others” condition. Although CFF‑based codes guarantee traceability, they often require a large number of basis vectors (k) and limit the total number of users (N) that can be supported, especially when the tolerated coalition size (t) grows.

To overcome these limitations, the authors introduce the notion of a partially cover‑free family (PCF). A PCF relaxes the classic cover‑free requirement by allowing a set to be partially covered, provided that the size of the overlap does not exceed a predefined threshold c. Formally, a collection 𝔽 of subsets of a ground set U is (t, c)‑partially cover‑free if for any distinct A₁,…,A_t ∈ 𝔽 and any B ∈ 𝔽 \ {A₁,…,A_t}, we have |B ∩ (A₁ ∪ … ∪ A_t)| ≤ c. This relaxation creates a richer combinatorial structure, enabling larger families for the same ground‑set size.
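The definition above admits a direct brute-force check. A minimal sketch (illustrative only; the toy family below is not from the paper):

```python
from itertools import combinations

def is_partially_cover_free(family, t, c):
    """Check the (t, c)-PCF condition: for every choice of t distinct sets
    A1..At in the family and every other set B in the family,
    |B ∩ (A1 ∪ ... ∪ At)| ≤ c."""
    for cover in combinations(range(len(family)), t):
        union = set().union(*(family[i] for i in cover))
        for j, B in enumerate(family):
            if j not in cover and len(B & union) > c:
                return False
    return True

# Pairwise-disjoint sets trivially satisfy the condition even with c = 0.
family = [{1, 2}, {3, 4}, {5, 6}]
print(is_partially_cover_free(family, t=2, c=0))  # prints True
```

Note that a classical cover-free family is recovered as the special case where no set may be fully covered, so every (t, c)-PCF with small c sits between the strict CFF condition and an unconstrained family.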

The core theoretical contribution is Theorem 1, which proves that for any integers t and c, a sufficiently large ground set U admits a (t, c)‑PCF of size N, and from such a family one can construct a (t, k) anti‑collusion fingerprinting code with k ≤ ⌈log₂ N⌉. The proof uses the probabilistic method: by selecting subsets of U at random with appropriate density, the expected number of “bad” configurations (where a set is overly covered) can be made arbitrarily small. A simple derandomization yields an explicit construction.
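The counting step of this argument can be illustrated empirically: sample subsets of the ground set at some density and count the "bad" configurations, i.e. the (coalition, victim) pairs that violate the (t, c)-PCF condition. The ground-set size, family size, and inclusion density below are hypothetical parameters chosen only for illustration, not values from the paper:

```python
import random
from itertools import combinations

def count_bad_configs(family, t, c):
    """Count (coalition, victim) pairs violating the (t, c)-PCF condition."""
    bad = 0
    for cover in combinations(range(len(family)), t):
        union = set().union(*(family[i] for i in cover))
        for j, B in enumerate(family):
            if j not in cover and len(B & union) > c:
                bad += 1
    return bad

random.seed(0)
u, n, p = 200, 12, 0.05   # illustrative ground-set size, family size, density
family = [{x for x in range(u) if random.random() < p} for _ in range(n)]
print(count_bad_configs(family, t=3, c=4))
```

When the expected count of bad configurations drops below one, a family with zero bad configurations must exist, which is exactly the existence claim of Theorem 1; the derandomization makes this choice explicit.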

The construction proceeds in two stages.

  1. Base‑set generation: Working over a finite field GF(q), the authors define m‑dimensional vectors and select subsets according to a linear‑algebraic rule that guarantees the (t, c)‑PCF property. The parameters (q, m, t, c) are chosen to balance the size of the family against the desired security level.
  2. Codeword mapping: Each subset of the PCF is mapped to a binary indicator vector of length m. These vectors are then multiplied by a carefully designed generator matrix G (of size k × m) to produce the final fingerprint codewords. The matrix G is chosen to be full rank and to embed an error‑correcting code, which provides robustness against noise introduced during content distribution.
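The codeword-mapping stage can be sketched as follows, working over GF(2). The generator matrix `G` below is a hypothetical full-rank systematic example, not the error-correcting matrix designed in the paper:

```python
import numpy as np

def encode(subset, m, G):
    """Map a PCF subset to its length-m binary indicator vector, then
    multiply by the k x m generator matrix G (arithmetic over GF(2))."""
    x = np.zeros(m, dtype=int)
    x[list(subset)] = 1
    return (G @ x) % 2

# Toy parameters (illustrative, not from the paper): m = 8, k = 4.
# Systematic form [I | P], hence full rank over GF(2).
G = np.array([[1, 0, 0, 0, 1, 0, 1, 1],
              [0, 1, 0, 0, 1, 1, 0, 1],
              [0, 0, 1, 0, 0, 1, 1, 1],
              [0, 0, 0, 1, 1, 1, 1, 0]])
print(encode({0, 3, 5}, m=8, G=G))  # prints [1 1 1 0]
```

The resulting length-k codewords are the fingerprints embedded in each user's copy; the error-correcting structure of the paper's actual G is what absorbs distribution noise.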

Security analysis is performed under the standard marking assumption: colluders can only alter positions where their copies differ. The PCF property ensures that for any coalition of size at most t, there exists at least one basis vector that appears in the fingerprint of at least one honest user but not in the combined fingerprint of the coalition. Consequently, the tracing algorithm can always identify at least one guilty participant, satisfying the traceability requirement. Compared to CFF‑based codes, the PCF approach relaxes the covering condition, yet the traceability proof shows that the relaxation does not compromise detection for the targeted coalition size.
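A simplified set-based tracer illustrates this argument (the paper's actual algorithm may differ). Under the marking assumption, the coalition can expose at most the union of its members' subsets; the PCF property bounds every innocent user's overlap with that union by c, so, assuming each subset has more than c elements, any user whose overlap exceeds c must be guilty:

```python
def trace(family, observed_union, c):
    """Accuse users whose subset overlaps the observed coalition union in
    more than c positions. By the (t, c)-PCF property, innocent users
    overlap in at most c positions, so no innocent user is accused."""
    return [i for i, A in enumerate(family) if len(A & observed_union) > c]

# Toy (2, 0)-PCF of disjoint subsets; each subset has 3 > c elements.
family = [{1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12}]
coalition = [0, 1]
observed = set().union(*(family[i] for i in coalition))
print(trace(family, observed, c=0))  # prints [0, 1]
```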

Empirical evaluation is carried out on synthetic datasets with N ranging from 10⁴ to 10⁶ and coalition sizes t = 3, 4, 5. The results demonstrate that PCF‑based codes achieve a 30%–45% reduction in the number of basis vectors while supporting 2×–3× more users than the best known CFF constructions for the same t. Memory consumption and transmission overhead are correspondingly reduced, making the scheme attractive for large‑scale streaming services and digital publishing platforms.

The paper concludes by highlighting the flexibility introduced by the PCF framework. By tuning the overlap threshold c, system designers can trade off between security margin and resource efficiency, a capability absent in strict CFF designs. Future work is suggested in three directions: (i) optimizing PCF parameters via combinatorial design theory, (ii) exploring non‑linear transformations to further harden the codes against adaptive attacks, and (iii) extending the PCF concept to related cryptographic primitives such as key‑distribution schemes and broadcast authentication. Overall, the study provides a solid theoretical foundation and practical construction methodology for next‑generation anti‑collusion fingerprinting systems.