Fingerprinting with Minimum Distance Decoding

February 23, 2026

Reading time: 6 minutes

#Mathematics #Cryptography and Security #Computer Science #Information Theory

📝 Original Info

Title: Fingerprinting with Minimum Distance Decoding
ArXiv ID: 0710.2705
Date: 2009-03-02
Authors: Researchers from original ArXiv paper

📝 Abstract

This work adopts an information theoretic framework for the design of collusion-resistant coding/decoding schemes for digital fingerprinting. More specifically, the minimum distance decision rule is used to identify 1 out of t pirates. Achievable rates, under this detection rule, are characterized in two distinct scenarios. First, we consider the averaging attack where a random coding argument is used to show that the rate 1/2 is achievable with t=2 pirates. Our study is then extended to the general case of arbitrary $t$ highlighting the underlying complexity-performance tradeoff. Overall, these results establish the significant performance gains offered by minimum distance decoding as compared to other approaches based on orthogonal codes and correlation detectors. In the second scenario, we characterize the achievable rates, with minimum distance decoding, under any collusion attack that satisfies the marking assumption. For t=2 pirates, we show that the rate $1-H(0.25)\approx 0.188$ is achievable using an ensemble of random linear codes. For $t\geq 3$, the existence of a non-resolvable collusion attack, with minimum distance decoding, for any non-zero rate is established. Inspired by our theoretical analysis, we then construct coding/decoding schemes for fingerprinting based on the celebrated Belief-Propagation framework. Using an explicit repeat-accumulate code, we obtain a vanishingly small probability of misidentification at rate 1/3 under averaging attack with t=2. For collusion attacks which satisfy the marking assumption, we use a more sophisticated accumulate repeat accumulate code to obtain a vanishingly small misidentification probability at rate 1/9 with t=2. These results represent a marked improvement over the best available designs in the literature.

💡 Deep Analysis

Deep Dive into Fingerprinting with Minimum Distance Decoding.

📄 Full Content

Digital fingerprinting is a paradigm for protecting copyrighted data against illegal distribution [1]. In a nutshell, a distributor, i.e., the provider of copyrighted data, wishes to distribute its data D among a number of licensed users. Each licensed copy is identified with a mark, which will be referred to as a fingerprint in the sequel, composed of a set of redundant digits embedded inside the copyrighted data. The locations of the redundant digits are kept hidden from the users and are only known to the distributor. Their positions, however, remain the same for all users.

If any user re-distributes its data in an unauthorized manner, it will be easily identified by its fingerprint. However, several users may collude to form a coalition enabling them to produce an unauthorized copy which is difficult to trace. In the literature, the colluding members are typically referred to as pirates or colluders. Hence, the need arises for the design of collusion-resistant digital fingerprinting techniques. Our work develops an information theoretic framework for the design of low complexity pirate-identification schemes.

To enable a succinct development of our results, we first consider the widely studied averaging attack [2]. In this strategy, the colluders average their media contents to produce the forged copy. An explicit fingerprinting code construction for this attack was proposed in [2]. In this construction, however, the maximum number of users $M$ grows only polynomially with the fingerprinting code length $n$ (more precisely, $M = O(n^2)$). Clearly, this rate of growth corresponds to a zero rate in the information theoretic sense. This motivates our pursuit of a fingerprinting scheme that supports a number of users growing exponentially with the code length, while allowing for low complexity pirate-identification strategies. Towards this goal, we use a random coding argument to establish the existence of a rate 0.5 linear fingerprinting code that achieves a vanishingly small probability of misidentification when 1) only t = 2 pirates are involved in the averaging attack and 2) the low complexity minimum distance (MD) decoder is used to identify one of the two pirates. The enabling observation is the intimate connection between the scenario under consideration and the binary erasure channel (BEC). This result is then extended to the general case with an arbitrary coalition size t, where the tradeoff between complexity and performance is highlighted.
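The connection to the BEC can be illustrated with a small simulation. The sketch below is not the paper's construction: it assumes binary fingerprints in {0, 1}, models the averaging attack as a coordinate-wise mean of the two pirates' fingerprints (so positions where they disagree become 0.5, which plays the role of an erasure), and uses an exhaustive L1-distance search as a stand-in for the MD decoder. The function names, the code dimensions, and the pirate indices are all hypothetical choices made for illustration.

```python
import itertools
import random

def random_linear_codebook(k, n, rng):
    """Enumerate the 2^k codewords generated by a random k x n binary matrix."""
    gen = [[rng.randint(0, 1) for _ in range(n)] for _ in range(k)]
    book = []
    for msg in itertools.product((0, 1), repeat=k):
        word = tuple(sum(m * row[j] for m, row in zip(msg, gen)) % 2
                     for j in range(n))
        book.append(word)
    return book

def averaging_attack(x1, x2):
    """Two pirates average their copies; disagreeing positions become 0.5."""
    return tuple((a + b) / 2 for a, b in zip(x1, x2))

def distance(y, c):
    """L1 distance between the forged copy y and a candidate fingerprint c."""
    return sum(abs(a - b) for a, b in zip(y, c))

def md_decode(y, book):
    """Index of a codeword closest to y (ties go to the lowest index)."""
    return min(range(len(book)), key=lambda i: distance(y, book[i]))

rng = random.Random(0)
k, n = 4, 16                      # illustrative rate k/n = 0.25 (assumption)
book = random_linear_codebook(k, n, rng)
pirates = (3, 10)                 # hypothetical coalition of t = 2 users
forged = averaging_attack(book[pirates[0]], book[pirates[1]])
accused = md_decode(forged, book)
erasures = sum(1 for v in forged if v == 0.5)
print("erased positions:", erasures, "accused user:", accused)
```

In this model each pirate sits at distance 0.5 times the number of erased positions from the forged copy, and an innocent user can tie with them only by matching every non-erased position. Misidentification is therefore the event that an innocent codeword is consistent with the unerased symbols, which is the same event that causes decoding failure on an erasure channel.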

Building on our analysis for the averaging attack, we then proceed to fingerprinting strategies that are resistant to more general forging techniques. More specifically, we adopt the marking assumption first proposed in [1]. In this framework, the pirates attempt to identify the positions occupied by the fingerprinting digits by comparing their copies. Afterwards, they can modify only the identified coordinates, in any desired way, to minimize the probability of being traced. The validity of the marking assumption hinges on the assumption that any modification to the data content D will damage it permanently. This prevents the users from modifying any location that they have not identified as a fingerprinting digit, since it may be a data symbol. Boneh and Shaw [1] were the first to construct fingerprinting codes that are resistant to attacks satisfying the marking assumption. This approach was later extended in [3] using the idea of separating codes [4]. To the best of our knowledge, the best available explicit binary fingerprinting codes are the low rate codes presented in [3]. For example, for t = 2, the best available code has a rate of approximately 0.0092. More recently, upper and lower bounds on the binary fingerprinting capacity for t = 2 and t = 3 were derived in [5]. The decoder used in [5], however, was based on exhaustive search and hence would suffer from a complexity that grows exponentially in the code length. This prohibitive complexity motivates our proposed approach. In this paper, we show that using linear fingerprinting codes and MD decoding, one can achieve rates below 0.188 when the coalition size is t = 2. Unfortunately, the proposed approach does not scale to t ≥ 3. This negative result calls for a more sophisticated identification technique inspired by the analogy between our set-up and multiple access channels. Our results in this regard will be reported elsewhere.
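The marking assumption can be made concrete with a short sketch. The code below assumes binary fingerprints and two pirates; the helper names and the example fingerprints are hypothetical, and the particular forgery (alternating between the two copies on the detectable positions) is only one of many attacks the assumption permits. The last lines evaluate the quantity $1-H(0.25)$ quoted in the abstract, where $H$ is the binary entropy function.

```python
import math

def detectable_positions(copies):
    """Positions where the pirates' copies differ, hence visible to them."""
    n = len(copies[0])
    return [j for j in range(n) if len({c[j] for c in copies}) > 1]

def satisfies_marking_assumption(forged, copies):
    """True if the forged copy is unchanged wherever all pirates agree."""
    detectable = set(detectable_positions(copies))
    return all(forged[j] == copies[0][j]
               for j in range(len(forged)) if j not in detectable)

def hamming(a, b):
    """Number of positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

def binary_entropy(p):
    """Binary entropy H(p) in bits, for 0 < p < 1."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Hypothetical fingerprints of the two colluders.
x1 = (0, 1, 1, 0, 1, 0, 0, 1)
x2 = (0, 1, 0, 1, 1, 0, 1, 1)
detectable = detectable_positions([x1, x2])

# One admissible forgery: start from x1 and copy x2 on every other
# detectable position, leaving all undetectable positions untouched.
forged = list(x1)
for j in detectable[::2]:
    forged[j] = x2[j]
forged = tuple(forged)

print("detectable positions:", detectable)
print("marking assumption holds:", satisfies_marking_assumption(forged, [x1, x2]))
print("distances to pirates:", hamming(forged, x1), hamming(forged, x2))
print("1 - H(0.25) =", 1 - binary_entropy(0.25))
```

Because the forged copy equals both fingerprints outside the detectable positions, its Hamming distances to the two pirates always add up to the number of detectable positions. The pirates can therefore trade distance from one member against distance from the other, but cannot move away from both at once.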

Since the complexity of the exact MD decoder can be prohibitive when the code length is long, we develop a low complexity belief-propagation (BP) identification approach [6][7]. This detector requires only linear complexity in n and offers a remarkable performance gain over the best known explicit constructions for fingerprinting [3][2]. For example, we propose a modified iterative decoder tailored for the averaging attack with t = 2. Using this decoder along with an explicit repeat-accumulate (RA) fingerprinting code, we achieve a vanishingly small probability of misidentificat

…(Full text truncated)…

This content is AI-processed based on ArXiv data.
