Asymptotic fingerprinting capacity for non-binary alphabets


We compute the channel capacity of non-binary fingerprinting under the Marking Assumption, in the limit of large coalition size c. The solution for the binary case was found by Huang and Moulin. They showed that asymptotically, the capacity is $1/(c^2 2\ln 2)$, the interleaving attack is optimal and the arcsine distribution is the optimal bias distribution. In this paper we prove that the asymptotic capacity for general alphabet size q is $(q-1)/(c^2 2\ln q)$. Our proof technique does not reveal the optimal attack or bias distribution. The fact that the capacity is an increasing function of q shows that there is a real gain in going to non-binary alphabets.


💡 Research Summary

The paper addresses the fundamental information‑theoretic limits of fingerprinting (traitor tracing) schemes when the underlying alphabet is non‑binary (size q ≥ 2) and the colluding coalition size c is large. Under the classic Marking Assumption—colluders may only modify positions where their copies differ—the authors extend the known binary result (capacity = 1/(2 c² ln 2), optimal interleaving attack, arcsine bias) to arbitrary q.
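The abstract's claim that the capacity grows with q can be checked directly from the stated formula (q−1)/(2c² ln q). The snippet below is a minimal sketch that only evaluates this expression; the function name `asymptotic_capacity` and the sample coalition size are illustrative choices, not taken from the paper.

```python
import math


def asymptotic_capacity(q: int, c: int) -> float:
    """Evaluate the asymptotic capacity (q - 1) / (2 c^2 ln q) quoted in the abstract.

    q is the alphabet size (q >= 2) and c the coalition size (c >= 1).
    The formula is an asymptotic (large-c) statement, so values for small c
    are only indicative.
    """
    if q < 2:
        raise ValueError("alphabet size q must be at least 2")
    if c < 1:
        raise ValueError("coalition size c must be at least 1")
    return (q - 1) / (2 * c**2 * math.log(q))


if __name__ == "__main__":
    c = 20  # hypothetical coalition size, chosen only for illustration
    for q in (2, 3, 4, 8, 16):
        print(f"q={q:2d}  capacity={asymptotic_capacity(q, c):.6e}")
```

For q = 2 the expression reduces to the binary result 1/(2c² ln 2), and the printed values increase with q because (q−1)/ln q is increasing for q ≥ 2.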

The problem is formalized as a two‑player zero‑sum game: the watermark designer chooses a bias distribution F over the per‑segment bias vectors p, while the colluders select an attack strategy θ (the probabilities of each output symbol Y given the tally Σ of symbols the coalition received). The payoff is the per‑segment mutual information I(Y;Σ|P), scaled by 1/c. Using Sion’s minimax theorem, the max‑min game is swapped to a min‑max form, so the inner maximization over F reduces to a point mass at the bias vector p that maximizes the payoff for the given attack θ.
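To make the payoff concrete, the sketch below evaluates (1/c)·I(Y;Σ|P=p) exactly for the binary alphabet under the interleaving attack, where the coalition outputs the symbol of a uniformly chosen member, so P(Y=1|Σ=s) = s/c. It assumes q = 2 and a fixed bias p, and it measures information in bits; the function names are illustrative and not from the paper.

```python
import math


def binary_entropy(x: float) -> float:
    """Binary entropy in bits, with the convention h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -(x * math.log2(x) + (1.0 - x) * math.log2(1.0 - x))


def interleaving_payoff(c: int, p: float) -> float:
    """Return (1/c) * I(Y; Sigma | P = p) in bits for q = 2 and interleaving.

    Sigma is the number of colluders holding symbol 1, so
    Sigma ~ Binomial(c, p), and Y | Sigma = s is Bernoulli(s / c).
    """
    # H(Y | Sigma): average the output entropy over the binomial tally.
    h_y_given_sigma = 0.0
    for s in range(c + 1):
        prob = math.comb(c, s) * p**s * (1.0 - p) ** (c - s)
        h_y_given_sigma += prob * binary_entropy(s / c)
    # Marginally P(Y = 1) = E[Sigma / c] = p, so H(Y) = h(p).
    return (binary_entropy(p) - h_y_given_sigma) / c


if __name__ == "__main__":
    c = 50  # hypothetical coalition size
    limit = 1.0 / (2 * c**2 * math.log(2))
    for p in (0.1, 0.3, 0.5):
        ratio = interleaving_payoff(c, p) / limit
        print(f"p={p:.1f}  payoff / (1/(2 c^2 ln 2)) = {ratio:.4f}")
```

For biases away from 0 and 1 the printed ratio should be close to 1, which is consistent with the binary capacity 1/(2c² ln 2) quoted earlier for the interleaving attack.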

To analyze the large‑c regime, the discrete attack probabilities θ are approximated by smooth functions g_y(x) defined on the probability simplex. These functions are assumed to be twice differentiable and must satisfy the marking constraint (g_y(x)=0 whenever x_y=0) and normalization (∑_y g_y(x)=1). The multinomial distribution of the tally Σ concentrates around its mean cp, with component variances of order c, so the relative fluctuations shrink as c grows and the mutual information can be Taylor‑expanded around the mean. The leading term scales as 1/(2 c² ln q)·T(p), where

 T(p) = Σ_y
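The concentration property used in the expansion can be checked with a small simulation: each component of Σ ~ Multinomial(c, p) has mean c·p_α and variance c·p_α(1−p_α), which are standard multinomial facts. The sketch below uses only the Python standard library; the helper names, the bias vector and the trial count are illustrative assumptions.

```python
import random


def sample_tally(c: int, p: list[float], rng: random.Random) -> list[int]:
    """Draw one tally vector Sigma ~ Multinomial(c, p)."""
    counts = [0] * len(p)
    for symbol in rng.choices(range(len(p)), weights=p, k=c):
        counts[symbol] += 1
    return counts


def empirical_moments(c: int, p: list[float], trials: int, seed: int = 0):
    """Estimate the per-symbol mean and variance of Sigma by simulation."""
    rng = random.Random(seed)
    sums = [0.0] * len(p)
    squares = [0.0] * len(p)
    for _ in range(trials):
        for a, n in enumerate(sample_tally(c, p, rng)):
            sums[a] += n
            squares[a] += n * n
    means = [s / trials for s in sums]
    variances = [squares[a] / trials - means[a] ** 2 for a in range(len(p))]
    return means, variances


if __name__ == "__main__":
    c, p = 100, [0.5, 0.3, 0.2]  # hypothetical coalition size and bias vector
    means, variances = empirical_moments(c, p, trials=2000)
    for a, pa in enumerate(p):
        print(
            f"symbol {a}: mean {means[a]:.2f} (theory {c * pa:.2f}), "
            f"variance {variances[a]:.2f} (theory {c * pa * (1 - pa):.2f})"
        )
```

Because the standard deviation grows like √c while the mean grows like c, the normalized tally Σ/c stays within O(1/√c) of p, which is what makes a second‑order expansion around p meaningful.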

