WorldCup Sampling for Multi-bit LLM Watermarking

Notice: This research summary and analysis were automatically generated using AI technology. For authoritative details, please refer to the original arXiv paper.

As large language models (LLMs) generate increasingly human-like text, watermarking offers a promising solution for reliable attribution beyond mere detection. While multi-bit watermarking enables richer provenance encoding, existing methods largely extend zero-bit schemes through seed-driven steering, leading to indirect information flow, limited effective capacity, and suboptimal decoding. In this paper, we propose WorldCup, a multi-bit watermarking framework for LLMs that treats sampling as a natural communication channel and embeds message bits directly into token selection via a hierarchical competition mechanism guided by complementary signals. WorldCup further adopts entropy-aware modulation to preserve generation quality and supports robust message recovery through confidence-aware decoding. Comprehensive experiments show that WorldCup achieves a strong balance across capacity, detectability, robustness, text quality, and decoding efficiency, consistently outperforming prior baselines and laying a solid foundation for future LLM watermarking studies.


💡 Research Summary

The paper “WorldCup Sampling for Multi‑bit LLM Watermarking” addresses the growing need for reliable attribution of text generated by large language models (LLMs). Existing multi‑bit watermarking methods extend zero‑bit detection schemes by first hashing a secret key and the context into a random seed, then using that seed to perturb the model’s sampling distribution. This indirect “seed‑driven steering” creates a multi‑stage information pipeline (message → seed → perturbation → token) that incurs entropy loss, limits per‑token capacity, and requires inefficient exhaustive search during decoding.

WorldCup proposes a fundamentally different perspective: treat the sampling step itself as a natural communication channel and embed bits directly into token selection. The core of the framework consists of five tightly coupled components:

  1. Hierarchical Tournament Sampling – At each generation step, the model draws N^m candidate tokens from its native distribution. Tokens then compete across m layers of a tournament; at each layer ℓ a pseudo‑random scoring function gℓ(x, r) (where r is a hash‑derived seed) ranks the candidates, and the highest‑scoring token advances. The final winner becomes the emitted token. This process can be vectorized for efficiency.

  2. Complementary g‑value Functions – To encode a binary bit, two families of scoring functions, g0 and g1, are defined such that g1(x, r) = 1 – g0(x, r). Assuming each gℓ follows a Bernoulli(0.5) distribution, the two families are perfectly anti‑correlated: tokens favored under g0 are disfavored under g1. The authors prove that this construction maximizes the statistical distance between the two embedding distributions, yielding the strongest possible decision margin for decoding.

  3. Multi‑bit Generalization – By introducing k independent g‑value families, each token can simultaneously encode k bits, raising the per‑token capacity from 1 bit to k bits. The tournament is extended to select the appropriate family for each bit group, preserving the same anti‑correlated property across all bits.

  4. Entropy‑Aware Modulation – The framework monitors the entropy of the model’s output distribution at each step. When entropy is low (high confidence), the watermark bias δ is increased, strengthening the signal; when entropy is high, δ is reduced to avoid degrading fluency. This adaptive modulation maintains high text quality while preserving detectability.

  5. Confidence‑Aware Decoding – Instead of simple majority voting over token‑level watermark scores, the decoder weights each token’s contribution by its probability under the original model distribution. This mitigates the disproportionate influence of low‑entropy tokens and improves robustness against noise, paraphrasing, and partial deletions.
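Taken together, the components above can be sketched in a few dozen lines. The following is a simplified illustration under stated assumptions, not the paper's implementation: it uses pairwise matches (N = 2), a single hash-derived seed per position shared across tournament layers, and hypothetical helper names (`g0`, `g1`, `tournament_step`, `entropy_bias`, `decode_bit`).

```python
# Simplified sketch of WorldCup-style embedding and decoding.
# Assumptions (not from the paper): pairwise matches (N = 2), one seed
# per position shared across layers, and toy helper names throughout.
import hashlib
import math
import random

def g0(token_id: int, seed: int) -> int:
    """Pseudo-random Bernoulli(0.5) score of a token under a seed."""
    h = hashlib.sha256(f"{token_id}:{seed}".encode()).digest()
    return h[0] & 1

def g1(token_id: int, seed: int) -> int:
    """Complementary family: g1 = 1 - g0 (perfectly anti-correlated)."""
    return 1 - g0(token_id, seed)

def tournament_step(candidates, bit, seed, m):
    """m-layer pairwise tournament over 2**m sampled candidates.
    The message bit selects the scoring family, so the winner carries
    a statistical bias toward high scores under that family."""
    g = g1 if bit else g0
    layer = list(candidates)
    for _ in range(m):
        layer = [a if g(a, seed) >= g(b, seed) else b
                 for a, b in zip(layer[::2], layer[1::2])]
    return layer[0]

def entropy_bias(probs, delta_max=2.0):
    """Entropy-aware modulation: scale the watermark strength down as
    the model's output entropy rises, to protect fluency."""
    h = -sum(p * math.log(p) for p in probs if p > 0)
    h_max = math.log(len(probs))
    return delta_max * (1.0 - h / h_max)

def decode_bit(tokens, seeds, probs):
    """Confidence-aware decoding: each token votes for the family that
    scores it highest, weighted by its original model probability."""
    vote = sum(p * (2 * g0(t, s) - 1) for t, s, p in zip(tokens, seeds, probs))
    return 0 if vote > 0 else 1  # positive vote: tokens favored g0
```

With 2^m candidates and Bernoulli(0.5) scores, the tournament winner scores 1 under the chosen family with probability 1 − 2^(−2^m), so even short texts give the decoder a wide margin.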

Theoretical analysis shows that the complementary design attains the maximal expected squared difference E[(g1(x, r) − g0(x, r))²] between the two scoring families, which translates into the widest achievable decision margin at decoding time.
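Under the Bernoulli(0.5) assumption on each g-value, the complementary construction g1 = 1 − g0 makes this maximality a one-line check (a sketch of the argument, not the paper's full proof):

```latex
\mathbb{E}\!\left[\big(g_1(x,r)-g_0(x,r)\big)^2\right]
  = \mathbb{E}\!\left[\big(1-2\,g_0(x,r)\big)^2\right] = 1,
```

since g0(x, r) ∈ {0, 1} forces (1 − 2·g0)² = 1 pointwise, and no pair of [0, 1]-valued score families can exceed this bound.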

