Towards Anytime-Valid Statistical Watermarking

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

The proliferation of Large Language Models (LLMs) necessitates efficient mechanisms to distinguish machine-generated content from human text. While statistical watermarking has emerged as a promising solution, existing methods suffer from two critical limitations: the lack of a principled approach for selecting sampling distributions and the reliance on fixed-horizon hypothesis testing, which precludes valid early stopping. In this paper, we bridge this gap by developing the first e-value-based watermarking framework, Anchored E-Watermarking, that unifies optimal sampling with anytime-valid inference. Unlike traditional approaches where optional stopping invalidates Type-I error guarantees, our framework enables valid, anytime-inference by constructing a test supermartingale for the detection process. By leveraging an anchor distribution to approximate the target model, we characterize the optimal e-value with respect to the worst-case log-growth rate and derive the optimal expected stopping time. Our theoretical claims are substantiated by simulations and evaluations on established benchmarks, showing that our framework can significantly enhance sample efficiency, reducing the average token budget required for detection by 13-15% relative to state-of-the-art baselines.


💡 Research Summary

The paper addresses a pressing problem in the era of large language models (LLMs): how to reliably detect machine‑generated text while preserving generation quality and allowing for efficient, real‑time decision making. Existing statistical watermarking schemes embed a subtle bias into the token sampling process and then test for dependence between the generated tokens and a secret seed sequence. Although theoretically sound, current approaches suffer from two major drawbacks. First, the choice of the seed‑generation distribution is heuristic; there is no principled way to select a distribution that maximizes detection power while keeping the watermark distortion‑free. Second, detection is performed over a fixed token horizon using p‑values, which invalidates Type‑I error control under optional stopping (the well‑known “p‑hacking” problem). Consequently, detectors cannot stop early when confidence is already high, leading to unnecessary token consumption and reduced robustness against adaptive attacks that modify later parts of the text.

To overcome these limitations, the authors introduce Anchored E‑Watermarking, the first framework that applies e‑values and test supermartingales to statistical watermark detection. An e‑value is a non‑negative random variable whose expectation under the null hypothesis is at most one. When the running product of sequentially constructed e‑values forms a non‑negative supermartingale under the null (a test supermartingale), Ville’s inequality guarantees anytime validity: the probability that the process ever reaches 1/α is at most α, so the bound holds at any data‑dependent stopping time τ. This property allows a detector to monitor evidence continuously and stop as soon as the accumulated evidence is sufficient, without inflating the false‑positive rate.
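
The stopping rule described above can be illustrated with a small sketch. It is not the paper's detector: the Bernoulli null (p = 0.5) and alternative (p = 0.7), and the helper names `stopping_time`, `bernoulli_e_value`, and `false_alarm_rate`, are assumptions chosen only to show how a running product of e-values is compared against 1/α.

```python
import math
import random


def stopping_time(e_values, alpha):
    """Return the 1-based step at which the running product of e-values
    first reaches 1/alpha, or None if it never does."""
    log_threshold = math.log(1.0 / alpha)
    log_wealth = 0.0  # log of the running product, kept in log space for stability
    for t, e in enumerate(e_values, start=1):
        if e <= 0.0:
            return None  # the product hits zero and can never recover
        log_wealth += math.log(e)
        if log_wealth >= log_threshold:
            return t
    return None


def bernoulli_e_value(x, p_null=0.5, p_alt=0.7):
    """Likelihood ratio of one Bernoulli observation (toy assumption).
    Its expectation under p_null is exactly 1, so it is a valid e-value."""
    numerator = p_alt if x == 1 else 1.0 - p_alt
    denominator = p_null if x == 1 else 1.0 - p_null
    return numerator / denominator


def false_alarm_rate(alpha, horizon=200, trials=2000, seed=0):
    """Fraction of null streams on which the detector ever stops."""
    rng = random.Random(seed)
    alarms = 0
    for _ in range(trials):
        stream = (bernoulli_e_value(int(rng.random() < 0.5)) for _ in range(horizon))
        if stopping_time(stream, alpha) is not None:
            alarms += 1
    return alarms / trials
```

Because the detector may stop at whichever step first crosses the threshold, the false-alarm frequency on null streams is the quantity that Ville's inequality bounds by α; a fixed-horizon p-value offers no comparable guarantee when it is checked repeatedly.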

The framework hinges on an anchor distribution \(p_0\) that both the generator and the detector share. The anchor is assumed to lie within a δ‑proximity ball around the true target distribution \(q\). In practice, \(p_0\) can be an open‑source model that approximates the proprietary LLM. During generation, the anchor is used to draw pseudorandom seeds; these seeds bias token selection (e.g., via green‑red lists or speculative decoding) to embed the watermark. Because the seed distribution is anchored to \(p_0\), the watermark remains distortion‑free: the marginal token distribution stays close to \(q\).
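
As a rough illustration of the proximity assumption, the sketch below checks whether an anchor next-token distribution lies within δ of a target distribution. The summary does not say which divergence defines the ball, so total variation distance is used here purely as an assumption, and the function names are hypothetical.

```python
def total_variation(p, q):
    """Total variation distance between two distributions over the same
    finite vocabulary (assumed metric; the summary does not specify one)."""
    if len(p) != len(q):
        raise ValueError("distributions must share the same support size")
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def anchor_is_delta_close(anchor, target, delta):
    """True if the anchor lies inside the delta-ball around the target."""
    return total_variation(anchor, target) <= delta


# Toy next-token distributions over a three-token vocabulary.
target = [0.5, 0.3, 0.2]     # stands in for the proprietary model
anchor = [0.45, 0.35, 0.2]   # stands in for the open-source approximation
```

With these toy numbers the distance is about 0.05, so the anchor would satisfy a δ of 0.1 but not a δ of 0.01.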

For detection, the authors construct an e‑value that measures the likelihood ratio between the joint watermarked distribution and the product of the anchor and seed marginals. They analytically derive the optimal worst‑case log‑growth rate \(J^*\) of this e‑value under the alternative hypothesis.
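
A toy calculation can make the likelihood-ratio construction concrete. The sketch below assumes a two-token vocabulary, two seed values, and invented probabilities; it is not the paper's optimal e-value. It only checks two generic facts: the ratio has expectation one under the product distribution in its denominator, and its expected logarithm under the joint distribution (the log-growth rate) is non-negative.

```python
import math


def likelihood_ratio_e_values(joint, anchor, seed_marginal):
    """e[x][s] = joint(x, s) / (anchor(x) * seed_marginal(s))."""
    return [
        [joint[x][s] / (anchor[x] * seed_marginal[s]) for s in range(len(seed_marginal))]
        for x in range(len(anchor))
    ]


def expectation_under_product(e, anchor, seed_marginal):
    """Expectation of e when token and seed are drawn independently
    from the anchor and the seed marginal."""
    return sum(
        anchor[x] * seed_marginal[s] * e[x][s]
        for x in range(len(anchor))
        for s in range(len(seed_marginal))
    )


def log_growth_rate(e, joint):
    """Expected log e-value under the joint (watermarked) distribution."""
    return sum(
        joint[x][s] * math.log(e[x][s])
        for x in range(len(joint))
        for s in range(len(joint[x]))
        if joint[x][s] > 0.0
    )


# Invented toy numbers: rows index tokens, columns index seeds.
joint = [[0.4, 0.1], [0.1, 0.4]]   # token and seed are correlated
anchor = [0.6, 0.4]                # deliberately not the exact token marginal
seed_marginal = [0.5, 0.5]

e = likelihood_ratio_e_values(joint, anchor, seed_marginal)
null_expectation = expectation_under_product(e, anchor, seed_marginal)
growth = log_growth_rate(e, joint)
```

Here `null_expectation` equals one up to rounding because the anchor and seed factors cancel, leaving the sum of the joint probabilities. A larger `growth` means evidence accumulates faster per token, which is why the log-growth rate is the natural objective when the goal is a short stopping time.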

