$f$-Differential Privacy Filters: Validity and Approximate Solutions


Accounting for privacy loss under fully adaptive composition – where both the choice of mechanisms and their privacy parameters may depend on the entire history of prior outputs – is a central challenge in differential privacy (DP). In this setting, privacy filters are stopping rules for compositions that ensure a prescribed global privacy budget is not exceeded. It remains unclear whether optimal trade-off-function-based notions, such as $f$-DP, admit valid privacy filters under fully adaptive interaction. We show that the natural approach to defining an $f$-DP filter – composing individual trade-off curves and stopping when the prescribed $f$-DP curve is crossed – is fundamentally invalid. We characterise when and why this failure occurs, and establish necessary and sufficient conditions under which the natural filter is valid. Furthermore, we prove a fully adaptive central limit theorem for $f$-DP and construct an approximate Gaussian DP filter for subsampled Gaussian mechanisms at small sampling rates $q<0.2$ and large sampling rates $q>0.8$, yielding tighter privacy guarantees than filters based on Rényi DP in the same setting.


💡 Research Summary

This paper investigates the problem of accounting for privacy loss under fully adaptive composition, where both the choice of mechanisms and their privacy parameters may depend on the entire history of previous outputs. In such a setting, privacy filters act as stopping rules that guarantee a pre‑specified global privacy budget is never exceeded. While fully adaptive filters are well‑understood for (ε, δ)‑DP and for Rényi DP (RDP), it has been unclear whether the optimal trade‑off‑function based notion of privacy, f‑DP, admits valid filters.

The authors first define the “natural” f‑DP filter: at each step the trade‑off functions of the already executed mechanisms are composed via the tensor product, and the filter continues only while this composed function dominates a target budget function f_B. This mirrors the simple additive rule for RDP and the known non‑adaptive composition rule for f‑DP.
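The bookkeeping behind this natural filter is easiest to see for Gaussian DP, where the tensor product has a closed form. The sketch below (plain Python; the function names are illustrative, not from the paper) builds μ-GDP trade-off curves, composes them, and evaluates the filter's continuation check against a budget curve G_{μ_B}:

```python
from math import sqrt
from statistics import NormalDist

Phi = NormalDist().cdf
Phi_inv = NormalDist().inv_cdf

def gdp_tradeoff(mu):
    """Trade-off curve of mu-GDP: type II error as a function of type I error alpha."""
    return lambda alpha: Phi(Phi_inv(1.0 - alpha) - mu)

def compose_gdp(mus):
    """Tensor product of Gaussian trade-off curves has a closed form:
    G_mu1 (x) ... (x) G_muk = G_sqrt(mu1^2 + ... + muk^2)."""
    return sqrt(sum(m * m for m in mus))

def natural_filter_continues(mus, mu_budget):
    """The natural filter's check: does the composed curve still dominate
    the budget curve G_{mu_budget}? For GDP curves, pointwise domination
    is equivalent to a smaller composed mu."""
    return compose_gdp(mus) <= mu_budget
```

Within the GDP family this check is sound precisely because the family is closed under tensor product; the paper's counterexample shows it cannot be relied on for general trade-off curves.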

The main negative result shows that this natural filter is not valid in general. A concrete counterexample uses subsampled Gaussian mechanisms. The first mechanism’s output determines the privacy parameters (means) of the second and third mechanisms, creating two possible branches. In one branch the tensor product of the second and third mechanisms’ trade‑off functions is larger (i.e., provides stronger privacy) than in the other branch, and the two products are not Blackwell‑ordered—they cross each other. Consequently, even though the filter’s stopping condition is never violated along any realized path, the overall composition fails to satisfy the claimed f‑DP guarantee. The failure is traced to “calibration points” where the optimal trade‑off function switches between different regions of the false‑positive‑rate space, allowing an adaptive adversary to exploit this switch and degrade privacy beyond the tensor‑product bound.
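The crossing phenomenon is easy to reproduce numerically. The hypothetical check below (not from the paper) compares two trade-off curves on a grid: a 1-GDP curve and a (1, 0)-DP curve are not Blackwell-ordered (each is higher on part of the false-positive-rate axis), whereas two Gaussian curves always are:

```python
import math
from statistics import NormalDist

Phi = NormalDist().cdf
Phi_inv = NormalDist().inv_cdf

def gdp_tradeoff(mu):
    """Trade-off curve of mu-GDP."""
    return lambda a: Phi(Phi_inv(1.0 - a) - mu)

def pure_dp_tradeoff(eps):
    """Trade-off curve of an (eps, 0)-DP mechanism."""
    return lambda a: max(0.0, 1.0 - math.exp(eps) * a, math.exp(-eps) * (1.0 - a))

def blackwell_compare(f, g, n=999):
    """+1 if f dominates g pointwise, -1 if g dominates f, 0 if the curves cross."""
    grid = [(i + 1) / (n + 1) for i in range(n)]  # interior of (0, 1)
    diffs = [f(a) - g(a) for a in grid]
    if all(d >= -1e-12 for d in diffs):
        return 1
    if all(d <= 1e-12 for d in diffs):
        return -1
    return 0
```

For instance, at small false-positive rates the (1, 0)-DP curve lies above the 1-GDP curve, while around α ≈ 0.3 the ordering reverses, so `blackwell_compare` reports a crossing.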

Having established the failure, the paper then characterises when the natural f‑DP filter does work. Two equivalent necessary and sufficient conditions are given: (1) for every possible adaptive path, the sequence of individual trade‑off functions must be mutually Blackwell‑ordered so that their tensor product is monotone across paths; and (2) the individual trade‑off functions must belong to a convex‑closed family that is preserved under tensor product, ensuring the composed function never falls below the target f_B. These conditions explain why RDP, which essentially tracks only the first two moments of the privacy‑loss random variable, admits simple additive filters, while the richer f‑DP framework does not.

The authors then prove a fully adaptive central limit theorem (CLT) for privacy‑loss random variables (PLRVs). Each PLRV is the log‑likelihood ratio of the mechanism’s output distributions. By treating the sequence of PLRVs as a martingale difference array and applying a modern Berry–Esseen bound for martingales, they show that the cumulative privacy loss converges in distribution to a Gaussian with mean equal to the sum of individual means and variance equal to the sum of individual variances, even when the mechanisms and their parameters are chosen adaptively. This CLT provides the theoretical foundation for Gaussian‑DP (GDP) approximations in adaptive settings.
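A small simulation illustrates the martingale structure (a sketch of the phenomenon, not the paper's proof). Each step runs a Gaussian mechanism whose parameter μ_i is chosen from past outputs; its PLRV is L_i = μ_i·x_i − μ_i²/2 with x_i ~ N(μ_i, 1), so E[L_i | history] = μ_i²/2 and Var[L_i | history] = μ_i². Standardizing the cumulative loss by the realized per-path mean and variance should yield an approximately standard normal sample:

```python
import math
import random
from statistics import NormalDist

random.seed(0)
Phi = NormalDist().cdf

def standardized_adaptive_loss(n_steps):
    """One adaptive run: sum the PLRVs while the mechanism's parameter mu
    is adapted to each output, then standardize by the realized totals."""
    total = mean = var = 0.0
    mu = 0.1                                 # first parameter is fixed in advance
    for _ in range(n_steps):
        x = random.gauss(mu, 1.0)            # output under the worst-case neighbor
        total += mu * x - mu * mu / 2.0      # privacy-loss random variable
        mean += mu * mu / 2.0                # conditional mean of this step's PLRV
        var += mu * mu                       # conditional variance of this step's PLRV
        mu = 0.05 if x > mu else 0.15        # adapt the next parameter to the output
    return (total - mean) / math.sqrt(var)

samples = sorted(standardized_adaptive_loss(500) for _ in range(2000))
# Kolmogorov-Smirnov-style distance to the standard normal CDF: small iff the
# standardized cumulative loss is close to N(0, 1) despite the adaptivity.
ks = max(abs((i + 1) / len(samples) - Phi(z)) for i, z in enumerate(samples))
```

The distance stays small here, consistent with the Gaussian limit the adaptive CLT guarantees in general (with an explicit Berry-Esseen-type rate).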

Leveraging the adaptive CLT, the paper constructs an approximate GDP filter for subsampled Gaussian mechanisms in the regimes of very small sampling rates (q < 0.2) and very large sampling rates (q > 0.8). In these regimes the privacy‑loss distribution of each subsampled Gaussian step is already close to Gaussian, so the CLT yields accurate approximations of the cumulative mean μ̂ and variance σ̂². The filter monitors the accumulated (μ̂, σ̂) and stops when the corresponding Gaussian‑DP curve would exceed the prescribed budget. The authors prove that this filter is valid (up to a negligible approximation error) and demonstrate empirically that it yields strictly tighter privacy guarantees than the best known fully adaptive RDP‑based filters for the same settings. In particular, for DP‑SGD training loops the approximate GDP filter reduces the effective ε by 10–30 % in both low‑q and high‑q regimes, while remaining simple to implement.
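A minimal sketch of such a filter (illustrative names, not the paper's implementation; the per-step mean/variance estimates are assumed to be supplied by the caller, e.g. computed from the subsampled-Gaussian PLRV): accumulate (μ̂, σ̂²), match the CLT limit N(mean, var) against a Gaussian mechanism's PLRV N(μ²/2, μ²) so that μ ≈ σ̂, and refuse any step that would push the implied GDP parameter past the budget:

```python
import math
from statistics import NormalDist

Phi = NormalDist().cdf

def gdp_to_delta(mu, eps):
    """delta(eps) guaranteed by mu-GDP, via the standard GDP-to-(eps, delta) conversion."""
    return Phi(-eps / mu + mu / 2.0) - math.exp(eps) * Phi(-eps / mu - mu / 2.0)

class ApproxGDPFilter:
    """Stops the composition once the implied Gaussian-DP parameter,
    mu_hat = sqrt(accumulated variance), would exceed the budget."""

    def __init__(self, mu_budget):
        self.mu_budget = mu_budget
        self.mean = 0.0   # accumulated privacy-loss mean estimate
        self.var = 0.0    # accumulated privacy-loss variance estimate

    def try_step(self, step_mean, step_var):
        """Commit the step and return True if it fits the budget; otherwise
        return False: the filter halts and the mechanism must not be run."""
        if math.sqrt(self.var + step_var) > self.mu_budget:
            return False
        self.mean += step_mean
        self.var += step_var
        return True
```

For example, with a 1-GDP budget and steps whose per-step parameter is μ = 0.25 (mean 0.03125, variance 0.0625), the filter admits exactly 16 steps before halting, matching the closed-form composition sqrt(16 · 0.0625) = 1.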

In summary, the paper makes four major contributions: (1) it proves the natural f‑DP filter is invalid in general and provides explicit counterexamples; (2) it gives precise necessary and sufficient conditions under which an f‑DP filter can be valid; (3) it establishes a fully adaptive CLT for privacy‑loss random variables using martingale Berry–Esseen techniques; and (4) it designs an approximate Gaussian‑DP filter for subsampled Gaussian mechanisms that outperforms existing RDP‑based adaptive accounting in extreme sampling‑rate regimes. These results deepen our understanding of the limits of trade‑off‑function based privacy accounting and provide a practical tool for tighter privacy budgeting in modern machine‑learning pipelines.

