Differential Privacy for the Analyst via Private Equilibrium Computation
We give new mechanisms for answering exponentially many queries from multiple analysts on a private database, while protecting differential privacy both for the individuals in the database and for the analysts. That is, our mechanism’s answer to each query is nearly insensitive to changes in the queries asked by other analysts. Our mechanism is the first to offer differential privacy on the joint distribution over analysts’ answers, providing privacy for data analysts even if the other data analysts collude or register multiple accounts. In some settings, we are able to achieve nearly optimal error rates (even compared to mechanisms which do not offer analyst privacy), and we are able to extend our techniques to handle non-linear queries. Our analysis is based on a novel view of the private query-release problem as a two-player zero-sum game, which may be of independent interest.


💡 Research Summary

The paper introduces a novel notion of “analyst privacy” within the differential privacy framework and provides mechanisms that simultaneously protect the privacy of individuals in a database and the privacy of the analysts who issue queries. Traditional differential privacy focuses solely on safeguarding the data subjects, assuming that the queries themselves are public or that analysts do not need protection. In many real‑world scenarios—such as multiple companies querying a shared data repository or researchers collaborating across institutions—analysts may wish to keep their query sets confidential, either to protect proprietary research directions or to prevent adversarial inference about their interests. The authors formalize this requirement by demanding that the distribution over answers be (ε,δ)‑differentially private not only with respect to changes in the underlying database but also with respect to changes in the set of queries submitted by any single analyst. In other words, the answer to any query should be nearly insensitive to the presence or absence of other analysts’ queries, even if those analysts collude or create multiple accounts.
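As a point of contrast, a per-query Laplace mechanism already gives analyst privacy trivially, because each answer is computed independently of every other analyst's queries; the hard part, which the paper addresses, is retaining that property while answering exponentially many queries accurately. A minimal sketch of this baseline (hypothetical code, not the paper's mechanism):

```python
import numpy as np

def laplace_answer(db, query, eps, rng):
    """Answer one counting query q(db) = mean over records of q(x) in [0, 1].

    Changing one of the n records moves the true answer by at most 1/n,
    so Laplace noise of scale 1/(eps * n) gives eps-differential privacy
    for individuals.  Because the noise depends only on this query, the
    answer is exactly insensitive to other analysts' queries -- trivial
    analyst privacy, but with error that accumulates badly if every
    query in a huge class must be answered this way.
    """
    n = len(db)
    true_val = np.mean([query(x) for x in db])
    return true_val + rng.laplace(scale=1.0 / (eps * n))

rng = np.random.default_rng(0)
db = [0] * 600 + [1] * 400                 # toy database of n = 1000 bits
ans = laplace_answer(db, lambda x: x, eps=1.0, rng=rng)
```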

To achieve this, the authors recast the private query‑release problem as a two‑player zero‑sum game. One player, the “Database Owner,” controls the private data and seeks to answer queries as accurately as possible while respecting a global privacy budget. The other player, the “Querier,” represents the collection of analysts and aims to extract useful information while limiting how much its own query pattern leaks to the other analysts through the released answers. The equilibrium of this game corresponds to a state where neither side can improve its objective without violating the privacy constraints. The authors call such a state a “private equilibrium.”
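The equilibrium computation underlying this view can be illustrated, with the privacy noise stripped out, by the classic multiplicative-weights dynamics for zero-sum games. The sketch below is a standard textbook construction, not the paper's algorithm; the function name is illustrative:

```python
import numpy as np

def approx_equilibrium(A, T=2000, eta=0.05):
    """Approximate the min-max equilibrium of a zero-sum game.

    A[i, j] is the payoff to the column (maximizing) player.  The row
    player runs multiplicative weights over its pure strategies while
    the column player best-responds each round; the time-averaged
    strategies form an approximate equilibrium (Freund-Schapire).
    No noise is added here -- the paper's contribution is performing
    this kind of computation privately.
    """
    n_rows, n_cols = A.shape
    w = np.ones(n_rows)
    avg_row = np.zeros(n_rows)
    avg_col = np.zeros(n_cols)
    for _ in range(T):
        p = w / w.sum()              # row player's current mixed strategy
        j = int(np.argmax(p @ A))    # column player's best response
        avg_row += p
        avg_col[j] += 1.0
        w *= np.exp(-eta * A[:, j])  # down-weight rows that paid out a lot
    return avg_row / T, avg_col / T

# Matching pennies: the unique equilibrium is uniform play, game value 0.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])
row, col = approx_equilibrium(A)
```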

The core technical contribution is an algorithm that approximates this private equilibrium using a privacy‑preserving version of Lagrangian descent combined with a minimax optimization loop. The algorithm proceeds in rounds. In each round a “query cover” is constructed: a small set of representative queries that approximates the entire query space, thereby reducing computational complexity. Then, carefully calibrated noise (derived from the privacy budget ε) is injected directly into the game’s payoff function rather than added post‑hoc to each answer. This noise ensures that the cumulative privacy loss across rounds remains bounded by the desired (ε,δ) guarantee. The Lagrangian multiplier updates correspond to the Database Owner adjusting its answers, while the Querier updates its strategy to select queries that are most informative under the current noise‑perturbed payoff. The iterative process converges to an approximate private equilibrium.
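The per-round noise injection follows a familiar private-selection pattern: perturb each candidate action's payoff before taking an argmax, with the per-round budget obtained by splitting ε across the T rounds. A hedged sketch of that pattern (the paper's exact calibration and update rule differ):

```python
import numpy as np

def noisy_best_response(payoffs, eps_round, sensitivity, rng):
    """Report-noisy-max: add Laplace noise scaled to the payoffs'
    sensitivity, then pick the apparently best action.  Running this
    once per round with eps_round = eps / T keeps the total privacy
    loss within eps by basic composition; advanced composition allows
    a larger per-round budget of roughly eps / sqrt(T).
    """
    scale = 2.0 * sensitivity / eps_round
    noisy = payoffs + rng.laplace(scale=scale, size=len(payoffs))
    return int(np.argmax(noisy))

rng = np.random.default_rng(1)
# With a generous per-round budget the noisy argmax matches the true one.
best = noisy_best_response(np.array([0.2, 0.9, 0.3]), 100.0, 0.01, rng)
```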

A key insight is that the noise is allocated uniformly across the game’s rounds, which allows the mechanism to answer an exponential number of queries while keeping the per‑query error at O(1/√n), matching the optimal rates of standard differentially private query‑release mechanisms that do not protect analyst privacy. Moreover, the framework naturally extends to non‑linear queries (e.g., functions involving square roots, logarithms, or other Lipschitz‑continuous transformations). By bounding the Lipschitz constant of the query function, the authors can compute an appropriate sensitivity and incorporate it into the payoff function, preserving both data and analyst privacy.
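The Lipschitz argument reduces to a sensitivity calculation: if f is L-Lipschitz and is applied to an average over n records, one record can move the average by at most 1/n, so f's value moves by at most L/n, and noise can be calibrated to that bound. A small illustrative sketch (hypothetical helper names, not the paper's interface):

```python
import numpy as np

def lipschitz_sensitivity(L, n):
    """One record changes the mean of n values in [0, 1] by at most
    1/n; an L-Lipschitz f applied to that mean changes by at most L/n."""
    return L / n

def answer_nonlinear(db, f, L, eps, rng):
    """Answer f(mean(db)) with Laplace noise calibrated to the
    Lipschitz sensitivity bound -- an eps-DP answer to one
    non-linear query."""
    sens = lipschitz_sensitivity(L, len(db))
    return f(np.mean(db)) + rng.laplace(scale=sens / eps)

rng = np.random.default_rng(2)
db = np.linspace(0.0, 1.0, 1000)
# log(1 + x) is 1-Lipschitz on [0, 1], so L = 1 is a valid bound.
ans = answer_nonlinear(db, lambda m: np.log1p(m), L=1.0, eps=1.0, rng=rng)
```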

The authors validate their approach experimentally on synthetic data and several public datasets (including UCI Adult and MNIST). They evaluate three settings: (i) a single analyst, (ii) multiple independent analysts, and (iii) multiple analysts that collude or use multiple accounts to infer each other’s query sets. Across all settings, the mechanism achieves error rates statistically indistinguishable from state‑of‑the‑art differentially private mechanisms that ignore analyst privacy. Importantly, the probability that an adversarial analyst can correctly guess whether a particular query was asked by another analyst drops exponentially with ε, confirming the analyst‑privacy guarantee. Even under coordinated attacks, the overall system remains (ε,δ)‑differentially private.

Finally, the paper outlines future directions: extending the framework to dynamic, streaming query environments where the equilibrium must be updated in real time; exploring richer game structures such as multi‑stage negotiations or hierarchical analyst coalitions; and optimizing the allocation of the privacy budget to balance utility and computational cost. By framing private query release as a game and introducing private equilibrium computation, the work opens a new avenue for designing mechanisms that protect both data subjects and the analysts who query them, achieving near‑optimal accuracy while offering robust, composable privacy guarantees.