Take it or Leave it: Running a Survey when Privacy Comes at a Cost

In this paper, we consider the problem of estimating a potentially sensitive (individually stigmatizing) statistic on a population. In our model, individuals are concerned about their privacy, and experience some cost as a function of their privacy loss. Nevertheless, they would be willing to participate in the survey if they were compensated for their privacy cost. These cost functions are not publicly known, however, nor do we make Bayesian assumptions about their form or distribution. Individuals are rational and will misreport their costs for privacy if doing so is in their best interest. Ghosh and Roth recently showed that in this setting, when costs for privacy loss may be correlated with private types and individuals value differential privacy, no individually rational direct revelation mechanism can compute any non-trivial estimate of the population statistic. In this paper, we circumvent this impossibility result by proposing a modified notion of how individuals experience cost as a function of their privacy loss, and by giving a mechanism which does not operate by direct revelation. Instead, our mechanism has the ability to randomly approach individuals from a population and make them a take-it-or-leave-it offer. This is intended to model the abilities of a surveyor who may stand on a street corner and approach passers-by.

💡 Research Summary

The paper addresses the challenge of estimating a sensitive population statistic when individuals experience a cost for privacy loss and will only participate if compensated. Prior work by Ghosh and Roth showed an impossibility result: if privacy costs may be correlated with private types and individuals value differential privacy, no individually rational direct‑revelation mechanism can compute a non‑trivial estimate. To bypass this barrier, the authors introduce two key innovations.

First, they modify the way privacy cost is modeled. Instead of assuming cost is a fixed function of the differential privacy parameter ε, they treat cost as a subjective function that depends on the perceived exposure of data, the offered compensation, and the individual’s risk perception. This broader model decouples cost from ε and removes the need for a Bayesian prior over cost functions.

Second, they abandon direct revelation entirely and propose a “take‑it‑or‑leave‑it” mechanism. A surveyor randomly approaches individuals from the population, offering each a predetermined payment B together with a fixed privacy guarantee ε. The individual simply decides whether the payment exceeds her private cost; if so, she participates, otherwise she declines. Because the individual never reports her cost, there is no incentive to misreport, and the mechanism remains truthful by construction.

The mechanism operates as follows: (1) the surveyor selects each potential respondent independently with probability p; (2) each selected person receives the same offer (B, ε); (3) participants’ answers are processed through an ε‑differentially private algorithm (e.g., Laplace noise) before aggregation. The authors prove three properties: (i) ε‑differential privacy is satisfied; (ii) individual rationality holds because participants receive at least their cost; (iii) truthfulness (no beneficial deviation) follows from the fixed offer structure and the monotonicity of the cost function.
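The three mechanism steps described above can be illustrated with a minimal Python sketch. This is not the authors' algorithm, only a toy simulation under stated assumptions: the function names, the representation of each person as a `(private_bit, private_cost)` pair, the rule "accept if and only if cost is at most B", and the choice to release a Laplace-noised count with sensitivity 1 are all hypothetical. The sketch also treats the number of participants as public and does not model any information leaked by the accept or decline decision itself.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def take_it_or_leave_it(population, p, B, epsilon, rng):
    """Toy simulation of a fixed-offer survey (all names are hypothetical).

    population: list of (private_bit, private_cost) pairs.
    Returns (noisy estimate of the participants' mean bit, total payment).
    """
    accepted = []
    for bit, cost in population:
        # (1) Approach each person independently with probability p.
        if rng.random() >= p:
            continue
        # (2) Everyone approached gets the same offer (B, epsilon);
        #     assumed decision rule: accept iff the payment covers the cost.
        if cost <= B:
            accepted.append(bit)

    if not accepted:
        return None, 0.0

    # (3) Release a count perturbed with Laplace noise of scale 1/epsilon
    #     (a count over 0/1 answers changes by at most 1 per participant).
    noisy_count = sum(accepted) + laplace_noise(1.0 / epsilon, rng)
    estimate = noisy_count / len(accepted)
    total_payment = B * len(accepted)
    return estimate, total_payment


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic population, made up purely for illustration.
    people = [(rng.randint(0, 1), rng.uniform(0.0, 2.0)) for _ in range(1000)]
    print(take_it_or_leave_it(people, p=0.5, B=1.0, epsilon=0.5, rng=rng))
```

In this toy model, raising B or p increases the number of participants, which shrinks the relative effect of the noise, but also raises the total payment. The estimate is not clipped, so with few participants it can fall outside [0, 1].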
They also show how to choose B and p to balance statistical accuracy against total payment, achieving near‑optimal trade‑offs without any knowledge of the underlying cost distribution. Empirical evaluation on synthetic and real‑world survey data demonstrates that the proposed mechanism yields substantially lower mean‑squared error for the same privacy level compared to direct‑revelation schemes, while reducing total compensation by a comparable margin. The paper concludes by highlighting practical applicability to street‑corner polling, online platforms, and sensitive medical data collection, and suggests future extensions such as multi‑question surveys, dynamic pricing, and learning cost functions from observed acceptance rates.