Differential Privacy versus Quantitative Information Flow
Differential privacy is a notion of privacy that has become very popular in the database community. Roughly, the idea is that a randomized query mechanism provides sufficient privacy protection if the ratio between the probabilities that two different entries originate a certain answer is bounded by e^\epsilon. In the fields of anonymity and information flow there is a similar concern for controlling information leakage, i.e. limiting the possibility of inferring the secret information from the observables. In recent years, researchers have proposed to quantify the leakage in terms of the information-theoretic notion of mutual information. There are two main approaches that fall in this category: one based on Shannon entropy, and one based on Rényi's min-entropy. The latter has a connection with the so-called Bayes risk, which expresses the probability of guessing the secret. In this paper, we show how to model the query system in terms of an information-theoretic channel, and we compare the notion of differential privacy with that of mutual information. We show that the notion of differential privacy is strictly stronger, in the sense that it implies a bound on the mutual information, but not vice versa.
💡 Research Summary
The paper investigates the relationship between differential privacy—a widely adopted privacy notion in the database community—and quantitative information‑flow measures that have been developed in the fields of anonymity and information‑flow security. The authors first recast the standard definition of ε‑differential privacy, which bounds the ratio of output probabilities for any two adjacent databases, into an equivalent “δ‑differential privacy” formulation that only considers singleton output events. They prove the equivalence of the two definitions (Theorem 1), thereby simplifying later arguments.
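To see why restricting attention to singleton outputs suffices in the discrete case, the sketch below (Python, with hypothetical helper names and made-up example distributions) computes the smallest ε witnessed by singleton outputs of one mechanism run on two adjacent databases, then brute-force checks that the same ε bounds every event: since p(y) ≤ e^ε·q(y) pointwise implies P(S) ≤ e^ε·Q(S) by summation over S, the singleton bound extends to all events.

```python
import itertools
import math

def singleton_epsilon(p, q):
    # Smallest eps with p(y) <= e^eps * q(y) and q(y) <= e^eps * p(y)
    # for every singleton output y.
    return max(abs(math.log(p[y] / q[y])) for y in p)

def event_ratio_bounded(p, q, eps):
    # Full event-level check: P(S) <= e^eps * Q(S) (and symmetrically)
    # for every nonempty event S, by brute-force enumeration.
    ys = list(p)
    for r in range(1, len(ys) + 1):
        for S in itertools.combinations(ys, r):
            ps, qs = sum(p[y] for y in S), sum(q[y] for y in S)
            if ps > math.exp(eps) * qs + 1e-12 or qs > math.exp(eps) * ps + 1e-12:
                return False
    return True

# Hypothetical output distributions of one mechanism on two adjacent databases.
p = {0: 0.50, 1: 0.30, 2: 0.20}
q = {0: 0.40, 1: 0.35, 2: 0.25}
eps = singleton_epsilon(p, q)
```

Here `event_ratio_bounded(p, q, eps)` holds for the ε derived from singletons alone, which is the content of the equivalence the summary attributes to Theorem 1 in the finite discrete setting.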
Next, they model a database query mechanism as an information‑theoretic channel. The true answer to a query, denoted X = f(D), is treated as the channel input, while the reported (noisy) answer Y = K(f(D)) is the output. The conditional distribution p(Y|X) is precisely the noise distribution added to achieve differential privacy; the authors focus on the Laplace mechanism, where p(Y = y | X = x) = (ε / (2Δf))·exp(−|y−x|·ε/Δf). This channel representation enables the use of Shannon entropy, conditional entropy, and mutual information, as well as Rényi’s min‑entropy and the associated min‑mutual information.
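A minimal sketch of this channel, assuming a one-dimensional numeric query (function names are illustrative, not from the paper): it evaluates the density p(y|x) above and samples Lap(Δf/ε) noise via the standard inverse-CDF transform.

```python
import math
import random

def laplace_density(y, x, sensitivity, eps):
    # Channel p(Y=y | X=x) for the Laplace mechanism with scale b = sensitivity/eps:
    # (eps / (2*sensitivity)) * exp(-eps*|y-x|/sensitivity).
    b = sensitivity / eps
    return math.exp(-abs(y - x) / b) / (2 * b)

def laplace_mechanism(true_answer, sensitivity, eps, rng=random):
    # Report true_answer + Lap(sensitivity/eps), sampled by inverse-CDF transform.
    b = sensitivity / eps
    u = rng.random() - 0.5                                  # uniform on [-0.5, 0.5)
    return true_answer - b * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
```

For any two true answers with |x − x'| ≤ Δf, the density ratio laplace_density(y, x, …) / laplace_density(y, x', …) is at most e^ε for every y, which is exactly the differential-privacy guarantee of the mechanism.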
The core technical contribution consists of two complementary results. First, they show that ε‑differential privacy imposes an upper bound on the Shannon mutual information I(X;Y) of the channel: the probability‑ratio constraint between adjacent databases forces a bound proportional to ε, of the form I(X;Y) ≤ ε·log e. Consequently, as ε → 0 the average information leakage measured by mutual information tends to zero. An analogous bound is proved for the Rényi min‑mutual information I∞(X;Y), demonstrating that differential privacy also limits the worst‑case guessing advantage of an adversary.
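To make the first result concrete, the sketch below (an illustration chosen here, not the paper's own derivation) computes I(X;Y) in bits for a binary randomized-response channel, whose row ratio is exactly e^ε, and checks it against the ε·log₂e bound; the bound in this simple form assumes every pair of inputs counts as adjacent, as it does for the two rows of this channel.

```python
import math

def mutual_information(prior, channel):
    # I(X;Y) in bits for prior p(x) and row-stochastic channel matrix p(y|x).
    n_out = len(channel[0])
    py = [sum(prior[x] * channel[x][y] for x in range(len(prior)))
          for y in range(n_out)]
    return sum(prior[x] * channel[x][y] * math.log2(channel[x][y] / py[y])
               for x in range(len(prior))
               for y in range(n_out)
               if channel[x][y] > 0)

eps = 1.0
keep = math.exp(eps) / (1 + math.exp(eps))      # probability of reporting truthfully
rr = [[keep, 1 - keep],
      [1 - keep, keep]]                          # randomized response: ratio exactly e^eps
mi = mutual_information([0.5, 0.5], rr)
```

With ε = 1 the leakage comes out to roughly 0.16 bits, well under the bound ε·log₂e ≈ 1.44 bits, and it shrinks toward zero as ε does.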
Second, they demonstrate that the converse does not hold: a channel can have arbitrarily small mutual information (both Shannon and min‑mutual) while still violating any finite ε‑differential privacy guarantee. They construct a family of channels in which the output space is enlarged and the conditional distribution is made almost deterministic for each input, thereby driving I(X;Y) and I∞(X;Y) toward zero. However, the ratio of probabilities for certain singleton outputs can still be arbitrarily large, meaning that no finite ε can satisfy the differential‑privacy inequality. This counterexample establishes that mutual‑information‑based leakage metrics are strictly weaker than differential privacy.
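The flavor of this counterexample can be reproduced in miniature (an illustrative construction in the same spirit, not the paper's exact family): a channel whose rows agree except on two rare outputs has tiny mutual information, yet one output that the first input can produce and the second cannot makes the probability ratio infinite, so no finite ε satisfies the differential-privacy inequality.

```python
import math

def mutual_information(prior, channel):
    # I(X;Y) in bits for prior p(x) and row-stochastic channel matrix p(y|x).
    n_out = len(channel[0])
    py = [sum(prior[x] * channel[x][y] for x in range(len(prior)))
          for y in range(n_out)]
    return sum(prior[x] * channel[x][y] * math.log2(channel[x][y] / py[y])
               for x in range(len(prior))
               for y in range(n_out)
               if channel[x][y] > 0)

delta = 1e-3
# Rows differ only on two rare outputs; output 1 is impossible under input 1,
# so p(y=1 | x=0) / p(y=1 | x=1) = delta / 0 exceeds e^eps for every finite eps.
channel = [[1 - delta, delta, 0.0],
           [1 - delta, 0.0, delta]]
mi = mutual_information([0.5, 0.5], channel)
```

The leakage `mi` is on the order of delta bits and can be driven arbitrarily close to zero, while the ratio violation persists for every delta > 0.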
The paper also discusses channel capacity. The Shannon capacity C = max_{p_X} I(X;Y) and the min‑capacity C∞ = max_{p_X} I∞(X;Y) are both bounded by functions of ε; as ε decreases, the capacities shrink, reflecting reduced worst‑case information‑transfer capability. Conversely, a channel with C = 0 (or C∞ = 0) does not guarantee a small ε, reinforcing the asymmetry between the two notions.
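On the min-entropy side, a standard result in the quantitative-information-flow literature gives the min-capacity of a discrete channel in closed form as the log of the sum of the column maxima of the channel matrix. The sketch below (Python, reusing the randomized-response channel as an assumed example) uses that formula to check the capacity bound C∞ ≤ ε·log₂e.

```python
import math

def min_capacity(channel):
    # C_inf = log2( sum over outputs y of max over inputs x of p(y|x) ):
    # the closed form for min-entropy capacity of a discrete channel.
    n_out = len(channel[0])
    return math.log2(sum(max(row[y] for row in channel) for y in range(n_out)))

eps = 1.0
keep = math.exp(eps) / (1 + math.exp(eps))
rr = [[keep, 1 - keep],
      [1 - keep, keep]]                  # randomized response, exactly eps-DP
c_inf = min_capacity(rr)
```

Sanity checks on the formula: a noiseless identity channel on two inputs has min-capacity exactly 1 bit, and a channel whose rows are identical has min-capacity 0, matching the intuition that C∞ = 0 means no worst-case leakage at all.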
Structurally, the paper proceeds as follows: Section 2 reviews the necessary background on differential privacy and information theory (Shannon entropy, Rényi min‑entropy, mutual information, and capacities). Section 3 formalizes the channel model for private query answering. Section 4 presents the main theorems linking ε‑differential privacy to bounds on both Shannon and min‑mutual information, and provides the counterexample showing the lack of a converse. Section 5 concludes, summarizing the findings and suggesting future work such as exploring tighter quantitative bridges, extending the analysis to adaptive queries, and investigating other privacy notions (e.g., concentrated differential privacy) within the information‑theoretic framework.
Overall, the work clarifies that differential privacy is a strictly stronger guarantee than the information‑theoretic leakage measures commonly used in anonymity and quantitative information‑flow research. By casting differential privacy in the language of channels, the authors enable cross‑fertilization between the two research communities while highlighting the limitations of mutual‑information‑based metrics for providing robust privacy guarantees.