Formal approaches to information hiding: An analysis of interactive systems, statistical disclosure control, and refinement of specifications
In this thesis we consider the problem of information hiding in the scenarios of interactive systems, statistical disclosure control, and refinement of specifications. We apply quantitative approaches to information flow in the first two cases, and we propose improvements to the usual solutions based on process equivalences in the third.

In the first scenario we consider the problem of defining the information leakage in interactive systems, where secrets and observables can alternate during the computation and influence each other. We show that the information-theoretic approach which interprets such systems as (simple) noisy channels is not valid. The principle can be recovered if we consider channels with memory and feedback. We also propose the use of directed information from input to output as the appropriate measure of leakage in interactive systems.

In the second scenario we consider the problem of statistical disclosure control, which concerns how to reveal accurate statistics about a set of respondents while preserving the privacy of individuals. We focus on the concept of differential privacy, a notion that has become very popular in the database community. We show how to model the query system in terms of an information-theoretic channel, and we compare the notion of differential privacy with that of min-entropy leakage.

In the third scenario we address the problem of using process equivalences to characterize information-hiding properties. We show that, in the presence of nondeterminism, this approach may rely on the assumption that the scheduler “works for the benefit of the protocol”, and this is often not a safe assumption. We present a formalism in which we can specify admissible schedulers and, correspondingly, safe versions of complete-trace equivalence and bisimulation, and we show that safe equivalences can be used to establish information-hiding properties.
💡 Research Summary
This dissertation tackles the problem of information hiding from three distinct yet interrelated perspectives: interactive systems, statistical disclosure control (specifically differential privacy), and the refinement of specifications under nondeterminism. The overarching goal is to provide quantitative, information‑theoretic models that capture leakage more accurately than traditional qualitative approaches, and to develop formal methods that guarantee privacy even when adversarial schedulers are present.
1. Interactive Systems
The first part of the work observes that classic information‑flow analysis treats a system as a simple memoryless noisy channel, a model that fails when secrets and observables alternate and influence each other during execution. To remedy this, the author models an Interactive Information‑Hiding System (IIHS) as a channel with memory and feedback. Inputs are lifted to “reaction functions” that map past outputs to current secret choices, thereby preserving the causal dependency structure. Leakage is measured by directed information I(X→Y), the amount of information that flows from the input sequence to the output sequence, rather than by the symmetric mutual information I(X;Y); this metric respects the directionality inherent in interactive protocols. The thesis proves a capacity theorem for IIHSs, showing how much information can be leaked in the worst case. A detailed case study—the “Cocaine Auction” protocol—demonstrates the construction of the channel matrix, the computation of directed information, and the comparison with naïve mutual‑information‑based estimates, highlighting the advantage of the proposed approach. Topological properties of IIHSs (e.g., regularity, connectivity) are also linked to leakage bounds.
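The gap between the two measures can be seen on a toy two-step interaction (the example below is illustrative, not from the thesis: a uniform secret bit, a noisy observation, and a reaction function that chooses the next secret based on the observed output). Directed information decomposes as Massey's sum I(X→Y) = Σₜ I(X^t; Yₜ | Y^{t−1}); with active feedback it is strictly smaller than I(X;Y), because mutual information also counts the information flowing back from outputs to inputs:

```python
from collections import defaultdict
from math import log2

# Toy two-step interactive run (illustrative numbers, not from the thesis):
# x1 uniform secret bit; y1 = noisy observation of x1 (flip prob 0.1);
# x2 = y1, i.e. a "reaction function" makes the next secret depend on the
# previous output; y2 = noisy observation of x2 (flip prob 0.1).
FLIP = 0.1
joint = defaultdict(float)
for x1 in (0, 1):
    for y1 in (0, 1):
        p1 = 0.5 * ((1 - FLIP) if y1 == x1 else FLIP)
        x2 = y1                      # feedback: input reacts to past output
        for y2 in (0, 1):
            p2 = (1 - FLIP) if y2 == x2 else FLIP
            joint[(x1, y1, x2, y2)] += p1 * p2

def mi(joint, a_idx, b_idx, c_idx=()):
    """Conditional mutual information I(A; B | C) in bits, from a joint dict
    whose keys are outcome tuples; *_idx select tuple positions."""
    pabc, pac, pbc, pc = (defaultdict(float) for _ in range(4))
    for outcome, p in joint.items():
        a = tuple(outcome[i] for i in a_idx)
        b = tuple(outcome[i] for i in b_idx)
        c = tuple(outcome[i] for i in c_idx)
        pabc[(a, b, c)] += p; pac[(a, c)] += p; pbc[(b, c)] += p; pc[c] += p
    return sum(p * log2(p * pc[c] / (pac[(a, c)] * pbc[(b, c)]))
               for (a, b, c), p in pabc.items() if p > 0)

# Tuple positions: 0 = x1, 1 = y1, 2 = x2, 3 = y2.
mutual = mi(joint, (0, 2), (1, 3))                               # I(X1 X2 ; Y1 Y2)
directed = mi(joint, (0,), (1,)) + mi(joint, (0, 2), (3,), (1,)) # Massey's sum

print(f"I(X;Y)  = {mutual:.3f} bits")   # inflated by the feedback loop
print(f"I(X->Y) = {directed:.3f} bits") # strictly smaller: leakage only
```

Here the difference I(X;Y) − I(X→Y) equals exactly the information the reaction function feeds back from Y₁ into X₂, which is not leakage at all.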
2. Differential Privacy
The second part re‑examines differential privacy (DP) through an information‑theoretic lens. A database query mechanism is modeled as a probabilistic channel K: D → O. The ε‑DP condition is interpreted as a bound on the ratio of output probabilities for neighboring databases, which can be expressed as a constraint on the channel’s transition matrix. The author then relates ε‑DP to min‑entropy leakage, the worst‑case probability that an adversary can correctly guess a secret after observing the output. By exploiting symmetries in the query graph (distance‑regular graphs, V_T⁺ graphs), a matrix transformation is derived that maps the original channel to a “symmetrized” version whose posterior entropy can be bounded analytically. This yields a clean inequality: ε‑DP ⇒ min‑entropy leakage ≤ f(ε, graph parameters). The framework is applied to voting and election scenarios, where the trade‑off between individual privacy loss and overall utility (accuracy of statistics) is quantified. The results show that for many natural query structures the DP guarantee is essentially optimal with respect to min‑entropy leakage.
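Both sides of this comparison are directly computable on a concrete channel. The sketch below (a standard binary randomized-response mechanism, used here only as an illustration) checks the ε-DP ratio condition on the transition matrix and evaluates min-entropy leakage under a uniform prior as the log of the sum of column maxima; when every pair of secrets is adjacent, the leakage is bounded by ε·log₂e:

```python
from math import exp, log2

def randomized_response(eps):
    """2x2 channel: report the true bit w.p. e^eps/(1+e^eps), else flip.
    A textbook eps-DP mechanism for a single secret bit."""
    p = exp(eps) / (1 + exp(eps))
    return [[p, 1 - p], [1 - p, p]]

def satisfies_dp(C, eps, tol=1e-9):
    """eps-DP as a matrix condition: p(y|x) <= e^eps * p(y|x')
    for every output y and every pair of (adjacent) secrets x, x'."""
    e = exp(eps)
    return all(C[x][y] <= e * C[xp][y] + tol
               for y in range(len(C[0]))
               for x in range(len(C))
               for xp in range(len(C)))

def min_entropy_leakage(C):
    """Min-entropy leakage under a uniform prior: log2 of the sum of
    the column maxima of the channel matrix."""
    return log2(sum(max(col) for col in zip(*C)))

eps = 1.0
C = randomized_response(eps)
assert satisfies_dp(C, eps)
L = min_entropy_leakage(C)
print(f"eps = {eps}, min-entropy leakage = {L:.3f} bits")
# With all secret pairs adjacent, sum_y max_x C[x][y] <= e^eps, hence:
assert L <= eps * log2(exp(1))
```

The assertion at the end follows because each column maximum is at most e^ε times any other entry in that column, so the sum of maxima is at most e^ε times a row sum, which is 1.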
3. Safe Equivalences for Nondeterministic Systems
The third part addresses the use of process equivalences (trace equivalence, bisimulation) to certify information‑hiding properties in systems that contain nondeterminism. Traditional approaches assume that the scheduler (which resolves nondeterminism) behaves “for the benefit of the protocol,” an assumption that is unrealistic under adversarial conditions. The thesis introduces the notion of admissible or safe schedulers, which are constrained either globally (by limiting the set of reachable global states) or locally (by restricting the choices at each component). Based on this, safe complete‑trace equivalence and safe bisimulation are defined. The author proves that if two processes are safely equivalent, then they are indistinguishable to any attacker, regardless of the admissible scheduler’s decisions. This yields a robust method for verifying secrecy in distributed probabilistic systems. The chapter also presents a formalism of Tagged Probabilistic Automata to model components and their interactions, and demonstrates the technique on a distributed voting protocol.
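As a baseline for the discussion above, ordinary complete-trace equivalence can be checked on finite acyclic transition systems by enumerating the traces that end in a terminal state. The toy processes below (not from the thesis) are the classic pair that is trace-equivalent but not bisimilar; the safe equivalences of the thesis refine this picture further by constraining which schedulers may resolve the nondeterministic choices:

```python
def complete_traces(lts, state, prefix=()):
    """All complete traces (action sequences ending in a state with no
    outgoing transitions) of a finite acyclic labelled transition system,
    given as a dict mapping state -> list of (action, next_state)."""
    moves = lts.get(state, [])
    if not moves:
        return {prefix}
    traces = set()
    for label, nxt in moves:
        traces |= complete_traces(lts, nxt, prefix + (label,))
    return traces

# P resolves the choice up front (a.b + a.c); Q does 'a' first and
# then chooses (a.(b + c)).
P = {'p0': [('a', 'p1'), ('a', 'p2')], 'p1': [('b', 'p3')], 'p2': [('c', 'p4')]}
Q = {'q0': [('a', 'q1')], 'q1': [('b', 'q2'), ('c', 'q3')]}

print(complete_traces(P, 'p0') == complete_traces(Q, 'q0'))  # True
# Yet P and Q are not bisimilar, and the moment at which a scheduler
# resolves the choice can be observable -- the motivation for replacing
# plain equivalences with their safe, scheduler-restricted versions.
```

A safe version of the check would additionally quantify only over admissible schedulers, so that equivalence cannot depend on a scheduler that peeks at secret information.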
4. Conclusions and Outlook
The dissertation unifies three strands of research under a common quantitative framework. By extending channel models to incorporate memory, feedback, and graph‑induced symmetries, it provides more precise leakage metrics for interactive and statistical privacy settings. The introduction of safe equivalences equips designers with a tool to reason about secrecy even when nondeterminism is resolved by potentially malicious schedulers. Future work suggested includes continuous‑time extensions, multi‑secret interactions, and automated synthesis of admissible schedulers.
Overall, the thesis makes substantial theoretical contributions—directed information for interactive leakage, graph‑based bounds linking differential privacy to min‑entropy leakage, and safe process equivalences—while also delivering practical analysis techniques applicable to real‑world privacy‑sensitive protocols.