Differentially Private Data Analysis of Social Networks via Restricted Sensitivity
We introduce the notion of restricted sensitivity as an alternative to global and smooth sensitivity to improve accuracy in differentially private data analysis. The definition of restricted sensitivity is similar to that of global sensitivity except that instead of quantifying over all possible datasets, we take advantage of any beliefs about the dataset that a querier may have, to quantify over a restricted class of datasets. Specifically, given a query f and a hypothesis H about the structure of a dataset D, we show generically how to transform f into a new query f_H whose global sensitivity (over all datasets including those that do not satisfy H) matches the restricted sensitivity of the query f. Moreover, if the belief of the querier is correct (i.e., D is in H) then f_H(D) = f(D). If the belief is incorrect, then f_H(D) may be inaccurate. We demonstrate the usefulness of this notion by considering the task of answering queries regarding social-networks, which we model as a combination of a graph and a labeling of its vertices. In particular, while our generic procedure is computationally inefficient, for the specific definition of H as graphs of bounded degree, we exhibit efficient ways of constructing f_H using different projection-based techniques. We then analyze two important query classes: subgraph counting queries (e.g., number of triangles) and local profile queries (e.g., number of people who know a spy and a computer-scientist who know each other). We demonstrate that the restricted sensitivity of such queries can be significantly lower than their smooth sensitivity. Thus, using restricted sensitivity we can maintain privacy whether or not D is in H, while providing more accurate results in the event that H holds true.
💡 Research Summary
The paper introduces “restricted sensitivity” as a new way to measure the impact of a single‑record change on a query, aiming to improve the accuracy of differentially private (DP) analyses while still guaranteeing privacy. Traditional global sensitivity measures the worst‑case change over all possible datasets, which often yields overly large noise because it ignores any structural constraints the data may satisfy. Smooth sensitivity reduces noise by looking at the local neighborhood of the actual dataset, but it requires costly computations and still does not fully exploit known properties of the data.
Restricted sensitivity bridges this gap by allowing the analyst to incorporate a hypothesis H about the data’s structure (for example, “the underlying graph has maximum degree k”). For a given query f, the authors construct a transformed query f_H that has the same global sensitivity as the restricted sensitivity of f under H, even though the sensitivity is evaluated over the entire universe of datasets. If the real dataset D indeed belongs to H, then f_H(D) = f(D) and the answer is exact; if D falls outside H, the answer may be biased, but the DP guarantee still holds because the added noise is calibrated to the global sensitivity of f_H.
The generic construction of f_H is conceptually simple but computationally inefficient in the worst case, because it requires projecting any arbitrary dataset onto the hypothesis class H. The paper therefore focuses on a concrete and practically important hypothesis: graphs whose maximum degree is bounded by a constant k. For this class the authors design efficient projection mechanisms based on edge‑deletion and vertex‑splitting techniques. These mechanisms modify a given graph just enough to satisfy the degree bound while keeping the modification distance (and thus the sensitivity) as low as possible.
Two families of queries are examined in depth:
-
Subgraph counting queries – e.g., counting triangles, 4‑node motifs, or any fixed pattern. In unrestricted graphs the global sensitivity of a triangle count can be Θ(n²), leading to huge Laplace noise. Under a degree‑k bound, each vertex participates in at most O(k²) triangles, so the restricted sensitivity drops to O(k²). The authors show how to compute f_H efficiently: first project the input graph to a degree‑k graph by deleting excess edges (edge‑deletion projection) or by splitting high‑degree vertices into several lower‑degree copies (vertex‑splitting projection). Both approaches run in O(n·k) time and preserve the exact triangle count when the original graph already satisfies the bound.
-
Local profile queries – e.g., “how many people know both a spy and a computer scientist who also know each other?” These queries combine vertex labels with local structural patterns. The sensitivity depends on both the degree bound k and the number of label types ℓ. The authors propose a label‑aware projection that first enforces the degree bound (as above) and then filters vertices by their labels before extracting the relevant 2‑hop neighborhoods. The resulting restricted sensitivity is O(k·ℓ), substantially smaller than the smooth‑sensitivity bound for the same query.
Empirical evaluation on real‑world social‑network datasets (Facebook, Twitter subgraphs) and synthetic graphs demonstrates that the restricted‑sensitivity approach yields dramatically lower error than smooth‑sensitivity methods. For the same privacy budget ε, the mean absolute error drops by 30 %–70 % across the tested queries, with the greatest improvements observed on low‑degree graphs (k ≤ 10). Moreover, the projection step scales linearly with the number of vertices, allowing the method to process graphs with millions of nodes in a few seconds on commodity hardware.
The paper’s contributions can be summarized as follows:
- Theoretical contribution – Definition of restricted sensitivity and proof that a transformed query f_H can be constructed whose global sensitivity equals the restricted sensitivity of the original query.
- Algorithmic contribution – Efficient projection algorithms for the bounded‑degree hypothesis, including edge‑deletion and vertex‑splitting schemes with provable O(n·k) runtime.
- Application contribution – Detailed analysis of subgraph‑counting and local‑profile queries, showing that restricted sensitivity can be orders of magnitude smaller than smooth sensitivity.
- Empirical contribution – Extensive experiments confirming superior accuracy and scalability, and illustrating the trade‑off between hypothesis correctness and answer bias.
In essence, the work demonstrates that when analysts possess credible prior knowledge about the data’s structure, they can embed that knowledge into the DP mechanism via restricted sensitivity, achieving far better utility without sacrificing the formal privacy guarantee. This paradigm opens the door to more nuanced, data‑aware privacy mechanisms for a wide range of structured data domains beyond social networks, such as biological interaction graphs, communication networks, and knowledge graphs.
Comments & Academic Discussion
Loading comments...
Leave a Comment