Privacy–utility trade-offs for parameter estimation in degree-heterogeneous higher-order networks

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

In sensitive applications involving relational datasets, protecting information about individual links from adversarial queries is of paramount importance. In many such settings, the available data are summarized solely through the degrees of the nodes in the network. We adopt the $β$-model, the prototypical statistical model for this form of aggregated relational information, and study the problem of minimax-optimal parameter estimation under both local and central differential privacy constraints. We establish finite-sample minimax lower bounds that characterize the precise dependence of the estimation risk on the network size and the privacy parameters, and we propose simple estimators that achieve these bounds up to constants and logarithmic factors under both local and central differential privacy frameworks. Our results provide the first comprehensive finite-sample characterization of privacy–utility trade-offs for parameter estimation in $β$-models, addressing the classical graph case and extending the analysis to higher-order hypergraph models. We further demonstrate the effectiveness of our methods through experiments on synthetic data and a real-world communication network.


💡 Research Summary

This paper addresses the fundamental problem of estimating the parameters of the β‑model (and its r‑uniform hypergraph extension) when only node degree information is available and strict privacy guarantees are required. The authors adopt the edge‑differential privacy framework, which protects the presence or absence of any single (hyper)edge, and study both local differential privacy (each user perturbs their own degree) and central differential privacy (a trusted curator aggregates and privatizes the degrees).
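As a concrete illustration of the degree-only data the paper works with, the degree of node $i$ in an $r$-uniform hypergraph is simply the number of hyperedges containing $i$. A minimal sketch (the function name `degree_vector` is illustrative, not from the paper's code):

```python
from collections import Counter

def degree_vector(n, hyperedges):
    """Degree d_i = number of hyperedges that contain node i.

    `hyperedges` is an iterable of tuples of node indices in 0..n-1;
    for an r-uniform hypergraph every tuple has length r (r=2 gives
    an ordinary graph edge list).
    """
    counts = Counter(v for edge in hyperedges for v in edge)
    return [counts.get(i, 0) for i in range(n)]
```

In the β-model setting, this degree vector is a sufficient statistic for the parameter β, which is why the privacy mechanisms below only need to perturb degrees rather than the full (hyper)edge set.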

The main theoretical contributions are twofold. First, they derive a minimax lower bound for any ε‑local‑DP estimator, showing that the worst‑case ℓ₂ risk cannot be smaller than c·ε⁻²·n^{-(r‑1)} (up to constants depending on r, the bound M on the true β, and a minimal ε₀). This result quantifies precisely the additional sample complexity incurred by privacy: a factor of order ε⁻² relative to the non‑private rate Θ(n^{-(r‑1)}). The proof uses a packing argument combined with a locally private version of Fano’s inequality, constructing 2^{Θ(n)} parameter vectors that are ε⁻¹·n^{-(r‑1)} apart in ℓ₂ norm while keeping the induced degree distributions statistically indistinguishable.
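The lower bound described above can be written compactly as follows. This is a paraphrase of the statement in the summary, not the paper's exact theorem: I write the risk as the normalized squared ℓ₂ error, which is consistent with the quoted $n^{-(r-1)}$ non-private rate, but the paper's precise normalization, constants, and admissible range of $\varepsilon$ should be checked against the original.

```latex
\inf_{Q \in \mathcal{Q}_{\varepsilon}} \; \inf_{\hat\beta} \;
\sup_{\beta \in [-M, M]^n}
\frac{1}{n}\,\mathbb{E}\bigl\|\hat\beta - \beta\bigr\|_2^2
\;\ge\; \frac{c}{\varepsilon^2 \, n^{\,r-1}},
```

where $\mathcal{Q}_{\varepsilon}$ ranges over all $\varepsilon$-locally-private mechanisms applied to the degrees, and $c$ depends on $r$, $M$, and the minimal privacy level $\varepsilon_0$ mentioned above.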

Second, they propose concrete estimators that achieve these lower bounds up to constant and logarithmic factors. For the local model, they add independent discrete Laplace noise (scale 1/ε) to each degree—a mechanism that satisfies ε‑local‑DP. Using the noisy degree vector, they compute a simple ℓ₂‑based estimator of β, and prove an upper bound of C·ε⁻²·n^{-(r‑1)}·log n on the risk, matching the lower bound. For the central model, they either add Gaussian noise to the aggregated degree vector or run a differentially private gradient descent (DP‑GD) on the degree‑based log‑likelihood. The DP‑GD estimator attains risk C·(ε⁻¹·n^{-(r‑1)}+n^{-1}), which improves the second‑order term compared with the local case and approaches the non‑private rate when ε is sufficiently large (e.g., ε ≥ √{log n}).
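The local mechanism described above can be sketched in a few lines. Adding two-sided geometric (discrete Laplace) noise with tail parameter $e^{-\varepsilon}$ to each reported degree satisfies $\varepsilon$-local-DP because changing one (hyper)edge shifts a node's degree by exactly 1, and the noise distribution satisfies $P(Z=k)/P(Z=k\pm1) \le e^{\varepsilon}$. This is a minimal sketch of that standard mechanism, not the authors' code; function names are illustrative:

```python
import numpy as np

def discrete_laplace(eps, size, rng):
    # The difference of two i.i.d. Geometric(1 - e^{-eps}) variables has the
    # two-sided geometric (discrete Laplace) law P(Z = k) ∝ e^{-eps * |k|};
    # the +1 offset of numpy's geometric cancels in the difference.
    p = 1.0 - np.exp(-eps)
    return rng.geometric(p, size) - rng.geometric(p, size)

def privatize_degrees(degrees, eps, rng=None):
    """Each node perturbs its own degree before release (eps-local-DP)."""
    rng = np.random.default_rng() if rng is None else rng
    degrees = np.asarray(degrees)
    return degrees + discrete_laplace(eps, degrees.shape[0], rng)
```

The noisy degree vector produced here is then fed into the ℓ₂-based estimator of β; the noise is integer-valued, so the privatized output remains a plausible degree vector up to sign.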

Empirical validation is performed on synthetic hypergraphs (varying n from 500 to 5000 and uniformities r = 2,3,4) and on the real Enron email network (treated as a 2‑uniform graph). In the synthetic experiments, the locally private estimator’s mean‑squared error is within a factor of 1.2–1.5 of the maximum‑likelihood estimator for ε ∈

