Tighter Information-Theoretic Generalization Bounds via a Novel Class of Change of Measure Inequalities
In this paper, we propose a novel class of change of measure inequalities via a unified framework based on the data processing inequality for $f$-divergences, which is surprisingly elementary yet powerful enough to yield tighter inequalities. We provide change of measure inequalities in terms of a broad family of information measures, including $f$-divergences (with Kullback-Leibler divergence and $χ^2$-divergence as special cases), Rényi divergence, and $α$-mutual information (with maximal leakage as a special case). We then embed these inequalities into the analysis of generalization error for stochastic learning algorithms, yielding novel and tighter high-probability information-theoretic generalization bounds, while also recovering several best-known results via simplified analyses. A key advantage of our framework is its flexibility: it readily adapts to a range of settings, including the conditional mutual information framework, PAC-Bayesian theory, and differential privacy mechanisms, for which we derive new generalization bounds.
💡 Research Summary
This paper introduces a unified and elementary framework for deriving change‑of‑measure inequalities that lead to tighter high‑probability generalization bounds for stochastic learning algorithms. The core technical tool is the data‑processing inequality (DPI) for f‑divergences. By applying DPI to the deterministic indicator channel of an event E, the authors obtain a lower bound on any f‑divergence between two distributions P and Q in terms of the binary f‑divergence between Bernoulli(p) and Bernoulli(q), where p = P(E) and q = Q(E). Combining this with the Young‑Fenchel conjugate inequality yields a generic bound of the form
$P(E) \le \inf_{u>0} \frac{1}{u}\Big( D_f(P\,\|\,Q) + Q(E)\, f^{*}(u) \Big)$,

where $f^{*}$ denotes the convex (Fenchel) conjugate of the generator $f$. Instantiating $f$ recovers the KL, $χ^2$, and Rényi-type change-of-measure inequalities used in the generalization analysis.
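The first step of the argument, that processing $P$ and $Q$ through the indicator channel of an event $E$ can only shrink the $f$-divergence, can be checked numerically. The sketch below (illustrative distributions chosen by us, with KL as the $f$-divergence) verifies that $D_{\mathrm{KL}}(P\,\|\,Q)$ dominates the binary divergence between $\mathrm{Bernoulli}(P(E))$ and $\mathrm{Bernoulli}(Q(E))$:

```python
import math

def kl(p, q):
    """KL divergence between two discrete distributions (natural log)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Two illustrative distributions over a 4-element alphabet.
P = [0.40, 0.30, 0.20, 0.10]
Q = [0.10, 0.20, 0.30, 0.40]

# Event E = {0, 1}; the indicator channel maps each outcome x to 1_E(x),
# collapsing P and Q to Bernoulli(p) and Bernoulli(q).
p = P[0] + P[1]
q = Q[0] + Q[1]

# Data processing inequality: the divergence after the (deterministic)
# indicator channel cannot exceed the divergence before it.
full = kl(P, Q)
binary = kl([p, 1 - p], [q, 1 - q])
assert binary <= full
print(f"KL(P||Q) = {full:.4f} >= binary KL = {binary:.4f}")
```

The same check goes through for any convex generator $f$, since the DPI holds for every $f$-divergence; KL is used here only because its closed form is the most familiar.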