Robust Nonparametric Two-Sample Tests via Mutual Information using Extended Bregman Divergence

We introduce a generalized formulation of mutual information (MI) based on the extended Bregman divergence, a framework that subsumes the generalized S-Bregman (GSB) divergence family. The GSB divergence unifies two important classes of statistical distances, namely the S-divergence and the Bregman exponential divergence (BED), thereby encompassing several widely used subfamilies, including the power divergence (PD), density power divergence (DPD), and S-Hellinger distance (S-HD). In parametric inference, minimum divergence estimators are well known to balance robustness with high asymptotic efficiency relative to the maximum likelihood estimator. However, nonparametric tests based on such statistical distances have received comparatively little attention. In this paper, we construct a class of consistent and robust nonparametric two-sample tests for the equality of two absolutely continuous distributions using the generalized MI. We establish the asymptotic normality of the proposed test statistics under the null and contiguous alternatives. The robustness properties of the generalized MI are rigorously studied through the influence function and the breakdown point, demonstrating that stability of the generalized MI translates into stability of the associated tests. Extensive simulation studies show that divergences beyond the PD family often yield superior robustness under contamination while retaining high asymptotic power. A data-driven scheme for selecting optimal tuning parameters is also proposed. Finally, the methodology is illustrated with applications to real data.


💡 Research Summary

The paper introduces a novel class of robust, non‑parametric two‑sample tests for comparing two absolutely continuous distributions. The core idea is to define a generalized mutual information (MI) based on the extended Bregman divergence, which subsumes the generalized S‑Bregman (GSB) divergence family. The GSB divergence unifies the S‑divergence and the Bregman exponential divergence (BED), thereby encompassing several widely used sub‑families such as the power divergence (PD), density power divergence (DPD), and S‑Hellinger distance (S‑HD).
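For orientation, and assuming the paper follows the standard parametrization of Ghosh et al., the S-divergence between densities \(g\) and \(f\) can be written as:

```latex
S_{(\alpha,\lambda)}(g,f)
  = \frac{1}{A}\int f^{1+\alpha}\,d\mu
  \;-\; \frac{1+\alpha}{AB}\int f^{B}\,g^{A}\,d\mu
  \;+\; \frac{1}{B}\int g^{1+\alpha}\,d\mu,
\qquad A = 1+\lambda(1-\alpha), \quad B = \alpha-\lambda(1-\alpha),
```

so that \(A+B=1+\alpha\). Setting \(\lambda=0\) recovers the DPD with tuning parameter \(\alpha\), \(\alpha=0\) gives the PD family indexed by \(\lambda\), and \(\lambda=-1/2\) yields the S-Hellinger distance.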

First, the authors formalize the extended Bregman divergence \( D^{(k)}_{\phi}(g,f)=\int\bigl[\phi(g^{k})-\phi(f^{k})-(g^{k}-f^{k})\,\phi'(f^{k})\bigr]\,d\mu \) with a positive index \(k\). By choosing a convex generator \(\phi(t)=e^{\beta t}+t^{1+BA^{-1}}\) (where \(A\) and \(B\) are functions of the S-divergence tuning parameters \(\alpha\) and \(\lambda\)), they obtain the GSB divergence, which reduces to the S-divergence when \(\beta=0\). The generalized MI is then defined as the integral of this divergence over the common support, i.e., \( \text{B-MI}(g,f)=\int D^{(A)}_{\phi}(g,f)\,d\mu \).
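As a rough numerical illustration of this definition, the sketch below evaluates the extended Bregman divergence by grid quadrature, using the GSB-type generator \(\phi(t)=e^{\beta t}+t^{1+B/A}\) stated above; the function names and the choice of quadrature are ours, not the paper's:

```python
import numpy as np

def extended_bregman(g, f, phi, dphi, k, dx):
    """Grid-quadrature approximation of D^(k)_phi(g, f), i.e., the
    integral of phi(g^k) - phi(f^k) - (g^k - f^k) * phi'(f^k)."""
    gk, fk = g**k, f**k
    integrand = phi(gk) - phi(fk) - (gk - fk) * dphi(fk)
    return np.sum(integrand) * dx

def make_gsb_generator(alpha, lam, beta):
    """GSB-type generator phi(t) = exp(beta*t) + t**(1 + B/A), with the
    S-divergence indices A, B; an illustrative reading, not the paper's code."""
    A = 1 + lam * (1 - alpha)        # A = 1 + lambda(1 - alpha)
    B = alpha - lam * (1 - alpha)    # B = alpha - lambda(1 - alpha)
    p = 1 + B / A
    phi = lambda t: np.exp(beta * t) + t**p
    dphi = lambda t: beta * np.exp(beta * t) + p * t**(p - 1)
    return phi, dphi, A

# Example: divergence between N(0,1) and N(1,1) densities on a grid
x = np.linspace(-8.0, 8.0, 2001)
dx = x[1] - x[0]
g = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)
f = np.exp(-0.5 * (x - 1.0)**2) / np.sqrt(2.0 * np.pi)
phi, dphi, A = make_gsb_generator(alpha=0.5, lam=0.0, beta=0.1)
print(extended_bregman(g, f, phi, dphi, k=A, dx=dx))
```

With \(\beta=0\) the exponential term of the generator becomes a constant, which contributes nothing to a Bregman divergence, consistent with the stated reduction to the S-divergence.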

Using kernel density estimators for the two samples, the authors construct an empirical version of B-MI and normalize it to form a test statistic.
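To make the construction concrete, here is a minimal sketch of such a plug-in statistic, assuming Gaussian KDEs evaluated on a common grid; for calibration it uses a permutation null as a simple stand-in for the paper's asymptotic normal calibration, and every name below is illustrative:

```python
import numpy as np
from scipy.stats import gaussian_kde

def b_mi_statistic(x, y, phi, dphi, k, grid):
    """Plug-in estimate of the generalized MI between the densities of
    samples x and y, via Gaussian KDEs evaluated on a fixed grid."""
    g = gaussian_kde(x)(grid)
    f = gaussian_kde(y)(grid)
    dx = grid[1] - grid[0]
    gk, fk = g**k, f**k
    return np.sum(phi(gk) - phi(fk) - (gk - fk) * dphi(fk)) * dx

def permutation_pvalue(x, y, stat_fn, n_perm=499, seed=0):
    """Permutation p-value: relabel the pooled sample to emulate the null."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([x, y])
    observed = stat_fn(x, y)
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        if stat_fn(perm[:len(x)], perm[len(x):]) >= observed:
            exceed += 1
    return (1 + exceed) / (1 + n_perm)

# Example usage, reusing make_gsb_generator from the earlier sketch
rng = np.random.default_rng(42)
x_sample = rng.normal(0.0, 1.0, size=100)
y_sample = rng.normal(0.5, 1.0, size=100)
phi, dphi, A = make_gsb_generator(alpha=0.5, lam=0.0, beta=0.1)
grid = np.linspace(-10.0, 10.0, 1001)
stat = lambda a, b: b_mi_statistic(a, b, phi, dphi, k=A, grid=grid)
print(permutation_pvalue(x_sample, y_sample, stat))
```

Large values of the statistic indicate departure from equality of the two densities, so the test rejects for large observed values.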

