Tight Robustness Certificates and Wasserstein Distributional Attacks for Deep Neural Networks

Wasserstein distributionally robust optimization (WDRO) provides a framework for adversarial robustness, yet existing methods based on global Lipschitz continuity or strong duality often yield loose upper bounds or require prohibitive computation. We address these limitations with a primal approach and adopt a notion of exact Lipschitz certificates to tighten the upper bound of the WDRO objective. For ReLU networks, we leverage the piecewise-affine structure on activation cells to obtain an exact tractable characterization of the corresponding WDRO problem. We further extend our analysis to modern architectures with smooth activations (e.g., GELU, SiLU), such as Transformers. Additionally, we propose novel Wasserstein Distributional Attacks (WDA, WDA++) that construct candidates for the worst-case distribution. Compared to existing attacks that are restricted to point-wise perturbations, our methods offer greater flexibility in the number and location of attack points. Extensive evaluations demonstrate that our proposed framework achieves competitive robust accuracy against state-of-the-art baselines while offering tighter certificates than existing methods. Our code is available at https://github.com/OLab-Repo/WDA.


💡 Research Summary

This paper tackles the longstanding gap between theoretical robustness guarantees and practical adversarial evaluation for deep neural networks by leveraging Wasserstein distributionally robust optimization (WDRO). Existing approaches either rely on coarse global Lipschitz bounds, which lead to overly pessimistic certificates, or on strong duality formulations that are computationally prohibitive. The authors propose a primal‑based method that introduces “exact Lipschitz certificates” and uses them to obtain tight upper bounds on the WDRO objective.

For ReLU networks, the piecewise‑affine nature of the model is exploited: each activation pattern (mask) defines a linear region (cell) with an associated Jacobian J_D = W_{H+1} D_H … D_1 W_1, where D_h are binary diagonal matrices encoding the on/off status of ReLUs. Because the set of possible masks 𝔇_X is finite, the exact local Lipschitz constant can be computed as
L = 2^{1/r}·max_{D∈𝔇_X} ‖J_D‖_{r→s}.
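The per-cell Jacobian and the resulting Lipschitz constant can be sketched for a small fully connected ReLU network by enumerating activation masks directly. This is an illustration, not the paper's implementation: it assumes r = s = 2 (so ‖·‖_{2→2} is the spectral norm) and brute-forces all 2^width masks per hidden layer rather than only the feasible set 𝔇_X; the function names are hypothetical.

```python
import itertools
import numpy as np

def cell_jacobian(weights, mask):
    """Jacobian J_D = W_{H+1} D_H ... D_1 W_1 of one activation cell.

    weights: [W_1, ..., W_{H+1}] as 2-D arrays.
    mask: [d_1, ..., d_H], each d_h a 0/1 vector giving the on/off
    status of the ReLUs in hidden layer h.
    """
    J = weights[0]
    for W, d in zip(weights[1:], mask):
        J = W @ (d[:, None] * J)  # apply D_h, then the next weight matrix
    return J

def exact_local_lipschitz(weights, r=2):
    """Sketch of L = 2^{1/r} * max_D ||J_D||_{2->2}.

    Here the max runs over *all* masks for simplicity; the paper's
    tractability comes from restricting it to the feasible set D_X.
    """
    widths = [W.shape[0] for W in weights[:-1]]
    per_layer = [itertools.product([0.0, 1.0], repeat=w) for w in widths]
    best = 0.0
    for bits in itertools.product(*per_layer):
        mask = [np.array(d) for d in bits]
        best = max(best, np.linalg.norm(cell_jacobian(weights, mask), 2))
    return 2.0 ** (1.0 / r) * best
```

Note the exponential cost of naive enumeration: with width-n hidden layers, there are 2^{nH} masks, which is why restricting to the masks actually realizable near the data (𝔇_X) matters in practice.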
Theorem 3.1 shows that for the cross‑entropy or DLR loss, the worst‑case risk over a Wasserstein‑1 ball of radius ρ around the empirical distribution P_N satisfies

sup_{Q : W₁(Q, P_N) ≤ ρ} E_Q[ℓ(f(x), y)] ≤ E_{P_N}[ℓ(f(x), y)] + ρ·L,

where L is the exact local Lipschitz constant above.
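A worked instance of this bound, with hypothetical numbers (not taken from the paper): an empirical risk of 0.10, radius ρ = 0.05, and exact Lipschitz constant L = 2.0 yield a certified worst-case risk of 0.10 + 0.05·2.0 = 0.20. As a one-line sketch:

```python
def wdro_certificate(empirical_risk, lipschitz_L, rho):
    # Primal WDRO upper bound: sup_{W1(Q, P_N) <= rho} E_Q[loss]
    #   <= E_{P_N}[loss] + rho * L   (loss L-Lipschitz in the input)
    return empirical_risk + rho * lipschitz_L
```

The tightness of the certificate thus hinges entirely on L: replacing a coarse global Lipschitz bound with the exact local constant shrinks the additive term ρ·L directly.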

