Representation Geometry as a Diagnostic for Out-of-Distribution Robustness
Robust generalization under distribution shift remains difficult to monitor and optimize in the absence of target-domain labels, as models with similar in-distribution accuracy can exhibit markedly different out-of-distribution (OOD) performance. While prior work has focused on training-time regularization and low-order representation statistics, little is known about whether the geometric structure of learned embeddings provides reliable post-hoc signals of robustness. We propose a geometry-based diagnostic framework that constructs class-conditional mutual k-nearest-neighbor graphs from in-distribution embeddings and extracts two complementary invariants: a global spectral complexity proxy based on the reduced log-determinant of the normalized Laplacian, and a local smoothness measure based on Ollivier–Ricci curvature. Across multiple architectures, training regimes, and corruption benchmarks, we find that lower spectral complexity and higher mean curvature consistently predict stronger OOD accuracy across checkpoints. Controlled perturbations and topological analyses further show that these signals reflect meaningful representation structure rather than superficial embedding statistics. Our results demonstrate that representation geometry enables interpretable, label-free robustness diagnosis and supports reliable unsupervised checkpoint selection under distribution shift.
💡 Research Summary
The paper introduces a post‑hoc, label‑free diagnostic framework called TORRIC that predicts out‑of‑distribution (OOD) robustness of deep vision models by analyzing the geometry of their in‑distribution (ID) embeddings. For each checkpoint, class‑conditional mutual k‑nearest‑neighbor (k‑NN) graphs are built from ℓ₂‑normalized embeddings. Two complementary geometric invariants are then extracted: (1) a global spectral complexity proxy, defined as the reduced log‑determinant of the normalized Laplacian (τ), which aggregates the full non‑zero spectrum and, via Kirchhoff’s Matrix‑Tree theorem, reflects the number of spanning trees and overall connectivity redundancy of each class manifold; (2) a local smoothness measure based on Ollivier‑Ricci curvature (κ̄), computed by assigning probability mass to neighboring nodes proportional to edge weights, estimating the Wasserstein‑1 distance between these distributions, and averaging curvature over all edges. Lower τ indicates simpler, less cyclic class structures, while higher κ̄ signals stronger neighborhood overlap and local contraction.
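The two invariants described above can be sketched in plain NumPy/SciPy. This is an illustrative implementation under stated assumptions (cosine-similarity edge weights, hop-count ground distances for W₁, and neighbor measures with no lazy self-mass); the paper's exact graph construction and transport solver may differ:

```python
import numpy as np
from scipy.optimize import linprog
from scipy.sparse.csgraph import shortest_path

def mutual_knn_graph(X, k=10):
    """Mutual k-NN graph from embeddings, with cosine-similarity edge weights."""
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # ℓ2-normalize rows
    S = X @ X.T
    np.fill_diagonal(S, -np.inf)
    nn = np.argsort(-S, axis=1)[:, :k]                # each point's k nearest neighbors
    A = np.zeros((len(X), len(X)))
    for i in range(len(X)):
        for j in nn[i]:
            if i in nn[j]:                            # keep an edge only if reciprocal
                A[i, j] = A[j, i] = max(S[i, j], 0.0)
    return A

def spectral_complexity(A, eps=1e-9):
    """τ: reduced log-determinant of the normalized Laplacian,
    i.e. the sum of logs of its non-zero eigenvalues."""
    d = A.sum(axis=1)
    d_inv_sqrt = np.zeros_like(d)
    d_inv_sqrt[d > 0] = d[d > 0] ** -0.5
    L = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    lam = np.linalg.eigvalsh(L)
    return float(np.log(lam[lam > eps]).sum())

def _w1(p, q, C):
    """Exact Wasserstein-1 between discrete distributions p, q (cost matrix C) as an LP."""
    n, m = len(p), len(q)
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0              # row marginals of the transport plan
    for j in range(m):
        A_eq[n + j, j::m] = 1.0                       # column marginals
    res = linprog(C.ravel(), A_eq=A_eq, b_eq=np.concatenate([p, q]), bounds=(0, None))
    return res.fun

def mean_ollivier_ricci(A):
    """κ̄: mean over edges of κ(i,j) = 1 − W1(μ_i, μ_j)/d(i,j), where μ_i spreads
    mass over i's neighbors proportional to edge weight and d is hop distance."""
    D = shortest_path(A, directed=False, unweighted=True)
    kappas = []
    n = len(A)
    for i in range(n):
        for j in range(i + 1, n):
            if A[i, j] <= 0:
                continue
            Ni, Nj = np.flatnonzero(A[i]), np.flatnonzero(A[j])
            p = A[i, Ni] / A[i, Ni].sum()
            q = A[j, Nj] / A[j, Nj].sum()
            kappas.append(1.0 - _w1(p, q, D[np.ix_(Ni, Nj)]) / D[i, j])
    return float(np.mean(kappas)) if kappas else 0.0
```

In the class-conditional setting of the paper, one would run these functions on the embeddings of each class separately and average the resulting τ and κ̄ values.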
Both τ and κ̄ are z‑normalized across checkpoints within the same training run and combined linearly into a lightweight score, GeoScore = z_τ − z_κ̄, so that lower values correspond to the favorable regime of low spectral complexity and high curvature. The score is intended for ranking checkpoints rather than for predicting absolute OOD accuracy.
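The score itself is a two-line computation; the helper below is a minimal sketch (function name hypothetical), with lower GeoScore indicating a more favorable checkpoint:

```python
import numpy as np

def geo_score(taus, kappas):
    """GeoScore = z_τ − z_κ̄, z-normalized across one run's checkpoints.
    Lower τ and higher κ̄ are the favorable directions, so lower GeoScore
    ranks a checkpoint as more robust."""
    z = lambda v: (v - v.mean()) / (v.std() + 1e-12)
    return z(np.asarray(taus, float)) - z(np.asarray(kappas, float))
```

A checkpoint would then be selected with `np.argmin(geo_score(taus, kappas))`.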
Experiments are conducted on ResNet‑18 and ViT‑S/16 models trained on CIFAR‑10 with both empirical risk minimization (ERM) and contrastive objectives. Checkpoints are saved at regular intervals, and for each checkpoint the TORRIC metrics are computed using class‑balanced subsampling. OOD performance is evaluated on CIFAR‑10.1, CIFAR‑10.2, CIFAR‑10‑C (severities 1–5), and Tiny‑ImageNet‑C. Spearman rank correlations between τ, κ̄, GeoScore and OOD accuracy are consistently strong (|ρ|≈0.6–0.8), with signs matching the predicted directions: negative for τ and GeoScore, positive for κ̄. Controlled ablations—random label shuffling, feature permutation, degree‑preserving graph rewiring—significantly weaken these correlations, confirming that the metrics capture meaningful representation geometry rather than superficial statistics. Conversely, (approximately) isometry‑preserving transformations such as random projections leave the metrics largely unchanged, demonstrating robustness to benign transformations.
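The reported correlation analysis amounts to a Spearman rank correlation per metric. The sketch below uses made-up toy numbers (not the paper's data) and handles the sign convention by negating τ, since lower τ is hypothesized to track higher OOD accuracy:

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical per-checkpoint values, for illustration only.
tau     = np.array([5.2, 4.8, 4.1, 3.6, 3.0])       # spectral complexity per checkpoint
ood_acc = np.array([0.61, 0.64, 0.70, 0.73, 0.78])  # OOD accuracy per checkpoint

# Negate τ so that a positive ρ means "lower τ goes with higher OOD accuracy".
rho, pval = spearmanr(-tau, ood_acc)
```

The same call, applied to κ̄ (without negation) and to GeoScore (with negation), reproduces the ranking comparison described above.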
Using GeoScore for unsupervised checkpoint selection achieves near‑oracle performance: the top‑10 % of checkpoints ranked by GeoScore contain over 85 % of the checkpoints that are actually in the top‑10 % of OOD accuracy. Persistent homology summaries are reported as secondary baselines and are shown to be less predictive than the proposed geometric invariants.
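The near‑oracle selection claim can be quantified as a top-k recall between the score ranking and the true OOD-accuracy ranking. The helper below is an illustrative sketch (name hypothetical), treating lower GeoScore as better:

```python
import numpy as np

def topk_recall(scores, ood_acc, frac=0.10):
    """Fraction of the true top-`frac` OOD-accuracy checkpoints that also
    appear in the top-`frac` checkpoints ranked by score (lower = better)."""
    k = max(1, int(round(frac * len(scores))))
    picked = set(np.argsort(scores)[:k])                 # lowest GeoScore first
    oracle = set(np.argsort(-np.asarray(ood_acc))[:k])   # highest OOD accuracy first
    return len(picked & oracle) / k
```

Under this definition, the paper's result corresponds to `topk_recall(geoscores, ood_accs, 0.10) > 0.85`.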
The authors provide theoretical intuition linking the two invariants to robustness: positive Ollivier‑Ricci curvature implies local contraction of neighborhoods, which promotes stability of nearest‑neighbor relations under input perturbations, while low spectral torsion (the complexity proxy τ) reflects reduced global redundancy, limiting the number of unstable pathways that could cause misclassification under shift. Together, these properties suggest smoother, more regular representations that generalize better to distributional changes.
In conclusion, TORRIC demonstrates that higher‑order geometric analysis of ID embeddings can serve as a practical, interpretable tool for monitoring and selecting robust models without any target‑domain labels. The work opens avenues for extending geometry‑based diagnostics to other modalities and multi‑domain scenarios, and for developing more scalable graph‑construction techniques.