Negatives-Dominant Contrastive Learning for Generalization in Imbalanced Domains

Notice: This research summary and analysis were automatically generated using AI. For full accuracy, please refer to the original arXiv source.

Imbalanced Domain Generalization (IDG) focuses on mitigating both domain and label shifts, both of which fundamentally shape the model's decision boundaries, particularly under heterogeneous long-tailed distributions across domains. Despite its practical significance, IDG remains underexplored, primarily due to the technical complexity of handling the entanglement of the two shifts and the paucity of theoretical foundations. In this paper, we begin by theoretically establishing a generalization bound for IDG, highlighting the role of posterior discrepancy and decision margin. This bound motivates us to focus on directly steering decision boundaries, marking a clear departure from existing methods. We then propose a novel Negative-Dominant Contrastive Learning (NDCL) framework for IDG that enhances discriminability while enforcing posterior consistency across domains. Specifically, inter-class decision-boundary separation is enhanced by placing greater emphasis on negatives as the primary signal in our contrastive learning, which naturally amplifies gradient signals for minority classes and avoids biasing the decision boundary toward majority classes. Meanwhile, intra-class compactness is encouraged through a re-weighted cross-entropy strategy, and posterior consistency across domains is enforced through a prediction-central alignment strategy. Finally, rigorous experiments on challenging benchmarks validate the effectiveness of our NDCL. The code is available at https://github.com/Alrash/NDCL.


💡 Research Summary

Imbalanced Domain Generalization (IDG) addresses the simultaneous presence of domain shift and label shift, a scenario that is common in real‑world applications such as medical diagnosis, finance, and autonomous driving. While conventional domain generalization (DG) methods focus primarily on covariate shift—aligning the marginal input distribution P(X) across source domains—they typically assume a shared label distribution P(Y). This assumption breaks down when each domain exhibits a heterogeneous long‑tailed class distribution, causing minority classes to suffer from reduced decision margins and biased decision boundaries.

The paper makes three core contributions. First, it provides the first theoretical generalization bound for IDG based on H‑divergence. The bound decomposes the target risk into (i) the weighted empirical risk over source domains, (ii) a margin term that penalizes samples whose decision margin falls below a threshold, (iii) a prior‑distribution discrepancy term, and (iv) a posterior‑distribution discrepancy term. Crucially, label imbalance directly inflates both the posterior discrepancy and the margin term, highlighting the need for methods that explicitly enforce posterior consistency and enlarge inter‑class margins.
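The four-term decomposition above can be written schematically as follows. The notation here is ours and only mirrors the summary's description; the paper should be consulted for the exact statement, constants, and divergence definitions:

```latex
% Schematic IDG bound: target risk bounded by weighted source risk,
% a margin-violation term, and prior/posterior discrepancy terms.
\begin{equation}
  \epsilon_{T}(h) \;\le\;
  \underbrace{\sum_{i=1}^{M} w_i\,\hat{\epsilon}_{S_i}(h)}_{\text{weighted source risk}}
  \;+\;
  \underbrace{\hat{R}_{\rho}(h)}_{\text{margin violations}}
  \;+\;
  \underbrace{d\!\left(P_{S}(Y),\,P_{T}(Y)\right)}_{\text{prior discrepancy}}
  \;+\;
  \underbrace{d_{\mathcal{H}}\!\left(P_{S}(Y\mid X),\,P_{T}(Y\mid X)\right)}_{\text{posterior discrepancy}}
  \;+\;\lambda,
\end{equation}
```

where $\hat{R}_{\rho}(h)$ counts samples whose margin falls below the threshold $\rho$, and $\lambda$ collects residual complexity terms. Label imbalance inflates both the margin and posterior-discrepancy terms, which is what motivates the method's design.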

Second, the authors propose Negative‑Dominant Contrastive Learning (NDCL), a contrastive framework that treats negative samples as the primary learning signal. By reformulating the InfoNCE loss so that cosine dissimilarity with negatives appears in the numerator, the gradient is amplified toward repelling negative samples. This design naturally benefits minority classes because their neighborhoods are often populated by majority‑class negatives; the denominator becomes small, further increasing the “amplification factor” and strengthening the repulsive force. To avoid trivial solutions, NDCL incorporates a hard‑negative mining strategy: low‑confidence in‑class samples are mixed with high‑confidence out‑of‑class samples via mixup, producing ambiguous negatives that drive more informative gradients.
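The negatives-dominant objective can be sketched in PyTorch as below. This is one plausible reading of the description above, not the authors' released implementation: the function name `negative_dominant_loss`, the temperature `tau`, the exact form of the ratio, and the `mix_hard_negatives` helper are all our illustrative assumptions. Cosine dissimilarity to negatives forms the numerator, so minimizing the loss chiefly pushes negatives away.

```python
import torch
import torch.nn.functional as F


def negative_dominant_loss(z, labels, tau=0.1):
    """Negatives-dominant contrastive loss (sketch).

    Maximizes the share of dissimilarity "mass" held by negatives:
    loss_i = -log( sum_neg exp(d_in / tau) /
                   (sum_neg exp(d_in / tau) + sum_pos exp(d_ip / tau)) ),
    where d = 1 - cosine similarity. A small negative mass (minority
    anchors surrounded by majority negatives) yields large gradients.
    """
    z = F.normalize(z, dim=1)
    dissim = (1.0 - z @ z.t()) / tau          # scaled cosine dissimilarity
    exp_d = torch.exp(dissim)
    B = z.size(0)
    eye = torch.eye(B, dtype=torch.bool, device=z.device)
    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    pos = (same & ~eye).float()               # same class, excluding self
    neg = (~same).float()                     # different class
    neg_mass = (exp_d * neg).sum(1)
    total = neg_mass + (exp_d * pos).sum(1)
    valid = (pos.sum(1) > 0) & (neg.sum(1) > 0)
    return -torch.log(neg_mass[valid] / total[valid]).mean()


def mix_hard_negatives(x_in, x_out, alpha=0.5):
    """Hypothetical mixup-style hard negative: blend a low-confidence
    in-class sample with a high-confidence out-of-class sample."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    return lam * x_in + (1.0 - lam) * x_out
```

Note that well-separated classes drive the positive mass in the denominator toward zero, so the loss vanishes only when negatives are pushed far *and* positives are pulled close, matching the intended repulsion-first behavior.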

Third, NDCL adds two complementary components. A class‑wise re‑weighted cross‑entropy loss assigns larger weights to under‑represented classes, directly boosting the loss contribution of minority samples. A prediction‑central alignment loss aligns class prototypes (the average prediction vectors) across domains, encouraging posterior consistency without relying on raw input alignment, which can be biased by domain‑specific sample proportions.
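These two components can be sketched as follows, assuming inverse-frequency class weights and squared-distance alignment of per-domain prediction prototypes. Both choices are ours for illustration; the paper's exact weighting scheme and alignment metric may differ:

```python
import torch
import torch.nn.functional as F


def reweighted_ce(logits, labels, class_counts):
    """Class-wise re-weighted cross-entropy (sketch): inverse-frequency
    weights, normalized to mean 1, boost minority-class samples."""
    w = 1.0 / class_counts.float()
    w = w * (len(class_counts) / w.sum())
    return F.cross_entropy(logits, labels, weight=w)


def prototype_alignment(probs_by_domain, labels_by_domain, num_classes):
    """Prediction-central alignment (sketch): average the prediction
    vectors per class within each domain, then penalize the squared
    distance of each domain prototype from the cross-domain mean."""
    protos = []
    for probs, labels in zip(probs_by_domain, labels_by_domain):
        p = torch.zeros(num_classes, probs.size(1))
        for c in range(num_classes):
            mask = labels == c
            if mask.any():                # classes absent in a domain
                p[c] = probs[mask].mean(0)  # keep a zero prototype
        protos.append(p)
    protos = torch.stack(protos)          # [domains, classes, classes]
    center = protos.mean(0, keepdim=True)
    return ((protos - center) ** 2).sum(-1).mean()
```

Because the alignment operates on prediction vectors rather than raw inputs, it stays agnostic to how many samples each domain contributes per class, which is the point of aligning posteriors instead of marginals.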

Empirical evaluation spans standard DG benchmarks (PACS, VLCS, Office‑Home) augmented with synthetic label imbalance (ratios up to 1:20) and a real‑world medical dataset with naturally imbalanced class frequencies across hospitals. NDCL consistently outperforms strong baselines—including ERM, CORAL, IRM, SupCon, and recent imbalance‑aware methods such as re‑weighting, LDAM, and Balanced Softmax—by 4–8% in average accuracy and by over 10% in minority‑class F1 score. Ablation studies confirm that each component (negative‑dominant loss, hard‑negative mining, re‑weighted CE, and prototype alignment) contributes additively to the final performance. Visualizations (t‑SNE) reveal that NDCL produces well‑separated clusters for minority classes, whereas conventional DG methods allow majority classes to dominate the feature space.

The paper’s strengths lie in (1) formalizing IDG and providing a clear theoretical justification for focusing on posterior alignment and margin enlargement, (2) introducing a simple yet effective contrastive objective that leverages the natural abundance of negatives, and (3) demonstrating robustness across diverse imbalance settings and domains. Limitations include the reliance on prediction‑space losses, which may miss opportunities for joint input‑feature regularization, and the sensitivity of the mixup‑based hard‑negative generation to hyper‑parameters. Future work could explore extending NDCL to multimodal or sequential data, developing online adaptation mechanisms for dynamically changing label distributions, and automating the selection of hard negatives. Overall, the study offers a compelling, theoretically grounded solution to a practically important yet under‑explored problem in domain generalization.

