Noisy-Pair Robust Representation Alignment for Positive-Unlabeled Learning

Positive-Unlabeled (PU) learning aims to train a binary classifier (positive vs. negative) when only limited positive data and abundant unlabeled data are available. While widely applicable, state-of-the-art PU learning methods substantially underperform their supervised counterparts on complex datasets, especially without auxiliary negatives or pre-estimated parameters (e.g., a 14.26% gap on the CIFAR-100 dataset). We identify the primary bottleneck as the challenge of learning discriminative representations under unreliable supervision. To tackle this challenge, we propose NcPU, a non-contrastive PU learning framework that requires no auxiliary information. NcPU combines a noisy-pair robust supervised non-contrastive loss (NoiSNCL), which aligns intra-class representations despite unreliable supervision, with a phantom label disambiguation (PLD) scheme that supplies conservative negative supervision via regret-based label updates. Theoretically, NoiSNCL and PLD iteratively benefit each other from the perspective of the Expectation-Maximization (EM) framework. Empirically, extensive experiments demonstrate that: (1) NoiSNCL enables simple PU methods to achieve competitive performance; and (2) NcPU achieves substantial improvements over state-of-the-art PU methods across diverse datasets, including challenging post-disaster building damage mapping datasets, highlighting its promise for real-world applications. Code will be open-sourced after review.


💡 Research Summary

Positive‑Unlabeled (PU) learning aims to train a binary classifier when only a small set of positively labeled examples and a large pool of unlabeled data are available. Existing PU methods often rely on auxiliary negative validation data or pre‑estimated class‑prior parameters, and they fall far behind fully supervised models on complex benchmarks such as CIFAR‑100, where the performance gap reaches 14.26%. The authors identify the root cause as the difficulty of learning discriminative representations under unreliable supervision: noisy pairwise relations (incorrect same‑class/different‑class pairs) dominate the gradient signal, preventing the model from forming well‑separated clusters.

To address this, the paper introduces a novel non‑contrastive PU framework called NcPU, which consists of two synergistic components:

  1. Noisy‑Pair Robust Supervised Non‑Contrastive Loss (NoiSNCL).

    • Builds on the BYOL architecture (online and momentum‑updated target networks).
    • Unlike contrastive losses that push apart different‑class pairs, NoiSNCL only pulls together samples whose (possibly noisy) labels agree, thereby reducing the influence of incorrect pairs.
    • The authors analytically show that in PU settings noisy pairs generate larger gradient magnitudes than clean pairs (Eq. 6).
    • By scaling the loss (Eq. 7) they ensure that gradients from clean pairs dominate (Eq. 8), effectively turning the learning dynamics into intra‑class clustering while tolerating label noise.
  2. Phantom Label Disambiguation (PLD).

    • Uses class‑conditional prototypes µ_c updated via a moving‑average of normalized embeddings (Eq. 9).
    • Pseudo‑targets s′ are refined by comparing each sample’s embedding with the prototypes (Eq. 10).
    • To avoid the trivial solution where every unlabeled sample is forced to the negative class, a PhantomGate mechanism injects explicit negative supervision only for samples whose classifier confidence exceeds a dynamically learned threshold τ.
    • The threshold is self‑adaptively updated (SAT) using batch‑wise statistics (Eqs. 12‑13), starting low to provide abundant negative signals early on and gradually increasing to filter out potentially mislabeled negatives.
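The pull-together-only objective with noisy-pair down-weighting (item 1 above) can be sketched roughly as follows. This is an illustrative NumPy stand-in, not the paper's exact Eq. 7: the function name `noisncl_sketch`, the similarity-based weight `((1 + sim) / 2) ** beta`, and the exponent `beta` are assumptions introduced here to show the mechanism, namely that only same-label pairs are attracted and that dissimilar (likely noisy) pairs contribute damped gradients.

```python
import numpy as np

def noisncl_sketch(online_z, target_z, labels, beta=2.0):
    """Schematic non-contrastive alignment loss (NOT the paper's exact Eq. 7).

    Pulls together pairs whose (possibly noisy) labels agree; there is no
    push-apart term for different-class pairs. A per-pair weight (here a
    simple power of cosine similarity, a stand-in for the paper's scaling)
    shrinks the otherwise-large gradients of hard, likely-noisy pairs so
    that clean pairs dominate the update.
    """
    # l2-normalize embeddings, as in BYOL-style online/target objectives
    online = online_z / np.linalg.norm(online_z, axis=1, keepdims=True)
    target = target_z / np.linalg.norm(target_z, axis=1, keepdims=True)

    total, count = 0.0, 0
    n = len(labels)
    for i in range(n):
        for j in range(n):
            if i != j and labels[i] == labels[j]:      # same-(pseudo)label pair only
                sim = float(online[i] @ target[j])     # cosine similarity in [-1, 1]
                weight = ((1.0 + sim) / 2.0) ** beta   # low-similarity pairs get small weight
                total += weight * (2.0 - 2.0 * sim)    # BYOL-style alignment term
                count += 1
    return total / max(count, 1)
```

Note the absence of any repulsion term: if a noisy pair wrongly links two classes, its gradient is merely damped rather than amplified, which is the intuition the paper formalizes via Eqs. 6-8.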

The two modules fit naturally into an Expectation‑Maximization (EM) view: the E‑step assigns pseudo‑labels using the current classifier (PLD), and the M‑step refines the representation by minimizing NoiSNCL, which tightens intra‑class clusters. This iterative process simultaneously improves label quality and representation discriminability.
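The PLD side of this alternation, prototype maintenance, gated pseudo-label refinement, and the self-adaptive threshold, can be sketched schematically as below. Everything here is an illustrative reconstruction, not the paper's exact Eqs. 9-13: the function names, the 0 = negative / 1 = positive label convention, the momentum value, and the batch-mean threshold rule are all assumptions made for clarity.

```python
import numpy as np

def update_prototypes(protos, embeddings, pseudo_labels, momentum=0.99):
    """Moving-average class prototypes over l2-normalized embeddings
    (schematic version of the paper's Eq. 9; names are illustrative)."""
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    for c in range(len(protos)):
        members = z[pseudo_labels == c]
        if len(members):
            protos[c] = momentum * protos[c] + (1 - momentum) * members.mean(axis=0)
            protos[c] /= np.linalg.norm(protos[c])  # keep prototypes unit-norm
    return protos

def refine_pseudo_labels(embeddings, protos, neg_conf, tau, labeled_pos_mask):
    """E-step sketch: assign each sample to its nearest prototype, but
    (PhantomGate-style) keep a negative assignment only when the classifier's
    negative confidence exceeds the threshold tau; labeled positives never flip."""
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = z @ np.stack(protos).T                   # cosine similarity to each prototype
    proto_label = sims.argmax(axis=1)               # 0 = negative, 1 = positive (convention here)
    out = proto_label.copy()
    out[(proto_label == 0) & (neg_conf <= tau)] = 1  # not confident enough: withhold negative label
    out[labeled_pos_mask] = 1                        # known positives stay positive
    return out

def update_threshold(tau, batch_neg_conf, rate=0.05):
    """Illustrative self-adaptive threshold (stand-in for Eqs. 12-13): drift tau
    toward the batch mean confidence, so it starts low (abundant negative signal
    early) and rises as the classifier sharpens (filtering doubtful negatives)."""
    return (1 - rate) * tau + rate * float(np.mean(batch_neg_conf))
```

In an EM-style training loop, `refine_pseudo_labels` would supply the labels consumed by the non-contrastive loss, whose tightened clusters in turn improve the prototypes on the next pass.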

Empirical evaluation spans several vision datasets (CIFAR‑10/100, ImageNet‑subset) and a real‑world remote‑sensing task (post‑disaster building damage mapping). Key findings include:

  • Adding NoiSNCL to simple PU baselines (nnPU, uPU) yields 3–5 % absolute accuracy gains without any extra hyper‑parameters.
  • Full NcPU narrows the supervised‑vs‑PU gap on CIFAR‑100 from 14.26 % to under 3 %, achieving 81.6 % accuracy versus 71.4 % for the previous state‑of‑the‑art.
  • On the building‑damage dataset, mean Intersection‑over‑Union improves from 0.58 to 0.65, demonstrating practical utility in humanitarian assistance.
  • Ablation studies confirm that both NoiSNCL and PLD contribute individually, but their combination yields the largest performance boost.

In summary, the paper proposes a noise‑pair‑robust non‑contrastive representation learning strategy coupled with prototype‑driven phantom label disambiguation, eliminating the need for auxiliary negatives or prior class‑ratio estimates. Theoretical analysis, EM interpretation, and extensive experiments collectively show that NcPU can bring PU learning close to fully supervised performance even on challenging visual tasks, opening avenues for broader application to other modalities and multi‑class PU scenarios.

