Semi-Supervised Conformal Prediction With Unlabeled Nonconformity Score

Conformal prediction (CP) is a powerful framework for uncertainty quantification, generating prediction sets with coverage guarantees. Split conformal prediction relies on labeled data in the calibration procedure; however, labeled data are often limited in real-world scenarios, leading to unstable coverage across runs. To address this issue, we extend CP to the semi-supervised setting and propose SemiCP, a new paradigm that leverages both labeled and unlabeled data for calibration. To achieve this, we introduce an unlabeled nonconformity score, the Nearest Neighbor Matching (NNM) score. Specifically, NNM estimates the nonconformity scores of unlabeled samples using their most similar pseudo-labeled counterparts during calibration, while keeping the original scores for labeled data. Theoretically, we demonstrate that the average coverage gap (i.e., the absolute difference between the empirical marginal coverage and the target coverage) of SemiCP decreases at a rate $\mathcal{O}(1/\sqrt{N})$, up to an error term, where $N$ is the number of unlabeled data points. Extensive experiments validate the effectiveness of SemiCP under limited labeled data: with only 20 labeled examples and 4,000 unlabeled examples, it reduces the average coverage gap by up to 77% on common benchmarks.


💡 Research Summary

Conformal prediction (CP) provides finite‑sample, distribution‑free prediction sets with a guaranteed coverage level 1 − α. In the widely used split‑conformal setting, a hold‑out calibration set of labeled examples is required to estimate a quantile of a nonconformity score. When the calibration set is small, the quantile estimate becomes noisy: the marginal coverage follows a Beta distribution with large variance, leading to over‑coverage, oversized prediction sets, and run‑to‑run instability. Existing remedies either interpolate calibration points, cluster them, or employ few‑shot meta‑learning, but all still rely heavily on labeled data and lack finite‑sample guarantees.
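The calibration step described above can be sketched in a few lines. This is a minimal illustration of the standard split-CP quantile, not the paper's code; the function name is mine, and uniform random scores stand in for real nonconformity scores:

```python
import numpy as np

def split_cp_threshold(scores, alpha=0.1):
    """Standard split-CP calibration: return the k-th smallest score,
    with k = ceil((n + 1) * (1 - alpha)), the conformal quantile."""
    n = len(scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return np.sort(scores)[min(k, n) - 1]

# With only 20 calibration scores the quantile estimate is coarse, which is
# exactly the run-to-run instability the paper attributes to small labeled sets.
rng = np.random.default_rng(0)
scores = rng.uniform(size=20)
tau = split_cp_threshold(scores, alpha=0.1)
```

With n = 20 and α = 0.1, the threshold is the 19th-smallest score, so the quantile can only move in jumps of 1/20 as the data change, illustrating the noisiness discussed above.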

The authors propose Semi‑Supervised Conformal Prediction (SemiCP), a paradigm that augments the calibration pool with unlabeled examples. The key technical contribution is a new unlabeled nonconformity score called Nearest‑Neighbor Matching (NNM) score. For each unlabeled instance \tilde{x}, the method finds its most similar labeled instance (x_i, y_i) in a feature space (e.g., the penultimate layer of a pretrained network) and treats y_i as a pseudo‑label \tilde{y}. The NNM score is then defined as S(\tilde{x}, \tilde{y}), where S is any standard score function (THR, APS, RAPS, etc.). By construction, NNM does not require any additional training or optimization; it only needs a nearest‑neighbor search.
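Under those definitions, the NNM score admits a short sketch. The code below assumes THR as the base score (one minus the predicted probability of the pseudo-label), Euclidean distance in feature space, and k = 1 nearest neighbor; all function names are illustrative, not the authors' implementation:

```python
import numpy as np

def thr_score(probs, labels):
    """THR nonconformity score: one minus the predicted probability of the
    given (pseudo-)label; any standard score (APS, RAPS, ...) works too."""
    return 1.0 - probs[np.arange(len(labels)), labels]

def nnm_scores(feats_lab, y_lab, feats_unlab, probs_unlab):
    """NNM score: pseudo-label each unlabeled point with the label of its
    nearest labeled neighbor (Euclidean, k = 1), then apply the base score."""
    # Squared Euclidean distances between every unlabeled/labeled pair.
    d2 = ((feats_unlab[:, None, :] - feats_lab[None, :, :]) ** 2).sum(-1)
    pseudo = y_lab[d2.argmin(axis=1)]  # pseudo-labels from nearest neighbors
    return thr_score(probs_unlab, pseudo)

# Toy example: two labeled points, two unlabeled points, two classes.
feats_lab = np.array([[0.0, 0.0], [1.0, 1.0]])
y_lab = np.array([0, 1])
feats_unlab = np.array([[0.1, 0.0], [0.9, 1.0]])
probs_unlab = np.array([[0.8, 0.2], [0.3, 0.7]])
s_unlab = nnm_scores(feats_lab, y_lab, feats_unlab, probs_unlab)
```

As the text notes, nothing here is trained or optimized; the only extra cost over split CP is the nearest-neighbor search.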

SemiCP aggregates the original labeled scores {s_i} and the NNM scores {\tilde{s}_i} into a single multiset of size n + N, where n is the number of labeled calibration points and N the number of unlabeled points. The (1 − α) quantile of this combined set, \hat{τ}_{SemiCP}, is used as the threshold for constructing prediction sets on test inputs, exactly as in ordinary split CP.
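The pooling and thresholding step can be sketched as follows, again with illustrative names and THR scores (so a test label enters the prediction set when 1 − p_y falls below the threshold):

```python
import numpy as np

def semicp_threshold(labeled_scores, nnm_scores, alpha=0.1):
    """SemiCP threshold: conformal (1 - alpha) quantile of the pooled
    multiset of n labeled scores and N unlabeled NNM scores."""
    pooled = np.sort(np.concatenate([labeled_scores, nnm_scores]))
    m = len(pooled)
    k = int(np.ceil((m + 1) * (1 - alpha)))
    return pooled[min(k, m) - 1]

def thr_prediction_set(probs, tau):
    """THR prediction set: every label whose score 1 - p_y is below tau."""
    return np.where(1.0 - probs <= tau)[0]

# n = 20 labeled and N = 80 NNM scores pooled into one calibration multiset.
labeled = np.arange(1, 21) / 100.0
unlabeled = np.arange(21, 101) / 100.0
tau = semicp_threshold(labeled, unlabeled, alpha=0.1)
```

The quantile rule is identical to ordinary split CP; only the calibration multiset changes, which is why SemiCP plugs into existing CP pipelines unchanged.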

Theoretical analysis shows that the marginal coverage of SemiCP satisfies
P(y_test ∈ C_{SemiCP}(x_test)) ≥ 1 − α + ε_{n,N},
where ε_{n,N} = (N/(n+N))·(F_S(\hat{τ}) − F_{\tilde{S}}(\hat{τ})). Here F_S and F_{\tilde{S}} are the CDFs of the true (label‑aware) and NNM scores, respectively. The bias term ε_{n,N} vanishes as N grows because the NNM distribution converges to the true distribution. Moreover, a calibration‑conditional bound (Theorem 2) demonstrates that the average coverage gap shrinks at the rate O(1/√N) with high probability, while an additional term of order 1/(n+N) captures the usual finite‑sample uncertainty. Thus, unlabeled data can dramatically reduce the variability caused by a small labeled calibration set.

Empirically, the authors evaluate SemiCP on three image classification benchmarks: CIFAR‑10, CIFAR‑100, and ImageNet. In the most challenging regime (20 labeled examples, 4,000 unlabeled examples), SemiCP reduces the average coverage gap by up to 77% compared with standard split CP using the same base scores (THR, APS, RAPS). Prediction set sizes also shrink by 5–7% on average, indicating higher efficiency. The method remains effective across different neural network backbones (ResNet‑50, Vision Transformers) and when combined with conditional CP variants such as ClusterCP. Sensitivity analyses show that using a single nearest neighbor (k = 1) yields the best trade‑off between bias and variance, and that the approach is robust to the choice of distance metric (Euclidean vs. cosine).

In summary, the paper makes three principal contributions: (1) introducing SemiCP, a semi‑supervised conformal prediction framework that leverages unlabeled data for calibration; (2) designing the NNM score, a simple yet theoretically sound unlabeled nonconformity measure that enables the above framework; and (3) providing both rigorous finite‑sample guarantees (coverage bias bounded by O(1/√N)) and extensive empirical evidence of improved stability and efficiency under severe label scarcity. The method requires no retraining, incurs only a nearest‑neighbor search cost, and can be plugged into any existing CP pipeline, making it a practical solution for real‑world applications where labeled data are expensive or scarce.

