Coverage Guarantees for Pseudo-Calibrated Conformal Prediction under Distribution Shift

Reading time: 5 minutes

📝 Original Info

  • Title: Coverage Guarantees for Pseudo-Calibrated Conformal Prediction under Distribution Shift
  • ArXiv ID: 2602.14913
  • Date: 2026-02-16
  • Authors: (Author information is not provided in the source data; please verify against the original paper and add it.)

📝 Abstract

Conformal prediction (CP) offers distribution-free marginal coverage guarantees under an exchangeability assumption, but these guarantees can fail if the data distribution shifts. We analyze the use of pseudo-calibration as a tool to counter this performance loss under a bounded label-conditional covariate shift model. Using tools from domain adaptation, we derive a lower bound on target coverage in terms of the source-domain loss of the classifier and a Wasserstein measure of the shift. Using this result, we provide a method to design pseudo-calibrated sets that inflate the conformal threshold by a slack parameter to keep target coverage above a prescribed level. Finally, we propose a source-tuned pseudo-calibration algorithm that interpolates between hard pseudo-labels and randomized labels as a function of classifier uncertainty. Numerical experiments show that our bounds qualitatively track pseudo-calibration behavior and that the source-tuned scheme mitigates coverage degradation under distribution shift while maintaining nontrivial prediction set sizes.

📄 Full Content

CONFORMAL prediction (CP) provides a rigorous framework for constructing prediction sets with guaranteed marginal coverage under an exchangeability assumption between calibration and test data [1], [2]. CP has been applied in a variety of settings [3], [4], [5]. However, the finite-sample, distribution-free guarantees from CP rely on exchangeability [6], which is often violated due to distribution shift between the source (calibration) and target (test) domains [7], [8], [9], [10]. If target labels are available, one approach to correcting such miscoverage is weighted CP, which importance-weights the calibration scores with estimated density ratios between the source and target marginals on the input space [2], [9]. Alternatively, robust distributional approaches construct ambiguity sets (such as Lévy-Prokhorov balls) around the score distribution and propagate worst-case perturbations through the conformal quantile in score space [11].
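
For context, a minimal sketch of the weighted-CP threshold in the spirit of [2], [9] is shown below; the function name and the density-ratio weights (estimates of dQ_X/dP_X at the calibration points and at the test point) are illustrative assumptions, not artifacts of this paper.

```python
import numpy as np

def weighted_conformal_threshold(cal_scores, cal_weights, test_weight, alpha):
    """Weighted split-CP threshold under covariate shift (sketch).

    cal_scores:  (n,) nonconformity scores on labeled calibration data.
    cal_weights: (n,) estimated density ratios w(X_i) = dQ_X/dP_X(X_i).
    test_weight: scalar w(x) at the test input.
    """
    order = np.argsort(cal_scores)
    s, w = cal_scores[order], cal_weights[order]
    # Weighted empirical CDF; the test point's normalized mass sits at +inf.
    cdf = np.cumsum(w) / (w.sum() + test_weight)
    idx = np.searchsorted(cdf, 1.0 - alpha)
    return s[idx] if idx < len(s) else np.inf
```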

When target-domain labels are unavailable, one possibility is to use pseudo-labels from source classifiers. However, pseudo-labels introduce additional uncertainty that can degrade coverage. Some recent works attempt to mitigate this by heuristically rescaling scores using predictive entropy or reconstruction loss [12], [13]; however, these methods do not yield analytical coverage guarantees. Alternatively, [14] offers bounds for pseudo-labeled targets based on score-distribution distances. However, since these bounds depend on neither the underlying classifier nor the shift characteristics, they provide no guidance for designing the classifier or the pseudo-calibration scheme to trade off coverage and set size.
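
To fix ideas, standard "hard" pseudo-calibration scores each unlabeled target point at the classifier's predicted label and takes the usual split-conformal quantile. A minimal sketch, assuming softmax outputs and the score s(x, y) = 1 − p_y(x) (our choice for illustration, not necessarily the paper's):

```python
import numpy as np

def hard_pseudo_calibrate(probs, alpha):
    """Hard pseudo-calibration: substitute pseudo-labels argmax_k p_k(x)
    for the missing target labels when computing calibration scores.

    probs: (n, K) classifier probabilities on unlabeled target inputs.
    """
    pseudo_labels = probs.argmax(axis=1)
    scores = 1.0 - probs[np.arange(len(probs)), pseudo_labels]
    n = len(scores)
    level = min(np.ceil((n + 1) * (1.0 - alpha)) / n, 1.0)  # finite-sample correction
    return np.quantile(scores, level, method="higher")
```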

These limitations point to a broader gap: existing CP methods under distribution shift do not account for how source-domain errors translate to the target domain. Domain adaptation (DA) theory provides a natural lens for this question by bounding target errors in terms of source losses and distributional shift measures [15], [16]. Yet these ideas have not been incorporated into CP under distribution shift, leaving the analytical understanding of label-free multiclass CP in this regime underdeveloped.

Our work bridges this gap by drawing on DA theory to derive coverage guarantees that explicitly depend on classifier properties and shift measures. We extend these tools to multiclass classification and obtain bounds for pseudo-calibration on the target domain in terms of the source classifier's loss and the Wasserstein distance between the source and target distributions. Inspired by [17], we further introduce a source-tuned pseudo-calibration algorithm that interpolates between hard pseudo-labels and randomized labels based on classifier uncertainty, reducing the conservatism of standard pseudo-calibration while preserving source-domain coverage.
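
The algorithmic details appear later in the paper; one plausible reading of the interpolation, with normalized predictive entropy as the uncertainty proxy and a threshold tau (both our assumptions), is sketched below.

```python
import numpy as np

def source_tuned_labels(probs, tau, rng=np.random.default_rng(0)):
    """Source-tuned labeling (sketch): keep the hard pseudo-label where the
    classifier is confident, and draw a randomized label from the predictive
    distribution where it is uncertain.

    probs: (n, K) rows of classifier probabilities (each row sums to 1).
    tau:   entropy threshold in [0, 1] separating confident from uncertain.
    """
    K = probs.shape[1]
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1) / np.log(K)
    hard = probs.argmax(axis=1)
    randomized = np.array([rng.choice(K, p=p) for p in probs])
    return np.where(entropy <= tau, hard, randomized)
```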

To our knowledge, this is the first integration of DA theory with conformal coverage analysis under distribution shift. Our contributions can be summarized as follows. First, we derive coverage lower bounds for pseudo-calibrated prediction sets on the target domain in terms of the classifier's source-domain loss, Lipschitz properties of the classifier, and a Wasserstein measure of the distribution shift. Second, we introduce relaxed pseudo-calibrated sets that inflate the conformal threshold by a slack parameter and provide a simple design rule for choosing this slack to guarantee a desired target coverage level. Finally, we propose a source-tuned pseudo-calibration algorithm that interpolates between hard pseudo-labels and randomized labels based on classifier uncertainty. Experiments show that our theoretical bounds track empirical behavior and that the proposed algorithm mitigates coverage drop on the target while maintaining nontrivial expected set sizes.

II. BACKGROUND AND PROBLEM FORMULATION

a) Conformal Prediction: Conformal prediction constructs prediction sets with finite-sample marginal coverage guarantees [18]. Given calibration data $\{(X_i, Y_i)\}_{i=1}^{n} \overset{\text{i.i.d.}}{\sim} P_{XY}$ and a test point $(X_{n+1}, Y_{n+1})$, the goal is to ensure

$$\mathbb{P}\big(Y_{n+1} \in \mathcal{C}(X_{n+1})\big) \ge 1 - \alpha$$

for any $\alpha \in (0, 1)$ under an exchangeability assumption between the calibration and test point, i.e., their joint distribution is invariant under permutations [18, Section 3]. Let $s : \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$ be a nonconformity score. For a distribution $P$ on $\mathcal{X} \times \mathcal{Y}$, denote the pushforward score distribution by $s\#P$, with CDF $F_{s\#P}$. For $\alpha \in (0, 1)$, define the $(1-\alpha)$-quantile as $\inf\{t \in \mathbb{R} : F_{s\#P}(t) \ge 1-\alpha\}$. With the empirical distribution $\hat{P}_n = \frac{1}{n}\sum_{i=1}^{n} \delta_{(X_i, Y_i)}$, the split-conformal threshold at $\alpha$ is

$$\hat{q}_\alpha = \inf\Big\{t \in \mathbb{R} : F_{s\#\hat{P}_n}(t) \ge \tfrac{\lceil (n+1)(1-\alpha) \rceil}{n}\Big\},$$

and the conformal prediction set is given by

$$\mathcal{C}(x) = \big\{y \in \mathcal{Y} : s(x, y) \le \hat{q}_\alpha\big\}.$$
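
Putting the two displays together, here is a compact split-conformal sketch, including the slack-inflated threshold used by the relaxed pseudo-calibrated sets (the slack is left as a free parameter; the paper's design rule for choosing it is derived later):

```python
import numpy as np

def split_conformal_set(cal_scores, test_scores, alpha, slack=0.0):
    """Split CP: empirical quantile threshold, optionally inflated by a slack.

    cal_scores:  (n,) scores s(X_i, Y_i) on calibration data.
    test_scores: (K,) scores s(x, y) for every candidate label y.
    Returns the indices of labels in the prediction set C(x).
    """
    n = len(cal_scores)
    level = min(np.ceil((n + 1) * (1.0 - alpha)) / n, 1.0)
    q_hat = np.quantile(cal_scores, level, method="higher")
    return np.flatnonzero(test_scores <= q_hat + slack)
```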

Definition 1: For $p \ge 1$ and probability measures $P$ and $Q$ on $\mathcal{X}$, the $p$-Wasserstein distance is

$$W_p(P, Q) = \Big(\inf_{\pi \in \Pi(P, Q)} \int_{\mathcal{X} \times \mathcal{X}} \|x - x'\|^p \, \mathrm{d}\pi(x, x')\Big)^{1/p},$$

where $\Pi(P, Q)$ denotes the set of couplings of $P$ and $Q$.
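
For one-dimensional empirical measures, the $p = 1$ case can be evaluated via quantile functions; SciPy exposes this directly (a numerical convenience for intuition, not a tool used in the paper):

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
source = rng.normal(loc=0.0, scale=1.0, size=5000)  # source samples
target = rng.normal(loc=0.5, scale=1.0, size=5000)  # mean-shifted target samples

# For a pure location shift of a fixed shape, W_1 equals the shift size,
# so the printed value should be close to 0.5.
print(wasserstein_distance(source, target))
```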

We consider a multiclass classification setting with input space $\mathcal{X} = \mathbb{R}^d$ and label space $\mathcal{Y} = [K] := \{1, \ldots, K\}$. The source and target domains are represented by joint distributions $P_{XY}$ and $Q_{XY}$ over $\mathcal{X} \times \mathcal{Y}$. A classifier $f : \mathcal{X} \to \mathcal{Y}$ is induced by a logit map $M_f : \mathcal{X} \to \mathbb{R}^K$ via $f(x) = \arg\max_{k \in [K]} [M_f(x)]_k$.
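
The nonconformity score is left generic at this point; a common multiclass choice consistent with this setup (an assumption on our part, not necessarily the authors') is the softmax-based score $s(x, y) = 1 - [\mathrm{softmax}(M_f(x))]_y$:

```python
import numpy as np

def softmax_score(logits, y):
    """s(x, y) = 1 - softmax probability of label y under the logit map
    M_f(x); smaller values mean the label conforms better."""
    z = logits - logits.max()        # subtract max for numerical stability
    p = np.exp(z) / np.exp(z).sum()
    return 1.0 - p[y]
```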


This content is AI-processed based on open access ArXiv data.
