We study online conformal prediction for non-stationary data streams subject to unknown distribution drift. While most prior work studied this problem under adversarial settings and/or assessed performance in terms of gaps in time-averaged marginal coverage, we instead evaluate performance through training-conditional cumulative regret. We specifically focus on independently generated data with two types of distribution shift: abrupt change points and smooth drift. When non-conformity score functions are pretrained on an independent dataset, we propose a split-conformal style algorithm that leverages drift detection to adaptively update calibration sets, and prove that it achieves minimax-optimal regret. When non-conformity scores are instead trained online, we develop a full-conformal style algorithm that again incorporates drift detection to handle non-stationarity; this approach relies on stability, rather than permutation symmetry, of the model-fitting algorithm, which is often better suited to online learning in evolving environments. We establish non-asymptotic regret guarantees for our online full-conformal algorithm, which match the minimax lower bound under appropriate restrictions on the prediction sets. Numerical experiments corroborate our theoretical findings.
Conformal prediction, also known as conformal inference, has emerged as a versatile, distribution-free framework for quantifying uncertainty in modern data science (Vovk et al., 1999; Papadopoulos et al., 2002; Vovk et al., 2005; Angelopoulos et al., 2023, 2024b). What sets it apart is its ability to offer rigorous, finite-sample coverage guarantees under minimal distributional assumptions, allowing practitioners to treat complex machine learning models as black boxes while still producing reliable measures of uncertainty. In its classical formulation, we observe n training data points taking the form of feature-response pairs {(X_i, Y_i)}_{1≤i≤n} ⊂ X × R, and are given a test point X_{n+1} ∈ X for which the corresponding response Y_{n+1} is unknown. The aim is to construct a prediction set C(X_{n+1}) that is likely to cover Y_{n+1}. Conformal prediction achieves this objective in a distribution-free fashion, provided the data {(X_i, Y_i)}_{1≤i≤n+1} are exchangeable (Angelopoulos et al., 2023, 2024b).
While exchangeability accommodates wide-ranging practical scenarios, there is no shortage of settings that naturally violate it. One notable example arises when the data distribution drifts over time, as is often the case with sequential or online data (Zhou et al., 2025; Fannjiang et al., 2022). This has motivated a flurry of recent studies on online conformal prediction, with the objective of extending the conformal prediction framework to sequentially arriving data streams (e.g., Vovk et al. (2009); Weinstein and Ramdas (2020); Gibbs and Candes (2021); Bastani et al. (2022); Zaffran et al. (2022); Bhatnagar et al. (2023); Lin et al. (2022); Feldman et al. (2022); Auer et al. (2023); Sun and Yu (2023); Xu and Xie (2023b,a); Xu et al. (2024); Gibbs and Candès (2024); Han et al. (2024a); Angelopoulos et al. (2023, 2024a, 2025); Bao et al. (2024); Lee and Matni (2024); Yang et al. (2024); Podkopaev et al. (2024); Su et al. (2024); Zhang et al. (2024b); Ramalingam et al. (2025); Sale and Ramdas (2025); Humbert et al. (2025)).
Setting the stage, consider a sequential data stream {(X_t, Y_t)}_{1≤t≤T} generated by a dynamic process, where X_t ∈ X denotes the feature (or covariate) at time t and Y_t ∈ R the corresponding response. The data-generating distribution is allowed to drift over time; namely, the distribution of (X_t, Y_t), denoted by D_t, may vary with t. At each time t, the task is to use the previously observed data {(X_s, Y_s)}_{s<t}, together with the newly observed feature X_t, to construct a prediction set C_t(X_t) that is likely to contain the as-yet-unobserved response Y_t. More precisely, for a prescribed miscoverage level α ∈ (0, 1), a desirable prediction set C_t(X_t) would satisfy

P{Y_t ∈ C_t(X_t)} ≥ 1 − α.  (1)
Central to conformal prediction is the non-conformity score function s_t(·, ·), which is computed at time t and may sometimes depend on past observations {(X_τ, Y_τ)}_{τ<t}. For the most part, the score s_t(x, y) measures the extent to which a data point (x, y) ∈ X × R deviates from the prediction of a fitted model. A canonical example is the absolute residual score s_t(x, y) = |y − µ_t(x)|, where µ_t(·) denotes a predictive model trained by an arbitrary machine learning algorithm (for instance, a neural network, or a nonparametric estimator). A widely studied class of prediction intervals takes the form

C_t(x) := {y : s_t(x, y) ≤ q_t}  (2)
for some adaptively chosen threshold q_t, in which case prediction interval construction amounts to dynamically adjusting {q_t} given the non-conformity scores.
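As a concrete illustration of the prediction set in (2) with the absolute-residual score, the following is a minimal split-conformal style sketch in Python. The synthetic data, the linear stand-in model, and the choice α = 0.1 are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic calibration data; a simple linear fit stands in for an
# arbitrary black-box predictive model mu(.).
x_cal = rng.uniform(0.0, 1.0, 500)
y_cal = 2.0 * x_cal + rng.normal(0.0, 0.1, 500)
coef = np.polyfit(x_cal, y_cal, 1)

def mu(x):
    return np.polyval(coef, x)

# Absolute-residual non-conformity scores on the calibration set.
scores = np.abs(y_cal - mu(x_cal))

# Threshold q: empirical quantile of the scores at level
# ceil((n + 1)(1 - alpha)) / n, the standard finite-sample correction.
alpha = 0.1
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n)

# Prediction set C(x) = {y : |y - mu(x)| <= q}, here an interval.
def predict_set(x):
    return mu(x) - q, mu(x) + q

lo, hi = predict_set(0.5)
```

With the absolute-residual score, the set is always an interval of half-width q centered at the model prediction; other scores can yield non-interval sets.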
The online nature of the above problem has motivated a recent line of work to reframe (1) as an online decision-making task and to leverage techniques from online learning to address it. A prominent example is Adaptive Conformal Inference (ACI), proposed by Gibbs and Candes (2021). In a nutshell, the ACI algorithm sequentially calibrates the quantile estimates via the iterative update rule

q_{t+1} = q_t + η (1{Y_t ∉ C_t(X_t)} − α),  (3)

with learning rate η > 0,
which can be interpreted as an instance of the online subgradient method applied to optimize the quantile loss (or pinball loss).
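The update above can be mimicked in a few lines. The following sketch runs ACI-style threshold tracking on synthetic scores; the score distribution, learning rate, horizon, and initialization are all illustrative assumptions (in particular, the i.i.d. scores are purely for demonstration, since ACI itself needs no such assumption):

```python
import numpy as np

rng = np.random.default_rng(1)
alpha = 0.1    # target miscoverage level
gamma = 0.05   # learning rate (assumed value)
T = 5000       # time horizon

# Stand-in non-conformity scores s_t, drawn i.i.d. for illustration.
scores = np.abs(rng.normal(0.0, 1.0, T))

q = 1.0        # arbitrary initial threshold
errs = []
for t in range(T):
    # Miscoverage indicator: 1 if Y_t falls outside C_t(X_t),
    # i.e. the score exceeds the current threshold.
    err = 1.0 if scores[t] > q else 0.0
    errs.append(err)
    # Online subgradient step on the pinball loss: widen the set
    # after a miss, shrink it slightly after a hit.
    q = q + gamma * (err - alpha)

avg_err = float(np.mean(errs))
```

Despite the arbitrary initialization, the time-averaged miscoverage rate avg_err concentrates near α, which is exactly the time-averaged coverage behavior discussed in the surrounding text.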
To establish theoretical validity, a substantial body of prior work has developed coverage guarantees for online conformal prediction methods. For instance, Gibbs and Candes (2021) demonstrated that the ACI algorithm achieves a form of time-averaged coverage without imposing any assumption on the data-generating mechanism; more formally, they proved that, with a suitable constant learning rate schedule, ACI satisfies

(1/T) Σ_{t=1}^{T} 1{Y_t ∈ C_t(X_t)} → 1 − α
as T grows, a result that holds even when the data stream is generated adversarially. Building on this result, subsequent work has extended time-averaged coverage guarantees to a broader family of algorithms (e.g., Zaffran et al. (2022); Angelopoulos et al. (2024a); Bhatnagar et al. (2023); Zhang et al. (2024a)). Note, however, that controlling the empirical long-run coverage alone can be a weak criterion, since it can be satisfied by degenerate procedures, for instance ones that alternate between trivially wide and empty prediction sets.