Conformal Prediction Sets for Instance Segmentation


Current instance segmentation models achieve high performance on average predictions, but lack principled uncertainty quantification: their outputs are not calibrated, and there is no guarantee that a predicted mask is close to the ground truth. To address this limitation, we introduce a conformal prediction algorithm to generate adaptive confidence sets for instance segmentation. Given an image and a pixel coordinate query, our algorithm generates a confidence set of instance predictions for that pixel, with a provable guarantee for the probability that at least one of the predictions has high Intersection-Over-Union (IoU) with the true object instance mask. We apply our algorithm to instance segmentation examples in agricultural field delineation, cell segmentation, and vehicle detection. Empirically, we find that our prediction sets vary in size based on query difficulty and attain the target coverage, outperforming existing baselines such as Learn Then Test, Conformal Risk Control, and morphological dilation-based methods. We provide versions of the algorithm with asymptotic and finite sample guarantees.


💡 Research Summary

The paper tackles a fundamental shortcoming of modern instance‑segmentation models: while they achieve high average accuracy, they provide no principled quantification of uncertainty for individual mask predictions. Existing conformal prediction techniques such as Learn‑Then‑Test (LTT) or Conformal Risk Control (CRC) are designed for settings where a single tunable parameter yields a single prediction per input and where the loss function is monotone in that parameter. In instance segmentation, the natural quality metric is Intersection‑over‑Union (IoU), which is non‑monotonic, and a single parameter value rarely works well for all queries. Consequently, LTT/CRC often fail to find a feasible solution, or they return a single mask that may be far from the ground truth.

To overcome these limitations, the authors propose a set‑based conformal prediction algorithm that searches over a collection of parameter values rather than a single one. The method assumes access to an instance‑segmentation model $f$ that takes an image plus a query point, $(I, (z_1, z_2))$, and a tunable parameter $T$ (e.g., the mask‑score threshold in Segment Anything, a watershed threshold, or a logit cutoff) and returns a binary mask. The user pre‑selects a modest grid $\{t_1, \dots, t_k\}$ of candidate parameter values.
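The assumed model interface can be illustrated with a toy parametric segmenter. Note that `segment`, the score-map input, and the query-pixel check below are hypothetical stand-ins, not the paper's model; they only mimic the signature $f(X, T) \mapsto \text{mask}$:

```python
import numpy as np

def segment(score_map, z, T):
    """Toy stand-in for the model f: threshold a per-pixel score map at
    level T and return the resulting binary mask if it contains the query
    pixel z, else an empty mask. Real models (e.g. Segment Anything's
    mask-score threshold) expose an analogous tunable parameter."""
    mask = score_map > T
    return mask if mask[z] else np.zeros_like(mask)

# Lowering T grows the mask; raising it shrinks (or empties) it.
scores = np.array([[0.9, 0.2],
                   [0.1, 0.8]])
loose = segment(scores, (0, 0), T=0.1)   # three pixels clear the cutoff
tight = segment(scores, (0, 0), T=0.95)  # no pixel clears the cutoff
```

Sweeping $T$ over the grid $\{t_1, \dots, t_k\}$ yields a family of candidate masks per query, which is what the set-based construction below selects from.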

Using a calibration set $\{(X_i, Y_i)\}_{i=1}^{n}$ (where each $Y_i$ is the true mask for the query point in $X_i$), the algorithm computes, for every pair $(i, j)$, the IoU $\rho_{ij} = \mathrm{IoU}(Y_i, f(X_i, t_j))$. For each parameter $t_j$ it records the set $S_j = \{\, i : \rho_{ij} > \tau \,\}$ of calibration examples that achieve a user‑specified IoU threshold $\tau$. The core combinatorial step is to find a minimum‑size subset of parameters $J_{\alpha,\tau} \subseteq \{1, \dots, k\}$ such that the union of the corresponding $S_j$ covers at least $(1-\alpha)\,n$ calibration points. This is precisely the classic set‑cover problem; the authors employ a greedy approximation followed by a brute‑force refinement on small subsets to obtain a (near‑)optimal cover.
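The greedy part of this step can be sketched as follows. The function name and array layout are illustrative assumptions, and the paper's brute-force refinement of small subsets is omitted:

```python
import numpy as np

def greedy_parameter_cover(rho, tau, alpha):
    """Greedy set cover over parameter values.

    rho:   (n, k) array with rho[i, j] = IoU(Y_i, f(X_i, t_j)).
    Returns a list J of column indices whose success sets S_j jointly
    cover at least (1 - alpha) * n calibration examples.
    """
    n, k = rho.shape
    success = rho > tau                      # column j encodes S_j
    covered = np.zeros(n, dtype=bool)
    target = int(np.ceil((1 - alpha) * n))
    J = []
    while covered.sum() < target:
        # Marginal gain: calibration points each parameter newly covers.
        gains = (success & ~covered[:, None]).sum(axis=0)
        j = int(gains.argmax())
        if gains[j] == 0:                    # no parameter helps: infeasible
            break
        J.append(j)
        covered |= success[:, j]
    return J
```

On a calibration matrix where one parameter works for half the examples and another for the other half, the greedy cover correctly picks both; with a looser coverage target it stops after one.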

For a new test query $X_{\mathrm{test}}$, the algorithm returns the conformal prediction set $C(X_{\mathrm{test}}) = \{\, f(X_{\mathrm{test}}, t_j) : j \in J_{\alpha,\tau} \,\}$: one candidate mask per parameter in the selected cover. Because the calibration and test points are exchangeable, with probability (approximately) $1-\alpha$ at least one mask in the set attains IoU above $\tau$ with the true instance mask; the paper provides both asymptotic and finite‑sample versions of this guarantee.
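Given the selected indices, building the prediction set at test time is a direct lookup. This sketch assumes a `model(X, t)` callable returning one binary mask per parameter value; both names are illustrative:

```python
def conformal_prediction_set(model, X_test, t_grid, J):
    """Return the conformal prediction set for a test query: the masks
    produced by each parameter value selected by the set cover."""
    return [model(X_test, t_grid[j]) for j in J]

# Usage with a dummy model that just echoes the threshold it was given.
t_grid = [0.3, 0.5, 0.7]
masks = conformal_prediction_set(lambda X, t: ('mask', t), None, t_grid, [0, 2])
```

The set size equals $|J_{\alpha,\tau}|$, so easy domains (where one parameter covers nearly all calibration points) yield singleton sets, while harder ones yield larger sets, matching the adaptivity reported in the abstract.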

