
Explicit Abstention Knobs for Predictable Reliability in Video Question Answering

February 09, 2026


📝 Original Info

Title: Explicit Abstention Knobs for Predictable Reliability in Video Question Answering
ArXiv ID: 2601.00138
Date: 2025-12-31
Authors: Jorge Ortiz

📝 Abstract

High-stakes deployment of vision-language models (VLMs) requires selective prediction, where systems abstain when uncertain rather than risk costly errors. We investigate whether confidence-based abstention provides reliable control over error rates in video question answering, and whether that control remains robust under distribution shift. Using NExT-QA and Gemini 2.0 Flash, we establish two findings. First, confidence thresholding provides mechanistic control in-distribution. Sweeping threshold ε produces smooth risk-coverage tradeoffs, reducing error rates from 23.6% to 9.4% at 63.7% coverage with well-calibrated predictions (ECE = 0.018). Second, this control is not epistemic. Under evidence degradation (18 frames reduced to 6), the model's confidence distribution contracts only modestly. Evaluating the same frozen question instances under both evidence conditions, median self-reported confidence remains 0.9 in both regimes despite a 3× reduction in visual information. We corroborate this finding with logprob-derived confidence (p_max), obtained via a separate prompt interface on matched question instances; this signal exhibits the same failure mode. The model does not "know when it does not know" under shift. These results motivate warrant-based selec...
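
The threshold sweep and calibration metric mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's code: the function names, the "answer when confidence >= ε" convention, the equal-width 10-bin ECE, and the toy confidence and correctness values are all assumptions made for illustration, and the numbers it produces have no relation to the reported results.

```python
from typing import List, Optional, Tuple


def risk_coverage(confidences: List[float], correct: List[int],
                  threshold: float) -> Tuple[float, Optional[float]]:
    """Answer only when confidence >= threshold (assumed convention).

    Returns (coverage, risk): the fraction of questions answered and the
    error rate among them. Risk is None when nothing is answered.
    """
    answered = [c for conf, c in zip(confidences, correct) if conf >= threshold]
    coverage = len(answered) / len(confidences)
    risk = None if not answered else 1.0 - sum(answered) / len(answered)
    return coverage, risk


def expected_calibration_error(confidences: List[float], correct: List[int],
                               n_bins: int = 10) -> float:
    """Equal-width-bin ECE: weighted mean of |avg confidence - accuracy| per bin."""
    n = len(confidences)
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for conf, c in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, c))
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(conf for conf, _ in b) / len(b)
        accuracy = sum(c for _, c in b) / len(b)
        ece += (len(b) / n) * abs(avg_conf - accuracy)
    return ece


# Toy, hand-made data (hypothetical; not drawn from NExT-QA or any model).
toy_conf = [0.95, 0.9, 0.9, 0.8, 0.6, 0.5, 0.4, 0.3]
toy_correct = [1, 1, 1, 0, 1, 0, 0, 1]

# Sweep the abstention threshold to trace a risk-coverage curve.
for eps in (0.0, 0.5, 0.85):
    cov, risk = risk_coverage(toy_conf, toy_correct, eps)
    print(f"eps={eps:.2f} coverage={cov:.3f} risk={risk}")
```

Raising the threshold trades coverage for lower risk only as long as confidence tracks correctness. The abstract's second finding is that this link weakens under evidence degradation, which a sweep like this would not reveal unless it were repeated on the degraded inputs.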

📄 Full Content

...(The full text is long and has been omitted here. Please see the site for the complete article.)
