Confidence Thresholds for Robust Video QA Abstention
📝 Original Paper Info
- Title: Explicit Abstention Knobs for Predictable Reliability in Video Question Answering
- ArXiv ID: 2601.00138
- Date: 2025-12-31
- Authors: Jorge Ortiz
📝 Abstract
High-stakes deployment of vision-language models (VLMs) requires selective prediction, where systems abstain when uncertain rather than risk costly errors. We investigate whether confidence-based abstention provides reliable control over error rates in video question answering, and whether that control remains robust under distribution shift. Using NExT-QA and Gemini 2.0 Flash, we establish two findings. First, confidence thresholding provides mechanistic control in-distribution. Sweeping threshold epsilon produces smooth risk-coverage tradeoffs, reducing error rates f
💡 Summary & Analysis
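The confidence-thresholding mechanism described in the abstract can be illustrated with a minimal sketch. It assumes the model answers only when its confidence is at least epsilon and abstains otherwise; that convention, the function names `risk_coverage` and `sweep`, and the toy data are all assumptions made for illustration, not the paper's implementation or results.

```python
from typing import List, Sequence, Tuple


def risk_coverage(
    confidences: Sequence[float], correct: Sequence[bool], epsilon: float
) -> Tuple[float, float]:
    """Return (coverage, risk) when answering only if confidence >= epsilon.

    Assumed convention: coverage is the fraction of questions answered,
    and risk is the error rate among the answered questions.
    """
    answered = [ok for conf, ok in zip(confidences, correct) if conf >= epsilon]
    coverage = len(answered) / len(confidences)
    # With nothing answered there are no errors; report risk as 0.0 by convention.
    risk = 0.0 if not answered else 1.0 - sum(answered) / len(answered)
    return coverage, risk


def sweep(
    confidences: Sequence[float], correct: Sequence[bool], thresholds: Sequence[float]
) -> List[Tuple[float, float, float]]:
    """Evaluate (epsilon, coverage, risk) for each threshold in the sweep."""
    return [(eps, *risk_coverage(confidences, correct, eps)) for eps in thresholds]


# Toy data, invented for illustration only (not results from the paper).
confidences = [0.95, 0.90, 0.80, 0.60, 0.55, 0.30]
correct = [True, True, False, True, False, False]

for eps, cov, risk in sweep(confidences, correct, [0.0, 0.5, 0.7, 0.9]):
    print(f"epsilon={eps:.1f}  coverage={cov:.2f}  risk={risk:.2f}")
```

On this toy data, raising epsilon lowers coverage and lowers risk together, which is the kind of risk-coverage tradeoff a threshold sweep is meant to expose. Whether the same relationship holds on new data depends on how well the confidence scores track correctness there.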
1. **Importance of Deep Learning**: Deep learning equips computers with the ability to learn and understand like humans, making complex tasks such as sentiment analysis possible.
2. **RoBERTa's Performance Edge**: RoBERTa operates more efficiently than BERT, akin to reaching a destination faster by car rather than cycling.
3. **DistilBERT’s Lightweight Optimization**: DistilBERT retains BERT's core functionalities while being smaller in size, making it ideal for environments needing quick and lightweight performance like smartphone apps.
📄 Full Paper Content (ArXiv Source)
📊 Paper Figures










