Soft-Label Training Preserves Epistemic Uncertainty
Many machine learning tasks involve inherent subjectivity, and annotators naturally provide varied labels. Standard practice collapses these label distributions into single labels, aggregating diverse human judgments into point estimates. We argue this is epistemically misaligned for ambiguous data: the annotation distribution itself is the ground truth. Training on collapsed single labels forces models to express false confidence on fundamentally ambiguous cases, creating a mismatch between model certainty and the diversity of human perception. We demonstrate empirically that soft-label training, which treats annotation distributions as ground truth, preserves epistemic uncertainty. Across vision and NLP tasks, soft-label training achieves 32% lower KL divergence to human annotations and 61% stronger correlation between model and annotation entropy, while matching hard-label accuracy. Our work repositions annotation distributions from noisy signals to be aggregated away to faithful representations of epistemic uncertainty that models should learn to reproduce.
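The core idea can be made concrete with a minimal sketch. Below, soft-label training is expressed as cross-entropy against the annotation distribution rather than a one-hot majority-vote label; the loss is then minimized exactly when the model's predictive distribution matches the annotators' distribution, preserving their disagreement. The specific numbers (a 2-class example with a 60/40 annotator split) are hypothetical illustrations, not data from the paper.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(target, probs):
    # H(target, probs) = -sum_i target_i * log(probs_i).
    # With a soft target, this is minimized when probs == target.
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

def kl_divergence(target, probs):
    # KL(target || probs); differs from cross-entropy only by the
    # (constant) entropy of the target, so both have the same minimizer.
    return sum(t * math.log(t / p) for t, p in zip(target, probs) if t > 0)

# Hypothetical ambiguous example: 10 annotators, 6 pick class 0, 4 pick class 1.
soft_label = [0.6, 0.4]   # annotation distribution treated as ground truth
hard_label = [1.0, 0.0]   # majority-vote collapse into a point estimate

logits = [0.5, 0.1]       # example model outputs
probs = softmax(logits)

soft_loss = cross_entropy(soft_label, probs)  # pulls probs toward [0.6, 0.4]
hard_loss = cross_entropy(hard_label, probs)  # pulls probs toward [1.0, 0.0]
```

The design point: under the soft target, the optimal model reproduces the annotators' entropy on ambiguous inputs, whereas the hard target drives the model toward full confidence regardless of ambiguity.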