Uncertainty Estimation by Flexible Evidential Deep Learning
Uncertainty quantification (UQ) is crucial for deploying machine learning models in high-stakes applications, where overconfident predictions can lead to serious consequences. An effective UQ method must balance computational efficiency with the ability to generalize across diverse scenarios. Evidential deep learning (EDL) achieves efficiency by modeling uncertainty through the prediction of a Dirichlet distribution over class probabilities. However, the restrictive assumption of Dirichlet-distributed class probabilities limits EDL’s robustness, particularly in complex or unforeseen situations. To address this, we propose \textit{flexible evidential deep learning} ($\mathcal{F}$-EDL), which extends EDL by predicting a flexible Dirichlet distribution – a generalization of the Dirichlet distribution – over class probabilities. This approach provides a more expressive and adaptive representation of uncertainty, significantly enhancing UQ generalization and reliability under challenging scenarios. We theoretically establish several advantages of $\mathcal{F}$-EDL and empirically demonstrate its state-of-the-art UQ performance across diverse evaluation settings, including classical, long-tailed, and noisy in-distribution scenarios.
💡 Research Summary
The paper tackles a fundamental limitation of Evidential Deep Learning (EDL), namely its reliance on the Dirichlet distribution as a prior over class probabilities. While Dirichlet‑based EDL offers a single‑forward‑pass, closed‑form uncertainty quantification, the restrictive assumption hampers robustness in ambiguous, noisy, or out‑of‑distribution (OOD) scenarios. To overcome this, the authors introduce Flexible Evidential Deep Learning (F‑EDL), which replaces the Dirichlet prior with the Flexible Dirichlet (FD) distribution—a generalization that incorporates an additional allocation vector p and a dispersion scalar τ alongside the usual concentration parameters α.
The FD distribution is constructed by normalizing a Flexible Gamma basis, which combines independent Gamma variables with a shared Gamma component and a multinomial allocation variable. This design endows FD with the ability to model dependencies among class components and to generate multimodal probability distributions, a capability absent in the standard Dirichlet. Consequently, F‑EDL can express richer uncertainty patterns, especially when multiple classes are plausible simultaneously.
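The construction above can be sketched as a sampler: normalize independent Gamma variables after adding a shared Gamma(τ) component to one coordinate chosen by a multinomial allocation. This is a minimal illustration of that recipe (parameter values are arbitrary, not from the paper); the resulting FD(α, p, τ) is equivalently a mixture of Dirichlets, Σᵢ pᵢ Dir(α + τeᵢ).

```python
import numpy as np

def sample_fd(alpha, p, tau, n_samples, rng):
    """Draw samples from a Flexible Dirichlet FD(alpha, p, tau)
    by normalizing a Flexible Gamma basis."""
    K = len(alpha)
    # independent Gamma variables, one per class component
    y = rng.gamma(shape=alpha, size=(n_samples, K))
    # multinomial allocation: one-hot choice of which component
    # receives the shared Gamma(tau) mass
    z = rng.multinomial(1, p, size=n_samples)
    # shared Gamma component, added only to the allocated coordinate
    u = rng.gamma(shape=tau, size=(n_samples, 1))
    w = y + z * u
    # normalize onto the probability simplex
    return w / w.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
alpha = np.array([2.0, 3.0, 4.0])   # illustrative concentrations
p = np.array([0.5, 0.3, 0.2])       # illustrative allocation vector
tau = 5.0                           # illustrative dispersion
pi = sample_fd(alpha, p, tau, 10_000, rng)
```

Because the allocation can place the extra Gamma(τ) mass on different coordinates across draws, the samples can cluster around several modes, which is exactly the multimodality the standard Dirichlet cannot express.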
Model architecture: an input x is processed by a feature extractor fθ to obtain a latent representation z. Three neural heads predict α = exp(gϕ₁(z)), p = softmax(gϕ₂(z)), and τ = softplus(gϕ₃(z)). Spectral normalization is applied to fθ and gϕ₁ to stabilize the concentration parameters. The predicted parameters define a posterior FD distribution π ∼ FD(α, p, τ) over class probabilities.
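The three output maps can be sketched in a few lines. Linear heads stand in here for the paper's neural heads gϕ₁, gϕ₂, gϕ₃, and spectral normalization of fθ and gϕ₁ is omitted; the point is only the activation choices that keep each parameter in its valid range.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())  # shift for numerical stability
    return e / e.sum()

def softplus(v):
    return np.log1p(np.exp(v))

def fd_heads(z, W_alpha, W_p, w_tau):
    """Map a latent representation z to FD parameters (alpha, p, tau)."""
    alpha = np.exp(W_alpha @ z)        # concentrations: strictly positive
    p = softmax(W_p @ z)               # allocation: a point on the simplex
    tau = softplus(float(w_tau @ z))   # dispersion: a positive scalar
    return alpha, p, tau

rng = np.random.default_rng(0)
d, K = 8, 3  # hypothetical latent width and class count
z = rng.standard_normal(d)
alpha, p, tau = fd_heads(z,
                         0.1 * rng.standard_normal((K, d)),
                         0.1 * rng.standard_normal((K, d)),
                         0.1 * rng.standard_normal(d))
```

Any (α, p, τ) produced this way parameterizes a valid FD posterior over class probabilities.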
Training objective: the primary term is the expected mean-squared error under the FD posterior, 𝔼_{π ∼ FD(α, p, τ)}[‖y − π‖²], where y is the one-hot label.
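This expected-MSE term can be checked with a Monte Carlo sketch. The estimator below samples from the FD construction directly rather than using a closed-form expression, and the parameter settings are illustrative only; it shows that concentrating evidence and allocation on the labeled class drives the expected loss down.

```python
import numpy as np

def fd_expected_mse(y, alpha, p, tau, n_samples=20_000, seed=0):
    """Monte Carlo estimate of E_{pi ~ FD(alpha, p, tau)}[||y - pi||^2]."""
    rng = np.random.default_rng(seed)
    K = len(alpha)
    # sample FD: independent Gammas plus a Gamma(tau) on an allocated
    # coordinate, then normalize onto the simplex
    g = rng.gamma(shape=alpha, size=(n_samples, K))
    z = rng.multinomial(1, p, size=n_samples)
    u = rng.gamma(shape=tau, size=(n_samples, 1))
    w = g + z * u
    pi = w / w.sum(axis=1, keepdims=True)
    return float(np.mean(np.sum((y - pi) ** 2, axis=1)))

y = np.array([1.0, 0.0, 0.0])  # one-hot label for class 0
# evidence and allocation aligned with the true class
loss_good = fd_expected_mse(y, np.array([10.0, 1.0, 1.0]),
                            np.array([0.8, 0.1, 0.1]), 4.0)
# evidence and allocation concentrated on a wrong class
loss_bad = fd_expected_mse(y, np.array([1.0, 10.0, 1.0]),
                           np.array([0.1, 0.8, 0.1]), 4.0)
```

Minimizing this expectation over (α, p, τ) therefore rewards predictions whose FD posterior concentrates near the label while retaining calibrated spread elsewhere.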