Shades of Uncertainty: How AI Uncertainty Visualizations Affect Trust in Alzheimer's Predictions
Artificial intelligence (AI) is increasingly used to support prognosis in Alzheimer’s disease (AD), but adoption remains limited due to a lack of transparency and interpretability, particularly for long-term predictions where uncertainty is intrinsic and outcomes may not be known for years. We position uncertainty visualization as an explainable AI (XAI) technique and examine how it shapes trust, confidence, and reliance when users interpret AI-generated forecasts of future cognitive decline transitions. We conducted two studies, one with general participants (N=37) and one with experts in neuroimaging and neurology (N=10), to compare binary (present/absent) and continuous (saturation) uncertainty encodings. Continuous encodings improved perceived reliability and helped users recognize model limitations, while binary encodings increased momentary confidence, revealing expertise-dependent trade-offs in interpreting future predictions under high uncertainty. These findings surface key challenges in designing uncertainty representations for prognostic AI and culminate in a set of empirically grounded guidelines for creating trustworthy, user-appropriate clinical decision support tools.
💡 Research Summary
The paper investigates how visualizing uncertainty in AI‑driven forecasts of Alzheimer’s disease (AD) progression influences users’ trust, confidence, and reliance on the system. Recognizing that long‑term prognostic tasks are intrinsically uncertain and that clinicians often receive only categorical predictions or confidence scores, the authors treat uncertainty visualization as an explainable AI (XAI) technique. They compare two visual encodings: a binary representation (uncertainty present vs. absent) and a continuous representation that uses color saturation and transparency gradients to convey the magnitude of uncertainty.
Two user studies were conducted: Study 1 recruited 37 lay participants, and Study 2 involved 10 neuroimaging and neurology experts. Together, the studies form a 2 × 2 × 2 design crossing (1) the amount of model information provided (minimal vs. moderate), (2) the uncertainty encoding (binary vs. continuous), and (3) participant expertise (general vs. expert), with expertise varying between the two studies rather than within them. The AI model used was an instance of the ML4VisAD framework, a Bayesian neural network that integrates multimodal MRI, PET, CSF, genetic, and cognitive data to predict transitions from cognitively normal (CN) to mild cognitive impairment (MCI) and from MCI to AD over a five‑year horizon. The model outputs both point predictions and 95 % confidence intervals, which are visualized according to the experimental condition.
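A minimal sketch of how such a Bayesian model's outputs could be reduced to the per-year point predictions and 95 % intervals described above; the sampling distribution, array shapes, and variable names are illustrative assumptions, not the actual ML4VisAD interface:

```python
# Hypothetical post-processing sketch (not the ML4VisAD API): summarize
# posterior draws of the yearly transition probability into a point forecast
# and a 95% interval over a five-year horizon.
import numpy as np

rng = np.random.default_rng(seed=0)

# Pretend posterior: 1000 draws x 5 forecast years of p(MCI -> AD).
posterior_samples = rng.beta(a=2.0, b=5.0, size=(1000, 5))

point_prediction = posterior_samples.mean(axis=0)                  # per-year mean
ci_lower, ci_upper = np.percentile(posterior_samples, [2.5, 97.5], axis=0)

for year, (p, lo, hi) in enumerate(zip(point_prediction, ci_lower, ci_upper), start=1):
    print(f"Year {year}: p = {p:.2f}  (95% interval {lo:.2f}-{hi:.2f})")
```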
Quantitative outcomes included multi‑dimensional trust (competence, predictability, transparency, integrity), decision confidence (0‑100 scale), and reliance (percentage of AI recommendations followed). Qualitative data were collected via post‑task interviews to capture participants’ mental models of uncertainty and their decision strategies.
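As a rough illustration of the behavioral measures, the snippet below computes reliance and mean decision confidence from hypothetical per-trial logs; the data and variable choices are invented for demonstration and do not come from the study:

```python
# Toy per-trial data (invented): whether the participant followed the AI
# recommendation, and their self-reported decision confidence (0-100).
import numpy as np

followed_ai = np.array([1, 1, 0, 1, 0, 1, 1, 1])        # 1 = followed the AI
confidence = np.array([72, 65, 40, 80, 55, 90, 70, 60])

reliance_pct = 100 * followed_ai.mean()    # percentage of AI recommendations followed
mean_confidence = confidence.mean()        # average confidence on the 0-100 scale

print(f"Reliance: {reliance_pct:.0f}%, mean confidence: {mean_confidence:.1f}/100")
```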
Key findings:
- Continuous uncertainty visualizations significantly improved perceived reliability and overall trust scores (average trust rose from 0.68 to 0.81). Participants could gauge how “wide” the uncertainty band was, leading to more calibrated reliance, especially among experts, who were less likely to over‑trust the system when uncertainty was high.
- Binary visualizations boosted momentary confidence by roughly 12 % because they reduced cognitive load and presented a clear, decisive outcome. However, this effect was accompanied by a tendency toward over‑confidence among lay users, who often ignored the underlying uncertainty and accepted the prediction at face value.
- Providing moderate model information (a concise description of training, validation, and performance metrics) increased overall trust by about 7 % compared with minimal information, with the most pronounced effect on the “transparency” dimension. Experts already exhibited high baseline trust, but the combination of moderate model information and continuous uncertainty yielded the highest multi‑dimensional trust scores.
- Qualitative analysis revealed that lay participants exposed to continuous visualizations spontaneously verbalized statements such as “the prediction isn’t very certain,” indicating that the gradient encoding successfully communicated uncertainty magnitude. Experts reported using the width of the uncertainty band to decide whether to defer to the AI or to seek additional clinical evidence. In contrast, binary visualizations prompted comments like “I’ll just go with the result,” suggesting a simplification that may mask risk.
Based on these results, the authors propose five design guidelines for trustworthy clinical AI: (1) use continuous color/opacity gradients to encode uncertainty magnitude; (2) accompany predictions with a concise summary of model development and validation; (3) tailor visualization complexity to user expertise; (4) label uncertainty intervals explicitly (e.g., “95 % CI”); and (5) enable interactive exploration of uncertainty ranges.
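The sketch below illustrates guidelines (1) and (4) under assumed forecast values: a continuous band whose opacity fades as the interval widens, with the interval labeled explicitly as a 95 % CI. It is one plausible rendering, not the visualization used in the study:

```python
# Illustrative (made-up) five-year forecast with a widening 95% interval,
# drawn with an opacity gradient so less certain years appear fainter.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Patch

years = np.arange(1, 6)
p_transition = np.array([0.10, 0.18, 0.27, 0.38, 0.50])   # point forecast
half_width = np.array([0.03, 0.06, 0.10, 0.15, 0.20])     # CI half-widths

fig, ax = plt.subplots()
ax.plot(years, p_transition, marker="o", color="tab:blue", label="AI forecast")

# Per-segment alpha: a wider interval gets a lower opacity (fainter band).
alphas = 0.15 + 0.6 * (1 - half_width / half_width.max())
for i in range(len(years) - 1):
    ax.fill_between(years[i:i + 2],
                    (p_transition - half_width)[i:i + 2],
                    (p_transition + half_width)[i:i + 2],
                    color="tab:blue",
                    alpha=float(alphas[i:i + 2].mean()),
                    linewidth=0)

ax.set_xlabel("Years from baseline")
ax.set_ylabel("Predicted probability of MCI-to-AD transition")
handles, _ = ax.get_legend_handles_labels()
handles.append(Patch(facecolor="tab:blue", alpha=0.4, label="95% CI"))
ax.legend(handles=handles)
plt.show()
```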
The study contributes to XAI literature by empirically demonstrating that the form of uncertainty representation—beyond mere presence of a confidence score—critically shapes trust calibration in high‑stakes, long‑term medical decision making. It also highlights expertise‑dependent trade‑offs: while clinicians may appreciate nuanced uncertainty cues, non‑experts benefit from clearer, confidence‑boosting signals, albeit at the risk of over‑reliance. Limitations include modest sample sizes and the lack of real‑world longitudinal feedback, which restricts assessment of how trust evolves over time. Future work should test these visualizations in larger clinical trials, explore additional uncertainty types (e.g., epistemic vs. aleatoric), and investigate adaptive interfaces that dynamically adjust uncertainty displays based on user behavior and task context.