Will We Trust What We Don't Understand? Impact of Model Interpretability and Outcome Feedback on Trust in AI


Despite AI’s superhuman performance in a variety of domains, humans are often unwilling to adopt AI systems. The lack of interpretability inherent in many modern AI techniques is believed to be hurting their adoption, as users may not trust systems whose decision processes they do not understand. We investigate this proposition with a novel experiment in which we use an interactive prediction task to analyze the impact of interpretability and outcome feedback on trust in AI and on human performance in AI-assisted prediction tasks. We find that interpretability led to no robust improvements in trust, while outcome feedback had a significantly greater and more reliable effect. However, both factors had modest effects on participants’ task performance. Our findings suggest that (1) factors receiving significant attention, such as interpretability, may be less effective at increasing trust than factors like outcome feedback, and (2) augmenting human performance via AI systems may not be a simple matter of increasing trust in AI, as increased trust is not always associated with equally sizable improvements in performance. These findings invite the research community to focus not only on methods for generating interpretations but also on techniques for ensuring that interpretations impact trust and performance in practice.


💡 Research Summary

The paper investigates how two design factors—model interpretability and outcome feedback—affect users’ trust in artificial‑intelligence (AI) systems and their performance on AI‑assisted prediction tasks. Although modern AI models often achieve superhuman accuracy, adoption remains limited because users are reluctant to rely on “black‑box” systems they cannot understand. To test whether providing explanations (interpretability) actually increases trust, the authors designed an interactive experiment with a 2 × 2 factorial structure: (1) interpretability present vs. absent and (2) outcome feedback present vs. absent.
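As a concrete reading of this design, the short sketch below enumerates the four between-subjects cells implied by crossing the two factors. The condition labels are illustrative rather than the paper's exact terminology.

```python
from itertools import product

# Two manipulated factors, each with two levels (labels are illustrative).
interpretability_levels = ["explanation_shown", "prediction_only"]
feedback_levels = ["outcome_feedback", "no_feedback"]

# Crossing them yields the four cells of the 2 x 2 factorial design.
for i, (interp, fb) in enumerate(product(interpretability_levels, feedback_levels), 1):
    print(f"Cell {i}: interpretability={interp}, feedback={fb}")
```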

Participants (N = 240) performed a sequential prediction task (e.g., forecasting a numeric target from a time series). In the interpretability condition, each AI prediction was accompanied by a visual explanation (feature‑importance bars, a partial decision tree, or SHAP‑style values) intended to make the model's reasoning transparent; in the condition without interpretability, only the raw prediction was shown. In the feedback condition, participants received the true outcome together with the model's error after each trial, allowing them to see whether their own decision and the AI's suggestion were correct; in the no‑feedback condition, this information was withheld, so participants could rely only on their intuition and the AI's prediction.
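A minimal sketch of how a single trial might unfold under these conditions is given below. The `Trial` fields, the `collect_prediction` stub, and the printed messages are assumptions made for illustration; the paper's actual interface may differ.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Trial:
    ai_prediction: float          # the model's point prediction
    explanation: Optional[str]    # e.g., a feature-importance summary (hypothetical)
    true_outcome: float           # revealed only when feedback is given

def collect_prediction(prompt: str = "Your prediction: ") -> float:
    # Stand-in for the participant's response; a real study would use a web UI.
    return float(input(prompt))

def run_trial(trial: Trial, show_explanation: bool, give_feedback: bool) -> float:
    """One trial: show the AI's advice, elicit a prediction, optionally reveal the outcome."""
    print(f"AI prediction: {trial.ai_prediction:.2f}")
    if show_explanation and trial.explanation:
        print(f"Model rationale: {trial.explanation}")
    human_prediction = collect_prediction()
    if give_feedback:
        print(f"True outcome: {trial.true_outcome:.2f}")
        print(f"AI error:   {abs(trial.ai_prediction - trial.true_outcome):.2f}")
        print(f"Your error: {abs(human_prediction - trial.true_outcome):.2f}")
    return human_prediction
```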

Trust was measured with pre‑ and post‑experiment Likert questionnaires covering willingness to follow the AI, perceived reliability, and overall confidence in the system. Performance was operationalized as prediction accuracy (mean absolute error) and decision latency (average time per trial). The authors applied mixed‑effects ANOVAs to assess main effects and interactions.
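The sketch below shows one way these measures and the trust analysis could be set up on synthetic data. It uses a linear mixed model with a random intercept per participant (via statsmodels) as a stand-in for the mixed-effects ANOVA the authors report; the column names and generated numbers are illustrative only.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Synthetic stand-in data: one row per participant with condition flags and
# pre/post trust on a 7-point scale (names and values are illustrative).
n = 240
df = pd.DataFrame({
    "participant": np.arange(n),
    "interpretability": rng.integers(0, 2, n),
    "feedback": rng.integers(0, 2, n),
})
df["pre_trust"] = rng.normal(4.0, 1.0, n).clip(1, 7)
df["post_trust"] = (df["pre_trust"] + 0.45 * df["feedback"] + rng.normal(0, 0.8, n)).clip(1, 7)

def mean_absolute_error(predictions, outcomes):
    """Prediction accuracy as operationalized in the summary: mean absolute error."""
    return float(np.mean(np.abs(np.asarray(predictions) - np.asarray(outcomes))))

# Long format so pre/post becomes a within-subject factor, then a linear mixed
# model with a random intercept per participant (a stand-in for the paper's
# mixed-effects ANOVA).
long = df.melt(id_vars=["participant", "interpretability", "feedback"],
               value_vars=["pre_trust", "post_trust"],
               var_name="time", value_name="trust")
model = smf.mixedlm("trust ~ time * interpretability * feedback",
                    data=long, groups=long["participant"])
print(model.fit().summary())
```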

Results showed a striking asymmetry. Model interpretability alone did not produce a statistically significant increase in trust (p = 0.38), and its effect size was negligible (Cohen's d ≈ 0.12, η² ≈ 0.02). By contrast, providing outcome feedback yielded a robust boost in trust: post‑experiment trust scores rose by an average of 0.45 points on a 7‑point scale (p < 0.001, η² ≈ 0.18). Feedback also modestly reduced decision time (≈12% faster) but did not dramatically improve accuracy. The interaction between interpretability and feedback was weak; interpretability added only a small, non‑significant performance edge when feedback was already present. Overall, both factors produced modest gains in prediction accuracy (effect sizes below 0.05), indicating that higher trust does not automatically translate into proportionally higher task performance.
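For readers who want to relate the reported numbers to their definitions, the helper functions below implement the standard textbook formulas for Cohen's d (pooled standard deviation) and eta squared; they are generic utilities, not code from the paper.

```python
import numpy as np

def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, using the pooled standard deviation."""
    a, b = np.asarray(group_a, float), np.asarray(group_b, float)
    pooled_var = (((len(a) - 1) * a.var(ddof=1) + (len(b) - 1) * b.var(ddof=1))
                  / (len(a) + len(b) - 2))
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

def eta_squared(ss_effect, ss_total):
    """Eta squared: the share of total variance attributable to one effect."""
    return ss_effect / ss_total

# Example: an effect accounting for 18% of total variance gives eta^2 = 0.18,
# the order of magnitude reported here for outcome feedback.
print(eta_squared(ss_effect=18.0, ss_total=100.0))
```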

The authors interpret these findings in three ways. First, the prevailing research emphasis on generating explanations may be misplaced if explanations do not meaningfully affect user trust. Explanations must be coupled with mechanisms that help users integrate them into their decision processes—e.g., training, interactive exploration, or context‑specific tailoring. Second, direct experience of outcomes (feedback) appears to be the primary driver of trust formation. Seeing one’s errors and the AI’s errors creates a concrete basis for evaluating reliability, which is more persuasive than abstract rationales. Third, trust and performance are partially dissociated; increasing trust alone is insufficient for performance gains, and over‑trust could even lead to complacency.

Limitations include the relatively simple, synthetic prediction task, which may not capture the complexity of real‑world decision environments, and the narrow definition of interpretability (visual explanations only). Future work should explore diverse explanation modalities (textual, conversational, causal), longer‑term usage scenarios, and domain‑specific tasks where stakes are higher.

In sum, the study provides empirical evidence that outcome feedback is a more potent lever than model interpretability for building user trust in AI. It calls for a shift in research and design priorities: rather than focusing solely on producing explanations, developers should ensure that explanations are actionable and that users receive clear, timely feedback on AI performance. Only by aligning trust‑building mechanisms with actual performance improvements can AI systems achieve broader acceptance and effective human‑AI collaboration.

