Original Paper
The Trust in AI-Generated Health Advice (TAIGHA) Scale and Short Version (TAIGHA-S): Development and Validation Study
Marvin Kopka1,2*, Azeem Majeed3, Gabriella Spinelli4, Austen El-Osta2,5, Markus Feufel1
1 Division of Ergonomics, Department of Psychology and Ergonomics (IPA), Technische Universität Berlin, Berlin, Germany
2 Self-Care Academic Research Unit (SCARU), School of Public Health, Imperial College London, London, United Kingdom
3 Department of Public Health and Primary Care, Imperial College London, London, United Kingdom
4 College of Engineering, Design and Physical Sciences, Brunel University of London, Uxbridge, United Kingdom
5 School of Life Course and Population Sciences, King’s College London, London, United Kingdom
Austen El-Osta and Markus Feufel share last authorship.
Abstract
Artificial Intelligence tools such as large language models are increasingly used by
the public to obtain health information and guidance. In health-related contexts,
following or rejecting AI-generated advice can have direct clinical implications.
Existing instruments such as the Trust in Automated Systems Survey assess the
trustworthiness of generic technology, but no validated instrument specifically
measures users’ trust in AI-generated health advice. This study developed and
validated the Trust in AI-Generated Health Advice (TAIGHA) scale and its four-
item short form (TAIGHA-S) as theory-based instruments measuring trust and
distrust, each with cognitive and affective components. The items were developed
using a generative AI approach, followed by content validation with 10 domain
experts, face validation with 30 lay participants, and psychometric validation with
385 UK participants who received AI-generated advice in a symptom-assessment
scenario. After automated item reduction, 28 items were retained and reduced to
10 based on expert ratings. TAIGHA showed excellent content validity (S-
CVI/Ave=0.99) and CFA confirmed a two-factor model with excellent fit (CFI=0.98,
TLI=0.98, RMSEA=0.07, SRMR=0.03). Internal consistency was high (α=0.95).
Convergent validity was supported by correlations with the Trust in Automated
Systems Survey (r=0.67/−0.66) and users’ reliance on the AI’s advice (r=0.37 for
trust), while divergent validity was supported by low correlations with reading flow
and mental load (all |r|<0.25). TAIGHA-S correlated highly with the full scale
(r=0.96) and showed good reliability (α=0.88). TAIGHA and TAIGHA-S are
validated instruments for assessing user trust and distrust in AI-generated health
advice. Reporting trust and distrust separately permits a more complete evaluation
of AI interventions, and the short scale is well-suited for time-constrained settings.
Keywords: Artificial Intelligence; Health Advice; Trust; Distrust; Scale; Questionnaire; Measurement; Medical Decision-Making; Advice-Taking; Large Language Models
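For readers less familiar with the psychometric indices reported in the abstract, the sketch below illustrates how the scale-level content validity index (S-CVI/Ave), Cronbach’s α, and the Pearson correlations used to assess convergent validity are conventionally computed. All data, item counts, and variable names here are hypothetical placeholders rather than the study’s data; the two-factor confirmatory factor analysis itself would be fitted with a dedicated structural equation modelling package (e.g., lavaan or semopy) rather than by hand.

```python
import numpy as np

rng = np.random.default_rng(42)

# --- Content validity: S-CVI/Ave ---
# Hypothetical relevance ratings from 10 experts for 10 items,
# coded 1 = item rated relevant, 0 = item rated not relevant.
expert_ratings = rng.binomial(1, 0.98, size=(10, 10))
item_cvi = expert_ratings.mean(axis=0)   # I-CVI: proportion of experts endorsing each item
s_cvi_ave = item_cvi.mean()              # S-CVI/Ave: mean I-CVI across all items

# --- Internal consistency: Cronbach's alpha ---
def cronbach_alpha(item_scores: np.ndarray) -> float:
    """item_scores: respondents x items matrix of scale responses."""
    k = item_scores.shape[1]
    item_variances = item_scores.var(axis=0, ddof=1)
    total_variance = item_scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical responses: 385 respondents, 10 items driven by one latent trait plus noise.
latent_trait = rng.normal(size=(385, 1))
responses = latent_trait + rng.normal(scale=0.6, size=(385, 10))
alpha = cronbach_alpha(responses)

# --- Convergent validity: Pearson correlation with an external criterion ---
taigha_score = responses.mean(axis=1)
external_scale = taigha_score + rng.normal(scale=0.5, size=385)  # placeholder criterion measure
r_convergent = np.corrcoef(taigha_score, external_scale)[0, 1]

print(f"S-CVI/Ave = {s_cvi_ave:.2f}, alpha = {alpha:.2f}, r = {r_convergent:.2f}")
```

Because TAIGHA reports trust and distrust as separate factors, the same reliability and correlation steps would in practice be applied to each item subset separately.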
Introduction
Given the growing popularity, availability and performance of generative Artificial
Intelligence (AI) tools such as Large Language Models (LLMs), the public are
increasingly using these technologies to obtain health information and guidance
for a variety of health-related tasks and decisions [1]. This growing reliance on AI-
generated information is particularly consequential in health-related contexts,
where following or rejecting an AI tool’s advice can have personal, clinical and
safety consequences, as well as broader impacts on healthcare systems [2–4].
The recent case of a ChatGPT user who was hospitalised for bromism after
following advice on how to reduce salt intake demonstrated these risks [5].
Similarly, LLMs may also inadvertently spread misinformation when generating
inaccurate or fabricated content [6,7]. Whereas such incidents exemplify potential
dangers, the same technology also promises to make healthcare more efficient.
Emerging empirical evidence suggests that these risks are amplified by users’ high
levels of trust in AI-generated medical advice. A recent MIT study [8] found that
patients often trust medical recommendations produced by AI systems more than
those provided by human clinicians, even when the AI advice is demonstrably
incorrect. Notably, participants were less likely to critically challenge AI-generated
guidance and more inclined to follow it with confidence, raising concerns about
overreliance and reduced skepticism in decision-making. This tendency is
particularly problematic in health contexts, where misplaced trust may lead to
harmful self-management behaviours, delayed clinical intervention, or inappropriate treatment decisions.
At a system level, for instance, LLMs may support patient empowerment, decision-making, and health education in community settings [9–12]. For non-experts, determining the accuracy of information or advice provided by an AI decision support tool (DST) is often challenging, yet they must still decide whether to act on it.