Overstating Attitudes, Ignoring Networks: LLM Biases in Simulating Misinformation Susceptibility
Large language models (LLMs) are increasingly used as proxies for human judgment in computational social science, yet their ability to reproduce patterns of susceptibility to misinformation remains unclear. We test whether LLM-simulated survey respondents, prompted with participant profiles drawn from social survey data measuring network, demographic, attitudinal and behavioral features, can reproduce human patterns of misinformation belief and sharing. Using three online surveys as baselines, we evaluate whether LLM outputs match observed response distributions and recover feature-outcome associations present in the original survey data. LLM-generated responses capture broad distributional tendencies and show modest correlation with human responses, but consistently overstate the association between belief and sharing. Linear models fit to simulated responses exhibit substantially higher explained variance and place disproportionate weight on attitudinal and behavioral features, while largely ignoring personal network characteristics, relative to models fit to human responses. Analyses of model-generated reasoning and LLM training data suggest that these distortions reflect systematic biases in how misinformation-related concepts are represented. Our findings suggest that LLM-based survey simulations are better suited for diagnosing systematic divergences from human judgment than for substituting it.
💡 Research Summary
The paper investigates whether large language models (LLMs) can faithfully simulate human respondents in surveys that measure susceptibility to misinformation. The authors construct detailed participant profiles from three real‑world online surveys covering public‑health, climate‑change, and pandemic‑politics misinformation. Each profile contains three blocks of information: egocentric network characteristics (e.g., number of discussion partners, network heterogeneity), demographic variables (age, gender, education, income, etc.), and attitudinal/behavioral measures (political leaning, trust in science, media use, health literacy). Using these profiles, the authors prompt a diverse set of LLMs—including GPT‑4, Claude‑2, LLaMA‑2, and several open‑source models of varying size and instruction‑tuning—to answer the same false‑claim items that human participants answered. The prompts ask the model to adopt the described persona and to rate each claim on a 1‑7 accuracy scale and on a 1‑7 sharing‑intention scale, returning a JSON object with a single integer response.
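The persona-prompting pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's verbatim prompt: the block names, prompt wording, and the `build_prompt`/`parse_response` helpers are assumptions introduced here for clarity.

```python
import json


def build_prompt(profile: dict, claim: str) -> str:
    """Assemble a persona prompt from a participant profile.

    `profile` maps block names (network, demographics,
    attitudes/behavior) to short textual descriptions; the exact
    phrasing below is illustrative only.
    """
    blocks = "\n".join(f"- {k}: {v}" for k, v in profile.items())
    return (
        "Adopt the persona described below and answer as that person.\n"
        f"{blocks}\n"
        f'Claim: "{claim}"\n'
        "Rate the claim's accuracy from 1 (definitely false) to 7 "
        "(definitely true), and your likelihood of sharing it from 1 to 7.\n"
        'Reply with a JSON object of the form '
        '{"accuracy": <int>, "sharing": <int>} and nothing else.'
    )


def parse_response(raw: str) -> tuple[int, int]:
    """Extract the two integer ratings, validating the 1-7 range."""
    obj = json.loads(raw)
    acc, share = int(obj["accuracy"]), int(obj["sharing"])
    if not (1 <= acc <= 7 and 1 <= share <= 7):
        raise ValueError(f"rating out of range: {obj}")
    return acc, share
```

Constraining the reply to a bare JSON object keeps the responses machine-parsable across models with very different chat styles; the range check catches the occasional out-of-scale answer.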
Two robustness manipulations are applied: (1) the order of the three profile blocks is shuffled to test sensitivity to presentation, and (2) composite scores summarising each block are supplied instead of raw item‑level data. For chat‑style models, a chain‑of‑thought field is optionally added to elicit intermediate reasoning.
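Both manipulations are straightforward to implement; the sketch below assumes each block is a dict of numeric item-level values, which is a simplification of the paper's actual composites.

```python
import random
from statistics import mean


def shuffle_blocks(profile: dict, seed: int) -> dict:
    """Manipulation 1: reorder the profile blocks to test whether the
    model is sensitive to presentation order."""
    rng = random.Random(seed)
    keys = list(profile)
    rng.shuffle(keys)
    return {k: profile[k] for k in keys}


def composite_scores(profile: dict) -> dict:
    """Manipulation 2: collapse each block's item-level values into a
    single mean score (assumes numeric items; the paper's composites
    may be weighted or standardized differently)."""
    return {block: round(mean(items.values()), 2)
            for block, items in profile.items()}
```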
The evaluation proceeds in three stages. First, distributional similarity between human and model responses is quantified using Jensen–Shannon divergence (JSD) and Earth Mover's Distance (EMD). LLMs produce distributions that are broadly similar (JSD ≈ 0.12–0.18, EMD ≈ 0.45–0.62) but show systematic shifts, especially an over‑representation of high sharing scores. Second, item‑level Pearson correlations reveal a moderate alignment for belief ratings (average r ≈ 0.46) but a weak alignment for sharing intentions (average r ≈ 0.31). Third, linear regression and LASSO models are fitted separately to the human and simulated outcomes to compare explained variance (R²) and feature importance. In the human data, egocentric network variables account for roughly 22 % of the total R², while attitudinal/behavioral variables contribute about 38 %. In contrast, in the LLM‑generated data, network features explain less than 3 % of the variance, whereas attitudinal and behavioral predictors dominate, explaining over 58 %. This pattern indicates that LLMs heavily overweight the variables that are well‑represented in their pre‑training corpora (e.g., political ideology, trust in science) and largely ignore the more sparsely represented network descriptors.
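The first-stage distributional comparison can be reproduced with standard SciPy tools, as in this sketch. Note that `scipy.spatial.distance.jensenshannon` returns the JS *distance* (the square root of the divergence); whether the paper reports the distance or the divergence is an assumption here, so the sketch squares it to obtain the divergence.

```python
import numpy as np
from scipy.spatial.distance import jensenshannon
from scipy.stats import wasserstein_distance


def distribution_gap(human: np.ndarray, model: np.ndarray) -> tuple[float, float]:
    """Compare two samples of integer 1-7 ratings.

    Returns (JSD, EMD): Jensen-Shannon divergence between the binned
    response distributions, and the Earth Mover's (Wasserstein-1)
    distance between the raw samples.
    """
    bins = np.arange(1, 9)  # edges for the seven integer rating bins
    p, _ = np.histogram(human, bins=bins, density=True)
    q, _ = np.histogram(model, bins=bins, density=True)
    jsd = jensenshannon(p, q, base=2) ** 2  # squared distance = divergence
    emd = wasserstein_distance(human, model)
    return float(jsd), float(emd)
```

On a 1-7 scale, EMD has a direct reading in scale points: an EMD of 0.5 means the simulated distribution is, on average, half a response category away from the human one.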
Qualitative analysis of the model‑generated reasoning strings shows frequent statements such as “people who believe a claim are more likely to share it,” confirming that the models have internalized a simplified belief‑sharing causal heuristic. The authors also examine the effect of the two prompt variations: changing block order has minimal impact, whereas providing composite scores reduces response variance but further attenuates the influence of network features, suggesting that detailed network information is essential for preserving its predictive power.
Overall, the study concludes that while LLM‑based survey simulations can approximate aggregate response patterns and capture some well‑known attitudinal effects, they systematically overstate the link between belief and sharing and underrepresent the role of personal networks. These biases stem from the way misinformation‑related concepts are encoded in the models’ training data. Consequently, LLMs are better suited as diagnostic tools for identifying divergences from human judgment rather than as full replacements for human respondents in computational social‑science research on misinformation dynamics.