Exploring Community-Powered Conversational Agent for Health Knowledge Acquisition: A Case Study in Colorectal Cancer

Notice: This research summary and analysis were automatically generated using AI technology. For full accuracy, please refer to the original ArXiv source.

Online communities have become key platforms where young adults actively seek and share information, including health knowledge. However, these users often face challenges when browsing these communities, such as fragmented content, varying information quality, and unfamiliar terminology. Based on a survey with 56 participants and follow-up interviews, we identify common challenges and expected features for learning health knowledge. In this paper, we develop a computational workflow that integrates community content into a conversational agent named CanAnswer to facilitate health knowledge acquisition. Using colorectal cancer as a case study, we evaluate CanAnswer through a lab study with 24 participants and interviews with six medical experts. Results show that CanAnswer improves participants' recall of gained knowledge and reduces the task workload of the learning session. Our expert interviews (N=6) further confirm the reliability and usefulness of CanAnswer. We discuss the generality of CanAnswer and provide design considerations for enhancing the usefulness and credibility of community-powered learning tools.


💡 Research Summary

This paper presents the design, development, and evaluation of “CanAnswer,” a community-powered conversational agent (CA) designed to facilitate health knowledge acquisition for users, particularly young adults, browsing online communities.

The research began with a survey of 56 participants and follow-up interviews with six, identifying key challenges users face: fragmented content, variable information quality, and unfamiliar medical terminology. Users expressed a desire for CA features such as suggested follow-up questions, topic-switching examples, input auto-completion, and access to real-world case stories.

To address these needs, the authors propose a novel computational workflow that intelligently integrates data from two distinct types of online communities: 1) Doctor-led communities (e.g., Dingxiang Doctor), which provide professionally vetted, reliable medical information, and 2) Peer-led communities (e.g., RED/Xiaohongshu), which offer concrete personal experiences and emotional support. This workflow processes three data sources using a Large Language Model (LLM): a small, expert-curated Base Dataset, a larger set of verified Q&A pairs from doctor-led communities (Disease Lookup), and relevant personal stories from peer-led communities (Real-world Cases). This enables CanAnswer to provide authoritative answers, suggest context-aware follow-up questions, and link to relatable peer experiences.
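The three-source composition described above can be illustrated with a minimal sketch. This is not the paper's implementation: the dataset names, the keyword-matching lookup, and the response structure below are all hypothetical stand-ins, and the actual workflow uses an LLM rather than exact-match retrieval.

```python
# Hypothetical sketch of a three-source answer pipeline: a curated base
# dataset and verified Q&A take priority for the answer; peer-led posts
# are attached as relatable real-world cases. All data here is illustrative.

BASE_DATASET = {
    "what is colorectal cancer": "A cancer that begins in the colon or rectum.",
}
DISEASE_LOOKUP = {  # verified Q&A pairs from a doctor-led community
    "what are early symptoms": "Changes in bowel habits, blood in the stool, etc.",
}
REAL_WORLD_CASES = {  # posts from a peer-led community, keyed by question
    "what are early symptoms": ["Post: my screening experience at age 30"],
}

def answer(question: str) -> dict:
    """Return an authoritative answer, related peer stories, and follow-ups."""
    key = question.lower().strip("?")
    # Prefer the expert-curated base dataset, then verified community Q&A.
    text = BASE_DATASET.get(key) or DISEASE_LOOKUP.get(key) \
        or "No verified answer found."
    return {
        "answer": text,
        "cases": REAL_WORLD_CASES.get(key, []),            # peer experiences
        "follow_ups": [q for q in DISEASE_LOOKUP if q != key][:3],
    }
```

In the real system, an LLM would handle paraphrased questions and generate the context-aware follow-up suggestions; the fixed dictionaries here only show how the sources are layered.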

The system was implemented and evaluated using colorectal cancer as a case study. The final dataset comprised 24 base QA pairs, 2048 disease-specific QA pairs, 400 doctor-patient conversations from a doctor-led community, and 246 relevant posts from a peer-led community.

A between-subjects lab study with 24 participants compared CanAnswer (experimental condition) against a baseline of using raw community interfaces and a standard LLM. Results showed that participants using CanAnswer recalled a significantly greater amount of knowledge in a post-session test and reported a significantly lower task workload, as measured by NASA-TLX. Qualitative feedback indicated that the community-powered suggested follow-up questions were particularly valuable for guiding the learning journey.

Furthermore, interviews with six medical experts confirmed the perceived reliability and usefulness of CanAnswer’s design. Experts valued the integration of professionally sourced data for accuracy and acknowledged the practical and emotional benefits of the peer-sourced real-world cases, though they concurrently expressed concerns about the inherent credibility risks of information from peer-led platforms.

The paper concludes by discussing the generality of the proposed workflow for other health topics and provides design considerations for future community-powered learning tools, emphasizing the balance between leveraging rich community data and ensuring information credibility, transparency of sources, and appropriate user guidance.

