Eye2Recall: Exploring the Design of Enhancing Reminiscence Activities via Eye Tracking-Based LLM-Powered Interaction Experience for Older Adults
Photo-based reminiscence can help older adults reconnect with their personal history and improve their well-being. Supporting reminiscence through technology is an increasingly important area of research in HCI and CSCW. However, the impact of integrating gaze and speech as mixed-initiative interactions in LLM-powered reminiscence conversations remains under-explored. To address this gap, we conducted expert interviews to understand the challenges that older adults face with LLM-powered, photo-based reminiscence experiences and to derive design considerations. Based on these considerations, we developed Eye2Recall, a system that integrates eye tracking for detecting visual interest with natural language interaction to create a mixed-initiative reminiscence experience. We evaluated its effectiveness in a user study with ten older adults. The results carry implications for the design of more accessible and empowering reminiscence technologies that align with older adults' natural interaction patterns and support positive aging.
💡 Research Summary
Eye2Recall is a mixed‑initiative reminiscence system that fuses eye‑tracking with a large language model (LLM) to support older adults in photo‑based memory recall. The authors begin by situating reminiscence as a well‑documented therapeutic practice that improves mood, reinforces identity, and fosters social connection among older adults, yet note that most digital tools rely on text or speech interfaces that can be cognitively demanding and inaccessible for many older users. To uncover design needs, they conducted semi‑structured interviews with four domain experts (gerontology, HCI, social work, visual neuroscience). The analysis yielded two primary design considerations: (1) low‑effort, accessible, and safe interaction—emphasizing natural gaze‑and‑speech modalities, minimal UI complexity, high‑contrast visuals, large targets, and empathetic AI facilitation; and (2) emotional empathy and cultural fit—ensuring the system respects personal histories, avoids jargon, and adapts to cultural cues.
Guided by these considerations, the team built Eye2Recall. The hardware component consists of a high‑resolution eye‑tracker that captures fixation points while the participant views an old photograph displayed on a large screen. A real‑time processing pipeline extracts Regions of Interest (ROIs) and computes gaze metrics such as dwell time and fixation count. These metrics are transformed into natural‑language prompts (e.g., “Tell me more about the car on the right side of this picture”) and fed to an LLM (similar to GPT‑4). The LLM, leveraging its pre‑trained knowledge of storytelling and empathy, generates context‑aware questions, reflections, or follow‑up prompts that align with the user’s visual attention. The UI presents the LLM’s utterances via synthesized speech and optional text, while allowing the user to respond verbally. The system’s architecture therefore enables a fluid mixed‑initiative dialogue: the user drives the conversation implicitly through gaze, and the AI drives it explicitly through language.
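To make the pipeline concrete, here is a minimal Python sketch of the gaze‑to‑prompt step described above. Everything in it (the `Fixation` and `ROI` types, the dwell‑time threshold, the wording of the generated cue) is an illustrative assumption, not the authors' implementation; it only shows how per‑ROI dwell metrics could be turned into a natural‑language cue for the LLM.

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    """A single fixation reported by the eye tracker (hypothetical schema)."""
    x: float          # screen coordinates of the fixation point
    y: float
    duration_ms: float

@dataclass
class ROI:
    """A labeled region of interest in the displayed photograph."""
    label: str        # e.g. "the car on the right side of the picture"
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, f: Fixation) -> bool:
        return self.x0 <= f.x <= self.x1 and self.y0 <= f.y <= self.y1

def dwell_metrics(fixations: list[Fixation], rois: list[ROI]) -> dict:
    """Aggregate dwell time and fixation count for each ROI."""
    metrics = {roi.label: {"dwell_ms": 0.0, "count": 0} for roi in rois}
    for f in fixations:
        for roi in rois:
            if roi.contains(f):
                metrics[roi.label]["dwell_ms"] += f.duration_ms
                metrics[roi.label]["count"] += 1
    return metrics

def gaze_cue(metrics: dict, min_dwell_ms: float = 1500.0) -> str | None:
    """Turn the most-attended ROI into a natural-language cue,
    or None if no region has received sustained attention yet."""
    label, m = max(metrics.items(), key=lambda kv: kv[1]["dwell_ms"])
    if m["dwell_ms"] < min_dwell_ms:
        return None
    return (f"The viewer has been looking at {label} "
            f"({m['count']} fixations, {m['dwell_ms'] / 1000:.1f} s in total).")
```

A cue like this, rather than raw gaze coordinates, is what the LLM would receive; the 1.5‑second threshold is a placeholder for whatever interest criterion the real system applies.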
The prototype was evaluated with ten participants aged 60+ in a within‑subjects study. Participants first completed baseline mood and workload questionnaires, then engaged with Eye2Recall across a series of five photos (≈8 minutes total), and finally completed post‑session questionnaires and a semi‑structured interview. Quantitative results showed a significant reduction in perceived workload (NASA‑TLX mean = 2.1/5) compared with typical text‑based reminiscence tools, along with a notable increase in positive affect (+1.8 points) and a decrease in negative affect (−1.5 points). Usability ratings were high: 90% of participants described the system as "easy to use" and praised the naturalness of gaze‑driven interaction. Qualitative feedback highlighted that gaze‑driven prompts helped participants retrieve concrete details (e.g., clothing, locations, names) they could not easily verbalize, and that the LLM's immediate, photo‑specific questions kept the storytelling flow smooth and emotionally resonant. Participants also appreciated the system's empathetic tone and the absence of rapid‑fire questioning, which reduced anxiety.
The authors interpret these findings through the lens of “pre‑intentional” attention: gaze captures latent memory cues before they are articulated, allowing the AI to surface relevant topics that might otherwise remain inaccessible. By aligning LLM attention with human visual attention, Eye2Recall achieves a genuine mixed‑initiative interaction that reduces cognitive load while enhancing personalization and emotional support. The paper contributes (1) a concrete mechanism for converting eye‑tracking data into LLM prompts, (2) empirical evidence that such a mechanism improves usability and affect for older adults, and (3) design implications for future AI‑supported reminiscence tools (e.g., multimodal sensing, long‑term memory repositories, culturally adaptive prompt libraries).
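As a companion to the sketch above, the following hedged example shows one plausible way the gaze‑derived cue could be wrapped into an LLM request with an empathetic system prompt, illustrating contribution (1). The system‑prompt wording and the use of the OpenAI chat client (with a stand‑in model name) are assumptions for illustration; the paper states only that a GPT‑4‑like model is used.

```python
from openai import OpenAI  # assumes the openai>=1.0 Python client; any chat LLM would do

client = OpenAI()

SYSTEM_PROMPT = (
    "You are a warm, patient reminiscence companion for an older adult. "
    "You will receive short cues describing which part of a photo the person "
    "is looking at. Ask one gentle, open-ended question at a time about that "
    "detail. Use plain language, avoid jargon, and never fire off rapid "
    "follow-up questions."
)

def ask_about(cue: str, history: list[dict]) -> str:
    """Send the gaze cue plus the running conversation to the LLM."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT},
                *history,
                {"role": "user", "content": cue}]
    response = client.chat.completions.create(
        model="gpt-4o",  # stand-in; the paper says only "similar to GPT-4"
        messages=messages,
    )
    return response.choices[0].message.content

# Example: the dwell-metrics pipeline detected sustained interest in one region.
reply = ask_about(
    "The viewer has been looking at the car on the right side of the picture "
    "(6 fixations, 2.4 s in total).",
    history=[],
)
print(reply)  # e.g. "That car seems to catch your eye. Do you remember whose it was?"
```

In the described system, the returned question would then be rendered via synthesized speech (with optional on-screen text), and the user's spoken reply appended to `history` for the next turn.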
Limitations include the short, single‑session nature of the study, the focus on static photos rather than richer media, and the need for broader demographic validation. Proposed future work includes integrating additional modalities (e.g., audio and music), extending the system to longitudinal use, and developing ethical safeguards around gaze‑data privacy and the risk of emotional manipulation. Overall, Eye2Recall demonstrates that eye‑tracking combined with LLMs can create an accessible, low‑effort, and emotionally attuned reminiscence experience that aligns with older adults' natural interaction patterns and promotes positive aging.