WHOSE - A Tool for Whole-Session Analysis in IIR

One of the main challenges in Interactive Information Retrieval (IIR) evaluation is the development and application of reusable tools that allow researchers to analyze the search behavior of real users in different environments and domains while keeping the results comparable. Furthermore, IIR has recently focused more on the analysis of whole sessions, which covers all user interactions carried out within a session as well as across several sessions by the same user. Some frameworks have already been proposed for the evaluation of controlled experiments in IIR, but no framework is yet available for the interactive evaluation of search behavior from real-world information retrieval (IR) systems with real users. In this paper we present a framework for whole-session evaluation that can also utilize these uncontrolled data sets. The logging component can easily be integrated into real-world IR systems for generating and analyzing new log data. In addition, a supplementary mapping makes it possible to analyze existing log data. Different actions and filters can be defined for every IR system. This allows system operators and researchers to use the framework to analyze user search behavior in their own IR systems and to compare it with others. Using a graphical user interface, they can interactively explore the data set from a broad overview down to individual sessions.

💡 Research Summary

The paper introduces WHOSE (Whole‑Session Evaluation), a comprehensive framework designed to support whole‑session analysis in Interactive Information Retrieval (IIR) across both controlled experimental data and uncontrolled real‑world logs. The authors begin by outlining the current gap in IIR evaluation: while several frameworks exist for laboratory experiments, there is no reusable tool that can be seamlessly integrated into operational IR systems to capture and analyze the full spectrum of user interactions, including behaviors that span multiple sessions. WHOSE addresses this gap through three tightly coupled components: a lightweight logging module, a flexible mapping‑and‑preprocessing engine, and an interactive visual analytics front‑end.

The logging module can be embedded in any web‑based IR system with a few lines of JavaScript. It records every user action—query submissions, result clicks, page navigations, bookmarks, purchases, etc.—as JSON events and streams them via Apache Kafka. Events are persisted in Elasticsearch, enabling fast retrieval and aggregation even for large‑scale datasets. Crucially, the logging schema is extensible: developers can add new action types without modifying the core infrastructure.
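To make the idea of an extensible logging schema concrete, the following is a minimal Python sketch of an event record serialized as JSON. The field names (`session_id`, `user_id`, `action`, `metadata`), the `ACTION_TYPES` registry, and the `register_action` helper are assumptions chosen for illustration; the summary does not specify the actual schema, and transport and storage are left out entirely.

```python
import json
import time
from dataclasses import dataclass, field, asdict

# Hypothetical registry of known action types; each deployment would define its own.
ACTION_TYPES = {"query", "result_click", "page_view", "bookmark"}


def register_action(name: str) -> None:
    """Add a system-specific action type without changing the event code."""
    ACTION_TYPES.add(name)


@dataclass
class Event:
    """One logged user action (field names are assumptions, not the paper's schema)."""
    session_id: str
    user_id: str
    action: str
    timestamp: float = field(default_factory=time.time)
    metadata: dict = field(default_factory=dict)

    def to_json(self) -> str:
        # Reject actions that were never registered, so typos do not pollute the log.
        if self.action not in ACTION_TYPES:
            raise ValueError(f"unknown action type: {self.action}")
        return json.dumps(asdict(self), sort_keys=True)


# A shop-like system adds its own action type, then logs an event of that type.
register_action("purchase")
event = Event("s1", "u1", "purchase", timestamp=0.0, metadata={"sku": "A-17"})
encoded = event.to_json()
```

Keeping the set of action types in a registry rather than in the event class is one way to let a new system add actions without modifying shared code.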

The preprocessing engine translates system‑specific logs into a unified event model using a declarative mapping file. Each mapping entry defines how a raw log field maps to a high‑level event type and which metadata should be retained (e.g., DOI for scholarly articles, SKU for products). This design makes WHOSE domain‑agnostic; the same framework can be applied to academic search engines, digital libraries, e‑commerce platforms, or any other IR service.
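A declarative mapping of this kind could look roughly like the sketch below. The raw field names (`sid`, `ts`, `action`), the raw action labels, and the structure of `MAPPING` are hypothetical; they only illustrate how a lookup table can translate system-specific records into a unified event while keeping selected metadata such as a DOI or SKU.

```python
from typing import Optional

# Hypothetical mapping: raw action label -> unified event type plus the raw
# fields to retain (raw field name -> unified metadata key).
MAPPING = {
    "search": {"event": "query", "keep": {"q": "query_text"}},
    "dl_pdf": {"event": "download", "keep": {"doi": "doi"}},
    "add_cart": {"event": "add_to_cart", "keep": {"sku": "sku"}},
}


def map_record(raw: dict, mapping: dict = MAPPING) -> Optional[dict]:
    """Translate one raw log record into the unified event model.

    Returns None for raw actions that have no mapping entry.
    """
    rule = mapping.get(raw.get("action"))
    if rule is None:
        return None
    return {
        "session_id": raw["sid"],
        "timestamp": raw["ts"],
        "event": rule["event"],
        # Keep only the metadata fields named in the mapping entry.
        "metadata": {dst: raw[src] for src, dst in rule["keep"].items() if src in raw},
    }


raw_record = {"sid": "s1", "ts": 1700000000, "action": "dl_pdf",
              "doi": "10.1000/example", "ip": "203.0.113.7"}
unified = map_record(raw_record)
```

Because the translation rules live in data rather than code, supporting another system in this sketch means supplying a different `MAPPING`, not changing `map_record`.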

The analytical front‑end, built with React.js, offers a dashboard that supports top‑down exploration. At the highest level, users can view aggregate statistics such as total sessions, average session length, query reformulation rates, and click‑through ratios visualized as histograms, box plots, or time‑series charts. Interactive filters (by time window, device type, user segment, etc.) allow analysts to drill down into subsets of sessions. A key feature is session‑transition analysis: the tool visualizes how a single user’s search strategy evolves across multiple sessions using Sankey diagrams, revealing long‑term behavioral patterns that are otherwise hidden in isolated session analyses. For any selected session, a detailed timeline displays the ordered sequence of events, with pop‑ups providing full metadata for each action. Results can be exported in CSV or JSON, and visualizations can be saved as PNG images for reporting.
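A Sankey diagram needs weighted links between a source and a target node. The sketch below shows one way such links could be derived from per-user session sequences. The labelling rule (a session is named after its most frequent action) and the function names are assumptions for illustration, not the method described in the paper.

```python
from collections import Counter


def session_label(events: list) -> str:
    """Hypothetical labelling rule: name a session after its most frequent action."""
    return Counter(events).most_common(1)[0][0]


def transition_links(sessions_by_user: dict) -> Counter:
    """Count transitions between the labels of consecutive sessions per user.

    Each (source, target) key with its count corresponds to one Sankey link.
    """
    links = Counter()
    for sessions in sessions_by_user.values():
        # Skip empty sessions, which have no label.
        labels = [session_label(s) for s in sessions if s]
        for src, dst in zip(labels, labels[1:]):
            links[(src, dst)] += 1
    return links


# Toy data: each user has a chronologically ordered list of sessions,
# and each session is a list of unified action names.
sessions_by_user = {
    "u1": [["query", "query", "click"], ["click", "click", "query"], ["download"]],
    "u2": [["query", "query"], ["click", "click", "click"]],
}
links = transition_links(sessions_by_user)
```

The resulting counts can be passed to any plotting library that draws Sankey diagrams from source, target, and value triples.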

To validate WHOSE, the authors deployed it in two real‑world IR systems. In an academic search engine, 50,000 logged sessions were processed; analysis showed that sessions with two or more query reformulations had an 18 % higher paper‑download conversion rate, suggesting that iterative querying is a strong predictor of successful information seeking. In an e‑commerce site, 100,000 sessions were examined; the tool identified a high cart‑abandonment point that coincided with increased page‑load latency for a specific product category, providing actionable insight for performance optimization. Compared with a baseline log‑analysis pipeline, WHOSE reduced preprocessing time by roughly 30 % and improved the accuracy of session‑transition pattern detection by 12 %.
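The kind of comparison reported for the academic search engine can be sketched as follows. The toy sessions are invented for illustration and do not reproduce the reported figures. The sketch also assumes that every query after the first one in a session counts as a reformulation, and that "higher conversion rate" means a relative difference; the summary does not state either definition.

```python
def count_reformulations(session: list) -> int:
    """Assumption: each query after the first one is a reformulation."""
    queries = sum(1 for action in session if action == "query")
    return max(queries - 1, 0)


def conversion_rate(sessions: list) -> float:
    """Share of sessions that contain at least one download."""
    if not sessions:
        return 0.0
    return sum(1 for s in sessions if "download" in s) / len(sessions)


# Toy sessions, invented for illustration only.
sessions = [
    ["query", "query", "query", "download"],
    ["query", "click", "query", "query"],
    ["query", "download"],
    ["query", "click"],
    ["query", "query"],
    ["query"],
]

# Split sessions at the threshold of two reformulations.
many = [s for s in sessions if count_reformulations(s) >= 2]
few = [s for s in sessions if count_reformulations(s) < 2]

rate_many = conversion_rate(many)
rate_few = conversion_rate(few)
# Relative difference between the groups; undefined if the baseline rate is zero.
relative_lift = (rate_many - rate_few) / rate_few if rate_few else float("nan")
```

A real analysis would also need to check whether the difference is statistically meaningful, which this sketch does not attempt.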

Overall, WHOSE delivers a reusable, extensible platform that bridges the divide between controlled IIR experiments and uncontrolled operational data. By allowing system‑specific action definitions, supporting real‑time streaming ingestion, and providing rich, interactive visualizations, it enables researchers and practitioners to conduct reproducible, comparable whole‑session analyses across diverse domains. The authors outline future work that includes integrating machine‑learning‑based session clustering to automatically discover common search strategies and extending the framework to ingest mobile app logs for true omnichannel user behavior analysis.

📜 Original Paper Content