A Novel Combined Term Suggestion Service for Domain-Specific Digital Libraries

📝 Abstract

Interactive query expansion can assist users during their query formulation process. We conducted a user study with over 4,000 unique visitors and four different design approaches for a search term suggestion service. As a basis for our evaluation we have implemented services which use three different vocabularies: (1) user search terms, (2) terms from a terminology service and (3) thesaurus terms. Additionally, we have created a new combined service which utilizes thesaurus terms and terms from a domain-specific search term recommender. Our results show that the thesaurus-based method is clearly used more often than the other single-method implementations. We interpret this as a strong indicator that term suggestion mechanisms should be domain-specific to be close to the user terminology. Our novel combined approach, which interconnects a thesaurus service with additional statistical relations, outperformed all other implementations. All our observations show that domain-specific vocabulary can support the user in finding alternative concepts and formulating queries.

📄 Content

A Novel Combined Term Suggestion Service for Domain-Specific Digital Libraries

Daniel Hienert, Philipp Schaer, Johann Schaible and Philipp Mayr

GESIS – Leibniz Institute for the Social Sciences,
Lennéstr. 30, 53113 Bonn, Germany
{Daniel.Hienert, Philipp.Schaer, Johann.Schaible, Philipp.Mayr}@gesis.org

Abstract. Interactive query expansion can assist users during their query formulation process. We conducted a user study with over 4,000 unique visitors and four different design approaches for a search term suggestion service. As a basis for our evaluation we have implemented services which use three different vocabularies: (1) user search terms, (2) terms from a terminology service and (3) thesaurus terms. Additionally, we have created a new combined service which utilizes thesaurus terms and terms from a domain-specific search term recommender. Our results show that the thesaurus-based method is clearly used more often than the other single-method implementations. We interpret this as a strong indicator that term suggestion mechanisms should be domain-specific to be close to the user terminology. Our novel combined approach, which interconnects a thesaurus service with additional statistical relations, outperformed all other implementations. All our observations show that domain-specific vocabulary can support the user in finding alternative concepts and formulating queries.

Keywords: Evaluation, Term Suggestion, Query Suggestion, Thesaurus, Digital Libraries, Interactive Query Expansion.

1 Introduction

A general and long-known problem with keyword-based search is the so-called “vocabulary problem” or “wording problem” [6]. The same information need or search query can be expressed in a variety of ways. Current web search engines often retrieve a list of documents in which some relevant items are always included, but this is mostly a phenomenon of the very large document index. Thus, even when using a “wrong” term, there is still a high probability of getting a non-empty result set.

When we analyze today’s Digital Library (DL) systems or domain-specific databases, a controlled vocabulary, usually a thesaurus, is used to index the publications. DLs often consist of metadata entries on the specific publications; descriptive abstracts are optional. In this situation the vocabulary problem can become quite serious. If the searcher does not use one of the controlled terms with which the document was indexed, the chance of getting relevant documents is low, and the chance of retrieving an empty result set is significantly higher. Users tend to adapt their search strategies to work around these drawbacks. In a user study done by Aula et al. [1], one expert articulated: “I choose search terms based not specifically on the information I want, but rather on how I could imagine someone wording […] that information.”

Modern information-seeking support systems (ISSS) try to make use of a variety of automated approaches to transform and expand textual queries, e.g. by using stop word lists, stemming or spelling correction. From the perspective of interface design, interactive query reformulation is still an open research issue [11].

In this paper we present the results of a user study with more than 4,000 unique visitors in the online information portal Sowiport. Users were confronted with three basic term suggestion services based on (1) user search terms, (2) terms from a terminology service and (3) terms from a domain-specific thesaurus. As a novel approach, we have created a term suggestion service that combines thesaurus terms and terms from a domain-specific search term recommender. We present related work in section 2, followed by the evaluated vocabularies and services in section 3. We proceed with the conducted evaluation in section 4 and present results in section 5. We conclude this paper with a discussion in section 6.

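To make the idea of a combined suggestion service more concrete, the following is a minimal sketch, not the implementation used in the study. It assumes that the "statistical relations" can be approximated by simple term co-occurrence counts over indexed documents, and that the two sources are merged by listing thesaurus terms first. The toy thesaurus, the documents and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy thesaurus: controlled term -> related controlled terms.
THESAURUS = {
    "unemployment": ["labor market", "employment policy"],
    "labor market": ["unemployment", "wage"],
}

# Hypothetical index: each document is the set of controlled terms assigned to it.
DOCUMENTS = [
    {"unemployment", "labor market", "youth"},
    {"unemployment", "youth", "education"},
    {"labor market", "wage"},
    {"unemployment", "youth"},
]


def cooccurrence_counts(documents):
    """Count how often two terms are assigned to the same document."""
    counts = Counter()
    for terms in documents:
        for a, b in combinations(sorted(terms), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


def statistical_suggestions(term, counts, limit=3):
    """Return the terms that co-occur most often with `term`."""
    related = [(other, n) for (t, other), n in counts.items() if t == term]
    # Sort by descending count, then alphabetically for a stable order.
    related.sort(key=lambda pair: (-pair[1], pair[0]))
    return [other for other, _ in related[:limit]]


def combined_suggestions(term, thesaurus, counts, limit=5):
    """Merge thesaurus terms and co-occurring terms, dropping duplicates.

    Assumption: thesaurus terms are listed first; the study may rank differently.
    """
    merged = []
    candidates = thesaurus.get(term, []) + statistical_suggestions(term, counts)
    for candidate in candidates:
        if candidate not in merged:
            merged.append(candidate)
    return merged[:limit]


counts = cooccurrence_counts(DOCUMENTS)
print(combined_suggestions("unemployment", THESAURUS, counts))
```

In this toy setting, a term that is missing from the thesaurus but appears in the index still receives co-occurrence-based suggestions, while a term unknown to both sources yields an empty list.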
2 Related Work

We will present two different perspectives on query reformulation tools: the origin of the proposed terms and the different types of reformulation tools.

As Efthimiadis [5] points out, interactive query expansion (IQE) can be divided into two types of IQE mechanisms: (1) those that are based on collection-dependent or collection-independent knowledge structures and (2) those that are based on the search results. The difference between these approaches is the origin of the data from which terms are proposed. The terms that are presented to the user can be retrieved either from a knowledge structure such as a thesaurus or from the documents that are included in the search result (e.g. to perform pseudo-relevance feedback). Regarding these characteristics, Vechtomova et al. [18] compared two approaches for query expansion (QE) based on term co-occurrences. The first approach was a global co-location analysis where the entire document collection was used to extract related terms. The second approach only used ter

This content is AI-processed based on ArXiv data.
