Semantic Evolutionary Concept Distances for Effective Information Retrieval in Query Expansion

📝 Original Info

  • Title: Semantic Evolutionary Concept Distances for Effective Information Retrieval in Query Expansion
  • ArXiv ID: 1701.05311
  • Date: 2017-01-20
  • Authors: C. H. C. Leung, Alfredo Milani, Yuanxi Li, Valentina Franzoni

📝 Abstract

In this work, several semantic approaches to concept-based query expansion and reranking schemes are studied and compared with different ontology-based expansion methods in web document search and retrieval. In particular, we focus on concept-based query expansion schemes, where, in order to effectively increase the precision of web document retrieval and to decrease the users' browsing time, the main goal is to quickly provide users with the most suitable query expansion. Two key tasks for query expansion in web document retrieval are to find the expansion candidates, as the closest concepts in the web document domain, and to rank the expanded queries properly. The approach we propose aims at improving the expansion phase for better web document retrieval and precision. The basic idea is to measure the distance between candidate concepts using the PMING distance, a collaborative semantic proximity measure, i.e. a measure which can be computed by using statistical results from a web search engine. Experiments show that the proposed technique can provide users with more satisfying expansion results and improve the quality of web document retrieval.

📄 Full Content

Collective Evolutionary Concept Distance Based Query Expansion for Effective Web Document Retrieval

C. H. C. Leung, Dept. of Computer Science, Hong Kong Baptist University, Hong Kong, clement@comp.hkbu.edu.hk
Alfredo Milani, Dept. of Mathematics and Computer Science, University of Perugia, Perugia, Italy, milani@unipg.it
Yuanxi Li, Dept. of Computer Science, Hong Kong Baptist University, Hong Kong, yxli@comp.hkbu.edu.hk
Valentina Franzoni, Dept. of Mathematics and Computer Science, University of Perugia, Perugia, Italy, valentina.franzoni@dmi.unipg.it

Abstract: In this work, several semantic approaches to concept-based query expansion and re-ranking schemes are studied and compared with different ontology-based expansion methods in web document search and retrieval. In particular, we focus on concept-based query expansion schemes, where, in order to effectively increase the precision of web document retrieval and to decrease the users’ browsing time, the main goal is to quickly provide users with the most suitable query expansion. Two key tasks for query expansion in web document retrieval are to find the expansion candidates, as the closest concepts in the web document domain, and to rank the expanded queries properly. The approach we propose aims at improving the expansion phase for better web document retrieval and precision. The basic idea is to measure the distance between candidate concepts using the PMING distance, a collaborative semantic proximity measure, i.e. a measure which can be computed by using statistical results from a web search engine. Experiments show that the proposed technique can provide users with more satisfying expansion results and improve the quality of web document retrieval.

Keywords: web document retrieval; concept distance; PMING distance; semantic similarity measures; query expansion; precision and recall

I. INTRODUCTION

Query expansion (QE) is the process of reformulating a seed query to improve retrieval performance in information retrieval operations [1]. In the context of web search engines, query expansion involves evaluating a user’s input (the words typed into the search query area, and sometimes other types of data) and expanding the search query to match additional documents. Query expansion involves techniques such as the following:

  • Finding synonyms of words, and searching for the synonyms as well
  • Finding all the various morphological forms of words by stemming each word in the search query
  • Fixing spelling errors and automatically searching for the corrected form or suggesting it in the results
  • Re-weighting the terms in the original query
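
The techniques listed above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the synonym table, the crude suffix-stripping "stemmer", the function names and the 0.5 weight for synonyms are hypothetical and do not come from the paper. Spelling correction is left out to keep the example short.

```python
from collections import defaultdict

# Hypothetical toy thesaurus; a real system would use an ontology or thesaurus.
SYNONYMS = {"car": ["automobile"], "picture": ["image", "photo"]}
# Crude suffix stripping, standing in for a real stemming algorithm.
SUFFIXES = ("ing", "s")


def naive_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def expand_query(query, synonym_weight=0.5):
    """Return a term -> weight map for the expanded query."""
    weights = defaultdict(float)
    for term in query.lower().split():
        stem = naive_stem(term)
        weights[stem] += 1.0  # stemmed original terms keep full weight
        for synonym in SYNONYMS.get(stem, []):
            weights[synonym] += synonym_weight  # added terms are down-weighted
    return dict(weights)


if __name__ == "__main__":
    print(expand_query("Cars pictures"))
```

In this sketch the query "Cars pictures" is stemmed to "car" and "picture", and each synonym is added with a lower weight so that the original terms still dominate the expanded query.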

Query expansion is a widely studied methodology in computer science, particularly in natural language processing and information retrieval. Most casual users of IR systems type short queries, and recent research [3] has shown that adding new words to such queries can improve retrieval effectiveness. In web document search engines, the goal of query expansion is that, by increasing recall, precision can potentially increase (rather than decrease), because the result set then includes pages that are more relevant (of higher quality), or at least equally relevant. With query expansion, pages that are likely to be relevant but would otherwise be missed can be included. To increase the precision of web document retrieval and decrease the users’ browsing time, the most important task is to provide users with the most suitable expanded queries quickly. Therefore, finding the closest expansion candidate concepts in the web document domain and ranking the expanded queries properly are the two main issues for query expansion in web document retrieval. Our work focuses on these two targets to improve the expansion results for better precision. In particular, the use of a semantic proximity measure, the PMING distance [2, 37], is proposed and evaluated experimentally.
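
To make the idea of a proximity measure computed from search engine statistics more concrete, the sketch below ranks expansion candidates by their distance from a seed concept. The PMING formula is not given in this part of the text, so the sketch swaps in two well-known measures of the same kind, pointwise mutual information (PMI) and the Normalized Google Distance (NGD); it is not the authors' method. The hit counts, the index size `n` and all function names are hypothetical, and all counts are assumed to be strictly positive.

```python
import math


def pmi(fx, fy, fxy, n):
    """Pointwise mutual information (base 2) from occurrence counts."""
    return math.log2((fxy * n) / (fx * fy))


def ngd(fx, fy, fxy, n):
    """Normalized Google Distance from occurrence counts (all counts > 0)."""
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


def rank_candidates(seed, candidates, hits, pair_hits, n):
    """Sort candidate concepts by increasing NGD from the seed concept."""
    scored = [
        (ngd(hits[seed], hits[c], pair_hits[(seed, c)], n), c)
        for c in candidates
    ]
    return [c for _, c in sorted(scored)]


if __name__ == "__main__":
    # Hypothetical hit counts, as a search engine might report them.
    n = 1_000_000
    hits = {"jaguar": 1000, "car": 10000, "cat": 8000, "piano": 5000}
    pair_hits = {
        ("jaguar", "car"): 500,
        ("jaguar", "cat"): 200,
        ("jaguar", "piano"): 2,
    }
    print(rank_candidates("jaguar", ["piano", "cat", "car"], hits, pair_hits, n))
```

Candidates that co-occur often with the seed relative to their own frequency get a small distance and are ranked first, which mirrors the two tasks named above: finding close concepts and ordering the resulting expansions.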

This paper is organized as follows. Related work on query expansion and proximity measures is introduced in Section II; the proposed distance-based query expansion system for web document search is presented in detail in Section III. The experimental results are reported in Section IV, followed by conclusions in the last section.

II. RELATED WORKS

A. Expansion techniques

In order to find the candidate concepts for query expansion in web document domain, different classes of expansion techniques can be considered.

One of the main approaches to query expansion consists in using the associativity rules underlying the domain and the context of the query. For example, if a document contains two objects/concepts, say U and V, where only U is indexed, then searching for V will not return the web document in the query result, even though V is present in the web document but for some reason it has not been explicitly i

…(Full text truncated)…

Reference

This content is AI-processed based on ArXiv data.
