Query Sense Discovery Approach to Realize the User's Search Intent

Tarek Chenaina, Sameh Neji, Abdullah Shoeb
Copyright: © 2022 | Pages: 18
DOI: 10.4018/IJIRR.289609

Abstract

The main goal of information retrieval is to return the documents most relevant to a user’s query. A search engine must therefore understand not only the meaning of each keyword in the query but also the sense each keyword takes in the context of the query. Discovering the meaning of a query is a comprehensive and evolutionary process: the precise meaning is established as the associations between concepts develop. The meaning determination process is modeled by a dynamic system operating in the semantic space of WordNet. To capture the meaning of a user query, the original query is reformulated into candidate queries by combining the concepts and their synonyms. A semantic score characterizing the overall meaning of each candidate query is calculated, and the candidate with the highest score is used to perform the search. The results confirm that the proposed "Query Sense Discovery" approach provides a significant improvement in several performance measures.

Introduction

The major objective of Information Retrieval (IR) systems is to find relevant documents for a user’s query (Grechanik et al., 2010; Zhai et al., 2015). Many IR systems are based on the traditional bag of words (BOW) approach, in which the different meanings of the query keywords are not taken into consideration; this leads to ambiguity caused by polysemy. In most cases, the words contained in a query are polysemous. It is often possible to understand the meaning of a word from the set of words used with it; this is the notion of context. For example, the word “note” may mean “a notation representing the pitch and duration of a musical sound”, “a brief written record”, or “a piece of paper money”. Disambiguation lies in the capacity of the system to exhibit the relevant synonyms of a concept, i.e. to determine the precise sense that the concept has in the query context.
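As an illustration of how context can select a sense, the following minimal sketch applies a simplified Lesk-style gloss overlap, a standard baseline and not the approach proposed in this article. The sense labels, the stopword list, and the function names are assumptions made for the example; the glosses for “note” are the ones quoted above.

```python
# Toy sense inventory: the glosses are quoted from the text above,
# the short sense labels ("music", "record", "money") are hypothetical.
SENSES = {
    "note": {
        "music": "a notation representing the pitch and duration of a musical sound",
        "record": "a brief written record",
        "money": "a piece of paper money",
    }
}

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "the", "of", "and", "to", "in", "for", "on"}


def tokens(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def disambiguate(word, query):
    """Return the sense of `word` whose gloss shares the most words with the query."""
    context = tokens(query) - {word}
    overlaps = {
        label: len(context & tokens(gloss))
        for label, gloss in SENSES[word].items()
    }
    # On a tie, max() keeps the first sense in dictionary order.
    best = max(overlaps, key=overlaps.get)
    return best, overlaps


if __name__ == "__main__":
    print(disambiguate("note", "pitch of a musical note"))
    print(disambiguate("note", "paper money note"))
```

With only the word “note” as a query, every overlap is zero and the choice is arbitrary, which is the ambiguity the paragraph above describes.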

Several works have addressed the problem of query disambiguation (ALMasri et al., 2016; Fernández-Reyes et al., 2018; Serizawa and Kobayashi, 2013). In order to retrieve the truly relevant documents, the majority of works on disambiguation (Hirst et al., 1998; Khan and Feng Luo, 2003; Mihalcea and Moldovan, 2000) address the problem by measuring the similarity between the initial query and the documents. This method is not optimal: the query must be disambiguated independently of the documents, because the ambiguity intrinsically linked to the concepts of the query degrades search effectiveness.

Recent works (Bobed and Mena, 2016; Yan et al., 2017; Zingla et al., 2016) on “query expansion” add terms similar to those initially used. These terms are suggested from the documents returned for the original query (blind expansion, relevance feedback, etc.), from a linguistic resource, or from the query logs. Applying these approaches risks introducing noise into the search results (query drift), which yields a deviation from the user’s intention. The first approach suffers from the large size of Web resources, which degrades its effectiveness. Moreover, these approaches do not help in getting closer to the user's desired meaning, because the disambiguation does not focus only on the original terms of the query, which obstructs the process of discovering the meaning of the query. Indeed, there is a lack of understanding of the factors that influence the query’s meaning and the results they produce, owing to the effect of the relative positions of the words. This is due to the interrelationship of several parameters, such as the dispersion of the concepts over the branches of the ontology, their depth, and the semantic similarity between the concepts and their predecessors.

In this work, the case of a user requiring information on a specific topic through a query (ad-hoc search) is studied. One of the main problems in this type of search is detecting the meaning of the query in relation to the user's information need. WordNet (George A. Miller, 1995), one of the best-known and most widely used external information resources, is used in the proposed approach. WordNet provides a conceptual framework for the structured representation of the query’s context, in which nouns, verbs, adjectives and adverbs are organized by a variety of semantic relationships. Each concept has a set of synsets (synonym sets), each of which represents one of its senses.
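The reformulation step outlined in the abstract (combine the concepts with their synonyms, score each candidate query, keep the best one) can be sketched as follows. This is only an illustration under stated assumptions: the synonym table stands in for WordNet synsets, the relatedness values are invented for the example, and the mean pairwise score is a placeholder, since the article's actual semantic score is not given in this preview.

```python
from itertools import product

# Hypothetical synonym table standing in for WordNet synsets.
SYNONYMS = {
    "car": ["car", "automobile"],
    "price": ["price", "cost"],
}

# Hypothetical pairwise relatedness values, invented for illustration only.
RELATEDNESS = {
    frozenset(("car", "price")): 0.5,
    frozenset(("car", "cost")): 0.4,
    frozenset(("automobile", "price")): 0.7,
    frozenset(("automobile", "cost")): 0.3,
}


def candidate_queries(query):
    """Build every query obtained by swapping each term for one of its synonyms."""
    terms = query.lower().split()
    options = [SYNONYMS.get(term, [term]) for term in terms]
    return [" ".join(combo) for combo in product(*options)]


def semantic_score(candidate):
    """Placeholder score: mean relatedness over all pairs of terms."""
    terms = candidate.split()
    pairs = [(a, b) for i, a in enumerate(terms) for b in terms[i + 1:]]
    if not pairs:
        return 0.0
    total = sum(RELATEDNESS.get(frozenset(pair), 0.0) for pair in pairs)
    return total / len(pairs)


def best_query(query):
    """Return the candidate query with the highest placeholder score."""
    return max(candidate_queries(query), key=semantic_score)


if __name__ == "__main__":
    for candidate in candidate_queries("car price"):
        print(candidate, semantic_score(candidate))
    print("selected:", best_query("car price"))
```

The number of candidates grows as the product of the synonym counts per term, so a practical system would likely need to prune senses before enumerating combinations.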
