Semantic Query Expansion using Cluster Based Domain Ontologies

Suruchi Chawla (Department of Computer Science, Shaheed Rajguru College of Applied Science, University of Delhi, Delhi, India)
Copyright: © 2012 | Pages: 16
DOI: 10.4018/ijirr.2012040102

Abstract

Information on the web has been growing at a very rapid pace and has become quite voluminous over the past few years. A user's search query often fails to retrieve enough relevant documents, which lowers the precision of the search results. To improve precision, this paper proposes an algorithm for semantic query expansion using domain ontologies built on clustered web query sessions. A domain ontology is created for each cluster of query sessions. The user's input query is used to select the most similar cluster, and the domain ontology of the selected cluster suggests related concepts for query expansion. The expanded query is then used for information retrieval to test its effectiveness. The experiment was conducted on captured user query sessions on the web, and the results demonstrate the efficacy of the proposed approach.

Introduction

Information on the web is growing rapidly and has become quite voluminous. Users search this vast collection with search queries. These queries, which express the user's information need, usually consist of very few keywords (Jansen, Spink, Bateman, & Saracevic, 1998; Baeza-Yates & Ribeiro-Neto, 1999; Gudivada, Raghavan, Grosky, & KasanaGottu, 1997). The vocabulary used by authors on the web, on the other hand, is highly diverse. The keywords in a search query are therefore often not enough to infer the user's information need, and the query must express that need better in order to bridge the gap between the user's vocabulary and that of web authors. Research has already addressed this problem by expanding the user query with related keywords so that the information need can be inferred more accurately (Cui, Wen, Nie, & Ma, 2003; Xu & Croft, 1996).

One work in this direction, Chawla and Bedi (2008), uses clusters of query sessions of clicked URLs on the web to expand the user's input query. Keywords are selected for query expansion using the correlation of co-occurrence between the input query keywords and the keywords of clicked URLs present in the clustered query sessions. In parallel, research has also been under way on ontology-based query expansion (Sack, 2005; Revuri, Upandhyaya, & Sreenivasa Kumar, 2006). To improve search precision further, the two techniques, clustered query sessions and domain ontologies, should be fused for query expansion. This paper therefore proposes an algorithm for semantic query expansion using domain ontologies based on clustered query sessions.
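
The co-occurrence idea described above can be illustrated with a minimal sketch. It does not reproduce the correlation measure of Chawla and Bedi (2008); the scoring rule (the fraction of sessions containing a query keyword in which a candidate keyword also appears among the clicked-URL keywords), the function name `cooccurrence_expansion`, and the sample sessions are all assumptions made for illustration.

```python
from collections import Counter

def cooccurrence_expansion(sessions, query, top_k=2):
    """Rank candidate expansion keywords by co-occurrence with the query terms.

    sessions: list of (query_keywords, clicked_url_keywords) pairs.
    The score used here is an illustrative assumption, not the paper's measure.
    """
    query_terms = set(query.lower().split())
    term_sessions = Counter()  # sessions whose query contains keyword q
    pair_sessions = Counter()  # sessions with q in the query and t in clicked URLs
    for session_query, clicked_terms in sessions:
        for q in set(session_query):
            term_sessions[q] += 1
            for t in set(clicked_terms):
                pair_sessions[(q, t)] += 1

    scores = Counter()
    for (q, t), n in pair_sessions.items():
        if q in query_terms and t not in query_terms:
            scores[t] += n / term_sessions[q]

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ranked[:top_k]]

# Hypothetical query sessions, invented for this example.
SESSIONS = [
    (["java", "tutorial"], ["programming", "class", "object"]),
    (["java"], ["programming", "jvm"]),
    (["coffee", "java"], ["bean", "island"]),
]

suggestions = cooccurrence_expansion(SESSIONS, "java")
print(suggestions)
```

In this toy data "programming" co-occurs with "java" in two of three sessions, so it ranks first. Keywords chosen this way carry no explicit semantics, which is the gap the ontology-based approach aims to fill.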

In this paper, semantic query expansion is performed using the domain ontology associated with each cluster of query session keyword vectors. Query session keyword vectors are generated from user query sessions on the web using Information Scent and the content of the clicked URLs present in the sessions. The vectors are then clustered so that sessions with a similar information need are grouped together, and a domain ontology is created for each cluster. The algorithm proposed in Revuri et al. (2006) generates related concepts for an input query of one or two terms. This paper extends it to generate related concepts for a multiterm input query, within the proposed approach of query expansion using domain ontologies based on clustered query sessions.
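
As a rough illustration of generating related concepts for a multiterm query, the sketch below treats an ontology as a mapping from a concept to its directly related concepts and ranks candidates by how many query terms they are linked to. This is not the algorithm of Revuri et al. (2006) or its extension in this paper; the `ONTOLOGY` contents, the function name `related_concepts`, and the vote-based ranking are hypothetical.

```python
from collections import Counter

# Hypothetical toy ontology: each concept maps to directly related concepts.
# The entries are invented for illustration and are not taken from the paper.
ONTOLOGY = {
    "testing": {"unit testing", "test case", "quality assurance"},
    "software": {"requirements", "design", "quality assurance"},
    "design": {"uml", "architecture"},
}

def related_concepts(query_terms, ontology, top_k=3):
    """Return concepts related to a multiterm query.

    A concept linked to several query terms gets more votes and ranks higher
    (an assumed ranking rule, chosen only to show the multiterm case).
    """
    votes = Counter()
    for term in query_terms:
        for concept in ontology.get(term, ()):
            if concept not in query_terms:
                votes[concept] += 1
    # Most votes first; ties are broken alphabetically for determinism.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [concept for concept, _ in ranked[:top_k]]

concepts = related_concepts(["software", "testing"], ONTOLOGY)
print(concepts)
```

Here "quality assurance" is related to both query terms and therefore ranks ahead of concepts linked to only one of them.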

The experiment was conducted in the domains of academics, entertainment and sports. A software engineering ontology (academics), a movies ontology (entertainment) and a sports ontology were created using Protégé 3.x. The proposed query expansion algorithm was implemented with SPARQL and Jena in the Eclipse IDE, and the user query was entered through a console-based interface. The input query keywords were used to select the most similar cluster, and the domain ontology associated with the selected cluster was used to identify related concepts for expanding the input query. The expanded query was then issued to the Google search engine to retrieve the search results. More relevant documents were retrieved, at higher precision, than with the simple keyword-based query expansion over the same clustered query sessions proposed in Chawla and Bedi (2008).
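
The selection and expansion steps described above can be sketched as follows. The paper's implementation used SPARQL and Jena; this Python version is only a simplified stand-in. Cosine similarity as the cluster-matching measure, the centroid weights, and the per-cluster concept lists (which stand in for the output of an ontology lookup) are all assumptions for illustration.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse keyword-weight vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def select_cluster(query, centroids):
    """Pick the cluster whose centroid is most similar to the query keywords."""
    query_vector = Counter(query.lower().split())
    return max(centroids, key=lambda name: cosine(query_vector, centroids[name]))

def expand_query(query, centroids, cluster_concepts, max_terms=2):
    """Append related concepts of the selected cluster to the query."""
    cluster = select_cluster(query, centroids)
    present = set(query.lower().split())
    extra = [c for c in cluster_concepts[cluster] if c not in present][:max_terms]
    return cluster, " ".join([query] + extra)

# Hypothetical cluster centroids and related concepts, invented for this example.
CENTROIDS = {
    "academics": {"software": 0.8, "engineering": 0.6, "testing": 0.5},
    "entertainment": {"movie": 0.9, "actor": 0.6, "review": 0.4},
    "sports": {"cricket": 0.8, "score": 0.6, "match": 0.5},
}
CLUSTER_CONCEPTS = {
    "academics": ["requirements", "design"],
    "entertainment": ["director", "genre"],
    "sports": ["tournament", "team"],
}

cluster, expanded = expand_query("software testing", CENTROIDS, CLUSTER_CONCEPTS)
print(cluster, "->", expanded)
```

The expanded string would then be submitted to a search engine in place of the original query. Capping the number of added concepts with `max_terms` is a design choice of this sketch, made to keep the expanded query short.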

The paper is organized as follows: first, we discuss related work. The next section presents the proposed Query Expansion Algorithm using Cluster based Domain Ontology. We then discuss the Experimental Results and conclude the paper.
