Domain-Based Dynamic Ranking

Sutirtha Kumar Guha (Seacom Engineering College, India), Anirban Kundu (Netaji Subhash Engineering College, India) and Rana Dattagupta (Jadavpur University, India)
Copyright: © 2015 | Pages: 18
DOI: 10.4018/978-1-4666-8676-2.ch017

This chapter proposes a domain-based ranking methodology for a cloud environment. Web pages from the cloud are clustered into a ‘Primary Domain’ and a ‘Secondary Domain’. Primary Domain Web pages are fetched by direct matching with the query keywords and are ranked based on a Relevancy Factor (RF) and a Turbulence Factor (TF). The Secondary Domain is constructed from Nearest Keywords and Similar Web pages: Nearest Keywords are keywords similar to the matched keywords, and Similar Web pages are Web pages that contain Nearest Keywords. Matched Web pages of the Primary and Secondary Domains are ranked separately. With the proposed approach, a wider range of Web pages from the cloud becomes available and is ranked more efficiently.
Chapter Preview

1. Introduction

1.1 Overview

An ever-increasing e-population and its associated services have led to an expansion of the Web domain. The role of the search engine is invaluable in navigating the vast expanse of the Web to find the most relevant information. Web pages reside in a cloud environment. In such an environment, it is desirable to search the Web pages in the cloud and rank the results so as to cater to the exact need of the user. The main functions of a search engine are to process the user’s query, to search for a match within the Web pages of the relevant cloud, and to present the matching results in ranked order.

1.2 Motivations

The main function of a search engine is to match the query given by the user against the search engine’s extensive database and to display the matched URLs in a ranked order. The most optimal and feasible result may not be obtained every time, because the search procedure is executed mechanically. A URL that does not fully match the user’s query but matches a few of its keywords may still meet the user’s requirements. It is nevertheless evident that URLs matching the user’s query exactly may also yield an optimal result.

1.3 Goal

In a typical search engine environment, Web pages containing matched keywords are fetched from a predefined database and ranked. A wide range of relevant Web pages that contain no matched keywords is omitted by this procedure. The proposed method covers these neglected Web pages: relevant Web pages are fetched and displayed irrespective of keyword matching.

1.4 Methodology Applied

A Relevance Factor and a Turbulence Factor are introduced to rank the Web pages whose keywords match the user query. A hierarchical Web database is proposed for calculating the Relevance Factor. The Turbulence Factor is calculated based on the effect on other Web pages of the same Web site. Relevant Web pages with unmatched keywords are collected using the concepts of nearest keywords and similar Web pages, as discussed in Section 3.2.
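
The split between pages with matched keywords and relevant pages with unmatched keywords can be illustrated with a minimal sketch. This is not the chapter's implementation: the function name split_domains, the in-memory page index, and the nearest-keyword table are all hypothetical, and the Relevance Factor and Turbulence Factor are deliberately left out because their formulas are not given in this preview. The sketch only shows how a query could be partitioned into a Primary Domain (direct keyword match) and a Secondary Domain (match on a nearest keyword only).

```python
from typing import Dict, Iterable, List, Set, Tuple


def split_domains(
    query_keywords: Iterable[str],
    pages: Dict[str, Set[str]],
    nearest: Dict[str, Set[str]],
) -> Tuple[List[str], List[str]]:
    """Partition page URLs into (primary, secondary) domains.

    pages:   hypothetical index mapping a URL to its keywords.
    nearest: hypothetical table mapping a keyword to similar keywords.
    """
    query = {k.lower() for k in query_keywords}

    # Nearest keywords: keywords similar to the query keywords,
    # excluding the query keywords themselves.
    near: Set[str] = set()
    for keyword in query:
        near |= {n.lower() for n in nearest.get(keyword, set())}
    near -= query

    primary: List[str] = []
    secondary: List[str] = []
    for url, keywords in pages.items():
        page_keywords = {k.lower() for k in keywords}
        if page_keywords & query:
            # Direct match with a query keyword: Primary Domain.
            primary.append(url)
        elif page_keywords & near:
            # Match on a nearest keyword only: Secondary Domain.
            secondary.append(url)

    # Sorted by URL only to make the output deterministic; the actual
    # ranking (RF and TF) is not modelled here.
    return sorted(primary), sorted(secondary)
```

Each returned list would then be ranked separately, as the chapter describes; the ranking step itself is outside the scope of this sketch.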


2. Background

A regularization-based algorithm called ranking adaptation SVM (RA-SVM) is proposed in (Geng, Yang, Xu, & Hua, 2012) as a unique ranking model. A ranking adaptability measurement is also proposed to estimate quantitatively whether an existing ranking model can be adapted to a new domain (Geng et al., 2012). Various applications have also been built on domain-based ranking algorithms. A prototype application that demonstrates ranking model adaptation, using a novel model for ranking search results that also adapts to new domains, is proposed in (Greeshma, Srinivasa Rao, & Krishnaiah, 2013). The experimental results in that paper are claimed to show that the proposed application is useful for searching data across domains (Greeshma et al., 2013). It is observed that traditional search techniques fail to interpret the significance of geographical clues and are unable to return highly relevant search results when users are interested in a set of location-sensitive topics. An innovative probabilistic ranking framework for domain information retrieval is proposed in (Li, Li, Lee, & Lee, 2009). The proposed method recognizes the geographical distribution of topic influence in the process of ranking documents and models it accurately using probabilistic Gaussian Process classifiers (Li et al., 2009). A framework is proposed in (Klementiev, Roth, Small, & Titov, 2009) to learn, without supervision, to aggregate the votes of constituent rankers with domain-specific expertise. The learning framework is applied to the settings of aggregating full rankings and aggregating top-k lists, demonstrating significant improvements over a domain-agnostic baseline in both cases (Klementiev et al., 2009).
