Rough Set Based Aggregation for Effective Evaluation of Web Search Systems

Rashid Ali (Aligarh Muslim University, India) and M. M. Sufyan Beg (Jamia Millia Islamia, India)
DOI: 10.4018/978-1-4666-0294-6.ch008


Rank aggregation is the process of generating a single aggregated ranking from a given set of rankings. In an industrial environment, there are many applications where rank aggregation must be applied. Rough set based rank aggregation is a user feedback based technique that mines ranking rules for rank aggregation using rough set theory. In this chapter, the authors discuss the rough set based rank aggregation technique in the light of Web search evaluation. Since many search engines are available that industrial houses can use to advertise their products, Web search evaluation is essential for deciding which search engines to rely on. Here, the authors discuss the limitations of rough set based rank aggregation and present an improved version that is more suitable for aggregating different techniques for Web search evaluation. In the improved version, the authors incorporate the confidence of the rules in predicting a class for a given set of data. They validate the mined ranking rules by comparing the predicted user feedback based ranking with the actual user feedback based ranking. They present experimental results on the evaluation of seven public search engines using the improved version of rough set based aggregation for a set of 37 queries.
Chapter Preview

Here, we first discuss industrial web search.

Web searching is very important in an industrial environment. A large number of online users rely on Web search for business information. A growing number of people search for the best product or for the products that best suit their budget, which means that Web searching strongly influences consumers’ buying decisions. In turn, industrial marketers and sellers need to develop and manage their online business presence for Web search in order to achieve significant business growth. On their websites, they must provide information about events, offers, and promotions, allow for customer reviews, and include photos to help users familiarize themselves with their products and services. Since a large number of users use search engines to find products, it is essential for industrial marketers to approach search engines so that their products are placed high in the search results. Their presence in the search results ensures that they are reaching their targeted customer groups. This motivates many industrial companies to pay search engines for advertisements. A large number of search engines are now available. Therefore, evaluating Web search engines helps industrial marketers select the best search engines for their advertisements and invest their advertising money sensibly.

Key Terms in this Chapter

User Feedback: The feedback taken from the user.

Learning Ranking Rules: Learning the rules which are used for obtaining overall ranking of a set of items in a rank aggregation process.

Vector Space Model: A content based model that represents a document as a vector in an n-dimensional space, where each dimension represents a term and the similarity between two documents is measured by the cosine of the angle between the two vectors.
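
As an illustration of this definition, the following is a minimal Python sketch of cosine similarity between two documents. It assumes raw term counts as vector components and whitespace tokenization; the chapter may use a different term weighting, and the function name cosine_similarity is hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(doc_a: str, doc_b: str) -> float:
    """Cosine of the angle between the term-count vectors of two documents."""
    # Assumption: whitespace tokenization and raw term frequencies as weights.
    vec_a = Counter(doc_a.lower().split())
    vec_b = Counter(doc_b.lower().split())

    # Only terms present in both documents contribute to the dot product.
    dot = sum(vec_a[term] * vec_b[term] for term in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(count * count for count in vec_a.values()))
    norm_b = math.sqrt(sum(count * count for count in vec_b.values()))

    # An empty document has no direction, so treat its similarity as zero.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


print(cosine_similarity("web search engine", "web search"))
print(cosine_similarity("web search", "rough set"))
```

Documents that share no terms score 0, and documents with proportional term counts score close to 1.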

PageRank: A link based algorithm that measures the importance of a document (Web page) not only in terms of the number of documents that link to it but also in terms of the importance of those linking documents. It is used by the popular search engine Google.
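
The idea can be sketched with a simplified power iteration in Python. This is an illustrative sketch, not the production algorithm of any search engine: the damping factor of 0.85, the fixed iteration count, the even redistribution of rank from pages without outgoing links, and the function name pagerank are all assumptions made for the example.

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Simplified PageRank by power iteration over a small link graph.

    links maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread over all pages.
        dangling = sum(rank[page] for page in pages if not links.get(page))
        new_rank = {page: (1.0 - damping) / n + damping * dangling / n
                    for page in pages}
        # Each page passes an equal share of its rank to every page it links to.
        for page, targets in links.items():
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)
        rank = new_rank
    return rank


# Hypothetical four-page graph: "c" has three inbound links, "a" has one
# inbound link (from the important page "c"), "b" and "d" have none.
example_graph = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
for page, score in sorted(pagerank(example_graph).items()):
    print(page, round(score, 4))
```

In this toy graph, page "a" has only one inbound link but still ranks well above "b" and "d", because that single link comes from the most important page.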

Web Search Evaluation: Measuring the performance of a Web search engine.

Boolean Similarity Measures: A content based model where each document is represented by a Boolean expression of its terms and the similarity between two documents is measured using the presence or absence of terms in their Boolean expressions.
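
One simple presence/absence measure is the Jaccard coefficient over the sets of terms in two documents. The sketch below uses it only as an example of this family of measures; it is an assumption that it resembles the measure used in the chapter, and the function name jaccard_similarity is hypothetical.

```python
def jaccard_similarity(doc_a: str, doc_b: str) -> float:
    """Share of distinct terms that two documents have in common."""
    # Only the presence or absence of a term matters, so term counts are dropped.
    terms_a = set(doc_a.lower().split())
    terms_b = set(doc_b.lower().split())
    union = terms_a | terms_b
    if not union:
        return 0.0
    return len(terms_a & terms_b) / len(union)


# Two shared terms out of four distinct terms.
print(jaccard_similarity("web search engine", "web search evaluation"))  # 0.5
```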

Rough Set Theory: An approach to dealing with vagueness in data, which expresses vagueness by employing a boundary region of a set. A non-empty boundary region means that the set is rough. Otherwise, the set is crisp.
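
The boundary region can be made concrete with the standard lower and upper approximations of a set. The Python sketch below assumes a small, made-up decision table in which documents are described by two condition attributes; the attribute names, the data, and the function name approximations are hypothetical and are not taken from the chapter.

```python
from collections import defaultdict


def approximations(objects, attributes, target):
    """Return (lower, upper, boundary) approximations of the set target.

    objects maps an object name to a dict of attribute values; objects with
    identical values on attributes are indiscernible from one another.
    """
    # Group objects into indiscernibility classes.
    classes = defaultdict(set)
    for name, values in objects.items():
        classes[tuple(values[a] for a in attributes)].add(name)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:      # class lies entirely inside the target set
            lower |= block
        if block & target:       # class overlaps the target set
            upper |= block
    return lower, upper, upper - lower


# Hypothetical data: d3 and d5 look identical, but only d3 is in the target.
documents = {
    "d1": {"vsm": "high", "pagerank": "high"},
    "d2": {"vsm": "high", "pagerank": "high"},
    "d3": {"vsm": "high", "pagerank": "low"},
    "d4": {"vsm": "low", "pagerank": "low"},
    "d5": {"vsm": "high", "pagerank": "low"},
}
relevant = {"d1", "d2", "d3"}
lower, upper, boundary = approximations(documents, ["vsm", "pagerank"], relevant)
print(sorted(lower), sorted(upper), sorted(boundary))
```

Here the boundary region is {d3, d5}, so the set of relevant documents is rough with respect to these two attributes. With an empty boundary it would be crisp.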

Rank Aggregation: Combining rankings from different sources into a single aggregated ranking.
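
To make the definition concrete, the following Python sketch aggregates rankings with the Borda count, a classical positional method. It is shown only as a simple baseline and is not the rough set based technique discussed in the chapter; the function name borda_aggregate and the example rankings are hypothetical.

```python
def borda_aggregate(rankings):
    """Combine full rankings of the same items into one ranking (Borda count)."""
    if not rankings:
        raise ValueError("at least one ranking is required")
    items = set(rankings[0])
    scores = {item: 0 for item in items}

    for ranking in rankings:
        # Assumption: every source ranks exactly the same items, without ties.
        if set(ranking) != items or len(ranking) != len(items):
            raise ValueError("all rankings must cover the same items once each")
        n = len(ranking)
        for position, item in enumerate(ranking):
            scores[item] += n - 1 - position  # the top position earns n - 1 points

    # Highest total score first; equal scores are ordered by item name.
    return sorted(items, key=lambda item: (-scores[item], item))


# Three hypothetical sources ranking search engines A, B and C.
sources = [["A", "B", "C"], ["B", "A", "C"], ["A", "C", "B"]]
print(borda_aggregate(sources))  # ['A', 'B', 'C']
```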

Confidence: The accuracy of the rules.
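
A common way to quantify this in rule mining is the fraction of examples matching a rule's conditions whose class agrees with the rule's prediction. The Python sketch below assumes that definition; the chapter's exact formula may differ, and the attribute names, data, and function name rule_confidence are hypothetical.

```python
def rule_confidence(rows, conditions, decision_attr, predicted):
    """Fraction of rows matching conditions whose decision equals predicted."""
    matching = [row for row in rows
                if all(row.get(attr) == value for attr, value in conditions.items())]
    # A rule that matches no example has no supporting evidence.
    if not matching:
        return 0.0
    correct = sum(1 for row in matching if row[decision_attr] == predicted)
    return correct / len(matching)


# Hypothetical training examples with user feedback as the decision attribute.
examples = [
    {"vsm": "high", "pagerank": "high", "feedback": "relevant"},
    {"vsm": "high", "pagerank": "low", "feedback": "relevant"},
    {"vsm": "high", "pagerank": "low", "feedback": "not relevant"},
    {"vsm": "high", "pagerank": "high", "feedback": "relevant"},
    {"vsm": "low", "pagerank": "low", "feedback": "not relevant"},
]

# Rule: IF vsm = high THEN feedback = relevant (3 of 4 matching rows agree).
print(rule_confidence(examples, {"vsm": "high"}, "feedback", "relevant"))  # 0.75
```

When several mined rules match the same item but predict different classes, a confidence value of this kind gives one possible basis for preferring one rule over another.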
