A Genetic Fuzzy Semantic Web Search Agent Using Granular Semantic Trees for Ambiguous Queries

Yan Chen (Georgia State University, USA) and Yan-Qing Zhang (Georgia State University, USA)
DOI: 10.4018/978-1-60566-324-1.ch018

Abstract

For most Web search applications, queries are commonly ambiguous because words or phrases have different linguistic meanings for different Web users. Conventional keyword-based search engines cannot disambiguate queries to provide relevant results matching Web users’ intents. Traditional Word Sense Disambiguation (WSD) methods use statistical models or ontology-based knowledge systems to measure associations among words, and they rely on the contexts of queries for disambiguation. However, because numerous combinations of words may appear in queries and documents, it is difficult to extract concepts’ relations for all possible combinations. Moreover, queries are usually short, so their contexts do not always provide enough information for disambiguation. Therefore, the traditional WSD methods are not sufficient to provide accurate search results for ambiguous queries. In this chapter, a new model, the Granular Semantic Tree (GST), is introduced to represent associations among concepts more conveniently than the traditional WSD methods do. Additionally, users’ preferences are used to provide personalized search results that better adapt to users’ unique intents. Fuzzy logic is used to determine the most appropriate concepts related to queries based on contexts and users’ preferences. Finally, Web pages are analyzed by the GST model: the concepts of pages for the queries are evaluated, and the pages are re-ranked according to similarities of concepts between pages and queries.
Chapter Preview

In the theory of granular computing, granules such as subsets, classes, objects, and elements of a universe are the basic ingredients (Yao, 2007). More general granules are constructed by grouping finer granules based on available information and knowledge, such as similarity or functionality. Term-space granulation is used in the information retrieval and WSD areas (Yao, 2003). Terms are the basic granules, and a term hierarchy is constructed by clustering terms. New terms may then be assigned to clusters as labels; usually, those labels are more general than the terms in the cluster. The notion of term-space granulation serves as an effective tool for query disambiguation (QD) applications, and many researchers have applied term clustering to disambiguating queries.
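
The two-level structure described above, with terms as fine granules and cluster labels as coarser granules, can be sketched in a few lines of Python. This is only an illustration and not the chapter's GST model: the cluster labels, the terms, and the helper names `build_term_index` and `labels_for` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical clusters: label (coarser granule) -> terms (finer granules).
# The labels and memberships are invented for illustration only.
CLUSTERS = {
    "vacation": ["beach", "resort", "hotel"],
    "plant": ["tree", "leaf", "palm"],
    "electronics": ["handheld", "battery", "palm"],
}


def build_term_index(clusters):
    """Invert the hierarchy so each term maps to the labels of its clusters."""
    index = defaultdict(set)
    for label, terms in clusters.items():
        for term in terms:
            index[term].add(label)
    return index


def labels_for(index, term):
    """Return the more general labels for a term, sorted for stable output."""
    return sorted(index.get(term, set()))


index = build_term_index(CLUSTERS)
print(labels_for(index, "palm"))   # an ambiguous term belongs to several granules
print(labels_for(index, "beach"))  # an unambiguous term belongs to one
```

In this toy setup, a term that falls under more than one label is exactly the kind of term that needs disambiguation.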

One frequently used technique for term clustering is the Statistical Association method (Brena and Ramirez, 2006). By measuring the co-occurrences of words across large quantities of Web pages on the Internet, collections of word clusters can be built. For example, statistical analysis shows that the words “beach”, “resort”, “Florida” and “hotel” usually appear in the same Web pages, so they can be placed in the same cluster. The keyword “palm” has three meanings: a beach, a tree, and an electronic device. If “palm” and “resort” co-occur in one Web page, then “palm” should be interpreted in that page as a beach, not as a tree or an electronic device. This algorithm uses a statistical approach, instead of a knowledge-based approach, to express word relations, so no human knowledge or intervention is needed. However, since billions of Web pages exist on the Internet, it is very difficult to select samples for measuring the co-occurrence degrees of words.
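
As a rough illustration of the co-occurrence idea, the following Python sketch counts how often word pairs share a page in a tiny, made-up corpus and then picks the sense of “palm” that co-occurs most with the context word. It is a minimal sketch under stated assumptions, not the method of Brena and Ramirez (2006): the pages, the sense labels, and the function names `cooccurrence_counts`, `cooccur`, and `disambiguate` are hypothetical, and raw counts stand in for whatever association measure a real system would use.

```python
from collections import Counter
from itertools import combinations

# Toy corpus: each "page" is reduced to a set of words. Invented data.
PAGES = [
    {"palm", "beach", "resort", "florida", "hotel"},
    {"beach", "resort", "hotel", "florida"},
    {"palm", "tree", "leaf", "tropical"},
    {"tree", "leaf", "forest"},
    {"palm", "device", "handheld", "battery"},
    {"device", "battery", "handheld", "screen"},
]


def cooccurrence_counts(pages):
    """Count, for every unordered word pair, the number of pages containing both."""
    counts = Counter()
    for page in pages:
        for a, b in combinations(sorted(page), 2):
            counts[(a, b)] += 1
    return counts


def cooccur(counts, a, b):
    """Look up a pair regardless of argument order (keys are stored sorted)."""
    return counts[tuple(sorted((a, b)))]


def disambiguate(counts, context_words, sense_labels):
    """Score each candidate sense by its total co-occurrence with the context."""
    scores = {
        sense: sum(cooccur(counts, sense, word) for word in context_words)
        for sense in sense_labels
    }
    return max(scores, key=scores.get), scores


counts = cooccurrence_counts(PAGES)
best, scores = disambiguate(counts, {"resort"}, ["beach", "tree", "device"])
print(best, scores)
```

With this toy corpus, “resort” shares two pages with “beach” and none with “tree” or “device”, so the beach sense is selected. The sampling problem noted above shows up here as the choice of `PAGES`: the counts are only as representative as the pages sampled.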
