Contextualized Clustering in Exploratory Web Search


Jon Atle Gulla (Norwegian University of Science and Technology, Norway), Hans Olaf Borch (Bekk Consulting AS, Norway) and Jon Espen Ingvaldsen (Norwegian University of Science and Technology, Norway)
Copyright: © 2008 | Pages: 24
DOI: 10.4018/978-1-59904-373-9.ch009


Due to the large amount of information on the web and the difficulty of relating users' expressed information needs to document content, large-scale web search engines tend to return thousands of ranked documents. This chapter discusses the use of clustering to help users navigate through the result sets and explore the domain. A newly developed system, HOBSearch, makes use of suffix tree clustering to overcome many of the weaknesses of traditional clustering approaches. By using result snippets rather than full documents, HOBSearch both speeds up clustering substantially and tailors the clustering to the topics indicated in the user's query. An inherent problem with clustering, though, is the choice of cluster labels. Our experiments with HOBSearch show that cluster labels of acceptable quality can be generated with no supervision or predefined structures and within the constraints imposed by large-scale web search.
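
As a rough illustration of the idea of clustering result snippets by shared phrases, the following minimal Python sketch may help. It is not HOBSearch's implementation, which is not shown here: it enumerates word n-grams instead of building a real suffix tree, and the function names (`phrases`, `base_clusters`, `merge_clusters`), the 0.5 overlap threshold, and the labeling heuristic are all assumptions made for this example.

```python
import re
from collections import defaultdict


def phrases(snippet, max_len=3):
    """Yield every word n-gram (1..max_len words) found in a snippet."""
    words = re.findall(r"[a-z0-9]+", snippet.lower())
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            yield tuple(words[i:i + n])


def base_clusters(snippets, min_docs=2, max_len=3):
    """Map each phrase shared by at least min_docs snippets to their indices."""
    index = defaultdict(set)
    for doc_id, snippet in enumerate(snippets):
        for phrase in phrases(snippet, max_len):
            index[phrase].add(doc_id)
    return {p: docs for p, docs in index.items() if len(docs) >= min_docs}


def merge_clusters(base, threshold=0.5):
    """Merge base clusters whose snippet sets overlap heavily in both directions."""
    items = list(base.items())
    parent = list(range(len(items)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a in range(len(items)):
        for b in range(a + 1, len(items)):
            docs_a, docs_b = items[a][1], items[b][1]
            shared = len(docs_a & docs_b)
            if shared / len(docs_a) > threshold and shared / len(docs_b) > threshold:
                parent[find(a)] = find(b)

    groups = defaultdict(list)
    for i, item in enumerate(items):
        groups[find(i)].append(item)

    clusters = []
    for members in groups.values():
        docs = set().union(*(d for _, d in members))
        # Assumed label heuristic: the phrase covering the most snippets,
        # preferring the longer phrase on ties.
        label, _ = max(members, key=lambda m: (len(m[1]), len(m[0])))
        clusters.append((" ".join(label), sorted(docs)))
    return sorted(clusters, key=lambda c: -len(c[1]))
```

Calling `merge_clusters(base_clusters(snippets))` on a list of snippet strings returns `(label, snippet_indices)` pairs, largest cluster first. A realistic version would at least filter stop words and score phrases, and a true suffix tree avoids the quadratic n-gram enumeration used here.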

Complete Chapter List

Table of Contents
Cláudio Chauke Nehme
Hercules Antonio do Prado, Edilson Ferneda
Hercules Antonio do Prado, Edilson Ferneda
Chapter 1
Jie Tang, Mingcai Hong, Duo Liang Zhang, Juanzi Li
This chapter is concerned with the methodologies and applications of information extraction. Information is hidden in the large volume of web pages...
Information Extraction: Methodologies and Applications
Chapter 2
Roberto Penteado, Eric Boutin
The information overload demands that organizations set up new capabilities concerning the analysis of data and texts to create the necessary...
Creating Strategic Information for Organizations with Structured Text
Chapter 3
Christian Aranha, Emmanuel Passos
This chapter integrates elements from Natural Language Processing, Information Retrieval, Data Mining and Text Mining to support competitive...
Automatic NLP for Competitive Intelligence
Chapter 4
Horacio Saggion
Free text is a main repository of human knowledge, therefore methods and techniques to access this unstructured source of knowledge are of paramount...
Mining Profiles and Definitions with Natural Language Processing
Chapter 5
Ying Liu, Han Tong Loh, Wen Feng Lu
This chapter introduces an approach of deriving taxonomy from documents using a novel document profile model that enables document representations...
Deriving Taxonomy from Documents at Sentence Level
Chapter 6
Shigeaki Sakurai
This chapter introduces knowledge discovery methods based on a fuzzy decision tree from textual data. It argues that the methods extract features of...
Rule Discovery from Textual Data
Chapter 7
Edson Takashi Matsubara, Maria Carolina Monard, Ronaldo Cristiano Prati
This chapter presents semi-supervised multi-view learning in the context of text mining. Semi-supervised learning uses both labelled and unlabelled...
Exploring Unclassified Texts Using Multiview Semisupervised Learning
Chapter 8
Lean Yu, Shouyang Wang, Kin Keung Lai
With the rapid increase of the huge amount of online information, there is a strong demand for Web text mining which helps people discover some...
A Multi-Agent Neural Network System for Web Text Mining
Chapter 9
Jon Atle Gulla, Hans Olaf Borch, Jon Espen Ingvaldsen
Due to the large amount of information on the web and the difficulty of relating users' expressed information needs to document content...
Contextualized Clustering in Exploratory Web Search
Chapter 10
Li Weigang, Wu Man Qi
This chapter presents a study of Ant Colony Optimization (ACO) to Interlegis Web portal, Brazilian legislation Website. The approach of AntWeb is...
AntWeb—Web Search Based on Ant Behavior: Approach and Implementation in Case of Interlegis
Chapter 11
Leandro Krug Wives, José Palazzo Moreira de Oliveira, Stanley Loh
This chapter introduces a technique to cluster textual documents using concepts. Document clustering is a technique capable of organizing large...
Conceptual Clustering of Textual Documents and Some Insights for Knowledge Discovery
Chapter 12
Domonkos Tikk, György Biro, Attila Törcsvári
Patent categorization (PC) is a typical application area of text categorization (TC). TC can be applied in different scenarios at the work...
A Hierarchical Online Classifier for Patent Categorization
Chapter 13
Patricia Bintzler Cerrito
The purpose of this chapter is to demonstrate how text mining can be used to reduce the number of levels in a categorical variable to then use the...
Text Mining to Define a Validated Model of Hospital Rankings
Chapter 14
Wagner Francisco Castilho, Gentil José de Lucena Filho, Hércules Antonio do Prado, Edilson Ferneda
Clustering analysis (CA) techniques consist in, given a set of objects, estimating dense regions of points separated by sparse regions, according to...
An Interpretation Process for Clustering Analysis Based on the Ontology of Language
About the Contributors