Contextualized Clustering in Exploratory Web Search
Jon Atle Gulla (Norwegian University of Science and Technology, Norway), Hans Olaf Borch (Bekk Consulting AS, Norway) and Jon Espen Ingvaldsen (Norwegian University of Science and Technology, Norway)
Copyright: © 2008
Due to the large amount of information on the web and the difficulty of relating users' expressed information needs to document content, large-scale web search engines tend to return thousands of ranked documents. This chapter discusses the use of clustering to help users navigate through the result sets and explore the domain. A newly developed system, HOBSearch, makes use of suffix tree clustering to overcome many of the weaknesses of traditional clustering approaches. By using result snippets rather than full documents, HOBSearch both speeds up clustering substantially and tailors the clustering to the topics indicated in the user's query. An inherent problem with clustering, though, is the choice of cluster labels. Our experiments with HOBSearch show that cluster labels of acceptable quality can be generated with no supervision or predefined structures and within the constraints imposed by large-scale web search.