Query Log Analysis for Adaptive Dialogue-Driven Search

Udo Kruschwitz (University of Essex, UK), Nick Webb (SUNY Albany, USA) and Richard Sutcliffe (University of Limerick, Ireland)
Copyright: © 2009 | Pages: 26
DOI: 10.4018/978-1-59904-974-8.ch020

Abstract

The theme of this chapter is the improvement of Information Retrieval and Question Answering systems by the analysis of query logs. Two case studies are discussed. The first describes an intranet search engine working on a university campus which can present sophisticated query modifications to the user. It does this via a hierarchical domain model built using multi-word term co-occurrence data. The usage log was analysed using mutual information scores between a query and its refinement, between a query and its replacement, and between two queries occurring in the same session. The results can be used to validate refinements in the domain model, and to suggest replacements such as domain-dependent spelling corrections. The second case study describes a dialogue-based question answering system working over a closed document collection largely derived from the Web. Logs here are based around explicit sessions in which an analyst interacts with the system. Analysis of the logs has shown that certain types of interaction lead to increased precision of the results. Future versions of the system will encourage these forms of interaction. The conclusions of this chapter are firstly that there is a growing literature on query log analysis, much of it reviewed here, secondly that logs provide many forms of useful information for improving a system, and thirdly that mutual information measures taken with automatic term recognition algorithms and hierarchy construction techniques comprise one approach for enhancing system performance.
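
The abstract does not give the exact scoring formula, so the following is only a minimal sketch of how a mutual information score between a query and a follow-up query (a refinement, a replacement, or another query in the same session) could be computed from a log. It assumes the pointwise form of mutual information with probabilities estimated as plain relative frequencies; the function name `pmi_scores` and the sample log pairs are hypothetical and are not taken from the chapter.

```python
import math
from collections import Counter


def pmi_scores(pairs):
    """Return pointwise mutual information for each (query, follow_up) pair.

    `pairs` is a list of (query, follow_up) tuples from a hypothetical log.
    All probabilities are relative frequencies over that list (an assumption,
    not necessarily the estimator used in the chapter).
    """
    n = len(pairs)
    pair_counts = Counter(pairs)
    query_counts = Counter(query for query, _ in pairs)
    follow_up_counts = Counter(follow_up for _, follow_up in pairs)

    scores = {}
    for (query, follow_up), count in pair_counts.items():
        p_joint = count / n
        p_query = query_counts[query] / n
        p_follow_up = follow_up_counts[follow_up] / n
        # PMI = log2( P(q, f) / (P(q) * P(f)) )
        scores[(query, follow_up)] = math.log2(p_joint / (p_query * p_follow_up))
    return scores


# Hypothetical (query, follow-up) pairs, invented purely for illustration.
log_pairs = [
    ("library", "library opening times"),
    ("library", "library opening times"),
    ("library", "exam timetable"),
    ("timetable", "exam timetable"),
]
scores = pmi_scores(log_pairs)
```

Under these assumptions, a positive score means the two queries occur together more often than their individual frequencies would predict, which is the kind of evidence the abstract describes using to validate a refinement or to suggest a replacement. A negative score means the pairing is rarer than chance.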
Chapter Preview

Introduction

The Web is growing at an incredible speed and has become an active research area in its own right (Spink & Jansen, 2004). Search engines such as Google (Brin & Page, 1998) enable users to process, access and navigate vast amounts of information. Such engines are built upon the well-established principles of Information Retrieval (IR) (Baeza-Yates & Ribeiro-Neto, 1999). While an IR system takes as input a user query and returns a ranked list of documents considered relevant to it, a Question Answering (QA) system goes one stage further and returns an exact answer extracted from one of the documents. Since its adoption at the Text REtrieval Conference (TREC) (Voorhees, 1999), the Cross Language Evaluation Forum (CLEF) (Magnini, Romagnoli, Vallin, Herrera, Peñas, Peinado, Verdejo & de Rijke, 2003) and the National Test Collection for Information Retrieval (NTCIR) (Sasaki, Chen, Chen & Lin, 2005), in concert with targeted funding under the Advanced Research and Development Activity (ARDA) Advanced QUestion Answering for INTelligence (AQUAINT) program, QA has developed rapidly to the stage at which commercial systems such as Qristal are beginning to appear (Laurent, Séguéla & Nègre, 2006).

A considerable amount of the work in IR and QA has been devoted to the retrieval of results for individual queries. Increasingly, however, users need Interactive Information Systems (IIS) capable of converging on a person’s information need by stages, using methods such as Interactive QA (Webb, 2006; Webb & Webber, 2008; Small, Strzalkowski, Liu, Ryan, Salkin, Shimizu, Kantor, Kelly, Rittman & Wacholder, 2004) and dialogue-driven search (Kruschwitz, 2003; Kruschwitz, 2005; Kruschwitz & Al-Bakour, 2005). Traditional artificial dialogue systems already allow users to interact with simple, structured data such as train or flight timetables (Zue, Glass, Goodine, Leung, Phillips, Polifroni & Seneff, 1990; Goddeau, Brill, Glass, Pao, Phillips, Polifroni, Seneff & Zue, 1994; Allen, Schubert, Ferguson, Heeman, Hwang, Kato, Light, Martin, Miller, Poesio & Traum, 1995; Aust, Oerder, Seide & Steinbiss, 1995). Such models make extensive use of corpora containing both Human-Computer (H-C) and increasingly Human-Human (H-H) interactions (Hardy, Biermann, Inouye, Mckenzie, Strzalkowski, Ursu, Webb & Wu, 2004). Such corpora can be used to study and capture the phenomena, vocabulary and style of such interactions and hence to develop appropriate machine models.

By contrast, IR and QA systems often operate in much wider domains for which appropriate corpora are not available. As a result, query logs are potentially an extremely valuable resource for increasing our understanding of the complex interactions involved and hence in developing more sophisticated systems. Logs contain a huge amount of information but effective methods for extracting it are only now being developed.

Key Terms in this Chapter

Query Modification: is the modification by a searcher of a previous query.

Question Answering (QA) Systems: go one step further than an information retrieval system, which takes as input a user query and returns a ranked list of documents considered relevant to it. QA systems return an exact answer extracted from one of the documents.

Domain Knowledge: is the knowledge possessed by, or required of, a person or system within a specific topical area.

Interactive Information Systems (IIS): are capable of converging on a person’s information need by stages.
