The Perspectives of Improving Web Search Engine Quality
Jengchung V. Chen (National Cheng Kung University, Taiwan), Wen-Hsiang Lu (National Cheng Kung University, Taiwan), Kuan-Yu He (National Cheng Kung University, Taiwan) and Yao-Sheng Chang (National Cheng Kung University, Taiwan)
Copyright: © 2008
With the rapid growth of the Web, users often suffer from information overload, because many existing search engines rely on conventional keyword matching and therefore respond to queries with many irrelevant documents that merely contain the query terms. In fact, both users and search engine developers had anticipated that search mechanisms would reduce information overload by understanding user goals clearly. In this chapter, we introduce past research in Web search and current trends that focus on improving search quality from the perspectives of “what”, “how”, “where”, “when”, and “why”. We also briefly introduce some effective search quality improvements based on link-structure search algorithms, such as PageRank and HITS. At the end of the chapter, we present our proposed approach to improving search quality, which employs syntactic structures (verb-object pairs) to automatically identify potential user goals from search-result snippets. We believe that understanding user goals more clearly and reducing information overload will become one of the major developments in commercial search engines, since the amount of information and resources continues to increase rapidly and user needs will become more and more diverse.
Key Terms in this Chapter
User Behavior: Users’ interaction with the search engine.
User Goal Identification: To identify what the user wants to do when submitting a query.
Keyword Matching: A search mechanism which considers a document relevant if it shares common terms with the query.
Search Quality: The degree to which a search engine provides users with useful search results.
Natural Language Processing: A field that studies the problems of automated generation and understanding of natural human languages.
Click-Through Data: Information that reveals the behavior of users, from submitting a query to finally finding the target Web pages.
Information Retrieval: To retrieve information useful or relevant to the query.
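The Keyword Matching definition above can be illustrated with a minimal sketch. The function names (`tokenize`, `keyword_match`) and the toy document collection are assumptions made for this illustration and do not come from the chapter; real search engines add ranking, stemming, and indexing on top of this idea.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_match(query, documents):
    """Return (doc_id, shared_terms) for each document that shares
    at least one term with the query. No ranking is applied."""
    query_terms = tokenize(query)
    results = []
    for doc_id, text in documents.items():
        shared = query_terms & tokenize(text)
        if shared:
            results.append((doc_id, sorted(shared)))
    return results


# Hypothetical toy collection, used only for illustration.
documents = {
    "d1": "Jaguar dealers and car prices in Taiwan",
    "d2": "The jaguar is a large cat native to the Americas",
    "d3": "Recipes for weekend baking",
}

matches = keyword_match("buy jaguar car", documents)
for doc_id, shared in matches:
    print(doc_id, shared)
```

In this toy collection, the document about the animal is treated as relevant only because it contains the term "jaguar", although the query suggests a goal of buying a car. This is the kind of nonrelevant result that the abstract attributes to keyword matching.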