Information Retrieval

Manjunath Ramachandra (MSR School of Advanced Studies, Philips, India)
DOI: 10.4018/978-1-60566-888-8.ch014

Abstract

The end user's demand for information has to be met by the supporting supply chain. Search queries must be handled appropriately so that content is supplied seamlessly and users ultimately get what they want. This chapter explains how the quality of search results can be improved with a little processing of the queries.

Background

Information retrieval (IR) is the technology for delivering the content a user requests. It searches the content by keyword, assisted by metadata. To facilitate retrieval, documents are clustered according to shared characteristics (Levene, Mark, 2005), although identifying these commonalities is quite involved. Documents are described by metadata, and fusing metadata that is available in various forms is challenging. To ease interoperability, XML is generally adopted for metadata description.
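
Keyword search assisted by metadata can be illustrated with a small inverted index. The following is a minimal sketch, not the chapter's method: the document collection `DOCS`, the `keywords` metadata field, and the function names are all assumptions made for illustration, and a query is taken to match only documents that contain every query term.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to the set of document ids that contain it.

    Terms come from the document text and from the hypothetical
    'keywords' metadata field, so metadata can assist the search.
    """
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        terms = tokenize(doc["text"])
        for keyword in doc["metadata"].get("keywords", []):
            terms.extend(tokenize(keyword))
        for term in terms:
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample collection, invented for this sketch.
DOCS = {
    "d1": {"text": "XML is used to describe metadata",
           "metadata": {"keywords": ["interoperability"]}},
    "d2": {"text": "News feeds are filtered by profile",
           "metadata": {"keywords": ["streaming", "XML"]}},
    "d3": {"text": "Parallel search over multiple servers",
           "metadata": {"keywords": ["performance"]}},
}

INDEX = build_index(DOCS)
print(sorted(search(INDEX, "xml")))               # ['d1', 'd2']
print(sorted(search(INDEX, "xml metadata")))      # ['d1']
print(sorted(search(INDEX, "interoperability")))  # ['d1']
```

In this sketch `d2` is found for the query "xml" only because of its metadata, and `d1` is found for "interoperability" the same way, which is the sense in which metadata assists the keyword search.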

Information routing, or filtering, is the process of retrieving the required information from data streams such as news feeds. Here the keyword search runs over individual documents or a stream, unlike conventional information retrieval, where a large collection of stored documents is searched for relevant information. The documents, or portions of the stream, that match the required features or profiles are returned as the search result. Browser support for queries is discussed in (Kent Wittenburg and Eric Sigman, 1997).
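
The contrast with stored-collection retrieval can be sketched as follows: each incoming item is examined once, as it arrives, against a set of standing profiles. This is only an illustration under stated assumptions. The profiles, the sample stream, and the rule that an item matches a profile when at least one profile term occurs in it are all hypothetical choices, not taken from the chapter.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def route(stream, profiles):
    """Yield (user, item) pairs for every stream item that matches a profile.

    Assumption: a profile is a set of terms, and an item matches when it
    shares at least one term with the profile. Items are processed one at
    a time and never stored.
    """
    for item in stream:
        words = tokenize(item)
        for user, terms in profiles.items():
            if terms & words:
                yield user, item


# Hypothetical standing profiles and news items, invented for this sketch.
PROFILES = {
    "alice": {"xml", "metadata"},
    "bob": {"football"},
}
STREAM = [
    "New XML schema released",
    "Football final tonight",
    "Weather stays mild",
]

for user, item in route(STREAM, PROFILES):
    print(user, "<-", item)
```

Because `route` is a generator, it would work equally well on an unbounded feed; the third sample item matches no profile and is simply dropped.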

Performance

The search performance is typically measured using three numbers:

  1. Number of queries handled per second: this depends on the underlying search engine and typically ranges from tens to hundreds of queries.

  2. Average search time per query: the query response time is typically tens to hundreds of milliseconds.

  3. Size of the data transactions: this is of the order of several gigabytes.

These parameters characterize the search engine. Efficiency may be improved by searching in parallel over multiple servers.
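
As a rough illustration of how the first two figures could be measured, and of a query fanned out in parallel, consider the sketch below. Everything in it is an assumption for demonstration: the shards are small in-memory dictionaries standing in for separate servers, the function names are hypothetical, and threads are used only to show the fan-out pattern (in CPython they do not speed up CPU-bound scanning). The third figure, data size, is not measured here.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def search_shard(shard, term):
    """Return the ids of documents in one shard whose text contains the term."""
    return [doc_id for doc_id, text in shard.items()
            if term in text.lower().split()]


def parallel_search(shards, term, pool):
    """Send the same query to every shard at once and merge the results."""
    parts = pool.map(search_shard, shards, [term] * len(shards))
    return sorted(doc_id for part in parts for doc_id in part)


def measure(shards, queries, pool):
    """Run the queries and report throughput and average latency."""
    latencies = []
    start = time.perf_counter()
    for query in queries:
        t0 = time.perf_counter()
        parallel_search(shards, query, pool)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "queries_per_second": len(queries) / elapsed,
        "avg_latency_ms": 1000 * sum(latencies) / len(latencies),
    }


# Hypothetical shards, each standing in for the documents held by one server.
SHARDS = [
    {"d1": "xml metadata description", "d2": "news feed filtering"},
    {"d3": "parallel search servers", "d4": "xml query processing"},
]

with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
    print(parallel_search(SHARDS, "xml", pool))  # ['d1', 'd4']
    print(measure(SHARDS, ["xml", "news", "search"] * 100, pool))
```

The numbers printed by `measure` depend entirely on the machine and on this toy data, so they say nothing about a real search engine; the point is only how the two quantities relate to the timing of individual queries.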
