In this section, we present an overview of information retrieval, information retrieval systems, and the need for query expansion. We then discuss the appropriateness and drawbacks of term co-occurrence approaches to query expansion, and the need to incorporate the context window of query terms and their semantics into automatic query expansion.
The discipline of information retrieval is almost as old as the computer itself. An early definition of information retrieval was given by Mooers (1950):
Information retrieval is the name of the process or method whereby a prospective user of information is able to convert his need for information into an actual list of citations to documents in storage containing information useful to him.
An information retrieval system is a software program that stores, manages, and retrieves information in a large collection. The system helps users find the information they need. Unlike a question answering system, it does not return the needed information or answer the question explicitly; instead, it reports the existence and location of documents that may contain that information. Some of the documents suggested by the system may satisfy the user’s information need; these are called relevant documents. A perfect retrieval system would retrieve all of the relevant documents and none of the irrelevant ones. However, no perfect retrieval system exists, because search statements are necessarily incomplete and the relevance of a document is a subjective judgment of the user.
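To make the point about incomplete search statements concrete, the following is a minimal sketch under stated assumptions, not a description of any particular system. The toy collection, the document identifiers, the relevance judgments, and the function names `build_index` and `retrieve` are all hypothetical and chosen only for illustration. The sketch uses simple Boolean AND matching over an inverted index and returns document identifiers rather than answers.

```python
from collections import defaultdict

# Hypothetical toy collection: document id -> text (illustrative data only).
DOCS = {
    "d1": "car engine repair manual",
    "d2": "automobile engine maintenance guide",
    "d3": "car rental prices",
}


def build_index(docs):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def retrieve(index, query):
    """Return the ids of documents containing every query term (Boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


index = build_index(DOCS)

# Assumed relevance judgments for the need "how to look after a car engine".
relevant = {"d1", "d2"}

retrieved = retrieve(index, "car engine")
missed = relevant - retrieved      # relevant documents the query did not reach
spurious = retrieved - relevant    # irrelevant documents the query returned
```

In this sketch, the query "car engine" reaches `d1` but misses `d2`, which expresses the same concept with the word "automobile", while the shorter query "car" also returns the irrelevant `d3`. This kind of vocabulary mismatch between the query and the relevant documents is the gap that query expansion aims to narrow.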
Information retrieval is useful in a large number of applications, such as digital libraries, information filtering, recommender systems, media search, and search engines, and there is a constant need to improve such systems. For this reason, information retrieval remains an active field of research in computer science.