Clear and Private Ad Hoc Retrieval Models on Web Data

Souria Ortiga (Mente Argentina University, Argentina)
Copyright: © 2019 | Pages: 18
DOI: 10.4018/978-1-5225-7338-8.ch009

Abstract

During the 1980s, and despite its maturity, information retrieval (IR) was intended only for librarians and information specialists. This narrow view prevailed for many years. Since the mid-1990s, the web has become an increasingly crucial source of information, which has renewed interest in IR. In the last decade, the popularization of computers, the explosive growth in the amount of unstructured data, internal documents, and corporate collections, and the huge and growing number of internet document sources have deeply shaken the relationship between people and information. Today, a great change has taken place, and IR is used by billions of people around the world. Put simply, automated methods for efficient access to this huge amount of digital information have become a necessity.

Ad Hoc Retrieval (RA)

Ad hoc retrieval (RA) is the standard task in classical IR: the user queries the information elements (the documents in the collection) to obtain the documents needed for a specified query. RA has recently conquered the world, powering not only web search engines but also every kind of unstructured search behind large e-commerce websites. The objective of this task is to automate the document analysis process, which computes the comparison between the representation of the information need (the query) and the representation of the documents (Larson, 2010).

RA is a process quite familiar to most of us, since we all probably use Google at least once a day on average. The task is to maintain a collection of documents and, when a new query arrives, to search this collection to identify the appropriate documents (called relevant) for this query. The information need is assumed to be immediate rather than long term (as is the case in the filtering task, see section), and one query at a time is compared to a static document collection. This type of search gives the user an open field to specify what he or she needs as a query, without any restrictions (Kowalski, 2006).

General Architecture of an Ad Hoc Retrieval Model (MRA)

An ad hoc retrieval model (MRA) is a process that stores and manages information on documents, often text documents but possibly multimedia (pictures or video). Suppose we have a query, q, which expresses the user's need. An MRA compares q with a corpus of documents, C = {d_1, ..., d_n}. The goal is to select some documents from C and rank them by a score of relevance to the user's need as expressed by q.
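
The selection and ranking step described above can be illustrated with a minimal sketch. It assumes a bag-of-words representation, TF-IDF weighting, and cosine similarity as the relevance score. The chapter preview names none of these choices, so they are illustrative assumptions, as are the function names tokenize and rank and the sample corpus.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for a real
    # representation component.
    return text.lower().split()


def rank(query, corpus):
    """Return (document index, score) pairs, best first.

    Documents with a zero score are treated as not selected.
    """
    doc_tokens = [tokenize(d) for d in corpus]
    n = len(corpus)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in doc_tokens for term in set(tokens))
    # Smoothed inverse document frequency (one common variant).
    idf = {t: math.log((n + 1) / (count + 1)) + 1.0 for t, count in df.items()}

    def vector(tokens):
        tf = Counter(tokens)
        # Terms never seen in the corpus get weight 0.
        return {t: tf[t] * idf.get(t, 0.0) for t in tf}

    def cosine(u, v):
        dot = sum(w * v.get(t, 0.0) for t, w in u.items())
        norm_u = math.sqrt(sum(w * w for w in u.values()))
        norm_v = math.sqrt(sum(w * w for w in v.values()))
        if norm_u == 0.0 or norm_v == 0.0:
            return 0.0
        return dot / (norm_u * norm_v)

    q_vec = vector(tokenize(query))
    scored = [(i, cosine(q_vec, vector(t))) for i, t in enumerate(doc_tokens)]
    selected = [pair for pair in scored if pair[1] > 0.0]
    return sorted(selected, key=lambda pair: pair[1], reverse=True)


# Hypothetical corpus C = {d_1, d_2, d_3} and query q.
corpus = [
    "web search engines index documents",
    "cooking recipes for pasta",
    "search engines rank web documents by relevance",
]
results = rank("pasta recipes", corpus)
```

Here only the second document shares terms with the query, so it is the only one selected. A production system would replace each assumed piece (tokenizer, weighting, similarity) with the components of its own model.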

A perfect MRA should retrieve only relevant documents and no non-relevant documents. However, this is impossible, because search statements are incomplete and relevance depends on the subjective opinion of the user. In practice, two users can submit the same query to an MRA and judge the relevance of the retrieved documents differently: some users will appreciate the results, others will not (Qin, 2010).
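
The effect of subjective relevance can be made concrete with a small sketch. It assumes precision (the fraction of retrieved documents that a user judges relevant) as the measure. The text above does not name a measure, so this choice is an assumption, and the document identifiers and user judgments are hypothetical.

```python
def precision(retrieved, relevant):
    """Fraction of retrieved documents that the user judged relevant."""
    if not retrieved:
        return 0.0
    hits = sum(1 for doc_id in retrieved if doc_id in relevant)
    return hits / len(retrieved)


# The same result list returned for the same query (hypothetical ids).
retrieved = ["d1", "d2", "d3", "d4"]

# Two users judge relevance differently (hypothetical judgments).
user_a_relevant = {"d1", "d2", "d3"}
user_b_relevant = {"d1"}

precision_a = precision(retrieved, user_a_relevant)  # 3 of 4 judged relevant
precision_b = precision(retrieved, user_b_relevant)  # 1 of 4 judged relevant
```

The same output scores 0.75 for the first user and 0.25 for the second, which is why a single "perfect" result list cannot exist for every user.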

Figure 1.

Generic architecture of an ad hoc retrieval model (MRA) (Baeza-Yates, 1999)

As shown in the figure above, an MRA is made up of three main components. The inputs of an MRA are the query expression, which represents the user's information need, and the set of documents in the collection (which can be images, text, objects, etc.) that are the source of interest. These data elements are the input to the representation component.
