The user dimension is a crucial component of the information retrieval process, and for this reason it must be taken into account when planning and implementing techniques in information retrieval systems. In this paper we present a technique based on relevance feedback to improve accuracy in an ontology-based information retrieval system. Our proposed method uses relevance feedback to combine the semantic information in a general knowledge base with statistical information. Several experiments and results are presented using a test set consisting of Web pages.
The user is a fundamental component of the information retrieval process, and we can affirm that the goal of an information retrieval system is to satisfy a user’s information needs. In several contexts, such as the Web, it can be very hard to satisfy a user's request completely, given the great amount of information and the high heterogeneity of the information structure. On the other hand, users find it difficult to define their information needs, either because they are unable to express them or because they have insufficient knowledge of the domain of interest; hence they use just a few keywords. In this context, it is very useful to define the concept of relevance. We can divide relevance into two main classes (Harter, 1992; Saracevic, 1975; Swanson, 1986), called objective (system-based) and subjective (human or user-based) relevance respectively. Objective relevance can be viewed as a topicality measure, i.e. a direct match between the topic of the retrieved document and the one defined by the query. Several studies on human relevance show that many other criteria are involved in the evaluation of the IR process output (Barry, 1998; Park, 1993; Vakkari & Hakala, 2000). In particular, subjective relevance refers to the intellectual interpretations carried out by users, and it is related to the concepts of aboutness and appropriateness of the retrieved information. According to Saracevic (1996), five types of relevance exist: algorithmic relevance, between the query and the set of retrieved information objects; a topicality-like type, associated with the concept of aboutness; cognitive relevance, related to the user's information need; situational relevance, depending on the task interpretation; and motivational and affective relevance, which is goal-oriented.
Furthermore, we can say that relevance has two main features defined at a general level: multidimensional relevance, which refers to how relevance can be perceived and assessed differently by different users; and dynamic relevance, which refers to how this perception can change over time for the same user. These features have a great impact on information retrieval systems, which generally lack a user model and do not adapt to individual users. It is generally acknowledged that some techniques, such as relevance feedback (RF), can help the user carry out information retrieval tasks with more awareness. Relevance feedback is a means of providing additional information to an information retrieval system, using the set of results that a classical system returns for a query (Salton & Buckley, 1990). In the RF context, the user feeds some judgment back to the system to improve the initial search results. The system can use this information either to retrieve other documents similar to the relevant ones or to rank the documents on the basis of the user's clues. In this paper we use the second approach. A user may provide the system with relevance information in several ways. The user may perform an explicit feedback task, directly selecting documents from the result list, or an implicit feedback task, in which the system tries to estimate the user's interests using the relevant documents in the collection. Another well-known technique is pseudo-relevance feedback, in which the system takes the top-ranked documents as the relevant ones.
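To make the re-ranking use of feedback concrete, the following minimal Python sketch re-orders an initial result list by its similarity to the documents judged relevant. It is only an illustration under simple assumptions and is not the method proposed in this paper: documents are represented as plain bag-of-words term counts, similarity is the cosine measure, and the names `term_vector`, `cosine`, `rerank` and `pseudo_relevant` are hypothetical helpers introduced here.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term counts for a lowercased, whitespace-tokenized text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(ranked_docs, relevant_ids):
    """Re-rank an initial result list using the documents judged relevant.

    ranked_docs: list of (doc_id, text) pairs in the initial ranking order.
    relevant_ids: set of doc_ids the user marked as relevant (explicit feedback).
    Returns the doc_ids ordered by similarity to the summed vector of the
    relevant documents; ties keep their initial relative order.
    """
    centroid = Counter()
    for doc_id, text in ranked_docs:
        if doc_id in relevant_ids:
            centroid.update(term_vector(text))
    if not centroid:
        # No feedback available: keep the initial ranking unchanged.
        return [doc_id for doc_id, _ in ranked_docs]
    scored = [
        (cosine(term_vector(text), centroid), -position, doc_id)
        for position, (doc_id, text) in enumerate(ranked_docs)
    ]
    scored.sort(reverse=True)
    return [doc_id for _, _, doc_id in scored]


def pseudo_relevant(ranked_docs, k=2):
    """Pseudo-relevance feedback: assume the top-k results are relevant."""
    return {doc_id for doc_id, _ in ranked_docs[:k]}


# Illustrative data only (assumed, not taken from the experiments).
results = [
    ("d1", "jaguar car dealer prices"),
    ("d2", "jaguar big cat habitat"),
    ("d3", "big cat conservation in the wild"),
    ("d4", "used car prices"),
]
print(rerank(results, {"d2"}))                # explicit feedback on d2
print(rerank(results, pseudo_relevant(results)))  # pseudo-relevance feedback
```

With explicit feedback the set of relevant identifiers comes from the user's selections, whereas with pseudo-relevance feedback it is simply taken from the top of the initial ranking; the re-ranking step itself is the same in both cases.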
This paper is organized as follows: section 2 presents related work on relevance feedback techniques across different methods and contexts; section 3 briefly summarizes the fundamental theoretical background used in this work; section 4 introduces several novel similarity metrics; section 5 describes our Web information retrieval system based on ontologies and user feedback; and sections 6 and 7 present the evaluations and experiments and the conclusions, respectively.