OSIRIS: Ontology-Based System for Semantic Information Retrieval and Indexation Dedicated to Community and Open Web Spaces

Francky Trichet (University of Nantes: Team Knowledge and Decision (KOD), France), Xavier Aimé (University of Nantes: Team Knowledge and Decision (KOD), France) and Christophe Thovex (University of Nantes: Team Knowledge and Decision (KOD), France)
DOI: 10.4018/978-1-61520-883-8.ch021

Abstract

OSIRIS (Ontology-based System for Semantic Information Retrieval and Indexation dedicated to community and open web Spaces) is a platform dedicated to the development of community web spaces that aim to facilitate both the semantic annotation and the search of multimedia resources. Based on the use of both heavyweight ontologies and thesauri, OSIRIS allows end-users (1) to describe the semantic content of their resources by using an intuitive natural-language-based annotation model founded on the triple (Subject, Verb, Object), and (2) to formally represent these annotations by using Conceptual Graphs. Each resource can be described from multiple points of view, which are usually provided by different end-users. These points of view can be defined by using multiple ontologies, which may or may not cover connected domains. Developed by integrating Semantic Web and Web 2.0 technologies, OSIRIS aims to facilitate the deployment of semantic, collaborative, community and open web spaces. The use of OSIRIS is illustrated in the context of a project dedicated to the preservation of French popular and cultural heritage.

Introduction

Currently, the collective and interactive dimension of Web 2.0, coupled with the lightness of its tools, facilitates the rise of many platforms dedicated to the sharing of multimedia resources, such as Flickr for photos and YouTube (http://www.youtube.com) for videos. However, the success of these platforms (in terms of the number of listed resources and federated users) must be weighed against the poverty of their approach to Information Retrieval (IR). Indeed, the search engines integrated in such systems rely only on tags, which are usually defined manually by the end-users of the communities (i.e. social tagging, which leads to the creation of folksonomies). In addition to the traditional limits of keyword-based IR systems, in particular the poor semantic description provided by a set of tags and the resulting impossibility of implementing a semantic search engine, these systems suffer from a lack of openness: the tags provided by the end-users remain useful only inside the platform where they were created, and cannot be exported when resources are duplicated from one platform to another.

OSIRIS (Ontology-based System for Semantic Information Retrieval and Indexation dedicated to community and open web Spaces) is a platform dedicated to the development of community web spaces that aim to facilitate both the semantic annotation and the search of multimedia resources. Such a community space corresponds to an Internet-mediated social and semantic environment, in the sense that the shared resources are not only tagged by the users (who thus construct a folksonomy collaboratively) but are also formally described by using one or several ontologies shared by all the members of the community. The result is an immediate and rewarding gain in the user's capacity to semantically describe and find related content.

Based on the use of heavyweight ontologies (Furst & Trichet, 2006a) coupled with thesauri, OSIRIS allows end-users to semantically describe the content of a resource (for instance, this photograph by Doisneau represents “A woman who kisses a man in a famous French place located in Paris”) and then to formally represent this content by using Conceptual Graphs (Sowa, 1984). Each resource can be described according to multiple points of view (i.e. representations of several contents), which can in turn be defined according to multiple ontologies covering connected or unconnected domains. Thus, during the annotation process, OSIRIS manages several ontologies, which are then used jointly and transparently during the search process thanks to the possibility of defining equivalence links between the concepts and/or relations of two ontologies. Moreover, OSIRIS is based on heavyweight ontologies, i.e. ontologies which, in addition to the concepts and relations characterizing the considered domain (structured within hierarchies based on the Specialisation/Generalisation relation), also include the axioms (rules and constraints) that govern this domain. This gives OSIRIS the ability to automatically enrich the annotations manually associated with a resource by applying the axioms, which generally correspond to inferential knowledge of the domain.
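The ideas in the paragraph above can be illustrated with a minimal Python sketch. It is not the OSIRIS implementation, which relies on Conceptual Graphs and heavyweight ontologies. Every name here (Annotation, PARENT, EQUIVALENT, the concept labels, the "locatedIn" transitivity axiom) is a hypothetical assumption chosen for illustration. The sketch shows a (Subject, Verb, Object) annotation, an equivalence link between concepts of two ontologies, and the enrichment of annotations by one axiom.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One (Subject, Verb, Object) annotation attached to a resource."""
    subject: str
    verb: str
    obj: str


# Hypothetical concept hierarchy (child -> parent), standing in for an ontology.
PARENT = {"Woman": "Person", "Man": "Person"}

# Hypothetical equivalence link from a concept of a second ontology
# to a concept of the first one.
EQUIVALENT = {"onto2:Human": "Person"}


def canonical(concept):
    """Map a concept to its equivalent in the first ontology, if one is declared."""
    return EQUIVALENT.get(concept, concept)


def is_a(concept, ancestor):
    """True if `concept` equals `ancestor` or specialises it in the hierarchy."""
    target = canonical(ancestor)
    current = canonical(concept)
    while current is not None:
        if current == target:
            return True
        current = PARENT.get(current)
    return False


def enrich_transitive(annotations, verb):
    """Apply an assumed axiom 'verb is transitive' until no new annotation appears."""
    enriched = set(annotations)
    changed = True
    while changed:
        changed = False
        links = [a for a in enriched if a.verb == verb]
        for a in links:
            for b in links:
                new = Annotation(a.subject, verb, b.obj)
                if a.obj == b.subject and new not in enriched:
                    enriched.add(new)
                    changed = True
    return enriched


def search(annotations, subject, verb, obj):
    """Return annotations whose subject and object specialise the queried concepts."""
    return [a for a in annotations
            if a.verb == verb and is_a(a.subject, subject) and is_a(a.obj, obj)]


manual = {
    Annotation("Woman", "kisses", "Man"),
    Annotation("FamousPlace", "locatedIn", "Paris"),
    Annotation("Paris", "locatedIn", "France"),
}
enriched = enrich_transitive(manual, "locatedIn")
matches = search(enriched, "onto2:Human", "kisses", "Person")
```

In this sketch the query is phrased with a concept from the second ontology ("onto2:Human") and still matches the annotation written with the first one, and the enriched set contains the inferred fact that the place is located in France, which no user entered by hand.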

Key Terms in this Chapter

Folksonomy: A folksonomy is a system of classification derived from the practice and method of collaboratively creating and managing tags to annotate and categorize content. This practice is also known as collaborative tagging, social classification, social indexing, and social tagging.

Thesaurus: A thesaurus is a special kind of controlled vocabulary where the terms (which correspond to the entries of the thesaurus) are structured by using linguistic relationships such as synonymy, antonymy, hyponymy or hypernymy. A thesaurus (like WordNet) is not an ontology because it only deals with terms (i.e. the linguistic level), without considering the concepts and the relations of the domain (i.e. the conceptual or knowledge level). A thesaurus is therefore similar to a dictionary, with the differences that it does not provide word definitions, that its scope is limited to a particular domain, that its entries may be single-word or multi-word terms, and that it provides limited cross-referencing among the contained terms (e.g. synonyms or antonyms).

Ontology Matching: The objective of ontology matching is to discover and evaluate semantic links (e.g. identity or subsumption) between conceptual primitives (concepts and relations) of two given ontologies supposed to be built on related domains.

Conceptual Graphs: Conceptual graphs (CGs) are a system of logic based on the existential graphs of Charles Sanders Peirce and the semantic networks of artificial intelligence. They express meaning in a form that is logically precise, humanly readable, and computationally tractable. With a direct mapping to language, conceptual graphs serve as an intermediate language for translating computer-oriented formalisms to and from natural languages. With their graphic representation, they serve as a readable, but formal design and specification language (J. Sowa).

Ontology: An ontology is a “formal, explicit specification of a shared conceptualisation”. It is composed of concepts and relations structured into hierarchies (i.e. they are linked together by using the Specialisation/Generalisation relationship). A heavyweight ontology is a lightweight ontology (i.e. an ontology simply based on a hierarchy of concepts and a hierarchy of relations) enriched with axioms used to fix the semantic interpretation of concepts and relations.

Precision/Recall: In the context of Information Retrieval, Precision is defined as the number of relevant documents retrieved by a search divided by the total number of documents retrieved by that search, and Recall is defined as the number of relevant documents retrieved by a search divided by the total number of existing relevant documents (which should have been retrieved).
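As an illustration of the Precision/Recall definition, here is a small Python sketch. The function name, the document identifiers and the convention of returning 0.0 when a denominator is empty are assumptions made for this example, not part of the chapter.

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for one search.

    retrieved: documents returned by the search.
    relevant:  all existing relevant documents.
    Assumed convention: a ratio with an empty denominator is reported as 0.0.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually retrieved
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: 4 documents retrieved, 3 relevant documents exist, 2 overlap.
precision, recall = precision_recall({"d1", "d2", "d3", "d4"}, {"d1", "d2", "d5"})
```

Here two of the four retrieved documents are relevant, so precision is 2/4, and two of the three existing relevant documents were found, so recall is 2/3.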
