LMF Dictionary-Based Approach for Domain Ontology Generation

Feten Baccar Ben Amar (University of Sfax, Tunisia), Bilel Gargouri (University of Sfax, Tunisia) and Abdelmajid Ben Hamadou (University of Sfax, Tunisia)
Copyright: © 2012 | Pages: 25
DOI: 10.4018/978-1-4666-0188-8.ch005

Abstract

In this chapter, the authors propose an approach for generating domain ontologies from LMF standardized dictionaries (ISO-24613). The approach first derives the core of the target ontology systematically from the explicit information in the LMF dictionary structure. It then enriches this core by taking advantage of textual sources with guided semantic fields, available in the definitions and examples of lexical entries. The originality of this work lies not only in the use of a unique and finely structured source containing multi-domain lexical knowledge at the morphological, syntactic, and semantic levels, which lends itself to ontological interpretation, but also in providing ontological elements with linguistic grounding. In addition, the proposed approach addresses the quality issue, which is of major importance in ontology engineering: the authors have integrated a validation stage along with the extraction modules in order to maintain the consistency of the generated ontologies. Furthermore, the proposed approach was applied to a case study in the field of astronomy, and the experiment was carried out on the Arabic language. This choice is explained both by the scarcity of work on Arabic ontology development and by the availability, within the research team, of an LMF standardized Arabic dictionary.

Introduction

Over the last decades, ontologies have attracted growing interest, opening up possibilities in applications such as Natural Language Processing (NLP), Information Retrieval, the Semantic Web (SW), and Question Answering. Guarino (1998) defines an ontology as an “engineering artifact,” constituted by a specific vocabulary used to describe a certain reality in a formal way. Hence, research on ontology development and on improving the construction process has become increasingly widespread in the computer science community. Because ontologies serve as reference models for a domain, they must be of high quality. Although several approaches have been considered in the literature to assess ontology construction methodologies (Gómez-Pérez, 1994, 2004; Guarino & Welty, 2002; Porzel & Malaka, 2004; Brewster, et al., 2004; Gangemi, et al., 2006), a comprehensive and consensual standard methodology seems to be out of reach. Moreover, evaluating an ontology as a whole is a costly and challenging task, especially when the reduction of human intervention is sought. This can be deemed a major impediment that may explain why ontologies often fail not only to be reused in other ontologies but also to be exploited in final applications (Almeida, 2009).

Moreover, whether an ontology is constructed manually or semi-automatically, building it is never a trivial task. The absence of consensus and of structured guidelines has hindered the development of ontologies within and between research teams. Although ontology learning, which aims to automate the ontology creation process, has been the subject of much work, it is still a long way from being fully automatic and deployable on a large scale. This is essentially because it is a time-consuming and painstaking endeavor that requires significant human (expert) involvement for the validation of each step throughout the process (Lonsdale, et al., 2010).

In order to reduce the costs, research on (semi-)automatic ontology building from scratch has been conducted using a variety of resources, such as raw text (Aussenac-Gilles, et al., 2008; Li, et al., 2005; Navigli, et al., 2003), Machine-Readable Dictionaries (MRDs) (Kurematsu, et al., 2004; Kietz, et al., 2000; Rigau, et al., 1998), and thesauri (Christment, et al., 2008; Soergel, et al., 2004). Obviously, these resources have different features, and therefore, each proposed process is based on a different approach with respect to principles, design criteria, NLP techniques, etc.

On the other hand, as linguistic information is increasingly required in ontologies, mainly in the SW and NLP communities (Buitelaar, et al., 2009; Pazienza, et al., 2007), MRDs represent, among the terminological resources considered, one of the most suitable sources for knowledge extraction at both the ontological and lexical levels. However, since much of this information has not yet been explicitly encoded, access to the potential wealth of information in dictionaries remains limited for software applications.

Recently, the Lexical Markup Framework (LMF) (ISO 24613, 2008), a standard for the representation and construction of lexical resources, has been defined. Its meta-model provides a common and shared representation of lexical objects that allows the encoding of rich linguistic information, including morphological, syntactic, and semantic aspects (Francopoulo & George, 2008). Because it encompasses both ontological and lexical information, an LMF standardized dictionary offers a very suitable primary knowledge resource from which to learn domain ontologies and, above all, to provide the ontology elements with linguistic grounding (Baccar, et al., 2010).
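
To make the idea of reading an ontology core off an LMF dictionary more concrete, the following is a minimal sketch in Python. It is not the authors' extraction method. The XML fragment is a hypothetical, English-language stand-in for an LMF entry (the case study itself uses an Arabic dictionary). The class names LexicalEntry, Lemma, Sense, Definition, and SenseRelation follow the LMF meta-model, but the feature names (writtenForm, subjectField, label), the use of a targets attribute, and the two mapping rules (a sense tagged with the domain becomes a candidate concept; a hypernym relation becomes a candidate taxonomic link) are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical LMF-style fragment; feature names and ids are illustrative assumptions.
LMF_SAMPLE = """
<LexicalResource>
  <Lexicon>
    <LexicalEntry id="le_star">
      <feat att="partOfSpeech" val="noun"/>
      <Lemma><feat att="writtenForm" val="star"/></Lemma>
      <Sense id="s_star">
        <feat att="subjectField" val="astronomy"/>
        <Definition><feat att="text" val="A luminous celestial body."/></Definition>
        <SenseRelation targets="s_body"><feat att="label" val="hypernym"/></SenseRelation>
      </Sense>
      <Sense id="s_star2">
        <feat att="subjectField" val="cinema"/>
        <Definition><feat att="text" val="A famous performer."/></Definition>
      </Sense>
    </LexicalEntry>
    <LexicalEntry id="le_body">
      <feat att="partOfSpeech" val="noun"/>
      <Lemma><feat att="writtenForm" val="celestial body"/></Lemma>
      <Sense id="s_body">
        <feat att="subjectField" val="astronomy"/>
      </Sense>
    </LexicalEntry>
  </Lexicon>
</LexicalResource>
"""


def feat(element, name):
    """Return the val of the direct <feat att=name> child, or None if absent."""
    if element is None:
        return None
    for f in element.findall("feat"):
        if f.get("att") == name:
            return f.get("val")
    return None


def extract_core(xml_text, domain):
    """Collect candidate concepts and is-a links for one domain (assumed rules)."""
    root = ET.fromstring(xml_text)
    concepts = {}   # sense id -> lemma written form
    links = []      # (child sense id, parent sense id)
    for entry in root.iter("LexicalEntry"):
        lemma = feat(entry.find("Lemma"), "writtenForm")
        for sense in entry.findall("Sense"):
            # Assumed rule 1: only senses tagged with the target domain are kept.
            if feat(sense, "subjectField") != domain:
                continue
            concepts[sense.get("id")] = lemma
            # Assumed rule 2: a hypernym relation suggests a taxonomic link.
            for rel in sense.findall("SenseRelation"):
                if feat(rel, "label") == "hypernym":
                    links.append((sense.get("id"), rel.get("targets")))
    # Discard links whose parent is not itself a concept of the domain.
    is_a = [(child, parent) for child, parent in links if parent in concepts]
    return concepts, is_a


concepts, is_a = extract_core(LMF_SAMPLE, "astronomy")
print(concepts)
print(is_a)
```

In this sketch each candidate concept keeps the lemma of the lexical entry it came from, which is one simple way to picture the "linguistic grounding" mentioned above. A real pipeline would also need the enrichment and validation stages described in the abstract.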
