Ontologies Application to Knowledge Discovery Process in Databases

Héctor Oscar Nigro (Universidad Nacional del Centro de la Provincia de Buenos Aires, Argentina) and Sandra Elizabeth González Císaro (Universidad Nacional del Centro de la Provincia de Buenos Aires, Argentina)
DOI: 10.4018/978-1-60566-242-8.ch054

Abstract

One of the most important and challenging problems in the Knowledge Discovery in Databases (KDD) process, or Data Mining, is the definition of prior knowledge, which can originate either from the process or from the domain. This contextual information can help select the appropriate information, features, or techniques, reduce the hypothesis space, represent the output in a more comprehensible way, and improve the whole process.

Introduction

One of the most important and challenging problems in the Knowledge Discovery in Databases (KDD) process, or Data Mining, is the definition of prior knowledge, which can originate either from the process or from the domain. This contextual information can help select the appropriate information, features, or techniques, reduce the hypothesis space, represent the output in a more comprehensible way, and improve the whole process.

Most of this background knowledge exists only implicitly, in the analyst's mind, or in textual documentation. We therefore need a conceptual model to help represent it. Advances in Knowledge Engineering make it possible to codify prior knowledge under the ontological formalism. Following Gruber's (2002) definition of an ontology as an explicit formal specification of the terms in a domain and the relations among them, we can represent both knowledge about the Knowledge Discovery process and knowledge about the domain. Ontologies are used for communication (between machines and/or humans), automated reasoning, and the representation and reuse of knowledge. An ontological foundation is therefore a precondition for the efficient automated use of Knowledge Discovery information. The relation between ontologies and Data Mining can be viewed in two ways:

  • From ontologies to data mining: knowledge is incorporated into the process through ontologies, i.e., how experts understand and carry out the analysis tasks. Representative applications are intelligent assistants for the discovery process (Bernstein et al., 2005), interpretation and validation of mined knowledge, ontologies for resource and service description, and Knowledge Grids (Cannataro et al., 2007).

  • From data mining to ontologies: domain knowledge is included in the input information, or ontologies are used to represent the results, so the analysis is carried out over these ontologies. The most distinctive applications are in Medicine, Biology, and Spatial Data, such as gene representation, taxonomies, applications in Geosciences, and medical applications, especially in evolving domains (Bogorny et al., 2006; Sidhu et al., 2006).
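
As an illustration of the second direction, where domain knowledge is added to the input information, the following minimal Python sketch uses a small is-a taxonomy to generalize raw values to broader concepts before counting them. The taxonomy, the function names (`generalize`, `concept_counts`), and the sample records are hypothetical and invented for this example; they do not come from any of the cited works.

```python
from collections import Counter

# Hypothetical is-a taxonomy (child -> parent), invented for illustration only.
IS_A = {
    "aspirin": "analgesic",
    "ibuprofen": "analgesic",
    "amoxicillin": "antibiotic",
    "analgesic": "drug",
    "antibiotic": "drug",
}


def generalize(term, levels=1):
    """Climb up to `levels` is-a links from `term`, stopping at the root."""
    for _ in range(levels):
        if term not in IS_A:
            break
        term = IS_A[term]
    return term


def concept_counts(records, levels=1):
    """Count how often each generalized concept occurs in the input records."""
    return Counter(generalize(record, levels) for record in records)


# Hypothetical input data: one drug name per record.
records = ["aspirin", "ibuprofen", "amoxicillin", "aspirin"]
counts = concept_counts(records)
print(counts)
```

Counting at the concept level rather than the raw-value level is one simple way in which a domain ontology might reduce the hypothesis space; a real system would typically rely on a full ontology language rather than a plain dictionary.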

So far, the proposals or solutions that combine KDD with ontologies are partial, i.e., they focus on only some of the steps of knowledge discovery. For instance, Euler and Scholz (2004) present a metamodel of KDD preprocessing chains that contains an ontology describing conceptual domain knowledge. This metamodel is operational, yet abstract enough to allow the reuse of successful KDD applications in similar domains. Bernstein et al. (2005) propose an intelligent tool (IDA) based on a mining ontology. Brisson and Collard (2007) present the KEOPS approach, which integrates expert knowledge throughout the data mining process in a coherent and uniform manner. An ontology-driven information system plays a central role in that approach.

The main goal of this paper is to present the application of ontologies in KDD. As a result of our research, we propose a general ontology-based model that covers all discovery steps.

This paper is organized as follows. First, the Background section introduces the main works in the field. Second, the Main Focus section is divided into: the KDD Using Ontologies cycle, in which we explain the knowledge process and propose a model; Domain Ontologies; Metadata Ontologies; and Ontologies for the Data Mining Process. Third come Future Trends, Conclusions, References, and Key Terms.

Background

This section describes the most recent research on the application of ontologies to KDD. As will become apparent, none of these works addresses the use of ontologies throughout the whole process.

Key Terms in this Chapter

Knowledge Engineering: A field within artificial intelligence that develops knowledge-based systems. Such systems are computer programs that contain large amounts of knowledge, rules and reasoning mechanisms to provide solutions to real-world problems.

RDF: Resource Description Framework, a formal language used to define ontologies. It defines a data model as a set of resources and the relations among them.

Knowledge Discovery Process in Databases: “The non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data” (Fayyad et al., 1996).

Knowledge Grid: A software architecture for geographically distributed PDKD (Parallel and Distributed Knowledge Discovery) applications, designed on top of the computational Grid mechanisms provided by Grid environments. The Knowledge Grid uses basic Grid services and is organized into two layers: the Core K-Grid Layer, which is built on top of generic Grid services, and the High-Level K-Grid Layer, which is implemented over the core layer.

RDFS: Resource Description Framework Schema, a vocabulary for describing the properties and classes of RDF resources.

Knowledge Management: An integrated, systematic approach to identifying, codifying, transferring, managing, and sharing all knowledge of an organization.

IDA (Intelligent Discovery Assistant): Helps data miners explore the space of valid DM processes. It takes advantage of an explicit ontology of data-mining techniques, which defines the various techniques and their properties (Bernstein et al., 2005).

Ontology: “A specification of a conceptualization …a description of the concepts and relations that can exist for an agent or a community of agents” (Gruber, 2002).
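
To make the RDF and RDFS terms above more concrete, the following minimal Python sketch represents an RDF-style data model as a set of (subject, predicate, object) triples and derives the types implied by `rdfs:subClassOf`. The predicates `rdf:type` and `rdfs:subClassOf` belong to the standard RDF and RDFS vocabularies; the class and resource names (`DecisionTree`, `Classifier`, `MiningTechnique`, `c45`) and the helper functions are hypothetical, and the sketch deliberately avoids any RDF library.

```python
# Each triple is (subject, predicate, object).
# Class and resource names are hypothetical, chosen only for illustration.
TRIPLES = {
    ("DecisionTree", "rdfs:subClassOf", "Classifier"),
    ("Classifier", "rdfs:subClassOf", "MiningTechnique"),
    ("c45", "rdf:type", "DecisionTree"),
}


def superclasses(cls, triples):
    """Return every class reachable from `cls` via rdfs:subClassOf links."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for subject, predicate, obj in triples:
            if (subject == current and predicate == "rdfs:subClassOf"
                    and obj not in found):
                found.add(obj)
                stack.append(obj)
    return found


def types_of(resource, triples):
    """Return the asserted types of `resource` plus those implied by subclassing."""
    asserted = {o for s, p, o in triples if s == resource and p == "rdf:type"}
    inferred = set(asserted)
    for cls in asserted:
        inferred |= superclasses(cls, triples)
    return inferred


print(sorted(types_of("c45", TRIPLES)))
```

Here the only asserted fact about `c45` is that it is a `DecisionTree`; the other two types follow from the class hierarchy, which is the kind of automated reasoning an ontology of mining techniques could support.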
