Multi-Layered Semantic Data Models

László Kovács (University of Miskolc, Hungary) and Tanja Sieber (University of Miskolc, Hungary)
Copyright: © 2009 | Pages: 6
DOI: 10.4018/978-1-59904-849-9.ch165


Introduction

One of the basic terms in information engineering is data. In our approach, a data item is defined as the representation of an information atom stored in a digital computer. Although an information atom can be considered a subject-predicate-value triplet (Lassila, 1999), data is usually given only by its value representation. This can lead to definitions in which data is just numbers, words or pictures without context. For example, in (WO, 2007), data is defined as information in numerical form that can be digitally transmitted or processed. The term ‘data’ is also frequently used without any exact terminological definition, with the effect that it remains confusing and sometimes even contradicts the definitions that are presented.

Sieber and Kammerer (2006) introduce a new interpretation of data comprising several levels. The lowest level belongs to data instances, which describe the form and appearance of symbols. The intermediate level is the level of representatives, which includes the applied encoding system. The highest level is related to the meaning, together with a description of the context. All three levels are needed to recover the information atom. For example, the symbol ‘36’ in a database determines only the value and the representation system, but not the meaning. To cover the whole information atom, the database should store additional data items that describe the original data. These additional data items are called metadata, and the main purpose of semantic data models is to describe both the context and the main structure of data items in the problem area. It is important to see that:

  • metadata are data,

  • metadata are relative, and

  • metadata describe data.

Metadata constitute a basis for bringing together data that are related in terms of content, and for processing them further. They can be understood as a prerequisite for intelligent and efficient administration and processing, and not least as a focused, formal means of providing relevant data.
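
The three levels and the triplet view can be illustrated with a minimal Python sketch. The class names, field names and the example values (‘age in years’, ‘person_17’) are assumptions introduced here for illustration only; they are not taken from Sieber and Kammerer (2006) or Lassila (1999).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataItem:
    """A data item viewed on the three levels described in the text."""
    instance: str        # lowest level: the symbols as they appear, e.g. '36'
    representation: str  # intermediate level: the applied encoding system
    meaning: str         # highest level: the meaning with its context (metadata)


@dataclass(frozen=True)
class InformationAtom:
    """A subject-predicate-value triplet."""
    subject: str
    predicate: str
    value: str


def to_atom(item: DataItem, subject: str) -> InformationAtom:
    """Combine a stored value with its metadata to obtain the full triplet."""
    return InformationAtom(subject=subject, predicate=item.meaning, value=item.instance)


# The symbol '36' alone fixes the value and representation, but not the meaning.
# The meaning and the subject below are hypothetical examples.
item = DataItem(instance="36", representation="decimal integer", meaning="age in years")
atom = to_atom(item, subject="person_17")
print(atom)
```

In this sketch the `meaning` field plays the role of metadata: it is itself a data item, and it describes another data item.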


Background

In data management systems, the context of a value is usually defined with the help of a storage structure. An identification name (a text value) is assigned to each position of the structure. The description of the storage (structure, naming and constraints) is called a schema. A major problem of structural data modeling is that it cannot provide all the information needed to understand the full context of the data. For example, a relational schema

RT (NM INT, KNEV CHAR(20), RU DATE)

alone is not enough to capture the meaning of the stored data items.
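
A small sketch using Python's built-in sqlite3 module may make the gap concrete. The schema is the one given above; the COLUMN_META table and every description stored in it are invented for illustration, since the source does not say what NM, KNEV or RU stand for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The structural schema from the text: names and types only.
conn.execute("CREATE TABLE RT (NM INT, KNEV CHAR(20), RU DATE)")

# Hypothetical metadata table; its name, columns and contents are assumptions.
conn.execute(
    "CREATE TABLE COLUMN_META (TABLE_NAME TEXT, COLUMN_NAME TEXT, DESCRIPTION TEXT)"
)
conn.executemany(
    "INSERT INTO COLUMN_META VALUES (?, ?, ?)",
    [
        ("RT", "NM", "invented example: registration number"),
        ("RT", "KNEV", "invented example: short name"),
        ("RT", "RU", "invented example: date of last update"),
    ],
)

# What the schema alone reveals: (column name, declared type) pairs.
schema = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(RT)")]

# What the metadata adds: a description for each column.
descriptions = dict(
    conn.execute(
        "SELECT COLUMN_NAME, DESCRIPTION FROM COLUMN_META WHERE TABLE_NAME = ?",
        ("RT",),
    )
)

for name, declared_type in schema:
    print(f"{name:5} {declared_type:9} {descriptions[name]}")
```

The first query returns only what the storage structure knows; the descriptions have to be stored as additional data items, which is the role metadata plays here.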

The main building blocks for describing context in semantic data models (SDM) are concepts and relationships. The first widely known structure-oriented semantic models in database design are the Entity-Relationship (ER) model (Chen, 1976) and the EER model (Thalheim, 2000). The ER model consists of three basic elements: entity (concept), relationship and attribute. Attributes are considered structural elements of entities, and an attribute may belong to only one entity. The EER model extends the ER model with IS_A and HAS_A relationships. Other extensions include SIM, IFO and RM/T. One of the main drawbacks of structure-oriented SDMs is their limited expressive power.
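
The three ER elements and the rule that an attribute belongs to only one entity can be sketched as follows. This is a minimal illustration and not a definitive implementation of the ER model; the class names and the example entities (Author, Book, WRITES) are assumptions made for this sketch.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    attributes: list[str] = field(default_factory=list)


@dataclass(frozen=True)
class Relationship:
    name: str
    participants: tuple[str, ...]  # names of the related entities


class ERModel:
    """Holds entities, attributes and relationships of a small ER schema."""

    def __init__(self) -> None:
        self.entities: dict[str, Entity] = {}
        self.relationships: list[Relationship] = []
        self._attribute_owner: dict[str, str] = {}

    def add_entity(self, name: str) -> Entity:
        entity = Entity(name)
        self.entities[name] = entity
        return entity

    def add_attribute(self, entity_name: str, attribute: str) -> None:
        # Enforce the rule that one attribute may belong to only one entity.
        owner = self._attribute_owner.get(attribute)
        if owner is not None:
            raise ValueError(f"attribute {attribute!r} already belongs to {owner!r}")
        self.entities[entity_name].attributes.append(attribute)
        self._attribute_owner[attribute] = entity_name

    def relate(self, name: str, *participants: str) -> Relationship:
        # A relationship may only connect entities that exist in the model.
        for participant in participants:
            if participant not in self.entities:
                raise KeyError(f"unknown entity {participant!r}")
        relationship = Relationship(name, participants)
        self.relationships.append(relationship)
        return relationship


# Hypothetical example schema.
model = ERModel()
model.add_entity("Author")
model.add_entity("Book")
model.add_attribute("Author", "author_name")
model.add_attribute("Book", "title")
writes = model.relate("WRITES", "Author", "Book")
```

An EER-style IS_A or HAS_A link could be represented in this sketch as a further `Relationship` with that name, although a faithful EER model would give such links their own semantics.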

Later, models such as UML and ODL (Cattell, 1997) were developed to cover the missing object-oriented elements. In ODL, a class description can contain the following elements: attributes, methods, inheritance parameters, visibility, relationships and integrity rules. These models offer rich and powerful constructs for software engineering, but they are not very flexible for describing data models at a higher level of abstraction.

Key Terms in this Chapter

Semantic Data Model: A high-level data model. It is usually based on concepts and uses a graphical formalism. It contains only the key semantic properties of the data structure and does not cover the details of the implementation.

Data Model: A formal description language to describe and to manipulate the investigated data instances. It contains three components: a static structural part, an integrity part and a manipulation part.

OWL: A language for describing Web ontologies. It uses an XML format and also contains a formal description logic component. It provides the following extra functionality: classification, type and cardinality constraints, thesauri, decidability.

RDF: A semantic data model that describes the world with statements. A statement is a triplet having the following form: subject-predicate-object.

UML: A standardized general-purpose modeling language for object oriented software systems. It has a graphical notation and contains several diagrams: structure diagrams (class, object, component, package) and behavioral diagrams (activity, use-case, state machine, interaction).

Ontology: A semantic data model that describes concepts and their relationships. It contains a controlled vocabulary and a grammar for using the vocabulary terms. An ontology makes it possible to pose queries, make assertions and perform reasoning. The most popular languages for describing ontologies are RDF and OWL.

Multi-Layered Data Model: A data model in which the model elements are assigned to levels and a hierarchy is defined between the levels. Relationships between elements on the same level (intra-level relationships) differ from relationships between elements on different levels (inter-level relationships).
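
The last definition can be illustrated with a minimal Python sketch in which every element carries a level and each relationship is classified by comparing the levels of its endpoints. The class names, the level numbering and the example elements are assumptions made for illustration; the chapter preview does not prescribe a concrete structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    name: str
    level: int  # position in the level hierarchy, 0 being the lowest


@dataclass(frozen=True)
class Link:
    source: Element
    target: Element
    label: str

    @property
    def kind(self) -> str:
        """Classify the link by comparing the levels of its two endpoints."""
        if self.source.level == self.target.level:
            return "intra-level"
        return "inter-level"


# Hypothetical elements on three levels: instance, representation, meaning.
symbol = Element("'36'", level=0)
encoding = Element("decimal integer", level=1)
age = Element("age", level=2)
person = Element("person", level=2)

links = [
    Link(symbol, encoding, "encoded_as"),  # connects two different levels
    Link(encoding, age, "denotes"),        # connects two different levels
    Link(age, person, "property_of"),      # stays within one level
]

for link in links:
    print(f"{link.source.name} -{link.label}-> {link.target.name}: {link.kind}")
```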
