
Unified Data Model for Large-Scale Multi-Schema Integration (ULMI)

Michael Dietrich, Jens Lemcke

Source Title: Handbook of Research on E-Business Standards and Protocols: Documents, Data and Advanced Web Technologies

DOI: 10.4018/978-1-4666-0146-8.ch020


Abstract

Current approaches in schema mapping and matching focus on pair-wise comparison of schemas. This chapter gives an overview of how n-way comparison of schemas via a unified data model for large-scale multi-schema integration (ULMI) can benefit schema matching and mapping processes. The approach integrates a set of input schemas into one comprehensive representation, the unified data model, which represents the closure of all integrated schemas. However, because the unified data model is too complex and too large, it is never revealed to the user. Instead, the authors derive a canonical data model which represents the most common structure of all schemas. In a use case, the advantages of the canonical data model are demonstrated. Finally, challenges for further research are derived. This work is based on excerpts from realistic input schemas, and it provides a concrete, ideal canonical data model as a reference for further research.


Introduction

Software integration is a major cost factor: about 40% of the total IT budget is spent on integration (Kastner, 2006). The main reason is a lack of knowledge about the connections between the message schemas that make up the interfaces. The growing number of applications communicating via the Internet amplifies the integration challenge. In this chapter, we present the ULMI approach. ULMI stands for “Unified data model for Large-scale Multi-schema Integration”. With our approach, we operate in the domain of enterprise information integration (EII). However, ULMI extends EII by also addressing inter-company information integration. In particular, ULMI combines the strengths of the following two traditional approaches, each of which solves the integration problem only partially on its own:

For inter-company communication, e-business standards define common message structures. The properties of this approach are summarized in the first column of Table 1. Examples of e-business standards are RosettaNet (RosettaNet, 2011) and CIDX (OAGi, 2008). An e-business standard is defined for a concrete domain, such as RosettaNet for the high-tech industry and CIDX for the chemical industry. Inside the domain, every company adapts the e-business standard to fit its individual objectives. To be adaptable, an e-business standard is under-specified and consists of many optional fields to cover all potentially relevant aspects. Concrete mappings never involve the standard itself. Instead, mappings always connect two companies’ interpretations of the standard. Since e-business standards are domain-specific and under-specified, a multitude of different standards and interpretations exists. Therefore, reusing mapping knowledge for future integration projects is difficult.

Table 1. Approaches with central data model

Property	e-Business standard	Canonical data model	Unified data model
Scope	Whole business domain	Single company	Multiple business domains
Completeness	Covers all aspects	Restricted to aspects relevant for communication	Covers all aspects
Level of detail	Under-specified	Maximum detail	Maximum detail
Mappings between…	Schema and schema	Schema and CDM	Schema and schema

Key Terms in this Chapter

(Leaf, Intermediary) Correspondence: Is an equivalence relation on (leaf, non-leaf) nodes of the schemas. In contrast to a mapping element, a correspondence has no direction. The correspondences define equivalence classes of schema nodes. The equivalence classes are an important ingredient for the unified and the canonical data models described in this chapter.

Canonical Data Model (CDM): Is a subgraph of the conflict-free UDM graph. The canonical data model is a tree and can be understood as a new schema. The canonical data model contains the leaves of a set of selected schemas. The canonical data model follows the most common structure among the selected schemas.

Mapping: Relates leaves of one schema to the leaves of another schema. In particular, a script that transforms a message conforming to one schema to a message conforming to the other schema comprises a mapping. This chapter takes mappings as a given and expects a mapping to be correct but likely incomplete.

UDM Graph: Is a graph that results from merging the corresponding nodes of the schemas according to the UDM. The UDM graph contains cycles if corresponding nodes are conflictingly nested in the schemas. A cycle allows for paths through the UDM graph on which properties are nested in a way that cannot be observed in any schema.

Unified Data Model (UDM): Consists of a set of schemas, the correspondences of the schemas’ nodes, and a unique label for each equivalence class.

Interface: For the sake of this chapter, an interface defines how a software system can be communicated with electronically. An interface consists of at least one schema.

Schema: Describes the structure of data for communication. In this chapter, a schema is represented as a tree. Each leaf of a schema represents semantically unique, atomic data to be communicated. The non-leaves structure the data.

Conflict-Free UDM Graph: Is a UDM graph where no conflictingly nested elements are merged. The conflict-free UDM graph is a directed acyclic graph. Every nesting of properties in the conflict-free UDM graph can be found in at least one schema.
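
The relationship between correspondences, equivalence classes, and the UDM graph can be illustrated with a small sketch. It is not the chapter's implementation: the representation of a schema as a list of (parent, child) edges, the node names, and the functions `equivalence_classes`, `udm_graph`, and `has_cycle` are all assumptions made for illustration. The sketch groups corresponding nodes into equivalence classes with a union-find structure, merges the schemas into one directed graph, and checks whether conflicting nesting has produced a cycle.

```python
from collections import defaultdict


def equivalence_classes(nodes, correspondences):
    """Map every node to a representative of its equivalence class (union-find)."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in correspondences:  # correspondences are undirected pairs
        parent[find(a)] = find(b)
    return {n: find(n) for n in nodes}


def udm_graph(schemas, correspondences):
    """Merge corresponding nodes of all schemas into one directed graph.

    schemas: dict of schema name -> list of (parent, child) edges.
    Node names are assumed to be unique across schemas.
    """
    nodes = {n for edges in schemas.values() for edge in edges for n in edge}
    rep = equivalence_classes(nodes, correspondences)
    graph = defaultdict(set)
    for edges in schemas.values():
        for p, c in edges:
            graph[rep[p]].add(rep[c])
            graph.setdefault(rep[c], set())  # make sure leaves appear as keys
    return dict(graph)


def has_cycle(graph):
    """Depth-first search with three colors; a grey successor means a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {n: WHITE for n in graph}

    def visit(n):
        color[n] = GREY
        for m in graph[n]:
            if color[m] == GREY or (color[m] == WHITE and visit(m)):
                return True
        color[n] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in graph)


# Toy schemas (hypothetical): A and C nest Address under Buyer, B nests Buyer under Address.
schema_a = [("A.Order", "A.Buyer"), ("A.Buyer", "A.Address")]
schema_b = [("B.Order", "B.Address"), ("B.Address", "B.Buyer")]
schema_c = [("C.Order", "C.Buyer"), ("C.Buyer", "C.Address")]

corr_ab = [("A.Order", "B.Order"), ("A.Buyer", "B.Buyer"), ("A.Address", "B.Address")]
corr_ac = [("A.Order", "C.Order"), ("A.Buyer", "C.Buyer"), ("A.Address", "C.Address")]

conflicting = udm_graph({"A": schema_a, "B": schema_b}, corr_ab)
compatible = udm_graph({"A": schema_a, "C": schema_c}, corr_ac)

print(has_cycle(conflicting))  # Buyer and Address are nested both ways
print(has_cycle(compatible))   # same nesting in both schemas
```

In this toy setting, merging schemas A and B yields a cycle between the Buyer and Address classes, which corresponds to a nesting that neither schema contains on its own. Merging A and C yields an acyclic graph, as required of a conflict-free UDM graph.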
