Special Offers
- IGI Global’s New Emerging Topic e-Book Collections
  Acquire highly focused and affordable Cutting-Edge Peer-Reviewed Research Content through a selection of 17 topic-focused e-Book Collections discounted up to 90%, compared to list prices. Collection topics include Artificial Intelligence, Data Science, Language Learning, Marketing and Customer Relations, Sustainability, and many more. Hosted on the InfoSci^® platform, these collections feature no DRM, no additional cost for multi-user licensing, no embargo of content, full-text PDF & HTML format, and more.
  Learn More
- Open Access Book (Free Access) - Encyclopedia of Information Science and Technology, Sixth Edition (ISBN: 9781668473665)
  The Encyclopedia of Information Science and Technology, Sixth Edition) continues the legacy set forth by the first five editions by providing comprehensive coverage and up-to-date definitions of the most important issues, concepts, and trends pertaining to technological advancements and information management within a variety of settings and industries. The entire book is being published under open access.
  Read Now
- Open Access Book (Free Access) - Food Sustainability, Environmental Awareness, and Adaptation and Mitigation Strategies for Developing Countries (ISBN: 9781668456293)
  Food Sustainability, Environmental Awareness, and Adaptation and Mitigation Strategies for Developing Countries provides information on the recent technology, mitigation, and environmental protection that must be applied for food sustainability in developing countries. This book is being published under Platinum Open Access through funding from Diponegoro University, Indonesia.
  Read Now
- Open Access Book (Free Access) - New Models of Higher Education: Unbundled, Rebundled, Customized, and DIY (ISBN: 9781668438091)
  The Walmart Corporation and the Lumina Foundation have provided funding to make New Models of Higher Education: Unbundled, Rebundled, Customized, and DIY fully open access, completely removing any paywall between scholars in education and the latest research on new models for the future of higher education.
  Read Now
- Open Access Book (Free Access) - Handbook of Research on the Global View of Open Access and Scholarly Communications (ISBN: 9781799898054)
  Through a collaboration between IGI Global and the University of North Texas, the Handbook of Research on the Global View of Open Access and Scholarly Communications has been published as fully open access, completely removing any paywall between researchers of any field, and the latest research on the equitable and inclusive nature of Open Access and all of its complications.
  Read Now
Books
- - Books by Subject
  - Business, Administration, & Management
  - Scientific, Technical, & Medical (STM)
  - Education
  - Books by Field
Journals
- - Journals
  - OnDemand Journal Articles
  - Journals by Subject
  - Business, Administration, & Management
  - Scientific, Technical, & Medical (STM)
  - Education
  - Journals by Field
e-Collections
OnDemand
Open Access
- View All Open Access Opportunities
  Search across all of IGI Global’s available open access publishing opportunities to unleash your research potential.
  Find an Open Access Journal for Your Next Manuscript
  Search across all of IGI Global’s available open access publishing opportunities to unleash your research potential.
  Submit an Open Access Book Proposal
  Learn more about open access book publishing and how it can propel your research forward in the field.
  Convert Your Work to Open Access
  Already published? You can convert your work to open access to increase its impact through IGI Global’s Restrospective Open Access Program.
  Utilize Open Access Collection Database
  Open up your research potential by utilizing our open access content or integrating the open access collection into your library
  Consider Open Access Agreements
  For Libraries: consider no-cost or investment-level open access agreements with IGI Global to support your faculty's research endeavors.
  Search Funding Resources
  Looking for additional funding resources to support your open accesss endeavors? View industry resources compiled by our open access team.
  Review Open Access Policies & Ethical Guidelines
  Considering IGI Global to publish your work under open access? Review IGI Global’s open access policies and ethical guidelines
Publish with Us
Resources
- - Instructors
  - Course Adoption
  - Teaching Cases
  - K-12 Online Learning Collection
  - Authors and Editors
  - eEditorial Discovery^® System
  - Peer Review Process
  - Ethics and Malpractice
  - COPE Membership
  - Fair Use Policy
  - Open Access Publishing
  - FAQ
Catalogs
About Us

Data Warehouse Maintenance, Evolution and Versioning

Johann Eder, Karl Wiggisser

Source Title: Data Warehousing Design and Advanced Engineering Applications: Methods for Complex Construction

DOI: 10.4018/978-1-60566-756-0.ch010

OnDemand:

(Individual Chapters)

Available

$37.50

Current Special Offers

No Current Special Offers

Abstract

Data Warehouses typically are building blocks of decision support systems in companies and public administration. The data contained in a data warehouse is analyzed by means of OnLine Analytical Processing tools, which provide sophisticated features for aggregating and comparing data. Decision support applications depend on the reliability and accuracy of the contained data. Typically, a data warehouse does not only comprise the current snapshot data but also historical data to enable, for instance, analysis over several years. And, as we live in a changing world, one criterion for the reliability and accuracy of the results of such long period queries is their comparability. Whereas data warehouse systems are well prepared for changes in the transactional data, they are, surprisingly, not able to deal with changes in the master data. Nonetheless, such changes do frequently occur. The crucial point for supporting changes is, first of all, being aware of their existence. Second, once you know that a change took place, it is important to know which change (i.e., knowing about differences between versions and relations between the elements of different versions). For data warehouses this means that changes are identified and represented, validity of data and structures are recorded and this knowledge is used for computing correct results for OLAP queries. This chapter is intended to motivate the need for powerful maintenance mechanisms for data warehouse cubes. It presents some basic terms and definitions for the common understanding and introduces the different aspects of data warehouse maintenance. Furthermore, several approaches addressing the problem are presented and classified by their capabilities.

Chapter Preview

Top

Introduction

The standard architecture for data warehouse systems are multidimensional databases, where transactional data (cell values) are described in terms of master data (dimensions and dimension members). Whereas today’s commercial systems are well prepared to deal with changes in the transactional data, they are, surprisingly, not able to deal with changing master data in a satisfactory way. Nonetheless, such changes frequently occur (restructuring in organizations, new laws, mergers and acquisitions, product portfolio restructuring, etc.). All these changes have to be represented in the information systems, and thus, must somehow be modeled also in the data warehouse. For data warehouses the adequate representation and treatment of such changes is even more crucial than in standard database applications, since data warehouses are intended to represent also historical data which – changes occurring – might be quite incompatible.

A simple example illustrating the problem of missing data is querying the number of inhabitants in the European Union for the last 25 years. This query seems rather straightforward and the numbers should not leave much space for interpretation. But, one has to be aware of some changes: First of all, the geopolitical entity “European Union” only exists since 1993, succeeding the “European Community”, which itself was originally named “European Economic Community”. Furthermore, in the considered period (1983 to 2008), the European Union grew from 12 to 27 members. Finally, with the reunification of East- and West-Germany in 1990 one of the member countries had a massive internal reorganization. So if querying the number of inhabitants from 1983 to 2008, how can the resulting numbers be compared? When querying this data from the Eurostat website, one has to choose the “geopolitical entity” (EU-27, EU-25, one or more counties, …) for which the data should be retrieved. If, for instance, EU-25 is chosen, the population for these 25 countries is returned also for the years before their membership. But of course, the overall sum of returned inhabitants for the year 1987 does not match the real number of people living in the European Union at that time. Comparing the numbers of 1990 and 1991, where the organization itself did not change, may indicate a massive increase of inhabitants. In reality, the 1991 number also contains the 16.4 million people of former East-Germany. Eurostat, for instance, takes this into account, and presents numbers of the united Germany also for the years before 1991. Another example for an unclear inclusion are the Baltic countries or Slovenia. They did not even exist before 1991, but were parts of other countries, which, of course, never were parts of the European Union. An alternative to presenting such “adjusted data” is to display the “historical truth”, i.e. include the numbers of different countries only after they joined the European Union. This may make sense in some situations, in others, such results may be useless.

An example demonstrating the effect of changing semantics could be to retrieve the Gross National Product of the countries in the European Union from 1983 to 2008. Besides the problems induced by the structural changes described above, i.e. whether and how to include numbers for a specific country, this query illustrates the changing semantics problem: As of 1999 and 2002, a common European currency, the Euro, was introduced as deposit currency and cash money respectively, in many – but not all – of the member countries. Thus, before 1999 the Gross National Product of different countries was expressed in the local currency, but as of 1999 it is given in Euro. Before 1999 for comparing the GNP of different countries, it is obvious that the numbers must be brought to a common base, i.e. the same currency, to be comparable. But what about statistics for a single country? For Austria, 1 Euro exchanges 13.7603 Austrian Schillings. So, someone comparing the Austrian Gross National Product from 1990–2006 without considering the Euro would notice a giant retracement in the year 1999. But, of course, someone who knows about the Euro can divide each value given in ATS by 13.7603 and then compare the values.

Besides such simple “unit changes” there may be also more complex semantic changes for dimension members. Consider a query analyzing the unemployment rate in the European Union. Not only that it is calculated in different ways for various countries, the calculation mode has also been changed several times in the last few years, for instance whether people who are attending coursed offered by federal employment offices are counted as unemployed or not. Such calculation methods may be contained in the data warehouse definition as formulae for a certain member.

Complete Chapter List

Search this Book:

Reset

MLA

APA

Chicago

Export Reference

Data Warehouse Maintenance, Evolution and Versioning

Abstract

Introduction

Complete Chapter List