Object Migration Tool for Data Warehouses

Object Migration Tool for Data Warehouses

Nayem Rahman (Intel Corporation, USA), Peter W. Burkhardt (Intel Corporation, USA) and Kevin W. Hibray (Intel Corporation, USA)
DOI: 10.4018/jsita.2010100104


Data warehouses contain numerous software applications and thousands of objects that make those applications work. Many companies maintain multiple data warehouses depending on business requirements for example development, testing and production. Installing objects and keeping them synchronized across all environments can be a challenging task due to the sheer number of objects and complexity. Software objects stored in a Source Control system must be installed on target warehouse environments. Manual copy procedures are possible but very inefficient. Developers spend much time preparing installation and migration scripts that are prone to syntax errors. This paper proposes an Object Migration and Apply Tool (OMAT) that automates software installation across all warehouses for anyone using manual procedures. An automated tool can help eliminate error prone manual procedures, increase flawless object installation and reduce installation time. The OMAT tool is easy to use through a web browser and includes many useful features that support the development life cycle, the Sarbanes-Oxley (SOX) Act requirements and enforce numerous business requirements. OMAT is designed to support the construction and maintenance of an enterprise-wide, strategic data warehouse faster and better.
Article Preview

1. Introduction

The data warehouse is considered as one of the six major IT infrastructures (Weill, et al., 2002) of business organizations. A data warehouse often begins with a small project. Once the return on investment (ROI) is realized the warehouse expands with the growth of the customer base. New data warehouse customers increase at a steady rate throughout the years until the once small project has ballooned into a large data warehouse. A data warehouse can contain hundreds of applications and thousands of objects. These objects include base tables, global temporary tables, base views, and business views for analytical purposes, stored procedures, macros, join indexes, triggers, etc.

Many companies maintain more than one data warehouse related to their business requirements. Warehouses often have names that describe their function. Some examples are Development, Consolidated testing, and Production. These warehouses are often referred to as environments or platforms.

Installing objects and keeping them synchronized across all warehouses (migration) is often a challenge due to sheer number of objects and also to various complexities. For many IT shops object software installations are a manual effort. There is a corresponding high cost due to long hours spent during the year on this activity. Manual procedures can be project timeline constraints and they frequently have a high rate of re-work. Developers can spend a lot of time preparing installation and migration scripts which are prone to run time syntax errors. The cost of migrating objects goes up when developers have to rewrite scripts and re-migrate them. This contradicts the strategic objectives of many data warehousing projects (better, faster, cheaper).

In this paper we propose an automated tool which helps make object migration flawless, more efficient and more effective. It can help any organization achieve the strategic objective of creating and maintaining a state-of-the-art data warehouse with the ulterior goal of helping business executives make better strategic and tactical decisions based on the information stored in the warehouse.

We will proceed to describe one automated tool, Object Migration and Apply Tool (OMAT), which automates object installation across all related data warehouses. Most OMAT functions originate from former manual procedures so using OMAT aligns well with existing development lifecycle activities. OMAT allows a developer to select a particular version of code from the Source Control Repository and store a copy of it in a construct called a package. During construction each OMAT package is assigned a unique 5 digit ascending sequential number. This numbering allows efficient identification and queue management. A package contains one or more related component lines listed in the sequence and in order of dependency. OMAT uses XML files to track which warehouse the object has been installed on and to enforce the synchronization of the selected code version across all of the warehouses.

After the package is constructed the developer uses OMAT to perform an installation on a target environment (Development). OMAT executes each package line sequentially and functions for all database objects - most often tables, views, macro, and stored procedures. It can also install components that reside on Non-Database Clients. Most target destinations (Tables/Folders) are determined automatically based on the Source Control Folder name. OMAT determines the database name from the context of the SQL statements. OMAT automatically looks up the appropriate database logon id and password at run time.

If meaningful errors are encountered, OMAT will stop and wait for resolution. If needed, a developer can add or change lines. Package installation can resume from the failure point after error resolution. Errors defined as 'meaningless' (e.g. SQL Drop errors due to non-existent target objects) are intercepted and ignored. Logs are stored in discrete folders on the ETL server for ready reference.

After the package is installed and tested on development, it is then 'delivered' to the Change Management team for installation on Consolidated Testing and later on Production. The developer has limited rights with their packages after the 'delivery' to Change Management. The Change management team can return the package to the developer for corrections if required except after the point of no return. At installation time OMAT has a feature to detect if there are related batch jobs running that might be using the existing warehouse objects. It will not proceed until it is clear that the objects are not in use. This prevents the installation of the new object version from causing batch job failures or data corruption.

Complete Article List

Search this Journal:
Open Access Articles: Forthcoming
Volume 10: 4 Issues (2019): Forthcoming, Available for Pre-Order
Volume 9: 4 Issues (2018): Forthcoming, Available for Pre-Order
Volume 8: 4 Issues (2017): 1 Released, 3 Forthcoming
Volume 7: 4 Issues (2016)
Volume 6: 4 Issues (2015)
Volume 5: 4 Issues (2014)
Volume 4: 4 Issues (2013)
Volume 3: 4 Issues (2012)
Volume 2: 4 Issues (2011)
Volume 1: 4 Issues (2010)
View Complete Journal Contents Listing