Test-Driven Development of Data Warehouses

Sam Schutte (Unstoppable Software, Inc., USA), Thilini Ariyachandra (Xavier University, USA) and Mark Frolick (Xavier University, USA)
DOI: 10.4018/978-1-61350-456-7.ch210

Test-driven development is a software development methodology that has recently gained a great deal of traction in the software development community. It focuses on creating software-based test cases that define the business requirements of an application before beginning the coding of the application itself. This paper proposes that test-driven development could be a useful methodology for data warehouse projects, in that it could help team members avoid some of the major pitfalls of data warehousing, and result in a higher-quality end product.

Status of TDD in the BI and Data Warehousing Space

While the data warehouse and business intelligence industry has adopted a few of the methods from agile development, such as “bite size analysis” (Arnett, 2002) and improved coding practices, the methods of test-driven development are only starting to gain use. For instance, test-driven development has been proposed as a way to verify the validity of business intelligence reports such as Crystal Reports (Landes, 2005). In the specific area of data warehousing, however, test-driven development does not appear to have made an impact.

If the principles of test-driven development were applied to a data warehousing project, the resulting data warehouse would likely be of high quality, and its functionality would not exceed the scope of the original request. Additionally, the software-based test cases would make it possible to demonstrate, in a repeatable way, that the data within the data warehouse was correct, even in the face of disbelieving executives who may question the accuracy of reports generated from the data warehouse.

To succeed in a test-driven environment, a data warehouse team would have to follow several guidelines. First, all functionality and data in the system must be specified by end-users, and then test cases must be created which specifically address each piece of data or data relationship. Second, as the system is developed, these test cases must be run against the data warehouse. When a test passes, that particular feature (e.g., database table or field, or ETL process) within the data warehousing environment is considered complete. In this way, the pending tasks in the project implementation plan simply become any tests that are currently failing.
