Data Profiling and Data Quality Metric Measurement as a Proactive Input into the Operation of Business Intelligence Systems

Scott Delaney (Hydro Tasmania, Australia)
DOI: 10.4018/978-1-4666-9562-7.ch107


Business intelligence systems have reached business-critical status within many companies. It is not uncommon for such systems to be central to the decision-making effectiveness of these enterprises. However, the processes used to load data into these systems often do not exhibit a level of robustness in line with their criticality to the organisation. The processes of loading business intelligence systems with data are subject to compromised execution, delays, or failures as a result of changes in the source system data. These ETL processes are not designed to recognise or deal with such shifts in data shape. This chapter proposes the use of data profiling techniques as a means of early discovery of issues and changes within the source system data. It examines how this knowledge can be applied to guard against reductions in the decision-making capability and effectiveness of an organisation caused by interruptions to business intelligence system availability or by compromised data quality. It does so by examining where profiling can best be applied to deliver appropriate benefit and value, the techniques for establishing profiling, and the types of actions that may be taken once the results of profiling are available. The chapter describes components that can be drawn together to provide a system of control around a business intelligence system. This system enhances the quality of organisational decision making by monitoring the characteristics of arriving data and taking action when values are materially different from those expected.
Chapter Preview

Main Focus Of The Chapter

Problem Statement

Business intelligence and decision support systems are often fragile in the extract, transform and load (ETL) processes used to bring data into the repositories that underpin them. When such processes encounter problems, the decision-making ability of an organisation is compromised to some degree, through reduced speed or agility, reduced quality, or some combination of these. It is not uncommon for such circumstances to trigger expensive investigations to uncover the root cause of the problem before a means of rectification can be designed and implemented. In essence, these investigations are reactive exercises in data profiling, discovery and data quality rule establishment. Often such circumstances arise from subtle (or not so subtle) shifts in the profile of the data arriving at the warehouse, but they can also result from an incomplete or immature understanding of the data or the business rules at the time the ETL processes were originally designed and implemented.
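
As a rough illustration of how a shift in the profile of arriving data might be caught before an ETL load runs, rather than investigated afterwards, the following is a minimal Python sketch. It is not taken from the chapter: the function names `profile_column` and `find_drift`, the choice of metrics, and the 20% tolerance are all assumptions made for the example.

```python
from statistics import mean


def profile_column(values):
    """Compute a few simple profile metrics for one column of a batch."""
    non_null = [v for v in values if v is not None]
    return {
        "row_count": len(values),
        "null_ratio": (len(values) - len(non_null)) / len(values) if values else 0.0,
        "distinct_count": len(set(non_null)),
        "mean": mean(non_null) if non_null else None,
    }


def find_drift(baseline, current, tolerance=0.2):
    """Return the metrics whose relative change from the baseline exceeds
    the tolerance (hypothetical default of 20%)."""
    drifted = {}
    for name, expected in baseline.items():
        observed = current.get(name)
        if expected is None or observed is None:
            # A metric that appears or disappears counts as a change.
            if expected != observed:
                drifted[name] = (expected, observed)
            continue
        # Avoid dividing by zero when the baseline value is 0.
        scale = abs(expected) if expected != 0 else 1.0
        if abs(observed - expected) / scale > tolerance:
            drifted[name] = (expected, observed)
    return drifted


# Baseline profile taken from a batch known to load cleanly (made-up data).
baseline = profile_column([10, 12, 11, 13, 12, 11, 10, 12])

# Newly arrived batch in which half the values are now missing.
arriving = profile_column([10, None, None, 11, None, 12, None, 13])

issues = find_drift(baseline, arriving)
if issues:
    # In practice the action might be to hold the load or alert a data steward.
    print("Profile drift detected:", issues)
```

In this sketch only the null ratio is flagged, because the mean and distinct count of the surviving values barely move. That is the sort of subtle shift that might pass unnoticed through an ETL process not designed to look for it.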
