Multi-Relational Data Mining: A Comprehensive Survey

Ali H. Gazala (Auckland University of Technology, New Zealand) and Waseem Ahmad (Auckland University of Technology, New Zealand)
DOI: 10.4018/978-1-4666-8513-0.ch003


Multi-Relational Data Mining (MRDM) is a growing research area that focuses on discovering hidden patterns and useful knowledge in relational databases. While the vast majority of data mining algorithms and techniques look for patterns in a flat, single-table data representation, MRDM looks for patterns that involve multiple tables (relations) from a relational database. This sub-domain has received increased research attention during the last two decades due to its wide range of possible applications, and many successful multi-relational data mining algorithms and techniques have been presented as a result. This chapter presents a comprehensive review of multi-relational data mining. It discusses the different approaches researchers have followed to explore the relational search space and highlights some of the most significant challenges facing researchers working in this sub-domain. The chapter also describes a number of MRDM systems that have been developed during the last few years and discusses some future research directions in this sub-domain.
Chapter Preview

2.0 Multi-Relational Data Mining

The multi-relational data mining sub-domain has emerged to develop novel algorithms and techniques designed to work in relational database environments. During the last two decades, two different approaches have arisen to address the challenging task of mining multi-relational structured data: propositionalization and upgrading.

Propositionalization is a process that leads from relational data and background knowledge to a single-table representation (Krogel, 2005). The MRDM literature suggests two different techniques for achieving propositionalization: joining and aggregation. The joining technique uses SQL join commands to link all related tables together in a single view, while the aggregation technique summarizes the raw data from multiple relations into a single compact table using SQL grouping commands. Both techniques generate a universal table that is usually large and contains many nulls and redundant values. In addition, joining and aggregation cause the loss of important semantic information that is usually represented in the relational structure. Although propositionalization techniques are widely used in the domain of knowledge discovery in databases (KDD), they also have significant drawbacks in terms of data representation and pre-processing cost.
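The two propositionalization techniques described above can be illustrated with a minimal sketch using Python's standard sqlite3 module. The `customer` and `orders` tables, their columns, and the function names are hypothetical and chosen only for illustration; they do not come from the chapter or from any particular MRDM system.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with two related tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER, amount REAL);
        INSERT INTO customer VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
        INSERT INTO orders VALUES (10, 1, 20.0), (11, 1, 30.0), (12, 2, 5.0);
        """
    )
    return conn


def propositionalize_by_join(conn):
    """Joining: link the related tables into one flat view, one row per order.

    Customer attributes are repeated for every order (redundancy), and a
    customer without orders gets NULLs in the order columns.
    """
    return conn.execute(
        """
        SELECT c.cust_id, c.name, o.order_id, o.amount
        FROM customer AS c
        LEFT JOIN orders AS o ON o.cust_id = c.cust_id
        ORDER BY c.cust_id, o.order_id
        """
    ).fetchall()


def propositionalize_by_aggregation(conn):
    """Aggregation: summarize each customer's orders into one row.

    The table stays compact, but the individual orders (and any structure
    among them) are no longer visible to the mining algorithm.
    """
    return conn.execute(
        """
        SELECT c.cust_id,
               c.name,
               COUNT(o.order_id)            AS n_orders,
               COALESCE(SUM(o.amount), 0.0) AS total_amount,
               AVG(o.amount)                AS mean_amount
        FROM customer AS c
        LEFT JOIN orders AS o ON o.cust_id = c.cust_id
        GROUP BY c.cust_id, c.name
        ORDER BY c.cust_id
        """
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for row in propositionalize_by_join(conn):
        print("join:", row)
    for row in propositionalize_by_aggregation(conn):
        print("aggregate:", row)
```

In this toy schema the joined view has more rows than there are customers and includes a row of nulls for the customer with no orders, while the aggregated table has exactly one row per customer but retains only counts and summary statistics. Both effects are small-scale versions of the drawbacks noted above.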
