iVAS: An Interactive Visual Analytic System for Frequent Set Mining

Carson Kai-Sang Leung (The University of Manitoba, Canada) and Christopher Carmichael (The University of Manitoba, Canada)
DOI: 10.4018/978-1-60960-102-7.ch013

Abstract

Nowadays, various data, text, and web mining applications can easily generate large volumes of data. Embedded within these data is previously unknown and potentially useful knowledge such as frequently occurring sets of items, merchandise, or events. Hence, numerous algorithms have been proposed for finding these frequent sets, which are usually presented in a lengthy textual list. However, “a picture is worth a thousand words”. Visual representations can enhance user understanding of the inherent relations among the frequent sets. Although a few visualizers have been developed, most of them were not designed for visualizing the mined frequent sets. In this chapter, an interactive visual analytic system called iVAS is proposed for providing visual analytic solutions to the frequent set mining problem. The system enables the visualization and advanced analysis of the original transaction databases as well as the frequent sets mined from these databases.

Introduction

Due to advances in technology, large volumes of data can be easily generated. Examples of these data include structured data in relational or transactional databases, as well as semi-structured data in text documents or on the World Wide Web. Embedded within these data is potentially useful knowledge that professionals, researchers, students, and practitioners want to discover. This calls for data mining (Frawley et al., 1991), which aims to search for implicit, previously unknown, and potentially useful information or knowledge from large volumes of data. A common data mining task is frequent set mining (Agrawal et al., 1993), which analyzes the data to find frequently occurring sets of items. These frequent sets serve as building blocks for many other data mining tasks such as the mining of association rules, correlation, sequences, episodes, emerging patterns, web access patterns, maximal frequent patterns, closed frequent patterns, and constrained patterns (Agrawal & Srikant, 1994; Bayardo, 1998; Pasquier et al., 1999; Pei et al., 2000; Lakshmanan et al., 2003; Leung et al., 2007; Leung, 2009). Moreover, these frequently occurring sets of items can be used in mining tasks such as classification (e.g., associative classification (Liu, 2009)). Frequent sets can also answer many questions that help users make important decisions for real-life applications in domains such as health care, bioinformatics, social science, and business. For example, knowing the sets of frequently purchased merchandise helps store managers make intelligent business decisions such as item shelving; finding the sets of popular elective courses helps students select the combination of courses they wish to take; and discovering the sets of frequently occurring patterns in genes helps professionals and researchers gain a better understanding of certain biomedical or social behaviours of human beings.
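
To make the task concrete, the following is a minimal sketch of frequent set mining on a toy transaction database. It uses brute-force enumeration of every subset of every transaction, which is not one of the algorithms cited above and is exponential in transaction size; the function name `frequent_sets`, the `min_support` parameter, and the sample transactions are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_sets(transactions, min_support):
    """Return {itemset: count} for itemsets found in at least min_support transactions.

    Brute-force illustration only: every non-empty subset of every
    transaction is counted, so this is practical only for tiny inputs.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # canonical order so equal sets share a key
        for size in range(1, len(items) + 1):
            for subset in combinations(items, size):
                counts[subset] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


# Hypothetical market-basket data (assumed for this sketch).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]

result = frequent_sets(transactions, min_support=2)
for itemset, count in sorted(result.items()):
    print(itemset, count)
```

Even on four transactions the output is a flat list of item sets and counts, which hints at why such textual listings become hard to read as the number of frequent sets grows.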

As frequent set mining has played important roles in many data mining tasks and has contributed to various real-life applications, it has drawn the attention of many researchers. This explains why numerous frequent set mining algorithms (Han et al., 2007; Cheng & Han, 2009) have been proposed since the introduction of the frequent set mining problem (Agrawal et al., 1993). Most of these algorithms return the mining results in textual form, such as a very long unsorted list of frequent sets of items. However, a large number of frequent sets presented in such a conventional lengthy list is not easy to understand. As a result, users may not easily discover the useful knowledge that is embedded in the large volumes of data.

It is well known that “a picture is worth a thousand words”. As visual representation matches the power of the human visual and cognitive system, having a visual representation of the frequent sets makes it easier for users (e.g., professionals, researchers, students, practitioners) to view and analyze the mining results than reading a lengthy textual list of frequent sets of items. This leads to visual analytics, which is the science of analytical reasoning supported by interactive visual interfaces (Thomas & Cook, 2005; Keim et al., 2009). Since numerous frequent set mining algorithms (which analyze large volumes of data to find frequent sets of items) have been proposed, what we need are interactive systems for visualizing the mining results so that we can take advantage of both worlds (i.e., combine advanced data analysis with visualization).

Among the existing visualization systems, many were built to visualize data other than the mining results. Those that were built for visualizing mining results mostly show the results of other data mining tasks, such as groups of similar objects (for clustering), decision trees (for classification), and rules (for association rule mining), rather than frequently occurring sets of items (for frequent set mining). Hence, an objective of this chapter is to propose an interactive visual analytic system called iVAS for effective visualization and advanced analysis of large volumes of data and the frequent sets of items mined from these data.
