A Discovery Method of Attractive Rules from the Tabular Structured Data

Shigeaki Sakurai (Tokyo Institute of Technology, Japan)
DOI: 10.4018/978-1-4666-1806-0.ch001

Abstract

This chapter introduces a method for discovering attractive rules from tabular structured data. The data is a set of examples, each composed of attributes and their attribute values. The method belongs to the research field of discovering frequent patterns from transactions composed of items. In the retail business, for instance, a transaction is a receipt and an item is a sales item. To discover patterns efficiently from tabular structured data based on their frequencies, the method focuses on the relationships between the attributes and the attribute values. The method must also deal with missing values, because some attribute values are lost due to problems in data collection and data storage. This chapter therefore introduces a method that handles missing values: it defines two evaluation criteria for the patterns and discovers the patterns through a two-step evaluation based on those criteria. In addition, this chapter introduces evaluation criteria for attractive rules so that the rules can be discovered from the patterns.

Background

Basket analysis of receipts collected from the retail business is the origin of the discovery of both frequent patterns and association rules. The rules are usually extracted from the discovered patterns, so it is important to discover the patterns efficiently. In the analysis, each receipt is defined as a transaction, and each transaction is composed of items, such as sales items in the retail business. Each item is in one of two states with respect to a transaction: either it is included in the transaction or it is not. Agrawal and Srikant (1994) and Han et al. (2000) propose representative methods for discovering frequent patterns that use the monotonic property of the patterns. The property states that as a pattern grows, its evaluation criterion never increases. It is called the Apriori property. On the other hand, Morzy and Zakrzewicz (1998) and Zaki et al. (1997) propose methods that discover the patterns quickly by devising storage methods for the data. Koh et al. (2005) propose a method that discovers association rules with low support but high confidence. Here, the support and the confidence are evaluation criteria of the patterns and the rules. Yan et al. (2005) propose a discovery method of association rules based on a genetic method. The method regards the combination of an association rule including k items and the number of items in the condition part as a chromosome. It genetically discovers better association rules by using three genetic operators: selection, crossover, and mutation.
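
To make the terms above concrete, the following is a minimal sketch of level-wise frequent pattern discovery that uses the Apriori property for pruning, followed by rule extraction with support and confidence. It is an illustration only, not the implementation of any cited paper or of the method introduced in this chapter. The function names (`support`, `frequent_patterns`, `extract_rules`), the toy receipts, and the thresholds are all assumptions chosen for the example.

```python
from itertools import combinations


def support(pattern, transactions):
    """Fraction of transactions that contain every item of the pattern."""
    hits = sum(1 for t in transactions if pattern <= t)
    return hits / len(transactions)


def frequent_patterns(transactions, min_support):
    """Level-wise search; returns {pattern: support} for frequent patterns.

    Pruning relies on the Apriori property: a pattern can only be
    frequent if every one of its sub-patterns is frequent.
    """
    items = sorted({item for t in transactions for item in t})
    level = [frozenset([item]) for item in items]  # candidate 1-item patterns
    frequent = {}
    k = 1
    while level:
        # Keep the candidates of size k that reach the support threshold.
        survivors = {}
        for candidate in level:
            s = support(candidate, transactions)
            if s >= min_support:
                survivors[candidate] = s
        frequent.update(survivors)
        k += 1
        # Join frequent patterns into candidates of size k ...
        joined = {a | b for a in survivors for b in survivors if len(a | b) == k}
        # ... and drop any candidate with an infrequent (k-1)-item sub-pattern.
        level = [
            c for c in joined
            if all(frozenset(sub) in survivors for sub in combinations(c, k - 1))
        ]
    return frequent


def extract_rules(frequent, min_confidence):
    """Return (condition, conclusion, support, confidence) tuples."""
    rules = []
    for pattern, sup in frequent.items():
        for size in range(1, len(pattern)):
            for cond in combinations(sorted(pattern), size):
                cond = frozenset(cond)
                # confidence = support(pattern) / support(condition part)
                conf = sup / frequent[cond]
                if conf >= min_confidence:
                    rules.append((cond, pattern - cond, sup, conf))
    return rules


# Toy receipts (hypothetical data): each transaction is a set of sales items.
receipts = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

patterns = frequent_patterns(receipts, min_support=0.5)
for cond, concl, sup, conf in extract_rules(patterns, min_confidence=0.9):
    print(sorted(cond), "->", sorted(concl), "support", sup, "confidence", conf)
```

In this toy data the pattern {milk, butter} falls below the support threshold, so the three-item candidate {bread, milk, butter} is pruned without its support ever being counted. That saving is what the Apriori property provides.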
