Maintenance of Frequent Patterns: A Survey

Mengling Feng (Nanyang Technological University and National University of Singapore, Singapore), Jinyan Li (Nanyang Technological University, Singapore), Guozhu Dong (Wright State University, USA) and Limsoon Wong (Nanyang Technological University, Singapore)
DOI: 10.4018/978-1-60566-404-0.ch014

Abstract

This chapter surveys the maintenance of frequent patterns in transaction datasets. It is written to be accessible to researchers familiar with the field of frequent pattern mining. The frequent pattern maintenance problem is summarized with a study of how the space of frequent patterns evolves in response to data updates. The chapter focuses on incremental and decremental maintenance. Four major types of maintenance algorithms are studied: Apriori-based, partition-based, prefix-tree-based, and concise-representation-based algorithms. The authors study the advantages and limitations of these algorithms from both theoretical and experimental perspectives, and propose possible solutions to certain limitations. Potential research opportunities and emerging trends in frequent pattern maintenance are also discussed.
Chapter Preview

Preliminaries And Problem Description

Discovery of Frequent Patterns

Let I = {i₁, i₂, ..., iₘ} be a set of distinct literals called ‘items’. A ‘pattern’, or an ‘itemset’, is a set of items. A ‘transaction’ is a non-empty set of items. A ‘dataset’ is a non-empty set of transactions. A pattern P is said to be contained or included in a transaction T if P ⊆ T. A pattern P is said to be contained in a dataset D, denoted as P ∈ D, if there is T ∈ D such that P ⊆ T. The ‘support count’ of a pattern P in a dataset D, denoted count(P,D), is the number of transactions in D that contain P. The ‘support’ of a pattern P in a dataset D, denoted sup(P,D), is calculated as sup(P,D) = count(P,D)/|D|. Figure 1(a) shows a sample dataset, and all the patterns contained in the sample dataset are enumerated in Figure 1(b) with their support counts.
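The definitions of count(P,D) and sup(P,D) can be illustrated with a minimal Python sketch. It assumes that each transaction is represented as a frozenset of single-character items; the toy dataset D and the function names are hypothetical and are not taken from Figure 1.

```python
from typing import FrozenSet, Iterable, List

Transaction = FrozenSet[str]


def count(pattern: Iterable[str], dataset: List[Transaction]) -> int:
    """Support count: the number of transactions T in D such that P is a subset of T."""
    p = frozenset(pattern)
    return sum(1 for t in dataset if p <= t)


def sup(pattern: Iterable[str], dataset: List[Transaction]) -> float:
    """Support: count(P, D) divided by |D|."""
    return count(pattern, dataset) / len(dataset)


# Hypothetical toy dataset with four transactions (an assumption for illustration).
D = [frozenset("abcd"), frozenset("bd"), frozenset("acd"), frozenset("ac")]

print(count("ac", D))  # transactions abcd, acd and ac contain {a, c}
print(sup("ac", D))
```

A list is used here instead of a set of transactions only for convenience; with this representation the subset test `p <= t` corresponds directly to P ⊆ T.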

Figure 1.

(a) An example of a transaction dataset. (b) The space of frequent patterns for the sample dataset in (a) when ms% = 25%, and the concise representations of the space. (c) Decomposition of the frequent pattern space into equivalence classes.

A pattern P is said to be frequent in a dataset D if sup(P,D) is greater than or equal to a pre-specified threshold ms%. Given a dataset D and a support threshold ms%, the collection of all frequent itemsets in D is called the ‘space of frequent patterns’, and is denoted by F(ms%, D). The task of frequent pattern mining is to discover all the patterns in the space of frequent patterns. In real-life applications, the frequent pattern space is often extremely large. By this definition, if the dataset has l distinct items, the frequent pattern space can contain up to 2^l patterns. To increase computational efficiency and reduce memory usage, concise representations have been developed to summarize the frequent pattern space.
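To make the definition of F(ms%, D) and the exponential size of the candidate space concrete, here is a brute-force sketch in Python. It is not one of the algorithms surveyed in the chapter; it simply enumerates every non-empty candidate pattern and keeps the frequent ones. The toy dataset, the threshold value and the function name `frequent_patterns` are assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List


def frequent_patterns(dataset: List[FrozenSet[str]], ms: float) -> Dict[FrozenSet[str], int]:
    """Return {pattern: support count} for every non-empty pattern with sup >= ms.

    Brute force: with l distinct items, all 2^l - 1 non-empty candidates are tested.
    """
    items = sorted(set().union(*dataset))
    result: Dict[FrozenSet[str], int] = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            p = frozenset(combo)
            c = sum(1 for t in dataset if p <= t)  # support count of p
            if c / len(dataset) >= ms:
                result[p] = c
    return result


# Hypothetical toy dataset (not the one in Figure 1) and a threshold of 50%.
D = [frozenset("abcd"), frozenset("bd"), frozenset("acd"), frozenset("ac")]
F = frequent_patterns(D, ms=0.5)
for pattern, c in sorted(F.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print("".join(sorted(pattern)), c)
```

Because the loop visits every subset of the item universe, the running time doubles with each additional item, which is the practical motivation for the concise representations mentioned above.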
