
Post-Mining of Association Rules: Techniques for Effective Knowledge Extraction

Release Date: May, 2009. Copyright © 2009. 394 pages.
DOI: 10.4018/978-1-60566-404-0, ISBN13: 9781605664040, ISBN10: 1605664049, EISBN13: 9781605664057
Cite Book

MLA

Zhao, Yanchang, Chengqi Zhang and Longbing Cao. "Post-Mining of Association Rules: Techniques for Effective Knowledge Extraction." IGI Global, 2009. 1-394. Web. 24 Apr. 2014. doi:10.4018/978-1-60566-404-0

APA

Zhao, Y., Zhang, C., & Cao, L. (2009). Post-Mining of Association Rules: Techniques for Effective Knowledge Extraction (pp. 1-394). Hershey, PA: IGI Global. doi:10.4018/978-1-60566-404-0

Chicago

Zhao, Yanchang, Chengqi Zhang and Longbing Cao. "Post-Mining of Association Rules: Techniques for Effective Knowledge Extraction." 1-394 (2009), accessed April 24, 2014. doi:10.4018/978-1-60566-404-0


Description

There is often a large number of association rules discovered in data mining practice, making it difficult for users to identify those that are of particular interest to them. Therefore, it is important to remove insignificant rules and prune redundancy as well as summarize, visualize, and post-mine the discovered rules.

Post-Mining of Association Rules: Techniques for Effective Knowledge Extraction provides a systematic collection on post-mining, summarization and presentation of association rules, and new forms of association rules. This book presents researchers, practitioners, and academicians with tools to extract useful and actionable knowledge after discovering a large number of association rules.


Table of Contents and List of Contributors

Chapter 1. Association Rules: An Overview
Paul D. McNicholas, Yanchang Zhao
Association rules present one of the most versatile techniques for the analysis of binary data, with applications in areas as diverse as retail...

Chapter 2. From Change Mining to Relevance Feedback: A Unified View on Assessing Rule Interestingness
Mirko Boettcher, Georg Ruß, Detlef Nauck, Rudolf Kruse
Association rule mining typically produces large numbers of rules, thereby creating a second-order data mining problem: which of the generated rules...

Chapter 3. Combining Data-Driven and User-Driven Evaluation Measures to Identify Interesting Rules
Solange Oliveira Rezende, Edson Augusto Melanda, Magaly Lika Fujimoto, Roberta Akemi Sinoara, Veronica Oliveira de Carvalho
Association rule mining is a data mining task that is applied in several real problems. However, due to the huge number of association rules that...

Chapter 4. Semantics-Based Classification of Rule Interestingness Measures
Julien Blanchard, Fabrice Guillet, Pascale Kuntz
Assessing rules with interestingness measures is the cornerstone of successful applications of association rule discovery. However, as numerous...

Chapter 5. Post-Processing for Rule Reduction Using Closed Set
Huawen Liu, Jigui Sun, Huijie Zhang
In data mining, rule management is getting more and more important. Usually, a large number of rules will be induced from large databases in many...

Chapter 6. A Conformity Measure Using Background Knowledge for Association Rules: Application to Text Mining
Hacène Cherfi, Amedeo Napoli, Yannick Toussaint
A text mining process using association rules generates a very large number of rules. According to experts of the domain, most of these rules...

Chapter 7. Continuous Post-Mining of Association Rules in a Data Stream Management System
Hetal Thakkar, Barzan Mozafari, Carlo Zaniolo
The real-time (or just-on-time) requirement associated with online association rule mining implies the need to expedite the analysis and validation...

Chapter 8. QROC: A Variation of ROC Space to Analyze Item Set Costs/Benefits in Association Rules
Ronaldo Cristiano Prati
Receiver Operating Characteristics (ROC) graph is a popular way of assessing the performance of classification rules. However, as such graphs are...

Chapter 9. Variations on Associative Classifiers and Classification Results Analyses
Maria-Luiza Antonie, David Chodos, Osmar Zaïane
The chapter introduces the associative classifier, a classification model based on association rules, and describes the three phases of the model...

Chapter 10. Selection of High Quality Rules in Associative Classification
Silvia Chiusano, Paolo Garza
In this chapter the authors make a comparative study of five well-known classification rule pruning methods with the aim of understanding their...

Chapter 11. Meta-Knowledge Based Approach for an Interactive Visualization of Large Amounts of Association Rules
Sadok Ben Yahia, Olivier Couturier, Tarek Hamrouni, Engelbert Mephu Nguifo
Providing efficient and easy-to-use graphical tools to users is a promising challenge of data mining, especially in the case of association rules....

Chapter 12. Visualization to Assist the Generation and Exploration of Association Rules
Claudio Haruo Yamamoto, Maria Cristina Ferreira de Oliveira, Solange Oliveira Rezende
Miners face many challenges when dealing with association rule mining tasks, such as defining proper parameters for the algorithm, handling sets of...

Chapter 13. Frequent Closed Itemsets Based Condensed Representations for Association Rules
Nicolas Pasquier
After more than one decade of researches on association rule mining, efficient and scalable techniques for the discovery of relevant association...

Chapter 14. Maintenance of Frequent Patterns: A Survey
Mengling Feng, Jinyan Li, Guozhu Dong, Limsoon Wong
This chapter surveys the maintenance of frequent patterns in transaction datasets. It is written to be accessible to researchers familiar with the...

Chapter 15. Mining Conditional Contrast Patterns
Guozhu Dong, Jinyan Li, Guimei Liu, Limsoon Wong
This chapter considers the problem of “conditional contrast pattern mining.” It is related to contrast mining, where one considers the mining of...

Chapter 16. Multidimensional Model-Based Decision Rules Mining
Qinrong Feng, Duoqian Miao, Ruizhi Wang
Decision rules mining is an important technique in machine learning and data mining, it has been studied intensively during the past few years....

Reviews and Testimonials

This book examines the post-analysis and post-mining of association rules to find useful knowledge from a large number of discovered rules and presents a systematic view of the above topic.

– Yanchang Zhao, University of Technology Sydney, Australia

This work presents recent research on reducing the number of association rules after association mining exercises.

– Book News (August 2009)

Topics Covered

  • Association rules
  • Background knowledge for association
  • Classification results analyses
  • Data stream management system
  • Maintenance of association rules
  • Meta-knowledge based approach
  • New forms of association rules
  • Post-mining of association rules
  • Semantics-based classification
  • Variations on associative classifiers

Preface

Summary

This book examines the post-analysis and post-mining of association rules to find useful knowledge from a large number of discovered rules and presents a systematic view of the above topic. It introduces up-to-date research on extracting useful knowledge from a large number of discovered association rules, and covers interestingness, post-mining, rule selection, summarization, representation and visualization of association rules, as well as new forms of association rules and new trends of association rule mining.

Background

As one of the key techniques for data mining, association rule mining was first proposed in 1993, and is today widely used in many applications. An association rule takes the form A→B, where A and B are items or itemsets, e.g., beer→diaper.

A huge number of association rules is often discovered from a dataset, and it is sometimes very difficult for a user to identify the interesting and useful ones. Therefore, it is important to remove insignificant rules, prune redundancy, and summarize, post-mine and visualize the discovered rules. Moreover, the discovered association rules are in the simple form A→B, from which only very limited information can be obtained. Some recent research has focused on new forms of association rules, such as combined association rules, class association rules, quantitative association rules, contrast patterns and multi-dimensional association rules.

Although there have already been quite a few publications on the post-analysis and post-mining of association rules, there are no books specifically on this topic. Therefore, we have edited this book to provide a collection of work on the post-mining of association rules and to present a complete picture of the post-mining stage of association rule mining.

Objectives and significance

The objectives of this book are to emphasize the importance of post-mining of association rules, to give a complete picture of the post-mining of association rules, and to present up-to-date research on how to extract useful knowledge from a large number of discovered association rules.

The unique characteristic of this book is its comprehensive collection of current research on post-mining and summarization of association rules and on new trends in association rules. It aims to answer the question “We have discovered many association rules, and so what?” Instead of algorithms or models for mining association rules themselves, it shows readers what can or should be done to extract useful and actionable knowledge after discovering a large number of association rules. It gives academia a complete picture of current research progress on post-mining and summarization of association rules. It may help industry to learn from these ideas and apply them to find useful and actionable knowledge in real-world applications. This book also aims to expand the research on association rules to new areas, such as new forms of association rules. The ideas of post-analysis may also be used in the association rule mining step itself and help to produce new, efficient algorithms for mining more useful association rules.

Target audiences

This book is aimed at researchers, postgraduate students and practitioners in the field of data mining. For researchers whose interests include data mining, this book presents them with a survey of techniques for post-mining of association rules, the up-to-date research progress and the emerging trends/directions in this area. It may spark new ideas on applying other techniques in data mining, machine learning, statistics, etc., to the post-mining phase of association rules, or using the post-mining techniques for association rules to tackle the problems in other fields.

For postgraduate students who are interested in data mining, this book presents an overview of association rule techniques and introduces the origin, interestingness, redundancy, visualization and maintenance of association rules, as well as associative classification and new forms of association rules. It presents not only the post-mining stage of association rules, but also many techniques that are actually used in the association rule mining procedure.

For data miners from industry, this book provides techniques and methodologies for extracting useful and interesting knowledge from a huge number of association rules learned in a data mining practice. It presents a whole picture of what to do after association rule mining and advanced techniques to post-mine the learned rules. Moreover, it also presents a number of real-life case studies and applications, which may help data miners to design and develop their own data mining projects.

However, the audience is not limited to those interested in association rules, because the post-mining of association rules involves visualization, clustering, classification and many other techniques from data mining, statistics and machine learning, which go beyond association rule mining itself.

Organization

This book is composed of six parts. Part I gives an introduction to association rules and the current research in related topics, including the preliminaries of association rules and the classic algorithms for association rule mining. Part II presents three techniques for using interestingness measures to select useful association rules. Part III presents four techniques for the post-processing of associations. Part IV presents two techniques for selecting high quality rules for associative classification. Part V discusses three techniques for visualization and representation of association rules. Part VI presents the maintenance of association rules and new forms of rules.

Part I presents an introduction to association rule techniques. In Chapter 1, McNicholas and Zhao discuss the origin of association rules and the functions by which association rules are traditionally characterised. The formal definition of an association rule, and its support, confidence and lift are presented, and the techniques for rule generation are introduced. It also discusses negations and negative association rules, rule pruning, the measures of interestingness, and the post-mining stage of the association rule paradigm.
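The three measures named above can be illustrated with a minimal sketch using their standard textbook definitions: support is the fraction of transactions containing an itemset, confidence of A→B is support(A∪B)/support(A), and lift is confidence(A→B)/support(B). The function names and the toy transactions below are assumptions made for illustration and are not taken from the chapter.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(a: Iterable[str], b: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Estimated P(B | A) for the rule A -> B (assumes support(A) > 0)."""
    return support(set(a) | set(b), transactions) / support(a, transactions)


def lift(a: Iterable[str], b: Iterable[str],
         transactions: List[FrozenSet[str]]) -> float:
    """Confidence of A -> B divided by support(B); 1.0 means independence."""
    return confidence(a, b, transactions) / support(b, transactions)


# Hypothetical toy data, invented for this example.
transactions = [
    frozenset({"beer", "diaper", "milk"}),
    frozenset({"beer", "diaper"}),
    frozenset({"beer", "bread"}),
    frozenset({"diaper", "milk"}),
    frozenset({"bread", "milk"}),
]

print(support({"beer", "diaper"}, transactions))
print(confidence({"beer"}, {"diaper"}, transactions))
print(lift({"beer"}, {"diaper"}, transactions))
```

On this toy data the rule beer→diaper holds in two of five transactions, and its lift is slightly above 1, which indicates a weak positive association.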

Part II studies how to identify interesting rules. In Chapter 2, Boettcher et al. present a unified view on assessing rule interestingness with the combination of rule change mining and relevance feedback. Rule change mining extends standard association rule mining by generating potentially interesting time-dependent features for an association rule during post-mining, and the existing textual description of a rule and those newly derived objective features are combined by using relevance feedback methods from information retrieval. The proposed technique yields a powerful, intuitive way of exploring the typically vast set of association rules.

Chapter 3 by Rezende et al. presents a new methodology for combining data-driven and user-driven evaluation measures to identify interesting rules. Both data-driven (objective) and user-driven (subjective) measures are discussed and analyzed for their pros and cons. With the proposed methodology, data-driven measures can be used to select some potentially interesting rules for the user's evaluation, and the rules and the knowledge obtained during the evaluation can be employed to calculate user-driven measures for identifying interesting rules.

Blanchard et al. present a semantics-based classification of rule interestingness measures in Chapter 4. They propose a novel and useful classification of interestingness measures according to three criteria: the subject, the scope, and the nature of the measure. These criteria are essential to grasp the meaning of the measures, and therefore to help users choose the ones they want to apply. Moreover, the classification allows one to compare the rules to closely related concepts such as similarities, implications, and equivalences.

Part III presents four techniques for post-analysis and post-mining of association rules. Chapter 5 by Liu et al. presents a post-processing technique for rule reduction using closed sets. Superfluous rules are filtered out from the knowledge base in a post-processing manner. With the dependence relations discovered by a closed set mining technique, redundant rules can be eliminated efficiently.

In Chapter 6, Cherfi et al. present a new technique that combines data mining and semantic techniques for post-mining and selection of association rules. To focus on result interpretation and discover new knowledge units, they introduce an original approach that classifies association rules according to qualitative criteria, using a domain model as background knowledge. Its successful application to text mining in molecular biology shows the benefits of taking into account a knowledge domain model of the data.

In the case of stream data, the post-mining of association rules is more challenging. Chapter 7 by Thakkar et al. presents a technique for continuous post-mining of association rules in a data stream management system (DSMS). The chapter describes the architecture and techniques used to achieve this advanced functionality in the Stream Mill Miner (SMM) prototype, an SQL-based DSMS designed to support continuous mining queries.

The Receiver Operating Characteristics (ROC) graph is a popular way of assessing the performance of classification rules, but it is inappropriate for evaluating the quality of association rules, as there is no class in association rule mining and the consequent parts of two different association rules might not have any correlation at all. In Chapter 8, Prati presents QROC, a novel variation of ROC space for analyzing itemset costs/benefits in association rules. It can be used to help analysts evaluate the relative interestingness of different association rules in different cost scenarios.
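For context, a single classification rule maps to one point in ordinary ROC space: its false positive rate on the x-axis and its true positive rate on the y-axis. The sketch below shows only this standard ROC placement and is not the QROC construction from Chapter 8. The function name `roc_point` and the labelled examples are hypothetical, and the code assumes both classes occur at least once.

```python
from typing import FrozenSet, Iterable, List, Tuple


def roc_point(antecedent: Iterable[str], target_class: str,
              examples: List[Tuple[FrozenSet[str], str]]) -> Tuple[float, float]:
    """Return (false positive rate, true positive rate) of the rule
    `antecedent -> target_class` over labelled examples."""
    items = frozenset(antecedent)
    tp = fp = pos = neg = 0
    for features, label in examples:
        covered = items <= features  # does the rule fire on this example?
        if label == target_class:
            pos += 1
            tp += covered
        else:
            neg += 1
            fp += covered
    return fp / neg, tp / pos


# Hypothetical labelled data, invented for this example.
examples = [
    (frozenset({"fever", "cough"}), "flu"),
    (frozenset({"fever"}), "flu"),
    (frozenset({"cough"}), "cold"),
    (frozenset({"fever", "cough"}), "cold"),
    (frozenset({"rash"}), "cold"),
    (frozenset({"cough"}), "flu"),
]

fpr, tpr = roc_point({"fever"}, "flu", examples)
print(fpr, tpr)
```

The computation depends on a fixed class label shared by all examples, which is exactly what general association rules lack, since each rule may have a different consequent.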

Part IV presents rule selection techniques for classification. Chapter 9 by Antonie et al. presents rule generation, pruning and selection in the associative classifier, which is a classification model based on association rules. Several variations on the associative classifier model are presented, namely mining data sets with re-occurring items, using negative association rules, and pruning rules using graph-based techniques. They also present a system, ARC-UI, that allows a user to analyze the results of classifying an item using an associative classifier.

In Chapter 10, Chiusano and Garza discuss the selection of high quality rules in associative classification. They present a comparative study of five well-known classification rule pruning methods and analyze the characteristics of both the selected and pruned rule sets in terms of information content. A large set of experiments has been run to empirically evaluate the effect of the pruning methods when applied individually as well as when combined.

Part V presents visualization and representation techniques for the presentation and exploration of association rules. In Chapter 11, Yahia et al. present two meta-knowledge based approaches for an interactive visualization of large amounts of association rules. Unlike traditional methods of association rule visualization, where association rule extraction and visualization are treated separately in a one-way process, the two proposed approaches use meta-knowledge to guide the user during the mining process in an integrated framework covering both steps of the data mining process. The first one builds a roadmap of compact representations of association rules from which the user can explore generic bases of association rules and derive, if desired, redundant ones without information loss. The second approach clusters the set of association rules or its generic bases, and uses a fisheye view technique to help the user during the mining of association rules.

Chapter 12 by Yamamoto et al. also discusses the visualization techniques to assist the generation and exploration of association rules. It presents an overview of the many approaches on using visual representations and information visualization techniques to assist association rule mining. A classification of the different approaches that rely on visual representations is introduced, based on the role played by the visualization technique in the exploration of rule sets. A methodology that supports visually assisted selective generation of association rules based on identifying clusters of similar itemsets is also presented. Then, a case study and some trends/issues for further developments are presented.

Pasquier presents in Chapter 13 frequent closed itemset based condensed representations for association rules. Many applications of association rules to data from different domains have shown that techniques for filtering irrelevant and useless association rules are required to simplify their interpretation by the end-user. This chapter focuses on condensed representations that are characterized in the frequent closed itemsets framework to expose their advantages and drawbacks.
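To make the notion of a closed itemset concrete: a frequent itemset is closed if no proper superset has the same support count. The sketch below is a brute-force illustration of that definition on toy data. It is not the algorithm or the condensed representations discussed in the chapter, and the function names and transactions are assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List


def frequent_itemsets(transactions: List[FrozenSet[str]],
                      min_count: int) -> Dict[FrozenSet[str], int]:
    """Enumerate all itemsets contained in at least `min_count` transactions.
    Brute force, level by level; suitable only for tiny examples."""
    items = sorted(set().union(*transactions))
    result: Dict[FrozenSet[str], int] = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = sum(1 for t in transactions if candidate <= t)
            if count >= min_count:
                result[candidate] = count
                found_any = True
        if not found_any:
            break  # no frequent itemset of this size, so none of larger size
    return result


def closed_itemsets(freq: Dict[FrozenSet[str], int]) -> Dict[FrozenSet[str], int]:
    """Keep itemsets with no proper superset of equal support count."""
    return {
        s: c for s, c in freq.items()
        if not any(s < other and c == other_count
                   for other, other_count in freq.items())
    }


# Hypothetical toy data, invented for this example.
transactions = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "b", "d"}),
    frozenset({"c"}),
]

freq = frequent_itemsets(transactions, min_count=2)
closed = closed_itemsets(freq)
print(closed)
```

Here {a} and {b} are frequent but not closed, because {a, b} occurs in exactly the same three transactions. Keeping only {a, b} and {c} loses no support information, which is the intuition behind condensed representations.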

Part VI presents techniques for the maintenance of association rules and new forms of association rules. Chapter 14 by Feng et al. presents a survey of the techniques for the maintenance of frequent patterns. The frequent pattern maintenance problem is summarized with a study on how the space of frequent patterns evolves in response to data updates. Focusing on incremental and decremental maintenance, four major types of maintenance algorithms are introduced, and the advantages and limitations of these algorithms are studied from both the theoretical and experimental perspectives. Possible solutions to certain limitations and potential research opportunities and emerging trends in frequent pattern maintenance are also discussed.

Dong et al. introduce conditional contrast patterns in Chapter 15. The topic is related to contrast mining, where one considers the mining of patterns/models that contrast two or more datasets, classes, conditions, time periods, etc. Roughly speaking, conditional contrasts capture situations where a small change in patterns is associated with a big change in the matching data of the patterns. They offer insights into “discriminating” patterns for a given condition, and can also be viewed as a new direction for the analysis and mining of frequent patterns. The chapter formalizes the concepts of conditional contrast and provides theoretical results on conditional contrast mining. An efficient algorithm is proposed based on these results, and experimental results are reported.

In Chapter 16, Feng et al. present a technique for multidimensional model-based decision rules mining, which can output generalized rules with different degrees of generalization. A method of decision rules mining from different abstract levels is provided in the chapter, which aims to improve the efficiency of decision rules mining by combining the hierarchical structure of the multidimensional model and the techniques of rough set theory.

Impacts and contributions

By collecting the research on the post-mining, summarization and presentation of association rules, as well as new forms and trends of association rules, this book presents advanced techniques for the post-processing stage of association rules and shows readers what can be done to extract useful and actionable knowledge after discovering a large number of association rules. It will foster research on this topic and will benefit the use of association rule mining in real-world applications. Readers from industry can benefit by discovering how to deal with the large number of rules discovered and how to summarize or visualize the discovered rules to make them applicable in business applications. As editors, we hope this book will encourage more research into this area, stimulate new ideas on the related topics, and lead to implementations of the presented techniques in real-world applications.

Acknowledgements

This book dates back all the way to August 2007, when our book prospectus was submitted to IGI Global as a response to the Data Mining Techniques Call 2007. After its approval, this project began in October 2007 and ended in October 2008. During the process, more than one thousand emails were sent and received in our interactions with authors, reviewers, advisory board members and the IGI team. We also received a lot of support from colleagues, researchers and the development team from IGI Global. We would like to take this opportunity to thank them for their unreserved help and support.

Firstly, we would like to thank the authors for their excellent work and for following the formatting guidelines closely. Some authors also went through the painful process of converting their manuscripts from LaTeX to Word format as required. We are grateful for their patience and quick response to our many requests.

We also greatly appreciate the efforts of the reviewers, who responded on time and provided constructive comments and helpful suggestions in their detailed review reports. Their work helped the authors to improve their manuscripts and also helped us to select high-quality papers as the book chapters.

Our thanks go to the members of the Editorial Advisory Board, Prof. Jean-Francois Boulicaut, Prof. Ramamohanarao Kotagiri, Prof. Jian Pei, Prof. Jaideep Srivastava and Prof. Philip S. Yu. Their insightful comments and suggestions helped to make the book coherent and consistent.

We would like to thank the IGI Global team for their support throughout the one-year book development. We thank Ms. Julia Mosemann for her comments, suggestions and support, which ensured the completion of this book within the planned timeframe.

We also thank Ms. Kristin M. Klinger and Ms. Jan Travers for their help on our book proposal and project contract.

We would also like to express our gratitude to our colleagues for their support and comments on this book and for their encouragement during the book editing procedure.

Last but not least, we would like to thank the Australian Research Council (ARC) for the grant on a Linkage Project (LP0775041), and the University of Technology, Sydney (UTS), Australia for the Early Career Researcher Grant, which supported our research in the past two years.


Author(s)/Editor(s) Biography

Yanchang Zhao is a Postdoctoral Research Fellow in the Data Sciences & Knowledge Discovery Research Lab, Centre for Quantum Computation and Intelligent Systems, Faculty of Engineering & IT, University of Technology, Sydney, Australia. His research interests focus on association rules, sequential patterns, clustering and post-mining. He has published more than 30 papers on the above topics, including six journal articles and two book chapters. He has served as a chair of two international workshops, a program committee member for 11 international conferences, and a reviewer for 8 international journals and over a dozen international conferences.

Chengqi Zhang is a Research Professor in the Faculty of Engineering & IT, University of Technology, Sydney (Australia). He is the Director of the UTS Research Centre for Quantum Computation and Intelligent Systems and a Chief Investigator in Data Mining Program for Australian Capital Markets on Cooperative Research Centre. He has been a chief investigator of eight research projects. His research interests include Data Mining and Multi-Agent Systems. He is a co-author of three monographs, a co-editor of nine books, and an author or co-author of more than 150 research papers. He is the chair of the ACS (Australian Computer Society) National Committee for Artificial Intelligence and Expert Systems, and a chair/member of the Steering Committee for three international conferences.

Longbing Cao is an Associate Professor in the Faculty of Engineering & IT, University of Technology, Sydney (Australia). He is the Director of the Data Sciences & Knowledge Discovery Research Lab. His research interests focus on domain driven data mining, multi-agents, and the integration of agent and data mining. He is a chief investigator of two ARC (Australian Research Council) Discovery projects and one ARC Linkage project. He has over 50 publications, including one monograph, two edited books and 10 journal articles. He is a program co-chair of 11 international conferences.


Editorial Board

  • Jean-Francois Boulicaut, Institut National des Sciences Appliquées de Lyon, France
  • Ramamohanarao Kotagiri, The University of Melbourne, Australia
  • Jian Pei, Simon Fraser University, Canada
  • Jaideep Srivastava, University of Minnesota, USA
  • Philip S. Yu, University of Illinois at Chicago, USA