AFARTICA: A Frequent Item-Set Mining Method Using Artificial Cell Division Algorithm

Saubhik Paladhi (University of Kalyani, Kalyani, India), Sankhadeep Chatterjee (University of Calcutta, Kolkata, India), Takaaki Goto (Toyo University, Saitama, Japan) and Soumya Sen (University of Calcutta, Kolkata, India)
Copyright: © 2019 | Pages: 23
DOI: 10.4018/JDM.2019070104

Abstract

Frequent item-set mining has been studied extensively over the last decade, and several successful approaches have been proposed to identify the maximal frequent item-sets from a set of typical item-sets. The present work introduces a novel pruning mechanism that proves to be significantly time-efficient. The technique is based on the Artificial Cell Division (ACD) algorithm, which has been found to be highly successful in solving tasks that involve a multi-way search of the search space. The necessity conditions of the ACD process are modified to suit the pruning procedure. The proposed algorithm is compared with the Apriori algorithm implemented in WEKA. An experimental evaluation has been conducted, and its results show that AFARTICA outperforms the Apriori algorithm. The results also indicate that, for the same set of item-sets, the proposed algorithm performs better when the support threshold is higher.

Introduction

Frequent item-set mining is considered one of the fundamental tasks of data mining, alongside other common problems such as the discovery of association rules, correlation mining, and multidimensional pattern mining. The essential problem is to find, from a given set of items, the item-sets that occur with a frequency above a given threshold. The problem is challenging because it must be solved over a large database in real time.
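The problem statement can be illustrated with a minimal Python sketch. It is not part of the proposed method: the function names, the toy transactions, and the convention that an item-set is frequent when its count is at least `min_support` are all assumptions made for illustration. The brute-force enumeration is exponential in the number of items, which is the cost that practical algorithms try to avoid.

```python
from itertools import combinations


def support_count(transactions, itemset):
    """Count the transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(1 for t in transactions if itemset <= frozenset(t))


def frequent_itemsets_bruteforce(transactions, min_support):
    """Return {itemset: count} for every item-set meeting the threshold.

    Assumption: "frequent" means count >= min_support (absolute count).
    Every non-empty subset of the item universe is tested, so this is
    only usable on tiny examples.
    """
    items = sorted({item for t in transactions for item in t})
    result = {}
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            count = support_count(transactions, candidate)
            if count >= min_support:
                result[frozenset(candidate)] = count
    return result


# Hypothetical toy database of four transactions.
TOY_TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

if __name__ == "__main__":
    found = frequent_itemsets_bruteforce(TOY_TRANSACTIONS, min_support=2)
    for itemset, count in sorted(found.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), count)
```

With a threshold of 2, the pair of milk and butter is rejected because it appears in only one transaction, while each single item and the two pairs containing bread are kept.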

The target of data mining is to identify patterns in data by using various tools and algorithms (Rajagopalan & Krovi, 2002). Most data-mining algorithms designed to solve this problem are built upon the idea of the Apriori algorithm (Agrawal et al., 1996). This algorithm uses a breadth-first search strategy, moving from the bottom of the search space to the top, to find every possible frequent item-set. Enumerating all possible subsets of a frequent pattern, whose number grows with the pattern's length, is computationally hard for dense data. Recent research has therefore focused on finding the maximal frequent item-sets. The present work has been greatly influenced by the methodology of this algorithm. Zaki (Zaki, 2000; Zaki & Hsiao, 1999) carried out experiments on both real and synthetic data with a new, efficient rule-mining framework based on closed sets. Generating rules only on demand and the method for reducing redundant rules are found to be of interest.
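The bottom-up, breadth-first strategy described above can be sketched in a few lines of Python. This is a textbook-style illustration of level-wise Apriori search followed by a filter for maximal frequent item-sets. It is neither the WEKA implementation nor the ACD-based pruning of AFARTICA, and the names `apriori` and `maximal_itemsets`, along with the "count >= min_support" convention, are assumptions made for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Level-wise (breadth-first) search for frequent item-sets.

    Returns {frozenset: count}. Assumes an absolute support threshold,
    with an item-set counted as frequent when count >= min_support.
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c >= min_support}

    frequent = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: merge frequent (k-1)-sets into candidate k-sets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count step: one pass over the database for the surviving candidates.
        level = {}
        for c in candidates:
            count = sum(1 for t in transactions if c <= t)
            if count >= min_support:
                level[c] = count
        frequent.update(level)
        k += 1
    return frequent


def maximal_itemsets(frequent):
    """Keep only frequent item-sets that have no frequent proper superset."""
    return {s for s in frequent if not any(s < other for other in frequent)}


if __name__ == "__main__":
    # Hypothetical toy database of four transactions.
    db = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    freq = apriori(db, min_support=2)
    for s in sorted(maximal_itemsets(freq), key=sorted):
        print(sorted(s), freq[s])
```

On the toy database, the three-item candidate is discarded in the prune step without a database scan, because its subset of milk and butter is not frequent. The maximal filter then reduces five frequent item-sets to two, which shows why reporting only maximal item-sets gives a more compact answer on dense data.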