An Efficient Approach for Incremental Association Rule Mining through Histogram Matching Technique

Ajay Kumar (Department of Computer Science & Engineering, Jaypee University of Engineering & Technology, Guna, India), Shishir Kumar (Department of Computer Science & Engineering, Jaypee University of Engineering & Technology, Guna, India) and Sakshi Saxena (Department of Computer Science & Engineering, Jaypee University of Engineering & Technology, Guna, India)
Copyright: © 2012 | Pages: 14
DOI: 10.4018/ijirr.2012040103

Abstract

This work proposes an approach for obtaining appropriate association rules when the data set is incrementally updated. The raw data is first clustered with the k-means clustering algorithm, and appropriate rules are generated for each cluster. A histogram and a probability density function are also generated for each cluster. When a burst of new data arrives at the system, its histogram and probability density function are obtained first. The new data set is then added to the cluster whose histogram and probability density function most closely match its own. The proposed method is evaluated and explained on synthetic data.

2. Preliminary Definitions

Data mining is the process of extracting interesting (non-trivial, implicit, previously unknown and potentially useful) information or patterns from large information repositories, and it is the core process of Knowledge Discovery in Databases (KDD) (Zhao & Bhowmick, 2003). Data mining techniques include association rule mining, classification, clustering, time series mining, and sequential pattern mining, to name a few, with association rule mining receiving significant research attention (Huang, Dai, & Chen, 2007).

Suppose that at a given instant of time t there is a transaction dataset D. D consists of a set of transactions, each identified by a transaction identifier (TID). Now assume that at time t+1 the database is updated: a set of transactions D+ is added to the original database D, and a set of transactions D- is deleted from it. The resulting modified database D* can be represented as: D* = (D ∪ D+) – D-.
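The update D* = (D ∪ D+) – D- can be illustrated with a minimal Python sketch. It is not taken from the article: the function name `apply_update`, the dictionary representation keyed by TID, the choice to identify deleted transactions by TID, and the sample item names are all assumptions made for illustration.

```python
def apply_update(d, d_plus, d_minus_tids):
    """Return D* = (D ∪ D+) − D- without modifying the inputs.

    d, d_plus     : dicts mapping TID -> set of items (assumed representation)
    d_minus_tids  : iterable of TIDs to delete (assumed way of naming D-)
    """
    updated = dict(d)            # start from a copy of D
    updated.update(d_plus)       # D ∪ D+
    for tid in d_minus_tids:     # ... − D-
        updated.pop(tid, None)   # ignore TIDs that are not present
    return updated


# Hypothetical database at time t
D = {
    1: {"bread", "milk"},
    2: {"bread", "butter"},
    3: {"milk", "eggs"},
}
# Hypothetical update arriving at time t+1
D_plus = {4: {"bread", "eggs"}, 5: {"milk", "butter"}}
D_minus = {2}

D_star = apply_update(D, D_plus, D_minus)
print(sorted(D_star))  # TIDs remaining in D*: [1, 3, 4, 5]
```

Keying transactions by TID keeps both the addition and the deletion step a simple lookup, which matches the role the TID plays in the definition above.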
