Introduction
Major technological developments and innovations in information technology have made it easy and affordable for organizations to store huge amounts of data. Data mining techniques extract useful information for strategic decision making from this voluminous data, whether it is centralized or distributed (Agrawal & Srikant, 1994; Han & Kamber, 2001).
The term data mining refers to extracting, or mining, knowledge from massive amounts of data. Data mining functionalities such as association rule mining, cluster analysis, classification, and prediction specify the different kinds of patterns to be mined. Association Rule Mining (ARM) finds interesting associations or correlations among a large set of data items. Finding association rules in large volumes of business transactions can support business decisions such as catalog design and cross marketing. A well-known example of ARM is market basket analysis, which analyzes customer buying habits by finding associations between the different items in customers' shopping baskets. This analysis can help retailers develop marketing strategies. ARM involves two stages: finding all frequent itemsets, and generating strong association rules from those itemsets.
Association Rule Mining: Basic Concepts
Let I = {i1, i2, …, im} be a set of m distinct items. Let D denote a database of transactions, where each transaction T is a set of items such that T ⊆ I. Each transaction has a unique identifier, called its TID. A set of items is referred to as an itemset, and an itemset that contains k items is a k-itemset. The support of an itemset is the ratio of the number of transactions in the data source that contain the itemset to the total number of transactions in the data source. Support therefore shows how frequently an itemset occurs. The itemset X is said to have a support s if s% of the transactions contain X. The support of an association rule X→Y is given by
Support = (Number of transactions containing X ∪ Y) / (Total number of transactions)
where X is the antecedent and Y is the consequent. A transaction contains X ∪ Y when it contains every item of both X and Y.
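The support definition can be illustrated with a minimal Python sketch. The toy database `transactions` and the helper names `support` and `rule_support` are assumptions made for illustration only; they do not come from the article.

```python
# Hypothetical toy database D: each transaction T is a set of items (T ⊆ I).
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "diaper", "beer", "eggs"}),
    frozenset({"milk", "diaper", "beer", "cola"}),
    frozenset({"bread", "milk", "diaper", "beer"}),
    frozenset({"bread", "milk", "diaper", "cola"}),
]


def support(itemset, db):
    """Fraction of transactions in db that contain every item of itemset."""
    itemset = frozenset(itemset)
    if not db:
        return 0.0
    # `itemset <= t` is the subset test: t contains all items of itemset.
    return sum(1 for t in db if itemset <= t) / len(db)


def rule_support(x, y, db):
    """Support of the rule X -> Y: the support of the combined itemset X ∪ Y."""
    return support(frozenset(x) | frozenset(y), db)


print(support({"bread"}, transactions))               # 4 of 5 transactions
print(rule_support({"diaper"}, {"beer"}, transactions))  # 3 of 5 transactions
```

Here three of the five transactions contain both diaper and beer, so the rule {diaper}→{beer} has a support of 3/5, or 60%.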
An itemset is said to be frequent when its support is larger than a user-specified minimum support. Confidence shows the strength of the relation between the antecedent and the consequent. The confidence of an association rule X→Y is given by
Confidence = (Number of transactions containing X ∪ Y) / (Number of transactions containing X)
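As a hedged illustration of the confidence formula, the sketch below counts transactions directly. The toy database `transactions` and the function name `confidence` are hypothetical and chosen only for this example.

```python
# Hypothetical toy database: each transaction is a set of item names.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "diaper", "beer", "eggs"}),
    frozenset({"milk", "diaper", "beer", "cola"}),
    frozenset({"bread", "milk", "diaper", "beer"}),
    frozenset({"bread", "milk", "diaper", "cola"}),
]


def confidence(x, y, db):
    """Confidence of X -> Y: count(X ∪ Y) / count(X)."""
    x = frozenset(x)
    xy = x | frozenset(y)
    count_x = sum(1 for t in db if x <= t)
    if count_x == 0:
        # The antecedent never occurs, so the ratio is undefined; this
        # sketch returns 0.0 by convention (an assumption, not from the text).
        return 0.0
    count_xy = sum(1 for t in db if xy <= t)
    return count_xy / count_x


print(confidence({"diaper"}, {"beer"}, transactions))  # 3 / 4
print(confidence({"beer"}, {"diaper"}, transactions))  # 3 / 3
```

The two calls show that confidence is not symmetric: both rules cover the same three transactions, but diaper appears in four transactions while beer appears in only three.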
An association rule is said to be strong when its confidence is larger than a user-specified minimum confidence. Only association rules whose support and confidence exceed the minimum support and minimum confidence are mined. Many algorithms have been proposed for generating frequent itemsets, including Apriori, Pincer search, and the frequent pattern tree (Agrawal & Srikant, 1994; Lin & Kedem, 2002; Han, Pei, Yin & Mao, 2004).
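To make the mining process concrete, here is a simplified level-wise sketch in the spirit of Apriori. It is not the published algorithm and omits its optimizations. The function names, the toy database, and the threshold values are all assumptions for illustration. The strict `>` comparisons follow the "larger than" wording used here; many formulations use `>=` instead.

```python
from itertools import combinations


def frequent_itemsets(db, min_support):
    """Return {itemset: support} for itemsets whose support > min_support."""
    n = len(db)
    current = [frozenset([i]) for i in sorted({i for t in db for i in t})]
    frequent = {}
    k = 1
    while current:
        # Keep the candidates of size k that are frequent.
        level = {}
        for c in current:
            s = sum(1 for t in db if c <= t) / n
            if s > min_support:
                level[c] = s
        frequent.update(level)
        # Build (k+1)-candidates and prune any with an infrequent k-subset.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        current = [c for c in joined
                   if all(frozenset(sub) in level
                          for sub in combinations(c, k - 1))]
    return frequent


def strong_rules(frequent, min_confidence):
    """Return (X, Y, support, confidence) for rules with confidence > minimum."""
    rules = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for x in combinations(sorted(itemset), r):
                x = frozenset(x)
                # Every subset of a frequent itemset is itself frequent,
                # so frequent[x] is always present.
                conf = s / frequent[x]
                if conf > min_confidence:
                    rules.append((x, itemset - x, s, conf))
    return rules


# Hypothetical toy database and thresholds.
db = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "diaper", "beer", "eggs"}),
    frozenset({"milk", "diaper", "beer", "cola"}),
    frozenset({"bread", "milk", "diaper", "beer"}),
    frozenset({"bread", "milk", "diaper", "cola"}),
]
freq = frequent_itemsets(db, min_support=0.5)
for x, y, s, c in strong_rules(freq, min_confidence=0.8):
    print(sorted(x), "->", sorted(y), f"support={s:.2f} confidence={c:.2f}")
```

With these assumed thresholds, the only rule that survives is {beer}→{diaper}. The other candidate rules among frequent pairs have a confidence of 0.75, which does not exceed 0.8.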