Introduction
Data mining (Cios, Pedrycz, Swiniarski, & Kurgan, 2012; Cao & Zhang, 2006) is an influential tool for knowledge discovery. Association rule mining (Liu, Zhai, & Pedrycz, 2012) is a well-known data mining technique. It captures all rules that signify the presence of some items given the presence of other items in the same transaction.
The original motivation behind association rule mining was market basket analysis, the study of customers' buying habits (Agrawal, Imielinski, & Swami, 1993). More recently, association rule mining has been extended to areas such as medical diagnosis (Rajendran & Madheswaran, 2010; Xing & Pei, 2010), network security (Wang & Bridges, 2000; Mao & Zhu, 2002; Sheikhan & Jadidi, 2009), geographical databases (Koperski & Han, 1995), biological databases (Gupta, Mangal, Tiwari, & Mitra, 2006; Martinez, Pasquier, & Pasquier, 2008), stock market databases (Saradhi, Ram Prakash, Pavan Kumar, Rao, & Vijay, 2012), web mining (Chai & Li, 2010), misuse detection (Sheikhan & Jadidi, 2009), manufacturing (Wantanabe, 2010), and electronic commerce (Natarajan & Sheka, 2005).
An association rule (Agrawal, Imielinski, & Swami, 1993) is an implication $P \Rightarrow Q$, where $P \cap Q = \emptyset$ and $P$ and $Q$ are sets of items. Support and confidence are the two primitive measures used to validate an association rule. The support of the rule is the percentage of transactions containing both $P$ and $Q$, whereas the confidence of the rule is the ratio of the support of $P \cup Q$ to the support of $P$. The association rule mining problem is thus defined as “the discovery of all association rules satisfying user defined support and confidence.”
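The two measures can be illustrated with a minimal Python sketch. The function names and the toy transactions below are assumptions made for illustration only and do not come from the cited work; support is returned as a fraction rather than a percentage.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(p: Iterable[str], q: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """Support of (P union Q) divided by the support of P."""
    p_set, q_set = frozenset(p), frozenset(q)
    support_p = support(p_set, transactions)
    if support_p == 0.0:
        return 0.0  # rule is undefined when P never occurs; report 0 here
    return support(p_set | q_set, transactions) / support_p


# Hypothetical market-basket data, one frozenset per transaction.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

rule_support = support({"bread", "milk"}, transactions)          # 2 of 4 transactions
rule_confidence = confidence({"bread"}, {"milk"}, transactions)  # (2/4) / (3/4)
print(rule_support, rule_confidence)
```

In this toy data, the rule bread ⇒ milk holds in two of the four transactions, and in two of the three transactions that contain bread. A miner would keep the rule only if both values meet the user-defined thresholds.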
A good number of algorithms and methods (Brin, Motwani, & Silverstein, 1997; Han, Pei, & Yin, 2000; Park, Chen, & Yu, 1997; Srikant & Agrawal, 2000; Zhang & Zhang, 2001; Ayubi, Muyeba, Baraani-Dastjerdi, & Keane, 2009) have been developed for association rule mining. The majority of these algorithms handle Boolean data.
Transactional data in real-world applications usually contains quantitative attributes. One approach to handling such data is to partition the values into intervals and treat each interval as a Boolean attribute (Srikant & Agrawal, 1995, 1996). During mining, this discrete interval method either discards or overemphasizes data points close to an interval boundary, which is known as the “sharp boundary problem.”
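A small Python sketch can make the sharp boundary problem concrete. The attribute (income in thousands), the cut points 30 and 70, and the interval labels are hypothetical values chosen for illustration; they are not taken from Srikant and Agrawal.

```python
from bisect import bisect_right

# Hypothetical cut points for an income attribute, in thousands.
BOUNDARIES = [30, 70]
LABELS = ["low", "medium", "high"]


def crisp_interval(value: float) -> str:
    """Map a quantitative value to exactly one interval label.

    Intervals are [-inf, 30), [30, 70) and [70, +inf).
    """
    return LABELS[bisect_right(BOUNDARIES, value)]


# Two nearly identical incomes end up in different Boolean attributes.
for income in (29.9, 30.0, 69.9, 70.0):
    print(income, crisp_interval(income))
```

An income of 29.9 counts fully toward "low" and not at all toward "medium", while 30.0 does the opposite, although the two values differ by only 0.1. Rules involving either interval therefore depend heavily on where the cut points happen to fall.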
As a remedy, fuzzy association rules were proposed (Kouk, Fu, & Wong, 1998; Hong, Kuo, & Chi, 2001; Chen & Wei, 2002; Kaya, Alhajj, Polat, & Arslan, 2002; Muyeba, Khan, & Coenen, 2008) and became popular, because a fuzzy set gives a soft transition between membership and non-membership of an item, so very few boundary elements are excluded. Additionally, linguistic variables such as “poor,” “moderate,” and “rich,” which are used as fuzzy sets, make association rules more interpretable. A comparison between fuzzy association rules and quantitative association rules can be found in Verlinde, De Cock, and Boute (2006) and Hullermeier and Yi (2007).
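The soft transition can be sketched with trapezoidal membership functions, one common choice among several. The breakpoints below (income in thousands) and the helper names are assumptions for illustration; the cited papers may define their fuzzy sets differently.

```python
import math
from typing import Dict


def trapezoid(x: float, a: float, b: float, c: float, d: float) -> float:
    """Membership that rises on [a, b], equals 1 on [b, c], and falls on [c, d].

    Infinite a/b or c/d give an open-ended "shoulder" on that side.
    """
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def income_memberships(x: float) -> Dict[str, float]:
    """Degree to which income x belongs to each (hypothetical) fuzzy set."""
    return {
        "poor": trapezoid(x, -math.inf, -math.inf, 20, 40),
        "moderate": trapezoid(x, 20, 40, 60, 80),
        "rich": trapezoid(x, 60, 80, math.inf, math.inf),
    }


# Neighbouring incomes now receive similar, partial memberships.
for income in (29, 30, 31):
    print(income, income_memberships(income))
```

An income of 29 belongs to "poor" with degree 0.55 and to "moderate" with degree 0.45, and these degrees shift gradually as income increases instead of flipping at a single cut point. Each value near a boundary therefore still contributes, in part, to both neighbouring sets.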