Mining Top-k Regular High-Utility Itemsets in Transactional Databases

P. Lalitha Kumari (National Institute of Technology, Warangal, India), S. G. Sanjeevi (National Institute of Technology, Warangal, India) and T.V. Madhusudhana Rao (Sri Sivani College Of Engineering, Srikakulam, India)
Copyright: © 2019 | Pages: 22
DOI: 10.4018/IJDWM.2019010104

Abstract

Mining high-utility itemsets is an important task in data mining. It involves an exponential search space and returns a very large number of high-utility itemsets. In a real-time scenario, it is often sufficient to mine a small number of high-utility itemsets based on user-specified interestingness. Recently, the temporal regularity of an itemset has come to be considered an important interestingness criterion for many applications. Methods for finding regular high-utility itemsets suffer from the difficulty of setting the threshold value. To address this problem, a novel algorithm called TKRHU (Top-k Regular High Utility Itemset) Miner is proposed to mine the top-k high-utility itemsets that appear regularly, where k represents the desired number of regular high-utility itemsets. A novel list structure, RUL, and efficient pruning techniques that reduce the search space are developed to discover the top-k regular itemsets with high profit. Experimental results show that the proposed algorithm, using the novel list structure, achieves high efficiency in terms of runtime and space.

1. Introduction

Frequent itemset mining is an important task in Association Rule Mining (ARM) (Agrawal et al., 1993; Agrawal et al., 1994). Factors such as profit and quantity are not considered in ARM. An itemset with high frequency may not be interesting; users may instead be interested in itemsets with high profits for decision making. High-utility itemset (HUI) mining was thus introduced to overcome some limitations of ARM. HUI mining can be thought of as an extension of frequent itemset mining that considers profits and quantities: the utility of an itemset is computed from the profit and quantity of its items. The traditional association rule mining model treats all items in the database as equally important and checks only whether an item is present in a transaction or not. The frequent itemsets identified by frequent itemset mining (Agrawal et al., 1993; Agrawal et al., 1994) may contribute only a small portion of the overall profit, whereas non-frequent itemsets may contribute a large portion. Frequent itemset mining uses threshold values such as support and confidence. Association rule mining algorithms generate many redundant rules because they rely on these thresholds. These redundant association rules are the main barrier to the efficient utilization of association rules and should be removed to improve efficiency.
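
The contrast between frequency and utility can be illustrated with a minimal sketch. The toy transactions, the unit profits, and the function names `support` and `utility` below are hypothetical and chosen only for illustration; the utility of an itemset is taken here, as is common in the high-utility itemset literature, to be the sum of quantity times unit profit over every transaction that contains the whole itemset.

```python
from typing import Dict, Iterable, List

# Hypothetical toy database: each transaction maps an item to its purchased quantity.
transactions: List[Dict[str, int]] = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 3},
    {"d": 1},
]

# Hypothetical unit profit of each item.
profit: Dict[str, int] = {"a": 1, "b": 2, "c": 5, "d": 40}


def support(itemset: Iterable[str], db: List[Dict[str, int]]) -> int:
    """Number of transactions that contain every item of the itemset."""
    items = list(itemset)
    return sum(1 for t in db if all(i in t for i in items))


def utility(itemset: Iterable[str], db: List[Dict[str, int]],
            unit_profit: Dict[str, int]) -> int:
    """Sum of quantity * unit profit over all transactions containing the itemset."""
    items = list(itemset)
    total = 0
    for t in db:
        if all(i in t for i in items):
            total += sum(t[i] * unit_profit[i] for i in items)
    return total


# Item "a" is the most frequent item but yields little profit,
# while item "d" appears only once and yields the most.
print(support(["a"], transactions), utility(["a"], transactions, profit))  # 3 4
print(support(["d"], transactions), utility(["d"], transactions, profit))  # 1 40
```

In this toy data, a frequency-based miner would rank "a" first, whereas a utility-based miner would rank "d" first, which is the gap that high-utility itemset mining is meant to close.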

Several methods have been proposed to reduce redundant association rules (Mafruz Zaman Ashrafi et al., 2004; Mafruz Zaman Ashrafi et al., 2005; Mafruz Zaman Ashrafi et al., 2007). Negative and exception rules in association rule mining have been discussed by Olena Daly et al. (2004) and David Taniar et al. (2008). Haorianto Cokrowijoyo Tjioe et al. (2005) provide an approach to mining association rules in data warehouses that considers measurements of summarized data. Laura Irina Rusu et al. (2005) proposed a methodology for building a data warehouse for XML data. Mafruz Zaman Ashrafi et al. (2004) proposed an algorithm called ODAM for distributed environments.

Weighted association rule mining was introduced (Cai et al., 1998; Yun et al., 2008; Yun & Leggett, 2005; Yun et al., 2007) to address the limitation that all items are treated as equally important. For a retail business, it is necessary to identify the most valuable customers, as these customers contribute more profit to the business. However, these customers may not appear in a large number of transactions. To address this issue, an additional parameter, namely weights or profits, is added to frequent itemset mining. High-utility itemset mining refers to mining itemsets with high profits. The approaches for mining frequent itemsets cannot be applied directly to mine HUIs. Recently, many algorithms have been proposed by Liu et al. (2012), Lin et al. (2015), Liu et al. (2005), Tseng et al. (2010), Shie et al. (2012), Tseng et al. (2013), Tseng et al. (2015), and Wu et al. (2012) to mine HUIs efficiently.
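
One well-known reason frequent itemset approaches do not carry over directly is that utility, unlike support, is not anti-monotone: a superset of a low-utility itemset can itself be high-utility, so Apriori-style pruning on utility alone would discard valid results. The sketch below demonstrates this by brute-force enumeration on a hypothetical toy database; the data, the threshold `min_util = 10`, and the function names are assumptions for illustration and are not the TKRHU algorithm.

```python
from itertools import combinations
from typing import Dict, List, Tuple

# Hypothetical toy database: item -> purchased quantity per transaction.
transactions: List[Dict[str, int]] = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 3},
    {"d": 1},
]

# Hypothetical unit profit of each item.
profit: Dict[str, int] = {"a": 1, "b": 2, "c": 5, "d": 40}


def utility(itemset: Tuple[str, ...]) -> int:
    """Sum of quantity * unit profit over transactions containing the whole itemset."""
    return sum(
        sum(t[i] * profit[i] for i in itemset)
        for t in transactions
        if all(i in t for i in itemset)
    )


def high_utility_itemsets(min_util: int) -> Dict[Tuple[str, ...], int]:
    """Brute-force enumeration of every itemset whose utility reaches min_util."""
    items = sorted(profit)
    result: Dict[Tuple[str, ...], int] = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            u = utility(combo)
            if u >= min_util:
                result[combo] = u
    return result


hui = high_utility_itemsets(min_util=10)

# ("a",) has utility 4 and is not high-utility, yet its superset ("a", "c")
# has utility 23 and is, so pruning supersets of ("a",) would be incorrect.
print(("a",) in hui, hui[("a", "c")])  # False 23
```

The enumeration visits every subset of the item universe, which is exactly the exponential search space that dedicated HUI algorithms avoid by pruning on upper bounds of utility rather than on utility itself.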
