1. Introduction
With the emergence and rapid spread of social networks (e.g., Twitter, Facebook, MySpace, YouTube, and Sina Weibo), there is little doubt that they play an increasingly important role as a medium for spreading information and ideas and for promoting merchandise. “Information overload” and “information bewilderment” on the Internet have become a pervasive, complex, and subtle force that hinders retrieval efficiency. There is therefore an urgent need for methods and techniques that provide personalized services. Association rule mining has been widely used in business and medicine. In this paper, we propose an effective scheme for association rule mining of personal hobbies in social networks. The proposed scheme can predict an individual’s hobby, so that services or products of interest can be offered to him or her.
Traditional association rule mining methods are all based on two important metrics: the support vector and the confidence vector (Zhou et al., 2000; Ma, Zhong & Zhang, 2006). Over time, it became clear that many rules discovered with these metrics are of little interest, which led to the introduction of the lift vector (Mei & Wang, 2010; Xiang, Lin & Yang, 2009). The best-known association rule mining algorithm is Apriori (Agrawal & Srikant, 1994), which was followed by extensions such as Mannila’s algorithms (Mannila, Toivonen & Verkamo, 1994), Partition (Savasere, Omiecinski & Navathe, 1995), Sampling (Toivonen, 1996), and dynamic itemset counting (DIC) (Brin et al., 2001).
As databases grow in number and size, the computational complexity of mining has increased exponentially, and many efficient algorithms have been proposed to address this problem. Pasquier et al. propose a mining algorithm called Close (Pasquier et al., 1999), which prunes the closed itemset lattice and is especially efficient for market-basket-style data. Lee et al. adopt the Path Tree for mining frequent sequential patterns over data streams and can efficiently integrate the users sequentially (Lee, Hung & Chen, 2013). Lam et al. propose two heuristic algorithms (Lam et al., 2014), a two-phase approach and GoKrimp, and evaluate them empirically on six real-life databases. They show that the most compressed patterns, which are less redundant, work better than frequent closed patterns when used as feature sets for classifiers based on support vector machines (SVMs). USpan (Yin, Zheng & Cao, 2012), proposed by Yin et al., mines high-utility sequential patterns. Inspired by association rule mining, Galárraga et al. develop a rule mining model that is explicitly tailored to open world assumption (OWA) scenarios (Galárraga et al., 2013).
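As a rough illustration of the metrics discussed above, the following Python sketch computes support, confidence, and lift as scalar values using their standard textbook definitions. The toy hobby data and the function names are assumptions made for this example; they are not taken from the proposed scheme or from any of the cited algorithms.

```python
# Hypothetical toy data: each set lists the hobbies of one user.
transactions = [
    {"reading", "hiking"},
    {"reading", "hiking", "chess"},
    {"reading", "chess"},
    {"hiking"},
    {"reading", "hiking"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) for the rule antecedent -> consequent.

    Assumes the antecedent occurs in at least one transaction.
    """
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence of the rule divided by the baseline support of the consequent.

    Assumes the consequent occurs in at least one transaction.
    """
    return confidence(antecedent, consequent, transactions) / support(
        consequent, transactions
    )


if __name__ == "__main__":
    rule = ({"reading"}, {"hiking"})
    print("support:   ", support(rule[0] | rule[1], transactions))
    print("confidence:", confidence(*rule, transactions))
    print("lift:      ", lift(*rule, transactions))
```

On this toy data the rule reading → hiking has a confidence of 3/4 while hiking alone has a support of 4/5, so the lift is below 1. This is the kind of rule that passes support and confidence thresholds yet says little about a real association, which is the motivation usually given for adding lift.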
Given the growing number of ontologies and semantic annotations available on the Web, Nebot et al. present a novel method for mining association rules from semantic instance data repositories (Nebot & Berlanga, 2012). Recently, because dynamic association rules have rarely been studied, Zhang et al. propose a method based on data tendency (Zhang, Zeng & Xu, 2012) to reveal the characteristics of time-varying rules.