1. Introduction
Personalized recommender systems (RSs) have become increasingly popular on well-known e-commerce websites such as Amazon and eBay (Bryan, O'Mahony, & Cunningham, 2008; Burke, Mobasher, & Williams, 2006; Mehta, Hofmann, & Fankhauser, 2007; Zhang & Kulkarni, 2014). Since adopting personalized recommendations, these e-commerce services have not only achieved higher customer satisfaction with their products and services but also derived more benefit from customer ratings. Collaborative filtering recommender systems (CFRSs) have proved to be one of the most successful types of RS. However, because they are open to any user, CFRSs are highly vulnerable to “shilling” attacks, also called “profile injection” attacks, in which attackers contaminate the system with malicious ratings to achieve their own ends. To avoid these negative effects, companies either defend against such attacks or make use of customer ratings to detect functional weaknesses and deficiencies in their products so that these can be corrected (Günnemann, Günnemann, & Faloutsos, 2014a; Günnemann, Günnemann, & Faloutsos, 2014b). Thus, constructing an effective detection method that identifies such attacks and removes the injected profiles from CFRSs is crucial.
Although detection methods for such attacks have received much attention, the problem of detection accuracy remains largely unsolved. For example, it is difficult to extract features that effectively characterize the difference between attack profiles and genuine profiles (Burke, Mobasher, & Williams, 2006; Zhang & Kulkarni, 2014; Williams, Mobasher, & Burke, 2007; Williams, Mobasher, Burke, & Bhaumik, 2007; Morid & Shajari, 2014; Mehta, 2007). Furthermore, some features based on computing the similarity between users do capture attackers effectively, but their computation time is impractical, especially for large-scale networks with millions of nodes (Bryan, O'Mahony, & Cunningham, 2008; Burke, Mobasher, & Williams, 2006; Mehta, Hofmann, & Fankhauser, 2007; Zhang & Kulkarni, 2014; Williams, Mobasher, & Burke, 2007; Williams, Mobasher, Burke, & Bhaumik, 2007; Morid & Shajari, 2014; Mehta, 2007). Notably, few studies use real-world datasets, such as those from Amazon or Yelp, to discover or detect hidden attacks (Günnemann, Günnemann, & Faloutsos, 2014a; Günnemann, Günnemann, & Faloutsos, 2014b).
Given a graph with millions of nodes, consisting of attackers and genuine users, how can we automatically detect anomalous or suspicious nodes in reasonable computation time? In practice, attackers are paid to make certain accounts seem more legitimate or famous by giving them many additional followers. The attackers deliver these purchases either by generating fake accounts or by controlling real accounts through malware, and then using those accounts to follow their “customers” (Jiang, Cui, Beutel, Faloutsos, & Yang, 2014). In this case, the attackers manipulate the graph to give certain accounts undue credibility, and doing so requires only adding edges to the graph.