Detecting Anomalous Ratings in Collaborative Filtering Recommender Systems

Zhihai Yang (Xi'an Jiaotong University, Xi'an, China) and Zhongmin Cai (Xi'an Jiaotong University, Xi'an, China)
Copyright: © 2016 | Pages: 11
DOI: 10.4018/IJDCF.2016040102

Abstract

Online rating data is ubiquitous on popular e-commerce websites such as Amazon and Yelp, and it strongly influences customers' subsequent choices about the products offered by e-businesses. Collaborative filtering recommender systems (CFRSs) play a crucial role in rating systems. Because CFRSs are highly vulnerable to “shilling” attacks, attackers commonly contaminate rating systems with malicious ratings to achieve their attack intentions. Although methods for detecting such attacks have received much attention, the problem of detection accuracy remains largely unsolved. Moreover, few methods scale to large networks. This paper proposes a fast and effective detection method that identifies abnormal users in two stages. First, it employs a graph mining method to automatically spot suspicious nodes in a constructed graph with millions of nodes. Second, it determines the abnormal users by exploiting suspected target items based on the result of the first stage. Experiments evaluate the effectiveness of the method.

1. Introduction

Personalized recommender systems (RSs) have become increasingly popular on well-known e-commerce websites such as Amazon and eBay (Bryan, O'Mahony, & Cunningham, 2008; Burke, Mobasher, & Williams, 2006; Mehta, Hofmann, & Fankhauser, 2007; Zhang & Kulkarni, 2014). Since adopting personalized recommendations, these e-commerce services have not only gained higher customer satisfaction with their products or services but have also derived more benefit from customer ratings. Collaborative filtering recommender systems (CFRSs) have proved to be one of the most successful kinds of RSs. However, because of their openness, CFRSs are highly vulnerable to “shilling” attacks, also called “profile injection” attacks: attackers commonly contaminate CFRSs with malicious ratings to achieve their attack intentions. To avoid negative outcomes, companies either defend against such attacks or benefit from customer ratings by detecting functional weaknesses and deficiencies in products and then correcting them (Günnemann, Günnemann, & Faloutsos, 2014a; Günnemann, Günnemann, & Faloutsos, 2014b). Thus, constructing an effective detection method to defend against such attacks and remove the attack profiles from CFRSs is crucial.
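
To make the effect of a “shilling” attack concrete, the following is a minimal sketch and is not taken from the paper. It assumes a 1-5 rating scale and a hypothetical list of genuine ratings for one target item, and it shows how a handful of injected maximum ratings (a so-called push attack) shifts the item's mean rating. The names `genuine_ratings` and `inject_push_attack` are illustrative assumptions.

```python
from statistics import mean

# Hypothetical toy data: genuine ratings (1-5 scale) for one target item.
genuine_ratings = [2, 3, 2, 3, 2]


def inject_push_attack(ratings, n_attackers, max_rating=5):
    """Return a new list with n_attackers fake maximum ratings appended.

    This only models the ratings given to the target item; real attack
    profiles also rate other items to look like genuine users.
    """
    return ratings + [max_rating] * n_attackers


before = mean(genuine_ratings)
attacked_ratings = inject_push_attack(genuine_ratings, n_attackers=5)
after = mean(attacked_ratings)

print(f"mean before attack: {before:.2f}, after attack: {after:.2f}")
```

In this toy setting, five fake profiles are enough to move the target item from a below-average to an above-average mean rating, which is why removing such profiles matters for any system that ranks items by aggregated ratings.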

Although methods for detecting such attacks have received much attention, the problem of detection accuracy remains largely unsolved. For example, it is difficult to extract features that effectively characterize the difference between attack profiles and genuine profiles (Burke, Mobasher, & Williams, 2006; Zhang & Kulkarni, 2014; Williams, Mobasher, & Burke, 2007; Williams, Mobasher, Burke, & Bhaumik, 2007; Morid & Shajari, 2014; Mehta, 2007). Furthermore, some features based on the similarity between users capture attackers effectively, but their computation time is unreasonable, especially for large-scale networks with millions of nodes (Bryan, O'Mahony, & Cunningham, 2008; Burke, Mobasher, & Williams, 2006; Mehta, Hofmann, & Fankhauser, 2007; Zhang & Kulkarni, 2014; Williams, Mobasher, & Burke, 2007; Williams, Mobasher, Burke, & Bhaumik, 2007; Morid & Shajari, 2014; Mehta, 2007). It is noteworthy that few studies use real-world datasets, such as those from Amazon and Yelp, to discover or detect hidden attacks (Günnemann, Günnemann, & Faloutsos, 2014a; Günnemann, Günnemann, & Faloutsos, 2014b).

Given a graph with millions of nodes (consisting of attackers and genuine users), how can we automatically detect anomalous or suspicious nodes within a reasonable computation time? In practice, attackers are paid to make certain accounts seem more legitimate or famous by giving them many additional followers. The attackers deliver these purchases either by generating fake accounts or by controlling real accounts through malware and using them to follow their “customers” (Jiang, Cui, Beutel, Faloutsos, & Yang, 2014). In this case, the attackers manipulate the graph in a carefully designed way to give certain accounts undue credibility, since all they need to do is add edges to the graph.
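
The observation that attackers only need to add edges can be illustrated with a small sketch. This is not the paper's detection method; it merely assumes a directed follower graph stored as a list of (follower, followee) pairs and shows how fake accounts inflate a customer's in-degree. All account names and the helpers `add_fake_followers` and `in_degree` are hypothetical.

```python
from collections import Counter

# Hypothetical follower graph as (follower, followee) edges.
genuine_edges = [("alice", "bob"), ("carol", "bob"), ("alice", "carol")]


def add_fake_followers(edges, customer, n_fake):
    """Return a new edge list in which n_fake fake accounts follow `customer`."""
    fake_edges = [(f"fake_{i}", customer) for i in range(n_fake)]
    return edges + fake_edges


def in_degree(edges):
    """Count how many followers each account has."""
    return Counter(followee for _, followee in edges)


attacked_edges = add_fake_followers(genuine_edges, customer="dave", n_fake=4)

print("followers of dave before:", in_degree(genuine_edges)["dave"])
print("followers of dave after: ", in_degree(attacked_edges)["dave"])
```

The attack leaves the genuine edges untouched and changes only the edges the attacker controls, which is part of what makes it cheap to mount and hard to spot from follower counts alone.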
