Collaborative Recommendation Systems and Link Analysis

Collaborative Recommendation Systems and Link Analysis

François Fouss
DOI: 10.4018/978-1-61520-841-8.ch005
OnDemand:
(Individual Chapters)
Available
$33.75
List Price: $37.50
10% Discount:-$3.75
TOTAL SAVINGS: $3.75

Abstract

Link analysis is a framework usually associated with fields such as graph mining, relational learning, Web mining, text mining, hyper-text mining, visualization of link structures. It provides and analyzes relationships and associations between many objects of various types that are not apparent from isolated pieces of information. This chapter shows how to apply various link-analysis algorithms exploiting the graph structure of databases on collaborative-recommendation tasks. More precisely, two kinds of link-analysis algorithms are applied to recommend items to users: random-walk based models and kernel-based models. These link-analysis based algorithms do not use any feature of the items in order to compute the recommendations, they first compute a matrix containing the links between persons and items, and then derive recommendations from this matrix or part of it.
Chapter Preview
Top

Introduction

Link analysis is a data-mining technique based on the analysis of various kinds of networks. Link-analysis methods can be used, for example, for classification, prediction, clustering or association-rules discovery.

A traditional link-analysis subfield that has attracted considerable interest and curiosity from the social and behavioral science community in recent decades (Carrington, et al., 2006; Lazega, 1998; White & Smyth, 2003) is the field aiming at analyzing links existing in various kinds of social networks. Much of this interest can be attributed to the appealing focus of social network analysis on relationships among social entities, and on the patterns and implications of these entities. Many researchers have realized that the network perspective allows new leverage for answering standard-social and behavioral-science research questions by giving precise formal definitions to aspects of the political, economic, or social structural environment. From the view of social-network analysis, the social environment can be expressed as patterns or regularities in relationships among interacting units. The focus on relations and on patterns of relations, requires a set of methods and analytic concepts that are distinct from the methods of traditional statistics and data analysis. Such phrases as webs of relationships, closely knit networks of relations, social role, social position, group, clique, popularity, isolation, centrality, prestige, prominence, etc. are given mathematical definitions by social-network analysis.

More generally, link analysis (see (Thelwall, 2004) for an introduction on link analysis) is a framework usually associated with fields such as graph mining, relational learning, Web mining, text mining, hyper-text mining, visualization of link structures. It provides relationships and associations between many objects of various types that are not apparent from isolated pieces of information. Computer-assisted or fully automatic computer-based link analysis is increasingly used in diverse application fields such as database marketing (find typical features of customers, identify frequent patterns in purchase behavior, etc.), fraud detection (discover suspicious behaviors), Web mining (find good Web pages relatively to a specific query), telecommunication-network analysis (visualize the main communication patterns and potential network bottlenecks), criminal investigations, text mining, epidemiology and pharmacology, or search engines.

The work described in this chapter shows how to apply various link-analysis algorithms exploiting the graph structure of databases on collaborative-recommendation tasks. More precisely, two kinds of link-analysis algorithms are applied to recommend items to users.

The first type of algorithms is based on a Markov-chain model. A random-walk model through the database is defined by assigning a transition probability to each link. Thus, a virtual random walker can jump from element to element and each element therefore represents a state of a Markov chain. From this random-walk model, we define quantities such as the average commute time, the average first-passage time, the average first-passage cost, and the pseudoinverse of the Laplacian matrix of the graph. This chapter also introduces the Euclidean Commute-Time Distance (ECTD) (corresponding to the square root of the average commute-time distance), which is a distance between nodes.

The second type of algorithms is based on graph kernels. The common idea behind kernel-based algorithms is to express similarities between pairs of points in a data space in terms of a kernel function, and thereby to implicitly construct a mapping to a feature space in which the kernel appears as the inner product between pairs of points (implying that the entries of the kernel matrix can be considered as similarities between the points). Seven kernels on a graph (namely, the exponential diffusion kernel, the Laplacian exponential diffusion kernel, the von Neumann diffusion kernel, the regularized Laplacian kernel, the commute-time kernel, the Markov diffusion kernel, and the cross-entropy diffusion matrix) are reviewed.

Complete Chapter List

Search this Book:
Reset