Direction-Aware Proximity on Graphs

Direction-Aware Proximity on Graphs

Hanghang Tong (Carnegie Mellon University, USA), Yehuda Koren (AT&T Labs - Research, USA) and Christos Faloutsos (Carnegie Mellon University, USA)
Copyright: © 2009 |Pages: 8
DOI: 10.4018/978-1-60566-010-3.ch101
OnDemand PDF Download:
$30.00
List Price: $37.50

Abstract

In many graph mining settings, measuring node proximity is a fundamental problem. While most of existing measurements are (implicitly or explicitly) designed for undirected graphs; edge directions in the graph provide a new perspective to proximity measurement: measuring the proximity from A to B; rather than between A and B. (See Figure 1 as an example). In this chapter, we study the role of edge direction in measuring proximity on graphs. To be specific, we will address the following fundamental research questions in the context of direction-aware proximity: 1. Problem definitions: How to define a directionaware proximity? 2. Computational issues: How to compute the proximity score efficiently? 3. Applications: How can direction-aware proximity benefit graph mining?
Chapter Preview
Top

Introduction

In many graph mining settings, measuring node proximity is a fundamental problem. While most of existing measurements are (implicitly or explicitly) designed for undirected graphs; edge directions in the graph provide a new perspective to proximity measurement: measuring the proximity from A to B; rather than between A and B. (See Figure 1 as an example).

Figure 1.

An example of Dir-CePS. Each node represents a paper and edge denotes ‘cited-by’ relationship. By employing directional information, Dir-CePS can explore several relationships among the same query nodes (the two octagonal nodes): (a) A query-node to query-node relations; (b) common descendants of the query nodes (paper number 22154 apparently merged the two areas of papers 2131 and 5036); (c) common ancestors of the query nodes: paper 9616 seems to be the paper that initiated the research areas of the query papers/nodes. See ‘Main Focus’ section for the description of Dir-CePS algorithm.

In this chapter, we study the role of edge direction in measuring proximity on graphs. To be specific, we will address the following fundamental research questions in the context of direction-aware proximity:

  • 1.

    Problem definitions: How to define a direction-aware proximity?

  • 2.

    Computational issues: How to compute the proximity score efficiently?

  • 3.

    Applications: How can direction-aware proximity benefit graph mining?

Top

Background

In the literature, there are several measures of node proximity. Most standard measures are based on basic graph theoretical concepts -the shortest path length and the maximum flow. However the dependency of these measures on a single element of the graph, the shortest path or the minimum cut, makes them more suitable to managed networks, but inappropriate for measuring the random nature of relationships within social networks or other self organizing networks. Consequently, some works suggested more involved measures such as the sink-augmented delivered current (Faloutsos, Mccurley & Tomkins, 2004), cycle free effective conductance (Koren, North & Volinsky, 2006), survivable network (Grötschel, Monma & Stoer, 1993), random walks with restart (He, Li, zhang, Tong & Zhang, 2004; Pan, Yang, Faloutsos & Duygulu, 2004). Notice that none of the existing methods meets all the three desirable properties that our approach meets: (a) dealing with directionality, (b) quality of the proximity score and (c) scalability.

Graph proximity is an important building block in many graph mining settings. Representative work includes connection subgraph (Faloutsos, McCurley & Tomkins, 2004; Koren, North & Volinsky, 2006; Tong & Faloutsos 2006), personalized PageRank (Haveliwala, 2002), neighborhood formulation in bipartite graphs (Sun, Qu, Chakrabarti & Faloutsos, 2005), content-based image retrieval (He, Li, zhang, Tong & Zhang, 2004), cross modal correlation discovery (Pan, Yang, Faloutsos & Duygulu, 2004), the BANKS system (Aditya, Bhalotia, Chakrabarti, Hulgeri, Nakhe & Parag 2002), link prediction (Liben-Nowell & Kleinberg, 2003), detecting anomalous nodes and links in the graph (Sun, Qu, Chakrabarti & Faloutsos, 2005), ObjectRank (Balmin, Hristidis & Papakonstantinou, 2004) and RelationalRank (Geerts, Mannila & Terzi, 2004).

Complete Chapter List

Search this Book:
Reset