Weights and Multi-Edges in Link Prediction

Weights and Multi-Edges in Link Prediction

Victor F. Cavalcante (IBM Research, Brazil) and Ana Paula Appel (IBM Research, Brazil)
Copyright: © 2014 |Pages: 10
DOI: 10.4018/978-1-4666-5202-6.ch242

Chapter Preview



Over the past years the amount of data collected has increased substantially and it has been powered mainly by the World Wide Web expansion. The outbreak of novel and increasingly more powerful analytical approaches makes the extraction and production of knowledge from big amounts of data possible. Also, the spread of online social networks use is one of the factors responsible for the current high interest in complex network analyses.

Graphs may express and model mathematically complex network structures whenever it is useful to represent how things are either physically or logically linked to one another in a network structure. A graph G is mathematically represented as G = <V,E>, on which V represents a set of |V| nodes (or vertices), and E represents a set of |E| edges (or links), and a relation that associates with each edge two vertices (West, 2001). Two nodes are neighbors if they are connected by an edge.

For example, in social networks, nodes represent people or groups of people, and edges correspond to some kind of social interaction (Backstrom & Leskovec, 2011). In information networks, nodes correspond to information resources such as Web pages or documents, and edges represent logical connections such as hyperlinks, citations (Shi, Leskovec, & McFarland, 2010) or cross-references, and so on.

Complex networks are graph-based representations used for data have substantial non-trivial topological features. The study of complex networks brought important properties such as the power-law degree distributions (L. A. Adamic et al., 2000) and Small World phenomenon (Milgram, 1967), which help us to understand the interaction among human being, the dissemination of information and intrusion detection (Newman, 2010).

One of the most interesting task in network mining is link prediction, defined as: “Given a snapshot of a graph G, predict accurately which edges will appear in the near future of network.” (Liben-Nowell & Kleinberg, 2003). While in social network this definition is represented by “People you may know” feature, in online commerce this represents recommendations “Costumers who buy X tend to buy Y.” The question “How people get connected” is relevant not only in the context of social networks, but also in work social network inside companies (Paula, Appel, Pinhanez, Cavalcante, & Andrade, 2012).

Predicting interaction and collaboration among people in organizations can help manage companies in a productive way. The task of recommending unknowns but “similar” people is quite different from possible friend recommendation tasks, which focus on recommending individuals who have friends in common (Guy, Ur, Ronen, Perer, & Jacovi, 2011). In the context of organizations, introducing people with similar skills, profiles or common interests can be valuable for employees in many ways. For instance, searching for people with similar skills can facilitate issues related to problem solving. Through networking, people are able to offer advice to one another and help each other out with new projects or career development (Burke, Marlow, & Lento, 2010).

Weights and directions in networks have, with some exceptions (Akoglu, McGlohon, & Faloutsos, 2010), received relatively little attention, meaning that all connections in the network have the same importance, which is not true. An excellent reason for this is that the simple cases are usually investigated first (unweighted networks), before moving on to more complex ones (weighted networks). On the other hand, there are many cases where edge weights are known for networks, and to ignore them is to throw out a lot of data that might help us to understand these systems better. For instance, in a social network not all connections (friends) have the same importance; weights might indicate the strength of a relationship. The same happens with direction. In some social networks, such as Facebook, reciprocity is common, but in others, such as Twitter, this is not true and this should not be ignored.

Key Terms in this Chapter

Simple Graph: Undirected graph that has no loops (edges connected at both ends to the same vertex) and no more than one edge between any two different vertices.

World Wide Web: Abbreviated as WWW or commonly known as the Web, it is a system of interlinked hypertext documents accessed via the Internet.

Community Detection: Group of nodes from network that can be easily grouped into (potentially overlapping) sets of nodes densely connected internally.

Link Analysis: Data-analysis technique used to evaluate relationships (connections) between nodes.

Directed Network: network or set of nodes connected by edges having directions associated to each edge.

Small World: Tendency for individual elements in a large system to be separated from any other element in the system by only a few steps (a.k.a. “six degrees of separation”).

Weighted Network: Network with weights assigned to ties among nodes.

Complete Chapter List

Search this Book: