Performance Analysis of Naïve Bayes Classifier Over Similarity Score-Based Techniques for Missing Link Prediction in Ego Networks

Anand Kumar Gupta, Neetu Sardana
Copyright: © 2021 | Pages: 13
DOI: 10.4018/JITR.2021010107

Abstract

Keywords: Ego Network, Link Prediction, Machine Learning, Performance Analysis, Similarity Score, Social Network

Introduction

Social networking websites have established themselves as the most influential component of the web. They are now the most popular medium for exchanging information such as photos, videos, or any other form of media among people. Facebook, Instagram, WhatsApp and Twitter are the most popular social networking platforms and are currently used by millions of individuals across the globe. With the rapid development of web-based services, online social networking websites have become a part of people’s lives (Dong et al., 2013). These websites help individuals distribute various kinds of media to those with whom they are directly linked. Information flows quickly through a social network if the network is dense (or cohesive). In practice, the spread of media is hampered by the large number of individuals in the network: a social network comprises so many individuals that a direct connection between every pair of them is practically impossible. Hence, the objective of social networking websites is to increase cohesion among individuals so that the rate of information diffusion increases. This objective can be accomplished by considering the ego networks of individuals. An ego is an individual focal node, and an ego network is a personal network consisting of this focal node and all its direct connections. These directly connected nodes are known as the “alters” of the ego (or focal) node. Figure 1 shows a sample ego network of focal node X. The focal node X is directly connected to four other nodes, A, B, C and D, known as the alters of X. The nodes directly connected to alters are termed “alters of alter” (McAuley & Leskovec, 2014). In Figure 1, node E is an “alter of alter” for focal node X. The rate of information spread is high if the ego network is dense, because a dense ego network has a large number of direct links among its alters (i.e., the coupling among alters is tight). Link prediction techniques are used to increase the density of connections in an ego network. They forecast the probability of new link formation between disconnected nodes in an ego network (Hasan et al., 2011). This prediction is based on the commonality between disconnected nodes: the more common nodes or features two disconnected nodes share, the higher the probability that a link forms between them. The resulting increase in the number of links in the ego network in turn affects the rate of information spread among individuals.

Figure 1. Sample of an ego network
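
The ideas in this paragraph (ego network, alters, density, and scoring disconnected pairs by what they have in common) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The ego X and its alters A, B, C and D follow the text, but the alter-to-alter edges (A-B, B-C) and the attachment of E to D are hypothetical, since the text does not list them. The common-neighbour count is used here only as one simple example of a similarity score, and all function names are illustrative.

```python
from itertools import combinations

# X-A, X-B, X-C, X-D follow the text. The edges A-B, B-C and D-E are
# assumptions made for this illustration only.
EDGES = [("X", "A"), ("X", "B"), ("X", "C"), ("X", "D"),
         ("A", "B"), ("B", "C"), ("D", "E")]


def build_adjacency(edges):
    """Build an undirected adjacency map: node -> set of neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def ego_network(adj, ego):
    """Keep the ego and its alters, plus the links among those nodes."""
    nodes = adj[ego] | {ego}
    return {n: adj[n] & nodes for n in nodes}


def density(sub):
    """Fraction of possible undirected links that are present."""
    n = len(sub)
    m = sum(len(nbrs) for nbrs in sub.values()) // 2
    return 2 * m / (n * (n - 1)) if n > 1 else 0.0


def common_neighbour_scores(sub):
    """Score every disconnected pair by its number of shared neighbours."""
    scores = {}
    for u, v in combinations(sorted(sub), 2):
        if v not in sub[u]:
            scores[(u, v)] = len(sub[u] & sub[v])
    return scores


adj = build_adjacency(EDGES)
ego_net = ego_network(adj, "X")          # E is excluded: it is an alter of alter
scores = common_neighbour_scores(ego_net)
best_pair = max(scores, key=scores.get)  # pair with the most shared neighbours
print(sorted(ego_net), density(ego_net), best_pair)
```

Under these assumed edges, the pair A and C shares two neighbours (X and B), so it ranks above the other disconnected pairs, which share only the ego X. Adding that link would raise the density of the ego network.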

Link prediction techniques can be grouped into two categories based on their use: (a) techniques for forecasting missing links in a social network, and (b) techniques for predicting links that may form in the future (Dong et al., 2013). The first category, missing link prediction, applies when network data with a single fixed timestamp is available for training and prediction, i.e., a partially observed structural graph of the social network exists and can be supplied as input to the prediction framework. In this situation, no future data is available for testing and validating the predicted results. Hence, a randomly selected subset of the connections present in the ego network is removed, and link prediction techniques are then applied to find these missing links (Narang et al., 2013). In the second category, snapshots of the complete network graph are available for different time intervals. Link prediction techniques take the snapshot at timestamp t1 as input and predict the links that may form in the network in the future. The snapshot at timestamp t2 (where t1 < t2) is then used to validate the predictions. In this work, link prediction has been treated as a missing link prediction problem, since structural graphs of the network were available for experimental evaluation.
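
The missing link setup described above (hide a random subset of links, then try to recover them) can be sketched as follows. This is an assumed, simplified procedure rather than the authors' experimental protocol. The toy edge list, the 30% hold-out fraction, the common-neighbour ranking and the precision-at-k measure are all illustrative choices that do not come from the text.

```python
import random
from itertools import combinations


def split_edges(edges, fraction, seed=0):
    """Randomly hide a fraction of the edges; return (observed, hidden)."""
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    k = int(len(shuffled) * fraction)
    return shuffled[k:], shuffled[:k]


def adjacency(nodes, edges):
    """Undirected adjacency map that keeps nodes even if they lose all edges."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def rank_candidates(adj):
    """Rank disconnected pairs by common-neighbour count, highest first."""
    pairs = [(u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]]
    return sorted(pairs, key=lambda p: len(adj[p[0]] & adj[p[1]]), reverse=True)


def precision_at_k(ranked, hidden, k):
    """Share of the top-k ranked pairs that are truly hidden links."""
    hidden_set = {tuple(sorted(e)) for e in hidden}
    return sum(1 for p in ranked[:k] if p in hidden_set) / k if k else 0.0


# Hypothetical ego network used only to exercise the functions.
edges = [("X", "A"), ("X", "B"), ("X", "C"), ("X", "D"),
         ("A", "B"), ("B", "C"), ("A", "C")]
nodes = {n for e in edges for n in e}

observed, hidden = split_edges(edges, fraction=0.3, seed=42)
ranked = rank_candidates(adjacency(nodes, observed))
print(precision_at_k(ranked, hidden, k=len(hidden)))
```

Because the hidden links are drawn from the same fixed snapshot, they serve as the ground truth that is otherwise unavailable in the absence of future data. The second category would instead score pairs on the snapshot at t1 and validate against the links present at t2.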
