Structural Mining for Link Prediction Using Various Machine Learning Algorithms


Ranjan Kumar Behera, Kshira Sagar Sahoo, Debadatt Naik, Santanu Kumar Rath, Bibhudatta Sahoo
DOI: 10.4018/IJSESD.2021070105

Abstract

Link prediction is an emerging research problem in social network analysis, in which possible future links are predicted from the structural or content information associated with the network. In this paper, various machine learning (ML) techniques are used to predict possible future links based on features extracted from the topological structure. Feature sets are prepared by measuring different similarity metrics between all pairs of nodes that are not currently linked, so that each instance in the dataset holds the similarity indices of one such non-existent link. Supervised ML algorithms such as K-NN, MLP, bagging, SVM, and decision tree are implemented for the prediction task. The model is trained to identify new links that do not currently exist in the network but are likely to appear in the future. The proposed model is then validated using various performance metrics.

Introduction

Social networks have become an integral part of our lives, with a large number of users participating in social interaction. A social network is essentially a collection of dynamic objects whose structure changes at an exponential rate. Modelling network evolution is a complex problem, as it is associated with a large number of dependent parameters (Doreian, 1997). Balancing those parameters plays a vital role in the sustainability of the social ecology. These parameters are mostly associated with the structural analysis of the network. The network can be represented as a graph in which users are depicted as nodes and the relationships between them are depicted as edges. Social network analysis (SNA) deals with analysing the characteristics of various patterns associated with the network. SNA research can be classified into two different aspects (Scott, 2017) (Wasserman, 1994) (Carrington, 2005) (Sahoo, Sahoo, 2016).

  • 1.

    Structure-based SNA: A social network can be analysed based on the topological features associated with it. A considerable amount of research has been carried out using structural information; link prediction, centrality analysis (Koschützki, 2008), network evolution (Barabási, 2002), and community detection (Fortunato, 2010) are a few examples.

  • 2.

    Content-based SNA: Each node and edge in the network is associated with a set of features. The patterns in the network can be analysed by processing the content associated with each entity. Feature-based analysis can lead to a number of data mining applications such as recommender systems, community formation based on common interests, influence analysis (Tang, 2009), and modelling information diffusion (Bakshy, 2012) (Mishra, Sahoo, Sahoo, & Jena, 2017). However, these applications are limited to a particular network domain rather than generalizing to all networks. Sarita et al. presented an application that predicts the electoral candidates of the 2015 election from social media (Sarita cs, 2015).

Link prediction is one of the interesting research problems in social networks, in which future associations that are likely to be established are predicted. This problem can be addressed through both content-based and structure-based analysis. In this paper, the proposed algorithm focuses on learning the linkage patterns existing in the network. We have prepared the datasets by measuring the similarity between pairs of nodes between which no link exists at the current time. Similarity between nodes is measured by various similarity indices based on the network topology. Link prediction can be considered a link mining and analysis task that serves as the basis for many real-time applications. It can be used for modelling recommender systems for e-commerce applications, where items can be recommended to a user for online shopping based on the user's interests. It may also be applied to big data and IoT applications that lead to enormous business opportunities (Lokshina, 2018) (Sahoo, Sahoo, Dash, & Mishra, 2017) (Behera, Sahoo, Mahapatra, Rath, & Sahoo, 2018).

Link prediction can also be used for modelling terrorist networks to identify potential threats, and for predicting links between researchers in order to model current research trends.
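The dataset preparation described above, in which similarity is measured for node pairs that are not currently linked, can be illustrated with a minimal sketch. The preview does not name the specific similarity indices used, so the four computed here (common neighbours, Jaccard coefficient, Adamic-Adar, and preferential attachment) are assumptions chosen because they are standard topology-based indices. The function names and the toy edge list are likewise hypothetical.

```python
import math
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def similarity_features(adj, u, v):
    """Topology-based similarity indices for one node pair (assumed feature set)."""
    nu, nv = adj[u], adj[v]
    common = nu & nv
    union = nu | nv
    return {
        "common_neighbors": len(common),
        "jaccard": len(common) / len(union) if union else 0.0,
        # Every common neighbour z is adjacent to both u and v,
        # so its degree is at least 2 and log(degree) is positive.
        "adamic_adar": sum(1.0 / math.log(len(adj[z])) for z in common),
        "preferential_attachment": len(nu) * len(nv),
    }


def non_edge_features(edges):
    """Return one feature row per node pair that has no link at present."""
    adj = build_adjacency(edges)
    rows = {}
    for u, v in combinations(sorted(adj), 2):
        if v not in adj[u]:
            rows[(u, v)] = similarity_features(adj, u, v)
    return rows


# Toy network (hypothetical data, for illustration only).
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
features = non_edge_features(edges)

for pair, row in features.items():
    print(pair, row)
```

Each row could then be labelled according to whether the link appears in a later snapshot of the network and passed to a supervised classifier. Enumerating all non-linked pairs grows quadratically with the number of nodes, so a larger network would likely require sampling.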

The major contributions of our research are as follows:

  • 1.

    Various structure-based similarity indices have been measured to quantify the similarity between pairs of nodes between which no link exists at the current timestamp.

  • 2.

    We have identified a list of features for link prediction algorithms that is applicable to all domains rather than to one particular domain. The feature set we have prepared is found to be computationally inexpensive.

  • 3.

    Various machine learning algorithms have been implemented for structural analysis of the network in order to predict the hidden links.

  • 4.

    The performance of the various classifiers has been extensively evaluated using different measures such as precision, recall, and F-measure.
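The evaluation measures named in the last contribution can be made concrete with a short sketch. It assumes a binary labelling in which 1 means the link appears in the future and 0 means it does not, and it takes the F-measure to be the balanced F1 score; the label vectors are hypothetical and are not results from the paper.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall and F1 for binary labels (1 = link appears)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly predicted links
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # predicted links that never form
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed links

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1


# Hypothetical ground truth and classifier output for eight candidate links.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because non-linked pairs that never form a link usually far outnumber those that do, precision and recall on the positive class tend to be more informative than plain accuracy in this setting.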
