Centrality Metrics-Based Connected Dominating Sets for Real-World Network Graphs

Centrality Metrics-Based Connected Dominating Sets for Real-World Network Graphs

Natarajan Meghanathan (Jackson State University, USA), Md Atiqur Rahman (Jackson State University, USA) and Mahzabin Akhter (Jackson State University, USA)
DOI: 10.4018/978-1-5225-8188-8.ch001

Abstract

The authors investigate the use of centrality metrics as node weights to determine connected dominating sets (CDS) for a suite of 60 real-world network graphs of diverse degree distribution. They employ centrality metrics that are neighborhood-based (degree centrality [DEG] and eigenvector centrality [EVC]), shortest path-based (betweenness centrality [BWC] and closeness centrality [CLC]) as well as the local clustering coefficient complement-based degree centrality metric (LCC'DC), which is a hybrid of the neighborhood and shortest path-based categories. The authors target for minimum CDS node size (number of nodes constituting the CDS). Though both the BWC and CLC are shortest path-based centrality metrics, they observe the BWC-based CDSs to be of the smallest node size for about 60% of the real-world networks and the CLC-based CDSs to be of the largest node size for more than 40% of the real-world networks. The authors observe the computationally light LCC'DC-based CDS node size to be the same as the computationally heavy BWC-based CDS node size for about 50% of the real-world networks.
Chapter Preview
Top

1. Introduction

Network Science has been an actively researched area for analysis and visualization of characteristics of large complex real-world networks. Several metrics (a.k.a. measures) have been used for network analysis. Of these, centrality of the nodes constituting the network (a measure of the importance of the nodes with respect to the rest of the nodes) is one of the key metrics. The centrality-based ranking of nodes in real-world networks is purely a link-statistic based approach and does not depend on any offline information such as reputation of the particular entity representing the node, etc (Newman, 2010). The widely-used centrality metrics are: the neighborhood-based Degree (DEG) and Eigenvector (EVC) centrality metrics and the shortest path-base Betweeness (BWC) and Closeness (CLC) centrality metrics, all of which are studied in this chapter. We also employ the recently proposed local clustering coefficient-based degree centrality (LCC'DC) metric (Meghanathan, 2017a), which is a hybrid of the neighborhood-based and shortest path-based centrality metrics. From computational complexity point of view, DEG and LCC'DC are computationally-light; whereas, BWC, EVC and CLC are computationally-heavy. Recently, several correlation studies (e.g., Meghanathan & He, 2016; Meghanathan, 2017d) have been conducted on the above centrality metrics: the LCC'DC metric exhibited a strong positive correlation with the BWC metric with respect to all the three correlation measures (Pearson's, Spearman's and Kendall's) for prediction, network-wide ranking and pair-wise ranking.

We model large real-world network as a graph of vertices and edges (vertices represent the nodes - entities like players, countries, airports, books, etc constituting the real-world network, and edges represent the interactions between these entities in the form of competitions played against, airline connections, books related to a particular domain, etc). The currently available application software (e.g., Gephi, 2014 & Pajek, 2014) as well as literature for network analysis and visualization focus on several graph theoretic properties like diameter, clustering coefficient, path length, components, communities, etc. However, connected dominating sets (CDS) have been rarely analyzed in the literature of complex network analysis (Meghanathan, 2017b, Meghanathan, 2017e).

A CDS (Cormen et al., 2009) of a graph is a subset of the vertices (and the edges connecting these vertices) such that every vertex in the graph is either in the CDS or is connected to a node in the CDS (in the latter case, the node is said to be covered by a node in the CDS). A CDS forms the connected backbone of a network graph and could be used as the underlying communication topology for broadcasting throughout the network with the minimum number of forward transmissions (Meghanathan, 2012a). Only the nodes constituting the CDS need to forward a message (to the other nodes constituting the CDS) and the rest of the nodes in the network could be mere receivers. Accordingly, the primary objective of CDS-related research has been to find CDS of smaller size. Unfortunately, the problem of finding a minimum connected dominating set (CDS with the fewest number of constituent nodes) in both weighted and unit-weight graphs (undirected or directed) is a NP-hard problem (Cormen et al., 2009). Several heuristics (Gu et al, 2012; Meghanathan, 2012a, etc) have been proposed to find approximations to minimum-sized CDS. A common theme among all of these heuristics is to use a metric (like node degree) as the determining factor to include a node as part of the CDS. CDS construction starts with the node having the maximum value for the underlying metric and the nodes connected to this starting node are considered candidate nodes for inclusion to the CDS (as long as these nodes bring in at least one node that is yet to be covered). Among such candidate nodes, the node that covers the largest number of uncovered nodes is included to the CDS; this procedure is continued until all the nodes in the network graph are covered by at least one node in the CDS.

Complete Chapter List

Search this Book:
Reset