Robust and Efficient Custom Routing for Interconnection Networks with Distributed Shortcuts

Robust and Efficient Custom Routing for Interconnection Networks with Distributed Shortcuts

T. X. Le Nhat, T. Truong Nguyen, Khanh-Van Nguyen
Copyright: © 2014 |Pages: 24
DOI: 10.4018/ijdst.2014100104
OnDemand:
(Individual Articles)
Available
$37.50
No Current Special Offers
TOTAL SAVINGS: $37.50

Abstract

We aim at creating a robust and efficient custom routing mechanism for Distributed Shortcut Networks (Nguyen et al., 2013), which address new challenging issues posed by recently advanced studies in the areas of massively parallel computing and large-scale data centers. We follow the design principles of Distributed Shortcut Networks (DSN), which construct non-random topologies with the creation of long-range shortcuts inspired by observations in small-world networks. However, we focus on designing a powerful custom routing mechanism which smartly exploits some precious properties of the topology. As a result, our new DSN-a network with a carefully refined routing logic performs significantly better than the basic DSN in term of communication latency while provides strengths in fault-tolerance as well as load-balance. These help the network become robust against link failures or burst of traffic demand while topology-agnostic deadlock-free routing (e.g. the famous up*/down* algorithm) suffers a lot.
Article Preview
Top

Introduction

An interconnection network, i.e the topology and routing algorithm for connecting and communicating between computing peers, can play a crucial role in defining the efficiency and cost of a large distributed system. The study of interconnection networks is an important, traditional research area which belongs to the fundamentals of various fields such as parallel computing, high-performance computing, distributed computing or computer networks. Recently, there has been a strong interest in finding new methods for creating modern interconnection network topologies which can solve the new, emerging problems in using the traditional topologies and which can strongly satisfy the highest qualities demanded for interconnection networks that modern Information and Communication Technologies (ICT) require such as in the areas of massively parallel computing, computing grids, or large scale data centers. Typically, the many traditional topologies have been found less desirable in applications where latency is sensitive, such as massively parallel computing, or also too rigid in structure to cope with incremental expansion that often occurs in modern data centers.1

As many scientists forecast, large parallel applications to be deployed on next generation High Performance Computing (HPC) systems would suffer from communication latencies that could reach hundreds of nanoseconds (K. Scott Hemmert et al., 2008; J. Tomkins, 2007). Thus designing low-latency interconnection networks is essential for these systems. Switch delays (e.g., about 100 nanoseconds in InfiniBand QDR) are relatively large when compared to the link delays (i.e. 5 ns/m). To achieve low end-to-end communication latency, a topology of switches should thus have low diameter and low average shortest path length, both measured in numbers of switch hops.

Recently, a new approach is proposed to use random shortcut topologies, which are generated either as fully random graphs (Singla et al., 2012) or by adding random shortcuts to classical topologies (Koibuchi et al., 2012; Shin et al., 2011). These new topologies, when compared to non-random topologies of the same degree, surprisingly have drastically reduced diameter and average shortest path length (Koibuchi et al., 2012). Also the random nature in creating shortcut links (instead of a strict rule based on a rigid graph structure as in traditional topologies) provides a flexibility that strongly benefits incremental expansion and a redundancy of links and paths that greatly supports network fault-tolerance.

In the light of the above analysis, obviously, traditional topologies are becoming less desirable in our modern world and using random shortcut models would be the way to create the next generation of interconnection network topologies. However, a practical concern for random topologies is their long cable length for a physical deployment (Koibuchi et al., 2012; Koibuchi et al., 2013). Even in the case of non-random topologies, for example, the first generation Earth Simulator requires over two thousand kilometers of cabling, while the K-computer requires one thousand kilometers. The use of random links further increases aggregate cable length by up to 2-3 times in our calculation. Thus, in spite of good performance results, random shortcut might thus become impractical at large scale due to the added cost. Other practical concerns with random topologies are on routing implementation and traffic balancing. With respect to the former, random topologies are unlikely to support custom routing implementation while several nonrandom topologies can exploit topological regularity to make routing logic simple and small (e.g., dimension-order routing). Thus, a topology-agnostic deadlock-free routing, such as up*/down* routing is assumed, which however can introduce traffic imbalance (Koibuchi et al., 2013; Koibuchi et al., 2012).

Complete Article List

Search this Journal:
Reset
Volume 15: 1 Issue (2024)
Volume 14: 2 Issues (2023)
Volume 13: 8 Issues (2022)
Volume 12: 4 Issues (2021)
Volume 11: 4 Issues (2020)
Volume 10: 4 Issues (2019)
Volume 9: 4 Issues (2018)
Volume 8: 4 Issues (2017)
Volume 7: 4 Issues (2016)
Volume 6: 4 Issues (2015)
Volume 5: 4 Issues (2014)
Volume 4: 4 Issues (2013)
Volume 3: 4 Issues (2012)
Volume 2: 4 Issues (2011)
Volume 1: 4 Issues (2010)
View Complete Journal Contents Listing