A Combined Approach for Implementation of Broadcast Operation in Grid Computing

A Combined Approach for Implementation of Broadcast Operation in Grid Computing

Ghalem Belalem, Mohammed Ilyes Kara Mostefa
Copyright: © 2013 |Pages: 12
DOI: 10.4018/jghpc.2013010102
OnDemand:
(Individual Articles)
Available
$37.50
No Current Special Offers
TOTAL SAVINGS: $37.50

Abstract

In distributed computing, the collective communications scheduling is among the most important scheduling problems related to intensive applications executed over heterogeneous platforms. The optimization of collective operations allows the improvement of parallel and distributed applications performance by reducing the completion time of these operations. In this paper, the authors are particularly interested by the optimization of broadcast operation executed over large scale distributed environment such as grid computing. For this aim, we combined the two levels approach implemented in MagPIe library proposed for the hierarchical large scale systems with the ECEF (Earliest Completion Edge First) heuristic proposed for the IPG (Information Power Grid). Simulation results show the advantage of our proposed hybrid strategy compared to the classical ECEF heuristic.
Article Preview
Top

Introduction

The last decade has seen the growth of a significant number of large scale infrastructures such as grid computing which construction became the main objective of different pacific researchers with many worldwide collaborations between funding agencies, commercial vendors, national centers and laboratories who came together to form a community of broad expertise to run a high intensive computing applications in different scientific fields such as image processing, biochemistry, simulations and data mining (Jinno et al., 2003; Kielmann et al., 2001).

These new grid infrastructures may include processing resources which may have any performance level, treatment and communication capability (Vadhiyar et al., 2000, Vadhiyar et al., 2004). There exist computing grids which involve nodes that are themselves high performance parallel machines or clusters. However, these nodes need to communicate very frequently to exchange data; intermediate results or rather provide the final result. When this type of communication involves many interlocutors, we say that we are talking about collective communication (Bernaschi & Iannello, 1998; Cha et al., 2003; Lowekamp & Beguelin, 1996). Recent researches show that running high computational applications on large scale parallel and distributed architectures become a challenging task. The performance of such applications is widely dominated by the collective communication cost on these architectures. This cost can often be harmful for such applications due to architectural features that comprise a large number of heterogeneous nodes and communication links.

In a large scale computational grid, one or more heterogeneity factors can exist like communication capabilities of nodes, network architectures and communication protocols (Bernaschi & Iannello, 1998). In a grid, we can have different nodes with different communication capabilities which may use different communication protocols to communicate and exchange messages and may even be interconnected in different ways. In a grid, different network architectures can coexist in the same LAN (Banikazeni et al., 1998; Figueira et al., 2002; Karonis et al., 2002, Lee & Han, 2005; Jeannot & Steffenel, 2007), the objective of the work presented in this paper is to optimize the completion time of broadcast operation. The broadcast operation belongs to the one to many collective communication patterns. In this collective operation, a so-called root process sends a data to all other processes. In other words: at the beginning of the operation, only one root process holds the message that should be broadcasted (Park et al., 2003); at the end of the operation, every process has a copy of that message (Barchet-Estefanel & Mounié, 2006; Thakur & Gropp, 2003; Vadhiyar et al., 2000) (see Figure 1). The completion time of broadcast operation is the time during which al messages are delivered.

Figure 1.

Broadcast example with 4 processes

jghpc.2013010102.f01

This paper is organized as follows. In the second section, we speak about related works, the section after describes our architecture model, the next section introduced our proposal which combines solutions previously suggested in order to optimize the operation of broadcast in heterogeneous environments, the following section presents work of simulations which show the contribution of our approach and finally we finish by conclusions.

Top

Most of the optimization solutions of collectives operations on parallel and distributed environments proposed tree-based collective communications scheduling, this structure can often avoid messages duplication problems, bottlenecks and network congestion: contention effect. The major difference is the choice of provision of the nodes in the tree, in other words: by which nodes begin the scheduling and by which nodes end. Since the problem of finding the optimal communication scheduling in distributed heterogeneous system is NP-complete (Bhat et al., 2003), different researchers proposed scheduling heuristics which produce solutions close to the optimal.

Complete Article List

Search this Journal:
Reset
Volume 16: 1 Issue (2024)
Volume 15: 2 Issues (2023)
Volume 14: 6 Issues (2022): 1 Released, 5 Forthcoming
Volume 13: 4 Issues (2021)
Volume 12: 4 Issues (2020)
Volume 11: 4 Issues (2019)
Volume 10: 4 Issues (2018)
Volume 9: 4 Issues (2017)
Volume 8: 4 Issues (2016)
Volume 7: 4 Issues (2015)
Volume 6: 4 Issues (2014)
Volume 5: 4 Issues (2013)
Volume 4: 4 Issues (2012)
Volume 3: 4 Issues (2011)
Volume 2: 4 Issues (2010)
Volume 1: 4 Issues (2009)
View Complete Journal Contents Listing