A Design Methodology of MIN-Based Network for MPPSoC on Reconfigurable Architecture


Y. Aydi, M. Baklouti, Ph. Marquet, M. Abid, J.L. Dekeyser
DOI: 10.4018/978-1-60960-086-0.ch009

Abstract

Massive parallel processing systems, particularly Single Instruction Multiple Data (SIMD) architectures, play a crucial role in the field of data-intensive parallel applications. A primary motivation for using these systems is their scalability: processing power grows almost linearly with the number of processing units. However, the communication network remains the major challenge facing researchers. One of the most important on-chip networks for parallel systems is the multistage interconnection network (MIN). In this paper, we propose a design methodology of multistage interconnection networks for massively parallel systems on chip. The framework covers the design steps from the algorithmic level down to RTL. We first develop a functional formalization of the MIN-based on-chip network at a high level of abstraction. The specification and the validation of the model are defined in the logic of the ACL2 proving system. The main objective of this step is to provide a formal description of the network that integrates the architectural parameters which have the greatest impact on design costs. After validating the functional model, the second step consists of the design and implementation of Delta multistage networks on chip dedicated to parallel multi-core architectures on reconfigurable FPGA platforms. In the last step, we propose an evaluation methodology based on performance and cost metrics to compare different dynamic network topologies on data-parallel applications with different numbers of cores. We also show with the proposed framework that multistage interconnection networks are cost-effective, high-performance networks for parallel SoCs.
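To make the Delta-network idea concrete, the following is a minimal Python sketch of destination-tag (self-) routing in an Omega-type Delta network with N = 2^n terminals. It is an illustrative model only, not the chapter's ACL2 formalization or its RTL implementation; the function name and interface are assumptions introduced here for exposition.

# Minimal sketch of destination-tag routing in an Omega-type Delta network.
# Hypothetical helper, not the chapter's ACL2 model or FPGA RTL.

def omega_route(src: int, dst: int, n: int):
    """Return the port (0 = upper, 1 = lower) taken at each of the n stages."""
    size = 1 << n
    assert 0 <= src < size and 0 <= dst < size
    pos = src
    ports = []
    for stage in range(n):
        # Perfect-shuffle link pattern: rotate the n-bit position left by one.
        pos = ((pos << 1) | (pos >> (n - 1))) & (size - 1)
        # Each 2x2 switch routes on one destination bit, MSB first.
        bit = (dst >> (n - 1 - stage)) & 1
        pos = (pos & ~1) | bit
        ports.append(bit)
    assert pos == dst  # after n stages the packet reaches its destination
    return ports

# Example: route terminal 3 to terminal 6 in an 8x8 (n = 3) network.
print(omega_route(3, 6, 3))  # -> [1, 1, 0], the bits of 6 from MSB to LSB

The key property illustrated here is that routing is purely local: each switch only needs one bit of the destination address, which is what makes Delta networks attractive for low-cost on-chip implementation.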

Introduction

A large number of data-parallel applications, especially multimedia, image processing, and real-time applications ported to embedded systems, require intensive computation. The complexity of these devices has led to the emergence of parallel programming and of parallel systems on a chip such as clusters, multiprocessor systems, grid systems, etc. These systems can deliver the speed required by data-parallel applications, given the increasing complexity of the computation, the size of the data sets to be processed, and the time constraints within which a solution must be obtained. SIMD (Single Instruction Multiple Data) parallel systems play a crucial role in the field of intensive signal processing (Meilander, Baker, M. J., 2003) because of their area and energy efficiency. They are effective for applications that are highly parallelizable and require the same operation to be executed over and over again. Massively parallel machines make use of fine-grained computational units, called Processing Elements (PEs), working in parallel to speed up computation. The PEs are connected together by some simple network topology, which is often custom-made for the type of application the machine is intended for (Flynn, 1972). In SIMD machines, a main processor is responsible for synchronously controlling the whole architecture.
Targeting high-performance applications requires that developers create implementations that are fast enough to meet demanding processing requirements, are developed quickly enough to reach the time to market, and can easily be updated to provide different functionality. Given the rich domain of parallel applications, it is always possible to find a set of applications that perform well on a given parallel architecture. However, it is often difficult to determine which architecture is best for a given application. Reconfigurable architectures are therefore a key step in the development of such systems. Nowadays we have a great variety of high-capacity programmable chips, also called reconfigurable devices (FPGAs), on which we can easily integrate complete SoC architectures for many different applications. FPGA reconfigurability can be exploited to implement a reusable design, which can be adjusted for specific applications without altering the basic structure.
While SIMD systems may have been out of fashion in the 1990s, they are now being developed to make effective use of the millions of transistors available and to build on new design methodologies. By using VLSI technology based on intellectual properties (IPs) and effective replication of components, massively parallel processors can achieve high performance at low cost. Key issues are how the processor and memory are partitioned and replicated, and how communication and I/O are accomplished. In fact, for most parallel systems, the communication network is considered one of the main challenges facing researchers. These parallel systems require a cost-effective yet high-performance interconnection scheme to provide the needed communication between processors. This interconnect must support the entire inter-component data traffic and has a significant impact on the overall system performance (Pasricha, Dutt, Bozorgzadeh & Ben-Romdhane, 2005).
As a promising alternative, Networks on Chip (NoCs) have been proposed by academia and industry to handle the communication needs of future multiprocessor systems-on-chip, in particular parallel processing SoCs (Benini & Micheli, 2002). In comparison with previous communication platforms (e.g., a single shared bus, a hierarchy of buses, dedicated point-to-point wires), NoCs provide enhanced performance and scalability. These advantages are achieved thanks to the efficient sharing of wires and a high level of parallelism (Dally & Towles, 2001). The NoC is considered the trend for future generations of multi-core processors (Schack, Heenes & Hoffmann, 2009).
