An Efficient Hardware/Software Communication Mechanism for Reconfigurable NoC

Wei-Wen Lin (National Chung Cheng University, Taiwan, R.O.C.), Jih-Sheng Shen (National Chung Cheng University, Taiwan, R.O.C.) and Pao-Ann Hsiung (National Chung Cheng University, Taiwan, R.O.C.)
DOI: 10.4018/978-1-61520-807-4.ch004

Abstract

With the progress of technology, more and more intellectual properties (IPs) can be integrated into a single chip. The performance bottleneck has shifted from the computation in individual IPs to the communication among IPs. The Network-on-Chip (NoC) was proposed to provide high scalability and parallel communication. An ASIC-implemented NoC lacks flexibility and has a high non-recurring engineering (NRE) cost. As an alternative, we can implement an NoC in a Field Programmable Gate Array (FPGA). In addition, FPGA devices can support dynamic partial reconfiguration, so that hardware circuits can be configured into an FPGA at run time when necessary, without interfering with hardware circuits that are already running. Such an FPGA-based NoC, namely a reconfigurable NoC (RNoC), is more flexible, and its NRE cost is much lower than that of an ASIC-based NoC. Dynamic partial reconfiguration raises several issues in RNoC design. We focus on how communication between hardware and software can be made efficient for an RNoC. We implement three communication architectures for RNoC, namely a single output FIFO-based architecture, a multiple output FIFO-based architecture, and a shared memory-based architecture. The average communication memory overhead is lower on the single output FIFO-based architecture and the shared memory-based architecture than on the multiple output FIFO-based architecture when the lifetime interval is smaller than 0.5. The performance analysis uses real applications. These examples show that the multiple output FIFO-based architecture outperforms the single output FIFO-based architecture by as much as 1.789 times, and that the shared memory-based architecture outperforms the single output FIFO-based architecture by as much as 1.748 times.

Introduction

With the progress of technology, we are able to integrate several intellectual properties (IPs) into a single chip. The performance bottleneck has shifted from the computation in individual IPs to the communication among IPs. Common on-chip communication architectures include point-to-point dedicated wire-based and shared bus-based architectures. Although the dedicated wire-based architecture can guarantee the required communication bandwidth between IPs, it suffers from low link utilization. The shared bus-based architecture provides higher link utilization than the dedicated wire-based architecture. However, as the number of IPs increases, the shared bus-based architecture faces severe contention among IPs. The Network-on-Chip (NoC) has been proposed as a feasible solution to both the link utilization and the contention problems. In an NoC, routers are responsible for data transmissions among IPs. An NoC provides high link utilization and alleviates the contention problem because its routers operate in parallel. However, an ASIC-implemented NoC lacks flexibility and has a high non-recurring engineering (NRE) cost. As an alternative, we can implement an NoC in a Field Programmable Gate Array (FPGA). Such an FPGA-based NoC is more flexible, and its NRE cost is much lower than that of an ASIC-based NoC. In addition, FPGA devices can support dynamic partial reconfiguration, so that hardware circuits can be configured into an FPGA at run time when necessary, without interfering with hardware circuits that are already running. Because of dynamic partial reconfiguration, an FPGA-based reconfigurable NoC system is able to accommodate the execution of more hardware applications than an ASIC-implemented NoC system given the same amount of system resources.
Nevertheless, an FPGA-based reconfigurable NoC system still has many problems to be solved, such as hardware module placement, hardware application scheduling, reconfigurable topologies, reconfigurable IP design, and hardware/software (HW/SW) communication.

In general, a system is composed of many components, such as processors, memory, and I/O peripherals. With the improvement of semiconductor manufacturing technology, an entire system can be integrated into a single chip, often called a System-on-Chip (SoC). As summarized by the International Technology Roadmap for Semiconductors (ITRS) [2008], semiconductor manufacturing technology is progressing from 180 nm in 2000 to a projected 22 nm in 2016. This trend indicates that future SoCs will include more and more IP cores. Consequently, as the number of IP cores in SoCs increases, the communication architecture plays a more important role in system performance.

A common communication architecture consists of dedicated wires, as shown in Figure 1. The dedicated wire-based architecture has high communication efficiency because of the point-to-point routing resources between each pair of IP cores. However, as the number of IP cores increases, the dedicated wire-based architecture suffers from the growing complexity of placement and routing. It also suffers from low link utilization, because the dedicated wires can be used only by the IP cores they connect. Figure 2 illustrates another general communication architecture, namely the shared bus. As the name implies, a bus is shared by the IP cores connected to it. Therefore, the shared bus-based architecture has higher link utilization than the dedicated wire-based architecture. However, it needs a global arbiter to grant bus access to one IP core at a time. The arbiter must resolve bus access contention among IP cores, so that two or more IP cores never access the bus concurrently. When more and more IP cores are integrated in a shared bus-based architecture, IP cores face long waiting periods to access the bus due to severe contention. These observations show that both the dedicated wire-based architecture and the shared bus-based architecture suffer from scalability issues: dedicated wires cause low link utilization, and the shared bus lacks the capability for parallel communication. Hence, a novel interconnect communication architecture called the Network-on-Chip (NoC) was proposed to provide a feasible solution [Benini & Micheli, 2002; Duato et al., 1997].
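The scalability argument above can be illustrated with a rough back-of-the-envelope sketch. The model below is not taken from the chapter: it assumes full pairwise connectivity for the dedicated wire-based architecture, and for the shared bus it assumes a round-robin arbiter, a fixed transfer length, and all IP cores requesting the bus in the same cycle. The function names and the transfer length of 8 cycles are hypothetical, chosen only for illustration.

```python
def dedicated_link_count(num_cores: int) -> int:
    """Point-to-point links needed to connect every pair of IP cores.

    Assumption: every core talks to every other core directly.
    """
    return num_cores * (num_cores - 1) // 2


def round_robin_bus_waits(num_cores: int, transfer_cycles: int) -> list[int]:
    """Waiting time (in cycles) of each core on a single shared bus.

    Assumptions: all cores request the bus at cycle 0, a global arbiter
    grants them one at a time in fixed rotation, and every transfer
    occupies the bus for the same number of cycles.
    """
    waits = []
    clock = 0
    for _ in range(num_cores):
        waits.append(clock)       # this core has waited until now
        clock += transfer_cycles  # bus is busy for the whole transfer
    return waits


if __name__ == "__main__":
    for n in (4, 16, 64):
        waits = round_robin_bus_waits(n, transfer_cycles=8)
        print(
            f"cores={n:3d}  dedicated links={dedicated_link_count(n):5d}  "
            f"worst bus wait={max(waits):4d} cycles"
        )
```

Under these assumptions the number of dedicated links grows quadratically with the number of IP cores, while the worst-case bus waiting time grows linearly because transfers are serialized. Both trends are consistent with the scalability concerns described above.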

Figure 1. Dedicated wire-based communication architecture
