Self-Repair Technology for Global Interconnects on SoCs

Daniel Scheit (Brandenburg University of Technology Cottbus, Germany) and Heinrich Theodor Vierhaus (Brandenburg University of Technology Cottbus, Germany)
Copyright: © 2011 |Pages: 21
DOI: 10.4018/978-1-60960-212-3.ch009

Abstract

The reliability of interconnects on ICs has become a major problem in recent years, owing to rising complexity, low-k insulating materials with reduced stability, and wear-out effects caused by high current density. The overall reliability of a system on a chip depends more and more on the reliability of its interconnects, mainly because the growing number of integrated functional units increases the volume of communication. Published work predicts that static faults due to wear-out effects will occur more often, which harms reliability and decreases the mean time to failure. Most published techniques are aimed at the correction of transient faults; built-in self-repair has received less attention. In this chapter, the authors provide an overview of the state of the art in fault-tolerant interconnects and discuss the use of built-in self-repair in combination with other proven solutions. This combination is a promising way to deal with all kinds of faults.

Introduction

The total wire length on a chip will increase continuously (International Technology Roadmap for Semiconductors Interconnect, 2007). Simultaneously, the wire pitch and diameter will shrink, while the aspect ratio will increase. The current density will grow, because the voltage cannot be reduced linearly with the wire diameter. As a consequence of this scaling, the RC delay will also increase. These trends have a negative impact on chip and system reliability. A longer wire is more likely to fail than a shorter one, assuming that all other parameters are equal. The same holds for the number of wires: the more wires, the more likely it is that one of them fails. The decreased wire pitch makes fabrication more difficult, so faults become more likely. While defects introduced at production time are one cause, defects that arise from wear-out effects seem to gain importance with shrinking feature size. Two of the most important wear-out effects are electromigration and stress migration. A high current density at elevated temperature, or mechanical stress between metal and silicon, can lead to a transport of metal atoms. This transport produces voids and hillocks, which can end in an open wire or in shorts caused by broken insulators. The increasing aspect ratio leads to larger capacitances between adjacent wires. Coupling capacitances between lines cause statistical variations in signal delays, which can result in dynamic faults. Voltage drops on supply lines make the circuit more prone to transient faults caused by supply-voltage noise or electromagnetic interference.
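The claim that longer wires and larger numbers of wires fail more often can be illustrated with a minimal sketch. It is not the reliability model used in the chapter. It assumes, purely for illustration, that permanent faults occur independently with a constant failure rate per unit of wire length, and that a bus fails as soon as any one of its wires fails. The function names and the rate value are hypothetical.

```python
import math

# Hypothetical failure rate per millimetre of wire per hour.
# The value is an assumption chosen for illustration, not a measured figure.
LAMBDA_PER_MM_HOUR = 1e-9


def wire_survival(length_mm, t_hours, lam=LAMBDA_PER_MM_HOUR):
    """Probability that a single wire is still fault-free at time t.

    Assumes a constant failure rate proportional to wire length,
    which gives an exponential survival function.
    """
    return math.exp(-lam * length_mm * t_hours)


def bus_survival(n_wires, length_mm, t_hours, lam=LAMBDA_PER_MM_HOUR):
    """Probability that all n wires of an unprotected bus survive.

    Assumes independent failures and no redundancy, so the bus is a
    series system: one faulty wire makes the whole bus fail.
    """
    return wire_survival(length_mm, t_hours, lam) ** n_wires


def bus_mttf(n_wires, length_mm, lam=LAMBDA_PER_MM_HOUR):
    """Mean time to failure of the unprotected bus, in hours.

    For a series system of exponential components the rates add up,
    so the MTTF is the reciprocal of the total failure rate.
    """
    return 1.0 / (n_wires * length_mm * lam)


if __name__ == "__main__":
    ten_years = 10 * 365 * 24  # operating time in hours
    for n, length in [(32, 5.0), (32, 10.0), (64, 10.0)]:
        r = bus_survival(n, length, ten_years)
        print(f"{n} wires of {length} mm: survival after 10 years = {r:.4f}")
```

Under these assumptions, doubling either the wire length or the number of wires halves the mean time to failure of the bus, which is the trend described above. Real wear-out mechanisms such as electromigration do not have a constant failure rate, so the sketch only conveys the direction of the effect.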

The impact of interconnect reliability on overall system reliability is increasing. Global interconnects play an important role in systems-on-chip (SoCs). They are used to connect different cores and to implement test access mechanisms (TAMs) (Zorian & Marinissen, 2000). Test access mechanisms are needed to transport test vectors and test responses to and from the embedded cores. A fault-tolerant TAM is necessary to test integrated cores and to isolate them in case of an error, which could be caused by defects on local interconnects. The same is true for interconnects used for global communication. With fault-tolerant global communication and test access mechanisms, graceful degradation becomes possible.

In this chapter, we show what has been done and what can still be done to ensure a high level of reliability at acceptable cost. Several established methods for increasing reliability are presented first, followed by new methods and architectures. In the following background section, we introduce the existing and upcoming types of faults and fault mechanisms. We then show how they can be prevented and, if necessary, corrected using codes or built-in self-repair. The next section, "Built-in self-repair (BISR) of interconnects", is the main section of this chapter. At the beginning of that section, we define the goals that the solutions have to meet. We then explore the design space to see which solutions are possible, which allows us to develop the system level. To make this concrete, we describe two examples that are implemented down to the register transfer level. This is necessary to obtain values for the reliability model, which in turn shows the benefit of the additional circuits. After this section, we present possible research directions, followed by the conclusions and references.
