Fault Mitigation in Reconfigurable NoC Routers with Thin Design Rules

Elena Suvorova (Saint-Petersburg State University of Aerospace Instrumentation, St. Petersburg, Russia), Yuriy Sheynin (Saint-Petersburg State University of Aerospace Instrumentation, St. Petersburg, Russia) and Nadezhda Matveeva (Saint-Petersburg State University of Aerospace Instrumentation, St. Petersburg, Russia)
DOI: 10.4018/IJERTCS.2015010102

Abstract

Modern networks-on-chip (NoC) for embedded systems are manufactured with thin design rules, so they must be resistant to the failures specific to this technology. In this paper we consider failure mitigation approaches and evaluate them for thin design rules. Most fault mitigation approaches are based on reconfiguration of the NoC and its main components, the routers. We suggest a methodology for the development of reconfigurable routers with fault mitigation and evaluate such routers using simulation with dynamic failure injection. The proposed method can be used for routers with different structures in NoCs with various interconnection graphs.

2. Causes of Faults and Failures in SoC

2.1. Causes of Faults and Failures

The main types of errors are (ITRS, 2013; Reyneri et al., 2010; Pignol, 2010):

  • Transient faults (soft errors);

  • Permanent faults (hard errors).

After a soft error, the functionality of the SoC can be restored. A hard error results from irreparable damage.

Soft errors are (ITRS, 2013; Reyneri et al., 2010; Pignol, 2010):

  • Single event upset (SEU) – change of a value stored in a flip-flop (SRAM cell) to the opposite value;

  • Multiple cell upset (MCU) – change of values in several neighboring memory cells;

  • Single event transient (SET) – occurrence of a glitch at a transistor output;

  • Single event functional interrupt (SEFI) – a soft error in a component that affects the functionality of the whole system; for example, an error in the processor program counter.
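
The two memory-related soft errors above can be illustrated with a minimal sketch. It assumes a memory modeled as a list of integer words; the function names `inject_seu` and `inject_mcu` are hypothetical and are not taken from the paper or its simulator.

```python
from typing import List


def inject_seu(word: int, bit: int) -> int:
    """Model an SEU: flip one stored bit to the opposite value."""
    return word ^ (1 << bit)


def inject_mcu(cells: List[int], first_cell: int, count: int, bit: int) -> List[int]:
    """Model an MCU: flip the same bit in `count` neighboring cells."""
    upset = list(cells)  # copy, so the original contents stay available
    last = min(first_cell + count, len(upset))
    for i in range(first_cell, last):
        upset[i] = inject_seu(upset[i], bit)
    return upset


memory = [0b1010, 0b0000, 0b0000, 0b1111]
after_seu = inject_seu(memory[0], bit=0)
after_mcu = inject_mcu(memory, first_cell=1, count=2, bit=3)

# A soft error is recoverable: rewriting the cell restores the stored value.
recovered = list(after_mcu)
recovered[1] = memory[1]
recovered[2] = memory[2]
```

The copy-and-rewrite step at the end reflects the property stated earlier in this section: nothing is physically damaged, so the original functionality can be restored.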

Hard errors are (ITRS, 2013; Reyneri et al., 2010; Pignol, 2010):

  • Stuck-at fault (‘0’, ‘1’) – a point in the circuit permanently holds the value “0” or “1”. It is caused by physical damage to the layout, such as a broken line;

  • Single event latch-up (SEL) – dramatic increase of the leakage current;
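
As a rough illustration of how a stuck-at fault differs from a soft error, the sketch below models a single circuit node. The `Node` class and its `stuck_at` attribute are hypothetical names chosen for this example, not part of the paper's methodology.

```python
from typing import Optional


class Node:
    """One point of a circuit that can be driven and read."""

    def __init__(self) -> None:
        self.value = 0
        # None means healthy; 0 or 1 means a permanent stuck-at fault.
        self.stuck_at: Optional[int] = None

    def drive(self, value: int) -> None:
        """Drive a logic value onto the node."""
        self.value = value & 1

    def read(self) -> int:
        """Return the observed value; a stuck node ignores what was driven."""
        if self.stuck_at is not None:
            return self.stuck_at
        return self.value


node = Node()
node.drive(1)
healthy_read = node.read()

node.stuck_at = 0          # inject a stuck-at-0 hard error
node.drive(1)              # rewriting the value does not help
faulty_read = node.read()
```

Unlike the soft-error case, driving the node again does not restore correct behavior, which is why hard errors call for reconfiguration rather than a simple rewrite.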
