1. Introduction
Modern Field Programmable Gate Arrays (FPGAs) have become an attractive solution for many applications because of their ability to implement high-performance designs with relatively low cost, low power, and high flexibility compared to Application Specific Integrated Circuits (ASICs). Moreover, the recent increase in FPGA density allows complex systems that integrate many Intellectual Property (IP) cores to be implemented on a single device. According to Grand View Research (Grand View Research, 2020), the global FPGA market is expected to reach $18.8 billion by 2027, compared to $6 billion in 2015 (Manners, 2010). Based on this report, the main trending applications of FPGAs are automotive, data processing, military, aerospace, broadcast, smart cameras, and telecommunications. Image processing is fundamental to many of these applications. FPGAs have traditionally been used either to accelerate parts of an imaging system pipeline or to implement a complete imaging solution (Khalifat et al., 2015).
SRAM-based FPGAs are highly sensitive to radiation. Radiation can corrupt the data stored in the configuration memory in the form of Single-Event Upsets (SEUs) or Multiple Bit Upsets (MBUs), which can lead to errors in any part of the system. As the density of configuration cells in FPGAs increases, so does the probability of faults. Newer SRAM technologies offer better upset immunity, but they remain susceptible to upsets, albeit at a lower rate. According to a study published in (Xilinx, 2015), the 7-series has a total rate of 100 FIT/Mb for all elements, compared to 260 FIT/Mb in Virtex-4 (1 FIT/Mb = 1 upset per 10^9 hours per 10^6 configuration bits).
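As a rough illustration of the FIT/Mb unit, the following Python sketch converts a rate in FIT/Mb into expected upsets per hour and a mean time between upsets. The 100 and 260 FIT/Mb figures are taken from the text above; the configuration-memory size of 50 Mb is a hypothetical placeholder, not a figure for any particular device, and the function names are illustrative only.

```python
HOURS_PER_YEAR = 24 * 365


def upsets_per_hour(fit_per_mb: float, config_bits: float) -> float:
    """Expected upsets per hour, using 1 FIT/Mb = 1 upset per 1e9 hours per 1e6 bits."""
    megabits = config_bits / 1e6
    return fit_per_mb * megabits / 1e9


def mean_years_between_upsets(fit_per_mb: float, config_bits: float) -> float:
    """Mean time between upsets in years, assuming a constant upset rate."""
    return 1.0 / upsets_per_hour(fit_per_mb, config_bits) / HOURS_PER_YEAR


if __name__ == "__main__":
    # ASSUMPTION: 50 Mb of configuration memory, chosen only for illustration.
    ASSUMED_CONFIG_BITS = 50e6
    for family, rate in (("7-series", 100.0), ("Virtex-4", 260.0)):
        years = mean_years_between_upsets(rate, ASSUMED_CONFIG_BITS)
        print(f"{family}: {rate:.0f} FIT/Mb -> about {years:.1f} years between upsets")
```

Because the rate scales linearly with the number of configuration bits, a denser device at the same FIT/Mb figure sees proportionally more upsets, which is why higher density can offset part of the gain from improved cell immunity.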
This work exploits the flexibility of FPGAs to build a dynamic imaging system in which hardware functions are swapped in and out according to the required functionality using Dynamic Partial Reconfiguration (DPR). DPR allows the functionality of certain blocks in the FPGA to be changed at runtime without interrupting the operation of the rest of the system. DPR can be performed internally through an internal configuration port, such as the Internal Configuration Access Port (ICAP). The proposed system implements an efficient partially reconfigurable imaging pipeline between the image sensor and the output.
To enable the implemented DPR imaging system to work effectively in harsh environments, several fault mitigation techniques have been integrated into the system. Some of these techniques rely on the internal configuration port of the FPGA for both error detection and correction, while others use it only for correction, with detection performed by a different method, as discussed in the related work.
This paper addresses the reliability issues of the imaging system proposed in (Khalifat et al., 2015) and ways to improve it using the available techniques for both the reconfigurable part and the static part. The solutions are mainly based on Xilinx 7-series FPGAs, as Xilinx has a larger market share than other vendors. Recent studies report that Xilinx held around 50-55% of the FPGA market in 2020, and a higher share in academia. The contributions of this paper can be summarized in three points. First, the Soft Error Mitigation (SEM) IP (Xilinx, 2015; Crabill and Chang, 2015) is integrated into the system to evaluate its reliability by injecting errors and applying a correction mechanism. Second, for the static part of the system, a Built-in Self-Test (BIST) using a Cyclic Redundancy Check (CRC) is presented for an imaging system. Third, Triple Modular Redundancy (TMR) is applied to the Reconfigurable Region (RR) to increase the reliability of the reconfigurable part. The remainder of the paper is organized as follows. Section 2 gives a brief look at the imaging system, while Section 3 discusses the related work. Section 4 presents a brief background on the reliability features in Xilinx FPGAs. Section 5 discusses the deployment of the SEM IP on the system, its limitations, and possible improvements. Section 6 presents the BIST solution for the static part of the system. Section 7 discusses the TMR design of the reconfigurable part, and finally, conclusions are drawn in Section 8.
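For readers unfamiliar with TMR, the majority voting on which it relies can be sketched in a few lines of Python. This is a generic software model of a bitwise voter under the assumption of three identical module copies; it is not the voter design presented in Section 7, and the names `majority_vote` and `flip_bit` are illustrative only.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority: each output bit takes the value held by at least two inputs."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(word: int, bit: int) -> int:
    """Model a single-bit upset by inverting one bit of a word."""
    return word ^ (1 << bit)


if __name__ == "__main__":
    golden = 0b10110010                # output of a fault-free module copy
    faulty = flip_bit(golden, 3)       # one copy suffers a single-bit upset
    voted = majority_vote(golden, faulty, golden)
    print("upset masked:", voted == golden)
```

The voter masks any fault confined to a single copy. If the same bit is corrupted in two copies, the wrong value wins the vote, which is why TMR is usually combined with a repair mechanism that restores the faulty copy before a second upset accumulates.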