Superposed Redundancy Approach for Building Reliable Communication in Multi-Bus Heterogeneous Systems

Superposed Redundancy Approach for Building Reliable Communication in Multi-Bus Heterogeneous Systems

Chafik Arar (Computer Science Department, University of Batna 2, Batna, Algeria)
DOI: 10.4018/IJERTCS.2019010101

Abstract

In this article, the author uses a new variant of passive redundancy, which allows for a fictitious dual assignment by simultaneously scheduling two backup copies that overlap on the same communication bus at a given time. The proposed reliable fault tolerant greedy list scheduling algorithm is based on a superposed backup copy. This scheduling algorithm is considering up to n communication buses faults, caused by hardware faults and compensated by software redundancy solutions. it allows a reliable communication and efficient use of buses. In the experiments, the proposed methods are evaluated in terms of data scheduling length for a set of DSP benchmarks from the DSPstone.
Article Preview
Top

1. Introduction

Today, heterogeneous, embedded, distributed and real-time systems are becoming more and more important in our societies. This is reflected in the fact that such systems take over exceptionally critical decisions. But unfortunately, because of their complexity, these systems are very sensitive to faults, and due to potentially catastrophic consequences that may result from their malfunction, fault tolerance techniques are more needed to ensure that these systems continue to operate satisfactorily even in the presence of faults (Herault 2016; Gao, 2016; Grünsteidl et al., 2014; Davis et al., 2011).

In a system, hardware and software can somehow be the target of a variety of defects. Fault tolerance (Dongarra, et al., 2015; Ganesh, et al.,2014) is essential to ensure the continuity of the service of those systems; it can be implemented by different ways. In this work the author focuses on hardware faults, more particularly on communication faults and proposes a new reliable fault-tolerant scheduling algorithm for real-time embedded systems, dedicated to multi-bus heterogeneous architectures with multiple processors linked by several shared buses (Arar et al. 2013). The proposed scheduling algorithm is considering only bus's faults, caused by hardware faults and compensated by hybrid software redundancy solutions (Yang, et al., 2017).

This paper proposes a solution based on scheduling algorithms, more specifically those based on static scheduling (Singh, et al., 2015). This choice was largely motivated by the fact that static scheduling algorithms allow including the dependencies and the execution cost of tasks and data dependencies in their scheduling decisions, and the schedule is already computed at compile-time (Arar et al., 2015).

Inspired by the ability of a quantum system to be in multiple states at the same time (Dirac, 1981; Giancoli, 2016), author defines a new variant of passive backup copy called: superposed backup copy. The joint use of this new type of backup copy and the scheduling theory allows the scheduling of two superposed backup copies of two different data on the same bus at the same time. Overlapping of two superposed backup copies on the same bus, leads to an efficient use of resources by reducing the length of data scheduling on the buses (Jun et al.2014).

To evaluate the performance of the proposed algorithm, author compares its results with an exact solution. The exact solution in question is obtained through the use of linear programming in order to optimally formulate the fault-tolerant data scheduling problem, using two types of backup copies, with the main objective minimizing scheduling time.

The Formulation using linear programming provides the exact results, but since the problem is NP-hard, this solution generally takes long time to obtain an optimal solution, and for some cases a feasible solution can't be finding in an acceptable time. That's why, author propose a heuristic algorithm-based solution called FTA-RD-n. It makes it possible to, first maximize the reliability of the system and second, to minimize the length of the whole generated schedule in both presence and absence of faults Simulation results are able to show that our approach can generally reduce the run-time overhead.

The rest of the paper is organized as following. Next section discusses some related works. Section 3 gives detailed description of our system models. Section 4 give a motivational example, which shows how our approach can minimize the length of the whole generated schedule and describes the design and the implementation details of our scheduling algorithm. Section 5 describes the evaluation methodology and provides the results from our evaluation and analysis findings. Last section concludes the Work and discusses the future work.

Complete Article List

Search this Journal:
Reset
Open Access Articles: Forthcoming
Volume 11: 4 Issues (2020): 1 Released, 3 Forthcoming
Volume 10: 4 Issues (2019)
Volume 9: 2 Issues (2018)
Volume 8: 2 Issues (2017)
Volume 7: 2 Issues (2016)
Volume 6: 2 Issues (2015)
Volume 5: 4 Issues (2014)
Volume 4: 4 Issues (2013)
Volume 3: 4 Issues (2012)
Volume 2: 4 Issues (2011)
Volume 1: 4 Issues (2010)
View Complete Journal Contents Listing