Review of Fault Tolerance Frameworks in the Cloud

Ajay Rawat, Rama Sushil, Amit Agarwal
Copyright: © 2020 | Pages: 21
DOI: 10.4018/IJISMD.2020070105


Fault tolerance is the most critical issue in providing reliable cloud services. The cloud's inherent vulnerability to failure hampers the performance and reliability of those services. Fault tolerance is therefore mandatory for reliability, yet it is hard to implement because of the cloud's dynamic infrastructure and complex interdependencies. Numerous fault tolerance techniques have been proposed in the literature to address the challenges of cloud reliability. This paper surveys recent research and attempts to integrate the different fault tolerance architectures. It presents a critical review of existing fault tolerance techniques for improving service reliability, availability, and application execution in the cloud. It also includes a comparative analysis of the reviewed frameworks based on critical metrics such as failure prediction, detection strategy, failure history, VM placement, and limitations. This review is intended to facilitate the development of new fault tolerance techniques for the cloud environment.

Cloud computing is a revolutionary technology that reduces the cost of computing. Although it has gained widespread popularity in industry, load balancing, resource management, security, workflow, scheduling, and fault tolerance (FT) remain significant challenges. Among these, virtual machine (VM) management is the most considerable challenge, because faults can arise in the dynamic cloud environment and lead to unreliable outcomes. Many types of fault may occur in cloud-based computing resources, including hardware, software, timing, value, permanent, transient, network, processor, interaction, process, and omission faults (Salfner et al., 2010). The categorization of faults is shown in Figure 1. These faults can result in different types of failure in the cloud infrastructure, such as network, hardware, software, database, overflow, and time-out failures (Agarwal & Sharma, 2016; Mariani, 2003), as shown in Figure 2. Failures can also occur due to network congestion, server overload, malicious attacks, human factors, and various unknown errors (Oliner & Stearley, 2007; Oppenheimer & Patterson, 2002). Various causes of failure have been reported in a series of publications (Schroeder & Gibson, 2007, 2010).

Figure 1. Categorization of faults

