Efficient Fault Tolerance on Cloud Environments

Sam Goundar (CENTRUM Catolica Graduate School of Business, Pontificia Universidad del Peru, Peru) and Akashdeep Bhardwaj (University of Petroleum & Energy Studies, Dehradun, India)
Copyright: © 2018 | Pages: 12
DOI: 10.4018/IJCAC.2018070102

Abstract

With mission-critical web applications and resources being hosted on cloud environments, and cloud services growing fast, the need for a greater level of service assurance regarding fault tolerance for availability and reliability has increased. The high priority now is ensuring a fault-tolerant environment that keeps systems up and running. To minimize the impact of downtime or accessibility failures caused by systems, network devices, or hardware, such failures are expected to be anticipated and handled proactively in a fast, intelligent way. This article discusses fault tolerance systems for cloud computing environments and analyzes whether they are effective for cloud environments.

2. Fault Tolerance For Cloud Environments

Fault tolerance aims to ensure that a system is able to deliver its service in the event of one or more failures of its components. Fault tolerance (Anjali et al., 2016) means that system resource availability and reliability are not affected when any of the preceding components or execution devices (Mohammed et al., 2016) fails, or when there are multiple failures in the hosted application system or infrastructure devices (Zhang et al., 2011). Systems, devices, or resources are often over-provisioned or purposely underutilized so that, even if application performance is affected during an outage, the systems continue to perform within predictable and acceptable bounds, possibly at a reduced level, rather than failing completely. Fault tolerance is mostly implemented in high-availability, life-critical system environments. Providing a fault-tolerant design (Patra et al., 2013) for every single component is, however, not an effective solution. The associated redundancy and over-provisioning bring a number of parasitic penalties: increases in weight, cost, size, and power consumption, as well as in the time needed to design, verify, and test before delivering the service. The following options are taken into account when determining how and why the computing components should be fault tolerant:
