A Migration Approach for Fault Tolerance in Cloud Computing

Said Limam, Ghalem Belalem
Copyright: © 2014 |Pages: 14
DOI: 10.4018/ijghpc.2014040102

Abstract

Cloud computing has become a significant technology trend and an effective way to provide a flexible, on-demand, and dynamically scalable computing infrastructure for many applications. With cloud computing, users use a variety of devices to access programs, storage, and application-development platforms over the Internet, via services offered by cloud computing providers. The probability that a failure occurs during execution grows as the number of nodes increases; since it is impossible to fully prevent failures, one solution is to implement fault tolerance mechanisms. Fault tolerance has become a major task for computer engineers and software developers because the occurrence of faults increases the cost of using resources. In this paper, the authors propose an approach that combines migration and checkpoint mechanisms. The checkpoint mechanism minimizes the time lost and reduces the effect of failures on application execution, while the migration mechanism guarantees the continuity of application execution and avoids any loss due to hardware failure in a transparent and efficient way. The simulation results show the effectiveness of the proposed approach to fault tolerance in terms of execution time and masking the effects of failures.

Introduction

Cloud computing can be defined as a new style of computing in which dynamically scalable and often virtualized resources are provided as services over the Internet. Cloud computing has become a significant technology trend, and many experts expect that it will reshape information technology (IT) processes and the IT marketplace. With cloud computing, users use a variety of devices, including PCs, laptops, smartphones, and PDAs, to access programs, storage, and application-development platforms over the Internet, via services offered by cloud computing providers. Cloud computing provides the illusion of unlimited resources available to fulfill dynamic and variable user requirements. Its advantages include cost savings, high availability, and easy scalability (Aljawarneh, 2011; Kaushal & Bala, 2011; Arshad et al., 2012).

Cloud computing takes the technology, services, and applications that are similar to those on the Internet and turns them into a self-service utility. The word “cloud” refers to two essential concepts:

  • Abstraction: Cloud computing abstracts the details of system implementation from users and developers. Applications run on physical systems that aren't specified, data is stored in locations that are unknown, administration of systems is outsourced to others, and access by users is ubiquitous;

  • Virtualization: Cloud computing virtualizes systems by pooling and sharing resources. Systems and storage can be provisioned as needed from a centralized infrastructure, costs are assessed on a metered basis, multi-tenancy is enabled, and resources are scalable with agility (Sosinsky, 2011).

Virtualization techniques are commonly used in cloud platforms to partition resources. Instead of having direct access to cloud resources, customers have access to virtual machines, each of which represents a fraction of a physical machine. Three layers can therefore be identified in such a cloud infrastructure: the physical resource layer (containing the overall cloud resources), the virtualization layer (containing virtual machines), and the applications layer (containing applications of external companies, which are hosted in the cloud) (Tchana et al., 2012).
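The three layers described above can be pictured as nested data structures. The following Python sketch is only an illustration under stated assumptions: the class names, the `cpu_share` field, and the `migrate` helper are hypothetical and are not taken from the paper, and the capacity check is a deliberately simplified stand-in for a real placement policy. It shows why the layering matters for fault tolerance: because applications run inside virtual machines rather than directly on hardware, a virtual machine can be moved to another physical machine without changing the applications it hosts.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Application:
    """Applications layer: a customer workload hosted in the cloud."""
    name: str


@dataclass
class VirtualMachine:
    """Virtualization layer: a fraction of one physical machine."""
    vm_id: str
    cpu_share: float  # hypothetical: fraction of the host CPU, in (0, 1]
    apps: List[Application] = field(default_factory=list)


@dataclass
class PhysicalMachine:
    """Physical resource layer: a host whose capacity is split among VMs."""
    host_id: str
    healthy: bool = True
    vms: List[VirtualMachine] = field(default_factory=list)

    def free_share(self) -> float:
        """Fraction of this host's CPU not yet assigned to a VM."""
        return 1.0 - sum(vm.cpu_share for vm in self.vms)


def migrate(vm: VirtualMachine,
            source: PhysicalMachine,
            target: PhysicalMachine) -> bool:
    """Move a VM to another host if the target is healthy and has room.

    Simplified assumption: only CPU share is checked. The applications
    inside the VM are untouched, which is what makes the move transparent
    to the applications layer.
    """
    if not target.healthy or target.free_share() < vm.cpu_share:
        return False
    source.vms.remove(vm)
    target.vms.append(vm)
    return True


if __name__ == "__main__":
    host_a = PhysicalMachine("host-a")
    host_b = PhysicalMachine("host-b")
    vm = VirtualMachine("vm-1", cpu_share=0.5, apps=[Application("web-shop")])
    host_a.vms.append(vm)

    host_a.healthy = False  # simulate a hardware problem on host-a
    moved = migrate(vm, host_a, host_b)
    print(moved, [v.vm_id for v in host_b.vms])
```

In this sketch the decision to migrate is triggered by hand; a real system would need failure detection and state transfer, which are outside the scope of the example.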

The reliability of cloud computing remains a major concern among users. Due to economic pressures, these computing infrastructures often use commodity components, exposing the hardware to scale and conditions for which it was not originally designed (Vishwanath & Nagappan, 2010). As a result, a large number of failures manifest in the system and have serious implications for the hosted applications, impacting their availability and performance. For example, Amazon’s Elastic Compute Cloud (EC2) experienced failures in Elastic Block Storage (EBS) drives and network configuration (“Amazon Elastic Compute Cloud,”), bringing down thousands of hosted applications and websites for 3 days and 3 hours (“Summary of the Amazon EC2”). Table 1 lists incident records from several cloud service providers.

Table 1. Cloud incidents

| Organization | Services | Summary | Duration | Date |
|---|---|---|---|---|
| Dropbox | Website, Mobile Apps, API | Dropbox's website, mobile apps, and API had an outage caused by issues during routine internal maintenance (“Outage post-mortem”, 2014) | 1 day | 2014-01-10 |
| Apple Inc. | iTunes, Apple App Store, iCloud, Calendar | Apple Inc.'s iTunes, App Store, iCloud, and Calendar services had an outage that affected some users (“Apple’s App Store, iTunes back online after outages,”) | 6 hours | 2013-02-20 |
| Microsoft Corporation | Windows Azure Compute | Microsoft Windows Azure suffered an extensive, worldwide outage in February that wasn’t fully addressed for more than 24 hours (“Microsoft Offers Credits,” 2012) | 1 day | 2012-02-28 |
| Microsoft Corporation | Microsoft Windows Azure | Microsoft Azure: the service came out of beta. The outage left people without access to their applications (“Microsoft Windows Azure 22-Hour”, 2009) | 22 hours | 2009-03-13 |
| Google, Inc. | Gmail | Users went without Gmail access (“Gmail Back After 30 Hours,”) | 1 day and 6 hours | 2008-10-16 |
