Data Replication in Cloud Systems: A Survey

Khaoula Tabet (University Larbi Tebessi, Department of Mathematics and Computer Science, Tebessa, Algeria), Riad Mokadem (Paul Sabatier University, Toulouse, France), Mohamed Ridda Laouar (University Larbi Tebessi, Department of Mathematics and Computer Science, Tebessa, Algeria) and Sean Eom (Southeast Missouri State University, Management Information Systems, Cape Girardeau, MO, USA)
Copyright: © 2017 | Pages: 17
DOI: 10.4018/IJISSC.2017070102

Abstract

This paper presents a survey of data replication strategies in cloud systems. Based on the survey and reviews of existing classifications, we propose another classification of replication strategies based on the following five dimensions: (i) static vs. dynamic, (ii) reactive vs. proactive workload balancing, (iii) provider vs. customer centric, (iv) optimal number vs. dynamic adjustment of the replica factor and (v) objective function. Ideally, a good replication strategy must simultaneously consider multiple criteria: (i) the reduction of access time, (ii) the reduction of the bandwidth consumption, (iii) the storage resource availability, (iv) a balanced workload between replicas and (v) a strategic placement algorithm including an adjusted number of replicas. Therefore, selecting a data replication strategy is a classic example of multiple criteria decision making problems. The taxonomy we present can be a useful guideline for IT managers to select the data replication strategy for their organization.

Introduction

With the increasing globalization of contemporary business organizations, distributed databases and their management have become one of the key areas in database research. A distributed database is a single logical database scattered across multiple computers. Basically, there are two options for distributing a database: data partitioning or data replication. Data replication is one of the important decisions in organizations (Km & Eom, 2016). It refers to the creation of identical copies of data (replicas). Data partitioning is the other strategy for distributing a database: it splits a table by rows (horizontal partitioning) or by columns (vertical partitioning).
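
The difference between the two options can be illustrated with a minimal Python sketch. The table, column names, site names, and helper functions below are all hypothetical and exist only for illustration; a real distributed DBMS handles fragmentation and copy placement internally.

```python
# Toy table: each dict is one record. All names here are made up for illustration.
customers = [
    {"id": 1, "name": "Ana", "city": "Tebessa"},
    {"id": 2, "name": "Bo", "city": "Toulouse"},
    {"id": 3, "name": "Cy", "city": "Tebessa"},
    {"id": 4, "name": "Di", "city": "Toulouse"},
]


def replicate(table, sites):
    """Replication: every site holds an identical, independent copy."""
    return {site: [dict(row) for row in table] for site in sites}


def partition_horizontally(table, site_of):
    """Horizontal partitioning: each site holds a subset of the rows."""
    fragments = {}
    for row in table:
        fragments.setdefault(site_of(row), []).append(dict(row))
    return fragments


def partition_vertically(table, key, column_groups):
    """Vertical partitioning: each site holds a subset of the columns,
    plus the key column so that rows can be rejoined later."""
    return {
        site: [{c: row[c] for c in (key, *columns)} for row in table]
        for site, columns in column_groups.items()
    }


replicas = replicate(customers, ["site_a", "site_b"])
by_city = partition_horizontally(customers, lambda row: row["city"])
by_column = partition_vertically(
    customers, "id", {"site_a": ["name"], "site_b": ["city"]}
)
```

With replication, any site can answer any query on its own, at the cost of storing and synchronizing full copies. With partitioning, each site stores less, but a query that spans fragments must contact several sites.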

This paper presents a survey of data replication strategies in cloud systems. Data replication improves data availability, response time, and fault tolerance, and reduces network traffic. It is frequently used in: (i) DBMS (Pérez, García-Carballeira, Carretero, Calderón, & Fernández, 2010), (ii) parallel and distributed systems (Loukopoulos, Lampsas, & Ahmad, 2005; Benoit, Rehn-Sonigo, & Robert, 2008), (iii) mobile systems (Tos, Mokadem, Hameurlain, Ayav, & Bora, 2016) and (iv) large scale systems, including P2P (Xhafa, Kolici, Potlog, Spaho, Barolli, & Takizawa, 2012) and data Grid systems (Mansouri, Azad, & Chamkori, 2014). Many proposed replication strategies aim to answer the following questions:

  • What data should be replicated?

  • When should the data be replicated?

  • Where should the new replicas be placed?

Data replication is a necessary tool for effectively managing a database in distributed database environments. Most works in the literature classify replication strategies by the following criteria: (i) static vs. dynamic classification (Chervenak, Deelman, Foster et al., 2002; Čibej, Slivnik, & Robič, 2005), (ii) centralized vs. decentralized replication (Sashi & Thanamani, 2011; Amjad, Sher, & Daud, 2012; Grace & Manimegalai, 2014), (iii) server vs. client replication (Doğan, 2009; Steen & Pierre, 2010), (iv) objective function based classification (Mokadem & Hameurlain, 2015), and (v) system architecture based classification (Tos, Mokadem, Hameurlain et al., 2015). However, the existing replication strategies are not adapted to cloud systems. They aim to obtain the best performance without taking the profit of cloud providers or the satisfaction of tenant requirements into account. Creating as many replicas as possible in clouds may not be economically feasible. Hence, replication strategies in such environments should ensure both the tenant's Quality of Service (QoS) and the economic profitability of the provider.
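
The tension between tenant QoS and provider profit can be made concrete with a toy decision rule, sketched below in Python. This is an illustrative assumption, not a strategy taken from any of the surveyed papers: the class `ReplicaDecisionInput`, its fields, and the function `should_add_replica` are hypothetical, and a response-time threshold stands in for the tenant's QoS requirement.

```python
from dataclasses import dataclass


@dataclass
class ReplicaDecisionInput:
    # All fields are hypothetical quantities chosen for this sketch.
    observed_response_time_ms: float  # measured for the tenant's queries
    sla_response_time_ms: float       # threshold agreed with the tenant
    revenue_per_period: float         # what the tenant pays the provider
    current_cost_per_period: float    # provider's current cost for this tenant
    replica_cost_per_period: float    # extra cost of one more replica


def should_add_replica(d: ReplicaDecisionInput) -> bool:
    """Add a replica only if the response-time threshold is violated
    and the provider would still make a profit afterwards."""
    sla_violated = d.observed_response_time_ms > d.sla_response_time_ms
    profit_after = d.revenue_per_period - (
        d.current_cost_per_period + d.replica_cost_per_period
    )
    return sla_violated and profit_after > 0
```

Under this rule, a performance-only strategy would replicate whenever the threshold is exceeded, whereas the profit check blocks replicas that the provider cannot afford. Real strategies would also have to choose which data to copy and where to place it.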

We propose another classification of replication strategies based on the following five dimensions:

  • Static vs. dynamic (Ghemawat, Gobioff, & Leung, 2003; Bai, Jin, Liao et al., 2013);

  • Reactive vs. proactive workload balancing (Silvestre, Monnet, Krishnaswamy et al., 2012; Hussein & Mousa, 2014);

  • Provider-centric vs. customer-centric (Sakr & Liu, 2012; Sousa & Machado, 2012);

  • Minimal blocking probability (Xue, Shen, & Guo, 2015) and energy efficiency and bandwidth consumption (Boru, Kliazovich, Granelli et al., 2015);

  • Objective function (Bonvin, Papaioannou, & Aberer, 2011; Kirubakaran, Valarmathy, & Kamalanathan, 2013; Tos et al., 2016).
