Introduction
In distributed systems, data replication is the process of storing multiple copies of data at different nodes (Tenzekhti, Day, & Ould-Khaoua, 2002; Wang, & Li, 2006). The primary objective of data replication is to increase data availability and fault tolerance: if one of the nodes fails, the data can still be accessed from the other nodes. However, the major concern is keeping the data consistent across all nodes (Jajodia, & Mutchler, 1990). In general, replication is of two types, namely active and passive. Replication is said to be active if the update request is processed at every node; it is suited to deterministic processes. Passive replication, on the other hand, processes an update request on a single node and propagates the update to the other nodes; it can be used for both deterministic and nondeterministic processes (Deshpande, & Kamalapur, 2014).
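The difference between the two replication types can be illustrated with a minimal sketch. The names `Replica`, `active_update`, `passive_update`, and `add_local_tick` are hypothetical and chosen only for illustration. The nondeterministic process is simulated with a shared counter, so that each node that executes the operation obtains a different result.

```python
import itertools


class Replica:
    """A node holding one copy of the data (hypothetical minimal model)."""

    def __init__(self):
        self.store = {}


def active_update(replicas, key, operation):
    """Active replication: every node processes the update request itself."""
    for replica in replicas:
        replica.store[key] = operation(replica.store.get(key, 0))


def passive_update(replicas, key, operation):
    """Passive replication: one node processes the request and
    propagates the resulting value to the other nodes."""
    primary, others = replicas[0], replicas[1:]
    result = operation(primary.store.get(key, 0))
    primary.store[key] = result
    for replica in others:
        replica.store[key] = result


def is_consistent(replicas):
    """True if every node holds the same data as the first node."""
    return all(r.store == replicas[0].store for r in replicas)


def increment(value):
    """Deterministic operation: same input, same output on every node."""
    return value + 1


_ticks = itertools.count()


def add_local_tick(value):
    """Stand-in for a nondeterministic operation: the result differs
    each time it is executed (assumption made for this sketch)."""
    return value + next(_ticks)


nodes = [Replica() for _ in range(3)]
active_update(nodes, "x", increment)
print(is_consistent(nodes))   # True: deterministic, all copies agree

active_update(nodes, "x", add_local_tick)
print(is_consistent(nodes))   # False: each node computed a different value

fresh = [Replica() for _ in range(3)]
passive_update(fresh, "x", add_local_tick)
print(is_consistent(fresh))   # True: only one node executed the operation
```

Under these assumptions, the copies diverge as soon as a nondeterministic operation is executed at every node, whereas propagating a single computed result keeps them identical.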
In distributed systems, a fault (or a failure) can split the set of connected nodes (referred to as a network) into two or more disconnected networks. A fault may be permanent, transient, or intermittent (Koren, & Krishna, 2007; Panda, Khilar, & Mohapatra, 2013; Panda, Khilar, & Mohapatra, 2014; Mishra, & Panda, 2017; Panda, & Khilar, 2012; Panda, & Khilar, 2012; Bhoi, Panda, & Khilar, 2012). A permanent fault at a node reflects severe damage and disconnects the node from the network. A transient fault at a node occurs for a short duration and disconnects the node from the network for that time. An intermittent fault at a node oscillates between an active and an inactive state. Here, active means that the fault is present and the node is disconnected, whereas inactive means that the node works normally. In the presence of these faults, a network must ensure that the data at each node remains consistent (Koren, & Krishna, 2007).
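How a single faulty node splits a network can be sketched as follows. The function `partitions` and the five-node topology in `links` are hypothetical and serve only as an illustration. The sketch removes the faulty nodes and groups the remaining nodes by reachability using a breadth-first search.

```python
from collections import deque


def partitions(links, failed):
    """Return the groups of working nodes that can still reach each other
    once the nodes in `failed` are disconnected."""
    alive = set(links) - set(failed)
    seen, groups = set(), []
    for start in sorted(alive):
        if start in seen:
            continue
        group, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbour in links[node]:
                if neighbour in alive and neighbour not in group:
                    group.add(neighbour)
                    queue.append(neighbour)
        seen |= group
        groups.append(group)
    return groups


# Hypothetical topology: B links A, C, and D; D also links E.
links = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B"],
    "D": ["B", "E"],
    "E": ["D"],
}

print(len(partitions(links, failed=[])))      # 1: a single connected network
print(len(partitions(links, failed=["B"])))   # 3: {A}, {C}, and {D, E}
```

In this example, a permanent fault at B leaves three disconnected networks. For a transient or intermittent fault, the same split would hold only while the fault is present.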
There are several approaches to managing multiple copies of data at different nodes. One of them is voting among the copies. This approach can be hierarchical or non-hierarchical. In hierarchical voting, the nodes are represented in the form of a tree. Here, a read operation is carried out by reading any one of the copies, and a write operation is carried out by updating every copy. Non-hierarchical voting, on the other hand, depends on the availability of the node itself and of the nodes connected to it. Here, it is assumed that each node has exactly one vote.
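The two voting styles can be sketched under some simplifying assumptions. The functions `read_one`, `write_all`, and `has_majority` are hypothetical names. The tree structure of hierarchical voting is omitted, and the strict-majority rule used for the non-hierarchical case is a common choice assumed here for illustration; the text above states only that each node has one vote.

```python
def read_one(copies, reachable):
    """Read operation: return the value of any one reachable copy."""
    for node in reachable:
        if node in copies:
            return copies[node]
    raise RuntimeError("no copy is reachable")


def write_all(copies, reachable, value):
    """Write operation: succeeds only if every copy can be updated."""
    if not set(copies) <= set(reachable):
        return False          # at least one copy is unreachable
    for node in copies:
        copies[node] = value
    return True


def has_majority(partition, total_nodes):
    """One vote per node: assume a group of connected nodes may proceed
    only if it holds more than half of all votes."""
    return 2 * len(partition) > total_nodes


copies = {"n1": 10, "n2": 10, "n3": 10}

print(write_all(copies, {"n1", "n2", "n3"}, 11))   # True: all copies updated
print(write_all(copies, {"n1", "n2"}, 12))         # False: n3 is unreachable
print(read_one(copies, ["n1"]))                    # 11

print(has_majority({"n1", "n2"}, total_nodes=3))   # True: 2 of 3 votes
print(has_majority({"n3"}, total_nodes=3))         # False: 1 of 3 votes
```

The sketch shows the trade-off: reading one copy is cheap, but writing every copy fails as soon as a single node is disconnected, whereas a vote count lets the larger group of connected nodes continue.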