1. Introduction
Checkpointing and rollback recovery is a well-known backward error recovery technique that reduces the computation lost to process faults. During failure-free execution in a distributed system, process states and exchanged messages are periodically recorded on designated stable storage (Kuang et al., 2014; Meroufel et al., 2014; Islam et al., 2014; Mendizabal et al., 2014; Awasthi et al., 2014). Each saved copy of a process state is called a checkpoint. In log-based rollback recovery schemes, the events a process experiences are also recorded in a location that will survive a process fault; this action is called logging (Kupta et al., 2008).
Upon a fault, a recovery mechanism brings the failed process back to normal execution. During recovery, loading the process checkpoint is called a rollback. Re-executing the lost computation, starting from the checkpoint and replaying the logged events up to the point just before the fault, is called a recovery (Elnozahy et al., 2002).
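The terms defined above (checkpoint, logging, rollback, recovery) can be illustrated with a minimal single-process sketch. It is not the scheme proposed in this article. The names `StableStorage`, `Process`, and `demo` are hypothetical, stable storage is simulated by an in-memory object assumed to survive the fault, and events are assumed to be deterministic integer updates.

```python
import copy


class StableStorage:
    """Hypothetical stand-in for storage that survives a process fault."""

    def __init__(self):
        self.checkpoint = None  # last saved copy of the process state
        self.log = []           # events recorded since that checkpoint


class Process:
    def __init__(self, storage):
        self.storage = storage
        self.state = {"total": 0, "handled": 0}

    def take_checkpoint(self):
        # Checkpoint: save a copy of the state on stable storage.
        # Events logged before this point are no longer needed for replay.
        self.storage.checkpoint = copy.deepcopy(self.state)
        self.storage.log.clear()

    def handle(self, event, replay=False):
        # Logging: record the event on stable storage before applying it.
        if not replay:
            self.storage.log.append(event)
        self.state["total"] += event
        self.state["handled"] += 1

    def crash(self):
        # Simulated fault: the volatile state is lost.
        self.state = None

    def recover(self):
        # Rollback: load the last checkpoint.
        self.state = copy.deepcopy(self.storage.checkpoint)
        # Replay the logged events up to the point just before the fault.
        for event in list(self.storage.log):
            self.handle(event, replay=True)


def demo():
    storage = StableStorage()
    p = Process(storage)
    p.take_checkpoint()
    for e in (5, 7):
        p.handle(e)
    p.take_checkpoint()
    for e in (1, 2, 3):
        p.handle(e)
    before = copy.deepcopy(p.state)
    p.crash()
    p.recover()
    return before, p.state
```

Calling `demo()` returns the state just before the simulated fault and the state after recovery; under the determinism assumption the two are equal. A real scheme must also handle messages between processes and the placement of stable storage, which this sketch leaves out.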
Mobile computing introduces many new characteristics, such as mobility, disconnections, a finite power source, vulnerability to physical damage, and a lack of stable storage (Kupta et al., 2008; Elnozahy et al., 2002). As a result, wireless network connections are more fragile, and mobile hosts are much less reliable, than their counterparts in traditional wired distributed computing. A mobile host may disconnect from the rest of the network because of doze mode, an abrupt power-off, or permanent damage. It is therefore all the more desirable for mobile computing systems to be equipped with an appropriate rollback recovery scheme that minimizes the computation lost to process faults.
Research on fault-tolerant rollback recovery schemes for mobile computing systems has received considerable interest in recent years. Various schemes have been proposed to accommodate the characteristics of mobile computing (Ono et al., 2004; Brzezinsk et al., 2006; Pradhan et al., 1996; Li et al., 2005; Cao et al., 2001; Men et al., 2006; Alvisi et al., 1998; Taesoon et al., 2003, 2002). However, how to minimize the failure-free overhead while ensuring consistent recoverability requires further investigation.