Peer-to-Peer (P2P) storage systems leverage the combined storage capacity of a network of storage devices (peers) contributed typically by autonomous end-users as a common pool of storage space to store and share content. A major challenge in such a system comprising of autonomous participants is to guarantee quality of service in terms of persistence and availability of the stored content. This chapter focuses on the different possible design choices for maintaining redundancy in P2P storage systems, including algorithm details of maintenance mechanisms, analytical models to understand system’s dynamics, empirical results from simulation experiments as well as experiences from prototype deployments.
TopIntroduction
For diverse reasons including fault-tolerance, load-balance, response time, or geographic distribution of end users, distributed data stores have been around for a long while. Historically this included distributed databases, distributed file systems and Usenet servers among others. Peer-to-peer (P2P) storage is a paradigm which extends those concepts and aims to leverage the combined storage capacity of a network of storage devices (peers) contributed typically by autonomous end-users as a common pool of storage space to store and share content, and is designed to provide persistence and availability of the stored content despite unreliability of the individual autonomous peers in a decentralized environment.
There are many aspects which determine the design space of P2P storage systems. A lot of these design decisions are pertaining to the fact that to achieve fault-tolerance and load-balancing, redundancy of stored content is necessary. As a consequence – choice of the kind of redundancy to be used (e.g., replication, encoded or mixed), the placement of these redundant objects, and the maintenance of redundancy in presence of churn determine the design axis of p2p storage systems, as shown in Figure 1.
Figure 1. The design space of P2P storage systems
The primary focus of this chapter is along one dimension of this design space - studying different redundancy maintenance strategies, analytical tools to understand the interplay of maintenance and churn, and empirical comparative study of various maintenance strategies.
We first provide background information, including historical context of P2P storage systems and their applications, as well as motivate the problem studied in this chapter - that of maintaining redundancy in P2P storage systems under churn. This will include characterization of churn itself.
We elaborate the various redundancy maintenance strategies (including high level algorithmic details) of existing maintenance mechanisms. The effectiveness of the maintenance mechanisms, and the dynamics of maintenance under churn have been studied both by constructing analytical models, as well as by simulation experiments.
We outline two analytical models to study storage systems under churn. Markov models are used to study the dynamic equilibrium in P2P storage systems under the hypothetical assumption of how the system would behave if the level of churn is constant. Stochastic differential equations are used to study the system’s evolution as the churn level keeps varying.
We then summarize results from simulation based experiments as well as experiences from prototype deployments to study (in)effectiveness and (in)efficiency of the different maintenance mechanisms and garbage collection mechanisms.
Materials presented in this chapter include both our own works and a survey of works done by other researchers, to provide a well-rounded overview of current literature.
Among other implications of the design choices are issues like – “How are the distributed objects found?” “How is the consistency of mutable content stored redundantly dealt with?” Such topics are beyond the scope of this chapter. For the first issue – search of objects, readers should refer to structured overlays and DHTs which is one possible mechanism to locate objects in a decentralized manner.
TopBackground
Early major attempts of building distributed storage systems were started in the middle of 80s. Most traditional distributed storage systems (e.g., NFS, AFS, CODA, etc) are built on top of reliable dedicated workstations with good network connections, and are implemented in a client-server manner to perform replication, caching and maintenance. It is expensive to build such a system for small companies. The client-server architecture also makes those systems less scalable.
In recent years, Internet growth has inspired a new class of storage systems that adopt the peer-to-peer paradigm. By harnessing the storage space at the edge of the Internet, P2P storage systems can provide inexpensive, scalable, fault-tolerant, and highly-available storage without centralized servers. Some of recent P2P storage systems include OceanStore (Kubiatowicz et al., 2000), DHash (Dabek et al., 2004), Total Recall (Bhagwan et al., 2004), etc. In spite of its superior scalability and trivial storage cost compared to traditional systems, P2P storage systems pose many new challenges due to the intrinsic characteristics of autonomous peers.