Information Dispersal Algorithms and Their Applications in Cloud Computing

Makhan Singh (Panjab University, India) and Sarbjeet Singh (Panjab University, India)
DOI: 10.4018/978-1-5225-3029-9.ch004

Abstract

Information dispersal is a technique in which pieces of data are distributed among various nodes such that the data can be reconstituted from any threshold number of these pieces. An information dispersal algorithm disperses a file F among n nodes such that any m pieces are sufficient to reconstruct the whole file. The size of each piece is |F|/m. We must also ensure that complete knowledge of any m-1 pieces is insufficient to reconstruct F. Many approaches for accomplishing this have been proposed in the literature. This chapter discusses and compares some of them.

Introduction

Storing information and transmitting it over a network raise many problems, such as security issues, availability problems, compromised reliability of the whole network, and confidentiality problems. Encrypting the data removes some of these problems, but most of them remain. Another remedy is to replicate the data or file at multiple locations, but this gives rise to new problems such as storage and network overload: creating multiple copies of the data only increases the overhead of the system. The Information Dispersal Scheme was proposed for this reason. It distributes the file/data F as n pieces such that the file can be reconstructed from any m of them. In this manner, it offers an efficient solution to most of the above-mentioned problems with far less overhead than full replication. Information dispersal is used mostly in two settings: the storage of data on multiple systems and the transmission of data over a computer network.
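The storage argument can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes (the chapter does not state this) that the replication baseline must survive the same number of lost nodes as dispersal, namely n - m, which takes n - m + 1 full copies. Each dispersed piece is taken to be one m-th of the file, as in the abstract.

```python
from fractions import Fraction


def dispersal_overhead(n: int, m: int) -> Fraction:
    """Total stored size, as a multiple of |F|, for n pieces of size |F|/m."""
    if not 1 <= m <= n:
        raise ValueError("need 1 <= m <= n")
    return Fraction(n, m)


def replication_overhead(n: int, m: int) -> int:
    """Full copies needed to survive the same n - m lost nodes (assumed baseline)."""
    if not 1 <= m <= n:
        raise ValueError("need 1 <= m <= n")
    return n - m + 1


# Hypothetical (n, m) choices, picked only to show the trend.
for n, m in [(5, 3), (10, 8), (14, 10)]:
    d = dispersal_overhead(n, m)
    r = replication_overhead(n, m)
    print(f"n={n}, m={m}: dispersal stores {float(d):.2f} x |F|, "
          f"replication stores {r} x |F|")
```

Under this assumption, dispersal never stores more than replication does for the same number of tolerated losses, and the gap widens as n/m approaches 1.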

The general idea of how Information Dispersal Algorithms work is shown in the diagrams of Figure 1 and Figure 2.

Figure 1. Distribution of file F using information dispersal algorithms

The data/file F is split into n parts that are distributed among various systems. Each part has size |F|/m, so the total length of all the parts is (n/m)·|F|. To make information dispersal space efficient, we can choose m and n such that the ratio n/m is as close to 1 as possible. Reconstruction of the file F mirrors the distribution, but it needs only any m available parts. We must also ensure that no m-1 parts of the file F give away any information about it, so that the security of the file is not compromised even if a malicious user obtains m-1 parts.
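The split and recombine steps described above can be sketched in a few lines. This is a minimal illustration in the style of Rabin's classic dispersal scheme, not the method of any particular author reviewed in the chapter. The prime 257, the Vandermonde rows, the zero padding, and the names `disperse` and `reconstruct` are all assumptions made for this example. It demonstrates only that any m of the n pieces rebuild the file and that each piece holds one m-th as many symbols as the file. It should not be read as providing the requirement that m-1 pieces reveal nothing, which a plain linear code like this does not guarantee by itself.

```python
P = 257  # smallest prime above 255, so every byte value is a field element


def _row(x: int, m: int) -> list[int]:
    """Vandermonde row (1, x, x^2, ..., x^(m-1)) mod P."""
    return [pow(x, j, P) for j in range(m)]


def disperse(data: bytes, n: int, m: int) -> list[tuple[int, list[int]]]:
    """Split data into n pieces; each piece is (node id, list of symbols mod P)."""
    if not 1 <= m <= n < P:
        raise ValueError("need 1 <= m <= n < 257")
    padded = data + bytes(-len(data) % m)  # zero-pad to a multiple of m
    groups = [padded[k:k + m] for k in range(0, len(padded), m)]
    pieces = []
    for x in range(1, n + 1):  # distinct node ids give independent rows
        row = _row(x, m)
        values = [sum(a * b for a, b in zip(row, g)) % P for g in groups]
        pieces.append((x, values))
    return pieces


def _invert(matrix: list[list[int]]) -> list[list[int]]:
    """Invert a square matrix mod P by Gauss-Jordan elimination."""
    m = len(matrix)
    aug = [list(r) + [int(i == j) for j in range(m)] for i, r in enumerate(matrix)]
    for col in range(m):
        pivot = next(r for r in range(col, m) if aug[r][col] % P)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, P)  # modular inverse (Python 3.8+)
        aug[col] = [v * inv % P for v in aug[col]]
        for r in range(m):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(v - f * w) % P for v, w in zip(aug[r], aug[col])]
    return [r[m:] for r in aug]


def reconstruct(pieces: list[tuple[int, list[int]]], m: int, length: int) -> bytes:
    """Rebuild the original bytes from any m pieces; length strips the padding."""
    chosen = pieces[:m]
    if len(chosen) < m:
        raise ValueError("need at least m pieces")
    inverse = _invert([_row(x, m) for x, _ in chosen])
    out = bytearray()
    for k in range(len(chosen[0][1])):
        column = [values[k] for _, values in chosen]
        out.extend(sum(a * c for a, c in zip(r, column)) % P for r in inverse)
    return bytes(out[:length])


data = b"information dispersal"
pieces = disperse(data, n=5, m=3)
subset = [pieces[4], pieces[1], pieces[2]]  # any 3 of the 5 pieces, in any order
assert reconstruct(subset, m=3, length=len(data)) == data
```

Working modulo 257 keeps the arithmetic simple but means a symbol can take the value 256, so a stored piece needs slightly more than one byte per symbol. Practical schemes typically work in a field of size 256 to avoid this.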

The information dispersal scheme can be described not only in terms of dispersing and recombining data but also in terms of secret key sharing, the transmission problem, the storage problem, and the parallel computing problem. This versatility is what makes the scheme so popular: it can be used in a variety of contexts in many different fields.

The Information Dispersal scheme offers several advantages over the conventional method of replicating the file: it is space efficient and secure, and it provides higher reliability and availability. The reconstruction process is shown in Figure 2.

Figure 2. Recombination of file F using information dispersal algorithms

These diagrams show the basic procedure for splitting and reconstituting the file F. Various authors have used many different techniques to actually split and recombine the file, which we discuss and compare in this chapter.

Background

Storing information and transmitting it over a network raise many problems, such as security issues, availability problems, compromised reliability of the whole network, and confidentiality problems. Various mechanisms that deal with these issues have been proposed in the literature. This chapter reviews information dispersal schemes proposed by several researchers that can address these issues.
