A Storage Grid is a new model for deploying and managing heterogeneous, dynamic, large-scale, and geographically distributed storage resources. This chapter discusses the challenges and solutions involved in building a Service Oriented Storage (SOS) Grid. By wrapping diverse storage resources into atomic Grid services and federating multiple atomic Grid services into composite services, the SOS Grid tackles heterogeneity and interoperability. Peer-to-peer philosophy and techniques are employed in the SOS Grid to eliminate the system bottleneck and single point of failure of the traditional centralized or hierarchical Grid architecture, while providing dynamicity and scalability. Because Grid services are not designed for critical and real-time applications, the SOS Grid uses Grid services to glue together the distributed and heterogeneous storage resources, while using binary code to transfer the data itself. The proposed methods strike a good balance among the heterogeneity, interoperability, scalability, and performance of the SOS Grid.
According to an IDC report (IDC white paper, 2007), 161 exabytes of digital information were created and copied in 2006, and this growth will continue exponentially. The amount of information in 2010 will surge more than sixfold to 988 exabytes, which amounts to a compound annual growth rate of 57%. About 70% of the digital information will be generated by individuals over the next three years. The data will be stored in a large number of data centers distributed across the Internet. These data centers may have completely heterogeneous operating systems, computer architectures, and IT infrastructures.
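The growth figures quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, assuming the two data points are 161 exabytes for 2006 and 988 exabytes for 2010 (four years of compounding); the function name is illustrative and not part of the IDC report.

```python
def compound_annual_growth_rate(start: float, end: float, years: int) -> float:
    """Return the constant yearly growth rate that turns `start` into `end`."""
    return (end / start) ** (1.0 / years) - 1.0


# Figures quoted from the IDC white paper (2007), in exabytes.
created_2006 = 161.0
forecast_2010 = 988.0

# Overall growth factor over the period ("more than sixfold").
growth_factor = forecast_2010 / created_2006

# Yearly rate implied by four years of compounding (2006 to 2010).
cagr = compound_annual_growth_rate(created_2006, forecast_2010, years=4)

print(f"growth factor: {growth_factor:.2f}x, CAGR: {cagr:.1%}")
```

Under these assumptions, the implied yearly rate lands between 57% and 58%, consistent with the figure cited in the report.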
The explosive growth of data has been identified as the key driver of escalating storage requirements. Two major technologies have shaped the evolution of storage systems. The first is parallel processing, such as redundant arrays of inexpensive disks (RAID) (Gibson et al., 1988). The second is the influence of network technology on storage system architecture. Network-based storage systems such as network attached storage (NAS) and storage area networks (SAN) (Gibson and Meter, 2000; Morris and Truskowski, 2003) offer a robust and easy method to control and access large amounts of storage resources. However, the ever-increasing amount of data generated worldwide places significant strain on the storage systems we have today (Min et al., 2005). Storing and managing this data requires more sophisticated techniques and more flexible and reliable storage systems (e.g., systems providing petabytes or even exabytes of storage capacity and aggregate bandwidth over 100 GB/s). NAS and SAN cannot meet these requirements. It is a big challenge to design an autonomous, dynamic, large-scale, and scalable storage system that consolidates distributed and heterogeneous storage resources to satisfy both the bandwidth and the storage capacity requirements.
A Grid provides flexible, secure, coordinated resource sharing among dynamic collections of individuals, institutions, and resources (Foster et al., 2001). The objective is to virtualize resources, including computers, networks, and instruments, and to allow users and applications to access them in a transparent manner. A Grid environment may consist of hundreds or even thousands of geographically distributed and heterogeneous resources to match the requirements imposed by all kinds of Grid applications. Grid computing has emerged as an important new field, distinguished from conventional distributed computing by its focus on large-scale resource sharing and its high-performance orientation. Table 1 compares the characteristics of Grid and data storage.
Table 1. Characteristics comparison of Grid and data storage
|Grid Characteristics|Data Storage Characteristics|
|Large scale or global distribution|Worldwide|
|Dynamic coordination|Ever-increasing|
|Collaborative virtual organizations|Require interoperability|
|User transparent|Involve large quantity of heterogeneous storage resources|
|Secured communication|High security, privacy, and reliability|
Key Terms in this Chapter
Storage Grid: A storage grid is a virtual organization that federates geographically distributed and heterogeneous storage systems into a logical community with only minimal administrative requirements, while providing scalability and interoperability.
Web Service: A web service is defined by the W3C as a software system designed to support interoperable machine-to-machine interaction over a network. A web service provides interfaces described by a machine-processable WSDL document, and other systems can interact with the service using SOAP messages.
Storage Service Composition: Storage service composition means combining available atomic storage services into a composite service to meet the data requirements of complex applications.
Grid Service: A grid service is a stateful web service with an associated lifetime which provides a set of interfaces through which grid users may interact.
Storage Interoperability: Storage interoperability provides seamless resource consolidation and cooperation among a large number of heterogeneous storage resources by using standard interfaces.
Grid Scheduler: A grid scheduler is in charge of scheduling jobs or applications when resources are distributed across a large scale or across multiple administrative domains.
Storage Management: Storage management is a virtualization method employed to maximize the overall resource utilization of storage systems by intelligently allocating the available storage resources among the applications above them, thus guaranteeing on-demand storage requirements.
Storage Scalability: Storage scalability is the ability to provide satisfactory capabilities, including storage capacity, performance, and fault tolerance, when a storage system is increased in size to meet data requirements.
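The relationship between atomic storage services and storage service composition defined above can be illustrated with a small sketch. It is not the chapter's implementation: all class and method names are hypothetical, the backend is an in-memory stand-in for a real NAS or SAN wrapper, and the placement rule (store on the member with the most free space) is an assumption chosen for brevity. Payloads are handled as raw bytes, loosely mirroring the idea of keeping bulk data in binary form.

```python
from abc import ABC, abstractmethod


class AtomicStorageService(ABC):
    """Hypothetical uniform interface wrapping one storage resource."""

    @abstractmethod
    def free_bytes(self) -> int: ...

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class InMemoryStorageService(AtomicStorageService):
    """Stand-in backend; a real wrapper would front a NAS, SAN, or disk array."""

    def __init__(self, name: str, capacity_bytes: int) -> None:
        self.name = name
        self.capacity_bytes = capacity_bytes
        self._objects = {}

    def free_bytes(self) -> int:
        used = sum(len(value) for value in self._objects.values())
        return self.capacity_bytes - used

    def put(self, key: str, data: bytes) -> None:
        # An overwrite frees the old object's space before the new one is stored.
        reclaimed = len(self._objects.get(key, b""))
        if len(data) > self.free_bytes() + reclaimed:
            raise IOError(f"{self.name}: not enough free space for {key!r}")
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


class CompositeStorageService(AtomicStorageService):
    """Federates several services behind the same interface as a single one."""

    def __init__(self, members) -> None:
        self._members = list(members)
        self._placement = {}  # key -> member service holding that key

    def free_bytes(self) -> int:
        return sum(member.free_bytes() for member in self._members)

    def put(self, key: str, data: bytes) -> None:
        # Reuse the existing location on overwrite; otherwise pick the
        # member with the most free space (an assumed placement policy).
        target = self._placement.get(key)
        if target is None:
            target = max(self._members, key=lambda member: member.free_bytes())
        target.put(key, data)
        self._placement[key] = target

    def get(self, key: str) -> bytes:
        return self._placement[key].get(key)


if __name__ == "__main__":
    grid = CompositeStorageService([
        InMemoryStorageService("nas-a", 10),
        InMemoryStorageService("san-b", 20),
    ])
    grid.put("report", b"hello")
    print(grid.get("report"), grid.free_bytes())
```

Because the composite exposes the same interface as its members, a composite can itself be a member of a larger composite, which is one way to read the federation of atomic services into composite services.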