At the heart of Grid technology is the concept of sharing resources, including computers, storage and networks. Grid currently appears to be the most suitable technology to support this type of future development. Web Services are the key technology in Grid infrastructure. This chapter presents a case study of a Web Services design and implementation that allows medical data in differing formats to be stored in a standardised form, and that exposes algorithms from existing applications that manipulate these data sets as online service objects. The aim is to explain the key concerns in service design and development using a real-world application as the example. By reading this chapter, the reader should gain an overall understanding of how a service-oriented Grid application can be designed and implemented.
Bioinformatics experiments related to in silico modelling create huge sets of data at different physical scales, frequently from different sources. Algorithms to process these data will often have been developed by researchers in different institutions; some of them will be newly created but others will have been in long-term use. Models are becoming more complex, often involving teams of researchers working at different locations, each possibly specialising in only one aspect of the overall problem. As a result, demand is growing both for resource sharing and for high-performance computing, which is often available only at a distant site.
It remains a huge challenge to share and manage the distributed data, algorithms and computational resources and provide a suitable environment within which users can perform their tasks. Unless users can work in this environment in a familiar way that requires little change from their previous practice, there is likely to be resistance to the change and poor user take-up.
To provide support for users’ activities, it can be valuable to build a digital library infrastructure that allows clinicians and researchers not only to preserve, trace and share data resources, but also to collaborate at the data-processing level.
This chapter describes the implementation of the digital library as a service-oriented Grid application, based on work in a project funded by the European Commission – LHDL: the Living Human Digital Library. We shall briefly introduce Grid technology including the data grid middleware used in the project, then describe Service-Oriented Architecture (SOA) and Web Services technology. Finally, we describe the LHDL project, including its overall design and the main concerns for the implementation. We also explain the reasons for choosing the particular technology and tools, with the aim of sharing our experiences with the reader.
To summarise, the overall objectives of the chapter are to:
Provide an overview of relevant aspects of Grid technology.
Describe the features of a digital bioinformatics library based on Grid and Web Services.
Explain in detail a Web Service design and implementation that can be used for the Grid.
Share experiences related to building such an application.
In this section, we briefly describe Grid technology, its key features and its evolution. We then introduce a general Grid architecture, and summarise the resources that are provided by Grid. Finally, we survey Grid middleware and data Grid middleware.
As expressed by Foster and Kesselman (1998) and later refined in Foster et al. (2001), the Grid concept is encapsulated by ‘coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations’.
In the commercial world, IBM defines a Grid as ‘a standards-based application/resource-sharing architecture that makes it possible for heterogeneous systems and applications to share, compute and store resources transparently’ (Clabby, 2004).
According to the Expert Group Report (2003), the Grid evolved through several phases, beginning as a means of sharing computing resources. Data sharing, and the use of special devices such as scientific instruments and medical equipment, were added later. The combination of the first generation of Grid with Web technology led to generic Grid services.
The focus later shifted to knowledge sharing and collaboration between organisations, while maintaining the security requirements of each individual. The knowledge Grid facilitates data mining across the Internet. It requires techniques for abstracting heterogeneous data, creating metadata, and publishing, discovering and describing data in the Grid.