Barriers to adoption have prevented Grid Computing technologies from proliferating throughout e-Science and other domains. This chapter outlines many issues that are a consequence of the existing Grid middleware based approaches. The authors believe that a Grid operating system, or an operating system with built-in Grid computing capability, might address the drawbacks of the existing infrastructure, leading to a fault-tolerant, flexible and easy-to-use stack for rapid deployment of Grids. This chapter presents the motivation and issues that lead to a Grid operating system and outlines its design, implementation and evaluation details.
Despite substantial advances during the last decade, grid computing is still neither pervasive nor widely deployed. Gartner predicted that by 2006 grid computing would mature sufficiently to leave the science laboratories and enter the business world (Gartner Group, 2003). So far, however, there have been only a few success stories, since existing grid infrastructures support only a subset of business applications. To date the computing research community, and eScience projects in particular, have been the biggest beneficiaries of grid computing, whereas user-centric fields such as the medical sciences cannot easily implement existing grid architectures to support their applications. Mattmann (2007) outlines various requirements that existing grid middleware does not support, which increases the cost of adopting grids for medical science. It is a similar story for enterprise computing, where existing grid middleware is not scalable or lacks fault tolerance. Moreover, it is mostly platform dependent and insufficiently flexible to support enterprise applications. These adoption issues can be traced to technical hurdles which, in our opinion, originate from the current middleware approach to grid computing, as detailed below.
The middleware approach to grid computing was developed in science laboratories, where computing clusters distributed across the world were linked together to create grids for solving mainly compute and data intensive scientific problems. The role of grid middleware in this paradigm was to ‘glue’ the clusters together to achieve interoperability. Notable grid middleware included Globus (Foster and Kesselman, 1997), gLite (Laure, 2004) and UNICORE (Erwin and Snelling, 2001). This approach has, however, created obstacles to grid adoption in other fields, since the cluster-oriented grids of today are not well suited to user-centric computation, due in part to their complex operation and maintenance requirements. The main barriers (Ali, 2006) to the adoption of grid computing that result from the strong focus of current grid computing research on eScience are: support for only limited application types (mostly highly parallel and batch applications), potentially inflexible network topologies, the steep learning curve involved in configuring and maintaining a grid with grid middleware, the lack of fault tolerance in the infrastructure, and the inability of virtual organization (VO) management software to create the more fine-grained VOs that some applications require. All of these limitations make grid computing in its present incarnation unsuitable for the common user with little computing expertise, and expensive for existing users to maintain.
Key Terms in this Chapter
Hypervisor: The software which enables virtualization in a system. Mostly denotes software which enables platform virtualization.
Discovery Service: A service in Grid computing which is responsible for the discovery of distributed resources.
Super Peer: A model in peer to peer systems in which a collection of independent peers interacts with the larger system through a single centralized node that represents them. This node is termed the super peer.
Virtualization: A term that broadly refers to the abstraction of resources. Resources may include applications, platforms and systems.
Peer to Peer: A computing model in distributed systems where constituent nodes interact with each other without centralized mechanisms.
Platform Virtualization: A virtualization model which creates a logical abstraction of a hardware platform. This logical abstraction is typically denoted as a “virtual machine”, which is capable of simulating the capabilities of the platform concerned.
Grid Computing: A model of distributed computing based on the dynamic sharing of resources between participants, organisations and companies with the aim of combining these resources and carrying out intensive computing applications or the processing of vast amounts of data.