Dynamic Maintenance in ChinaGrid Support Platform

Hai Jin (Huazhong University of Science and Technology, China), Li Qi (Huazhong University of Science and Technology, China), Jie Dai (Huazhong University of Science and Technology, China) and Yaqin Luo (Huazhong University of Science and Technology, China)
Copyright: © 2011 | Pages: 8
DOI: 10.4018/978-1-60960-587-2.ch418

Abstract

A grid system is usually composed of thousands of nodes that are broadly distributed across different virtual organizations. Because of the geographical boundaries among these organizations, system administrators are under great pressure to coordinate their work whenever the grid system undergoes maintenance. Furthermore, the runtime dynamics of service state add to the complexity of maintenance tasks. Consequently, building an efficient and reliable maintenance model has become an urgent challenge in ensuring the correctness and consistency of grid nodes. In our experiment with ChinaGrid, a Dynamic Maintenance mechanism has been adopted in the fundamental grid middleware, the ChinaGrid Support Platform. By addressing the above problems in terms of system infrastructure, service dependency, and service consistency, the mechanism can improve the availability of the system even when the scope of maintenance extends to a wider region.

Introduction

Dynamic maintenance of large-scale resources in a grid environment is a major challenge, owing to the complexity of grid services and the demanding requirements of grid users. Inappropriate maintenance processes can lead to unpredictable failures over a wide area. Because computing and data resources are geographically distributed across different administrative regions, a reliable maintenance mechanism is urgently needed to coordinate different hosts and to ensure that maintenance tasks are carried out efficiently.

For grid administrators, maintenance runs through the whole lifecycle of service components. As shown in Figure 1, Jin and Qi (2007) defined the lifecycle of each service component in a grid as: released, deployed, initialized, activated, and destroyed. Corresponding to these stages, the maintenance tasks include publish, deploy, undeploy, redeploy, configure, activate, and deactivate. In particular, these tasks must cope with the challenges of distribution in a grid environment.

Figure 1. Lifecycle of service component
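
The lifecycle stages and maintenance tasks listed above can be read as a small state machine. The following Python sketch is only an illustration: the stage and task names come from the text, but the transition table, the `ServiceComponent` class, and the choice to model "publish" as creating a component in the released stage are assumptions, since the exact arrows of Figure 1 are not reproduced here.

```python
from enum import Enum


class State(Enum):
    """Lifecycle stages of a service component, as named in the text."""
    RELEASED = "released"
    DEPLOYED = "deployed"
    INITIALIZED = "initialized"
    ACTIVATED = "activated"
    DESTROYED = "destroyed"


# ASSUMPTION: this (state, task) -> next-state table is a plausible reading
# of the lifecycle, not the mapping defined by Jin and Qi (2007).
TRANSITIONS = {
    (State.RELEASED, "deploy"): State.DEPLOYED,
    (State.DEPLOYED, "configure"): State.INITIALIZED,
    (State.DEPLOYED, "redeploy"): State.DEPLOYED,
    (State.DEPLOYED, "undeploy"): State.DESTROYED,
    (State.INITIALIZED, "activate"): State.ACTIVATED,
    (State.INITIALIZED, "redeploy"): State.DEPLOYED,
    (State.INITIALIZED, "undeploy"): State.DESTROYED,
    (State.ACTIVATED, "deactivate"): State.INITIALIZED,
}


class ServiceComponent:
    """Hypothetical component that tracks its own lifecycle stage."""

    def __init__(self, name):
        # ASSUMPTION: "publish" is modeled as creating the component,
        # which therefore starts in the released stage.
        self.name = name
        self.state = State.RELEASED

    def apply(self, task):
        """Apply a maintenance task, rejecting it if the stage forbids it."""
        key = (self.state, task)
        if key not in TRANSITIONS:
            raise ValueError(
                f"task {task!r} is not allowed in state {self.state.value!r}"
            )
        self.state = TRANSITIONS[key]
        return self.state
```

Under these assumptions, an activated component has to be deactivated before it can be undeployed, which is one simple way to keep a running service from being removed in the middle of a request.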

A number of earlier investigations have addressed providing and standardizing maintenance for distributed resources. The Configuration, Description, Deployment and Lifecycle Management (CDDLM) specification, proposed by the Open Grid Forum (2006), aims to standardize distributed software deployment and configuration within a validated lifecycle. Another deployment infrastructure specification, the Installable Unit Deployment Descriptor (IUDD), released by W3C (2004), also provides a solution for supporting dynamic maintenance in a run-time execution environment. Web Services Distributed Management (2006), proposed by the Organization for the Advancement of Structured Information Standards (OASIS), discusses how the management of any resource can be accessed via web services protocols, and how web services resources themselves can be managed through the same mechanism. Talwar and Milojicic (2005) discussed approaches to service deployment and defined Quality of Manageability to measure the quality and efficiency of maintenance for service components.

Today’s domain users demand maintenance that does not require shutting down the system, but the existing specifications and solutions cannot efficiently reduce the downtime caused by maintenance. Therefore, the performance and availability of grid services during maintenance need further attention in work on resource maintenance.

At the infrastructure level, researchers believe that dynamic deployment in the grid container can achieve higher availability. Weissman (2005) proposed an architecture based on Apache Tomcat’s dynamic deployment functionality, which allows services to be updated and reconfigured without taking down the whole site. Smith and Friese (2005) introduced a similar solution to support dynamic deployment. Liu and Lewis (2005) designed an intermediate language, X#, to support dynamic deployment across heterogeneous implementations of grid containers.

Above the infrastructure, maintenance logic focuses on the consistency and complexity of grids. Shankar and Talwar (2006) devised a specification-enhanced Event-Condition-Action (ECA) model, called Event-Condition-Precondition-Action-Postcondition (ECPAP), for designing adaptive maintenance rules. This policy-based solution can adapt to specific emergencies in a grid (e.g., file transfer errors or maintenance failures).
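
To make the five parts of an ECPAP rule concrete, here is a minimal Python sketch. It is not the notation or engine of Shankar and Talwar (2006): the `EcpapRule` class, the evaluation order, the returned status strings, and the file-transfer retry rule are all hypothetical, chosen only to show how a precondition can guard an action and a postcondition can verify it afterwards.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict

SystemState = Dict[str, Any]


@dataclass
class EcpapRule:
    """Hypothetical Event-Condition-Precondition-Action-Postcondition rule."""
    event: str
    condition: Callable[[SystemState], bool]
    precondition: Callable[[SystemState], bool]
    action: Callable[[SystemState], None]
    postcondition: Callable[[SystemState], bool]

    def handle(self, event: str, state: SystemState) -> str:
        # ASSUMPTION: the rule fires only for its own event and only
        # when its condition holds; otherwise it is skipped.
        if event != self.event or not self.condition(state):
            return "skipped"
        # The precondition guards the action: it is not safe to run now.
        if not self.precondition(state):
            return "blocked"
        self.action(state)
        # The postcondition checks that the action had the intended effect.
        return "ok" if self.postcondition(state) else "failed"


def retry_transfer(state: SystemState) -> None:
    """Stand-in for a real re-transfer; it only updates the state dict."""
    state["retries"] += 1
    state["transferred"] = True


# Hypothetical rule: retry a failed file transfer at most three times.
retry_rule = EcpapRule(
    event="transfer_error",
    condition=lambda s: not s["transferred"],
    precondition=lambda s: s["retries"] < 3,
    action=retry_transfer,
    postcondition=lambda s: s["transferred"],
)
```

In this sketch, a call such as `retry_rule.handle("transfer_error", state)` reports whether the rule was skipped, blocked by its precondition, or executed, so that a failed postcondition can be surfaced to the administrator rather than silently ignored.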
