Information services play a crucial role in grid computing environments: the state information of a grid system can be used to discover resources and services that meet user requirements and to help tune the performance of the grid. This paper uses Colored Petri Nets (CPNs) to model PIndex, a grouped peer-to-peer network for scalable grid information services. Based on the CPN model, a simulator is implemented for PIndex simulation and performance evaluation. The correctness of the simulator is verified by comparing the results computed from the CPN model with those generated by the PIndex simulator.
Introduction
The past few years have witnessed rapid development of grid computing infrastructures and applications (Li & Baker, 2005; Wang, Helian, Wu, Deng, Khare, & Thompson, 2007; Wang, Wu, Helian, Parker, et al., 2007; Wang, Wu, Helian, & Xu, 2007). Information services play a crucial role in grid environments in that they facilitate the discovery of resources and services (Czajkowski, Kesselman, Fitzgerald, & Foster, 2001). Information services periodically collect data on the available resources in a grid environment, including hardware and software. Other elements of the grid can then use these data to keep the grid running smoothly. For example, job schedulers use resource information to make adaptive decisions when allocating resources to jobs, in order to achieve goals such as a minimum makespan in job execution (Berman et al., 2003).
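As a minimal sketch of how a scheduler might consume such resource information, the following Python example assumes a hypothetical snapshot of per-node speeds, as an information service might report them, and applies a simple greedy heuristic that places each job on the node expected to finish it earliest. The `NodeInfo` record, its fields, and the `schedule` function are illustrative assumptions; this is not the scheduling method of Berman et al. (2003), and the heuristic does not guarantee a minimum makespan.

```python
from dataclasses import dataclass


@dataclass
class NodeInfo:
    """Hypothetical record an information service might report for one node."""
    name: str
    speed: float              # work units per second (assumed metric)
    busy_until: float = 0.0   # time at which already-assigned work finishes


def schedule(jobs, nodes):
    """Greedy heuristic: take jobs largest first and place each one on the
    node that would finish it earliest.

    jobs  -- dict mapping job id to amount of work (work units)
    nodes -- list of NodeInfo; busy_until is updated in place

    Returns (assignment, makespan), where assignment maps job id to node name.
    """
    assignment = {}
    for job_id, work in sorted(jobs.items(), key=lambda kv: -kv[1]):
        # Estimated finish time on a node = current backlog + run time there.
        best = min(nodes, key=lambda n: n.busy_until + work / n.speed)
        best.busy_until += work / best.speed
        assignment[job_id] = best.name
    makespan = max(n.busy_until for n in nodes)
    return assignment, makespan


if __name__ == "__main__":
    snapshot = [NodeInfo("a", speed=2.0), NodeInfo("b", speed=1.0)]
    plan, makespan = schedule({"j1": 4.0, "j2": 2.0, "j3": 2.0}, snapshot)
    print(plan, makespan)
```

The quality of such a decision depends on how fresh the reported node state is, which is why timely information services matter to schedulers.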
Grid middleware technologies facilitate information services. For example, the current Globus Toolkit (http://www.cern.ch/glite) also facilitates resource registration and discovery. Grids differ from traditional distributed systems in the following respects:
- The size of a grid is usually large in terms of the number of computing nodes involved.
- Resources in a grid are usually heterogeneous with various computing capabilities and services.
- A grid is dynamic in that computing nodes may join or leave a grid freely. In addition, some resources such as the CPU load of a grid node may change frequently.
The aforementioned characteristics of grids pose a number of challenges to existing information services, notably MDS4 and R-GMA. The hierarchical structure and centralized management of MDS4 introduce an inherent delay that potentially limits its scalability in resource registration: it might take a long time for resource information to be updated from the leaf nodes to the root index service node. Cai, Frank, Chen, and Szekely (2004) point out that the scheme for partitioning resource information on index servers is typically predefined and cannot adapt to dynamic changes in virtual organizations (VOs). MDS4 also lacks a mechanism to deal with failures of index servers, which may break the information service network into isolated subnets. R-GMA contains a centralized registry (Groep, Templon, & Loomis, 2006) and performs poorly even when dealing with only 100 consumer nodes (Zhang, Freschl, & Schopf, 2007).
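To illustrate the delay argument for hierarchical index services, the following Python sketch uses a deliberately simplified model that is an assumption of this example, not a description of how MDS4 actually works: index nodes form a balanced tree, and each level forwards its information to its parent once per fixed push period. Under that assumption, an update from a leaf may wait up to one full period at every level before reaching the root. The function names, fan-out, and period are hypothetical.

```python
def hierarchy_depth(num_leaves: int, fan_out: int) -> int:
    """Number of levels between the leaves and the root of a balanced tree
    in which every index node aggregates up to `fan_out` children.

    Integer arithmetic is used to avoid floating-point errors in logarithms.
    """
    if num_leaves < 1:
        raise ValueError("num_leaves must be at least 1")
    if fan_out < 2:
        raise ValueError("fan_out must be at least 2")
    depth, capacity = 0, 1
    while capacity < num_leaves:
        capacity *= fan_out
        depth += 1
    return depth


def worst_case_root_delay(num_leaves: int, fan_out: int, push_period_s: float) -> float:
    """Upper bound on the time for a leaf update to reach the root, assuming
    (hypothetically) that each level waits up to one push period before
    forwarding and that transfer time itself is negligible."""
    return hierarchy_depth(num_leaves, fan_out) * push_period_s


if __name__ == "__main__":
    # Example values are illustrative only, not measurements.
    print(worst_case_root_delay(num_leaves=1000, fan_out=10, push_period_s=30.0))
```

In this toy model the bound grows with the depth of the hierarchy, so adding nodes either deepens the tree (longer delay) or widens it (more load per index node), which is the scalability tension described above.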