Introduction
The grid scheduling service, also known as super-scheduling (Schopf, 2003), is defined as “scheduling job across grid resources such as computational clusters, parallel supercomputers, desktop machines that belong to different administrative domains”. It is a crucial component for grid computing infrastructures because it determines the effectiveness and efficiency of a grid system by identifying, characterizing, discovering, selecting, and allocating the resources that are best suited for a particular job.
Grid scheduling is a critical but complex task. The heterogeneous and distributed nature of grid systems imposes additional constraints on scheduling services, such as lack of remote resource control, or incomplete overall knowledge of the grid system.
Besides the theoretical issues, the realities of grid scheduler design and implementation complicate matters further. Existing grid schedulers typically depend on, or are completely integrated into, a particular grid middleware. It is therefore a nontrivial task to migrate a grid scheduler from one middleware to another, to exchange messages between schedulers, or to delegate jobs between different types of scheduler. Grid schedulers built on different middlewares can thus be regarded as a set of heterogeneous grid schedulers.
The contribution of this paper is the design of a decentralized, modular, high-level grid scheduler named MaGate. The MaGate scheduler aims to improve the rate of successfully executed jobs submitted to the same grid community by interacting with other MaGate schedulers and delegating jobs among all participating nodes of the community. In other words, MaGate schedulers are designed to cooperate with each other, providing intelligent scheduling that serves the grid community as a whole rather than a single grid node.
To this end, the MaGate scheduler addresses several related issues: (i) an approach for discovering remote resources dynamically and efficiently; (ii) a community policy for deciding which jobs to delegate remotely and whether to accept incoming remote jobs; (iii) a platform-independent communication protocol to facilitate interaction between MaGate schedulers on heterogeneous nodes; (iv) a negotiation procedure to handle various job delegation scenarios flexibly, e.g., accepting, rejecting, or conditionally rejecting a delegation, and proxying or forwarding a delegated job.
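The delegation outcomes listed in item (iv) can be illustrated with a minimal sketch. This is not the MaGate implementation: the names `Reply`, `Job`, `DelegationReply`, and `decide`, the CPU-count acceptance rule, and the `max_hops` limit are all hypothetical and chosen only to show how a receiving scheduler might map an incoming remote job onto accept, reject, conditional reject, or forward.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Reply(Enum):
    """Possible answers to a delegation request (hypothetical names)."""
    ACCEPT = auto()
    REJECT = auto()
    CONDITIONAL_REJECT = auto()  # rejected as requested, but with a counter-offer
    FORWARD = auto()             # passed on to another scheduler


@dataclass
class Job:
    job_id: str
    cpus: int      # CPUs requested (assumed to be the only resource constraint)
    hops: int = 0  # how many times the job has already been forwarded


@dataclass
class DelegationReply:
    kind: Reply
    offered_cpus: Optional[int] = None  # set only for CONDITIONAL_REJECT
    next_node: Optional[str] = None     # set only for FORWARD


def decide(job: Job, free_cpus: int, neighbour: Optional[str],
           max_hops: int = 3) -> DelegationReply:
    """Toy community policy for an incoming remote job (an assumption)."""
    if job.cpus <= free_cpus:
        return DelegationReply(Reply.ACCEPT)
    if free_cpus > 0:
        # Cannot satisfy the full request; offer what is available instead.
        return DelegationReply(Reply.CONDITIONAL_REJECT, offered_cpus=free_cpus)
    if neighbour is not None and job.hops < max_hops:
        # No local capacity; hand the job to a known neighbour.
        return DelegationReply(Reply.FORWARD, next_node=neighbour)
    return DelegationReply(Reply.REJECT)
```

In this sketch the hop limit exists only to keep a job from being forwarded indefinitely; a real community policy would likely weigh further criteria such as queue length or deadlines.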
MaGate is being developed within the SmartGRID project (Huang, Brocco, Kuonen, Courant, & Hirsbrunner, 2008), which aims to improve the efficiency of existing grids through a modular, layered architecture: the Smart Resource Management Layer (SRML) supports grid scheduling, and the Smart Signaling Layer (SSL) provides resource discovery. Communication between the layers is mediated by the Datawarehouse Interface (DWI).
The SRML comprises a set of MaGates. Each MaGate is composed of loosely coupled modules that tackle several critical issues raised by grid scheduling, such as: