Scheduling

Scheduling

Valentin Cristea (Politehnica University of Bucharest, Romania), Ciprian Dobre (Politehnica University of Bucharest, Romania), Corina Stratan (Politehnica University of Bucharest, Romania) and Florin Pop (Politehnica University of Bucharest, Romania)
DOI: 10.4018/978-1-61520-703-9.ch005

Abstract

This chapter presents the scheduling problem in large scale distributed systems. Most parts of the chapter are devoted to discussion of scheduling algorithms and models. The main challenges of scheduling problem are approached here. The implementation issues are also covered. The chapter has three parts. The first part covers basics like scheduling models, scheduling algorithms for independent tasks and DAG scheduling Algorithms for tasks with dependencies. The first part of the chapter presents a classification of scheduling problems, methods that are relevant for the solution procedures, and computational complexity. The scheduling models are presented based on systems architecture described in Resource Management chapter. This firs part also provides a critical analysis of most important algorithms from different points of view, such as static versus dynamic policies, objective functions, applications models, adaptation, QoS constraints and strategies dealing with dynamic behavior of resources. The second part covers new scheduling mechanism like resources co-allocation and advance reservation. Multi-criteria optimization mechanisms for users and systems constrain (e.g. load-balancing, minimization of execution time) are described and analyzed in this chapter. This part uses algorithm and methods to highlight the importance of these topics. The dynamic scheduling is also the subject of this part. It is also presented the implementation issues for scheduler tools. Since it is not possible to cover the whole area of scheduling in one chapter, some restrictions are imposed. Firstly, the chapter presents only Scheduling for Large Scale Distributed Systems (LSDS), without single system scheduling. Secondly, some interesting topics of fault tolerance (re-scheduling) are not analyzed in this chapter.
Chapter Preview
Top

Introduction

This chapter presents the scheduling problem in large scale distributed systems. Most parts of the chapter are devoted to discussion of scheduling algorithms and models. The main challenges of scheduling problem are approached here. The implementation issues are also covered.

The chapter has three parts. The first part covers basics like scheduling models, scheduling algorithms for independent tasks and DAG scheduling Algorithms for tasks with dependencies. The first part of the chapter presents a classification of scheduling problems, methods that are relevant for the solution procedures, and computational complexity. The scheduling models are presented based on systems architecture described in Resource Management chapter. This firs part also provides a critical analysis of most important algorithms from different points of view, such as static versus dynamic policies, objective functions, applications models, adaptation, QoS constraints and strategies dealing with dynamic behavior of resources.

The second part covers new scheduling mechanism like resources co-allocation and advance reservation. Multi-criteria optimization mechanisms for users and systems constrain (e.g. load-balancing, minimization of execution time) are described and analyzed in this chapter. This part uses algorithm and methods to highlight the importance of these topics. The dynamic scheduling is also the subject of this part. It is also presented the implementation issues for scheduler tools. Since it is not possible to cover the whole area of scheduling in one chapter, some restrictions are imposed. Firstly, the chapter presents only Scheduling for Large Scale Distributed Systems (LSDS), without single system scheduling. Secondly, some interesting topics of fault tolerance (re-scheduling) are not analyzed in this chapter.

The scheduling is presented using a higher level abstraction for the distributed systems by ignoring infrastructure components such as authentication, authorization, and access control. A very good definition for the distributed system that can be used in this chapter for understanding scheduling problem keys is given in (Baker et al., 2002): “A type of parallel and distributed system that enables the sharing, selection, and aggregation of geographically distributed autonomous and heterogeneous resources dynamically at runtime depending on their availability, capability, performance, cost, and users' quality-of-service requirements”.

More applications are turning to LSDS computing to meet their computational and data storage needs. Single sites are simply no longer efficient for meeting the resource needs of high-end applications, and using distributed resources can give the application many benefits. Effective LSDS computing is possible, however, only if the resources are scheduled well.

In scheduling context a task could to be defined, for an user, as anything that needs a resource, from a bandwidth request, to an application, to a set of applications (for example, a parameter sweep). In the same context, a resource could be anything that can be scheduled: a machine, processors, disk space and memory, a QoS network.

Scheduling is a key concept for multitasking and multiprocessing design in large scale distributed systems and a most important aspect in real-time system design. Scheduling is the process of assigning tasks on compute resources according with a task policy and ordering communication between tasks. This assignment is carried out by software known as a scheduler. The scheduling process is also known as the allocation process of computation and communication in time. (Ramin & Wieder, 2005).

In manufacturing, the scope of scheduling is to minimize the production time and costs (in many case the cost is represented by money), by offering a production facility what to make, when, with which staff, and on which resources (Blazewicz et al., 2001). Production scheduling aims to maximize the efficiency of the operation and reduce costs.

Complete Chapter List

Search this Book:
Reset