Workload Management Systems for the Cloud Environment

Eman A. Maghawry (Ain Shams University, Egypt), Rasha M. Ismail (Ain Shams University, Egypt), Nagwa. L. Badr (Ain Shams University, Egypt) and Mohamed F. Tolba (Ain Shams University, Egypt)
Copyright: © 2017 |Pages: 20
DOI: 10.4018/978-1-5225-2229-4.ch005
Abstract

Workload management is a performance management process in which an autonomic database management system in a cloud environment makes efficient use of its virtual resources. Workload management for concurrent queries is one of the challenging aspects of executing queries over the cloud. The core problem is to manage unpredictable overload with respect to varying resource capabilities and performance. This chapter proposes an efficient workload management system for controlling query execution over a cloud. The chapter presents an architecture that improves query response time: it handles users' queries, selects suitable resources for executing those queries, and manages the life cycle of the virtual resources by responding to any load that occurs on them. This is done by dynamically rebalancing the query distribution load across the resources in the cloud. The results show that applying this workload management system improves query response time by 68%.
Introduction

Modern business software and services demand high availability and scalability. This has resulted in an increasing demand for large-scale infrastructures, which tend to be highly complicated and expensive for the organizations requesting them. These higher resource and maintenance costs have given rise to a paradigm shift towards cloud computing, whose services are offered and maintained by various providers over the Internet. These service offerings range from software applications to virtualized platforms and infrastructures (Mittal, 2001). Cloud computing combines the parallel and distributed computing paradigms. This distributed computing paradigm is driven by economies of scale, in which a pool of virtualized and managed computing power, storage, platforms and services is delivered on demand to remote users over the Internet (Foster, Zhao, Raicu, & Lu, 2008). It helps realize the potential of large-scale, data-intensive computing by providing effective scaling of resources (Mian, Martin, Brown, & Zhang, 2001). A cloud computing environment is also widely employed in scientific, business and industrial applications, using an Internet-based delivery model to provide dynamic virtualized resources (Jeong & Park, 2012). It offers the vision of a virtually infinite pool of computing, storage and networking resources onto which applications can be scalably deployed (Hayes, 2008).

As data continues to grow, remote clients increasingly store their data in cloud storage environments, with differing client expectations, over the Internet. An increasing amount of data is stored in cloud repositories, which provide high availability, accessibility and scalability (Duggan, Chi, Hacigumus, Zhu, & Cetintemel, 2013). As clouds are built over wide-area networks, they often use large-scale computer clusters assembled from low-cost hardware and network equipment, where resources are allocated dynamically amongst the users of the cluster (Yang, Li, Han, & Wang, 2013). The cloud storage environment has therefore created an increasing demand to coordinate access to shared resources in order to improve overall performance.

Users can purchase traditional data centers, sub-divide hardware into virtual machines, or outsource all of their work to one of many cloud providers (Duggan et al., 2013). As the number of users submitting queries increases, so do the load and traffic on the virtual resources. It is therefore essential to incorporate a mechanism that balances the load across these virtual resources (Somasundaram, Govindarajan, Rajagopalan, & Madhusudhana, 2012).
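As a concrete illustration of such a balancing mechanism, the sketch below assigns each incoming query to the currently least-loaded virtual resource. This is not the chapter's implementation; the class name, resource names, and the idea of using an estimated per-query cost are all illustrative assumptions.

```python
import heapq

class LeastLoadedBalancer:
    """Keeps virtual resources in a min-heap keyed by their current load."""

    def __init__(self, resource_names):
        # Each heap entry is (current_load, resource_name); all start idle.
        self.heap = [(0, name) for name in resource_names]
        heapq.heapify(self.heap)

    def assign(self, query_cost):
        """Route a query with an estimated cost to the least-loaded resource."""
        load, name = heapq.heappop(self.heap)
        heapq.heappush(self.heap, (load + query_cost, name))
        return name

balancer = LeastLoadedBalancer(["vm1", "vm2", "vm3"])
placements = [balancer.assign(10) for _ in range(6)]
print(placements)  # each VM ends up with two of the six equal-cost queries
```

A real system would of course update loads as queries finish and would estimate costs from query plans rather than assume them as given here.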

Workload management is the discipline of effectively managing, controlling and monitoring application workloads across computing systems (Niu, Martin, & Powley, 2009). A workload is a set of requests that access and process data under some constraints. The data access performed by a query can vary from the retrieval of a single record to the scan of an entire file or table. Since the load on data resources in a cloud can fluctuate rapidly among its multiple workloads, it is impossible for system administrators to manually adjust system configurations to maintain the workloads' objectives during their execution. Managing the query workload automatically in a cloud computing environment is therefore a challenge in satisfying cloud users. This is done by reallocating resources through admission control in the presence of workload fluctuations (Niu, Martin, Powley, Horman, & Bird, 2006).
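The admission-control idea mentioned above can be sketched as follows: queries are admitted only while the aggregate load stays under a capacity threshold, and deferred queries are released as running work completes. This is a minimal illustration, not the cited systems' design; the capacity units and method names are assumptions.

```python
from collections import deque

class AdmissionController:
    def __init__(self, capacity):
        self.capacity = capacity      # maximum concurrent load units
        self.current_load = 0
        self.waiting = deque()        # queries deferred until load drops

    def submit(self, query_id, cost):
        """Admit the query if capacity allows; otherwise queue it."""
        if self.current_load + cost <= self.capacity:
            self.current_load += cost
            return "admitted"
        self.waiting.append((query_id, cost))
        return "queued"

    def complete(self, cost):
        """Called when a running query finishes; admits queued work if it now fits."""
        self.current_load -= cost
        admitted = []
        while self.waiting and self.current_load + self.waiting[0][1] <= self.capacity:
            qid, c = self.waiting.popleft()
            self.current_load += c
            admitted.append(qid)
        return admitted

ac = AdmissionController(capacity=100)
print(ac.submit("q1", 60))   # admitted
print(ac.submit("q2", 60))   # queued: 60 + 60 exceeds capacity
print(ac.complete(60))       # ['q2'] is admitted once q1 finishes
```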

Key Terms in this Chapter

Abstract Query Tree: A query execution plan generated as a tree by the database management system's parser. Internal nodes hold the operations of the query, such as a select operation, while the relations used in those operations sit at the leaves of the tree.
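The structure described in this definition can be made concrete with a tiny sketch: operators at internal nodes, relation names at the leaves. The node labels and the example query are illustrative assumptions, not taken from the chapter.

```python
class Node:
    def __init__(self, label, children=()):
        self.label = label            # e.g. an operator, or a relation name at a leaf
        self.children = list(children)

    def is_leaf(self):
        return not self.children

    def leaves(self):
        """Return the relations referenced by this (sub)tree."""
        if self.is_leaf():
            return [self.label]
        out = []
        for child in self.children:
            out.extend(child.leaves())
        return out

# A projection over a join of two relations, expressed as a tree:
tree = Node("PROJECT name",
            [Node("JOIN emp.dept_id = dept.id",
                  [Node("employees"), Node("departments")])])
print(tree.leaves())  # ['employees', 'departments']
```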

Autonomic Database Management System: A system able to manage itself automatically, achieving an administrator's goals without increasing costs or the size of the management team. Such systems must adapt quickly to new conditions introduced to them.

Distributed Resources: A set of resources whose storage devices are not all attached to a common processing unit; it consists of multiple computers, located in different physical locations and dispersed over a network of interconnected machines.

Workload Management: The amount of work, such as a number of tasks, assigned to a particular resource over a given period. It manages workload distribution to provide optimal performance for applications and users.

Cloud Computing: A kind of Internet-based computing that relies on shared computing resources, instead of local and personal servers, to access applications. It is based on the delivery of on-demand computing resources over the Internet on a pay-for-use basis.

Query Processing: The process by which queries are processed and optimized within the database management system. It consists of a series of steps that take a query as input and produce its result as output.

Cloud Storage: A data storage environment in which digital data is stored in logical pools across multiple servers in a cloud computing environment, while the physical environment is typically owned and managed by a hosting company.

Load Distribution: Distributes tasks across the multiple computing resources that provide the requested database service. Its main goal is to optimize resource utilization and minimize response time without overloading any single resource.

Resource Replication: The creation of multiple instances of the same resource, enabling data from one resource to be replicated to one or more others. It is typically performed when a resource's availability and performance need to be enhanced.
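A minimal sketch of this replication idea, assuming a simple write-to-all, read-from-any-healthy-replica policy (the class and policy are illustrative assumptions, not the chapter's design):

```python
class ReplicatedStore:
    """Writes go to every replica; reads can be served by any healthy one."""

    def __init__(self, replica_names):
        self.replicas = {name: {} for name in replica_names}

    def write(self, key, value):
        for store in self.replicas.values():
            store[key] = value        # replicate the write to every instance

    def read(self, key, down=()):
        for name, store in self.replicas.items():
            if name not in down:      # skip replicas marked as failed
                return store.get(key)
        raise RuntimeError("no replica available")

store = ReplicatedStore(["r1", "r2", "r3"])
store.write("record", "value")
print(store.read("record", down=("r1",)))  # still readable with r1 down
```

The availability benefit is visible in the last line: losing one replica does not lose access to the data.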
