Energy Efficiency Oriented Scheduling for Heterogeneous Cloud Systems

Weiwei Lin (South China University of Technology, Guangzhou, China), Chao Yang (South China University of Technology, Guangzhou, China), Chaoyue Zhu (South China University of Technology, Guangzhou, China), James Z. Wang (School of Computing, Clemson University, Clemson, SC, USA) and Zhiping Peng (Guangdong University of Petrochemical Technology, Maoming, China)
Copyright: © 2014 |Pages: 14
DOI: 10.4018/IJGHPC.2014100101

Abstract

Improving energy efficiency has become a challenging problem in cloud computing. However, most existing efforts to improve the energy efficiency of a cloud system focus only on resource allocation at the system level, such as managing physical nodes or virtual machines. This paper addresses the energy efficiency problem of a heterogeneous cloud system at the task scheduling level. A novel task scheduling algorithm is proposed to reduce the energy consumption of the system while maintaining its performance, without shutting down or consolidating any system resources such as virtual machines or storage systems. The algorithm uses Ganglia to dynamically monitor the CPU and memory load of participating nodes on a heterogeneous Hadoop platform, then selects an appropriate task and submits it to a node with relatively low workload, avoiding excessive energy consumption on individual nodes. Experimental results show that this scheduling algorithm can effectively improve the energy-saving ratio of a heterogeneous cloud platform while maintaining high system performance.

1. Introduction

The flourishing of the World Wide Web (WWW) and Internet applications has led to rapid growth in the volume of data to be processed, promoting the development of cloud computing and big data processing technologies (Buyya, Yeo and Venugopal, 2008). Hadoop, an open source cloud platform that uses MapReduce (Dean and Ghemawat, 2010) to handle large-scale distributed data processing, has been widely used for big data. MapReduce is a software framework that lets users easily write applications that process vast amounts of data in parallel on large clusters of commodity hardware in a reliable, fault-tolerant manner. It has emerged as one of the most popular frameworks for data-intensive and distributed cloud computing. Because MapReduce usually processes tens of thousands of tasks, a good scheduling algorithm is very important to system performance (Dean and Ghemawat, 2010; Weiwei and Bo, 2012).

Currently, the three most popular scheduling algorithms on the Hadoop platform, the FIFO scheduling algorithm (Zaharia, Borthakur, Sen, et al., 2010), the Fair scheduling algorithm (Isard, Prabhakaran, Currey, et al., 2009) and the Capacity scheduling algorithm (Leverich and Kozyrakis, 2010), mainly focus on overall system performance and resource utilization, without considering energy consumption. In a heterogeneous cloud computing environment in particular, these scheduling algorithms may cause a certain node to be occupied for an extended period of time. In such a situation, system performance may degrade because resources remain highly occupied for a long time, even though the workload does not exceed the node's physical capacity. For example, when memory occupancy is over 90%, virtual memory is often used, which results in more disk I/O due to memory swapping. Memory swapping not only degrades system performance but also causes the system to consume more energy. Similarly, if the CPU is continuously occupied for a long period of time, the high CPU temperature might slow down computation, and more energy is needed to cool the system.
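
The idea of steering tasks away from heavily loaded nodes can be illustrated with a minimal sketch. This is not the algorithm proposed in the paper; it only shows one plausible way to pick a lightly loaded node from monitored CPU and memory figures. The 90% memory limit comes from the example above, while the NodeLoad type, the equal CPU/memory weighting, and the function names are assumptions introduced purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NodeLoad:
    """Hypothetical snapshot of one node's monitored load."""
    name: str
    cpu: float  # CPU utilization as a fraction in [0, 1]
    mem: float  # memory occupancy as a fraction in [0, 1]


MEM_LIMIT = 0.90   # above this, swapping is likely (figure taken from the text)
CPU_WEIGHT = 0.5   # assumed weighting; the paper's actual weights are not given here


def combined_load(node: NodeLoad) -> float:
    """Weighted mix of CPU and memory load (an illustrative metric)."""
    return CPU_WEIGHT * node.cpu + (1.0 - CPU_WEIGHT) * node.mem


def pick_node(nodes: List[NodeLoad]) -> Optional[NodeLoad]:
    """Return the least-loaded node whose memory occupancy is within the limit.

    Returns None when every node is over the memory limit, so the caller
    can decide to wait instead of overloading a node.
    """
    candidates = [n for n in nodes if n.mem <= MEM_LIMIT]
    if not candidates:
        return None
    return min(candidates, key=combined_load)
```

For instance, given nodes with (CPU, memory) loads of (0.2, 0.95), (0.6, 0.5) and (0.3, 0.4), the sketch skips the first node because its memory occupancy exceeds 90% and selects the third, whose combined load is lowest. In a real deployment the load figures would come from a monitoring system such as Ganglia rather than from hard-coded values.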

Based on these observations, we found that pursuing high resource utilization alone may not lead to optimal task scheduling in terms of energy consumption and system performance. Especially in current cloud computing environments, the thousands of computing nodes and storage devices hosted in a data center often demand a large amount of energy to run the system and, consequently, to remove the heat it generates. A recent survey (Jonathan, 2011) showed that electricity used by data centers worldwide increased by about 56% from 2005 to 2010, rather than doubling as it did from 2000 to 2005. It also found that the energy cost of running one server for 4 years is essentially equal to its hardware cost. Although many researchers have tried to solve the high energy consumption problem of data centers, most of the existing approaches (Anton, Jemal and Rajkumar, 2012; Von, Wang, Younge, and He, 2009; Jurgen, Rita, Claudio and Jose, 2011; Rong, Xizhou and Kirk, 2005; Matthieu, Eugen, Anne-Cécile et al., 2013; Weiwei, Bo, Liangchang and Deyu, 2013) only address the problem at the system management and resource allocation level. Although energy-efficient task scheduling algorithms have been proposed for homogeneous cloud systems, the impact of task scheduling on the energy consumption of heterogeneous cloud systems has yet to be studied.
