Performance Evaluation of Cloud Data Centers with Batch Task Arrivals

Hamzeh Khazaei (Ryerson University, Canada), Jelena Mišić (Ryerson University, Canada) and Vojislav B. Mišić (Ryerson University, Canada)
Copyright: © 2014 |Pages: 25
DOI: 10.4018/978-1-4666-4522-6.ch009


Accurate performance evaluation of cloud computing resources is a necessary prerequisite for ensuring that Quality of Service (QoS) parameters remain within agreed limits. In this chapter, the authors consider cloud centers with Poisson arrivals of batch task requests under a total rejection policy; task service times are assumed to follow a general distribution. They describe a new approximate analytical model for performance evaluation of such systems and show that important performance indicators such as mean request response time, waiting time in the queue, queue length, blocking probability, probability of immediate service, and probability distribution of the number of tasks in the system can be obtained over a wide range of input parameters.
Chapter Preview


Cloud computing is a novel computing paradigm in which different computing resources such as infrastructure, platforms and software applications are made accessible over the Internet to remote users as services (Vaquero, Rodero-Merino, Caceres, and Lindner 2008). It is quickly gaining acceptance: according to IDC, 17 billion dollars was spent on cloud-related technologies, hardware and software in 2009, and spending is expected to grow to 45 billion dollars by 2013 (Patrizio 2011). Due to the dynamic nature of cloud environments, the diversity of users' requests, and the time dependency of load, providing the agreed Quality of Service (QoS) while avoiding over-provisioning is a difficult task (Xiong and Perros 2009). Performance evaluation of cloud centers is therefore an important research task. However, despite the considerable research effort that has been devoted to cloud computing in both academia and industry, only a small portion of it has dealt with performance evaluation. In this chapter, we address this deficiency by proposing an analytical model for performance evaluation of cloud centers. The model utilizes queuing theory and probabilistic analysis to allow tractable evaluation of several important performance indicators, including response time and other related measures (Wang, von Laszewski, Younge, He, Kunze, Tao, and Fu 2010).

We assume that the cloud center consists of a number of servers that are allocated to users in the order of request arrivals. Users may request a number of servers in a single request, i.e., we allow batch arrivals, hereafter referred to as super-task arrivals. This model is consistent with the so-called On-Demand services provided by the Elastic Compute Cloud (EC2) from Amazon (2010). Such services provide no advance reservation and no long-term commitment, which is why clients may experience delays in the fulfillment of requests. (The other types of services offered by Amazon EC2, known as Reserved and Spot services, have different allocation policies and availability.) While many of the large cloud centers employ virtualization to provide the required resources such as servers (Fu, Hao, Tu, Ma, Baldwin, and Bastani 2010), we consider servers to be physical servers; our model is thus applicable to intra-company (private) clouds as well as to public clouds of small or medium providers.

As the user population size is relatively high and the probability of a given user requesting service is relatively low, the arrival process can be adequately modeled as a Markovian process, i.e., super-tasks arrive according to a Poisson process (Grimmett and Stirzaker 2010). However, some authors have claimed that a Poisson process does not adequately model the arrival process in real cloud centers (Benson, Akella, and Maltz 2010).
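The reasoning behind the Poisson assumption can be illustrated numerically. The following is a minimal sketch, not part of the chapter's model: it assumes that each of `n_users` users independently issues a request in a given time slot with probability `p_request` (both values are hypothetical), so the number of requests per slot is binomial, and it compares that distribution with a Poisson distribution of the same mean.

```python
import math

def binomial_pmf(k, n, p):
    """P(exactly k of n independent users issue a request)."""
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(exactly k arrivals) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Hypothetical values: a large population, each user rarely requesting service.
n_users = 10_000
p_request = 0.0005
lam = n_users * p_request  # mean number of requests per slot

# Largest pointwise difference between the two distributions for k = 0..30.
max_gap = max(
    abs(binomial_pmf(k, n_users, p_request) - poisson_pmf(k, lam))
    for k in range(31)
)
print(f"mean arrivals per slot: {lam:.3f}")
print(f"largest pmf difference: {max_gap:.2e}")
```

With a large population and a small per-user probability the two distributions are nearly indistinguishable, which is the standard justification for the approximation; the sketch says nothing about whether real cloud workloads satisfy the independence assumption.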

When a super-task arrives, if the necessary number of servers is available, they are allocated immediately; if not, the super-task is queued in the input buffer until the servers become available, or rejected if the input buffer is unable to hold the request. As a result, all tasks within a super-task obtain service, or are rejected, simultaneously. This policy, known as the total rejection policy, is well suited to modeling the behavior of a cloud center; it is assumed that users request as many servers as they need and would not accept partial fulfillment of their requests.
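The admission rule described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' model: the class name `CloudCenter`, the choice to measure buffer capacity in tasks, and the rule that a new arrival waits behind any super-task already queued (to respect arrival order) are all assumptions made for the example.

```python
from collections import deque

class CloudCenter:
    """Toy admission logic for super-tasks under total rejection."""

    def __init__(self, servers, buffer_capacity):
        self.idle = servers                      # currently idle servers
        self.buffer_capacity = buffer_capacity   # assumed to be counted in tasks
        self.queue = deque()                     # sizes of waiting super-tasks

    def arrive(self, size):
        """Handle a super-task of `size` tasks; all of its tasks share one outcome."""
        # Serve at once only if nobody is waiting and enough servers are idle.
        if not self.queue and size <= self.idle:
            self.idle -= size
            return "served"
        # Otherwise the whole super-task must fit into the input buffer.
        if sum(self.queue) + size <= self.buffer_capacity:
            self.queue.append(size)
            return "queued"
        # No partial admission: the entire super-task is rejected.
        return "rejected"

center = CloudCenter(servers=4, buffer_capacity=3)
outcomes = [center.arrive(size) for size in (3, 2, 2, 1)]
print(outcomes)
```

The sketch covers arrivals only; releasing servers at task completion and moving queued super-tasks into service are omitted.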

A request may target a specific infrastructure instance (e.g., a dual- or quad-core CPU with a specified amount of RAM), a platform (e.g., Windows, Linux, or Solaris), or a software application (e.g., a database management system, a Web server, or an application server), with different probabilities. Assuming that the service time for each component of the resulting infrastructure-platform-application tuple follows a simple exponential or Erlang distribution, the aggregate service time of the cloud center would follow a hyper-exponential or hyper-Erlang distribution. In this case, the coefficient of variation (CoV, defined as the ratio of the standard deviation to the mean value) of the resulting service time distribution exceeds the value of one (Corral-Ruiz, Cruz-Perez, and Hernandez-Valdez 2010). As a result, the service time should be modeled with a general distribution, preferably one that allows the coefficient of variation to be adjusted independently of the mean value.
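The effect on the coefficient of variation can be checked for the hyper-exponential case with a short calculation. The sketch below assumes a mixture in which a request is served by exponential phase i (rate `rates[i]`) with probability `probs[i]`; the function name `hyperexp_cov` and the numerical values are hypothetical and chosen only for illustration.

```python
import math

def hyperexp_cov(probs, rates):
    """Coefficient of variation of a hyper-exponential service time.

    With probability probs[i] the service time is exponential with rate rates[i].
    """
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("branch probabilities must sum to 1")
    mean = sum(p / r for p, r in zip(probs, rates))               # E[X]
    second = sum(2.0 * p / r**2 for p, r in zip(probs, rates))    # E[X^2]
    return math.sqrt(second - mean**2) / mean

# A single exponential phase has a CoV of exactly one.
cov_single = hyperexp_cov([1.0], [2.0])

# Mixing a slow and a fast phase (hypothetical rates) pushes the CoV above one.
cov_mixed = hyperexp_cov([0.5, 0.5], [1.0, 10.0])

print(f"single phase CoV: {cov_single:.3f}")
print(f"mixed phases CoV: {cov_mixed:.3f}")
```

The further apart the phase rates are, the larger the CoV becomes, which is why a distribution with an adjustable CoV is preferable to a plain exponential one.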
