Workload Prediction on Google Cluster Trace

Md. Rasheduzzaman (Department of Electrical and Computer Engineering, North South University, Dhaka, Bangladesh), Md. Amirul Islam (Department of Electrical and Computer Engineering, North South University, Dhaka, Bangladesh) and Rashedur M. Rahman (Department of Electrical and Computer Engineering, North South University, Dhaka, Bangladesh)
Copyright: © 2014 | Pages: 19
DOI: 10.4018/ijghpc.2014070103

Abstract

Workload prediction is an important task in cloud systems because it helps maximize resource utilization. A cloud system requires efficient resource allocation to minimize resource cost while maximizing profit. One optimal strategy for efficient resource utilization is to allocate resources in a timely manner according to the needs of applications. An important precondition of this strategy is obtaining future workload information in advance. The main focus of this analysis is to design and compare different forecasting models for predicting future workload. This paper develops models using Adaptive Neuro-Fuzzy Inference System (ANFIS), Non-linear Autoregressive Network with Exogenous inputs (NARX), Autoregressive Integrated Moving Average (ARIMA), and Support Vector Regression (SVR). Public trace data (workload trace version II) made available by Google were used to verify the accuracy, stability, and adaptability of the different models. Finally, this paper compares these prediction models to identify the one that gives the best predictions. The performance of the forecasting techniques is measured by several popular statistical metrics: Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Sum of Squared Error (SSE), and Normalized Mean Squared Error (NMSE). The experimental results indicate that the NARX model outperforms the other models, namely ANFIS, ARIMA, and SVR.
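The five error metrics named in the abstract can be illustrated with a minimal Python sketch. The function name `forecast_errors` and the sample values are hypothetical and are not taken from the paper. NMSE has several definitions in the literature; the sketch assumes the mean squared error divided by the population variance of the actual series, which may differ from the normalization the authors used.

```python
import math


def forecast_errors(actual, predicted):
    """Return SSE, MAE, RMSE, MAPE (in percent) and NMSE for two equal-length series."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("series must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")

    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]

    sse = sum(e * e for e in errors)                  # Sum of Squared Error
    mae = sum(abs(e) for e in errors) / n             # Mean Absolute Error
    rmse = math.sqrt(sse / n)                         # Root Mean Squared Error
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n

    # Assumption: NMSE = MSE / population variance of the actual series.
    mean_actual = sum(actual) / n
    variance = sum((a - mean_actual) ** 2 for a in actual) / n
    if variance == 0:
        raise ValueError("NMSE is undefined for a constant actual series")
    nmse = (sse / n) / variance

    return {"SSE": sse, "MAE": mae, "RMSE": rmse, "MAPE": mape, "NMSE": nmse}


if __name__ == "__main__":
    # Made-up workload values, used only to exercise the function.
    observed = [10.0, 20.0, 30.0, 40.0]
    forecast = [12.0, 18.0, 33.0, 39.0]
    for name, value in forecast_errors(observed, forecast).items():
        print(f"{name}: {value:.4f}")
```

Lower values indicate a better forecast for all five metrics. Under the normalization assumed here, an NMSE close to 1 means the model does little better than predicting the mean of the actual series.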
Article Preview

Much research has been conducted on time series forecasting. These studies have provided a better understanding of what might happen in the future. Wang, Chau, Cheng and Qiu (2009) compared different artificial intelligence (AI) methods (such as SVM, ARIMA, ANFIS, ANN and GP) to build effective hydropower resource management and scheduling. They used four statistical performance measures to validate the different AI models. The authors showed that AI methods are more powerful than traditional time series forecasting models. They reported that SVM performed better in both the training and validation stages, whereas ANFIS performed differently in the training and validation stages. Genetic Programming (GP) had better results in the validation phase.

Contreras et al. (2003) carefully studied the performance of the ARIMA model for next-day electricity price forecasting. The authors proposed two different ARIMA models for predicting electricity prices in Spain and California. In their research they showed that the ARIMA model required only 5 hours to forecast future prices for Spain, whereas it took 2 hours to predict prices for California. Díaz-Robles et al. (2008) designed a hybrid model combining ARIMA and ANN models for forecasting air quality in urban areas. They showed that the traditional Box-Jenkins time series (ARIMA) model and multiple linear regression (MLR) models have limited accuracy. Khashei and Bijari (2010) built a novel predictive model by combining ARIMA and ANN models, which gives better accuracy. They tested their model on three real data sets and argued that it can be used as an appropriate alternative when higher performance is needed.
