A Novel Completion-Time-Minimization Scheduling Approach of Scientific Workflows Over Heterogeneous Cloud Computing Systems

S. Sabahat H. Bukhari (College of Computer Science, Chongqing University, Chongqing, China) and Yunni Xia (College of Computer Science, Chongqing University, Chongqing, China)
Copyright: © 2019 | Pages: 20
DOI: 10.4018/IJWSR.2019100101

Abstract

The cloud computing paradigm provides an ideal platform for supporting large-scale scientific-workflow-based applications over the internet. However, the scheduling and execution of scientific workflows still face various challenges, such as cost and response-time management, which involve handling the acquisition delays of physical servers and minimizing the overall completion time of workflows. A careful investigation of existing methods shows that most of them assume static performance of physical machines (PMs) and ignore the impact of resource acquisition delays in their scheduling models. In this article, the authors present a meta-heuristic-based method for scheduling scientific workflows that aims to reduce workflow completion time by appropriately managing the acquisition and transmission delays required for inter-PM communications. The authors also carry out extensive case studies based on real-world commercial clouds and multiple workflow templates. Experimental results show that the proposed method outperforms state-of-the-art methods such as ICPCP, CEGA, and JIT-C in terms of workflow completion time.

1. Introduction

Scientific and business analysis applications involve large volumes of data. To shorten the overall processing time, the data are divided and processed in parallel. Intermediate data are passed through multiple steps that can be managed and scheduled through process-like models, e.g., service compositions (Cao et al., 2017; Fu et al., 2018; Gao et al., 2018; Sun et al., 2018; Xia et al., 2012, 2013; Zheng et al., 2017) and workflows. Workflows have recently become popular for orchestrating data and computation in computation-intensive, large-scale scientific applications, e.g., in molecular biology and high-energy physics. Scientific workflows aim to integrate data and computation steps into organized operations that perform semi-automatic computational tasks for scientific applications. They generally offer graphical interfaces that integrate different techniques with effective methods for using them, thereby enhancing the working efficiency of scientists. They are typically represented as directed acyclic graphs (DAGs), in which the nodes represent separate computing components and the edges represent the communication links over which data and results are transmitted.
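As a rough illustration of the DAG representation described above, the following Python sketch encodes a small workflow as a mapping from each task to the set of tasks it depends on, and derives one valid execution order. The four task names are hypothetical and do not come from the article; the sketch assumes Python 3.9 or later for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter

# Hypothetical four-task workflow (names invented for illustration).
# Each key is a task (a node); its value is the set of tasks whose
# output it consumes (the incoming edges).
workflow = {
    "split": set(),
    "align_a": {"split"},
    "align_b": {"split"},
    "merge": {"align_a", "align_b"},
}

def execution_order(dag):
    """Return one ordering of tasks in which every task follows its predecessors."""
    return list(TopologicalSorter(dag).static_order())

order = execution_order(workflow)
print(order)
```

Any order returned this way respects the edges: "split" runs first and "merge" runs last, while the two alignment tasks have no edge between them and could therefore run in parallel on different machines.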

Recently, cloud computing systems and platforms have become widely accepted as a promising supporting infrastructure for large-scale scientific applications. Cloud computing systems provision virtual and physical resources to one or more groups of users. The resource owners decide when and to whom a specific resource should be allotted (Deng et al., 2017; Wu et al., 2017; Xia et al., 2015; Xia et al., 2015a, 2015b).

In this way, collaboration can combine cloud resources to give users supercomputer-level computational power for their large-scale scientific applications. This model permits tenants or end users to acquire and release the required resources in a pay-as-you-go manner. Scientific applications can elastically scale the resource pool up or down at run time. The cloud management assigns only the required computational resources, which maximizes the utilization rate and reduces operating costs. Scientific workflows are generally scheduled on the cloud through the following steps: 1) to run scientific tasks, a bag of physical resources is selected from the resource pool; 2) a schedule is generated and each task is mapped to its corresponding resource. IaaS clouds provide resources to users in the form of virtual machine (VM) instances deployed at the provider's data center.
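The two steps above (selecting resources, then mapping tasks to them) can be sketched with a simple greedy list scheduler. This is not the meta-heuristic method proposed in the article; it is a minimal illustration under stated assumptions: each machine has a hypothetical processing speed and a fixed acquisition delay before it can run its first task, all machines are requested at time zero, and data transfer time between machines is ignored. All names and numbers are invented for the example.

```python
from graphlib import TopologicalSorter

def greedy_schedule(dag, work, speed, acquire_delay):
    """Place each task on the machine that finishes it earliest.

    dag: task -> set of predecessor tasks
    work: task -> workload in abstract units
    speed: machine -> units processed per second (assumed constant)
    acquire_delay: machine -> seconds before the machine is usable
    Returns (placement, makespan).
    """
    ready_at = dict(acquire_delay)  # a machine is free once it has been acquired
    finish, placement = {}, {}
    for task in TopologicalSorter(dag).static_order():
        # A task cannot start before all of its predecessors have finished.
        preds_done = max((finish[p] for p in dag.get(task, ())), default=0.0)
        best_end, best_machine = None, None
        for machine in speed:
            start = max(preds_done, ready_at[machine])
            end = start + work[task] / speed[machine]
            if best_end is None or end < best_end:
                best_end, best_machine = end, machine
        finish[task] = best_end
        placement[task] = best_machine
        ready_at[best_machine] = best_end  # machine is busy until the task ends
    return placement, max(finish.values())

# Hypothetical instance: "fast" is quicker but takes 4 s to acquire.
dag = {"split": set(), "align_a": {"split"}, "align_b": {"split"},
       "merge": {"align_a", "align_b"}}
work = {"split": 10, "align_a": 20, "align_b": 20, "merge": 10}
speed = {"fast": 2.0, "slow": 1.25}
acquire_delay = {"fast": 4.0, "slow": 0.0}

placement, makespan = greedy_schedule(dag, work, speed, acquire_delay)
print(placement, makespan)
```

In this toy instance the first task lands on the slower machine because the faster one is still being acquired, which is the kind of effect a scheduling model that ignores acquisition delays would miss.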

Recently, the scientific-workflow-oriented cloud scheduling problem has attracted considerable research attention (Li et al., 2018; Peng et al., 2018). Since the multi-constraint, multi-objective workflow scheduling problem is widely acknowledged to be NP-hard, it is extremely time-consuming to find an optimal solution through traversal-based algorithms. Existing works in this direction fall into two major categories, namely best-effort scheduling methods and QoS-constrained scheduling methods (Yu et al., 2008). Best-effort scheduling approaches aim at minimizing the workflow execution time while ignoring other objectives, e.g., cost and reliability. QoS-constrained methods, instead, are capable of handling multiple quantitative objectives and constraints.
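To give a sense of why traversal-based search becomes impractical, the sketch below enumerates every task-to-machine mapping for a tiny, hypothetical instance. It only counts candidate mappings; it does not evaluate them and is not taken from the article. The task and machine names are invented for illustration.

```python
from itertools import product

def enumerate_mappings(tasks, machines):
    """Yield every possible assignment of tasks to machines as a dict."""
    for combo in product(machines, repeat=len(tasks)):
        yield dict(zip(tasks, combo))

# Hypothetical instance: 3 tasks, 2 physical machines.
tasks = ["t1", "t2", "t3"]
machines = ["pm1", "pm2"]

mappings = list(enumerate_mappings(tasks, machines))
n_small = len(mappings)            # len(machines) ** len(tasks) candidates

# The same count for a modest workflow of 50 tasks on 10 machines,
# before the ordering of tasks on each machine is even considered.
n_large = 10 ** 50
print(n_small, n_large)
```

The number of candidates grows exponentially with the number of tasks, which is one reason heuristic and meta-heuristic methods are commonly used instead of exhaustive traversal.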

Figure 1. Cloud computing environment
