A Cost-Optimized Data Parallel Task Scheduling in Multi-Core Resources Under Deadline and Budget Constraints

Saravanan Krishnan, Rajalakshmi N. R.
Copyright: © 2022 | Pages: 16
DOI: 10.4018/IJCAC.305857
This article was retracted

Abstract

Large-scale distributed systems offer high processing speeds and large communication bandwidth over the network. However, processing huge volumes of real-world data in a distributed computing system remains difficult, because a major concern in large-scale distributed systems is guaranteeing that a data processing task completes within given budget and time constraints. This paper proposes a cost-optimized data parallel task scheduling scheme for multi-core resources to address this issue. By running concurrent executions on a multi-core resource, the number of parallel executions increases correspondingly, making it possible to finish the task within the deadline. A model is developed to optimize the operational cost of a data parallel task by feasibly assigning load fractions to each multi-core resource. The approach is evaluated experimentally with data parallel tasks, and the results show that tasks can be processed by their deadlines at optimized computational cost.

1. Introduction

The adoption of computing services has grown rapidly in recent years. Their attractiveness stems mainly from how IT resources are provisioned, such as the transformation of capital IT expenditure into operational expenditure, and from the potential to reduce costs through economies of scale. A distinctive advantage of such services is time- and cost-constrained task execution, since large numbers of jobs are bound to an assured budget for execution within a given computing time (Chen, 2012). As IT industries deal with unprecedented volumes of data every day, the open-source implementation of the MapReduce programming model in Hadoop has become the standard technique for analyzing petabyte-scale data in a cost-efficient way. A simpler version of the MapReduce framework can serve a large number of users for data-bound workloads (Jonas, 2017). On average, 25,000 MapReduce jobs are hosted every day on Facebook's Hadoop clusters containing 3,000 machines. These vast amounts of data create new challenges and opportunities that lead to discoveries and new knowledge in many application domains, ranging from science and engineering to business. The challenges are not limited to the size of the data, but also include time and cost constraints (Rani & Vinaya Babu, 2015).

It has been observed that processing massive, non-uniform real-world data through grid and cloud computing is becoming increasingly difficult (Hajikano et al., 2016). This difficulty can be overcome by efficient task scheduling, because an optimal schedule allocates resources efficiently and yields quick response times for real-time applications. Therefore, a cost-optimized task scheduling approach for data-intensive jobs is presented here; the motivation of this work is the execution of data-intensive jobs within time and cost constraints. Optimal selection of machines yields a significant improvement in performance and maximizes resource utilization (Abdelaziz, 2018). However, the scheduling of heterogeneous computing resources depends on parameters such as resource capacity, resource availability, workload size, and resource utilization cost. Service level agreement (SLA) negotiation also moderates the scheduling and utilization of resources (Cheng, 2015). Therefore, these parameters have to be considered when scheduling tasks on resources to meet user expectations (Figure 1).

Figure 1. A framework for Resource Scheduling

User expectations strongly entail the desired quality of service (QoS), such as quality of results, execution time, throughput, economic cost, reliability, and trust. Moreover, timeliness of computation is achieved by allowing users to specify an absolute deadline. To attain this timeliness, several patterns are employed to model a generic flow of work. The data parallelism pattern is appropriate for modeling an embarrassingly parallel computation over a data-intensive task; it leads to the concurrent execution of multiple independent data parallel tasks on heterogeneous computing resources. However, scheduling data parallel tasks in heterogeneous environments while satisfying QoS constraints such as cost and execution time is a complex problem.
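To illustrate this cost-versus-deadline trade-off, the following sketch (an illustrative assumption, not the formulation used in this article) poses the decision as a small linear program: hypothetical resources are described by core count, per-core processing rate, and price per unit time, and the solver chooses load fractions that minimize total cost while keeping every resource's execution time within the deadline and checking the result against a budget.

# A minimal sketch of deadline-constrained, cost-minimizing load-fraction
# assignment over heterogeneous multi-core resources. All numbers and the
# resource model below are illustrative assumptions, not the article's data.
import numpy as np
from scipy.optimize import linprog

W = 1000.0                             # total workload (data units) -- assumed
D = 40.0                               # deadline in seconds -- assumed
B = 120.0                              # budget in cost units -- assumed

cores = np.array([4, 8, 16])           # cores per resource
rate = np.array([2.0, 1.5, 1.0])       # data units processed per core per second
price = np.array([0.05, 0.08, 0.12])   # cost per second of resource usage

# Time to process a fraction alpha_i of W on resource i: alpha_i * W / (cores_i * rate_i)
unit_time = W / (cores * rate)
unit_cost = price * unit_time          # cost contributed per unit of load fraction

# Minimize total cost subject to: each resource finishes by D, fractions sum to 1.
res = linprog(
    c=unit_cost,
    A_ub=np.diag(unit_time), b_ub=np.full(len(cores), D),
    A_eq=np.ones((1, len(cores))), b_eq=[1.0],
    bounds=[(0.0, 1.0)] * len(cores),
    method="highs",
)

if res.success and res.fun <= B:
    print("load fractions:", np.round(res.x, 3), "cost:", round(res.fun, 2))
else:
    print("no schedule meets both the deadline and the budget")

With the assumed numbers, the solver shifts load toward the cheaper resources until their deadline bounds are saturated, which mirrors the tension between cost and execution time described above.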

Nowadays, most resources have multi-core processors, in which two or more processing cores are placed on the same chip. A multi-core processor improves overall performance by handling more work in parallel, and an efficient model is needed to exploit the performance of these multi-core resources. Data parallel processing favors designs with many processing elements handling large amounts of data, which often yields high throughput and performance (Blake, 2009).
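As a simple illustration of how concurrent executions on a multi-core resource raise the degree of parallelism, the sketch below (an assumed example, not the article's implementation) splits a data parallel task into independent chunks and processes them with one worker process per core; the chunking and the per-chunk computation are placeholders.

# Minimal sketch: data parallel processing of independent chunks on a
# multi-core resource, with one worker process per core.
import os
from concurrent.futures import ProcessPoolExecutor

def process_chunk(chunk):
    # Placeholder for the real per-chunk computation (e.g., a map task).
    return sum(x * x for x in chunk)

def run_data_parallel(data, workers=None):
    workers = workers or os.cpu_count()            # one worker per core
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_chunk, chunks))  # concurrent executions

if __name__ == "__main__":
    partials = run_data_parallel(list(range(1_000_000)))
    print(len(partials), "chunks processed; total =", sum(partials))

Raising the number of workers up to the core count increases the number of concurrent chunk executions, which is the effect the proposed scheduling relies on to finish a task within its deadline.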
