1. Introduction
The rapid innovation in distributed multi-component computing application frameworks (Hindman et al., 2011; McRae, 1997) creates an urgent need for an equivalent multi-component distributed system infrastructure, or metacomputing infrastructure. A number of research groups (Ghodsi et al., 2011; Isard et al., 2009; Grimshaw and Wulf, 1997; Foster and Kesselman, 1997) have already proposed and implemented metacomputing infrastructures aimed at high-performance throughput for a large number of diverse compute-intensive metacomputing applications. The recent paradigm shift from parallel applications that request resources from a single computing cluster to metacomputing applications that request resources from heterogeneous metacomputing clusters can be attributed to the single goal of achieving high performance.
In this paper, we group computing applications into two classes based on their resource requirements. The first is the single-component application, whose resource requirements can be met by resources from a single cluster. The second is the multi-component application, whose resource requirements go beyond what a single cluster can provision and instead call for heterogeneous resources from multiple clusters. These resources can include remote database servers, remote laboratory instruments, remote compute-intensive servers, and remote network servers (Weissman, 2000). The challenges inherent in distributed heterogeneous computing environments are well known (Freund and Siegel, 1993). Examples of frameworks that provide heterogeneous computing environments include multi-component clusters, grids, and cloud computing systems (Foster and Kesselman, 1997; Weissman, 2000).
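The two application classes defined above can be illustrated with a minimal Python sketch. It is not taken from the paper: the names `Cluster`, `Application`, and `classify` are hypothetical, resources are reduced to plain type labels, and the third outcome, "unschedulable", is an added assumption for requirements that no combination of clusters can cover.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    name: str
    resources: frozenset  # resource types this cluster can provision (assumed model)


@dataclass(frozen=True)
class Application:
    name: str
    requirements: frozenset  # resource types the application needs (assumed model)


def classify(app, clusters):
    """Return 'single-component', 'multi-component', or 'unschedulable'."""
    # Single-component: at least one cluster covers every requirement on its own.
    if any(app.requirements <= c.resources for c in clusters):
        return "single-component"
    # Multi-component: no single cluster suffices, but the pooled clusters do.
    pooled = frozenset().union(*(c.resources for c in clusters))
    if app.requirements <= pooled:
        return "multi-component"
    # Hypothetical extra case, not part of the paper's two classes.
    return "unschedulable"


clusters = [
    Cluster("A", frozenset({"compute"})),
    Cluster("B", frozenset({"database", "instrument"})),
]
print(classify(Application("render", frozenset({"compute"})), clusters))
print(classify(Application("climate", frozenset({"compute", "database"})), clusters))
```

With these two illustrative clusters, the first application is classified as single-component and the second as multi-component, because its requirements span clusters A and B. A real scheduler would also have to account for capacities, availability, and platform differences, all of which this sketch ignores.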
Exploiting the performance potential of heterogeneous computing environments requires effective application scheduling. In essence, this requires the appropriate and efficient selection and allocation of candidate resources to user applications. The problem is particularly challenging because both the resources and the applications themselves are heterogeneous and unpredictable. Scheduling of heterogeneous applications and resources can be made more effective by applying scheduling heuristics that capture the complete structure of both the application and the resource information. These heuristics should be able to extract this information automatically and forward it to the global scheduler so that it can make well-informed scheduling decisions.
Our intent in this paper is to present a conceptual design framework for a multi-component-based reference scheduling architecture capable of simultaneously scheduling both single-component and multi-component heterogeneous applications across diverse platforms of heterogeneous multi-clusters, with the aim of reducing application execution time and achieving optimal resource throughput and utilization. Separate scheduling implementations for single-component and multi-component clusters can be found in Ezugwu et al. (2015), Weissman (1998), and Mechoso et al. (1994).
We intend to build an object-based storage infrastructure that replaces the existing Meta Directory Service (MDS) information infrastructure, which lacks the capability to support abstract queries from user applications (Dabhi and Prajapati, 2008). Existing distributed systems, such as the grid, use syntax- or schema-based resource matchmakers, algorithmic schedulers, and execution monitors for scripted job sequences (Dabhi and Prajapati, 2008). Because distributed systems are heterogeneous and dynamic, an object-based information infrastructure plays a very important role in accommodating the dynamism associated with most scheduling components.
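To make the contrast with schema-based matchmaking more concrete, the following Python sketch shows one way an object-based information store could answer abstract queries. It is an illustration under stated assumptions, not the authors' design and not the MDS interface: `ResourceObject`, `ObjectStore`, `put`, and `query` are hypothetical names, and an "abstract query" is modelled simply as a predicate over a resource's free-form attributes.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceObject:
    name: str
    # Free-form attributes, so heterogeneous resources need no shared schema.
    attributes: dict = field(default_factory=dict)


class ObjectStore:
    """Hypothetical in-memory store of resource objects."""

    def __init__(self):
        self._objects = {}

    def put(self, obj):
        # Registering a name again overwrites the old entry, which is how
        # this sketch reflects a resource whose state has changed.
        self._objects[obj.name] = obj

    def query(self, predicate):
        # An abstract query is any callable over the attribute dictionary.
        return sorted(
            o.name for o in self._objects.values() if predicate(o.attributes)
        )


store = ObjectStore()
store.put(ResourceObject("node1", {"kind": "compute", "cores": 64}))
store.put(ResourceObject("db1", {"kind": "database", "engine": "sql"}))
store.put(ResourceObject("node2", {"kind": "compute", "cores": 8}))

# Find compute resources with at least 32 cores.
large = store.query(lambda a: a.get("kind") == "compute" and a.get("cores", 0) >= 32)
print(large)
```

Because each query is evaluated against whatever attributes an object currently carries, resources of different kinds can coexist in the store and be updated without changing a fixed schema. A production system would additionally need persistence, distribution, and access control, which are outside the scope of this sketch.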