Article Preview
Top1. Introduction
Mesh interconnection networks have been extensively employed in large-scale multicomputers due to their structural regularity, simplicity, ease of implementation and scalability (Ababneh, 1995; Bani-Mohammad et al., 2006; Bani-Mohammad et al., 2009; Bani-Mohammad & Ababneh, 2013; Foster, 1995; Kumar et al., 2003; Lo et al., 1997; Seo & Kim, 2003; Zhu, 1992).
Effective processor allocation and job scheduling are critical if the full computational power of multicomputers is to be exploited properly (Choo et al., 2000; Yoo & Das, 2002). Processor allocation is responsible for determining the set of processors on which a parallel job is executed, whereas job scheduling is responsible for selecting the order in which jobs are selected for execution (Choo et al., 2000; De Rose et al., 2007; Yoo & Das, 2002). The job scheduler selects the next job to execute, and then the processor allocator assigns free processors for executing the selected job (Choo et al., 2000; Windisch, 1997).
In mesh multicomputers, a parallel job is typically allocated a distinct contiguous sub-mesh for the duration of its execution, and the sub-mesh has the same general shape as the mesh multicomputer itself. Most contiguous processor allocation strategies proposed in the literature are for 2D meshes (Ababneh, 2009; Chuang & Tzeng, 1994; Kim & Yoon, 1998; Zhu, 1992). Although the 2D mesh has been used in a number of parallel machines, such as the Cray XE6m (Cray, 2010b), the iWARP (Peterson et al., 1991) and the Touchstone Delta system (Intel, 1991), current multicomputers, such as the K-computer (Riken & Fujitsu, 2014), the Cray XE6 (Cray, 2010a) and the IBM BlueGene/L (Horn et al., 2014) utilize the 3D mesh or the 3D tori (Cray, 2010a; Horn et al., 2014; Riken & Fujitsu, 2014) as interconnection network due to their lower diameter, lower average communication distance, and higher connectivity and network degree (Athas & Seitz, 1988).
Contiguous allocation can lead to high external processor fragmentation, which occurs when the allocation strategy cannot allocate available processors to an incoming job because the available processors are not contiguous or they do not have the requested shape. Non-contiguous allocation can reduce processor fragmentation and have better system performance than contiguous allocation because a job can execute on multiple smaller sub-meshes. However, contiguous allocation is used in highly-parallel systems such as the IBM BlueGene/L because it isolates jobs from each other, which is useful for security and accounting reasons (Aridor et al., 2005). Contiguous allocation for 3D meshes has been investigated in the literature (Ababneh, 2001; Bani-Mohammad et al., 2006; Bani-Mohammad et al., 2009; Choo et al., 2000; Mao et al., 2005; Qiao & Ni, 1995).