Push-based Prefetching in Remote Memory Sharing System

Rui Chu (National University of Defense Technology, China), Nong Xiao (National University of Defense Technology, China) and Xicheng Lu (National University of Defense Technology, China)
Copyright: © 2011 | Pages: 15
DOI: 10.4018/978-1-60960-603-9.ch017

Abstract

Remote memory sharing systems aim to improve overall performance by using distributed computing nodes with surplus memory capacity. By exploiting memory resources connected through a high-speed network, user nodes that are short of memory can obtain extra space. The performance of remote memory sharing, however, is constrained by the expensive cost of network communication. To hide the latency of remote memory access and improve performance, we propose push-based prefetching, which enables the memory providers to push potentially useful pages to the user nodes. Each provider employs sequential pattern mining techniques, adapted to the characteristics of memory page access sequences, to locate useful memory pages for prefetching. We have verified the effectiveness of the proposed method through trace-driven simulations.

Introduction

The rapid development of the Internet has driven a boom in network computing technology. Typical systems such as cluster computing, peer-to-peer computing, grid computing, and cloud computing share the goal of pooling various resources distributed across a network environment and providing services to a large number of users. The resources shared in such systems include CPU cycles, storage, data, and, as discussed in particular in this work, memory.

As one of the most important resources in computer architecture, memory plays a key role in system performance. For memory-intensive applications with large working sets, or I/O-intensive applications that access the disk heavily, memory capacity may dominate overall performance. The underlying reason is the large gap in both performance and capacity between memory and disk (Patterson, 2004): traditional computer systems must either supplement memory capacity with slow disk-based virtual memory, or improve disk performance with a cache built from limited memory. Accordingly, an intermediate level of the hierarchy between memory and disk is needed to relax these restrictions.

Remote memory sharing, which aggregates a large number of idle nodes in a network environment and exploits their memory resources as fast storage, can serve as this intermediate level with adequate performance and capacity (Feeley, et al., 1995; Hines, Lewandowski, et al., 2006; Newhall, et al., 2008; Pakin, et al., 2007). Memory-intensive applications can swap obsolete local memory pages to remote memory instead of the local disk (Feeley, et al., 1995), and I/O-intensive applications can benefit from a large data cache with a better hit ratio (Vishwanath, et al., 2008). Various remote memory sharing schemes have been proposed over the past decades. They differ mainly in the underlying network environment. Network memory and cooperative caching are built on a single cluster (Deshpande, et al., 2010; Wang, et al., 2007), while our previous work, RAM Grid, is devoted to memory sharing in a high-speed wide-area network such as a campus network (Chu, et al., 2006; Zhang, et al., 2007), and the recently proposed RAM Cloud aggregates the memory resources of a data center (Ousterhout, et al., 2010). What they have in common is the aim of boosting system performance with shared remote memory.

To study the potential performance improvement of a remote memory sharing system, we use our previous work, RAM Grid, as an example and compare the overheads of accessing an 8KB block on the local disk, on a local network file system, and on a remote memory resource across a campus network with an average round-trip latency of 2ms and 2MB bandwidth. From Table 1, we can observe that remote memory access reduces the overhead by only 25%~30%, and that most of the overhead comes from the network transmission cost (nearly 60%). Therefore, the performance of remote memory sharing can be noticeably improved if we reduce or hide some of the transmission cost. Prefetching is one approach to hiding the cost of slow media between different levels of storage devices (Shi, et al., 2006; Vanderwiel, et al., 2000; Yang, et al., 2004). In this work, we employ prefetching in remote memory sharing to reduce the overhead and improve performance. Unlike traditional I/O devices, the remote nodes that provide memory resources often have spare CPU cycles. These cycles can be used to decide the prefetching policy and parameters, which relieves the user nodes, often dedicated to heavy computing tasks, of the prefetching work. In contrast to traditional approaches, in which the data to prefetch are chosen by a rather simple algorithm on the user node, such a push-based prefetching scheme can be more effective.
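The push-based idea described above can be illustrated with a minimal Python sketch. It is not the chapter's algorithm: the sequential pattern mining is replaced by a much simpler successor-frequency count, and every name here (`PushPrefetchProvider`, `simulate`, `window`, `min_support`, `max_push`) is hypothetical. The sketch also assumes that the provider is told about every page access, including those already satisfied by a pushed page.

```python
from collections import Counter, defaultdict, deque


class PushPrefetchProvider:
    """Hypothetical memory provider that pushes pages it predicts will be needed.

    Simplified stand-in for sequential pattern mining: it only counts which
    pages tend to follow a given page within a short window.
    """

    def __init__(self, window=2, min_support=2, max_push=2):
        self.window = window            # how far back a predecessor may be
        self.min_support = min_support  # minimum count before a pattern is trusted
        self.max_push = max_push        # cap on pages pushed per access
        self.history = deque(maxlen=window)
        self.successors = defaultdict(Counter)

    def record(self, page):
        # Count `page` as a successor of each page still inside the window.
        for earlier in self.history:
            if earlier != page:
                self.successors[earlier][page] += 1
        self.history.append(page)

    def predict(self, page):
        # Most frequent successors first, filtered by the support threshold.
        ranked = self.successors[page].most_common()
        return [p for p, count in ranked if count >= self.min_support][: self.max_push]

    def handle_access(self, page):
        """Observe one access and return the pages to push to the user node."""
        self.record(page)
        return self.predict(page)


def simulate(trace, provider):
    """Replay a page trace and count accesses served from pushed pages."""
    buffer = set()  # pages already pushed to the user node
    hits = 0
    for page in trace:
        if page in buffer:
            hits += 1
            buffer.discard(page)
        # Assumption: the provider observes every access, hit or miss.
        buffer.update(provider.handle_access(page))
    return hits
```

With a repeating trace such as `[1, 2, 3] * 4` and `PushPrefetchProvider(window=1, min_support=2, max_push=1)`, the provider needs two full passes to reach the support threshold, after which the remaining accesses are served from pushed pages. The prediction logic runs entirely on the provider, so the user node only maintains a buffer of received pages.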
