Flash-Aware Buffer Management for Database Systems

Flash-Aware Buffer Management for Database Systems

Yi Ou (Department of Computer Science, University of Kaiserslautern, Kaiserslautern, Germany), Peiquan Jin (University of Science & Technology of China, Hefei, China) and Theo Härder (Department of Computer Science, University of Kaiserslautern, Kaiserslautern, Germany)
Copyright: © 2013 |Pages: 18
DOI: 10.4018/ijkbo.2013100102
OnDemand PDF Download:
$30.00
List Price: $37.50

Abstract

Classical buffer replacement policies, e. g., LRU, are suboptimal for database systems having flash disks for persistence, because they are not aware of the distinguished characteristics of flash-based storage devices. The authors introduce the basic principles of buffer management for such devices and present two efficient buffer algorithms that apply these principles. These algorithms significantly improve the performance of flash-based databases, as confirmed by our trace-driven performance study.
Article Preview

Introduction

Flash disks are playing an increasingly important role for server-side computing, because, compared to magnetic disks, they are much more energy efficient and they have no mechanical parts and, therefore, hardly any perceptible latency (Graefe, 2007). Typically, flash disks are managed by the operating system as block devices through the same interface types as those to magnetic disks. However, the distinguished performance characteristics of flash disks make it necessary to reconsider the design of DBMSs, for which the I/O performance is critical.

Device Performance Characteristics

The most important building blocks of flash disks are flash memory and flash translation layer (FTL). Logical block addresses are mapped by the FTL to varying locations on the physical medium. This mapping is required due to the intrinsic limitations of flash memory (Woodhouse, 2001; Gal & Toledo, 2005). The FTL implementation is device-related and supplied by the disk manufacturer. Many efforts are made to systematically benchmark the performance of flash disks (Gray & Fitzgerald, 2008; Härder, Schmidt, Ou, & Bächle, 2009; Bouganim, 2009). Using these observations, the most important conclusions of these benchmarks are:

  • For sequential read-or-write workloads, flash disks often achieve a performance comparable to high-end magnetic disks.

  • For random workloads, the performance asymmetry of flash disks and their difference to magnetic disks is significant: random reads are typically two orders of magnitude faster than those on magnetic disks, while random writes on flash disks are often even slower than those on magnetic disks. As an example, some flash disks achieve 12,000 IOPS for random reads and only 130 IOPS for random writes of 4 KB blocks (MTRON, 2008), while high-end magnetic disks typically have more than 200 IOPS for random I/O(Gray & Fitzgerald, 2008).

  • Due to the employment of device caches and further optimizations in the FTL (Kim, Kim, Noh, Min, & Cho, 2002; Lee, Park, Chung, Lee, Park, & Song, 2007), page-level writes with strong spatial locality can be served by flash disks more efficiently than write requests without locality. In our context, spatial locality refers to the property of contiguously accessed DB pages being physically stored close to each other.

Interestingly, many benchmarks show that flash disks can handle random writes with larger request sizes more efficiently. For example, the bandwidth of random writes using units of 128 KB is more than an order of magnitude higher than writing at units of 8 KB. In fact, a write request of, say 128 KB, is internally mapped to 64 sequential writes of 2-KB flash pages inside a flash block. Note that sequential access is an extreme case of high spatial locality.

The Problem

Flash disks are considered an important alternative to magnetic disks. Therefore, we focus here on the problem of buffer management for DBMSs having flash disks as secondary storage. If we denote the sequence of n logical I/O requests as X, a buffer management algorithm A is a function that maps X and a buffer with b pages into a sequence of m physical I/O requests , mn, i. e., A(X,b)=Y.

Let C(Y) denote the accumulated time necessary for a storage device to serve Y, we have C(Y) = C(A(X,b)). Given a sequence of logical I/O requests X, a buffer with b pages, and a buffer management algorithm A, we say A is optimal, iff for any other algorithm ,

.

For magnetic disks, C(Y) is often assumed to be linear to |Y|. Clearly, this assumption does not hold for flash disks, because C heavily depends on the write/read ratio and the write patterns of Y. Therefore, each I/O request, either logical or physical, has to be represented as a tuple of the form (op, pageId), where op is either “R” (for a read request) or “W” (for a write request).

While the above formalization defines our problem, our goal is not to find the optimal algorithm in theory, but a practically applicable one that has acceptable runtime overhead and minimizes I/O cost as far as possible.

Complete Article List

Search this Journal:
Reset
Open Access Articles: Forthcoming
Volume 8: 4 Issues (2018): 1 Released, 3 Forthcoming
Volume 7: 4 Issues (2017)
Volume 6: 4 Issues (2016)
Volume 5: 4 Issues (2015)
Volume 4: 4 Issues (2014)
Volume 3: 4 Issues (2013)
Volume 2: 4 Issues (2012)
Volume 1: 4 Issues (2011)
View Complete Journal Contents Listing