Flashing in the Cloud: Shedding Some Light on NAND Flash Memory Storage Systems

Jalil Boukhobza (University of Western Brittany, France)
Copyright: © 2013 |Pages: 26
DOI: 10.4018/978-1-4666-3934-8.ch015


Data and storage systems are among the most important issues to tackle in cloud computing. Performance (in terms of data transfer and energy cost), predictability, and scalability are the main challenges researchers face, and new techniques for storing, managing, and accessing huge amounts of data are required to make cloud computing feasible. With the emergence of flash memories in mass storage systems, and the advantages they offer in speed and power efficiency compared with traditional disks, storage system architectures must be rethought accordingly. Indeed, the integration of flash memories is considered a key technology for improving the performance of data-centric computing. The purpose of this chapter is to introduce flash memory storage systems by focusing on their specific architectures and algorithms, and then on their integration into servers and data centers.
Chapter Preview


The amount of data generated today, in its various forms, is growing exponentially; in fact, it is growing faster than Moore’s law. For instance, the online data indexed by Google increased by a factor of more than 56 from 2002 to 2009 (from 5 to 280 exabytes) (Ranganathan, 2011). This trend is not limited to Web data: enterprise data volume is also growing very fast, with a cumulative growth rate of 173% (Ranganathan, 2011; Winter, 2008).

In a recent study (Farmer, 2010), EMC forecast that the amount of digital information created annually would grow by a factor of 44 from 2009 to 2020. It also predicted that, by 2020, more than one third of all digital information created annually would live in or pass through the cloud, which underlines the need for huge additional storage capacity.

Applications are becoming more and more data intensive: transaction processing, email, search engines, TV, social networking, video and photo sharing, etc. These data-centric applications can operate on data in many ways, such as capturing, analyzing, processing, classifying, and archiving (Ranganathan, 2011), which puts more and more stress on the performance of I/O storage systems. This tendency is also confirmed in scientific computing, in which understanding phenomena relies on large-scale data analysis (Gray, 2006a).

In 2006, data centers in the US consumed more than 1.5% of the energy generated that year, and the percentage is estimated to increase by 18% per year (Zhang, Chang & Boutaba, 2010). Moreover, the storage system represents 20 to 40% of total energy consumption in typical data centers (Carter & Rajamani, 2010). Consequently, in addition to performance and capacity considerations, energy efficiency has become one of the most critical metrics to consider in order to address the increasing cost of operating a data center (Roberts, Kgil & Muldge, 2009b).

To partly reduce energy costs, big companies such as Microsoft, Google, and Yahoo have built new data centers near large, cost-efficient power sources (Vasudevan et al., 2010). In fact, data centers are increasingly driven by the management of their limited energy budget (Carter & Rajamani, 2010).

Given that only 5% of the world’s data is digitized (Lyman & Varian, 2003; Ranganathan, 2011), the growth in capacity, performance, and energy needs will continue for many years.

The confluence of these trends calls for rethinking traditional architectures and memory hierarchies, and leaves ample room for challenging research in the area of storage systems.

Among the “10 obstacles and opportunities for cloud computing” discussed by Armbrust et al. (2010), three are directly related to data and storage systems: 1) data transfer bottlenecks, 2) performance unpredictability due to disk I/O performance, and 3) storage scalability.

The traditional approach to building high-performance, high-capacity storage systems consists of using arrays of disk drives and distributing data across the disks to provide parallel I/O, along with error detection and correction to provide fault tolerance. This approach has performance and energy limitations, huge amounts of data remain to be stored and managed, and disks face technological limits (heat dissipation limits how far disk RPM can be raised to enhance performance). Integrating non-volatile memories, especially flash memories, which are the most popular and mature candidate, can therefore be part of the answer.

In fact, advances in the adoption of flash memories are considered “the most important technology changes pertinent to data centric computing” (Ranganathan, 2011).
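
As a point of reference for the traditional disk-array approach mentioned above (distributing data across disks, with redundancy for fault tolerance), the following is a minimal Python sketch. It is not taken from the chapter: it assumes a simple layout with one dedicated XOR parity block per stripe, and the function names (`xor_blocks`, `stripe_with_parity`, `rebuild_block`) are hypothetical, chosen only for illustration. Real disk arrays use more elaborate layouts and error-correcting codes.

```python
from functools import reduce
from typing import List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Return the byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe_with_parity(data: bytes, n_data_disks: int, block_size: int) -> List[List[bytes]]:
    """Split data into stripes of n_data_disks data blocks plus one parity block.

    Assumption for this sketch: a dedicated parity block closes each stripe,
    and the data is zero-padded to fill a whole number of stripes.
    """
    stripe_bytes = n_data_disks * block_size
    padded = data + b"\x00" * (-len(data) % stripe_bytes)
    stripes = []
    for offset in range(0, len(padded), stripe_bytes):
        blocks = [
            padded[offset + i * block_size: offset + (i + 1) * block_size]
            for i in range(n_data_disks)
        ]
        # The parity block is the XOR of all data blocks in the stripe.
        stripes.append(blocks + [xor_blocks(blocks)])
    return stripes


def rebuild_block(stripe: List[Optional[bytes]]) -> bytes:
    """Recover the single missing block (marked None) from the survivors."""
    survivors = [block for block in stripe if block is not None]
    return xor_blocks(survivors)


if __name__ == "__main__":
    stripes = stripe_with_parity(b"flash memory!", n_data_disks=3, block_size=4)
    damaged = list(stripes[0])
    lost = damaged[1]
    damaged[1] = None  # simulate the failure of one disk
    assert rebuild_block(damaged) == lost
```

Each block of a stripe would sit on a different disk, so the blocks can be read in parallel, and the XOR parity allows any single lost block in a stripe to be recomputed from the remaining ones.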
