Database Techniques for Multi Cores and Big Memory

Xiongpai Qin, Biao Qin, Cuiping Li, Hong Chen, Xiaoyong Du, Shan Wang
Copyright © 2014 | Pages: 10
DOI: 10.4018/978-1-4666-5202-6.ch062

Chapter Preview


Background

This section gives a brief introduction to some new hardware technologies that database systems could leverage.

Multi Core CPU

Improving the performance of a CPU by increasing its clock frequency has become more and more difficult. Researchers and engineers have therefore sought new ways to improve CPU performance, and multi core technology is the result.

In a typical multi core CPU, 2, 4, 8, or more cores are integrated on a chip. Each core has its own private Level 1 (L1) cache, while larger but slower caches (the L2 cache) are shared among the cores. The cores access the shared main memory for parallel data processing. Putting several cores on a die allows higher communication speeds between the cores, which benefits many computing tasks.

The amount of performance gained by using multi core CPUs depends on the problem being solved and the algorithms used. Many applications, such as database systems, are already multi-threaded and can benefit from multi core CPUs immediately, without any modification to existing software (a simple sketch of such host-side parallelism follows). However, much work remains before software can fully utilize the cores to boost performance.
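To make this concrete, here is a minimal host-side sketch, in C++ as compiled by CUDA's nvcc or any host compiler, of splitting a scan across cores with one software thread per core; the function name and the way work is divided are illustrative assumptions, not code from the chapter.

#include <cstdint>
#include <numeric>
#include <thread>
#include <vector>

// Illustrative sketch: sum an array by giving each core its own slice.
int64_t parallel_sum(const std::vector<int32_t>& data, unsigned num_threads) {
    std::vector<int64_t> partial(num_threads, 0);
    std::vector<std::thread> workers;
    const size_t chunk = data.size() / num_threads;
    for (unsigned t = 0; t < num_threads; ++t) {
        const size_t begin = t * chunk;
        const size_t end = (t == num_threads - 1) ? data.size() : begin + chunk;
        workers.emplace_back([&data, &partial, begin, end, t] {
            // Each worker sums its own slice into its own slot.
            partial[t] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, int64_t{0});
        });
    }
    for (auto& w : workers) w.join();  // wait for every core to finish
    return std::accumulate(partial.begin(), partial.end(), int64_t{0});
}

Each worker writes only its own slot of the partial array, so no locking is needed; setting num_threads to the number of cores (e.g. std::thread::hardware_concurrency()) is the usual starting point.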

Figure 1. Diagram of a generic dual core CPU (Wikipedia-a, 2013)

GPGPU

The GPU is traditionally used to accelerate the specific task of graphics rendering. GPU vendors have integrated many computing units on a single die and optimized memory bandwidth for processing large volumes of graphics data.

The huge computational power of the GPU has been exploited for other tasks as well, turning the GPU into the GPGPU (General Purpose GPU). GPU vendors have recognized the value of this trend: NVIDIA, one of the major GPU manufacturers, provides CUDA (Compute Unified Device Architecture), an SDK for easily programming the GPU for general tasks. Since the GPU is designed primarily for graphics processing rather than general computation, its architecture is rather different from that of a CPU. Taking NVIDIA CUDA as an example, it has its own unique thread hierarchy and memory hierarchy.

The thread hierarchy consists of four levels. (a) The Grid is the first level of the hierarchy: a group of one or more blocks. A grid is created for each CUDA kernel function. (b) The next level is the Block, a user defined group of 1 to 512 threads; each block is identified by a blockIdx. (c) The third level is the Warp, a scheduling unit of up to 32 threads. (d) The final level is the Thread, which is distributed by the CUDA runtime and identified by a threadIdx.
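As a minimal illustration of this indexing scheme, the kernel sketch below scales an array; the kernel name, the block size of 256, and the guard against out-of-range threads are illustrative assumptions rather than code from the chapter.

__global__ void scale(float* v, float factor, int n) {
    // Global element index: the block's offset within the grid
    // (blockIdx.x * blockDim.x) plus the thread's position in the block.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) v[i] *= factor;  // threads past the end of the array do nothing
}

// Host-side launch: a grid of ceil(n / 256) blocks of 256 threads each.
// scale<<<(n + 255) / 256, 256>>>(d_v, 2.0f, n);

The runtime groups the 256 threads of each block into warps of 32 for scheduling, which is why block sizes are normally chosen as multiples of 32.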

The CUDA platform has three primary memory types in its memory hierarchy. (a) Local Memory is per-thread memory for local variables and register spilling. (b) Shared Memory is per-block, low latency memory that allows intra-block data sharing and synchronization. (c) Global Memory is the device level memory that may be shared between blocks or grids. CUDA also provides a constant cache and a texture cache for fast access to certain data; these caches are read only and offer faster access than shared memory. In CUDA, data can be copied from one memory type to another, and to and from main memory, so that the GPU can process it.
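The sketch below illustrates that flow for a toy kernel that reverses each 256-element tile of an array in place; the kernel name, the tile size, and the assumption that n is a multiple of 256 are all illustrative.

#include <cuda_runtime.h>

__global__ void reverse_tile(int* d_data) {
    __shared__ int tile[256];                      // per-block shared memory
    int i = threadIdx.x;
    tile[i] = d_data[blockIdx.x * 256 + i];        // global -> shared
    __syncthreads();                               // intra-block synchronization
    d_data[blockIdx.x * 256 + i] = tile[255 - i];  // shared -> global
}

void reverse_all(int* h_data, int n) {             // n assumed a multiple of 256
    int* d_data;
    cudaMalloc(&d_data, n * sizeof(int));
    cudaMemcpy(d_data, h_data, n * sizeof(int), cudaMemcpyHostToDevice);  // host -> global
    reverse_tile<<<n / 256, 256>>>(d_data);
    cudaMemcpy(h_data, d_data, n * sizeof(int), cudaMemcpyDeviceToHost);  // global -> host
    cudaFree(d_data);
}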

Key Terms in this Chapter

Multi Core CPU: Several cores manufactured on a single die, providing higher performance through parallelism.

Partition Attributes Across (PAX): A table is first horizontally partitioned into pages; within each page, all values of each attribute are grouped together, which greatly improves cache performance.

Cache Oblivious: Refers to database data/index layout and processing techniques that are independent of the parameters of the memory hierarchy.

N-ary Storage Model (NSM): Stores records contiguously starting from the beginning of each disk page, and uses an offset (slot) table at the end of the page to locate the beginning of each record (row). NSM has poor cache performance because it loads the cache with unnecessary attributes.

Cache Sensitive (Cache Conscious): Refers to database data/index layout and processing techniques that adapt to the parameters of the memory hierarchy in order to accelerate database operations.

Decomposition Storage Model (DSM): Vertically partitions an n-attribute relation into columns, each of which is accessed only when queries need it (see the layout sketch after these key terms).

General Purpose Graphics Processing Unit (GPU/GPGPU): The GPU is used not only for graphics processing, but also for other tasks such as data processing.
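To make the three storage layouts concrete, the schematic below contrasts how a two-attribute relation might be arranged under NSM, DSM, and PAX; the attribute names and the page capacity of 512 records are hypothetical.

// NSM: whole records stored contiguously within a page (row store);
// scanning one attribute drags the other into the cache as well.
struct NsmPage { struct Row { int id; float price; } rows[512]; };

// DSM: each attribute stored as its own column, read only when needed.
struct DsmColumns { int ids[512]; float prices[512]; };

// PAX: pages hold the same 512 records as NSM, but inside the page the
// values of each attribute are grouped into mini-pages, so scanning one
// attribute touches contiguous cache lines.
struct PaxPage { int id_minipage[512]; float price_minipage[512]; };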
