Big Data in Massive Parallel Processing: A Multi-Core Processors Perspective

Vijayalakshmi Saravanan (The State University of New York at Buffalo, USA), Anpalagan Alagan (Ryerson University, Canada) and Isaac Woungang (Ryerson University, Canada)
DOI: 10.4018/978-1-5225-3142-5.ch011

Abstract

With the advent of novel wireless technologies and Cloud Computing, large volumes of data are being produced by heterogeneous devices such as mobile phones, credit cards, and computers. Managing this data has become the de facto challenge in current information systems. Although processor speeds are no longer doubling at the pace associated with Moore's law, processing power continues to grow rapidly, which gives rise to new data-intensive scientific problems in every field, especially in the Big Data domain. The Big Data revolution lies in improved statistical analysis and computational power, both of which depend on processing speed. Hence, putting massively multi-core systems to work is vital in order to overcome the physical limits of complexity and speed. Doing so also raises many challenges, such as difficulties in capturing massive applications, data storage, and analysis. This chapter discusses some of the architectural challenges of Big Data from the perspective of multi-core processors.
Chapter Preview

Background

The Big Data paradigm has evolved rapidly in recent years. According to Manyika et al. (2011), the main characteristics of Big Data, shown in Figure 1, are described as follows:

Key Terms in this Chapter

Data Mining: Computational process and technique of discovering useful patterns in a large dataset.

Clustering: Technique that consists of conceptually devising meaningful groups of objects that share some common characteristics. This involves dividing the objects into groups and performing classification, that is, assigning individual objects to these groups.
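
As an illustration of the two steps named in this definition (forming groups and assigning objects to them), the following is a minimal sketch that uses k-means on one-dimensional points. K-means is only one of many clustering techniques and is chosen here as an assumption for illustration; the function name kmeans_1d, the sample points, and the starting centers are all hypothetical.

```python
from statistics import mean

def kmeans_1d(points, centers, iterations=10):
    """Group 1-D points around starting centers (hypothetical helper)."""
    centers = list(centers)
    groups = []
    for _ in range(iterations):
        # Assignment step: each point joins the group of its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        # Update step: each center moves to the mean of its group.
        # An empty group keeps its previous center.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Illustrative data: two visibly separate clumps of values.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers, groups = kmeans_1d(points, centers=[0.0, 5.0])
print(groups)
```

With these sample values, the low readings and the high readings end up in separate groups, and each center settles near the middle of its group.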

Embarrassingly Parallel Workload: Workload where little or no effort is needed to divide a task into a number of parallel tasks.
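
A minimal sketch of such a workload, assuming a sum of squares as the task: the input is cut into independent chunks, each chunk is processed without any data from the others, and the partial results are combined at the end. The helper names split and sum_of_squares, the data, and the worker count are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def sum_of_squares(chunk):
    """Process one chunk; no data from any other chunk is needed."""
    return sum(x * x for x in chunk)

def split(data, parts):
    """Cut data into roughly equal, independent slices (hypothetical helper)."""
    size = -(-len(data) // parts)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]

data = list(range(1, 101))
chunks = split(data, parts=4)

# Threads keep the sketch short. For CPU-bound work in CPython,
# ProcessPoolExecutor provides the same map() interface.
with ThreadPoolExecutor(max_workers=4) as pool:
    partial_sums = list(pool.map(sum_of_squares, chunks))

# The only coordination is this final combination of partial results.
total = sum(partial_sums)
print(total)
```

Dividing the task takes a single slicing helper, and the workers never communicate with one another, which is what makes the workload embarrassingly parallel.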

Data Intensive Computing: Class of parallel computing applications that use a data parallel approach to process large volumes of data.

Big Data: Large volumes of structured and unstructured data, whether homogeneous or heterogeneous. In general, it is not the amount of data that matters the most but what organizations intend to do with the data. In this sense, the goal of Big Data analysis is to provide insights that may lead to better decisions and strategic business support.

Neuromorphic Computing: Concept that describes the use of very-large-scale integration (VLSI) systems containing electronic analog circuits to mimic the neurobiological architectures present in the nervous system.

Quantum Computing: Area of computing that focuses on the study of quantum computers, which are theoretical computation systems that exploit quantum-mechanical phenomena such as superposition and entanglement.

Cloud Computing: Paradigm and technology designed to extend the capability of a mobile device or user application beyond its physical boundaries by wirelessly transferring the computation burden over the Internet to resource-rich data centers, where the computationally intensive tasks are processed in virtual machines. The service is operated on a pay-for-use basis.

Data Parallelism: Simultaneous execution of the same function on multiple cores across the elements of a dataset. This is closely related to the Single Instruction, Multiple Data (SIMD) model.
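
The definition can be sketched as one function mapped over every element of a dataset by a pool of workers. This is a simplified illustration, not hardware-level SIMD; the function to_celsius and the sample readings are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def to_celsius(fahrenheit):
    """The single function applied to every element of the dataset."""
    return (fahrenheit - 32) * 5 / 9

readings = [32, 212, 50]  # illustrative Fahrenheit values

# map() hands the elements to the workers and returns the results in
# input order. Swapping in ProcessPoolExecutor spreads the same calls
# across cores for CPU-bound functions.
with ThreadPoolExecutor(max_workers=3) as pool:
    celsius = list(pool.map(to_celsius, readings))

print(celsius)
```

The program contains no per-element control flow: the dataset is the only thing that varies between workers, which is the defining trait of the data-parallel approach.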

Granular Computing: Emerging computing paradigm of information processing concerned with the processing of complex information entities (so-called information granules).
