High Performance Computing, Big Data, and Cloud Computing: The Perfect De Facto Trio or Converging Technological Mantras?

Jose-Luis González-Sánchez (COMPUTAEX Foundation, Spain), Jesús Calle-Cancho (COMPUTAEX Foundation, Spain), David Cortés-Polo (COMPUTAEX Foundation, Spain), Luis-Ignacio Jiménez-Gil (COMPUTAEX Foundation, Spain) and Alfonso López-Rourich (COMPUTAEX Foundation, Spain)
DOI: 10.4018/978-1-5225-7074-5.ch009

Abstract

If the fourth industrial revolution is to be a revolution of values, in which people, more than ever, are at the center of everything, technology may be what enables us to occupy that privileged position. However, there is consensus that the fourth industrial revolution is defined not by a set of emerging technologies in themselves, but by the transition to new systems built on the infrastructure of the digital revolution we have already lived through. The speed of the advances of the last decade, along with their scope and impact on society, has made it clear that we have reached a new technological revolution. The real revolution is convergence, not only of digital technologies but also of physical and biological ones, and it will allow humanity to face the great challenges that have been outstanding for decades or centuries.

Introduction

For quite some time, High Performance Computing (HPC) (Hager, 2011) has contributed to scientific innovation, industrial competitiveness, and rising living standards. The enormous growth in scientific, social and economic data is driving demand for computing resources and for complex simulations and analyses, a demand to which the vast technological ecosystem known as Big Data (BD) (Jin, 2015) is trying to respond. Cloud Computing (CC) (Mell, 2011) provides access to these extremely large volumes of data, some of which are collected almost in real time, and offers virtually unlimited computing power without the need to maintain the underlying infrastructure, software or services.

CC and BD have been de facto partners from the outset, and with the advent of Hadoop, all the advantages of distributed resources can be obtained from a set of physical or virtual machines.

The relationship between CC and HPC was not clear at first. Because one of the aspirations of the Cloud paradigm is to offer infrastructure services, the Cloud could be understood as an alternative to large computing resources and a way to avoid acquiring supercomputers. However, it was not long before HPC infrastructure and services were themselves offered through private and public clouds.

HPC and BD could be seen as a marriage of convenience in an attempt to shift the advantages of distributed systems to clustering infrastructures that support HPC. This approach leads us to the obvious association of Big & Quick Data, or perhaps Big Compute, in an attempt to bring the processing speed of supercomputers to big data projects.

It seems appropriate to focus on the possibilities offered by the expression HPC + BD + CC, whose outcome is data capture, simulation and visualization of results, as the best possible trio. Perhaps this can be seen as an example of the convergence of a broad set of technological mantras (Open Data, IoT, M2M or Free Software) to process large volumes of a wide variety of data at high speed (machine learning, social technology, business intelligence, sensors or mobile apps).

These mantras offer many new ways to reach and capture massive volumes of data that were not even imagined before, making it possible to connect distant elements with one another and to create metadata from which new knowledge can be extrapolated.

Cyber-physical systems are facilitating automation through the IoT and cloud computing, but it is clear that real infrastructure is needed to offer every existing type of cloud from a technological point of view. Data processing centers play a fundamental role in supporting the IoT and the virtualization of the hardware that actually hosts the content that big data techniques make it possible to analyze. Increasingly, they require HPC to process, store and access the large volumes of information that we are generating and to convert them into the desired knowledge.

Once the concepts and advantages of each paradigm, alone and in combination, have been clearly defined, case studies can illustrate the multiplier effect produced by the convergence of these technologies and show that HPC, BD and CC, separately or together, can address the opportunities and challenges of the 21st century.

Key Terms in this Chapter

Cloud Computing: A proposal to provide a pool of configurable computing resources through on-demand network access that can be rapidly provisioned and released with minimal management effort or service provider interaction.

Data Mining: The process of discovering patterns in large data sets. It uses methods from artificial intelligence, machine learning, statistics, and database systems.

Supercomputing: Processing of massively complex or data-intensive problems using the concentrated compute resources of multiple computer systems working in parallel (a supercomputer).

Internet of Things: The ever-growing network of physical objects that feature internet connectivity and the communication that occurs between these and internet-enabled devices and systems.

Machine Learning: A scientific discipline in the field of artificial intelligence that creates systems that learn automatically. Learning in this context means identifying complex patterns in millions of data points.

High Performance Computing: The use of supercomputers and parallel processing techniques for solving complex computational problems.

Big Data: A voluminous amount of structured, semi-structured, and unstructured data from various sources around us, such as the Internet, sensors, nature, and society.

High Performance Data Analysis: HPC and BD convergence.

Open Data: Philosophy and practice that seeks to make certain data freely available to everyone, without restrictions of copyright, patents, or other control mechanisms.

Big and Quick Data: The big data that we propose to obtain as a result of the pairing of BD with HPC.
