The Future of High-Performance Computing (HPC)

Copyright: © 2018 | Pages: 14
DOI: 10.4018/978-1-5225-2255-3.ch347

Abstract

For decades, HPC has been an essential tool for discoveries, innovations, and new insights in science, research and development, engineering, and business across a wide range of application areas in academia and industry. Today, High-Performance Computing is also well recognized to be of strategic and economic value: HPC matters and is transforming industries. This article discusses emerging technologies being developed for all areas of HPC (compute/processing, memory and storage, interconnect fabric, I/O, and software) to address ongoing challenges such as balanced architecture, energy-efficient high performance, density, reliability, sustainability, and, last but not least, ease of use. Of specific interest are the challenges and opportunities for the next frontier in HPC, envisioned around the 2020 timeframe: ExaFlops computing. We also outline the emerging area of High-Performance Data Analytics (Big Data analytics using HPC) and discuss an emerging delivery mechanism for HPC: HPC in the Cloud.

Introduction

High-Performance Computing (HPC) is used to address and solve the world’s most complex computational problems. For decades, HPC has established itself as an essential tool for discoveries, innovations and new insights in science, research and development, engineering and business across a wide range of application areas in academia and industry. It has become an integral part of the scientific method – the third leg along with theory and experiment.

Today, High-Performance Computing is also well recognized to be of strategic and economic value – HPC matters and is transforming industries (Osseyran & Giles, 2015).

High-Performance Computing enables scientists and engineers to solve large, complex science, engineering, and business problems using advanced algorithms and applications that require very high compute capability, fast memory and storage, high bandwidth and low latency, high-fidelity visualization, and enhanced networking.

Today, the IT industry is being transformed by cloud, big data, social media, artificial intelligence, and “Internet of Things” technologies and business models. All of these trends require advanced computational simulation models and powerful, highly scalable systems. Hence, sophisticated HPC capabilities are critical to organizations and companies that want to establish and strengthen leadership positions in their respective areas.

Some industry verticals and application areas where HPC is used are as follows:

  • Manufacturing, Computer Aided Engineering (CAE)

  • Automotive Industry

  • Aerospace Industry

  • Weather Forecast and Climate Research

  • Energy, Oil & Gas Industry, Geophysics

  • Life-Science and Bio-Informatics (Genomics)

  • Government Research Laboratories

  • Universities (Academics), Machine Intelligence, Machine/Deep Learning, Artificial Intelligence (AI)

  • Astrophysics, High-Energy Physics, Computational Chemistry, Material Science

  • Financial Services Industry (FSI)

  • Digital Content Creation (DCC)

  • Defense

  • Security and Intelligence

Background

After its initial years of proprietary computer systems in the 1970s and 1980s, HPC evolved around industry standards that democratized supercomputing, making advanced computing available to more users and wider application segments.

Today’s HPC solutions use high-performance server compute nodes linked by high-performance fabrics to high-performance storage systems. They are mainly deployed as distributed cluster architectures running Linux-based operating systems, with up to tens of thousands of processors. For specific workloads and applications that need a large, coherent shared memory capacity (terabytes of data, TB), more specialized systems based on ccNUMA (Cache-Coherent Non-Uniform Memory Access) architectures are used. For example, the SGI UV system supports up to 256 CPU sockets and up to 64 TB of cache-coherent shared memory in a single system.

In the past, chip designs were limited by space and the number of transistors available; now power consumption is becoming the main constraint for High-Performance Computing. Several emerging technologies offer opportunities to address ongoing challenges in HPC such as balanced architectures, energy efficiency, density, reliability, resiliency, sustainability, and, last but not least, ease of use.
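To make the power constraint concrete, the following minimal sketch relates a target compute rate to the electrical power it would need at a given energy efficiency. The function name `required_power_mw` and the efficiency values are illustrative assumptions, not figures from this chapter; only the relation power = performance / (performance per watt) is used.

```python
EXAFLOPS = 1e18  # 10^18 floating-point operations per second


def required_power_mw(target_flops: float, gflops_per_watt: float) -> float:
    """Return the power in megawatts needed to sustain target_flops.

    gflops_per_watt is the assumed system energy efficiency.
    """
    if gflops_per_watt <= 0:
        raise ValueError("efficiency must be positive")
    watts = target_flops / (gflops_per_watt * 1e9)  # GFLOPS/W -> FLOPS/W
    return watts / 1e6  # W -> MW


# Hypothetical efficiencies, chosen only to show the scaling.
for efficiency in (5.0, 20.0, 50.0):
    power = required_power_mw(EXAFLOPS, efficiency)
    print(f"{efficiency:5.1f} GFLOPS/W -> {power:6.1f} MW")
```

Under these assumed values, a tenfold gain in efficiency cuts the power needed for one ExaFlops by the same factor, which may help explain why energy efficiency sits alongside raw performance among the challenges listed above.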

Key Terms in this Chapter

HPC: High-Performance Computing, solving the world’s hardest computational problems.

HDD: Hard Disk Drive, a data storage device used for storing and retrieving digital information using rapidly rotating disks (platters) coated with magnetic material.

DSP: Digital Signal Processing, various techniques for improving the accuracy and reliability of digital communications; also used to accelerate regularly structured, simple parallel computational tasks.

FLOPS: Floating-Point Operations per Second, a performance unit for computer processing capabilities.

DDR4: Double Data Rate 4, the latest generation of DRAM technology at the time of writing.

NAND: Negative-AND, a type of flash memory based on NAND logic gates; a non-volatile storage technology that does not require power to retain data.

CPU: Central Processing Unit, a general purpose integrated circuit chip, the brains of a computer where most calculations take place.

SDI: Software Defined Infrastructure, a computing infrastructure entirely under the control of software with no operator or human intervention.

PFLOPS: Petaflops, 10^15 floating-point operations per second, a compute performance unit.

SIMD: Single Instruction Multiple Data, a technology to perform the same operation on multiple data points simultaneously to exploit data level parallelism.

MW: Megawatt, a unit of power equal to one million watts.

IOPS: Input/Output Operations per second, a data I/O performance unit.

TFLOPS: Teraflops, 10^12 floating-point operations per second, a compute performance unit.

GFLOPS: Gigaflops, 10^9 floating-point operations per second, a compute performance unit.

DRAM: Dynamic Random Access Memory, a memory chip that depends upon an applied voltage to keep the stored data.

IoT: Internet of Things, a network of physical objects that feature an IP address for internet connectivity, and the communication that occurs between these objects and other Internet-enabled devices and systems.

OpenCL: Open Computing Language, a programming framework for writing software that executes across heterogeneous computing platforms.

Cluster: A computer cluster consists of a set of loosely or tightly connected computers (nodes) that work together so that, in many respects, they can be viewed as a single system, controlled by software.

GPU: Graphics Processing Unit, a single-chip special-purpose processor primarily used to manage and boost the performance of video and graphics; also used to accelerate regularly structured, simple parallel computational tasks.

DDR: Double Data Rate, a type of volatile DRAM that transfers data on both the falling and rising edges of the clock signal.

PCI: Peripheral Component Interconnect, a standard for connecting computers and their peripherals.

MPI: Message Passing Interface, a library specification for message-passing between cluster nodes to program parallel applications.

FPGA: Field-Programmable Gate Array, an integrated circuit designed to be configured and re-programmed by a customer or a designer after manufacturing.

KW: Kilowatt, a unit of power equal to one thousand watts.

OpenMP: Open Multi-Processing, an application programming interface that supports multi-platform shared memory multiprocessing programming used to program parallel multi-threaded software.

NVM: Non-Volatile Memory, a type of persistent computer memory that can retrieve stored information even after having been power cycled (turned off and back on).
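The performance units defined above differ only by powers of ten. As a small illustration, the sketch below converts a raw operations-per-second figure into the largest fitting unit. The helper name `format_flops` is hypothetical, and the EFLOPS entry (10^18, the ExaFlops scale mentioned in the abstract) is an assumption added here, since the term list itself stops at PFLOPS.

```python
# (scale, unit name), ordered from largest to smallest.
UNITS = [
    (1e18, "EFLOPS"),  # assumed: exaflops, not defined in the term list
    (1e15, "PFLOPS"),
    (1e12, "TFLOPS"),
    (1e9, "GFLOPS"),
]


def format_flops(flops: float) -> str:
    """Express a floating-point operation rate in the largest fitting unit."""
    for scale, name in UNITS:
        if flops >= scale:
            return f"{flops / scale:.2f} {name}"
    return f"{flops:.0f} FLOPS"


print(format_flops(2.5e15))  # a rate in the petaflops range
print(format_flops(3e9))     # a rate in the gigaflops range
```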
