Parallel Programming Models and Systems for High Performance Computing

Manjunath Gorentla Venkata (Oak Ridge National Laboratory, USA) and Stephen Poole (Oak Ridge National Laboratory, USA)
Copyright: © 2015 | Pages: 39
DOI: 10.4018/978-1-4666-8213-9.ch008


A parallel programming model is an abstraction of a parallel system that allows the expression of both algorithms and shared data structures. To accommodate the diversity of parallel system architectures and user requirements, there are a variety of programming models, including models that provide a shared memory view or a distributed memory view of the system. Programming models are implemented as libraries, language extensions, or compiler directives. This chapter provides a discussion of programming models and their implementations, aimed at application developers, system software researchers, and hardware architects. The first part provides an overview of the programming models. The second part is an in-depth discussion of the high-performance networking interfaces used to implement a programming model. The last part of the chapter discusses the implementation of a programming model with a case study. Each part of the chapter concludes with a discussion of current research trends and their impact on future architectures.
1. Overview Of Current And Emerging Programming Models

Parallel systems, also called supercomputers or clusters, are used for executing a wide range of computationally intensive scientific simulations, such as weather forecasting, ocean modeling, combustion simulation, molecular modeling, and physical system simulations. They have evolved from systems built by combining a few processors to systems with many thousands of nodes and millions of computing cores in total. The predominant architectures of these systems have been either shared memory (computing nodes with access to the same global memory) or distributed memory (computing nodes with access only to their own private memory).

Applications employ parallel computation techniques to exploit the parallelism in these systems and arrive at a solution quickly. Parallel computation involves dividing the computation into separate tasks and mapping each task to an execution context, which is typically managed by an Operating System (OS) process or a thread. A majority of these applications require coordination between the various computation tasks to arrive at the solution. This coordination involves exchanging data and intermediate results, synchronizing all or some of the processes, spawning new computation, and restarting some computation.

Achieving parallelism on these systems presents numerous challenges: parallelizing the algorithm across millions of computing cores, controlling the parallel execution contexts, optimizing the computation to reduce communication overhead, handling distributed errors and failures, and optimizing for power usage. These challenges will only be exacerbated as we move toward extreme-scale systems.

Programming models aim to address these challenges by providing concise, efficient, and scalable abstractions for expressing parallel algorithms and shared data structures. They achieve this by making trade-offs and design choices about how data is distributed and viewed, how execution contexts are mapped to physical (system) resources, how execution contexts are synchronized, and how communication is expressed and controlled. As a consequence of these design decisions, programming models vary in the abstractions they provide to the user and in the system architecture best suited to making those abstractions scalable and high performing. The common variations of these programming models are distributed memory, shared memory, Partitioned Global Address Space (PGAS), and hybrid models.
