Runtime Adaptation Techniques for HPC Applications

Edgar Gabriel
Copyright: © 2010 |Pages: 23
DOI: 10.4018/978-1-60566-661-7.ch025

Abstract

This chapter discusses runtime adaptation techniques targeting high-performance computing applications. To exploit the capabilities of modern high-end computing systems, applications and system software must be able to adapt their behavior to hardware and application characteristics. Using the Abstract Data and Communication Library (ADCL) as the driving example, the chapter shows the advantage of using adaptive techniques to exploit characteristics of the network and of the application. This makes it possible to reduce the execution time of applications significantly and to avoid maintaining different architecture-dependent versions of the source code.
Chapter Preview

Introduction

High Performance Computing (HPC) has reshaped science and industry in many areas. Recent groundbreaking achievements in biology, drug design and medical computing would not have been possible without the use of massive computational resources. However, software development for HPC systems currently faces significant challenges, since many of the software technologies applied in the last ten years have reached their limits. Very few applications can efficiently use several thousand processors or sustain a performance of multiple teraflops, and those that can are usually the result of many person-years of optimization for a particular platform. These optimizations are, however, often not portable. For example, an application optimized for a commodity PC cluster often performs poorly on an IBM Blue Gene or the NEC Earth Simulator. Among the problems application developers face is the wide variety of available hardware and software components, such as

  • Processor type and frequency, number of processors per node and number of cores per processor,

  • Size and performance of the main memory, cache hierarchy,

  • Characteristics and performance of the network interconnect,

  • Operating system, device drivers and communication libraries,

and the influence of each of these components on the performance of their application. Hence, an end-user faces a unique execution environment on each parallel machine they use. Even experts struggle to fully understand the correlations between hardware and software parameters of the execution environment and their effect on the performance of a parallel application.

Key Terms in this Chapter

Static Tuning: Tuning of a code sequence or function before executing the real application.

Dynamic Tuning: Tuning of a code sequence or function during the execution of the real application.

Adaptive Applications: Applications capable of changing their behavior, switching to alternate code sections, or changing the values of certain parameters at runtime in response to different input data or changing conditions.

Decision Algorithms: Algorithms used to compare different versions of the same functionality while executing the application, with respect to a particular metric such as execution time.
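
To make the terms "dynamic tuning" and "decision algorithm" concrete, the following is a minimal sketch of one possible brute-force decision algorithm: each alternative version of a function is timed during the first few real calls, and the fastest one is used from then on. It is an illustration only and does not reproduce ADCL's interface or its selection logic; the class name RuntimeSelector, its parameters, and the choice of "minimum observed time" as the metric are all assumptions made for this example.

```python
import time
from typing import Any, Callable, Dict, List, Optional


class RuntimeSelector:
    """Hypothetical brute-force decision algorithm (not the ADCL API).

    During the learning phase, each candidate implementation serves
    `trials_per_candidate` real calls and is timed. Afterwards, every
    call is routed to the candidate with the lowest observed time.
    """

    def __init__(
        self,
        candidates: Dict[str, Callable[..., Any]],
        trials_per_candidate: int = 3,
        timer: Callable[[], float] = time.perf_counter,
    ) -> None:
        if not candidates:
            raise ValueError("at least one candidate implementation is required")
        if trials_per_candidate < 1:
            raise ValueError("trials_per_candidate must be at least 1")
        self.candidates = dict(candidates)
        self.names: List[str] = list(self.candidates)
        self.trials = trials_per_candidate
        self.timer = timer
        self.timings: Dict[str, List[float]] = {n: [] for n in self.names}
        self.winner: Optional[str] = None
        self.calls = 0

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        # Learning phase is over: always use the selected version.
        if self.winner is not None:
            return self.candidates[self.winner](*args, **kwargs)

        # Learning phase: pick the candidate whose turn it is and time it.
        name = self.names[self.calls // self.trials]
        start = self.timer()
        result = self.candidates[name](*args, **kwargs)
        self.timings[name].append(self.timer() - start)
        self.calls += 1

        # Once every candidate has been measured, take the decision.
        if self.calls == self.trials * len(self.names):
            self.winner = min(self.names, key=lambda n: min(self.timings[n]))
        return result
```

An application would wrap its alternative versions once, for example `exchange = RuntimeSelector({"blocking": f1, "nonblocking": f2})` with `f1` and `f2` standing in for its own functions, and then call `exchange(...)` inside its main loop. Because the measurements are taken on real calls in the real execution environment, this corresponds to dynamic rather than static tuning in the sense defined above.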
