The Crystal Ball in HPC has Never Been More Exciting, nor More Important

Pierre Kuonen, Marie-Christine Sawley
DOI: 10.4018/978-1-4666-0906-8.ch007

Chapter Preview

A Brief Review Of The Previous Decade

1997-2002: The Hope of Home Made HPC Machines

The origin of the SOS workshop series dates back to 1997. Until then, HPC had been dominated by twenty years of vector computing. At the time, the Massively Parallel Processing (MPP) model was emerging as the driving force for the decade that was starting. Back in those days, the MPP model was embodied by two tendencies: the CRAY T3E, which held strong positions in the Top500 with six machines, and the HPC cluster built from “commodity processors”: Number One was the ASCI Red system at Sandia Labs, which had just claimed the Teraflops blue ribbon. That machine still holds the record of longevity as Number One, with seven nominations from June 1997 until November 2001: an unusually long time, revealing the difficulties many vendors encountered and how hard the transition had been.

Scientists and engineers from Sandia Labs, Oak Ridge and EPFL sat together during the summer of 1997, seeking the most suitable technologies, interconnects, topologies and solutions for building their next HPC clusters from commodity computing elements. The idea was to benchmark their respective solutions and, as much as possible, contribute to cross-fertilization for the next generation of systems. The following editions of the workshop provided excellent opportunities for focusing on software development, programming models and parallel compilers, as well as on the effort of writing codes that could help solve leading-edge science problems.

2002-2005: The Era of the Ecosystem

The announcement of the Japanese Earth Simulator, the first non-US Number One of the Top500, marked a new chapter: with a Linpack rate of 41 Tflops, this highly efficient machine was able to run a real, complete application at 35 Tflops. At the same time, we could observe that the number and quality of computational scientists developing community software had grown significantly, and that these new groups used entry-level clusters to access the arena. In return, they expected their software to scale on a larger machine, one as similar as possible to the system they had been using while developing the code. The Top10 has since shown an aggregation around two or three driving solutions, slightly differing replicas of mainstream architectures. New problems emerged with the massive increase in the number of components: individual failures or data corruption impacting the correct execution of a task. This gave rise to new areas of research in Computer Science, such as fault tolerance and resilience.

This decade was characterized by what was in those years coined the “Ecosystem”: a fertile environment whose different parts receive scientific tasks according to performance requirements, production campaigns or development phases. In spite of its admirable achievement, the Earth Simulator lacked such an irrigation system and the vast soil in which new applications can be rooted and nurtured. This relative isolation certainly played a role in keeping the vector computing segment narrow, and it has since dwindled.
