Dirk Gorissen (Gent University–IBBT, Belgium), Tom Dhaene (Gent University–IBBT, Belgium), Piet Demeester (Gent University–IBBT, Belgium) and Jan Broeckhove (Gent University–IBBT, Belgium)

Source Title: Handbook of Research on Grid Technologies and Utility Computing: Concepts for Managing Large-Scale Applications

Copyright: © 2009
Pages: 10
DOI: 10.4018/978-1-60566-184-1.ch025

Chapter Preview

Computer-based simulation has become an integral part of the engineering design process. Rather than building real-world prototypes and performing experiments, application scientists can build a computational model and simulate the physical processes at a fraction of the original cost. However, despite the steady growth of computing power, the computational cost of these complex, high-fidelity simulations is still enormous. A simulation may take many minutes, hours, days or even weeks (Gu, 2001; Lin et al., 2005; Qian et al., 2006). This is especially evident for routine tasks such as optimization, sensitivity analysis and design space exploration, as noted below:

“...it is reported that it takes Ford Motor Company about 36-160 hrs to run one crash simulation. For a two-variable optimization problem, assuming on average 50 iterations are needed by optimization and assuming each iteration needs one crash simulation, the total computation time would be 75 days to 11 months, which is unacceptable in practice” (Wang and Shan, 2007, p. 1).

Consequently, scientists have turned towards upfront approximation methods to reduce simulation times. The basic approach is to construct a simplified approximation of the computationally expensive simulator, which is then used in place of the original code to facilitate Multi-Objective Design Optimization (MDO), design space exploration, reliability analysis, and so on (Simpson, 2004). Since the approximation model acts as a surrogate for the original code, it is referred to as a *surrogate model* or *metamodel*.

While one evaluation of the original simulator typically takes on the order of minutes or hours, the surrogate function, due to its compact mathematical notation, can be evaluated on the order of milliseconds. However, constructing an accurate surrogate still requires evaluations of the original objective function, so cost remains an issue. The focus of this paper is to discuss one technique to reduce this cost even further using distributed computing. By intelligently running simulations in parallel, the “wall-clock” time needed to arrive at an acceptable surrogate model can be reduced considerably.
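To make the parallel-evaluation idea concrete, here is a minimal sketch. The `simulator` function and the `workers` parameter are illustrative assumptions, not the chapter's actual code; in a real grid deployment each call would submit a job through the middleware and wait for the result, which is why a thread pool suffices from the client's point of view:

```python
from concurrent.futures import ThreadPoolExecutor

def simulator(x):
    # Hypothetical stand-in for the expensive simulation; in practice
    # this would launch an external job and block until it finishes.
    return (x - 0.3) ** 2

def evaluate_batch(points, workers=4):
    """Evaluate a batch of sample points concurrently.

    With `workers` simulations in flight at once, the wall-clock time
    for the batch shrinks roughly by a factor of `workers` compared to
    evaluating the points one after another. Results are returned in
    submission order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulator, points))

if __name__ == "__main__":
    batch = [i / 10 for i in range(11)]
    print(evaluate_batch(batch))
```

The same pattern extends to the framework's scheduling level: the pool is simply replaced by a meta-scheduler that dispatches each evaluation to a grid resource.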

We present a framework that integrates the automated building of surrogate models with the distributed evaluation of the simulator. This integration occurs on multiple levels: the resource level, the scheduling level and the service level. Each of these is detailed below.

Surrogate models play a significant role in many disciplines (hydrology, the automotive industry, robotics, ...) where they help bridge the gap between simulation and understanding. The principal reason driving their use is that the simulator is too time consuming to run for a large number of simulations. A second reason arises when simulating large-scale systems, for example a full-wave simulation of an electronic circuit board. Electromagnetic modeling of the whole board in one run is almost intractable. Instead, the board is modeled as a collection of small, compact, accurate surrogates that represent the different functional components (capacitors, resistors, etc.) on the board.

There is a huge number of different surrogate model types available, with applications in domains ranging from medicine, ecology and economics to aerodynamics. Depending on the domain, popular model types include Radial Basis Function (RBF) models, Rational Functions, Artificial Neural Networks (ANN), Support Vector Machines (SVM), and Kriging models (Wang and Shan, 2007).
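To give a flavour of how compact such models are, the following sketch fits a one-dimensional Gaussian RBF interpolant in pure Python. The kernel `width` and the naive Gaussian-elimination solver are illustrative choices for small data sets, not the chapter's method:

```python
import math

def rbf_fit(xs, ys, width=1.0):
    """Fit a Gaussian RBF interpolant: solve K w = y, where
    K[i][j] = exp(-((xs[i] - xs[j]) / width) ** 2)."""
    n = len(xs)
    K = [[math.exp(-((xs[i] - xs[j]) / width) ** 2) for j in range(n)]
         for i in range(n)]
    # Naive Gaussian elimination with partial pivoting (fine for small n).
    A = [row[:] + [y] for row, y in zip(K, ys)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= f * A[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        w[r] = (A[r][n] - sum(A[r][c] * w[c] for c in range(r + 1, n))) / A[r][r]
    return w

def rbf_eval(xs, w, x, width=1.0):
    """Evaluate the fitted surrogate: a weighted sum of Gaussian bumps."""
    return sum(wi * math.exp(-((x - xi) / width) ** 2)
               for wi, xi in zip(w, xs))
```

By construction the interpolant reproduces the training data exactly, and a single evaluation is just a short weighted sum, which is why surrogates respond in milliseconds where the simulator needs hours.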

An important aspect of surrogate modeling is sample selection. Since data is computationally expensive to obtain, it is impossible to use traditional one-shot, full factorial or space-filling designs. Data points must instead be selected iteratively, at the locations where the information gain will be the greatest (Kleijnen, 2005). A sampling function is needed that minimizes the number of sample points selected in each iteration, yet maximizes the information gain of each sampling step. This process is called adaptive sampling, but is also known as active learning, Optimal Experimental Design (OED), and sequential design.
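A minimal one-dimensional sketch of this loop is given below. The gap-times-gradient score is one simple heuristic for "where the information gain is greatest" (favouring wide, steep intervals), chosen here for illustration; it is not the chapter's sampling function:

```python
def adaptive_sample(f, lo, hi, budget):
    """Iteratively pick sample points for the expensive function f.

    Starting from the interval endpoints, each iteration scores every
    interval by width * (1 + |change in f|) and bisects the winner, so
    effort concentrates where the response varies most while still
    covering the domain.
    """
    xs = [lo, hi]
    ys = [f(lo), f(hi)]
    while len(xs) < budget:
        best = max(range(len(xs) - 1),
                   key=lambda i: (xs[i + 1] - xs[i])
                                 * (1.0 + abs(ys[i + 1] - ys[i])))
        mid = (xs[best] + xs[best + 1]) / 2.0
        xs.insert(best + 1, mid)      # keep xs sorted
        ys.insert(best + 1, f(mid))   # one new expensive evaluation
    return xs, ys
```

Note that the score uses only evaluations already paid for; no extra calls to `f` are spent on scoring candidates, which is the whole point when each call costs hours.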

Key Terms in this Chapter

Workflow: A workflow is a set of tasks (nodes) that process data in a structured and systematic manner. When each node is implemented as a service, the workflow describes how the services interact and exchange data in order to solve a higher-level problem.

Experimental Design: The theory of Design of Experiments (DOE) describes methods and algorithms for optimally selecting data points from an n-dimensional parameter space. As a simple example, suppose you have to select 1000 points from a 3-dimensional space (n = 3). This can be done randomly, using a full factorial design (an equal number of points along every dimension, i.e., 10x10x10), or according to a Latin hypercube. These are three basic examples of an experimental design.

Meta-Scheduler: A meta-scheduler is a software layer that abstracts the details of different grid middlewares. In this way a client can support multiple submission systems while only having to deal with one protocol (that of the abstraction layer). An example of a meta-scheduler is GridWay (www.gridway.org).

Middleware: The middleware is responsible for managing the grid resources (access control, job scheduling, resource registration and discovery, etc.), abstracting away the details and presenting the user with a consistent, virtual computer to work with. Examples of middlewares include: Globus, Unicore, Legion and Triana.

Service Oriented Architecture (SOA): SOA represents an architectural model in which functionality is decomposed into small, distinct units (services), which can be distributed over a network and can be combined together and reused to create applications.

Surrogate Model: This is a model that approximates a more complex, higher-order model and is used in place of the complex model (hence the term surrogate). The reason is usually that the complex model is too computationally expensive to use directly, hence the need for a faster approximation. It is also known as a response surface model or a metamodel.

Sequential Design: For a high number of dimensions (n > 3), it quickly becomes impossible to use traditional space-filling experimental designs, since the number of points needed grows exponentially. Instead, data points must be chosen iteratively and intelligently, at the locations where the information gain is the highest. This process is known as sequential design, adaptive sampling or active learning.
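The three one-shot designs mentioned under Experimental Design (random, full factorial, Latin hypercube) can be sketched in a few lines each; the unit hypercube and the cell-centred grid spacing are illustrative conventions:

```python
import random

def random_design(n, dim, rng):
    """n points drawn uniformly at random from the unit hypercube."""
    return [[rng.random() for _ in range(dim)] for _ in range(n)]

def full_factorial(levels, dim):
    """A regular grid with `levels` points per axis: levels**dim points
    in total (e.g. 10 levels in 3 dimensions gives 10x10x10 = 1000)."""
    axis = [(i + 0.5) / levels for i in range(levels)]  # cell centres
    pts = [[]]
    for _ in range(dim):
        pts = [p + [g] for p in pts for g in axis]
    return pts

def latin_hypercube(n, dim, rng):
    """n points such that each of the n equal strata along every axis
    contains exactly one point (one point per 'row and column')."""
    cols = []
    for _ in range(dim):
        perm = list(range(n))
        rng.shuffle(perm)
        cols.append([(p + rng.random()) / n for p in perm])
    return [list(pt) for pt in zip(*cols)]
```

The exponential blow-up of `full_factorial` with `dim` is exactly why, beyond a few dimensions, sequential designs like the adaptive sampling described above take over.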
