Porting Applications to Grids

Wolfgang Gentzsch (EU Project DEISA and Board of Directors of the Open Grid Forum, Germany)
Copyright: © 2010 | Pages: 27
DOI: 10.4018/978-1-60566-661-7.ch004

Abstract

The aim of this chapter is to guide developers and users through the most important stages of implementing software applications on Grid infrastructures, and to discuss important challenges and potential solutions. Those challenges come from the underlying grid infrastructure (security, resource management, and information services); the application data (data management, and the structure, volume, and location of the data); and the application architecture (monolithic or workflow, serial or parallel). As a case study, the author presents DEISA, the Distributed European Infrastructure for Supercomputing Applications, and describes its DEISA Extreme Computing Initiative (DECI) for porting and running scientific grand challenge applications. The chapter concludes with an outlook on Compute Clouds and suggests ten rules for building a sustainable grid as a prerequisite for the long-term sustainability of grid applications.
Chapter Preview

Introduction

Over the last 40 years, the history of computing has been deeply marked by the affliction of application developers, who continuously port and optimize their application codes for the latest and greatest computing architectures and environments. After the von Neumann mainframe came the vector computer, then the shared-memory parallel computer, the distributed-memory parallel computer, the very-long-instruction-word computer, the workstation cluster, the meta-computer, and the Grid (never fear, it continues, with SOA, Cloud, Virtualization, Many-core, and so on). There is no easy solution to this; the real solution would be a separation of concerns between discipline-specific content and domain-independent software and hardware infrastructure. However, this often comes with a loss of performance stemming from the overhead of the infrastructure layers. Recently, users and developers have faced another wave of complex computing infrastructures: the Grid.

Let’s start by answering the question: What is a Grid? Back in 1998, Ian Foster and Carl Kesselman (1998) attempted the following definition: “A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.” In a subsequent article (Foster, 2002), “The Anatomy of the Grid,” Ian Foster, Carl Kesselman, and Steve Tuecke changed this definition to include social and policy issues, stating that Grid computing is concerned with “coordinated resource sharing and problem solving in dynamic, multi-institutional virtual organizations.” The key concept is the ability to negotiate resource-sharing arrangements among a set of participating parties (providers and consumers) and then to use the resulting resource pool for some purpose. They continued: “The sharing that we are concerned with is not primarily file exchange but rather direct access to computers, software, data, and other resources, as is required by a range of collaborative problem-solving and resource-brokering strategies emerging in industry, science, and engineering. This sharing is, necessarily, highly controlled, with resource providers and consumers defining clearly and carefully just what is shared, who is allowed to share, and the conditions under which sharing occurs. A set of individuals and/or institutions defined by such sharing rules form what we call a virtual organization.” This author’s concern, from the beginning (Gentzsch, 2002), was that the new definition seemed very ambitious, and as history has proven, many of the Grid projects with a focus on these ambitious objectives have so far not led to a sustainable grid production environment. We can only repeat that the simpler the grid infrastructure, the easier it is to use, and the sharper its focus, the bigger its chance for success. It is for good reason (which we will explain in the following) that the so-called Clouds are currently becoming more and more popular (Amazon, 2007).

Over the last ten years, hundreds of applications in science, industry and enterprises have been ported to Grid infrastructures, mostly as prototypes in the sense of the early definition of Foster & Kesselman (1998). Each application is unique in that it solves a specific problem, based on modeling, for example, a specific phenomenon in nature (physics, chemistry, biology, etc.), presented as a mathematical formula together with appropriate initial and boundary conditions, represented by its discrete analogue using sophisticated numerical methods, translated into a programming language computers can understand, adjusted to the underlying computer architecture, embedded in a workflow, and accessible remotely by the user through a secure, transparent and application-specific portal. These few words summarize the wide spectrum and complexity we face in problem solving on grid infrastructures.

Key Terms in this Chapter

DEISA: The Distributed European Infrastructure for Supercomputing Applications is a consortium of leading national supercomputing centres that currently deploys and operates a persistent, production quality, distributed supercomputing environment with continental scope. The purpose of this EU funded research infrastructure is to enable scientific discovery across a broad spectrum of science and technology, by enhancing and reinforcing European capabilities in the area of high performance computing. This becomes possible through a deep integration of existing national high-end platforms, tightly coupled by a dedicated network and supported by innovative system and grid software.

DECI: The purpose of the DEISA Extreme Computing Initiative (DECI) is to enhance the impact of the DEISA research infrastructure on leading European science and technology. DECI identifies, enables, deploys and operates “flagship” applications in selected areas of science and technology. These leading, groundbreaking applications must deal with complex, demanding, innovative simulations that would not be possible without the DEISA infrastructure, and that would benefit from the exceptional resources of the Consortium.

Grid Engine: An open source batch-queuing and workload management system. Grid Engine is typically used on a compute farm or compute cluster and is responsible for accepting, scheduling, dispatching, and managing the remote execution of large numbers of standalone, parallel or interactive user jobs. It also manages and schedules the allocation of distributed resources such as processors, memory, disk space, and software licenses.

OGSA: The Open Grid Services Architecture describes an architecture for a service-oriented grid computing environment for business and scientific use, developed within the Open Grid Forum. OGSA is based on several Web service technologies, notably WSDL and SOAP. Briefly, OGSA is a distributed interaction and computing architecture based around services, assuring interoperability on heterogeneous systems so that different types of resources can communicate and share information. OGSA has been described as a refinement of the emerging Web Services architecture, specifically designed to support Grid requirements.

Cloud Computing: A computing paradigm focusing on the provisioning of metered services related to the use of hardware, software platforms, and applications, billed on a pay-per-use basis, and pushed by vendors such as Amazon, Google, Microsoft, Salesforce, Sun, and others. Accordingly, there are many different (but similar) definitions (as with Grid Computing).

Grid: A service for sharing computer power and data storage capacity over the Internet, unlike the Web which is a service just for sharing information over the Internet. The Grid goes well beyond simple communication between computers, and aims ultimately to turn the global network of computers into one vast computational resource. Today, the Grid is a “work in progress”, with the underlying technology still in a prototype phase, and being developed by hundreds of researchers and software engineers around the world.

Virtual Organization: A group of people with similar interests who primarily interact via communication media such as newsletters, telephone, email, online social networks, etc., rather than face to face, for social, professional, educational or other purposes. In Grid Computing, a VO is a group that shares the same computing resources.

Globus Toolkit: A software toolkit designed by the Globus Alliance to provide a set of tools for Grid Computing middleware based on standard grid APIs. Its latest development version, GT4, is based on standards currently being drafted by the Open Grid Forum.

Open Grid Forum: The Open Grid Forum is a community of users, developers, and vendors leading the global standardisation effort for grid computing. OGF accelerates grid adoption to enable business value and scientific discovery by providing an open forum for grid innovation and developing open standards for grid software interoperability. The work of OGF is carried out through community-initiated working groups, which develop standards and specifications in cooperation with other leading standards organisations, software vendors, and users. The OGF community consists of thousands of individuals in industry and research, representing over 400 organisations in more than 50 countries.

Grid Portal: A Grid Portal provides a single secure web interface for end-users and administrators to computational resources (computing, storage, network, data, applications) and other services, while hiding the complexity of the underlying hardware and software of the distributed computing environment. An example is the EnginFrame cluster, grid, and cloud portal, which in DEISA serves as the portal for the Life Science community.

UNICORE: The Uniform Interface to Computing Resources offers a ready-to-run Grid system including client and server software. UNICORE makes distributed computing and data resources available in a seamless and secure way in intranets and the internet. The UNICORE project created software that allows users to submit jobs to remote high performance computing resources without having to learn details of the target operating system, data storage conventions and techniques, or administrative policies and procedures at the target site.

Web Service: A software system designed to support interoperable machine-to-machine interaction over a network. It has an interface described in a machine-processable format (specifically WSDL). Other systems interact with the Web service in a manner prescribed by its description using SOAP messages, typically conveyed using HTTP with an XML serialisation in conjunction with other Web-related standards.
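
To make the last key term more concrete, the following minimal Python sketch builds the kind of SOAP message that a client might send to a Web service. It only illustrates the XML serialisation mentioned in the definition and is not taken from the chapter. The service namespace `http://example.org/jobs`, the operation name `SubmitJob`, and the parameter `executable` are hypothetical placeholders; in practice these names would come from the WSDL description of the actual service.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical service namespace, used for illustration only.
SERVICE_NS = "http://example.org/jobs"


def build_soap_request(operation, params):
    """Serialise one operation call as a SOAP 1.1 envelope (UTF-8 bytes)."""
    ET.register_namespace("soapenv", SOAP_ENV_NS)
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    # The operation element and its parameters live in the service namespace.
    op = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        child = ET.SubElement(op, f"{{{SERVICE_NS}}}{name}")
        child.text = value
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Hypothetical request asking a job service to run a command.
    request = build_soap_request("SubmitJob", {"executable": "/bin/hostname"})
    print(request.decode("utf-8"))
```

The resulting bytes would typically be sent as the body of an HTTP POST request to the service endpoint. The transport step is omitted here because it depends on the concrete service.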
