UNICORE: A Middleware for Life Sciences Grid

Piotr Bala (ICM University of Warsaw and Copernicus University, Poland), Kim Baldridge (University of Zurich, Switzerland), Emilio Benfenati (Istituto Mario Negri, Italy), Mose Casalegno (Istituto Mario Negri, Italy), Uko Maran (University of Tartu, Estonia) and Lukasz Miroslaw (University of Zurich, Switzerland)
DOI: 10.4018/978-1-60566-374-6.ch030

Abstract

This chapter provides an overview of Grid middleware and applications related to biomedical and life sciences disciplines. Various technologies, including web-based solutions, are presented. One of these solutions, the UNICORE framework, implements key grid standards and specifications in its recent version. The system architecture and capabilities, such as security, workflow, and data management, are described. Special attention is given to the idea of a ‘gridbean’, which extends the use of UNICORE to different applications. Examples of gridbeans are provided, and the capabilities of UNICORE are illustrated through specific examples built using this grid middleware. In particular, the Chemomentum workbench and its use for in-silico design and modeling in chemistry and life sciences are described.

Introduction

The past decade has witnessed tremendous growth in our chemical and biological understanding at the molecular scale, which impacts all facets of our daily lives. Inherently driven by information technologies (IT) and massive data acquisition, this growth in understanding requires a stronger coupling between computation and experiment. The expansion of high-throughput technologies in chemical design, microarray analysis, device technology, physical and chemical analysis processes, combinatorial chemical synthesis, and screening, together with their computational counterparts, has generated an unprecedented volume and diversity of data that must be abstracted into manageable units. Historically, chemistry has evolved a sophisticated symbolic and iconographic way of projecting a myriad of chemical phenomena onto a model of molecular structure and associated properties. This rapid growth of both experimental and computational data has driven the proliferation of computational tools and the development of first-generation chemoinformatics for data storage, analysis, mining, management, and presentation. However, it is generally accepted that first-generation chemoinformatics tools have neither been widely adopted nor met the needs of researchers. There is a general need to integrate data, information, and knowledge, which can be accomplished through innovative ways of computational access, resource integration, and creation of transparent toolsets.

The computational science challenge, therefore, is to provide new algorithms and data analysis tools that can exploit existing computational power more efficiently for more detailed chemical analysis. The corresponding challenges in information technology and visualization include providing richer access to existing repositories of data, enabling transformation of that data into interpretable and predictable scientific information, and leveraging workflows to parallelize the efforts of many researchers for general access.

Over the past few years our team has been developing middleware and end-user interface technologies that enable novice and advanced users to access the capabilities of sophisticated scientific applications running on large-scale grid computing resources. In this chapter, we present some of these developments, from web-based tools to integrated toolkits such as NIMROD (Abramson, Lewis, Peachy 2001) and BOINC (Boinc 2008), to full-featured grid middleware such as Globus (Foster 2006) and UNICORE (Erwin, Snelling 2002). The main focus is on the latter, which is known for a number of successful deployments in the life sciences area.

Key Terms in this Chapter

Quantitative Structure-Activity Relationship (QSAR): The process by which chemical structure is quantitatively correlated with a well defined process, such as biological activity or chemical reactivity.

Web Services Resource Framework (WSRF): An application of web service technology to realise access to and management of stateful resources.

CML (Chemical Markup Language): A domain-specific markup language based strictly on XML. It uses XML’s portability to help developers and chemists design interoperable documents.

General Atomic and Molecular Electronic Structure System (GAMESS): A general ab initio quantum chemistry package.

Job Submission Description Language (JSDL): An extensible XML specification for the description of simple tasks to non-interactive computer execution systems. It describes the submission aspects of a job, and does not attempt to describe the state of running or historic jobs.

Storage Resource Broker (SRB): A client-server middleware that provides a uniform interface for connecting to heterogeneous data resources over a network and accessing replicated data sets.

UNICORE (UNiform Interface to the COmputational REsources): A Grid system including client and server software. UNICORE makes distributed computing and data resources available in a seamless and secure way over the Internet.

REACH (Registration, Evaluation, Authorisation and Restriction of Chemical substances): European Community Regulation on chemicals and their safe use (EC 1907/2006). The aim of REACH is to improve the protection of human health and the environment through the better and earlier identification of the intrinsic properties of chemical substances.

Open Grid Services Architecture (OGSA): An architecture for a service-oriented grid computing environment. OGSA is based on several other Web service technologies but it aims to be largely agnostic in relation to the transport-level handling of data.

Protein Data Bank (PDB): An archive of experimentally determined biological macromolecule structures, originally established at the Brookhaven National Laboratory.

Pacific Rim Application and Grid Middleware Assembly (PRAGMA): An open organization in which Pacific Rim institutions collaborate to develop grid-enabled applications and deploy the infrastructure throughout the Pacific region to allow sharing of data, computing, and other resources.

XACML (eXtensible Access Control Markup Language): A declarative access control policy language implemented in XML and a processing model, describing how to interpret the policies.

Grid Enabled Molecular Science Through Online Networked Environments (GEMSTONE): A framework that provides researchers in the molecular sciences with a tool to discover remote grid application services and compose them as appropriate to the chemical and physical nature of the problem at hand.
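
To make the JSDL entry above more concrete, the following is a minimal sketch of how a job description might be assembled programmatically. It is an illustration only, not the way UNICORE or its clients generate JSDL. The namespaces and element names follow the JSDL 1.0 schema and its POSIX application extension; the function name build_jsdl, the job name, and the executable path are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the JSDL 1.0 specification and its POSIX extension.
JSDL_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl"
POSIX_NS = "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix"


def build_jsdl(job_name, executable, arguments):
    """Return a minimal JSDL job description as an XML string.

    Hypothetical helper for illustration: it covers only the job name,
    the executable, and its command-line arguments.
    """
    ET.register_namespace("jsdl", JSDL_NS)
    ET.register_namespace("jsdl-posix", POSIX_NS)

    job_def = ET.Element(f"{{{JSDL_NS}}}JobDefinition")
    job_desc = ET.SubElement(job_def, f"{{{JSDL_NS}}}JobDescription")

    # Identification: a human-readable name for the job.
    ident = ET.SubElement(job_desc, f"{{{JSDL_NS}}}JobIdentification")
    ET.SubElement(ident, f"{{{JSDL_NS}}}JobName").text = job_name

    # Application: what to run, expressed with the POSIX extension.
    app = ET.SubElement(job_desc, f"{{{JSDL_NS}}}Application")
    posix = ET.SubElement(app, f"{{{POSIX_NS}}}POSIXApplication")
    ET.SubElement(posix, f"{{{POSIX_NS}}}Executable").text = executable
    for arg in arguments:
        ET.SubElement(posix, f"{{{POSIX_NS}}}Argument").text = arg

    return ET.tostring(job_def, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values; a real deployment would supply its own paths.
    print(build_jsdl("example-job", "/path/to/executable", ["input.inp"]))
```

The document describes only what should be submitted (name, executable, arguments); consistent with the definition above, it says nothing about the state of a running or completed job.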
