ProGenGrid: A Grid Problem Solving for Bioinformatics

Maria Mirto (University of Salento and SPACI Consortium, Italy), Italo Epicoco (University of Salento and SPACI Consortium, Italy), Massimo Cafaro (University of Salento and SPACI Consortium, Italy) and Sandro Fiore (University of Salento and SPACI Consortium, Italy)
DOI: 10.4018/978-1-60566-374-6.ch014

Abstract

In this chapter, the ProGenGrid (Proteomics and Genomics Grid) research project, which started in 2004, is described. It is a Grid Problem Solving Environment, specialized for the Bioinformatics domain, which aims at providing an integrated environment in order to compose, schedule and monitor biological applications in a Computational Grid. The main feature offered by this environment is the possibility to use a friendly web interface for composing workflow jobs, scheduled on different grid middleware.

Background

Problem Solving Environments (PSEs) have been investigated over the past decades. Culler and Fried (Culler, 1963) began investigating automatic software systems for solving mathematical problems with computers, focusing primarily on application issues rather than programming issues. At that time, the term "applications" referred to scientific and engineering applications that were generally solved using mathematical solvers or scientific algorithms operating on vectors and matrices.

Despite the time that has passed since that early research work, there is no precise definition of what a PSE is. The following well-known definition was given by Gallopoulos, Houstis, and Rice: “A PSE is a computer system that provides all the computational features necessary to solve a target class of problems. . . . PSEs use the language of the target class of problems” (Houstis, 1997).

According to Walker et al., “A PSE is a complete, integrated computing environment for composing, compiling, and running applications in a specific area” (Walker, 2000).

In (Schuchardt, 2002), PSEs are defined as “Problem-oriented computing environments that support all the scientific computational problem-solving activities ranging from problem formulation, to algorithm selection, to simulation execution, to solution visualization. PSEs link a heterogeneous mix of resources including people, computers, data, and information within a seamless environment to solve a problem”. In 2003 Cunha described a PSE as “An integrated environment for solving a class of related problems in an application domain; easy to use by the end-user; based on state-of-the-art algorithms. It must provide support for problem specification, resource management, execution services” (Cunha, 2003).

Such definitions agree on the basic features of PSEs but differ on how PSEs can be composed and used. PSEs can benefit from the advances in hardware and software achieved in parallel and distributed systems and tools.

One of the most interesting models in the area of parallel and distributed computing is the Grid Computing paradigm.

Key Terms in this Chapter

Grid Security Infrastructure (GSI): It is a specification for secret, tamper-proof communication between software in a grid computing environment. Secure, authenticatable communication is enabled using both symmetric and asymmetric encryption. Delegation of credentials is also supported.

Grid Portal: A Grid Portal provides an efficient infrastructure to put Grid-empowered applications on corporate Intranet/Internet. Application interfaces can be tailored to the specific user’s skills or access rights. Users can therefore access and control their computing and engineering resources via an intuitive, standards compliant Web interface, virtually from anywhere using a standard Web browser.

UNICORE (Uniform Interface to Computing Resources): It offers a ready-to-run Grid system including client and server software. UNICORE makes distributed computing and data resources available in a seamless and secure way in intranets and the internet.

VOMS (Virtual Organization Membership Service): It serves as a central repository for user authorization information, providing support for sorting users into a general group hierarchy, keeping track of their roles, etc.

Web Service: A Web service (also Web Service) is defined by the W3C as “a software system designed to support interoperable machine-to-machine interaction over a network”. Web services are frequently just Web APIs that can be accessed over a network, such as the Internet, and executed on a remote system hosting the requested services.

gLite: It is a middleware for grid computing. Born from the collaborative efforts of more than 80 people in 12 different academic and industrial research centers as part of the EGEE Project, gLite provides a framework for building grid applications tapping into the power of distributed computing and storage resources across the Internet.

Globus: It is a middleware used for building Grid systems and applications. It is being developed by the Globus Alliance and many others all over the world. A growing number of projects and companies are using the Globus Toolkit to unlock the potential of grids for their cause.

Protein Tertiary Structure Prediction: It has the aim of determining the three-dimensional structure of proteins from their amino acid sequences. In more formal terms, this is the prediction of protein tertiary structure from primary structure.

Workflow Management System: Workflow Management Systems (WFMSs) are software systems that support the management of workflows in an organization. According to the Workflow Management Coalition, a WFMS consists of three functional components: a definition tool used to define business processes; a worklist handler, which manages the interaction between users and their personal worklists; and a workflow engine, which provides a runtime execution environment.

Grid File Transfer Protocol (GridFTP): It is a protocol for high-performance, secure, reliable data transfer.

OGSA (Open Grid Service Architecture): It is the blueprint specification for standards-based grid computing. “Open” refers to both the standards-development process and the standards themselves. OGSA is “service-oriented” because it delivers functionality among loosely-coupled interacting services that are aligned with industry-accepted Web service standards. “Architecture” defines the components, their organizations and interactions, and the overall design philosophy.

WSRF: The WS-Resource Framework (WSRF) is a set of six Web services specifications that define what is termed the WS-Resource approach to modeling and managing state in a Web services context. To date, drafts of three of these specifications have been released, along with an architecture document that motivates and describes the WS-Resource approach to modeling stateful resources with Web services.
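
As an illustration of the Workflow Management System entry above, the following is a minimal sketch of two of the three components it names: a workflow definition and a workflow engine. It is not ProGenGrid's implementation. The task names, the `run_workflow` function, and the use of a dependency dictionary as the definition format are all assumptions made for this example, and the worklist handler is omitted.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow definition: each task maps to the set of tasks
# it depends on. The task names are illustrative only.
workflow = {
    "fetch_sequence": set(),
    "align": {"fetch_sequence"},
    "predict_structure": {"align"},
    "report": {"align", "predict_structure"},
}


def run_workflow(definition, actions):
    """Toy workflow engine: run every task after all of its dependencies."""
    order = []
    for task in TopologicalSorter(definition).static_order():
        actions[task]()  # execute the callable registered for this task
        order.append(task)
    return order


# Stand-in actions that only record their own name when executed.
log = []
actions = {name: (lambda n=name: log.append(n)) for name in workflow}
executed = run_workflow(workflow, actions)
```

In a grid setting, each action would submit a job to the underlying middleware rather than run locally. The dependency ordering shown here is the only part the sketch is meant to convey.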
