MoBiFlow: Principles and Design of a Workflow System for Molecular Biology

Markus Held, Wolfgang Küchlin, Wolfgang Blochinger
DOI: 10.4018/978-1-4666-3894-5.ch019

Abstract

Web-based problem solving environments provide sharing, execution, and monitoring of scientific workflows. Where they depend on general-purpose workflow development systems, the workflow notations are likely far too powerful and complex, especially in the area of biology, where programming skills are rare. On the other hand, application-specific workflow systems may use special-purpose languages and execution engines, suffering from a lack of standards, portability, documentation, stability of investment, etc. In both cases, the need to support yet another application on the desktop places a burden on the system administration of a research lab. In previous research the authors have developed the web-based workflow systems Calvin and Hobbes, which enable biologists and computer scientists to approach these problems in collaboration. Both systems use a server-centric, Web 2.0-based approach. Calvin is tailored to molecular biology applications, with a simple graphical workflow language and easy access to existing BioMoby web services. Calvin workflows are compiled to industry-standard BPEL workflows, which can be edited and refined in collaboration between researchers and computer scientists using the Hobbes tool. Together, Calvin and Hobbes form our workflow platform MoBiFlow, whose principles, design, and use cases are described in this paper.

Introduction

Biological Problem Solving Environments (PSEs) are frequently provided as web applications. In the trivial case, such a PSE provides a single service or a limited set of services as a web application. Biologists “connect” these services manually by cutting and pasting data between web sites. Since this process is error-prone and hard to retrace, a plethora of different bioinformatics workflow systems have emerged (Taylor, Deelman, Gannon, & Shields, 2006; Tiwari & Sekhar, 2007). Increasingly, web portals are used to access and to share scientific workflows (De Roure, Goble, & Stevens, 2007; Christie & Marru, 2007). Some portals provide workflow construction facilities via Java Web Start (Christie & Marru, 2007; Sipos & Kacsuk, 2005), while true web-based workflow construction tools lack a sophisticated user experience (Carrere & Gouzy, 2006; Bartocci, Corradini, Merelli, & Scortichini, 2007). Often, scientific workflow management systems are provided as desktop applications with rather complex user interfaces (Oinn et al., 2004; Shah et al., 2004).

As with programming languages, the complexity of the user interface and the workflow notation remains an area of conflict. Supporting a large set of different workflows and different types of services increases the complexity of a workflow language, thus reducing its usability for domain experts. Conversely, high user-friendliness limits the possible set of workflow patterns and access to arbitrary services. Domain experts should be able to compose and execute workflows in a domain-specific modeling system, and still fall back on collaboration with software engineers if a more complex workflow model is needed. In this paper we propose a hybrid approach of augmenting an existing collaborative workflow development system with a biology-specific scientific workflow mode. In software engineering it is often recognized that 80% of the useful functionality is provided by 20% of the code (the “Pareto principle”).

We argue that many typical life science workflows can be expressed by a reduced workflow notation. Similar to high-level programming languages, which are compiled to a common machine language, we regard the Business Process Execution Language (BPEL) as the “assembly language” of an arbitrary set of domain-specific workflow languages (OASIS, 2007).
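The compilation analogy can be illustrated with a minimal sketch. It assumes a reduced notation that is nothing more than an ordered list of service calls, and it emits a skeleton using the WS-BPEL 2.0 `process`, `sequence`, and `invoke` elements. This is not Calvin's compiler: the function name, the step format, the variable naming scheme, and the service names are hypothetical, and the output omits the partner link and variable declarations that a deployable BPEL process would need.

```python
import xml.etree.ElementTree as ET

# Namespace of WS-BPEL 2.0 executable processes.
BPEL_NS = "http://docs.oasis-open.org/wsbpel/2.0/process/executable"


def compile_linear_workflow(name, steps):
    """Compile an ordered list of (partner_link, operation) pairs
    into a BPEL-like skeleton (hypothetical reduced notation)."""
    ET.register_namespace("bpel", BPEL_NS)
    process = ET.Element(f"{{{BPEL_NS}}}process", {"name": name})
    sequence = ET.SubElement(process, f"{{{BPEL_NS}}}sequence")

    # Chain the steps: each step reads the previous step's output variable.
    previous_output = "input"
    for index, (partner_link, operation) in enumerate(steps, start=1):
        output = f"step{index}Out"
        ET.SubElement(
            sequence,
            f"{{{BPEL_NS}}}invoke",
            {
                "name": f"step{index}",
                "partnerLink": partner_link,
                "operation": operation,
                "inputVariable": previous_output,
                "outputVariable": output,
            },
        )
        previous_output = output

    return ET.tostring(process, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical two-step workflow: fetch a sequence, then search with it.
    print(compile_linear_workflow(
        "FetchAndSearch",
        [("SequenceDB", "fetchSequence"), ("Blast", "runBlast")],
    ))
```

The point of the sketch is the division of labor: the domain expert only orders the steps, while the data plumbing between them is generated and remains available for later refinement at the BPEL level.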

In this article, we give an overview of our previous work on the bioinformatics workflow system MoBiFlow and the software ecosystem on which it is built. The article extends our workshop paper (Küchlin & Held, 2010). MoBiFlow consists of the biology-specific workflow system Calvin and the collaborative workflow development system Hobbes, with additional tools such as the computation of workflow metrics, layout capabilities, and Web 2.0-based video conferencing.

In particular, we make the following contributions:

  1. We have formalized common e-science processes in biology as a meta-process and we derive requirements for next generation bioinformatics workflow systems.

  2. We show how low-level workflow languages can be combined with domain-specific workflow notations to provide ease-of-use and flexibility at the same time.

  3. We describe the design principles behind the MoBiFlow system, which represents a solution for high-level and low-level collaborative workflow development and usage.
