Web Services for Bioinformatics

Abad Shah (University of Engineering and Technology, Pakistan), Zafar Singhera (Oracle Corporation, USA) and Syed Ahsan (University of Engineering and Technology, Pakistan)
Copyright: © 2013 | Pages: 19
DOI: 10.4018/978-1-4666-3604-0.ch019


A large number of tools are available to bioinformaticians to analyze the rapidly growing databanks of molecular biological data. These databanks represent complex biological systems, and understanding them often requires linking many disparate data sets and using more than one analysis tool. However, owing to the lack of standards for data sets and tool interfaces, this is not a trivial task. Over the past few years, web services have become a popular way of sharing data and tools that are distributed over the web and used by researchers all over the globe. In this chapter we discuss the interoperability problem of databanks and tools and how web services are being used to try to solve it. These efforts have driven the evolution of web services tools from HTML/web form-based tools, which are not suited to automatic workflow generation, to tools built on advances in the Semantic Web and ontologies that have revolutionized the role of semantics. Also included is a discussion of two extensively used web service systems for the life sciences, myGrid and Semantic-MOBY. Finally, we discuss how state-of-the-art research and technological development in the Semantic Web, ontologies, and database management can help address these issues.
Chapter Preview

1. Introduction

The two major problems that biological scientists face are the distribution and the heterogeneity of data and analysis tools. These problems stem from the autonomous, decentralized, and individualistic web-based approach to biological research (Bodenreider & Stevens, 2006). Integrating the data and tools is difficult, but it is vital for integrative in silico experimentation and the exchange of results (Lord et al., 2004). Biology has coped with this work in an effective but ad hoc manner. Almost all bioinformatics databases and tools have been made available on the web, yet the data integration techniques applied to the bioinformatics domain have met with limited success because the data and information are published in non-standardized ways (Lord et al., 2004; Post et al., 2007). However, unlike other domains, the bioinformatics domain on the Web has embraced standards such as XML and web services, and a large number of bioinformatics data sources are either accessible as web services or provide data in XML (Thakar, Ambite & Knoblock, 2005). A web service is a program that can be executed on a remote machine; this has become practical owing to industry efforts to standardize web service description, discovery, and invocation. These efforts have led to standards such as WSDL (Christenson et al., 2001), UDDI (UDDI 2002), and SOAP (SOAP 2000) (Thakar, Ambite & Knoblock, 2005).
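
To make the invocation standard concrete, the following is a minimal sketch of how a SOAP 1.1 request envelope might be assembled with the Python standard library. Only the SOAP envelope namespace is taken from the standard; the service namespace, the operation name `getSequence`, and the parameter `accession` are hypothetical and do not refer to any real bioinformatics service.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace, as defined by the standard.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical namespace of an illustrative sequence-retrieval service.
SERVICE_NS = "http://example.org/sequence-service"


def build_soap_request(operation, params):
    """Return a SOAP 1.1 request envelope as an XML string."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The operation element and its parameters live in the service namespace.
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SERVICE_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation and parameter, for illustration only.
request_xml = build_soap_request("getSequence", {"accession": "P12345"})
```

In practice a client would typically derive the operation names and message structure from the WSDL description of the service rather than hard-coding them, and would send the envelope to the service endpoint over HTTP.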

With web services technologies, integrating such services and making them interoperate is now feasible, and researchers can easily construct bioinformatics workflows and pipelines by combining two or more web services to solve complex biological tasks such as protein function prediction, genome annotation, and microarray analysis (Cannta N. et al., 2008). However, these standards, in their current form, lack semantic representation, which leaves unfulfilled the promise of automatically integrating applications written to web services standards (Labarga et al., 2007).
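
The idea of a workflow built by combining services can be sketched as a simple chain in which the output of one step becomes the input of the next. In this illustration the two "services" are local Python functions standing in for remote calls; their names and the `run_workflow` helper are assumptions made for the example, not part of any system discussed in this chapter.

```python
from functools import reduce


def reverse_complement(seq):
    """Stand-in for a remote sequence-manipulation service."""
    complement = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(complement[base] for base in reversed(seq))


def gc_content(seq):
    """Stand-in for a remote sequence-statistics service."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def run_workflow(steps, data):
    """Feed the output of each step into the next one, in order."""
    return reduce(lambda value, step: step(value), steps, data)


# A two-step pipeline: reverse-complement a DNA string, then measure GC content.
result = run_workflow([reverse_complement, gc_content], "ATGCGC")
```

Note that nothing in this chain checks whether one step's output is a meaningful input for the next; the person assembling the workflow has to know that, which is the gap that semantic descriptions aim to close.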

More recently, efforts have been made to enrich web services with semantic metadata and semantic descriptions to enhance data exchange and integration (Lord et al., 2004; Thakar et al., 2005; Post et al., 2007). A Semantic Web approach provides standardized formats (such as RDF, RDF Schema (RDFS), and OWL) to achieve a formalized computational environment. The objective of the Semantic Web is to bring meaning to raw data content by defining relationships between distinct concepts using ontologies (Cabrall L. et al., 2004). Existing life sciences databanks can achieve better retrieval performance when built on ontological abstractions. Fortunately, the life sciences community has realized that semantic modeling is a necessity for biological knowledge bases (Ruttenberg et al., 2007), and many biological ontology initiatives exist (http://obo.sourceforge.net), with the Gene Ontology (GO) being the most widely adopted (Bodenreider O. & Stevens R., 2006; Ashburner et al., 2000).
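
As a rough illustration of what semantic descriptions add, the sketch below stores subject-predicate-object statements as plain Python tuples, in the spirit of RDF triples, and uses a subclass relation to decide whether one service's output can feed another service's input. It is a simplified stand-in, not an RDF or OWL implementation: all concept names, service names, and the predicates `hasInput` and `hasOutput` are hypothetical.

```python
# Statements as (subject, predicate, object) tuples; every name is hypothetical.
TRIPLES = {
    ("ProteinSequence", "subClassOf", "Sequence"),
    ("DNASequence", "subClassOf", "Sequence"),
    ("fetchProtein", "hasOutput", "ProteinSequence"),
    ("alignSequences", "hasInput", "Sequence"),
    ("alignSequences", "hasOutput", "Alignment"),
    ("buildTree", "hasInput", "Alignment"),
}


def objects(subject, predicate):
    """Return all objects of statements matching subject and predicate."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def is_a(concept, ancestor):
    """True if concept equals ancestor or is a (transitive) subclass of it."""
    if concept == ancestor:
        return True
    return any(is_a(parent, ancestor) for parent in objects(concept, "subClassOf"))


def can_chain(producer, consumer):
    """True if some output of producer is acceptable as an input of consumer."""
    return any(
        is_a(out, inp)
        for out in objects(producer, "hasOutput")
        for inp in objects(consumer, "hasInput")
    )
```

Under these assumed annotations, `can_chain("fetchProtein", "alignSequences")` holds because a protein sequence is declared to be a kind of sequence, whereas `can_chain("fetchProtein", "buildTree")` does not. A syntactic interface description alone could not support this kind of check.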

However, complete and seamless semantic integration of data, information sources, and tools remains a challenging objective. Open problems include shared definitions of knowledge domains (i.e., ontologies), the association of biological concepts with existing data, semantic descriptions of services and requirements, and automatic workflow generation (Bodenreider O. & Stevens R., 2006).
