BioSimGrid Biomolecular Simulation Database

Kaihsu Tai (University of Oxford, UK) and Mark Sansom (University of Oxford, UK)
DOI: 10.4018/978-1-60960-195-9.ch304


BioSimGrid is a distributed biomolecular simulation database. It is a general-purpose database for trajectories from molecular dynamics simulations. Though initially designed as a distributed data grid, BioSimGrid can also be installed as a stand-alone instance, which can later be integrated into a wider, networked system. This presentation of BioSimGrid follows a scenario in biological research to demonstrate how to install the system and how to deposit, query, and analyze trajectories in it, with real Python code examples for each step. Explanations then follow of the concepts underlying the implementation of BioSimGrid: relational database, distributed computing, and the input/output (deposit and analysis) modules. The presentation closes with a discussion of the emerging trends and concerns in the further development of BioSimGrid and similar biological databases. This discussion touches on quality-assurance issues and the use of BioSimGrid as a back-end for other speciality databases. The experience of developing BioSimGrid compels the conclusion: in the development and maintenance of biomolecular simulation databases, it is essential that sustainability be asserted as a key principle.

1. Introduction and Background: A Repository of Biomolecular Simulations

Since the first application of molecular dynamics to proteins in 1976 (Adcock and McCammon, 2006), this simulation methodology has added value to experimental structural biology by making biomolecules ‘come alive’ and by covering the nanosecond time-scale, which experimental methods are only beginning to access. To this we add the method of comparison, a precursor to the process of classification, which is fundamental to biology (Brooks and McLennan, 2006). Insights into the internal motions of proteins can come from comparing the results of molecular dynamics simulations, namely the trajectories (Pang et al., 2005; Tai et al., 2007). Having a database of trajectories facilitates this comparison. Over the past few years (2003 to 2006), we have developed such a database, called BioSimGrid (; Feig et al., 1999). To differentiate, BioSimGrid is a general-purpose database for trajectories from molecular dynamics simulations. It is free software licensed under the terms of the GNU General Public License (Stallman, 2002). It can take advantage of distributed (‘grid’) computing to enhance the reliability and ensure the longevity of the trajectory content (Berman et al., 2003a).

By ‘general-purpose’, we mean the following. Firstly, BioSimGrid can admit trajectories generated by different simulation packages, such as Amber (Pearlman et al., 1995), Gromacs (Lindahl et al., 2001), Charmm (Brooks et al., 1983), NWChem (Straatsma et al., 2000), and NAMD (Kalé et al., 1999). Secondly, BioSimGrid is not restricted to a particular kind of system or level of granularity. It can host simulations of nucleic acids, proteins, small molecules, or even non-biological polymers, whether at the all-atom level or as coarse-grain molecular dynamics (Bond and Sansom, 2006; Marrink et al., 2004; Nielsen et al., 2004). It can also store non-sequential ‘trajectories’, or rather ensembles, generated by other methods such as Monte Carlo, homology modelling (Šali and Blundell, 1993), and CONCOORD (de Groot et al., 1997).
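To make the notion of ‘general-purpose’ concrete, the following is a minimal Python sketch of a package-neutral record that could hold either a time-ordered trajectory or a non-sequential ensemble. It is an illustration only and not the BioSimGrid schema or API: the class name TrajectoryRecord, its fields, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Coordinate = Tuple[float, float, float]
Frame = List[Coordinate]  # one (x, y, z) triple per particle


@dataclass
class TrajectoryRecord:
    """Hypothetical package-neutral container; not BioSimGrid's own data model."""
    name: str
    source: str        # generating package or method, e.g. "Gromacs" or "CONCOORD"
    granularity: str   # e.g. "all-atom" or "coarse-grain"
    sequential: bool   # True for time-ordered frames, False for an ensemble
    frames: List[Frame] = field(default_factory=list)

    def add_frame(self, frame: Frame) -> None:
        # Every frame must describe the same number of particles.
        if self.frames and len(frame) != len(self.frames[0]):
            raise ValueError("frame has a different particle count")
        self.frames.append(frame)

    @property
    def n_particles(self) -> int:
        return len(self.frames[0]) if self.frames else 0


# A time-ordered molecular dynamics trajectory (toy coordinates).
md = TrajectoryRecord("peptide-run-1", "Gromacs", "all-atom", sequential=True)
md.add_frame([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)])
md.add_frame([(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)])

# A non-sequential ensemble stored in the same structure.
ensemble = TrajectoryRecord("peptide-models", "CONCOORD", "all-atom", sequential=False)
ensemble.add_frame([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)])
```

Because the record keeps the generating package and the granularity as plain metadata, the same structure serves output from any simulation package, and the sequential flag distinguishes a true trajectory from an ensemble.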
