Modeling and Implementing Scientific Hypothesis

Fabio Porto (Data Extreme Lab (DEXL), National Laboratory of Scientific Computing (LNCC), Petropolis, Brazil), Ramon G. Costa (Departamento de Ciência da Computação, Universidade Federal de Lavras, Lavras, Brazil), Ana Maria de C. Moura (Data Extreme Lab (DEXL), National Laboratory of Scientific Computing (LNCC), Petropolis, Brazil) and Bernardo Gonçalves (Data Extreme Lab (DEXL), National Laboratory of Scientific Computing (LNCC), Petropolis, Brazil)
Copyright: © 2015 | Pages: 13
DOI: 10.4018/JDM.2015040101


Computational simulations are important tools that enable scientists to study complex phenomena for which little data is available or which would require dangerous human interventions. They involve complex and heterogeneous components, including mathematical equations, hypotheses, computational models and data. To support in-silico scientific research, this complex environment needs to be modeled and its data and metadata managed, enabling model evolution, prediction analysis and decision-making. This paper proposes a scientific hypothesis conceptual model that allows scientists to represent the phenomenon being investigated and the hypotheses formulated in the attempt to explain it, and that provides the ability to store the results of experiment simulations with their corresponding provenance metadata. The proposed model supports the scientific life-cycle through provenance management, exchange of hypotheses as data, experiment reproducibility, model steering and simulation result analyses. A cardiovascular numerical simulation illustrates the applicability of the model, and an initial implementation using SciDB is discussed.
Article Preview


The experimental and computational facilities available today allow large-scale scientific projects to produce unprecedented amounts of experimental and simulation data. This wealth of data needs to be structured and managed in a way that readily makes sense to scientists, so that relevant knowledge may be extracted to contribute to the scientific investigation process. Current data management technologies are clearly unable to cope with scientists' requirements (Stonebraker et al., 2009), despite the efforts the community has dedicated to the area. Such efforts can be measured by the community's support for an international conference on scientific and statistical database management (SSDBM), which has run for almost 20 years, for various workshops on associated themes, and for important projects such as POSTGRES at Berkeley (Stonebraker and Rowe, 1986). All these initiatives have contributed considerably to extending database technology in support of scientific data management.

Given such a panorama, one may ask what could still be missing in the support for scientific applications from a database viewpoint. In this paper, we investigate this question from the perspective of data management support for the complete scientific life-cycle, from hypothesis formulation to experiment validation. As it turns out, efforts in this area have been steered towards supporting the in-silico experimental phase of the scientific life-cycle (Mattoso et al., 2010), involving the execution of scientific workflows and the management of the associated data and metadata. The complete scientific life-cycle extends beyond that, and includes the studied phenomenon, the formulated hypotheses and the computational models. The lack of support for these elements in current in-silico approaches leaves extremely important information out of reach of the scientific community.

This paper helps fill this gap by introducing a scientific hypothesis conceptual model. In this model, the starting point of a scientific investigation is the description of the natural phenomenon. The studied phenomenon occurs in nature in some space-time frame, in which selected physical quantities are observed. Scientific hypotheses conceptually represent the scientific models a scientist conceives to explain the observed phenomenon. Testing hypotheses in-silico involves running experiments that represent the scientific models and confronting simulated data with collected observations.
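The relationships described above can be illustrated with a minimal sketch. It is not the schema proposed in the paper: the class names (`Phenomenon`, `Hypothesis`, `Experiment`), their fields, and the choice of root-mean-square error as the way to confront simulated data with observations are all assumptions made here for illustration only.

```python
from dataclasses import dataclass, field
from math import sqrt
from typing import Dict, List


@dataclass
class Phenomenon:
    """A natural phenomenon observed in some space-time frame (hypothetical structure)."""
    name: str
    space_time_frame: str
    # Collected observations, keyed by physical quantity name.
    observations: Dict[str, List[float]] = field(default_factory=dict)


@dataclass
class Hypothesis:
    """A candidate explanation of a phenomenon, tied to the parameters of a model."""
    statement: str
    phenomenon: Phenomenon
    model_parameters: Dict[str, float] = field(default_factory=dict)


@dataclass
class Experiment:
    """One in-silico run that tests a hypothesis."""
    hypothesis: Hypothesis
    # Simulated values, keyed by the same physical quantity names.
    simulated: Dict[str, List[float]] = field(default_factory=dict)

    def rmse(self, quantity: str) -> float:
        """Confront simulated data with observations (RMSE is an assumed metric)."""
        observed = self.hypothesis.phenomenon.observations[quantity]
        simulated = self.simulated[quantity]
        if not observed or len(observed) != len(simulated):
            raise ValueError("observed and simulated series must be non-empty and aligned")
        squared = sum((s - o) ** 2 for s, o in zip(simulated, observed))
        return sqrt(squared / len(observed))


# Usage with made-up values: one phenomenon, one hypothesis, one simulation run.
flow = Phenomenon("arterial blood flow", "one cycle", {"pressure": [80.0, 120.0, 80.0]})
h1 = Hypothesis("a 1D model explains the pressure wave", flow, {"viscosity": 0.04})
run = Experiment(h1, {"pressure": [82.0, 118.0, 80.0]})
print(run.rmse("pressure"))
```

Keeping the hypothesis and its model parameters attached to each experiment is what lets a simulated result be traced back to the explanation it was meant to test.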

The proposed conceptual model is the basis for registering the complete scientific exploration life-cycle. This approach brings the following benefits:

  • Extends the in-silico support beyond the experimental phase and towards the complete scientific life-cycle;

  • Supports provenance information regarding scientific hypotheses evolution;

  • Facilitates communication among scientists in a research group (by exposing their mental models);

  • Supports the reproducibility of experiments (by enhancing the experiment metadata with hypotheses and models);

  • Supports model steering (by investigating model evolution);

  • Supports experiment result analyses (by relating models, model parameters and simulated results).

To illustrate the use of the proposed conceptual model, a case study is discussed, based on models of the human cardiovascular system. The phenomenon is simulated by a complex and data-intensive numerical simulation that runs for days to compute a single blood cycle on a cluster with 1200 nodes. The analyses of simulated results are supported by SciDB (Cudre-Mauroux et al., 2009), a multi-dimensional array database system.

The remainder of this paper is structured as follows. We first discuss related work. The next section describes a use case concerning the simulation of the human cardiovascular system. The following section presents the Hypothesis Conceptual Model, which integrates scientific hypotheses with the in-silico experiment entities. This model is the basis for a database prototype, developed using SciDB, in support of the cardiovascular scientific hypothesis; the prototype is described in the section after that. Finally, we conclude the paper with suggestions for future work.
