Nonlinear Stochastic Differential Equations Method for Reverse Engineering of Gene Regulatory Network


Adriana Climescu-Haulica (Université Joseph Fourier, France) and Michelle Quirk (Los Alamos National Laboratory, USA)
DOI: 10.4018/978-1-60566-685-3.ch009

Abstract

In this chapter, we present a method to infer the structure of the gene regulatory network that takes into account both the kinetic molecular interactions and the randomness of the data. The dynamics of the gene expression level are fitted via a nonlinear stochastic differential equation (SDE) model. The drift term of the equation contains the transcription rate, which is related to the architecture of the local regulatory network. The statistical analysis of the data combines the maximum likelihood principle with the Akaike Information Criterion (AIC) through a forward selection strategy to yield a set of specific regulators and their contributions. Tested with expression data on the cell cycle of S. cerevisiae and on the embryogenesis of D. melanogaster, this method provides a framework for the reverse engineering of various gene regulatory networks.
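The abstract does not state the equation itself, so the following is only a minimal sketch of what a nonlinear SDE for a gene expression level can look like, not the chapter's model. It assumes a drift made of a sigmoidal transcription rate minus a linear degradation term, constant noise intensity, and an Euler-Maruyama discretization; the function name `simulate_expression` and all parameters (`weights`, `degradation`, `sigma`) are hypothetical.

```python
import numpy as np


def simulate_expression(x0, regulators, weights, degradation, sigma, dt, rng):
    """Euler-Maruyama path of dX = (g(w . r(t)) - degradation * X) dt + sigma dW.

    regulators: array of shape (n_steps, n_regulators) with regulator levels.
    g is an assumed sigmoid; the chapter's actual drift may differ.
    """
    n_steps = regulators.shape[0]
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        # Assumed transcription rate: sigmoid of the weighted regulator levels.
        rate = 1.0 / (1.0 + np.exp(-(regulators[k] @ weights)))
        drift = rate - degradation * x[k]
        # Brownian increment over a step of length dt has variance dt.
        noise = sigma * np.sqrt(dt) * rng.standard_normal()
        x[k + 1] = x[k] + drift * dt + noise
    return x


rng = np.random.default_rng(0)
regulator_levels = rng.random((50, 2))  # synthetic regulator profiles
path = simulate_expression(0.5, regulator_levels, np.array([1.0, -0.5]),
                           degradation=0.3, sigma=0.1, dt=0.1, rng=rng)
```

In this sketch a positive weight plays the role of an activator and a negative weight that of a repressor, which is one simple way for the drift to reflect the local regulatory network.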

Introduction

A central question in understanding the cell regulatory mechanism is how to infer the structure of the transcriptional regulatory network from experimental data.

Answering this question amounts to deciphering the learning machinery that enables the transcriptional program to adapt over time as the cell progresses through development or undergoes environmental changes. A current trend in tracing the dynamic features of the relationship between genes and their regulators is to analyze time-dependent microarray gene expression data obtained under pertinent conditions. The quantitative analysis of the variation of mRNA levels is expected to allow the transcriptional regulatory network architecture to be reverse-engineered, once this analysis is corroborated by qualitative tools that recognize specific promoter sequences, binding sites and transcription factors. Novel computational strategies arise from this perspective, and a first major impact is expected in disease control.

The modeling methodology for regulatory networks has to reflect the rapidly changing perspective in molecular biology. Recent research, only four years after the completion of the Human Genome Project, reveals that the human protein-coding genes reduce to a smaller set of only 20,500 (Clamp et al., 2007).

This conclusion emerges from the by now well-established observation that the majority of the transcriptional output of the genomes of higher organisms consists of noncoding RNAs (Claverie, 2005). It is assumed that noncoding RNAs are the key to the genetic control architecture, a very complex system of cis- and trans-acting RNA-based regulatory networks and gene-gene communication via RNA-DNA/chromatin, RNA-RNA and RNA-protein interactions. Hence, understanding the regulatory network requires the study of epigenomic phenomena. The protein-coding gene lies in a very plastic environment that is able to learn and adapt to different conditions by means of a large panel of mechanisms. The experimental traceability of these mechanisms is still a work in progress; however, an integrative systems biology approach is sought to combine the different layers of information available from:

  1. the production of various types of experimental data (microarray, combined microarray (He et al., 2006), DNA microarray (Tavazoie et al., 1999), ChIP-chip (Ren et al., 2000));

  2. the results obtained from processing the data with different computational approaches;

  3. the genetic, molecular and biochemical studies.

Accordingly, the computational methods for regulatory network inference have to be built in a robust and extensible manner, so that novel discoveries can be assimilated.

The method proposed in this chapter provides a framework that can adapt to different types of gene expression data. An automated procedure is given; it takes as input a gene expression data set and an ensemble of candidate regulatory genes drawn from up-to-date discoveries. The output provides the structure of the gene regulatory network, expressed as a list of potential activators and repressors for each gene of the input data set.
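To make the input and output of such a procedure concrete, here is a minimal sketch for a single target gene. It is an illustration under strong assumptions, not the chapter's algorithm: a linear Gaussian model on expression increments stands in for the nonlinear SDE likelihood, regulators are added greedily while the AIC decreases, and the sign of each fitted coefficient is used to label a regulator as activator or repressor. The names `gaussian_aic` and `forward_select` are hypothetical.

```python
import numpy as np


def gaussian_aic(y, X):
    """AIC (2k - 2 log L) of a least-squares fit y ~ X with Gaussian noise."""
    n = y.size
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = max(resid @ resid / n, 1e-12)  # ML estimate of the noise variance
    log_lik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1.0)
    k = X.shape[1] + 1  # regression coefficients plus the noise variance
    return 2 * k - 2 * log_lik, beta


def forward_select(target, candidates, names):
    """Greedy forward selection of regulators for one target gene by AIC.

    target: expression levels of the target gene, shape (T,).
    candidates: expression levels of candidate regulators, shape (T, m).
    """
    dy = np.diff(target)  # increments of the target expression level
    base = np.column_stack([np.ones(dy.size), target[:-1]])  # intercept, self term
    chosen = []
    best_aic, best_beta = gaussian_aic(dy, base)
    while True:
        trials = []
        for j in range(candidates.shape[1]):
            if j in chosen:
                continue
            X = np.column_stack([base, candidates[:-1, chosen + [j]]])
            aic, beta = gaussian_aic(dy, X)
            trials.append((aic, j, beta))
        if not trials:
            break
        aic, j, beta = min(trials, key=lambda t: t[0])
        if aic >= best_aic:  # stop when no candidate lowers the AIC
            break
        chosen.append(j)
        best_aic, best_beta = aic, beta
    coefs = best_beta[2:]  # one coefficient per chosen regulator, in order
    activators = [names[j] for j, c in zip(chosen, coefs) if c > 0]
    repressors = [names[j] for j, c in zip(chosen, coefs) if c < 0]
    return activators, repressors, best_aic
```

Running `forward_select` once per gene of the input data set would yield the per-gene lists of potential activators and repressors described above. Because the AIC penalizes each added parameter, the greedy loop stops as soon as no remaining candidate improves the fit enough to justify its inclusion.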

The following section presents the principal characteristics of this method in light of current research on gene regulatory network inference from expression data. The first part of the main thrust of the chapter describes the construction of the nonlinear SDE used to model the dynamics of a target gene expression level, together with the statistical analysis and the corresponding algorithm. The second part shows that, applied to the expression measurements of the mRNA levels of Saccharomyces cerevisiae (Spellman et al., 1998), this model improves the fitting results from previous studies. We also provide the analysis of time-dependent gene expression measurements on Drosophila melanogaster embryogenesis (Tomancak et al., 2002).

Our goal is to provide tools for large-scale investigation of transcriptomic data. We therefore describe an improved method that is able to extract information on the cell regulatory mechanism and, potentially, to contribute to the reverse engineering of the transcriptional regulatory network.

Key Terms in this Chapter

transcription rate: the variation in time of the mRNA production for a given gene

stochastic calculus: the framework that makes it possible to define a rigorous theory of integration for stochastic processes

local regulatory network: the set of regulators corresponding to a single target gene

model selection: the procedure by which a statistical model is selected from a set of potential models, given the data; usually this corresponds to the choice of a set of parameters

regulation function: the quantitative, time-dependent form in which the regulators act on the mRNA production of the target gene

goodness of fit: a measure of how well a statistical model fits a set of observations

learning relationship: a plastic link between two entities (in our case, genes) that adapts over time in response to various stimuli
