1. Introduction
Recent years have seen the advent of DNA microarray technology, and information retrieval has proved essential for reconstructing gene regulatory networks (GRNs) from temporal gene expression data. GRNs are important for understanding many unknown biological functions and processes. They give insight into the activities of genes and provide knowledge about the transcriptional regulation among them (Aalto et al., 2020). A GRN is a virtual network of genes and their mutual influences, in which each node is a gene and each edge is the influence of a regulator on a target gene, which either activates or suppresses the target gene’s ability to form protein (Morgan et al., 2019). GRNs have been applied successfully in diagnostics and contribute to the identification of essential genes (Xie et al., 2020). A well-known issue in the analysis of temporal data for GRN reconstruction is the curse of dimensionality (Altman & Krzywinski, 2018).
Among the computational models used to reconstruct GRNs from time-series data, researchers have adopted several methods (Delgado & Gómez-Vela, 2019; Razaghi-Moghadam & Nikoloski, 2020), such as Boolean networks, Bayesian networks (BNs), dynamic Bayesian networks (DBNs), and the linear additive genetic model. The Boolean network model (Barman & Kwon, 2018) considers only two states for each gene: active and inactive. It does not take into account intermediate effects on the genes, which causes information loss. Bayesian network models (Sanchez-Castillo et al., 2018) are graph-based models that represent a genetic network as a directed acyclic graph. They effectively handle noise, missing values, and the random nature of gene expression data; however, they do not take into account the dynamic nature of GRNs or the temporal aspect of the data. These limitations of BNs were overcome by DBNs (Adabor & Acquaah-Mensah, 2019). The linear additive genetic model (Luque-Baena et al., 2014) can identify linear regulatory relationships but does not consider the non-linear behaviour of GRNs.
Motivation: Considering the limitations of these models, researchers adopted the recurrent neural network (RNN) for the problem of GRN reconstruction. The RNN model captures both the temporal nature of gene expression data and the non-linear dynamics of gene regulation, which are essential for GRN reconstruction. It can also account for the feedforward and feedback loops of the genetic regulatory network (Biswas & Acharyya, 2016, 2018).
Figure 1. Overview of the workflow
The input to the RNN model is time-series data containing the expression levels x_i(t) of genes at consecutive time points. The expression level x_i(t+1) of a gene i at the current time point (t+1) of an RNN layer is simulated from the expression levels x_j(t) of the regulator genes j at the previous time point t, together with the set of genetic network parameters. In terms of GRN topology, very few regulator genes j influence a given target gene i, which implies that genetic network connectivity is sparse. When a GRN is modelled with an RNN, the parameters that require training are the weights, the bias term, and the time constant associated with each gene.
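The update step described above can be illustrated with a short sketch. The preview does not state the update rule itself, so the code below assumes the discrete-time form commonly used in RNN-based GRN modelling: a sigmoid of the weighted regulator inputs plus a bias, blended with the previous expression level according to a per-gene time constant. The names `rnn_grn_step`, `simulate`, `W`, `beta`, `tau`, and `dt` are illustrative and do not come from the article, and the parameter values are made up for demonstration.

```python
import numpy as np


def rnn_grn_step(x, W, beta, tau, dt=1.0):
    """One update of an RNN-style GRN model (assumed form, not taken from the article).

    x    : expression levels x_j(t) of all genes at the previous time point
    W    : weight matrix; W[i, j] is the influence of regulator j on target i
    beta : bias term for each gene
    tau  : time constant for each gene
    """
    x = np.asarray(x, dtype=float)
    # Net regulatory input to each target gene i: sum_j W[i, j] * x_j(t) + beta_i
    net = W @ x + beta
    # The sigmoid keeps the regulated expression level in (0, 1)
    activation = 1.0 / (1.0 + np.exp(-net))
    # The time constant controls how quickly each gene moves toward its new level
    rate = dt / np.asarray(tau, dtype=float)
    return rate * activation + (1.0 - rate) * x


def simulate(x0, W, beta, tau, n_steps, dt=1.0):
    """Roll the model forward and return an array of shape (n_steps + 1, n_genes)."""
    trajectory = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        trajectory.append(rnn_grn_step(trajectory[-1], W, beta, tau, dt))
    return np.stack(trajectory)


# Hypothetical 3-gene network with sparse connectivity: most entries of W are zero.
W = np.array([
    [0.0, 2.0, 0.0],   # gene 0 is activated by gene 1
    [0.0, 0.0, -1.5],  # gene 1 is suppressed by gene 2
    [0.0, 0.0, 0.0],   # gene 2 has no regulators
])
beta = np.zeros(3)
tau = np.array([2.0, 2.0, 2.0])

trajectory = simulate([0.2, 0.8, 0.5], W, beta, tau, n_steps=10)
```

In this sketch a positive weight stands for activation and a negative weight for suppression, and the zero entries of `W` reflect the sparsity noted above. Training would then amount to fitting `W`, `beta`, and `tau` so that the simulated trajectory matches the observed time series.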