Integrating Various Data Sources for Improved Quality in Reverse Engineering of Gene Regulatory Networks

Mika Gustafsson, Michael Hörnquist

Source Title: Handbook of Research on Computational Methodologies in Gene Regulatory Networks

DOI: 10.4018/978-1-60566-685-3.ch020

Abstract

In this chapter we outline a methodology for reverse engineering gene regulatory networks (GRNs) from various data sources within an ODE framework. The methodology is generally applicable and is suited to handling the broad error distribution present in microarray data. The main effort of this chapter is the exploration of a fully data-driven approach to the integration problem, based on "soft evidence". Integration is here seen as the incorporation of uncertain a priori knowledge, which is therefore relied upon only if it lowers the prediction error. The approach is implemented efficiently through a linear programming (LP) formulation. The LP problem is solved repeatedly with small modifications, so we can restart the primal simplex method from nearby solutions, which makes the execution computationally efficient. We perform a case study on data from the yeast cell cycle, where all verified genes are putative regulators and the a priori knowledge consists of several types of binding data, text-mining and annotation knowledge.

Introduction

Biological systems are intrinsically complex, yet robust and at the same time able to adapt quickly to new situations. To understand, describe and model a wide range of biological systems (involving genes, proteins, metabolites and ecological food webs), networks have served as the unifying language (Barabasi et al. 2004). This description has often revealed a complex network topology. In the case of Gene Regulatory Networks (GRNs), some features are the existence of key genes regulating multiple processes ("hubs"), feedback motifs, and modularity that enhances the robustness of the system (Milo et al. 2002; Barabasi et al. 2004). Furthermore, the dynamical systems seem to be tuned to remain stable by keeping hubs repressed, but still flexible by utilizing, e.g., incoherent feedback loops (Gustafsson et al. In press b; Ma'ayan et al. 2008). In addition to the architectural complications, we know that gene regulation is a non-linear process that includes combinatorial control, saturation and stochasticity. Together, these pieces give rise to an extremely challenging modelling problem, which is made even more complicated by the size of the genome.

Further, the experimental advances of the last decades have resulted in a vast number of large-scale data sets available through public databases. To infer a large-scale GRN it is of utmost importance to take as much of these data as possible into account. Particularly informative for understanding genome-wide gene regulation is the interaction map between Transcription Factors (TFs) and their DNA binding regions. This information may give direct structural properties of the regulatory possibilities. For example, the presence, upstream of gene A, of a binding element for a TF encoded by gene B increases the likelihood that gene B regulates gene A.

Other types of structural information may come from sequence-based predictions, e.g., prediction of putative regulations from the TF binding sites (TFBS), and from common biological knowledge. The latter can be incorporated in a variety of ways, and may come from annotation knowledge or from more "unclean" sources such as text mining. Annotation knowledge may be the collection of detailed knowledge from previous experiments, while text mining offers a way to include the plethora of published biological papers in databases. On a more detailed causal level there is also a large number of time-series expression data sets for mRNA levels (see, e.g., Omnibus at Entrez (PubMed 2007) for collections in a unified format). However, although all these experiments are large-scale, the number of experiments is typically several orders of magnitude smaller than the number of presumptive regulators. Hence, all data at hand should be taken into consideration to overcome the indeterminacy of the reverse engineering problem. The greatest challenge in GRN inference is that the number of genes vastly exceeds the number of experiments, which makes it a tough statistical question. We should therefore strive to avoid introducing more entities into the model. Consequently, we project gene regulation onto the space of genes only, despite the fact that gene regulation is carried out through the interactions of mRNA molecules, proteins and metabolites (Brazhnik et al. 2002; Ptashne et al. 2002). The obtained GRN is thus an effective network of gene-to-gene interactions, and these interactions cannot be interpreted as biochemical reactions.
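
The counting argument above can be made concrete with a small sketch. The linear form and the symbols below ($x_i$, $w_{ij}$, $N$, $M$) are illustrative assumptions; this preview does not state the exact equations used in the chapter. Suppose the expression level $x_i(t)$ of each of $N$ genes is modelled by a gene-only ODE, and that expression is measured at $M$ time points:

```latex
% Assumed illustrative model: effective gene-to-gene interactions only.
\frac{dx_i}{dt} = \sum_{j=1}^{N} w_{ij}\, x_j(t), \qquad i = 1, \dots, N

% For a fixed gene i, measurements at t_1, ..., t_M give roughly M equations
% in the N unknown weights w_{i1}, ..., w_{iN}:
\left.\frac{dx_i}{dt}\right|_{t_k} \approx \sum_{j=1}^{N} w_{ij}\, x_j(t_k),
\qquad k = 1, \dots, M, \qquad M \ll N
```

Under these assumptions each gene has far fewer equations than unknown weights, so expression data alone cannot determine the network, which is why additional data sources and a preference for few interactions per gene become necessary.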

Key Terms in this Chapter

Data Integration: The merging of data stemming from different sources, such as expression data and TF-binding data.

Sparseness: In a regulatory network context, the property that there are relatively few interactions per gene.

Warm Start Optimization: Starting the optimization algorithm from a state that is close to the optimum.

Soft Evidence: The concept of taking multiple pieces of evidence into account as uncertain knowledge. We use the term to stress that the multiple pieces of prior edge information are used to increase the probability of an edge, and not merely as filters.

Linear Programming (LP): An optimization problem in which the objective function is linear and the constraints are linear. Efficient algorithms for solving LP problems exist, notably the simplex method.

Prior Knowledge: Our prior belief about a certain event. In this chapter we fuse different pieces of, e.g., structural data into our prior belief, which enables the integration of structural and expression data.

Least Absolute Deviation (LAD): The minimization criterion on which we base our solutions. It is known to be more robust towards outliers than the more popular least squares method.
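
The link between the LAD and LP terms above can be illustrated with a minimal sketch. It poses a plain LAD fit as a linear program by splitting each residual into two non-negative parts and minimizing their sum. The function name `lad_fit`, the toy data, and the use of SciPy's `linprog` are assumptions made for illustration; this is not the chapter's implementation, and it solves a single problem from scratch rather than reproducing the warm-started primal simplex strategy described in the abstract.

```python
import numpy as np
from scipy.optimize import linprog


def lad_fit(X, y):
    """Least absolute deviation fit of y ~ X @ w, posed as a linear program.

    Hypothetical helper for illustration only.
    """
    n_samples, n_features = X.shape
    # Decision variables: [w (free), u (>= 0), v (>= 0)],
    # where the residual is written as y - X @ w = u - v.
    # Minimizing sum(u + v) then minimizes the sum of absolute residuals.
    cost = np.concatenate([np.zeros(n_features), np.ones(2 * n_samples)])
    A_eq = np.hstack([X, np.eye(n_samples), -np.eye(n_samples)])
    bounds = [(None, None)] * n_features + [(0, None)] * (2 * n_samples)
    result = linprog(cost, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    return result.x[:n_features]


# Toy data: points on a straight line, with one gross outlier.
x = np.arange(10.0)
X = np.column_stack([x, np.ones_like(x)])  # columns: slope term, intercept
y = 2.0 * x + 1.0
y[3] += 100.0  # single corrupted measurement

w_lad = lad_fit(X, y)
w_ls, *_ = np.linalg.lstsq(X, y, rcond=None)  # least squares, for comparison
print("LAD fit (slope, intercept):", w_lad)
print("Least squares fit (slope, intercept):", w_ls)
```

In this toy setting the single outlier pulls the least squares fit away from the underlying line, while the LAD fit is far less affected, which is the robustness property the definition refers to.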
