Prioritize Transcription Factor Binding Sites for Multiple Co-Expressed Gene Sets Based on Lasso Multinomial Regression Models

Hong Hu, Yang Dai
DOI: 10.4018/978-1-5225-0353-8.ch008

Abstract

Computational prediction of cis-regulatory elements for a set of co-expressed genes based on sequence analysis yields an overwhelming number of potential transcription factor binding sites. This makes it challenging to prioritize the functional transcription factors, and their binding sites in the regulatory regions of the target genes, that are relevant to a given gene expression study. A novel approach based on lasso multinomial regression models is proposed to address this problem. We examine the performance of the lasso models using time-course microarray data obtained from a comprehensive study of gene expression profiles in mouse skin and mucosa over all stages of wound healing.

Experimental Identification of Transcription Factor Binding Sites

Large-scale experimental methods have been developed to identify transcription factor binding sites (TFBSs) on the human genome for a given transcription factor (TF). Chromatin immunoprecipitation followed by massively parallel sequencing (ChIP-seq) or microarray hybridization (ChIP-array) allows researchers to capture the genome-wide binding profile of a TF under a given experimental condition (Iyer et al., 2001; Johnson, Mortazavi, Myers, & Wold, 2007; Landt et al., 2012; Robertson et al., 2007). The Encyclopedia of DNA Elements (ENCODE) Consortium, which aims to build a comprehensive list of annotations for functional elements in the human genome, has performed a large number of ChIP-seq experiments (Dunham et al., 2012). However, this technology is limited by the availability of TF-specific antibodies and by high experimental cost. Even with the consortium effort, the ChIP-seq data generated in the ENCODE project cover only a limited number of TFs in a few cell types (Dunham et al., 2012). Although the cost of ChIP-seq studies is falling with the rapid development of next-generation sequencing technology, this experimental approach is not always applicable, because in some situations the key set of TFs that regulate the underlying process is not known a priori.
