Detection and Employment of Biological Sequence Motifs

Marjan Trutschl (Louisiana State University – Shreveport, USA & Louisiana State University Health – Shreveport, USA), Phillip C. S. R. Kilgore (Louisiana State University – Shreveport, USA), Rona S. Scott (Louisiana State University Health – Shreveport, USA), Christine E. Birdwell (Louisiana State University Health – Shreveport, USA) and Urška Cvek (Louisiana State University – Shreveport, USA & Louisiana State University Health – Shreveport, USA)
Copyright: © 2015 | Pages: 31
DOI: 10.4018/978-1-4666-6611-5.ch005

Abstract

Biological sequence motifs are short, biologically significant nucleotide or amino acid sequences. They are attractive to scientists because they are usually highly conserved and have structural and regulatory implications. In this chapter, the authors address the nature of motifs and review the algorithms, techniques, and tools used to detect them. They describe several methods for de novo motif discovery, covering algorithms based on Gibbs sampling, expectation maximization, Bayesian inference, covariance models, and discriminative learning. The authors present the tools and their requirements and weigh their individual benefits and challenges. Since interpreting a large set of results can pose significant challenges, they discuss several methods for handling data, ranging from visualization to integration into pipelines and curated databases. Finally, the authors show practical applications of these data with examples.
Chapter Preview

Background

The concept of motifs and their relationship to regulation of the cellular environment can be traced back to the late 1950s. Although regulatory elements had been shown to exist in DNA as early as 1951 (McClintock, 1951), it was the work of Jacques Monod and François Jacob on the regulation of lactose metabolism in Escherichia coli that led to the first generalized theory of regulatory elements. By studying the lac repressor, a protein that represses transcription of the genes encoding the proteins used in lactose metabolism, Jacob et al. were able to develop a framework accounting for transcriptional regulation (Jacob, Perrin, Sanchez, & Monod, 1960). This seminal work considered only repressor elements and was primitive in comparison to modern views of transcriptional regulation; however, it is notable for introducing the concept of an operator, a segment of DNA to which regulatory proteins may bind.

At about the same time, a number of motifs were being identified within gene promoter regions. Promoters are DNA sequences that regulate the initiation of transcription of nearby genes. The 1970s saw the discovery of two conserved motifs that recruit the general transcription factors and RNA polymerase (Hurwitz, 1960; Stephens, 1960) to promoters: the TATA box “TATAA” (Rifton, Goldberg, Karp, & Hogness, 1978) and the Pribnow box “TATAAT” (Pribnow, 1975). The former is also called the Goldberg-Hogness box in eukaryotes, and the latter is known as the -10 sequence in bacterial promoters. Additional promoter motifs have since been identified, and they underscore the difference in regulatory complexity between prokaryotes and eukaryotes. Bacterial promoters usually have three unique motifs, while eukaryotic promoters can have up to seven (Clancy, 2008).
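
To make the idea of a promoter motif concrete, the following minimal Python sketch scans a DNA string for exact copies of the two consensus sequences quoted above. It is an illustration only, not one of the discovery methods covered in the chapter: the function name `find_motif` and the toy sequence are hypothetical, and real promoter elements tolerate mismatches that an exact-match search would miss.

```python
import re

# Consensus strings as quoted in the text. Exact matching is a simplifying
# assumption; real promoter motifs are degenerate.
MOTIFS = {"TATA box": "TATAA", "Pribnow box": "TATAAT"}


def find_motif(sequence, motif):
    """Return 0-based start positions of every exact match, including overlaps."""
    # A zero-width lookahead lets finditer report overlapping occurrences.
    pattern = re.compile("(?=" + re.escape(motif) + ")")
    return [match.start() for match in pattern.finditer(sequence.upper())]


# Hypothetical toy sequence, made up for illustration only.
toy_sequence = "GGCTATAATGCGTATAAGC"

hits = {name: find_motif(toy_sequence, motif) for name, motif in MOTIFS.items()}
for name, positions in hits.items():
    print(name, positions)
```

Because “TATAA” is a prefix of “TATAAT”, every Pribnow box hit is also reported as a TATA box hit. This is one reason a bare consensus string is a weak detector when used on its own.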
