A New Approach for Sequence Analysis: Illustrating an Expanded Bioinformatics View through Exploring Properties of the Prestin Protein

Kathryn Dempsey (University of Nebraska at Omaha, USA & University of Nebraska Medical Center, USA), Benjamin Currall (Creighton University, USA), Richard Hallworth (Creighton University, USA) and Hesham Ali (University of Nebraska at Omaha, USA & University of Nebraska Medical Center, USA)
Copyright: © 2013 | Pages: 21
DOI: 10.4018/978-1-4666-3604-0.ch079

Understanding the structure-function relationship of proteins is key to understanding biological processes, and can inform the investigation of matters with widespread impact, such as disease and drug intervention. This relationship is dictated at the simplest level by the primary protein sequence. Since useful structures and functions are conserved within biology, a sequence with a known structure-function relationship can be compared to related sequences to aid in novel structure-function prediction. Sequence analysis provides a means for suggesting evolutionary relationships and inferring structural or functional similarity. It is crucial to consider all three aspects (evolution, structure, and function) when comparing sequences, as they influence both the algorithms used and the implications of the results. For example, proteins that are closely related on an evolutionary time scale may have very similar structures but entirely different functions. In contrast, proteins that have undergone convergent evolution may have dissimilar primary structures but perform similar functions. This chapter details how evolution, structure, and function can be taken into account when performing sequence analysis, and proposes an expansion of traditional approaches that directly improves that analysis. The model is applied to a case study of the prestin protein, which shows that the proposed approach provides a better understanding of input and output and can improve the performance of sequence analysis by means of motif detection software.

Computational methods have simplified the analysis of the massive genetic code. Sequence analysis, as it is known, takes advantage of the inherent conservation of genomes by comparing sets of nucleotide or amino acid sequences to infer relationships. One of the first described methods for sequence comparison was published in 1970 by Gibbs and McIntyre, who defined a means of comparing two biopolymer sequences, with the allowance of gaps, to discern homology using the dot method. In 1990, Altschul et al. described a method for approximating sequence alignments that would later become known as NCBI’s BLAST. Since then, various methods for sequence comparison have been proposed that reduce computational cost and runtime and improve accuracy. Currently, a variety of reputable methods exist for creating a multiple sequence alignment, and one may choose the tool appropriate for one's domain with the best speed, the least computational burden, or the best accuracy. This variety only highlights that sequence comparison, by means of alignment or pattern search, remains a mainstay for discerning basic characteristics of a set of sequences.
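To make the dot method concrete, the following is a minimal Python sketch of a dot-matrix comparison. It is an illustration under stated assumptions, not a reproduction of the procedure published by Gibbs and McIntyre: the function names `dot_matrix` and `render`, the `window` and `min_matches` parameters, and the toy sequences are all hypothetical. A cell is marked when the windows starting at the two positions share enough identical residues, so regions of similarity appear as diagonal runs and gaps show up as shifts between diagonals.

```python
def dot_matrix(seq_a, seq_b, window=1, min_matches=1):
    """Build a dot matrix for two sequences.

    Cell [i][j] is True when the windows of length `window` starting at
    seq_a[i] and seq_b[j] agree in at least `min_matches` positions.
    `window` and `min_matches` are illustrative noise filters (assumptions),
    not parameters taken from the original publication.
    """
    rows = len(seq_a) - window + 1
    cols = len(seq_b) - window + 1
    grid = []
    for i in range(rows):
        row = []
        for j in range(cols):
            # Count identical residues inside the two aligned windows.
            matches = sum(
                1 for k in range(window) if seq_a[i + k] == seq_b[j + k]
            )
            row.append(matches >= min_matches)
        grid.append(row)
    return grid


def render(grid):
    """Render the matrix as text: '*' for a match, '.' otherwise."""
    return "\n".join(
        "".join("*" if cell else "." for cell in row) for row in grid
    )


if __name__ == "__main__":
    # Toy sequences invented for illustration; not real protein data.
    print(render(dot_matrix("ACGT", "ACGT")))
```

Run on two identical toy sequences, the sketch marks only the main diagonal. Raising `window` and `min_matches` suppresses isolated chance matches, which matters more for long sequences over small alphabets.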

Even with the many advances in sequence comparison since 1970, one may raise the question of further improvement, especially given the recent explosion of DNA and protein sequence data gathered from high-throughput methods. Sequence comparison methods have come under fire for over-simplifying the analysis process. Wagner et al. (2008) called for ‘intelligent’ sequence analysis, describing alignment methods that combine domain expertise with sophisticated methods to achieve the best alignment possible. The idea of intelligent sequence analysis implies that users can apply their expertise to prepare the input data appropriately, better understand and interpret the output, and even adjust software parameters to best suit their focus. Intelligent sequence analysis therefore requires that users have knowledge of both the biological and informatics fields. Breaking sequence analysis down into its components allows these ideas to be applied more effectively.

In our analysis of nucleotide and amino acid sequences, we first divide methods into experimental analysis or computational analysis. Experimental analysis includes all experimental methods (e.g. sequencing and mass spectrometry) used to determine the sequence of nucleotides and amino acid residues pertaining to genomes as well as experimental methods that determine the evolution, structure, and/or function of those nucleotide and amino acid sequences. Computational analysis includes methods for preparing, accessing, and analyzing sequence data, with capabilities for examination of massive volumes of data using the parallel architectures of supercomputing.
