Efficient and Robust Analysis of Large Phylogenetic Datasets

Sven Rahmann (Bielefeld University, Germany), Tobias Müller (University of Würzburg, Germany), Thomas Dandekar (University of Würzburg, Germany) and Matthias Wolf (University of Würzburg, Germany)
Copyright: © 2006 | Pages: 14
DOI: 10.4018/978-1-59140-863-5.ch006

The goal of phylogenetics is to reconstruct ancestral relationships between different taxa, e.g., different species in the tree of life, by means of certain characters, such as genomic sequences. We consider the prominent problem of reconstructing the basal phylogenetic tree topology when several subclades have already been identified or are well known by other means, such as morphological characteristics. Whereas most available tools attempt to estimate a fully resolved tree from scratch, the profile neighbor-joining (PNJ) method focuses directly on the mentioned problem and has proven a robust and efficient method for large-scale datasets, especially when used in an iterative way. We describe an implementation of this idea, the ProfDist software package, which is freely available, and apply the method to estimate the phylogeny of the eukaryotes. Overall, the PNJ approach provides a novel effective way to mine large sequence datasets for relevant phylogenetic information.

Complete Chapter List

Table of Contents
Hui-Huang Hsu
Chapter 1
Hui-Huang Hsu
Bioinformatics uses information technologies to facilitate the discovery of new knowledge in molecular biology. Among the information technologies...
Introduction to Data Mining in Bioinformatics
Chapter 2
Li Liao
Recently, clustering and classification methods have seen many applications in bioinformatics. Some are simply straightforward applications of...
Hierarchical Profiling, Scoring and Applications in Bioinformatics
Chapter 3
D. Frank Hsu, Yun-Sheng Chung, Bruce S. Kristal
Combination methods have been investigated as a possible means to improve performance in multi-variable (multi-criterion or multi-objective)...
Combinatorial Fusion Analysis: Methods and Practices of Combining Multiple Scoring Systems
Chapter 4
Hsuan T. Chang
This chapter introduces various visualization (i.e., graphical representation) schemes of symbolic DNA sequences, which are basically represented by...
DNA Sequence Visualization
Chapter 5
Simon Lin, Salvatore Mungal, Richard Haney, Edward F. Patz Jr., Patrick McConnell
This chapter provides a rudimentary review of the field of proteomics as it applies to mass spectrometry, data handling, and analysis. It points out...
Proteomics with Mass Spectrometry
Chapter 6
Sven Rahmann, Tobias Müller, Thomas Dandekar, Matthias Wolf
The goal of phylogenetics is to reconstruct ancestral relationships between different taxa, e.g., different species in the tree of life, by means of...
Efficient and Robust Analysis of Large Phylogenetic Datasets
Chapter 7
Tatsuya Akutsu
This chapter provides an overview of computational problems and techniques for protein threading. Protein threading is one of the most powerful...
Algorithmic Aspects of Protein Threading
Chapter 8
Arpad Kelemen, Yulan Liang
Pattern differentiations and formulations are two main research tracks for heterogeneous genomic data pattern analysis. In this chapter, we develop...
Pattern Differentiations and Formulations for Heterogeneous Genomic Data through Hybrid Approaches
Chapter 9
Vincent S. Tseng, Ching-Pin Kao
In recent years, clustering analysis has even become a valuable and useful tool for in-silico analysis of microarray or gene expression data....
Parameterless Clustering Techniques for Gene Expression Analysis
Chapter 10
Junying Zhang
This chapter introduces gene selection approaches in microarray data analysis for two purposes: cancer classification and tissue heterogeneity...
Joint Discriminatory Gene Selection for Molecular Classification of Cancer
Chapter 11
Takashi Kido
This chapter introduces computational methods for detecting complex disease loci with haplotype analysis. It argues that the haplotype analysis...
A Haplotype Analysis System for Genes Discovery of Common Diseases
Chapter 12
Peng-Yeng Yin, Shyong-Jian Shyu, Guan-Shieng Huang, Shuang-Te Liao
With the advent of new sequencing technology for biological data, the number of sequenced proteins stored in public databases has become an...
A Bayesian Framework for Improving Clustering Accuracy of Protein Sequences Based on Association Rules
Chapter 13
Byung-Hoon Park, Phuongan Dam, Chongle Pan, Ying Xu, Al Geist, Grant Heffelfinger, Nagiza F. Samatova
Protein-protein interactions are fundamental to cellular processes. They are responsible for phenomena like DNA replication, gene transcription...
In Silico Recognition of Protein-Protein Interaction: Theory and Applications
Chapter 14
Christopher Besemann, Anne Denton, Ajay Yekkirala, Ron Hutchison, Marc Anderson
In this chapter, we discuss the use of differential association rules to study the annotations of proteins in one or more interaction networks....
Differential Association Rules: Understanding Annotations in Protein Interaction Networks
Chapter 15
Francisco M. Couto, Mario J. Silva
This chapter introduces the use of Text Mining in scientific literature for biological research, with a special focus on automatic gene and protein...
Mining BioLiterature: Toward Automatic Annotation of Genes and Proteins
Chapter 16
Kwangmin Choi, Sun Kim
Understanding the genetic content of a genome is a very important but challenging task. One of the most effective methods to annotate a genome is to...
Comparative Genome Annotation Systems
About the Authors