Methods for Choosing Clusters in Phylogenetic Trees

Methods for Choosing Clusters in Phylogenetic Trees

Tom Burr (Los Alamos National Laboratory, USA)
Copyright: © 2005 |Pages: 6
DOI: 10.4018/978-1-59140-557-3.ch138
OnDemand PDF Download:
No Current Special Offers


One data mining activity is cluster analysis, of which there are several types. One type deserving special attention is clustering that arises due to evolutionary relationships among organisms. Genetic data is often used to infer evolutionary relations among a collection of species, viruses, bacterial, or other taxonomic units (taxa). A phylogenetic tree (Figure 1, top) is a visual representation of either the true or the estimated branching order of the taxa, depending on the context. Because the taxa often cluster in agreement with auxiliary information, such as geographic or temporal isolation, a common activity associated with tree estimation is to infer the number of clusters and cluster memberships, which is also a common goal in most applications of cluster analysis. However, tree estimation is unique because of the types of data used and the use of probabilistic evolutionary models which lead to computationally demanding optimization problems. Furthermore, novel methods to choose the number of clusters and cluster memberships have been developed and will be described here. The methods include a unique application of model-based clustering, a maximum likelihood plus bootstrap method, and a Bayesian method based on obtaining samples from the posterior probability distribution on the space of possible branching orders.

Complete Chapter List

Search this Book: