Introduction
With the surge of genomics, the availability of large amounts of sequence and genome data has fundamentally changed traditional phylogenetics (Delsuc, Brinkmann, & Philippe, 2005). However, the gene tree and species tree problem remains a fundamental problem in phylogenetics because of its biological and algorithmic complexity (Ma, Li, & Zhang, 2000; Oliver, 2008; Rasmussen & Kellis, 2007). The problem is that gene trees, which are phylogenies inferred from individual genes, are often incongruent with the species tree. Incongruence here simply means that the two trees have different branching patterns in their topologies (Rasmussen & Kellis, 2007). The species phylogeny reflects the true evolutionary history of the species, in which all lineages result from speciation and divergence. In contrast, a gene tree represents the evolutionary history of an individual gene across a set of organisms, which may differ from the evolutionary history of the species. Theoretically, it can be viewed as a hypothesis about how a gene evolves through gene duplication, gene loss, and nucleotide substitution.
Although incongruence between gene trees and species trees can be observed at all taxonomic levels, it occurs more often among closely related species. For a group of species, the probability of incongruence between a gene tree and the species tree can be computed as $P = \frac{2}{3} e^{-T/(2N)}$, where $T$ is the speciation time and $N$ is the effective population size. For a fixed effective population size, distantly related species have a longer speciation time interval than closely related species. Thus, the probability of incongruence between gene trees and the species tree is usually higher for a set of closely related species (Nei & Kumar, 2000).
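To make the dependence on speciation time concrete, the following minimal sketch evaluates the incongruence probability under the assumption that it takes the standard three-species coalescent form P = (2/3) exp(-T / (2N)), with T measured in generations. The function name, the population size, and the time values are hypothetical and chosen only for illustration; they do not come from the article.

```python
import math


def incongruence_probability(t_generations: float, n_effective: float) -> float:
    """Probability that a gene tree disagrees with the species tree.

    Assumes the three-species form P = (2/3) * exp(-T / (2N)), where
    T is the time between speciation events (in generations) and
    N is the effective population size.
    """
    if t_generations < 0:
        raise ValueError("speciation time must be non-negative")
    if n_effective <= 0:
        raise ValueError("effective population size must be positive")
    return (2.0 / 3.0) * math.exp(-t_generations / (2.0 * n_effective))


if __name__ == "__main__":
    n = 10_000  # hypothetical effective population size
    # Hypothetical speciation time intervals, from very short to long.
    for t in (0, 10_000, 20_000, 100_000):
        p = incongruence_probability(t, n)
        print(f"T = {t:>7} generations, N = {n}: P = {p:.4f}")
```

Under this assumed form, the probability starts at 2/3 when T = 0, falls to about 0.245 when T = 2N, and approaches zero as T grows, which matches the observation that incongruence is most likely among closely related species.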
Such incongruence may originate from biological and technical factors. Biologically, factors such as gene duplication, gene loss, lineage sorting, or horizontal gene transfer can affect the evolutionary signature of a gene and cause a gene tree to differ from the species phylogeny (Nei & Kumar, 2000; Whitaker, McConkey, & Westhead, 2009). Technically, factors such as inappropriate substitution models, insufficient data sampling, or artifacts of phylogenetic reconstruction methods also play important roles in producing an incongruent gene tree (Huelsenbeck, 1995; Hahn, 2007). Given these biological and algorithmic complexities, successfully resolving the incongruence between gene trees and species trees may have a fundamental impact on molecular evolution and phylogenetics.