Effect of site-specific heterogeneous evolution on phylogenetic reconstruction: A simple evaluation

Institute of Biomedical Sciences, Fudan University, Shanghai 200433, China.
Gene (Impact Factor: 2.14). 09/2008; 441(1-2):156-62. DOI: 10.1016/j.gene.2008.08.003
Source: PubMed


Recent studies have shown that heterogeneous evolution, a factor long neglected, may mislead phylogenetic analysis. We evaluate the effect of heterogeneous evolution on phylogenetic analysis, using 18 fish mitogenomic coding sequences as an example. Using the software DIVERGE, we identify 198 amino acid sites that have experienced heterogeneous evolution. After these sites are removed, the remaining sites are shown to be virtually homogeneous in evolutionary rate. There are some differences between the phylogenetic trees built with heterogeneous sites ("before tree") and without them ("after tree"). Our study demonstrates that identifying and removing sites with heterogeneous evolution is an effective approach for phylogenetic reconstruction, and suggests that researchers can use DIVERGE to remove the influence of heterogeneous evolution before reconstructing phylogenetic trees.
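The site-removal step described in the abstract can be illustrated with a minimal sketch. It assumes the alignment is held as a plain dictionary of equal-length sequences and that the heterogeneous sites are already available as 1-based column positions (for instance, copied by hand from a DIVERGE report). The function name `remove_sites` and the toy sequences are hypothetical and are not part of DIVERGE or of the study's pipeline.

```python
def remove_sites(alignment, sites_to_remove):
    """Return a copy of the alignment without the given 1-based columns.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    sites_to_remove: iterable of 1-based column positions to drop
                     (assumed to be the sites flagged as heterogeneous).
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) > 1:
        raise ValueError("sequences must be aligned to equal length")
    # Convert 1-based positions to 0-based indices for Python strings.
    drop = {site - 1 for site in sites_to_remove}
    return {
        name: "".join(ch for i, ch in enumerate(seq) if i not in drop)
        for name, seq in alignment.items()
    }


# Toy alignment (made-up data, for illustration only).
toy = {
    "taxon_A": "MKTAYL",
    "taxon_B": "MRTAFL",
    "taxon_C": "MKSAYI",
}

# Drop columns 2 and 5, as if they had been flagged as heterogeneous.
filtered = remove_sites(toy, [2, 5])
for name, seq in filtered.items():
    print(name, seq)
```

The filtered alignment would then be passed to whatever tree-building program is used for the "after tree"; the original alignment gives the "before tree".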

  • ABSTRACT: Variation in substitution rates among evolutionary lineages (among-lineage rate variation or ALRV) has been reported to negatively affect the estimation of phylogenies. When the substitution processes underlying ALRV are modeled inadequately, non-sister taxa with similar substitution rates are estimated incorrectly as sister species due to long-branch attraction. Recent advances in modeling site-specific rate variation (heterotachy) have reduced the impacts of ALRV on phylogeny estimation in several empirical and simulated datasets. However, the addition of parameters to the substitution model reduces power to estimate each parameter correctly, which can also lead to incorrect phylogeny estimation. A potential solution to this problem is to identify the levels of ALRV that negatively impact phylogeny estimation such that molecular markers with non-deleterious levels of ALRV can be identified. To this end, we used analyses of empirical and simulated gene datasets to evaluate whether levels of ALRV identified in a mitochondrial genomic dataset for salamanders negatively impacted phylogeny estimation. We simulated data with and without ALRV, holding all other evolutionary parameters constant, and compared the phylogenetic performance of both simulated and empirical datasets. Overall, we found limited, positive effects of ALRV on phylogeny estimation in this dataset, the majority of which resulted from an increase in substitution rate on short branches. We conclude that ALRV does not always negatively impact phylogeny estimation. Therefore, ALRV can likely be disregarded as a criterion for marker selection in comparable phylogenetic studies.
    Molecular Phylogenetics and Evolution 03/2010; 54(3):849-56. DOI:10.1016/j.ympev.2009.12.025 · 3.92 Impact Factor
  • Xun Gu
    ABSTRACT: Evolutionary genomics is a relatively new research field with the ultimate goal of understanding the underlying evolutionary and genetic mechanisms for the emergence of genome complexity under changing environments. It stems from an integration of high throughput data from functional genomics, statistical modelling and bioinformatics, and the procedure of phylogeny-based analysis. This book summarises the statistical framework of evolutionary genomics, and illustrates how statistical modelling and testing can enhance our understanding of functional genomic evolution. The book reviews the recent developments in methodology from an evolutionary perspective of genome function, and incorporates substantial examples from high throughput data in model organisms. In addition to phylogeny-based functional analysis of DNA sequences, the book includes discussion on how new types of functional genomic data (e.g., microarray) can provide exciting new insights into the evolution of genome function, which can lead in turn to an understanding of the emergence of genome complexity during evolution.
  • ABSTRACT: Protein evolution includes the birth and death of structural motifs. For example, a zinc finger or a salt bridge may be present in some, but not all, members of a protein family. We propose that such transitions are manifest in sequence phylogenies as concerted shifts in substitution rates of amino acids that are neighbors in a representative structure. First, we identified rate shifts in a quartet from the Fpg/Nei family of base excision repair enzymes using a method developed by Xun Gu and coworkers. We found the shifts to be spatially correlated, more precisely, associated with a flexible loop involved in bacterial Fpg substrate specificity. Consistent with our result, sequences and structures provide convincing evidence that this loop plays a very different role in other family members. Second, then, we developed a method for identifying latent protein structural characters (LSC) given a set of homologous sequences based on Gu's method and proximity in a high-resolution structure. Third, we identified LSC and assigned states of LSC to clades within the Fpg/Nei family of base excision repair enzymes. We describe seven LSC; an accompanying Proteopedia page describes these in greater detail and facilitates 3D viewing. The LSC we found provided a surprisingly complete picture of the interaction of the protein with the DNA capturing familiar examples, such as a Zn finger, as well as more subtle interactions. Their preponderance is consistent with an important role as phylogenetic characters. Phylogenetic inference based on LSC provided convincing evidence of independent losses of Zn fingers. Structural motifs may serve as important phylogenetic characters and modeling transitions involving structural motifs may provide a much deeper understanding of protein evolution.
    PLoS ONE 10/2011; 6(10):e25246. DOI:10.1371/journal.pone.0025246 · 3.23 Impact Factor
