Article

A method for inferring the rate of evolution of homologous characters that can potentially improve phylogenetic inference, resolve deep divergence and correct systematic biases.

Molecular Evolution and Bioinformatics Unit, Department of Biology, National University of Ireland, Maynooth, County Kildare, Ireland.
Systematic Biology (impact factor: 10.23). 07/2011; 60(6):833-44. DOI:10.1093/sysbio/syr064
Source: PubMed

ABSTRACT Current phylogenetic methods attempt to account for evolutionary rate variation across characters in a matrix. This is generally achieved by the use of sophisticated evolutionary models, combined with dense sampling of large numbers of characters. However, systematic biases and superimposed substitutions make this task very difficult. Model adequacy can sometimes be achieved at the cost of adding large numbers of free parameters, with each parameter being optimized according to some criterion, resulting in increased computation times and large variances in the model estimates. In this study, we develop a simple approach that estimates the relative evolutionary rate of each homologous character. The method that we describe uses the similarity between characters as a proxy for evolutionary rate. In this article, we work on the premise that if the character-state distribution of a homologous character is similar to many other characters, then this character is likely to be relatively slowly evolving. If the character-state distribution of a homologous character is not similar to many or any of the rest of the characters in a data set, then it is likely to be the result of rapid evolution. We show that in some test cases, at least, the premise can hold and the inferences are robust. Importantly, the method does not use a "starting tree" to make the inference and therefore is tree independent. We demonstrate that this approach can work as well as a maximum likelihood (ML) approach, though the ML method needs to have a known phylogeny, or at least a very good estimate of that phylogeny. We then demonstrate some uses for this method of analysis, including the improvement in phylogeny reconstruction for both deep-level and recent relationships and overcoming systematic biases such as base composition bias. Furthermore, we compare this approach to two well-established methods for reweighting or removing characters. 
These other methods are tree-based and we show that they can be systematically biased. We feel this method can be useful for phylogeny reconstruction, understanding evolutionary rate variation, and for understanding selection variation on different characters.
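
The premise above, that a character whose state distribution resembles many other characters is probably slowly evolving, can be illustrated with a minimal Python sketch. The abstract does not specify how similarity is computed, so the scoring rule here is an assumption made for illustration only: each character is treated as a partition of the taxa by state, and one character "agrees" with another to the extent that its groups fit inside the other's groups. The function names (`partition`, `agreement`, `relative_rate_scores`) are hypothetical and are not taken from the article.

```python
from typing import Dict, FrozenSet, List, Set


def partition(column: str) -> Set[FrozenSet[int]]:
    """Group taxon indices by the state they show at one character."""
    groups: Dict[str, Set[int]] = {}
    for taxon, state in enumerate(column):
        groups.setdefault(state, set()).add(taxon)
    return {frozenset(g) for g in groups.values()}


def agreement(reference: Set[FrozenSet[int]], other: Set[FrozenSet[int]]) -> float:
    """Fraction of groups in `other` contained in a single group of `reference`.

    This similarity measure is an assumption for illustration, not the
    article's stated formula.
    """
    hits = sum(1 for g in other if any(g <= r for r in reference))
    return hits / len(other)


def relative_rate_scores(alignment: List[str]) -> List[float]:
    """Mean agreement of each character with every other character.

    A high score means the character resembles many others (treated as
    slowly evolving); a low score means it resembles few (treated as fast).
    No tree is used at any point.
    """
    n_chars = len(alignment[0])
    columns = ["".join(seq[i] for seq in alignment) for i in range(n_chars)]
    parts = [partition(c) for c in columns]
    scores = []
    for i, p in enumerate(parts):
        others = [agreement(p, q) for j, q in enumerate(parts) if j != i]
        scores.append(sum(others) / len(others) if others else 1.0)
    return scores


# Toy alignment: one row per taxon, one column per character.
toy = ["AAAA", "AAAC", "ACCG", "ACCT"]
scores = relative_rate_scores(toy)
```

In this toy matrix the constant first column receives the highest score and the last column, in which every taxon has a different state, receives the lowest. Scores of this kind could then be used to bin, reweight, or remove characters before tree reconstruction, which is the kind of use the abstract describes.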

  • Article: Reconstruction of Family-Level Phylogenetic Relationships within Demospongiae (Porifera) Using Nuclear Encoded Housekeeping Genes.
    ABSTRACT: BACKGROUND: Demosponges are challenging for phylogenetic systematics because of their plastic and relatively simple morphologies and many deep divergences between major clades. To improve understanding of the phylogenetic relationships within Demospongiae, we sequenced and analyzed seven nuclear housekeeping genes involved in a variety of cellular functions from a diverse group of sponges. METHODOLOGY/PRINCIPAL FINDINGS: We generated data from each of the four sponge classes (i.e., Calcarea, Demospongiae, Hexactinellida, and Homoscleromorpha), but focused on family-level relationships within demosponges. With data for 21 newly sampled families, our Maximum Likelihood and Bayesian-based approaches recovered previously phylogenetically defined taxa: Keratosa(p), Myxospongiae(p), Spongillida(p), Haploscleromorpha(p) (the marine haplosclerids) and Democlavia(p). We found conflicting results concerning the relationships of Keratosa(p) and Myxospongiae(p) to the remaining demosponges, but our results strongly supported a clade of Haploscleromorpha(p)+Spongillida(p)+Democlavia(p). In contrast to hypotheses based on mitochondrial genome and ribosomal data, nuclear housekeeping gene data suggested that freshwater sponges (Spongillida(p)) are sister to Haploscleromorpha(p) rather than part of Democlavia(p). Within Keratosa(p), we found equivocal results as to the monophyly of Dictyoceratida. Within Myxospongiae(p), Chondrosida and Verongida were monophyletic. A well-supported clade within Democlavia(p), Tetractinellida(p), composed of all sampled members of Astrophorina and Spirophorina (including the only lithistid in our analysis), was consistently revealed as the sister group to all other members of Democlavia(p). Within Tetractinellida(p), we did not recover monophyletic Astrophorina or Spirophorina. Our results also reaffirmed the monophyly of order Poecilosclerida (excluding Desmacellidae and Raspailiidae), and polyphyly of Hadromerida and Halichondrida. 
CONCLUSIONS/SIGNIFICANCE: These results, using an independent nuclear gene set, confirmed many hypotheses based on ribosomal and/or mitochondrial genes, and they also identified clades with low statistical support or clades that conflicted with traditional morphological classification. Our results will serve as a basis for future exploration of these outstanding questions using more taxon- and gene-rich datasets.
    PLoS ONE 01/2013; 8(1):e50437. · 4.09 Impact Factor
  • Article: Separating the wheat from the chaff: mitigating the effects of noise in a plastome phylogenomic data set from Pinus L. (Pinaceae).
    ABSTRACT: Through next-generation sequencing, the amount of sequence data potentially available for phylogenetic analyses has increased exponentially in recent years. Simultaneously, the risk of incorporating 'noisy' data with misleading phylogenetic signal has also increased, and may disproportionately influence the topology of weakly supported nodes and lineages featuring rapid radiations and/or elevated rates of evolution. We investigated the influence of phylogenetic noise in large data sets by applying two fundamental strategies, variable site removal and long-branch exclusion, to the phylogenetic analysis of a full plastome alignment of 107 species of Pinus and six Pinaceae outgroups. While high overall phylogenetic resolution resulted from inclusion of all data, three historically recalcitrant nodes remained conflicted with previous analyses. Close investigation of these nodes revealed dramatically different responses to data removal. Whereas topological resolution and bootstrap support for two clades peaked with removal of highly variable sites, the third clade resolved most strongly when all sites were included. Similar trends were observed using long-branch exclusion, but patterns were neither as strong nor as clear. When compared to previous phylogenetic analyses of nuclear loci and morphological data, the most highly supported topologies seen in Pinus plastome analysis are congruent for the two clades gaining support from variable site removal and long-branch exclusion, but in conflict for the clade with highest support from the full data set. These results suggest that removal of misleading signal in phylogenomic datasets can result not only in increased resolution for poorly supported nodes, but may serve as a tool for identifying erroneous yet highly supported topologies. For Pinus chloroplast genomes, removal of variable sites appears to be more effective than long-branch exclusion for clarifying phylogenetic hypotheses.
    BMC Evolutionary Biology 06/2012; 12:100. · 3.52 Impact Factor
  • Article: Early evolution without a tree of life.
    ABSTRACT: Life is a chemical reaction. Three major transitions in early evolution are considered without recourse to a tree of life. The origin of prokaryotes required a steady supply of energy and electrons, probably in the form of molecular hydrogen stemming from serpentinization. Microbial genome evolution is not a treelike process because of lateral gene transfer and the endosymbiotic origins of organelles. The lack of true intermediates in the prokaryote-to-eukaryote transition has a bioenergetic cause.
    Biology Direct 06/2011; 6:36. · 4.02 Impact Factor


Keywords

base composition bias
character-state distribution
Current phylogenetic methods attempt
different characters
evolutionary rate variation
known phylogeny
large numbers
maximum likelihood
ML method
Model adequacy
overcoming systematic biases
phylogeny reconstruction
rapid evolution
relative evolutionary rate
simple approach
sophisticated evolutionary models
test cases
understanding evolutionary rate variation
understanding selection variation
well-established methods