Approximate Likelihood-Ratio Test for Branches: A Fast, Accurate, and Powerful Alternative

Equipe Méthodes et Algorithmes pour la Bioinformatique LIRMM-CNRS, Université Montpellier II, Montpellier 34392, France.
Systematic Biology (Impact Factor: 14.39). 09/2006; 55(4):539-52. DOI: 10.1080/10635150600755453
Source: PubMed


We revisit statistical tests for branches of evolutionary trees reconstructed from molecular data. A new, fast, approximate likelihood-ratio test (aLRT) for branches is presented here as a competitive alternative to nonparametric bootstrap and Bayesian estimation of branch support. The aLRT is based on the idea of the conventional LRT, with the null hypothesis corresponding to the assumption that the inferred branch has length 0. We show that the LRT statistic is asymptotically distributed as a maximum of three random variables drawn from the ½χ₀² + ½χ₁² distribution. The new aLRT of interior branches uses this distribution for significance testing, but the test statistic is approximated in a slightly conservative but practical way as 2(l1 − l2), i.e., double the difference between the maximum log-likelihood values corresponding to the best tree and the second-best topological arrangement around the branch of interest. Such a test is fast because the log-likelihood value l2 is computed by optimizing only over the branch of interest and the four adjacent branches, whereas other parameters are fixed at their optimal values corresponding to the best ML tree. The performance of the new test was studied on simulated 4-, 12-, and 100-taxon data sets with sequences of different lengths. The aLRT is shown to be accurate, powerful, and robust to certain violations of model assumptions. The aLRT is implemented within the algorithm used by the recent fast maximum likelihood tree estimation program PHYML (Guindon and Gascuel, 2003).
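
The significance calculation described in the abstract can be illustrated with a minimal Python sketch. It computes the statistic 2(l1 − l2) and a p-value against the maximum of three draws from the ½χ₀² + ½χ₁² mixture. Two points are assumptions made here and not taken from the abstract: the three draws are treated as independent, and all function names are hypothetical. This is not PhyML's implementation.

```python
import math


def chi2_1_cdf(x: float) -> float:
    """CDF of a chi-square variable with 1 degree of freedom.

    If Z is standard normal, P(Z^2 <= x) = erf(sqrt(x / 2)).
    """
    if x <= 0.0:
        return 0.0
    return math.erf(math.sqrt(x / 2.0))


def mixture_cdf(x: float) -> float:
    """CDF of the 50:50 mixture of chi2_0 (point mass at 0) and chi2_1."""
    if x < 0.0:
        return 0.0
    return 0.5 + 0.5 * chi2_1_cdf(x)


def alrt_statistic(l1: float, l2: float) -> float:
    """Twice the log-likelihood gap between the best and second-best
    arrangement around the branch of interest."""
    return 2.0 * (l1 - l2)


def alrt_p_value(stat: float) -> float:
    """P(max of three mixture draws >= stat).

    ASSUMPTION: the three draws are independent, so the CDF of the
    maximum is the mixture CDF cubed.
    """
    return 1.0 - mixture_cdf(stat) ** 3


if __name__ == "__main__":
    # Hypothetical log-likelihoods, chosen only for illustration.
    stat = alrt_statistic(l1=-1000.0, l2=-1003.0)
    print(f"statistic = {stat:.2f}, p-value = {alrt_p_value(stat):.4f}")
```

A branch would be called supported at level α when the returned p-value falls below α. A zero-length gap (l1 = l2) gives the largest possible p-value under this sketch, 1 − 0.5³ = 0.875.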

Available from: Maria Anisimova
  • Source
    • "Sequences were aligned with MUSCLE v3.7 configured for highest accuracy (Edgar, 2004). Ambiguous regions were removed with Gblocks v0.91b (Castresana, 2000) and the phylogenetic tree was reconstructed using the maximum likelihood method implemented in PhyML v3.0 aLRT (Anisimova and Gascuel, 2006; Guindon and Gascuel, 2003). The reliability of internal branches was assessed using the bootstrapping method (1000 bootstrap replicates). "
    ABSTRACT: The red flour beetle Tribolium castaneum is a destructive insect pest of stored food and feed products, and a model organism for development, evolutionary biology and immunity. The insect innate immune system includes antimicrobial peptides (AMPs) with a wide spectrum of targets including viruses, bacteria, fungi and parasites. Defensins are an evolutionarily-conserved class of AMPs and a potential new source of antimicrobial agents. In this context, we report the antimicrobial activity, phylogenetic and structural properties of three T. castaneum defensins (Def1, Def2 and Def3) and their relevance in the immunity of T. castaneum against bacterial pathogens. All three recombinant defensins showed bactericidal activity against Micrococcus luteus and Bacillus thuringiensis serovar tolworthi, but only Def1 and Def2 showed a bacteriostatic effect against Staphylococcus epidermidis. None of the defensins showed activity against the Gram-negative bacteria Escherichia coli and Pseudomonas entomophila or against the yeast Saccharomyces cerevisiae. All three defensins were transcriptionally upregulated following a bacterial challenge, suggesting a key role in the immunity of T. castaneum against bacterial pathogens. Phylogenetic analysis showed that defensins from T. castaneum, mealworms, Udo longhorn beetle and houseflies cluster within a well-defined clade of insect defensins. We conclude that T. castaneum defensins are primarily active against Gram-positive bacteria and that other AMPs may play a more prominent role against Gram-negative species.
    Journal of Invertebrate Pathology 11/2015; 132:208–215. DOI:10.1016/j.jip.2015.10.009 · 2.11 Impact Factor
  • Source
    • "A total of 107 higher-level nodes were included in this comparison. SH-like supports are computed for a nearest neighbor interchange (NNI) optimal tree, which is generated during the process and can be slightly different from the initial tree (Anisimova and Gascuel, 2006). For example, for the optimal tree from our combined dataset, the position of Calabariidae was slightly different from that in the NNI-optimized tree. "
    ABSTRACT: Two common approaches for estimating phylogenies in species-rich groups are to: (i) sample many loci for few species (e.g. phylogenomic approach), or (ii) sample many species for fewer loci (e.g. supermatrix approach). In theory, these approaches can be combined to simultaneously resolve both higher-level relationships (with many genes) and species-level relationships (with many taxa). However, fundamental questions remain unanswered about this combined approach. First, will higher-level relationships more closely resemble those estimated from many genes or those from many taxa? Second, will branch support increase for higher-level relationships (relative to the estimate from many taxa)? Here, we address these questions in squamate reptiles. We combined two recently published datasets, one based on 44 genes for 161 species, and one based on 12 genes for 4161 species. The likelihood-based tree from the combined matrix (52 genes, 4162 species) shared more higher-level clades with the 44-gene tree (90% vs. 77% shared). Branch support for higher-level relationships was marginally higher than in the 12-gene tree, but lower than in the 44-gene tree. Relationships were apparently not obscured by the abundant missing data (92% overall). We provide a time-calibrated phylogeny based on extensive sampling of genes and taxa as a resource for comparative studies.
    Molecular Phylogenetics and Evolution 10/2015; DOI:10.1016/j.ympev.2015.10.009 · 3.92 Impact Factor
  • Source
    • "Protein-coding nucleotide sequences of 26 vertebrate rhodopsins downloaded from GenBank, and the Thamnophis proximus rhodopsin sequence (Yang 2010), were aligned using MEGA4 (Table S1) (Tamura et al. 2007). A gene tree was inferred by maximum likelihood (ML) using PhyML 3 under the GTR+G+I model with a BioNJ starting tree, the best of NNI and SPR tree improvement, and aLRT SH-like branch support (Anisimova and Gascuel 2006; Guindon et al. 2010). The topology of this tree disagreed with the expected species relationships (Fig. S1). "
    ABSTRACT: The nocturnal origin of mammals is a longstanding hypothesis that is considered instrumental for the evolution of endothermy, a potential key innovation in this successful clade. This hypothesis is primarily based on indirect anatomical inference from fossils. Here, we reconstruct the evolutionary history of rhodopsin – the vertebrate visual pigment mediating the first step in phototransduction at low-light levels – via codon-based model tests for selection, combined with gene resurrection methods that allow study of ancient proteins. Rhodopsin coding sequences were reconstructed for three key nodes: Amniota, Mammalia, and Theria. When expressed in vitro, all sequences generated stable visual pigments with λMAX values similar to the well-studied bovine rhodopsin. Retinal release rates of mammalian and therian ancestral rhodopsins, measured via fluorescence spectroscopy, were significantly slower than those of the amniote ancestor, indicating altered molecular function possibly related to nocturnality. Positive selection along the therian branch suggests adaptive evolution in rhodopsin concurrent with therian ecological diversification events during the Mesozoic that allowed for an exploration of the environment at varying light levels.
    Evolution 10/2015; DOI:10.1111/evo.12794 · 4.61 Impact Factor