Katoh K, Kuma K, Toh H, Miyata T. MAFFT version 5: Improvement in accuracy of multiple sequence alignment. Nucleic Acids Res 33:511-518

Bioinformatics Center, Institute for Chemical Research, Kyoto University Uji, Kyoto 611-0011, Japan.
Nucleic Acids Research (Impact Factor: 9.11). 02/2005; 33(2):511-8. DOI: 10.1093/nar/gki198
Source: PubMed

ABSTRACT The accuracy of the multiple sequence alignment program MAFFT has been improved. The new version (5.3) of MAFFT offers new iterative
refinement options, H-INS-i, F-INS-i and G-INS-i, in which pairwise alignment information is incorporated into the objective
function. These new options of MAFFT showed higher accuracy than currently available methods including TCoffee version 2 and
CLUSTAL W in benchmark tests consisting of alignments of >50 sequences. Like the previously available options, the new options
of MAFFT can handle hundreds of sequences on a standard desktop computer. We also examined the effect of the number of homologues
included in an alignment. For a multiple alignment consisting of ∼8 sequences with low similarity, the accuracy was improved
(2–10 percentage points) when the sequences were aligned together with dozens of their close homologues (E-value < 10^−5–10^−20) collected from a database. Such improvement was generally observed for most methods, but remarkably large for the new options
of MAFFT proposed here. Thus, we made a Ruby script, mafftE.rb, which aligns the input sequences together with their close
homologues collected from SwissProt using NCBI-BLAST.

    • "Alignments of the amino acid sequences for each gene were performed by using MAFFT (Katoh et al. 2005) with default settings. Based on the aligned amino acid and the original (non-aligned) nucleotide sequences, we reconstructed nucleotide alignment by substituting each amino acid with the corresponding codon from the nucleotide sequence and by substituting each gap in the amino acid alignment with three gaps in the nucleotide alignment. "
    ABSTRACT: Genetic factors may play an important role in species extinction but their actual effect remains poorly understood, particularly because of a strong and potentially masking effect expected from ecological traits. We investigated the role of genetics in mammal extinction taking both ecological and genetic factors into account. As a proxy for the role of genetics we used the ratio of the rates of nonsynonymous (amino acid changing) to synonymous (leaving the amino acid unchanged) nucleotide substitutions, Ka / Ks. Because most nonsynonymous substitutions are likely to be slightly deleterious and thus selected against, this ratio is a measure of the inefficiency of selection: if large (but less than 1), it implies a low efficiency of selection against nonsynonymous mutations. As a result, nonsynonymous mutations may accumulate and thus contribute to extinction. As a proxy for the role of ecology we used body mass W, with which most extinction-related ecological traits strongly correlate. As a measure of extinction risk we used species’ affiliation with the five levels of extinction threat according to the IUCN Red List of Threatened Species. We calculated Ka / Ks for mitochondrial protein-coding genes of 211 mammalian species, each of which was characterized by body mass and the level of threat. Using logistic regression analysis, we then constructed a set of logistic regression models of extinction risk on ln(Ka / Ks) and lnW. We found that Ka / Ks and body mass are responsible for a 38% and a 62% increase in extinction risk, respectively. Given that the standard error of these values is 13%, the contribution of genetic factors to extinction risk in mammals is estimated to be one-quarter to one-half of the total of ecological and genetic effects. We conclude that the effect of genetics on extinction is significant, though it is almost certainly smaller than the effect of ecological traits.
    Oikos 08/2015; 124(8):983-993. DOI:10.1111/oik.01734 · 3.56 Impact Factor
    • "Newly generated sequences were combined with published data and aligned using MAFFT v6.611b (Katoh et al. 2005) with 100 cycles of iterative refinement and the genafpair algorithm. Substitution saturation at 1st + 2nd and 3rd codon positions in the cox1_5′ and cox1_3′ partitions was estimated using DAMBE (Xia et al. 2003; Xia 2013), following the procedure outlined in Xia & Lemey (2009), analysing only fully resolved sites. "
    ABSTRACT: A phylogeny of Vesiculariidae (Bryozoa, Ctenostomata) supports synonymization of three genera and reveals possible cryptic diversity. Compared to their calcified sister group, order Cheilostomata, uncalcified ctenostome bryozoans exhibit relatively simple and often inconsistent morphologies, making them particularly suitable candidates for the use of molecular tools to delimit species and examine their interrelationships. The family Vesiculariidae is composed of six genera, three of which, Zoobotryon, Avenella and Watersiana are monotypic, and one, Vesicularia, encompasses four species. The majority of vesiculariid diversity, however, is found in Amathia (39 species) and Bowerbankia (21 species). The respective monophyletic status for Amathia and Bowerbankia has recently been put into question by molecular evidence and is being further examined in this study. Multigene (ssrDNA, rrnL, cox1) phylogenetic analysis revealed that Bowerbankia is paraphyletic to the inclusion of Zoobotryon and Amathia, where the latter was resolved as non-monophyletic. Although Vesicularia also nested within this paraphyletic assemblage in some of the analyses, Bayesian topology testing did not support this result. Our results are discussed within the context of published morphological evidence and lead to the conclusion that Bowerbankia and Zoobotryon should be classified as junior subjective synonyms of Amathia. A revised nomenclature is provided. Furthermore, we examined genetic divergences between widely distributed supposed conspecific species and discovered possible cryptic diversity in the outgroup taxon Anguinella palmata and in Bowerbankia citrina, Amathia vidovici and Amathia crispa.
    Zoologica Scripta 07/2015; DOI:10.1111/zsc.12130 · 2.92 Impact Factor
    • "with coding sequences, the nucleotide alignment was constrained by the amino acid sequences alignment. Although the number of sequences was small in each taxonomic group, we compared the alignments obtained with MUSCLE and submitted the raw amino acid sequences to the GUIDANCE filter (Penn et al. 2010), using the alignment algorithm MAFFT (Katoh et al. 2005). MUSCLE and MAFFT produced very similar outputs and we therefore chose the alignments produced by the former (Additional file 1: Figure S1, Additional file 3: Figure S2, Additional file 4: Figure S3). GUIDANCE provided us alignment scores and regions of the alignments that were not well supported (Additional file 1: Figu"
    ABSTRACT: Multi-domain proteins form the majority of proteins in eukaryotes. During their formation by tandem duplication or gene fusion, new interactions between domains may arise as a result of the structurally-forced proximity of domains. The proper function of the formed proteins likely required the molecular adjustment of these stress zones by specific amino acid replacements, which should be detectable by the molecular signature of selection that governed their changes. We used multi-domain globins from three different invertebrate lineages to investigate the selective forces that acted throughout the evolution of these molecules. In the youngest of these molecules [Branchipolynoe scaleworm; original duplication ca. 60 million years (Ma)], we were able to detect some amino acids under positive selection corresponding to the initial duplication event. In older lineages (didomain globin from bivalve mollusks and nematodes), there was no evidence of amino acid positions under positive selection, possibly the result of accumulated non-adaptive mutations since the original duplication event (165 and 245 Ma, respectively). Some amino acids under positive selection were sometimes detected in later branches, either after speciation events, or after the initial duplication event. In Branchipolynoe, the position of the amino acids under positive selection on a 3D model suggests some of them are located at the interface between two domains, while others are located in the heme pocket.
    SpringerPlus 07/2015; 4:354. DOI:10.1186/s40064-015-1124-2
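
The first citing excerpt above (Oikos 2015) describes rebuilding a nucleotide alignment from an amino acid alignment: each residue is replaced by its codon from the unaligned nucleotide sequence, and each gap by three gaps. A minimal Python sketch of that step follows. It is an illustration only, not the cited authors' code: the function name back_translate, the "-" gap character, and the requirement that the nucleotide sequence be exactly three times the number of residues (no trailing stop codon) are all assumptions made here.

```python
def back_translate(aligned_protein: str, nucleotides: str, gap: str = "-") -> str:
    """Thread an unaligned coding sequence onto its aligned protein sequence.

    Hypothetical helper: assumes one codon per residue, in order, with no
    stop codon or partial codon at the end of `nucleotides`.
    """
    n_residues = sum(1 for ch in aligned_protein if ch != gap)
    if len(nucleotides) != 3 * n_residues:
        raise ValueError(
            f"expected {3 * n_residues} nucleotides, got {len(nucleotides)}"
        )

    pieces = []
    pos = 0  # index of the next unused codon in `nucleotides`
    for ch in aligned_protein:
        if ch == gap:
            # One amino acid gap becomes three nucleotide gaps.
            pieces.append(gap * 3)
        else:
            # One residue becomes its original codon.
            pieces.append(nucleotides[pos:pos + 3])
            pos += 3
    return "".join(pieces)


if __name__ == "__main__":
    # Aligned protein M-KV with coding sequence ATG AAA GTT.
    print(back_translate("M-KV", "ATGAAAGTT"))  # ATG---AAAGTT
```

Taking codons from the original sequence, rather than translating the protein back through a codon table, keeps the synonymous differences between sequences that a Ka / Ks analysis depends on.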