MAFFT version 5: Improvement in accuracy of multiple sequence alignment

Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan.
Nucleic Acids Research (Impact Factor: 9.11). 02/2005; 33(2):511-8. DOI: 10.1093/nar/gki198
Source: PubMed

ABSTRACT The accuracy of the multiple sequence alignment program MAFFT has been improved. The new version (5.3) of MAFFT offers new iterative
refinement options, H-INS-i, F-INS-i and G-INS-i, in which pairwise alignment information is incorporated into the objective
function. These new options of MAFFT showed higher accuracy than currently available methods, including TCoffee version 2 and
CLUSTAL W, in benchmark tests consisting of alignments of >50 sequences. Like the previously available options, the new options
of MAFFT can handle hundreds of sequences on a standard desktop computer. We also examined the effect of the number of homologues
included in an alignment. For a multiple alignment consisting of ∼8 sequences with low similarity, the accuracy was improved
(2–10 percentage points) when the sequences were aligned together with dozens of their close homologues (E-value < 10⁻⁵–10⁻²⁰) collected from a database. Such improvement was generally observed for most methods, but was remarkably large for the new options
of MAFFT proposed here. Thus, we made a Ruby script, mafftE.rb, which aligns the input sequences together with their close
homologues collected from SwissProt using NCBI-BLAST.
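
The homologue-collection step described above can be illustrated with a minimal sketch. It is written in Python and is not the authors' mafftE.rb script, whose internals are not described here. It assumes the BLAST hits are available as 12-column tab-separated text (subject ID in column 2, E-value in column 11). The function names, the default threshold of 1e-5 (the loosest bound mentioned in the abstract), and the cap of 50 hits (standing in for "dozens") are all hypothetical choices made for illustration.

```python
def select_close_homologues(blast_tabular, max_evalue=1e-5, max_hits=50):
    """Return subject IDs with E-value below max_evalue, best hits first.

    Assumes 12-column tab-separated BLAST output: subject ID in
    column 2 and E-value in column 11 (1-based). Hypothetical helper.
    """
    best = {}  # subject ID -> smallest E-value seen across all queries
    for line in blast_tabular.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip malformed rows
        subject_id = fields[1]
        evalue = float(fields[10])
        if evalue < max_evalue and evalue < best.get(subject_id, float("inf")):
            best[subject_id] = evalue
    ranked = sorted(best, key=lambda sid: best[sid])
    return ranked[:max_hits]


def merge_fasta(input_records, homologue_records):
    """Build FASTA text from (id, sequence) pairs, inputs first.

    Records whose ID has already been written are skipped, so an input
    sequence that is also a database hit appears only once.
    """
    seen = set()
    chunks = []
    for seq_id, seq in list(input_records) + list(homologue_records):
        if seq_id in seen:
            continue
        seen.add(seq_id)
        chunks.append(f">{seq_id}\n{seq}\n")
    return "".join(chunks)
```

The merged FASTA text would then be passed to the aligner, and the rows belonging to the original input sequences could be extracted from the resulting alignment. A stricter threshold such as `max_evalue=1e-20` restricts the set to closer homologues.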

Available from: Kazutaka Katoh, Jan 15, 2014
