MEGA7: Molecular Evolutionary Genetics Analysis Version 7.0
Sudhir Kumar, Glen Stecher, and Koichiro Tamura*
Institute for Genomics and Evolutionary Medicine, Temple University
Department of Biology, Temple University
Center for Excellence in Genome Medicine and Research, King Abdulaziz University, Jeddah, Saudi Arabia
Research Center for Genomics and Bioinformatics, Tokyo Metropolitan University, Hachioji, Tokyo, Japan
Department of Biological Sciences, Tokyo Metropolitan University, Hachioji, Tokyo, Japan
*Corresponding author: E-mail:
Associate editor: Joel Dudley
We present the latest version of the Molecular Evolutionary Genetics Analysis (MEGA) software, which contains many
sophisticated methods and tools for phylogenomics and phylomedicine. In this major upgrade, MEGA has been optimized
for use on 64-bit computing systems for analyzing larger datasets. Researchers can now explore and analyze tens of
thousands of sequences in MEGA. The new version also provides an advanced wizard for building timetrees and includes a
new functionality to automatically predict gene duplication events in gene family trees. The 64-bit MEGA is made available
in two interfaces: graphical and command line. The graphical user interface (GUI) is a native Microsoft Windows
application that can also be used on Mac OS X. The command line MEGA is available as native applications for
Windows, Linux, and Mac OS X. They are intended for use in high-throughput and scripted analysis. Both versions
are available from free of charge.
Key words: gene families, timetree, software, evolution.
Molecular Evolutionary Genetics Analysis (MEGA) software is now being applied to increasingly bigger datasets (Kumar et al. 1994; Tamura et al. 2013). This necessitated technological advancement of the computation core and the user interface of MEGA. Researchers also need to conduct high-throughput and scripted analyses on their operating system of choice, which requires that MEGA be available in native cross-platform implementations. We have advanced the MEGA software suite to address these needs of researchers performing comparative analyses of DNA and protein sequences in increasingly large datasets.
Addressing the Need to Analyze Bigger Datasets
Contemporary personal computers and workstations pack much greater computing power and system memory than ever before. It is now common to have many gigabytes of memory with a 64-bit architecture and an operating system to match. To harness this power in evolutionary analyses, we have advanced the MEGA source code to fully utilize 64-bit computing resources and memory in data handling, file processing, and evolutionary analytics. MEGA’s internal data structures have been upgraded, and the refactored source code has been tested extensively using automated test harnesses.
We benchmarked 64-bit MEGA7 performance using 16S ribosomal RNA sequence alignments obtained from the SILVA rRNA database project (Quast et al. 2013; Yilmaz et al. 2014) with thousands of sites and increasingly greater numbers of sequences (as many as 10,000). Figure 1 shows that their computational analysis requires large amounts of memory and computing power. For the Neighbor-Joining (NJ) method (Saitou and Nei 1987), memory usage increased at a polynomial rate as the number of sequences was increased. The peak memory usage was 1.7 GB for the full dataset of 10,000 rRNA sequences (fig. 1B). For the Maximum Likelihood (ML) analyses, memory usage increased linearly and the peak memory usage was 18.6 GB (fig. 1D). The time to complete the computation (fig. 1A and C) showed a polynomial trend for NJ and a linear trend for ML. ML required an order of magnitude greater time and memory. We also benchmarked MEGA7 for datasets with increasing numbers of sites. Computational time and peak memory showed a linear trend. In addition, we compared the memory and time needs for 32- and 64-bit versions (MEGA6 and MEGA7, respectively), and found no significant difference for NJ and ML analyses. This is primarily because both MEGA6 and MEGA7 use 8-byte floating point data types. However, the 32-bit MEGA6 could only carry out ML analysis for fewer than 3,000 sequences of the same length. Therefore, MEGA7 is a significant upgrade that does not incur any discernible computational or resource penalty.
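As a rough illustration of why memory grows polynomially with the number of sequences in distance-based analyses, the sketch below estimates the size of a pairwise distance matrix stored as 8-byte floats. The function name and the choice between a full square matrix and a lower triangle are assumptions made here for illustration; they do not describe MEGA's internal storage, and the numbers are back-of-the-envelope estimates rather than measurements.

```python
BYTES_PER_FLOAT = 8  # the text states that MEGA6 and MEGA7 use 8-byte floats


def distance_matrix_bytes(n_sequences: int, full_square: bool = True) -> int:
    """Bytes needed for pairwise distances among n_sequences (hypothetical layout)."""
    if full_square:
        cells = n_sequences * n_sequences
    else:
        # Lower triangle only, excluding the diagonal.
        cells = n_sequences * (n_sequences - 1) // 2
    return cells * BYTES_PER_FLOAT


for n in (1_000, 5_000, 10_000):
    gigabytes = distance_matrix_bytes(n) / 1e9
    print(f"{n:>6} sequences: {gigabytes:.2f} GB for one full matrix")
```

Under these assumptions, one full matrix for 10,000 sequences occupies 0.8 GB, and doubling the number of sequences quadruples that figure.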
Brief communication
© The Author(s) 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail:
Mol. Biol. Evol. doi:10.1093/molbev/msw054 Advance Access publication March 22, 2016
MBE Advance Access published April 13, 2016
Upgrading the Tree Explorer
The ability to construct a phylogenetic tree of >10,000 sequences required a major upgrade of the Tree Explorer as well, because it needed to display very large trees. This was accomplished by replacing the native Windows scroll box with a custom virtual scroll box, which increased the number of taxa that can be displayed in the Tree Explorer window from 4,000 in MEGA6 to greater than 100,000 sequences in MEGA7. This is made possible by our new adaptive approach to rendering the tree, which ensures the best display quality and exploration performance. To display a tree, we first evaluate if the tree can be rendered as a device-dependent bitmap (DDB), which depends on the power of the available graphics processing unit. If successful, the tree image is stored in video memory, which enhances performance. For example, on a computer equipped with a GeForce GT 640 graphics card, Tree Explorer successfully rendered trees with more than 100,000 sequences and responded quickly to user scrolling and display changes. When a DDB cannot be generated, Tree Explorer renders the tree as a device-independent bitmap. Because of the extensive system memory requirements, we automatically choose a pixel format that maximizes the number of sequences displayed. The pixel format dictates the number of colors used: 24 (2^24 colors), 18, 8, 4, or 1 bit (monochrome) per pixel. Memory needs scale in proportion to the number of bits used per pixel.
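The relationship between pixel format and memory can be sketched as follows. This is a minimal illustration, not MEGA's rendering code: the function names, the memory budget parameter, and the rule of picking the deepest format that fits are assumptions made here, and the size estimate ignores row padding and bitmap headers.

```python
# Candidate pixel depths listed in the text, highest color depth first.
CANDIDATE_BITS_PER_PIXEL = (24, 18, 8, 4, 1)


def bitmap_bytes(width_px: int, height_px: int, bits_per_pixel: int) -> int:
    """Approximate bitmap size in bytes, rounded up to a whole byte."""
    return (width_px * height_px * bits_per_pixel + 7) // 8


def choose_bits_per_pixel(width_px: int, height_px: int, budget_bytes: int):
    """Return the deepest pixel format that fits the budget, or None if none fits."""
    for bits in CANDIDATE_BITS_PER_PIXEL:
        if bitmap_bytes(width_px, height_px, bits) <= budget_bytes:
            return bits
    return None


# Hypothetical tree image: 100,000 taxa at 16 pixels each, 1,200 pixels wide.
height = 100_000 * 16
print(choose_bits_per_pixel(1_200, height, budget_bytes=2_000_000_000))
```

Because the size is linear in bits per pixel, dropping from 24 bits to 1 bit per pixel lets an image with 24 times as many pixels fit in the same amount of memory.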
Cross-Platform MEGA-CC for High-Throughput and Scripted Analyses
We have now refactored MEGA’s computation core (CC, Kumar et al. 2012) so that it can be compiled natively for Linux, Windows, and Mac OS X systems in order to avoid the need for emulation or virtualization. This required porting the computation core source code to a cross-platform programming language and replacing all the Microsoft Windows system API calls. For instance, the App Linker system, which integrates the MUSCLE (Edgar 2004) sequence alignment application with MEGA, relied heavily on the Windows API for inter-process communication and was refactored extensively.
In order to configure analyses in MEGA7-CC, we have chosen to continue requiring an analysis options file (called .mao file) that specifies all the input parameters to the command-line driven MEGA-CC application; see figure 1 in Kumar et al. (2012). To generate this control file, we provide native prototyper applications (MEGA-PROTO) for Windows, Linux, and Mac OS X. MEGA-PROTO obviates the need to learn a large number of commands, and, thus, avoids a steep learning curve and potential mistakes for inter-dependent options. It also enables us to deliver exactly the same experience and options for those who will use both GUI and CC versions of MEGA7.
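A scripted batch run built around a single .mao file might look like the sketch below. The executable name `megacc` and the `-a` (analysis options), `-d` (data), and `-o` (output) flags are assumptions based on commonly documented MEGA-CC usage and should be checked against the help output of the installed version; the helper names and file names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_megacc_command(mao_file, data_file, out_dir, executable="megacc"):
    """Build one MEGA-CC command line (flags are assumed, see the note above)."""
    out_path = Path(out_dir) / Path(data_file).stem
    return [executable, "-a", str(mao_file), "-d", str(data_file), "-o", str(out_path)]


def run_batch(mao_file, data_files, out_dir, dry_run=True):
    """Apply the same analysis options file to every alignment in data_files."""
    commands = [build_megacc_command(mao_file, f, out_dir) for f in data_files]
    if not dry_run:
        for command in commands:
            # check=True raises CalledProcessError if a run exits with an error.
            subprocess.run(command, check=True)
    return commands


# Dry run: only print the commands that would be executed.
for command in run_batch("infer_nj.mao", ["gene1.fas", "gene2.fas"], "results"):
    print(" ".join(command))
```

Keeping the command construction separate from its execution makes it easy to inspect the generated commands before launching a long batch.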
FIG. 1. Time and memory requirements for phylogenetic analyses using the NJ method (A, B) and the ML analysis (C, D). For NJ analysis, we used the Tamura–Nei (1993) model, uniform rates of evolution among sites, and pairwise deletion option to deal with the missing data. Time usage increases polynomially with the number of sequences (third degree polynomial, R² = 1), as does the peak memory used (R² = 1) (A, B). The same model and parameters were used for ML tree inference, where the time taken and the memory needs increased linearly with the number of sequences. For ML analysis, the SPR (Subtree–Pruning–Regrafting) heuristic was used for tree searching and all 5,287 sites in the sequence alignment were included. All the analyses were performed on a Dell Optiplex 9010 computer with an Intel Core-i7-3770 3.4 GHz processor, 20 GB of RAM, NVidia GeForce GT 640 graphics card, and a 64-bit Windows 7 Enterprise operating system.
Marking Gene Duplication Events in Gene Family Trees
We have added a new functionality in MEGA to mark tree nodes where gene duplications are predicted to occur. This system works with or without a species tree. If a species tree is provided, then we mark gene duplications following the Zmasek and Eddy (2001) algorithm. This algorithm posits the smallest number of gene duplications in the tree such that the minimum number of unobserved genes, due to losses or partial sampling, is invoked. When no species tree is provided, all internal nodes in the tree that contain one or more common species in the two descendant clades are marked as gene duplication events. This algorithm provides a minimum number of duplication events, because many duplication nodes will remain undetected when the gene sampling is incomplete. Nevertheless, it is useful for cases where species trees are not well established.
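The rule used when no species tree is available can be sketched in a few lines. This is an illustration under stated assumptions, not MEGA's implementation: the rooted gene tree is represented as nested 2-tuples, leaf labels are hypothetical strings of the form `Species_gene`, and nodes are identified by their path from the root.

```python
def species_of(leaf_label: str) -> str:
    """Species name is assumed to be the part of the label before the first underscore."""
    return leaf_label.split("_", 1)[0]


def mark_duplications(node, path="root", marked=None):
    """Return (species under node, list of paths of nodes marked as duplications)."""
    if marked is None:
        marked = []
    if isinstance(node, str):  # leaf
        return {species_of(node)}, marked
    left, right = node
    left_species, _ = mark_duplications(left, path + ".L", marked)
    right_species, _ = mark_duplications(right, path + ".R", marked)
    # A node is marked when its two descendant clades share at least one species.
    if left_species & right_species:
        marked.append(path)
    return left_species | right_species, marked


gene_tree = ((("Human_A", "Mouse_A"), ("Human_B", "Mouse_B")), "Fish_X")
_, duplications = mark_duplications(gene_tree)
print(duplications)  # ['root.L']
```

In this example only the ancestor of the A and B paralogs is marked, because both of its descendant clades contain human and mouse sequences.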
Realizing that the root of the gene family tree is not always obvious, MEGA runs the above analysis by automatically rooting the tree on each branch and selecting a root such that the number of gene duplications inferred is minimized. This is done only when the user does not specify a root explicitly. A Gene Duplication Wizard (fig. 2) walks the user through all the necessary steps for this analysis. Results are displayed in the Tree Explorer (fig. 3), which marks gene duplications with blue solid diamonds. When a species tree is provided, speciation events are marked with open red diamonds. Results can also be exported to Newick formatted text files where gene duplications and speciation events are labeled using comments in square brackets. In the future, we plan to extend this system with the capability to automatically retrieve species trees from external databases, including the NCBI Taxonomy ( and the timetree of life (Hedges et al. 2015).
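The root search described above can be sketched as follows, assuming an unrooted binary gene tree stored as an adjacency dictionary and the species-overlap rule for counting duplications. The data layout, the function names, and the handling of ties (the first minimum found is kept) are assumptions made for this illustration; the text does not say how MEGA breaks ties.

```python
def count_duplications(adj, species, node, parent):
    """Return (species set, duplication count) for the subtree at node, seen from parent."""
    if node in species:  # leaves are the nodes that have a species assigned
        return {species[node]}, 0
    children = [c for c in adj[node] if c != parent]
    (s1, d1), (s2, d2) = (count_duplications(adj, species, c, node) for c in children)
    return s1 | s2, d1 + d2 + (1 if s1 & s2 else 0)


def best_root(adj, species):
    """Try a root on every branch; return (minimum duplication count, branch)."""
    best = None
    seen = set()
    for u in adj:
        for v in adj[u]:
            edge = frozenset((u, v))
            if edge in seen:
                continue
            seen.add(edge)
            s1, d1 = count_duplications(adj, species, u, v)
            s2, d2 = count_duplications(adj, species, v, u)
            total = d1 + d2 + (1 if s1 & s2 else 0)
            if best is None or total < best[0]:
                best = (total, (u, v))
    return best


adj = {
    "n1": ["HA", "MA", "n3"], "n2": ["HB", "MB", "n3"], "n3": ["n1", "n2", "FX"],
    "HA": ["n1"], "MA": ["n1"], "HB": ["n2"], "MB": ["n2"], "FX": ["n3"],
}
species = {"HA": "Human", "MA": "Mouse", "HB": "Human", "MB": "Mouse", "FX": "Fish"}
print(best_root(adj, species)[0])  # 1
```

The recursion is kept simple for readability; for trees with many thousands of tips an iterative traversal would avoid Python's recursion limit.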
FIG. 2. The Gene Duplication Wizard (A) to guide users through the process of searching gene duplication events in a gene family tree. In the first step, the user loads a gene tree from a Newick formatted text file. Second, species associated with sequences are specified using a graphical interface. In the third step, the user has the option to load a trusted species tree from a Newick file, in which case it will be possible to identify all duplication events in the gene tree. Fourth, the user has the option to specify the root of the gene tree in a graphical interface. If the user provides a trusted species tree, then they must designate the root of that tree. Finally, the user launches the analysis and the results are displayed in the Tree Explorer window (see fig. 3).
Timetree System Updates
We have now upgraded the Timetree Wizard (similar to the wizard shown in fig. 2), which guides researchers through a multi-step process of building a molecular phylogeny scaled to time using a sequence alignment and a phylogenetic tree topology. This wizard accepts Newick formatted tree files, assists users in defining the outgroup(s) on which the tree will be rooted, and allows users to set divergence time calibration constraints. Setting time constraints in order to calibrate the final timetree is optional in the RelTime method (Tamura et al. 2012), so MEGA7 does not require that calibration constraints be available and it does not assume a molecular clock. If no calibrations are used, MEGA7 will produce relative divergence times for nodes, which are useful for determining the ordering and spacing of divergence events in species and gene family trees. However, users can obtain absolute divergence time estimates for each node by providing calibrations with minimum and/or maximum constraints (Tamura et al. 2013). It is important to note that MEGA7 does not use calibrations that are present in the clade containing the outgroup(s), because that would require an assumption of equal rates of evolution between the ingroup and outgroup sequences, which cannot be tested. For this reason, timetrees displayed in the Tree Explorer have the outgroup cluster compressed and grayed out by default to promote correct scientific analysis and interpretation.
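To illustrate how minimum and maximum constraints relate relative times to absolute times, the sketch below assumes that a single global scaling factor converts one into the other. That simplification, together with all names and numbers, is an assumption made here for illustration; it is not the RelTime implementation in MEGA7.

```python
def scale_factor_range(relative_times, constraints):
    """Range of global factors f such that f * relative time satisfies every constraint.

    constraints maps a node name to (minimum time or None, maximum time or None).
    Relative times of calibrated nodes must be positive.
    """
    low, high = 0.0, float("inf")
    for node, (t_min, t_max) in constraints.items():
        relative = relative_times[node]
        if t_min is not None:
            low = max(low, t_min / relative)
        if t_max is not None:
            high = min(high, t_max / relative)
    if low > high:
        raise ValueError("calibration constraints are mutually incompatible")
    return low, high


def to_absolute(relative_times, factor):
    """Scale every relative node time by the chosen factor."""
    return {node: t * factor for node, t in relative_times.items()}


relative_times = {"A": 0.5, "B": 1.0}          # hypothetical relative node times
constraints = {"A": (40.0, 60.0), "B": (None, 100.0)}  # hypothetical bounds
print(scale_factor_range(relative_times, constraints))  # (80.0, 100.0)
```

Any factor inside the returned range yields absolute times that respect all the constraints; with no constraints at all, only the relative times are meaningful.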
Data Coverage Display by Node
In the Tree Explorer, users will be able to display another set of numbers at internal tree nodes that correspond to the proportion of positions in the alignment where there is at least one sequence with an unambiguous nucleotide or amino acid in both the descendant lineages; see figure 5 in Filipski et al. (2014). This metric is referred to as minimum data coverage and is useful in exposing nodes in the tree that lack sufficient data to make reliable phylogenetic inferences. For example, when the minimum data coverage is zero for a node, the time elapsed on the branch connecting this node with its descendant node will always be zero, because zero substitutions will be mapped to that branch (Filipski et al. 2014). This means that divergence times for such nodes would be underestimated. Such branches will also have very low statistical confidence when inferring the phylogenetic tree. So, it is always good to examine this metric for all nodes in the tree.
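The definition of minimum data coverage can be turned into a short calculation. The sketch below handles nucleotide alignments only and treats A, C, G, T, and U as the unambiguous characters; the function name and the input layout (aligned sequences grouped by descendant lineage) are assumptions made for illustration.

```python
UNAMBIGUOUS_NUCLEOTIDES = set("ACGTU")


def minimum_data_coverage(left_sequences, right_sequences):
    """Proportion of alignment positions with unambiguous data in both descendant lineages."""
    n_sites = len(left_sequences[0])
    covered = 0
    for i in range(n_sites):
        left_ok = any(s[i].upper() in UNAMBIGUOUS_NUCLEOTIDES for s in left_sequences)
        right_ok = any(s[i].upper() in UNAMBIGUOUS_NUCLEOTIDES for s in right_sequences)
        if left_ok and right_ok:
            covered += 1
    return covered / n_sites


left = ["ACG-", "A-GT"]    # sequences in one descendant lineage
right = ["--GT", "N-G-"]   # sequences in the other descendant lineage
print(minimum_data_coverage(left, right))  # 0.5
```

Here only the last two positions have an unambiguous base on both sides of the node, so the coverage is 0.5; a value of 0.0 flags a node whose divergence time cannot be informed by the data.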
Conclusions
We have made many major upgrades to MEGA’s infrastructure and added a number of new functionalities that will enable researchers to conduct additional analyses with greater ease. These upgrades make the seventh version of MEGA more versatile than previous versions. For Microsoft Windows, the 64-bit MEGA is made available with a Graphical User Interface and as a command line program intended for use in high-throughput and scripted analysis. Both versions are available from free of charge. The command line version of MEGA7 is now also available as native applications for Linux and Mac OS X. The GUI version of MEGA7 is also available for Mac OS X, where we provide an installation that automatically configures the use of Wine for compatibility with Mac OS X. Since Wine only supports 32-bit software, we provide 32-bit MEGA7 GUI for Mac OS X. However, Mac and Linux users can run the 64-bit Windows version of MEGA7 GUI using virtual machine environments, including VMWare, Parallels, or Crossover. Alternatively, 64-bit MEGA-CC along with MEGA-PROTO can be used as they run natively on Windows, Mac OS X, and Linux.
Acknowledgments
We thank Charlotte Konikoff and Mike Suleski for extensively testing MEGA7. Many other laboratory members and beta testers provided invaluable feedback and bug reports. We thank Julie Marin for help in assembling the rRNA data analyzed. This study was supported in part by research grants from National Institutes of Health (HG002096-12 to S.K.) and Japan Society for the Promotion of Science (JSPS) grants-in-aid for scientific research (24370033) to K.T.
FIG. 3. Tree Explorer window with gene duplications marked with closed blue diamonds. Speciation events, if a trusted species tree is provided, are identified by open red diamonds (see fig. 2 legend for more information).
References
Edgar RC. 2004. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5:113.
Filipski A, Murillo O, Freydenzon A, Tamura K, Kumar S. 2014. Prospects for building large timetrees using molecular data with incomplete gene coverage among species. Mol Biol Evol 31:2542–2550.
Hedges SB, Marin J, Suleski M, Paymer M, Kumar S. 2015. Tree of life reveals clock-like speciation and diversification. Mol Biol Evol
Kumar S, Stecher G, Peterson D, Tamura K. 2012. MEGA-CC: computing core of molecular evolutionary genetics analysis program for automated and iterative data analysis. Bioinformatics
Kumar S, Tamura K, Nei M. 1994. MEGA: molecular evolutionary genetics analysis software for microcomputers. Comput Appl Biosci
Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, Peplies J, Glöckner FO. 2013. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res 41:D590–D596.
Saitou N, Nei M. 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol
Tamura K, Battistuzzi FU, Billing-Ross P, Murillo O, Filipski A, Kumar S. 2012. Estimating divergence times in large molecular phylogenies. Proc Natl Acad Sci USA 109:19333–19338.
Tamura K, Nei M. 1993. Estimation of the number of nucleotide substitutions in the control region of mitochondrial-DNA in humans and chimpanzees. Mol Biol Evol 10:512–526.
Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. 2013. MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol
Yilmaz P, Parfrey LW, Yarza P, Gerken J, Pruesse E, Quast C, Schweer T, Peplies J, Ludwig W, Glöckner FO. 2014. The SILVA and “All-species Living Tree Project (LTP)” taxonomic frameworks. Nucleic Acids Res
Zmasek CM, Eddy SR. 2001. A simple algorithm to infer gene duplication and speciation events on a gene tree. Bioinformatics
... The predicted proteins of C. liberica transcripts were also used to search homologous genes in the three genomes of coffee species, including C. canephora, C. eugenioides and C. arabica by Blastp method (Altschul et al., 1990;Wanner et al., 1995;Yu et al., 2018;Huang et al., 2020). The proteins with single copy in the four Coffea species were selected to construct the maximum likelihood phylogenetic tree by the MEGA 5.0 software (Kumar et al., 2016). ...
... PAL transcripts were validated for full coding region of open reading frame (ORF) by the ORF-FINDER software and further used for feature characterization including the protein lengths (aa), the molecular weights (Da), the theoretical isoelectric points (pI) and the subcellular localizations with the ProtParam tool and CELLO software (Rombel et al., 2002;Yu et al., 2006). The PAL proteins of the six species were selected for the construction of a maximum likelihood phylogenetic tree, and further aligned to investigate conserved amino acid residues by the MEGA 5.0 software (Kumar et al., 2016). The chromosomal distribution of coffee PAL genes was pointed according to their genome locations by the TBtools software (Chen et al., 2020). ...
Coffea liberica is wildly cultivated for producing Liberian coffee which ranks only after the Arabica (Coffea ara-bica) and Robusta (Coffea canephora) coffees. Another strength of Coffea liberica is its resistance to the devasting leaf rust fungus Hemileia vastatrix. In order to make a comprehensive understanding of the molecular specificity of Coffea liberica on coffee growth and defense, we hence assembled the first leaf transcriptome of Coffea liberica and finally generated 55,446 transcripts with about 70 % of them annotated in public databases. The maximum likelihood phylogenetic tree of single copy proteins in Coffea species revealed that Coffea liberica was closely related with Coffea canephora. Considering the important roles of phenylalanine ammonia-lyase (PAL) genes in growth and defense regulation of plants, we further characterized PALs in four coffee species. The result showed that six, four, four and three PAL genes exist in Coffea arabica, Coffea eugenioides, Coffea canephora and Coffea liberica, respectively. The phylogenetic tree was constructed with maximum likelihood method, which also revealed the species-specific evolution of PAL genes in model plants and coffee species. Although there was a slight expansion of PAL1 subgroup in Coffea arabica, the fractionation after tetraploidy might cause this diversion. Moreover, in silico expression analysis indicated PAL1s had expression dominance of in Coffea arabica. PAL genes were induced by leaf rust fungus, which showed higher induction in the resistant variety Híbrido de Timor (HDT) compared with the susceptible variety Caturra. Taken together, our study not only enriches the bio-information of transcripts in Coffea liberica, but also provides the evidences that PAL genes could control the agronomic traits related to disease resistance and secondary phenylpropanoid metabolites.
... ITS and TEF1α amplicons were sent to Macrogen, Inc (Seoul, South Korea) for Sanger sequencing. Next, consensus sequences were processed using MEGA (Molecular Evolutionary Genetic Analysis, version 7.0®) (Kumar et al. 2016). Finally, the product was compared against the Gen-Bank (NCBI 2016) and BOLD databases (Ratnasingham and Hebert 2007) and deposited in GenBank. ...
Full-text available
Coffee leaf rust (CLR) is caused by Hemileia vastatrix and is the most critical phytosanitary problem for coffee production. So far, no biocontrol agents (BCAs) have been registered, and the selection of native agents is required. The present work evaluates the enzymatic activity of chitinase (, glucanase (, and protease ( in three native CLR mycoparasite isolates from Chiapas, Mexico. Isolates were grown for 10 days on inducing substrate and on urediniospores collected in coffee plantations in Chiapas, Mexico. Isolate CERI-530 exhibited higher chitinase (38,178 ± 2950 U/mg) and glucanase (9720 ± 282 U/mg) activity in the presence of CLR, and chitinase and glucanase zymogram analysis revealed a typical 50 kDa band. Isolate CERI-701 showed additional 40, 30 and 20 kDa chitinase bands in the presence of CLR. Protease activity was visualized for all isolates, also in the presence of CLR. It was found that hydrolytic enzymes play an important role in the CLR-mycoparasite interactions for the strains of our study. Strain CERI-542 spores and supernatant significantly decreased CLR urediniospore germination. Observation under scanning electron microscope revealed that CERI-542 was the most aggressive CLR mycoparasite and had the ability to destroy CLR urediniospores, which also suggests that different action mechanisms are at work. The present study is a first step towards a deeper understanding of the interactions between native mycoparasites and CLR, and reports for the first time the formation of a covering over the urediniospores as part of these interactions.
... In the present study, 162 ITS sequences, 18 of GAPDH and 74 TEF-1α sequences of B. sorokiniana retrieved from NCBI, including 21 ITS sequences obtained in this study, were used to perform multiple sequence alignment (MSA) using ClustalW of MEGA-X [31] as well as to construct the phylogenetic tree using the JTT model with γ distribution and complete deletion of removal or gaps. Finally, the tree was visualized using the iTOL (iTOL v6 verson; ...
Full-text available
Bipolaris sorokiniana is a fungal pathogen that infects wheat, barley, and other crops, causing spot blotch disease. The disease is most common in humid, warm, wheat-growing regions, with South Asia’s Eastern Gangetic Plains serving as a hotspot. There is very little information known about its genetic variability, demography, and divergence period. The current work is the first to study the phylogeographic patterns of B. sorokiniana isolates obtained from various wheat and barley-growing regions throughout the world, with the goal of elucidating the demographic history and estimating divergence times. In this study, 162 ITS sequences, 18 GAPDH sequences, and 74 TEF-1αsequences from B. sorokiniana obtained from the GenBank, including 21 ITS sequences produced in this study, were used to analyse the phylogeographic pattern of distribution and evolution of B. sorokiniana infecting wheat and barley. The degrees of differentiation among B. sorokiniana sequences from eighteen countries imply the presence of a broad and geographically undifferentiated global population. The study provided forty haplotypes. The H_1 haplotype was identified to be the ancestral haplotype, followed by H_29 and H_27, with H_1 occupying a central position in the median-joining network and being shared by several populations from different continents. The phylogeographic patterns of species based on multi-gene analysis, as well as the predominance of a single haplotype, suggested that human-mediated dispersal may have played a significant role in shaping this pathogen’s population. According to divergence time analysis, haplogroups began at the Plio/Pleistocene boundary.
... Nhận diện trình tự gen 16S rRNA thông qua dữ liệu NCBI Blast và Eztaxon Server. Quan hệ di truyền của chủng phân lập được thiết lập bằng phần mềm MEGA 7 sử dụng phương pháp Neighbor-Joining[20].Most commercial antibiotics are derived fromStreptomyces. The search for new potential microbial sources (non-Streptomyces) for antibacterial activity is proposed to prevent drug resistance by current pathogenic microbes. ...
Most commercial antibiotics are derived from Streptomyces. The search for new potential microbial sources (non-Streptomyces) for antibacterial activity is proposed to prevent drug resistance by current pathogenic microbes. Strain C21 was isolated from soil, with small colonies (0.8-1.2 mm in size) on an intensive soil extract medium (ISEM). Sequence analysis of the 16S rRNA gene revealed strain C21 belongs to the group of bacteria that are difficult to cultivate, and is considered a candidate for novel species of genus Microbacterium. Except for R2A medium, strain C21 was only able to grow in nutrient-poor media such as NB/3, TSB/10, and R4/10. N-acetylglucosamine, maltose, D-glucose, L-proline, L-rhamnose, inositol, sodium acetate and 3-hydroxybutyric acid were suitable carbon sources for the growth of strain C21. Crude extract from fermentation liquid of strain C21 can inhibit Enterococcus faecalis CCARM 5168 at a concentration of 16 μg/ml, 8 μg/ml for E. faecalis CCARM 5171, 32 μg/ml for E. faecalis CCARM 5024, 64 μg/ml for E. faeciumCCARM 5025, 32 μg/ml for Streptococcus agalactiaeCCARM 4504, and 8 μg/ml for S. pyogenes CCARM 4520. Results of this study established a basis for future studies on finding potential antibacterial compounds from difficult-to-culture bacteria.
... 7.2.6 for windows and submitted to the Basic Local Alignment Search Tool (BLAST) (https:// to determine species that had the greatest homology or similarity molecularly. The phylogeny tree was designed using the Mega 7 for Windows program (Kumar et al. 2016), using the UPGMA (jukes and cantor model) method. The ITS region sequences for several strains used as a reference were obtained from NCBI (https://www. ...
Full-text available
Fungi from South Sumatra (Indonesia) were identified morphologically and molecularly, and their pathogenicity to egg, larvae, and adult Aedes aegypti was evaluated. The fungal isolates used for bioassay were 11 isolates from this study and 4 isolates from the laboratory collection. Fifteen isolates of five fungal species (Metarhizium anisopliae, Penicillium citrinum, Talaromyces diversus, Beauveria bassiana, and Purpureocillium lilacinum) from South Sumatra, Indonesia, were pathogenic to the egg, larvae, and adult of Ae. aegypti. Egg mortality caused by M. anisopliae isolate MSwTp3 was the highest (38.31%). A novel finding of this study was that the eggs exposed to the fungus not only killed the eggs but could continue to kill the emerging larvae, pupae, and adults. The five fungal species induced larval mortality between 52.22−94.44% and adult mortality between 50.00−92.22%. Fungal strains belonging to M. anisopliae, P. citrinum, T. diversus, and B. bassiana from South Sumatra seem to possess remarkable ovicidal, larvicidal and adulticidal activity against Ae. aegypti. M. anisopliae, P. citrinum, T. diversus, and B. bassiana had the potential as entomopathogens to be developed into ovicides, larvicides, and adulticides for controlling Ae. aegypti.
Controlling bacterial biofilms is a major target for industrial processes and thus is a tedious task under study in many research proceedings. The algal associated bacterial isolates from Unai Mata hot water spring, Gujarat, India, were screened for production of potent amylases, proteases and biosurfactants. The partially purified biomolecules were checked for effects of substrate concentration, cations and stability at extreme physical conditions like temperature (50 ℃) and acidic pH (5 and 6). Metal ions namely—Cu and Zn for amylase, while Zn and Na for protease, were found to be yielding least and highest activities, respectively. Best three isolates were selected for potential biomolecule production and sequenced for 16 s rDNA gene. The three isolates were found to be Stenotrophomonas sp. strain T1UM1, Pantoea sp. strain T1UM4 and Bacillus sp. strain T1UM8 with GenBank accession number MH764436, MH764437 and MH764438, respectively. The partially purified biomolecules cocktail showed effective antibiofilm activities at concentration of 1, 2, 5 and 10 U/ml or mg/ml against various bacteria tested at 1, 2 and 5 h treatments. The effects were different on the type of bacterium containing a combination of specific biomolecules. Thus, biomolecule cocktail reported here from hot water spring isolates has antibiofilm application potential in industries.
The root-knot nematodes, Meloidogyne spp., are endo-parasitic nematodes that can cause significant damage to numerous economically important crops. In a survey of plant pathogens, we have recorded for the first time the infestation of the destructive pest, Meloidogyne arenaria, on black pepper (Piper nigrum) and coffee (Coffea canephora) in Vietnam, the two most important cash crops in the country. Our study characterises four M. arenaria populations associated with black pepper and coffee using morphology, morphometrics, and molecular data of five gene regions, including ITS, D2-D3 of 28S rRNA, COI, COII-16S, and Nad5 mtDNA regions. The detailed morphological data of these nematode populations are useful for the diagnosis of M. arenaria in general. Additionally, molecular barcodes that are unequivocally linked with morphological and morphometric data can facilitate the molecular identification of this pest. Since M. arenaria is known as one of the most destructive plant pathogens, its presence in black pepper and coffee in Vietnam needs to be monitored carefully.
Full-text available
Background Foraminispora rugosa is a species reported from Brazil, Venezuela, French Guiana, Costa Rica and Cuba. It is a basidiomycete in the Ganodermataceae family. In this study, both chemical composition and cytotoxicity of the ethanolic extract of F. rugosa were investigated for the first time. Results Phylogenetic analysis confirmed the identification of the specimens, and the results of cytotoxicity assays showed that at concentrations of 7.8–500.0 µg/mL the ethanolic extract displayed weak cytotoxicity against the tested cell lines. Five oxylipins were identified by ultra high performance liquid chromatography coupled with quadrupole time-of-flight and mass spectrometry (UHPLC-QTOF–MS). Conclusions This study provides new insights into the current knowledge of bioactive compounds produced by macrofungi, and provides data for future biological assays with relative selectivity and safety.
Proteins that preferentially bind the consensus DNA sequence (A/T)GATA(A/G) belong to the GATA transcription factor family, which is involved in a wide array of biological processes in plants. Cassava (Manihot esculenta) is an important food crop with high production of starch in storage roots. However, little was known about the cassava GATA genes (MeGATAs). Thirty-six MeGATAs, MeGATA1 to MeGATA36, were found in this study. Some MeGATAs showed a collinear relationship with orthologous genes of Arabidopsis, poplar, potato, rice, maize, and sorghum. The eight MeGATA proteins analyzed were all localized in the nucleus. Some MeGATAs had the potential for ligand binding and/or enzyme activity. One tandem-duplicated pair, MeGATA17-MeGATA18, and thirty pairs of whole-genome-duplicated MeGATAs were found. Fourteen MeGATAs showed low or no expression in the tissues. Nine analyzed MeGATAs showed expression responses to abiotic stresses and exogenous phytohormones. Three groups of MeGATA protein interactions were found. Fifty-three miRNAs that can target 18 MeGATAs were identified. Eight MeGATAs were found to target 292 other cassava genes, which were associated with radial pattern formation and phyllome development by GO enrichment, and with autophagy by KEGG enrichment. These data suggest that MeGATAs are functional generalists in interactions between cassava growth and development, abiotic stresses, and starch metabolism.
Clipping of the histone H3 N-terminal tail has been implicated in multiple fundamental biological processes for a growing list of eukaryotes. H3 clipping, serving as an irreversible process to permanently remove some post-translational modifications (PTMs), may lead to noticeable changes in chromatin dynamics or gene expression. The eukaryotic model organism Tetrahymena thermophila is among the first few eukaryotes found to exhibit H3 clipping activity, wherein the first six amino acids of H3 are cleaved off during vegetative growth. Clipping only occurs in the transcriptionally silent micronucleus of the binucleated T. thermophila, thus offering a unique opportunity to reveal the role of H3 clipping in epigenetic regulation. However, the physiological functions of the truncated H3 and the protease(s) responsible for clipping remain elusive. Here, we review the major findings of H3 clipping in T. thermophila and highlight its association with histone modifications and cell cycle regulation. We also summarize the functions and mechanisms of H3 clipping in other eukaryotes, focusing on the high diversity in terms of protease families and cleavage sites. Finally, we predict several protease candidates in T. thermophila and provide insights for future studies.
Genomic data are rapidly resolving the tree of living species calibrated to time, the timetree of life, which will provide a framework for research in diverse fields of science. Previous analyses of taxonomically restricted timetrees have found a decline in the rate of diversification in many groups of organisms, often attributed to ecological interactions among species. Here we have synthesized a global timetree of life from 2,274 studies representing 50,632 species and examined the pattern and rate of diversification as well as the timing of speciation. We found that species diversity has been mostly expanding overall and in many smaller groups of species, and that the rate of diversification in eukaryotes has been mostly constant. We also identified, and avoided, potential biases that may have influenced previous analyses of diversification including low levels of taxon sampling, small clade size, and the inclusion of stem branches in clade analyses. We found consistency in time-to-speciation among plants and animals (approximately two million years), as measured by intervals of crown and stem species times. Together, this clock-like change at different levels suggests that speciation and diversification are processes dominated by random events and that adaptive change is largely a separate process. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Scientists are assembling sequence data sets from increasing numbers of species and genes to build comprehensive timetrees. However, data are often unavailable for some species and gene combinations, and the proportion of missing data is often large for data sets containing many genes and species. Surprisingly, there has not been a systematic analysis of the effect of the degree of sparseness of the species-gene matrix on the accuracy of divergence time estimates. Here, we present results from computer simulations and empirical data analyses to quantify the impact of missing gene data on divergence time estimation in large phylogenies. We found that estimates of divergence times were robust even when sequences from a majority of genes for most of the species were absent. From the analysis of such extremely sparse data sets, we found that the most egregious errors occurred for nodes in the tree that had no common genes for any pair of species in the immediate descendant clades of the node in question. These problematic nodes can be easily detected prior to computational analyses based only on the input sequence alignment and the tree topology. We conclude that it is best to use larger alignments, since adding both genes and species to the alignment augments the number of genes available for estimating divergence events deep in the tree, and improves their time estimates.
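The pre-analysis check described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `find_uninformed_nodes`, the nested-tuple tree encoding, and the `genes` dictionary are all assumptions made for the example. It relies on the observation that some pair of species across the two descendant clades shares a gene exactly when the pooled gene sets of the two clades overlap.

```python
def find_uninformed_nodes(tree, genes):
    """Flag internal nodes whose two immediate descendant clades share no gene.

    tree  : hypothetical encoding, a leaf is a species name (str) and an
            internal node is a 2-tuple (left_subtree, right_subtree)
    genes : dict mapping species name -> set of genes sequenced for it
    Returns a list of (left_leaves, right_leaves) tuples, one per flagged node.
    """
    flagged = []

    def visit(node):
        # Leaf: report the species and the genes available for it.
        if isinstance(node, str):
            return [node], set(genes[node])
        (l_leaves, l_genes) = visit(node[0])
        (r_leaves, r_genes) = visit(node[1])
        # No gene in common between the pooled sets means no pair of
        # species (one from each clade) has a gene in common either.
        if not (l_genes & r_genes):
            flagged.append((tuple(sorted(l_leaves)), tuple(sorted(r_leaves))))
        return l_leaves + r_leaves, l_genes | r_genes

    visit(tree)
    return flagged


# Toy example: species A and B were sequenced for different genes.
tree = ((("A", "B"), "C"), ("D", "E"))
genes = {
    "A": {"g1"},
    "B": {"g2"},
    "C": {"g1", "g2"},
    "D": {"g3"},
    "E": {"g1", "g3"},
}
problem_nodes = find_uninformed_nodes(tree, genes)
print(problem_nodes)  # [(('A',), ('B',))]
```

Only the tree topology and the gene presence pattern are needed, which is why such nodes can be found before any time estimation is run.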
SILVA (from Latin silva, forest) is a comprehensive resource for up-to-date quality-controlled databases of aligned ribosomal RNA (rRNA) gene sequences from the Bacteria, Archaea and Eukaryota domains and supplementary online services. SILVA provides a manually curated taxonomy for all three domains of life, based on representative phylogenetic trees for the small- and large-subunit rRNA genes. This article describes the improvements the SILVA taxonomy has undergone in the last 3 years. Specifically we are focusing on the curation process, the various resources used for curation and the comparison of the SILVA taxonomy with Greengenes and RDP-II taxonomies. Our comparisons not only revealed a reasonable overlap between the taxa names, but also point to significant differences in both names and numbers of taxa between the three resources.
We announce the release of an advanced version of the Molecular Evolutionary Genetics Analysis (MEGA) software, which currently contains facilities for building sequence alignments, inferring phylogenetic histories, and conducting molecular evolutionary analysis. In version 6.0, MEGA now enables the inference of timetrees, as it implements our RelTime method for estimating divergence times for all branching points in a phylogeny. A new Timetree Wizard in MEGA6 facilitates this timetree inference by providing a graphical user interface (GUI) to specify the phylogeny and calibration constraints step-by-step. This version also contains enhanced algorithms to search for the optimal trees under evolutionary criteria and implements a more advanced memory management that can double the size of sequence data sets to which MEGA can be applied. Both GUI and command-line versions of MEGA6 can be downloaded from free of charge.
SILVA (from Latin silva, forest) is a comprehensive web resource for up-to-date, quality-controlled databases of aligned ribosomal RNA (rRNA) gene sequences from the Bacteria, Archaea and Eukaryota domains and supplementary online services. The referred database release 111 (July 2012) contains 3 194 778 small subunit and 288 717 large subunit rRNA gene sequences. Since the initial description of the project, substantial new features have been introduced, including advanced quality control procedures, an improved rRNA gene aligner, online tools for probe and primer evaluation and optimized browsing, searching and downloading on the website. Furthermore, the extensively curated SILVA taxonomy and the new non-redundant SILVA datasets provide an ideal reference for high-throughput classification of data from next-generation sequencing approaches.
Molecular dating of species divergences has become an important means to add a temporal dimension to the Tree of Life. Increasingly larger datasets encompassing greater taxonomic diversity are becoming available to generate molecular timetrees by using sophisticated methods that model rate variation among lineages. However, the practical application of these methods is challenging because of the exorbitant calculation times required by current methods for contemporary data sizes, the difficulty in correctly modeling the rate heterogeneity in highly diverse taxonomic groups, and the lack of reliable clock calibrations and their uncertainty distributions for most groups of species. Here, we present a method that estimates relative times of divergences for all branching points (nodes) in very large phylogenetic trees without assuming a specific model for lineage rate variation or specifying any clock calibrations. The method (RelTime) performed better than existing methods when applied to very large computer simulated datasets where evolutionary rates were varied extensively among lineages by following autocorrelated and uncorrelated models. On average, RelTime completed calculations 1,000 times faster than the fastest Bayesian method, with even greater speed difference for larger number of sequences. This speed and accuracy will enable molecular dating analysis of very large datasets. Relative time estimates will be useful for determining the relative ordering and spacing of speciation events, identifying lineages with significantly slower or faster evolutionary rates, diagnosing the effect of selected calibrations on absolute divergence times, and estimating absolute times of divergence when highly reliable calibration points are available.
There is a growing need in the research community to apply the molecular evolutionary genetics analysis (MEGA) software tool for batch processing a large number of datasets and to integrate it into analysis workflows. Therefore, we now make available the computing core of the MEGA software as a stand-alone executable (MEGA-CC), along with an analysis prototyper (MEGA-Proto). MEGA-CC provides users with access to all the computational analyses available through MEGA's graphical user interface version. This includes methods for multiple sequence alignment, substitution model selection, evolutionary distance estimation, phylogeny inference, substitution rate and pattern estimation, tests of natural selection and ancestral sequence inference. Additionally, we have upgraded the source code for phylogenetic analysis using the maximum likelihood methods for parallel execution on multiple processors and cores. Here, we describe MEGA-CC and outline the steps for using MEGA-CC in tandem with MEGA-Proto for iterative and automated data analysis.
A computer program package called MEGA has been developed for estimating evolutionary distances, reconstructing phylogenetic trees and computing basic statistical quantities from molecular data. It is written in C++ and is intended to be used on IBM and IBM-compatible personal computers. In this program, various methods for estimating evolutionary distances from nucleotide and amino acid sequence data, three different methods of phylogenetic inference (UPGMA, neighbor-joining and maximum parsimony) and two statistical tests of topological differences are included. For the maximum parsimony method, new algorithms of branch-and-bound and heuristic searches are implemented. In addition, MEGA computes statistical quantities such as nucleotide and amino acid frequencies, transition/transversion biases, codon frequencies (codon usage tables), and the number of variable sites in specified segments in nucleotide and amino acid sequences. Advanced on-screen sequence data and phylogenetic-tree editors facilitate publication-quality outputs with a wide range of printers. Integrated and interactive designs, on-line context-sensitive helps, and a text-file editor make MEGA easy to use.
A new method called the neighbor-joining method is proposed for reconstructing phylogenetic trees from evolutionary distance data. The principle of this method is to find pairs of operational taxonomic units (OTUs [= neighbors]) that minimize the total branch length at each stage of clustering of OTUs starting with a starlike tree. The branch lengths as well as the topology of a parsimonious tree can quickly be obtained by using this method. Using computer simulation, we studied the efficiency of this method in obtaining the correct unrooted tree in comparison with that of five other tree-making methods: the unweighted pair group method of analysis, Farris's method, Sattath and Tversky's method, Li's method, and Tateno et al.'s modified Farris method. The new, neighbor-joining method and Sattath and Tversky's method are shown to be generally better than the other methods.
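One clustering step of neighbor-joining can be sketched as below. This is a minimal illustration under stated assumptions: it uses the widely taught Q-criterion formulation, Q(i,j) = (n-2)·d(i,j) - r_i - r_j with r_i the row sum of the distance matrix, which selects the same pair as minimizing total branch length but is not the notation of the abstract itself. The function name `nj_first_join` and the toy distance matrix are hypothetical.

```python
def nj_first_join(names, d):
    """Pick the first pair of neighbors to join and their branch lengths.

    names : list of OTU labels
    d     : symmetric distance matrix (list of lists), needs at least 3 OTUs
    Returns ((name_i, name_j), (length_i, length_j)).
    """
    n = len(names)
    if n < 3:
        raise ValueError("need at least three OTUs")
    r = [sum(row) for row in d]  # total distance from each OTU to all others

    # Scan every pair and keep the one with the smallest Q value.
    best = None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * d[i][j] - r[i] - r[j]
            if best is None or q < best[0]:
                best = (q, i, j)
    _, i, j = best

    # Branch lengths from the joined OTUs to their new internal node.
    length_i = d[i][j] / 2 + (r[i] - r[j]) / (2 * (n - 2))
    length_j = d[i][j] - length_i
    return (names[i], names[j]), (length_i, length_j)


# Toy five-OTU distance matrix (illustrative values only).
names = ["a", "b", "c", "d", "e"]
d = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
pair, lengths = nj_first_join(names, d)
print(pair, lengths)  # ('a', 'b') (2.0, 3.0)
```

A full tree is obtained by replacing the joined pair with a new node, recomputing distances to it, and repeating until three nodes remain. Note that d and e are the closest pair in this matrix (distance 3), yet a and b are joined first, which shows how the criterion differs from simply clustering the nearest OTUs as UPGMA does.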