The Galápagos iguanas are among the oldest vertebrate lineages on the Galápagos archipelago, and the evolutionary history of this clade is of great interest to biologists. We describe here the complete mitochondrial genomes of the marine iguana, Amblyrhynchus cristatus (GenBank accession number: KT277937), and the land iguana, Conolophus subcristatus (GenBank accession number: KT277936). The genomes contain 13 protein-coding genes, 22 transfer RNA genes, and two ribosomal RNA genes, as well as a control region (CR). Both species have an identical gene order, which matches that of Iguana iguana. The CR of both Galápagos iguanas features similar tandem repeat units, which are absent in I. iguana. We present a phylogeny of the Iguanidae based on complete mitochondrial genomes, which confirms the sister-group relationship of the Galápagos iguanas. These new mitochondrial genomes constitute an important data source for future exploration of the phylogenetic relationships and evolutionary history of the Galápagos iguanas.
ISSN: 1940-1736 (print), 1940-1744 (electronic)
Mitochondrial DNA, Early Online: 1–2
© 2015 Taylor & Francis. DOI: 10.3109/19401736.2015.1079863
The complete mitochondrial genomes of the Galápagos iguanas, Amblyrhynchus cristatus and Conolophus subcristatus
Amy MacLeod, Iker Irisarri, Miguel Vences, and Sebastian Steinfartz
Department of Evolutionary Biology, Zoological Institute, Technische Universität Braunschweig, Braunschweig, Germany and
Laboratory for Zoology and Evolutionary Biology, Department of Biology, University of Konstanz, Konstanz, Germany
Evolutionary relationships, Galápagos archipelago, Iguanidae, Islands,
Received 15 July 2015
Revised 20 July 2015
Accepted 29 July 2015
Published online 3 September 2015
The Galápagos iguanas are sister taxa that diverged from a
common ancestor on the Galápagos archipelago some 4.5 million
years ago (MacLeod et al., 2015). This clade, including one
species of marine iguana (genus Amblyrhynchus) and three species
of land iguana (genus Conolophus), represents one of the most
ancient endemic clades of the Galápagos, and molecular studies
of them have produced important insights into evolutionary
processes (e.g. MacLeod et al., 2015; Rassmann et al., 1997;
Steinfartz et al., 2009). We describe here the first complete
mitochondrial genome sequences of Amblyrhynchus cristatus and
Conolophus subcristatus. In addition, we present a mitogenomic
phylogeny of the Iguanidae, which illustrates the position of the
Galápagos iguanas in relation to other species for which complete
mitochondrial genomes are available.
Genomic DNA was isolated from blood samples collected with
permission of the Galápagos National Park service. A. cristatus
was sampled on San Cristóbal Island, and C. subcristatus on
Fernandina Island. Mitochondrial genomes were amplified and
sequenced using a primer-walking approach. PCR involved
primers available from the literature, as well as new primers
developed for the study (details available upon request).
Sequences were assembled in CodonCode (V4.2.5; CodonCode
Corporation, Dedham, MA) and full length mitogenomes
were annotated with the MITOS pipeline (Bernt et al., 2013)
and manually checked by comparison to Iguana iguana
(Genbank accession number: NC002793). Nucleotide gene
alignments were performed with TranslatorX (Abascal et al.,
2010) and MAFFT (Katoh & Standley, 2013) for protein-coding
and ribosomal RNA genes respectively, and concatenated into a
single matrix for phylogenetic analysis. Maximum likelihood
phylogenetic reconstruction was performed with RAxML v.8.1.16
(Stamatakis, 2014) using independent gene partitions under
GTR + G and 100 independent maximum likelihood searches.
Branch support was assessed with 1000 replicates of non-
parametric bootstrapping. A Bayesian analysis was performed
with PhyloBayes MPI v.1.5 (Lartillot et al., 2013) under the CAT-
GTR model and by running two MCMC chains until convergence.
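The concatenation step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `concatenate`, the toy sequences, and the gene labels are hypothetical, and it only assumes that each gene has already been aligned (for example with TranslatorX or MAFFT). It joins per-gene alignments into a single matrix, pads taxa missing a gene with gaps, and records one partition line per gene in the `DNA, name = start-end` form used by RAxML partition files.

```python
def concatenate(gene_alignments):
    """Join per-gene alignments into one supermatrix.

    gene_alignments: dict mapping gene name -> {taxon: aligned sequence}.
    Returns (supermatrix, partitions), where partitions holds one
    RAxML-style partition line per gene, in input order.
    """
    taxa = sorted({t for aln in gene_alignments.values() for t in aln})
    pieces = {t: [] for t in taxa}
    partitions = []
    start = 1  # partition coordinates are 1-based and inclusive
    for gene, aln in gene_alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences must be aligned to equal length")
        length = lengths.pop()
        for t in taxa:
            # A taxon lacking this gene is filled with gap characters.
            pieces[t].append(aln.get(t, "-" * length))
        partitions.append(f"DNA, {gene} = {start}-{start + length - 1}")
        start += length
    supermatrix = {t: "".join(parts) for t, parts in pieces.items()}
    return supermatrix, partitions


# Toy data, invented for illustration only (not real iguana sequences).
toy = {
    "ND1": {"Amblyrhynchus": "ATGCCA", "Conolophus": "ATGCCG", "Iguana": "ATGTCA"},
    "12S": {"Amblyrhynchus": "GGTA", "Conolophus": "GGTA"},
}
supermatrix, partitions = concatenate(toy)
for line in partitions:
    print(line)
```

Keeping the partition boundaries alongside the matrix is what allows each gene to receive its own substitution model parameters in a partitioned analysis.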
The mitogenomes of A. cristatus and C. subcristatus are,
respectively, 16 897 bp and 16 892 bp long. Gene order was
identical in both species and matches exactly with that of I.
iguana, corresponding to the ancestral vertebrate mitochondrial
gene order. Our findings confirm the repetitive elements found in
the control region in previous studies (Hanley & Caccone, 2005),
with both Galápagos iguanas displaying four repetitive regions of
73 and 74 bp for Conolophus and Amblyrhynchus, respectively.
These repetitive sequences have a high sequence identity
(uncorrected p-distance: 0.35), implying that they have probably evolved
from a common ancestor. However, such repetitive elements are
absent in the CR of both I. iguana (Janke et al., 2001) and Cyclura
pinguis (Genbank accession number: NC027089), and may prove
interesting for future phylogenetic applications. Our phylogenetic
reconstruction confirms the sister-group relationship between the
Galápagos iguanas with strong support, and indicates a relatively
close relationship with Iguana and Cyclura (Figure 1), confirming
earlier work (Pyron et al., 2013). The mitogenomes presented here
will enhance further study into the evolutionary relationships of
the Galápagos iguanas as mitogenomes of additional species and
lineages of these lizards become available.
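For readers unfamiliar with the uncorrected p-distance reported for the repeat units, the following Python sketch shows how the measure is commonly computed: the number of differing sites divided by the number of sites compared, with no correction for multiple substitutions. The function name `p_distance`, the choice to skip gapped or ambiguous sites, and the toy sequences are assumptions made for illustration; the source does not state how gaps were handled.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected p-distance between two aligned nucleotide sequences.

    Sites where either sequence has a gap or a non-ACGT symbol are
    skipped (an assumption of this sketch, not taken from the paper).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in valid or b not in valid:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Toy sequences, invented for illustration: 2 of 10 sites differ.
print(p_distance("ACGTACGTAC", "ACGTTCGTAA"))  # 0.2
```

Under this definition, a p-distance of 0.35 means that 35% of the compared sites differ between the two sequences.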
Correspondence: Amy MacLeod, Department of Evolutionary Biology,
Zoological Institute, Technische Universität Braunschweig,
Mendelssohnstr. 4, 38106 Braunschweig, Germany. E-mail:
The authors thank K. Rassmann and F. Trillmich for sample collection,
M. Kondermann, J. Juras and L. M. Köpping for assistance with the
laboratory work, and A. Meyer for logistic assistance. This publication is
contribution number 2119 of the Charles Darwin Foundation for the
Galapagos Islands.
Declaration of interest
This study was supported by grants from the Swiss Friends of the
Galápagos Islands. I. Irisarri was supported by postdoc fellowships of the
Alexander von Humboldt Foundation and EMBO. The authors declare
that they have no conflicting interests.
Abascal F, Zardoya R, Telford MJ. (2010). TranslatorX: Multiple
alignment of nucleotide sequences guided by amino acid translations.
Nucleic Acids Res 38:W7–13.
Bernt M, Donath A, Jühling F, Externbrink F, Florentz C, Fritzsch G, Pütz
J, et al. (2013). MITOS: Improved de novo metazoan mitochondrial
genome annotation. Mol Phylogenet Evol 69:313–19.
Hanley T, Caccone A. (2005). Development of primers to characterize the
mitochondrial control region of Galápagos land and marine iguanas
(Conolophus and Amblyrhynchus). Mol Ecol Notes 5:599–601.
Janke A, Erpenbeck D, Nilsson M, Arnason U. (2001). The mitochondrial
genomes of the iguana (Iguana iguana) and the caiman (Caiman
crocodylus): Implications for amniote phylogeny. Proc R Soc B 268:
Katoh K, Standley DM. (2013). MAFFT multiple sequence alignment
software version 7: Improvements in performance and usability. Mol
Biol Evol 30:772–80.
Lartillot N, Rodrigue N, Stubbs D, Richer J. (2013). PhyloBayes MPI:
Phylogenetic reconstruction with infinite mixtures of profiles in a
parallel environment. Syst Biol 62:611–15.
MacLeod A, Rodríguez A, Vences M, Orozco-terWengel P, García C,
Trillmich F, Gentile G, et al. (2015). Hybridization masks speciation in
the evolutionary history of Galápagos marine iguana. Proc R Soc B
Pyron RA, Burbrink FT, Wiens JJ. (2013). A phylogeny and revised
classification of Squamata, including 4161 species of lizards and
snakes. BMC Evol Biol 13:93.
Rassmann K, Tautz D, Trillmich F, Gliddon C. (1997). The microevo-
lution of the Galapagos marine iguana Amblyrhynchus cristatus
assessed by nuclear and mitochondrial genetic analyses. Mol Ecol 6:
Stamatakis A. (2014). RAxML version 8: A tool for phylogenetic analysis
and post-analysis of large phylogenies. Bioinformatics 30:1312–13.
Steinfartz S, Glaberman S, Lanterbecq D, Russello MA, Rosa S, Hanley
TC, Marquez C, et al. (2009). Progressive colonization and restricted
gene flow shape island-dependent population structure in Galapagos
marine iguanas (Amblyrhynchus cristatus). BMC Evol Biol 9:297.
Figure 1. Maximum likelihood (RAxML) tree from full mitogenome data including a selection of iguanid species and two outgroups (Chamaeleo and Brookesia), which are not shown in the figure for the sake of clarity. Numbers at nodes are support values from non-parametric bootstrap, and posterior probabilities from the Bayesian analysis (PhyloBayes), both transformed into percent. Both trees agree on the close sister-relationship of the Galápagos iguanas, Amblyrhynchus cristatus and Conolophus subcristatus. GenBank accession numbers: Cyclura pinguis NC027089, Urosaurus nigricaudus NC026308, Uta stansburiana NC027261, Anolis carolinensis NC010972, Polychrus marmoratus NC012839, Chalarodon madagascariensis NC012836, Leiocephalus personatus NC012834, Gambelia wislizenii NC012831, Basiliscus vittatus NC012829, Oplurus grandidieri NC012827, Sceloporus occidentalis NC005960, Iguana iguana NC002793, and Anolis carolinensis EU747728.2.