
The enigmatic Betadevario ramachandrani (Teleostei: Cyprinidae: Danioninae): phylogenetic position resolved by mitogenome analysis, with remarks on the prevalence of chimeric mitogenomes in GenBank



Abstract: We present the complete mitochondrial genome and a phylogenetic analysis of the danionine cyprinid Betadevario ramachandrani, endemic to the Western Ghats in India. Bayesian phylogenetic analysis of all available mitochondrial genomes of Danionina shows that B. ramachandrani is the most basal member of a clade also containing Devario, Microdevario and Microrasbora, and this clade is the sister group of Danio. Seven of 20 mitochondrial genomes downloaded from GenBank for phylogenetic analysis were found to be chimeric, including five curated reference genomes, and this did affect our phylogenetic analysis. At least three of these erroneous sequences have been used in other studies. There is reason to suspect that there are numerous chimeric mitogenomes in GenBank.
Michael Norén* and Sven Kullander
Subjects: Ichthyology; Phylogenetic analysis; Conservation Biology
Keywords: DNA barcoding; next generation sequencing; phylogeny
Dr. Michael Norén and Professor Sven Kullander are members of the ichthyology team at the Swedish Museum of Natural History (NRM) in Stockholm, Sweden. Professor Kullander (ORCID: 0000-0001-6075-0266) is the Scientific Curator for the ichthyological and herpetological collections of the NRM, team leader for FishBase Sweden, and an expert on fish taxonomy, specializing in Cichlidae and Cyprinidae. Dr. Norén (ORCID: 0000-0003-2561-6760) is Curator of FishBase Sweden, and specializes in molecular
This study reports on the first sequencing of the whole mitochondrial genome of the rare carpfish Betadevario ramachandrani, and performs an analysis which confirms that it is a primitive relative of the genus Devario. During analysis, it was found that 7 out of 20 mitochondrial genomes downloaded from GenBank for inclusion in this study were amalgams (chimeras) from several different species; in one case, the genome is comprised of sequences from three different species from three different subfamilies of Cyprinidae. The erroneous genomes affected the outcome of the analysis, at least three of the erroneous genomes have been used in other studies, and further analysis suggests that the problem of chimeric genomes in GenBank may be
Norén & Kullander, Cogent Biology (2018), 4: 1525857
© 2018 The Author(s). This open access article is distributed under a Creative Commons Attribution (CC-BY) 4.0 license.
Received: 11 June 2018
Accepted: 14 September 2018
First Published: 20 September 2018
*Corresponding author: Michael Norén, Swedish Museum of Natural History, P.O. Box 50007, SE-10405, Stockholm, Sweden
Reviewing editor: Jason Abernathy, USDA-ARS Southeast Area, USA
1. Introduction
Betadevario ramachandrani Pramod, Fang, Rema Devi, Liao, Indra, Jameela Beevi & Kullander,
2010 is a small (up to 61 mm SL) danionine cyprinid fish which combines morphological characters
typical of the genus Danio, such as two pairs of long barbels, with characters typical of the genus
Devario, such as a prominent dark spot immediately posterior to the gill opening, and is the only
species in the genus Betadevario. It is restricted to a single high-altitude stream, 2.27 m wide and
at most 0.3 m deep, in the upper Sita River drainage in India (Pramod et al., 2010). Its limited
known distribution suggests that it is vulnerable to human activities such as fishing, pollution,
logging, or damming, and there are no records of B. ramachandrani after its description. Pramod
et al. (2010) analysed the morphology of B. ramachandrani and mitochondrial CYTB and nuclear
RHO data from DNA samples stored at the Swedish Museum of Natural History and concluded that
B. ramachandrani is a basal species in the clade containing Devario in both morphological and
molecular analyses. In 2017, it was found that despite being stored in alcohol at −80°C, the DNA
of the samples was degraded, with no remaining fragments longer than 450 base pairs (bp), and a
decision was made to use high-throughput sequencing technology to sequence the whole
mitochondrial genome.
2. Materials and methods
2.1. DNA extraction and sequencing
DNA was extracted from material stored at the Swedish Museum of Natural History (voucher ID: NRM
57,780, capture locality: 13°29ʹ22.8ʺN, 75°03ʹ53.5ʺE, close to Barkana falls, Karnataka, India) using the
GeneMole automated DNA extraction system (Mole Genetics, Lysaker, Norway) with recommended
Twenty microlitres of DNA extract (sample concentration 14 ng DNA per µL) was sent to
Macrogen Inc. (Seoul, Republic of Korea) for shotgun sequencing (HiSeq X sequencer with TruSeq
DNA nano kit (Illumina, San Diego, USA)), producing 523 million paired reads, 145 bp, with 281 bp
insert. The reads have been deposited at the NCBI Sequence Read Archive (SRA), accession number
SRP157898. Assembly of the mitochondrial genome was performed using the computer software
Geneious v.10 (Biomatters, Auckland, New Zealand) (Kearse et al., 2012). The paired reads were
merged, and unmerged reads and merged reads shorter than 50 bp were deleted, leaving a total
of 25 million merged reads. The reads were quality-trimmed to remove positions with >2%
probability of error, and mapped to a published reference mitochondrial genome (Danio dangila,
GenBank accession number NC_015525). A total of 35,430 reads mapped to the reference
sequence, producing a minimum coverage of 131. The 85% majority rule consensus was extracted,
and annotated by transferring annotations from published reference sequences (Danio dangila
NC_015525, Rasbora daniconius NC_015527, Microdevario nanus NC_015546), with manual
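The majority-rule consensus step described above can be illustrated with a minimal sketch. It assumes the mapped reads are already laid out as equal-length rows, with "-" where a read does not cover a column, which is a simplification of a real read mapping. The function name, the toy reads, and the use of N for columns that fall below the threshold are illustrative assumptions and do not reproduce the Geneious implementation.

```python
from collections import Counter


def majority_consensus(rows, threshold=0.85, gap="-"):
    """Call a base per column when its share of covering reads meets the threshold."""
    if not rows:
        return ""
    length = len(rows[0])
    if any(len(r) != length for r in rows):
        raise ValueError("all rows must have the same length")
    consensus = []
    for col in range(length):
        # Only reads that actually cover this column take part in the vote.
        bases = [r[col] for r in rows if r[col] != gap]
        if not bases:
            consensus.append("N")  # no coverage at this position
            continue
        base, count = Counter(bases).most_common(1)[0]
        consensus.append(base if count / len(bases) >= threshold else "N")
    return "".join(consensus)


# Hypothetical toy pileup: seven reads, one of which disagrees in the third column.
toy_reads = [
    "ACGT-",
    "ACGTA",
    "ACCTA",
    "ACGTA",
    "-CGTA",
    "ACGTA",
    "ACGTA",
]
print(majority_consensus(toy_reads))
```

With a threshold of 0.85, a column where six of seven reads agree is still called, whereas a column where only two of three reads agree is reported as N.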
2.2. Phylogenetic analysis
There is at present no consensus on the taxonomy of Cypriniformes, with several novel and partly
conflicting classifications having been proposed in the last few years (Nelson, Grande, & Wilson,
2016; Stout, Tan, Lemmon, Lemmon, & Armbruster, 2016; Van Der Laan, Eschmeyer, & Fricke,
2014). We use Nelson et al. (2016) as main taxonomic reference, but our concept of subtribe
Danionina is from Liao, Kullander, and Fang (2011). Up to two published complete or nearly
complete mitochondrial genomes per species of the subtribe Danionina were downloaded from
GenBank. Seven of the 20 downloaded genomes (GenBank accession numbers AB937094,
KP407138, NC_015528, NC_026122, NC_027688, NC_028526, NC_029771) were chimeric and
were removed from analysis. A total of 13 mitogenomes, representing 12 species and all genera
of Danionina except Chela and Laubuka, were aligned using the MAFFT (Katoh, Misawa, Kuma, &
Miyata, 2002) plug-in for Geneious. All included species share the same gene order. The control
region could not be confidently aligned and was deleted. One unique insertion in the Microdevario
and two in the Danionella genomes were deleted. A distantly related potential danionine,
Sundadanio rubellus (in GenBank under the trade name Sundadanio axelrodi RED, with accession
number AP011401), served as outgroup. The final alignment was 15,872 bp. For the phylogenetic
analysis, the data were partitioned by protein coding or not protein coding, and coding data further
partitioned based on codon position, for a total of four partitions. Phylogenetic analysis was
performed using the parallel-computing version of the computer software MrBayes v3.2 (Ronquist et al., 2012), with the General Time Reversible model, assuming a
Γ distribution of rates, and estimating the proportion of invariant sites (GTR + Γ + I) for all partitions,
as recommended by the computer software PartitionFinder2 (Lanfear, Frandsen, Wright, Senfeld, & Calcott, 2017) with the Akaike information criterion
and restricting to models supported by MrBayes. The analysis was run for 5 million generations,
sampled every 1,000 generations, with the first 25% of samples discarded as burn-in, and convergence
was checked with Tracer v1.4 (Rambaut & Drummond, 2007). The alignment, with MrBayes data block with data partition scheme and model, is available
in Nexus format from the authors.
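As a rough illustration of the four-partition scheme (non-coding sites plus the three codon positions of protein coding genes), the sketch below assigns alignment columns to partitions from a list of gene coordinates. The function name, the coordinate format and the toy genes are assumptions made for this example; overlapping genes are not treated specially (a later gene simply overrides an earlier one), and the authors' actual partition definitions are those in their Nexus file.

```python
def assign_partitions(length, coding_genes):
    """Map partition name -> list of 1-based alignment columns.

    coding_genes: iterable of (start, end, strand) with 1-based inclusive
    coordinates; strand is +1 or -1.
    """
    partitions = {"noncoding": [], "codon1": [], "codon2": [], "codon3": []}
    codon_of = {}
    for start, end, strand in coding_genes:
        for col in range(start, end + 1):
            # Offset from the gene's 5' end, which is `end` on the minus strand.
            offset = col - start if strand > 0 else end - col
            codon_of[col] = offset % 3 + 1
    for col in range(1, length + 1):
        key = f"codon{codon_of[col]}" if col in codon_of else "noncoding"
        partitions[key].append(col)
    return partitions


# Hypothetical 20-column alignment: a forward gene (columns 3-11)
# and a reverse-strand gene (columns 14-19).
parts = assign_partitions(20, [(3, 11, +1), (14, 19, -1)])
for name, cols in parts.items():
    print(name, cols)
```

Counting codon positions from the 5' end of each gene matters for genes encoded on the opposite strand, where the first codon position is at the highest coordinate.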
Two alternative data partitioning schemes were tried: (1) no partitioning, and (2) partitioning the
dataset by gene, with protein coding genes further partitioned by codon position, resulting in a total of
37 initial partitions, then analysing with the data partitioning (22 partitions; PartitionFinder2 merges
partitions it determines to have similar characteristics) and model for each partition recommended by
PartitionFinder2. All analyses produced trees with differing branch lengths but identical topology and
posterior probability (Bayesian posterior probability = 1 for all nodes in all trees).
3. Results
3.1. The mitochondrial genome of Betadevario ramachandrani
The genome sequence of B. ramachandrani is 16,932 bp and comprises 13 protein coding genes, 22
tRNA genes, 2 rRNA genes and a D-loop region (control region). The control region is 1,322 bp, from
position 15,611 to 16,932, and contains 2 repetitious regions. The first, from position 15,633 to
16,034, consists of one 69 bp tandem repeat and one 21 bp tandem repeat; the number of
repetitions is uncertain, as it varied between 3 and 4 for the 69 bp repeat, and 4 and 6 for the
21 bp repeat, depending on assembly parameters. The second repetitious region, from position
16,799 to 16,828, is a dinucleotide microsatellite AT repeat. All protein coding genes start with ATG
(Met), and end with TAA as stop codon, except ND2, ATP8, ND3, ND4 and ND6 which end with TAG.
The 12S (small) ribosomal RNA gene is 956 bp, and the 16S (large) ribosomal RNA gene is 1,668 bp.
The mitogenome has a base composition of A (32.5%), C (25.3%), G (15.5%) and T (26.7%).
The complete annotated mitochondrial genome, with GenBank accession number MH817023, is
illustrated in Figure 1.
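Base composition figures of the kind reported above can be recomputed from any sequence with a few lines of code. The sketch below is a generic illustration: the function name and the short test sequence are invented for the example, and ambiguous bases are excluded from the total as an assumption.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G and T among unambiguous bases."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {b: round(100 * counts[b] / total, 1) for b in "ACGT"}


# Hypothetical 15-base fragment, not taken from the B. ramachandrani genome.
print(base_composition("ATGGCATTAACCCTA"))
```

Applied to the reported values, A + T = 32.5% + 26.7% = 59.2%, so the mitogenome is moderately AT-rich.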
3.2. Phylogenetic analysis
The result of the phylogenetic analysis is summarized in Figure 2. Esomus metallicus is the most
basal species of Danionina. Danionella is monophyletic, and the sister group of all remaining
Danionina. Danio is monophyletic, and the sister group of a clade comprising Microdevario nanus,
Betadevario ramachandrani, Microrasbora rubescens and Devario devario. Betadevario ramachandrani
is the sister taxon of M. rubescens + D. devario.
3.3. Chimeric mitogenomes
To test for chimeric sequences, individual genes from the mitochondrial genomes downloaded from
GenBank were BLAST-searched against the nr database of GenBank (17 May 2018). To minimize the
risk of false positives, only genes longer than 600 bp (12S, 16S, ND1, ND2, COX1, COX2, ATP6, COX3,
ND4, ND5, CYTB and D-loop region) were BLAST-searched. Seven genomes were found to be chimeric:
NC_026122, NC_028526, NC_029771, KP407138, AB937094, NC_027688 and NC_015528. Below is a
list of the sequences and genes which appear to be at least partly chimeric, with GenBank accession
number of the top BLAST hit. Genes which appear to be from the target organism are not listed. BLAST
search hits from other suspected chimeric sequences are not reported.
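The screening logic described above can be sketched as a simple filter over per-gene top BLAST hits. The sketch assumes the BLAST searches have already been run and summarized into records; the `TopHit` record, the function name, the gene lengths and the example values are hypothetical and serve only to show the decision rule (skip short genes, flag genes whose best hit lies outside the expected genera).

```python
from dataclasses import dataclass


@dataclass
class TopHit:
    gene: str        # gene name within the query mitogenome
    length: int      # gene length in bp
    hit_genus: str   # genus of the top BLAST hit
    identity: float  # percent identity of the top hit


def flag_suspect_genes(hits, expected_genera, min_length=600):
    """Return names of genes whose top hit falls outside the expected genera."""
    suspects = []
    for hit in hits:
        if hit.length <= min_length:
            continue  # short genes are skipped to limit false positives
        if hit.hit_genus not in expected_genera:
            suspects.append(hit.gene)
    return suspects


# Hypothetical summary table for one downloaded genome labelled as a Devario species.
example_hits = [
    TopHit("COX1", 1551, "Schizothorax", 100.0),
    TopHit("CYTB", 1141, "Devario", 99.0),
    TopHit("ATP8", 165, "Schizothorax", 97.0),
]
print(flag_suspect_genes(example_hits, expected_genera={"Devario"}))
```

A flagged gene is only a candidate: the hit still needs manual inspection, because a top hit outside the expected genus can also reflect sparse sampling of close relatives in the database.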
NC_026122 Devario laoensis: COX1, COX2, ND2, ND4 and ATP6 genes are from a barbine cyprinid of
the genus Schizothorax. COX1 = 100% similar to HQ235962 S. malacanthus; COX2 = 98% similar to
KT833107 S. oconnori; ND2 = 91% similar to KC51374 S. waltoni; ND4 = 89% similar to HQ235822
S. malacanthus (note: none of the top 100 search hits were danionines); ATP6 = 95% similar to
KT833107 S. oconnori. The CYTB and 16S rRNA genes are from two other species of Devario
(16S = 99% similar to GQ406286 D. apogon; CYTB = 99% similar to EU241433 D. chrysotaeniatus). Note,
however, that D. laoensis does not occur in China (Fang, 2000), and the source specimen for NC_026122
is likely a misidentified D. chrysotaeniatus.
NC_028526 Danio margaritatus: ND1, ND2, COX1, COX2, ATP6, ND4 and ND6 are from the
labeonine cyprinid Discogobio tetrabarbatus (BLAST similarity to KJ669372 91%, 95%, 90%, 95%,
94%, 90% and 99%, respectively), whereas ND5 and probably also the control region are from the
squaliobarbine cyprinid Ctenopharyngodon idella (ND5 = 95% similar to KM401549 C. idella ×
Elopichthys bambusa; control region = 82% similar to KT894100 C. idella; no danionines among
the 100 most similar sequences).
NC_029771 Danio albolineatus: COX1, COX2, ND1, ND2, ND4, ND5 and control region are from
a squaliobarbine cyprinid, probably C. idella (respectively, 91%, 91%, 84%, 88%, 88%, 91% and
82% similar to KM401549 C. idella × E. bambusa or KT894100 C. idella).
KP407138 Devario chrysotaeniatus: ND1, ND2, COX1, ATP6, COX3, ND4, ND5 and CYTB are
99–100% similar to MG570437 C. idella. COX2 and ATP6 are 95% similar to MG570437 C. idella.
Aligning MG570437 to KP407138 reveals that the two sequences are nearly identical, differing
mainly in one 1998 bp region from the 3ʹ half of 12S to near the 3ʹ end of 16S, and one 2061 bp
region from the 3ʹ end of COX1 to the 3ʹ end of ATP6. Submitting these two regions to a BLAST
search reveals that the first is 97% similar to Devario laoensis (KP115291), whereas the second is 99%
similar to the cyprinine cyprinid Procypris mera (KM461699).
Figure 1. Annotated schematic representation of the mitochondrial genome of Betadevario ramachandrani. GenBank accession number
AB937094 Microdevario kubotai: CYTB, control region, and 12S are 94%, 99% and 94% similar
to AB924546 from the rasborine cyprinid Rasbora borapetensis, respectively.
NC_027688 Danio nigrofasciatus: ND4, ND5, ND6 and CYTB are 100% similar to KT624623
Danio rerio.
NC_015528 Leptobarbus hoevenii: leptobarbine cyprinid downloaded for use as outgroup.
12S = 94% similar to KJ679504 from the xenocypridine cyprinid Hypophthalmichthys nobilis;
16S = 94% similar to KT894100 from squaliobarbine cyprinid C. idella.
4. Discussion
4.1. Phylogenetic position of B. ramachandrani
The result of the phylogenetic analysis is summarized in Figure 2. Our result is, despite different taxon
sampling and an order of magnitude more molecular data, compatible with the conclusions of Pramod
et al. (2010), which were based on mitochondrial cytochrome b and nuclear rhodopsin data, and
morphology. Both studies support the view that B. ramachandrani is the most basal member of a clade
containing Devario + Microrasbora, and that (Microdevario + Betadevario + Devario + Microrasbora) are
the sister group of Danio.
Figure 2. Phylogram of all species of subtribe Danionina represented on GenBank on 17 May 2018, based on Bayesian analysis of nearly complete mitochondrial genome sequences, with Sundadanio rubellus as outgroup. All nodes have Bayesian posterior probability = 1. Terminal labels end with GenBank accession number. Branch lengths are proportional to number of expected substitutions per site.
Figure 3. Two example chimeric genomes aligned to a genome of the main contaminant (Ctenopharyngodon idella). The numbers at top indicate position in the aligned
genomes. The top bar indicates identity: if both sequences are identical, the bar is tall and bright green. As the aligned sequences come from distantly related taxa, they
should not have significant identical regions, so bright green effectively indicates contamination. The grey bars are the aligned sequences, with dissimilar bases highlighted
in black. The bottom bar indicates gene location, with arrows pointing in read direction. (A): KP407138 (Devario chrysotaeniatus) is nearly identical to C. idella, with the
exception of one region spanning parts of the COI, COII and ATP6 genes, and one region spanning parts of the 12S and 16S rRNA genes; the first region is 99% similar to
Procypris mera, a contamination, and the second is 97% similar to Devario laoensis, and possibly correct. (B): Contamination from C. idella in NC_028526 (Danio
margaritatus) is evident in the ND5 gene, and in short regions flanking protein coding genes.
4.2. Chimeric mitochondrial genomes
Our initial phylogenetic analysis produced surprising results, with three danionine sequences
grouping distantly from all other Danioninae: NC_029771 Danio albolineatus and KP407138
Devario chrysotaeniatus were recovered as members of Xenocypridinae, while NC_028526
Danio margaritatus was recovered as a member of Labeoninae. These species are morphologically
highly distinct, and misidentification seemed unlikely. A BLAST search of the individual genes
of the 20 mitogenomes downloaded from GenBank revealed that 7 mitogenomes are partly
chimeric. Six of the chimeric genomes were produced in China, by four different institutions, and
one in Japan, so it is not an issue restricted to one institution. At least three of the chimeric
mitogenomes have been used by other studies, and five are curated GenBank reference genomes
(RefSeq sequences). We have notified GenBank of our findings. There is reason to think that the
problem is not restricted to the subfamily Danioninae, as the contaminant sequences frequently
are from species of different subfamilies of the Cyprinidae.
Analysis suggests the contamination may be of two different types (Figure 3). Some contaminated
regions, often spanning several thousand bp, appear to result from the person assembling the genome
failing to notice that some reads are from a non-target organism. Others consist of short regions
flanking protein coding genes; we speculate that these regions correspond to primer binding
positions with low or zero read coverage, and that the reference genome used to assemble
the reads was not removed before calculating the consensus sequence.
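Both kinds of contamination show up as stretches of unexpectedly high identity when a suspect genome is aligned to a distantly related one. A minimal sketch of such a scan is given below; it assumes the two sequences are already aligned to the same length, and the function names, window size, step and 0.99 cutoff are illustrative assumptions that do not describe the procedure used to produce Figure 3.

```python
def window_identity(seq_a, seq_b, window=100, step=50):
    """Yield (start, identity) for sliding windows over two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    for start in range(0, len(seq_a) - window + 1, step):
        pairs = [
            (a, b)
            for a, b in zip(seq_a[start:start + window], seq_b[start:start + window])
            if a != "-" and b != "-"  # ignore alignment gaps
        ]
        if not pairs:
            continue
        matches = sum(a == b for a, b in pairs)
        yield start, matches / len(pairs)


def suspicious_windows(seq_a, seq_b, cutoff=0.99, **kwargs):
    """Return 0-based start positions of windows with identity at or above the cutoff."""
    return [s for s, ident in window_identity(seq_a, seq_b, **kwargs) if ident >= cutoff]


# Hypothetical toy alignment: identical in the first half, different in the second.
query = "A" * 20 + "C" * 20
other = "A" * 20 + "G" * 20
print(suspicious_windows(query, other, window=10, step=10))
```

Between distantly related taxa, long runs of near-identical windows are not expected, so the flagged coordinates indicate where a gene-by-gene BLAST check is most worthwhile.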
Performing a BLAST search of GenBank's nr database (11 September 2018) with one of
these short flanking regions (a 211 bp fragment corresponding to the tRNA-Ile, tRNA-Gln and
tRNA-Met genes, from position 3,843 to 4,053 in the NC_029771 genome) found genomes
from seven different subfamilies of Cyprinidae with corresponding regions which were 98–100
percent similar. Randomly selecting and investigating one of the suspected chimeric genomes,
KJ801524, ostensibly from Chanodichthys erythropterus (in GenBank as Culter erythropterus),
revealed that its 12S and 16S rRNA genes are 98% similar to the corresponding genes
of KJ756343 Hypophthalmichthys nobilis, and almost certainly chimeric.
That a third of the genomes downloaded for this study were chimeric suggests that the
problem may be widespread, and sequence identity an issue for any study which uses data
from downloaded whole mitochondrial genomes. We urge researchers to test downloaded
mitogenomes by doing a GenBank BLAST search for each gene prior to use, to report erroneous
sequences they detect to GenBank, and to check their own sequences before submitting new
genomes to GenBank.
Acknowledgements
Te Yu Liao (National Sun Yat-sen University, Taiwan) made
the extraction used in this study, and Mazen Sarhan
(Macrogen Europe, Netherlands) provided help with
high-throughput sequencing.
Funding
The authors received no direct funding for this research.
Competing Interest
The authors report no conflicts of interest.
Author details
Michael Norén
Sven Kullander
Department of Zoology, Swedish Museum of Natural
History, Stockholm, Sweden.
Citation information
Cite this article as: The enigmatic Betadevario ramachandrani (Teleostei: Cyprinidae: Danioninae): phylogenetic position resolved by mitogenome analysis, with remarks on the prevalence of chimeric mitogenomes in GenBank, Michael Norén & Sven Kullander, Cogent Biology (2018), 4:
References
Fang, F. (2000). A review of Chinese Danio species. Acta Zootaxonomica Sinica, 25, 214–227.
Katoh, K., Misawa, K., Kuma, K. I., & Miyata, T. (2002). MAFFT: A novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research, 30(14), 3059–3066.
Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock, S., . . . Drummond, A. (2012). Geneious basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics, 28(12), 1647–1649.
Lanfear, R., Frandsen, P. B., Wright, A. M., Senfeld, T., & Calcott, B. (2017). PartitionFinder 2: New methods for selecting partitioned models of evolution for molecular and morphological phylogenetic analyses. Molecular Biology and Evolution, 34(3), 772–773.
Liao, T. Y., Kullander, S. O., & Fang, F. (2011). Phylogenetic position of rasborin cyprinids and monophyly of major lineages among the Danioninae, based on morphological characters (Cypriniformes: Cyprinidae). Journal of Zoological Systematics and Evolutionary Research, 49(3), 224–232.
Nelson, J. S., Grande, T., & Wilson, M. V. H. (2016). Fishes of the world (5th ed.). Hoboken: John Wiley & Sons.
Pramod, P. K., Fang, F., Devi, K. R., Liao, T. Y., Indra, T. J., Beevi, K. S., & Kullander, S. O. (2010). Betadevario ramachandrani, a new danionine genus and species from the Western Ghats of India (Teleostei: Cyprinidae: Danioninae). Zootaxa, 2519(1), 31–47.
Rambaut, A., & Drummond, A. J. (2007). Tracer version 1.4. Computer software and documentation distributed by the author.
Ronquist, F., Teslenko, M., Van Der Mark, P., Ayres, D. L., Darling, A., Höhna, S., . . . Huelsenbeck, J. P. (2012). MrBayes 3.2: Efficient Bayesian phylogenetic inference and model choice across a large model space. Systematic Biology, 61(3), 539–542.
Stout, C. C., Tan, M., Lemmon, A. R., Lemmon, E. M., & Armbruster, J. W. (2016). Resolving Cypriniformes relationships using an anchored enrichment approach. BMC Evolutionary Biology, 16(1), 244.
Van Der Laan, R., Eschmeyer, W. N., & Fricke, R. (2014). Family-group names of recent fishes. Zootaxa, 3882(1), 1–230.
Cogent Biology (ISSN: 2331-2025) is published by Cogent OA, part of Taylor & Francis Group.
Publishing with Cogent OA ensures:
Immediate, universal access to your article on publication
High visibility and discoverability via the Cogent OA website as well as Taylor & Francis Online
Download and citation statistics for your article
Rapid online publication
Input from, and dialog with, expert editors and editorial boards
Retention of full copyright of your article
Guaranteed legacy preservation of your article
Discounts and waivers for authors in developing regions
Submit your manuscript to a Cogent OA journal at
Norén & Kullander, Cogent Biology (2018), 4: 1525857
Page 8 of 8
... The quality check process was conducted on FastQC 0.11.8 (Andrews 2010), the main references are as follows conditions: (1) Sequences containing more than 3 N bases were eliminated; (2) High-quality bases (Phred score !20) accounting for less than 60% of sequences were removed; (3) Excluding 3 0 end low-quality bases; (4) Sequences less than 60 bp in length were discarded. The sequences were assembled into contigs using metaSPAdes 3.13 (Nurk et al. 2017) with default parameters, and Betadevario ramachandrani MH817023 was take as reference (Nor en & Kullander 2018), and then the resulting D. interruptus draft mitogenome assembly was further analyzed by comparison to the mitogenome of B. ramachandrani (Nor en & Kullander 2018) both to confirm correct direction assembly of the contigs and identity the starting base position of the D. interruptus mitogenome. ...
... To clarify the phylogenetic position of D. interruptus in subfamily Danioninae, the molecular phylogenetic tree was conducted by the Maximum-likelihood (ML) method based on 13 PCGs of D. interruptus and other 18 published Danionidae species from 8 genera (Devario, Microrasbora, Betadevario, Microdevario, Danio, Barilius, Opsarius, Danionella) (Broughton et al. 2001;Tang et al. 2010;Lavou e et al. 2012;Huang et al. 2016;Hirt et al. 2017;Nor en & Kullander 2018;Kundu et al. 2019;Chen et al. 2022;Song et al. 2022), and mtREV24 þ G þ F (the lowest Bayesian information standard score) was selected as the optimal evolutionary model. The result of phylogenetic analysis was summarized in Figure 3. Six species of the genus Danio were monophyletic, and a sister group to a clade that included D. interruptus, D. kakhienensis, D. devario, M. rubescens, B. ramachandrani, D. laoensis, M. kubotai, and M. nana, and the above 15 species formed a sister group with the genus Danionella. ...
Full-text available
The complete mitochondrial DNA genome of Devario interruptus was sequenced on the Illumina HiSeq platform and found to be 16,735 bp and included 37 genes encoding 13 proteins, 22 tRNAs, two rRNAs, and two non-coding regions. The proportion of nucleotides in mitochondrial genome was T (27.9%), C (23.7%), A (33%), G (15.4%), and the deviation of AT was 60.9%. A Maximum-Likelihood phylogenetic tree was reconstructed using the concatenated mitochondrial protein-coding genes of D. interruptus and other 18 species of fishes. Phylogenetic analysis results supported that D. interruptus was closely related to Devario shanensis. Fundamental genetic data of D. interruptus will be essential for further genetic studies.
... Much more significant characters were mentioned to diagnose Oliotius, for example the very large scales (hence, very low counts), the rows of papillae on the head, and the unique colour pattern. This raises questions about the identification of the material whose sequences were fished on Genbank (Yang et al., 2010;Ren et al., 2020), possibly without due attention to voucher-identification or provenance, a recurrent issue and cause of errors (Norén & Kullander, 2018). ...
Full-text available
Barbodes sellifer, new species, is described from Singapore, the southern Malay Peninsula and Riau (Sumatra). It is distinguished by having, among others, a large triangular to rectangular blotch between the dorsal fin and the midlateral row of scales (+1). Barbodes zakariaismaili, new species, is described from the Jelai watershed of the Pahang drainage. It is distinguished, among others, by having an elongated blotch on the anterior third of scale rows 0 and +1, and a narrow, faint bar between dorsal-fin origin and scale row +1. The existence of the supposed B. binotatus cryptic species is discussed; it does not satisfy any of the criteria under different concepts and this terminology should not be used. Among others, it is made of diagnosable units, and the morphological disparity among the supposed 'cryptic' taxa is not substantially lower than among non-'cryptic' relatives. It is simply a taxonomically difficult group.
Full-text available
Complete mitochondrial genome and phylogenetic analysis of Devario kakhienensis, endemic minnow from southwest China, for the first time was presented. It was determined to be a 16,777 bp long circular molecule and the genome organization was consistent with that of Danioninae species published previously. Based on PCGs, the maximum likelihood phylogenetic analysis supported the close genetic relationship between D. kakhienensis and Devario interruptus. These data would contribute toward the genetic resource enrichment, and provide a valuable framework for future research in completely resolving phylogenetic relationships with the family Danionidae.
Full-text available
Authentic DNA sequences are crucial for reliable evolutionary inference. Concerns about the identification of DNA sequences have been voiced several times in the past but few quantitative studies exist. Mitogenomes play important roles in phylogenetics, phylogeography, population genetics and DNA identification. However, the large number of mitogenomes being published routinely, often in brief data papers, has raised questions about their authenticity. In this study, we quantify problematic mitogenomes of birds and their re-usage in other papers. Of 1876 complete or partial mitogenomes of birds published until 1 January 2020, the authenticity of 1559 could be assessed with sequences of conspecifics. Of these, 78 (5.0%) were found to be problematic, including 45 curated reference sequences. Problems were due to misidentification (33), chimeras of two or three species (23), sequencing errors/numts (18), incorrect sequence assembly (1), mislabelling at GenBank but not in the final paper (2), or vice versa (1). The number of problematic mitogenomes has increased sharply since 2012. Worryingly, these problematic sequences have been re-used 436 times in other papers, including 385 times in phylogenies. No less than 53% of all mitogenomic phylogenies/networks published until 1 January 2020 included at least one problematic mitogenome. Problematic mitogenomes have resulted in incorrect phylogenetic hypotheses and proposals for unwarranted taxonomic revision, and may have compromised comparative analyses and measurements of divergence times. Our results indicate that a major upgrade of quality control measures is warranted. We propose a comprehensive set of measures that may serve as a new standard for publishing mitogenome sequences.
Full-text available
PartitionFinder 2 is a program for automatically selecting best-fit partitioning schemes and models of evolution for phylogenetic analyses. PartitionFinder 2 is substantially faster and more efficient than version 1, and incorporates many new methods and features. These include the ability to analyze morphological datasets, new methods to analyze genome-scale datasets, new output formats to facilitate interoperability with downstream software, and many new models of molecular evolution. PartitionFinder 2 is freely available under an open source license and works on Windows, OSX, and Linux operating systems. It can be downloaded from The source code is available at
Full-text available
Background Cypriniformes (minnows, carps, loaches, and suckers) is the largest group of freshwater fishes in the world (~4300 described species). Despite much attention, previous attempts to elucidate relationships using molecular and morphological characters have been incongruent. In this study we present the first phylogenomic analysis using anchored hybrid enrichment for 172 taxa to represent the order (plus three out-group taxa), which is the largest dataset for the order to date (219 loci, 315,288 bp, average locus length of 1011 bp). Results Concatenation analysis establishes a robust tree with 97 % of nodes at 100 % bootstrap support. Species tree analysis was highly congruent with the concatenation analysis with only two major differences: monophyly of Cobitoidei and placement of Danionidae. Conclusions Most major clades obtained in prior molecular studies were validated as monophyletic, and we provide robust resolution for the relationships among these clades for the first time. These relationships can be used as a framework for addressing a variety of evolutionary questions (e.g. phylogeography, polyploidization, diversification, trait evolution, comparative genomics) for which Cypriniformes is ideally suited. Electronic supplementary material The online version of this article (doi:10.1186/s12862-016-0819-5) contains supplementary material, which is available to authorized users.
Please visit the authors' website for this book: Fishes of the World, Fifth Edition is the only modern, phylogenetically based classification of the world’s fishes. The updated text offers new phylogenetic diagrams that clarify the relationships among fish groups, as well as cutting-edge global knowledge that brings this classic reference up to date. With this resource, you can classify orders, families, and genera of fishes, understand the connections among fish groups, organize fishes in their evolutionary context, and imagine new areas of research. To further assist your work, this text provides representative drawings, many of them new, for most families of fishes, allowing you to make visual connections to the information as you read. It also contains many references to the classical as well as the most up-to-date literature on fish relationships, based on both morphology and molecular biology. The study of fishes is one that certainly requires dedication and access to reliable, accurate information. With more than 30,000 known species of sharks, rays, and bony fishes, both lobe-finned and ray-finned, you will need to master your area of study with the assistance of the best reference materials available. This text will help you bring your knowledge of fishes to the next level.
 - Explore the anatomical characteristics, distribution, common and scientific names, and phylogenetic relationships of fishes
 - Access biological and anatomical information on more than 515 families of living fishes
 - Better appreciate the complexities and controversies behind the modern view of fish relationships
 - Refer to an extensive bibliography, which points you in the direction of additional, valuable, and up-to-date information, much of it published within the last few years.
711 pages, Index, Bibliography
The family-group names of animals (superfamily, family, subfamily, supertribe, tribe and subtribe) are regulated by the International Code of Zoological Nomenclature. Family names are particularly important because they are among the most widely used of all technical animal names. Apart from using the correct family-group name according to the Code, it is also important to use one unique universal name (with a fixed spelling) to avoid confusion. We have compiled a list of family-group names for Recent fishes, applied the rules of the Code and, if possible, tried to conserve the names in prevailing recent practice. We list all of the family-group names found to date for Recent fishes (N=2625), together with their author(s) and year of publication. This list can be used in assigning the correct family-group name to a genus or a group of genera. With this publication we contribute to the usage of correct, universal family-group names in the classification of, and for communication about, Recent fishes.
Summary: The two main functions of bioinformatics are the organization and analysis of biological data using computational resources. Geneious Basic has been designed to be an easy-to-use and flexible desktop software application framework for the organization and analysis of biological data, with a focus on molecular sequences and related data types. It integrates numerous industry-standard discovery analysis tools, with interactive visualizations to generate publication-ready images. One key contribution to researchers in the life sciences is the Geneious public application programming interface (API) that affords the ability to leverage the existing framework of the Geneious Basic software platform for virtually unlimited extension and customization. The result is an increase in the speed and quality of development of computation tools for the life sciences, due to the functionality and graphical user interface available to the developer through the public API. Geneious Basic represents an ideal platform for the bioinformatics community to leverage existing components and to integrate their own specific requirements for the discovery, analysis and visualization of biological data. Availability and implementation: Binaries and public API freely available for download at, implemented in Java and supported on Linux, Apple OSX and MS Windows. The software is also available from the Bio-Linux package repository at
Since its introduction in 2001, MrBayes has grown in popularity as a software package for Bayesian phylogenetic inference using Markov chain Monte Carlo (MCMC) methods. With this note, we announce the release of version 3.2, a major upgrade to the latest official release presented in 2003. The new version provides convergence diagnostics and allows multiple analyses to be run in parallel with convergence progress monitored on the fly. The introduction of new proposals and automatic optimization of tuning parameters has improved convergence for many problems. The new version also sports significantly faster likelihood calculations through streaming single-instruction-multiple-data extensions (SSE) and support of the BEAGLE library, allowing likelihood calculations to be delegated to graphics processing units (GPUs) on compatible hardware. Speedup factors range from around 2 with SSE code to more than 50 with BEAGLE for codon problems. Checkpointing across all models allows long runs to be completed even when an analysis is prematurely terminated. New models include relaxed clocks, dating, model averaging across time-reversible substitution models, and support for hard, negative, and partial (backbone) tree constraints. Inference of species trees from gene trees is supported by full incorporation of the Bayesian estimation of species trees (BEST) algorithms. Marginal model likelihoods for Bayes factor tests can be estimated accurately across the entire model space using the stepping stone method. The new version provides more output options than previously, including samples of ancestral states, site rates, site d(N)/d(S) ratios, branch rates, and node dates. A wide range of statistics on tree parameters can also be output for visualization in FigTree and compatible software.
Betadevario, new genus, with the single species B. ramachandrani, new species, from Karnataka, southwestern India, is closely related to Devario but differs from it in having two pairs of long barbels (vs. two pairs of short or rudimentary barbels, or barbels absent), wider cleithral spot which extends to cover three scales horizontally (vs. covering only one scale in width), long and low laminar preorbital process (vs. absent or a slender pointed spine-like process) along the anterior margin of the orbit, a unique flank colour pattern with a wide dark band along the lower side, bordered dorsally by a wide light stripe (vs. vertical bars, or stripes narrow and usually in greater number).
A multiple sequence alignment program, MAFFT, has been developed. The CPU time is drastically reduced as compared with existing methods. MAFFT includes two novel techniques. (i) Homologous regions are rapidly identified by the fast Fourier transform (FFT), in which an amino acid sequence is converted to a sequence composed of volume and polarity values of each amino acid residue. (ii) We propose a simplified scoring system that performs well for reducing CPU time and increasing the accuracy of alignments even for sequences having large insertions or extensions as well as distantly related sequences of similar length. Two different heuristics, the progressive method (FFT-NS-2) and the iterative refinement method (FFT-NS-i), are implemented in MAFFT. The performances of FFT-NS-2 and FFT-NS-i were compared with other methods by computer simulations and benchmark tests; the CPU time of FFT-NS-2 is drastically reduced as compared with CLUSTALW with comparable accuracy. FFT-NS-i is over 100 times faster than T-COFFEE, when the number of input sequences exceeds 60, without sacrificing the accuracy.
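The FFT idea in point (i) of the MAFFT abstract can be illustrated with a small sketch. This is not MAFFT's code: the property table `TOY_PROPERTIES` holds made-up placeholder numbers rather than the normalized volume and polarity values MAFFT uses, and the function names `fft`, `correlation_by_lag` and `best_lag` are hypothetical. The sketch only finds the offset (lag) at which two sequences correlate most strongly; the later step of extracting homologous segments around that lag is not shown.

```python
import cmath

# Placeholder (volume, polarity) pairs for a few residues. These numbers are
# invented for illustration and are NOT the values used by MAFFT.
TOY_PROPERTIES = {
    "A": (-1.0, 0.0), "G": (-1.5, 0.2), "L": (0.8, -1.2), "K": (1.0, 1.1),
    "D": (-0.5, 1.5), "F": (1.4, -1.1), "S": (-1.0, 0.4), "W": (2.0, -1.0),
}


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def correlation_by_lag(seq_a, seq_b, properties=TOY_PROPERTIES):
    """Return c where c[k] = sum over n of a[n] . b[n + k], summed over both properties.

    Negative lags k are stored at index len(c) + k (circular layout).
    """
    size = 1
    while size < len(seq_a) + len(seq_b):  # zero-pad so the circular result is linear
        size *= 2
    total = [0.0] * size
    for component in range(2):  # 0 = volume, 1 = polarity
        x = [complex(properties[r][component]) for r in seq_a] + [0j] * (size - len(seq_a))
        y = [complex(properties[r][component]) for r in seq_b] + [0j] * (size - len(seq_b))
        fx, fy = fft(x), fft(y)
        # Cross-correlation theorem: correlation = IFFT(conj(FFT(x)) * FFT(y)).
        product = [p.conjugate() * q for p, q in zip(fx, fy)]
        corr = fft(product, inverse=True)
        for k in range(size):
            total[k] += corr[k].real / size  # apply the 1/size inverse-FFT scaling
    return total


def best_lag(seq_a, seq_b, properties=TOY_PROPERTIES):
    """Signed lag k such that seq_a[n] best matches seq_b[n + k]."""
    total = correlation_by_lag(seq_a, seq_b, properties)
    idx = max(range(len(total)), key=total.__getitem__)
    return idx if idx < len(seq_b) else idx - len(total)
```

With these toy values, `best_lag("KDFW", "AGSKDFWAG")` picks out the offset at which the shorter sequence sits inside the longer one. Computing the correlation for all lags through the FFT costs O(N log N) instead of the O(N²) of a direct sliding comparison, which is the source of the speed-up the abstract describes.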
The cyprinid subfamily Danioninae is one of the most important fish groups due to its inclusion of the model fish, Danio rerio. Molecular investigations have shown that species traditionally placed in the Danioninae are non-monophyletic, divided into two groups corresponding to the Danioninae and Opsariichthyinae. The Danioninae are further divided into three lineages, i.e. chedrins, danionins and rasborins. However, morphological characters determining the foregoing groups are unknown. To investigate the interrelationships among major lineages within the Danioninae, a phylogenetic analysis based on 43 morphological characters from 34 taxa was conducted. Parsimony analysis recovers the Danioninae and Opsariichthyinae to be distinguished by the Y-shaped ligament, absent in the Danioninae while present in the Opsariichthyinae. The Danioninae are divided into two tribes, Danionini and Rasborini. The Rasborini, including Boraras, Brevibora, Horadandia, Kottelatia, Rasbora, Rasboroides, Rasbosoma, Trigonopoma and Trigonostigma, are diagnosed by presence of dark supra-anal pigment and subpeduncular streak as well as presence of the rasborin process on epibranchial 4. The Danionini are composed of two subtribes, Danionina and Chedrina, the Danionina including Chela, Danio, Devario, Microdevario and Microrasbora, and the Chedrina comprising Chelaethiops, Esomus, Luciosoma, Megarasbora, Mesobola, Nematabramis, Opsarius, Raiamas and Salmophasia. The Danionina are diagnosed by the unossified interhyal and presence of the danionin foramen in the horizontal limb of the cleithrum while the Chedrina are characterized by the postcleithrum absent or greatly reduced and approximately normal to abdominal ribs when present.