The unholy trinity: taxonomy, species delimitation
and DNA barcoding
Rob DeSalle*, Mary G. Egan and Mark Siddall
Division of Invertebrate Zoology, American Museum of Natural History, 79th Street at Central Park West,
New York, NY 10024, USA
Recent excitement over the development of an initiative to generate DNA sequences for all named
species on the planet has, in our opinion, generated two major areas of contention as to how this ‘DNA
barcoding’ initiative should proceed. It is critical that these two issues are clarified and resolved
before the use of DNA as a tool for taxonomy and species delimitation can be universalized. The first
issue concerns how DNA data are to be used in the context of this initiative; this is the DNA barcode
reader problem (or barcoder problem). Currently, many of the published studies under this initiative
have used tree-building methods, and more precisely distance approaches, to construct the
trees that are used to place certain DNA sequences into a taxonomic context. The second problem
involves the reaction of the taxonomic community to the directives of the ‘DNA barcoding’ initiative.
This issue is extremely important in that the classical taxonomic approach and the DNA approach
will need to be reconciled in order for the ‘DNA barcoding’ initiative to proceed with any kind of
community acceptance. In fact, we feel that DNA barcoding is a misnomer. Our preference is for the
title of the London meetings—Barcoding Life. In this paper we discuss these two concerns generated
around the DNA barcoding initiative and attempt to present a phylogenetic systematic framework for
an improved barcoder as well as a taxonomic framework for interweaving classical taxonomy with the
goals of ‘DNA barcoding’.
Keywords: DNA barcoding; taxonomy; species delimitation; muntjac; leeches; sturgeon
1. INTRODUCTION: BUILDING A BETTER DNA
BARCODER
One major issue concerning the inclusion of
molecular information in the taxonomic aspects of
biology has yet to be discussed in detail in the
commentaries on this subject: the best
way to read the barcodes. There are two separate tasks
to which DNA barcodes are currently being applied.
The first is the use of DNA data to distinguish between
species (equivalent to species identification or species
diagnosis) and the second is the use of DNA data to
discover new species (equivalent to species delimita-
tion, species description). These two activities differ in
the types and amount of data required. Below we
highlight some of the issues that may limit the utility of
current DNA barcoding endeavours (especially those
used for species discovery) and suggest a framework for
the development of a barcoder that addresses these
issues.
(a) The barcoder engine: distances or characters?
A major issue that needs to be resolved is how to read
the organismal barcode once it is generated. Most
recently published approaches to DNA barcoding have
utilized distance measures to make the inference as to
species designation (Hebert et al. 2003a,b, 2004a,b).
Distances are used in two major approaches; the first is
a simple BLAST (Altschul et al. 1990) approach where
a raw similarity score is used to determine the nearest
neighbour to the query sequence. The second
approach utilizes distances in tree building (Hebert
et al. 2003a,b). We point out the following shortcomings
of these approaches and further suggest that
character-based approaches are more appropriate for
DNA barcoding, for both theoretical and practical
reasons.
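The contrast between the two kinds of barcode reader can be illustrated with a minimal sketch. It is not the method of any of the studies cited: the reference sequences, the species labels and the function names (`p_distance`, `nearest_neighbour`, `diagnostic_sites`, `diagnose`) are hypothetical, the sequences are assumed to be pre-aligned and of equal length, and a simple uncorrected p-distance stands in for a BLAST similarity score. A diagnostic character is taken here to be a site at which one species is fixed for a state found in no other reference species.

```python
from typing import Dict, List, Optional

# Toy, pre-aligned reference barcodes (hypothetical data for illustration only).
REFERENCES: Dict[str, List[str]] = {
    "species_A": ["ACGTACGT", "ACGTACGA"],
    "species_B": ["ACCTACGT", "ACCTACGA"],
}


def p_distance(a: str, b: str) -> float:
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def nearest_neighbour(query: str, refs: Dict[str, List[str]]) -> str:
    """Distance reader: always returns some species, however poor the match."""
    scored = [(p_distance(query, seq), sp) for sp, seqs in refs.items() for seq in seqs]
    return min(scored)[1]  # ties are broken alphabetically by species label


def diagnostic_sites(refs: Dict[str, List[str]]) -> Dict[str, Dict[int, str]]:
    """Sites where a species is fixed for a state seen in no other species."""
    length = len(next(iter(refs.values()))[0])
    diag: Dict[str, Dict[int, str]] = {sp: {} for sp in refs}
    for pos in range(length):
        states = {sp: {seq[pos] for seq in seqs} for sp, seqs in refs.items()}
        for sp, own in states.items():
            others = set().union(*(s for o, s in states.items() if o != sp))
            if len(own) == 1 and not own & others:
                diag[sp][pos] = next(iter(own))
    return diag


def diagnose(query: str, refs: Dict[str, List[str]]) -> Optional[str]:
    """Character reader: returns a species only if its diagnostics all match."""
    matches = [
        sp
        for sp, sites in diagnostic_sites(refs).items()
        if sites and all(query[pos] == base for pos, base in sites.items())
    ]
    return matches[0] if len(matches) == 1 else None


if __name__ == "__main__":
    query = "ACTTACGT"  # carries neither species' state at the diagnostic site
    print(nearest_neighbour(query, REFERENCES))  # a species is named regardless
    print(diagnose(query, REFERENCES))           # None: no diagnosis is made
```

In this toy case the query is equidistant from both species, yet the distance reader still names one of them, whereas the character reader declines to diagnose because the query matches neither species' diagnostic state.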
A major shortcoming of using distances in DNA
barcoding is that all classical studies and taxonomic
schemes that accomplish what barcodes
are meant to accomplish are character based, making
the union of classical taxonomy and DNA barcoding a difficult
process if the use of distances is continued in barcoding
studies (see below). This shortcoming is also related to
the need for diagnostic characters that classical studies
use to validate the existence of a species. A second
shortcoming is that similarity scores often do not give
the nearest neighbour as the closest relative (Koski &
Golding 2001). Nevertheless, similarity scores will
always give a nearest neighbour. Character-based
methods have the logical advantage that when diag-
nostic character data are lacking, they will fail to
diagnose, allowing for a degree of hypothesis testing not
available when using distances. A third shortcoming
involves the lack of an objective set of criteria to
delineate taxa when using distances. For example, a
universal similarity cut-off to determine species status
will simply not exist, because of the broad overlap of
inter- and intra-specific distances (Goldstein et al.
2000). Researchers will have to constantly revise their
similarity cut-offs from group to group. We suspect that
Phil. Trans. R. Soc. B (2005) 360, 1905–1916
doi:10.1098/rstb.2005.1722
Published online 14 September 2005
One contribution of 18 to a Theme Issue ‘DNA barcoding of life’.
*Author for correspondence (desalle@amnh.org).
© 2005 The Royal Society