Identifying Insects with Incomplete DNA Barcode Libraries, African Fruit Flies (Diptera: Tephritidae) as a Test Case

Royal Museum for Central Africa, Tervuren, Belgium.
PLoS ONE (Impact Factor: 3.53). 02/2012; 7(2):e31581. DOI: 10.1371/journal.pone.0031581
Source: PubMed

ABSTRACT We propose a general working strategy to deal with incomplete reference libraries in the DNA barcoding identification of species. Considering that (1) queries with a large genetic distance to their best DNA barcode match are more likely to be misidentified and (2) imposing a distance threshold profitably reduces identification errors, we modelled the relationships between identification performance and distance thresholds in four DNA barcode libraries of Diptera (n = 4270), Lepidoptera (n = 7577), Hymenoptera (n = 2067) and Tephritidae (n = 602 DNA barcodes). In all cases, more restrictive distance thresholds produced a gradual increase in the proportion of true negatives, a gradual decrease in false positives and more abrupt variations in the proportions of true positives and false negatives. More restrictive distance thresholds improved precision, yet negatively affected accuracy because a higher proportion of queries was discarded (viz. queries whose distance to their best match was above the threshold). Using a simple linear regression, we calculated an ad hoc distance threshold for the tephritid library producing an estimated relative identification error <0.05. As expected, when we used this threshold for the identification of 188 independently collected tephritids, fewer than 5% of the queries whose distance to their best match was below the threshold were misidentified. Ad hoc thresholds can be calculated for each particular reference library of DNA barcodes and should be used as a cut-off mark defining whether we can proceed to identify the query with a known estimated error probability (e.g. 5%) or whether we should discard the query and consider alternative/complementary identification methods.
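
The threshold logic summarised in the abstract can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the `Query` record, all function names and the toy data are hypothetical. The relative identification error is taken as the share of misidentified queries among those below the threshold, as the abstract describes; treating discarded queries with an incorrect best match as true negatives, and discarded queries with a correct best match as false negatives, is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Query:
    distance: float  # genetic distance between the query and its best match
    correct: bool    # True if the best match belongs to the query's species


def confusion(queries, threshold):
    """Count (TP, FP, TN, FN) for one distance threshold.

    Assumed definitions: queries at or below the threshold are accepted
    (TP if the best match is correct, FP otherwise); queries above it are
    discarded (FN if the best match was correct, TN otherwise).
    """
    tp = fp = tn = fn = 0
    for q in queries:
        accepted = q.distance <= threshold
        if accepted and q.correct:
            tp += 1
        elif accepted:
            fp += 1
        elif q.correct:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def relative_error(queries, threshold):
    """Share of accepted queries that are misidentified: FP / (TP + FP)."""
    tp, fp, _, _ = confusion(queries, threshold)
    accepted = tp + fp
    return fp / accepted if accepted else 0.0


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ad_hoc_threshold(queries, thresholds, target=0.05):
    """Regress relative error on threshold, then solve for the target error."""
    errors = [relative_error(queries, t) for t in thresholds]
    slope, intercept = fit_line(thresholds, errors)
    if slope <= 0:
        raise ValueError("error does not increase with threshold; no solution")
    return (target - intercept) / slope


# Toy data (hypothetical): nine correct matches at a small distance and
# one incorrect match at a larger distance.
toy = [Query(0.005, True)] * 9 + [Query(0.015, False)]
print(confusion(toy, 0.01))
print(ad_hoc_threshold(toy, [0.01, 0.02]))
```

In practice the regression would be fitted over many thresholds on a real reference library, and the resulting cut-off applies only to that library.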


Available from: Massimiliano Virgilio, May 28, 2015
  • Source
    ZooKeys 07/2014; DOI:10.3897/zookeys.428.7366 · 0.92 Impact Factor
  • Source
    ABSTRACT: DNA barcode reference libraries linked to voucher specimens create new opportunities for high-throughput identification and taxonomic re-evaluations. This study provides a DNA barcode library for about 45% of the recognized species of Canadian Hemiptera, and the publicly available R workflow used for its generation. The current library is based on the analysis of 20,851 specimens including 1849 species belonging to 628 genera and 64 families. These individuals were assigned to 1867 Barcode Index Numbers (BINs), sequence clusters that often coincide with species recognized through prior taxonomy. Museum collections were a key source for identified specimens, but we also employed high-throughput collection methods that generated large numbers of unidentified specimens. Many of these specimens represented novel BINs that were subsequently identified by taxonomists, adding barcode coverage for additional species. Our analyses based on both approaches include 94 species not listed in the most recent Canadian checklist, representing a potential 3% increase in the fauna. We discuss the development of our workflow in the context of prior DNA barcode library construction projects, emphasizing the importance of delineating a set of reference specimens to aid investigations in cases of nomenclatural and DNA barcode discordance. The identification for each specimen in the reference set can be annotated on the Barcode of Life Data System (BOLD), allowing experts to highlight questionable identifications; annotations can be added by any registered user of BOLD, and instructions for this are provided.
    PLoS ONE 04/2015; 10(4):e0125635. DOI:10.1371/journal.pone.0125635 · 3.53 Impact Factor
  • Source
    ABSTRACT: Feather mites (Astigmata: Analgoidea, Pterolichoidea) are among the most abundant and commonly occurring bird ectosymbionts. Basic questions on the ecology and evolution of feather mites remain unanswered because feather mite species identification is often only possible for adult males and is laborious even for specialised taxonomists, thus precluding large-scale identifications. Here, we tested DNA barcoding as a molecular tool to identify feather mites from passerine birds. We barcoded 361 specimens of 72 species of feather mites from 68 species of European passerine birds from Russia and Spain. The accuracy of barcoding and mini-barcoding was tested. Moreover, threshold choice (a controversial issue in barcoding studies) was also explored in a new way, by calculating through simulations the effect of sampling effort (in species number and species composition) on threshold calculations. We found one 200 bp mini-barcode region that showed the same accuracy as the full-length barcode (602 bp) and was surrounded by conserved regions potentially useful for group-specific degenerate primers. Species identification accuracy was perfect (100%) but decreased when singletons or species of the Proctophyllodes pinnatus group were included. In fact, barcoding confirmed previous taxonomic issues within the Proctophyllodes pinnatus group. Following an integrative taxonomy approach, we compared our barcode study with previous taxonomic knowledge on feather mites, discovering three new putative cryptic species and validating three previously recognised, morphologically different (but still undescribed) new species.
    Molecular Ecology Resources 02/2015; DOI:10.1111/1755-0998.12384 · 5.63 Impact Factor