NOTE
PCR cycles above routine numbers do not compromise
high-throughput DNA barcoding results
J. Vierna, J. Doña, A. Vizcaíno, D. Serrano, and R. Jovani
Abstract: High-throughput DNA barcoding has become essential in ecology and evolution, but some technical questions still remain. Increasing the number of PCR cycles above the routine 20–30 cycles is a common practice when working with old type specimens, which provide small amounts of DNA, or when facing annealing issues with the primers. However, increasing the number of cycles can raise the number of artificial mutations due to polymerase errors. In this work, we sequenced 20 COI libraries on the Illumina MiSeq platform. Libraries were prepared with 40, 45, 50, 55, and 60 PCR cycles from four individuals belonging to four species of four genera of cephalopods. We found no relationship between the number of PCR cycles and the number of mutations despite using a non-proofreading polymerase. Moreover, even when using a high number of PCR cycles, the resulting number of mutations was low enough not to be an issue in the context of high-throughput DNA barcoding (although a high number of cycles may still be an issue in DNA metabarcoding because of chimera formation). We conclude that the common practice of increasing the number of PCR cycles should not negatively impact the outcome of a high-throughput DNA barcoding study in terms of the occurrence of point mutations.
Key words: COI, DNA barcoding, Illumina, library, mutations, non-proofreading polymerase.
Introduction
High-throughput DNA barcoding (for single specimens; Shokralla et al. 2014, 2015; Toju 2015), as well as similar methods such as DNA metabarcoding (for mixed-species samples; Taberlet et al. 2012) or amplicon metagenomics, combines DNA-based species identification using standardised markers (DNA barcoding; Hebert et al. 2003) with the power of high-throughput sequencing (HTS). These methods are powerful tools in life sciences research (Taberlet et al. 2012; Kress et al. 2015; Toju 2015), from studying century-old type specimens (Prosser et al. 2016) to assessing the species composition of gut microbiota from mixed samples (Abdelrhman et al. 2016).
Here, we focus on high-throughput DNA barcoding. This methodology overcomes some of the problems that currently limit DNA barcoding, such as the high DNA template concentration required for Sanger sequencing and the co-amplification of other DNA templates due to intrasample contamination, Wolbachia infection, gut contents, heteroplasmy, and pseudogenes. Moreover, high-throughput DNA barcoding reduces both per-specimen costs and labour time by nearly 80%, thus allowing it to be scaled up to large-scale biodiversity monitoring projects (Shokralla et al. 2015; Cruaud et al. 2017).
However, even though high-throughput DNA barcoding is a promising method, some technical issues require further study. For example, some authors have explored the impact of the sequencing platform (Smith and Peay 2014), the polymerase used (Oliver et al. 2015; Brandariz-Fontes et al. 2015), the DNA barcode length (Hajibabaei et al. 2006; Doña et al. 2015), the library preparation method (Schirmer et al. 2015), the primers (Schirmer et al. 2015), the annealing temperature (Schmidt et al. 2013), or the phenomenon known as mistagging (Schnell et al. 2015; Esling et al. 2015) in DNA metabarcoding or amplicon sequencing. Recently, Geisen et al. (2015) and Díaz-Real et al. (2015) studied to what extent DNA metabarcoding produced quantitative (and not only qualitative) and reliable results in two groups of symbionts. Finally, several other papers have dealt with some of these issues through bioinformatic analysis of the HTS reads (Caporaso et al. 2010; Coissac et al. 2012; Edgar 2013; Bokulich et al. 2013; Boyer et al. 2016).
Received 5 April 2017. Accepted 21 June 2017.
Corresponding Editor: F. Chain.
J. Vierna* and A. Vizcaíno. AllGenetics & Biology SL. Edificio CICA, Campus de Elviña s/n. E-15008 A Coruña, Spain.
J. Doña* and R. Jovani. Department of Evolutionary Ecology, Estación Biológica de Doñana (CSIC), Avenida Américo Vespucio s/n. E-41092 Sevilla, Spain.
D. Serrano. Department of Conservation Biology, Estación Biológica de Doñana (CSIC), Avenida Américo Vespucio s/n. E-41092 Sevilla, Spain.
Corresponding author: R. Jovani (email: jovani@ebd.csic.es).
*These authors contributed equally to this work.
Copyright remains with the author(s) or their institution(s). Permission for reuse (free in most cases) can be obtained from RightsLink.
Genome 00: 1–6 (0000) dx.doi.org/10.1139/gen-2017-0081 Published at www.nrcresearchpress.com/gen on 28 July 2017.
Here, we focused on the number of PCR cycles used for library preparation. This is a technical issue that can potentially impact the biological conclusions of high-throughput DNA barcoding projects, but that has not yet been studied in detail. Increasing the number of PCR cycles above the normal 20–35 cycles (e.g., Shokralla et al. 2014, 2015; Carew et al. 2017) is a common practice, for example, when working with old type specimens (Prosser et al. 2016), which provide small amounts of input DNA, or when the PCR is inefficient (e.g., Blaalid et al. 2013; Ellis et al. 2013; Carew et al. 2017). However, a large number of PCR cycles may entail the risk of increasing the number of artificial mutations in the output sequencing reads because of DNA polymerase errors and the amplification of these errors in subsequent PCR cycles (Cha and Thilly 1993; Hengen 1995; Casbon et al. 2011; Brandariz-Fontes et al. 2015). This is potentially a major problem for high-throughput DNA barcoding because it can eventually distort, among other things, genetic threshold-based species delimitation. Yet, to our knowledge, how these extra cycles affect DNA barcoding results has never been investigated.
To explore the consequences of the number of PCR cycles upon the number of artificial mutations, we extracted DNA from four different individuals belonging to four cephalopod species. From each of the four DNA samples, we prepared five high-throughput DNA barcoding libraries with different numbers of PCR cycles: from 40, i.e., roughly 20 cycles more than regular numbers, to 60, as is commonly done when dealing with problematic samples. After sequencing the 20 libraries using the Illumina MiSeq platform, we studied the relationship between the number of PCR cycles and the number of mutations present in the MiSeq reads. Our results show that, for a number of cycles between 40 and 60, there is no relationship between the number of PCR cycles and the number of mutations, with the number of reads with mutations being very low. Therefore, we conclude that a number of PCR cycles as high as 60 does not compromise the success of a high-throughput DNA barcoding project in terms of the occurrence of point mutations.
Materials and methods
Four ethanol-preserved tissues obtained from different cephalopod species belonging to the orders Octopoda, Oegopsida, and Sepiida were analysed (see sample IDs and cephalopod species in Table 1). Species were identified according to morphology and DNA barcoding (Fernando Fernández-Álvarez, personal communication). The genetic p-distances between the selected individuals were between 80.1 and 85.7 for the cytochrome c oxidase subunit I gene (COI) used in this study.
Total DNA was extracted from each individual using the NZY-
Tissue gDNA Isolation Kit (NZYTech). DNAs were quantified with
the Qubit dsDNA HS Assay Kit (ThermoFisher Scientific) and used
as input for the preparation of the libraries.
We followed a standard Illumina library preparation protocol.
In brief, we amplified the COI region (i.e., the standard animal
barcode, Hebert et al. 2003) and included the Illumina specific
adapters and indices by following a two-step PCR approach, slightly
modified from Lange et al. (2014). For the sake of clarity, we refer to
these PCRs as PCR1 and PCR2.
PCR1 primers were LCO1490 and HCO2198 (Folmer et al. 1994), which proved successful in a previous study in which the same specimens were DNA barcoded (Fernando Fernández-Álvarez et al., personal communication). Oligonucleotide tails bearing the Illumina sequencing primers were attached to the 5′ ends of primers LCO1490 and HCO2198. PCR2 was carried out with tailed primers that bear the indices and adapters and anneal to the Illumina sequencing primers (see Fig. 1 for a schematic representation of the binding process).
PCR1 was carried out using 25 ng of total DNA in a final volume of 25 µL containing 6.50 µL of Supreme NZYTaq Green PCR Master Mix (NZYTech) (non-proofreading polymerase; error rate of 1 × 10⁻⁵ according to the manufacturer), 0.5 µM of each primer, and PCR-grade water up to 25 µL. The thermal cycling conditions were as follows: an initial denaturation step at 95 °C for 5 min, followed by 35, 40, 45, 50, or 55 cycles (see Fig. 2) of denaturation at 95 °C for 30 s, annealing at 53 °C for 30 s, and extension at 72 °C for 45 s, and a final extension step at 72 °C for 10 min. The products of PCR1 were purified using the SPRI method (DeAngelis et al. 1995), with Mag-Bind RXNPure Plus magnetic beads (Omega Biotek). The purified products were loaded in a 1% agarose gel stained with GreenSafe (NZYTech) and visualised under UV light.
PCR2 was carried out using 2.5 µL of the purified PCR1 products and the same conditions as for PCR1, except for the number of cycles, which was set to five (Fig. 2), and the annealing temperature (60 °C). The products obtained were purified following the SPRI method as indicated above. Then, the purified products were loaded in a 1% agarose gel stained with GreenSafe (NZYTech) and visualised under UV light. All samples yielded libraries of the expected size.
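The experimental design described above (four samples, five PCR1 cycle numbers, and a fixed five-cycle PCR2) can be enumerated with a short Python sketch. It only illustrates how the 35–55 cycles of PCR1 relate to the 40–60 total cycles reported elsewhere in the text; the sample IDs are those listed in Table 1, while the function and field names are hypothetical and not part of the authors' code.

```python
from itertools import product

# Sample IDs as listed in Table 1 of the paper.
SAMPLES = ["CEP007", "CEP016", "CEP023", "SEP006"]
PCR1_CYCLES = [35, 40, 45, 50, 55]  # cycles in the first (target-specific) PCR
PCR2_CYCLES = 5                     # fixed cycles in the second (indexing) PCR


def build_design(samples, pcr1_cycles, pcr2_cycles):
    """Return one record per library with PCR1, PCR2, and total cycle counts."""
    return [
        {
            "sample": sample,
            "pcr1_cycles": c1,
            "pcr2_cycles": pcr2_cycles,
            "total_cycles": c1 + pcr2_cycles,
        }
        for sample, c1 in product(samples, pcr1_cycles)
    ]


design = build_design(SAMPLES, PCR1_CYCLES, PCR2_CYCLES)
print(len(design))                                     # number of libraries
print(sorted({lib["total_cycles"] for lib in design}))  # distinct total cycle counts
```

Each library is identified by its sample and its total number of cycles, which is the quantity used as the predictor in the statistical analysis.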
Libraries were quantified using the Qubit dsDNA HS Assay Kit (ThermoFisher Scientific) and pooled in equimolar amounts. The pool was sequenced at Macrogen (Seoul, Korea) in a fraction of a 600-cycle run (MiSeq Reagent Kit v3; PE300) of an Illumina MiSeq sequencer, along with a PhiX library used to increase the sequence diversity of the overall library.
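Equimolar pooling from fluorometric concentrations is usually a small arithmetic step, sketched below in Python. The paper does not state how the pooling volumes were computed, so this is only an illustration under stated assumptions: an average mass of about 660 g/mol per base pair of double-stranded DNA, and hypothetical library concentrations and library length (none of these values come from the study).

```python
AVG_BP_MASS = 660.0  # assumed average molar mass (g/mol) of one dsDNA base pair


def ng_per_ul_to_nM(conc_ng_per_ul, length_bp):
    """Convert a dsDNA mass concentration (ng/µL) to a molar concentration (nM)."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * length_bp)


def equimolar_volumes(concs_ng_per_ul, length_bp, fmol_per_library):
    """Volume (µL) of each library that contains `fmol_per_library` femtomoles."""
    volumes = {}
    for name, conc in concs_ng_per_ul.items():
        molarity_nM = ng_per_ul_to_nM(conc, length_bp)  # 1 nM equals 1 fmol/µL
        volumes[name] = fmol_per_library / molarity_nM
    return volumes


# Hypothetical Qubit readings (ng/µL) and library length (bp), for illustration only.
example_concs = {"lib_A": 6.6, "lib_B": 13.2}
print(equimolar_volumes(example_concs, length_bp=1000, fmol_per_library=20.0))
```

More concentrated libraries contribute proportionally smaller volumes, so that every library contributes the same number of molecules to the pool.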
Fig. 1. Schematic representation of the primers used for PCR1 and PCR2 (see main text). The positions of the Illumina adapters, indices, and
sequencing primers are also shown. Note that primers are not drawn to scale.
FASTQ files were demultiplexed using RTA 1.18.54 (Illumina) and checked with FastQC 0.11.3 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Then, they were quality-trimmed using very conservative parameters in Trimmomatic 0.36 (Bolger et al. 2014) with the option SLIDINGWINDOW:1:30. SLIDINGWINDOW starts scanning at the 5′ end and clips the read once the average quality within the window falls below a threshold (Trimmomatic Manual 0.32). We set the size of the window to 1 and the quality threshold to 30 (Phred Quality Score). Therefore, when the quality of a single nucleotide fell below a Phred Quality Score of 30, the read was clipped from this position to the 3′ end. We used these very conservative parameters to make sure that the mutations observed in the sequencing results were due to PCR errors and not to sequencing errors. The quality of the resulting files was checked again with FastQC.
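The effect of a window of size 1 with a threshold of 30 can be illustrated with a minimal Python sketch. This is not Trimmomatic's implementation; it only mimics the behaviour described above for a single read, assuming Phred+33 quality encoding (an assumption, since the encoding is not stated in the text). The function names are hypothetical.

```python
def phred33_scores(quality_string):
    """Decode a Phred+33 quality string into a list of integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


def trim_at_first_low_quality(sequence, quality_string, threshold=30):
    """Clip a read at the first base (scanning from the 5' end) whose quality
    is below `threshold`, discarding that base and everything 3' of it."""
    scores = phred33_scores(quality_string)
    for i, q in enumerate(scores):
        if q < threshold:
            return sequence[:i], quality_string[:i]
    return sequence, quality_string


# 'I' encodes Q40 and '5' encodes Q20 in Phred+33, so the read is cut at position 5.
print(trim_at_first_low_quality("ACGTACGT", "IIII5III"))
```

Because a single low-quality base ends the read, most reads become much shorter, which is consistent with the small fraction of reads retained for analysis.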
Quality-trimmed FASTQ files were imported into Geneious 8.1.6 (http://www.geneious.com; Kearse et al. 2012). Each pair of R1 and R2 files was set as paired reads to improve the mapping. A map-to-reference analysis was carried out with the Geneious mapper using relaxed parameters (maximum number of mismatches per read, 25%; minimum overlap identity, 80%) to allow potentially mutated reads to map. The DNA barcode sequences from the four cephalopod specimens were set as references (DDBJ/EMBL/GenBank accession numbers KX078469–KX078472). The results of the map-to-reference analysis were inspected manually to verify that the reads of each library mapped to the correct reference sequence. We obtained 20 assembly files corresponding to the four species by the five PCR treatments.
Regions including the first 50 nucleotides of the mapped R1 and R2 reads (starting immediately after the primer annealing region) were aligned in each assembly with Muscle (Edgar 2004) as implemented in Geneious 8.1.6. We selected these two 50-nucleotide regions because this read length retained the maximum number of reads after the quality threshold was applied (see above); using larger regions would have reduced the sample size and, therefore, the statistical power of the analysis. Reads were trimmed to the same length to simplify later bioinformatic analyses.
For each alignment file, we calculated the number of mutations per read by comparing every read against the consensus sequence. The consensus sequences obtained from the FASTA files of the same species were identical to one another (regardless of the number of PCR cycles) and were also identical to the corresponding COI sequences available in DDBJ/EMBL/GenBank. To count the mutations, we used a custom-developed R function (R Core Team 2016) that multiplies the pairwise genetic p-distance by the total length of our reads. The function treated insertions and deletions (indels) as single mutational steps, and the genetic p-distance was calculated with the dist.dna function (raw model) from the ape 3.4 R package (Paradis et al. 2004). Then, we ran a Poisson generalised linear mixed model (GLMM) on the entire resulting data set (glmer function from package lme4 1.1-12; Bates et al. 2015). We considered the number of mutations as the response variable, the number of cycles as the predictor variable, and the species as a random factor. We confirmed assumptions underlying GLMMs by exploring regression residuals for normality against a Q-Q plot.
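The mutation count per read can be sketched in Python as follows. This is not the authors' R function (which is available in their repository and relies on dist.dna from ape); it is a simplified analogue that counts mismatches between an aligned read and the consensus directly and, as described above, treats each contiguous run of gaps as a single mutational step. The function name and the gap symbol are assumptions made for illustration.

```python
def count_mutations(read, consensus, gap="-"):
    """Count differences between an aligned read and the consensus sequence.

    Each substitution counts as one mutation, and each contiguous run of gaps
    on the same sequence (an indel) counts as one mutational step.
    """
    if len(read) != len(consensus):
        raise ValueError("aligned sequences must have equal length")
    mutations = 0
    gap_side = None  # which sequence carries the current gap run, if any
    for r, c in zip(read.upper(), consensus.upper()):
        if r == gap and c == gap:
            continue  # column is a gap in both sequences: no information
        if r == gap:
            side = "read"
        elif c == gap:
            side = "consensus"
        else:
            side = None
        if side is not None:
            if side != gap_side:
                mutations += 1  # a new indel event starts here
        elif r != c:
            mutations += 1      # substitution
        gap_side = side
    return mutations


print(count_mutations("ACGAACGT", "ACGTACGT"))  # one substitution
print(count_mutations("AC--ACGT", "ACGTACGT"))  # one two-base deletion
```

For reads without indels this count equals the p-distance multiplied by the read length, which is the quantity used in the paper.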
Fig. 2. From each cephalopod sample, five different high-throughput DNA barcoding libraries were constructed and sequenced on the Illumina MiSeq platform. In each of these five libraries, the number of PCR cycles during PCR1 was different (35, 40, 45, 50, and 55 cycles).
Finally, to make sure that the PCR1 reaction was still functioning after 55 cycles (i.e., that the emergence of new artificial mutations was still possible), qPCRs were performed on all four samples with the same parameters as in PCR1, but with 60 cycles to cover the whole range of our experiment. The resulting plots of fluorescence versus number of cycles were inspected visually, confirming that the reaction was still taking place after 55 cycles.
Results
Due to the stringent quality-filtering, only 2.26% of the raw
reads were used for the statistical analyses (see supplementary
material, Table S1). The average quality of both the raw and
quality-trimmed reads, as measured with FastQC, is available in
the supplementary material, Fig. S1.
In total, 69 792 reads passed the quality-filtering step, mapped to the correct reference sequence, and were located within the 50-nucleotide stretches after the primer annealing regions. We detected mutations in 4176 of these reads (i.e., 5.98%).
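The overall figures can be cross-checked against the per-library values reported in Table 1 with a few lines of Python. The percentages and read counts below are copied from Table 1; because the percentages are rounded to three decimals, the reconstructed number of mutated reads is only approximate. The dictionary layout and function names are hypothetical.

```python
# Percentages of reads with 0, 1, 2, or 3 mutations and read counts, from Table 1.
TABLE_1 = {
    "CEP007": {"pct": [94.055, 5.772, 0.167, 0.004], "reads": 22105},
    "CEP016": {"pct": [94.169, 5.607, 0.222, 0.0], "reads": 14408},
    "CEP023": {"pct": [93.409, 6.37, 0.198, 0.022], "reads": 22637},
    "SEP006": {"pct": [95.019, 4.839, 0.14, 0.0], "reads": 10642},
}


def total_reads(table):
    """Sum of reads analysed across all libraries."""
    return sum(row["reads"] for row in table.values())


def mutated_reads(table):
    """Approximate number of reads carrying at least one mutation."""
    return sum(row["reads"] * sum(row["pct"][1:]) / 100 for row in table.values())


print(total_reads(TABLE_1))            # should match the 69 792 reads analysed
print(round(mutated_reads(TABLE_1)))   # close to the 4176 mutated reads reported
print(round(100 * 4176 / 69792, 2))    # overall percentage of mutated reads
```

The per-species read counts add up to the total number of reads analysed, and the reconstructed number of mutated reads agrees with the reported value to within rounding.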
The number of mutations was consistent across species, and the maximum number of mutations per read was three across the different treatments (Fig. 3; Table 1). Accordingly, we found no effect of the number of cycles on the number of mutations (Fig. 3; slope ± SE = 0.0002 ± 0.0024, Z = 0.096, P = 0.923).
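To make the size of the estimated slope concrete, the model described in the Materials and methods can be written out as below. The notation is an assumption introduced here for illustration ($y_{ij}$ for the number of mutations in read $i$ of species $j$, $c_{ij}$ for the number of PCR cycles, $u_j$ for the species random intercept); in lme4 syntax such a model would typically take a form like `mutations ~ cycles + (1 | species)` with a Poisson family, although the exact call used by the authors is not given in the text.

```latex
\begin{align}
y_{ij} &\sim \mathrm{Poisson}(\lambda_{ij}), \\
\log \lambda_{ij} &= \beta_0 + \beta_1 c_{ij} + u_j, \qquad u_j \sim \mathcal{N}(0, \sigma_u^2), \\
\frac{\lambda(c + 20)}{\lambda(c)} &= e^{20 \beta_1} \approx e^{20 \times 0.0002} = e^{0.004} \approx 1.004 .
\end{align}
```

Under these assumptions, the point estimate corresponds to an expected change of roughly 0.4% in the number of mutations per read across the whole 20-cycle range of the experiment, and the standard error is more than ten times larger than the slope itself.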
Discussion
In this work, we investigated whether increasing the number of PCR cycles during library preparation produces a higher number of mutations that could eventually impact the outcome of a high-throughput DNA barcoding study. We demonstrated that even for a high number of cycles (60, i.e., up to 55 cycles for PCR1 and five additional cycles for PCR2) the number of reads with mutations remained very low despite using a non-proofreading enzyme and despite the potential occurrence of heteroplasmy (which would increase the number of mutated positions when compared to the reference sequence). However, we only analysed two regions of 50 nucleotides each from the COI animal DNA barcode, whereas different genomic regions may impose different error rates on DNA polymerase (e.g., Arezi et al. 2003). Nevertheless, the lack of effect we found in these regions with high sequence quality when experimentally increasing the number of PCR cycles suggests that the number of PCR cycles has a negligible impact on point mutations and subsequent taxonomic assignment.
Table 1. Percentage of reads with 0, 1, 2, or 3 mutations relative to the reference sequence.
Library ID (species)                      0 mutations   1 mutation   2 mutations   3 mutations   No. of reads
CEP007 (Bathypolypus sponsalis)           94.055        5.772        0.167         0.004         22 105
CEP016 (Ancistroteuthis lichtensteini)    94.169        5.607        0.222         0             14 408
CEP023 (Todaropsis eblanae)               93.409        6.37         0.198         0.022         22 637
SEP006 (Sepietta oweniana)                95.019        4.839        0.14          0             10 642
Note: No sequence with four or more mutations was found.
Fig. 3. Number of mutations relative to the reference sequence observed in each PCR treatment. (a) Bathypolypus sponsalis. (b) Ancistroteuthis lichtensteini. (c) Todaropsis eblanae. (d) Sepietta oweniana.
Some DNA metabarcoding-specific technical issues can arise from an increase in the number of PCR cycles, and thus require further study. For instance, chimeras are hybrid amplicons that can be formed during a PCR when an aborted extension product from an earlier cycle functions as a primer in a subsequent PCR cycle (Haas et al. 2011). Chimeras inflate diversity artificially and should be carefully taken into account. In this work, chimeras were not an issue because we prepared our libraries using DNA from individual specimens (i.e., high-throughput DNA barcoding libraries). However, the formation of chimeras has been found to be correlated with the number of PCR cycles and with the consumption of the primers (Wang and Wang 1996; Qiu et al. 2001; Thompson et al. 2002). Fortunately, several bioinformatic tools have been developed to deal with chimeras, and thus their impact can be greatly reduced (Edgar et al. 2011; Haas et al. 2011; Coissac et al. 2012; Boyer et al. 2016). Thus, even though our results hold for DNA metabarcoding studies in terms of point mutations, the formation of chimeras at high numbers of PCR cycles is a separate problem that should be considered in DNA metabarcoding studies.
Overall, our results show that increasing the number of PCR cycles above routine levels during library preparation is not risky for high-throughput DNA barcoding studies in terms of the number of point mutations produced by polymerase errors, even when a non-proofreading enzyme is used. Therefore, this strategy can be safely followed with small amounts of input DNA or when there are mismatches in the primer annealing regions that make the PCRs inefficient.
Data accessibility
The MiSeq raw data, the sequence files, and the supplementary material have been deposited in Figshare (https://doi.org/10.6084/m9.figshare.3860958). The R code used for the analyses is available in the GitHub repository (https://github.com/Jorge-Dona/Barcoding-tools).
Acknowledgements
We thank Fernando Fernández-Álvarez for letting us analyse the cephalopod samples, which belong to the research project CALOCEAN-2 (AGL2012-39077), funded by the Ministerio de Economía y Competitividad (Spain). This work was supported by the Ministerio de Economía y Competitividad (Spain) with a Ramón y Cajal research contract RYC-2009-03967 to R.J., and two research projects (CGL2011-24466, CGL2015-69650-P) to D.S. and R.J. J.D. was also supported by the Ministerio de Economía y Competitividad (Spain) (SVP-2013-067939).
References
Abdelrhman, K.F.A., Bacci, G., Mancusi, C., Mengoni, A., Serena, F., and Ugolini, A. 2016. A first insight into the gut microbiota of the sea turtle Caretta caretta. Front. Microbiol. 7: 1060. PMID:27458451.
Arezi, B., Xing, W., Sorge, J.A., and Hogrefe, H.H. 2003. Amplification efficiency of thermostable DNA polymerases. Anal. Biochem. 321: 226–235. doi:10.1016/S0003-2697(03)00465-2. PMID:14511688.
Bates, D., Mächler, M., Bolker, B., and Walker, S. 2015. Fitting linear mixed-effects models using lme4. J. Stat. Softw. 67(1): 1–48. doi:10.18637/jss.v067.i01.
Blaalid, R., Kumar, S., Nilsson, R.H., Abarenkov, K., Kirk, P.M., and Kauserud, H. 2013. ITS1 versus ITS2 as DNA metabarcodes for fungi. Mol. Ecol. Resour. 13: 218–224. doi:10.1111/1755-0998.12065. PMID:23350562.
Bokulich, N.A., Subramanian, S., Faith, J.J., Gevers, D., Gordon, J.I., Knight, R., et al. 2013. Quality-filtering vastly improves diversity estimates from Illumina amplicon sequencing. Nat. Methods, 10: 57–59. PMID:23202435.
Bolger, A.M., Lohse, M., and Usadel, B. 2014. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics, 30(15): 2114–2120. doi:10.1093/bioinformatics/btu170. PMID:24695404.
Boyer, F., Mercier, C., Bonin, A., Le Bras, Y., Taberlet, P., and Coissac, E. 2016. obitools: a unix-inspired software package for DNA metabarcoding. Mol. Ecol. Resour. 16: 176–182. doi:10.1111/1755-0998.12428. PMID:25959493.
Brandariz-Fontes, C., Camacho-Sánchez, M., Vilà, C., Vega-Pla, J.L., Rico, C., and Leonard, J.A. 2015. Effect of the enzyme and PCR conditions on the quality of high-throughput DNA sequencing results. Sci. Rep. 5: 8056. doi:10.1038/srep08056. PMID:25623996.
Caporaso, J.G., Kuczynski, J., Stombaugh, J., Bittinger, K., Bushman, F.D., Costello, E.K., et al. 2010. QIIME allows analysis of high-throughput community sequencing data. Nat. Methods, 7: 335–336. doi:10.1038/nmeth.f.303. PMID:20383131.
Carew, M.E., Metzeling, L., StClair, R., and Hoffmann, A.A. 2017. Detecting invertebrate species in archived collections using next generation sequencing. Mol. Ecol. Resour. [Online ahead of print.] doi:10.1111/1755-0998.12644.
Casbon, J.A., Osborne, R.J., Brenner, S., and Lichtenstein, C.P. 2011. A method for counting PCR template molecules with application to next-generation sequencing. Nucleic Acids Res. 39: e81. doi:10.1093/nar/gkr217. PMID:21490082.
Cha, R.S., and Thilly, W.G. 1993. Specificity, efficiency, and fidelity of PCR. Genome Res. 3: S18–S29. doi:10.1101/gr.3.3.S18. PMID:8118393.
Coissac, E., Riaz, T., and Puillandre, N. 2012. Bioinformatic challenges for DNA metabarcoding of plants and animals. Mol. Ecol. 21: 1834–1847. doi:10.1111/j.1365-294X.2012.05550.x. PMID:22486822.
Cruaud, P., Rasplus, J.Y., Rodriguez, L.J., and Cruaud, A. 2017. High-throughput sequencing of multiple amplicons for barcoding and integrative taxonomy. Sci. Rep. 7: 41948. doi:10.1038/srep41948. PMID:28165046.
DeAngelis, M.M., Wang, D.G., and Hawkins, T.L. 1995. Solid-phase reversible immobilization for the isolation of PCR products. Nucleic Acids Res. 23: 4742–4743. doi:10.1093/nar/23.22.4742. PMID:8524672.
Díaz-Real, J., Serrano, D., Píriz, A., and Jovani, R. 2015. NGS metabarcoding proves successful for quantitative assessment of symbiont abundance: the case of feather mites on birds. Exp. Appl. Acarol. 67: 209–218. doi:10.1007/s10493-015-9944-x. PMID:26139533.
Doña, J., Díaz-Real, J., Mironov, S., Bazaga, P., Serrano, D., and Jovani, R. 2015. DNA barcoding and minibarcoding as a powerful tool for feather mite studies. Mol. Ecol. Resour. 15: 1216–1225. doi:10.1111/1755-0998.12384. PMID:25655349.
Edgar, R.C. 2004. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32: 1792–1797. doi:10.1093/nar/gkh340. PMID:15034147.
Edgar, R.C. 2013. UPARSE: highly accurate OTU sequences from microbial amplicon reads. Nat. Methods, 10: 996–998. doi:10.1038/nmeth.2604. PMID:23955772.
Edgar, R.C., Haas, B.J., Clemente, J.C., Quince, C., and Knight, R. 2011. UCHIME improves sensitivity and speed of chimera detection. Bioinformatics, 27: 2194–2200. doi:10.1093/bioinformatics/btr381.
Ellis, R.J., Bruce, K.D., Jenkins, C., Stothard, J.R., Ajarova, L., Mugisha, L., and Viney, M.E. 2013. Comparison of the distal gut microbiota from people and animals in Africa. PLoS ONE, 8: e54783. doi:10.1371/journal.pone.0054783. PMID:23355898.
Esling, P., Lejzerowicz, F., and Pawlowski, J. 2015. Accurate multiplexing and filtering for high-throughput amplicon-sequencing. Nucleic Acids Res. 43: 2513–2524. doi:10.1093/nar/gkv107. PMID:25690897.
Folmer, O., Black, M., Hoeh, W., Lutz, R., and Vrijenhoek, R. 1994. DNA primers for amplification of mitochondrial cytochrome c oxidase subunit I from diverse metazoan invertebrates. Mol. Mar. Biol. Biotechnol. 3: 294–299. PMID:7881515.
Geisen, S., Laros, I., Vizcaíno, A., Bonkowski, M., and de Groot, G.A. 2015. Not all are free-living: high-throughput DNA metabarcoding reveals a diverse community of protists parasitizing soil metazoa. Mol. Ecol. 24: 4556–4569. doi:10.1111/mec.13238. PMID:25966360.
Haas, B.J., Gevers, D., Earl, A.M., Feldgarden, M., Ward, D.V., Giannoukos, G., et al. 2011. Chimeric 16S rRNA sequence formation and detection in Sanger and 454-pyrosequenced PCR amplicons. Genome Res. 21: 494–504. doi:10.1101/gr.112730.110. PMID:21212162.
Hajibabaei, M., Smith, M., Janzen, D.H., Rodriguez, J.J., Whitfield, J.B., and Hebert, P.D. 2006. A minimalist barcode can identify a specimen whose DNA is degraded. Mol. Ecol. Notes, 6: 959–964. doi:10.1111/j.1471-8286.2006.01470.x.
Hebert, P.D.N., Cywinska, A., Ball, S.L., and deWaard, J.R. 2003. Biological identifications through DNA barcodes. Proc. R. Soc. B Biol. Sci. 270: 313–321. doi:10.1098/rspb.2002.2218.
Hengen, P.N. 1995. Methods and reagents: fidelity of DNA polymerases for PCR. Trends Biochem. Sci. 20: 324–325. doi:10.1016/S0968-0004(00)89060-X. PMID:7667892.
Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock, S., et al. 2012. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics, 28: 1647–1649. doi:10.1093/bioinformatics/bts199. PMID:22543367.
Kress, W.J., García-Robledo, C., Uriarte, M., and Erickson, D.L. 2015. DNA barcodes for ecology, evolution, and conservation. Trends Ecol. Evol. 30: 25–35. doi:10.1016/j.tree.2014.10.008.
Lange, V., Böhme, I., Hofmann, J., Lang, K., Sauter, J., Schöne, B., et al. 2014. Cost-efficient high-throughput HLA typing by MiSeq amplicon sequencing. BMC Genomics, 15: 63. doi:10.1186/1471-2164-15-63. PMID:24460756.
Oliver, A.K., Brown, S.P., Callaham, M.A., Jr., and Jumpponen, A. 2015. Polymerase matters: non-proofreading enzymes inflate fungal community richness estimates by up to 15%. Fungal Ecol. 15: 86–89. doi:10.1016/j.funeco.2015.03.003.
Paradis, E., Claude, J., and Strimmer, K. 2004. APE: analyses of phylogenetics and evolution in R language. Bioinformatics, 20: 289–290. doi:10.1093/bioinformatics/btg412. PMID:14734327.
Prosser, S.W., deWaard, J.R., Miller, S.E., and Hebert, P.D. 2016. DNA barcodes from century-old type specimens using next-generation sequencing. Mol. Ecol. Resour. 16: 487–497. doi:10.1111/1755-0998.12474. PMID:26426290.
Qiu, X., Wu, L., Huang, H., McDonel, P.E., Palumbo, A.V., Tiedje, J.M., and Zhou, J. 2001. Evaluation of PCR-generated chimeras, mutations, and heteroduplexes with 16S rRNA gene-based cloning. Appl. Environ. Microbiol. 67: 880–887. doi:10.1128/AEM.67.2.880-887.2001. PMID:11157258.
R Core Team. 2016. R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Available from https://www.R-project.org/.
Schirmer, M., Ijaz, U.Z., D’Amore, R., Hall, N., Sloan, W.T., and Quince, C. 2015. Insight into biases and sequencing errors for amplicon sequencing with the Illumina MiSeq platform. Nucleic Acids Res. 43: e37. doi:10.1093/nar/gku1341. PMID:25586220.
Schmidt, P.-A., Bálint, M., Greshake, B., Bandow, C., Römbke, J., and Schmitt, I. 2013. Illumina metabarcoding of a soil fungal community. Soil Biol. Biochem. 65: 128–132. doi:10.1016/j.soilbio.2013.05.014.
Schnell, I.B., Bohmann, K., and Gilbert, M.T. 2015. Tag jumps illuminated – reducing sequence-to-sample misidentifications in metabarcoding studies. Mol. Ecol. Resour. 15: 1289–1303. doi:10.1111/1755-0998.12402. PMID:25740652.
Shokralla, S., Gibson, J.F., Nikbakht, H., Janzen, D.H., Hallwachs, W., and Hajibabaei, M. 2014. Next-generation DNA barcoding: using next-generation sequencing to enhance and accelerate DNA barcode capture from single specimens. Mol. Ecol. Resour. 14: 892–901. PMID:24641208.
Shokralla, S., Porter, T.M., Gibson, J.F., Dobosz, R., Janzen, D.H., Hallwachs, W., et al. 2015. Massively parallel multiplex DNA sequencing for specimen identification using an Illumina MiSeq platform. Sci. Rep. 5: 9687. doi:10.1038/srep09687. PMID:25884109.
Smith, D.P., and Peay, K.G. 2014. Sequence depth, not PCR replication, improves ecological inference from next generation DNA sequencing. PLoS ONE, 9: e90234. doi:10.1371/journal.pone.0090234. PMID:24587293.
Taberlet, P., Coissac, E., Pompanon, F., Brochmann, C., and Willerslev, A. 2012. Towards next-generation biodiversity assessment using DNA metabarcoding. Mol. Ecol. 21: 2045–2050. doi:10.1111/j.1365-294X.2012.05470.x. PMID:22486824.
Thompson, J.R., Marcelino, L.A., and Polz, M.F. 2002. Heteroduplexes in mixed-template amplifications: formation, consequence and elimination by ‘reconditioning PCR’. Nucleic Acids Res. 30: 2083–2088. doi:10.1093/nar/30.9.2083. PMID:11972349.
Toju, H. 2015. High-throughput DNA barcoding for ecological network studies. Popul. Ecol. 57: 37–51. doi:10.1007/s10144-014-0472-z.
Wang, G.C., and Wang, Y. 1996. The frequency of chimeric molecules as a consequence of PCR co-amplification of 16S rRNA genes from different bacterial species. Microbiology, 142: 1107–1114. doi:10.1099/13500872-142-5-1107. PMID:8704952.