Abstract

Whole genome sequencing is instrumental for the study of genome variation in natural populations, delivering important knowledge on genomic modifications and potential targets of natural selection at the population level. Large dormant eggbanks of aquatic invertebrates such as the keystone herbivore Daphnia, a microcrustacean widespread in freshwater ecosystems, provide detailed sedimentary archives to study genomic processes over centuries. To overcome the problem of limited DNA amounts in single Daphnia dormant eggs, we developed an optimised workflow for whole genome amplification (WGA), yielding sufficient amounts of DNA for downstream whole genome sequencing of individual historical eggs, including polyploid lineages. We compare two WGA kits, applied to recently produced Daphnia magna dormant eggs from laboratory cultures, and to historical dormant eggs of Daphnia pulicaria collected from Arctic lake sediment between 10y and 300y old. Resulting genome coverage breadth in most samples was ~70%, including those from >100y old isolates. Sequence read distribution was highly correlated among samples amplified with the same kit, but less correlated between kits. Despite this, a high percentage of genomic positions with SNPs in one or more samples (maximum of 74% between kits, and 97% within kits) were recovered at a depth required for genotyping. As a by-product of sequencing we obtained 100% coverage of the mitochondrial genomes even from the oldest isolates (~300y). The mtDNA provides an additional source for evolutionary studies of these populations. We provide an optimised workflow for WGA followed by whole genome sequencing including steps to minimise exogenous DNA.
Refining the evolutionary time machine: an assessment of whole genome amplification using single historical Daphnia eggs

Christopher James O’Grady 1,3, Vignesh Dhandapani 2, John K. Colbourne 2, Dagmar Frisch 2

1 University of Warwick, School of Life Sciences, Coventry, UK
2 University of Birmingham, School of Life Sciences, Birmingham, UK
3 Cell and Gene Therapy Catapult, London, UK

Running title: Whole genome amplification from single historical Daphnia eggs

Keywords: Whole genome sequencing, SNP analysis, Daphnia, ancient DNA, population ecology

Subject category:
The authors declare no conflict of interest

bioRxiv preprint doi: https://doi.org/10.1101/2021.04.19.440325; this version posted April 19, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Introduction
With recent declines in sequencing costs due to the development of high-throughput technologies, whole genome sequencing (WGS) has emerged as an important molecular tool in evolutionary biology, and has been applied to a plethora of different biological systems (Dettman et al., 2012; Ellegren, 2014; Hohenlohe, Hand, Andrews, & Luikart, 2018; Stiller & Zhang, 2019). WGS allows the analysis of genetic variation at thousands of genomic loci to test relationships between phenotypic and genotypic adaptations in genome-wide association studies (GWAS) (De La Torre, Wilhite, & Neale, 2019; Rajpurohit et al., 2018; Sella & Barton, 2019). At the population level, and in particular if long-term time series data including data from ancient DNA are available, WGS can provide invaluable genomic detail, shedding light on evolutionary patterns and processes (Leonardi et al., 2017; Parks et al., 2015).
Understanding how individuals and populations adapt to their environment is one of the most compelling and challenging tasks in evolutionary ecology, especially in light of the current unprecedented environmental change. A unique approach gaining momentum is the study of propagules of various plant or animal taxa preserved in layered aquatic sediments (Ellegaard et al., 2020; Orsini et al., 2013). These archives, consisting of dormant stages (e.g. eggs, seeds, cysts) with DNA degraded to varying degrees, as well as of hatchable propagules with intact DNA, allow the direct observation of evolutionary change across centuries or even millennia (Brede et al., 2009; Cordellier, Wojewodzic, Wessels, Kuster, & von Elert, 2021; Frisch et al., 2014; Härnström, Ellegaard, Andersen, & Godhe, 2011; Mergeay, Verschuren, & De Meester, 2006; Pollard, Colbourne, & Keller, 2003; Weider, Lampert, Wessels, Colbourne, & Limburg, 1997), potentially at genomic resolution of individual isolates. The exploitation of such resources together with modern molecular tools is instrumental in the study of evolutionary processes over thousands of generations in relation to environmental change.
As one of the notable examples, the population genetics of the ecological and genomic model Daphnia (Crustacea, Cladocera) has been studied over historical time frames and associated with changes in the lake environment, either by genotyping individual eggs (Limburg & Weider, 2002; Brede et al., 2009; Orsini, Spanier, & De Meester, 2012; Frisch et al., 2014, 2016) or by whole genome sequencing of pooled egg DNA (Cordellier et al., 2021). WGS of individual eggs would allow population genomic studies at high resolution; however, success of WGS of individual dormant eggs has so far been limited (Lack, Weider, & Jeyasingh, 2018).
A major obstacle for WGS of individual dormant Daphnia eggs is their minute amount of DNA. These eggs contain an embryo in the late blastula stage (Chen et al., 2018; von Baldass, 1941). Given an estimated haploid genome size of ~200 Mb (Colbourne et al., 2011) and around 2000 cells in a dormant embryo of D. pulex (von Baldass, 1941), the DNA content can be estimated at ~800 pg in a diploid embryo. The situation is exacerbated for historical dormant Daphnia eggs or those of other taxa, due to DNA degradation, posing additional problems for DNA sequencing (Rizzi, Lari, Gigli, De Bellis, & Caramelli, 2012). Although it is possible to combine eggs from individual sediment strata for a pooled sequencing approach, such a strategy leads to loss of information on individual genotypes and of accuracy of population genomic parameters such as FST estimates (Dorant et al., 2019). An alternative approach to gain sufficient amounts of genetic starting material is whole genome amplification (WGA). This method uses cell material without prior DNA extraction, thus minimising potential loss of DNA during the extraction process, and amplifies genomic DNA from extremely low starting concentrations in the picogram range. However, a previous study that applied WGA to individual dormant Daphnia eggs had limited success, with only one of three eggs producing amplified Daphnia DNA (Lack et al., 2018).
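The estimate of ~800 pg per diploid embryo can be retraced with a short calculation. The conversion of roughly 978 Mb of double-stranded DNA per picogram is an assumption introduced here for illustration; it is a commonly used factor but is not stated in the text.

```latex
\begin{align*}
m_{\text{cell}} &\approx \frac{2 \times 200\ \text{Mb}}{978\ \text{Mb/pg}} \approx 0.41\ \text{pg} \\
m_{\text{embryo}} &\approx 2000\ \text{cells} \times 0.41\ \text{pg/cell} \approx 820\ \text{pg}
\end{align*}
```

Under the same assumptions, a triploid embryo would hold about 1.5 times this amount, which is still well within the picogram range.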
Multiple displacement amplification (MDA), a widely used PCR-free WGA method, utilizes a high-fidelity φ29 DNA polymerase which extends from hexamer primers that randomly bind to targets across the genomic template (Dean et al., 2002; Dean, Nelson, Giesler, & Lasken, 2001). This results in the generation of large DNA products, with an average length of ~10 kb (capable of reaching over 100 kb), that can provide strong coverage of the target genome (Blanco et al., 1989; Handyside et al., 2004; Lasken & Egholm, 2003; Paez, 2004). MDA is often favoured over PCR-based WGA techniques, for example degenerate oligonucleotide-primed PCR, as PCR-based methods can result in the production of small DNA fragments (<1 kb) (Telenius et al., 1992; Wells, Sherlock, Handyside, & Delhanty, 1999; L. Zhang et al., 1992) that contain a number of non-specific amplification artifacts (Cheung & Nelson, 1996). Additionally, PCR-WGA methods can show a significant amplification bias towards specific loci, and consequently products may not give a complete coverage of loci (Dean et al., 2002). However, MDA is highly sensitive and therefore particularly vulnerable to DNA contamination, which can compete or co-amplify with the desired DNA template during WGA and cause issues during downstream analyses (Blainey & Quake, 2011; Woyke et al., 2011). Great care must therefore be taken to eliminate sources of contamination during the amplification step.
Extending a previous study on whole genome amplification from individual Daphnia eggs (Lack et al., 2018), our goal was to develop an optimized WGA-WGS workflow for historical dormant egg isolates including improved decontamination steps. To achieve this, we used recently produced Daphnia magna dormant eggs from laboratory cultures (days old) and Daphnia pulicaria dormant eggs isolated from lake sediment (between 10 and 300 years old), and compared two commercially available single cell WGA kits based on MDA technology. We test the success of reducing exogenous DNA through the application of different concentrations and durations of bleach, or several washes with PBS, to samples prior to WGA. In a sequencing experiment, we analyse mapping efficiency and genome-wide read distribution and compare these between species and eggs of various ages for a total of 16 dormant eggs aged up to 300 years. We compare read distribution patterns, coverage breadth and uniformity, and identify contaminants. Finally, we test the utility of these kits for detecting genomic variants in both nuclear and mitochondrial genomes.

Materials and Methods
Egg collection. All eggs used for whole genome amplification were isolated from ephippia of two species: Daphnia magna Straus, 1820, and Daphnia pulicaria Forbes, 1893. For D. magna, we used eggs from recently produced ephippia that are routinely removed from laboratory cultures maintained in the Daphnia facility of the University of Birmingham, UK (DM1-DM7, unknown origin). Ephippia of Arctic, triploid populations of Daphnia pulicaria were collected in 2015 from sediment of two lakes in West Greenland (Kangerlussuaq area). Details on the lakes and sediment dating can be found in Dane, Anderson, Osburn, Colbourne, & Frisch (2020). Briefly, we sampled ephippia from sediment corresponding to several historical time periods in two lakes: Lake SS4 (Braya Sø): ca. 2010, ca. 1880, ca. 1720; and Lake SS381: ca. 2010, ca. 1840 (Table 1).
Pre-WGA preparation and cleaning of Daphnia eggs. Eggs were removed from ephippia (decapsulated) and transferred to sterile 1X PBS shortly before use. Decapsulated eggs were inspected under a stereomicroscope to ensure that they were in good condition (judged by colour and appearance). Visually undamaged eggs were washed in a 5% or 10% wash solution made from industrial strength (12%) bleach. Exposure to the wash solution was either brief (<2 seconds) or lasted 20 seconds (Table 1), and was followed by five separate rinses in sterile 1X PBS buffer to remove any remaining bleach. Alternatively, eggs were washed by five to eight rinses in 1X PBS: a row of PBS droplets was placed on a glass slide, and each egg was washed individually by carefully and repeatedly drawing it up with a pipette (sterile tip) in each of the droplets. Rinse controls contained the PBS solution used for the last rinse, while negative controls contained sterile PBS. Bleaching (including the final PBS rinse) was performed in a SCANLAF Mars Safety Class 2 laminar flow hood under sterile conditions. The alternative procedure of PBS rinsing was performed in a clean, dedicated room with thorough bleaching of all surfaces prior to processing eggs. Bleached and rinsed eggs were kept on ice for brief periods in sterile 1X PBS until further processing.
Whole Genome Amplification. Whole genome amplification (WGA) was performed using two PCR-free kits: Expedeon TruePrime™ Single Cell WGA kit (hereafter: TruePrime), and Qiagen REPLI-g Single Cell Kit (hereafter: REPLI-g). Positive controls (extracted Daphnia DNA) were included in WGA. Rinse controls and negative controls were included to monitor possible amplification of contaminating DNA. Prior to WGA, egg membranes were pierced with a sterile 10 µl pipette tip to expose embryonic cells, and eggs were kept in the amount of 1X PBS buffer required for the first step of the reaction in the respective kit. DNA concentration in WGA products was quantified with a microplate reader (Tecan infinite F200 pro), or a Qubit® 2.0 Fluorometer (Invitrogen) and the Qubit® dsDNA HS Assay kit (Invitrogen). The size distribution of WGA products was determined by agarose gel electrophoresis to analyse the impact of pre-treatment steps (bleaching or washing in PBS) on the DNA fragments produced by WGA.
Whole genome library preparation and sequencing. WGA samples DM1-DM7 (Daphnia magna eggs) and DP1-DP5 (Daphnia pulicaria eggs) were used to prepare single-end (SE) libraries with an insert size of 300 bp using a PCR-free workflow with the KAPA HyperPrep Kit (Roche) following the manufacturer's instructions. Sequencing of 100 bp SE libraries was performed on the Illumina HiSeq2500 platform at the Environmental Omics sequencing facility, University of Birmingham, UK. WGA samples DP6-DP9 (Daphnia pulicaria eggs) were used to prepare paired-end (PE) libraries with an insert size of 350 bp with the TruSeq DNA PCR-free gel-free library preparation kit (Illumina) according to the manufacturer's instructions. Paired-end library preparation and sequencing (150 bp PE libraries) were performed at Edinburgh Genomics, The University of Edinburgh, UK.

Bioinformatic and statistical analysis. All analyses involving R packages were completed with R version 3.6.2 (R Core Team, 2019).
Quality control and mapping. We used FastQC (Andrews, 2015) to check read quality, followed by adapter trimming and removal of leading and trailing low quality bases with Trimmomatic (Bolger, Lohse, & Usadel, 2014). Following quality control, reads were mapped using BWA-mem with default settings (Li & Durbin, 2009) to the respective reference genome assembly: the Daphnia magna genome assembly daphmag2.4 (GenBank accession GCA_001632505.1), the Daphnia pulex genome assembly (http://genome.jgi.doe.gov/Dappu1/Dappu1.download.html) (Colbourne et al., 2011), and the Daphnia pulex mitochondrial genome (GenBank accession NC_000844) (Crease, 1999). Mapping statistics were computed with Qualimap (García-Alcalde et al., 2012) prior to variant calling. Duplicate reads were removed from mapped reads using MarkDuplicates from the Picard Toolkit (Broad Institute, 2019).
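As a minimal sketch of the mapping statistics used throughout this study (mapping efficiency, coverage breadth and mean depth), the following Python fragment assumes that read counts and per-position depths have already been extracted from the alignments. The function name `mapping_summary` and its inputs are hypothetical and do not reflect how Qualimap is implemented.

```python
from statistics import mean


def mapping_summary(total_reads, mapped_reads, depths):
    """Summarise mapping for one sample.

    total_reads, mapped_reads: read counts after quality control.
    depths: per-position read depth across the reference (hypothetical input).
    """
    if total_reads <= 0 or not depths:
        raise ValueError("need at least one read and one reference position")
    # A position counts as covered if at least one read overlaps it.
    covered = sum(1 for d in depths if d > 0)
    return {
        "mapping_efficiency": mapped_reads / total_reads,
        "coverage_breadth": covered / len(depths),
        "mean_depth": mean(depths),
    }


# Toy example: 10 reference positions, 3 of them without any read.
summary = mapping_summary(
    total_reads=100,
    mapped_reads=88,
    depths=[0, 2, 3, 0, 1, 4, 2, 0, 5, 3],
)
```

In this toy example the mapping efficiency is 88%, coverage breadth is 70% and mean depth is 2X.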
Nuclear DNA variant calling and analysis. Nuclear single nucleotide polymorphisms (SNPs) were called in Daphnia magna using the available SE libraries. SNPs in the triploid Arctic D. pulicaria eggs were called only from PE samples because the sequencing depth of SE samples was insufficient for calling variants in a triploid organism (Maruki & Lynch, 2017). Nuclear and mitochondrial variants were called with freebayes v. 1.3.2 (Garrison & Marth, 2012), excluding reads with a mapping quality < 40 and bases with a quality < 24, and requiring a minimum alternate allele fraction of 0.01, with ploidy = 2 (D. magna ncDNA, D. pulicaria mtDNA) or ploidy = 3 (D. pulicaria ncDNA). After variant calling, nuclear SNPs were hard-filtered with vcffilter (Garrison, 2016) applying all of the following settings: "QUAL > 1" to ensure the exclusion of variants with very low quality, "QUAL / AO > 10" to include only variants where each observation contributes at least 10 log units (~Q10 per read), "SAF > 0 & SAR > 0" to require alternate-allele observations on both strands and thus avoid strand bias, and "RPR > 1 & RPL > 1" to require at least two reads on each side of the variant.
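The combined effect of the four hard filters can be sketched in a few lines of Python. This is an illustration of the filter logic only, not the vcffilter implementation; the function name, the dictionary representation of a variant record and the restriction to a single alternate allele are assumptions made for this example.

```python
def passes_hard_filters(rec, use_qual_per_obs=True):
    """Return True if a variant record passes all hard filters.

    rec: dict with freebayes-style fields QUAL, AO, SAF, SAR, RPR, RPL
         (a single alternate allele is assumed for simplicity).
    use_qual_per_obs: set to False to skip the "QUAL / AO > 10" filter.
    """
    # "QUAL > 1": drop variants of very low quality.
    if rec["QUAL"] <= 1:
        return False
    # "QUAL / AO > 10": each alternate observation must add enough support.
    if use_qual_per_obs and (rec["AO"] == 0 or rec["QUAL"] / rec["AO"] <= 10):
        return False
    # "SAF > 0 & SAR > 0": alternate allele seen on both strands.
    if not (rec["SAF"] > 0 and rec["SAR"] > 0):
        return False
    # "RPR > 1 & RPL > 1": at least two reads placed on each side.
    if not (rec["RPR"] > 1 and rec["RPL"] > 1):
        return False
    return True


good = {"QUAL": 300.0, "AO": 12, "SAF": 5, "SAR": 7, "RPR": 6, "RPL": 6}
one_strand = dict(good, SAF=12, SAR=0)   # fails the strand filter
weak_support = dict(good, QUAL=60.0)     # QUAL / AO = 5, fails that filter
```

The `use_qual_per_obs` switch is included because the mitochondrial analysis described in this section applies the same filters without "QUAL / AO > 10".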
Genomic positions with high-confidence SNPs present in one or more samples were compared between selected samples to assess the percentage of loci that could be called in all selected samples (i.e. that were amplified and sequenced at the depth required for genotyping). This comparison was used primarily to estimate the repeatability of WGA and subsequent WGS, and thus the resulting capacity to call variants at the multisample level. For D. magna, we compared the four samples with the highest number of SNPs (two REPLI-g amplified samples: DM2, DM3; two TruePrime amplified samples: DM6, DM7). For D. pulicaria, we compared all four PE samples (only TruePrime amplified). Results were visualised with the R packages eulerr v. 6.1.0 (Larsson, 2020) and ggVennDiagram v. 0.3 (Gao & Yi, 2019).
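The comparison described above amounts to taking the union of SNP positions across samples and asking at what fraction of them every sample could be genotyped. The sketch below illustrates this with Python sets; the function name and the representation of loci as (scaffold, position) tuples are assumptions for this example.

```python
def genotyping_repeatability(snp_loci, callable_loci):
    """Fraction of candidate SNP positions that can be genotyped in all samples.

    snp_loci: dict sample -> set of (scaffold, position) with a called SNP.
    callable_loci: dict sample -> set of (scaffold, position) covered at the
                   depth required for genotyping (hypothetical input).
    Returns (n_candidate, n_shared, fraction).
    """
    samples = list(snp_loci)
    if not samples:
        raise ValueError("no samples given")
    # Candidate positions: a SNP was called in one or more samples.
    candidate = set().union(*(snp_loci[s] for s in samples))
    # Shared positions: every sample reaches genotyping depth there.
    shared = {
        locus for locus in candidate
        if all(locus in callable_loci[s] for s in samples)
    }
    fraction = len(shared) / len(candidate) if candidate else 0.0
    return len(candidate), len(shared), fraction


snps = {
    "A": {("s1", 10), ("s1", 20)},
    "B": {("s1", 20), ("s1", 30)},
}
callable_sites = {
    "A": {("s1", 10), ("s1", 20), ("s1", 30)},
    "B": {("s1", 20), ("s1", 30), ("s1", 40)},
}
n_candidate, n_shared, fraction = genotyping_repeatability(snps, callable_sites)
```

In the toy data, two of the three candidate positions can be genotyped in both samples. The same ratio applied to the counts reported in the Results gives 327,802 / 442,976 = 74% for D. magna and 1,712,889 / 1,758,439 = 97.4% for D. pulicaria.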
Transition-to-transversion ratios for SNPs (Ti:Tv) were calculated after applying a minor allele frequency (MAF) threshold of 0.05, and not allowing missing data (R packages SeqArray 1.26.2 (Zheng et al., 2017) and SeqVarTools 1.24.1 (Gogarten et al., 2020)).
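For readers unfamiliar with the metric, a minimal Python sketch of the Ti:Tv calculation follows. It assumes biallelic SNPs that have already passed the MAF and missing-data filters; the function names are illustrative and are not the SeqVarTools API.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def is_transition(ref, alt):
    """Purine<->purine or pyrimidine<->pyrimidine substitutions are transitions."""
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ti_tv_ratio(snps):
    """Ti:Tv ratio for an iterable of (ref, alt) single-base substitutions.

    Returns None if no transversion is present (ratio undefined).
    """
    ti = tv = 0
    for ref, alt in snps:
        ref, alt = ref.upper(), alt.upper()
        if ref not in BASES or alt not in BASES or ref == alt:
            raise ValueError(f"not a single-base substitution: {ref}>{alt}")
        if is_transition(ref, alt):
            ti += 1
        else:
            tv += 1
    return ti / tv if tv else None


example = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("G", "T")]
ratio = ti_tv_ratio(example)  # 3 transitions, 2 transversions
```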
mtDNA variant calling and analysis. To analyse mtDNA, we used all available Daphnia pulicaria samples (DP1-DP9). For mitochondrial SNPs, the same filters as for nuclear SNPs were applied except "QUAL / AO > 10", to avoid filtering calls of the alternate allele from samples with paired-end sequencing due to their consistently higher depth compared to the single-end samples. Identity-by-State (IBS) was calculated after applying a MAF threshold of 0.05 and not allowing missing data (R package SeqArray 1.26.2 (Zheng et al., 2017)). SNPs in Daphnia pulicaria mtDNA were visualised with the packages Circlize v.0.4.11 (Gu et al., 2014) and SNPRelate (Zheng et al., 2012).
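Identity-by-state between two samples is commonly defined as the mean proportion of alleles they share across loci. The sketch below implements this usual definition for genotypes called as diploid, encoded as alternate-allele counts; it is an illustration under these assumptions, and the R packages cited above may differ in implementation details.

```python
def identity_by_state(geno_a, geno_b):
    """Mean proportion of alleles shared identical-by-state between two samples.

    geno_a, geno_b: equal-length lists of alternate-allele counts (0, 1 or 2)
    at the same biallelic SNPs; loci with missing data are assumed removed.
    """
    if len(geno_a) != len(geno_b) or not geno_a:
        raise ValueError("genotype vectors must be non-empty and of equal length")
    # At each locus the two samples share 2, 1 or 0 alleles.
    shared = sum(2 - abs(a - b) for a, b in zip(geno_a, geno_b))
    return shared / (2 * len(geno_a))


sample_1 = [0, 1, 2, 2]
sample_2 = [0, 1, 0, 1]
ibs = identity_by_state(sample_1, sample_2)  # shares 2 + 2 + 0 + 1 of 8 alleles
```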
Read distribution. This analysis focused on reads mapped to the N50 scaffolds of the D. magna and D. pulex genomes. Coverage depth was normalised between samples using binned reads (bin size 10 kb or 100 kb; normalised reads = number of reads per bin / average number of reads across bins). Normalised read coverage was visualised with the packages Circlize v.0.4.11 (Gu, 2014) and ggplot2 v.3.3.2 (Wickham, 2016). Read distribution was compared within species between samples by correlation analysis (Pearson's correlation coefficient) of normalised binned reads. For this purpose, we removed a single outlier present in all D. pulicaria samples (position 420,001-430,000, scaffold 38). We tested uniformity of read distribution according to the standard model for random sequencing by fitting the distribution of normalised read coverage to a Poisson distribution (Lander & Waterman, 1988).
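The binning, normalisation and correlation steps can be sketched as follows. This is a minimal Python illustration using only the standard library; the function names, the use of 1-based read start positions and the toy data are assumptions, and the original analysis was carried out in R.

```python
from math import sqrt


def bin_reads(positions, scaffold_length, bin_size=10_000):
    """Count reads per bin from 1-based mapping positions on one scaffold."""
    n_bins = -(-scaffold_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for pos in positions:
        counts[(pos - 1) // bin_size] += 1
    return counts


def normalise_bins(counts):
    """Normalised reads = reads per bin / average number of reads across bins."""
    mean_count = sum(counts) / len(counts)
    if mean_count == 0:
        raise ValueError("no reads in any bin")
    return [c / mean_count for c in counts]


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Two toy samples with different sequencing depth but the same bin profile.
sample_a = normalise_bins([10, 20, 30, 40])
sample_b = normalise_bins([5, 10, 15, 20])
r = pearson_r(sample_a, sample_b)
binned = bin_reads([1, 10_000, 10_001, 25_000], scaffold_length=30_000)
```

Because each sample is divided by its own mean, the two toy samples have identical normalised profiles despite a twofold difference in depth, which is the purpose of the normalisation step.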
Outlier identification and exogenous DNA in DP4 and DP5. Outlier regions were identified by a mapping rate 10 times higher than the mean normalized count. Sequences belonging to these outlier regions were extracted and searched against the nucleotide database with BLASTN. Exogenous DNA was identified in two samples with low mapping efficiency (DP4, DP5). For this, unmapped reads were extracted from bam files with samtools v1.4 (Li & Durbin, 2009) and converted to paired-end fastq files using bedtools v.2 (Quinlan & Hall, 2010). Following this step, SOAPdenovo-127mer v.2 (Luo et al., 2012) was used to de novo assemble the unmapped reads with a kmer size of 23 bp and default parameters. The resulting contigs were searched against the NCBI nucleotide database with BLASTN (using the command line: blastn -task megablast -db NCBI_nt_db -query infile -evalue 1e-100 -out outfile -max_target_seqs 1 -num_threads 10 -outfmt "6 qseqid sseqid sciname qlen slen qstart qend sstart send length evalue pident nident mismatch gaps"). For graphical representation we used Krona (Ondov, Bergman, & Phillippy, 2011).
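The outlier criterion can be expressed compactly: because normalised counts have a mean of 1 by construction, a bin is an outlier when its normalised count exceeds 10. The Python sketch below illustrates this; the function names and the toy counts are assumptions made for this example.

```python
def find_outlier_bins(counts, fold=10.0):
    """Indices of bins whose normalised read count exceeds `fold` times the mean.

    counts: reads per bin for one sample.
    """
    mean_count = sum(counts) / len(counts)
    if mean_count == 0:
        raise ValueError("no reads in any bin")
    normalised = [c / mean_count for c in counts]
    # The mean of the normalised counts is 1, so the threshold equals `fold`.
    return [i for i, value in enumerate(normalised) if value > fold]


def bin_coordinates(index, bin_size=10_000):
    """1-based start and end position of a bin on its scaffold."""
    return index * bin_size + 1, (index + 1) * bin_size


# 49 ordinary bins and one strongly over-represented bin.
toy_counts = [100] * 49 + [100_000]
outliers = find_outlier_bins(toy_counts)
```

With 10 kb bins, the bin with index 42 spans positions 420,001-430,000, which matches the coordinates of the scaffold 38 outlier removed before the correlation analysis.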

Results
Whole genome amplification
WGA products were separated by agarose gel electrophoresis to determine the impact of pre-treatment steps (bleaching or washing in PBS) on WGA product fragment sizes. Regardless of the pre-treatment process, strong bands for DNA fragments > 10 kb were detected in all samples, suggesting a minimal impact of pre-treatment on WGA product size (examples in Fig. S1). However, WGA products obtained from one of the rinse controls (1X PBS, DNA amplified from the last PBS wash of an unbleached sample) also produced high intensity bands (from 500 bp to >10 kb), which likely resulted from amplified residual DNA carried over from prior washes. WGA of rinse controls obtained from the PBS after bleach-washing did not yield any product, suggesting that the application of bleach completely removes external, exogenous DNA. No amplified DNA was present in any of the negative controls (sterile PBS).
Mean WGA-DNA yield was lower in the samples amplified by TruePrime (7.38 µg) in comparison with REPLI-g (31.76 µg) (Table 1). These values were within the ranges suggested by the respective WGA kit manufacturers (~40 µg WGA-DNA for REPLI-g, and 3-4 µg when starting from a single cell for TruePrime).

Read mapping to nuclear and mitochondrial reference genomes
Both single-end (SE) (DP1-DP5) and paired-end (PE) (DP6-DP9) sequencing were performed for D. pulicaria, the latter with about 10-fold higher read numbers and correspondingly higher coverage depth (Table 2). We did not find a clear pattern of mapping efficiency to the nuclear genome related to the applied pre-treatment in the several-days-old dormant eggs produced in cultures (DM1-DM7). However, in ephippial eggs isolated from sediment, treatment with any of the tested bleach concentrations (DP1-DP5) was associated with lower mapping efficiency and/or a smaller fraction of the genome covered, suggesting that bleaching might damage egg DNA. This appeared to be independent of the amplification kit, for example in the TruePrime-amplified samples DP2 and DP3 (bleached, ~50% coverage breadth) compared with the similarly aged DP8 and DP9 (not bleached, ~70% coverage breadth, Table 2a).
Libraries prepared from WGA-DNA from 13 of the 16 tested dormant Daphnia eggs mapped successfully, with between 48% and 99% of reads (mean 88%, median 92%) in both SE and PE libraries mapping to the respective Daphnia nuclear genomes, suggesting that WGA largely resulted in amplification of the target DNA (Table 2a). This was achieved for ephippia produced in lab cultures as well as those collected from lake sediment of different ages, with eggs up to ~180y old. Maximum coverage breadth, i.e. the fraction of the nuclear genome covered, was similar in both Daphnia species (between 70 and 80%, Table 2a). Moderate mapping efficiency was recorded for two relatively young eggs at 48% (DP1, from ~10y old sediment) and 56% (DM4, from a lab culture) and resulted in coverage of a lower fraction of the nuclear reference genomes (lower coverage breadth). Mapping to the nuclear genome failed almost entirely in the two oldest eggs where WGA was attempted (DP4 and DP5, ~300y old, Table 2a), although the WGA-DNA yield was similar to that of other eggs (Table 1), suggesting amplification of contaminant DNA. However, egg age did not have a consistent effect on mapping efficiency, e.g. libraries for three ~10y old D. pulicaria eggs mapped with an efficiency between 48% (DP1) and > 90% (DP6, DP7), while the libraries from two ~140y old eggs were mapped with an efficiency of 84-87% (DP2, DP3).
In contrast to the nuclear genome, we found that reads obtained from all historical eggs of D. pulicaria, including the oldest samples (~300y old), could be mapped to the mitochondrial genome of the target species, resulting in high average coverage depth between 71X and 60,000X and nearly 100% coverage breadth of the mitochondrial genome (Table 2b).
Read distribution
Patterns of genome-wide normalized read distribution differed between species, WGA kits and sequencing strategies (Fig. 1). In D. magna, reads originating from the REPLI-g kit (SE sequencing, Fig. 1 A-B) produced an even coverage that was repeatable between samples. Read distribution of D. magna DNA obtained from this kit appeared to be more uniform, and to produce fewer and less apparent outliers, compared with the TruePrime kit in this species. The read distribution pattern differed markedly from that observed in D. pulicaria (Fig. 1 C-E). Samples from this species were represented by historical, sedimentary eggs and therefore the DNA may have been compromised to different degrees, providing inferior amplification substrate. Read distribution obtained from sequencing SE libraries of sedimentary eggs (only D. pulicaria, Fig. 1C) appeared less uniform in the REPLI-g amplified samples (DP1-DP3), which revealed several regions of the genome that were preferentially amplified. These outliers included genomic regions corresponding e.g. to the Pokey transposon or several introns, but also to segments of mitochondrial DNA encoded in the nuclear genome (Table S1). SE libraries obtained from historical eggs amplified with TruePrime provided a more uniform coverage with less pronounced outliers even in samples > 100y old (Fig. 1C,E). The same observation was made for the PE libraries obtained from TruePrime amplified samples (Fig. 1D,E). The most extreme outlier identified in TruePrime-amplified DNA (a segment of nuclear mitochondrial DNA on scaffold 38, Table S1) was observed in both SE and PE libraries of the historical eggs.
To test repeatability of read distribution patterns between samples within each species, we computed pairwise correlation coefficients (excluding the outlier on scaffold 38) for read counts within 10 kb bins (Fig. 2, Figs. S2, S3). For both species, samples amplified by the same WGA kit were moderately to strongly correlated (mean r: D. magna REPLI-g = 0.731; D. magna TruePrime = 0.767; D. pulicaria REPLI-g (SE) = 0.643; D. pulicaria TruePrime (SE) = 0.470; D. pulicaria TruePrime (PE) = 0.938). Weak correlations were observed between samples amplified using different kits (mean r: D. magna = 0.149; D. pulicaria = 0.085).
None of the read distributions of the amplified samples showed a significant fit to the Poisson distribution expected under the standard model for random sequencing (Table S2).
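A simple way to see why amplified samples depart from the Poisson expectation is the index of dispersion: under a Poisson model the variance of reads per bin equals the mean, so the ratio of the two should be close to 1. The sketch below is one possible diagnostic, not necessarily the test behind Table S2; it operates on raw (not normalised) counts per bin, and the function name and toy data are assumptions.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Variance-to-mean ratio of reads per bin.

    Values near 1 are compatible with Poisson-distributed coverage;
    values well above 1 indicate over-dispersion (uneven amplification).
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("no reads in any bin")
    # Sample variance (n - 1 denominator) divided by the mean.
    return variance(counts) / m


fairly_even = [3, 5, 4, 6, 2]   # mean 4, sample variance 2.5
very_uneven = [0, 0, 0, 20]     # mean 5, sample variance 100
d_even = dispersion_index(fairly_even)
d_uneven = dispersion_index(very_uneven)
```

A strongly over-dispersed profile such as the second toy example, where a few bins attract most reads, is the pattern expected when WGA preferentially amplifies particular regions.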
Exogenous DNA
In samples where WGA failed almost entirely to produce target DNA (DP4 and DP5, Table 2a), but yielded a similar amount of DNA as other eggs, we used a BLAST search to identify the taxonomic origin of the unmapped reads. This DNA was identified mostly as contaminant DNA from various bacterial and invertebrate taxa, as well as human DNA, but the diversity and percentages of the contaminant taxa differed between the two eggs (Fig. 3 and Fig. S4).
Variant calling
For D. magna, we called single nucleotide polymorphisms (SNPs) in the six samples with > 6 million reads (DM1-DM3, DM5-DM7). For the triploid D. pulicaria eggs, we could use only the four PE samples for confident SNP calling, because genomes of higher ploidy require greater sequencing depth.
The number of identified SNPs per sample in D. magna was between 99,927 and 430,909 in comparison with the D. magna reference genome (Table 2). In the D. magna samples with the highest number of called SNPs (REPLI-g: DM2, DM3; TruePrime: DM7; Table 2, Fig. 4A), a total of 442,976 unique and shared SNP loci were recorded. All three samples could be genotyped at the required depth at 327,802 of these positions (74% of the potential genomic positions of SNPs in these three samples). SNP loci recorded in the lower-depth D. magna sample DM6 were nested almost entirely in those of the higher-depth sample DM7 (both amplified with TruePrime, Fig. 4B). In D. pulicaria, the total number of SNP loci identified in the four PE samples in comparison to the D. pulex reference genome was 1,758,439. All four samples could be genotyped at the required depth at 1,712,889 (or 97.4%) of these genomic positions (Fig. 4C).
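The percentages above follow from counting the SNP positions at which every sample reaches the required depth and dividing by the total number of SNP positions. The sketch below illustrates that logic. The data layout, the function names and the `min_depth` parameter are assumptions made for illustration, since the depth threshold is not given in this section.

```python
def genotyped_in_all(snp_positions, depth_by_sample, min_depth):
    """Return the SNP positions covered at >= min_depth in every sample.

    snp_positions: set of (scaffold, position) tuples.
    depth_by_sample: {sample: {(scaffold, position): read depth}}.
    Positions missing from a sample are treated as depth 0.
    """
    return {
        pos
        for pos in snp_positions
        if all(depths.get(pos, 0) >= min_depth for depths in depth_by_sample.values())
    }


def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


# Counts reported in the text above.
print(f"D. magna: {percent(327_802, 442_976):.1f}%")          # 74.0%
print(f"D. pulicaria: {percent(1_712_889, 1_758_439):.1f}%")  # 97.4%
```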
We estimated the transition:transversion ratio (ti:tv, Fig. 4D) to gauge the variability of this metric between the samples studied and to assess whether the values were within the ranges reported for Daphnia in general. We found that the ratios differed between species but that variation among samples of the same species was small (D. pulicaria: 1.30 in all four samples; D. magna: 1.43 – 1.47). Differences in ti:tv ratios between amplification kits (D. magna only) were small (REPLI-g: 1.43 – 1.45; TruePrime: 1.46 – 1.47).
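For readers unfamiliar with the metric, the ti:tv ratio is the number of transitions (purine to purine, A<->G, or pyrimidine to pyrimidine, C<->T) divided by the number of transversions (any purine-pyrimidine exchange). The following sketch shows the calculation for a list of biallelic SNPs. It assumes valid A/C/G/T alleles with differing reference and alternative bases, uses hypothetical function names, and is not the tool used in the study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref, alt):
    """True if ref -> alt stays within purines or within pyrimidines."""
    pair = {ref.upper(), alt.upper()}
    return len(pair) == 2 and (pair <= PURINES or pair <= PYRIMIDINES)


def titv_ratio(snps):
    """ti:tv ratio for a list of (ref, alt) biallelic SNPs.

    Assumes every entry is a valid single-base substitution and that at
    least one transversion is present (otherwise the ratio is undefined).
    """
    transitions = sum(1 for ref, alt in snps if is_transition(ref, alt))
    transversions = len(snps) - transitions
    return transitions / transversions
```

With three transitions and two transversions, for instance, `titv_ratio` returns 1.5.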
SNP calling in the mitochondrial genomes (Fig. 5) was performed with all available D. pulicaria samples. The analysis revealed 200 biallelic SNPs that differed between the two lake populations sampled. Within populations, however, samples were identical, with the exception of DP1 (Lake SS4) and DP9 (Lake 1381), each of which differed from the other samples of its population by a single SNP.
Discussion
Whole genome amplification and subsequent whole genome sequencing are staples of modern single-cell genome studies, and are widely applied to the study of human diseases (Huang, Ma, Chapman, Lu, & Xie, 2015). Other promising but less common applications include phylogenomics (Ahrendt et al., 2018; Zhang et al., 2019) and metagenomics of microbial communities (Xu & Zhao, 2018). Application to population genomics and evolutionary studies using Daphnia dormant eggs has been suggested (Lack et al., 2018), but to date no comprehensive study is available that compares multiple sedimentary eggs of different historical age and species while applying different pre-treatments and amplification kits. Multiple displacement amplification (MDA) has superior qualities when the goal is the discovery of single nucleotide variants, due to the high fidelity of the φ29 polymerase and associated low error rates, while PCR-based WGA such as MALBAC may perform better for detecting copy number variation (Chen et al., 2014; de Bourcy et al., 2014). We therefore tested two MDA-WGA kits to provide a detailed workflow (Fig. 6) for successful and repeatable amplification and sequencing of DNA for variant calls from dormant eggs of Daphnia.
Pre-Treatment to minimise exogenous DNA
DNA contamination may originate from the WGA reagents or be introduced when handling samples (Rinke et al., 2014). Contamination may also derive from non-target exogenous DNA on the biological isolate (often of microbial origin), which is especially common in historical samples containing highly degraded ancient DNA (Pilli et al., 2013). Decontamination of equipment, including the application of bleach and UV light, can be performed prior to WGA to prevent the amplification of exogenous DNA (Weiss et al., 2014). It is also recommended to use a thoroughly decontaminated laminar flow hood during all stages of WGA, preferably situated in a dedicated clean room.
Overall, our data suggest that amplification of exogenous DNA can be kept to a minimum when all steps to avoid contamination are followed carefully. However, in historical samples where DNA is already damaged, exogenous DNA may be present in higher amounts than the target DNA, and thus be preferentially amplified, and even overwhelm the amplification process. This was particularly obvious in the oldest samples tested here (~300y old). Apart from egg age, DNA integrity may also be highly dependent on the preservation conditions of the lake sediment: DNA preservation can vary strongly between different lakes and may be related to a number of variables, including temperature, salt concentration and pH conditions (Ellegaard et al., 2020; Giguet-Covex et al., 2019).
Washes of D. pulicaria sedimentary eggs with diluted bleach had a detrimental impact on the DNA integrity of both younger and older eggs (~10 to 300y old) and therefore on the quality of WGA-DNA. In contrast, washing with PBS did not appear to affect the DNA in dormant eggs of D. pulicaria of different historical age (~10 and ~180y old), which produced high quality WGA-DNA. It is also possible that these eggs had a generally higher quality than those used for bleaching, which were sampled from sediment of a neighbouring lake. Nevertheless, all sedimentary D. pulicaria eggs were older than the dormant eggs removed from laboratory cultures of D. magna in our study. These were only several days old and did not show any signs of DNA degradation even after bleaching with a higher bleach concentration and/or a longer exposure time. A possible explanation is the likely presence of microfissures in the egg membranes of historical, sedimentary eggs, which are prone to increase with dormant egg age. To test this idea, microscopic studies comparing eggs of different age are needed. Based on these findings, we recommend the use of careful serial PBS washes instead of diluted bleach to remove possible contaminants on sedimentary egg surfaces that could overwhelm DNA amplification, particularly if egg DNA is more strongly impaired with increasing age.
Mapping of WGA-DNA produced with either of the two kits tested here was highly successful in most isolates, with a median of 92% of reads mapped to their respective reference genomes, even for eggs as old as 180 years. Coverage breadth in most samples was between 70 and 80%. These values are similar to those previously reported for a dormant Daphnia egg, but below those obtained from Daphnia bulk sequencing (Lack et al., 2018), and higher than those of MALBAC-amplified individual sperm cells of Daphnia (53%, Xu et al., 2015). Indeed, incomplete genome coverage is commonly observed in WGA data in various taxa (e.g. de Bourcy et al., 2014; Huang et al., 2015; Picher et al., 2016).
Deviation of read distributions from the Poisson distribution expected under a random sequencing model (Lander & Waterman, 1988) was observed in all samples. However, this is not unexpected, as this model has previously been described as inadequate for single-cell sequencing due to the possibility of locus dropout (Daley & Smith, 2014). Despite this, we found the regions of the genome that are amplified to be remarkably repeatable among samples of the same species, with highly significant correlation coefficients within amplification kits (average r of ~0.7). Below-average r values were generally associated with failed target amplification or low overall read depth, specifically in the SE samples, whereas r values were above average in the PE samples with high read depth. However, correlations between WGA kits were not strong, so for good comparability between samples we recommend applying only one kit.
The highly reproducible patterns of read distribution within kits are largely responsible for the success of SNP calling across samples, demonstrating the suitability of this method for population genomic applications, particularly for dormant eggs, and thus its utility for studying genome evolution using sediment archives. In D. magna, >70% of the SNP positions could be called in all three samples with >14 million reads. Perhaps not surprisingly, these values were even higher in the PE samples of D. pulicaria with >80 million reads, suggesting in general that sequencing strategy and depth strongly influence the fidelity of variant calls, also when applied to WGA-DNA.
Transition to transversion ratios have been suggested as a quality indicator for human SNP discovery (Wang, Raskin, Samuels, Shyr, & Guo, 2015). However, because these ratios vary between species, e.g. averaging 1.54 in several strains of Daphnia magna (Ho et al., 2020), or 0.45 in C. elegans (Denver et al., 2009), comparisons should be made within species. Our range of 1.43 to 1.47 for nuclear DNA of D. magna was well within the range measured by Ho et al. (2020), and the ratio for D. pulicaria (ti:tv 1.30) was similar to that reported for the closely related D. pulex (1.58, Keith et al., 2016). In this study, we can also apply the ti:tv ratio as an indicator of the reliability of the WGA procedure within and across kits, with highly similar values for D. magna eggs and almost identical values for D. pulicaria.
Other studies have identified possible limitations of the reliability of WGA, such as coverage uniformity, reproducibility, and allele dropout rate (de Bourcy et al., 2014; Huang et al., 2015). The substrates that these studies used were cell lineages from which individual cells were subjected to single-cell WGA (scWGA) and compared to bulk sequencing of a multicellular sample from the same lineage. For historical isolates of dormant eggs from the sediment, such a strategy cannot be applied. However, an effective substitute to test the reliability of SNPs from these historical samples was the use of asexually produced dormant eggs such as those of the Arctic D. pulicaria population from which our samples originated. WGA of these samples allowed the genotyping of 97% of all SNP loci detected in the PE sequences. Preliminary analysis of variation between these genotyped eggs showed a maximum difference of 3% of the roughly 1.7 million SNPs between individuals (unpublished data).
An added benefit of WGA with both tested kits was the possibility of obtaining high coverage of the full mitochondrial genome for both species. This is of particular interest for the historical samples, from which full mitochondrial genomes and high-quality SNP calls could be retrieved even from the oldest samples (~300y old eggs). This opens a promising avenue for gathering information on genome-wide mutation rates and spectra of the mitochondrial genome stored in sedimentary archives across extended time periods and thousands of generations, likely surpassing the time range tested here.
Conclusion
Although the method described here was tested on dormant eggs of Daphnia, it could be widely applied to sedimentary dormant stages of a variety of taxa, potentially providing access not only to single taxa but also to entire aquatic communities. Such taxa could include key members of the plankton in freshwater and marine to hypersaline ecosystems that produce dormant stages, such as the dormant eggs of calanoid copepods, cysts of the brine shrimp Artemia, algal dormant stages and plant seeds. Other methods that have successfully been applied to individual small-sized planktonic crustaceans, such as low-DNA-input sequencing libraries (Beninde, Möst, & Meyer, 2020), could be tested on dormant stages isolated from historical sediment layers. However, their protocol, which was optimised for adult specimens, requires a highly efficient DNA extraction step and includes up to 8 PCR cycles to obtain library concentrations as high as 5 ng DNA. Since DNA concentration in eggs or other propagules is generally very low, a DNA extraction step, which inevitably leads to a certain amount of DNA loss, could be imprudent when precious historical material is involved.
Our data reveal that both of the tested amplification kits provided high quality DNA for most Daphnia egg isolates, and that the amplified DNA could efficiently be applied to WGS and subsequent genome-wide studies at the population level. However, differences were observed with respect to the age of dormant eggs, where our results suggest a superior performance of TruePrime (compared with REPLI-g) for application to eggs of sedimentary origin and thus to potentially degraded DNA. A possible explanation could be that the primase TthPrimPol used in the TruePrime kit shows translesion activity, allowing re-initiation of the replication fork when encountering damaged DNA, and thus continued amplification of damaged substrate (Picher & Blanco, 2014). Due to the similarity of TthPrimPol to human PrimPol, another mechanism could explain our results: human PrimPol can reprime DNA synthesis following a lesion, allowing the φ29 DNA polymerase to continue the amplification process close to the region in which the lesion was found (Mourón et al., 2013), possibly improving evenness during the process.
Ultimately, our data indicate that for optimal results, preliminary trials are recommended using both kits tested here (and if possible others) on the dormant egg population in question.
Acknowledgements
DF received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 658714 and NERC Biomolecular Analysis Facility Pilot Project Grant NBAF998. COG received funding from the Midlands Integrative Biosciences Training Partnership (MIBTP). We are grateful to Stephen Kissane for preparation and sequencing of single-end libraries, and to Caroline Sewell for supplying D. magna ephippia from the Daphnia facility, UoB. Paired-end library preparation and sequencing were carried out by Edinburgh Genomics, the University of Edinburgh, which is partly supported with core funding from NERC (UKSBS PR18037).
References
Ahrendt, S. R., Quandt, C. A., Ciobanu, D., Clum, A., Salamov, A., Andreopoulos, B., … Grigoriev, I. V. (2018). Leveraging single-cell genomics to expand the fungal tree of life. Nature Microbiology, 3(12), 1417–1428. doi: 10.1038/s41564-018-0261-0
Andrews, S. (2015). FASTQC: A Quality Control tool for High Throughput Sequence Data. Babraham Institute. https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
Beninde, J., Möst, M., & Meyer, A. (2020). Optimized and affordable high-throughput sequencing workflow for preserved and nonpreserved small zooplankton specimens. Molecular Ecology Resources, 20(6), 1632–1646. doi: 10.1111/1755-0998.13228
Blainey, P. C., & Quake, S. R. (2011). Digital MDA for enumeration of total nucleic acid contamination. Nucleic Acids Research. doi: 10.1093/nar/gkq1074
Blanco, L., Bernad, A., Lázaro, J. M., Martín, G., Garmendia, C., & Salas, M. (1989). Highly Efficient DNA Synthesis by the Phage ϕ29 DNA Polymerase. Journal of Biological Chemistry. doi: 10.1016/s0021-9258(18)81883-x
Bolger, A. M., Lohse, M., & Usadel, B. (2014). Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics. doi: 10.1093/bioinformatics/btu170
Brede, N., Sandrock, C., Straile, D., Spaak, P., Jankowski, T., Streit, B., & Schwenk, K. (2009). The impact of human-made ecological changes on the genetic architecture of Daphnia species. Proceedings of the National Academy of Sciences of the United States of America, 106(12), 4758–4763.
Broad Institute. (2019). Picard toolkit. Broad Institute, GitHub Repository. https://github.com/broadinstitute/picard/releases/tag/2.25.0
Chen, L., Barnett, R. E., Horstmann, M., Bamberger, V., Heberle, L., Krebs, N., … Weiss, L. C. (2018). Mitotic activity patterns and cytoskeletal changes throughout the progression of diapause developmental program in Daphnia. BMC Cell Biology, 19(1), 30. doi: 10.1186/s12860-018-0181-0
Chen, M., Song, P., Zou, D., Hu, X., Zhao, S., Gao, S., & Ling, F. (2014). Comparison of Multiple Displacement Amplification (MDA) and Multiple Annealing and Looping-Based Amplification Cycles (MALBAC) in Single-Cell Sequencing. PLoS ONE, 9(12), e114520. doi: 10.1371/journal.pone.0114520
Cheung, V. G., & Nelson, S. F. (1996). Whole genome amplification using a degenerate oligonucleotide primer allows hundreds of genotypes to be performed on less than one nanogram of genomic DNA. Proceedings of the National Academy of Sciences, 93(25), 14676–14679. doi: 10.1073/pnas.93.25.14676
Colbourne, J. K., Pfrender, M. E., Gilbert, D., Thomas, W. K., Tucker, A., Oakley, T. H., … Boore, J. L. (2011). The Ecoresponsive Genome of Daphnia pulex. Science, 331(6017), 555–561. doi: 10.1126/science.1197761
Cordellier, M., Wojewodzic, M. W., Wessels, M., Kuster, C., & von Elert, E. (2021). Next-generation sequencing of DNA from resting eggs: signatures of eutrophication in a lake’s sediment. Zoology. doi: 10.1016/j.zool.2021.125895
Crease, T. J. (1999). The complete sequence of the mitochondrial genome of Daphnia pulex (Cladocera: Crustacea). Gene. doi: 10.1016/S0378-1119(99)00151-1
Daley, T., & Smith, A. D. (2014). Modeling genome coverage in single-cell sequencing. Bioinformatics. doi: 10.1093/bioinformatics/btu540
Dane, M., Anderson, N. J., Osburn, C. L., Colbourne, J. K., & Frisch, D. (2020). Centennial clonal stability of asexual Daphnia in Greenland lakes despite climate variability. Ecology and Evolution, 10(24), 14178–14188. doi: 10.1002/ece3.7012
de Bourcy, C. F. A., De Vlaminck, I., Kanbar, J. N., Wang, J., Gawad, C., & Quake, S. R. (2014). A quantitative comparison of single-cell whole genome amplification methods. PLoS ONE, 9(8), e105585. doi: 10.1371/journal.pone.0105585
De La Torre, A. R., Wilhite, B., & Neale, D. B. (2019). Environmental Genome-Wide Association Reveals Climate Adaptation Is Shaped by Subtle to Moderate Allele Frequency Shifts in Loblolly Pine. Genome Biology and Evolution, 11(10), 2976–2989. doi: 10.1093/gbe/evz220
Dean, F. B., Hosono, S., Fang, L., Wu, X., Faruqi, A. F., Bray-Ward, P., … Lasken, R. S. (2002). Comprehensive human genome amplification using multiple displacement amplification. Proceedings of the National Academy of Sciences of the United States of America, 99(8), 5261–5266. doi: 10.1073/pnas.082089499
Dean, F. B., Nelson, J. R., Giesler, T. L., & Lasken, R. S. (2001). Rapid amplification of plasmid and phage DNA using Phi29 DNA polymerase and multiply-primed rolling circle amplification. Genome Research. doi: 10.1101/gr.180501
Denver, D. R., Dolan, P. C., Wilhelm, L. J., Sung, W., Lucas-Lledo, J. I., Howe, D. K., … Baer, C. F. (2009). A genome-wide view of Caenorhabditis elegans base-substitution mutation processes. Proceedings of the National Academy of Sciences, 106(38), 16310–16314. doi: 10.1073/pnas.0904895106
Dettman, J. R., Rodrigue, N., Melnyk, A. H., Wong, A., Bailey, S. F., & Kassen, R. (2012). Evolutionary insight from whole-genome sequencing of experimentally evolved microbes. Molecular Ecology. doi: 10.1111/j.1365-294X.2012.05484.x
Dorant, Y., Benestan, L., Rougemont, Q., Normandeau, E., Boyle, B., Rochette, R., & Bernatchez, L. (2019). Comparing Poolseq, Rapture, and GBS genotyping for inferring weak population structure: The American lobster (Homarus americanus) as a case study. Ecology and Evolution, 9(11), 6606–6623. doi: 10.1002/ece3.5240
Ellegaard, M., Clokie, M. R. J., Czypionka, T., Frisch, D., Godhe, A., Kremp, A., … John Anderson, N. (2020). Dead or alive: sediment DNA archives as tools for tracking aquatic evolution and adaptation. Communications Biology, 3(1), 169. doi: 10.1038/s42003-020-0899-z
Ellegren, H. (2014). Genome sequencing and population genomics in non-model organisms. Trends in Ecology and Evolution. doi: 10.1016/j.tree.2013.09.008
Frisch, D., Morton, P. K., Culver, B. W., Edlund, M. B., Jeyasingh, P. D., & Weider, L. J. (2016). Paleogenetic records of Daphnia pulicaria in North American lakes reveal the impact of cultural eutrophication. Global Change Biology, 23(2), 708–718. doi: 10.1111/gcb.13445
Frisch, D., Morton, P. K., Roy Chowdhury, P., Culver, B. W., Colbourne, J. K., Weider, L. J., & Jeyasingh, P. D. (2014). A millennial-scale chronicle of evolutionary responses to cultural eutrophication in Daphnia. Ecology Letters, 17(3), 360–368.
Gao, C.-H., & Yi, L. (2019). ggVennDiagram: A “ggplot2” implement of Venn Diagram. https://github.com/gaospecial/ggVennDiagram
García-Alcalde, F., Okonechnikov, K., Carbonell, J., Cruz, L. M., Götz, S., Tarazona, S., … Conesa, A. (2012). Qualimap: evaluating next-generation sequencing alignment data. Bioinformatics, 28(20), 2678–2679. doi: 10.1093/bioinformatics/bts503
Garrison, E. (2016). Vcflib, a simple C++ library for parsing and manipulating VCF files. Retrieved from https://github.com/vcflib/vcflib
Garrison, E., & Marth, G. (2012). Haplotype-based variant detection from short-read sequencing. ArXiv Preprint ArXiv:1207.3907.
Giguet-Covex, C., Ficetola, G. F., Walsh, K., Poulenard, J., Bajard, M., Fouinat, L., … Arnaud, F. (2019). New insights on lake sediment DNA from the catchment: importance of taphonomic and analytical issues on the record quality. Scientific Reports, 9(1), 14676. doi: 10.1038/s41598-019-50339-1
Gu, Z. (2014). circlize implements and enhances circular visualization in R. Bioinformatics, 30(19), 2811–2812. doi: 10.1093/bioinformatics/btu393
Handyside, A. H., Robinson, M. D., Simpson, R. J., Omar, M. B., Shaw, M. A., Grudzinskas, J. G., & Rutherford, A. (2004). Isothermal whole genome amplification from single and small numbers of cells: A new era for preimplantation genetic diagnosis of inherited disease. Molecular Human Reproduction. doi: 10.1093/molehr/gah101
Härnström, K., Ellegaard, M., Andersen, T. J., & Godhe, A. (2011). Hundred years of genetic structure in a sediment revived diatom population. Proceedings of the National Academy of Sciences of the United States of America, 108(10), 4252–4257.
Ho, E. K. H., Macrae, F., Latta, L. C., McIlroy, P., Ebert, D., Fields, P. D., … Schaack, S. (2020). High and Highly Variable Spontaneous Mutation Rates in Daphnia. Molecular Biology and Evolution, 37(11), 3258–3266. doi: 10.1093/molbev/msaa142
Hohenlohe, P. A., Hand, B. K., Andrews, K. R., & Luikart, G. (2018). Population Genomics Provides Key Insights in Ecology and Evolution. doi: 10.1007/13836_2018_20
Huang, L., Ma, F., Chapman, A., Lu, S., & Xie, X. S. (2015). Single-Cell Whole-Genome Amplification and Sequencing: Methodology and Applications. Annual Review of Genomics and Human Genetics, 16(1), 79–102. doi: 10.1146/annurev-genom-090413-025352
Keith, N., Tucker, A. E., Jackson, C. E., Sung, W., Lucas Lledó, J. I., Schrider, D. R., … Lynch, M. (2016). High mutational rates of large-scale duplication and deletion in Daphnia pulex. Genome Research, 26(1), 60–69. doi: 10.1101/gr.191338.115
Lack, J. B., Weider, L. J., & Jeyasingh, P. D. (2018). Whole genome amplification and sequencing of a Daphnia resting egg. Molecular Ecology Resources. doi: 10.1111/1755-0998.12720
Lander, E. S., & Waterman, M. S. (1988). Genomic mapping by fingerprinting random clones: A mathematical analysis. Genomics, 2(3), 231–239. doi: 10.1016/0888-7543(88)90007-9
Larsson, J. (2020). eulerr: Area-Proportional Euler and Venn Diagrams with Ellipses. R package version 6.1.0. Retrieved from https://cran.r-project.org/package=eulerr
Lasken, R. S., & Egholm, M. (2003). Whole genome amplification: abundant supplies of DNA from precious samples or clinical specimens. Trends in Biotechnology, 21(12), 531–535. doi: 10.1016/j.tibtech.2003.09.010
Leonardi, M., Librado, P., Der Sarkissian, C., Schubert, M., Alfarhan, A. H., Alquraishi, S. A., … Orlando, L. (2017). Evolutionary Patterns and Processes: Lessons from Ancient DNA. Systematic Biology, 66(1), e1–e29. doi: 10.1093/sysbio/syw059
Li, H., & Durbin, R. (2009). Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25(14), 1754–1760. doi: 10.1093/bioinformatics/btp324
Limburg, P. A., & Weider, L. J. (2002). “Ancient” DNA in the resting egg bank of a microcrustacean can serve as a palaeolimnological database. Proceedings of the Royal Society of London Series B-Biological Sciences, 269(1488), 281–287.
Luo, R., Liu, B., Xie, Y., Li, Z., Huang, W., Yuan, J., … Wang, J. (2012). SOAPdenovo2: An empirically improved memory-efficient short-read de novo assembler. GigaScience. doi: 10.1186/2047-217X-1-18
Maruki, T., & Lynch, M. (2017). Genotype Calling from Population-Genomic Sequencing Data. G3: Genes|Genomes|Genetics, 7(5), 1393–1404. doi: 10.1534/g3.117.039008
Mergeay, J., Verschuren, D., & De Meester, L. (2006). Invasion of an asexual American water flea clone throughout Africa and rapid displacement of a native sibling species. Proceedings of the Royal Society B-Biological Sciences, 273(1603), 2839–2844.
Mourón, S., Rodriguez-Acebes, S., Martínez-Jiménez, M. I., García-Gómez, S., Chocrón, S., Blanco, L., & Méndez, J. (2013). Repriming of DNA synthesis at stalled replication forks by human PrimPol. Nature Structural & Molecular Biology, 20(12), 1383–1389. doi: 10.1038/nsmb.2719
Ondov, B. D., Bergman, N. H., & Phillippy, A. M. (2011). Interactive metagenomic visualization in a Web browser. BMC Bioinformatics, 12(1), 385. doi: 10.1186/1471-2105-12-385
Orsini, L., Schwenk, K., De Meester, L., Colbourne, J. K., Pfrender, M. E., & Weider, L. J. (2013). The evolutionary time machine: forecasting how populations can adapt to changing environments using dormant propagules. Trends in Ecology & Evolution, 28, 274–282.
Orsini, L., Spanier, K. I., & De Meester, L. (2012). Genomic signature of natural and anthropogenic stress in wild populations of the waterflea Daphnia magna: validation in space, time and experimental evolution. Molecular Ecology, 21(9), 2160–2175.
Paez, J. G. (2004). Genome coverage and sequence fidelity of φ29 polymerase-based multiple strand displacement whole genome amplification. Nucleic Acids Research, 32(9), e71–e71. doi: 10.1093/nar/gnh069
Parks, M., Subramanian, S., Baroni, C., Salvatore, M. C., Zhang, G., Millar, C. D., & Lambert, D. M. (2015). Ancient population genomics and the study of evolution. Philosophical Transactions of the Royal Society B: Biological Sciences. doi: 10.1098/rstb.2013.0381
Picher, Á. J., & Blanco, L. (2014). Patent No. International Publication Number WO2014140309A1. https://patents.google.com/patent/WO2014140309A1/en
Picher, Á. J., Budeus, B., Wafzig, O., Krüger, C., García-Gómez, S., Martínez-Jiménez, M. I., … Schneider, A. (2016). TruePrime is a novel method for whole-genome amplification from single cells based on TthPrimPol. Nature Communications, 7(1), 13296. doi: 10.1038/ncomms13296
Pilli, E., Modi, A., Serpico, C., Achilli, A., Lancioni, H., Lippi, B., … Caramelli, D. (2013). Monitoring DNA Contamination in Handled vs. Directly Excavated Ancient Human Skeletal Remains. PLoS ONE, 8(1), e52524. doi: 10.1371/journal.pone.0052524
Pollard, H. G., Colbourne, J. K., & Keller, W. (2003). Reconstruction of centuries-old Daphnia communities in a lake recovering from acidification and metal contamination. Ambio, 32(3), 214–218. doi: 10.1579/0044-7447-32.3.214
Quinlan, A. R., & Hall, I. M. (2010). BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics, 26(6), 841–842. doi: 10.1093/bioinformatics/btq033
R Core Team. (2019). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria. http://www.R-project.org.
Rajpurohit, S., Gefen, E., Bergland, A. O., Petrov, D. A., Gibbs, A. G., & Schmidt, P. S. (2018). Spatiotemporal dynamics and genome-wide association analysis of desiccation tolerance in Drosophila melanogaster. Molecular Ecology. doi: 10.1111/mec.14814
Rinke, C., Lee, J., Nath, N., Goudeau, D., Thompson, B., Poulton, N., … Woyke, T. (2014). Obtaining genomes from uncultivated environmental microorganisms using FACS-based single-cell genomics. Nature Protocols, 9(5), 1038–1048. doi: 10.1038/nprot.2014.067
Rizzi, E., Lari, M., Gigli, E., De Bellis, G., & Caramelli, D. (2012). Ancient DNA studies: New perspectives on old samples. Genetics Selection Evolution. doi: 10.1186/1297-9686-44-21
Sella, G., & Barton, N. H. (2019). Thinking about the Evolution of Complex Traits in the Era of Genome-Wide Association Studies. Annual Review of Genomics and Human Genetics. doi: 10.1146/annurev-genom-083115-022316
Stiller, J., & Zhang, G. (2019). Comparative Phylogenomics, a Stepping Stone for Bird Biodiversity Studies. Diversity. doi: 10.3390/d11070115
Telenius, H., Carter, N. P., Bebb, C. E., Nordenskjöld, M., Ponder, B. A. J., & Tunnacliffe, A. (1992). Degenerate oligonucleotide-primed PCR: General amplification of target DNA by a single degenerate primer. Genomics. doi: 10.1016/0888-7543(92)90147-K
von Baldass, F. (1941). Entwicklung von Daphnia pulex. Zoologische Jahrbücher. Abteilung Für Anatomie Und Ontogenie Der Tiere, 67, 1–60.
Wang, J., Raskin, L., Samuels, D. C., Shyr, Y., & Guo, Y. (2015). Genome measures used for quality control are dependent on gene function and ancestry. Bioinformatics (Oxford, England), 31(3), 318–323. doi: 10.1093/bioinformatics/btu668
Weider, L. J., Lampert, W., Wessels, M., Colbourne, J. K., & Limburg, P. (1997). Long-term genetic shifts in a microcrustacean egg bank associated with anthropogenic changes in the Lake Constance ecosystem. Proceedings of the Royal Society of London Series B-Biological Sciences, 264(1388), 1613–1618.
Wells, D., Sherlock, J. K., Handyside, A. H., & Delhanty, J. D. (1999). Detailed chromosomal and molecular genetic analysis of single cells by whole genome amplification and comparative genomic hybridisation. Nucleic Acids Research, 27(4), 1214–1218. doi: 10.1093/nar/27.4.1214
Wickham, H. (2016). ggplot2: Elegant Graphics for Data Analysis (Use R!). In Springer. doi: 10.1007/978-0-387-98141-3
Woyke, T., Sczyrba, A., Lee, J., Rinke, C., Tighe, D., Clingenpeel, S., … Cheng, J.-F. (2011). Decontamination of MDA reagents for single cell whole genome amplification. PLoS ONE, 6(10), e26161. doi: 10.1371/journal.pone.0026161
Xu, S., Ackerman, M. S., Long, H., Bright, L., Spitze, K., Ramsdell, J. S., … Lynch, M. (2015). A male-specific genetic map of the microcrustacean Daphnia pulex based on single-sperm whole-genome sequencing. Genetics, 201(1), 31–38. doi: 10.1534/genetics.115.179028
Xu, Y., & Zhao, F. (2018). Single-cell metagenomics: challenges and applications. Protein & Cell, 9(5), 501–510. doi: 10.1007/s13238-018-0544-5
Zhang, F., Ding, Y., Zhu, C. D., Zhou, X., Orr, M. C., Scheu, S., & Luan, Y. X. (2019). Phylogenomics from low-coverage whole-genome sequencing. Methods in Ecology and Evolution. doi: 10.1111/2041-210X.13145
Zhang, L., Cui, X., Schmitt, K., Hubert, R., Navidi, W., & Arnheim, N. (1992). Whole genome amplification from a single cell: Implications for genetic analysis. Proceedings of the National Academy of Sciences of the United States of America. doi: 10.1073/pnas.89.13.5847
Zheng, X., Gogarten, S. M., Lawrence, M., Stilp, A., Conomos, M. P., Weir, B. S., … Levine, D. (2017). SeqArray-a storage-efficient high-performance data format for WGS variant calls. Bioinformatics. doi: 10.1093/bioinformatics/btx145
Zheng, X., Levine, D., Shen, J., Gogarten, S. M., Laurie, C., & Weir, B. S. (2012). A high-performance computing toolset for relatedness and principal component analysis of SNP data. Bioinformatics, 28(24), 3326–3328. doi: 10.1093/bioinformatics/bts606

Data accessibility and Benefit-Sharing Statement
The sequence data supporting the findings reported here will be made available in a suitable public data repository such as NCBI GenBank or Dryad upon acceptance of the manuscript.

Author contributions
DF conceived the idea and obtained funding for the study. DF, COG and JKC designed the experiments. COG performed the lab work. DF and COG analysed the data and wrote the paper with input from JKC and VD.
Fig. 1. A-E. Visualisation of the genome-wide normalised coverage pattern resulting from WGA-DNA of dormant eggs of D. magna and D. pulicaria, with comparison of two commercial MDA kits. A) Seven D. magna samples (DM1–DM7, dormant eggs from cultures), amplified with REPLI-g (five outer rings, light blue) or TruePrime (two inner rings, orange). Bars represent normalised coverage (SE sequencing) in 100 kb bins on the N50 scaffolds. B) Detail of A) for the largest D. magna scaffold (scaffold 512). Bars represent normalised coverage in 10 kb bins. C) Five D. pulicaria samples (DP1–DP5, eggs isolated from sediment of different ages), amplified with REPLI-g (10-year-old egg, light blue; 300-year-old eggs, blue) or TruePrime (two inner rings, orange). Bars represent normalised coverage (SE sequencing) in 100 kb bins on the N50 scaffolds. D) Detail of C) for the largest D. pulex scaffold (scaffold 1). Bars represent normalised coverage in 10 kb bins.
[Figure 1 image: coverage plots, panels A–E. Axis labels: D. magna normalised coverage (SE); D. pulicaria normalised coverage (PE); coverage ticks 0–500. Detail panels: Scaffold 512 (D. magna), Scaffold 1 (D. pulicaria). Sample tracks: DM1–DM7 and DP1–DP9. Legend: REPLI-g, fresh resting egg, culture; TruePrime, fresh resting egg, culture; TruePrime, ~10 y old egg; TruePrime, ~180 y old egg.]
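The binning behind the coverage bars in Fig. 1 can be sketched as follows. This is a minimal illustration, not the authors' script: the function names, the toy read positions and the normalisation (dividing each bin by the mean bin count) are assumptions, since the caption does not state how coverage was normalised.

```python
from collections import Counter


def binned_counts(read_starts, scaffold_length, bin_size=10_000):
    """Count reads per fixed-width bin along one scaffold (0-based positions)."""
    n_bins = -(-scaffold_length // bin_size)  # ceiling division
    counts = Counter(pos // bin_size for pos in read_starts)
    return [counts.get(i, 0) for i in range(n_bins)]


def normalise(counts):
    """Scale bin counts so that the mean bin equals 1 (assumed normalisation)."""
    mean = sum(counts) / len(counts)
    if mean == 0:
        return [0.0] * len(counts)
    return [c / mean for c in counts]


# Toy example: read start positions on a hypothetical 35 kb scaffold.
starts = [120, 4_500, 9_999, 10_000, 25_300, 31_000]
counts = binned_counts(starts, scaffold_length=35_000)
print(counts)  # [3, 1, 1, 1]
print(normalise(counts))
```

Setting `bin_size=100_000` would correspond to the 100 kb bins of the genome-wide panels, and `bin_size=10_000` to the single-scaffold detail panels.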
Fig. 2. Correlation matrix (Pearson's r) of pairwise comparisons between normalised read counts of samples in 10 kb bins. A) Pairwise comparisons of D. magna samples. B) Pairwise comparisons of D. pulicaria samples. All pairwise comparisons had an associated p-value < 0.01.
[Figure 2 image: correlation heatmaps. A) D. magna samples DM1–DM7, grouped by kit (REPLI-g, TruePrime), SE sequencing. B) D. pulicaria samples DP1–DP9, grouped by sequencing strategy (SE, PE). Colour scale: Pearson's r, legend ticks at 0.25, 0.50, 0.75 and 0.90.]
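The pairwise comparison summarised in Fig. 2 amounts to computing Pearson's r between two vectors of normalised per-bin read counts. The sketch below shows one way to do this in plain Python; the sample names and values are hypothetical, and the p-values reported in the caption are not computed here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(samples):
    """Return {(name_a, name_b): r} for every ordered pair of samples."""
    names = sorted(samples)
    return {(a, b): pearson_r(samples[a], samples[b]) for a in names for b in names}


# Hypothetical normalised counts in five 10 kb bins for three samples.
bins = {
    "S1": [0.8, 1.2, 1.0, 0.5, 1.5],
    "S2": [0.9, 1.1, 1.0, 0.6, 1.4],
    "S3": [1.4, 0.7, 1.0, 1.3, 0.6],
}
for pair, r in correlation_matrix(bins).items():
    print(pair, round(r, 3))
```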
Fig. 3. Taxon identity of non-target DNA amplified from a D. pulicaria dormant egg (DP5).
[Figure 3 image: taxonomic chart of non-target reads. Labelled shares: Homo sapiens 81%; Daphnia magna 4%; Naegleria gruberi 1%; Streptococcus mitis B6 1%; S. pseudopneumoniae 1%; Dehalobacter sp. C 1%; Leptospira biflexa serovar 1%; synthetic construct 1%. Further taxa are collapsed in the chart ("4 more", "5 more").]
Fig. 4. Results of SNP analysis in D. magna and D. pulicaria. Venn diagrams (A-C) represent the number of genomic positions where SNPs were located in one or more samples and that were sequenced in all or a subset of samples (shared and unique genomic positions). This comparison was used to estimate the repeatability of whole genome amplification and thus the resulting capacity to call variants at multi-sample level. A) Comparison between the three D. magna samples with the highest number of SNPs identified, amplified by REPLI-g (DM2, DM3) and TruePrime (DM7). B) Comparison between the two D. magna samples amplified by TruePrime (DM6, DM7). C) Comparison between four D. pulicaria samples, amplified with TruePrime. D) Transition:transversion ratio analysed for SNPs of both Daphnia species.
[Figure 4 image: Venn diagrams (A–C) and transition:transversion plot (D).
A) Daphnia magna, REPLI-g (DM2, DM3) and TruePrime (DM7). DM2 ∩ DM3 ∩ DM7: 268,071; DM2 ∩ DM3: 78,314; DM2: 2,027; DM3: 5,188; DM7: 3,801. Two further values, 79,336 and 3,706, both carry the label DM3 ∩ DM7 in the extracted figure text.
B) Daphnia magna, TruePrime. DM6 ∩ DM7: 96,366; DM6: 3,561; DM7: 261,481.
C) Daphnia pulicaria, DP6–DP9. The largest region holds 1,712,889 positions (97.41%); each of the other 14 regions holds between 994 (0.06%) and 7,765 (0.44%) positions.
D) Transition:transversion ratio (TiTv, axis 0.0–1.5) by species (magna, pulicaria) for samples DM1, DM2, DM3, DM5, DM6, DM7 and DP6–DP9, grouped by kit (REPLI-g, TruePrime).]
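The two quantities shown in Fig. 4 can be illustrated with a short sketch: counting how many positions fall into each Venn region, and computing a transition:transversion ratio from reference/alternate base pairs. All names and toy values are hypothetical. Each input set is assumed to hold the SNP positions that were sequenced at genotyping depth in that sample; how that depth threshold was applied is not shown here.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def venn_regions(positions_by_sample):
    """Count positions per exact combination of samples (one Venn region each)."""
    membership = {}
    for sample, positions in positions_by_sample.items():
        for pos in positions:
            membership.setdefault(pos, set()).add(sample)
    return Counter(frozenset(samples) for samples in membership.values())


def titv_ratio(snps):
    """Transitions / transversions for an iterable of (ref, alt) base pairs."""
    ti = tv = 0
    for ref, alt in snps:
        if ref == alt:
            continue  # not a substitution
        pair = {ref, alt}
        if pair <= PURINES or pair <= PYRIMIDINES:
            ti += 1  # A<->G or C<->T
        else:
            tv += 1  # purine <-> pyrimidine
    if tv == 0:
        raise ValueError("no transversions observed; ratio undefined")
    return ti / tv


# Toy positions as (scaffold, coordinate) tuples for three hypothetical samples.
toy = {
    "S1": {("sc1", 10), ("sc1", 20), ("sc1", 30)},
    "S2": {("sc1", 20), ("sc1", 30), ("sc2", 5)},
    "S3": {("sc1", 30), ("sc2", 9)},
}
for region, n in venn_regions(toy).items():
    print(sorted(region), n)

print(titv_ratio([("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]))  # 1.0
```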
Fig. 5. SNP positions and pairwise clustering of individuals by Identity-by-State (IBS) of nine D. pulicaria mitochondrial genomes recovered from historical sedimentary dormant eggs from two lakes in West Greenland (Lake SS1381, green label, and SS4, blue label). Samples with yellow labels were amplified with TruePrime, and the remaining three with REPLI-g. A) Circular plot of detected SNP loci. The four outer rings represent samples from Lake SS1381, the five inner rings from SS4. SNP colours are labelled according to their state (turquoise = homozygous for the variant allele; grey = homozygous for the reference allele; blue = heterozygous, with both reference and variant allele present). B) Hierarchical clustering tree resulting from the fraction of identical genomic positions (Identity-by-State) in the nine mitochondrial genomes compared.
[Figure 5 image: Daphnia pulicaria, mtDNA (SNPs). A) Circular plot; genomic position (mtDNA) axis from 0 to 15,200 in steps of 400; rings for DP6–DP9 (Lake SS1381) and DP1–DP5 (Lake SS4); allele legend: Ref, Alt, Ref/Alt. B) IBS clustering tree; scale labels 0.00, 0.005, 0.01 and 1.00; tip labels DP4, DP5, DP3, DP2, DP1, DP7, DP6, DP8, DP9.]
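The Identity-by-State measure described in the Fig. 5 caption, the fraction of identical genomic positions between two samples, can be sketched as below. This is a simplified illustration with hypothetical names and toy genotypes; dedicated packages may define IBS differently (for example, by counting shared alleles rather than identical genotype states), and the hierarchical clustering step is not shown.

```python
from itertools import combinations


def ibs_fraction(g1, g2):
    """Fraction of positions with identical genotype state in both samples.

    Genotype states are 'ref', 'alt' or 'het'; None marks a missing call and
    the position is then skipped for that pair.
    """
    if len(g1) != len(g2):
        raise ValueError("genotype vectors must have equal length")
    compared = identical = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no positions called in both samples")
    return identical / compared


def pairwise_ibs(genotypes):
    """Return {(sample_a, sample_b): IBS fraction} for all unordered pairs."""
    return {
        (a, b): ibs_fraction(genotypes[a], genotypes[b])
        for a, b in combinations(sorted(genotypes), 2)
    }


# Toy genotypes at four SNP loci for three hypothetical samples.
toy = {
    "S1": ["ref", "alt", "het", "alt"],
    "S2": ["ref", "alt", "het", "ref"],
    "S3": ["alt", "alt", None, "ref"],
}
for pair, value in pairwise_ibs(toy).items():
    print(pair, round(value, 3))
```

A distance of `1 - IBS` for each pair could then be passed to any standard hierarchical clustering routine.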
Fig. 6. Recommended workflow for whole genome amplification and whole genome sequencing with downstream variant calling and variant filtering. For recommended sequencing depth of higher-ploidy genomes, see Maruki & Lynch (2017). For further details see text.
[Figure 6 image: workflow diagram. Text recovered from the figure, grouped by stage:

PRE-TREATMENT (laminar flow hood)
- Starting material: individual dormant egg (fresh egg, impeccable quality; or sedimentary egg, variable quality)
- Wash steps shown: serial PBS wash (5-10x); 10' bleach wash; final PBS rinse(s)

WGA, LIBRARIES, WGS
- Transfer egg to PBS volume required by WGA kit
- Pierce egg with sterile pipette tip (e.g. 10 μl tip) to break membrane and expose contents
- REPLI-g scWGA or TruePrime scWGA
- PCR-free paired end library preparation and HTP sequencing to desired depth (min. 5-10x for diploids, >20x for triploids)

BIOINFORMATICS (example workflow)
- Quality control, raw reads: FastQC, trimming (adapter, low quality bases)
- Mapping: BWA-mem
- Quality control, mapping: qualimap (mapping stats)
- Variant call: freebayes (diploids & polyploids)
- Hard filter with vcffilter for diploids & polyploids
- SNP analysis: various tools for population genomic analysis, e.g. SeqArray for IBS (diploids & polyploids)]
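The bioinformatics stage of the workflow in Fig. 6 could be strung together roughly as follows. This is a hedged sketch that only assembles command lines and prints them; it is not the authors' pipeline. The file names, thread count and the `QUAL > 20` filter are hypothetical, `samtools` is an added assumption for sorting and indexing (it is not named in the figure), and the trimming step is left out because the figure does not name a trimming tool.

```python
def workflow_commands(sample, ref, r1, r2, ploidy, threads=4):
    """Build shell command strings for one sample (paired-end reads r1, r2).

    Assumes the reference has already been indexed with `bwa index` and that
    the output directories exist. Thresholds and paths are placeholders.
    """
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf"
    return [
        # Raw-read quality control.
        f"fastqc {r1} {r2} -o qc_{sample}",
        # Mapping with BWA-MEM, sorted with samtools (assumed helper tool).
        f"bwa mem -t {threads} {ref} {r1} {r2} | samtools sort -o {bam} -",
        f"samtools index {bam}",
        # Mapping statistics.
        f"qualimap bamqc -bam {bam} -outdir qualimap_{sample}",
        # Variant calling; -p sets the ploidy (2 for diploids, 3 for triploids).
        f"freebayes -f {ref} -p {ploidy} {bam} > {vcf}",
        # Hard filter; the expression is a placeholder, not a recommendation.
        f'vcffilter -f "QUAL > 20" {vcf} > {sample}.filtered.vcf',
    ]


# Example: a hypothetical triploid sample.
for cmd in workflow_commands("egg1", "ref.fa", "egg1_R1.fq.gz", "egg1_R2.fq.gz", ploidy=3):
    print(cmd)
```

Printing the commands rather than executing them keeps the sketch independent of any particular installation; each line can be reviewed and adapted before it is run.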
Table 1. Species identity and specifics of dormant eggs used as template in whole genome amplification. Additional information is given on pre-treatment (either the concentration of the bleach solution and the exposure time, or 1X PBS buffer; e.g. "10%, < 2 sec" indicates exposure for less than 2 seconds to a 10% bleach solution made from 12% industrial bleach), the applied WGA kit and the total amount of WGA-DNA obtained. SS4 and SS1381 are two lakes in West Greenland near Kangerlussuaq (for details see Methods) from which sediment cores with ephippia were extracted.
sample  species       ploidy  age             pre-treatment  scWGA kit  µg DNA
DM1     D. magna      2n      lab             10%, < 2 sec   REPLI-g    22.27
DM2     D. magna      2n      lab             10%, 20 sec    REPLI-g    28.66
DM3     D. magna      2n      lab             10%, < 2 sec   REPLI-g    37.93
DM4     D. magna      2n      lab             5%, < 2 sec    REPLI-g    33.58
DM5     D. magna      2n      lab             5%, 20 sec     REPLI-g    22.80
DM6     D. magna      2n      lab             10%, < 2 sec   TruePrime  7.06
DM7     D. magna      2n      lab             10%, < 2 sec   TruePrime  7.48
DP1     D. pulicaria  3n      ~10y (SS4)      10%, < 2 sec   REPLI-g    39.48
DP2     D. pulicaria  3n      ~140y (SS4)     5%, < 2 sec    TruePrime  9.21
DP3     D. pulicaria  3n      ~140y (SS4)     10%, < 2 sec   TruePrime  9.18
DP4     D. pulicaria  3n      ~300y (SS4)     10%, < 2 sec   REPLI-g    42.85
DP5     D. pulicaria  3n      ~300y (SS4)     10%, < 2 sec   REPLI-g    26.50
DP6     D. pulicaria  3n      ~10y (SS1381)   1X PBS         TruePrime  8.30
DP7     D. pulicaria  3n      ~10y (SS1381)   1X PBS         TruePrime  6.00
DP8     D. pulicaria  3n      ~180y (SS1381)  1X PBS         TruePrime  6.64
DP9     D. pulicaria  3n      ~180y (SS1381)  1X PBS         TruePrime  5.14
Table 2. Mapping details and number of variants called in a) nuclear DNA of D. magna (DM) and D. pulicaria (DP), b) mitochondrial DNA of D. pulicaria.
age = see Table 1; kit = WGA kit (R-g = REPLI-g, TP = TruePrime); Seq = sequencing strategy (SE = single end, PE = paired end); number of reads = total number of reads used for mapping; mapped reads % = percentage of reads mapped to the respective reference genome; coverage breadth % = percentage of the reference genome with at least 1x coverage; coverage depth = mean number of reads per genomic position; SNP loci = number of single nucleotide variants compared to the respective reference genome.
a) Nuclear genome D. magna and D. pulicaria

sample  age    kit  Seq  number of reads  mapped reads %  coverage breadth %  coverage depth  SNP loci
DM1     lab    R-g  SE   8,488,215        98.72           74.95               6               301,779
DM2     lab    R-g  SE   17,742,882       98.76           77.43               13              414,782
DM3     lab    R-g  SE   24,069,767       98.74           78.63               18              430,909
DM4     lab    R-g  SE   38,765           55.73           1.47                0.02            NA
DM5     lab    R-g  SE   10,167,137       98.64           76.36               7               339,023
DM6     lab    TP   SE   6,392,740        99.18           66.75               5               99,927
DM7     lab    TP   SE   14,800,247       99.04           78.05               11              357,847
DP1     ~10y   R-g  SE   12,990,625       48.36           41.45               3               NA*
DP2     ~140y  TP   SE   12,088,977       87.36           50.11               5               NA*
DP3     ~140y  TP   SE   16,643,733       84.62           51.82               7               NA*
DP4     ~300y  R-g  SE   8,579,357        0.36            0.33                0.01            NA*
DP5     ~300y  R-g  SE   7,950,283        2.55            0.93                0.07            NA*
DP6     ~10y   TP   PE   85,442,043       90.53           69.21               51              1,732,886
DP7     ~10y   TP   PE   134,031,117      90.95           69.58               78              1,734,987
DP8     ~180y  TP   PE   95,665,657       91.65           69.50               58              1,742,463
DP9     ~180y  TP   PE   93,262,683       91.85           69.55               57              1,741,613

*no variants called in these samples due to insufficient sequencing depth for triploid variants
b) Mitochondrial genome D. pulicaria

sample  age    Seq  mapped reads %  coverage breadth %  coverage depth  SNP loci
DP1     ~10y   SE   73.87           99.97               61,542          200
DP2     ~140y  SE   0.68            99.94               498             200
DP3     ~140y  SE   0.25            99.93               252             200
DP4     ~300y  SE   0.13            99.93               71              200
DP5     ~300y  SE   4.55            99.95               2,315           200
DP6     ~10y   PE   3.32            99.97               25,740          200
DP7     ~10y   PE   0.86            99.95               10,240          200
DP8     ~180y  PE   0.72            99.97               6,155           200
DP9     ~180y  PE   0.62            99.95               5,197           200
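The two coverage measures defined in the Table 2 caption can be computed from a per-position depth vector, as in the minimal sketch below. The function name and the toy depths are hypothetical; in practice such a vector would come from a depth-reporting tool run on the mapped reads.

```python
def coverage_summary(depths):
    """Return coverage breadth (% of positions with depth >= 1) and mean depth."""
    n = len(depths)
    if n == 0:
        raise ValueError("empty depth vector")
    covered = sum(1 for d in depths if d >= 1)
    return {
        "breadth_pct": 100.0 * covered / n,
        "mean_depth": sum(depths) / n,
    }


# Toy reference of eight positions; three positions have no reads.
summary = coverage_summary([0, 0, 3, 5, 2, 0, 10, 4])
print(summary)  # {'breadth_pct': 62.5, 'mean_depth': 3.0}
```

Note that the mean depth here is taken over all reference positions, including uncovered ones, which matches the caption's "mean number of reads per genomic position".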