Ray Tobler

Australian National University | ANU · College of Asia & the Pacific

PhD

About

107
Publications
29,824
Reads
2,302
Citations
Additional affiliations
January 2016 - January 2019
Australian Centre for Ancient DNA
Position
  • Fellow
January 2013 - April 2013
Stanford University
Position
  • Visiting Student Researcher
April 2010 - December 2015
University of Veterinary Medicine, Vienna
Position
  • PhD Student

Publications (107)
Article
Full-text available
In‐solution hybridisation enrichment of genetic variation is a valuable methodology in human paleogenomics. It allows enrichment of endogenous DNA by targeting genetic markers that are comparable between sequencing libraries. Many studies have used the 1240k reagent—which enriches 1,237,207 genome‐wide SNPs—since 2015, though access was restricted....
Preprint
Full-text available
In-solution hybridisation enrichment of genetic variation is a valuable methodology in human paleogenomics. It allows enrichment of endogenous DNA by targeting genetic markers that are comparable between sequencing libraries. Many studies have used the 1240k reagent, which enriches 1,237,207 genome-wide SNPs, since 2015, though access was restricted....
Article
Full-text available
The evolutionarily recent dispersal of anatomically modern humans (AMH) out of Africa (OoA) and across Eurasia provides a unique opportunity to examine the impacts of genetic selection as humans adapted to multiple new environments. Analysis of ancient Eurasian genomic datasets (~1,000 to 45,000 y old) reveals signatures of strong selection, includ...
Article
Recent studies of cosmopolitan Drosophila populations have found hundreds to thousands of genetic loci with seasonally fluctuating allele frequencies, bringing temporally fluctuating selection to the forefront of the historical debate surrounding the maintenance of genetic variation in natural populations. Numerous mechanisms have been explored in...
Article
Full-text available
Genomic sequence data from worldwide human populations have provided a range of novel insights into our shared ancestry and the historical migrations that have shaped our global genetic diversity. However, a comprehensive understanding of these fundamental questions has been impeded by the lack of inclusion of many Indigenous populations in genomic...
Article
Full-text available
The role of natural selection in shaping biological diversity is an area of intense interest in modern biology. To date, studies of positive selection have primarily relied on genomic datasets from contemporary populations, which are susceptible to confounding factors associated with complex and often unknown aspects of population history. In parti...
Preprint
Full-text available
We introduce Dual Coordinate VCF (DVCF), a file format that records genomic variants against two different reference genomes simultaneously and is fully compliant with the current VCF specification. As implemented in the Genozip platform, DVCF enables bioinformatics pipelines to seamlessly operate across two coordinate systems by leveraging the sys...
Article
Full-text available
Xu et al. (2021) recently recommended a new parameterization of BWA-mem as a superior alternative to the widely-used BWA-aln algorithm to map ancient DNA sequencing data. Here, we compare the BWA-mem parameterization recommended by Xu et al. with the best-performing alignment methods determined in the recent benchmarks of Oliva and colleagues (2021...
Article
Full-text available
Our paper about the impacts of the Laschamps Geomagnetic Excursion 42,000 years ago has provoked considerable scientific and public interest, particularly in the so-called Adams Event associated with the initial transition of the magnetic poles. Although we welcome the opportunity to discuss our new ideas, Hawks’ assertions of misrepresentation are...
Article
Full-text available
Our study on the exact timing and the potential climatic, environmental, and evolutionary consequences of the Laschamps Geomagnetic Excursion has generated the hypothesis that geomagnetism represents an unrecognized driver in environmental and evolutionary change. It is important for this hypothesis to be tested with new data, and encouragingly, no...
Article
Full-text available
We are a group of archaeologists, anthropologists, curators and geneticists representing diverse global communities and 31 countries. All of us met in a virtual workshop dedicated to ethics in ancient DNA research held in November 2020. There was widespread agreement that globally applicable ethical guidelines are needed, but that recent recommenda...
Preprint
Full-text available
The evolutionarily recent dispersal of Anatomically Modern Humans (AMH) out of Africa and across Eurasia provides an opportunity to study rapid genetic adaptation to multiple new environments. Genomic analyses of modern human populations have detected limited signals of strong selection such as hard sweeps, but genetic admixture between populations...
Research
Full-text available
This PDF file includes: • Figs. S1 to S14 from the Systematic benchmark of ancient DNA read mapping paper • Supplementary Text. Tables S1 and S2 are provided as separate files.
Data
Table S1 and S2 from the Systematic benchmark of ancient DNA read mapping paper
Preprint
Full-text available
Xu and colleagues (Xu et al., 2021) recently suggested a new parameterisation of BWA-mem (Li, 2013) as an alternative to the current standard BWA-aln (Li and Durbin, 2009) to process ancient DNA sequencing data. The authors tested several combinations of the -k and -r parameters to optimise BWA-mem’s performance with degraded and contaminated anci...
Article
Principal component analysis (PCA) is a powerful tool for the analysis of population structure, a genetic property that is essential to understand the evolutionary processes driving biological diversification and (pre)historical colonizations, migrations and extinctions. In the current era of high‐throughput sequencing technologies, population stru...
Article
Full-text available
The tropical archipelago of Wallacea contains thousands of individual islands interspersed between mainland Asia and Near Oceania, and marks the location of a series of ancient oceanic voyages leading to the peopling of Sahul—i.e., the former continent that joined Australia and New Guinea at a time of lowered sea level—by 50,000 years ago. Despite...
Article
Full-text available
The current severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic has emphasized the vulnerability of human populations to novel viral pressures, despite the vast array of epidemiological and biomedical tools now available. Notably, modern human genomes contain evolutionary information tracing back tens of thousands of years, which...
Article
Full-text available
The hominin fossil record of Island Southeast Asia (ISEA) indicates that at least two endemic ‘super-archaic’ species—Homo luzonensis and H. floresiensis—were present around the time anatomically modern humans arrived in the region >50,000 years ago. Intriguingly, contemporary human populations across ISEA carry distinct genomic traces of ancient i...
Article
Full-text available
The current standard practice for assembling individual genomes involves mapping millions of short DNA sequences (also known as DNA ‘reads’) against a pre-constructed reference genome. Mapping vast amounts of short reads in a timely manner is a computationally challenging task that inevitably produces artefacts, including biases against alleles not...
Article
Full-text available
Reversing the field: Do terrestrial geomagnetic field reversals have an effect on Earth's climate? Cooper et al. created a precisely dated radiocarbon record around the time of the Laschamps geomagnetic reversal about 41,000 years ago from the rings of New Zealand swamp kauri trees. This record reveals a substantial increase in the carbon-14 content...
Article
Full-text available
Supplementary Material for 'A global environmental crisis 42,000 years ago' Geological archives record multiple reversals of Earth’s magnetic poles, but the global impacts of these events, if any, remain unclear. Uncertain radiocarbon calibration has limited investigation of the potential effects of the last major magnetic inversion, known as the...
Article
Full-text available
We present Genozip, a universal and fully featured compression software for genomic data. Genozip is designed to be a general-purpose software and a development framework for genomic compression by providing five core capabilities – universality (support for all common genomic file formats), high compression ratios, speed, feature-richness, and ext...
Preprint
Full-text available
The current SARS-CoV-2 pandemic has emphasized the vulnerability of human populations to novel viral pressures, despite the vast array of epidemiological and biomedical tools now available. Notably, modern human genomes contain evolutionary information tracing back tens of thousands of years, which may help identify the viruses that have impacted o...
Preprint
Full-text available
The hominin fossil record of Island Southeast Asia (ISEA) indicates that at least two endemic super-archaic species, Homo luzonensis and H. floresiensis, were present around the time anatomically modern humans (AMH) arrived in the region >50,000 years ago. Contemporary human populations carry signals consistent with interbreeding events with Deniso...
Article
Full-text available
Motivation: genozip is a new lossless compression tool for VCF (Variant Call Format) files. By applying field-specific algorithms and fully utilizing the available computational hardware, genozip achieves the highest compression ratios amongst existing lossless compression tools known to the authors, at speeds comparable with the fastest multi-thr...
Preprint
We introduce PolyLinkR, an R package for gene set enrichment analysis that implements a novel null-model that accounts for linkage disequilibrium between genes belonging to the same gene set - a potential cause of false positives that is often not controlled for in similar tools. Our benchmarks show that PolyLinkR has improved performance compared...
Preprint
Full-text available
The role of selection in shaping genetic diversity in natural populations is an area of intense interest in modern biology, especially the characterization of adaptive loci. Within humans, the rapid increase in genomic information has produced surprisingly few well-defined adaptive loci, promoting the view that recent human adaptation involved nume...
Preprint
Full-text available
Background: Recombinase Polymerase Amplification (RPA) is a relatively new isothermal methodology for amplifying DNA. RPA is similar to traditional PCR in that it produces an amplicon that is defined by the annealing of two opposing oligonucleotide primers. However, while PCR relies on repeated heating and cooling cycles to denature and amplify DN...
Preprint
Full-text available
Background: Recombinase Polymerase Amplification (RPA) is a relatively new isothermal methodology for amplifying DNA. RPA is similar to traditional PCR in that it produces an amplicon that is defined by the annealing of two opposing oligonucleotide primers. However, while PCR relies on repeated heating and cooling cycles to denature and amplify DN...
Article
Full-text available
The genetic architecture of adaptive traits is of key importance to predict evolutionary responses. Most adaptive traits are polygenic—i.e., result from selection on a large number of genetic loci—but most molecularly characterized traits have a simple genetic basis. This discrepancy is best explained by the difficulty in detecting small allele fre...
Data
Selection coefficients (s) of selected alleles using different approaches to estimate the frequency of a given selected allele. The median frequency of each allele (the median frequency of all marker SNPs of a selected allele) was computed, and the frequency trajectory of replicates with ≥0.1 (method 1) and ≥0.2 (method 2) AFC until generation 60 w...
Data
Genomic heterogeneity of simulations based on a sweep paradigm with linkage and a constant s across replicates. RFS shows the frequency distribution of replicates in which selected alleles increase in frequency. RFS of experimental data (observed) is indicated by salmon dots. The expected distribution of RFS was obtained by computer simulations (se...
Data
Comparison of the genetic heterogeneity (A–B) and replicate similarity (C–D) of the selective sweep and QT paradigm simulations to the observed data. (A, B) The difference between RFS of empirical (observed) and the simulated (expected) data. For 1,000 iterations of each simulation, the difference between empirical and simulated RFS, Σ(obs − exp)²,...
Data
Characteristics of selected alleles. Starting frequency (top panel) and selection coefficient (bottom panel) of the selected alleles classified by the number of replicates in which a given selected allele has ≥0.1 frequency increase at generation 60 (method 1 in Materials and methods “Different approaches to determine the presence of selected allel...
Data
Fitness functions used for the simulation of QT paradigm. (A) Gaussian fitness function used in QT paradigm without linkage (D in S4 Fig) optimum phenotype = 0.6, standard deviation = 0.3, and fitness range from 0.5 to 4.5. (B) Gaussian fitness function used in QT paradigm with linkage (E in S4 Fig) optimum phenotype = −1.3, standard deviation = 1....
Data
Increased fitness and phenotypic similarity among 10 evolved replicates. (A) Evolved females are more fecund than the ancestral population (ANCOVA, Tukey’s HSD test p < 0.0001). The number of eggs laid over four days (two to five days after eclosion) were counted, (B) Females of 10 evolved replicates are equally fecund (ANCOVA, Tukey’s HSD test, p...
Data
Different simulation scenarios used to contrast selective sweep and QT paradigms. We compare different adaptive sweep and QT scenarios to the empirical data: selective sweep simulations of alleles without (panel A) and with (panel B) linkage were studied, as well as different aspects of a QT paradigm: genetic redundancy (panel C) and simulations of...
Data
Absolute frequency difference of the identified P-elements (in selected alleles) and selected alleles. Chr.: left and right arms of chromosomes are concatenated as some haplotype blocks span the centromere. No.: an arbitrary number given to the haplotype block of each chromosome (similar to the numbers in S1 Table). Delta: the absolute frequency di...
Data
Estimated Ne in evolved replicates for autosomes and the X chromosome. Ne, effective population size. (XLSX)
Data
Genomic heterogeneity of simulations based on the QT paradigm with linkage among alleles. RFS shows the frequency distribution of replicates in which selected alleles increase in frequency (threshold; A: ≥5% ASFC [method 3 in Materials and methods “Different approaches to determine the presence of selected alleles and their frequencies”]; B: ≥10% A...
Data
Characteristics of the reconstructed haplotype blocks. In cases in which the block spans both arms of the chromosome, both chromosome arms are specified. chr, chromosome; num, an arbitrary number given to the haplotype block of each chromosome; pos (bp), the genomic position of block in the chromosome; size (kb), length of the block in kb; SNP nums...
Data
Enrichment of gene functions in selected alleles. “*” indicates uncorrected for multiple testing. “$” indicates p-value after adjustment for multiple testing. (XLSX)
Data
Enrichment of KEGG pathways in selected alleles. “*” indicates uncorrected for multiple testing. “$” indicates p-value after adjustment for multiple testing. (XLSX)
Data
Summary of regression models to identify factors affecting the estimated selection coefficient (s). (A) The size of haplotype block is used as physical or genetic distance. (B) The estimated s was computed based on the frequency trajectory of a selected allele in replicates with ≥5% and ≥10% ASFC (methods 3 and 4 in Materials and methods “Different...
Data
Details of DNA extraction and library preparation for Pool-Seq (A) and haplotype samples (B). (A) For the founder population, nomenclature is as follows: species_population_selectionRegime_replicate (e.g., Dsim_Fl_Base_1), and for the evolved populations, nomenclature is as follows: species_population_selectionRegime_generation_replicate (e.g., Dsi...
Data
The coverage of SNPs for all time points and replicates. (XLSX)
Data
Size distribution of the reconstructed haplotype blocks in evolved replicates. Fifty percent of the haplotype blocks were smaller than 100 Kb, but approximately 25% were larger than 1 Mb. Data available in S1 Table. (PNG)
Data
Genomic heterogeneity in evolved replicates. The RFS shows the frequency distribution of replicates in which selected alleles increase in frequency. Different thresholds were used to identify an allele as selected in each replicate; Top panel: ≥0.1 (method 1) and ≥0.2 AFC (method 2): an allele with ≥0.1/0.2 frequency change, bottom panel: ≥5% (meth...
Data
Genomic heterogeneity of simulations based on a selective sweep paradigm with a constant s across replicates and no linkage. The RFS shows the frequency distribution of replicates in which selected alleles increase in frequency. The RFS of experimental data (observed) is indicated by salmon dots. The expected distribution of RFS was obtained by com...
Data
Genomic heterogeneity of simulations based on the redundancy paradigm. The RFS shows the frequency distribution of replicates in which selected alleles increase in frequency (threshold; A: ≥5% ASFC [method 3 in Materials and methods “Different approaches to determine the presence of selected alleles and their frequencies”]; B: ≥10% ASFC [method 4])...
Data
Genomic heterogeneity of simulations based on the QT paradigm without linkage among alleles. The RFS shows the frequency distribution of replicates in which selected alleles increase in frequency (threshold; A: ≥5% ASFC [method 3 in Materials and methods “Different approaches to determine the presence of selected alleles and their frequencies”]; B:...
Article
Full-text available
Background: Population genetic theory predicts that rapid adaptation is largely driven by complex traits encoded by many loci of small effect. Because large-effect loci are quickly fixed in natural populations, they should not contribute much to rapid adaptation. Results: To investigate the genetic architecture of thermal adaptation — a highly com...
Preprint
Full-text available
Principal components analysis (PCA) has been one of the most widely used exploration tools in genomic data analysis since its introduction in 1978 (Menozzi et al. 1978). PCA allows similarities between individuals to be efficiently calculated and visualized, optimally in two dimensions. While PCA is well suited to analyses concerned with autosomal...
Preprint
Full-text available
The genetic architecture of adaptive traits is of key importance to predict evolutionary responses. Most adaptive traits are polygenic - i.e. result from selection on a large number of genetic loci - but most molecularly characterized traits have a simple genetic basis. This discrepancy is best explained by the difficulty in detecting small allele...
Article
Full-text available
The first tracking of the dynamics of a natural invasion by a transposable element (TE) provides unprecedented details on the establishment of host defense mechanisms against TEs. We captured a D. simulans population at an early stage of a P-element invasion and studied the spread of the TE in replicated experimentally evolving populations kept und...
Article
Significance: Using a powerful method that uses inexpensive short reads to detect Y-linked transfers, we show that gene traffic onto the Drosophila Y chromosome is 10 times more frequent than previously thought and includes the first Y-linked retrocopies discovered in these taxa. All 25 identified Y-linked gene transfers were relatively young (<1 mi...
Preprint
Full-text available
Population genetic theory predicts that rapid adaptation is largely driven by complex traits encoded by many loci of small effect. Because large effect loci are quickly fixed in natural populations, they should not contribute much to rapid adaptation. To investigate the genetic architecture of thermal adaptation - a highly complex trait - we perfor...
Article
Full-text available
The combination of experimental evolution with high-throughput sequencing of pooled individuals - i.e. Evolve and Resequence; E&R - is a powerful approach to study adaptation from standing genetic variation under controlled, replicated conditions. Nevertheless, E&R studies in Drosophila melanogaster have frequently resulted in inordinate numbers of...
Article
Full-text available
Aboriginal Australians represent one of the longest continuous cultural complexes known. Archaeological evidence indicates that Australia and New Guinea were initially settled approximately 50 thousand years ago (ka); however, little is known about the processes underlying the enormous linguistic and phenotypic diversity within Australia. Here we r...
Data
Figure S1. Quantile‐Quantile plot of simulated versus empirical allele frequency changes (AFC) for D. melanogaster. Figure S2. Distribution of simulated and empirical allele frequency changes (AFC) for D. simulans.
Article
Experimental evolution is a powerful tool to study adaptation under controlled conditions. Laboratory natural selection experiments mimic adaptation in the wild with better-adapted genotypes having more offspring. Because the selected traits are frequently not known, adaptation is typically measured as fitness increase by comparing evolved populati...
Article
Populations arrayed along broad latitudinal gradients often show patterns of clinal variation in phenotype and genotype. Such population differentiation can be generated and maintained by both historical demographic events and local adaptation. These evolutionary forces are not mutually exclusive and can in some cases produce nearly identical patte...
Data
Supplemental Table S1. Results of ANOVA (car R package, see methods) testing for all traits in both assayed generations (Cold 34 & Hot 59 or Cold 44 & Hot 75). Supplemental Table S2. Results of ANOVA (car R package, see methods) testing for all traits except fitness after adjusting all values and adding the assayed generations as an additional fix...
Article
Thermal stress is a pervasive selective agent in natural populations that impacts organismal growth, survival and reproduction. Drosophila melanogaster exhibits a variety of putatively adaptive phenotypic responses to thermal stress in natural and experimental settings; however, accompanying assessments of fitness are typically lacking. Here we qu...
Preprint
Populations arrayed along broad latitudinal gradients often show patterns of clinal variation in phenotype and genotype. Such population differentiation can be generated and maintained by historical demographic events and local adaptation. These evolutionary forces are not mutually exclusive and, moreover, can in some cases produce nearly identical...
Article
Full-text available
Whole genome re-sequencing of experimental populations evolving under a specific selection regime has become a popular approach to determine genotype-phenotype maps and understand adaptation to new environments. Despite its conceptual appeal and success in identifying some causative genes, it has become apparent that many studies suffer from an exc...
Article
Full-text available
Evolve and resequence (E&R) is a new approach to investigate the genomic responses to selection during experimental evolution. By using whole genome sequencing of pools of individuals (Pool-Seq), this method can identify selected variants in controlled and replicable experimental settings. Reviewing the current state of the field, we show that E&R...
Article
The analysis of polymorphism data is becoming increasingly important as a complementary tool to classical genetic analyses. Nevertheless, despite plunging sequencing costs, genomic sequencing of individuals at the population scale is still restricted to a few model species. Whole-genome sequencing of pools of individuals (Pool-seq) provides a cost-...
Data
Supplementary Appendix S1 Appendix S2 Appendix S3 Appendix S4 Table S1 Details about the individual flies sequenced. Table S2 Details about the populations analysed with Pool-Seq. Table S3 Influence of temperature on mtDNA and Wolbachia coverage. Fig. S1 Cumulative coverage for clade-specific SNPs. Fig. S2 Phylogenetic relationship of Wolbachi...
Article
Full-text available
The diversity and infection dynamics of the endosymbiont Wolbachia can be influenced by many factors, such as transmission rate, cytoplasmic incompatibility, environment, selection and genetic drift. The interplay of these factors in natural populations can result in heterogeneous infection patterns with substantial differences between populations...