ABSTRACT: Conservation genomics has become an increasingly popular term, yet it remains unclear whether the non-invasive sampling that is essential for many conservation-related studies is compatible with the minimum requirements for harnessing next-generation sequencing technologies. Here, we evaluated the feasibility of using genotyping-by-sequencing of non-invasively collected hair samples to simultaneously identify and genotype single nucleotide polymorphisms (SNPs) in a climate-sensitive mammal, the American pika (Ochotona princeps). We identified and genotyped 3,803 high-confidence SNPs across eight sites distributed along two elevational transects using starting DNA amounts as low as 1 ng. Fifty-five outlier loci were detected as candidate gene regions under divergent selection, constituting potential targets for future validation. Genome-wide estimates of gene diversity significantly and positively correlated with elevation across both transects, with all low elevation sites exhibiting significant heterozygote deficit likely due to inbreeding. More broadly, our results highlight a range of issues that must be considered when pairing genomic data collection with non-invasive sampling, particularly related to field sampling protocols for minimizing exogenous DNA, data collection strategies and quality control steps for enhancing target organism yield, and analytical approaches for maximizing cost-effectiveness and information content of recovered genomic data.
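As a rough illustration of the heterozygote deficit mentioned in the abstract above, the inbreeding coefficient F_IS can be estimated as 1 - Ho/He, where Ho is observed and He is expected heterozygosity. The sketch below assumes biallelic SNP genotypes coded as alternate-allele counts (0, 1, 2) with no missing data; the function names and coding scheme are hypothetical and are not taken from the study.

```python
from typing import Sequence


def observed_heterozygosity(genotypes: Sequence[int]) -> float:
    """Fraction of individuals that are heterozygous (genotype code 1)."""
    return sum(1 for g in genotypes if g == 1) / len(genotypes)


def expected_heterozygosity(genotypes: Sequence[int]) -> float:
    """Hardy-Weinberg expectation 2pq from the sample allele frequency.

    No small-sample correction is applied (an assumption of this sketch).
    """
    p = sum(genotypes) / (2 * len(genotypes))
    return 2 * p * (1 - p)


def f_is(loci: Sequence[Sequence[int]]) -> float:
    """F_IS = 1 - mean(Ho) / mean(He) across loci.

    Positive values indicate a heterozygote deficit; negative values an excess.
    """
    ho = sum(observed_heterozygosity(g) for g in loci) / len(loci)
    he = sum(expected_heterozygosity(g) for g in loci) / len(loci)
    if he == 0:
        raise ValueError("all loci are monomorphic; F_IS is undefined")
    return 1 - ho / he
```

A site in which every individual is homozygous at polymorphic loci gives F_IS = 1, while genotype proportions matching Hardy-Weinberg expectations give F_IS = 0.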
ABSTRACT: Restriction-site Associated DNA (RAD) markers are rapidly becoming a standard for SNP discovery and genotyping studies even in organisms without a sequenced reference genome. It is difficult, however, to identify genes near RAD markers of interest or to move from SNPs identified by RAD to a high-throughput genotyping assay. Paired-end sequencing of RAD fragments can alleviate these problems by generating a set of paired sequences that can be locally assembled into high-quality contigs up to 1 kb in length. These contigs can then be used for SNP identification, homology searching, or high-throughput assay primer design. In this chapter, we offer suggestions on how to design a RAD paired-end (RAD-PE) sequencing project and the protocol for creating paired-end RAD libraries suitable for Illumina sequencers.
ABSTRACT: The advent of next-generation sequencing (NGS) has revolutionized genomic and transcriptomic approaches to biology. These new sequencing tools are also valuable for the discovery, validation and assessment of genetic markers in populations. Here we review and discuss best practices for several NGS methods for genome-wide genetic marker development and genotyping that use restriction enzyme digestion of target genomes to reduce the complexity of the target. These new methods -- which include reduced-representation sequencing using reduced-representation libraries (RRLs) or complexity reduction of polymorphic sequences (CRoPS), restriction-site-associated DNA sequencing (RAD-seq) and low coverage genotyping -- are applicable to both model organisms with high-quality reference genome sequences and, excitingly, to non-model species with no existing genomic data.
ABSTRACT: Despite the power of massively parallel sequencing platforms, a drawback is the short length of the sequence reads produced. We demonstrate that short reads can be locally assembled into longer contigs using paired-end sequencing of restriction-site associated DNA (RAD-PE) fragments. We use this RAD-PE contig approach to identify single nucleotide polymorphisms (SNPs) and determine haplotype structure in threespine stickleback and to sequence E. coli and stickleback genomic DNA with overlapping contigs of several hundred nucleotides. We also demonstrate that adding a circularization step allows the local assembly of contigs up to 5 kilobases (kb) in length. The ease of assembly and accuracy of the individual contigs produced from each RAD site sequence suggests RAD-PE sequencing is a useful way to convert genome-wide short reads into individually-assembled sequences hundreds or thousands of nucleotides long.
PLoS ONE 04/2011; 6(4):e18561. DOI:10.1371/journal.pone.0018561 · 3.23 Impact Factor
ABSTRACT: Next-generation sequencing technologies are revolutionizing the field of evolutionary biology, opening the possibility for genetic analysis at scales not previously possible. Research in population genetics, quantitative trait mapping, comparative genomics, and phylogeography that was unthinkable even a few years ago is now possible. More importantly, these next-generation sequencing studies can be performed in organisms for which few genomic resources presently exist. To speed this revolution in evolutionary genetics, we have developed Restriction site Associated DNA (RAD) genotyping, a method that uses Illumina next-generation sequencing to simultaneously discover and score tens to hundreds of thousands of single-nucleotide polymorphism (SNP) markers in hundreds of individuals for minimal investment of resources. In this chapter, we describe the core RAD-seq protocol, which can be modified to suit a diversity of evolutionary genetic questions. In addition, we discuss bioinformatic considerations that arise from unique aspects of next-generation sequencing data as compared to traditional marker-based approaches, and we outline some general analytical approaches for RAD-seq and similar data. Despite considerable progress, the development of analytical tools remains in its infancy, and further work is needed to fully quantify sampling variance and biases in these data types.
ABSTRACT: Next-generation sequencing technology provides novel opportunities for gathering genome-scale sequence data in natural populations, laying the empirical foundation for the evolving field of population genomics. Here we conducted a genome scan of nucleotide diversity and differentiation in natural populations of threespine stickleback (Gasterosteus aculeatus). We used Illumina-sequenced RAD tags to identify and type over 45,000 single nucleotide polymorphisms (SNPs) in each of 100 individuals from two oceanic and three freshwater populations. Overall estimates of genetic diversity and differentiation among populations confirm the biogeographic hypothesis that large panmictic oceanic populations have repeatedly given rise to phenotypically divergent freshwater populations. Genomic regions exhibiting signatures of both balancing and divergent selection were remarkably consistent across multiple, independently derived populations, indicating that replicate parallel phenotypic evolution in stickleback may be occurring through extensive, parallel genetic evolution at a genome-wide scale. Some of these genomic regions co-localize with previously identified QTL for stickleback phenotypic variation identified using laboratory mapping crosses. In addition, we have identified several novel regions showing parallel differentiation across independent populations. Annotation of these regions revealed numerous genes that are candidates for stickleback phenotypic evolution and will form the basis of future genetic analyses in this and other organisms. This study represents the first high-density SNP-based genome scan of genetic diversity and differentiation for populations of threespine stickleback in the wild. These data illustrate the complementary nature of laboratory crosses and population genomic scans by confirming the adaptive significance of previously identified genomic regions, elucidating the particular evolutionary and demographic history of such regions in natural populations, and identifying new genomic regions and candidate genes of evolutionary significance.
ABSTRACT: Single nucleotide polymorphism (SNP) discovery and genotyping are essential to genetic mapping. There remains a need for a simple, inexpensive platform that allows high-density SNP discovery and genotyping in large populations. Here we describe the sequencing of restriction-site associated DNA (RAD) tags, which identified more than 13,000 SNPs, and mapped three traits in two model organisms, using less than half the capacity of one Illumina sequencing run. We demonstrated that different marker densities can be attained by choice of restriction enzyme. Furthermore, we developed a barcoding system for sample multiplexing and fine mapped the genetic basis of lateral plate armor loss in threespine stickleback by identifying recombinant breakpoints in F2 individuals. Barcoding also facilitated mapping of a second trait, a reduction of pelvic structure, by in silico re-sorting of individuals. To further demonstrate the ease of the RAD sequencing approach we identified polymorphic markers and mapped an induced mutation in Neurospora crassa. Sequencing of RAD markers is an integrated platform for SNP discovery and genotyping. This approach should be widely applicable to genetic mapping in a variety of organisms.
PLoS ONE 02/2008; 3(10):e3376. DOI:10.1371/journal.pone.0003376 · 3.23 Impact Factor
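The statement that marker density depends on the choice of restriction enzyme can be made concrete with a small in silico digest. The sketch below is illustrative only: it counts recognition sites in a simulated random sequence and compares the count with the expectation for independent, equally frequent bases. The enzymes shown (EcoRI, a 6-bp cutter, and SbfI, an 8-bp cutter), the simulated genome, and the function names are assumptions chosen for the example, and real genomes deviate from the uniform-base model.

```python
import random


def count_sites(sequence: str, site: str) -> int:
    """Count occurrences (overlaps included) of a recognition site on one strand.

    For palindromic sites such as GAATTC, searching one strand is sufficient.
    """
    sequence, site = sequence.upper(), site.upper()
    count = 0
    start = sequence.find(site)
    while start != -1:
        count += 1
        start = sequence.find(site, start + 1)
    return count


def expected_sites(genome_length: int, site: str) -> float:
    """Expected site count if bases are independent and equally frequent."""
    positions = genome_length - len(site) + 1
    return positions / 4 ** len(site)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the simulated genome is reproducible
    genome = "".join(rng.choice("ACGT") for _ in range(1_000_000))
    for name, site in [("EcoRI", "GAATTC"), ("SbfI", "CCTGCAGG")]:
        observed = count_sites(genome, site)
        expected = expected_sites(len(genome), site)
        # Each cut site is flanked by sequence on both sides, so roughly two RAD tags per site.
        print(f"{name} ({site}): observed {observed}, expected {expected:.1f}, ~{2 * observed} RAD tags")
```

Under this simple model an 8-bp recognition site occurs about 16 times less often than a 6-bp site, which is why a rarer cutter yields a sparser marker set for the same genome.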