ABSTRACT: Lifespan is a remarkably diverse trait ranging from a few days to several hundred years in nature, but the mechanisms underlying the evolution of lifespan differences remain elusive. Here we de novo assemble a reference genome for the naturally short-lived African turquoise killifish, providing a unique resource for comparative and experimental genomics. The identification of genes under positive selection in this fish reveals potential candidates to explain its compressed lifespan. Several aging genes are under positive selection in this short-lived fish and long-lived species, raising the intriguing possibility that the same gene could underlie evolution of both compressed and extended lifespans. Comparative genomics and linkage analysis identify candidate genes associated with lifespan differences between various turquoise killifish strains. Remarkably, these genes are clustered on the sex chromosome, suggesting that short lifespan might have co-evolved with sex determination. Our study provides insights into the evolutionary forces that shape lifespan in nature.
ABSTRACT: Conservation genomics has become an increasingly popular term, yet it remains unclear whether the non-invasive sampling that is essential for many conservation-related studies is compatible with the minimum requirements for harnessing next-generation sequencing technologies. Here, we evaluated the feasibility of using genotyping-by-sequencing of non-invasively collected hair samples to simultaneously identify and genotype single nucleotide polymorphisms (SNPs) in a climate-sensitive mammal, the American pika (Ochotona princeps). We identified and genotyped 3,803 high-confidence SNPs across eight sites distributed along two elevational transects using starting DNA amounts as low as 1 ng. Fifty-five outlier loci were detected as candidate gene regions under divergent selection, constituting potential targets for future validation. Genome-wide estimates of gene diversity significantly and positively correlated with elevation across both transects, with all low elevation sites exhibiting significant heterozygote deficit likely due to inbreeding. More broadly, our results highlight a range of issues that must be considered when pairing genomic data collection with non-invasive sampling, particularly related to field sampling protocols for minimizing exogenous DNA, data collection strategies and quality control steps for enhancing target organism yield, and analytical approaches for maximizing cost-effectiveness and information content of recovered genomic data.
ABSTRACT: Restriction-site Associated DNA (RAD) markers are rapidly becoming a standard for SNP discovery and genotyping studies even in organisms without a sequenced reference genome. It is difficult, however, to identify genes near RAD markers of interest or to move from SNPs identified by RAD to a high-throughput genotyping assay. Paired-end sequencing of RAD fragments can alleviate these problems by generating a set of paired sequences that can be locally assembled into high-quality contigs up to 1 kb in length. These contigs can then be used for SNP identification, homology searching, or high-throughput assay primer design. In this chapter, we offer suggestions on how to design a RAD paired-end (RAD-PE) sequencing project and the protocol for creating paired-end RAD libraries suitable for Illumina sequencers.
Article · Jun 2012 · Methods in molecular biology (Clifton, N.J.)
ABSTRACT: The advent of next-generation sequencing (NGS) has revolutionized genomic and transcriptomic approaches to biology. These new sequencing tools are also valuable for the discovery, validation and assessment of genetic markers in populations. Here we review and discuss best practices for several NGS methods for genome-wide genetic marker development and genotyping that use restriction enzyme digestion of target genomes to reduce the complexity of the target. These new methods -- which include reduced-representation sequencing using reduced-representation libraries (RRLs) or complexity reduction of polymorphic sequences (CRoPS), restriction-site-associated DNA sequencing (RAD-seq) and low coverage genotyping -- are applicable to both model organisms with high-quality reference genome sequences and, excitingly, to non-model species with no existing genomic data.
Article · Jun 2011 · Nature Reviews Genetics
ABSTRACT: Despite the power of massively parallel sequencing platforms, a drawback is the short length of the sequence reads produced. We demonstrate that short reads can be locally assembled into longer contigs using paired-end sequencing of restriction-site associated DNA (RAD-PE) fragments. We use this RAD-PE contig approach to identify single nucleotide polymorphisms (SNPs) and determine haplotype structure in threespine stickleback and to sequence E. coli and stickleback genomic DNA with overlapping contigs of several hundred nucleotides. We also demonstrate that adding a circularization step allows the local assembly of contigs up to 5 kilobases (kb) in length. The ease of assembly and accuracy of the individual contigs produced from each RAD site sequence suggest that RAD-PE sequencing is a useful way to convert genome-wide short reads into individually assembled sequences hundreds or thousands of nucleotides long.
ABSTRACT: Next-generation sequencing technologies are revolutionizing the field of evolutionary biology, opening the possibility for genetic analysis at scales not previously possible. Research in population genetics, quantitative trait mapping, comparative genomics, and phylogeography that was unthinkable even a few years ago is now possible. More importantly, these next-generation sequencing studies can be performed in organisms for which few genomic resources presently exist. To speed this revolution in evolutionary genetics, we have developed Restriction site Associated DNA (RAD) genotyping, a method that uses Illumina next-generation sequencing to simultaneously discover and score tens to hundreds of thousands of single-nucleotide polymorphism (SNP) markers in hundreds of individuals for minimal investment of resources. In this chapter, we describe the core RAD-seq protocol, which can be modified to suit a diversity of evolutionary genetic questions. In addition, we discuss bioinformatic considerations that arise from unique aspects of next-generation sequencing data as compared to traditional marker-based approaches, and we outline some general analytical approaches for RAD-seq and similar data. Despite considerable progress, the development of analytical tools remains in its infancy, and further work is needed to fully quantify sampling variance and biases in these data types.
Article · Jan 2011 · Methods in molecular biology (Clifton, N.J.)
ABSTRACT: Density of annotated and predicted genes along the stickleback genome. Count of genes in each 1-Mb window, taking each gene's position to be its lower bound as given in the Gasterosteus aculeatus genome database (Ensembl, database version 56.1j, assembly Broad S1). Vertical gray shading indicates the 21 linkage groups and unassembled scaffolds.
(0.66 MB TIF)
ABSTRACT: Nucleotide diversity within single and groups of populations. Nucleotide diversity (π) across the genome, with colored bars indicating significantly elevated (p ≤ 10⁻⁵, blue) and reduced (p ≤ 10⁻⁵, green) values. Vertical gray shading indicates boundaries of the 21 linkage groups and unassembled scaffolds, and gold shading indicates two consistent peaks of elevated nucleotide diversity. (A) RS. (B) RB. (C) OC (RS + RB). (D) BP. (E) BL. (F) ML. (G) FW (BP + BL + ML).
(2.85 MB TIF)
ABSTRACT: Private allele density in the overall freshwater-oceanic comparison. Each plot shows density of private alleles (ρ), with colored bars indicating regions of significantly elevated (p ≤ 10⁻³, blue; p ≤ 10⁻⁵, red) or reduced (p ≤ 10⁻³) values, assessed by bootstrap resampling. Vertical gray shading indicates the 21 linkage groups and unassembled scaffolds, and gold shading indicates the nine consistent peaks of population differentiation. (A) Private allele density in FW compared to OC. (B) Private allele density in OC compared to FW.
(1.38 MB TIF)
ABSTRACT: A complete list of the protein-coding genes that fall in genomic regions associated with differences between oceanic and freshwater populations. Gene names are listed, where available from Ensembl (release 55.1j). Where gene names were lacking, ortholog names are listed for candidate genes from Table 3. Orthology for unnamed genes was extracted from the Ensembl annotation for each gene or determined by a BLAST search of the NCBI protein database using the predicted protein(s) for each gene. Broad ontology groups for candidates are denoted by red text (those listed under the heading “Morphology” in Table 3) or blue text (those listed under “Osmoregulation” in Table 3).
(0.10 MB XLS)
ABSTRACT: Next-generation sequencing technology provides novel opportunities for gathering genome-scale sequence data in natural populations, laying the empirical foundation for the evolving field of population genomics. Here we conducted a genome scan of nucleotide diversity and differentiation in natural populations of threespine stickleback (Gasterosteus aculeatus). We used Illumina-sequenced RAD tags to identify and type over 45,000 single nucleotide polymorphisms (SNPs) in each of 100 individuals from two oceanic and three freshwater populations. Overall estimates of genetic diversity and differentiation among populations confirm the biogeographic hypothesis that large panmictic oceanic populations have repeatedly given rise to phenotypically divergent freshwater populations. Genomic regions exhibiting signatures of both balancing and divergent selection were remarkably consistent across multiple, independently derived populations, indicating that replicate parallel phenotypic evolution in stickleback may be occurring through extensive, parallel genetic evolution at a genome-wide scale. Some of these genomic regions co-localize with previously identified QTL for stickleback phenotypic variation identified using laboratory mapping crosses. In addition, we have identified several novel regions showing parallel differentiation across independent populations. Annotation of these regions revealed numerous genes that are candidates for stickleback phenotypic evolution and will form the basis of future genetic analyses in this and other organisms. This study represents the first high-density SNP-based genome scan of genetic diversity and differentiation for populations of threespine stickleback in the wild. These data illustrate the complementary nature of laboratory crosses and population genomic scans by confirming the adaptive significance of previously identified genomic regions, elucidating the particular evolutionary and demographic history of such regions in natural populations, and identifying new genomic regions and candidate genes of evolutionary significance.
ABSTRACT: Single nucleotide polymorphism (SNP) discovery and genotyping are essential to genetic mapping. There remains a need for a simple, inexpensive platform that allows high-density SNP discovery and genotyping in large populations. Here we describe the sequencing of restriction-site associated DNA (RAD) tags, which identified more than 13,000 SNPs, and mapped three traits in two model organisms, using less than half the capacity of one Illumina sequencing run. We demonstrated that different marker densities can be attained by choice of restriction enzyme. Furthermore, we developed a barcoding system for sample multiplexing and fine mapped the genetic basis of lateral plate armor loss in threespine stickleback by identifying recombinant breakpoints in F2 individuals. Barcoding also facilitated mapping of a second trait, a reduction of pelvic structure, by in silico re-sorting of individuals. To further demonstrate the ease of the RAD sequencing approach we identified polymorphic markers and mapped an induced mutation in Neurospora crassa. Sequencing of RAD markers is an integrated platform for SNP discovery and genotyping. This approach should be widely applicable to genetic mapping in a variety of organisms.