Decoding a Substantial Set of Samples in Parallel by Massive Sequencing

Science for Life Laboratory, Department of Gene Technology, School of Biotechnology, Royal Institute of Technology (KTH), Solna, Sweden.
PLoS ONE (Impact Factor: 3.23). 03/2011; 6(3):e17785. DOI: 10.1371/journal.pone.0017785
Source: PubMed


There has been a dramatic increase in the throughput of sequenced bases in recent years, but the ability to sequence a multitude of samples in parallel has not developed equally. Here we present a novel strategy in which a combination of two tags is used to link sequencing reads back to their origins in a pool of samples. By incorporating the tags in two steps, sample-handling complexity is lowered nearly 100-fold compared with conventional indexing protocols. In addition, the method described here enables accurate identification and typing of thousands of samples in parallel. In this study the system was designed to test 4992 samples using only 122 tags. To prove the concept of the two-tagging method, the highly polymorphic 2nd exon of DLA-DRB1 in dogs and wolves was sequenced using the 454 GS FLX Titanium Chemistry. By requiring a minimum sequence depth of 20 reads per sample, 94% of the successfully amplified samples were genotyped. In addition, the method allowed digital detection of chimeric fragments. These results demonstrate that it is possible to sequence thousands of samples in parallel without complex pooling patterns or primer combinations. Furthermore, the method is highly scalable, as only a limited number of additional tags leads to a substantial increase in sample size.
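The two-tag idea in the abstract can be illustrated with a minimal sketch. It assumes a simplified read layout (second tag, then first tag, then the insert) and uses made-up 4-base tags; the tag sequences, tag lengths, sample names, and the function names `assign_sample`, `demultiplex`, and `max_samples` are all hypothetical and are not taken from the paper. Only the minimum depth of 20 reads per sample comes from the abstract. The general point is that a set of first-step tags and a set of second-step tags can together address up to the product of the two set sizes, while only their sum needs to be synthesized.

```python
from collections import Counter

# Hypothetical tag sets; the real tag sequences are not given in the abstract.
FIRST_TAGS = {"ACGT": "well_A1", "TGCA": "well_A2", "GATC": "well_A3"}
SECOND_TAGS = {"CCAA": "plate_1", "GGTT": "plate_2"}
TAG_LEN = 4      # assumed tag length, for illustration only
MIN_DEPTH = 20   # minimum reads per sample, as stated in the abstract


def assign_sample(read):
    """Return (plate, well) for a read, or None if either tag is unrecognized.

    Assumes the read starts with the second tag followed by the first tag;
    this layout is an illustrative choice, not the published design.
    """
    outer = read[:TAG_LEN]
    inner = read[TAG_LEN:2 * TAG_LEN]
    if outer in SECOND_TAGS and inner in FIRST_TAGS:
        return SECOND_TAGS[outer], FIRST_TAGS[inner]
    return None


def demultiplex(reads, min_depth=MIN_DEPTH):
    """Count reads per (plate, well) and keep samples meeting the depth cutoff."""
    counts = Counter()
    for read in reads:
        sample = assign_sample(read)
        if sample is not None:
            counts[sample] += 1
    return {sample: n for sample, n in counts.items() if n >= min_depth}


def max_samples(n_first, n_second):
    """Number of distinct tag pairs available from two independent tag sets."""
    return n_first * n_second
```

With the toy sets above, five tags in total (three first-step, two second-step) distinguish up to `max_samples(3, 2)` samples, and `demultiplex` reports only those samples that reach the depth threshold.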

    • "Our approach to library preparation synthesized several published, successful, cost-saving strategies from the literature. We used a combinatorial sample tagging protocol (Neiman et al. 2011), which permitted the same set of barcoded primer pairs to be used for each indexed library whereby different combinations of barcodes and indices formed unique tags for each sample (Fig. 1). The protocol substantially reduced the complexity of sample handling as well as overall costs by eliminating the need for a unique PCR primer pair (~$60 per pair purified by high performance liquid chromatography) or a unique Illumina adapter (~$30 per adapter) per sample."
    ABSTRACT: Next-generation sequencing (NGS) technology has extraordinarily enhanced the scope of research in the life sciences, and our goal was to broaden the application of NGS to systems that were previously difficult to study. We present protocols for processing fecal and swab samples into amplicon libraries amenable to Illumina sequencing. We developed and tested a novel metagenomic DNA extraction approach using solid phase reversible immobilization (SPRI) bead technology on fecal and swab samples collected from Western Bluebirds (Sialia mexicana). We compared the performance of the SPRI-based extraction protocol with that of the Mo Bio PowerSoil Kit, the current standard for the Human Microbiome Project and the Earth Microbiome Project. The SPRI-based method produced comparable PCR amplification success from fecal extractions but significantly outperformed the PowerSoil Kit in DNA quality, quantity, and PCR success for both cloacal and oral swab samples. We furthermore modified published protocols for preparing highly multiplexed Illumina libraries with minimal sample loss and without post-adapter ligation amplification. Our library preparation protocol was successfully validated on three sets of heterogeneous amplicons (derived from SPRI, PowerSoil, and control extractions of avian fecal and swab samples) that were sequenced across three independent, 250 bp, paired-end runs on Illumina's MiSeq platform. Our comprehensive strategies focus on maximizing efficiency and minimizing costs. In addition to increasing the feasibility of using minimally invasive sampling and NGS capabilities in avian research, our methods are notably not avian-specific and thus applicable to many research programs that involve DNA extraction and amplicon sequencing.
    Full-text · Article · Apr 2014 · Molecular Ecology Resources
    • "Each target was amplified by a two-step PCR, and received a distinct index included in one of the inner primers (Figure 1b, Supplementary Table S3 and S4). The indices have been described previously [25, 26], as well as the primers for TP53 exon 2 and 9 [27] and the general primers for TP53 and Lambda [20]. Inner cycling, 94°C 5 min, [94°C 30 s, 60°C 120 s, 72°C 3 min] × 10 cycles, 72°C 5 min, 4°C hold, outer cycling the same but 30 cycles."
    ABSTRACT: Here we demonstrate the use of short-read massive sequencing systems to in effect achieve longer read lengths through hierarchical molecular tagging. We show how indexed and PCR-amplified targeted libraries are degraded, sub-sampled and arrested at timed intervals to achieve pools of differing average length, each of which is indexed with a new tag. By this process, indices of sample origin, molecular origin, and degree of degradation are incorporated in order to achieve a nested hierarchical structure, later to be utilized in the data processing to order the reads over a longer distance than the sequencing system originally allows. With this protocol we show how continuous regions beyond 3000 bp can be decoded by an Illumina sequencing system, and we illustrate the potential applications by calling variants of the lambda genome, analysing TP53 in cancer cell lines, and targeting a variable canine mitochondrial region.
    Full-text · Article · Mar 2013 · Scientific Reports
    • "The incorporation of sample-specific barcodes, indexes or multiplex identifying sequences (MIDs) into sequencing libraries via ligation or PCR amplification is required for the identification of the original source of sequences. Double-barcoding strategies using a combination of two barcodes unique for each sample have been devised to reduce the probability of false assignment of sequencing reads to their sample origin (Galan et al., 2010; Neiman et al., 2011). For example, a double-indexing method, which incorporates barcodes into both ends of sample molecules via PCR amplification with 5′-tailed primers prior to library pooling, has been developed for improving the accuracy of multiplex sequencing on the current Illumina sequencing platform (Kircher et al., 2012)."

    Full-text · Article · Jan 2013