Decoding a Substantial Set of Samples in Parallel by Massive Sequencing

Science for Life Laboratory, Department of Gene Technology, School of Biotechnology, Royal Institute of Technology (KTH), Solna, Sweden.
PLoS ONE (Impact Factor: 3.23). 03/2011; 6(3):e17785. DOI: 10.1371/journal.pone.0017785
Source: PubMed


The throughput of sequenced bases has increased dramatically in recent years, but the ability to sequence a multitude of samples in parallel has not developed at the same pace. Here we present a novel strategy in which a combination of two tags is used to link sequencing reads back to their origins in a pool of samples. By incorporating the tags in two steps, sample-handling complexity is lowered by nearly 100 times compared to conventional indexing protocols. In addition, the method described here enables accurate identification and typing of thousands of samples in parallel. In this study the system was designed to test 4992 samples using only 122 tags. To prove the concept of the two-tagging method, the highly polymorphic second exon of DLA-DRB1 in dogs and wolves was sequenced using the 454 GS FLX Titanium chemistry. By requiring a minimum sequence depth of 20 reads per sample, 94% of the successfully amplified samples were genotyped. In addition, the method allowed digital detection of chimeric fragments. These results demonstrate that it is possible to sequence thousands of samples in parallel without complex pooling patterns or primer combinations. Furthermore, the method is highly scalable, as only a limited number of additional tags leads to a substantial increase in sample size.
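The two-tag idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the tag sequences, the sample labels, the 4-base tag length, and the read layout (second tag, then first tag, then insert) are all hypothetical, and the abstract does not say how the 122 tags are divided between the two tagging steps. Only the depth threshold of 20 reads per sample comes from the text.

```python
from collections import Counter

# Hypothetical tag sets for illustration only; the real tag sequences and the
# split of the 122 tags between the two steps are not given in the abstract.
FIRST_TAGS = {"ACGT": "well_A1", "TGCA": "well_A2"}
SECOND_TAGS = {"GATC": "plate_1", "CTAG": "plate_2"}
TAG_LEN = 4      # assumed tag length
MIN_DEPTH = 20   # minimum reads per sample, as stated in the abstract


def assign_read(read):
    """Map a read to a (plate, well) pair, or None if either tag is unknown.

    Assumed layout: second tag + first tag + insert.
    """
    second = SECOND_TAGS.get(read[:TAG_LEN])
    first = FIRST_TAGS.get(read[TAG_LEN:2 * TAG_LEN])
    if first is None or second is None:
        return None
    return (second, first)


def demultiplex(reads, min_depth=MIN_DEPTH):
    """Count reads per tag pair and keep samples that reach min_depth."""
    counts = Counter()
    for read in reads:
        sample = assign_read(read)
        if sample is not None:
            counts[sample] += 1
    return {sample: n for sample, n in counts.items() if n >= min_depth}


# Toy input: 25 reads for one sample, 5 for another, 1 unassignable read.
reads = ["GATCACGT" + "TTTT"] * 25 + ["CTAGTGCA" + "GGGG"] * 5 + ["NNNNNNNN"]
passing = demultiplex(reads)
print(passing)
```

In this toy setup, four tags address up to 2 × 2 = 4 samples, which is the sense in which a few additional tags enlarge the addressable sample set. Only the sample with 25 reads passes the depth filter.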

    • "Our approach to library preparation synthesized several published, successful, cost-saving strategies from the literature. We used a combinatorial sample tagging protocol (Neiman et al. 2011), which permitted the same set of barcoded primer pairs to be used for each indexed library whereby different combinations of barcodes and indices formed unique tags for each sample (Fig. 1). The protocol substantially reduced the complexity of sample handling as well as overall costs by eliminating the need for a unique PCR primer pair (~$60 per pair purified by high performance liquid chromatography) or a unique Illumina adapter (~$30 per adapter) per sample."
    ABSTRACT: Next-generation sequencing (NGS) technology has extraordinarily enhanced the scope of research in the life sciences, and our goal was to broaden the application of NGS to systems that were previously difficult to study. We present protocols for processing fecal and swab samples into amplicon libraries amenable to Illumina sequencing. We developed and tested a novel metagenomic DNA extraction approach using solid phase reversible immobilization (SPRI) bead technology on fecal and swab samples collected from Western Bluebirds (Sialia mexicana). We compared the performance of the SPRI-based extraction protocol with that of the Mo Bio PowerSoil Kit, the current standard for the Human Microbiome Project and the Earth Microbiome Project. The SPRI-based method produced comparable PCR amplification success from fecal extractions but significantly outperformed the PowerSoil Kit in DNA quality, quantity, and PCR success for both cloacal and oral swab samples. We furthermore modified published protocols for preparing highly multiplexed Illumina libraries with minimal sample loss and without post-adapter ligation amplification. Our library preparation protocol was successfully validated on three sets of heterogeneous amplicons (derived from SPRI, PowerSoil, and control extractions of avian fecal and swab samples) that were sequenced across three independent, 250 bp, paired-end runs on Illumina's MiSeq platform. Our comprehensive strategies focus on maximizing efficiency and minimizing costs. In addition to increasing the feasibility of using minimally invasive sampling and NGS capabilities in avian research, our methods are notably not avian-specific and thus applicable to many research programs that involve DNA extraction and amplicon sequencing.
    Molecular Ecology Resources 04/2014; 14(6). DOI:10.1111/1755-0998.12269 · 3.71 Impact Factor
    • "Each target was amplified by a two-step PCR, and received a distinct index included in one of the inner primers (Figure 1b, Supplementary Table S3 and S4). The indices have been described previously [25, 26], as well as the primers for TP53 exon 2 and 9 [27] and the general primers for TP53 and Lambda [20]. Inner cycling, 94°C 5 min, [94°C 30 s, 60°C 120 s, 72°C 3 min] × 10 cycles, 72°C 5 min, 4°C hold, outer cycling the same but 30 cycles."
    ABSTRACT: Here we demonstrate the use of short-read massive sequencing systems to in effect achieve longer read lengths through hierarchical molecular tagging. We show how indexed and PCR-amplified targeted libraries are degraded, sub-sampled and arrested at timed intervals to achieve pools of differing average length, each of which is indexed with a new tag. By this process, indices of sample origin, molecular origin, and degree of degradation are incorporated in order to achieve a nested hierarchical structure, later to be utilized in the data processing to order the reads over a longer distance than the sequencing system originally allows. With this protocol we show how continuous regions beyond 3000 bp can be decoded by an Illumina sequencing system, and we illustrate the potential applications by calling variants of the lambda genome, analysing TP53 in cancer cell lines, and targeting a variable canine mitochondrial region.
    Scientific Reports 03/2013; 3:1186. DOI:10.1038/srep01186 · 5.58 Impact Factor
    • "However, the broad dynamic range of the 454 platform allows a reduction in input RNA requirements. Consequently, by choosing informative intermediary gene sets and barcoding transcripts from several different individuals in a combinatorial scheme [16, 17] we envision TnT in combination with massively parallel sequencing platforms to enable a highly multiplexed analysis both with regard to transcript and sample number."
    ABSTRACT: In this work we present a targeted gene expression strategy employing trinucleotide threading (TnT) amplification and massive parallel sequencing. We have previously shown that TnT combined with array readout accurately monitors expression levels. However, with this detection strategy spurious products go undetected. Accordingly, we adapted the TnT protocol to massive parallel sequencing to acquire an unbiased view of the entire TnT-generated product population. In this manner we investigated the identity of undesired products, their extent at different oligonucleotide:RNA ratios and their effect on the expression levels. We demonstrate that TnT gene expression profiling with massive sequencing readout renders reliable expression data from as low as 3.5 ng of total RNA. Moreover, using 350 ng of total RNA results in only 0.7% to 1.1% undesired products. When lowering the amount of input material, the undesired product fraction increases but this does not influence the expression profiles.
    Scientific Reports 11/2012; 2:821. DOI:10.1038/srep00821 · 5.58 Impact Factor