A “Double Adaptor” Method for Improved Shotgun Library Construction

Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, Texas, 77030, USA.
Analytical Biochemistry (Impact Factor: 2.31). 04/1996; 236(1):107-13. DOI: 10.1006/abio.1996.0138
Source: PubMed

ABSTRACT The efficiency of shotgun DNA sequencing depends to a great extent on the quality of the random-subclone libraries used. We here describe a novel "double adaptor" strategy for efficient construction of high-quality shotgun libraries. In this method, randomly sheared and end-repaired fragments are ligated to oligonucleotide adaptors creating 12-base overhangs. Nonphosphorylated oligonucleotides are used, which prevents formation of adaptor dimers and ensures efficient ligation of insert to adaptor. The vector is prepared from a modified M13 vector, by KpnI/PstI digestion followed by ligation to oligonucleotides with ends complementary to the overhangs created in the digest. These adaptors create 5'-overhangs complementary to those on the inserts. Following annealing of insert to vector, the DNA is directly used for transformation without a ligation step. This protocol is robust and shows three- to fivefold higher yield of clones compared to previous protocols. No chimeric clones can be detected and the background of clones without an insert is <1%. The procedure is rapid and shows potential for automation.

  • Source
    • "In order to evaluate our approach on real data, we used as a test set reads from the Drosophila pseudoobscura genome sequencing project (Richards et al., 2005). We chose these particular data because the sequencing adapters used in the project are known (Andersson et al., 1996). "
    ABSTRACT: Sequences produced by automated Sanger sequencing machines frequently contain fragments of the cloning vector on their ends. Software tools currently available for identifying and removing the vector sequence require knowledge of the vector sequence, specific splice sites and any adapter sequences used in the experiment, information often omitted from public databases. Furthermore, the clipping coordinates themselves are missing or incorrectly reported. As an example, within the approximately 1.24 billion shotgun sequences deposited in the NCBI Trace Archive, as many as approximately 735 million (approximately 60%) lack vector clipping information. Correct clipping information is essential to scientists attempting to validate, improve and even finish the increasingly large number of genomes released at a 'draft' quality level. We present here Figaro, a novel software tool for identifying and removing the vector from raw sequence data without prior knowledge of the vector sequence. The vector sequence is automatically inferred by analyzing the frequency of occurrence of short oligo-nucleotides using Poisson statistics. We show that Figaro achieves 99.98% sensitivity when tested on approximately 1.5 million shotgun reads from Drosophila pseudoobscura. We further explore the impact of accurate vector trimming on the quality of whole-genome assemblies by re-assembling two bacterial genomes from shotgun sequences deposited in the Trace Archive. Designed as a module in large computational pipelines, Figaro is fast, lightweight and flexible. Figaro is released under an open-source license through the AMOS package (
    Bioinformatics 03/2008; 24(4):462-7. DOI:10.1093/bioinformatics/btm632 · 4.62 Impact Factor
  • Source
    • "purified bacterial cells (Charles and Ishikawa 1999) or gel-purified from a chromosomal fragment resolved through Pulsed Field Gel Electrophoresis (PFGE) (Wernegreen et al. 2002). Short (1.5–2.5 kb) insert libraries were generated from hydrosheared DNA using a double adaptor kit (SeqWright Inc.) (Andersson et al. 1996). Plasmid clones were purified and bidirectionally sequenced using BigDye v3.0 chemistry on either an ABI3700 or an ABI3730xl (Applied Biosystems). "
    ABSTRACT: The distinct lifestyle of obligately intracellular bacteria can alter fundamental forces that drive and constrain genome change. In this study, sequencing the 792-kb genome of Blochmannia pennsylvanicus, an obligate endosymbiont of Camponotus pennsylvanicus, enabled us to trace evolutionary changes that occurred in the context of a bacterial-ant association. Comparison to the genome of Blochmannia floridanus reveals differential loss of genes involved in cofactor biosynthesis, the composition and structure of the cell wall and membrane, gene regulation, and DNA replication. However, the two Blochmannia species show complete conservation in the order and strand orientation of shared genes. This finding of extreme stasis in genome architecture, also reported previously for the aphid endosymbiont Buchnera, suggests that genome stability characterizes long-term bacterial mutualists of insects and constrains their evolutionary potential. Genome-wide analyses of protein divergences reveal 10- to 50-fold faster amino acid substitution rates in Blochmannia compared to related bacteria. Despite these varying features of genome evolution, a striking correlation in the relative divergences of proteins indicates parallel functional constraints on gene functions across ecologically distinct bacterial groups. Furthermore, the increased rates of amino acid substitution and gene loss in Blochmannia have occurred in a lineage-specific fashion, which may reflect life history differences of their ant hosts.
    Genome Research 09/2005; 15(8):1023-33. DOI:10.1101/gr.3771305 · 13.85 Impact Factor
  • Source
    • "The M13 shotgun library from concatenated and sheared DNA fragments was constructed using the double adaptor method as described previously (Andersson et al. 1996). "
    ABSTRACT: A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7-2.0 kb were sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences and a similar efficiency compared with other shotgun libraries constructed from single DNA fragments (> 20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥ 98% identity), and 16 clones generated nonexact matches (57%-97% identity) to either known human or other species genes. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching.
    Genome Research 04/1997; 7(4):353-8. · 13.85 Impact Factor
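
The Figaro abstract quoted above says that vector sequence is inferred from the frequency of short oligonucleotides using Poisson statistics. The following is only a rough sketch of that general idea, not Figaro's actual algorithm: it counts k-mers in the first bases of each read and flags those that occur far more often than a uniform random background would predict. The function names, the uniform-background assumption, and the parameters `k`, `prefix_len`, and `alpha` are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), as 1 minus the sum of pmf(0..k-1)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # pmf(0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # pmf(i) from pmf(i-1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def overrepresented_kmers(reads, k=8, prefix_len=20, alpha=1e-6):
    """Flag k-mers that are over-represented in the first prefix_len bases of reads.

    Assumption (not from the source): the background is uniform, so each of
    the 4**k possible k-mers is expected windows / 4**k times.
    """
    counts = Counter()
    windows = 0
    for read in reads:
        prefix = read[:prefix_len].upper()
        for i in range(len(prefix) - k + 1):
            counts[prefix[i:i + k]] += 1
            windows += 1
    lam = windows / 4 ** k  # expected count per k-mer under the background
    return {kmer: c for kmer, c in counts.items() if poisson_sf(c, lam) < alpha}
```

K-mers returned by `overrepresented_kmers` are candidates for vector or adaptor sequence at read starts. A real tool would also need a realistic background model, handling of both read ends, and a way to turn flagged k-mers into clipping coordinates; none of that is attempted here.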