A “Double Adaptor” Method for Improved Shotgun Library Construction
The efficiency of shotgun DNA sequencing depends to a great extent on the quality of the random-subclone libraries used. We here describe a novel "double adaptor" strategy for efficient construction of high-quality shotgun libraries. In this method, randomly sheared and end-repaired fragments are ligated to oligonucleotide adaptors creating 12-base overhangs. Nonphosphorylated oligonucleotides are used, which prevents formation of adaptor dimers and ensures efficient ligation of insert to adaptor. The vector is prepared from a modified M13 vector, by KpnI/PstI digestion followed by ligation to oligonucleotides with ends complementary to the overhangs created in the digest. These adaptors create 5'-overhangs complementary to those on the inserts. Following annealing of insert to vector, the DNA is directly used for transformation without a ligation step. This protocol is robust and shows three- to fivefold higher yield of clones compared to previous protocols. No chimeric clones can be detected and the background of clones without an insert is <1%. The procedure is rapid and shows potential for automation.
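The annealing step described above relies on the insert and vector overhangs being complementary. As a minimal illustration, the Python sketch below checks whether two overhangs, both written 5'→3', are reverse complements of each other. The 12-base sequence is a made-up placeholder, not the adaptor sequence used in the paper, and the function names are hypothetical.

```python
# Minimal sketch: check that two single-stranded overhangs can anneal.
# Both overhangs are written 5'->3'. The sequences are placeholders
# (assumptions), not the adaptors from the published protocol.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def overhangs_anneal(insert_overhang: str, vector_overhang: str) -> bool:
    """True if the two overhangs are exact reverse complements."""
    return reverse_complement(insert_overhang) == vector_overhang.upper()


# Hypothetical 12-base overhang on the adaptor-ligated insert.
INSERT_OVERHANG = "ACGTTGCAAGCT"
# The matching vector overhang is derived here rather than taken from the paper.
VECTOR_OVERHANG = reverse_complement(INSERT_OVERHANG)

insert_to_vector = overhangs_anneal(INSERT_OVERHANG, VECTOR_OVERHANG)
insert_to_insert = overhangs_anneal(INSERT_OVERHANG, INSERT_OVERHANG)
```

With a placeholder overhang that is not its own reverse complement, `insert_to_vector` is true while `insert_to_insert` is false, which is consistent with the absence of chimeric clones reported in the abstract.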
The yaws treponemes, Treponema pallidum ssp. pertenue (TPE) strains, are closely related to syphilis causing strains of Treponema pallidum ssp. pallidum (TPA). Both yaws and syphilis are distinguished on the basis of epidemiological characteristics, clinical symptoms, and several genetic signatures of the corresponding causative agents. To precisely define genetic differences between TPA and TPE, high-quality whole genome sequences of three TPE strains (Samoa D, CDC-2, Gauthier) were determined using next-generation sequencing techniques. TPE genome sequences were compared to four genomes of TPA strains (Nichols, DAL-1, SS14, Chicago). The genome structure was identical in all three TPE strains with similar length ranging between 1,139,330 bp and 1,139,744 bp. No major genome rearrangements were found when compared to the four TPA genomes. The whole genome nucleotide divergence (d(A)) between TPA and TPE subspecies was 4.7 and 4.8 times higher than the observed nucleotide diversity (π) among TPA and TPE strains, respectively, corresponding to 99.8% identity between TPA and TPE genomes. A set of 97 (9.9%) TPE genes encoded proteins containing two or more amino acid replacements or other major sequence changes. The TPE divergent genes were mostly from the group encoding potential virulence factors and genes encoding proteins with unknown function. Hypothetical genes, with genetic differences, consistently found between TPE and TPA strains are candidates for syphilitic treponemes virulence factors. Seventeen TPE genes were predicted under positive selection, and eleven of them coded either for predicted exported proteins or membrane proteins suggesting their possible association with the cell surface. Sequence changes between TPE and TPA strains and changes specific to individual strains represent suitable targets for subspecies- and strain-specific molecular diagnostics.
- "These loci included rDNA loci and several TPI regions (names of genes are shown in parentheses) including TPI11 (tprC), TPI12 (tprD), TPI25A (tprE), TPI25B (tprF, tprG), TPI32B (arp), TPI34 (TP0470), TPI48 (tprI, tprJ), TPI67B (tprK), and TPI78 (tprL). Small insert libraries from XL-PCR products were prepared and sequenced as described previously [29,30]. Alternatively, XL PCR products were DDT sequenced using the primer walking method."
The genus Bartonella comprises facultative intracellular bacteria adapted to mammals, including previously recognized and emerging human pathogens. We report the 2,341,328 bp genome sequence of Bartonella grahamii, one of the most prevalent Bartonella species in wild rodents. Comparative genomics revealed that rodent-associated Bartonella species have higher copy numbers of genes for putative host-adaptability factors than the related human-specific pathogens. Many of these gene clusters are located in a highly dynamic region of 461 kb. Using hybridization to a microarray designed for the B. grahamii genome, we observed a massive, putatively phage-derived run-off replication of this region. We also identified a novel gene transfer agent, which packages the bacterial genome, with an over-representation of the amplified DNA, in 14 kb pieces. This is the first observation associating the products of run-off replication with a gene transfer agent. Because of the high concentration of gene clusters for host-adaptation proteins in the amplified region, and since the genes encoding the gene transfer agent and the phage origin are well conserved in Bartonella, we hypothesize that these systems are driven by selection. We propose that the coupling of run-off replication with gene transfer agents promotes diversification and rapid spread of host-adaptability factors, facilitating host shifts in Bartonella.
- "Genomic DNA was randomly sheared by nebulization and 1–3 kb sized fragments were recovered. The extracted fragments were cloned into a modified M13 vector using the 'double adaptor' method. After 7–8 hours propagation in E. coli, ssM13 DNA was prepared for direct sequencing using Multi Screen MABCN1250 filter plates from Millipore combined with sodium-perchlorate lysis."
Sequences produced by automated Sanger sequencing machines frequently contain fragments of the cloning vector on their ends. Software tools currently available for identifying and removing the vector sequence require knowledge of the vector sequence, specific splice sites and any adapter sequences used in the experiment (information often omitted from public databases). Furthermore, the clipping coordinates themselves are missing or incorrectly reported. As an example, within the approximately 1.24 billion shotgun sequences deposited in the NCBI Trace Archive, as many as approximately 735 million (approximately 60%) lack vector clipping information. Correct clipping information is essential to scientists attempting to validate, improve and even finish the increasingly large number of genomes released at a 'draft' quality level. We present here Figaro, a novel software tool for identifying and removing the vector from raw sequence data without prior knowledge of the vector sequence. The vector sequence is automatically inferred by analyzing the frequency of occurrence of short oligo-nucleotides using Poisson statistics. We show that Figaro achieves 99.98% sensitivity when tested on approximately 1.5 million shotgun reads from Drosophila pseudoobscura. We further explore the impact of accurate vector trimming on the quality of whole-genome assemblies by re-assembling two bacterial genomes from shotgun sequences deposited in the Trace Archive. Designed as a module in large computational pipelines, Figaro is fast, lightweight and flexible. Figaro is released under an open-source license through the AMOS package (http://amos.sourceforge.net/Figaro).
- "In order to evaluate our approach on real data, we used as a test set reads from the Drosophila pseudoobscura genome sequencing project (Richards et al., 2005). We chose these particular data because the sequencing adapters used in the project are known (Andersson et al., 1996)."
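The Figaro abstract above mentions inferring vector sequence from the frequency of short oligonucleotides using Poisson statistics. The Python sketch below illustrates that general idea only and is not Figaro's actual algorithm: it counts k-mers near the start of each read and flags those that occur far more often than expected under a simple uniform-random null model. The function names, the parameter defaults (`k`, `prefix_len`, `alpha`), the null model, and the simulated reads with a made-up vector fragment are all assumptions for illustration.

```python
import math
import random
from collections import Counter


def poisson_sf(k: int, lam: float) -> float:
    """Upper-tail probability P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def overrepresented_kmers(reads, k=8, prefix_len=50, alpha=1e-6):
    """Return {kmer: count} for k-mers in read prefixes whose count is
    improbably high under a uniform-random null (an assumed model)."""
    counts = Counter()
    for read in reads:
        prefix = read[:prefix_len].upper()
        for i in range(len(prefix) - k + 1):
            counts[prefix[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # Expected count per k-mer if all 4**k k-mers were equally likely.
    lam = total / 4 ** k
    return {kmer: n for kmer, n in counts.items() if poisson_sf(n, lam) < alpha}


# Simulated data: every read starts with the same hypothetical vector fragment.
rng = random.Random(0)
VECTOR_END = "GATTACAGGCTT"  # placeholder, not a real vector sequence
demo_reads = [
    VECTOR_END + "".join(rng.choice("ACGT") for _ in range(60))
    for _ in range(200)
]
hits = overrepresented_kmers(demo_reads)
```

In this simulation, k-mers that lie entirely within the shared fragment appear in every read and are flagged, while most k-mers from the random part of the reads appear only once or twice and are not. Real reads would call for a more careful background model than the uniform one assumed here.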