The efficiency of shotgun DNA sequencing depends to a great extent on the quality of the random-subclone libraries used. We here describe a novel "double adaptor" strategy for efficient construction of high-quality shotgun libraries. In this method, randomly sheared and end-repaired fragments are ligated to oligonucleotide adaptors creating 12-base overhangs. Nonphosphorylated oligonucleotides are used, which prevents formation of adaptor dimers and ensures efficient ligation of insert to adaptor. The vector is prepared from a modified M13 vector, by KpnI/PstI digestion followed by ligation to oligonucleotides with ends complementary to the overhangs created in the digest. These adaptors create 5'-overhangs complementary to those on the inserts. Following annealing of insert to vector, the DNA is directly used for transformation without a ligation step. This protocol is robust and shows three- to fivefold higher yield of clones compared to previous protocols. No chimeric clones can be detected and the background of clones without an insert is <1%. The procedure is rapid and shows potential for automation.
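The annealing step described above depends on the insert and vector overhangs being complementary to each other. A minimal sketch of that check follows; the 12-base sequence is hypothetical and is not the adaptor sequence from the paper. The comment about chimeras is an assumption about why such designs work, not a statement from the abstract.

```python
# Sketch: test whether two single-stranded overhangs (both written 5'->3')
# can anneal, i.e. whether one is the reverse complement of the other.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def can_anneal(overhang_a, overhang_b):
    """True if the two overhangs are fully complementary and antiparallel."""
    return (len(overhang_a) == len(overhang_b)
            and overhang_a == reverse_complement(overhang_b))


# Hypothetical 12-base insert overhang, chosen only for illustration.
insert_overhang = "GACCTGTTCAGA"
vector_overhang = reverse_complement(insert_overhang)

insert_to_vector = can_anneal(insert_overhang, vector_overhang)
# Assumption: an overhang that is not self-complementary cannot pair
# insert to insert, which would be consistent with the absence of chimeras.
insert_to_insert = can_anneal(insert_overhang, insert_overhang)
```

With this hypothetical sequence, `insert_to_vector` is true and `insert_to_insert` is false.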
"In order to evaluate our approach on real data, we used as a test set reads from the Drosophila pseudoobscura genome sequencing project (Richards et al., 2005). We chose these particular data because the sequencing adapters used in the project are known (Andersson et al., 1996). "
ABSTRACT: Sequences produced by automated Sanger sequencing machines frequently contain fragments of the cloning vector on their ends. Software tools currently available for identifying and removing the vector sequence require knowledge of the vector sequence, specific splice sites and any adapter sequences used in the experiment, information often omitted from public databases. Furthermore, the clipping coordinates themselves are missing or incorrectly reported. As an example, within the approximately 1.24 billion shotgun sequences deposited in the NCBI Trace Archive, as many as approximately 735 million (approximately 60%) lack vector clipping information. Correct clipping information is essential to scientists attempting to validate, improve and even finish the increasingly large number of genomes released at a 'draft' quality level.
We present here Figaro, a novel software tool for identifying and removing the vector from raw sequence data without prior knowledge of the vector sequence. The vector sequence is automatically inferred by analyzing the frequency of occurrence of short oligonucleotides using Poisson statistics. We show that Figaro achieves 99.98% sensitivity when tested on approximately 1.5 million shotgun reads from Drosophila pseudoobscura. We further explore the impact of accurate vector trimming on the quality of whole-genome assemblies by re-assembling two bacterial genomes from shotgun sequences deposited in the Trace Archive. Designed as a module in large computational pipelines, Figaro is fast, lightweight and flexible.
Figaro is released under an open-source license through the AMOS package (http://amos.sourceforge.net/Figaro).
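The idea in the abstract above, inferring vector sequence from oligonucleotides that occur at read starts far more often than chance, can be illustrated with a short sketch. This is not Figaro's implementation: the function names, the k-mer length, the 30-base prefix window, the uniform null model and the significance threshold are all assumptions made for illustration, and the reads are simulated.

```python
import math
import random
from collections import Counter


def poisson_sf(count, lam):
    """Return P(X >= count) for X ~ Poisson(lam)."""
    if count <= 0:
        return 1.0
    term = math.exp(-lam)       # P(X = 0)
    cdf = term
    for i in range(1, count):
        term *= lam / i         # P(X = i) derived from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def overrepresented_kmers(reads, k=8, prefix_len=30, alpha=1e-6):
    """Flag k-mers seen near read starts more often than a Poisson null allows.

    Assumed null model: every k-mer is equally likely in every window, so the
    expected count of any one k-mer is n_windows / 4**k.
    """
    counts = Counter()
    n_windows = 0
    for read in reads:
        prefix = read[:prefix_len]
        windows = [prefix[i:i + k] for i in range(len(prefix) - k + 1)]
        n_windows += len(windows)
        counts.update(set(windows))     # count each k-mer once per read
    lam = n_windows / 4 ** k
    return {kmer for kmer, c in counts.items() if poisson_sf(c, lam) < alpha}


def clip_point(read, vector_kmers, k=8, prefix_len=30):
    """Return the end of the last flagged k-mer in the read prefix (0 if none)."""
    end = 0
    for i in range(min(len(read), prefix_len) - k + 1):
        if read[i:i + k] in vector_kmers:
            end = i + k
    return end


# Simulated data: a hypothetical 16-base vector remnant followed by random insert.
random.seed(0)
vector = "ACGTTGCAAGCTTGGC"
reads = [vector + "".join(random.choice("ACGT") for _ in range(60))
         for _ in range(200)]

flagged = overrepresented_kmers(reads)
clips = [clip_point(read, flagged) for read in reads]
```

In this toy setting every k-mer lying wholly inside the shared prefix is flagged, so each clip point falls at or after the end of the simulated vector. K-mers that straddle the vector-insert junction can also be flagged, so the naive clip point may overshoot by a few bases; a real tool would need a more careful boundary rule.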
"Genomic DNA was purified from CsCl gradients and DNA sequencing was performed by Sanger dideoxy whole genome shotgun (WGS) and by the 454 Life Sciences pyrosequencing strategies. Genomic and plasmid DNA from USA300-HOU-MR was sheared to a size of 2 kb by nebulization, and cloned into a derivative of pUC18. The clones were used for WGS DNA sequencing to 8× coverage by using dye terminator chemistry, data were collected on ABI 3730 sequencers, and reads were assembled using the ATLAS assembler."
ABSTRACT: Community-acquired (CA) methicillin-resistant Staphylococcus aureus (MRSA) increasingly causes disease worldwide. USA300 has emerged as the predominant clone causing superficial and invasive infections in children and adults in the USA. Epidemiological studies suggest that USA300 is more virulent than other CA-MRSA. The genetic determinants that render virulence and dominance to USA300 remain unclear.
We sequenced the genomes of two pediatric USA300 isolates: one CA-MRSA and one CA-methicillin susceptible (MSSA), isolated at Texas Children's Hospital in Houston. DNA sequencing was performed by Sanger dideoxy whole genome shotgun (WGS) and 454 Life Sciences pyrosequencing strategies. The sequence of the USA300 MRSA strain was rigorously annotated. In USA300-MRSA 2658 chromosomal open reading frames were predicted and 3.1 and 27 kilobase (kb) plasmids were identified. USA300-MSSA contained a 20 kb plasmid with some homology to the 27 kb plasmid found in USA300-MRSA. Two regions found in USA300-MRSA were absent in USA300-MSSA. One of these carried the arginine deiminase operon that appears to have been acquired from S. epidermidis. The USA300 sequence was aligned with other sequenced S. aureus genomes and regions unique to USA300 MRSA were identified.
USA300-MRSA is highly similar to other MRSA strains based on whole genome alignments and gene content, indicating that the differences in pathogenesis are due to subtle changes rather than to large-scale acquisition of virulence factor genes. The USA300 Houston isolate differs from another sequenced USA300 strain isolate, derived from a patient in San Francisco, in plasmid content and a number of sequence polymorphisms. Such differences will provide new insights into the evolution of pathogens.
"purified bacterial cells (Charles and Ishikawa 1999) or gel-purified from a chromosomal fragment resolved through Pulsed Field Gel Electrophoresis (PFGE) (Wernegreen et al. 2002). Short (1.5–2.5 kb) insert libraries were generated from hydrosheared DNA using a double adaptor kit (SeqWright Inc.) (Andersson et al. 1996). Plasmid clones were purified and bidirectionally sequenced using BigDye v3.0 chemistry on either an ABI3700 or an ABI3730xl (Applied Biosystems)."
ABSTRACT: The distinct lifestyle of obligately intracellular bacteria can alter fundamental forces that drive and constrain genome change. In this study, sequencing the 792-kb genome of Blochmannia pennsylvanicus, an obligate endosymbiont of Camponotus pennsylvanicus, enabled us to trace evolutionary changes that occurred in the context of a bacterial-ant association. Comparison to the genome of Blochmannia floridanus reveals differential loss of genes involved in cofactor biosynthesis, the composition and structure of the cell wall and membrane, gene regulation, and DNA replication. However, the two Blochmannia species show complete conservation in the order and strand orientation of shared genes. This finding of extreme stasis in genome architecture, also reported previously for the aphid endosymbiont Buchnera, suggests that genome stability characterizes long-term bacterial mutualists of insects and constrains their evolutionary potential. Genome-wide analyses of protein divergences reveal 10- to 50-fold faster amino acid substitution rates in Blochmannia compared to related bacteria. Despite these varying features of genome evolution, a striking correlation in the relative divergences of proteins indicates parallel functional constraints on gene functions across ecologically distinct bacterial groups. Furthermore, the increased rates of amino acid substitution and gene loss in Blochmannia have occurred in a lineage-specific fashion, which may reflect life history differences of their ant hosts.
Genome Research 09/2005; 15(8):1023-33. DOI:10.1101/gr.3771305