ALLPATHS 2: small genomes assembled accurately and with high continuity from short paired reads.

Broad Institute of MIT and Harvard, Charles Street, Cambridge, MA 02141, USA.
Genome Biology (Impact Factor: 10.47). 10/2009; 10(10):R103. DOI: 10.1186/gb-2009-10-10-r103
Source: PubMed

ABSTRACT We demonstrate that genome sequences approaching finished quality can be generated from short paired reads. Using 36 base (fragment) and 26 base (jumping) reads from five microbial genomes of varied GC composition and sizes up to 40 Mb, ALLPATHS2 generated assemblies with long, accurate contigs and scaffolds. Velvet and EULER-SR were less accurate. For example, for Escherichia coli, the fraction of 10-kb stretches that were perfect was 99.8% (ALLPATHS2), 68.7% (Velvet), and 42.1% (EULER-SR).

  • ABSTRACT: The recent breakthroughs in next-generation sequencing technologies, such as those of Roche 454, Illumina/Solexa, and ABI SOLiD, have dramatically reduced the cost of producing short reads of the genome of new species. The huge volume of reads, along with short read length, high coverage, and sequencing errors, poses a great challenge to de novo genome assembly. However, the paired-end information provides a new solution to these problems. In this paper, we review and compare some current assembly tools, including Newbler, CAP3, Velvet, SOAPdenovo, AllPaths, ABySS, IDBA, PE-Assembly, and Telescoper. In general, we compare the seed extension and graph-based methods that use the overlap/layout/consensus approach and the de Bruijn graph approach for assembly. At the end of the paper, we summarize these methods and discuss the future directions of genome assembly.
    Tsinghua Science & Technology 10/2013; 18(5):500-514. DOI:10.1109/TST.2013.6616523
  • ABSTRACT: Reduced costs and increased speed and accuracy of sequencing can bring the genome-based evaluation of individual disease risk to the bedside. While past efforts have identified a number of actionable mutations, the bulk of genetic risk remains hidden in sequence data. The biggest challenge facing genomic medicine today is the development of new techniques to predict the specifics of a given human phenome (set of all expressed phenotypes) encoded by each individual variome (full set of genome variants) in the context of the given environment. Numerous tools exist for the computational identification of the functional effects of a single variant. However, pipelines that take advantage of full genomic, exomic, transcriptomic (and other) sequences have only recently become a reality. This review looks at the building of methodologies for predicting "variome"-defined disease risk. It also discusses some of the challenges for incorporating such a pipeline into everyday medical practice.
    Journal of Molecular Biology 08/2013; 431(21). DOI:10.1016/j.jmb.2013.07.038 · 3.96 Impact Factor
  • ABSTRACT: The kinetoplastids are an important group of protozoa from the Excavata, and cause numerous diseases with wide environmental, economic and ecological impact. Trypanosoma brucei, the causative agent of human African trypanosomiasis, expresses a dense variant surface glycoprotein (VSG) coat, facilitating immune evasion via rapid switching and antigenic variation. Coupled to VSG switching is efficient clathrin-mediated endocytosis (CME), which removes anti-VSG antibody from the parasite surface. While the precise molecular basis for an extreme CME flux is unknown, genes encoding the AP2 complex, central to CME in most organisms, are absent from T. brucei, suggesting mechanistic divergence in trypanosome CME. Here we identify the AP complex gene cohorts of all available kinetoplastid genomes and a new Trypanosoma grayi genome. We find multiple secondary losses of AP complexes, but that loss of AP2 is restricted to T. brucei and closest relatives. Further, loss of AP2 correlates precisely with the presence of VSG genes, supporting a model whereby these two adaptations may function synergistically in immune evasion.
    Molecular Phylogenetics and Evolution 01/2013; DOI:10.1016/j.ympev.2013.01.002 · 4.02 Impact Factor