SmashCell: A software framework for the analysis of single-cell amplified genome sequences

Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA 94305, USA.
Bioinformatics (Impact Factor: 4.98). 10/2010; 26(23):2979-80. DOI: 10.1093/bioinformatics/btq564
Source: PubMed


Recent advances in single-cell manipulation technology, whole genome amplification and high-throughput sequencing have now made it possible to sequence the genome of an individual cell. The bioinformatic analysis of these genomes, however, is far more complicated than the analysis of those generated using traditional, culture-based methods. In order to simplify this analysis, we have developed SmashCell (Simple Metagenomics Analysis SHell for sequences from single Cells). It is designed to automate the main steps in microbial genome analysis (assembly, gene prediction and functional annotation) in a way that allows parameter and algorithm exploration at each step in the process. It also manages the data created by these analyses and provides visualization methods for rapid analysis of the results.
The SmashCell source code and a comprehensive manual are available at
Supplementary data are available at Bioinformatics online.


Available from: David A Relman
  • Source
    • "Nevertheless, contrary to this notion, single-cell data sets bring about unique properties that require specific attention and there is an increasing interest in developing analysis methods for single-cell bioinformatics (Ning et al., 2014; Roach et al., 2009). Some of the peculiar features specific to single-cell analysis that warrant specific bioinformatics approaches are: low volume, nonlinear amplification issues (Wu et al., 2014); unconventional use of spike-ins for normalization due to expression bias (Katayama et al., 2013); contamination from neighbouring cells (Harrington et al., 2010); the need to account for subtle changes that are more likely to be seen in spatial/temporal separation of single-cells which are inherently related by potentially having originated from the same progenitor cell (Buettner and Theis, 2012); models to account for missing data, which is more likely to be seen in single-cell experiments due to insufficient starting material (Buettner et al., 2014); and structure identification in low dimensional data (Feigelman et al., 2014). The last of these features is particularly interesting as it presents a data analysis challenge that is in between the very low dimensional space of the past (e.g., data sets with a handful of gene measurements) and modern-day, high-throughput data sets (e.g., a typical transcriptomic study with tens of thousands of gene measurements)."
    ABSTRACT: Individual cells within the same population show various degrees of heterogeneity, which may be better handled with single-cell analysis to address biological and clinical questions. Single-cell analysis is especially important in developmental biology as subtle spatial and temporal differences in cells have significant associations with cell fate decisions during differentiation and with the description of a particular state of a cell exhibiting an aberrant phenotype. Biotechnological advances, especially in the area of microfluidics, have led to a robust, massively parallel and multi-dimensional capturing, sorting, and lysis of single-cells and amplification of related macromolecules, which have enabled the use of imaging and omics techniques on single-cells. There have been improvements in computational single-cell image analysis in developmental biology regarding feature extraction, segmentation, image enhancement, and machine learning, handling limitations of optical resolution to gain new perspectives from the raw microscopy images. Omics approaches, such as transcriptomics, genomics, and epigenomics, targeting gene and small RNA expression, single nucleotide and structural variations, and methylation and histone modifications, rely heavily on high-throughput sequencing technologies. Although there are well-established bioinformatics methods for analysis of sequence data, there are limited bioinformatics approaches which address experimental design, sample size considerations, amplification bias, normalization, differential expression, coverage, clustering, and classification issues, specifically applied at the single-cell level. In this review, we summarize biological and technological advancements, discuss challenges faced in the aforementioned data acquisition and analysis issues, and present future prospects for application of single-cell analyses to developmental biology.
    Preview · Article · Sep 2015 · Molecular Human Reproduction
  • Source
    • "Over the last few years, several software packages for single-cell assembly have been released that address the problem of highly variable coverage rate in MDA-derived data. SmashCell (Simple Metagenomics Analysis SHell for sequences from single Cells) is a software framework that combines assembly, gene prediction, and annotation of single-cell data (Harrington et al., 2010). Assemblers that followed were IDBA-UD (Peng et al., 2012) and Velvet-sc."
    ABSTRACT: Single-cell genomics has advanced the field of microbiology from the analysis of microbial metagenomes where information is "drowning in a sea of sequences," to recognizing each microbial cell as a separate and unique entity. Single-cell genomics employs Phi29 polymerase-mediated whole-genome amplification to yield microgram-range genomic DNA from single microbial cells. This method has now been applied to a handful of symbiotic systems, including bacterial symbionts of marine sponges, insects (grasshoppers, termites), and vertebrates (mouse, human). In each case, novel insights were obtained into the functional genomic repertoire of the bacterial partner, which, in turn, led to an improved understanding of the corresponding host. Single-cell genomics is particularly valuable when dealing with uncultivated microorganisms, as is still the case for many bacterial symbionts. In this review, we explore the power of single-cell genomics for symbiosis research and highlight recent insights into the symbiotic systems that were obtained by this approach.
    Full-text · Article · Aug 2012 · Biological Bulletin
  • Source
    ABSTRACT: Studying complex biological systems such as a developing embryo, a tumor, or a microbial ecosystem often involves understanding the behavior and heterogeneity of the individual cells that constitute the system and their interactions. In this review, we discuss a variety of approaches to single-cell genomic analysis.
    Preview · Article · Nov 2010 · Annual Review of Genetics