Matthew Cohoon

Publications (5) · 35.13 Total impact

  • ABSTRACT: The nonessential regions in bacterial chromosomes are ill-defined due to incomplete functional information. Here, we establish a comprehensive repertoire of the genome regions that are dispensable for growth of Bacillus subtilis in a variety of media conditions. In complex medium, we attempted deletion of 157 individual regions ranging in size from 2 to 159 kb. A total of 146 deletions were successful in complex medium, whereas the remaining regions were subdivided to identify new essential genes (4) and coessential gene sets (7). Overall, our repertoire covers ∼76% of the genome. We screened for viability of mutant strains in rich defined medium and glucose minimal media. Experimental observations were compared with predictions by the iBsu1103 model, revealing discrepancies that led to numerous model changes, including the large-scale application of model reconciliation techniques. We ultimately produced the iBsu1103V2 model and generated predictions of metabolites that could restore the growth of unviable strains. These predictions were experimentally tested and demonstrated to be correct for 27 strains, validating the refinements made to the model. The iBsu1103V2 model has improved considerably at predicting loss of viability, and many insights gained from the model revisions have been integrated into the Model SEED to improve reconstruction of other microbial models.
    Nucleic Acids Research 10/2012; · 8.28 Impact Factor
  • ABSTRACT: Background: Bacillus subtilis is an organism of interest because of its extensive industrial applications, its similarity to pathogenic organisms, and its role as the model organism for Gram-positive, sporulating bacteria. In this work, we introduce a new genome-scale metabolic model of B. subtilis 168 called iBsu1103. This new model is based on the annotated B. subtilis 168 genome generated by the SEED, one of the most up-to-date and accurate annotations of B. subtilis 168 available. Results: The iBsu1103 model includes 1,437 reactions associated with 1,103 genes, making it the most complete model of B. subtilis available. The model also includes Gibbs free energy change (ΔrG'°) values for 1,403 (97%) of the model reactions estimated by using the group contribution method. These data were used with an improved reaction reversibility prediction method to identify 653 (45%) irreversible reactions in the model. The model was validated against an experimental dataset consisting of 1,500 distinct conditions and was optimized by using an improved model optimization method to increase model accuracy from 89.7% to 93.1%. Conclusions: Basing the iBsu1103 model on the annotations generated by the SEED significantly improved the model completeness and accuracy compared with the most recent previously published model. The enhanced accuracy of the iBsu1103 model also demonstrates the efficacy of the improved reaction directionality prediction method in accurately identifying irreversible reactions in the B. subtilis metabolism. The proposed improved model optimization methodology was also demonstrated to be effective in minimally adjusting model content to improve model accuracy.
    Genome biology 01/2009; 10(6):1-15. · 10.30 Impact Factor
  • ABSTRACT: The National Microbial Pathogen Data Resource (NMPDR) (http://www.nmpdr.org) is a National Institute of Allergy and Infectious Diseases (NIAID)-funded Bioinformatics Resource Center that supports research in selected Category B pathogens. NMPDR contains the complete genomes of approximately 50 strains of pathogenic bacteria that are the focus of our curators, as well as >400 other genomes that provide a broad context for comparative analysis across the three phylogenetic domains. NMPDR integrates complete, public genomes with expertly curated biological subsystems to provide the most consistent genome annotations. Subsystems are sets of functional roles related by a biologically meaningful organizing principle, which are built over large collections of genomes; they provide researchers with consistent functional assignments in a biologically structured context. Investigators can browse subsystems and reactions to develop accurate reconstructions of the metabolic networks of any sequenced organism. NMPDR provides a comprehensive bioinformatics platform, with tools and viewers for genome analysis. Results of precomputed gene clustering analyses can be retrieved in tabular or graphic format with one-click tools. NMPDR tools include Signature Genes, which finds the set of genes in common or that differentiates two groups of organisms. Essentiality data collated from genome-wide studies have been curated. Drug target identification and high-throughput, in silico, compound screening are in development.
    Nucleic Acids Research 02/2007; 35(Database issue):D347-53. · 8.28 Impact Factor
  • ABSTRACT: The release of the 1000th complete microbial genome will occur in the next two to three years. In anticipation of this milestone, the Fellowship for Interpretation of Genomes (FIG) launched the Project to Annotate 1000 Genomes. The project is built around the principle that the key to improved accuracy in high-throughput annotation technology is to have experts annotate single subsystems over the complete collection of genomes, rather than having an annotation expert attempt to annotate all of the genes in a single genome. Using the subsystems approach, all of the genes implementing the subsystem are analyzed by an expert in that subsystem. An annotation environment was created where populated subsystems are curated and projected to new genomes. A portable notion of a populated subsystem was defined, and tools developed for exchanging and curating these objects. Tools were also developed to resolve conflicts between populated subsystems. The SEED is the first annotation environment that supports this model of annotation. Here, we describe the subsystem approach, and offer the first release of our growing library of populated subsystems. The initial release of data includes 180,177 distinct proteins with 2,133 distinct functional roles. This data comes from 173 subsystems and 383 different organisms.
    Nucleic Acids Research 02/2005; 33(17):5691-702. · 8.28 Impact Factor