John P Hamilton

Michigan State University, East Lansing, MI, USA

Publications (15) · 66.87 Total impact

  • Article: Advances in plant genome sequencing.
    John P Hamilton, C Robin Buell
    ABSTRACT: The study of plant biology in the 21st century is, and will continue to be, vastly different from that in the 20th century. One driver for this has been the use of genomics methods to reveal the genetic blueprints for not one but dozens of plant species, as well as resolving genome differences in thousands of individuals at the population level. Genomics technology has advanced substantially since publication of the first plant genome sequence, that of Arabidopsis thaliana, in 2000. Plant genomics researchers have readily embraced new algorithms, technologies and approaches to generate genome, transcriptome and epigenome datasets for model and crop species that have permitted deep inferences into plant biology. Challenges in sequencing any genome include ploidy, heterozygosity and paralogy, all of which are amplified in plant genomes compared to animal genomes due to the large genome sizes, high repetitive sequence content, and rampant whole- or segmental genome duplication. The ability to generate de novo transcriptome assemblies provides an alternative approach to bypass these complex genomes and access the gene space of these recalcitrant species. The field of genomics is driven by technological improvements in sequencing platforms; however, software and algorithm development has lagged behind reductions in sequencing costs, improved throughput, and quality improvements. It is anticipated that sequencing platforms will continue to improve the length and quality of output, and that the complementary algorithms and bioinformatic software needed to handle large, repetitive genomes will improve. The future is bright for an exponential improvement in our understanding of plant biology.
    The Plant Journal 04/2012; 70(1):177-90. · 6.16 Impact Factor
  • Article: mRNA-Seq analysis of the Pseudoperonospora cubensis transcriptome during cucumber (Cucumis sativus L.) infection.
    ABSTRACT: Pseudoperonospora cubensis, an oomycete, is the causal agent of cucurbit downy mildew, and is responsible for significant losses on cucurbit crops worldwide. While other oomycete plant pathogens have been extensively studied at the molecular level, Ps. cubensis and the molecular basis of its interaction with cucurbit hosts have not been well examined. Here, we present the first large-scale global gene expression analysis of Ps. cubensis infection of a susceptible Cucumis sativus cultivar, 'Vlaspik', and identification of genes with putative roles in infection, growth, and pathogenicity. Using high throughput whole transcriptome sequencing, we captured differential expression of 2383 Ps. cubensis genes in sporangia and at 1, 2, 3, 4, 6, and 8 days post-inoculation (dpi). Additionally, comparison of Ps. cubensis expression profiles with expression profiles from an infection time course of the oomycete pathogen Phytophthora infestans on Solanum tuberosum revealed similarities in expression patterns of 1,576-6,806 orthologous genes suggesting a substantial degree of overlap in molecular events in virulence between the biotrophic Ps. cubensis and the hemi-biotrophic P. infestans. Co-expression analyses identified distinct modules of Ps. cubensis genes that were representative of early, intermediate, and late infection stages. Collectively, these expression data have advanced our understanding of key molecular and genetic events in the virulence of Ps. cubensis and thus provide a foundation for identifying mechanism(s) by which to engineer or effect resistance in the host.
    PLoS ONE 01/2012; 7(4):e35796. · 4.09 Impact Factor
  • Article: Expression profiling of Cucumis sativus in response to infection by Pseudoperonospora cubensis.
    ABSTRACT: The oomycete pathogen, Pseudoperonospora cubensis, is the causal agent of downy mildew on cucurbits, and at present, no effective resistance to this pathogen is available in cultivated cucumber (Cucumis sativus). To better understand the host response to a virulent pathogen, we performed expression profiling throughout a time course of a compatible interaction using whole transcriptome sequencing. As described herein, we were able to detect the expression of 15,286 cucumber genes, of which 14,476 were expressed throughout the infection process from 1 day post-inoculation (dpi) to 8 dpi. A large number of genes, 1,612 to 3,286, were differentially expressed in pair-wise comparisons between time points. We observed the rapid induction of key defense related genes, including catalases, chitinases, lipoxygenases, peroxidases, and protease inhibitors within 1 dpi, suggesting detection of the pathogen by the host. Co-expression network analyses revealed transcriptional networks with distinct patterns of expression including down-regulation at 2 dpi of known defense response genes suggesting coordinated suppression of host responses by the pathogen. Comparative analyses of cucumber gene expression patterns with that of orthologous Arabidopsis thaliana genes following challenge with Hyaloperonospora arabidopsidis revealed correlated expression patterns of single copy orthologs suggesting that these two dicot hosts have similar transcriptional responses to related pathogens. In total, the work described herein presents an in-depth analysis of the interplay between host susceptibility and pathogen virulence in an agriculturally important pathosystem.
    PLoS ONE 01/2012; 7(4):e34954. · 4.09 Impact Factor
  • Article: Development of transcriptomic resources for interrogating the biosynthesis of monoterpene indole alkaloids in medicinal plant species.
    ABSTRACT: The natural diversity of plant metabolism has long been a source for human medicines. One group of plant-derived compounds, the monoterpene indole alkaloids (MIAs), includes well-documented therapeutic agents used in the treatment of cancer (vinblastine, vincristine, camptothecin), hypertension (reserpine, ajmalicine), malaria (quinine), and as analgesics (7-hydroxymitragynine). Our understanding of the biochemical pathways that synthesize these commercially relevant compounds is incomplete due in part to a lack of molecular, genetic, and genomic resources for the identification of the genes involved in these specialized metabolic pathways. To address these limitations, we generated large-scale transcriptome sequence and expression profiles for three species of Asterids that produce medicinally important MIAs: Camptotheca acuminata, Catharanthus roseus, and Rauvolfia serpentina. Using next generation sequencing technology, we sampled the transcriptomes of these species across a diverse set of developmental tissues, and in the case of C. roseus, in cultured cells and roots following elicitor treatment. Through an iterative assembly process, we generated robust transcriptome assemblies for all three species with a substantial number of the assembled transcripts being full or near-full length. The majority of transcripts had a related sequence in either UniRef100, the Arabidopsis thaliana predicted proteome, or the Pfam protein domain database; however, we also identified transcripts that lacked similarity with entries in either database and thereby lack a known function. Representation of known genes within the MIA biosynthetic pathway was robust. As a diverse set of tissues and treatments were surveyed, expression abundances of transcripts in the three species could be estimated to reveal transcripts associated with development and response to elicitor treatment. 
    Together, these transcriptomes and expression abundance matrices provide a rich resource for understanding plant specialized metabolism and promote the realization of innovative production systems for plant-derived pharmaceuticals.
    PLoS ONE 01/2012; 7(12):e52506. · 4.09 Impact Factor
  • Article: Alternative splicing of a multi-drug transporter from Pseudoperonospora cubensis generates an RXLR effector protein that elicits a rapid cell death.
    ABSTRACT: Pseudoperonospora cubensis, an obligate oomycete pathogen, is the causal agent of cucurbit downy mildew, a foliar disease of global economic importance. Similar to other oomycete plant pathogens, Ps. cubensis has a suite of RXLR and RXLR-like effector proteins, which likely function as virulence or avirulence determinants during the course of host infection. Using in silico analyses, we identified 271 candidate effector proteins within the Ps. cubensis genome with variable RXLR motifs. In extending this analysis, we present the functional characterization of one Ps. cubensis effector protein, RXLR protein 1 (PscRXLR1), and its closest Phytophthora infestans ortholog, PITG_17484, a member of the Drug/Metabolite Transporter (DMT) superfamily. To assess if such effector-non-effector pairs are common among oomycete plant pathogens, we examined the relationship(s) among putative ortholog pairs in Ps. cubensis and P. infestans. Of 271 predicted Ps. cubensis effector proteins, only 109 (41%) had a putative ortholog in P. infestans and evolutionary rate analysis of these orthologs shows that they are evolving significantly faster than most other genes. We found that PscRXLR1 was up-regulated during the early stages of infection of plants, and, moreover, that heterologous expression of PscRXLR1 in Nicotiana benthamiana elicits a rapid necrosis. More interestingly, we also demonstrate that PscRXLR1 arises as a product of alternative splicing, making this the first example of an alternative splicing event in plant pathogenic oomycetes transforming a non-effector gene to a functional effector protein. Taken together, these data suggest a role for PscRXLR1 in pathogenicity, and, in total, our data provide a basis for comparative analysis of candidate effector proteins and their non-effector orthologs as a means of understanding function and evolutionary history of pathogen effectors.
    PLoS ONE 01/2012; 7(4):e34701. · 4.09 Impact Factor
  • Article: Integration of two diploid potato linkage maps with the potato genome sequence.
    ABSTRACT: To facilitate genome-guided breeding in potato, we developed an 8303 Single Nucleotide Polymorphism (SNP) marker array using potato genome and transcriptome resources. To validate the Infinium 8303 Potato Array, we developed linkage maps from two diploid populations (DRH and D84) and compared these maps with the assembled potato genome sequence. Both populations used the doubled monoploid reference genotype DM1-3 516 R44 as the female parent but had different heterozygous diploid male parents (RH89-039-16 and 84SD22). Over 4,400 markers were mapped (1,960 in DRH and 2,454 in D84, 787 in common) resulting in map sizes of 965 (DRH) and 792 (D84) cM, covering 87% (DRH) and 88% (D84) of genome sequence length. Of the mapped markers, 33.5% were in candidate genes selected for the array, 4.5% were markers from existing genetic maps, and 61% were selected based on distribution across the genome. Markers with distorted segregation ratios occurred in blocks in both linkage maps, accounting for 4% (DRH) and 9% (D84) of mapped markers. Markers with distorted segregation ratios were unique to each population with blocks on chromosomes 9 and 12 in DRH and 3, 4, 6 and 8 in D84. Chromosome assignment of markers based on linkage mapping differed from sequence alignment with the Potato Genome Sequencing Consortium (PGSC) pseudomolecules for 1% of the mapped markers with some discordant markers attributable to paralogs. In total, 126 (DRH) and 226 (D84) mapped markers were not anchored to the pseudomolecules and provide new scaffold anchoring data to improve the potato genome assembly. The high degree of concordance between the linkage maps and the pseudomolecules demonstrates both the quality of the potato genome sequence and the functionality of the Infinium 8303 Potato Array. The broad genome coverage of the Infinium 8303 Potato Array compared to other marker sets will enable numerous downstream applications.
    PLoS ONE 01/2012; 7(4):e36347. · 4.09 Impact Factor
  • Article: Single nucleotide polymorphism discovery in elite North American potato germplasm.
    ABSTRACT: Current breeding approaches in potato rely almost entirely on phenotypic evaluations; molecular markers, with the exception of a few linked to disease resistance traits, are not widely used. Large-scale sequence datasets generated primarily through Sanger Expressed Sequence Tag projects are available from a limited number of potato cultivars and access to next generation sequencing technologies permits rapid generation of sequence data for additional cultivars. When coupled with the advent of high throughput genotyping methods, an opportunity now exists for potato breeders to incorporate considerably more genotypic data into their decision-making. To identify a large number of Single Nucleotide Polymorphisms (SNPs) in elite potato germplasm, we sequenced normalized cDNA prepared from three commercial potato cultivars: 'Atlantic', 'Premier Russet' and 'Snowden'. For each cultivar, we generated 2 Gb of sequence which was assembled into a representative transcriptome of ~28-29 Mb for each cultivar. Using the Maq SNP filter that filters read depth, density, and quality, 575,340 SNPs were identified within these three cultivars. In parallel, 2,358 SNPs were identified within existing Sanger sequences for three additional cultivars, 'Bintje', 'Kennebec', and 'Shepody'. Using a stringent set of filters in conjunction with the potato reference genome, we identified 69,011 high confidence SNPs from these six cultivars for use in genotyping with the Infinium platform. Ninety-six of these SNPs were used with a BeadXpress assay to assess allelic diversity in a germplasm panel of 248 lines; 82 of the SNPs proved sufficiently informative for subsequent analyses. Within diverse North American germplasm, the chip processing market class was most distinct, clearly separated from all other market classes. The round white and russet market classes both include fresh market and processing cultivars. 
    Nevertheless, the russet and round white market classes are more distant from each other than processing types are from fresh market types within these two groups. The genotype data generated in this study, albeit limited in number, have revealed distinct relationships among the market classes of potato. The SNPs identified in this study will enable high-throughput genotyping of germplasm and populations, which in turn will enable more efficient marker-assisted breeding efforts in potato.
    BMC Genomics 06/2011; 12:302. · 4.07 Impact Factor
  • Article: A stereoselective hydroxylation step of alkaloid biosynthesis by a unique cytochrome P450 in Catharanthus roseus.
    ABSTRACT: Plant cytochrome P450s are involved in the production of over a hundred thousand metabolites such as alkaloids, terpenoids, and phenylpropanoids. Although cytochrome P450 genes constitute one of the largest superfamilies in plants, many of the catalytic functions of the enzymes they encode remain unknown. Here, we report the identification and functional characterization of a cytochrome P450 gene in a new subfamily of CYP71, CYP71BJ1, involved in alkaloid biosynthesis. Co-expression analysis of putative cytochrome P450 genes in the Catharanthus roseus transcriptome identified candidate genes with expression profiles similar to known terpene indole alkaloid biosynthetic genes. Screening of these candidate genes by functional expression in Saccharomyces cerevisiae yielded a unique P450-dependent enzyme that stereoselectively hydroxylates the alkaloids tabersonine and lochnericine at the 19-position of the aspidosperma-type alkaloid scaffold. Tabersonine, which can be converted to either vindoline or 19-O-acetylhörhammericine, represents a branch point in alkaloid biosynthesis. The discovery of CYP71BJ1, which forms part of the pathway leading to 19-O-acetylhörhammericine, will help illuminate how this branch point is controlled in C. roseus.
    Journal of Biological Chemistry 03/2011; 286(19):16751-7. · 4.77 Impact Factor
  • Article: The Comprehensive Phytopathogen Genomics Resource: a web-based resource for data-mining plant pathogen genomes.
    ABSTRACT: The Comprehensive Phytopathogen Genomics Resource (CPGR) provides a web-based portal for plant pathologists and diagnosticians to view the genome and transcriptome sequence status of 806 bacterial, fungal, oomycete, nematode, viral and viroid plant pathogens. Tools are available to search and analyze annotated genome sequences of 74 bacterial, fungal and oomycete pathogens. Oomycete and fungal genomes are obtained directly from GenBank, whereas bacterial genome sequences are downloaded from the A Systematic Annotation Package (ASAP) database, which provides curation of genomes using comparative approaches. Curated lists of bacterial genes relevant to pathogenicity and avirulence are also provided. The Plant Pathogen Transcript Assemblies Database provides annotated assemblies of the transcribed regions of 82 eukaryotic genomes from publicly available single pass Expressed Sequence Tags. Data-mining tools are provided along with tools to create candidate diagnostic markers, an emerging use for genomic sequence data in plant pathology. The Plant Pathogen Ribosomal DNA (rDNA) database is a resource for pathogens that lack genome or transcriptome data sets and contains 131,755 rDNA sequences from GenBank for 17,613 species identified as plant pathogens and related genera. Database URL: http://cpgr.plantbiology.msu.edu.
    Database: The Journal of Biological Databases and Curation 01/2011; 2011:bar053. · 2.07 Impact Factor
  • Article: Genome sequence of the necrotrophic plant pathogen Pythium ultimum reveals original pathogenicity mechanisms and effector repertoire.
    ABSTRACT: Pythium ultimum is a ubiquitous oomycete plant pathogen responsible for a variety of diseases on a broad range of crop and ornamental species. The P. ultimum genome (42.8 Mb) encodes 15,290 genes and has extensive sequence similarity and synteny with related Phytophthora species, including the potato blight pathogen Phytophthora infestans. Whole transcriptome sequencing revealed expression of 86% of genes, with detectable differential expression of suites of genes under abiotic stress and in the presence of a host. The predicted proteome includes a large repertoire of proteins involved in plant pathogen interactions, although, surprisingly, the P. ultimum genome does not encode any classical RXLR effectors and relatively few Crinkler genes in comparison to related phytopathogenic oomycetes. A lower number of enzymes involved in carbohydrate metabolism were present compared to Phytophthora species, with the notable absence of cutinases, suggesting a significant difference in virulence mechanisms between P. ultimum and more host-specific oomycete species. Although we observed a high degree of orthology with Phytophthora genomes, there were novel features of the P. ultimum proteome, including an expansion of genes involved in proteolysis and genes unique to Pythium. We identified a small gene family of cadherins, proteins involved in cell adhesion, the first report of these in a genome outside the metazoans. Access to the P. ultimum genome has revealed not only core pathogenic mechanisms within the oomycetes but also lineage-specific genes associated with the alternative virulence and lifestyles found within the pythiaceous lineages compared to the Peronosporaceae.
    Genome Biology 01/2010; 11(7):R73. · 6.63 Impact Factor
  • Article: Identification and characterization of lineage-specific genes within the Poaceae.
    ABSTRACT: Using the rice (Oryza sativa) sp. japonica genome annotation, along with genomic sequence and clustered transcript assemblies from 184 species in the plant kingdom, we have identified a set of 861 rice genes that are evolutionarily conserved among six diverse species within the Poaceae yet lack significant sequence similarity with plant species outside the Poaceae. This set of evolutionarily conserved and lineage-specific rice genes is termed conserved Poaceae-specific genes (CPSGs) to reflect the presence of significant sequence similarity across three separate Poaceae subfamilies. The vast majority of rice CPSGs (86.6%) encode proteins with no putative function or functionally characterized protein domain. For the remaining CPSGs, 8.8% encode an F-box domain-containing protein and 4.5% encode a protein with a putative function. On average, the CPSGs have fewer exons, shorter total gene length, and elevated GC content when compared with genes annotated as either transposable elements (TEs) or those genes having significant sequence similarity in a species outside the Poaceae. Multiple sequence alignments of the CPSGs with sequences from other Poaceae species show conservation across a putative domain, a novel domain, or the entire coding length of the protein. At the genome level, syntenic alignments between sorghum (Sorghum bicolor) and 103 of the 861 rice CPSGs (12.0%) could be made, demonstrating an additional level of conservation for this set of genes within the Poaceae. The extensive sequence similarity in evolutionarily distinct species within the Poaceae family and an additional screen for TE-related structural characteristics and sequence discounts these CPSGs as being misannotated TEs. Collectively, these data confirm that we have identified a specific set of genes that are highly conserved within, as well as specific to, the Poaceae.
    Plant Physiology 01/2008; 145(4):1311-22. · 6.53 Impact Factor
  • Article: EuCAP, a Eukaryotic Community Annotation Package, and its application to the rice genome.
    ABSTRACT: Despite the improvements of tools for automated annotation of genome sequences, manual curation at the structural and functional level can provide an increased level of refinement to genome annotation. The Institute for Genomic Research Rice Genome Annotation (hereafter named the Osa1 Genome Annotation) is the product of an automated pipeline and, for this reason, will benefit from the input of biologists with expertise in rice and/or particular gene families. Leveraging knowledge from a dispersed community of scientists is a demonstrated way of improving a genome annotation. This requires tools that facilitate 1) the submission of gene annotation to an annotation project, 2) the review of the submitted models by project annotators, and 3) the incorporation of the submitted models in the ongoing annotation effort. We have developed the Eukaryotic Community Annotation Package (EuCAP), an annotation tool, and have applied it to the rice genome. The primary level of curation by community annotators (CA) has been the annotation of gene families. Annotation can be submitted by email or through the EuCAP Web Tool. The CA models are aligned to the rice pseudomolecules and the coordinates of these alignments, along with functional annotation, are stored in the MySQL EuCAP Gene Model database. Web pages displaying the alignments of the CA models to the Osa1 Genome models are automatically generated from the EuCAP Gene Model database. The alignments are reviewed by the project annotators (PAs) in the context of experimental evidence. Upon approval by the PAs, the CA models, along with the corresponding functional annotations, are integrated into the Osa1 Genome Annotation. The CA annotations, grouped by family, are displayed on the Community Annotation pages of the project website http://rice.tigr.org, as well as in the Community Annotation track of the Genome Browser. We have applied EuCAP to rice. 
    As of July 2007, the structural and/or functional annotation of 1,094 genes representing 57 families has been deposited and integrated into the current gene set. All of the EuCAP components are open-source, thereby allowing the implementation of EuCAP for the annotation of other genomes. EuCAP is available at http://sourceforge.net/projects/eucap/.
    BMC Genomics 02/2007; 8:388. · 4.07 Impact Factor
  • Article: The TIGR Plant Transcript Assemblies database.
    ABSTRACT: The TIGR Plant Transcript Assemblies (TA) database (http://plantta.tigr.org) uses expressed sequences collected from the NCBI GenBank Nucleotide database for the construction of transcript assemblies. The sequences collected include expressed sequence tags (ESTs) and full-length and partial cDNAs, but exclude computationally predicted gene sequences. The TA database includes all plant species for which more than 1000 EST or cDNA sequences are publicly available. The EST and cDNA sequences are first clustered based on an all-versus-all pairwise sequence comparison, followed by the generation of consensus sequences (TAs) from individual clusters. The clustering and assembly procedures use the TGICL tool, Megablast and the CAP3 assembler. The UniProt Reference Clusters (UniRef100) protein database is used as the reference database for the functional annotation of the assemblies. The transcription orientation of each TA is determined based on the orientation of the alignment with the best protein hit. The TA sequences and annotation are available via web interfaces and FTP downloads. Assemblies can be retrieved by a text-based keyword search or a sequence-based BLAST search. The current version of the TA database is Release 2 (July 17, 2006) and includes a total of 215 plant species.
    Nucleic Acids Research 02/2007; 35(Database issue):D846-51. · 8.03 Impact Factor
  • Article: Comprehensive analysis of alternative splicing in rice and comparative analyses with Arabidopsis.
    ABSTRACT: Recently, genomic sequencing efforts were finished for Oryza sativa (cultivated rice) and Arabidopsis thaliana (Arabidopsis). Additionally, these two plant species have extensive cDNA and expressed sequence tag (EST) libraries. We employed the Program to Assemble Spliced Alignments (PASA) to identify and analyze alternatively spliced isoforms in both species. A comprehensive analysis of alternative splicing was performed in rice that started with >1.1 million publicly available spliced ESTs and over 30,000 full length cDNAs in conjunction with the newly enhanced PASA software. A parallel analysis was performed with Arabidopsis to compare and ascertain potential differences between monocots and dicots. Alternative splicing is a widespread phenomenon (observed in greater than 30% of the loci with transcript support) and we have described nine alternative splicing variations. While alternative splicing has the potential to create many RNA isoforms from a single locus, the majority of loci generate only two or three isoforms and transcript support indicates that these isoforms are generally not rare events. For the alternate donor (AD) and acceptor (AA) classes, the distance between the splice sites for the majority of events was found to be less than 50 basepairs (bp). In both species, the most frequent distance between AA is 3 bp, consistent with reports in mammalian systems. Conversely, the most frequent distance between AD is 4 bp in both plant species, as previously observed in mouse. Most alternative splicing variations are localized to the protein coding sequence and are predicted to significantly alter the coding sequence. Alternative splicing is widespread in both rice and Arabidopsis and these species share many common features. Interestingly, alternative splicing may play a role beyond creating novel combinations of transcripts that expand the proteome. 
    Many isoforms will presumably have negative consequences for protein structure and function, suggesting that their biological role involves post-transcriptional regulation of gene expression.
    BMC Genomics 01/2006; 7:327. · 4.07 Impact Factor