ABSTRACT: The detection and analysis of genetic variation plays an important role in plant breeding and this role is increasing with the continued development of genome sequencing technologies. Molecular genetic markers are important tools to characterize genetic variation and assist with genomic breeding. Processing and storing the growing abundance of molecular marker data being produced requires the development of specific bioinformatics tools and advanced databases. Molecular marker databases range from species specific through to organism wide and often host a variety of additional related genetic, genomic, or phenotypic information. In this chapter, we will present some of the features of plant molecular genetic marker databases, highlight the various types of marker resources, and predict the potential future direction of crop marker databases.
ABSTRACT: Single-nucleotide polymorphisms (SNPs) are molecular markers based on nucleotide variation and can be used for genotyping assays across populations and to track genomic inheritance. SNPs offer a comprehensive genotyping alternative to whole-genome sequencing for both agricultural and research purposes including molecular breeding and diagnostics, genome evolution and genetic diversity analyses, genetic mapping, and trait association studies. Here genomic SNPs were discovered between four cultivars of the important amphidiploid oilseed species Brassica napus and used to develop a B. napus Infinium™ array containing 5,306 SNPs randomly dispersed across the genome. Assay success was high, with >94% of these producing a reproducible, polymorphic genotype in the 1,070 samples screened. Although the assay was designed for B. napus, successful SNP amplification was achieved in the B. napus progenitor species, Brassica rapa and Brassica oleracea, and to a lesser extent in the related species Brassica nigra. Phylogenetic analysis was consistent with the expected relationships between B. napus individuals. This study presents an efficient custom SNP assay development pipeline in the complex polyploid Brassica genome and demonstrates the utility of the array for high-throughput genotyping in a number of related Brassica species. It also demonstrates the utility of this assay in genotyping resistance genes on chromosome A7, which segregate amongst the 1,070 samples.
ABSTRACT: Despite being a major international crop, our understanding of the wheat genome is relatively poor due to its large size and complexity. To gain a greater understanding of wheat genome diversity, we have identified single nucleotide polymorphisms between 16 Australian bread wheat varieties. Whole-genome shotgun Illumina paired read sequence data were mapped to the draft assemblies of chromosomes 7A, 7B and 7D to identify more than 4 million intervarietal SNPs. SNP density varied between the three genomes, with much greater density observed on the A and B genomes than the D genome. This variation may be a result of substantial gene flow from the tetraploid Triticum turgidum, which possesses A and B genomes, during early co-cultivation of tetraploid and hexaploid wheat. In addition, we examined SNP density variation along the chromosome syntenic builds and identified genes in low-density regions which may have been selected during domestication and breeding. This study highlights the impact of evolution and breeding on the bread wheat genome and provides a substantial resource for trait association and crop improvement. All SNP data are publicly available on a generic genome browser GBrowse at www.wheatgenome.info.
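The SNP density comparison described in the abstract above can be illustrated with a fixed-window count along a chromosome. This is a minimal sketch, not the authors' method: the function name `snp_density`, the window size, and the example positions are all hypothetical, and the abstract does not state how density was actually computed.

```python
from collections import Counter


def snp_density(positions, chrom_length, window=1_000_000):
    """Return SNPs per kbp for each fixed-size window along a chromosome.

    positions    -- 1-based SNP coordinates (hypothetical input)
    chrom_length -- chromosome length in bp
    window       -- window size in bp (an assumed default, not from the source)
    """
    if window <= 0 or chrom_length <= 0:
        raise ValueError("window and chrom_length must be positive")
    n_windows = -(-chrom_length // window)  # ceiling division
    # Assign each in-range SNP to the zero-based index of its window.
    counts = Counter(
        (pos - 1) // window for pos in positions if 1 <= pos <= chrom_length
    )
    densities = []
    for i in range(n_windows):
        start = i * window
        end = min(start + window, chrom_length)  # last window may be shorter
        densities.append(counts.get(i, 0) / ((end - start) / 1000))
    return densities


# Toy example: six SNPs on a 3 kbp sequence, 1 kbp windows.
example = snp_density([10, 500, 1500, 2500, 2600, 2700], 3000, window=1000)
```

Windows with unusually low values in such a profile would be the candidates for the low-density regions the abstract mentions; any threshold for "low" would need to be chosen from the data.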
ABSTRACT: The genome sequence of an organism provides the basis for gene discovery, the analysis of genetic variation and the association of genomic variation with heritable traits. Genome sequence variation can vary from single nucleotide polymorphisms, insertions/deletions to presence/absence of large regions or rearrangements. Second generation sequencing technologies and applied bioinformatics tools can provide an unprecedented insight into genome structure and variation, with applications for understanding the evolution of Brassica species and advancing crop breeding strategies. Advances in data production and bioinformatics capability now make the resequencing of complex polyploid genomes routine. This provides the opportunity to expand genomics from gene and molecular genetic marker discovery to developing a broader understanding of the role of adaptation and selection in diversity. This, in turn, enables comparative genomic approaches to truly comprehend the effect of diversity on genome structure and how this impacts on the form and function of organisms, their growth and development, and response to environment, pests and diseases. The sequencing and re-sequencing of different Brassica varieties has given researchers an unprecedented opportunity to identify genome wide variation. Bioinformatics tools have been produced and applied to interrogate and annotate this abundant data, and genome wide variation has been integrated with genetic maps and phenotypic information. The Brassica genomes, when combined with genome diversity information provide an insight into the evolution of these important crop plants and their wild relatives. Together this information can be used to advance breeding of improved varieties with enhanced agronomic traits.
ABSTRACT: Despite the international significance of wheat, its large and complex genome hinders genome sequencing efforts. To assess the impact of selection on this genome, we have assembled genomic regions representing genes for chromosomes 7A, 7B and 7D. We demonstrate that the dispersion of wheat to new environments has shaped the modern wheat genome. Most genes are conserved between the three homoeologous chromosomes. We found differential gene loss that supports current theories on the evolution of wheat, with greater loss observed in the A and B genomes compared with the D. Analysis of intervarietal polymorphisms identified fewer polymorphisms in the D genome, supporting the hypothesis of early gene flow between the tetraploid and hexaploid. The enrichment for genes on the D genome that confer environmental adaptation may be associated with dispersion following wheat domestication. Our results demonstrate the value of applying next-generation sequencing technologies to assemble gene-rich regions of complex genomes and investigate polyploid genome evolution. We anticipate the genome-wide application of this reduced-complexity syntenic assembly approach will accelerate crop improvement efforts not only in wheat, but also in other polyploid crops of significance.
ABSTRACT: Single nucleotide polymorphisms (SNPs) are becoming the dominant form of molecular marker for genetic and genomic analysis. The advances in second generation DNA sequencing provide opportunities to identify very large numbers of SNPs in a range of species. However, SNP identification remains a challenge for large and polyploid genomes due to their size and complexity. We have developed a pipeline for the robust identification of SNPs in large and complex genomes using Illumina second generation DNA sequence data and demonstrated this by the discovery of SNPs in the hexaploid wheat genome. We have developed a SNP discovery pipeline called SGSautoSNP (Second-Generation Sequencing AutoSNP) and applied this to discover more than 800,000 SNPs between four hexaploid wheat cultivars across chromosomes 7A, 7B and 7D. All SNPs are presented for download and viewing within a public GBrowse database. Validation suggests that more than 93% of the SNPs represent polymorphisms between wheat cultivars and hence are valuable for detailed diversity analysis, marker assisted selection and genotyping by sequencing. The pipeline produces output in GFF3, VCF, Flapjack or Illumina Infinium design format for further genotyping diverse populations. As well as providing an unprecedented resource for wheat diversity analysis, the method establishes a foundation for high resolution SNP discovery in other large and complex genomes.
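The abstract above does not describe the calling rule used by SGSautoSNP, so the following is only an illustrative sketch of one way to call SNPs between cultivars: accept a site when reads agree within each cultivar but the alleles differ between cultivars. The function name, the `min_reads` threshold, and the cultivar labels are assumptions made for this example, not part of the published pipeline.

```python
def call_intervarietal_snp(alleles_by_cultivar, min_reads=2):
    """Return {cultivar: allele} if the site looks like an intervarietal SNP, else None.

    alleles_by_cultivar maps a cultivar name to the list of bases observed
    in its reads at a single reference position (hypothetical input format).
    """
    consensus = {}
    for cultivar, bases in alleles_by_cultivar.items():
        if len(bases) < min_reads:
            continue  # too few reads to judge this cultivar; ignore it
        if len(set(bases)) != 1:
            # Reads disagree within one cultivar. In a polyploid this may
            # reflect mis-mapped homoeologous reads, so the site is rejected.
            return None
        consensus[cultivar] = bases[0]
    # Require at least two distinct alleles among the cultivars that passed.
    if len(set(consensus.values())) < 2:
        return None
    return consensus


# Toy pileups at one position for two hypothetical cultivars.
site = call_intervarietal_snp({"cultivar_A": ["G", "G", "G"], "cultivar_B": ["T", "T"]})
```

Rejecting sites with mixed alleles inside a cultivar trades sensitivity for specificity, which is one plausible way to limit false calls caused by homoeologous copies; the real pipeline may apply different or additional filters.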
ABSTRACT: All lateral organ development in plants, such as nodulation in legumes, requires the temporal and spatial regulation of genes and gene networks. A total mRNA profiling approach using RNA-seq to target the specific soybean (Glycine max) root tissues responding to compatible rhizobia [i.e. the Zone Of Nodulation (ZON)] revealed a large number of novel, often transient, mRNA changes occurring during the early stages of nodulation. Focusing on the ZON enabled us to discard the majority of root tissues and their developmentally diverse gene transcripts, thereby highlighting the lowly and transiently expressed nodulation-specific genes. It also enabled us to concentrate on a precise moment in early nodule development at each sampling time. We focused on discovering genes regulated specifically by the Bradyrhizobium-produced Nod factor signal, by inoculating roots with either a competent wild-type or incompetent mutant (nodC(-)) strain of Bradyrhizobium japonicum. Collectively, 2915 genes were identified as being differentially expressed, including many known soybean nodulation genes. A number of unknown nodulation gene candidates and soybean orthologues of nodulation genes previously reported in other legume species were also identified. The differential expression of several candidates was confirmed and further characterized via inoculation time-course studies and qRT-PCR. The expression of many genes, including an endo-1,4-β-glucanase, a cytochrome P450 and a TIR-LRR-NBS receptor kinase, was transient, peaking quickly during the initiation of nodule ontogeny. Additional genes were found to be down-regulated. Significantly, a set of differentially regulated genes acting in the gibberellic acid (GA) biosynthesis pathway was discovered, suggesting a novel role of GAs in nodulation.
ABSTRACT: Single nucleotide polymorphisms (SNPs) are the most abundant type of molecular genetic marker and can be used for producing high-resolution genetic maps, marker-trait association studies and marker-assisted breeding. Large polyploid genomes such as wheat present a challenge for SNP discovery because of the potential presence of multiple homoeologs for each gene. AutoSNPdb has been successfully applied to identify SNPs from Sanger sequence data for several species, including barley, rice and Brassica, but the volume of data required to accurately call SNPs in the complex genome of wheat has prevented its application to this important crop. DNA sequencing technology has been revolutionized by the introduction of next-generation sequencing, and it is now possible to generate several million sequence reads in a timely and cost-effective manner. We have produced wheat transcriptome sequence data using 454 sequencing technology and applied this for SNP discovery using a modified autoSNPdb method, which integrates SNP and gene annotation information with a graphical viewer. A total of 4,694,141 sequence reads from three bread wheat varieties were assembled to identify a total of 38,928 candidate SNPs. Each SNP is within an assembly complete with annotation, enabling the selection of polymorphism within genes of interest.
ABSTRACT: The large and complex genome of wheat makes genetic and genomic analysis in this important species both expensive and resource intensive. The application of next-generation sequencing technologies is particularly resource intensive, with at least 17 Gbp of sequence data required to obtain minimal (1×) coverage of the genome. A similar volume of data would represent almost 40× coverage of the rice genome. Progress can be made through the establishment of consortia to produce shared genomic resources. Australian wheat genome researchers, working with Bioplatforms Australia, have collaborated in a national initiative to establish a genetic diversity dataset representing Australian wheat germplasm based on whole genome next-generation sequencing data. Here, we describe the establishment and validation of this resource which can provide a model for broader international initiatives for the analysis of large and complex genomes.
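The coverage figures quoted in the abstract above follow from dividing the number of sequenced bases by the genome size. A minimal sketch of that arithmetic, assuming a wheat genome of about 17 Gbp (implied by the 1× figure) and a rice genome of about 0.43 Gbp; the rice value is an approximate assumption that the abstract does not state, and the function name is hypothetical.

```python
def coverage(sequenced_bp: float, genome_size_bp: float) -> float:
    """Return average depth of coverage: sequenced bases divided by genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return sequenced_bp / genome_size_bp


WHEAT_GENOME_BP = 17e9   # implied by "17 Gbp ... (1x) coverage" in the abstract
RICE_GENOME_BP = 0.43e9  # assumed approximate rice genome size, not from the source

data_bp = 17e9  # the volume of sequence data discussed in the abstract
wheat_cov = coverage(data_bp, WHEAT_GENOME_BP)
rice_cov = coverage(data_bp, RICE_GENOME_BP)
print(f"wheat: {wheat_cov:.1f}x, rice: {rice_cov:.1f}x")
```

Under these assumed sizes the same dataset gives 1× for wheat and a little under 40× for rice, consistent with the "almost 40×" stated in the abstract.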
ABSTRACT: Genomics is playing an increasing role in plant breeding and this is accelerating with the rapid advances in genome technology. Translating the vast abundance of data being produced by genome technologies requires the development of custom bioinformatics tools and advanced databases. These range from large generic databases which hold specific data types for a broad range of species, to carefully integrated and curated databases which act as a resource for the improvement of specific crops. In this review, we outline some of the features of plant genome databases, identify specific resources for the improvement of individual crops and comment on the potential future direction of crop genome databases.
ABSTRACT: Complex Triticeae genomes pose a challenge to genome sequencing efforts due to their size and repetitive nature. Genome sequencing can reveal details of conservation and rearrangements between related genomes. We have applied Illumina second generation sequencing technology to sequence and assemble the low copy and unique regions of Triticum aestivum chromosome arm 7BS, followed by the construction of a syntenic build based on gene order in Brachypodium. We have delimited the position of a previously reported translocation between 7BS and 4AL with a resolution of one or a few genes and report that approximately 13% of genes from 7BS have been translocated to 4AL. An additional 13 genes are found on 7BS which appear to have originated from 4AL. The gene content of the 7DS and 7BS syntenic builds indicates a total of ~77,000 genes in wheat. Within wheat syntenic regions, 7BS and 7DS share 740 genes and a common gene conservation rate of ~39% of the genes from the corresponding regions in Brachypodium, as well as a common rate of colinearity with Brachypodium of ~60%. Comparison of wheat homoeologues revealed that ~84% of genes previously identified in 7DS have a homoeologue on 7BS or 4AL. The conservation rates we have identified among wheat homoeologues and with Brachypodium provide a benchmark of homoeologous gene conservation levels for future comparative genomic analysis. The syntenic build of 7BS is publicly available at http://www.wheatgenome.info.
ABSTRACT: • Bread wheat (Triticum aestivum; Poaceae) is a crop plant of great importance. It provides nearly 20% of the world's daily food supply measured by calorie intake, similar to that provided by rice. The yield of wheat has doubled over the last 40 years due to a combination of advanced agronomic practice and improved germplasm through selective breeding. More recently, yield growth has been less dramatic, and a significant improvement in wheat production will be required if demand from the growing human population is to be met. • Next-generation sequencing (NGS) technologies are revolutionizing biology and can be applied to address critical issues in plant biology. Technologies can produce draft sequences of genomes with a significant reduction to the cost and timeframe of traditional technologies. In addition, NGS technologies can be used to assess gene structure and expression, and importantly, to identify heritable genome variation underlying important agronomic traits. • This review provides an overview of the wheat genome and NGS technologies, details some of the problems in applying NGS technology to wheat, and describes how NGS technologies are starting to impact wheat crop improvement.
American Journal of Botany 02/2012; 99(2):365-71. DOI:10.3732/ajb.1100309 · 2.60 Impact Factor
ABSTRACT: The genome of bread wheat (Triticum aestivum) is predicted to be greater than 16 Gbp in size and consist predominantly of repetitive elements, making the sequencing and assembly of this genome a major challenge. We have reduced genome sequence complexity by isolating chromosome arm 7DS and applied second-generation technology and appropriate algorithmic analysis to sequence and assemble low copy and genic regions of this chromosome arm. The assembly represents approximately 40% of the chromosome arm and all known 7DS genes. Comparison of the 7DS assembly with the sequenced genomes of rice (Oryza sativa) and Brachypodium distachyon identified large regions of conservation. The syntenic relationship between wheat, B. distachyon and O. sativa, along with available genetic mapping data, has been used to produce an annotated draft 7DS syntenic build, which is publicly available at http://www.wheatgenome.info. Our results suggest that the sequencing of isolated chromosome arms can provide valuable information of the gene content of wheat and is a step towards whole-genome sequencing and variation discovery in this important crop.