ABSTRACT: Despite the international significance of wheat, its large and complex genome hinders genome sequencing efforts. To assess the impact of selection on this genome, we have assembled genomic regions representing genes for chromosomes 7A, 7B and 7D. We demonstrate that the dispersion of wheat to new environments has shaped the modern wheat genome. Most genes are conserved between the three homoeologous chromosomes. We found differential gene loss that supports current theories on the evolution of wheat, with greater loss observed in the A and B genomes compared with the D. Analysis of intervarietal polymorphisms identified fewer polymorphisms in the D genome, supporting the hypothesis of early gene flow between the tetraploid and hexaploid. The enrichment for genes on the D genome that confer environmental adaptation may be associated with dispersion following wheat domestication. Our results demonstrate the value of applying next-generation sequencing technologies to assemble gene-rich regions of complex genomes and investigate polyploid genome evolution. We anticipate the genome-wide application of this reduced-complexity syntenic assembly approach will accelerate crop improvement efforts not only in wheat, but also in other polyploid crops of significance.
ABSTRACT: All lateral organ development in plants, such as nodulation in legumes, requires the temporal and spatial regulation of genes and gene networks. A total mRNA profiling approach using RNA-seq to target the specific soybean (Glycine max) root tissues responding to compatible rhizobia [i.e. the Zone Of Nodulation (ZON)] revealed a large number of novel, often transient, mRNA changes occurring during the early stages of nodulation. Focusing on the ZON enabled us to discard the majority of root tissues and their developmentally diverse gene transcripts, thereby highlighting the lowly and transiently expressed nodulation-specific genes. It also enabled us to concentrate on a precise moment in early nodule development at each sampling time. We focused on discovering genes regulated specifically by the Bradyrhizobium-produced Nod factor signal, by inoculating roots with either a competent wild-type or incompetent mutant (nodC(-)) strain of Bradyrhizobium japonicum. Collectively, 2915 genes were identified as being differentially expressed, including many known soybean nodulation genes. A number of unknown nodulation gene candidates and soybean orthologues of nodulation genes previously reported in other legume species were also identified. The differential expression of several candidates was confirmed and further characterized via inoculation time-course studies and qRT-PCR. The expression of many genes, including an endo-1,4-β-glucanase, a cytochrome P450 and a TIR-LRR-NBS receptor kinase, was transient, peaking quickly during the initiation of nodule ontogeny. Additional genes were found to be down-regulated. Significantly, a set of differentially regulated genes acting in the gibberellic acid (GA) biosynthesis pathway was discovered, suggesting a novel role of GAs in nodulation.
ABSTRACT: Single nucleotide polymorphisms (SNPs) are the most abundant type of molecular genetic marker and can be used for producing high-resolution genetic maps, marker-trait association studies and marker-assisted breeding. Large polyploid genomes such as wheat present a challenge for SNP discovery because of the potential presence of multiple homoeologs for each gene. AutoSNPdb has been successfully applied to identify SNPs from Sanger sequence data for several species, including barley, rice and Brassica, but the volume of data required to accurately call SNPs in the complex genome of wheat has prevented its application to this important crop. DNA sequencing technology has been revolutionized by the introduction of next-generation sequencing, and it is now possible to generate several million sequence reads in a timely and cost-effective manner. We have produced wheat transcriptome sequence data using 454 sequencing technology and applied this for SNP discovery using a modified autoSNPdb method, which integrates SNP and gene annotation information with a graphical viewer. A total of 4,694,141 sequence reads from three bread wheat varieties were assembled to identify a total of 38,928 candidate SNPs. Each SNP is within an assembly complete with annotation, enabling the selection of polymorphism within genes of interest.
ABSTRACT: The large and complex genome of wheat makes genetic and genomic analysis in this important species both expensive and resource intensive. The application of next-generation sequencing technologies is particularly resource intensive, with at least 17 Gbp of sequence data required to obtain minimal (1×) coverage of the genome. A similar volume of data would represent almost 40× coverage of the rice genome. Progress can be made through the establishment of consortia to produce shared genomic resources. Australian wheat genome researchers, working with Bioplatforms Australia, have collaborated in a national initiative to establish a genetic diversity dataset representing Australian wheat germplasm based on whole genome next-generation sequencing data. Here, we describe the establishment and validation of this resource which can provide a model for broader international initiatives for the analysis of large and complex genomes.
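The coverage figures quoted in the abstract above follow from dividing the volume of sequence data by the genome size. A minimal sketch of that arithmetic is given here; the 17 Gbp wheat figure comes from the abstract, while the rice genome size of roughly 430 Mbp is an assumption (a commonly cited estimate that the abstract does not state), and the function name `coverage` is purely illustrative.

```python
def coverage(sequenced_bp: float, genome_size_bp: float) -> float:
    """Average fold coverage: total sequenced bases divided by genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return sequenced_bp / genome_size_bp


SEQUENCE_DATA_BP = 17e9  # from the abstract: 17 Gbp of sequence data
WHEAT_GENOME_BP = 17e9   # implied by the abstract: 17 Gbp gives 1x coverage
RICE_GENOME_BP = 430e6   # ASSUMPTION: approximate rice genome size, not in the abstract

wheat_cov = coverage(SEQUENCE_DATA_BP, WHEAT_GENOME_BP)
rice_cov = coverage(SEQUENCE_DATA_BP, RICE_GENOME_BP)

print(f"wheat: {wheat_cov:.1f}x, rice: {rice_cov:.1f}x")
```

Under the assumed rice genome size, the same 17 Gbp of data gives a little under 40× coverage, consistent with the "almost 40×" stated in the abstract.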
ABSTRACT: Genomics is playing an increasing role in plant breeding and this is accelerating with the rapid advances in genome technology. Translating the vast abundance of data being produced by genome technologies requires the development of custom bioinformatics tools and advanced databases. These range from large generic databases which hold specific data types for a broad range of species, to carefully integrated and curated databases which act as a resource for the improvement of specific crops. In this review, we outline some of the features of plant genome databases, identify specific resources for the improvement of individual crops and comment on the potential future direction of crop genome databases.
ABSTRACT: Complex Triticeae genomes pose a challenge to genome sequencing efforts due to their size and repetitive nature. Genome sequencing can reveal details of conservation and rearrangements between related genomes. We have applied Illumina second generation sequencing technology to sequence and assemble the low copy and unique regions of Triticum aestivum chromosome arm 7BS, followed by the construction of a syntenic build based on gene order in Brachypodium. We have delimited the position of a previously reported translocation between 7BS and 4AL with a resolution of one or a few genes and report that approximately 13% of genes from 7BS have been translocated to 4AL. An additional 13 genes are found on 7BS which appear to have originated from 4AL. The gene content of the 7DS and 7BS syntenic builds indicates a total of ~77,000 genes in wheat. Within wheat syntenic regions, 7BS and 7DS share 740 genes and a common gene conservation rate of ~39% of the genes from the corresponding regions in Brachypodium, as well as a common rate of colinearity with Brachypodium of ~60%. Comparison of wheat homoeologues revealed that ~84% of genes previously identified in 7DS have a homoeologue on 7BS or 4AL. The conservation rates we have identified among wheat homoeologues and with Brachypodium provide a benchmark of homoeologous gene conservation levels for future comparative genomic analysis. The syntenic build of 7BS is publicly available at http://www.wheatgenome.info.
Theoretical and Applied Genetics 02/2012; 124(3):423-32. · 3.66 Impact Factor
ABSTRACT: • Bread wheat (Triticum aestivum; Poaceae) is a crop plant of great importance. It provides nearly 20% of the world's daily food supply measured by calorie intake, similar to that provided by rice. The yield of wheat has doubled over the last 40 years due to a combination of advanced agronomic practice and improved germplasm through selective breeding. More recently, yield growth has been less dramatic, and a significant improvement in wheat production will be required if demand from the growing human population is to be met. • Next-generation sequencing (NGS) technologies are revolutionizing biology and can be applied to address critical issues in plant biology. These technologies can produce draft genome sequences at significantly lower cost and in less time than traditional technologies. In addition, NGS technologies can be used to assess gene structure and expression, and importantly, to identify heritable genome variation underlying important agronomic traits. • This review provides an overview of the wheat genome and NGS technologies, details some of the problems in applying NGS technology to wheat, and describes how NGS technologies are starting to impact wheat crop improvement.
American Journal of Botany 02/2012; 99(2):365-71. · 2.59 Impact Factor
ABSTRACT: Single nucleotide polymorphisms (SNPs) are becoming the dominant form of molecular marker for genetic and genomic analysis. The advances in second generation DNA sequencing provide opportunities to identify very large numbers of SNPs in a range of species. However, SNP identification remains a challenge for large and polyploid genomes due to their size and complexity. We have developed a pipeline for the robust identification of SNPs in large and complex genomes using Illumina second generation DNA sequence data and demonstrated this by the discovery of SNPs in the hexaploid wheat genome. We have developed a SNP discovery pipeline called SGSautoSNP (Second-Generation Sequencing AutoSNP) and applied this to discover more than 800,000 SNPs between four hexaploid wheat cultivars across chromosomes 7A, 7B and 7D. All SNPs are presented for download and viewing within a public GBrowse database. Validation suggests that more than 93% of the SNPs represent polymorphisms between wheat cultivars and hence are valuable for detailed diversity analysis, marker-assisted selection and genotyping by sequencing. The pipeline produces output in GFF3, VCF, Flapjack or Illumina Infinium design format for further genotyping of diverse populations. As well as providing an unprecedented resource for wheat diversity analysis, the method establishes a foundation for high resolution SNP discovery in other large and complex genomes.
ABSTRACT: The genome of bread wheat (Triticum aestivum) is predicted to be greater than 16 Gbp in size and to consist predominantly of repetitive elements, making the sequencing and assembly of this genome a major challenge. We have reduced genome sequence complexity by isolating chromosome arm 7DS and applied second-generation technology and appropriate algorithmic analysis to sequence and assemble low copy and genic regions of this chromosome arm. The assembly represents approximately 40% of the chromosome arm and all known 7DS genes. Comparison of the 7DS assembly with the sequenced genomes of rice (Oryza sativa) and Brachypodium distachyon identified large regions of conservation. The syntenic relationship between wheat, B. distachyon and O. sativa, along with available genetic mapping data, has been used to produce an annotated draft 7DS syntenic build, which is publicly available at http://www.wheatgenome.info. Our results suggest that the sequencing of isolated chromosome arms can provide valuable information on the gene content of wheat and is a step towards whole-genome sequencing and variation discovery in this important crop.