ABSTRACT: Upland cotton (Gossypium hirsutum L.) has a narrow germplasm base, which constrains marker
development and hampers intraspecific breeding. A pressing need exists for high-throughput
single nucleotide polymorphism (SNP) markers that can be readily applied to germplasm in
breeding and breeding-related research programs. Despite progress made in developing new
sequencing technologies during the past decade, the cost of sequencing remains substantial when
one is dealing with numerous samples and large genomes. Several strategies have been proposed
to lower the cost of sequencing for multiple genotypes of large-genome species like cotton, such
as transcriptome sequencing and reduced representation DNA sequencing. This paper reports the
development of a transcriptome assembly of the inbred line Texas Marker-1 (TM-1), a genetic
standard for cotton, its usefulness as a reference for RNAseq-based SNP identification, and the
availability of transcriptome sequences of four other cotton cultivars. An assembly of TM-1 was
made using Roche/454 transcriptome reads combined with an assembly of all available public
EST sequences of TM-1. The TM-1 assembly consists of 72,450 contigs with a total of 70 Mbp.
Functional predictions of the transcripts were estimated by alignment to selected protein
databases. Transcriptome sequences of the five lines, including TM-1, were obtained using an
Illumina Genome Analyzer-II, and the short reads were mapped to the TM-1 assembly to
discover SNPs among the five lines. We identified >14,000 unfiltered allelic SNPs, of which
~3,700 SNPs were retained for assay development after applying several rigorous filters. This
paper reports availability of the reference transcriptome assembly and shows its utility in
developing intraspecific SNP markers in upland cotton.
The Plant Genome 02/2015; 8(1). DOI:10.3835/plantgenome2014.10.0068 · 3.88 Impact Factor
ABSTRACT: Background
Cotton (Gossypium spp.) is the largest source of natural fibers for textiles and is an important crop worldwide. Crop production consists primarily of G. hirsutum L., an allotetraploid. However, elite cultivars express very little variation because of the species' monophyletic origin, domestication, and further bottlenecks due to selection. Conversely, wild cotton species harbor extensive genetic diversity of prospective utility to improve many beneficial agronomic traits, fiber characteristics, and resistance to disease and drought. Introgression of traits from wild species can provide a natural way to incorporate advantageous traits through breeding to generate higher-producing cotton cultivars and more sustainable production systems. Interspecific introgression efforts by conventional methods are very time-consuming and costly, but can be expedited using marker-assisted selection.
Using transcriptome sequencing, we have developed the first gene-associated single nucleotide polymorphism (SNP) markers for the wild cotton species G. tomentosum, G. mustelinum, G. armourianum and G. longicalyx. Markers were also developed for a secondary cultivated species, G. barbadense cv. 3–79. A total of 62,832 non-redundant SNP markers were developed from the five wild species; these markers are directly associated with genes and can be utilized for interspecific germplasm introgression into cultivated G. hirsutum. Over 500 of the G. barbadense markers have been validated by whole-genome radiation hybrid mapping. Overall, 1,060 SNPs from the five different species have been screened and shown to produce acceptable genotyping assays.
This large set of 62,832 SNPs relative to cultivated G. hirsutum will allow for the first high-density mapping of genes from five wild species that affect traits of interest, including beneficial agronomic and fiber characteristics. Upon mapping, the markers can be utilized for marker-assisted introgression of new germplasm into cultivated cotton and in subsequent breeding of agronomically adapted types, including cultivar development.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-945) contains supplementary material, which is available to authorized users.
ABSTRACT: Phosphorus, in its orthophosphate form (Pi), is one of the most limiting macronutrients in soils for plant growth and development. However, the genome-wide molecular mechanisms contributing to plant acclimation to Pi deficiency remain largely unknown. White lupin (Lupinus albus L.) has evolved unique adaptations for growth in Pi-deficient soils, including the development of cluster roots to increase root surface area. In this study, we utilized RNA-Seq technology to assess global gene expression in white lupin cluster roots, normal roots, and leaves in response to Pi supply. We de novo assembled 277,224,180 Illumina reads from 12 cDNA libraries to build the first white lupin gene index (LAGI 1.0). This index contains 125,821 unique sequences with an average length of 1,155 bp. Of these sequences, 50,734 were transcriptionally active (RPKM ≥ 3), representing approximately 7.8% of the Lupinus albus genome, using the predicted genome size of Lupinus angustifolius as a reference. We identified a total of 2,128 sequences differentially expressed in response to Pi deficiency with a ≥ 2-fold change and a p-value ≤ 0.05. Twelve sequences were consistently differentially expressed due to Pi deficiency stress in three species, making them ideal candidates to monitor the Pi status of plants. Additionally, classic physiological experiments were coupled with RNA-Seq data to examine the role of cytokinin and gibberellic acid in Pi deficiency-induced cluster root development. This global gene expression analysis provides new insights into the biochemical and molecular mechanisms involved in the acclimation to Pi deficiency.
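The two cutoffs quoted in this abstract (RPKM ≥ 3 for a transcriptionally active sequence; ≥ 2-fold change with p ≤ 0.05 for differential expression) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Transcript` record and its field names are hypothetical, the p-values are taken as given, and two choices are assumptions made here, namely that "active" means the RPKM cutoff is met in at least one condition and that the fold change is counted in either direction.

```python
import math
from dataclasses import dataclass


def rpkm(read_count: int, length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (length_bp * total_mapped_reads)


@dataclass
class Transcript:
    name: str             # hypothetical sequence identifier
    rpkm_plus_pi: float   # expression with sufficient Pi
    rpkm_minus_pi: float  # expression under Pi deficiency
    p_value: float        # supplied by an upstream statistical test


def is_active(t: Transcript, min_rpkm: float = 3.0) -> bool:
    # Assumption: the cutoff only has to be met in one of the two conditions.
    return max(t.rpkm_plus_pi, t.rpkm_minus_pi) >= min_rpkm


def fold_change(t: Transcript) -> float:
    # Assumption: direction is ignored, so the ratio is always >= 1.
    hi = max(t.rpkm_plus_pi, t.rpkm_minus_pi)
    lo = min(t.rpkm_plus_pi, t.rpkm_minus_pi)
    if lo == 0:
        return math.inf if hi > 0 else 1.0
    return hi / lo


def is_differential(t: Transcript, min_fold: float = 2.0,
                    max_p: float = 0.05) -> bool:
    return fold_change(t) >= min_fold and t.p_value <= max_p
```

A sequence at 4 RPKM with Pi and 10 RPKM without it, with p = 0.01, passes both filters; one at 10 versus 6 RPKM fails the fold-change cutoff regardless of its p-value.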
ABSTRACT: Alfalfa [Medicago sativa (L.) subsp. sativa], a widely grown perennial forage, has potential for development as a cellulosic ethanol feedstock. However, the genomics of alfalfa, a non-model species, is still in its infancy. The recent advent of RNA-Seq, a massively parallel sequencing method for transcriptome analysis, provides an opportunity to expand the identification of alfalfa genes and polymorphisms, and conduct in-depth transcript profiling.
Cell walls in stems of alfalfa genotype 708 have higher cellulose and lower lignin concentrations compared to cell walls in stems of genotype 773. Using the Illumina GA-II platform, a total of 198,861,304 expressed sequence tags (ESTs, 76 bp in length) were generated from cDNA libraries derived from elongating stem (ES) and post-elongation stem (PES) internodes of 708 and 773. In addition, 341,984 ESTs were generated from ES and PES internodes of genotype 773 using the GS FLX Titanium platform. The first alfalfa (Medicago sativa) gene index (MSGI 1.0) was assembled using the Sanger ESTs available from GenBank, the GS FLX Titanium EST sequences, and the de novo assembled Illumina sequences. MSGI 1.0 contains 124,025 unique sequences including 22,729 tentative consensus sequences (TCs), 22,315 singletons and 78,981 pseudo-singletons. We identified a total of 1,294 simple sequence repeats (SSRs) among the sequences in MSGI 1.0. In addition, a total of 10,826 single nucleotide polymorphisms (SNPs) were predicted between the two genotypes. Out of 55 SNPs randomly selected for experimental validation, 47 (85%) were polymorphic between the two genotypes. We also identified numerous allelic variations within each genotype. Digital gene expression analysis identified numerous candidate genes that may play a role in stem development as well as candidate genes that may contribute to the differences in cell wall composition in stems of the two genotypes.
Our results demonstrate that RNA-Seq can be successfully used for gene identification, polymorphism detection and transcript profiling in alfalfa, a non-model, allogamous, autotetraploid species. The alfalfa gene index assembled in this study, and the SNPs, SSRs and candidate genes identified can be used to improve alfalfa as a forage crop and cellulosic feedstock.
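The counts reported for MSGI 1.0 and the SNP validation can be checked with a few lines of Python. All input numbers come from the abstract; the variable names are illustrative, and the final extrapolation is only a rough estimate made here, assuming the validation rate of the 55-SNP sample holds across all predicted SNPs, which the abstract does not claim.

```python
def validation_rate(confirmed: int, tested: int) -> float:
    """Fraction of tested SNPs confirmed as polymorphic."""
    return confirmed / tested


# Composition of the gene index as reported in the abstract.
tcs, singletons, pseudo_singletons = 22_729, 22_315, 78_981
index_size = tcs + singletons + pseudo_singletons

# 47 of 55 randomly selected SNPs were confirmed experimentally.
rate = validation_rate(47, 55)

# Rough extrapolation (an assumption, not a result from the paper).
predicted_snps = 10_826
expected_polymorphic = round(predicted_snps * rate)

print(index_size, f"{rate:.1%}", expected_polymorphic)
```

The three components sum to the reported 124,025 unique sequences, and 47/55 is about 85.5%, in line with the 85% quoted in the abstract.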
ABSTRACT: MicroRNAs (miRNAs) play a pivotal role in post-transcriptional regulation of gene expression in plants. Information on miRNAs in legumes is as yet scarce. This work investigates miRNAs in an agronomically important legume, common bean (Phaseolus vulgaris). A hybridization approach employing miRNA macroarrays, printed with oligonucleotides complementary to 68 known miRNAs, was used to detect miRNAs in the leaves, roots and nodules of control and nutrient-stressed (phosphorus, nitrogen, or iron deficiency; acidic pH; and manganese toxicity) common bean plants. Thirty-three miRNAs were expressed in control plants and another five were only expressed under stress conditions. The miRNA expression ratios (stress:control) were evaluated using principal component and hierarchical cluster analyses. A group of miRNAs responded to nearly all stresses in the three organs analyzed. Other miRNAs showed organ-specific responses. Most of the nodule-responsive miRNAs showed up-regulation. miRNA blot expression analysis confirmed the macroarray results. Novel miRNA target genes were proposed for common bean and the expression of selected targets was evaluated by quantitative reverse transcriptase-polymerase chain reaction. In addition to the detection of previously reported stress-responsive miRNAs, we discovered novel common bean stress-responsive miRNAs for manganese toxicity. Our data provide a foundation for evaluating the individual roles of miRNAs in common bean.
New Phytologist 08/2010; 187(3):805-18. DOI:10.1111/j.1469-8137.2010.03320.x · 6.55 Impact Factor
ABSTRACT: Common bean (Phaseolus vulgaris L.) and soybean (Glycine max) both belong to the Phaseoleae tribe and share significant coding sequence homology. This suggests that the GeneChip® Soybean Genome Array (soybean GeneChip) may be used for gene expression studies using common bean.
To evaluate the utility of the soybean GeneChip for transcript profiling of common bean, we hybridized cRNAs purified from nodule, leaf, and root of common bean and soybean in triplicate to the soybean GeneChip. Initial data analysis showed a decreased sensitivity and accuracy of measuring differential gene expression in common bean cross-species hybridization (CSH) GeneChip data compared to that of soybean. We employed a method that masked putative probes targeting inter-species variable (ISV) regions between common bean and soybean. A masking signal intensity threshold was selected that optimized both sensitivity and accuracy of measuring differential gene expression. After masking for ISV regions, the number of differentially-expressed genes identified in common bean was increased by 2.8-fold reflecting increased sensitivity. Quantitative RT-PCR (qRT-PCR) analysis of 20 randomly selected genes and purine-ureide pathway genes demonstrated an increased accuracy of measuring differential gene expression after masking for ISV regions. We also evaluated masked probe frequency per probe set to gain insight into the sequence divergence pattern between common bean and soybean. The sequence divergence pattern analysis suggested that the genes for basic cellular functions and metabolism were highly conserved between soybean and common bean. Additionally, our results show that some classes of genes, particularly those associated with environmental adaptation, are highly divergent.
The soybean GeneChip is a suitable cross-species platform for transcript profiling in common bean when used in combination with the masking protocol described. In addition to transcript profiling, CSH of the GeneChip in combination with masking probes in the ISV regions can be used for comparative ecological and/or evolutionary genomics studies.
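The masking idea summarized in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the published protocol: it assumes that a probe whose signal intensity falls below a chosen threshold is treated as targeting an ISV region and dropped, and that a probe set is then summarized as the mean log2 intensity of the remaining probes. The function names and the example values are hypothetical.

```python
import math
from statistics import mean
from typing import Optional, Sequence


def mask_probes(intensities: Sequence[float], threshold: float) -> list:
    """Keep only probes whose signal reaches the masking threshold."""
    return [x for x in intensities if x >= threshold]


def probe_set_signal(intensities: Sequence[float],
                     threshold: float = 0.0) -> Optional[float]:
    """Mean log2 intensity of the unmasked probes in one probe set.

    Returns None when every probe is masked, so the probe set can be
    excluded from downstream analysis.
    """
    kept = mask_probes(intensities, threshold)
    if not kept:
        return None
    return mean(math.log2(x) for x in kept)


# Hypothetical probe set: two probes hybridize well, two poorly.
probe_set = [1024.0, 16.0, 1024.0, 8.0]
unmasked = probe_set_signal(probe_set)               # all four probes
masked = probe_set_signal(probe_set, threshold=100)  # two probes kept
```

In this toy case the two weak probes pull the unmasked summary down, while the masked summary reflects only the probes that hybridized well. The abstract reports that the threshold was chosen to balance sensitivity against accuracy, so in practice it would be tuned rather than fixed at an arbitrary value such as the 100 used here.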
ABSTRACT: The GeneChip® Medicago Genome Array, developed for Medicago truncatula, is a suitable platform for transcript profiling in tetraploid alfalfa [Medicago sativa (L.) subsp. sativa]. However, previous research involving cross-species hybridization (CSH) has shown that sequence variation between two species can bias transcript profiling by decreasing sensitivity (number of expressed genes detected) and the accuracy of measuring fold-differences in gene expression.
Transcript profiling using the Medicago GeneChip® was conducted with elongating stem (ES) and post-elongation stem (PES) internodes from alfalfa genotypes 252 and 1283 that differ in stem cell wall concentrations of cellulose and lignin. A protocol was developed that masked probes targeting inter-species variable (ISV) regions of alfalfa transcripts. A probe signal intensity threshold was selected that optimized both sensitivity and accuracy. After masking for both ISV regions and previously identified single-feature polymorphisms (SFPs), the number of differentially expressed genes between the two genotypes in both ES and PES internodes was approximately 2-fold greater than the number detected prior to masking. Regulatory genes, including transcription factor and receptor kinase genes that may play a role in development of secondary xylem, were significantly over-represented among genes up-regulated in 252 PES internodes compared to 1283 PES internodes. Several cell wall-related genes were also up-regulated in genotype 252 PES internodes. Real-time quantitative RT-PCR of differentially expressed regulatory and cell wall-related genes demonstrated increased sensitivity and accuracy after masking for both ISV regions and SFPs. Over 1,000 genes that were differentially expressed in ES and PES internodes of genotypes 252 and 1283 were mapped onto putative orthologous loci on M. truncatula chromosomes. Clustering simulation analysis of the differentially expressed genes suggested co-expression of some neighbouring genes on Medicago chromosomes.
The problems associated with transcript profiling in alfalfa stems using the Medicago GeneChip as a CSH platform were mitigated by masking probes targeting ISV regions and SFPs. Using this masking protocol resulted in the identification of numerous candidate genes that may contribute to differences in cell wall concentration and composition of stems of two alfalfa genotypes.
ABSTRACT: Advances in alfalfa [Medicago sativa (L.) subsp. sativa] breeding, molecular genetics, and genomics have been slow because this crop is an allogamous autotetraploid (2n = 4x = 32) with complex polysomic inheritance and few genomic resources. Increasing cellulose and decreasing lignin in alfalfa stem cell walls would improve this crop as a cellulosic ethanol feedstock. We conducted genome-wide analysis of single-feature polymorphisms (SFPs) of two alfalfa genotypes (252, 1283) that differ in stem cell wall lignin and cellulose concentrations. SFP analysis was conducted using the Medicago GeneChip (Affymetrix, Santa Clara, CA) as a cross-species platform. Analysis of GeneChip expression data files of alfalfa stem internodes of genotypes 252 and 1283 at two growth stages (elongating, post-elongation) revealed 10,890 SFPs in 8,230 probe sets. Validation analysis by polymerase chain reaction (PCR)-sequencing of a random sample of SFPs indicated a 17% false discovery rate. Functional classification and over-representation analysis showed that genes involved in photosynthesis, stress response and cell wall biosynthesis were highly enriched among SFP-harboring genes. The Medicago GeneChip is a suitable cross-species platform for detecting SFPs in tetraploid alfalfa.
The Plant Genome 11/2009; 2(3). DOI:10.3835/plantgenome2009.03.0014 · 3.88 Impact Factor
ABSTRACT: Single-feature polymorphism (SFP) discovery is a rapid and cost-effective approach to identify DNA polymorphisms. However, high false positive rates and/or low sensitivity are prevalent in previously described SFP detection methods. This work presents a new computing method for SFP discovery.
The probe affinity differences and affinity shape powers formed by the neighboring probes in each probe set were computed into SFP weight scores. This method was validated by known sequence information and was comprehensively compared with previously-reported methods using the same datasets. A web application using this algorithm has been implemented for SFP detection. Using this method, we identified 364 SFPs in a barley near-isogenic line pair carrying either the wild type or the mutant uniculm2 (cul2) allele. Most of the SFP polymorphisms were identified on chromosome 6H in the vicinity of the Cul2 locus.
This SFP discovery method exhibits better performance in specificity and sensitivity over previously-reported methods. It can be used for other organisms for which GeneChip technology is available. The web-based tool will facilitate SFP discovery. The 364 SFPs discovered in a barley near-isogenic line pair provide a set of genetic markers for fine mapping and future map-based cloning of the Cul2 locus.
ABSTRACT: We have initiated a genome-wide transcript profiling study using the model legume Medicago truncatula to identify putative genes related to cell wall biosynthesis and regulatory function in legumes. We used the GeneChip® Medicago Genome Array to compare transcript abundance in elongating versus postelongation stem internode segments of two M. truncatula accessions and two Medicago sativa (alfalfa) clones with contrasting stem cell wall concentration and composition. Hundreds of differentially expressed probe sets between elongating and postelongation stem segments showed similar patterns of gene expression in the model legume and cultivated alfalfa. Differentially expressed genes included genes with putative functions associated with primary and secondary cell wall biosynthesis and growth. Mining of public microarray data for coexpressed genes with two marker genes for secondary cell wall synthesis identified additional candidate secondary cell wall-related genes. Coexpressed genes included protein kinases, transcription factors, and unclassified groups that were not previously reported with secondary cell wall-associated genes. M. truncatula has been recognized as an excellent model plant for legume genomics. The stem tissue transcriptome analysis, described here, indicates that M. truncatula has utility as a model plant for cell wall genomics in legumes in general and shows excellent potential for translating gene discoveries to its close relative, cultivated alfalfa, in particular. The natural variation for stem cell wall traits in Medicago may offer a new tool to study an expanded repertoire of valuable agronomic traits in related species, including woody dicots in the eurosid I clade.
BioEnergy Research 06/2009; 2(1-2). DOI:10.1007/s12155-009-9034-1 · 3.40 Impact Factor
ABSTRACT: Gene expression during the early stages of fiber cell development and in allopolyploid crops is poorly understood. Here we report computational and expression analyses of 32 789 high-quality ESTs derived from Gossypium hirsutum L. Texas Marker-1 (TM-1) immature ovules (GH_TMO). The ESTs were assembled into 8540 unique sequences including 4036 tentative consensus sequences (TCs) and 4504 singletons, representing approximately 15% of the unique sequences in the cotton EST collection. Compared with approximately 178 000 existing ESTs derived from elongating fibers and non-fiber tissues, GH_TMO ESTs showed a significant increase in the percentage of genes encoding putative transcription factors such as MYB and WRKY and genes encoding predicted proteins involved in auxin, brassinosteroid (BR), gibberellic acid (GA), abscisic acid (ABA) and ethylene signaling pathways. Cotton homologs related to MIXTA, MYB5, GL2 and eight genes in the auxin, BR, GA and ethylene pathways were induced during fiber cell initiation but repressed in the naked seed mutant (N1N1) that is impaired in fiber formation. The data agree with the known roles of MYB and WRKY transcription factors in Arabidopsis leaf trichome development and the well-documented phytohormonal effects on fiber cell development in immature cotton ovules cultured in vitro. Moreover, the phytohormonal pathway-related genes were induced prior to the activation of MYB-like genes, suggesting an important role of phytohormones in cell fate determination. Significantly, AA sub-genome ESTs of all functional classifications including cell-cycle control and transcription factor activity were selectively enriched in G. hirsutum L., an allotetraploid derived from polyploidization between AA and DD genome species, a result consistent with the production of long lint fibers in AA genome species.
These results suggest general roles for genome-specific, phytohormonal and transcriptional gene regulation during the early stages of fiber cell development in cotton allopolyploids.
The Plant Journal 10/2006; 47(5):761-75. DOI:10.1111/j.1365-313X.2006.02829.x · 6.82 Impact Factor