Transcriptome analysis of the roots at early and late seedling stages using Illumina paired-end sequencing and development of EST-SSR markers in radish
Shandong Key Laboratory of Greenhouse Vegetable Biology, Shandong Branch of National Vegetable Improvement Center, Institute of Vegetables, Shandong Academy of Agricultural Sciences, Jinan 250100, China.
Plant Cell Reports (Impact Factor: 3.07). 04/2012; 31(8):1437-47. DOI: 10.1007/s00299-012-1259-3
The tuberous root of radish is an important vegetable, but insufficient transcriptomic and genomic data are currently available to understand the molecular mechanisms underlying tuberous root formation and development. High-throughput transcriptomic sequencing is essential to generate a large transcript sequence data set for gene discovery and molecular marker development. In this study, a total of 107.3 million clean reads were generated using Illumina paired-end sequencing technology. De novo assembly generated 61,554 unigenes with an average length of 820 bp. Based on a sequence similarity search against known proteins or nucleotides, 85.51% (52,634), 90.18% (55,507) and 54% (33,242) of the consensus sequences showed homology with sequences in the Nr, Nt and Swiss-Prot databases, respectively. Of these annotated unigenes, 21,109 and 17,343 were assigned to gene ontology categories and clusters of orthologous groups, respectively. A total of 27,809 unigenes were assigned to 123 pathways in the Kyoto Encyclopedia of Genes and Genomes database. Analysis of transcript differences between libraries from the early and late seedling developmental stages indicated that starch and sucrose metabolism and phenylpropanoid biosynthesis may be the dominant metabolic events during tuberous root formation, and that plant hormones probably play critical roles in regulating this developmental process. In total, 14,641 potential EST-SSRs were identified among the unigenes, and 12,733 primer pairs for 2,511 SSRs were obtained. In summary, this study offers clues to radish tuberous root formation and development, and provides a valuable sequence resource for novel gene discovery and marker-assisted selective breeding in radish.
KEY MESSAGE: De novo assembly and characterization of the radish tuberous root transcriptome; exploration of the mechanism of radish tuberous root formation; development of EST-SSR markers in radish.
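The annotation percentages in the abstract follow directly from the reported hit counts and the 61,554 assembled unigenes. The short Python sketch below reproduces that arithmetic; the counts come from the abstract, while the function and variable names are illustrative only.

```python
# Counts taken from the abstract above; names are illustrative, not from the paper.
TOTAL_UNIGENES = 61_554
HITS = {"Nr": 52_634, "Nt": 55_507, "Swiss-Prot": 33_242}


def percent_annotated(hit_count, total):
    """Return the share of unigenes with a database hit, as a percentage."""
    return 100.0 * hit_count / total


percentages = {db: percent_annotated(n, TOTAL_UNIGENES) for db, n in HITS.items()}

for db, pct in percentages.items():
    print(f"{db}: {pct:.2f}% of unigenes have a hit")
```

The same ratio can be applied to the GO, COG and KEGG assignment counts to compare annotation coverage across databases.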
- "RNA-Seq has been leveraged with de novo transcriptome assembly to learn about plant innovations including parasitism[4,5,6,7] and C4 photosynthesis, plant processes including fruit ripening, drought response, and flavonoid biosynthesis, chemical defenses, and the evolution of sex chromosomes. The recent boom of RNA-Seq studies involving de novo assembly has motivated innovations in assemblers developed specifically for RNA-Seq data (Velvet[14,15], Oases[16,17] (includes Velvet), SOAPdenovo[19,20,21,22,23,24,25,26,27,28,29], SOAPdenovo-Trans, CLC, ABySS, Trinity[5,13,33,34,35,36,37,38]). Comparison of de novo transcriptome assembler performance is hindered by lack of widely used standard quality metrics or rigorous evaluation of a comprehensive selection of assemblers with a transcriptome from a high quality reference genome."
ABSTRACT: Whereas de novo assemblies of RNA-Seq data are being published for a growing number of species across the tree of life, there are currently no broadly accepted methods for evaluating such assemblies. Here we present a detailed comparison of 99 transcriptome assemblies, generated with 6 de novo assemblers including CLC, Trinity, SOAP, Oases, ABySS and NextGENe. Controlled analyses of de novo assemblies for Arabidopsis thaliana and Oryza sativa transcriptomes provide new insights into the strengths and limitations of transcriptome assembly strategies. We find that the leading assemblers generate reassuringly accurate assemblies for the majority of transcripts. At the same time, we find a propensity for assemblers to fail to fully assemble highly expressed genes. Surprisingly, the incidence of true chimeric assemblies is very low for all assemblers. Normalized libraries are reduced in highly abundant transcripts, but they also lack thousands of low abundance transcripts. We conclude that the quality of de novo transcriptome assemblies is best assessed through consideration of a combination of metrics: 1) proportion of reads mapping to an assembly, 2) recovery of conserved, widely expressed genes, 3) N50 length statistics, and 4) the total number of unigenes. We provide benchmark Illumina transcriptome data and introduce SCERNA, a broadly applicable modular protocol for de novo assembly improvement. Finally, our de novo assembly of the Arabidopsis leaf transcriptome revealed ~20 putative Arabidopsis genes lacking in the current annotation.
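Of the four metrics listed in this abstract, the N50 length is the one most easily misread. A minimal Python sketch of the usual definition follows: N50 is the length L such that sequences of length L or longer contain at least half of all assembled bases. The function name and the toy lengths are illustrative and are not taken from the cited study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or unigene lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example with made-up lengths (bp).
example_lengths = [100, 200, 300, 400]
print(n50(example_lengths))
```

Because N50 rewards long sequences regardless of correctness, the abstract recommends reading it alongside read-mapping rate, conserved-gene recovery and unigene count rather than on its own.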
- "Fortunately, with the advent of next-generation sequencing (NGS) technology in recent years, genomic information in previously uncharacterized systems can be generated (Grabherr et al. 2011). Currently, high-throughput RNA sequencing (RNA-Seq) has emerged as a powerful and cost-efficient tool for transcriptome analysis, and it has also been used for transcript profiling in various nonmodel plant species, including potato (Solanum tuberosum L.) (Zhang et al. 2014), pennycress (Thlaspi arvense) (Dorn et al. 2013), sweet potato (Ipomoea batatas) (Firon et al. 2013) and radish (Raphanus sativus L.) (Wang et al. 2012a)."
ABSTRACT: Ipomoea nil is widely used as an ornamental plant because of its wide range of flower colors, but limited transcriptome and genomic data hinder research on it. Using the Illumina platform, transcriptome profiling of I. nil was performed through high-throughput sequencing, which has proven to be a rapid and cost-effective means of characterizing gene content. Our goal is to use the resulting information to facilitate research on flowering and flower color formation in I. nil. In total, 268 million unique Illumina RNA-Seq reads were produced and used in the transcriptome assembly. These reads were assembled into 220,117 contigs, of which 137,307 contigs were annotated using the GO and KEGG databases. Based on the functional annotations, a total of 89,781 contigs were assigned 455,335 GO term annotations. Meanwhile, 17,418 contigs received pathway annotations and were functionally assigned to 144 KEGG pathways. Our transcriptome revealed at least 55 contigs as probable flowering-related genes in I. nil, and we also identified 25 contigs that encode key enzymes in the phenylpropanoid biosynthesis pathway. Based on the analysis of gene expression profiles, in the phenylpropanoid biosynthesis pathway of I. nil, the repression of lignin biosynthesis might redirect metabolic flux into anthocyanin biosynthesis. This may be the most likely reason that I. nil has a high anthocyanin content, especially in its flowers. Additionally, 15,537 simple sequence repeats (SSRs) were detected using the MISA software, and these SSRs are expected to benefit future breeding work. Moreover, the information uncovered in this study will also serve as a valuable resource for understanding the flowering and flower color formation mechanisms in I. nil.
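To illustrate what SSR detection involves, the Python sketch below scans a sequence for perfect tandem repeats with a regular expression. It is a simplified stand-in and not the MISA implementation; the minimum repeat counts in MIN_REPEATS are assumed thresholds chosen for illustration, and neither abstract on this page states the settings actually used.

```python
import re

# Minimum number of tandem copies per motif length.
# These thresholds are assumptions for illustration, not values from the source.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (motif, copies, start, end) for each perfect SSR found in seq."""
    seq = seq.upper()
    found = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One motif of unit_len bases followed by at least min_rep - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT" is "AT" x 2).
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            copies = len(match.group(0)) // unit_len
            found.append((motif, copies, match.start(), match.end()))
    return found


# Made-up test sequence: an (AT)7 repeat and a (CAG)5 repeat with short flanks.
toy = "GGC" + "AT" * 7 + "CCT" + "CAG" * 5 + "T"
for hit in find_ssrs(toy):
    print(hit)
```

This sketch reports only perfect repeats; compound or interrupted SSRs, which dedicated tools can also report, are outside its scope.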
- "After removing the reads with adaptors, reads with unknown nucleotides larger than 5% and low quality reads, 66,110,340 clean PE reads consisting of 5,949,930,600 nucleotides (nt) were obtained with an average GC content of 47.34% (Table 1). The output was similar to a previous study on radish transcriptome from two root cDNA libraries, which generated a total of 53.6 million and 53.7 million clean reads, respectively. All high-quality clean reads were assembled into 150,455 contigs with an average length of 299 bp, and the length distribution of the assembled contigs was as shown in Additional file 1A."
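The "unknown nucleotides larger than 5%" criterion in this excerpt can be sketched as a simple per-read check. The Python below is an illustration only: the function name is hypothetical, adaptor removal and quality filtering are not shown, and treating a read with exactly 5% N as clean is an assumption based on the wording "larger than 5%". The last lines derive the mean read length implied by the reported totals.

```python
def passes_n_filter(read, max_n_fraction=0.05):
    """Return True if the fraction of unknown bases (N) does not exceed the cutoff.

    Assumption: reads with exactly max_n_fraction unknown bases are kept,
    since the excerpt removes reads with "larger than 5%" unknown nucleotides.
    """
    if not read:
        return False
    n_fraction = read.upper().count("N") / len(read)
    return n_fraction <= max_n_fraction


# Made-up reads for illustration.
reads = ["ACGT" * 5, "N" + "A" * 19, "NN" + "A" * 18]
clean_reads = [r for r in reads if passes_n_filter(r)]
print(len(clean_reads), "of", len(reads), "reads kept")

# Mean read length implied by the totals quoted in the excerpt.
total_nt = 5_949_930_600
total_reads = 66_110_340
mean_read_length = total_nt / total_reads
print(mean_read_length)
```

The two reported totals are consistent with a uniform read length, since the nucleotide count divides evenly by the read count.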
ABSTRACT: Radish (Raphanus sativus L.), is an important root vegetable crop worldwide. Glucosinolates in the fleshy taproot significantly affect the flavor and nutritional quality of radish. However, little is known about the molecular mechanisms underlying glucosinolate metabolism in radish taproots. The limited availability of radish genomic information has greatly hindered functional genomic analysis and molecular breeding in radish.
In this study, a high-throughput, large-scale RNA sequencing technology was employed to characterize the de novo transcriptome of radish roots at different stages of development. Approximately 66.11 million paired-end reads representing 73,084 unigenes with a N50 length of 1,095 bp, and a total length of 55.73 Mb were obtained. Comparison with the publicly available protein database indicates that a total of 67,305 (about 92.09% of the assembled unigenes) unigenes exhibit similarity (e --value <= 1.0e-5) to known proteins. The functional annotation and classification including Gene Ontology (GO), Clusters of Orthologous Group (COG) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis revealed that the main activated genes in radish taproots are predominately involved in basic physiological and metabolic processes, biosynthesis of secondary metabolite pathways, signal transduction mechanisms and other cellular components and molecular function related terms. The majority of the genes encoding enzymes involved in glucosinolate (GS) metabolism and regulation pathways were identified in the unigene dataset by targeted searches of their annotations. A number of candidate radish genes in the glucosinolate metabolism related pathways were also discovered, from which, eight genes were validated by T-A cloning and sequencing while four were validated by quantitative RT-PCR expression profiling.
The ensuing transcriptome dataset provides a comprehensive sequence resource for molecular genetics research in radish. It will serve as an important public information platform to further understanding of the molecular mechanisms involved in biosynthesis and metabolism of the related nutritional and flavor components during taproot formation in radish.