The First Insight into the Tissue Specific Taxus Transcriptome via Illumina Second Generation Sequencing

Biotechnology Institute, Dalian Jiaotong University, Dalian, China.
PLoS ONE (Impact Factor: 3.53). 06/2011; 6(6):e21220. DOI: 10.1371/journal.pone.0021220
Source: PubMed

ABSTRACT Illumina second generation sequencing is now an efficient route for generating enormous sequence collections that represent expressed genes and quantify their expression levels. Taxus is a globally endangered gymnosperm genus and an important anti-cancer medicinal resource, but the large and complex genomes of Taxus have hindered the development of genomic resources. Its tissue-specific transcriptome has not yet been studied, and no study has examined the association between the plant transcriptome and metabolome with respect to tissue type.
We performed de novo assembly of the Taxus mairei transcriptome using Illumina paired-end sequencing technology. In a single run, we produced 13,737,528 sequencing reads corresponding to 2.03 Gb total nucleotides. These reads were assembled into 36,493 unique sequences. Based on similarity searches against known proteins, 23,515 unigenes had BLAST hits at an E-value cut-off of 10⁻⁵. Furthermore, we investigated the transcriptome differences among three Taxus tissues using a tag-based digital gene expression system. We obtained a sequencing depth of over 3.15 million tags per sample and identified a large number of genes associated with tissue-specific functions and the taxane biosynthetic pathway. The expression of the taxane biosynthetic genes is significantly higher in the root than in the leaf and the stem, and high activity of the taxane-producing pathway in the root was also revealed by metabolomic analyses. Moreover, many antisense transcripts and novel transcripts were found; clusters with similar differential expression patterns, enriched GO terms and enriched metabolic pathways among the differentially expressed genes were revealed for the first time.
Our data provide the most comprehensive sequence resource available for Taxus study and will help define mechanisms of tissue-specific functions and secondary metabolism in non-model plant organisms.
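
The annotation step in the abstract, counting unigenes with a BLAST hit at an E-value cut-off of 10⁻⁵, can be illustrated with a minimal sketch. It assumes hits are stored in the standard 12-column tabular BLAST output, where the query ID is the first column and the E-value the eleventh; the abstract does not state the file format or tooling the authors used, so the function name `unigenes_with_hits`, the inclusive comparison, and the example rows are hypothetical.

```python
import csv
import io

# Threshold quoted in the abstract; treating it as inclusive is an assumption.
EVALUE_CUTOFF = 1e-5


def unigenes_with_hits(handle, cutoff=EVALUE_CUTOFF):
    """Return the set of query IDs with at least one hit at or below the cutoff.

    Assumes tab-separated BLAST tabular output: column 1 is the query ID,
    column 11 is the E-value.
    """
    annotated = set()
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        query_id, evalue = row[0], float(row[10])
        if evalue <= cutoff:
            annotated.add(query_id)
    return annotated


# Made-up rows for illustration only; these are not data from the study.
example = io.StringIO(
    "Unigene1\tP12345\t85.0\t200\t30\t0\t1\t600\t1\t200\t1e-50\t300\n"
    "Unigene1\tP54321\t70.0\t180\t54\t0\t1\t540\t1\t180\t3e-20\t150\n"
    "Unigene2\tQ67890\t40.0\t50\t30\t0\t1\t150\t1\t50\t0.002\t35\n"
    "Unigene3\tA11111\t60.0\t120\t48\t0\t1\t360\t1\t120\t2e-6\t110\n"
)
hits = unigenes_with_hits(example)
print(len(hits), "unigenes with a hit at the cutoff")
```

Collecting query IDs in a set means a unigene with several qualifying hits is counted once, which matches reporting a number of annotated unigenes rather than a number of hits.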

Available from: Guangbo Ge, Jun 27, 2015
  • Source
    ABSTRACT: Transcriptome sequencing is a powerful tool for the assessment of gene expression and the identification and characterization of molecular markers in non-model organisms. Rhodiola algida L. (Crassulaceae), endemic to the Qinghai-Tibetan Plateau, has long been used in traditional Chinese medicine to prevent altitude sickness and eliminate fatigue. Illumina-based high-throughput transcriptome sequencing of aboveground and underground tissues of R. algida respectively yielded 5.40 million and 5.18 million clean reads. A total of 82,664 unigenes averaging 577 bp in length were generated from the aboveground clean reads, with 86,237 unigenes of 502-bp mean length obtained from the underground tissues. Of 55,028 unigenes compared with sequences in UniProt databases, 20,413 unigenes had significant similarities with existing sequences in NR, NT, Swiss-Prot, GO, KEGG, and COG databases. Single nucleotide polymorphism (SNP) analysis identified 237,294 SNPs from 154,636 contigs of aboveground tissues and 197,540 SNPs from 144,963 underground-derived contigs. The information uncovered in this study should serve as a valuable resource for the characterization of important traits related to secondary metabolite formation and for the identification of associated molecular mechanisms.
    Gene 10/2014; 553(2):90-97. DOI:10.1016/j.gene.2014.09.063 · 2.08 Impact Factor
  • Source
    ABSTRACT: Transcriptomic sequence resources represent invaluable assets for research, in particular for non-model species without a sequenced genome. To date, the Next Generation Sequencing technologies 454/Roche and Illumina have been used to generate transcriptome sequence databases by mRNA-Seq for more than fifty different plant species. While some of the databases were successfully used for downstream applications, such as proteomics, the assembly parameters indicate that the assemblies do not yet accurately reflect the actual plant transcriptomes. Two different assembly strategies have been used: overlap consensus based assemblers for long reads and Eulerian path/de Bruijn graph assemblers for short reads. In this review, we discuss the challenges and solutions to the transcriptome assembly problem. A list of quality control parameters and the necessary scripts to produce them are provided.
    Frontiers in Plant Science 09/2012; 3:220. DOI:10.3389/fpls.2012.00220 · 3.64 Impact Factor
  • Source
    ABSTRACT: The ripe fruit of Momordica cochinchinensis Spreng, known as gac, is characterized by a very high carotenoid content. Although this plant might be a good resource for carotenoid metabolic engineering, the genes involved in the carotenoid metabolic pathways in gac have so far remained unidentified owing to the lack of genomic information in public databases. To expedite gene discovery, we undertook Illumina deep sequencing of mRNA prepared from the aril of gac fruit. From 51,446,670 high-quality reads, we obtained 81,404 assembled unigenes with an average length of 388 base pairs. At the protein level, gac aril transcripts showed about 81.5% similarity with cucumber proteomes. In addition, 17,104 unigenes were assigned to specific metabolic pathways in the Kyoto Encyclopedia of Genes and Genomes, and all of the known enzymes involved in the terpenoid backbone biosynthesis and carotenoid biosynthesis pathways were identified in our library. To analyze the relationship between putative carotenoid biosynthesis genes and changes in carotenoid content during fruit ripening, digital gene expression analysis was performed on three different ripening stages of the aril. This study revealed that putative phytoene synthase, 15-cis-phytoene desaturase, zeta-carotene desaturase, carotenoid isomerase and lycopene epsilon cyclase might be key factors controlling carotenoid content during aril ripening. Taken together, this study has also made a large gene database available. This unique information for gac gene discovery should help facilitate functional studies aimed at improving carotenoid quantities.
    Plant Molecular Biology 05/2012; 79(4-5):413-27. DOI:10.1007/s11103-012-9919-9 · 4.07 Impact Factor