Cheung F, Haas B, Goldberg S, May G, Xiao Y, Town C. Sequencing Medicago truncatula expressed sequenced tags using 454 Life Sciences technology. BMC Genomics 7: 272.

The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, MD 20850, USA.
BMC Genomics (Impact Factor: 3.99). 02/2006; 7(1):272. DOI: 10.1186/1471-2164-7-272
Source: PubMed


In this study, we addressed whether a single 454 Life Sciences GS20 sequencing run enables new gene discovery from a normalized cDNA library, and whether the short reads produced by this technology are of value in gene structure annotation.
A single 454 GS20 sequencing run on adapter-ligated cDNA from a normalized cDNA library generated 292,465 reads, which cleaning reduced to 252,384 reads with an average length of 92 nucleotides. Clustering and assembly yielded 184,599 unique sequences containing over 400 simple sequence repeats (SSRs). The 454 sequences generated hits to more genes than a comparable amount of sequence from the Medicago truncatula Gene Index (MtGI). Although short, the 454 reads are long enough to map to a unique genome location as effectively as longer ESTs produced by conventional sequencing. Gene Ontology (GO) assignments based on matches to Arabidopsis showed that the sequences cover a broad range of GO categories. A total of 53,796 assemblies and singletons (29%) had no match in the existing MtGI. Among these previously unobserved Medicago transcripts, thousands had matches in a comprehensive protein database and one or more of the TIGR Plant Gene Indices. Approximately 20% of these novel sequences could be found in the Medicago genome sequence. A total of 70,026 reads generated by the 454 technology were mapped to 785 finished Medicago BACs using PASA, and over 1,000 gene models required modification. In parallel with 454 sequencing, 4,445 5' reads were generated by conventional sequencing from the same library; the assembled sequences showed that the library contains about 52% full-length cDNAs encoding proteins from 50 to over 500 amino acids in length.
Because of the large number of reads it produces, 454 DNA sequencing is effective in revealing the expression of transcripts from a broad range of GO categories, including many rare transcripts in normalized cDNA libraries, although only a limited portion of each transcript's sequence is uncovered. As with longer ESTs, 454 reads can be mapped uniquely onto genomic sequence to provide support for, and modifications of, gene predictions.
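The proportions quoted in the abstract can be re-derived from the reported counts. The following is a minimal sketch of that arithmetic only; the variable and function names are illustrative assumptions and do not come from the paper, and the total-base figure is a rough estimate that treats the average read length of 92 nucleotides as exact.

```python
# Counts taken from the abstract; the names below are hypothetical labels.
RAW_READS = 292_465          # reads from the single GS20 run
CLEAN_READS = 252_384        # reads remaining after cleaning
MEAN_READ_LEN_NT = 92        # reported average read length (nucleotides)
UNIQUE_SEQUENCES = 184_599   # assemblies plus singletons
NO_MTGI_MATCH = 53_796       # unique sequences with no match in MtGI


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of raw reads that survived cleaning.
retained_pct = percent(CLEAN_READS, RAW_READS)

# Share of unique sequences without an MtGI match (reported as 29%).
novel_pct = percent(NO_MTGI_MATCH, UNIQUE_SEQUENCES)

# Rough total sequence yield, assuming every read has the average length.
approx_total_nt = CLEAN_READS * MEAN_READ_LEN_NT

print(f"Reads retained after cleaning: {retained_pct:.1f}%")
print(f"Unique sequences with no MtGI match: {novel_pct:.1f}%")
print(f"Approximate cleaned sequence: {approx_total_nt / 1e6:.1f} Mb")
```

The no-match share rounds to the 29% stated in the abstract, and the cleaned reads amount to roughly 23 Mb of sequence under the stated assumption.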



Available from: Chris Town, Feb 23, 2015
  • Source
    • "These technologies have made it affordable to sequence the whole genomes of nematodes, which are almost less than 1/15 of the size of the 3.2 Gb human genome (Dillman et al., 2012). Besides genome sequencing, these next-gen technology like Roche/454 technology are also being used to study the transcriptomes of different organisms (Cheung et al., 2006; Weber et al., 2007) and plant parasitic nematodes (Haegeman et al., 2011; Nicol et al., 2012). Although each individual cell of any organism contains its complete set of genes in distinctive chromosomes, the transcriptional activity of each gene is a quite dynamic and multifactorial process. "
    ABSTRACT: Plant parasitic nematodes are obligate parasites causing serious reduction in crop yields. Several economically important species parasitize various plant species, but the root knot and cyst nematodes belonging to the Heteroderidae family are especially dangerous. Plant parasitic nematodes result in crop losses of over $150 billion worldwide. This review gives an account of morpho-physiological and molecular events during parasitism of root-knot and cyst nematodes. It describes the transcriptomes and parasitomes of various nematodes indicating that the effector proteins are crucial for the compatible plant nematode interactions. Various sequencing techniques used in plant-nematode genomics and transcriptomics are discussed. Moreover, the dynamics of host transcriptomes in response to infection with different nematode species have been reported. The host transcriptomes have revealed many candidate genes, which are involved in both compatible and incompatible plant nematode interactions. The strategy of manipulation of expression of the genes induced and suppressed by the nematodes in the feeding sites has also been suggested for enhancement of resistance against nematodes. This review will provide the researchers with the information regarding transcriptional changes in the nematodes as well as host plants, which would be important for the induction of resistance against nematodes in different crop plants. © 2015 Friends Science Publishers
    International Journal of Agriculture and Biology 03/2015; 17. DOI:10.17957/IJAB/15.0037 · 0.90 Impact Factor
  • Source
    • "Thus, it is necessary to profile soil microbial communities for evaluating the effects and feedback responses of climate or land-use changes. The rapid development of a suite of high-throughput, sequencing-or microarray-based metagenomics tools has enabled accurate measurements of microbial community structures (Call et al. 2003; Curtis and Sloan 2005; Cheung et al. 2006; Claesson et al. 2010). Among these, GeoChip excels in that it efficiently targets a wide range of gene markers involved in carbon, nitrogen, sulfur and phosphorus cycling, antibiotics resistance, metal resistance, and organic pollutant degradation (He et al. 2007; Lu et al. 2012). "
    ABSTRACT: The grassland and shrubland are two major landscapes of the Tibetan alpine meadow, a region very sensitive to the impact of global warming and anthropogenic perturbation. Herein, we report a study showing that a majority of differences in soil microbial community functional structures, measured by a functional gene array named GeoChip 4.0, in two adjacent shrubland and grassland areas, were explainable by environmental properties, suggesting that the harsh environments in the alpine grassland rendered niche adaptation important. Furthermore, genes involved in labile carbon degradation were more abundant in the shrubland than those of the grassland but genes involved in recalcitrant carbon degradation were less abundant, which was conducive to long-term carbon storage and sequestration in the shrubland despite low soil organic carbon content. In addition, genes of anaerobic nitrogen cycling processes such as denitrification and dissimilatory nitrogen reduction were more abundant, shifting soil nitrogen cycling toward ammonium biosynthesis and consequently leading to higher soil ammonium contents. We also noted higher abundances of stress genes responsive to nitrogen limitation and oxygen limitation, which might be attributed to low total nitrogen and higher water contents in the shrubland. Together, these results provide mechanistic knowledge about microbial linkages to soil carbon and nitrogen storage and potential consequences of vegetation shifts in the Tibetan alpine meadow.
    MicrobiologyOpen 10/2014; 3(5). DOI:10.1002/mbo3.190 · 2.21 Impact Factor
  • Source
    • "Especially for the 454 pyrosequencing, of which the sequencing reads now approaching the length of traditional Sanger sequences, is ideal for the transcriptome sequencing for species that lacks a sequenced genome [11]. Besides, the latest versions of Newbler assembler from 454 can effectively assembly 454 RNA-Seq reads into putative transcripts, which can be better used for the subsequent gene discovery [12], [13], microarrays design [14] and high throughput SSRs or SNPs identification [15]–[17]. The SSRs and SNPs are often utilized as gene-based genetic markers and widely used for the generation of linkage maps and identification of quantitative trait loci (QTLs). "
    ABSTRACT: Background: Rapidly driven by the need for developing sustainable sources of nutritionally important fatty acids and the rising concerns about environmental impacts after using fossil oil, oil-plants have received increasing awareness nowadays. As an important oil-rich plant in China, Camellia oleifera has played a vital role in providing nutritional applications, biofuel productions and chemical feedstocks. However, the lack of C. oleifera genome sequences and little genetic information have largely hampered the urgent needs for efficient utilization of the abundant germplasms towards modern breeding efforts of this woody oil-plant. Results: Here, using the 454 GS-FLX sequencing platform, we generated approximately 600,000 RNA-Seq reads from four tissues of C. oleifera. These reads were trimmed and assembled into 104,842 non-redundant putative transcripts with a total length of ∼38.9 Mb, representing more than 218-fold of all the C. oleifera sequences currently deposited in the GenBank (as of March 2014). Based on the BLAST similarity searches, nearly 42.6% transcripts could be annotated with known genes, conserved domains, or Gene Ontology (GO) terms. Comparisons with the cultivated tea tree, C. sinensis, identified 3,022 pairs of orthologs, of which 211 exhibited the evidence under positive selection. Pathway analysis detected the majority of genes potentially related to lipid metabolism. Evolutionary analysis of omega-6 fatty acid desaturase (FAD2) genes among 20 oil-plants unexpectedly suggests that a parallel evolution may occur between C. oleifera and Olea oleifera. Additionally, more than 2,300 simple sequence repeats (SSRs) and 20,200 single-nucleotide polymorphisms (SNPs) were detected in the C. oleifera transcriptome. Conclusions: The generated transcriptome represents a considerable increase in the number of sequences deposited in the public databases, providing an unprecedented opportunity to discover all related-genes associated with lipid metabolic pathway in C. oleifera. It will greatly enhance the generation of new varieties of C. oleifera with increased yields and high quality.
    PLoS ONE 08/2014; 9(8):e104150. DOI:10.1371/journal.pone.0104150 · 3.23 Impact Factor