Article
ParPEST: a pipeline for EST data analysis based on parallel computing
BMC Bioinformatics
01/2005
Source: DOAJ
Cited In (11)
Article: CleanEST: a database of cleansed EST libraries.
ABSTRACT: The EST division of GenBank, dbEST, is widely used in many applications such as gene discovery and verification of exon-intron structure. However, the use of EST sequences in the dbEST libraries is often hampered by inconsistent terminology used to describe the library sources and by the presence of contaminated sequences. Here, we describe CleanEST, a novel database server that classifies dbEST libraries and removes contaminants. We classified all dbEST libraries according to species and sequencing center. In addition, we further classified human EST libraries by anatomical and pathological systems according to eVOC ontologies. For each dbEST library, we provide two sets of cleansed sequences: 'pre-cleansed' and 'user-cleansed'. To generate pre-cleansed sequences, we cleansed sequences in dbEST by aligning EST sequences against well-known contamination sources: UniVec, Escherichia coli, mitochondria and chloroplast (for plants). To provide user-cleansed sequences, we built an automatic user-cleansing pipeline, in which sequences of a user-selected library are cleansed on-the-fly according to user-selected options. The server is available at http://cleanest.kobic.re.kr/ and the database is updated monthly.
Nucleic Acids Research 11/2008; 37(Database issue):D686-9. · 8.03 Impact Factor
Article: SolEST database: a "one-stop shop" approach to the study of Solanaceae transcriptomes.
ABSTRACT: Since no genome sequences of solanaceous plants have yet been completed, expressed sequence tag (EST) collections represent a reliable tool for broad sampling of Solanaceae transcriptomes, an attractive route for understanding Solanaceae genome functionality and a powerful reference for the structural annotation of emerging Solanaceae genome sequences. We describe the SolEST database (http://biosrv.cab.unina.it/solestdb), which integrates different EST datasets from both cultivated and wild Solanaceae species and from two species of the genus Coffea. Background as well as processed data contained in the database, extensively linked to external related resources, represent an invaluable source of information for these plant families. Two novel features differentiate SolEST from other resources: i) the option of accessing and then visualizing Solanaceae EST/TC alignments along the emerging tomato and potato genome sequences; ii) the opportunity to compare different Solanaceae assemblies generated by diverse research groups in the attempt to address a common complaint in the SOL community. Different databases have been established worldwide for collecting Solanaceae ESTs and are related in concept, content and utility to the one presented herein. However, the SolEST database has several distinguishing features that make it appealing for the research community and facilitate a "one-stop shop" for the study of Solanaceae transcriptomes.
BMC Plant Biology 11/2009; 9:142. · 3.45 Impact Factor
Article: Comparative 454 pyrosequencing of transcripts from two olive genotypes during fruit development.
ABSTRACT: Despite its primary economic importance, genomic information on the olive tree is still lacking. 454 pyrosequencing was used to enrich the very few sequence data currently available for the Olea europaea species and to identify genes involved in expression of fruit quality traits. Fruits of Coratina, a widely cultivated variety characterized by a very high phenolic content, and Tendellone, an oleuropein-lacking natural variant, were used as starting material for monitoring the transcriptome. Four different cDNA libraries were sequenced, respectively at the beginning and at the end of drupe development. A total of 261,485 reads were obtained, for an output of about 58 Mb. Raw sequence data were processed using a four-step pipeline procedure and data were stored in a relational database with a web interface. Massively parallel sequencing of different fruit cDNA collections has provided large-scale information about the structure and putative function of gene transcripts accumulated during fruit development. Comparative transcript profiling allowed the identification of differentially expressed genes with potential relevance in regulating the fruit metabolism and phenolic content during ripening.
BMC Genomics 09/2009; 10:399. · 4.07 Impact Factor
Data provided are for informational purposes only. Although carefully collected, accuracy cannot be guaranteed.
The impact factor represents a rough estimation of the journal's impact factor and does not reflect the actual
current impact factor.
Keywords
affordable costs
collected information
complete analysis
data information content
efficient bioinformatic approaches
error-prone DNA sequences
EST analysis
EST data
execution time
functional genomic studies
hardware components
information content
large datasets
ParPEST
preliminary functional annotation
relational database
reliable analysis
reliable information
specific steps
suitable data warehouse