Construction and characterization of a rock-cluster-based EST analysis pipeline.
ABSTRACT Open access to the vast amount of expressed sequence tag (EST) data in public databases has provided a powerful platform for gene identification, gene expression studies, and comparative and functional genomics. To manage large-scale EST data, a high-performance cluster and analysis software, especially parallel software, are essential. We report here a convenient approach to constructing a high-performance computing (HPC) cluster based on the popular Rocks distribution, together with a Perl-scripted pipeline that performs EST pre-processing, clustering, assembly, annotation, and any other desired analysis modules through parallel computing. We tested the system using different datasets on an increasing number of nodes. Our results show that the cluster and pipeline accelerate EST analysis without manual intervention.
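The abstract does not describe how the pipeline distributes work, so the following is only a minimal sketch of the general idea: split an EST set into batches, clean each batch in parallel, and merge the results. It is written in Python rather than the authors' Perl, a local thread pool stands in for cluster nodes, and every function name and the crude poly-A trimming rule are assumptions made for illustration, not the published method.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, parts = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(parts)
            header, parts = line[1:], []
        else:
            parts.append(line)
    if header is not None:
        yield header, "".join(parts)


def split_batches(records, n_workers):
    """Deal records round-robin into at most n_workers non-empty batches."""
    batches = [[] for _ in range(n_workers)]
    for i, record in enumerate(records):
        batches[i % n_workers].append(record)
    return [batch for batch in batches if batch]


def preprocess(batch, min_len=10):
    """Hypothetical cleaning step: uppercase, strip trailing A's, drop short reads."""
    cleaned = []
    for header, seq in batch:
        seq = seq.upper().rstrip("A")  # crude stand-in for poly-A tail trimming
        if len(seq) >= min_len:
            cleaned.append((header, seq))
    return cleaned


def run_parallel(fasta_text, n_workers=4):
    """Pre-process all ESTs, one batch per worker, and merge the results."""
    batches = split_batches(list(parse_fasta(fasta_text)), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(preprocess, batches)
    return [record for batch in results for record in batch]
```

On a real cluster, each batch would more likely be written to its own file and submitted as a separate job through the scheduler; the split, process, and merge structure stays the same. Note that the round-robin split means the merged output is not in input order.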
- ABSTRACT: The Bax inhibitor-1 (BI1) family contains six known genes in humans. Some members of the BI1 family have been shown to play important roles in cell death. Here we discuss the similarities and differences among the members of the BI1 family using comparative genomics and proteomics, and report the discovery of a novel member, tmbim1b, in Bos taurus. The BI1 family is evolutionarily conserved; its members are integral membrane proteins that contain multiple membrane-spanning segments and are predominantly localized to intracellular membranes, similar to Bcl-2 family proteins. They share multiple motifs and transcriptional factors within the promoter and the coding regions. The structural conservation of the BI1 family suggests that its members may be regulators of cell death pathways.
Computational Biology and Chemistry 07/2008; 32(3):159-62. DOI:10.1016/j.compbiolchem.2008.01.002 · 1.60 Impact Factor
- Computational Biology and Chemistry 12/2008; 32(6):469. DOI:10.1016/j.compbiolchem.2008.07.028 · 1.60 Impact Factor
- ABSTRACT: EST sequencing projects are increasing in scale and scope as genome sequencing technologies migrate from core sequencing centers to individual research laboratories. Effectively, generating EST data is no longer a bottleneck for investigators. However, processing large amounts of EST data remains a non-trivial challenge for many. Web-based EST analysis tools are proving to be the most convenient option for biologists performing their analyses, so these tools must continuously improve their utility to keep pace with the growing needs of research communities. We have developed a web-based EST analysis pipeline called ESTPiper, which streamlines typical large-scale EST analysis components. The intuitive web interface guides users through each step of base calling, data cleaning, assembly, genome alignment, annotation, analysis of gene ontology (GO), and microarray oligonucleotide probe design. Each step is modularized, so a user can execute the steps separately or together in batch mode. In addition, the user has control over the parameters used by the underlying programs. Extensive documentation of ESTPiper's functionality is embedded throughout the web site to facilitate understanding of the required input and interpretation of the computational results. The user can also download intermediate results and port files to separate programs for further analysis. In addition, our server provides a time-stamped description of the run history for reproducibility. The pipeline can also be installed locally, allowing researchers to modify ESTPiper to suit their own needs. ESTPiper streamlines the typical process of EST analysis. The pipeline was initially designed in part to support the Daphnia pulex cDNA sequencing project. A web server hosting ESTPiper is provided at http://estpiper.cgb.indiana.edu/ and now supports projects of all sizes. The software is also freely available from the authors for local installations.
BMC Genomics 05/2009; 10:174. DOI:10.1186/1471-2164-10-174 · 4.04 Impact Factor