Construction and characterization of a rock-cluster-based EST analysis pipeline

Article in Computational Biology and Chemistry 30(1):81-6 · March 2006
Impact Factor: 1.12 · DOI: 10.1016/j.compbiolchem.2005.10.003 · Source: PubMed


    Open access to the vast amount of expressed sequence tag (EST) data in public databases has provided a powerful platform for gene identification, gene expression studies, and comparative and functional genomics. To manage large-scale EST data, a high-performance cluster and analysis software, especially parallel software, are essential. We report here a convenient approach to constructing a high-performance computing (HPC) cluster based on the popular Rocks cluster distribution, together with a Perl-scripted analysis pipeline that runs EST pre-processing, clustering, assembly, annotation, and any other desired analysis modules through parallel computing. We tested the system using different datasets on an increasing number of nodes. Our results show that the cluster and pipeline accelerate EST analysis without manual intervention.
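
    The abstract does not describe how the pipeline distributes work, so the following is only a minimal sketch of the general idea: split the EST records into chunks, run one stage on each chunk concurrently, and merge the results. It is written in Python rather than the authors' Perl, and every name in it (`parse_fasta`, `split_round_robin`, `preprocess`, `run_stage`) is hypothetical. Worker threads stand in for cluster nodes, and the pre-processing rule (trim trailing A's, drop short reads) is a toy placeholder, not the published method.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def split_round_robin(records, n_parts):
    """Deal records into n_parts lists so each worker gets a similar load."""
    return [records[i::n_parts] for i in range(n_parts)]


def preprocess(records, min_len=5):
    """Toy stage (assumption): strip trailing A's and drop reads shorter than min_len."""
    cleaned = []
    for header, seq in records:
        seq = seq.upper().rstrip("A")
        if len(seq) >= min_len:
            cleaned.append((header, seq))
    return cleaned


def run_stage(records, stage, n_workers=4):
    """Apply one stage to each chunk concurrently and merge the results.

    Threads stand in for cluster nodes here. The merged list is grouped
    by chunk, so it does not preserve the input order.
    """
    parts = split_round_robin(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(stage, parts))
    return [rec for part in results for rec in part]
```

    Under these assumptions, `run_stage(parse_fasta(text), preprocess, n_workers=2)` would clean a small FASTA string in two chunks. On a real cluster, each chunk would more plausibly be written to its own file and submitted to a compute node through the job scheduler.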