Identification of condition-specific regulatory modules through multi-level motif and mRNA expression analysis.
ABSTRACT Many computational methods for identifying transcriptional regulatory modules produce many false positives in practice because both the binding information and the gene expression profiling data are noisy. In this paper, we propose a multi-level strategy for identifying condition-specific gene regulatory modules that integrates motif binding information and gene expression data through support vector regression and significance analysis. We demonstrate the feasibility of the proposed method on a yeast cell cycle data set. A study on a breast cancer microarray data set shows that the method can identify significant and reliable regulatory modules associated with breast cancer.
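As a rough illustration of what a significance step over motif binding and expression data might look like, the sketch below runs a permutation test on the association between per-gene binding scores and expression values. It is an assumption-laden stand-in, not the authors' procedure: it uses a plain Pearson correlation instead of support vector regression, and the function names (`pearson`, `permutation_p_value`) and the toy data are hypothetical.

```python
import random
from statistics import fmean


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def permutation_p_value(binding, expression, n_perm=999, seed=0):
    """Two-sided permutation p-value for the binding/expression association.

    Expression values are shuffled across genes to build a null
    distribution of |r|; the +1 terms keep the estimate above zero.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(pearson(binding, expression))
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(binding, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical toy data: one motif binding score and one expression value per gene.
binding = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1, 0.05, 0.0]
expression = [2.0, 1.7, 1.6, 1.1, 0.9, 0.5, 0.6, 0.2, 0.1, -0.1]

p = permutation_p_value(binding, expression)
print(p)
```

A small p-value here only says that the association is unlikely under random gene labels; in a noisy setting such a filter would typically be one of several levels of evidence rather than a final call.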
- Available from: Jianhua Ruan
ABSTRACT: MOTIVATION: The transcriptional regulation of a gene depends on the binding of transcription factors to cis-regulatory elements in its promoter and on the expression levels of those transcription factors. Most existing approaches to studying transcriptional regulation model these dependencies separately, i.e. either from promoters to gene expression or from the expression levels of transcription factors to the expression levels of genes. Little effort has been devoted to a single model that integrates both dependencies. RESULTS: We propose a novel method to model gene expression using both promoter sequences and the expression levels of putative regulators. The proposed method, called bi-dimensional regression tree (BDTree), extends a multivariate regression tree approach by applying it simultaneously to both genes and conditions of an expression matrix. The method produces hypotheses about the condition-specific binding motifs and regulators for each gene. As a side-product, the method also partitions the expression matrix into small submatrices in a way similar to bi-clustering. We propose and compare several splitting functions for building the tree. When applied to two microarray datasets of the yeast Saccharomyces cerevisiae, BDTree successfully identifies most motifs and regulators that are known to regulate the biological processes underlying the datasets. Compared with an existing algorithm, BDTree provides higher prediction accuracy in cross-validation. Bioinformatics 03/2006; 22(3):332-40. DOI:10.1093/bioinformatics/bti792 · 4.98 Impact Factor
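To make the idea of a regression-tree split on promoter motifs concrete, here is a minimal sketch of a single split along the gene dimension only: genes are divided by presence or absence of each candidate motif, and the motif that most reduces the within-group sum of squared errors of expression is chosen. This is not BDTree and not one of the splitting functions the abstract refers to; it handles one condition rather than a full expression matrix, and all names and data (`sse`, `best_motif_split`, `motifA`, `g1`, and so on) are hypothetical.

```python
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def best_motif_split(expression, motif_hits):
    """Pick the motif whose presence/absence split most reduces the SSE.

    expression: dict mapping gene -> expression value in one condition
    motif_hits: dict mapping motif -> set of genes whose promoter has it
    Returns (motif, sse_reduction), or (None, 0.0) if no split helps.
    """
    genes = list(expression)
    total = sse([expression[g] for g in genes])
    best = (None, 0.0)
    for motif, hits in motif_hits.items():
        with_motif = [expression[g] for g in genes if g in hits]
        without = [expression[g] for g in genes if g not in hits]
        if not with_motif or not without:
            continue  # a split must leave genes on both sides
        gain = total - (sse(with_motif) + sse(without))
        if gain > best[1]:
            best = (motif, gain)
    return best


# Hypothetical toy data: six genes, two candidate motifs.
expression = {"g1": 2.1, "g2": 1.9, "g3": 2.0, "g4": -0.1, "g5": 0.1, "g6": 0.0}
motif_hits = {
    "motifA": {"g1", "g2", "g3"},  # separates high from low expression
    "motifB": {"g1", "g4"},        # uninformative split
}

motif, gain = best_motif_split(expression, motif_hits)
print(motif, round(gain, 3))
```

A bi-dimensional variant would, in addition, consider splits of the conditions (for example by the expression level of a putative regulator) and recurse on the resulting submatrices.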
Article: The UCSC genome browser database
ABSTRACT: The University of California Santa Cruz (UCSC) Genome Browser Database is an up-to-date source for genome sequence data integrated with a large collection of related annotations. The database is optimized to support fast interactive performance with the web-based UCSC Genome Browser, a tool built on top of the database for rapid visualization and querying of the data at many levels. The annotations for a given genome are displayed in the browser as a series of tracks aligned with the genomic sequence. Sequence data and annotations may also be viewed in a text-based tabular format or downloaded as tab-delimited flat files. The Genome Browser Database, browsing tools and downloadable data files can all be found on the UCSC Genome Bioinformatics website (http://genome.ucsc.edu), which also contains links to documentation and related technical information. Nucleic Acids Research 02/2003; 31(1):51-4. DOI:10.1093/nar/gkg129 · 9.11 Impact Factor
- ABSTRACT: Not available. PLoS Computational Biology 01/2007; 2(12):e174. DOI:10.1371/journal.pcbi.0020174 · 4.62 Impact Factor