Harnessing naturally randomized transcription to infer regulatory relationships among genes

Department of Biostatistics, University of Washington, 1705 NE Pacific St, Seattle, WA 98195, USA.
Genome biology (Impact Factor: 10.47). 02/2007; 8(10):R219. DOI: 10.1186/gb-2007-8-10-r219
Source: PubMed

ABSTRACT We develop an approach utilizing randomized genotypes to rigorously infer causal regulatory relationships among genes at the transcriptional level, based on experiments in which genotyping and expression profiling are performed. This approach can be used to build transcriptional regulatory networks and to identify putative regulators of genes. We apply the method to an experiment in yeast, in which genes known to be in the same processes and functions are recovered in the resulting transcriptional regulatory network.

Available from: Frank Emmert-Streib, Feb 11, 2015
  • Source
    • "Theoretical evidence in the form of "Causality Equivalence Theorem" has been proposed by Chen et al. [21] to establish causal relationship. According to the theorem, under the assumption that X is randomized, the following conditions are needed to establish a causal relation: C1: X and M are associated, C2: X and Y are associated, C3: X is independent of Y | M."
    ABSTRACT: Background. Genome-wide association studies (GWAS) have been successful during the last few years. A key challenge is that the interpretation of the results is not straightforward, especially for trans-acting SNPs. Integration of transcriptome data into GWAS may provide clues elucidating the mechanisms by which a genetic variant leads to a disease. Methods. Here, we developed a novel mediation analysis approach to identify new expression quantitative trait loci (eQTL) driving CYP2D6 activity by combining genotype, gene expression, and enzyme activity data. Results. 389,573 and 1,214,416 SNP-transcript-CYP2D6 activity trios are found strongly associated (P < 10^-5, FDR = 16.6% and 11.7%) for two different genotype platforms, namely, Affymetrix and Illumina, respectively. The majority of eQTLs are trans-SNPs. A single polymorphism leads to widespread downstream changes in the expression of distant genes by affecting major regulators or transcription factors (TFs), which would be visible as an eQTL hotspot and can lead to large and consistent biological effects. Overlapped eQTL hotspots with the mediators lead to the discovery of 64 TFs. Conclusions. Our mediation analysis is a powerful approach in identifying the trans-QTL-phenotype associations. It improves our understanding of the functional genetic variations for the liver metabolism mechanisms.
    10/2013; 2013:493019. DOI:10.1155/2013/493019
  • Source
    • "Regulators with an asterisk (*) were found by Zhu et al. (2008). Regulators marked with a plus (+) were found in Chen et al. (2007) study and unlabeled regulators are novel predictions. In parentheses after the name of the regulator is the number of targets that we found."
    ABSTRACT: Inference of biological networks from high-throughput data is a central problem in bioinformatics. Particularly powerful for network reconstruction is data collected by recent studies that contain both genetic variation information and gene expression profiles from genetically distinct strains of an organism. Various statistical approaches have been applied to these data to tease out the underlying biological networks that govern how individual genetic variation mediates gene expression and how genes regulate and interact with each other. Extracting meaningful causal relationships from these networks remains a challenging but important problem. In this article, we use causal inference techniques to infer the presence or absence of causal relationships between yeast gene expressions in the framework of graphical causal models. We evaluate our method using a well studied dataset consisting of both genetic variations and gene expressions collected over randomly segregated yeast strains. Our predictions of causal regulators, genes that control the expression of a large number of target genes, are consistent with previously known experimental evidence. In addition, our method can detect the absence of causal relationships and can distinguish between direct and indirect effects of variation on a gene expression level.
    Journal of computational biology: a journal of computational molecular cell biology 03/2010; 17(3):533-46. DOI:10.1089/cmb.2009.0176 · 1.67 Impact Factor
  • Source
    • "Genetic variation information in a segregating population has been used to reconstruct causal phenotype networks [3] [4] [5] and to infer causal relationships among pairs of phenotypes [6] [7] [8] [9] [10]. Approaches based on structural equation models [11] [12] [13] and causal discovery algorithms [14] [15] have also been proposed."
    ABSTRACT: A Bayesian network has often been modeled to infer a gene regulatory network from expression data. Genotypes along with gene expression can further reveal the regulatory relations and genetic architectures. Biological knowledge can also be incorporated to improve the reconstruction of a gene network. In this work, we propose a Bayesian framework to jointly infer a gene network and weights of prior knowledge by integrating expression data, genetic variations, and prior biological knowledge. The proposed method encodes biological knowledge such as transcription factor and DNA binding, gene ontology annotation, and protein-protein interaction into a prior distribution of the network structures. A simulation study shows that the incorporation of genetic variation information and biological knowledge improves the reconstruction of gene network as long as biological knowledge is consistent with expression data.
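
The three conditions quoted in the first excerpt above (C1: X and M associated, C2: X and Y associated, C3: X independent of Y given M) can be illustrated on simulated data. The following is a minimal sketch, not the method of any of the papers listed here. It assumes linear effects, uses plain and partial correlations as stand-ins for formal association and independence tests, and the function names, thresholds (`assoc_min`, `indep_max`), and effect sizes are all hypothetical choices made for illustration.

```python
import numpy as np


def partial_corr(x, y, z):
    """Correlation of x and y after linearly regressing z out of both."""
    design = np.column_stack([np.ones_like(z), z])
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])


def check_conditions(x, m, y, assoc_min=0.2, indep_max=0.1):
    """Heuristic check of C1-C3 using correlation cutoffs (assumed values)."""
    c1 = abs(np.corrcoef(x, m)[0, 1]) > assoc_min   # C1: X and M associated
    c2 = abs(np.corrcoef(x, y)[0, 1]) > assoc_min   # C2: X and Y associated
    c3 = abs(partial_corr(x, y, m)) < indep_max     # C3: X indep. of Y given M
    return {"C1": bool(c1), "C2": bool(c2), "C3": bool(c3)}


rng = np.random.default_rng(0)
n = 2000

# Randomized binary genotype X, as in a segregating population.
x = rng.integers(0, 2, size=n).astype(float)

# Scenario A: causal chain X -> M -> Y, so M fully mediates the effect of X.
m_chain = 1.5 * x + rng.normal(size=n)
y_chain = 1.0 * m_chain + rng.normal(size=n)
chain_result = check_conditions(x, m_chain, y_chain)

# Scenario B: X affects M and Y separately, so M does not mediate X -> Y.
m_fork = 1.5 * x + rng.normal(size=n)
y_fork = 1.0 * x + rng.normal(size=n)
fork_result = check_conditions(x, m_fork, y_fork)

print("chain:", chain_result)
print("fork: ", fork_result)
```

In the chain scenario all three conditions should hold, while in the second scenario C3 should fail because X still predicts Y once M is accounted for. A real analysis would replace the fixed cutoffs with proper hypothesis tests and multiple-testing control.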