Harnessing naturally randomized transcription to infer regulatory relationships among genes

Department of Biostatistics, University of Washington, 1705 NE Pacific St, Seattle, WA 98195, USA.
Genome Biology (Impact Factor: 10.47). 02/2007; 8(10):R219. DOI: 10.1186/gb-2007-8-10-r219
Source: PubMed

ABSTRACT We develop an approach utilizing randomized genotypes to rigorously infer causal regulatory relationships among genes at the transcriptional level, based on experiments in which genotyping and expression profiling are performed. This approach can be used to build transcriptional regulatory networks and to identify putative regulators of genes. We apply the method to an experiment in yeast, in which genes known to be in the same processes and functions are recovered in the resulting transcriptional regulatory network.

  • Source
    ABSTRACT: Bayesian networks are often used to infer a gene regulatory network from expression data. Genotypes, along with gene expression, can further reveal regulatory relations and genetic architectures. Biological knowledge can also be incorporated to improve the reconstruction of a gene network. In this work, we propose a Bayesian framework to jointly infer a gene network and the weights of prior knowledge by integrating expression data, genetic variations, and prior biological knowledge. The proposed method encodes biological knowledge such as transcription factor and DNA binding, gene ontology annotation, and protein-protein interaction into a prior distribution over network structures. A simulation study shows that incorporating genetic variation information and biological knowledge improves the reconstruction of the gene network as long as the biological knowledge is consistent with the expression data.
  • Source
    ABSTRACT: Reconstructing biological networks using high-throughput technologies has the potential to produce condition-specific interactomes. But are these reconstructed networks a reliable source of biological interactions? Do some network inference methods offer dramatically improved performance on certain types of networks? To facilitate the use of network inference methods in systems biology, we report a large-scale simulation study comparing the ability of Markov chain Monte Carlo (MCMC) samplers to reverse engineer Bayesian networks. The MCMC samplers we investigated included foundational and state-of-the-art Metropolis-Hastings and Gibbs sampling approaches, as well as novel samplers we have designed. To enable a comprehensive comparison, we simulated gene expression and genetics data from known network structures under a range of biologically plausible scenarios. We examine the overall quality of network inference via different methods, as well as how their performance is affected by network characteristics. Our simulations reveal that network size, edge density, and strength of gene-to-gene signaling are major parameters that differentiate the performance of various samplers. Specifically, more recent samplers, including our novel methods, outperform traditional samplers for highly interconnected large networks with strong gene-to-gene signaling. Our newly developed samplers show comparable or superior performance to the top existing methods. Moreover, this performance gain is strongest in networks with biologically-oriented topology, which indicates that our novel samplers are suitable for inferring biological networks. The performance of MCMC samplers in this simulation framework can guide the choice of methods for network reconstruction using systems genetics data. Copyright © 2015, The Genetics Society of America.
    Genetics 01/2015; 199(4). DOI:10.1534/genetics.114.172619 · 4.87 Impact Factor
  • Source
    ABSTRACT: Chronic Obstructive Pulmonary Disease (COPD) is a complex disease. Genetic, epigenetic, and environmental factors are known to contribute to COPD risk and disease progression. Therefore, we developed a systematic approach to identify key regulators of COPD that integrates genome-wide DNA methylation, gene expression, and phenotype data in lung tissue from COPD and control samples. Our integrative analysis identified 126 key regulators of COPD. We identified EPAS1 as the only key regulator whose downstream genes significantly overlapped with multiple gene sets associated with COPD disease severity. EPAS1 is distinct from other key regulators in terms of methylation profile and downstream target genes. Genes predicted to be regulated by EPAS1 were enriched for biological processes including signaling, cell communications, and system development. We confirmed that EPAS1 protein levels are lower in human COPD lung tissue compared to non-disease controls and that Epas1 gene expression is reduced in mice chronically exposed to cigarette smoke. As EPAS1 downstream genes were significantly enriched for hypoxia responsive genes in endothelial cells, we tested EPAS1 function in human endothelial cells. EPAS1 knockdown by siRNA in endothelial cells impacted genes that significantly overlapped with EPAS1 downstream genes in lung tissue, including hypoxia responsive genes and genes associated with emphysema severity. Our first integrative analysis of genome-wide DNA methylation and gene expression profiles illustrates that not only does DNA methylation play a 'causal' role in the molecular pathophysiology of COPD, but it can be leveraged to directly identify novel key mediators of this pathophysiology.
    PLoS Genetics 01/2015; 11(1):e1004898. DOI:10.1371/journal.pgen.1004898 · 8.17 Impact Factor