Functional Discovery via a Compendium of Expression Profiles

Rosetta Inpharmatics, Inc., Kirkland, Washington 98034, USA.
Cell (Impact Factor: 33.12). 08/2000; 102(1):109-26. DOI: 10.1016/S0092-8674(00)00015-5
Source: PubMed

ABSTRACT Ascertaining the impact of uncharacterized perturbations on the cell is a fundamental problem in biology. Here, we describe how a single assay can be used to monitor hundreds of different cellular functions simultaneously. We constructed a reference database or "compendium" of expression profiles corresponding to 300 diverse mutations and chemical treatments in S. cerevisiae, and we show that the cellular pathways affected can be determined by pattern matching, even among very subtle profiles. The utility of this approach is validated by examining profiles caused by deletions of uncharacterized genes: we identify and experimentally confirm that eight uncharacterized open reading frames encode proteins required for sterol metabolism, cell wall function, mitochondrial respiration, or protein synthesis. We also show that the compendium can be used to characterize pharmacological perturbations by identifying a novel target of the commonly used drug dyclonine.
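
The pattern-matching idea in the abstract can be illustrated with a minimal sketch: a query expression profile is compared against each reference profile and the references are ranked by similarity. Pearson correlation is an assumption here, since the abstract does not name the similarity measure, and the profile labels, values, and function names (`pearson`, `rank_matches`) are hypothetical toy choices rather than data or methods from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no pattern to match
    return cov / (sx * sy)


def rank_matches(query, compendium):
    """Return (label, correlation) pairs, best match first."""
    scores = [(label, pearson(query, profile))
              for label, profile in compendium.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Hypothetical toy compendium: expression ratios for five genes.
compendium = {
    "erg_mutant": [0.9, 0.8, -0.1, 0.0, 0.1],
    "cell_wall_mutant": [0.0, -0.1, 0.7, 0.9, 0.0],
    "respiration_mutant": [-0.5, -0.4, 0.0, 0.1, 0.8],
}

# Hypothetical profile from an uncharacterized perturbation.
query = [0.5, 0.4, 0.0, -0.1, 0.1]

ranked = rank_matches(query, compendium)
for label, score in ranked:
    print(label, round(score, 3))
```

In this toy case the query has the same shape as the first reference profile at lower amplitude, so that profile ranks first. Correlation ignores overall magnitude, which is one reason it is a common choice for matching subtle profiles.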

  • Source
    ABSTRACT: Reconstruction of gene regulatory networks (GRNs) from experimental data is a fundamental challenge in systems biology. A number of computational approaches have been developed to infer GRNs from mRNA expression profiles. However, expression profiles alone are proving to be insufficient for inferring GRN topologies with reasonable accuracy. Recently, it has been shown that integration of external data sources (such as gene and protein sequence information, gene ontology data, protein–protein interactions) with mRNA expression profiles may increase the reliability of the inference process. Here, I propose a new approach that incorporates transcription factor binding sites (TFBS) and physical protein interactions (PPI) among transcription factors (TFs) in a Bayesian variable selection (BVS) algorithm which can infer GRNs from mRNA expression profiles subjected to genetic perturbations. Using real experimental data, I show that the integration of TFBS and PPI data with mRNA expression profiles leads to significantly more accurate networks than those inferred from expression profiles alone. Additionally, the performance of the proposed algorithm is compared with a series of least absolute shrinkage and selection operator (LASSO) regression-based network inference methods that can also incorporate prior knowledge in the inference framework. The results of this comparison suggest that BVS can outperform LASSO regression-based methods in some circumstances.
    Frontiers in Bioengineering and Biotechnology 05/2014; 2. DOI:10.3389/fbioe.2014.00013
  • Source
    ABSTRACT: Gene expression varies widely in natural populations, yet the proximate and ultimate causes of this variation are poorly known. Understanding how variation in gene expression affects abiotic stress tolerance, fitness, and adaptation is central to the field of evolutionary genetics. We tested the hypothesis that genes with natural genetic variation in their expression responses to abiotic stress are likely to be involved in local adaptation to climate in Arabidopsis thaliana. Specifically, we compared genes with consistent expression responses to environmental stress (expression stress responsive, “eSR”) to genes with genetically variable responses to abiotic stress (expression genotype-by-environment interaction, “eGEI”). We found that on average genes that exhibited eGEI in response to drought or cold had greater polymorphism in promoter regions and stronger associations with climate than eSR genes or genomic controls. We also found that transcription factor binding sites known to respond to environmental stressors, especially abscisic acid responsive elements, showed significantly higher polymorphism in drought eGEI genes in comparison to eSR genes. By contrast, eSR genes tended to exhibit relatively greater pairwise haplotype sharing, lower promoter diversity, and fewer non-synonymous polymorphisms, suggesting purifying selection or selective sweeps. Our results indicate that cis-regulatory evolution and genetic variation in stress responsive gene expression may be important mechanisms of local adaptation to climatic selective gradients.
    Molecular Biology and Evolution 05/2014; 31(9). DOI:10.1093/molbev/msu170 · 14.31 Impact Factor
  • Source
    ABSTRACT: Networks are ubiquitous in biology, and computational approaches to their inference have been widely investigated. In particular, supervised machine learning methods can be used to complete a partially known network by integrating various measurements. Two main supervised frameworks have been proposed: the local approach, which trains a separate model for each network node, and the global approach, which trains a single model over pairs of nodes. Here, we systematically investigate, theoretically and empirically, the exploitation of tree-based ensemble methods in the context of these two approaches for biological network inference. We first formalize the problem of network inference as classification of pairs, unifying in the process homogeneous and bipartite graphs and discussing two main sampling schemes. We then present the global and the local approaches, extending the latter for the prediction of interactions between two unseen network nodes, and discuss their specializations to tree-based ensemble methods, highlighting their interpretability and drawing links with clustering techniques. Extensive computational experiments are carried out with these methods on various biological networks that clearly highlight that these methods are competitive with existing methods.
    Molecular BioSystems 04/2014; DOI:10.1039/C5MB00174A · 3.18 Impact Factor