Inferring Metabolic States in Uncharacterized Environments Using Gene-Expression Measurements

Department of Bioinformatics (CMBI), Centre for Molecular Life Sciences, Radboud University Nijmegen, The Netherlands
PLoS Computational Biology (Impact Factor: 4.62). 03/2013; 9(3):e1002988. DOI: 10.1371/journal.pcbi.1002988
Source: PubMed


The large size of metabolic networks entails an overwhelming multiplicity in the possible steady-state flux distributions that are compatible with stoichiometric constraints. This space of possibilities is largest in the frequent situation where the nutrients available to the cells are unknown. These two factors, network size and lack of knowledge of nutrient availability, challenge the identification of the actual metabolic state of living cells among the myriad possibilities. Here we address this challenge by developing a method that integrates gene-expression measurements with genome-scale models of metabolism as a means of inferring metabolic states. Our method explores the space of alternative flux distributions that maximize the agreement between gene expression and metabolic fluxes, and thereby identifies reactions that are likely to be active in the culture from which the gene-expression measurements were taken. These active reactions are used to build environment-specific metabolic models and to predict actual metabolic states. We applied our method to model the metabolic states of Saccharomyces cerevisiae growing in rich media supplemented with either glucose or ethanol as the main energy source. The resulting models comprise about 50% of the reactions in the original model, and predict environment-specific essential genes with high sensitivity. By minimizing the sum of fluxes while forcing our predicted active reactions to carry flux, we predicted metabolic states of these yeast cultures that largely agree with what is known about yeast physiology. Most notably, our method predicts the Crabtree effect in yeast cells growing in excess glucose, a long-known phenomenon that could not have been predicted by traditional constraint-based modeling approaches.
Our method is of immediate practical relevance for medical and industrial applications, such as the identification of novel drug targets, and the development of biotechnological processes that use complex, largely uncharacterized media, such as biofuel production.
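
The final prediction step described in the abstract (minimizing the sum of fluxes while forcing the predicted active reactions to carry flux) can be illustrated on a toy network. The sketch below is not the authors' implementation: the five-reaction network, the reaction names, the fixed export demand, and the threshold `eps` are all hypothetical, and the genome-scale linear program is replaced by a one-variable problem that can be solved by checking the two endpoints of its feasible interval.

```python
# Toy network (hypothetical, for illustration only):
#   uptake: -> A        direct: A -> B
#   step1:  A -> C      step2:  C -> B
#   export: B ->
# At steady state every flux distribution is a mix of two routes:
# the short route (A -> B) and the long route (A -> C -> B).


def fluxes(short, long_):
    """Steady-state flux vector for given fluxes through the two routes."""
    return {
        "uptake": short + long_,
        "direct": short,
        "step1": long_,
        "step2": long_,
        "export": short + long_,
    }


def min_total_flux(active, demand=1.0, eps=0.1):
    """Minimize the sum of fluxes at a fixed export flux (demand),
    forcing every reaction in `active` to carry at least eps.

    Assumes demand >= eps, so uptake and export always satisfy the bound.
    """
    # Lower bounds on each route implied by the forced-active reactions.
    short_min = eps if "direct" in active else 0.0
    long_min = eps if {"step1", "step2"} & set(active) else 0.0
    if short_min + long_min > demand:
        raise ValueError("active set is infeasible for this demand")
    # short + long_ = demand leaves one free variable; a linear objective
    # over an interval is minimized at one of the interval's endpoints.
    candidates = [
        fluxes(short_min, demand - short_min),
        fluxes(demand - long_min, long_min),
    ]
    return min(candidates, key=lambda v: sum(v.values()))


if __name__ == "__main__":
    print(min_total_flux(set()))       # no expression evidence
    print(min_total_flux({"step1"}))   # step1 inferred active from expression
```

With no forced reactions, the minimum uses only the short route (total flux 3). Forcing `step1` to be active diverts `eps` through the long route and raises the total to about 3.1, which shows how expression-derived active reactions can steer the prediction away from the state that plain flux minimization would choose.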

Available from: Sergio Rossell
    • "Although a draft model can be based on annotated genome sequences, it needs extensive manual curation based on additional biochemical and physiological knowledge. Such generalised genome-based models can be made specific for certain environmental conditions or differentiation stages by restricting them to reactions that are actually active according to transcriptome and proteome data [11] [14] [15]. In multicellular organisms, such data can be used to refine genome-scale models to reflect specific cell types [16]."
    ABSTRACT: To rationalise drug target selection, we should understand the role of putative targets in biological pathways quantitatively. We review here how experimental and computational network-based approaches can aid more rational drug target selection and illustrate this with results obtained for microbes and for cancer. Comparison of the drug response of biochemical networks in target cells and (healthy) host cells can reveal network-selective targets.
    Drug Discovery Today Technologies 08/2015; 15(August 2015):17-22.
    • "By integrating transcriptomic data with a global human metabolic model using this method, they predicted tissue-specific metabolic activity in ten different tissues. A method called EXAMO (EXploration of Alternative Metabolic Optima) is an extended version of iMAT that builds a context-specific model [46]. Tailored gene expression using user-defined thresholds may avoid data normalization issues [33]. "
    ABSTRACT: Several computational methods have been developed that integrate transcriptomic data with genome-scale metabolic reconstructions to infer condition-specific system-wide intracellular metabolic flux distributions. In this mini-review, we describe each of the methods published to date and categorize them according to four grouping criteria (requirement for multiple gene expression datasets as input, requirement for a threshold to define a gene's high and low expression, requirement for an a priori assumption of an appropriate objective function, and validation of predicted fluxes directly against measured intracellular fluxes). We then recommend which group of methods would be more suitable from a practical perspective.
    Computational and Structural Biotechnology Journal 08/2014; 11(18). DOI:10.1016/j.csbj.2014.08.009
    • "Flux-balance analysis is the most popular example of this strategy, but it becomes questionable once the steady-state assumption can no longer be upheld. Furthermore, as more data on enzyme abundance become available, we should attempt to include such information and the impact on metabolic processes (Colijn et al., 2009; Rossell et al., 2013). Here we provide a new framework that allows us to model metabolic fluxes and their dynamics, and which deals with the missing data problem in metabolic analysis in a flexible and consistent manner."
    ABSTRACT: One of the challenging questions in modelling biological systems is to characterise the functional forms of the processes that control and orchestrate molecular and cellular phenotypes. Recently proposed methods for the analysis of metabolic pathways, for example dynamic flux estimation, can only provide estimates of the underlying fluxes at discrete time-points but fail to capture the complete temporal behaviour. In order to describe the dynamic variation of the fluxes we additionally require the assumption of specific functional forms that can capture the temporal behaviour. It also remains unclear how to address the noise which might be present in experimentally measured metabolite concentrations. Here we propose a novel approach to modelling metabolic fluxes: derivative processes that are based on multiple-output Gaussian processes (MGPs), which are a flexible nonparametric Bayesian modelling technique. The main advantages of the MGP approach include the natural nonparametric representation of the fluxes and the ability to impute missing data between the measurements. Our derivative process approach allows us to model changes in metabolite derivative concentrations and to characterise the temporal behaviour of metabolic fluxes from time course data. Because the derivative of a Gaussian process is itself a Gaussian process, we can readily link metabolite concentrations to metabolic fluxes and vice versa. Here we discuss how this can be implemented in an MGP framework and illustrate its application to simple models, including nitrogen metabolism in Escherichia coli.
    Bioinformatics 02/2014; 30(13). DOI:10.1093/bioinformatics/btu069 · 4.98 Impact Factor