Comparison of probabilistic Boolean network and Dynamic Bayesian network approaches for inferring gene regulatory networks

School of Computing, University of Southern Mississippi, Hattiesburg, MS 39406, USA.
BMC Bioinformatics (Impact Factor: 2.58). 02/2007; 8(Suppl 7):S13. DOI: 10.1186/1471-2105-8-S7-S13
Source: PubMed


The regulation of gene expression is achieved through gene regulatory networks (GRNs), in which collections of genes interact with one another and with other substances in a cell. To understand the underlying function of organisms, it is necessary to study the behavior of genes in a gene regulatory network context. Several computational approaches are available for modeling gene regulatory networks with different datasets. To optimize the modeling of GRNs, these approaches must be compared and evaluated in terms of accuracy and efficiency.
In this paper, two important computational approaches for modeling gene regulatory networks, probabilistic Boolean network methods and dynamic Bayesian network methods, are compared using a biological time-series dataset from the Drosophila Interaction Database to construct a Drosophila gene network. A subset of time points and gene samples from the whole dataset is used to evaluate the performance of these two approaches.
The comparison indicates that both approaches had good performance in modeling the gene regulatory networks. The accuracy in terms of recall and precision can be improved if a smaller subset of genes is selected for inferring GRNs. The accuracy of both approaches is dependent upon the number of selected genes and time points of gene samples. In all tested cases, DBN identified more gene interactions and gave better recall than PBN.
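To make the recall and precision figures concrete, the following is a minimal sketch of how the two measures could be computed for an inferred network. It assumes that both the inferred and the reference networks are represented as sets of directed (regulator, target) edges; the function name, the edge representation, and the toy gene labels are illustrative assumptions and are not taken from the paper.

```python
def precision_recall(inferred_edges, reference_edges):
    """Score inferred edges against a reference network.

    Both arguments are iterables of (regulator, target) tuples.
    Precision = correct inferred edges / all inferred edges.
    Recall    = correct inferred edges / all reference edges.
    """
    inferred = set(inferred_edges)
    reference = set(reference_edges)
    true_positives = len(inferred & reference)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(inferred) if inferred else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical toy networks (gene labels are made up for illustration).
reference = {("g1", "g2"), ("g2", "g3"), ("g3", "g1"), ("g1", "g4")}
inferred = {("g1", "g2"), ("g2", "g3"), ("g4", "g2")}

precision, recall = precision_recall(inferred, reference)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under this definition, a method that reports more interactions can raise recall while lowering precision, so the two numbers are best read together.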



Available from: Ping Gong
    • "Although reverse engineering methods such as Boolean networks [1], Bayesian networks [2,3], dynamic Bayesian networks [4,5], multivariate regression methods [6-8], linear programming [9], genetic algorithm [10] and information theoretic [11] approaches have been applied to deduce the circuitry of signaling and gene networks, all currently developed methods have significant limitations. For instance, the Boolean network based methods are found to be formidably slow, and their performance degrades with increasing network size [12]. Bayesian network methods are unable to account for feedback regulation, a hallmark of signaling networks [2]. "
    ABSTRACT: Recent advancements in genetics and proteomics have led to the acquisition of large quantitative data sets. However, the use of these data to reverse engineer biochemical networks has remained a challenging problem. Many methods have been proposed to infer biochemical network topologies from different types of biological data. Here, we focus on unraveling network topologies from steady state responses of biochemical networks to successive experimental perturbations. We propose a computational algorithm which combines a deterministic network inference method termed Modular Response Analysis (MRA) and a statistical model selection algorithm called Bayesian Variable Selection, to infer functional interactions in cellular signaling pathways and gene regulatory networks. It can be used to identify interactions among individual molecules involved in a biochemical pathway or reveal how different functional modules of a biological network interact with each other to exchange information. In cases where not all network components are known, our method reveals functional interactions which are not direct but correspond to the interaction routes through unknown elements. Using computer simulated perturbation responses of signaling pathways and gene regulatory networks from the DREAM challenge, we demonstrate that the proposed method is robust against noise and scalable to large networks. We also show that our method can infer network topologies using incomplete perturbation datasets. Consequently, we have used this algorithm to explore the ERBB regulated G1/S transition pathway in certain breast cancer cells to understand the molecular mechanisms which cause these cells to become drug resistant. The algorithm successfully inferred many well characterized interactions of this pathway by analyzing experimentally obtained perturbation data. Additionally, it identified some molecular interactions which promote drug resistance in breast cancer cells.
The proposed algorithm provides a robust, scalable and cost-effective solution for inferring network topologies from biological data. It can potentially be applied to explore novel pathways which play important roles in life-threatening diseases like cancer.
    Full-text · Article · Jul 2013 · BMC Systems Biology
    • "Structural equation models, Bayesian networks, and other probabilistic graphical models are widely used for studying causal relationships. Many authors have proposed to use Bayesian networks for analyzing gene expression data [44-47] and for generating causal networks from observational data [48] or genetic data [49,50]. "
    ABSTRACT: Background Co-expression measures are often used to define networks among genes. Mutual information (MI) is often used as a generalized correlation measure. It is not clear how much MI adds beyond standard (robust) correlation measures or regression model based association measures. Further, it is important to assess what transformations of these and other co-expression measures lead to biologically meaningful modules (clusters of genes). Results We provide a comprehensive comparison between mutual information and several correlation measures in 8 empirical data sets and in simulations. We also study different approaches for transforming an adjacency matrix, e.g. using the topological overlap measure. Overall, we confirm close relationships between MI and correlation in all data sets which reflects the fact that most gene pairs satisfy linear or monotonic relationships. We discuss rare situations when the two measures disagree. We also compare correlation and MI based approaches when it comes to defining co-expression network modules. We show that a robust measure of correlation (the biweight midcorrelation transformed via the topological overlap transformation) leads to modules that are superior to MI based modules and maximal information coefficient (MIC) based modules in terms of gene ontology enrichment. We present a function that relates correlation to mutual information which can be used to approximate the mutual information from the corresponding correlation coefficient. We propose the use of polynomial or spline regression models as an alternative to MI for capturing non-linear relationships between quantitative variables. Conclusion The biweight midcorrelation outperforms MI in terms of elucidating gene pairwise relationships. Coupled with the topological overlap matrix transformation, it often leads to more significantly enriched co-expression modules. 
Spline and polynomial networks form attractive alternatives to MI in case of non-linear relationships. Our results indicate that MI networks can safely be replaced by correlation networks when it comes to measuring co-expression relationships in stationary data.
    Full-text · Article · Dec 2012 · BMC Bioinformatics
    • "Currently, a lot of research is being devoted to introduce improvements in the working of these algorithms and enhance our understanding about gene interactions. Out of the statistical techniques currently adopted to model gene networks, dynamic Bayesian networks have received the most widespread attention [9], [10], [37]. State space models [11], [12], [24], [25], [32] and the extended Kalman filter (EKF), which are specific instances of dynamic Bayesian networks, have also been employed to model gene regulatory networks [5], [22]. "
    ABSTRACT: This paper considers the problem of learning the structure of gene regulatory networks from gene expression time series data. A more realistic scenario when the state space model representing a gene network evolves nonlinearly is considered while a linear model is assumed for the microarray data. To capture the nonlinearity, a particle filter-based state estimation algorithm is considered instead of the contemporary linear approximation-based approaches. The parameters characterizing the regulatory relations among various genes are estimated online using a Kalman filter. Since a particular gene interacts with a few other genes only, the parameter vector is expected to be sparse. The state estimates delivered by the particle filter and the observed microarray data are then subjected to a LASSO-based least squares regression operation which yields a parsimonious and efficient description of the regulatory network by setting the irrelevant coefficients to zero. The performance of the aforementioned algorithm is compared with the extended Kalman filter (EKF) and Unscented Kalman Filter (UKF) employing the Mean Square Error (MSE) as the fidelity criterion in recovering the parameters of gene regulatory networks from synthetic data and real biological data. Extensive computer simulations illustrate that the proposed particle filter-based network inference algorithm outperforms EKF and UKF, and therefore, it can serve as a natural framework for modeling gene regulatory networks with nonlinear and sparse structure.
    Full-text · Article · Feb 2012 · IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM