
An estimation method for inference of gene regulatory network using Bayesian network with uniting of partial problems

Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University, Osaka, Japan.
BMC Genomics (Impact Factor: 4.04). 01/2012; 13 Suppl 1(Suppl 1):S12. DOI: 10.1186/1471-2164-13-S1-S12
Source: PubMed

ABSTRACT Bayesian networks (BNs) have been widely used to estimate gene regulatory networks. Many BN methods have been developed to estimate networks from microarray data. However, two serious problems reduce the effectiveness of current BN methods. The first problem is that BN-based methods require huge computational time to estimate large-scale networks. The second is that the estimated network cannot have cyclic structures, even if the actual network has such structures.
In this paper, we present a novel deterministic BN-based method that reduces computational time and allows cyclic structures. Our approach generates every possible triplet of genes, estimates a network for each triplet with a BN, and unites these networks into a single network containing all genes. This decreases the search space for predicting gene regulatory networks without degrading solution accuracy compared with the greedy hill climbing (GHC) method. The computational time is on the order of the cube of the number of genes. In addition, the network estimated by our method can include cyclic structures.
We verified the effectiveness of the proposed method for all known gene regulatory networks and their expression profiles. The results demonstrate that this approach can predict regulatory networks with reduced computational time without degrading the solution accuracy compared with the GHC method.
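
The triplet-and-unite idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives neither the BN scoring function nor the exact uniting rule, so `toy_score` (a stand-in for a real BN score computed from expression data), the vote-based `unite_triplets` rule, the `min_votes` threshold, and all function names are hypothetical choices made only for illustration.

```python
from itertools import combinations, product


def is_acyclic(nodes, edges):
    """Return True if the directed graph has no cycle (Kahn's algorithm)."""
    indegree = {n: 0 for n in nodes}
    for _, child in edges:
        indegree[child] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for parent, child in edges:
            if parent == node:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
    return removed == len(nodes)


def triplet_dags(triplet):
    """Yield every DAG over three genes as a frozenset of (parent, child) edges."""
    pairs = [(a, b) for a in triplet for b in triplet if a != b]
    for mask in product((0, 1), repeat=len(pairs)):
        edges = frozenset(p for p, keep in zip(pairs, mask) if keep)
        if is_acyclic(triplet, edges):
            yield edges


def unite_triplets(genes, score, min_votes=1):
    """Pick the best-scoring DAG per triplet, then keep edges with enough votes.

    Assumption: uniting is done by counting how many triplet networks
    contain each edge; the paper's actual rule may differ.
    """
    votes = {}
    for triplet in combinations(genes, 3):
        best = max(triplet_dags(triplet), key=lambda edges: score(triplet, edges))
        for edge in best:
            votes[edge] = votes.get(edge, 0) + 1
    return {edge for edge, count in votes.items() if count >= min_votes}


# Hypothetical ground truth used only by the toy scorer: a four-gene cycle.
TRUE_EDGES = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")}


def toy_score(triplet, edges):
    """Stand-in for a BN score: +1 per true edge, -1 per spurious edge."""
    return sum(1 if edge in TRUE_EDGES else -1 for edge in edges)


network = unite_triplets(["A", "B", "C", "D"], toy_score)
```

In this toy run each triplet network is acyclic, yet the united `network` equals the four-edge cycle in `TRUE_EDGES`, which shows how a cyclic structure can emerge from acyclic partial solutions. The loop visits C(n, 3) triplets and at most 25 DAGs per triplet, which is consistent with the cubic order of computational time stated in the abstract.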

  • ABSTRACT: Gene regulatory networks (GRNs) play a central role in sustaining complex biological systems in cells. Although we can construct GRNs by integrating biological interactions that have been recorded in the literature, such networks can contain questionable data and lack information. Therefore, there has been an urgent need for an approach by which the validity of constructed networks can be evaluated; simulation-based methods have been applied in which biological observational data are assimilated. However, these methods apply nonlinear models that require high computational power to evaluate even one network consisting of only several genes. Therefore, to explore candidate networks whose simulation models can better predict the data by modifying and extending literature-based GRNs, an efficient and versatile method is urgently required. We applied a combinatorial transcription model, which can represent combinatorial regulatory effects of genes, as a biological simulation model, to reproduce the dynamic behavior of gene expression within a state space model. Under the model, we applied the unscented Kalman filter to obtain the approximate posterior probability distribution of the hidden state and thereby efficiently estimate, by the EM algorithm, parameter values that maximize prediction ability for observational data. Utilizing the method, we propose a novel algorithm to modify GRNs reported in the literature so that their simulation models become consistent with observed data. The effectiveness of our approach was validated through comparative analysis against previous methods using synthetic networks. Finally, as an application example, a Kyoto Encyclopedia of Genes and Genomes (KEGG)-based yeast cell cycle network was extended with additional candidate genes to better predict real mRNA expression data using the proposed method.
    Journal of computational biology: a journal of computational molecular cell biology 09/2014; DOI:10.1089/cmb.2014.0171 · 1.69 Impact Factor
  • ABSTRACT: Building accurate gene regulatory networks (GRNs) from high-throughput gene expression data is a long-standing challenge. However, with the emergence of new algorithms combined with the increasing availability of transcriptomic data, it is now within reach. To help biologists investigate gene regulatory relationships, we developed a web-based computational service to build, analyze and visualize GRNs that govern various biological processes. The web server is preloaded with all available Affymetrix GeneChip-based transcriptomic and annotation data from the three model legume species, i.e., Medicago truncatula, Lotus japonicus and Glycine max. Users can also upload their own transcriptomic and transcription factor datasets from any other species/organisms to analyze their in-house experiments. Users are able to select which experiments, genes and algorithms to include in their GRN analysis. To achieve this flexibility and improve prediction performance, we have implemented multiple mainstream GRN prediction algorithms including co-expression, Graphical Gaussian Models (GGMs), Context Likelihood of Relatedness (CLR), and parallelized versions of TIGRESS and GENIE3. Besides these existing algorithms, we also propose a parallel Bayesian network learning algorithm, which can infer causal relationships (i.e., directionality of interaction) and scale up to several thousands of genes. Moreover, this web server also provides tools to allow integrative and comparative analysis between predicted GRNs obtained from different algorithms or experiments, as well as comparisons between legume species. The web site is available at http://legumegrn.noble.org.
    PLoS ONE 07/2013; 8(7):e67434. DOI:10.1371/journal.pone.0067434 · 3.53 Impact Factor
  • ABSTRACT: Techniques in molecular biology have permitted the gathering of an extremely large amount of information relating to organisms and their genes. The current challenge is assigning a putative function to the thousands of genes that have been detected in different organisms. One of the most informative types of genomic data for gaining a better knowledge of protein function is gene expression data. Based on gene expression data, and assuming that genes involved in the same function should have a similar or correlated expression pattern, a function can be attributed to genes of unknown function when they appear to be linked in a gene co-expression network (GCN). Several tools for the construction of GCNs have been proposed and applied to plant gene expression data. Here, we review recent methodologies used for plant gene expression data and compare their results, advantages and disadvantages in order to help researchers choose a method for the construction of GCNs.
    Briefings in functional genomics 02/2013; DOI:10.1093/bfgp/elt003 · 3.43 Impact Factor
