Discovering time-lagged rules from microarray data using gene profile classifiers.

Laboratorio de Investigación y Desarrollo en Computación Científica (LIDeCC), Departamento de Ciencias e Ingeniería de la Computación, Universidad Nacional del Sur, Av. Alem 1253, 8000, Bahía Blanca, Argentina.
BMC Bioinformatics (Impact Factor: 2.67). 01/2011; 12:123. DOI: 10.1186/1471-2105-12-123
Source: PubMed

ABSTRACT Gene regulatory networks have an essential role in every process of life. Genome-wide time series data are becoming increasingly available, providing the opportunity to discover the time-delayed gene regulatory networks that govern the majority of these molecular processes.
This paper aims at reconstructing gene regulatory networks from multiple genome-wide microarray time series datasets. In this sense, a new model-free algorithm called GRNCOP2 (Gene Regulatory Network inference by Combinatorial OPtimization 2), which is a significant evolution of the GRNCOP algorithm, was developed using combinatorial optimization of gene profile classifiers. The method is capable of inferring potential time-delay relationships with any span of time between genes from various time series datasets given as input. The proposed algorithm was applied to time series data composed of twenty yeast genes that are highly relevant for the cell-cycle study, and the results were compared against several related approaches. The outcomes have shown that GRNCOP2 outperforms the contrasted methods in terms of the proposed metrics, and that the results are consistent with previous biological knowledge. Additionally, a genome-wide study on multiple publicly available time series data was performed. In this case, the experimentation has exhibited the soundness and scalability of the new method which inferred highly-related statistically-significant gene associations.
A novel method for inferring time-delayed gene regulatory networks from genome-wide time series datasets is proposed in this paper. The method was carefully validated with several publicly available data sets. The results have demonstrated that the algorithm constitutes a usable model-free approach capable of predicting meaningful relationships between genes, revealing the time-trends of gene regulation.
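
To make the idea of a time-lagged rule concrete, the following minimal Python sketch scores how well the discretized profile of one gene predicts the discretized profile of another gene a fixed number of time points later. It is an illustration only and is not the GRNCOP2 algorithm: the function names, the fixed threshold used for discretization, the simple accuracy score, and the toy expression values are all assumptions introduced here.

```python
from typing import List, Tuple


def discretize(profile: List[float], threshold: float) -> List[int]:
    """Map each expression value to 1 (active) if above threshold, else 0.

    A fixed threshold is an assumption of this sketch.
    """
    return [1 if value > threshold else 0 for value in profile]


def lagged_rule_accuracy(regulator: List[int], target: List[int],
                         lag: int, inhibit: bool = False) -> float:
    """Fraction of time points t where regulator[t] predicts target[t + lag].

    With inhibit=False the rule is "regulator active -> target active";
    with inhibit=True it is "regulator active -> target inactive".
    """
    if len(regulator) != len(target):
        raise ValueError("profiles must have the same length")
    if lag < 0 or lag >= len(regulator):
        raise ValueError("lag must satisfy 0 <= lag < number of time points")
    usable = len(regulator) - lag
    hits = 0
    for r, g in zip(regulator[:usable], target[lag:]):
        expected = 1 - r if inhibit else r
        if g == expected:
            hits += 1
    return hits / usable


def best_lag(regulator: List[int], target: List[int],
             max_lag: int) -> Tuple[int, float]:
    """Return (lag, accuracy) of the best-scoring lag in 0..max_lag.

    Ties are broken in favour of the smaller lag.
    """
    scores = [(lagged_rule_accuracy(regulator, target, d), d)
              for d in range(max_lag + 1)]
    accuracy, lag = max(scores, key=lambda s: (s[0], -s[1]))
    return lag, accuracy


# Toy profiles (made-up values): gene_b follows gene_a two time points later.
gene_a = discretize([0.1, 0.9, 0.8, 0.2, 0.1, 0.9, 0.7, 0.1], threshold=0.5)
gene_b = discretize([0.2, 0.1, 0.1, 0.9, 0.8, 0.2, 0.1, 0.9], threshold=0.5)
result = best_lag(gene_a, gene_b, max_lag=3)  # (2, 1.0) for these toy profiles
```

In this toy case the activation rule holds at every usable time point for a lag of two and at only one of eight time points for a lag of zero, which is the kind of contrast a time-delay-aware method can exploit and a same-time-point comparison would miss.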

  • Source
    ABSTRACT: In recent years, gene networks have become one of the most useful tools for modeling biological processes. Many gene network inference algorithms have been developed as techniques for extracting knowledge from gene expression data. Ensuring the reliability of the inferred gene relationships is a crucial task in any study in order to prove that the algorithms used are precise. Usually, this validation process can be carried out using prior biological knowledge. The metabolic pathways stored in KEGG are one of the most widely used knowledge sources for analyzing relationships between genes. This article introduces a new methodology, GeneNetVal, to assess the biological validity of gene networks based on the relevance of the gene-gene interactions stored in KEGG metabolic pathways. Hence, a complete KEGG pathway conversion into a gene association network and a new matching distance based on gene-gene interaction relevance are proposed. The performance of GeneNetVal was established with three different experiments. Firstly, our proposal is tested in a comparative ROC analysis. Secondly, a randomness study is presented to show the behavior of GeneNetVal when the noise is increased in the input network. Finally, the ability of GeneNetVal to detect biological functionality of the network is shown.
    1. Background. Modeling processes occurring in living organisms is one of the main goals in Bioinformatics [1, 2, 3, 4]. Gene Networks (GNs) have become one of the most important approaches to discover which gene-gene relationships are involved in a specific biological process. A GN can be represented as a graph where genes, proteins and/or metabolites are represented as nodes and their relationships as edges [1]. It is important to note that GNs can vary substantially depending on the model architecture used to infer the network. These models can be categorized into four main approaches according to Hecker et al. [1]: correlation [5, 6], logical [7, 8, 9], differential equation-based and Bayesian networks [10, 11]. These approaches have been broadly used in Bioinformatics. For example, Rangel et al. [12] used linear modeling to infer T-cell activation from temporal gene expression data, and Faith et al. [13] adapted correlation and Bayesian networks to develop a method for inferring the regulatory interactions of Escherichia coli. Once a model has been generated, it is very important to assure the algorithm reliability in order to demonstrate its ...
    The Scientific World Journal 01/2014; · 1.22 Impact Factor
  • Source
    ABSTRACT: Gene networks (GNs) have become one of the most important approaches for modelling gene-gene relationships in Bioinformatics (Hecker et al, 2009). These networks allow us to carry out studies of different biological processes in a visual way. Many GN inference algorithms have been developed as techniques for extracting biological knowledge (Ponzoni et al, 2007; Gallo et al, 2011). Once the network has been generated, it is very important to assure network reliability in order to illustrate the quality of the generated model. The quality of a GN can be measured by a direct comparison between the obtained GN and prior biological knowledge (Wei and Li, 2007; Zhou and Wong, 2011). However, both of these approaches are not entirely accurate, as they only take direct gene-gene interactions into account for the validation task, leaving aside the weak (indirect) relationships (Poyatos, 2011). In this work the authors present a new methodology to assess the biological coherence of a GN. This coherence is obtained according to different sources of biological gene-gene relationships. Our proposal is able to perform a complete functional analysis of the input GN. With this aim, graph theory is used to consider not only direct relationships but indirect ones as well.
    EMBnet.journal. 11/2012; 18(Suppl.B).
  • Source
    ABSTRACT: In the last decade, the interest in microarray technology has exponentially increased due to its ability to monitor the expression of thousands of genes simultaneously. The reconstruction of gene association networks from gene expression profiles is a relevant task, and several statistical techniques have been proposed to build them. The problem lies in discovering which genes are more relevant and identifying the direct regulatory relationships among them. We developed a multi-objective evolutionary algorithm for mining quantitative association rules to deal with this problem. We applied our methodology, named GarNet, to a well-known microarray dataset of the yeast cell cycle. The performance analysis of GarNet was organized in three steps, similar to the study performed by Gallo et al. GarNet outperformed the benchmark methods in most cases in terms of quality metrics of the networks, such as accuracy and precision, which were measured using the YeastNet database as the true network. Furthermore, the results were consistent with previous biological knowledge.
    Journal of Computer and System Sciences 01/2013; · 1.09 Impact Factor

Available from

Ignacio Ponzoni