Reverse engineering module networks by PSO-RNN hybrid modeling.

Lombardi Comprehensive Cancer Center, Georgetown University, Washington, DC, USA.
BMC Genomics (Impact Factor: 4.04). 02/2009; 10 Suppl 1:S15. DOI: 10.1186/1471-2164-10-S1-S15
Source: PubMed

ABSTRACT Inferring a gene regulatory network (GRN) from high-throughput biological data is often an under-determined problem and is challenging for the following reasons: (1) thousands of genes are involved in one living cell; (2) complex dynamic and nonlinear relationships exist among genes; (3) a substantial amount of noise is involved in the data; and (4) the typical sample size is very small compared to the number of genes. We hypothesize that we can enhance our understanding of gene interactions in important biological processes (e.g. differentiation, cell cycle, and development) and improve the inference accuracy of a GRN by (1) incorporating prior biological knowledge into the inference scheme, (2) integrating multiple biological data sources, and (3) decomposing the inference problem into smaller network modules.
This study presents a novel GRN inference method that integrates gene expression data and gene functional category information. The inference is based on a module network model that consists of two parts: the module selection part and the network inference part. The former determines the optimal modules through fuzzy c-means (FCM) clustering and by incorporating gene functional category information, while the latter uses a hybrid of particle swarm optimization and recurrent neural network (PSO-RNN) methods to infer the underlying network between modules. Our method is tested on real data from two studies: the development of the rat central nervous system (CNS) and the yeast cell cycle process. The results are evaluated by comparing them to previously published results and gene ontology annotation information.
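As an illustration of the module selection step, the following is a minimal sketch of plain fuzzy c-means on expression profiles. It does not include the gene functional category information that the study incorporates; the function name `fuzzy_c_means`, the fuzzifier `m=2.0`, and the toy profiles are assumptions made for this example only.

```python
import random

def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, seed=0):
    """Plain fuzzy c-means on a list of equal-length numeric vectors."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Centers are membership-weighted means of the points.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            wsum = sum(w)
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / wsum
                            for d in range(dim)])
        # Memberships follow from the relative distances to each center.
        for i in range(n):
            dist = [max(1e-12, sum((points[i][d] - c[d]) ** 2
                                   for d in range(dim)) ** 0.5)
                    for c in centers]
            for k in range(n_clusters):
                u[i][k] = 1.0 / sum((dist[k] / dist[j]) ** (2.0 / (m - 1.0))
                                    for j in range(n_clusters))
    return centers, u

# Toy expression profiles (hypothetical): two well-separated groups of genes.
profiles = [[0.0, 0.1, 0.2], [0.1, 0.0, 0.3], [0.05, 0.1, 0.25],
            [1.0, 0.9, 0.8], [0.9, 1.0, 0.7], [0.95, 0.9, 0.85]]
centers, memberships = fuzzy_c_means(profiles, n_clusters=2)
# Hard module label per gene: the cluster with the largest membership.
labels = [row.index(max(row)) for row in memberships]
```

Each gene receives a graded membership in every module rather than a single hard label, which is what distinguishes FCM from k-means style clustering.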
The reverse engineering of GRNs from time course gene expression data is a major challenge in systems biology due to the limited number of time points. Our experiments demonstrate that the proposed method can address this challenge by: (1) preprocessing gene expression data (e.g. normalization and missing value imputation) to reduce the data noise; (2) clustering genes based on gene expression data and gene functional category information to identify biologically meaningful modules, thereby reducing the dimensionality of the data; and (3) modeling GRNs between the modules with the PSO-RNN method to capture their nonlinear and dynamic relationships. The method is shown to lead to biologically meaningful modules and networks among the modules.
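The network inference step can be sketched as follows: a global-best particle swarm searches for the weights of a small recurrent model so that its simulated trajectory matches an observed module time series. This is a minimal sketch, not the authors' implementation. The update rule x(t+1) = sigmoid(W x(t) + b), the PSO coefficients (0.7, 1.5, 1.5), the position bound, and all function names are assumptions chosen for the example; the abstract does not specify them.

```python
import math
import random

def simulate(params, x0, n_steps, n):
    """Roll out x(t+1) = sigmoid(W x(t) + b); params holds W row-major, then b."""
    w = [params[i * n:(i + 1) * n] for i in range(n)]
    b = params[n * n:]
    traj = [list(x0)]
    for _ in range(n_steps - 1):
        x = traj[-1]
        traj.append([1.0 / (1.0 + math.exp(-(sum(w[i][j] * x[j] for j in range(n)) + b[i])))
                     for i in range(n)])
    return traj

def mse(params, data):
    """Mean squared error between the simulated and observed trajectories."""
    n = len(data[0])
    pred = simulate(params, data[0], len(data), n)
    total = sum((p - d) ** 2 for pr, dr in zip(pred, data) for p, d in zip(pr, dr))
    return total / (len(data) * n)

def pso_fit(data, n_particles=20, n_iter=100, bound=5.0, seed=0):
    """Global-best PSO over the RNN parameters, clamped to [-bound, bound]."""
    rng = random.Random(seed)
    n = len(data[0])
    dim = n * n + n
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_err = [mse(p, data) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_err[i])
    gbest, gbest_err = pbest[g][:], pbest_err[g]
    history = [gbest_err]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                # Inertia term plus pulls toward the personal and global bests.
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (pbest[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] = max(-bound, min(bound, pos[i][d] + vel[i][d]))
            err = mse(pos[i], data)
            if err < pbest_err[i]:
                pbest[i], pbest_err[i] = pos[i][:], err
                if err < gbest_err:
                    gbest, gbest_err = pos[i][:], err
        history.append(gbest_err)
    return gbest, gbest_err, history

# Synthetic two-module time series generated from hypothetical parameters.
true_params = [2.0, -1.0, 1.5, 0.5, -0.5, -1.0]
data = simulate(true_params, [0.2, 0.8], 8, 2)
params, err, history = pso_fit(data)
```

With only a handful of time points, many parameter sets can fit the data about equally well, so the recovered weights need not match the generating ones. This is the under-determination that the module decomposition is meant to reduce.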

  • Source
    ABSTRACT: While the robot is in motion, path planning should address three aspects: (1) acquiring knowledge of its environmental conditions, (2) determining its position in the environment, and (3) decision-making and execution to achieve its highest-order goals. The present research work aims to develop an efficient particle swarm optimization-based path planner for an autonomous mobile robot. In this approach, a fitness function is introduced to convert the mobile robot navigation problem into a multi-objective optimization problem. The fitness of the swarm mainly depends on two parameters: (1) the distance between each particle of the swarm and the target, and (2) the distance between each particle of the swarm and the nearest obstacle. From the obtained fitness values of the swarm, the global best position of the particle is selected in each cycle. The robot thereby moves through the global best positions in sequence. The effectiveness of the developed algorithm in various environments has been verified by simulation.
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING 06/2014; 39(8). DOI:10.1007/s13369-014-1154-z · 0.37 Impact Factor
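    The two-term fitness described in this abstract can be sketched as below. The abstract does not give the actual formula, so the weighted sum, the weights `w_target` and `w_obstacle`, and the inverse-distance obstacle penalty are assumptions made for illustration.

```python
import math

def fitness(p, target, obstacles, w_target=1.0, w_obstacle=1.0):
    """Lower is better: close to the target, far from the nearest obstacle."""
    d_target = math.dist(p, target)
    d_obstacle = min(math.dist(p, o) for o in obstacles)
    # Inverse distance penalizes particles that sit near an obstacle.
    return w_target * d_target + w_obstacle / max(d_obstacle, 1e-9)

def global_best(particles, target, obstacles):
    """Pick the particle with the lowest fitness in the current cycle."""
    return min(particles, key=lambda p: fitness(p, target, obstacles))

# Hypothetical scene: one obstacle between the swarm and the target.
target = (10.0, 10.0)
obstacles = [(5.0, 5.0)]
particles = [(4.9, 5.0), (6.0, 8.0), (2.0, 2.0)]
best = global_best(particles, target, obstacles)
```

    In this scene the particle at (6.0, 8.0) is selected: it is nearest to the target, while the particle at (4.9, 5.0) is heavily penalized for sitting next to the obstacle.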
  • Source
    ABSTRACT: Finding an efficient method to solve the parameter estimation problem (inverse problem) for nonlinear biochemical dynamical systems could help promote a system-level functional understanding of signalling pathways. The problem is stated as a data-driven nonlinear regression problem, which is converted into a nonlinear programming problem with many nonlinear differential and algebraic constraints. Because the problem is typically ill-conditioned and multimodal, it is in general difficult for gradient-based local optimization methods to obtain satisfactory solutions. To surmount this limitation, many stochastic optimization methods have been employed to find the global solution of the problem.
    BMC Bioinformatics 05/2014; 15(Suppl 6):S1. DOI:10.1186/1471-2105-15-S6-S1 · 2.67 Impact Factor
  • Source
    ABSTRACT: Microarray data are usually peppered with missing values due to various reasons. However, most of the downstream analyses for microarray data require complete datasets. Therefore, accurate algorithms for missing value estimation are needed for improving the performance of microarray data analyses. Although many algorithms have been developed, the selection of the optimal algorithm is still debated. Studies comparing the performance of different algorithms are still not comprehensive, especially in the number of benchmark datasets used, the number of algorithms compared, the rounds of simulation conducted, and the performance measures used. In this paper, we performed a comprehensive comparison by using (I) thirteen datasets, (II) nine algorithms, (III) 110 independent runs of simulation, and (IV) three types of measures to evaluate the performance of each imputation algorithm fairly. First, the effects of different types of microarray datasets on the performance of each imputation algorithm were evaluated. Second, we discussed whether datasets from different species have a different impact on the performance of different algorithms. To assess the performance of each algorithm fairly, all evaluations were performed using three types of measures. Our results indicate that the performance of an imputation algorithm mainly depends on the type of dataset but not on the species from which the samples come. In addition to the statistical measure, two other measures with biological meanings are useful to reflect the impact of missing value imputation on the downstream data analyses. Our study suggests that local-least-squares-based methods are good choices to handle missing values for most of the microarray datasets. In this work, we carried out a comprehensive comparison of the algorithms for microarray missing value imputation.
    Based on such a comprehensive comparison, researchers could choose the optimal algorithm for their datasets easily. Moreover, new imputation algorithms could be compared with the existing algorithms using this comparison strategy as a standard protocol. In addition, to assist researchers in dealing with missing values easily, we built a web-based and easy-to-use imputation tool, MissVIA, which supports many imputation algorithms. Once users upload a real microarray dataset and choose the imputation algorithms, MissVIA will determine the optimal algorithm for the users' data through a series of simulations, and then the imputed results can be downloaded for the downstream data analyses.
    BMC Systems Biology 12/2013; 7 Suppl 6(Suppl 6):S12. DOI:10.1186/1752-0509-7-S6-S12 · 2.85 Impact Factor