Article

Evolutionary Stochastic Search for Bayesian model exploration

02/2010; arXiv:1002.2706
Source: arXiv

ABSTRACT Implementing Bayesian variable selection for linear Gaussian regression models for analysing high-dimensional data sets is of current interest in many fields. To make such analysis operational, we propose a new sampling algorithm based upon Evolutionary Monte Carlo and designed to work under the "large p, small n" paradigm, thus making fully Bayesian multivariate analysis feasible, for example, in genetics/genomics experiments. Two real data examples in genomics are presented, demonstrating the performance of the algorithm in a space of up to 10,000 covariates. Finally, the methodology is compared with recently proposed search algorithms in an extensive simulation study.

  • Source
    Article: Construction of regulatory networks using expression time-series data of a genotyped population.
    ABSTRACT: The inference of regulatory and biochemical networks from large-scale genomics data is a basic problem in molecular biology. The goal is to generate testable hypotheses of gene-to-gene influences and subsequently to design bench experiments to confirm these network predictions. Coexpression of genes in large-scale gene-expression data implies coregulation and potential gene-gene interactions, but provides little information about the direction of influences. Here, we use both time-series data and genetics data to infer directionality of edges in regulatory networks: time-series data contain information about the chronological order of regulatory events and genetics data allow us to map DNA variations to variations at the RNA level. We generate microarray data measuring time-dependent gene-expression levels in 95 genotyped yeast segregants subjected to a drug perturbation. We develop a Bayesian model averaging regression algorithm that incorporates external information from diverse data types to infer regulatory networks from the time-series and genetics data. Our algorithm is capable of generating feedback loops. We show that our inferred network recovers existing and novel regulatory relationships. Following network construction, we generate independent microarray data on selected deletion mutants to prospectively test network predictions. We demonstrate the potential of our network to discover de novo transcription-factor binding sites. Applying our construction method to previously published data demonstrates that our method is competitive with leading network construction algorithms in the literature.
    Proceedings of the National Academy of Sciences 11/2011; 108(48):19436-41. · 9.68 Impact Factor

Keywords

analysis operational
Bayesian multivariate analysis feasible
current interest
extensive simulation study
genetics/genomics experiments
Implementing Bayesian variable selection
large p
linear Gaussian regression models
new sampling algorithm
paradigm
proposed search algorithms
real data examples

Leonardo Bottolo