Yi N, Yandell BS, Churchill GA, Allison DB, Eisen EJ, Pomp D (2005). Bayesian model selection for genome-wide QTL analysis. Genetics 170: 1333-1344

Department of Biostatistics, University of Alabama, Birmingham 35294, USA.
Genetics (Impact Factor: 5.96). 08/2005; 170(3):1333-44. DOI: 10.1534/genetics.104.040386
Source: PubMed


The problem of identifying complex epistatic quantitative trait loci (QTL) across the entire genome continues to be a formidable challenge for geneticists. The complexity of genome-wide epistatic analysis results mainly from the number of QTL being unknown and the number of possible epistatic effects being huge. In this article, we use a composite model space approach to develop a Bayesian model selection framework for identifying epistatic QTL for complex traits in experimental crosses from two inbred lines. By placing a liberal constraint on the upper bound of the number of detectable QTL we restrict attention to models of fixed dimension, greatly simplifying calculations. Indicators specify which main and epistatic effects of putative QTL are included. We detail how to use prior knowledge to bound the number of detectable QTL and to specify prior distributions for indicators of genetic effects. We develop a computationally efficient Markov chain Monte Carlo (MCMC) algorithm using the Gibbs sampler and Metropolis-Hastings algorithm to explore the posterior distribution. We illustrate the proposed method by detecting new epistatic QTL for obesity in a backcross of CAST/Ei mice onto M16i.
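To make the fixed-dimension idea in the abstract concrete, here is a minimal sketch of how an upper bound on the number of QTL and a vector of 0/1 indicators could together define a model. It is an illustration only, not the authors' implementation: the function names, the genotype coding (-0.5/0.5), and the restriction to one main effect per locus and one interaction per locus pair are all assumptions made for this example.

```python
from itertools import combinations


def effect_labels(max_qtl):
    """Enumerate candidate effects for `max_qtl` putative loci (hypothetical helper).

    Assumes one main effect per locus and one epistatic effect per locus pair,
    so the model always has max_qtl + max_qtl*(max_qtl-1)/2 candidate terms.
    """
    main = [("main", q) for q in range(max_qtl)]
    epistatic = [("epistatic", a, b) for a, b in combinations(range(max_qtl), 2)]
    return main + epistatic


def included_effects(labels, indicators):
    """Return the effects whose indicator is 1, i.e. the currently selected model."""
    return [label for label, gamma in zip(labels, indicators) if gamma == 1]


def linear_predictor(genotypes, labels, indicators, coefs, intercept=0.0):
    """Genetic value of one individual under the selected effects.

    `genotypes` holds one coded genotype per putative locus (coding is an
    assumption of this sketch). Effects with indicator 0 contribute nothing.
    """
    total = intercept
    for label, gamma, beta in zip(labels, indicators, coefs):
        if gamma == 0:
            continue
        if label[0] == "main":
            x = genotypes[label[1]]
        else:
            x = genotypes[label[1]] * genotypes[label[2]]
        total += beta * x
    return total


# Example with an upper bound of 4 putative QTL: 4 main + 6 epistatic terms.
labels = effect_labels(4)
indicators = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
coefs = [2.0, 0.0, 0.0, -1.0, 0.0, 0.0, 4.0, 0.0, 0.0, 0.0]
genotypes = [0.5, -0.5, 0.5, 0.5]
value = linear_predictor(genotypes, labels, indicators, coefs)
```

Because the list of candidate effects has a fixed length once the upper bound is chosen, a sampler only needs to flip indicators and update coefficients rather than change the dimension of the model, which is the simplification the abstract refers to.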

    • "), which gradually filters out the markers with negligible effects and therefore reduces the dimensionality of the model. The second method is a MCMC-based model-finding algorithm of Yi et al. (2005), which has been implemented in the R package qtlbim (see Yandell et al. 2007). We set the expected number of QTL with main effects (main.nqtl) to be 9, and the expected total number of QTL with both main and epistatic effects (mean. "
    ABSTRACT: Bayesian hierarchical shrinkage methods have been widely used for quantitative trait locus mapping. From the computational perspective, the application of the Markov chain Monte Carlo (MCMC) method is not optimal for high-dimensional problems such as the ones arising in epistatic analysis. Maximum a posteriori (MAP) estimation can be a faster alternative, but it usually produces only point estimates without providing any measures of uncertainty (i.e., interval estimates). The variational Bayes method, stemming from the mean field theory in theoretical physics, is regarded as a compromise between MAP and MCMC estimation, which can be efficiently computed and produces the uncertainty measures of the estimates. Furthermore, variational Bayes methods can be regarded as the extension of traditional expectation-maximization (EM) algorithms and can be applied to a broader class of Bayesian models. Thus, the use of variational Bayes algorithms based on three hierarchical shrinkage models including Bayesian adaptive shrinkage, Bayesian LASSO, and extended Bayesian LASSO is proposed here. These methods performed generally well and were found to be highly competitive with their MCMC counterparts in our example analyses. The use of posterior credible intervals and permutation tests are considered for decision making between quantitative trait loci (QTL) and non-QTL. The performance of the presented models is also compared with R/qtlbim and R/BhGLM packages, using a previously studied simulated public epistatic data set.
    Genetics 01/2012; 190(1):231-49. DOI:10.1534/genetics.111.134866 · 5.96 Impact Factor
    • "In the field of genomic selection, which is based only on additive effects, an SVS implementation of Meuwissen and Goddard [34] was applied by Calus et al. [35] to simulated data as well as by Verbyla et al. [36] to dairy cattle data. In case of additional non-additive effects, SVS was developed and already successfully applied to obesity data in a mouse backcross population [37]. In that work, an upper bound of model dimensionality had to be fixed and indicator variables were involved specifying which main and epistatic effect had to be included in the model. "
    ABSTRACT: Molecular marker information is a common source to draw inferences about the relationship between genetic and phenotypic variation. Genetic effects are often modelled as additively acting marker allele effects. The true mode of biological action can, of course, be different from this plain assumption. One possibility to better understand the genetic architecture of complex traits is to include intra-locus (dominance) and inter-locus (epistasis) interaction of alleles as well as the additive genetic effects when fitting a model to a trait. Several Bayesian MCMC approaches exist for the genome-wide estimation of genetic effects with high accuracy of genetic value prediction. Including pairwise interaction for thousands of loci would probably go beyond the scope of such a sampling algorithm because then millions of effects are to be estimated simultaneously leading to months of computation time. Alternative solving strategies are required when epistasis is studied. We extended a fast Bayesian method (fBayesB), which was previously proposed for a purely additive model, to include non-additive effects. The fBayesB approach was used to estimate genetic effects on the basis of simulated datasets. Different scenarios were simulated to study the loss of accuracy of prediction, if epistatic effects were not simulated but modelled and vice versa. If 23 QTL were simulated to cause additive and dominance effects, both fBayesB and a conventional MCMC sampler BayesB yielded similar results in terms of accuracy of genetic value prediction and bias of variance component estimation based on a model including additive and dominance effects. Applying fBayesB to data with epistasis, accuracy could be improved by 5% when all pairwise interactions were modelled as well. The accuracy decreased more than 20% if genetic variation was spread over 230 QTL. In this scenario, accuracy based on modelling only additive and dominance effects was generally superior to that of the complex model including epistatic effects. This simulation study showed that the fBayesB approach is convenient for genetic value prediction. Jointly estimating additive and non-additive effects (especially dominance) has reasonable impact on the accuracy of prediction and the proportion of genetic variation assigned to the additive genetic source.
    BMC Genetics 08/2011; 12(1):74. DOI:10.1186/1471-2156-12-74 · 2.40 Impact Factor
    • "We employed the approach reported by Yi et al. (2005) to ascertain the maximum-QTL number and found that the results were not very sensitive to the maximum QTL number. Theoretically, the maximum-QTL number may be set as any value as long as it is greater than the actual QTL number. "
    M Fang · J Liu · D Sun · Y Zhang · Q Zhang · S Zhang
    ABSTRACT: In this article, we propose a model selection method, the Bayesian composite model space approach, to map quantitative trait loci (QTL) in a half-sib population for continuous and binary traits. In our method, the identity-by-descent-based variance component model is used. To demonstrate the performance of this model, the method was applied to map QTL underlying production traits on BTA6 in a Chinese half-sib dairy cattle population. A total of four QTLs were detected, whereas only one QTL was identified using the traditional least square (LS) method. We also conducted two simulation experiments to validate the efficiency of our method. The results suggest that the proposed method based on a multiple-QTL model is efficient in mapping multiple QTL for an outbred half-sib population and is more powerful than the LS method based on a single-QTL model.
    Heredity 04/2011; 107(3):265-76. DOI:10.1038/hdy.2011.15 · 3.81 Impact Factor