Continuous and tractable models of the variation of evolutionary rates. Math Biosci

Department of Mathematics and Statistics, McGill University, Montréal, Canada.
Mathematical Biosciences (Impact Factor: 1.3). 03/2006; 199(2):216-33. DOI: 10.1016/j.mbs.2005.11.002
Source: PubMed


We propose a continuous model for variation in the evolutionary rate across sites and over the phylogenetic tree, and we derive exact transition probabilities of substitutions under this model. Changes in rate are modelled using the Cox-Ingersoll-Ross (CIR) process, a diffusion widely used in financial applications. The model directly extends the standard gamma-distributed rates-across-sites model, with one additional parameter governing changes in rate down the tree. The parameters of the model can be estimated directly from two well-known statistics: the index of dispersion and the gamma shape parameter of the rates-across-sites model. The CIR model can be readily incorporated into probabilistic models for sequence evolution. We provide an exact formula for the likelihood of a three-taxon tree. The likelihoods of larger trees can be evaluated using Monte Carlo methods.
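To make the link between the CIR process and gamma-distributed rates concrete, here is a minimal Python sketch. It is not the authors' implementation, and the parameterization is an assumption made for illustration: the rate R follows dR = b(1 - R)dt + sigma*sqrt(R)dW with long-run mean 1 and sigma^2 = 2b/alpha, so that the stationary law is a gamma distribution with shape alpha and mean 1. The names `alpha`, `b`, `cir_step`, and `stationary_rates` are hypothetical. The sketch relies on two standard properties of the CIR diffusion: its transition law is a scaled noncentral chi-square, and its stationary law is gamma.

```python
import numpy as np


def stationary_rates(alpha, n, rng):
    """Draw n site rates from the stationary law: gamma with shape alpha, mean 1."""
    return rng.gamma(shape=alpha, scale=1.0 / alpha, size=n)


def cir_step(r0, dt, alpha, b, rng):
    """Exact sample of the rate after time dt, given current rate(s) r0.

    Assumed parameterization (illustrative, not taken from the paper):
    dR = b(1 - R)dt + sigma*sqrt(R)dW with sigma^2 = 2b/alpha.
    The transition law is c times a noncentral chi-square variable.
    """
    sigma2 = 2.0 * b / alpha
    decay = np.exp(-b * dt)                 # autocorrelation over time dt
    c = sigma2 * (1.0 - decay) / (4.0 * b)  # scale of the chi-square
    df = 4.0 * b / sigma2                   # degrees of freedom, equals 2*alpha
    nonc = np.asarray(r0) * decay / c       # noncentrality from the start rate
    return c * rng.noncentral_chisquare(df, nonc)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    alpha, b, dt = 2.0, 1.0, 0.5
    r0 = stationary_rates(alpha, 100_000, rng)
    r1 = cir_step(r0, dt, alpha, b, rng)
    # The marginal law is unchanged (mean 1, variance 1/alpha), while the
    # correlation between r0 and r1 should be close to exp(-b*dt).
    print("mean:", r1.mean(), "variance:", r1.var())
    print("correlation:", np.corrcoef(r0, r1)[0, 1], "target:", np.exp(-b * dt))
```

Under these assumptions, `alpha` plays the role of the gamma shape parameter of the rates-across-sites model, and `b` is the single extra parameter controlling how quickly a site's rate decorrelates along a branch. A large `b` approaches independent rates on each branch, and a small `b` approaches rates that stay fixed over the tree.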



Available from: Stephan Lawi
  • Source
    • "Under a 'relaxed-clock' model, substitution rates change over the tree in a constrained manner, thus separating the rate and time parameters associated with each branch and allowing inference of lineage divergence times. A considerable amount of effort has been directed at modeling lineage-specific substitution rate variation, with many different relaxed-clock models described in the literature [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19]. When such models are coupled with a model on the distribution of speciation events over time (e.g., the Yule model [20] or birth-death process [21]), molecular sequence data can then inform the relative rates and node ages in a phylogenetic analysis. "
    ABSTRACT: Time-calibrated species phylogenies are critical for addressing a wide range of questions in evolutionary biology, such as those that elucidate historical biogeography or uncover patterns of coevolution and diversification. Because molecular sequence data are not informative on absolute time, external data (most commonly, fossil age estimates) are required to calibrate estimates of species divergence dates. For Bayesian divergence time methods, the common practice for calibration using fossil information involves placing arbitrarily chosen parametric distributions on internal nodes, often disregarding most of the information in the fossil record. We introduce the "fossilized birth-death" (FBD) process, a model for calibrating divergence time estimates in a Bayesian framework, explicitly acknowledging that extant species and fossils are part of the same macroevolutionary process. Under this model, absolute node age estimates are calibrated by a single diversification model and arbitrary calibration densities are not necessary. Moreover, the FBD model allows for inclusion of all available fossils. We performed analyses of simulated data and show that node age estimation under the FBD model results in robust and accurate estimates of species divergence times with realistic measures of statistical uncertainty, overcoming major limitations of standard divergence time estimation methods. We used this model to estimate the speciation times for a dataset composed of all living bears, indicating that the genus Ursus diversified in the Late Miocene to Middle Pliocene.
    Proceedings of the National Academy of Sciences 07/2014; 111(29). DOI:10.1073/pnas.1319091111 · 9.67 Impact Factor
  • Source
    • "Statistical analysis of ecological complex systems [1], [2], financial data [3] or genetics [4] increasingly relies on stochastic models for data underlying processes. "
    ABSTRACT: We model two time and space scales discrete observations by using a unique continuous diffusion process with time dependent coefficient. We define new parameters for the large scale model as functions of the small scale distribution cumulants. We use the non-uniform distribution of the observation time intervals to obtain consistent and unbiased estimators for these parameters. Closed form expressions for migration proportions between spatial domains are derived as functions of these parameters. The models are applied to estimate migration patterns from satellite tag data. Comment: 25 pages, 5 figures
  • Source
    • "Others assume that branch-specific rates are drawn from a single underlying distribution, such as a lognormal, gamma, or exponential distribution, the parameters of which are estimated from the data (Drummond et al. 2006; Lepage et al. 2007; Rannala and Yang 2007). The available relaxed-clock methods have been compared in several reviews (Magallón 2004; Welch and Bromham 2005; Lepage et al. 2006; Rutschmann 2006), and their performance has been assessed in a number of studies (e.g., Ho et al. 2005; Drummond et al. 2006; Lepage et al. 2007). The new relaxed-clock methods have also introduced more flexible techniques for incorporating calibrations, leading to a lively discussion about approaches to calibrating estimates of divergence times (Graur and Martin 2004; Hedges and Kumar 2004; Donoghue and Benton 2007; Ho 2007). "

    Systematic Biology 06/2009; 58(3):367-80. DOI:10.1093/sysbio/syp035 · 14.39 Impact Factor