Molecular Clocks - Science topic

Questions related to Molecular Clocks
  • asked a question related to Molecular Clocks
Question
3 answers
Dear colleagues,
I would appreciate your help in setting the parameters for a two-node fossil calibration when reconstructing a time-calibrated phylogeny (chronogram) in BEAST.
I have the divergence time between the outgroup (two species) and the crown group of species that I would like to date (43 Mya), and the divergence time between the two outgroup species (8.7 Mya).
Is there any way to obtain the posterior mutation rate, so that I can use this value later to estimate population divergence times from among-population genetic distances?
Many thanks in advance for your help.
Roberto
Relevant answer
Answer
Technically you can define more than one prior in BEAST2.
However, scientifically there might be some issues. To date your "ingroup", one calibration point might be enough (the ingroup/outgroup split). Of course, it would be best to have another calibration point within the "ingroup". Calibrating the outgroup twice (the outgroup-ingroup split and the split between the two outgroup taxa) might bias your results. Be aware that sampling density as well as the underlying data might also bias the dating. You will find plenty of papers about that...
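On the second part of the question (reusing the posterior rate later on), here is a minimal sketch of one way it could be done outside BEAST: read the rate column from the log file, discard a burn-in, and convert a between-population distance into a split time. The column name `clockRate`, the 10-20% burn-in, and the toy log are assumptions for illustration; check the header of your own log for the actual column name. The factor of 2 assumes the rate is per lineage while the distance is measured between two tips.

```python
import csv
import io
import statistics

def posterior_rate(log_lines, column="clockRate", burnin_fraction=0.1):
    """Mean of one column of a BEAST-style log after discarding burn-in."""
    # The log is tab-separated; lines starting with '#' are comments.
    rows = csv.DictReader(
        (line for line in log_lines if line.strip() and not line.startswith("#")),
        delimiter="\t",
    )
    values = [float(row[column]) for row in rows]
    kept = values[int(len(values) * burnin_fraction):]
    return statistics.mean(kept)

def divergence_time(pairwise_distance, rate_per_lineage):
    """Split time of two populations; distance accumulates along both lineages."""
    return pairwise_distance / (2.0 * rate_per_lineage)

# Toy log, made up for illustration (rate in substitutions/site/Myr).
toy_log = io.StringIO(
    "# toy BEAST-style log\n"
    "Sample\tposterior\tclockRate\n"
    "0\t-5000.0\t0.050\n"
    "1000\t-4200.0\t0.012\n"
    "2000\t-4100.0\t0.010\n"
    "3000\t-4101.0\t0.008\n"
    "4000\t-4099.0\t0.010\n"
)
rate = posterior_rate(toy_log, burnin_fraction=0.2)
print(rate, divergence_time(0.02, rate))
```

A point estimate like this throws away the uncertainty in the rate; applying `divergence_time` to every retained sample instead would give a distribution of split times.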
  • asked a question related to Molecular Clocks
Question
3 answers
I am trying to calculate the origin time of some bacterial lineages, and I am testing BEAST2 with a very simple dataset of only 12 taxa and one protein sequence of 1,000 amino acids, using the WAG model. In the "Priors" panel of BEAUti I set the root age prior to 3500 Ma and the prior on one cyanobacterial lineage to 1200 Ma, with a normal distribution, and I used a calibrated Yule model with a fixed starting tree (with 4 parameters set to 0). However, I keep getting results with very short branch lengths, and the ESS is always low even when I set the chain length to 40,000,000. Could anyone offer some suggestions? Thanks a lot!
Relevant answer
Answer
Hi Jessy,
My first guess is that there is nothing wrong with it, only the proportional differences between the two main groups are hugely higher than the difference among sequences within it. So if you zoom it in you will see taxa relationship as expected. You can also try to plot each group separately.
I would try that first if I was you.
I hope it helps.
Best
  • asked a question related to Molecular Clocks
Question
9 answers
What is the best way to date a phylogenomic tree using fossil calibration? It is more or less straightforward with a few Sanger loci using programs like BEAST, but it becomes intractable with hundreds of genes, as produced with phylogenomic approaches (e.g., target capture). Just wondering if anyone had any opinions?
Thanks a lot!
Relevant answer
Answer
First, you must accept that there is no "best" way. Instead, you need to treat estimating the divergence time of any lineage, with any kind of data, as a serious endeavour that requires careful attention and thorough exploration of the data. Secondly, you have to explore both the molecular data and whatever calibrations are available (mainly the fossil record) with due attention to detail. Thirdly, you need to pay attention to the biology of your organisms. Finally, you can explore different MCMC-based programs, but keep in mind that none of the established models will be suitable for all cases, and that the differences lie less between programs than between the models of evolution available in those programs.
  • asked a question related to Molecular Clocks
Question
6 answers
I have long been wondering about the following problem.
The regular, conventional theory of evolution states that mutations occur randomly.
The molecular clock states that mutations occur at a relatively constant rate over geological time (thus making it possible to establish phylogenetic relationships).
Are not these two statements completely incompatible?
How can random events lead to a regular mechanism?
Albert Jacquard stated that the simplest hypothesis to explain the constant rate of amino-acid substitutions in polypeptides was to postulate that spontaneous mutations occur at a constant rate over time. Is not such an explanation tautological?
How is it possible to establish a logical correlation between random mutations and the molecular clock?
Relevant answer
Answer
A way of understanding this is to look at an analogous random process that is used to measure time: radioactive half-life. Radioactive decay of any individual atom is a random event; we can never predict when a single atom might undergo radioactive decay. But if we look at a large number of the radioactive atoms, we can say how much time it will take for half of them to have decayed. For example, polonium-210 has a half-life of 138 days; if we start with a pot of a billion polonium-210 atoms and wait 138 days, about 500 million of them will have decayed into a different element. Rhodium-101 has a half-life of 3.3 years. If we start with a pot of 100 of them, 3.3 years later we will have around 50 left.
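To make the half-life analogy concrete, here is a small simulation sketch (my own illustration, with a fixed random seed so it is repeatable). Each atom decays independently and at random, yet the surviving fraction after one half-life settles ever closer to one half as the number of atoms grows.

```python
import random

def surviving_after(n_atoms, half_life, elapsed, rng):
    """Count atoms that have not decayed after `elapsed` time units."""
    # Each atom independently survives with probability 0.5 ** (elapsed / half_life).
    p_survive = 0.5 ** (elapsed / half_life)
    return sum(1 for _ in range(n_atoms) if rng.random() < p_survive)

rng = random.Random(42)  # fixed seed so the run is repeatable
for n in (100, 10_000, 1_000_000):
    left = surviving_after(n, half_life=138.0, elapsed=138.0, rng=rng)
    # The fraction left scatters noticeably for 100 atoms and tightens for a million.
    print(n, left, left / n)
```

The same reasoning carries over to substitutions: a single site is unpredictable, but counts over many sites and long time spans become regular enough to measure time with.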
The same principle applies to molecular evolution. Neutral mutations (mutations which do not change the function of the sequence) are incorporated into sequences randomly*. We cannot say when a mutation might occur, but if we look at a lot of them, we can see a generalised rate of mutation. For example, it might be 5 substitutions (mutations) per 1,000 sites per million years. If we look at the same 1,000-nucleotide gene from two different species, and there are ten changes between them, we infer that around 2 million years have passed since they shared the same sequence, that is, since they were the same species (this treats the example rate as the rate at which the two sequences diverge from each other).
* Just to address a common misunderstanding: mutations can be random (though not always), but natural selection is definitely not random. Most mutations are rejected by natural selection, but the substitutions (= mutations) we use for molecular clocks are not affected by selection (a non-random process). For the molecular clock, the mutations are neutral: in protein-coding genes, they produce the same protein despite a change in the nucleotide content of the gene sequence. Without going into detail, each nucleotide triplet "word" (codon) that specifies an amino acid (the components of proteins) can be "spelled" in different ways that have no effect on the phenotype, so have no effect on fitness, and so cannot be subject to selection. (I am over-simplifying here.)
Overview
Kimura - Neutral theory of molecular evolution
Zuckerkandl & Pauling 1965
  • asked a question related to Molecular Clocks
Question
1 answer
I did several runs of NS (nested sampling) analysis in BEAST2 (through the CIPRES Science Gateway) for particular combinations of priors (fossil calibration, molecular clock, paleogeographical events) to find the model that best describes my data. However, the outputs of the NS analysis fluctuate too much for certain combinations of priors, even though I used a high number of particles (40), which resulted in a low SD (between 1 and 2). For example, one run of the analysis gave a marginal likelihood of 11432.23, whereas another gave 8343.235. What should I improve in the analysis to get stable results for a specific combination of priors, or how should I choose which of these results is better?
Thanks for advice.
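As a rough sanity check on replicate runs like the ones described above, here is a minimal sketch that compares two marginal likelihood estimates against their reported SDs. It assumes the reported values are log marginal likelihoods and uses "within two combined SDs" as the agreement threshold; both are my assumptions, not a rule taken from the NS package.

```python
import math

def runs_agree(ml_a, sd_a, ml_b, sd_b, n_sd=2.0):
    """True if two log marginal likelihood estimates lie within n_sd combined SDs."""
    combined_sd = math.sqrt(sd_a ** 2 + sd_b ** 2)
    return abs(ml_a - ml_b) <= n_sd * combined_sd

def log_bayes_factor(ml_model_1, ml_model_2):
    """Log Bayes factor of model 1 over model 2, from log marginal likelihoods."""
    return ml_model_1 - ml_model_2

# Values from the question; SD = 2 is the upper end of the range quoted there.
print(runs_agree(11432.23, 2.0, 8343.235, 2.0))
```

With a gap of about 3,000 units against SDs of 1 to 2, the two runs cannot both be estimating the same quantity well, so comparing models with `log_bayes_factor` only makes sense once replicates of the same setup agree.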
Relevant answer
Answer
In NS analysis the P value tests the null hypothesis that the three columns (treatments) are the same on average.
•Random effects. The variation among subcolumn means and within subcolumns, expressed as both standard deviations (easier for scientists) and variances (more familiar to statisticians).
•Do the subcolumns differ (within each column). This P value tests the null hypothesis data for all subcolumns are sampled from populations with identical standard deviations.
•Goodness of fit (optional). If you choose this option, Prism reports the number of df and the REML criterion, which will only be meaningful to a statistician familiar with mixed models.
  • asked a question related to Molecular Clocks
Question
4 answers
The 16S rRNA gene is used for phylogenetic studies as it is highly conserved between different species of bacteria and archaea. ... It is suggested that 16S rRNA gene can be used as a reliable molecular clock because 16S rRNA sequences from distantly related bacterial lineages are shown to have similar functionalities.
Relevant answer
Answer
Small subunit rRNA genes (16S and 18S) have both conserved and variable (the so called V1, V2, etc.) regions.
It is the conserved regions that allow us to obtain plausible alignments, even between very distantly related taxa.
The variable parts of the marker (which, conveniently, are located between the conserved ones) allow us to differentiate highly related taxa.
  • asked a question related to Molecular Clocks
Question
8 answers
While running fossil-calibrated molecular-clocks analysis in BEAST, I keep receiving some strange numbers as node ages. I input the numbers in millions of years (see Fig.1) and yet I am receiving mean node ages in numbers like 0.899, 0.371 etc. (see Fig.2).
I am basically rerunning a published time-calibrated phylogenetic analysis after including new OTUs. Therefore, I have some idea of how the results should look, and it seems that the node ages are correct in a relative sense (i.e. in their relative positions), in accordance with that published study.
In other words, the tree itself seems to be fine but my time axis looks like this (Fig.3), while it should look like this (Fig.4, btw I received this picture with correct units on the axis by forcing the root age in FigTree to be in accordance to the published study mentioned above, which is a step I wish to avoid).
I bet this will be some minor issue but perhaps someone will share their experience and save me a bit of time. Does anyone have any ideas?
Relevant answer
Answer
Hi Jan, it's likely that your priors for the TMRCA were not used, or that your settings were not written to the .xml file. Please check your .xml file, or try another version of BEAST to exclude that possibility. It appears to me that BEAST v1.7.4 and v1.8.4 are stable.
  • asked a question related to Molecular Clocks
Question
6 answers
Hello,
I'm wondering if someone could offer some guidance on what should be considered a "minimum" sample size for estimating accurate clock rates/divergence dates within a single BEAST analysis.
I have several E. coli SNP datasets from different MLST sequence types (same species) that are quite divergent from each other (~30K SNPs between sequence types; <100 SNPs within sequence types), and I was considering analyzing each dataset separately, rather than as one analysis, as this allows me to use more appropriate (i.e. closely related) references for SNP calling, etc. From the literature, the expected clock rate is ~2x10^(-7) SNPs/site/year, but when I include ALL isolates from all sequence types in a single root-to-tip regression (in TempEst, to test for a temporal signal), the rate estimate ends up on the order of 10^(-2) SNPs/site/year...which is much higher than expected (and quite unrealistic for bacteria!). I suspect this has something to do with how divergent the sequence types are from each other...although I do not know how to test this...(any thoughts on this would also be welcome!)
The problem is that some datasets end up with only 3-4 bacterial isolates per sequence type - is this too small a sample size for getting accurate results from a BEAST analysis? I have tried searching online, reading other papers, and the BEAST handbook but I can't seem to find anything discussing minimum sample sizes for accurate estimation. Intuitively, I would think that as long as there is a temporal signal in the data (i.e. a root-to-tip regression has a high R^2), then I don't see why the analysis shouldn't work with only a few samples...but from my naive statistical knowledge, small sample sizes are generally a negative...is this true for such Bayesian approaches as well?
Any help is much appreciated!!
Thank you,
Conrad
Relevant answer
Answer
A quick question - are you estimating rates of molecular evolution and divergence times with the SNP data only? If so, that will explain the unrealistically high substitution rate. If you want to use molecular clock models such as those implemented in BEAST2, MCMCTREE, or REVBAYES, you will need the full-length sequence data, not just the variable sites as this heavily biases the models of molecular evolution in addition to inflating the branch lengths.
Sometimes sample size is a problem you can't always overcome and you will need to evaluate if there is information in your data by analyzing the posterior distributions in comparison to the priors and marginal priors.
It's not clear to me what the questions and goals are, but maybe you don't want to actually split these up? If you want to do some comparative work to say something about the ages of the strains, you will need to do everything in a single run. Otherwise, the root ages are going to be poorly estimated and highly susceptible to priors.
In the absence of good calibrations (I would be surprised if there are any ways to calibrate Escherichia), you will probably have to use coalescent methods (STARBEAST2 or BPP), and then rescale coalescent branch lengths to absolute time given some mutation rate and generation time for E coli. If you have tip dates for your samples, you could use those too - check the BEAST handbook.
There are probably simpler things you can do too, since it seems like this is a problem with relatively closely related species/strains/individuals, but the question was about the Bayesian options. I would argue that sample size is not always a concern: you can use multiple individuals per species with the coalescent methods, and that might tighten up some credible intervals, but carefully inspect the posterior distributions and do a run or two without any data to inspect the marginal priors too. I would not estimate divergence times among sequence types separately. Instead, get the topology for all of the strains by ML, then fix the species tree and estimate divergence times on that. Give mapping to the same reference a chance; the filtering should knock down the bulk of the errors.
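For readers unfamiliar with the root-to-tip regression mentioned in the question, here is a minimal sketch of the underlying calculation: an ordinary least-squares fit of root-to-tip distance against sampling date, where the slope is the rate and the x-intercept is the implied root date. The isolates below are made up, and this is not TempEst's own code. The distances must be per site of the full alignment, not per SNP site, for the slope to be comparable with published rates.

```python
def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance against sampling date."""
    n = len(dates)
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    slope = sxy / sxx                    # substitutions/site/year
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    root_date = -intercept / slope       # date at which the fitted distance is zero
    return slope, root_date, r_squared

# Made-up isolates: sampling year and root-to-tip distance (substitutions/site).
years = [2000.0, 2005.0, 2010.0, 2015.0]
dists = [2.0e-6, 3.0e-6, 4.0e-6, 5.0e-6]
slope, root_date, r2 = root_to_tip_regression(years, dists)
print(slope, root_date, r2)
```

With only 3-4 isolates the fit can show a high R^2 by chance, so a high R^2 alone is weak evidence of temporal signal.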
  • asked a question related to Molecular Clocks
Question
4 answers
Hi everyone!
I have a phylogenetic tree obtained using BEAST2 and inferred from two genes (COI and 18S). I would like to use this tree to improve my understanding of the evolution of each of the groups, therefore I thought about a molecular clock.
Unfortunately, I don't have fossils or geologic events that can help me set this molecular clock.
I am working with psyllids (Hemiptera: Psylloidea) and I was wondering if any known (or similar) mutation rate can be applied..
Question within the question, given that my phylogenetic tree was inferred from two genes (having different mutation rates) does this complicate the situation?
Thank you for your time,
Francesco
Relevant answer
Answer
Dear Francesco,
I suggest that, before starting any dating/calibration, you check that your mtCOI sequences are not saturated (DAMBE5 is good software for that). If they are, make different partitions or simply don't use the third codon position. If you mix all positions, you can get an overestimation, since these positions evolve at different rates.
On the BEAST2 website (https://www.beast2.org/tutorials/) there is a really nice tutorial covering the basics of how to calibrate your tree. And don't forget to do a couple of runs from the priors alone, to be sure that your results come from your sequences!
Best,
Diego
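As a small illustration of the partitioning advice above, here is a sketch of how codon positions could be separated in an in-frame protein-coding alignment such as COI. It assumes the sequence starts at codon position 1 and has no frame-shifting gaps; the function names and toy sequence are only illustrative.

```python
def split_codon_positions(aligned_seq):
    """Return the 1st, 2nd and 3rd codon-position sites of an in-frame sequence."""
    if len(aligned_seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3; check the reading frame")
    return aligned_seq[0::3], aligned_seq[1::3], aligned_seq[2::3]

def first_two_positions(aligned_seq):
    """Drop third codon positions, keeping positions 1 and 2 in their original order."""
    return "".join(base for i, base in enumerate(aligned_seq) if i % 3 != 2)

toy = "ATGGCCAAATTT"  # made-up sequence: ATG GCC AAA TTT
print(split_codon_positions(toy))
print(first_two_positions(toy))
```

In practice the same split is usually declared as character sets in the alignment file or in BEAUti rather than by editing sequences, but the indexing logic is the same.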
  • asked a question related to Molecular Clocks
Question
8 answers
I wonder what an average time-span of genera may be for evolutionary lineages of various supergeneric taxonomical level across the tree of life (e.g. fish, mammals, vertebrates, beetles, insects, ecdysozoans, metazoans, flowering plants, embryophytes, fungi etc.). In other words, how long on average may genera live in certain lineages? I am aware of the subjectivity of higher taxonomic categories, but there must be some time-span in which the genus is being found in the paleontological record. Similarly, using molecular clocks calibrated with fossils, we may assume the age of extant genera. Does anyone have some tips for relevant literature?
Relevant answer
Answer
I'm afraid this depends entirely on what you're looking at! Of course, there will be an average (mean, median, etc.), but it will be meaningless because it is measuring different things in different groups.
Of course, a genus is an entirely artificial concept - a genus of a well-studied group of birds is nothing like the same as a genus of nematodes. Recognising genera is the next problem: the features used to define them may not be recognisable in the fossil record. So, for example, we get Lingula going back to the Ordovician, although the soft tissue anatomy is likely to have changed dramatically.
Then there are apparently very long-lived genera such as the pterobranch Rhabdopleura, which appears in near-identical form (including soft tissues, from what we can see) in the Cambrian. Is that because it was morphologically very stable, or because we don't properly understand it?
I know that's not very helpful, though, so here's my perspective on fossil sponges. In most cases, fossil genera appear to last for a few tens of millions, up to around 100 million years. There are, however, many genera recorded from only a single site, so we can't say how long they lasted, but this broad range also covers most of the molecular clock predictions for generic divergence dates. However, there are also some remarkably long-ranging fossil sponges like Nucha (300 million years, apparently)... and we don't yet have a clue what was going on in the deep sea, where the very long-lived sponges tend to thrive (but watch this space...).
As a final thought, I have seen a tendency among some palaeontologists to name new genera for fossils that fall outside the 'expected' time range, despite morphological similarity. This, of course, will result in shorter ranges in the fossil record than the true ages based on divergence of living species within extant genera. So many problems!
  • asked a question related to Molecular Clocks
Question
3 answers
For divergence time estimation
Relevant answer
Answer
Hi,
There is no generally accepted substitution rate. You can calibrate a suitable substitution rate for your target marker in BEAST, using geological events or fossil records. Hope this helps.
  • asked a question related to Molecular Clocks
Question
5 answers
Dear colleagues,
I am having some problems with node dating using substitution rates in MrBayes 3.2.6, even following the example in the program manual.
According to the manual I need to:
1. Set a normal distribution as the prior for the clock rate, e.g. using 0.01 as the mean and 0.005 as the standard deviation, assuming the rate is approximately 0.01 ± 0.005 substitutions per site per million years:
      MrBayes > prset clockratepr = normal(0.01,0.005)
2. Modify the tree age prior to an exponential distribution with the rate 0.01:
      MrBayes > prset treeagepr = exponential(0.01)
When I run the analysis, the program does not recognize the argument “exponential” when modifying the tree age prior:
      No valid match for argument “exponential”
      Invalid Treeagepr argument
      Error when setting parameter “Treeagepr”
I have checked the "Command Reference for MrBayes ver. 3.2.6" and, in fact, “exponential” does not appear as a valid argument for the Treeagepr parameter, so I think it is an error in the manual, but I cannot find a way to solve it.
Has anyone encountered this situation before?
Is there any way to solve the problem?
Many thanks,
Yoannis
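To see what the two priors quoted above would imply numerically, here is a short sketch using only textbook properties of the normal and exponential distributions. It says nothing about how MrBayes itself handles the negative tail of the normal or which Treeagepr arguments version 3.2.6 accepts; it only characterises the distributions as written.

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

# Clock-rate prior from the question: normal(0.01, 0.005), in subs/site/Myr.
mean, sd = 0.01, 0.005
below_zero = normal_cdf(0.0, mean, sd)              # mass on impossible negative rates
central_95 = (mean - 1.96 * sd, mean + 1.96 * sd)   # central 95% interval

# Tree-age prior from the question: exponential with rate 0.01 (mean = 1 / rate).
expected_tree_age = 1.0 / 0.01

print(below_zero, central_95, expected_tree_age)
```

An untruncated normal with these parameters puts a little over 2% of its mass below zero, and the exponential has a mean of 100 in whatever time unit the rate is expressed in (million years here).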
Relevant answer
Answer
Dear Yoannis
I know MrBayes well and I appreciate it a lot. But I highly recommend that you use BEAST for time estimation based on sequences. Both programs perform Bayesian analysis of molecular data using MCMC, but BEAST is entirely oriented towards rooted, time-measured phylogenies. I have had very good experiences with BEAST, using BEAUti to create the input file. BEAUti is a user interface that lets you load sequence data (NEXUS format) and define all the required parameters (taxon groups, molecular clock (relaxed or strict), substitution model, etc.). Finally, the BEAST software package includes some other programs, such as Tracer (which analyses the log file of BEAST or MrBayes) and TreeAnnotator (which searches for the "best tree" among the trees obtained from BEAST).
I recommend that you first work through a tutorial to learn how to use the different programs.
Good luck!
Carolina
  • asked a question related to Molecular Clocks
Question
3 answers
The posterior probabilities are mostly high (>0.90), the ESSs are great, and the MCMC samples converge very well, but I am getting overlapping HPD intervals for the divergence time estimates. I used a relaxed molecular clock with a lognormal distribution. I am analyzing a mitochondrial gene with two calibration nodes: one is mid-interior and the other is very recent.
Relevant answer
Answer
Dear Binod Regmi,
I found this post from the BEAST team very helpful when setting the parameters for divergence time estimates: http://beast2.org/2015/06/23/help-beast-acts-weird-or-how-to-set-up-rates/
  • asked a question related to Molecular Clocks
Question
7 answers
Hi, I want to estimate the molecular divergence times of birds using some specific nuclear genes. Could anyone please suggest the best and easiest software for estimating molecular divergence? I am trying to do this in BEAST; if anyone knows an easy protocol for using BEAST, please share it with me. Thanks in advance.
Relevant answer
Answer
Here is a step by step tutorial. Hope it helps.
                Csaba
  • asked a question related to Molecular Clocks
Question
10 answers
In a molecular phylogeny of fishes produced using a cytb marker of 704 bp, Sota et al. (2005) (http://www.ncbi.nlm.nih.gov/pubmed/15684588) calibrated an ML clock tree with a node corresponding to the MRCA of two lineages that are assumed to have diverged in allopatry for 3.5 million years.
The 'node height' of this calibrated node is 0.047 (in Fig. 3, illustrating the ML clock tree, 'node height' apparently = number of substitutions per site).
The authors state that the calibration resulted in a "substitution rate of 2.7% per million years". Later on, they state that "3.5 million years corresponds to 9.4% sequence difference, giving a molecular clock of 2.7% per My".
I suppose that: 9.4/3.5 = 2.7 ...
The node height (0.047) should in fact be the branch length, or the number of substitutions separating the MRCA to one of the two sister lineages, divided by the length of the sequence (704), that is, the (average) number of substitutions per site. In this case, the 'divergence' between the two sister sequences should be twice this amount (the number of substitutions per site between the two sequences, along both branches), or 0.094.
By dividing the divergence (0.094 or 9.4/100, or '9.4%') by 3.5 million years, the authors found a 'divergence rate' of 2.7% per million years.
This however is referred to as the "molecular clock", or the "substitution rate".
Indeed, many authors (including me) would in this case use the term 'substitution rate' to indicate the average number of substitutions per site between the MRCA and one of its descendants, that is 0.047/3.5 = 0.0134 per million year, or '1.3% per million year'.
(incidentally, it always puzzled me why this complication of the '%', which should correspond to a 'rate per 100 million years').
When Sota et al. (2005) compare their "fish cytb molecular clock" of 2.7% per million year with the estimates of different studies (Orti et al. 1994; Cantatore et al. 1994), they find a range 0.8-2.8% per million year that is perfectly compatible with both the 'divergence rate' (2.7%) and the 'substitution rate' (1.3%) calculated above ... a misunderstanding of these rates is obviously very easy, since it is entirely possible that these other authors reported 'substitution rates', and not 'divergence rates'.
I'd be happy to share your thoughts about this topic.
Gianluca
Relevant answer
Answer
The % expression is an expression of the molecular divergence.  It should be read as "per 100 base pairs".   Also, the rate of molecular divergence should be the same regardless of how it is calculated.  It does not matter if you call it divergence rate or substitution rate.  It is all the same thing.  If one rate is about half of the other rate, then someone has miscalculated.  Of course, the date estimates of fossils have a large amount of error.  This should be taken into account in proposing dates.  When possible, multiple fossil calibrations should be used.  
If you calculate a rate from the MRCA, as represented by a fossil, then you need to divide the observed divergence between the OTUs in half.  (This would be the same as using the node height on an ultrametric tree.)  If you are using someone else's rate, you can get the divergence time by dividing the divergence by the rate (all units cancel except time); in that case, you don't divide in half.  
You are right, this is easily confusing.  It looks like the authors, editors, and reviewers of that 2005 paper were confused as well.  The question is: Is the error in how they used the term "node height" , or is the calculated rate inflated by 100%?  Their figure defines node height as substitutions per site.  But, we don't know if this is an estimate from the MRCA or a divergence between the OTUs.  You could check this by using their NS model to calculate divergence between the OTUs, using sequences from GenBank.  Also, you could ask them.  
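The arithmetic discussed in this thread can be sketched as follows, using the figures quoted in the question (node height 0.047 substitutions per site, calibration age 3.5 My). The function names are only illustrative. The per-lineage rate uses the node height directly; the pairwise divergence rate uses twice the node height, because both sister branches accumulate changes.

```python
def rates_from_node_height(node_height, age_myr):
    """Per-lineage and pairwise rates from an ultrametric node height."""
    lineage_rate = node_height / age_myr           # MRCA to one tip
    divergence_rate = 2.0 * node_height / age_myr  # tip to tip, along both branches
    return lineage_rate, divergence_rate

def time_from_pairwise_distance(distance, divergence_rate):
    """Divergence time from a tip-to-tip distance and a pairwise divergence rate."""
    return distance / divergence_rate

lineage, divergence = rates_from_node_height(0.047, 3.5)
# About 1.3% and 2.7% per million years, respectively.
print(f"{lineage:.4f} {divergence:.4f}")
```

Whichever convention a paper uses, the rate and the distance must match: a per-lineage rate goes with node heights (or half the pairwise distance), and a divergence rate goes with the full pairwise distance.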
  • asked a question related to Molecular Clocks
Question
4 answers
I wanted to know whether continuous light (LL) and continuous darkness (DD) have any effect on the circadian rhythm of clock gene expression in the fish pineal over a 24-hour period, mainly for the clock and bmal genes. LL and DD are free-running conditions, in which no environmental cues act. So, in this situation, do the clock genes show any rhythm, as they do under a 12L:12D condition?
Relevant answer
Answer
Hello Saurav,
By definition, circadian rhythms persist (at least for some time) in the absence of environmental influence (e.g., light/dark signal). Most of these rhythms are under the influence of the master circadian oscillator in the brain (the mammalian SCN, or analogous systems in other taxa). That said, moving your fish into a DD environment should not influence the persistence of the rhythm of gene expression. What will likely change is the period and amplitude of the rhythm.
  • asked a question related to Molecular Clocks
Question
4 answers
Hi all,
I am interested to know the estimated mitochondrial mutation rate (substitutions/site/my) in rodents (actually shrews) with the idea of calibrating a molecular clock (COXII gene). I have seen from the literature that it is common to see Cytb third codon positions for such purpose. I have also come across some papers on the Control Region, but being non coding is not ideal for comparisons. I have little variation (population study) and I just want an 'average' estimate to start from. I really want to avoid the 1%-2% estimate of Brown (1979) as it is most likely to be too conservative and outdated.
Thanks to all for any help/advice on this one.
Michael  
Relevant answer
Answer
 Hi Michael, in our recent squirrel paper (Corrie et al. 2015 in J. Hered.) we tested some different substitution rates from Horn et al. (2011) "Mitochondrial genomes reveal slow rates of molecular evolution and the timing of speciation in beavers (Castor), one of the largest rodent species. PLoS ONE 6:e14622." These were based on mitogenomes excluding the CR.
  • asked a question related to Molecular Clocks
Question
1 answer
Hi, molecular clock rates are widely used to link genetic divergence in invertebrates to vicariance; for example, geological events in the Pleistocene, or earlier in the Miocene. My question is: how far back in time is appropriate for (invertebrate) mtCOI dating analysis? Is the Mesozoic too far back in time? (By the way, I realise that the use of mtCOI molecular clock rates is controversial.)
Relevant answer
Answer
Riddle and Hafner (chapter 7 in Biogeography in a changing world, Ed. Ebach and Tangney 2007) compiled a graph showing keyword search on ISI web of science using "phylogeograph" and geological time, and had produced some results for "Cretaceous". However, there are considerably more results for Holocene-Miocene than Oligocene to Cretaceous. Pleistocene has the modal peak. 
Although this does not answer why there are not many phylogeographic studies for Oligocene-Cretaceous events, which is a question I am also wondering, I hope this is helpful to you.
  • asked a question related to Molecular Clocks
Question
3 answers
I was trying to do Bayesian analysis on some of my sequence data using BEAST 1.7.5 to see how closely related they are and their migration patterns. 
The substitution model used was GTR+I+G (strict molecular clock). I did 10 million iterations primarily to have a better ESS thus a rich posterior probability. Well it worked fine and for each run, I had ESS <700.
But once their locations (a discrete trait) were added to the analysis, the ESS dropped to <10. Even after combining 4 independent runs, the ESS remained low (<75). The trees generated by each run were significantly different, and the location patterns don't seem right. The branch colours were really confusing.
Can anyone help me to get this analysis right with the discrete trait (location)?
I guess if everything goes right, the posterior probability values I got w/o locations should be similar to with locations, right?
My expertise with Bayesian algorithms and BEAST/BEAUti is extremely low.
Thanks
Relevant answer
Answer
Hi Harindra,
I meant the number of states in your trait (e.g. areas). Also, which trait reconstruction model are you using (symmetrical or asymmetrical)? Try running for more generations; I think it's just a convergence issue. As the model gets more complex, it needs more time to converge. If you increase the chain to, say, 50 M generations, remember also to set the sampling interval to every 5,000 generations so that you end up with 10,000 sampled states.
cheers,
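To make the sampling arithmetic and the ESS idea concrete, here is a minimal sketch. The ESS estimator below is a generic textbook version (number of samples divided by one plus twice the sum of positive autocorrelations); Tracer's own estimator differs in detail, so treat the numbers as indicative only.

```python
import random

def n_logged_samples(chain_length, log_every):
    """Number of states written to the log for a given sampling interval."""
    return chain_length // log_every

def effective_sample_size(trace, max_lag=None):
    """Generic ESS estimate: n / (1 + 2 * sum of leading positive autocorrelations)."""
    n = len(trace)
    mean = sum(trace) / n
    centered = [x - mean for x in trace]
    var = sum(c * c for c in centered) / n
    if var == 0.0:
        return float(n)
    max_lag = max_lag or n // 2
    rho_sum = 0.0
    for lag in range(1, max_lag):
        cov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = cov / var
        if rho <= 0.0:  # stop at the first non-positive autocorrelation
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)

print(n_logged_samples(50_000_000, 5_000))

# Toy traces: independent draws versus a strongly autocorrelated chain.
rng = random.Random(0)
independent = [rng.gauss(0.0, 1.0) for _ in range(1000)]
sticky = [0.0]
for _ in range(999):
    sticky.append(0.95 * sticky[-1] + rng.gauss(0.0, 1.0))
print(effective_sample_size(independent), effective_sample_size(sticky))
```

The point of the comparison is that 1,000 logged states from a slowly mixing chain can carry far less information than 1,000 independent draws, which is why a more complex model often needs a longer chain.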
  • asked a question related to Molecular Clocks
Question
4 answers
The quartet method is one of the methods used for molecular clock analysis. It has some advantages over other methods and some weaknesses compared with them. I am therefore looking for information about the strengths and weaknesses of the quartet method; if you have any experience with it or know of a relevant paper, please explain.
Relevant answer
Answer
  • asked a question related to Molecular Clocks
Question
10 answers
Hello there.
What I mean: I have sequences of one mitochondrial gene from one species, sampled in different regions of the world. I see micro-variation in it (haplotypes). It is difficult to find a specific, good value to describe the diversity of the networks I get from my data, so I thought about using a molecular clock to compare my datasets. Ideally, I want an estimated evolutionary time for each dataset, so that I could say, for example: in place A an estimated 1.2 million years have passed, whereas in place B only an estimated 0.92 million years have passed.
Is there a software to use for such a thing? I searched already, but only found clocks for interspecific questions, or to get a timeline for some trees and so on.
Thank you all in advance for your help.
Relevant answer
Answer
Using coalescent samplers such as BEAST and IMa is important as noted above. You also need to be careful of "time-dependency" in which rates calibrated at intraspecific time scales (say 1 million years or so) are *apparently* faster than rates calibrated from older time points (e.g. fossils or the Isthmus of Panama). Hypothesized reasons for this phenomenon have to do with nearly neutral theory and weak purifying selection. So if you use a (slow) interspecific rate on an intraspecific estimate, you will likely get dates that are biased high. You might search the literature for ancient DNA calibrations on something related to your taxon, or alternatively I recently wrote a paper that allows one to make relatively recent calibrations from known expansion events (attached) :-)
  • asked a question related to Molecular Clocks
Question
9 answers
Molecular clock estimation is frequently used in phylogeographic studies to determine the divergence time of a specific taxon within a group of taxa and to explore phylogeographical scenarios. How can I do this? I am not able to run the BEAST software. Do I have to use it, or are there other tools for this?
Relevant answer
Answer
Dear Gonzalez
Thank you so much for your answer.
Regards
  • asked a question related to Molecular Clocks
Question
1 answer
My current research focuses on the family of polyketide synthase (PKS) genes in filamentous fungi, particularly entomopathogenic fungi. Our previous work identified a group of reducing clade III PKSs that are highly specific to, and highly conserved in, these insect-pathogenic fungi. Interestingly, unlike other groups of PKSs, a reducing clade III PKS is present as a single-copy gene in the fungal genome, as verified by the available genome sequences of two fungi, Beauveria bassiana and Cordyceps militaris.
From an evolutionary viewpoint, the fact that this reducing clade III PKS gene is highly conserved and single-copy in the genome leads to my hypothesis that this PKS could be the ancestor of this PKS gene family.
I would like to ask an expert in the field how to determine, using software, which clade is ancestral and which clades are descendants, and/or how to estimate the ratio of nonsynonymous to synonymous substitutions. I understand that PAML can do that. However, the program seems difficult to use and we are not evolution people. I have tried the graphical user interface version of this software, PAMLX, but I still could not complete the analysis, likely because of incorrect settings or options. I was wondering if anyone could give me a clue about this analysis. Any input or comment is highly appreciated.
Relevant answer
Answer
I am not an expert in evolutionary biology; however, here are my recommendations. To calculate synonymous and nonsynonymous substitutions, you can use MEGA 5. It is user-friendly and easy to follow. To calculate the ratio of nonsynonymous to synonymous substitutions, you may try the DnaSP software. It is free to download and very useful for other evolutionary analyses as well. These two programs will be an excellent start.
Reconstruct a rooted phylogenetic tree; it will help you to approximate which clade is ancestral. You can also reconstruct the ancestral states and compare your sequences to them. This will also help you get an idea of whether or not your hypothesis makes sense.
I hope this helps. However, I would recommend that you wait for expert opinions from other RG users.
Best
Nabeel
  • asked a question related to Molecular Clocks
Question
7 answers
Does anybody know or use a method for calibrating the phylogeny of groups without a fossil record? I've read about calibration that uses the genetic distance between sequences to estimate approximate divergence times, but I really don't know how it works.
Any advice?
Relevant answer
Answer
That is called a molecular clock. Basically, at some point, someone will have generated a series of time-calibrated divergence dates and measures of sequence divergence for some group of organisms for some gene. If you examine the same gene and are willing to assume that your group is evolving at the same rate as the calibrated group, then you can use the calibrated "rate" (% divergence per million years) to estimate the age. There are more sophisticated methods available, such as a Bayesian program called BEAST, but it still requires some sort of calibration estimate to infer the ages of divergence of the nodes on the tree.
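The arithmetic behind a borrowed rate can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sequences are already aligned, the distance is an uncorrected p-distance (which underestimates divergence for old splits because of multiple substitutions at the same site), and the rate is a pairwise "% divergence per million years" taken from some other calibrated group. The function names, the toy sequences and the 2% rate are hypothetical.

```python
def p_distance_percent(seq_a, seq_b):
    """Percent of differing sites between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * differences / compared


def age_from_divergence(percent_divergence, percent_per_myr):
    """Age in Myr under a strict clock: divergence / pairwise divergence rate."""
    if percent_per_myr <= 0:
        raise ValueError("rate must be positive")
    return percent_divergence / percent_per_myr


# Toy aligned sequences (20 sites, 1 difference), for illustration only.
seq_1 = "ACGTACGTACGTACGTACGT"
seq_2 = "ACGTACGTACGTACGTACGA"
assumed_rate = 2.0  # hypothetical borrowed rate, % divergence per Myr

divergence = p_distance_percent(seq_1, seq_2)
age = age_from_divergence(divergence, assumed_rate)
print(f"Divergence: {divergence:.1f}%  ->  estimated age: {age:.2f} Myr")
```

The estimate is only as good as the assumption that your group evolves at the same rate as the calibrated group. Programs such as BEAST replace this single division with a model that carries the uncertainty in the rate and in the tree through to the node ages.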