Phylogeography and Phylogenetic Biogeography

    How do I interpret the symbols in the RASP results?

    This is probably a beginner question, but after I run the BBM analysis and open the graphic tree view, how should I interpret the symbols in the event matrix on the Information tab? An example:

    Event Route:
     What do the numbers under Dispersal, Vicariance and Extinction mean? And do the symbols in the event route between the C areas have the same meaning as they have in mathematical equations?

    How do you align DNA sequences for phylogenetic work? What is the order of events and what software do you use?
    I anticipate having to align sequences from hundreds of loci for more than 100 taxa. What is the most efficient way to go about this? I usually use MUSCLE (or Clustal), followed by manual alignment in Mesquite. The ability to select blocks (within the larger sequence) across multiple taxa and align those blocks using MUSCLE also helps. With this many loci and sequences I imagine I will be looking for shortcuts, but I don't want to compromise the quality of the final product too much.
    Bihua Chen

    Software: MEGA

    Why do a few reptiles follow the inverse of Bergmann's rule?

    Debaprasad Sengupta

    @ Abhishek Raj,

    Thanks for the paper. It will be useful

    Can anyone guide me in running a molecular clock analysis?
    Molecular clock estimation is frequently used in phylogeographic studies to determine the divergence time of a specific taxon within a group of taxa and to explore phylogeographical scenarios. How can I do this? I'm not able to run the BEAST software. Do I have to use this software? Are there any other tools to do this?
    A, Mahmoudi

    Dear Martin,

    Actually, I have some problems with setting the priors. There are default priors, but one can also set the priors oneself.

    Any comments would be appreciated,


    How can I trace the trait evolution history along a phylogenetic tree without branch lengths?

    Hello everyone! I have a question about trait evolution analysis.

    I have estimated a continuous trait that I am interested in for several species. I know those species can be separated into several groups, but I only know the topological relationships among the groups, not the precise relationships among all species.

    Now I want to analyse the evolutionary history of this trait (e.g. the state of the most recent common ancestor, tracing the character at every lineage-split node).

    So, how could I achieve that goal, even when I have no branch lengths leading to the tips of this phylogeny?

    Thank you all! Any comments would be appreciated!

    Actually, the main problem in this project is the molecular phylogenetic relationships:

    1. Sequence data are not available for most of the species I studied;

    2. BUT my species can be clustered into several groups, and I have sequence data for other species in those groups;

    so the available sequences do not match at the species level, only at the group level.

    Bernd Schierwater


    What is the correct method for calculation of gene flow (Nm) from Gst value?

    Nm = (1/Gst - 1)/4, or Nm = (1 - Gst)/(4 Gst), or Nm = 0.5(1 - Gst), or some other method?
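
    A note on the expressions in the question: the first two are the same formula written two ways, since (1/Gst - 1)/4 simplifies to (1 - Gst)/(4 Gst). Both follow from Wright's island-model approximation Gst ≈ 1/(4Nm + 1) for diploid nuclear markers; the third expression is a different quantity and does not follow from that approximation. A minimal Python sketch to check this numerically; the function name `nm_from_gst` and the `ploidy_factor` parameter are illustrative choices, not taken from any package:

```python
def nm_from_gst(gst, ploidy_factor=4):
    """Estimate Nm from Gst under the island-model approximation
    Gst ~ 1 / (ploidy_factor * Nm + 1).

    ploidy_factor=4 is the usual diploid nuclear case. A factor of 2 is
    commonly used for haploid, uniparentally inherited markers; treat that
    as an assumption to check against your own marker system.
    """
    if not 0 < gst <= 1:
        raise ValueError("Gst must be in (0, 1]")
    return (1.0 / gst - 1.0) / ploidy_factor


gst = 0.2
form_a = nm_from_gst(gst)               # (1/Gst - 1) / 4
form_b = (1.0 - gst) / (4.0 * gst)      # (1 - Gst) / (4 Gst)
form_c = 0.5 * (1.0 - gst)              # the third expression in the question
print(form_a, form_b, form_c)
```

    The estimate is only as good as the island-model assumptions (equilibrium, equal population sizes, symmetric migration), so it is probably best read as a rough index rather than a measured migration rate.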

    Can anybody suggest methods for phylogenetic analysis and identification of Chlorella sp.?

    Could anyone suggest methods for molecular identification of Chlorella species and for phylogenetic analysis to confirm their identity?

    Volker A R Huss

    See the most recent paper dealing with new description of a Chlorella species and references therein: Ma et al. (2015) Hydrobiologia 760:81-89

    How to construct the input matrix to infer haplotype network using TCS software?

    Hi everybody! I've been trying to construct a haplotype network using TCS software (Molecular Ecology (2000) 9, 1657–1659) but I'm having problems with the input matrix. I realize that TCS was developed to infer haplotype networks from DNA sequences. However, a large number of studies infer haplotype networks from cpSSR. How can I construct a distance matrix from cpSSR when I only have the allele sizes? Or can I use a matrix with the "raw" data from the cpSSR?

    Thank you in advance!

    When inferring deep branches of an ancient protein phylogeny, is analyzing thousands of sequences better than a more sophisticated analysis with fewer?

    I work on an ancient protein family that exists in both eukaryotes and prokaryotes and my interest is in elucidating its deeper branches. When I pull down the data from NCBI I get about 6000 sequences. If I use an algorithm to help me cluster similar proteins (e.g. 80% similar) it gets reduced, and if I do my clustering with 50% similarity I get about 550.

    The question is, in your experience, which one would you choose:

    - use the smaller dataset (or even smaller) and do the most sophisticated, yet computationally intensive, analyses you can hoping that if there is a signal, these analyses would pick it up and thus the deeper branches get better support.
    - use a lot of data and hope that then the intermediate evolutionary steps could be inferred more easily and the deeper branches get more support?

    Lauri Kaila

    Matthews' answer has an important point that I want to highlight, pointing towards an intermediate approach. I might try the larger data set first. If you find lots of near-duplicates in some clusters, thinning the taxon sampling there may make more sophisticated methods feasible without losing phylogenetic signal for the larger patterns. But, definitely, all of the more deviating taxa should be kept; otherwise severe bias due to long-branch attraction (or rather short-branch expulsion) is almost inevitable.
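
    The thinning idea in this answer can be sketched as a greedy identity filter: near-duplicates collapse onto a representative, while divergent sequences always survive as their own representatives. This is a minimal Python sketch that assumes the sequences are already aligned to equal length; `thin_by_identity` and the toy sequences are hypothetical, and a dedicated clustering program would be the practical choice for thousands of sequences.

```python
def identity(a, b):
    """Fraction of aligned positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def thin_by_identity(seqs, threshold=0.8):
    """Greedy thinning: keep a sequence only if its identity to every
    representative kept so far is below `threshold`."""
    representatives = {}
    for name, seq in seqs.items():
        if all(identity(seq, rep) < threshold for rep in representatives.values()):
            representatives[name] = seq
    return representatives


# Toy data (made up for illustration only).
toy = {
    "a1": "MKTAYIAKQR",
    "a2": "MKTAYIAKQK",   # 90% identical to a1, so it is thinned out
    "b1": "MLSDEDFKAV",   # divergent from a1, so it is kept
}
kept = thin_by_identity(toy, threshold=0.8)
print(sorted(kept))
```

    Because a sequence is dropped only when it is close to something already kept, the deviating taxa stay in the data set, which is the property the answer asks for.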

    Can paleoclimate be correlated to population genetic divergence to explain distribution of a sp. of extant tropical highland plant?

    What kind of treatment can be done and what kind of data is needed? And what meaningful implication can we observe? (to plant conservation, etc)

    Csaba Csuzdi

    Here is a nice paper dealing with the speciation pattern of blind mole-rats (Spalacidae). The authors showed that, long before the Quaternary climatic fluctuations, dry/wet climatic cycles caused significant speciation in the investigated group (of course, tectonic changes also resulted in speciation).

    How do I make a correct bootstrap in R using dominant markers?
    I am trying to analyze some AFLP data and make dendrograms. I need some bootstrap support, but something seems to be wrong. I built my tree like this:

        dist=vegdist(gel, method="jaccard", binary=TRUE, diag=FALSE, upper=FALSE, na.rm = FALSE)
        dendro=hclust(dist,"complete")
        dendro2<-as.phylo(dendro,cex=0.5)

    Then I used the function boot.phylo:

        boot.phylo(dendro2,gel,make.tree,B=100,rooted=TRUE)

    Most of my values equal 0, even for clades that are far from the others on the tree and that seem logical (different places etc.). Is it the bootstrapping method that is wrong, or is it something else? Any ideas?
    Bjarne Larsen

    I am working with a polyploid SSR dataset. I have successfully made a distance matrix using the package "polysat":

    testmat <- meandistance.matrix(appleDK)

    and a NJ tree using the package “ape”:

    appleDK <- nj(testmat)


    Now, I need to add bootstrap values to my NJ tree.

    How do I do that?
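
    For both bootstrap questions in this thread, the underlying logic is the same: resample the markers (columns) with replacement, rebuild the tree with exactly the same distance and clustering steps, and count how often each original clade reappears. As far as I understand, boot.phylo in ape does this by calling the function you pass it on resampled versions of the data matrix, so that function has to reproduce the whole tree-building pipeline. Below is a language-neutral sketch in Python, not the ape implementation; the function names and toy band profiles are hypothetical, and it uses Jaccard distance with complete linkage to mirror the AFLP example.

```python
import random
from itertools import combinations


def jaccard(a, b):
    """Jaccard distance between two 0/1 band profiles (shared absences ignored)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def complete_linkage_clades(profiles):
    """Return every clade (frozenset of sample names) formed during
    complete-linkage clustering on Jaccard distances."""
    clusters = [frozenset([name]) for name in profiles]
    clades = set()

    def dist(c1, c2):
        # Complete linkage: distance between the two most distant members.
        return max(jaccard(profiles[a], profiles[b]) for a in c1 for b in c2)

    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
        clades.add(merged)
    return clades


def bootstrap_support(profiles, replicates=100, seed=1):
    """Proportion of bootstrap replicates in which each original clade reappears."""
    rng = random.Random(seed)
    n_markers = len(next(iter(profiles.values())))
    reference = complete_linkage_clades(profiles)
    counts = {clade: 0 for clade in reference}
    for _ in range(replicates):
        # Resample marker columns with replacement, same columns for all samples.
        cols = [rng.randrange(n_markers) for _ in range(n_markers)]
        resampled = {name: [p[c] for c in cols] for name, p in profiles.items()}
        found = complete_linkage_clades(resampled)
        for clade in reference:
            if clade in found:
                counts[clade] += 1
    return {clade: counts[clade] / replicates for clade in reference}


# Toy AFLP-like data: two clearly separated pairs of samples.
toy = {
    "A1": [1, 1, 1, 1, 0, 0, 0, 0],
    "A2": [1, 1, 1, 1, 0, 0, 0, 0],
    "B1": [0, 0, 0, 0, 1, 1, 1, 1],
    "B2": [0, 0, 0, 0, 1, 1, 1, 1],
}
support = bootstrap_support(toy, replicates=50)
for clade, value in sorted(support.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(clade), value)
```

    If support values come out near 0 for clades that are obviously distinct, one thing worth checking is whether the resampling runs over markers rather than samples, i.e. whether samples are the rows and markers the columns of the matrix being resampled.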

    What could possibly be the reason for getting really low ESS when a discrete trait (location) is added to the BEAST analysis?

    I was trying to do a Bayesian analysis of some of my sequence data using BEAST 1.7.5 to see how closely related the sequences are and what their migration patterns look like.

    The substitution model used was GTR+I+G (strict molecular clock). I ran 10 million iterations, primarily to get a better ESS and thus a well-sampled posterior. It worked fine, and for each run I had ESS >700.

    But once the locations (discrete trait) were added to the analysis, ESS dropped to <10. Even after combining 4 independent runs, ESS remained low (<75). The trees generated by each run were significantly different and the location patterns don't seem right. The branch colours were really confusing.

    Can anyone help me get this analysis right with the discrete trait (location)?

    I guess if everything goes right, the posterior probability values I got without locations should be similar to those with locations, right?

    My expertise with Bayesian algorithms and BEAST/BEAUti is extremely low.


    Santiago Sanchez-Ramirez

    Hi Harindra,

    I meant the number of states in your trait (e.g. areas). Also, what trait reconstruction model are you using (symmetrical or asymmetrical)? Try running for more generations; I think it's just a convergence issue. If the model gets more complex, it needs more time to converge. If you increase to, say, 50 M generations, remember to also change the sampling frequency to every 5,000 generations so you end up with 10,000 sampled states.
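
    The arithmetic behind that advice is just chain length divided by sampling interval. A small Python sketch, with hypothetical helper names and ignoring the initial state that some programs also write to the log:

```python
def mcmc_samples(chain_length, log_every):
    """Number of states written to the log for a given chain length
    and sampling interval (initial state not counted)."""
    return chain_length // log_every


def log_interval_for(chain_length, target_samples=10_000):
    """Sampling interval that yields roughly `target_samples` logged states."""
    return max(1, chain_length // target_samples)


print(mcmc_samples(50_000_000, 5_000))
print(log_interval_for(50_000_000))
```

    Keeping the number of logged states roughly constant while lengthening the chain keeps the log and tree files manageable and reduces autocorrelation between successive samples.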


    Could somebody please advise on how to make haplotype networks in DnaSP or MEGA 6?

    I am looking to make haplotype networks in the above-mentioned software or any other such freeware.

    Cristian Cornejo Latorre

    I know three options for making haplotype networks:

    1) Arlequin Software (manually)

    2) Network 

    3) TCS with statistical parsimony


    Any ideas for mitochondrial primer sequence for species level discrimination in plants?

    I am working with closely related species in family Caprifoliaceae. Full ITS amplification is a major problem in this group. I have used several chloroplast regions but not much variation detected.

    Could you suggest some potential mitochondrial regions for plants, or specifically for Caprifoliaceae?

    Danny J Gustafson

    I want to agree with Keir Wefferling - the series of papers by Joel Shaw published in the American Journal of Botany is just what you need. The alternative is the typical barcode sequences (also mentioned above).

    Does anyone have experience in using GenGIS?

    The software merges geographic, ecological and phylogenetic biodiversity data in a single interactive visualization and analysis environment.

    Does anybody know how to build the various layers, such as maps, genetic information and geographic locations?

    Any help would be much appreciated. 

    Eneas Konzen

    OK, glad you made it! 

    Does anybody have information about plant DNA barcode as tool for identification of illegal traffic of endangered species?

    I am looking for information about plant DNA barcode as tool for identification of illegal traffic of endangered species. Thank you very much in advance.

    Cintia Souto

    You can check the Barcode of Life database at 

    Also the Scientific abstracts from the 6th International Barcode of Life Conference in the last issue of Genome

    How do I compare trait in the context of phylogenetic dependence?

    When I try to compare differences in a trait among trees, shrubs and lianas in a big database of woody species of China, I know that ANOVA, the Kruskal-Wallis rank sum test and the Wilcoxon rank sum test could be used in traditional statistics. However, in a phylogenetic context, I am not clear on how or when to use the right technique.

    Phylogenetic analysis is not easy for me; it seems like a brand-new field to an ordinary ecological researcher. Thus, I am looking for help or a potential collaborator.

    thanks for any suggestion.

    Jingming Zheng

    Thanks Vladislav! I will try this.

    May I ask another specific question: as my phylogenetic tree is too big to handle for PGLS, I want to use family-level phylogenetic means to regress one trait against others. I realize that the family mean is not the arithmetic mean, and that the family PICs are not the average of the species PICs in the family. Some papers (Zhang et al., 2010, GEB) mention that this can be solved by "analysis of traits" in PHYLOCOM or "pic3" in the R package picante; the latter has been removed and the former is too much for me. Would you recommend other approaches or literature?

    thanks in advance.


    What is the minimum genetic distance required (under the K80 model) to differentiate a species into two varieties?

    I am barcoding plant populations of a species that are geographically separated. I am finding some genetic divergence within the species (intraspecific), but I am not sure whether it is significant enough to propose two varieties. I would appreciate it if someone could help me decide on a criterion for interpreting the genetic distances.

    Jaspreet Kaur

    The taxa I am working on belong to the same species, but there is controversy regarding their varietal status. Thanks all for your valuable suggestions; I think Andrew is right about the hybridization concept. I am not able to distinguish between the two taxa as varieties. The genetic distances I am getting in my samples are all random.
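
    For reference, the K80 (Kimura two-parameter) model only converts observed differences into a distance; it does not define a cutoff for varieties or species. With P the proportion of transitions and Q the proportion of transversions between two aligned sequences, the distance is d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q). A minimal Python sketch, where the function name `k80_distance` and the choice to skip gaps and ambiguity codes are assumptions made for illustration:

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k80_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.
    Sites where either sequence has a gap or ambiguity code are skipped."""
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # The logarithms are undefined for saturated comparisons (very divergent sequences).
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# Toy sequences differing by a single A<->G transition in 10 sites.
print(k80_distance("ACGTACGTAC", "GCGTACGTAC"))
```

    A common way to use such distances is to compare the spread of within-group distances against between-group distances rather than to apply one fixed threshold, which fits the observation above that the distances look random with respect to the proposed varieties.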

    Will chloroplast regions with relatively few informative characters in one species show many more informative characters in other species?

    Dear all,

    It is probably widely known that chloroplast regions can be ranked by the number of informative characters they contain (Fig. 4 in American Journal of Botany 94(3): 275–288. 2007).

    Recently, I planned to look for genetic variation between two subspecies. I sequenced some chloroplast regions with relatively many informative characters, such as rpl32-trnL, trnQ-5'rps16, psbJ-petA, 3'rps16-5'trnK, atpI-atpH and other high-ranking regions.

    But only one indel and one substitution were detected, in rpl32-trnL and trnQ-5'rps16, respectively. I feel it may not be easy to detect enough variation even if more regions are sequenced. So, I plan to use other species.

    But I have heard that, in some plant species, the ranking of potentially informative characters is not always similar to Fig. 4 in the paper mentioned above. Some low-ranking regions may show far more informative sites than high-ranking regions. If so, I will sequence more regions to find informative sites.

    So, would anyone like to share their experience in primer screening? Is it possible for a low-ranking region to show far more informative sites than high-ranking regions?

    Thanks in advance.

    Claudia Pätzold

    Hi Yue Li,

    It is probably a good idea to go with a nuclear single- or low-copy gene. However, if you are still interested in an improved chloroplast data set (e.g. for comparison with a nuclear region to search for indications of hybridization events), I would recommend a look at a paper series called "The Tortoise and the Hare", in which the relative informativeness of up to 21 noncoding cp regions was compared for different angiosperm families. Maybe you will find a region with higher variability in there.

    Good luck to you

    Can I differentiate between closely related plant species?

    Could anybody kindly suggest highly variable markers for discriminating closely related plant species (identification at lower taxonomic levels)? Is DNA barcoding able to go to that level? I have already used rbcLa, matK, ITS2, trnL-F and psbA-trnH, but they were not able to discriminate. What should I use now? Any other highly variable intergenic spacers, NGS or RAD-seq?

    Biswajit Bose

    Yes Mark, I want to do transcriptome sequencing...But, how to do that.

    I am working on the genus Nardostachys and Valeriana from family Caprifoliaceae

    Does anyone have an idea about the phylogenetic analysis of SSR allelic data?

    I have a project assessing the genetic diversity of breeds of an animal using SSR markers. I have the allelic data from GeneMapper v3.7, but I have no idea how to plot a dendrogram showing the genetic distances. Also, what software can I use to calculate the cophenetic coefficient for my data? Can anyone guide me through this?

    I have also attached an excel file containing the allelic data.

    Thank you

    Birutė Frercks
    • If you have the SSR data as a binary matrix (so your organism is not diploid), then I suggest you use the Treecon software (as I wrote in my last comment).
    • If you have the SSR data for diploids, then you don't need to convert them to a binary matrix; just normalize the GeneMapper data and use the PowerMarker software.
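
    For readers unsure what the binary-matrix form of SSR data mentioned above looks like, here is a minimal Python sketch that turns allele sizes into a presence/absence table with one column per (locus, allele) pair. The function name, the input layout and the toy genotypes are all hypothetical and are not the input format of Treecon or PowerMarker.

```python
def alleles_to_binary(genotypes):
    """Turn per-locus allele sizes into a 0/1 presence matrix.

    genotypes: {sample: {locus: [allele sizes]}}
    Returns (column labels, {sample: row of 0/1 values}).
    """
    columns = sorted({(locus, allele)
                      for loci in genotypes.values()
                      for locus, alleles in loci.items()
                      for allele in alleles})
    matrix = {sample: [1 if allele in loci.get(locus, []) else 0
                       for locus, allele in columns]
              for sample, loci in genotypes.items()}
    return columns, matrix


# Toy diploid genotypes (made up for illustration only).
toy = {
    "breed1": {"L1": [120, 124], "L2": [200, 200]},
    "breed2": {"L1": [120, 120], "L2": [200, 204]},
}
columns, matrix = alleles_to_binary(toy)
print(columns)
for sample, row in matrix.items():
    print(sample, row)
```

    Note that this coding discards allele dosage, so homozygotes and heterozygotes sharing an allele look the same for that column, which is one reason to keep codominant diploid data in its original form when the software supports it.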

    What's the best nuclear marker to show evidence of hybridization and/or introgression in two congeneric snake species?

    I'll probably use Rag-1 or C-mos, but I don't know which one of these two is the best.

    Ignazio Avella

    Thank you everybody!

    Any advice on phylogenetic species evenness and phylogenetic clustering (sensu Helmus 2007) in bacterial communities?

    I calculated PSE and PSC in bacterial freshwater communities and get very low values for PSE (< 0.2) but consistently high values for PSC (>0.8). How does that go together? Shouldn't low tip clustering correlate with higher phylogenetic evenness? Sure, the species abundance is often quite uneven, which drags down PSE, but PSV is also low (<0.5).

    Also: does anyone know of some studies that used these metrics in bacterial communities in order to get an idea of the range of values that are normally found?

    Basil V. Iannone III

    I apologize.  In my prior post, rho should equal -0.65 and not positive 0.65.  Thanks again for any thoughts.

    Can anyone tell me why I am getting an empty result file in Structure?
    Hi everyone. The Structure program itself runs smoothly. I scored the data as 0 and 1 and performed the analysis, but the result files are empty at the end of the simulation. I tried both Windows 7 and 8 with Structure versions 2.3.3 and 2.3.4, but the problem is still there. I prepared the data file as a binary matrix, and the species is assumed to be haploid in the input format when loaded into Structure. The program was set with a burn-in period of 20000 and 100000 MCMC repetitions, using the admixture model with correlated allele frequencies and assuming a set of 10 populations. I set K=2 to K=10 and the program ran smoothly, but I could not get a result file even after many changes. One more thing I want to share: I am getting Ln Like = --- and Ln PD = 0. Is this the reason for the missing results? If so, how can I fix it? Kindly suggest what exactly needs to be done.
    Mahmoud Magdy Elmosallamy

    Right click on the program icon => run as administrator. This should be an alternative to changing the program directory.

    Best wishes

    Ecological niche modeling in freshwater fishes: What is the best way to use ENMs to predict fish distributions? How can we make better data layers?
    Ecological niche modeling is increasingly important for understanding the factors that shape species distributions, as well as testing biogeographical hypotheses about species past, present, and future distributions as well as the role of ecology in speciation. However, most niche modeling work has focused on terrestrial and marine species (sound like conservation biology, in general?). I have previously used MAXENT to develop and project models of fish distributions, and the models we have published exhibited excellent predictive performance. And I am very interested in continuing to do so, particularly through coupling ENMs with phylogeographic analyses, and/or using them to inform phylogeographic hypotheses testing. However, I am skeptical of all models to some degree, and I am wanting to learn whether other techniques exist that would be more suitable for freshwater fish ecological niche modeling and paleoclimatic modeling, other than MAXENT (which is obviously most convenient for me). I am also interested in what the best data layers are for ENM analyses of freshwater habitats. I always want to learn more about these topics, so I figured I would ask here.

    So, first, do such 'better' ENM models exist that could/should be used instead of or in combination with MAXENT? And, if so, what is required to run such other models, and how would the assumptions of these potentially 'better' models differ from those of MAXENT in different cases?

    Second, it seems that a limitation of ecological niche modeling for freshwater taxa is a lack of sufficiently high resolution data layers for aquatic habitats. However, I am unsure about geospatial data repositories or resources for generating more suitable layers, and I would like specific advice about GIS procedures and data layers for making better data coverages. I am aware that some people are already doing this, but usually at very fine spatial scales. The broader community of ecologists and evolutionary biologists interested in fishes would therefore benefit much more from more comprehensive coverages.

    FYI, I should indicate that I am not really interested in using masks over bioclimatic variables to restrict model output to the boundaries of stream and river networks, because many pilot analyses I have run on North American species suggest this does not add much or produce different results relative to running the models without such masks. So, I would prefer to avoid such discussion unless you know or can show me that doing so improves model performance. Thanks in advance for your replies. Take care.
    Ursula Torres

    Hi Justin,

    I agree with you: we are lacking global datasets for freshwater habitats, and I feel that they are a bit ignored compared to terrestrial habitats.

    I am really interested to know if you managed to find ANY data layers for aquatic habitats.

    Thanks in advance


    Do you have any experience or any papers about the quartet method?

    The quartet method is one of the methods for molecular clock analysis. It has some advantages over other methods and some weaknesses compared with them. I am looking into the strengths and weaknesses of the quartet method, so if you have any experience or papers, please share them.

    Valiollah Yousefi

    These papers are useful for your work.

    Is there a convenient software package for drawing minimum spanning networks for phylogenetic studies?
    I have struggled to do this with PowerPoint, but I am sure there must be simpler, better, more efficient software for this purpose. Any suggestions? How have others done this? This is a good example:
    Caner Aktas

    My pleasure, feel free to ask further questions.

    Why are common ancestors extinct?
    Look at any phylogenetic tree and you will find internal nodes. However, do the common ancestors that occupy those internal nodes exist in the real world? Sometimes we come across fossils of missing links. But why are the common ancestors of the missing links mostly extinct?
    Robert A D Cameron

    When we deal with terminal branches, species alive now, the question seems to me to be a matter of definition only. An ancestral species may split into two or more lineages over time. We can separate the lineages, but since we do not have the relevant data for the actual ancestor, we do not know to what extent each has changed. It is likely that the various lineages have changed to different extents, as well as in different ways, relative to their common ancestor, especially if selection has influenced which changes persist. Whether you call any terminal lineage "ancestral" depends on your assessment of the amount of change that has happened, and on your sense of how much (or how little) change permits you to regard forms as being the same or different species. And you don't have real data for the actual ancestor!

