Questions related to Phylogenetic Systematics
I'm currently learning the hows and whys of bioinformatics. This is the third tree I have built, and I noticed that the more sequences I add, the lower the bootstrap support becomes.
My sequences were trimmed using trimAl; I chose to obtain them with no gaps. The evolutionary model was chosen using jModelTest and the phylogenetic tree was generated in MEGA.
This is a Maximum Likelihood tree; my log likelihood is -24465.21. The tree was built using the General Time Reversible model, G+I, gamma 6, NNI, 25 threads, and 500 bootstrap replicates.
Can anyone elucidate why this is happening and what I should keep in mind when constructing the next tree?
Thank you very much
We used a single-locus analysis (ITS) for the provided set of FASTA sequences. We're hoping you could give us advice or tips for making a cohesive interpretation of the phylogenetic tree given the limited data that we have, which are morphological characteristics and genetic sequences only.
I'm working with sequence data from a set of closely related species and, along with other indels and SNPs, there are microsatellite regions in the sequences.
I was planning on building a character matrix using complex indel coding, which from my understanding requires the sequences to all share a 5' or 3' end. Where I'm running into trouble is that there is some variation in the repeats as well, with the occasional insertion or deletion giving me trouble aligning (e.g., the repeats are mostly AT, but there's the occasional extra A or T).
I can't only rely on the microsatellite data, so is there a way to build a character matrix that differentiates the taxa based on the repeat number and the variations in them (insertions, deletions, and substitutions) rather than just depending on the length?
Do you have expertise in obtaining ancient DNA? I am currently looking for researchers who have experience extracting DNA from old herbarium collections, ideally from Ascomycota (Fungi). If so, and you would like to collaborate with a taxonomist on projects about systematics, evolution and biogeography, please contact me at firstname.lastname@example.org or email@example.com and write "ancient DNA" in the subject of the email. I am looking for a collaboration to learn these techniques, and you will also be a coauthor of the results of this project, which starts in July.
The new variety is of natural hybrid origin and has ornamental value. As far as I am aware, it has not been described or discovered before.
The website for the web interface for the genealogical sorting index (gsi) is down, and it seems hard to find the R package as well. Therefore I need an alternative. What do you suggest? I need to test whether morphotypes, as well as ploidy levels, are sorted according to the phylogeny. In many cases they don't seem to be monophyletic, but I see a pattern and need to test whether it is significant.
Hi, I can't find published papers using the make.simmap function in phytools (R) for discrete characters; however, it is supposed to implement a method similar to SIMMAP 1.5 and it runs better than the original program. I've had problems inputting a matrix with uncertainties (?) in SIMMAP 1.5, and have to manually edit a huge matrix every time I run the program. Is there any reason to prefer the original program over Revell's function for R? Has anyone tested both and obtained similar results? Thanks a lot.
How do we select the best outgroup for constructing a phylogenetic tree? Suppose my isolate is Gram-positive. Can I select the outgroup from the Gram-negative group?
I sequenced a marine actinobacterium for 16S, then assembled and identified it in EzTaxon. It showed 98.58% similarity with Citricoccus yambaruensis. Can it be a new species? Can I go for whole genome sequencing to establish its novelty?
Seeking valuable comments and directions...
Thanks in advance.
The question stated above is the title of a book review (see http://www.journals.uchicago.edu/doi/pdfplus/10.1086/694936?utm_content=bufferaebfd&utm_medium=social&utm_source=linkedin.com&utm_campaign=buffer ). I thought it would be interesting to read both opinions about the book reviewed ("The Future of Phylogenetic Systematics: The Legacy of Willi Hennig.") and colleagues' own answers to the questions: "Does the Future of Systematics Really Rest on the Legacy of One Mid-20th-Century German Entomologist?"
I'm including P. orbignyi in a population analysis for a conservation genetics study. There are a few sequences in GenBank, but none of them is from Brazil, where the distribution of this stingray was originally described. Such a sequence is important as a starting or reference point for the genetic analysis.
There are many methods used in phylogeny, viz. Maximum Likelihood, Neighbor-Joining, Minimum Evolution, UPGMA and Maximum Parsimony trees.
I simply have two questions:
What are the criteria for a new species, and how can molecular studies support them?
Which method is best for inferring species relationships or identifying a new species/genus?
I need to compare a particular protein-coding (nucleotide) sequence and its amino acid sequence across vertebrates and non-vertebrates, and to understand the mechanisms of evolution and relatedness using phylogenetic analysis. I'm unsure how to choose between distance methods, parsimony, maximum likelihood, and neighbor joining for a correct analysis, and also how to construct and interpret trees for the nucleotide and amino acid sequences. Kindly advise me on the problem and on the right computational tools for the analysis.
My lab mates and I are unable to conduct an SH test in RAxML on CIPRES. We used a .phy file for our input, deselected "Configure Bootstrapping," supplied our best tree in a .tre file from our previous run using the same .phy file under "Simple Parameters," selected "Compute a log likelihood test," and supplied our "Bootstrap_Result" file. Did we miss a step, or is there a better way to do this?
The aims are basically similar to those of my previous question (link attached). These photos are screenshots from the PAST software that I used.
The aims are to determine the relationships among individuals of one species and to classify them into two groups, good and bad individuals, which will be used to increase a population and to control it.
I want to reconstruct the phylogenetic trees of plants and insects.
Which genes do you suggest for reconstructing the plant (Moraceae) phylogenetic tree? I have used ITS, G3pdh, GBSSI, ncpGS and ETS, but there were some problems with sequencing. Is there something wrong with my genes?
As for the insects (Hymenoptera), I want to use a mitochondrial cytochrome c oxidase subunit. Which gene and primers do you suggest?
Morphological characters recorded include colony type, sclerite form and size, and colour.
Which software do you recommend?
I have a maximum likelihood tree (18S and COI) for a nematode family which contains 4 different genera. The bootstrap values were very low and all the 4 genera are intermingled in both trees with no separation per region or genus. Does this mean that the genera separation is not supported?
Basically, there are three modern schools of taxonomy: cladism, evolutionism and pheneticism. While the debate continues between the first two schools, has the third one completely faded out? If so, what was the last paper or the last book published advocating this classification scheme? If not, who still advocates it?
EDIT: My question has been misunderstood by some. I am asking if you know someone or some modern papers promoting polyphyletic taxa in classification (= pheneticism) like Sneath and Sokal.
Recently, I read many papers about selection pressure. In some papers, the authors use the codeml program to detect selection pressure on the "internal" and "external" parts of a phylogenetic tree. I don't know what "internal" and "external" mean. Moreover, how do we set up the two in our phylogenetic tree and use codeml to test them?
Thank you very much for all your answers!
What are the most state-of-the-art methods to build a phylogeny with missing data? I have 12 genes and 50 species, but some genes are missing for some taxa. Some genes have only little coverage. Supertree? Supermatrix?
In some cases I have more than one allele for a given gene. Can I somehow incorporate this into the final tree? E.g. LFY Allele A and Allele B for Species A, GBSSI Allele A and B for Species B, while other species have only Allele A...
In some cases I have duplicate sequences from different individuals of the same species. e.g. 3 ITS for species A, 5 GBSSI for Species B, 1 trnL for Species C etc. Is there a way to combine them together in one analysis?
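To make the supermatrix option raised above concrete, here is a minimal sketch of concatenating per-gene alignments while padding absent taxa with missing-data symbols. The function name `build_supermatrix`, the toy taxa and the toy sequences are all hypothetical, and the sketch assumes each gene has already been aligned on its own.

```python
def build_supermatrix(gene_alignments, missing="?"):
    """Concatenate per-gene alignments, padding absent taxa with `missing`.

    gene_alignments: dict mapping gene name -> {taxon: aligned sequence}.
    Returns (supermatrix, partitions), where partitions holds 1-based,
    inclusive column ranges for each gene.
    """
    taxa = sorted({t for aln in gene_alignments.values() for t in aln})
    pieces = {t: [] for t in taxa}
    partitions = []
    start = 1
    for gene, aln in gene_alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"sequences for {gene} are not aligned to one length")
        length = lengths.pop()
        for t in taxa:
            # A taxon lacking this gene gets a run of missing-data symbols.
            pieces[t].append(aln.get(t, missing * length))
        partitions.append((gene, start, start + length - 1))
        start += length
    supermatrix = {t: "".join(parts) for t, parts in pieces.items()}
    return supermatrix, partitions


# Toy data (hypothetical): sp_B has no trnL sequence.
genes = {
    "ITS": {"sp_A": "ACGT", "sp_B": "ACGA", "sp_C": "ACGG"},
    "trnL": {"sp_A": "TTG", "sp_C": "TTA"},
}
supermatrix, partitions = build_supermatrix(genes)
for taxon, seq in supermatrix.items():
    print(taxon, seq)
print(partitions)
```

The partition ranges are kept so that each gene can later be given its own substitution model in whichever tree-building program is used. Multiple alleles or multiple individuals per species are not handled here; they would need distinct taxon labels or a different approach.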
Does this work really reflect a consensus of the research done on this issue? Does it have any connection with the theory of endosymbiosis? Is there a phylogenetic relationship from its origin? Is it right to handle the classification by Supergroups or by Kingdoms? Do they correlate with studies of viral genomes? From which point of view should the classification be done?
The images that I uploaded show a summary of the classification of eukaryotes before 2015.
Let's discuss this important topic, thanks for your contribution.
I want to use an outgroup for these bacteria, and I don't know whether it is possible to use all of them in one tree.
Guide me please...
How can I use RAST-annotated microbial genome data to study the phylogenetic relationship of a particular gene with those of other closely related organisms?
I'd like to amplify Archaeal 16S in order to do phylogenetic analysis. I know that for bacteria the standard primers are 27F and 1492R, but I would like to know which ones are used for Archaea.
I mean, can molecular data establish the validity of subspecies in a taxon? For example, it is a convention that a 2% difference in the genome leads to two populations being distinguished as separate species. Can it be inferred that a difference of 1.5-2% in the genome leads to the conclusion of subspecies?
I would like to find a taxonomic solution for the insects I am working with, i.e. aquatic bugs (Naucoridae). After going through previous work on this group, I have decided to use the 16S rRNA and 28S rRNA genes. Even so, I am looking for valuable suggestions from members of the scientific community who deal with related issues.
I am trying to build a phylogenetic tree with a few hundred 16S sequences of different sizes. About half my dataset is around 700 bp, the rest is complete sequences (around 1500 bp).
Obviously, when I align these sequences, a lot of positions are going to consist of gaps for half of my sequences.
I don't know what is going to hurt the quality of the final tree more: leaving these gaps, which don't really provide information and can lead to "wrong" clusters, or removing these positions, which comes down to removing information for the sequences that did have a base.
I guess another way to ask this question is: is it a bad idea to try to make a tree with sequences of different lengths? Is there a way around these technical issues?
Thanks a lot for your help,
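One way to look at the trade-off described above is to measure how gappy each alignment column is and see how many columns survive different trimming thresholds. The following is a minimal sketch on a toy alignment; the function names, the sequences and the thresholds are hypothetical and chosen only for illustration.

```python
def gap_fraction_per_column(alignment, gap_chars="-?"):
    """Return the fraction of gap characters in each alignment column."""
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same aligned length")
    return [sum(s[i] in gap_chars for s in seqs) / len(seqs) for i in range(length)]


def trim_columns(alignment, max_gap_fraction):
    """Keep only the columns whose gap fraction is <= max_gap_fraction."""
    fractions = gap_fraction_per_column(alignment)
    keep = [i for i, f in enumerate(fractions) if f <= max_gap_fraction]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


# Toy alignment (hypothetical): two full-length and two partial sequences.
toy = {
    "full_1": "ACGTACGTAC",
    "full_2": "ACGTTCGTAC",
    "part_1": "----ACGT--",
    "part_2": "----TCGT--",
}
fractions = gap_fraction_per_column(toy)
strict = trim_columns(toy, max_gap_fraction=0.0)   # shared region only
lenient = trim_columns(toy, max_gap_fraction=0.5)  # tolerate half-gapped columns
print(fractions)
print(strict)
print(len(lenient["full_1"]))
```

The strict setting keeps only the region covered by every sequence, while the lenient one keeps the flanking columns that only the full-length sequences cover. Building a tree from each version and comparing the results is one way to see how sensitive the topology is to that choice.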
I retrieved 1500 sequences (from NCBI) of 6 different protein isoforms belonging to metazoans. Now, the issue is whether it is possible to deal with such a large number of sequences in order to reconstruct the phylogenetic tree. What could be a clever strategy? Does this number pose a challenge for dedicated software like PhyML, MEGA or MrBayes? Which of these programs should I use?
I want to perform a phylogenetic analysis for a large number of protein sequences from banana. I have tried a few programs but was not able to get any reproducible results. Can anyone suggest suitable software to use? Thanks.
Moniliformis moniliformis might be the most likely species. The hooks in the proboscis are considered important in some of the literature, but they are difficult to distinguish since they do not seem to form lines.
In Geneious I am getting the message "PAUP* block is illegal for the following reason: If you are doing bootstrapping, you must specify the file to save the boostrap consensus to."
I don't know if my block has errors:
LOG STAR FILE=ITS_parc.out;
HSEARCH START=STEPWISE ADDSEQ=RANDOM NREPS=2000 SWAP=TBR;
DESCRIBETREES 1/PLOT=PHYLOGRAM BRLENS=YES ROOTMETHOD=OUTGROUP OUTROOT=MONOPHYL;
SAVETREES FILE=ITS_parc_1_1.tre BRLENS=YES FROM=1 TO=1 ROOT=YES;
CONTREE ALL/ MAJRULE=YES PERCENT=85 TREEFILE=majrule.tre;
SAVETREES FILE=ITS_parc_1_1_contree.tre FROM=1 TO=1 ROOT=YES SAVEBOOTP=NODELABELS MAXDECIMALS=0;
HSEARCH ADDSEQ=RANDOM NREPS=2000 SWAP=TBR;
DESCRIBETREES 1/PLOT=PHYLOGRAM BRLENS=YES ROOTMETHOD=OUTGROUP OUTROOT=MONOPHYL;
SAVETREES FILE=ITS_parc_1_1.tre BRLENS=YES FROM=1 TO=1 ROOT=YES;
BOOTSTRAP NREPS=2000 SEARCH=HEURISTIC;
SAVETREES FILE=ITS_parc_1_1_cboot.tre BRLENS=YES FROM=1 TO=1 ROOT=YES;
I plan to conduct a study of the different cultivars of cassava found in Romblon. To determine the phylogenetic tree of the cultivars, what kind of morphometric analysis should I use?
Thank you. :)
Is it legitimate to use geographic occurrence, specifically the altitudinal/bathymetric range of species, as character states in an ancestral-state reconstruction? Can origins be inferred this way? Are there examples in the literature?
Phylogeny indicates a deep node between clade for Sturnidae and basal Rhabdornis clade in Zuccon, Dario; Cibois, Alice; Pasquet, Eric & Ericson, Per G.P. (2006): Nuclear and mitochondrial sequence data reveal the major lineages of starlings, mynas and related taxa. Molecular Phylogenetics and Evolution 41(2): 333–344. Still, the placement of Rhabdornis within Sturnidae requires confirmation.
Can I use the EzBioCloud assembly service to merge F and R sequences? I heard that it removes poor-quality regions by automatic trimming and merges the reads together. My question is: can I use this service to merge my (good-quality) sequences when looking for new genera? Could anyone kindly help me with this? It is very important for my research, and I hope some experts will give me a lead.
Thanks in advance
This is the phylogenetic tree constructed by a company using MEGAN 4.70.4, but I am not sure how to interpret the values. Can anyone explain them to me?
Can anyone interpret the phylogenetic relationships for species J and species K (labelled as J1-J3 and K1-K3)? Both species are morphologically identified as distinct species. Gene A cannot differentiate both species, but Gene B SEEMS to differentiate slightly? Can anyone lend me your expertise? Thanks.
I'm using the "traditional search" in TNT. Even after optimizing the characters, the trees found have poor statistics, for instance: Length = 79; CI = 0.4; RI = 0.7. Perhaps these values would change with another algorithm.
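For reference, the ensemble consistency index and retention index are conventionally defined as CI = M/S and RI = (G - S)/(G - M), where S is the observed tree length, M is the minimum conceivable number of steps summed over all characters, and G is the maximum conceivable number of steps. The sketch below just evaluates those formulas. S = 79 is the length quoted above, but M = 32 and G = 190 are hypothetical values picked only so that the result lands near the reported CI and RI; they are not taken from the actual matrix.

```python
def consistency_index(min_steps, observed_steps):
    """Ensemble CI = M / S."""
    return min_steps / observed_steps


def retention_index(min_steps, max_steps, observed_steps):
    """Ensemble RI = (G - S) / (G - M)."""
    return (max_steps - observed_steps) / (max_steps - min_steps)


S = 79    # observed tree length, as quoted in the question
M = 32    # hypothetical minimum steps summed over characters
G = 190   # hypothetical maximum steps summed over characters

ci = consistency_index(M, S)
ri = retention_index(M, G, S)
print(f"CI = {ci:.3f}, RI = {ri:.3f}")
```

Because M and G are fixed by the data matrix, CI and RI can only change between searches if the tree length S changes, that is, if another search strategy finds a shorter tree.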
I am working on the success rate of DNA barcoding in species identification using distance-based and tree-based methods.
For the distance-based method, I have used Adhoc and SpeciesIdentifier and confirmed species identification using the best close match (BCM) criterion.
Please suggest a program/software/method for tree-based identification (using NJ, parsimony, Bayes) that reports clustering (the number of clusters formed for a particular species) and thereby flags singleton (= ambiguous) species. (Please see the attachment.)
(Please note: it is not possible to do this manually as I have >4000 specimens.)
Answers detailing methods or software are appreciated! Thank you!
I have been interested in the theoretical foundations of systematics, especially at the point where this field met molecular biology and started to ignore other sources of information such as chromosomes or morphological traits.
This union was very strong and congruent with the neo-Darwinian definition of evolution as a process of change in allele frequencies. But in my opinion, the causal pluralism that Developmental Systems Theory defends in the genotype-phenotype relation (along with additional critiques and proposals such as Extended Inheritance) demands a new definition of evolution and, consequently, a new framework for systematics that could make clear how to reconstruct the history of living things.
The RelTime algorithm of Tamura et al. (2012 PNAS 109:19333-8), has been available for some time in MEGA6 but I see few people using it. Why is that? What concerns do people have with this method versus BEAST / R8s / PAML?
If I have two species belonging to the same genus, let's say species A and species B, and species A has 3 specimens while species B has 4, then calculating the genetic distances within the species gives a total of 9 comparisons: 3 for species A (A1xA2, A1xA3, A2xA3) and 6 for species B (B1xB2, B1xB3, B1xB4, B2xB3, B2xB4, B3xB4).
Does anyone know if there is software which can calculate the total combinations?
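The counts in the example follow from the fact that n specimens give n(n - 1)/2 unordered pairs, so no dedicated software is strictly required. As a minimal sketch, using only the Python standard library and the specimen labels from the example above (the function name is hypothetical):

```python
from itertools import combinations
from math import comb


def within_species_pairs(specimens_by_species):
    """Return, for each species, the list of unordered specimen pairs."""
    return {
        species: list(combinations(ids, 2))
        for species, ids in specimens_by_species.items()
    }


specimens = {
    "A": ["A1", "A2", "A3"],
    "B": ["B1", "B2", "B3", "B4"],
}
pairs = within_species_pairs(specimens)
for species, species_pairs in pairs.items():
    # The enumerated pairs must agree with the closed form n choose 2.
    assert len(species_pairs) == comb(len(specimens[species]), 2)
    print(species, len(species_pairs), species_pairs)

total = sum(len(p) for p in pairs.values())
print("total within-species comparisons:", total)
```

For the 3 and 4 specimens in the example this gives 3 + 6 = 9 comparisons, matching the manual count. `math.comb` requires Python 3.8 or later.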
Some bacteria show very high diversification, with various phenotypes and genotypes. For example, Bacillus subtilis has a lot of subspecies and strains carrying different properties such as enzyme production. Streptomyces is another example with lots of species and subspecies. What are the reasons? Does it mean that those bacteria are more active in gene exchange? Does it mean they are evolutionarily more dynamic than other bacteria?
I found in MEGA5, after bootstrap test, there was no scale bar in my bootstrap consensus tree. While in the original tree, there was a scale bar. Does anyone know why this happens in MEGA5?