Bioinformatics - Science topic

Explore the latest questions and answers in Bioinformatics, and find Bioinformatics experts.
Questions related to Bioinformatics
  • asked a question related to Bioinformatics
Question
4 answers
What are good resources for an undergraduate student to start getting familiar with bioinformatics and, if possible, get some practical experience? Any favorite websites, blogs, videos, etc?
Thanks!
Relevant answer
Answer
Thank you all!
  • asked a question related to Bioinformatics
Question
3 answers
Hello. We understand that a volcano plot is a graphical representation of differential values (proteins or genes), and it requires two parameters: fold change and p-value. However, for IP-MS (immunoprecipitation-mass spectrometry) data, many proteins are identified in the IP (immunoprecipitation group) with a measured intensity but are not detected in the IgG (control group); the data is blank. This means that we cannot calculate the p-value and fold change for these "present (IP) / absent (IgG)" proteins, and therefore we cannot plot them on a volcano plot. However, in many articles, we see that these proteins are successfully plotted on a volcano plot. How did they accomplish this? Are there any data-fitting methods available to help with plotting? Is imputation needed, and if so, does it still reflect the real degree of interaction?
Relevant answer
Answer
Albert Lee : the issue with doing this is it makes the fold changes entirely arbitrary. Imagine I have a protein I detect in my test samples at "arbitrary value 10" but do not detect in my control samples at all.
If I call the ctrl value 0.5, then 0.5 vs 10.5 = 21-fold.
If I call the ctrl value 0.1, then 0.1 vs 10.1 = 101-fold.
If I call the ctrl value 0.0001, then 0.0001 vs 10.0001 = roughly 100,000-fold.
In reality, the increase is effectively "infinite fold", but what this is really highlighting is that fold changes are not an appropriate metric here.
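To make the arbitrariness concrete, here is a minimal Python sketch of the arithmetic above. The function name and the choice to add the same pseudocount to both groups are assumptions for illustration, following the worked numbers in this answer; it is not taken from any specific proteomics package.

```python
import math

def fold_change_with_pseudocount(test_value, control_value, pseudocount):
    """Ratio of (test + pseudocount) to (control + pseudocount)."""
    return (test_value + pseudocount) / (control_value + pseudocount)

test_intensity = 10.0     # "arbitrary value 10" in the test samples
control_intensity = 0.0   # not detected in the control samples

# The reported fold change is driven entirely by the pseudocount we pick.
for pseudocount in (0.5, 0.1, 0.0001):
    fc = fold_change_with_pseudocount(test_intensity, control_intensity, pseudocount)
    print(f"pseudocount={pseudocount}: fold change={fc:.0f}, log2FC={math.log2(fc):.2f}")
```

The same measured intensity lands at very different x-positions on a volcano plot depending only on the invented control value, which is the point being made here.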
A lot (most) of statistical analysis is predicated on the measurement of change in values, not "present/absent" scenarios.
For disease biomarkers, for example, something that is present/absent is of use as a diagnostic biomarker, but not as a monitoring biomarker: you can say "if you see this marker at all, you have the disease", but you cannot really use it to track therapeutic efficacy, because all values of this marker other than "N/A" are indicative of disease.
For monitoring biomarkers you really want "healthy" and "diseased" values such that you can track the shift from one to the other.
David Genisys: I agree with Jochen Wilhelm , and would not plot my data in this manner.
A lot will depend on the kind of reviewers you get, and the type of paper you're trying to produce, but it would be more appropriate to note that these markers are entirely absent in one group, and then to comment on the robustness of their detection in the other. You wouldn't run stats necessarily, because as noted, stats are horrible for yes/no markers, but you could use the combination of presence/absence and actual level of the former to make inferences as to biological effect. If a marker goes from "not detected" to "detected but barely", then it might be indicative of dysregulated, aberrant expression behaviour, or perhaps stochastic low-level damage. Interesting, but perhaps not of biological import or diagnostic utility. If instead it goes from "not detected" to "readily detected, at high levels", then it's probably very useful as a diagnostic biomarker, and also indicative of some active biological process, be it widespread damage/release, or active expression of novel targets.
In either case you can make biological inferences without resorting to making up numbers so you can stick them on a volcano plot (and to be honest, if you get the kind of reviewers that demand volcano plots, you can always use the trick Albert suggests).
Volcano plots are primarily a way to take BIG DATA and present it in a manner that allows you to highlight the most interesting targets that have changed between groups: if you have whole swathes of genes that are instead present/absent, then those could be presented as a table, perhaps sorted by GO terms or something (if it looks like there are shared ontological categories you could use to infer underlying biology).
  • asked a question related to Bioinformatics
Question
3 answers
Hi, I am working on protein-protein interaction studies, specifically on antibody-antigen interaction. I would like to observe how the interaction changes if a mutation occurs in the protein. Could anyone suggest a tool that can be used to introduce a substitution mutation at a targeted amino acid of a 3D protein structure, and tools to validate that the mutation is not a nonsense mutation that produces a truncated protein?
Relevant answer
Answer
Hey,
You need to consider a few things:
  1. Nonsense Mutations: Regarding your concern about nonsense mutations leading to truncated proteins, it's important to note that you don't need 3D modeling tools for this. Nonsense mutations, AKA stop-gain mutations, can be identified through basic sequence analysis since they involve a codon change that introduces a premature stop codon. Therefore, any sequence analysis tool that can read and interpret genetic codes can be used to identify if a mutation is a nonsense mutation.
  2. Mutation Induction: To induce substitution mutations at targeted amino acids in a 3D protein model, you can use software like UCSF Chimera (or ChimeraX). These tools allow you to manipulate amino acid residues.
  3. Protein Folding Prediction: If you're interested in how these mutations might affect protein folding, ChimeraX can integrate with AlphaFold. This integration can help predict how the altered amino acid sequence might fold. However, it's important to remember that structural predictions may not provide direct insights into the functional impact of the mutations. I'm not sure how informative this approach would be, but you can check out this video: https://www.youtube.com/watch?v=H-pDs9rZtkw
  4. Functional Analysis of Missense Mutations: For a more reliable approach to missense mutations, it's advisable to consult databases and tools that provide functional insights. As of 2023, a valuable resource for this is AlphaMissense. AlphaMissense is specifically designed to predict the functional impact of missense mutations, offering a more targeted approach to understanding if these changes alter the function of the protein. They probably already tested your mutations, and you can find the score in the tables attached to the article.
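As a rough illustration of point 1, the check for a nonsense (stop-gain) mutation needs only the coding sequence. The sketch below assumes the standard genetic code, an in-frame CDS starting at position 0, and a single-base substitution; the function name and the example sequence are made up for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)

def is_nonsense(cds, position, new_base):
    """Return True if substituting new_base at the 0-based CDS position
    turns a sense codon into a stop codon."""
    cds = cds.upper()
    codon_start = (position // 3) * 3          # start of the affected codon
    original = cds[codon_start:codon_start + 3]
    mutated_cds = cds[:position] + new_base.upper() + cds[position + 1:]
    mutated = mutated_cds[codon_start:codon_start + 3]
    return mutated in STOP_CODONS and original not in STOP_CODONS

# Hypothetical toy CDS: ATG TGG AAA TAA
toy_cds = "ATGTGGAAATAA"
print(is_nonsense(toy_cds, 5, "A"))  # TGG -> TGA
print(is_nonsense(toy_cds, 6, "G"))  # AAA -> GAA
```

Any substitution that passes this check is not a stop-gain; whether it is synonymous or missense would additionally need a full codon table.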
  • asked a question related to Bioinformatics
Question
2 answers
I want to find the UTR sequence of the mRNA of a bacterial protein. Can anyone suggest an in silico process for that?
Relevant answer
Answer
Hi Harshita
There are lots of possibilities, but the main ones are to go to the NCBI or UCSC database (for instance, just type "NCBI XXXX YYYY UTR region" into Google, where XXXX is your bacterium and YYYY your gene).
Or just give the species and target on ResearchGate... maybe someone could answer ;)
all the best
fred
  • asked a question related to Bioinformatics
Question
3 answers
As of now, there is no public database available for this kind of sample to take as a control.
Relevant answer
Answer
To gain insights from your proteomic data in the context of pathways:
1. Protein-protein interaction networks: Construct protein-protein interaction (PPI) networks using available databases or tools. These networks represent the physical interactions between proteins and can provide insights into functional relationships and pathway associations. Analyze the network topology, identify highly connected proteins (hubs), and explore protein clusters or modules that may represent enriched pathways.
2. Functional enrichment analysis: Perform functional enrichment analysis using tools such as DAVID, Enrichr, or g:Profiler. These tools allow you to input a list of proteins and assess enrichment of Gene Ontology (GO) terms, biological pathways, or other functional annotations. This analysis can help identify overrepresented functions or pathways in your protein dataset.
3. Cross-referencing with gene-level data: If available, consider integrating your proteomic data with gene-level or transcriptomic data from the same samples or a related study. By mapping proteins to corresponding genes, you can leverage gene-level pathway analysis methods and identify pathways enriched with differentially expressed genes associated with the proteins of interest.
4. Literature-based analysis: Conduct a literature search to explore existing knowledge and studies related to the proteins identified in your proteomic dataset. Look for studies that have investigated the functions, interactions, or pathways associated with these proteins. This qualitative analysis can provide valuable insights into the potential involvement of specific pathways in your disease sample.
5. Pathway databases: Explore curated pathway databases such as Reactome, KEGG, or WikiPathways. These databases provide well-annotated pathways and can serve as a reference to investigate potential connections between your identified proteins and known pathways. Look for proteins within your dataset that are annotated to specific pathways of interest.
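For point 2, the statistic behind many over-representation tools is a one-sided hypergeometric test. The following is a minimal standard-library sketch of that test, not the exact procedure of DAVID, Enrichr, or g:Profiler; the function name and the example numbers are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(hits, list_size, pathway_size, background_size):
    """P(X >= hits) when list_size proteins are drawn without replacement
    from a background that contains pathway_size pathway members."""
    max_hits = min(list_size, pathway_size)
    total = comb(background_size, list_size)
    tail = sum(
        comb(pathway_size, k) * comb(background_size - pathway_size, list_size - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total

# Hypothetical numbers: 8 of 40 submitted proteins fall in a 150-member
# pathway, against a background of 5000 quantified proteins.
p_value = hypergeom_enrichment_p(hits=8, list_size=40, pathway_size=150, background_size=5000)
print(f"enrichment p-value: {p_value:.3g}")
```

In practice the p-values for all tested pathways would also need multiple-testing correction, and the background should be the set of proteins that could have been detected, not the whole proteome.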
Remember that pathway analysis based solely on proteomic data has limitations.
Hope it helps. Credit: AI
  • asked a question related to Bioinformatics
Question
4 answers
I'm on the lookout for remote bioinformatics and computational biology opportunities where I can actively contribute to research projects. Compensation is not a priority for me; my main focus is to gain hands-on experience in these fields.
#biopython
#computational_biology
#bioinformatics
#biology
#R
Relevant answer
Answer
Avenues you can explore to find such opportunities:
1. Academic research institutions: Many universities and research institutions offer remote research positions or internships in bioinformatics and computational biology. Check their websites, job boards, and reach out to individual researchers or research groups who align with your interests.
2. Online job portals and platforms: Websites and platforms dedicated to remote work, such as LinkedIn, Indeed, and Upwork, often have listings for bioinformatics and computational biology projects. You can search for specific keywords like "remote bioinformatics," "computational biology," or "bioinformatics internships" to find relevant opportunities.
3. Open-source projects: Contributing to open-source bioinformatics projects can provide valuable hands-on experience. Explore bioinformatics software and libraries like Biopython, Bioconductor (for R), or other popular tools on platforms like GitHub. Contribute to their development, report issues, or collaborate with the community.
4. Online communities and forums: Engage with online communities and forums focused on bioinformatics and computational biology. These platforms, such as Bioinformatics Stack Exchange, BioStars, or community forums associated with specific software packages, often have job boards or project collaboration opportunities shared by researchers or organizations.
5. Networking: Attend virtual conferences, webinars, and workshops related to bioinformatics and computational biology. Connect with researchers, presenters, and fellow attendees to express your interest in remote research opportunities. Networking can often lead to potential collaborations or recommendations for available positions.
When searching for opportunities, it's important to tailor your search keywords to include relevant terms like "remote," "internship," "volunteer," or "project-based." Additionally, clearly communicate your enthusiasm, willingness to contribute, and desire for hands-on experience in your application materials or when reaching out to potential mentors or supervisors.
Hope it helps. Credit: AI.
  • asked a question related to Bioinformatics
Question
2 answers
Hi,
I am a beginner in "Bioinformatics" and want to learn "how to analyse bacterial and fungal genomic data". Could you suggest some materials and sources so that I can develop myself?
Note: My interest is now in "Bacterial and fungal genome and proteome analysis by using bioinformatics"
Relevant answer
Answer
Thanks
  • asked a question related to Bioinformatics
Question
4 answers
Hello,
I am trying to construct a phylogenetic tree of HIV-1. I downloaded sequences from a few neighboring countries from the Los Alamos HIV database. After aligning and trimming, the length of the sequences is usually 722 nucleotides. I can't trim less, because there are a lot of gaps within the alignment file. When I construct a Maximum Likelihood tree in FastTree or PhyML, the branches look very short. What could be a possible reason for this?
Can sequences 722 nucleotides long be used to construct a reliable phylogenetic tree?
Thank you!
Relevant answer
You can also give it a try with the MEGA Platform.
  • asked a question related to Bioinformatics
Question
4 answers
Hi,
I am a beginner in "Bioinformatics" and want to learn "how to analyse bacterial and fungal genomic data". Could you suggest some materials and sources so that I can develop myself?
Note: My interest is now in "Bacterial and fungal genome and proteome analysis by using bioinformatics"
Relevant answer
Answer
You can start by taking courses in bioinformatics from Coursera.
  • asked a question related to Bioinformatics
Question
5 answers
Hello fellow researchers,
I wanted to start a discussion on the exciting topic of the future of bioinformatics and its evolution. Bioinformatics has come a long way in recent years, but there are undoubtedly new frontiers to explore and challenges to overcome. What are your thoughts on the current trends, emerging technologies, and the potential impact of bioinformatics in the years to come? I'm eager to hear your insights and predictions on the future of this rapidly evolving field.
Relevant answer
Answer
I think the most relevant avenue to pursue is eliminating the term 'informatic', which constrains the field to a purely technical and ancillary role of 'software development', while what is needed is to develop a new 'biological statistical mechanics' that allows us to face the complexity of biological systems after the total failure of the deterministic 'gene-centric' era.
  • asked a question related to Bioinformatics
Question
4 answers
I am reaching out to #researchers in the field of #Biochemistry, #Biophysics and #Bioinformatics, for collaborative partnership in scientific research. The researcher should be academic staff at the tertiary institutions in following listed countries:
#Afghanistan
#Angola
#Bangladesh
#Belarus
#Belize
#Benin
#Bhutan
#Burkina Faso
#Burma
#Burundi
#CaboVerde
#Cambodia
#Cameroon
#CentralAfricanRepublic
#Chad
#Comoros
#Congo
#CookIslands
#Cuba
#Democratic People's Republic of Korea
#Democratic Republic of the Congo
#Djibouti
#Dominica
#EquatorialGuinea
#Eritrea
#Eswatini
#Ethiopia
#Gambia
#Ghana
#Grenada
#Guinea
#Guinea-Bissau
#Guyana
#Haiti
#Iran
#IvoryCoast
#Kenya
#Kiribati
#Kyrgyzstan
#Lao People's Democratic Republic
#Lebanon
#Lesotho
#Liberia
#Madagascar
#Malawi
#Maldives
#Mali
#Marshall Islands
#Mauritania
#Micronesia (Federated States of)
#Mozambique
#Myanmar
#Nauru
#Nepal
#Nicaragua
#Niger
#Niue
#Palau
#PapuaNewGuinea
#Moldova (Republic of)
#Rwanda
#SaintHelena
#SaintLucia
#SaintVincent and the #Grenadines
#Samoa
#SaoTome and #Principe
#Senegal
#Sierra Leone
#SolomonIslands
#Somalia
#SouthSudan
#Sudan
#Suriname
#Syrian Arab Republic
#Tajikistan
#Timor-Leste
#Togo
#Tokelau
#Tonga
#Tuvalu
#Uganda
#Ukraine
#Tanzania (United Republic of)
#Vanuatu
#Yemen
#Zambia
#Zimbabwe
Interested researcher should kindly email to hezesapience@gmail.com with the subject: Research Collaboration from "your country".
Thanks.
Toluwase H. Fatoki
Visionary @ Heze-Sapience International, Nigeria.
Lecturer @ Department of Biochemistry, Federal University Oye-Ekiti, Nigeria.
Relevant answer
Answer
And why don’t you want any collaboration from Nigeria?
  • asked a question related to Bioinformatics
Question
2 answers
Our lab has a bioinformatics project about developing functional enrichment software. We have several ideas, but we realize we also need real feedback from wet lab researchers to make sure our functional enrichment web application will be reliable and useful for all of you.
Therefore, if you are a wet lab scientist who has experience using functional enrichment software (such as Metascape, DAVID, etc.), what kind of questions do you want to address with the functional enrichment results? Is there any information that these tools are still unable to give you?
Relevant answer
Answer
So, what is your experience when you use a functional enrichment analysis application? Do you think its results satisfy your expectations, or do you think it still has room for improvement?
  • asked a question related to Bioinformatics
Question
3 answers
I already know the pathway but want to know the upstream lncRNAs that regulates that pathway using the datasets and bioinformatics.
Relevant answer
Answer
I think you can do a correlation analysis of lncRNA expression: for each gene within the pathway, calculate the correlation coefficients between its expression and the expression of all known lncRNAs (a list can be obtained from LNCipedia, etc.). Then apply an appropriate threshold to narrow down the list.
Alternatively,
since we know the pathway, get the gene list for that pathway and search for lncRNAs that are located near those genes and may act as cis-regulatory elements. For trans regulation, use tools that predict the potential of lncRNAs to interact with genes at the transcriptional or post-transcriptional level, such as LncTar (https://www.cuilab.cn/lnctar).
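The correlation approach could be sketched roughly as follows. This is a minimal standard-library illustration, assuming expression values are already normalized and matched by sample; the lncRNA and gene names, the data, and the 0.8 cutoff are all hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant profile: correlation is undefined, report 0
    return cov / sqrt(var_x * var_y)

def candidate_lncrnas(lnc_expr, gene_expr, threshold=0.8):
    """Return (lncRNA, gene, r) triples with |r| >= threshold, strongest first."""
    hits = []
    for lnc, lnc_values in lnc_expr.items():
        for gene, gene_values in gene_expr.items():
            r = pearson(lnc_values, gene_values)
            if abs(r) >= threshold:
                hits.append((lnc, gene, round(r, 3)))
    return sorted(hits, key=lambda hit: -abs(hit[2]))

# Hypothetical expression across five samples
lnc_expr = {"lncA": [1, 2, 3, 4, 5], "lncB": [5, 1, 4, 2, 3]}
gene_expr = {"geneX": [2, 4, 6, 8, 10]}
print(candidate_lncrnas(lnc_expr, gene_expr))
```

Co-expression only suggests candidates; it does not show that the lncRNA is upstream of the pathway, so the shortlist would still need filtering (for example by genomic proximity or predicted interaction) and experimental follow-up.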
hope that helps
  • asked a question related to Bioinformatics
Question
1 answer
On their website they mention that its IF is 5.8, but I did not find it in the JIF 2022 report. Is that because it is included in the Emerging Sources Citation Index and not in the Science Citation Index Expanded? Please help.
Where can I get a valid IF? One more thing: this journal is not included in BioxBio; I have checked.
Relevant answer
Answer
Hi,
Check the ISSN of the journal in the Master Journal List (https://mjl.clarivate.com/home) or on the SCImago Journal Rank (SJR) site (https://www.scimagojr.com/journalrank.php) to see its IF and Q ranking.
If it is a valid journal, you should find its information on these sites.
BW
  • asked a question related to Bioinformatics
Question
1 answer
bioinformatics
Relevant answer
Answer
I guess it would depend on the context, but generally "frame" refers to a sequence that encodes a peptide/protein. "Query" is the user input (i.e. the sequences you enter) and "subject" refers to the reference sequence.
  • asked a question related to Bioinformatics
Question
6 answers
Could someone explain to me why the p-value in the right column of the forest plot is different than the p-value in the test for effect in the subgroup?
I thought that these two p-values should be the same.
Relevant answer
Answer
Coming to your table: the p-value in the right column of the forest plot is the p-value for the overall test of the treatment effect across all subgroups. It is calculated by combining the results of the individual studies in the meta-analysis. In this case, the p-value is 0.56, which is not statistically significant.
The p-value for the test for effect in the subgroup is the p-value for the test of the null hypothesis that the treatment effect in the subgroup is equal to zero. It is calculated using only the data from the studies in the subgroup. In this case, the p-value for the test for effect in the subgroup is 0.094035, which is below 0.10 but above the conventional 0.05 threshold, so it counts as statistically significant only at the more lenient 0.10 level.
The two p-values are different because of the heterogeneity between the studies in the meta-analysis. The heterogeneity statistic (0.5) is very high, which indicates that there is a lot of variability in the treatment effects across studies. This variability could be due to a number of factors, such as different study designs, different populations of patients, and different treatment regimens.
When there is heterogeneity in the treatment effects across studies, it is more difficult to detect a significant overall treatment effect. This is because the variability in the treatment effects across studies can mask the true effect of the treatment.
In this case, the p-value for the overall test of the treatment effect is not statistically significant, but the p-value for the test for effect in the subgroup is statistically significant. This suggests that the treatment may be effective in the subgroup, but it is not possible to draw a definitive conclusion without further research.
It is important to note that a statistically significant p-value for the test for effect in a subgroup does not necessarily mean that the treatment is clinically effective in that subgroup. It is possible that the difference in the treatment effect is small or that it is not clinically meaningful.
To determine whether the treatment is clinically effective in a subgroup, it is important to consider the magnitude of the difference in the treatment effect and the clinical implications of that difference.
  • asked a question related to Bioinformatics
Question
4 answers
Hello, I've recently been studying Ancestral Sequence Reconstruction (ASR), attempting to infer ancestral sequences of viruses. I understand that this inference is constrained by factors like sample size and models, and represents a plausible sequence that may have existed. However, I'm curious about whether directly comparing these inferred ancestral sequences holds biological significance. Can they reflect the differences among the extant sequences from various lineages that were used to infer them?
Relevant answer
Answer
Hongzhuang Chen I am afraid that you can lose a lot of information from such a comparison. But it can be applied (and is very useful) to illustrate the differences supported statistically by analysis of the original data (sequences).
  • asked a question related to Bioinformatics
Question
3 answers
Dear All,
Ph.D. full-time position in Bangalore with fellowship:
Eligibility: M.Sc. Chemistry/Biochemistry/Biotechnology/Microbiology/Bioinformatics with first class of 60%.
GATE or UGC-NET or UGC-CSIR or SLET or JRF should be qualified.
Rs. 25,000 per month will be given for the full three years.
For further details, call or WhatsApp me on +919182864256.
Relevant answer
I apologize and excuse the owner of the post. I would like to invite you to read my ebook and discover why microorganisms are so fantastic. https://www.amazon.com.br/dp/B0CF1VKKK8
  • asked a question related to Bioinformatics
Question
4 answers
I am trying to analyse mutation data for endometrial cancer obtained from different studies within several databases (COSMIC, cBioportal, Intogen). I have collated the data and grouped the mutations by gene. The focus of the analysis are non-synonymous coding mutations - because these mutations are most likely to cause a change in the normal protein function.
The aim of the study is to understand the mutational landscape of Endometrial cancer. The main objectives of the study are to find the commonly mutated genes in endometrial cancer, to find significantly damaging gene mutations in endometrial cancer and to create an updated list of genes comparable to commercial gene panels.
I have created this table with the collated data:
  1. Gene name
  2. Number of samples with coding mutations
  3. Frequency (number of samples with coding mutations / total number of samples with coding mutations)
  4. CDS length
  5. Total number of unique coding mutations
  6. Number of unique coding: synonymous mutations
  7. Number of unique coding: non-synonymous mutations
  8. Mutation burden (number of unique coding: non-synonymous mutations / CDS length)
  9. Composite score [(frequency of samples * 0.7) + (mutation burden * 0.3)]
The idea here is to use mutation burden to imply damaging effects of the genes' mutations in endometrial cancer. We then created a composite score to use as a comparable figure between the genes.
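For reference, the two derived columns described above can be written out as follows. This is a minimal sketch of the formulas as stated in the table (weights 0.7 and 0.3); the gene names and counts are invented for illustration.

```python
def mutation_burden(unique_nonsynonymous, cds_length):
    """Unique coding non-synonymous mutations per base of CDS."""
    return unique_nonsynonymous / cds_length

def composite_score(frequency, burden, w_frequency=0.7, w_burden=0.3):
    """Weighted sum of sample frequency and mutation burden."""
    return frequency * w_frequency + burden * w_burden

total_mutated_samples = 500  # hypothetical cohort size

# Hypothetical rows of the collated table
genes = [
    {"gene": "GENE_A", "mutated_samples": 300, "unique_nonsyn": 120, "cds_length": 1200},
    {"gene": "GENE_B", "mutated_samples": 25, "unique_nonsyn": 40, "cds_length": 8000},
]

for row in genes:
    frequency = row["mutated_samples"] / total_mutated_samples
    burden = mutation_burden(row["unique_nonsyn"], row["cds_length"])
    score = composite_score(frequency, burden)
    print(f'{row["gene"]}: frequency={frequency:.3f}, burden={burden:.4f}, score={score:.4f}')
```

One thing the arithmetic makes visible: frequency and burden sit on different scales (a fraction of samples versus mutations per base), so unless both are rescaled first, the composite score will tend to be dominated by the frequency term.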
At the moment, our list of genes is at 16,000+. We are currently trying to think of a way to narrow down the list of genes to only focus on those significantly mutated compared to the other genes by way of statistics. Any advice is greatly appreciated.
Relevant answer
Answer
The significance of gene mutation burden in endometrial cancer data collated from different studies can be assessed using statistical methods such as Fisher’s exact test and logistic regression.
  • asked a question related to Bioinformatics
Question
2 answers
We sent some phytoplankton samples for sequencing and have just received the generated sequences. The next step is to run BLAST to identify the phytoplankton we sent: basically, DNA barcoding.
To give some context, when we send our samples for sequencing to the sequencing facility, they send us back two files, one for the forward sequence and another for the reverse sequence, based on the primers (forward and reverse) we gave.
So, the initial step involves us checking the quality of the sequences, specifically looking for any signs of low quality, ambiguity, or overlapping signals in the chromatograph.
Now, I'm a bit uncertain about the next steps.
The following step would be sequence trimming. To do this, I need to identify the start of each sequence by locating the primer sequence. This means finding the forward primer sequence in the generated forward sequence and doing the same for the reverse primer in the reverse sequence.
Afterward, I perform reverse complementation on the reverse sequence.
Following that, I conduct a pairwise alignment between the generated forward and reverse sequences and subsequently generate the consensus sequence.
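The reverse-complement and merge steps described above might look like this in outline. It is a deliberately simplified Python sketch: it assumes the two trimmed reads overlap and agree exactly in the overlap, whereas real Sanger reads contain errors and are normally merged with a proper pairwise aligner. The sequences and function names are made up for illustration.

```python
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A/C/G/T/N only)."""
    return seq.translate(COMPLEMENT)[::-1]

def merge_reads(forward, reverse, min_overlap=10):
    """Merge a forward read with the reverse complement of the reverse read
    using the longest exact suffix/prefix overlap; None if no overlap found."""
    rc = reverse_complement(reverse)
    for size in range(min(len(forward), len(rc)), min_overlap - 1, -1):
        if forward[-size:] == rc[:size]:
            return forward + rc[size:]
    return None

# Hypothetical 39 bp amplicon, read from both ends with a 10 bp overlap
amplicon = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
forward_read = amplicon[:25]
reverse_read = reverse_complement(amplicon[15:])

consensus = merge_reads(forward_read, reverse_read)
print(consensus == amplicon)
```

If `merge_reads` returns `None` for real data, that usually means the reads either do not overlap after trimming or disagree in the overlap, which is a cue to inspect the chromatograms rather than a verdict on read quality by itself.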
My questions are, as I am a bit stumped with this (I apologize in advance, I'm a bit new with bioinformatics), (1) what if neither of the generated sequences have the primer sequences? Would that mean the sequences generated were of bad/low quality? and (2) Is this approach correct, or have I missed a crucial step?
Thank you!
Relevant answer
Answer
With some sequencing technologies up to the first 50 bases read tend to be unreliable so do not pass quality control. This means that often your primers are already cropped from the 5' end. I find it best to just align the forward and reverse sequences and see how much overlap you are getting.
  • asked a question related to Bioinformatics
Question
6 answers
I have extensively searched google scholar but I am struggling to find any groups who have previously used Rosetta to conduct ab-initio structure modelling of single-pass or membrane anchored proteins and I'm specifically not talking about homology modelling just ab-initio.
Please let me know if you have read any papers or know anyone who has done this,
thanks.
2nd year PhD student at University of Liverpool.
Relevant answer
Answer
Waqas Abbasi, hi. No, Rosetta-Membrane did not work well at all for this task, and I would not waste your time attempting it unless you have preliminary signs that it works for your protein (or your membrane-anchored protein has some other specific attributes that mean it may work better). I would suggest you first thoroughly read Anne Marie Honegger's post above and the link/paper they kindly provided there, though.
I would strongly suggest you instead try the new-generation methods RoseTTAFold and AlphaFold2, as they seem better able to position single/double TMHs away from the membrane-associated globular regions/domains of the chains, albeit still far from perfect. There are some signs that similar methods better suited to such TMH-anchored membrane proteins may be released in the near-ish future, but it remains to be seen.
Best of luck,
David
  • asked a question related to Bioinformatics
Question
1 answer
Dear all,
I'm working on the finer details of my experimental design, and have some questions regarding bridging channels for TMT based experiments.
I have two conditions to test, across nine biological replicates, in order to run as one 18-plex TMT-pro experiment.
I am aware of the use of one or more bridging channels being used with pooled samples to combine multiple TMT mixtures, however a colleague has mentioned that a bridging channel should also be considered for normalisation if only one set is used.
Does anyone have any experience using a bridging channel for normalisation in a single mixture? Is it worth sacrificing one or more biological replicates for?
I will be using MSstatsTMT for normalisation and summarisation.
Sam
Relevant answer
Answer
As an update to this discussion, I have decided to reduce my sample size and incorporate a pooled reference channel. Mostly to open up the possibility of integrating additional samples and conditions in the future.
Sam
  • asked a question related to Bioinformatics
Question
1 answer
Hello there,
I'm searching for reliable bioinformatics/immunoinformatics tools for predicting the immunogenicity of B-cell epitopes. Your expertise is invaluable! Could you kindly recommend any tools that have proven effective in this area? Your insights will significantly contribute to advancing our understanding of immunogenicity prediction.
Thank you in advance for your suggestions!
  • asked a question related to Bioinformatics
Question
8 answers
Molecular dynamics simulation , bioinformatics , molecular docking
Relevant answer
Answer
RAM= 32 GB or higher
Processor= Intel core i7 or higher
High-end GPU instead of CPU
Linux OS
I would suggest using a workstation instead of a laptop.
  • asked a question related to Bioinformatics
Question
5 answers
Are you familiar with Research4Life? It's a program that provides free or low-cost access to scientific research in low-income countries. Research4Life has two eligibility lists: Group A and Group B. Group A includes countries with the lowest gross domestic product, lowest human development index, and other factors that indicate lower-income countries. As an immunoinformatics, Bioinformatics and Molecular Modelling researcher, I'm calling on researchers from Research4Life's Group A countries to join me in collaborative research efforts. By working together and utilizing the program's valuable resources, we can advance our research and make a difference in the world. Best of all, with this collaboration, it will be completely free. #Research4Life #immunoinformatics #bioinformatics #molecularmodelling #collaboration
Relevant answer
Answer
I’m interested
  • asked a question related to Bioinformatics
Question
4 answers
Hello everyone; I am new to R programming. I want to calculate the Firmicutes to Bacteroides ratio from my OTU table. I couldn't find the command and don't know how to do it. Please guide me on this.
I put an example of my OTU table.
Relevant answer
Answer
Thank you for this...
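For anyone landing on this thread later, the calculation itself is just a ratio of summed counts per sample. Below is a minimal Python sketch of the phylum-level Firmicutes-to-Bacteroidetes ratio (the same logic carries over to R); the table layout, the sample names, and the counts are assumptions for illustration, since the original OTU table is not shown here.

```python
# Phylum labels differ between taxonomy releases, so both spellings are accepted.
FIRMICUTES = {"Firmicutes", "Bacillota"}
BACTEROIDETES = {"Bacteroidetes", "Bacteroidota"}

def fb_ratio(otu_table, sample):
    """Summed Firmicutes counts divided by summed Bacteroidetes counts
    for one sample; None when no Bacteroidetes reads are present."""
    firmicutes = sum(row["counts"][sample] for row in otu_table if row["phylum"] in FIRMICUTES)
    bacteroidetes = sum(row["counts"][sample] for row in otu_table if row["phylum"] in BACTEROIDETES)
    if bacteroidetes == 0:
        return None  # ratio undefined
    return firmicutes / bacteroidetes

# Hypothetical OTU table with per-sample read counts
otu_table = [
    {"otu": "OTU1", "phylum": "Firmicutes", "counts": {"S1": 120, "S2": 30}},
    {"otu": "OTU2", "phylum": "Firmicutes", "counts": {"S1": 80, "S2": 10}},
    {"otu": "OTU3", "phylum": "Bacteroidetes", "counts": {"S1": 100, "S2": 0}},
    {"otu": "OTU4", "phylum": "Proteobacteria", "counts": {"S1": 50, "S2": 5}},
]

for sample in ("S1", "S2"):
    print(sample, fb_ratio(otu_table, sample))
```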
  • asked a question related to Bioinformatics
Question
1 answer
Hello,
I measured the distance between two centers of mass during a MD run using gmx distance. Even though the -oall file shows me that the distance changed over time the histogram file -oh puts 100% of probability on the last bin.
As this makes no sense does anyone have an idea on what happened?
Both files are attached
Thank you very much in advance and have a nice day!
Relevant answer
Answer
Try adding the -len flag with the mean distance you are expecting, and the -binw flag with a larger bin width so you have fewer bins. It seems that it only makes so many bins, and the last bin then gets 100% probability if all the prior bins are unfilled. In my example I had a distance of 5.1 nm, and I set the average to 3 and the bin width to 0.1, like this: -len 3 -binw 0.1
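The symptom can be reproduced with any fixed-range histogram. The sketch below is a toy model, not GROMACS code: it assumes the histogram covers a fixed window around an expected length and that out-of-range values are clamped into the edge bins, which is inferred from the behaviour described in this thread rather than checked against the GROMACS source. All names and numbers are illustrative.

```python
def clamped_histogram(values, center, half_width, bin_width):
    """Histogram over [center - half_width, center + half_width];
    values outside the window are counted in the first or last bin."""
    low = center - half_width
    n_bins = max(1, round(2 * half_width / bin_width))
    counts = [0] * n_bins
    for value in values:
        index = int((value - low) / bin_width)
        index = min(max(index, 0), n_bins - 1)  # clamp out-of-range values
        counts[index] += 1
    return counts

distances_nm = [5.0, 5.05, 5.1, 5.15, 5.2]  # hypothetical COM distances

# Window centred far below the data: everything piles up in the last bin.
narrow = clamped_histogram(distances_nm, center=0.1, half_width=0.1, bin_width=0.001)
print(narrow[-1], "of", len(distances_nm), "values in the last bin")

# Window that actually covers the data: the values spread over several bins.
wide = clamped_histogram(distances_nm, center=5.1, half_width=1.0, bin_width=0.1)
print([count for count in wide if count])
```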
  • asked a question related to Bioinformatics
Question
10 answers
I have been trying to dock a protein with an Nd ion that I downloaded from RCSB, but after I add it to PyRx and try to convert it to a ligand I get an error. I tried converting the SDF file to PDB using PyMOL, ChimeraX, Avogadro, and Open Babel, but even then, when I open the file, it gives me this error: ligand: :UNK0:Nd and ligand: :UNK0:Nd have the same coordinates. Could someone please help?
Update: I want to dock an unbound protein with the neodymium metal ion, which I downloaded from RCSB in SDF format and then tried to convert to PDB using the aforementioned software so that AutoDock would accept it, but I cannot get AutoDock to accept it as a proper ligand. Apparently I am unable to get any of the rare earth elements accepted as ligands.
Relevant answer
Answer
Hello Piyush. I do not completely understand your problem. Did you download a protein with a bound "Nd" ion that you want to re-dock using PyRx? Or did you download the ion file separately and want to dock it with the unbound protein?
  • asked a question related to Bioinformatics
Question
4 answers
I am in urgent need of a list of bioinformatics journals without APCs.
Relevant answer
Answer
Thank you, Maryam.
I have used this finder a few times but was not aware of these settings. I'll use it soon. Thanks.
Is there any other option available for the same purpose?
  • asked a question related to Bioinformatics
Question
3 answers
I know many websites have simple tools like transcription and translation available, but are there any analysis tools that researchers need that either do not exist or are not publicly available? It could be anything from algorithms to visuals. Thanks!
Relevant answer
Answer
Abhijeet Singh Thank you for your response and for mentioning my earlier post! My belief is that researchers would know which tools are missing because they run into such problems often during their research. If there is some manual analysis task that researchers could automate, I believe that PeptiCloud can be the perfect platform to develop those tools and make them publicly available. (For instance, PeptiCloud has a unique feature that allows users to further alter the codon sequence of each amino acid after codon optimization with respect to a specific bacterial strain.) With that being said, if you could check out PeptiCloud for yourself and see if anything could be added or improved, that would be greatly appreciated!
  • asked a question related to Bioinformatics
Question
2 answers
Hello All,
I am very new to bioinformatics and biological data, so please bear with my question.
I have differential expression data for three parental cell lines (drug sensitive) and 10 isoforms (made resistant to the drug) derived from these three parental cell lines.
Is this data enough to generate a coexpression network?
I have tried constructing one using GWENA and was successful, but I am not confident about it for two reasons: first, the number of samples, and second, whether isoforms can be treated as samples.
I would really appreciate any suggestions and any reading resources that could be helpful in this regard.
Thank you
Relevant answer
Answer
Thank you so much, Susanta, for your reply.
Can you suggest a way to do network analysis on this data, or any good resource to read on this?
  • asked a question related to Bioinformatics
Question
4 answers
In recent years, a number of vaccines have been approved to fight COVID-19; the list of approved vaccines is available on the FDA site. We are looking for the sequences of these vaccines (the RNA sequence in the case of mRNA vaccines and the amino acid sequence in the case of protein-based vaccines). I would highly appreciate the community's help in finding these sequences.
Relevant answer
Answer
  • asked a question related to Bioinformatics
Question
4 answers
Greetings,
I have recently isolated a new E. coli phage, and while assessing its host range I discovered that this particular phage was effective against Pseudomonas aeruginosa and Staphylococcus aureus in wet lab experiments. However, upon examining the complete genome of the phage on NCBI, I noticed that it did not exhibit any similarities with known P. aeruginosa and S. aureus phages. Additionally, when I performed a blastp analysis on all the phage proteins in NCBI, I could not identify any homology with the aforementioned P. aeruginosa and S. aureus phages. Normally, I would expect to observe some degree of homology, especially in proteins responsible for recognition, such as tail proteins or lytic proteins.
My question is how I can determine the wide host range of the phage based on its genome. It appears that bioinformatic tools should provide information regarding the extent of the phage's host range. I would greatly appreciate your comments and recommendations on this matter.
Thank you.
Relevant answer
Answer
I don't think you can predict host range using bioinformatics tools. There are so many subtleties that impact host range, in terms of gene expression, chaperones that influence the folding of proteins, and most importantly the interactions with the host receptor for phage binding and injection. We don't yet know enough to predict it, except in a very few well-studied examples.
  • asked a question related to Bioinformatics
Question
23 answers
Here is the list of 2023 Impact Factors.
Journal Citation Reports 2023
Relevant answer
Answer
This is not the complete list ... where are all the Human Resource Management journals, for example?
  • asked a question related to Bioinformatics
Question
4 answers
Has any of you ever done research in the field of bioinformatics?
Relevant answer
Answer
I am a bioinformatician, let me know what specific query you have. Bioinformatics has many subfields like genomics, proteomics etc.
  • asked a question related to Bioinformatics
Question
5 answers
I want to annotate each gene in the Homo sapiens taxon with its respective GO terms and its hierarchical parent terms in the GO database. How can I systematically do that? While I am aware that the obo file contains information such as "is a," "part of," and "regulates," it lacks a comprehensive hierarchy from child GO terms to all their parent terms. Is there an existing method available to achieve this systematic annotation, or do I need to develop a custom script to extract this information from the obo file?
Relevant answer
Answer
Mohammad Shahbaz Khan Certainly! Although the data is currently presented in Gene Ontology (GO) format, I want to create a comprehensive graph that visualizes the entire information. Further, I intend to annotate each gene with its corresponding GO term, including all parent terms associated with each gene.
  • asked a question related to Bioinformatics
Question
4 answers
I have been experimenting with machine learning in JavaScript. Please let me know about your experience too! 😎🤗😍
A preprint is attached!
Relevant answer
Answer
Feel free to add more details on your perspective!
  • asked a question related to Bioinformatics
Question
3 answers
Dear ResearchGate Community,
I am currently engaged in single-cell analysis for my research project and would greatly appreciate your insights and experiences regarding the use of Seurat and ScanPy.
I have been exploring both Seurat and ScanPy as tools for analyzing single-cell RNA sequencing (scRNA-seq) data. However, I would like to gather more information about these packages directly from researchers who have bioinformatic hands-on experience with them.
Specifically, I would be grateful if you could share your thoughts on the following:
1. Which package (Seurat or ScanPy) have you used for scRNA-seq analysis, and what were your primary reasons for choosing it? Does it depend on familiarity with the programming language (R for Seurat and Python for ScanPy)?
2. What are the notable features, strengths, or advantages of the packages you have worked with?
3. Were there any challenges or limitations you encountered while using the packages, and how did you address them?
4. Have you encountered any specific use cases or applications where one platform outperformed the other?
5. Are there any particular resources, tutorials, or best practices you found helpful when working with Seurat or ScanPy?
Your firsthand experiences and insights would be immensely valuable in helping me make an informed decision about which package to choose and understanding potential considerations for my single-cell analysis workflows.
Thank you in advance for taking the time to share your expertise. I look forward to hearing from you and benefiting from your valuable insights.
Best regards,
Emil Lagumdzic Institute of Immunology Department of Pathobiology
University of Veterinary Medicine Vienna
Relevant answer
Answer
Thank you, ChatGPT.
  • asked a question related to Bioinformatics
Question
4 answers
Is the hierarchical structure observed in the Gene Ontology (GO) OBO-basic file limited to the 'is a' relationship, or do the relationships 'has part' and 'regulates' also exhibit a similar hierarchical nature and can be propagated to the root?
Relevant answer
Answer
The hierarchical structure observed in the Gene Ontology (GO) OBO-basic file is primarily based on the "is a" relationship, which represents the parent-child relationship between terms. The "is a" relationship defines a broader term (parent term) and a more specific term (child term), indicating a hierarchical structure.
However, the GO also incorporates other types of relationships beyond "is a" to capture additional aspects of gene function. Two such relationships are "has part" and "regulates":
  1. "Has part" relationship: This relationship indicates that a term represents a part of another term. It describes a physical or functional subcomponent of a larger entity. While the "has part" relationship does not strictly follow a hierarchical structure, it provides additional information about the organization and composition of biological processes or structures.
  2. "Regulates" relationship: This relationship describes the regulatory interactions between terms. It indicates that a term controls or influences the activity or expression of another term. Similar to the "has part" relationship, the "regulates" relationship does not strictly conform to a hierarchical structure.
Although the "has part" and "regulates" relationships do not exhibit a hierarchical nature like the "is a" relationship, they can still be informative for understanding the functional relationships between terms. However, the propagation of these relationships to the root of the hierarchy is not as straightforward as it is with the "is a" relationship. The propagation of relationships to the root may require additional analysis and considerations based on the specific research context or requirements.
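To make "propagation to the root" concrete, here is a minimal sketch that reads [Term] stanzas from OBO text and collects every ancestor of a term by walking upward over a chosen set of relations. It is an illustration under assumptions, not an official GO tool: the tiny ontology and its identifiers (GO:A to GO:D) are made up, and which relations to follow (here is_a and part_of by default) is a decision left to the caller.

```python
def parse_obo(text, relations=("is_a", "part_of")):
    """Return {term_id: set(parent_ids)} for [Term] stanzas in OBO text."""
    parents = {}
    current = None
    in_term = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are of interest.
            in_term = line == "[Term]"
            current = None
            continue
        if not in_term or not line:
            continue
        if line.startswith("id: "):
            current = line[4:].strip()
            parents.setdefault(current, set())
        elif current and line.startswith("is_a: ") and "is_a" in relations:
            parents[current].add(line[6:].split("!")[0].strip())
        elif current and line.startswith("relationship: "):
            rel, target = line[14:].split("!")[0].split()[:2]
            if rel in relations:
                parents[current].add(target)
    return parents


def ancestors(term, parents):
    """Return all terms reachable from `term` by following parent links."""
    seen = set()
    stack = [term]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Made-up miniature ontology; the identifiers are hypothetical.
OBO_EXAMPLE = """format-version: 1.2

[Term]
id: GO:A
name: root

[Term]
id: GO:B
is_a: GO:A ! root

[Term]
id: GO:C
is_a: GO:B ! middle term
relationship: part_of GO:D ! sibling branch

[Term]
id: GO:D
is_a: GO:A ! root

[Typedef]
id: part_of
"""

all_parents = parse_obo(OBO_EXAMPLE)
is_a_only = parse_obo(OBO_EXAMPLE, relations=("is_a",))
print(sorted(ancestors("GO:C", all_parents)))
print(sorted(ancestors("GO:C", is_a_only)))
```

Annotating a gene with its full set of terms is then the union of its direct terms and `ancestors(term, parents)` for each of them.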
  • asked a question related to Bioinformatics
Question
5 answers
I am looking for data from mammals ideally, but I will take anything to be honest. I am getting to grips with bioinformatics and need a practice data set with which I can go through the steps of filtering, trimming, mapping to a reference genome, etc.
If anyone also has any advice on tools used subsequently for analysis such as MethylKit that would be awesome.
Thank you
Relevant answer
Answer
The majority of bioinformatics tools offer sample data, and you can use that for practice. For instance, test data may be found at https://github.com/FelixKrueger/Bismark.
  • asked a question related to Bioinformatics
Question
7 answers
I would like to join two drug molecules (a cocktail) using a bioinformatics approach. Are there any tools available for this? Is there any software where one can submit the individual structures of the drug molecules and receive the merged molecule?
Relevant answer
Answer
Susanta Roy Many thanks for answering my query. I just have one more question. Is there any way to check for synergism between two ligands when docked with a protein, that is, multiple-ligand docking? Do you know of any protocols for that?
  • asked a question related to Bioinformatics
Question
4 answers
I have a protein sequence with two cysteine residues and I would like to predict whether those cysteines will form a disulfide bond.
I am looking for user-friendly tools to do this, either online tools or some other kind of easy-to-use software, since I am not well-versed in bioinformatics.
  • asked a question related to Bioinformatics
Question
7 answers
Please provide useful insights and the general experiments required for designing a lab manual.
Relevant answer
Answer
Sanjay Nagar Mohammad Shahbaz Khan Sabine Strehl Sanjay, with the addition of Mohammad's suggestions it looks to me that you now have information to make the best manual ever. - Add my "thank you" to Mohammed & Sabine.
  • asked a question related to Bioinformatics
Question
1 answer
Hi there,
I'm comparing the arrangement of a gene complex across different species to try and find clues about its evolutionary history. In some cases genes appear to have jumped around and switched positions, but I do not know if this is the result of recombination, or due to the orientation in which the chromosome has been assembled?
I'm taking data from the NCBI genome browser using RefSeq chromosome-level assemblies in each case. Does anyone know if there is a standard direction in which homologous chromosomes have to be uploaded?
I imagine this is perfectly possible to do if you consider the positions of conserved genes at each end of the chromosome, but I would rather not have to do this myself if I know that it has already been accounted for...
Thanks,
Jake
Relevant answer
Answer
Yes, there is a standard orientation for chromosomes to be assembled in most genome sequencing and assembly projects. Chromosomes are typically assembled with a consistent orientation known as the "forward" or "+" orientation. In this orientation, the DNA sequence is aligned from the 5' end (start) to the 3' end (end) of the chromosome. This convention ensures consistency in the representation of genomic information across different studies and facilitates comparisons between different genomes.
The forward orientation is determined based on the directionality of DNA replication during genome sequencing and assembly processes. The DNA strands are typically sequenced in both directions, and the resulting reads are then aligned and assembled into contigs, which are further scaffolded to construct chromosome-level assemblies.
It's worth noting that in some cases, certain regions or genes within a chromosome may be inverted or have a reverse orientation due to specific biological features, such as gene rearrangements or evolutionary events. However, the majority of the chromosome is assembled and represented in the forward orientation to maintain consistency and standardization in genome research.
  • asked a question related to Bioinformatics
Question
6 answers
I have a sequence (genome.fasta) and I want to check the gene located at 400-500 nt.
What bash script should I use (I have WSL on my Windows machine), or are there any conda packages?
Thank you in advance
Relevant answer
Answer
To extract a sequence from a larger genome file based on a specific location, you can use various command-line tools available in Bash. You can achieve this using the samtools and bedtools utilities, which can be installed via conda.
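If installing extra tools is not an option, the same extraction can be sketched in plain Python. This is a minimal illustration that assumes a small FASTA file that fits in memory and 1-based, inclusive coordinates; the record name "chr1" and the sequence are made up. It returns only the bases: finding out which gene lies at those coordinates still needs an annotation file for the genome.

```python
def read_fasta(text):
    """Return {record_name: sequence} from FASTA-formatted text."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def subsequence(seq, start, end):
    """Return bases start..end of seq, using 1-based inclusive coordinates."""
    if start < 1 or end < start or end > len(seq):
        raise ValueError(f"range {start}-{end} is outside 1-{len(seq)}")
    return seq[start - 1:end]


# Made-up example; with a real file, read genome.fasta into `text` instead.
FASTA_EXAMPLE = """>chr1 hypothetical record
ACGTA
CGTAC
"""

genome = read_fasta(FASTA_EXAMPLE)
region = subsequence(genome["chr1"], 3, 6)
print(region)
```

For the question as asked, the call would be `subsequence(genome[name], 400, 500)` with `name` set to the record of interest.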
  • asked a question related to Bioinformatics
Question
3 answers
Are there any servers or tools (bioconda, Java, etc.) that exclusively annotate membrane proteins (similar to dbCAN for polysaccharides) in a bacterial genome?
Thank you in advance!
  • asked a question related to Bioinformatics
Question
3 answers
Hi - I'm currently working with two RNA-Seq studies; one has RNA extracted from whole blood, the other PBMCs. Eventually we want to combine these data and perform some cell-specific deconvolution to look at DEGs.
Are there any recommended methods for batch correcting these data from different sources?
Mari
Relevant answer
Answer
It is better to consider batch as a factor in the design formula. The tximport pipeline proposed by Michael Love himself offers the most useful solution. Please have a look.
  • asked a question related to Bioinformatics
Question
3 answers
I am interested in predicting the protein structure of my protein of interest. Using NCBI BLAST, I found an experimental structure that corresponds to a domain of my protein, showing 24% query coverage and 100% similarity. My question is whether I can confidently use this experimental structure as a template for homology modeling, or if I should explore alternative techniques such as threading, ab initio modeling, or any other suitable approach. I would also appreciate recommendations for relevant servers or software that can assist in this case.
Thank you for your insights and suggestions.
Relevant answer
Answer
Quite honestly, if your protein isn't too large (i.e., doesn't have too many amino acids), I would just use AlphaFold or ESMFold and compare the best model with the resolved structure by aligning on this region. I think the methods listed in the previous post (or the variants of them that participated) all performed worse than AlphaFold in the last CASP competitions, although I haven't checked this ^^
RosettaFold would also be a good option.
Of course homology modeling can still work pretty well, but usually only if you have good templates and ideally many of them. But if you have regions that basically are missing in your templates and those are significant it usually doesn't really work that well.
  • asked a question related to Bioinformatics
Question
3 answers
What should I do if ChimeraX doesn't recognise the .chimerax file downloaded from SwissDock after docking?
Besides, the zip file of the completed prediction was empty.
Thank you.
Relevant answer
Answer
If you are still encountering difficulties, you can try using alternative molecular visualization tools to view the SwissDock results. Examples of popular visualization software include PyMOL, VMD, and UCSF Chimera.
  • asked a question related to Bioinformatics
Question
4 answers
Greetings!
I have an issue that drives me crazy this evening...
I have a list of gene vectors, downregulated in different transgenic plants, and I want to make a Venn diagram to visualize them and show the intersections between plants.
But! Every package I tried (in R) gave me something like this (uploaded picture 1)...
What's bothering me:
1. The numbers in the "clear" (non-intersecting) parts of the diagram are lower than the sizes of the gene sets I have. I tried using factors instead of character vectors, removing possible duplicates, and removing symbols (like spaces) that could confuse the software. None of it helped; the result was the same.
2. The intersection of vectors is not right: in the picture you can see that the intersection of 2 datasets (of 365 and 154 genes) is 1133 genes!! How could that be?
Manually using the intersect function on the same dataset gives correct results.
Maybe I am misunderstanding Venn diagrams? On the web I found many examples of such strange mistakes: in the second picture, from Datanovia, you can see that the intersection of the red ellipse (of 58) and the yellow one (of 144) is 66!
It seems logical to me that the intersection of 2 vectors cannot be greater than the length of the smaller vector. What am I doing wrong or misunderstanding?
Relevant answer
Answer
I believe Rob is correct.
Since you are using the intersect function, the numbers in your figure (e.g. 365 and 154) are the number of genes without any intersection.
The total genes of each set (e.g. OE21) will be the sum of all the numbers in each intersection + genes with no intersection. I couldn't do full sum for you as the core intersection number is missing.
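The distinction between a region of the diagram and a whole set can be checked with a few lines of set arithmetic. The sketch below is in Python rather than R, and the plant names and gene identifiers are invented. Each number printed on a Venn diagram counts the items that belong to exactly that combination of sets, so a set's total is the sum of all regions that include it.

```python
def venn_regions(sets):
    """Map each combination of set names to the items found in exactly those sets."""
    names = sorted(sets)
    regions = {}
    for item in set().union(*sets.values()):
        members = tuple(n for n in names if item in sets[n])
        regions.setdefault(members, set()).add(item)
    return regions


# Hypothetical gene sets for three plants.
gene_sets = {
    "A": {"g1", "g2", "g3", "g4", "g5"},
    "B": {"g4", "g5", "g6"},
    "C": {"g5", "g7"},
}

regions = venn_regions(gene_sets)
for members, genes in sorted(regions.items()):
    print(members, len(genes))

# The total for A is the sum over every region whose label contains "A".
total_a = sum(len(g) for members, g in regions.items() if "A" in members)
# A plain pairwise intersect also includes the genes shared with C.
pairwise_ab = len(gene_sets["A"] & gene_sets["B"])
```

Here the region labelled only "A" holds 3 genes although A has 5, and the pairwise intersection of A and B (2 genes) is split across the A-B region and the central A-B-C region.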
  • asked a question related to Bioinformatics
Question
3 answers
I'm looking for an online bioinformatics course that awards a certificate. Any suggestions?
Relevant answer
Answer
  • asked a question related to Bioinformatics
Question
6 answers
Hello everyone,
I am not good at R so I am trying to find solutions for my problems through the internet. I have been stuck on a problem. I couldn't find a way to compare the means of groups separated by facet function. Maybe I should not have put x axis as it is now but I wanna make sure. Here is the shorter version of my code for you to have a look at:
my_comparisons <- list( c("Hybrid","Single"))
ggplot(data = rpkms_new2, aes(x = strand, y = log2(RPKM), fill=strand, label = strand))+
geom_violin(scale = "count", alpha=0.5)+
facet_grid(~Trans, switch = "x", scales = "free_x", space = "free_x") +
theme(plot.title = element_text(hjust=0.5))+
theme(panel.spacing = unit(0, "lines"),
strip.background = element_blank(),
strip.placement = "outside") +
stat_compare_means(ref.group = "None", aes(label = ..p.signif..), method = "wilcox")+
stat_compare_means(comparisons = my_comparisons, aes(label = ..p.signif..), method = "wilcox")+
geom_text(data = mean_ranks, aes(x = strand, y = -Inf, label = round(rank, 0)), size = 3, vjust = -1)
How should I modify my code to be able to compare all the subgroups(single and hybrid) with the "None" group ?
My data looks like below:
STRAND TRANS VALUES:
sense hybrid 2
sense hybrid 2
sense single 3
sense single 7
antisense hybrid 10
antisense hybrid 12
antisense single 1
antisense single 2
none none 1
none none 4
Relevant answer
Answer
Thumbs up Tuba Sena Ogurlu
  • asked a question related to Bioinformatics
Question
3 answers
I am currently an Indonesian high school student passionate about bioinformatics and its potential to drive impactful innovations in the fields of biology and medicine. I am eager to participate in the Regeneron International Science and Engineering Fair and showcase a research project that can make a significant contribution to the scientific community.
Considering the vast possibilities within the realm of bioinformatics, I would greatly appreciate any suggestions, ideas, or insights for a research project that aligns with the following criteria:
  1. Impactful Innovation: I am looking for a research topic that has the potential to make a significant impact in the biology or medical world. It could involve the development of new algorithms, computational tools, or methodologies that address critical challenges in these domains.
  2. Bioinformatics Focus: The research should predominantly involve bioinformatics techniques, such as data analysis, data mining, machine learning, genomics, proteomics, or other computational approaches. It should leverage the power of data and computational tools to gain insights into biological processes or contribute to medical advancements.
  3. Feasibility for a High School Student: As a high school student, I have certain limitations in terms of resources, time, and expertise. Therefore, I am seeking research ideas that are feasible for a high school-level project. While the topic should be challenging enough to meet the standards of the Regeneron ISEF, it should also be manageable within the scope of a high school research project.
Thank you in advance for your valuable suggestions and insights.
Relevant answer
Answer
If you are passionate about bioinformatics and its applications in the medical industry, there is a lot of research going on in molecular and functional genomics nowadays. You will certainly have a diverse arena of research to choose from, from metagenomics to single-cell RNA sequencing. If you like, you could also try to develop a computational pipeline to analyze publicly available cancer genomics data, such as The Cancer Genome Atlas (TCGA) dataset. Focus on identifying potential biomarkers, genetic variants, or gene expression patterns associated with specific types of cancer, aiming to contribute to personalized medicine and targeted therapies. You should read about this and make sure you have a clear understanding.
  • asked a question related to Bioinformatics
Question
3 answers
Hello everybody, I'm a master's degree student. I'm working with 16S data from some environmental samples. After all the cleaning, denoising, etc., I now have an object that stores my sequences, their taxonomic classification, and a table of ASV counts per sample linked to their taxonomic classification.
The question is, what should I do with the counts when assessing diversity metrics? Should I transform them before calculating the indexes, or should I transform them according to the index/distance I want to assess? Where can I find resources on these and related problems to study?
I know that these questions may be very simple ones, but I'm lost.
As far as I know there is no consensus on how to transform the data, but I cannot leave the counts raw because of the compositional nature of the data.
Please help
Relevant answer
Answer
Assessing diversity metrics in 16S data is an important step in analyzing microbial communities. Handling count data in this context can be challenging due to the compositional nature of the data, as you mentioned. While there is no one-size-fits-all approach, there are several techniques and considerations you can explore. Here are some suggestions:
  1. Transformations for diversity metrics: The choice of transformation depends on the diversity metric you want to assess. Common transformations include rarefaction, normalization (e.g., by library size or cumulative sum scaling), or transformations that aim to address compositionality, such as log-ratio transformations (e.g., centered log-ratio, clr transformation) or Hellinger transformation. Different transformations may be more suitable for specific diversity metrics, so it's essential to consider the metric's assumptions and properties.
  2. Compositional data analysis (CoDA): Compositional data analysis provides a statistical framework to analyze and interpret compositional data. It accounts for the constrained nature of relative abundance data by working on transformed data. CoDA methods, such as ALDEx2 or ANCOM, can help identify differentially abundant features between groups while considering the compositional structure.
  3. Multivariate analyses: If you want to explore the overall community structure and relationships, multivariate techniques like principal component analysis (PCA), correspondence analysis (CA), or non-metric multidimensional scaling (NMDS) can be employed. It's advisable to perform these analyses on transformed data to mitigate the effects of compositionality.
  4. Research articles and resources: To delve deeper into the subject, you can refer to scientific articles and resources that discuss the statistical analysis of 16S data. Some useful references include: "Microbiome Analysis Methods" by Paul J. McMurdie and Susan Holmes; "A guide to statistical analysis in microbial ecology: a community-focused, living review of multivariate data analyses" by Pier Luigi Buttigieg and Alban Ramette; "Statistical Analysis of Microbiome Data with R" by Yinglin Xia et al.; and "microbiomeSeq: An R package for analysis of microbial communities in an environmental context" by Alfred Ssekagiri et al. These resources provide insights into various statistical approaches, transformations, and analysis techniques for 16S data.
Remember that there is ongoing research in the field, and best practices continue to evolve. It's important to critically evaluate the methods, consider the specific characteristics of your data, and consult with your advisor or peers with expertise in microbiome analysis to make informed decisions about data transformations and diversity metric assessment.
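As a concrete illustration of two of the options above, the sketch below computes the Shannon index from the counts of one sample and the centered log-ratio (clr) transform, from which a Euclidean distance between samples (often called the Aitchison distance) follows. It uses only the standard library. The pseudocount of 0.5 used to handle zeros is an arbitrary assumption, not a recommendation, and the counts are made up.

```python
import math


def shannon(counts):
    """Shannon index (natural log) from the counts of one sample."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform; the pseudocount avoids log(0)."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [x - mean_log for x in logs]


def aitchison_distance(a, b, pseudocount=0.5):
    """Euclidean distance between the clr-transformed samples a and b."""
    ca, cb = clr(a, pseudocount), clr(b, pseudocount)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ca, cb)))


# Made-up ASV counts for two samples (same ASV order in both).
sample_1 = [120, 30, 0, 50]
sample_2 = [240, 60, 0, 100]  # same composition, twice the sequencing depth

print(shannon(sample_1))
print(clr(sample_1))
print(aitchison_distance(sample_1, sample_2))
```

The clr values of a sample always sum to zero, and without a pseudocount the distance between two samples that differ only in sequencing depth is exactly zero, which is the property that makes log-ratio methods attractive for compositional data.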
  • asked a question related to Bioinformatics
Question
7 answers
I'm interested in studying specific missense mutations in a human gene. My goal is to determine whether the mutated region of the protein is conserved across various species. Could you please guide me on how I can use in silico tools to find homologous protein sequences and identify their conserved regions?
Thank you very much
Relevant answer
Answer
That's a good approach Susanta Roy I would add that once you are working with your multiple sequence alignment (MSA) in Jalview (https://www.jalview.org), you load an experimental 3D protein structure, or an AlphaFold model (all possible from Jalview, just right-click on a sequence label), and visualise the mutations and conservation scores on the structure too. Jalview makes this easy by colouring the structure by the sequence, so you can choose to colour by conservation and add features to represent your mutations and they will instantly be viewable on the structure.
The other thing I would add is that in addition to BLASTing the full-length protein, have a look at it on InterPro and see what domains it has. Then you can work with curated MSAs from the individual domains too.
Great question Muhammad Abrar Yousaf !
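Once an alignment exists, a first look at conservation can be taken with a very simple per-column score. The sketch below computes, for each column, the fraction of sequences that carry the most common residue. This is a deliberately crude measure and not the conservation score that Jalview displays; the four aligned sequences are invented.

```python
from collections import Counter


def column_conservation(alignment):
    """Return, per column, the fraction of sequences with the most common residue."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for i in range(length):
        # Gaps ("-") are ignored when counting residues but still count
        # in the denominator, so gappy columns score lower.
        residues = [seq[i] for seq in alignment if seq[i] != "-"]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(alignment))
    return scores


# Hypothetical aligned homologues (rows are species, columns are positions).
alignment = ["MKTA", "MKSA", "MRTA", "M-TA"]
scores = column_conservation(alignment)
print(scores)
```

A mutation that falls in a column scoring close to 1.0 sits at a position that is identical in nearly all the homologues included, which is the kind of evidence the question is after.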
  • asked a question related to Bioinformatics
Question
3 answers
Hi, I am a beginner in bioinformatics and I would like to identify CRISPRs in my MAGs fasta files. Can someone recommend an up-to-date good tool that can be easily installed through the Conda environment, please? Thank You in advance
Relevant answer
Answer
CRISPRCasFinder helps find CRISPRs and cas genes in your MAG/FASTA files. You can upload them on this website: https://crisprcas.i2bc.paris-saclay.fr/CrisprCasFinder/Index. If your files are too big and the first link does not work, you may need to use the version available here instead: https://crisprcas.i2bc.paris-saclay.fr/Home/Download
Reference: David Couvin, Aude Bernheim, Claire Toffano-Nioche, Marie Touchon, Juraj Michalik, Bertrand Néron, Eduardo P C Rocha, Gilles Vergnaud, Daniel Gautheret, Christine Pourcel, CRISPRCasFinder, an update of CRISRFinder, includes a portable version, enhanced performance and integrates search for Cas proteins, Nucleic Acids Research, Volume 46, Issue W1, 2 July 2018, Pages W246–W251, https://doi.org/10.1093/nar/gky425
  • asked a question related to Bioinformatics
Question
2 answers
Dear Researchers,
If anyone is interested in reviewing a manuscript on multiepitope vaccine design, please provide the following details:
Note: Reviewers from India, Pakistan, Egypt & Saudi Arabia are not eligible for this manuscript.
First Name:
Last Name:
Degree:
Position:
Institution:
Department:
Institutional E-mail id:
Relevant answer
Answer
Hi, I am in
For what journal do you need reviewers?
  • asked a question related to Bioinformatics
Question
3 answers
Can an MD simulation be performed by adding other salts by varying their concentration inside the box?
Relevant answer
Answer
It is possible to add other ions to the solvent of the system, in addition to NaCl and MgCl2, in an MD simulation. However, it is important to consider the effect of the additional ions on the simulation results, and the appropriate ion concentration should be chosen based on the system being studied.
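When varying the concentration, it helps to know how many ions a target molarity corresponds to for a given box. The sketch below does that arithmetic: N = c x V x N_A, with 1 nm^3 equal to 1e-24 L. It is a rough estimate that uses the whole box volume rather than the solvent volume and does not include the counter-ions needed to neutralize the solute; the box size is a made-up example.

```python
AVOGADRO = 6.02214076e23  # particles per mole
LITRES_PER_NM3 = 1e-24    # 1 nm^3 = 1e-27 m^3 = 1e-24 L


def formula_units(molar, box_nm):
    """Number of salt formula units for a target molarity in a rectangular box."""
    volume_nm3 = box_nm[0] * box_nm[1] * box_nm[2]
    return round(molar * volume_nm3 * LITRES_PER_NM3 * AVOGADRO)


# Hypothetical 10 x 10 x 10 nm box at 0.15 M.
n_units = formula_units(0.15, (10.0, 10.0, 10.0))
print(n_units)

# For a 1:1 salt such as NaCl this is the number of cations and of anions.
# For a 1:2 salt such as MgCl2 it means n_units cations and 2 * n_units anions.
n_chloride_for_mgcl2 = 2 * n_units
```

For the example box this gives about 90 formula units, so 0.15 M MgCl2 would correspond to roughly 90 Mg2+ and 180 Cl- ions.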
  • asked a question related to Bioinformatics
Question
3 answers
"The result shows absence of intragenomic variation among 16S rDNA gene and presence of variable regions among the 16S rDNA sequences (intergenomic variation), noticing for example high variability around 800, 900, and 1000 bp and a large conserved region between 1150 and 1350 bp. This information allowed us to discard the restriction enzymes FnuII, AsuI, FokI, Eco57I that recognized some restriction sites contained within variable regions, since they are more susceptible of acquiring future nucleotidic variations and with this, the potential generation of different band patterns." [1]
I add that the article mentioned that these discarded enzymes were targeting conserved sites in the study species.
[1]Mandakovic D, Glasner B, Maldonado J, Aravena P, González M, Cambiazo V, Pulgar R. Genomic-Based Restriction Enzyme Selection for Specific Detection of Piscirickettsia salmonis by 16S rDNA PCR-RFLP. Front Microbiol. 2016 May 9;7:643. doi: 10.3389/fmicb.2016.00643. PMID: 27242682; PMCID: PMC4860512.
Is my reading right that the article implies that there is such potential? If yes, what are the possible mechanisms?
More important, what's the time frame of this "future nucleotidic variation", is it an evolutionary time frame that could take thousands of years?
Edit: I think my question can be thought of as: How common are new 16S rRNA gene variants in bacterial species?
Relevant answer
Answer
Yes, your reading is correct. The article implies that there is potential for future nucleotide variations within the conserved restriction sites that are located in variable regions of the 16S rDNA gene.
The possible mechanisms for such variations are mutations, insertions, deletions, or recombinations, which can occur spontaneously or as a result of exposure to environmental factors, such as UV radiation, chemicals, or antibiotics. These changes can accumulate over time and result in differences in the sequence and/or length of the conserved restriction sites, leading to the generation of different band patterns upon restriction digestion.
The time frame for such variations can vary depending on the bacterial species, its population size, its growth rate, and the selective pressures it faces. Some bacterial species have high mutation rates and/or frequent horizontal gene transfer events, which can result in rapid evolution and diversification. Others have lower mutation rates and/or stable environments, which can lead to slower evolution and conservation of certain traits. However, even slow evolution can accumulate changes over time, and it is difficult to predict the exact time frame for future nucleotide variations within conserved restriction sites.
Regarding your edited question, the frequency of new 16S rRNA gene variants in bacterial species can also vary depending on the factors mentioned above. Some bacterial species have high genetic diversity and high rates of recombination and horizontal gene transfer, leading to frequent emergence of new variants. Others have low genetic diversity and low rates of recombination and horizontal gene transfer, resulting in slower emergence of new variants. However, the 16S rRNA gene is generally considered to be a stable and conserved marker for bacterial identification and classification, and many conserved regions within this gene are used as targets for PCR amplification and sequencing.
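To make the "different band patterns" point concrete, here is a minimal sketch of an in-silico digest. It is purely illustrative: the sequence is synthetic, and EcoRI (site GAATTC, cut after the first G) is used as a stand-in rather than any of the enzymes discussed in the article. It shows how a single substitution inside a recognition site removes one cut and changes the fragment lengths.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the position of the cut within the site
    (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Synthetic amplicon with two recognition sites (not a real 16S sequence)
amplicon = "A" * 100 + "GAATTC" + "C" * 200 + "GAATTC" + "T" * 50

# Single G -> A substitution in the second site (index 306)
mutant = amplicon[:306] + "A" + amplicon[307:]

print(digest(amplicon))  # three fragments
print(digest(mutant))    # two fragments: the second site is lost
```

A site located in a variable region is more likely to pick up such a substitution, which is presumably why the authors avoided enzymes cutting there.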
  • asked a question related to Bioinformatics
Question
2 answers
Dear friends and connections,
I believe in the power of community, so I am posting this.
I am excited to explore the possibility of collaborating with someone who works on network pharmacology. Network pharmacology is an interdisciplinary field that combines principles of network analysis, bioinformatics, and pharmacology to investigate drug-target interactions and predict the therapeutic effects of drugs.
I have some projects related to bioinformatics and I believe that our collaboration can result in significant progress in this exciting field.
I am looking forward to hearing from you and exploring our collaboration for network pharmacology.
Regards
Shopnil Akash
WhatsApp: +8801935567417
Relevant answer
Answer
Network pharmacology, a systematic analytical method, can analyze the interaction network of multiple factors such as drugs, protein targets, diseases, and genes.
Regards,
Shafagat
  • asked a question related to Bioinformatics
Question
5 answers
I've recently been using the NCI's Cancer Genome Atlas to find datasets and perform basic clinical correlation analyses. I think it's a fantastic tool, even for people with a limited bioinformatics background, so it made me curious if there are similar resources for people who study non-cancer diseases.
I was wondering if people are aware of any other databases/repositories/webtools that serve a similar purpose for non-cancer diseases. If anyone has recommendations/suggestions, please comment/link them down below.
Thanks in advance for your input!
Relevant answer
Answer
There are several databases and repositories available that provide genomic data and tools for the study of non-cancer diseases. Here are a few examples:
  1. The Genetic Association Database (GAD) - GAD is a database that collects data from published studies investigating genetic associations with various diseases, including autoimmune disorders, cardiovascular diseases, and neurological disorders. The database includes information on single nucleotide polymorphisms (SNPs), genes, and diseases.
  2. The National Institute of Neurological Disorders and Stroke (NINDS) Repository - The NINDS Repository provides access to biospecimens and genetic data from patients with neurological disorders. Researchers can use this resource to investigate the genetic basis of neurological diseases and to develop potential treatments.
  3. The Online Mendelian Inheritance in Man (OMIM) - OMIM is a database that catalogs genes and genetic disorders. The database includes information on the genetic basis of various diseases, including cardiovascular diseases, neurological disorders, and rare genetic disorders.
  4. The Comparative Toxicogenomics Database (CTD) - The CTD provides information on how environmental chemicals can affect human health. The database includes information on the genes and proteins that are impacted by toxic substances and their associated diseases.
These are just a few examples of the many databases and repositories available for the study of non-cancer diseases. It is important to choose the appropriate resource based on your research question and to make sure that the data and tools are reliable and validated.
  • asked a question related to Bioinformatics
Question
4 answers
"Are there any in silico methods for studying the effect of up-regulation and down-regulation of the same genes?"
If yes, please suggest the name or an article. Thank you.
Relevant answer
Answer
Luke V Schneider Thank You...
  • asked a question related to Bioinformatics
Question
4 answers
What bioinformatics tools are available to help analyze and interpret large-scale molecular data generated from crop research?
Relevant answer
Answer
I highly recommend that you focus on your education and on understanding the basics and fundamentals, and not spam here by posting questions and answering them yourself.
Further, since you don't have the proper education to understand what software can be used for what, it is totally illogical to talk about bioinformatics tools.
  • asked a question related to Bioinformatics
Question
2 answers
We all know that nanobody development is time-consuming and expensive; it practically requires a grant. I'm wondering if there is any bioinformatics tool or method to predict a nanobody sequence against a certain antigen using the antigen sequence as input. That is, you put in the antigen sequence and the tool predicts what a nanobody against this antigen could look like in terms of sequence, structure, etc.
Relevant answer
Answer
Yes, there are several bioinformatics tools available to predict nanobody sequences based on antigen sequences. One such tool is called "AbDesign," which is a web-based server that predicts the sequence and structure of nanobodies based on the input antigen sequence.
AbDesign uses a computational algorithm to predict the amino acid sequence of nanobodies that can bind to the input antigen sequence. The algorithm takes into account the physicochemical properties of the antigen and the CDRs (complementarity-determining regions) of the nanobody.
Other bioinformatics tools that can be used for nanobody sequence prediction include "Nanobody Mapper," "VHHDB," and "Nanobodies.org." These tools use a variety of algorithms and techniques to predict nanobody sequences, and some also provide additional features, such as database searches and visualization tools.
It's important to note that while bioinformatics tools can be useful for predicting nanobody sequences, experimental validation is still necessary to confirm the predicted sequence and determine its binding properties.
  • asked a question related to Bioinformatics
Question
3 answers
Hi, I would like to ask if anybody has had positive experiences with single-primer PCR. Can you recommend any proven protocol for this type of PCR? Thank you for all recommendations. Bohuš
Relevant answer
Answer
Hi, it works easily for the selection of mismatches (SNPs). Coupling fluorescent dyes to such primers can convert the PCR to real-time PCR.
  • asked a question related to Bioinformatics
Question
4 answers
I am running an MD simulation on a protein-protein complex.
After seeing a similar question on ResearchGate, I checked the amino acid rtp file in my force field folder and, as expected from this error, the HD1 atom was not present in the HSE entry. The atom HD2 is, however, present in that entry. So I figured replacing the HD1 atoms in my PDB file with HD2 should solve the error.
And it did. For the time being.
To reaffirm, I made changes in Histidine's hydrogen atoms in the PDB file. When I went ahead with the energy minimization step, I got an error that said there's an Infinite Force on an atom. It turns out that the atom was "HD2" of some Histidine in the PDB file.
I saw online that the reason behind this error was due to atom overlap. Hence, just for seeing if that was the case for me, I changed the coordinates of that atom a little bit (this was just for checking, I can't do this for the actual work). When I ran the EM step again, I got the same error, but for a HD2 of a different Histidine molecule. So yes, overlapping of the atoms is the reason for this particular error. I cannot solve it by changing coordinates of all the HD2 atoms of the Histidines. So it all boils down to the main fatal error that I mentioned.
How do I approach this?
1. Changing the atom name (as in HD1 -> HD2 is not working due to the subsequent error)
2. I do not know if I should add the atom HD1 in the HSE entry in the rtp file (I tried this and got several warnings).
3. I cannot (or should I?) use -ignh because mine is not an NMR structure. I have modelled my proteins with Modeller and refined them online.
Any suggestions/solutions will help me a lot. Thank you in advance!
Relevant answer
Answer
Hi, a crude measure is to use -ignh during pdb2gmx; it will rebuild all the H atoms based on the force field you are using. Most of the time it is a reasonable choice (though not always), as H atoms are mostly absent in crystallographic structures (their positions are difficult to resolve).
Histidine is unique in the sense that its side chain offers multiple H-bonding options at physiological pH. The better procedure is to check which heavy atom of the His side chain is forming an H-bond in the protein (delta or epsilon), and rename your His residues accordingly (HIS, HIE, HID; please check how these residues are named in your FF).
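If the modelled structure already contains hydrogens, a quick way to see which tautomer each histidine was built as is to look at which ring N-H atoms are present. The sketch below is a hypothetical helper, not part of GROMACS: it assumes standard fixed-column PDB ATOM records and the common atom names HD1 (on ND1) and HE2 (on NE2). The residue names in the output (HID/HIE/HIP in AMBER-style and HSD/HSE/HSP in CHARMM-style force fields) are the usual conventions, but check your own force field.

```python
from collections import defaultdict

# Residue names under which histidine commonly appears
HIS_NAMES = {"HIS", "HID", "HIE", "HIP", "HSD", "HSE", "HSP"}


def classify_histidines(pdb_lines):
    """Map (chain, residue number) to the protonation state implied by its hydrogens."""
    ring_h = defaultdict(set)
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        if line[17:20].strip() not in HIS_NAMES:
            continue
        key = (line[21], int(line[22:26]))
        ring_h[key]  # register the residue even if it has no ring N-H
        atom_name = line[12:16].strip()
        if atom_name in ("HD1", "HE2"):
            ring_h[key].add(atom_name)

    states = {}
    for key, found in ring_h.items():
        if found == {"HD1", "HE2"}:
            states[key] = "both (HIP/HSP)"
        elif found == {"HD1"}:
            states[key] = "delta (HID/HSD)"
        elif found == {"HE2"}:
            states[key] = "epsilon (HIE/HSE)"
        else:
            states[key] = "undetermined"
    return states


def report(pdb_path):
    """Print the inferred state of every histidine in a PDB file."""
    with open(pdb_path) as handle:
        for (chain, number), state in sorted(classify_histidines(handle).items()):
            print(f"chain {chain} residue {number}: {state}")
```

A residue reported as "delta" still carries HD1, so it cannot be matched against an epsilon-type entry such as HSE; renaming the residue to the delta form (or letting pdb2gmx rebuild the hydrogens) is cleaner than renaming the atom.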
"2. I do not know if I should add the atom HD1 in the HSE entry in the rtp file (I tried this and got several warnings)." > Try not to mess with rtp entry at this stage (as you are very new), and if you like to play around, just make a backup of FF directory and do as you like.
  • asked a question related to Bioinformatics
Question
2 answers
I've been trying to learn more about bioinformatics pipelines for whole-genome shotgun sequencing data, to use on animal fecal samples to assess microbial diversity and identify pathogenic microorganisms (both DNA and RNA).
Relevant answer
Answer
Dear Dr Abhijeet Singh, thank you so much.
  • asked a question related to Bioinformatics
Question
3 answers
I have tried to separate a direct coculture of MSCs (mesenchymal stromal cells) and macrophages to do bulk RNA-seq on the macrophages, as I want to find out how MSCs change gene expression in macrophages. I have tried different methods to separate the coculture as much as possible, but I can only manage to retrieve a cell population with 95% macrophages, with 5% MSCs still present.
Therefore, I want to know if anyone has experience with analyzing data when the population is not completely pure with one cell type and how do I handle such data?
Is it wise to proceed with bulk RNA seq when 5% of my cells are still MSCs, well aware that the expressed genes observed could come from the 5% MSCs?
Relevant answer
Answer
Dear Kian,
have you tried improving your purity by FACS? It's fairly easy to choose markers to distinguish MSCs and macrophages and sort highly pure populations.
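To get a feel for how much 5% contamination can matter, a simple linear mixing model is a useful back-of-the-envelope check. The sketch below is an assumption-laden illustration, not an analysis method: it assumes every cell contributes the same amount of RNA (which may not hold if MSCs carry more RNA per cell than macrophages), and the expression values are invented.

```python
def observed_expression(macrophage_level, msc_level, msc_fraction=0.05):
    """Expected bulk signal for one gene under a linear mixing model."""
    return (1.0 - msc_fraction) * macrophage_level + msc_fraction * msc_level


def msc_share_of_signal(macrophage_level, msc_level, msc_fraction=0.05):
    """Fraction of the bulk signal for one gene that comes from the MSCs."""
    total = observed_expression(macrophage_level, msc_level, msc_fraction)
    if total == 0:
        return 0.0
    return msc_fraction * msc_level / total


# Invented values: a gene silent in macrophages but high in MSCs
print(observed_expression(0.0, 200.0))   # apparent expression in the "macrophage" sample
print(msc_share_of_signal(0.0, 200.0))   # all of it comes from the contaminating MSCs

# Invented values: a gene expressed equally in both cell types
print(msc_share_of_signal(50.0, 50.0))   # contamination contributes only its 5% share
```

Under these assumptions, genes that are highly MSC-specific are the ones most likely to appear as false positives, so they deserve extra scrutiny if sorting to higher purity is not possible.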
  • asked a question related to Bioinformatics
Question
6 answers
Risk of bias assessment (sometimes called "quality assessment" or "critical appraisal") helps to establish the transparency of evidence synthesis results and findings, and it is mandatory to have it in your systematic review!
If you know of any tools, or have used any, can you please share them with me?
Or if you have extra information regarding risk of bias assessments, can you share it with me?
Relevant answer
Answer
Systematic reviews and meta-analyses are proliferating, as they are an important building block to inform evidence-based guidelines and decision-making. Enforcement of best practice in clinical trials is firmly on the research agenda of good clinical practice, but there is less clarity as to how evidence syntheses that combine these studies can be influenced by bad practice. Our aim was to conduct a living systematic review of articles that highlight flaws in published systematic reviews to formally document and understand these problems...
Many hundreds of articles highlight that there are many flaws in the conduct, methods and reporting of published systematic reviews, despite the existence and frequent application of guidelines. Considering the pivotal role that systematic reviews have in medical decision-making due to having apparently transparent, objective and replicable processes, a failure to appreciate and regulate problems with these highly cited research designs is a threat to credible science...
  • asked a question related to Bioinformatics
Question
2 answers
Hello everybody. I need help, please.
When basecalling fast5 files using the genomicpariscentre/guppy image in Docker, I have tried many of the flowcell and kit names available in guppy (I printed them with guppy_basecaller --print_workflows), but I receive this error: could not find matching workflow for flowcell XXX and kit XXX.
What can I do to fix it?
Relevant answer
Answer
You need to know the flowcell and kit that were used to generate the sequence data/fast5 files. Printing the workflows will not help much if you don't know which ones were actually used.