Computational Biology - Science topic

A field of biology concerned with the development of techniques for the collection and manipulation of biological data, and the use of such data to make biological discoveries or predictions. This field encompasses all computational methods and theories applicable to MOLECULAR BIOLOGY and areas of computer-based techniques for solving biological problems including manipulation of models and datasets.
Questions related to Computational Biology
  • asked a question related to Computational Biology
Question
4 answers
I have lists of gene IDs from multiple species and want a compiled FASTA file for each species. Copying each accession and collecting the FASTA sequences one by one looks tedious.
Batch Entrez is giving me an error, maybe because the identifiers belong to another database.
Relevant answer
Answer
In my case, for the first gene list, which has TAIR identifiers, I got the FASTA sequence file from TAIR > Download > Bulk Data Retrieval > Sequences.
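For identifiers that do belong to an NCBI database, batch retrieval can also be scripted instead of copying accessions by hand. Below is a minimal sketch, assuming the IDs are NCBI nucleotide accessions (TAIR or other non-NCBI identifiers will not resolve this way) and using the NCBI E-utilities `efetch` endpoint. The function names and the batch size are illustrative choices, not part of any tool mentioned above.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities efetch endpoint
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_fasta_url(accessions, db="nuccore"):
    """Build an efetch URL that returns the given records as FASTA text."""
    query = urlencode({
        "db": db,                      # assumption: nucleotide database
        "id": ",".join(accessions),
        "rettype": "fasta",
        "retmode": "text",
    })
    return EFETCH + "?" + query


def chunks(items, size):
    """Split a long ID list into smaller batches."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def download_fasta(accessions, out_path, batch_size=100):
    """Fetch all accessions batch by batch into one FASTA file (needs network access)."""
    with open(out_path, "w") as out:
        for batch in chunks(accessions, batch_size):
            with urlopen(efetch_fasta_url(batch)) as response:
                out.write(response.read().decode("utf-8"))
```

Calling `download_fasta` once per species list would give one compiled FASTA file per species.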
  • asked a question related to Computational Biology
Question
2 answers
Is there an updated list of all the approved biological databases with a brief description of each DB?
Relevant answer
Answer
Approved by FDA, but not only FDA;
By Approved I mean reliable.
  • asked a question related to Computational Biology
Question
2 answers
I am a young bioinformatics student and would like some clues for my project pipeline. Hints and expert answers are welcome. Thanks.
Relevant answer
Answer
I'd start with inferring the amino acid sequence and then BLAST it against proteins of known function. That way you can leverage what is known about the homologous, known proteins to build your starting hypothesis about your unknown protein. You can then design more targeted experiments to test your hypothesis.
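As a rough illustration of the first step, here is a minimal sketch of inferring the amino acid sequence before running BLAST. It assumes the nucleotide sequence is a coding sequence already in reading frame 1 and uses the standard genetic code; the function name `translate` is only illustrative.

```python
# Standard genetic code, codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence in frame 1, stopping at the first stop codon."""
    seq = cds.upper().replace("U", "T")   # accept RNA as well as DNA
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        # codons with ambiguous bases are reported as X
        amino_acid = CODON_TABLE.get(seq[pos:pos + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)
```

The resulting protein string can then be pasted into a protein BLAST search.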
  • asked a question related to Computational Biology
Question
4 answers
I'm currently trying to figure out the specific interactions between two specific proteins.
Unfortunately, structures of this protein complex aren't available experimentally (e.g. X-ray crystallography or cryo-EM data).
Thus, our research group has worked together with a computational biology research group and created an AlphaFold predicted complex structure.
Because there were steric clashes and some inaccuracy in the initially predicted structure, they performed an additional 'refinement' process.
Is MD simulation after this refinement mandatory? i.e. Is the 'refined' structure accurate enough to predict any specific amino acid interactions between the two protein subunits using PDBePISA, etc?
Or conversely, is the MD simulation-applied structure still not accurate enough for further protein interface analysis?
Relevant answer
Answer
Unfortunately, there is not an easy answer for that as far as I know. The answer will depend on many factors like:
  • Length of disordered region
  • Conformational change required for the oligomeric state (or conformational change may not be required at all)
  • Experimental evidence that you already have, etc.
I would probably check the initial complex structure with Robetta as well. If individual structural components of the complex are available, I would try protein-protein docking approaches as well.
All of these factors -and probably some others- will affect what you can obtain with MD simulations.
  • asked a question related to Computational Biology
Question
1 answer
Hello everyone,
Can someone help me with this.
I have run a complex of DNA, protein and RNA in NAMD for 50 ns. However, by the end of the simulation, the base pairing of the DNA and of the DNA-RNA hybrid was completely disrupted. What should I do in order to get stable DNA and RNA in the simulation? I used the CHARMM36 force field. The simulation was done at 310 K.
Relevant answer
Answer
You may perform MD simulation with harmonic position restraints on the heavy atoms. This allows the solvent to equilibrate around the complex/DNA/RNA without disturbing the structure. For further details, check out the NAMD manual.
  • asked a question related to Computational Biology
Question
3 answers
Hello.
I am trying to find differentially abundant microbes between two conditions. I have the relative abundance data but not the absolute read counts.
Is there any method that considers relative abundance data as input?
or any way to transform this data before use?
Regards,
Pratyay
Relevant answer
Answer
ANCOM-BC (ANCOM with Bias Control). You can run it easily in R using your relative abundance data without having to transform it. Here is some documentation you may find helpful:
  • asked a question related to Computational Biology
Question
4 answers
I want to know which institutes in India offer synthetic biology courses.
Relevant answer
Answer
  • asked a question related to Computational Biology
Question
3 answers
I need to superimpose and compare 4 PDB structures. The align and super commands in PyMOL only overlay 2 structures at a time. What is the PyMOL command for superimposing more than 2 structures at a time?
Relevant answer
Answer
In PyMOL you can use the "A > align > all to this" menu option to align all the open structures to a particular structure in a single step.
For multiple structure alignment you can also use VMD MultiSeq (https://www.ks.uiuc.edu/Training/Tutorials/vmd/tutorial-html/node7.html).
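If a scripted route is preferred, the pairwise `super` (or `align`) command can simply be repeated against one reference. The sketch below only generates the PyMOL command lines as text; the object names in the usage note are hypothetical and must match the objects actually loaded in the session.

```python
def superposition_script(reference, mobiles, method="super"):
    """Return PyMOL command lines that superimpose each mobile object onto the reference."""
    if method not in ("align", "super"):
        raise ValueError("method must be 'align' or 'super'")
    # PyMOL syntax: <method> <mobile selection>, <target selection>
    lines = [f"{method} {name}, {reference}" for name in mobiles]
    return "\n".join(lines) + "\n"
```

For example, `superposition_script("struct1", ["struct2", "struct3", "struct4"])` gives three `super` lines that can be saved to a `.pml` file and run inside PyMOL with `@file.pml`.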
  • asked a question related to Computational Biology
Question
3 answers
Hi everyone. I am currently working on viral hemorrhagic fevers and need to dock lead molecules with the RNA-dependent RNA polymerase enzyme. My question is: is the structure of RdRp (RNA-dependent RNA polymerase) the same for different viruses like Ebola, dengue, West Nile, etc., or is there a specific RdRp for each virus? I have searched the PDB but could not find the Ebola RdRp. Is there any other database where I can find it?
Relevant answer
Answer
No, RNA-dependent RNA polymerases differ between viruses, and you can model the protein based on reference RdRps from other viruses.
  • asked a question related to Computational Biology
Question
4 answers
My final-year project is on drug efflux pumps and persistence in methicillin-resistant Staphylococcus aureus, and we are going to focus on persister cells to study the pathway of antimicrobial resistance. My question is: how can I link bioinformatics and some coding to this project without requiring WGS, since it's not an option in our lab? I need a small yet beneficial technique or tool that I can learn and implement by myself. PS: I love programming in general, but I'm still new to bioinformatics, so I need help linking my passion for coding with my field, biotechnology.
Relevant answer
Answer
Please have a look at our (Eminent Biosciences (EMBS)) collaborations, and let me know if you are interested in associating with us.
Our recent publications in collaboration with industry and academia in India and worldwide:
Our Lab EMBS's Publication In collaboration with Universidad Tecnológica Metropolitana, Santiago, Chile. Publication Link: https://pubmed.ncbi.nlm.nih.gov/33397265/
Our Lab EMBS's Publication In collaboration with Moscow State University , Russia. Publication Link: https://pubmed.ncbi.nlm.nih.gov/32967475/
Our Lab EMBS's Publication In collaboration with Icahn Institute of Genomics and Multiscale Biology,, Mount Sinai Health System, Manhattan, NY, USA. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/29199918
Our Lab EMBS's Publication In collaboration with University of Missouri, St. Louis, MO, USA. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/30457050
Our Lab EMBS's Publication In collaboration with Virginia Commonwealth University, Richmond, Virginia, USA. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/27852211
Our Lab EMBS's Publication In collaboration with ICMR- NIN(National Institute of Nutrition), Hyderabad Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/23030611
Our Lab EMBS's Publication In collaboration with University of Minnesota Duluth, Duluth MN 55811 USA. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/27852211
Our Lab EMBS's Publication In collaboration with University of Yaounde I, PO Box 812, Yaoundé, Cameroon. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/30950335
Our Lab EMBS's Publication In collaboration with Federal University of Paraíba, João Pessoa, PB, Brazil. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/30693065
Our Lab EMBS's Publication In collaboration with University of Yaoundé I, Yaoundé, Cameroon. Publication Link: https://pubmed.ncbi.nlm.nih.gov/31210847/
Our Lab EMBS's Publication In collaboration with University of the Basque Country UPV/EHU, 48080, Leioa, Spain. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/27852204
Our Lab EMBS's Publication In collaboration with King Saud University, Riyadh, Saudi Arabia. Publication Link: http://www.eurekaselect.com/135585
Our Lab EMBS's Publication In collaboration with NIPER , Hyderabad, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/29053759
Our Lab EMBS's Publication In collaboration with Alagappa University, Tamil Nadu, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/30950335
Our Lab EMBS's Publication In collaboration with Jawaharlal Nehru Technological University, Hyderabad , India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/28472910
Our Lab EMBS's Publication In collaboration with C.S.I.R – CRISAT, Karaikudi, Tamil Nadu, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/30237676
Our Lab EMBS's Publication In collaboration with Karpagam academy of higher education, Eachinary, Coimbatore , Tamil Nadu, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/30237672
Our Lab EMBS's Publication In collaboration with Ballets Olaeta Kalea, 4, 48014 Bilbao, Bizkaia, Spain. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/29199918
Our Lab EMBS's Publication In collaboration with Hospital for Genetic Diseases, Osmania University, Hyderabad - 500 016, Telangana, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/28472910
Our Lab EMBS's Publication In collaboration with School of Ocean Science and Technology, Kerala University of Fisheries and Ocean Studies, Panangad-682 506, Cochin, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/27964704
Our Lab EMBS's Publication In collaboration with CODEWEL Nireekshana-ACET, Hyderabad, Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/26770024
Our Lab EMBS's Publication In collaboration with Bharathiyar University, Coimbatore-641046, Tamilnadu, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/27919211
Our Lab EMBS's Publication In collaboration with LPU University, Phagwara, Punjab, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/31030499
Our Lab EMBS's Publication In collaboration with Department of Bioinformatics, Kerala University, Kerala. Publication Link: http://www.eurekaselect.com/135585
Our Lab EMBS's Publication In collaboration with Gandhi Medical College and Osmania Medical College, Hyderabad 500 038, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/27450915
Our Lab EMBS's Publication In collaboration with National College (Affiliated to Bharathidasan University), Tiruchirapalli, 620 001 Tamil Nadu, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/27266485
Our Lab EMBS's Publication In collaboration with University of Calicut - 673635, Kerala, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/23030611
Our Lab EMBS's Publication In collaboration with NIPER, Hyderabad, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/29053759
Our Lab EMBS's Publication In collaboration with King George's Medical University, (Erstwhile C.S.M. Medical University), Lucknow-226 003, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/25579575
Our Lab EMBS's Publication In collaboration with School of Chemical & Biotechnology, SASTRA University, Thanjavur, India Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/25579569
Our Lab EMBS's Publication In collaboration with Safi center for scientific research, Malappuram, Kerala, India. Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/30237672
Our Lab EMBS's Publication In collaboration with Dept of Genetics, Osmania University, Hyderabad Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/25248957
Our Lab EMBS's Publication In collaboration with Institute of Genetics and Hospital for Genetic Diseases, Osmania University, Hyderabad Publication Link: https://www.ncbi.nlm.nih.gov/pubmed/26229292
Sincerely,
Dr. Anuraj Nayarisseri
Principal Scientist & Director,
Eminent Biosciences.
Mob :+91 97522 95342
  • asked a question related to Computational Biology
Question
5 answers
I am a master's student in statistics. I have worked in econometrics and have taken on projects in machine learning. However, I wish to change fields. Can I find a supervisor who would be willing to mentor me in bioinformatics, taking my previous and current research areas into consideration? Or do I need another master's degree in bioinformatics or a related field before I can proceed to a PhD?
Relevant answer
Answer
Try PhD in Bioinformatics
  • asked a question related to Computational Biology
Question
4 answers
Hello all
Is there any reliable and free web server that runs molecular dynamics simulation of proteins?
Relevant answer
Answer
These days journals demand 100 ns simulations, and online tools are not reliable for MD simulation. The best solution for MD simulation is always GROMACS or Schrödinger.
  • asked a question related to Computational Biology
Question
6 answers
I have been asked to check the gene expression patterns of the cells in an RNA-seq dataset after producing a principal component analysis plot using MATLAB. I have a CSV file that has the principal component values stored, but I am not sure how to perform differential expression analysis using the PC values. Is there any MATLAB function available? Kindly help me. Thanks in advance.
Relevant answer
Answer
I am preparing an E-Readiness Index for farmers, extension personnel and agricultural scientists separately, to measure the degree to which an individual can utilize ICT tools in agriculture. I have selected sub-groups as well as indicators for it, but I am stuck on how to obtain the e-readiness score. From my reading of the literature I realized that PCA or factor analysis provides relevancy and accuracy to indicators, but my confusion is when to apply PCA. On which data: the data obtained from pre-testing or the final collected data? Or without any data collected? Please guide.
  • asked a question related to Computational Biology
Question
5 answers
Kia ora,
We have been trying to find the anatomical substance of acupuncture points in the skin. While conducting experiments, we found a new anatomical structure in rat skin. In this structure, the mRNA of a gene of unknown function is very highly expressed. We know the exact nucleic acid sequence of this gene. What experiments need to be done to find out the function of this gene? Can it be done through a computational biology study?
Any comments would be greatly appreciated.
Warm regards,
Kiho Lee
Relevant answer
Answer
You should first BLAST your sequence against a database such as NCBI (https://blast.ncbi.nlm.nih.gov/Blast.cgi), using the right genome, and see whether the results give you a gene name. Other databases such as KEGG or GeneCards would give you an idea of the type of gene you found.
best
fred
  • asked a question related to Computational Biology
Question
3 answers
Hello, research community,
I am looking for some open problems in bioinformatics specifically in the area of, but not limited to, proteomics, and genomics. Since I am new to this area, any useful suggestions, a discussion on open problems and relevant resources are welcome.
Thanks.
Rahul
Relevant answer
Answer
examples:
Protein structure prediction
single cell RNA/DNA unsupervised learning/clustering
correlation of gene expression & variation with clinical outcomes
many more open problems tbh
e.g. see DeepVariant by google research
  • asked a question related to Computational Biology
Question
3 answers
I have data (shown in the attached picture) where I have RNA-seq data for various samples, with the same gene appearing twice.
Now suppose that for sample 1 I want to measure the gene (which is haplotypic in nature): how do I treat its RNA-seq values for sample 1? Do I take the average, do I take the median, or should I consider these two versions of the gene as separate genes? I guess a biologist could give a better explanation.
Relevant answer
Answer
From what I see, the RNA-seq data presented is already pre-processed, since raw RNA-seq data would be integer counts. It's important to know how it was pre-processed. There are much more advanced ways of analyzing such data than taking a median or a mean. I would suggest looking into differential gene expression analysis for haplotypes.
  • asked a question related to Computational Biology
Question
14 answers
I want to know about the topics in bioinformatics that are being focused on recently.
Relevant answer
Answer
My advice to you is to focus on genomic analysis of SARS-CoV-2, mutational analysis in SARS-CoV-2, drug design or pathway analysis.
  • asked a question related to Computational Biology
Question
9 answers
I have RNA/EST sequences and a genome sequence. I would like to identify the intron splice sites with the 5' GT-3' AG bias.
Relevant answer
Answer
Thank you @Katharina Hoff's.
  • asked a question related to Computational Biology
Question
11 answers
This would help a Newbie who wants to go into bioinformatics and/or computational biology and wants to grow exponentially in the field.
Note: Information on this will be posted in Bioinformatics.co.ke
Add (Y) to be named and (N) not to be named. Without any of these, Names won't be mentioned.
Relevant answer
Answer
To improve your skills in bioinformatics and computational biology, having coding skills and mastering Linux, NGS, Python, Julia, R, C, C++, Fortran, etc., is very helpful.
  • asked a question related to Computational Biology
Question
3 answers
Do you know an ISI open access journal in one of the following topics with a quick reviewing process and a short time to first decision, and also a short time from revision to acceptance?
I prefer journals with impact factor ranging from 1.5 to 3.3.
bioinformatics
systems biology
computational biology
genomics
genomics data analysis
omics data analysis
dermatology
Relevant answer
Answer
Hi, Dear Reyhaneh Naderi,
You can search in these below links:
  • asked a question related to Computational Biology
Question
24 answers
Hello friends, today I am raising a concern: what are real palindromic DNA sequences? Of course you will say restriction enzyme sites, but through a video available at the link http://bit.ly/palindromicDNA, I am raising the issue that, in the true sense, mirror repeats are palindromic in nature as defined by standard English dictionaries. There are many unique properties of mirror-repeat DNA which I will share later. Hopefully the biological scientific community will accept mirror repeats as true English palindromes. So please check out http://bit.ly/palindromicDNA
Relevant answer
Answer
If you define palindromes as being the same whether you read them forwards or backwards, mirror repeats are not palindromes, because DNA-recognising proteins recognise the DNA double strand. The sequence of the reverse strand is implied by base complementarity:
5'-GGATCC-3' implies the sequence 5'-GGATCC-3' on the reverse strand, and therefore is palindromic when looking at the double strand, while
5'-GTGGACCAGGTG-3' would imply 5'-CACCTGGTCCAC-3' on the reverse strand, and therefore is not.
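The distinction between the two definitions is easy to check in code. This is a small sketch with illustrative function names: a sequence is palindromic in the double-strand sense when it equals its own reverse complement, and a mirror repeat when it equals its own reversal on a single strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_reverse_complement_palindrome(seq):
    """True if both strands read the same 5' to 3' (restriction-site style)."""
    return seq.upper() == reverse_complement(seq)


def is_mirror_repeat(seq):
    """True if the single strand reads the same forwards and backwards."""
    s = seq.upper()
    return s == s[::-1]
```

With the two sequences from the answer, GGATCC passes only the first test and GTGGACCAGGTG passes only the second.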
  • asked a question related to Computational Biology
Question
3 answers
I would like to take the IUPAC names or 2D structures of chemical compounds mentioned in publications which are not free. But these compound names are mentioned in the abstract or supporting information, which is free.
Are there any copyright issues with the journals concerned if I make use of the IUPAC names or the 2D structures of the compounds in my in silico research?
The structures of these compounds are freely available in chemical databases which are in the public domain.
So is it legal to make use of the compound structures in my computational work, which is non-commercial in nature?
Relevant answer
Answer
The structures are in the public domain and, most certainly, you can find them free of charge in public online databases such as ChEMBL or PubChem (in fact, I'd rather recommend compiling your data set from these databases than from journals).
  • asked a question related to Computational Biology
Question
2 answers
I am looking for a comprehensive compiled list of variant databases and their date of development. I am also looking out for their respective links to get more details on each of these databases.
Grateful if someone could help me in this.
Thank you in advance.
Cheers
ajit
Relevant answer
Answer
Yes, there are dbSNP and Ensembl, which can provide the list of variants in a gene, genome or protein. My team is working on SARS-CoV-2 variants. So for your study you can easily retrieve the variant datasets from the dbSNP and Ensembl databases.
  • asked a question related to Computational Biology
Question
5 answers
Biology related Journals list covers (life science, biotechnology, computational biology, medical, bioinformatics, cancer research etc.) (On Request)
here is the link:
Deleted research item The research item mentioned here has been deleted
Relevant answer
Answer
Hi, Dear Dilraj Kaur,
you can search in this below link:
  • asked a question related to Computational Biology
Question
1 answer
PSSM (position-specific scoring matrix) is one of the key features used for B-cell conformational epitope prediction, but I am confused about how to use it as a feature.
Relevant answer
Answer
You can use PFeature:
go to the evolutionary info section.
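For intuition, one common way to turn a PSSM into per-residue features (not necessarily what PFeature computes) is a sliding window: each residue is described by the PSSM rows of itself and its neighbours, concatenated into one vector. The sketch below assumes the PSSM is already parsed into a list of rows, one row of scores per residue; the window size and zero padding at the termini are illustrative choices.

```python
def pssm_window_features(pssm, window=3):
    """Return one feature vector per residue: the PSSM rows in a centred window, concatenated."""
    if window % 2 == 0:
        raise ValueError("window must be odd so that it can be centred on a residue")
    half = window // 2
    zero_row = [0.0] * len(pssm[0])     # padding for positions beyond the termini
    features = []
    for i in range(len(pssm)):
        vector = []
        for j in range(i - half, i + half + 1):
            row = pssm[j] if 0 <= j < len(pssm) else zero_row
            vector.extend(row)
        features.append(vector)
    return features
```

With a real 20-column PSSM and a window of 15, each residue gets a 300-value vector that can be fed to a classifier together with its epitope/non-epitope label.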
  • asked a question related to Computational Biology
Question
3 answers
Hi everyone, I am using the PyTMs plugin in PyMOL to phosphorylate a particular threonine in my protein. I am having difficulty selecting only one or two residues; it seems I can only phosphorylate all serines/threonines. Does anyone know how to fix this?
Relevant answer
Answer
You can try the Maestro software to edit (phosphorylate) a particular residue, with energy minimization. Energy minimization is a required step when converting (mutating) or editing (e.g. phosphorylating) a residue, so Maestro is very suitable software for it.
Maestro is also Schrödinger software, like PyMOL. It is free for academic use.
  • asked a question related to Computational Biology
Question
3 answers
#autodock #virtualscreening
Question 1: Am I doing this right? That is, does setting up conda on a server work for virtual screening (AutoDock)?
Question 2: How can I modify the script (submit4.py) according to my server requirements?
Please read below for a detailed explanation of the question.
Hey
I am new to Virtual Screening.
To learn this I started with the tutorial named “Using AutoDock 4 for Virtual Screening” (attaching PDF) (http://autodock.scripps.edu/faqs-help/tutorial/using-autodock4-for-virtual-screening).
I was able to replicate the results (up to exercise 11) on my local machine.
Now I am trying to replicate the section named “Using the TSRI cluster: garibaldi” on my college server (page 32 in the attached PDF).
I do not have sudo rights on my college server.
So what I did was:
1) Installed conda on the server and made a virtual environment there.
2) Installed AutoDock, AutoDock Vina, AutoDockTools and MGLTools in the conda environment.
3) Then I downloaded the file “submit4.py” and kept it in the path (here, in the bin directory of my conda environment). I changed the default path in the script (attaching the script of submit4.py).
4) When I launch my jobs, I get this error:
“sh: 7: qsub:Permission denied”.
I traced this problem back to the 32nd line of the submit4.py script.
The line is:
“ qsub -l cput=23:00:00 -l nodes=1:ppn=1 -l walltime=23:30:00 -l mem=512mb %s.j >> %s ”
----------
**so my questions are:**
Question 1: Am I doing this right? That is, does setting up conda like this work for virtual screening?
Question 2: How can I modify the script (submit4.py) according to my server requirements?
The script for submit4.py:
```
#!/usr/bin/env python
#
# Usage: submit4.py stem ndlgs
import sys, posix, time
path = "/home/tushar19221/anaconda3/envs/tushar_env/bin/autodock4"
stem = sys.argv[1]
ndlgs = int(sys.argv[2])
ndlg_start = 1
if (len(sys.argv) == 4):
    ndlg_start = int(sys.argv[3])
cwd = posix.getcwd()
created = time.time()
jobIDsName = """%s.%.2f.jobIDs""" % (stem, created)
command = """touch %s\n""" % (jobIDsName,)
posix.system(command)
for i in xrange(ndlg_start, (ndlg_start + ndlgs)):
    #
    jobname = """%s.%03d""" % (stem, i)
    #
    command = """echo "ulimit -s unlimited
echo SHELL is $SHELL
echo PATH is $PATH
cd %s
/home/tushar19221/anaconda3/envs/tushar_env/bin/autodock4 -p %s.dpf -l %s.dlg" > %s.j
chmod +x %s.j
qsub -l cput=23:00:00 -l nodes=1:ppn=1 -l walltime=23:30:00 -l mem=512mb %s.j >> %s
""" % (cwd, path, stem, jobname, jobname, jobname, jobIDsName)
    #
    posix.system(command)
    #
# next i
command = """echo "Job %s was launched on %d processors with these
job_identifiers:"
cat %s\n""" % (stem, ndlgs, jobIDsName,)
posix.system(command)
```
Thank you for reading.
Your help is highly appreciated.
Relevant answer
Answer
Will you please send me the complete error details?
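While waiting for those details, two things in the posted script may be worth checking. The message "qsub: Permission denied" comes from the shell when it tries to run `qsub`, which suggests that `qsub` is not executable for this account or that the server uses a different scheduler; the cluster administrators can confirm which submit command to use. Also, the format tuple still passes `path` although the AutoDock path is hard-coded in the template, so `-p` appears to receive the path rather than the stem. Below is a Python 3 sketch of the same logic under those assumptions; `submit_cmd` and the resource flags are placeholders copied from the tutorial script and may need to be changed for the actual scheduler.

```python
import os
import shutil
import subprocess


def build_job_script(cwd, autodock, stem, jobname):
    """Return the text of a shell job that runs one AutoDock 4 docking."""
    return (
        "#!/bin/sh\n"
        "ulimit -s unlimited\n"
        f"cd {cwd}\n"
        f"{autodock} -p {stem}.dpf -l {jobname}.dlg\n"
    )


def submit_jobs(stem, ndlgs, autodock, start=1, submit_cmd="qsub"):
    """Write one job file per docking and submit it; returns the scheduler's replies."""
    # shutil.which returns None if the command is missing or not executable
    if shutil.which(submit_cmd) is None:
        raise RuntimeError(f"{submit_cmd} not found or not executable for this account")
    cwd = os.getcwd()
    job_ids = []
    for i in range(start, start + ndlgs):
        jobname = f"{stem}.{i:03d}"
        script = f"{jobname}.j"
        with open(script, "w") as handle:
            handle.write(build_job_script(cwd, autodock, stem, jobname))
        os.chmod(script, 0o755)
        # resource requests taken from the tutorial script; adjust for the local cluster
        result = subprocess.run(
            [submit_cmd, "-l", "cput=23:00:00", "-l", "nodes=1:ppn=1",
             "-l", "walltime=23:30:00", "-l", "mem=512mb", script],
            capture_output=True, text=True, check=True,
        )
        job_ids.append(result.stdout.strip())
    return job_ids
```

If the server runs a scheduler other than PBS/Torque, only the argument list passed to `subprocess.run` should need changing.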
  • asked a question related to Computational Biology
Question
6 answers
Hope everyone is having a good day.
I want to learn computational biology. I have a PhD in pharmacology. I have often heard about computational biology/bioinformatics but never had a guideline on how to learn it or how to get started in this interesting field of research.
It would be very helpful if you can guide me through this.
Have a nice day.
Relevant answer
Answer
Dear Apu, bioinformatics is a means, not an end in itself, and the confusion of a means with an end is the reason for the dramatic crisis that science (not technology) is experiencing these days (see for example https://www.pnas.org/content/113/34/9384.short).
Thus, first of all, you must acquire a 'quantitative sensibility' for biological problems. That means: in the face of a biological problem, how to restate the issue so as to recognize simply which are the statistical units, the variables of interest and the most interesting scale at which to look, and whether one can provide a suitable metric preserving the original biological meaning.
Then the informatics will come by itself. This means you must learn statistics (with a special emphasis on multidimensional descriptive methods like PCA, cluster analysis, MDS...), complex network analysis, the fundamentals of non-linear dynamics (what an attractor is, what a transition is) and the fundamentals of probability.
Attached you will find a sketchy representation of the quantitative needs for facing biological problems.
  • asked a question related to Computational Biology
Question
5 answers
Hello, I'm a PhD student. In my master's thesis, I investigated the cytotoxic, apoptotic and cell cycle effects of an anticancer drug (Danusertib) on pancreatic cancer cells (CFPAC-1 and Mia-PaCa-2) using xCELLigence and flow cytometry in a cell culture lab.
However, I want to do my PhD thesis with virtual experiments using databases (OMIM, COSMIC, GAD, TCGA) and computing power (maybe on Amazon Web Services, Google Cloud or Azure), due to limited funding and because I like to spend time with computers. I don't know where to start researching these things. Can I do sound research with these databases? Can anyone give a tip or advice?
Relevant answer
Answer
Yes, you can use these datasets for research work equivalent to a PhD thesis. As a reference, you can check the publications by TCGA and other groups which utilized TCGA data. A series of these publications have been published by Cell Press as TCGA-Pan Cancer Atlas.
You can see purely in silico work published in the Cell Press journals. But before going to that extent, i.e., relying entirely on these datasets, think about what novel question you can address. If you have a highly relevant question, you can go for it. Otherwise, a simple and safe plan can be hypothesis generation from datasets followed by validation using in vitro studies, or vice versa. This type of combined work is regularly published and will be more acceptable to most universities and individuals. All the best.
You can check our papers also where we have used simple tools to analyze TCGA data.
  • asked a question related to Computational Biology
Question
7 answers
Hi everyone, I am calculating the surface area per lipid of a 200 ns membrane trajectory using MEMPLUGIN of VMD, and the calculation seems to be too slow. Therefore, I would like to ask: is there any other way to calculate the surface area per lipid of a 200 ns membrane trajectory, other than using MEMPLUGIN of VMD? If there is, please let me know; or if you could tell me how to use more processors while calculating the surface area per lipid, that would also be very helpful. Please give your kind suggestions.
Thanking you in advance
Nandan Kumar
Relevant answer
Answer
Membplugin for VMD takes forever, agreed. Please do try the APL@Voro tool. It is specifically designed to calculate membrane thickness and area per lipid per leaflet. It is made to be compatible with GROMACS. The 2D and 3D plotting features are great.
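If only the average area per lipid is needed, a quick box-based estimate avoids per-lipid analysis altogether: the lateral box area divided by the number of lipids in one leaflet. This is a sketch under stated assumptions (a planar bilayer spanning the whole x-y plane of the box, no embedded protein, equal leaflets); the function names and the way box dimensions are supplied are illustrative, and it does not give the per-lipid values that a Voronoi-based tool reports.

```python
def area_per_lipid(box_x, box_y, lipids_per_leaflet):
    """Average area per lipid for one frame: lateral box area / lipids in one leaflet."""
    if lipids_per_leaflet <= 0:
        raise ValueError("lipids_per_leaflet must be positive")
    return box_x * box_y / lipids_per_leaflet


def mean_area_per_lipid(frames, lipids_per_leaflet):
    """Average over a trajectory; frames is a list of (box_x, box_y) pairs, one per frame."""
    values = [area_per_lipid(x, y, lipids_per_leaflet) for x, y in frames]
    return sum(values) / len(values)
```

The box dimensions per frame can be exported from the trajectory with the MD engine's own tools, so this runs in seconds even for a 200 ns trajectory; the result is in the squared unit of the box lengths.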
  • asked a question related to Computational Biology
Question
1 answer
If I have only two options to choose from, Gasteiger or AM1-BCC, on which parameters or rules can I base my selection of the most appropriate charge scheme during the minimization step for my ligand?
Does this have anything to do with my ligand size or the size of the protein to which I want to dock this ligand?
I have seen some people choose Gasteiger over AM1-BCC. I am confused: why prefer an inferior charge algorithm when we have the option of choosing the newer semi-empirical type?
Thanks....
Relevant answer
Answer
Hi Jawwad,
The calculation of partial charges by Gasteiger is much faster than by AM1-BCC.
  • asked a question related to Computational Biology
Question
5 answers
I have a protein in which I have found a mutation that may be disease-associated. Now I want to show the non-bonded interactions of this particular residue with the surrounding residues, to predict whether any significant decrease in interaction has occurred due to this substitution. How can I find this with Discovery Studio, PyMOL or other visualization software?
Relevant answer
Answer
Simply by using PyMOL: the measurement tool can be used to measure distances and also angles. Hydrogen bonds can also be easily determined.
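To compare the wild-type and mutant residue more systematically than by clicking individual distances, the same measurement can be done on exported coordinates. This is a minimal sketch, assuming atom names and coordinates (in Å) have already been extracted from each structure; the 4.0 Å cutoff and the function name are illustrative choices.

```python
import math


def contacts(residue_atoms, neighbour_atoms, cutoff=4.0):
    """Return (atom, neighbour, distance) for every atom pair within the cutoff.

    Both arguments are lists of (atom_name, (x, y, z)) tuples.
    """
    found = []
    for name_a, xyz_a in residue_atoms:
        for name_b, xyz_b in neighbour_atoms:
            distance = math.dist(xyz_a, xyz_b)
            if distance <= cutoff:
                found.append((name_a, name_b, round(distance, 2)))
    return found
```

Running it once with the wild-type residue and once with the substituted residue, against the same neighbours, shows which contacts are lost or gained.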
  • asked a question related to Computational Biology
Question
9 answers
In the FASTA output of Prokka listing the names of genes, some genes do not have any name ("gene: NA"). My question is whether these genes are hypothetical or simply do not have a name.
If the former is the case, how does Prokka determine them?
Relevant answer
Answer
If you mean gene feature, then you can use --addgenes option for Prokka. Sebawe Syaj
  • asked a question related to Computational Biology
Question
7 answers
Dear everyone,
I'm working in bioinformatics. My current project is to analyze public data from TCGA and GEO to find novel genes related to the diagnosis or prognosis of cancer. The special feature of TCGA data is the accompanying clinical data, which we can use for further analysis.
I just want to know whether there is any other platform or source of public data like TCGA or GEO, because sometimes I need to validate the data, but I can't find another source for validation, and I have no clinical data available on the problem of interest in my hospital.
Thank you very much for reading and sharing experience!!!
Relevant answer
Answer
Hi,
A nice platform to explore public datasets such as TCGA, ENCODE, ROADMAP, GEO, and IHEC is the WashU Epigenome Browser, where you can access a broad range of transcription and epigenetic datasets, ranging from RNA-seq to WGBS, ChIP-seq, among others. Wash U harbors different datasets for hg19 and hg38 genomes, therefore I suggest that you explore both options.
I hope you find it useful.
Regards,
  • asked a question related to Computational Biology
Question
2 answers
Excluding the obvious ones like ChemOffice, SciFinder or any other bulky packages. I am looking for those small programs that make life easier, for instance Mendeley (for managing documents) or Quartzy (for managing chemicals and protocols).
I am also desperately looking for some applications of similar kind for Android OS.
I would also like to hear some feedback about Electronic Lab Notebooks (ELNs). How are they working out for you? Would you choose to work in those or do you prefer paper?
Relevant answer
Answer
Have a look at https://www.bookkit.org ! Great for managing equipment, lab members and bookings. Especially useful to organize a safe return after COVID-19 ...
  • asked a question related to Computational Biology
Question
1 answer
I'm a molecular biologist, and I have a few projects coming up in transcriptome and small RNA analysis. Can I get by without knowing any programming, using user-friendly software such as Geneious Prime or another program you can suggest, or is programming an absolute must?
Relevant answer
Answer
Hi,
To be an efficient bioinformatician, you need to learn at least one programming language. You need not be a high-end developer, but you should at least know how to do your own bits. You can also get comfortable with the various user-friendly GUIs, but they need more time and space, whereas with code you can customize the analysis according to your needs.
  • asked a question related to Computational Biology
Question
17 answers
I want to simulate a niosome bilayer with the Schrödinger software (molecular dynamics), but first I have to design the proper bilayer. Does anyone know of software or a simulator for designing a bilayer with a certain composition?
Relevant answer
Answer
There is a VMD (visual molecular dynamics, https://www.ks.uiuc.edu/Research/vmd/) Plug-in to help set up membranes for the MD simulation of membrane proteins
Tutorial:
  • asked a question related to Computational Biology
Question
12 answers
I have taken Illumina reads and aligned them to a reference genome using BWA, then obtained the corresponding BAM/SAM files. I have also called SNPs, which are in VCF format, and tried to use this file to predict synonymous and nonsynonymous sites (using snpEff), but this only gives me an N/S ratio and what I really want is the dN/dS ratio. Is there any way to do this from the BWA alignments? I am new to NGS genome assembly, so any tips are much appreciated.
Relevant answer
Answer
Dear Stacy,
Were you able to find a way to calculate the dN/dS ratios?
I have a similar issue to what you had. I'm working on Leishmania genome data.
Thank you in advance!
  • asked a question related to Computational Biology
Question
3 answers
A few of us wanted to create a Discord server for biophysics. What we intend is to start a common place for discussions and numerical experiments, and possibly to document the results in the form of blogs or other media.
I believe that there are many biophysics/computational biophysics/Molecular Dynamics enthusiasts here. Here is the server link: https://discord.gg/qRQRq2k
Come and join us. Let us learn together.
Relevant answer
Answer
Dear Devanand,
Can you explain more what is the purpose of this post and what the discord means?
Bog
  • asked a question related to Computational Biology
Question
29 answers
I have 14 miRNAs that are related to a particular disease. I want to draw a network like a gene network (GeneMANIA). I can draw a network easily by entering gene names in GeneMANIA, but which software can take miRNA names as input like this? Which software is better? I was trying to use Cytoscape, but it requires pre-existing network data (if I am not wrong), and I am not sure whether I can get any such data for miRNAs. Some of the miRNAs are quite new, and some old versions of the software can't recognize them.
Please help me get a network like GeneMANIA's. I can only input the different miRNA names and the particular disease. Thanks.
Relevant answer
Answer
We recently developed miRViz to interpret microRNA datasets using microRNA networks:
To build miR-mRNA network, a Canadian group has developed miRNet: https://www.mirnet.ca/miRNet/home.xhtml
Both may be useful for you. No need for computational skills. I would suggest first miRViz, and then miRNet.
If you want more information, we also published in 2015 a paper building microRNA networks (we use these networks in miRViz):
Hope this will help you.
  • asked a question related to Computational Biology
Question
2 answers
I mean the number of contacts per protein residue with different parts of the lipids. It can more or less be done in GROMACS, but you need to create many index groups for each trajectory, so it is quite a long analysis. I was wondering if there is any Tcl script to use with VMD to do so.
Relevant answer
Answer
How exactly can MMPBSA help in this case Jamoliddin Razzokov ??
  • asked a question related to Computational Biology
Question
5 answers
---
Relevant answer
Answer
Hi Annemarie, sorry just wondering if you received a PM? Not sure if it went through.
Cheers,
Michael
  • asked a question related to Computational Biology
Question
3 answers
There is a published substitution matrix for intrinsically disordered proteins that I would like to use for a BLAST search, but I am unable to find a program that supports uploading a custom matrix. I prefer to use R for my computational biology, but I will use whatever is needed to support the matrix. Any recommendations or tips? Thanks!
Relevant answer
Answer
Yes, there is, but it is old. Legacy BLAST 2.2.12 (or 2.2.13, if I remember correctly) accepts custom matrices. In later versions of Legacy BLAST and in the whole BLAST+ series, as I understand it, the matrices are built into the program, so you cannot change them.
The arguments for hardcoding the matrices at the time included: 1. it made the BLAST search more efficient; 2. most researchers use only the well-known matrices; and 3. no reviewer would "believe" in a newcomer matrix without all the testing that the current matrices have already passed.
I think this was an unnecessary added stiffness, as new ideas in sequence analysis may translate into new substitution matrices.
I think you can still download older versions of blast and include your own matrices to find new stuff.
  • asked a question related to Computational Biology
Question
8 answers
Hello,
I've just come across these two algorithms and I was wondering whether there are any available versions of ACO and PSO as ranking-based feature selection approaches?
Any comment would be appreciated.
Kind regards,
Davide Nardone
Relevant answer
Answer
ACO and PSO are mostly used for feature selection.
  • asked a question related to Computational Biology
Question
3 answers
Hi, I have a computational biology background and am currently studying how cells are organized in a tissue. Someone told me that my cells are oriented in a specific manner, so they have some similarity with a liquid crystal, where the long axes of the mesogens (or molecules) are also aligned with the director. In general, liquid crystals have an order parameter in the range 0.3<S<0.8.
In my case, the order parameter is negative (-0.3), which means the cell short axis is aligned with the director. What should I understand from this about the morphology of the cells? Any help will be appreciated. Thanks.
S = 0.5 <3cos^2(theta) -1> , where theta is the angle between director and molecule long axis.
Relevant answer
Answer
In the negative order parameter band gap between the two compounds increased.
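For reference, the order parameter defined in the question can be evaluated directly from a list of angles. The sketch below is a minimal illustration, assuming the angles are given in degrees between each cell's long axis and the director; the function name and the sample angles are made up for the example. It evaluates the two limiting cases of the formula: long axes parallel to the director give S = 1, and long axes perpendicular to it give S = -0.5, so a negative value such as -0.3 points to long axes lying predominantly perpendicular to the director.

```python
import math

def order_parameter(angles_deg):
    """S = 0.5 * <3 cos^2(theta) - 1>, averaged over all angles (in degrees)."""
    terms = [3.0 * math.cos(math.radians(a)) ** 2 - 1.0 for a in angles_deg]
    return 0.5 * sum(terms) / len(terms)

# Hypothetical angle sets for the two limiting cases
aligned = order_parameter([0, 0, 0, 0])        # long axes parallel to the director
perpendicular = order_parameter([90, 90, 90])  # long axes perpendicular to the director
print(round(aligned, 3), round(perpendicular, 3))
```

In practice the list would hold one measured angle per cell, and the result depends on how the director itself was chosen.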
  • asked a question related to Computational Biology
Question
10 answers
We are interested in developing a method for predicting siRNA, so we need a large set of siRNAs for developing models. I would highly appreciate it if you could suggest the best database or databases on siRNA. This will help us create a large dataset that may cover all experimentally characterized siRNAs. Please also suggest the best (latest) prediction method for siRNA. Do you think there is a possibility of developing a better prediction method, or is this field already saturated?
Relevant answer
Answer
Hi I wonder whether you have found the updated siRNA database?
  • asked a question related to Computational Biology
Question
1 answer
I have gene expression data from different conditions from different studies. Instead of using the actual TPM values for Pearson correlation coefficient (PCC) calculation, I have decided to use fold-change values from the different studies to eliminate study-specific biases. My question is whether using these raw fold-change values to identify co-expressed genes is correct, or whether I should perform quantile normalization on the fold-change values before using them for PCC calculation. (Note: the distribution of fold-change values differs considerably between studies.)
Relevant answer
Answer
There are several methods to account for the differences between studies. As far as I know, most of these methods use the raw probe intensities (microarrays) or gene counts (sequencing). The differences between studies, or between samples in the same study that were processed in different groups, are known as batch effects. You can look up the appropriate methods for removing batch effects in the specific data type you are handling. Normalization would only be useful for removing the variance among samples in the same batch, or after removing the batch effects.
I am not sure how you were planning to use fold-change to calculate gene co-expression in the first place! The aim of co-expression analysis is to find the correlations between pairs of genes in a dataset or sets of data using a co-expression measure such as PCC. You don't need to calculate fold-change at all. The difference in gene expression between two conditions in a dataset is not the concern here. I will try to explain with an example.
Expression matrix:
gene  c1  c2  c3  t1  t2  t3
g1    1   1   1   2   2   2
g2    1   1   1   5   5   5
g3    6   6   6   3   3   3
The fold-change (t_vs_c) would be something like:
g1 = 2/1, g2 = 5/1, and g3 = 3/6
The co-expression would be something like:
g1_g2 = 1, g1_g3 = -1, g2_g3 = -1
There is, however, something called differential co-expression, which is the change in the correlation of a pair of genes between two conditions. First, the correlation is calculated between a pair of genes in each condition separately. Then the correlations are compared between conditions.
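To make the co-expression example above concrete, here is a minimal sketch in plain Python that computes the pairwise Pearson correlation for the three genes in the example matrix. The helper name `pearson` and the dictionary layout are choices made for this illustration only; a real analysis would normally use an established statistics package.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Expression matrix from the example (columns: c1 c2 c3 t1 t2 t3)
expr = {
    "g1": [1, 1, 1, 2, 2, 2],
    "g2": [1, 1, 1, 5, 5, 5],
    "g3": [6, 6, 6, 3, 3, 3],
}

# Correlate every pair of genes across all six samples
genes = sorted(expr)
coexpr = {
    (a, b): pearson(expr[a], expr[b])
    for i, a in enumerate(genes)
    for b in genes[i + 1:]
}
for pair, r in coexpr.items():
    print(pair, round(r, 3))
```

Note that the correlation is computed across all samples; no fold-change is involved at any step.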
  • asked a question related to Computational Biology
Question
2 answers
Hi!
I've been using the refine.bio website to download normalized transcriptome data. Each downloaded dataset consists of a compressed directory with an expression matrix in .tsv format, its metadata also in .tsv format, and an aggregated metadata file in .json format.
I'm trying to associate the expression matrix with its metadata using the R programming language, but I don't know how to do it, and I can't find a way in the site's documentation. I only know that I need to read these files with these commands:
> library(rjson)
>
> expression_df <- read.delim('SRP068114/SRP068114.tsv', header = TRUE,
> row.names = 1, stringsAsFactors = FALSE)
> metadata_list <- fromJSON(file = 'aggregated_metadata.json')
but I have no idea how to merge them for generating a full-informative matrix.
Can someone help me, please?
Thank you so much.
Relevant answer
Answer
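First you would need to flatten your JSON file:
library(jsonlite)
metadata <- fromJSON("aggregated_metadata.json", flatten = TRUE)
View(metadata)
# After that, read your expression table (it is tab-separated, so use read.delim):
expression <- read.delim("SRP068114/SRP068114.tsv", header = TRUE, stringsAsFactors = FALSE)
# "ColumnName" is a placeholder: replace it with the metadata columns you need,
# including the key column that is shared with the expression table
Mergedataset <- merge(expression, metadata[, c("ColumnName")])
head(Mergedataset)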
  • asked a question related to Computational Biology
Question
5 answers
I am working on computationally understanding the active and inactive conformations of some proteins. Simulating the inactive conformation from the active conformation is reported in literature by performing enhanced sampling MD studies, like Metadynamics, REMD etc., in which energy is added in particular co-ordinates called collective variables. This makes it slightly biased.
If I run unbiased atomistic molecular dynamics simulations of several microseconds, will my protein explore the conformational space by crossing the energy barriers? Or will the system be eternally stuck in a local minimum which it first reaches?
Relevant answer
Answer
Hi Rajiv,
It depends on the system you are studying. Some systems don't show important conformational changes on the microsecond or even the millisecond time scale, not only because they get trapped in a local minimum but also because the conformational change may be coupled to protonation or deprotonation, or to some physicochemical condition (pH or temperature). Therefore, before planning a computational experiment, you need to gather enough information about your system to design the experiment.
If you ask me, I prefer to run unbiased atomistic molecular dynamics simulations on the microsecond time scale rather than biased methods, for the reasons that you mentioned.
  • asked a question related to Computational Biology
Question
5 answers
Dear research professors and scholars,
I have built a novel 3D protein structure (a mutant DNA gyrase enzyme of antibiotic-resistant E. coli) by homology modeling, because this type of protein has not been deposited in the Protein Data Bank (www.pdb.com). I would like to deposit this protein in some online protein bank for future research on antibiotic resistance. Please suggest some online websites for uploading 3D protein models other than www.pdb.com.
Relevant answer
Answer
@Nikil, if you base any major conclusions in a publication on a protein model, it makes sense to make your specific model available to the reader, either in the supplemental material or in a database. This is all the more important for non-trivial models.
  • asked a question related to Computational Biology
Question
9 answers
I'm looking for a book on microarray data analysis. I'm a mathematician and I'm interested in finding a book that gives a framework for microarray data analysis from beginning to end (background correction, normalization, dimensionality reduction, clustering, etc.). I found this: http://www.springer.com/gp/book/9781402072604
Are there any more appropriate ones?
Thanks
Relevant answer
Answer
thanks
  • asked a question related to Computational Biology
Question
3 answers
how to study a certain type of mutations with another type of protein (not mutation and totally different from mutation) by (bioinformatics tools). What is your opinion and suggestion about this?
Relevant answer
Answer
Explain what you want to do or what you are actually asking. The question is very confused in itself.
  • asked a question related to Computational Biology
Question
2 answers
I would like to perform a positive selection analysis among mammals. I am mostly interested in positive selection in humans. I have started off with around 100 mammalian species and am looking for a way to find the ideal number of species to use to detect selection.
I want to have a balance in the evolutionary distances between the species I will include: not too distant or not too divergent. Previous discussions give a minimum number of species to include in a positive selection analysis; however, not much information is given about the maximum number.
What is the appropriate number of species that should be used in positive selection analysis and what would be the maximum? Also, what interval of evolutionary distance should be used to be able to detect the positive selection and avoid false positives at the same time?
Relevant answer
Answer
Hello Ozge,
For selection analyses you want to be in a range where your rate of synonymous substitutions (dS) is not saturated. If this rate is too high then any methods based on dN/dS will not be appropriate.
In mammals, rodents or simian primates are great sets of species for detecting signatures of selection (over 20 assembled genomes in each). In general I would say that 50-70 My of evolution is a good amount for selection analyses in mammals.
As for the number of species, there is a great study by Sarah Sawyer's group that looked into this. I would recommend aiming for 20. If you are using likelihood methods you will run into computational hurdles with more species. But I would advise against using the entirety of mammals to run any type of selection analysis.
I hope it helps.
Cheers,
Antoine.
  • asked a question related to Computational Biology
Question
1 answer
I want to generate a graph showing the relative evolutionary constraints on single positions of a certain sequence of amino-acids (protein sequence).
I came across this attached figure of E.V. Koonin in his book "The Logic of Chance", what he called "genomescape", is there any method to measure evolutionary constraints per residue in a sequence of amino acids, and generate such a graph?
Thanks
Relevant answer
Answer
You said you are interested in generating a constraint graph only for a protein sequence (rather than the full gene). The general principles would be similar. First you need to have an alignment that will allow you to measure the evolutionary constraint at each position. For proteins, I recommend generating it using HHblits.
Then you need to calculate a score for each position in the alignment - various scores exist, the most commonly used is probably weighted entropy, implemented here:
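As a minimal sketch of the per-position scoring idea, the example below computes plain (unweighted) Shannon entropy for each column of an alignment. This is a simplification of the weighted entropy mentioned above, which additionally weights sequences by their similarity; the toy alignment and the function name are made up for illustration. Low entropy means a highly conserved, and presumably more constrained, position.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column; gaps are ignored."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return sum(-(c / total) * math.log2(c / total) for c in counts.values())

# Toy alignment (made up): four aligned sequences of equal length
alignment = [
    "MKVLA",
    "MKILG",
    "MRVLS",
    "MKVLT",
]

# zip(*alignment) iterates over the alignment column by column
entropies = [column_entropy(col) for col in zip(*alignment)]
for pos, h in enumerate(entropies, start=1):
    print(pos, round(h, 3))
```

Plotting these values against the position index gives a simple per-residue conservation profile.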
  • asked a question related to Computational Biology
Question
6 answers
Hi. I have a protein structure, a GPCR with approximately 400 residues. The EC loop contains around 70 residues, which makes it very flexible. It is not the focus of my study, so I hardly need it.
How can I clip the EC loop?
1. How do I clip it?
2. Is there anything to consider before clipping?
Please help me with your suggestions.
Relevant answer
Answer
I would do it the same way I do it to design an expression construct with a shortened loop - searching for a shorter loop structure that will connect to the base structure without strain, then optimize its sequence using Rosetta.
  • asked a question related to Computational Biology
Question
5 answers
I just finished setting up my eGPU for expediting GROMACS MD simulations. I have seen quite a few posts here and there about this, so I thought it would be a good idea to share my experience.
I bought:
1. Zotac GTX1050TI OC edition- 200$
2. EXP GDC Beast 8.4D mPCIe- 31$
3. A local premium grade 500W power supply- 12$
I had an old reliable Lenovo Z510 and sacrificed my wlan card. Put the adapter there and changed the BIOS graphics mode to UMA only. Working fine, got almost an 80% boost.
Relevant answer
Answer
Dear Dr. Souprano,
Even though the trick you applied to speed up your MD simulations may bring some gains, I strongly recommend avoiding this kind of setup, because there is a chance you will damage your laptop. You could apply the same trick to a desktop PC to get the expected results. Of course, you could also buy an account on an online MD server and run your simulations there, instead of working on a laptop linked to an external graphics card!
Hope to see your success more and more.
Have a whale of a time.
HR
  • asked a question related to Computational Biology
Question
6 answers
I have modeled a protein and performed MD simulation and docking studies. What other computational studies can be performed in order to target a high-impact journal?
Relevant answer
Answer
To make a high impact:
1. Find a problem.
2. Find a way to solve the problem.
3. Solve the problem.
4. Write about the steps 1-3.
  • asked a question related to Computational Biology
Question
3 answers
I'd like to share tips and tricks about this useful software.
Relevant answer
Answer
I use Tableau mostly for visualization.
  • asked a question related to Computational Biology
Question
2 answers
Hi, I am starting the senior year of my BS in a few months. My major is molecular and cell biology, and I also have a decent background in the computational biology tools used for analyzing high-throughput sequencing data.
I am interested in pursuing my graduate studies in coral reef genomics, biotechnology of coral reef restoration, etc. The problem is that I am somewhat confused and do not know where to start. Can anyone give me advice (recommending a quality lab that works in the field, sharing the contact of a professor who works in the field and may need to recruit a master's or Ph.D. student, or recommending an online course or textbook that would give me the required knowledge)? Please share anything that you think may help. Thanks in advance.
Relevant answer
Answer
Check out some of the marine science programs, such as the ones at the University of North Carolina in Wilmington. There are all sorts of programs along the coast. Find a university that has marine field stations, such as UNC, U, and College of Charleston, just to name a few. I am sure there are many more along the East Coast, the Gulf of Mexico, and the West Coast.
  • asked a question related to Computational Biology
Question
2 answers
Computational studies of membrane protein
Relevant answer
Answer
I am not aware of anyone doing so successfully. We use the CPM assay to experimentally screen for the stability of detergent-solubilized membrane proteins, and developed the CHESS method of high-throughput screening and selection for protein mutants with increased stability in detergents: https://www.bioc.uzh.ch/plueckthun/pdf/APpub0345.pdf
  • asked a question related to Computational Biology
Question
8 answers
1ns/hr for a 60k atom system unrestrained... GROMACS 2019... 1.0nm cutoff and 0.14 fft spacing...
160ns approx in 9 days... Count the power outage (totalling almost 9-10hrs last week), 3hrs rest per day...
Bad??? I don't think so... Any opinion???
Relevant answer
Answer
I find the post-MD analysis more time consuming. It pushes you to learn more and more about your system in order to come up with something meaningful.
  • asked a question related to Computational Biology
Question
7 answers
I want to study cancer genomes with bioinformatics tools. Can you recommend articles or review papers that can guide my research?
Relevant answer
Answer
You will no doubt want to start by getting familiar with The Cancer Genome Atlas (TCGA) website and portal:
Here is the data portal: https://portal.gdc.cancer.gov/
And, this documentation provides useful guides to the bioinformatics pipelines used/available: https://docs.gdc.cancer.gov/Data/Introduction/
To begin reading you may find this review article helpful:
Best,
Chris
  • asked a question related to Computational Biology
Question
8 answers
Hi All,
Thank you for all your support.
Thank you chandra mohan , Christian Janiesch and Ramin Sedaghat.
I am looking for more published projects that students can benefit from by referring to them.
Please send the documents directly to genotech.in@gmail.com or reply to me here.
Regards,
Ranjan
Relevant answer
Answer
Hello, there are many aspects of microbiology that need more detailed study. So far we have information on only about 5% of the total biodiversity of microorganisms, so 95% is future work!
Good luck!
  • asked a question related to Computational Biology
Question
9 answers
I am currently trying to model my proteins (they are antibody fragments) I am looking for this comparison between the two major tools for protein model prediction. I either find people saying to use either one of them but I failed to get any comparison in features and reliability or differences between those two.
It would be very helpful if someone could mention their thoughts on these two.
Thanks 
Relevant answer
Answer
For generic modeling of structures with low homology to potential template structures, you are right. However, in generic modeling, you assume the most important parts are the prediction of the fold, the interactions in the core of the protein and the most conserved parts of the protein.
In contrast, when modeling antibody variable domains, the core structures and their variability between germlines are very well known (>5000 structures). The most interesting parts, however, are the variable loops that form the paratope, and their variability to a large part does not stem from random mutations but from specific mechanisms of genetic recombination. In addition, the detailed influence of specific sequence variants on CDR conformation and relative domain orientation has been extensively studied. You should, at least for comparison, look at https://sysimm.ifrec.osaka-u.ac.jp/rep_builder/ for modeling the antibody variable domains.
  • asked a question related to Computational Biology
Question
7 answers
As the wall time was up, the production run stopped. Now I have prd.cpt and prd_prev.cpt. What is the difference between them, and which one should be used to restart the simulation? Also, in the tutorial, is topol.tpr a topology file or a binary file produced by grompp?
Several .tpr files were produced during the simulation, such as "#prd.tpr.67#", "#prd.tpr.64#" and "prd.tpr". What is the difference between them? Which one should be used to restart the simulation?
Thanks.
Relevant answer
Answer
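To extend an MD run by 1 ns:
- gmx convert-tpr -s md.tpr -extend 1000 -o prod2.tpr
- gmx mdrun -s prod2.tpr -deffnm md -cpi md.cpt -append
In this case you need a new .tpr because the previous one has already run for the time specified (-extend is given in ps, so 1000 = 1 ns).
If you want to restart an MD run that stopped before finishing, use:
- gmx mdrun -s md.tpr -deffnm md -cpi md.cpt -append
This applies when the run stopped for some reason (for example, the wall time ran out) and you want to complete it. md.cpt is the most recent checkpoint; md_prev.cpt is the one written before it and is kept as a backup in case md.cpt is corrupted. Keep the same -deffnm as in the original run so that -append can find the existing output files.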
  • asked a question related to Computational Biology
Question
7 answers
Suppose 2 genes produce 2 proteins which would be binding. After the genes are turned on and the proteins are produced, then genes can be turned off and the proteins produced can bind with each other even after the genes were turned off right?
Relevant answer
Answer
Many weak bonds are needed to enable a protein to bind tightly to a second molecule, which is called a ligand for the protein. ... The region of a protein that associates with a ligand, known as the ligand's binding site, usually consists of a cavity in the protein surface formed by a particular arrangement of amino acids.
  • asked a question related to Computational Biology
Question
5 answers
I'm trying to use homology modelling to find the structure of the N-terminus of a protein which is thought to contribute to its hetero-oligomeric structure. However, I can't find an x-ray structure to use as a template. I use blast to search for pdb structures but none of them seem to have solved the structure of this region even though it is in their sequence. Is there a way of filtering out structures which do not have this region solved? If not, what would be a good alternative to solve the structure computationally?
Relevant answer
Answer
Use a sensitive homology detection method such as HHpred:
It also makes sense to do secondary structure prediction and, in the case of helical regions, the prediction of coiled coils:
With the results of these methods, one can decide what to do next (homology or de novo modeling). Feel free to contact me.
  • asked a question related to Computational Biology
Question
5 answers
I have imported a MD trajectory in VMD. I need to generate plots like how the distance between two selected atoms is changing over the simulation. Is there a way to do this in VMD?
Thanks.
Relevant answer
Answer
Of course, there are a number of things VMD can do.
For instance, you can perform RMSD alignments, calculate distance between groups, generate movies from trajectories....
  • asked a question related to Computational Biology
Question
3 answers
I am having a problem understanding how Weka calculates ROC curves. I am generating machine learning models with 4 methods (J48, random forest, naive Bayes), and when I save the ROC data for each of them, the number of instances varies: for some there are only three instances, for others about a hundred. Should I not be getting almost the same number of instances in all cases, irrespective of the model? I understand that ROC is a plot of true positive rate vs. false positive rate, but how does the number of instances come into the picture in Weka?
Relevant answer
Answer
I have tried this by installing a plugin 'JFreeChart Renderer'. Multiple ROC curves of different algorithms can be plotted in a graph. It is easy to compare. However, I could not find the configuration to choose or set the different colors for different algorithms, thus it would be able to compare algorithms along with different datasets. Problem here is multiple graphs of different algorithms in different datasets are different.
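On the original question of why the number of rows in the saved ROC data differs between models, one likely explanation is that an ROC curve has one point per distinct score threshold. A classifier that outputs only a few distinct probabilities, as a small decision tree often does, therefore yields only a few points. The sketch below is generic Python written for illustration and does not reproduce Weka's own implementation; the scores and labels are made up.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points: the origin plus one point per distinct score threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        # Everything scored at or above the threshold is predicted positive
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / neg, tp / pos))
    return points

labels = [1, 1, 0, 1, 0, 0]
coarse_scores = [0.9, 0.9, 0.9, 0.2, 0.2, 0.2]       # only two distinct scores
fine_scores = [0.95, 0.80, 0.70, 0.60, 0.30, 0.10]   # every score is distinct

coarse = roc_points(coarse_scores, labels)
fine = roc_points(fine_scores, labels)
print(len(coarse), len(fine))
```

The same six test instances give a different number of curve points depending only on how many distinct scores the model produces.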
  • asked a question related to Computational Biology
Question
2 answers
Dear all, I am working on a knockout strain of E. coli for mixed acid production and have so far gained some understanding of flux balance analysis. For the next part of my project, I have to experimentally verify the results obtained from flux balance analysis. I have not found much literature in which people have experimentally verified flux balance analysis, so I have devised the following experimental design:
  • Batch Reactor(1.5 L working Volume)- Data to be taken only for the exponential phase.
  • M9 minimal media with xylose as the carbon source(No other carbon source and no use of yeast extract, even though it enhances growth)
  • Controlled pH of 6.8
  • Oxygen supply and agitation speed- optimized values for mixed acid production. However, the Dissolved Oxygen probe shows the value to zero after a few hours of experiment, which means whatever oxygen is provided is consumed readily
  • Calculating oxygen uptake rate- Not sure, need help
I request to look at the experimental procedure and provide your suggestions, answers, and comments. 
Relevant answer
Answer
Hi Prashant
I have a question: when you simulated your model in minimal medium, what constraints did you use? What were your model.medium lower_bound and upper_bound?
  • asked a question related to Computational Biology
Question
7 answers
Hello,
I would like to ask if somebody knows a user-friendly way to BLAST a query sequence (protein or nucleotide) against a custom (user-defined) database of sequences. Ideally it should work on Windows :-)
Thank you for response
Relevant answer
Answer
Yes, standalone BLAST is how I would do it (and have done in the past). You can interface it with python etc.
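A minimal sketch of what driving standalone BLAST+ from Python could look like for a protein search is shown below. It assumes the BLAST+ executables (makeblastdb and blastp) are installed and on the PATH; the file names, the database name and the helper functions are hypothetical. For nucleotide data, the corresponding choices would be `-dbtype nucl` and `blastn`.

```python
import subprocess

def build_commands(db_fasta, query_fasta, db_name="custom_db", out_file="hits.tsv"):
    """Build the makeblastdb and blastp command lines for a protein search."""
    make_db = ["makeblastdb", "-in", db_fasta, "-dbtype", "prot", "-out", db_name]
    search = ["blastp", "-query", query_fasta, "-db", db_name,
              "-outfmt", "6", "-out", out_file]  # outfmt 6 = tabular output
    return make_db, search

def run_blast(db_fasta, query_fasta):
    """Run both steps in order; requires the BLAST+ executables on the PATH."""
    for cmd in build_commands(db_fasta, query_fasta):
        subprocess.run(cmd, check=True)

# Only print the commands here; call run_blast(...) to actually execute them
make_db, search = build_commands("my_sequences.fasta", "query.fasta")
print(" ".join(make_db))
print(" ".join(search))
```

The tabular output file can then be opened in a spreadsheet or parsed line by line.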
  • asked a question related to Computational Biology
Question
5 answers
I am trying a coarse-grained simulation in gromacs to understand certain protein folding. I initially tried with triclinic box and -d 1.0. I used -c to center my protein in the solvent box. But the protein moved out of the solvent box during the simulation.
I tried a second time in a cubic box and -d 2.0. It didn't help.
Any suggestions in this regard?
Thanks
Relevant answer
  • asked a question related to Computational Biology
Question
21 answers
Computational Biology 
Relevant answer
Answer
Maybe I'm a bit late answering this question, but I recommend doing the following for Windows 10:
2) Now you can perform an installation as it would be done on ubuntu (http://manual.gromacs.org/documentation/2018/install-guide/index.html)
Before installing gromacs you will probably have to do the following:
> sudo apt-get update && sudo apt-get upgrade
> sudo apt-get install gcc
Cheers!!
  • asked a question related to Computational Biology
Question
11 answers
Hello,
I am wondering whether it would be possible to align a set of protein sequences (for example 100 protein sequences) each to each in a user-friendly way, i.e. sequence no. 1 with sequence no. 2, sequence no. 1 with sequence no. 3, ..., sequence no. 1 with sequence no. 100, ..., and finally sequence no. 100 with sequence no. 99. Does such a tool exist? Ideally with some graphical output (heatmap of similarities, ...).