Questions related to Bioinformatic Software
How to show two peptide structures side by side in PyMOL? I have 100 PDB files. I want to show all the 3D structures side by side in a two-dimensional array. How can I do this in PyMOL?
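One possible approach is PyMOL's grid mode, which places each loaded object in its own cell of a grid. The sketch below only generates the text of a PyMOL script; the helper name `grid_script` and the file names are hypothetical, and the object naming scheme is an assumption.

```python
from pathlib import PurePath


def grid_script(pdb_files):
    """Return the lines of a PyMOL script that loads each file as its
    own object and switches on grid mode (one object per grid cell)."""
    lines = []
    for path in pdb_files:
        # Object name = file name without extension (illustrative choice).
        name = PurePath(path).stem
        lines.append(f"load {path}, {name}")
    lines.append("set grid_mode, 1")  # show every object in its own cell
    lines.append("zoom")
    return lines


# Hypothetical file names; replace with the real list of PDB files.
script = grid_script([f"peptide_{i:03d}.pdb" for i in range(1, 4)])
print("\n".join(script))
```

The printed lines can be saved to a .pml file and run from PyMOL. With 100 structures the cells become small, so splitting the set over several images may be more readable.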
My professor asked me to find a software solution for scRNA data analysis (preprocessing and analyzing data). The department is getting Vizgen and CosMx machines next quarter. Having 3-4 labs depend on a single bioinformatician is really annoying and inconvenient.
There're some companies I am looking at:
Most of the time, we want to compare metadata categories to generate insights, and also apply gene set / pathway analysis to those comparisons. Any suggestions? I am truly depressed right now :(
I know that a lot of artificial neural networks have appeared recently. Maybe soon we will no longer read articles and do our scientific work ourselves, and AI will help us. Maybe it is already happening? What is your experience working with AI and neural networks in science?
I have a collection of around 500 peptide 3D structures (PDB files, each 10 residues long, all for a given sequence). I need to cluster them based on the RMSD values among them. Is there any Python module or other software that could do that and identify the distinct structural clusters among those 500 files?
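Once the pairwise RMSD values are available, the clustering step itself can be quite simple. The following is a minimal sketch of greedy "leader" clustering, assuming a symmetric RMSD matrix has already been computed with a superposition tool. The function name `leader_cluster`, the cutoff, and the toy matrix are illustrative assumptions, and this is only one of several possible clustering schemes.

```python
def leader_cluster(rmsd, cutoff):
    """Greedy leader clustering on a symmetric pairwise RMSD matrix.

    Each structure joins the first cluster whose leader (its first
    member) lies within `cutoff`; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of structure indices.
    """
    clusters = []
    for i in range(len(rmsd)):
        for members in clusters:
            leader = members[0]
            if rmsd[i][leader] <= cutoff:
                members.append(i)
                break
        else:
            # No existing leader is close enough: start a new cluster.
            clusters.append([i])
    return clusters


# Toy 4x4 matrix in angstroms (made-up values, for illustration only).
toy = [
    [0.0, 0.5, 3.0, 3.2],
    [0.5, 0.0, 2.9, 3.1],
    [3.0, 2.9, 0.0, 0.4],
    [3.2, 3.1, 0.4, 0.0],
]
print(leader_cluster(toy, cutoff=1.0))
```

The result depends on the order of the structures and on the cutoff, so it is worth trying a few cutoffs. For 500 structures the full matrix has only 124,750 unique pairs, which is small enough for hierarchical clustering as well.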
I have been trying to dock a certain protein with an Nd ion that I downloaded from RCSB, but after I add it to PyRx and try to convert it to a ligand I get the following error. I tried converting the SDF file to PDB using PyMOL, ChimeraX, Avogadro, and Open Babel, but even then, when I open the file, it gives me this error: ligand: :UNK0:Nd and ligand: :UNK0:Nd have the same coordinates. Could someone please help?
Update: I want to dock an unbound protein with the neodymium metal ion, which I downloaded from RCSB in SDF format and later tried to convert to PDB using the aforementioned software so that AutoDock would accept it, but I can't get AutoDock to accept it as a proper ligand. Apparently I am unable to get any of the rare earth elements to be accepted properly as ligands.
How can I dock more than one protein with more than one ligand? I know that PyRx docks one protein with multiple ligands, but how can I do it for multiple proteins with multiple ligands?
I have a sequence (genome.fasta) and I want to check which gene is located at positions 400-500 nt.
What bash script should I use (I have WSL on my Windows machine), or are there any conda packages?
Thank you in advance
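The extraction step on its own can be sketched in plain Python. The function names `read_fasta` and `extract_region` and the toy record are hypothetical. Extracting the nucleotides does not identify the gene; that still needs an annotation file or a similarity search such as BLAST.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a dict {record_id: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Record ID = first word of the header line.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(chunks) for key, chunks in records.items()}


def extract_region(seq, start, end):
    """Return the subsequence from `start` to `end` (1-based, inclusive)."""
    return seq[start - 1:end]


# Toy record for illustration; for a real file use
# read_fasta(open("genome.fasta").read()) and positions 400 and 500.
demo = ">chr1 toy record\nACGTACGT\nTTGGCCAA\n"
genome = read_fasta(demo)
print(extract_region(genome["chr1"], 5, 10))
```

On the command line, `samtools faidx genome.fasta NAME:400-500` (where NAME is the record ID in the FASTA header) does the same extraction, and samtools is available through conda.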
Is there any server or tool (bioconda, Java, etc.) that exclusively annotates membrane proteins (similar to dbCAN for polysaccharides) from a bacterial genome?
Thank you in advance!
I need to use the AutoDock program and have tried both versions 1.5.6 and 1.5.7. I had no problem installing them; my only problem is that when I open the program, the loading bar stops at around 9% in the window that opens. Then that window closes, together with the command window that opens alongside it. It will be easier to understand from the screenshots. I am using Windows 11 on an HP laptop, and I really can't figure out where the problem is. In the screenshot I attached, it stays as shown, and then the program closes.
I'm using AutoDock Vina in Python to dock multiple proteins and ligands, but I'm having trouble setting the docking parameters for each protein. How can I do this in Python? (I have attached my Python code, in which I assumed the same parameters for all proteins.)
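One way to give every protein its own parameters is to keep a search box per receptor in a dictionary and build one command per receptor/ligand pair. This is a minimal sketch, not the attached code: it assumes the `vina` executable is on the PATH and only builds the argument lists. The helper name `vina_commands`, the file names, and the box values are hypothetical.

```python
from itertools import product
from pathlib import PurePath


def vina_commands(receptor_boxes, ligands, exhaustiveness=8):
    """Build one AutoDock Vina argument list per receptor/ligand pair.

    `receptor_boxes` maps a receptor .pdbqt path to its own search box:
    {"center": (x, y, z), "size": (sx, sy, sz)}.
    """
    commands = []
    for (receptor, box), ligand in product(receptor_boxes.items(), ligands):
        cx, cy, cz = box["center"]
        sx, sy, sz = box["size"]
        # Output name combines both input names (illustrative scheme).
        out = f"{PurePath(receptor).stem}__{PurePath(ligand).stem}.pdbqt"
        commands.append([
            "vina",
            "--receptor", receptor,
            "--ligand", ligand,
            "--center_x", str(cx), "--center_y", str(cy), "--center_z", str(cz),
            "--size_x", str(sx), "--size_y", str(sy), "--size_z", str(sz),
            "--exhaustiveness", str(exhaustiveness),
            "--out", out,
        ])
    return commands


# Hypothetical receptors with made-up box centers and sizes (angstroms).
boxes = {
    "protein1.pdbqt": {"center": (10.0, 12.5, -3.0), "size": (20, 20, 20)},
    "protein2.pdbqt": {"center": (-4.2, 0.0, 8.1), "size": (24, 18, 22)},
}
jobs = vina_commands(boxes, ["ligand1.pdbqt", "ligand2.pdbqt"])
print(len(jobs), "docking jobs prepared")
```

Each argument list can then be passed to `subprocess.run(job, check=True)`. Keeping the boxes in one dictionary (or a CSV file read into one) also covers the multiple-proteins, multiple-ligands case.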
I generated a large number of peptide structures for a given sequence using the PeptideBuilder module. However, PeptideBuilder does not add hydrogen atoms to the peptides. Is there any method through which I can add hydrogens to the peptides?
I created this R package to allow easy visual analysis of VCF files, investigation of mutation rates per chromosome or gene, and much more: https://github.com/cccnrc/plot-VCF
The package is divided into 3 main sections, based on analysis target:
- variant Manhattan-style plots: visualize all/specific variants in your VCF file. You can plot subgroups based on position, sample, gene and/or exon
- chromosome summary plots: visualize the distribution of variants across (selectable) chromosomes in your VCF file
- gene summary plots: visualize the distribution of variants across (selectable) genes in your VCF file
Take a look at how many different things you can achieve in just one line of code!
It is extremely easy to install and use, well documented on the GitHub page: https://github.com/cccnrc/plot-VCF
I'd love to have your opinion, reports of any bugs you might find, etc.
I'm looking for bioinformatics software to determine intercellular interactions between macrophages and endothelial cells, and the pathways and genes involved. Please recommend any software that can help me identify important interactions that I can also validate in the wet lab.
I tried using Phaster.ca and PhiSpy for phage detection in a bacterial genome.
They gave completely different results for the regions and the viruses identified.
Have you had the same experience, and could you share your suggestions, please?
Thanks in advance!
I am working on a few transcription factors and I need to check multiple genes that they may be regulating in my fungal system. So can anyone suggest some easy to use online tool where I can input my genome sequence and the protein sequence of the TF to check for different sites that the TF may be binding to?
I am interested in finding interactions between plant wax and other chemical compounds. Is there any computational tool or web server to find these interactions? Thank you
Using the NEBcutter tool, I found several restriction enzymes that cut my sequence.
I am targeting some mutation regions in my sequences. I am not sure which restriction enzyme to select for my work, as the NEB tool suggests many restriction enzymes.
I already know the target location (mutations in the target regions), but I want to know which further criteria I should consider when selecting a particular restriction enzyme for my work.
I have performed a 100 ns MD simulation in Desmond and now I want to calculate the solvent accessible surface area (SASA) over the trajectory. Can anyone please guide me on how to do it? Thanking you all in anticipation.
I have the whole genome of a bacterium. Do you know which program can detect a virus genome within the bacterial genome?
Is it enough to annotate the genome (for example with Prokka) and then look manually for viral genes/proteins? Or to check the assembly (for example with Prokka) and BLAST the shorter contigs?
Thank you in advance
I need to find software that can screen for a sequence in multiple genes from its database. I work with Drosophila and was able to get a list of DEGs, and now I need to check whether those genes are controlled by Stat92E. To do so, I am using the SOCS36E promoter sequence as the target sequence. I was told to use the Target Explorer software, but apparently it's no longer on the original website. I tried to reach out to the authors of the paper, but it's been a futile attempt thus far. I tried using NCBI Drosophila BLAST but nothing comes up. I was wondering if y'all know of software that can help me with this, or if y'all know whether Target Explorer is hosted on another site.
Thank you in advance!
I am downloading the NCBI nucleotide database in Kraken2. The download seems to start, but after a while an error appears.
Please help me.
root@LAPTOP-SBTF23AA:/mnt/c/Users/Maria/Documents/Bioinformatics/software# kraken2-build --threads 16 -db NCBI --download-library nt
Downloading nt database from server... rsync: read error: Connection timed out (110)
rsync error: error in socket IO (code 10) at io.c(794) [receiver=3.1.3]
rsync: connection unexpectedly closed (184 bytes received so far) [generator]
rsync error: error in rsync protocol data stream (code 12) at io.c(235) [generator=3.1.3]
I am computing van der Waals interactions in Python for a 10-residue peptide in various conformations. The total number of conformations (that is, the number of PDB files) is 300,000. Is it possible, using some Python module, to compute only the 1-4 atom distances for the van der Waals interactions, since the bonded and 1-3 atom distances are irrelevant to van der Waals interactions?
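Because all conformations share one topology, the list of atom pairs can be derived once from the bond list and reused for every PDB file. The sketch below finds pairs by their separation in bonds using a breadth-first search. The function names and the toy five-atom chain are hypothetical. In most force fields the van der Waals sum covers 1-4 pairs (often scaled) and all pairs more than three bonds apart, with only 1-2 and 1-3 pairs excluded.

```python
from collections import deque


def bond_separations(n_atoms, bonds):
    """Return {(i, j): number_of_bonds} for every connected pair, i < j."""
    neighbors = [[] for _ in range(n_atoms)]
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    separations = {}
    for start in range(n_atoms):
        # Breadth-first search gives the shortest path length in bonds.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for j, d in dist.items():
            if j > start:
                separations[(start, j)] = d
    return separations


def pairs_at(separations, n_bonds):
    """Pairs separated by exactly `n_bonds` bonds (3 means 1-4 pairs)."""
    return sorted(p for p, d in separations.items() if d == n_bonds)


def nonbonded_pairs(separations):
    """Pairs that are neither 1-2 nor 1-3 (three or more bonds apart)."""
    return sorted(p for p, d in separations.items() if d >= 3)


# Toy linear chain of five atoms: 0-1-2-3-4 (illustration only).
seps = bond_separations(5, [(0, 1), (1, 2), (2, 3), (3, 4)])
print(pairs_at(seps, 3))
print(nonbonded_pairs(seps))
```

With the pair list fixed, each conformation only needs the distances for those index pairs, which keeps the cost per PDB file low.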
Good day, great scholars. I would appreciate help on how to install Discovery Studio and PyRx on Ubuntu.
I would also like to know how I can access/identify the path of the License Pack for Discovery Studio.
I would like to know how to get all possible DNA sequences that encode every 20-amino-acid frame (20 aa means 60 nt) in protein X, and all possible aa substitutions at each residue of this protein.
I want every 20-amino-acid frame in protein X to be linked with a 10-nt barcode (huge library, 20^20) and then test the expression of this protein using high-throughput sequencing. I would first like to get a few sequences for this system. If this works, I will order libraries that combine the 10-nt barcodes with the 60-nt protein X fragments and then test the expression of the whole library using deep sequencing.
The sequence I want looks like this before cloning it to the vector:
NNNNN-[Barcode]--XXX--[protein X-60nt (20 aa)]-NNNNN
XXX represents 3 restriction endonuclease sites.
Please help me out with any available tools or websites to do this.
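The enumeration described above can be prototyped in a few lines of Python. This is a minimal sketch using the standard genetic code. The function names are hypothetical, and `encodings` is written as a generator because the number of synonymous DNA sequences for a 20-residue window is usually far too large to list in full.

```python
from itertools import product
from math import prod

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (and "*" for stop) to its list of codons.
CODONS = {}
for (b1, b2, b3), aa in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS.setdefault(aa, []).append(b1 + b2 + b3)


def windows(protein, size=20):
    """All overlapping frames of `size` residues in the protein."""
    return [protein[i:i + size] for i in range(len(protein) - size + 1)]


def count_encodings(peptide):
    """Number of distinct DNA sequences that encode the peptide."""
    return prod(len(CODONS[aa]) for aa in peptide)


def encodings(peptide):
    """Lazily yield every DNA sequence that encodes the peptide."""
    for combo in product(*(CODONS[aa] for aa in peptide)):
        yield "".join(combo)


def single_substitutions(peptide, alphabet="ACDEFGHIKLMNPQRSTVWY"):
    """Every variant that differs from the peptide at exactly one residue."""
    variants = []
    for i, original in enumerate(peptide):
        for aa in alphabet:
            if aa != original:
                variants.append(peptide[:i] + aa + peptide[i + 1:])
    return variants


# Toy peptide (not protein X) to show the calls.
print(count_encodings("MKLV"))
print(next(encodings("MKLV")))
print(len(single_substitutions("MKLV")))
```

For a real design, `windows(protein_x)` gives the 20-aa frames, and `count_encodings` shows how quickly the library grows before any sequences are generated. Note that a 10-nt barcode has 4^10 = 1,048,576 possible sequences, which limits how many variants can be tagged uniquely.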
I have performed a Mantel test using the IBD software to test isolation by distance. However, I am confused by the output of the software. Why are there so many analyses for a single dataset, and which is the actual result of the analysis?
I have uploaded the .txt file for reference
I want to perform targeted molecular docking of:
(a) a receptor (enzyme) whose structure is not available and has therefore been built by computational ab initio methods (not homology modeling, since the % identity is very low);
(b) a substrate whose structure is available.
Given that the active sites of the enzyme are also not truly known, but have been obtained from (a) a literature review, or
(b) inferred from cavities/clefts predicted by CASTp, how exactly should I perform targeted molecular docking?
I am trying to create a diagram similar to the one attached below for my ligand. I ran a 100 ns simulation of the ligand-protein complex. I plan to consider interaction fractions for hydrophobic and H-bond interactions only, since my ligand is not charged and the electrostatic contributions are therefore zero.
I was told I can calculate the nonbonded energy and vdW energy using the NAMDenergy plugin in VMD. My files were in GROMACS format, so I converted them to DCD and PSF by saving my trajectory in DCD format and using these two commands in the VMD console to get the PSF file:
set all [atomselect top all]
$all writepsf XXXX.psf
Unfortunately, I got this error from the NAMDenergy plugin:
FATAL ERROR: Structure (psf) file is either in CHARMM format (with numbers for atoms types, the X-PLOR format using names is required) or the segment name field is empty.
When I checked the PSF file, it was already in X-PLOR format and the segment name field is there as well, so I don't know why it is giving me this error. Here is part of my PSF file:
REMARKS VMD-generated NAMD/X-Plor PSF structure file
1 1 GLU N N 0 14.007 0
2 1 GLU H1 H1 0 1.008 0
3 1 GLU H2 H2 0 1.008 0
4 1 GLU H3 H3 0 1.008 0
5 1 GLU CA CA 0 12.011 0
6 1 GLU HA HA 0 1.008 0
7 1 GLU CB CB 0 12.011 0
8 1 GLU HB1 HB1 0 1.008 0
9 1 GLU HB2 HB2 0 1.008 0
So the questions are:
1- How can I solve this error so that I can calculate the nonbonded and vdW energies?
2- How can I calculate the energy for H-bonds so that I can break it down into fractions as well?
3- Is there an alternative way to calculate these energies or to create such a graph?
Thanks in advance
The following error showed up while I was converting some small molecules from SDF to PDBQT with Open Babel in PyRx:
<<bound method VSModel.PrepareLigandMol of <PyRx.vsModel.VSModel instance at 0x091D8EE0>>
I do not have much experience in bioinformatics and I need to find the common genes in several gene expression datasets; in other words, I need to find genes that appear in all (or some) of my datasets. I am looking for some kind of tool that gives me Venn diagrams with the shared genes. Any suggestion (free software, please) would be much appreciated.
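The overlap itself is a set operation, so it can be computed before choosing a drawing tool. Here is a minimal sketch in Python, assuming each dataset has already been reduced to a list of gene identifiers. The function names and the toy gene lists are illustrative only.

```python
def shared_genes(datasets):
    """Genes present in every dataset."""
    sets = [set(genes) for genes in datasets.values()]
    return set.intersection(*sets) if sets else set()


def venn_regions(datasets):
    """Map each combination of dataset names to the genes found in
    exactly those datasets (the regions of a Venn diagram)."""
    membership = {}
    for name, genes in datasets.items():
        for gene in set(genes):
            membership.setdefault(gene, set()).add(name)
    regions = {}
    for gene, names in membership.items():
        regions.setdefault(frozenset(names), set()).add(gene)
    return regions


# Toy gene lists (made up for illustration).
data = {
    "study_A": ["TP53", "MYC", "EGFR"],
    "study_B": ["TP53", "EGFR", "KRAS"],
    "study_C": ["TP53", "BRCA1"],
}
print(sorted(shared_genes(data)))
for names, genes in venn_regions(data).items():
    print(sorted(names), sorted(genes))
```

The region sizes from `venn_regions` are the numbers that appear in a Venn diagram, so they can be fed to any plotting tool. Gene identifiers should use one naming scheme across the datasets before comparing.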
How can I check which isoform of a gene is expressed in a particular cell line? For example, a gene has 3 isoforms, and I want to know which isoform is predominant in HeLa cells.
I have tried PDBePISA to calculate the free binding energy for the protein-DNA docked complex structure. However, I am still looking for the other programs (online available) that can give a comparison of the binding energy. Please suggest some reliable online server that can give those values.
Thank you in advance
What would be the most suitable tool for candidate gene investigation of a multifactorial disease using Whole-exome sequencing (WES) data?
Although the Exomiser is mostly used for rare Mendelian pattern diseases, could it be used for multifactorial diseases, changing the frequency filter, for example?
Hello. I am trying to run a haplotype analysis in PopArt. It's going well until I realized I can not load a previous work in PopArt. I can only export the graphical output as .svg, .png, or .pdf but not as a "network" file which I can reload or edit if I want to in the future. I noticed that it can be saved as a .nex file and the new file actually had additional lines (the portion of the code started with: "Begin NETWORK"). I think this is supposed to be read by PopArt but it fails to do so. I encounter parsing errors when I try to run the new file. I am not sure if there is a way around this as I am new to the software. Any help would be appreciated. Stay safe, anon!
In the R programming language, I'm going to install the MetaDE package. Nonetheless, I get a warning that package 'MetaDE' is not available for this version of R, A version of this package for your version of R might be available elsewhere. How can I overcome this issue while I'm using R version 4.1.0?
I have a set of Ramachandran angles. I wish to make a Ramachandran plot from them in Python, with the standard allowed-region contours in the background. I couldn't do that in Python. If it cannot be done in Python, is there any other software where I can do this?
I am trying to simulate an interaction between two proteins and I cannot seem to get the free Maestro (academic use) to run it. Therefore, I am looking for free/open source software to do the simulation, preferably something with a graphical interface.
Thank you very much in advance.
I have used P2Rank in the PrankWeb software and the CASTp tool to analyze the refined structures of some proteins to predict protein pockets and cavities. But now I am not finding any clue to visualize them in PyMOL.
The application of bioinformatics in medicine is a key factor in the advancement of modern medical technologies.
In which areas of medical technology are the technological achievements of bioinformatics used?
What are the applications of bioinformatics in medicine?
I invite you to the discussion
Thank you very much
I want to see whether my identified gene members are produced through gene duplication and what kind of duplication happened.
Could someone give a brief overview of the parameters that should be covered in a docking study? Please also mention some open source software for it!
The gene that I work on has several large deletions in patients. I want to analyse the impact on the protein domains and also represent it in a figure.
I have some files in bed and bedgraph format to analyze with IGV. My team and I tried to load them into IGV following the tutorials on the IGV site, but it hasn't worked. The bedgraph files are large (5157), so we converted them to the binary .tdf format using the IGVTools "Count" command, but that hasn't worked either. With only some files we see a single flat line on the IGV screen, without any information. With FilexT we can see that the bed and bedgraph files are not damaged.
We think that the problem is in the step where we select the "Load from File" option in IGV. What can we do?
We use the IGV_2.10.3
I am trying to install BaliScore on BioLinux (software for comparing multiple sequence alignments against benchmark sequences). I get multiple error messages during the BaliScore installation: "error while loading shared libraries: libexpat.so.0"
I have data (shown in the attached picture) in which the RNA-seq data for various samples lists the same gene twice.
Now suppose that for sample 1 I want to measure the gene (which is haplotypic in nature): how do I treat its RNA-seq values for sample 1? Do I take the average or the median, or should I consider the two versions of the gene as separate genes? I guess biologists could give better explanations.
Suppose I have a peptide sequence of 400 amino acid and there is a particular domain (suppose a DNA binding domain). I am looking for any software that will take the peptide sequence as input and give an option to specifically download the sequence of the domain in the peptide sequence.
I just installed a new RTX 2080 in an old PC (i7 980X, 12 CPU cores), and downloaded and installed the latest NVIDIA drivers and the CUDA toolkit for Ubuntu. I also downloaded the NAMD build with CUDA acceleration, but when I run a molecular dynamics simulation the GPU usage only goes up to 15% while the CPU usage is 100%. Is there any way to make NAMD use more of the GPU?
The saturation mutagenesis would be conducted in silico. I would like to know if it will be considered as directed evolution or otherwise such as site directed mutagenesis. Kindly attach the literature or source of your answer if possible. Thank you.
I would like to know if you know a tool to visualize in diagram the cell signaling pathways (e.g.: EGFR > ras> raf > mek > erk ; PI3K AKT mTor etc ...) starting from the receptor to the activation of genes.
Like this diagram but in a bioinformatics tool that groups them all together.
I know the Reactome with cytoscape tool, but the diagrams it proposes are a little bit confusing I think
I thank you in advance for your help!
I have transcriptome/genome data and want to list all the members of the KCS/CER gene family with the domains AP2/ERF and FAE1_CUT1, according to the Arabidopsis database. Is there any tool or online database that I can use to perform a quick search for all members?
If so, please share basic step-by-step guidelines.
In the ped format for genotype, alleles of any SNP are represented by two columns (one for each allele, separated by a space).
For haplotype data, is a single column sufficient to represent the allele of each SNP?
What is the difference between genotype and haplotype data in the ped format?
I appreciate your help.
I am trying to construct a multi-layer fibril structure from a single layer in PyMol by translating the layer along the fibril axis. For now, I am able to use the Translate command in PyMol to move the layer along the fibril axis to make the next layer, and then repeat this step to make other layers.
For example, to make a 4-layer fibril from the original fibril layer (chains D and J of PDB structure 2LMO), my commands are:
fetch 2lmo; hide all;
create layer1, chain D+J; translate [0.02, 0.25, 4.47], layer1, camera = 0;
create layer2, layer1; translate [0.02, 0.25, 4.47], layer2, camera = 0;
create layer3, layer2; translate [0.02, 0.25, 4.47], layer3, camera = 0;
create layer4, layer3; translate [0.02, 0.25, 4.47], layer4, camera = 0;
as cartoon, (chain D+J) + layer1 + layer2 + layer3
The result is shown in the attached PyMol session file.
From the Internet, I learned about the Iterate and Alter commands in PyMol, but they are intended for repeating an operation "on the atoms in a selection", not for repeating an operation "for multiple times". Therefore, I am wondering if there is a way to repeat the translation (or other operations in general) in PyMol by iteration or other methods. I need to construct a fibril structure of 10 - 20 layers and apply this method to different protein fibrils, so the automation of this process will help a lot.
Thank you for your help in advance!
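Since PyMOL scripts can contain Python, one option is to generate the repeated create/translate commands in a loop. The sketch below only builds the command strings, mirroring the commands listed in the question; the helper name `fibril_commands` is hypothetical, and unlike the original last line it shows every generated layer.

```python
def fibril_commands(n_layers, shift, source="chain D+J", pdb_id="2lmo"):
    """Return PyMOL command strings that stack `n_layers` copies of
    `source`, each translated by `shift` relative to the previous one."""
    vector = "[{}, {}, {}]".format(*shift)
    commands = [f"fetch {pdb_id}", "hide all"]
    previous = source
    for i in range(1, n_layers + 1):
        layer = f"layer{i}"
        # Copy the previous layer, then move the copy along the fibril axis.
        commands.append(f"create {layer}, {previous}")
        commands.append(f"translate {vector}, {layer}, camera=0")
        previous = layer
    layers = [f"layer{i}" for i in range(1, n_layers + 1)]
    commands.append("as cartoon, " + " + ".join([f"({source})"] + layers))
    return commands


# Same translation vector as in the question, for a 10-layer fibril.
for line in fibril_commands(10, (0.02, 0.25, 4.47)):
    print(line)

# Inside PyMOL, the commands could be executed directly instead:
#   from pymol import cmd
#   for line in fibril_commands(10, (0.02, 0.25, 4.47)):
#       cmd.do(line)
```

For a different fibril, only the PDB ID, the source selection, and the translation vector need to change.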
I am looking for a command that will merge the 3 chains present in the original PDB into a single chain and then renumber all of the residues. I have tried using the alter command, but when I export the PDB I get only one chain (of the initial trimer) and not the merged chain.
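Outside the viewer, the relabeling can also be done directly on the PDB text, because chain ID and residue number sit in fixed columns of ATOM/HETATM records. The following is a minimal sketch under that assumption. The function name `merge_chains` and the toy coordinates are hypothetical. It only relabels atoms, it does not create bonds between the former chains, and it leaves atom serial numbers unchanged.

```python
def merge_chains(pdb_text, new_chain="A", first_residue=1):
    """Give every ATOM/HETATM record the same chain ID and renumber
    residues consecutively in file order. TER records are dropped."""
    out = []
    counter = first_residue - 1
    last_key = None
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            # Columns 22, 23-26, 27: chain ID, residue number, insertion code.
            key = (line[21], line[22:26], line[26:27])
            if key != last_key:
                counter += 1
                last_key = key
            line = f"{line[:21]}{new_chain}{counter:>4d} {line[27:]}"
            out.append(line)
        elif record == "TER":
            continue
        else:
            out.append(line)
    return "\n".join(out) + "\n"


# Toy trimer fragment built with fixed-width fields (made-up coordinates).
TEMPLATE = ("ATOM  {serial:>5d}  {name:<3s} {res:>3s} {chain}{num:>4d}    "
            "{x:8.3f}{y:8.3f}{z:8.3f}")
atoms = [
    (1, "N", "GLY", "A", 1), (2, "CA", "GLY", "A", 1), (3, "N", "ALA", "A", 2),
    (4, "N", "GLY", "B", 1), (5, "N", "GLY", "C", 1),
]
demo = "\n".join(
    TEMPLATE.format(serial=s, name=n, res=r, chain=c, num=i, x=0.0, y=0.0, z=0.0)
    for s, n, r, c, i in atoms
)
merged = merge_chains(demo)
print(merged)
```

Residue numbers above 9999 would overflow the four-column field, so this sketch suits small to medium structures.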
I'm looking for software (free if possible) or a tool that allows me to predict the 3D structure of a transmembrane protein. The thing is, I know where the TM domain is, and I would like to predict the complete 3D structure using that information.
I usually use Chimera as a viewer and PSIPRED/Phyre2 as predictors, but I don't know how to force them to view/predict the 3D structure while treating a particular region as "transmembrane".
I'm new to the signalling pathways field. I have a list of proteins that have been upregulated in a cell which has been exposed to a protein factor. I would like to find out how, if at all, all these proteins are related. Is there a website that allows me to enter the list of these proteins and then it generates a signalling pathway that links these proteins together? Thanks.
I am trying to measure telomere length in an ant species using the TRF method, which is a Southern blot technique. Currently, I am struggling with analyzing the images and would appreciate any tips and suggestions on how to statistically analyze my data.
- Which software do you recommend to analyze the length?
- Any tips on how to image my membrane to get accurate results and is there anything to avoid while imaging?
I have attached an image for reference!
Thank you in advance and much appreciated
Hi everyone, I am using the pyTMs plugin in PyMOL to phosphorylate a particular threonine in my protein. I am having difficulty selecting only one or two residues; it seems I can only phosphorylate all serines/threonines. Does anyone know how to fix this?
I was trying to find a plasmid origin of replication. Ori-Finder did not find it, and I also tried to BLAST against their database and to align to other similar bacteria, but with no success.
Can someone recommend a fairly simple program for someone who is not a bioinformatician?
I was also looking into GC skew analysis; if someone can recommend a program for that as well, I would appreciate it.
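GC skew is simple enough to compute without a dedicated program. The sketch below uses the usual definition, (G - C) / (G + C) per window, and its running sum. The function names and window size are illustrative assumptions. In many bacterial chromosomes the minimum of the cumulative skew tends to lie near the origin, but this is a heuristic and may be less reliable for plasmids.

```python
def gc_skew(seq, window=1000, step=None):
    """Return (window_start, skew) pairs with skew = (G - C) / (G + C)."""
    step = step or window
    seq = seq.upper()
    result = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # Windows without any G or C get a skew of zero.
        skew = (g - c) / (g + c) if g + c else 0.0
        result.append((start, skew))
    return result


def cumulative_skew(skews):
    """Running sum of the windowed skew values."""
    total = 0.0
    out = []
    for start, value in skews:
        total += value
        out.append((start, total))
    return out


# Toy sequence; a real plasmid would use a larger window such as 1000.
values = gc_skew("GGGGCCCCGGCC", window=4)
print(values)
print(cumulative_skew(values))
```

This treats the sequence as linear, so for a circular plasmid the result depends on where the sequence file starts. Plotting the cumulative values against position makes the minimum and maximum easy to spot.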
I have transcriptome sequences of parasites found in dogs. I use blood from the dog to amplify the parasite, so I would like to make sure that the sequences I am using come from the parasite and not from the dog.
Which bioinformatics software can I use and how do I go about it?
I have installed the program on the C drive, in C:\ligplot. This folder contains the .exe and .cmd files of ligplot, ligonly, dimplot and dimonly. After installation I downloaded the example PDB 1A8A from RCSB and put this PDB file in the pdb folder within the ligplot folder. Then I ran the ligplot command as described in the manual. I get an error that says it cannot find the coordinate file. Can anybody help me in this regard? What is the problem?
The results show me the presence of plasmids and compare them with plasmids from the database, but I can't find their locations. Thank you!