Metagenomics - Science method
The genomic analysis of assemblages of organisms.
Questions related to Metagenomics
How can I effectively perform metagenomic assembly on large Illumina sequencing data from environmental samples, given that I have already completed quality control but encounter issues due to the data size?
Hello ResearchGate community,
I am looking for a statistician with experience in metagenomic data analysis to assist with a research project. The data involves genotypic diversity within microbial profiles, and we require statistical expertise to ensure accurate and robust analysis. Specifically, I am seeking someone who is skilled in handling large datasets and can provide insights through advanced statistical methods.
If you have expertise in this area or know someone who does, please feel free to reach out. I’d be happy to discuss further details regarding the project and potential collaboration.
Thank you in advance for your support and recommendations.
I have FASTQ files for metagenomic analysis of the 18S rRNA gene, and I began to analyze them using DADA2 in R, following the DADA2 tutorial (https://benjjneb.github.io/dada2/tutorial.html). However, I stopped at the taxonomy assignment step because my laptop overheated, which suggests the device's performance is insufficient to complete the analysis.
My question is: Is there any way to complete the analysis and obtain the taxonomy results while using my laptop? Or is there an alternative to DADA2 that can yield similar results? I would appreciate your expert opinion.
There is growing interest in techniques like single-cell genomics and metagenomics to study microbes that are difficult to culture using traditional methods.
Dear All,
We are looking to buy a sequencer for NGS metagenomic analysis and whole-genome sequencing. We would really like your feedback and user experience on which is better.
No doubt Illumina has been on the market for a long time and hence has a bigger user base, but our budget is limited. The machines we can afford are the Illumina iSeq 100 or the Nanopore P2 Solo (PromethION).
We are working on antibiotic resistance in wastewater and sewer systems.
Any response would be very helpful for us.
Thanks in advance
Kanchan
I'm doing research on metagenomics and I want to find the source of the command 'vsearch merge-pairs', which used to be 'vsearch join-pairs'.
How can I cite that?
Hi there,
I have some 16S and metagenomic sequencing data from mouse fecal samples and I want to upload them to the NCBI Sequence Read Archive (SRA). I don't know which NCBI package is suitable for my animal fecal samples.
What packages should I choose to complete my submission?

I am doing a metagenomic data analysis.
The data are from cell-free DNA of AML patients who have sepsis.
It is Illumina NovaSeq paired-end data.
When I used various aligners such as minimap2 and bowtie2, I got the number of mapped reads for each taxon.
What is the best way to estimate species abundance from the number of mapped reads?
I also have the reference sequence length for each species.
Some say abundance = mapped reads / reference sequence length.
I would like to know whether there is any literature on estimating abundance in a way that gives a more dynamic and robust quantitative value.
Happy to engage!
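The formula quoted in this question (mapped reads divided by reference length) can be written out as a short Python sketch. It goes one step further and rescales the length-normalised values so they sum to 1 within a sample, which is one common way to make samples comparable, not the only one. The function name, taxon labels, and numbers are hypothetical and chosen only for illustration.

```python
def relative_abundance(mapped_reads, ref_lengths):
    """Length-normalised relative abundance per taxon.

    mapped_reads: dict of taxon -> number of mapped reads
    ref_lengths:  dict of taxon -> reference sequence length in bp
    """
    # Reads per base: corrects for longer references attracting more reads.
    per_base = {t: mapped_reads[t] / ref_lengths[t] for t in mapped_reads}
    total = sum(per_base.values())
    if total == 0:
        return {t: 0.0 for t in per_base}
    # Rescale so the values sum to 1 across the taxa in this sample.
    return {t: value / total for t, value in per_base.items()}


# Hypothetical read counts and genome lengths, for illustration only.
reads = {"taxon_A": 5000, "taxon_B": 5000}
lengths = {"taxon_A": 5_000_000, "taxon_B": 2_500_000}
abund = relative_abundance(reads, lengths)
```

With equal read counts, the taxon with the shorter reference ends up with the larger share, which is the effect the length correction is meant to capture. Reads that map to several references are not handled here.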
Does anyone have any advice regarding 16S metagenomics and an appropriate sequencing depth and paired-end reads? We are expecting around 10-15 organisms within the samples.
Thank you :)
I am a Nigerian researcher working on the microbial diversity of an African fermented food product. I would appreciate it if I could get a location and price for shotgun metagenomic sequencing.
Hi everyone!
I hope you all are fine.
I have tried many tutorials on calculating alpha and beta diversity; however, my RStudio is throwing up many errors. Packages are not installing because of compatibility issues, but that is another discussion. May I ask you all to kindly guide me in calculating these diversity indices?
Which R script should I follow to calculate the alpha and beta diversity indices? People have told me to use vegan, but how do I go about it?
P.S. My input taxonomic data comes from Kraken2, so I have sample_kraken_report.txt for all my samples.
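For orientation, the two quantities most often meant by "alpha and beta diversity" can be sketched in plain Python: the Shannon index for alpha diversity and the Bray-Curtis dissimilarity for beta diversity. These are the defaults of vegan's `diversity()` and `vegdist()`. The question asks for R, so this is only a language-neutral illustration of the arithmetic. The parser assumes the standard six-column, tab-separated Kraken2 report (percentage, clade reads, direct reads, rank code, taxid, name); the function names and the example lines are hypothetical.

```python
import math


def species_counts(report_lines, rank="S"):
    """Collect clade read counts for one rank from Kraken2 report lines."""
    counts = {}
    for line in report_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # skip lines that do not look like report rows
        if fields[3] == rank:
            counts[fields[5].strip()] = int(fields[1])
    return counts


def shannon(counts):
    """Shannon index H' = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total)
                for c in counts.values() if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples (taxon -> count)."""
    taxa = set(a) | set(b)
    differences = sum(abs(a.get(t, 0) - b.get(t, 0)) for t in taxa)
    totals = sum(a.get(t, 0) + b.get(t, 0) for t in taxa)
    return differences / totals


# Hypothetical report rows for two samples, for illustration only.
sample_a = species_counts([
    "50.00\t10\t10\tS\t1\t  Species x\n",
    "50.00\t10\t10\tS\t2\t  Species y\n",
])
sample_b = species_counts([
    "50.00\t10\t10\tS\t1\t  Species x\n",
    "50.00\t10\t10\tS\t3\t  Species z\n",
])
alpha_a = shannon(sample_a)
beta_ab = bray_curtis(sample_a, sample_b)
```

In practice the report lines would be read from each sample_kraken_report.txt file, and samples are usually rarefied or otherwise normalised before comparison; that step is left out here.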
Full title: metagenomics and alternative approaches for the fundamental study and exploitation of telluric microflora
Does "alternative approaches" here mean metaproteomics and metatranscriptomics?
Hi all..
I am seeking advice on the preferred method for sequencing a bacterial 16S metagenome: Illumina platform or Nanopore method? Any insights or experiences would be appreciated. #sequencing #metagenomics #researchmethods
In a metagenomic library profile, the desired size is 450-550 bp, but I am getting unwanted fragments, which cause the library to fail or make it difficult to get the desired data after sequencing.
I have red poultry mite samples which have been stored in ethanol. Now I need to perform RNA and DNA extraction followed by shotgun metagenomic sequencing. I would like to know if there is a way to successfully extract the nucleic acids so that I have no inhibition or problems during the sequencing run. Your opinion and experience are highly appreciated.
I have been able to use the Kraken2 pipeline for analysing the nanopore reads and derive a KronaPlot for metagenomics. Since KronaPlot also gives a list of reads from a sample that is associated with the species identified, I am able to find those reads from my sample fasta file. Now, when I nucleotide BLAST those reads in NCBI, they don't identify the same species that the Kraken2 pipeline has output. Also, if I try to assemble the reads to the species identified by Kraken2, they don't align.
What am I doing wrong here, or did I miss identifying some important concept?
I am using Standard Kraken2db (All bacteria, fungal, viral and human) available here
What is the best way to perform metagenomic analysis using multiple rDNA amplicons (e.g. multiple 18S and ITS for each sample)? What methodology would you use to reliably normalize and integrate the data from individual amplicons, given variable amplicon length between taxa and variable library sizes? Can we employ the Kraken or Bracken classifiers? Thank you in advance for any advice.
When writing a review, published articles are usually collected from popular data sources such as PubMed, Google Scholar, Scopus, etc.
My questions are:
1. How can we confirm that all the articles published in a certain period (e.g., 2000 to 2020) are collected and considered in the sorting process (inclusion and exclusion criteria)?
2. When the articles are not open access, how can we minimize the challenges of understanding the data for the meta-analysis?
Hello,
I recently received metagenomic 16S rRNA gene sequence data from a company, which includes both raw reads, and clean data with barcodes removed. My goal is to analyze these sequences and obtain information on the taxonomic diversity and abundance of the species present in the sample.
Since I use a Windows system and cannot utilize Mac or Linux, I would greatly appreciate guidance on how to proceed with this analysis. Are there any web server-based applications available that can assist with this task?
Furthermore, if there are any researchers or experts interested in this project, I would be grateful to explore potential collaborations. Please feel free to reach out to me if you are interested or have any recommendations.
Thank you in advance for your assistance.
Best regards
We carried out surveillance of ARGs. Now that we have the metagenomics results, please suggest an appropriate way to explore and compile them.
I'm planning to conduct metagenomic analyses on DNA extracted from soil using the ITS and 16S primers. However, I'm getting bands at 2 kb or 3 kb on the gel, even though my ratios and DNA concentration are within the acceptable range.
The 260/280 ratios are approximately 1.9, the 260/230 ratios are approximately 2, and the DNA concentration measures around 250-300 ng/ul.
I was expecting a thicker band above 10 kb. I'm concerned about the quality. Should I proceed with the metagenomic analyses or should I extract the DNA again with a modified method? Is there too much smearing? I use the DNeasy Pro kit for soil extraction.
Info:
Primers: Crude DNA extracted from soil with the Qiagen DNeasy Pro kit
Gel %: 1% gel
Ladder full name: Lambda DNA/HindIII Marker and 1 kb Plus
Time and Voltage: 30 min at 80 V and 1 h at 80 V
Sample Volume and concentration: 1 ul of 6x DNA Loading Dye and 5ul of DNA, 1A 1:1:4
Ladder conc.: 1 ul of 6x DNA Loading Dye and 5ul of Ladder and 1:1:4. 1Kb is 1:5


What is the difference between gene abundance in metagenomics and expression value in metatranscriptomics?
How do scientists use metagenomics and high-throughput sequencing technologies to study microbial communities and uncover novel microbial species and functions?
As in the title. The fecal samples were stored in a -80 °C freezer for several years. I want to know whether they are suitable for 16S rRNA or metagenomic sequencing to analyze the gut microbiota.
Please suggest a website, API, or other solution for running sequence similarity searches on many (all available) metagenomic libraries without downloading the metagenomic reads or assemblies.
Thank you
I have some soil samples from wheat rhizosphere. Which physical parameters may I check for a metagenomic analysis? Will approximately 20 grams of soil be enough for all tests?
Which method is more successful and reliable in identifying microbiota:
metataxonomics or shotgun?
I am working on metagenomics datasets. I have six disease datasets and need to find disease-specific genes and proteins for comparative analysis. I would like to know whether we can find disease-specific genes and proteins with a metagenome rather than a metatranscriptome. Thank you in advance.
How is abundance calculated for genes annotated against a reference database such as MEGARes, when the contig file generated by de novo assembly is aligned to the database using BLAST? Does the abundance refer to the number of contigs annotated as a specific gene, or is it calculated some other way?
Hello to all,
I need to prepare a standard curve for metagenomic analysis with qPCR of Treponema denticola and Pseudoramibacter alactolyticus, using the 16S rRNA copies of my DNA.
I must also prepare a standard from a mock microbial community purchased from ZymoBIOMICS.
Since this is the very first time I have come across this topic, I sought help from a colleague, and she has helped me a lot. BUT in her calculations of 16S rRNA copies and bacterial population she used the Applied Biosystems guide, where the molecular weight of DNA is reported as 660 g/mol, and according to her calculations the DNA mass is equal to 9,13*10^20 bp/ng.
I asked the Zymo technical team and they replied that I should use the following formula: 6,022 x 10^23/10^9/650, which equals 9,26462E+11 bp/ng. At this point I am totally stuck and cannot proceed with my calculations.
In your experience, could you please help me? Should I take the molecular weight of double-stranded DNA to be 650 or 660?
Why do I obtain 9,26462E+11 and my colleague 9,13*10^20?
Thank you
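The arithmetic in this question can be checked directly. The sketch below evaluates the formula quoted from the Zymo reply for both molecular weights, and also evaluates Avogadro's number divided by 660 without the nanogram conversion. That last value comes out near 9.1 x 10^20, so one possible reading (an assumption, not something stated in the question) is that the colleague's figure is in bp per gram rather than bp per nanogram. The function and variable names are hypothetical.

```python
AVOGADRO = 6.022e23  # molecules per mole, as used in the quoted formula


def bp_per_ng(mw_per_bp):
    """Base pairs of double-stranded DNA in 1 ng, given g/mol per base pair."""
    return AVOGADRO / 1e9 / mw_per_bp


def template_copies(mass_ng, length_bp, mw_per_bp=650.0):
    """Copies of a dsDNA template of length_bp contained in mass_ng of DNA."""
    return mass_ng * bp_per_ng(mw_per_bp) / length_bp


per_ng_650 = bp_per_ng(650)      # about 9.26e11 bp/ng, the Zymo figure
per_ng_660 = bp_per_ng(660)      # about 9.12e11 bp/ng
per_gram_660 = AVOGADRO / 660    # about 9.12e20 bp/g, no ng conversion

# Relative difference between using 650 and 660 g/mol per bp.
relative_gap = (per_ng_650 - per_ng_660) / per_ng_650
```

Under these numbers the choice between 650 and 660 g/mol changes the result by roughly 1.5%, whereas a missing factor of 10^9 between grams and nanograms changes it by nine orders of magnitude.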
I would like to apply metagenomic testing routinely to my soil throughout planting cycles to diagnose and predict harmful pathogens that might affect crop health and overall yield. How often should I take a soil sample?
My colleague and I are planning to do a culture-independent study on identifying specific bacteria in a river system. We just have some questions before we undertake this study.
1. If we happen to sample pathogenic bacteria, do we need to work in a BSL-2 laboratory?
2. What is the general procedure for trying to identify specific bacteria? Do we need to perform DNA extraction, cultivation, etc.? We are planning to perform 16S rRNA metagenomic analysis and are scouting sequencing centers around our country.
How do I run LEfSe and interpret the results for soil microbes obtained through metagenomic high-throughput sequencing?
I have microbiome sequencing data and I have analysed it to determine which taxa are up or down-regulated with my treatment, however, this doesn't really tell me much about the metabolic changes.
Is there anything I can use that is similar to the Qiagen IPA where I can place my metagenomics data and it can tell me if there are any changes in metabolic pathways?
Any links and tutorials would be greatly appreciated!
Hello, colleagues,
I'm trying to estimate the distribution of a certain kind of microorganism (eg Methanobacterium) in a certain environment (permafrost) from metagenomic data presented on various online services (NCBI, MG-RAST, https://microbeatlas.org). Such an analysis can be done on MG-RAST, but data are scarce. The service https://microbeatlas.org gives good data, but they do not correspond to the entire genus, but only to selected reference genomes. The situation is complicated by the fact that my computer is not very powerful and I cannot download 10,000 metagenomes and analyze them myself. Can you please tell me if it is possible to carry out such an analysis online, at least partially, with post-processing on a computer?
I asked a similar question before but I'm here again. I'm working on urine samples to perform shotgun metagenomics. The biggest problem is that my A260/A280 is below 1.8 (about a third of the samples are below 1.0). I cannot redo the entire procedure since I ran out of raw sample. Is there any way for me to improve A260/A280?
Someone recommended a Zymo kit, but doesn't a Zymo kit require raw sample? If not, which Zymo kit should I use?
I work in cancer research and human disorders using a bioinformatics approach. These projects involve the analysis of transcriptomic data such as microarray, RNA-seq, TCGA, systems biology analysis, survival analysis, etc. Metagenomic analyses in the microbiome field are also conducted. Those interested in participating in analyses and writing articles are invited to send their CV to the email below.
email: qiimenigeb@gmail.com
What would you suggest as the most efficient method for virus concentration in DNA extraction from soil for shotgun metagenomics assay? In a situation where you have soil and shoot samples for DNA extraction, targeted for metagenomics.
I have done 16S metagenomic analysis (16S rRNA gene targeted amplicon data) for soil samples. I have also characterized my samples for their geochemical parameters. I want to study the ordination between the two datasets, but I don't know which values to input from the metagenomic sequence results. In papers, they have used OTU abundance (Picture attached and the article too).
The results of metagenomic analysis are a large amount of data. Please tell me what minimum percentage value is significant (0.1%, 0.2%, 0.3%, ...) at the level of phylum, class, order, family, and genus.
I'm a grad student working on metagenomics and I ran into some issues with the samples. I collected urine samples and extracted DNA using DNeasy Blood & Tissue kit. I used NanoDrop to measure DNA concentration and ran into different issues. The problem is that the A260/280 level is too low (not to mention the contamination) and I don't know how to increase the purity level. I don't think I can collect the samples again, or at least it will take a while before I can do that again. Is there a way or a kit I can use to increase the purity level of my DNA extracts without significantly lowering DNA concentration?
Hi all,
we would like to perform metagenomics on plant tissue as well as modern soil samples. We are not experienced with these analyses on either plant material or modern soil. Does anyone have recommendations for sequencing depth or tips on how to choose the appropriate sequencing approach?
Thank you very much in advance!
Cheers,
Barbara
I'm using Kraken2 to generate a custom database to run a shotgun metagenomics (microbiome) dataset against for associated host removal (the custom database is the host). The database builds fine, but the "--unclassified-out" .fastq files have "x"s added to them (perhaps instead of Ns), and these are not readable in my downstream applications with qiita/qiime2. Why is this happening? I can't figure it out. Interestingly, I have done this procedure before and have not had this problem. Any thoughts as to what is going on? The output in my text editor looks like this (note the "x"s in the sequence):
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,,F:,::,,,:,FF,,:,,::FF,FF,FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00223:522:HW3NCDSXY:1:1101:8992:1000 1:N:0:GTGCACGA+GCCTATCA
AGTGCACCTCAACCCTATGTATTGTGGACCTATACCAGTCCTTATAGAATGGCAGACTGTACCTCACTCAAxCCTATGTACxGAGATCGGAAGAGCACACGTCTGAACTCCAGTCACGTGCACGAATCTCGxxTxxCxTCxTxTxxTTxxA
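If the downstream tools reject the reads only because of the unexpected characters, one stopgap is to rewrite the sequence lines so that anything other than A, C, G, T or N becomes N. The Python sketch below does that under two loudly stated assumptions: that the "x" characters stand for masked or unknown bases, and that the file uses standard four-line FASTQ records. It does not explain why the characters were written in the first place, and the function name and example record are hypothetical.

```python
import re


def replace_masked_bases(fastq_lines):
    """Return FASTQ lines with non-ACGTN sequence characters replaced by N.

    Assumes four-line records: header, sequence, '+', quality.
    """
    cleaned = []
    for index, line in enumerate(fastq_lines):
        if index % 4 == 1:  # second line of each record is the sequence
            sequence = line.rstrip("\n")
            sequence = re.sub(r"[^ACGTNacgtn]", "N", sequence)
            line = sequence + "\n"
        cleaned.append(line)
    return cleaned


# Hypothetical shortened record, for illustration only.
record = ["@read1\n", "ACTCAAxCCTATGTACxGAG\n", "+\n", "FFFFFFFFFFFFFFFFFFFF\n"]
fixed = replace_masked_bases(record)
```

Because each replacement is one character for one character, the sequence and quality strings keep the same length. A cleaner alternative would be to pull the unclassified read IDs from this output and extract the untouched reads from the original FASTQ files.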
Hi,
I got the shotgun sequencing data for my samples and ran RGI on them to check which ARGs are present, using the CARD database. Now I want to calculate the abundance of ARGs. Could I get any help or guidance on doing so?
Regards
Yasir
I did DNA extraction using the PowerSoil DNA extraction kit; the yield was very low, with a high level of inhibitors.
I checked the publications and found a recommendation to use a one-hour incubation in tissue lysis buffer and Proteinase K from Qiagen’s DNeasy Blood and Tissue kit, and then complete the extraction with the PowerSoil kit. However, I don't know how to do this step or how much enzyme I should use.
I am having a problem with 16S amplification on colon content DNA from the DSS-treated group. I have found that DSS is a polymerase inhibitor and tried to purify the DNA with LiCl and glycogen-ethanol precipitation. I have also tried diluting my stock DNA before PCR. However, there is no successful 16S amplification. I have attached a gel electrophoresis result (TapeStation report) of the PCR products of my samples and E. coli as a control for your reference. Only the control showed a 200 bp band; none of my samples showed amplification. Any suggestions would be greatly appreciated.

Dear all,
I am looking for a user-friendly pipeline for metagenomics analysis using COI (cytochrome oxidase I) as a marker. I have tried some that are based on Python and run through Docker, but they are not clear enough to be used with students. Could anyone recommend an alternative?
Thank you in advance.
I am looking for R packages to predict functional profiles from metagenomic 16S rRNA data, as an alternative to PICRUSt in Python. I found Tax4Fun, but there are only a few examples of how to apply it. Does anyone know other similar and well-documented packages?
Hi,
I'm new to shotgun metagenomics and just realized that there are many databases available for analyzing the data, but no clear way to tell which one is better suited to which environment.
Which one to use for human gut shallow shotgun metagenomics data? (not looking to do de novo genomes, just to map to existing curated genomes..)
Looking at taxonomy databases for now, like GTDB, Web of Life, UHGG catalog....
Thank you for any advice!
Dear researchers
In my project, I work on the characterization of the Moroccan marine microbiota by metagenomics, within the discipline of microbial ecology.
During the sampling, we could not take all the in situ measurements of the physico-chemical parameters of the studied microbiome. That's why we resorted to extracting spatial data from the NASA website to complete them.
My question is: do we have the right to combine our own data with data from a database in order to analyze them, with regard to ethical and copyright aspects, given that these data are made publicly available with open access?
I am planning to run 16S metagenomic sequencing on libraries prepared from the colon content of C57BL/6 mice to understand the gut microbiota diversity of the control group and the DSS-induced colitis group. I am having a problem with very low library concentration in some samples from the DSS group after quantifying by qPCR. However, the Qubit showed a considerable library concentration. I have repeated the library preparation on the same samples, increasing the DNA input and the number of PCR cycles, but the result is still the same. Can I dilute the libraries based on the Qubit concentration for further sequencing? Can I use fecal samples instead of colon samples to prepare libraries for those samples? Any information would be greatly appreciated. I have used the following kits for library preparation and quantification.
16S Library preparation kit : Ion 16S™ Metagenomics Kit, A26216
Library quantification kit : Ion Universal Library Quantitation Kit, A26217
Basically, the completeness of a circular plasmid is more straightforward to identify. However, when we obtain metagenomic sequencing results from a bacterial community or whole-genome sequencing results from bacterial isolates, after assembly of contigs it is not clear how to confirm the completeness of a linear plasmid.
I am trying to analyze the diversity of the bacteriome at different taxonomic levels
Hi,
I have biological replicates: they come from the same raw sample (tube of sediment) but from different DNA extraction & sequencing experiments. These are metagenomes and I used whole genome sequencing.
If I treat them separately, the abundances are consistent, some of them cluster together on a PCoA but not so well on hclust.
Question:
1) Should I concatenate the raw fastq files together beforehand?
2) Or should I treat them separately until after the step of alignment to a db and then sum the counts of each replicate together?
What is the best procedure?
Thanks
It's about RNA viruses metagenome.
I would be grateful if you could send me a protocol to extract metagenomic DNA from soil. The aim of the work is shotgun metagenomic sequencing, so I'm looking for a good yield and concentration. I have already tried the PowerMax kit from Qiagen and the concentration is very low.
Thanks
What are the best web server/online tools for downstream analysis of 16S amplicon metagenomic datasets besides MicrobiomeAnalyst?
Thanks in advance...
As we submit genomic DNA sequences to various repositories (e.g., NCBI), is there any specific repository for submitting metagenomic analysis results of 16S rDNA sequencing?
I am looking for future collaborators in India working in the field of microbiome and metagenomics, with focus on antimicrobial resistance. Any suggestions?
I have completed taxonomy assignment of the contigs assembled from raw data, but I cannot get the relative abundance of the species present. Is there any tool that can do that?
Hello,
I want to determine the gut microbiome composition of my mouse feces using DNA metabarcoding sequencing of the V3-V4 region of the 16S rRNA gene.
I have contacted Novogene about doing it, and they offer 150,000 reads/sample and 30K Raw Tag.
Do you have any experience in using Novogene services?
Do you think this number of reads per sample and Raw Tag is enough for species-level discrimination?
Thanks in advance
I have a set of metagenomic sequence data from aquatic eDNA samples, and have been able to analyze the bacterial/16S aspects of the samples, but the program I use cannot analyze eukaryotic data. Does anyone have recommendations for programs that can be used to analyze eukaryotic metagenomic data?
Hi everyone! I'm trying to figure out how to process metagenomic data (obtained using qiime2 and picrust2 for qiime) to do some network analysis using igraph or similar (do you have any recommendations?).
Unfortunately, I can't find a good tutorial for this step, so do any of you have something which could help me?
Which extraction kit could be used for metagenomic studies in bees?
Hello.
I am trying to find differentially abundant microbes between two conditions. I have the relative abundance data but not the absolute read counts.
Is there any method that takes relative abundance data as input?
Or is there any way to transform these data before use?
Regards,
Pratyay
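One transform often applied to relative abundance data before statistical testing is the centred log-ratio (CLR), which takes the log of each value and subtracts the mean log within the sample. The Python sketch below shows the arithmetic for a single sample. The pseudocount used to avoid taking the log of zero is an assumption, and its value can affect results for rare taxa; the function name and example values are hypothetical.

```python
import math


def clr(rel_abund, pseudocount=1e-6):
    """Centred log-ratio transform of one sample's relative abundances.

    rel_abund: list of non-negative proportions for the taxa in a sample.
    pseudocount: small value added to every entry so zeros can be logged.
    """
    logs = [math.log(value + pseudocount) for value in rel_abund]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log is the same as dividing by the geometric mean.
    return [value - mean_log for value in logs]


# Hypothetical proportions for four taxa in one sample.
sample = [0.50, 0.30, 0.20, 0.00]
transformed = clr(sample)
```

The transformed values sum to zero within each sample by construction. Whether CLR-transformed data suit a particular differential abundance method is a separate question that this sketch does not settle.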
I have metagenomic data on the gut bacteria of a tribal population of Himachal Pradesh. Can anyone help me with the analysis for writing a research article? We will share authorship.
I am currently analysing Illumina sequences for a metabarcoding project. The primers used in the process have NOT been removed from the raw data. But after exploring the data, I found that ~5% of my raw data has no primers.
What could explain this 5%? Should I discard these primerless sequences?
PS: The data was prepared according to the following protocol: 16S Metagenomic Sequencing Library Preparation Part #15044223 Rev. B (copy and paste it into a search engine to find it)
Thanks
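A quick way to quantify how many reads carry the primer is to turn the degenerate primer into a regular expression and search the start of each read. The Python sketch below does this with exact matching on IUPAC codes only, with no mismatches or indels allowed, so it may slightly undercount compared with a dedicated trimming tool. The forward primer shown is assumed to be the V3-V4 locus-specific sequence; substitute the primers actually used. Function names and example reads are hypothetical.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_pattern(primer):
    """Compile a degenerate primer into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def fraction_with_primer(reads, primer, window=30):
    """Fraction of reads with the primer inside the first `window` bases."""
    if not reads:
        return 0.0
    pattern = primer_pattern(primer)
    hits = sum(1 for read in reads if pattern.search(read.upper()[:window]))
    return hits / len(reads)


# Assumed forward primer and hypothetical reads, for illustration only.
FORWARD_PRIMER = "CCTACGGGNGGCWGCAG"
example_reads = [
    "CCTACGGGAGGCAGCAGTGGGGAATATTG",
    "CCTACGGGTGGCTGCAGTGGGGAATATTG",
    "ACCTACGGGCGGCAGCAGTGGGGAATATT",
    "TACGTAGGGTGCGAGCGTTAATCGGAATT",
]
fraction = fraction_with_primer(example_reads, FORWARD_PRIMER)
```

Reads that fail this exact-match check are not necessarily primerless; they may carry a sequencing error inside the primer, which is worth ruling out before discarding them.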
I am investigating the microbiome of the grapevine rhizosphere using Illumina shotgun metagenome sequencing. DNA was extracted using the PowerSoil® DNA Isolation Kit (QIAGEN) and further purified with sodium acetate. Metagenomic DNA libraries (50 ng input) were prepared using the NEBNext® Ultra™ II Prep Kit for Illumina®. The readouts of the Bioanalyzer, Qubit™ dsDNA HS, and 16S PCR look perfect. However, the sequencing run is always underclustered for my samples. Moreover, it disrupts the clustering of the other samples, too.
I look forward to any help!!
I have done both full-length 16S metagenomic microbiome profiling and cultured microbiome identification with Sanger sequencing. The results from metagenomics and culturing turned out to be very different, which is to be expected. However, many microbes I cultured do not have corresponding OTUs/ASVs in the metagenomic microbiomes. Setting aside the possibility of contamination, is there any other factor that could be affecting the results?
Thanks in advance!
The key difference in data (structure and function), effectiveness for which analysis, and which one gives more clarity for gene identification.
Good morning,
I have an issue with my run and I need some help, please.
We did a 96-sample metagenomic run on a MinION Mk1B (short-read 16S 400 bp amplicons). The run lasted for 72 hours. Output was 9.7 gigabases (FAST5 files are 241 GB). After this, base calling was initiated.
After 24 hours, the base calling was only at 17%, at which point we aborted it in order to run it on our server.
After 4 days, less than 20% is done. Why is it taking so long? We are using Guppy v5.
What is the expected output size for such runs?
Does analysis usually take this long?
Is there a time limit by which we should stop the runs?
Thank you in advance.
Hello, I am trying to get a metagenomic analysis and found Novogene whose prices are pretty cheap (almost 1/3 of our university core). Does anyone have any experience with this company? about the data quality or reliability?
Please let me know,
Thank you,
For 16S rRNA profiling/metagenomic analysis.
Please suggest a tool that can provide alpha diversity and beta diversity of microbes from shotgun metagenomic data either from raw sequences or assembled contigs.
I need to extract DNA from the vaginal flora of cows for metagenomic studies. Does anyone know any extraction technique?
I am using STAMP software to analyze 16S metagenomic data. Basically I have two experimental replicates for six different experiments. The first three experiments were performed in a specific condition and the remaining three in a second condition. It is therefore a “two groups” experiment. I am checking differences in metagenomic composition between the two conditions and I prepared a tab-separated file for STAMP. The software works correctly (both with sample files and with files containing only two columns of data) but does not recognize my “tab separated” file.
I have attached a small file with the first rows of data. Is there something wrong in this file? Is it possible to obtain a sample file for a “two groups” experiment?
Thank you
Hi,
Does anyone have experience in building a custom db using Kraken2?
I have downloaded the fasta files for some taxa and built the new db, but it produced an unmapped.txt file with a long list of accession numbers.
What does this file mean? How can I deal with it? Can I overcome this issue?
How can I find out which taxa have been successfully included in the db that I have created?
Thanks