Illumina Sequencing - Science topic

Questions related to Illumina Sequencing
  • asked a question related to Illumina Sequencing
Question
2 answers
How can I effectively perform metagenomic assembly on large Illumina sequencing data from environmental samples, given that I have already completed quality control but encounter issues due to the data size?
Relevant answer
Answer
Several online resources you might try have limitations, so whether they work or not will depend on how much data you have. Most of them perform other analyses besides assembly, so you might consider what to do with the results and check if other tools available on the same site could also help.
You could try:
Bacterial and Viral Bioinformatics Resource Center (BV-BRC) https://www.bv-brc.org/
  • asked a question related to Illumina Sequencing
Question
6 answers
I recently performed Illumina shotgun sequencing, and my FastQC results are attached.
Is this correct?
Relevant answer
Answer
The fastqc data looks good quality-wise.
  • asked a question related to Illumina Sequencing
Question
1 answer
I'm using the NovaSeq 6000 and HiSeq 4000. Assembled alone, the HiSeq data have little missing data and many reads per sample; the NovaSeq data have more missing data but are still usable. When I assemble them together, the HiSeq individuals have few or no SNPs. I've checked trimming for both datasets, and that does not appear to be the issue.
I'm assembling with ipyrad, and I've tried both de novo assembly and mapping to a reference.
Relevant answer
Answer
Hi. I am having similar issues. Were you able to find a solution for this?
  • asked a question related to Illumina Sequencing
Question
3 answers
I am trying to map Illumina data to a rotavirus genome, but I noticed that mapping to different reference genomes results in some variation in the consensus sequence. Any suggestions? Should I discard genome regions with low coverage? If yes, what is considered low coverage?
Relevant answer
Answer
Mapping and assembling viral genomes from Next-Generation Sequencing (NGS) data are critical steps in virology research, enabling the understanding of viral evolution, pathogenicity, and resistance mechanisms. The process requires a strategic approach to accurately reconstruct the viral genome from the short reads generated by NGS platforms. This task can be challenging due to the high mutation rate of viruses and the presence of host DNA in the samples. The following methodological framework outlines an effective strategy for mapping and assembling viral NGS data:
1. Quality Control and Preprocessing
  • Initial Quality Assessment: Utilize tools such as FastQC to evaluate the quality of raw sequencing reads. Assess metrics like base quality, sequence quality scores, and the presence of adapters.
  • Read Trimming and Filtering: Apply tools like Trimmomatic or Cutadapt to trim adapters and remove low-quality bases from the reads. Filter out reads below a quality threshold to ensure that only high-quality data is used for assembly.
2. Host DNA Subtraction (If Applicable)
  • Alignment Against Host Genome: To reduce the complexity of data and improve assembly accuracy, align the reads to the host organism's genome using aligners such as BWA or Bowtie2. Remove any reads that map to the host, retaining only those likely to originate from the virus.
3. Choice of Assembly Approach
  • De Novo Assembly vs. Reference-Based Assembly: Choose between de novo assembly and reference-based assembly based on the objectives and the nature of the viral genome. De Novo Assembly is preferred when the viral strain is novel or significantly divergent from known sequences. Tools like SPAdes or Megahit can be used for this purpose. Reference-Based Assembly is advantageous for well-characterized viruses or when a closely related reference genome is available. This can be done using tools like BWA for mapping, followed by SAMtools for sorting and generating a consensus sequence.
4. Assembly and Consensus Sequence Generation
  • Assembly: Use the selected assembly approach to construct the viral genome. Be sure to optimize parameters for the size and complexity of the viral genome.
  • Consensus Sequence Generation: For reference-based approaches, generate a consensus sequence from the alignment, considering base call frequencies to account for viral diversity and potential mutations.
5. Validation and Error Correction
  • Validation: Validate the assembled genome by checking coverage uniformity and depth across the genome using tools like QualiMap or IGV. High variability in coverage may indicate assembly issues or mixed infections.
  • Error Correction: Apply error correction tools or manually inspect and correct regions with ambiguities or misassemblies, especially in regions of high importance like open reading frames.
6. Annotation and Further Analysis
  • Annotation: Annotate the assembled genome to identify genes, protein-coding regions, and other functional elements using tools like Prokka or custom databases relevant to the virus of interest.
  • Comparative Genomics: Compare the assembled genome with reference sequences to identify mutations, recombination events, or phylogenetic relationships.
Conclusion
Successfully mapping and assembling viral genomes from NGS data requires meticulous quality control, strategic selection between de novo and reference-based assembly approaches, and rigorous validation of the assembled sequences. This process not only demands proficiency with bioinformatics tools and workflows but also a deep understanding of the biology of the virus and the host, if applicable. By following this comprehensive approach, researchers can achieve high-quality, accurate viral genome assemblies that are crucial for advancing our understanding of viral genetics and epidemiology.
This list of protocols might help us better address the issue.
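On the question of what to do with low-coverage regions: one common approach is to mask them in the consensus rather than trust base calls supported by very few reads. Below is a minimal Python sketch of that idea. The function names and the default threshold of 10 reads are assumptions made for illustration, not a recommended cut-off; the per-position depths are assumed to come from whatever depth report your mapping workflow produces.

```python
from typing import Sequence


def mask_low_coverage(consensus: str, depths: Sequence[int], min_depth: int = 10) -> str:
    """Replace consensus bases whose read depth is below min_depth with 'N'.

    min_depth=10 is an illustrative assumption, not an established standard.
    """
    if len(consensus) != len(depths):
        raise ValueError("consensus and depths must have the same length")
    return "".join(
        base if depth >= min_depth else "N"
        for base, depth in zip(consensus, depths)
    )


def low_coverage_fraction(depths: Sequence[int], min_depth: int = 10) -> float:
    """Fraction of positions with depth below min_depth (0.0 for empty input)."""
    if not depths:
        return 0.0
    return sum(d < min_depth for d in depths) / len(depths)


if __name__ == "__main__":
    # Toy data: six positions, two of them poorly covered.
    consensus = "ACGTAC"
    depths = [30, 25, 3, 0, 12, 10]
    print(mask_low_coverage(consensus, depths))
    print(low_coverage_fraction(depths))
```

Masking keeps the coordinates of the consensus intact, so consensus sequences built against different references can still be compared position by position; the masked fraction gives a quick indication of how much of the genome is actually supported.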
  • asked a question related to Illumina Sequencing
Question
6 answers
Hello!
I have performed a PhiX validation run with the Illumina standard PhiX kit and V3 chemistry. I diluted the PhiX library to 10 pM and expected a cluster density of around 1000, but the run gave a very low cluster density (~130). Also, as you can see in the SAV plots, Phred scores decreased after cycle 100 in read 1 and cycle 40 in read 2, which resulted in a low >Q30 percentage at the end of the run. What do you think caused this issue, and how can I fix it? Could it be because of the low cluster density? Could it be because of bad reagent storage conditions or handling? I have performed a system check and it was successful. What are your suggestions?
I have attached plots of the SAV analysis and thumbnail images from different cycles of the run. The photo with better quality is from cycle #17 and the photo with lower quality is from cycle #436, both for the A nucleotide.
Thank you.
Relevant answer
Answer
As far as I know, there is no official recommended value, but one should expect both phasing and prephasing in read 2 to be slightly greater than in read 1.
  • asked a question related to Illumina Sequencing
Question
9 answers
Hello,
I recently received metagenomic 16S rRNA gene sequence data from a company, which includes both raw reads, and clean data with barcodes removed. My goal is to analyze these sequences and obtain information on the taxonomic diversity and abundance of the species present in the sample.
Since I use a Windows system and cannot utilize Mac or Linux, I would greatly appreciate guidance on how to proceed with this analysis. Are there any web server-based applications available that can assist with this task?
Furthermore, if there are any researchers or experts interested in this project, I would be grateful to explore potential collaborations. Please feel free to reach out to me if you are interested or have any recommendations.
Thank you in advance for your assistance.
Best regards
Relevant answer
Answer
SilvaNGS - simple and win-based. https://ngs.arb-silva.de/silvangs/ It is probably less flexible than the approaches mentioned above, but it is quite useful for a first look or for someone who needs an established, reliable pipeline.
  • asked a question related to Illumina Sequencing
Question
1 answer
We are trying to sequence HBV and HCV, but we don't have a specific protocol.
Relevant answer
You can get it from the EASL guideline.
It is free; just download it.
  • asked a question related to Illumina Sequencing
Question
4 answers
I have been using the DNeasy kit from Qiagen to extract DNA and, in the final step of the extraction, the product guide recommends eluting the DNA in AE Buffer, which contains EDTA (0.5 mM). However, the Nextera protocol might be seriously affected by EDTA during tagmentation. Could I use TE buffer (0.1 mM EDTA) to store DNA samples stably for a long time? Or would I have to elute my samples only in nuclease-free water for immediate use?
Relevant answer
Answer
Hi Moises, I agree with Jatesh. Yes, you can use EB without any EDTA.
  • asked a question related to Illumina Sequencing
Question
2 answers
In Illumina sequencing results, how do I rank genes on the basis of false discovery rate (FDR)? What FDR values should be considered significant?
Relevant answer
Answer
The commonly used threshold for significance is an FDR of 0.05 or less (corresponding to a 5% false discovery rate). This means that, among the genes called significant, no more than about 5% are expected to be false positives.
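To make the ranking part of the question concrete, here is a minimal Python sketch of the Benjamini-Hochberg procedure, one widely used way to turn raw p-values into FDR-adjusted values. The gene names, p-values and function names are made up for illustration, and the 0.05 cut-off is the conventional threshold mentioned above. Differential expression packages normally report adjusted p-values themselves, so this is only meant to show what the number means.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def rank_genes_by_fdr(gene_pvalues, alpha=0.05):
    """Rank genes by adjusted p-value (ties broken by raw p-value).

    Returns a list of (gene, adjusted_p, is_significant) tuples.
    """
    genes = list(gene_pvalues)
    pvals = [gene_pvalues[g] for g in genes]
    fdr = benjamini_hochberg(pvals)
    ranked = sorted(zip(genes, pvals, fdr), key=lambda t: (t[2], t[1]))
    return [(gene, q, q <= alpha) for gene, _, q in ranked]


if __name__ == "__main__":
    # Hypothetical p-values for four genes.
    example = {"geneA": 0.01, "geneB": 0.04, "geneC": 0.03, "geneD": 0.005}
    for gene, q, significant in rank_genes_by_fdr(example):
        print(gene, round(q, 4), significant)
```

Sorting by the adjusted value first and the raw p-value second gives a stable ranking even when several genes share the same adjusted value, which happens often with this procedure.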
  • asked a question related to Illumina Sequencing
Question
2 answers
I need to extract DNA from hundreds of soil samples for library preparation for Illumina sequencing. In my lab, people have been working with the DNeasy PowerSoil Pro Kit (individual reactions). Now that I need to speed up the process because of the number of samples, I am considering the DNeasy PowerSoil HTP 96 Kit. Any opinions on the recovery/yield of this kit?
Please let me know about your experiences, any advice is appreciated.
  • asked a question related to Illumina Sequencing
Question
4 answers
I'd like to sequence the genome of the gopher tortoise. The genomes of congeners are ~2.4 Gb. I'm trying to decide how much coverage is necessary; we plan to run the sample on a portion of a NovaSeq run, and at least one PacBio SMRT Cell. I'm trying to evaluate the benefits of additional sequencing effort: my starting point would be something like 30x coverage for the 2x150 NovaSeq run, and one PacBio SMRT Cell (~8x coverage with HiFi reads? Not so sure about this), but I'm wondering how necessary a second PacBio cell, or additional Illumina reads, would be for assembling a nice genome.
We don't have any tissues available for transcriptomics, and the immediate application will be to map whole-genome methylation seq reads to the genome.
I'm pretty new to all of this, so any suggestions or references to guidelines are most welcome! Thanks!
Relevant answer
Answer
Ireneusz Stolarek Thanks for your thoughts, very helpful.
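For the coverage planning in this question, the basic arithmetic is coverage = total sequenced bases / genome size. The Python sketch below applies it to the numbers given in the question (a ~2.4 Gb genome, 2x150 reads, 30x target). The function names are illustrative, and the long-read yield in the demo is a placeholder value chosen only to show the calculation, not a specification of any instrument.

```python
import math


def read_pairs_needed(genome_size_bp, target_coverage, read_length_bp, paired=True):
    """Number of reads (or read pairs, if paired) needed for a target coverage."""
    bases_per_fragment = read_length_bp * (2 if paired else 1)
    return math.ceil(genome_size_bp * target_coverage / bases_per_fragment)


def coverage_from_yield(total_bases, genome_size_bp):
    """Average coverage obtained from a given total yield in bases."""
    return total_bases / genome_size_bp


if __name__ == "__main__":
    genome = 2_400_000_000  # ~2.4 Gb, as stated for congeners
    pairs = read_pairs_needed(genome, target_coverage=30, read_length_bp=150)
    print(f"2x150 read pairs for 30x: {pairs:,}")

    # Placeholder yield, used only to illustrate the formula.
    print(coverage_from_yield(19_200_000_000, genome))
```

These are averages over the whole genome; actual depth varies by region, so it is usual to plan some margin above the bare minimum.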
  • asked a question related to Illumina Sequencing
Question
6 answers
Hello, I'm preparing libraries for Illumina sequencing and I would like to know if I have to do DNA end repair immediately after shearing, or if I can freeze sheared DNA and do end repair a week or two later. Thanks
Relevant answer
Answer
When water freezes, the shard-like ice crystals that form can shear DNA. If a substance such as glycerol or DMSO is added to the solution (10-15% by volume), the water will freeze as a glassy solid without forming damaging shards. This should allow you to freeze your DNA successfully. If glycerol or DMSO happens to be problematic in your assay, they are easily removed by dialysis.
Good luck.
  • asked a question related to Illumina Sequencing
Question
5 answers
We recently got our sequencing data from a company that sequenced on a NovaSeq. As far as I have learnt, NovaSeq uses a simpler quality score system, which merges different scores into only four categories/levels. This of course causes problems when processing the data with a quality-score-based pipeline, e.g. DADA2. I got really weird results with DADA2. We also tried UNOISE3, and the ASV table looks fine for the mock data. But since this is also based on quality, I'm not sure if it can be trusted.
I also searched for some publications using Novaseq for amplicon sequencing. They all used traditional/old pipelines based on similarity to generate OTUs. I wonder if this is the only way to do this.
I wish there could be someone who has experience dealing with this situation. Is there still some possible way to use DADA2 or UNOISE3 with some modifications to process the data?
Relevant answer
Answer
There have been broad discussions on this issue for a while, and the current solution when using DADA2 is to enforce monotonicity in the fitted error model. You can refer to this thread (https://github.com/benjjneb/dada2/issues/791#issuecomment-941366623) for the workaround code provided by Benjamin Callahan.
  • asked a question related to Illumina Sequencing
Question
4 answers
Hello! I have an amplicon sequencing dataset (Illumina) of the D1/D2 region (NL1 and NL4 primers). I want to analyze the sequences using DADA2 in R or QIIME 2, but I am unsure which is the most comprehensive and up-to-date database for yeast identification. I appreciate any feedback on this. Thank you in advance.
Relevant answer
Thank you for all the answers!
  • asked a question related to Illumina Sequencing
Question
6 answers
I have designed degenerate primers for a gene. On amplifying it from genomic DNA I get the desired band. To confirm it, I cloned the eluted band into the pGEM-T vector and then sequenced it by Illumina sequencing, but I didn't get the desired sequence. What could be the reason?
Relevant answer
Answer
Thank you so much for the suggestion.
  • asked a question related to Illumina Sequencing
Question
1 answer
Dear all,
I'm currently working on a ChIP-Seq for a transcription factor and was preparing libraries using the QIAGEN QIAseq Ultralow Input Library Kit, checking the ligation of adapters with the KAPA Library Quantification Kit for Illumina sequencing and performing size selection with AMPure XP beads as described in the library kit protocol. The libraries were then checked on a Bioanalyzer High Sensitivity DNA chip.
The first replicate worked fine for me (some samples are included on the Bioanalyzer chips; sometimes the chip didn't run nicely, don't mind that). However, in the second replicate strange larger peaks appear on the Bioanalyzer chip, and there are multiple peaks/bands instead of a normal distribution.
The shearing of both replicates was fine (done mechanically with a Bioruptor) and targets could be recovered as checked in a ChIP-qPCR. Also, all samples that had been prepared in parallel looked weird (ChIPs and inputs), for both antibodies used and in replicate 1 for antibody 3, although all other samples were fine. If I purified those samples in parallel with a re-purification of an old sample, the old sample still looked fine, so I assume it has to do with the library preparation itself. Adapters were not reused and were diluted 1:10, as I was using 5 ng of input DNA. The number of cycles was chosen according to the KAPA library quantification, and the minimum number of cycles was used. For the PCR reaction I prepared a master mix and added it to the samples, so maybe that was a problem? Or do you think an enzyme could have gone bad? Is it just trash that got over-amplified, or could it be a problem with the adapter ligation? Or the simplest question: has anyone seen peaks like this before? And do you think it might be worth giving sequencing a shot nonetheless, or is it just trash? For the libraries with antibody 1: could it be that those are "just" over-amplified? And if yes, could I try to sequence them, maybe after another round of size selection to remove larger peaks?
The ChIPs were performed with primary cells (so I would be super happy, if I could use any of the samples) using two different conditions in wt and KO mice and the analysis therefore "just" should be the overlay between the different conditions (yes in wt, no in KO etc), only peak calling, nothing quantitative. Could I have a chance the sample quality might be sufficient for this setting? At least with antibody 1? Might the effect of this fragmentation into single peaks phenomena get lost once I pool the libraries for sequencing?
Thank you so much in advance.
Relevant answer
Answer
Have you found an answer to this question? Could you explain it to me? I have the same issue.
  • asked a question related to Illumina Sequencing
Question
3 answers
Hi
I am working on a project where I have specific segments of interest. About 18 regions in TB-positive samples were selected where mutations can occur that may confer resistance to specific drugs. I wanted to sequence those regions using Illumina MiniSeq sequencing, so I extracted DNA from leftover samples with a Qiagen extraction kit. But when I quantified it with a Qubit fluorometer, I was surprised that the quantity was very low, and I am confused about why this happens. Even when I amplify the product by PCR, the quantity is also low. But my PCR itself was good, because the positive control result is perfect. If anyone knows of a better solution, please kindly help me.
Thanks
Relevant answer
Answer
If I am understanding this properly, it seems your sample input is too low. Can you obtain more cells and prepare more genomic DNA?
If you cannot obtain more DNA, then try adjusting your primer concentrations. Due to low DNA input, your primer concentrations could be too high for this PCR. I also recommend running the DNA products out on a TBE gel. Check to see the amount of primers remaining relative to the PCR product. Also, you probably need to adjust the number of cycles.
  • asked a question related to Illumina Sequencing
Question
2 answers
Hello,
I have several single-end FASTQ files. Before trimming with Trimmomatic, FastQC reported TruSeq adapter sequences as a possible source of overrepresented sequences. However, after trimming, FastQC now reports Clontech SMART CDS Primer II A as a source of overrepresented sequences. What should I do about them? Can those sequences cause any negative effects on downstream analysis?
Thanks in advance.
Relevant answer
Answer
Thank you Mehmet Tardu
  • asked a question related to Illumina Sequencing
Question
3 answers
I am designing a custom NGS run and need to determine the sequence coverage. I am using the Illumina Sequence coverage calculator https://support.illumina.com/downloads/sequencing_coverage_calculator.html
How do I determine the genome/ region size?
Additional information:
Human gDNA will be used. I will have 4 amplicons about 250 bases each from 4 different genes.
Any help would be appreciated. Thanks, Nikhil.
Relevant answer
Answer
Thanks Luigi Marongiu, but I have not gotten my data yet. I should have worded it better. I am just looking to determine the number of samples I will have in my run, and for that I need the genome size or region size.
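For a targeted amplicon design like this one, a reasonable reading is that the "region size" is simply the summed length of the amplicons (4 x 250 = 1000 bases), not the size of the human genome. The Python sketch below shows that arithmetic and a rough samples-per-run estimate. The run output and the desired coverage in the demo are hypothetical placeholders to be replaced with the values for your instrument and application, and the estimate ignores index balance, PhiX spike-in and read loss.

```python
def region_size_bp(amplicon_lengths):
    """Total targeted region: the sum of all amplicon lengths, in bases."""
    return sum(amplicon_lengths)


def samples_per_run(run_output_bp, region_bp, coverage):
    """Rough upper bound on samples that fit in a run at the desired coverage."""
    if region_bp <= 0 or coverage <= 0:
        raise ValueError("region_bp and coverage must be positive")
    return run_output_bp // (region_bp * coverage)


if __name__ == "__main__":
    region = region_size_bp([250, 250, 250, 250])  # four ~250-base amplicons
    print("Region size (bp):", region)

    # Hypothetical values for illustration only.
    usable_output_bp = 1_000_000_000
    desired_coverage = 1000
    print("Samples per run:", samples_per_run(usable_output_bp, region, desired_coverage))
```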
  • asked a question related to Illumina Sequencing
Question
1 answer
I am currently analysing Illumina sequences for a metabarcoding project. The primers used in the process have NOT been removed from the raw data. But after exploring the data, I found that ~5% of my raw data has no primers.
What could explain this 5%? Should I discard these primerless sequences?
PS: The data were prepared according to the following protocol: 16S Metagenomic Sequencing Library Preparation, Part #15044223 Rev. B (copy and paste the title into a search engine to find it)
Thanks
Relevant answer
Answer
If you are using a trimming program, use the option that discards reads that would be left untrimmed because they lack the primer sequence.
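To see which reads lack the primer before deciding to discard them, a small check like the following can help. This Python sketch splits reads by whether they start with a given primer, expanding IUPAC degenerate bases. It requires an exact match at the very start of the read (no mismatches, no offset), which is stricter than what dedicated trimming tools do, so treat it as a way to inspect the data rather than a replacement for them. The primer in the demo is only an illustrative degenerate sequence; substitute your own.

```python
import re

# IUPAC nucleotide codes expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_regex(primer):
    """Compile a regex matching the primer at the start of a read."""
    return re.compile("^" + "".join(IUPAC[base] for base in primer.upper()))


def split_by_primer(reads, primer):
    """Return (reads_with_primer, reads_without_primer)."""
    pattern = primer_regex(primer)
    with_primer, without_primer = [], []
    for read in reads:
        target = with_primer if pattern.match(read.upper()) else without_primer
        target.append(read)
    return with_primer, without_primer


if __name__ == "__main__":
    primer = "CCTACGGGNGGCWGCAG"  # illustrative degenerate primer
    reads = [
        "CCTACGGGAGGCAGCAGTTT",  # matches (N=A, W=A)
        "CCTACGGGTGGCTGCAGAAA",  # matches (N=T, W=T)
        "ACGTACGTACGTACGTACGT",  # no primer
    ]
    kept, primerless = split_by_primer(reads, primer)
    print(len(kept), "with primer;", len(primerless), "without")
```

Looking at what the primerless reads actually are (for example, whether they are shifted by a few bases or are off-target sequence) usually explains the missing fraction better than discarding them blindly.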
  • asked a question related to Illumina Sequencing
Question
5 answers
Dear all, I am trying to use CD-HIT to remove duplicates from the output file of Trinity (RNA-seq assembly).
I used the following parameters:
cd-hit-est -i in.fasta -o out_cdhit90.fasta -c 0.90 -n 9 -d 0 -M 0 -T 0
But the output file still contains lots of small or fragmented sequences in addition to the best one. How can I remove those small or fragmented duplicates by changing the parameters?
thanks
ZQ
Relevant answer
Answer
Hello, do you know of any tool other than CD-HIT to filter CDS unigenes?
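Independently of the clustering tool, one simple complementary step is to drop very short sequences from the FASTA file after clustering. The Python sketch below does only that. It does not detect redundancy or fragments of longer transcripts, and the 300-base cut-off in the demo is an arbitrary assumption that should be chosen to suit the dataset.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, min_length=300):
    """Keep only records whose sequence has at least min_length bases."""
    return [(header, seq) for header, seq in records if len(seq) >= min_length]


if __name__ == "__main__":
    # Tiny in-memory example; for a real file use: open("out_cdhit90.fasta")
    fasta = [">contig1", "ACGT" * 100, ">contig2", "ACGT" * 10]
    kept = filter_by_length(read_fasta(fasta), min_length=300)
    for header, seq in kept:
        print(header, len(seq))
```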
  • asked a question related to Illumina Sequencing
Question
1 answer
Hello,
Does anyone have experience with Zymo-Seq RRBS Library Kit and subsequent Illumina sequencing? A part of my project that I work on should be "methylation profiling of brain tissues". I wanted to use Agilent's SureSelectXT Methyl-Seq Library Preparation Kit for targeted methylation sequencing, but it's much more expensive, and I don't think that for our purposes, it's necessary to target hotspot regions. I wanted to know whether you are satisfied with the kit and how do you sequence libraries? The manufacturer recommends at least 30 million reads and read length > 50 bp, so Illumina's NextSeq 500/550 Mid Output Kit v2.5 (150 Cycles) could be sufficient for four libraries?
Relevant answer
Answer
Dear Madam - No Experience.
  • asked a question related to Illumina Sequencing
Question
3 answers
I'm looking for some resources (article or videos or courses) to understand the working of Illumina and other NGS platforms? I have a lot of concept gaps and doubts.
Relevant answer
Answer
I recently came across this post https://kscbioinformatics.wordpress.com/2017/02/13/illumina-sequencing-for-dummies-samples-are-sequenced/ and it does a decent job explaining it I think.
Not implying anything :)
  • asked a question related to Illumina Sequencing
Question
3 answers
Hello, I have done WGS of my bacterial strains and got some preprocessed Illumina sequencing files in .fna format. It has a format like this
>1 length=400016 depth=0.86x the sequence
>2 length=323455 depth=1.00x and so on to >102
I want to know how to deal with this .fna file and which analysis to run on it.
Relevant answer
Answer
Abhijeet Singh, with a big position and seat come big responsibilities. It would have been great if you had provided some guidance rather than demeaning others.
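As a first step with a file like this, the header lines already carry useful information: each record appears to be one assembled contig with its length and relative depth. The Python sketch below parses headers in the format shown in the question (`>1 length=400016 depth=0.86x`) and computes simple assembly statistics. The regular expression and function names are assumptions based only on those two example headers.

```python
import re

# Matches headers such as ">1 length=400016 depth=0.86x"; trailing text is ignored.
HEADER_RE = re.compile(r"^>(\S+)\s+length=(\d+)\s+depth=([\d.]+)x")


def parse_header(line):
    """Return (contig_name, length_bp, depth) from a contig header line."""
    match = HEADER_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognised header: {line!r}")
    return match.group(1), int(match.group(2)), float(match.group(3))


def n50(lengths):
    """Length of the contig at which half of the total assembly size is reached."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


if __name__ == "__main__":
    headers = [">1 length=400016 depth=0.86x", ">2 length=323455 depth=1.00x"]
    contigs = [parse_header(h) for h in headers]
    lengths = [length for _, length, _ in contigs]
    print("contigs:", len(contigs), "total bp:", sum(lengths), "N50:", n50(lengths))
```

With about a hundred contigs, the total length and N50 give a quick sense of assembly completeness before moving on to annotation or typing.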
  • asked a question related to Illumina Sequencing
Question
4 answers
Fig. 2A of the paper shows that a tiling library of a gene was prepared containing 50 bp fragments. The fragments span the entire gene sequence, offset from each other by about 7 bp. Later this sample was used to study the sequence dependence of DNA bendability at genome scale. This is a very interesting study, but I am unable to figure out how the tiling library was prepared. Is it done by designing the sequence of every fragment as an oligo pool and ordering them? Can we prepare a tiling library with any spacing between fragments? Please let me know if you have any idea.
Relevant answer
Answer
Red biotechnology: This area includes medical procedures such as utilizing organisms for the production of novel drugs or employing stem cells to replace/regenerate injured tissues and possibly regenerate whole organs. It could simply be called medical biotechnology.
nuha hamid taher
Senior lecturer
Faculty of Basic Education
Mustansiriya University
  • asked a question related to Illumina Sequencing
Question
3 answers
Hello,
we are doing stool microbiome analysis for bacteria and fungi. We are using the same protocol for bacterial and fungal samples, but we have zero yields after normalization, ligation and quantification only in fungal libraries. We are using the SequalPrep™ Normalization Plate Kit, 96-well, and I usually use 10 µl of PCR product for bacteria and 20 µl for fungi (since they are usually less concentrated). Then I continue according to the protocol. After that we use the KAPA HyperPrep Kit PCR-free according to the protocol, only with an extended ligation time (1 hour). We already tested shorter and longer ligation times, but the results are basically the same, so I don't see that as the issue. As adapters we are using TruSeq DNA Single Indexes Set A (12 Indexes, 24 Samples). For quantification we are using the KAPA Library Quantification Kit - Complete universal kit, and again I follow the protocol.
So do you have any idea at which step we are losing our fungal DNA? It's not the human factor either, because two different people did the same process and got the same zero results. Maybe some of you are more experienced with fungi in stool (human, mouse).
Thank you in advance for any piece of advice!
Relevant answer
Answer
I suggest you run control analyses with added bacteria or fungi to show that you can recover the target. This should be your first step, rather than jumping into the application with an unvalidated protocol.
  • asked a question related to Illumina Sequencing
Question
1 answer
I need the information about the companies in India offering services of environmental DNA for any organism. Please add website links.
Relevant answer
Answer
Hi Dinakarsami, I'd be happy to help you with metagenomics and metabarcoding analysis. Please drop me a message.
Thiru
  • asked a question related to Illumina Sequencing
Question
9 answers
I will sequence viral nucleic acids using Nextera XT library prep. I was informed by Illumina that the input must be dsDNA of at least 300 bp, and I am wondering how to get it. I have a commercial kit to prepare the first strand using the viral RNA as template. My problem is synthesizing the second strand. I have seen a strategy that uses a number of enzymes (RNase H, ligase, polymerase), but I understand that this is more suitable for long eukaryotic mRNA. For constructing the first strand, I will use random primers. Could I synthesize the second strand using random primers and only a polymerase (Klenow fragment)? In this case, how would I break the RNA/cDNA duplex? Is it possible (and necessary) to validate each step (first strand, second strand) with a Qubit fluorometer?
Also, should I eliminate the viral DNA before the reverse transcription? Or may I keep it, get the cDNA for the viral RNA, and use the same reaction tube, with DNA and cDNA, to prepare a single Nextera XT library?
I know that there are commercial kits that prepare both strands at once, however I just found very expensive options (such as SuperScript™ Double-Stranded cDNA Synthesis Kit). If anyone knows a less expensive alternative, I would appreciate the advice.
Relevant answer
Answer
We are using a protocol for viral ds-cDNA synthesis, and it works well. see this article:
  • asked a question related to Illumina Sequencing
Question
4 answers
We are collecting human samples and thinking to use some storage buffer to ensure optimal preservation of RNA and DNA, among other molecules.
We saw several papers in which the authors mentioned Allprotect Tissue Reagent (QIAGEN) as a storage buffer, but we did not find anything about its composition. So I decided to ask whether anyone knows if Allprotect Tissue Reagent (QIAGEN) is a buffered salt solution, as RNAlater (Life Technologies) is, since we need to take this into account for the subsequent treatment of the samples.
Thank you so much.
Nerea
Relevant answer
Answer
Yes, I am also looking into the ingredients of the Qiagen Allprotect reagent. Qiagen should not simply withhold this information as a business secret, because we need it for risk assessment. Some researchers bring this reagent into the operating theatre to collect human tissue samples, and it may accidentally come into contact with the patient, especially at the operation site.
If anyone knows the ingredients of Allprotect (Qiagen), please share them too. Thanks.
  • asked a question related to Illumina Sequencing
Question
4 answers
Our lab has sent rat cardiac tissue for sequencing and have obtained indigestible fastq data files. Is there a software I can use to organize these fastq sequence files in order to obtain meaningful results?
Relevant answer
Answer
You may follow this pipeline.
Let me know if you are interested to outsource to my lab www.eminentbio.com
  • asked a question related to Illumina Sequencing
Question
3 answers
I'm doing de novo genome assemblies on a set of bacterial samples. I already have Illumina data for most of them and I just submitted samples for Nanopore sequencing for all of them, but I have 9 samples that still need Illumina sequencing. My university's DNA facility doesn't pool samples for people and we don't have anything else we need sequenced. We also don't know anyone else who is sequencing anything soon.
Does anyone have any recommendations for a company or group that can do Illumina sequencing on a small number of samples for cheap? Or am I going to have to pay for a full lane no matter what?
Relevant answer
Answer
Ashley Ann Paulsen, please let me know if you are interested in sending samples to my lab, Eminent Biosciences (www.eminentbio.com). If interested, please write to me by email: director@eminentbio.com or anuraj@eminentbio.com
  • asked a question related to Illumina Sequencing
Question
5 answers
Dear All,
Can I use DNA extracted by dissolving low-melting-point agarose for Illumina sequencing (without using a DNA purification kit)? I ask because the recovery rate of column-based DNA purification (~100 bp) is extremely low.
Thank you,
Relevant answer
Answer
DNA extracted from the agarose gel might work for sequencing, but in many cases purity is suboptimal, and the typical yield is no higher than with spin-column or magnetic-bead purification. I'd advise using a DNA purification kit (there are many to choose from); just optimize the conditions to ensure sufficient yield for sequencing.
  • asked a question related to Illumina Sequencing
Question
10 answers
Hi
I am extracting genomic DNA from dust samples, and the 260/280 ratio is 1.4 whereas the 260/230 ratio is 1.35. I need to perform metagenomic sequencing, for which the recommended 260/280 ratio is greater than 1.8. Can anyone help me improve this? I have tried ethanol precipitation, but it causes a significant reduction in yield. As these are environmental samples, the yield of genomic DNA is already low.
Thanks
Relevant answer
Answer
Did you manage to solve your problem? I'm facing exactly the same challenges with extraction of metagenomic DNA from an air filter, i.e., low yield and unsatisfactory purity regardless of the extraction protocol used (phenol-chloroform or commercial kits). Did you go ahead and sequence your samples? Was it successful? I would appreciate any insight from your side.
Kind regards,
Maggie
  • asked a question related to Illumina Sequencing
Question
2 answers
A probably silly question to phage experts.
Is it possible to determine whether a phage genome is linear, circular, or circularly permutated using Illumina sequencing data and genome assembly? Thank you very much!
Relevant answer
Answer
Greetings,
Elucidation of the exact genome termini is what you probably want.
I've found PhageTerm (Garneau et al., 2017) to be a great tool for an automated look at the read pile-ups on the scaffold representing the complete phage genome for termini determination. It works well with paired reads when the library prep involved random physical shearing (we usually use the TruSeq DNA Nano workflow with shearing by Covaris ultrasonication); however, it won't be suitable for Nextera-prepared libraries, which are a very common library prep choice. Even in the case of lackluster coverage, read pile-up inspection and a TerL amino acid sequence tree built from your phage's TerL sequence and TerL sequences from phages whose termini were elucidated experimentally (not only in silico) can usually aid in designing a proper in vitro experiment (have a look at Casjens et al., 2009) to verify or find out the packaging strategy and termini of the phage.
Best regards,
Nikita Zrelovs
1. Garneau, J.R., Depardieu, F., Fortier, LC. et al. PhageTerm: a tool for fast and accurate determination of phage termini and packaging mechanism using next-generation sequencing data. Sci Rep 7, 8292 (2017). https://doi.org/10.1038/s41598-017-07910-5
2. Casjens, Sherwood R, and Eddie B Gilcrease. “Determining DNA packaging strategy by analysis of the termini of the chromosomes in tailed-bacteriophage virions.” Methods in molecular biology (Clifton, N.J.) vol. 502 (2009): 91-111. doi:10.1007/978-1-60327-565-1_7
  • asked a question related to Illumina Sequencing
Question
3 answers
Hello, I am using the KAPA HyperPrep Kit for library preparation for Illumina sequencing. I have a problem with adaptor dimer formation. Does anyone have experience with this issue? How did you get rid of them? We tried a shorter ligation time and also used half the usual amount of adaptors, but there was not much improvement.
Thank you in advance!
  • asked a question related to Illumina Sequencing
Question
2 answers
I have some primers that have been used successfully to amplify the V1-2 region of the bacterial 16S rRNA gene. I plan to run some 16S rRNA sequencing analysis and I must now add a sequencing adapter sequence to each of my primers. Should I calculate the Tm based on the 16S-specific primer region only, or should I calculate the Tm with both parts of the primer (sequencing adapter + 16S-specific region)? The addition of the sequencing adapter increases the size of the primers and therefore raises the Tm significantly above the recommended annealing temperatures.
Thanks in advance!
Relevant answer
Answer
Hey!
According to the Illumina 16S guide, you need only the target-specific region of the primer for the Tm calculation.
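As a rough illustration of that advice, the sketch below estimates Tm from GC content for the target-specific region alone and for the full adapter-plus-target oligo. It uses the common basic approximation Tm = 64.9 + 41 * (G + C - 16.4) / N rather than the nearest-neighbor model that most primer design tools use, and both sequences are illustrative placeholders (assumptions), not the primers discussed in this thread.

```python
def basic_tm(seq):
    """Rough Tm estimate (deg C) from GC content: 64.9 + 41 * (GC - 16.4) / N.

    A coarse approximation for unambiguous oligos longer than about 13 nt;
    it ignores salt and primer concentration.
    """
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("only unambiguous A/C/G/T bases are supported")
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


# Hypothetical sequences for illustration only (assumptions, not from the thread).
adapter = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
target_specific = "AGAGTTTGATCCTGGCTCAG"

tm_target = basic_tm(target_specific)           # what the annealing step should be based on
tm_full = basic_tm(adapter + target_specific)   # inflated by the adapter tail

print(f"target-specific region only: {tm_target:.1f} C")
print(f"adapter + target region:     {tm_full:.1f} C")
```

The gap between the two numbers is the effect described in the question: the adapter tail raises the computed Tm, but in the first cycles only the target-specific region anneals to the template, so the first value is the one to plan the annealing temperature around.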
  • asked a question related to Illumina Sequencing
Question
5 answers
Can the size of a bacterial gene differ between the sequencing result and the band size on a gel? If yes, what is the maximum difference above which it cannot be the same gene?
Relevant answer
Answer
It depends on the sequence quality. The sequence length and the gel band length might be equal if the sequence quality is very good, but usually the sequence becomes shorter because bad bases are trimmed from both ends. In my experience, they differ by 30-50 bases.
  • asked a question related to Illumina Sequencing
Question
2 answers
Hello,
I recently performed a sequencing run with a MiniSeq. The experiment involves editing via viral transduction, where the construct contains Cas9 and the guide RNA. In the NGS analysis, the total read count seems good enough (sufficient depth). However, the number of reads with the indicator sequences in the given amplicon seems awfully low compared to the total reads. I bought new primer sequences for NGS prep and repeated the NGS run. There was no contamination or primer dimer in the library prep. Still, I see the same low number of reads with indicator sequences in the given amplicon.
I think the transduction and random integration can be the reason for imperfect alignment of the amplicon, resulting in low number of reads with indicator sequences. Am I right? Could there be any other reason ?
The analysis was done via Rgen Cas9 analyzer website link http://www.rgenome.net/cas-analyzer/#!
Relevant answer
Answer
Thanks for the reply! I spiked in enough PhiX to get a good cluster density. However, I see your point on low numbers of perfectly matched sgRNAs.
  • asked a question related to Illumina Sequencing
Question
4 answers
Hi,
I applied both the E.Z.N.A.® Plant DNA Kit (Omega Bio-tek) and the Thermo Fisher ChargeSwitch kit (magnetic beads) for DNA extraction from freeze-dried plant leaves. The Omega kit seems to work best in terms of yield and PCR product. I want to use the Omega EZNA kit for DNA extraction, followed by ITS amplicon library preparation for Illumina MiSeq sequencing.
Is there any problem with using the EZNA kit for the Illumina MiSeq or HiSeq platform? Can I use the Omega EZNA plant kit for DNA extraction and then use the PCR products for Illumina sequencing?
Relevant answer
Answer
Yes, you can
  • asked a question related to Illumina Sequencing
Question
15 answers
Suggest me a low-cost DNA sequencing machine.
Please also mention the price if possible.
Thanks!
Relevant answer
Answer
Getting started with Oxford Nanopore DNA sequencing is easy. All products are available as Starter Packs, which include everything you need to cost-effectively perform your initial Nanopore DNA sequencing experiments. Devices start at just $1,000 with no CapEx required. But be careful with cheap products. They could make you penny wise pound foolish.
  • asked a question related to Illumina Sequencing
Question
5 answers
I would like to know what the main limitations of nanopore sequencing are, and which steps are the most difficult.
How about support and available applications (are the apps easy to use and free of charge)?
Are the real costs of using nanopore lower or higher than those of Illumina's solutions?
I have experience in Illumina sequencing, but I wonder if nanopore technology is real competition or still a technology "in progress".
What are your general opinions? How about sequencing COVID?
I would appreciate any responses from the users :)
Relevant answer
Answer
It's very hard to get wet lab nanopore going. Illumina is much more standardized and, for the majority of standard applications, it is a very good choice. For something that really needs long-read sequencing, I would suggest using a Certified Service Provider. You can then discuss your whole experiment with their scientific team and let them handle the specialized applications in the lab.
For Illumina there are tons of free, complete pipelines (look at the Nextflow pipelines). Nonetheless, investing in NGS without investing in an experienced bioinformatics team will not get you far, and you will never use the equipment to its fullest extent.
  • asked a question related to Illumina Sequencing
Question
3 answers
Hi everyone,
Illumina provides a list of primers to amplify the ITS1 region with high taxonomic coverage for fungal sequencing, but I cannot find the exact amount needed of each primer. I understand, then, that I have to mix them equimolarly. Am I right? Has anyone used them?
Thx!
  • asked a question related to Illumina Sequencing
Question
5 answers
Looking expert opinion...
I have collected marine sponge samples, which were shadow dried for two weeks. The sponge samples are now well dried and can be powdered directly by grinding. I would like to study the sponge-associated actinobacterial populations (uncultured) from the dried samples rather than fresh samples.
Here is my doubt:
If we grind the sponge and use the powder for metagenomic DNA extraction, will the DNA be damaged/sheared?
or
Can we directly use the dried sponge material (without grinding) for metagenomic DNA extraction?
Could someone kindly clarify my doubts?
Thanks in Advance,
Siva
Relevant answer
Answer
Hi Sivasankar Palaniappan ! How did the shadow drying method affect the DNA quality of your sponge sample?
  • asked a question related to Illumina Sequencing
Question
3 answers
Hi,
I am working on some 16S sequencing data, and seems some of them are low quality.
I am not sure how to set the trunc value in dada2.
qiime dada2 denoise-paired \
--i-demultiplexed-seqs paired-end-demux.qza \
--p-trim-left-f 0 \
--p-trim-left-r 0 \
--p-trunc-len-f 220 \
--p-trunc-len-r 180 \
--o-table table.qza \
--o-representative-sequences rep_seqs.qza \
--o-denoising-stats denoising_stats.qza
Plugin error from dada2:
No reads passed the filter. trunc_len_f (220) or trunc_len_r (180) may be individually longer than read lengths, or trunc_len_f + trunc_len_r may be shorter than the length of the amplicon + 12 nucleotides (the length of the overlap). Alternatively, other arguments (such as max_ee or trunc_q) may be preventing reads from passing the filter.
What parameters should I choose?
Thanks for all your help !!
Relevant answer
Answer
Your reverse read quality is very bad so it can't be paired to forward reads. Try using only forward reads.
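The conditions named in the error message can be checked before rerunning the command. The sketch below only mirrors those stated conditions (truncation length versus read length, and combined truncation length versus amplicon length plus the 12-nucleotide overlap); it is not DADA2's own logic, and the read lengths and amplicon length in the example are hypothetical values that would need to be replaced with the real ones for the dataset.

```python
def check_truncation(trunc_len_f, trunc_len_r, read_len_f, read_len_r,
                     amplicon_len, min_overlap=12):
    """Return a list of problems with the chosen truncation lengths.

    Mirrors the conditions listed in the error message only; quality-based
    filters (max_ee, trunc_q) are not modelled here.
    """
    problems = []
    if trunc_len_f > read_len_f:
        problems.append("trunc_len_f is longer than the forward reads")
    if trunc_len_r > read_len_r:
        problems.append("trunc_len_r is longer than the reverse reads")
    if trunc_len_f + trunc_len_r < amplicon_len + min_overlap:
        problems.append("truncated reads are too short to overlap across the amplicon")
    return problems


# Hypothetical example: 2x250 bp reads and a 460 bp amplicon (assumed values).
for problem in check_truncation(220, 180, 250, 250, 460):
    print(problem)
```

If the overlap condition cannot be met without keeping low-quality tails, that supports the suggestion above to proceed with forward reads only.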
  • asked a question related to Illumina Sequencing
Question
2 answers
I tried to run this function sitetest to perform Site-level Differential Methylation Analysis using IMA package but I got error message.
sitetestALL = sitetest(dataf,gcase="KO",gcontrol="WT",testmethod ="wilcox" ,Padj="BH", rawpcut = NULL,adjustpcut =NULL,betadiffcut = NULL,paired = FALSE) and I got this error message: Error in wilcox.test.default(x[1:length(lev1)], x[(length(lev1) + 1):(length(lev1) + : not enough (finite) 'x' observations
Can you help me to solve this problem?
Relevant answer
Answer
Hi
I suggest using if and else in lapply.
for example:
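# compute the statistic only when there are enough observations
if (nrow(column1) > 30) {
  x <- with(data, cor(a, b))
} else {
  # too few observations: fall back to a default value
  x <- 0
}
This is a good solution when the number of samples is small.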
  • asked a question related to Illumina Sequencing
Question
4 answers
I am interested in the different tools that can be used to create custom databases for targeted sequencing and how to trim the databases based on the amplicon size? Also, should custom databases contain species not assigned to a species level?
Relevant answer
Answer
Not familiar with taxonomy databases, but I can always recommend SDL Multiterm. It is a customisable type of database that is used for storing terms and their explanations, create 'fields' between them etc. Hopefully the program can be of help for you, or at least point you in the right direction.
  • asked a question related to Illumina Sequencing
Question
16 answers
I've been trying to figure out how to go about assembling raw sequences. I have 5 Lactobacillus strains and 1 Acetobacter strain whose genomes need to be assembled. I am trying to obtain a full-length 16S rRNA sequence using the 27F and 1492R primers before sending it off for Illumina MiSeq sequencing. What I am confused about is how to go about assembling the genome and annotating it. I am a first-timer. Please guide me.
Relevant answer
Answer
If you're using the 27F and 1492R primers for 16S rRNA amplification, they will produce a product of about 1500 bp. You can go for a long-read sequencing technology like Oxford Nanopore, which will sequence the entire 1500 bp product, unlike Illumina, which generates fragmented reads that then have to be assembled into a contig. Genotypic, a Bangalore-based company, offers Nanopore sequencing solutions.
  • asked a question related to Illumina Sequencing
Question
4 answers
Which primer pair is better for amplifying endophytic fungal communities (metabarcoding with Illumina sequencing) in roots of trees while avoiding host amplification? I would greatly appreciate a suggestion of a suitable primer pair to identify endophytic fungal communities in tree roots using Illumina sequencing. Some sequencing facilities have the ITS1-1F-F / ITS1-1F-R available, while the ITS3mix / ITS4ngs is also suggested in the literature (Tedersoo et al 2015). Which is better? Is there any other option?
Any advice based on experience will be highly appreciated.
Relevant answer
Answer
Try ITS86F/ITS4
I have used these for Illumina MiSeq and they worked well.
  • asked a question related to Illumina Sequencing
Question
1 answer
I ordered 16S primers according to the earth microbiome website, however, the researchers added GCT to the end of the Forward primer Illumina 5' Adapter to affect the melting temperature of the primer.
My laboratory decided to use these primers with barcoded Reverse primers to reduce overall costs by using dual index instead of single index seq. A sequencing company has informed me that this "GCT" extension can cause issues as seen below:
"Reason being is that your P5 adapter flowcell binding site is 3bp longer than the standard Illumina adapter. As a result, all of the Index 2 reads will start with “GCT”. This low diversity region may end up with quality drop or even run stop"
Has anyone tried this particular extension to the Illumina adaptor with dual indices, and not just single? The problem was not flagged when the primers with the extension were used as single-index barcodes, or by a different sequencing company, so I do not know how serious this issue could be.
Thanks for your help
Kelly
Relevant answer
Answer
Dear Kelly,
Yes, it could be troublesome, because the first three bases of i5 would be identical in each cluster, as your seq provider told you. High percentage of other library (with unrelated i5 sequences) could cure the problem though. I'd aim at no less than 20-25% addition of different type libs. Plus, these extra bases need to be added to i5 sequences in samplesheet, otherwise you will end up with all reads in the 'undetermined' file. You will also need to set the i5 length to 13 cycles to read entire barcodes (assuming that you are going to use the 10 bp Golay barcodes from EMP). Remember that MiSeq by default doesn't produce index reads fastq files, so no afterwards parsing is possible.
Best,
Marcin Golebiewski
  • asked a question related to Illumina Sequencing
Question
3 answers
I want to sequence the amplicons with two different library prep kits.
1. Nextera DNA Flex Library Prep. kit
2. QIAseq Ultralow Input Library Kit
Can I use the libraries from both preparation kits in a single cartridge simultaneously? If not, what is the possible reason?
Furthermore, is it possible to use the Adapters of one library prep. kit for another?
Relevant answer
Answer
The question is why you want to do that? These kits have different purposes ... anyway, you can use different kits as long as you’re using compatible adapters.
Good luck
  • asked a question related to Illumina Sequencing
Question
1 answer
I filtered different amounts of sea water and we did Illumina sequencing on the samples obtained. To be able to compare among different samples, I would standardize the samples by calculating the OTU reads per liter.
I found that QIIME suggests standardization to the minimum obtained to reduce noise. But since I don't work with the FASTA format but with an Excel table of OTU reads assigned to the different species I obtained, I am wondering whether calculating OTU reads per liter would be a standardization method that allows me to compare among my 33 different samples.
How do you standardize your HTS data of environmental samples?
Relevant answer
Following.
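The two options mentioned in the question can be tried side by side on a count table. The sketch below is only an illustration under stated assumptions: the OTU names, read counts and filtered volume are invented, `reads_per_liter` simply divides by volume, and `rarefy` is a plain random subsampling without replacement to a common depth, which is one way to read "standardization to the minimum obtained" but is not QIIME's implementation.

```python
import random
from collections import Counter


def reads_per_liter(otu_counts, liters_filtered):
    """Divide each OTU read count by the volume of water filtered."""
    return {otu: n / liters_filtered for otu, n in otu_counts.items()}


def rarefy(otu_counts, depth, seed=0):
    """Randomly subsample reads without replacement down to `depth` reads."""
    pool = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    return dict(Counter(rng.sample(pool, depth)))


# Hypothetical sample: counts per OTU and litres filtered (invented numbers).
sample = {"OTU_1": 1200, "OTU_2": 300, "OTU_3": 15}
print(reads_per_liter(sample, liters_filtered=2.5))
print(rarefy(sample, depth=500))
```

Note that the total number of reads per sample is set largely by the sequencing run rather than by the volume filtered, so the per-liter figures are worth checking against the subsampled table before drawing conclusions from either.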
  • asked a question related to Illumina Sequencing
Question
2 answers
Many DNA extraction protocols warn against vortexing during sample preparation as it tends to shear DNA into smaller fragments. This makes sense if the downstream application requires high molecular weight DNA, but vortexing is really handy. Illumina sequencing doesn't require high molecular weight DNA (shearing is part of the library preparation), but I notice people still taking pains to avoid vortexing during extractions that are intended for Illumina sequencing. Is there a compelling reason to do so, or is it more likely a case of not customizing a generalized protocol to the specific needs of the situation?
Relevant answer
Answer
Hi Eric,
Your instinct is correct and no, you do not need to avoid shearing during DNA preparation for samples that will be used for Illumina sequencing.
Those are steps that you should take if your DNA samples are intended to be sequenced using long read platforms, such as PacBio and Nanopore.
Best,
Sheng
  • asked a question related to Illumina Sequencing
Question
2 answers
Hi there,
I am working on pooled sequencing samples of Drosophila. I have three populations. When I first submitted my samples for (paired-end) Illumina sequencing, the biotech center informed me that they messed up sequencing the third sample and would have to redo the run. I still received the data from the other two samples, which were fine. When they redid the sequencing for the third population, I also received another set of reads from populations 1 and 2 from the second run. My advisor advised that I concatenate the fastq files from the original and redo runs for these populations to obtain more depth. I'm wondering a couple of things:
1. Has anyone else done something like this with their data?
2. Could this create any sort of problems or caveats that I should be aware of when analyzing my data downstream? (I'm currently using Popoolation2)
Thank you
Relevant answer
Answer
Hi Samual,
I suggest you process your second-run samples (all three populations) together, as already suggested, for the reasons given. You could also try merging your data for populations 1 and 2 from both runs. That will help you understand the effect of relative depth, if any. There will be no problem in merging your data from two different runs. You can merge either the fastq files or the BAM files. I would recommend merging the BAM files and then proceeding with your downstream processing.
I hope it will be of some help.
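For the fastq route, gzip-compressed files can be joined byte for byte, because a sequence of gzip streams is itself a valid gzip file. The sketch below is a minimal illustration; the file names in the commented usage lines are hypothetical, and for paired-end data the R1 and R2 files have to be concatenated in the same run order so that read pairs stay in sync.

```python
import shutil


def concat_fastq_gz(input_paths, output_path):
    """Concatenate gzip-compressed FASTQ files without decompressing them.

    Files are appended in the order given; use the same order for R1 and R2.
    """
    with open(output_path, "wb") as out:
        for path in input_paths:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)


# Hypothetical usage (file names are assumptions):
# concat_fastq_gz(["run1/pop1_R1.fastq.gz", "run2/pop1_R1.fastq.gz"], "pop1_R1.fastq.gz")
# concat_fastq_gz(["run1/pop1_R2.fastq.gz", "run2/pop1_R2.fastq.gz"], "pop1_R2.fastq.gz")
```

Keeping the per-run files as well makes it possible to map each run separately later and check for run-specific effects, in line with the suggestion above.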
  • asked a question related to Illumina Sequencing
Question
5 answers
Hi Everyone!
I am running a few population genomic studies on some invasive mammalian species, and I'm wanting to plan my study. I will shortly have some platinum phased chromosome level genome assemblies to scaffold everything on.
What I am wondering is what are the coverage levels required for various analyses? Obviously I have the trade off between samples and coverage.
Overall, I'm wanting to describe the levels of diversity across the population, look for genomic regions with higher and lower variability, and potentially look at some evolutionary historical questions such as founder number, origins and hybridization history, and maybe adaptation (this last one is probably very tricky given the demographic history). It's a little nebulous, but I hope to get a few different things out of the study, and I don't want to think in retrospect, 'damn, I wish I had done x coverage instead of y'.
My first thought was to do moderate coverage genomes (~20 - 30x ) for around 8 individuals sampled from different 'populations' in the introduced range, and ~4 individuals from the original range to compare diversity. I realise this is a small number from the original range, but it does make sense based on historical evidence. Then for genomic diversity at a landscape scale, do ~200 individuals with ddRAD. Get the best of both worlds.
I could alternatively do far more individuals at lower coverage.
It would be great to find out people's opinions on ideal coverage levels for different questions. Say, heterozygosity requires at least x coverage, or ROHs require y. Are there clear benefits for some questions of having 30x rather than 10x, or 10x over 1x, for instance?
Any thoughts would be greatly appreciated.
Relevant answer
Answer
Andrew Veale You have chromosome level reference assemblies available for specific mammals and want to know what sequencing coverage levels you need for various population genomic analyses: especially interested in trade off between samples and coverage
Here is my attempt.
1) Almost all analyses need to estimate the allele frequency of a group or population, around 50 chromosomes is required for any group, at a minimum, but given you may not know the groups before-hand around 50 diploid individuals for any likely groups is suggested.
2) If you base your metrics on the Poisson distribution around 2 reads/variant is optimal. However, many laboratory sequencing methods have GC and other biases so a little higher is better (i.e. improves the co-call rate for a particular locus when comparing two individuals). You really should avoid going below 1.
3) The next issue is how many variants and at what spacing, and that depends on laboratory methods used (i.e. ability to target specific variants when sequencing). Many analyses at a population level only need a relatively low number of variants ( ~200 for parentage, ~1000-2000 for strain composition or strain differentiation, ~10,000-20,000 for relationship matrices).
4) If you want to do selection sweep analyses or other genome regional analyses, you will need more variants but around the same depth as above for the markers within each region. The density of markers to use in turn depends on the effective “haplotype segments”. This in turn depends on the effective population size, genome length in Morgans and how many variants you want to mark each block and what allele frequency distribution within each block. I would suggest a minimum of 10 variants and given most mammalian genomes are ~30 Morgans in size it depends on the effective population size of the group. Figures range from 30,000 to 300,000 for Ne of 100 to 1000 which span the most frequent effective mammalian population sizes. Of course if the sequencing method has random spacing and SNPs it pays to be perhaps 1-2 times denser than these numbers would suggest.
5) Finally, there are some analyses like estimating runs of homozygosity or inbreeding estimates where you want to sample each chromosome of the diploid a little deeper typically around 8 reads per locus.
From these numbers you can work out what is best, but generally, once you reach these thresholds, it is better to spread your sequencing effort over more individuals than to go to a greater depth. In reality, sample numbers are often limiting as well.
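To make the depth figures in points 2 and 5 more concrete, the sketch below computes how likely a locus is to receive at least a given number of reads if read depth is assumed to follow a Poisson distribution. That assumption, the `min_reads` thresholds and the list of depths are choices made for illustration; real libraries carry GC and other biases, as noted above, so actual call rates will be lower.

```python
import math


def p_covered(mean_depth, min_reads=1):
    """P(X >= min_reads) for X ~ Poisson(mean_depth)."""
    p_below = sum(
        math.exp(-mean_depth) * mean_depth ** k / math.factorial(k)
        for k in range(min_reads)
    )
    return 1.0 - p_below


for depth in (1, 2, 4, 8):
    one = p_covered(depth)       # locus seen at least once in one individual
    both = one ** 2              # locus seen in both of two individuals (co-call)
    deep = p_covered(depth, 4)   # at least 4 reads, e.g. for heterozygote calls
    print(f"depth {depth}: >=1 read {one:.3f}, co-call {both:.3f}, >=4 reads {deep:.3f}")
```

The co-call column treats the two individuals as independent, which is an additional simplification.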
  • asked a question related to Illumina Sequencing
Question
4 answers
I have to send my sample for metagenomic sequencing, but my genomic DNA shows a clear band with a thin, long smear. It looks like my sample has degraded, but I'm not sure. I read somewhere that it's quite common for genomic DNA to show a smear, but I fear the sequencing lab will reject my sample.
So far, this is the best result I can get, and I'm on a tight schedule, so it's almost impossible for me to optimize another extraction method.
Do you think my samples qualify for sequencing? My samples are in lane 1 and lane 2 (I mistakenly cropped off the ladder, though).
I need your advice. Thank you so much!
Relevant answer
Answer
Hi Kathleen, in the first two lanes, the band is of relatively high intensity. Though there is smear which is indicating fragmentation. I still think you can proceed to library preparation with these samples. I am assuming you are going to do short-read Illumina sequencing only so slight fragmentation would not affect your sequencing run. Also, once the library is prepared, you can do a QC of it using a bio-analyzer or tape-station before loading it on the sequencer.
  • asked a question related to Illumina Sequencing
Question
1 answer
Hi everyone! I'm currently optimizing a lab protocol for 16S sequencing of low-abundance samples, and I'm trying to deplete residual DNA from the Q5 polymerase used for library preparation using 8-methoxypsoralen. After that we perform a cleanup with SPRI beads, then indexing and a second cleanup. Is it possible that 8-methoxypsoralen could inhibit or crash an Illumina sequencing run?
Relevant answer
Answer
I believe not. In my whole career I haven't seen MOP cause this kind of problem. I would suggest looking at the cleanup step; my guess would be that the problem lies there.
Let me know when you find out more.
Peter.
  • asked a question related to Illumina Sequencing
Question
3 answers
I would like to find a tool (if one exists) to predict the proportion of r vs K strategists in samples from a metabarcoding study.
For example, the tool I need (an R package or equivalent) could work similarly to functional predictive tools such as Tax4Fun, FAPROTAX or PICRUSt, but instead of predicting functions, it would infer ecological strategy, such as r vs K as described by r/K selection theory.
Benoît.
Relevant answer
Answer
Hi Benoît,
No program that I am aware of. I reckon that differentiating life strategists from amplicon sequences alone would be unreliable at best, considering problems such as incomplete reference datasets.
A good paper describing this, while trying to list taxa that behave as copiotrophs or oligotrophs:
Please let me know if you come across a relevant program, though.
Cheers
Ben
  • asked a question related to Illumina Sequencing
Question
2 answers
Which Illumina-compatible sequencing kits are recommended for the preparation of multiplexed libraries that cover the ITS region of fungi?
Relevant answer
Answer
You can use
  • ThruPlex DNA seq-kit, Takara
  • NxSeq kit, Lucigen
  • Nextera or TruSeq kits, Illumina
  • NEXTflex kit, Bioo scientific
  • KAPA kits, Roche
  • Custom primer with barcodes (Hugerth et al, 2014 or Elbrecht and Leese, 2015 or Elbrecht and Leese, 2017)
  • asked a question related to Illumina Sequencing
Question
2 answers
How do the criteria for designing biotinylated probes differ from those for designing primers?
Relevant answer
Answer
The design of primers versus probes can both be done fairly easily with the companies' design tools (as long as you know the regions you want to target), and most companies will provide you with assistance.
Most people prefer probe-based methods over primer-based methods since they work better on degraded samples (primer-based methods need enough intact copies of your regions to get amplification). If you have good high molecular weight DNA with little to no degradation, both primer and probe methods should be fine. If you have degraded samples, I would suggest using probe methods. I tend to just go with probe methods so that I can use them on both degraded samples (e.g. environmental DNA) and intact samples (e.g. fresh tissue extracts).
Hope this helps, and good luck!
  • asked a question related to Illumina Sequencing
Question
6 answers
I am working on a project aimed to determine the influence of long-term fertilization on soil microbial communities.  I am sampling both the rhizosphere and the bulk soil and hope to use the current best choice of primers for targeting bacterial and archaeal 16S rRNA genes. Initially I planned to use the primer pair 341F/785R, which targets the V3-V4 region of 16S and is reported to have good domain coverage for both bacteria and archaea. However, I now also have the option to use separate, archaea (956F/1401R)- or bacteria (969F/1406R) -specific primers, which target the V6-V8 region of 16S. The benefits of the separate primers are better coverage for archaea, and less eukaryotic sequence contamination, but the V3-V4 primers are the standard tool typically used in similar research. I am confused with which set should I proceed with or if there are any other primer sets I should be considering?
Relevant answer
Answer
Thanks all for your comments. Haitao Wang I agree with your suggestion but I guess the 515F/806R are shown to be biased against both the Crenarchaeota/Thaumarchaeota (https://msystems.asm.org/content/1/1/e00009-15) so after going through some articles I found that the V4-V5 primers 515F/926R performs well and can reduce bias and can detect more environmentally important taxa including archaea. I just posted here as I thought it might be helpful.
Best,
Sandipan
  • asked a question related to Illumina Sequencing
Question
1 answer
Dear All,
Could someone please suggest a proper PhiX concentration for multiplexed Illumina sequencing using the MiSeq kit v2 and Nextera XT?
samples #1-11 are highly similar - around 50-56%GC
sample #12 significantly different from the others - somewhere around 33%GC.
5% PhiX will be ok?
thanks in advance,
Piotr
Relevant answer
Answer
Dear Piotr,
The correct answer will depend on your libraries.
Do you work on amplicons? Do you work on metagenomics?
The first will need more than 10% (at least 15%), whereas the second will need 5%.
Best regards.
  • asked a question related to Illumina Sequencing
Question
1 answer
Anybody has experience with a kit for library preparation from DNA.
For T and B immune repertoire , for illumina sequencing.
BUT
samples are from FFPE!!
I use Lymphotrak and it is great with fresh tissue or blood, but not FFPE
Relevant answer
Answer
following
  • asked a question related to Illumina Sequencing
Question
4 answers
Which is better, V1-V3 or V3-V4 regions for 16S amplicons using 2x300 bp illumina sequencing? Are both good to target bacteria and archaea?
Relevant answer
Answer
Yes, I know, or PacBio, but the prices are too high for us. We will barely be able to afford Illumina, which is why I would like to decide on the best primers to make the most of the results we could get.
  • asked a question related to Illumina Sequencing
Question
1 answer
I am making libraries following a NEB protocol for Illumina sequencing.
The insert size is 350bp, which should yield 480bp long fragments in the library after amplification (insert+adaptor).
After amplification, I get longer fragments, with libraries peak centred on 800bp. Would anyone know why?
Relevant answer
Answer
Maybe the Fragmentation was too low.
Try increasing the fragmentation temperature for the RNA Fragmentation.
Best
  • asked a question related to Illumina Sequencing
Question
13 answers
I am very naive with the Illumina sequencing and I got the sequenced raw data and report. I am quite new with the technique and terminology. Can anyone explain what it means that was sequenced using "2x250bp paired-end reads"? To what do the numbers refer? 
Relevant answer
Answer
Abhijeet Singh,
I concur Marco's comments 100%. There are ways and ways to say one thing.
For the sake of correctness, in pair-end sequencing, both a forward and reverse reads are sequenced, and the addition of a nucleotide at each synthesis step is called a cycle. If you are using different terms in your lab, this is okay, but it doesn't necessarily mean this is the only correct name for use of terminology.
Good day!
  • asked a question related to Illumina Sequencing
Question
2 answers
Hi,
It is our first trial and I'm not very glad... We have performed a first experiment with eight fresh frozen samples (input 80 ng DNA).
We tried to Run and we are experiencing some problems, it seems that the MiSeq was able to detect the clusters (1086) but it wasn't able to give us any more results, and none of the clusters has good passing filter QC...strange...
Any idea?
Relevant answer
Answer
If you work with the QiaSeq panels, try the Seamless NGS software. It has a workflow that is able to correctly analyze Qiagen's QiaSeq panels (including UMIs, common sequence and single amplicon primer). Feel free to ask for a free personal demo (online) to see how the software performs!
  • asked a question related to Illumina Sequencing
Question
7 answers
Hi!
I've recently been trying to prepare a couple of DNA libraries for sequencing on the NovaSeq platform. In that regard, I've had to try several methods (I'm doing bacterial transposon insertion sequencing, where traditional prep methods don't work), and I realized that different kits apparently have different index primers.
For example, the NebNext kits seem to rely on the following as rd2:
5' - AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT
While Nextera kits rely on this as rd2:
5' - GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG
How come? Is this not a problem on the platforms? How do they know which to expect, to make sure that the indices are read?
Looking forward to your answers!
Relevant answer
Answer
Hello Mark,
There are many different kits for library preps, and they might come from different companies, such as NEB (NEBNext) or Illumina (Nextera). You need not worry though, their primers and adaptors are fit for many platforms!
For example, NEBNext manual gives some examples of valid combinations of primers for NovaSeq (manualE6440, page 10).
How does it work?
The most recent kit that I have used is NEBNext, so I am going to use this example. If you are planning to do dual-indexing, your NEBNext library prep will have many fragments, each with the following structure:
P7primer - SequencingPrimerBindingSite1 - DNA - SequencingPrimerBindingSite2 - P5primer.
When the library is sequenced, each of your fragments will attach to a template strand. This is because the primers you used are complementary to the flowcell strands. The first part of your fragment that joins the template strand is the P7 primer. The problem, as you mentioned, is that the two kits list different nucleotide sequences. There is a universal part of the index primer, so that sequencing works with all commercial kits. I suspect that these sequences are "hidden" from the kit instructions.
I find it useful to check some videos from Illumina to understand the processes that happen in the sequencing flowcell.
Good luck with your library preps!
  • asked a question related to Illumina Sequencing
Question
6 answers
I know the equation is Coverage = (read length) (number of reads) / (haploid genome length) but am a little uncertain on what to put in each field (namely read #).
I have an organism with a genome 5,000,000 bp long, it's the only DNA I would be sequencing and I want to calculate the average fold coverage.
I'm using the NextSeq500.
This is where I am currently, but this doesn't look at all right.
C = LN / G
L = 150 bp
N = 130 x106
G = 5x106 bp
C = (150) (130,000,000) / 5,000,000
C = 3900
Relevant answer
Answer
OK, so 106 is in fact 10^6...
So you're right, 3900 would be your mean depth.
But don't forget that waste and trimming will reduce your sequencing data, and that some regions, such as GC-rich regions, will have low coverage.
best
fred
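The arithmetic in this exchange can be checked with a short script. This is only a minimal sketch: the function name `mean_coverage` and the 20% loss figure are illustrative assumptions, not values from the thread, while the read length, read count and genome size are the ones quoted in the question.

```python
def mean_coverage(read_length_bp, n_reads, genome_length_bp, usable_fraction=1.0):
    """Expected mean fold coverage, C = L * N / G.

    usable_fraction is a hypothetical knob for the share of bases that
    survive trimming and filtering (1.0 means nothing is lost).
    """
    if genome_length_bp <= 0:
        raise ValueError("genome length must be positive")
    return read_length_bp * n_reads * usable_fraction / genome_length_bp


# Values from the question: L = 150 bp, N = 130 x 10^6 reads, G = 5 x 10^6 bp
ideal = mean_coverage(150, 130_000_000, 5_000_000)
# Assumed example: 20% of the bases are lost to trimming and filtering
after_loss = mean_coverage(150, 130_000_000, 5_000_000, usable_fraction=0.8)
print(f"ideal: {ideal:.0f}x, assuming 20% loss: {after_loss:.0f}x")
```

Keep in mind that this is an average: as noted above, individual regions can fall well below it.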
  • asked a question related to Illumina Sequencing
Question
4 answers
I want to do whole-genome sequencing using next-gen Illumina, but I have some problems.
Firstly, the organism is a mollicute with a small genome and is an obligate plant parasite with low titer. I cannot do a DNA extraction free from its host DNA. The best ratio I can possibly get is 1:10 bacterial to host DNA, optimistically. I can purify it using a NEBNext® Microbiome DNA Enrichment Kit, but that will still leave me with organelle DNA.
The problem is that the organism has a very low GC content, estimated at around 26% GC, and I've read that Illumina has a GC bias.
" For example, Illumina sequencing of a Plasmodium falciparum genome, which is extremely GC-poor with a mean GC content less than 25%, was found to favor the more GC-balanced regions, leading to few or no reads from the many GC-poor regions [13]. "
This is an issue because we are afraid not only that Illumina will favor the GC-balanced regions of the bacterium, but also that it will favor the host DNA, which has a normal GC% and already outnumbers the bacterial DNA. We would thus be left with a very low number of reads for our actual bacterial DNA.
Do you have any recommendations about this matter?
Also, if we cannot extract adequate amounts of DNA, we would have to resort to MDA. Does anyone have experience with MDA favoring higher-GC% genomes, and with it amplifying host DNA in preference to bacterial DNA? As a strategy to target my organism, I could add to the MDA oligomers that are reported in the literature to be frequent in the organism. Would that be recommended?
Would appreciate some answers.
Relevant answer
Answer
Pea aphid has a low GC content (about 35%) with a lot of bacteria symbionts and Illumina sequencing on DNA is working very well in this organism.
You'll just need to map your reads first to the host genome and map the unmapped reads to your mollicute genome.
You can do a test sequencing with really low number of reads, the proportion should not change much if you continue with deeper sequencing using the same library.
If you fear GC bias, then depending on what type of method you use and what kind of quantification you want to make, I think you can correct for the GC bias.
  • asked a question related to Illumina Sequencing
Question
4 answers
I got back results from Illumina sequencing in the form of 2 files. I trimmed them, and now I am trying to merge and convert them to a FASTA file so that I can assemble using IDBA-UD. Here's my code and the error that I received:
$ fq2fa --merge C_S7_trim_R1_001.fastq.gz C_S7_trim_R2_001.fastq.gz C_S7_outmergedfile.fa
Error: terminate called after throwing an instance of 'std::out_of_range'
what(): basic_string::substr
Aborted.
I have downloaded IDBA-UD to my home directory and am using the fq2fa function from it. Could anyone please tell me what is wrong with this? Any help is appreciated.
Thank you
Relevant answer
Answer
Decompress the data and try it.
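One way to do the decompression, sketched with the Python standard library. The helper name `gunzip` is hypothetical, and the file names are simply the ones from the question; plain `gunzip` or `gzip -d` on the command line would do the same job.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src, dst=None):
    """Write an uncompressed copy of a .gz file and return the new path."""
    src = Path(src)
    # By default drop the trailing ".gz": reads.fastq.gz -> reads.fastq
    dst = Path(dst) if dst is not None else src.with_suffix("")
    if dst == src:
        raise ValueError("output path would overwrite the input file")
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)  # streams, so large files are fine
    return dst


if __name__ == "__main__":
    # File names taken from the question; skipped if they are not present
    for name in ("C_S7_trim_R1_001.fastq.gz", "C_S7_trim_R2_001.fastq.gz"):
        if Path(name).exists():
            print("wrote", gunzip(name))
```

The original command can then be rerun on the uncompressed files, i.e. with `C_S7_trim_R1_001.fastq` and `C_S7_trim_R2_001.fastq` as inputs.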
  • asked a question related to Illumina Sequencing
Question
6 answers
Hello! I have bacterial (16S V3-V4) and fungal (ITS2) MiSeq data from soils collected on different farms, where I'm hoping to compare microbial networks in different farming types (conventional and organic). While OTUs have traditionally been used for network analyses, I'm wondering if, with the push for ASVs (https://www.nature.com/articles/ismej2017119), ASVs should be used instead? Based on the cautionary blog posts from Noah Fierer's lab (http://fiererlab.org/2017/10/09/intragenomic-heterogeneity-and-its-implications-for-esvs/), I'm a bit hesitant to use ASVs in network analyses, because strong positive correlations could simply be the result of having multiple sequence variants from the same organism. The only published example I've found with networks of ASVs (https://www.nature.com/articles/ismej201729) evaluates ASVs within OTUs, not just ASVs. Should only OTUs be used in the analysis of microbial networks? Or can ASVs be used instead, with the lack of published ASV networks simply due to the recent development of programs that produce ASVs? Many thanks for thoughts and ideas on this.
Relevant answer
Answer
There are already many articles about ASVs and fungi (see below). When you search with Google, also try "exact sequence variants" (ESVs) instead of ASVs and you will find more results.
Here are some suggestions and interesting articles about ASVs:
Article about ASVs in SCIENCE – Fungi ITS2
Fungal diversity regulates plant-soil feedbacks in temperate grassland
Article ASVs (ESVs) and OTUs - ASM Journal– ITS2 -
Broadscale Ecological Patterns Are Robust to Use of Exact Sequence Variants versus Operational Taxonomic Units
Article about ASVs in Frontiers in microbiology–Fungi ITS2
Fungal Biodiversity of the Most Common Types of Polish Soil in a Long-Term Microplot Experiment
Super interesting discussion and comments about ASVs and OTUs-
  • asked a question related to Illumina Sequencing
Question
5 answers
I have miRNA sequencing data in which the 3' adapter is followed by a random 12-nucleotide UMI sequence, then the reverse transcriptase primer and the universal adapter.
Overall sequence pattern = miRNA + 3' adapter + UMI + RT primer + universal adapter
I am trying to remove the UMI sequence using UMI-tools. I have also looked into the regex function in UMI-tools, but I am not sure it will work because of the sequence pattern I have.
Please let me know if I have any chance of processing the sequence with UMI-tools.
I have attached an example of my sequence pattern.
Image annotation: 3 prime adapter colored in RED
Reverse Transcriptase primer in Orange
and in between them is the 12 Nucleotide UMI bases.
And after Reverse Transcriptase primer is Universal Adapter.
Relevant answer
Answer
Rajesh Pal, in the discussion I concluded that if one can't explain what he/she wants to do, it's hard to give the right answer.
  • asked a question related to Illumina Sequencing
Question
2 answers
I am trying to do some quality control checks on Illumina NGS paired-end reads using FastQC 0.11.5. I tried to add the primer sequences from the NEBNext multiplex library prep kit to the contaminant_list file. I have already noticed that the file has non-Unix carriage return characters (non-printable, ^M or \r). Neither removing them nor making sure that all lines have them changes the error. Even when I use the original contaminant file I still get:
Option c is ambiguous (casava, contaminants)
Started analysis of myfile.fastq.gz
Failed to process /path/to/file/contaminant_list.txt
uk.ac.babraham.FastQC.Sequence.SequenceFormatException: ID line didn't start with '@'
Curiously the sequence file being analyzed seems to be processed okay, so it is not corrupted:
Approx 5% complete for myfile.fastq.gz
Approx 10% complete for myfile.fastq.gz
Approx 15% complete for myfile.fastq.gz
Approx 20% complete for myfile.fastq.gz
etc....
Here are the commands I am using:
fastqc myfile.fastq.gz -o /path/to/fastqc_out -a path/to/adapter_list.txt --noextract -t 6 -j /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.191.b12-0.el6_10.x86_64/jre/bin/java -c /path/to/Configuration/contaminant_list.txt
Anybody know what is going on?
Relevant answer
Answer
It never hurts to really pay more attention to the error messages folks!
FastQC does not like you to use the -c flag if you haven't used either --casava or --contaminants. Apparently -c can mean both. Replacing -c with --contaminants in the scripts makes the error go away. Hopefully this will help some aggrieved DNA jockeys out there. Happy scripting!
  • asked a question related to Illumina Sequencing
Question
4 answers
Hi, we have two Illumina-sequenced plant samples, where one is a mutant of the other. Do you know a straightforward method to compare them in order to spot small mutations (SNPs, InDels) or maybe longer ones (LTR insertion; I don't think CNV), or do you suggest aligning both to a reference, calling mutations and then comparing?
Tnx
Relevant answer
Answer
Dear Marco,
You may consider the artMAP tool for your purposes (https://www.biorxiv.org/content/early/2018/09/12/414433.full.pdf+html). The final analysis of the tool will only report SNPs with G-to-A and C-to-T conversions, i.e. the EMS-induced ones. However, the pipeline will produce VCF files with SNPs from both samples, and you can use the bedtools subtract function, or any tool you like, to easily get the mutant-specific SNPs. Also, the tool is ready to use (Linux/Mac/Win) since it is provided through a Docker container, see here: https://github.com/RihaLab/artMAP/blob/master/artMAP.md.
Otherwise, you can have a look at the Galaxy project (https://usegalaxy.org/), where you can set up your own analysis pipeline or load a predefined one.
I hope this can help you,
Best regards,
Andrea
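The "subtract" step suggested above can also be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of artMAP or bedtools: the function names are hypothetical, the VCF records are made-up toy data, and a variant counts as shared only when chromosome, position, REF and ALT all match exactly.

```python
def variant_keys(vcf_lines):
    """Return the set of (chrom, pos, ref, alt) tuples found in VCF text lines."""
    keys = set()
    for line in vcf_lines:
        # Skip blank lines and header lines (## metadata and #CHROM)
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        # Multi-allelic records list several ALT alleles separated by commas
        for allele in alt.split(","):
            keys.add((chrom, int(pos), ref, allele))
    return keys


def mutant_specific(mutant_lines, parent_lines):
    """Variants called in the mutant but absent from the parent."""
    return sorted(variant_keys(mutant_lines) - variant_keys(parent_lines))


# Toy records (not real data): the mutant carries one extra G>A change
parent = [
    "##fileformat=VCFv4.2",
    "chr1\t100\t.\tG\tA\t50\tPASS\t.",
    "chr2\t7\t.\tC\tT\t50\tPASS\t.",
]
mutant = parent + ["chr1\t250\t.\tG\tA\t60\tPASS\t."]
print(mutant_specific(mutant, parent))  # [('chr1', 250, 'G', 'A')]
```

With real files the same functions can be fed `open("mutant.vcf")` and `open("parent.vcf")` (placeholder names). A variant missing from the parent VCF may just reflect low coverage there, so candidates are worth checking in the alignments.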
  • asked a question related to Illumina Sequencing
Question
3 answers
What exactly is the Nextera PCR Master Mix from the Nextera XT DNA Library Prep kit composed of? And at what concentrations, if possible?
I saw that this question was asked already, but the answers weren't specific. I already e-mailed tech support, but I thought I would ask here too because I am always enlightened in some way by you guys!
Thank you in advance!
Relevant answer
Answer
In my experience, you can also use KAPA HiFi DNA polymerase and the related buffer for your Nextera library amplification. Remember not to use the hot-start version, because the first step is a 72 °C extension.
  • asked a question related to Illumina Sequencing
Question
3 answers
I would like to study the metagenomic diversity of actinobacteria associated with marine sponges. In this context, I would like to use actinobacteria-specific primers to capture the complete phylum Actinobacteria rather than using universal primers. In particular, I would like to explore rare actinobacteria such as Salinispora.
I am aware that the actinobacteria-specific primers given by Stach et al. 2003 (SC-Act-235aS20, SC-Act-878aA19) may be useful in my case. However, as the amplicon length is 640 bp, they cannot be used for Illumina.
Illumina can deliver less than 400 bp.
Blackwood et al., 2005 have also proposed phylum-specific primers. They reported that the Actinobacteria-specific primer "Act1159R" captures most of the actinobacteria they tested. Can it be useful in my case? Is the reverse primer alone sufficient to explore the phylum Actinobacteria?
Could someone kindly suggest the right primer for my purposes?
Thanks in Advance.
Relevant answer
They first ran an Actinobacteria-specific PCR with a 1.2 kb amplicon length, and then another PCR with universal primers to amplify the V3-V4 region of the 16S rRNA gene.
  • asked a question related to Illumina Sequencing
Question
8 answers
This is a video on Illumina Sequencing that I really like:
However, I still have some technical doubts and I do appreciate your expert comments on any of these:
1) After preparing the sample DNA it is introduced on the lanes of the Illumina flow cell. How is it ensured that DNA will be uniformly distributed and not bind preferentially around to the point of injection?
2) My understanding is that the number of DNA bridge amplification cycles will affect the size of the clusters generated on the flow cell. How many cycles of bridge amplification are necessary? What impact does this have on the results?
3) With different fluorescent tags on the different nucleotides, can all nucleotides be present in the flow cell simultaneously? How can it then be ensured that polymerization of all the strands in a cluster occurs in a synchronized manner, so that a unique signal is measured?
Thank you very much for your help!
Relevant answer
Answer
The answer that Abhijeet provides is accurate, but let me add a point of clarification. To make more effective use of the flow cell surface space, Illumina created the patterned flow cell with distinct nanowells for cluster generation. Patterned flow cells are produced using semiconductor manufacturing technology, which ensures that DNA clusters only form within the nanowells, providing even, consistent spacing between adjacent clusters and allowing accurate resolution of clusters during imaging.
  • asked a question related to Illumina Sequencing
Question
3 answers
EDIT 29/10/2018: Added more details to bottom.
This might be a bit weird, but I thought I'd try here anyway.
Is it possible to configure Bowtie2, or similar programs, to match my query-sequence only from the end of the sequencing reads?
Query1:
AAATTTGGG
Read1:
NNNNNNNNNNNNAAATTTGGG
This query should give an exact match score for Read1, no matter what the N's contain, while
Query1:
AAATTTGGG
Read2:
NNNNNNNNNNNNAAATTTGGGNNN
Or anything else where the end of the read does not match the query exactly, would result in a mismatch.
I'm currently just using grep to match my reads in this way, but it is terribly slow.
EDIT:
Bit of the background. I'm sequencing an oligo library coding for short peptide sequences. The peptide length varies, so to make them all behave the same way in PCR etc., I have added filler sequence bringing them all to length of 200 bp. Each peptide coding sequence starts with Kozak (GCTAGCCCACC). So each oligo in the library looks like this:
NNNN...NNNN-GCTAGCCCACCATGACCACAGGAGACACCTAGCT
1bp----Filler---------Kozak-----------------peptide-coding-----------200bp
Because of errors in oligo synthesis, PCR, and Illumina sequencing, there can be SNPs/indels in these sequences in the sequencing reads. Any mismatches in the N-part (filler) are of no consequence to the peptide being produced, which is why I would like to match the sequencing reads to my library only starting from the Kozak sequence. The filler length varies from 0-150 bp, so I can't think of an easy way of trimming these from the reads.
Relevant answer
Answer
* If you are aligning both sequences, the query will only align to the same or a similar sequence in your read; it doesn't matter whether there are Ns.
* But if you already know that there are Ns in your read, the obvious approach is to trim the Ns before moving on to MSA.
  • asked a question related to Illumina Sequencing
Question
4 answers
454 GS FLX Titanium system which is capable of generating 700 megabase (Mb) of sequence in 700 bp reads.
MiSeq reagents enable up to 15 Gb of output with 25 million sequencing reads and 2 × 300 bp read lengths.
In the first case, to sequence a total of 700 million bases, does it do so by sequencing 700 base pairs at a time in a run?
In the second case, to sequence 15 billion bases, what do 25 million sequencing reads and 2 × 300 bp reads mean?
And does the algorithm align these 300 bp sequences together to generate whole sequences?
Relevant answer
Answer
"To generate 700mb seq optimal primer length would be 18-25 bp only right? same is the case with both 454 GS FLX and MiSeq? or only MiSeq generate it as two 300mb seq.
In order to align them in to one seq we create two 300mb overlapping seq combine it as one whole 600mb seq.
A 'read' is a single seq after join two 300bp into one 600bp? correct me if I'm wrong."
Correct, you have to generate a "forward" read and a "reverse" read that need to overlap. The combination of the two will give you a consensus read used downstream for the data analysis.
Just to be sure, we are talking of "base pairs" (bp), so each forward and reverse sequence needs to be maximum 300 nucleotides (for MiSeq) if you want them to overlap.
When targeting a single gene the primers will be specific to that single gene why would it generate 25 million reads? okay u mean from many DNA molecules harboring the same gene in the sample? how did they determine it as 25 million it could be less or more than that?
For example, if you have DNA extracted from a full insect, targeting the 16S rRNA you will be obtaining all the bacterial DNAs from the insect's guts and the insects symbionts. This can amount to hundreds and hundreds of different bacterial taxa. And each taxon can obviously be present in a multitude of individual bacteria. So, yeah, the numbers can increase pretty quickly!
However, I am not sure how they got that exact estimate (25 million reads). It may be a technical limitation rather than a biological one, which would mean that the MiSeq is usually not capable of generating more reads. I am not sure about it, though.
What if I deal with a new organism how do we seq its whole genome when the seq is not available on pubmed. Using random hexa primers or how do they do it these days? eg. nanopore offers real time DNA seq with out amplification or extraction?
Well, if you want to sequence a whole genome you probably prefer a "long reads" sequencer (e.g. PacBio, Nanopore) instead of a "short reads" one (e.g. MiSeq). Or you can use a short reads sequencer in addition to the long reads one, too. These long reads sequencers can indeed generate very long sequences (more than 100Kb) that you will have to assemble into contigs until (hopefully) you will obtain a continuous sequence of each chromosome. But that's a whole different world, especially for the data analysis, as compared to short reads sequencing.
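The output figures quoted in the question follow from simple multiplication, which a short script makes explicit. This is a rough sketch: the function name is hypothetical, and it assumes that the "25 million reads" figure counts clusters that are each read from both ends (2 × 300 bp), while the 454 figure is treated as single reads of 700 bp.

```python
def run_yield_bases(n_fragments, read_length_bp, reads_per_fragment=1):
    """Total bases in a run = fragments x read length x reads per fragment."""
    return n_fragments * read_length_bp * reads_per_fragment


# MiSeq figures from the question: 25 million reads, 2 x 300 bp
miseq = run_yield_bases(25_000_000, 300, reads_per_fragment=2)
print(f"MiSeq: {miseq / 1e9:.0f} Gb")  # MiSeq: 15 Gb

# 454 GS FLX Titanium figures from the question: 700 Mb in 700 bp reads
gs_flx_reads = 700_000_000 // 700
print(f"454: about {gs_flx_reads:,} reads")  # 454: about 1,000,000 reads
```

So "700 Mb in 700 bp reads" corresponds to roughly one million separate reads per run, not a single 700 bp stretch sequenced at a time.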
  • asked a question related to Illumina Sequencing
Question
3 answers
Actually i want to do forensic genetics NGS and need assistance in this regard,
Relevant answer
Answer
Thank you so much Fei Guo and Marcela Krutova
  • asked a question related to Illumina Sequencing
Question
4 answers
We are looking for some protocols to genomic DNA extraction of H. pylori with the quality required to be sequenced in MiSeq system of Illumina. Does anyone have suggestions?
Relevant answer
Answer
I agree with Ronan, the DNeasy Blood and Tissue Kit (Qiagen) is a great kit for preparing bacterial genomic DNA for Illumina sequencing. To increase throughput, you can use a Qiavac vacuum manifold to pool several tubes containing the same sample together.
  • asked a question related to Illumina Sequencing
Question
4 answers
Next Generation Sequencing techniques are currently used in microbial profiling and other studies. Between the NGS (specifically Illumina Miseq Technique) and PLFA, which is the best method to employ in microbial studies? What are the pros/cons of either?
Or, besides the two, what is the other best method to employ in microbial studies?
Relevant answer
Answer
What makes PLFA not the most suitable for this?
  • asked a question related to Illumina Sequencing
Question
5 answers
Hi all,
As we all know, 16S amplicon sequencing data from Illumina is semi-quantitative, so I want to normalize my sequencing output using 16S qPCR data. Can you please recommend a statistical method for doing so?
Thank you!
Arslan
Relevant answer
Answer
Yes, okay, that's right. The product was the same at both concentrations.
Let's assume that I have no inhibition. How do you think the normalization can be done? Is there any method you recommend?
  • asked a question related to Illumina Sequencing
Question
5 answers
Hi all, 
I am trying to plot rank abundance curve for my 16S rDNA data in which I have different treatments. When I do so, I get one curve only describing about the presence of each species. However, as I have different treatments, I want to plot multiple curves to show the relative comparison and species distribution. This is not working for me. I am attaching the plot which I got and would be thankful for suggestions / help!
The code i am using is given below
# Convert counts to relative abundance (percent) within each sample
phyloTemp = transform_sample_counts(physeq, function(x) 1e+02 * x/sum(x))
# Melt the phyloseq object into a long data frame and drop zero abundances
clusterData = psmelt(phyloTemp)
clusterData = dplyr::filter(clusterData, Abundance > 0)
# Mean abundance per OTU across all samples (treatments are pooled here)
clusterAgg = aggregate(Abundance ~ OTU + Genus, data=clusterData, mean)
# Keep the 100 most abundant OTUs
clusterAgg = clusterAgg[order(-clusterAgg$Abundance),][1:100,]
ggplot(clusterAgg, aes(x=reorder(OTU,-Abundance), y=Abundance)) +
  geom_point(aes(color=Genus), size=3) +
  theme(axis.ticks = element_blank(), axis.text.x = element_blank()) +
  scale_y_log10()
Relevant answer
Answer
You can use the amp_rabund function with plot.type = "curve" from the ampvis R package.
  • asked a question related to Illumina Sequencing
Question
3 answers
I work with Gram-positives (S. aureus), and lately we have done single-read Illumina WGS for different clinical and lab strains. The results are really good. I have also seen publications using Illumina single reads.
The problem is that we now have to do WGS on some Gram-negatives (Klebsiella and E. coli), and I have read that paired-end reads are better for this kind of bacteria. I want to know whether that is true or whether we can also work with single reads (cheaper), and, if paired-end is better, why (genome size? number of plasmids?).
Thanks!!
Relevant answer
Answer
Hi Celeste!
Would you share the reference that was shared with you? It doesn't appear in the comments.
Thanks a lot.
  • asked a question related to Illumina Sequencing
Question
4 answers
My samples seem to have similar levels of amplification with no primer clouds or primer-dimers and no secondary amplification. Is there a reason I need to magnetic bead clean up each sample right after PCR instead of doing them all together after quantification/dilution/pooling?
Thanks!
Relevant answer
Answer
Hey David, I think the main reason you need to do PCR --> purification --> Qubit --> pooling is that you want to get an accurate reading on your final output (purified PCR product). From there you can ensure you pool an equal molarity of each indexed sample. Of course, this is assuming you want the same number of reads for each of your samples.
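The equimolar pooling step can be sketched numerically. The conversion below uses the common approximation of 660 g/mol per base pair of double-stranded DNA; the function names, sample names and concentrations are made up for illustration and are not from the thread.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA (approximation)


def molarity_nM(conc_ng_per_ul, length_bp):
    """Convert a dsDNA concentration in ng/uL to nM for a given fragment length."""
    return conc_ng_per_ul * 1e6 / (AVG_BP_MASS * length_bp)


def pooling_volumes_ul(samples, fmol_each):
    """Volume of each library to pool so that every sample contributes fmol_each.

    samples maps a name to (concentration in ng/uL, amplicon length in bp).
    Since 1 nM equals 1 fmol/uL, volume = fmol / nM.
    """
    return {
        name: fmol_each / molarity_nM(conc, length)
        for name, (conc, length) in samples.items()
    }


# Hypothetical Qubit readings for two purified, indexed amplicon libraries
samples = {"sample_A": (6.6, 500), "sample_B": (3.3, 500)}
for name, vol in pooling_volumes_ul(samples, fmol_each=100).items():
    print(f"{name}: {vol:.1f} uL")
```

In this toy case sample_A is 20 nM and sample_B is 10 nM, so twice the volume of sample_B is needed to contribute the same number of molecules. Residual primers and dimers would inflate the Qubit reading, which is one reason to quantify after clean-up.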
  • asked a question related to Illumina Sequencing
Question
5 answers
We recently sequenced 96 samples on a NextSeq, prepared from single- and double-stranded DNA using the appropriate protocols.
It turns out that the samples prepared with the single-stranded protocol seemingly did not work. One of the things I can see is that we got a very high frequency of long poly-G sequences (~20%). Manually checking the fastq files shows both complete 76 bp poly-G sequences and, in other cases, a small poly-A tail (10 bp) following the indexing adapter, followed in turn by a long poly-G tail. Do these poly-G tails mean no signal (considering the NextSeq's two-colour chemistry), and should they therefore be ignored and trimmed? Can they be the symptom of some library preparation problem (bad adapter fill-in, blunt-end repair)? Thanks in advance!
  • asked a question related to Illumina Sequencing
Question
5 answers
Hi all,
I am new to the Illumina system. When I did the PCR and ran the gel using the normal sets of primers, everything was nice and clean. However, when I used the primers with Illumina adapters, I got non-specific bands and a smear. Can someone help me, please?
PCR conditions are the same for both with/without the adapters.
16S primers: 515'F/926r
GTGBCAGCMGCCGCGGTAA
CCGYCAATTYMTTTRAGTTT
16S primers with adapters:
5'[TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG]GTGBCAGCMGCCGCGGTAA-3'
5’[GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG]CCGYCAATTYMTTTRAGTTT-3'
16S PCR condition:
95 C - 5 min
35 cycles of 95 C - 30 sec/ 55.5 C - 30 sec / 72 C - 30 sec
72 C - 10 min
18S primers: Euk1391f/EukBr
GTACACACCGCCCGTC
TGATCCTTCTGCAGGTTCACCTAC
with adapters:
5’[TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG]GTACACACCGCCCGTC-3'
5’[GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG]TGATCCTTCTGCAGGTTCACCTAC-3'
18S PCR condition:
94C - 3 min
35 cycles of 94C - 45 sec/ 57C - 60 sec/ 72C - 90 sec
72C - 10 min
These are all environmental DNA samples, and the concentrations of the original extracted DNA are very low.
Thank you!
Kitty
Relevant answer
Answer
If the topic of discussion is about illumina sample prep, then it should be understood that the protocol should be about sample prep too. I did not mention anything about sequencing protocol.
  • asked a question related to Illumina Sequencing
Question
3 answers
Can ambiguous IUPAC nucleotides (R, S, Y, K...) ever appear in the raw reads from a fastq file?
Based on my experience I have only seen A, T, G, C and N.
Thanks
  • asked a question related to Illumina Sequencing
Question
4 answers
I've just finalized the analysis of a vast transcriptomic study in maize with 8 conditions and n=5 (40 libraries sequenced with Illumina technology). The conclusions are very pleasing, robust statistically and complement nicely a parallel metabolomic analysis.
Most published Illumina RNASeq studies validate their sequencing results by conducting some random qPCRs on parallel samples. But they are vague about how they do it and I realize that this will engage me into a massive work: 40 samples X 12 genes (10 randomly selected genes plus 2 housekeeping control genes) X 3 technical replicates X 2 cDNA library dilutions = 2 880 qPCRs. This excludes prior pilot studies to be conducted on each gene to determine the appropriate dilutions at which the qPCRs need to be conducted. This will cost as much as the Illumina sequencing !
Because of the high number of independent biological replicates (n=5) of my RNASeq dataset, is it really necessary to validate the results with some qPCRs ? And if yes, is there a way to diminish the number of qPCRs ?
Thank you in advance for any help on this.
Relevant answer
Answer
Dear all,
Thank you very much for taking the time to read my question and for those detailed answers. I am extremely grateful to you and I'll try to compile and follow your advice!
Best regards,
Camille
  • asked a question related to Illumina Sequencing
Question
3 answers
I need to enrich my genomic DNA sample from bacterial strains with plasmid DNA extracted from them by specific kit. But I have not found any work as a reference, and I believe my genomic analysis will be more complete with this, since these have large plasmids that may be important.
Thank you for your attention !!
Relevant answer
Answer
Please consult the following publication:
Comparison of six commercial kits to extract bacterial chromosome and plasmid DNA for MiSeq sequencing
  • asked a question related to Illumina Sequencing
Question
3 answers
When exploring microbial diversity with high-throughput sequencing approaches such as 454 pyrosequencing or Illumina MiSeq, the terms "microbial community" and "microbial diversity" can be confusing, or can look similar in meaning. Could you please explain precisely how they differ?
Relevant answer
Answer
I think they should not be confused, as they are quite different in meaning. A fungal community is the set of all the fungi present together in a given habitat. Fungal diversity is the extent to which different species or subspecies are represented in that community. If diversity is low, only a few taxonomic categories are present; if it is high, many are. The breadth of diversity reflects how closely or distantly the fungal taxa are related. To apply the same concept to a human population (just as an example): a community is the people taken together. Diversity is low if we classify people by sex, as there will be only three functional categories. If we instead classify them by 5-year age groups, there will be 20 classes (for ages up to 100 years), and in that case the community is highly diverse.
In microbiology, the genus level is commonly taken as the functional class for evaluating diversity in a fungal or bacterial community.
  • asked a question related to Illumina Sequencing
Question
8 answers
Hi,
I'm working on soil microbiology and I want to check the microbial diversity in the soil. I have 2 options: paired-end Illumina or 454 pyrosequencing. Which technique is better to use, and why?
Relevant answer
Answer
  • asked a question related to Illumina Sequencing
Question
6 answers
Hi there,
I'm studying changes in the frequencies of approximately 40-50 SNP alleles, in nine regions across a country, within a gene of interest in a plant pathogen. To gather this data, a large number of samples have been taken from each of the nine sites for six sequential years and Illumina sequenced as a pool. So I will have nine .bam and .vcf files with around a hundred reads (high coverage) each, giving me a number analogous to allele frequency for each SNP, for each site, in each year.
I am trying to determine an appropriate statistical analysis for detecting significant changes in allele frequency across this six-year time frame. So far I think a CMH test (probably using the R package popoolation2) will be my best option, with year and allele frequency as the nominal variables and sites as replicates. However, I'm conscious of heterogeneity between sites, which given my previous results is unlikely, but possible.
Would anyone be able to give me any other options? I was also thinking of a GLM with Bonferroni correction, but I cannot find a convenient way of implementing this with a six-level qualitative independent variable.
Relevant answer
Answer
Following
  • asked a question related to Illumina Sequencing
Question
3 answers
Hello everybody!
Can anyone help with, or recommend a laboratory or institution for, bioinformatic analysis of Illumina MiSeq full gene region sequencing results?
We paid them! We need to sequence genes for 3 types of diseases, roughly 10 to 50 genes each. Please send recommendations or solutions to my email: shalkar.bt@gmail.com
Relevant answer
Answer
Hi Tong,
It's OK. If you want, send me your VCF files and I'll see what I can do for you. But please do not send me any identifying information about the samples.
fred
  • asked a question related to Illumina Sequencing
Question
17 answers
Hello everyone,
I was wondering if anyone has a good suggestion for an Archaeal 16S rRNA gene primer pair which is routinely used for paired end sequencing. I am having trouble in shortlisting a primer pair as most people are using the universal 16S primer (which I have already tested and it doesn't cover the diversity in my environmental sample) or the primers which amplify only shorter reads not covering the expected length  (approx. 400 to 500 bp) that I am looking for.
Any suggestions would be really helpful!
Thanks in advance.
Relevant answer
Answer
Hi,
Oh yeah, it seems the quality drop is really dramatic.
Did you ask the company how good the internal quality check was (I think Illumina assays sequence a small stretch of DNA, like an internal standard, after each run to see if the quality was affected)?
Unfortunately, the MiSeq assay we tried failed completely due to some sequencer-related issues, and our barcoding approach was more of a problem at that time, so we moved to HiSeq runs (150 bp * 2).
I will attach the read quality of that run if you would like to check it out.
You can see that in our case we also have generally poor read quality on the R2 side (last 30 bases or so).
Nonetheless, a quick search online also showed me that people find it normal for the Q score of R2 reads to be between 25 and 30 (which doesn't reject many sequences during downstream filtering). They seem to give no reason as to why it happens, but say that it is normal to observe this (Biostars, SEQanswers etc.).
You can see that our R2 scores generally settled at around ~27, but with huge deviations.
I would definitely contact the company and ask them the reason behind this.
You seem to have generally bad Q scores after about 150 bp, which is not so nice.
I don't know how you added the Illumina adapters to your amplicons, DNA etc.
We do a blunt-end ligation, which means that our amplicons can be oriented with either the forward or the reverse primer after the forward sequencing primer region.
Based on this, I don't think that the primer would be the problem. If that were the case, we would see bad-quality reads (due to lack of heterogeneity in the reverse primer, maybe) in the R1 file too.
But I can't say much, as I don't know how you prepared your library.
Hope this helps.
Regards,
Ajinkya
  • asked a question related to Illumina Sequencing
Question
17 answers
I am having some major challenges in PCR amplifying DNA from mixed community samples using degenerate primers that have inosine bases in them. The same samples amplify fine with a regular Taq polymerase (lovely bright bands, good Sanger sequences), but as soon as I try a high-fidelity type, I get no amplification or extremely low levels across a range of DNA concentrations. I worry that the polymerases are mainly incompatible with the inosine bases in the primers, as some high-fidelity polymerases explicitly state they are incompatible with inosines, but not all polymerases mention this problem in their guides. Therefore, I do not know if it is a universal problem with high-fidelity polymerases or just a problem with certain brands. So far my tests suggest it is the former.
So far I have tried:
Kapa HiFi (regular, and Hot start, Fidelity and GC buffer)
Phusion HiFi mastermix - did not amplify
Platinum SuperFi master mix - notes a problem with inosines, did not amplify
With Kapa HiFi (regular, not hot start), I was able to get a faint band after 40 cycles by adding 100 ng DNA/10 uL reaction, GC Buffer, and BSA. Adding less DNA = no bands. It seems crazy to add so much DNA and still get very little amplification, given that this is the mitochondrial COI gene (multiple copies). With a regular Taq (MangoMix and AmpliTaq Gold) I get strong amplification of my target sequence by adding only ~5 ng DNA/reaction, all other things remaining equal.
Ultimately I will do Illumina MiSeq on the indexed product to sequence diverse COI amplicons, so I should really have a high fidelity polymerase doing the amplification.
Thank you in advance for sharing your experience in this area.
I have not worked with inosine bases before and did not expect it to be such a problem! I am using someone's published primers and tried to follow their methods as much as possible, but they did not use a high-fidelity polymerase in either of their publications (though they recommend doing so in the earlier of the two pubs!). Perhaps they tried and also failed, which is one reason there should be a journal of failed methods!
Relevant answer
Answer
Hi, I am a bit late to the party, but I have had good success with:
KAPA HiFi HotStart Uracil+
"The Uracil+ version enabled efficient high-fidelity amplification of TAK multiplex primer set that had deoxyinosines."
  • asked a question related to Illumina Sequencing
Question
6 answers
I am very interested in Illumina sequencing of the 16S rRNA gene of gut microbiota.
Relevant answer
Answer
Illumina sequencing requires costly enzymes and instruments, since it relies on sequencing-by-synthesis chemistry.
There is a newer NGS platform that is extremely cheap in comparison. Nanopore sequencing by Oxford Nanopore Technologies is a great option, especially for 16S sequencing. Look into the MinION for a cheap option. They even offer quick analysis tools post-sequencing to help with data analysis of metagenomic samples.
Hope this helps!
  • asked a question related to Illumina Sequencing
Question
3 answers
Does anyone have experience assessing RNA quality after RNA-immunoprecipitation? My downstream goal is RNA-seq (RIP-seq), and I want to check the quality of the RNA before proceeding to library preparation.
I've run my eluted, DNase-treated, and purified (Zymo Clean & Concentrator) RNA on an Agilent Bioanalyzer RNA Pico chip.
Much to my dismay, my RIN was very low (~2), with very low concentrations.
Before trashing these very difficult-to-obtain samples, I just want to know if the Bioanalyzer is an accurate indicator of RNA quality when that RNA is isolated from an immunoprecipitation experiment. Any input would be great.
Relevant answer
Answer
Hi, Samantha. I'm working with plants and use an antibody-free system in my experiment, but this article helped me. The authors determined the quality of RIP samples (without cross-linking) on a Bioanalyzer.
  • asked a question related to Illumina Sequencing
Question
2 answers
I just have a question regarding normalization for Illumina sequencing. We all have the problem of doing sequencing and ending up with unequal sequencing depth across samples.
To deal with it, we compute relative abundance or log-transform the counts, and each approach has its own problems.
I am just wondering whether I could multiply my read counts by a per-sample factor so that every sample ends up with the same number of reads. For example, if I have 6 samples, a1 to a6, with read counts of 2K, 4K, 2.5K, 3K, 3.4K, and 6K, then I would compute a1/1, a2/2, a3*0.8, a4*2/3, ... so that all of them will have 2K reads.
Thank you so much!
Hanh
Relevant answer
Answer
excellent idea
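The scaling described in the question can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample names and depths are the ones given in the question, while the taxon names and counts for a2 are made up for the example, and the function names are hypothetical.

```python
# Per-sample read totals taken from the question (a1..a6).
depths = {"a1": 2000, "a2": 4000, "a3": 2500, "a4": 3000, "a5": 3400, "a6": 6000}


def scaling_factors(depths, target=None):
    """Return the multiplier that brings each sample's total to `target`.

    If no target is given, the smallest sample depth is used.
    """
    if target is None:
        target = min(depths.values())
    return {sample: target / total for sample, total in depths.items()}


def scale_counts(counts, factor):
    """Multiply every taxon count of one sample by that sample's factor."""
    return {taxon: n * factor for taxon, n in counts.items()}


factors = scaling_factors(depths)

# Made-up taxon counts for sample a2 (they sum to its depth of 4000).
a2_counts = {"taxonA": 3000, "taxonB": 1000}
a2_scaled = scale_counts(a2_counts, factors["a2"])

print(factors)
print(a2_scaled, sum(a2_scaled.values()))
```

Note that dividing every count in a sample by a constant gives the same proportions as relative abundance, just multiplied by the target depth, so it carries the same compositional caveats. It is also not the same as rarefying, which randomly subsamples reads down to a common depth and therefore yields integer counts.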