Questions related to RNA-Seq
Most RNA-seq analysis methods report a log2FoldChange, and qPCR analyzed via the ΔΔCT method also gives a fold-change result. So I wonder: under ideal conditions, should the two fold-change results obtained from these different experiments be roughly the same, or at least agree with each other? If there are always discrepancies between RNA-seq and qPCR results, how big are they in normal cases?
My situation is that the fold change of my samples from qPCR is above 10, while the fold change of the same samples from RNA-seq is only 1.3 (p<0.05, padj<0.05). They do follow the same trend, but clearly it is not reasonable to say that they agree with each other. I cannot understand what is going on and don't know what to do next.
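One way to make the comparison concrete is to put both measurements on the log2 scale. The following is a minimal Python sketch, assuming roughly 100% primer efficiency; the function name and the Ct values are made up for illustration and do not come from the question.

```python
import math

def ddct_log2_fold_change(ct_target_treated, ct_ref_treated,
                          ct_target_control, ct_ref_control):
    """Return log2 fold change (treated vs control) from the ΔΔCt method.

    Assumes ~100% amplification efficiency, so one Ct cycle = a 2-fold change.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    ddct = delta_treated - delta_control
    return -ddct  # fold change = 2 ** (-ddct), so log2(fold change) = -ddct

# Hypothetical Ct values, for illustration only.
qpcr_log2fc = ddct_log2_fold_change(22.0, 18.0, 25.0, 18.0)
qpcr_fold = 2 ** qpcr_log2fc

# Convert between the linear and log2 scales for the numbers in the question.
print(qpcr_log2fc, qpcr_fold, math.log2(10), 2 ** 1.3)
```

If the 1.3 reported for RNA-seq is a log2FoldChange rather than a linear fold change, it corresponds to roughly 2.5-fold, so it is worth checking which scale each number is on before comparing them.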
I downloaded an RNA-Seq dataset from the repository and took it for analysis.
For one tissue, sequencing was done 12 times, so there are 12 different sets of reads and normalized counts. To plot a graph we need a single value. How do we get a single normalized count value for a single gene from the different sequencing runs?
Can we take the average or the sum of the CPM values of the individual runs?
Kindly give me your suggestions on this.
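The two options can be compared on toy numbers. This is a minimal Python sketch, assuming the 12 runs are repeated sequencing of the same library (an assumption; the question does not say so). The gene names and counts are hypothetical.

```python
def cpm(counts):
    """Counts per million for one library: counts is {gene: raw_count}."""
    total = sum(counts.values())
    return {gene: c * 1_000_000 / total for gene, c in counts.items()}

def pooled_cpm(runs):
    """Sum raw counts gene by gene across runs, then compute CPM once."""
    pooled = {}
    for run in runs:
        for gene, c in run.items():
            pooled[gene] = pooled.get(gene, 0) + c
    return cpm(pooled)

def mean_cpm(runs):
    """Average the per-run CPM values (each run weighted equally)."""
    per_run = [cpm(run) for run in runs]
    genes = per_run[0].keys()
    return {g: sum(r[g] for r in per_run) / len(per_run) for g in genes}

# Two hypothetical runs of the same tissue with very different depths.
runs = [
    {"geneA": 10, "geneB": 90},     # shallow run, 100 reads
    {"geneA": 300, "geneB": 700},   # deeper run, 1000 reads
]
print(pooled_cpm(runs)["geneA"])  # depth-weighted: deeper runs count more
print(mean_cpm(runs)["geneA"])    # equal weight per run
```

Summing the CPM values themselves would no longer be on a per-million scale, so if a sum is wanted it is usually the raw counts that are summed before normalizing, as in `pooled_cpm`.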
I possess RNA-Seq data in fastq.gz format and I'm seeking guidance on transforming it into a suitable BigWig (.bw) format.
What steps or tools are recommended to convert FASTQ files to BigWig format for visualization and analysis purposes? Any insights or recommended protocols would be greatly appreciated.
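A common route is align, sort, index, then compute coverage. The Python sketch below only builds and prints the commands; it does not run them. It assumes single-end reads and that HISAT2, samtools and deepTools are installed. The file names, the index prefix and the choice of CPM normalization are placeholders, not details from the question.

```python
import shlex

def fastq_to_bigwig_commands(fastq, index_prefix, sample):
    """Build (not run) the shell commands for one single-end sample."""
    sam, bam, bw = f"{sample}.sam", f"{sample}.sorted.bam", f"{sample}.bw"
    return [
        # Align reads to a prebuilt HISAT2 index (gzipped FASTQ is accepted).
        ["hisat2", "-x", index_prefix, "-U", fastq, "-S", sam],
        # Coordinate-sort the alignments and write BAM.
        ["samtools", "sort", "-o", bam, sam],
        # Index the sorted BAM, which bamCoverage requires.
        ["samtools", "index", bam],
        # Compute a normalized coverage track in BigWig format.
        ["bamCoverage", "-b", bam, "-o", bw, "--normalizeUsing", "CPM"],
    ]

# Hypothetical file names, for illustration only.
commands = fastq_to_bigwig_commands("sample1.fastq.gz", "grch38_index", "sample1")
for cmd in commands:
    print(shlex.join(cmd))
```

For paired-end data the HISAT2 step would take `-1` and `-2` instead of `-U`. Any other splice-aware aligner could stand in for HISAT2 here.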
I have a single cell RNA seq dataset with 9 clusters covering 3 different cell types. I want to get the information on Gene Ontology and Pathways. Should I do a pseudo-bulk analysis or can I go ahead with the differentially expressed genes (DEGs) obtained from contrasting the two groups in the scRNA seq analysis? Is any R script available for this?
Let me introduce my data:
I am analyzing a single-cell RNA-seq dataset. I want to find the differentially expressed genes (DEGs) between conditions in each cell type.
In addition, I'm working with Seurat's pipeline. My data is not suitable for pseudo-bulk DEG analysis, so a mixed model (MAST) is now my choice.
My question is: why do we need to use normalized data (scran-normalized or log-normalized data) as the input to the MAST test, when the MAST test itself has a normalization method for count depth (the cellular detection rate, CDR)?
Dear ResearchGate community,
I am fairly new to RNASeq analysis & wanted to ask for your input regarding accounting for different sequencing depth across my samples. I am aware that there are several normalization techniques (e.g. TMM) for this case, however, some of my samples have considerably higher sequencing depths than others. Specifically, my samples (30) range from 20M to 46M reads/sample in sequencing depth (single-end). Can I still normalize this using the tools provided in the various packages (DESeq2, limma etc) or is it preferable to apply random subsampling of the fastq files prior to alignment (I am using kallisto)?
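To illustrate why a depth difference of this size is normally handled by scaling rather than subsampling, here is a minimal Python sketch of median-of-ratios size factors, the idea behind DESeq2's default normalization. It is not the package's code, and the counts are hypothetical.

```python
import math
from statistics import median

def size_factors(counts):
    """Median-of-ratios size factors.

    counts: {sample: [raw count per gene]}, same gene order in every sample.
    Genes with a zero count in any sample are skipped, because their
    geometric mean across samples is zero.
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])
    log_geo_means = []
    for g in range(n_genes):
        values = [counts[s][g] for s in samples]
        if all(v > 0 for v in values):
            mean_log = sum(math.log(v) for v in values) / len(values)
            log_geo_means.append((g, mean_log))
    factors = {}
    for s in samples:
        log_ratios = [math.log(counts[s][g]) - lgm for g, lgm in log_geo_means]
        factors[s] = math.exp(median(log_ratios))
    return factors

# Hypothetical counts: "deep" has 2.3x the depth of "shallow" (like 46M vs 20M).
counts = {
    "shallow": [100, 200, 50, 400],
    "deep": [230, 460, 115, 920],
}
sf = size_factors(counts)
normalized = {s: [c / sf[s] for c in counts[s]] for s in counts}
print(sf)
print(normalized)
```

In this toy case the two samples have identical normalized counts after dividing by their size factors, and no reads had to be discarded.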
Many thanks in advance!
Dear ResearchGate community,
I have a question regarding the possibility of a batch effect in my single-end bulk RNASeq data set: Some of my samples (10 out of 30) were sequenced 2x due to initial low read count (on two different days, same facility & instruments) and the reads were later concatenated prior to alignment. In your opinion, does this introduce a batch effect which ought to be accounted for?
Many thanks in advance.
I was performing an RNA-seq data analysis. I did my alignment using RNA-STAR and then ran featureCounts. I used the latest assembly of the human genome, i.e. hg38.p14. After the featureCounts step I noticed that some genes were counted abnormally: in the screenshot I shared, you can see that the ABO gene appears twice, once as 'ABO' and once as 'ABO_1', and many more genes appear like this. In featureCounts I selected the option "count them as single fragment". The dataset was Illumina paired-end reads.
1. Does anyone know what the reason behind that is?
2. Did I make any mistake during the process that I didn't notice?
3. What should I do in this situation?
Thank you very much for your time.
I am trying to generate single-nuclei RNA-seq libraries from a non-model organism. We have a genome that is not well annotated and a bit incomplete. I know that for single-cell RNA-seq it is possible to do the bioinformatics analyses with an assembled transcriptome, but I am not sure whether the same is true for single-nuclei RNA-seq.
I am not a bioinformatician and don't know much about RNA-seq data analysis. I know that Bulk RNA-seq data must be normalized before differential analysis because differences in starting materials for RNA-seq need to be corrected. The starting materials of scRNA-seq are single cells, and the current normalization method is to assume that the total count of all genes in each cell is equal. But in fact, the overall expression level of each cell is not necessarily equal. Therefore, when comparing the expression level of a gene in the same type of cells that have undergone different treatments, wouldn't it be more reflective of the actual level of the gene expression to directly compare the denoised raw data without normalization? For example, by analyzing the sequencing raw data of C. elegans muscle cell at different ages, we can see that the gene Cox-4 is significantly downregulated with age (see the attached picture), but after normalization, it may no longer show such an obvious downregulation. I'm not sure if I'm correct. I'd appreciate it if anyone could answer my question.
I was planning to send RNA samples collected from Arabidopsis root tissue for standard RNA sequencing. All of these samples were processed in the same way except in the DNase removal step. I used the RapidOut DNA Removal Kit (Thermo Scientific) according to the manufacturer’s instructions. To remove the enzyme, I used the DNase Removal Reagent (DRR) that came with the kit on 16 of the 24 samples (before realizing that I would not have enough for all the samples). For the remaining 8 samples, I used EDTA at a concentration of ~4.5 mM and heat-inactivated the DNase at 75C for 5 min. For all samples, my RNA concentrations were similar (170-230 ng/ul) and my 260/280 values were greater than 2.0 as measured with a Nanodrop. My 260/230 values were between 1.7-2.0 for the 16 samples processed with the DRR and slightly lower (~1.4 - 1.6) for the samples processed with EDTA + heat-inactivation (which was not surprising, as EDTA absorbs at 230 nm). My question is, will this affect the final outcome of the RNA-Seq in terms of comparing gene expression between samples? Any advice on what to do in this case?
thanks for your scientific insight!
I have samples with callus remnants and bacteria that have been cultured on a callus solution. I want to isolate only the RNA from the bacteria, so that eventually I have only bacterial mRNA for RNA-seq. The problem is that the remaining callus appears to contaminate and dominate the RNA-seq data for now. Therefore, we would like to isolate the bacterial mRNA and remove all the mammalian parts.
If anyone knows something, let me know! :)
Thanks in advance!
I am writing to inquire about the low assignment ratio (19%) that I obtained using FeatureCounts in my RNA-seq analysis. I would like to confirm whether this is a normal result, and if possible, request your assistance in identifying possible reasons for this issue. To provide some context, I used HISAT2 to align paired-end stranded RNA-seq reads to the GRCh38 reference genome. The overall alignment rate by HISAT2 was 97%, with a multi-mapping ratio of 22% and a unique mapping rate of 72%. Based on this alignment result, I attempted to use FeatureCounts to obtain read counts from the BAM file generated by HISAT2. However, the successful assignment ratio was only about 19%. Thank you for your time and assistance.
I am looking for a tool to easily analyze the expression of different genes during the differentiation of mouse or human pluripotent stem cells to different derivatives, something similar to ChIP-atlas but for RNA-seq.
I know there are repositories with RAW RNA-seq data such as GEO or SRA, which sometimes include the tables with analyzed data. However, in many cases this needs some processing, and sometimes you only get the fastq files.
I wonder if there is some database that feeds from GEO/SRA, where I could look for classified experiments, such as "Neural differentiation of human pluripotent stem cells", and where I could easily plot the expression of a GOI in the different conditions/timepoints.
Thanks a lot!
I need guidance from research fellows in bioinformatics who know how to obtain miRNA sequences from RNA-seq data from NCBI with software such as Geneious Prime, as I am a beginner at this (RNA-seq assembly for miRNA). Thank you.
Recently, I did an experiment where RNA was isolated with TRIzol reagent.
The final RNA concentration, checked by NanoDrop, was around 5-16 microgram/microliter.
The 260/280 ratios were in the range of 1.98-2.1 and the 260/230 ratios were in the range of 2.1-2.3.
Unfortunately, however, the quality analysis on the Qubit 3 fluorometer showed RIN values below 3, in the range of 1-1.8. I have to do downstream RNA-Seq analysis.
A gel was also run, but only a single band was obtained every time.
I don't understand: when the concentrations and ratios are in a good range, how can I find out where I am going wrong? Why did it fail the QC by Qubit?
I request your help in finding the reason and troubleshooting the problem.
Currently I'm working on a plant genome annotation project. For my plant, I don't have any RNA sequences or ESTs, and there is very little EST and RNA sequence data from closely related species. Is it okay to use EST and RNA-seq data from both closely related and somewhat distantly related species for gene prediction?
If not, are there any options I can follow instead of the above method?
(I'm planning to use proteins sequences as well along with ESTs and RNA seqs).
I am working on a multi-site clinical study and one of our study sites is in India. Blood samples from participants have been collected and stored at -80 in PAXgene blood RNA tubes. The study is nearing completion and I need to arrange for the blood samples (~1000 samples) to be shipped somewhere where RNA extraction can be performed and then a subset will need to undergo RNA-seq. Most of the companies who do RNA-seq will only accept RNA and they do not provide RNA extraction as a service so I will probably need to get the extraction done separately with a different service provider.
I am wondering what the state of the art is for detecting m1Ψ in RNA sequences and how it can be differentiated from U. I am thinking of just using RNA-seq and then a separate assay to determine m1Ψ concentration, but ideally I would like to know if an individual base is m1Ψ or U.
Hi - I'm currently working with two RNA-Seq studies; one has RNA extracted from whole blood, the other PBMCs. Eventually we want to combine these data and perform some cell-specific deconvolution to look at DEGs.
Are there any recommended methods for batch correcting these data from different sources?
I have multiple sets of RNA-seq data and I want to compare gene expression between control and treated groups. My interest extends beyond differentially expressed genes; I also want to identify non-differentially expressed genes. I understand that Log2FoldChange and p-adj are commonly used to define differentially expressed genes. Alternatively, genes that fail to meet the criteria for differential expression are considered non-differentially expressed.
However, classifying a gene as non-differentially expressed does not definitively indicate that the RNA-seq data confidently establishes the absence of changes in gene expression. For instance, this could be attributed to substantial within-group variation or low gene counts that hinder unambiguous determination of expression levels. So, how can I effectively distinguish truly non-differentially expressed genes from those exhibiting significant within-group variation or yielding very low counts? Are there any software packages available for this purpose? Alternatively, are there established statistical methods or standards that can guide me in this regard?
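One established way to separate "confidently unchanged" from "merely non-significant" is an equivalence test, which asks whether the data support an effect smaller than a chosen margin. Below is a minimal Python sketch of a TOST-style check using a normal approximation. The margin of 0.5 log2 units, the gene names and the numbers are all hypothetical. DESeq2's `results()` offers a related test through `altHypothesis="lessAbs"` together with `lfcThreshold`.

```python
from statistics import NormalDist

def equivalence_call(log2fc, lfc_se, margin=0.5, alpha=0.05):
    """TOST-style check: do the data support |true log2FC| < margin?

    Returns True when the (1 - 2*alpha) confidence interval for the log2 fold
    change lies entirely inside (-margin, +margin), using a normal approximation.
    """
    z = NormalDist().inv_cdf(1 - alpha)
    lower, upper = log2fc - z * lfc_se, log2fc + z * lfc_se
    return -margin < lower and upper < margin

# Hypothetical genes: (log2FC estimate, standard error).
genes = {
    "stable_gene": (0.05, 0.10),   # small effect, precisely estimated
    "noisy_gene": (0.05, 0.80),    # same estimate, large within-group variation
    "changed_gene": (1.50, 0.20),  # clearly differentially expressed
}
calls = {name: equivalence_call(lfc, se) for name, (lfc, se) in genes.items()}
print(calls)
```

Only `stable_gene` passes: `noisy_gene` has the same point estimate, but its wide interval means the data cannot rule out a real change, which is the distinction the question is after.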
I have a RNA-seq dataset without controls (or you can consider them all controls) and I am interested in an unsupervised ranking or clustering of samples with regard to how they are expressing pathways of interest. I am looking to stratify samples in terms of their pathway activity for specific pathways from the PROGENy resource: https://saezlab.github.io/progeny/ . Would you have any recommendations for how to run this analysis?
I'm using the Qiagen RNeasy Plant Mini Kit on Arabidopsis. My concentration is good, and so is the 260/280 value, but I'm having trouble with 260/230: in some of my samples I get values between 0.78 and 1.58. What can I do to improve my 260/230? These samples are to be sent for RNA-Seq.
I have both qPCR CT values and RNA-seq TPM values. Now that I have these 2 sets of data, is it proper to compare the expression fold change (2^-ΔΔCT) with the log2 of the TPM values?
Thanks in advance,
#qPCR, #RNA-sequence analysis, #TPM.
My question is how I can increase the quantity of RNA isolated from cells at the wound edge, in this case from keratinocytes. I have done the experiment, and the quantity of isolated RNA is still too low for RNA-seq. Does anyone have experience with this who could help me?
I have RNA-seq data generated with poly(A) selection and other RNA-seq data generated from total RNA. I want to combine these data to increase the sample size within some subtypes for which I have few samples.
How can I normalize these two datasets into one? Is there any method or process that makes it possible to combine poly(A) and total RNA-seq data?
I looked for this information in many articles that work with multiple types of data, but they don't detail how they did it.
I am trying to use the Ovation RNA seq V2 kit from Nugen or Tecan to amplify low-input RNA (about 1ng). (https://lifesciences.tecan.com/ovation-low-input-rna-seq-kit-v2?p=tab--1).
The Protocol and workflow of the SPIA amplification process show linear amplification of cDNA, which is single-stranded (As per the schematic provided by them). However, the kit claims it generates double-stranded cDNA, which is not shown in the figure or explained how it is done!!!
Can somebody please help me understand this?
How does the Ovation RNA seq v2 kit generate double-stranded cDNA instead of single-stranded cDNA??!!
Nugen/ tecan suggests using their library preparation kits (for Illumina) which can utilize only dsDNA templates but not single-stranded.
They seem to work well, but I don't understand how the dsDNA is generated in the amplification step (RNA Seq V2 kit).
I've generated an enhancer knock-out in mice. I then did qPCR on the gene it potentially regulates and saw that it is 25% down-regulated in the homozygous knock-out. So I decided to do RNA-Seq to analyze other genes, but in the RNA-Seq data the gene isn't differentially expressed. I don't understand why, or which technique I should believe. The samples used for the qPCR and the RNA-Seq are not the same, but are of the same genotype, same age and same tissue. The only technical difference is that I did TRIzol extraction for the qPCR, but column-based RNA extraction for the RNA-Seq.
I want to ask a technical question. I have treated and non-treated male and female RNA-seq samples, and I want to analyze sex-biased gene expression. My concern is whether I should compare the male vs female samples, or male vs normal and female vs normal, when analyzing sex-biased gene expression using DESeq2.
please bear with me, because I am a complete beginner with regard to any form of bioinformatics and I am trying to understand the best approach to my experiment.
I am currently trying to isolate cells and sequence them for further bioinformatic analysis, more precisely RNA-Sequencing.
We have, however, had issues with purity and while some samples we looked at reached a purity of >90% after isolation (we usually validate it by use of flow cytometry), some samples of different animal genotypes did not.
This leads me to my first question:
How important is cell purity for Bulk RNA-Seq?
Which purity should be reached for an adequate, reliable analysis?
If anyone has any recommendations for papers to look into regarding that subject, I would be most grateful, because I have no idea where to start and what to consider.
Further along in the story we surmised that maybe Single Cell RNA Sequencing might be the better option in cases of lower purity.
But again, the same question arose: how relevant is cell purity for the following analysis and is there a cut-off value not to be crossed?
How advantageous would using both methods be?
Sure, Bulk gives a better general overview and Single Cell is more precise, but do they complement each other or is it essentially redundant information gained by doing both experiments?
And are there any disadvantages to using only SC, or do both methods complement each other when low purity levels are in question?
Thank you a lot in advance!!
I have recently been trying to perform an RNA-seq data analysis, and at the first step I ran into a few questions that I would like to understand. Please help me understand them.
1) In the 1st image, the raw data from NCBI-SRA have 1 and 2 marked at the ends of the reads. What is the meaning of this? Do they mean forward and reverse reads?
2) In the 2nd image I was trying to run Trimmomatic on this dataset. I chose "paired-end as a collection" but it does not take any input, even though my data was there in "fastqsanger.gz" format. Why is that? Should I treat this paired-end data as single-end data when running Trimmomatic?
3) In the 3rd and 4th images, I collected the same data from ENA, where they give two separate files for the data marked 1 and 2 in SRA. I then processed them in Trimmomatic using "Paired-end as individual dataset". Trimmomatic gives me 4 files for those. Why is that? Which one will be useful for alignment?
A big thank you in advance :)
I have an interesting question asked by my professor, and I could not find a relevant answer anywhere.
Why do we see an up-and-down pattern in transcript abundance? Example RNA-seq data for a gene from a rice transcriptome database is attached. The LOCUS ID is highlighted in yellow and the transcript abundance is shown below for three samples after drought treatment.
The question is: why is the signal level not uniform across exons? Is it due to low-signal reads? Why are there gaps or sudden drops in signal (marked with red arrows)? How should I read and understand this? I know this is a common pattern in RNA-seq data, but I don't know why it occurs. Can any bioinformatician help me understand this? Thanks in advance.
Current research often uses new, next-generation, "flashy" experimental techniques (i.e. single-cell RNA-seq) that have replaced some of the older, smaller, yet fundamental experimental techniques. Many of these new-age techniques seem overused and expensive when older techniques could be an adequate replacement. What are some good examples of these new "answer-all" techniques and how were these techniques done previously with smaller, fundamental techniques?
We want to perform a human RNA extraction from cell culture for an RNA-seq, but we have a viral RNA extraction kit (Quick-RNA™ Viral Kit-Zymo research) available. Therefore, we want to know if any methodological issues can interfere with the results if we use the viral kit.
I am doing total RNA extraction from PAXgene blood RNA tubes (6.9 ml of storage buffer + 2.5ml of collected blood in each tube) using the PAXgene blood RNA kit. I just want to extract the total RNA from a portion of the blood sample (around 4.5ml of the above combination) collected in PAXgene blood collection tubes. is there anyone who extracted total RNA from PAXgene blood RNA tubes? I will be glad if anyone has an answer for it.
I work on cancer research and human disorders using a bioinformatics approach. These projects involve the analysis of transcriptomic data such as microarray, RNA-seq and TCGA data, systems biology analysis, survival analysis, etc. Metagenomic analyses in the microbiome field are also conducted. Those interested in participating in analyses and writing articles are invited to send their CV to the email below.
What is the difference between Hiseq and Novaseq RNA-seq data and how to analyze them together?
I am looking for an open dataset to verify the results obtained by analyzing the total RNA-seq of patients in the TARGET-AML project (GEO search was not successful).
The dataset should include:
1) bone marrow RNA-seq of pediatric patients with non-relapsed AML;
2)bone marrow RNA-seq of pediatric patients with relapsed AML (primary tumor BEFORE relapse);
3)clinical data (relapse-free time, for example) - optional.
If you know where to find a dataset like this, I would be very thankful.
While trimming adapters and low-quality bases from Illumina paired-end RNA-Seq reads in Trimmomatic, about 40 to 50% of my reads ended up as "forward only surviving". This study is for estimating transcript abundance (DEGs) under various conditions. How should I continue from here?
1. Use the singleton reads (R1, forward only)
2. Use only the surviving high-quality paired reads (50% of the reads)
Any suggestion, Thanks in Advance
by, Ellango R.
I've run RNA-seq and qPCR on a set of genes, and while the log2 expression values are consistent between the tests for most of the genes in the set, there are a handful that appear to be upregulated according to the RNA-seq results and downregulated according to the qPCR results (and vice versa). Is there any possible reason that could explain this, other than just human error?
I am planning a RNA-seq experiment on neonatal rats ventricular myocytes cells (NRVMs), and I was wondering how many cells do I need to have per sample to extract a sufficient amount of RNA. I need 1ug of total RNA.
Thanks in advance!
I've been having some trouble isolating bacterial RNA from a gram-positive organism for RNA-Seq analysis. My problem is that I always get a very intense "cloud band" on the agarose gel around the position where the 5S rRNA band should be. I've tried several protocols and kits, with and without bead beating, TRIzol and lysozyme, but it happens every time. The first idea was that these are degradation products, but then again the intensity of the 23S and 16S bands clearly remains very high. Also, on a Bioanalyzer this 5S band definitely does not look like degradation, but rather like a sharp peak around 127 nt. Does anyone have any experience with that? If this is in fact the 5S rRNA, why do I get such accumulation, how should I get rid of it, and would it interfere with my RNA-Seq results?
Thank you all in advance!
Please tell me any open human databases with RNA sequencing and full genome/exome sequencing other than 1000 genomes. Preferably a healthy sample (not cancer patients).
-RNA seq and bioinformatics were carried out by professionals.
- Gene in question shows ~700 fold differential regulation by qPCR in multiple independent cohort of experiments - not in RNA seq.
A project in my lab involves single cell RNA-seq data analysis of mouse whole lung samples. However, when we analyzed and clustered the dataset and searched for markers to annotate the group, there appears to be few to no cells exhibiting the classical epithelial cell markers EPCAM and CDH1.
Each sample has around 8000 cells after filtering for mitochondrial content, and the overall quality seems fine. But over half of the cells appear to be immune cells and there were less than 100 epithelial cells for each sample.
The mouse models are established by a collaborating group, and whole lung samples are sent to a sequencing company (travel time about 3 hours minimum) to generate the scRNA data. Our collaborators have adjusted experimental protocols multiple times to increase cell viability (~85%), but we are having difficulty fixing this lack of epithelial cells.
Does anyone have some experience with this, or know why there would be so few epithelial cells in scRNA-seq data of mouse lung samples?
Both my lab and our collaborators are fairly new to handling scRNA-seq data, so any insight would be helpful.
I've done RNA-seq analysis on a dataset downloaded from GEO, looking at immune gene expression in asthmatic, COPD and normal lung epithelial cells. I am trying to do a t-test for my statistical analysis, but I need to group my data into asthmatic, healthy and COPD samples/cells, and it doesn't show up in R which samples belong to which group. How can I do this?
I need to ship some RNA samples overseas for RNA-seq.
I saw a paper that compares lyophilized RNA and non-lyophilized RNA. Also, I found a protocol that dries RNA with lithium + ethanol and ships at RT.
Has anyone ever done it? Did it work?
Thanks a lot
I need a tutorial for analyzing RNA-Seq data in Ubuntu Linux, R and IGV.
I can't run the commands in Ubuntu Linux for aligning my data and mapping the reads. I need a tutorial on running the commands in Ubuntu to merge, sort and index my data. I have to use SAMtools, BamTools and BEDTools, but I can't run the commands. I also need to analyze the RNA-Seq data in R and IGV.
I want to perform a phylotranscriptomic analysis. For that I have downloaded RNA-seq data from SRA. In Galaxy I trimmed my data, then did an assembly with Trinity to get contigs from the reads. What should I do now? I need the complete transcriptome to put into MEGA for MSA and subsequent analysis.
Thanks in Advance.
I have a good background in R and statistical analysis (including machine learning). In addition, I'm a fresh biotechnology graduate. I would like to try to replicate the analyses of some RNA-seq papers in R (using the data they provide). Are there any SHORT (beginner-friendly) papers you would recommend?
There are different programs, such as RStudio, Python, etc., for RNA-seq data analysis. In your knowledge and experience, which one is better and more comprehensive? And are there other programs?
I have RNA-seq data and I wanted to check the relative expression of selected targets based on it. To validate this, I isolated RNA from a separate cohort and ran qPCR. However, the trend of my qPCR data is completely opposite to the RNA-seq data. The genes which are up-regulated in RNA-seq are down-regulated in qPCR, and vice versa. I do not know whether I am missing any variable here.
I know I have not written in detail but will be happy to discuss more if need any information.
I'm uploading RNA-seq data to NCBI. I have successfully completed step one, but in the second step an error occurs during the data submission process. Kindly guide me in detail if you know the process well. Thanks for your cooperation.
I'm looking for any publicly available RNA-seq data sets related to all sub-types of breast cancers to presearch for thesis project, thank you all...
Can the reads from multiple samples be aligned to the reference genome at once via the HiSat2 tool in RNA-Seq data analysis? Or should I run HiSat2 on each sample individually and then somehow combine them later?
I am working on a gene cluster from an Amycolatopsis strain that supposedly produces a glycopeptide antibiotic. It's a silent gene cluster at the minute.
I have sent it for RNA sequencing and the cluster is highly expressed, but there was no glycopeptide produced (checked via MS)
Any ideas as to why?
I'm in the initial stages of planning a miRNA seq experiment using human cultured cells and decided on TRIzol extraction, Truseq small RNA prep kit, using an illumina HiSeq2500. The illumina webinar suggests 10-20 Million reads for discovery, the QandA support page suggests 2-5M, and I wrote the tech support to ask, who suggested I do up to 100M reads for rare transcripts. Exiqon guide to miRNA discovery manual says there is not really any benefit on going over 5M reads. I was hoping to save money by pooling more samples in a lane, so I was hoping someone with experience might be able to suggest a suitable number of reads.
I am studying a protein and from imaging I can see that my protein is recruited to sites of DNA damage. I wish to UV irradiate HEK293 cells in culture prior to collection and analyses by RNA-seq and mass spectrometer. Does anyone have an idea of how to (protocol and instrumentation) UV irradiate cells in culture for such studies?
I have an analyzed RNA seq data set. The analysis part including differential gene expression, clustering analysis and enrichment analysis has been done. I am aware that the bioinformatic part is done and most of the analysis part is also done. Could someone please guide on how to extract the biological relevance from the data set. What should be the starting point for working with this data? Should I start by looking at the differentially expressed genes in different comparisons or start from the cluster analysis and try to look for the genes.
I have RNA seq dataset for two groups knockout and wild types of mice samples. I have the normalized values in terms of quant all datasets. Please guide me how to perform PCA on the normalized values. I am not a bioinformatician, kindly suggest non-coding methods.
Thanks in advance!
We plan to send total RNA samples from fish tissues for RNA-seq analysis. The total RNA samples will be TRIzol-extracted, DNase-treated, and cleaned using Zymo RNA Clean & Concentrator-5. For previous transcriptome profiling studies, we cleaned the total RNA samples using QIAGEN kits, so this would be our first time with the Zymo kit. The manufacturer states, "RNA is ready for all downstream applications including Next-Gen Sequencing, RT-qPCR, hybridization, etc."
Please let me know if you have any experience with Zymo-prepped RNA samples used for RNA-seq. Any feedback will be greatly appreciated.
Suppose I am looking at a specific gene comprising 3 exons or 2 protein-coding regions, and I find that some of the aligned reads are very small relative to the entire protein-coding region and located in only one of those protein-coding regions. Should I consider this a "bad quality" alignment, generally speaking? Similarly, if the reads span the entirety of one protein-coding region but are largely absent from the other (1/2), how should I classify these alignments?
I have several single-end fastq files. Before trimming with Trimmomatic, FastQC reported TruSeq adapter sequences as a possible source of overrepresented sequences. However, after trimming, FastQC now reports Clontech SMART CDS Primer II A as the source of overrepresented sequences. What should I do about them? Can those sequences have any negative effects on downstream analysis?
Thanks in advance.
We would like to know the best value for money commercial company for DNA sequencing as part of an RNA-seq study.
Thank you. Joe Duffy