Questions related to RNA-Seq
A project in my lab involves single cell RNA-seq data analysis of mouse whole lung samples. However, when we analyzed and clustered the dataset and searched for markers to annotate the group, there appears to be few to no cells exhibiting the classical epithelial cell markers EPCAM and CDH1.
Each sample has around 8000 cells after filtering for mitochondrial content, and the overall quality seems fine. But over half of the cells appear to be immune cells, and there were fewer than 100 epithelial cells in each sample.
The mouse models are established by a collaborating group, and whole lung samples are sent to a sequencing company (travel time about 3 hours minimum) to generate the scRNA data. Our collaborators have adjusted experimental protocols multiple times to increase cell viability (~85%), but we are having difficulty fixing this lack of epithelial cells.
Does anyone have some experience with this, or know why there would be so few epithelial cells in scRNA-seq data of mouse lung samples?
Both my lab and our collaborators are fairly new to handling scRNA-seq data, so any insight would be helpful.
I've done RNA-seq analysis on a dataset downloaded from GEO, looking at immune gene expression in asthmatic, COPD and normal lung epithelial cells. I'm trying to do a t-test for my statistical analysis, but I need to group my data into asthmatic, healthy and COPD samples/cells, and R doesn't show which samples belong to which group. How can I assign them?
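One way to approach the grouping step is to build a sample-to-group map from the GEO series metadata and split the expression values by it before testing. Below is a minimal Python sketch, assuming hypothetical sample IDs, group labels and expression values (none of them come from the dataset in question). It only computes Welch's t statistic for one pair of groups; with three groups and many genes, a dedicated package such as limma or DESeq2 is usually a better fit than pairwise t-tests.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, variance

# Hypothetical sample annotation, e.g. copied by hand from the GEO sample pages.
sample_group = {
    "GSM0001": "Healthy", "GSM0002": "Healthy", "GSM0003": "Healthy",
    "GSM0004": "Asthma", "GSM0005": "Asthma", "GSM0006": "Asthma",
}
# Hypothetical log2 expression of a single gene, one value per sample.
expression = {
    "GSM0001": 5.1, "GSM0002": 4.9, "GSM0003": 5.3,
    "GSM0004": 6.8, "GSM0005": 7.1, "GSM0006": 6.5,
}

def split_by_group(values, groups):
    """Collect expression values into one list per group label."""
    by_group = defaultdict(list)
    for sample, value in values.items():
        by_group[groups[sample]].append(value)
    return dict(by_group)

def welch_t(a, b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

grouped = split_by_group(expression, sample_group)
t_stat = welch_t(grouped["Asthma"], grouped["Healthy"])
print(grouped, round(t_stat, 2))
```

The same map can be extended with a "COPD" label; the key point is that the grouping comes from the sample annotation, not from the expression matrix itself.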
I need to ship some RNA samples overseas for RNA-seq.
I saw a paper that compares lyophilized RNA and non-lyophilized RNA. Also, I found a protocol that dries RNA with lithium + ethanol and ships at RT.
Has anyone ever done it? Did it work?
Thanks a lot
I need a tutorial for analyzing RNA-seq data in Ubuntu Linux, R and IGV.
I can't get the commands for read alignment and mapping to run in Ubuntu. I need a tutorial on running the commands to merge, sort and index my data with SAMtools, BamTools and BEDTools. I also need to analyze the RNA-seq data in R and IGV.
I want to perform a phylotranscriptome analysis. For that I have downloaded RNA-seq data from SRA. In Galaxy I trimmed my data, then assembled the reads into contigs with Trinity. What should I do now? I need the complete transcriptome to load into MEGA for MSA and subsequent analysis.
Thanks in Advance.
There are different programs for RNA-seq data analysis, such as RStudio and Python. In your knowledge and experience, which one is better and more comprehensive? Are there other programs?
I have RNA-seq data and I wanted to check the relative expression of selected targets based on it. To validate this, I isolated RNA from a separate cohort and ran qPCR. However, the trend of my qPCR data is completely opposite to the RNA-seq data. The genes which are up-regulated in RNA-seq are down-regulated in qPCR and vice versa. I do not know whether I am missing a variable here.
I know I have not written in detail, but I will be happy to discuss more if you need any information.
I'm uploading RNA-seq data to NCBI. I have completed step one successfully, but in the second step an error occurs during the data submission process. Could someone who knows the process well guide me in detail? Thanks for your cooperation.
I'm looking for any publicly available RNA-seq datasets covering all subtypes of breast cancer, for preliminary research for my thesis project. Thank you all.
Can the reads from multiple samples be aligned to the reference genome at once via the HiSat2 tool in RNA-Seq data analysis? Or should I run HiSat2 on each sample individually and then somehow combine them later?
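A common pattern is to align each sample separately, so that every sample keeps its own alignment file, and to combine the samples later at the counting step. The sketch below only builds one HISAT2 command line per sample without executing anything. The index prefix, sample names and file paths are hypothetical; `-p`, `-x`, `-U` and `-S` are the usual HISAT2 options for threads, index, unpaired reads and SAM output.

```python
import shlex

def hisat2_commands(index_prefix, samples, threads=4):
    """Build one HISAT2 command per single-end sample (nothing is executed)."""
    commands = []
    for name, fastq in samples.items():
        commands.append([
            "hisat2", "-p", str(threads),
            "-x", index_prefix,      # prefix of the HISAT2 index files
            "-U", fastq,             # unpaired reads; paired-end data uses -1/-2
            "-S", f"{name}.sam",     # one SAM output per sample
        ])
    return commands

# Hypothetical sample names and file paths.
samples = {"ctrl_1": "ctrl_1.fastq.gz", "treat_1": "treat_1.fastq.gz"}
for cmd in hisat2_commands("genome_index", samples):
    print(shlex.join(cmd))
```

Each resulting alignment would then be sorted and passed, together with the others, to a counting tool that produces one column per sample.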
I am working on a gene cluster from an Amycolatopsis strain that supposedly produces a glycopeptide antibiotic. It's a silent gene cluster at the minute.
I have sent it for RNA sequencing and the cluster is highly expressed, but there was no glycopeptide produced (checked via MS)
Any ideas as to why?
I'm in the initial stages of planning a miRNA-seq experiment using human cultured cells and have decided on TRIzol extraction and the TruSeq Small RNA prep kit, sequencing on an Illumina HiSeq 2500. The Illumina webinar suggests 10-20 million reads for discovery, the Q&A support page suggests 2-5M, and when I wrote to tech support to ask, they suggested up to 100M reads for rare transcripts. The Exiqon guide to miRNA discovery says there is not really any benefit in going over 5M reads. I was hoping to save money by pooling more samples in a lane, so I was hoping someone with experience might be able to suggest a suitable number of reads.
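The trade-off between depth and pooling can be made explicit with a little arithmetic. In this rough sketch the lane yield and the usable fraction are assumed placeholders, not vendor specifications; substitute the numbers quoted by the sequencing facility.

```python
def samples_per_lane(lane_reads, target_reads_per_sample, usable_percent=80):
    """How many libraries fit in one lane at a given per-sample depth.

    usable_percent is an assumed allowance for uneven pooling and
    reads lost to filtering.
    """
    usable = lane_reads * usable_percent // 100
    return usable // target_reads_per_sample

LANE_READS = 150_000_000  # assumed placeholder yield for one lane

for target in (2_000_000, 5_000_000, 10_000_000, 20_000_000):
    n = samples_per_lane(LANE_READS, target)
    print(f"{target:>11,} reads/sample -> {n} samples per lane")
```

Under these assumptions, moving from 20M to 5M reads per sample quadruples the number of libraries per lane, which is why the choice of target depth dominates the cost.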
I am looking for an open dataset to verify the results obtained by analyzing the total RNA-seq of patients in the TARGET-AML project (GEO search was not successful).
The dataset should include:
1) bone marrow RNA-seq of pediatric patients with non-relapsed AML;
2) bone marrow RNA-seq of pediatric patients with relapsed AML (primary tumor BEFORE relapse);
3) clinical data (relapse-free time, for example), optional.
If you know where to find a dataset like this, I would be very thankful.
I am studying a protein and from imaging I can see that my protein is recruited to sites of DNA damage. I wish to UV irradiate HEK293 cells in culture prior to collection and analyses by RNA-seq and mass spectrometer. Does anyone have an idea of how to (protocol and instrumentation) UV irradiate cells in culture for such studies?
I have an analyzed RNA seq data set. The analysis part including differential gene expression, clustering analysis and enrichment analysis has been done. I am aware that the bioinformatic part is done and most of the analysis part is also done. Could someone please guide on how to extract the biological relevance from the data set. What should be the starting point for working with this data? Should I start by looking at the differentially expressed genes in different comparisons or start from the cluster analysis and try to look for the genes.
I have RNA seq dataset for two groups knockout and wild types of mice samples. I have the normalized values in terms of quant all datasets. Please guide me how to perform PCA on the normalized values. I am not a bioinformatician, kindly suggest non-coding methods.
Thanks in advance!
We plan to send total RNA samples from fish tissues for RNA-seq analysis. The total RNA samples will be TRIzol-extracted, DNase-treated, and cleaned using Zymo RNA Clean & Concentrator-5. For previous transcriptome profiling studies, we cleaned the total RNA samples using QIAGEN kits, so this would be our first time with the Zymo kit. The manufacturer states, "RNA is ready for all downstream applications including Next-Gen Sequencing, RT-qPCR, hybridization, etc."
Please let me know if you have any experience with Zymo-prepped RNA samples used for RNA-seq. Any feedback will be greatly appreciated.
I have a good background in R and statistical analysis (also in machine learning). In addition, I'm a fresh biotechnology grad. I would like to try to replicate the RNA-seq analysis in some papers that used R (with their provided data). Any SHORT (beginner-friendly) papers to recommend?
If I am looking at a specific gene that is composed of 3 exons or 2 protein-coding regions, and I find that some of my aligned reads are very small in proportion to the entire protein-coding region and located in only one of those protein-coding regions, should I consider this a "bad quality" alignment generally speaking? Similarly, if the read spans the entirety of one protein-coding region but is largely absent in the other (1/2), how should I classify these alignments?
I have several single-end FASTQ files. Before trimming with Trimmomatic, FastQC reported TruSeq adapter sequences as a possible source of overrepresented sequences. However, after trimming, FastQC now reports Clontech SMART CDS Primer II A as a source of overrepresented sequences. What should I do about them? Can those sequences cause any negative effects on downstream analysis?
Thanks in advance.
We would like to know the best value for money commercial company for DNA sequencing as part of an RNA-seq study.
Thank you. Joe Duffy
I have a question and I was hoping to get some insight from you.
I ran RNA-seq on my samples and I didn't have replicates. I used differentially expressed genes with a cut-off of 2-fold change for Preranked GSEA and got a list of pathways activated in each of my samples. My question is: can I use any of the values such as ES, NES, FDR, etc., or does it not make sense to use these since I have no replicates? Next, if I don't use these values, should I rank my pathways based on the number of genes they have in the gene set? If yes, is there a cut-off that is commonly used for that?
Thanks for the help.
Hello! I am looking to isolate cell nuclei from mouse brains (hypothalamus specifically) and have been considering several kits. There are two from Sigma - the Nuclei EZ Prep and Nuclei Pure Prep, as well as the Minute Single Nucleus isolation kit - offered both with and without detergent.
I have considered adapting a 'home-brew' protocol described in some recent papers, but due to time constraints, a kit would be ideal since there might be less troubleshooting and validation.
Does anyone have experience with these kits and do you have a recommendation as to which might work best?
Hello, What would be the best methodology to perform RNA-seq with samples with low RIN? What do you recommend?
If we want to use patients data, does RNA seq include any potential patient identifying information that should be checked for donor agreement?
I've just started studying about STAR aligner and I came across primary assembly and patch release. I understood so far that patch release is a minor version of a genome which comprises only the sequence(s) that has some update, not the whole genome itself.
Therefore, for RNA-Seq studies and lncRNA characterization (from alignment to differential expression), the patch release would not be recommended. Instead, the primary assembly should be used. Is that right?
I would appreciate if anyone could share any insight, review or basic publications. Thanks.
I have read the sentence below, and I still have difficulty understanding the term read depth. I would be glad if someone could explain it to me.
Read depth: The total number of sequencing reads obtained for a sample. This should not be confused with coverage, or sequencing depth, in genome sequencing, which refers to how many times individual nucleotides are sequenced.
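The distinction in the quoted definition can be illustrated numerically: read depth is simply a count of reads, whereas coverage also depends on read length and on the size of the target being sequenced. A small sketch with hypothetical numbers:

```python
def mean_coverage(n_reads, read_length, target_size):
    """Average number of times each base of the target is sequenced."""
    return n_reads * read_length / target_size

# Hypothetical run: 30 million reads of 100 bases each.
read_depth = 30_000_000
coverage = mean_coverage(read_depth, read_length=100, target_size=3_000_000_000)
print(f"read depth = {read_depth:,} reads; mean coverage = {coverage:.1f}x")
```

Here the read depth is 30 million, yet the mean coverage of a 3-gigabase target is only about 1x. The same 30 million reads spread over a much smaller target would give a far higher coverage.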
Recently I did RNA seq on mycobacterium. I got the TPM data from one of our colleagues in the Bioinformatics department because they helped us to analyze the raw data.
Then, I'm interested in making a volcano plot from the data. Do you think it's possible to get a p-value from TPM data?
I am running RNA extractions on whole gut samples for downstream RNA-seq. For one individual I realized there was a length of gut tissue still in the original collection tube that I didn't add to the homogenization solution. I'm not sure what region of the gut it actually is or how large it is relative to the tissue that was homogenized (it is smaller), but I'm worried that if there are regional differences in RNA expression profiles, this will bias the RNA-seq data towards the already-processed portion of the gut.
Is this sample salvageable? If I extract RNA from the leftover tissue, could I just combine the total RNA sample volumes from both prior to sending in for sequencing? Alternatively, if we sequence both separately could we normalize and combine reads somehow? Are there any other strategies that would be more robust to prevent bias? Thanks in advance.
Hello! I'm new to bioinformatics and cancer databases. I was exploring cBioPortal and analyzing coexpression of different genes through scatter plots. I noticed that the axes are labeled "RSEM (Batch normalized from Illumina HiSeq_RNASeqV2)" (I attached an example so you can see). I know that RSEM is a transcript quantification software, but what does "Batch normalized" mean? Does it give upper quartile normalization? FPKM? Or something else?
thanks in advance!
We currently have a study done with RNA-seq analysis. The raw counts in our data show a large difference: the highest is up to 10-fold above the lowest. But after normalization, the difference is only 1.2-fold.
Is this 10-fold change in raw counts generally acceptable in the field of RNA-seq?
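If the spread in raw counts mostly reflects differences in sequencing depth between libraries, this pattern is what normalization is meant to produce. The toy example below uses hypothetical counts and plain counts-per-million scaling, which is a simplification of what tools such as edgeR or DESeq2 do.

```python
def counts_per_million(counts):
    """Scale raw counts by library size (total counts) to counts per million."""
    total = sum(counts.values())
    return {gene: c * 1_000_000 / total for gene, c in counts.items()}

# Hypothetical libraries: sample_b was sequenced 10 times deeper than sample_a.
sample_a = {"geneX": 100, "geneY": 900, "geneZ": 9_000}
sample_b = {"geneX": 1_200, "geneY": 9_000, "geneZ": 89_800}

cpm_a = counts_per_million(sample_a)
cpm_b = counts_per_million(sample_b)

raw_ratio = sample_b["geneX"] / sample_a["geneX"]
cpm_ratio = cpm_b["geneX"] / cpm_a["geneX"]
print(f"geneX raw ratio = {raw_ratio:.1f}, CPM ratio = {cpm_ratio:.1f}")
```

In this constructed case geneX differs 12-fold in raw counts but only 1.2-fold after scaling, because most of the raw difference came from library size.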
In my RNA-seq quantification data from Arabidopsis, obtained from an mRNA-seq library with poly-A enrichment, transcripts encoded by chloroplast and mitochondrial genes appear among the significant DEGs. How is this possible if chloroplast and mitochondrial transcripts do not have a poly-A tail? Are these data reliable, or are they contaminants which should be ignored?
Currently conducting cDNA amplification during a Tag-Seq (RNA-Seq) protocol. In the gel attached, all wells with 'D' should be cDNA with smears, all others should not because they are controls missing certain oligos during this test PCR that was run. The picture was edited and slight smears faintly appear. These should be amplified cDNA but look like junk because they appear in every well and should not.
We have already troubleshot template concentrations and reagent quality and batch, used 2 thermocyclers and different sets of polymerases, and tested cDNA already synthesized by various library preps, and the results have appeared similar in each situation. Any suggestions would be greatly appreciated!
Dear all, I am trying to use CD-hit to remove the duplicates from the file that is the output from trinity (RNA seq assembly).
I used the following parameters:
cd-hit-est -i in.fasta -o out_cdhit90.fasta -c 0.90 -n 9 -d 0 -M 0 -T 0
But the output file still contains lots of small or fragmented sequence plus the best one. How can I remove those small or fragmented duplicates by changing the parameters?
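Independently of the CD-HIT parameters, one simple option is to drop contigs below a minimum length before or after clustering. This is a swapped-in length filter, not a CD-HIT feature, and the 10-base cut-off and the example sequences are purely illustrative; a realistic threshold for a Trinity assembly would be much larger.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def filter_by_length(records, min_length):
    """Keep only the records whose sequence has at least min_length bases."""
    return [(h, s) for h, s in records if len(s) >= min_length]

# Tiny made-up input; with a real file, pass the open file handle instead.
example = """>contig1
ACGTACGTAC
GTAC
>contig2
ACG
""".splitlines()

kept = filter_by_length(read_fasta(example), min_length=10)
print(kept)
```

Whether short contigs are true duplicates or distinct fragments is a separate question that a length filter cannot answer.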
I've recently sent off E. coli RNA samples for RNA-seq. The company we have used for sequencing have replied and said all samples failed QC due to degradation. I am looking at the Tapestation values and some of these samples have a RIN up to 9.5 but then a DV200 of 25. I believe this is due to the large band at approximately 95 bp- which I believed were tRNA. I am now unsure as to whether to proceed with sequencing as they cannot guarantee sequencing results. Has anyone had experience of such a contrast in numbers previously? And then gone on to successfully perform the sequencing?
I am currently looking for a high-throughput, low cost RNA-seq method that can give me data on the SNPs in a large scale transcriptomic study of Eucalyptus. I see that 3'tag RNA seq comes up a lot in the 'low cost, high throughput' criteria, but I can't find studies where it has been used for SNP detection/profiling. Anyone use it successfully for this purpose? Any other suggestions of RNA-seq technologies that might be applicable? We are looking at a sample size of at least 400 trees, so RNA-seq would be breaking the bank a bit. I am a bit of a novice with NGS technology, so any advice would be greatly appreciated!
When we do RNA sequencing, especially with RNA samples that have a low RIN score, it becomes difficult to get an efficient library and good sequencing data. What solutions are you using, and how are they helping you?
We are checking the integrity of RNA isolated (using MagMAX 96 Total RNA Isolation kit) from mice tissue on a 1% agarose gel that contains a bit of bleach to get rid of RNases. Our 28S/18S ratios are less than 2 for most samples. Is this due to RNA degradation or is something wrong with our gel?
We sequenced some of our samples anyway using AmpliSeq RNA Transcriptome Mouse Gene Expression on the Ion Torrent S5. The amplicons that read end-to-end were around 13 000-14000 out of the total of 23 930 amplicons for all samples. Is this an indication of degradation or is this a normal value for RNA sequencing? We are wondering if we can continue sequencing the rest of our samples if we get similar results on the gels?
I'm working with an RNA-seq data set consisting of a large number of samples, sequenced at around 50-80M reads. There's a bit of uncertainty as to what the precise experimental workflow was for generating these data, but my best understanding at the moment is that the TruSeq RNA sample preparation kit was used (https://www.illumina.com/documents/products/datasheets/datasheet_truseq_sample_prep_kits.pdf).
This kit starts with total RNA, uses oligo-dT beads to bind polyA+ mRNA, then fragments the mRNA and carries out cDNA synthesis with random hexamer primers.
The data I've seen thus far show a very strong bias towards the 3' end of transcripts, in some cases so extreme that only the exons at the very 3' end are covered, with the rest of the regions having close to no reads at all. This bias is particularly pronounced in genes with long transcripts.
I'm aware that using oligo-dT priming is known to introduce a 3' bias into RNA-seq data as the reverse transcriptase will not always be processive enough to reverse transcribe in one go, but I'm at a loss to explain why the approach above might generate 3' bias if random hexamers were used.
Could anyone suggest any ideas as to what the possible causes of 3' bias in RNA-seq data might be? Are there any causes other than oligo-dT priming?
Would also really appreciate a link to a paper if one exists. Thank you!
I have received RNA-seq data from three KO mice and wild-type mice. The mice are littermates from a common father and two mothers, and are 2 months old. I got the RPKM values and performed GSEA analysis. But I am not getting a heat map with a clear distinction. The sequencing they performed was Single paired. Here I am attaching the heatmap for the hallmark gene set. I am a non-bioinformatician, so please provide suggestions.
Hi, I am planning to extract nuclei from mouse brain tissue and perform scRNA-seq using 10x Chromium. When extracting the nuclei it is recommended to use an RNase inhibitor (it is not specified which one), and all the papers I've read so far use the one from Takara (#2313A). It would be faster for us to order one from Sigma or Thermo Fisher, but I am reluctant since all the papers use Takara's. Does anyone have any recommendation or experience?
We have fibroblasts isolated from hepatocellular carcinoma tissue samples. We are planning transcriptomics. However, I am not sure if it is doable since there are only 2500-3000 cells. Which method do you advise? Might Nanostring work?
Thanks in advance for your time and consideration.
I am a complete beginner in terms of bioinformatics analysis and I am hoping to complete some functional analysis on some differentially expressed gene lists of some RNA seq data. However, I am a bit lost on how/best way to start: Below are the columns of the DE gene lists that I am operating upon (which seems to be quite different from other example data I’ve seen from various vignettes)
Ensembl Gene ID, RPKM of condition 1, RPKM of condition 2, FDR 0.05, gene start, gene end, gene strand, gene name, gene description
Does anyone have any suggestions on how to import/modify this data into R so that will allow me to use a tutorial/vignette of sorts to perform GO or KEGG pathway analysis etc? (should/can I convert the RPKM to log2foldchange and pvalues? If so, how would I go about doing this?)
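On the conversion question: a log2 fold change can be derived from the two RPKM columns, but p-values cannot be reconstructed from one RPKM value per condition, since they require replicate-level data. A minimal sketch follows, assuming hypothetical column names and made-up values; the pseudocount of 1 is a common but arbitrary choice.

```python
import csv
import io
from math import log2

def log2_fold_change(rpkm_1, rpkm_2, pseudocount=1.0):
    """log2 ratio of condition 2 over condition 1.

    The pseudocount avoids division by zero for unexpressed genes.
    """
    return log2((rpkm_2 + pseudocount) / (rpkm_1 + pseudocount))

# Hypothetical two-gene excerpt; IDs and column names are placeholders.
table = io.StringIO(
    "gene_id,rpkm_cond1,rpkm_cond2\n"
    "ENSG_EXAMPLE_1,3.0,15.0\n"
    "ENSG_EXAMPLE_2,7.0,0.0\n"
)

lfc = {
    row["gene_id"]: log2_fold_change(float(row["rpkm_cond1"]), float(row["rpkm_cond2"]))
    for row in csv.DictReader(table)
}
print(lfc)
```

For an over-representation style GO or KEGG analysis, the list of gene IDs that passed the FDR filter is typically the main input, so the fold change is mostly useful for splitting the list into up- and down-regulated genes.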
I am currently looking into how the presence of an oncogene influences cytokine responsiveness (primes cells to be more responsive to cytokines) within RNA-seq data. What elements and tools should I be looking into?
Would upregulation of transcription factors be something that could explain such a phenomenon? If so, could someone direct me to a resource or literature that explains the relationship between transcription factors and cytokine responsiveness?
I'm in the process of validating RNA-seq data with qPCR experiments. Currently I have my TPM data from RNA-sequencing and CT values from qPCR. Any recommendations for the best way to go about correlating/validating? I've read that delta CT will linearly correlate with log(RPKM/FPKM) but not sure if this would apply to my case? If it does, would I simply plot log(TPM) values to delta CT values?
Thanks in advance!
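One simple way to compare the two measurements is to correlate log2-transformed TPM with delta Ct across genes. Because a higher delta Ct (target Ct minus reference Ct) means lower expression, agreement between the methods shows up as a strong negative correlation. The values below are made up for illustration, and the pseudocount of 1 is an assumption.

```python
from math import log2, sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical per-gene values from the same samples.
tpm = {"geneA": 3.0, "geneB": 15.0, "geneC": 63.0, "geneD": 255.0}
delta_ct = {"geneA": 9.1, "geneB": 7.2, "geneC": 4.8, "geneD": 3.1}

genes = sorted(tpm)
log_tpm = [log2(tpm[g] + 1) for g in genes]   # pseudocount of 1 handles TPM = 0
dct = [delta_ct[g] for g in genes]

r = pearson(log_tpm, dct)
print(f"Pearson r between log2(TPM + 1) and delta Ct: {r:.3f}")
```

For validating differential expression specifically, comparing log2 fold changes from RNA-seq against negative delta-delta Ct values from qPCR is another commonly used variant of the same idea.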
I have done differential analysis of RNA-seq data and got a list of 2000 upregulated genes. I have drawn a heat map of the top 100 genes. Can I use these top 100 genes for further analysis such as GO term and network analysis? I am very confused: as I have a huge list of genes, how do I narrow it down and focus on new genes which have not been identified before?
After downloading the RNA-seq metadata of breast cancer with the GDCRNATools package in R, I ran across this error message while processing and filtering the RNA-seq metadata:
Error in file(con, "r") : cannot open the connection
In addition: Warning message:
In file(con, "r") : InternetOpenUrl failed: 'The server name or address could not be resolved.'
I encounter this problem when I use this code:
metaMatrix.RNA <- gdcParseMetadata(project.id = 'TCGA-BRCA', data.type = 'RNAseq', write.meta = FALSE)
My internet connection is working correctly, and my device is disconnected from the VPN, and I followed all the protocols in the Bioconductor site perfectly ( http://bioconductor.org/packages/devel/bioc/vignettes/GDCRNATools/inst/doc/GDCRNATools.html ).
How can I resolve this issue?
I've lost the raw RNA-seq data for rice. Can I just upload the clean data to NCBI for publication of a paper? Which database can I upload them to? Thanks so much!
Hi everyone. I'm having a problem quantifying viral reads from my RNA-seq data, which is part of my Final Year Project. Basically, I have filtered out the host reads by aligning to the host organism's genome. The unmapped reads were aligned to a viral database with BLASTn, which indicated the presence of some specific viruses. Then I tried to align the unmapped reads to the genomes of the viruses I had found using HISAT2, but the output is 0% aligned reads. I need some suggestions to solve my problem. Thank you very much!
I want to do RNA-seq using flow-sorted cells from mouse retinal tissue. The RNA is a pool of many samples, the total yield is between 50 and 100 ng, and two samples have a RIN value of 5.5, although the Bioanalyzer gel picture is okay and shows good rRNA bands. My understanding is that with these limitations I should go with rRNA depletion. I need expert advice: should I go with poly-A capture or rRNA depletion? Please share a good comparative research article if possible.
I have done the DEG analysis of RNA-seq data with edgeR. I have a list of genes that pass the adjusted p-value cut-off and a log fold change cut-off of 1.5. There are 2000 upregulated and 1800 downregulated genes. Now I am unsure whether I should use all the up- and downregulated genes for further DAVID and PPI analysis. Kindly give me some suggestions.
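How large the gene lists are depends directly on the cut-offs, so one practical approach is to keep the filtering step explicit and see how the lists change under stricter thresholds. A small sketch with made-up results, assuming a table of log2 fold change and FDR per gene:

```python
def split_degs(results, fdr_cutoff=0.05, lfc_cutoff=1.5):
    """Split a {gene: (log2FC, FDR)} table into up- and down-regulated lists."""
    up, down = [], []
    for gene, (lfc, fdr) in results.items():
        if fdr >= fdr_cutoff:
            continue                     # not significant at the chosen FDR
        if lfc >= lfc_cutoff:
            up.append(gene)
        elif lfc <= -lfc_cutoff:
            down.append(gene)
    return up, down

# Hypothetical differential expression results: gene -> (log2FC, FDR).
results = {
    "geneA": (2.3, 0.001),
    "geneB": (-1.8, 0.02),
    "geneC": (1.6, 0.20),    # large change but not significant
    "geneD": (0.4, 0.001),   # significant but small change
}

up, down = split_degs(results)
strict_up, strict_down = split_degs(results, lfc_cutoff=2.0)
print(up, down, strict_up, strict_down)
```

Up- and down-regulated lists are often submitted to enrichment tools separately, since mixing them can blur pathways that move in one direction only.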
I find myself in a predicament that I'm not sure how to resolve:
I have managed to isolate a particular multi-cellular structure from postmortem human brain tissue with the intention of isolating RNA from that structure and building libraries for RNA-sequencing.
Reagents and kits used:
Single Cell RNA Purification Kit from Norgen
RNase-Free DNase I Kit from Norgen (on column DNase treatment)
SMARTer Stranded Total RNA-Seq Kit v3 - Pico Input Mammalian library kit from Takara
So far, RNA extraction from that structure has been successful and so has library building with one issue: There is a second, consistently present, smaller peak at a larger size than a library. So I either have the recurring issue of either gDNA (despite performing DNAse treatment) and/or over-amplified products in library traces. No matter how I have tried to adjust RNA input and PCR cycle, a second peak keeps cropping up in library traces.
I think I’ve managed to reduce the size of the second peak as much as I can to the extent where I don’t think my libraries are over-amplified and whatever it is, is too large from my library to be sequenced (see attached file Takara V3 Library Kit Optimization Conditions 6, specifically well B1). Would such a library be adequate for sequencing? I have received the criticism that my desired library peak may also have gDNA - how likely is this, given the shape of the trace? I know some genomic contamination is inevitable but I'm hoping to keep it as low as possible.
Side note for those who don't work with postmortem brain: lower RINe, lower yields in general, and lower quality "everything" is to be expected. So I’m also concerned that lowering amplification more will not be sufficient for a number of lower quality samples.
Any advice would be greatly appreciated!
I am currently analysing some RNA-Seq data from human primary fibroblasts. I noticed in the following paper (https://www.nature.com/articles/ncomms15824) that the expression of hox genes was used as a proxy for biopsy site.
Would anyone have any potential scripts or other resources they could point me to? I'm just not sure how to code it/which hox genes to include.
Many thanks for your time,
Hello, I'm trying to calculate the standard error in order to calculate the 95% confidence interval to represent error bars on a graph and I am stuck on the calculations when using RPKM. The data I have is from a colleague's RNA seq and I'm unsure how to do the calculations. I would appreciate any help I can get, thank you!
The data is as follows:
Total linear RPKM
            R1       R2       Fold change   p-value
Control     8.82     1.85     2.24          0.622
Sample 1    355.63   245.66   164.46        0.049
Sample 2    10.42    11.49    6.09          0.26
*Note R1/2= Replicates
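With replicate values like these, the standard error is the sample standard deviation divided by the square root of the number of replicates, and the 95% confidence interval is the mean plus or minus the t critical value times the standard error. A sketch using the R1/R2 values from the table above; the lookup table holds standard two-sided 95% critical values of the t distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776}

def mean_se_ci(values):
    """Return the mean, standard error and 95% confidence interval."""
    n = len(values)
    m = mean(values)
    se = stdev(values) / sqrt(n)          # sample SD divided by sqrt(n)
    half_width = T_CRIT_95[n - 1] * se    # n - 1 degrees of freedom
    return m, se, (m - half_width, m + half_width)

rpkm = {
    "Control": [8.82, 1.85],
    "Sample 1": [355.63, 245.66],
    "Sample 2": [10.42, 11.49],
}

for name, values in rpkm.items():
    m, se, (low, high) = mean_se_ci(values)
    print(f"{name}: mean = {m:.2f}, SE = {se:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

With only two replicates the critical value is 12.706, so the intervals come out very wide; showing the standard error or the individual data points may be more informative for such small n.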
I'm performing RNA-seq data analysis for differentially expressed genes. As I'm new to this kind of work, I ran these three gene count tools separately on the same BAM file from an RNA STAR alignment. I assumed that I might get a slight variation between the results of the three tools, and I did get some variation in the adjusted p-values for the same genes. So I want to know which of these is statistically better.
I want to perform RNA-seq data analysis for DEGs, taking raw reads of DENV1, DENV2, DENV3 and DENV4 from the NCBI SRA database. I want to perform this analysis on a Galaxy web server. I'm a bit confused about the datasets from SRA. The GEO accession GSE69602 contains 116 datasets in total, and I took only the total cell lysate data. For total cell lysate there are two biological replicates at each time point (6 hr, 12 hr, 24 hr, 48 hr and 72 hr), plus mock. I performed one analysis with the two biological replicates at 72 hr and two mocks. The workflow is FastQC, Trimmomatic, RNA STAR, StringTie, DESeq2. Is this the right way, or am I doing something wrong? And if I have to use the data from all the time points, how do I specify those data in DESeq2?
All data are single-end.
if you need to see my galaxy history I can share it with you.
A big thank you in advance
I am planning for single nuclear RNA seq from the mouse pancreas tissue. It is pretty hard to get good quality RNA from the mouse pancreatic tissue. Is there any way to QC the nuclei prior to the 10X run? Also any tips on inhibiting the RNAses activity during the isolation process would be helpful. Currently I am using a lot of RNAse inhibitors in the solution but would like a better/ cheaper alternative. Thank you!
Hello, everyone. I obtained DEGs from RNA-seq analysis of normal and infected samples. Then I reduced their number by some downstream analysis. Now I have 120 DEGs, and I want to select from them the best combination of biomarkers that can distinguish normal from infected samples (a biomarker panel). So I want to use machine learning methods: first I want to perform feature selection, and then draw ROC curves and compute MCC, specificity, sensitivity, etc. for the combined set of selected biomarkers using different algorithms such as neural networks and random forests. Because I don't have experience in machine learning, I have some questions. Please also let me know if you think I am doing any of the steps explained here wrong!
1- What kind of RNAseq files should I enter into machine learning software? count file, FPKM, tpm, or any other files?
2- Should that be normalized?
3- Should the entry be log2 transformed?
4- Can the training and discovery dataset be the same?
5- Is what I write below a correct study design?: The use of a dataset for obtaining DEGs then, partitioning it into k subsets of equal size. Of the k subsets, a single subset is retained as the test data set. The remaining k - 1 subset is used as training data sets. The cross-validation process is then repeated k times, with each of the k subsets used exactly once as the test data. The k results from the k iterations are averaged (or otherwise combined) to produce a single estimation. And then performing a test for the model with an external dataset to validate the model.
6- Can the validation dataset be from a different technology like microarray? Is any pre-processing needed for the datasets to be tuned before performing machine learning methods in this case?
Thank you for answering my questions.
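The k-fold procedure described in question 5 can be sketched as follows. The function only produces index splits; model fitting is left out, and the sample count is hypothetical. A common caution is that any DEG or feature selection should be repeated inside each training fold, because selecting features on the full dataset first lets information from the test fold leak into the model.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists; each sample is in the test set exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]   # k folds of near-equal size
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# Hypothetical dataset of 10 samples split into 5 folds.
splits = list(k_fold_indices(n_samples=10, k=5))
for train, test in splits:
    # Here one would select features and fit the model on `train` only,
    # then evaluate on `test`; the k scores are averaged afterwards.
    print("train:", sorted(train), "test:", sorted(test))
```

The external dataset mentioned at the end of question 5 stays completely outside this loop and is used once, after the final model has been fixed.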
Hi. We are planning to do single-cell RNA-seq in combination with an ONT system for sequencing instead of Illumina. Has anyone done this before? As I understand it, there are a few steps in the library construction protocol after the GEM generation and cDNA amplification/cleanup steps that we can change. I need expert advice on this. Looking forward to your valuable suggestions.
I was wondering if anesthesia (most likely isoflurane) of animals before euthanasia and sampling of internal organs (here reproductive tracts in lizards) for RNA-seq might alter the mRNA expression profile?
Would you recommend performing the euthanasia without anesthesia, or would you anesthetize them?
Thanks a lot for your answers,
I plan on running qPCR to validate RNA-seq data but wasn't sure on the starting material (or the amount). Right now, I have extracted mRNA (extracted using the NEBNext mRNA Isolation Kit) and cDNA libraries (synthesized using the NEBNext Ultra II Directional Library Prep Kit).
If using mRNA as the starting material, how much mRNA would be ideal to create cDNA for qPCR? Before, I've used total RNA as starting material and this seems to be the common starter used in other qPCR papers. Additionally, is it possible to just use the cDNA libraries I have created as starting material for qPCR? If so, is there a general protocol for it?
Thanks in advance!
Does anyone have experience of shipping extracted total RNA at ambient temperatures, for RNA seq;
(i) using DNA/RNAshield using Zymogen? (or recommend an alternate product. RNAstable has been discontinued)
(ii) ethanol precipitated RNA (ethanol, 3M sodium acetate)?
I was wondering if there is a negative effect of RNase inhibitors (like RNase Out, RNasin, etc) on low input RNA samples at higher concentrations? I find that many single-cell RNAseq methods are using concentrations as low as 0.01U/uL where standard RNA-seq protocols are using 1U/uL. Or instead is this simply to avoid reagent waste, i.e. less RNase inhibitor needed for less RNA?
There are many markers for ferroptosis as listed in the link below:
And different literature probes different set of biomarkers. Some markers (NRF2, FTH1, ACSL4, SLC7A11, etc.) were examined in some literatures while they were not in others. I would like to detect ferroptosis efficiently because budgets are limited for primary antibodies for detection of ferroptosis using Western blot. Guys, is there any suggestion on narrowing the to-blot list? I guess it needs taking into consideration what ferroptotic sub-pathway my research subject is involved. Maybe some preliminary experiments such as RNA-seq can help me out to determine the sets of markers I will be blotting?
Your help is appreciated!
There are a few steps to make a heatmap of your qRT-PCR data (fold change or relative quantification) using R.
Data file preparation:
Make an Excel file of your data in which the genes of interest are listed in the first column and the treatments or conditions in the first row.
Save the file with the .csv extension.
Import data file in R:
Import your data file into R using the following code:
data2 <- read.csv("data1.csv")
~ data1.csv is the name of the data file you created in Excel, and data2 is the name of your data in R. You can use your own names instead of data1 or data2, and you can even use the same name in both places.
When you import the data, you will see that the first column is composed of serial numbers. We need to replace these numbers with the column of your data that contains your genes of interest. To do this, use this code:
rownames(data2) <- data2$Name
~ Name is the header of the first column.
This will replace the serial numbers with your first column. But now you have two columns with your genes of interest. To remove the duplicate, use this code:
data2$Name <- NULL
Now your data is ready to create heatmap.
First create a matrix of your data using the following code:
data2 <- as.matrix(data2)
Now install the "pheatmap" package to create the heatmap, using the following code:
install.packages("pheatmap")
After installing, load the package every time you want to use it, using the following code:
library(pheatmap)
Then make a heatmap of your data using the following code:
pheatmap(data2)
Usually we show the fold change/relative quantification values inside the heatmap. To add them, modify your code in the following way:
pheatmap(data2, display_numbers = TRUE)
- You can customize your heatmap in many ways. Contact me any time if you need any help.
I'm currently delving into proteomics head first, which is entirely new to me. My collaborators will be carrying out tandem mass tag spectrometry with fractionation on my samples (cases versus controls, tissue is postmortem brain tissue) and will be sending processed results my way, which include # Peptides, # Unique Peptides, values scaled to QC, % CV, abundances, normalized abundances. I'm interested in case versus control differences so what would be the best analyses to do? I'm only familiar with RNA-seq analyses, so any workshops, youtube tutorials, tips and advice would be GREATLY appreciated!
I'm doing the mRNA library prep using Illumina TruSeq Stranded mRNA kit.
The kit recommends anything between 100 and 1000 ng of total RNA for the prep, which is quite a range. My samples are from zebrafish embryos.
What is the best amount to use?
Would appreciate advice from more experienced users.