Transcriptomics - Science topic

Transcriptomics is the study of the transcriptome, the set of all RNA molecules (including mRNA, rRNA, tRNA, and other non-coding RNA) produced in one cell or a population of cells.
Questions related to Transcriptomics
  • asked a question related to Transcriptomics
Question
2 answers
Hi everyone,
I’m planning to extract total RNA from Staphylococcus samples for transcriptomic profiling, but this is my first time doing RNA extraction. I’ve heard great things about the QIAGEN RNeasy kits, but the cost is a bit steep, especially when factoring in the RNAprotect Reagent and DNase Set.
I came across the NEB Monarch Total RNA Miniprep Kit, which includes gDNA Removal Columns and DNase I at a more affordable price point. Does anyone have experience with the NEB kit? How does it compare to the RNeasy in terms of performance and ease of use?
Thanks in advance!
Relevant answer
Answer
The kit was pretty user friendly but it didn't work well for my particular samples (leaves with high phenolic content).
NEB will often send a small sample-sized kit of many of their products. You could reach out to their customer service as well and ask if anyone has used it for your organism.
  • asked a question related to Transcriptomics
Question
3 answers
In my RNA-seq data, I have on average >70% duplicate reads (for all samples). Any suggestions on how to clean my data to eliminate this bias?
Relevant answer
Answer
Depending on the type of your experiment, read duplication is to be expected and deduplication is not necessarily beneficial. You might want to have a look at this article: https://www.nature.com/articles/srep25533
  • asked a question related to Transcriptomics
Question
3 answers
When culturing bacteria and comparing two conditions—one with a specific substrate and one without—transcriptomic studies can indicate which proteins play a role in the metabolic pathway of the available substrate. However, I am curious if the absence of a substrate could also lead to the upregulation of genes relevant to the metabolic pathway associated with the missing substrate.
For example, consider two conditions: Condition A lacks substrate A but has substrate B, leading to a shortage of substrate XY. As a result, a protein (let's call it protein C) is involved in overcoming this shortage or regenerating/producing substrate XY, and therefore is highly expressed, even though substrate A is absent, which is ultimately necessary to produce the limited substrate XY. Thus, the transcriptomic study would show an upregulation of the gene encoding protein C for the condition where substrate B is present. However, protein C plays a crucial role in the metabolic pathway of substrate A, not substrate B.
Is this a scenario that could possibly occur, or is it unlikely? I understand that in science, it's difficult to say "never." Please provide scientific arguments for why this might or might not be the case.
Relevant answer
Answer
Dear Stefanie,
In my experience, there are scenarios where cells express genes that allow for the degradation of potential substrates that might not be present in the medium, as part of a general stress or starvation response. This is the case, for example, in Bacillus subtilis, which expresses genes of the DegU regulon, including the upregulation of degradative enzymes (proteases, peptidases, levansucrase, etc.), upon carbon depletion or entry into the stationary phase.
Catabolite control or repression can also mediate such a response (e.g. via cAMP in E. coli or CcpA in B. subtilis).
Best
Michael
  • asked a question related to Transcriptomics
Question
2 answers
I am looking for a chapter or book on the statistical analysis of mRNA transcriptomics. Thank you ;)
Relevant answer
Hi Laura,
Free tutorials are available on the Galaxy website, where you can find the workflow for analyzing RNA-seq data. You can then perform different analyses depending on your experimental design.
  • asked a question related to Transcriptomics
Question
2 answers
For researchers who have conducted transcriptomics using the adult rat prefrontal cortex, is it necessary to optimize the RNA extraction method? Additionally, what is the average weight of the rat prefrontal cortex, and how much RNA can typically be extracted from it?
Relevant answer
Answer
The average brain weight of a rat is 1.5-2 g, depending on the body weight and the age.
TRIzol is a good option for RNA isolation; typical yields from brain tissue are 1-1.5 µg RNA per mg of tissue.
  • asked a question related to Transcriptomics
Question
1 answer
Elaborate on why transcriptomics and comparative genomics are chosen as the primary methods for investigating fungicide resistance in Alternaria alternata?
Relevant answer
Answer
Transcriptomics and comparative genomics are the most standard techniques in NGS, and I would say they are "easy to interpret" compared to other techniques such as ChIP-seq, ATAC-seq and many others.
  • asked a question related to Transcriptomics
Question
1 answer
I am new to spatial transcriptomics and am exploring data sets. Can someone walk me through what's contained in the R Object (Robj) files that I have?
Relevant answer
Answer
First load the Robject file and see what it contains.
how to open an Robj file in r? - Stack Overflow
You can use ls() to list all the objects in it.
For most analyses, there are libraries already written, so you just need to find the type of analysis you want to conduct, look for a library, install it and follow the documentation.
  • asked a question related to Transcriptomics
Question
1 answer
I have recently been working on abiotic stress (drought, high temperature, salinity and cold) responses in wheat (Triticum aestivum) using a meta-analysis of transcriptomics (microarray) data. The computational stage is almost done. Is anyone who specializes in this area interested in contributing to reporting the results in a paper? Please contact me via mshahsavari@ut.ac.ir OR masoudsisa@gmail.com.
Relevant answer
Answer
Certainly! I'd be interested in collaborating on this project. Please feel free to share more details about the research, and I'll be happy to discuss further. You can reach me at kdgb2007@yahoo.fr. Looking forward to potentially working together on reporting the results in a paper.
  • asked a question related to Transcriptomics
Question
1 answer
I am interested in analyzing the correlation between the expression of a set of genes and transposable elements (TEs) in cancer. However, although there are multiple online databases for gene expression in cancer, including TCGA, they do not include repetitive elements. I have found some papers that analyze transposable elements and quantify their expression in different cancers using TCGA data, but the supplemental tables only provide the fold changes and p-values for differentially expressed TEs. Also, identifying and quantifying TEs would require the raw sequencing data, which have controlled access. Therefore, I was wondering if there is a database or published resource where I could find TE expression per sample in the TCGA database. Does anyone know of something like that? Alternatively, if someone has analyzed this type of data and has a worksheet with pre-processed data that could be shared, I would be deeply grateful.
Relevant answer
Answer
Hi Glauco,
Are you aware of the Xena browser (https://xenabrowser.net/)? There are at least gene expression, mutation and methylation data per patient for the TCGA cohorts. Unfortunately I am not sure about the TE data, since I am not working with that but I think that Xena would be your best bet.
  • asked a question related to Transcriptomics
Question
1 answer
During my RNA-seq data analysis, I encountered a problem: the MultiQC statistics for the STAR alignment showed that 70% of my trimmed reads were aligned, but when I ran featureCounts using the same BAM files, it reported that only 3% of my reads were assigned.
Why is that? Should I proceed further, or do I need to perform additional checks? For reference, I used the HG38.p14 version from NCBI.
Relevant answer
Answer
Hi Pratanu,
have you checked if the parameters of featureCounts were set properly (correct annotation used, strand orientation, etc.)?
I would have a look at the bam files using a genome viewer, like IGV and check if there are reads at certain genes that I would expect to get reads for sure.
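One way to follow up on the parameter check suggested above is to look at the `.summary` file that featureCounts writes next to its count table, since it breaks the unassigned reads down by reason. The sketch below is only an illustration: the function name `assignment_breakdown` and the example numbers are made up, and it assumes the usual tab-separated layout with a `Status` header row. A dominant `Unassigned_NoFeatures` share would be consistent with an annotation problem (for example, a GTF that does not match the reference used for alignment), but that interpretation should be confirmed in a genome viewer.

```python
def assignment_breakdown(summary_text):
    """Return {sample: {status: fraction}} from featureCounts .summary text.

    Assumes a tab-separated table whose first row is 'Status' followed by
    one column per BAM file (an assumption about the input layout).
    """
    lines = [ln for ln in summary_text.strip().splitlines() if ln.strip()]
    samples = lines[0].split("\t")[1:]
    counts = {sample: {} for sample in samples}
    for ln in lines[1:]:
        fields = ln.split("\t")
        status = fields[0]
        for sample, value in zip(samples, fields[1:]):
            counts[sample][status] = int(value)

    fractions = {}
    for sample, by_status in counts.items():
        total = sum(by_status.values())
        fractions[sample] = {
            status: (n / total if total else 0.0)
            for status, n in by_status.items()
        }
    return fractions


# Made-up numbers for illustration only (not from the original post).
example = (
    "Status\tsample1.bam\n"
    "Assigned\t30\n"
    "Unassigned_NoFeatures\t900\n"
    "Unassigned_Ambiguity\t20\n"
    "Unassigned_MultiMapping\t50\n"
)
result = assignment_breakdown(example)
# Status that accounts for the largest share of reads in this sample.
top = max(result["sample1.bam"], key=result["sample1.bam"].get)
print(top, round(result["sample1.bam"][top], 2))
```

In practice the text would come from reading the real `.summary` file; the categories that dominate tell you which featureCounts setting or input file to inspect first.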
  • asked a question related to Transcriptomics
Question
3 answers
I was performing an RNA-seq data analysis. I did the alignment using RNA-STAR and then ran featureCounts. I used the latest assembly of the human genome, i.e. HG38.p14. After the featureCounts step, I noticed that some genes were counted abnormally: in the screenshot I shared, you can see that the ABO gene appears twice, once as 'ABO' and once as 'ABO_1', and many more genes appear like this. In featureCounts I selected the option "count them as single fragment". The dataset consists of Illumina paired-end reads.
1. Does anyone know the reason behind this?
2. Did I make a mistake during the process that I didn't notice?
3. What should I do in this situation?
Thank you very much for your time.
Relevant answer
Answer
I think the gene IDs in the GTF or GFF3 file you used to build the alignment index may not correspond to the transcript IDs (including splice variants) as expected, which can cause multiple assignments to one gene. I would suggest using the genome annotation and sequence files (GTF and FASTA) from Ensembl (GTF available at https://ftp.ensembl.org/pub/release-110/gtf/homo_sapiens/Homo_sapiens.GRCh38.110.gtf.gz and FASTA at https://ftp.ensembl.org/pub/release-110/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.toplevel.fa.gz), or downloading a pre-built index from the website of your alignment tool.
  • asked a question related to Transcriptomics
Question
1 answer
Submission Deadline: 30 September 2024
Summary
Metabolism refers to the unique biochemical processes that sustain life in an organism. In cancer cells, predominant biological processes include glycolysis, reduced oxidative phosphorylation, promotion of apoptosis and cell death, and increased synthesis of metabolite intermediates essential for cell proliferation, migration, and death. These metabolic properties lead to changes in the tumor microenvironment.
Transcriptomics studies the total RNA (mRNAs and non-coding RNAs) transcribed from a specific cancer cell in a particular functional state. As transcriptome datasets on tumor biology continue to grow, there is an increasing demand for computational and analytical methods. At present, countless public datasets available online allow researchers to have a comprehensive view on aging and related disorders. Microarray or sequencing of mRNA, ncRNA, or m6A provides informative clues for delineating biological processes. Multi-omics analysis, including genomics, proteomics, matrix omics and others, helps provide a comprehensive view of different processes. Single-cell methods further make it possible to chart the genome, transcriptome, and proteome at single-cell resolution. Furthermore, advances in algorithms boost the reporting of novel findings from existing data, and large public health databases, such as NHANES or SEER, further help unveil risk factors in the real world.
These data make it possible to investigate tumor biology, identify novel targets, and evaluate therapeutic effectiveness. This thematic collection aims to provide a comprehensive overview of the latest research advances in cancer metabolism through integrative analyses. We welcome research articles, reviews, perspectives, commentaries, and clinical trials that discuss both basic and translational research as well as therapeutic perspectives in cancer from the view of metabolism.
Relevant answer
Answer
We welcome research articles, reviews, perspectives, commentaries, and clinical trials that discuss both basic and translational research as well as therapeutic perspectives in cancer from the view of metabolism.
  • asked a question related to Transcriptomics
Question
1 answer
Dear ResearchGate community,
I'm currently engaged in research involving the prediction of immune responses using transcriptome data. As part of this, I'm exploring the utility of random forests and decision trees as predictive models.
In case of transcriptomics, what performance metrics have you found most informative when comparing the predictive accuracy of random forests and decision trees? Given the complexity of gene expression data, are there metrics that particularly resonate with understanding immune response prediction? Do you have any tips for optimizing model parameters to prevent overfitting and enhance generalization?
I'm excited to hear about your experiences working at the intersection of transcriptomics, immune responses, and machine learning.
Thank you in advance for your contributions, and I'm looking forward to engaging in enlightening discussions.
Best regards,
Emil
Relevant answer
Answer
Greetings! I'm delighted to share my insights and experiences with you in the realm of transcriptomics, immune responses, and machine learning. When comparing the performance of random forests and decision trees in predicting immune responses using transcriptome data, I've found the following performance metrics to be most informative:
  1. Accuracy: This metric measures the overall proportion of correctly classified samples. While it's useful for getting a broad sense of model performance, it's important to supplement it with other metrics to gain a deeper understanding of the models' behavior.
  2. Precision: This metric calculates the ratio of true positives (correctly predicted instances) to the sum of true positives and false positives (incorrectly predicted instances). Precision is particularly relevant when dealing with imbalanced datasets, where one class dominates the other. In the context of immune responses, precision can help identify models that excel at detecting rare but critical immune cells or genes.
  3. Recall: This metric assesses the proportion of true positives among all actual positive instances. In the context of immune responses, recall can help identify models that successfully capture the full range of immune cell types or genes involved in a particular response.
  4. F1 Score: This metric balances precision and recall, providing a harmonic mean of both. It's helpful when evaluating models that prioritize either precision or recall, depending on the specific application. An optimal F1 score represents a good tradeoff between precision and recall.
  5. Area Under the Receiver Operating Characteristic Curve (AUC-ROC): This metric plots True Positive Rate against False Positive Rate at various thresholds, allowing for the evaluation of model discrimination ability. A higher AUC-ROC signifies better separation between classes, with a value of 1 representing a perfect classifier. In the context of immune responses, AUC-ROC can help assess models' abilities to distinguish between healthy and diseased states or differentiate between distinct immune cell populations.
  6. Cross-Validation: To ensure that performance metrics aren't biased towards a particular subset of the data, employ cross-validation techniques like k-fold or leave-one-out validation. These methods allow for estimating model performance on unseen data, which is crucial for making predictions on new, independent samples.
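To make the definitions of accuracy, precision, recall and F1 in the list above concrete, here is a minimal sketch that computes them from true and predicted labels in plain Python. The function names and the toy labels are illustrative assumptions, not taken from any particular study or library.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 as a dict."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels: 1 = responder, 0 = non-responder (hypothetical).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Computing the same dictionary for a decision tree and a random forest on identical held-out samples gives a like-for-like comparison of the two models.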
When working with complex gene expression data, it's essential to consider the biological relevance of the performance metrics. For instance, in some cases, a high accuracy might not necessarily translate into biologically meaningful results. Instead, focus on identifying models that capture the underlying biology effectively, such as those that distinguish between different immune cell subtypes or predict functional pathways.
To optimize model parameters and prevent overfitting, follow these best practices:
  1. Use robust feature selection methods: Techniques like recursive feature elimination (RFE) or mutual information can help filter out irrelevant features and reduce dimensionality, thereby improving model interpretability and reducing overfitting risks.
  2. Regularization: Apply regularization techniques, such as Lasso or Ridge regression, to shrink model coefficients towards zero. This reduces the risk of overfitting by penalizing large coefficients.
  3. Set aside a validation set: Reserve a portion of your dataset for hyperparameter tuning and model selection. This allows for evaluating model performance on data that hasn't been used during training, ensuring that your chosen model generalizes well to new data.
  4. Perform hyperparameter grid searches: Explore a range of hyperparameters systematically, using techniques like grid search or random search. This enables you to identify the combination of hyperparameters that yields the best model performance.
  5. Monitor performance metrics during training: Track performance metrics like accuracy, precision, and recall throughout the training process. This helps avoid overfitting by identifying the point where model performance starts to degrade.
  6. Consider ensemble methods: Combine multiple models using techniques like bagging or boosting to improve generalization and reduce overfitting. Ensemble methods can often produce more accurate predictions than individual models.
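The validation and cross-validation advice above can be sketched as a k-fold split over sample indices. This is a stdlib-only illustration under stated assumptions (the function name `k_fold_indices` and the fixed seed are hypothetical choices); it does not train any model. One point worth keeping in mind with gene expression data is that feature selection should be repeated inside each training fold, otherwise information from the test fold leaks into the model.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Return a list of (train_indices, test_indices) pairs for k folds.

    Samples are shuffled once with a fixed seed so the split is reproducible.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train_idx, test_idx))
    return splits


splits = k_fold_indices(n_samples=10, k=5)
for train_idx, test_idx in splits:
    # Fit on train_idx (including any feature selection), evaluate on test_idx.
    print(len(train_idx), len(test_idx))
```

Each sample ends up in exactly one test fold, so averaging a metric over the folds uses every sample once for evaluation.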
By considering these performance metrics, biological relevance, and optimization strategies, you'll be well on your way to developing robust and reliable machine learning models for predicting immune responses using transcriptome data.
Good luck with your research endeavors!
  • asked a question related to Transcriptomics
Question
3 answers
Hi - I'm currently working with two RNA-Seq studies; one has RNA extracted from whole blood, the other PBMCs. Eventually we want to combine these data and perform some cell-specific deconvolution to look at DEGs.
Are there any recommended methods for batch correcting these data from different sources?
Mari
Relevant answer
Answer
It is better to consider batch as a factor in the design formula. The tximport pipeline proposed by Michael Love himself offers the most useful solution. Please have a look.
  • asked a question related to Transcriptomics
Question
3 answers
I want to purchase a MacBook mainly for bioinformatics analysis, i.e. transcriptomics, small RNA, methylation, lncRNA and others. Could anyone please suggest the best affordable one?
Relevant answer
Answer
I think a small server is a better choice for bioinformatics data analysis, as it is cheaper and more convenient. Many analyses can take a long time, and MacBooks do not have good heat dissipation.
  • asked a question related to Transcriptomics
Question
2 answers
When a heterologous gene is expressed under the CMV promoter in mammalian cells, what percentage of total cellular RNA and of total cellular mRNA does this gene's mRNA account for? Is there any mention of this in the literature?
Relevant answer
Answer
Any transient transfection can be titrated to achieve a desired expression level. The CMV promoter is very strong and will work in most cell types. The mRNA and protein levels for your gene will vary depending on RNA stability, size, translational efficiency, etc. So, by transfecting varying amounts of your expression vector mixed with something inert, like salmon sperm DNA, you can empirically determine how much you wish to express. For a 100 mm plate we use 6 µg DNA with Lipofectamine or FuGENE. But that can be 100 ng of vector with 5.9 µg inert DNA. The vector can be anywhere from 100 ng to 6 µg.
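The arithmetic behind such a titration series is simple: the total DNA stays fixed and inert DNA fills whatever the vector does not. A small sketch, working in nanograms to avoid rounding issues; the intermediate vector doses below are hypothetical, and only the 100 ng and 6 µg endpoints and the 6 µg total come from the answer above.

```python
def titration_mix(total_ng, vector_amounts_ng):
    """Return (vector_ng, inert_ng) pairs that each sum to total_ng."""
    mixes = []
    for vector_ng in vector_amounts_ng:
        if not 0 <= vector_ng <= total_ng:
            raise ValueError("vector amount must be between 0 and the total")
        mixes.append((vector_ng, total_ng - vector_ng))
    return mixes


# 6 ug total per 100 mm plate; the middle doses are illustrative assumptions.
mixes = titration_mix(6000, [100, 500, 1000, 3000, 6000])
for vector_ng, inert_ng in mixes:
    print(f"vector {vector_ng} ng + inert DNA {inert_ng} ng")
```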
  • asked a question related to Transcriptomics
Question
3 answers
I have tried to separate a direct coculture of MSCs (mesenchymal stromal cells) and macrophages to do bulk RNA-seq on the macrophages, as I want to find out how MSCs change gene expression in macrophages. I have tried different methods to separate the coculture as much as possible, but I can only manage to retrieve a cell population with 95% macrophages, with 5% MSCs still present.
Therefore, I want to know if anyone has experience with analyzing data when the population is not completely pure with one cell type and how do I handle such data?
Is it wise to proceed with bulk RNA seq when 5% of my cells are still MSCs, well aware that the expressed genes observed could come from the 5% MSCs?
Relevant answer
Answer
Dear Kian,
Have you tried improving your purity by FACS? It's fairly easy to choose markers to distinguish MSCs and macrophages and sort highly pure populations.
  • asked a question related to Transcriptomics
Question
13 answers
I have two DEG sets for 2 disease conditions (mild and severe) of the same viral infection. When I look at the genes common to these two DEG sets, I find that some genes show opposite expression between the two conditions (e.g. a gene downregulated in mild but upregulated in severe, or vice versa). So what I want to know is:
1) Is this phenomenon normal in viral infection?
Relevant answer
Answer
Switching between two opposite directions of change is the rule for regulatory genes that work as 'toggle switches', in which the biphasic alternation between two states forms the basis of a kind of digital control of biological regulation. It is no coincidence that a large share of toggle switches are sequences of retroviral origin that mirror the lytic-lysogenic phases of viruses and phages, see:
  • asked a question related to Transcriptomics
Question
3 answers
Hello,
Do you know if it is possible to identify bacterial enzymes (from the microbiota) by analyzing transcriptomics data from a human tissue (e.g. TCGA samples)?
Thanks for your response!
Relevant answer
Answer
Oh, I see.
Thank you anyway!
  • asked a question related to Transcriptomics
Question
2 answers
Hello there!
I have full access to several metabolomic (metabolite concentrations) and transcriptomic (FC and p-values) databases. I would like to integrate all this information to obtain not only DEGs and metabolite boxplots but also pathway and tissue/cell type information. I'm stuck searching for free software or user-friendly R packages other than mixOmics. Any ideas?
Thanks!
Relevant answer
Answer
I am not sure if it will help, but have a look at https://www.omicsnet.ca/.
  • asked a question related to Transcriptomics
Question
4 answers
I work in cancer research and human disorders using bioinformatics approaches. These projects include the analysis of transcriptomic data such as microarray and RNA-seq data, TCGA, systems biology analysis, survival analysis, etc. Metagenomic analyses in the microbiome field are also conducted. Those interested in participating in analyses and writing articles are invited to send their CV to the email below.
Relevant answer
Answer
How does your institute financially support the applicants?
  • asked a question related to Transcriptomics
Question
4 answers
For RNA profiling of frozen tissues, researchers recommend to use single-nuclei RNA sequencing instead of single-cell. What is the reason for this?
Also, what is the best way to freeze cells for RNAseq at a later time?
Thank you very much for your help, be safe!
Relevant answer
Answer
There are several reasons why frozen cells are not typically recommended for single-cell RNA sequencing:
  1. RNA degradation: One of the main challenges with using frozen cells for single-cell RNA sequencing is the risk of RNA degradation. Frozen cells are more prone to RNA degradation than fresh cells, as the freezing process can damage the RNA molecule. This can lead to lower yields of RNA and poorer quality RNA, which can affect the accuracy and reliability of the RNA sequencing results.
  2. Loss of cell viability: Another issue with using frozen cells for single-cell RNA sequencing is that the freezing process can also lead to cell death. This can reduce the number of viable cells that are available for sequencing, limiting the number of cells that can be profiled.
  3. Inability to preserve rare cell types: In some cases, using frozen cells for single-cell RNA sequencing may also lead to the loss of rare cell types, as these cells may be more sensitive to the freezing process.
To minimize these issues, researchers often prefer to use single-nuclei RNA sequencing for RNA profiling of frozen tissues, as this approach allows them to analyze the RNA from individual nuclei rather than whole cells. Single-nuclei RNA sequencing can be performed on frozen tissues, and it has the advantage of allowing researchers to profile the RNA from a large number of cells without the need for cell isolation or culturing.
If you need to freeze cells for RNA sequencing at a later time, it is important to handle the cells carefully to minimize the risk of RNA degradation and cell death. Some general tips for freezing cells for RNA sequencing include:
  1. Grow the cells to high density before freezing to maximize the yield of RNA.
  2. Use a suitable storage buffer, such as RNA storage buffer or RNA lysis buffer, to protect the cells and RNA from damage.
  3. Quickly freeze the cells in liquid nitrogen or in a -80°C freezer to minimize the risk of RNA degradation.
  4. Store the frozen cells at a low temperature, such as -80°C or -196°C, to minimize the risk of RNA degradation and cell death.
Overall, it is generally recommended to use fresh cells or single-nuclei RNA sequencing for RNA profiling, rather than frozen cells.
  • asked a question related to Transcriptomics
Question
3 answers
I am interested in parallel genomic and transcriptomic sequencing at the single cell level but with the high throughput capacity of a system like 10X. I understand that this is doable at a low-throughput level via techniques like SMARTseq2, but I am wondering if such an option exists for HT methods like 10X, DropSeq, etc.
Thanks!
Relevant answer
Answer
Hi Eugene,
I'm currently in the process of optimizing and planning a pilot experiment using the DNTR method explained below:
"A Highly Scalable Method for Joint Whole-Genome Sequencing and Gene-Expression Profiling of Single Cells": https://www.sciencedirect.com/science/article/pii/S1097276520306559
Let me know if you like to brainstorm together.
  • asked a question related to Transcriptomics
Question
3 answers
Hello,
I have performed small RNA sequencing on zebrafish tissue from wild-type and mutant lines. I have deduced a list of differentially expressed miRNAs. I have used DIANA-microT-CDS to predict targets for these miRNAs and have filtered the list of targets to remove genes not expressed in the tissue of interest.
I would like to perform GO term enrichment analysis using GOrilla on the resulting lists of targets. I have two approaches in mind.
1) Use targets with a high predicted repression against a background list of genes expressed in the tissue of interest.
2) Use a single list of targets ranked from very high predicted repression strength to low predicted repression strength.
Could anybody advise if these methods seem suitable for the analysis I would like to perform?
Any advice on alternative methods of softwares would also be greatly appreciated.
Thanks.
Relevant answer
Answer
To add to what my colleague Daniel Toro-Domínguez said:
Both approaches are good, but they are different types of enrichment analysis.
In 1) you will be doing a GSA, where the results may vary with the arbitrary cutoff used to define, in your case, the "targets with high predicted repression". Using a specific background of target genes has been proposed by several authors to correct for likely bias effects.
In 2) you will be doing a GSEA, a "threshold-free" approach.
Or 3) use GeneCodis, providing it directly with your list of miRNAs, which can be:
- Tested directly as miRNAs, against miRNA-specific annotations or gene annotation databases transformed to the miRNA level, using the hypergeometric test.
- Or, if you want to analyse the targets, tested with the Wallenius test to address a gene selection bias known in GSA of CpGs, miRNAs and TFs.
I would perform all of them and choose the results that allow you a better discussion / conclusion.
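For approach 1), the over-representation test with a custom background boils down to a hypergeometric tail probability. The sketch below is a stdlib-only illustration and not the implementation used by GOrilla or GeneCodis; the function name and the example numbers are assumptions. Here `population` would be the number of genes expressed in the tissue (the background), `annotated` the background genes carrying a given GO term, `sample` the size of the target list, and `overlap` the targets carrying the term.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """P(X >= overlap) when drawing `sample` genes without replacement
    from `population` genes, of which `annotated` carry the term."""
    if not (0 <= annotated <= population and 0 <= sample <= population):
        raise ValueError("annotated and sample must lie within the population")
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass from the observed overlap up to the maximum.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 20 background genes, 5 with the term, 4 targets, 2 hits.
p_value = hypergeom_enrichment_p(population=20, annotated=5, sample=4, overlap=2)
print(round(p_value, 4))
```

Because one such test is run per GO term, the resulting p-values need a multiple-testing correction before interpretation.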
  • asked a question related to Transcriptomics
Question
2 answers
Hi everyone,
I want to perform a gene set enrichment analysis on some bacterial metatranscriptomic data. Right now the main idea is to reformat the KEGG orthology htext to gmt. I was wondering if someone has published such database or something similar already. Alas, my web searches have been unfruitful.
Thanks in advance.
Regards.
Miguel
Relevant answer
Answer
I ended up writing a python script which produced a GMT file that was valid for a GSEA. Here is the code if anyone else needs to do this:
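The original script is not reproduced on this page. As a rough idea of what such a conversion could look like (this is not the poster's code), here is a minimal sketch. It assumes the KEGG htext layout in which lines starting with `C` name a pathway and the following lines starting with `D` list its KO entries, and it writes one GMT line per pathway as set name, description, then tab-separated members. The function name `htext_to_gmt` and the example lines are illustrative.

```python
def htext_to_gmt(htext_lines):
    """Convert KEGG htext lines into GMT-formatted strings.

    Assumption: 'C' lines hold '<pathway id> <description>' and the 'D'
    lines below them start with a KO identifier. Other lines are ignored.
    """
    gene_sets = {}  # (set_id, description) -> list of KO ids, insertion-ordered
    current = None
    for raw in htext_lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        level, body = line[0], line[1:].strip()
        if level == "C" and body:
            set_id, _, description = body.partition(" ")
            current = (set_id, description.strip())
            gene_sets.setdefault(current, [])
        elif level == "D" and body and current is not None:
            ko = body.split()[0]
            if ko not in gene_sets[current]:
                gene_sets[current].append(ko)
    # GMT: name <tab> description <tab> member1 <tab> member2 ...
    return [
        "\t".join([set_id, description or "NA", *members])
        for (set_id, description), members in gene_sets.items()
        if members
    ]


# Illustrative excerpt in the assumed layout.
example = [
    "A09100 Metabolism",
    "B  09101 Carbohydrate metabolism",
    "C    00010 Glycolysis / Gluconeogenesis [PATH:ko00010]",
    "D      K00844  HK; hexokinase [EC:2.7.1.1]",
    "D      K12407  GCK; glucokinase [EC:2.7.1.2]",
    "C    00020 Citrate cycle (TCA cycle) [PATH:ko00020]",
    "D      K01647  CS, gltA; citrate synthase [EC:2.3.3.1]",
]
gmt_lines = htext_to_gmt(example)
print("\n".join(gmt_lines))
```

Writing `gmt_lines` to a file, one entry per line, gives a gene set database whose members are KO identifiers, so the expression table has to be keyed by KO as well.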
  • asked a question related to Transcriptomics
Question
4 answers
What could be the probable techniques applied from transcriptomics to study metabolomics of plant pathogen interaction ?
Need suggestion.
Relevant answer
Answer
You can conduct RNA-Seq analysis (only in host or dual in both pathogen and host) to get the DEGs between infected and non-infected conditions and get a bit of idea regarding the genes and metabolic pathways which might be involved in inciting the disease response
  • asked a question related to Transcriptomics
Question
8 answers
Hi everyone, I have a query regarding cell type annotation for single-cell characterisation. Which is better: automated annotation methods (based on identified clusters) or annotation based on known marker genes (available in databases)?
Relevant answer
Answer
I think there are 2 major approaches. 1. Marker genes: the idea is that a gene with a high average fold change and an appropriate adjusted p-value between all clusters uniquely represents a cell type. Markers can be "canonical" (surface proteins detectable with flow cytometry) or "literature based" (genes known to distinguish cells by type and validated in the literature). This is a developing area, with new publications and new frameworks to characterize cells and assign them a defined type.
2. Classification: using large collections of cells (e.g. https://www.humancellatlas.org, or even cell line experiments with cell-type-specific gene expression), we train a model to predict class assignment for new gene expression data. Such trained models rely on various numbers of datasets and features (hundreds) and on more complex patterns than just logFC and adjusted p-value. These are methods like celldex (https://bioconductor.org/packages/release/data/experiment/html/celldex.html).
Both methods have limitations in practice - many clusters can be assigned to several types of cells based on “good” logFC and p-value, so the user might choose the top values or go for unique cell type not present in other clusters. Since methods like Seurat can assign a cell type to each cell, you can also calculate proportion of cell types in a cluster and use that.
Automated annotation can also fail to assign a good cell type to a given set of cells. since there are many known types of cells and new variations are often found, combining these approaches and performing additional manual examination of marker genes, literature and expression patterns is typically required.
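The idea of using the proportion of per-cell labels within a cluster can be sketched in a few lines. This is a generic illustration, not Seurat or celldex code; the function names, the toy labels and the 0.7 threshold are assumptions. Clusters without a clear majority are left unlabelled so they can be examined manually.

```python
from collections import Counter, defaultdict


def cell_type_proportions(cluster_labels, cell_type_labels):
    """Return {cluster: {cell_type: fraction of cells in that cluster}}."""
    per_cluster = defaultdict(Counter)
    for cluster, cell_type in zip(cluster_labels, cell_type_labels):
        per_cluster[cluster][cell_type] += 1
    proportions = {}
    for cluster, counts in per_cluster.items():
        total = sum(counts.values())
        proportions[cluster] = {ct: n / total for ct, n in counts.items()}
    return proportions


def majority_label(proportions, min_fraction=0.7):
    """Pick the dominant cell type per cluster, or None if below threshold."""
    labels = {}
    for cluster, fractions in proportions.items():
        best = max(fractions, key=fractions.get)
        labels[cluster] = best if fractions[best] >= min_fraction else None
    return labels


# Toy per-cell annotations (hypothetical).
clusters = [0, 0, 0, 0, 1, 1, 1]
cell_types = ["T cell", "T cell", "T cell", "B cell",
              "Monocyte", "Monocyte", "NK cell"]
proportions = cell_type_proportions(clusters, cell_types)
cluster_labels = majority_label(proportions)
print(cluster_labels)
```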
Hope that helps! We’re adding tutorials on this topic on our OmicsLogic portal: https://learn.OmicsLogic.com
  • asked a question related to Transcriptomics
Question
2 answers
I need a list of genes that are differentially regulated in diseases like atopic dermatitis. Is there a database for that?
Relevant answer
Answer
Markus Glaß thank you!
  • asked a question related to Transcriptomics
Question
3 answers
Hello,
I have a list of genes from the whole exome sequencing data in a paper. Can I use this list of genes to get gene expression data? Should I download the transcriptomic data using this list of genes? How do I do that? Also, can I get SNV and CNV data for the genes in the list?
Thank you
Relevant answer
Answer
Simply download the raw counts as .csv and process them using DESeq2 or limma-voom in R. For limma you will have to normalize your counts first; DESeq2 is easier. You will get log2FC values along with adjusted p-values. If you just want to check expression for wet lab experiments, you can also calculate the fold change without testing for statistical significance: average the counts of the control samples and of the experimental samples, then FC = exp/control.
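The quick fold-change check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gene names and counts are made up, the counts are assumed to be already normalized for library size, and the pseudocount is an added convenience (not part of the answer) to avoid dividing by zero.

```python
import math

def log2_fold_change(control, treated, pseudocount=1.0):
    """Return log2(mean(treated) / mean(control)).

    The pseudocount is added to both means so that genes with zero
    counts in one group do not cause a division by zero.
    """
    mean_control = sum(control) / len(control)
    mean_treated = sum(treated) / len(treated)
    return math.log2((mean_treated + pseudocount) / (mean_control + pseudocount))

# Hypothetical normalized counts for two genes, three samples per group.
counts = {
    "GENE_A": {"control": [100, 120, 80], "treated": [400, 380, 420]},
    "GENE_B": {"control": [50, 60, 40], "treated": [25, 20, 30]},
}

fold_changes = {
    gene: log2_fold_change(groups["control"], groups["treated"])
    for gene, groups in counts.items()
}

for gene, value in fold_changes.items():
    print(f"{gene}\tlog2FC = {value:.2f}")
```

A positive value means higher expression in the experimental group. Without a statistical test this says nothing about significance, so it is only suitable for a first look.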
  • asked a question related to Transcriptomics
Question
5 answers
I'm in the initial stages of planning a miRNA-seq experiment using human cultured cells and have decided on TRIzol extraction and the TruSeq small RNA prep kit, sequencing on an Illumina HiSeq 2500. The Illumina webinar suggests 10-20 million reads for discovery, the Q&A support page suggests 2-5M, and when I wrote to tech support they suggested up to 100M reads for rare transcripts. The Exiqon guide to miRNA discovery says there is not really any benefit in going over 5M reads. I was hoping to save money by pooling more samples in a lane, so I was hoping someone with experience might be able to suggest a suitable number of reads.
Relevant answer
Answer
I am working on blood samples from cardiomyopathy patients and want to do miRNA sequencing. Can someone please suggest how many million reads I need to sequence (20 million or 30 million), and also suggest a platform?
  • asked a question related to Transcriptomics
Question
2 answers
Phenol-chloroform based methods are the most widely used for RNA extraction. I am wondering if people have tried alternative methods for cell lysis (yeast, animal cells, plant cells, etc.), specifically using SDS and proteinase K? The idea is to avoid phase separation-based methods and toxic organic solvents like phenol.
- What kinds of buffers can be used for lysis?
- How does one get rid of SDS and other chaotropic agents used during cell lysis?
Thanks for your valuable insights.
Relevant answer
Answer
Agniva Saha the problem was RNA integrity. I don't have my full protocol notes but briefly, cell lysis was performed in a 'gentle' buffer consisting of hypo-osmotic sucrose and 0.5% Triton X-100, agitated at a low temperature. I supplemented with SUPERase•In. Following lysis (which was incomplete after 30 minutes) the buffer was supplemented with 1.5% SDS and sodium chloride, Proteinase K was added, and Prot K digestion performed at room temperature for 45 minutes.
This whole protocol is a huge tradeoff between the needs of RNA (low temperature, rapid separation from other cellular components) and deproteination using Prot K (ideal reaction temperature 50-65C, and for deproteination of DNA after ChIP we do 4-hour reactions to get rid of all the protein.) The result was that I isolated less than 1/3 of the typical yield of low-integrity RNA (RIN ~5.5-6.5).
Compare this to phenol, which is such a strong protein and nucleic acid denaturant that it results in a) near complete lysis and b) near complete protection of RNA from endogenous RNases within seconds of efficient sample homogenisation. I don't really think there's an alternative that goes close.
I will say though, every method I mentioned above for avoiding phase separation results in no loss in RNA integrity if performed properly. I haven't done a head to head yield comparison, but my routine RNA extraction protocol involves Nucleozol supernatant mixed 1:1 with ethanol and run through a column, and the yields are better in my hands than alcohol precipitation of RNA, with better purity.
  • asked a question related to Transcriptomics
Question
7 answers
What advantages does transcriptome have over proteome as the final product of gene expression is protein? Why to choose it?
Relevant answer
Answer
Not only is it much easier to work with transcriptomics, but transcriptomics also lets you take into consideration the role played by non-coding sequences, which make up the majority of the genome and very often play a crucial role in biological regulation; see for example:
  • asked a question related to Transcriptomics
Question
1 answer
Dear environmentalist in Bangladesh,
I would be happy to know where in Bangladesh I can find an FTIR microscopy facility and an established protocol for microplastics characterisation in biological samples.
Please also suggest any transcriptomic markers to analyse in fish and molluscs.
Thanks in advance.
Relevant answer
Answer
This article might help you with MP analysis. You can get a detailed idea from the corresponding author of this paper.
Jabed Hasan, can you please help him characterize the MPs?
  • asked a question related to Transcriptomics
Question
5 answers
Dear colleagues, we plan to analyze human tumor tissue (lung, oral, and breast) samples using the Chromium single cell 3' gene expression solution. We need to store the collected samples for more than 3 months. What sample preparation and storage methods would you recommend?
Relevant answer
Answer
Hi Anna,
There are several freezing protocols on the internet for mammalian cells.
My general freezing protocol is as follows; it should work with other mammalian cells:
*Make a cryoprotectant solution with the following final composition:
10% DMSO (D2650, Sigma-Aldrich),
50% fetal bovine serum (FBS),
40% cell culture medium without FBS,
1% antibiotic/antimycotic mixture (penicillin/streptomycin mixture).
*Suspend the cells in the cryoprotectant solution at a final concentration of 5 million to 10 million cells/ml.
*Place the cell suspension in cryotubes, filling them to 70% only (CryoPure tubes, cat# 72.377.002, Sarstedt, or from any company).
*Place the cryotubes with the cell suspension in a Mr. Frosty™ freezing container (Cat# 5100-0001, Thermo Scientific).
*Place the Mr. Frosty (with cryotubes) at -70 °C to -86 °C for 24 h.
*Either transfer the cryotubes to liquid nitrogen, where you can store them for many years,
or
transfer the cryotubes to cold boxes and leave them at -70 °C to -86 °C for storage of up to 1 year only.
Thawing of cells:
Take the cryotubes out of LN2 or the freezer (-70 °C to -86 °C) and thaw them in a 37 °C water bath for 5 to 10 minutes, until the suspension melts.
Transfer the cell suspension into cell culture medium (your choice) in a cell culture flask (your choice) under a sterile hood.
Grow the cells for a passage to recover from freezing shock, then use them for your RNA-seq or other experiment after the first or second passage.
(I assume you know cell culture methods; otherwise you can get help from anyone doing cell culture.)
Good luck. If you have any questions, feel free to ask.
Subhash
  • asked a question related to Transcriptomics
Question
3 answers
Hi all,
I want to know about the transcriptomic response to an infection. How long does it take for the genes to be activated?
Thanks for your insights
Julien
Relevant answer
Answer
Check out Aymoz et al. (2018) "Timing of gene expression in a cell‐fate decision system" (Mol. Syst. Biol. 14, 4, e8024) for a nice study on the timing of transcriptional responses mediated via MAPK to yeast pheromones, a system similar to antigen-induced gene activation pathways.
  • asked a question related to Transcriptomics
Question
5 answers
Hi all,
I want to know the kinetics of a transcriptomic response to infection. I am interested in the earliest time points: how many minutes does it take a cell to activate a gene upon infection?
Thank you for your insights
Julien
Relevant answer
Answer
The cascade leading to transcription factor activation is short in the case of the canonical pathways (NF-kB, IMD). The accumulation of the transcripts is another matter. The structure of the promoters of AMPs suggests that non-immune-related transcription factors are likely to bind and interfere with the specific transcription. An HIF binding site, for instance, is frequently found near these genes, suggesting a potential hindrance upon redox stress.
  • asked a question related to Transcriptomics
Question
3 answers
I am following the way a previous paper (PMID: 30948552) treated its spatial transcriptomic (ST) data. It seems they combined the expression matrices of different conditions (it is not mentioned whether normalized or log transformed), calculated a gene-gene similarity matrix (Pearson rather than Spearman), and finally obtained gene modules (clustered by L1 norm and average linkage) with different expression between conditions.
So I have several combinations of methods with which to imitate their workflow.
For the expression matrix, I have two choices. The first is a merged count matrix from the different conditions. The second is a normalized data matrix (the default of the NormalizeData function in Seurat, log((count/total count of spot)*10000+1)). For correlation, I have used Spearman or Pearson to calculate a correlation matrix.
But I got stuck.
When I use the count matrix, no matter which correlation method, I get a heatmap with mostly positive values, which looks strange. And for the normalized data matrix (only Pearson calculated), I get a heatmap with a sparse pattern, which also looks strange.
My questions:
  1. Which combination of data and method should I use?
  2. Would this workflow weaken the correlation between genes, since some may be correlated only in a specific condition?
  3. Do you have any other thoughts on my work?
Looking forward to your reply!
Relevant answer
Answer
Correlation AnalyzeR: functional predictions from gene co-expression correlations
  • Henry E. Miller &
  • Alexander J. R. Bishop
BMC Bioinformatics volume 22, Article number: 206 (2021)
Abstract: Co-expression correlations provide the ability to predict gene functionality within specific biological contexts, such as different tissue and disease conditions. However, current gene co-expression databases generally do not consider biological context. In addition, these tools often implement a limited range of unsophisticated analysis approaches, diminishing their utility for exploring gene functionality and gene relationships. Furthermore, they typically do not provide the summary visualizations necessary to communicate these results, posing a significant barrier to their utilization by biologists without computational skills.
Background: Almost two decades after the completion of the Human Genome Project, the functionality of many genes remains largely enigmatic [1]. Many such “enigmatic genes” have immense biological significance, exemplified by the associations of thousands with cancer outcome [2]. Even genes which are well-characterized often play unexpected roles in different biological contexts (e.g., EZH2 is both a tumor-suppressor and an oncogene in different cancers [3]). Gene co-expression correlations provide a robust methodology for predicting gene function, as genes which share a biological process are often co-regulated [4,5,6]. Similar insights can be gained from using protein interaction (for example STRING [7] and InterologFinder [8]), phenome data, or even the combination of both [9]. Irrespective, generating expression data remains a cost-effective approach and co-expression analysis remains a prominent tool for exploratory systemic evaluation, largely because it is capable of considering gene co-expression across the genome. However, the applications which have been developed for such inference are hampered by key limitations. Tools like COXPRESdb [10] and GeneFriends [11] calculate gene set over-representation on an arbitrary number of co-expressed genes. Alternatively, GeneMANIA [12] and GIANT [13] construct co-expression networks and calculate gene set over-representation on an arbitrary number of nodes. Neither approach is sensitive to the genome-wide distribution of co-expression correlations or, with the exception of GIANT, differences between tissue/disease conditions. Furthermore, these functional predictions are limited in scope and do not generate relevant, user-friendly visualizations, limiting their utility for biologists without bioinformatics skills. Recently, Lachmann et al. introduced ARCHS4, a database with thousands of standardized RNA-Seq datasets [14].
We re-processed these data, calculating co-expression correlations with respect to tissue and disease (cancer/normal) condition and provided the results in a publicly accessible database. We now present Correlation AnalyzeR, a user-friendly interface to this co-expression database with a suite of tools for de-novo prediction of gene function, gene–gene relationships, and biologically relevant gene subgroups to facilitate discovery of novel relationships within genes of interest.
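To make the choice raised in the question more concrete (raw counts versus log-normalized values, Pearson versus Spearman), here is a small Python sketch. It is not the workflow of the cited papers. The spots and the genes A, B and C are invented, and the normalization simply follows the formula quoted in the question, log(count / total count of spot * 10000 + 1). The toy data are built so that two spots are deeper-sequenced copies of the other two, which is one plausible reason a raw-count heatmap looks mostly positive.

```python
import math

def log_normalize(spots, scale=10000):
    """Per spot: log(count / total count of spot * scale + 1)."""
    normalized = []
    for spot in spots:
        total = sum(spot.values())
        normalized.append({g: math.log1p(c / total * scale) for g, c in spot.items()})
    return normalized

def pearson(x, y):
    """Pearson correlation; NaN if either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return float("nan")
    return cov / (sd_x * sd_y)

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical spots: spots 2 and 4 are 10x deeper copies of spots 1 and 3.
spots = [
    {"A": 10, "B": 20, "C": 70},
    {"A": 100, "B": 200, "C": 700},
    {"A": 30, "B": 10, "C": 60},
    {"A": 300, "B": 100, "C": 600},
]
normalized = log_normalize(spots)

raw_r = pearson([s["A"] for s in spots], [s["B"] for s in spots])
norm_r = pearson([s["A"] for s in normalized], [s["B"] for s in normalized])
print(f"raw counts: r = {raw_r:.2f}; normalized: r = {norm_r:.2f}")
```

In this toy case genes A and B look positively correlated on raw counts only because both scale with sequencing depth; after normalization their relationship is negative. Real data will be less clear-cut, so treat this as a reason to check depth effects rather than as a rule.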
  • asked a question related to Transcriptomics
Question
1 answer
Hi all,
I am developing a high-throughput RNA extraction protocol for xylem vessels. The plan is to homogenize the samples in a Genogrinder with a cryoblock attachment and then transfer them to a 96-well format for total RNA extraction. For ease of transfer to the 96-well plate, I was thinking of mixing the homogenized tissue with RNAlater-ICE, simply because transferring dry material to a 96-well plate would be too messy and not very high-throughput. Does anyone have experience with RNAlater altering the transcriptome gene expression profile? I read these attached articles on normal RNAlater and the influence it can have, but they only submerged whole tissues rather than ground tissues. What would be a good alternative to RNAlater and RNAlater-ICE? I am still undecided between a 96-well plant RNA extraction kit and CTAB RNA extraction. Would submerging the homogenized xylem in, e.g., a kit's lysis buffer greatly affect the RNA integrity? Thanks in advance!
Relevant answer
Answer
Hi
RNAlater is generally used when the integrity of the RNA inside whole tissue needs to be maintained and it is not possible to freeze the tissue in liquid nitrogen immediately. RNAlater may also cause problems in RNA isolation if it is not removed properly. It is easier to remove from the surface of whole tissues than from homogenized tissue. For your experimental needs, TRIzol/TRI reagent may be a better option. It maintains the quality of RNA in the homogenized tissue, and it should be suitable for high-throughput isolation of RNA.
  • asked a question related to Transcriptomics
Question
3 answers
I want to do RACE and I have no previous experience with it. I am planning on using the Invitrogen RLM-RACE kit, but I am confused about a few things:
1. Should I choose the 5' or 3'? What criteria should I base my decision on?
2. After use (5' or 3') how do you store the product?
3. For sequencing do I clone it? Or do I give the product directly? I read that for the NGS-based approach there is no need for cloning.
Relevant answer
Answer
Pragya Tiwari, ma'am, I am also going to use the same kit to do 3' RACE. I am planning to do PCR using the 3' outer primer and a 5' gene-specific primer. Can I amplify my 3 kb gene of interest using 3' RACE alone? Kindly reply.
  • asked a question related to Transcriptomics
Question
5 answers
Hello dear fellow scientists,
I would like to ask some basic naive questions:
1) When scientists perform a transcriptomic study, let's say to compare a mutant to a wild-type plant, they tend to look at the genes that are at least two times more or two times less expressed between the two samples. Why not all genes that are differentially expressed between the two genotypes? Is it because it is more reliable?
2) Usually when you perform a transcriptomic and a proteomic study (on the same sample and under the same conditions), you find only a small number of genes that show the same expression pattern (up-regulation or down-regulation) in both experiments. Why?
I did a transcriptomic and a proteomic study on a mutant and found a small overlap between the differentially expressed genes and the differentially expressed proteins.
I mean, it's not surprising overall, but I can't think of an explanation.
Is it related to the degradation of transcripts? Post-translational regulation?
I hope my questions are clear.
Sincerely
Relevant answer
Answer
Yes indeed... Polyadenylation controls many things, but mainly the stability of the transcript. It is well known in mammals that for instance those proteins involved in signal transduction have the tendency to harbor weaker polyadenylation signals, which conditions the stability of the mRNA and the protein levels.
  • asked a question related to Transcriptomics
Question
4 answers
Hi!
I have jellyfish samples (gonads and tentacles) preserved in ethanol and stored at -80 °C for about 2 years. I would like to know if I can use these samples to extract RNA for transcriptomics.
Thank you all in advance!
Relevant answer
Answer
Hi,
I recommend using a NanoDrop and agarose gel electrophoresis to check the quality and integrity of your extracted RNA. If both are good, you can use the sample.
  • asked a question related to Transcriptomics
Question
2 answers
Hi!
In single-cell droplet sequencing, one of two cell lysis buffers is often chosen: 0.5% CA-630, or 0.2% sarkosyl 160 + 6% Ficoll PM 400. How do these two choices differ in RNA yield, mRNA completeness, etc.?
Thanks!
Relevant answer
Answer
The main consideration when choosing a lysis buffer is whether the chosen antibody will recognize denatured samples. When this is not the case, it will be noted on the antibody datasheet, and buffers without detergent or with relatively mild non-ionic detergents (NP-40, Triton X-100) should be used.
  • asked a question related to Transcriptomics
Question
6 answers
Hello,
I am looking to obtain global RNA-Seq data for either E. coli or P. putida. I assume RNA-seq data is publicly available for many microbes, but I am unsure where I can access this information. Does anyone have insight as to what website or database I can find this data?
Many thanks,
Shawn
  • asked a question related to Transcriptomics
Question
1 answer
Hi. I'm working with spatial transcriptomic data and have found a gene of interest. Now we need to know which transcript isoform of the RNA is expressed in our sample. However, NCBI shows this gene has 3 isoforms while Ensembl shows only one. We therefore want to run spaceranger with an NCBI reference, but 10X only provides an Ensembl-based mouse reference. So I downloaded the gff and fna files from NCBI, converted the gff to gtf, then generated the reference directory as described in the spaceranger tutorial. But spaceranger does not work with this reference directory; it just crashes in the middle of the process. Did I do something wrong when generating the reference? Or does anyone have an NCBI mouse reference for spaceranger?
Relevant answer
Answer
Hi Kleran
I think you followed the support (https://support.10xgenomics.com/spatial-gene-expression/software/pipelines/latest/advanced/references) and I see 2 points that could be at the origin of your problem:
- the first one is the genome you selected, is it mm10 genome?
- the second one is that data you downloaded must be compatible with STAR aligner, a point you need to dig in...
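One way to dig into the compatibility point is to check the converted GTF against the FASTA before building the reference. The Python sketch below is only a sanity check under stated assumptions: that contig names in the GTF must match the FASTA headers exactly, and that exon records need gene_id and transcript_id attributes. Both are common requirements for STAR-based references, but please confirm them against the 10x documentation. The example records are invented.

```python
def fasta_contigs(fasta_lines):
    """Sequence names from FASTA headers (text up to the first whitespace)."""
    return {
        line[1:].split()[0]
        for line in fasta_lines
        if line.startswith(">") and line[1:].strip()
    }

def gtf_problems(gtf_lines, contigs):
    """Return a list of human-readable problems found in the GTF lines."""
    problems = []
    for number, line in enumerate(gtf_lines, start=1):
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            problems.append(f"line {number}: expected 9 tab-separated fields, found {len(fields)}")
            continue
        if fields[0] not in contigs:
            problems.append(f"line {number}: contig '{fields[0]}' is not in the FASTA")
        if fields[2] == "exon":
            for key in ("gene_id", "transcript_id"):
                if f'{key} "' not in fields[8]:
                    problems.append(f"line {number}: exon record lacks {key}")
    return problems

# Invented example: the second record uses a contig missing from the FASTA
# and has no transcript_id attribute.
example_fasta = [">chr1 Mus musculus", "ACGT", ">chr2", "GGCC"]
example_gtf = [
    'chr1\tRefSeq\texon\t1\t4\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chrM\tRefSeq\texon\t1\t4\t.\t+\t.\tgene_id "G2";',
]

found = gtf_problems(example_gtf, fasta_contigs(example_fasta))
for problem in found:
    print(problem)
```

For real files you would pass open file handles instead of the example lists. An empty result does not guarantee that the reference will build, but mismatched contig names or missing attributes are worth ruling out first.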
all the best
fred
  • asked a question related to Transcriptomics
Question
4 answers
Our lab has sent rat cardiac tissue for sequencing and have obtained indigestible fastq data files. Is there a software I can use to organize these fastq sequence files in order to obtain meaningful results?
Relevant answer
Answer
You may follow this pipeline.
Let me know if you are interested to outsource to my lab www.eminentbio.com
  • asked a question related to Transcriptomics
Question
4 answers
Differential Expression tables in R - transcriptomics
I would like someone to explain to me what the DE (differential expression) tables are in RNA-seq experiments. I know they contain the p-value, adjusted p-value and log2 fold change, but I am confused about what these values are measured for.
For example:
I have 231 samples, which were collected with age, BMI and sex recorded. The DE table for age is different from the DE tables for BMI and sex: although the numbers of entries are the same, the p, adjusted p and log2 fold change values are different.
Can somebody explain to me why?
Relevant answer
Answer
Differential expression means a difference in the expression of genes between groups, whether up-regulated or down-regulated. Read counts are obtained by mapping reads and assigning them to annotated features; a read count is the number of reads that overlap a genomic feature such as a gene or transcript. The read counts of all samples (e.g. control and treatment) are provided to the respective tool as a matrix, and the groups are then compared as a log2 ratio, conventionally log2(treatment/control). The p-value is the probability of seeing a difference at least this large by chance if there were no true difference. A p-value (or, better, an adjusted p-value) below 0.05 is commonly taken as significant.
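Each DE table tests a different variable (age, BMI or sex) against the same genes, which is why the number of entries is the same while the p-values and fold changes differ. The adjusted p-value then corrects for testing many genes at once. A minimal Python sketch of one common adjustment, Benjamini-Hochberg, is shown below. The p-values are invented, and whether your tables used exactly this method is an assumption you would need to check in the tool's settings.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        adjusted[index] = running_min
    return adjusted

# Hypothetical raw p-values for four genes from one DE table.
raw_p = [0.01, 0.04, 0.03, 0.20]
adjusted_p = benjamini_hochberg(raw_p)

for p, q in zip(raw_p, adjusted_p):
    print(f"p = {p:.2f}\tadjusted p = {q:.3f}")
```

Note that the adjusted values depend on all the p-values in the table, so the same gene can have a different adjusted p-value in the age table than in the sex table even before considering that the underlying comparison is different.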
  • asked a question related to Transcriptomics
Question
2 answers
Hi everyone,
At the moment I am designing a spatial gene expression experiment using the 10X Visium assay. There are a few papers out there that have used this assay. There are also several packages available to analyze the data (e.g. Seurat). However, if I am correct, none of these methods take biological replicates into account. In other words, is it possible to align different slices of biological replicates and then perform differential expression analysis to compare conditions?
Relevant answer
Answer
Honestly, this tech is still in its infancy, and is also _suuuuuper_ expensive. Most people are hard-pressed to make use of the gigabytes of sequence data they get from a single experiment, let alone take that forward to N=3 or more.
That isn't to say what you're proposing is impossible, or even impractical, but more to suggest that for such specific questions there are usually cheaper, more specific solutions. You absolutely can compare multiple RNAseq datasets, and adding a spatial component to this should also be possible provided your segmentation/tissue designation is solid enough (or you have cell-specific markers to aid comparisons), but really...I think you need to carefully formulate your question and work out exactly what this very, very expensive (but very, very neat) approach can do that other more conventional methods cannot.
Also, obviously I am jealous of your budget and resources, but that kinda goes without saying. ;-)
  • asked a question related to Transcriptomics
Question
5 answers
I am looking for ideal configuration details for a workstation to perform a metagenomic, transcriptomic and whole genomic analysis.
Relevant answer
Answer
Hi,
It depends on the size of the datasets in your study. For metagenomic data analysis, you can analyze your data with limited memory (e.g. QIIME 2, MetaQUAST, DADA2), but for transcriptomic data analysis, most programs (STAR, Bowtie, BWA, etc.) require a substantial amount of memory for indexing and mapping to the genome (especially human or mouse). For example, the STAR aligner needs roughly 32 GB of RAM for the human genome. Currently we are using a Dell PowerEdge R940xa rack server for our transcriptomic and metagenomic studies. Briefly, the server has Intel Xeon Gold 5220 processors with 72 cores in total (18 cores per processor, 4RU), 1 TB RAM (DDR4 SDRAM with ECC), 24.75 MB cache memory (L1+L2+L3), and 10 TB storage in a RAID configuration.
Regards;
Anupam
  • asked a question related to Transcriptomics
Question
3 answers
Say, we'd like to publish an experimental paper in which a certain metabolic pathway is investigated from the points of various methods, such as RNA-Seq, mass spectrometry, enzyme assays, gene knock-out, etc. RNA-Seq is used for the analysis of differential expression of genes encoding the enzymes related to the pathway, so only a few tens (out of thousands) of differentially expressed genes are discussed in the paper. Nevertheless, we have to publish the full set of RNA-Seq raw reads since virtually any journal requires sequencing data availability.
There are no problems with uploading our reads to the SRA database and inserting an SRA accession number into the manuscript. But we'd like to analyze the rest of our RNA-Seq data and write one more article to publish elsewhere (without overlapping the aspects discussed in the first paper). Thus, we'll upload the reads in the SRA once, and then refer to the same accession number in two different articles. Is it OK? Is it ethical? Are there any copyright issues to face?
Relevant answer
Answer
Adding to the above points, this is a common practice and often adopted by many researchers, however, the problem only arises when publishing the research without providing any data to the public repositories.
  • This is done in order to protect the data from others in fear that they could use the data in their studies before the author/owner of the data.
  • This practice is not good and should be discouraged. Many journals/editors/reviewers failed to prevent this and let the author publish without providing the data for public access. This is a huge problem in the way of transparent and open/reproducible science.
  • It is basically the responsibility of journal to take care of.
  • asked a question related to Transcriptomics
Question
1 answer
Hello everybody,
I am a PhD student working with a non-model tree under drought stress. I want to know whether I can use GSEA in my experimental design with my non-model plant. One of my experiments consists of three control plants and three stressed plants. I extracted total RNA and performed RNA sequencing, "de novo" assembly and DE analysis, so I have about 190,000 genes with their normalized counts (TMM). I was able to create the data set (.gct) and phenotype labels (.cls) files.
1) But I can't, or don't know how to, create a gene sets file (.gmt) matched to GO terms, because the IDs in my data set file come from Illumina; they look like c0001_g1. And there is more:
2) I do not quite understand whether it is strictly necessary to have a chip to run the analysis.
I would be very grateful for an answer adapted to biologists who are not specialists in bioinformatics.
Thanks
Edgardo,
Relevant answer
Answer
Hi Edgardo,
Technically, you still can perform GSEA even though you are using non-model plant. Some of the necessary steps/files are as follows:
1. a gmt file. I don't reckon that there is a readily available gmt file for your plant. So, you may be able to use the gmt file of the closest plant relative to your species. This publication (https://pubmed.ncbi.nlm.nih.gov/23632162/) and this database (http://systemsbiology.cau.edu.cn/PlantGSEA/download.php) may help.
2. You will need to map your gene ID to match that of the gmt file. To do so, you will need to annotate the transcriptome which you have assembled. I am not familiar with the annotation process, but this manuscript may help ( ). After the transcriptome annotation, you can map your transcripts to genes. If the gene ID matches that of the gmt file, the files can be used directly for GSEA. If not, you will need to remap the gene ID before GSEA. BiomaRt (https://plants.ensembl.org/biomart/martview/76f17c602cf7f4cb9adc75e9a97c87df) may be helpful in this case.
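If annotation gives you a table of assembled gene IDs and their GO terms, a .gmt file can also be built directly from your own IDs. The Python sketch below assumes the usual GMT layout (one gene set per line: set name, description, then member genes, all tab-separated). The gene IDs and the annotation dictionary are invented for illustration.

```python
from collections import defaultdict

def build_gmt_lines(gene_to_terms, term_names=None, min_size=1):
    """Turn a gene -> GO terms mapping into GMT lines (term -> genes)."""
    term_names = term_names or {}
    members = defaultdict(set)
    for gene, terms in gene_to_terms.items():
        for term in terms:
            members[term].add(gene)
    lines = []
    for term in sorted(members):
        genes = sorted(members[term])
        if len(genes) < min_size:
            continue  # drop gene sets that are too small to be useful
        description = term_names.get(term, "na")
        lines.append("\t".join([term, description] + genes))
    return lines

# Hypothetical annotation of de novo assembled genes.
annotation = {
    "c0001_g1": ["GO:0006950"],
    "c0002_g1": ["GO:0006950", "GO:0009414"],
    "c0003_g1": [],  # genes without annotation are simply left out
}
names = {"GO:0006950": "response to stress"}

gmt_lines = build_gmt_lines(annotation, names)
print("\n".join(gmt_lines))
# To save: write "\n".join(gmt_lines) to a file ending in .gmt
```

Because the gene IDs in such a file are the same as those in your .gct file, no remapping between ID systems should be needed. As far as I understand, a chip file is only used to translate IDs, so it may not be required in that case, but please check this against the GSEA documentation.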
I hope this helps.
Regards
Hong Sheng
  • asked a question related to Transcriptomics
Question
3 answers
Transcriptomic experiments were conducted and I obtained 355 differentially expressed genes. In addition, several enrichment analyses were performed, such as NOG, KOG, COG, GO, KEGG etc. About 10 plots were generated. Which plots should be shown in a scientific article?
Relevant answer
Answer
Anything that makes sense and helps to visualize what you write and describe in the paper.
I guess if you are the author, it should be you who decides what to put in your paper. If you are not sure, ask your co-authors, if you have any.
  • asked a question related to Transcriptomics
Question
4 answers
I have RNA-Seq data for different cell lines and I'm looking for lncRNAs which may be differentially expressed.
Relevant answer
Answer
Is there any method to work with NONCODE in R?
  • asked a question related to Transcriptomics
Question
1 answer
Hi all,
I am collecting blood from human donors on TEMPUS tubes for RNA stabilization. After RNA extraction, we want to use the tubes for different transcriptomics downstream applications that will take place in different labs.
I was wondering if aliquoting the blood/TEMPUS buffer mix right after collection was a viable option to optimize shipment of the samples. The workflow would be as follows: collection of blood on TEMPUS tube, thorough mixing/vortexing to ensure complete mixing of the blood with the TEMPUS buffer, then aliquoting of the entire contents of the TEMPUS tube into three Falcon 15mL, and storage at -80°C before shipment.
Has anyone ever done this or something similar? I may be paranoid but I am worried that the tube itself might be optimized for RNA preservation (e.g. special coating of the glass...). Better safe than sorry!
Relevant answer
  • asked a question related to Transcriptomics
Question
4 answers
I need to build a ligand-receptor interaction map from bulk RNA sequencing data (mouse). All the methods I have found require a matrix with a gene symbol column and mean expression values for each cell type. I only have tsv files with metadata and counts. Do you know how to get this from the data I have? Is there an R library/protocol/tutorial for that? Which method do you suggest for obtaining a receptor-ligand interactome from bulk RNA data?
Here is what my metadata looks like:
id nCount_RNA nFeature_RNA PercentMito ERCCCounts PercentERCC Animal Plate
X11_E1 569589 11505 0.00331115945006 20 3.51E-05 11 11 X11A10
.......
Birthdate Gender Organ CellType RowID ColID
old Female BM GMP E 1
.......
Counts:
gene X11_E1 X11A10 X11A12 X11A3 X11A5 ........
Gnai3 23 4 22 25 94 ..........
.......
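The gene-by-cell-type matrix of mean expression that these methods expect can be derived from the two tables by averaging each gene over the samples that share a CellType label. A minimal Python sketch of that step is below. The Gnai3 counts are copied from the excerpt above, but only the GMP label for X11_E1 comes from the metadata shown; all other sample-to-cell-type assignments are hypothetical. In practice the counts should be normalized before averaging.

```python
from collections import defaultdict

def mean_expression_by_group(counts, sample_to_group):
    """counts: gene -> {sample: value}. Returns gene -> {group: mean value}."""
    result = {}
    for gene, by_sample in counts.items():
        totals = defaultdict(float)
        sizes = defaultdict(int)
        for sample, value in by_sample.items():
            group = sample_to_group.get(sample)
            if group is None:
                continue  # sample has no metadata entry; skip it
            totals[group] += value
            sizes[group] += 1
        result[gene] = {group: totals[group] / sizes[group] for group in totals}
    return result

# Counts as in the excerpt above.
counts = {
    "Gnai3": {"X11_E1": 23, "X11A10": 4, "X11A12": 22, "X11A3": 25, "X11A5": 94},
}
# Only X11_E1 -> GMP is taken from the metadata shown; the rest is made up.
cell_types = {
    "X11_E1": "GMP",
    "X11A10": "GMP",
    "X11A12": "other",
    "X11A3": "other",
    "X11A5": "other",
}

mean_by_type = mean_expression_by_group(counts, cell_types)
print(mean_by_type)
```

The result can then be written out with one row per gene and one column per cell type, which is the layout described in the question.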
  • asked a question related to Transcriptomics
Question
2 answers
I tried to run this function sitetest to perform Site-level Differential Methylation Analysis using IMA package but I got error message.
sitetestALL = sitetest(dataf,gcase="KO",gcontrol="WT",testmethod ="wilcox" ,Padj="BH", rawpcut = NULL,adjustpcut =NULL,betadiffcut = NULL,paired = FALSE) and I got this error message: Error in wilcox.test.default(x[1:length(lev1)], x[(length(lev1) + 1):(length(lev1) + : not enough (finite) 'x’ observations
Can you help me to solve this problem?
Relevant answer
Answer
Hi
I suggest using if and else in lapply.
For example:
if (nrow(column1) > 30) {
  x <- with(data, cor(a, b))
} else {
  x <- 0
}
This is a good solution when the number of samples is small.
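The same guard idea can be written out more fully. The error message suggests that, for at least one site, one group had no finite values left, so one option is to test only sites with enough finite observations per group. The Python sketch below illustrates the logic only; it does not use the IMA package, the beta values and column names are invented, and the Mann-Whitney U statistic stands in for the full Wilcoxon test.

```python
import math

def finite(values):
    """Keep only real, finite numbers (drops None, NaN and infinities)."""
    return [v for v in values if v is not None and math.isfinite(v)]

def mann_whitney_u(x, y):
    """U statistic for x: pairs with x > y count 1, ties count 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)

def u_per_site(beta, case_cols, control_cols, min_per_group=2):
    """Compute U per site; sites with too few finite values get None."""
    results = {}
    for site, row in beta.items():
        case = finite([row[c] for c in case_cols])
        control = finite([row[c] for c in control_cols])
        if len(case) < min_per_group or len(control) < min_per_group:
            results[site] = None  # not enough observations to test
        else:
            results[site] = mann_whitney_u(case, control)
    return results

# Invented beta values; the second site has no usable KO measurements.
beta = {
    "cg0001": {"KO1": 0.8, "KO2": 0.9, "WT1": 0.2, "WT2": 0.3},
    "cg0002": {"KO1": float("nan"), "KO2": None, "WT1": 0.5, "WT2": 0.6},
}

site_results = u_per_site(beta, ["KO1", "KO2"], ["WT1", "WT2"])
print(site_results)
```

In R, the equivalent step would be to filter out sites (rows) where either group consists entirely of missing values before calling the test function.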
  • asked a question related to Transcriptomics
Question
2 answers
Hi,
I have raw data from [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array that I want to process using the Expresso function for Affymetrix microarrays.
My samples include tumor tissues and matched adjacent tissues.
I am planning to use the RMA method, which combines RMA background correction + quantile normalization + pmonly + median polish, but it would be great if you could share your experience with me. Which methods would you prefer to combine, based on your statistical experience in this field?
Background correction Options:
  • Affymetrix MicroArray Suite (MAS)
  • Robust Multiarray Analysis (RMA)
  • None
Normalization Options:
  • Quantiles
  • Loess
  • Cubic Spline (Qspline)
  • Invariant set
Probe match correction Options:
  • Perfect-Match (PM) only ["pmonly"]
  • Subtract with Mismatch (MM) ["subtractmm"]
  • Affymetrix MicroArray Suite (MAS) ["mas"]
Values presentation Options:
  • Average Difference ["avgdiff"]
  • Li & Wong (2001) outlier removal ["liwong"]
  • Tukey median polish ["medianpolish"]
Thank you in advance,
Sevcan
Relevant answer
Answer
Thanks to Sevcan Atay for raising this valuable topic. I would be interested to know as well.
  • asked a question related to Transcriptomics
Question
4 answers
As we know, in nucleic acid extraction/purification, using young plant material is better than using old material. It results in better nucleic acid purity because old plant material has a higher sugar and phenolic compound content than younger material.
However, I do not know whether this would affect the transcriptome profile, since the expression of several genes might differ between old and young plant material.
Does anyone have an idea?
Relevant answer
Answer
Hi Mabrur, it is well known that transcriptome expression is highly variable: it is age-specific, time-specific, gender-specific, organ/tissue-specific and environment-specific, and it also differs from individual to individual. The same holds for plants. It all depends on what you are trying to find in the transcriptome and whether the expression of that entity is affected by any or all of the factors above; based on this, decide whether transcriptome mapping or transcriptome assembly is the better choice in your case. You also need to take several other factors into account, such as adopting proper extraction protocols, recovering a good proportion of the part of the transcriptome you are interested in (coding / non-coding / whole), preparing good libraries, using proper transcriptomic controls to judge assay variation, and using a substantial number of replicates (both biological and technical). Only then can you rely on the results of the transcriptome experiments.
Regards...
  • asked a question related to Transcriptomics
Question
10 answers
What do you think about the balance between exploring widely different designs vs. local optimization at different levels of biology (genomics, transcriptomics, proteomics, anatomy, etc.)? Which levels are more or less modular or plastic?
In the endocrine system, for example, one feels that having tropic hormones (i.e., those controlling the release of other signaling hormones at other glands) may offer a finer and perhaps more robust regulation, compared to a being where all hormones were non-tropic. However, the anatomic location of elements in these networks is not trivial. For example, in the renin-angiotensin-aldosterone system, renin is produced in the kidney, and aldosterone eventually exerts its effects in the kidney as well. However, the intermediate step by angiotensin-converting enzyme (ACE) mainly occurs in the lungs, which could introduce a delay in the regulation.
Do we have good explanations for the sites of production and action of different hormones in the body? Are there common principles to be learned as optimized by evolution in this respect? Or are happenstances/contingent evolution stronger determinants?
Thank you for sharing your thoughts!
  • asked a question related to Transcriptomics
Question
7 answers
Dear colleagues!
I am trying to figure out how to do an extraction for soil microorganisms for further metabolomic analyses. I am only interested in the microbiota part, so I would like to discard any organic matter present in the samples.
Do you know any useful method for that? Any suggestions that could help me?
Thanks for your help!
  • asked a question related to Transcriptomics
Question
4 answers
I want to use the Salmon tool to quantify transcripts from different tissues. All the transcriptomes I've found seem to be assembled references.
I think it would be better to find tissue-specific references, as this could make the analysis more robust!
Relevant answer
Answer
Ideally, you should produce your own controls (at least three replicates for each condition plus three replicates of control samples), even if you find fastq files of tissue transcriptomes.
Library preparation and the sequencing itself can introduce biases. So, if all your samples are prepared and sequenced at the same time under the same conditions, they will be more comparable and your results more reliable.
As Amir Vajdi said, if there is no disease, stress or other condition and you want to compare expression between tissues, you can compare one tissue to the other.
  • asked a question related to Transcriptomics
Question
3 answers
I have a list of LIPID MAPS IDs for a bunch of metabolites and I need them all converted to Human Metabolome Database (HMDB) IDs. I have over 300 entries, so I need a way of converting these IDs in bulk, all at the same time. I have tried the Chemical Translation Service, but its LIPID MAPS entries don't seem to be up to date, as it doesn't find matching HMDB IDs for a lot of the LIPID MAPS IDs. Does anybody know of software or a service that allows bulk conversion of LIPID MAPS IDs to HMDB IDs?
Relevant answer
Dear Frankie;
I usually use a trick on the MetaboAnalyst web server (www.metaboanalyst.ca). It doesn't have a direct conversion system; therefore, put the names in the enrichment section. The database will offer you most of the HMDB IDs.
Browse to the homepage
There is a red line " Welcome>> click here to start << "
Then use the "enrichment analysis" section.
Best
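If a two-column mapping table can be obtained from any source, the bulk conversion itself is a simple lookup. The sketch below assumes such a table exists as tab-separated text; the column layout, the function names `load_mapping` and `convert_ids`, and all identifiers shown are placeholders invented for illustration, not real database entries or a real export format.

```python
import csv
import io

def load_mapping(handle):
    """Read a tab-separated table: LIPID MAPS ID <tab> HMDB ID, one pair per line."""
    return {row[0]: row[1] for row in csv.reader(handle, delimiter="\t") if len(row) >= 2}

def convert_ids(lipidmaps_ids, mapping):
    """Split IDs into those with an HMDB match and those without."""
    matched = {lm: mapping[lm] for lm in lipidmaps_ids if lm in mapping}
    unmatched = [lm for lm in lipidmaps_ids if lm not in mapping]
    return matched, unmatched

# Placeholder identifiers standing in for a real mapping file.
table = io.StringIO("LM_EXAMPLE_1\tHMDB_EXAMPLE_A\nLM_EXAMPLE_2\tHMDB_EXAMPLE_B\n")
mapping = load_mapping(table)

query = ["LM_EXAMPLE_1", "LM_EXAMPLE_3", "LM_EXAMPLE_2"]
matched, unmatched = convert_ids(query, mapping)
```

Keeping the unmatched IDs in a separate list makes it easy to resolve the remaining entries by name, for example through the MetaboAnalyst route described above.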
  • asked a question related to Transcriptomics
Question
3 answers
I am trying to optimize a spatial transcriptomics assay, and I have to validate the tissue permeabilization with in-house-printed slides before buying the final ones. However, I am having problems with the Codelink protocols. It seems that I am not printing any probe on the slide, so I cannot see the RNA footprint after tissue permeabilization.
I ordered a biotinylated amine-modified oligo but I cannot detect it, so I think that I am not actually printing. If somebody has used this protocol before, could you please tell me which critical steps I have to pay attention to?
Relevant answer
Answer
Our CRO lab (Arrayjet Ltd.) have successfully printed oligos on Codelink slides for use in spatial transcriptomics projects, mentioned in this Nature Comms paper: https://www.nature.com/articles/ncomms13182
The slides do have a limited shelf-life, so your problem may lie there.
If either of you are still working on this, I'd be pleased to arrange a call/webchat with our Application Scientist and Project Manager who extensively optimised the printing protocol.
I can be reached directly at: shawkings@arrayjet.co.uk or simply drop me a line here and I'll do what I can to help.
Best wishes,
Sam
  • asked a question related to Transcriptomics
Question
5 answers
I have been trying to check the supplementary file for this paper entitled "Transcriptomic and proteomic analyses of the pMOL30-encoded copper resistance in Cupriavidus metallidurans strain CH34" on the journal web page. Unfortunately, the journal didn't provide it. The paper doi number is
Please help me in this regard.
Thanks
Relevant answer
Answer
Frederic Lepretre Probably. By the way, I found the corresponding author's email through a Google search and got the information that I needed.
  • asked a question related to Transcriptomics
Question
3 answers
I have obtained CNV data from the TCGA GDC portal. The data is barcoded and is difficult to understand. I have looked at different annotation tools such as CNVTools, PennCNV and QuantiSNP. Can anyone suggest which tool would be better for annotating CNV data from the GRCh38.p0 genome build?
Relevant answer
Answer
We have used multiple tools to analyze TCGA data. Links have been attached in the paper for easy reproduction. It will help for a good start with TCGA
  • asked a question related to Transcriptomics
Question
4 answers
Can you recommend a good review on methods for transcriptome analysis? In our lab we expand human T cells and magnetically separate the cells that are positive for our target protein. Now we want to compare the transcriptional status of these cells with that of control sets. For this, I'm looking for a good review comparing different transcriptomic techniques (e.g. single-cell RNA-seq, microarrays, high-throughput RNA-seq, etc.) with a special focus on costs, time requirements, advantages and limits. Many thanks and kind regards, Marc
Relevant answer
Answer
Hi colleague
The following URL may help you:
Regards..
  • asked a question related to Transcriptomics
Question
5 answers
We recently identified a novel transcriptional isoform of a gene in the brain. Its endogenous expression is very low compared to the annotated one. Exon 1 of the gene is missing, and a portion of a long terminal repeat (33 bp) is spliced into Exon 2. Thus, the first ATG of this new isoform is found in Exon 6, due to the loss of Exon 1. My questions are: 1. Is the new isoform translated into protein? 2. If not, how can we test whether it is a non-coding RNA? 3. If it is translated, how can we test for the protein it makes?
Thank you very much!
Relevant answer
Answer
northern blot analysis
  • asked a question related to Transcriptomics
Question
2 answers
We are planning to do RNA-Seq for RNA extracted from two types of samples:
  1. Routine snap-frozen mouse fetal tissues
  2. Laser microdissected tissue sections (FFPE sections and/or cryosections)
We only need gene expression profiling, not any deeper data. We are considering Lexogen QuantSeq and Qiagen UPX sequencing. UPX is cheaper, but we are not sure whether it has been applied to this type of sample. Are there other methods worth considering?
Relevant answer
Answer
Informatics analysis would be the limiting factor. If you are going with commercial sequencing, make sure it is included.
  • asked a question related to Transcriptomics
Question
3 answers
I will start a study using peripheral cells from blood samples of individuals infected with the new coronavirus in São Paulo, Brazil. The aim of the project is to perform epigenetic and transcriptomic analyses in these patients. However, it is necessary to inactivate the virus first. For this, we intend to use bioMérieux lysis buffer. Can this inactivation process affect the analysis?
Thank you for your attention.
Relevant answer
Answer
I will also recommend following the recently published article in Nature journal.
  • asked a question related to Transcriptomics
Question
5 answers
I'm working on transcriptomic data from Physcomitrella patens mutants, and would like to check lists of differentially expressed genes for functional clustering, enrichment and so on.
The issue is that I used the genome assembly and annotation from Phytozome, so my gene IDs are not recognized by any GO analysis platform. I also couldn't convert my IDs to any recognizable gene IDs.
For most genes I have Gene Ontology IDs, though.
Is there any platform that allows to start such analysis with GO IDs and not gene IDs?
Thanks to everyone!
Relevant answer
Answer
You can use a simple script that I have written in Python for GO enrichment with a list of input genes:
(just run it in Google Colab)
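The script referred to above is not reproduced on this page. As a rough idea of what a GO enrichment that starts from gene-to-GO annotations involves, here is a minimal Python sketch based on a hypergeometric test. It is not the answerer's script: the function names, the gene names and the GO labels are placeholders, the annotated genes are taken as the background, and no multiple-testing correction is applied.

```python
from math import comb

def hypergeom_upper_tail(k, n_draws, n_success, n_total):
    """P(X >= k) when drawing n_draws genes out of n_total,
    of which n_success carry the GO term."""
    upper = min(n_draws, n_success)
    total = comb(n_total, n_draws)
    return sum(
        comb(n_success, i) * comb(n_total - n_success, n_draws - i)
        for i in range(k, upper + 1)
    ) / total

def go_enrichment(gene2go, de_genes):
    """Return {GO term: (hits, p-value)} for terms found among the DE genes."""
    background = set(gene2go)
    de = set(de_genes) & background
    term2genes = {}
    for gene, terms in gene2go.items():
        for term in terms:
            term2genes.setdefault(term, set()).add(gene)
    results = {}
    for term, genes in term2genes.items():
        hits = len(genes & de)
        if hits:
            p = hypergeom_upper_tail(hits, len(de), len(genes), len(background))
            results[term] = (hits, p)
    return results

# Placeholder annotation: 10 genes, 4 with term A and 6 with term B.
gene2go = {f"g{i}": ["GO:term_A"] for i in range(1, 5)}
gene2go.update({f"g{i}": ["GO:term_B"] for i in range(5, 11)})
enriched = go_enrichment(gene2go, ["g1", "g2", "g3"])
```

With real data, the p-values should be adjusted for the number of GO terms tested (for example with the Benjamini-Hochberg procedure) before interpreting them.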
  • asked a question related to Transcriptomics
Question
8 answers
I carried out a time-course study of kidney fibrosis and evaluated the expression of selected genes using real-time PCR. Gene expression over the time course showed oscillatory patterns in both the sham and treated mouse groups. My question is how to interpret the oscillatory patterns of these genes. I have 5 plots with different oscillatory patterns and I'm not sure how to discuss them.
Relevant answer
Answer
Hi again, Ali Motahharynia
I think you should plot your points with error bars calculated from replicates. Unfortunately, you cannot conclude anything without replicates (at least three per time point).
Now, assuming that your points are the average of some replicates, what exactly is your question? I mean, they look similar in their evolution, but I don't know what you want to know about them. Could you be more specific?
And finally, a piece of advice for the future: please label your plots.
Regards,
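The point about replicates and error bars can be illustrated with a short Python sketch that computes the mean and standard deviation for each time point. The function name `summarize` and the expression values are hypothetical; they stand in for relative expression measured in three replicates per time point.

```python
from statistics import mean, stdev

def summarize(replicates_by_time):
    """Return (time, mean, sd, n) per time point, sorted by time.

    sd is None when fewer than two replicates are available, because
    a standard deviation cannot be estimated from a single value.
    """
    summary = []
    for time in sorted(replicates_by_time):
        values = replicates_by_time[time]
        sd = stdev(values) if len(values) >= 2 else None
        summary.append((time, mean(values), sd, len(values)))
    return summary

# Hypothetical relative expression values, three replicates per day
expression = {
    0: [1.0, 2.0, 3.0],
    7: [4.0, 4.0, 4.0],
    14: [2.5],  # a single replicate: no error bar possible
}
summary = summarize(expression)
```

The mean and sd columns are what would be passed to a plotting function as the point and its error bar; time points with a missing sd show immediately where more replicates are needed.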
  • asked a question related to Transcriptomics
Question
4 answers
We want to run 10x Genomics Visium spatial transcriptomics on lung tissue from COVID-19+ patients. How can we inactivate the virus so that it is safe to work with the tissue at BSL2? The first step is to freeze the tissue in an isopentane/liquid nitrogen bath. After the tissue is placed on the slide, it is incubated in 100% methanol for 30 minutes at -20 °C. Would either of these steps inactivate the virus?
Relevant answer
Answer
Hi,
I think 100% methanol fixation would be useful for eliminating infectious virus. Apart from that, you could try fixing the tissue with 4% PFA or 10% neutral buffered formalin; that would be effective. You may also try working with RNAlater, which should destroy the viral protease.
  • asked a question related to Transcriptomics
Question
3 answers
When you assemble a transcriptome with Trinity, for example, only one final FASTA file is created with the transcriptome. De novo transcriptome assembly does not assemble transcripts for the separate alleles; usually only one transcript is generated, and it is mapped to both alleles. Is there any software that allows assembly, both de novo and with a reference genome, of transcripts for separate alleles?
Relevant answer
Answer
Thanks a lot for your response Karol and Ireneusz!
  • asked a question related to Transcriptomics
Question
5 answers
I have been asked to discover the genetic basis that allows Moloch horridus to drink water through its skin and to change colour. There was no genomic information for this species, so we sequenced, assembled and annotated its genome, structurally (using ab initio and transcriptomics approaches) and functionally (through GO terms with BLAST2GO).
However, we have to use comparative genomics to identify the genes. We thought of using 1-to-1 orthologues because most projects of this kind use them, but if we are comparing closely related species that do not share these traits, I don't see the point in looking for them.
Another doubt I have concerns the study of gene family expansion or contraction and the use of a phylogenetic tree. The last point is about GO term enrichment: I would like to know why it is useful. Thank you so much.
Relevant answer
Answer
Got it. Regrettably, I have not had sufficient time to explore positive selection, only getting as far as installing and implementing PAML, which is a useful tool for such measurements. Although we tried an analysis, we did not use sufficient rigor for me to speak knowledgeably besides to say this seems like a useful way to distinguish orthologs.
  • asked a question related to Transcriptomics
Question
7 answers
Bivalve specimens have been preserved in 70% ethanol for about 6 months. What is the possibility of being able to extract DNA/RNA of bacteria that were previously consumed?
Relevant answer