Microarray Analysis - Science topic

Questions related to Microarray Analysis
  • asked a question related to Microarray Analysis
Question
1 answer
I am performing an Affymetrix microarray analysis to identify differentially expressed genes. I have a list of differentially expressed genes after my analysis; however, some genes are represented by several probe sets. For example, probe sets 209201_x_at, 211919_s_at, and 217028_at all map to CXCR4, with 3 different expression values.
What is an appropriate method for selecting a specific probe set if I want to identify differentially expressed genes? Does averaging the expression values of the probe sets for a single gene work?
Many thanks!
Relevant answer
Answer
This is not that simple; there are various approaches, each with advantages and disadvantages. Maybe these papers will help to solve your problem.
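As a rough illustration of two of the commonly used approaches (averaging all probe sets of a gene, or keeping the probe set with the highest mean expression), here is a minimal base-R sketch. The expression values, the sample names, and the extra probe set "probe_other" with gene "GENE_B" are invented for the example; neither approach is claimed to be the right one for a given dataset.

```r
# Toy expression matrix: rows are probe sets, columns are samples (values invented).
expr <- matrix(
  c(8.1, 8.3, 9.0, 9.2,
    6.0, 6.1, 6.9, 7.0,
    3.2, 3.1, 3.3, 3.0,
    7.5, 7.4, 7.6, 7.7),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("209201_x_at", "211919_s_at", "217028_at", "probe_other"),
                  c("ctrl1", "ctrl2", "trt1", "trt2"))
)
# Probe-set-to-gene mapping; "GENE_B" is a hypothetical placeholder.
gene <- c("CXCR4", "CXCR4", "CXCR4", "GENE_B")

# Approach 1: average all probe sets that map to the same gene.
sums <- rowsum(expr, group = gene)
by_mean <- sums / as.vector(table(gene)[rownames(sums)])

# Approach 2: keep, per gene, the probe set with the highest mean expression.
probe_mean <- rowMeans(expr)
best_probe <- tapply(names(probe_mean), gene,
                     function(p) p[which.max(probe_mean[p])])
by_max <- expr[as.character(best_probe), , drop = FALSE]
rownames(by_max) <- names(best_probe)
```

Either result has one row per gene and can be passed on to the differential expression step.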
  • asked a question related to Microarray Analysis
Question
6 answers
Good day, dear colleagues!
Can I use LogFC values for co-expression analysis?
We study the role of RPOTmp, the dual-targeted (mitochondria and plastids) single-subunit RNA polymerase in plants. For this reason our lab made various transgenic plants with altered expression of RPOTmp and conducted a two-channel DNA microarray experiment.
What I'm trying to find out is whether there are any genes that are co-expressed with RPOTmp, or clusters of genes that are co-expressed in response to the retrograde and anterograde signals caused by altered RPOTmp expression.
So there is probably no sense in performing the enrichment analysis using a table of expression values of the lines and wild type (although nearly every package states that).
Relevant answer
Answer
Dear Igor
This statement "the gene is expressed constitutively, so there are no co-expression nets that have this gene" makes no sense. Constitutively expressed genes can be co-expressed with other genes. For instance, we would expect a high correlation in the expression of a set of house-keeping genes. Maybe you need to take some time to understand the concepts before doing the actual analysis. I think the question is not about co-expression, but whether some specific genes show altered expression in the different genetic backgrounds. I would then employ a linear regression on the normalized expression values. Your hypothesis would be that the expression level of gene Y is dependent on the expression of gene X. You could employ the expression of known co-expressed genes as a continuous variable to estimate Y.
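The regression idea can be sketched in base R as follows. The data are simulated, and the names gene_x and gene_y are placeholders for the normalized (log-scale) expression of the two genes across samples; this is only an illustration of the model, not an analysis of the RPOTmp data.

```r
set.seed(1)
n <- 12                                           # number of arrays (assumed)
gene_x <- rnorm(n, mean = 8, sd = 1)              # simulated log2 expression of gene X
gene_y <- 2 + 0.8 * gene_x + rnorm(n, sd = 0.3)   # simulated log2 expression of gene Y

# Test whether the expression of Y depends linearly on the expression of X.
fit <- lm(gene_y ~ gene_x)
slope <- coef(fit)["gene_x"]
p_value <- summary(fit)$coefficients["gene_x", "Pr(>|t|)"]
```

Further known co-expressed genes could be added as additional continuous terms on the right-hand side of the formula.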
  • asked a question related to Microarray Analysis
Question
4 answers
" Error in `.rowNamesDF<-`(x, value = value) : invalid 'row.names' length "
This error appeared when I was trying to download the GEO series data (gset) in R; it seems there is some problem.
> gset <- getGEO("GSE77182", GSEMatrix =TRUE, AnnotGPL=FALSE, destdir ="data/")
Found 1 file(s)
GSE77182_series_matrix.txt.gz
Using locally cached version: data//GSE77182_series_matrix.txt.gz
Rows: 59899 Columns: 6
-- Column specification ---------------------------------------------
Delimiter: "\t"
chr (1): ID_REF
dbl (5): GSM2045612, GSM2045615, GSM2045616, GSM2045618, GSM2045620
i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.
Using locally cached version of GPL21369 found here:
data//GPL21369.soft
Error in `.rowNamesDF<-`(x, value = value) : invalid 'row.names' length
Relevant answer
Answer
Simran Venkatraman, Premanand A., Jagajjit Sahu: thanks a galaxy, dear friends.
  • asked a question related to Microarray Analysis
Question
4 answers
I have a dataset with 149 columns of GSM IDs, with gene names in the first column (screenshot attached). There are 20,000 rows (genes) in total. How can I analyze the dataset to find the biological pathways using KEGG or another pathway database?
All GSM IDs are lung cancer microarray expression data.
I know how to do differential gene expression and pathway analysis, but I don't know how to analyze this type of dataset.
Please also comment if you feel that the dataset is not correct, cannot be used to find pathways, or if other information is required.
Any help would be great. Thanks in advance.
Relevant answer
Sir can you share a tutorial or workbook or something like that. I can work in R, Python and Linux.
Thanks
  • asked a question related to Microarray Analysis
Question
4 answers
For PCR analysis and western blot, is there a minimum required average expression level of the DEGs? For example, more than 200 or 500? I have received various answers to that question and I am a little confused.
Relevant answer
Answer
For PCR an expression value of > 200 is sufficient, I think. Here it is not a question of Fold Change (FC): The same FC with an expression value of 10 vs. 20 would be hardly detectable with PCR or other techniques. With expression values of 200 vs. 400, this is surely much better (identical FC of 2).
  • asked a question related to Microarray Analysis
Question
1 answer
I have microarray data and I am using R to analyse it. It is cancer data with several time points:
5 samples for t0
5 samples for t1
4 samples for t2
3 samples for t3
All are from a disease model. I want to do a microarray analysis to find the differentially expressed genes at each time point compared with t0, taking t0 as the reference for each time point.
So how can I make the contrast matrix for that?
Relevant answer
Answer
What software are you planning to use for the analysis? How you write the contrasts depends on that. If you plan to use "limma" (https://www.bioconductor.org/packages/devel/bioc/vignettes/limma/inst/doc/usersguide.pdf), read the user manual; it is easy to understand. As long as you have 2 or more replicates you can do differential expression.
Regards,
Shajahan
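For the sample layout in the question (5, 5, 4 and 3 samples for t0 to t3), the design and the "each time point versus t0" contrasts could be set up roughly as below. This sketch uses base R only and assumes the samples are ordered by time point; with limma, the design would be passed to lmFit() and the contrast matrix is what makeContrasts() would be used to build, but those steps are not run here.

```r
# Time point of each sample, with t0 as the first (reference) level.
timepoint <- factor(rep(c("t0", "t1", "t2", "t3"), times = c(5, 5, 4, 3)),
                    levels = c("t0", "t1", "t2", "t3"))

# Option A: treatment coding. Each non-intercept coefficient is already
# "that time point minus t0", so no separate contrast matrix is needed.
design <- model.matrix(~ timepoint)
colnames(design)  # "(Intercept)" "timepointt1" "timepointt2" "timepointt3"

# Option B: one column per time point plus an explicit contrast matrix.
design0 <- model.matrix(~ 0 + timepoint)
colnames(design0) <- levels(timepoint)
contrast_matrix <- cbind(t1_vs_t0 = c(-1, 1, 0, 0),
                         t2_vs_t0 = c(-1, 0, 1, 0),
                         t3_vs_t0 = c(-1, 0, 0, 1))
rownames(contrast_matrix) <- levels(timepoint)
```

Both options test the same comparisons; option A is shorter, option B makes the comparisons explicit.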
  • asked a question related to Microarray Analysis
Question
3 answers
I have a question about "network score" of IPA Network analysis. In many papers, the top 5 networks were listed in tables, while in these tables some network scores are high (around 50), but others are low (less than 20). We use the same method for network analyses, and got the impression that we can see tight association between genes when "the network score" is higher than 40. However, we have not found literature discussing the meaningful "network score" (we found one paper described that “the networks are selected if their score is higher than 21”). We would appreciate it if you could let us know information about such a meaningful network score or your impression/experience of the network score (for example, did you see tight association of genes when the network score was less than 20?).
Relevant answer
Answer
Anyone have the actual citation for these? They are dead links all of them!!
Kind Regards
R,
  • asked a question related to Microarray Analysis
Question
1 answer
I tried to find a single dataset containing all of them in GEO but couldn't. What if I take separate datasets for the breast cancer cell lines MCF7, MDAMB321 and SKB3R, extract the data for the gene I want to check (HK2), and do the microarray analysis in RStudio to check the differential expression of HK2 among the cell lines?
Relevant answer
Answer
As far as I know, there is no data portal for GEO allowing the simultaneous analysis of multiple microarray datasets fitting particular criteria (such as breast cancer cell line). You will probably have to identify the datasets of interest, download the microarray data, then apply an R package for microarray analysis on each dataset, then analyse the combined results. GEO does offer a web analysis tool for many microarray datasets though (GEO2R) which runs the R code for you.
  • asked a question related to Microarray Analysis
Question
4 answers
Dear fellow Researchers,
I am currently trying to analyze Affymetrix microarray data with the dChip software. I have the input files (probe sequence and CDF for Rat 230 2), yet I am having trouble obtaining the expected results. Could anyone please tell me whether the gene info file is really necessary (only the CDF input is mentioned as mandatory in the protocol I have) and where to obtain it?
Thank you in advance
Relevant answer
Answer
You can search for your question through the following link:
  • asked a question related to Microarray Analysis
Question
1 answer
I would like to perform a differential gene expression analysis of NimbleGen data. I have a dataset of 48 .pair files and 48 .calls files.
1) Can I perform the DE gene analysis with only these data, without using the oligo package? (My data are single-channel, with 532 output only.)
2) What is the appropriate method for finding differentially expressed genes?
3) When I converted my pair files to xys files by extracting the X, Y and signal values, the results were not accurate. The genes that were shown to be DE do not correlate with my experimental conditions.
Please help me 
Thank you in advance
Best Regards
Tunc
I also want to add: our pair and calls files don't have a header, which is why we don't know anything about the NDF file. We do know that our chip is 100718.hg18, but we don't know the correct GPL file. The lab methods report that a NimbleGen Human Expression Array 12x135K chip was used.
Relevant answer
Answer
Hi Tunc Morova , I am facing a similar issue right now, analysing the PAIR files. Did you have any luck?
  • asked a question related to Microarray Analysis
Question
3 answers
Does anyone know of a source for microarrays to study tRNA expression? I was told microarrays.com provided these. I have asked them directly but no reply so far.
Relevant answer
Answer
tRNA microarrays were initially developed to assess the aminoacylation status of specific tRNAs. Although it is true that tRNA deep-seq techniques have problems (mainly due to modified bases), microarrays also present serious limitations in terms of detection sensitivity. My advice would be to use one of the several strategies for tRNA library preparation and go for small RNA sequencing.
  • asked a question related to Microarray Analysis
Question
8 answers
On a given microarray design there are multiple different probes spotted for many genes. The (normalized) signals of the features (all referring to the same gene) often are quite different (log2 values can vary between 2 and 16, so essentially from "almost undetectable" to "completely saturated").
If a gene set analysis or an over-representation analysis is performed, there should be one value per gene.
How should I select which signal to use for the gene? I don't feel comfortable taking the average of all the features, because they are often so different. Taking only the highest signal also seems wrong.
Any ideas?
The attached file shows a table with example data (from an Agilent microarray) with 5 different probes addressing the gene "PRDM". The last 4 columns show the log signal intensities for 4 different samples. The values range from 3 to 10, so there is a more than 100-fold difference in signal intensity between the probes.
Relevant answer
Answer
https://www.networkanalyst.ca/. NetworkAnalyst software may help you.
  • asked a question related to Microarray Analysis
Question
10 answers
Hello,
I'm trying to analyse two different microarray datasets from different chips using a web-based tool.
I don't know how to do that. Should I use only one of them?
Or should I combine them using some kind of algorithm?
Thank you
Relevant answer
Answer
You may use NetworkAnalyst software.
  • asked a question related to Microarray Analysis
Question
5 answers
Does anybody use a specific software (free) for the densitometric analysis of protein array data, or do you know how to add this tool in ImageJ?
Thanks
Relevant answer
Answer
  • asked a question related to Microarray Analysis
Question
5 answers
I have analyzed dataset GSE38132 from the Gene Expression Omnibus. The data are from the breast cancer cell line ZR-75-1 and comprise 9 conditions with 4 replicates each, making 36 samples. I used the limma R package to normalize the data (quantile normalization). I noticed a large discrepancy in one group, where three replicates show similar expression while the 4th replicate of the same sample differs from the other three. I confirmed this in the raw data by cross-checking the probe IDs and again found that 1 replicate differs from the other three. As a second validation I checked the normalized data deposited in NCBI GEO and found the same. Is this possible?
Please see the heat map; the first four columns are replicates from the same sample.
Relevant answer
Answer
It is likely that the authors did not even try to check the sample and hybridization quality. To assess the sample quality you would need to have a look at the lab books or talk with the people who actually did the processing (might not be possible!). To assess the hybridization quality you need to create diagnostic plots of the raw data. I am not too deep into Illumina arrays. Boxplots of signal intensities, possibly stratified by probe type (pos/neg controls, regular genes), should be a minimum. I don't know if spatial plots are possible and if unspecific/background signals are considered.
  • asked a question related to Microarray Analysis
Question
3 answers
I have two batches of samples which were collected during two time periods. If I perform batch correction to remove the batch effect, will it affect the downstream gene expression analysis?
Relevant answer
Answer
Batch correction has been described from a biologist's point of view in this paper: Unbiased data analytics for biomarker discovery in precision medicine.
In brief, it is better to have some internal controls and use the ComBat algorithm. If you don't have internal controls, there are algorithms for that case too. See the paper.
  • asked a question related to Microarray Analysis
Question
3 answers
Generally, in microarray differential expression studies the lower bound for |logFC| is chosen to be around 1, which corresponds to a fold change of 2 and sounds like common sense. In other cases, when |logFC| >= 1 gives zero differentially expressed genes, the logFC threshold is chosen so as to get a "reasonable" number of differentially expressed genes. It stands to reason that a more rational way of choosing the logFC threshold would be to infer it from the microarray platform's accuracy, the quality of the hybridizations in the particular microarray experiment, or some other evidence-based criterion.
How should one decide which logFC threshold to choose?
Relevant answer
Answer
The fold change or log-fold change can be used as a measure of effect size in high-throughput experiments, including gene expression analysis. However, the statistical significance obtained from replicates is another important number. One can use multiple-testing-adjusted p-values or false discovery rate values, and it is then often advisable to plot everything as a volcano plot, where the log-fold change of each gene is on the x-axis and the -log10 of the (adjusted) p-value is on the y-axis. I recommend the limma package in R to do such an analysis.
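A minimal sketch of such a volcano plot in base R is shown below. The data are simulated, and an ordinary per-gene t-test stands in for limma's moderated test, so this illustrates the plot and the combined use of effect size and adjusted p-value rather than the recommended limma workflow. The cut-offs (|logFC| >= 1, adjusted p < 0.05) are conventional assumptions, not values from the answer.

```r
set.seed(42)
n_genes <- 200
group <- factor(rep(c("control", "treated"), each = 4))

# Simulated log2 expression; the first 10 genes get a true shift of 3.
expr <- matrix(rnorm(n_genes * 8, mean = 8), nrow = n_genes)
expr[1:10, group == "treated"] <- expr[1:10, group == "treated"] + 3

# Effect size: difference of group means on the log2 scale.
log_fc <- rowMeans(expr[, group == "treated"]) - rowMeans(expr[, group == "control"])

# Per-gene t-test, then Benjamini-Hochberg adjustment for multiple testing.
p_val <- apply(expr, 1, function(v)
  t.test(v[group == "treated"], v[group == "control"])$p.value)
adj_p <- p.adjust(p_val, method = "BH")

# Genes passing both the effect-size and the significance cut-off.
hits <- which(abs(log_fc) >= 1 & adj_p < 0.05)

if (interactive()) {
  plot(log_fc, -log10(adj_p), pch = 20,
       xlab = "log2 fold change", ylab = "-log10 adjusted p-value")
  abline(v = c(-1, 1), h = -log10(0.05), lty = 2)
}
```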
  • asked a question related to Microarray Analysis
Question
16 answers
Hello everyone,
Currently I am trying to do k-means clustering on a microarray dataset which consists of 127 columns and 1000 rows. When I plot the result, I get the error "figure margins too large". Then I typed this in the R console:
par("mar") # shows the current margins
par(mar=c(1,1,1,1)) # tried to update the margins
But it did not work. Can anyone suggest another way of fixing this problem? (Part of the code is attached below.)
Thanks,
Hasan
--------------------------------------------------------------------------------------------------------------
x = as.data.frame(x)
km_out = kmeans(x, 2, nstart = 20)
km_out$cluster
plot(x, col=(km_out$cluster+1), main="K - Means Clustering Results with K=2", xlab="", ylab="", pch=20, cex=2)
>Error in plot.new() : figure margins too large
Relevant answer
Answer
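One likely cause, offered as an assumption since the full data are not shown: calling plot() on a data frame with more than two columns draws a scatterplot matrix, and with 127 columns the panels cannot fit in the device, whatever the margins. A possible workaround is to plot only two dimensions, for example the first two principal components, coloured by cluster. The sketch below uses simulated data of the same shape as in the question.

```r
set.seed(1)
# Simulated stand-in for the 1000 x 127 dataset.
x <- as.data.frame(matrix(rnorm(1000 * 127), nrow = 1000))

km_out <- kmeans(x, centers = 2, nstart = 20)

# Reduce to two dimensions for plotting.
pca <- prcomp(x)
scores <- pca$x[, 1:2]

if (interactive()) {
  plot(scores, col = km_out$cluster + 1, pch = 20,
       main = "K-means clustering results with K=2",
       xlab = "PC1", ylab = "PC2")
}
```

Plotting any two chosen columns, e.g. plot(x[, 1:2], ...), would avoid the error in the same way.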
  • asked a question related to Microarray Analysis
Question
11 answers
When examining the differential expression of non-coding RNAs from blood samples, cell cultures or animal models, how many times should we repeat the microarray experiments?
Does the number of repetitions change according to the type of sample? For example, is it established that we should repeat the experiments at least 3 times in a cell culture model to identify non-coding RNA expression profiles?
Relevant answer
Answer
I think that if a particular study has already been done quite a few times and you just want to check some of the facts, then you can choose fewer replicates. In addition, it is important to find the truth first and then consider the numbers.
  • asked a question related to Microarray Analysis
Question
6 answers
Hi all,
I am running a Random Forest Mean Decrease in Accuracy (RF-MDA) algorithm for feature selection on my microarray data in order to use the selected genes as a classifier to discriminate between 2 classes of cell lines. I am having problems interpreting the output given by the algorithm. It gives me a small list of selected genes, and for each gene there is a Pearson correlation value, a fold change value and a q-value (false discovery rate).
The variable “class” is discrete (normal vs disease), so what does the Pearson correlation mean in this case?
Should I take the q-value showed as a multiple test correction and give less importance, or exclude, the genes that showed a q-value (FDR) higher than 0.2 (or any other pre-determined cut-off for significance)?
I would appreciate any suggestion on how to interpret the results of a RF-MDA for feature selection algorithm.
Relevant answer
Answer
Hello Priscila,
Since you mentioned using a GUI and not having a manual, you may try contacting the developers for the best explanation (and possibly have a look at the source code) to be sure exactly what the Pearson coefficients and q-values relate to. In the meantime, you may have a look at the following studies, where the importance of various correlation factors in RF-MDA is explained:
Thanks
  • asked a question related to Microarray Analysis
Question
3 answers
I'm looking to take an Arabidopsis RNA-Seq differentially expressed gene set and search it against other publicly available RNA-Seq (and possibly microarray) experiments to find the experiments that found the most similar patterns of differentially expressed genes.
Does anyone know whether a tool that enables this has already been created?
Relevant answer
Answer
You can check out following two resources for gene expression analysis:
Another option is to browse the Arabidopsis datasets on GEO (https://www.ncbi.nlm.nih.gov/geo/). It hosts most publicly available datasets and predominantly stores raw files from different transcriptome experiments. Once you have listed the GEO IDs of interest, you can analyze these datasets yourself at the click of a button using the GEO2R portal (https://www.ncbi.nlm.nih.gov/geo/geo2r/). It takes barely a minute, and you can store the result of each analysis you do. You do not need a computational background to run it. It's as simple as plug and play.
I hope it helps !!
All the best.
  • asked a question related to Microarray Analysis
Question
3 answers
Hello,
I am trying to normalize the data of GSE8397 with MAS5.0 using R:
setwd("D:/justforR/GSE8397")
biocLite()
library(affy)
affy.data = ReadAffy()
However, the dataset uses 2 platforms: Affymetrix Human Genome U133A and Affymetrix Human Genome U133B Array.
The code gave me the error message: "Error in affyio::read_abatch(filenames, rm.mask, rm.outliers, rm.extra, :
Cel file D:/justforR/GSE8397/GSM208669.cel does not seem to be of HG-U133A type"
So, how can I normalize the data when they come from both U133A and U133B? Should I try another normalization method (RMA or GCRMA)?
Do you have any ideas about this problem?
Thank you so much!
Relevant answer
Answer
Hi Phung,
I guess it's impossible to analyze these two types of Affymetrix arrays together with simple command lines. In fact, U133A and U133B share only 168 probes among the 22k probes in both designs. Take a look at this post (https://www.biostars.org/p/283639/). The two designs are sold to be used as complementary arrays and can't be compared.
fred
  • asked a question related to Microarray Analysis
Question
4 answers
For example, I have downloaded a .cel file. How can I now get the data on upregulation and downregulation? And what is the principle behind these values?
Relevant answer
Answer
It's been a while since I used this. If you are referring to microarray data, here is my script for HuGene-1_0-st-v1 arrays (easily modifiable to other versions). I'm using R. Just download all .CEL files to a folder, and set this folder as your working directory.
library("oligo")
library("pd.hugene.1.1.st.v1")
celFiles <- list.celfiles()
affyRaw <- read.celfiles(celFiles)
eset <- rma(affyRaw, background=TRUE, normalize=TRUE, subset=NULL, target="core")
library(annotate)
library(hugene11sttranscriptcluster.db)
annodb <- "hugene11sttranscriptcluster.db"
ID <- featureNames(eset)
Symbol <- as.character(lookUp(ID, annodb, "SYMBOL"))
Name <- as.character(lookUp(ID, annodb, "GENENAME"))
Entrez <- as.character(lookUp(ID, annodb, "ENTREZID"))
theProbes <- exprs(eset)
df <- cbind.data.frame(theProbes, ID, Symbol, Name, Entrez )
write.table(df, "something.csv", sep=";" ) # Nice to have a spreadsheet of expression values
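Now you can use limma.
# pData: make a .csv file with the phenotypic data that is described with the dataset, and read it in (e.g. with read.csv()).
library(limma)
affy <- read.table(file="something.csv", sep=";", header=TRUE, row.names=1)
library("Biobase")
# Keep only the expression columns; the last four columns hold the annotation (ID, Symbol, Name, Entrez).
exprsMat <- as.matrix(affy[, seq_len(ncol(affy) - 4)])
eset <- ExpressionSet(assayData=exprsMat)
Make the design as you wish, e.g. (timepoint and ID being columns of pData):
design <- model.matrix(~ timepoint + ID, data=pData)
fit <- lmFit(eset, design)
fit <- eBayes(fit)
topTable(fit, coef="something", adjust="BH", n=Inf, confint=TRUE)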
  • asked a question related to Microarray Analysis
Question
3 answers
I am trying to perform miRNA microarray analysis for a human cell line. Though the samples have good A260/A280 ratio (2 and above) and good RIN (most above 8) the A260/A230 ratio is low in many of the samples (as low as 0.3 in some samples).
So the question is, is it wise to proceed with these samples? can the A260/A230 ratio affect the quality of the microarray (I will use Agilent microarray chip)?
PS: The analysis will be done with a company and not in-house. Qiagen miRNeasy kit was used to extract the RNA, a DNASE treatment step was included.
Relevant answer
Answer
Dear Sherif
There is no need to bother about A260/A230 ratio when you have a good RIN value. If your miRNA purity and concentration is good you can proceed for sequencing. I have got a good miRNA microarray result even from FFPE tissue which got poor RIN value. All the best.
  • asked a question related to Microarray Analysis
Question
5 answers
I am planning to extract the whole serum protein from serum samples for further microarray analysis. What are your recommendations, please? A good protocol would be highly appreciated.
Thanks in advance
Relevant answer
Answer
Dear Pranita Kamble Waghmare,
Thank you so much for your answer. It is really helpful.
  • asked a question related to Microarray Analysis
Question
13 answers
Hi there
I would like to get your feedback about your favourite gene-enrichment analysis software with a graphical environment, preferably online.
Please let me know what you like most! Thanks
All the best
Paco
Relevant answer
Answer
Dear Francisco,
For quick GO enrichment in multiple gene lists at once, I use g:Cocoa from g:Profiler:
Its main problem is that usually the analysis goes too deep into the GO annotation and the results might be hard to "digest".
G:Cocoa also allows other enrichment analyses such as KEGG pathways.
For broad functions enrichment I use GOTermMapper:
Another useful online tool for GO enrichment is Metascape:
To analyse protein interactions / genes network from a list of genes I use Genemania:
Best,
Gautier
  • asked a question related to Microarray Analysis
Question
5 answers
Can anyone suggest a collection of numeric variables for deep learning? I have a set of features (DEGs with their fold change values) from a microarray. I want to prepare a training set as well as a test set for deep learning. For this, at least three numeric variables are needed. Any suggestions?
Thanks in advance
Relevant answer
Answer
Machine learning works on a simple rule – if you put garbage in, you will only get garbage to come out. By garbage here, I mean noise in data.
This becomes even more important when the number of features is very large. You need not use every feature at your disposal for creating an algorithm. You can assist your algorithm by feeding in only those features that are really important. I have myself witnessed feature subsets giving better results than the complete set of features for the same algorithm. Or as Rohan Rao puts it – “Sometimes, less is better!”
Not only in the competitions but this can be very useful in industrial applications as well. You not only reduce the training time and the evaluation time, you also have less things to worry about!
Top reasons to use feature selection are:
  • It enables the machine learning algorithm to train faster.
  • It reduces the complexity of a model and makes it easier to interpret.
  • It improves the accuracy of a model if the right subset is chosen.
  • It reduces overfitting.
The following methods can be used for feature selection:
  • a. Filter methods (like LDA, ANOVA, chi-square, Pearson correlation)
  • b. Wrapper methods (forward selection, backward selection, recursive feature elimination methods)
  • asked a question related to Microarray Analysis
Question
16 answers
Happy Sunday everyone,
I am trying to calculate the row-wise mean and variance in R, and then I will sort by them. I used the "Absent/Present" calls from the Affymetrix algorithm to flag genes with questionable expression levels, so there are many NAs in the data frame. I have to leave out the values with questionable expression levels (the NAs) when doing the mean and variance calculations. What I did is this:
library(data.table)
dat <- as.data.table(df)
rowvar <- function(x, na.rm =F) rowSums((x - rowMeans(x, na.rm=T))^2, na.rm=T)/(rowSums(!is.na(x)) - 1)
dat[,`:=` (variance = rowvar(.SD, na.rm = T), mean = rowMeans(.SD, na.rm = T))]
But it gives the error "Error in rowMeans(x, na.rm = T) : 'x' must be numeric". How can I handle this error?
I have attached the document that I am currently working on.
Thank you for your interest.
Hasan,
Relevant answer
Answer
@Hasan, I also recommend stackoverflow.com for all your R questions, it is by far the best and most up to date forum.
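On the error itself: "'x' must be numeric" usually means that at least one column passed to rowMeans() is not numeric, for example a probe ID or gene symbol column. That is an assumption here, since the attached file is not available. A minimal base-R sketch that restricts the calculation to numeric columns, using an invented toy data frame with a "probe" column:

```r
# Toy data frame: one ID column plus three sample columns with some NAs.
df <- data.frame(
  probe = c("p1", "p2", "p3"),
  s1 = c(1, NA, 5),
  s2 = c(3, 4, NA),
  s3 = c(5, 6, 7),
  stringsAsFactors = FALSE
)

# Select only the numeric columns before computing row statistics.
num_cols <- vapply(df, is.numeric, logical(1))
m <- as.matrix(df[, num_cols])

df$mean <- rowMeans(m, na.rm = TRUE)
df$variance <- apply(m, 1, var, na.rm = TRUE)

# Sort by variance, largest first.
df_sorted <- df[order(df$variance, decreasing = TRUE), ]
```

With data.table, the same restriction could be expressed through the .SDcols argument.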
  • asked a question related to Microarray Analysis
Question
29 answers
Hello everyone,
I have an Excel spreadsheet which contains 17 columns and 54,675 rows, and I need to calculate the mean and variance of each of the 54,675 rows in RStudio. After that, I have to add each row's mean and variance as new columns in the Excel spreadsheet. How can I do this? I suppose the apply() function would work, but somehow I could not get it to. Any suggestions?
Thanks,
Hasan
Relevant answer
Answer
If you need it in Excel, why don't you do it in Excel?
If R, you get the row means with rowMeans(). To get the variances you will have to apply() the function var() to the rows. Here is an example code, assuming that the data is in a 54675x17 data.frame or matrix "df":
rm <- rowMeans(df, na.rm=TRUE)
rv <- apply(df, MARGIN=1, FUN=var, na.rm=TRUE)
Both, rm and rv, will be numerical vectors of length 54675. You can save them as a csv and import it in Excel. You can also add them as new columns to df (df <- cbind(df, "Mean"=rm, "Variance"=rv)) and save the entire df object. If you use the package openxlsx, you can use write.xlsx(df, file="new filename.xlsx") to save it as an xlsx file.
  • asked a question related to Microarray Analysis
Question
4 answers
I want to plot the expression trend of the genes I have, and to do so I want to know whether the samples in a GEO series matrix (I mean GSMxxxxxx) are comparable with each other (i.e. whether something like normalization has been applied to them) or not. Also, what is the meaning of the caution in GEO that I attached to my question?
thanks
Relevant answer
Answer
Not directly. They might be comparable to a certain degree after a common normalization, but even then it is important to consider the experimental differences (cell types/extraction, passage numbers, RNA isolation, amplification and labelling procedures). It is very dangerous to interpret differences between groups that are taken from different experiments/series, as there might be serious confounding of the experimental conditions with the groups.
  • asked a question related to Microarray Analysis
Question
3 answers
Hello everyone,
I have to analyse data from an Affymetrix microarray (Human Genome U133 Plus 2.0 Array) with Bioconductor, and it is the first time I am using Bioconductor. I got the .cel files from NCBI GEO but I could not get the chip description file. How can I obtain a CDF?
One more thing: when I check the number of genes in my dataset, R shows that it contains 54,675 genes. However, this number should be between 20,000 and 25,000. So I am wondering whether there might be replicates among them?
Any suggestions and someone can help please?
Thanks,
Hasan
Relevant answer
Answer
Thank you for your answer and time Dr. Joachim. The link that you send helped me a lot and as you said the number 54675 probe set is true as well. Thanks again.
  • asked a question related to Microarray Analysis
Question
2 answers
I am trying to design a probe for microarray analysis. For the control genes I prefer constitutively expressed house-keeping genes. It would be helpful if anyone could guide me in designing a probe and primers for a particular gene, for example 23S rRNA. Manuscripts state the sequence and the corresponding primer pairs but give no description of the methodology involved in designing them. Could anyone help me with this?
Relevant answer
Answer
Thank you Marcus for your suggestion.
  • asked a question related to Microarray Analysis
Question
6 answers
Hello.
I have been working on transcriptomics data from NCBI's GEO for a while now. However, I have recently become aware of this phenomenon [see image attached]:
- When plotting the probes (y-axis) against the subjects (x-axis), the heatmap shows a very large area (around 10,000 probes) with an intensity lower than its vicinity, both for control and for diseased individuals;
- This effect is also (more clearly) visible in the other image, which shows that for around 10,000 probes (x-axis), the intensity (y-axis) is lower than the average intensity.
My question is: do you have any idea what this could be due to? Is this "area" of lower intensity a common thing in microarray analysis? Should I exclude this area from the analysis?
Thank you in advance
Relevant answer
Answer
It seems you did not scale your data when you plotted the heatmap, and you did not use hierarchical clustering to show the relationship between your samples. So your heatmap shows the absolute expression, which is low for many probes, rather than the relative expression after scaling.
Such a block of low expression is not common in microarray data, although it occurs very often in RNA-seq. Your data appear to be microarray data, so the best way is to obtain the CEL files or other raw data from the authors and repeat the normalization process yourself.
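The scaling step mentioned above can be sketched in base R as follows. The matrix is simulated, with half of the probes at a low absolute level, so that the raw values would show a "dark band" while the row-scaled values would not; heatmap() applies hierarchical clustering to rows and columns by default.

```r
set.seed(7)
# 50 probes x 6 samples; probes 1-25 are low (mean 4), probes 26-50 high (mean 10).
expr <- matrix(rnorm(50 * 6, mean = rep(c(4, 10), each = 25)), nrow = 50)

# Row-wise z-scores: each probe is centred and scaled across samples,
# so the plot shows relative rather than absolute expression.
scaled <- t(scale(t(expr)))

if (interactive()) {
  heatmap(scaled, scale = "none")  # already scaled, so disable heatmap's own scaling
}
```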
  • asked a question related to Microarray Analysis
Question
5 answers
I am working on microarray data for some genes I am interested in. I want to know the expression levels of these genes in different cell types. People have deposited data in duplicates or triplicates for each cell type, and all the microarray experiments were done on the same kind of microarray chip. Now I have to get a mean value for each cell type and compare it with the other cell types. So far I have normalized the data using the gcrma package. I think I now have to normalize the values against a house-keeping gene and proceed further. I am not sure how to proceed. I need help. Please guide me through this.
I also have to find the gene co-expression values for all the gene pairs. I calculated the Pearson correlation coefficient (PCC) from the normalized values from all the datasets. Is that okay, or do I have to calculate the PCC only after normalizing the values against a house-keeping gene? Please help me. Thank you
Relevant answer
Answer
ComBat is a useful tool for correcting batch effects. Alternatively, you can simply adjust for batch in your model without ComBat.
  • asked a question related to Microarray Analysis
Question
9 answers
I need to compare a gene's expression between tumor site and matched normal tissue from the TCGA database. I've tried using Firehose to search for differential expression of the gene among different types of cancers. The problem is that the number of tumor samples is not equal to the number of normal samples, but I need to compare matched tumors and normal tissues. Are there any tools to do that?
Relevant answer
Answer
Try xena.ucsc. You will fall in love with it.
  • asked a question related to Microarray Analysis
Question
12 answers
I want to convert expression level values to z-scores ((x - mean)/sd). I have two types of samples in my microarray (Affymetrix GeneChip Human Genome U133 Plus 2.0) (31 normal vs 30 case). Do I have to calculate the mean and sd for the normal samples only, use the z-score formula, and then do the same for the case samples, or do I have to find the mean and sd across all samples?
Relevant answer
Answer
Mean and sd are calculated using all 61 samples.
Make sure you use log(signals) to calculate the z-scores.
Note that z-scores are usually calculated to show co-regulation patterns or to identify samples with differing (outlying) profiles in heatmaps. If you just want to identify the genes with the best statistical evidence for differential expression between the groups, then you may simply run t-tests (again, using the log(signal) values).
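As a minimal sketch of the calculation described above (assuming raw, positive signal intensities for a single probe set; the function name and the numbers are made up for illustration), the z-scores can be computed from the log2 signals using the mean and sd over all samples together:

```python
import math
import statistics


def gene_z_scores(signals):
    """Z-scores of log2(signal) for one gene, using mean and sd over ALL samples."""
    logged = [math.log2(s) for s in signals]
    mean = statistics.mean(logged)
    sd = statistics.stdev(logged)  # sample sd (n - 1 in the denominator)
    return [(x - mean) / sd for x in logged]


# Hypothetical intensities for one probe set: 3 normal samples, then 3 case samples.
normal = [100.0, 120.0, 90.0]
case = [400.0, 380.0, 450.0]
z = gene_z_scores(normal + case)
```

With the real data the same function would be applied to each probe set across all 61 samples.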
  • asked a question related to Microarray Analysis
Question
4 answers
When validating microarray or RNA-seq expression data, do we need to select the genes randomly, or can we choose the ones that matter to us?
Relevant answer
Answer
It depends on what your aim is. If you really want to validate the overall screening result, you should select genes at random.
If you just want to confirm that the genes you aim to continue working on are regulated as estimated from the screening, you should use these genes. But this is still only a technical validation. If there are not too many interesting candidate genes, and if possible, you should go for a biological validation using appropriate experimental assays to determine the biological role of the regulation (knock-in, knock-out, inhibitors, enhancers).
  • asked a question related to Microarray Analysis
Question
2 answers
Analysis of data
Relevant answer
Answer
Thanks mam. I have received data from DMET analysis. How will I present the data in various representations? What all software and applications will help in the same?
  • asked a question related to Microarray Analysis
Question
12 answers
Hi experts,
RNA-seq with NGS technology is changing gene expression studies and has great advantages, yet we still see a lot of studies using microarray techniques (e.g. Affymetrix Gene Atlas, etc.) and even qPCR (to a certain extent).
I personally believe in, and am biased towards, NGS technology and RNA sequencing for gene expression studies. Not only that, RNA-seq has the ability to discover novel gene transcripts, which can open up a potential new field of study.
RNA-seq can be costly, but I personally believe that in the end it's better than microarray. So in what instances can I say that microarray is better than RNA-seq? I am working with primary cells, cell lines, and mouse as my animal model for brain-related studies.
I am looking forward to hearing your opinion.
Relevant answer
Answer
RNA-Seq is a powerful tool if you're trying to detect novel transcripts/splice forms, go on an unbiased "fishing trip" for genes/biomarkers, detect extremely rare transcripts, or look at changes in transcript abundance that occur over a very wide dynamic range. 
However, if you're interested in studying the expression of a known panel of transcripts and none of them are expressed at an extremely high or low level, a well-designed microarray will work just as well and cost less. Microarrays can be used for most of the same experiments as targeted PCR primer sets for RNA-Seq.
  • asked a question related to Microarray Analysis
Question
4 answers
Hi!
Does anybody know a programme/software/website to make heatmaps without using the R language?
I have a set of 3099 genes up regulated and a set of 2686 genes down regulated under my unique experimental condition and I would like to compare them.
Thanks a lot!
Relevant answer
Answer
Hi,
I have never used it before, but I have seen good heatmaps from this website:
Furthermore, you can use Excel to make heatmaps; I think that you need to assign a color grade to a value range:
Good Luck!!
I agree with Thomas: it is worth learning R, as it is a great tool for bioinformatics analysis and figures.
  • asked a question related to Microarray Analysis
Question
1 answer
So, basically there are abundance values for my first 3 sets of experiment that have variable control values for all peptides within each protein. How do I make use of this data to get significant peptides or calculate the fold change statistically? I was wondering if there is a way to do this without control abundance data. Also, should I use normalisation techniques, and which one?
Relevant answer
Answer
Dear Abhijeet KISHANPAL Mavi,
I would recommend that you use Perseus for your statistics. Perseus is a fairly user-friendly "click" software for the kind of statistics I understand you are planning to do.
The program can be downloaded from the link below:
The developing group each year hosts an excellent summer school, which I really can recommend. However, if attending the summer school is not an option they also publish all the lectures on YouTube. Try and search for e.g. "Perseus summer school iTRAQ" for some instructions.
I hope this helps and good luck with your research.
  • asked a question related to Microarray Analysis
Question
10 answers
Hello everyone. I have normalized reads of RNA-seq data and I am trying to generate a Venn diagram of upregulated and downregulated genes. I have three replicates each of control and test samples. I tried to search online but couldn't decide which tools would be better to use. Can anyone please suggest any Windows-based offline/online tools to generate Venn diagrams from RNA-seq data? Thank you very much.
Raghu.
Relevant answer
Answer
Hello Raghuram Sir,
As I understand your question, you have read counts for the control and treatment samples but not the list of differentially expressed genes. Therefore, you first have to perform the differential expression analysis, which can be done using Cuffdiff, DESeq or any other program. However, these programs are Linux based. I have no idea about any Windows-based program for differential expression studies. After performing the differential expression analysis you will get the list of genes upregulated/downregulated in the treatment vs control, with the level of significance in terms of P and Q values. With this list you can make a Venn diagram using the program 'Venny'. Venny is an online tool and very simple.
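For illustration, the regions that a two-set Venn diagram displays are just set differences and intersections of the gene lists, so they can be checked by hand once the differential expression lists exist. A minimal sketch (the gene names and the helper name are hypothetical):

```python
def venn_regions(list_a, list_b):
    """Split two gene lists into the three regions of a two-set Venn diagram."""
    a, b = set(list_a), set(list_b)
    return {"only_a": a - b, "both": a & b, "only_b": b - a}


# Hypothetical upregulated gene lists from two comparisons.
up_in_comparison_1 = ["GENE1", "GENE2", "GENE3"]
up_in_comparison_2 = ["GENE2", "GENE3", "GENE4"]
regions = venn_regions(up_in_comparison_1, up_in_comparison_2)
region_sizes = {name: len(genes) for name, genes in regions.items()}
```

The sizes in `region_sizes` are the numbers that a tool like Venny would draw inside the circles.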
All the Best
  • asked a question related to Microarray Analysis
Question
2 answers
I have used the MultiNA to quantify RNA for the first time.
Could someone help me interpret the output results? Does it have an equivalent number to RIN?
Can I trust the "Total conc" readout?
Many thanks
Relevant answer
I think you can trust the "Total conc" readout.
  • asked a question related to Microarray Analysis
Question
3 answers
Our GeneSpring user license has expired so I am investigating whether there is an appropriate online open source application I can use to analyse microarray data.
Relevant answer
Answer
R is a good option for open source analysis tools of a wide variety of data. If you were using GeneSpring I'm assuming you may  want to analyze Agilent array data. There are specific packages for different Agilent arrays, such as https://www.bioconductor.org/packages/release/bioc/html/agilp.html, https://www.bioconductor.org/packages/3.5/bioc/html/AgiMicroRna.html and https://www.bioconductor.org/packages/3.5/bioc/html/LVSmiRNA.html to name a few.
  • asked a question related to Microarray Analysis
Question
4 answers
Hi,
I have 33 ligands in total, which were analyzed with SAM, as reported in an article entitled "Analysis of the major patterns of B cell gene expression changes in response to short-term stimulation with 33 single ligands". I selected 10 ligands from the above data and want to do additional analysis, but the authors didn't provide the raw data/CEL files, so I downloaded the processed data from ArrayExpress. I reviewed the limma tutorial and want to make sure the downloaded data files are suitable for limma. I need a starting point for analysis with limma; I have attached one of the processed data files as an example. Can I use processed data files as input for limma, and which type of analysis should be performed? I look forward to your valuable answers.
Thank you,
Relevant answer
Answer
I have some sample code here for a paper, where the data is downloaded from the GEO database and analysed using limma. You can modify some sections, and use the Bioconductor ArrayExpress package instead of GEOquery.
Hope it helps. Good luck.
  • asked a question related to Microarray Analysis
Question
7 answers
I analysed a PPI network after integrating gene expression data from an Alzheimer's disease experiment into it, and this revealed some sub-networks. First, I used the limma package for differentially expressed gene (DEG) analysis. Second, I mapped the DEGs onto the PPI network and assigned each gene's fold change value to the corresponding protein. Third, I searched the network starting from my selected candidate genes, which revealed sub-networks. I scored them with my own formula, then merged the top-scoring sub-networks.
Now I want to validate my results (the merged sub-network) and I have no idea how to do it.
Could anyone help me or suggest a method to validate my outcome, please? I would highly appreciate it.
Relevant answer
Answer
We may perform Gene Ontology enrichment analysis for a specific outcome in that sub-network and correlate it with the computational prediction. But before doing that, we could look for evidence for the specific PPIs in databases like STRING, and then do a comparative analysis.
  • asked a question related to Microarray Analysis
Question
5 answers
Dear colleagues, I have Affymetrix microarray data, from endothelial cells, co-cultured with mononuclear cells in conditions of normoxia, hyperoxia and hypoxia. Control cultures of endothelial cells are also cultured (alone without mononuclear cells) in these same conditions. The affymetrix microarray  data have been processed with the Expression Console(Gene level >> extended:RMA-Sketch) and filtered.  I wish to use excel to elucidate differentially expressed genes.
Please, what steps do I need to take to  proceed in the elucidation of  these differentially expressed gene using excel? I am new to high-throughput data analysis.
Thank you in advance for your response.
Relevant answer
Answer
these steps are helpful for exporting the excel file
you can just input that excel file to this jar tool and it will do that query
  • asked a question related to Microarray Analysis
Question
5 answers
Hi all,
To analyse RNA-Seq data sets, I need to know which is the most accurate and reliable pipeline to go with. Could you suggest such a pipeline?
Note: I have good experience with the Tuxedo suite (Bowtie, TopHat, and CummeRbund) in addition to edgeR.
Thanks
Relevant answer
Answer
This paper should answer most of your questions:
Anders, S., McCarthy, D., Chen, Y., Okoniewski, M., Smyth, G., Huber, W., & Robinson, M. (2013). Count-based differential expression analysis of RNA sequencing data using R and Bioconductor. Nature Protocols, 8(9), 1765–1786. http://doi.org/10.1038/nprot.2013.099
If you want to look at some example scripts, you can see them in my github repository for a recent data set we analysed. https://github.com/uhkniazi/BRC_Organoid_Joana
Once you get the count matrix, you have various options for performing any sort of differential expression of genes - EdgeR, DESeq2 etc. 
  • asked a question related to Microarray Analysis
Question
3 answers
I have access to 3 experiments from GEO. The sample type for one experiment is blood and for the other two it is skin. I have the RPKM values of control and patient samples from these tissues. The platform is the same for all these experiments.
How should I proceed with a meta-analysis of these experiments? There are very few papers regarding the protocol. It would be a great help if I could get a pipeline.
  • asked a question related to Microarray Analysis
Question
4 answers
I am doing differential expression studies using iTRAQ. I have problems with identifying the fold change / fold enrichment on the downregulated iTRAQ ratios. For example, iTRAQ ratio for 117:114 shows 3.256, which means that it shows upregulation of 3 fold change, but how about downregulated ratios since it shows value less than 1, for example 0.2679. Is it possible for us to calculate how many fold change from the iTRAQ ratio with PVal (ratio) given? I am using ProteinPilot Software.
Thanks in advance! 
Relevant answer
Answer
Hi Yee, 
Basically, using iTRAQ you can measure the absolute and relative quantitative ratios of peptides and proteins. 
Maybe you should take a look at this paper. You can find more detailed answers to your questions there.
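On the arithmetic in the question: a ratio below 1 can be reported either on the log2 scale, where up- and down-regulation are symmetric around 0, or as a "signed fold change" obtained by taking the negative reciprocal. This is one common convention, not something specific to ProteinPilot, and the function names below are made up for illustration:

```python
import math


def signed_fold_change(ratio):
    """Ratios >= 1 are returned as-is; ratios < 1 become -1/ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return ratio if ratio >= 1 else -1.0 / ratio


def log2_ratio(ratio):
    """log2 of the ratio: positive for up-regulation, negative for down-regulation."""
    return math.log2(ratio)


up = signed_fold_change(3.256)     # the 117:114 ratio from the question
down = signed_fold_change(0.2679)  # about 3.7-fold down-regulated
```

Under this convention the ratio 0.2679 corresponds to roughly a 3.7-fold decrease, which is directly comparable in size to the 3.256-fold increase.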
  • asked a question related to Microarray Analysis
Question
9 answers
When finding the differentially expressed genes from microarray data, which parameters do we have to take into account for a more satisfying result? What intervals (maximum and minimum values) can be set for FDR, fold change etc. in accordance with log2 normalized p-value?
Relevant answer
Answer
Unfortunately, there is no good general answer to this.
There are genes for which a slight regulation is biologically relevant, and others that can be considerably regulated without much biological impact.
The p-values and whatever you get from an FDR-based selection depends not only on your selected cut-off but also on the sample size.
A strategy to select "candidates" could involve two steps:
1) select the top 50 genes with largest abs. LFC and also the top 50 genes with the lowest p-values. Some may overlap, so you get a list of at most 100 genes.
2) go through this list and make a subselection based on your biological understanding of the genes in the experimental context.
Then you should have a list that should allow you to get an idea what experiments to plan next.
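Step 1 of the strategy above can be sketched as follows, assuming the results are available as a mapping from gene to (log2 fold change, p-value); the data structure, function name and toy values are assumptions for illustration only:

```python
def candidate_genes(results, top_n=50):
    """Union of the top_n genes by |log fold change| and the top_n by p-value.

    results maps gene name -> (log2_fold_change, p_value).
    """
    by_lfc = sorted(results, key=lambda g: abs(results[g][0]), reverse=True)[:top_n]
    by_p = sorted(results, key=lambda g: results[g][1])[:top_n]
    return set(by_lfc) | set(by_p)


# Toy results for four hypothetical genes.
results = {
    "A": (3.0, 0.2),
    "B": (0.1, 0.001),
    "C": (-2.5, 0.01),
    "D": (0.2, 0.5),
}
candidates = candidate_genes(results, top_n=2)
```

With `top_n=50` the returned set has at most 100 genes, as described; step 2 (the biological sub-selection) remains a manual judgement.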
  • asked a question related to Microarray Analysis
Question
1 answer
Hello.
I tried to perform a meta-analysis of differential gene expression data using GEO.
The A-madman program looks good. However, it fails during the process.
The error occurs when I click Analyze after grouping on the Basket tab.
Can anyone help with this program, or recommend another program or R package?
Thanks in advance
  • asked a question related to Microarray Analysis
Question
5 answers
Dear All, 
I am trying to see which CpG sites (with its associated genes) are involved in particular pathways and diseases, and get an overview of the functions of these genes. 
Currently, I have tried to import my dataset (>800k CpG sites total) which shows the following: 1) each CpG site as the ID, 2) p-value, 3) q-value, 4) fold change and 5) difference. My data sets are quite large with >200,000 CpG sites (the row limit of IPA) -  is there a way to import a file this large? 
I have also tried importing a file with more specific CpG sites of around 1000 CpG sites but it is not being mapped properly by IPA as I have 0 mapped sites due to errors or possibly I am using the wrong template (i.e. not expression data)? 
I think the errors are coming from my formatting in my excel file to IPA, where either the headings are incorrect and the way I am assigning each header/observation is incorrect i.e. I think I set my Identifier as Illumina (which is what I used to get my CpG methylation data), but I do not know what other options I can choose instead of this. IPA also showed errors first with 'no IDs matched to particular genes',and then with 'removing fold change between 1 and -1'. 
In summary, I would really appreciate any tips/guidance with uploading CpG methylation data into IPA. 
Thank you very much.
Relevant answer
Answer
Thank you very much Mr. Kamstra and Dr. Muley! 
I am looking at human samples from infant cord blood lymphocytes. 
So far, I have a smaller list of CpG sites that are relevant based on p-values and fold-change (as what Dr.Muley suggested). 
Mr. Kamstra, I have associated CpG sites with genes using GenomeStudio, but would I have to use BioMart to convert these gene names to Ensembl IDs or Entrez IDs?
Thank you for your help!
  • asked a question related to Microarray Analysis
Question
3 answers
I am trying to find out expression profile of my candidate genes from RNAseq or CAGE data from cancers using publicly available RNA seq data
I prefer any online search tools at this stage for a quick analysis.
  • asked a question related to Microarray Analysis
Question
16 answers
I have some genes with their FPKM values, and now I want to convert these values into log2 fold change.
Relevant answer
Hello Tinku,
First, divide the FPKM value of the second group by the FPKM value of the first group to get the fold change (FC). Then enter the formula =LOG(FC, 2) in Excel to get the log2 fold change.
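The same calculation outside Excel, as a minimal sketch. The pseudocount is my own addition (not part of the answer above) to avoid dividing by zero or taking the log of zero when a gene has FPKM 0 in one group; set it to 0 to reproduce the plain formula:

```python
import math


def log2_fold_change(fpkm_group1, fpkm_group2, pseudocount=1.0):
    """log2 of (group 2 FPKM / group 1 FPKM), with an optional pseudocount."""
    return math.log2((fpkm_group2 + pseudocount) / (fpkm_group1 + pseudocount))


# Hypothetical FPKM values for one gene in group 1 (control) and group 2 (treated).
lfc_with_pseudocount = log2_fold_change(3.0, 15.0)           # log2(16 / 4)
lfc_plain = log2_fold_change(5.0, 20.0, pseudocount=0.0)     # log2(20 / 5)
```

Note that a pseudocount shrinks the fold change for lowly expressed genes, so the choice of value is a judgement call.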
  • asked a question related to Microarray Analysis
Question
11 answers
Hi Everyone,
I'm using microarray data to identify DEGs and map their PPI network, but now I want to use multiple datasets reported by different studies in Acute Myeloid Leukemia (AML). Please describe a good methodology step by step, and specifically how I can merge different datasets. I also need some information about the requirements for merging.
Thanking you in Advance.
Relevant answer
Answer
Batch correction only works when the experimental conditions are (nearly) evenly distributed over the batches. Otherwise, batch correction models will make things even worse (see link).
This is a severe problem when selecting data from different experiments, where many experimental issues are 100% confounded with the batches/experiments. If, additionally, biological groups are confounded with experiments (like: I take group A from GSEx and group B from GSEy and compare A against B), the result is almost completely arbitrary. Usually, results will already cluster nicely within experiments/batches, indicating considerable (but artificial!) differences in the expression profiles between the groups, and after batch correction these differences will be exaggerated even further.
Further, if there are many different batches and relatively few samples per batch, a correction with ComBat will lead to an overestimate of the residual degrees of freedom for tests for DE (which might be corrected manually, but ignorance would be a bad guide). This problem is also discussed in the attached link.
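Before merging, a quick sanity check for the confounding described above is to tabulate which biological groups occur in which batch. A minimal sketch, assuming each sample is annotated with a (batch, group) pair; the accession labels are placeholders, not real series:

```python
from collections import defaultdict


def single_group_batches(samples):
    """Return the batches that contain only one biological group.

    samples is a list of (batch, group) pairs, one per sample.
    """
    groups_per_batch = defaultdict(set)
    for batch, group in samples:
        groups_per_batch[batch].add(group)
    return sorted(b for b, groups in groups_per_batch.items() if len(groups) < 2)


# Placeholder annotation: GSEx holds only group A, GSEy only group B, GSEz both.
samples = [
    ("GSEx", "A"), ("GSEx", "A"),
    ("GSEy", "B"), ("GSEy", "B"),
    ("GSEz", "A"), ("GSEz", "B"),
]
problem_batches = single_group_batches(samples)
```

Any batch listed in `problem_batches` contributes no within-batch contrast, so its batch effect cannot be separated from the group effect.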
  • asked a question related to Microarray Analysis
Question
6 answers
I have heard that a scanned microarray slide should look like the image shown below: a black background with fluorescent dots. But what I saw was an all-white slide with black frosted ends (constantly black, below). I have no photos on my PC, but I saw a white slide with black ends. I don't know why the scanned image is only black and white. Please give your opinion.
Relevant answer
Answer
@Björn Abendroth typhoon 9410
  • asked a question related to Microarray Analysis
Question
6 answers
Hello guys!
We have several transcriptome data sets, which came from samples that were treated at low temperature for different lengths of time, let's say at 4 ℃ for 1, 3, 5 and 7 hours. After analyzing those data we have obtained differentially expressed genes (DEGs) for each treatment time point. For example, when the sample was treated at 4 ℃ for 1 hour, we got 2000 up-regulated and 3000 down-regulated genes; for 3 hours, 1500 up-regulated and 2500 down-regulated genes; 5 hours …; 7 hours ….
My question is how I can analyze these DEGs further to find the subset of genes that are really crucial at low temperature in this sample.
And are there tools for this work?
By the way, my data sets come from RNA-seq, and the fold-change value of each unigene at different treatment time point is calculated with DEseq.
Thanks in advance.
Relevant answer
Answer
thank you @ Audrey.  
  • asked a question related to Microarray Analysis
Question
2 answers
Well... I ruined my microarray... So, I want to ask you something.
Is it OK to store my buffers at RT?
Prehybridization buffer-5X SSC/0.1% SDS/1% BSA
Hybridization buffer-5X SSC/0.1% SDS/50% Formamide
Low stringency wash buffer-1X SSC/0.2% SDS
High stringency wash buffer-0.1X SSC/0.2% SDS
0.1X SSC
50% DMSO
These are my buffers. I store them at 4 degrees Celsius now, because BSA is stored at 4 degrees, but some solutes do not dissolve at 4 degrees.
Relevant answer
Answer
Thx :)
BSA and SDS salted out when stored at 4 degrees, but they dissolved completely at RT.
  • asked a question related to Microarray Analysis
Question
2 answers
Currently, differential gene expression is usually identified using RPKM, TPM or TMM; however, the sequencing depth is set by the experimenter and all the quantification is relative. To compare between samples, some methods, like DESeq2 and edgeR, use distribution-based normalization. The problem is that these methods are not entirely correct either. If we sequence a low-expression sample at high depth and a high-expression sample at shallow depth, none of these methods seems able to detect the true difference. One idea is that if there is a group of universal genes with unchanged expression levels, these genes should be taken as the baseline to perform normalization and compare between samples.
I have noticed one paper that uses this idea to normalize gene expression in plant tissue; the authors established a database of stably expressed genes. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5178351/.
But for prokaryotic microorganisms, it seems that there is no stably expressed gene set yet.
Any comments will be appreciated.
Relevant answer
Answer
Hi Xiao-Tao,
DESeq2 and edgeR, indeed, have a problem when global changes come into play (e.g. if cells produce different overall amounts of RNA under different conditions - a fairly normal situation among prokaryotes), because they actually assume that the majority of genes DO NOT change. As you said, there are apparently no genes that keep their expression levels under all possible conditions in all possible genetic backgrounds, even ribosomal RNAs vary hugely between different growth phases.
I can see two possible solutions here. 1) Use spike-in RNA. When you know from how many cells you isolate RNA, the spike-in will always permit you to estimate the actual abundances of transcripts independently of the depth of sequencing. 2) Use RNA-seq in conjunction with other quantitative techniques (e.g. northern blotting or RT-qPCR) applied to the same samples. Again, if you know from how many cells you isolate RNA and you load the gel accordingly (e.g. 1/10 of the amount you isolated, whatever it may be), by probing for several RNAs you will get a precise idea of how the abundance of these RNAs in one sample relates to that in another. In this way you will derive normalisation factors and will be able to account for the differential sequencing depth.
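To make option 1 concrete, here is a minimal sketch of spike-in based scaling. It assumes that the same amount of spike-in RNA was added per cell in every sample, so that differences in spike-in counts reflect only sequencing depth; the sample names, gene names and counts are invented:

```python
def spike_in_normalize(counts, spike_ids):
    """Scale each sample so that its total spike-in count equals the mean
    spike-in total across samples.

    counts maps sample -> {feature: read count}; spike_ids lists the spike-in features.
    """
    spike_totals = {
        sample: sum(features[i] for i in spike_ids)
        for sample, features in counts.items()
    }
    reference = sum(spike_totals.values()) / len(spike_totals)
    normalized = {}
    for sample, features in counts.items():
        factor = spike_totals[sample] / reference  # per-sample size factor
        normalized[sample] = {f: c / factor for f, c in features.items()}
    return normalized


# Sample s2 was sequenced twice as deeply, and geneA is truly 3x more abundant in it.
counts = {
    "s1": {"spike1": 100, "geneA": 50},
    "s2": {"spike1": 200, "geneA": 300},
}
norm = spike_in_normalize(counts, spike_ids=["spike1"])
fold = norm["s2"]["geneA"] / norm["s1"]["geneA"]
```

The raw counts suggest a 6-fold difference for geneA, while the spike-in scaled values recover the 3-fold difference, without assuming that most genes are unchanged.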
  • asked a question related to Microarray Analysis
Question
13 answers
Hi all,
I want to analyse an RNA-seq data set from a paper so I've got the data from GEO. In the data, they have a column called "unique hits" for each ID that I think mostly relevant to the next step. However, I just don't know if they are the values that I can use to analyse the gene expression level.
I've used log_RMA and RMA in the microarray before to analyse gene expression but I don't know if these are the same.
Thank you!
Relevant answer
Answer
In next gen sequencing the first crucial step is mapping your sequence fragments, the reads, to the reference sequence (genome or transcriptome). However, reads are short (typically 75 bp or 100 bp) and it is not always possible to get a single best hit.  It is common to use only those reads with a single best match (= "unique hits") in downstream  analyses.
RNA-seq is different from microarrays. Microarrays are based on hybridization and produce a continuous fluorescence signal, whereas RNA-seq is based on short read sequencing and mapping and produces a discrete signal: the number of unique reads mapped per gene. The normalization and statistical analysis for RNA-seq differ from those for microarrays. The most commonly used analysis packages for RNA-seq are edgeR and DESeq2. You can find those on Bioconductor.
  • asked a question related to Microarray Analysis
Question
3 answers
Hello friends
I am doing microarray data analysis (HGU133plus2). I got the expression matrix file using gcrma, but some probes represent multiple genes, like this. How should we treat these? Also, some probes are not matched and show NA; can I delete them? Next I will take this file for WGCNA analysis. Please share your knowledge.
221251_x_at   1   221251_x_at   INO80B /// INO80B-WBP1            NA
65133_i_at    1   65133_i_at    INO80B /// INO80B-WBP1            NA
223072_s_at   1   223072_s_at   INO80B /// INO80B-WBP1 /// WBP1   NA
1559716_at    1   1559716_at    INO80C                            INO80C
229582_at     1   229582_at     INO80C                            INO80C
220165_at     1   220165_at     INO80D                            INO80D
Relevant answer
Answer
Hi Mathavan,
If you are asking about this particular case, it can be explained by the fact that INO80B and WBP1 genes can sometimes occur as a read-through transcript that cannot be distinguished from either of the genes individually by the probes in question (see https://genome.ucsc.edu/cgi-bin/hgc?hgsid=579057461_Tm25CaPexEPPpKrMhRyrGSqaNmCR&c=chr2&l=74456725&r=74462493&o=74455022&t=74460891&g=refGene&i=NR_037849).
You can also look up more information about the U133 array's probes sensitivity and specificity to certain transcripts to understand why this could be for other probes and genes. For this particular example you can look at the information for probe 65133_i_at here https://genecards.weizmann.ac.il/cgi-bin/geneannot/GA_search.pl?keyword_type=probe_set_id&array=HG-U95&target=genecards&keyword=65133_i_at and for the 223072_s_at probe  see https://genecards.weizmann.ac.il/cgi-bin/geneannot/GA_search.pl?keyword_type=probe_set_id&array=HG-U133&target=genecards&keyword=223072_s_at.
  • asked a question related to Microarray Analysis
Question
1 answer
I have obtained microarray data for a large cohort (both sexes). I have performed an initial GWAS on all the SNPs from all the chromosomes to check the genetic association with the trait I am interested in. I found some regions, but the most interesting one is on the X chromosome (in my opinion it is not an artefact). However, I am a bit confused because I do not know whether and how I can analyse these data. For women there is the standard distribution of 3 genotypes, but for men only 2 variants are possible: presence or absence of the allele.
- should I divide the cohort into separate analyses for the male and female subsets?
- what kind of statistics should I use for men, because I think it is impossible to use simple MAF? And are the PLINK results for the male subset alone reliable?
- or do you have any more advice?
I would be very grateful for all your help.
Relevant answer
Answer
I guess separate analyses need to be done for men and women, as men have XY chromosomes and would differ from the normal (MAF) analysis.
I think software PLINK should help you do the needful.
There is plenty of literature on GWAS in different organisms. The field is exploding. For fruitful guidance in your case, a journal like American Journal of Human Genetics would be very useful to search problems and solutions similar to yours.
  • asked a question related to Microarray Analysis
Question
1 answer
I performed the Cell Cycle Control Phospho Antibody Array (http://www.fullmoonbio.com/product/cell-cycle-control-phospho-antibody-array/) with 7 control and 7 treatment samples. To identify the signal intensities I used GenePix Pro 7 and created .GPR files.
How do I continue with my statistics? I want to normalize the data and calculate z-scores or use SAM. I can normalize the data in Excel, but I am sure there is a more convenient way to proceed. I read about the program Prospector from Invitrogen and the protMAT website, but Prospector is not working with my .GPR files.
I am new to protein array and microarray research and would be very happy for any suggestions.
Thank you so much!
Relevant answer
Answer
Dear Denise,
the basic problem of high-throughput data (common to metabolomics, transcriptomics and protein arrays) is the huge number of variables (protein species in your case) with respect to statistical units (the 14 samples), which opens the way to a plethora of chance correlations. Thus the main way forward IS TO EXCHANGE THE ROLE OF VARIABLES AND STATISTICAL UNITS. Simply operate on the transpose of your original data matrix, i.e. the matrix having as rows the protein species and as columns (variables) the samples. On this matrix, run a Principal Component Analysis, which allows a dual representation of the same data set in terms of loadings (correlation coefficients of the variables with the components) and scores (values of the different components for each statistical unit). So run this PCA: if you have 14 variables (7 control + 7 treatment) you will in principle have 14 components but, due to the mutual correlation between the abundances of the different protein species, you will end up with very few (2 or 3) principal components explaining by far the major part of the total variance.
Look at the component loadings and you will surely get a PC1 with all loadings of the same sign (size component); this is the signature of a common global profile shared by all the samples. Then go to PC2, PC3, PC4. You must see whether there is a component in which the loading values are significantly different between control and treated (ideally you will get a component in which control and treated have loadings of opposite sign), allowing for a perfect separation of the two groups in the loading space (shape components). If this is the case, go to the scores of the discriminating component and look for the protein species having the highest (in absolute value) scores on that component: those protein species are the ones allowing for the separation of the two groups, and you have solved your problem. If you run the PCA on the correlation matrix you do not need to normalize the protein values; normalization is implicit in the correlation metric.
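The procedure above can be sketched with NumPy as follows. This is only an illustration on simulated data (200 hypothetical protein species, 3 control and 3 treated samples, with the first 5 proteins shifted in the treated samples); the function name and all numbers are assumptions, and real .GPR intensities would first have to be read into a proteins-by-samples matrix:

```python
import numpy as np


def pca_on_correlation(data):
    """PCA with samples as variables (columns) and proteins as statistical units (rows).

    Returns eigenvalues, loadings (one row per sample) and scores (one row per protein).
    """
    data = np.asarray(data, dtype=float)
    # Standardising each column makes this a PCA on the correlation matrix.
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]  # largest component first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    scores = z @ eigvecs
    return eigvals, loadings, scores


rng = np.random.default_rng(0)
base = rng.normal(10.0, 2.0, size=200)                 # shared abundance profile
data = base[:, None] + rng.normal(0.0, 0.3, size=(200, 6))
data[:5, 3:] += 4.0                                     # treatment effect on 5 proteins
eigvals, loadings, scores = pca_on_correlation(data)
explained = eigvals / eigvals.sum()                     # fraction of variance per component
```

Here `loadings[:, 0]` should show the same-sign "size" component, and one would then inspect `loadings[:, 1]`, `loadings[:, 2]`, ... for a component separating the control columns from the treated ones, and rank the proteins by the absolute value of the corresponding column of `scores`.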
See:
  • asked a question related to Microarray Analysis
Question
5 answers
I am working on a biological dataset which does not follow an ideal normal/Gaussian distribution. Which statistical test and technique would be best to analyze this dataset?
Relevant answer
Answer
What "difference" do you mean?
- a difference in distribution?
- a stochastic difference in magnitude?
- a more specific difference of the distributions (e.g. difference in concentration, in variation, or in some quantile?)
  • asked a question related to Microarray Analysis
Question
3 answers
Hi,
First let me start off by saying that working with proteomic datasets is quite new to me, and while I find it terribly interesting, I am currently having trouble finding some answers related to my dataset.
Very briefly, my question is whether and how I can use "Raw intensities" to examine protein expression and interactions (I am using Perseus). I am working with raw intensities as I've been told that LFQ intensities cannot be used if there are large variations in protein identifications between samples, which is the case for me. Nevertheless, let me start off by describing my dataset before moving on to the specific questions I have.
Dataset
* I am comparing four different methods for isolation of the same plasma constituent.
* There are three unique biological samples (3 different controls) in each isolation method (12 samples).
* Additionally, all four methods are performed as technical duplicates, meaning I have an A and a B series, both in the same dataset (22 samples)
Questions
1. First and foremost, am I even able to do statistical analysis on my dataset?
2. Should I normalize my peak intensities? From my reading, I understand that raw intensities only somewhat correlate with actual abundance, and that if one wants to analyse raw intensities one needs to use some form of peak intensity normalization. I've been looking at a normalization method called EigenMS and at global normalization. While global normalization seems simple enough, my thought is that it cannot be used here because of the large differences between isolation methods. My question is then: should I normalize my data, and if yes, what would be the best method?
3. How should I group the different methods when analysing? Currently I am grouping all three controls per isolation method (6 with technical duplicates) into the same group using the annotation rows feature.
Any help is greatly appreciated, and if there are any features of my dataset I forgot to mention, please don't hesitate to ask.
Relevant answer
Answer
Hi, sorry for the rather late reply.
I think your suggested workflow could work. Be careful when imputing data. Always check the histogram of the log2 LFQ intensities to make sure that you do not introduce too many imputed values (e.g. values drawn from a normal distribution); you would see that as a somewhat bimodal distribution. I think this will be difficult in your dataset when comparing different isolation methods, because just by comparing methods 4 and 1 you would have to impute more values than you actually measured in method 4. Again, comparing similar isolation methods is fine, I guess. :)
To compare different methods, I would simply compare the numbers of quantified proteins across your replicates.
As for iBAQ: indeed it was "developed" for absolute quantification, but it is also used for relative comparison in some studies and in immunoprecipitations.
Best
Hendrik
  • asked a question related to Microarray Analysis
Question
1 answer
We have done miRNA microarrays using the Agilent Human miRNA Microarray Kit Ver. 3.0 (Cat No: AGT-G4470C). I have .gpr files of my samples, but I could not analyze their miRNA profiles in GeneSpring. How can I analyze them in GeneSpring?
Relevant answer
Answer
  • asked a question related to Microarray Analysis
Question
3 answers
cancer microarray dataset from geo dataset and .cel file using
  • asked a question related to Microarray Analysis
Question
6 answers
I observed this in one of my microarray experiments, in which the first two genes were upregulated and the last gene was downregulated. These three genes belong to the same operon. Kindly suggest an explanation.
Relevant answer
Answer
If you are sure that these genes are in the same operon and can confirm these results using qRT-PCR, then the difference may be due to transcriptional attenuation, e.g. premature termination of transcription. Transcriptional termination is one of the mechanisms that regulate gene expression.
  • asked a question related to Microarray Analysis
Question
4 answers
Dear colleagues,
Which statistical test(s) can I use to identify differentially expressed genes after Affymetrix microarray analysis? I am using endothelial cells co-cultured with mononuclear cells at three different oxygen tensions.
Thank you in advance for your answer...
Relevant answer
Answer
You can use ANOVA for the comparison across multiple groups, but I highly recommend applying a multiple-testing correction, such as the Benjamini-Hochberg false discovery rate (FDR) method, since you are testing many genes.
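As a rough illustration of the correction mentioned above, here is a minimal Benjamini-Hochberg sketch in plain Python. The p-values are invented for the example; in practice they would come from the per-gene ANOVA.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that the adjusted
    # values stay monotone (each is the minimum over all larger ranks).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Invented per-gene p-values, for illustration only
p_values = [0.001, 0.008, 0.039, 0.041, 0.60]
adjusted = benjamini_hochberg(p_values)
significant = [q <= 0.05 for q in adjusted]

for p, q, s in zip(p_values, adjusted, significant):
    print(f"p = {p:.3f}  adjusted = {q:.5f}  significant at FDR 0.05: {s}")
```

Note that two of the genes with raw p < 0.05 no longer pass after correction, which is exactly why the correction matters when thousands of genes are tested.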
  • asked a question related to Microarray Analysis
Question
4 answers
Dear All, 
I am working on microarray data analysis with GEO accession no. GSE31747. In this paper, PMID 22024983 (https://www.ncbi.nlm.nih.gov/pubmed/22028943), they used ANOVA to identify the significant genes. Now I am trying to compare their results with the limma package, but I am not getting a single significant gene (P-value < 0.05). So can anyone tell me which other methods I can use if limma does not give any significant genes?
Relevant answer
Answer
Hello Mathavan,
Fold change = mean expression in treatment / mean expression in control. People commonly use a cutoff of 2-fold change to call a gene differentially expressed (DEG).
Now the statistical part: if the replicates within treatment and within control are similar to each other, you will get a very low p-value, and vice versa.
Example here
10, 15, 35 Control
85, 95, 90 Treatment
Fold change = ((85+95+90)/3) / ((10+15+35)/3) = 90/20 = 4.5-fold, so it is differentially expressed.
Now the statistical significance of the same.
The p-value of a t-test (you can easily do it in Excel): I calculated 0.0059 for the given set.
It is less than 0.05, so it is significant. I am not sure what test limma uses, but you can use a t-test or ANOVA in Excel or in R after normalisation. Please write back if you need further help.
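The arithmetic in this example can be reproduced with a short Python sketch. It uses the control values from the fold-change calculation (10, 15, 35) and computes a Welch (unequal-variance) t statistic; which t-test variant the answer used is not stated, so that choice is an assumption, and the p-value itself is left to a statistics package.

```python
import math
from statistics import mean, variance

control = [10, 15, 35]
treatment = [85, 95, 90]

# Fold change as a ratio of group means, and the same on the log2 scale
fold_change = mean(treatment) / mean(control)
log2_fold_change = math.log2(fold_change)

# Welch t statistic: difference of means over its standard error
var_c = variance(control) / len(control)
var_t = variance(treatment) / len(treatment)
t_stat = (mean(treatment) - mean(control)) / math.sqrt(var_c + var_t)

# Welch-Satterthwaite approximation of the degrees of freedom
df = (var_c + var_t) ** 2 / (
    var_c ** 2 / (len(control) - 1) + var_t ** 2 / (len(treatment) - 1)
)

print(f"fold change = {fold_change:.2f}, log2 fold change = {log2_fold_change:.2f}")
print(f"Welch t = {t_stat:.3f} with about {df:.2f} degrees of freedom")
```

With only three replicates per group, the tight treatment values and the spread-out control values both feed into the standard error, which is the point made above about replicate similarity.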
  • asked a question related to Microarray Analysis
Question
9 answers
Good morning! Recently, I tried to confirm gene microarray results by qPCR using the SYBR method (StepOne Plus), but got melt curves with two Tm peaks (Figure 1). After decreasing the primer concentration from 400 nM to 200 nM and increasing the temperature from 60 to 65, some of them started to look better (Figure 2), so I decided to decrease the primer concentration again, but the problem came back (Figure 3). Can anyone give me some suggestions?
See the figures in my Google Drive, thanks
Relevant answer
Answer
Could it be possible that you have two isoforms, e.g. splice variants, of the mRNA you are amplifying? If their expression were differently affected by the treatment, it might explain the variation in the ratio of the two peaks in your melting curve.
  • asked a question related to Microarray Analysis
Question
3 answers
I am culturing mesenchymal stem cells on treated 1 mm^2 culture surfaces. This cannot be scaled up in area and may only be replicated in triplicate, i.e. I have 3 mm^2 total area. If I assume that confluent cells are approx. 625 µm^2 in area, then that will give me approx. 1.5 x 10^3 cells per substrate. Is it best to pool these three samples to give me 4,500 cells in total, or can I extract and detect proteins via microarray in triplicate with 1,500 cells? Would a DNA microarray be a better option?
Relevant answer
Answer
To close out this post, I can report that I abandoned protein microarrays and focused instead on multiplex PCR. Specifically, I employed the Fluidigm platform for analysis of nanolitre PCR samples.
  • asked a question related to Microarray Analysis
Question
6 answers
I want to analyze RNA-Seq data that I found on the internet, for practice. I found 8 RNA-Seq datasets (four different immune cell types from mice, each with two biological replicates). The goal of the analysis is to discover whether there are genes that are differentially expressed.
I want to use edgeR to analyze the data. In order to do that, I need to specify a design matrix, and that is where I am stuck. Cell type is one factor with 4 different levels in this analysis, but what about biological replicate? Should it also get its own factor with two levels?
I think biological replicate should not get its own factor, but I cannot really explain why. It is a hunch.
Thanks in advance.
Relevant answer
Answer
I would argue in favor of a design matrix with an intercept, as I am not sure how baseline changes would be estimated in a no-intercept RCBD (randomized complete block design). Follow the RCBD example in the edgeR user's guide [http://www.bioconductor.org/packages/release/bioc/vignettes/edgeR/inst/doc/edgeRUsersGuide.pdf], page 38.
To see whether there are significant effects due to the block, I would definitely look for DEGs due to mouse. If the mice were transgenic and were sacrificed at the same time of day (circadian changes etc.), then I wouldn't bother to include them as a block. But if hundreds of genes are DE due to mouse, then you should definitely use a block design.
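To make the two options concrete, here is a small Python sketch that builds both design matrices by hand with treatment coding: one with an intercept and cell type only, and one that adds the replicate as a block. The cell type labels and replicate names are hypothetical, and the block version only makes sense if "replicate 1" really means the same thing across cell types (for example the same mouse or the same batch). This is not edgeR code; it only shows what the matrices look like.

```python
def treatment_columns(levels, reference):
    """Return (column names, indicator rows) with the reference level dropped."""
    others = sorted(set(levels) - {reference})
    rows = [[1 if level == name else 0 for name in others] for level in levels]
    return others, rows

def build_design(factors):
    """factors: list of (levels, reference). Returns (names, matrix) with an intercept."""
    n_samples = len(factors[0][0])
    names = ["Intercept"]
    matrix = [[1] for _ in range(n_samples)]
    for levels, reference in factors:
        cols, rows = treatment_columns(levels, reference)
        names += cols
        for target, extra in zip(matrix, rows):
            target.extend(extra)
    return names, matrix

# Hypothetical labels for the 8 libraries: 4 cell types x 2 replicates
cell_type = ["B", "B", "DC", "DC", "NK", "NK", "T", "T"]
replicate = ["rep1", "rep2"] * 4

names_simple, design_simple = build_design([(cell_type, "B")])
names_block, design_block = build_design([(cell_type, "B"), (replicate, "rep1")])

for label, names, design in [("cell type only", names_simple, design_simple),
                             ("cell type + block", names_block, design_block)]:
    print(label, names, "residual df =", len(design) - len(names))
    for row in design:
        print("  ", row)
```

The block column costs one residual degree of freedom (3 instead of 4 here), which is the price of accounting for a replicate effect.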
  • asked a question related to Microarray Analysis
Question
2 answers
I have never done a microarray before and have limited knowledge of the analysis, so I am looking for a place that would give me output that is easy to analyze as well as good in quality. I am studying the changes in IFN-alpha-stimulated genes with adenovirus vector therapy.
Relevant answer
Answer
Hello Chandrika,
If it is possible to send your samples from Canada to Germany (can you stabilize them, or could you extract the RNA in Canada?), then I could recommend BioRetis at Charité in Berlin (http://www.bioretis-analysis.de/db/comparison). They could hybridize your chips for a few hundred € per chip and analyze the data (possibly with my help). Please ask Andreas Grützkau at DRFZ: Andreas Grützkau <Gruetzkau@drfz.de>. He could help you with all questions about transporting and hybridizing your samples. You can find published results of our way of analyzing chip data in my ResearchGate profile.
Regards
Joachim
  • asked a question related to Microarray Analysis
Question
4 answers
Normalizing microarray data across platforms can be very tedious. But when data come from different platforms such as Illumina, Affymetrix and Agilent, does quantile normalization across the individually normalized data for each experiment and platform remove batch effects?
Relevant answer
Answer
Thanks Matthew,
This was quite insightful. In my case I am bound to microarrays, because a lot of the studies I am interested in have not yet been extensively covered by RNA-seq. I was thinking of using the sva package (it includes the ComBat function too), but some reading on this kind of problem showed that people have downloaded already normalized intensities (normalized per experiment) provided in GEO from multiple experiments, merged them (probably by mapping the probe IDs to Entrez/Ensembl IDs and taking care of genes with multiple probes) and then quantile normalized the merged data. I was wondering if this was any good. I see that you have done the same thing, but you used the raw intensities for quantile normalization, which looks like a better idea. Thanks!
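For readers unfamiliar with the term, quantile normalization forces every array to share the same empirical distribution. The following plain-Python sketch shows the basic idea on made-up numbers; ties are broken by position here, whereas established implementations handle ties more carefully.

```python
def quantile_normalize(samples):
    """Quantile-normalize a list of equal-length arrays (one list per sample)."""
    n = len(samples[0])
    sorted_samples = [sorted(s) for s in samples]
    # Mean of the r-th smallest value across all samples
    rank_means = [sum(s[r] for s in sorted_samples) / len(samples) for r in range(n)]
    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]  # replace each value by its rank mean
        normalized.append(out)
    return normalized

# Made-up intensities: three arrays, four genes each
arrays = [
    [5, 2, 3, 4],
    [4, 1, 6, 2],
    [3, 4, 6, 8],
]
normalized = quantile_normalize(arrays)

for before, after in zip(arrays, normalized):
    print(before, "->", [round(x, 3) for x in after])
```

After normalization every array contains the same set of values, only in a different gene order. This removes distributional differences between arrays, but it does not by itself remove gene-specific platform or batch effects.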
  • asked a question related to Microarray Analysis
Question
4 answers
I'm working with samples from FACS, where I sort one specific population, and I aim to do a microarray analysis. After FACS, I centrifuge the sorted cells at 4500 rpm for 15 min, remove the supernatant and store the samples at -80 °C. Then I extract the RNA using the Promega ReliaPrep RNA cell kit for small samples, which includes a step for eliminating genomic DNA. After this, we check the quantity and purity on the NanoDrop (we usually get around 5-6 ng/µl and an A260/A280 of 2.0) and check the integrity (RIN) with the Bioanalyzer (we usually have a RIN of around 6-8, but the quantity is much lower than on the NanoDrop). Why are the quantity measurements different between the two methods? In this case, should I trust the NanoDrop values, or is that not recommended?
Relevant answer
Answer
I agree with the answers above.  Basically, when you are down near the limit of detection on the nanodrop, slight variation in the zeroing of the instrument can cause significant error.  I would generally trust the bioanalyzer, but I would primarily rely on quantitation of the rRNA bands.
The other factor to consider is that when you use a column-based RNA preparation, it is not uncommon to get some of the finer column material in your RNA preparation.  This can cause light scattering in a spec reading and be really inconsistent from one reading to another.  I like to vortex samples and then spin them in a microfuge to pellet any particulates before taking a reading.
  • asked a question related to Microarray Analysis
Question
3 answers
For how long can we store processed microarray chips before scanning?
Relevant answer
Answer
Once a chip has been read there is no need to reuse it: the data you obtained at the time of reading reflect the signals at their best, and those signals will not be the same after storage.
If storage is necessary, keep the chips in the dark and note the date, so that on retrieval the date indicates how much the signals may have deteriorated.
  • asked a question related to Microarray Analysis
Question
1 answer
Dear all,
I want to ask the company to make the tissue array with our own liver tissue (HCC) from our tissue bank. 
Can someone tell me how many samples should be on one slide? I mean, how should I design the tissue array?
Thanks for your answer.
Dachen.
Relevant answer
Answer
That depends on how you design the array and for what purpose. For example the size of the needles used to make the array will determine how many cores can fit.  If you use a 0.6mm needle you can fit more cores than if you use a 2.0mm needle. You should have at least 3 cores from each sample, and several different types of tissue as positive and negative controls arranged in a way that will make it easy to orientate yourself when viewing under a microscope. Chapter 6 of this book describes some important considerations for building tissue microarrays and may help you in your design (http://link.springer.com/protocol/10.1385%2F1-59259-759-9%3A061)
  • asked a question related to Microarray Analysis
Question
4 answers
Hi, BioMart is down, so is there any other way to convert a list of official gene symbols back to probe set IDs for a given chip?
We have a list of gene symbols from the HuGene 1.0 ST array, but I need to convert those IDs to the equivalent probe set IDs from the HG-U133A array to run them through CMAP.
Thanks,
Steve
Relevant answer
Answer
You can also try the genome browser, with searches by probe set and gene, and the MySQL databases of independent mappings:
  • asked a question related to Microarray Analysis
Question
3 answers
I have data from a microarray analysis using Affymetrix. I have the fold change calculated by the Affymetrix software, but not the log2. I am wondering which is better for making the heat map: the fold change or the log2 fold change? And why?
Thank you in advance for your contributions.
Relevant answer
Answer
As suggested in the article “Analysis of microarray experiments of gene expression profiling” (Tarca et al. 2006), the fold change of a given gene is measured as a ratio, and these raw ratios are then log-transformed (usually log2). This is expected to give a mean log-ratio of zero and to improve the symmetry of the data distribution. So I would suggest that you have a thorough look at the paper before the analysis. If you have the input files that were generated after microarray scanning, you can generate log2 values using GeneSpring or other available software.
  • asked a question related to Microarray Analysis
Question
4 answers
I’ve worked on salinity tolerance and transcriptomics in rice using the Agilent 4x44K rice genome array, and I am now writing a paper on it.
I’ve done extensive Gene Ontology enrichment analysis and gene network analysis and have some lists of probes, such as:
Os01g0557500
Os01g0645200
Os05g0382200
Os06g0152200
Os06g0701600
Os08g0503700
Os09g0286400
Os09g0299400
Os09g0484900
Os10g0436900
I want to know which genes these probes refer to.
Also, which stress-adaptation-related pathway(s) are influenced by my list of significant genes?
Can anyone please help? You can be a co-author of my paper as well.
Thanks
Relevant answer
Perform GO annotations here:
WEB-based GEne SeT AnaLysis: http://bioinfo.vanderbilt.edu/webgestalt/
and
as starting points and then you come across more tools to do so!
Also, try DAVID: https://david.ncifcrf.gov/ for doing GOFat and GOSlim analyses.
Good luck.
Thanks,
Biswa
  • asked a question related to Microarray Analysis
Question
2 answers
I'm thinking of doing some Gene Ontology analysis on supplementary information I found in a paper (DOI: 10.1021/acs.jproteome.5b00770). However, the information on identified proteins is available from both iTRAQ and iBAQ analysis. I'm not sure which one to use. What is the difference between the two?
Thank you. 
Relevant answer
Answer
  • asked a question related to Microarray Analysis
Question
2 answers
Let's say that I want to compare the effect of stimulating monocytes with factor X on gene expression (microarrays), as measured by five distinct groups (this is just an example). All of them have uploaded their data to bioinformatics databases. Each of these groups analyzed gene expression with a different microarray platform. Each group also preprocessed the signal from the machine in a slightly different way and applied a different normalization procedure, so two types of data are available: raw and processed. It seems that the authors knew what they were doing while processing the data, so I am trying to make use of the processed results. Naturally, it would be wrong simply to compare the processed expression data of the stimulated cells between the groups. I wonder whether, for every group and every gene, I could calculate the relative change in expression as the ratio before_stimulation/after_stimulation and compare these values?
  1. That would free me from the effect of distinct platforms (since within each pair the same platform was utilized)
  2. Reduce the effect of the data transformation on the resulting values of gene expression (since data transformation within the pair was the same)
  3. Free me from the effect of distinct monocyte cultures in the beginning
Alternatively, I will have to use the raw data, but since I would transform the raw signal differently from most of the authors, I would mostly obtain results slightly different from those they have published... This seems odd...
*Also, I know that distinct microarray platforms analyze different genes sets - that is another problem.
Relevant answer
Answer
One important issue to consider is that the expression of genes in the unstimulated control may differ among the data sets. Once the data in each data set are transformed into ratios, the gene expression profile of the control essentially drops out. Depending on the format of each data set, examine the unstimulated control samples for any major differences that can be attributed to differences in the handling and processing of the control samples.
  • asked a question related to Microarray Analysis
Question
6 answers
Hi, I am using Ingenuity Pathway Analysis (IPA) to analyse breast cancer microarray data. I want to use the upstream regulator analysis to obtain relevant transcriptional regulators, but I don't understand exactly how it works. The analysis predicts the upstream regulators that explain the observed gene expression changes in my data, but why is such an upstream regulator not among my differentially expressed genes?
For example, I have TNF as an upstream regulator, but this gene is not part of my data. Why?
Thanks for the help.
Relevant answer
Answer
Upstream regulator is indicative of the TF activity driving the observed GE changes.  It will not necessarily, and usually not, be altered at the mRNA level itself.
  • asked a question related to Microarray Analysis
Question
6 answers
Hello,
I was wondering which cell detachment technique would be most suitable for RNA extraction? I will be doing microarray analysis and need the gene expression profile to stay intact. 
Thanks!
Relevant answer
Answer
You can do a control experiment with one batch detached by trypsinization and another batch detached by trypsinization along with another treatment, and then run qPCR for certain groups of genes (cell attachment genes, 18S, etc.) before the microarray.
  • asked a question related to Microarray Analysis
Question
6 answers
I am working on optimizing gene selection in microarray data for cancer classification. I am going to use an SVM (libsvm) in a wrapper approach to evaluate gene subsets using 10-fold cross-validation.
Microarray data are very high-dimensional (e.g. the Lymphoma data set consists of 4026 genes as features, 62 instances and 3 class labels).
Does libsvm support multiclass classification? In my work, the Lymphoma and MLL data sets have 3 classes.
What are the appropriate SVM type, kernel type and parameters for the chosen kernel (C, gamma, etc.) in LIBSVM for multiclass classification of data such as microarray data?
Relevant answer
Answer
Try a grid search over the SVM parameters in Python.
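A grid search simply evaluates every combination of candidate parameter values and keeps the best one. The sketch below shows the idea in plain Python over exponentially spaced values of C and gamma for an RBF kernel, a spacing that is commonly suggested for SVMs. The scoring function is a hypothetical placeholder: in a real analysis it would train the SVM with the given parameters and return the 10-fold cross-validation accuracy.

```python
import math
from itertools import product

def grid_search(score_fn, c_values, gamma_values):
    """Evaluate score_fn on every (C, gamma) pair and return the best triple."""
    best = None
    for c, gamma in product(c_values, gamma_values):
        score = score_fn(c, gamma)
        if best is None or score > best[0]:
            best = (score, c, gamma)
    return best

# Exponentially spaced candidate values (coarse grid)
c_grid = [2.0 ** e for e in range(-5, 16, 2)]
gamma_grid = [2.0 ** e for e in range(-15, 4, 2)]

def toy_cv_accuracy(c, gamma):
    """Hypothetical stand-in for cross-validated accuracy.

    It peaks at C = 2^3 and gamma = 2^-5 purely for illustration; replace it
    with a function that trains and cross-validates the actual classifier.
    """
    return 1.0 / (1.0 + (math.log2(c) - 3) ** 2 + (math.log2(gamma) + 5) ** 2)

best_score, best_c, best_gamma = grid_search(toy_cv_accuracy, c_grid, gamma_grid)
print(f"best C = {best_c}, best gamma = {best_gamma}, score = {best_score:.3f}")
```

With only 62 samples, the parameter search should be done inside the cross-validation folds used to evaluate the gene subsets, otherwise the reported accuracy will be optimistic.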
  • asked a question related to Microarray Analysis
Question
9 answers
Hi everyone,
I am planning to conduct a qPCR gene expression analysis of some defense-related genes in wheat. I am using resistant and susceptible plants to compare the gene expression. I am planning a time-course study with time points at 0 hours (control), 12 hpi, 24 hpi and 48 hpi after fungal inoculation, with the 0-hour (uninoculated) time point as the control.
My question is: can I use the 0-hour (uninoculated) samples as the control in my experiment? I am not using Tween 20 when spraying the fungal suspension.
Relevant answer
Answer
Dear Chethana,
I always perform these experiments by comparing the inoculated and uninoculated samples at the same time point. I also use time 0 to check whether there is variation in expression, by comparing this time point with the uninoculated samples.
Regards, Renato
  • asked a question related to Microarray Analysis
Question
7 answers
I need to extract RNA from some human cell lines for microarray analysis, but I don't know the best way of purifying RNA from a human cell line.
Searching online, I found the 'MagMAX™-96 for Microarrays Total RNA Isolation Kit', but it needs some accessories that we currently don't have in our lab, such as a magnetic-ring stand.
Does anybody have experience with microarray sample preparation?
Relevant answer
Answer
I would suggest using a column-based purification method (Qiagen, M&N etc.), maybe in combination with Trizol. This only requires a centrifuge that holds Eppendorf tubes. But, as Abhijit also writes, check the RIN numbers after purification. If they are high (8 or above), you are good to go.
  • asked a question related to Microarray Analysis
Question
9 answers
I have microarray data with fold change values, with a cut-off of 0.6 and above for upregulated genes and -0.6 and below for downregulated genes. How can I convert fold change values to log2 ratios and vice versa? Does Excel have such a feature? I also want to ask how I can generate a heat map from my microarray data. What are the prerequisites (fold change, p-value or log2 ratio) for generating a heat map? What tools are required for it? Can it also be generated in Excel, and how?
Relevant answer
Answer
If you really want to study expression profiles, forget Excel. Really.
Instead, start learning some software made to handle such data. A very commonly used software for this task is R (www.r-project.org), which can be extended with thousands of specialized add-on packages, many of them specifically for handling microarray data (see www.bioconductor.org). It will take you some time and nerves to learn it, but it is absolutely worth it - not only for solving your current problems but also for improving structured thinking and career options. And R is free, and there are tons and tons of tutorials and books freely available on the internet as well.
However, in Excel you can get the log2 of a value with =LOG(x;2), and the inverse function is the power to base 2: =2^x or =POWER(2;x)
Heatmaps cannot really be made with Excel. You can fiddle around with conditional highlighting of cells, but Excel doesn't let you cluster rows or columns. There are some tools on the internet that will produce heatmaps. But again, better learn R!
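The same conversion as the Excel formulas above, written as a small Python sketch. One assumption is worth stating: a plain fold change (a ratio) cannot be negative, so cut-offs of 0.6 and -0.6 as in the question are presumably already log2 ratios.

```python
import math

def fold_change_to_log2(fold_change):
    """Convert a ratio-scale fold change (> 0) to a log2 ratio."""
    if fold_change <= 0:
        raise ValueError("a ratio-scale fold change must be positive")
    return math.log2(fold_change)

def log2_to_fold_change(log2_ratio):
    """Convert a log2 ratio back to a ratio-scale fold change."""
    return 2.0 ** log2_ratio

# Cut-offs from the question, read as log2 ratios (an assumption)
up_cutoff = log2_to_fold_change(0.6)
down_cutoff = log2_to_fold_change(-0.6)

print(f"log2 ratio  0.6 corresponds to a fold change of {up_cutoff:.3f}")
print(f"log2 ratio -0.6 corresponds to a fold change of {down_cutoff:.3f}")
print(f"a 2-fold change corresponds to a log2 ratio of {fold_change_to_log2(2.0):.1f}")
```

On the log2 scale, up- and downregulation of the same magnitude are symmetric around zero, which is why log2 ratios are the usual input for heat maps.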
  • asked a question related to Microarray Analysis
Question
10 answers
Hello,
We have a study where we used two different chips, and we now need to analyze the data.
Does anyone know how to normalize microarray data from two different chips?
Relevant answer
Affy chips are similar in their probe sets, but they differ in the number of probe sets. The Affy U133 Plus 2.0 covers about 54,000 probe sets and the U133A 2.0 Array covers only about 22,000 probe sets. Therefore, if the Affy chips are not the same, you should first filter out the non-overlapping probe sets.
For integration there are two levels: probe-set-level integration and gene-level integration. For probe-set-level integration you can use several methods and packages such as VirtualArray (1) or (2). These methods merge two datasets into a single dataset that should be normalized together in the following steps. Special statistical considerations are needed, since any small deviation in these methods will have a strong effect on your results. Statistically, you will need batch effect removal using a package such as sva, limma or VirtualArray.
For gene-level integration you can use one of the many meta-analysis tools such as RankProd, MetaOmics, MetaArray and so on. These methods compute differential genes for each dataset separately and then use a method (Fisher's combined p-value, combined effect size, vote counting and so on) to unify the results. Some tools (MetaOmics) also use pathway enrichment results to qualify and unify the results.
Finally, take care to check the quality of the analysis using Naser's comment along with link (3), which gives very detailed information about QC for Affy chips. I also recommend the MetaQC package for checking the quality of both data sets.
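As an illustration of the gene-level route, Fisher's method combines the p-values that one gene obtained in k independent datasets into a single p-value. The sketch below is plain Python and relies on the closed-form chi-square tail for even degrees of freedom; the p-values are invented, and the dedicated meta-analysis packages named above do considerably more than this.

```python
import math

def fisher_combined_p(p_values):
    """Combine independent p-values for one gene using Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null; for even degrees of freedom the
    upper tail is exp(-X/2) * sum_{i<k} (X/2)^i / i!.
    """
    k = len(p_values)
    half = -sum(math.log(p) for p in p_values)  # this is X / 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total

# Invented p-values for the same gene in two independent datasets
combined = fisher_combined_p([0.05, 0.05])
print(f"combined p-value: {combined:.5f}")
```

Fisher's method ignores the direction of the effect, so a gene that is up in one dataset and down in the other can still come out as significant; checking the sign of the fold change in each dataset is a sensible complement.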
  • asked a question related to Microarray Analysis
Question
6 answers
We are starting some work with a data file that is available online.
The Agilent microarray result file has several columns representing various parameters.
Please indicate whether "gMedianSignal" holds the actual expression values. I suppose that "gBGMedianSignal" should be subtracted to remove the background effect.
The remainder is then to be log-transformed.
Please let me know whether this is correct or not!
Looking forward,
Relevant answer
Answer
Yes. "gMedianSignal" is the median pixel intensity at 570nm (Cy3 emission = "green" [g]). You may also consider using the "gMeanSignal", which is the mean signal intensity. You can use the ratio of these two values to check the homogeneity of the pixel intensities within the spots. "BG" stands for (local) background intensity.
The signal intensities for the Cy5 dye are in the columns starting with "r" (for "red", measured at 670nm).
Simple subtraction of the local background can give you many spots with zero or even negative remaining intensities, which will not work with log-transformation. It is recommended to use the "normexp" method. See http://bioinformatics.oxfordjournals.org/content/23/20/2700.full
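The problem with simple subtraction is easy to demonstrate. In this plain-Python sketch the two lists stand in for the "gMedianSignal" and "gBGMedianSignal" columns; the numbers are made up. It does not implement normexp, it only shows which spots are lost when the background is subtracted directly.

```python
import math

def subtract_and_log2(signal, background):
    """Return log2(signal - background) per spot, or None where that is undefined."""
    result = []
    for s, b in zip(signal, background):
        corrected = s - b
        # The logarithm is undefined for zero or negative intensities
        result.append(math.log2(corrected) if corrected > 0 else None)
    return result

# Made-up values standing in for gMedianSignal and gBGMedianSignal
g_median_signal = [1200, 85, 60, 430]
g_bg_median_signal = [80, 85, 72, 75]

log_values = subtract_and_log2(g_median_signal, g_bg_median_signal)
lost = sum(value is None for value in log_values)

print(log_values)
print(f"{lost} of {len(log_values)} spots cannot be log-transformed")
```

Spots near the background level are exactly the ones that drop out, so low-expressed genes are affected most; this is the situation that model-based corrections such as normexp are designed to avoid.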
  • asked a question related to Microarray Analysis
Question
4 answers
I have had a request for 15 µm sections from very precious TMAs.  I don't think this will work, because thick sections usually roll, the individual cores usually fold if the section is too thick. I usually section them at 3-5 µm.
Does anyone have any experience of cutting TMAs thicker?
Relevant answer
Answer
Thanks John.  I was not convinced that we could justify thicker sections, and we have agreed that they will receive 5 µm sections as usual.  
  • asked a question related to Microarray Analysis
Question
3 answers
Hi,
I would like to know what Rinmatched is in a microarray experiment, and what it indicates when it is equal to 0 or 1. I am familiar with RIN, the RNA Integrity Number, but I don't have a clue about Rinmatched. Is there any difference between these parameters? I have also enclosed a screenshot of the GEO2R panel. What is the difference between two controls with RIN = 6, when one has Rinmatched = 0 and the other has Rinmatched = 1?
Many thanks
Mona Azodi
Relevant answer
Answer
I guess it is; there are controls with the same RIN, as shown in the attached file, but a different rinmatched value. Both of them have a RIN of 6, but one has rinmatched = 0 and the other has rinmatched = 1.
If rinmatched is related to RIN, I do not know what the difference between these two controls is, nor which one to compare with the disease samples.
The data are related to a publication, but I could not figure it out by reading the text.
here it is:
Genetic Neuropathology of Obsessive Psychiatric Syndromes
Thanks a lot
  • asked a question related to Microarray Analysis
Question
11 answers
Which datasets are you exploiting to evaluate recommender systems?
Relevant answer
Answer
There are multiple datasets on the web that you can use during the evaluation step. Depending on your model and the auxiliary information used (tags, timestamps, ratings, etc.), you should choose the dataset closest to your needs. The most used datasets in the literature can be found here:
  • asked a question related to Microarray Analysis
Question
7 answers
I am using ImageJ with a microarray profile, but it takes a lot of time to place the software's circles on my objects to measure the color intensity.
Basically, I want the software to recognize all my samples in the same picture and let me measure their color intensity. Thanks!
Relevant answer
Answer
Design the software yourself rather than getting it done by somebody else.