Bioconductor - Science topic

Questions related to Bioconductor
  • asked a question related to Bioconductor
Question
1 answer
Hi good people,
I am trying to analyze CyTOF data with the PhenoGraph algorithm of cytofkit. I have run into a lot of issues running the code in R. Has anyone performed the same analysis? I would like to know which versions of R and Bioconductor you used.
Thank You,
Sadi
Relevant answer
Answer
You should be able to use the cytofkit Shiny app without writing any R code: just open R, type library(cytofkit), and then cytofkitShinyAPP()
However, cytofkit runs only on older versions of R (if I remember correctly, R 3.6 or older).
  • asked a question related to Bioconductor
Question
5 answers
I'm aware that Biobase is part of the Bioconductor project and that various other packages use it. But what are the functions of this package, and what kind of data do we use it for?
Relevant answer
Answer
Dear Dr. Ronán Michael Conroy , you are probably right. Thanks for the comment.
  • asked a question related to Bioconductor
Question
5 answers
When I tried to install a Bioconductor package, I got this error: "package ‘bioconductor’ is not available (for R version 3.6.3)". Can anyone help?
Relevant answer
Answer
uninstall R and R studio
  • asked a question related to Bioconductor
Question
2 answers
Dear all, hope you all are doing well,
I'm installing edgeR in R version 3.4.4 (2018-03-15) on Ubuntu 18.04 by running the command:
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
then
BiocManager::install("edgeR")
but I'm getting the error which is as follows...
> if (!requireNamespace("BiocManager", quietly = TRUE))
+     install.packages("BiocManager")
> BiocManager::install("edgeR")
'getOption("repos")' replaces Bioconductor standard repositories, see
'?repositories' for details
replacement repositories:
    CRAN: https://cloud.r-project.org
Bioconductor version 3.6 (BiocManager 1.30.16), R 3.4.4 (2018-03-15)
Installing package(s) 'BiocVersion', 'edgeR'
Error in download.file(url, destfile, method, mode = "wb", ...) :
  unused argument (checkBuilt = FALSE)
In addition: Warning messages:
1: In .inet_warning(msg) :
  package ‘BiocVersion’ is not available (for R version 3.4.4)
2: In .inet_warning(msg) : dependency ‘locfit’ is not available
Installation paths not writeable, unable to update packages
  path: /usr/lib/R/library
  packages:
    boot, class, cluster, codetools, KernSmooth, lattice, MASS, nlme, nnet,
    rpart, spatial
Warning message:
In .inet_warning(msg) : download of package ‘edgeR’ failed
Please help and let me know how I can solve the above issue.
thank you.
Relevant answer
Answer
Have you tried to install it from the source file?
  • asked a question related to Bioconductor
Question
5 answers
Dear all, hope you are all doing well. I just installed R version 4.1.2 on Ubuntu 20.04 and then installed edgeR by running the following commands:
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
then..
BiocManager::install("edgeR")
After that it required the limma package, and when I tried to install it, I got the following warning:
Warning message:
package(s) not installed when version(s) same as current; use `force = TRUE` to
re-install: 'limma'
Please help and let me know to find solution.
thank you
Relevant answer
Answer
you're very welcome.
  • asked a question related to Bioconductor
Question
7 answers
I am unable to open the raw microarray data files in R. I contacted
I checked a few online resources (Bioconductor packages, EBI training courses, etc.), but the files still do not open.
Relevant answer
Answer
Retrieve the list of GEO accessions and sample types,
for example:
Sample sample_type
GSM123444 Control
GSM123445 treatment1
GSM123456 treatment2
GSM123457 treatment3
GSM123458 treatment4
Store the above list in a text file, e.g. path.txt.
Then import it into R:
data <- read.table("path.txt", header = TRUE)
Best!!
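To illustrate the import step above, here is a minimal sketch in base R. It assumes the same two-column sample sheet; the text is inlined with the `text =` argument of read.table so the example runs without a file on disk, and the variable names (`sheet_text`, `samples`) are only illustrative.

```r
# Hypothetical sample sheet, inlined so the example runs without a file on disk.
sheet_text <- "Sample sample_type
GSM123444 Control
GSM123445 treatment1
GSM123456 treatment2
GSM123457 treatment3
GSM123458 treatment4"

# header = TRUE takes the column names from the first line.
samples <- read.table(text = sheet_text, header = TRUE, stringsAsFactors = FALSE)

# Treat the sample type as a grouping factor for later analysis.
samples$sample_type <- factor(samples$sample_type)

str(samples)
table(samples$sample_type)
```

With a real file, replacing `text = sheet_text` by the file path (e.g. "path.txt") should give the same data frame.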
  • asked a question related to Bioconductor
Question
5 answers
Can anyone suggest to me what will be the best way to learn how to use the Perl or R bioconductor? Can anyone suggest a good link so I am able to learn on my own? Though my PhD was focussed on immunology, I want to learn how to use these two tools in the field of epigenetics?
  • asked a question related to Bioconductor
Question
5 answers
I am analysing a 14-parameter flow cytometry panel in FlowJo v10.3 and would like to clean up the data before analysis. There are two plugins (flowClean and FlowAI) which use R to get rid of bad quality data (e.g. interrupted flow or signal acquisition issues).
Despite following the tutorials, I am getting various error messages including:
"Could not create Gating-ML elements:
gating:RectangleGate
The target sample does not have some parameters referenced in the GatingML definition"
When this happens, I get some basic plots, but the programme does not split my "good events" from my "bad quality" events.
Alternatively "FlowJo could not derive the expected parameter" (ie the calculation fails totally).
Can anyone tell me why this is happening and how to fix it please?
Relevant answer
Answer
I met the same "FlowJo could not derive the expected parameter" problem.
After confirming that the R package flowAI worked in RStudio, I found that the problem was caused by Chinese characters in the path to my FCS files.
Once I changed the path to use only English characters, it worked!
Hope this will help you.
Have a good day:)
  • asked a question related to Bioconductor
Question
4 answers
I have a dataset with 149 columns of GSM IDs, plus a first column with the gene name (screenshot attached). There are 20,000 rows (genes) in total. How can I analyze the dataset to find biological pathways using KEGG or another pathway database?
All GSM IDs are lung cancer microarray expression data.
I know how to do differential gene expression and pathway analysis, but I don't know how to analyze this type of dataset.
Please also comment if you feel that the dataset is not correct or cannot be used to find the pathways, or if other information is required.
Any help would be great. Thanks in advance.
Relevant answer
Answer
Sir can you share a tutorial or workbook or something like that. I can work in R, Python and Linux.
Thanks
  • asked a question related to Bioconductor
Question
4 answers
Hi everyone, I have a transcription dataset in the Affymetrix HG-U133_Plus_2 format and I want to convert the probe IDs into official gene symbols. I have tried the Bioconductor "hgu133a.db" library but it doesn't work. Can somebody help me understand how to do it? Here are some of the probes I'm unable to convert. Thanks in advance!
1552256_a_at 1552257_a_at 1552258_at 1552261_at 1552263_at 1552264_a_at 1552266_at 21552269_at 1552271_at 1552272_a_at 1552274_at 1552275_s_at 1552276_a_at 1552277_a_at 1552278_a_at 1552279_a_at 1552280_at 1552281_at 1552283_s_at 1552286_at 1552287_s_at 1552288_at
Relevant answer
Answer
Dear Riccardo;
A) the database of DAVID can convert the affy IDs.
B) If you are not skilled in programming:
1) download the latest annotation file for HG-U133_Plus_2 from the company website:
find the file named
HG-U133_Plus_2 Annotations, CSV format, Release 36 (36 MB, 7/12/16)
2) unzip the file
3) open the file in MS excel
4) use the "VLOOKUP" function of Excel to find the symbols. If you have only a handful of IDs, you may simply search within the file.
If you are familiar with coding in R, use AnnotationDbi (a Bioconductor package).
Best
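For readers who prefer R to Excel, the VLOOKUP step above can be sketched in base R with match(). This is only an illustration: the annotation table is a tiny made-up stand-in for the vendor CSV (real column names will differ), and the gene symbols are placeholders, not real annotations. For the Bioconductor route, the annotation package matching this chip is, as far as I know, hgu133plus2.db rather than hgu133a.db; the commented line shows the usual AnnotationDbi call under that assumption.

```r
# Hypothetical annotation table; in practice this would be read from the
# vendor's annotation CSV (its column names may differ).
annot <- data.frame(
  probe_id = c("1552256_a_at", "1552257_a_at", "1552258_at"),
  symbol   = c("GENE_A", "GENE_B", "GENE_C"),  # placeholder symbols
  stringsAsFactors = FALSE
)

# Probe IDs to convert; the last one is deliberately absent from the table.
probes <- c("1552258_at", "1552256_a_at", "1552261_at")

# match() returns the row of each probe in the annotation (NA if missing),
# which is what VLOOKUP does in a spreadsheet.
symbols <- annot$symbol[match(probes, annot$probe_id)]
result <- data.frame(probe_id = probes, symbol = symbols, stringsAsFactors = FALSE)
print(result)

# Bioconductor alternative (assumes hgu133plus2.db is installed):
# AnnotationDbi::select(hgu133plus2.db::hgu133plus2.db, keys = probes,
#                       columns = "SYMBOL", keytype = "PROBEID")
```

Probes without an entry come back as NA, so it is easy to list the ones that could not be converted with `result[is.na(result$symbol), ]`.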
  • asked a question related to Bioconductor
Question
3 answers
The Bioconductor package and some other libraries need this version. So if someone uses or knows about the alternate source kindly help.
  • asked a question related to Bioconductor
Question
6 answers
I am trying to install Bioconductor packages to open CEL files and analyze the raw data files generated by Affymetrix microarrays. I found some workflows on the Bioconductor website but I could not install the packages, maybe due to the different Bioconductor versions. I would greatly appreciate if anyone can give some suggestions about the workflows that I should use and/or how to download the old Bioconductor versions.
Relevant answer
Answer
Thank you for your advice!
  • asked a question related to Bioconductor
Question
3 answers
Dear colleagues,
I am looking for a python library equivalent to the one found in the bioconductor annotation package for TxDb object "TxDb.Hsapiens.UCSC.hg19.knownGene". Also, I am interested to know how essential this particular package is to you and among your fellows in the field of Genomics or in similar disciplines?
much appreciated,
Nawaf.
Relevant answer
Answer
Nawaf Alomran Thank you for this information. This is a smarter way :)
  • asked a question related to Bioconductor
Question
3 answers
I have more than 500 Datasets of RNA-sequencing data (both FASTA and FASTQ format) and I'd like to analyze gene expression and differentially expressed genes.
Files in FASTA format are on my PC (Windows OS) and those in FASTQ format have been imported to the Galaxy website (usegalaxy.eu).
I'm not familiar with gene expression analysis (GEA). I recently installed R and I'm working with Bioconductor packages (like DESeq2, edgeR, Biobase, etc.) to learn how to use them for GEA. I don't know how long, but it seems it will take a long time to learn and use them.
My question is: could anyone let me know what is the best and fastest way to do GEA?
Is R the best software for GEA? If yes, is there a simple tutorial for GEA in R?
Given the huge volume of RNA-seq data, my PC may not be able to analyze it. Is there any software on the Galaxy website for GEA?
Any guidance is warmly appreciated.
Relevant answer
Answer
Hi Ali,
first of all, I would suggest starting with a smaller dataset when learning how to analyse RNA-seq data. This will reduce a lot of headache and will give you an easier time bug-fixing and testing stuff.
Second, if you are serious about doing bioinformatics you will need to switch to Linux/macOS at some point. You can keep Windows and install a virtual machine running Linux to get around this. You will also need to dive deeper into bioinformatics to understand and build on what you are analysing (which is all possible, but goes beyond the scope of what you want to read here).
So to make it short, you can use galaxy and R for GE analysis of RNA-seq data. As you already have your data in galaxy, you can use "salmon quant" to quasi-map and quantify transcript levels (input are FQ files from one sample and a reference cdna you need to provide). Then you can use the salmon output to run DESeq2 and analyse differential gene expression between conditions (you need to specify you used salmon for mapping and provide a transcript ID to gene ID mapping file). Depending on what organism you are working with, it should be rather easy to find the cdna reference (just google) and a transcript ID to gene ID mapping file (you can use biomart, also just google).
Hope this helps!
Alexander
  • asked a question related to Bioconductor
Question
3 answers
tl;dr: Why is data linearization applied in the identification of differentially expressed genes/proteins?
I am new to data analysis of big data like proteomics. As far as I know, a simple t-test is not enough as there is a high chance of false positives. I've been reading and it seems that Limma is a good package with better statistics to be applied in the identification of differentially expressed proteins (and genes). Most of the papers apply linearization in the process of identifying the genes but I would like to understand why this step is necessary.
An example of R code that I generally see people using is this one:
design <- model.matrix(~factor(c(2,2,2,2,1,1,1,1)))
Thank you in advance!
Relevant answer
Answer
Dear Luiz Gustavo Nogueira Almeida, the core of the limma package is the linear model. A very simple explanation of why model.matrix is used is that this function lets you create several types of design matrix. Based on the matrix you create, lmFit will estimate the parameters of interest.
So back to your example:
design <- model.matrix(~factor(c(2,2,2,2,1,1,1,1)))
This means that you have two groups, so when you fit your model you will estimate two parameters: the first is the intercept, which estimates the population average of the first group (factor level 1), and the second estimates the difference between the population averages of the second group and the first group. There is a whole chapter about design matrices in the book written by Rafael Irizarry and Michael Love called Data Analysis for the Life Sciences.
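To make the interpretation of the two coefficients concrete, here is a small sketch in base R. It uses lm() on one simulated "gene" as a stand-in for what limma's lmFit does per row; the data are simulated and the variable names are only illustrative.

```r
# Same grouping as in the question: four samples of group 2, then four of group 1.
group <- factor(c(2, 2, 2, 2, 1, 1, 1, 1))

# The design matrix has an intercept column and an indicator for group 2.
design <- model.matrix(~ group)
print(design)

# Simulated expression values for a single gene (assumption, for illustration).
set.seed(1)
y <- c(rnorm(4, mean = 10), rnorm(4, mean = 8))

# Ordinary least squares with the same design; lmFit fits this kind of
# model for every gene at once.
fit <- lm(y ~ group)
coef(fit)

# The intercept equals the mean of group 1 (the reference level) and the
# second coefficient equals mean(group 2) - mean(group 1).
mean_g1 <- mean(y[group == 1])
diff_g2_g1 <- mean(y[group == 2]) - mean_g1
c(mean_g1 = mean_g1, diff_g2_g1 = diff_g2_g1)
```

Note that the reference group is the first factor level (here "1"), not the first sample in the vector, which is why the last four samples define the intercept.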
  • asked a question related to Bioconductor
Question
2 answers
Hi, I have sent my RRBS raw data to 2 different bioinformaticians. The gene lists I got back differ extremely, with an overlap of less than 25%. The only step I can identify as different between the pipelines they used is the methylation calling step: one used Bismark, and the other used Bioconductor. How should I choose the right one? Thanks in advance.
Relevant answer
Answer
Thank you Ali!
The functional analyses show more or less the same pathways. Could this be indicating that the different bioinformatic approaches are not relevant for this case?
  • asked a question related to Bioconductor
Question
16 answers
Hello everyone,
Currently I am trying to do k-means clustering on a microarray dataset which consists of 127 columns and 1000 rows. When I plot the graph, I get an error: "figure margins too large". Then I typed this in the R console:
par("mar") # Shows the current margins
par(mar = c(1, 1, 1, 1)) # Tried to update the margins
But it did not work. Can anyone suggest another way of fixing this problem? (Part of the code is attached below.)
Thanks,
Hasan
--------------------------------------------------------------------------------------------------------------
x = as.data.frame(x)
km_out = kmeans(x, 2, nstart = 20)
km_out$cluster
plot(x, col = (km_out$cluster + 1), main = "K - Means Clustering Results with K=2", xlab = "", ylab = "", pch = 20, cex = 2)
>Error in plot.new() : figure margins too large
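A likely cause of this error is that plot() on a data frame with more than two columns draws a scatterplot matrix, so 127 columns means 127 x 127 panels, which cannot fit on any device regardless of the margin settings. One possible workaround, sketched below on simulated data of the same shape, is to project the rows onto the first two principal components and colour the points by cluster. The data and variable names here are assumptions for illustration only.

```r
# Simulated stand-in for the 1000 x 127 microarray matrix.
set.seed(42)
x <- as.data.frame(matrix(rnorm(1000 * 127), nrow = 1000, ncol = 127))

# K-means on all 127 columns, as in the question.
km_out <- kmeans(x, centers = 2, nstart = 20)

# Reduce to two dimensions for plotting only; clustering still uses all columns.
pca <- prcomp(x)
pc12 <- pca$x[, 1:2]

# A single 2-D scatterplot instead of a 127 x 127 scatterplot matrix.
plot(pc12, col = km_out$cluster + 1,
     main = "K-Means Clustering Results with K=2",
     xlab = "PC1", ylab = "PC2", pch = 20)
```

Plotting any two chosen columns, e.g. `plot(x[, 1:2], ...)`, would also avoid the error, but the PCA view usually shows the cluster structure better.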
  • asked a question related to Bioconductor
Question
3 answers
We are doing a study analyzing the expression of certain genes and correlating that with response to chemotherapy. So far I have been manually going through every dataset on the NCBI website and teasing out which ones have "therapy response" or any variation of that as a variable. Is there a more efficient way to do this? Like a query to filter out highthroughput/microarray data that also contains therapy response/pathologic complete response/etc. Any help would be greatly appreciated. Thanks.
Relevant answer
Answer
Hi Norman,
If you used the filters in GEO Profiles and found nothing, maybe you could go another way: try a PubMed query and check which of the papers found have data deposited in GEO...
fred
  • asked a question related to Bioconductor
Question
4 answers
Hi.
I have two conditions of my Gene Expression microarray data sets. The control/untreated and the treated sample. Each condition has two replicates. However, the first and second replicates were prepared at different times.
I analysed the data using the two replicates in Agilent GeneSpring. I wanted to produce a volcano plot. I chose the moderated t-test with Benjamini-Hochberg FDR (FC > 2); however, I did not get any significant entities using the corrected P-value. I kept changing the corrected P-value cut-off, and only at 0.98 did I get some entities, but still very few (fewer than 5).
I thought this problem could be due to a batch effect, as the replicates were prepared at different times.
And since I'm still new to preparing the samples, the handling was inconsistent.
I believe the best thing would be to run the replicates again at the same time, but I hope that will be a last resort, since it is costly to repeat the microarray.
I need some suggestions on how to correct this.
Is it fine to report the entities without multiple-testing correction? How should I deal with false positives if I don't do the correction?
Besides Agilent GeneSpring, is there any recommended way to perform the analysis (Agilent datasets), maybe with Bioconductor in R, to tackle this problem?
Thank you.
Relevant answer
Answer
Thank you Frederic Lepretre and Maribel Baldellou for the advice.
My lab usually prepares two replicates for statistical analysis in microarray experiments, run by experienced technical staff. But since the problem this time is technical, we will prepare new samples and run the replicates at the same time.
As we need to have some hints from the DEGs, for now we use FC > 2 and will validate some by qPCR. I will design the primers around the probe.
Thanks again!
  • asked a question related to Bioconductor
Question
3 answers
Good day!
I'm trying to carry out co-expression analysis using CEMiTool after limma preparation of microarray results.
The CEMiTool user guide points out that one should use unprocessed expression data.
Experimentally, I found that there is no big difference between the resulting co-expression modules if I change the FDR p.value in the topTable function from 0.05 to 0.1.
But there really is a big difference depending on whether I use topTable(... adjust.method = "BH") or "none" before submitting data to CEMiTool: the genes change their positions in the co-expression modules.
Should I use the Benjamini-Hochberg correction? Or maybe I should not filter the data by p-values or correct them at all?
The values I use are average log fold changes, each from 3 replicates.
Relevant answer
Answer
Yep! That makes sense to me. The adjusted p-value does not have a big impact in such a case, but it does not hurt either. In my opinion, applying BH and using a 0.05 cut-off as you did should be fine.
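To see why switching between "BH" and "none" changes which genes pass the filter, here is a small sketch with base R's p.adjust(), which, as far as I know, is what topTable relies on for adjust.method. The p-values are made up for illustration.

```r
# Hypothetical raw p-values for seven genes.
p <- c(0.0004, 0.003, 0.012, 0.03, 0.04, 0.2, 0.6)

# Benjamini-Hochberg adjustment controls the false discovery rate.
p_bh <- p.adjust(p, method = "BH")
round(p_bh, 4)

# The adjustment keeps the ranking of the genes (no ties here)...
same_order <- identical(order(p), order(p_bh))

# ...but fewer genes pass the same 0.05 cut-off after adjustment.
n_raw <- sum(p < 0.05)     # 5 genes pass on raw p-values
n_bh  <- sum(p_bh < 0.05)  # 3 genes pass after BH
c(same_order = same_order, n_raw = n_raw, n_bh = n_bh)
```

So the two settings hand a different gene set to the downstream tool, which alone could explain modules changing, independently of the co-expression method itself.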
Best
  • asked a question related to Bioconductor
Question
3 answers
> source("https://bioconductor.org/biocLite.R")
Bioconductor version 3.8 (BiocInstaller 1.32.1), ?biocLite for help
Warning message:
'BiocInstaller' and 'biocLite()' are deprecated, use the 'BiocManager' CRAN
package instead.
> if (!requireNamespace("BiocManager", quietly = TRUE))
+     install.packages("BiocManager")
> BiocManager::install("BiocInstaller", version = "3.8")
Bioconductor version 3.8 (BiocManager 1.30.4), R 3.5.2 (2018-12-20)
Installing package(s) 'BiocInstaller'
Warning: package ‘BiocInstaller’ is in use and will not be installed
installation path not writeable, unable to update packages: class, codetools
Need Suggestions
Relevant answer
Answer
Unfortunately, this is an issue with PC's when R packages (and dependencies) are downloaded to a user (not a root) directory... try "chmod -R 777" in the command line to allow read/write/execute permissions...
  • asked a question related to Bioconductor
Question
6 answers
I am running an R script that downloads and preprocesses all the available methylation data sets from TCGA. I'm using the Bioconductor package MethylMix for this. However, when I try to process the 450K breast cancer methylation data set (size ~13GB), I get a "Cannot allocate vector of size 12.8 GB" error.
I am running R 3.4.0 on 64-bit x86_64_pc-linux-gnu using my school's computing cluster, and each node has the following properties:
  • Dual Socket
  • Xeon E5-2690 v3 (Haswell) : 12 cores per socket (24 cores/node), 2.6 GHz
  • 64 GB DDR4-2133 (8 x 8GB dual rank x8 DIMMS)
  • No local disk
  • Hyperthreading Enabled - 48 threads (logical CPUs) per node
so it seems as though there should be enough memory for this operation. The operating system is Linux, so I thought that R will just use all available memory, unlike on Windows? And checking the process memory using ulimit returns "unlimited." I am not sure where the problem lies. My script is a loop that iterates over all cancers available on TCGA, if that makes any difference.
Relevant answer
Answer
The 64-bit version of R should be able to address all available memory, even under Windows (64-bit).
A possible issue is that you may keep several copies of the object in memory. Also, every time you make a change to an object, the entire object is copied. Make sure that you don't have any copies and call the garbage collector before starting a memory-demanding process.
However, if you process it in a loop anyway, you can divide it into chunks and process them one after the other.
And having a look at Emily's hint is surely a good idea as well.
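The chunking and garbage-collection advice above could look roughly like the sketch below. It is base R only and runs on a small simulated matrix; `process_in_chunks` is a hypothetical helper written for this example, not a function from MethylMix or any other package.

```r
# Hypothetical helper: apply `fun` to successive blocks of row indices.
process_in_chunks <- function(n_rows, chunk_size, fun) {
  starts <- seq(1, n_rows, by = chunk_size)
  lapply(starts, function(s) {
    idx <- s:min(s + chunk_size - 1, n_rows)
    fun(idx)
  })
}

# Small simulated matrix standing in for a large methylation data set.
set.seed(1)
big <- matrix(rnorm(10000 * 20), nrow = 10000)

# Process 2500 rows at a time; only one block is subset at any moment.
chunk_means <- process_in_chunks(
  nrow(big), 2500,
  function(idx) rowMeans(big[idx, , drop = FALSE])
)
row_means <- unlist(chunk_means)

# Drop intermediate objects and ask R to release the memory before the
# next memory-hungry step (e.g. the next cancer type in the loop).
rm(chunk_means)
invisible(gc())
```

In a loop over many data sets, removing the previous iteration's large objects with rm() before loading the next one matters more than the gc() call itself, since R collects garbage automatically once nothing references the object.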
  • asked a question related to Bioconductor
Question
2 answers
Hi all
Actually, I have to design sgRNAs using CRISPRseek and screen the whole genome for off-target analysis. Unfortunately, no BSgenome package for banana is available on the Bioconductor website (https://bioconductor.org/packages/release/bioc/html/BSgenome.html). Can anyone suggest alternative packages, or explain how to create a BSgenome package for banana using an R script? I have no idea how to create a new BSgenome package.
Thanks in advance
Relevant answer
Answer
Thanks Sanjay. I ll let you know
  • asked a question related to Bioconductor
Question
7 answers
Our research group works on viral detection in human samples through PCR-based methods. We usually sequence the PCR amplicons on a Sanger-based platform (Applied Biosystems 3500 genetic analyzer) to confirm the specific amplification of viral sequences. When analyzing the resulting electropherograms, it is common to observe degenerate bases (usually Ys and Ws) that do not seem to be caused by errors in the sequencing process, but rather to represent intra-host variability in the viral sequences.
This raised our interest in further investigating these candidate variations and searching for possible active mutational processes. Specifically, we are interested in quantifying the possible influence of APOBEC cytidine deaminases in generating these variations (by searching for mutations in APOBEC-specific recognition sites, namely 5'-TC-3', over random candidate mutations). Is there any software, package or pipeline adapted for this analysis?
I've read about and downloaded the Minor Variant Finder software (MVF, from Applied Biosystems), but it does not seem suited to this question, since it was developed to identify low-frequency human variants and requires the parallel sequencing of a control sequence, and I don't know what that could be in my case.
Thank you!
Relevant answer
Answer
Dear Glauco,
Hypermut, as suggested by Brian Foley, is to the best of my knowledge quite good at estimating APOBEC-driven mutations, although it is more specific to A3G and A3F nucleotide contexts. The algorithm calculates the probability that a given sequence has been hypermutated by APOBECs by comparing the imbalance between mutated cytidines in APOBEC versus non-APOBEC (random) contexts, and provides a p-value for that comparison.
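As a rough illustration of the kind of comparison described above, a 2x2 table of mutated versus unmutated cytidines in APOBEC versus other contexts can be tested with Fisher's exact test in base R. This is only a sketch under that assumption, not a reimplementation of Hypermut, and the counts are made up.

```r
# Hypothetical counts of cytidines by sequence context and mutation status.
counts <- matrix(
  c(18,  82,   # APOBEC context (5'-TC-3'): mutated, unmutated
     5, 195),  # other contexts:            mutated, unmutated
  nrow = 2, byrow = TRUE,
  dimnames = list(
    context = c("APOBEC (TC)", "other"),
    status  = c("mutated", "unmutated")
  )
)
print(counts)

# One-sided test: are mutations enriched in the APOBEC context?
ft <- fisher.test(counts, alternative = "greater")
ft$p.value
ft$estimate  # conditional estimate of the odds ratio
```

With only a handful of degenerate positions per amplicon the test will have little power, so pooling positions across samples may be needed before drawing conclusions.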
  • asked a question related to Bioconductor
Question
3 answers
Hello everyone,
I have to analyse data from Affymetrix microarray (Human Genome U133 Plus 2.0 Array) with Bioconductor and it is the first time I am using Bioconductor. I got .cel files from NCBI GEO but I could not get the chip description file. So, how I can obtain a CDF?
One more thing: when I check the number of genes in my dataset, R shows that it contains 54,675 genes. However, this number should be between 20,000 and 25,000. So I am wondering whether some of them are replicates?
Does anyone have suggestions or can help, please?
Thanks,
Hasan
Relevant answer
Answer
Thank you for your answer and time Dr. Joachim. The link that you send helped me a lot and as you said the number 54675 probe set is true as well. Thanks again.
  • asked a question related to Bioconductor
Question
5 answers
Hello.
So I have 2 large fastq files that I need to analyze and compare for differential gene expression in R.
1. How would I go about opening them to see how they look like?
2. What packages can I use to analyze and compare them? I tried bioconductor, but it does not work because these files are too large.
Thanks for your help in advance!
  • asked a question related to Bioconductor
Question
2 answers
Analysis of data
Relevant answer
Answer
Thanks mam. I have received data from DMET analysis. How will I present the data in various representations? What all software and applications will help in the same?
  • asked a question related to Bioconductor
Question
4 answers
While working with gcrma I found that the package ‘hgu95av2cdf’ is not available (for R version 3.4.0). 
So I would like to know a stable version of R for which all packages from Bioconductor are available 
Relevant answer
Answer
The current release of Bioconductor is version 3.5; it works with R version 3.4.0. Users of older R and Bioconductor versions must update their installation to take advantage of new features and to access packages that have been added to Bioconductor since the last release.
regards,
Milan
  • asked a question related to Bioconductor
Question
4 answers
Hi,
I have 33 ligands in total, which were analyzed with SAM, as reported in an article entitled "Analysis of the major patterns of B cell gene expression changes in response to short-term stimulation with 33 single ligands". I selected 10 ligands from these data and want to do additional analysis, but the authors didn't provide the raw data (CEL files), so I downloaded the processed data from ArrayExpress. I reviewed the limma tutorial and want to make sure the downloaded data file is suitable for limma. I need a starting point for analysis with limma; I attached one of the processed data files as an example. Can I use processed data files as input for limma, and which type of analysis can be performed? I look forward to your answers.
Thank you,
Relevant answer
Answer
I have some sample code here for a paper, where the data is downloaded from the GEO database and analysed using limma. You can modify some sections and use the Bioconductor ArrayExpress package instead of GEOquery.
Hope it helps. Good luck.
  • asked a question related to Bioconductor
Question
5 answers
Dear All, 
I am trying to see which CpG sites (with its associated genes) are involved in particular pathways and diseases, and get an overview of the functions of these genes. 
Currently, I have tried to import my dataset (>800k CpG sites total) which shows the following: 1) each CpG site as the ID, 2) p-value, 3) q-value, 4) fold change and 5) difference. My data sets are quite large with >200,000 CpG sites (the row limit of IPA) -  is there a way to import a file this large? 
I have also tried importing a file with more specific CpG sites of around 1000 CpG sites but it is not being mapped properly by IPA as I have 0 mapped sites due to errors or possibly I am using the wrong template (i.e. not expression data)? 
I think the errors are coming from my formatting in my excel file to IPA, where either the headings are incorrect and the way I am assigning each header/observation is incorrect i.e. I think I set my Identifier as Illumina (which is what I used to get my CpG methylation data), but I do not know what other options I can choose instead of this. IPA also showed errors first with 'no IDs matched to particular genes',and then with 'removing fold change between 1 and -1'. 
In summary, I would really appreciate any tips/guidance with uploading CpG methylation data into IPA. 
Thank you very much.
Relevant answer
Answer
Thank you very much Mr. Kamstra and Dr. Muley! 
I am looking at human samples from infant cord blood lymphocytes. 
So far, I have a smaller list of CpG sites that are relevant based on p-values and fold-change (as what Dr.Muley suggested). 
Mr. Kamstra, I have associated CpG sites with genes using GenomeStudio, but would I have to use biomart to convert these gene names to Ensembl IDs or Entrez IDs?
Thank you for your help!
  • asked a question related to Bioconductor
Question
3 answers
I would like to integrate mRNA expression data (microarray) with a PPI network. I tried to use the Bioconductor package "STRINGdb", but when I tried to get the network I got an error saying it only supports 200 nodes.
Is there another method to integrate the data?
Thanks in advance
Relevant answer
Answer
I would like  to build on Farhad's answer. A good place to start would be the Cytoscape app store http://apps.cytoscape.org/apps/with_tag/ppinetwork where you find a list of  apps that deal with PPI analysis. You can see that the StringDB app is also listed here. However, I would recommend you to try StrongestPath and CytoGEDevo apps as well. However if you are more interested in doing a web-based analysis, you can always use PantherDB http://pantherdb.org/. It is simple, easy and intuitive. 
Good luck
  • asked a question related to Bioconductor
Question
3 answers
Hello friends
I am doing microarray data analysis (HGU133plus2). I got the expression matrix file using gcrma, but some probes represent multiple genes, as shown below. How should we treat these? Also, some probes are not matched and show NA; can I delete them? Next I will use this file for WGCNA analysis. Please share your knowledge.
221251_x_at  1  221251_x_at  INO80B /// INO80B-WBP1           NA
65133_i_at   1  65133_i_at   INO80B /// INO80B-WBP1           NA
223072_s_at  1  223072_s_at  INO80B /// INO80B-WBP1 /// WBP1  NA
1559716_at   1  1559716_at   INO80C                           INO80C
229582_at    1  229582_at    INO80C                           INO80C
220165_at    1  220165_at    INO80D                           INO80D
Relevant answer
Answer
Hi Mathavan,
If you are asking about this particular case, it can be explained by the fact that INO80B and WBP1 genes can sometimes occur as a read-through transcript that cannot be distinguished from either of the genes individually by the probes in question (see https://genome.ucsc.edu/cgi-bin/hgc?hgsid=579057461_Tm25CaPexEPPpKrMhRyrGSqaNmCR&c=chr2&l=74456725&r=74462493&o=74455022&t=74460891&g=refGene&i=NR_037849).
You can also look up more information about the U133 array's probes sensitivity and specificity to certain transcripts to understand why this could be for other probes and genes. For this particular example you can look at the information for probe 65133_i_at here https://genecards.weizmann.ac.il/cgi-bin/geneannot/GA_search.pl?keyword_type=probe_set_id&array=HG-U95&target=genecards&keyword=65133_i_at and for the 223072_s_at probe  see https://genecards.weizmann.ac.il/cgi-bin/geneannot/GA_search.pl?keyword_type=probe_set_id&array=HG-U133&target=genecards&keyword=223072_s_at.
  • asked a question related to Bioconductor
Question
5 answers
I am trying to analyze GEO Data with Bioconductor R. I have imported the required packages and downloaded the dataset. But when i import simpleaffy using library function, I am getting the following message "Loading required package: genefilter
Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) :
there is no package called ‘digest’
Error: package ‘genefilter’ could not be loaded"
And also, I am unable to read the GEO data using "read.affy". I am getting "Error: could not find function "read.affy".
How to rectify this error?
Relevant answer
Answer
I am getting "The page you were looking for was not found" error
  • asked a question related to Bioconductor
Question
1 answer
I performed the Cell Cycle Control Phospho Antibody Array (http://www.fullmoonbio.com/product/cell-cycle-control-phospho-antibody-array/) with 7 control and 7 treatment samples. To identify the signal intensities I used GenePix Pro 7 and created .GPR files.
How do I continue with my statistics? I want to normalize the data and calculate z-scores or use SAM. I can normalize the data in Excel, but I am sure there is a more convenient way to proceed. I read about the program Prospector from Invitrogen and the protMAT website, but Prospector does not work with my .GPR files.
I am new to protein array and microarray research and would be very happy for any suggestions.
Thank you so much!
Relevant answer
Answer
Dear Denise,
the basic problem of high-throughput data (common to metabolomics, transcriptomics and protein arrays) is the huge number of variables (protein species in your case) relative to statistical units (the 14 samples), which opens the way to a plethora of chance correlations. Thus the main route IS TO EXCHANGE THE ROLES OF VARIABLES AND STATISTICAL UNITS. Simply operate on the transpose of your original data matrix, i.e. the matrix having the protein species as rows and the samples as columns (variables). On this matrix, run a Principal Component Analysis, which allows a dual representation of the same data set in terms of loadings (correlation coefficients of the variables with the components) and scores (the value of each component for each statistical unit). With 14 variables (7 control + 7 treatment) you will in principle have 14 components but, because of the mutual correlation between the abundances of the different protein species, you will end up with very few (2 or 3) principal components explaining by far the major part of the total variance.
Look at the component loadings and you will surely get a PC1 with all loadings of the same sign (a size component); this is the signature of a common global profile shared by all the samples. Then go to PC2, PC3, PC4. You must see whether there is a component on which the loading values differ significantly between control and treated samples (ideally, a component on which control and treated have loadings of opposite sign), allowing a perfect separation of the two groups in the loading space (shape components). If this is the case, go to the scores of the discriminating component and look for the protein species with the highest scores (in absolute value) on that component: those protein species are the ones that allow the separation of the two groups, and you have solved your problem. If you run the PCA on the correlation matrix you do not need to normalize the protein values; normalization is implicit in the correlation metric.
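A minimal sketch of the PCA approach described above, in base R and on simulated data. The matrix size, the effect size and all variable names are assumptions made for illustration; none of them come from the array in the question.

```r
# Simulated example: 200 protein species (rows) x 14 samples (columns).
set.seed(1)
n_prot <- 200
groups <- rep(c("control", "treated"), each = 7)

# Each protein has its own baseline level shared by all samples, plus noise.
baseline <- rnorm(n_prot, mean = 10, sd = 2)
x <- matrix(baseline + rnorm(n_prot * 14, sd = 0.3), nrow = n_prot, ncol = 14)
colnames(x) <- paste0(groups, rep(1:7, times = 2))

# Hypothetical treatment effect on the first 20 proteins only.
x[1:20, groups == "treated"] <- x[1:20, groups == "treated"] + 1.5

# Samples are the variables; scale. = TRUE makes this a PCA on the
# correlation matrix, so no separate normalization step is applied here.
pca <- prcomp(x, center = TRUE, scale. = TRUE)

loadings <- pca$rotation   # one row per sample, one column per component
scores   <- pca$x          # one row per protein, one column per component

# Share of total variance explained by the first components.
print(round(summary(pca)$importance["Proportion of Variance", 1:3], 3))

# Loadings of the first two components, to inspect by group.
print(data.frame(group = groups, PC1 = loadings[, 1], PC2 = loadings[, 2]))

# Proteins with the largest absolute scores on a chosen component
# (PC2 here; which component discriminates must be checked in real data).
top_proteins <- head(order(abs(scores[, 2]), decreasing = TRUE), 20)
print(top_proteins)
```

Note that prcomp() returns unit-length eigenvectors in `rotation`; they are proportional to, not identical with, the variable-component correlations mentioned in the answer.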
  • asked a question related to Bioconductor
Question
3 answers
I am trying to develop a classification model using RNA-Seq gene expression data. Two independent models were developed successfully using RSEM and RPKM values. However, I was wondering if a transfer learning approach can be used to develop a more general model. I am also wondering if such an approach would be useful for extracting a biologically relevant learning rule.
Relevant answer
Answer
Hi Ali, 
Thank you for your response. It is mRNA-Seq. I have edited the question.
  • asked a question related to Bioconductor
Question
1 answer
We have done a miRNA microarray using the Agilent Human miRNA Microarray Kit Ver. 3.0 (Cat No: AGT-G4470C). I have the .gpr files of my samples but I could not analyze their miRNA profiles in GeneSpring. How can I analyze them in GeneSpring?
Relevant answer
Answer
  • asked a question related to Bioconductor
Question
2 answers
Is it possible to extract the data from the GSE (or GPL) file from getGEO for this analysis as well?
Relevant answer
Answer
Thank you Noha for your prompt response.  It was not quite what I was looking for (as I already know how to use some of testing tools) but I was looking for a package that could take data directly from the GSEmatrix and directly analyze that.  Thank you
  • asked a question related to Bioconductor
Question
6 answers
I am very new to Bioconductor and R, so please keep this in mind when answering the question. For my experiment I first have to run an in silico analysis on several gene expression datasets from GEO and ArrayExpress. I want to check for biologically meaningful differential expression between samples (which are gene expression profiles from 3 different cell types) and then, to visualize the data more explicitly, I want to do gene enrichment and also interpret it as a pathway. How can I do so? Can someone please walk me through the steps and the software or packages I would need to do this in silico analysis?
Relevant answer
Answer
Hi Neda
Since you refer to ArrayExpress I assume you will be analyzing microarrays - and for that the Limma package is the way to go - and it also has built-in methods for doing gene-set and pathway enrichment analysis. A nice overview of the full workflow, including enrichment analysis, can be found in this article http://nar.oxfordjournals.org/lookup/doi/10.1093/nar/gkv007 - and the Limma user guide is very nice and probably contains everything you need ( https://bioconductor.org/packages/release/bioc/vignettes/limma/inst/doc/usersguide.pdf ).
If you want to analyse RNA-seq you should go with DESeq or edgeR as previously suggested.
WARNING: You cannot directly compare different datasets from GEO or ArrayExpress (meaning comparing one cell type from one entry to another cell type in another entry). The problem is that the samples are created in different labs at different times (with different technologies), which by itself may affect the gene expression analysis. This problem is typically referred to as batch effects and means that you cannot conclude whether any differences you find are due to the cell types or due to the batch effect.
  • asked a question related to Bioconductor
Question
3 answers
I want to get Entrez IDs for Affymetrix probe sets (hgu133a) to map them onto genome-scale metabolic models (GSMMs). Generally, GSMMs use Entrez gene IDs; therefore, to integrate gene expression profiles with these reconstructions, ID conversion between the probe sets of the respective Affy platform and Entrez IDs is required. The problem with this conversion in Bioconductor is the presence of multiple mappings between identifiers. A simple example would be:
> select(hgu133a.db, c("200080_s_at"), c("SYMBOL","ENTREZID", "GENENAME"))
'select()' returned 1:many mapping between keys and columns
      PROBEID  SYMBOL ENTREZID                            GENENAME
1 200080_s_at   H3F3A     3020               H3 histone, family 3A
2 200080_s_at   H3F3B     3021       H3 histone, family 3B (H3.3B)
3 200080_s_at H3F3AP4   440926 H3 histone, family 3A, pseudogene 4
The question is, what is the choice here? 3020, 3021 or 440926? 
One should notice that the resulting Entrez IDs will be used for GPR (gene-protein-reaction) purpose; therefore, the expression level of all these three Entrez IDs is the same. 
Thanks in advance for sharing your thoughts.
Relevant answer
Answer
Hi Oveis,
If you plan to average the levels, I would use the geometric mean to reduce the influence of outliers. However, since it seems that you are using the "s" set of Affy probes (I did not see that when I first responded), the probes cross-hybridize by design to the same sequences, so it would be more difficult to determine the biological impact: a "group, probe, then analyze" method using just the R packages would be inaccurate.
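A minimal sketch of the geometric-mean averaging suggested above, in base R. The three intensity values are hypothetical and stand in for several probe sets that map to one Entrez ID.

```r
# Hypothetical expression values (linear scale) for three probe sets
# mapping to the same Entrez ID; the third behaves like an outlier.
probe_values <- c(120, 150, 2400)

arithmetic_mean <- mean(probe_values)
geometric_mean  <- exp(mean(log(probe_values)))  # requires values > 0

print(c(arithmetic = arithmetic_mean, geometric = geometric_mean))

# On log2-transformed data, the geometric mean of the raw values is the
# arithmetic mean of the log2 values, back-transformed.
geometric_from_log2 <- 2^mean(log2(probe_values))
print(all.equal(geometric_mean, geometric_from_log2))
```

If the expression matrix is already on the log2 scale, as is usual after RMA, taking the plain mean of the log2 values is therefore equivalent to a geometric mean on the raw scale.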
  • asked a question related to Bioconductor
Question
2 answers
I understand the practical details, but is there anyone using Bioconductor programs to do the full gene sequencing pathway from alignment to variant calling?
Relevant answer
Answer
samtools mpileup pipeline or GATK might be useful. But don't sue them if you get any issue ;)
  • asked a question related to Bioconductor
Question
11 answers
Hello,
I'm studying the detection of differentially expressed genes (DEGs) using disease vs. healthy sample microarray data. I use limma in Bioconductor to analyze the DEGs. I noticed that some of the DEGs are both up- and down-regulated. For example, in one dataset the ARAP2 gene was upregulated in 2 probe sets but downregulated in 3 probe sets. How can this situation occur at the transcriptome level? Is this gene really up- or down-regulated? How can we explain genes that are both up- and down-regulated in the same dataset?
Relevant answer
Answer
Thanks for your answers. All of them are valuable information for me. 
  • asked a question related to Bioconductor
Question
4 answers
> edesign
        Time Replicate Control hypoxic
Array1     3         1       1       2
Array2     2         1       2       1
Array3     2         1       1       2
Array4     1         2       2       1
Array5     1         2       1       2
Array6     3         2       2       1
Array7     3         2       1       2
Array8     2         2       2       1
Array9     2         2       1       2
Array10    1         3       2       1
Array11    1         3       1       2
Array12    3         3       2       1
Array13    3         3       1       2
Array14    2         3       2       1
Array15    2         3       1       2
Array16    1         1       2       1
Array17    1         1       1       2
Array18    3         1       2       1
fit <- p.vector(eset, design, Q = 0.05, MT.adjust = "BH", min.obs = 20)
Error in dat[, as.character(rownames(dis))] : subscript out of bounds
Relevant answer
Answer
It's hard to say. A first guess would be that you should write edesign instead of design in the call to the function p.vector. I'm not sure what the argument min.obs is for, but maybe you have to reduce the value to a number <= 18, since you have 18 arrays in your dataset.
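One way to narrow this down, sketched in base R: the error text shows columns of the data being selected by the row names of the design, so a mismatch between those names would be enough to trigger it. This reading of the message is an assumption, and the objects below are hypothetical stand-ins, not the data from the question.

```r
# Hypothetical stand-ins for the expression matrix and the design table.
expr <- matrix(rnorm(5 * 18), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("Array", 1:18)))
edesign <- matrix(1, nrow = 18, ncol = 4,
                  dimnames = list(paste0("array", 1:18),  # lower-case "a" on purpose
                                  c("Time", "Replicate", "Control", "hypoxic")))

# Array names used in the design that are absent from the expression matrix.
missing_arrays <- setdiff(rownames(edesign), colnames(expr))
print(missing_arrays)

# Selecting matrix columns by a name that does not exist fails with the
# same kind of message as in the question.
result <- tryCatch(expr[, rownames(edesign)],
                   error = function(e) conditionMessage(e))
print(result)
```

If `missing_arrays` is empty for the real objects, the names are not the cause and the arguments passed to p.vector are the next thing to check.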
hth
Matthias
  • asked a question related to Bioconductor
Question
8 answers
I have a data frame in which the first column contains the gene symbol and the other columns contain expression values. The symbol column can contain the same symbol more than once.
For each set of rows with the same symbol, I would like to calculate the average (or median) of the rows, so that I end up with a single row per gene.
Thank you in advance
Relevant answer
Answer
I would use data.table if you have a large matrix and need speed. http://cran.r-project.org/web/packages/data.table/vignettes/datatable-intro.pdf .
library(data.table)
# assumes long format: one row per measurement, with columns 'genename' and 'expr'
exprmat_dt <- data.table(exprmat)
setkey(exprmat_dt, genename) # sorts the data.table by gene name
exprmat_dt[, list(mean = mean(expr), sd = sd(expr)), by = genename]
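For the wide layout described in the question (one symbol column followed by several expression columns), a base R alternative can be sketched with aggregate(). The data frame and its column names below are hypothetical.

```r
# Hypothetical wide table: first column is the gene symbol, the remaining
# columns hold expression values per sample.
df <- data.frame(
  symbol  = c("TP53", "TP53", "BRCA1", "EGFR", "EGFR", "EGFR"),
  sample1 = c(2, 4, 5, 1, 2, 9),
  sample2 = c(6, 8, 7, 3, 3, 3)
)

# One row per symbol, summarizing every expression column.
gene_means   <- aggregate(. ~ symbol, data = df, FUN = mean)
gene_medians <- aggregate(. ~ symbol, data = df, FUN = median)

print(gene_means)
print(gene_medians)
```

The formula interface drops rows that contain missing values before aggregating, so data with NAs may need explicit handling first.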
  • asked a question related to Bioconductor
Question
14 answers
Dear All,
I would like to analyze several datasets from GEO. This would be some cancer data involving genes as well as miRNA. However, the approaches for analyzing differential expression are very different and sometimes unclear. I decided to try the method used in a paper. The authors wrote that they calculated the absolute log2FC using limma in Bioconductor, although the FC calculated by limma is not absolute (or is it?). In addition, I have not found any information on how to do this, so is there something wrong, or am I doing something wrong? I appreciate any suggestions.
The second thing is that in my work I would like to evaluate the differential expression of genes from a few platforms (I mean, integrate the mRNA and miRNA data from e.g. Affymetrix and Agilent arrays). What would be the best method for array normalization?
Thanks in advance!
Relevant answer
Answer
I assume that an absolute fold change is one that uses ratios rather than negative numbers. So an absolute fold change of 0.5 corresponds to a (conventional) fold change of -2; you take the negative reciprocal to convert from one to the other. However, limma works with log2 values, which are negative when the ratio is less than one. This is the usual way that limma does things. I'm afraid that you will have to try to figure things out from all this, but I am suspicious of the claim that they used limma for absolute fold changes.
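The conversions described above can be sketched in base R as follows. The log2 fold changes are hypothetical values of the kind found in limma's logFC column, and reading "absolute log2FC" as the magnitude of the log2 fold change is an assumption.

```r
# Hypothetical log2 fold changes.
log2fc <- c(-2, -1, 0, 1, 2)

# Ratio scale: values below 1 mean lower expression in the first group.
ratio <- 2^log2fc

# Signed "conventional" fold change: ratios below 1 become the negative
# reciprocal, so 0.5 is reported as -2.
signed_fc <- ifelse(ratio >= 1, ratio, -1 / ratio)

# Magnitude of the change regardless of direction.
abs_log2fc <- abs(log2fc)

print(data.frame(log2fc, ratio, signed_fc, abs_log2fc))
```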
My advice for anyone trying to normalize data from different platforms is: DON'T! Although you can remove a lot of batch effects with ComBat, you are swamping the system with a huge amount of technical variation: different samples, done in different places, on different days, on different technologies, which is the worst type of technical variation of them all. The purpose of normalization is to remove technical variation and leave only biological variation, but you are throwing every conceivable piece of technical variation together. I personally would have no confidence in any result you derive from this, and I suspect your combined gene list will be a piece of junk: a list of genes that could well reflect the use of different technologies, different days of the week, etc.
If you want to do comparisons between different experiments, analyse each of them individually and compare the resultant gene lists that you derive. What you lose in n-numbers you will more than compensate for with reduced technical variation. That's my advice anyway.
  • asked a question related to Bioconductor
Question
2 answers
I have control samples (6 biological replicates, each of which in turn has a technical replicate), and I need to include all of them (12 samples) in the contrast analysis.
Targets file:
Sample      Block   Treatment
Control11   1       Control
Control12   2       Control
Control13   3       Control
Control14   4       Control
Control15   5       Control
Control16   6       Control
Control21   1       Control
Control22   2       Control
Control23   3       Control
Control24   4       Control
Control25   5       Control
Control26   6       Control
Treat1      1       treatment
Treat2      2       treatment
Treat3      3       treatment
Treat4      4       treatment
Treat5      5       treatment
Treat6      6       treatment
Relevant answer
Answer
Have a look at the User's Guide for limma. It contains a description on how to use the function duplicateCorrelation() to address technical replicates.
  • asked a question related to Bioconductor
Question
7 answers
In limma, for gene expression data normalization, an offset is used with background correction and quantile normalization is used for between-array normalization. How does this work, and what does setting different offset values such as 16 or 50 mean? It will be easier for me to understand in words rather than in equations.
Relevant answer
Thanks for your nice reply; you have prompted me to dig deeper into this topic. As I told you earlier, I am totally satisfied with your answer, which was more scientific than mine. I may have expressed mine in imprecise wording, which may confuse the reader, but I meant the same answer you gave. This is a nice lesson for me not to answer a question without writing it properly (at least in scientific terms, as you did). Nice to know you, and thanks to you, Raju, for bringing up this question.
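On the original question, in words: as far as I understand limma's backgroundCorrect(), the offset is a constant added to every background-corrected intensity before logs are taken. Near the noise floor this pulls log-ratios towards zero and damps their variability, while bright spots are almost unaffected; a larger offset (50 rather than 16) damps more. The toy numbers below are hypothetical and use base R only, not limma itself.

```r
# Hypothetical background-corrected intensities for two spots, both with
# a true ratio of 2: one near the noise floor, one bright.
red   <- c(low = 20, high = 20000)
green <- c(low = 10, high = 10000)

# The same small absolute measurement error on both spots.
red_noisy   <- red + 8
green_noisy <- green - 4

# Log-ratio after adding a constant offset to both channels.
log_ratio <- function(r, g, offset) log2((r + offset) / (g + offset))

for (offset in c(0, 16, 50)) {
  print(round(c(offset = offset, log_ratio(red_noisy, green_noisy, offset)), 3))
}
```

The low-intensity log-ratio shrinks as the offset grows, whereas the high-intensity one barely moves; that is the trade-off being tuned when choosing between values such as 16 and 50.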
  • asked a question related to Bioconductor
Question
1 answer
Could anyone please tell me about the right choice of Bioconductor annotation package for Affymetrix porcine gene 1.0 ST array data set?
Relevant answer
Answer
There are several porcine annotation packages available:
Only the last three (porcine.db, porcinecdf, porcineprobe) are meant for Affymetrix arrays, although I am not sure whether they are compatible with yours. You may ask the Bioconductor annotation package maintainers at http://support.bioconductor.org
  • asked a question related to Bioconductor
Question
3 answers
Hi,
I have calculated RPKM values from the RNA-Seq (NGS) raw data. I also know that a lot of people analyze them with R and Bioconductor GES packages, but I do not know the detailed processing steps or the R code. Could someone give a detailed R program for processing the RPKM values? Thank you very much!
Relevant answer
Answer
you have to consider this online course 
week 4 & 5
  • asked a question related to Bioconductor
Question
3 answers
Hi everyone,
As an experimentalist trying to get introduced to some basic bioinformatics, I've recently started to use R Bioconductor, but I'm still getting quite lost.
I would like to use these new skills to convert a GenBank file (.gbk) downloaded from NCBI to a multi-FASTA file containing the nucleotide sequences of all genes (.ffn), so it would be great if anyone could give me the exact script I have to use to:
1) Import the .gb file
2) transform .gb to .ffn
3) Export .ffn to tab file (so I can open it using textedit)
I know this is very basic, but thank you in advance.
Cristina
Relevant answer
Answer
  • asked a question related to Bioconductor
Question
5 answers
Hello,
I was wondering if the following approach is correct:
- I have a predefined list of the Ensembl gene IDs (n=28) and I want to perform Gene Ontology using topGO in R.
- I don't need to use expression values, but I do need to set a universe of genes. For that I chose all gene IDs available in Ensembl (n=64769)
If needed, the code can be provided.
Thanks!
Relevant answer
Answer
Your gene universe should really be limited to the original set from which your significant genes were taken.  For example, if your 28 significant genes were derived from an analysis of whole human genome microarrays, then you would limit your gene universe to just the ensembl gene IDs for the human genome (which is something like 20,300 or thereabouts).
Gene ontology hypergeometric enrichment is derived from basic set theory.  It is just the probability of k successes in n draws without replacement from a finite population of size N containing exactly K successes.  If you arbitrarily over-define one of the sets for comparison (K or N), you bias your enrichment results.
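The effect of an inflated universe can be sketched with base R's phyper(). The overlap of 5 genes and the term size of 200 are hypothetical numbers, and the sketch assumes the extra IDs in the larger universe carry no annotation for the term.

```r
# Hypothetical numbers: 28 genes of interest, 5 of which carry a GO term
# that annotates 200 genes in the universe.
hits      <- 5     # annotated genes among the genes of interest
draws     <- 28    # genes of interest
annotated <- 200   # genes annotated to the term in the universe

# Upper-tail probability of seeing at least `hits` annotated genes when
# drawing `draws` genes without replacement from a universe of given size.
enrichment_p <- function(universe) {
  phyper(hits - 1, m = annotated, n = universe - annotated, k = draws,
         lower.tail = FALSE)
}

p_matched  <- enrichment_p(20300)   # universe limited to the assayed genes
p_inflated <- enrichment_p(64769)   # all Ensembl gene IDs

print(c(matched = p_matched, inflated = p_inflated))
```

With the inflated universe the same overlap looks more surprising, so the p-value comes out smaller than it should, which is the bias described above.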
P.S. as an example of how to code TopGO to restrict your enrichment, here is an example from the Baliga lab at the Inst. for Systems Biology - http://baliga.systemsbiology.net/events/sysbio/sites/baliga.systemsbiology.net.events.sysbio/files/uploads/topGO_FunctionalEnrichment.r
The example looks for enrichment of a very small subset of human genes associated with glioblastoma (a type of brain tumor).
  • asked a question related to Bioconductor
Question
3 answers
I have an experiment with multiple time points and I have a table of enriched GO terms for each time. After wondering and discussing with colleagues, we couldn't reach an agreement on how to best represent such information.
Relevant answer
Answer
If you want to visualize the change of GO enrichments through time you could consider Hive plots (using axes as time points and enrichment values on the axes).
Alternatively you can use D3 to achieve something like this: http://stackoverflow.com/questions/18569581/filtering-data-for-d3-js-sankey-diagrams
  • asked a question related to Bioconductor
Question
8 answers
I'm just beginning to wade into the world that is R. I'm currently having some very basic problems. The problem is so basic that I cannot find any examples out there. I don't have any problem getting my data into R, but what I want to know is how to best group the data prior to importing.
Please refer to the spreadsheet for details. Should I use option_1 (sheet 1) or option_2 (sheet 2) as the format for my data? Does it matter? Will this affect what I can do for my analysis?
What I ultimately want to do is compare the data (ANOVA) from the SC animals to the SD animals. There are some data points missing for some of the data as you can see from the file. I want to be able to compare the data using either the protein column or the peptide column. Do I need the unique_ID column?
The actual data has more samples and more data points.
Any and all suggestions are greatly appreciated.
Relevant answer
Answer
Hi Peter
"Thanks for the responses. I was hoping for a shortcut to get my analysis moving, but clearly I'll need to go do some more R homework."
There are some great free texts on the R-website that guide you through everything slowly (really great as reference material as well). Additionally Andy Field's book is amazing and will guide you through any basic analyses you need. Furthermore, try these resources for quick hints / help as in R you can get stuck on minor coding issues which they might resolve:
http://rprogramming.net/ - General help and topics
http://www.statmethods.net/ - great online guide and quick help options
Additionally, I would recommend downloading RStudio which will drastically increase its ease of use and power.
  • asked a question related to Bioconductor
Question
4 answers
I'm reprocessing a previously processed microarray gene expression dataset from NCBI GEO using Bioconductor packages. It is from an Illumina microarray chip. The data provided at GEO are non-normalized but contain a lot of negative values, probably caused by previous background subtraction. Is there any way to convert/transform those negatives and use them in further analyses? Or should I exclude them?
Relevant answer
Answer
Thank you for your replies. I have studied a few papers about this issue, starting from Michael Black's suggestion. It turns out that the cause of the large number of negative values is probably the BeadStudio preprocessing (background subtraction) algorithm. Some publications report that this procedure discards many significant intensity values by making them negative. This preprocessing software was used on these data, and after that the data were uploaded to GEO. It seems that the only way to work with these data is to obtain the raw, non-preprocessed intensity values. Am I right?
  • asked a question related to Bioconductor
Question
1 answer
I am trying to use the oligo package from Bioconductor to analyze my latest Affymetrix data, a MoGene 2.0 array. I am encountering problems with the build of the pd.mogene.2.0.st.v1 package using pdInfoBuilder. I would like to know if anyone has a working script.
Relevant answer
Answer
You can find ready to use pd.mogene.2.0.st.v1 file at http://nmg-r.bioinformatics.nl/NuGO_R.html
  • asked a question related to Bioconductor
Question
1 answer
I've used the "AgiMicroRna" Bioconductor package in R to analyze my miRNA microarray data. Up to the data analysis everything was just fine: arriving at differential expression was butter smooth using Pedro Lopez's guide to the AgiMicroRna package. The hurdle lies further on, in gene annotation, pathway enrichment, GO, interactome (KEGG), etc. I'd really appreciate input from one and all in this regard. Has anybody done this before? Could you share your strings with me?
Relevant answer
Answer
Generally used: TargetScan for miRNA gene annotations, and then DAVID for functional clustering and annotations.
Good luck
  • asked a question related to Bioconductor
Question
13 answers
CRAN and BioConductor are full of very exciting applications. Therefore it can be hard to get visibility amongst the mass. Furthermore it is not always sufficient to refer to a package available on R/BioConductor when this package has not been peer-reviewed.
Publishing software related to an established method can get tricky if you aim for visibility. High impact factor journals are unlikely to accept and those who do are not necessarily visible enough.
So what are the "best" journals to publish such methods?
Relevant answer
Answer
I would assume that publishing on CRAN or BioConductor does not prevent you from publishing a journal article to help with visibility.
The Journal of Statistical Software (http://www.jstatsoft.org/) could be what you are looking for. The focus of the journal is not so much the method (which can be an established method) but the implementation. From this point of view you can release your package on CRAN and try to get more visibility through a paper in the journal.
  • asked a question related to Bioconductor
Question
2 answers
Using the geneplotter R package, there is a function named plotMA (http://www.bioconductor.org/packages/2.13/bioc/manuals/geneplotter/man/geneplotter.pdf). To get the plot, your object (data.frame) needs at least three columns: the first containing the mean expression values (for the x-axis), the second the logarithmic fold change (for the y-axis) and the third a logical vector indicating significance (for the coloring of the dots). I have attached my file. I uploaded the file via Rcmdr with dat1 as the name of my dataset and then entered the following command:
plotMA(dat1, ylim = NULL, colNonSig = "gray32", colSig = "red3", colLine = "#ff000080", log = "x", cex=0.45, xlab="mean expression", ylab="log fold change")
However, it gave me the following error:
Error in .local(object, ...) :
When called with a data.frame, plotMA expects the data frame to have 3 columns, two numeric ones for mean and log fold change, and a logical one for significance.
I tried many things, without any luck. Any suggestions?
Relevant answer
Answer
Thank you Mr. Thomas Mohr. It worked !
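The answer that resolved this is not shown here. One common cause of that error message, offered only as an assumption, is that a column read from a file has the wrong type (for example, the significance column arriving as text). A base R sketch of checking and coercing the three columns, with a hypothetical stand-in for dat1:

```r
# Hypothetical stand-in for dat1: three columns, but the significance
# column was read from a file as text rather than as a logical.
dat1 <- data.frame(
  baseMean = c(10.5, 200.1, 35.7),
  log2FC   = c(0.2, -1.8, 2.4),
  sig      = c("FALSE", "TRUE", "TRUE"),
  stringsAsFactors = FALSE
)
print(sapply(dat1, class))

# Coerce to the layout named in the error message: numeric mean,
# numeric log fold change, logical significance, in that order.
dat1$baseMean <- as.numeric(dat1$baseMean)
dat1$log2FC   <- as.numeric(dat1$log2FC)
dat1$sig      <- as.logical(dat1$sig)
dat1 <- dat1[, c("baseMean", "log2FC", "sig")]

print(sapply(dat1, class))
```

If the file holds adjusted p-values instead of a TRUE/FALSE column, the logical column can be derived with a threshold of your choosing before calling plotMA.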
  • asked a question related to Bioconductor
Question
2 answers
Can you direct me to an open source program or good tutorials in R or matlab that can do/infer copy number variations from microarray data?
Relevant answer
Answer
Yes, basically correlate over/under expression of a given gene(s) to high/low CNV to specific chromosome band regions...
  • asked a question related to Bioconductor
Question
1 answer
.
Relevant answer
Answer
Once you obtain your DDCt values for each sample, simply use the standard deviation function of Excel (separately for each treatment/genotype group). To get the SEM (standard error of the mean), divide the obtained standard deviation by the square root of the number of samples in the particular group. Remember that if you normalise, the SEM has to be divided by the same value as your average was. This is how I was taught to do it; it seems correct, but if it's not I'd be happy to hear how to do it properly.
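The same calculation can be sketched in base R instead of Excel. The group names and values below are hypothetical.

```r
# Hypothetical per-sample values for two groups of four samples each.
ddct <- data.frame(
  group = rep(c("wildtype", "mutant"), each = 4),
  value = c(2.0, 2.4, 1.6, 2.0, 4.2, 5.0, 3.8, 4.6)
)

group_mean <- tapply(ddct$value, ddct$group, mean)
group_sd   <- tapply(ddct$value, ddct$group, sd)
group_n    <- tapply(ddct$value, ddct$group, length)

# Standard error of the mean: SD divided by the square root of n.
group_sem <- group_sd / sqrt(group_n)

# Normalising to the wildtype mean: means and SEMs are divided by the
# same reference value.
ref       <- group_mean[["wildtype"]]
norm_mean <- group_mean / ref
norm_sem  <- group_sem / ref

print(data.frame(mean = norm_mean, sem = norm_sem))
```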
  • asked a question related to Bioconductor
Question
15 answers
In R, what is your favourite approach to cluster genes by their expression profiles? There is a myriad approaches and tools all over: standard clustering, specialized tools, tools like Aracne. I often use a linear model to remove the group-wise effects and then apply a clustering using abs(cor) as the distance metric.
Relevant answer
Answer
I use GEDI and self-organizing maps because they are well suited to time courses and are illustrative. The maps make it easy to calculate an entropy that describes the organization of gene ensembles. I also like MeV.
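For comparison, the approach mentioned in the question (remove group-wise effects with a linear model, then cluster with abs(cor) as the similarity) can be sketched in base R. The data are simulated, and the linkage method and number of clusters are arbitrary choices made for illustration.

```r
# Simulated expression matrix: 30 genes (rows) x 12 samples (columns).
set.seed(42)
n_genes   <- 30
n_samples <- 12
group <- factor(rep(c("A", "B"), each = 6))
expr <- matrix(rnorm(n_genes * n_samples), nrow = n_genes,
               dimnames = list(paste0("gene", 1:n_genes),
                               paste0("s", 1:n_samples)))

# Remove group-wise effects: fit expression ~ group for all genes at once
# and keep the residuals (transposed back to genes x samples).
fit <- lm(t(expr) ~ group)
res <- t(residuals(fit))

# Distance 1 - |r|: strongly correlated and strongly anti-correlated
# genes both end up close to each other.
d  <- as.dist(1 - abs(cor(t(res))))
hc <- hclust(d, method = "average")

clusters <- cutree(hc, k = 4)
print(table(clusters))
```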
  • asked a question related to Bioconductor
Question
42 answers
To my knowledge there are at least 11 different methods available (http://www.biomedcentral.com/1471-2105/14/91/). What tests do you prefer and for which kind of data/conditions?
Relevant answer
Answer
Just a clarification:
Cufflinks can be used to build transcript models against which counts are obtained with HTSeq-count. Our pipeline is:
fastq + TopHat + reference genome > .bam files
.bam files + Cufflinks > .gtf files (one per library)
.gtf files + reference transcriptome + cuffmerge > merged.gtf file (merged annotation)
.bam files + merged.gtf + HTSeq-count > annotated count matrix to input into edgeR or DESeq.
As was said, the FPKM values from Cufflinks can't be used in edgeR and DESeq because these packages analyze count data. In theory the FPKM values could be used with limma, after proper normalization: if needed, I would treat those values as expression levels from a single-color microarray and apply some normalization. But I truly think that the nature of RNA-seq calls for count-based models.
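To make the count versus FPKM distinction concrete, here is a base R sketch of how RPKM-style values are derived from a count matrix. The counts and gene lengths are hypothetical toy numbers (real libraries contain millions of reads).

```r
# Hypothetical read counts: 3 genes (rows) x 2 libraries (columns).
counts <- matrix(c(100, 200, 50,
                   400, 300, 600),
                 nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 c("lib1", "lib2")))
gene_length <- c(geneA = 1000, geneB = 2000, geneC = 500)  # in bases
lib_size    <- colSums(counts)                             # reads per library

# Reads per kilobase of gene, then per million reads in the library.
per_kb <- counts / (gene_length / 1e3)         # divides each row by its length
rpkm   <- t(t(per_kb) / (lib_size / 1e6))      # divides each column by its depth

print(round(rpkm))
```

Once counts have been divided by length and depth, the original count magnitudes cannot be recovered, which is why count-based packages expect the raw matrix instead.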
  • asked a question related to Bioconductor
Question
11 answers
I am unable to install biocLite.R in my system. I am using R 2.15.7 on Windows 7. I used the following command:
but its showing this error message:
Error in file(filename, "r", encoding = encoding) :
cannot open the connection
In addition: Warning message:
In file(filename, "r", encoding = encoding) :
unable to resolve 'bioconductor.org'
I have already set proxy for R using http_proxy="address:port"; and http_proxy_user="username:password". I don't know what is the problem.
Relevant answer
Answer
If it still fails, check out the answers in this:
  • asked a question related to Bioconductor
Question
1 answer
I am now starting to use Bioconductor; can you suggest a good user guide?
Regards,
Marco
Relevant answer
Answer
Bioinformatics and Computational Biology Solutions Using R and Bioconductor by Gentleman et al is a classic. It may be a bit long and outdated. I personally liked Bioconductor Case Studies by Hahne et al.
  • asked a question related to Bioconductor
Question
2 answers
Hi All
I am completely new to R and I am currently working on a project using R. I would like to know, how do we normalize using R for gene expression data.
Relevant answer
Answer
I could always take the help of really good books! :) Thanks a lot Laurin :)
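As one concrete illustration for the question above, quantile normalization of a log-transformed expression matrix can be sketched in base R. The function name and the simulated data are assumptions for illustration; the sketch also assumes there are no missing values, and packaged implementations (for example limma's normalizeBetweenArrays) are the usual choice in practice.

```r
# Force every column (sample) to share the same distribution of values.
# Assumes a numeric matrix without missing values.
quantile_normalize <- function(x) {
  ranks  <- apply(x, 2, rank, ties.method = "first")  # rank within each sample
  sorted <- apply(x, 2, sort)                          # sort each sample
  target <- rowMeans(sorted)                           # mean of each quantile
  normalized <- apply(ranks, 2, function(r) target[r])
  dimnames(normalized) <- dimnames(x)
  normalized
}

# Simulated intensities: 50 genes x 4 samples with different overall scales.
set.seed(7)
raw <- sapply(c(0.01, 0.02, 0.005, 0.01), function(rate) rexp(50, rate))
dimnames(raw) <- list(paste0("gene", 1:50), paste0("sample", 1:4))

log_expr  <- log2(raw + 1)
norm_expr <- quantile_normalize(log_expr)

# Column medians differ before and coincide after normalization.
print(round(apply(log_expr, 2, median), 2))
print(round(apply(norm_expr, 2, median), 2))
```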