[show abstract][hide abstract] ABSTRACT: We have used large surveys of Affymetrix GeneChip data in the public domain to conduct a study of antisense expression across diverse conditions. We derive correlations between groups of probes which map uniquely to the same exon in the antisense direction. When there are no probes assigned to an exon in the sense direction we find that many of the antisense groups fail to detect a coherent block of transcription. We find that only a minority of these groups contain coherent blocks of antisense expression suggesting transcription. We also derive correlations between groups of probes which map uniquely to the same exon in both sense and antisense direction. In some of these cases the locations of sense probes overlap with the antisense probes, and the sense and antisense probe intensities are correlated with each other. This configuration suggests the existence of a Natural Antisense Transcript (NAT) pair. We find the majority of such NAT pairs detected by GeneChips are formed by a transcript of an established gene and either an EST or an mRNA. In order to determine the exact antisense regulatory mechanism indicated by the correlation of sense probes with antisense probes, a further investigation is necessary for every particular case of interest. However, the analysis of microarray data has proved to be a good method to reconfirm known NATs, discover new ones, as well as to notice possible problems in the annotation of antisense transcripts.
Journal of integrative bioinformatics 01/2010; 7(2).
[show abstract][hide abstract] ABSTRACT: A chimeric transcript is a single RNA sequence which results from the transcription of two adjacent genes. Recent studies estimate that at least 4% of tandem human gene pairs may form chimeric transcripts. Affymetrix GeneChip data are used to study the expression patterns of tens of thousands of genes and the probe sequences used in these microarrays can potentially map to exotic RNA sequences such as chimeras.
We have studied human chimeras and investigated their expression patterns using large surveys of Affymetrix microarray data obtained from the Gene Expression Omnibus. We show that for six probe sets, a unique probe mapping to a transcript produced by one of the adjacent genes can be used to identify the expression patterns of readthrough transcripts. Furthermore, unique probes mapping to an intergenic exon present only in the MASK-BP3 chimera can be used directly to study the expression levels of this transcript.
We have attempted to implement a new method for identifying tandem chimerism. In this analysis unambiguous probes are needed to measure run-off transcription and probes that map to intergenic exons are particularly valuable for identifying the expression of chimeras.
Journal of integrative bioinformatics 01/2010; 7(3).
[show abstract][hide abstract] ABSTRACT: We describe various types of outliers seen in Affymetrix GeneChip data. We have been able to utilise the data in the Gene Expression Omnibus to screen GeneChips across a range of scales, from single probes, to spatially adjacent fractions of arrays, to whole arrays, to whole experiments. In this review we describe a number of causes for why some reported intensities might be misleading on GeneChips.
Briefings in Functional Genomics and Proteomics 06/2009; 8(3):199-212.
[show abstract][hide abstract] ABSTRACT: Affymetrix GeneChip technology enables the parallel observations of tens of thousands of genes. It is important that the probe set annotations are reliable so that biological inferences can be made about genes which undergo differential expression. Probe sets representing the same gene might be expected to show similar fold changes/z-scores, however this is in fact not the case.
We have made a case study of the mouse Surf4, chosen because it is a gene that was reported to be represented by the same eight probe sets on the MOE430A array by both Affymetrix and Bioconductor in early 2004. Only five of the probe sets actually detect Surf4 transcripts. Two of the probe sets detect splice variants of Surf2. We have also studied the expression changes of the eight probe sets in a public-domain microarray experiment. The transcripts for Surf4 are correlated in time, and similarly the transcripts for Surf2 are also correlated in time. However, the transcripts for Surf4 and Surf2 are not correlated. This proof of principle shows that observations of expression can be used to confirm, or otherwise, annotation discrepancies. We have also investigated groups of probe sets on the RAE230A array that are assigned to the same LocusID, but which show large variances in differential expression in any one of three different experiments on rat. The probe set groups with high variances are found to represent cases of alternative splicing, use of alternative poly(A) signals, or incorrect annotations.
Our results indicate that some probe sets should not be considered as unique measures of transcription, because the individual probes map to more than one transcript dependent upon the biological condition. Our results highlight the need for care when assessing whether groups of probe sets all measure the same transcript.