GATExplorer: Genomic and Transcriptomic Explorer; mapping expression probes to gene loci, transcripts, exons and ncRNAs

Bioinformatics and Functional Genomics Research Group, Cancer Research Center (CiC-IBMCC, CSIC/USAL), Salamanca, Spain.
BMC Bioinformatics (Impact Factor: 2.67). 04/2010; 11:221. DOI: 10.1186/1471-2105-11-221
Source: PubMed

ABSTRACT Genome-wide expression studies have developed exponentially in recent years as a result of extensive use of microarray technology. However, expression signals are typically calculated using the assignment of "probesets" to genes, without addressing the problem of "gene" definition or proper consideration of the location of the measuring probes in the context of the currently known genomes and transcriptomes. Moreover, as our knowledge of metazoan genomes improves, the number of both protein-coding and noncoding genes, as well as their associated isoforms, continues to increase. Consequently, there is a need for new databases that combine genomic and transcriptomic information and provide updated mapping of expression probes to current genomic annotations.
GATExplorer (Genomic and Transcriptomic Explorer) is a database and web platform that integrates a gene loci browser with nucleotide level mappings of oligo probes from expression microarrays. It allows interactive exploration of gene loci, transcripts and exons of human, mouse and rat genomes, and shows the specific location of all mappable Affymetrix microarray probes and their respective expression levels in a broad set of biological samples. The web site allows visualization of probes in their genomic context together with any associated protein-coding or noncoding transcripts. In the case of all-exon arrays, this provides a means by which the expression of the individual exons within a gene can be compared, thereby facilitating the identification and analysis of alternatively spliced exons. The application integrates data from four major source databases: Ensembl, RNAdb, Affymetrix and GeneAtlas; and it provides the users with a series of files and packages (R CDFs) to analyze particular query expression datasets. The maps cover both the widely used Affymetrix GeneChip microarrays based on 3' expression (e.g. human HG U133 series) and the all-exon expression microarrays (Gene 1.0 and Exon 1.0).
GATExplorer is an integrated database that combines genomic/transcriptomic visualization with nucleotide-level probe mapping. By considering expression at the nucleotide level rather than the gene level, it shows that the arrays detect expression signals from entities that most researchers do not contemplate or discriminate. This approach provides the means to undertake a higher resolution analysis of microarray data and potentially extract considerably more detailed and biologically accurate information from existing and future microarray experiments.
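
As an illustration of what nucleotide-level probe mapping involves, the following minimal Python sketch assigns probes to annotated features by coordinate containment. It is not GATExplorer's actual implementation: the `Interval` class, the `map_probes_to_exons` function, the toy coordinates, and the rule that a probe must lie entirely within a feature on the same strand are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Interval:
    """A genomic feature on one strand (1-based, inclusive coordinates)."""
    name: str
    chrom: str
    strand: str
    start: int
    end: int


def map_probes_to_exons(probes: List[Interval],
                        exons: List[Interval]) -> Dict[str, List[str]]:
    """Return {probe name: [feature names]} for features that fully contain the probe.

    Assumption: a probe counts as mapped only if it lies entirely inside a
    feature on the same chromosome and strand.
    """
    mapping: Dict[str, List[str]] = {}
    for probe in probes:
        mapping[probe.name] = [
            exon.name
            for exon in exons
            if exon.chrom == probe.chrom
            and exon.strand == probe.strand
            and exon.start <= probe.start
            and probe.end <= exon.end
        ]
    return mapping


# Hypothetical toy annotation: two exons of a coding gene and one ncRNA.
exons = [
    Interval("exon_1", "chr1", "+", 100, 200),
    Interval("exon_2", "chr1", "+", 300, 400),
    Interval("ncRNA_1", "chr1", "-", 150, 260),
]

# Hypothetical 25-mer probes.
probes = [
    Interval("probe_a", "chr1", "+", 120, 144),  # inside exon_1
    Interval("probe_b", "chr1", "+", 190, 214),  # runs past the end of exon_1
    Interval("probe_c", "chr1", "-", 200, 224),  # opposite strand, inside ncRNA_1
]

probe_map = map_probes_to_exons(probes, exons)
for probe_name, hits in probe_map.items():
    print(probe_name, hits)
```

In this toy case one probe maps to an exon, one maps to nothing because it crosses an exon boundary, and one maps to a noncoding transcript on the opposite strand. That last case is the kind of signal that gene-level probeset assignment would not distinguish.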

  • Source
    ABSTRACT: Background: Accurate analysis of whole-gene expression and individual-exon expression is essential to characterize different transcript isoforms and identify alternative splicing events in human genes. Exon-specific expression microarray platforms are among the omic technologies most widely used in studies on human samples. Results: Since there are not many validated comparative analyses to identify specific splicing events using data derived from these types of platforms, we have developed an algorithm (called ESLiM) to detect significant changes in exon use, and applied it to a reference dataset of 270 human genes that show alternative expression in different tissues. We compared the results with three other methodological approaches and provided the R source code to be applied elsewhere. The genes positively detected by these analyses also provide a verified subset of human genes that present tissue-regulated isoforms. Furthermore, we performed a validation analysis on human patient samples comparing two different subtypes of acute myeloid leukemia (AML) and we experimentally validated the splicing in several selected genes that showed exons with highly significant signal change. Conclusions: The comparative analyses with other methods using a fair set of human genes that show alternative splicing and the validation on clinical samples demonstrate that the proposed novel algorithm is a reliable tool for detecting differential splicing in exon-level expression data.
    BMC Genomics 10/2014; 15(1):879. DOI:10.1186/1471-2164-15-879 · 4.04 Impact Factor
  • Source
    ABSTRACT: Recent advances in high-throughput experimental techniques have led to an exponential increase in both the size and the complexity of the datasets commonly studied in biology. Data visualisation is increasingly used as the key to unlock this data, going from hypothesis generation to model evaluation and tool implementation. It is becoming more and more the heart of bioinformatics workflows, enabling scientists to reason and communicate more effectively. In parallel, there has been a corresponding trend towards the development of related software, which has triggered the maturation of different visualisation libraries and frameworks. For bioinformaticians, scientific programmers and software developers, the main challenge is to pick out the most fitting one(s) to create clear, meaningful and integrated data visualisation for their particular use cases. In this review, we introduce a collection of open source or free to use libraries and frameworks for creating data visualisation, covering the generation of a wide variety of charts and graphs. We will focus on software written in Java, JavaScript or Python. We truly believe this software offers the potential to turn tedious data into exciting visual stories. This article is protected by copyright. All rights reserved.
    Proteomics 02/2015; DOI:10.1002/pmic.201400377 · 3.97 Impact Factor
  • Source
    ABSTRACT: Background: Many long non-coding RNAs (lncRNAs) have been found to be good markers for several tumors. Using a lncRNA-mining approach, we aimed to identify a lncRNA expression signature that can predict breast cancer patient survival. Methods: We performed lncRNA expression profiling in 887 breast cancer patients from Gene Expression Omnibus (GEO) datasets. The association between lncRNA signature and clinical survival was analyzed using the training set (n = 327, from GSE20685). The validation for the association was performed in another three independent testing sets (252 from GSE21653, 204 from GSE12276, and 104 from GSE42568). Results: A set of four lncRNA genes (U79277, AK024118, BC040204, AK000974) was identified by the random survival forest algorithm. Using a risk score based on the expression signature of these lncRNAs, we separated the patients into low-risk and high-risk groups with significantly different survival times in the training set. This signature was validated in the other three cohorts. Further study revealed that the four-lncRNA expression signature was independent of age and subtype. Gene Set Enrichment Analysis (GSEA) suggested that gene sets were involved in several cancer metastasis related pathways. Conclusions: These findings indicate that lncRNAs may be implicated in breast cancer pathogenesis. The four-lncRNA signature may have clinical implications in the selection of high-risk patients for adjuvant therapy.
    Journal of Experimental & Clinical Cancer Research 10/2014; 33(1):84. DOI:10.1186/PREACCEPT-1637809151128130 · 3.27 Impact Factor
