Mayday - integrative analytics for expression data

Center for Bioinformatics Tübingen, University of Tübingen, Sand 14, 72076 Tübingen, Germany.
BMC Bioinformatics (Impact Factor: 2.67). 03/2010; 11(1):121. DOI: 10.1186/1471-2105-11-121
Source: PubMed

ABSTRACT: DNA microarrays have become the standard method for large-scale analyses of gene expression and epigenomics. The increasing complexity and inherent noisiness of the generated data make visual data exploration ever more important. Fast deployment of new methods, as well as a combination of predefined, easy-to-apply methods with programmer's access to the data, are important requirements for any analysis framework. Mayday is an open-source platform with an emphasis on visual data exploration and analysis. Many built-in methods for clustering, machine learning and classification are provided for dissecting complex datasets. Plugins can easily be written to extend Mayday's functionality in a large number of ways. As a Java program, Mayday is platform-independent and can be used as a Java WebStart application without any installation. Mayday can import data from several file formats, and database connectivity is included for efficient data organization. Numerous interactive visualization tools are available, including box plots, profile plots, principal component plots and a heatmap; these can be enhanced with metadata and exported as publication-quality vector files.
We have rewritten large parts of Mayday's core to make it more efficient and ready for future developments. Among the large number of new plugins are an automated processing framework, dynamic filtering, new and efficient clustering methods, a machine learning module and database connectivity. Extensive manual data analysis can be done using a built-in R terminal and an integrated SQL querying interface. Our visualization framework has become more powerful: new plot types have been added and existing plots have been improved.
We present a major extension of Mayday, a very versatile open-source framework for efficient microarray data analysis designed for biologists and bioinformaticians. Most everyday tasks are already covered. The large number of available plugins, as well as the extension possibilities using compiled plugins and ad-hoc scripting, allow for the rapid adaptation of Mayday even to very specialized data exploration. Mayday is available at

Available from: Florian Battke, Aug 06, 2015
    • "Differentially expressed genes were determined by setting fixed thresholds taking the background noise of the self-hybridization into account. MayDay (Battke et al., 2010) was used for analysis of expression patterns in individual datasets. Microarray data were deposited at Gene Expression Omnibus database, GEO ID: GSE35832. "
    ABSTRACT: The whole genome sequence of Rhodopirellula baltica SH1(T), published nearly 10 years ago, already revealed a high number of sulfatase genes. So far, little is known about the diversity and potential functions mediated by sulfatases in Planctomycetes. We combined in vivo and in silico techniques to gain insights into the ecophysiology of planctomycetal sulfatases. Comparative genomics of nine recently sequenced Rhodopirellula strains detected 1120 open reading frames annotated as sulfatases (Enzyme Commission number (EC) 3.1.6.*). These were clustered into 173 groups of orthologous and paralogous genes. To analyze the functional aspects, 708 sulfatase protein sequences from these strains were aligned with 67 sulfatase reference sequences of reviewed functionality. Our analysis yielded 22 major similarity clusters, but only five of these clusters contained Rhodopirellula sequences homologous to reference sequences, indicating a surprisingly high diversity. As an example, R. baltica SH1(T) was grown on different sulfated polysaccharides: chondroitin sulfate, λ-carrageenan and fucoidan. Subsequent gene expression analyses using whole genome microarrays revealed distinct sulfatase expression profiles depending on the substrates tested. This might be indicative of a high structural diversity of sulfated polysaccharides as potential substrates. The pattern of sulfatases in individual planctomycete species may reflect ecological niche adaptation.
    Marine Genomics 12/2012; 9. DOI:10.1016/j.margen.2012.12.001 · 1.97 Impact Factor
    • "We included a new visualization to analyze distributions of genetic variations in more detail. Furthermore, we integrated Reveal into our visual analytics software Mayday (Battke et al., 2010), allowing for combined and highly interactive analyses of genotypic and expression data as well as meta-data (e.g. disease phenotype). "
    ABSTRACT: The analysis of expression quantitative trait locus (eQTL) data is a challenging scientific endeavor, involving the processing of very large, heterogeneous and complex data. Typical eQTL analyses involve three types of data: sequence-based data reflecting the genotypic variations, gene expression data and meta-data describing the phenotype. Based on these, certain genotypes can be connected with specific phenotypic outcomes to infer causal associations of genetic variation, expression and disease. To this end, statistical methods are used to find significant associations between single nucleotide polymorphisms (SNPs) or pairs of SNPs and gene expression. A major challenge lies in summarizing the large amount of data as well as statistical results, and in generating informative, interactive visualizations. We present Reveal, our visual analytics approach to this challenge. We introduce a graph-based visualization of associations between SNPs and gene expression and a detailed genotype view relating summarized patient cohort genotypes with data from individual patients and statistical analyses. Reveal is included in Mayday, our framework for visual exploration and analysis. It is available at
    Bioinformatics 09/2012; 28(18):i542-i548. DOI:10.1093/bioinformatics/bts382 · 4.62 Impact Factor
    • "GenomeRing is integrated into our visual analytics platform Mayday (Battke et al., 2010) as a visualization which can display data from multiple per-species data sets. Using Mayday's facilities for data and meta-information management, we can for example add information about gene expression in the GenomeRing visualization. "
    ABSTRACT: The number of completely sequenced genomes is continuously rising, allowing for comparative analyses of genomic variation. Such analyses are often based on whole-genome alignments to elucidate structural differences arising from insertions, deletions or from rearrangement events. Computational tools that can visualize genome alignments in a meaningful manner are needed to help researchers gain new insights into the underlying data. Such visualizations are typically realized either in a linear fashion, as in genome browsers, or by using a circular approach, where relationships between genomic regions are indicated by arcs. Both methods allow for the integration of additional information such as experimental data or annotations. However, providing a visualization that still allows for a quick and comprehensive interpretation of all important genomic variations together with various supplemental data, which may be highly heterogeneous, remains a challenge. Here, we present two complementary approaches to tackle this problem. First, we propose the SuperGenome concept for the computation of a common coordinate system for all genomes in a multiple alignment. This coordinate system allows for the consistent placement of genome annotations in the presence of insertions, deletions and rearrangements. Second, we present the GenomeRing visualization that, based on the SuperGenome, creates an interactive overview visualization of the multiple genome alignment in a circular layout. We demonstrate our methods by applying them to an alignment of Campylobacter jejuni strains for the discovery of genomic islands as well as to an alignment of Helicobacter pylori, which we visualize in combination with gene expression data. GenomeRing and example data are available at
    Bioinformatics 06/2012; 28(12):i7-15. DOI:10.1093/bioinformatics/bts217 · 4.62 Impact Factor