Figure 5 - uploaded by Florian Battke
Profile plot after QT clustering. The profile colors are determined by the QT clustering (for details refer to the text). Values have been z-scored for presentation clarity.

Source publication
Article
DNA Microarrays have become the standard method for large scale analyses of gene expression and epigenomics. The increasing complexity and inherent noisiness of the generated data makes visual data exploration ever more important. Fast deployment of new methods as well as a combination of predefined, easy to apply methods with programmer's access t...

Contexts in source publication

Context 1
... 4 Heatmap of the clustered experiments. The heat map shows expression values mapped to a color gradient from low (green) to high expression (red). Experiments are arranged according to a hierarchical clustering dendrogram. The order of genes and the color of gene identifiers is determined by the QT clustering (for details refer to the text) which is also used in figure 5. ...
Context 2
... R [18] and its wealth of available packages and thus allows the application of third-party methods directly on Mayday’s data. R processes can also be connected to Mayday over the network, allowing complex calculations to run on a powerful workstation or cluster and communicating with a Mayday instance running on the researcher’s laptop, for instance. Furthermore, all gene expression data and meta information currently opened in Mayday can be queried using standard SQL, including the possibility to create new views and custom tables. These shells both feature syntax-highlighting editors with persistent history, greatly increasing programmers’ productivity (see figure 2). Time series analyses as well as replicate studies often require researchers to compare different datasets, e.g. to find systematic shifts in expression over time. Mayday now offers a specialized view for this purpose in addition to the cross-dataset analyses possible with our R and SQL command-line interfaces. For integrative pathway analyses, biochemical pathways from several sources, including KEGG [19] and MetaCyc [20], can be visualized as networks. The expression data of enzymes and concentration data of metabolites can be summarized and visualized on the network in different forms, including profile plots and heatmaps. Gene annotations can be imported from external databases. We currently offer direct support for the Gene Ontology [21] and KEGG databases. Gene identifier mapping can be done automatically using the PICR [22] service. To demonstrate the new functionalities of Mayday, we present here an analysis of a large time series in Streptomyces coelicolor. For streptomycetes it has proved very difficult to identify the key regulators that control expression of the pathway specific regulators. Mayday was used to monitor the expression dynamics of the bacterium in a time series dataset with unprecedented resolution.
A custom-designed Affymetrix array containing 22,779 probe sets interrogating genes, intergenic regions, and predicted noncoding RNAs was used to study the gene expression in mostly hourly intervals starting at 20 h after inoculation, up to 60 h [23]. Altogether, 32 time points were studied. Phosphate was depleted in the medium at 36 h. All oligos of the probe sets were mapped to their genomic locus on the chromosome or on one of the two plasmids of Streptomyces coelicolor. For each probe set the start and end genomic coordinate together with the strand orientation were written to a tab-separated file. Within Mayday we imported data from 32 CEL files using Mayday’s R interpreter. For normalization we used the robust multi-array average method (RMA) [24] as provided in the affy [25] package of BioConductor [26]. We imported genomic locus information from the tab-separated file described above for later steps in the analysis. Using a custom processing pipeline, we automatically compute regularized variance for each probe and then apply a filtering step to create a probe list of most variant probesets. Of 22,779 probesets, 64 remain after filtering with a regularized variance threshold of 0.3. Based on this probelist of variant probesets, we create a new dynamic probelist to select only those probes that, apart from being the most variant, interrogate protein coding genes (SCOxxxx), and query the plus strand of the Sco genome (see figure 3). 32 probesets remain. Changing any of the filter parameters automatically updates all plots based on the dynamic probelist. The time series sampling reflects the development of Streptomyces coelicolor from early growth phase to stationary phase. Accordingly, the expression differences between the samples taken at two consecutive time points should, in general, be smaller than those between samples from time points that lie further apart.
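The two filtering steps described above (a variance threshold, followed by a dynamic selection on gene identifier and strand) can be sketched roughly as follows. This is only an illustration under stated assumptions: the text does not define Mayday's "regularized variance", so the plain sample variance stands in for it here, and the function names, probe identifiers, strands and expression values are all hypothetical toy inputs rather than data from the study.

```python
import numpy as np

def most_variant(expr, threshold):
    """Return the row indices whose sample variance across time points exceeds threshold.

    Assumption: plain sample variance (ddof=1) replaces the 'regularized
    variance' mentioned in the text, whose definition is not given there.
    """
    expr = np.asarray(expr, dtype=float)
    variances = expr.var(axis=1, ddof=1)
    return np.flatnonzero(variances > threshold)

def dynamic_probelist(names, strands, indices):
    """Keep probes from `indices` that look like protein coding genes (SCO prefix)
    and lie on the plus strand."""
    return [names[i] for i in indices
            if names[i].startswith("SCO") and strands[i] == "+"]

# Toy input: 4 hypothetical probes measured at 4 time points.
names = ["SCO9001", "SCO9002", "SCO9003", "ncRNA_x"]   # made-up identifiers
strands = ["+", "+", "-", "+"]                          # made-up strands
expr = np.array([
    [1.0, 1.0, 1.0, 1.0],   # flat profile, removed by the variance filter
    [0.0, 1.0, 2.0, 3.0],   # variant, coding, plus strand
    [5.0, 4.0, 5.0, 6.0],   # variant, but on the minus strand
    [3.0, 1.0, 3.0, 1.0],   # variant, but not an SCO identifier
])

variant = most_variant(expr, threshold=0.3)
selected = dynamic_probelist(names, strands, variant)
print(variant.tolist(), selected)
```

Because the second list is computed from the first, rerunning both calls after changing the threshold mimics, in a very reduced form, how a dynamic probelist follows its filter parameters.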
Furthermore, the differences between time points should reflect the rate of change in the metabolic state of the culture. To assess this hypothesis, we performed a hierarchical clustering of the transposed matrix, i.e. clustering of the experiments, using the most variant genes. We used the Euclidean distance and MAYDAY’s implementation of the rapid neighbour-joining algorithm [14]. The resulting cluster tree is visualized along with a heatmap in figure 4. As expected, the early (20 h) and late (60 h) time points are at the outermost leaves of the tree and consecutive time points are clustered very closely together. The tree nicely depicts the consecutive points of time along the growth curve of the organism. It also shows the major expression change occurring between 35 and 36 hours after inoculation. This largest expression change coincides exactly with the time of complete phosphate depletion in the fermenter. Since the heatmap suggests the existence of distinct groups of genes within the probelist, we use QT clustering with a diameter of 0.4 and use the resulting clusters to color a profile plot showing the z-scored profiles of the genes (figure 5). The dynamic architecture of the metabolic switch is clearly visible, with different groups of genes being up- or down-regulated in a successive order of time points (35, 39 and 43 hours in this subset). The heatmap also shows that there are some genes that clearly separate the time points 46-60 from the earlier ones. Using the GeneMining plugin, we search for those genes that optimally separate these two groups of experiments (using the quartet mining algorithm, for details see MAYDAY’s website). Of the 32 genes in the dynamic probelist described above, 15 belong to the list selected by the quartet mining algorithm. These genes all exclusively belong to the actinorhodin pathway, a genomic cluster of genes (SCO5071-SCO5092).
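To make the QT clustering and z-scoring steps mentioned above more concrete, the following is a minimal sketch of greedy quality-threshold clustering on a distance matrix. It is not Mayday's implementation. The text does not state which distance measure was used with the diameter of 0.4, so 1 minus the Pearson correlation is assumed here; the function names and the five toy profiles are hypothetical.

```python
import numpy as np

def zscore_rows(expr):
    """Scale each profile (row) to mean 0 and sample standard deviation 1."""
    expr = np.asarray(expr, dtype=float)
    mean = expr.mean(axis=1, keepdims=True)
    std = expr.std(axis=1, ddof=1, keepdims=True)
    return (expr - mean) / std

def qt_cluster(dist, max_diameter):
    """Greedy QT clustering on a symmetric distance matrix.

    Each remaining point seeds a candidate cluster that grows by the point
    which keeps the cluster diameter smallest, until the diameter would
    exceed max_diameter. The largest candidate is kept, its members are
    removed, and the procedure repeats on the rest.
    """
    dist = np.asarray(dist, dtype=float)
    remaining = list(range(dist.shape[0]))
    clusters = []
    while remaining:
        best = []
        for seed in remaining:
            candidate = [seed]
            # far[j]: distance from point j to its farthest current cluster member
            far = {j: dist[seed, j] for j in remaining if j != seed}
            while far:
                nxt = min(far, key=far.get)
                if far[nxt] > max_diameter:
                    break
                candidate.append(nxt)
                del far[nxt]
                for j in far:
                    far[j] = max(far[j], dist[nxt, j])
            if len(candidate) > len(best):
                best = candidate
        clusters.append(sorted(best))
        remaining = [i for i in remaining if i not in best]
    return clusters

# Toy profiles: two rising, two falling, one oscillating (made-up values).
profiles = np.array([
    [0.0, 1.0, 2.0, 3.0],
    [0.0, 1.0, 2.0, 3.5],
    [3.0, 2.0, 1.0, 0.0],
    [3.2, 2.0, 1.0, 0.0],
    [0.0, 3.0, 0.0, 3.0],
])
z = zscore_rows(profiles)            # the values a profile plot would show
dist = 1.0 - np.corrcoef(z)          # assumed distance: 1 - Pearson correlation
clusters = qt_cluster(dist, max_diameter=0.4)
print(clusters)
```

The cluster index of each profile could then be mapped to a color, which is the role the QT clusters play in the profile plot. Unlike k-means, this scheme needs no preset number of clusters, only the maximum diameter.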
The experimental data also contains optical measurements of the amount of actinorhodin produced. Combining ScoCyc [27] pathway information, expression values and external measurements of actinorhodin levels, we produce an interactive visualization of the actinorhodin pathway (figure 6). At first glance, it is obvious that spectrometrically measured actinorhodin concentration rises in response to the upregulation of several enzymes in this pathway. Interesting target compounds for analysis can be selected from the pathway image for further wet-lab investigation. Since the dataset used here is part of a larger experiment where biological replicates were produced in separate fermentation runs, we decided to investigate whether we could detect systematic differences between these replicates. Figure 7 shows Mayday’s time series alignment tool with one of the QT clusters as an example. The genes in that cluster are up-regulated one hour later in the second fermentation (F202) than in the reference fermentation (F199). This time shift could be traced to a one-hour delay in phosphate depletion in the second fermentation. Mayday is a comprehensive platform for the analysis and the visual exploration of microarray data. According to Allison et al. [28] the most important statistical components of a microarray experiment analysis involve the following steps: design, preprocessing, inference or classification and validation. In recent years the analysis of microarray data has become highly sophisticated; new methods are published almost daily. These range from preprocessing and normalization to novel statistical and machine learning methods. Software that aims to keep pace with these developments has to enable the rapid integration of new methods and make them as usable as possible. An important focus of exploration of high-dimensional data, such as microarray data, lies on visualization.
The advantage of our design is the tight integration of both analysis and visualization as well as the various visualization techniques themselves. This combination of automatic and visual analysis leads to a visual analytics approach that provides more insight into the structure of the data. We think that with Mayday such a visual analytics approach for the analysis of high-dimensional microarray data has been realized. We present a very versatile open-source framework for efficient microarray data analysis, designed for biologists and bioinformaticians. All common tasks of microarray analyses are already covered, and the wide range of functionality from the already existing plugins can swiftly be extended with new plugins written in Java; ad-hoc scripting interfaces facilitate rapid prototyping of new algorithms as well as interactive specialized data exploration. Mayday’s interactive visualization methods in conjunction with the meta-data concept provide significant insight into complex data and have successfully been applied in many microarray analyses. New methods and tools are continuously added to Mayday’s platform to keep up with new developments. Our coming release includes two new visualizations based on genomic locus information: a track-based visualization and a view showing expression (or meta information) values as colored boxes aligned to a linear chromosome laid out continuously in stacked rows. Both are fully interactive and integrated with all other visualizations. Most recently, novel ultra-high throughput DNA sequencing technologies have been developed that enable researchers to obtain the complete genomes of organisms faster and at a lower cost than classical methods [29]. Moreover, these technologies can be applied to measure gene expression (RNA-Seq) [30] and protein-DNA interactions (ChIP-Seq) [31], and many current studies use RNA-Seq and microarray data comparatively.
Our new genomic plots will be especially useful in the context of such new types of data. We’re currently working on an integration of these new data types into Mayday, separately or in multi-platform ...

Citations

... Each species is attributed a line which traverses (if present) or avoids (if absent) pangenomic blocks divided into two 'rings', corresponding to the backward and forward strands (as illustrated in Appendix XIX). It works on in-house pangenomes and is compatible to their other in-house tool Mayday [213] which can link the visualization to a browser with gene annotation for example. Interactivity is limited to zooming, rotating, and panning the view. ...
Thesis
The popularization of sequencing technologies in the past twenty years led to a sharp increase in the number of sequenced genomes. The diversity of the newly sequenced reference genomes highlighted the biases of using a single reference, which is not enough to access all the diversity within a species. There are many examples of intraspecific variations within plants, including presence / absence and copy number variations. These variations can have a strong effect on plant phenotypes, as exemplified by the African rice in which the presence of the gene Sub1A is linked to drought tolerance. The concept of pangenome appeared to better integrate these variations within genomics approaches. A pangenome can be built from genes only or from any genomic fragments found within a group, and is useful to compare their distributions between multiple individuals. Depending on the presence rate, many categories of elements can be defined; the main ones are the elements present in all the individuals (part of the ‘core’ genome) and those absent in at least one of them (part of the ‘variable’ genome). Pangenomics still lacks tools, especially for visualization. This is particularly true for eukaryotes (including plants) which have larger and more complex genomes than bacteria. Pangenomes were first built for bacteria, but their related tools cannot properly work on bigger genomes. My PhD investigated the creation of novel visual representations and tools for the visualization of plant pangenomes (and eukaryotes in general). Within this dissertation, I introduce the state of the art of pangenome visualization: I distinguish pan-gene from pangenomes, the latter often being represented by pangenome graphs where each sequence is a node and each observed sequence succession forms an edge; I also identify unspecific, qualifying, positioned, structural and composite visualization tools.
The first chapter introduces ten principles for creating a genomic visualization tool, for future biology or bioinformatics scientists interested in data visualization. The second chapter describes my first pangenome visualization, published in the journal Bioinformatics under the name ‘Panache: a Web Browser-Based Viewer for Linearized Pangenomes’. I detail the visual representation used within Panache and the creation of the resulting web application built in JavaScript, enabling the dynamic exploration of pangenomic data. The third and final chapter details the design of a composite visualization tool for pangenomes, called SaVanache, which enables navigation between four view scales: one for global diversity, one for structural variations, one for the presence / absence variations, and one for nucleotide variations. I propose a new approach for the annotation and visual representation of structural variations within a pangenome graph, based on a pivot path within the graph used as a reference coordinate system.
... Thus, many scholars have studied the related problems of time-series clustering, Niennattrakul et al. [17] uses k-Medoids to cluster multimedia sequences. Battke et al. [18] applies a bottom-up hierarchical clustering method to identify common and uncommon sub-sequences in a broad category of time series. ...
Article
Spatiotemporal modelling of short-term forecasts of metro passenger flows continue to face tremendous challenges. First, there is a need to consider the functional domain made up of several similar stations; Secondly, complex spatiotemporal models depend on a large number of learnable parameters. This paper proposed a spatiotemporal dual self-adaptive network based on the cluster (CG-TaLK) to accurately predict the inflow and outflow of subway passengers. Specifically, through the division of clustering, the members of each group learn a shared embedding, and use the inner product of embedding to mine the flow pattern between urban functional areas, so as to provide more accurate spatial information for prediction. In addition, in order to limit the number of parameters, we migrate a temporal adaptive convolution (TaLK) to capture the time correlation of each station according to the characteristics of passenger flow. The self-adaptive mechanism in space and time can enhance the fitting ability of the model. By comparing six representative algorithms on Hangzhou Metro dataset, the results show that the proposed method is effective and takes up the least parameters. Meanwhile, experiments show that the algorithm can find the main communication between function areas.
... Multivariate time-series data have been used in manufacturing systems and predictive maintenance [5,6]. Timeseries data obtained from gene expression measurement [7][8][9], for instance, can be used by biologists to understand the correlation between types of genes, analyze gene interactions, and compare regulatory behaviors for genes of interest. Medical experts also utilize timeseries data from blood pressure measurements [10] to understand and deal with cases such as monitoring illness progression, and understanding ecological and behavioral processes related to a disease, which may lead to improved diagnoses. ...
... Wijk et al. [140] conducted pioneering work in which they use a bottom-up hierarchical clustering approach to identify common and uncommon subsequences that occur in large time-series. Battke et al. [7] overcame the issue of hierarchical clustering speed for large time-series datasets by implementing the rapid neighbor-joining algorithm [141]. Alkhushayni et al. [142] also looked at how to analyze homology cluster groups utilizing agglomerative hierarchical clustering algorithms and methods. ...
... A self-organizing map (SOM), a model-based method developed by Kohonen [143], is a specific type of neural network (NN) used for model-based clustering. SOM has been used to analyze temporal data and is utilized for pattern discovery in temporal data [7,16,17,42,51,63,144]. The introduction of Recurrent SOM [145] and Recursive SOM [146] has enhanced SOM for mapping time-series data [147]. ...
Article
We present a comprehensive, detailed review of time-series data analysis, with emphasis on deep time-series clustering (DTSC), and a case study in the context of movement behavior clustering utilizing the deep clustering method. Specifically, we modified the DCAE architectures to suit time-series data at the time of our prior deep clustering work. Lately, several works have been carried out on deep clustering of time-series data. We also review these works and identify state-of-the-art, as well as present an outlook on this important field of DTSC from five important perspectives.
... Differentially expressed genes were identified at a significance threshold of p < 0.05 (FDR-corrected p-values) in two comparisons: (1) miCox79 vs. miCTR; and (2) miCox474 vs. miCTR. Mayday (Battke et al., 2010) was used for the heatmap generation. Venn diagrams were generated using InteractiVenn (Heberle et al., 2015). ...
Article
Age-related impairment of mitochondrial function may negatively impact energy-demanding processes such as synaptic transmission thereby triggering cognitive decline and processes of neurodegeneration. Here, we present a novel model for age-related mitochondrial impairment based on partial inhibition of cytochrome c oxidase subunit 4 (Cox4) of complex IV of the respiratory chain. miRNA-mediated knockdown of Cox4 correlated with a marked reduction in excitatory and inhibitory synaptic marker densities in vitro and in vivo as well as an impairment of neuronal network activity in primary neuronal cultures. Transcriptome analysis identified the deregulation of gene clusters, which link induced mitochondrial perturbation to impaired synaptic function and plasticity as well as processes of aging. In conclusion, the model of Cox4 deficiency reflects aspects of age-related dementia and might, therefore, serve as a novel test system for drug development.
... Multivariate time series data have been also used in manufacturing systems and predictive maintenance [32], [33]. In the surveyed visual analytics papers, time series data, e.g., obtained from gene expression measurement [34], [35], [36], [37] can be used by biologists to understand the correlation between different types of genes, analyze gene interactions, and compare regulatory behaviors for interesting genes. Moreover, medical experts utilize time series data e.g., blood pressure measurements [38], to understand and deal with different cases such as monitoring illness progression, and understanding ecological and behavioral processes related to a disease which may lead to improved disease diagnoses. ...
... From a data mining perspective, Aghabozorgi et al. [12] state that Euclidean Distance and DTW are the most popular distance measures in time series data; however, Euclidean Distance is the most widely used distance measure in the surveyed visual analytics papers e.g. [66], [67], [69], [34], [91], [57], [52], [85], [50], [113], [88], [59], [75], [93], [44], [60], [89], [53], [95], [90], [43], [96], [114], [70], [81], [56], [115] as it is the most straightforward distance measure compared to others. DTW has only been used in [53], [48], [79], [56] to calculate the similarity of time series data, and papers [72], [35], [34], [85], [86], [61], [83] use correlation and cross-correlation in their works. ...
Article
Visual analytics for time series data has received a considerable amount of attention. Different approaches have been developed to understand the characteristics of the data and obtain meaningful statistics in order to explore the underlying processes, identify and estimate trends, make decisions and predict the future. The machine learning and visualization areas share a focus on extracting information from data. In this paper, we consider not only automatic methods but also interactive exploration. The ability to embed efficient machine learning techniques (clustering and classification) in interactive visualization systems is highly desirable in order to gain the most from both humans and computers. We present a literature review of some of the most important publications in the field and classify over 60 published papers from six different perspectives. This review intends to clarify the major concepts with which clustering or classification algorithms are used in visual analytics for time series data and provide a valuable guide for both new researchers and experts in the emerging field of integrating machine learning techniques into visual analytics.
... For more detailed analysis, the ANOVA model was filtered based on the respective gene lists (Fulton et al., 2009). TAM-specific surface marker expression was visualized in a heatmap using Mayday (Battke, Symons and Nieselt, 2010). ...
Article
Tumor-associated macrophages (TAMs) are frequently the most abundant immune cells in cancers and are associated with poor survival. Here, we generated TAM molecular signatures from K14cre;Cdh1flox/flox;Trp53flox/flox (KEP) and MMTV-NeuT (NeuT) transgenic mice that resemble human invasive lobular carcinoma (ILC) and HER2+ tumors, respectively. Determination of TAM-specific signatures requires comparison with healthy mammary tissue macrophages to avoid overestimation of gene expression differences. TAMs from the two models feature a distinct transcriptomic profile, suggesting that the cancer subtype dictates their phenotype. The KEP-derived signature reliably correlates with poor overall survival in ILC but not in triple-negative breast cancer patients, indicating that translation of murine TAM signatures to patients is cancer subtype dependent. Collectively, we show that a transgenic mouse tumor model can yield a TAM signature relevant for human breast cancer outcome prognosis and provide a generalizable strategy for determining and applying immune cell signatures provided the murine model reflects the human disease.
... A hurdle in further development of S. coelicolor as a 'superhost' is the limited knowledge of M1152 metabolism and its regulatory system, even while insight can be gained from analysing snapshots of gene expression levels during regular time intervals of a batch fermentation (Battke et al., 2010; Jäger et al., 2011; Liao et al., 2014; Love et al., 2014; Mi et al., 2019). Complementary to cataloguing gene expression is elucidation of the metabolic behaviour, which is inherently connected with enzymes catalysing most metabolic transformations. ...
... In-depth analyses, such as expression profile clustering of differentially expressed genes, were performed within the expression analysis framework Mayday (Battke et al., 2010). Prior to this, the raw count files were imported into Mayday SeaSight (Battke and Nieselt, 2011) for common, timeseries-wide normalization. ...
Preprint
Streptomyces coelicolor M1152 is a widely used host strain for the heterologous production of novel small molecule natural products, genetically engineered for this purpose through e.g. deletion of four of its native biosynthetic gene clusters (BGCs) for improved precursor supply. Regardless of its potential, a systems understanding of its tight regulatory network and the effects of the significant genomic changes in M1152 is missing. In this study, we compare M1152 to its ancestor M145, thereby connecting observed phenotypic differences to changes on transcription and translation. Measured protein levels are connected to predicted metabolic fluxes, facilitated by an enzyme-constrained genome-scale model (GEM), that by itself is a consensus result of a community effort. This approach connects observed differences in growth rate and glucose consumption to changes in central carbon metabolism, accompanied by differential expression of important regulons. Results suggest that precursors supply is not limiting secondary metabolism, informing that alternative strategies will be beneficial for further development of S. coelicolor for heterologous production of novel compounds.
... Differentially expressed transcripts were divided in clusters according to the normalized number of aligned reads in each stage by K-means clustering implemented in Mayday [93] based on Euclidian correlation between expression values. The list of transcripts in each cluster was used in Blast2GO to identify the enriched GO terms. ...
... The relation among GO terms was assigned using REVIGO with the Resvik algorithm option [94] and plotted in R with the Treemap library (github.com/mtennekes/treemap.git). To build the expression heatmap by functional categories, the counts of each transcript belonging to a protein group in such category were added up, and later transformed in Z-scores, clustered, and plotted in a heatmap using Mayday [93]. ...
Article
Background There are clear differences in embryo development between angiosperm and gymnosperm species. Most of the current knowledge on gene expression and regulation during plant embryo development has derived from studies on angiosperms species, in particular from the model plant Arabidopsis thaliana. The few published studies on transcript profiling of conifer embryogenesis show the existence of many putative embryo-specific transcripts without an assigned function. In order to extend the knowledge on the transcriptomic expression during conifer embryogenesis, we sequenced the transcriptome of zygotic embryos for several developmental stages that cover most of Pinus pinaster (maritime pine) embryogenesis. Results Total RNA samples collected from five zygotic embryo developmental stages were sequenced with Illumina technology. A de novo transcriptome was assembled as no genome sequence is yet published for Pinus pinaster. The transcriptome of reference for the period of zygotic embryogenesis in maritime pine contains 67,429 transcripts, which likely encode 58,527 proteins. The annotation shows a significant percentage, 31%, of predicted proteins exclusively present in pine embryogenesis. Functional categories and enrichment analysis of the differentially expressed transcripts evidenced carbohydrate transport and metabolism over-representation in early embryo stages, as highlighted by the identification of many putative glycoside hydrolases, possibly associated with cell wall modification, and carbohydrate transport transcripts. Moreover, the predominance of chromatin remodelling events was detected in early to middle embryogenesis, associated with an active synthesis of histones and their post-translational modifiers related to increased transcription, as well as silencing of transposons. 
Conclusions Our results extend the understanding of gene expression and regulation during zygotic embryogenesis in conifers and are a valuable resource to support further improvements in somatic embryogenesis for vegetative propagation of conifer species. Specific transcripts associated with carbohydrate metabolism, monosaccharide transport and epigenetic regulation seem to play an important role in pine early embryogenesis and may be a source of reliable molecular markers for early embryogenesis. Electronic supplementary material The online version of this article (10.1186/s12870-018-1564-2) contains supplementary material, which is available to authorized users.
... For cluster visualization, jClust [5] provides a graphical user interface, as does ClustVis [6], a web tool using 2D scatterplots and localizations. Mayday [7] is a powerful and distributed R-based platform for analysis and visualization, which was initially designed for microarray analyses. Likewise, The Hierarchical Clustering Explorer [8] focuses on microarrays and visualizes the data primarily as dendrograms and heat maps, similar to Clusterphile [9], which does focus on interactively exploring the data. ...
Article
Background Studies that aim at explaining phenotypes or disease susceptibility by genetic or epigenetic variants often rely on clustering methods to stratify individuals or samples. While statistical associations may point at increased risk for certain parts of the population, the ultimate goal is to make precise predictions for each individual. This necessitates tools that allow for the rapid inspection of each data point, in particular to find explanations for outliers. Results ACES is an integrative cluster- and phenotype-browser, which implements standard clustering methods, as well as multiple visualization methods in which all sample information can be displayed quickly. In addition, ACES can automatically mine a list of phenotypes for cluster enrichment, whereby the number of clusters and their boundaries are estimated by a novel method. For visual data browsing, ACES provides a 2D or 3D PCA or Heat Map view. ACES is implemented in Java, with a focus on a user-friendly, interactive, graphical interface. Conclusions ACES has been proven an invaluable tool for analyzing large, pre-filtered DNA methylation data sets and RNA-Sequencing data, due to its ease to link molecular markers to complex phenotypes. The source code is available from https://github.com/GrabherrGroup/ACES.