Exact and Approximate Area-Proportional Circular Venn and Euler Diagrams

Abstract

Scientists conducting microarray and other experiments use circular Venn and Euler diagrams to analyze and illustrate their results. As one solution to the problem of sizing these diagrams in proportion to the data, this paper introduces a statistical model for fitting area-proportional Venn and Euler diagrams to observed data. The statistical model outlined in this paper includes a statistical loss function and a minimization procedure that enables formal estimation of the Venn/Euler area-proportional model for the first time. A significance test of the null hypothesis is computed for the solution. Residuals from the model are available for inspection. As a result, this algorithm can be used for both exploration and inference on real data sets. A Java program implementing this algorithm is available under the Mozilla Public License. An R function venneuler() is available as a package on CRAN and a plugin is available in Cytoscape.
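To make the idea of an area-proportional fit concrete, the following is a minimal Python sketch for the simplest case of two circles. It is not the paper's algorithm or the venneuler() implementation: the function names (`lens_area`, `fit_two_circles`, `sum_squared_residuals`) are hypothetical, bisection stands in for the paper's minimization procedure, and a plain sum of squared residuals stands in for its loss function.

```python
import math


def lens_area(r1, r2, d):
    """Area of the intersection of two circles with radii r1, r2 and centre distance d."""
    if d >= r1 + r2:            # circles are disjoint
        return 0.0
    if d <= abs(r1 - r2):       # one circle lies inside the other
        return math.pi * min(r1, r2) ** 2

    def clamp(x):
        # Guard acos against tiny floating-point overshoot beyond [-1, 1].
        return max(-1.0, min(1.0, x))

    a1 = r1 * r1 * math.acos(clamp((d * d + r1 * r1 - r2 * r2) / (2 * d * r1)))
    a2 = r2 * r2 * math.acos(clamp((d * d + r2 * r2 - r1 * r1) / (2 * d * r2)))
    k = (-d + r1 + r2) * (d + r1 - r2) * (d - r1 + r2) * (d + r1 + r2)
    return a1 + a2 - 0.5 * math.sqrt(max(0.0, k))


def fit_two_circles(area_a, area_b, area_ab, iterations=200):
    """Return (r1, r2, d) so that the circle areas and their overlap match the inputs."""
    if not 0.0 <= area_ab <= min(area_a, area_b):
        raise ValueError("overlap must lie between 0 and the smaller set area")
    r1 = math.sqrt(area_a / math.pi)
    r2 = math.sqrt(area_b / math.pi)
    lo, hi = abs(r1 - r2), r1 + r2
    # The overlap shrinks as the centres move apart, so bisection on d works.
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if lens_area(r1, r2, mid) > area_ab:
            lo = mid
        else:
            hi = mid
    return r1, r2, (lo + hi) / 2


def sum_squared_residuals(observed, fitted):
    """Sum of squared differences between observed and fitted region areas."""
    return sum((o - f) ** 2 for o, f in zip(observed, fitted))


if __name__ == "__main__":
    r1, r2, d = fit_two_circles(area_a=10.0, area_b=6.0, area_ab=2.0)
    fitted = (math.pi * r1 ** 2, math.pi * r2 ** 2, lens_area(r1, r2, d))
    print(r1, r2, d, sum_squared_residuals((10.0, 6.0, 2.0), fitted))
```

With two circles an exact solution always exists, so the residuals are essentially zero. With three or more circles the region areas generally cannot all be matched at once, which is where a loss function, a numerical minimizer, and inspection of residuals become necessary.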
... The impact of winter rapeseed was evaluated by comparing the October 2019 data between treatment grouping A and B against the treatment grouping C and D. Also, within-treatment comparisons of October 2018 vs October 2020 were made to identify the individual proliferation of each diversification. The proportions (%) of shared and unique OTUs between treatments from different data sets were visualized with Venn diagrams by applying the script ps_venn.R (Russel 2021) from the phyloseq object and the package eulerr (Micallef and Rodgers 2014; Wilkinson 2012). ...
Article
Diversification of agricultural practices, including changes in crop rotation, intercropping or cover cropping, influences the soil microbiome. Here the impact of tillage and crop diversification on the soil microbiome is reported in one of the few boreal studies on the topic. The field experiment consisted of four treatments with four replications, all having a short cereal rotation practice, namely an oat (Avena sativa) – spring barley (Hordeum vulgare) – wheat (Triticum aestivum) rotation, for the past 10 years until spring 2018. During that period two of the treatments were conventionally tilled with moldboard ploughing whereas the other two were no-tillage treatments. From the growing season 2018 until fall 2020 the main crop in all treatments was spring barley. The first conventional tillage treatment was diversified with English ryegrass (Lolium perenne) as an undersown cover crop for the next three growing seasons. The first no-tillage treatment continued with spring barley only. The second conventional tillage and no-tillage treatments had winter rapeseed in rotation in 2019. Bulk soils were sampled in May 2018 before diversification and then in October 2018, 2019, and 2020. The results showed a clear effect of tillage on the beta-diversity of the soil microbiome and an increase in fungal richness. Barley monoculture interrupted with winter rapeseed resulted in a minor change of the fungal and bacterial community composition. Other fungal and bacterial alpha diversity measures did not react to tillage or diversification, nor did the gene copy abundances involved in the N cycle. In conclusion, tillage had a profound effect on the soil microbiome, hindering the impact of the diversification.
... R software is also used to visualize microbial composition using the ggplot2 package (Wickham 2016), create a clustered heatmap using the pheatmap version 1.0.12 package (Kolde 2015), create a proportional Venn diagram with the venneuler package (Wilkinson 2012), and plot a Principal Coordinates Analysis (PCoA) using the phyloseq package. ...
Article
Corals thrive in symbiotic relationships with a variety of microorganisms, including endosymbiont algae. The interaction between coral and microbial associations has been extensively researched since it is thought to play a role in coral health. Temperature and light are two abiotic factors that contribute to coral life. Corals in reef-flat environments frequently face variations in these two factors due to their proximity to shallow seas. This study aims to compare the microbial diversity and abundance associated with the coral Acropora pulchra on the reef flat under two conditions, namely corals that emerged to the surface at low tide (SF) and corals that submerged over time (SM), and to compare the microbial diversity of both with that found in its adjacent seawaters. Microbial analysis on 16S rRNA region V4 showed that the alpha diversity of coral microbial communities and seawaters was not significantly different. However, differences in abundance were noticed at the phylum and genus levels. With p-value < 0.05, PCoA analysis using the Bray-Curtis test showed that the coral microbial community was significantly different from that of the surrounding seawaters. This study indicates that under different conditions, corals of the same species can be dominated by different microbial groups. This study also confirms the distinctness of coral microbes from those of their adjacent seawaters. The abundance of certain microbes is a host mechanism for survival.
... We extracted results from the LS algorithm and classified genes with a q-value <0.3 (Benjamini-Hochberg correction) as having a circadian rhythm in V females. (131), eulerr (132), and ggplot2 (133). ...
Article
Sex peptide (SP), a seminal fluid protein of Drosophila melanogaster males, has been described as driving a virgin-to-mated switch in females, through eliciting an array of responses including increased egg laying, activity, and food intake and a decreased remating rate. While it is known that SP achieves this, at least in part, by altering neuronal signaling in females, the genetic architecture and temporal dynamics of the female’s response to SP remain elusive. We used a high-resolution time series RNA-sequencing dataset of female heads at 10 time points within the first 24 h after mating to learn about the genetic architecture, at the gene and exon levels, of the female’s response to SP. We find that SP is not essential to trigger early aspects of a virgin-to-mated transcriptional switch, which includes changes in a metabolic gene regulatory network. However, SP is needed to maintain and diversify metabolic changes and to trigger changes in a neuronal gene regulatory network. We further find that SP alters rhythmic gene expression in females and suggest that SP’s disruption of the female’s circadian rhythm might be key to its widespread effects.
Preprint
Background: Viruses, the majority of which are uncultivated, are among the most abundant biological entities on Earth. From altering microbial physiology to driving community dynamics, viruses are fundamental members of microbiomes. While the number of studies leveraging viral metagenomics (viromics) for studying uncultivated viruses is growing, standards for viromics research are lacking. Viromics can utilize computational discovery of viruses from total metagenomes of all community members (hereafter metagenomes) or use physical separation of virus-specific fractions (hereafter viromes). However, differences in the recovery and interpretation of viruses from metagenomes and viromes obtained from the same samples remain understudied. Results: Here, we compare viral communities from paired viromes and metagenomes obtained from 60 diverse samples across human gut, soil, freshwater, and marine ecosystems. Overall, viral communities obtained from viromes were more abundant and species rich than those obtained from metagenomes, although there were some exceptions. Despite this, metagenomes still contained many viral genomes not detected in viromes. We also found notable differences in the predicted lytic state of viruses detected in viromes vs metagenomes at the time of sequencing. Other forms of variation observed include genome presence/absence, genome quality, and encoded protein content between viromes and metagenomes, but the magnitude of these differences varied by environment. Conclusions: Overall, our results show that the choice of method can lead to differing interpretations of viral community ecology. We suggest that the choice of whether to target a metagenome or virome to study viral communities should be dependent on the environmental context and ecological questions being asked. However, our overall recommendation to researchers investigating viral ecology and evolution is to pair both approaches to maximize their respective benefits.
Article
Neuroblastoma is the most common extracranial solid tumor in children. A subgroup of high-risk patients is characterized by aberrations in the chromatin remodeller ATRX, which is encoded by 35 exons. In contrast to other pediatric cancers, where ATRX point mutations are most frequent, multi-exon deletions (MEDs) are the most frequent type of ATRX aberration in neuroblastoma. 75% of these MEDs are predicted to produce in-frame fusion proteins, suggesting a potential gain-of-function effect compared to nonsense mutations. For neuroblastoma there are only a few patient-derived ATRX aberrant models. Therefore, we created isogenic ATRX aberrant models using CRISPR-Cas9 in several neuroblastoma cell lines and one tumoroid and performed total RNA-sequencing on these and the patient-derived models. Gene set enrichment analysis (GSEA) showed decreased expression of genes related to both ribosome biogenesis and several metabolic processes in our isogenic ATRX exon 2-10 MED model systems, the patient-derived MED models and in tumor data containing two patients with an ATRX exon 2-10 MED. In sharp contrast, these same processes showed an increased expression in our isogenic ATRX knock-out and exon 2-13 MED models. Our validations confirmed a role of ATRX in the regulation of ribosome homeostasis. The two distinct molecular expression patterns within ATRX aberrant neuroblastomas that we identified imply that there might be a need for distinct treatment regimens.
Article
Stress-related psychiatric disorders and the stress system show prominent differences between males and females, as well as strongly divergent transcriptional changes. Despite several proposed mechanisms, we still lack the understanding of the molecular processes at play. Here, we explore the contribution of cell types to transcriptional sex dimorphism using single-cell RNA sequencing. We identify cell-type-specific signatures of acute restraint stress in the paraventricular nucleus of the hypothalamus, a central hub of the stress response, in male and female mice. Further, we show that a history of chronic mild stress alters these signatures in a sex-specific way, and we identify oligodendrocytes as a major target for these sex-specific effects. This dataset, which we provide as an online interactive app, offers the transcriptomes of thousands of individual cells as a molecular resource for an in-depth dissection of the interplay between cell types and sex on the mechanisms of the stress response.
Article
Euler diagrams are a popular technique to visualize set‐typed data. However, creating diagrams using simple shapes remains a challenging problem for many complex, real‐life datasets. To solve this, we propose RectEuler: a flexible, fully‐automatic method using rectangles to create Euler‐like diagrams. We use an efficient mixed‐integer optimization scheme to place set labels and element representatives (e.g., text or images) in conjunction with rectangles describing the sets. By defining appropriate constraints, we adhere to well‐formedness properties and aesthetic considerations. If a dataset cannot be created within a reasonable time or at all, we iteratively split the diagram into multiple components until a drawable solution is found. Redundant encoding of the set membership using dots and set lines improves the readability of the diagram. Our web tool lets users see how the layout changes throughout the optimization process and provides interactive explanations. For evaluation, we perform quantitative and qualitative analysis across different datasets and compare our method to state‐of‐the‐art Euler diagram generation methods.
Article
Foraminifera, the most ancient known calcium carbonate-producing eukaryotes, are crucial players in global biogeochemical cycles and well-used environmental indicators in biogeosciences. However, little is known about their calcification mechanisms. This impedes understanding the organismal responses to ocean acidification, which alters marine calcium carbonate production, potentially leading to biogeochemical cycle changes. We conducted comparative single-cell transcriptomics and fluorescent microscopy and identified calcium ion (Ca²⁺) transport/secretion genes and α-carbonic anhydrases that control calcification in a foraminifer. They actively take up Ca²⁺ to boost mitochondrial adenosine triphosphate synthesis during calcification but need to pump excess intracellular Ca²⁺ to the calcification site to prevent cell death. Unique α-carbonic anhydrase genes induce the generation of bicarbonate and proton from multiple CO₂ sources. These control mechanisms have evolved independently since the Precambrian to enable the development of large cells and calcification despite decreasing Ca²⁺ concentrations and pH in seawater. The present findings provide previously unknown insights into the calcification mechanisms and their subsequent function in enduring ocean acidification.
Article
Background: Traditional Chinese medicine (TCM) formulas are combinations of Chinese herbal medicines. Knowledge of classic medicine formulas is the basis of TCM diagnosis and treatment and is the core of TCM inheritance. The large number and flexibility of medicine formulas make memorization difficult, and understanding their composition rules is even more difficult. The multifaceted and multidimensional properties of herbal medicines are important for understanding the formula; however, these are usually separated from the formula information. Furthermore, these data are presented as text and cannot be analyzed jointly and interactively. Objective: We aimed to devise a visualization method for TCM formulas that shows the composition of medicine formulas and the multidimensional properties of herbal medicines involved and supports the comparison of medicine formulas. Methods: A TCM formula visualization method with multiple linked views is proposed and implemented as a web-based tool after close collaboration between visualization and TCM experts. The composition of medicine formulas is visualized in a formula view with a similarity-based layout supporting the comparison of compositing herbs; a shared herb view complements the formula view by showing all overlaps of pair-wise formulas; and a dimensionality-reduction plot of herbs enables the visualization of multidimensional herb properties. The usefulness of the tool was evaluated through a usability study with TCM experts. Results: Our method was applied to 2 typical categories of medicine formulas, namely tonic formulas and heat-clearing formulas, which contain 20 and 26 formulas composed of 58 and 73 herbal medicines, respectively. Each herbal medicine has a 23-dimensional characterizing attribute. 
In the usability study, TCM experts explored the 2 data sets with our web-based tool and quickly gained insight into formulas and herbs of interest, as well as the overall features of the formula groups that are difficult to identify with the traditional text-based method. Moreover, feedback from the experts indicated the usefulness of the proposed method. Conclusions: Our TCM formula visualization method is able to visualize and compare complex medicine formulas and the multidimensional attributes of herbal medicines using a web-based tool. TCM experts gained insights into 2 typical medicine formula categories using our method. Overall, the new method is a promising first step toward new TCM formula education and analysis methodologies.
Article
Background; Regenerative Systems; Optimization with Finite-Difference and Simultaneous Perturbation Gradient Estimators; Common Random Numbers; Selection Methods for Optimization with Discrete-Valued θ; Concluding Remarks
Book
The theory of multidimensional scaling arose and grew within the field of the behavioral sciences and now covers several statistical techniques that are widely used in many disciplines. Intended for readers of varying backgrounds, this book comprehensively covers the area while serving as an introduction to the mathematical ideas behind the various techniques of multidimensional scaling.
Article
The problem of deciding whether an intercept model or a no-intercept model is more appropriate for a given set of data is a problem with no simple solution. Often, the underlying physical situation will suggest an appropriate model; however, there still may be interest in assessing which model best fits the data or is the better predictor. In this article a different interpretation of regression through the origin is derived, that of a full fit to the original data set augmented by one further point. Examination of the leverage and influence of the augmented data point can provide help in comparing the models.