
# Visualise Undrawable Euler Diagrams


## Abstract

Given a group of overlapping sets, it is not always possible to represent it with an Euler diagram. The characteristics of Euler diagrams might conflict with the set relationships to depict, making it impossible to produce a correct drawing. In order to cover a larger class of instances, Euler diagrams have been extended to allow more general patterns, but so far none of the most common definitions can represent all possible relationships between sets. We aim to introduce methods and constructions that produce a clear representation, as close as possible to an Euler diagram, even for sets that are not formally drawable in that way. We investigate the reasons that make a diagram undrawable, in order to evaluate how and when to apply these constructions, and to lay the foundations needed to design algorithms for this purpose.
Paolo Simonetto, David Auber
LaBRI, Université Bordeaux I
paolo.simonetto@labri.fr, auber@labri.fr
May 8, 2008
Keywords: Euler diagrams, overlapping clustering
1 Introduction
Euler diagrams [4] are the most natural and widely used way to depict sets and their mutual relationships. They consist of an association between regions of the plane and the abstract sets, where the topological concepts of inclusion, exclusion and overlap of these regions are used to represent the analogous set relationships (fig. 1).
As Euler introduced these diagrams by example, there is no complete agreement about their formal definition. Some topological characteristics (such as the shape of the sets or the way they intersect) might be regarded either as essential traits or as merely aesthetic ones, leading authors to propose different definitions [5], [3], [9].
Drawing Euler diagrams is a difficult task. The complexity of the diagram can grow exponentially with the number of sets represented, as $n$ sets might form up to $2^n$ intersections [8]. For this reason, drawing Euler diagrams is challenging even for instances of a dozen sets [9].
Figure 1: Euler diagrams as studied by different authors. (a) Euler diagrams as studied in [5]; they do not permit overlapping lines or multiple-line crossing points. (b) The more general Euler diagrams studied in [3]; multiple-line crossing points and overlapping boundaries are allowed, as well as disconnected overlaps between the same sets. (c) Extended Euler diagrams (EED) as defined in [9]; holes inside the sets are permitted.
Euler diagrams and clustering. The main practical
aim of our research is to visualise overlapping cluster-
ing in a clear way. Large telecommunication networks,
biological and social networks, ﬁnancial data, are usu-
ally represented as graphs and visualised through em-
bedding of graphs. Grouping elements in these graphs
exactly corresponds to deﬁning set combinations, and
the visualisation of these sets can be achieved using Eu-
ler diagrams.
Even though clustering is classically understood as partitioning the elements, overlapping clustering is an interesting approach in many fields. Algorithms producing possibly overlapping sets have been defined, for instance, for analysing social networks [7] or protein-protein interaction networks [1].
In order to visualise each clustering detected, we
need to ensure we are always able to represent overlap-
ping sets. Standard Euler diagram deﬁnitions are not
able to represent all the possible set conﬁgurations, as
some of them have a topological structure that inevitably
violates the basic diagram rules.
hal-00319119, version 1 - 5 Sep 2008
Author manuscript, published in "12th International Conference on Information Visualisation, London : United Kingdom (2008)"
DOI : 10.1109/IV.2008.78
In section 2 we will present some work related to Euler diagrams. In particular, we will describe in more depth the problem of drawing Euler diagrams, and we will review work on their comprehension. In section 3, we will describe the relation between Euler diagrams and graphs, and we will introduce some useful definitions. In section 4 we will introduce the Euler representation, the kind of diagram we will use to overcome the limitations of standard Euler diagrams. Finally, in section 5 we will analyse how it is practically possible to manage the characteristics of Euler representations, and we will illustrate a possible approach to algorithm design.
2 Related work
These diagrams were first used by Euler for reasoning about categorical propositions and syllogisms [4]. John Venn also studied Euler diagrams as a tool for logical reasoning, proposing a particular subclass of them that was subsequently called Venn diagrams [8].
Nowadays, Euler diagrams are widely used, most frequently in the field of set theory. Answering questions about their existence and drawability has become crucially important.
The problem of identifying and drawing an Euler diagram is called the Euler Diagram Generation Problem (EDGP). The usual way to approach this problem goes through the detection of the topological structure of the intersections between the sets, the creation of a skeleton graph, and the identification of a planar embedding of that graph. The various approaches to EDGP differ in the input given and in the properties of the Euler diagrams returned.
Euler diagram deﬁnitions. Flower and Howse [5]
developed a method to obtain a clear and simple sub-
class of Euler diagrams (ﬁg. 1.a). In this class, the lines
of the diagram do not overlap and intersect just pair-
wise. Although these limitations create nicer diagrams,
they are not merely aesthetic, as they reduce the range
of the representable instances.
EDGP has also been studied as planarization of hy-
pergraphs [6]. Hypergraphs are graphs in which edges
are identiﬁed as generic subsets of nodes, rather than
couples of them. Drawing hypergraphs in their vertex-
based planar representation has been proved to be equiv-
alent to the generation problem of a class of Euler-like
diagrams (EED, ﬁg. 1.c) by Verroust and Viaud [9].
They introduced this class of diagrams, which can be informally thought of as Euler diagrams that might contain holes, and proved that they are always drawable when representing eight or fewer intersecting sets.
Figure 2: From the Euler diagram to the intersection graph. (a) The original diagram. (b) Identification of the diagram zones. (c) The resulting intersection graph. The dashed line is not part of the graph, but shows how to reverse the procedure: to draw the boundary of the class B we need to enclose the nodes b, ab, abc and intersect the edges (a, ab), (ac, abc).
Possibly the most exhaustive analysis of general Euler diagrams (fig. 1.b) and their representation was carried out by Stirling C. Chow in his PhD thesis [3]. Chow analysed the drawability of Euler diagrams in several different cases, although his work was essentially focused on the corresponding problems for area-proportional Euler diagrams.
Representable instance classes. Each of the approaches mentioned is able to depict a different class of instances. Euler diagrams as defined by Flower and Howse (fig. 1.a) cannot represent, for instance, the diagram in fig. 1.b. The class of instances representable by those simple Euler diagrams is actually a proper subset of the instances representable by Euler diagrams as defined by Chow (fig. 1.b). In his work [3], Chow also showed that Euler diagrams are a proper subset of Euler-like diagrams such as EED (fig. 1.c). Unfortunately, even EED can represent just a proper subset of all the possible instances of EDGP.
None of the previous approaches is therefore suitable for representing general groups of overlapping sets, unless we accept having no output for non-representable instances.
Diagram readability. As we will necessarily have to bend some rules of well-defined Euler diagrams, it is essential to understand which characteristics are most important for their comprehension. Benoy and Rodgers [2] evaluated the readability of Euler diagrams according to three aesthetic parameters: set boundary irregularity, zone area inequality, and boundary closeness. They found evidence that all of them strongly contribute to diagram comprehension.
3 Euler diagrams and graphs
Even if it is possible to deﬁne Euler diagrams in a
mathematical and formal way, working directly with
them is quite complicated. For this reason, Euler dia-
grams are usually studied and analysed as graphs.
We will represent diagrams as graphs in a way that is quite common in the literature. It is illustrated in fig. 2 and consists of the construction of what we will call intersection graphs.
We start with a collection of sets to represent. These sets are defined independently of each other on a set of elements, so they will generally overlap. We will indicate this collection with $C = \{C_a, C_b, \ldots\}$¹. To avoid confusion with the more common word "set", we will call each $C_x$ a class and $C$ itself a classification.
Zone decomposition. Starting from an Euler diagram, it is possible to divide it into zones (fig. 2.b). Zones are the regions of the plane described by the way classes overlap: each of them contains all the elements, and only the elements, that are contained in exactly the same set $S$ of classes. For instance, if $S_{ab} = \{C_a, C_b\}$, then the corresponding zone will contain all the elements, and only the elements, that are contained in the classes $C_a$ and $C_b$, but not in any other class.
We will label each zone with the letters associated with the classes in $S$, so $Z_{ab}$² represents the zone just mentioned. More formally, we will identify $Z_{ab}$ with the set:
$$Z_{ab} = \Big( \bigcap_{C_x \in S_{ab}} C_x \Big) \cap \Big( \bigcap_{C_x \notin S_{ab}} \overline{C_x} \Big) = C_a \cap C_b \cap \overline{C_c}$$
where $\overline{C_x}$ denotes the complement of $C_x$ and the last equality is written for the three classes of fig. 2, similarly to what has been defined by other authors [3].
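The zone definition above can be illustrated with a minimal Python sketch. It assumes a classification is given as a mapping from class labels to element sets; the function name `zone_decomposition`, the labels and the elements are all made up for illustration and do not come from the paper.

```python
from collections import defaultdict


def zone_decomposition(classification):
    """Group elements by the exact set of classes that contain them.

    `classification` maps a class label (e.g. "a") to the set of its elements.
    Returns a dict mapping each non-empty zone (a frozenset of class labels)
    to the set of elements lying in exactly those classes.
    """
    # For every element, collect the labels of all classes containing it.
    membership = defaultdict(set)
    for label, elements in classification.items():
        for element in elements:
            membership[element].add(label)

    # Elements with the same label set belong to the same zone.
    zones = defaultdict(set)
    for element, labels in membership.items():
        zones[frozenset(labels)].add(element)
    return dict(zones)


# Hypothetical three-class example (labels and elements are invented).
classification = {
    "a": {1, 2, 3, 4},
    "b": {3, 4, 5},
    "c": {4, 6},
}
zones = zone_decomposition(classification)
for labels in sorted(zones, key=lambda z: (len(z), sorted(z))):
    print("".join(sorted(labels)), sorted(zones[labels]))
```

In this example the zone labelled ab contains only element 3, since element 4 also belongs to class c and therefore falls in zone abc. Only non-empty zones are produced, which matches the zones that a drawing has to show.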
Intersection graphs. From the zone decomposition
we can easily construct a graph, called intersection
graph (ﬁg. 2.c), that shows the interconnections be-
tween the classes. The graph has one node for each zone
of the diagram, and one edge for each shared boundary
between two zones.
It is possible to prove that intersection graphs and Euler diagrams have the same expressive power, and that there exists a bijection between equivalent Euler diagrams and equivalent intersection graphs [3]. This is proved by giving constructive methods to move from one structure to the other.
For the reverse operation, that is, obtaining an Euler diagram from an intersection graph, it is sufficient to work out where the class boundaries have to be drawn. For each class, we need to consider the cut that separates the class nodes from the rest of the graph and the corresponding cut edges. The set boundary can be drawn keeping in mind that it has to enclose the class nodes and intersect each cut edge (fig. 2.c).
As they are equivalent, we will use diagrams or intersection graphs interchangeably, according to which one is clearer in the specific case.
¹ We will identify classes in pictures using just the subscript, in capital letters.
² We will identify zones in pictures using just the subscript, in lower case letters.
Figure 3: (a) The complete graph $K_5$ generates an example of a diagram that is not Eulerian, as any attempt to draw it generates disconnected zones. In fact, we have to disconnect the zones d and e (see fig. 4.a) to draw the dashed link. The same graph is drawable if we allow duplicated zones. (b) An example of a graph that is not drawable without disconnecting classes, even when disconnected zones are allowed. The circular sets are all meant to be distinct. This time any attempt to draw the dashed link brings undesired overlaps, so the class E will have to remain disconnected.
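The reverse operation described in this section, where a class boundary encloses the class nodes and crosses the cut edges, can be sketched as follows. This is only an illustration: the function `class_cut` and the edge list are hypothetical, and the edges were chosen to be consistent with the caption of fig. 2 rather than read from the figure itself.

```python
def z(labels):
    """Shorthand: build a zone (graph node) from a string of class letters."""
    return frozenset(labels)


def class_cut(edges, class_label):
    """Return (class nodes, cut edges) for one class of an intersection graph.

    Nodes are zones, written as frozensets of class labels. `edges` is a
    list of node pairs. An edge is a cut edge when exactly one of its
    endpoints belongs to the class.
    """
    nodes = {zone for edge in edges for zone in edge}
    inside = {zone for zone in nodes if class_label in zone}
    cut = [(u, v) for u, v in edges if (u in inside) != (v in inside)]
    return inside, cut


# Hypothetical intersection graph on the zones a, b, c, ab, ac, abc.
edges = [
    (z("a"), z("ab")),
    (z("ab"), z("b")),
    (z("ab"), z("abc")),
    (z("ac"), z("abc")),
    (z("a"), z("ac")),
    (z("ac"), z("c")),
]

inside, cut = class_cut(edges, "b")
print(sorted("".join(sorted(n)) for n in inside))
print([("".join(sorted(u)), "".join(sorted(v))) for u, v in cut])
```

For class B the boundary has to enclose the nodes b, ab and abc and cross the edges (a, ab) and (ac, abc), which agrees with the example given in the caption of fig. 2.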
4 Euler representation
To be able to represent classifications that do not have an Euler diagram, we need to use a less restrictive structure. We will call this structure Euler representation, and we will design its properties by investigating the factors that make an EDGP instance undrawable.
Figure 4: (a) A diagram with disconnected zones, as zone a and zone b are represented by separated regions divided by the zone ab. (b) A diagram with disconnected classes, as class B has the zones ab, b separated from zone bc. (c) A diagram with disconnected zones and classes, as zone b is duplicated and the class B is disconnected.
Figure 5: Visualisation of disconnected classes. (a) The original diagram, showing the relationships we aim to represent. Let us suppose the zones ac, c are not directly reachable from the others. (b) A possible way to depict a link between separated zones of the same class. This representation does not show straight away that the class A contains ac, especially if they are positioned far apart from each other. (c) The duplication of the zone ac and its nodes. A dotted boundary is used to indicate that the zone has been cloned and not simply represented with separated regions. This representation shows in a more immediate way that the classes A and C interact with each other, and it shows all the elements of the same class in the same connected area.
Zone connectivity. According to Chow [3], a set of closed curves is an Euler diagram if every non-empty zone is represented as a connected region. Zone connectivity is the first obstacle to the existence of an Euler diagram. We can easily show EDGP instances³ that are not drawable without splitting the zones into disconnected regions (fig. 3.a), proving that zone connectivity is actually a limiting condition.
Relaxing this condition, we are in practice allowed to duplicate a zone in a different area of the diagram (fig. 4.a), as long as we keep the classes connected. Unfortunately, this is not sufficient to draw every EDGP instance, as some of them are not representable even when this constraint is dropped (fig. 3.b).
³ These difficult instances are usually built starting from non-planar graphs and mapping sets onto the graph elements in a suitable way. For instance, we can associate sets $N_i$ with the nodes and sets $E_j$ with the edges, and impose that each set $E_j$ overlaps only with the sets $N_i$ associated with the nodes incident to its edge. This implies that the edges cannot cross, otherwise we would describe an intersection between sets $E_j$ that is not defined in our model.
By Kuratowski's theorem, a graph containing a subdivision of the complete graph $K_5$ or of the complete bipartite graph $K_{3,3}$ is not planar; in other words, it cannot be drawn without crossing edges. These graphs, through the association just explained, lead to examples of undrawable Euler diagrams.
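The construction in the footnote can be sketched in Python under a few assumptions: classes are plain element sets, and the names `classification_from_graph` and `exceeds_planar_edge_bound` are hypothetical helpers, not part of the paper. Instead of a full planarity test, the sketch uses the standard edge-count bounds for simple planar graphs with at least three nodes ($e \le 3v - 6$ in general, $e \le 2v - 4$ for bipartite graphs), which are enough to certify that $K_5$ and $K_{3,3}$ are not planar.

```python
from itertools import combinations


def classification_from_graph(nodes, edges):
    """Build one class per node (N_i) and one class per edge (E_j).

    Each edge class shares one element with each of its two endpoint classes
    and with nothing else, so the only overlaps are between an edge class
    and the classes of its incident nodes.
    """
    classes = {f"N{v}": {("node", v)} for v in nodes}
    for u, v in edges:
        name = f"E{u}{v}"
        classes[name] = {("edge", u, v)}
        for endpoint in (u, v):
            shared = ("end", endpoint, u, v)
            classes[name].add(shared)
            classes[f"N{endpoint}"].add(shared)
    return classes


def exceeds_planar_edge_bound(n_nodes, n_edges, bipartite=False):
    """Edge-count test for simple graphs with at least three nodes.

    Returning True proves the graph is not planar; returning False proves
    nothing, because the bound is only a necessary condition.
    """
    bound = 2 * n_nodes - 4 if bipartite else 3 * n_nodes - 6
    return n_edges > bound


k5_nodes = range(5)
k5_edges = list(combinations(k5_nodes, 2))
k5_classes = classification_from_graph(k5_nodes, k5_edges)

print(len(k5_classes))                                  # 5 node classes + 10 edge classes
print(exceeds_planar_edge_bound(5, len(k5_edges)))      # K5: 10 edges against a bound of 9
print(exceeds_planar_edge_bound(6, 9, bipartite=True))  # K3,3: 9 edges against a bound of 8
```

The resulting classification has one zone per node, one per edge, and one per edge-endpoint overlap, so any drawing of it has to route the edge classes between the node classes without letting two edge classes meet.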
Figure 6: Another example of disconnected classes. (a) The original diagram, built without particular constraints. Let us now assume classes D and E are not reachable from A, B, C. (b) The diagram obtained when representing the zone a as two separated regions. In this case, node duplication generally does not help the comprehension of the diagram. (c) The same graph obtained by duplicating the zones ad and ae and their nodes. Again, node duplication can be made clear by using a dashed line for the boundary. Although this solution allows us to see all the nodes of the same class in a connected region, it tends to be less readable than the previous one because of the greater number of extra nodes.
Class connectivity. In Euler diagrams each class is represented by a connected region, as implied by the use of a single closed curve for each class. Again, we can see that this condition is restrictive by showing EDGP instances that are not drawable without representing classes with separated regions (fig. 3.b).
Relaxing this condition, we are allowed to draw zones that are separated from each other (fig. 4.b). Clearly we are now able to draw every EDGP instance, as we are no longer forced to link zones together.
Representation characteristics. From the previous analysis, we can deduce that Euler representations should allow classes to be represented by separated regions, if necessary. Disconnecting zones does not seem to be necessary, but it sometimes allows us to obtain more readable diagrams. For the same reason we might also decide to duplicate a zone, creating a copy of the zone in another region of the graph and cloning all its elements.
In summary, Euler representations are characterised by:
- classes that are not necessarily connected,
- zones that are not necessarily connected and possibly even cloned,
where the use of these patterns is limited as much as possible.
Figure 7: Some tests on the intersection graph. (a) Checking that all the class schemas are connected. (b) Checking that the subgraphs induced by the nodes containing any possible subset of classes are connected. Here the subgraphs induced by the nodes containing ab, ac, and abc are shown. Together with the ones containing a, b, and c (which correspond exactly to the class schemas in the previous picture), they are all the possible non-empty subgraphs of this kind. (c) Checking that the complementary class schemas are connected. At this point we also need to consider a node associated with the external area, which will always be part of the complementary class schemas.
Some examples of the application of these methods are shown in fig. 5 and fig. 6. In particular, fig. 5.b shows just a disconnected class, fig. 6.b also a disconnected zone, and fig. 5.c and fig. 6.c examples of zone cloning.
5 Properties of the intersection graph
Because of the constraints we relaxed and the new structures we introduced, we have a high degree of freedom in representing diagrams. Choosing the most readable representation among all the possible ones first requires identifying the properties that matter most for diagram comprehension. Assuming that Euler diagrams are more readable than general Euler representations, we need to try to:
1. avoid every undesired overlap. This is an indis-
putable point as we aim to draw just the non empty
zones.
2. keep the classes as connected as possible. This will
avoid having classes represented as separated re-
gions (ﬁg. 4.b) in the ﬁnal diagram.
3. keep the single zones as connected as possible. This avoids zones being represented by more than one region (fig. 4.a), which makes it difficult to understand the exact interaction of the zones with the rest of the diagram.
4. keep even zones that share the same subset of la-
bels as connected as possible. This avoids discon-
nected overlaps between the same sets (ﬁg. 1.b), as
they make it difﬁcult to trace how the intersection
between classes is divided in the several zones.
5. avoid holes in the classes. Diagrams with holes
(ﬁg. 1.c) can generate confusion between holes and
set inclusions.
6. make classes assume a smooth and regular shape.
As we will in practice work with embeddings of the intersection graph, it is extremely useful to see how the previous diagram properties translate into graph embedding properties:
1. make the intersection graph planar.
2. make the subgraphs induced by the nodes of the
same class connected (ﬁg. 7.a). We will call these
induced subgraphs class schemas.
3. avoid node duplications in the intersection graph. In other words, use node duplication only as far as needed to satisfy the previous points.
4. make the subgraphs induced by the nodes sharing any given subset of classes connected, rather than just those induced by the nodes of a single class (fig. 7.b).
5. make the subgraph induced by the nodes outside each class schema connected (fig. 7.c). We will call these induced subgraphs complementary class schemas. It is also necessary to add a node associated with the null zone, corresponding to the external area. As this node is never part of a class schema, it is always in the complementary one.
6. place nodes in an area of the plane that is as compact and regular as possible.
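The connectivity tests of points 2, 4 and 5 can be sketched with a plain breadth-first search. This is a minimal illustration, not the authors' implementation: the names `is_connected` and `connectivity_report` and the example graph are hypothetical, zones are modelled as frozensets of class labels, and the external area is assumed to be the empty frozenset. Planarity (point 1) is not checked here.

```python
from collections import deque
from itertools import combinations


def z(labels):
    """Shorthand: build a zone (graph node) from a string of class letters."""
    return frozenset(labels)


def is_connected(nodes, edges):
    """True if the subgraph induced by `nodes` is connected (or empty)."""
    nodes = set(nodes)
    if not nodes:
        return True
    adjacency = {n: set() for n in nodes}
    for u, v in edges:
        if u in nodes and v in nodes:
            adjacency[u].add(v)
            adjacency[v].add(u)
    start = next(iter(nodes))
    seen, queue = {start}, deque([start])
    while queue:
        for neighbour in adjacency[queue.popleft()] - seen:
            seen.add(neighbour)
            queue.append(neighbour)
    return seen == nodes


def connectivity_report(zones, edges):
    """Run the tests of points 2, 4 and 5 on an intersection graph.

    `zones` must include the empty frozenset (external area) so that the
    complementary class schemas are tested as described in point 5.
    """
    labels = sorted(set().union(*zones))
    report = {"class": {}, "subset": {}, "complement": {}}
    for size in range(1, len(labels) + 1):
        for subset in combinations(labels, size):
            members = [n for n in zones if set(subset) <= n]
            report["subset"]["".join(subset)] = is_connected(members, edges)
    for label in labels:
        report["class"][label] = report["subset"][label]
        outside = [n for n in zones if label not in n]
        report["complement"][label] = is_connected(outside, edges)
    return report


# Hypothetical graph on the zones a, b, c, ab, ac, abc plus the outside.
zones = [z(""), z("a"), z("b"), z("c"), z("ab"), z("ac"), z("abc")]
edges = [
    (z(""), z("a")), (z(""), z("b")), (z(""), z("c")),
    (z("a"), z("ab")), (z("ab"), z("b")), (z("ab"), z("abc")),
    (z("ac"), z("abc")), (z("a"), z("ac")), (z("ac"), z("c")),
]
report = connectivity_report(zones, edges)
print(report)

# Dropping the edge (ab, b) isolates node b inside the class schema of B.
broken = [e for e in edges if e != (z("ab"), z("b"))]
print(connectivity_report(zones, broken)["class"])
```

A failed test does not say how to repair the graph. It only identifies which class, subset of classes, or complementary schema would end up drawn as separated regions.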
Figure 8: An Euler representation example. The diagram structure is not planar (it has a $K_5$ minor), so it cannot be represented with Euler diagrams. The proposed Euler representation uses zone duplication for bdg and dh, and has a disconnected class A.
Algorithm design. An algorithm that aims to find a good Euler representation has to identify an intersection graph satisfying the previous points as far as possible, in order of importance. The most immediate way consists of identifying all the zones of the given classification, associating one node of the intersection graph with each of them, and carefully selecting the edges to insert.
Node duplication, which corresponds to allowing a zone to be disconnected, can be used when it is no longer possible to select useful edges in the graph. Disconnected class nodes will correspond, instead, to disconnected classes. Choosing to leave them disconnected, or to use node duplications to connect them, is a matter of judgement. As we saw, it depends on the specific case and on the specific relation one aims to represent.
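As a rough illustration of the edge-selection step, the sketch below greedily picks edges that connect each class schema and reports the classes that stay disconnected. It is not the authors' algorithm. The function `select_edges`, the union-find helper and the rule used to generate candidate edges in the example (zones whose label sets differ by exactly one class) are all assumptions made for this sketch, and planarity is deliberately not checked.

```python
from itertools import combinations


def z(labels):
    """Shorthand: build a zone (graph node) from a string of class letters."""
    return frozenset(labels)


def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def select_edges(zones, candidates):
    """Greedy sketch: choose candidate edges that connect class schemas.

    `candidates` is the list of edges the caller is willing to insert.
    Returns (selected edges, classes whose schema stays disconnected).
    Planarity is NOT checked here.
    """
    labels = sorted(set().union(*zones))
    selected, disconnected = set(), []
    for label in labels:
        members = [n for n in zones if label in n]
        parent = {n: n for n in members}
        # Prefer edges already chosen for other classes, then new ones.
        ordered = ([e for e in candidates if e in selected]
                   + [e for e in candidates if e not in selected])
        for u, v in ordered:
            if u in parent and v in parent:
                root_u, root_v = find(parent, u), find(parent, v)
                if root_u != root_v:
                    parent[root_u] = root_v
                    selected.add((u, v))
        if len({find(parent, n) for n in members}) > 1:
            disconnected.append(label)
    return selected, disconnected


zones = [z("a"), z("b"), z("c"), z("ab"), z("ac"), z("abc")]
# Assumed candidate rule: zones whose label sets differ by one class.
candidates = [(u, v) for u, v in combinations(zones, 2) if len(u ^ v) == 1]

selected, disconnected = select_edges(zones, candidates)
print(len(selected), disconnected)

# If no edge towards zone c may be used, class C cannot be connected.
restricted = [e for e in candidates if z("c") not in e]
print(select_edges(zones, restricted)[1])
```

The classes returned as disconnected are the ones for which, as discussed above, one has to decide between leaving the class in separated regions and duplicating nodes.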
6 Conclusions
We started by introducing several ways of defining Euler diagrams, showing or referencing proofs of their inability to represent every classification. We then analysed why Euler diagrams cannot always be drawn, pointing out two separate reasons that might impede this process.
This analysis allowed us to detect some methods to
show otherwise unrepresentable relationships. Using
disconnected regions for classes, and graphically link-
ing them together, is the simplest approach. We saw
that this always works, but that the results are not neces-
sarily the best possible. Another option we pointed out
consists of representing some zones as disconnected re-
gions. This might help to reduce the number of ﬁctional
links we need to introduce. A last possibility is to clone
a whole zone in another part of the graph, cloning even
the nodes of the zone. This helps in particular when
we want to keep all the nodes of a class in the same con-
nected region, even when the overlapping classes are not
directly connected to each other.
Finally, we analysed the way each condition is expressed in the intersection graph. Structure graphs of this kind are the first step of most approaches to Euler diagram generation. Knowing how the previous patterns are mapped onto these graphs is essential to decide how, when and where to use them. A general algorithmic approach has also been outlined, while concrete implementations of it need to define the necessary metrics according to the particular application.
References
[1] Gary D. Bader and Christopher W.V. Hogue. An automated method for finding molecular complexes in large protein interaction networks. January 13, 2003.
[2] Florence Benoy and Peter Rodgers. Evaluating the comprehension of Euler diagrams. In IV, pages 771–780. IEEE Computer Society, 2007.
[3] Stirling Christopher Chow. Generating and drawing area-proportional Euler and Venn diagrams. PhD thesis, 2007.
[4] Leonhard Euler. Lettres à une princesse d'Allemagne, letters no. 102-108, 1761.
[5] Jean Flower and John Howse. Generating Euler diagrams. Lecture Notes in Computer Science, 2317, 2002.
[6] D.S. Johnson and H.O. Pollak. Hypergraph planarity and the complexity of drawing Venn diagrams. Journal of Graph Theory, 11(3):309–325, 1987.
[7] Gergely Palla, Albert-László Barabási, and Tamás Vicsek. Quantifying social group evolution. Nature, 446(7136):664–667, 2007.
[8] John Venn. On the diagrammatic and mechanical representation of propositions and reasonings, 1880.
[9] Anne Verroust and Marie-Luce Viaud. Ensuring the drawability of extended Euler diagrams for up to 8 sets. In Diagrammatic Representation and Inference, Third International Conference, Diagrams 2004, Cambridge, UK, Lecture Notes in Computer Science. Springer.
hal-00319119, version 1 - 5 Sep 2008
... Techniques for time visualization are less diverse than set visualization, being broadly classified into linear and cyclical methods. The survey then overviewed the few existing visualization techniques that can claim to visualize both time and sets: TimeSets [6], Time-Sets [5], Hypenet [8], Bubble Sets [3], Dynamic Euler Diagrams [7], Linear Representations [9], and Circos [4]. ...
... In the graph-drawing community, most attention has been afforded to hypergraph supports [9] for both fixed and free vertex locations, e.g. [1,4,5,8]. ...
... There are some connections to traversing a path in the Euler diagram. But not all sets can be represented by Euler diagrams [8]. There is more background about "well formed Euler diagrams" and what can and cannot be done in these papers [5,10]. ...
Article
This report documents the program and the outcomes of Dagstuhl Seminar 19192 “Visual Analytics for Sets over Time and Space”, which brought together 29 researchers working on visualization(i) from a theoretical point of view (graphdrawing, computational geometry, and cognition (ii) from a temporal point of view (visual analytics and information visualization overtime, HCI), and (iii) from a space-time point of view (cartography, GIScience). The goal of the seminar was to identify speciﬁc theoretical and practical problems that need to be solved in order to create dynamic and interactive set visualizations that take into account time and space, and to begin working on these problems. The ﬁrst 1.5 days were reserved for overview presentations from representatives of the diﬀerent communities, for presenting open problems, and for forming interdisciplinary working groups that would focus on some of the identiﬁed open problems as a group. There were three survey talks, ten short talks, and one panel with three contributors. The remaining three days consisted of open mic sessions, working-group meetings, and progress reports. Five working groups were formed that investigated several of the open research questions. Abstracts of the talks and a report from each working group are included in this report.
... Calculating Euler diagrams gets more complicated and difficult when the complexity of the diagram grows [19]. The complexity is defined by the amount of sets in the intersection sets in U , which can be as high as 2 n , whereby n equals the amount of sets in M . ...
... More issues become apparent when the complexity increases. To verify if an Euler or Euler-like diagram (such as described in e.g., [19]) is correctly drawn, an agreement about the visualization is needed. The following points need to be fulfilled by the diagram in order to be recognized as correctly drawn: ...
... Another way to create drawable Euler diagrams is to split or clone sets [19]. The newly created or cloned sets can then be drawn as several geometrical forms that are not intersecting each other. ...
Conference Paper
Full-text available
A vast majority of internet users has adopted new ways and possibilities of interaction and information exchange on the social web. Individuals are becoming accustomed to contribute and express their opinion on various platforms and websites. Commercial online polls allow operators of online newspapers, blogs and other forms of media sites to provide such services to their users. Consequently, their popularity is rapidly increasing and more and more potential areas of application emerge. However, in most cases the expressed opinions are stored and displayed without any further actions and the knowledge that lies in the answers is discarded. This research paper explores the possibilities, advantages and limits of applying semantic technologies to these online polls. For this purpose, a list of requirements was assembled and possible system architectures for semantic knowledgebases were investigated with the focus on providing consistent and extensive data for further processing. In a next step, the current state of the art of relevant visualization technologies was analyzed and further research challenges were identified. Our results discuss possible applications within the scope of a challenging case study. A comprehensive data pool provided by our industry partner allows for testing various improvements to user experience and traction of the polling system.
... [2,7,8,15]), and those where the positions can be chosen by the layout algorithm (e.g. [10,18,19]). For a more detailed overview and in-depth classification of set visualization methods we refer to the survey by Alsallakh et al. [4]. ...
Article
Motivated by a new way of visualizing hypergraphs, we study the following problem. Consider a rectangular grid and a set of colors $\chi$. Each cell $s$ in the grid is assigned a subset of colors $\chi_s \subseteq \chi$ and should be partitioned such that for each color $c\in \chi_s$ at least one piece in the cell is identified with $c$. Cells assigned the empty color set remain white. We focus on the case where $\chi = \{\text{red},\text{blue}\}$. Is it possible to partition each cell in the grid such that the unions of the resulting red and blue pieces form two connected polygons? We analyze the combinatorial properties and derive a necessary and sufficient condition for such a painting. We show that if a painting exists, there exists a painting with bounded complexity per cell. This painting has at most five colored pieces per cell if the grid contains white cells, and at most two colored pieces per cell if it does not.
... The problem of drawing Euler diagrams has been studied recently for both cases when the locations of the elements can be freely chosen (see e.g. [17,18]) and when the elements have to be drawn at fixed positions (see e.g. [3,8,9,10,15]). ...
Article
Consider a set of n points in the plane, each one of which is colored either red, blue, or purple. A red-blue-purple spanning graph (RBP spanning graph) is a graph whose vertices are the points and whose edges connect the points such that the subgraph induced by the red and purple points is connected, and the subgraph induced by the blue and purple points is connected. The minimum RBP spanning graph problem is to find an RBP spanning graph with minimum total edge length. First we consider this problem for the case when the points are located on a circle. We present an algorithm that solves this problem in O(n²) time, improving upon the previous algorithm by a factor of Θ(n). Also, for the general case we present an algorithm that runs in O(n⁵) time, improving upon the previous algorithm by a factor of Θ(n).
Article
We study a problem motivated by sparse set visualization. Given n points in the plane, each labeled with one or more primary colors, a colored spanning graph (for short, CSG) is a graph in which the vertices of each primary color induce a connected subgraph. The Min-CSG problem asks for the minimum sum of edge lengths in a colored spanning graph. We show that the problem is NP-hard for k primary colors when k≥3 and provide a (2−13+2ϱ)-approximation algorithm for k=3 that runs in polynomial time, where ϱ is the Steiner ratio. Further, we give an O(n) time algorithm in the special case that the given points are collinear and k is constant.
Article
A set diagram represents the membership relation among data elements. It is often visualized as secondary information on top of primary information, such as the spatial positions of elements on maps and charts. Visualizing the temporal evolution of such set diagrams as well as their primary features is quite important; however, conventional approaches have only focused on the temporal behavior of the primary features and do not provide an effective means to highlight notable transitions within the set relationships. This paper presents an approach for generating a stepwise animation between set diagrams by decomposing the entire transition into atomic changes associated with individual data elements. The key idea behind our approach is to optimize the ordering of the atomic changes such that the synthesized animation minimizes unwanted set occlusions by considering their depth ordering and reduces the gaze shift between two consecutive stepwise changes. Experimental results and a user study demonstrate that the proposed approach effectively facilitates the visual identification of the detailed transitions inherent in dynamic set diagrams.
Article
We study an algorithmic problem that is motivated by ink minimization for sparse set visualizations. Our input is a set of points in the plane which are either blue, red, or purple. Blue points belong exclusively to the blue set, red points belong exclusively to the red set, and purple points belong to both sets. A red-blue-purple spanning graph (RBP spanning graph) is a set of edges connecting the points such that the subgraph induced by the red and purple points is connected, and the subgraph induced by the blue and purple points is connected. We study the geometric properties of minimum RBP spanning graphs and the algorithmic problems associated with computing them. Specifically, we show that the general problem can be solved in polynomial time using matroid techniques. In addition, we discuss more efficient algorithms for the case in which points are located on a line or a circle, and also describe a fast $(\frac{1}{2}\rho+1)$-approximation algorithm, where $\rho$ is the Steiner ratio.
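The RBP spanning graph definition in the abstract above can be checked mechanically. The following is a minimal sketch, not the authors' algorithm: it only tests whether a given edge set satisfies the two connectivity requirements and does not minimise edge length. The names `induced_connected` and `is_rbp_spanning_graph`, and the representation of points as a dictionary from point name to colour, are assumptions made for illustration.

```python
from collections import defaultdict


def induced_connected(points, edges, allowed):
    """Return True if the subgraph induced by the points whose colour
    is in `allowed` is connected (hypothetical helper)."""
    nodes = {p for p, colour in points.items() if colour in allowed}
    if len(nodes) <= 1:
        return True
    # Keep only edges with both endpoints inside the induced vertex set.
    adj = defaultdict(set)
    for u, v in edges:
        if u in nodes and v in nodes:
            adj[u].add(v)
            adj[v].add(u)
    # Depth-first search from an arbitrary induced vertex.
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes


def is_rbp_spanning_graph(points, edges):
    """Check both conditions: red+purple connected and blue+purple connected."""
    return (induced_connected(points, edges, {"red", "purple"})
            and induced_connected(points, edges, {"blue", "purple"}))


points = {"a": "red", "b": "purple", "c": "blue"}
print(is_rbp_spanning_graph(points, [("a", "b"), ("b", "c")]))
print(is_rbp_spanning_graph(points, [("a", "c")]))
```

In the first call the purple point links both colour classes, so both induced subgraphs are connected. In the second call the only edge joins a red point to a blue point, so it belongs to neither induced subgraph.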
Conference Paper
Since an image can easily be modeled by its adjacency graph, graph theory and algorithms on graphs are widely used in image processing. Of particular interest are the problems of estimating the number of the maximal cliques in a graph and designing algorithms for their computation, since these are found relevant to various applications in image processing and computer graphics. In the present paper we study the maximal clique problem on intersection graphs of convex polygons, which are also applicable to imaging sciences. We present results which refine or improve some of the results recently proposed in [18]. Thus, it was shown therein that an intersection graph of n convex polygons whose sides are parallel to k different directions has no more than $n^{2k}$ maximal cliques. Here we prove that the number of maximal cliques does not exceed $n^k$. Moreover, we show that this bound is tight for any fixed k. Algorithmic aspects are discussed as well.
Conference Paper
As an image can easily be modeled by its adjacency graph, graph theory and algorithms on graphs are widely used in imaging sciences. In this paper we define a knapsack graph, which is an intersection graph of integer translates of knapsack polygons, and consider the maximal clique problem on such graphs. A major application of intersection graphs is found in visualization of relations among objects in a scene. Efficient algorithms for the maximal clique problem are applicable to problems of computer graphics and image analysis, while properties of the knapsack polygon have been used in obtaining theoretical results in discrete geometry for computer imagery. We first show that the maximal clique problem on knapsack graphs is equivalent to the maximal clique problem on intersection graphs of homothetic right triangles. The latter was shown to be equivalent to the maximal clique problem on max-tolerance graphs and solvable in optimal $O(n^3)$ time [28]. Thus, if the linear constraints defining the knapsack polygons are known, then the maximal clique problem on knapsack graphs can be solved using the algorithm from [28]. If the polygons are given by lists of their vertices and the defining constraints are unknown, we show how these can be found efficiently in computation time bounded by a low-degree polynomial in the size of the polygons.
Conference Paper
This paper shows by a constructive method the existence of a diagrammatic representation called extended Euler diagrams for any collection of sets $X_1,\dots,X_n$, $n<9$. These diagrams are adapted for representing set inclusions and intersections: each set $X_i$ and each non-empty intersection of a subcollection of $X_1,\dots,X_n$ is represented by a unique connected region of the plane. Starting with an abstract description of the diagram, we define the dual graph G and reason with the properties of this graph to build a planar representation of the $X_1,\dots,X_n$. These diagrams will be used to visualize the results of a complex request on any indexed video database. In fact, such a representation allows the user to perceive simultaneously the results of his query and the relevance of the database according to the query.
Article
Recent advances in proteomics technologies such as two-hybrid, phage display and mass spectrometry have enabled us to create a detailed map of biomolecular interaction networks. Initial mapping efforts have already produced a wealth of data. As the size of the interaction set increases, databases and computational methods will be required to store, visualize and analyze the information in order to effectively aid in knowledge discovery. This paper describes a novel graph theoretic clustering algorithm, "Molecular Complex Detection" (MCODE), that detects densely connected regions in large protein-protein interaction networks that may represent molecular complexes. The method is based on vertex weighting by local neighborhood density and outward traversal from a locally dense seed protein to isolate the dense regions according to given parameters. The algorithm has the advantage over other graph clustering methods of having a directed mode that allows fine-tuning of clusters of interest without considering the rest of the network and allows examination of cluster interconnectivity, which is relevant for protein networks. Protein interaction and complex information from the yeast Saccharomyces cerevisiae was used for evaluation. Dense regions of protein interaction networks can be found, based solely on connectivity data, many of which correspond to known protein complexes. The algorithm is not affected by a known high rate of false positives in data from high-throughput interaction techniques. The program is available from ftp://ftp.mshri.on.ca/pub/BIND/Tools/MCODE.
Article
The rich set of interactions between individuals in society results in complex community structure, capturing highly connected circles of friends, families or professional cliques in a social network. Thanks to frequent changes in the activity and communication patterns of individuals, the associated social and communication network is subject to constant evolution. Our knowledge of the mechanisms governing the underlying community dynamics is limited, but is essential for a deeper understanding of the development and self-optimization of society as a whole. We have developed an algorithm based on clique percolation that allows us to investigate the time dependence of overlapping communities on a large scale, and thus uncover basic relationships characterizing community evolution. Our focus is on networks capturing the collaboration between scientists and the calls between mobile phone users. We find that large groups persist for longer if they are capable of dynamically altering their membership, suggesting that an ability to change the group composition results in better adaptability. The behaviour of small groups displays the opposite tendency: the condition for stability is that their composition remains unchanged. We also show that knowledge of the time commitment of members to a given community can be used for estimating the community's lifetime. These findings offer insight into the fundamental differences between the dynamics of small groups and large institutions.
Article
An Euler diagram $C = \{c_1, c_2, \dots, c_n\}$ is a collection of n simple closed curves (i.e., Jordan curves) that partition the plane into connected subsets, called regions, each of which is enclosed by a unique combination of curves. Typically, Euler diagrams are used to visualize the distribution of discrete characteristics across a sample population; in this case, each curve represents a characteristic and each region represents the sub-population possessing exactly the combination of containing curves’ properties. Venn diagrams are a subclass of Euler diagrams in which there are $2^n$ regions representing all possible combinations of curves (e.g., two partially overlapping circles). In this dissertation, we study the Euler Diagram Generation Problem (EDGP), which involves constructing an Euler diagram with a prescribed set of regions. We describe a graph-theoretic model of an Euler diagram’s structure and use this model to develop necessary-and-sufficient existence conditions. We also use the graph-theoretic model to prove that the EDGP is NP-complete. In addition, we study the related Area-Proportional Euler Diagram Generation Problem (ω-EDGP), which involves
Article
We introduce two new notions of planarity for hypergraphs based on dual generalizations of the standard Venn diagram. These definitions are illustrated by results concerning the existence and nonexistence of such diagrams for certain classes of hypergraphs. We conclude by showing that the general problem of determining whether such diagrams exist is NP-complete.
Conference Paper
This article describes an algorithm for the automated generation of any Euler diagram starting with an abstract description of the diagram. An automated generation mechanism for Euler diagrams forms the foundations of a generation algorithm for notations such as Harel's higraphs, constraint diagrams and some of the UML notation. An algorithm to generate diagrams is an essential component of a diagram tool for users to generate, edit and reason with diagrams. The work makes use of properties of the dual graph of an abstract diagram to identify which abstract diagrams are "drawable" within given wellformedness rules on concrete diagrams. A Java program has been written to implement the algorithm and sample output is included.
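To make the notion of an abstract description more concrete, the sketch below models one as a collection of zones, where each zone is the set of contour labels that contain it. This encoding and the function name `contour_relation` are assumptions for illustration only. The sketch does not test drawability and is not the generation algorithm described in the abstract above; it only reads off how two contours relate, given the zones present.

```python
def contour_relation(zones, a, b):
    """Classify how contours `a` and `b` relate, given the zones of an
    abstract description (each zone is a set of contour labels)."""
    only_a = any(a in z and b not in z for z in zones)
    only_b = any(b in z and a not in z for z in zones)
    both = any(a in z and b in z for z in zones)
    if not both:
        return "disjoint"      # no zone lies inside both contours
    if only_a and only_b:
        return "overlap"       # each contour has a part outside the other
    if only_a:
        return "b inside a"    # every zone of b is also in a
    if only_b:
        return "a inside b"    # every zone of a is also in b
    return "equal"             # the contours enclose exactly the same zones


# Hypothetical abstract description: B nested in A, C separate from both.
zones = [frozenset(), frozenset({"A"}), frozenset({"A", "B"}), frozenset({"C"})]
print(contour_relation(zones, "A", "B"))
print(contour_relation(zones, "A", "C"))
```

The empty zone stands for the region outside every contour.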
Conference Paper
We describe an empirical investigation into layout criteria that can help with the comprehension of Euler diagrams. Euler diagrams are used to represent set inclusion in applications such as teaching set theory, database querying, software engineering, filing system organisation and bio-informatics. Research in automatically laying out Euler diagrams for use with these applications is at an early stage, and our work attempts to aid this research by informing layout designers about the importance of various Euler diagram aesthetic criteria. The three criteria under investigation were: contour jaggedness, zone area inequality and edge closeness. Subjects were asked to interpret diagrams with different combinations of levels for each of the criteria. Results for this investigation indicate that, within the parameters of the study, all three criteria are important for understanding Euler diagrams and we have a preliminary indication of the ordering of their importance.