
# Visualise Undrawable Euler Diagrams


## Abstract

Given a group of overlapping sets, it is not always possible to represent it with Euler diagrams. Euler diagram characteristics might conflict with the set relationships to depict, making it impossible to produce a correct drawing. In order to show a greater class of instances, Euler diagrams have been extended to allow more general patterns, but so far none of the most common definitions can represent all the possible connections between sets. We aim to introduce methods and constructions to produce a clear representation, as close as possible to Euler diagrams, even for sets that are not formally drawable in that way. We investigate the reasons that make a diagram undrawable, in order to evaluate how and when to apply the mentioned structures, and to give the foundations necessary to design algorithms for this purpose.
Paolo Simonetto, David Auber
LaBRI, Université Bordeaux I
paolo.simonetto@labri.fr, auber@labri.fr
May 8, 2008
Keywords: Euler diagrams, overlapping clustering
1 Introduction
Euler diagrams are the most natural and widely used way to depict sets and their reciprocal relationships. They consist of an association between regions of the plane and the abstract sets, where the topological concepts of inclusion, exclusion and overlap of these regions are used to represent the analogous set relationships (fig. 1).

As these diagrams were introduced by Euler through examples, there is no complete agreement about their formal definition. Some topological characteristics (such as the shape of the sets or the way they intersect) might be identified either as essential traits or as merely aesthetic ones, leading authors to propose different definitions.
Drawing Euler diagrams is a difficult task. The potential complexity of the diagram grows exponentially with the number of sets represented, as $n$ sets might form up to $2^n$ intersections. For this reason, drawing Euler diagrams is challenging even for instances of a dozen sets.
Figure 1: Euler diagrams as studied by different authors. (a) shows Euler diagrams as studied by Flower and Howse. They do not permit overlapping lines or multiple-line crossing points. (b) shows the more general Euler diagrams studied by Chow. Multiple-line crossing points and overlapping boundaries are allowed, as well as disconnected overlaps between the same sets. (c) shows Extended Euler diagrams (EED) as defined by Verroust and Viaud. Holes inside the sets are permitted.
Euler diagrams and clustering. The main practical
aim of our research is to visualise overlapping cluster-
ing in a clear way. Large telecommunication networks,
biological and social networks, ﬁnancial data, are usu-
ally represented as graphs and visualised through em-
bedding of graphs. Grouping elements in these graphs
exactly corresponds to deﬁning set combinations, and
the visualisation of these sets can be achieved using Eu-
ler diagrams.
Even if clustering is classically intended as partitioning the elements, overlapping clustering is an interesting approach in many fields. Algorithms producing possibly overlapping sets have been defined, for instance, for analysing social networks or protein-protein interaction networks.
In order to visualise each clustering detected, we
need to ensure we are always able to represent overlap-
ping sets. Standard Euler diagram deﬁnitions are not
able to represent all the possible set conﬁgurations, as
some of them have a topological structure that inevitably
violates the basic diagram rules.
hal-00319119, version 1 - 5 Sep 2008
Author manuscript, published in "12th International Conference on Information Visualisation, London : United Kingdom (2008)"
DOI : 10.1109/IV.2008.78
In section 2 we will present some work related to Euler diagrams. In particular, we will describe in more depth the problem of drawing Euler diagrams, and we [...] comprehension. In section 3, we will describe the relation between Euler diagrams and graphs, and we will introduce some useful definitions. In section 4 we will introduce the Euler representation, the kind of diagram we will use to overcome standard Euler diagram limitations. Finally, in section 5 we will analyse how it is practically possible to manage the characteristics of Euler representations, and we illustrate a possible algorithm [...]
2 Related work
Euler first used these diagrams for reasoning on categorical propositions and syllogisms. John Venn also studied Euler diagrams as a tool for logical reasoning, proposing a particular subclass of them subsequently called Venn diagrams. Nowadays, Euler diagrams are widely and more frequently used in the field of set theory. Answering problems related to their existence and drawability has become crucially important.
The problem of identifying and drawing a Euler dia-
gram is called the Euler Diagram Generation Problem
(EDGP). The usual way to approach this problem goes
through the detection of the topological structure of the
intersections between the sets, the creation of a skele-
ton graph and the identiﬁcation of a planar embedding
on the plane. The several approaches to EDGP differ
in the input given and the properties of returned Euler
diagrams.
Euler diagram deﬁnitions. Flower and Howse 
developed a method to obtain a clear and simple sub-
class of Euler diagrams (ﬁg. 1.a). In this class, the lines
of the diagram do not overlap and intersect just pair-
wise. Although these limitations create nicer diagrams,
they are not merely aesthetic, as they reduce the range
of the representable instances.
EDGP has also been studied as planarization of hy-
pergraphs . Hypergraphs are graphs in which edges
are identiﬁed as generic subsets of nodes, rather than
couples of them. Drawing hypergraphs in their vertex-
based planar representation has been proved to be equiv-
alent to the generation problem of a class of Euler-like
diagrams (EED, ﬁg. 1.c) by Verroust and Viaud .
They introduced this class of diagrams, which can be informally thought of as Euler diagrams that might contain holes, and proved that they are always drawable when representing eight or fewer intersecting sets.
Figure 2: From the Euler diagram to the intersection graph. (a) the original diagram. (b) identification of the diagram zones. (c) the resulting intersection graph. The dashed line is not part of the graph, but shows how to reverse the procedure. To draw the boundary of the class B we need to enclose the nodes b, ab, abc and intersect the edges (a, ab), (ac, abc).
Possibly the most exhaustive analysis of general Euler diagrams (fig. 1.b) and their representation has been done by Stirling C. Chow in his PhD thesis. Chow analysed the drawability of Euler diagrams in several different cases, although his work was essentially focused on the corresponding problems for area-proportional Euler diagrams.
Representable instance classes. Each of the quoted approaches is able to depict a different class of instances. Euler diagrams as defined by Flower and Howse (fig. 1.a) cannot represent, for instance, the diagram in fig. 1.b. The class of instances representable by those simple Euler diagrams is actually a proper subset of the instances representable by Euler diagrams as defined by Chow (fig. 1.b). In his work, Chow also showed how Euler diagrams are a proper subset of Euler-like diagrams like EED (fig. 1.c). Unfortunately, even EED can represent just a proper subset of all the possible instances of EDGP.
None of the previous approaches is suitable for representing general groups of overlapping sets, unless we accept having no output for non-representable instances.
Diagram readability. As we will necessarily have to bend some rules of well-defined Euler diagrams, it is essential to understand which characteristics are most important for their comprehension. Benoy and Rodgers [...]ity of Euler diagrams according to three aesthetic parameters: set boundary irregularity, zone area inequality, boundary closeness. They found evidence that all of them strongly contribute to diagram comprehension.
3 Euler diagrams and graphs
Even if it is possible to deﬁne Euler diagrams in a
mathematical and formal way, working directly with
them is quite complicated. For this reason, Euler dia-
grams are usually studied and analysed as graphs.
We will represent diagrams as graphs in a way which
is quite common in literature. This way is illustrated in
ﬁg. 2 and consists in the construction of what we will
call intersection graphs.
We start with a collection of sets to represent. These sets are defined independently of each other on a set of elements, so they will generally overlap. We will indicate this collection with $C = \{C_a, C_b, \ldots\}$ (footnote 1). To avoid confusion with the more common word “set”, we will call each $C_x$ a class and $C$ itself a classification.
Zone decomposition. Starting from a Euler diagram, it is possible to divide it into zones (fig. 2.b). Zones are the regions of the plane described by the way classes overlap: each of them contains all the elements, and only the elements, that are contained in exactly the same set $S$ of classes. For instance, if $S_{ab} = \{C_a, C_b\}$, then the relative zone will contain all and only the elements that are contained in the classes $C_a$ and $C_b$, but not in the others.
We will label each of the zones with the letters associated to the classes in $S$, so $Z_{ab}$ (footnote 2) represents the mentioned zone. More formally, we will identify $Z_{ab}$ with the set:

$$Z_{ab} = \left( \bigcap_{C_x \in S_{ab}} C_x \right) \cap \left( \bigcap_{C_x \notin S_{ab}} \overline{C_x} \right) = C_a \cap C_b \cap \overline{C_c}$$

similarly to what has been defined by other authors.
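The zone definition above can be computed directly from a classification. The following is a minimal Python sketch, not the authors' implementation: the function name `zone_decomposition` and the toy classes are assumptions made for illustration. Each element is assigned to the zone labelled by exactly the classes that contain it, so only non-empty zones appear in the result.

```python
from collections import defaultdict

def zone_decomposition(classification):
    """Map each non-empty zone label (a frozenset of class names) to its elements."""
    # For every element, collect the names of the classes that contain it.
    membership = defaultdict(set)
    for name, members in classification.items():
        for element in members:
            membership[element].add(name)
    # Elements contained in exactly the same set of classes share a zone.
    zones = defaultdict(set)
    for element, names in membership.items():
        zones[frozenset(names)].add(element)
    return dict(zones)

# Hypothetical classification with three classes a, b, c.
classification = {
    "a": {1, 2, 3, 4},
    "b": {3, 4, 5},
    "c": {4, 6},
}
zones = zone_decomposition(classification)
for label in sorted(zones, key=lambda s: (len(s), sorted(s))):
    print("".join(sorted(label)), sorted(zones[label]))
```

In this toy instance the zone ab contains only element 3: element 4 also belongs to class c, so it falls in zone abc instead.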
Intersection graphs. From the zone decomposition
we can easily construct a graph, called intersection
graph (ﬁg. 2.c), that shows the interconnections be-
tween the classes. The graph has one node for each zone
of the diagram, and one edge for each shared boundary
between two zones.
It is possible to prove that intersection graphs and Euler diagrams have the same expressive power, and that there exists a bijection between equivalent Euler diagrams and equivalent intersection graphs. This is proved by showing constructive methods to move from one structure to the other.
Footnote 1: We will identify classes in pictures using just the subscript in capital letters.

Footnote 2: We will identify zones in pictures using just the subscript in lower case letters.

Figure 3: (a) the complete graph $K_5$ generates an example of a diagram that is not Eulerian, as any attempt to draw it generates disconnected zones. In fact, we will have to disconnect the zones d and e (see fig. 4.a) to draw the dashed link. The same graph is drawable if we allow duplicated zones. (b) an example of a graph that is not drawable without disconnecting classes, even if allowing disconnected zones. The circular sets are all meant to be distinct. This time any attempt to draw the dashed link brings undesired overlaps, so the class E will have to remain disconnected.

For the reverse operation, that is obtaining a Euler diagram from an intersection graph, it is sufficient to work out where the class boundaries have to be drawn. For each class, we need to consider the cut of the class nodes and the corresponding cutting edges. The set boundary can be drawn keeping in mind that it has to group the class nodes and intersect each cutting edge (fig. 2.c).
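The cut described above can be sketched in a few lines of Python. This is only an illustration: the edge list is a hypothetical graph chosen to agree with the caption of fig. 2 (the figure itself is not reproduced here), and `cutting_edges` is a name introduced for this example. Nodes are zone labels written as strings of single-letter class names, so a node belongs to a class when the class letter appears in its label.

```python
def cutting_edges(edges, class_name):
    """Return the edges with exactly one endpoint inside the given class."""
    cut = set()
    for u, v in edges:
        # An edge is cut when one endpoint is in the class and the other is not.
        if (class_name in u) != (class_name in v):
            cut.add((u, v))
    return cut

# Assumed edge list, consistent with the caption of fig. 2.
edges = [("a", "ab"), ("ab", "b"), ("ab", "abc"), ("ac", "abc"), ("a", "ac")]

# Nodes that the boundary of class B has to enclose.
class_nodes = {node for edge in edges for node in edge if "b" in node}
print(sorted(class_nodes))
print(sorted(cutting_edges(edges, "b")))
```

With this assumed graph, the boundary of class B has to enclose the nodes b, ab and abc and to cross the edges (a, ab) and (ac, abc), as stated in the caption of fig. 2.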
As they are equivalent, we will use diagrams or in-
tersection graphs indifferently according to which one
is clearer in the speciﬁc case.
4 Euler representation
To be able to represent classifications that do not have a Euler diagram, we need to use a less restrictive structure. We will call this structure Euler representation, and we will design its properties by investigating the factors that make an EDGP instance undrawable.

Figure 4: (a) a diagram with disconnected zones, as zone a and zone b are represented by separated regions divided by the zone ab. (b) a diagram with disconnected classes, as class B has the zones ab, b separated from zone bc. (c) a diagram with disconnected zones and classes, as zone b is duplicated and the class B is disconnected.

Figure 5: Visualisation of disconnected classes. (a) the original diagram, showing the relationships we aim to represent. Let us suppose the zones ac, c are not directly reachable from the others. (b) shows a possible way to depict a link between separated zones of the same class. This representation does not show straight away that the class A contains ac, especially if they are positioned far apart from each other. (c) shows the duplication of the zone ac and its nodes. A spotted boundary is used to indicate that the zone has been cloned and not simply represented with separated regions. This representation shows more immediately that the classes A and C interact with each other, and it shows all the elements of the same class in the same connected area.
Zone connectivity. According to Chow, a set of closed curves is a Euler diagram if every non-empty zone is represented as a connected region. Zone connectivity is the first obstacle to the existence of a Euler diagram. We can easily show EDGP instances (footnote 3) that are not drawable without splitting the zones into disconnected regions (fig. 3.a), proving that zone connectivity is actually a limiting condition.

Relaxing this condition, we are in practice allowed to duplicate a zone in a different area of the diagram (fig. 4.a), as long as we keep the classes connected. Unfortunately, this is not sufficient to draw every EDGP instance, as some of them are not representable even after dropping this constraint (fig. 3.b).
Footnote 3: These difficult instances are usually built starting from non-planar graphs and mapping sets to the graph elements in a suitable way. For instance, we can associate sets $N_i$ to the nodes and sets $E_j$ to the edges, and impose that each set $E_j$ overlaps only with the sets $N_i$ associated to the nodes incident to its edge. This implies that the edges cannot overlap, otherwise we would describe an intersection between sets $E_j$ that is not defined in our model.

By Kuratowski's theorem, a graph containing a subdivision of the complete graph $K_5$ or of the complete bipartite graph $K_{3,3}$ is not planar; in other words, it cannot be represented without drawing crossing edges. These graphs, through the association explained, bring us to examples of undrawable Euler diagrams.
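The construction in this footnote can be made concrete with a small Python sketch. It is an assumed encoding, not taken from the paper: the function `classification_from_graph` and the string names of classes and elements are hypothetical. Each node set and each edge set gets one private element, and each edge set shares one further element with each of its two incident node sets, so that these are the only overlaps.

```python
from itertools import combinations

def classification_from_graph(nodes, edges):
    """Build classes N<i> (one per node) and E<j> (one per edge) as sets of elements."""
    # One private element per node set.
    classes = {f"N{n}": {f"n{n}"} for n in nodes}
    for j, (u, v) in enumerate(edges):
        # One element shared between the edge set and each incident node set.
        shared_u, shared_v = f"e{j}&n{u}", f"e{j}&n{v}"
        classes[f"E{j}"] = {f"e{j}", shared_u, shared_v}
        classes[f"N{u}"].add(shared_u)
        classes[f"N{v}"].add(shared_v)
    return classes

# The complete graph K5: 5 nodes and 10 edges.
k5_nodes = range(5)
k5_edges = list(combinations(k5_nodes, 2))
classes = classification_from_graph(k5_nodes, k5_edges)

# Pairs of classes that share at least one element.
overlaps = [(x, y) for x, y in combinations(sorted(classes), 2)
            if classes[x] & classes[y]]
print(len(classes), len(overlaps))  # prints: 15 20
```

The resulting classification has 15 classes and 20 overlapping pairs, each pair made of one edge set and one node set. No two edge sets and no two node sets intersect, which is the condition the footnote imposes.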
Figure 6: Another example of disconnected classes. (a) the original diagram, built without particular constraints. Let us now assume class D and E are not reachable from A, B, C. (b) the diagram obtained when representing the zone a as two separated regions. In this case, node duplication is generally not meaningful for a better comprehension of the diagram. (c) the same graph obtained by duplicating the zones ad and ae and their nodes. Again, node duplication can be made clear by using a dashed line for the boundary. Although this solution allows us to see all the nodes of the same class in a connected region, it tends to be less readable than the previous one because of the greater number of extra [...]
Class connectivity. In Euler diagrams classes are represented by a connected region, as implied by the usage of a single closed curve for each class. Again, we can see that this condition is restrictive by showing EDGP instances that are not drawable without representing classes with separated regions (fig. 3.b).

Relaxing this condition, we are allowed to draw zones that are separated from each other (fig. 4.b). Clearly we are now able to draw every EDGP instance, as we are no longer forced to link zones together.
Representation characteristics. From the previous analysis, we can deduce that Euler representations should allow classes to be represented by separated regions, if necessary. Disconnecting zones does not seem to be necessary, but it sometimes yields more readable diagrams. For the same reason we might also decide to duplicate a zone, creating a copy of the zone in another region of the graph and cloning all its elements.

Summarising, Euler representations are characterised by:
- classes not necessarily connected,
- zones not necessarily connected and possibly even cloned,

where the usage of these patterns is limited as much as possible.
Figure 7: Some tests on the intersection graph. (a) checking that all the class schemas are connected. (b) checking that the subgraphs induced by the nodes containing any possible subset of classes are connected. Here the subgraphs induced by the nodes containing ab, ac, and abc are shown. Together with the ones containing a, b, and c (which correspond exactly to the class schemas in the previous picture), they are all the possible non-empty subgraphs of this kind. (c) checking that the complementary class schemas are connected. At this point we also need to consider a node associated to the external area, which will always be part of the complementary class schemas.
Some examples of the application of these methods are shown in fig. 5 and fig. 6. In particular, fig. 5.a shows just a disconnected class, fig. 6.b also a disconnected zone, and fig. 5.c and fig. 6.c examples of zone cloning.
5 Properties of the intersection graph
Because of the constraints we relaxed and the new structures we introduced, we have a high degree of freedom in representing diagrams. Choosing the most readable representation among all the possible ones first requires identifying the properties that matter most for diagram comprehension. Assuming that Euler diagrams are more readable than general Euler representations, we need to try to:
1. avoid every undesired overlap. This is an indisputable point as we aim to draw just the non-empty zones.
2. keep the classes as connected as possible. This will avoid having classes represented as separated regions (fig. 4.b) in the final diagram.
3. keep the single zones as connected as possible. This avoids zones being represented by more than one region (fig. 4.a), as it makes it difficult to understand the exact interaction of the zones with the rest of the diagram.
4. keep even zones that share the same subset of labels as connected as possible. This avoids disconnected overlaps between the same sets (fig. 1.b), as they make it difficult to trace how the intersection between classes is divided into the several zones.
5. avoid holes in the classes. Diagrams with holes (fig. 1.c) can generate confusion between holes and set inclusions.
6. make classes assume a smooth and regular shape.
As we will in practice work with embeddings of the intersection graph, it is extremely useful to see how the previous diagram properties translate into graph embedding properties:
1. make the intersection graph planar.
2. make the subgraphs induced by the nodes of the same class connected (fig. 7.a). We will call these induced subgraphs class schemas.
3. avoid node duplications in the intersection graph. In other words, limit the usage of node duplications to what is needed to satisfy the previous points.
4. make the subgraphs induced by the nodes of each same subset of classes connected, rather than just the nodes of the same class (fig. 7.b).
5. make the subgraph induced by the nodes outside each class schema connected (fig. 7.c). We will call these induced subgraphs complementary class schemas. It is also necessary to add a node associated to the null zone, corresponding to the external area. As this node is never part of the class schema, it is always in the complementary one.
6. place nodes in an area of the plane that is as compact and regular as possible.
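The connectivity tests in points 2, 4 and 5 can be checked mechanically on a candidate intersection graph. Below is a minimal Python sketch under stated assumptions: zones are encoded as frozensets of class names, the empty frozenset stands for the external area, and the names `is_connected` and `check_schemas` are introduced here for illustration. Planarity, node duplication and node placement (points 1, 3 and 6) are not covered.

```python
from collections import deque
from itertools import combinations

EXTERNAL = frozenset()  # node for the null zone (the external area)

def is_connected(nodes, edges):
    """Breadth-first check that `nodes` induce a connected subgraph."""
    nodes = set(nodes)
    if not nodes:
        return True
    adjacency = {n: set() for n in nodes}
    for u, v in edges:
        if u in nodes and v in nodes:
            adjacency[u].add(v)
            adjacency[v].add(u)
    start = next(iter(nodes))
    seen, queue = {start}, deque([start])
    while queue:
        for neighbour in adjacency[queue.popleft()] - seen:
            seen.add(neighbour)
            queue.append(neighbour)
    return seen == nodes

def check_schemas(zones, edges):
    """Report connectivity for class subsets (points 2, 4) and complements (point 5)."""
    classes = sorted(set().union(*zones))
    report = {}
    # Subsets of size 1 are the class schemas; larger ones cover point 4.
    for size in range(1, len(classes) + 1):
        for subset in combinations(classes, size):
            schema = {z for z in zones if set(subset) <= z}
            if schema:
                report["".join(subset)] = is_connected(schema, edges)
    # Complementary class schemas always include the external node.
    for c in classes:
        outside = {z for z in zones if c not in z} | {EXTERNAL}
        report["not " + c] = is_connected(outside, edges)
    return report

a, b, ab = frozenset("a"), frozenset("b"), frozenset("ab")
zones = {a, b, ab}
edges = [(EXTERNAL, a), (EXTERNAL, b), (a, ab), (ab, b)]
report = check_schemas(zones, edges)
print(report)
```

The subset enumeration is exponential in the number of classes, so this form is only meant for small instances. Dropping the edge between a and ab in the example makes the class schema of a disconnected.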
Figure 8: A Euler representation example. The diagram structure is not planar (it has a $K_5$ minor), so it cannot be represented with Euler diagrams. The proposed Euler representation uses zone duplication for bdg and dh, and has a disconnected class A.
Algorithm design. An algorithm that aims to detect a good Euler representation has to identify an intersection graph satisfying the previous points as much as possible, in order of importance. The most immediate way consists of identifying all the zones of the given classification, associating one node of the intersection graph to each of them, and carefully selecting the edges to insert.

Node duplication, which corresponds to allowing a zone to be disconnected, can be used when it is no longer possible to select useful edges in the graph. Disconnected class nodes will correspond, instead, to disconnected classes. Choosing to leave them disconnected, or to use node duplications to connect them, is a matter of choice. As we saw, it depends on the specific case and on the specific relation one aims to represent.
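A possible first stage of such an algorithm is sketched below in Python. It is only an illustration under a strong assumption that the text does not make: candidate edges are restricted to pairs of zones whose labels differ by exactly one class, that is, zones separated by a single boundary. The function names are hypothetical, and no planarity test or edge ranking is attempted. The sketch creates one node per zone plus the external node, generates the candidate edges, and reports the classes whose schema cannot be connected with them. These are the classes for which one would have to choose between node duplication and a disconnected drawing.

```python
from itertools import combinations

def candidate_edges(zones):
    """Pairs of nodes (zones plus the external node) differing by exactly one class."""
    nodes = set(zones) | {frozenset()}  # the empty label is the external area
    ordered = sorted(nodes, key=sorted)
    return {(u, v) for u, v in combinations(ordered, 2) if len(u ^ v) == 1}

def disconnected_classes(zones, edges):
    """Classes whose schema stays disconnected even when all the edges are used."""
    result = []
    for c in sorted(set().union(*zones)):
        schema = {z for z in zones if c in z}
        parent = {z: z for z in schema}  # union-find over the class nodes

        def find(z):
            while parent[z] != z:
                parent[z] = parent[parent[z]]
                z = parent[z]
            return z

        for u, v in edges:
            if u in schema and v in schema:
                parent[find(u)] = find(v)
        if len({find(z) for z in schema}) > 1:
            result.append(c)
    return result

# Hypothetical instance: class a only exists inside b and inside c.
zones = {frozenset("b"), frozenset("c"), frozenset("ab"), frozenset("ac")}
edges = candidate_edges(zones)
print(len(edges), disconnected_classes(zones, edges))
```

In this instance the zones ab and ac differ by two classes, so under the assumed restriction no candidate edge joins them and class a is reported as disconnected. Definitions that allow overlapping boundaries would admit further candidate edges.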
6 Conclusions
We started by introducing several ways of defining Euler diagrams, showing or referencing proofs of their inability to represent every classification. We then analysed why Euler diagrams cannot always be drawn, pointing out two separate reasons that might impede this process.
This analysis allowed us to detect some methods to
show otherwise unrepresentable relationships. Using
disconnected regions for classes, and graphically link-
ing them together, is the simplest approach. We saw
that this always works, but that the results are not neces-
sarily the best possible. Another option we pointed out
consists of representing some zones as disconnected re-
gions. This might help to reduce the number of ﬁctional
links we need to introduce. A last possibility is to clone
a whole zone in another part of the graph, cloning even
the nodes of the zone. This helps in particular when
we want to keep all the nodes of a class in the same con-
nected region, even when the overlapping classes are not
directly connected to each other.
Finally, we analysed the way each condition is ex-
pressed in the intersection graph. Structure graphs of
this kind are the ﬁrst step of most approaches to Eu-
ler diagrams generation. Knowing how the previous
patterns are mapped in these graphs is essential to de-
cide how, when and where to use them. An algorithm
paradigm has also been pointed out, while concrete im-
plementations of this approach need to conveniently de-
ﬁne the necessary metrics according to the particular ap-
plication.
References
Gary D. Bader and Christopher W.V. Hogue. An automated method for finding molecular complexes in large protein interaction networks. January 13 2003.
Florence Benoy and Peter Rodgers. Evaluating the comprehension of Euler diagrams. In IV, pages 771–780. IEEE Computer Society, 2007.
Stirling Christopher Chow. Generating and drawing area-proportional Euler and Venn diagrams. PhD thesis, 2007.
Leonhard Euler. Lettres à une princesse d’Allemagne, letters no. 102-108, 1761.
Jean Flower and John Howse. Generating Euler diagrams. Lecture Notes in Computer Science, 2317, 2002.
D.S. Johnson and H.O. Pollak. Hypergraph planarity and the complexity of drawing Venn diagrams. Journal of Graph Theory, 11(3):309–325, 1987.
Gergely Palla, Albert-László Barabási, and Tamás Vicsek. Quantifying social group evolution. Nature, 446(7136):664–667, 2007.
John Venn. On the diagrammatic and mechanical representation of propositions and reasonings, 1880.
Anne Verroust and Marie-Luce Viaud. Ensuring the drawability of extended Euler diagrams for up to 8 sets. In Diagrammatic Representation and Inference, Third International Conference, Diagrams 2004, Cambridge, UK, Lecture Notes in Computer Science. Springer.