

Visualizing Multi-Dimensional Decision Boundaries in 2D

M.A. Migut · M. Worring · C.J. Veenman

Received: date / Accepted: date

Abstract In many applications well-informed decisions have to be made based on

analysis of multi-dimensional data. The decision making process can be supported by

various automated classiﬁcation models. To obtain an intuitive understanding of the

classification model, interactive visualizations are essential. We argue that this is best

done by a series of interactive 2D scatterplots. We deﬁne a set of characteristics of the

multi-dimensional classiﬁcation model that have to be visually represented. To present

those characteristics for both linear and non-linear methods, we combine visualization

of the Voronoi based representation of multi-dimensional decision boundaries in scat-

terplots with visualization of the distances to the multi-dimensional boundary of all the

data elements. We use interactive decision point selection on the ROC curve to allow

the decision maker to reﬁne the threshold of the classiﬁcation model and instantly ob-

serve the results. We show how the combination of those techniques allows exploration

of multi-dimensional decision boundaries in 2D.

Keywords interactive data mining · knowledge discovery · decision boundary visualization · multi-dimensional space · classification

1 Introduction

In many domains experts have to make decisions based on the analysis of multi-

dimensional data. The core element of the decision making process is an accurate

and transparent classiﬁcation model. For the validation of the model and deepening

M.A. Migut

Intelligent System Lab Amsterdam, University of Amsterdam, The Netherlands

E-mail: mmigut@gmail.com

M. Worring

Intelligent System Lab Amsterdam, University of Amsterdam, The Netherlands

E-mail: M.Worring@uva.nl

C.J. Veenman

Digital Technology & Biometrics Department, NFI, The Hague, The Netherlands

E-mail: c.veenman@nﬁ.minjus.nl


the insight into the domain and its underlying processes, intuitive means to assess the characteristics of the classifier are important.

To support multi-dimensional decision making, it is favorable to obtain a visual

comprehension of the classiﬁer. Many visualization techniques are used to present the

results of the classiﬁer, such as ROC curves or Precision and Recall graphs (Duda

et al, 2000). These are very useful techniques for visualizing, organizing, and selecting

classiﬁers based on their performance (Provost and Fawcett, 1997).

Performance alone, however, is not suﬃcient to understand a classiﬁer. The re-

sulting graphs are an aggregation of classiﬁcation results on individual data elements.

Additionally, we need to assess, which elements are easily classiﬁed, which are more

complex to handle, and what mistakes are made. This is an intricate interplay between

the characteristics of the data, the classiﬁcation model, and its parameter settings. For

selecting the suitable classiﬁer we need to go beyond performance as the only mea-

sure. The best overall performance is often not the optimal solution in applications

where risk or proﬁt are assessed, e.g. in disaster management, ﬁnance, security, and

medicine (Keim et al, 2008; Thomas and Cook, 2005). In the medical ﬁeld, for example,

diagnosing speciﬁc disorders is a common task performed by a medical expert. The con-

sequences of making mistakes, either diagnosing the healthy patients or not diagnosing

the ill patients, may be fatal. Obviously the expert has a diﬃcult task to understand

the multi-dimensional data, make decisions based on it and moreover foresee the con-

sequences of those decisions. Crucial in the decision making process is determining the

cost of the diﬀerent types of mistakes made by the classiﬁcation model, assuring critical

cases are handled appropriately, and ﬁnding the balance between those mistakes.

One of the most informative characteristics of the classiﬁcation model and its rela-

tion to the data is the decision boundary. The decision boundary determines the areas

in space where the classes are residing. It also provides a reference for determining

classiﬁcation diﬃculty. Elements close to the boundary are the ones which are diﬃcult

to classify, while others have a higher certainty of class membership. Being able to

visualize the decision boundary would be a great aid in decision making.

Decision boundaries can easily be visualized for 2D and 3D datasets (Duda et al,

2000). Generalizing beyond 3D forms a challenge in terms of the visualization and

its use by the domain expert. The challenge in visualizing the decision boundary is

that the boundary is deﬁned in a multi-dimensional space. Transforming such a multi-

dimensional boundary to a representation in lower dimensions, that can be displayed

and understood by the experts is diﬃcult. Deﬁning the core characteristics of the

multi-dimensional decision boundary that should be represented in the low dimensional

representation is crucial.

Several attempts have been made to visualize decision boundaries for multi-dimensional data (Caragea et al, 2001; Hamel, 2006; Poulet, 2008; Migut and Worring, 2010).

The first two methods are specific to limited types of classifiers and hence cannot be used to compare different methods. The method in (Poulet, 2008) can be applied to different classifiers; however, it does not allow relating the visualization of the decision boundary to the data elements in terms that are meaningful for the domain experts, i.e. in terms of class membership. The method in (Migut and Worring, 2010) does not allow analyzing the data elements in relation to the decision boundary, as the distances to the boundary are not visually expressed. None of those approaches on its own allows the expert to examine different classifiers in terms of the decision boundary and the costs of classification in terms of misclassified examples.


The aim of this paper is to expand on the previous studies (Migut and Worring,

2010; Poulet, 2008) to ﬁnd a generic approach to analyze a decision boundary of a

multi-dimensional classiﬁer in 2D. To this end, we formalize the problem by providing

a set of decision boundary characteristics that the 2D visualization should represent.

Moreover, we formalize the approach taken by (Migut and Worring, 2010) and by

combining it with methods proposed by (Poulet, 2008) we provide an expert with a

visualization of the decision boundary that expresses all the important characteristics

of the classiﬁer and allows the analysis of classiﬁcation results. We interactively cou-

ple the data visualization, the decision boundary visualization, classiﬁer performance

visualizations and the distance to the decision boundary. The integrated solution pro-

vides an expert with the possibility to visually explore the classiﬁer and the costs of

classiﬁcation, as well as visually compare diﬀerent classiﬁers for all data dimensions.

The paper is organized as follows. The subsequent section presents a review of

several attempts to visualize multi-dimensional decision boundaries. We pinpoint the shortcomings of the existing techniques and propose to use their beneficial features. Then we formally state the problem, define a set of tasks that require a visual

representation of the decision boundary, and propose a set of classiﬁer characteristics

that should be visually expressed in 2D. We also describe in detail why the visualization

of a decision boundary in 2D is so challenging. From there, we show how to interactively

integrate several visualization techniques that contribute to a solution satisfying our

requirements. In the following section we demonstrate the approach using two biomedical datasets, illustrating that the proposed methodology is suitable for exploring

multi-dimensional classiﬁers.

2 Related work

Several attempts have been made to visualize decision boundaries for multi-dimensional

data (Poulet, 2008; Caragea et al, 2001; Hamel, 2006). We summarize those techniques

and analyze which characteristics of the decision boundary they capture.

In (Caragea et al, 2001) the authors visualize the Support Vector Machine (SVM) classifier using a projection-based tour method. They show histograms of the data's predicted class, visualizations of the data and the support vectors in 2D projections, and a weighting of the plane coordinates to choose the most important features for the classification. The methods are all applicable to SVM only, which means that they are not generic.

In (Poulet, 2008) the authors display histograms of the distances to the boundary for correctly classified and misclassified examples of an SVM. Those

histograms are linked to a set of scatterplots or parallel coordinates plots. The bins

of the histogram can be selected and the points on the scatterplot with corresponding

distances to the multi-dimensional boundary are consequently highlighted. The au-

thors claim that those highlighted elements are showing the separating boundary on

the scatterplots. The proposed method could also be applied to other classiﬁers like

decision trees or regression lines. This is an interesting approach to decision boundary

visualization oﬀering a good estimation of the quality of the boundary. However, if for

a certain 2D projection, the elements close to the decision boundary are scattered all

over the plot, it is no longer possible to understand how the classifier separates the

data. This means that it is not possible to directly assess whether there is a decision

boundary between two arbitrary points in the visualization. Figure 3(a) and (b) show


an example using this technique for a combination of two arbitrary dimensions of an

arbitrary dataset (LIVER dataset). Linking the histogram of the distances to the de-

cision boundary with the corresponding points on the scatterplot is not enough to give

an insight into how the classiﬁer separates the data.

The method in (Hamel, 2006) uses self-organizing maps (SOM) to visualize results

of SVM. SOMs are also used to visualize decision boundaries in (Yan and Xu, 2008).

Yan and Xu propose two algorithms: one to obtain data points on decision boundaries and a second to illustrate decision boundaries on SOMs. The decision boundaries are

not visualized in the original data space (domain space). Even though the described

techniques to visualize decision boundaries are very interesting, none of them alone

can be used to show all the characteristics of the decision boundary and the costs of

classiﬁcation for diﬀerent classiﬁers.

3 Visualization of multi-dimensional decision boundaries

3.1 Problem analysis

In this section we formalize our problem and propose a set of characteristics of a

decision boundary that a successful visualization should represent.

Assume a training dataset with k objects represented by numerical feature vectors in an n-dimensional metric space. In this paper we only consider two-class problems. In section 5, however, we indicate how to transfer the techniques to multi-class problems. Also, the features are limited to those which have meaning to the expert, so n is small. A classifier is trained on the dataset, resulting in a decision boundary in the n-dimensional space. An object classified as positive is defined as true positive (TP) if

the actual label is also positive and is called false positive (FP) if the actual label is

negative. In a similar way, an object classiﬁed as negative is called true negative (TN)

if the actual label is negative and false negative (FN) if the actual label is positive.
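These four outcomes can be counted directly from paired predicted and actual labels. The following is a minimal sketch; the label vectors are illustrative placeholders, not data from the paper:

```python
# Confusion counts for a two-class problem (+1 = positive, -1 = negative).
# The label vectors below are made up for illustration.
predicted = [+1, +1, -1, -1, +1, -1]
actual    = [+1, -1, -1, +1, +1, -1]

tp = sum(p == +1 and a == +1 for p, a in zip(predicted, actual))  # true positives
fp = sum(p == +1 and a == -1 for p, a in zip(predicted, actual))  # false positives
tn = sum(p == -1 and a == -1 for p, a in zip(predicted, actual))  # true negatives
fn = sum(p == -1 and a == +1 for p, a in zip(predicted, actual))  # false negatives
```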

The problem at hand is how to visualize such an n-dimensional decision boundary

to support the expert’s decision making process. To that end, let us ﬁrst look at the

tasks that the expert has to perform. We summarize them as follows:

[T1] Task 1: Analyze which of the n dimensions are most important

[T2] Task 2: Analyze and compare how different classifiers separate the data

[T3] Task 3: Analyze the relation between the boundary and the data elements

[T4] Task 4: Analyze and compare classification costs

These tasks have several implications that make the visualization of the n-dimensional

decision boundary a challenging task. There are three important characteristics of the

classiﬁer that a visualization of the decision boundary should capture:

1. Separation: the visualization must be in agreement with the actual classiﬁcation.

All objects assigned to a positive class by the classiﬁer (TP and FP) must be

visually diﬀerentiated from the members of the negative class (TN and FN). On

the level of the individual data objects, the visualization must represent whether

there is a decision boundary between each pair of objects in the multi-dimensional

space.


2. Direction: for two arbitrary data objects in the visualization, the representation

of the decision boundary must unambiguously show on which side of the decision

boundary each object is located in the multi-dimensional space.

3. Distance: for each visualized data object the distance to the decision boundary in

multi-dimensional space should be represented.

To compare diﬀerent classiﬁers, the visualization technique must be coherent and

represent those three characteristics independently of the classiﬁcation method used.

Moreover, the data visualization technique must be chosen such that it allows the visu-

alization of these characteristics and more importantly that it supports the conceptual

framework of the experts. A taxonomy of multidimensional visualizations is given by

(Keim, 2002). The categories listed include standard 2D/3D displays, geometrically

transformed displays, iconic displays, dense pixel displays, and stacked displays. Dif-

ferent techniques serve diﬀerent purposes. Since experts understand their data best in

the original feature values, we consider here only techniques that support this.

Interactive visualizations of multi-dimensional datasets which represent the fea-

tures explicitly are e.g. scatterplots, heatmaps, parallel coordinates and parallel sets

(Bendix et al, 2005). It is desirable to provide experts with an easy to understand visual

representation of the data. We therefore choose the frequently used scatterplots. They

are basic building blocks in statistical graphics and data visualization (Cleveland and

McGill, 1988). Multidimensional visualization tools that feature scatterplots, such as

Spotﬁre (Inc., 2007), Tableau/Polaris (Stolte et al, 2002), GGobi (Swayne et al, 2003),

and XmdvTool (Ward, 1994) typically also allow mapping of data dimensions to graphical properties such as point color, shape, and size. Since in a 2D scatterplot data elements are drawn as points in the Cartesian space defined by two axes carrying the real attribute values (in domain space), scatterplots accommodate the conceptual framework of the user. Moreover, their familiarity among users favors their use for the purpose of this paper. However, the number of dimensions that a single scatterplot can visualize

is considerably less than found in realistic datasets. Therefore, scatterplots should be available for all combinations of the dimensions, for example using interactive axes, where the user selects the attributes to be plotted. In this way the variables are plotted against each other while preserving the meaning of their values. Consequently, the decision boundary in multi-dimensional space has to be visualized in such a 2D setting.

3.2 Axes parallel projections

The decision boundaries for multi-dimensional classiﬁers are either planes/hyperplanes

for linear classifiers, or can exhibit complex shapes for non-linear classifiers. For two dimensions at a time, the multi-dimensional data can easily be projected into 2D space. The classifier, however, cannot be meaningfully projected into these two dimensions, as it is not defined in this 2D space. Hence, the projection into 2D will not represent the multi-dimensional classifier. The resulting projections do not necessarily separate elements belonging to different classes as imposed by the multi-dimensional classifier.

Only for a multi-dimensional linear decision boundary that is perpendicular to the projection plane will the separating information be preserved. Other linear boundaries, as well as non-linear boundaries, are meaningless when projected to 2D. Straightforward

projection of the classiﬁer to 2D captures neither the Separation nor the Direction

[Figure 1 plot area: (a) scatterplot of Feature 1 vs. Feature 2; (b) ROC curve of True Positive rate (TPr) vs. False Positive rate (FPr)]

Fig. 1 (a) Voronoi diagram for a 2-dimensional dataset of two Gaussian distributed classes

together with the approximated decision boundary following the Voronoi cells’ boundaries

(thick solid line). The approximation follows the labels as imposed by the classiﬁer (linear

support vector machine) and therefore does not violate the actual classiﬁer visualized with a

dashed line. (b) ROC curve with the current classiﬁer’s trade-oﬀ visualized as an operating

point on the curve.

characteristics of the multi-dimensional classifier. The Distance characteristic is not represented either: since the distances in the 2D data projection do not represent the actual distances in the multi-dimensional space, the distances of data elements to the projected boundary are also not preserved. The straightforward projection of the multi-dimensional boundaries to 2D is therefore not the answer to our

problem. We will look for a methodology to represent the multi-dimensional decision boundary that allows seeing how the classifier separates the data in the original multi-dimensional space when the data is projected into 2D.

3.2.1 Voronoi based decision boundary visualization

In this section we describe the Voronoi-based representation of the decision boundary, as used in (Migut and Worring, 2010). In order to capture all the characteristics of the classifier that the decision boundary representation should convey, we extend that technique with the visualization of the histogram of distances, as used by (Poulet, 2008).

Let us now consider two elements in the dataset, a labeled as positive by classifier B (boundary) and b labeled as negative by classifier B. The Separation characteristic of the classifier implies that in the visual representation of classifier B the boundary must lie somewhere between points a and b. If we assume it to be locally linear, it would yield a half plane containing a and not b. Without knowledge of the actual distances, it could be put midway between the two elements. This resembles the Voronoi tessellation of the space, if performed for all the elements in the dataset. Therefore, we use the Voronoi tessellation to represent the decision boundary in a 2D scatterplot.

A Voronoi diagram (Fortune, 1987; Aurenhammer, 1991; Duda et al, 2000) can be

described as follows. Given a set of points (referred to as nodes), a Voronoi diagram is

a partition of space into regions, within which all points are closer to some particular

node than to any other node, see figure 1. Formally, if P denotes a set of k points, then

[Figure 2 plot area: (a) scatterplot of Feature 1 vs. Feature 2; (b) scatterplot of Feature 2 vs. Feature 3]

Fig. 2 Decision boundary for a 10-dimensional dataset of two Gaussian distributed classes

for the SVM classifier: (a) features well separated by the decision boundary; (b) features highly

fragmented by the decision boundary. Misclassiﬁed examples are marked with a green square.

for two distinct points p, q ∈ P, the separator sep(p, q) contains all points of the plane at least as close to p as to q:

sep(p, q) = { x ∈ R² | δ(x, p) ≤ δ(x, q) }

where δ denotes the Euclidean distance function. The region V(p), the Voronoi cell corresponding to a point p ∈ P, encloses the part of the plane where sep(p, q) holds for every other node q:

V(p) = ⋂_{q ∈ P \ {p}} sep(p, q)

Two Voronoi regions that share a boundary are called Voronoi neighbors. We apply

the Voronoi diagram to each combination of two dimensions projected into 2D space

for a dataset labeled by the multi-dimensional classiﬁer. All the data objects are used

as nodes to make the Voronoi diagram.

The boundaries of the Voronoi regions corresponding to neighbors belonging to dif-

ferent classes (according to the labels assigned by a classiﬁer) form the decision bound-

ary. Such a representation of class separation for two given features is a piecewise linear

approximation of the actual decision boundary, as imposed by the multi-dimensional

classiﬁer, see ﬁgure 1.
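This construction can be sketched with SciPy's Voronoi implementation: ridges shared by nodes that received different labels from the classifier form the piecewise linear boundary approximation. The data and labels below are synthetic stand-ins, not the paper's datasets:

```python
import numpy as np
from scipy.spatial import Voronoi

# Synthetic 2D projection of the data, labeled by some classifier
# (two Gaussian clouds as a stand-in for classifier-assigned labels).
rng = np.random.default_rng(0)
points = np.vstack([rng.normal(0.0, 1.0, (20, 2)),
                    rng.normal(4.0, 1.0, (20, 2))])
labels = np.array([0] * 20 + [1] * 20)

vor = Voronoi(points)

# Each entry of ridge_points names the two nodes sharing a Voronoi edge.
# Edges between nodes of different classes approximate the decision boundary.
boundary_edges = [(p, q) for p, q in vor.ridge_points
                  if labels[p] != labels[q]]
```

The corresponding edge geometry (for drawing the boundary in the scatterplot) is available through `vor.ridge_vertices`.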

Visualization of the 2D combinations of the dimensions used by the classiﬁer results

in a series of scatterplots. Figure 2 illustrates our approach. For the features that

separate the data well, the approximated decision boundary 'disconnects' the classes

well, resulting in two clusters of data. For the features that do not separate the data

well, we observe a high fragmentation of the classes.

To emphasize on which side of the boundary the elements are located, we color the

Voronoi regions belonging to one class, as labeled by the multi-dimensional classiﬁer.

We argue that for complex boundaries a region-based visualization makes it easier to

comprehend to which class each data instance belongs. More motivation is given in the

next section.


As stated in the previous subsection, the consequence of using scatterplots of multi-

dimensional data is that the distances between data points from the original domain

space are not preserved. Therefore, the distances between the data elements and the visualized decision boundary cannot be preserved either. Using the Voronoi-based approach, the distances between the actual objects and the decision boundary are indeed not preserved, but class membership is. In fact, the Voronoi-based approximation of

the decision boundary indicates precisely that in the multi-dimensional space there is a decision boundary between any two data instances located on different sides of the boundary representation. By construction, the piecewise linear representation lies exactly halfway between the two points. This distance has no meaning in terms of the actual

distance between the objects and the decision boundary in the original space. The

position of the decision boundary could be optimized locally, between the two points

that are in the direct neighborhood of the linear piece of the boundary. However, the distance also has no meaning when considering the ordering of the data elements in any direction from the decision boundary. Therefore, the distances cannot be optimized

globally. This means that we need another visual representation to show which of the

elements are closer to the decision boundary.

To visually indicate the distances between the data points and the decision bound-

ary we exploit the histogram of the distances to the boundary as proposed by (Poulet,

2008). The histogram is divided into four regions. On the positive side of the X-axis

the distances to the elements with positive original label are visualized. The negative

side of the X-axis is reserved for the data elements with negative original label. The

positive part of the Y-axis is reserved for the elements that are correctly classiﬁed by

the classiﬁer and the negative side of the Y-axis is for the distances to misclassiﬁed

examples. Figure 3(a) shows the four quadrants. The histogram of distances makes the differences in distance to the multi-dimensional decision boundary visually distinctive, adding to the "completeness" of the decision boundary visualization. The class membership and actual distances to the boundary can be explored. To make the visual

components correspond to each other, we use the same color for the corresponding

concept visualized in the histograms as we use in the scatterplot. The original labels

are represented in color of the histograms’ bins, while the classes, as assigned by the

classiﬁers, are represented in the color of the background of the histograms. Figure 6

shows that the Voronoi-based representation and the histogram of distances are com-

plementary.
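The four-quadrant bookkeeping can be sketched as follows; the signed distances (sign = side of the boundary, i.e. predicted class) and original labels are illustrative values, not from the paper:

```python
# Partition elements into TP/FP/TN/FN with their absolute distances,
# ready to be binned into the four histogram quadrants.
distances = [0.8, 0.3, -0.2, -0.9, 0.5, -0.4]   # signed distance to boundary
labels    = [+1, +1, -1, -1, -1, +1]            # original class labels

quadrants = {"TP": [], "FP": [], "TN": [], "FN": []}
for d, y in zip(distances, labels):
    predicted = +1 if d >= 0 else -1             # sign encodes predicted class
    if predicted == +1:
        quadrants["TP" if y == +1 else "FP"].append(abs(d))
    else:
        quadrants["TN" if y == -1 else "FN"].append(abs(d))
```

Original labels select the X side of the histogram, correctness the Y side; the absolute distances are what gets binned.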

3.3 Interactive trade-oﬀ inspection

In this section we show how to interactively couple trade-oﬀ visualizations with decision

boundary visualizations, as proposed in Migut and Worring (2010).

In general, performance curves such as Precision and Recall graphs or ROC curves capture the ranking performance of a binary classifier as its discrimination threshold is varied. The Receiver Operating Characteristic (ROC) curve, often used in medicine,

visualizes the trade-oﬀs between hit rate and false alarm rate (McClish, 1989). The

Precision and Recall curve, often used in information retrieval, depicts the trade-off

between the fraction of retrieved documents relevant to the search and the fraction

of the documents relevant to the query successfully retrieved. What these curves have

in common is that they give a balance between two competing and inversely related

measures. In applications where delicate decisions have to be made, this balance is


Fig. 3 (a) The histogram of the distances of the TP, TN, FP, and FN to the decision boundary, with the bin of the TP closest to the boundary highlighted, as proposed in (Poulet, 2008); (b) the TP with the closest distance to the decision boundary highlighted. We see which elements are the closest, but because they are scattered all over the plane, they do not indicate how the decision boundary is related to the other data elements; (c) the TP with the closest distance to the boundary highlighted. Due to the Voronoi-based representation of the decision boundary, it is immediately visible how the boundary divides this 2D representation of the data.

subtle and complex as it can have dramatic consequences. Therefore, the performance

curve should depict the trade-oﬀ between the classiﬁer’s errors for one or both classes.

As an example for this paper, we use the curve that represents the trade-off between the False Positive rate (FPr) and the False Negative rate (FNr). On the performance graph, FPr is plotted on the X axis and FNr is plotted on the Y axis. These statistics vary with a threshold

on the classiﬁer’s continuous outputs. The trade-oﬀ of the current classiﬁer is visualized

by means of an operating point on the curve.
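A minimal sketch of how such a trade-off curve is obtained from a classifier's continuous outputs; the scores and labels below are made up for illustration:

```python
# Sweep a threshold over the classifier's continuous outputs and record
# the (FPr, FNr) trade-off at each operating point.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]   # illustrative classifier outputs
labels = [1, 1, 0, 1, 0, 0]               # 1 = positive, 0 = negative

def rates(threshold):
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return fp / labels.count(0), fn / labels.count(1)

# One operating point per distinct score, strictest threshold first.
curve = [rates(t) for t in sorted(set(scores), reverse=True)]
```

Lowering the threshold moves the operating point toward more False Positives and fewer False Negatives, which is exactly the trade-off the expert steers.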

The relationship between the operating point on the ROC curve and the decision boundary visualized using the Voronoi tessellation can be formally described as follows. Let V(p) be the Voronoi cell corresponding to a point p. For a set of points P we define V(P) = ⋃_{p∈P} V(p). For classifier B, let B_t be the decision boundary in n-dimensional space defined by the current operating point t. Further let d(p, B_t) be a signed measure indicating how far element p is from the boundary, where the sign of d is positive if p is classified to the positive class and negative otherwise. Given the set P of classified elements, the set of points classified to the positive class for the current operating point, P+_t, and its complement P−_t, can be described as follows:

P+_t = { p ∈ P | d(p, B_t) ≥ 0 }

P−_t = { p ∈ P | d(p, B_t) < 0 }

For two discrete points t1 and t2 on the ROC curve, where FPr_t1 < FPr_t2 (and consequently FNr_t1 > FNr_t2), the relation between the corresponding Voronoi tessellations is V(P+_t1) ⊂ V(P+_t2), with ⊂ denoting a proper subset. As for any t we have P = P+_t ∪ P−_t, any change in P+_t immediately leads to a change in P−_t, resulting in V(P−_t2) ⊂ V(P−_t1). We use the convention that with increasing x value of the operating point t we are accepting more False Positives, and with increasing y value of t we are accepting more False Negatives.
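The nesting of the positive sets across operating points can be illustrated with a threshold on signed distances; the distances below are hypothetical:

```python
# As the operating point moves to accept more False Positives (threshold
# decreases), the set of positively classified elements only grows:
# the stricter set is a proper subset of the more permissive one.
d = [0.9, 0.7, 0.4, 0.2, -0.1, -0.5]   # illustrative signed distances d(p, B)

def p_plus(threshold):
    return {i for i, s in enumerate(d) if s >= threshold}

p_plus_t1 = p_plus(0.5)   # stricter operating point (fewer False Positives)
p_plus_t2 = p_plus(0.0)   # more permissive operating point (more False Positives)
```

In the visualization this corresponds to Voronoi cells being added to the positively colored region as the operating point moves.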


From the above it follows that there is a set of points T containing all the discrete locations on the given classifier's ROC curve that correspond to the classifier's outcomes for different trade-offs. For each element in T the classifier's output is determined, and therefore we know which elements change their class membership. If we increase the rate of False Positives, then for some arbitrarily small ε:

PΔ_t = P+_{t+ε} − P+_t

In terms of the Voronoi visualization, if we have two subsequent elements t1, t2 ∈ T, we have

V(P+_t2) = V(P+_t1) ∪ V(PΔ_t2).

By moving the operating point to higher values of FP, Voronoi cells are added to the region displayed.

When decisions change between those discrete points t1 and t2, most likely only one or at most a few data instances will be assigned a different label. In such cases a change in color of the Voronoi regions is easier for the user to notice than just a change in label expressed by the color or shape of the data elements.

In order to enable the expert to steer the classiﬁcation model according to the

desirable trade-oﬀ, we can interactively move the operating point along the ROC curve.

We connect the interactive ROC curve to the visualizations of the scatterplots, as used

in (Migut and Worring, 2010). This is an instantiation of the connect interaction

technique, as proposed by (Yi et al, 2007). Since we want to visually observe what eﬀect

the change of trade-oﬀ has on the classiﬁer, we instantly visualize the Voronoi-based

decision boundary for the adjusted operating point in all the scatterplots displayed.

Moreover, we integrate some additional interaction techniques, as proposed by (Yi

et al, 2007). The user is able to interactively change the dimensions on the scatterplot

(reconﬁgure), so that he can examine all possible combinations of dimensions. We

enable the user to highlight the element of interest in the scatterplot (select), resulting

in a color change of the selected element. The user can also de-select an element if he

is no longer interested in it. These techniques, together with the connect interaction technique, allow the user to see the relation between the decision boundary and the selected

elements for all the visualized dimensions.

4 Visualization experiments

In the previous section we proposed an interactive visualization framework to explore the

most interesting characteristics of the decision boundary. To illustrate how the frame-

work can be applied, we conduct several visual experiments. Due to the subjective

nature of the problem, we limit ourselves to the question of how the decision bound-

ary visualization together with the histogram of distances and the interactive ROC

curve allow the user to perform the tasks listed in section 3.1. The tool to perform the

experiments is implemented in Protovis (Bostock and Heer, 2009). Protovis is a free

and open-source, Javascript and SVG based toolkit for web-native visualizations. The

Voronoi tessellation implementation in Javascript is the (Fortune, 1987) algorithm implemented by Raymond Hill. The classifiers are trained in Matlab, using the Toolbox for Pattern Recognition PRTools and the Data Description toolbox ddtools.

4.1 Experimental setup

For the visualization experiments we use two datasets, which are examples of experts' decision making problems. These datasets (Liver-disorders and Diabetes) have a limited number of dimensions, but exhibit a complex relation between features and class membership.

The Liver-disorders dataset from the UCI Machine Learning Repository consists of 345 objects described by 6 features. The objects are divided into two classes based on whether they do or do not have a liver disorder. The second dataset, Diabetes, also comes from the UCI Machine Learning Repository and consists of 768 objects described by 8 features. The Diabetes dataset was used to forecast the onset of diabetes. The data is divided into classes based on whether the object was tested positive or negative for diabetes. The Diabetes dataset was also used in the study of (Poulet, 2008).

For each of the datasets the following sequence of actions is performed. The dataset is divided into a training set (2/3) and a test set (1/3). Two arbitrary dimensions of both the training and the test set are first visualized using scatterplots. The axes of the scatterplots are interactive, so the user can browse through the scatterplots to explore all combinations of dimensions. Subsequently, a classifier is chosen, trained on the training set, and applied to the test set. The Voronoi-based decision boundary is visualized in the scatterplot of the currently chosen dimensions. The performance on the test set is visualized using an ROC curve, together with the current operating point.
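The sequence of actions above can be sketched as follows. This is an illustrative Python/scikit-learn stand-in for the paper's Matlab/PRTools pipeline, with synthetic data in place of the UCI sets:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import roc_curve

# Stand-in data; the paper uses the UCI Liver-disorders (345 x 6) and
# Diabetes (768 x 8) datasets.
X, y = make_classification(n_samples=345, n_features=6, random_state=0)

# 2/3 training set, 1/3 test set, as in the experimental setup.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=1/3, random_state=0)

# Choose a classifier, train on the training set, apply to the test set.
clf = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]

# The (fpr, tpr) pairs are the points of the ROC curve on which the
# operating point is placed.
fpr, tpr, thresholds = roc_curve(y_te, scores)
```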

The classifiers we chose to compare are 5-nearest neighbors, Fisher, and Support Vector Machine, representing different types of classifiers. The 10-fold cross-validation error rates, averaged over 3 repeats, for all examined classifiers and both datasets are listed in table 1. From a holistic point of view, the differences between the classifiers' performance are statistically insignificant. The classifiers do not, however, make the same mistakes, so it is worthwhile to explore which individual data instances are classified wrongly.
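The evaluation protocol behind table 1 can be sketched as follows. This is a hypothetical scikit-learn version with synthetic data; LinearDiscriminantAnalysis stands in for the Fisher classifier:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.svm import SVC

# Synthetic stand-in with the Liver-disorders shape (345 objects, 6 features).
X, y = make_classification(n_samples=345, n_features=6, random_state=0)

# 10-fold cross-validation over 3 repeats, as in table 1.
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=0)

classifiers = {
    "5-NN": KNeighborsClassifier(n_neighbors=5),
    "Fisher": LinearDiscriminantAnalysis(),  # stand-in for the Fisher classifier
    "SVM": SVC(),
}

errors = {}
for name, clf in classifiers.items():
    acc = cross_val_score(clf, X, y, cv=cv)  # 30 fold-accuracies per classifier
    errors[name] = 1 - acc.mean()
    print(f"{name}: error {errors[name]:.2f} (+/- {acc.std():.2f})")
```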

The classifier can be examined using the visualization of the Voronoi-based approximation of the decision boundary and through manipulation of the operating point on the ROC curve. As an example, we present visualizations of the Voronoi-based approximation of the decision boundary for several combinations of dimensions for the above-mentioned classifiers, for the LIVER dataset in figure 4 and for the DIABETES dataset in figure 5. All dimensions can be explored, but we show only a few combinations of dimensions to illustrate what the visualizations look like.

The operating point on the ROC curve can be manipulated for the classifier applied to the training set. The expert using the system can therefore tune the classifier to accept or reject a certain amount of positive or negative examples, adjusting the classifier's threshold according to the application's needs.
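Such tuning amounts to re-thresholding the classifier's scores: moving the operating point along the ROC curve picks a new threshold, and all elements are re-labeled against it. A minimal sketch with assumed toy scores:

```python
import numpy as np
from sklearn.metrics import roc_curve

# Hypothetical classifier scores and true labels on a training set.
y_true = np.array([0, 0, 0, 1, 1, 1, 0, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8, 0.65, 0.9, 0.7, 0.3])

# Each ROC point corresponds to one of these thresholds.
fpr, tpr, thresholds = roc_curve(y_true, scores)

def labels_at_operating_point(scores, threshold):
    """Re-label all elements for the chosen operating point (threshold)."""
    return (scores >= threshold).astype(int)

# Lowering the threshold accepts more positives (higher TP rate, at the
# price of a higher FP rate).
strict = labels_at_operating_point(scores, 0.75)
lenient = labels_at_operating_point(scores, 0.35)
```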

[1] http://www.raymondhill.net/voronoi/rhill-voronoi.php
[2] http://prtools.org/
[3] http://homepage.tudelft.nl/n9d04/ddtools.html
[4] http://mlearn.ics.uci.edu/databases/liver-disorders/
[5] http://mlearn.ics.uci.edu/databases/pima-indians-diabetes/


Fig. 4 The decision boundary visualization for the LIVER dataset classified using (a) SVM, (b) Fisher classifier, (c) 5-nearest neighbor classifier. For each classifier the same set of dimensions has been chosen; of all the possible combinations of dimensions that can be explored, these are chosen for illustration purposes. The original class membership is visualized in color: red objects are instances diagnosed with the liver disorder and blue objects are healthy instances. A circular shape indicates that the original label corresponds to the predicted label; a triangular shape indicates that the original label differs from the predicted label. The decision boundary visualized using the Voronoi-based approximation shows the class membership as assigned by the multi-dimensional classifier. The regions belonging to one of the classes are filled with green; the regions not filled belong to the second class.


Fig. 5 The decision boundary visualization for the DIABETES dataset classified using (a) SVM, (b) Fisher classifier, (c) 5-nearest neighbor classifier. For more details on the visualizations see the caption of figure 4.


Table 1 Performance of the selected classifiers for the Diabetes and Liver datasets, obtained with 10-fold cross-validation with 3 repeats.

Classifier               Cross-val error% (±std)
                         DIABETES       LIVER
5-Nearest Neighbors      0.28 (0.01)    0.33 (0.01)
Fisher                   0.11 (0.003)   0.22 (0.001)
Support Vector Machine   0.23 (0.005)   0.31 (0.01)

4.2 Results

In this section we show how the obtained visualizations and the functionality of the proposed methodology allow the user to perform the tasks stated in section 3.1. First of all, guidelines are given on how to read the visualizations, to prevent misinterpretation of the proposed visual representation of the decision boundary.

Since the data is projected onto two dimensions, the multi-dimensional structure of the dataset is not preserved. For some combinations of dimensions the visualized representation of the decision boundary might be highly fragmented. In some cases, even though the boundary separates the data perfectly in the multi-dimensional space, the boundary might be highly fragmented for all 2D projections of the features. This may wrongly be interpreted as overfitting of the classifier. It cannot be avoided if we want to plot the boundary in only two dimensions and in relation to the original features. An expert should be aware of this and keep it in mind while exploring the dataset and the classifier using these visual representations.

4.2.1 Analyze important dimensions (T1)

The visualization of pairs of dimensions allows the user to instantly identify which combinations of dimensions play an important role in the classification process. If the decision boundary is fairly simple, meaning that the number of piecewise-linear elements constituting it is limited, this implies that for this particular combination of dimensions the classifier separates the data well in the multi-dimensional space. Figure 5 illustrates how this task is performed using only the Voronoi-based representation of the boundary. For each classifier we can directly conclude that the combination of dimensions shown in (ii) indicates that one of the dimensions (f2) does not separate the data well. The same dimension in combination with f5, shown in (i), separates the data slightly better.

4.2.2 Analyze and compare how diﬀerent classiﬁers separate the data (T2)

Once we obtain a general idea about the importance of the dimensions, we can compare those interesting dimensions across different classifiers. The visualization of the decision boundary in relation to the data makes it clear which data elements are classified correctly and which wrongly. Once we can visually examine which data objects are on which side of the decision boundary, we can easily see for which data objects the classifiers differ in the assigned label. The user can directly observe the behavior of a classifier. Moreover, the classifiers can be compared, allowing the user to inspect specific generalization characteristics. Any data elements classified inconsistently by any


Fig. 6 The LIVER dataset for dimensions f3 and f2 and the (a) SVM and (b) Fisher classifiers. We zoom into the plots to explore a certain area of the data. On the histogram of the distances we highlight the bin corresponding to the correctly classified positive examples that are closest to the multi-dimensional decision boundary. The elements corresponding to those distances are highlighted on the scatterplots.

of the compared classifiers could be instantly detected and analyzed in more detail. The similarity of the models generated by different classifiers can thus be compared, providing more insight than accuracy alone. This means that even though two classifiers might have similar performance in terms of accuracy, it might be favorable to choose one classifier over the other, depending on the specific needs and knowledge of an expert. Figure 4 illustrates how this task is performed. From the overview of the


Fig. 7 The Voronoi-based approximation of the decision boundary for three different operating points for the LIVER dataset and the SVM classifier, with the corresponding error rates: (a) minimizing the FP rate (error rate = 30%); (b) the current operating point chosen by the classifier (error rate = 37%); (c) maximizing the TP rate (error rate = 42%).

combinations of dimensions for different classifiers, it can be directly seen that some data elements are classified differently by different classifiers. For example, in (ii) some easily noticeable differences between classifiers are highlighted.

4.2.3 Analyze the relation between boundary and data elements (T3)

In order to analyze the data elements in relation to the decision boundary, several interaction techniques are provided. First, the data element of interest can be highlighted and thereby traced in all plots, revealing all its characteristics. Its position with respect to the decision boundary can be established through the distance histogram. The interactively linked visualizations of the combinations of dimensions can be used to compare which label is assigned to the same data elements by different classifiers. We can therefore observe which data points are difficult to learn correctly: those data points which, regardless of the performance of the classifiers, are assigned a wrong label. Figure 6 shows how this task can be performed. We took two scatterplots from figure 4, namely (ii)(a) and (ii)(b), and zoomed into these two plots. On the histograms for both classifiers we selected the correctly classified positive examples closest to the multi-dimensional decision boundary. Those elements are highlighted on the scatterplots. We can thus compare at a high level of detail how elements are classified and how far they are from the decision boundary.
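The linked selection behind this task boils down to binning signed distances and mapping the chosen bin back to element indices. A minimal sketch with assumed toy distances (a positive distance meaning the predicted-positive side of the boundary):

```python
import numpy as np

# Hypothetical signed distances to the multi-dimensional boundary and labels.
dist = np.array([-2.1, -0.4, 0.2, 0.9, 1.7, -1.2, 0.1, 2.5])
y_true = np.array([0, 1, 1, 1, 1, 0, 0, 1])
y_pred = (dist > 0).astype(int)

# Correctly classified positive examples.
correct_pos = (y_true == 1) & (y_pred == 1)

# Bin their distances; the first bin holds those closest to the boundary.
counts, edges = np.histogram(dist[correct_pos], bins=4)
in_first_bin = correct_pos & (dist >= edges[0]) & (dist < edges[1])

# Indices to highlight in all linked scatterplots.
selected = np.flatnonzero(in_first_bin)
```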

4.2.4 Analyze and compare classiﬁcation costs (T4)

The costs for the current operating point of the classifier can be directly assessed through the classification error and observed in the visualizations of the decision boundaries. If we are not interested in the equal error rate, we might want to lower the number of false positives or, on the contrary, lower the number of false negatives. Since the operating point on the ROC curve is interactive, the costs of the classification


Fig. 8 Screenshot of combined visualizations to explore and analyze multi-dimensional decision boundaries in 2D. The components are: the ROC curve, the histogram of the distances to the multi-dimensional decision boundary, and a scatterplot showing labels and the Voronoi-based representation of the multi-dimensional boundary.

can be instantly updated. This results in an immediate update of the decision boundary visualization and of the distance histograms. Figure 7 shows how this task can be performed. Once the operating point is changed, the visualization of the boundary changes. To explore the elements that are assigned a different label after changing the operating point, we can look into the details of these points.

4.3 Framework

We have shown that, to perform the defined tasks by exploring and analyzing the decision boundary of the classifier, we can use the Voronoi-based representation of the boundary combined with the interactive histogram of the distances to the multi-dimensional boundary and the ROC curve with an interactive operating point. These elements should therefore be part of the user interface, e.g. as shown in figure 8.


5 Conclusions

This paper proposes a method to visually represent a multi-dimensional decision boundary in 2D. We formalized the characteristics of the classifier that should be captured by the visual representation of the decision boundary in 2D, namely Separation, Direction, and Distance. We defined four tasks that have to be performed by the expert: (1) analyze important dimensions, (2) compare different classifiers, (3) analyze the relation between the boundary and the data, and (4) compare classification costs. We thoroughly described why it is challenging to visually represent a multi-dimensional decision boundary in 2D while complying with the classifier's characteristics and allowing execution of the defined tasks. To realize our idea, we developed a system that couples the visualization of the dataset, a Voronoi-based visualization of the decision boundary, the histogram of the distances to the multi-dimensional decision boundary, and a visualization of the classifier's performance. We have shown that, using the Voronoi decomposition on two dimensions of classified data, we can visualize an approximation of the multi-dimensional decision boundary, expressing two characteristics of the boundary: Separation and Direction. This visualization is an approximation of the actual decision boundary and does not represent absolute distances between the data elements and the decision boundary. We compensate for this by visualizing the distances using a histogram, expressing the Distance characteristic of the classifier. This combination of techniques allows the analysis of the classifier's behavior and the visual assessment of the quality of the model. It also allows examination of characteristics of the dataset with respect to the classification model used. The proposed method is generic and can be used for different kinds of classifiers, allowing visual comparison among them. Moreover, such a visualized decision boundary can be explored for different trade-offs of the classifier by means of an ROC curve with an interactive operating point. Through visual examples, we have shown that using this methodology we can perform the four tasks corresponding to the challenges of the expert's decision-making process. In our approach we limited ourselves to two-class problems. However, the proposed methodology could be translated to multi-class problems. The challenging part of this translation would be to represent the performance using an ROC curve for a multi-class classifier. For c classes this could be realized by using a series of c ROC curves, one for each class versus all other classes. In this way, the proposed methodology generalizes not only over the classifiers used, but also becomes dataset independent. All the visualizations and interactions presented contribute to the ultimate goal of gaining insight into the classification problem at hand and using that insight to choose optimal classifiers.
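The proposed one-versus-rest extension could be sketched as follows. This is an illustrative Python/scikit-learn version with a synthetic 3-class problem; the classifier choice is arbitrary:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
from sklearn.preprocessing import label_binarize

# Hypothetical 3-class problem; one ROC curve per class (one versus rest).
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           n_classes=3, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X, y)
scores = clf.predict_proba(X)          # one score column per class
Y = label_binarize(y, classes=[0, 1, 2])

# A series of c ROC curves, each treating one class as positive
# and all remaining classes as negative.
curves = {}
for c in range(3):
    fpr, tpr, _ = roc_curve(Y[:, c], scores[:, c])
    curves[c] = (fpr, tpr)
```

Each of the c curves would then get its own interactive operating point in the interface.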

Acknowledgements This research is supported by the Expertise center for Forensic Psychiatry, The Netherlands.

References

Aurenhammer F (1991) Voronoi diagrams – a survey of a fundamental geometric data structure. ACM Computing Surveys 23(3):345–405

Bendix F, Kosara R, Hauser H (2005) Parallel sets: Visual analysis of categorical data. In: INFOVIS '05: Proceedings of the 2005 IEEE Symposium on Information Visualization


Bostock M, Heer J (2009) Protovis: A graphical toolkit for visualization. IEEE Trans

Visualization & Comp Graphics (Proc InfoVis)

Caragea D, Cook D, Honavar VG (2001) Gaining insights into support vector machine

pattern classiﬁers using projection-based tour methods. In: KDD ’01: Proceedings

of the seventh ACM SIGKDD international conference on Knowledge discovery and

data mining, pp 251–256

Cleveland W, McGill ME (1988) Dynamic graphics for statistics. Statistics/Probability

Series

Duda RO, Hart PE, Stork DG (2000) Pattern Classification. Wiley-Interscience Publication

Fortune S (1987) A sweepline algorithm for Voronoi diagrams. Algorithmica 2:153–174. DOI 10.1007/BF01840357

Hamel L (2006) Visualization of support vector machines with unsupervised learning. In: Proceedings of the 2006 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology

Spotfire Inc (2007) Spotfire. http://www.spotfire.com

Keim DA (2002) Information visualization and visual data mining. IEEE

Transactions on Visualization and Computer Graphics 8(1):1–8, DOI

http://doi.ieeecomputersociety.org/10.1109/2945.981847

Keim DA, Mansmann F, Schneidewind J, Thomas J, Ziegler H (2008) Visual analytics: Scope and challenges. pp 76–90

McClish DK (1989) Analyzing a portion of the ROC curve. Medical Decision Making

9(3):190–195

Migut M, Worring M (2010) Visual exploration of classification models for risk assessment. In: Proceedings of the IEEE Symposium on Visual Analytics Science and Technology (IEEE VAST), pp 11–18

Poulet F (2008) Towards Eﬀective Visual Mining with Cooperative Approaches.

Springer-Verlag, Berlin, Heidelberg

Provost F, Fawcett T (1997) Analysis and visualization of classifier performance: Comparison under imprecise class and cost distributions. In: Proceedings of the Third International Conference on Knowledge Discovery and Data Mining, AAAI Press, pp 43–48

Stolte C, Tang D, Hanrahan P (2002) Polaris: A system for query, analysis, and visualization of multidimensional relational databases. IEEE Transactions on Visualization and Computer Graphics 8(1):52–65

Swayne DF, Lang DT, Buja A, Cook D (2003) GGobi: evolving from XGobi into an extensible framework for interactive data visualization. Computational Statistics and Data Analysis 43(4):423–444

Thomas J, Cook K (2005) Illuminating the Path: The Research and Development

Agenda for Visual Analytics. IEEE CS Press

Ward MO (1994) XmdvTool: integrating multiple methods for visualizing multivariate data. In: VIS '94: Proceedings of the conference on Visualization '94, IEEE Computer Society Press, pp 326–333

Yan Z, Xu C (2008) Using decision boundary to analyze classifiers. In: 3rd International Conference on Intelligent System and Knowledge Engineering, 1:302–307

Yi J, Kang J, Stasko J, Jacko J (2007) Toward a deeper understanding of the role

of interaction in information visualization. IEEE Transactions on Visualization and

Computer Graphics 13(6):1224–1231