All content in this area was uploaded by Bernhard Jenny on Oct 20, 2017

ARTICLE

Automation and evaluation of graduated dot maps

Nicholas D. Arnold (a), Bernhard Jenny (b) and Denis White (a)

(a) College of Earth, Ocean, and Atmospheric Sciences, Oregon State University, Corvallis, OR, USA; (b) Faculty of Information Technology, Monash University, Melbourne, Australia

ABSTRACT

Dot mapping is a traditional method for visualizing quantitative data,

but current automated dot mapping techniques are limited. The

most common automated method places dots pseudo-randomly

within enumeration areas, which can result in overlapping dots and

very dense dot clusters for areas with large values. These issues affect

users’ ability to estimate values. Graduated dot maps use dots with

different sizes that represent different values. With graduated dot

maps the number of dots on a map is smaller, reducing the likelihood

of overlapping dots. This research introduces an automated method

of generating graduated dot maps that arranges dots with blue-noise

patterns to avoid overlap and uses clustering algorithms to replace

densely packed dots with those of larger sizes. A user study compar-

ing graduated dot maps, pseudo-random dot maps, blue-noise dot

maps and proportional circle maps with almost 300 participants was

conducted. Results indicate that map users can more accurately

extract values from graduated dot maps than from the other map

types. This is likely due to the smaller number of dots per enumera-

tion area in graduated dot maps. Map users also appear to prefer

graduated dot maps over other map types.

ARTICLE HISTORY

Received 28 April 2017

Accepted 22 July 2017

KEYWORDS

Dot map; graduated dot

map; blue-noise dot map;

thematic cartography

1. Introduction

Dot mapping is a method of cartographic symbolization for presenting quantitative

information. Dot maps are best used to display data of raw totals for an enumeration

area when the objective is to show that the underlying phenomenon is not uniform

throughout that area (Slocum et al. 2009). The primary purpose of dot maps is to depict

variation in spatial density patterns by varying distances between dots.

The selection of dot size is an important consideration when designing a dot map. Dots

that are too small can make a distribution seem sparse and insignificant, and dots that are

very large can make a distribution seem excessively dense (Robinson et al. 1995, p. 498). The

selection of a dot unit value, the numerical value represented by each dot, is equally

important. If the unit value is too large, no dots will be placed in areas with low quantities,

and if the unit value is too small, dots overlap to form large dark regions (Mackay 1949).

Traditionally, there are two divergent schools of thought (Monkhouse and Wilkinson 1978,

p. 27): some posit that dot sizes and values should be chosen such that the dots begin to

coalesce in the area with the highest density of dots (Dent et al. 2009, p. 125). They reason that

CONTACT Bernhard Jenny bernie.jenny@monash.edu

INTERNATIONAL JOURNAL OF GEOGRAPHICAL INFORMATION SCIENCE, 2017

https://doi.org/10.1080/13658816.2017.1359747

© 2017 Informa UK Limited, trading as Taylor & Francis Group

the primary purpose of dot maps is to depict variation in spatial density patterns by varying

distances between dots. Others argue that a complementary purpose of dot maps is to enable

readers to extract raw totals, and therefore, dots should not touch so that their number is easy

to estimate or count (Imhof 1972, Dent et al. 2009, Hey 2012). Extracting raw totals is only

possible if (a) the dot unit value is simple to sum and multiply and (b) dots can be counted or

accurately estimated for each enumeration unit. Dot overlaps often cannot be avoided when

outliers with high values are present, and dots are often too numerous to be counted.

Extracting raw totals from dot maps is therefore often very difficult or impossible.

Dot mapping algorithms in commonly available software do not allow the cartogra-

phers to control dot overlap. Most automated methods rely on pseudo-randomly placing

dots, which can lead to artiﬁcial local clusters of overlapping or coalescing dots. This

clustering can be misleading, as it suggests a spatial pattern in the data that does not exist.

A more regular dot distribution is preferable, if no additional information is available

about the distribution of the mapped phenomenon inside individual enumeration areas.

Graduated dot maps improve upon conventional dot maps by addressing issues of

countability of large numbers of dots and dot overlap. Graduated dot maps use a

number of increasing dot sizes, each of which corresponds to a larger unit value.

When creating a graduated dot map, the cartographer selects the number of dot classes,

the dot values and the dot sizes. As an example, the graduated dot map in Figure 1 uses

three dot sizes to visualize a spatial distribution that varies between very dense and very

sparse areas. The large dots clearly illustrate pockets of high concentration, while the

small dots eﬀectively illustrate the sparse presence of the phenomenon along valley

bottoms, for example in the southeast of the map in Figure 1.

Conceptually, there are two key advantages to graduated dot maps. First, both

small and large enumeration values can be mapped simultaneously, because small

dots are placed in sparse areas, and large dots do not overlap and remain distin-

guishable, even in the densest areas. Second, graduated dot maps depict both

density patterns and raw totals. Because they have fewer dots than conventional

dot maps, there are fewer dots to examine and sum when extracting raw totals.

Figure 1. Graduated dot map with three dot sizes for 200, 1000 and 5000 swine (Extent shown

approx. 220 × 105 km. © Atlas of Switzerland (1977), sheet 51, www.atlasofswitzerland.ch).

2N. D. ARNOLD ET AL.

Reading of raw totals can be expected to be more accurate from graduated dot maps

than conventional dot maps.
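The arithmetic behind reading a raw total from a graduated dot map can be sketched in a few lines. The unit values below are those of the legend in Figure 1; the dot counts and the function name `raw_total` are hypothetical and serve only as an illustration.

```python
def raw_total(dot_counts, unit_values):
    """Sum count * unit value over all dot classes of one enumeration area."""
    if len(dot_counts) != len(unit_values):
        raise ValueError("one count is needed per dot class")
    return sum(count * value for count, value in zip(dot_counts, unit_values))

# Unit values from the legend of Figure 1 (200, 1000 and 5000 swine per dot).
unit_values = [200, 1000, 5000]
# Hypothetical reading: 4 small, 2 medium and 1 large dot in the area of interest.
total = raw_total([4, 2, 1], unit_values)

# A conventional dot map with the smallest unit value would need this many dots
# to show the same total, compared with 7 graduated dots.
conventional_dots = total // unit_values[0]
```

In this made-up case the reader sums seven dots instead of counting several dozen, which is the effect the paragraph above describes.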

Despite these advantages, the lack of both available software and algorithmic methodology

described in the literature has hindered widespread adoption. Most cartographic textbooks that

discuss dot maps do not consider graduated dot maps (for example, Robinson et al. 1995,

Dent et al. 2009, Slocum et al. 2009, Tyner 2010) or do so only briefly (Hake et al. 2002, Kraak

and Ormeling 2011). An exception is Imhof (1972), who discusses design considerations for

the combination of graduated dot maps with area features and proportional diagrams, but

does not provide a methodology or algorithm for creating graduated dot maps.

The goal of this research is to propose a methodology for creating graduated dot maps, to

evaluate their performance compared to conventional dot maps and area-proportional circle

maps, and to test whether users prefer graduated dot maps to other visualization techniques.

The proposed algorithmic method for creating graduated dot maps does not pseudo-

randomly place dots, but arranges dots in a distribution that exhibits blue-noise characteristics,

wherein dots have ‘a large mutual distance and no apparent regularity artifacts’

(Balzer et al. 2009, p. 86:7). Regions of dense coalescing dots are identified using a

clustering algorithm and are replaced with dots of a larger size and unit value. Details of

this method, including blue-noise and clustering algorithms, are presented in Section 3.

A user study with almost 300 participants was conducted to evaluate graduated dot

maps compared to random dot maps, dot maps with blue-noise patterns and area-

proportional circle maps. The user study is presented in Section 4. Results of the user

study indicate that graduated dot maps are the preferred method and outperform

conventional dot maps.

2. Literature review

2.1. Graduated dot maps

It is not clear when the ﬁrst graduated dot map was produced. Robinson (1982)

discusses a map by Petermann (1857) that shows the population of Transylvania

using a mixture of area-proportional circles and dots, which illustrate the population

distribution of towns and farmsteads. However, Petermann’s map is a hybrid

between a dot map and an area-proportional symbol map and not a pure graduated

dot map. Various cartographers have used graduated dot maps since then. An

exemplary set of maps is included in the Atlas of Switzerland (1977) showing farm

animals (Figure 1).

Various manual and automated techniques have been proposed for conventional dot

mapping, but no digital technique for generating graduated dot maps has been described.

2.2. Dot map readability

User studies about dot maps have largely focused on user perception of dots. According to

Taves (1941), when small numbers of dots were present, users were able to accurately

estimate dots; however, when seven or more dots were present, user accuracy decreased

and dots were underestimated. Olson (1975) and Provin (1977) tested users’ ability to estimate

dot density, and found that underestimation of dot number is all but universal. Mashoka et al.

(1986) compared readability and preference of dot maps versus proportional circle maps.

Their ﬁndings further demonstrated that users underestimate numbers of dots and that

proportional circle maps were favored over dot maps for their accuracy and simplicity.

Note that accurate estimation of dot densities requires an equal-area projection

for maps at small scales to avoid misleading distortion of dot spacing (Jenny et al. 2017).

2.3. Design principles for dot maps

When dots are placed manually, cartographers generally use one of three approaches:

uniform, geographically weighted and geographically based (Slocum et al. 2009, p. 322). In

the uniform technique, dots are placed uniformly across the enumeration area (Mackay 1949).

The uniform distribution technique creates choropleth-like maps, where dot patterns create

the impression of shaded enumeration areas. It is also valuable for bivariate or multivariate

maps when area color is used to depict different information. This method has the disadvantage

that boundaries of enumeration areas are easily detected, and the geographic distribution

inside enumeration areas is not represented. In the geographically weighted approach,

dots are placed such that they are shifted closer to neighboring enumeration areas of higher

value, which creates the impression of a continuous phenomenon being mapped because

enumeration area boundaries are less visible. The geographically based approach places dots

as a function of ancillary data such as land cover information. Although these methods were

common in manual cartography, automated techniques that apply these three approaches

are not widely available.

Multi-color dot maps were suggested by Jenks (1953) and Thomas (1955). This

technique eﬀectively shows multiple distributions with dots of diﬀerent colors, as

shown by Rogers and Groop (1981).

Prior to 1949, cartographers selected dot size, unit value and placement without tools

to assist them. In a landmark study, Mackay (1949) developed a nomograph to assist in

the selection of dot size and value. With the nomograph, a ratio of dot size to unit value

is identiﬁed that Mackay calls ‘the zone of coalescence’, which enables cartographers to

estimate the point at which dots will begin to coalesce, given the size and number of

dots per square inch. Although the nomograph has been a widely used tool in carto-

graphy, Kimerling (2009) points out that the nomograph has ‘serious drawbacks in the

modern age of computer cartography’. He extends Mackay’s nomograph to include an

automated method to deﬁne the amount of dot overlap.

2.4. Digital methods for creating dot maps

The most common automated method of producing dot maps is to pseudo-randomly

place dots in an enumeration area. This method uses a random number generator to

calculate coordinates of dot locations. Random dot placement is not a common

approach in manual cartography, as it can lead to unrealistic clusters and gaps in the

dot pattern that imply nonexistent spatial patterns (Slocum et al.2009). Dent et al.

(2009) recommend the use of zones of exclusion, which follows the geographically

based approach for dot placement. Zones of exclusion are created with ancillary data

to deﬁne regions where dots are not to be placed.
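The pseudo-random placement described above can be sketched with rejection sampling: candidate coordinates are drawn from the bounding box of an enumeration area and kept only if they fall inside its polygon. This is a minimal illustration, not the implementation of any particular software package; the function names, the rounding of the dot count and the triangular test polygon are assumptions made for the example.

```python
import random

def point_in_polygon(x, y, polygon):
    """Even-odd ray casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def random_dots(polygon, value, unit_value, rng):
    """Place round(value / unit_value) dots pseudo-randomly inside the polygon."""
    n_dots = round(value / unit_value)  # rounding rule is an assumption
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    dots = []
    while len(dots) < n_dots:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, polygon):  # reject candidates outside the area
            dots.append((x, y))
    return dots

# Hypothetical enumeration area with a value of 5000 and a unit value of 200.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
dots = random_dots(triangle, 5000, 200, random.Random(1))
```

A zone of exclusion could be handled in the same loop by additionally rejecting candidates that fall inside an exclusion polygon.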

If the size of dots is minimized and their number maximized, dot maps are perceived as

relative shades of gray, resulting in dot-density shading. Lavin (1986) introduced this

technique based on Jenks’ (1953) pointillism technique. Dot-density shading does not

assign a specific unit value to dots; information can only be derived through dot numerousness

and spacing. Lavin’s method is well suited for geographically continuous distributions,

but it is not intended for extracting data values, because dots cannot be perceived as

discrete symbols. The texture is very coarse, and the technique is not commonly used.

Based on the suggestion that overlap interferes with countability, Hey (2012) pro-

posed a method to produce dot maps with spiral patterns that do not have overlapping

dots. An Archimedean spiral pattern is used to determine the placement of ‘spiral arms’,

wherein multiple curves of dots are placed such that dots radiate out from a single

point. Although dots do not overlap, the dot clusters still have a very regular appearance.

Hey and Bill (2014) refined the spiral-inspired method by introducing a new dot

arrangement, addressing the regular appearance. The method defines large dots as

regional boundaries for potential dot positions. These larger dots reserve the space in

which smaller real dots may wander. When calculating the final dot positions on the

map, dots are shifted within the reserved space for potential dots to reduce pattern

regularity. Dots are allowed to touch but do not overlap.

De Berg et al. (2004) studied a problem relating to dot numerousness that has utility

in dot mapping: given a point set representing a certain distribution, how can it be

automatically simplified, generating a smaller point set? They tested (1) iterative algorithms

and (2) clustering algorithms to simplify a point set and generate an approximation

of the original dots with the smallest error. The tested clustering algorithms require

the desired number of clusters as an input parameter. For our research this is not

practical, as the method of ﬁnding clusters in dots should not require the user to

predeﬁne the number of clusters.

Graphical design principles for generalizing dot maps have been studied by Spiess

(1990), and Yan and Weibel (2008) have developed an algorithm for point cluster

generalization. Yan and Weibel treat four basic types of information including statistical,

metric, thematic and topological information. The primary objective is to ensure that the

four types of information are transmitted from the original data to the generalized

result. Based on Voronoi diagrams, the method follows three basic procedures: (1)

compute a distribution range, which deﬁnes the area that dots are potentially placed

within; (2) delete dots based on their selection probability; and (3) determine the

number of dots in the ﬁnal set. The algorithm presents a potential approach to general-

izing clusters in graduated dot mapping applications.

3. Method for creating graduated dot maps

3.1. Method overview

The proposed method for creating graduated dot maps starts by creating a dot map with a

pseudo-random distribution. The number of dots per enumeration area is calculated, and the

dots are placed randomly within their respective polygons. The pseudo-randomly

placed dots are then rearranged in a blue-noise pattern with the capacity-constrained

Voronoi tessellation (CCVT) algorithm by Balzer et al. (2009). CCVT disperses

dense groups of dots such that their distribution is uniform but randomized, while maintaining

the density distribution of the original dots. Blue-noise patterns and the CCVT

algorithm are discussed in the following subsection. Blue-noise dots can still overlap in

areas where dots are densely clustered. Next, the density-based spatial clustering of

applications with noise (DBSCAN) algorithm introduced by Ester et al. (1996) is used to

identify dense clusters of dots. The dots identified by the DBSCAN algorithm are removed

from the set of dots and are used in another iteration of the CCVT blue-noise algorithm to

create the next class of dots of a larger size and unit value. This process is repeated as many

times as there are sizes of dots. In the end, the different classes of dots are combined for the

final map. The DBSCAN clustering algorithm and its combination with the blue-noise

algorithm are described in two separate subsections.

The user of this method needs to provide the number of dot classes, the dot sizes and

the unit values. The dot diameters should be sufficiently different to create a clear visual

difference between classes.

3.2. Creating a blue-noise dot pattern

The term blue noise refers to an isotropic, yet unstructured distribution of points (de

Goes et al. 2012). This distribution exhibits a spectral density distribution with minimal

low frequency components and no spikes in power, resulting in dots that have ‘a large

mutual distance and no apparent regularity artifacts’ (Balzer et al. 2009). Blue-noise

sampling distributions can be generated with various approaches, and they have

many applications in computer graphics, such as photorealistic rendering, computer-

generated artistic stippling or texture synthesis (Pharr and Humphreys 2004, Lagae and

Dutré 2008, Yan et al. 2015). Blue-noise sampling distributions have useful perceptual

characteristics that we utilize for creating dot maps (Figure 2). Because blue-noise

distributions have well-dispersed dots, we are able to avoid artificial local clusters of

dots that can be created by pseudo-random placement.

We use the CCVT algorithm, as proposed by Balzer et al. (2009), to produce blue-noise

patterns on dot maps. This particular algorithm provides three important functions for

the creation of graduated dot maps because it (1) reduces the number of dots; (2)

optimizes the distribution of dots; and (3) maintains the density distribution of the

original dots. The property of reducing the number of dots is important because it is

used to develop the next larger dot class. The CCVT partitions space into Voronoi

regions from an initial random distribution and iteratively optimizes the placement of

dots. Balzer et al. (2009, p. 86.1) note that the number of iterations has a direct effect on

the quality of the distribution of dots, pointing out that ‘if the method is not stopped at

a suitable iteration step, the resulting point distributions will develop regularity artifacts’.

To avoid these regularities and stop the algorithm, Balzer et al. (2009) introduce a

‘capacity-constraint’, which simultaneously reduces the number of dots for a region

while maintaining the original density of the region. The capacity-constraint, which is

the factor by which the number of dots is reduced, is a modifiable parameter k that we

use to reduce the number of dots.
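CCVT itself is too involved for a short listing. To give a feel for the blue-noise property of well-dispersed dots, the sketch below uses a different and much simpler technique, Mitchell's best-candidate sampling, as a stand-in. It is not CCVT: it does not reduce an existing dot set by a factor k and does not preserve an input density. All names and parameter values are assumptions chosen for illustration.

```python
import math
import random

def best_candidate_dots(n_dots, n_candidates=20, seed=0):
    """Best-candidate sampling in the unit square (a stand-in, NOT CCVT).

    Each new dot is the candidate, out of n_candidates random ones, that lies
    farthest from all dots placed so far, which spreads the dots out.
    """
    if n_dots <= 0:
        return []
    rng = random.Random(seed)
    dots = [(rng.random(), rng.random())]
    while len(dots) < n_dots:
        best, best_dist = None, -1.0
        for _ in range(n_candidates):
            candidate = (rng.random(), rng.random())
            nearest = min(math.dist(candidate, p) for p in dots)
            if nearest > best_dist:
                best, best_dist = candidate, nearest
        dots.append(best)
    return dots

def min_spacing(dots):
    """Smallest distance between any two dots; larger means better dispersed."""
    return min(math.dist(a, b) for i, a in enumerate(dots) for b in dots[i + 1:])

blue_like = best_candidate_dots(50)
spacing = min_spacing(blue_like)
```

Comparing `min_spacing` for such a pattern with that of purely random coordinates is one simple way to quantify the difference that Figure 2 shows visually.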

3.3. Identifying dot clusters

The DBSCAN clustering algorithm was proposed by Ester et al. (1996) and has several

advantages over other clustering algorithms for this application. Many clustering algorithms

require the number of desired clusters as an input parameter, but in the case of identifying

spatial dot clusters, the number of clusters is unknown. Another advantage of DBSCAN is that

the algorithm can discover clusters of arbitrary shape. DBSCAN is a density-based clustering

algorithm. It identiﬁes clusters of points based on their mutual distance and the number of

nearby neighboring points. The result of the algorithm is a set of point clusters and a set of

noise points. Noise points are in low-density regions and are not part of any cluster.

DBSCAN requires two parameters: a search radius ε and the minimum number of dots

μ required to form a cluster. A point is added to a cluster if its distance to a cluster point

is smaller than the search radius ε. If ε is identical to the dot diameter, dots may touch

but not overlap. ε can be adjusted to allow overlapping or spaced dots.

The parameter μ sets how many mutually close dots are needed to

constitute a cluster. For example, if the minimum number of dots μ is four, then three

dots that coalesce or overlap will not be identified as a cluster. We set μ to two, which

means that only two dots are required to form a cluster, disallowing any dot overlap.

The DBSCAN clustering algorithm uses the concept of core points (Ester et al. 1996). A

point is a core point if it is among at least μ points with mutual distances shorter than ε.

A random point is selected as the starting point and if there are at least μ − 1 points

closer than ε, the points are identified as core points, and the first cluster is identified.

Otherwise, the point is considered noise. The algorithm iterates through all points until

each point is identified as either a cluster or noise point. Points can be classified as noise

or they can be added to existing or new clusters.
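The following is a minimal brute-force sketch of DBSCAN as described in this subsection, written for illustration rather than taken from the authors' software. The parameter names `eps` and `min_pts` stand for ε and μ; the strict comparison `< eps` is an assumption that follows the statement that touching dots (distance equal to the dot diameter) are not clustered.

```python
import math

NOISE = -1

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster index (0, 1, ...) or NOISE."""
    labels = [None] * len(points)  # None means not yet visited

    def neighbors(i):
        # Other points strictly closer than eps (brute force, O(n) per query).
        return [j for j, q in enumerate(points)
                if j != i and math.dist(points[i], q) < eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nearby = neighbors(i)
        if len(nearby) < min_pts - 1:   # fewer than mu - 1 neighbors: not a core point
            labels[i] = NOISE
            continue
        cluster += 1                    # start a new cluster at this core point
        labels[i] = cluster
        queue = list(nearby)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:      # former noise point becomes a border point
                labels[j] = cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nearby_j = neighbors(j)
            if len(nearby_j) >= min_pts - 1:  # j is a core point: keep expanding
                queue.extend(nearby_j)
    return labels

# Dot diameter 1.0: the first two dots overlap, the third only touches the
# second, and the fourth is isolated.
dots = [(0.0, 0.0), (0.5, 0.0), (1.5, 0.0), (5.0, 5.0)]
labels = dbscan(dots, eps=1.0, min_pts=2)

# With mu = 4, three coalescing dots are not identified as a cluster.
labels_mu4 = dbscan([(0.0, 0.0), (0.3, 0.0), (0.6, 0.0)], eps=1.0, min_pts=4)
```

With ε equal to the dot diameter and μ set to two, every dot that overlaps another dot ends up in a cluster, and only non-overlapping dots remain as noise.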

3.4. Combining CCVT blue-noise and DBSCAN clustering algorithms

The CCVT blue-noise algorithm and the DBSCAN clustering algorithm are used iteratively

to produce a graduated dot map. The input for this combined algorithm consists of m enumeration areas with values $v_m$ to be mapped, and n dot unit values ($d_n$) ordered in increasing order. The output is n sets ($s_n$) of dot coordinates.

Figure 2. Dots with a pseudo-random distribution (left) and with blue-noise pattern (right).

An initializing step produces pseudo-random dot locations for each enumeration area.

The CCVT blue-noise algorithm is then used to reduce the number of dots by a factor $k_0$, so that the dot unit value $d_0$ for the dots in set $s_0$ is $d_0 = d_1 / k_0$. For example, if the unit value of the smallest class of dots ($d_1$) is to be 2000 units and $k_0$ is 2, the dot value $d_0$ is 1000. The number of pseudo-randomly placed dots per enumeration area is $v_m / d_0$.
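The initialization arithmetic can be written out as a small helper. The unit value of 2000 and the factor of 2 come from the example in the text; the enumeration area values, the function name and the use of rounding to obtain whole dots are assumptions for illustration.

```python
def initial_dot_counts(area_values, d1, k0=2):
    """Return the initial unit value d0 = d1 / k0 and the dot count v / d0 per area."""
    d0 = d1 / k0
    # Rounding to whole dots is an assumption; the text only gives v / d0.
    counts = [round(v / d0) for v in area_values]
    return d0, counts

# Example from the text: d1 = 2000 and k0 = 2 give d0 = 1000.
# The three enumeration area values are hypothetical.
d0, counts = initial_dot_counts([12000, 400, 3000], d1=2000, k0=2)
```

An area whose value is well below the initial unit value receives no dot at all, which is the low-value problem of a large unit value noted in the introduction.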

The following procedure is then iteratively applied to create the n output dot sets: the CCVT blue-noise algorithm is run on $s_n$ with the reduction factor $k_n$. This results in a reduction of the number of dots in $s_n$ by a factor of $k_n$. Dots in $s_n$ have blue-noise characteristics and are potentially arranged in dense clusters. The DBSCAN clustering algorithm is run on the dots in set $s_n$, which marks each dot as either belonging to a cluster or occurring as a noise dot. The default search radius $\varepsilon_n$ for the DBSCAN clustering algorithm is identical to the diameter of the dots in set $s_n$. Dots that are part of a cluster are removed from $s_n$ and added to $s_{n+1}$. The final set $s_n$ only contains noise dots. The reduction factor for the next iteration is computed with $k_{n+1} = d_{n+1} / d_n$. This procedure is executed n times, creating the sets $s_n$. Note that for the last set the DBSCAN clustering algorithm is not run and no clustered dots are removed. The user selects the number n of dot classes, the dot unit values $d_n$ and the dot size for each class.

The amount of overlap or distance among dots of one class can be controlled by adjusting the DBSCAN search radius $\varepsilon_n$ for each class. If the search radius $\varepsilon_n$ is identical to the diameter of the dots of set $s_n$, the application of the DBSCAN clustering algorithm guarantees that dots of the same class do not overlap.
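The control flow of the combined procedure can be sketched as a loop with two pluggable steps. This is a reading of the description above, not the authors' code. In particular, `merge_runs` is a crude placeholder that only mimics the dot-count reduction of CCVT (it does not produce blue noise), `clustered_indices` is the special case of DBSCAN with a minimum of two dots, and the zero-based class index `n` is an indexing choice made for the example. All names and test data are hypothetical.

```python
import math

def merge_runs(dots, k):
    """Placeholder reducer, NOT CCVT: each run of k sorted dots becomes its centroid."""
    k = int(round(k))
    dots = sorted(dots)
    merged = []
    for i in range(0, len(dots) - k + 1, k):  # leftover dots (fewer than k) are dropped
        run = dots[i:i + k]
        merged.append((sum(x for x, _ in run) / k, sum(y for _, y in run) / k))
    return merged

def clustered_indices(dots, eps):
    """Indices of dots with another dot closer than eps (DBSCAN with mu = 2)."""
    return {i for i, p in enumerate(dots)
            if any(j != i and math.dist(p, q) < eps for j, q in enumerate(dots))}

def graduated_dot_sets(s0, unit_values, diameters, k0,
                       reduce_dots=merge_runs, find_clustered=clustered_indices):
    """Return one list of dot coordinates per dot class, smallest class first."""
    sets = []
    pending, k = s0, k0
    last = len(unit_values) - 1
    for n, diameter in enumerate(diameters):
        current = reduce_dots(pending, k)            # reduce the dot count by factor k
        # Clustering is skipped for the last class, so no dots are removed from it.
        clustered = find_clustered(current, diameter) if n < last else set()
        sets.append([p for i, p in enumerate(current) if i not in clustered])
        if n < last:
            pending = [p for i, p in enumerate(current) if i in clustered]
            k = unit_values[n + 1] / unit_values[n]  # reduction factor for next class
    return sets

# Hypothetical input: 20 densely packed dots and 4 scattered dots, each worth 500.
dense = [(0.1 * i, 0.0) for i in range(20)]
sparse = [(100.0, 0.0), (100.2, 0.0), (200.0, 0.0), (200.2, 0.0)]
sets = graduated_dot_sets(dense + sparse, [1000, 5000], [1.0, 3.0], k0=2)
mapped_total = sum(len(s) * d for s, d in zip(sets, [1000, 5000]))
```

In this toy input the dense group ends up as two large dots and the scattered dots as two small ones, and the mapped total equals the 24 initial dots times their unit value of 500.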

Although it is possible to produce many classes of dot sizes, we have found that three classes is a good number to avoid confounding users’ ability to differentiate between dot sizes. However, we have not evaluated this choice in the user study. The reduction factor for the initializing step $k_0$ is the only parameter that users cannot be expected to specify. In our experiments, we used $k_0 = 2$, which produced visually satisfactory results.

Figures 3–5 illustrate the algorithm. In the initialization step, the set $s_0$ of pseudo-randomly distributed dots is produced (Figure 3 left). The CCVT blue-noise algorithm then reduces the number of dots in $s_0$ by half and stores them in $s_1$; the unit value is $d_1$, and the distribution has blue-noise characteristics (Figure 3 right). The iterative procedure of identifying clusters and replacing clustered dots with fewer, larger dots begins with using the DBSCAN clustering algorithm to identify clusters from $s_n$ (Figure 4 left). Once these dots are identified, they are removed from $s_n$ and added to $s_{n+1}$. Dots from $s_{n+1}$ are used for the next iteration. Figure 4 (right) shows the result of the next iteration: the dots in $s_n$ from the first iteration were retained and the dots from $s_{n+1}$ were run through the algorithm again. Figure 5 shows the final result with three dot size classes. The final map in Figure 5 uses fewer points in high-density areas than Figure 4 (right), but the enlargement of the dots visually compensates for the smaller number of dots. This allows Figure 5 to still show the geometry of high-density areas. Note that the smallest dots in Figure 5 are identical to the black dots in the left map of Figure 4. A graduated dot map therefore shows the same number of small dots in areas with sparse data as a pseudo-random dot map, and the smallest dots in a graduated dot map have the same spatial distribution as a single-class blue-noise dot map.

4. User evaluation

The objectives of the user study were to evaluate estimation accuracy and user preferences

of graduated dot maps compared to other map types. Because graduated dot

maps reduce the number of dots, we hypothesize that users will more accurately

estimate values for graduated dot maps than the other map types, and that users will

prefer graduated dot maps over the other map types. The user study compared four

map types: dot maps with a pseudo-random dot distribution, dot maps with a blue-

noise dot distribution, graduated dot maps and area-proportional circle maps.

Perceptual scaling was applied to the area-proportional circle maps. Proposed by J.J.

Flannery, perceptual scaling allows for compensation of the expected value underestimation

(Flannery 1971, Dent et al. 2009, Slocum et al. 2009). The graduated dot maps in

Figure 3. Pseudo-random dots for the initialization step (left); CCVT algorithm reduces the number

of dots by a factor of two and creates a blue-noise distribution (right).

Figure 4. Clusters identiﬁed by the DBSCAN algorithm in red (left). Clustered points are replaced by

the CCVT blue-noise algorithm resulting in an intermediate map with two classes (right).

the user study were created with the described method, and census tract population

data for various regions of the United States were used. All maps and questions included

in this user study, as well as collected results, are documented in Arnold (2015).
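Perceptual scaling of proportional circles can be illustrated with a short calculation. With purely area-proportional scaling the circle radius grows with the square root of the value (exponent 0.5); Flannery's compensation is commonly cited with a larger exponent of 0.5716. The text above does not state which exponent was used in the study, so the default below, the function name and the sample values are assumptions.

```python
def circle_radius(value, ref_value, ref_radius, exponent=0.5716):
    """Radius of a proportional circle relative to a reference circle.

    exponent=0.5 gives area-proportional scaling; 0.5716 is the commonly
    cited Flannery exponent (assumed here, not confirmed by the text).
    """
    return ref_radius * (value / ref_value) ** exponent

# Hypothetical legend circle: value 100 drawn with radius 10.
mathematical = circle_radius(400, 100, 10, exponent=0.5)
perceptual = circle_radius(400, 100, 10)
```

For a value four times the reference, the perceptually scaled circle is drawn somewhat larger than the area-proportional one, which counteracts the tendency of readers to underestimate large circles.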

The user study was built with the Qualtrics survey platform, and participants were

recruited via Amazon Mechanical Turk, a web-based crowdsourcing service where users

complete tasks or surveys and receive micro-payments. Heer and Bostock (2010) found

that crowdsourcing is viable for testing graphic perception and provides high-quality

responses. Respondents were paid $1.00 for completing the survey. The user study

consisted of a demographic survey, a short map-reading tutorial, a series of timed

map-reading tasks, a map preference survey and the question whether participants

attempted to count or estimate dots. Users were not permitted to go back to any

questions once a response was submitted. For the timed map-reading tasks, participants

were shown dot maps with pseudo-random distributions, dot maps with blue-noise

distributions, graduated dot maps and area-proportional circle maps. The preference

questions evaluated each of the map types for clarity and preference.

The demographic survey collected information regarding participants’ gender, age,

country of residence and education level. Following the demographic survey, users

completed a tutorial explaining how to read conventional dot maps, graduated dot

maps and area-proportional symbol maps and showing a legend for each map type. Two

untimed example questions were shown to familiarize users with the questions and then

two timed questions were shown. Participants could repeat the tutorial if desired.

The map-reading tasks included a total of 45 dot maps, including 15 maps with

pseudo-random dots, 15 maps with blue-noise dots and 15 maps with graduated dots.

Because graduated dots are similar to area-proportional circles, ﬁve area-proportional

circle maps were included for comparison. Each dot map had one area highlighted in

gray, and users were asked to estimate the value represented by dots for this area. Each

area-proportional symbol map had one circle highlighted in gray, and users were asked

to estimate its value. All pseudo-random and blue-noise dot maps used the same unit

value (200). All graduated dot maps used the same three unit values (1000, 10,000 and

Figure 5. Final graduated dot map with three classes.

100,000). Participants were shown all maps of one type before being shown the maps of

other types. The order of type groups was randomized, and the maps within each type

group were randomized.

For each of the three dot map types, there were 10 maps with realistic administrative

boundaries (two examples are shown in Figure 6) and five maps overlaid with a regular

grid (an example is shown in Figure 7). To minimize learning effects, the mapped values

were changed for each map, the enumeration areas of five maps were replaced with a

regular grid and maps with administrative boundaries were rotated.

For each map, participants were given 10 s to view the map and were

asked to estimate the value of the gray area (see gray areas in Figures 6 and 7). Viewing

time was limited to 10 s such that participants had to estimate rather than count dots in

enumeration areas with many dots. After 10 s elapsed, the map disappeared and

respondents were required to enter their estimate. The legends for all maps of one

group were identical and were shown before the timed maps appeared and during the

Figure 6. A pseudo-random dot map (left) and a graduated dot map (right) for the user study.

Subjects are asked to estimate the value for the gray areas. Geometry is rotated for the second map.

Figure 7. Example of graduated dot map with grid overlay for the user study. Subjects are asked to

estimate the value for the gray square.

timed task, which prevented participants from losing time to familiarize themselves with

the legend. Each legend for the single-size dot maps showed the individual dot value, as

well as a sample of three varying densities of dots (Figure 6 left). Samples of varying

densities were used because Provin (1977) tested the eﬀects of legends on estimating

dot values and noted that users showed a marked improvement in average estimates

when such legends were present.

Five additional test maps with easily readable values were placed throughout the survey.

Four test maps were dot maps with ﬁve to seven dots in the enumeration area of interest;

one test map showed area-proportional circles with the test circle being identical to one of

the circles in the legend. The test maps were used to eliminate responses from participants

who entered random values rather than attempting to estimate values.

For the ﬁrst map preference question, participants were shown two maps of each

type individually and asked to rate each map from 1 to 5 based on ‘Clarity and Legibility’

and ‘Preference & Appeal’. A rating of 5 for both questions indicates that the map is

‘Very Clear’ and ‘Very Appealing’, whereas a rating of 1 for both questions indicates ‘Not

Clear’ and ‘Not Appealing’. The second type of map preference question showed

participants two 2 × 2 image matrices (Figure 8) containing each map type, and

participants were asked to rank the maps on each matrix. Responses ranged from 1 to

4, with 1 being their ‘favorite’ and 4 being their ‘least favorite’.

4.1. User evaluation results

Of the 420 participants in the user study, 123 did not correctly respond to at least three

of the ﬁve trivial test maps. The results of these respondents were discarded, and only

the results from the 297 remaining participants were analyzed. Of the 297 participants,

149 were female, 147 were male, 58% were under the age of 35 and 68% had completed

some level of college education. A total of 264 participants were from the United States,

28 were from India and 5 were from other countries.
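The screening rule can be illustrated with a minimal sketch. It assumes that a test-map answer counts as correct only when it matches the expected value exactly (the text does not state the tolerance), and the function name, participant identifiers and values are hypothetical.

```python
def passed_screening(answers, expected, min_correct=3):
    """Return True if at least `min_correct` test-map answers match.

    Assumption: an answer is correct only on an exact match.
    """
    n_correct = sum(1 for given, truth in zip(answers, expected) if given == truth)
    return n_correct >= min_correct


# Hypothetical expected values for the five trivial test maps.
expected = [5000, 7000, 6000, 5000, 10000]
participants = {
    "p1": [5000, 7000, 6000, 5000, 10000],  # five correct: kept
    "p2": [5000, 7000, 100, 200, 10000],    # three correct: kept
    "p3": [1, 2, 3, 5000, 10000],           # two correct: discarded
}
kept = [pid for pid, answers in participants.items()
        if passed_screening(answers, expected)]
```

Applied to the study, this kind of rule reduced the 420 participants to the 297 whose responses were analyzed.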

4.1.1. Dot map estimates

The dot map estimation tasks asked respondents to estimate the value for regions on

the map. There were three groups of maps: one group with 15 maps with randomly

placed dots, one group with 15 maps with blue-noise dot maps and one group with 15

graduated dot maps. For each of the three map groups, the accuracy of responses was

compared across respondents.

The blue-noise algorithm can move dots outside of their enumeration area. For

analyzing the accuracy of responses for blue-noise and graduated dot maps, we used

the number of dots actually placed inside enumeration units.
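One way to obtain such a reference value is to sum the values of only those dots whose centers fall inside the enumeration polygon. The sketch below uses a standard ray-casting point-in-polygon test; it is an illustration, not the implementation used in the study, and the function names and coordinates are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test; `polygon` is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def value_inside(dots, polygon):
    """Sum the unit values of dots (x, y, value) centered inside the polygon."""
    return sum(value for x, y, value in dots if point_in_polygon(x, y, polygon))


# Hypothetical square enumeration area and three dots, one pushed outside.
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
dots = [(5, 5, 1000), (12, 5, 1000), (1, 9, 10000)]
reference_value = value_inside(dots, square)
```

Dots lying exactly on a boundary are not handled specially here; a production implementation would need an explicit rule for them.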

The Kruskal–Wallis one-way ANOVA test was used to determine if each map group

showed a signiﬁcant diﬀerence in the distribution of estimates among respondents. This

test was used because we analyzed three independent groups of dot maps (Kruskal and

Wallis 1952). p-Values for each map group were <0.001, indicating that the results are

signiﬁcant. The rate of error for each dot map was compared to the total number of dots

per test area (Figure 9) and then averaged by map and by type giving 45 error estimates.

The scores indicate that user estimations of blue-noise dot maps were more accurate than those of random dot maps, and estimations of graduated dot maps were more accurate than those of blue-noise dot maps. In addition, the accuracy of user estimation is correlated with the number of dots per estimated value, with increasing numbers of dots resulting in increasing relative error. That is, with all three types of dot maps, error rates are low for enumeration areas with small numbers of dots, and error rates are higher for areas with larger numbers of dots. Our results also support previous findings that users all but universally underestimate dot values (Provin 1977).

Figure 8. Example of a 2 × 2 image matrix shown for map preference questions. Subjects were asked which maps they preferred. Values mapped are different for the four maps.

Figure 9. Number of dots per test area versus relative error for the 45 dot maps used in the user study.
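For readers unfamiliar with the Kruskal–Wallis test, the following sketch computes the H statistic with the usual tie correction for three independent groups. With three groups the chi-square approximation has two degrees of freedom, for which the survival function reduces to exp(−H/2). The function names and the sample data are hypothetical, and the sketch is not the analysis code used in the study.

```python
import math


def midranks(values):
    """Return ranks (1-based, ties averaged) and the sizes of tie groups."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def kruskal_wallis_3(groups):
    """Return (H, p) for exactly three groups (chi-square, df = 2)."""
    if len(groups) != 3:
        raise ValueError("this sketch only handles three groups")
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks, tie_sizes = midranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h /= correction
    return h, math.exp(-h / 2)  # survival function of chi-square, df = 2


# Hypothetical relative errors (%) for three map types.
h, p = kruskal_wallis_3([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

The chi-square approximation is only reliable for reasonably large groups; the tiny sample above is for illustration only.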

4.1.2. Graduated dot maps vs. area-proportional circle maps

Because graduated dot maps are visually similar to area-proportional circle maps, the

results of the ﬁve area-proportional circle map estimations were compared to the results

of the 15 graduated dot maps. On average, the area-proportional circles were

underestimated by 9.3% even though Flannery’s perceptual scaling was applied. The average

underestimation for graduated dot maps was 6.5%. The Kruskal–Wallis one-way ANOVA

test was used again to determine if there are signiﬁcant diﬀerences in estimations of

individual area-proportional circle maps and graduated dot maps. All tests returned

p-values <0.001, demonstrating significant differences in estimations between

graduated dot maps and area-proportional circle maps.
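Flannery’s perceptual scaling enlarges circles beyond strict area proportionality to compensate for the tendency to underestimate larger circles. The sketch below assumes the commonly cited exponent of 0.5716 for the radius (strict area proportionality uses 0.5); the exact parameters used for the study maps are not stated here, and the function name and reference values are hypothetical.

```python
FLANNERY_EXPONENT = 0.5716  # commonly cited value; an assumption here


def circle_radius(value, ref_value, ref_radius, exponent=FLANNERY_EXPONENT):
    """Radius of a circle for `value`, scaled from a reference circle."""
    return ref_radius * (value / ref_value) ** exponent


# A value four times the reference value:
r_area = circle_radius(400, 100, 5.0, exponent=0.5)  # strict area scaling
r_flannery = circle_radius(400, 100, 5.0)            # perceptual scaling
```

With strict area scaling the radius doubles; with the perceptual exponent it grows somewhat more, so large values are drawn with visibly larger circles.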

While interpretation of area-proportional circle maps was less error prone than that of

single-size dot maps, it was slightly more prone to errors than interpretation of

graduated dot maps with a moderate number of dots. However, graduated

dot maps with large values (and a large number of dots) seem to result in higher error

than area-proportional circle maps with similar values. For example, the test area in one

graduated dot map had a total of 39 dots (3 dots with a value of 100,000, 31 dots with a

value of 10,000 and 5 dots with a unit value of 1000), a total value of 615,000 and an

average error of −26.3% while an area-proportional circle map (with perceptual scaling

by Flannery) with a value of 405,500 had an average error of −5.3%.
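The arithmetic behind this example can be checked with a short sketch. It assumes that the reported error is a signed relative error in percent, with negative values indicating underestimation; the function names and the single estimate used below are hypothetical, whereas the reported −26.3% is an average over respondents.

```python
def total_value(dot_counts):
    """Sum of unit value times number of dots, over all dot classes."""
    return sum(unit * count for unit, count in dot_counts.items())


def signed_relative_error(estimate, true_value):
    """Relative error in percent; negative means underestimation."""
    return (estimate - true_value) / true_value * 100.0


# Dot counts of the test area described above: unit value -> number of dots.
counts = {100_000: 3, 10_000: 31, 1_000: 5}
n_dots = sum(counts.values())
total = total_value(counts)

# A hypothetical single estimate that would correspond to the average error.
error = signed_relative_error(453_255, total)
```

The 39 dots sum to 300,000 + 310,000 + 5,000 = 615,000, matching the total given in the text.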

4.1.3. Preference and clarity

The objective of the preference tests was to compare preferences among the map types. For

each map type, users were shown the map and asked to provide a numerical Likert-scale

response for questions of ‘Clarity and Legibility’ (1 = Not Clear; 3 = Somewhat Clear;

5 = Very Clear) and ‘Aesthetic Preference’ (1 = Not Appealing; 3 = Somewhat Appealing;

5 = Very Appealing) (Likert 1932). The participants were asked to rate two sets of maps;

see Figure 8 for the four maps of one set.

Figure 10 shows the average of the preference and clarity responses. Results indicate that users found dot maps with random and blue-noise distributions to be least favorable and approximately equal in clarity and preference. Respondents found graduated dot maps to be the most preferred maps with the clearest message. Area-proportional circle maps ranked in between. Friedman’s test was used to determine whether the map types in each set differed significantly (Friedman 1937). The test shows a significant difference between map types (p-values <0.001) for each of the two tested sets.

Figure 10. Mean preference and clarity ratings and standard deviations for two sets of four maps of the same area.
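As an illustration of Friedman’s test for repeated ratings, the sketch below ranks the four map types within each respondent, sums the ranks per map type and computes the tie-corrected statistic. For four map types the chi-square approximation has three degrees of freedom, whose survival function has the closed form used in `chi2_sf_df3`. All names and the sample ratings are hypothetical; this is not the analysis code of the study.

```python
import math


def row_midranks(row):
    """Rank one respondent's ratings (ties averaged); also return the tie term."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    tie_term = 0
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and row[order[j + 1]] == row[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    return ranks, tie_term


def friedman_statistic(ratings):
    """Tie-corrected Friedman statistic; one row of k ratings per respondent."""
    n, k = len(ratings), len(ratings[0])
    rank_sums = [0.0] * k
    ties = 0
    for row in ratings:
        ranks, tie_term = row_midranks(row)
        ties += tie_term
        for j, r in enumerate(ranks):
            rank_sums[j] += r
    q = 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)
    correction = 1 - ties / (n * k * (k * k - 1))
    if correction == 0:
        raise ValueError("every respondent gave identical ratings")
    return q / correction


def chi2_sf_df3(x):
    """Survival function of the chi-square distribution with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


# Three hypothetical respondents rating four map types in the same order.
q = friedman_statistic([[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]])
p = chi2_sf_df3(q)
```

Likert ratings usually contain many ties within a respondent, which is why the tie correction is included.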

4.1.4. Rank-order preference

Subjects were shown the same maps that were used in the preference and clarity

question in two 2 × 2 image matrices showing the four types on a single page

(Figure 8). Using the matrix, they were asked to rank-order the maps from 1 to 4

(1 = ‘favorite’ and 4 = ‘least favorite’).

Figure 11 shows a histogram of responses for each of the map types. The results show

a clear pattern of ranks. Users ranked graduated dot maps ‘1st’ (their favorite) more

often than any other map type. The results also show that blue-noise dot maps were

often ranked ‘3rd’. Dot maps with a random distribution were most commonly ranked

‘last’ by respondents. The ranking of area-proportional circle maps does not show a clear

tendency. Friedman’s test was used to determine if statistically signiﬁcant diﬀerences

were found between each map type in each set. p-Values for each test were <0.001,

indicating signiﬁcant diﬀerences between each map type.

4.1.5. Counting vs. estimating

Participants were asked two ﬁnal questions: (1) how often they attempted to count the

dots versus estimate the number of dots; and (2) whether users attempted to count dots

more often for graduated dots or single-size dots. For the ﬁrst question, 78% of

respondents indicated that they sometimes tried to count the dots, 19% of respondents

indicated that they tried to count the dots every time and only 7 respondents estimated

the dots every time. Note that attempting to count the dots for many maps was

impossible due to the large number of dots and the limited viewing time of 10 s.

Responses to the second question were more evenly split. Sixty percent of respondents stated that they counted the graduated dots more often, 39% stated that they counted the single-size dots more often and only 3 respondents indicated that they did not count the dots.

Figure 11. Frequency of map rank responses for the test map sets. Graduated dot maps are ranked first more often than any other map type.

5. Discussion

Study participants showed improved accuracy for dot estimation tasks for graduated dot maps compared to conventional dot maps. Our research shows that conventional dot maps resulted in a high degree of underestimation, which reaffirms the findings of Olson (1975), Provin (1977) and Mashoka et al. (1986). Users also underestimated values with graduated dot maps, but to a much lesser extent. We observe that enumeration areas in graduated dot maps with many dots did not have an advantage over conventional dot maps. However, most enumeration areas in graduated dot maps use considerably fewer dots than corresponding enumeration areas on conventional dot maps, which explains the overall advantage. It is by their design that graduated dot maps have smaller numbers of dots per enumeration area than conventional dot maps. A second reason for their smaller number of dots is specific to our study. We designed dot maps following the school of thought that dots should begin to coalesce in the area with the highest density of dots. This resulted in a considerably smaller unit value for pseudo-random and blue-noise dot maps (200) than the smallest unit value of graduated dot maps (1000). Future studies should compare blue-noise dot maps and graduated dot maps that use the same (smallest) unit value.

Study participants underestimated the values of area-proportional circles to a slightly

higher degree than those of graduated dot maps, even though Flannery’s perceptual scaling was

applied to the circles. Study participants preferred graduated dots to all others in the study.

Respondents indicated that graduated dot maps were clearer, more legible and more visually

appealing than the other maps, and they ranked graduated dot maps as their favorite (ranked

1st) in a rank-order test.

Our user study only focused on identifying absolute values for selected areas of interest.

It did not assess whether spatial patterns can be extracted effectively from the different

visualization methods. Additional studies are needed to compare the effectiveness of the

different methods for the interpretation of spatial patterns and other visual analysis tasks.

Although the results are favorable for graduated dot maps, there are some limitations to

the method and remaining questions. The ﬁrst limitation is that the notion of dot density

can possibly be lost due to the reduced number of dots and their placement. Future work

could include relocating small dots between larger ones to reclaim the density and test the

eﬀect on estimation accuracy. Another limitation of the proposed method is that dots are

allowed to move outside of their enumeration area when blue-noise dot patterns are

created. This is problematic when, for example, terrestrial-related dots are moved over

lakes and oceans. Using ancillary data as exclusion areas for geographically based dot

placement is a potential solution; however, future work is necessary to add this additional

constraint to the CCVT blue-noise algorithm. (For analyzing responses in our user study, we

used the number of dots actually placed inside enumeration units.)

Furthermore, the CCVT blue-noise algorithm can move large dots in dense regions

outward from the center of small enumeration areas to prevent overlap. In such cases, it

may appear to the reader that the large dots represent a large enumeration area, which

is misleading. The current implementation also does not guarantee that larger dots

(which replace a cluster of smaller dots) do not overlap with neighboring smaller dots.


Future work is also needed to determine an appropriate number of dot classes for

graduated dot maps. For graduated circle maps, Evans (1977) recommends four or five

classes for an audience with little experience in reading graphics and notes that

experienced readers may appreciate seven or eight classes. We chose three classes of dots to not

confound users’ ability to detect differences between classes; however, we did not attempt

to evaluate the influence of the number of classes on estimation accuracy.

Another open question relates to the concept of subitizing. Coined by Kaufman et al.

(1949), subitizing refers to the judgment of small numbers of stimuli, a process that is more

accurate, more confident and more rapid than estimating or counting. Kaufman et al. (1949)

indicated that subitizing occurs when the number of stimuli is less than six. We hypothesize

that graduated dot maps are interpreted with a combination of counting, estimation and

subitizing. Future work could evaluate the processes by which users derive values for

graduated dot maps. Subitizing seems to also be relevant for the selection of appropriate

unit values for graduated dot maps. To minimize estimation errors, the unit values could be

chosen such that map readers subitize rather than estimate or count when extracting values

for an enumeration area. Optimum unit values could be determined that maximize the

number of enumeration areas in a map that only use four or ﬁve dots of each class (the

numbers suitable for subitizing). The unit values, however, need to be numbers that are

simple to sum and multiply and easy to remember, otherwise the advantage of subitizing

would be defeated by error-prone calculations necessary to compute total values.
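The idea of choosing unit values for subitizing can be sketched as a small search. The sketch makes several assumptions that go beyond the text: values are decomposed greedily from the largest dot class down (the method presented in this article instead replaces dense clusters of small dots, which can yield different counts), remainders below the smallest unit are dropped, and an area counts as subitizable when no class needs more than five dots. All names and values are hypothetical.

```python
def dots_per_class(value, unit_values):
    """Greedy decomposition: unit value -> number of dots of that class."""
    counts = {}
    remaining = value
    for unit in sorted(unit_values, reverse=True):
        counts[unit], remaining = divmod(remaining, unit)
    return counts  # any remainder below the smallest unit is dropped


def subitizable_areas(area_values, unit_values, max_per_class=5):
    """Count areas in which every dot class needs at most `max_per_class` dots."""
    return sum(
        1 for value in area_values
        if all(c <= max_per_class for c in dots_per_class(value, unit_values).values())
    )


def best_unit_values(area_values, candidates, max_per_class=5):
    """Pick the candidate set of unit values with the most subitizable areas."""
    return max(candidates,
               key=lambda units: subitizable_areas(area_values, units, max_per_class))


areas = [615_000, 234_000, 45_000]        # hypothetical enumeration area values
round_units = (1_000, 10_000, 100_000)    # easy to sum and remember
n_ok = subitizable_areas(areas, round_units)
```

Restricting `candidates` to round numbers, as above, is one way to respect the requirement that unit values remain simple to sum and multiply.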

In addition to the number of dot classes, the size of dots represents an area of potential

research. Given the purpose of the dot map, future work could determine whether area-

proportional dots have an advantage over graduated dots. If the purpose of the map is to

allow map readers to count dots, creating area-proportional dot sizes may not be

necessary. Conversely, if the purpose is to show density, then it could be advantageous to use

area-proportional dots (that is, dots with areas proportional to the unit value).

6. Conclusion

We present a method for creating graduated dot maps that produces visually pleasing

maps with improved estimation accuracy of raw totals. Our method combines blue-

noise dot distributions and a clustering algorithm, and represents the ﬁrst automated

method for producing graduated dot maps.

Graduated dot maps result in more accurate estimation than conventional dot maps.

The reason seems to be that graduated dot maps use fewer dots per enumeration area.

Study participants found graduated dot maps to be clearer, more legible and more

visually appealing than the other maps. Besides aesthetic considerations, the choice

between a graduated dot map and a conventional dot map should be based on the

map-reading task. If the primary goal is to communicate the spatial density pattern, a

conventional dot map is appropriate. If the goal is to communicate raw totals as well as

the density pattern, a graduated dot map should be preferred. The graduated dot map

technique is also recommended when a single dot unit value of a conventional dot map

results in large empty areas or overly dense areas.


Acknowledgements

The authors would like to sincerely thank the reviewers for their valuable comments and suggestions.

We also thank Abby Metzger, Oregon State University, for editing this text, and Jon Kimerling, Oregon

State University, and Guntram H. Herb, Middlebury College, for their help and comments.

Disclosure statement

No potential conﬂict of interest was reported by the authors.

ORCID

Bernhard Jenny http://orcid.org/0000-0001-6101-6100

References

Arnold, N.D., 2015. Automation and evaluation of graduated dot maps. MS thesis. Oregon State

University. Available from: http://hdl.handle.net/1957/56331

Atlas of Switzerland, 1977. Nutztiere—animaux de rapport. Sheet 51. Wabern, Bern:

Eidgenössische Landestopographie.

Balzer, M., Schlömer, T., and Deussen, O., 2009. Capacity-constrained point distributions: a variant

of Lloyd’s method. ACM Transactions on Graphics (Proceedings of SIGGRAPH 2009), ACM, 28 (3),

article 86, 8.

De Berg, M., et al., 2004. On simplifying dot maps. Computational Geometry, 27 (1), 43–62.

doi:10.1016/j.comgeo.2003.07.005

de Goes, F., et al., 2012. Blue noise through optimal transport. ACM Transactions on Graphics

(SIGGRAPH ASIA), ACM, 31 (6), article 171, 11.

Dent, B.D., Torguson, J.S., and Hodler, T.W., 2009. Cartography: thematic map design. 6th ed. Boston

[etc.]: McGraw Hill.

Ester, M., et al., 1996. A density-based algorithm for discovering clusters in large spatial databases

with noise. In: Proceedings of the second international conference on knowledge discovery and

data mining (KDD-96), 2–4 August, Portland, Oregon. AAAI Press, 226–231.

Evans, I.S., 1977. The selection of class intervals. Transactions of the Institute of British Geographers,

2 (1), 98–124. doi:10.2307/622195

Flannery, J.J., 1971. The relative eﬀectiveness of some common point symbols in the presentation of

quantitative data. The Canadian Cartographer, 8 (2), 96–109. doi:10.3138/J647-1776-745H-3667

Friedman, M., 1937. The use of ranks to avoid the assumption of normality implicit in the analysis

of variance. Journal of the American Statistical Association, 32 (200), 675–701. doi:10.1080/

01621459.1937.10503522

Hake, G., Grünreich, D., and Meng, L., 2002. Kartographie: Visualisierung raum-zeitlicher Informationen.

Berlin: De Gruyter.

Heer, J. and Bostock, M., 2010. Crowdsourcing graphical perception: using Mechanical Turk to

assess visualization design. In: Proceedings of the SIGCHI conference on human factors in

computing systems, 10–15 April 2010, Atlanta, Georgia, USA. New York, NY: ACM, 203–212.

Hey, A., 2012. Automated dot mapping: how to dot the dot map. Cartography and Geographic

Information Science, 13 (1), 17–29. doi:10.1559/1523040639117

Hey, A. and Bill, R., 2014. Placing dots in dot maps. International Journal of Geographical

Information Science, 28 (12), 2417–2434. doi:10.1080/13658816.2014.928822

Imhof, E., 1972. Thematische Kartographie. Berlin: de Gruyter.

Jenks, G.F., 1953. “Pointillism” as a cartographic technique. The Professional Geographer, 5 (5), 4–6.

doi:10.1111/j.0033-0124.1953.055_4.x


Jenny, B., et al., 2017. A guide to selecting map projections for world and hemisphere maps. In: M.

Lapaine and E.L. Usery, eds. Choosing a map projection, lecture notes in geoinformation and

cartography. Berlin: Springer, 213–228. doi:10.1007/978-3-319-51835-0_9

Kaufman, E.L., et al., 1949. The discrimination of visual number. The American Journal of Psychology,

62 (4), 498–525. doi:10.2307/1418556

Kimerling, A.J., 2009. Dotting the dot map, revisited. Cartography and Geographic Information

Science, 36 (2), 165–182. doi:10.1559/152304009788188754

Kraak, M.-J. and Ormeling, F., 2011. Cartography: visualization of spatial data. New York: Guilford Press.

Kruskal, W.H. and Wallis, W.A., 1952. Use of ranks in one-criterion variance analysis. Journal of the

American Statistical Association, 47 (260), 583–621. doi:10.1080/01621459.1952.10483441

Lagae, A. and Dutré, P., 2008. A comparison of methods for generating Poisson disk distributions.

Computer Graphics Forum, 27 (1), 114–129. doi:10.1111/j.1467-8659.2007.01100.x

Lavin, S., 1986. Mapping continuous geographical distributions using dot-density shading. The

American Cartographer, 11 (1), 140–150. doi:10.1559/152304086783900068

Likert, R., 1932. A technique for the measurement of attitudes. Archives of Psychology, 140, 1–55.

Mackay, J.R., 1949. Dotting the dot map: an analysis of dot size, number, and visual tone density.

Surveying and Mapping, 9 (1), 3–10.

Mashoka, Z., Bloemer, H.H.L., and Pickles, J., 1986. Dot maps vs. proportional circle maps: an

assessment of readability, legibility, and preference. Bulletin of the Society of University

Cartographers, 20 (2), 1–6.

Monkhouse, F.J. and Wilkinson, H.R., 1978. Maps and diagrams: their compilation and construction.

3rd ed. London: Methuen.

Olson, J.M., 1975. Experience and the improvement of cartographic communication. The

Cartographic Journal, 12 (2), 94–108. doi:10.1179/caj.1975.12.2.94

Petermann, A., 1857. Physikalisch-geographisch-statistische Skizze von Siebenbürgen. Petermanns

Geographische Mitteilungen, 3, 508–513, map plate 25.

Pharr, M. and Humphreys, G., 2004. Physically based rendering: from theory to implementation.

Amsterdam/Boston: Elsevier/Morgan Kaufmann.

Provin, R.W., 1977. The perception of numerousness on dot maps. The American Cartographer, 4

(2), 111–125. doi:10.1559/152304077784080374

Robinson, A.H., 1982. Early thematic mapping in the history of cartography. Chicago: University of

Chicago Press.

Robinson, A.H., et al., 1995. Elements of cartography. New York [etc.]: John Wiley & Sons.

Rogers, J.E. and Groop, R.E., 1981. Regional portrayal with multi-pattern color dot maps.

Cartographica, 18 (4), 51–64. doi:10.3138/E626-7703-6188-5558

Slocum, T.A., et al., 2009. Thematic cartography and geovisualization. 3rd ed. Upper Saddle River,

NJ: Pearson Prentice Hall.

Spiess, E., 1990. Generalisierung in thematischen Karten. In: K. Brassel and H. Kishimoto, eds.

Kartographisches Generalisieren. Publication No. 10 of the Swiss Society of Cartography, 63–70.

Taves, E.H., 1941. Two mechanisms for the perception of visual numerousness. Archives of

Psychology, 37, 1–47.

Thomas, E.N., 1955. “Balanced” colors for use on the multi-color dot map. The Professional

Geographer, 7 (6), 8–10. doi:10.1111/j.0033-0124.1955.076_8.x

Tyner, J.A., 2010. Principles of map design. New York: Guilford Press.

Yan, D.M., et al., 2015. A survey of blue-noise sampling and its applications. Journal of Computer

Science and Technology, 30 (3), 439–452. doi:10.1007/s11390-015-1535-0

Yan, H. and Weibel, R., 2008. An algorithm for point cluster generalization based on the Voronoi

diagram. Computers and Geosciences, 34 (8), 939–954. doi:10.1016/j.cageo.2007.07.008
