Content uploaded by Bennett Kankuzi
Author content
All content in this area was uploaded by Bennett Kankuzi on Jul 09, 2015
Content may be subject to copyright.
A Domain Terms Visualization Tool for
Spreadsheets
Bennett Kankuzi
University of Eastern Finland
School of Computing, Joensuu Campus
Joensuu, Finland
Email: bfkankuzi@gmail.com
Jorma Sajaniemi
University of Eastern Finland
School of Computing, Joensuu Campus
Joensuu, Finland
Email: jorma.sajaniemi@uef.fi
Abstract—Earlier research on spreadsheet authors’ mental
models has shown that people possess several types of information
about spreadsheets. The research results also suggested that a tool
intended to aid in comprehension and debugging of spreadsheets
should make prominent real-world and problem domain concepts
and map them easily to spreadsheet-specific details.
This poster introduces a new spreadsheet visualization tool
which was developed to demonstrate that it is possible to have
an easy-to-use spreadsheet understanding and debugging tool
that relieves users from spreadsheet details and lets them utilize
more of their mental model of the application domain. The tool
translates traditional spreadsheet formulas into problem domain
narratives and highlights referenced cells.
I. INT ROD UC TI ON
Despite being one of the most widely used applications,
many spreadsheets have non-trivial errors that are hard to
detect [6]. We have thus chosen to address this problem by
studying mental models of spreadsheet authors in order to
better understand why the spreadsheet process is so error-prone
and to be able to devise new tools that better correspond to
spreadsheet authors’ mental processes.
Doyle and Ford [3] define a mental model as “a mental
image of the world around us that we carry in our heads de-
picting only selected concepts and relationships that represent
real systems” and thus, a mental model for a spreadsheet does
not carry all possible information, but just those aspects of the
spreadsheet that the user finds appropriate for the current task.
Earlier research we conducted [4] revealed that spreadsheet
authors have (at least) three mental models of a spreadsheet:
the real-world model that comprises general knowledge of the
world around us; the domain model that represents knowledge
of the problem domain and the functionality of the spreadsheet
in problem domain or application terms; and the spreadsheet
model that codes the expressions and data relationships in
the spreadsheet. The role of these mental models also vary
depending on the nature of the task at hand such that when
explaining a spreadsheet, the real-world and domain models
are prominent; in debugging, spreadsheet authors constantly
switch back and forth between the domain and spreadsheet
models; and particularly in error fixing, the spreadsheet model
is prominent. These findings suggest that a tool intended to aid
in comprehension of a spreadsheet should make prominent
real-world and domain concepts and map them easily to the
Fig. 1. Visualization of a Sales Report spreadsheet. The cell cursor is in C9,
whose formula is visible at the formula bar and narrated in domain terms in
the light green box. Cells referred to in the domain narrative are highlighted
with light green background. All formula cells are marked with a pink right
border.
spreadsheet; in debugging, a tool should not only display
possible spreadsheet errors in spreadsheet terms (e.g., cell
references) but also in domain terms in order to help authors
discern errors with a more pronounced mapping between the
problem domain and the spreadsheet.
It is against this background that we have developed an
interactive spreadsheet visualization tool whose purpose is
to ease the mapping between the domain/real-world mental
models and the spreadsheet mental model and thus relieving
users from spreadsheet details and letting them utilize more
of their mental model of the application domain.
II. TH E TOO L
The tool displays formulas in higher level symbolic names,
that is, in domain/real-world terms. The domain/real-world
information is extracted from labels (headers) through spatial
layout information of each cell referenced to in the for-
mula. We call this displayed domain/real-world information
adomain/real-world narrative or simply a domain narrative.
Features and characteristics: The tool has five distin-
guishing features and characteristics. First, the tool displays
formulas as a domain/real-world terms narrative in a box
just right below the active formula cell. For example, in
the spreadsheet depicted in Figure 1, a formula in cell C9
given as “=SUM(C5:C8)” in spreadsheet-specific terms is
automatically translated to “SUM( Jan |James Bourne ... Jan
Fig. 2. Example of a domain narrative when the column name is missing.
B4 is translated to “Cash” and B5 to “Inventories” and hence the formula in
B6 is translated as “Cash + Inventories”.
|Jasmine Hunt )” in domain terms. For each referenced cell,
its corresponding domain/real-world terms are extracted from
its column and row labels. The column and row names are
separated by “|”; the naming obeys a “column |row” direction
following the standard spreadsheet convention. Narratives are
positioned a little bit lower than the active formula cell to
avoid distractions when navigating through the cells.
Second, to track changes automatically, the narratives are
automatically re-generated as one works through the spread-
sheet. Automatic generation of visualizations is important
because users are not comfortable with tools that require much
user intervention [7].
Third, referenced cells are automatically highlighted and
their background colour matches with the background colour
of the narrative. For example in Figure 1, “Jan |James Bourne”
or C5, “Jan |Chris Hewitt” or C6, “Jan |Pat Hill” or C7,
and “Jan |Jasmine Hunt” or C8 are all highlighted in light
green colour. We have thus applied Gestalt Law of Similarity
whereby objects of the same colour are naturally perceived as
related [8]. In this case, matching background colours of the
narrative and highlighted cells helps the user to automatically
perceive the two as related.
Fourth, all formula cells are marked with a pink right border
so that a user can have a general overview of the structure of
a spreadsheet and differentiate formula cells from text cells as
well as input number cells. For example, in Figure 1, cell E5
is not a formula cell while cell E9 (with a pink right border)
is a formula cell. We have thus applied cognitive theories of
visual search [2] to provide a minimal clutter pop-out effect
that results in better speed and accuracy than the attentional
template approach utilized in normal spreadsheet systems.
Fifth, the tool is superimposed on the spreadsheet display.
This is important because users find it tedious and confusing to
determine the correspondence between a separate visualization
and the spreadsheet itself [1].
Label extraction: The domain/real-world information dis-
played in the narratives is extracted from labels through spatial
layout information. For each cell, its corresponding domain
terms are extracted from its column and row labels. An input
cell which does not have its corresponding column (row) name,
has its row (column) name used as its symbolic name as
illustrated in Figure 2.
Implementation: The tool is implemented as an add-on to a
popular spreadsheet system, Microsoft Excel, so that it should
not be completely different from the traditional spreadsheet
environment. This was to take advantage of existing user ex-
perience as their familiarity with existing tools and techniques
could become worthless if the tool is too different from what
they know and how they are used to work [5]. Experience is an
asset, and learning new tools is hard. Even if one manages to
convince end-users of a new tool’s benefits, they will still have
to learn to cope with it in practice. As such, their experience
with existing tools and techniques could become worthless if
the tool is too different from what they know and how they
are used to work [5]. Therefore, we took effort to make sure
that the tool should be available in the traditional spreadsheet
environment and not be completely different from it.
Potential uses: When creating or editing a spreadsheet, the
tool can help one in tracking effected changes as active for-
mula cell ranges are highlighted and a corresponding problem
domain description is displayed. This has the potential to help
in creating an error free spreadsheet as it provides an intuitive
on-spot feedback to the user on changes being done to the
spreadsheet. For the same reasons, the tool may also help in
the comprehension of spreadsheets created by others.
The tool may also help in the comprehension of a spread-
sheet as it can give a user a general overview of the structure
of a spreadsheet through the pink right border markings on all
formula cells. Finally, the tool may also help in locating errors.
For example, a problem domain description or narrative for a
given formula cell, that does not match with what is expected
in the problem domain, could provide a visual cue for an error.
III. CONCLUSION
This poster presented a description of a domain terms
visualization tool for spreadsheets that we are proposing to
aid in spreadsheet comprehension and debugging. This tool
has also been empirically evaluated with results showing that
the tool was easy to learn, helped users to locate more errors
in spreadsheets and generally increased the use of the domain
mental model of its users. Part of our future work on the tool
would be to attempt to allow spreadsheet authors to enter or
edit formulas using domain narratives.
REF ER EN CE S
[1] J. S. Davis, “Tools for spreadsheet auditing,” International Journal of
Human-Computer Studies, vol. 45, no. 4, pp. 429–442, 1996.
[2] R. Desimone and J. Duncan, “Neural Mechanisms of Selective Visual
Attention,” Annual Review of Neurosciences, vol. 18, pp. 193–222, 1995.
[3] J. K. Doyle and D. N. Ford, “Mental Models Concepts for System
Dynamics Research,” System Dynamics Review, vol. 14, no. 1, pp. 3–
29, 1998.
[4] B. Kankuzi and J. Sajaniemi, “An Empirical Study of Spreadsheet
Authors’ Mental Models in Explaining and Debugging Tasks,” in 2013
IEEE Symposium on Visual Languages and Human-Centric Computing
(VL/HCC). IEEE, 2013, pp. 15–18.
[5] D. Kulesz, “From Good Practices to Effective Policies for Preventing
Errors in Spreadsheets,” Proceedings of EuSpRIG 2011 Conference, 2011.
[6] R. R. Panko, “What we know about spreadsheet errors,” Journal of
Organizational and End User Computing, vol. 10, no. 2, pp. 15–21, 1998.
[7] J. Sajaniemi, “Modeling spreadsheet audit: A rigorous approach to
automatic visualization,” Journal of Visual Languages & Computing,
vol. 11, no. 1, pp. 49–82, 2000.
[8] M. Wertheimer, A Source Book of Gestalt Psychology. London:
Routledge & Kegan Paul, 1938.