published as: Fabrikant, S. I. and Buttenfield, B. P. (1997). Envisioning User
Access to a Large Data Archive. Proceedings, GIS/LIS '97, Cincinnati, OH, Oct.
28-30, 1997: 686-691 (CD-ROM).
ENVISIONING USER ACCESS
TO A LARGE DATA ARCHIVE
Sara I. Fabrikant and Barbara P. Buttenfield
Department of Geography, Campus Box 260
University of Colorado, Boulder CO 80309
voice: (303) 492-3684
email: sara.fabrikant@colorado.edu and babs@colorado.edu
ABSTRACT
We are overwhelmed by the vast amounts of data accumulating daily. The
extraction of information from online data sources is becoming more and more
difficult. For example, if a query to a large archive returns hundreds of “hits”,
the most effective presentation is probably not a list of items, but some other
type of graphical display. The concept of spatialization offers a promising
potential to overcome the current impediments of retrieving items from large
volume archives. Spatialization involves effective combination of powerful
scientific visualization techniques with spatial metaphors that represent data that
are not necessarily spatial in nature. Familiar spatial concepts such as distance
and direction, scale, arrangement etc. which are part of the human experience in
everyday life, are applied to create lower-dimensional digital representations of
complex digital data. Skupin and Buttenfield (1996;1997) have demonstrated
how spatial metaphors can be constructed for abstract information spaces.
However, as these authors (1996: 616) point out, there has not yet been any
subject testing to determine the appropriateness of such methods for
visualization. We are not certain how people comprehend spatialized views, or
whether the components of distance, direction and so forth are understood by
viewers. This paper presents an experimental design to explore how spatialized
views are understood by users. Subject testing procedures on graphical displays
are outlined, which include the collection of performance measures on
information retrieval tasks. The experimental application will rely on data
collected from the Alexandria Digital Library Project.
INTRODUCTION
Searching data and retrieving information from large online data archives
can be a very frustrating experience. A user might encounter the following
interactions with a system, during a query session of a large data repository:
The user fills in a query form, by entering keywords in a text
entry field, as well as other related information such as dates,
numbers etc. to refine the search. Ideally, the system returns a
small set of “hits”, which are related to the search keyword and
include the desired information. However, an information
seeker is often overwhelmed by a huge amount of returned
query results. Consequently, the user has to go through the
time consuming process of sifting through large amounts of
data, which might not be related to the requested search.
Yet another query might result in zero “hits”, or leave the user with the
feeling of having used a wrong keyword, or having misused a query option. In
both cases, the user has to refine the query until the desired subset of items is
returned by the system, which costs the user time and effort.
SPATIALIZATION
The concept of spatialization offers a promising potential to overcome the
current impediments to efficient information processing and retrieval.
Spatialization refers to the effective combination of powerful scientific
visualization techniques with spatial metaphors that represent complex
high-dimensional data sets, which may be non-spatial in nature. Familiar spatial
concepts such as distance, direction, scale and arrangement, which are part of
everyday human experience, are applied to create low-dimensional
digital representations of complex digital data. As Chalmers (1993:378) puts
it: “our everyday world is 2.1 dimensional, therefore physical spaces of high
dimensionality are unfamiliar to most of us, and it is generally more difficult to
present, perceive and remember patterns and structures within them.”
The user’s understanding of spatialization is based on envisioning spatial
properties. Furthermore it relies on cognition of geographical space, which
involves memory, spatial reasoning and communication about objects, their
spatio-temporal and thematic attributes, as well as the relationships among these
objects in the real world (Montello, 1996).
Golledge (1995) presents a minimal set of primitives for building spatial
concepts. These include identity, location, magnitude, and time. Distance,
angle and direction, connection and linkage (nearest neighbor, proximity,
similarity etc.) are derived concepts from the first order primitive location.
Higher order spatial concepts are combinations of derived concepts. For
example, if location, magnitude, and connectivity are combined, we obtain the
concept of an ordered tree, which provides a useful metaphor or data model for
the concept of scale. Likewise, location may be combined with magnitude to
obtain (local) density, and build up the concept of dispersion.
A very common example of the application of spatial metaphors to
envision an abstract computer environment is the desktop metaphor developed
by Apple as a graphical user interface for the Macintosh computer. The two
dimensional view of a computer operating system as an office table, covered
with folders and documents, allows one to visually collect, process and store
digital data.
Using spatial properties such as proximity, we typically regroup related
files or applications, by putting them into a common folder. Consequently,
hierarchies of folders can be created, to simplify navigation through “data
space”. Deeper into the hierarchy, more detailed information about the data is
revealed, thus relating to scale dependence in the real world. Moreover, by
surmounting distance with the “drag” and “drop” option, we are able to perform
actions within the computing environment, such as copying or deleting files.
Files which have to be deleted are carried to a specific place on the “office
table”, to be put into a “trash can”. Typically the trash resides somewhere at the
edges of the “office table”, neither obstructing our working environment, nor
being too close to important files.
FROM QUERIES TO BROWSING AND FILTERING
The retrieval of information from large data archives has long been an
important issue in computer science (Parsaye et al, 1989). A common problem
for information retrieval is related to the user interface of the query system. The
user interface generally provides insufficient guidance and queries often return a
huge set of undesired results (Doan et al, 1996).
The term information retrieval is being set aside by newer information
seeking strategies, such as data browsing, data mining, data warehousing, or
filtering (Shneiderman, 1996). Common to the newer information gathering
terms are their exploratory nature and the integration of sophisticated direct-
manipulation user interfaces, supporting what Shneiderman calls the Visual
Information Seeking Mantra. The mantra includes three parts: “overview first,
zoom and filter, then details-on-demand” (Shneiderman, 1996). In related work,
Doan et al (1996) propose dynamic queries, using the direct manipulation
approach, where the query process as well as the results carry a visual
component. Continuous graphical feedback supports the user in query
formulation and subsequent query refinement.
To design a query system based on spatial metaphors, Shneiderman would
have us first define the kinds of queries a user would typically perform. We can
apply these to the Alexandria Digital Library (ADL, on the Web at
http://alexandria.sdc.ucsb.edu/), a distributed digital library for geographically
referenced information. The schema in Figure 1 outlines how information
seekers can interact with ADL’s collection.
[Figure 1 schematic. Labels recoverable from the figure: query stages (Query
Overview, Zoom & Filter, Details-on-Demand); system objects (Map Browser with
maps and footprints, Gazetteer with features, Catalog with attributes,
Collection Surface with spatializations); user tasks (textual browsing and
graphic browsing of results; graphical and textual refinement); action handles
(zoom tools, cross-sections, geographic footprints, buttons, sliders, check
boxes); results (attribute lists with "tags", thumbnails).]
Figure 1: Visual Browsing Query Process
In the current interface, there are three ways to query the library: entering
specific keywords in the gazetteer (geographic search), entering them in the
catalog (attribute search), or using the map browser to graphically refine the
search area by zooming and panning. In all query stages, user tasks are restricted
to “known-item-searches”, thus requiring specific keywords or geographic areas as
query inputs.
Whereas specific fact finding will be well served by the described system,
exploratory querying and open-ended browsing are not supported adequately.
For example, some users might not have a well defined information need.
Others might desire to gain an overview of the entire collection first, before
deciding on a specific topic. Finally, information seekers might be interested in
discovering relationships among the items in the database, enabling them to
formulate unforeseen queries.
SPATIALIZED BROWSING IN A DIGITAL LIBRARY
“A picture is worth a thousand keywords”
Drawing upon the work of Skupin and Buttenfield (1996; 1997), who
demonstrated how spatial metaphors can be constructed for abstract information
spaces, and Shneiderman’s (1996) Visual Information Seeking Mantra, a
spatialized query session in ADL could be envisioned as follows:
The query process is divided into three stages: overview first, then zoom
and filter, and lastly, details-on-demand. The graphical user interface (GUI) for
the overview stage is a direct manipulation interface with linked windows
(Figure 2). Dynamic queries are carried out by buttons, sliders and check
boxes, which trigger an immediate graphical response by the system. Items
selected in the lists will be highlighted in the spatialized views and vice versa.
Three spatial metaphors underlie the design of this graphical user interface,
including distance (similarity), scale (level of detail), and arrangement
(dispersion and concentration).
Distance
In Figure 2, the large window displays a landscape of catalog items that
were “hit” by a query. Items that are close together are characterized by similar
keyword sets. In the abstract data space we may interpret distance as similarity
in a metaphoric sense. Catalog items which are more related to each other will
be placed closer together than items which are less related. The distance
metaphor is based on Salton’s (1989) vector space model (keyword occurrences
in a document), and multidimensional scaling (MDS) is utilized as the
projection method (Skupin and Buttenfield, 1997).
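As a minimal sketch of the first step behind the distance metaphor, the following Python fragment builds keyword-occurrence vectors and a pairwise dissimilarity matrix of the kind an MDS routine would take as input. The binary weighting, the cosine measure, the function names, and the sample items are illustrative assumptions, not details given by Salton (1989) or Skupin and Buttenfield (1997). The MDS projection itself is not shown.

```python
import math


def keyword_vector(item_keywords, vocabulary):
    """Binary keyword-occurrence vector for one catalog item (assumed weighting)."""
    return [1.0 if kw in item_keywords else 0.0 for kw in vocabulary]


def cosine_dissimilarity(u, v):
    """1 - cosine similarity; items without keywords are treated as unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def dissimilarity_matrix(items, vocabulary):
    """Pairwise dissimilarities between all items, suitable as MDS input."""
    vectors = [keyword_vector(kws, vocabulary) for kws in items]
    return [[cosine_dissimilarity(u, v) for v in vectors] for u in vectors]


# Hypothetical catalog items, each described by a set of keywords.
vocabulary = ["atlases", "cartographic material", "aerial photographs", "geology"]
items = [
    {"atlases", "cartographic material"},
    {"cartographic material", "aerial photographs"},
    {"geology"},
]
matrix = dissimilarity_matrix(items, vocabulary)
```

In this example the first two items share one keyword and receive a dissimilarity of about 0.5, whereas the third item shares none and receives 1.0, so a projection that preserves these values would place the first two items closer together.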
Scale
The interface has several components designed to make the level of detail
evident to the user. Keywords can be selected in a Hierarchy Tree Window,
which will update other display windows accordingly. In the Figure, selecting
the keywords ‘aerial photograph’ and ‘cartographic material’ highlights the same
keywords in the Hierarchy Tree Window and the Keyword List Window.
Keywords can be tagged and the selected items can be promoted to the top of
the list. As check boxes are tagged, the Landscape Window is updated to
display the keyword labels on the landscape.
In the lower left corner of the Figure, a window reacts dynamically to
keyword selection by displaying bars showing the relative percentage of “hits”
that would be associated with each keyword in the collection. The Cross
Section Window represents a frequency of “hits” that could be expected by
refining the query as defined by the transect line drawn in white across the
Landscape Window. These windows operate together to help the user predict
the probable success rate for a given query as it is formulated. A tool palette
over the Landscape Window allows ‘zooming in’ on the data space to see the
landscape (and information about the collection) in more detail. Zooming tools
also modify the Keyword List and Hierarchy Tree Windows.
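The bar display described above could be fed by a computation along the following lines. This is only a sketch: the function name, the representation of each hit as a set of keywords, and the sample data are assumptions made for illustration.

```python
from collections import Counter


def hit_percentages(hits, selected_keywords):
    """Percentage of hits that carry each selected keyword."""
    total = len(hits)
    counts = Counter(
        kw for item in hits for kw in item if kw in selected_keywords
    )
    return {
        kw: (100.0 * counts[kw] / total if total else 0.0)
        for kw in selected_keywords
    }


# Hypothetical query result: each hit is the set of keywords attached to it.
hits = [
    {"aerial photograph"},
    {"aerial photograph", "cartographic material"},
    {"atlases"},
    {"cartographic material"},
]
selected = ["aerial photograph", "cartographic material"]
percentages = hit_percentages(hits, selected)
```

Recomputing these values whenever a check box changes is one way to obtain the immediate graphical feedback that dynamic queries call for.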
Arrangement
The Landscape Window in Figure 2 is a “collection surface” which offers a
visual overview of queried items. The z-values in the landscape represent the
accumulated number of hits per “region” in the collection. A high peak
indicates a high concentration of items available for that particular query.
Patterns and shapes in the landscape reveal the organization of items with
respect to each other. For example a steep cone indicates a high density of
similar documents correlated with a high number of hits. A low plateau on the
other hand describes a lower density of items available.
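A collection surface of this kind can be approximated by counting projected items per grid cell and using the counts as z-values. The sketch below assumes that item positions have already been projected into a square of side `extent` and that a "region" corresponds to one grid cell. Both are simplifying assumptions, as are the function name and the sample points.

```python
def collection_surface(points, grid_size, extent=1.0):
    """Count items per grid cell; the counts act as z-values of the surface."""
    surface = [[0] * grid_size for _ in range(grid_size)]
    cell = extent / grid_size
    for x, y in points:
        # Clamp so that points lying exactly on the upper edge stay in range.
        col = min(int(x / cell), grid_size - 1)
        row = min(int(y / cell), grid_size - 1)
        surface[row][col] += 1
    return surface


# Hypothetical projected positions of four catalog items.
points = [(0.1, 0.1), (0.15, 0.2), (0.2, 0.1), (0.9, 0.9)]
surface = collection_surface(points, grid_size=2)
peak = max(max(row) for row in surface)
```

Here three of the four items fall into the same cell, which would appear as a peak on the surface, while the isolated item forms a low region.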
The Map, Catalog, and Gazetteer check boxes re-arrange items in the
Landscape Window to identify catalog keywords, map browser footprints, and
gazetteer features, respectively. Whereas the first representation, based on
geographic regions, is well known to the geographic information community,
the abstract keyword landscapes are not as familiar. Chalmers (1993), Atkins
(1995), and Skupin and Buttenfield (1996; 1997) have shown how effective
spatial metaphors can be utilized to construct abstract information landscapes.
Figure 2: User Interface for the Overview Query Stage
EVALUATION OF SPATIAL METAPHORS
Skupin and Buttenfield (1996: 616) point out that there has not been any
subject testing to determine the appropriateness of the described methods for
visualization. We are not certain how people comprehend spatialized views, or
whether the components of distance, direction and so forth are understood by
viewers. Spatializations rely on the use of spatial metaphors to represent data
that are not necessarily spatial. Metaphors constitute a fundamental part of
human cognition (Lakoff, 1987). Lakoff (1987) defines the Spatialization of
Form hypothesis, which requires a metaphorical mapping from physical space
into a “conceptual” space. Consequently, image schemata which structure space
are mapped into the corresponding abstract configurations, which structure
concepts (i.e. similarity) (Lakoff, 1987: 283). To inquire how well the
metaphorical mapping is assimilated by a user leads us to the main research
question: What kinds of skills are needed to understand the spatializations?
Metaphor       Question
Distance       How well do people understand the concept of similarity?
Scale          Can people discern hierarchical order?
Arrangement    Can people detect regions in the display?
Table 1: Research questions
Distance
One way to test this metaphor is with comparative distance judgment tasks.
Specifically, the complete method of triads is used
to obtain comparative distances between stimuli (Torgerson, 1958). The
judgment tasks are presented in triads, in the form: “the keyword atlases is
more similar to the keyword cartographic material than to the keyword aerial
photographs”. To extract all relationships between the three stimuli, three
questions have to be asked, giving a triadic combination. Thus, with n stimuli
there are:
\[
\frac{n(n-1)(n-2)}{6} \ \text{triads and} \quad \frac{n(n-1)(n-2)}{2} \ \text{judgments} \tag{1}
\]
for each subject. From these judgments we obtain the proportion of times any
stimulus x is judged more similar to stimulus y than to z. For example, test
subjects are presented with triads in the form:
Point and click the mouse where the keyword "atlases"
should be located, in relation to the two keywords below.
cartographic material                    aerial photos
Figure 3: Subject Test for Distance Between Keywords
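The counts in equation (1) can be checked by enumerating the triads directly. The following sketch generates every triad for a small keyword set and the three judgments each triad yields, one per choice of reference stimulus. The function names and the fourth keyword are hypothetical and serve only to illustrate the counting.

```python
from itertools import combinations


def triads(stimuli):
    """All unordered groups of three stimuli."""
    return list(combinations(stimuli, 3))


def judgments(stimuli):
    """Three judgments per triad: each stimulus serves once as the reference
    that is compared against the remaining two."""
    result = []
    for a, b, c in combinations(stimuli, 3):
        result.append((a, (b, c)))
        result.append((b, (a, c)))
        result.append((c, (a, b)))
    return result


# "gazetteers" is a hypothetical fourth keyword added for illustration.
keywords = ["atlases", "cartographic material", "aerial photographs", "gazetteers"]
n = len(keywords)
triad_count = len(triads(keywords))
judgment_count = len(judgments(keywords))
```

With n = 4 keywords this gives 4 triads and 12 judgments per subject, in agreement with equation (1). Because the count grows cubically with n, the number of keywords presented to each subject has to stay small.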
The obtained proximity matrices are subjected to a multidimensional
scaling algorithm. The resulting test data space is then overlaid onto the
keyword vector space model to produce the spatialized view. The comparisons
of the two spatializations could provide further insights.
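One possible way to compare the two spatializations is to correlate their interpoint distances. The sketch below does this with a Pearson coefficient. It is an assumption about how such a comparison might be carried out, not a procedure specified in this paper, and the two layouts are invented coordinates.

```python
import math


def pairwise_distances(points):
    """Euclidean distances for every unordered pair of 2-D points."""
    return [
        math.dist(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical layouts of the same three keywords: one derived from subject
# judgments, one from the keyword vector space model.
subject_layout = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
model_layout = [(0.0, 0.0), (6.0, 0.0), (0.0, 8.0)]
agreement = pearson(
    pairwise_distances(subject_layout), pairwise_distances(model_layout)
)
```

Correlating distances rather than coordinates makes the comparison insensitive to rotation, translation, and uniform scaling of either layout, which MDS solutions leave undetermined. In this example the second layout is a scaled copy of the first, so the agreement is 1.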
Scale
When examining the scale metaphor, we want to inquire how well users
comprehend hierarchical order in the data archive. Hierarchy is composed of the
spatial primitives identity, location, magnitude and connection. In Lakoff’s
(1987) terms, hierarchy is an example of the part-whole schema. A sample test
for this metaphor is to present test subjects with a set of stimuli, such as
keywords from the database, and ask them to group the stimuli according to
their rank in the hierarchy. The obtained hierarchies are then compared with the
existing hierarchical order in ADL. One could also utilize cluster analysis to
validate user choice. Utilizing the spatialized displays, test subjects have to use
the zoom tool several times, and indicate which of the presented hierarchical
keyword lists matches the keywords displayed during the zoom.
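A simple score for comparing a subject's grouping with an existing hierarchy is the share of keywords placed at the reference level. The sketch below is one hypothetical scoring scheme. The keyword levels shown are invented for illustration and are not taken from the ADL catalog.

```python
def rank_agreement(subject_levels, reference_levels):
    """Share of keywords the subject placed at the reference hierarchy level."""
    shared = [kw for kw in reference_levels if kw in subject_levels]
    if not shared:
        return 0.0
    matches = sum(
        1 for kw in shared if subject_levels[kw] == reference_levels[kw]
    )
    return matches / len(shared)


# Hypothetical hierarchy levels (1 = most general).
reference = {
    "cartographic material": 1,
    "maps": 2,
    "atlases": 2,
    "topographic maps": 3,
}
subject = {
    "cartographic material": 1,
    "maps": 2,
    "atlases": 3,
    "topographic maps": 3,
}
score = rank_agreement(subject, reference)
```

In this example the subject misplaces one of four keywords, giving a score of 0.75.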
Arrangement
Spatial distribution includes areal pattern types, such as dispersion and
density. Density is unit dependent. The unit in a data archive is a document,
represented by a keyword. In the three dimensional case, magnitude is added to
the spatial distribution, giving the number of “hits”. Consequently, peaks of
concentration and valleys of dispersion indicate spatial distribution on the
collection surface. The question arises whether users can discern regions in the
spatializations and to what extent the concepts of concentration and dispersion
are understood.
Tests for the arrangement metaphor include a spatialized display. Subjects
are asked to “lasso” an area which corresponds best to a given keyword.
Subjects are asked to place a mark in the display, where minimum and
maximum concentration occur. Both correctness of the answer and the speed of
response are measured.
Higher order derived concepts such as gradient or slope can be tested as
well. As the number of hits varies over the collection surface, the slope increases
either sharply or in a more gentle fashion. A steep slope indicates a short
distance between two points, as well as an abrupt change in magnitude. In
other words, documents are represented with very similar content, but with
distinctly different numbers of “hits”. The combination of similarity versus
frequency is tested with a profile display. Test subjects have to select the
appropriate statement, which best represents a section of the profile.
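The slope along a profile can be expressed as the change in the number of hits divided by the distance between neighbouring sample points. The sketch below computes this for a hypothetical transect. The function name and the sample values are assumptions made for illustration.

```python
def profile_slopes(distances, hits):
    """Slope of each profile segment: change in hits per unit of distance."""
    return [
        (hits[i + 1] - hits[i]) / (distances[i + 1] - distances[i])
        for i in range(len(hits) - 1)
    ]


# Hypothetical transect: position along the profile and hits at each position.
distances = [0.0, 1.0, 1.5, 3.5]
hits = [2, 4, 14, 16]
slopes = profile_slopes(distances, hits)
steepest_segment = max(range(len(slopes)), key=lambda i: abs(slopes[i]))
```

The second segment is the steepest: the two sample points are close together (similar content) yet differ strongly in the number of hits, which is the combination the profile test is meant to probe.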
SUMMARY AND OUTLOOK
The use of spatial primitives to query a large data repository has been
outlined. A spatialized graphical user interface has been presented, which
allows the exploration of the holdings of the Alexandria Digital Library.
Although the concept of spatialization is not entirely new to the research
community, and several authors have demonstrated how spatial metaphors can
be constructed for abstract data spaces, its appropriateness for visualization has
not been tested yet. We have outlined what kinds of questions have to be
asked, to reveal if spatial primitives such as distance, direction and scale are
understood by viewers of spatialized displays. There is an imminent need for
empirical evaluation and validation of emerging procedures and techniques in
the visualization domain. The geographic information science community, with
its wealth of experience in spatial information processing, is predestined to add
valuable insights to the spatialization domain. The results of this research
should fuel the enormous potential spatialization has to offer, to overcome the
bottleneck of information processing.
ACKNOWLEDGEMENTS
This paper forms a portion of the Alexandria Digital Library Project,
jointly sponsored by NSF, NASA, and ARPA. Funding by the National
Science Foundation (NSF IRI-94-11330) is greatly appreciated. Matching funds
from the University of Colorado are also acknowledged.
REFERENCES
Atkins, P. W. (1995). The Periodic Kingdom. Basic Books, New York.
Chalmers, M. (1993). Using a Landscape Metaphor to Represent a Corpus of
Documents. In: Frank, A. U., Campari, I. (Eds.). Spatial Information Theory.
A Theoretical Basis for GIS. Lecture Notes in Computer Science, No. 716,
Springer, Berlin: 377-390.
Doan K., Plaisant C., and Shneiderman B. (1996). Query Previews in
Networked Information Systems. Proceedings of the Third Forum on Research
and Technology Advances in Digital Libraries, ADL '96, Washington, DC,
May 13-15, 1996, IEEE CS Press: 120-129.
Golledge, R. (1995). Primitives of Spatial Knowledge. In: Nyerges, T. L.,
Mark, D. M., Laurini, R., and Egenhofer, M. J. (Eds.) Cognitive Aspects of
Human-Computer Interaction for Geographic Information Systems, Kluwer
Academic Publishers, Dordrecht: 29-44.
Lakoff, G. (1987). Women, fire and dangerous things. What categories reveal
about the mind, University of Chicago Press, Chicago.
Montello, D. R. (1996). Cognition of Geographic Information. Research
Priorities for Geographic Information Science. University Consortium for
Geographic Information Science, Paper #4,
http://www.ncgia.ucsb.edu/other/ucgis/research_priorities/paper4.html (July, 1997).
Parsaye, K., Chignell, M., Khoshafian, S., and Wong, H. (1989). Intelligent
Databases. Object-Oriented, Deductive Hypermedia Technologies. Wiley, New
York.
Salton, G. (1989). Automatic Text Processing. The Transformation, Analysis,
and Retrieval of Information by Computer. Addison-Wesley, Reading, MA.
Skupin, A. and Buttenfield, B. P. (1997). Spatial Metaphors for Display of
Information Spaces. Proceedings, AUTO-CARTO 13, Seattle, Washington, Apr.
7-10, 1997: 116-125.
Skupin, A. and Buttenfield, B.P. (1996). Spatial Metaphors for Visualizing
Very Large Data Archives. Proceedings, GIS/LIS '96, Denver, Colorado, Nov.
19-21, 1996: 607-617.
Shneiderman, B. (1996). The Eyes Have It. A Task by Data Type Taxonomy
for Information Visualizations. IEEE Symposium on Visual Languages 1996,
Proceedings, Boulder, CO, Sep. 3-6, 1996: 336-343.
Torgerson, W. S. (1958). Theory and Methods of Scaling. Wiley, New York.
... In the context of digital libraries (discussed further below), the concept of`spatialization' has been applied to the representation of similarity among objects in the library's information space (Fabrikant and Buttenfield, 1997;Skupin andButtenfield, 1996, 1997). Skupin and Buttenfield (1997: 117) define spatialization as`a projection of elements of a high-dimensional information space into a low-dimensional, potentially experiential, representational space'. ...
... A particularly important component of digital spatial library efforts involves modeling the manner in which information seekers can interact with a collection. Fabrikant and Buttenfield (1997) propose a visual browsing query process schema for interaction that includes three stages: overview, zoom and filter, and details-on-demand. Their current implementation is restricted to`known-item-searches', and thus does not adequately support exploratory querying. ...
... If we are to take full advantage of the web as an information dissemination medium, a concerted effort is needed to develop and test theories that can explain human± representation interaction in user-controllable hyperlinked geoinformation access and display environments. In the context of the Alexandria digital library project, Fabrikant and Buttenfield (1997) have taken steps in this direction. Building on the concept of spatialization of text, discussed in Skupin and Buttenfield (1996), they develop an interface that integrates three spatial metaphors (dealing with concepts of distance, scale and arrangement). ...
... Browsing image galleries has been a common way to access the content und search for certain images for a long time (Besser, 1990). However, a different approach uses geographic positions to display and contextualize information (Fabrikant, Buttenfield, 1997). This method has potential to uncover and visualize certain phenomena linked to the initial acquisition of images. ...
Article
Full-text available
The interdisciplinary research group on four-dimensional research and communication of urban history (Urban History 4D) aims to investigate and develop methods and technologies to access extensive repositories of historical media and their contextual information in a spatial model, with an additional temporal component. This will make content accessible to different target groups, researchers and the public, via a 4D Browser and an Augmented Reality app for mobile devices. One goal is to improve the accessibility of media repositories and develop suitable solutions for data preparation and information research, making extensive use of visualizations. An interdisciplinary approach is taken to ensure that visualizations for research are comprehensible and meet scientific standards. The investigation of spatial visualizations of image repositories and their visual representation in a coherent context includes frequencies, directions, and perspectives within the image material, which can be grasped through quantitative methods. This paper introduces two main investigations into (1) quantitative data visualization with photography and (2) the plausibility and perception of 3D reconstructions.
... Visual design issues --Issues include the ability to incorporate a number of capabilities into visualization, including dimensionality (spatial, symbolic, temporal), dynamism, animation, and, as mentioned above, interaction. Further research is also needed in the representation of non-spatial data in spatial formats, such as spatialization, or the process of converting abstract non-numerical information into a viewable spatial framework (Kuhn 1997, Skupin and Buttenfield 1996, Couclelis 1998, Fabrikant and Buttenfield 1997, and Fabrikant 2000. ...
Chapter
While computational scientists have for a number of years proposed an annually updated set of research priorities for visualization, it is only recently that geographers, specifically cartographers, have defined a set of research priorities for geographic visualization. Although the previous lack of a stated research direction may have been acceptable or at least understandable as GVis matured and developed over the past decade, the time has now come to review the advancements of GVis and assess its research needs for the future, as well as identify links with other research priorities in GIScience. This need is due in part to current and expected demand for GVis capabilities. Not only has the volume of data available increased, the technological capabilities are more advanced. As a result, there are more data to visualize, more ways to visualize them, and more need to understand how visualization works. A thoughtful and directed approach to structuring our understanding of the efforts to, need for, and issues relating to geographic visualization will uniquely position spatial scientists to contribute to the development of visualization in general.
... and visualizing the content with mapbased metaphors and concepts. Especially the work of Fabrikant and Buttenfield (1997) was starting a discussion of utilizing information visualization concepts and mapping principles to access large databases (Fabrikant 2000). Since then, many cartographers have started to tackle analyzing big databases and social media content, often stating that their analysis is limited by the geospatial constraints and thus aim to better understand the dynamics behind it (Nelson et al. 2015). ...
Article
Full-text available
Social media content provides direct and indirect locational information. However, simply mapping media content such as Twitter posts will only pro- vide a limited scope on the social media conversation. Assessing geotagged conversations needs to be combined with analyzing content and network structures including their dynamics. This research provides a set of structured methods towards filtering Twitter messages to remove unrelated content and better understand a public discussion. The approach is applied towards crowdsourced opinions on tanning in the United States and highlights that creating choropleth maps from unprocessed Twitter data will give a limited view on public health conversations. Cluster dendrograms and the network- focused Louvain unsupervised community detection algorithm were applied on a subset of over 25 million tanning related tweets, collected over a six week period. The research results show advertising and spamming clusters of tanning-related social media tweets, but also public discussions that relate to skin cancer and melanoma health concerns, fashion and beauty.
... Improving knowledge discovery in data-rich environments by visual means is also a key concern in the GIScience community (Buckley et al., 2000; Buttenfield et al., 2000). It is surprising, however, that most of the spatialization work is carried out outside of GIScience, with the exception of a handful of geographers (for example, Couclelis, 1998; Skupin, 2000 Skupin, , 2002a Fabrikant, 2000a,b; Fabrikant and Buttenfield, 1997; Kuhn and Blumenthal, 1996; Tilton and Andrews, 1994). It seems obvious that GIScience (particularly through its cartographic roots) is well positioned to address the challenges of designing information spaces, but GIScientists should also transfer their geovisualization know-how to the InfoVis community. ...
Chapter
Information Visualization is concerned with the art and technology of designing and implementing highly interactive, computer supported tools for knowledge discovery in large non-spatial databases. Information Visualization displays, also known as information spaces or graphic spatializations, differ from ordinary data visualization and geovisualization in that they may be explored as if they represented spatial information Information spaces are very often based on spatial metaphors such as location, distance, region, scale, etc., thus potentially affording spatial analysis techniques and geovisualization approaches for data exploration and knowledge discovery. Two major concerns in spatialization can be identified from a GIScience/ geovisualization perspective: the use of space as a data generalization strategy, and the use of spatial representations or maps to depict these data abstractions. A range of theoretical and technical research questions needs to be addressed to assure the construction of cognitively adequate spatializations. In the first part of this chapter we propose a framework for the construction of cognitively plausible semantic information spaces. This theoretical scaffold is based on geographic information theory and includes principles of ontological modeling such as semantic generalization (spatial primitives), geometric generalization (visual variables), association (source –target domain mapping through spatial metaphors), and aggregation (hierarchical organization). In the remainder of the chapter we discuss ways in which the framework may be applied towards the design of cognitively adequate spatializations.
... A recent study provides empirical evidence supporting the usability of spatialized views. The research included the creation and evaluation of a spatialization prototype to access a large document collection similar to the one depicted in Figure 15 (Fabrikant 2000; Fabrikant and Buttenfield 1997). The design and implementation of spatialized interface components were based on three spatial concepts: distance (similarity), arrangement (dispersion and concentration), and scale change (changing level of detail). ...
Article
Information visualization is an interdisciplinary research area in which cartographic efforts have mostly addressed the handling of geographic information. Some cartographers have recently become involved in attempts to extend geographic principles and cartographic techniques to the visualization of non-geographic information. This paper reports on current progress and future opportunities in this emerging research field commonly known as spatialization. The discussion is mainly devoted to the computational techniques that turn high-dimensional data into visualizations via processes of projection and transformation. It is argued that cartographically informed engagement of computationally intensive techniques can help to provide richer and less opaque information visualizations. The discussion of spatialization methods is linked to another priority area of cartographic involvement, the development of theory and principles for cognitively plausible spatialization. The paper distinguishes two equally important sets of challenges for cartographic success in spatialization research. One is the recognition that there are distinct advantages to applying a cartographic perspective in information visualization. This requires our community to more thoroughly understand the essence of cartographic activity and to explore the implications of its metaphoric transfer to non-geographic domains. Another challenge lies in cartographers becoming a more integral part of the information visualization community and actively engaging its constituent research fields.
... Visual design issues: Issues include the ability to incorporate a number of capabilities into visualization, including dimensionality (spatial, symbolic, temporal), dynamism, animation, and, as mentioned above, interaction. Further research is also needed in the representation of non-spatial data in spatial formats, such as spatialization, or the process of converting abstract non-numerical information into a viewable spatial framework (Kuhn 1997, Skupin and Buttenfield 1996, Couclelis 1998, Fabrikant and Buttenfield 1997, and Fabrikant 2000). ...
Article
Geographic Visualization. Introduction. The human visual system is the most powerful processing system known. By combining technologies such as image processing, computer graphics, animation, simulation, multimedia, and virtual reality, computers can help present information in a new way so that patterns can be found, greater understanding can be developed, and problems can be solved. "Geographic visualization", also referred to as "geovisualization" (GVis), which focuses on visualization as it relates to spatial data, can be applied to all the stages of problem-solving in geographical analysis, from development of initial hypotheses, through knowledge discovery, analysis, presentation and evaluation. This theme should be a research priority in GIScience, with a systematic effort to advance our understanding of geographic visualization. Background. In its 1987 commissioned report to the United States National Science Foundation, the Panel on Graphics, Image Processing, and Workstations defined visualization as "a method of computing...a tool both for interpreting image data fed into a computer, and for generating images from complex multi-dimensional data sets..." the goal of which is "...to leverage existing scientific methods by providing new insight through visual methods" (McCormick et al. 1987: 3). MacEachren et al. (1992: 101) expanded that view by arguing that "visualization...is definitely not restricted to a method of computing...it is first and foremost an act of cognition, a human ability to develop mental representations that allow us to identify patterns and create or impose order." Geographic visualization is now considered to encompass not only the development of theory, tools, and methods for the visualization of spatial data, but also an understanding of how the tools and methods are used for hypothesis formulation, pattern identification, knowledge construction, and the facilitation of decision making.
Information visualization is generally considered to involve the use of computers to generate interactive, often animated, representations of multiple variables in multiple linked formats, the goal of which is to develop greater understanding of interactions of components of a system or distribution (Buckley 1997). Often this understanding comes from exploration of the data rather than through problem solving. This generally involves one or a few "expert, highly motivated viewers who are often engaged in ill-defined tasks such as hypothesis formulation" (DiBiase et al. 1992: 213). For this paper, our concept of geographic visualization is broader than others may consider as visualization (i.e., computer-dependent methods of data display for small groups of highly-trained individuals who are primarily interested in exploration of very large spatial data sets). For example, we also consider other media (e.g., paper, film, projected displays, holography), other data sets (non-spatial), user groups ranging in size from individuals to crowds, the cognitive process of visualization, and other computer configurations (e.g., mobile computing, wearable computers).
Article
Social media content provides direct and indirect locational information. However, simply mapping media content such as Twitter posts will only provide a limited view of the social media conversation. Assessing geotagged conversations needs to be combined with analyzing content and network structures, including their dynamics. This research provides a set of structured methods for filtering Twitter messages to remove unrelated content and better understand a public discussion. The approach is applied to crowdsourced opinions on tanning in the United States and highlights that creating choropleth maps from unprocessed Twitter data will give a limited view of public health conversations. Cluster dendrograms and the network-focused Louvain unsupervised community detection algorithm were applied to a subset of over 25 million tanning-related tweets, collected over a six-week period. The research results show advertising and spamming clusters of tanning-related social media tweets, but also public discussions that relate to skin cancer and melanoma health concerns, fashion and beauty.
Conference Paper
InfoMaps is an information visualization tool designed for personal information management and for supporting data analysis. In this paper we briefly discuss the design of InfoMaps and explain its role in finding relevant information. InfoMaps is tightly coupled with Weave, an open source framework that provides a set of data analysis and visualization tools. Weave's framework is built with session states at its core, and this gives InfoMaps the ability to store all of the user's interactions as well as visualization layouts. We discuss the implications of using the Weave framework with InfoMaps and its relevance to the field of information retrieval and visual analytics.
Conference Paper
In a networked information system (such as the NASA Earth Observing System-Data Information System (EOS-DIS)), there are three major obstacles facing users in a querying process: network performance, data volume and data complexity. In order to overcome these obstacles, we propose a two-phase approach to query formulation. The two phases are the Query Preview and the Query Refinement. In the Query Preview phase, users formulate an initial query by selecting rough attribute values. The estimated number of matching data sets is shown graphically on preview bars, which allows users to rapidly focus on a manageable number of relevant data sets. Query previews also prevent wasted steps by eliminating zero-hit queries. When the estimated number of data sets is low enough, the initial query is submitted to the network, which returns the metadata of the data sets for further refinement in the Query Refinement phase. The two-phase approach to query formulation overcomes slow network performance and reduces the data volume and data complexity problems. This approach is especially appropriate for users who do not have extensive knowledge about the data and who prefer an exploratory method to discover data patterns and exceptions. Using this approach, we have developed dynamic query user interfaces to allow users to formulate their queries across a networked environment.
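The Query Preview phase described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the attribute names (topic, year, region), the sample records, and the `max_hits` threshold are all hypothetical, chosen only to show how a precomputed count table lets an interface estimate hits locally and hold back zero-hit or oversized queries before anything is sent over the network.

```python
from collections import Counter
from itertools import product

# Hypothetical metadata records: (topic, year, region). Assumed for illustration.
DATASETS = [
    ("ozone", 1995, "arctic"),
    ("ozone", 1996, "arctic"),
    ("ozone", 1996, "tropics"),
    ("vegetation", 1995, "tropics"),
    ("vegetation", 1996, "tropics"),
    ("vegetation", 1996, "tropics"),
]


def build_preview_table(records):
    """Precompute how many data sets exist for each attribute combination."""
    return Counter(records)


def preview_count(table, topics, years, regions):
    """Estimate matching data sets for rough attribute selections, using only the local table."""
    return sum(table[combo] for combo in product(topics, years, regions))


def should_submit(count, max_hits=3):
    """Submit only non-empty queries whose estimated result size is manageable."""
    return 0 < count <= max_hits


table = build_preview_table(DATASETS)
ozone_all = preview_count(table, {"ozone"}, {1995, 1996}, {"arctic", "tropics"})
empty = preview_count(table, {"ozone"}, {1995}, {"tropics"})
everything = preview_count(
    table, {"ozone", "vegetation"}, {1995, 1996}, {"arctic", "tropics"}
)
print(ozone_all, should_submit(ozone_all))
print(empty, should_submit(empty))
print(everything, should_submit(everything))
```

In this sketch the count table stands in for the preview bars: the user sees the estimate change as selections change, and only a query that passes `should_submit` would proceed to the Query Refinement phase.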
Article
In information retrieval, sets of documents are stored and categorised in order to allow for search and retrieval. The complexity of the basic information is high, with representations involving thousands of dimensions. Traditional interaction techniques therefore hide much of the complexity and structure of the modelled information, and offer access of the information by means of isolated queries and word searches. Bead is a system which takes a complementary approach, as it builds and displays an approximate model of the document corpus in the form of a map or landscape constructed from the patterns of similarity and dissimilarity of the documents making up the corpus. In this paper, emphasis is given to the influences on and principles behind the design of the landscape model and the abandonment of a `point cloud' model used in an earlier version of the system, rather than the more mathematical aspects of model construction. 1 Introduction Bead is a prototype system for the grap...
Conference Paper
A useful starting point for designing advanced graphical user interfaces is the visual information seeking Mantra: overview first, zoom and filter, then details on demand. But this is only a starting point in trying to understand the rich and varied set of information visualizations that have been proposed in recent years. The paper offers a task by data type taxonomy with seven data types (one, two, three dimensional data, temporal and multi dimensional data, and tree and network data) and seven tasks (overview, zoom, filter, details-on-demand, relate, history, and extracts)
Article
A useful starting point for designing advanced graphical user interfaces is the Visual Information-Seeking Mantra: Overview first, zoom and filter, then details-on-demand. But this is only a starting point in trying to understand the rich and varied set of information visualizations that have been proposed in recent years. This paper offers a task by data type taxonomy with seven data types (1-, 2-, 3-dimensional data, temporal and multi-dimensional data, and tree and network data) and seven tasks (overview, zoom, filter, details-on-demand, relate, history, and extract). The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations. Ben Shneiderman, Department of Computer Science, Human-Computer Interaction Laboratory, and Institute for Systems Research, University of Maryland, College Park, Maryland 20742 USA, ben@cs.umd.edu