Data, Information, and Knowledge in Visualization

Abstract

In visualization, we use the terms data, information and knowledge extensively, often in an interrelated context. In many cases, they indicate different levels of abstraction, understanding, or truthfulness. For example, "visualization is concerned with exploring data and information," "the primary objective in data visualization is to gain insight into an information space," and "information visualization" is for "data mining and knowledge discovery." In other cases, these three terms indicate data types, for instance, as adjectives in noun phrases, such as data visualization, information visualization, and knowledge visualization. These examples suggest that data, information, and knowledge could serve as both the input and output of a visualization process, raising questions about their exact role in visualization.
Position Paper
This work has been submitted to the IEEE for possible publication. Copyright may
be transferred without notice, after which this version may no longer be accessible.
Table 1. Ackoff’s definitions of data, information and knowledge in perceptual and cognitive space [1].

Category      Definition
data          symbols
information   data that are processed to be useful, providing answers to ‘who’, ‘what’, ‘where’, and ‘when’ questions
knowledge     application of data and information, providing answers to ‘how’ questions
Table 2. Our definitions of data, information and knowledge in computational space.

Category      Definition
data          computerized representations of models and attributes of real or simulated entities
information   data that represents the results of a computational process, such as statistical analysis, for assigning meanings to the data, or the transcripts of some meanings assigned by human beings
knowledge     data that represents the results of a computer-simulated cognitive process, such as perception, learning, association, and reasoning, or the transcripts of some knowledge acquired by human beings
In visualization, data, information and knowledge are
three terms used extensively, often in an interrelated context.
In many cases, they are used to indicate different levels of
abstraction, understanding or truthfulness. For example,
‘visualization is concerned with exploring data and
information’ [5]; ‘the primary objective in data visualization
is to gain insight into an information space’ [6]; and
‘information visualization’ is for ‘data mining and
knowledge discovery’ [4]. In other cases, these three terms
are used to indicate data types, for instance, as adnominals
in noun phrases, such as data visualization, information
visualization and knowledge visualization. These examples
suggest that data, information and knowledge could be both
the input and output of a visualization process, raising
questions about their exact role in visualization.
There are many competing definitions of data,
information and knowledge in different areas of computer
science and engineering, and in other disciplines such as
psychology, management sciences and epistemology (theory of
knowledge). The use of the three terms is inconsistent
and often conflicting. For instance, in computing, data and
information are often used interchangeably
(e.g., data processing and information processing; data
management and information management). From a system
perspective, data refers to bits and bytes stored on, or
communicated via, a digital medium. Thus, any
computerized representations, including knowledge
representations, are types of data. On the other hand, from
the perspective of knowledge-based systems, data is a
simpler form of knowledge.
In epistemology, ‘All agree that knowledge is
valuable, but the agreement about knowledge tends
to end there. Philosophers disagree about what
knowledge is, about how you get it, and even about
whether there is any to be gotten.’ (Keith Lehrer [5])
Several attempts have been made to clarify taxonomically the
terminology used in the visualization community (e.g.,
[3,8,10]). However, the terms data, information and
knowledge remain ambiguous. This article is not another
attempt to offer a different taxonomy for visualization.
Instead, we present a clarification that differentiates these
three terms from the perspective of visualization processes.
Furthermore, we examine the current and future role of
information and knowledge in the development of
visualization technology.
Definitions of data, information and knowledge
Since we can read data, grasp information and acquire
knowledge, we must first differentiate these three terms in
the perceptual and cognitive space. Because we can also
store data, information and knowledge in the computer, we
must also differentiate them in the computational
space.
Perceptual and Cognitive Space
The Data-Information-Knowledge-Wisdom (DIKW)
hierarchy [1] is a popular model for classifying human
understanding in the perceptual and cognitive space. The
origin of this hierarchy can be traced to the poet T.S. Eliot
[7]. Table 1 shows the definitions of data, information and
knowledge given by Ackoff [1].
Let ℙ be the set of all possible explicit and implicit
human memory. The former encompasses the memory of
events, facts and concepts, and the understanding of their
meanings, context and associations. The latter encompasses
all non-conscious forms of memory, such as emotional
responses, skills, habits and so on [9]. We can thus focus on
three subsets of memory, ℙdata ⊆ ℙ, ℙinfo ⊆ ℙ, and ℙknow ⊆ ℙ,
where ℙdata, ℙinfo, and ℙknow are the sets of all possible
explicit and implicit memory about data, information, and
knowledge, respectively.
Despite the lack of an agreed set of definitions of
data, information and knowledge, there is a general
consensus that data is not information, and information is
not knowledge. Without straying from the scope of this
article, here we simply assume that ℙdata, ℙinfo and ℙknow are not
mutually disjoint, and that none of them is a subset of another.
Without loss of generality, we can extend ℙknow to
include wisdom, and any other high level of
understanding, in the context of the DIKW hierarchy.
Computational Space
Let ℂ be the set of all possible representations in
computer memory. Similarly, we may consider three subsets
of representations, ℂdata, ℂinfo, and ℂknow. However, data is
an overloaded term in computing. For example, it is
common to treat programs as a special class of data. In
many cases, it is not possible to distinguish programs from
other data. Applying the same analogy, a computer
representation of a piece of information or knowledge is just
a particular form of data. A computer representation of
visualization is also a form of visual data.
Authors:
Min Chen, Swansea University
David Ebert, Purdue University
Hans Hagen, Technical University of Kaiserslautern
Robert Laramee, Swansea University
Robert van Liere, CWI, Amsterdam
Kwan-Liu Ma, University of California, Davis
William Ribarsky, University of North Carolina, Charlotte
Gerik Scheuermann, University of Leipzig
Deborah Silver, Rutgers University
We hence propose to use the definitions in Table 2 for
the following discussions. With such definitions, we have
ℂdata = ℂ, ℂinfo ⊆ ℂdata, and ℂknow ⊆ ℂdata. The definitions in
Table 2 can easily be extended to include categories of raw
data (ℂrawdata), volume data (ℂvolume), flow data (ℂflow),
software (ℂsoftware), videos (ℂvideo), mathematical models
(ℂmathmodel), visual data (ℂvisual), and so forth. This also
makes sense when using the category names as the
adnominals in noun phrases, such as volume visualization
and software visualization.
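The relationships stated for Table 2 (every computer representation is data, while information and knowledge are particular subsets of data) can be illustrated with a small sketch. The item names and the `tag` helper below are hypothetical and chosen only for illustration; they are not part of the original definitions.

```python
# Toy universe of computer representations (all item names are illustrative).
C = frozenset({"raw_samples", "histogram", "entropy_value",
               "transfer_function_rule", "rendered_image"})

# Table 2: every computerized representation counts as data.
C_data = C
# Results of computational processes that assign meaning to data.
C_info = frozenset({"histogram", "entropy_value"})
# Results of simulated cognitive processes, or transcripts of human knowledge.
C_know = frozenset({"transfer_function_rule"})
# One of the extended categories mentioned in the text (visual data).
C_visual = frozenset({"rendered_image"})


def tag(item):
    """Return the names of every category that contains the item."""
    categories = {"data": C_data, "info": C_info,
                  "know": C_know, "visual": C_visual}
    return sorted(name for name, members in categories.items()
                  if item in members)


# Information and knowledge are subsets of data in the computational space.
print(C_info <= C_data, C_know <= C_data)
print(tag("histogram"))
```

Because the categories are subsets of one universe, a single item can carry several tags at once, which is why a term such as "data" needs an explicit tag to be unambiguous.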
Figure 1 shows a typical visualization process, where
instances of data, information and knowledge in both
computational space and perceptual and cognitive space are
illustrated. The purpose of visualization can be
rationalized by the difficulty humans have in acquiring a
sufficient amount of information (Pinfo ⊆ ℙinfo) or knowledge
(Pknow ⊆ ℙknow) directly from a dataset (Cdata ⊆ ℂdata). The
process of visualization is a function that maps from ℂdata to
the set of all imagery data, ℂimage. It transforms a dataset
Cdata to a visual representation Cimage, which facilitates a
more efficient and effective cognitive process for acquiring
Pinfo and Pknow.
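To make the idea of visualization as a function from a dataset to imagery data concrete, here is a deliberately tiny sketch. The function name `visualize`, the control keys `value_range` and `gray_levels`, and the one-row "image" are all assumptions made for illustration; a real visualization mapping is far richer.

```python
def visualize(c_data, c_ctrl):
    """Map a dataset (list of scalar samples) to a one-row grayscale 'image'.

    c_ctrl holds hypothetical control parameters:
      value_range: (lo, hi) window applied to the samples
      gray_levels: number of discrete gray levels in the output
    """
    lo, hi = c_ctrl["value_range"]
    levels = c_ctrl["gray_levels"]
    image = []
    for value in c_data:
        # Clamp the sample to the window, normalize to [0, 1], then quantize.
        t = (min(max(value, lo), hi) - lo) / (hi - lo)
        image.append(round(t * (levels - 1)))
    return image


c_data = [0.0, 2.5, 5.0, 7.5, 10.0]
c_ctrl = {"value_range": (0.0, 10.0), "gray_levels": 5}
c_image = visualize(c_data, c_ctrl)
print(c_image)
```

The same dataset yields a different image for every choice of control parameters, which is what makes the choice of those parameters a problem in its own right.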
A visualization process is a search process
Given a dataset Cdata, a user first makes some decisions
about visualization tools to be used for exploring the dataset.
The user then experiments with different controls, such as
styles, layout, viewing position, color maps, transfer
functions, etc. until a collection of satisfactory visualization
results, Cimage, is obtained. Depending on the visualization
tasks, satisfaction can be in many forms. For example, the
user may have obtained sufficient information or knowledge
about the dataset, or may have obtained the most
appropriate illustration about the data to assist the
knowledge acquisition process of others.
Such a visualization process is fundamentally the same
as a typical search process, except that it is usually much
more complex than trying out a few keywords with a search
engine. In visualization, the tools for the ‘search’ tasks are
usually application-specific (e.g., network, flow, volume
visualization). The parameter space for the ‘search’ is
normally huge (e.g., exploring many viewing positions or
trying out many different transfer functions). The user
interaction for the ‘search’ sometimes can be very slow,
especially in handling very large datasets. This is depicted
in Figure 1 by a large interaction box that connects from the
user to the control parameters, Cctrl, which are also data.
In fact, over the past two decades, much of the emphasis
has been placed on improving the speed of visualization
tools, so that the user can carry out the interactive ‘search’ faster,
explore a bigger parameter space, and hopefully find
satisfactory results more quickly.
However, with the growing amount of data and
increasing availability of different visualization techniques,
the ‘search’ space for a visualization process is also getting
larger and larger. Like the internet search problem,
interactive visualization alone is no longer adequate.
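The growth of the ‘search’ space can be sketched as follows. The control parameters, their candidate values, and the `toy_satisfaction` score are hypothetical stand-ins (in practice the judgement is made by a human looking at each result), but the sketch shows how the number of candidate settings multiplies with each added control.

```python
from itertools import product

# Hypothetical control parameters and their candidate values.
controls = {
    "view_angle_deg": range(0, 360, 30),        # 12 candidate viewing angles
    "color_map": ["gray", "rainbow", "heat"],   # 3 candidate color maps
    "opacity": [0.25, 0.5, 0.75, 1.0],          # 4 candidate opacities
}


def search_space_size(controls):
    """Number of distinct settings: the product of the per-control counts."""
    size = 1
    for values in controls.values():
        size *= len(values)
    return size


def exhaustive_search(controls, satisfaction):
    """Try every combination and keep the setting with the highest score."""
    names = list(controls)
    best_setting, best_score = None, float("-inf")
    for combo in product(*(controls[name] for name in names)):
        setting = dict(zip(names, combo))
        score = satisfaction(setting)
        if score > best_score:
            best_setting, best_score = setting, score
    return best_setting, best_score


def toy_satisfaction(setting):
    """Stand-in for the user's judgement (an assumption for illustration)."""
    return (-abs(setting["view_angle_deg"] - 90)
            - (0 if setting["color_map"] == "heat" else 50)
            - 100 * abs(setting["opacity"] - 0.5))


print(search_space_size(controls))
best, score = exhaustive_search(controls, toy_satisfaction)
print(best)
```

Even this toy example has 12 × 3 × 4 settings; adding a transfer function with a handful of control points would multiply that number again, which is why faster interaction alone does not keep pace.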
Information-assisted visualization
In recent years, an assortment of techniques has been
introduced for visualizing complex features in data by
relying on information abstracted from the data. Note that
here we consider ℂinfo in the computational space as well as
ℙinfo in the perceptual and cognitive space. Figure 2
illustrates an information-assisted visualization process.
Resolving Ambiguity Using the Set Notations
We can resolve the ambiguity in various statements that contain the terms
data, information and knowledge by tagging such terms using the set
notations ℙ and ℂ, and their subsets.
American National Standards Institute, Dictionary for Information
Systems, X3.172, 1990:
‘Data (data): a representation of facts, concepts, or instructions in a
formalized manner suitable for communication, interpretation, or processing by
human beings or by automatic means.’
‘Information (ℙinfo or ℂinfo): the meaning that is currently assigned’ (by
human beings or computers) ‘to data (data) by means of the conventions applied
to those data (data).’
J. Foley and B. Ribarsky, “Next-generation data visualization tools”, in L.
Rosenblum et al. (eds.) Scientific Visualization: Advances and
Challenges, Academic Press, 1994:
‘A useful definition of visualization might be the binding (or mapping) of
data (data) to representations (visual, auditory, tactile, etc.) that can be perceived.
The types of bindings could be visual, auditory, tactile, etc., or a combination of
these.’
R. M. Friedhoff and T. Kiley, “The Eye of the Beholder”, Computer
Graphics World, 13(8):46-59, 1990:
‘If researchers try to read the data (data), usually presented as vast numeric
matrices, they will take in the information (info) at snail’s pace. If the
information (info) is rendered graphically, however, they can assimilate it at a
much faster rate.’
B. H. McCormick, T. A. DeFanti and M. D. Brown (eds.), “Visualization in
Scientific Computing”, Computer Graphics, 21(6), 1987:
Visualization ‘transforms the symbolic (data) into the geometric (visual),
enabling researchers to observe their simulations and computations’.
S. Card, J. Mackinlay and B. Shneiderman, Readings in Information
Visualization: Using Vision to Think, Morgan Kaufmann, 1999:
Information (info) visualization is ‘the use of computer-supported,
interactive, visual representations (visual) of abstract data (info) to amplify
cognition’.
W. Stallings, Data and Computer Communications, (4th ed.), Macmillan,
1994:
‘Information (ℙinfo or ℂinfo) is born when data (data) are interpreted’ (by
human beings or computers).
M. J. Usher, Information Theory for Information Technologists,
Macmillan, 1984:
‘Information (ℙinfo and ℂinfo) has both qualitative and quantitative aspects.’
‘The amount of information (ℙinfo and ℂinfo) conveyed in an event depends on the
probability of the event.’
R. A. Frost, Introduction to Knowledge Based Systems, Collins, 1986:
‘Knowledge (know) is the symbolic representation of aspects of some named
universe of discourse.’ ‘We define data (facts or ℂrawdata but not ℂdata since
ℂknow ⊆ ℂdata) as the symbolic representation of simple aspects of some named
universe of discourse.’ ‘The amount of information (info) obtained by the
receiver of a message is related to the amount by which that message reduces
receiver's uncertainty about some aspect of the universe of discourse (Shannon).’
E. Turban, Decision Support and Expert Systems, Prentice-Hall, 1995:
‘Knowledge (know): understanding, awareness, or familiarity acquired
through education or experience. Anything that has been learned, perceived,
discovered, inferred, or understood. The ability to use information (ℙinfo and/or
ℂinfo).’
‘Knowledge base: the assembly of all the information (info) and knowledge
(know) of a specific field of interest.’
There are techniques that make use of information captured
in the visualization process to improve the efficiency and
effectiveness of visualization. Examples of such information
are given in Table 3.
Table 3. Examples of information used in visualization.

Information category                                   Examples
Information about the input dataset:
  abstract geometric and temporal characteristics      skeletons, features, events
  topological properties                               contour tree for volume data, vector field topology, tracking graph for time-varying data
  statistical indicators and information measurements  histogram, correlation, importance, certainty, entropy, mutual information, local statistical complexity
Information about the results                          color histogram, level of cluttering
Information about the process                          interaction patterns, provenance
Information about users’ perception                    response time, accuracy
In information-assisted visualization, the user is
provided with a second visualization pipeline (see Figure 2),
which typically displays the information about the input
dataset, but can also present attributes of the visualization
process, the properties of the results, or characteristics of the
user’s perceptual behaviors. The user uses such information
to reduce the ‘search’ space for optimal control parameters,
hence making the interaction much more cost-effective.
Such techniques provide an intrinsic interface between
the scientific visualization and information visualization
communities. With the increasing size and complexity of
data, the use of information to aid visualization will
inevitably become a necessity rather than an option.
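As a minimal sketch of how information about the input dataset can shrink the ‘search’ space, the code below computes two of the statistical indicators listed in Table 3 (a histogram and its Shannon entropy) and uses the histogram to suggest the value interval worth exploring with a transfer function. The function names and the assumption that all samples lie in [lo, hi] are illustrative only.

```python
import math


def histogram(values, bins, lo, hi):
    """Count samples per equal-width bin; samples are assumed to lie in [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # The min() keeps a sample equal to hi inside the last bin.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return counts


def shannon_entropy(counts):
    """Entropy (in bits) of the distribution described by the bin counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def occupied_range(counts, lo, hi):
    """Value interval spanned by the first to the last non-empty bin."""
    width = (hi - lo) / len(counts)
    used = [i for i, c in enumerate(counts) if c > 0]
    return lo + used[0] * width, lo + (used[-1] + 1) * width


samples = [0.1, 0.2, 0.2, 0.3, 0.6, 0.7]
c_info = histogram(samples, bins=4, lo=0.0, hi=1.0)
print(c_info)
print(occupied_range(c_info, 0.0, 1.0))
print(shannon_entropy(c_info))
```

Here the histogram tells the user that no samples fall in the top quarter of the value range, so transfer-function settings that only affect that interval need not be tried at all.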
Knowledge-assisted visualization
In a visualization process, the knowledge of the user is
an indispensable part of visualization. For instance, the user
may assign specific colors to different objects in
visualization according to certain domain knowledge. The
user may choose certain viewing positions because the
visualization results can reveal more meaningful
information or a more problematic scenario that requires
further investigation.
Meanwhile, the lack of certain knowledge by the user is
often a major obstacle in deploying visualization techniques.
The user may not have received adequate training about
how to specify transfer functions. The user may not have
sufficient time or navigation skills to explore all possible
viewing positions.
Both scenarios suggest the need for knowledge-assisted
visualization. The objectives of knowledge-assisted
visualization include sharing domain knowledge among
different users, and reducing the burden upon users to
acquire knowledge about complex visualization techniques.
It also enables the visualization community to learn and
model best practice, and to develop powerful
visualization infrastructures in an evolutionary manner.
In fact, some general or domain knowledge has already
been incorporated into various visualization systems,
intentionally or unintentionally. For example, a default
transfer function in a volume visualization system may
capture the domain knowledge about a specific modality. If
one could collect a large repository of such knowledge, it
would be possible for a visualization system to choose an
appropriate transfer function according to the information
about the input datasets. Figure 3 shows a visualization
pipeline supported by a knowledge base (ℂknow), which
stores the knowledge representations captured from expert
users. Rule-based reasoning can be utilized to establish an
appropriate set, or several optional sets, of control
parameters, which can significantly reduce the ‘search’
space, especially for inexperienced users. The system
component for reasoning is commonly referred to as an
inference engine in knowledge-based systems (or expert
systems).
Figure 1. A typical visualization process. [Diagram labels: computational space (Cdata, Cctrl, Cimage, Visualization); perceptual and cognitive space (Pinfo, Pknow); Interaction.]
Figure 2. Information-assisted visualization. [Diagram labels as in Figure 1, plus Processing, Cinfo, and a supporting visualization pipeline (Visualization, Cimage).]
Figure 3. Knowledge-assisted visualization with acquired knowledge representations. [Diagram labels as in Figure 1, plus Processing, Cinfo, Reasoning, Cknow, and a knowledge-based system.]
Figure 4. Knowledge-assisted visualization with simulated cognitive processing. [Diagram labels as in Figure 1, plus Processing, Cinfo, Reasoning; knowledge supporting infrastructure (Cdata, Cinfo, Cknow, Processing, Reasoning); other visualization processes.]
The shortcomings of such a system include the
difficulties in specifying comprehensively what knowledge
to capture and the inconvenience in collecting knowledge
from experts. This constrains the deployment of such a
system to specific application domains.
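A rule-based knowledge base of the kind described above can be sketched in a few lines. The rules, the metadata keys (`modality`, `target`), and the preset names are hypothetical and only illustrate the mechanism: each rule pairs a condition on information about the input dataset with a suggested set of control parameters, and a very simple inference step returns the first rule that fires.

```python
# Hypothetical knowledge base: (condition on dataset information, suggestion).
# More specific rules are listed first because the first match wins.
RULES = [
    (lambda info: info.get("modality") == "CT" and info.get("target") == "bone",
     {"transfer_function": "high_density_opaque", "color_map": "gray"}),
    (lambda info: info.get("modality") == "CT",
     {"transfer_function": "soft_tissue_ramp", "color_map": "gray"}),
    (lambda info: info.get("modality") == "MRI",
     {"transfer_function": "smooth_ramp", "color_map": "heat"}),
]

# Fallback used when no rule applies.
DEFAULT = {"transfer_function": "linear_ramp", "color_map": "gray"}


def infer_controls(info, rules=RULES, default=DEFAULT):
    """Return the suggestion of the first rule whose condition holds."""
    for condition, suggestion in rules:
        if condition(info):
            return dict(suggestion)
    return dict(default)


print(infer_controls({"modality": "CT", "target": "bone"}))
print(infer_controls({"modality": "ultrasound"}))
```

The sketch also makes the stated shortcoming visible: every rule has to be written down by someone who knows the domain, and a dataset outside the anticipated cases simply falls through to the default.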
An alternative approach is to establish a visualization
infrastructure, where data about visualization processes are
systematically collected, processed and analyzed. Using
case-based reasoning, knowledge can be inferred from
cases of successes and failures, the common associations
between datasets and control parameters, and many other
patterns exhibited by the systems, the users and the
interactions. Such knowledge may include a popular
approach, commonly-used parameter sets, the best practice,
an optimization strategy, and so forth. Figure 4 shows such
an infrastructure.
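The retrieve-and-reuse step at the heart of case-based reasoning can be sketched as follows, assuming a hypothetical log of past visualization sessions. The case attributes, parameter names, and the attribute-matching similarity measure are all illustrative assumptions; a real infrastructure would record far richer data and use a more careful similarity measure.

```python
# Hypothetical case base collected from past visualization sessions.
CASE_BASE = [
    {"dataset": {"kind": "volume", "modality": "CT", "size": "large"},
     "params": {"technique": "ray_casting", "transfer_function": "bone_preset"},
     "success": True},
    {"dataset": {"kind": "volume", "modality": "MRI", "size": "small"},
     "params": {"technique": "ray_casting", "transfer_function": "smooth_ramp"},
     "success": True},
    {"dataset": {"kind": "flow", "modality": "simulation", "size": "large"},
     "params": {"technique": "streamlines", "seed_density": "high"},
     "success": False},
    {"dataset": {"kind": "flow", "modality": "simulation", "size": "small"},
     "params": {"technique": "streamsurfaces", "seed_density": "low"},
     "success": True},
]


def similarity(query, dataset):
    """Count the attributes on which the query and a stored dataset agree."""
    return sum(1 for key in query if dataset.get(key) == query[key])


def retrieve_and_reuse(query, case_base=CASE_BASE):
    """Reuse the parameters of the most similar successful case, if any."""
    successes = [case for case in case_base if case["success"]]
    if not successes:
        return None
    best = max(successes, key=lambda case: similarity(query, case["dataset"]))
    return dict(best["params"])


query = {"kind": "flow", "modality": "simulation", "size": "large"}
print(retrieve_and_reuse(query))
```

In this example the exactly matching case is skipped because it was recorded as a failure, so the parameters of the closest successful case are suggested instead; no expert had to write a rule for that outcome.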
Such an infrastructure is general-purpose, and can
support multiple application domains. It can potentially
enable applications to benefit from the best practice and
software developed for other applications. Such an
infrastructure can build upon advances in
other areas of computing technology, including semantic
computing, autonomic computing, knowledge-based
systems, data warehousing, machine learning, and search
engine optimization.
Conclusions
As with the development of many other computing
technologies, such as speech processing, computer
vision and web technology, one likely development path for
visualization is
from offline visualization
to interactive visualization,
to information-assisted visualization,
then to knowledge-assisted visualization.
Interactive visualization has reached maturity.
A significant amount of development is currently
under way in information-assisted visualization. With a large
amount of information collected locally and globally, it is
inevitable that there will be a transition to
knowledge-assisted visualization.
As a discipline, visualization has thrived on helping
application users to transfer data (ℂdata) in the computational
space to information (ℙinfo) and knowledge (ℙknow) in the
perceptual and cognitive space. As a discipline, we need
infrastructures to collect our own data about visualization
processes, and to transfer such data to information and
knowledge, which will further our understanding and
enhance visualization technology.
Acknowledgement
We would like to thank Heike Jänicke and Professor
Gerhard Brewka at the University of Leipzig, and Dr. Phil
Grant and Dr. John Sharp at Swansea University for their
advice and comments on the terminology used in information
theory, knowledge-based systems, and other aspects of
computing.
References
1. R. L. Ackoff, “From data to wisdom”, Journal of Applied
Systems Analysis, vol. 16, 1989, pp. 3-9.
2. S. K. Card, J. D. Mackinlay and B. Shneiderman, Readings in
Information Visualization: Using Vision to Think, Morgan
Kaufmann Publishers, San Francisco, 1999.
3. E. H. Chi, “A taxonomy of visualization techniques using the
data state reference model”, Proc. IEEE Symposium on
Information Visualization, 2000, pp. 69-75.
4. U. Fayyad, G. G. Grinstein and A. Wierse, Information
Visualization in Data Mining and Knowledge Discovery,
Morgan Kaufmann Publishers, San Francisco, 2002.
5. K. Lehrer, Theory of Knowledge, Westview Press, 1990.
6. G. Scott, G. Domik, T.-M. Rhyne, K. W. Brodlie and B. S.
Santos, Definitions and Rationale for Visualization,
http://www.siggraph.org/education/materials/HyperVis/visgoals/visgoal2.htm
(accessed in April 2008).
7. N. Sharma, The Origin of the “Data Information Knowledge
Wisdom” Hierarchy,
http://www-personal.si.umich.edu/~nsharma/dikw_origin.htm
(accessed in April 2008).
8. B. Shneiderman, “The eyes have it: a task by data type
taxonomy for information visualizations”, Proc. IEEE
Symposium on Visual Languages, 1996, pp. 336-343.
9. E. E. Smith and S. M. Kosslyn, Cognitive Psychology: Mind
and Brain, Pearson Prentice-Hall, Upper Saddle River, New
Jersey, 2007.
10. M. Tory and T. Möller, “Rethinking visualization: A high-level
taxonomy”, Proc. IEEE Symposium on Information
Visualization, 2004, pp. 151-158.
Examples
There are many examples of information-assisted visualization. On the other
hand, the development of knowledge-assisted visualization is very much in its
infancy. Here we selectively describe several examples of information-assisted
visualization in the literature, whilst accentuating the use, or potential use, of
knowledge in a few visualization systems. These examples are intended to
reinforce the viewpoints of this article, rather than to provide a comprehensive
survey.
Examples of information-assisted visualization
Curve-skeleton
Curve-skeletons are 1D geometrical representations abstracted from 3D
objects in an input dataset. Such information can be used to aid visualization
tasks, including virtual navigation, reduced-model formulation, visualization
improvement, and animation. For example, in virtual endoscopy, curve-skeletons
are used to specify collision-free paths for navigation through human organs [2].
Isosurface topology
Isosurface topology, which is typically represented as a
contour tree, provides an abstract insight into the structural
relationship and connectivity between isosurfaces in a
dataset. In volume visualization, such information can assist
users in distinguishing features in different topological
zones, comprehending complex relationships between
isosurfaces, and designing effective transfer functions [8].
Local statistical complexity
Local statistical complexity (LSC) is an information-theoretic
measure, which tells how much information from
the local past is required to predict the dynamics in the local
future. Given a time-varying dataset, we can assign each
data point an LSC value. Higher LSC values indicate
regions that feature an extraordinary temporal evolution,
whereas lower values indicate temporal patterns that occur
frequently in the dataset [5]. As demonstrated in Figure 5,
such information can assist users in generating a
visualization that highlights temporally-important features.
Figure 5. The local statistical complexity (LSC) of a flow around a delta wing
(gray triangle). Four streamsurfaces indicate the vortices on top of the delta
wing. The two isosurfaces in blue and light blue separate regions that hold LSC
values within the range [14.7;15] and [11;15] respectively. High LSC values point
the user to distinctive regions that may feature significant temporal events. The
image is provided by Heike Jänicke, University of Leipzig [5].
Data abstraction quality
Measuring the quality of visualization results, such as visual density and
clutter, provides users with useful guidance in synthesizing the most effective
visualization. One such measurement is data abstraction quality, which measures
the degree to which the visualization results convey the original dataset. Such
information enables users to determine the optimal abstraction level for a given
visualization task, and to compare different visualization methods in terms of
their capability of maintaining dominant characteristics of the original dataset
while reducing the size and detail of the data [3].
Examples of knowledge-assisted visualization
Viewpoint mutual information
From Figures 2 and 3, we can observe that one transition path from
information-assisted visualization to knowledge-assisted visualization is to
automate the process of reasoning about the information abstracted from the
input data. A classical example of such a transition is [7], where viewpoint
mutual information (VMI), which measures the dependence or correlation between
a set of viewpoints and a set of objects in a dataset, is used to determine the
optimal viewpoint. The fundamental difference between this approach and the
above-mentioned examples is that users do not make decisions according to the
processed VMI. Instead, a relatively simple rule for minimizing VMI is used to
determine viewpoint transformation automatically. Such a rule can be seen as a
piece of knowledge hard-coded in the system.
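The following sketch shows what such a hard-coded rule could look like. It assumes a common information-theoretic form of viewpoint mutual information, I(v; O) = Σ_o p(o|v) log2(p(o|v) / p(o)), where p(o|v) is a normalized visibility of object o from viewpoint v and p(o) is its average over viewpoints; this formula, the visibility table, and the function names are assumptions for illustration and are not taken from the text, which only states that VMI is minimized.

```python
import math


def viewpoint_mutual_information(p_o_given_v, p_v):
    """Return I(v; O) for each viewpoint v.

    p_o_given_v[v][o]: normalized visibility of object o from viewpoint v
                       (each row sums to 1).
    p_v[v]:            probability (weight) of viewpoint v.
    """
    n_objects = len(p_o_given_v[0])
    # Marginal p(o): visibility of each object averaged over all viewpoints.
    p_o = [sum(p_v[v] * row[o] for v, row in enumerate(p_o_given_v))
           for o in range(n_objects)]
    return [sum(p * math.log2(p / p_o[o]) for o, p in enumerate(row) if p > 0)
            for row in p_o_given_v]


def best_viewpoint(p_o_given_v, p_v):
    """The hard-coded rule: pick the viewpoint index with minimal VMI."""
    vmi = viewpoint_mutual_information(p_o_given_v, p_v)
    return min(range(len(vmi)), key=vmi.__getitem__)


# Three viewpoints, two objects (hypothetical visibility values).
visibility = [
    [0.5, 0.5],   # viewpoint 0 shows both objects equally
    [0.9, 0.1],   # viewpoint 1 mostly shows object 0
    [0.1, 0.9],   # viewpoint 2 mostly shows object 1
]
weights = [1 / 3, 1 / 3, 1 / 3]

vmi = viewpoint_mutual_information(visibility, weights)
print(vmi)
print(best_viewpoint(visibility, weights))
```

Under this assumed definition, the selected viewpoint is the one whose visibility distribution is closest to the average over all viewpoints, and the user never inspects the VMI values; the `min` call is the piece of knowledge embedded in the system.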
Pre-determined ranking
In [6], a noticeable amount of generic knowledge is captured as ranks of
different visualization designs. This enables the visualization system to
automatically take users through a design process for creating a visualization.
The stored ranks and ranking conditions are essentially a collection of expert
knowledge.
Ontology mapping
The determination of visualization designs and
parameters should depend on the input data. One approach
is to extract semantic information from the input data, and
try to find the best match with the semantic information of
visualization designs (e.g., treemaps, graphs) and the
associated parameters (e.g., size, axes). In [4], three
ontologies, which are knowledge representations, are used
to store (a) the domain-specific semantics about a class of
input data, (b) the semantics about available visualization
designs, and (c) the ontological mapping from (a) to (b).
With these three ontologies, different visualization designs
are dynamically ranked according to the input data, and a
set of highly ranked visualization designs is presented to
the user automatically.
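The three-part structure described above (domain semantics, design semantics, and a mapping between them) can be mimicked with plain dictionaries. Everything below is a hypothetical stand-in: real ontologies are much richer than sets of tags, and the field names, concepts, and scoring rule are assumptions chosen only to show how a ranking can fall out of the mapping.

```python
# (a) Hypothetical domain semantics: input field -> abstract concept.
DOMAIN_MAPPING = {
    "release_date": "temporal",
    "sales": "quantitative",
    "genre": "categorical",
    "parent_category": "hierarchical",
}

# (b) Hypothetical design semantics: design -> concepts it can encode.
DESIGN_SEMANTICS = {
    "treemap": {"hierarchical", "quantitative"},
    "node_link_graph": {"relational"},
    "bar_chart": {"categorical", "quantitative"},
    "line_chart": {"temporal", "quantitative"},
}


def rank_designs(fields, domain_mapping=DOMAIN_MAPPING,
                 design_semantics=DESIGN_SEMANTICS):
    """(c) Map fields to concepts, then rank designs by concept coverage."""
    concepts = {domain_mapping[f] for f in fields if f in domain_mapping}
    scored = []
    for design, required in design_semantics.items():
        # Fraction of the design's concepts that the input data provides.
        score = len(required & concepts) / len(required)
        scored.append((score, design))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [design for score, design in scored if score > 0]


print(rank_designs(["release_date", "sales"]))
```

Changing the input fields changes the ranking without touching the code, which is the sense in which the designs are ranked dynamically according to the input data.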
Workflow management
VisTrails is a visualization infrastructure that provides
users with workflow management [1]. It is capable of
capturing and storing a huge amount of data about input
datasets, user interaction and visualization results in
visualization processes. VisTrails exhibits some of the
primary characteristics of the knowledge supporting
infrastructure shown in Figure 4, though it currently has
limited automated reasoning capability. Such an
infrastructure has great potential to be developed into an
infrastructure for knowledge-assisted visualization.
References
1. L. Bavoil, S. P. Callahan, P. J. Crossno, J. Freire, C. E.
Scheidegger, C. T. Silva and H. T. Vo, “VisTrails: enabling
interactive multiple-view visualizations”, Proc. IEEE
Visualization, 2005, pp. 135-142.
2. N. D. Cornea, D. Silver, P. Min, “Curve-skeleton properties,
applications, and algorithms”, IEEE Transactions on
Visualization and Computer Graphics, vol. 13, no. 3, 2007, pp.
530-548.
3. Q. Cui, M. Ward, E. A. Rundensteiner and J. Yang,
“Measuring data abstraction quality in multiresolution
visualizations”, IEEE Transactions on Visualization and
Computer Graphics, vol. 12, no. 5, 2006, pp. 709-716.
4. O. Gilson, N. Silva, P.W. Grant and M. Chen, “From web data
to visualization via ontology mapping”, Computer Graphics
Forum (special issue for EuroVis2008), vol. 27, no. 3, 2008,
pp. 959-966.
5. H. Jänicke, A. Wiebel, G. Scheuermann and W. Kollmann,
“Multifield visualization using local statistical complexity”,
IEEE Transactions on Visualization and Computer Graphics,
vol. 13, no. 6, 2007, pp. 1384-1391.
6. J. D. Mackinlay, P. Hanrahan and C. Stolte, “Show Me:
automatic presentation for visual analysis”, IEEE Transactions
on Visualization and Computer Graphics, vol. 13, no. 6, 2007,
pp. 1137-1144.
7. I. Viola, M. Feixas, M. Sbert and M. E. Gröller, “Importance-
driven focus of attention”, IEEE Transactions on Visualization
and Computer Graphics, vol. 12, no. 5, 2006, pp. 933-940.
8. G. H. Weber, S. E. Dillard, H. Carr, V. Pascucci, B. Hamann,
“Topology-controlled volume rendering”, IEEE Transactions
on Visualization and Computer Graphics, vol. 13, no. 2, 2007,
pp. 330-341.
... Chen et al. [12] defined the data, information, and knowledge in visualization based on the DIKW pyramid [55]. Besides, they proposed the concept of knowledge-assisted visualization that transfers knowledge in the human brain (perceptual and cognitive space) into control parameters of visualizations (computational space) through interactions. ...
Preprint
Embedding is a common technique for analyzing multi-dimensional data. However, the embedding projection cannot always form significant and interpretable visual structures that foreshadow underlying data patterns. We propose an approach that incorporates human knowledge into data embeddings to improve pattern significance and interpretability. The core idea is (1) externalizing tacit human knowledge as explicit sample labels and (2) adding a classification loss in the embedding network to encode samples' classes. The approach pulls samples of the same class with similar data features closer in the projection, leading to more compact (significant) and class-consistent (interpretable) visual structures. We give an embedding network with a customized classification loss to implement the idea and integrate the network into a visualization system to form a workflow that supports flexible class creation and pattern exploration. Patterns found on open datasets in case studies, subjects' performance in a user study, and quantitative experiment results illustrate the general usability and effectiveness of the approach.
... Humans improve their K_i by combining new insights with existing knowledge. The existing knowledge itself can in turn be divided into, on the one hand, domain knowledge and, on the other hand, operational knowledge [21]. However, in the context of SA, we consider this differentiation too vague. ...
Article
The Internet-of-Things and ubiquitous cyber-physical systems increase the attack surface for cyber-physical attacks. They exploit technical vulnerabilities and human weaknesses to wreak havoc on organizations’ information systems, physical machines, or even humans. Taking a stand against these multi-dimensional attacks requires automated measures to be combined with people as their knowledge has proven critical for security analytics. However, there is no uniform understanding of information security knowledge and its integration into security analytics activities. With this work, we structure and formalize the crucial notions of knowledge that we deem essential for holistic security analytics. A corresponding knowledge model is established based on the Incident Detection Lifecycle, which summarizes the security analytics activities. This idea of knowledge-based security analytics highlights a dichotomy in security analytics. Security experts can operate security mechanisms and thus contribute their knowledge. However, security novices often cannot operate security mechanisms and, therefore, cannot make their highly-specialized domain knowledge available for security analytics. This results in several severe knowledge gaps. We present a research prototype that shows how several of these knowledge gaps can be overcome by simplifying the interaction with automated security analytics techniques.
... This scale also builds upon and is aligned with the NASA Autonomous Systems Taxonomy [18] by organizing attributes into a prioritized hierarchy of increasing levels of self-awareness, with each sequential level including preceding capabilities. Additionally, for the purposes of this work, the word 'knowledge' will be used over 'data' or 'information' due to the implication of an assigned meaning or understanding within the system [36] (see Table 4). ...
Article
Future deep-space crewed exploration plans include long duration missions (>1000 days) that will be constrained by lengthy transmission delays and potential occultations in communications, as well as infrequent resupply opportunities and likely periods of habitat unoccupancy. In order to meet the high level of autonomy needed for these missions, many essential capabilities and knowledge previously accomplished through ground support and human operators must now be designed into onboard systems to enable increasing self-reliance. Emergent technologies, including autonomous systems, have the potential to be mission enabling in deep space; however, as these technologies are often low-TRL and without defined mass, power, or volume, their net impact to the design must be assessed through alternative means, especially during the early planning phases. This paper proposes the concept of designing for self-reliant space habitats as the foundation for assessing potential contributions from the integration of emergent technologies. The term ‘self-reliance’ can be thought of as a combination of the spacecraft system and onboard crew's knowledge (self-awareness) and capabilities (self-sufficiency) independent of external intervention. In order to provide context for human spaceflight, these terms are first derived from related terrestrial applications. Subsequently, a methodology for characterizing the degree of self-awareness and self-sufficiency in a space habitat is outlined to provide designers with logic for assessing the contributions of emergent technologies to the overall self-reliance of the habitat as needed to allow future Earth-independence. The definitions and characterization logic provided in this work offer a systematic process for designing toward self-reliance in future deep space missions.
... Data must be presented and visualized in such a way that their analysis and interpretation allow the higher education institution (IES) to make correct decisions for academic management. When referring to visualization, data, information and knowledge are three terms that are used extensively, often in a related context. In many cases they are used to indicate different levels of abstraction, understanding or truthfulness, and the literature shows that data visualization techniques work directly with these terms (Chen et al., 2009). become complex and need additional support to reach a full understanding of the results obtained", maintains Eppler (2007). ...
Chapter
This research aimed to diagnose the level of knowledge and use of technological tools among the teachers of the Fundación FINESEC in their teaching practice, with the further aim, where warranted, of implementing a didactic training workshop to improve their skills in using and handling such tools. The study takes a quantitative approach and is descriptive-projective, field-based, and cross-sectional in design. A questionnaire-type survey was used for data collection. The results show that 66.7% of the teachers surveyed have a low level of proficiency in handling digital technological resources (ICT), and 33.3% a very low level. In conclusion, the diagnosis established the need to implement a training workshop on the handling of technological resources. After the workshop was held, an improvement in the teachers' handling of ICT could be confirmed: they are now active participants in the creative construction of a wide range of educational content aimed at improving their students' performance.
... Knowledge-assisted visualization enables users to externalize and share their knowledge, such as parameter settings and notes, to support visual exploration [10,21]. As also discussed by He et al. [26], the study results imply that characterizing insights automatically through user interactions or referred entities can facilitate personalized insight recommendations. ...
Preprint
One of the primary purposes of visualization is to assist users in discovering insights. While there has been much research in information visualization aiming at complex data transformation and novel presentation techniques, relatively little has been done to understand how users derive insights through interactive visualization of data. This paper presents a crowdsourced study with 158 participants investigating the relation between entity-based interaction (an action + its target entity) and the resulting insight. To this end, we generalized the interaction with an existing CO2 Explorer as entity-based interaction and enabled users to input notes and refer to relevant entities to assist their narratives. We logged interactions of users freely exploring the visualization and characterized their externalized insights about the data. Using entity-based interactions and references to infer insight characteristics (category, overview versus detail, and prior knowledge), we found evidence that compared with interactions, entity references improved insight characterization from slight/fair to fair/moderate agreements. To interpret prediction outcomes, feature importance and correlation analysis indicated that, e.g., detailed insights tended to have more mouse-overs in the chart area and cite the vertical reference lines in the line chart as evidence. We discuss study limitations and implications on knowledge-assisted visualization, e.g., insight recommendations based on user exploration.
Article
In this study, prompted by the absence of data management as an issue in the document "2020-2023 National Smart City Strategies and Action Plan" prepared by the Turkish Ministry of Environment, Urbanization and Climate Change, an answer was sought, in the context of public administration, to the question of how urban data can be managed. The aim is to reveal the deficiencies in data management in the public administration literature and in the aforementioned document, and to propose solutions accordingly. To this end, the literature on smart cities, data, and data management concepts and approaches was reviewed. The main deficiencies of the literature were found to be the neglect of the socio-technical viewpoint and the lack of reviews of data management from the perspective of public administration. Based on this finding, approaches that offer solutions to smart city design, strategies and policies in the context of data management are evaluated.
Article
With advances in telemetry and computational power, the question for firms is no longer what can be collected but what should be collected. It is easy to add sensors and gather a vast amount of data, but without knowing their purpose, the gains are marginal. In the field of Value-Driven Design, value is quantified and measured to ensure an overall higher output, something crucial for complex and ambiguous domains such as Product-Service Systems (PSS). This paper aims to find the connection between PSS value and operational data from an industrial perspective. Using Participatory Action Research in a manufacturing firm, a value-data framework and method were developed that succeed in connecting these two extremes and in visualizing them with the support of network graph theory.
Article
Teams of human operators and artificial intelligent agents (AIAs) in multi-agent systems present a unique set of challenges to team coordination. This research endeavors to employ a machine learning framework to estimate a set of ranks among quality goals, where the quality goals are designed to help communicate important elements of operator intent to aid the development of a Shared Mental Model among members in a multi-agent team. Using a representation referred to as the Operationalized Intent model to capture quality goals relevant to “how” the operator would like to execute the team’s mission, this paper details the development and evaluation of a random forest algorithm to estimate operator priorities. Estimation is structured as a label ranking problem in which quality goals, which constrain “how” work is to be conducted, are ranked according to their priority. Modifying an existing label ranking algorithm, we demonstrate that the Operationalized Intent Estimator-Random Forest (OIE-RF) can estimate quality goal rankings more accurately than a situation baseline which is derived by observing the variability among operators. OIE-RF demonstrates stability in dynamic testing and the ability to use explicit communication and operator identity to increase accuracy. This exploratory research opens a new avenue for improving coordination and performance of human-agent teams.
Article
Data mining is the science of extracting information or ‘knowledge’ from data. It is a task commonly executed on cloud computing resources, personal computers and laptops. However, what about smartphones? Despite the fact that these ubiquitous mobile devices now offer levels of hardware and performance approaching that of laptops, locally executed model-training using data mining methods on smartphones is still notably rare. On-device model-training offers a number of advantages. It largely mitigates issues of data security and privacy, since no data is required to leave the device. It also ensures a self-contained, fully-portable data mining solution requiring no cloud computing or network resources and able to operate in any location. In this paper, we focus on the intersection of smartphones and data mining. We investigate the growth in smartphone performance, survey smartphone usage models in previous research and look at recent developments in locally-executed data mining on smartphones.
Article
The Data Information Knowledge and Wisdom Hierarchy (DIKW) has been gaining popularity in many domains. While there has been a lot of articulation of the hierarchy itself, the origins of this ubiquitous and frequently used hierarchy are largely unexplored. In this short piece we trace the trails of this hierarchy. Like an urban legend, it’s everywhere yet few know where it came from.