PoleStar, a visualization specification tool inspired by Tableau. Listing 1 shows the generated Vega-lite specification.

Source publication

Voyager: Exploratory Analysis via Faceted Browsing of Visualization Recommendations

Article

Sep 2015

General visualization tools typically require manual specification of views: analysts must select data variables and then choose which transformations and visual encodings to apply. These decisions often involve both domain and visualization design expertise, and may impose a tedious specification process that impedes exploration. In this paper, we...

Context 1

... "}, "marktype": "point", "encoding": { "x": { "name": "Miles_per_Gallon", "type": "Q", "summarize": "mean" }, "y": { "name": "Horsepower", "type": "Q", "summarize": "mean" }, "row": { "name": "Origin", "type": "N", "sort": [{ "name": "Horsepower", "summarize": "mean", "reverse": true }] }, "color": { "name": "Cylinders", "type": "N" } } } Listing 1. A Vega-lite specification of the visualization shown in Figure 10. The JSON object specifies a trellis of scatter plots for a data set about cars. ...
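The encoding in the listing excerpt above is easier to read when parsed and printed channel by channel. The following Python sketch is illustrative only: it covers just the part of the specification that survives in the excerpt (the leading portion, elided as "..." in the source, is left out rather than guessed), and the helper name `describe_encoding` is hypothetical. The type codes "Q" and "N" presumably abbreviate quantitative and nominal.

```python
import json

# Recoverable part of Listing 1. The leading portion of the spec is elided
# ("...") in the excerpt and is deliberately omitted here, not reconstructed.
FRAGMENT = """
{
  "marktype": "point",
  "encoding": {
    "x": {"name": "Miles_per_Gallon", "type": "Q", "summarize": "mean"},
    "y": {"name": "Horsepower", "type": "Q", "summarize": "mean"},
    "row": {
      "name": "Origin",
      "type": "N",
      "sort": [{"name": "Horsepower", "summarize": "mean", "reverse": true}]
    },
    "color": {"name": "Cylinders", "type": "N"}
  }
}
"""


def describe_encoding(spec):
    """Hypothetical helper: one (channel, field, type, summarize) tuple per channel."""
    return [
        (channel, enc["name"], enc["type"], enc.get("summarize"))
        for channel, enc in spec["encoding"].items()
    ]


spec = json.loads(FRAGMENT)
for channel, field, ftype, summarize in describe_encoding(spec):
    print(f"{channel:>5}: {field} ({ftype}), summarize={summarize}")
```

Read this way, the fragment maps mean Miles_per_Gallon to x and mean Horsepower to y, splits the plots into rows by Origin (sorted by mean Horsepower, reversed), and colors points by Cylinders.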

Context 2

... conducted a user study to contrast recommendation browsing with manual chart construction, focusing on exploratory analysis of previously unseen data. We compared Voyager with PoleStar, our own implementation of a visualization specification interface (Figure 10). ...

Context 3

... also features similar UI elements, including field capsules, bookmarks, and an undo mechanism. Figure 10 illustrates PoleStar's interface. The left-hand panel presents the data schema, listing all variables in the dataset. ...

GeoVis: a data-driven geographic visualization recommendation system via latent space encoding

Article

Apr 2024
J Visual

As one of the effective means of representing geographic information, geographic visualization can directly improve the cognitive efficiency of users who are perceiving geospatial data. The existing geographic information visualization relies heavily on the background knowledge and visualization skills the data workers own. Therefore, the geographic visualization task is usually very time-consuming and challenging. To lower the barrier of visualization of geographical data, we propose a novel recommendation system of geographic information visualization called GeoVis. This system extracts the distribution characteristics with adaptive kernel density estimation and recommends the map type (scatter, bubble, hexbin and heatmap) that can best reflect the regularity of data distribution based on latent code. The key idea of how the data-driven recommendation works is to use latent code to express and decouple data features and then learn the mapping between data features and visual styles. At the same time, this system recommends design choices (e.g., map styles and color schemes). Users only need to browse the recommendation results to realize explorations and analyses of the dataset, which will greatly improve their work efficiency. We conduct a series of evaluation experiments on the proposed system, including a case study. The experiment results show that the system is practical and effective and can perform the task of recommending informative and esthetic geographical visualization results well.

IVESA - Visual Analysis of Time-Stamped Event Sequences

Article

Apr 2024
IEEE T VIS COMPUT GR

Time-stamped event sequences (TSEQs) are time-oriented data without value information, shifting the focus of users to the exploration of temporal event occurrences. TSEQs exist in application domains, such as sleeping behavior, earthquake aftershocks, and stock market crashes. Domain experts face four challenges, for which they could use interactive and visual data analysis methods. First, TSEQs can be large with respect to both the number of sequences and events, often leading to millions of events. Second, domain experts need validated metrics and features to identify interesting patterns. Third, after identifying interesting patterns, domain experts contextualize the patterns to foster sensemaking. Finally, domain experts seek to reduce data complexity by data simplification and machine learning support. We present IVESA, a visual analytics approach for TSEQs. It supports the analysis of TSEQs at the granularities of sequences and events, supported with metrics and feature analysis tools. IVESA has multiple linked views that support overview, sort+filter, comparison, details-on-demand, and metadata relation-seeking tasks, as well as data simplification through feature analysis, interactive clustering, filtering, and motif detection and simplification. We evaluated IVESA with three case studies and a user study with six domain experts working with six different datasets and applications. Results demonstrate the usability and generalizability of IVESA across applications and cases that had up to 1,000,000 events.

Guided Visual Interactive Exploration and Labeling of Industrial Sensor Data

Book

Mar 2024

Tristan Funken

Comprehensive, accurately labeled sensor datasets are an essential prerequisite for training supervised machine learning models used for tasks such as quality control, predictive maintenance, and defect detection in the manufacturing industry. However, the provision of such datasets still poses two specific challenges: first, performing exploratory data analysis (EDA) to provide data scientists with the necessary knowledge in the domain context to label the data, and second, a lack of visual interactive labeling (VIAL) approaches to efficiently annotate large volumes of industrial sensor data with accurate labels. This dissertation proposes the innovative VIEDAL process, integrating guidance systems for both EDA and VIAL tasks. Drawing from real-world use cases, this thesis presents a detailed system design to support each task, addressing feasibility and usefulness through a comprehensive design study. The EDA guidance system records domain expert interactions to generate guided sessions for novices, while the VIAL guidance system incorporates unsupervised and active learning approaches to streamline dataset annotation. Through user studies, the effectiveness of the proposed systems is evaluated, demonstrating reproducibility of expert key insights through generated EDA sessions and faster creation of high-quality labeled datasets. Additionally, this work discusses approaches for transferring recorded EDA sessions and VIAL models between use cases to streamline future guidance system implementations. The results of this thesis provide a foundation for further research to expedite the creation of labeled sensor datasets, thereby facilitating faster development and integration of machine learning models for enhancing production processes in the manufacturing industry.

V-FRAMER: Visualization Framework for Mitigating Reasoning Errors in Public Policy

Conference Paper

Feb 2024

Existing data visualization design guidelines focus primarily on constructing grammatically-correct visualizations that faithfully convey the values and relationships in the underlying data. However, a designer may create a grammatically-correct visualization that still leaves audiences susceptible to reasoning misleaders, e.g., by failing to normalize data or using unrepresentative samples. Reasoning misleaders are especially pernicious when presenting public policy data, where data-driven decisions can affect public health, safety, and economic development. Through textual analysis, a formative evaluation, and iterative design with 19 policy communicators, we construct an actionable visualization design framework, V-FRAMER, that effectively synthesizes ways of mitigating reasoning misleaders. We discuss important design considerations for frameworks like V-FRAMER, including using concrete examples to help designers understand reasoning misleaders, and using a hierarchical structure to support example-based accessing. We further describe V-FRAMER’s congruence with current practice and how practitioners might integrate the framework into their existing workflows. Related materials available at: https://osf.io/q3uta/.

Conclusion

Chapter

Dec 2023

This chapter briefly summarizes the content of the book and describes practical concerns of visualizing time-oriented data in real-world data settings. Visual analytics is briefly outlined as a modern approach that combines visualization, interaction, and computational analysis more tightly to facilitate data analysis activities better. Finally, research opportunities for future work are discussed.

A Heuristic Approach for Dual Expert/End-User Evaluation of Guidance in Visual Analytics

Article

Oct 2023
IEEE T VIS COMPUT GR

Guidance can support users during the exploration and analysis of complex data. Previous research focused on characterizing the theoretical aspects of guidance in visual analytics and implementing guidance in different scenarios. However, the evaluation of guidance-enhanced visual analytics solutions remains an open research question. We tackle this question by introducing and validating a practical evaluation methodology for guidance in visual analytics. We identify eight quality criteria to be fulfilled and collect expert feedback on their validity. To facilitate actual evaluation studies, we derive two sets of heuristics. The first set targets heuristic evaluations conducted by expert evaluators. The second set facilitates end-user studies where participants actually use a guidance-enhanced system. By following such a dual approach, the different quality criteria of guidance can be examined from two different perspectives, enhancing the overall value of evaluation studies. To test the practical utility of our methodology, we employ it in two studies to gain insight into the quality of two guidance-enhanced visual analytics solutions, one being a work-in-progress research prototype, and the other being a publicly available visualization recommender system. Based on these two evaluations, we derive good practices for conducting evaluations of guidance in visual analytics and identify pitfalls to be avoided during such studies.

Dead or Alive: Continuous Data Profiling for Interactive Data Science

Article

Oct 2023
IEEE T VIS COMPUT GR

Profiling data by plotting distributions and analyzing summary statistics is a critical step throughout data analysis. Currently, this process is manual and tedious since analysts must write extra code to examine their data after every transformation. This inefficiency may lead to data scientists profiling their data infrequently, rather than after each transformation, making it easy for them to miss important errors or insights. We propose continuous data profiling as a process that allows analysts to immediately see interactive visual summaries of their data throughout their data analysis to facilitate fast and thorough analysis. Our system, AutoProfiler, presents three ways to support continuous data profiling: (1) it automatically displays data distributions and summary statistics to facilitate data comprehension; (2) it is live, so visualizations are always accessible and update automatically as the data updates; (3) it supports follow-up analysis and documentation by authoring code for the user in the notebook. In a user study with 16 participants, we evaluate two versions of our system that integrate different levels of automation: both automatically show data profiles and facilitate code authoring; however, one version updates reactively (“live”) and the other updates only on demand (“dead”). We find that both tools, dead or alive, facilitate insight discovery with 91% of user-generated insights originating from the tools rather than manual profiling code written by users. Participants found live updates intuitive and felt it helped them verify their transformations while those with on-demand profiles liked the ability to look at past visualizations. We also present a longitudinal case study on how AutoProfiler helped domain scientists find serendipitous insights about their data through automatic, live data profiles. Our results have implications for the design of future tools that offer automated data analysis support.

Design Cognition in Data Visualization

Chapter

Oct 2023

Paul C. Parsons

In this chapter I introduce the field of design cognition and its relevance to data visualization. I outline two historically dominant paradigms of design cognition. The first, promoted by Herbert Simon in the 1970s, is the rational problem solving paradigm which is based on information processing psychology and problem solving theory. The second, promoted by Donald Schön in the 1980s, is the reflective practice paradigm which is based on constructivist philosophy and situated views of cognition. I describe some of their strengths and weaknesses, and some attempts to reconcile their differences. Underlying philosophical issues pertaining to cognition and epistemology are briefly discussed. I then examine implications of these two paradigms for four data visualization topics: defining, automating, modeling, and teaching data visualization design. In discussing these topics, possible avenues of future research are proposed.

Interactive Generation of Narrative Visualizations for Risk Communication

Thesis

Sep 2023

Anna Kleinau

The urgent need for better health communication is especially evident when considering the millions of deaths caused each year worldwide by lifestyle choices and behavioral risk factors. These deaths show that simply researching and understanding the risks caused by these factors is insufficient, as they must also be effectively communicated to the general public. Narrative visualization can help achieve this by exploring how data can be visualized and incorporated into an engaging story that will capture the interest of the public. In my thesis, I investigate how risk visualizations can be designed for a general audience, and subsequently, I develop a tool to assist story authors in creating data-based risk visualizations for their data stories. With the tool I developed, the story author can receive recommendations for risk factors based on their data set. Using methods from visualization and annotation generation, these recommendations are presented as suitable visualizations based on current research in risk communication. The visualizations adapt to the current intention of the story author, whether it is to explore the data set, convince the public to change their behavior, or educate the public about risk factors for a specific disease. Using the tool, the story author can select the most relevant risk factors, customize the visualizations and export them for integration into their data story. I evaluated my tool with domain experts, as well as the resulting visualizations with the general public. The results demonstrate that the tool is usable and that the visualizations are understandable and engaging for the general public. By combining the research fields of risk communication, visualization generation and narrative visualization, my work provides a novel approach to support domain experts in communicating risks and risk factors to the general public. With my approach, experts can create annotated, data-based risk visualizations without requiring expertise in risk communication, visualization design or narrative visualization.

mTreeIllustrator: A Mixed-Initiative Framework for Visual Exploratory Analysis of Multidimensional Hierarchical Data

Article

Aug 2023

Multidimensional hierarchical (mTree) data are very common in daily life and scientific research. However, mTree data exploration is a laborious and time-consuming process due to its structural complexity and large dimension combination space. To address this problem, we present mTreeIllustrator, a mixed-initiative framework for exploratory analysis of multidimensional hierarchical data with faceted visualizations. First, we propose a recommendation pipeline for the automatic selection and visual representation of important subspaces of mTree data. Furthermore, we design a visual framework and an interaction schema to couple automatic recommendations with human specifications to facilitate progressive exploratory analysis. Comparative experiments and user studies demonstrate the usability and effectiveness of our framework.
