Fig 10 - uploaded by Dominik Moritz
PoleStar, a visualization specification tool inspired by Tableau. Listing 1 shows the generated Vega-lite specification.

Source publication
Article
General visualization tools typically require manual specification of views: analysts must select data variables and then choose which transformations and visual encodings to apply. These decisions often involve both domain and visualization design expertise, and may impose a tedious specification process that impedes exploration. In this paper, we...

Contexts in source publication

Context 1
... "}, "marktype": "point", "encoding": {"x": {"name": "Miles_per_Gallon", "type": "Q", "summarize": "mean"}, "y": {"name": "Horsepower", "type": "Q", "summarize": "mean"}, "row": {"name": "Origin", "type": "N", "sort": [{"name": "Horsepower", "summarize": "mean", "reverse": true}]}, "color": {"name": "Cylinders", "type": "N"}}} Listing 1. A Vega-lite specification of the visualization shown in Figure 10. The JSON object specifies a trellis of scatter plots for a data set about cars. ...
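To make the channel mapping in Listing 1 easier to read, the following is a minimal Python sketch that parses only the quoted fragment and prints one line per encoding channel. The leading part of the specification is elided in the snippet and is deliberately left out here. The helper `describe_encoding` and the `TYPE_NAMES` lookup are hypothetical names introduced for illustration, and the expansion of the type codes (`Q` as quantitative, `N` as nominal) is an assumption not stated in the snippet.

```python
import json

# Encoding fragment copied from Listing 1. The elided leading part of the
# specification (everything before "marktype") is intentionally omitted.
fragment = """
{
  "marktype": "point",
  "encoding": {
    "x": {"name": "Miles_per_Gallon", "type": "Q", "summarize": "mean"},
    "y": {"name": "Horsepower", "type": "Q", "summarize": "mean"},
    "row": {
      "name": "Origin", "type": "N",
      "sort": [{"name": "Horsepower", "summarize": "mean", "reverse": true}]
    },
    "color": {"name": "Cylinders", "type": "N"}
  }
}
"""

# Assumed expansion of the one-letter type codes; only Q and N occur here.
TYPE_NAMES = {"Q": "quantitative", "N": "nominal"}


def describe_encoding(spec):
    """Return one readable line per encoding channel (hypothetical helper)."""
    lines = []
    for channel, field in spec["encoding"].items():
        label = field["name"]
        # Wrap the field name in its aggregate, if one is specified.
        if "summarize" in field:
            label = f'{field["summarize"]}({label})'
        type_name = TYPE_NAMES.get(field["type"], field["type"])
        lines.append(f"{channel}: {label} [{type_name}]")
    return lines


spec = json.loads(fragment)
for line in describe_encoding(spec):
    print(line)
```

Read this way, the fragment plots mean miles per gallon against mean horsepower as points, with one row per Origin and color by Cylinders.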
Context 2
... conducted a user study to contrast recommendation browsing with manual chart construction, focusing on exploratory analysis of previously unseen data. We compared Voyager with PoleStar, our own implementation of a visualization specification interface (Figure 10). ...
Context 3
... also features similar UI elements, including field capsules, bookmarks, and an undo mechanism. Figure 10 illustrates PoleStar's interface. The left-hand panel presents the data schema, listing all variables in the dataset. ...

Citations

... Visualization recommendation system (Vartak et al. 2017) is an important way to help users explore data. It has a novel operation interface and convenient user interaction design, which greatly decreases the technical difficulty of visualization and improves the understanding of abstract data (Qin et al. 2020). Early studies are based on rules, including Voyager (Wongsuphasawat et al. 2015), DIVE (Hu et al. 2018) and Show Me (Mackinlay et al. 2007). Although their visualizations look impressive and artistic, these methods are greatly dependent on the guidelines made by them, which has obvious limitations considering visualization design space will have explosive growth when data dimension increases. ...
Article
As one of the effective means of representing geographic information, geographic visualization can directly improve the cognitive efficiency of users who are perceiving geospatial data. The existing geographic information visualization relies heavily on the background knowledge and visualization skills the data workers own. Therefore, the geographic visualization task is usually very time-consuming and challenging. To lower the barrier of visualization of geographical data, we propose a novel recommendation system of geographic information visualization called GeoVis. This system extracts the distribution characteristics with adaptive kernel density estimation and recommends the map type (scatter, bubble, hexbin and heatmap) that can best reflect the regularity of data distribution based on latent code. The key idea of how the data-driven recommendation works is to use latent code to express and decouple data features and then learn the mapping between data features and visual styles. At the same time, this system recommends design choices (e.g., map styles and color schemes). Users only need to browse the recommendation results to realize explorations and analyses of the dataset, which will greatly improve their work efficiency. We conduct a series of evaluation experiments on the proposed system, including a case study. The experiment results show that the system is practical and effective and can perform the task of recommending informative and esthetic geographical visualization results well.
... Our feature filtering technique has its root in the rank-by-feature framework [90], in our case applied to TSEQs. The incorporated metadata filtering technique was inspired by faceted search and browsing techniques [11,93,111], whereas our clustering-based filtering approach is inspired by exploratory search approaches [18]. In contrast, domain experts did not offer strong demand for event-based filtering [112]. ...
Article
Time-stamped event sequences (TSEQs) are time-oriented data without value information, shifting the focus of users to the exploration of temporal event occurrences. TSEQs exist in application domains, such as sleeping behavior, earthquake aftershocks, and stock market crashes. Domain experts face four challenges, for which they could use interactive and visual data analysis methods. First, TSEQs can be large with respect to both the number of sequences and events, often leading to millions of events. Second, domain experts need validated metrics and features to identify interesting patterns. Third, after identifying interesting patterns, domain experts contextualize the patterns to foster sensemaking. Finally, domain experts seek to reduce data complexity by data simplification and machine learning support. We present IVESA, a visual analytics approach for TSEQs. It supports the analysis of TSEQs at the granularities of sequences and events, supported with metrics and feature analysis tools. IVESA has multiple linked views that support overview, sort+filter, comparison, details-on-demand, and metadata relation-seeking tasks, as well as data simplification through feature analysis, interactive clustering, filtering, and motif detection and simplification. We evaluated IVESA with three case studies and a user study with six domain experts working with six different datasets and applications. Results demonstrate the usability and generalizability of IVESA across applications and cases that had up to 1,000,000 events.
... [Table residue. Visualization recommendations: rule-based: chart type [145,192]; statistical relevance: single visualization [147,212,228], multiple visualizations [53,138,218,238,239]; deep learning: single visualization [103], multiple visualizations [171,185], visualization sequences [60], visualizations tree [130]. Recommendations based on user preferences: rule-based: single visualization [84,85].] ... typically encompass the statistical distribution of data subsets [147], as well as statistically interesting views that display correlations, clusters, outliers, and anomalies [212,228]. Multiple tools have been developed to recommend visualizations based on the statistical properties of data, considering the interestingness of multiple variables and chart types, e.g., [53,138,218]. ...
... Multiple tools have been developed to recommend visualizations based on the statistical properties of data, considering the interestingness of multiple variables and chart types, e.g., [53,138,218]. For instance, Voyager, a system proposed by Wongsuphasawat et al. [238,239], is a frequently cited mixed-initiative system that facilitates faceted browsing of recommended charts. Voyager allows users to explore dataset dimensions and suitable visualizations, thereafter automatically recommending views related to the currently specified user chart based on statistical and perceptual measures. ...
Book
Comprehensive, accurately labeled sensor datasets are an essential prerequisite for training supervised machine learning models used for tasks such as quality control, predictive maintenance, and defect detection in the manufacturing industry. However, the provision of such datasets still poses two specific challenges: first, performing exploratory data analysis (EDA) to provide data scientists with the necessary knowledge in the domain context to label the data, and second, a lack of visual interactive labeling (VIAL) approaches to efficiently annotate large volumes of industrial sensor data with accurate labels. This dissertation proposes the innovative VIEDAL process, integrating guidance systems for both EDA and VIAL tasks. Drawing from real-world use cases, this thesis presents a detailed system design to support each task, addressing feasibility and usefulness through a comprehensive design study. The EDA guidance system records domain expert interactions to generate guided sessions for novices, while the VIAL guidance system incorporates unsupervised and active learning approaches to streamline dataset annotation. Through user studies, the effectiveness of the proposed systems is evaluated, demonstrating reproducibility of expert key insights through generated EDA sessions and faster creation of high-quality labeled datasets. Additionally, this work discusses approaches for transferring recorded EDA sessions and VIAL models between use cases to streamline future guidance system implementations. The results of this thesis provide a foundation for further research to expedite the creation of labeled sensor datasets, thereby facilitating faster development and integration of machine learning models for enhancing production processes in the manufacturing industry.
... Moreover, existing work guides the choice of which graph type to choose to maximize perceptual precision when reading values [5,9] or judging correlations [20], maximizing the discriminability of color palettes [59], or creating effective designs for prescribed lower-level perceptual tasks [48,55]. Much of this advice has also been formalized within rule-based recommender systems which provide more guidance, including APT [38], SAGE [53], Show Me within Tableau [39], Voyager [71], and Draco [45]. ...
Conference Paper
Existing data visualization design guidelines focus primarily on constructing grammatically-correct visualizations that faithfully convey the values and relationships in the underlying data. However, a designer may create a grammatically-correct visualization that still leaves audiences susceptible to reasoning misleaders, e.g. by failing to normalize data or using unrepresentative samples. Reasoning misleaders are especially pernicious when presenting public policy data, where data-driven decisions can affect public health, safety, and economic development. Through textual analysis, a formative evaluation, and iterative design with 19 policy communicators, we construct an actionable visualization design framework, V-FRAMER, that effectively synthesizes ways of mitigating reasoning misleaders. We discuss important design considerations for frameworks like V-FRAMER, including using concrete examples to help designers understand reasoning misleaders, and using a hierarchical structure to support example-based accessing. We further describe V-FRAMER’s congruence with current practice and how practitioners might integrate the framework into their existing workflows. Related materials available at: https://osf.io/q3uta/.
... Enabling people to browse and filter for suitable visualization techniques according to different criteria as suggested in Chapter 7 is only a first step. Visualization recommendation (see Kriglstein et al., 2014; Wongsuphasawat et al., 2016) and guidance approaches (see Ceneda et al., 2017; Ceneda et al., 2018) can offer additional support during the data analysis. ...
Chapter
This chapter briefly summarizes the content of the book and describes practical concerns of visualizing time-oriented data in real-world data settings. Visual analytics is briefly outlined as a modern approach that combines visualization, interaction, and computational analysis more tightly to facilitate data analysis activities better. Finally, research opportunities for future work are discussed.
... 4.2.1, we assume users are exposed to our heuristics only after experiencing the guidance to judge its effectiveness. For the study, we employed the open-source guidance system "Voyager" (see Fig. 5), a mixed-initiative system that enables users to perform a guided exploration of a dataset [52]. In terms of guidance, the system offers different kinds of recommendations to support the selection of appropriate visualizations to perform the analysis. ...
... Moreover, when the user selects a specific visualization, the system can recommend additional data dimensions to be explored as well as alternative encodings of the same data. Figure 5: Voyager [52], the guidance-enhanced system for our user evaluation. ...
... Figure 6: Results of the user evaluation of Voyager [52] (table columns: Controllable, Explainable, Expressive, Visible, Relevant, Flexible, Timely). For space reasons, we do not report the ratings but simply color-coded the table according to the ratings. ...
Article
Guidance can support users during the exploration and analysis of complex data. Previous research focused on characterizing the theoretical aspects of guidance in visual analytics and implementing guidance in different scenarios. However, the evaluation of guidance-enhanced visual analytics solutions remains an open research question. We tackle this question by introducing and validating a practical evaluation methodology for guidance in visual analytics. We identify eight quality criteria to be fulfilled and collect expert feedback on their validity. To facilitate actual evaluation studies, we derive two sets of heuristics. The first set targets heuristic evaluations conducted by expert evaluators. The second set facilitates end-user studies where participants actually use a guidance-enhanced system. By following such a dual approach, the different quality criteria of guidance can be examined from two different perspectives, enhancing the overall value of evaluation studies. To test the practical utility of our methodology, we employ it in two studies to gain insight into the quality of two guidance-enhanced visual analytics solutions, one being a work-in-progress research prototype, and the other being a publicly available visualization recommender system. Based on these two evaluations, we derive good practices for conducting evaluations of guidance in visual analytics and identify pitfalls to be avoided during such studies.
... In general, this automation helps alleviate the burden of specifying charts so that users can focus more on insights rather than how to produce a specific chart [15]. Some systems automate visual presentation and then rank charts according to metrics of interest such as high correlation [8], charts that satisfy a particular pattern in the data [45], or contain attributes of interest [50]. Closely related to our work is the Profiler system, which checks data for common quality issues such as missing data or outliers, and presents potentially interesting charts to the user [20]. ...
Article
Profiling data by plotting distributions and analyzing summary statistics is a critical step throughout data analysis. Currently, this process is manual and tedious since analysts must write extra code to examine their data after every transformation. This inefficiency may lead to data scientists profiling their data infrequently, rather than after each transformation, making it easy for them to miss important errors or insights. We propose continuous data profiling as a process that allows analysts to immediately see interactive visual summaries of their data throughout their data analysis to facilitate fast and thorough analysis. Our system, AutoProfiler, presents three ways to support continuous data profiling: (1) it automatically displays data distributions and summary statistics to facilitate data comprehension; (2) it is live, so visualizations are always accessible and update automatically as the data updates; (3) it supports follow up analysis and documentation by authoring code for the user in the notebook. In a user study with 16 participants, we evaluate two versions of our system that integrate different levels of automation: both automatically show data profiles and facilitate code authoring, however, one version updates reactively (“live”) and the other updates only on demand (“dead”). We find that both tools, dead or alive, facilitate insight discovery with 91% of user-generated insights originating from the tools rather than manual profiling code written by users. Participants found live updates intuitive and felt it helped them verify their transformations while those with on-demand profiles liked the ability to look at past visualizations. We also present a longitudinal case study on how AutoProfiler helped domain scientists find serendipitous insights about their data through automatic, live data profiles. Our results have implications for the design of future tools that offer automated data analysis support.
... The most well-known early work was from Mackinlay on his presentation tool APT [31], which viewed the design of graphical representations as fundamentally a search problem aiming to optimize effectiveness and expressiveness. Subsequent research in this space has informed the design of systems like Tableau [30], SAGE [38], Voyager [67], Draco [40], and numerous others in recent years. ...
Chapter
In this chapter I introduce the field of design cognition and its relevance to data visualization. I outline two historically dominant paradigms of design cognition. The first, promoted by Herbert Simon in the 1970s, is the rational problem solving paradigm which is based on information processing psychology and problem solving theory. The second, promoted by Donald Schön in the 1980s, is the reflective practice paradigm which is based on constructivist philosophy and situated views of cognition. I describe some of their strengths and weaknesses, and some attempts to reconcile their differences. Underlying philosophical issues pertaining to cognition and epistemology are briefly discussed. I then examine implications of these two paradigms for four data visualization topics: defining, automating, modeling, and teaching data visualization design. In discussing these topics, possible avenues of future research are proposed.
... However, there is growing research on how visualizations can be automatically created from data. Such visualization generation tools can be generally usable [62] or designed with specific purposes in mind [7,17,24]. ...
... The additional fact extraction makes such tools well-suited for initial data exploration. A noteworthy software implementing this is Voyager by Wongsuphasawat et al. [62]. Their interface presenting visualization recommendations is shown in Figure 2.3. ...
... Visualization generation tools can be structured in multiple ways. The structure used by Wongsuphasawat et al. [62] in Voyager uses a visualization browser for the user to interact with and select visualizations. Visualizations are created through the process of first a recommendation engine creating interesting visualization specifications, which are passed through to a compiler and subsequent renderer creating the visualizations from those specifications. ...
Thesis
The urgent need for better health communication is especially evident when considering the millions of deaths caused each year worldwide by lifestyle choices and behavioral risk factors. These deaths show that simply researching and understanding the risks caused by these factors is insufficient, as they must also be effectively communicated to the general public. Narrative visualization can help achieve this by exploring how data can be visualized and incorporated into an engaging story that will capture the interest of the public. In my thesis, I investigate how risk visualizations can be designed for a general audience, and subsequently, I develop a tool to assist story authors in creating data-based risk visualizations for their data stories. With the tool I developed, the story author can receive recommendations for risk factors based on their data set. Using methods from visualization and annotation generation, these recommendations are presented as suitable visualizations based on current research in risk communication. The visualizations adapt to the current intention of the story author, whether it is to explore the data set, convince the public to change their behavior, or educate the public about risk factors for a specific disease. Using the tool, the story author can select the most relevant risk factors, customize the visualizations and export them for integration into their data story. I evaluated my tool with domain experts, as well as the resulting visualizations with the general public. The results demonstrate that the tool is usable and that the visualizations are understandable and engaging for the general public. By combining the research fields of risk communication, visualization generation and narrative visualization, my work provides a novel approach to support domain experts in communicating risks and risk factors to the general public. With my approach, experts can create annotated, data-based risk visualizations without requiring expertise in risk communication, visualization design or narrative visualization.
... On the other hand, some automatic tools have been developed to reduce the technical threshold of visualization, including rule-based recommendation tools [9,10], ranking mechanic-based tools [11,10], machine learning-based tools [12,11], and mixed-initiative tools [13]. These tools are mainly designed for tabular data. ...
Article
Multidimensional hierarchical (mTree) data are very common in daily life and scientific research. However, mTree data exploration is a laborious and time-consuming process due to its structural complexity and large dimension combination space. To address this problem, we present mTreeIllustrator, a mixed-initiative framework for exploratory analysis of multidimensional hierarchical data with faceted visualizations. First, we propose a recommendation pipeline for the automatic selection and visual representation of important subspaces of mTree data. Furthermore, we design a visual framework and an interaction schema to couple automatic recommendations with human specifications to facilitate progressive exploratory analysis. Comparative experiments and user studies demonstrate the usability and effectiveness of our framework.