Conference Paper

VizQL: a language for query, analysis and visualization

Author: Pat Hanrahan

Abstract

Conventional query languages such as SQL and MDX have limited formatting and visualization capabilities. Thus, although powerful queries can be composed, another layer of software is needed to report or present the results in a useful form to the analyst. VizQL™ is designed to fill that gap. VizQL evolved from the Polaris system at Stanford, which combined query, analysis and visualization into a single framework [1]. VizQL is a formal language for describing tables, charts, graphs, maps, time series and tables of visualizations. These different types of visual representations are unified into one framework, making it easy to switch from one visual representation to another (e.g. from a list view to a cross-tab to a chart). Unlike current charting packages, and like query languages, VizQL permits an unlimited number of picture expressions. Visualizations can thus be easily customized and controlled. VizQL is a declarative language: the desired picture is described, and the low-level operations needed to retrieve the results, to perform analytical calculations, to map the results to a visual representation, and to render the image are generated automatically by the query analyzer. The query analyzer compiles VizQL expressions to SQL and MDX, so VizQL can be used with relational databases and datacubes. The current implementation supports Hyperion Essbase, Microsoft SQL Server, Microsoft Analysis Services, MySQL and Oracle, as well as desktop data sources such as CSV and Excel files. This analysis phase includes many optimizations that allow large databases to be browsed interactively. VizQL enables a new generation of visual analysis tools that closely couple query, analysis and visualization.
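
A note on the compilation step described above: VizQL's concrete syntax is proprietary and is not reproduced here, but the idea of a declarative picture description compiled into a database query can be pictured with a small, purely hypothetical Python sketch. The specification format, the field names, and the single GROUP BY query it emits are invented for illustration and are not VizQL.

```python
# Hypothetical sketch only: an invented declarative spec compiled to plain SQL.
# This is not VizQL syntax; it merely illustrates the declarative-to-query idea.

def compile_to_sql(spec: dict, table: str) -> str:
    """Turn a mark/encoding description into an aggregate SQL query."""
    dims = [spec["x"]] + ([spec["color"]] if "color" in spec else [])
    measure = spec["y"]  # an aggregate expression, e.g. "SUM(sales)"
    select_list = ", ".join(dims + [f"{measure} AS value"])
    group_by = ", ".join(dims)
    return f"SELECT {select_list} FROM {table} GROUP BY {group_by};"

spec = {"mark": "bar", "x": "region", "color": "category", "y": "SUM(sales)"}
print(compile_to_sql(spec, "orders"))
# SELECT region, category, SUM(sales) AS value FROM orders GROUP BY region, category;
```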

... In shelf builders, users map data columns to visual attributes, typically in a manner that is motivated by a principled visualization framework, such as VizQL [24] or the Grammar of Graphics [95]. Tableau [80] and Charticulator [67] are prominent examples of this paradigm. ...
... After pasting the code into the template body (via the Body tab of Fig. 1c), several automated suggestions are provided on how the data fields could be abstracted as template parameters. Clicking through the suggestions replaces the "age" and "people" data fields. She uses the settings popover (Fig. 6) to specify their allowed data roles; she could have also done so using the Params text box (Fig. 1c). ...
... As JSON-mediated visualization grammars continue to gain popularity, additional languages will inevitably emerge to solve problems unaddressed in prior efforts. Future languages could support more complex rendering schemes, focusing on particular domains such as geospatial analytics [106], 3D visual analytics, pivot tables (perhaps simplifying the language of VizQL [24]), or even on the chart recommendation language CompassQL [111]-which would enable task-specific variations of Voyager [98]. ...
Preprint
Interfaces for creating visualizations typically embrace one of several common forms: textual specification enables fine-grained control, shelf building facilitates rapid exploration, while chart choosing promotes immediacy and simplicity. Ideally these approaches could be unified to integrate the user- and usage-dependent benefits found in each modality, yet these forms remain distinct. We propose parameterized declarative templates, a simple abstraction mechanism over JSON-based visualization grammars, as a foundation for multimodal visualization editors. We demonstrate how templates can facilitate organization and reuse by factoring the more than 160 charts that constitute Vega-Lite's example gallery into approximately 40 templates. We exemplify the pliability of abstracting over charting grammars by implementing -- as a template -- the functionality of the shelf builder Polestar (a simulacrum of Tableau) and a set of templates that emulate the Google Sheets chart chooser. We show how templates support multimodal visualization editing by implementing a prototype and evaluating it through an approachability study.
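
The parameterized-template mechanism summarized above can be pictured with a short sketch: a Vega-Lite chart body whose data fields are abstracted into parameters, which are later filled in with concrete column names. The placeholder syntax ([xField], [yField]) and the fill helper below are invented for illustration and are not the paper's actual template language.

```python
import json

# A Vega-Lite bar-chart body with two hypothetical template parameters.
template = {
    "mark": "bar",
    "encoding": {
        "x": {"field": "[xField]", "type": "ordinal"},
        "y": {"field": "[yField]", "type": "quantitative"},
    },
}

def fill(template: dict, params: dict) -> dict:
    """Replace '[name]' placeholders with concrete field names."""
    text = json.dumps(template)
    for name, value in params.items():
        text = text.replace(f"[{name}]", value)
    return json.loads(text)

# Instantiating the template with the fields mentioned in the snippet above.
spec = fill(template, {"xField": "age", "yField": "people"})
print(json.dumps(spec, indent=2))
```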
... (1) Visualization Specifications Visualization specifications provide various ways that users can specify what they want. There have been a great many studies from both visualization [1,2,[14][15][16] and database community [3,[17][18][19] on visualization specifications. We include it in this survey for two reasons: ...
... High-level languages [2,3,16,18,36,[70][71][72][73] encapsulate the details of visualization construction, such as the mapping function, as well as some properties for marks such as canvas size, legend, and other properties. ...
... Echarts [16,73] is a recent development in declarative visualization languages designed to support quick visualization creation for non-programmers. VizQL [3] develops from the Polaris system [20]. [Figure caption: "Fig. 3 Example of low- and high-level visualization languages. The target visualization (➂) is a bar chart showing the passenger_num of different destinations."] ...
Article
Full-text available
Data visualization is crucial in today’s data-driven business world, which has been widely used for helping decision making that is closely related to major revenues of many industrial companies. However, due to the high demand of data processing w.r.t. the volume, velocity, and veracity of data, there is an emerging need for database experts to help for efficient and effective data visualization. In response to this demand, this article surveys techniques that make data visualization more efficient and effective. (1) Visualization specifications define how the users can specify their requirements for generating visualizations. (2) Efficient approaches for data visualization process the data and a given visualization specification, which then produce visualizations with the primary target to be efficient and scalable at an interactive speed. (3) Data visualization recommendation is to auto-complete an incomplete specification, or to discover more interesting visualizations based on a reference visualization.
... Hierarchical structures are integral to visualization algebras such as VizQL [5,14]. VizQL defines algebraic operators over data attributes to compose small-multiple displays, which are manifested as interactions that drag and drop attributes onto x- and y-axis "shelves". ...
... Hierarchies are used in multi-dimensional databases to drill-down or roll-up, and in visualization systems to zoom in or zoom out. Visualization formalisms like VizQL [5,14] rely on such hierarchies to define operators like nest. ...
Preprint
Comparison is a core task in visual analysis. Although there are numerous guidelines to help users design effective visualizations to aid known comparison tasks, there are few formalisms that define the semantics of comparison operations in a way that can serve as the basis for a grammar of comparison interactions. Recent work proposed a formalism called View Composition Algebra (VCA) that enables ad hoc comparisons between any combination of marks, trends, or charts in a visualization interface. However, VCA limits comparisons to visual representations of data that have an identical schema, or where the schemas form a strict subset relationship (e.g., comparing price per state with price, but not with price per county). In contrast, the majority of real-world data - temporal, geographical, organizational - are hierarchical. To bridge this gap, this paper presents an extension to VCA (called VCAH) that enables ad hoc comparisons between visualizations of hierarchical data. VCAH leverages known hierarchical relationships to enable ad hoc comparison of data at different hierarchical granularities. We illustrate applications to hierarchical and Tableau visualizations.
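
The citing snippets preceding this entry describe VizQL's algebraic operators over data attributes (such as cross and nest) that compose small-multiple pane layouts, and the hierarchies those operators rely on. The Python sketch below imitates only the flavor of such a table algebra on invented example rows; the operator definitions are simplified and are not taken from VizQL or VCAH.

```python
import itertools

# Invented example data with a state -> county hierarchy.
rows = [
    {"year": 2020, "state": "CA", "county": "Alameda"},
    {"year": 2020, "state": "WA", "county": "King"},
    {"year": 2021, "state": "CA", "county": "Alameda"},
]

def field(name):
    """A single-field operand: one header per distinct value."""
    return [((name, v),) for v in sorted({r[name] for r in rows})]

def cross(a, b):
    """Cross: every combination of headers, one pane per combination."""
    return [x + y for x, y in itertools.product(a, b)]

def nest(a, b):
    """Nest: like cross, but keep only combinations that occur in the data."""
    def occurs(header):
        return any(all(r.get(n) == v for n, v in header) for r in rows)
    return [h for h in cross(a, b) if occurs(h)]

print(cross(field("year"), field("state")))   # 2 years x 2 states = 4 panes
print(nest(field("state"), field("county")))  # only the observed state/county pairs
```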
... The optimization techniques used in these systems are often inaccessible to visualization developers at large, who are not necessarily experts in performance optimizations. Furthermore, current general-purpose data visualization tools [5,18,9] provide limited support for the developer to create visual exploration applications at scale. ...
... In a seminal work, Wilkinson introduces a grammar of graphics [24] and its implementation (VizML), forming the basis of the subsequent research on visualization specification. Drawing from Wilkinson's grammar of graphics, Polaris [22] (commercialized as Tableau) uses a table algebra, which later evolved to VizQL [9], the underlying representation of Tableau visualizations. Wickham introduces ggplot2 [23], a widely-popular package in the R statistical language, based on Wilkinson's grammar. ...
Conference Paper
Full-text available
Scalable interactive visual data exploration is crucial in many domains due to increasingly large datasets generated at rapid rates. Details-on-demand provides a useful interaction paradigm for exploring large datasets, where the user starts at an overview, finds regions of interest, zooms in to see detailed views, zooms out and then repeats. This paradigm is the primary user interaction mode of widely-used systems such as Google Maps, Aperture Tiles and ForeCache. These earlier systems, however, are highly customized with hardcoded visual representations and optimizations. A more general framework is needed to facilitate the development of visual data exploration systems at scale. In this paper, we present Kyrix, an end-to-end system for developing scalable details-on-demand data exploration applications. Kyrix provides the developer with a declarative model for easy specification of general visualizations. Behind the scenes, Kyrix utilizes a suite of performance optimization techniques to achieve a response time within 500 ms for various user interactions. We also report results from a performance study which shows that a novel dynamic fetching scheme adopted by Kyrix outperforms tile-based fetching used in traditional systems.
... Although the VA community (19 references) and the visualization community (12) can be seen as the target audience for this survey, around half of the 66 publications are not focused on VA or visualization. Application-driven publications (25) contribute the most to this (external) group. ...
... All research spin-offs have approached the emerging market with distinct skill-sets manifesting the diverse approaches in the commercial VA sector today. For example, Tableau's core architecture resides on VizQL [25], a declarative query definition language that translates user actions into database queries and the respective data responses back into graphical representations. Similarly, Spotfire's architecture builds on top of IVEE: An information visualization & exploration environment [24], a research prototype for the dynamic queries idea in which the database query process is translated into visual metaphors. ...
Article
Five years after the first state-of-the-art report on Commercial Visual Analytics Systems we present a reevaluation of the Big Data Analytics field. We build on the success of the 2012 survey, which was influential even beyond the boundaries of the InfoVis and Visual Analytics (VA) community. While the field has matured significantly since the original survey, we find that innovation and research-driven development are increasingly sacrificed to satisfy a wide range of user groups. We evaluate new product versions on established evaluation criteria, such as available features, performance, and usability, to extend on and assure comparability with the previous survey. We also investigate previously unavailable products to paint a more complete picture of the commercial VA landscape. Furthermore, we introduce novel measures, like suitability for specific user groups and the ability to handle complex data types, and undertake a new case study to highlight innovative features. We explore the achievements in the commercial sector in addressing VA challenges and propose novel developments that should be on systems' roadmaps in the coming years.
... Tableau [15] supports data visualization in a comprehensive manner, where the user can connect to different data sources to select the data to be analyzed and the tool suggests multiple options for interpreting the data. Along with VizQL [5], a specification language, and ShowMe [7], which builds over [5] to automatically present data as small sets of multiple views, Tableau provides an exhaustive set of options focusing on the user experience. VizRec [18], Voyager [20], [6], SeeDB [19] are other works that explore various aspects of visualization. ...
Conference Paper
Selecting the appropriate visual presentation of the data such that it not only preserves the semantics but also provides an intuitive summary of the data is an important, often the final step of data analytics. Unfortunately, this is also a step involving significant human effort starting from selection of groups of columns in the structured results from analytics stages, to the selection of right visualization by experimenting with various alternatives. In this paper, we describe our DataVizard system aimed at reducing this overhead by automatically recommending the most appropriate visual presentation for the structured result. Specifically, we consider the following two scenarios: first, when one needs to visualize the results of a structured query such as SQL; and the second, when one has acquired a data table with an associated short description (e.g., tables from the Web). Using a corpus of real-world database queries (and their results) and a number of statistical tables crawled from the Web, we show that DataVizard is capable of recommending visual presentations with high accuracy.
... On hovering over different chart options, the system also recommends what needs to be modified in the data to be able to see these types of charts. A key aspect of Tableau is VizQL [7], a specification language that describes the structure of a view and the queries used to populate that structure. ShowMe [9] builds over [7] to automatically present data as small sets of multiple views, focusing on the user experience. ...
... A key aspect of Tableau is VizQL [7], a specification language that describes the structure of a view and the queries used to populate that structure. ShowMe [9] builds over [7] to automatically present data as small sets of multiple views, focusing on the user experience. VizRec [19] describes the authors' vision of what visualization recommender systems should have, for identifying and interactively recommending visualizations for a task. ...
Article
Selecting the appropriate visual presentation of the data such that it preserves the semantics of the underlying data and at the same time provides an intuitive summary of the data is an important, often the final step of data analytics. Unfortunately, this is also a step involving significant human effort starting from selection of groups of columns in the structured results from analytics stages, to the selection of right visualization by experimenting with various alternatives. In this paper, we describe our \emph{DataVizard} system aimed at reducing this overhead by automatically recommending the most appropriate visual presentation for the structured result. Specifically, we consider the following two scenarios: first, when one needs to visualize the results of a structured query such as SQL; and the second, when one has acquired a data table with an associated short description (e.g., tables from the Web). Using a corpus of real-world database queries (and their results) and a number of statistical tables crawled from the Web, we show that DataVizard is capable of recommending visual presentations with high accuracy. We also present the results of a user survey that we conducted in order to assess user views of the suitability of the presented charts vis-a-vis the plain text captions of the data.
... Interactive methods for querying databases, such as Polaris and later VizQL (Tableau) [28,29], offer platforms for authoring interactive charts and dashboards through drag-and-drop interfaces. These systems have provided significant value to business analytics with their ease of use and suitability for many common tasks, but they are restrictive in terms of their proprietary nature, limited expressivity, and lack of support for graph-based data sources. ...
Article
Full-text available
Graph databases capture richly linked domain knowledge by integrating heterogeneous data and metadata into a unified representation. Here, we present the use of bespoke, interactive data graphics (bar charts, scatter plots, etc.) for visual exploration of a knowledge graph. By modeling a chart as a set of metadata that describes semantic context (SPARQL query) separately from visual context (Vega-Lite specification), we leverage the high-level, declarative nature of the SPARQL and Vega-Lite grammars to concisely specify web-based, interactive data graphics synchronized to a knowledge graph. Resources with dereferenceable URIs (uniform resource identifiers) can employ the hyperlink encoding channel or image marks in Vega-Lite to amplify the information content of a given data graphic, and published charts populate a browsable gallery of the database. We discuss design considerations that arise in relation to portability, persistence, and performance. Altogether, this pairing of SPARQL and Vega-Lite—demonstrated here in the domain of polymer nanocomposite materials science—offers an extensible approach to FAIR (findable, accessible, interoperable, reusable) scientific data visualization within a knowledge graph framework.
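
The pairing described above (a SPARQL query for semantic context plus a Vega-Lite specification for visual context) can be sketched as follows. The endpoint vocabulary, predicate, field names, and result bindings are hypothetical; only the overall pattern of feeding query results into the spec's data values follows the article.

```python
import json

# Hypothetical semantic context: a SPARQL query over an example vocabulary.
sparql_query = """
PREFIX ex: <http://example.org/ns#>
SELECT ?sample ?modulus WHERE {
  ?sample ex:hasModulus ?modulus .
}
"""

# Visual context: a Vega-Lite spec whose data is filled from the query results.
vega_lite_spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "mark": "point",
    "encoding": {
        "x": {"field": "sample", "type": "nominal"},
        "y": {"field": "modulus", "type": "quantitative"},
    },
    "data": {"values": []},  # populated below
}

# Stand-in for bindings returned by a SPARQL endpoint.
bindings = [{"sample": "PNC-1", "modulus": 2.1}, {"sample": "PNC-2", "modulus": 3.4}]
vega_lite_spec["data"]["values"] = bindings
print(json.dumps(vega_lite_spec, indent=2))
```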
... Others have collected visualizations and figures from scientific papers [SHL*16, CZL*20, DWS*20, CLL*21]. Moreover, we may consider converting existing TableQA datasets [PL15, ZXS17, IYC17] into ChartQA datasets by translating their queries to SQL commands [YZY*18] and plotting the relevant portions of the data tables into chart images using SQL2Visualization methods [Han06, LTL*21a]. With respect to user studies, future works need to focus on field trials and longitudinal studies where the participants can ask their own questions with their own datasets. ...
Preprint
Full-text available
Information visualizations such as bar charts and line charts are very common for analyzing data and discovering critical insights. Often people analyze charts to answer questions that they have in mind. Answering such questions can be challenging as they often require a significant amount of perceptual and cognitive effort. Chart Question Answering (CQA) systems typically take a chart and a natural language question as input and automatically generate the answer to facilitate visual data analysis. Over the last few years, there has been a growing body of literature on the task of CQA. In this survey, we systematically review the current state-of-the-art research focusing on the problem of chart question answering. We provide a taxonomy by identifying several important dimensions of the problem domain including possible inputs and outputs of the task and discuss the advantages and limitations of proposed solutions. We then summarize various evaluation techniques used in the surveyed papers. Finally, we outline the open challenges and future research opportunities related to chart question answering.
... In this context, the first efforts focused on developing visual querying languages for DBs such as [40][41][42][43][44]. Although they share some similar concepts, most of them address the need to offer the database analyst a visual way for syntactically expressing a query, rather than offering visual operations for interactive data exploration. In most interactive visualization systems, visual user operations (e.g., map panning) are used for specifying the actual query logic, and several visualization languages have been proposed to simplify the generation of such visualizations [45][46][47][48]. ...
Article
[See also http://www.cs.uoi.gr/~pvassil/projects/ploigia/info.html] Data exploration and visual analytics systems are of great importance in Open Science scenarios, where less tech-savvy researchers wish to access and visually explore big raw data files (e.g., json, csv) generated by scientific experiments using commodity hardware and without being overwhelmed in the tedious processes of data loading, indexing and query optimization. In this paper, we present our work for enabling efficient query processing on large raw data files for interactive visual exploration scenarios and analytics. We introduce a framework, named RawVis, built on top of a lightweight in-memory tile-based index, VALINOR, that is constructed on-the-fly given the first user query over a raw file and progressively adapted based on the user interaction. We evaluate the performance of a prototype implementation compared to three other alternatives and show that our method outperforms in terms of response time, disk accesses and memory consumption. Particularly during an exploration scenario, the proposed method in most cases is about 5-10× faster compared to existing solutions, and requires significantly less memory resources. Keywords: Visual Analytics, Progressive & Adaptive Indexes, User-driven Incremental Processing, Interactive Indexing, RawVis, In-situ Query Processing, Big Data Visualization
... Our visualization language L V is shown in Figure 8, which formalizes core constructs in Vega-Lite [Satyanarayan et al. 2017], the ggplot2 visualization library for R and VizQL [Hanrahan 2006] from Tableau. This formalization enables concise descriptions of visualizations by encoding data as properties of graphical marks. ...
Article
Full-text available
While visualizations play a crucial role in gaining insights from data, generating useful visualizations from a complex dataset is far from an easy task. In particular, besides understanding the functionality provided by existing visualization libraries, generating the desired visualization also requires reshaping and aggregating the underlying data as well as composing different visual elements to achieve the intended visual narrative. This paper aims to simplify visualization tasks by automatically synthesizing the required program from simple visual sketches provided by the user. Specifically, given an input data set and a visual sketch that demonstrates how to visualize a very small subset of this data, our technique automatically generates a program that can be used to visualize the entire data set. From a program synthesis perspective, automating visualization tasks poses several challenges that are not addressed by prior techniques. First, because many visualization tasks require data wrangling in addition to generating plots from a given table, we need to decompose the end-to-end synthesis task into two separate sub-problems. Second, because the intermediate specification that results from the decomposition is necessarily imprecise, this makes the data wrangling task particularly challenging in our context. In this paper, we address these problems by developing a new compositional visualization-by-example technique that (a) decomposes the end-to-end task into two different synthesis problems over different DSLs and (b) leverages bi-directional program analysis to deal with the complexity that arises from having an imprecise intermediate specification. We have implemented our visualization-by-example approach in a tool called Viser and evaluate it on over 80 visualization tasks collected from on-line forums and tutorials. Viser can solve 84 of these benchmarks within a 600 second time limit, and, for those tasks that can be solved, the desired visualization is among the top-5 generated by Viser in 70% of the cases.
... Our visualization language L V is shown in Figure 8, which formalizes core constructs in Vega-Lite [Satyanarayan et al. 2017], the ggplot2 visualization library for R and VizQL [Hanrahan 2006] from Tableau. This formalization enables concise descriptions of visualizations by encoding data as properties of graphical marks. ...
Preprint
Full-text available
While visualizations play a crucial role in gaining insights from data, generating useful visualizations from a complex dataset is far from an easy task. Besides understanding the functionality provided by existing visualization libraries, generating the desired visualization also requires reshaping and aggregating the underlying data as well as composing different visual elements to achieve the intended visual narrative. This paper aims to simplify visualization tasks by automatically synthesizing the required program from simple visual sketches provided by the user. Specifically, given an input data set and a visual sketch that demonstrates how to visualize a very small subset of this data, our technique automatically generates a program that can be used to visualize the entire data set. Automating visualization poses several challenges. First, because many visualization tasks require data wrangling in addition to generating plots, we need to decompose the end-to-end synthesis task into two separate sub-problems. Second, because the intermediate specification that results from the decomposition is necessarily imprecise, this makes the data wrangling task particularly challenging in our context. In this paper, we address these problems by developing a new compositional visualization-by-example technique that (a) decomposes the end-to-end task into two different synthesis problems over different DSLs and (b) leverages bi-directional program analysis to deal with the complexity that arises from having an imprecise intermediate specification. We implemented our visualization-by-example algorithm and evaluate it on 83 visualization tasks collected from on-line forums and tutorials. Viser can solve 84% of these benchmarks within a 600 second time limit, and, for those tasks that can be solved, the desired visualization is among the top-5 generated by Viser in 70% of the cases.
... In 2003, Hanrahan revised Mackinlay's specifications into a declarative visual language known as VizQL [13]. It is a formal language for describing tables, charts, graphs, maps, time series and tables of visualizations. ...
Preprint
Full-text available
Choosing a suitable visualization for data is a difficult task. Current data visualization recommender systems exist to aid in choosing a visualization, yet suffer from issues such as low accessibility and indecisiveness. In this study, we first define a step-by-step guide on how to build a data visualization recommender system. We then use this guide to create a model for a data visualization recommender system for non-experts that aims to resolve the issues of current solutions. The result is a question-based model that uses a decision tree and a data visualization classification hierarchy in order to recommend a visualization. Furthermore, it incorporates both task-driven and data characteristics-driven perspectives, whereas existing solutions seem to either convolute these or focus on one of the two exclusively. Based on testing against existing solutions, it is shown that the new model reaches similar results while being simpler, clearer, more versatile, extendable and transparent. The presented guide can be used as a manual for anyone building a data visualization recommender system. The resulting model can be applied in the development of new data visualization software or as part of a learning tool.
... In 2003, Hanrahan revised Mackinlay's specifications into a declarative visual language known as VizQL [13]. It is a formal language for describing tables, charts, graphs, maps, time series and tables of visualizations. ...
Chapter
Full-text available
Choosing a suitable visualization for data is a difficult task. Current data visualization recommender systems exist to aid in choosing a visualization, yet suffer from issues such as low accessibility and indecisiveness. In this study, we first define a step-by-step guide on how to build a data visualization recommender system. We then use this guide to create a model for a data visualization recommender system for non-experts that aims to resolve the issues of current solutions. The result is a question-based model that uses a decision tree and a data visualization classification hierarchy in order to recommend a visualization. Furthermore, it incorporates both task-driven and data characteristics-driven perspectives, whereas existing solutions seem to either convolute these or focus on one of the two exclusively. Based on testing against existing solutions, it is shown that the new model reaches similar results while being simpler, clearer, more versatile, extendable and transparent. The presented guide can be used as a manual for anyone building a data visualization recommender system. The resulting model can be applied in the development of new data visualization software or as part of a learning tool.
... Wilkinson introduces a grammar of graphics [Wil99] and its implementation (VizML), forming the basis of the subsequent research on visualization specification. Drawing from Wilkinson's grammar of graphics, Polaris [STH02] (commercialized as Tableau) uses a table algebra, which later evolved to VizQL [Han06], the underlying representation of Tableau visualizations. Wickham introduces ggplot2 [Wic10], a widely-adopted package in the R statistical language, based on Wilkinson's grammar. ...
Article
Full-text available
Pan and zoom are basic yet powerful interaction techniques for exploring large datasets. However, existing zoomable UI toolkits such as Pad++ and ZVTM do not provide the backend database support and data‐driven primitives that are necessary for creating large‐scale visualizations. This limitation in existing general‐purpose toolkits has led to many purpose‐built solutions (e.g. Google Maps and ForeCache) that address the issue of scalability but cannot be easily extended to support visualizations beyond their intended data types and usage scenarios. In this paper, we introduce Kyrix to ease the process of creating general and large‐scale web‐based pan/zoom visualizations. Kyrix is an integrated system that provides the developer with a concise and expressive declarative language along with a backend support for performance optimization of large‐scale data. To evaluate the scalability of Kyrix, we conducted a set of benchmarked experiments and show that Kyrix can support high interactivity (with an average latency of 100 ms or below) on pan/zoom visualizations of 100 million data points. We further demonstrate the accessibility of Kyrix through an observational study with 8 developers. Results indicate that developers can quickly learn Kyrix's underlying declarative model to create scalable pan/zoom visualizations. Finally, we provide a gallery of visualizations and show that Kyrix is expressive and flexible in that it can support the developer in creating a wide range of customized visualizations across different application domains and data types.
... High-level languages [25,26,35,61,72,73,75,79] further abstract the details of low-level visualization construction. They provide concise specification interfaces that are easier for new users to learn and use. ...
Conference Paper
The problem of data visualization is to transform data into a visual context such that people can easily understand the significance of data. Nowadays, data visualization is becoming especially important, because it is the de facto standard for modern business intelligence and successful data science. This tutorial will cover three specific topics: visualization languages define how the users can interact with various visualization systems; efficient data visualization processes the data and produces visualizations based on well-specified user queries; smart data visualization recommends data visualizations based on underspecified user queries. In this tutorial, we will go logically through this prior art, paying particular attention to problems that may attract interest from the database community.
... In a seminal work, Wilkinson introduces a grammar of graphics [22] and its implementation (VizML), forming the basis of the subsequent research on visualization specification. Drawing from Wilkinson's grammar of graphics, Polaris [20] (commercialized as Tableau) uses a table algebra, which later evolved to VizQL [8], the underlying representation of Tableau visualizations. Wickham introduces ggplot2 [21], a widely-popular package in the R statistical language, based on Wilkinson's grammar. ...
Preprint
Full-text available
Scalable interactive visual data exploration is crucial in many domains due to increasingly large datasets generated at rapid rates. Details-on-demand provides a useful interaction paradigm for exploring large datasets, where users start at an overview, find regions of interest, zoom in to see detailed views, zoom out and then repeat. This paradigm is the primary user interaction mode of widely-used systems such as Google Maps, Aperture Tiles and ForeCache. These earlier systems, however, are highly customized with hardcoded visual representations and optimizations. A more general framework is needed to facilitate the development of visual data exploration systems at scale. In this paper, we present Kyrix, an end-to-end system for developing scalable details-on-demand data exploration applications. Kyrix provides developers with a declarative model for easy specification of general visualizations. Behind the scenes, Kyrix utilizes a suite of performance optimization techniques to achieve a response time within 500ms for various user interactions. We also report results from a performance study which shows that a novel dynamic fetching scheme adopted by Kyrix outperforms tile-based fetching used in earlier systems.
... Wilkinson introduces a grammar of graphics [Wil99] and its implementation (VizML), forming the basis of the subsequent research on visualization specification. Drawing from Wilkinson's grammar of graphics, Polaris [STH02] (commercialized as Tableau) uses a table algebra, which later evolved to VizQL [Han06], the underlying representation of Tableau visualizations. Wickham introduces ggplot2 [Wic10], a widely-adopted package in the R statistical language, based on Wilkinson's grammar. ...
Conference Paper
Full-text available
Pan and zoom are basic yet powerful interaction techniques for exploring large datasets. However, existing zoomable UI toolkits such as Pad++ and ZVTM do not provide the backend database support and data-driven primitives that are necessary for creating large-scale visualizations. This limitation in existing general-purpose toolkits has led to many purpose-built solutions (e.g. Google Maps and ForeCache) that address the issue of scalability but cannot be easily extended to support visualizations beyond their intended data types and usage scenarios. In this paper, we introduce Kyrix to ease the process of creating general and large-scale web-based pan/zoom visualizations. Kyrix is an integrated system that provides the developer with a concise and expressive declarative language along with a backend support for performance optimization of large-scale data. To evaluate the scalability of Kyrix, we conducted a set of benchmarked experiments and show that Kyrix can support high interactivity (with an average latency of 100 ms or below) on pan/zoom visualizations of 100 million data points. We further demonstrate the accessibility of Kyrix through an observational study with 8 developers. Results indicate that developers can quickly learn Kyrix’s underlying declarative model to create scalable pan/zoom visualizations. Finally, we provide a gallery of visualizations and show that Kyrix is expressive and flexible in that it can support the developer in creating a wide range of customized visualizations across different application domains and data types.
... Facilitating data discovery and the analytic communication process, Tableau is designed to help users produce explanatory graphics and dynamic and interactive visualizations (Jones, 2014). Although Tableau essentially deploys VizQL, a programming language for describing tables, charts, graphs, maps, time series, and tables of visualizations (Hanrahan, 2006), the Tableau interface is user-friendly, such that simple clicks can accomplish complex tasks. Tableau has great interactive functions and a mobile-friendly interface, but the lack of the capability to drag and move the map via a mouse is not consistent with the typical user experience. ...
Article
Full-text available
This article examines whether implementing visualizations on an institutional repository webpage increases traffic on the site. Two methods for creating visualizations to attract faculty and student interest were employed. The first is a map displaying usage of institutional repository content from around the world. This map uses Tableau software to display Google Analytics data. The second method is a text mining tool allowing users to generate word clouds from dissertation and thesis abstracts according to discipline and year of publication. The word cloud uses R programing language, the Shiny software package, and a text mining package called tm. Change in the number of institutional repository website sessions was analyzed through change-point analysis.
... Vega (https://vega.github.io/vega/) is a visualization grammar in a JSON format. VizQL [24], used by Tableau, is a visual query language that translates drag-and-drop actions into data queries and then expresses data visually. ...
... Automatic visualization tools synthesize graphical designs, which are abstract descriptions of visualizations (Fig. 1). For example, the underlying language for APT describes graphical techniques (e.g., color variation and position on axis) to encode information, whereas ShowMe [38] synthesizes encodings using VizQL [23]. Following CompassQL [65], Draco uses a logical representation of the Vega-Lite grammar [52]. ...
Article
Full-text available
There exists a gap between visualization design guidelines and their application in visualization tools. While empirical studies can provide design guidance, we lack a formal framework for representing design knowledge, integrating results across studies, and applying this knowledge in automated design tools that promote effective encodings and facilitate visual exploration. We propose modeling visualization design knowledge as a collection of constraints, in conjunction with a method to learn weights for soft constraints from experimental data. Using constraints, we can take theoretical design knowledge and express it in a concrete, extensible, and testable form: the resulting models can recommend visualization designs and can easily be augmented with additional constraints or updated weights. We implement our approach in Draco, a constraint-based system based on Answer Set Programming (ASP). We demonstrate how to construct increasingly sophisticated automated visualization design systems, including systems based on weights learned directly from the results of graphical perception experiments.
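
Draco, as summarized above, expresses design knowledge as hard and soft constraints and learns weights for the soft ones, ranking candidate designs by total penalty. Draco itself is written in Answer Set Programming, which is not reproduced here; the Python sketch below only illustrates the weighted-penalty ranking idea, with constraints and weights invented for the example.

```python
# Illustrative only: invented soft constraints and weights, not Draco's actual rules.
soft_constraints = {
    # name: (violation test over a candidate encoding, weight / penalty)
    "aggregate_on_non_quantitative": (
        lambda e: e["aggregate"] and e["field_type"] != "quantitative", 3.0),
    "bar_with_continuous_x": (
        lambda e: e["mark"] == "bar" and e["x_type"] == "quantitative", 1.5),
}

def penalty(encoding: dict) -> float:
    """Sum the weights of all soft constraints the candidate violates."""
    return sum(w for check, w in soft_constraints.values() if check(encoding))

candidates = [
    {"mark": "bar", "x_type": "nominal", "field_type": "quantitative", "aggregate": True},
    {"mark": "bar", "x_type": "quantitative", "field_type": "nominal", "aggregate": True},
]
ranked = sorted(candidates, key=penalty)   # lowest total penalty is recommended first
print([penalty(c) for c in candidates], ranked[0])
```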
... In 2003, Hanrahan revised Mackinlay's specifications into a declarative visual language known as VizQL (Hanrahan, 2006). It is a formal language for describing tables, charts, graphs, maps and time series. ...
Conference Paper
Full-text available
In today’s age, there are huge amounts of data being generated every second of every day. Through data visualization, humans can explore, analyse and present it. Choosing a suitable visualization for data is a difficult task, especially for non-experts. Current data visualization recommender systems exist to aid in choosing a visualization, yet suffer from issues such as low accessibility and indecisiveness. The aim of this study is to create a model for a data visualization recommender system for non-experts that resolves these issues. Based on existing work and a survey among data scientists, requirements for a new model were identified and implemented. The result is a question-based model that uses a decision tree and a data visualization classification hierarchy in order to recommend a visualization. Furthermore, it incorporates both task-driven and data characteristics-driven perspectives, whereas existing solutions seem to either convolute these or focus on one of the two exclusively. Based on testing against existing solutions, it is shown that the new model reaches similar results while being simpler, clearer, more versatile, extendable and transparent. The presented model can be applied in the development of new data visualization software or as part of a learning tool.
... Polaris [48] (now called Tableau) uses a table algebra drawn from Wilkinson's grammar of graphics. The table algebra of Polaris later evolved to VizQL [22], forming the underlying representation of Tableau visualizations. Wickham introduces ggplot2 [54], a widely-popular package in the R statistical language, based on Wilkinson's grammar. ...
Article
Full-text available
Rapidly creating effective visualizations using expressive grammars is challenging for users who have limited time and limited skills in statistics and data visualization. Even high-level, dedicated visualization tools often require users to manually select among data attributes, decide which transformations to apply, and specify mappings between visual encoding variables and raw or transformed attributes. In this paper, we introduce Data2Vis, a neural translation model, for automatically generating visualizations from given datasets. We formulate visualization generation as a sequence to sequence translation problem where data specification is mapped to a visualization specification in a declarative language (Vega-Lite). To this end, we train a multilayered Long Short-Term Memory (LSTM) model with attention on a corpus of visualization specifications. Qualitative results show that our model learns the vocabulary and syntax for a valid visualization specification, appropriate transformations (count, bins, mean) and how to use common data selection patterns that occur within data visualizations. Our model generates visualizations that are comparable to manually-created visualizations in a fraction of the time, with potential to learn more complex visualization strategies at scale.
... Vega (https://vega.github.io/vega/) is a visualization grammar in a JSON format. VizQL [24], used by Tableau, is a visual query language that translates drag-and-drop actions into data queries and then expresses data visually. ...
Article
Data visualization transforms data into images to aid the understanding of data; therefore, it is an invaluable tool for explaining the significance of data to visually inclined people. Given a (big) dataset, the essential task of visualization is to visualize the data to tell compelling stories by selecting, filtering, and transforming the data, and picking the right visualization type such as bar charts or line charts. Our ultimate goal is to automate this task that currently requires heavy user intervention in the existing visualization systems. An evolutionized system in the field faces the following three main challenges: (1) Visualization verification: to determine whether a visualization for a given dataset is interesting, from the viewpoint of human understanding; (2) Visualization search space: a "boring" dataset may become interesting after an arbitrary combination of operations such as selections, joins, and aggregations, among others; (3) On-time responses: do not deplete the user's patience. In this paper, we present the DeepEye system to address these challenges. This system solves the first challenge by training a binary classifier to decide whether a particular visualization is good for a given dataset, and by using a supervised learning to rank model to rank the above good visualizations. It also considers popular visualization operations, such as grouping and binning, which can manipulate the data, and this will determine the search space. Our proposed system tackles the third challenge by incorporating database optimization techniques for sharing computations and pruning.
... Exploratory Interfaces steer data scientists through the data space by providing both insights and further queries: Recent approaches discover relevant data objects based on relevance-feedback [28] or by performing a variation of faceted search [29]; Query recommendation systems help data scientists ask relevant questions based on the data set and their past interests [8,70,83]. Visual Analytics reduce the cognitive effort of data exploration by augmenting data systems with visual and gestural interfaces: Various approaches enable data scientists to visually browse data sets [73,74,78]; Recommendation systems automatically select an appropriate visualization given a data set [52]; DbTouch [55] and GestureDB [63] develop database kernels and languages that can be controlled by fingertips; recent efforts also work toward novel visualization languages [43]. Approximate query processing provides estimated answers to exploratory queries in orders of magnitude less time, by touching a fraction of base data. ...
Conference Paper
During exploratory statistical analysis, data scientists repeatedly compute statistics on data sets to infer knowledge. Moreover, statistics form the building blocks of core machine learning classification and filtering algorithms. Modern data systems, software libraries, and domain-specific tools provide support to compute statistics but lack a cohesive framework for storing, organizing, and reusing them. This creates a significant problem for exploratory statistical analysis as data grows: Despite existing overlap in exploratory workloads (which are repetitive in nature), statistics are always computed from scratch. This leads to repeated data movement and recomputation, hindering interactive data exploration. We address this challenge in Data Canopy, where descriptive and dependence statistics are synthesized from a library of basic aggregates. These basic aggregates are stored within an in-memory data structure, and are reused for overlapping data parts and for various statistical measures. What this means for exploratory statistical analysis is that repeated requests to compute different statistics do not trigger a full pass over the data. We discuss in detail the basic design elements in Data Canopy, which address multiple challenges: (1) How to decompose statistics into basic aggregates for maximal reuse? (2) How to represent, store, maintain, and access these basic aggregates? (3) Under different scenarios, which basic aggregates to maintain? (4) How to tune Data Canopy in a hardware conscious way for maximum performance and how to maintain good performance as data grows and memory pressure increases? We demonstrate experimentally that Data Canopy results in an average speed-up of at least 10x after just 100 exploratory queries when compared with state-of-the-art systems used for exploratory statistical analysis.
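
The Data Canopy abstract above centers on reusing cached basic aggregates instead of rescanning the data for every statistic. A minimal sketch of that idea follows, assuming only that count, sum, and sum of squares are cached per column; it is not the actual Data Canopy implementation or its chunked in-memory layout.

```python
# Minimal sketch: cache basic aggregates once, derive statistics without rescanning.
class BasicAggregates:
    def __init__(self, values):
        # A single pass over the data computes the reusable building blocks.
        self.n = len(values)
        self.total = sum(values)                    # sum
        self.total_sq = sum(v * v for v in values)  # sum of squares

    def mean(self):
        return self.total / self.n

    def variance(self):
        # E[x^2] - (E[x])^2, derived purely from the cached aggregates.
        return self.total_sq / self.n - (self.total / self.n) ** 2

column = [4.0, 7.0, 13.0, 16.0]
agg = BasicAggregates(column)          # one scan
print(agg.mean(), agg.variance())      # later requests reuse the cached aggregates
```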
... Later, the specifications based on Mackinlay's heuristics were used to develop a research system called Polaris (Stolte et al., 2002). These specifications were then revised into a formal declarative visual language known as VizQL (Hanrahan, 2006). The visualization software Tableau (https://public.tableau.com/s/) ...
Conference Paper
Full-text available
Choosing the best visualization of a given dataset becomes more and more complex as not only the amount of data, but also the number of visualization types and the number of potential uses of visualizations grow tremendously. This challenge has spurred on the research into visualization recommendation systems. The ultimate aim of such a system is the suggestion of visualizations which provide interesting insights into the data. It should ideally consider data characteristics, domain knowledge and individual preferences to produce aesthetically appealing and easy to understand charts. Based on the mentioned factors, we have reviewed in this paper the state-of-the-art in visualization recommendation systems starting from the earliest attempt made on this subject. We identify challenges to visualization and visualization recommendation to guide future research directions.
... The key feature of Tableau technology is the new language for data exploration. VizQL is a formal language for describing tables, charts, graphs, maps, time series and tables of visualizations (Hanrahan 2006). ...
Chapter
Nowadays there are different technologies enabling visualisation of spatial data. The combination of two different systems may enhance the visualisations and therefore better communication of the results to decision makers and the wider public. The aim of our contribution is to assess the possibility of combining the functionality of Geographic Information Systems (GIS) and Business Intelligence (BI) systems for spatial data visualisation. We assess the analytical and visualisation features of combined ESRI ArcGIS and BI Tableau systems with the use of the visual data exploration approach. For the purpose of this study, Geographic Information System is used as a data manager and a data blender. The geoprocessed feature class was stored in the personal geodatabase and then loaded into the Tableau environment. We present the selected functionality of visual data discovery on the example of land change flows in the Czech Republic, Poland and Slovakia. In order to highlight the possibility to conduct analyses on different spatial levels, we ran the simulation at the local level and aggregated it to the regional level. The use of computational capabilities of GIS and BI enhance the geovisualisation on the map by quantitative analysis of tabular data, facilitate the visualisation of the results, and improve communication.
... In visualization development, van Wijk [13] addressed the interaction requirements of building visualizations by designing a language that can describe transitions and navigation within a visualization, in order to support interaction; the language focuses on first building, at the algorithmic level, mathematical models of parts of the interaction process and then giving the corresponding descriptions. VisQL [14] evolved from the Polaris [4] system; it integrates data querying, analysis, and visualization, and supports the description of many different visualization types, such as tables, graphics, maps, and temporal relations. VisQL has kept evolving and supports the design and implementation of the Tableau system, but the language and its evolved versions have not been made public, so its design details cannot be examined and properties such as its extensibility cannot be evaluated. Drawing on experience from developing related visualization tools [2,3,15] and systems [5], and combining this with declarative programming ideas, Heer presented declarative language designs for visualization [16,17] and proposed Vega, a declarative language for designing visualizations. Vega abstracts a visualization thoroughly, dividing it into major components such as data, data transforms, scales, guides, and marks, and defines Vega's JSON structure. Vega and its related systems and tools are a valuable attempt in visualization application development, but Vega's design mainly considers the abstraction of a single visualization itself; challenges remain for interaction among multiple visualizations and for extensions built on Vega. ...
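
For readers unfamiliar with the Vega components listed in the passage above (data, data transforms, scales, guides, marks), a heavily abbreviated bar-chart specification in that JSON structure is sketched below as a Python dict. The dataset and field names are invented, and a real, renderable Vega spec would need additional properties (schema, sizing, and so on).

```python
# Abbreviated illustration of Vega's top-level structure; not a complete spec.
vega_spec = {
    "data": [{"name": "table",
              "values": [{"category": "A", "amount": 28},
                         {"category": "B", "amount": 55}]}],
    "scales": [{"name": "x", "type": "band", "range": "width",
                "domain": {"data": "table", "field": "category"}},
               {"name": "y", "type": "linear", "range": "height",
                "domain": {"data": "table", "field": "amount"}}],
    "axes": [{"orient": "bottom", "scale": "x"},   # guides
             {"orient": "left", "scale": "y"}],
    "marks": [{"type": "rect", "from": {"data": "table"},
               "encode": {"enter": {
                   "x": {"scale": "x", "field": "category"},
                   "width": {"scale": "x", "band": 1},
                   "y": {"scale": "y", "field": "amount"},
                   "y2": {"scale": "y", "value": 0}}}}],
}
print(vega_spec["marks"][0]["type"])  # rect marks driven by the "table" dataset
```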
Article
Full-text available
While model-driven engineering (MDE) methodology has made significant improvements in terms of efficiency and effectiveness in many areas of software development, the same cannot be said of the development of data visualization systems. With this challenge in mind, this paper introduces DVDL (data visualization description language), a modular and hierarchical visualization description language that takes advantage of the model-based design of MDE to describe visualization development at an abstract level. The paper also presents DVIZ (data visualization), a visualization system based on DVDL. With a growing popularity and demand for data visualization technology, a number of visualization tools have emerged in recent years, though few of them can be considered as adaptable and scalable as DVIZ. Key features in DVIZ include data source selection by the user, property configuration of visual elements, and result publishing and sharing. The system also supports real-time result generation and multi-visual interaction. Lastly, since DVIZ is web-based, it supports result distribution across various social media.
... Some research is not accompanied by an instantiation but is still incredibly informative, like Chi's work on the data state model, a formalism describing the visualization process using a graph model [16]. These previous visualization frameworks-turned-tools, with early work on visualization languages and automation like SeeDB [17,18,19], Visualization Languages [20,21], Tableau's analytic data engine [22], and Data Visualization Management Systems [23], are the direct inspiration for DIVE. ...
Thesis
Our world is filled with data describing complex systems like international trade, personal mobility, particle interactions, and genomes. To make sense of these systems, analysts often use visualization and statistical analysis to identify trends, patterns, or anomalies in their data. However, currently available software tools for visualizing and analyzing data have major limitations because they have prohibitively steep learning curves, often need domain-specific customization, and require users to know a priori what they want to see before they see it. Here, I present a new platform for exploratory data visualization and analysis that automatically presents users with inferred visualizations and analyses. By turning data visualization and analysis into an act of curation and selection, this platform aspires to democratize the techniques needed to understand and communicate data. Conceptually, for any dataset, there are a finite number of combinations of its elements. Therefore there are a finite number of common visualizations or analyses of a dataset. In other words, it should be possible to enumerate the whole space of possible visualizations and analyses for a given dataset. Analysts can then explore this space and select interesting results. To capture this intuition, we developed a conceptual framework inspired by set theory and relational algebra, and drawing upon existing work in visualization and database architecture. With these analytical abstractions, we rigorously characterize datasets, infer data models, enumerate visualizable or analyzable data structures, and score these data structures. We implement this framework in the Data Integration and Visualization Engine (DIVE), a web-based platform for anyone to efficiently analyze or visualize arbitrary structured datasets, and then to export and share their results. DIVE has been under development since March 2014 and will continue being developed in order to have an alpha version available by September 2015 and a beta version by the end of 2015.
... Faceted search has made this task easier by allowing users to easily create selection queries over a few given queriable facets. But there are many other interesting techniques for query formulation, such as query by example [116,117], visual query interfaces [91,90,85,80,48,11], keyword search [6,54,55,10,103], query recommendation [115,89,21], iterative querying [12,79], aids for query construction [92,71], and forms [63,28,64]. ...
Article
Faceted browsing is a popular paradigm for end-user data access. It is, at present, the de-facto search interface for almost all e-commerce. A typical faceted interface has two main component panels: a query panel and a result panel. Faceted browsing is primarily designed to help users quickly get to a specific item if they know the characteristics they are looking for. However, limitations in the query and the result panel deter effective faceted browsing, especially for users unfamiliar with the data. In this dissertation, we highlight two such limitations, one each in the query and the result panel. We propose add-on extensions to address each of these limitations. In a faceted interface, users progressively select a sequence of facet values to get to their desired result set, which is called an exploration path. If the dataset is high-dimensional, the query panel can only show a few of those dimensions as queriable facets. Users cannot see in the query panel the overall space of available exploration paths, and thus end up choosing an inferior exploration path. Many users have difficulty in selecting or understanding an exploration path when there are many non-queriable facets and the query panel has very limited information about interaction between facets. We address this limitation by showing users an integrated summary of facet interaction that summarizes their chosen exploration path, and by presenting a two-phased faceted interface that provides users a facetwise way to compare the available exploration paths. The result panel that is normally used for presenting relational tuples, including in faceted interfaces, cannot support fast browsing. When a user scrolls fast through data having alphanumeric values, then everything seems like a fast changing blur. To help the user get a quick sense of data, we propose a novel variable-speed scrolling interface, which provides the user a good impression of the data through selected representative tuples that are chosen based on the user's scrolling speed and browsing history.
... Guess has a query language built into its graph visualization tool. Tableau by Hanrahan (2006) uses a structured query language for data visualization of relational databases, cubes, and spreadsheets. ZAME (Elmqvist et al. (2008a)) is a visualization tool for exploring graphs at a scale of millions of nodes and edges. ...
Article
Technological advances have led to a proliferation of data characterized by a complex structure; namely, high-dimensional attribute information complemented by relationships between the objects or even the attributes. Classical data mining techniques usually explore the attribute space, while network analytic techniques focus on the relationships, usually expressed in the form of a graph. However, visualization techniques offer the possibility to gain useful insight through appropriate graphical displays coupled with data mining and network analytic techniques. In this thesis, we study various topics of the visual analytic process. Specifically, in chapter 2, we propose a visual analytic algebra geared towards attributed graphs. The algebra defines a universal language for graph data manipulations during the visual analytic process and allows documentation and reproducibility. In chapter 3, we extend the algebra framework to address the uncertain querying problem. The algebra's operators are illustrated on a number of synthetic and real data sets, implemented in an existing visualization system (Cytoscape) and validated through a small user study. In chapter 4, we introduce a dimension reduction technique that through a regularization framework incorporates network information either on the objects or the attributes. The technique is illustrated on a number of real world applications. Finally, in the last part of the thesis, we present a multi-task generalized linear model that improves the learning of a single task (problem) by utilizing information from connected/similar tasks through a shared representation. We present an algorithm for estimating the parameters of the problem efficiently and illustrate it on a movie ratings data set.
... It introduces graphical primitives that map to textual query notations, allowing users to specify a "query-path" across multiple tables and also to express recursive queries visually. The Polaris System [58] constructs a similar mapping between common user interface actions, such as zooming and panning, and relational algebra. This allows users to visualize any standard SQL database using a point-and-click interface. ...
Article
Humans are increasingly becoming the primary consumers of structured data. As the volume and heterogeneity of data produced in the world increase, the existing paradigm of using an application layer to query and search for information in data is becoming infeasible. The human end-user is overwhelmed with a barrage of diverse query and data models. Due to the lack of familiarity with the data sources, search queries issued by the user are typically imprecise. To solve this problem, this dissertation introduces the notion of a "queried unit", or qunit, which is the semantic unit of information returned in response to a user's search query. In a qunits-based system, the user comes in with an information need and is guided to the qunit that is an appropriate response to that need. The qunits-based paradigm aids the user by systematically shrinking both the query and result spaces. On one end, the query space is reduced by enriching the user's imprecise information need; this is done by eliciting information from the user during query input through schema and data suggestions. On the other end, the result space is reduced by modeling the structured data as a collection of qunits; this is done using qunit derivation methods that draw on various sources of information such as query logs. This dissertation describes the design and implementation of an autocompletion-style system that performs both query and result space reduction by interacting with the user in real time, providing suggestions and pruning candidate qunit results. It enables the user to search through databases without any knowledge of the data, schema, or query language.
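A minimal sketch of the query-time schema and data-value suggestions described above; the catalog contents and ranking are illustrative assumptions, not the dissertation's implementation.

```python
# Suggest schema elements and data values that extend the user's typed prefix.
catalog = {
    "author": ["Hanrahan", "Stolte", "Mackinlay"],
    "conference": ["VLDB", "SIGMOD", "CHI"],
    "year": ["2005", "2006", "2007"],
}

def suggest(prefix, catalog, limit=5):
    """Return schema suggestions first, then matching data values."""
    prefix = prefix.lower()
    hits = [f for f in catalog if f.startswith(prefix)]          # schema suggestions
    hits += [v for values in catalog.values() for v in values
             if v.lower().startswith(prefix)]                    # data-value suggestions
    return hits[:limit]

print(suggest("con", catalog))   # ['conference']
print(suggest("han", catalog))   # ['Hanrahan']
```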
... Its query operations are quite simple (e.g., allowing only one level of grouping and being highly restrictive on the form of query conditions). Tableau [4], which is built on VizQL [45], specializes in interactive data visualization and is limited in querying capability. ...
Article
Database systems are tremendously powerful and useful, as evidenced by their popularity in modern business. Unfortunately, for non-expert users, using a database is still a daunting task due to its poor usability. This PhD dissertation examines stages in the information seeking process and proposes techniques to help users interact with the database through direct manipulation, which has been shown to be a natural interaction paradigm. For the first stage of information seeking, query formulation, we proposed a spreadsheet algebra upon which a direct-manipulation interface for database querying can be built; the algebra is powerful (capable of expressing at least all single-block SQL queries) and can be intuitively implemented in a spreadsheet. In addition, we proposed assisted querying by browsing, where we help users query the database through browsing. For the second stage, result review, instead of asking users to review possibly many results in a flat table, we proposed a hierarchical navigation scheme that allows users to browse the results through representatives, with easy drill-down and filtering capabilities; we proposed an efficient tree-based method for generating the representatives. For the query refinement stage, we proposed and implemented a provenance-based automatic refinement framework: users label a set of output tuples, and the framework produces a ranked list of changes that best improve the query. This dissertation significantly lowers the barrier for non-expert users and reduces the effort for expert users to use a database.
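A minimal sketch of how direct-manipulation operations can compose a single-block SQL query, in the spirit of the spreadsheet algebra above; the operation names and translation rules are illustrative assumptions, not the dissertation's algebra.

```python
# Compose one SELECT ... FROM ... WHERE ... GROUP BY block from UI operations.
def to_sql(table, columns=None, filters=None, group_by=None, aggregate=None):
    select = ", ".join(columns or ["*"])
    if aggregate and group_by:
        select = ", ".join(group_by + [f"{fn}({col}) AS {fn}_{col}"
                                       for fn, col in aggregate])
    sql = f"SELECT {select} FROM {table}"
    if filters:
        sql += " WHERE " + " AND ".join(filters)
    if group_by:
        sql += " GROUP BY " + ", ".join(group_by)
    return sql

# A user filters rows and groups by region, aggregating sales, entirely by
# direct manipulation; the interface emits one single-block SQL query.
print(to_sql("orders",
             filters=["year = 2006"],
             group_by=["region"],
             aggregate=[("SUM", "sales")]))
```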
Article
Information visualizations such as bar charts and line charts are very common for analyzing data and discovering critical insights. People often analyze charts to answer questions that they have in mind. Answering such questions can be challenging, as it often requires a significant amount of perceptual and cognitive effort. Chart Question Answering (CQA) systems typically take a chart and a natural language question as input and automatically generate the answer to facilitate visual data analysis. Over the last few years, there has been a growing body of literature on the task of CQA. In this survey, we systematically review the current state-of-the-art research focusing on the problem of chart question answering. We provide a taxonomy by identifying several important dimensions of the problem domain, including possible inputs and outputs of the task, and discuss the advantages and limitations of proposed solutions. We then summarize various evaluation techniques used in the surveyed papers. Finally, we outline the open challenges and future research opportunities related to chart question answering.
Preprint
Full-text available
There has been substantial growth in the use of JSON-based grammars, as well as other standard data serialization languages, to create visualizations. Each of these grammars serves a purpose: some focus on particular computational tasks (such as animation), some are concerned with certain chart types (such as maps), and some target specific data domains (such as ML). Despite the prominence of this interface form, there has been little detailed analysis of the characteristics of these languages. In this study, we survey and analyze the design and implementation of 57 JSON-style DSLs for visualization. We analyze these languages supported by a collected corpus of examples for each DSL (consisting of 4395 instances) across a variety of axes organized into concerns related to domain, conceptual model, language relationships, affordances, and general practicalities. We identify tensions throughout these areas, such as between formal and colloquial specifications, among types of users, and within the composition of languages. Through this work, we seek to support language implementers by elucidating the choices, opportunities, and tradeoffs in visualization DSL design.
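For concreteness, here is a small chart specification of the kind such JSON-style DSLs exchange, written as a Python dict so it can be serialized and inspected. The structure follows the widely used Vega-Lite mark-plus-encoding idiom; treat it as an illustrative example rather than a spec drawn from the surveyed corpus.

```python
import json

# A declarative bar-chart specification: data, a mark type, and encodings that
# bind data fields to visual channels.
spec = {
    "data": {"values": [{"category": "A", "amount": 28},
                        {"category": "B", "amount": 55}]},
    "mark": "bar",
    "encoding": {
        "x": {"field": "category", "type": "nominal"},
        "y": {"field": "amount", "type": "quantitative"},
    },
}

# Serialized, this is exactly the kind of artifact these DSLs pass between tools.
print(json.dumps(spec, indent=2))
```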
Book
The noble way to substantiate decisions that affect many people is to ask these people for their opinions. For governments that run whole countries, this means asking all citizens for their views to consider their situations and needs. Organizations such as Africa's Voices Foundation, which want to facilitate communication between decision-makers and citizens of a country, have difficulty mediating between these groups. To enable understanding, statements need to be summarized and visualized. Accomplishing these goals in a way that does justice to the citizens' voices and situations proves challenging. Standard charts do not help this cause, as they fail to create empathy for the people behind their graphical abstractions. Furthermore, these charts do not create trust in the data they represent, as there is no way to see or navigate back to the underlying code and the original data. To fulfill these functions, visualizations would benefit greatly from interactions to explore the displayed data, which standard charts often provide only to a limited extent. To help improve the understanding of people's voices, we developed and categorized 80 ideas for new visualizations, new interactions, and better connections between different charts, which we present in this report. From those ideas, we implemented 10 prototypes and two systems that integrate different visualizations. We show that this integration allows consistent appearance and behavior of visualizations. The visualizations all share the same main concept: representing each individual with a single dot. To realize this idea, we discuss technologies that efficiently allow the rendering of a large number of these dots. With these visualizations, direct interactions with representations of individuals are achievable by clicking on them or by dragging a selection around them. This direct interaction is only possible with a bidirectional connection from the visualization to the data it displays. We discuss different strategies for bidirectional mappings and the trade-offs involved. Having unified behavior across visualizations enhances exploration. For our prototypes, that includes grouping, filtering, highlighting, and coloring of dots. Our prototyping work was enabled by the development environment Lively4. We explain which parts of Lively4 facilitated our prototyping process. Finally, we evaluate our approach to domain problems and our developed visualization concepts. Our work provides inspiration and a starting point for visualization development in this domain. Our visualizations can improve communication between citizens and their government and motivate empathetic decisions. Our approach, combining low-level entities to create visualizations, provides value to an explorative and empathetic workflow. We show that the design space for visualizing this kind of data has a lot of potential and that it is possible to combine qualitative and quantitative approaches to data analysis.
Article
In this work, we present a self-driving data visualization system, called DeepEye, that automatically generates and recommends visualizations based on the idea of visualization by examples. We propose effective visualization recognition techniques to decide which visualizations are meaningful and visualization ranking techniques to rank the good visualizations. Furthermore, a main challenge for automatic visualization systems is that users may be misled by blindly suggested visualizations that ignore the user's intent. To this end, we extend DeepEye to be easily steerable by allowing the user to issue keyword searches and by providing click-based faceted navigation. Empirical results, using real-life data and use cases, verify the power of our proposed system.
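A minimal sketch of the recognize-then-rank recommendation loop described above; the candidate charts, scores, and keyword filter are illustrative assumptions, not DeepEye's actual models.

```python
# Candidate visualizations with precomputed "meaningfulness" scores.
candidates = [
    {"chart": "bar",     "x": "country", "y": "sales",  "score": 0.91},
    {"chart": "line",    "x": "month",   "y": "sales",  "score": 0.88},
    {"chart": "scatter", "x": "price",   "y": "rating", "score": 0.47},
]

def recommend(candidates, keyword=None, threshold=0.5):
    """Keep candidates recognized as meaningful (score above threshold),
    optionally steer by keyword, and return them best-first."""
    good = [c for c in candidates if c["score"] >= threshold]
    if keyword:
        good = [c for c in good if keyword in (c["x"], c["y"], c["chart"])]
    return sorted(good, key=lambda c: c["score"], reverse=True)

print(recommend(candidates))                    # all meaningful charts, ranked
print(recommend(candidates, keyword="sales"))   # steered by keyword search
```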
Conference Paper
This paper describes an approach that puts even inexperienced users in charge of force-directed layouts. The visual interface to a powerful but relatively easy-to-use visualization grammar has been augmented with sliders for controlling the strength of constraints applied to visual objects. Users can change the balance of power between constraints while an animated visualization is running, turn off the constraints affecting the layout, or return a layout to its pre-constraint-solving specification. An initial empirical evaluation supported the usefulness of this interactive design intervention for providing user control over force-directed layouts. This approach is a step towards addressing the lack of tools with which less sophisticated users can design customized visualizations that best meet their needs.
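A minimal sketch of the underlying idea: a force-directed layout whose force strengths are exposed as parameters that a slider UI could adjust while the animation runs. The physics and parameter names are illustrative assumptions, not the paper's system.

```python
import math
import random

def layout(edges, nodes, repulsion=0.5, attraction=0.1, iterations=200):
    """Toy force-directed layout; `repulsion` and `attraction` are the knobs a
    slider interface would change between animation frames."""
    pos = {n: [random.random(), random.random()] for n in nodes}
    for _ in range(iterations):
        for a in nodes:                                  # pairwise repulsion
            for b in nodes:
                if a == b:
                    continue
                dx, dy = pos[a][0] - pos[b][0], pos[a][1] - pos[b][1]
                d = math.hypot(dx, dy) or 1e-6
                pos[a][0] += repulsion * dx / (d * d) * 0.01
                pos[a][1] += repulsion * dy / (d * d) * 0.01
        for a, b in edges:                               # spring attraction along edges
            dx, dy = pos[b][0] - pos[a][0], pos[b][1] - pos[a][1]
            pos[a][0] += attraction * dx
            pos[a][1] += attraction * dy
            pos[b][0] -= attraction * dx
            pos[b][1] -= attraction * dy
    return pos

print(layout([("a", "b"), ("b", "c")], ["a", "b", "c"]))
```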
Article
The interactive exploration of data cubes has become a popular application, especially over large datasets. In this paper, we present DICE, a combination of a novel frontend query interface and a distributed aggregation backend that enables interactive cube exploration. DICE provides a convenient, practical alternative to the typical offline cube materialization strategy by allowing the user to explore facets of the data cube, trading accuracy for interactive response times by sampling the data. We treat the time the user spends perusing the results of their current query as an opportunity to execute and cache the most likely follow-up queries. The frontend presents a novel intuitive interface that allows for sampling-aware aggregations and encourages interaction via our proposed faceted model. The design of our backend is tailored to the low-latency user interaction at the frontend, and vice versa. We discuss the synergistic design behind both the frontend user experience and the backend architecture of DICE, and present a demonstration that allows the user to fluidly interact with billion-tuple datasets at sub-second interactive response times.
Conference Paper
Interactive ad-hoc analytics over large datasets has become an increasingly popular use case. We detail the challenges encountered when building a distributed system that allows the interactive exploration of a data cube. We introduce DICE, a distributed system that uses a novel session-oriented model for data cube exploration, designed to provide the user with interactive sub-second latencies for specified accuracy levels. A novel framework is provided that combines three concepts: faceted exploration of data cubes, speculative execution of queries and query execution over subsets of data. We discuss design considerations, implementation details and optimizations of our system. Experiments demonstrate that DICE provides a sub-second interactive cube exploration experience at the billion-tuple scale that is at least 33% faster than current approaches.
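A minimal sketch of two of the DICE ideas described above: answering an aggregate from a sample (trading accuracy for latency) and speculatively caching the most likely follow-up facet queries while the user reads the current result. The data, the follow-up model, and the scaling rule are illustrative assumptions, not DICE's actual architecture.

```python
import random

# Synthetic fact table standing in for a large cube.
rows = [{"region": random.choice(["EU", "US", "ASIA"]),
         "year": random.choice([2012, 2013]),
         "sales": random.random()} for _ in range(100_000)]

def approx_sum(rows, predicate, rate=0.01):
    """Estimate SUM(sales) from a uniform sample, scaled by 1/rate."""
    sample = [r for r in rows if random.random() < rate and predicate(r)]
    return sum(r["sales"] for r in sample) / rate

cache = {}

def explore(region):
    """Answer the current facet query, then speculatively precompute
    neighbouring facet values as likely follow-up queries."""
    if region in cache:
        answer = cache.pop(region)           # served from the speculative cache
    else:
        answer = approx_sum(rows, lambda r: r["region"] == region)
    for nxt in ["EU", "US", "ASIA"]:
        if nxt != region and nxt not in cache:
            cache[nxt] = approx_sum(rows, lambda r, v=nxt: r["region"] == v)
    return answer

print(explore("EU"))    # computed on demand
print(explore("US"))    # answered from the cache filled during the previous step
```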
Article
Visual analytics, which combines analytical techniques with advanced visualization features, is fast becoming a standard tool for extracting information from complex data. A number of sophisticated tools have been developed for this purpose, which necessitates formal methods to guide the creation of such tools and to compare them. Further, there is a need for visual analysts to document the steps in their analysis for reuse, sharing, result justification, and so forth. This calls for a visual analytic framework encapsulated in a formal algebra. In this paper, we develop such an algebra for graph data and introduce its atomic operators, which include selection, aggregation, and labeling. We then build a framework around this algebra that enables visual exploration of data. We employ visual operators and support dynamic attributes of data to complete this visual analytic framework. We show several visual data analysis examples using the algebra and framework to illustrate its utility.
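A minimal sketch of selection, aggregation, and labeling operators over an attributed graph, in the spirit of the algebra described above; the graph representation and operator signatures are illustrative assumptions, not the paper's formal definitions.

```python
# A tiny attributed graph: node attributes plus an edge list (the edges are
# carried along; the operators below act on the node attribute table).
nodes = {1: {"type": "person", "age": 34},
         2: {"type": "person", "age": 51},
         3: {"type": "company", "size": 200}}
edges = [(1, 3), (2, 3)]

def select(nodes, predicate):
    """Selection: keep the nodes whose attributes satisfy a predicate."""
    return {n: a for n, a in nodes.items() if predicate(a)}

def aggregate(nodes, key, value):
    """Aggregation: group nodes by one attribute and sum another per group."""
    groups = {}
    for attrs in nodes.values():
        groups.setdefault(attrs.get(key), []).append(attrs.get(value))
    return {k: sum(v for v in vs if v is not None) for k, vs in groups.items()}

def label(nodes, name, fn):
    """Labeling: attach a derived attribute to every node."""
    return {n: {**a, name: fn(a)} for n, a in nodes.items()}

print(select(nodes, lambda a: a["type"] == "person"))
print(aggregate(nodes, "type", "age"))
print(label(nodes, "senior", lambda a: a.get("age", 0) > 40))
```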
Thesis
Full-text available
In information technology, end users can be faced with actions that are impossible to carry out because the needed options do not exist, or may have to adapt to ways of thinking and habits that are not their own. Each domain of computer science (databases, programming languages, and so on) has its own view of how to manage information and present it to the user. The preliminary work of this thesis is a study and synthesis of emerging concepts that makes the equivalences between domains visible. The first contribution improves collaborative bookmarks in two ways through our tool Coviz. (1) Each user bookmarks documents by tagging them with their own terms; these labels can be organized relative to one another according to how specific they are, and the resulting, author-attributed classification can be reused by other users. (2) The tool combines keyword search, in the manner of Google, with selection of document characteristics, in the manner of eBay. This last point is shared by all users; that is, the system treats each individual action as an action of the group. The second, major and confidential, contribution is a tool, DIP, that integrates with existing software. Its goal is to give the user more freedom over interaction and the presentation of information. The principle is to reduce the so-called "machine" constraints of the software by adding a new direct access path between the user and the available data. It was tested under real stress conditions with the KeePlace software. In the end, the user gains expressiveness in filtering information, particularly in formulating dates, as well as in sharing, preserving navigation state, automating routine tasks, and so on.
Article
Full-text available
Flood early warning requires analyzing large volumes of multi-dimensional hydrological data within a short time. The Water Resources Agency in Taiwan has developed a flood alert system whose rainfall warning thresholds are derived from empirical models for every township in Taiwan; the system tracks rainfall in real time through rain gauge stations and issues flood alerts to the affected areas when the observed rainfall exceeds a predefined threshold. The system has been operational during several typhoon and heavy-rain events and has demonstrated its usefulness in flood prevention. However, because the system lacks adequate links between related hydrological data from different sources, such as observed rainfall, flood warning thresholds, and geographic information, analysts still spend a considerable amount of time exploring and cross-referencing critical information when judging flood potential. This research therefore developed a visual decision-making tool, called D^2ashboard, that lets users manipulate related information to enhance their understanding of flood risk and to support decision making. We integrated three interactive functions to increase its usability: (1) data tips, (2) data brushing, and (3) dynamic queries. Users can thus take advantage of its capacity for dynamic exploration and its intuitive operation. We applied D^2ashboard to the flood warning and response process during typhoons TRAMI and KONG-REY (August 2013) for validation. The results show that D^2ashboard decreased the time experts spent judging flood warning information by 25% and 61%, respectively, confirming that it improves the efficiency of interpreting warning information.
Article
In the last several years, large multidimensional databases have become common in a variety of applications, such as data warehousing and scientific computing. Analysis and exploration tasks place significant demands on the interfaces to these databases. Because of the size of the data sets, dense graphical representations are more effective for exploration than spreadsheets and charts. Furthermore, because of the exploratory nature of the analysis, it must be possible for the analysts to change visualizations rapidly as they pursue a cycle involving first hypothesis and then experimentation. In this paper, we present Polaris, an interface for exploring large multidimensional databases that extends the well-known pivot table interface. The novel features of Polaris include an interface for constructing visual specifications of table-based graphical displays and the ability to generate a precise set of relational queries from the visual specifications. The visual specifications can be rapidly and incrementally developed, giving the analyst visual feedback as he constructs complex queries and visualizations.
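A minimal sketch of how a table-based visual specification can be compiled to a relational query, in the spirit of the Polaris/VizQL pipeline described above; the shelf model and translation rule are illustrative assumptions, not the actual compiler.

```python
# Fields dropped on the rows and columns shelves partition the display into
# panes; each pane shows an aggregate of the measure, which maps directly to a
# GROUP BY over the shelf fields.
def compile_spec(table, rows_shelf, cols_shelf, measure, agg="SUM"):
    dims = [f for f in (rows_shelf, cols_shelf) if f]
    select = dims + [f"{agg}({measure}) AS {agg}_{measure}"]
    return (f"SELECT {', '.join(select)} FROM {table} "
            f"GROUP BY {', '.join(dims)}")

# Dragging Year to rows and Market to columns, with Sales as the measure:
print(compile_spec("orders", rows_shelf="year", cols_shelf="market", measure="sales"))
```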