Conference Paper

VizQL: a language for query, analysis and visualization


Abstract

Conventional query languages such as SQL and MDX have limited formatting and visualization capabilities. Thus, although powerful queries can be composed, another layer of software is needed to report or present the results in a useful form to the analyst. VizQL™ is designed to fill that gap. VizQL evolved from the Polaris system at Stanford, which combined query, analysis and visualization into a single framework [1]. VizQL is a formal language for describing tables, charts, graphs, maps, time series and tables of visualizations. These different types of visual representations are unified into one framework, making it easy to switch from one visual representation to another (e.g. from a list view to a cross-tab to a chart). Unlike current charting packages, and like query languages, VizQL permits an unlimited number of picture expressions. Visualizations can thus be easily customized and controlled. VizQL is a declarative language: the desired picture is described, and the low-level operations needed to retrieve the results, perform analytical calculations, map the results to a visual representation, and render the image are generated automatically by the query analyzer. The query analyzer compiles VizQL expressions to SQL and MDX, so VizQL can be used with both relational databases and datacubes. The current implementation supports Hyperion Essbase, Microsoft SQL Server, Microsoft Analysis Services, MySQL and Oracle, as well as desktop data sources such as CSV and Excel files. The analysis phase includes many optimizations that allow large databases to be browsed interactively. VizQL enables a new generation of visual analysis tools that closely couple query, analysis and visualization.
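The compile-to-SQL idea in the abstract can be illustrated with a minimal sketch. Everything below is hypothetical (the spec format, column names, and table are invented for illustration and are not actual VizQL syntax); it shows only the general pattern of lowering a declarative chart description to a single aggregate query:

```python
# Hypothetical sketch: lower a tiny declarative chart spec to SQL.
# The spec format, column names, and table are invented for illustration;
# this is NOT actual VizQL syntax, only the general compile-to-SQL pattern.

def compile_spec(spec):
    """Turn a declarative spec {x, y, agg, table} into one GROUP BY query."""
    x, y, agg, table = spec["x"], spec["y"], spec["agg"], spec["table"]
    return (f"SELECT {x}, {agg}({y}) AS {y}_{agg.lower()} "
            f"FROM {table} GROUP BY {x} ORDER BY {x}")

# A bar chart of total profit per month becomes a single aggregate query;
# retrieval, aggregation, and ordering are all derived from the description.
spec = {"x": "month", "y": "profit", "agg": "SUM", "table": "sales"}
print(compile_spec(spec))
# SELECT month, SUM(profit) AS profit_sum FROM sales GROUP BY month ORDER BY month
```

The point of the declarative approach is that the analyst never writes the right-hand side: changing one entry in the description changes the generated query, the analytical calculation, and the rendered picture together.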


... In shelf builders, users map data columns to visual attributes, typically in a manner that is motivated by a principled visualization framework, such as VizQL [24] or the Grammar of Graphics [95]. Tableau [80] and Charticulator [67] are prominent examples of this paradigm. ...
... After pasting the code into the template body (via the Body tab of Fig. 1c), several automated suggestions are provided on how the data fields could be abstracted as template parameters. Clicking through the suggestions replaces the "age" and "people" data fields (Figure [23][24]). She uses the settings popover (Fig. 6) to specify their allowed data roles; she could have also done so using the Params text box (Fig. 1c). ...
... As JSON-mediated visualization grammars continue to gain popularity, additional languages will inevitably emerge to solve problems unaddressed in prior efforts. Future languages could support more complex rendering schemes, focusing on particular domains such as geospatial analytics [106], 3D visual analytics, pivot tables (perhaps simplifying the language of VizQL [24]), or even on the chart recommendation language CompassQL [111]-which would enable task-specific variations of Voyager [98]. ...
Preprint
Interfaces for creating visualizations typically embrace one of several common forms: textual specification enables fine-grained control, shelf building facilitates rapid exploration, and chart choosing promotes immediacy and simplicity. Ideally these approaches could be unified to integrate the user- and usage-dependent benefits found in each modality, yet these forms remain distinct. We propose parameterized declarative templates, a simple abstraction mechanism over JSON-based visualization grammars, as a foundation for multimodal visualization editors. We demonstrate how templates can facilitate organization and reuse by factoring the more than 160 charts that constitute Vega-Lite's example gallery into approximately 40 templates. We exemplify the pliability of abstracting over charting grammars by implementing -- as a template -- the functionality of the shelf builder Polestar (a simulacrum of Tableau) and a set of templates that emulate the Google Sheets chart chooser. We show how templates support multimodal visualization editing by implementing a prototype and evaluating it through an approachability study.
... (1) Visualization Specifications: visualization specifications provide various ways for users to specify what they want. There have been a great many studies on visualization specifications from both the visualization [1,2,[14][15][16] and database [3,[17][18][19] communities. We include them in this survey for two reasons: ...
... High-level languages [2,3,16,18,36,[70][71][72][73] encapsulate the details of visualization construction, such as the mapping function, as well as properties for marks such as canvas size and legends. ...
... ECharts [16,73] is a recent development in declarative visualization languages designed to support quick visualization creation for non-programmers. VizQL [3] developed from the Polaris system [20] and is the visualization language behind Tableau. (Fig. 3: example of low- and high-level visualization languages; the target visualization (➂) is a bar chart showing the passenger_num of different destinations.) ...
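The low- versus high-level distinction drawn in this passage can be sketched as follows. This is a hypothetical illustration (the data values and field names are invented, and neither form mirrors a real library's API): the declarative form names only the encoding, while the imperative form computes every mark's geometry by hand.

```python
# Hypothetical sketch (made-up data and field names) contrasting low- and
# high-level specification of the same bar chart: passenger_num per destination.

destinations = ["Paris", "Tokyo", "Lima"]
passenger_num = [120, 95, 60]

# High-level (declarative): state WHAT to draw; defaults fill in the rest.
high_level_spec = {"mark": "bar", "x": "destination", "y": "passenger_num"}

# Low-level (imperative): spell out HOW each mark is positioned and sized.
canvas_width, bar_gap = 300, 10
bar_width = canvas_width // len(destinations) - bar_gap  # 90 px per bar
bars = [{"x": i * (bar_width + bar_gap), "width": bar_width,
         "height": n, "label": d}
        for i, (d, n) in enumerate(zip(destinations, passenger_num))]
```

The high-level spec stays the same length no matter how many rows the data has; the low-level form exposes (and burdens the author with) every layout decision.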
Article
Full-text available
Data visualization is crucial in today's data-driven business world, where it is widely used to support decision making that is closely tied to the major revenues of many industrial companies. However, due to the high demand of data processing w.r.t. the volume, velocity, and veracity of data, there is an emerging need for database experts to help make data visualization efficient and effective. In response to this demand, this article surveys techniques that make data visualization more efficient and effective. (1) Visualization specifications define how users can specify their requirements for generating visualizations. (2) Efficient approaches for data visualization process the data and a given visualization specification, producing visualizations whose primary target is to be efficient and scalable at interactive speed. (3) Data visualization recommendation aims to auto-complete an incomplete specification, or to discover more interesting visualizations based on a reference visualization.
... To lower the barriers to creating DVs and further unlock the power of DV for the general public, researchers have proposed a variety of DV-related tasks that have attracted significant attention from both industrial and academic researchers. Numerous studies on these topics have been presented in leading conferences and journals such as VLDB [2], [9], [10], ICDE [11], [12], SIGMOD [13]-[15], and TKDE [16], [17]. These tasks include text-to-vis (i.e., automatically generating DVs from natural language questions) [8], [15], vis-to-text [18] (i.e., automatically generating interpretations of complex DVs for educational purposes), FeVisQA [12] (i.e., free-form question answering over data visualization), and table-to-text (i.e., describing a given table) [19]. ...
... These specifications include various elements such as chart type, colors, sizes, and mapping functions, as well as properties for visual marks like canvas dimensions and legends. Several DVLs are prevalent in the field, such as Vega-Lite [6], ggplot2 [7], ZQL [10], ECharts [28], Vega-Zero [8], and VizQL [13], each offering unique features to facilitate the visualization process. Visualization Specification. ...
Preprint
Data visualization (DV) is a fundamental tool for efficiently conveying the insights behind big data, and it has been widely adopted in today's data-driven world. Task automation in DV, such as converting natural language queries to visualizations (i.e., text-to-vis), generating explanations from visualizations (i.e., vis-to-text), answering DV-related questions in free form (i.e., FeVisQA), and explicating tabular data (i.e., table-to-text), is vital for advancing the field. Despite their potential, the application of pre-trained language models (PLMs) like T5 and BERT in DV has been limited by high costs and challenges in handling cross-modal information, leading to few studies on PLMs for DV. We introduce DataVisT5, a novel PLM tailored for DV that enhances the T5 architecture through a hybrid objective pre-training and multi-task fine-tuning strategy, integrating text and DV datasets to effectively interpret cross-modal semantics. Extensive evaluations on public datasets show that DataVisT5 consistently outperforms current state-of-the-art models on various DV-related tasks. We anticipate that DataVisT5 will not only inspire further research on vertical PLMs but also expand the range of applications for PLMs.
... A DVL specifies the details of the visualization construction (e.g., chart type, color, size, mapping function, and properties for marks such as canvas size and legend). The most common DVLs include Vega-Lite [39], ggplot2 [50], ZQL [41], ECharts [22] and VizQL [16], each of which has its own grammar. Our work mainly uses Vega-Lite due to its popularity and wide usage [30,31,43,46]. ...
... Recent years have witnessed an enormous rise in DV in the data mining [14,18,37,43] and the database communities [16,29,30,46,47]. There are many intelligent tasks proposed to lower the barriers to use DV. ...
Preprint
Data visualization (DV) has become the prevailing tool in the market due to its effectiveness in illustrating insights in vast amounts of data. To lower the barrier to using DVs, automatic DV tasks, such as natural language question (NLQ) to visualization translation (formally called text-to-vis), have been investigated in the research community. However, text-to-vis assumes the NLQ to be well-organized and expressed in a single sentence. In real-world settings, by contrast, complex DVs are built through consecutive exchanges between the DV system and the users. In this paper, we propose a new task named CoVis, short for Conversational text-to-Visualization, aiming at constructing DVs through a series of interactions between users and the system. Since this task has not been studied in the literature, we first build a benchmark dataset named Dial-NVBench, including dialogue sessions with a sequence of queries from a user and responses from the system. Then, we propose a multi-modal neural network named MMCoVisNet to answer these DV-related queries. In particular, MMCoVisNet first fully understands the dialogue context and determines the corresponding responses. Then, it uses adaptive decoders to provide appropriate replies: (i) a straightforward text decoder produces general responses, (ii) an SQL-form decoder synthesizes data-querying responses, and (iii) a DV-form decoder constructs the appropriate DVs. We comparatively evaluate MMCoVisNet against other baselines over our proposed benchmark dataset. Experimental results validate that MMCoVisNet performs better than existing baselines and achieves state-of-the-art performance.
... In contrast to prior approaches at the time-which provided chart typologies that offered only a fixed collection of charts with limited customization-Wilkinson constructed a compositional language with six primitives: data, transforms, marks, scales, guides, and coordinate systems. As a result, the GoG affords visualization authors a large expressive gamut without imposing a heavy specification burden, and its design has influenced modern visualization languages like Tableau's VizQL [24,65], ggplot2 [76], and Vega-Lite [62]. ...
... Fig. 2 summarizes how different visualization grammars surface perceptual groupings. GoG-based systems like VizQL [24], ggplot2 [76], and Vega-Lite [62] do not explicitly encode perceptual groupings. Rather, perceptual groupings occur as a result of other primitives the grammars offer -for instance, spatial proximity via ordinal scales or facets, similar attributes and alignment via the visual encoding mapping, etc. ...
Preprint
Full-text available
The Grammar of Graphics (GoG) has become a popular format for specifying visualizations because it unifies different chart types into a consistent, modular, and customizable framework. But its benefits have not yet reached the broader class of data-driven graphic representations -- from annotated charts and hierarchical visualizations to molecular structure diagrams, Euclidean geometry, and mathematical formulae. These graphics are still developed using rigid typologies, monolithic tools, or specialized grammars that lack the customizability and generality of the GoG. In response, we present Bluefish, a relational grammar of graphics that extends the benefits of the GoG to this larger domain. Bluefish provides two key abstractions: user-extensible, domain-specific elements (e.g., mathematical expressions, chemical atoms, or program state stack frames); and perceptual groupings (also known as Gestalt relations) like proximity, nesting, and linking. Users compose these primitives within a Bluefish specification, which the language runtime compiles to a relational scenegraph: a formal representation of a graphic that, compared to traditional tree-based scenegraphs, better preserves semantic relationships between visual elements. To illustrate its flexibility, we show that Bluefish can represent data-driven graphic representations across a diverse range of domains while closely aligning with domain-specific vocabulary. Moreover, to demonstrate the affordances of Bluefish's relational scenegraph, we develop a prototype screen reader tool that allows blind and low-vision users to traverse a diagram without significant additional scaffolding.
... Hierarchical structures are integral to visualization algebras such as VizQL [5,14]. VizQL defines algebraic operators over data attributes to compose small-multiple displays, which are manifested as interactions that drag and drop attributes onto x- and y-axis "shelves". ...
... Hierarchies are used in multi-dimensional databases to drill-down or roll-up, and in visualization systems to zoom in or zoom out. Visualization formalisms like VizQL [5,14] rely on such hierarchies to define operators like nest. ...
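The algebraic operators referenced in the two snippets above can be sketched in a few lines. This is a hypothetical toy implementation (the attribute names and data are invented), following the commonly described semantics of the Polaris/VizQL table algebra: cross pairs every value of one attribute with every value of another, while nest keeps only the combinations that actually occur in the data.

```python
# Toy sketch of two table-algebra operators in the spirit of Polaris/VizQL.
# Hypothetical implementation over invented data; not the actual VizQL engine.

def cross(a_values, b_values):
    """A x B: every combination of the two attributes' values."""
    return [(a, b) for a in a_values for b in b_values]

def nest(rows, a, b):
    """A / B: only the (a, b) combinations that occur in the data, in order."""
    seen = []
    for row in rows:
        pair = (row[a], row[b])
        if pair not in seen:
            seen.append(pair)
    return seen

rows = [{"region": "East", "state": "NY"},
        {"region": "East", "state": "MA"},
        {"region": "West", "state": "CA"}]

# cross pairs every region with every state (6 panes); nest keeps only the
# 3 pairs present in the data -- there is no ("West", "NY") pane, for example.
all_panes = cross(["East", "West"], ["NY", "MA", "CA"])
hierarchy_panes = nest(rows, "region", "state")
```

Each resulting tuple corresponds to one pane of a small-multiple display, which is why nest is the natural operator for drill-down over hierarchies such as region/state.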
Preprint
Comparison is a core task in visual analysis. Although there are numerous guidelines to help users design effective visualizations to aid known comparison tasks, there are few formalisms that define the semantics of comparison operations in a way that can serve as the basis for a grammar of comparison interactions. Recent work proposed a formalism called View Composition Algebra (VCA) that enables ad hoc comparisons between any combination of marks, trends, or charts in a visualization interface. However, VCA limits comparisons to visual representations of data that have an identical schema, or where the schemas form a strict subset relationship (e.g., comparing price per state with price, but not with price per county). In contrast, the majority of real-world data - temporal, geographical, organizational - are hierarchical. To bridge this gap, this paper presents an extension to VCA (called VCAH) that enables ad hoc comparisons between visualizations of hierarchical data. VCAH leverages known hierarchical relationships to enable ad hoc comparison of data at different hierarchical granularities. We illustrate applications to hierarchical and Tableau visualizations.
... The optimization techniques used in these systems are often inaccessible to visualization developers at large, who are not necessarily experts in performance optimizations. Furthermore, current general-purpose data visualization tools [5,18,9] provide limited support for the developer to create visual exploration applications at scale. ...
... In a seminal work, Wilkinson introduced a grammar of graphics [24] and its implementation (VizML), forming the basis of subsequent research on visualization specification. Drawing from Wilkinson's grammar of graphics, Polaris [22] (commercialized as Tableau) uses a table algebra, which later evolved into VizQL [9], the underlying representation of Tableau visualizations. Wickham introduced ggplot2 [23], a widely popular package in the R statistical language, based on Wilkinson's grammar. ...
Conference Paper
Full-text available
Scalable interactive visual data exploration is crucial in many domains due to increasingly large datasets generated at rapid rates. Details-on-demand provides a useful interaction paradigm for exploring large datasets, where the user starts at an overview, finds regions of interest, zooms in to see detailed views, zooms out and then repeats. This paradigm is the primary user interaction mode of widely-used systems such as Google Maps, Aperture Tiles and ForeCache. These earlier systems, however, are highly customized with hardcoded visual representations and optimizations. A more general framework is needed to facilitate the development of visual data exploration systems at scale. In this paper, we present Kyrix, an end-to-end system for developing scalable details-on-demand data exploration applications. Kyrix provides the developer with a declarative model for easy specification of general visualizations. Behind the scenes, Kyrix utilizes a suite of performance optimization techniques to achieve a response time within 500 ms for various user interactions. We also report results from a performance study which shows that a novel dynamic fetching scheme adopted by Kyrix outperforms tile-based fetching used in traditional systems.
... Although the VA community (19 references) and the visualization community (12) can be seen as the target audience for this survey, around half of the 66 publications are not focused on VA or visualization. Application-driven publications (25) contribute the most to this (external) group. ...
... All research spin-offs have approached the emerging market with distinct skill-sets, manifesting the diverse approaches in the commercial VA sector today. For example, Tableau's core architecture is built on VizQL [25], a declarative query definition language that translates user actions into database queries and the respective data responses back into graphical representations. Similarly, Spotfire's architecture builds on top of IVEE: An information visualization & exploration environment [24], a research prototype for the dynamic queries idea in which the database query process is translated into visual metaphors. ...
Article
Five years after the first state-of-the-art report on Commercial Visual Analytics Systems we present a reevaluation of the Big Data Analytics field. We build on the success of the 2012 survey, which was influential even beyond the boundaries of the InfoVis and Visual Analytics (VA) community. While the field has matured significantly since the original survey, we find that innovation and research-driven development are increasingly sacrificed to satisfy a wide range of user groups. We evaluate new product versions on established evaluation criteria, such as available features, performance, and usability, to extend on and assure comparability with the previous survey. We also investigate previously unavailable products to paint a more complete picture of the commercial VA landscape. Furthermore, we introduce novel measures, like suitability for specific user groups and the ability to handle complex data types, and undertake a new case study to highlight innovative features. We explore the achievements in the commercial sector in addressing VA challenges and propose novel developments that should be on systems' roadmaps in the coming years.
... Tableau [15] supports data visualization in a comprehensive manner: the user can connect to different data sources to select the data to be analyzed, and the tool suggests multiple options for interpreting the data. Along with VizQL [5], a specification language, and ShowMe [7], which builds on [5] to automatically present data as small sets of multiple views, Tableau provides an exhaustive set of options focusing on the user experience. VizRec [18], Voyager [20], [6], and SeeDB [19] are other works that explore various aspects of visualization. ...
Conference Paper
Selecting the appropriate visual presentation of the data, such that it not only preserves the semantics but also provides an intuitive summary of the data, is an important, often final, step of data analytics. Unfortunately, it is also a step involving significant human effort, from the selection of groups of columns in the structured results of analytics stages to the selection of the right visualization by experimenting with various alternatives. In this paper, we describe our DataVizard system, aimed at reducing this overhead by automatically recommending the most appropriate visual presentation for a structured result. Specifically, we consider the following two scenarios: first, when one needs to visualize the results of a structured query such as SQL; and second, when one has acquired a data table with an associated short description (e.g., tables from the Web). Using a corpus of real-world database queries (and their results) and a number of statistical tables crawled from the Web, we show that DataVizard is capable of recommending visual presentations with high accuracy.
... Existing studies [20,26,43,50] have significantly influenced and shaped the following two requirements for a declarative grammar designed for querying multivariate hierarchical data. ...
Preprint
When using exploratory visual analysis to examine multivariate hierarchical data, users often need to query data to narrow down the scope of analysis. However, formulating effective query expressions remains a challenge for multivariate hierarchical data, particularly when datasets become very large. To address this issue, we develop a declarative grammar, HiRegEx (Hierarchical data Regular Expression), for querying and exploring multivariate hierarchical data. Rooted in the extended multi-level task topology framework for tree visualizations (e-MLTT), HiRegEx delineates three query targets (node, path, and subtree) and two aspects for querying these targets (features and positions), and uses operators developed based on classical regular expressions for query construction. Based on the HiRegEx grammar, we develop an exploratory framework for querying and exploring multivariate hierarchical data and integrate it into the TreeQueryER prototype system. The exploratory framework includes three major components: top-down pattern specification, bottom-up data-driven inquiry, and context-creation data overview. We validate the expressiveness of HiRegEx with the tasks from the e-MLTT framework and showcase the utility and effectiveness of TreeQueryER system through a case study involving expert users in the analysis of a citation tree dataset.
... VizQL [16] is a formal language used by Tableau [2] to describe data visualizations. VizQL allows the specification of table configurations (i.e., rows and columns) as well as visual encodings within each pane. ...
Preprint
Full-text available
Various data visualization applications such as reverse engineering and interactive authoring require a vocabulary that describes the structure of visualization scenes and the procedure to manipulate them. A few scene abstractions have been proposed, but they are restricted to specific applications for a limited set of visualization types. A unified and expressive model of data visualization scenes for different applications has been missing. To fill this gap, we present Manipulable Semantic Components (MSC), a computational representation of data visualization scenes, to support applications in scene understanding and augmentation. MSC consists of two parts: a unified object model describing the structure of a visualization scene in terms of semantic components, and a set of operations to generate and modify the scene components. We demonstrate the benefits of MSC in three applications: visualization authoring, visualization deconstruction and reuse, and animation specification.
... Data visualization research has a long history of investigating DSLs for visualization specification (e.g., [4,12,23,29,31,32]), but research into linting and debugging visualizations is nascent. McNutt and Kindlmann [19] introduce a visualization linter that checks a predefined set of rules on a given visualization and returns a list of failed rules with explanations; this postprocessing approach is disconnected from the development workflow and does not localize errors for rendered visualizations in their specifications. ...
... He supposed that if the grammar is successful, it should be possible to reduce any data visualization problem into a graphic utilizing the rules outlined. Grammar-based visual encodings, as well as declarative languages [12,28,29,45,51,62], arose out of a need to fluidly and precisely articulate a set of intents for communicating data. These works provide formalized approaches for describing tables, charts, graphs, maps, and tables of visualizations, and give credence to treating visualization as a language. ...
Preprint
Full-text available
Data visualization can be defined as the visual communication of information. One important barometer for the success of a visualization is whether the intents of the communicator(s) are faithfully conveyed. The processes of constructing and displaying visualizations have been widely studied by our community. However, due to the lack of consistency in this literature, there is a growing acknowledgment of a need for frameworks and methodologies for classifying and formalizing the communicative component of visualization. This work focuses on intent and introduces how this concept in communicative visualization mirrors concepts in linguistics. We construct a mapping between the two spaces that enables us to leverage relevant frameworks to apply to visualization. We describe this translation as using the philosophy of language as a base for explaining communication in visualization. Furthermore, we illustrate the benefits and point out several prospective research directions.
... This is especially true for researchers with limited programming skills. These challenges thus warrant the development of truly integrated and highly usable tools [13]. ...
Article
Full-text available
The emergence of massive datasets exploring the multiple levels of molecular biology has made their analysis and knowledge transfer more complex. Flexible tools to manage big biological datasets could be of great help for standardizing the usage of developed data visualizations and integration methods. Business intelligence (BI) tools have been used in many fields as exploratory tools. They offer numerous connectors linking many data repositories with a unified graphic interface, offering an overview of data and facilitating interpretation for decision makers. BI tools could be a flexible and user-friendly way of handling molecular biological data with interactive visualizations. However, it is rather uncommon to see such tools used for the exploration of massive and complex datasets in biological fields. We believe that two main obstacles could be the reason. Firstly, we posit that the ways of importing data into BI tools are not compatible with biological databases. Secondly, BI tools may not be adapted to certain particularities of complex biological data, namely, the size, the variability of datasets and the availability of specialized visualizations. This paper highlights the use of five BI tools (Elastic Kibana, Siren Investigate, Microsoft Power BI, Salesforce Tableau and Apache Superset), all of which are compatible with the massive data management engine Elasticsearch. Four case studies will be discussed in which these BI tools were applied to biological datasets with different characteristics. We conclude that the performance of the tools depends on the complexity of the biological questions and the size of the datasets.
... At a high level, there are two main steps in a BI project: (1) building BI models, and (2) performing ad-hoc analysis by querying against BI models. While querying BI-models was made simple by vendors like Tableau and Power-BI (through intuitive user interfaces and dashboards) [27,38], the first step of building "BI-models", a prerequisite for ad-hoc analysis, remains a key pain point for non-technical users. ...
Preprint
Full-text available
Business Intelligence (BI) is crucial in modern enterprises and a billion-dollar business. Traditionally, technical experts like database administrators would manually prepare BI-models (e.g., in star or snowflake schemas) that join tables in data warehouses, before less-technical business users can run analytics using end-user dashboarding tools. However, the popularity of self-service BI (e.g., Tableau and Power-BI) in recent years creates a strong demand for less technical end-users to build BI-models themselves. We develop an Auto-BI system that can accurately predict BI models given a set of input tables, using a principled graph-based optimization problem we propose called k-Min-Cost-Arborescence (k-MCA), which holistically considers both local join prediction and global schema-graph structures, leveraging a graph-theoretical structure called an arborescence. While we prove k-MCA is intractable and inapproximable in general, we develop novel algorithms that can solve k-MCA optimally, which are shown to be efficient in practice with sub-second latency and can scale to the largest BI-models we encounter (with close to 100 tables). Auto-BI is rigorously evaluated on a unique dataset with over 100K real BI models we harvested, as well as on 4 popular TPC benchmarks. It is shown to be both efficient and accurate, achieving over 0.9 F1-score on both real and synthetic benchmarks.
... For instance, while aggregation functions such as mean and sum are suitable for quantitative data, grouping is better suited to nominal and ordinal data, and binning intervals is the right transformation in the case of temporal samples [12]. In addition, recent works have proposed more complex transformations of multidimensional datasets to extract meaningful subsets using relational queries [27][28][29]. In the case of connected structures, the topology can play an important role in the transformations, and also in the next stage of Visual Mapping [30]. ...
Article
Full-text available
Rapid growth in the generation of data from various sources has made data visualisation a valuable tool for analysing data. However, visual analysis can be a challenging task, not only due to intricate dashboards but also when dealing with complex and multidimensional data. In this context, advances in Natural Language Processing technologies have led to the development of Visualisation-oriented Natural Language Interfaces (V-NLIs). In this paper, we carry out a scoping review that analyses synergies between the fields of Data Visualisation and Natural Language Interaction. Specifically, we focus on chatbot-based V-NLI approaches and explore and discuss three research questions. The first two research questions focus on studying how chatbot-based V-NLIs contribute to interactions with the Data and Visual Spaces of the visualisation pipeline, while the third seeks to know how chatbot-based V-NLIs enhance users’ interaction with visualisations. Our findings show that the works in the literature put a strong focus on exploring tabular data with basic visualisations, with visual mapping primarily reliant on fixed layouts. Moreover, V-NLIs provide users with restricted guidance strategies, and few of them support high-level and follow-up queries. We identify challenges and possible research opportunities for the V-NLI community such as supporting high-level queries with complex data, integrating V-NLIs with more advanced systems such as Augmented Reality (AR) or Virtual Reality (VR), particularly for advanced visualisations, expanding guidance strategies beyond current limitations, adopting intelligent visual mapping techniques, and incorporating more sophisticated interaction methods.
... Filter() constrains T_e to user-selected conditions u. As indicated in Algorithm 1, we iterate over each edge (s, p, o) in G'_ont (line 1) and generate the query accordingly, depending on whether the nodes s/o have filtering conditions through a filtered attribute (lines 3, 9, 14). If there is one filtering node, Filter() is required (lines 12, 17). ...
Article
Full-text available
The Internet of Food (IoF) is an emerging field in smart foodsheds, involving the creation of a knowledge graph (KG) about the environment, agriculture, food, diet, and health. However, the heterogeneity and size of the KG present challenges for downstream tasks, such as information retrieval and interactive exploration. To address those challenges, we propose an interactive knowledge and learning environment (IKLE) that integrates three programming and modeling languages to support multiple downstream tasks in the analysis pipeline. To make IKLE easier to use, we have developed algorithms to automate the generation of each language. In addition, we collaborated with domain experts to design and develop a dataflow visualization system, which embeds the automatic language generations into components and allows users to build their analysis pipeline by dragging and connecting components of interest. We have demonstrated the effectiveness of IKLE through three real-world case studies in smart foodsheds.
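A toy sketch of the query generation described in the citing excerpt above (one triple pattern per ontology edge, plus a FILTER clause when a node carries a user-selected condition); the edge and filter shapes here are hypothetical, not the paper's Algorithm 1:

```python
# Toy SPARQL generation: one triple pattern per ontology edge, plus a
# FILTER clause for any node that carries a user condition. The input
# shapes (variable-name edges, {var: condition} filters) are invented.
def build_query(edges, filters):
    """edges: list of (s, p, o) variable/predicate names;
    filters: {variable: SPARQL condition string}."""
    patterns, constraints = [], []
    for s, p, o in edges:
        patterns.append(f"?{s} <{p}> ?{o} .")
        for var in (s, o):
            if var in filters and filters[var] not in constraints:
                constraints.append(filters[var])
    body = "\n  ".join(patterns + [f"FILTER({c})" for c in constraints])
    return "SELECT * WHERE {\n  " + body + "\n}"

q = build_query(
    edges=[("food", "hasNutrient", "nutrient")],
    filters={"nutrient": '?nutrient = "protein"'},
)
```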
... Others have collected visualizations and figures from scientific papers [SHL * 16, CZL * 20, DWS * 20, CLL * 21]. Moreover, we may consider converting existing TableQA datasets [PL15,ZXS17,IYC17] into ChartQA datasets by translating their queries to SQL commands [YZY * 18] and plotting the relevant portions of the data tables into chart images using SQL2Visualization methods [Han06,LTL * 21a]. With respect to user studies, future works need to focus on field trials and longitudinal studies where the participants can ask their own questions with their own datasets. ...
Article
Full-text available
Information visualizations such as bar charts and line charts are very common for analyzing data and discovering critical insights. Often people analyze charts to answer questions that they have in mind. Answering such questions can be challenging as they often require a significant amount of perceptual and cognitive effort. Chart Question Answering (CQA) systems typically take a chart and a natural language question as input and automatically generate the answer to facilitate visual data analysis. Over the last few years, there has been a growing body of literature on the task of CQA. In this survey, we systematically review the current state‐of‐the‐art research focusing on the problem of chart question answering. We provide a taxonomy by identifying several important dimensions of the problem domain including possible inputs and outputs of the task and discuss the advantages and limitations of proposed solutions. We then summarize various evaluation techniques used in the surveyed papers. Finally, we outline the open challenges and future research opportunities related to chart question answering.
... Interactive methods for querying databases, such as Polaris and later VizQL (Tableau) [28,29], offer platforms for authoring interactive charts and dashboards through drag-and-drop interfaces. These systems have provided significant value to business analytics with their ease of use and suitability for many common tasks, but they are restrictive in terms of their proprietary nature, limited expressivity, and lack of support for graph-based data sources. ...
Article
Full-text available
Graph databases capture richly linked domain knowledge by integrating heterogeneous data and metadata into a unified representation. Here, we present the use of bespoke, interactive data graphics (bar charts, scatter plots, etc.) for visual exploration of a knowledge graph. By modeling a chart as a set of metadata that describes semantic context (SPARQL query) separately from visual context (Vega-Lite specification), we leverage the high-level, declarative nature of the SPARQL and Vega-Lite grammars to concisely specify web-based, interactive data graphics synchronized to a knowledge graph. Resources with dereferenceable URIs (uniform resource identifiers) can employ the hyperlink encoding channel or image marks in Vega-Lite to amplify the information content of a given data graphic, and published charts populate a browsable gallery of the database. We discuss design considerations that arise in relation to portability, persistence, and performance. Altogether, this pairing of SPARQL and Vega-Lite—demonstrated here in the domain of polymer nanocomposite materials science—offers an extensible approach to FAIR (findable, accessible, interoperable, reusable) scientific data visualization within a knowledge graph framework.
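The chart-as-metadata pairing described in this abstract (a SPARQL query for semantic context, a Vega-Lite specification for visual context) can be sketched as plain data structures; the query, field names, and `bind` helper below are hypothetical illustrations, not the paper's API:

```python
# A chart modeled as two pieces of metadata held side by side:
# semantic context (SPARQL) and visual context (Vega-Lite).
chart = {
    "query": """
        SELECT ?sample ?modulus WHERE {
          ?sample <hasProperty> ?modulus .
        }""",
    "vega_lite": {
        "mark": "point",
        "encoding": {
            "x": {"field": "sample",  "type": "nominal"},
            "y": {"field": "modulus", "type": "quantitative"},
        },
    },
}

def bind(chart, rows):
    """Attach query results as inline data for the Vega-Lite spec."""
    spec = dict(chart["vega_lite"])
    spec["data"] = {"values": rows}
    return spec

spec = bind(chart, [{"sample": "PNC-1", "modulus": 2.1}])
```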
... Others have collected visualizations and figures from scientific papers [SHL * 16, CZL * 20, DWS * 20, CLL * 21]. Moreover, we may consider converting existing TableQA datasets [PL15,ZXS17,IYC17] into ChartQA datasets by translating their queries to SQL commands [YZY * 18] and plotting the relevant portions of the data tables into chart images using SQL2Visualization methods [Han06,LTL * 21a]. With respect to user studies, future works need to focus on field trials and longitudinal studies where the participants can ask their own questions with their own datasets. ...
Preprint
Full-text available
Information visualizations such as bar charts and line charts are very common for analyzing data and discovering critical insights. Often people analyze charts to answer questions that they have in mind. Answering such questions can be challenging as they often require a significant amount of perceptual and cognitive effort. Chart Question Answering (CQA) systems typically take a chart and a natural language question as input and automatically generate the answer to facilitate visual data analysis. Over the last few years, there has been a growing body of literature on the task of CQA. In this survey, we systematically review the current state-of-the-art research focusing on the problem of chart question answering. We provide a taxonomy by identifying several important dimensions of the problem domain including possible inputs and outputs of the task and discuss the advantages and limitations of proposed solutions. We then summarize various evaluation techniques used in the surveyed papers. Finally, we outline the open challenges and future research opportunities related to chart question answering.
... In this context, the first efforts focused on developing visual querying languages for DBs such as [40][41][42][43][44]. Although they share some similar concepts, most of them address the need to offer the database analyst a visual way of syntactically expressing a query, rather than offering visual operations for interactive data exploration. In most interactive visualization systems, visual user operations (e.g., map panning) are used for specifying the actual query logic, and several visualization languages have been proposed to simplify the generation of such visualizations [45][46][47][48]. ...
Article
[See also http://www.cs.uoi.gr/~pvassil/projects/ploigia/info.html] Data exploration and visual analytics systems are of great importance in Open Science scenarios, where less tech-savvy researchers wish to access and visually explore big raw data files (e.g., json, csv) generated by scientific experiments using commodity hardware and without being overwhelmed in the tedious processes of data loading, indexing and query optimization. In this paper, we present our work for enabling efficient query processing on large raw data files for interactive visual exploration scenarios and analytics. We introduce a framework, named RawVis, built on top of a lightweight in-memory tile-based index, VALINOR, that is constructed on-the-fly given the first user query over a raw file and progressively adapted based on the user interaction. We evaluate the performance of a prototype implementation compared to three other alternatives and show that our method outperforms in terms of response time, disk accesses and memory consumption. Particularly during an exploration scenario, the proposed method in most cases is about 5-10× faster compared to existing solutions, and requires significantly less memory resources. Keywords: Visual Analytics, Progressive & Adaptive Indexes, User-driven Incremental Processing, Interactive Indexing, RawVis, In-situ Query Processing, Big Data Visualization
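The tile-based in-memory index described above can be illustrated with a drastically simplified sketch (fixed tile size, 2-D points only; VALINOR itself is adaptive and far more elaborate than this):

```python
# Rough sketch of a tile-based in-memory index: points are bucketed into
# fixed-size 2-D tiles, so a range query only scans tiles that overlap
# the query window instead of the whole raw file.
class TileIndex:
    def __init__(self, points, tile=10.0):
        self.tile = tile
        self.grid = {}
        for x, y in points:  # built on first access over the raw points
            key = (int(x // tile), int(y // tile))
            self.grid.setdefault(key, []).append((x, y))

    def range_query(self, x0, y0, x1, y1):
        t, out = self.tile, []
        for i in range(int(x0 // t), int(x1 // t) + 1):
            for j in range(int(y0 // t), int(y1 // t) + 1):
                for x, y in self.grid.get((i, j), []):
                    if x0 <= x <= x1 and y0 <= y <= y1:
                        out.append((x, y))
        return out

idx = TileIndex([(1, 1), (5, 5), (25, 25)])
hits = idx.range_query(0, 0, 10, 10)
```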
... Our visualization language L_V is shown in Figure 8, which formalizes core constructs in Vega-Lite [Satyanarayan et al. 2017], the ggplot2 visualization library for R, and VizQL [Hanrahan 2006] from Tableau. This formalization enables concise descriptions of visualizations by encoding data as properties of graphical marks. ...
Article
Full-text available
While visualizations play a crucial role in gaining insights from data, generating useful visualizations from a complex dataset is far from an easy task. In particular, besides understanding the functionality provided by existing visualization libraries, generating the desired visualization also requires reshaping and aggregating the underlying data as well as composing different visual elements to achieve the intended visual narrative. This paper aims to simplify visualization tasks by automatically synthesizing the required program from simple visual sketches provided by the user. Specifically, given an input data set and a visual sketch that demonstrates how to visualize a very small subset of this data, our technique automatically generates a program that can be used to visualize the entire data set. From a program synthesis perspective, automating visualization tasks poses several challenges that are not addressed by prior techniques. First, because many visualization tasks require data wrangling in addition to generating plots from a given table, we need to decompose the end-to-end synthesis task into two separate sub-problems. Second, because the intermediate specification that results from the decomposition is necessarily imprecise, this makes the data wrangling task particularly challenging in our context. In this paper, we address these problems by developing a new compositional visualization-by-example technique that (a) decomposes the end-to-end task into two different synthesis problems over different DSLs and (b) leverages bi-directional program analysis to deal with the complexity that arises from having an imprecise intermediate specification. We have implemented our visualization-by-example approach in a tool called Viser and evaluate it on over 80 visualization tasks collected from on-line forums and tutorials. 
Viser can solve 84% of these benchmarks within a 600 second time limit, and, for those tasks that can be solved, the desired visualization is among the top-5 generated by Viser in 70% of the cases.
... Our visualization language L_V is shown in Figure 8, which formalizes core constructs in Vega-Lite [Satyanarayan et al. 2017], the ggplot2 visualization library for R, and VizQL [Hanrahan 2006] from Tableau. This formalization enables concise descriptions of visualizations by encoding data as properties of graphical marks. ...
Preprint
Full-text available
While visualizations play a crucial role in gaining insights from data, generating useful visualizations from a complex dataset is far from an easy task. Besides understanding the functionality provided by existing visualization libraries, generating the desired visualization also requires reshaping and aggregating the underlying data as well as composing different visual elements to achieve the intended visual narrative. This paper aims to simplify visualization tasks by automatically synthesizing the required program from simple visual sketches provided by the user. Specifically, given an input data set and a visual sketch that demonstrates how to visualize a very small subset of this data, our technique automatically generates a program that can be used to visualize the entire data set. Automating visualization poses several challenges. First, because many visualization tasks require data wrangling in addition to generating plots, we need to decompose the end-to-end synthesis task into two separate sub-problems. Second, because the intermediate specification that results from the decomposition is necessarily imprecise, this makes the data wrangling task particularly challenging in our context. In this paper, we address these problems by developing a new compositional visualization-by-example technique that (a) decomposes the end-to-end task into two different synthesis problems over different DSLs and (b) leverages bi-directional program analysis to deal with the complexity that arises from having an imprecise intermediate specification. We implemented our visualization-by-example algorithm and evaluate it on 83 visualization tasks collected from on-line forums and tutorials. Viser can solve 84% of these benchmarks within a 600 second time limit, and, for those tasks that can be solved, the desired visualization is among the top-5 generated by Viser in 70% of the cases.
... In 2003, Hanrahan revised Mackinlay's specifications into a declarative visual language known as VizQL [13]. It is a formal language for describing tables, charts, graphs, maps, time series and tables of visualizations. ...
Preprint
Full-text available
Choosing a suitable visualization for data is a difficult task. Current data visualization recommender systems exist to aid in choosing a visualization, yet suffer from issues such as low accessibility and indecisiveness. In this study, we first define a step-by-step guide on how to build a data visualization recommender system. We then use this guide to create a model for a data visualization recommender system for non-experts that aims to resolve the issues of current solutions. The result is a question-based model that uses a decision tree and a data visualization classification hierarchy in order to recommend a visualization. Furthermore, it incorporates both task-driven and data characteristics-driven perspectives, whereas existing solutions seem to either convolute these or focus on one of the two exclusively. Based on testing against existing solutions, it is shown that the new model reaches similar results while being simpler, clearer, more versatile, extendable and transparent. The presented guide can be used as a manual for anyone building a data visualization recommender system. The resulting model can be applied in the development of new data visualization software or as part of a learning tool.
... In 2003, Hanrahan revised Mackinlay's specifications into a declarative visual language known as VizQL [13]. It is a formal language for describing tables, charts, graphs, maps, time series and tables of visualizations. ...
Chapter
Full-text available
Choosing a suitable visualization for data is a difficult task. Current data visualization recommender systems exist to aid in choosing a visualization, yet suffer from issues such as low accessibility and indecisiveness. In this study, we first define a step-by-step guide on how to build a data visualization recommender system. We then use this guide to create a model for a data visualization recommender system for non-experts that aims to resolve the issues of current solutions. The result is a question-based model that uses a decision tree and a data visualization classification hierarchy in order to recommend a visualization. Furthermore, it incorporates both task-driven and data characteristics-driven perspectives, whereas existing solutions seem to either convolute these or focus on one of the two exclusively. Based on testing against existing solutions, it is shown that the new model reaches similar results while being simpler, clearer, more versatile, extendable and transparent. The presented guide can be used as a manual for anyone building a data visualization recommender system. The resulting model can be applied in the development of new data visualization software or as part of a learning tool.
... Wilkinson introduces a grammar of graphics [Wil99] and its implementation (VizML), forming the basis of the subsequent research on visualization specification. Drawing from Wilkinson's grammar of graphics, Polaris [STH02] (commercialized as Tableau) uses a table algebra, which later evolved to VizQL [Han06], the underlying representation of Tableau visualizations. Wickham introduces ggplot2 [Wic10], a widely-adopted package in the R statistical language, based on Wilkinson's grammar. ...
Article
Full-text available
Pan and zoom are basic yet powerful interaction techniques for exploring large datasets. However, existing zoomable UI toolkits such as Pad++ and ZVTM do not provide the backend database support and data‐driven primitives that are necessary for creating large‐scale visualizations. This limitation in existing general‐purpose toolkits has led to many purpose‐built solutions (e.g. Google Maps and ForeCache) that address the issue of scalability but cannot be easily extended to support visualizations beyond their intended data types and usage scenarios. In this paper, we introduce Kyrix to ease the process of creating general and large‐scale web‐based pan/zoom visualizations. Kyrix is an integrated system that provides the developer with a concise and expressive declarative language along with a backend support for performance optimization of large‐scale data. To evaluate the scalability of Kyrix, we conducted a set of benchmarked experiments and show that Kyrix can support high interactivity (with an average latency of 100 ms or below) on pan/zoom visualizations of 100 million data points. We further demonstrate the accessibility of Kyrix through an observational study with 8 developers. Results indicate that developers can quickly learn Kyrix's underlying declarative model to create scalable pan/zoom visualizations. Finally, we provide a gallery of visualizations and show that Kyrix is expressive and flexible in that it can support the developer in creating a wide range of customized visualizations across different application domains and data types.
... -High-level languages [25,26,35,61,72,73,75,79] further abstract the details of low-level visualization construction. They provide concise specification interfaces that are easier for new users to learn and use. ...
Conference Paper
The problem of data visualization is to transform data into a visual context so that people can easily understand its significance. Data visualization is especially important nowadays, because it is the de facto standard for modern business intelligence and successful data science. This tutorial covers three specific topics: visualization languages, which define how users can interact with various visualization systems; efficient data visualization, which processes the data and produces visualizations based on well-specified user queries; and smart data visualization, which recommends data visualizations based on underspecified user queries. In this tutorial, we will go through this prior art in logical order, paying particular attention to problems that may attract interest from the database community.
... In a seminal work, Wilkinson introduces a grammar of graphics [22] and its implementation (VizML), forming the basis of the subsequent research on visualization specification. Drawing from Wilkinson's grammar of graphics, Polaris [20] (commercialized as Tableau) uses a table algebra, which later evolved to VizQL [8], the underlying representation of Tableau visualizations. Wickham introduces ggplot2 [21], a widely-popular package in the R statistical language, based on Wilkinson's grammar. ...
Preprint
Full-text available
Scalable interactive visual data exploration is crucial in many domains due to increasingly large datasets generated at rapid rates. Details-on-demand provides a useful interaction paradigm for exploring large datasets, where users start at an overview, find regions of interest, zoom in to see detailed views, zoom out and then repeat. This paradigm is the primary user interaction mode of widely-used systems such as Google Maps, Aperture Tiles and ForeCache. These earlier systems, however, are highly customized with hardcoded visual representations and optimizations. A more general framework is needed to facilitate the development of visual data exploration systems at scale. In this paper, we present Kyrix, an end-to-end system for developing scalable details-on-demand data exploration applications. Kyrix provides developers with a declarative model for easy specification of general visualizations. Behind the scenes, Kyrix utilizes a suite of performance optimization techniques to achieve a response time within 500ms for various user interactions. We also report results from a performance study which shows that a novel dynamic fetching scheme adopted by Kyrix outperforms tile-based fetching used in earlier systems.
... Wilkinson introduces a grammar of graphics [Wil99] and its implementation (VizML), forming the basis of the subsequent research on visualization specification. Drawing from Wilkinson's grammar of graphics, Polaris [STH02] (commercialized as Tableau) uses a table algebra, which later evolved to VizQL [Han06], the underlying representation of Tableau visualizations. Wickham introduces ggplot2 [Wic10], a widely-adopted package in the R statistical language, based on Wilkinson's grammar. ...
Conference Paper
Full-text available
Pan and zoom are basic yet powerful interaction techniques for exploring large datasets. However, existing zoomable UI toolkits such as Pad++ and ZVTM do not provide the backend database support and data-driven primitives that are necessary for creating large-scale visualizations. This limitation in existing general-purpose toolkits has led to many purpose-built solutions (e.g. Google Maps and ForeCache) that address the issue of scalability but cannot be easily extended to support visualizations beyond their intended data types and usage scenarios. In this paper, we introduce Kyrix to ease the process of creating general and large-scale web-based pan/zoom visualizations. Kyrix is an integrated system that provides the developer with a concise and expressive declarative language along with a backend support for performance optimization of large-scale data. To evaluate the scalability of Kyrix, we conducted a set of benchmarked experiments and show that Kyrix can support high interactivity (with an average latency of 100 ms or below) on pan/zoom visualizations of 100 million data points. We further demonstrate the accessibility of Kyrix through an observational study with 8 developers. Results indicate that developers can quickly learn Kyrix’s underlying declarative model to create scalable pan/zoom visualizations. Finally, we provide a gallery of visualizations and show that Kyrix is expressive and flexible in that it can support the developer in creating a wide range of customized visualizations across different application domains and data types.
... Facilitating data discovery and the analytic communication process, Tableau is designed to help users produce explanatory graphics and dynamic, interactive visualizations (Jones, 2014). Although Tableau essentially deploys VizQL, a language for describing tables, charts, graphs, maps, time series, and tables of visualizations (Hanrahan, 2006), the Tableau interface is user-friendly, such that simple clicks can accomplish complex tasks. Tableau has strong interactive functions and a mobile-friendly interface, but the inability to drag and move the map with the mouse is inconsistent with typical user experience. ...
Article
Full-text available
This article examines whether implementing visualizations on an institutional repository webpage increases traffic on the site. Two methods for creating visualizations to attract faculty and student interest were employed. The first is a map displaying usage of institutional repository content from around the world. This map uses Tableau software to display Google Analytics data. The second method is a text mining tool allowing users to generate word clouds from dissertation and thesis abstracts according to discipline and year of publication. The word cloud uses R programing language, the Shiny software package, and a text mining package called tm. Change in the number of institutional repository website sessions was analyzed through change-point analysis.
... Vega (https://vega.github.io/vega/) is a visualization grammar in a JSON format. VizQL [24], used by Tableau, is a visual query language that translates drag-and-drop actions into data queries and then expresses data visually. ...
... Automatic visualization tools synthesize graphical designs, which are abstract descriptions of visualizations (Fig. 1). For example, the underlying language for APT describes graphical techniques (e.g., color variation and position on axis) to encode information, whereas ShowMe [38] synthesizes encodings using VizQL [23]. Following CompassQL [65], Draco uses a logical representation of the Vega-Lite grammar [52]. ...
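The core VizQL idea these snippets reference, compiling a drag-and-drop shelf placement into a database query, can be caricatured in a few lines. This is NOT VizQL syntax; the shelf model, field names, and table are invented for illustration:

```python
# Toy shelf-to-SQL compiler: non-aggregate fields become GROUP BY
# dimensions, so the result has one row per mark. Purely illustrative.
def shelves_to_sql(columns_shelf, rows_shelf, table):
    fields = columns_shelf + rows_shelf
    dims = [f for f in fields if not f.startswith("SUM(")]
    sql = f"SELECT {', '.join(fields)} FROM {table}"
    if dims and dims != fields:  # mixed dimensions and aggregates
        sql += f" GROUP BY {', '.join(dims)}"
    return sql

# Dragging "region" to columns and "SUM(sales)" to rows yields a query
# whose result maps directly onto a bar per region.
sql = shelves_to_sql(["region"], ["SUM(sales)"], "orders")
```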
Article
Full-text available
There exists a gap between visualization design guidelines and their application in visualization tools. While empirical studies can provide design guidance, we lack a formal framework for representing design knowledge, integrating results across studies, and applying this knowledge in automated design tools that promote effective encodings and facilitate visual exploration. We propose modeling visualization design knowledge as a collection of constraints, in conjunction with a method to learn weights for soft constraints from experimental data. Using constraints, we can take theoretical design knowledge and express it in a concrete, extensible, and testable form: the resulting models can recommend visualization designs and can easily be augmented with additional constraints or updated weights. We implement our approach in Draco, a constraint-based system based on Answer Set Programming (ASP). We demonstrate how to construct increasingly sophisticated automated visualization design systems, including systems based on weights learned directly from the results of graphical perception experiments.
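The constraint-weighting idea in this abstract can be sketched without Answer Set Programming: score each candidate design by the summed weights of the soft constraints it violates, then pick the cheapest. The constraints, weights, and candidate designs below are illustrative only, not Draco's actual knowledge base:

```python
# Design knowledge as weighted soft constraints: lower total weight of
# violated constraints means a preferred design. Values are made up.
soft_constraints = {
    "quantitative_on_color": 3.0,  # prefer position over color for numbers
    "nominal_on_y": 1.0,
}

def cost(design):
    """Sum the weights of the soft constraints a design violates."""
    violated = []
    if design.get("color_type") == "quantitative":
        violated.append("quantitative_on_color")
    if design.get("y_type") == "nominal":
        violated.append("nominal_on_y")
    return sum(soft_constraints[v] for v in violated)

candidates = [
    {"y_type": "quantitative", "color_type": "nominal"},  # cost 0
    {"y_type": "nominal", "color_type": "quantitative"},  # cost 4.0
]
best = min(candidates, key=cost)
```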
... In 2003, Hanrahan revised Mackinlay's specifications into a declarative visual language known as VizQL (Hanrahan, 2006). It is a formal language for describing tables, charts, graphs, maps and time series. ...
Conference Paper
Full-text available
In today’s age, there are huge amounts of data being generated every second of every day. Through data visualization, humans can explore, analyse and present it. Choosing a suitable visualization for data is a difficult task, especially for non-experts. Current data visualization recommender systems exist to aid in choosing a visualization, yet suffer from issues such as low accessibility and indecisiveness. The aim of this study is to create a model for a data visualization recommender system for non-experts that resolves these issues. Based on existing work and a survey among data scientists, requirements for a new model were identified and implemented. The result is a question-based model that uses a decision tree and a data visualization classification hierarchy in order to recommend a visualization. Furthermore, it incorporates both task-driven and data characteristics-driven perspectives, whereas existing solutions seem to either convolute these or focus on one of the two exclusively. Based on testing against existing solutions, it is shown that the new model reaches similar results while being simpler, clearer, more versatile, extendable and transparent. The presented model can be applied in the development of new data visualization software or as part of a learning tool.
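The question-based decision-tree model described here can be sketched as a small nested structure walked by user answers; the questions and recommended chart types below are illustrative, not the paper's actual tree:

```python
# A tiny decision tree for chart recommendation: internal nodes ask a
# question, leaves name a chart type. Contents are invented examples.
TREE = {
    "question": "What is your task?",
    "answers": {
        "compare categories": "bar chart",
        "show change over time": "line chart",
        "show relationship": {
            "question": "How many quantitative variables?",
            "answers": {"two": "scatter plot", "three": "bubble chart"},
        },
    },
}

def recommend(node, answers):
    """Walk the tree using one user answer per question node."""
    for a in answers:
        node = node["answers"][a]
        if isinstance(node, str):  # reached a leaf
            return node
    return node

chart = recommend(TREE, ["show relationship", "two"])
```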
... Polaris [48] (now called Tableau) uses a table algebra drawn from Wilkinson's grammar of graphics. The table algebra of Polaris later evolved to VizQL [22], forming the underlying representation of Tableau visualizations. Wickham introduces ggplot2 [54], a widely-popular package in the R statistical language, based on Wilkinson's grammar. ...
Article
Full-text available
Rapidly creating effective visualizations using expressive grammars is challenging for users who have limited time and limited skills in statistics and data visualization. Even high-level, dedicated visualization tools often require users to manually select among data attributes, decide which transformations to apply, and specify mappings between visual encoding variables and raw or transformed attributes. In this paper, we introduce Data2Vis, a neural translation model, for automatically generating visualizations from given datasets. We formulate visualization generation as a sequence to sequence translation problem where data specification is mapped to a visualization specification in a declarative language (Vega-Lite). To this end, we train a multilayered Long Short-Term Memory (LSTM) model with attention on a corpus of visualization specifications. Qualitative results show that our model learns the vocabulary and syntax for a valid visualization specification, appropriate transformations (count, bins, mean) and how to use common data selection patterns that occur within data visualizations. Our model generates visualizations that are comparable to manually-created visualizations in a fraction of the time, with potential to learn more complex visualization strategies at scale.
Chapter
Data and information visualization involves creating clear and understandable visual representations of complex quantitative and qualitative data using static, dynamic, or interactive visuals. These visualizations are based on data from specific areas of expertise and are designed to help a wide audience explore and understand data structures, patterns, and relationships. Effective data visualization is accurate, simple, and visually appealing, using deliberate choices of shapes, colors, and other elements. New technologies like virtual and augmented reality can enhance the immersive and interactive nature of data visualization. The goal of data visualization is to present and explore non-physical data from various sources, differentiating it from scientific visualization, which focuses on rendering realistic images based on physical data.
Article
Various data visualization applications such as reverse engineering and interactive authoring require a vocabulary that describes the structure of visualization scenes and the procedure to manipulate them. A few scene abstractions have been proposed, but they are restricted to specific applications for a limited set of visualization types. A unified and expressive model of data visualization scenes for different applications has been missing. To fill this gap, we present Manipulable Semantic Components (MSC), a computational representation of data visualization scenes, to support applications in scene understanding and augmentation. MSC consists of two parts: a unified object model describing the structure of a visualization scene in terms of semantic components, and a set of operations to generate and modify the scene components. We demonstrate the benefits of MSC in three applications: visualization authoring, visualization deconstruction and reuse, and animation specification.
Article
Business Intelligence (BI) is crucial in modern enterprises and billion-dollar business. Traditionally, technical experts like database administrators would manually prepare BI-models (e.g., in star or snowflake schemas) that join tables in data warehouses, before less-technical business users can run analytics using end-user dashboarding tools. However, the popularity of self-service BI (e.g., Tableau and Power-BI) in recent years creates a strong demand for less technical end-users to build BI-models themselves. We develop an Auto-BI system that can accurately predict BI models given a set of input tables, using a principled graph-based optimization problem we propose called k-Min-Cost-Arborescence (k-MCA), which holistically considers both local join prediction and global schema-graph structures, leveraging a graph-theoretical structure called arborescence. While we prove k-MCA is intractable and inapproximable in general, we develop novel algorithms that can solve k-MCA optimally, which is shown to be efficient in practice with sub-second latency and can scale to the largest BI-models we encounter (with close to 100 tables). Auto-BI is rigorously evaluated on a unique dataset with over 100K real BI models we harvested, as well as on 4 popular TPC benchmarks. It is shown to be both efficient and accurate, achieving over 0.9 F1-score on both real and synthetic benchmarks.
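The k-MCA framing in this abstract (tables as nodes, candidate joins as weighted directed edges, a cheapest tree rooted at a chosen fact table) can be illustrated with a toy brute force for k = 1. Auto-BI's actual algorithms are far more sophisticated; the tables and join costs below are made up:

```python
# Toy brute-force minimum-cost arborescence: enumerate every choice of
# parent per non-root node, reject cyclic assignments, keep the cheapest.
# Exponential, so only sensible for a handful of tables.
from itertools import product

def min_cost_arborescence(nodes, edges, root):
    """edges: {(parent, child): cost}. Returns (cost, parent_map) or None."""
    others = [n for n in nodes if n != root]
    options = [[p for p in nodes if (p, c) in edges] for c in others]
    best = None
    for parents in product(*options):
        parent_of = dict(zip(others, parents))
        ok = True
        for n in others:  # every node must reach the root without a cycle
            seen, cur = set(), n
            while cur != root:
                if cur in seen:
                    ok = False
                    break
                seen.add(cur)
                cur = parent_of[cur]
            if not ok:
                break
        if not ok:
            continue
        total = sum(edges[(parent_of[c], c)] for c in others)
        if best is None or total < best[0]:
            best = (total, parent_of)
    return best

nodes = ["orders", "customers", "products"]
edges = {
    ("orders", "customers"): 1.0,
    ("orders", "products"): 2.0,
    ("customers", "products"): 1.5,
}
best = min_cost_arborescence(nodes, edges, "orders")
```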
Article
Full-text available
Safety has become the primary concern of the air transportation system, largely due to increasing air traffic throughout the world. Regulatory bodies maintain enormous repositories of aviation accident data. These historical data are highly complex, with many temporal and geographical components and multiple variables, so a user-friendly, GUI-based system is needed to analyze them. This article proposes an intelligent vision-based decision-making system for exploring past aviation accident and incident datasets. The proposed visual query-based model can analyze the major factors behind unsafe events, such as flight phases, human factors, weather conditions, and faulty components in particular aircraft models, events that may claim the lives of passengers and crew. The model enables users to express "what" visuals should be created instead of "how" to create them. Case studies conducted through visual queries show that the system can improve the situational awareness of crew members, air traffic controllers, and aviation authorities regarding flight conditions, so that they can make timely decisions and determine what training staff need to reduce the consequences of such accidents and incidents.
Preprint
Full-text available
There has been substantial growth in the use of JSON-based grammars, as well as other standard data serialization languages, to create visualizations. Each of these grammars serves a purpose: some focus on particular computational tasks (such as animation), some are concerned with certain chart types (such as maps), and some target specific data domains (such as ML). Despite the prominence of this interface form, there has been little detailed analysis of the characteristics of these languages. In this study, we survey and analyze the design and implementation of 57 JSON-style DSLs for visualization. We analyze these languages supported by a collected corpus of examples for each DSL (consisting of 4395 instances) across a variety of axes organized into concerns related to domain, conceptual model, language relationships, affordances, and general practicalities. We identify tensions throughout these areas, such as between formal and colloquial specifications, among types of users, and within the composition of languages. Through this work, we seek to support language implementers by elucidating the choices, opportunities, and tradeoffs in visualization DSL design.
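The JSON-style specifications this survey analyzes typically describe a chart as a mark plus encoding channels. A minimal example in the Vega-Lite style (a simplified spec for illustration, with a toy structural check standing in for the schema validation real tools perform) can be sketched as:

```python
import json

# A minimal bar-chart spec in the "mark"/"encoding" style used by
# grammars such as Vega-Lite (simplified here for illustration).
spec_text = """
{
  "mark": "bar",
  "encoding": {
    "x": {"field": "category", "type": "nominal"},
    "y": {"field": "amount", "type": "quantitative"}
  }
}
"""

def validate(spec):
    """Toy structural check: the spec has a mark, and every encoding
    channel names both a data field and a data type."""
    assert "mark" in spec and "encoding" in spec
    for channel, enc in spec["encoding"].items():
        assert {"field", "type"} <= enc.keys(), f"channel {channel} incomplete"
    return True

spec = json.loads(spec_text)
validate(spec)
```

The tension between formal and colloquial specification that the survey identifies shows up even here: how strictly such a validator rejects near-miss specs is a core design choice for each DSL.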
Book
The noble way to substantiate decisions that affect many people is to ask these people for their opinions. For governments that run whole countries, this means asking all citizens for their views to consider their situations and needs. Organizations such as Africa's Voices Foundation, which want to facilitate communication between decision-makers and citizens of a country, have difficulty mediating between these groups. To enable understanding, statements need to be summarized and visualized. Accomplishing these goals in a way that does justice to the citizens' voices and situations proves challenging. Standard charts do not help this cause as they fail to create empathy for the people behind their graphical abstractions. Furthermore, these charts do not create trust in the data they are representing, as there is no way to see or navigate back to the underlying code and the original data. To fulfill these functions, visualizations would highly benefit from interactions to explore the displayed data, which standard charts often provide only to a limited extent. To help improve the understanding of people's voices, we developed and categorized 80 ideas for new visualizations, new interactions, and better connections between different charts, which we present in this report. From those ideas, we implemented 10 prototypes and two systems that integrate different visualizations. We show that this integration allows consistent appearance and behavior of visualizations. The visualizations all share the same main concept: representing each individual with a single dot. To realize this idea, we discuss technologies that efficiently allow the rendering of a large number of these dots. With these visualizations, direct interactions with representations of individuals are achievable by clicking on them or by dragging a selection around them. This direct interaction is only possible with a bidirectional connection from the visualization to the data it displays.
We discuss different strategies for bidirectional mappings and the trade-offs involved. Having unified behavior across visualizations enhances exploration; for our prototypes, that includes grouping, filtering, highlighting, and coloring of dots. Our prototyping work was enabled by the development environment Lively4, and we explain which parts of Lively4 facilitated our prototyping process. Finally, we evaluate our approach against the domain problems and assess the visualization concepts we developed. Our work provides inspiration and a starting point for visualization development in this domain. Our visualizations can improve communication between citizens and their government and motivate empathetic decisions. Our approach, combining low-level entities to create visualizations, provides value to an explorative and empathetic workflow. We show that the design space for visualizing this kind of data has a lot of potential and that it is possible to combine qualitative and quantitative approaches to data analysis.
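One strategy for the bidirectional record-to-dot mapping described above is to maintain an index in each direction, so a click resolves to records and a data-side selection resolves to dots. A minimal sketch (names, layout, and hit-testing are illustrative assumptions, not the report's actual implementation):

```python
from dataclasses import dataclass, field

@dataclass
class DotPlot:
    """One-dot-per-record view with a bidirectional record <-> dot mapping."""
    records: list
    dot_of: dict = field(default_factory=dict)     # record index -> dot
    record_of: dict = field(default_factory=dict)  # id(dot) -> record index

    def layout(self):
        """Place one dot per record (trivial row layout for illustration)."""
        for i, rec in enumerate(self.records):
            dot = {"x": 10 * i, "y": 0, "record": i}
            self.dot_of[i] = dot
            self.record_of[id(dot)] = i

    def pick(self, x, y, radius=5):
        """Resolve a click position back to the records whose dots it hits."""
        hits = [d for d in self.dot_of.values()
                if abs(d["x"] - x) <= radius and abs(d["y"] - y) <= radius]
        return [self.records[self.record_of[id(d)]] for d in hits]
```

With both indexes in place, grouping, filtering, and highlighting can all be expressed as operations on record indices and pushed to the dots, which is what keeps behavior consistent across different visualizations.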
Article
In this work, we present a self-driving data visualization system, called DeepEye , that automatically generates and recommends visualizations based on the idea of visualization by examples. We propose effective visualization recognition techniques to decide which visualizations are meaningful and visualization ranking techniques to rank the good ones. Furthermore, a main challenge for automatic visualization systems is that users may be misled by blindly suggested visualizations that ignore the user's intent. To this end, we extend DeepEye to be easily steerable by allowing the user to use keyword search and providing click-based faceted navigation . Empirical results, using real-life data and use cases, verify the power of our proposed system.
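Visualization ranking of the kind described here can be caricatured as scoring candidate charts and sorting them. The heuristic below is a toy stand-in (it is not DeepEye's actual learned ranking model; the affinity table and field names are invented for illustration):

```python
def rank_candidates(candidates):
    """Toy visualization ranking: prefer marks that suit the x-axis data
    type, breaking ties in favor of charts that encode fewer columns."""
    affinity = {("temporal", "line"): 2,
                ("nominal", "bar"): 2,
                ("quantitative", "scatter"): 2}

    def score(c):
        return (affinity.get((c["x_type"], c["mark"]), 0), -c["n_columns"])

    return sorted(candidates, key=score, reverse=True)
```

A real system would learn such preferences from labeled good/bad charts rather than hard-coding them, but the interface is the same: candidates in, ordered recommendations out.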
Article
In the last several years, large multidimensional databases have become common in a variety of applications, such as data warehousing and scientific computing. Analysis and exploration tasks place significant demands on the interfaces to these databases. Because of the size of the data sets, dense graphical representations are more effective for exploration than spreadsheets and charts. Furthermore, because of the exploratory nature of the analysis, it must be possible for analysts to change visualizations rapidly as they pursue a cycle of hypothesis and experimentation. In this paper, we present Polaris, an interface for exploring large multidimensional databases that extends the well-known pivot table interface. The novel features of Polaris include an interface for constructing visual specifications of table-based graphical displays and the ability to generate a precise set of relational queries from the visual specifications. The visual specifications can be rapidly and incrementally developed, giving the analyst visual feedback as they construct complex queries and visualizations.
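The translation from visual specification to relational query that Polaris pioneered, and that VizQL generalizes, can be illustrated with a deliberately tiny sketch: one categorical shelf becomes a GROUP BY column and one quantitative shelf becomes an aggregate (function and table names here are illustrative, not the actual compilation scheme):

```python
def spec_to_sql(table, x, y, agg="SUM"):
    """Toy translation of a shelf-style visual specification to SQL:
    a categorical x shelf groups the rows, and a quantitative y shelf
    is aggregated once per x value."""
    return (f"SELECT {x}, {agg}({y}) AS {y}_{agg.lower()} "
            f"FROM {table} GROUP BY {x}")
```

The real compilation handles nested table layouts, multiple panes, and sorting, but the core idea is the same: the declarative picture description determines the queries, so the analyst never writes SQL directly.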