Article

The Use of Faces to Represent Points in K-Dimensional Space Graphically

Authors:
Herman Chernoff

Abstract

A novel method of representing multivariate data is presented. Each point in k-dimensional space, k≤18, is represented by a cartoon of a face whose features, such as length of nose and curvature of mouth, correspond to components of the point. Thus every multivariate observation is visualized as a computer-drawn face. This presentation makes it easy for the human mind to grasp many of the essential regularities and irregularities present in the data. Other graphical representations are described briefly.
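To make the mapping concrete, here is a minimal sketch in Python with matplotlib of a face glyph whose features are driven by data components. The particular feature assignments (head width and height, eye size, nose length, mouth curvature) are illustrative assumptions, not Chernoff's original 18-feature parameterization.

    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.patches import Ellipse

    def face(ax, x):
        """Draw one face glyph; x holds five components scaled to [0, 1]."""
        w = 0.6 + 0.4 * x[0]              # head width
        h = 0.6 + 0.4 * x[1]              # head height
        eye = 0.03 + 0.07 * x[2]          # eye radius
        nose = 0.1 + 0.2 * x[3]           # nose length
        curve = 2 * x[4] - 1              # mouth curvature in [-1, 1]
        ax.add_patch(Ellipse((0, 0), w, h, fill=False))
        for side in (-1, 1):              # two eyes, symmetric about the axis
            ax.add_patch(Ellipse((side * w / 4, h / 6), 2 * eye, 2 * eye))
        ax.plot([0, 0], [h / 12, h / 12 - nose], 'k')   # nose
        t = np.linspace(-0.25, 0.25, 50)  # mouth: a parabola whose bend tracks x[4]
        ax.plot(t * w, -h / 4 + curve * 0.15 * (1 - (t / 0.25) ** 2), 'k')
        ax.set_xlim(-0.6, 0.6); ax.set_ylim(-0.6, 0.6)
        ax.set_aspect('equal'); ax.axis('off')

    rng = np.random.default_rng(0)
    data = rng.random((6, 5))             # six observations in 5-D
    fig, axes = plt.subplots(1, 6, figsize=(12, 2))
    for ax, row in zip(axes, data):
        face(ax, row)
    plt.show()

Each row of data yields one face, so similar observations produce visibly similar faces, which is the property the abstract highlights.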


... In the context of information visualisation, data glyphs are used for the representation of multidimensional data (Chernoff, 1973). Glyphs can be described as composite graphical objects that use their visual and geometric attributes to encode multidimensional data by mapping each dimension of a data point to the marks of a glyph (Anderson, 1957; Wittenbrink, Pang, and Lodha, 1996). ...
... In this chapter, we base our work on one type of glyph: Chernoff faces (Chernoff, 1973). This type of glyph encodes multidimensional data using facial features such as the length of the nose, the orientation of the eyebrows, and the shape of the mouth, among others. ...
... In addition, some authors justify the usage of certain glyph designs with reasons related to human aptitudes, such as the ability to easily recognise faces, e.g. face glyphs (Chernoff, 1973), or to visually discriminate natural shapes, e.g. leaf glyphs (Fuchs, Weiler, and Schreck, 2015). ...
Thesis
The visual representation of concepts has been the focus of multiple studies throughout history and is considered to be behind the origin of existing writing systems. Its exploration has led to the development of several visual language systems and is a core part of graphic design assignments, such as icon design. As is the case with problems from other fields, the visual representation of concepts has also been addressed using computational approaches. In this thesis, we focus on the computational generation of visual symbols to represent concepts, specifically through the use of blending. We started by studying aspects related to the transformation mechanisms used in the visual blending process, which led to the proposal of a visual blending taxonomy that can be used in the study and production of visual blends. In addition to the study of visual blending, we conceived and implemented several systems: a system for the automatic generation of visual blends using a descriptive approach, with which we conducted an experiment with three concepts (pig, angel and cactus); a visual blending system based on the combination of emoji, which we called Emojinating; and a system for the generation of flags, which we called Moody Flags. The experimental results obtained through multiple user studies indicate that the systems that we developed are able to represent abstract concepts, which can be useful in ideation activities and for visualisation purposes. Overall, the purpose of our study is to explore how the representation of concepts can be done through visual blending. We established that visual blending should be grounded on the conceptual level, leading to what we refer to as Visual Conceptual Blending. We delineated a roadmap for the implementation of visual conceptual blending and described resources that can help in such a venture, as is the case of a categorisation of emoji oriented towards visual blending.
... [Excerpt of a comparison table rating visualization approaches against five criteria:]
Cluster analysis [5]: + + − − −
Mapping into low-dimensional spaces (e.g., [12]): + + − − −
Dashboards (e.g., [13]): + − + + +
Radar chart diagrams and parallel coordinates plots [10]: + + + + −
Data glyphs [14]: + + + + −
Chernoff faces [15]: + + + + −
Cognitive maps (e.g., [16,17]) ...
... Cognitive visualization helps the decision maker quickly interpret and assess the state of a complex system (or a set of systems) using figurative thinking. Chernoff showed [15] that a human recognizes differences between familiar objects that were present during human evolution, such as human faces, much more easily than differences between abstract images. The original Chernoff faces method maps data about a complex, multiparameter system onto a human face, which has a symmetry axis and allows displaying quantitative and qualitative parameters (up to 18 parameters when vertical symmetry is kept and 36 parameters without it [18]). ...
Article
Full-text available
The article discusses the problem of visualization of complex multiparameter systems, defined by datasets on their structure, functional structure, and activity in the form of complex graphs and transition of traditional representation of the data acquired by graph mining to a compact image built by pictographic methods. In these situations, we propose using the Unified Graphic Visualization of Activity (UGVA) method for data concentration and structuring. The UGVA method allows coding in an anthropomorphic image of elements of graphs with data on structural and functional features of systems and overlaying these images with the data on the system’s activity using coloring and artifacts. The image can be composed in different ways: it can include the zone of integral evaluation parameters, segmented data axes of five types, and four types of symmetry. We describe the method of creating UGVA images, which consists of 13 stages: the parametric model is represented as a structural image that is converted to a basic image that is then detailed into the particular image by defining geometric parameters of the primitives and to the individualized image with the data about a particular object. We show how the individualized image can be overlaid with the operative data as color coding and artifacts and describe the principles of interpreting UGVA images. This allows solving tasks of evaluation, comparison, and monitoring of complex multiparameter systems by showing the decision-maker an anthropomorphic image instead of the graph. We describe a case study of using the UGVA method for visualization of data about an educational process: curricula and graduate students, including the data mined from the university’s learning management system at the Siberian Federal University for students majoring in “informatics and computing”. The case study demonstrates all stages of image synthesis and examples of their interpretation for situation assessment, monitoring, and comparison of students and curricula. It allowed for finding problematic moments in learning for individual students and their entire group by analyzing the development of their competence profiles and formulating recommendations for further learning. The effectiveness of the resulting images is compared to the other approaches: elastic maps and Chernoff faces. We discuss using graph mining to generate learning problems in order to lessen the workload of gathering raw data for the UGVA method and provide general recommendations for using the UGVA method based on our experience of supporting decision making.
... One of the picture charts, i.e. Chernoff's faces [36], will be used to illustrate the relationship between the numbers of occurrences of all types of nonconformities in the examined months of the year. In this method, individual variables are reflected by different features of the human face [36]. ...
... Chernoff's faces [36] will be used to illustrate the relationship between the numbers of occurrences of all types of nonconformities in the examined months of the year. In this method, individual variables are reflected by different features of the human face [36]. The Pearson correlation coefficient and the correlation matrix will be used to determine the degree of dependence between the numbers of occurrences of particular types of nonconformities. ...
Article
Full-text available
The article presents the results of a multidimensional analysis of 'Behaton' type paving stone nonconformities, aimed at improving the production process and the quality of the final product. Statistical tools, including SPC tools and quality tools, both basic and new, were used to analyse nonconformities in a spatial-temporal system, i.e. according to the type of nonconformity and according to the examined months. The purpose of using the data analysis tools was to thoroughly analyse the cases of nonconformities of the tested product, obtain information on the structure of these nonconformities across the various terms, and obtain information on the stability and predictability of the numerical structure of nonconformities over time. Potential causes influencing a large percentage of paving stone defects were identified, factors and variables influencing the most frequently occurring nonconformities were determined, and improvement actions were proposed. As a result of the multidimensional and multifaceted analyses of paving stone nonconformities, it was shown that the structure of nonconformities contained cases that were unusual in terms of the number of occurrences, and a lack of stability in the number of nonconformities across the examined months was proven. Three critical nonconformities of the tested product were identified: side surface defects, vertical edge defects, and scratches and cracks. The most important factor causing a large percentage of nonconformities was the time of shaking and vibrating the concrete, which was significantly related to the technical condition of the machines, and the most important reason for a large percentage of paving stone nonconformities was the lack of efficient maintenance. Machine, method, and man turned out to be the most important categories of problem factors, and specific remedial actions were proposed. This multidimensional look at the structure of paving stone nonconformities, as well as the factors and causes behind them, provided much valuable information for the management staff of the analysed company, making it possible to improve the production process and the quality of the final product.
... However, when using human-looking glyphs to display developer-related data, various visual attributes become critical and should be considered very carefully. One idea to overcome this issue would be the use of abstract forms that are merely reminiscent of human faces, as initially presented in the popular Chernoff faces [8]. ...
... We applied the city, the forest, and the island metaphor for our use-cases. Our ...
    "baseModel": "Cylinder_Ax.001",
    "variants": [
      { "name": "Cylinder_Ax.001", "color": 1.0 },
      { "name": "Cylinder_Ax.002", "color": 0.75 },
      { "name": "Cylinder_Ax.003", "color": 0.5 },
      { "name": "Cylinder_Ax.004", "color": 0.25 },
      { "name": "Cylinder_Ax.005", "color": 0.0 } ...
Chapter
Full-text available
For various program comprehension tasks, software visualization techniques can be beneficial by displaying aspects related to the behavior, structure, or evolution of software. In many cases, the question is related to the semantics of the source code files, e.g., the localization of files that implement specific features or the detection of files with similar semantics. This work presents a general software visualization technique for source code documents, which uses 3D glyphs placed on a two-dimensional reference plane. The relative positions of the glyphs capture their semantic relatedness. Our layout originates from applying Latent Dirichlet Allocation and Multidimensional Scaling to the comments and identifier names found in the source code files. Though different variants of 3D glyphs can be applied, we focus on cylinders, trees, and avatars. We discuss various mappings of data associated with source code documents to the visual variables of 3D glyphs for selected use cases and provide details on our visualization system.
... Another data visualization tool was invented fifty years ago to display multivariate data using a coding that produced cartoon faces [Chernoff, 1973]. Because a large portion of the human brain is reserved for processing facial features, scholars believed that this presentation made it easier for humans to understand data. ...
... Chernoff reduced the multiple complexities of faces to a few characteristics like length of the nose and curvature of the mouth. To extend his method, Chernoff [1973] suggested "adding ears, hair, [and] facial lines." The abstract of the paper uses the term "cartoon," and humans have evolved to read people, not cartoons. ...
Preprint
Full-text available
Human-computer interaction relies on mouse/touchpad, keyboard, and screen, but tools have recently been developed that engage sound, smell, touch, muscular resistance, voice dialogue, balance, and multiple senses at once. How might these improvements impact upon the practice of statistics and data science? People with low vision may be better able to grasp and explore data. More generally, methods developed to enable this have the potential to allow sighted people to use more senses and become better analysts. We would like to adapt some of the wide range of available computer and sensory input/output technologies to transform data science workflows. Here is a vision of what this synthesis might accomplish.
... In order to answer question (5), a picture graph of the Chernoff faces type (Chernoff, 1973) was used. In this chart, the quarters of the year are visualized by four faces in such a way that the relative numbers of products complained about by customers, for each complaint cause in those quarters, are represented by a different size or position of different elements of the human face. ...
Article
Full-text available
The article presents the results of an analysis of cardboard packaging complaints based on selected quality tools and statistical tools, for the purpose of a rough assessment of the effectiveness of corrective and preventive actions taken by the surveyed company and for predictive purposes. The analysis was performed over two research periods (one year and its quarters) and from the point of view of total complaints and external (customer) complaints. Data on the number of products complained about, as well as the financial losses incurred by the company on this account, were analysed. The article presents the potential of both classic quality tools and statistical tools for in-depth analysis of complaint data, for predictive purposes, and for subsequent risk analysis. The critical complaint was identified: complaint code 403, overprint. The number of complained-about products to be expected in the next quarter of the new year was determined. The article shows that the corrective and preventive actions taken by the company have not yet brought the expected result in the form of a reduction in the number of products complained about by customers during the quarters surveyed.
... The variables were standardized to prevent one variable from "dominating" the distance calculations and the clustering process because its values are (owing to scale) thousands of times larger than those of the other variables. Chernoff (1971, 1973) proposed the use of a human face to represent points in k dimensions, assigning each observation unit a face so that the position, length, and shape of each facial component (eyes, eyebrows, nose, mouth, ears, hair, etc.) reflects the behavior of one of the variables in the study. The range of variability is set so that the overall structure ... In the STATA software, the chernoff command was applied first with the specifications of the facial features, followed by the saveall option to save the faces, and then the graph combine command was applied to define the clusters suggested by the dendrogram. ...
Article
Objective: To introduce public health researchers to some multivariate statistical graphics, such as Chernoff faces, star plots, Andrews curves, and radar charts. Material and method: The proposed multivariate statistical graphics are applied to describe the public health systems of Chile in 2010 (n=29) on the basis of quantitative information of dimension p=7; they are produced using standard statistical software such as R, STATA, and SAS. Results: The description of the 29 public health systems of Chile based on seven-dimensional quantitative information, represented in each of the multivariate graphics shown, makes it possible to identify clusters, standards, and trends. Conclusions: The availability of efficient statistical software suggests complementing the indispensable strategies of exploratory analysis with graphical representations suited to the multidimensional context, which can help achieve a better understanding of the problem of interest. The graphical synthesis generated by applying multivariate statistical graphics to describe the public health systems of Chile complements our knowledge of this aspect of the national reality; it may suggest some hypotheses, refute others, and help in the interpretation of complex results. Naturally, this extends to many other situations of interest to the public health researcher.
... Scatterplots typically use a small circle or dot as a visual representation for items, but many variations exist that use glyph shapes to convey multidimensional variables [27,39,5,8,7]. However, in normalized mode, sometimes the aspect ratio of visual marks changes according to the aspect ratio of the space assigned to that value. ...
Preprint
Full-text available
Scatterplots are a common tool for exploring multidimensional datasets, especially in the form of scatterplot matrices (SPLOMs). However, scatterplots suffer from overplotting when categorical variables are mapped to one or two axes, or the same continuous variable is used for both axes. Previous methods such as histograms or violin plots use aggregation, which makes brushing and linking difficult. To address this, we propose gatherplots, an extension of scatterplots to manage the overplotting problem. Gatherplots are a form of unit visualization, which avoid aggregation and maintain the identity of individual objects to ease visual perception. In gatherplots, every visual mark that maps to the same position coalesces to form a packed entity, thereby making it easier to see the overview of data groupings. The size and aspect ratio of marks can also be changed dynamically to make it easier to compare the composition of different groups. In the case of a categorical variable vs. a categorical variable, we propose a heuristic to decide bin sizes for optimal space usage. To validate our work, we conducted a crowdsourced user study that shows that gatherplots enable people to assess data distribution more quickly and more correctly than when using jittered scatterplots.
... Anderson's Irises resulted from his long experience of working out relevant models to describe changes in specific populations by means of a limited number of characteristics. Yet Anderson had also tackled the opposite problem, building simple representations of multi-dimensional data, 40 years before Chernoff faces appeared (Chernoff, 1973). ...
Article
Full-text available
The article supports the need for training techniques for neural network computer simulations in a spreadsheet context. Their use in simulating artificial neural networks is systematically reviewed. The authors distinguish between fundamental approaches to neural network computer simulation training in the spreadsheet environment: joint application of spreadsheets and tools for neural network simulation, application of third-party add-ins to spreadsheets, development of macros using the embedded languages of spreadsheets, use of standard spreadsheet add-ins for non-linear optimization, and creation of neural networks in the spreadsheet environment without add-ins. The article then discusses methods for creating neural network models in Google Sheets, a cloud-based spreadsheet. The classification of multidimensional data presented in R. A. Fisher's "The Use of Multiple Measurements in Taxonomic Problems" served as the model's primary inspiration. Various idiosyncrasies of data selection are discussed, as well as Edgar Anderson's part in the collection and preparation of the data in the 1920s and 1930s. The approach of displaying multi-dimensional data in the form of an ideograph, created by Anderson and regarded as one of the first effective methods of data visualization, is also discussed.
... In a second block, we compile graphical alternatives that are less frequently used, either because they are novel (Wallace and Karra 2020) or because they remain little explored (Chernoff 1973). ...
... Previous studies have suggested that metaphors promote data comprehension [22,51]. A well-known example is Chernoff faces [10], which map each data value to one facial feature, such as the angle of the eyebrows or the size of the nose. In two later quantitative experiments, Flury et al. [20] and Jacob [26] found that face glyphs outperform other visual designs like polygons and digits. ...
Preprint
Full-text available
Glyph-based visualization achieves an impressive graphic design when associated with comprehensive visual metaphors, which help audiences effectively grasp the conveyed information by revealing data semantics. However, creating such metaphoric glyph-based visualization (MGV) is not an easy task, as it requires not only a deep understanding of data but also professional design skills. This paper proposes MetaGlyph, an automatic system for generating MGVs from a spreadsheet. To develop MetaGlyph, we first conduct a qualitative analysis to understand the design of current MGVs from the perspectives of metaphor embodiment and glyph design. Based on the results, we introduce a novel framework for generating MGVs through metaphoric image selection and MGV construction. Specifically, MetaGlyph automatically selects metaphors with corresponding images from online resources based on the input data semantics. We then integrate a Monte Carlo tree search algorithm that explores the design of an MGV by associating visual elements with data dimensions, given the data importance, semantic relevance, and glyph non-overlap. The system also provides editing feedback that allows users to customize the MGVs according to their design preferences. We demonstrate the use of MetaGlyph through a set of examples and a usage scenario, and validate its effectiveness through a series of expert interviews.
... The visualization of multivariate data is a relevant challenge. Examples such as Chernoff faces [Chernoff 1973], glyphs similar to stick figures [Pickett and Grinstein 1988], the use of parameterized naturalistic textures [Interrante 2000], and simulations of impressionist paintings in which the characteristics of brushstrokes are set according to the values of associated variables [Tateosian et al. 2007] show different ways of dealing with the theme. A common problem for all of them is the difficulty of representing quantification, as already discussed by Bertin [Bertin 1980]. ...
Article
This paper describes an online interactive thematic map for simultaneously visualizing up to three scalar variables, with support for data filtering, panning, and zooming across levels of detail. The visual encoding of the map mixes the use of colors and textures as well as simple operations like border detection and intersection identification. The user experience is enhanced by queries posed through manipulation tools that produce instant visual feedback. This is made possible by the high rendering rates the system achieves through GPU programming, which assembles and manipulates previously rasterized tiles with location information recorded in the color space of pixels. This procedure allows the implementation of interactive animated actions and spatial data decomposition.
... 5.6, so-called Chernoff faces (Chernoff 1973) are used in Fig. 5.7. While in Fig. ...
Chapter
In Chap. 2, various risks and hazards are introduced in the form of examples and figures. Long-lasting processes and short-term events are mixed together, and the assignments partly overlap.
... Moreover, such elements can use compounds of visual properties and other elements to represent multiple properties simultaneously, such as glyphs. An early example of this is Chernoff faces, which represented living conditions in Los Angeles by using faces in which variables are mapped to different eyes, mouths, face shapes, and colors [40]. However, the same concept can be applied with higher levels of abstraction, such as the symbols proposed by Dunne and Shneiderman, which represent common sub-structures in graphs through different shapes and colors in order to simplify complex networks [41]. ...
Article
Full-text available
Many fields of study still face the challenges inherent to the analysis of complex multidimensional datasets, such as the field of computational biology, whose research on infectious diseases must contend with large protein-protein interaction networks with thousands of genes that vary in expression values over time. In this paper, we explore the visualization of multivariate data through CroP, a data visualization tool with a coordinated multiple views framework in which users can adapt the workspace to different problems through flexible panels. In particular, we focus on the visualization of relational and temporal data, the latter being represented through layouts that distort timelines to represent the fluctuations of values across complex datasets, creating visualizations that highlight significant events and patterns. Moreover, CroP provides various layouts and functionalities to not only highlight relationships between different variables, but also drill down into discovered patterns in order to better understand their sources and their effects. These methods are demonstrated through multiple experiments with diverse multivariate datasets, with a focus on gene expression time-series datasets. In addition to a discussion of our results, we also validate CroP through model and interface tests performed with participants from the fields of information visualization and computational biology.
... Faces may be processed differently by children and adults (for a review, see Nakabayashi & Liu, 2014). For example, when evaluating angry expressions of line-drawn faces (i.e., Chernoff's faces; Chernoff, 1973), children rated them as angrier than adults did (Tsurusawa et al., 2008). Therefore, children might be more sensitive than adults to the expressions in face drawings. ...
Article
The allocation of attention is affected by internal emotional states, such as anxiety and depression. The attention captured by real images of negative faces can be quantified by emotional probe tasks. The present study investigated whether attentional bias toward drawings of negative faces (line drawings and cartoon faces) differs from that toward real faces. Non-clinical university students indicated their levels of anxiety and depression via self-report questionnaires and completed a probe discrimination task under three face image conditions in a between-participants design. Significant correlations were found between bias scores and self-reported BDI-II scores under the real face condition. However, bias scores for the two types of face drawings were only weakly correlated with self-report scores. In our probe task investigating attentional bias to facial stimuli in nonclinical adults, the relationship between depression and attentional bias to negative faces was stronger for real faces than for face drawings.
... We feel our experiments would not be complete without results on real-world data. With this in mind we downloaded 27 data sets from the popular UCI Machine Learning repository [43], and sourced the Chernoff Fossil data set from [44]. This provides us with 28 real-world data sets with varying characteristics which are detailed in Table 2. ...
Article
Full-text available
The k-means clustering algorithm, whilst widely popular, is not without its drawbacks. In this paper, we focus on the sensitivity of k-means to its initial set of centroids. Since the cluster recovery performance of k-means can be improved by better initialisation, numerous algorithms have been proposed aiming at producing good initial centroids. However, it is still unclear which algorithm should be used in any particular clustering scenario. With this in mind, we compare 17 such algorithms on 6,000 synthetic and 28 real-world data sets. The synthetic data sets were produced under different configurations, allowing us to show which algorithm excels in each scenario. Hence, the results of our experiments can be particularly useful for those considering k-means for a non-trivial clustering scenario.
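To see the sensitivity the abstract describes in miniature, the sketch below (an illustration using scikit-learn, not the paper's experimental setup) runs k-means from several single random initialisations and compares the resulting inertia; the dataset and parameter choices are assumptions for demonstration.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs

    # Synthetic data with five clusters; recovery depends on where k-means starts.
    X, _ = make_blobs(n_samples=500, centers=5, cluster_std=2.0, random_state=42)

    for seed in range(5):
        # n_init=1 disables the usual multi-restart safeguard, so each run
        # reflects a single random initialisation.
        km = KMeans(n_clusters=5, init='random', n_init=1, random_state=seed).fit(X)
        print(f"seed={seed}  inertia={km.inertia_:.1f}")

    # A smarter initialisation (k-means++) typically reaches lower inertia.
    km_pp = KMeans(n_clusters=5, init='k-means++', n_init=1, random_state=0).fit(X)
    print(f"k-means++  inertia={km_pp.inertia_:.1f}")

Runs whose inertia differs noticeably have converged to different local minima, which is exactly why the choice among initialisation algorithms matters.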
... Anderson's Irises resulted from his long experience of working out relevant models to describe changes in specific populations by means of a limited number of characteristics. Yet Anderson had also tackled the opposite problem, building simple representations of multi-dimensional data, 40 years before Chernoff faces appeared [5]. 2. The described methods of applying cloud-oriented spreadsheets as tools for training in mathematical informatics can enable the solution of all basic problems of neural network simulation. ...
Conference Paper
Full-text available
The authors of the given article continue the series presented by the 2018 paper "Computer Simulation of Neural Networks Using Spreadsheets: The Dawn of the Age of Camelot". This time, they consider mathematical informatics as the basis of the fundamentalization of higher engineering education. Mathematical informatics deals with smart simulation, information security, long-term data storage and big data management, artificial intelligence systems, etc. The authors suggest studying the basic principles of mathematical informatics by applying cloud-oriented means of various levels, including those traditionally considered supplementary: spreadsheets. The article considers ways of building neural network models in a cloud-oriented spreadsheet, Google Sheets. The model is based on the problem of classifying multi-dimensional data provided in "The Use of Multiple Measurements in Taxonomic Problems" by R. A. Fisher. Edgar Anderson's role in collecting and preparing the data in the 1920s-1930s is discussed, as well as some peculiarities of data selection. Data are also presented on the method of multi-dimensional data display in the form of an ideograph, developed by Anderson and considered one of the first efficient ways of data visualization.
... Similar to the Stiff diagram, the Chernoff face (Chernoff 1973) draws cartoon faces as an interesting visualization for a small number of water samples with different geochemical compositions (Figure 1e). The physical and chemical parameter values of each water sample are scaled (milliequivalent percentages of major ions in WQChartPy) so that the parameter values are mapped to variations of specific facial features (e.g., separation of eyes, length of nose, and curvature of mouth) in schematic cartoon faces. ...
Article
Full-text available
Graphical methods have been widely used for the visualization, classification, and interpretation of aqueous geochemical data to obtain a better understanding of surface and subsurface hydrologic systems. This method note presents WQChartPy, an open-source Python package developed to plot a total of twelve diagrams for the analysis of aqueous geochemical data. WQChartPy can handle various data formats including Microsoft Excel, comma-separated values (CSV), and general delimited text. The twelve diagrams include eight traditional diagrams (trilinear Piper, Durov, Stiff, Chernoff face, Schoeller, Gibbs, Chadha, and Gaillardet) and four recently proposed diagrams (rectangle Piper, color-coded Piper, contour-filled Piper, and HFE-D) that have not been implemented in existing graphing software. The diagrams generated by WQChartPy can be saved as portable network graphics (PNG), scalable vector graphics (SVG), or portable document format (PDF) files for scientific publications. Jupyter and Google Colab notebooks are available online to illustrate how to use WQChartPy with example datasets. The geochemical diagrams can be generated with a few lines of Python code. Source codes of WQChartPy are publicly available at GitHub (https://github.com/jyangfsu/WQChartPy) and PyPI (https://pypi.org/project/wqchartpy/).
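As a sketch of what "a few lines of Python code" might look like here, the snippet below follows the usage pattern suggested by the abstract; the module name, the expected DataFrame columns, and the keyword arguments are assumptions that should be checked against the package documentation at the GitHub link above.

    import pandas as pd
    from wqchartpy import triangle_piper  # assumed module name for the Piper diagram

    # Hypothetical sample table; the package expects major-ion concentrations
    # plus per-sample plotting metadata (exact column names assumed here).
    df = pd.read_csv('water_samples.csv')  # e.g., Ca, Mg, Na, K, HCO3, Cl, SO4, ...

    # Render a trilinear Piper diagram and save it to disk.
    triangle_piper.plot(df, unit='mg/L', figname='piper_diagram', figformat='png')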
... Class II: moderate retail level where: (Ward), and the iterative-indexing method (k-means) (Rand, 1971; Chernoff, 1973; Rencher, 2002). The result of applying the agglomeration method is a dendrogram whose nodes correspond to clusters of objects. ...
Conference Paper
Full-text available
The subject of this work is the future development of a post-revenue model based on innovative sales strategies. The industry in focus is the global automotive industry. This paper gives an overview of why it is necessary to adopt a new business model and to what extent it is actually a requirement of the global automotive market itself. The automotive industry is highly cyclical because it depends heavily on the availability of investment at a given point, and for this reason a well-designed corporate sales strategy is the primary tool for maintaining the level of revenues and their growth in an industry such as automotive. The industry is highly turbulent and has faced severe changes in the recent decade, in which clients expect a more service-oriented approach from the major manufacturers as well as from the other stakeholders involved in the automotive value chain. Thus, stakeholders have to innovate constantly to keep clients satisfied. In this respect, we highlight the most important aspects to be considered by business counterparts relevant to automotive industry companies in order to build sustainable future business operations. We present the factors that need to be examined, especially in the context of emerging markets in Europe, and provide directions for future research.
... From the graphs, we can note that ME and MV have different values for different IDs, while RE and Error were not much affected by different values of IDs. Using scattered radial bar plots (Chernoff, 1973; Stasko & Zhang, 2000), we examined the dependent variables for each ID individually. For example, in Figure 6 we can see that the average error (avg_ERR) for target width 90 and distance 200 is higher than for other IDs, which is evident from the green part of the bar in the plots. ...
Article
Eye-gaze-controlled interfaces allow the direct manipulation of a graphical user interface by looking at it. This technology has great potential in military aviation, in particular for operating different displays in situations where pilots' hands are occupied with flying the aircraft. This paper reports studies analyzing the accuracy of eye-gaze-controlled interfaces inside aircraft undertaking representative flying missions. We report that, using eye-gaze-controlled interfaces, pilots can undertake representative pointing and selection tasks in less than two seconds on average in a transport aircraft. Further, we analyzed the accuracy of eye-gaze-tracking glasses under various G load factors and analyzed the failure modes. We observed that the accuracy of the eye-tracking glasses is within 5° of visual angle up to +3G, although it is lower at −1G and +5G. We also found that the existing eye tracker fails to track eyes under higher external illumination and needs a larger vertical field of view than presently available systems provide. We used this analysis to develop eye-gaze trackers for multi-functional displays and a head-mounted display system (HMDS). We obtained a significant reduction in pointing and selection times using our proposed HMDS compared to a traditional thumb-stick-based target designation system.
Article
The purpose of this paper is to investigate the simultaneous effect of research outputs, such as the number of articles indexed in Scopus, on economic indicators such as the inflation rate, unemployment rate, and GDP of selected countries in the period from 2016 to 2020. Many articles have studied this topic, but none have focused on the simultaneous examination of research achievements and economic indicators. In some articles, the effects of research outputs on each of the economic growth indicators have been examined separately. However, separate analyses give biased parameter estimates and misleading inference. Consequently, we need a method in which these variables can be modelled jointly. For this study, a random sample of 39 countries was collected, using World Bank data to extract the economic indices and Scopus data to extract the number of articles. In this paper, a joint model with random effects for longitudinal economic growth indicators is proposed. For these data, the simultaneous effects of covariates, for example the number of articles indexed in Scopus, on the economic growth indicators, treated as three mixed correlated responses, are explored. There are two main findings. First, in the simultaneous examination of the effect of research outputs on economic indicators, some latent country-specific influencing factors, modelled as random effects, have a significant effect on the economic indicators. Second, the effects of research achievements on the economic indicators are significant when examined simultaneously. This significance is due to the simultaneous examination of the economic indicators, and the resulting statistical models have smaller errors compared to separate analyses.
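As a rough sketch of the kind of joint model described (our notation and simplifications, not the paper's exact specification), the three longitudinal responses for country $i$ at time $j$ can share correlated country-level random effects:
$$y_{ij}^{(r)} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta}^{(r)} + b_i^{(r)} + \varepsilon_{ij}^{(r)}, \qquad r = 1, 2, 3,$$
$$\bigl(b_i^{(1)}, b_i^{(2)}, b_i^{(3)}\bigr)^{\top} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma}_b),$$
where the covariates $\mathbf{x}_{ij}$ include the Scopus article counts, and the off-diagonal entries of $\boldsymbol{\Sigma}_b$ capture the correlation among the three indicators that separate per-response fits would ignore.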
Chapter
Modern nonclinical safety assessment is heavily dependent on the use of multiple tools, some that generate data, and others that help to evaluate and understand data. This chapter includes discussions of why a particular procedure or interpretation is recommended, by the clear enumeration of the assumptions that are necessary for a procedure to be valid and by discussion of problems drawn from the actual practice of toxicology and toxicological pathology. Toxicological experiments generally have a twofold purpose. The first question is whether or not an agent results in an effect on a biological system. The second question, never far behind, is how much of an effect is present. One approach for the selection of appropriate techniques to employ in a particular situation is to use a decision tree method. Fisher's exact test should be used to compare two sets of discontinuous, quantal data.
Chapter
The objective behind the entire safety assessment process in the pharmaceutical industry is to identify those compounds for which the risk of harming humans does not exceed the potential benefit to them. The terminology involved in screen design and evaluation and the characteristics of a screen should be clearly stated and understood. The characteristics of screen performance are defined as follows: sensitivity, specificity, positive accuracy, negative accuracy, capacity, and reproducibility. The use of screens that first occurs to most pharmaceutical scientists is in pharmacology. There are three major types of screen designs: single stage, sequential, and tiered. Screening data present a special case that, due to their inherent characteristics, is not well served by traditional approaches. The control chart approach, commonly used in manufacturing quality control, is another form of screening that offers some desirable characteristics.
Book
Teacher tenure is a problem. Teacher tenure is a solution. Fracking is safe. Fracking causes earthquakes. Our kids are over-tested. Our kids are not tested enough. We read claims like these in the newspaper every day, often with no justification other than 'it feels right'. How can we figure out what is right? Escaping from the clutches of truthiness begins with one simple question: 'what is the evidence?' With his usual verve and flair, Howard Wainer shows how the sceptical mindset of a data scientist can expose truthiness, nonsense, and outright deception. Using the tools of causal inference he evaluates the evidence, or lack thereof, supporting claims in many fields, with special emphasis in education. This wise book is a must-read for anyone who has ever wanted to challenge the pronouncements of authority figures and a lucid and captivating narrative that entertains and educates at the same time.
Chapter
Predicting water runoff in ungauged water catchment areas is vital to practical applications such as the design of drainage infrastructure and flooding defences, runoff forecasting, and for catchment management tasks such as water allocation and climate impact analysis. This full colour book offers an impressive synthesis of decades of international research, forming a holistic approach to catchment hydrology and providing a one-stop resource for hydrologists in both developed and developing countries. Topics include data for runoff regionalisation, the prediction of runoff hydrographs, flow duration curves, flow paths and residence times, annual and seasonal runoff, and floods. Illustrated with many case studies and including a final chapter on recommendations for researchers and practitioners, this book is written by expert authors involved in the prestigious IAHS PUB initiative. It is a key resource for academic researchers and professionals in the fields of hydrology, hydrogeology, ecology, geography, soil science, and environmental and civil engineering.
Chapter
The article focuses on the need to maintain data from the digital educational footprint throughout the student’s life in the frame of institutional, corporate and independent learning. The grounds for decision-making such as the scope of the learning situation and the depth of analysis are identified. The possibilities of multidimensional analysis and cognitive computer graphics to enhance decision-making in LMS are considered. The advantages of Chernoff faces and their modifications are emphasized. It is suggested to form an image of a specialist in the form of an anthropomorphic figure and overlay individual attributes from the digital educational footprint of a student. The method of Unified Graphic Visualization of Activity (UGVA) is described. It allows forming images of specialists, comparing them with each other, and estimating the balance of educational material contribution. An example of the formation of a specific profile for the academic program “Informatics and Computer Science” and its visualization in UGVA notation is given. Images are formed for several students, including current learning achievements from individual digital educational footprints (competence aspect). The example is accompanied by recommendations for teachers of the academic department implementing the corresponding program in Siberian Federal University.
Article
While dockless bike sharing is gaining popularity, oversupplied and poorly maintained bikes introduce chaos and waste (e.g., so-called zombie bikes that go unused). Spatiotemporal pattern visualizations can help policy-making and infrastructure improvement (e.g., allocating parking areas). However, multivariate symbolization (e.g., supply, flow, usage) to optimize dockless bike sharing is challenging. In this paper, we introduce metaphor theory to design multivariate symbols. First, we systematically explore the coupling of three metaphor types (orientational, ontological, and structural) with symbols at three levels of iconicity. Then, we construct metaphorical symbols for optimizing dockless bike sharing following a user-centred design process. We also offer an evaluation using eye-tracking and questionnaire techniques. The results indicate that, compared with bin-packing and multiview symbols, metaphorical symbols significantly improved effectiveness and efficiency, and reduced participants' cognitive load. Our evaluation presents preliminary evidence that metaphors can offer new organizational mechanisms for map symbols to represent multivariate data naturally and effectively.
Thesis
This thesis uses a number of datasets to characterise geographical variation in neonatal and maternal phenotypes, and investigates both maternal-neonatal and paternal-neonatal relationships. These include cohorts from the UK, Finland, India, Sri Lanka, China, Congo, Nigeria and Jamaica. Analyses were restricted to singleton, liveborn, term births. Neonates in Europe were the largest, followed by Jamaica, China, then Africa, India and Sri Lanka. There was wide variation in many of the measurements such as birthweight, where the mean values ranged from 2730g to 3570g across populations. However, head circumference was similar in all populations except China, where it was markedly smaller. The main differences between populations were in the ratio of head to length, with small heads in China and large heads in India, Sri Lanka and Africa, relative to length. The mothers from Sri Lanka were the shortest (mean height 151cm) and thinnest (mean BMI at 30 weeks gestation 20 kg/m²), while those from Southampton were the tallest (mean height 164cm) and fattest (mean BMI 27 kg/m²). There were large differences between mothers in the amount of fat relative to muscle. Urban Indian mothers were relatively fat while mothers from the Congo, rural India and particularly Jamaica were relatively muscular. Mother to baby relationships were surprisingly similar across populations, although some effects were stronger in developing countries. All the maternal variables had important effects on the neonatal measures, particularly maternal birthweight. 'Like with like' relationships were seen consistently for maternal height and neonatal length, maternal and neonatal head, and maternal and neonatal fat. Maternal muscle effects were relatively weak, except in one dataset (Congo). After adjusting for the variation in maternal phenotypes across populations, differences in neonatal phenotypes were reduced but still present.
Book
This book is devoted to the emerging field of integrated visual knowledge discovery, which combines advances in artificial intelligence/machine learning and visualization/visual analytics. A long-standing challenge of artificial intelligence (AI) and machine learning (ML) is explaining models to humans, especially for life-critical applications like health care. A model explanation is fundamentally a human activity, not only an algorithmic one. As current deep learning studies demonstrate, this makes the paradigm based on visual methods critically important for addressing the challenge. In general, visual approaches are critical for discovering explainable high-dimensional patterns of all types in high-dimensional data, offering "n-D glasses," where preserving high-dimensional data properties and relations in visualizations is a major challenge. The current progress opens a fantastic opportunity in this domain. This book is a collection of 25 extended works by over 70 scholars, presented at AI and visual analytics related symposia at the recent International Information Visualization Conferences, with the goal of moving this integration to the next level. The sections of this book cover integrated systems, supervised learning, unsupervised learning, optimization, and the evaluation of visualizations. The intended audience for this collection includes those developing and using emerging AI/machine learning and visualization methods. Scientists, practitioners, and students can find multiple examples of the current integration of AI/machine learning and visualization for visual knowledge discovery. The book provides a vision of future directions in this domain. New researchers will find inspiration to join the profession and become involved in its further development. Instructors in AI/ML and visualization classes can use it as a supplementary source in their undergraduate and graduate classes.
Preprint
Full-text available
Metaphoric glyphs enhance the readability and learnability of abstract glyphs used for the visualization of quantitative multidimensional data by building upon graphical entities that are intuitively related to the underlying problem domain. Their construction is, however, a predominantly manual process. In this paper, we introduce the Glyph-from-Icon (GfI) approach, which allows the automated generation of metaphoric glyphs from user-specified icons. Our approach modifies the icon's visual appearance using up to seven quantifiable visual variables, three of which manipulate its geometry while four affect its color. Depending on the visualization goal, specific combinations of these visual variables define the glyph's variables used for data encoding. Technically, we propose a diffusion-curve based parametric icon representation, which comprises the degrees-of-freedom related to the geometric and color-based visual variables. Moreover, we extend our GfI approach to achieve scalability of the generated glyphs. Based on a user study, we evaluate the perception of the glyph's main variables, i.e., the amplitude and frequency of geometric and color modulation, as a function of the stimuli, and deduce functional relations as well as quantization levels to achieve perceptual monotonicity and readability. Finally, we propose a robustly perceivable combination of visual variables, which we apply to the visualization of COVID-19 data.
Chapter
Strategic foresight, corporate foresight, and technology management enable firms to detect discontinuous changes early and develop future courses for a more sophisticated market positioning. Advances in machine learning and artificial intelligence allow more automatic detection of early trends that can be used to create future courses and make strategic decisions. Visual Analytics combines methods of automated data analysis through machine learning with interactive visualizations. It enables a far better way to gather insights from a vast amount of data in order to make strategic decisions. While Visual Analytics offers various models and approaches to enable strategic decision-making, the analysis of trends is still a matter of research. The forecasting approaches and the involvement of humans in the visual trend analysis process require further investigation that will lead to more sophisticated analytical methods. In this paper we introduce a novel model of Visual Analytics for decision-making, particularly for technology management, based on early trends from scientific publications. We combine corporate foresight and Visual Analytics and propose a machine learning-based technology roadmapping built on our previous work.
Chapter
Scatterplot visualization techniques are a useful method for showing the correlations of variables on the axes, as well as revealing patterns or abnormalities in multidimensional data sets. They are often used in the early stage of exploratory analysis. Scatterplot techniques have the drawback that they are not very effective in showing a high number of dimensions, since each plot in two-dimensional space can only present a pair of variables on the x-axis and y-axis. Scatterplot matrices and multiple scatterplots provide more plots that show more pairs of variables, yet they also compromise space, which must be divided among the plots. This chapter presents a comprehensive review of multi-dimensional visualization methods. We introduce a hybrid model to support multidimensional data visualization, from which we present a hybrid scatterplot visualization that extends the capability of individual scatterplots to show more information. In particular, we integrate star plots with scatterplots to show selected attributes of each item for better comparison among and within individual items, while using the scatterplots to show the correlation among the data items. We also demonstrate the effectiveness of this hybrid method through two case studies.
Article
The design of efficient representations is well established as a fruitful way to explore and analyze complex or large data. In these representations, data are encoded with various visual attributes depending on the needs of the representation itself. To support coherent design choices about visual attributes, the visual search field proposes guidelines based on the human brain's perception of features. However, information visualization representations frequently need to depict more data than the amount these guidelines have been validated on. Since then, the information visualization community has extended these guidelines to a wider parameter space. This paper contributes to this theme by extending visual search theories to an information visualization context. We consider a visual search task where subjects are asked to find an unknown outlier in a grid of randomly laid out distractors. Stimuli are defined by color and shape features for the purpose of visually encoding categorical data. The experimental protocol consists of a parameter-space reduction step (i.e., sub-sampling) based on a machine learning model, and a user evaluation to validate hypotheses and measure capacity limits. The results show that the major difficulty factor is the number of visual attributes used to encode the outlier. When the outlier is redundantly encoded, display heterogeneity has no effect on the task. When it is encoded with one attribute, the difficulty depends on that attribute's heterogeneity until its capacity limit (7 for color, 5 for shape) is reached. Finally, when the outlier is encoded with two attributes simultaneously, performance drops drastically even with minor heterogeneity.
Article
This paper presents a smart sensor dashboard for a digital twin of a smart manufacturing workshop. We describe the development of the digital twin, followed by three user studies on the visualization and interaction aspects of the smart sensor dashboard. The first two user studies evaluated ocular parameters and users' responses for different 2D and 3D graphs rendered on a 2D screen and a VR headset. The bar chart was found to generate the most accurate user responses in both the 2D and 3D cases. The third study recreated the Fitts' law task in 3D and compared visual and haptic feedback. We found that haptic feedback significantly improved quantitative metrics of interaction compared with a no-feedback case, whereas multimodal feedback significantly improved qualitative metrics of interaction. Results from the study can be utilized to design VR environments with interactive graphs.
Article
Full-text available
This study uses Chernoff faces to model the responses of students, faculty, and administration staff of a teacher education institution in Manila, Philippines, to the implementation of an Outcomes-Based Teacher Education Curriculum (OBTEC) trimester scheme. Chernoff faces provide a valuable representation for modelling responses because people are used to studying and reacting to faces. This study used a quantitative research method, analyzing cross-sectional data from the study of the OBTEC trimester scheme. A total of 322 participants were selected through convenience sampling and given a 15-item survey in which possible responses ranged from 1 (strongly disagree) to 6 (strongly agree). The administrators were found to give a generally favorable rating (overall mean = 4.56, 'agree'; overall SD = 0.45) to the OBTEC trimester scheme. The statements rated most highly by the administrators pertain to the success of OBTEC in integrating pedagogical content knowledge training with outcomes-based education, preparing students for the teaching profession, and being consistent with the K to 12 curriculum. These responses are characterized by the structure of the face, the width of the mouth, and the height of the face, respectively. The most negative aspects of the OBTEC trimester scheme, according to the students, are characterized by hair height, nose width, and a hair style of thin hair that points downward. Chernoff faces were found to be a simple yet powerful tool to model responses in the evaluation of the OBTEC trimester scheme.
Article
Full-text available
This work analyzes the occurrence of forest fires in a conservation unit, the Edmundo Navarro de Andrade State Forest, located in Rio Claro, SP, Brazil. The research mapped and classified land use and occupation and performed statistical analysis using the multivariate clustering technique, based on a survey of the forest fires that occurred from 2012 to 2018. Using the Quantum GIS software, the following land use and occupation classes were mapped: eucalyptus, forest, palms, water body, built-up area, and exposed soil. The statistical analyses were performed with the R software. The year 2014 stands out as having the largest area affected by forest fires, 286.09 ha over 10 fires, while 2016 had the largest number of fire occurrences, 19, affecting an area of 66.2 ha. It is concluded that the application of geotechnologies and statistical analysis can contribute to the sustainable management of conservation units.
Article
A method of plotting data of more than two dimensions is proposed. Each data point, $x = (x_1, \ldots, x_k)$, is mapped into the function
$$f_x(t) = x_1/\sqrt{2} + x_2 \sin t + x_3 \cos t + x_4 \sin 2t + x_5 \cos 2t + \cdots,$$
and the function is plotted on the range $-\pi \le t \le \pi$.
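The formula above (the Andrews curve construction) is straightforward to implement; below is a minimal sketch in Python with numpy and matplotlib. For DataFrames, pandas also ships a ready-made pandas.plotting.andrews_curves.

    import numpy as np
    import matplotlib.pyplot as plt

    def andrews_curve(x, t):
        """f_x(t) = x1/sqrt(2) + x2 sin t + x3 cos t + x4 sin 2t + ..."""
        f = np.full_like(t, x[0] / np.sqrt(2))
        for i, xi in enumerate(x[1:], start=1):
            k = (i + 1) // 2                  # harmonic order: 1, 1, 2, 2, 3, ...
            f += xi * (np.sin(k * t) if i % 2 == 1 else np.cos(k * t))
        return f

    t = np.linspace(-np.pi, np.pi, 300)
    rng = np.random.default_rng(1)
    for x in rng.random((10, 5)):             # ten 5-dimensional points
        plt.plot(t, andrews_curve(x, t), lw=0.8)
    plt.xlabel('t'); plt.ylabel('f_x(t)')
    plt.show()

Points that are close in k-dimensional space yield curves that stay close over the whole range, so clusters appear as bands of similar curves.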
Article
Eighty-eight specimens of Eocene nummulitids from the Yellow Limestone Formation of northwestern Jamaica are classified according to quantitative measurements of morphologic parameters that are generally considered to be taxonomically useful. The specimens are grouped into homogeneous classes by the computer screening of differently oriented data projections. By this method, the use of similarity coefficients and the question of a priori weighting of characters, for which numerical taxonomy has been heavily criticized, are both avoided. The stability of the classes thus obtained is validated by discriminant analysis. These techniques provide an objective view of phenetic differences among specimens and show how the measured characters produce those differences. Tightness of coiling and total number of whorls prove to be the most useful features in discriminating between groups but seem to have taxonomic value only at the specific and not at the generic level. This suggests that the genera Operculinoides and Nummulites are synonymous.
Article
Recognizing associations between large numbers of variables is a problem encountered in all the sciences. For this reason, the Editor felt that the following article by Dr. Edgar Anderson, which appeared in the Proceedings of the National Academy of Sciences, Vol. 43, pp. 923-927, 1957, would be of interest to the readers of Technometrics. The article is republished with the kind permission of Dr. Anderson and of Dr. Wendell M. Stanley, the Editor of the Proceedings of the National Academy of Sciences.