Jürgen Bernard

  • Professor
  • Assistant Professor at University of Zurich

Open Positions in Interactive Visual Data Analysis (IVDA) Group! https://www.ifi.uzh.ch/en/ivda/open-positions.html

About

  • Publications: 108
  • Reads: 91,128
  • Citations: 1,958
Current institution
University of Zurich
Current position
  • Assistant Professor
Additional affiliations
September 2020 - present
University of Zurich
Position
  • Professor (Assistant)
Description
  • Leading the Interactive Visual Data Analysis Group
June 2019 - December 2020
University of British Columbia
Position
  • PostDoc Position
Description
  • In the InfoVis Group led by Tamara Munzner, working on Visual Analytics and Interactive Machine Learning
March 2016 - May 2019
Technical University of Darmstadt
Position
  • PostDoc Position
Description
  • Research Fields: Information Visualization, Visual Analytics, Exploratory Search, Visual Cluster Analysis, Semi-supervised and Active Learning, Similarity Search for Complex Data Objects, Time Series Analysis

Publications (108)
Article
Full-text available
Objectives: This article describes the design and evaluation of MS Pattern Explorer, a novel visual tool that uses interactive machine learning to analyze fitness wearables’ data. Applied to a clinical study of multiple sclerosis (MS) patients, the tool addresses key challenges: managing activity signals, accelerating insight generation, and rapidly...
Preprint
Full-text available
In medical diagnostics of both early disease detection and routine patient care, particle-based contamination of in-vitro diagnostics consumables poses a significant threat to patients. Objective data-driven decision-making on the severity of contamination is key for reducing patient risk, while saving time and cost in quality assessment. Our colla...
Article
Full-text available
Wearable sensor technologies are becoming increasingly relevant in health research, particularly in the context of chronic disease management. They generate real-time health data that can be translated into digital biomarkers, which can provide insights into our health and well-being. Scientific methods to collect, interpret, analyze, and translate...
Article
Time-stamped event sequences (TSEQs) are time-oriented data without value information, shifting the focus of users to the exploration of temporal event occurrences. TSEQs exist in application domains, such as sleeping behavior, earthquake aftershocks, and stock market crashes. Domain experts face four challenges, for which they could use interactiv...
Preprint
Background: The increased use of digital data in health research calls for inter- and transdisciplinary collaborations as this data is associated with methodological complexities. This often entails merging the linear deductive approach of health science with the explorative iterative approach of data science. Yet, it is questioned how established h...
Article
Background: The increased use of digital data in health research demands interdisciplinary collaborations to address its methodological complexities and challenges. This often entails merging the linear deductive approach of health research with the explorative iterative approach of data science. However, there is a lack of structured teaching cours...
Article
We present ManuKnowVis, the result of a design study, in which we contextualize data from multiple knowledge repositories of a manufacturing process for battery modules used in electric vehicles. In data-driven analyses of manufacturing data, we observed a discrepancy between two stakeholder groups involved in serial manufacturing processes: Knowle...
Preprint
Full-text available
Abstractions and taxonomic structures for tasks are useful for designers of interactive data analysis approaches, serving as design targets and evaluation criteria alike. For individual data types, dataset-specific taxonomic structures capture unique data characteristics, while being generalizable across application domains. The creation of dataset-centric...
Preprint
Full-text available
Item ranking systems support users in multi-criteria decision-making tasks. Users need to trust rankings and ranking algorithms to reflect user preferences nicely while avoiding systematic errors and biases. However, today only few approaches help end users, model developers, and analysts to explain rankings. We report on the study of explanation a...
Article
Full-text available
Electrical engines are a key technology that automotive manufacturers must master to stay competitive. To improve the manufacturing of this technology, engineers need to analyze an overwhelming number of measurements of engines. However, engineers are hindered in analyzing large numbers of engines by the following challenges: (1) Engines comprise a co...
Article
Graph neural networks (GNNs) are a class of powerful machine learning tools that model node relations for making predictions of nodes or links. GNN developers rely on quantitative metrics of the predictions to evaluate a GNN, but similar to many other neural networks, it is difficult for them to understand if the GNN truly learns characteristics of...
Conference Paper
If you have ever used an e-commerce service or a streaming platform, you have already come across something like: "recommended for you", or "other users have also bought this". Our educational article below will give you an introduction to Recommender Systems (RS), and illustrate how this field currently leverages deep-learning techniques. Our arti...
Conference Paper
We present a visual analytics approach for the in-depth analysis and explanation of incremental machine learning processes that are based on data labeling. Our approach offers multiple perspectives to explain the process, i.e., data characteristics, label distribution, class characteristics, and classifier characteristics. Additionally, we introduc...
Article
Linguistic insight in the form of high-level relationships and rules in text builds the basis of our understanding of language. However, the data-driven generation of such structures often lacks labeled resources that can be used as training data for supervised machine learning. The creation of such ground-truth data is a time-consuming process tha...
Article
Strategies for selecting the next data instance to label, in service of generating labeled data for machine learning, have been considered separately in the machine learning literature on active learning and in the visual analytics literature on human-centered approaches. We propose a unified design space for instance selection strategies to suppor...
Preprint
Full-text available
In visual interactive labeling, users iteratively assign labels to data items until the machine model reaches an acceptable accuracy. A crucial step of this process is to inspect the model's accuracy and decide whether it is necessary to label additional elements. In scenarios with no or very little labeled data, visual inspection of the prediction...
Article
Full-text available
In this design study, we present IRVINE, a Visual Analytics (VA) system, which facilitates the analysis of acoustic data to detect and understand previously unknown errors in the manufacturing of electrical engines. In serial manufacturing processes, signatures from acoustic data provide valuable information on how the relationship between multiple...
Article
Mixed-initiative visual data analysis processes are characterized by the co-adaptation of users and systems over time. As the analysis progresses, both actors – users and systems – gather information, update their analysis behavior, and work on different tasks towards their respective goals. In this paper, we contribute a multigranular model of co-...
Preprint
Full-text available
Graph neural networks (GNNs) are a class of powerful machine learning tools that model node relations for making predictions of nodes or links. GNN developers rely on quantitative metrics of the predictions to evaluate a GNN, but similar to many other neural networks, it is difficult for them to understand if the GNN truly learns characteristics of...
Article
Full-text available
Class separation is an important concept in machine learning and visual analytics. We address the visual analysis of class separation measures for both high-dimensional data and its corresponding projections into 2D through dimensionality reduction (DR) methods. Although a plethora of separation measures have been proposed, it is difficult to compa...
Article
Event sequences are central to the analysis of data in domains that range from biology and health, to logfile analysis and people's everyday behavior. Many visualization tools have been created for such data, but people are error-prone when asked to judge the similarity of event sequences with basic presentation methods. This paper describes an exp...
Article
Classifiers are among the most widely used supervised machine learning algorithms. Many classification models exist, and choosing the right one for a given task is difficult. During model selection and debugging, data scientists need to assess classifiers' performances, evaluate their learning behavior over time, and compare different models. Typic...
Article
Methods from supervised machine learning allow the classification of new data automatically and are tremendously helpful for data analysis. The quality of supervised machine learning depends not only on the type of algorithm used, but also on the quality of the labelled dataset used to train the classifier. Labelling instances in a training dataset...
Conference Paper
Full-text available
Many machine learning algorithms require a labelled training dataset. The task of labelling a multivariate dataset can be tedious, but can be supported by systems combining interactive visualisation and machine learning techniques into a single interface. mVis is such a system, providing a unified ecosystem to explore multivariate datasets and exec...
Conference Paper
Full-text available
Many interactive machine learning workflows in the context of visual analytics encompass the stages of exploration, verification, and knowledge communication. Within these stages, users perform various types of actions based on different human needs. In this position paper, we postulate expanding this workflow by introducing gameful design elements...
Preprint
Classifiers are among the most widely used supervised machine learning algorithms. Many classification models exist, and choosing the right one for a given task is difficult. During model selection and debugging, data scientists need to assess classifier performance, evaluate the training behavior over time, and compare different models. Typically,...
Preprint
We propose the concept of Speculative Execution for Visual Analytics and discuss its effectiveness for model exploration and optimization. Speculative Execution enables the automatic generation of alternative, competing model configurations that do not alter the current model state unless explicitly confirmed by the user. These alternatives are com...
Article
Full-text available
Pre‐processing is a prerequisite to conduct effective and efficient downstream data analysis. Pre‐processing pipelines often require multiple routines to address data quality challenges and to bring the data into a usable form. For both the construction and the refinement of pre‐processing pipelines, human‐in‐the‐loop approaches are highly benefici...
Conference Paper
Full-text available
In multivariate time series analysis, pre-processing is integral for enabling analysis, but inevitably introduces uncertainty into the data. Enabling the assessment of the uncertainty and allowing uncertainty-aware analysis, the uncertainty needs to be quantified initially. We address this challenge by formalizing the quantification of uncertainty...
Article
Full-text available
Supervised machine learning techniques require labelled multivariate training datasets. Many approaches address the issue of unlabelled datasets by tightly coupling machine learning algorithms with interactive visualisations. Using appropriate techniques, analysts can play an active role in a highly interactive and iterative machine learning proces...
Article
Full-text available
Feature selection is an effective technique to reduce dimensionality, for example when the condition of a system is to be understood from multivariate observations. The selection of variables often involves a priori assumptions about underlying phenomena. To avoid the associated uncertainty, we aim at a selection criterion that only considers the o...
Conference Paper
Full-text available
We propose the concept of Speculative Execution for Visual Analytics and discuss its effectiveness for model exploration and optimization. Speculative Execution enables the automatic generation of alternative, competing model configurations that do not alter the current model state unless explicitly confirmed by the user. These alternatives are com...
Conference Paper
Full-text available
Advanced artificial intelligence models are used to solve complex real-world problems across different domains. While bringing along the expertise for their specific domain problems, users from these various application fields often do not readily understand the underlying artificial intelligence models. The resulting opacity implicates a low level...
Conference Paper
Full-text available
A curated literature collection on a specific topic helps researchers to find relevant articles quickly. Assigning multiple keywords to each article is one of the techniques to structure such a collection. But it is challenging to assign all the keywords consistently without any gaps or ambiguities. We propose to support the user with a machine lea...
Article
Full-text available
The assignment of labels to data instances is a fundamental prerequisite for many machine learning tasks. Moreover, labeling is a frequently applied process in visual interactive analysis approaches and visual analytics. However, the strategies for creating labels usually differ between these two fields. This raises the question whether synergies b...
Article
The labeling of data sets is a time‐consuming task, which is, however, an important prerequisite for machine learning and visual analytics. Visual‐interactive labeling (VIAL) provides users an active role in the process of labeling, with the goal to combine the potentials of humans and machines to make labeling more efficient. Recent experiments sh...
Poster
Full-text available
Two-dimensional color maps are used in many applications, for example to encode multi-dimensional data. We propose an automatic approach to create color maps based on different quality criteria. The quality criteria and their weightings may be defined interactively by the user. The system then tries to find an optimal color map based on the given c...
Poster
Full-text available
Similarity functions are essential for many analytical tasks. Goal: create a similarity function based on visual-interactive user feedback to capture the Mental Similarity Notion in the heads of domain experts. Inspiring solutions for numerical data exist. However, the interpretation of user feedback for categorical data attributes poses additio...
Article
In this design study, we present a visualization technique that segments patients' histories instead of treating them as raw event sequences, aggregates the segments using criteria such as the whole history or treatment combinations, and then visualizes the aggregated segments as static dashboards that are arranged in a dashboard network to show lo...
Article
Labeling data instances is an important task in machine learning and visual analytics. Both fields provide a broad set of labeling strategies, whereby machine learning (and in particular active learning) follows a rather model-centered approach and visual analytics employs rather user-centered approaches (visual-interactive labeling). Both approach...
Article
Clustering is a core building block for data analysis, aiming to extract otherwise hidden structures and relations from raw datasets, such as particular groups that can be effectively related, compared, and interpreted. A plethora of visual-interactive cluster analysis techniques has been proposed to date, however, arriving at useful clusterings of...
Article
Full-text available
The definition of similarity is a key prerequisite when analyzing complex data types in data mining, information retrieval, or machine learning. However, the meaningful definition is often hampered by the complexity of data objects and particularly by different notions of subjective similarity latent in targeted user groups. Taking the example of s...
Article
The characterization and abstraction of large multivariate time series data often poses challenges with respect to effectiveness or efficiency. Using the example of human motion capture data, challenges exist in creating compact solutions that still reflect semantics and kinematics in a meaningful way. We present a visual-interactive approach for th...
Article
The exploration of text document collections is a complex and cumbersome task. Clustering techniques can help to group documents based on their content for the generation of overviews. However, the underlying clustering workflows comprising preprocessing, feature selection, clustering algorithm selection and parameterization offer several degrees o...
Article
Background: While the optimal use and timing of secondary therapy after radical prostatectomy (RP) remain controversial, there are limited data on patient-reported outcomes following multimodal therapy. Objective: To assess the impact of additional radiation therapy (RT) and/or androgen deprivation therapy (ADT) on urinary continence, potency, and q...
Conference Paper
The process of political decision making is often complex and tedious. The policy process consists of multiple steps, most of which are highly iterative. In addition, different stakeholder groups are involved in political decision making and contribute to the process. A series of textual documents accompanies the process. Examples are official docum...
Conference Paper
The present paper asks how visualization can help data scientists make sense of event sequences, and makes three main contributions. The first is a research agenda, which we divide into methods for presentation, interaction & computation, and scale-up. Second, we introduce the concept of Event Maps to help with scale-up, and illustrate coarse-, med...
Article
Background: While the optimal use and timing of secondary therapy after radical prostatectomy remain controversial, there are limited data on the patient-reported outcomes following multimodality therapy. Our objective was to assess the impact of additional radiation and/or hormonal therapy on long-term urinary continence, quality of life and p...
Conference Paper
The assessment of patient well-being is highly relevant for the early detection of diseases, for assessing the risks of therapies, or for evaluating therapy outcomes. The knowledge to assess a patient's well-being is actually tacit knowledge and thus, can only be used by the physicians themselves. The rationale of this research approach is to use v...
Conference Paper
Full-text available
Segmentation and labeling of different activities in multivariate time series data is an important task in many domains. There is a multitude of automatic segmentation and labeling methods available, which are designed to handle different situations. These methods can be used with multiple parametrizations, which leads to an overwhelming amount of...
Article
A long-term goal in prostate cancer research is a sound prognosis prior to surgery, and as a consequence, data-centered research is becoming increasingly important. Currently, it takes several days to define meaningful cohorts by manually selecting patients from health record systems and performing statistical hypothesis tests with cohorts. The aut...
Article
The purpose of this ongoing work is to motivate public policy making as an application area for information visualization and visual analytics. Through our expertise gathered in several policy making-related projects, we identified parallels between the benefits of visualization and the needs of evidence-based public policy making. In the following...
Conference Paper
Full-text available
Color is one of the most important visual variables since it can be combined with any other visual mapping to encode information without using additional space on the display. Encoding one or two dimensions with color is widely explored and discussed in the field. Also mapping multi-dimensional data to color is applied in a vast number of applicati...
Conference Paper
Full-text available
The analysis of equine motion has a long tradition in the past of mankind. Equine biomechanics aims at detecting characteristics of horses indicative of good performance. Especially, veterinary medicine gait analysis plays an important role in diagnostics and in the emerging research of long-term effects of athletic exercises. More recently, the in...
Conference Paper
Decision making is a complex process consisting of several consecutive steps. Before converting a decision into effective action the problem to be tackled needs to be analyzed, alternative solutions need to be developed, and the best solution needs to be picked. In many cases computational models support decision makers in this process. Therefore,...
Article
To this day, data-driven science is a widely accepted concept in the digital library (DL) context (Hey et al. in The fourth paradigm: data-intensive scientific discovery. Microsoft Research, 2009). In the same way, domain knowledge from information visualization, visual analytics, and exploratory search has found its way into the DL workflow. This...
Conference Paper
Decision making in the field of policy making is a complex task. On the one hand conflicting objectives influence the availability of alternative solutions for a given problem. On the other hand economic, social, and environmental impacts of the chosen solution have to be considered. In the political context, these solutions are called policy optio...
Conference Paper
Full-text available
Color is one of the most effective visual variables since it can be combined with other mappings and encode information without using any additional space on the display. An important example where expressing additional visual dimensions is direly needed is the analysis of high-dimensional data. The property of perceptual linearity is desirable in...
Conference Paper
The definition of similarity between data objects plays a key role in many analytical systems. The process of similarity definition comprises several challenges as three main problems occur: different stakeholders, mixed data, and changing requirements. Firstly, in many applications the developers of the analytical system (data scientists) model th...
Article
The analysis of research data plays a key role in data-driven areas of science. Varieties of mixed research data sets exist and scientists aim to derive or validate hypotheses to find undiscovered knowledge. Many analysis techniques identify relations of an entire dataset only. This may level the characteristic behavior of different subgroups in th...
Article
We present a system to analyze time‐series data in sensor networks. Our approach supports exploratory tasks for the comparison of univariate, geo‐referenced sensor data, in particular for anomaly detection. We split the recordings into fixed‐length patterns and show them in order to compare them over time and space using two linked views. Apart fro...
Article
In this paper we analyze different layout algorithms that preserve relative directions in geo-referenced networks. This is an important criterion for many sensor networks such as the electric grid and other supply networks, because it enables the user to match the geographic setting with the drawing on the screen. Even today, the layout of these ne...
Data
The analysis of research data plays a key role in data-driven areas of science. Varieties of mixed research data sets exist and scientists aim to derive or validate hypotheses to find undiscovered knowledge. Many analysis techniques identify relations of an entire dataset only. This may level the characteristic behavior of different subgroups in th...
Article
Full-text available
We present MotionExplorer, an exploratory search and analysis system for sequences of human motion in large motion capture data collections. This special type of multivariate time series data is relevant in many research fields including medicine, sports and animation. Key tasks in working with motion data include analysis of motion states and tran...
Data
Adaptive visualizations aim to reduce the complexity of visual representations and convey information using interactive visualizations. Although the research on adaptive visualizations grew in the last years, the existing approaches do not make use of the variety of adaptable visual variables. Further the existing approaches often premises experts,...
Conference Paper
Adaptive visualizations aim to reduce the complexity of visual representations and convey information using interactive visualizations. Although the research on adaptive visualizations grew in the last years, the existing approaches do not make use of the variety of adaptable visual variables. Further the existing approaches often premises experts,...
Conference Paper
The complexity of actual decision making problems especially in the field of policy making is increasing due to conflicting aspects to be considered. Methods from the field of strategic environmental assessment consider environmental, economic, and social impacts caused by political decisions. This makes the analysis of reasonable decisions more co...
Article
Today's politicians are confronted with new (digital) ways to tackle complex decision-making problems. In order to make the right decisions profound analysis of the problems and possible solutions has to be performed. Therefore policy analysts need to collaborate with external experts consulted as advisors. Due to different expertises of these stak...
Article
The data collection contains 6813 links to time-oriented earth observation measurements from the Baseline Surface Radiation Network (BSRN). It covers all available measurements from the time period between 1992-01 and 2012-07 taken at BSRN stations all over the world. The data is used as a large and representative time-oriented research dataset for...
Conference Paper
Visual analysis of time series data is an important, yet challenging task with many application examples in fields such as financial or news stream data analysis. Many visual time series analysis approaches consider a global perspective on the time series. Fewer approaches consider visual analysis of local patterns in time series, and often rely on...
Article
Visual cluster analysis provides valuable tools that help analysts to understand large data sets in terms of representative clusters and relationships thereof. Often, the found clusters are to be understood in context of belonging categorical, numerical or textual metadata which are given for the data elements. While often not part of the clusterin...
Conference Paper
Full-text available
Visual cluster analysis provides valuable tools that help analysts to understand large data sets in terms of representative clusters and relationships thereof. Often, the found clusters are to be understood in context of belonging categorical, numerical or textual metadata which are given for the data elements. While often not part of the clusterin...
Conference Paper
Full-text available
The analysis of time-dependent data is an important problem in many application domains, and interactive visualization of time-series data can help in understanding patterns in large time series data. Many effective approaches already exist for visual analysis of univariate time series supporting tasks such as assessment of data quality, detection...
Article
The analysis of time-dependent data is an important problem in many application domains, and interactive visualization of time-series data can help in understanding patterns in large time series data. Many effective approaches already exist for visual analysis of univariate time series supporting tasks such as assessment of data quality, detection...
Article
Today's digital libraries (DLs) archive vast amounts of information in the form of text, videos, images, data measurements, etc. User access to DL content can rely on similarity between metadata elements, or similarity between the data itself (content-based similarity). We consider the problem of exploratory search in large DLs of time-oriented dat...
Article
Today's digital libraries (DLs) archive vast amounts of information in the form of text, videos, images, data measurements, etc. User access to DL content can rely on similarity between metadata elements, or similarity between the data itself (content-based similarity). We consider the problem of exploratory search in large DLs of time-oriented dat...
Conference Paper
Increasing amounts of data are collected in many areas of research and application. The degree to which this data can be accessed, retrieved, and analyzed is decisive to obtain progress in fields such as scientific research or industrial production. We present a novel method supporting content-based retrieval and exploratory search in repositories...
Article
Exploration and selection of data descriptors representing objects using a set of features are important components in many data analysis tasks. Usually, for a given dataset, an optimal data description does not exist, as the suitable data representation is strongly use case dependent. Many solutions for selecting a suitable data description have b...

Questions (3)
Question
Just as the density, shape, etc. of point clouds in scatterplots can be described, it would also be valuable to have a feature vector that helps estimate interesting properties of high-dimensional datasets. Interesting features that come to my mind: size, intrinsic dimensionality, noise variance, shape of dense regions, etc. In the best case: do you know libraries that provide such 'metadata' about high-dimensional data sets?
Question
Inspired by Aggarwal et al.'s paper "On the Surprising Behavior of Distance Metrics in High Dimensional Space", the question arises which distance measure is best suited for high-dimensional data. My targeted analysis tasks are: NN calculation, clustering, and the projection of data.
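One way to probe this question empirically is to measure how much contrast a distance measure retains as dimensionality grows. The following is a minimal sketch rather than a recommendation of a specific measure: it assumes uniformly random points (real data may behave differently) and compares the relative contrast (D_max - D_min) / D_min of Minkowski distances for several exponents p. The function names `minkowski` and `relative_contrast` are illustrative and are not taken from the cited paper.

```python
import random


def minkowski(a, b, p):
    """Minkowski distance with exponent p (for p < 1 this is not a true metric)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def relative_contrast(dim, p, n_points=200, seed=0):
    """(D_max - D_min) / D_min from one random query point to n_points random points.

    Assumption: all coordinates are drawn uniformly from [0, 1).
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = [
        minkowski(query, [rng.random() for _ in range(dim)], p)
        for _ in range(n_points)
    ]
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    # Compare a fractional exponent, Manhattan (p=1), and Euclidean (p=2)
    # at low and high dimensionality.
    for dim in (2, 20, 200):
        row = ", ".join(
            f"p={p}: {relative_contrast(dim, p):.3f}" for p in (0.5, 1.0, 2.0)
        )
        print(f"dim={dim:4d}  {row}")
```

Under these assumptions, the contrast shrinks as the dimensionality grows, and lower exponents (p = 1 or a fractional p such as 0.5) tend to retain more of it than the Euclidean distance. Whether that carries over to NN calculation, clustering, or projection for a particular dataset would still need to be checked on that data.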
Question
I face the task of laying out a weighted graph whose number of nodes will change during runtime. The chosen layout should rearrange the nodes without losing the global topology in order to keep the "amount of visual adjustment" as low as possible for the user. Additional requirement: it should be applicable in a Java environment. Does anyone know the best practices?
