Gunther Heinz Weber
  • Dr. rer. nat.
  • Researcher at Lawrence Berkeley National Laboratory

About

142
Publications
20,972
Reads
4,230
Citations
Current institution
Lawrence Berkeley National Laboratory
Current position
  • Researcher
Additional affiliations
April 2008 - present
University of California, Davis
Position
  • Research Assistant
January 2007 - present
Lawrence Berkeley National Laboratory
Position
  • Researcher

Publications

Publications (142)
Preprint
Modern machine learning often relies on optimizing a neural network's parameters using a loss function to learn complex features. Beyond training, examining the loss function with respect to a network's parameters (i.e., as a loss landscape) can reveal insights into the architecture and learning process. While the local structure of the loss landsc...
Preprint
In machine learning, a loss function measures the difference between model predictions and ground-truth (or target) values. For neural network models, visualizing how this loss changes as model parameters are varied can provide insights into the local structure of the so-called loss landscape (e.g., smoothness) as well as global properties of the u...
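
As a concrete illustration of this idea (a minimal sketch of my own, not the preprint's method; the toy linear model and random directions are assumptions made for the example), the loss can be sampled on a grid of perturbations of the trained parameters along two random directions to obtain a 2D slice of the landscape:

# Illustrative sketch only: sample a 2D slice of a loss landscape by perturbing a
# toy linear model's trained weights along two random directions.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                     # toy inputs
w_true = rng.normal(size=5)
y = X @ w_true + 0.1 * rng.normal(size=200)       # toy targets

w_star = np.linalg.lstsq(X, y, rcond=None)[0]     # "trained" parameters

def loss(w):
    r = X @ w - y
    return float(np.mean(r * r))                  # mean squared error

d1, d2 = rng.normal(size=5), rng.normal(size=5)   # two random directions in parameter space
d1 /= np.linalg.norm(d1)
d2 /= np.linalg.norm(d2)

alphas = np.linspace(-1.0, 1.0, 41)
landscape = np.array([[loss(w_star + a * d1 + b * d2) for b in alphas]
                      for a in alphas])           # 41x41 grid of loss values
print(landscape.shape, round(landscape.min(), 4)) # the minimum sits near the grid center
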
Preprint
Characterizing the loss of a neural network with respect to model parameters, i.e., the loss landscape, can provide valuable insights into properties of that model. Various methods for visualizing loss landscapes have been proposed, but less emphasis has been placed on quantifying and extracting actionable and reproducible insights from these compl...
Article
Full-text available
A significant challenge on an exascale computer is that the speed at which we compute results exceeds, by many orders of magnitude, the speed at which we can save them. Therefore, the Exascale Computing Project (ECP) ALPINE project focuses on providing exascale-ready visualization solutions, including in situ processing. In situ visualization and analy...
Article
Contour trees describe the topology of level sets in scalar fields and are widely used in topological data analysis and visualization. A main challenge of utilizing contour trees for large-scale scientific data is their computation at scale using high-performance computing. To address this challenge, recent work has introduced distributed hierarchic...
Preprint
Full-text available
Contour trees describe the topology of level sets in scalar fields and are widely used in topological data analysis and visualization. A main challenge of utilizing contour trees for large-scale scientific data is their computation at scale using high-performance computing. To address this challenge, recent work has introduced distributed hierarchi...
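
For readers unfamiliar with these structures, the serial core of the computation can be sketched as follows (my illustration, restricted to the join tree of a 1D field; this is not the distributed hierarchical algorithm described in the paper): sort vertices by decreasing value and sweep with a union-find, recording the saddles at which superlevel-set components merge.

# Serial sketch (not the paper's distributed algorithm): join tree of a 1D scalar
# field via a descending sort-and-sweep with union-find. Each recorded triple is
# (saddle, surviving maximum, merged maximum).
import numpy as np

def join_tree_1d(f):
    n = len(f)
    parent = list(range(n))          # union-find forest over processed vertices
    comp_max = {}                    # component root -> index of its maximum
    processed = [False] * n
    merges = []

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for v in sorted(range(n), key=lambda i: f[i], reverse=True):
        processed[v] = True
        roots = {find(u) for u in (v - 1, v + 1) if 0 <= u < n and processed[u]}
        if not roots:                # no higher neighbor: v is a local maximum
            comp_max[v] = v
            continue
        roots = sorted(roots, key=lambda r: f[comp_max[r]], reverse=True)
        main = roots[0]
        parent[v] = main             # regular vertex: join the highest component
        for r in roots[1:]:          # v connects two components: v is a saddle
            merges.append((v, comp_max[main], comp_max[r]))
            parent[r] = main
    return merges

f = np.array([0.0, 3.0, 1.0, 4.0, 0.5, 2.0, 0.2])
print(join_tree_1d(f))               # [(2, 3, 1), (4, 3, 5)]
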
Article
Full-text available
Modern machine learning often relies on optimizing a neural network's parameters using a loss function to learn complex features. Beyond training, examining the loss function with respect to a network's parameters (i.e., as a loss landscape) can reveal insights into the architecture and learning process. While the local structure of the loss landsc...
Article
Over the last decade, merge trees have proven to support a plethora of visualization and analysis tasks, since they effectively abstract complex datasets. This paper describes ExTreeM, a scalable algorithm for the computation of merge trees via extremum graphs. The core idea of ExTreeM is to first derive the extremum graph $\mathc...
Poster
Full-text available
Data from electron microscopes is seldom labeled, preventing it from being easily indexed and searched. Manual labeling is labor-intensive and monotonous, making it a prime candidate for automatic classification using deep learning. In this work, we explore data augmentation techniques for producing additional labeled samples. We utilize the newl...
Chapter
Full-text available
Data-driven machine learning (ML) models are attracting increasing interest in chemical engineering and already partly outperform traditional physical simulations. Previous work in this field has mainly focused on improving the models’ statistical performance, while the knowledge thereby imparted has been taken for granted. However, the structu...
Article
Full-text available
Understanding flow traffic patterns in networks, such as the Internet or service provider networks, is crucial to improving their design and building them robustly. However, as networks grow and become more complex, it is increasingly cumbersome and challenging to study how the many flow patterns, sizes and the continually changing source-destinati...
Chapter
In high-performance parallel in situ processing, the term in transit processing refers to those configurations where data must move from a producer to a consumer that runs on separate resources. In the context of parallel and distributed computing on an HPC platform, one of the central challenges is to determine a mapping of data from producer ranks...
Chapter
One key challenge with in situ processing is the investment required to add the instrumentation code that numerical simulations need to take advantage of it. Such instrumentation code is often specialized and tailored to a specific in situ method or infrastructure. If a simulation then wants to use other in situ tools, each of which has its own...
Article
Full-text available
Phase-contrast transmission electron microscopy (TEM) is a powerful tool for imaging the local atomic structure of materials. TEM has been used heavily in studies of defect structures of two-dimensional materials such as monolayer graphene due to its high dose efficiency. However, phase-contrast imaging can produce complex nonlinear contrast, even...
Article
Contour trees are used for topological data analysis in scientific visualization. While originally computed with serial algorithms, recent work has introduced a vector-parallel algorithm. However, this algorithm is relatively slow for fully augmented contour trees which are needed for many practical data analysis tasks. We therefore introduce a rep...
Preprint
Full-text available
Phase contrast transmission electron microscopy (TEM) is a powerful tool for imaging the local atomic structure of materials. TEM has been used heavily in studies of defect structures of 2D materials such as monolayer graphene due to its high dose efficiency. However, phase contrast imaging can produce complex nonlinear contrast, even for weakly-sc...
Article
The term “in situ processing” has evolved over the last decade to mean both a specific strategy for visualizing and analyzing data and an umbrella term for a processing paradigm. The resulting confusion makes it difficult for visualization and analysis scientists to communicate with each other and with their stakeholders. To address this problem, a...
Article
Full-text available
We describe a novel technique for the simultaneous visualization of multiple scalar fields, e.g. representing the members of an ensemble, based on their contour trees. Using tree alignments, a graph‐theoretic concept similar to edit distance mappings, we identify commonalities across multiple contour trees and leverage these to obtain a layout that...
Article
We introduce an approach for the interactive visual analysis of weighted, dynamic networks. These networks arise in areas such as computational neuroscience, sociology, and biology. Network analysis remains challenging due to complex time-varying network behavior. For example, edges disappear/reappear, communities grow/vanish, or overall network to...
Article
This work describes an approach for the interactive visual analysis of large-scale simulations, where numerous superlevel set components and their evolution are of primary interest. The approach first derives, at simulation runtime, a specialized Cinema database that consists of images of component groups, and topological abstractions. This databas...
Article
Full-text available
As data sets grow to exascale, automated data analysis and visualisation are increasingly important, to intermediate human understanding and to reduce demands on disk storage via in situ analysis. Trends in architecture of high performance computing systems necessitate analysis algorithms to make effective use of combinations of massively multicore...
Chapter
After two decades of work, computational topology is clearly a computationally challenging area. Not only do we have the usual algorithmic and programming difficulties of establishing correctness, we also have a class of problems that are mathematically complex and notationally fragile. Effective development and deployment therefore requires an add...
Article
Sets of multiple scalar fields can be used to model many types of variation in data, such as uncertainty in measurements and simulations or time‐dependent behavior of scalar quantities. Many structural properties of such fields can be explained by dependencies between different points in the scalar field. Although these dependencies can be of arbit...
Article
This paper studies the influence of the definition of neighborhoods and methods used for creating point connectivity on topological analysis of scalar functions. It is assumed that a scalar function is known only at a finite set of points with associated function values. In order to utilize topological approaches to analyze the scalar-valued point...
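
A minimal sketch of the kind of input this involves (my illustration, assuming a k-nearest-neighbor definition of point connectivity; not the paper's specific construction): build a neighborhood graph over scattered samples of a scalar function, which topological analysis of point-cloud data can then operate on.

# Sketch assuming a k-nearest-neighbor notion of point connectivity (not the
# paper's specific construction): the neighborhood graph that topological
# analysis of scattered scalar data operates on.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(4)
points = rng.uniform(-2, 2, size=(500, 2))               # scattered sample locations
values = np.sin(points[:, 0]) * np.cos(points[:, 1])     # scalar function values at the points

k = 6
tree = cKDTree(points)
_, idx = tree.query(points, k=k + 1)                     # k nearest neighbors (plus the point itself)
edges = {(min(i, int(j)), max(i, int(j)))
         for i, nbrs in enumerate(idx)
         for j in nbrs[1:]}                              # drop self, deduplicate undirected edges
print(len(edges), "edges connect", len(points), "points")
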
Article
Radiation detection can provide a reliable means of detecting radiological material. Such capabilities can help to prevent nuclear and/or radiological attacks, but reliable detection in uncontrolled surroundings requires algorithms that account for environmental background radiation. The Berkeley Data Cloud (BDC) facilitates the development of such...
Article
Full-text available
Background: There exists a need for effective and easy-to-use software tools supporting the analysis of complex Electrocorticography (ECoG) data. Understanding how epileptic seizures develop or identifying diagnostic indicators for neurological diseases require the in-depth analysis of neural activity data from ECoG. Such data is multi-scale and is...
Conference Paper
We introduce a new method that identifies and tracks features in arbitrary dimensions using the merge tree—a structure for identifying topological features based on thresholding in scalar fields. This method analyzes the evolution of features of the function by tracking changes in the merge tree and relates features by matching subtrees between con...
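
A greatly simplified illustration of the tracking problem (my own sketch based on component overlap between thresholded fields, not the paper's merge-tree subtree matching):

# Simplified sketch (component overlap, not merge-tree subtree matching): features
# are connected components above a threshold; a feature at t0 is related to the
# t1 feature it overlaps the most.
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(1)
field_t0 = ndimage.gaussian_filter(rng.random((64, 64)), sigma=4)
field_t1 = np.roll(field_t0, shift=3, axis=0)        # stand-in for the next time step

threshold = np.percentile(field_t0, 90)
labels_t0, n0 = ndimage.label(field_t0 > threshold)
labels_t1, n1 = ndimage.label(field_t1 > threshold)

for i in range(1, n0 + 1):
    overlap = labels_t1[labels_t0 == i]              # t1 labels under feature i's footprint
    overlap = overlap[overlap > 0]
    match = int(np.bincount(overlap).argmax()) if overlap.size else None
    print(f"feature {i} at t0 -> feature {match} at t1")
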
Article
Tracking graphs are a well established tool in topological analysis to visualize the evolution of components and their properties over time, i.e., when components appear, disappear, merge, and split. However, tracking graphs are limited to a single level threshold and the graphs may vary substantially even under small changes to the threshold. To e...
Article
Application-oriented papers provide an important way to invigorate and cross-pollinate the visualization field, but the exact criteria for judging an application paper's merit remain an open question. This article builds on a panel at the 2016 IEEE Visualization Conference entitled "Application Papers: What Are They, and How Should They Be Evaluate...
Poster
Full-text available
NVRAM-based Burst Buffers are an important part of the emerging HPC storage landscape. The National Energy Research Scientific Computing Center (NERSC) at LBNL recently installed one of the first Burst Buffer systems as part of its new Cori supercomputer, collaborating with Cray on the development of the DataWarp software. NERSC has over 6500 users...
Conference Paper
Full-text available
Emerging exascale systems have the ability to accelerate the time-to-discovery for scientific workflows. However, as these workflows become more complex, their generated data has grown at an unprecedented rate, making I/O constraints challenging. To address this problem advanced memory hierarchies, such as burst buffers, have been proposed as inter...
Conference Paper
We present ECoG ClusterFlow, a novel interactive visual analysis tool for the exploration of high-resolution Electrocorticography (ECoG) data. Our system detects and visualizes dynamic high-level structures, such as communities, using the time-varying spatial connectivity network derived from the high-resolution ECoG data. ECoG ClusterFlow provides...
Article
Full-text available
Modern cosmological simulations have reached the trillion-element scale, rendering data storage and subsequent analysis formidable tasks. To address this circumstance, we present a new MPI-parallel approach for analysis of simulation data while the simulation runs, as an alternative to the traditional workflow consisting of periodically saving larg...
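
The basic pattern of such in situ analysis can be sketched as follows (an illustration under my own assumptions, using mpi4py and a per-rank histogram as the summary; it is not the paper's analysis code):

# Sketch of the in situ pattern: every time step each rank reduces its local slab
# to a small summary and only that summary is combined, so the raw simulation data
# never has to be written to disk.
# Run with, e.g.: mpirun -n 4 python insitu_sketch.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
rng = np.random.default_rng(rank)

bins = np.linspace(0.0, 1.0, 65)
for step in range(10):
    local_slab = rng.random(1_000_000)                        # stand-in for this rank's simulation data
    local_hist, _ = np.histogram(local_slab, bins=bins)
    global_hist = np.zeros_like(local_hist)
    comm.Reduce(local_hist, global_hist, op=MPI.SUM, root=0)  # combine summaries in situ
    if rank == 0:
        print(f"step {step}: {int(global_hist.sum())} values summarized without touching disk")
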
Conference Paper
As data sets grow to exascale, automated data analysis and visualisation are increasingly important, to intermediate human understanding and to reduce demands on disk storage via in situ analysis. Trends in architecture of high performance computing systems necessitate analysis algorithms to make effective use of combinations of massively...
Conference Paper
Full-text available
NVRAM-based Burst Buffers are an important part of the emerging HPC storage landscape. The National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory recently installed one of the first Burst Buffer systems as part of its new Cori supercomputer, collaborating with Cray on the development of the DataWarp so...
Article
We present Brain Modulyzer, an interactive visual exploration tool for functional magnetic resonance imaging (fMRI) brain scans, aimed at analyzing the correlation between different brain regions when resting or when performing mental tasks. Brain Modulyzer combines multiple coordinated views—such as heat maps, node link diagrams and anatomical vie...
Chapter
The thematic composition of document collections is commonly conceptualized by clusters of high-dimensional point clouds. However, illustrating these clusters is challenging: typical visualizations such as colored projections or parallel coordinate plots suffer from feature occlusion and noise covering the whole visualization. We propose a method t...
Chapter
Interactive exploration and analysis of multi-field data utilizes a tight feedback loop of computation/visualization and user interaction to facilitate knowledge discovery in complex datasets. It does so by providing both overview visualizations, as well as support for focusing on features utilizing iterative drill-down operations. When exploring m...
Chapter
Merge trees represent the topology of scalar functions. To assess the topological similarity of functions, one can compare their merge trees. To do so, one needs a notion of a distance between merge trees, which we define. We provide examples of using our merge tree distance and compare this new measure to other ways used to characterize topologica...
Chapter
Topological techniques provide robust tools for data analysis. They are used, for example, for feature extraction, for data de-noising, and for comparison of data sets. This chapter concerns contour trees, a topological descriptor that records the connectivity of the isosurfaces of scalar functions. These trees are fundamental to analysis and visua...
Conference Paper
Improved simulations and sensors are producing datasets whose increasing complexity exhausts our ability to visualize and comprehend them directly. To cope with this problem, we can detect and extract significant features in the data and use them as the basis for subsequent analysis. Topological methods are valuable in this context because they pro...
Article
Full-text available
One potential solution to reduce the concentration of carbon dioxide in the atmosphere is the geologic storage of captured CO2 in underground rock formations, also known as carbon sequestration. There is ongoing research to guarantee that this process is both efficient and safe. We describe tools that provide measurements of media porosity, and per...
Conference Paper
Full-text available
Topological landscapes have been proposed as a visual metaphor for contour trees that does not require an understanding of the theory involved in defining contour trees. The idea is to create a representative terrain with the same topological structure as a given contour tree. This representation exploits the natural human ability to interpret topo...
Conference Paper
We propose a novel technique for building geometry-preserving topological landscapes. Our technique creates a direct correlation between a scalar function and its topological landscape. This correlation is accomplished by introducing the notion of geometric proximity into the topological landscapes, reflecting the distance of topological features w...
Chapter
Full-text available
Contours, the connected components of level sets, play an important role in understanding the global structure of a scalar field. In particular their nesting behavior and topology – often represented in the form of a contour tree – have been used extensively for visualization and analysis. However, traditional contour trees only encode structural prope...
Conference Paper
We present a novel extraction scheme for crack-free isosurfaces from adaptive mesh refinement (AMR) data that builds on prior work utilizing dual grids and filling resulting gaps with stitch cells. We use a case-table-based approach to simplify the implementation of stitch cell generation. The most significant benefit of our new approach is that it...
Article
Analyzing high-dimensional point clouds is a classical challenge in visual analytics. Traditional techniques, such as projections or axis-based techniques, suffer from projection artifacts, occlusion, and visual complexity. We propose to split data analysis into two parts to address these shortcomings. First, a structural overview phase abstracts d...
Article
Full-text available
Topological information has proven very valuable in the analysis of scientific data. An important challenge that remains is presenting this highly abstract information in a way that is comprehensible even to those without an in-depth background in topology. Furthermore, it is often desirable to combine the structural insight gained by topolo...
Chapter
Three-dimensional gene expression PointCloud data generated by the Berkeley Drosophila Transcription Network Project (BDTNP) provides quantitative information about the spatial and temporal expression of genes in early Drosophila embryos at cellular resolution. The BDTNP team visualizes and analyzes PointCloud data using the software application P...
Conference Paper
Full-text available
Gene expression and in vivo DNA binding data provide important information for understanding gene regulatory networks: in vivo DNA binding data indicate genomic regions where transcription factors are bound, and expression data show the output resulting from this binding. Thus, there must be functional relationships between these two types of data....
Article
Full-text available
Defining high-level features, detecting them, tracking them and deriving quantities based on them is an integral aspect of modern data analysis and visualization. In combustion simulations, for example, burning regions, which are characterized by high fuel-consumption, are a possible feature of interest. Detecting these regions makes it possible to...
Article
We consider several challenging problems in climate that require quantitative analysis of very large data volumes generated by modern climate simulations. We demonstrate new software capable of addressing these challenges that is designed to exploit petascale platforms using state-of-the-art methods in high performance computing. Atmospheric rivers...
Article
Studying transformation in a chemical system by considering its energy as a function of coordinates of the system's components provides insight and changes our understanding of this process. Currently, a lack of effective visualization techniques for high-dimensional energy functions limits chemists to plotting energy with respect to one or two coordin...
Conference Paper
Most analyses of ChIP-chip in vivo DNA binding have focused on qualitative descriptions of whether genomic regions are bound or not. There is increasing evidence, however, that factors bind in a highly overlapping manner to the same genomic regions and that it is quantitative differences in occupancy on these commonly bound regions that are the cri...
Conference Paper
Full-text available
Integral curves, such as streamlines, streaklines, pathlines, and timelines, are an essential tool in the analysis of vector field structures, offering straightforward and intuitive interpretation of visualization results. While such curves have a long-standing tradition in vector field visualization, their application to Adaptive Mesh Refinement (...
Article
Full-text available
VisIt is a popular open source tool for visualizing and analyzing data. It owes its success to its foci of increasing data understanding, large data support, and providing a robust and usable product, as well as its underlying design that fits today's supercomputing landscape. In this short paper, we describe the VisIt project and its accomplishmen...
Article
Full-text available
Large-scale simulations are increasingly being used to study complex scientific and engineering phenomena. As a result, advanced visualization and data analysis are also becoming an integral part of the scientific process. Often, a key step in extracting insight from these large simulations involves the definition, extraction, and evaluation of fea...
Chapter
Full-text available
Tracking features and exploring their temporal dynamics can aid scientists in identifying interesting time intervals in a simulation and serve as basis for performing quantitative analyses of temporal phenomena. In this paper, we develop a novel approach for tracking subsets of isosurfaces, such as burning regions in simulated flames, which are def...
Conference Paper
Full-text available
During the last decades, electronic textual information has become the world's largest and most important information source. Daily newspapers, books, scientific and governmental publications, blogs and private messages have grown into a wellspring of endless information and knowledge. Since neither existing nor new information can be read in its e...
Article
Full-text available
Knowledge discovery from large and complex scientific data is a challenging task. With the ability to measure and simulate more processes at increasingly finer spatial and temporal scales, the growing number of data dimensions and data objects presents tremendous challenges for effective data analysis and data exploration methods and tools. The com...
Article
Full-text available
This paper presents topology-based methods to robustly extract, analyze, and track features defined as subsets of isosurfaces. First, we demonstrate how features identified by thresholding isosurfaces can be defined in terms of the Morse complex. Second, we present a specialized hierarchy that encodes the feature segmentation independent of the thr...
Article
Full-text available
A series of experiments studied how visualization software scales to massive data sets. Although several paradigms exist for processing large data, the experiments focused on pure parallelism, the dominant approach for production software. The experiments used multiple visualization algorithms and ran on multiple architectures. They focused on mass...
Article
Full-text available
The recent development of methods for extracting precise measurements of spatial gene expression patterns from three-dimensional (3D) image data opens the way for new analyses of the complex gene regulatory networks controlling animal development. We present an integrated visualization and analysis framework that supports user-guided data clusterin...
Chapter
Full-text available
Previous investigations from Ushizima et al. (2008) and Rübel et al. (2008) to find particle bunches reported results using a fixed spatial tolerance around centers of maximum compactness and assumed ad hoc thresholding values to determine potential particle candidates involved in the physical phenomena of interest. Ushizima et al. (2008) pointed o...
Conference Paper
Full-text available
Adaptive Mesh Refinement (AMR) is a highly effective method for simulations spanning a large range of spatiotemporal scales such as those encountered in astrophysical simulations. Combining research in novel AMR visualization algorithms and basic infrastructure work, the Department of Energy (DOE) Scientific Discovery through Advanced Computing (Sc...
Conference Paper
Full-text available
The advent of highly accurate, large scale volumetric simulations has made data analysis and visualization techniques an integral part of the modern scientific process. To develop new insights from raw data, scientists need the ability to define features of interest in a flexible manner and to understand how changes in the feature definition impact...
Article
Full-text available
Numerical simulations of laser wakefield particle accelerators play a key role in the understanding of the complex acceleration process and in the design of expensive experimental facilities. As the size and complexity of simulation output grows, an increasingly acute challenge is the practical need for computational techniques that aid in scientif...
Conference Paper
Full-text available
Understanding vector fields resulting from large scientific simulations is an important and often difficult task. Streamlines, curves that are tangential to a vector field at each point, are a powerful visualization method in this context. Application of streamline-based visualization to very large vector field data represents a significant chall...
Article
State-of-the-art computational science simulations generate large-scale vector field data sets. Visualization and analysis is a key aspect of obtaining insight into these data sets and represents an important challenge. This article discusses possibilities and challenges of modern vector field visualization and focuses on methods and techniques dev...
Article
Full-text available
As scientific instruments and computer simulations produce more and more data, the task of locating the essential information to gain insight becomes increasingly difficult. FastBit is an efficient software tool to address this challenge. In this article, we present a summary of the key underlying technologies, namely bitmap compression, encoding,...
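
To convey the bitmap-indexing idea in a few lines (a conceptual sketch only; it does not use FastBit or reproduce its compression and encoding schemes): one boolean bitmap is kept per value bin, and a range query is answered by combining the bitmaps of the bins it covers.

# Conceptual sketch of a bitmap index (not FastBit's API or its compression and
# encoding schemes): one boolean bitmap per value bin; a range query ORs the
# bitmaps of the bins it covers.
import numpy as np

rng = np.random.default_rng(2)
energy = rng.uniform(0, 100, size=1_000_000)        # made-up "energy" column

bin_edges = np.linspace(0, 100, 11)                 # 10 equal-width bins
bin_ids = np.digitize(energy, bin_edges[1:-1])      # bin index (0-9) per record
bitmaps = [bin_ids == b for b in range(10)]         # one boolean bitmap per bin

mask = bitmaps[3] | bitmaps[4] | bitmaps[5]         # query: 30 <= energy < 60
print(int(mask.sum()), "matching records out of", energy.size)
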
Conference Paper
Full-text available
One of the central challenges facing visualization research is how to effectively enable knowledge discovery. An effective approach will likely combine application architectures that are capable of running on today's largest platforms to address the challenges posed by large data with visual data analysis techniques that help find, represent, and e...
Chapter
Full-text available
Genomes of hundreds of species have been sequenced to date, and many more are being sequenced. As more and more sequence data sets become available, and as the challenge of comparing these massive “billion basepair DNA sequences” becomes substantial, so does the need for more powerful tools supporting the exploration of these data sets. Similarit...
Article
Full-text available
Topology-based methods have been successfully used for the analysis and visualization of piecewise-linear functions defined on triangle meshes. This paper describes a mechanism for extending these methods to piecewise-quadratic functions defined on triangulations of surfaces. Each triangular patch is tessellated into monotone regions, so that exi...
Chapter
Full-text available
As 3D volumetric images of the human body become an increasingly crucial source of information for the diagnosis and treatment of a broad variety of medical conditions, advanced techniques that allow clinicians to efficiently and clearly visualize volumetric images become increasingly important. Interaction has proven to be a key concept in analysi...
Article
Full-text available
doi:10.1088/1742-6596/180/1/012084
Conference Paper
While the primary product of scientific visualization is images and movies, its primary objective is really scientific insight. Too often, the focus of visualization research is on the product, not the mission. This paper presents two case studies, both of which appear in previous publications, that focus on using visualization technology to produce in...
Conference Paper
Full-text available
One of the central challenges in modern science is the need to quickly derive knowledge and understanding from large, complex collections of data. We present a new approach that deals with this challenge by combining and extending techniques from high performance visual data analysis and scientific data management. This approach is demonstrated wit...
Conference Paper
Full-text available
Our work combines and extends techniques from high-performance scientific data management and visualization to enable scientific researchers to gain insight from extremely large, complex, time-varying laser wakefield particle accelerator simulation data. We extend histogram-based parallel coordinates for use in visual information display as well as...
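
The data reduction behind histogram-based parallel coordinates can be sketched roughly as follows (my illustration with made-up particle data and axis names; not the paper's implementation): each pair of adjacent axes is binned into a 2D histogram, so density rather than millions of individual polylines is rendered between the axes.

# Rough sketch of the reduction behind histogram-based parallel coordinates
# (made-up particle data and axis names; not the paper's implementation): bin each
# pair of adjacent axes into a 2D histogram and render density instead of polylines.
import numpy as np

rng = np.random.default_rng(3)
particles = rng.normal(size=(1_000_000, 4))          # hypothetical columns: x, px, y, py
axes = ["x", "px", "y", "py"]

edges = [np.linspace(col.min(), col.max(), 65) for col in particles.T]
pair_histograms = []
for i in range(len(axes) - 1):
    h, _, _ = np.histogram2d(particles[:, i], particles[:, i + 1],
                             bins=[edges[i], edges[i + 1]])
    pair_histograms.append(h)                        # 64x64 counts per adjacent axis pair

print([h.shape for h in pair_histograms], int(pair_histograms[0].sum()))
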
Article
The SciDAC Visualization and Analytics Center for Enabling Technologies (VACET) is a highly productive effort combining the forces of leading visualization researchers from five different institutions to solve some of the most challenging data understanding problems in modern science. The VACET technology portfolio is diverse, spanning all typical visu...
Article
To fully understand animal transcription networks, it is essential to accurately measure the spatial and temporal expression patterns of transcription factors and their targets. We describe a registration technique that takes image-based data from hundreds of Drosophila blastoderm embryos, each costained for a reference gene and one of a set of gen...
Article
Full-text available
Adaptive Mesh Refinement (AMR) is a highly effective computation method for simulations that span a large range of spatiotemporal scales, such as astrophysical simulations, which must accommodate ranges from interstellar to sub-planetary. Most mainstream visualization tools still lack support for AMR grids as a first class data type and AMR code te...
