Achim Streit
Karlsruhe Institute of Technology | KIT · Scientific Computing Center

About

212 Publications · 33,628 Reads · 2,761 Citations

Publications (212)
Preprint
Full-text available
The current landscape in time-series forecasting is dominated by Transformer-based models. Their high parameter count and corresponding demand in computational resources pose a challenge to real-world deployment, especially for commercial and scientific applications with low-power embedded devices. Pruning is an established approach to reduce neura...
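The pruning idea above can be made concrete with a minimal sketch of unstructured magnitude pruning, a common baseline for shrinking trained networks before deployment on low-power devices. This is an illustrative assumption of the general technique, not the preprint's specific method; all names are hypothetical.

import numpy as np

def magnitude_prune(weights, sparsity):
    # Zero out (at least) the fraction `sparsity` of smallest-magnitude
    # weights; ties at the threshold are pruned as well.
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]
    return weights * (np.abs(weights) > threshold)

W = np.random.default_rng(0).standard_normal((4, 4))
print(magnitude_prune(W, 0.5))  # about half the entries set to zero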
Article
Full-text available
Data-driven medium-range weather forecasts have recently outperformed classical numerical weather prediction models, with Pangu-Weather (PGW) being the first breakthrough model to achieve this. The Transformer-based PGW introduced novel architectural components including the three-dimensional attention mechanism (3D Transformer) in the Transformer...
Preprint
The FAIR principles are globally accepted guidelines for improved data management practices with the potential to align data spaces on a global scale. In practice, this is only marginally achieved through the different ways in which organizations interpret and implement these principles. The concept of FAIR Digital Objects provides a way to realize...
Preprint
Full-text available
The gradients used to train neural networks are typically computed using backpropagation. While an efficient way to obtain exact gradients, backpropagation is computationally expensive, hinders parallelization, and is biologically implausible. Forward gradients are an approach to approximate the gradients from directional derivatives along random t...
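As a sanity check on the idea, here is a toy sketch of a forward gradient: projecting the gradient onto a random tangent direction v yields (grad f . v) v, whose expectation over v ~ N(0, I) equals the exact gradient. The quadratic objective and all names are illustrative assumptions, not the preprint's setup.

import numpy as np

def f_grad(w):
    # Exact gradient of f(w) = 0.5 * ||w||^2, used here to form the
    # directional derivative; in practice a forward-mode JVP provides
    # it without backpropagation.
    return w

def forward_gradient(w, rng):
    v = rng.standard_normal(w.shape)   # random tangent direction
    directional = f_grad(w) @ v        # directional derivative along v
    return directional * v             # unbiased estimate of the gradient

rng = np.random.default_rng(0)
w = rng.standard_normal(5)
estimate = np.mean([forward_gradient(w, rng) for _ in range(20000)], axis=0)
print(np.allclose(estimate, f_grad(w), atol=0.1))  # True: average converges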
Chapter
This study explores the learning dynamics of neural networks by analyzing the singular value decomposition (SVD) of their weights throughout training. Our investigation reveals that an orthogonal basis within each multidimensional weight’s SVD representation stabilizes during training. Building upon this, we introduce Orthogonality-Informed Adaptiv...
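To make the stability claim tangible, the small sketch below compares the top-k left singular bases of a weight matrix between two checkpoints via principal angles; values near 1 mean the orthogonal basis has stabilized. This illustrates the diagnostic idea only; the chapter's exact criterion and the adaptive method built on it are not reproduced here.

import numpy as np

def subspace_overlap(W_old, W_new, k):
    # Cosines of the principal angles between the two top-k left singular
    # subspaces are the singular values of U_old^T U_new.
    U_old, _, _ = np.linalg.svd(W_old, full_matrices=False)
    U_new, _, _ = np.linalg.svd(W_new, full_matrices=False)
    cosines = np.linalg.svd(U_old[:, :k].T @ U_new[:, :k], compute_uv=False)
    return cosines.mean()   # ~1.0 when the basis is stable

rng = np.random.default_rng(0)
W = rng.standard_normal((32, 16))
W_later = W + 0.01 * rng.standard_normal((32, 16))  # small training update
print(subspace_overlap(W, W_later, k=4))            # close to 1.0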
Preprint
Full-text available
Data-driven medium-range weather forecasts have recently outperformed classical numerical weather prediction models, with Pangu-Weather (PGW) being the first breakthrough model to achieve this. The Transformer-based PGW introduced novel architectural components including the three-dimensional attention mechanism (3D-Transformer) in the Transformer...
Article
Full-text available
Clustering in data mining involves grouping similar objects into categories based on their characteristics. As the volume of data continues to grow and advancements in high-performance computing evolve, a critical need has emerged for algorithms that can efficiently process these computations and exploit the various levels of parallelism offered by...
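For readers unfamiliar with the setting, the sketch below shows the classic k-means loop that such parallel clustering work builds on: assign points to the nearest center, then recompute centers. This sequential toy is only for orientation; the article's contribution concerns efficient parallel/HPC execution, which is not shown here.

import numpy as np

def kmeans(X, k, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assignment step: nearest center per point
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # update step: each center becomes the mean of its points
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
labels, centers = kmeans(X, 2)
print(centers)   # one center near (0, 0), one near (5, 5)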
Article
The rise of artificial intelligence (AI) has relied on an increasing demand for energy, which threatens to outweigh its promised positive effects. To steer AI onto a more sustainable path, quantifying and comparing its energy consumption is key.
Article
Full-text available
In the last decade, deep learning (DL) has significantly impacted industry and science. Initially largely motivated by computer vision tasks in 2-D imagery, the focus has shifted toward 3-D data analysis. In particular, 3-D surface reconstruction, i.e., reconstructing a 3-D shape from sparse input, is of great interest to a large variety of applica...
Article
Full-text available
On the path to full understanding of the structure-function relationship or even design of RNA, structure prediction would offer an intriguing complement to experimental efforts. Any deep learning on RNA structure, however, is hampered by the sparsity of labeled training data. Utilizing the limited data available, we here focus on predicting spatia...
Chapter
Looking closely at the Top500 list of high-performance computers (HPC) in the world, it becomes clear that computing power is not the only number that has been growing in the last three decades. The amount of power required to operate such massive computing machines has been steadily increasing, earning HPC users a higher than usual carbon footprin...
Article
Full-text available
Thermal Bridges on Building Rooftops (TBBR) is a multi-channel remote sensing dataset. It was recorded during six separate UAV fly-overs of the city center of Karlsruhe, Germany, and comprises a total of 926 high-resolution images with 6927 manually provided thermal bridge annotations. Each image provides five channels: three color, one thermograph...
Chapter
Full-text available
We present Propulate, an evolutionary optimization algorithm and software package for global optimization and in particular hyperparameter search. For efficient use of HPC resources, Propulate omits the synchronization after each generation as done in conventional genetic algorithms. Instead, it steers the search with the complete population present at time of breedi...
Preprint
Full-text available
Backpropagation has long been criticized for being biologically implausible, relying on concepts that are not viable in natural learning processes. This paper proposes an alternative approach to solve two core issues, i.e., weight transport and update locking, for biological plausibility and computational efficiency. We introduce Feed-Forward with...
Preprint
Full-text available
We present Propulate, an evolutionary optimization algorithm and software package for global optimization and in particular hyperparameter search. For efficient use of HPC resources, Propulate omits the synchronization after each generation as done in conventional genetic algorithms. Instead, it steers the search with the complete population presen...
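A toy sketch of the generation-free idea described above: each breeding step samples parents from whatever population currently exists, with no per-generation synchronization barrier. This is plain single-process Python for illustration only; it is not Propulate's actual API, and the objective, operators, and constants are all assumptions.

import random

def fitness(x):                       # hypothetical 1-D objective
    return -(x - 3.0) ** 2

population = [(x, fitness(x)) for x in (random.uniform(-10, 10) for _ in range(4))]

for _ in range(300):                  # each step ~ one asynchronous breeding
    p1, p2 = random.sample(population, 2)
    child = 0.5 * (p1[0] + p2[0]) + random.gauss(0, 0.5)  # crossover + mutation
    population.append((child, fitness(child)))
    # keep only the fittest individuals; no generation barrier anywhere
    population = sorted(population, key=lambda ind: ind[1], reverse=True)[:20]

print(population[0])                  # x should be close to 3.0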
Preprint
Full-text available
On the path to full understanding of the structure-function relationship or even design of RNA, structure prediction would offer an intriguing complement to experimental efforts. Any deep learning on RNA structure, however, is hampered by the sparsity of labeled training data. Utilizing the limited data available, we here focus on predicting spatia...
Article
Full-text available
In this work, we present a neural approach to reconstructing rooted tree graphs describing hierarchical interactions, using a novel representation we term the Lowest Common Ancestor Generations (LCAG) matrix. This compact formulation is equivalent to the adjacency matrix, but enables learning a tree’s structure from its leaves alone without the pri...
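To illustrate the representation, the small sketch below builds an LCAG-style matrix for a toy tree: entry (i, j) counts the generations from leaf i up to the lowest common ancestor of leaves i and j. The counting convention and all names here are assumptions for illustration; the article fixes the exact definition.

def ancestor_path(node, parent):
    # Path from a node up to the root, inclusive.
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def lcag(parent, leaves):
    n = len(leaves)
    M = [[0] * n for _ in range(n)]
    for i in range(n):
        pi = ancestor_path(leaves[i], parent)
        for j in range(n):
            pj = set(ancestor_path(leaves[j], parent))
            # first ancestor of leaf i that is also an ancestor of leaf j
            M[i][j] = next(g for g, a in enumerate(pi) if a in pj)
    return M

parent = {"a": "x", "b": "x", "x": "r", "c": "r"}   # r -> (x, c), x -> (a, b)
print(lcag(parent, ["a", "b", "c"]))                # [[0,1,2],[1,0,2],[2,2,0]]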
Chapter
Cluster analysis helps to better understand the inherent structure of data by grouping similar data points together. Typically this similarity is expressed in terms of distance between data points, either in full value space or value subspaces or in terms of correlations among attributes. However, distance-based clustering algorithms suffer the cur...
Preprint
Full-text available
In this work, we present a neural approach to reconstructing rooted tree graphs describing hierarchical interactions, using a novel representation we term the Lowest Common Ancestor Generations (LCAG) matrix. This compact formulation is equivalent to the adjacency matrix, but enables learning a tree's structure from its leaves alone without the pri...
Preprint
Full-text available
As with any physical instrument, hyperspectral cameras induce different kinds of noise in the acquired data. Therefore, hyperspectral denoising is a crucial step for analyzing hyperspectral images (HSIs). Conventional computational methods rarely use GPUs to improve efficiency and are not fully open-source. Alternatively, deep learning-based method...
Article
Full-text available
With increasing data and model complexities, the time required to train neural networks has become prohibitively large. To address the exponential rise in training time, users are turning to data parallel neural networks (DPNN) to utilize large-scale distributed resources on computer clusters. Current DPNN approaches implement the network parameter...
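The core data-parallel step is easy to simulate in-process: each worker computes a gradient on its own data shard, the gradients are averaged (the allreduce), and every replica applies the same update. The linear-regression toy below is an illustrative assumption; real DPNN frameworks perform the averaging over MPI/NCCL across nodes.

import numpy as np

rng = np.random.default_rng(42)
w = np.zeros(3)                            # replicated model weights
X = rng.standard_normal((64, 3))           # full dataset
y = X @ np.array([1.0, -2.0, 0.5])         # linear targets
shards = np.array_split(np.arange(64), 4)  # one shard per "worker"

for _ in range(500):
    grads = []
    for idx in shards:                     # local gradient on each shard
        err = X[idx] @ w - y[idx]
        grads.append(X[idx].T @ err / len(idx))
    g = np.mean(grads, axis=0)             # the allreduce: average gradients
    w -= 0.1 * g                           # identical update on every replica
print(w)                                   # approaches [1.0, -2.0, 0.5]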
Chapter
For more than two decades, researchers have been developing methods to predict HPC job run times. The methods vary from simple rule-based solutions to modern methods using machine and deep learning libraries. These studies are often developed for scientific publications and the sustainability after publication is often neglected. It is also often d...
Chapter
Different moisture processes in the atmosphere leave distinctive isotopologue fingerprints. Therefore, the paired analysis of water vapour and the ratio between different isotopologues, for example \(\{H_2O,\delta D\}\) with \(\delta D\) as the standardized \(HDO/H_2O\) isotopologue ratio, can be used to investigate these processes. In this paper,...
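For reference, assuming the usual delta-notation convention against the VSMOW standard (Vienna Standard Mean Ocean Water), the standardized ratio mentioned above is

\[ \delta D = \left( \frac{(HDO/H_2O)_{\mathrm{sample}}}{(HDO/H_2O)_{\mathrm{VSMOW}}} - 1 \right) \times 1000\,\text{‰} \]

so \(\delta D\) expresses the per-mil deviation of a sample's \(HDO/H_2O\) ratio from the reference ocean-water ratio.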
Preprint
Full-text available
With increasing data and model complexities, the time required to train neural networks has become prohibitively large. To address the exponential rise in training time, users are turning to data parallel neural networks (DPNN) to utilize large-scale distributed resources on computer clusters. Current DPNN approaches implement the network parameter...
Conference Paper
To cope with the rapid growth in available data, the efficiency of data analysis and machine learning libraries has recently received increased attention. Although great advancements have been made in traditional array-based computations, most are limited by the resources available on a single computation node. Consequently, novel approaches must be m...
Chapter
Modeling of hydrological systems and their dynamics in high spatio-temporal resolution leads to a better understanding of the hydrological cycle, thus it reduces the uncertainties in hydrologic forecasts. Simulation of such high-resolution, distributed and physically based models demands high performance computing resources. However, the availabili...
Preprint
Full-text available
In order to cope with the exponential growth in available data, the efficiency of data analysis and machine learning libraries has recently received increased attention. Although corresponding array-based numerical kernels have been significantly improved, most are limited by the resources available on a single computational node. Consequently, ke...
Chapter
Full-text available
Today’s High-Performance Computing (HPC) environments increasingly have to manage relatively new access patterns (e.g., large numbers of metadata operations) which general-purpose parallel file systems (PFS) were not optimized for. Burst-buffer file systems aim to solve that challenge by spanning an ad hoc file system across node-local flash storag...
Chapter
Accurate water-related predictions and decision-making require a simulation of hydrological systems in high spatio-temporal resolution. However, the simulation of such a large-scale dynamical system is compute-intensive. One approach to circumvent this issue is to use landscape properties to reduce model redundancies and computation complexities....
Conference Paper
Today's HPC systems experience steadily increasing problems with the storage I/O bottleneck. At the same time, new storage technologies are emerging in the compute nodes of HPC systems. There are many ideas and approaches for how compute-node-local storage can be made usable for HPC systems. One consideration is to copy job data to the compute-node loc...
Preprint
Full-text available
A data life cycle (DLC) is a high-level data processing pipeline that involves data acquisition, event reconstruction, data analysis, publication, archiving, and sharing. For astroparticle physics, a DLC is particularly important due to the geographical and content diversity of the research field. A dedicated and experiment-spanning analysis and dat...
Chapter
For efficient utilization of large-scale HPC systems, the task of resource management and job scheduling is of highest priority. Therefore, modern job scheduling systems require information about the estimated total wall time of the jobs already at submission time. Proper wall time estimates are key to reliable scheduling decisions. Typically, u...
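A minimal sketch of the kind of history-based estimator such scheduling work builds on: predict a job's wall time from the user's recent runs, padded by a safety factor. The function name, window size, and factor are illustrative assumptions, not the chapter's method.

from statistics import median

def estimate_walltime(past_runtimes_s, safety=1.2, default_s=3600):
    # Fall back to a default when the user has no job history.
    if not past_runtimes_s:
        return default_s
    # Median of the last 10 runs, padded to reduce underestimation,
    # since jobs exceeding their estimate are typically killed.
    return safety * median(past_runtimes_s[-10:])

print(estimate_walltime([1200, 1500, 1350]))   # 1620.0 seconds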
Chapter
Market-oriented resource allocation in cloud computing is driven by increasingly stringent needs for flexibility, fine-grained allocation, and more critically, revenue maximization. Double combinatorial auctions aptly address these demands, but their \(\mathcal {NP}\)-hardness has hindered them from being widely adopted. Heuristic algorithms, with...
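To make the winner-determination problem concrete, here is a generic greedy heuristic of the kind such work benchmarks against: rank bids by price per requested item and accept any bid whose bundle is still fully available. This single-sided simplification is an illustrative assumption; a double combinatorial auction additionally matches buy and sell sides, which is not shown here.

def greedy_winners(bids):
    # bids: list of (price, bundle) with bundle a set of item identifiers.
    ranked = sorted(bids, key=lambda b: b[0] / len(b[1]), reverse=True)
    allocated, winners, revenue = set(), [], 0.0
    for price, bundle in ranked:
        if allocated.isdisjoint(bundle):   # bundle still fully available
            allocated |= bundle
            winners.append((price, bundle))
            revenue += price
    return winners, revenue

bids = [(10.0, {"a", "b"}), (6.0, {"b"}), (5.0, {"c"})]
print(greedy_winners(bids))   # accepts (6, {b}) and (5, {c}); revenue 11.0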
Preprint
Full-text available
Modern large-scale astroparticle setups measure high-energy particles, gamma rays, neutrinos, radio waves, and the recently discovered gravitational waves. Ongoing and future experiments are located worldwide. The data acquired have different formats, storage concepts, and publication policies. Such differences are a crucial point in the era of Big...
Article
Full-text available
Modern large-scale astroparticle setups measure high-energy particles, gamma rays, neutrinos, radio waves, and the recently discovered gravitational waves. Ongoing and future experiments are located worldwide. The data acquired have different formats, storage concepts, and publication policies. Such differences are a crucial point in the era of Big...
Conference Paper
Full-text available
Through the introduction of next-generation models, the climate sciences have experienced a breakthrough in high-resolution simulations. In the past, the bottleneck was the numerical complexity of the models; nowadays it is the required storage space for the model output. One way to tackle the data storage challenge is through data compression. In t...
Preprint
Full-text available
Modern experimental astroparticle physics features large-scale setups measuring different messengers, namely high-energy particles generated by cosmic accelerators (e.g. supernova remnants, active galactic nuclei, etc): cosmic and gamma rays, neutrinos and recently discovered gravitational waves. Ongoing and future experiments are distributed over...
Article
Applications for biomedical data processing often integrate external libraries and frameworks for common algorithmic tasks. This typically reduces development time and increases overall code quality. With the introduction of lightweight container-based virtualization, the bundling of applications and their required dependencies has become feasible, a...
Article
Full-text available
There is a progressive digitization in many medical fields, such as digital microscopy, which leads to an increase in data volume and processing demands for the underlying computing infrastructure. This paper explores scaling behaviours of a Ki-67 analysis application, which processes medical image tiles originating from a WSI (Whole Slide Image)...
Article
Full-text available
Over the past several years, rapid growth of data has affected many fields of science. This has often resulted in the need for overhauling or exchanging the tools and approaches in the disciplines' data life cycles. However, this allows the application of new data analysis methods and facilitates improved data sharing. The project Large-Scale Data...
Article
Full-text available
New imaging techniques enable visualizing and analyzing a multitude of unknown phenomena in many areas of science at high spatio-temporal resolution. The rapidly growing amount of image data, however, can hardly be analyzed manually and, thus, future research has to focus on automated image analysis methods that allow one to reliably extract the de...
Conference Paper
Building an Exascale computer that solves scientific problems three orders of magnitude faster than the current Petascale systems is harder than just making it huge. On the path towards the first Exascale computer, energy consumption has emerged as a crucial factor. Every component will have to change to create an Exascale system capable of a mil...
Conference Paper
A data center is often also a Cloud center, which delivers its computational and storage capacity as services. To enable on-demand resource provision with elasticity and high reliability, the host machines in data centers are usually virtualized, which brings a challenging research topic, i.e., how to schedule the virtual machines (VMs) on the hosts...
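As a baseline for the scheduling problem just described, the sketch below places VMs on hosts with a first-fit heuristic under a single CPU-capacity dimension. This is a generic illustration under assumed names and a one-resource model, not the paper's algorithm.

def first_fit(vm_demands, host_capacity, n_hosts):
    loads = [0.0] * n_hosts
    placement = {}
    for vm, demand in vm_demands.items():
        for h in range(n_hosts):
            if loads[h] + demand <= host_capacity:
                loads[h] += demand      # place VM on first host with room
                placement[vm] = h
                break
        else:
            placement[vm] = None        # no host can accommodate this VM
    return placement

print(first_fit({"vm1": 0.5, "vm2": 0.7, "vm3": 0.4}, host_capacity=1.0, n_hosts=2))
# {'vm1': 0, 'vm2': 1, 'vm3': 0}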
Article
Full-text available
Modern science is most often driven by data. Improvements in state-of-the-art technologies and methods in many scientific disciplines lead not only to increasing data rates, but also to the need to improve or even completely overhaul their data life cycle management. Communities usually face two kinds of challenges: generic ones like federated auth...
Chapter
The Simulation Laboratory Elementary Particle and Astroparticle Physics (SimLab E&A Particle) is one of the new support instruments recently established at the Steinbuch Centre for Computing (SCC) and Jülich Supercomputing Centre (JSC), providing high-level support to supercomputer users. A Simulation Laboratory (SimLab) is a community-oriented researc...
Article
We present a distributed system for storage, processing, three-dimensional visualisation and basic analysis of data from Earth-observing satellites. The database and the server have been designed for high performance and scalability, whereas the client is highly portable thanks to having been designed as an HTML5- and WebGL-based Web application. Th...
Conference Paper
Considering the wide usage of databases and their ever-growing size, it is crucial to improve query processing performance. Selection of an appropriate set of indexes for the workload processed by the database system is an important part of physical design and performance tuning. This selection is a non-trivial task, especially considering pos...
Conference Paper
With the number of different cloud providers growing, the importance of reliable and up-to-date performance data increases in order to make qualified decisions about which providers to use for different tasks. To cope with this issue, this paper proposes a framework usable for distributed and self-organized continuous benchmarking of hybrid and heteroge...