Dr. Michael Kuhn
University of Hamburg | UHH · Department of Informatics

About

39 Publications
2,351 Reads
216 Citations
Additional affiliations
November 2009 - present
University of Hamburg
Position
  • Researcher

Publications (39)
Article
While compression can provide significant storage and cost savings, its use within HPC applications is often only of secondary concern. This is in part due to the inflexibility of existing approaches, where a single compression algorithm has to be used throughout the whole application, but also because insights into the behaviour of the algorithms wi...
Chapter
Message passing is the foremost parallelization method used in high-performance computing (HPC). Parallel programming in general, and message passing in particular, greatly increases the complexity of programs and their susceptibility to errors. The de facto standard technologies used to realize message passing applications in HPC are MPI with C/C++ or Fortra...
Conference Paper
Full-text available
In times of ever-increasing data sizes, data management and insightful analysis are amidst the most severe challenges of high-performance computing. While high-level libraries such as NetCDF, HDF5, and ADIOS2, as well as the associated self-describing data formats, offer convenient interfaces to complex data sets, they were built on outdated assump...
Article
Full-text available
Research into data reduction techniques has gained popularity in recent years as storage capacity and performance become a growing concern. This survey paper provides an overview of leveraging points found in high-performance computing (HPC) systems and suitable mechanisms to reduce data volumes. We present the underlying theories and their applica...
Presentation
Full-text available
The computational power and memory of a single CPU core are limited, and it becomes inevitable to make use of parallel computation techniques at some point. To utilise a node with multiple multi-core CPUs to its full capacity, a natural choice is OpenMP, which allows executing concurrent threads on shared memory. In this technical t...
Conference Paper
Power measurement is essential for power-/energy-aware performance optimizations in large-scale HPC systems and applications. A suitable power monitoring solution in such systems faces certain requirements. Although a variety of commercial power measurement devices already exist in the market, there is yet room for improvement. This paper enumerate...
Conference Paper
With the constantly increasing number of cores in high performance computing (HPC) systems, applications produce even more data that will eventually have to be stored and accessed in parallel. Applications’ I/O in HPC is performed in a layered manner; scientific applications use standardized high-level libraries and data formats like HDF5 and N...
Conference Paper
With the emergence of multi-core and multi-socket non-uniform memory access (NUMA) platforms in recent years, new software challenges have arisen to use them efficiently. In the field of high performance computing (HPC), parallel programming has always been the key factor to improve applications performance. However, the implications of parallel ar...
Article
Full-text available
The computational power and storage capability of supercomputers are growing at a different pace, with storage lagging behind; the widening gap necessitates new approaches to keep the investment and running costs for storage systems at bay. In this paper, we aim to unify previous models and compare different approaches for solving these problems. B...
Conference Paper
Full-text available
Both energy and storage are becoming key issues in high-performance computing (HPC) systems, especially when thinking about upcoming exascale systems. The amount of energy consumption and storage capacity needed to solve future problems is growing steeply, and the HPC community must face this in cost-/energy-efficient ways. In this paper we provide a po...
Conference Paper
File systems as well as I/O libraries offer interfaces which can be used to interact with them, albeit on different levels of abstraction. While an interface’s syntax simply describes the available operations, its semantics determine how these operations behave and which assumptions developers can make about them. There are several different interf...
Conference Paper
Deduplication is a storage saving technique that is highly successful in enterprise backup environments. On a file system, a single data block might be stored multiple times across different files, for example, multiple versions of a file might exist that are mostly identical. With deduplication, this data replication is localized and redundancy is...
Article
Intelligently switching energy-saving modes of CPUs, NICs and disks is mandatory to reduce energy consumption. Hardware and the operating system have only a limited perspective of future performance demands, so automatic control is suboptimal. However, it is tedious for developers to control the hardware themselves. In this paper we propose an extens...
Conference Paper
The performance of parallel distributed file systems suffers from many clients executing a large number of operations in parallel, because the I/O subsystem can be easily overwhelmed by the sheer amount of incoming I/O operations. Many optimizations exist that try to alleviate this problem. Client-side optimizations perform preprocessing to minimiz...
Conference Paper
The performance of parallel distributed file systems suffers from many clients executing a large number of operations in parallel, because the I/O subsystem can be easily overwhelmed by the sheer amount of incoming I/O operations. This, in turn, can slow down the whole distributed system. Many optimizations exist that try to alleviate this problem....
Article
Full-text available
In this paper, data life cycle management is extended by accounting for energy consumption during the life cycle of files. Information about the energy consumption of data not only allows accounting for the correct costs of its life cycle, but also provides feedback to the user and administrator and improves awareness of the energy consumptio...
Article
Modern file systems maintain extensive metadata about stored files. While metadata typically is useful, there are situations when the additional overhead of such a design becomes a problem in terms of performance. This is especially true for parallel and cluster file systems, where every metadata operation is even more expensive due to their archit...
Article
The performance of parallel cluster file systems suffers from many clients executing a large number of operations in parallel, because the I/O subsystem can be easily overwhelmed by the sheer amount of incoming I/O operations. This, in turn, can slow down the whole distributed system. Many optimizations exist that try to alleviate this problem. Cli...
Conference Paper
Modern file systems maintain extensive metadata about stored files. While this usually is useful, there are situations when the additional overhead of such a design becomes a problem in terms of performance. This is especially true for parallel and cluster file systems, because due to their design every metadata operation is even more expensive. In...
Conference Paper
With MPI-IO we see various alternatives for programming file I/O. The overall program performance depends on many different factors. A new trace analysis environment provides deeper insight into the client/server behavior and visualizes events of both process types. We investigate the influence of making independent vs. collective calls together wi...
Article
In today's file systems each file is made up of data and metadata. The metadata contains some information about the associated data, like ownership and permissions of the file. While this usually is useful, there are situations when the additional overhead of such a design becomes a problem in terms of performance. This is especially true for clust...
Conference Paper
Full-text available
Modern file systems maintain extensive metadata about stored files. While this usually is useful, there are situations when the additional overhead of such a design becomes a problem in terms of performance. This is especially true for parallel and cluster file systems, because due to their design every metadata operation is even more expensive. I...

Projects (3)
Project
www.cosemos.de
Project
The goal of the NESUS Action is to establish an open European research network targeting sustainable solutions for ultrascale computing, aiming at cross-fertilization among HPC, large-scale distributed systems, and big data management. The network will bring together disparate researchers working across different areas and provide a meeting ground for them to exchange ideas, identify synergies, and pursue common activities in research topics such as sustainable software solutions (applications and the system software stack), data management, energy efficiency, and resilience. Some of the most active research groups in the world in this area are members of this proposal. This Action will increase the value of these groups at the European level by reducing duplication of efforts and providing a more holistic view to all researchers; it will promote the leadership of Europe, and it will increase their impact on science, economy, and society.
Archived project
In the Exa2Green project, an interdisciplinary research team of HPC experts, computer scientists, mathematicians, physicists and engineers takes up the challenge to develop a radically new energy-aware computing paradigm and programming methodology for exascale computing. As a proof of concept, the online coupled model system COSMO-ART, based on the operational weather forecast model of the COSMO Consortium (www.cosmo-model.org), is being modified to incorporate energy-aware numerics. COSMO-ART was developed at KIT and allows the treatment of primary and secondary aerosols and their impact on radiation and clouds.