Hannes Weisbach’s research while affiliated with TUD Dresden University of Technology and other places


Publications (2)


Fig. 1 FFMK software architecture: compute processes with performance-critical parts of (MPI) runtime and communication driver execute directly on L4 microkernel; functionality that is not critical for performance is offloaded to the L4Linux kernel, which also hosts global platform management and fault-tolerance services
Fig. 2 OS noise during a run of the fixed work quantum (FWQ) benchmark on a node of a production HPC cluster with Linux-based vendor OS
Fig. 4 Minimal OS noise remaining in decoupled execution; L4Linux running on same socket
Fig. 5 BSP-style MPI-FWQ (StepSync mode) on L4Linux (Std) and with decoupled thread execution (DC) on Taurus. This figure was originally published in [33]
Fig. 6 Performance variation of FWQ and HPCCG on a dual-socket Intel E5-2650 v2


FFMK: A Fast and Fault-Tolerant Microkernel-Based System for Exascale Computing
  • Chapter

July 2020

Lecture Notes in Computational Science and Engineering

Carsten Weinhold · Jan Bierbaum · [...]

The FFMK project designs, builds, and evaluates a system-software architecture to address the challenges expected in Exascale systems. In particular, these challenges include performance losses caused by the much larger impact of runtime variability within applications, hardware, and the operating system (OS), as well as increased vulnerability to failures. The FFMK OS platform is built upon a multi-kernel architecture, which combines the L4Re microkernel and a virtualized Linux kernel into a noise-free, yet feature-rich execution environment. It further includes global, distributed platform management and system-level optimization services that transparently minimize checkpoint/restart overhead for applications. The project also researched algorithms to make collective operations fault-tolerant in the presence of failing nodes. In this paper, we describe the basic components, algorithms, and services we developed in Phase 2 of the project.


Citations (1)


... The actual speedup that can be observed in such a scenario depends on a spectrum of code properties, such as decomposition strategies, sparse matrix structures, block vector sizes, communication concurrency, and the performance characteristics of back-to-back loops, which can all influence resource utilization [5,8]. These prior studies show that bottleneck evasion via asynchronicity can be regarded as a performance optimization technique, complementing traditional techniques such as explicitly asynchronous communication, noise mitigation, MPI process placement, dynamic load balancing, synchronization of operating system (OS) influence, lightweight OS kernels, etc. [22,9,18,27]. ...

Reference:

Making Applications Faster by Asynchronous Execution: Slowing Down Processes or Relaxing MPI Collectives
Hardware Performance Variation: A Comparative Study Using Lightweight Kernels
  • Citing Chapter
  • January 2018

Lecture Notes in Computer Science