
Ralph Müller-Pfefferkorn
TU Dresden | TUD · Center for Information Services and High Performance Computing (ZIH)
Dr rer. nat.
About
542
Publications
153,321
Reads
9,542
Citations
Additional affiliations
October 2008 - present
Publications (542)
Typical Machine Learning (ML) approaches are characterized by their iterative and exploratory nature: continuously refining and adapting not only code but also ML models to optimize the results and the performance on new data. This poses novel challenges related to keeping the trained model Findable, Accessible, Interoperable and Reusable (FAIR), e...
Machine learning has transitioned from an individualistic approach to a collaborative one, enabling the collective effort to address increasingly complex challenges as they arise. One challenge that emerges is the management of a collaborative development process in machine learning projects. This paper outlines a collaborative environment KEENsi...
In recent years, one-sided communication has emerged as an alternative to message-based communication to improve the scalability of distributed programs. Decoupling communication and synchronization in such programs allows for more asynchronous execution of processes, but introduces new challenges to ensure program correctness and efficiency. The c...
Research data infrastructures are quickly evolving and show a wide variety, for instance in the way they address user requirements and use cases as well as how they provide user‐required information through their software architecture. In this article, we discuss challenges and provide approaches to developing software architectures and software co...
Data Management Plans (DMPs) are crucial for a structured research data management and often a mandatory part of research proposals. DMP tools support the development of DMPs. Among the variety of tools available, it can be difficult for researchers, data stewards and institutions to choose the one that is most appropriate for their specific needs...
A number of German federal states have established initiatives to support the institutionalization of RDM infrastructures and services. They can serve as intermediaries between disciplinary approaches to RDM like NFDI and RDM services in individual research institutions. This presentation gives an overview of the current state of these initiatives...
Geospatial data are fundamental in most global-change and sustainability-related domains. However, readily accessible information on data quality and provenance is often missing or hardly accessible for users due to technical or perceptual barriers, for example, due to unstructured metadata information or missing references. Within an interdiscipli...
Summarizing a key use case of a research workstream of the German publicly funded KEEN project, methods and tool chains are demonstrated to extract and to contextualize process data in an automated way based on engineering information. The contextualized process data serves as a high-quality data source for machine learning methods. The article cov...
Data collection in process industry yields a variety of data types. To maintain the knowledge about the data and to enable finding and reusing them (FAIR data), a common description with metadata is necessary. In the project KEEN, ProMetaS – the Process Engineering/Industry Metadata Schema – was developed. It defines various metadata categories adh...
The chemical industry is one of the key industrial sectors in Germany and at the same time one of the largest consumers of energy and raw materials. A successful energy transition and the development of a circular economy can only succeed if they are actively supported and shaped by the chemical industry – through the redesign of existing productio...
Metadata management is core to supporting discovery and reuse of data products, and to allowing for reproducibility of the research data in Earth System Sciences (ESS). Thus, ensuring acquisition and provision of meaningful and quality-assured metadata should become an integral part of data-driven ESS projects. We propose an open-source tool for the autom...
Nowadays, the daily work of many research communities is characterized by an increasing amount and complexity of data. This makes it increasingly difficult to manage, access and utilize the data to ultimately gain scientific insights based on it. At the same time, domain scientists want to focus on their science instead of IT. The solution is resea...
Considering the cost effectiveness, elasticity, and flexibility of virtualized cloud environments, porting HPC applications to those environments and executing them within these settings is becoming more and more popular. For this purpose, traditional HPC applications have to be redesigned as cloud applications. An analysis of the performance of th...
Ubiquitous access to existing scientific data is one of the key enablers for rapid development and progress of empirical and, without doubt, also non-empirical sciences. Management, in particular the long-term preservation of said data, presents a major challenge. Universities, scientific and cultural organisations, international collaborations and p...
Research data management is of the utmost importance in a world where research data is created with an ever increasing amount and rate and with a high variety across all scientific disciplines. This paper especially discusses software engineering challenges stemming from creating a long-living software system. It aims at providing a reference imple...
In this paper we address the problem of locating race conditions among synchronization primitives in execution traces of hybrid parallel programs. In hybrid parallel programs collective and point-to-point synchronization can’t be analyzed separately. We introduce a model for synchronization primitives and formally define synchronization races with...
This is a preprint of
BaBar collaboration,
A study of $B^{\pm} \rightarrow J/\psi\pi^{\pm}$ and $B^{\pm} \rightarrow J/\psi K^{\pm}$ decays: measurement of the ratio of branching fractions and search for direct $CP$-violating charge asymmetries
https://www.researchgate.net/publication/281076761_A_study_of_Bpm_rightarrow_Jpsipipm_and_Bpm_rightarrow...
One aspect of the so called Big Data challenge is the rising quantity of data in almost all scientific, social, governmental and commercial disciplines. As a result there are many ongoing developments of analysis techniques to substitute manual processes with automatic or semi-automatic algorithms. This means the knowledge of data analysts has to b...
The sheer volume of data accumulated in many scientific disciplines as well as in industry is a critical point that requires immediate attention. The handling of large data sets will become a limiting factor, even for data-intensive applications running on future Exascale systems. Nowadays, Big Data can be more a collection of challenges for data pr...
This work is on the Physics of the B Factories. Part A of this book contains a brief description of the SLAC and KEK B Factories as well as their detectors, BaBar and Belle, and data taking related issues. Part B discusses tools and methods used by the experiments in order to obtain results. The results themselves can be found in Part C. Please not...
The analysis and optimization of HPC I/O is a daunting task that is still unaddressed at large. The SIOX project aims to help HPC users and system administrators alike to improve the I/O performance of the applications by gaining awareness of the I/O operations taking place on the system and launching corrective measures when a problem is encounter...
State-of-the-art research in a variety of natural sciences depends heavily on methods of computational chemistry, for example, the calculation of the properties of materials, proteins, catalysts, and drugs. Applications providing such methods require a lot of expertise to handle their complexity and the usage of high-performance computing. The MoSG...
The MoSGrid portal offers an approach to carry out high-quality molecular simulations on distributed compute infrastructures to scientists with all kinds of background and experience levels. A user friendly web interface guarantees the ease-of-use of modern chemical simulation applications well established in the field. The usage of well-defined wo...
Big Data applications in science are producing huge amounts of data, which require advanced processing, handling, and analysis capabilities. For the organization of large scale data sets it is essential to annotate these with metadata, index them, and make them easily findable. In this paper we investigate two scientific use cases from biology and...
The BaBar detector operated successfully at the PEP-II asymmetric e⁺e⁻ collider at the SLAC National Accelerator Laboratory from 1999 to 2008. This report covers upgrades, operation, and performance of the collider and the detector systems, as well as the trigger, online and offline computing, and aspects of event reconstruction since the beginning...
This poster describes future scenarios of information infrastructures in science and other fields of research. The scenarios presented are based on practical experience resulting from interaction with research data in a research center and its library, and further enriched by the results of a baseline study of existing data repositories and data in...
Modern tools for computational chemistry allow the calculation of a wide range of properties of all sorts of molecules applying various levels of theory. But to perform convincing and significant calculations with these tools not only requires insight into the scientific theory itself, but also knowledge and experience on how to operate the simulat...
For resources to be shared, sites must be able to exchange basic accounting and usage data in a common format. This document describes a common format which enables the exchange of basic accounting and usage data from different resources. This record format is intended to facilitate the sharing of usage information, particularly in the area of the...
With the continuous growth of data generated in various scientific and commercial endeavors and the rising need for interdisciplinary studies and applications in e-Science, easy exchange of information and computation resources capable of processing large amounts of data to allow ad-hoc co-operation becomes ever more important. Unfortunately differe...
At the threshold to exascale computing, limitations of the MPI programming model become more and more pronounced. HPC programmers have to design codes that can run and scale on systems with hundreds of thousands of cores. Setting up accordingly many communication buffers, point-to-point communication links, and using bulk-synchronous communication...
In this paper we present the concept of a scalable job centric monitoring infrastructure. The overall performance of this distributed, layer based architecture called SLAte can be increased by installing additional servers to adapt to the demands of the monitored resources and users. Another important aspect is to offer a uniform global view on all dat...
Structural bioinformatics applies computational methods to analyze and model three-dimensional molecular structures. There is a huge number of applications available to work with structural data on large scale. Using these tools on distributed computing infrastructures (DCIs), however, is often complicated due to a lack of suitable interfaces. The...
Job centric monitoring makes it possible to observe jobs on remote computing resources. It may offer visualisation of recorded monitoring data and helps to find faulty or misbehaving jobs. If installations like grids or clouds are observed, monitoring data of many thousands of jobs have to be handled. The challenge of job centric monitoring infrastructures is...
The objective of the German BMBF research project Highly Efficient Implementation of CFD Codes for HPC Many-Core Architectures (HICFD) is to develop new methods and tools for the analysis and optimization of the performance of parallel computational fluid dynamics (CFD) codes on high performance computer systems with many-core processors. In the wo...
Motivation: Web-based access to computational chemistry grid resources has proven to be a viable approach to simplify the use of simulation codes. The introduction of recipes allows already developed chemical workflows to be reused. By this means, workflows for recurring basic compute jobs can be provided for daily services. Nevertheless, the same plat...
The MoSGrid Science Gateway provides convenient ways to analyse and model three-dimensional molecular structures using computational chemistry workflows. Molecular dynamics, quantum chemistry, and docking are the considered application domains. The science gateway including the data repository and underlying infrastructures was developed on top of t...
We present Scout, a configurable source-to-source transformation tool designed to automatically vectorize C source code. Scout provides the means to vectorize loops using SIMD instructions at source level. Our main focus during the development of Scout is a maximum flexibility of the tool in two ways: being capable of vectorizing a wide range of lo...
We report the result of a search for the rare decay B⁰→γγ in 426 fb⁻¹ of data, corresponding to 226×10⁶ B⁰B̄⁰ pairs, collected on the Υ(4S) resonance at the PEP-II asymmetric-energy e⁺e⁻ collider using the BABAR detector. We use a maximum likelihood fit to extract the signal yield and observe $21^{+13}_{-12}$ signal events with a statistical significance...
Structural Bioinformatics is concerned with computational methods for the analysis and modeling of three-dimensional molecular structures. There is a plethora of computational tools available to work with structural data on a large scale. Using these tools on distributed computing infrastructures (DCI), however, is often hampered by a lack of suita...
In biological research automatic high-throughput microscopes produce large amounts of pictures in a short time interval, which are to be analysed with image processing software. This results in thousands of computing jobs at once as the images can be processed in parallel. In this paper a workflow for such high throughput computing is presented....
The integrated Rule Orientated Data System (iRODS) is a Grid data management system that organizes distributed data and their metadata. A Rule Engine allows a flexible definition of data storage, data access and data processing. This paper presents scenarios implemented in a benchmark tool to measure the performance of an iRODS environment as well...
A Grid infrastructure intends to provide a huge amount of globally distributed computing resources. Since the end-users are often not experts in computer science, the processes of allocating resources or managing data are usually hidden in underlying Grid middleware or other abstraction layers. Nevertheless, the user wants to be informed about the jo...
The HICFD research project is a collaborative project within the programme "IKT 2020 – Forschung und Innovation", funded by the German Federal Ministry of Education and Research. The project aims to develop new methods and tools for analysing and optimising the performance of parallel fluid dynamics programs on high...
In the High Energy Physics Community Grid (HEPCG) of Germany’s D-Grid initiative, a suite of tools supporting the user in monitoring his jobs was developed. In the HEP community many users submit large numbers of jobs. A considerable fraction of these jobs fail for various reasons. Until now, it has been hard or even impossible for the user to find...
For data analysis or simulations (e.g. in particle physics) single users submit hundreds or thousands of jobs to the Grid. This puts a new burden on the user's side: keeping an overview of the status and performance of the jobs in a distributed environment. In this paper, a user- and job-centric monitoring system is presented. It collect...
To process large amounts of data, in some fields of science hundreds or thousands of single jobs are submitted into a Grid. Monitoring the enormous numbers of jobs and their resource usage in such environments (like the LCG/gLite middleware) effectively becomes an important issue for the users. Current tools in LCG / gLite provide only limited valu...
Open Grid Service Architecture - Data Access and Integration (OGSA-DAI) is a middleware which aims to provide a unique interface to heterogeneous database management systems and to special type of files like SwissProt files. It could become a vital tool for data integration in life sciences since the data is produced by different sources and residi...
This paper describes the ideas and developments of the project EP-CACHE. Within this project new methods and tools are developed to improve the analysis and the optimization of programs for cache architectures, especially for SMP clusters. The tool set comprises the semi-automatic instrumentation of user programs, the monitoring of the cache behavi...
We present a search for the decay $B^- \to \tau^- \bar{\nu}_\tau$ in a sample of 88.9 × 10⁶ BB̄ pairs recorded with the BABAR detector at the Stanford Linear Accelerator Center B factory. One of the two B mesons from the Υ(4S) is reconstructed in a hadronic or a semileptonic final state, and the decay products of the other B in the event are analyzed fo...
We present a search for the decays B⁰ → e⁺e⁻, B⁰ → μ⁺μ⁻, and B⁰ → e±μ∓ in data collected at the Υ(4S) resonance with the BABAR detector at the SLAC B Factory. Using a data set of 111 fb⁻¹, we find no evidence for a signal in any of the three channels investigated and set the following branching fraction upper limits...
We present a search for the decays B⁰ → e⁺e⁻, B⁰ → μ⁺μ⁻, and B⁰ → e±μ∓ in data collected at the Υ(4S) resonance with the BABAR detector at the SLAC B Factory. Using a data set of 111 fb⁻¹, we find no evidence for a signal in any of the three channels investigated and set the following branching fraction upper limits at the 9...
We study the decay B⁻→J/ψK⁻π⁺π⁻ using 117×10⁶ BB̄ events collected at the Υ(4S) resonance with the BABAR detector at the PEP-II e⁺e⁻ asymmetric-energy storage ring. We measure the branching fractions ℬ(B⁻→J/ψK⁻π⁺π⁻)=(116±7(stat.)±9(syst.))×10⁻⁵ and ℬ(B⁻→X(3872)K⁻)×ℬ(X(3872)→J/ψπ⁺π⁻)=(1.28±0.41)×10⁻⁵ and find the mass of the X(3872) to be 3873.4...
We present results on time-dependent CP asymmetries in neutral B decays to several CP eigenstates. The measurements use a data sample of about 227 × 10⁶ Υ(4S) → BB̄ decays collected by the BABAR detector at the PEP-II asymmetric-energy B Factory at SLAC. The amplitude of the CP asymmetry, sin2β in the standard model, is derived from decay-time distr...
B Aubert, R Barate, D Boutigny, [...], H Neal
We present results on time-dependent CP asymmetries in neutral B decays to several CP eigenstates. The measurements use a data sample of about 227 × 10⁶ Υ(4S) → BB̄ decays collected by the BABAR detector at the PEP-II asymmetric-energy B Factory at SLAC. The amplitude of the CP asymmetry, sin2β in the standard model, is derived from deca...
We search for the rare flavor-changing neutral-current decay B⁺ → K⁺νν̄ in a data sample of 82 fb⁻¹ collected with the BABAR detector at the PEP-II B-factory. Signal events are selected by examining the properties of the system recoiling against either a reconstructed hadronic or semileptonic charged-B decay. Using these two indepe...
We present a measurement of the Cabibbo-Kobayashi-Maskawa matrix element |Vcb| based on a sample of about 53 700 B̄⁰ → D*⁺ℓ⁻ν̄ℓ decays observed by the BABAR detector. We obtain the branching fraction averaged over ℓ = e, μ, ℬ(B̄⁰ → D*⁺ℓ⁻ν̄ℓ) = (4.90 ± 0.07(s...
B Aubert, R Barate, D Boutigny, [...], H Neal
We search for the rare flavor-changing neutral-current decay B⁺ → K⁺νν̄ in a data sample of 82 fb⁻¹ collected with the BABAR detector at the PEP-II B-factory. Signal events are selected by examining the properties of the system recoiling against either a reconstructed hadronic or semileptonic charged-B decay. Using these two independent sam...
We present a measurement of the Cabibbo-Kobayashi-Maskawa matrix element |Vcb| based on a sample of about 53 700 B̄⁰ → D*⁺ℓ⁻ν̄ℓ decays observed by the BABAR detector. We obtain the branching fraction averaged over ℓ = e, μ, ℬ(B̄⁰ → D*⁺ℓ⁻ν̄ℓ) = (4.90 ± 0.07(stat.) $^{+0.36}_{-0.35}$(syst.))%. We measure the differential decay rate as a function of w, the relativistic b...
We present results on B→J/ψKπ decays using e⁺e⁻ annihilation data collected with the BABAR detector at the Υ(4S) resonance. The detector is located at the PEP-II asymmetric-energy storage ring facility at the Stanford Linear Accelerator Center. Using approximately 88×10⁶ BB̄ pairs, we measure the decay amplitudes for the flavor eigenmodes and observ...
We describe searches for B meson decays to the charmless vector-vector final states ωK* and ωρ in 89×10⁶ BB̄ pairs produced in e⁺e⁻ annihilation at √s=10.58 GeV. We measure the following branching fractions in units of 10⁻⁶: ℬ(B⁰→ωK*⁰)=$3.4^{+1.8}_{-1.6}$±0.4 (<6.0), ℬ(B⁺→ωK*⁺)=$3.5^{+2.5}_{-2.0}$±0.7 (<7.4), ℬ(B⁰→ωρ⁰)=$0.6^{+1.3}_{-1.1}$±0.4 (<3.3), and ℬ(B⁺→ωρ⁺)=12.6-3.3+3...
We present measurements of the branching fraction and CP-violating asymmetries in the decay $B^0 \to f_0(980)K^0_S$. The results are obtained from a data sample of 123 × 10⁶ Υ(4S) → BB̄ decays. From a time-dependent maximum likelihood fit, we measure the branching fraction ℬ(B⁰ → f₀(980)(→ π⁺π⁻)K⁰) = (6.0 ± 0.9 ± 0.6 ± 1.2) × 10⁻⁶, the mixing-induc...
We search for the rare flavor-changing neutral-current decay B⁺ → K⁺νν̄ in a data sample of 82 fb⁻¹ collected with the BABAR detector at the PEP-II B-factory. Signal events are selected by examining the properties of the system recoiling against either a reconstructed hadronic or semileptonic charged-B decay. Using these two independent samples we...
We present a measurement of the Cabibbo-Kobayashi-Maskawa matrix element |Vcb| based on a sample of about 53 700 B̄⁰ → D*⁺ℓ⁻ν̄ℓ decays observed by the BABAR detector. We obtain the branching fraction averaged over ℓ = e, μ, ℬ(B̄⁰ → D*⁺ℓ⁻ν̄ℓ) = (4.90 ± 0.07(stat.) $^{+0.36}_{-0.35}$(syst.))%. We measure the differential decay rate as a function of w, the...
A search for the decays B → ρ(770)γ and B⁰ → ω(782)γ is performed on a sample of 211 × 10⁶ Υ(4S) → BB̄ events collected by the BABAR detector at the SLAC PEP-II asymmetric-energy e⁺e⁻ storage ring. No evidence for the decays is seen. We set the following limits on the individual branching fractions: ℬ(B⁺ → ρ⁺γ) < 1.8 × 10⁻⁶, ℬ(B⁰ → ρ⁰γ) < 0.4 ×...
We search for the factorization-suppressed decays $B\rightarrow \chi_{c0}K^{(*)}$ and $B\rightarrow \chi_{c2}K^{(*)}$, with $\chi_{c0}$ and $\chi_{c2}$ decaying into $J/\psi\gamma$ using a sample of $124 \times 10^6$ $B\overline{B}$ events collected with the BABAR detector at the PEP-II storage ring of the Stanford Linear Accelerator Center. We find no signific...
We search for a charged partner of the X(3872) in the decay B → X⁻K, X⁻ → J/ψπ⁻π⁰, using 234 million BB̄ events collected at the Υ(4S) resonance with the BaBar detector at the PEP-II e⁺e⁻ asymmetric-energy storage ring. The resulting product branching fraction upper limits are BR(B⁰ → X⁻K⁺, X⁻ → J/ψπ⁻π⁰) < 5.4 × 10⁻⁶ and B...