Figure 8 - uploaded by Vassil G Vassilev
ROOT core.

Source publication
Conference Paper
Cling is an interactive C++ interpreter, built on top of Clang and LLVM compiler infrastructure. Like its predecessor Cint, Cling realizes the read-print-evaluate-loop concept, in order to leverage rapid application development. Implemented as a small extension to LLVM and Clang, the interpreter reuses their strengths such as the praised concise an...

Context in source publication

Context 1
... is the ability to examine and modify the behaviour of an object at runtime. Many core features from processing signal/slots to pyROOT [15] are implemented using the technique (Figure 8). ...

Citations

... During our initial computational experiments, we discovered a bottleneck in the current BioDynaMo version. Most of the execution time was spent starting the BioDynaMo simulation engine, more precisely, initializing the C++ interpreter cling (Vasilev et al. 2012). Initialization became the dominant factor since the simulation of a single neuron is very fast compared to the extensive simulations with billions of agents that BioDynaMo supports. ...
Article
Understanding how genetically encoded rules drive and guide complex neuronal growth processes is essential to comprehending the brain’s architecture, and agent-based models (ABMs) offer a powerful simulation approach to further develop this understanding. However, accurately calibrating these models remains a challenge. Here, we present a novel application of Approximate Bayesian Computation (ABC) to address this issue. ABMs are based on parametrized stochastic rules that describe the time evolution of small components–the so-called agents–discretizing the system, leading to stochastic simulations that require appropriate treatment. Mathematically, the calibration defines a stochastic inverse problem. We propose to address it in a Bayesian setting using ABC. We facilitate the repeated comparison between data and simulations by quantifying the morphological information of single neurons with so-called morphometrics and resort to statistical distances to measure discrepancies between populations thereof. We conduct experiments on synthetic as well as experimental data. We find that ABC utilizing Sequential Monte Carlo sampling and the Wasserstein distance finds accurate posterior parameter distributions for representative ABMs. We further demonstrate that these ABMs capture specific features of pyramidal cells of the hippocampus (CA1). Overall, this work establishes a robust framework for calibrating agent-based neuronal growth models and opens the door for future investigations using Bayesian techniques for model building, verification, and adequacy assessment.
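The calibration idea in this abstract can be illustrated with a minimal sketch. It is not the authors' method: it uses plain rejection ABC instead of Sequential Monte Carlo, and `simulate` is a hypothetical stand-in (exponential samples) for an agent-based model and its morphometrics. All function names, the prior range, and the tolerance are assumptions made for illustration.

```python
import random


def wasserstein_1d(a, b):
    """1-Wasserstein distance between two equal-size one-dimensional samples.

    For equal-size empirical samples this is the mean absolute difference
    of the sorted values.
    """
    if len(a) != len(b):
        raise ValueError("samples must have the same size")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def simulate(rate, n, rng):
    # Hypothetical stand-in for a stochastic model: n exponential draws
    # playing the role of one morphometric measured on n simulated neurons.
    return [rng.expovariate(rate) for _ in range(n)]


def abc_rejection(observed, prior_low, prior_high, n_draws, tolerance, rng):
    """Keep prior draws whose simulated data lie within `tolerance` of the data."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(prior_low, prior_high)  # sample from a uniform prior
        simulated = simulate(theta, len(observed), rng)
        if wasserstein_1d(observed, simulated) <= tolerance:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
observed = simulate(2.0, 200, rng)  # synthetic "data" with a known rate
posterior = abc_rejection(observed, 0.5, 5.0, 2000, 0.1, rng)
```

The accepted values in `posterior` approximate the posterior distribution of the rate; shrinking the tolerance sharpens the approximation at the cost of fewer accepted draws.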
... Despite ongoing efforts to enhance Python's efficiency through various implementations (e.g. Numba, Pyston, Cinder, PyPy, Pyjion, Nuitka, Shed Skin, etc.) [16], [17], [18], [19], [20], [21], and to make C++ more interactive [22], [23], the reality is that achieving a blend of interactivity and performance requires intentional and often divergent design considerations. A particular challenge with Python lies in its reliance on the CPython C API for most of its ecosystem. ...
Preprint
Robotics programming typically involves a trade-off between the ease of use offered by Python and the run-time performance of C++. While multi-language architectures address this trade-off by coupling Python's ergonomics with C++'s speed, they introduce complexity at the language interface. This paper proposes using Julia for performance-critical tasks within Python ROS 2 applications, providing an elegant solution that streamlines the development process without disrupting the existing Python workflow.
... Clad's proximity to the compiler allows for more control over the derivative synthesis, allowing us to also insert RooFit-specific optimizations while building the derivative code. Moreover, Clad has good support for modern C++ constructs and has off-the-shelf integration with Cling [8] (the C++ interpreter used in ROOT), making it an ideal choice for our work. ...
Article
With the growing datasets of current and next-generation High-Energy and Nuclear Physics (HEP/NP) experiments, statistical analysis has become more computationally demanding. These increasing demands elicit improvements and modernizations in existing statistical analysis software. One way to address these issues is to improve parameter estimation performance and numeric stability using Automatic Differentiation (AD). AD’s computational efficiency and accuracy are superior to the preexisting numerical differentiation techniques, and it offers significant performance gains when calculating the derivatives of functions with a large number of inputs, making it particularly appealing for statistical models with many parameters. For such models, many HEP/NP experiments use RooFit, a toolkit for statistical modeling and fitting that is part of ROOT. In this paper, we report on the effort to support the AD of RooFit likelihood functions. Our approach is to extend RooFit with a tool that generates overhead-free C++ code for a full likelihood function built from RooFit functional models. Gradients are then generated using Clad, a compiler-based source-code-transformation AD tool, using this C++ code. We present our results from applying AD to the entire minimization pipeline and profile likelihood calculations of several RooFit and HistFactory models at the LHC-experiment scale. We show significant reductions in calculation time and memory usage for the minimization of such likelihood functions. We also elaborate on this approach’s current limitations and explain our plans for the future.
... The need to reconcile high performance with fast development has led to the development of a C++ interpreter [8] that provides the convenience of a read-eval-print-loop (REPL) interactive experience, also known as programming shell, that supports just-in-time compilation, and allows the use of the same programming language for compiled and interpreted code. The same analysis framework ROOT [9,10] can then be used with compiled code and interactively. ...
Article
Research in high energy physics (HEP) requires huge amounts of computing and storage, putting strong constraints on the code speed and resource usage. To meet these requirements, a compiled high-performance language is typically used; while for physicists, who focus on the application when developing the code, better research productivity pleads for a high-level programming language. A popular approach consists of combining Python, used for the high-level interface, and C++, used for the computing intensive part of the code. A more convenient and efficient approach would be to use a language that provides both high-level programming and high-performance. The Julia programming language, developed at MIT especially to allow the use of a single language in research activities, has followed this path. In this paper the applicability of using the Julia language for HEP research is explored, covering the different aspects that are important for HEP code development: runtime performance, handling of large projects, interface with legacy code, distributed computing, training, and ease of programming. The study shows that the HEP community would benefit from a large scale adoption of this programming language. The HEP-specific foundation libraries that would need to be consolidated are identified.
... The need to reconcile high performance with fast development has led to the development of a C++ interpreter [8] that provides the convenience of a read-eval-print-loop (REPL) interactive experience, also known as programming shell, that supports just-in-time compilation, and allows the use of the same programming language for compiled and interpreted code. The same analysis framework ROOT [9,10] can then be used with compiled code and interactively. ...
Preprint
Research in high energy physics (HEP) requires huge amounts of computing and storage, putting strong constraints on the code speed and resource usage. To meet these requirements, a compiled high-performance language is typically used; while for physicists, who focus on the application when developing the code, better research productivity pleads for a high-level programming language. A popular approach consists of combining Python, used for the high-level interface, and C++, used for the computing intensive part of the code. A more convenient and efficient approach would be to use a language that provides both high-level programming and high-performance. The Julia programming language, developed at MIT especially to allow the use of a single language in research activities, has followed this path. In this paper the applicability of using the Julia language for HEP research is explored, covering the different aspects that are important for HEP code development: runtime performance, handling of large projects, interface with legacy code, distributed computing, training, and ease of programming. The study shows that the HEP community would benefit from a large scale adoption of this programming language. The HEP-specific foundation libraries that would need to be consolidated are identified.
... A detailed example for the construction of a filter based on this approach can be found in a Jupyter notebook using the Cling-Kernel [33] in [4]. This filter differs from the semi-static filters (called stage A) in [30], which do not guarantee valid results in cases of underflow. ...
Article
Geometric predicates are at the core of many algorithms, such as the construction of Delaunay triangulations, mesh processing and spatial relation tests. These algorithms have applications in scientific computing, geographic information systems and computer-aided design. With floating-point arithmetic, these geometric predicates can incur round-off errors that may lead to incorrect results and inconsistencies, causing computations to fail. This issue has been addressed using a combination of exact arithmetic for robustness and floating-point filters to mitigate the computational cost of exact computations. The implementation of exact computations and floating-point filters can be a difficult task, and code generation tools have been proposed to address this. We present a new C++ meta-programming framework for the generation of fast, robust predicates for arbitrary geometric predicates based on polynomial expressions. We combine and extend different approaches to filtering, branch reduction, and overflow avoidance that have previously been proposed. We show examples of how this approach produces correct results for data sets that could lead to incorrect predicate results with naive implementations. Our benchmark results demonstrate that our implementation surpasses state-of-the-art implementations.
... Figure 1 demonstrates what an example function and its derivative look like when differentiated with Clad. Clad is especially easy to use for HEP tools as it is available in ROOT through Cling [9], the C++ interpreter for ROOT. ...
Preprint
RooFit is a toolkit for statistical modeling and fitting used by most experiments in particle physics. Just as data sets from next-generation experiments grow, processing requirements for physics analysis become more computationally demanding, necessitating performance optimizations for RooFit. One possibility to speed-up minimization and add stability is the use of Automatic Differentiation (AD). Unlike for numerical differentiation, the computation cost scales linearly with the number of parameters, making AD particularly appealing for statistical models with many parameters. In this paper, we report on one possible way to implement AD in RooFit. Our approach is to add a facility to generate C++ code for a full RooFit model automatically. Unlike the original RooFit model, this generated code is free of virtual function calls and other RooFit-specific overhead. In particular, this code is then used to produce the gradient automatically with Clad. Clad is a source transformation AD tool implemented as a plugin to the clang compiler, which automatically generates the derivative code for input C++ functions. We show results demonstrating the improvements observed when applying this code generation strategy to HistFactory and other commonly used RooFit models. HistFactory is the subcomponent of RooFit that implements binned likelihood models with probability densities based on histogram templates. These models frequently have a very large number of free parameters and are thus an interesting first target for AD support in RooFit.
... Modern versions of ROOT include major developments such as Cling [9], which provides LLVM/Clang based just-in-time compilation of C++, fundamentally improving the robustness and feature-set of "interpreted" code. The PyROOT [10] interface leverages this to provide a much friendlier interface to ROOT via python on the one hand, and opens many possibilities for interoperability between python tools, ROOT, and other C++ libraries via the automatic python bindings and comprehensive C++ language support. ...
Article
The unprecedented volume of data and Monte Carlo simulations at the HL-LHC will pose increasing challenges for data analysis both in terms of computing resource requirements as well as "time to insight". Discussed are the evolution and current state of analysis data formats, software, infrastructure and workflows at the LHC, and the directions being taken towards fast, efficient, and effective physics analysis at the HL-LHC.
... This paper describes our progress with the CUDA support of the compiler-assisted AD tool, Clad. We demonstrate the integration of Clad with interactive development services such as Jupyter Notebooks [7], Cling [8], Clang-Repl [9] and ROOT [10]. ...
Article
Automatic Differentiation (AD) is instrumental for science and industry. It is a tool to evaluate the derivative of a function specified through a computer program. The range of AD application domain spans from Machine Learning to Robotics to High Energy Physics. Computing gradients with the help of AD is guaranteed to be more precise than the numerical alternative and have a low, constant factor more arithmetical operations compared to the original function. Moreover, AD applications to domain problems typically are computationally bound. They are often limited by the computational requirements of high-dimensional parameters and thus can benefit from parallel implementations on graphics processing units (GPUs). Clad aims to enable differential analysis for C/C++ and CUDA and is a compiler-assisted AD tool available both as a compiler extension and in ROOT. Moreover, Clad works as a plugin extending the Clang compiler; as a plugin extending the interactive interpreter Cling; and as a Jupyter kernel extension based on xeus-cling. We demonstrate the advantages of parallel gradient computations on GPUs with Clad. We explain how to bring forth a new layer of optimization and a proportional speed up by extending Clad to support CUDA. The gradients of well-behaved C++ functions can be automatically executed on a GPU. The library can be easily integrated into existing frameworks or used interactively. Furthermore, we demonstrate the achieved application performance improvements, including (≈10x) in ROOT histogram fitting and corresponding performance gains from offloading to GPUs.
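The precision claim in this abstract can be illustrated with a small sketch. It uses forward-mode AD with dual numbers, which is a different technique from Clad's compiler-based source transformation and is chosen here only because it fits in a few lines. The `Dual` class, the test function `f`, and the helper names are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one chosen input."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def _lift(x):
    """Promote a plain number to a Dual with zero derivative."""
    return x if isinstance(x, Dual) else Dual(float(x))


def sin(x):
    # Chain rule: (sin u)' = cos(u) * u'
    return Dual(math.sin(x.val), math.cos(x.val) * x.der)


def f(x):
    # Hypothetical test function: f(x) = x * sin(x) + 3x
    return x * sin(x) + 3 * x


def ad_derivative(func, x0):
    """Derivative from one forward pass, seeded with dx/dx = 1."""
    return func(Dual(x0, 1.0)).der


def fd_derivative(func, x0, h=1e-5):
    """Forward finite difference, subject to truncation and round-off error."""
    return (func(Dual(x0 + h)).val - func(Dual(x0)).val) / h


x0 = 1.2
exact = math.sin(x0) + x0 * math.cos(x0) + 3.0  # analytic f'(x0)
ad_error = abs(ad_derivative(f, x0) - exact)
fd_error = abs(fd_derivative(f, x0) - exact)
```

In this sketch `ad_error` is at the level of floating-point round-off, while `fd_error` depends on the step size `h`, which is the kind of gap the abstract refers to when it compares AD with numerical differentiation.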
... Another relevant feature for this work is the presence of a C++ interpreter, called cling [47], and dynamic Python bindings based on the cppyy [26] library. These two together offer the possibility for higher-level tools like RDataFrame to provide users with modern, ergonomic interfaces that can interact with other common data science libraries, which most often offer Python APIs. ...
Article
CERN (Conseil Européen pour la Recherche Nucléaire) is the largest research centre for high energy physics (HEP). It offers unique computational challenges as a result of the large amount of data generated by the Large Hadron Collider. CERN has developed and supports a software framework called ROOT, which is the de facto standard for HEP data analysis. This framework offers a high-level and easy-to-use interface called RDataFrame, which allows managing and processing large data sets. In recent years, its functionality has been extended to take advantage of distributed computing capabilities. Thanks to its declarative programming model, the user-facing API can be decoupled from the actual execution backend. This decoupling allows physical analysis to scale automatically to thousands of computational cores over various types of distributed resources. In fact, the distributed RDataFrame module already supports the use of established general industry engines such as Apache Spark or Dask. Notwithstanding the foregoing, these current solutions will not be sufficient to meet future requirements in terms of the amount of data that the new projected accelerators will generate. It is of interest, for this reason, to investigate a different approach, the one offered by serverless computing. Based on a first prototype using AWS Lambda, this work presents the creation of a new backend for RDataFrame distributed over the OSCAR tool, an open source framework that supports serverless computing. The implementation introduces new ways, relative to the AWS Lambda-based prototype, to synchronize the work of functions.