RNTuple performance: Status and Outlook
Javier Lopez-Gomez1, Jakob Blomer1
1EP-SFT, CERN, Geneva, Switzerland
E-mail: {javier.lopez.gomez,jblomer}@cern.ch
Abstract. Upcoming HEP experiments, e.g. at the HL-LHC, are expected to increase the
volume of generated data by at least one order of magnitude. In order to retain the ability to
analyze the inux of data, full exploitation of modern storage hardware and systems, such as
low-latency high-bandwidth NVMe devices and distributed object stores, becomes critical.
To this end, the ROOT RNTuple I/O subsystem has been designed to address performance
bottlenecks and shortcomings of ROOT’s current state of the art TTree I/O subsystem.
RNTuple provides a backwards-incompatible redesign of the TTree binary format and access
API that evolves the ROOT event data I/O for the challenges of the upcoming decades. It
focuses on a compact data format, on performance engineering for modern storage hardware,
for instance through making parallel and asynchronous I/O calls by default, and on robust
interfaces that are easy to use correctly.
In this contribution, we evaluate the RNTuple performance for typical HEP analysis tasks.
We compare the throughput delivered by RNTuple to popular I/O libraries outside HEP, such
as HDF5 and Apache Parquet. We demonstrate the advantages of RNTuple for HEP analysis
workows and provide an outlook on the road to its use in production.
1. Introduction
HEP storage systems are generally tuned for write-once-read-many columnar access. Since its
inception, the ROOT project [4] has supported the columnar storage of arbitrary C++ types and
collections through TTree. However, the expected increase in the amount of experiment data
that needs to be processed, together with the fact that TTree was not designed to make optimal use of
modern hardware and storage systems, calls for a modernized re-engineering of TTree.
RNTuple is the new, experimental, backward-incompatible ROOT columnar I/O subsystem
targeting high performance, reliability, and easy-to-use, robust interfaces. Although RNTuple is still
under development, at this point it is feature-complete enough to carry out an evaluation.
In this paper, we make the following contributions:
• A performance evaluation of TTree, RNTuple, and other well-known storage alternatives
outside HEP: Apache Parquet and HDF5. Compared to a previous publication [1], we
evaluate RNTuple focusing on different storage devices and on a dataset with nested collections.
• A feature comparison and perspectives on using RNTuple in production.
2. ROOT’s RNTuple overview
The design of RNTuple [3] comprises four layers: (i) the event iteration layer, which offers a convenient
interface for looping over events; (ii) the logical layer, which maps complex C++ types onto columns;
(iii) the primitives layer, which groups ranges of elements of a fundamental type into pages; and
(iv) the storage layer, which is responsible for the I/O of pages, clusters, and the required metadata. This
design makes it simple to support new data types or storage backends.
A page contains a certain range of values for a given column, whereas a cluster contains the
pages for a specific row range. The metadata includes a header that describes the data schema and a
footer that contains the location of clusters and pages, among other information. The header/footer
locations and their sizes are included in an anchor object. Figure 1 shows a simplified example
of the on-disk layout.
[Figure 1: on-disk layout showing the header, pages grouped into clusters, and the footer, for the following example schema.]

struct Event {
   int fId;
   vector<Particle> fPtcls;
};
struct Particle {
   float fE;
   vector<int> fIds;
};

Figure 1. RNTuple on-disk format. Pages store values for a specific data member (note the
color coding).
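To make the layering concrete, the following is a minimal sketch of how a schema resembling Figure 1 could be written with RNTuple's experimental API. The interface names are those found in the ROOT::Experimental namespace at the time of writing; the flat fields used here are a simplification of the Event/Particle members (storing user-defined classes additionally requires a ROOT dictionary), so this is illustrative rather than our exact benchmark code.

#include <ROOT/RNTuple.hxx>
#include <ROOT/RNTupleModel.hxx>

#include <cstdint>
#include <utility>
#include <vector>

using ROOT::Experimental::RNTupleModel;
using ROOT::Experimental::RNTupleWriter;

void WriteEvents()
{
   // The model describes the schema (logical layer); each MakeField call returns
   // a shared pointer bound to the default entry.
   auto model = RNTupleModel::Create();
   auto fldId = model->MakeField<std::int32_t>("fId");
   auto fldE  = model->MakeField<std::vector<float>>("fPtcls_fE");  // one energy per particle

   // The writer drives the storage layer: values are packed into pages, pages are
   // grouped into clusters, compressed, and written together with header/footer metadata.
   auto writer = RNTupleWriter::Recreate(std::move(model), "Events", "events.ntuple.root");
   for (int i = 0; i < 1000; ++i) {
      *fldId = i;
      *fldE  = {1.f, 2.f, 3.f};
      writer->Fill();  // appends the current entry to the in-memory pages
   }
}  // destroying the writer flushes the last cluster and commits the metadata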
3. Evaluation
In this section, we evaluate RNTuple w.r.t. TTree and other well-known I/O libraries outside
HENP: HDF5 and Apache Arrow/Parquet. Specifically, Section 3.1 compares the support level
of important HENP I/O layer features. A quantitative experimental evaluation is provided in
Section 3.2.
To make a fair comparison, all our test programs were written in C++, and parameters such
as row group size, page size, and compression algorithm and level were set to match in all cases,
where permitted. For Apache Parquet, we leveraged the Parquet-Arrow API, which permits the
convenient use of nested data structures and lists. For HDF5, however, the columnar storage of
heterogeneous data types or nested collections thereof is not a trivial problem. Parts of our test
code bridge this gap and allow switching between alternate data layouts simply by changing a
C++ template parameter. Specifically, this layer provides the following layouts:
• Row-wise: uses HDF5 compound types and variable-length types to represent nested
structures and collections, respectively. This layout creates a single dataset whose type is
the outer-most data structure and whose dataspace dimension is 1×N.
• Column-wise: an emulated columnar layout that uses one HDF5 dataset per column of a
fundamental type, as described in [7]. Collections are translated to an HDF5 group and one
additional index column.
In any case, HDF5 datasets are chunked. Given that chunks are individually accessed and
(un)compressed, their size is set equal to the RNTuple or Parquet page size in all of our tests.
The chunk cache size is set to the default size of an RNTuple cluster.
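As an illustration of the emulated column-wise layout and the chunking parameters discussed above, the sketch below writes one chunked, zlib-compressed 1-D dataset per fundamental-type column with the plain HDF5 C API. This is an assumption about how such a layout can be expressed, not our exact benchmark code; the file and column names are illustrative.

#include <hdf5.h>
#include <vector>

// Write one column of a fundamental type as its own chunked, compressed dataset.
static void WriteColumn(hid_t loc, const char *name, const std::vector<float> &data,
                        hsize_t chunkElems)
{
   hsize_t dims[1] = {data.size()};
   hid_t space = H5Screate_simple(1, dims, nullptr);
   hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
   H5Pset_chunk(dcpl, 1, &chunkElems);  // chunks are the unit of I/O and (de)compression
   H5Pset_deflate(dcpl, 3);             // zlib level 3, as used in Section 3.2
   hid_t dset = H5Dcreate2(loc, name, H5T_NATIVE_FLOAT, space,
                           H5P_DEFAULT, dcpl, H5P_DEFAULT);
   H5Dwrite(dset, H5T_NATIVE_FLOAT, H5S_ALL, H5S_ALL, H5P_DEFAULT, data.data());
   H5Dclose(dset); H5Pclose(dcpl); H5Sclose(space);
}

int main()
{
   hid_t file = H5Fcreate("columns.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
   WriteColumn(file, "fPtcls_fE", {1.f, 2.f, 3.f, 4.f}, 2);
   // In the full layout, each collection additionally gets an integer index dataset
   // (per-event offsets), created the same way with H5T_NATIVE_INT32.
   H5Fclose(file);
   return 0;
}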
3.1. Qualitative evaluation
In Table 1, we compare the level of support of different must-have features in a HENP I/O layer.
Compression is, in general, supported by all the analyzed storage formats; however, the native
support for different algorithms varies greatly, e.g. HDF5 only supports zlib and szip. Vertical
and horizontal data combinations refer to extending the dataset with new entries or columns,
respectively. To the best of our knowledge, Apache Arrow allows reading rows from many files,
but it is unclear whether new columns can be made available. Similarly, columnar access to
multilevel nested structures and collections in HDF5 is unclear. Schema evolution, i.e. handling
changes in the data schema such as adding, removing, or changing the type of a column, is only
supported in TTree; preliminary support for this feature in RNTuple is foreseen for Q2 2022.
Native support for object stores is available in HDF5, RNTuple (DAOS), and Apache Arrow
(S3). Finally, split encoding typically improves the compression ratio by reordering bytes in
integer/floating-point numbers, so that the n-th byte of each value is contiguous in memory.

Table 1. Comparison of features available in TTree, RNTuple, HDF5, and Apache Arrow/Parquet.
The features compared are: transparent compression (1); columnar access (2); merging without
uncompressing data; vertical/horizontal data combinations; C++ and Python support; support for
structs/nested collections; architecture-independent encoding; schema evolution; support for
application-defined metadata; fully checksummed; multi-threading friendly; native object-store
support; XRootD support; automatic schema creation from C++ classes; on-demand schema
extension (backfilling); split encoding / delta encoding; variable-length floats (min, max, bit size).
Legend: supported; planned / under development; partial / incomplete; unclear.
(1) Only for chunked datasets. (2) Via emulated columnar layout.
3.2. Experimental evaluation
In this section, we provide experimental measurements of the analysis throughput, total amount
of bytes read, and file size for TTree, RNTuple, HDF5 (both row-wise and column-wise) and
Apache Parquet in a variety of situations. The hardware and software environment, datasets
used, and test cases are described in the following.
Hardware and software environment. Our benchmarks ran on a single node with one AMD
EPYC 7702P 64-core processor running at 2 GHz and 128 GB of DDR4 RAM. SMT was enabled,
although disabling it yielded similar results for the workload in our tests. This machine is also
equipped with a Samsung PM173X NVMe SSD and a TOSHIBA MG07ACA1 SATA hard disk
drive. An ext4 filesystem resides on each drive, using a 4 KB block size and default mount options.
CephFS was used as the network filesystem for tests that operate over a network share.
The software environment is based on CentOS Linux 8.3 (kernel 5.15.1), Apache Arrow 5.0.0,
HDF5 1.10.5, and ROOT git revision 5001281762 built with g++ 8.4.1.
Test cases. The experiments run in this evaluation used the following datasets as input:
• LHCb Run 1 Open Data B2HHH (B meson decays to three hadrons). No nested
collections, 8.5 M events, 26 branches¹. The compressed file size is 1.1 GB.
• CMS Open Data Higgs 4 leptons MC. NanoAOD-like [6] format, 300 k events, 84
branches. This dataset was concatenated 16 times so as to make a larger file of 2.1 GB.
¹ In TTree, the term "branch" refers to a column; usually, both terms can be used interchangeably.
We carried out two different experiments [2]: (i) running a simple analysis program over the
LHCb dataset that generates the B mass spectrum histogram and measures the end-to-end (i.e.
from storage to histogram) analysis throughput in uncompressed MB/s; and (ii) measurement
of the read rate in uncompressed MB/s for both the LHCb and the CMS datasets, retrieving all
entries in 10 selected branches.
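For reference, reading a subset of columns as in test (ii) can be expressed with RNTuple views, which only touch the pages of the requested columns. The sketch below is illustrative; the ntuple and field names are assumptions rather than the exact names used by our benchmark programs.

#include <ROOT/RNTuple.hxx>

#include <cstdio>

using ROOT::Experimental::RNTupleReader;

void ReadSelectedColumns()
{
   // Opening attaches the reader to the on-disk schema; no event data is read yet.
   auto reader = RNTupleReader::Open("DecayTree", "B2HHH.ntuple.root");
   // A view reads only the pages of the requested column (plus cluster metadata).
   auto viewPX = reader->GetView<double>("H1_PX");
   auto viewPY = reader->GetView<double>("H1_PY");
   double sum = 0.;
   for (auto i : reader->GetEntryRange()) {
      sum += viewPX(i) * viewPX(i) + viewPY(i) * viewPY(i);
   }
   std::printf("sum(px^2 + py^2) = %g\n", sum);
}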
The original les were in TTree format. In a rst stage, a third program is used to generate the
equivalent les for each of the other formats (HDF5 row-wise, HDF5 column-wise, and Apache
Parquet). All les taken as input by the benchmark programs use zstd with compression level 5,
except for HDF5 that uses zlib with compression level 3 given that HDF5 lacks native support
for zstd. RNTuple and Apache Parquet multi-threaded I/O and (un)compression was enabled
in all cases. RNTuple benchmarks use a cluster prefetch value of 5.
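The compression setting for the converted RNTuple files can be selected through RNTupleWriteOptions, as sketched below. This is an assumption about how such a configuration looks with the experimental API of the ROOT version used here, not the exact conversion program; 505 follows ROOT's algorithm*100+level convention, where 5 denotes zstd.

#include <ROOT/RNTuple.hxx>
#include <ROOT/RNTupleModel.hxx>
#include <ROOT/RNTupleOptions.hxx>

#include <utility>

using ROOT::Experimental::RNTupleModel;
using ROOT::Experimental::RNTupleWriteOptions;
using ROOT::Experimental::RNTupleWriter;

void WriteWithZstd()
{
   auto model = RNTupleModel::Create();
   auto fldPt = model->MakeField<float>("pt");  // illustrative field

   RNTupleWriteOptions options;
   options.SetCompression(505);  // zstd, level 5 (algorithm*100 + level)

   auto writer = RNTupleWriter::Recreate(std::move(model), "Events",
                                         "converted.ntuple.root", options);
}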
Each experiment and storage format combination is run four times, differing in the location
of the input file: CephFS, SSD, or HDD. To ensure that data is forcibly read from the underlying
storage, the Linux page cache is cleared prior to running each test. A fourth iteration uses the
same input file as in the HDD case but does not clear the page cache before the test is executed.
Additionally, we measured the total file size after conversion and the raw bytes read by each
experiment–format pair, as reported by the vmtouch [5] tool. In all the plots shown below, the
data point and error bars (where applicable) refer to the average and minimum/maximum values
measured over 10 executions, respectively.
In a rst stage, we measured the analysis throughput for all combinations of format and
storage device for the LHCb dataset. The analysis program requires reading 18 out of 26
columns. As can be seen in Figure 2, RNTuple speedup over Apache Parquet is between ×1.4
and ×2.2 depending on the scenario. Also note that the performance of HDF5 is severely aected
by the lack of multi-threaded I/O and decompression.
[Figure 2: bar chart of analysis throughput (uncompressed MB/s, up to roughly 3,000) for TTree, RNTuple, Parquet, HDF5/row-wise, and HDF5/column-wise on HDD, SSD, CephFS, and HDD with warm cache.]
Figure 2. LHCb B2HHH analysis throughput (18/26 branches; compressed).
Whereas the previous plot gives an idea of the performance of each candidate, these results do
not represent the behavior when accessing collections. In a second experiment, we measured
the read rate in uncompressed MB/s for 10 columns of both the LHCb and the CMS dataset.
This test allows us to compare the performance differences for collections. As can be seen in
Figure 3, for the LHCb dataset, RNTuple's worst result is comparable to Apache Parquet. In all
the other tests, RNTuple outperforms the other alternatives. For the CMS dataset, RNTuple's worst
case achieves at least the same result as Parquet. The differences between the two plots are explained
by the use of different column types (double vs. float) and the presence of collections.
Finally, we measured the compressed file size and total bytes transferred during the last test
for the CMS dataset (see Figure 4). As can be seen, RNTuple provides the smallest file while
the amount of bytes read is about the same as Parquet. It is worth noting that HDF5 row-wise
does not read the whole file if the compound type definition provided at run-time misses some
members w.r.t. the committed (on-file) type; however, the throughput in this case is extremely
low (<1 MB/s; see Figure 3.b).
4. Conclusion and future work
RNTuple is able to deliver the highest throughput among the analyzed alternatives in all of
our tests thanks to its performance-tuned implementation and parallel I/O scheduling and
decompression. The latest developments are available in the ROOT::Experimental namespace.
The feature roadmap aims at complete support for the HENP event data storage use case.
[Figure 3: bar charts of read rate (uncompressed MB/s) for TTree, RNTuple, Parquet, HDF5/row-wise, and HDF5/column-wise on HDD, SSD, CephFS, and HDD with warm cache. Panel (a): LHCb B2HHH (10/26 branches; compressed), up to roughly 3,000 MB/s. Panel (b): CMS Higgs4Leptons (10/84 branches; compressed), up to roughly 1,000 MB/s.]
Figure 3. Read rate (uncompressed MB/s; 10 selected branches). Note the low (<10 MB/s)
transfer rate for HDF5 in the CMS case.
[Figure 4: horizontal bars of file size and raw bytes read, in GB (10^9 bytes), for TTree, RNTuple, Parquet, HDF5/row-wise, and HDF5/column-wise.]
Figure 4. File size and raw bytes read for the CMS dataset (10/84 branches).
RNTuple is expected to become production grade in 2024. A number of important features
are scheduled for 2022: schema evolution, on-demand addition of new columns to the model,
complete support for vertical/horizontal data combinations, and merging without uncompressing
pages, to name only a few. As future work, we also plan to compare the performance achieved
by the RNTuple and HDF5 DAOS connectors.
Acknowledgments
This work beneted from support by the CERN Strategic R&D programme on Technologies for
Future Experiments (CERN-OPEN-2018-06).
References
[1] J. Blomer. "A quantitative review of data formats for HEP analyses". In: J. Phys. Conf.
Ser. 1085.3 (2018), p. 032020. doi: 10.1088/1742-6596/1085/3/032020.
[2] J. Blomer et al. ROOT RNTuple Virtual Probe Station. Accessed: 2022-02-10. URL:
https://github.com/jblomer/iotools/tree/master/compare.
[3] J. Blomer et al. "Evolution of the ROOT Tree I/O". In: EPJ Web Conf. 245 (2020),
p. 02030. doi: 10.1051/epjconf/202024502030.
[4] R. Brun and F. Rademakers. "ROOT—An object oriented data analysis framework".
In: Nuclear Instruments and Methods in Physics Research Section A: Accelerators,
Spectrometers, Detectors and Associated Equipment 389.1-2 (1997), pp. 81–86.
[5] D. Hoyte. vmtouch: the Virtual Memory Toucher. 2012. URL: https://hoytech.com/vmtouch/.
[6] A. Rizzi, G. Petrucciani, and M. Peruzzi. "A further reduction in CMS event data for
analysis: the NANOAOD format". In: EPJ Web Conf. 214 (2019), p. 06021.
[7] S. Sehrish et al. "Python and HPC for High Energy Physics Data Analyses". In: 7th
Workshop on Python for High-Performance and Scientific Computing. 2017.