Relaxing the One Definition Rule in Interpreted C++
Javier López-Gómez
Department of Computer Science
University Carlos III of Madrid
28911–Leganés, Spain
jalopezg@inf.uc3m.es
Javier Fernández
Department of Computer Science
University Carlos III of Madrid
28911–Leganés, Spain
jfmunoz@inf.uc3m.es
David del Rio Astorga
Department of Computer Science
University Carlos III of Madrid
28911–Leganés, Spain
drio@pa.uc3m.es
Vassil Vassilev
Princeton University
New Jersey 08544, United States
vvasilev@cern.ch
Axel Naumann
Experimental Physics
CERN
1211 Geneva 23, Switzerland
axel.naumann@cern.ch
J. Daniel García
Department of Computer Science
University Carlos III of Madrid
28911–Leganés, Spain
jdgarcia@inf.uc3m.es
Abstract
Most implementations of the C++ programming language
generate binary executable code. However, interpreted
execution of C++ sources has its own use cases, as the Cling
interpreter from CERN’s ROOT project has shown. Some
limitations derive from the ODR (One Definition Rule),
which rules out multiple definitions of entities within a single
translation unit (TU). The ODR is there to ensure a uniform
view of a given C++ entity across translation units, which
helps when producing ABI-compatible binaries. Interpreting
C++ presumes a single ever-growing translation unit, which
defines away some of the ODR use cases.
Therefore, it may well be desirable to relax the ODR and,
consequently, to support the ability of developers to override
any existing definition for a given declaration. This approach
is especially well-suited for iterative prototyping. In this
paper, we extend Cling, a Clang/LLVM-based C++ interpreter,
to enable redefinitions of C++ entities at the prompt. To
achieve this, top-level declarations are nested into inline
namespaces and the translation unit lookup table is adjusted
to invalidate previous definitions that would otherwise result
in ambiguities. Formally, this technique refactors the code
to an equivalent that does not violate the ODR, as each
definition is nested in a different namespace. Furthermore, any
previous definition that has been shadowed is still accessible
by means of its fully-qualified name. A prototype
implementation of the presented technique has been integrated into
the Cling C++ interpreter, showing that our technique is
feasible and usable.
Keywords C++, interpreter, One-Definition-Rule, Cling
ACM Reference Format:
Javier López-Gómez, Javier Fernández, David del Rio Astorga, Vassil
Vassilev, Axel Naumann, and J. Daniel García. 2019. Relaxing the
One Definition Rule in Interpreted C++. In Proceedings of ACM
SIGPLAN 2020 International Conference on Compiler Construction
(CC’20). ACM, New York, NY, USA, 11 pages.
CC’20, February 22–26, 2019, San Diego, CA, USA
2019.
1 Introduction
Recently, interpreted languages have been widely adopted
for application prototyping in multiple areas and to aid
inexperienced users in defining the logic of their applications.
In that regard, applications developed in compiled languages
for performance reasons can benefit from using an
interpreter of the same language for rapid prototyping, in
order to reduce the time-to-market. Following this idea,
CERN’s ROOT project has demonstrated that the use of a
C++ interpreter (Cling) can reduce the effort necessary for
developing prototypes and transforming them into
high-performance applications.
However, since the C++ language has been designed to
be compiled, interpreting this language in a user-friendly
way presents some challenges. In this paper, we focus on
providing Cling with the ability to redefine entities,
such as variables, functions, and types, in a similar way
to other interpreted languages like Python. To do so, it is
necessary to relax the C++ One Definition Rule so that
more than one definition per translation unit is allowed.
In this paper, we present a formalization for supporting
entity redefinition in interpreted C++ and we implement this
behavior in Cling as a validation of the proposed technique.
Specifically, this paper makes the following contributions:
• We present a formalization for relaxing the ODR in
  C++ that can be leveraged by any C++ interpreter.
• We implement an Abstract Syntax Tree (AST)
  transformer to support entity redefinition in a real-world
  C++ interpreter.
• We analyze the output of the new transformer to
  validate the proposed technique, and evaluate the possible
  overhead.
The rest of this document is organized as follows. Section 2
revisits some related work in the area. Section 3 gives an
overview of CERN’s ROOT framework and the Cling
interpreter. Section 4 presents the formalization for supporting
entity redefinition. Section 5 describes the implementation
of the required AST transformer for relaxing the ODR in
Cling. In Section 6, we employ some examples to validate
our proposal and evaluate the overhead introduced by the
required additional handling. Finally, Section 7 closes this
paper with some concluding remarks and future work.
2 State of the art
Interpreted languages have become popular in industrial
and scientific areas. This is mainly due to three important
characteristics: (i) the adoption of agile software
development methodologies based on fast application prototyping
(Rapid Application Development [14]); (ii) the need to
provide tools that ease application development for
non-experts, which is important in scientific areas and industrial
data management; and (iii) the increased portability with
respect to compiled languages. For instance, Scala [16] has been
widely adopted for managing large data sets in Big Data
applications. Python [18], in turn, has become
even more popular thanks to its high-level abstractions, which
help domain experts to develop scientific applications [17].
However, interpreted languages are slower than compiled
ones due to instruction generation at run-time and the
difficulties in exploiting the resources available on a given
platform. Thus, to increase performance, interpreters
of these languages leverage Just-in-Time (JIT) compilation
techniques to generate a compiled version of application
hot paths, e.g. PyPy [19], HOPE [2]. Additionally, multiple
libraries implemented in high-performance compiled
languages provide bindings to be used in interpreted
applications to improve performance, e.g. TensorFlow [1].
Moreover, it is worth mentioning that almost every
contemporary programming language has a Read-Eval-Print-Loop
(REPL), also known as a language shell; e.g. Swift [4],
despite being a compiled language, heavily supports REPL-style
development.
Nevertheless, in order to obtain maximum performance
and to minimize response times in a production build
of the application, it is necessary to generate a compiled
binary or to port the interpreted application to a compiled
language. In this sense, we can find two major approaches:
(i) tools that generate compiled applications, or bytecode
that runs on a Virtual Machine (VM), from an interpreted
language script [15], and (ii) the development of interpreters
for typically compiled languages, to reduce the code
transformation needed for the production version.
Some examples of tools able to compile Python scripts are
Cython [5], Jython [11] and IronPython [8]. For instance,
Cython is a Python and C compiler that can generate
optimized binaries. However, this tool requires using a superset
of the Python language to exploit the available resources,
e.g. releasing the global interpreter lock to exploit thread
parallelism. For this reason, these tools are mainly used for
implementing libraries that will be used from an interpreted
script.
On the other hand, several interpreters of the C and C++
languages can be found: Ch [6], Clip [13], CInt [9], UnderC [7]
and Cling [20]. These tools allow developers to take
advantage of interpreted languages for fast application prototyping,
resulting in code that can be compiled with minimal
effort. The standards of compiled languages, and that of C++
specifically, impose some limitations on their use as interpreted
languages. An example of these limitations is the C++ ODR, which
prevents entity redefinition in the same translation unit [10].
This limitation is not required in interpreted languages, since
the definition of an entity depends only on the interpretation
order of the script code. This way, an entity definition is
valid until its next redefinition. In this paper, we present a
technique to allow entity redefinition in a C++ interpreter
while keeping the C++ language consistent.
3 Background
In this section, we describe the ROOT project and its C++
interpreter (Cling), widely used in the High-Energy Physics
(HEP) community and other scientific areas.
3.1 ROOT project
ROOT [3] is a cross-platform C++ framework for data
processing in the high-energy physics area, developed mostly at
CERN. This framework is designed for storing and analyzing
large amounts of data. Basically, it provides the following
components:
Data model. The ROOT framework provides a data model
  that allows storing data, represented as C++ objects,
  in compressed, machine-independent binary files. Those
  binary files also store the format description of the data,
  allowing access to the information from anywhere.
Statistics and data analysis libraries. ROOT also provides
  a large set of tools for mathematical and statistical
  analysis that can easily operate on ROOT files.
  Furthermore, it also provides visualization tools to
  display histograms, scatter plots and function fitting.
  Additionally, these tools take full advantage of C++
  features and parallel processing techniques.
Interactive C++ interpreter. This component provides
  a C++ interpreter (Cling) for interactive development;
  the resulting application can then be compiled to exploit
  the available resources. This interpreter can also be
  used with user-friendly development environments
  designed for interpreted languages, such as Jupyter
  notebooks, either through ROOT or the Xeus-cling project.
Other language bindings. ROOT provides a set of bindings
  that allow using the framework from different languages
  such as Python, R and Mathematica. In the context of this
  paper, the Python bindings (PyROOT/cppyy) are especially
  relevant, as they leverage Cling to access the C++ side
  at run-time.
3.2 Cling
Cling is a Clang/LLVM-based C++ interpreter developed at
CERN that has been adopted as the interpreter for the ROOT
project. Cling leverages the Clang/LLVM infrastructure for
parsing and code generation, meaning that it only has to
deal with issues derived from C++ interpretation. This keeps
the Cling codebase reasonably small (about 36K LOC) and eases
maintenance. An overview of Cling is shown in Figure 1.
Figure 1. Cling input transformation
Input line → Wrap in function? → Parse (Clang) → AST transformers → JIT + exec
In general, Cling users expect a Python-like interaction.
In other words, the user expects the interpreter to accept
a statement even if it does not appear as part of the
body of a function. However, this practice usually results
in ill-formed code according to ISO C++ [10]. If the input
line cannot be proved to be valid at translation-unit level,
it will be wrapped in a uniquely-named function. At this stage,
several simple cases can be detected as valid (functions,
classes, namespaces, etc.). However, Cling is not able to do so
for variable declarations, such as “int i = 0;”. This is fixed
later by the DeclExtractor transformer, which extracts
declarations out of the wrapper functions. Cling also supports
a “raw input” mode, in which this “wrapping in functions”
stage is skipped completely.
After turning the user code into valid C++, it can be
parsed normally by Clang. The output of this stage is the abstract
syntax tree (AST) for the parsed top-level declarations. Clang
also adds these to the translation unit declaration list. In this
sense, the TU is constructed incrementally.
The generated AST for top-level declarations, i.e. those
that appear at the TU level, may be transformed to
support other Cling features. This processing is performed by
independent transformation blocks, which are executed
sequentially after the AST is created. These blocks may
be classified as either an ASTTransformer (applied to all parsed
declarations) or a WrapperTransformer (applied only to
wrappers generated in the first stage). For example, declaration
statements that were previously wrapped into a function
must be moved back to the global scope (TU), which is done
by the DeclExtractor transformer. Figure 2 shows the
modifications performed by DeclExtractor for the input line
“int i = 0, j;”. Additionally, Cling includes transformers to
support other features, e.g. auto specifier synthesis, invalid
memory reference protection, etc.
The last step in the interpreter pipeline is just-in-time
(JIT) compilation and execution. Cling offloads this task to
LLVM.
Figure 2. Transformation performed by DeclExtractor
|-
`-FunctionDecl __cling_Un1Qu30 'void (void *)'
  |-ParmVarDecl vpClingValue 'void *'
  `-CompoundStmt
    |-DeclStmt
    | |-VarDecl i 'int' cinit
    | | `-IntegerLiteral 'int' 0
    | `-VarDecl j 'int'
4 Proposal for entity redefinition
This section introduces the proposed technique to override
a previous definition for a given declaration. The described
procedure relies on nesting each redeclaration into its own
scope by using named C++ inline namespaces. Therefore,
using this technique neither incurs a violation of the ODR
nor requires major changes to the compiler. According to ISO
C++ [10], members of an inline namespace can be accessed as
if they were members of the enclosing namespace, i.e. names
introduced by such a namespace “leak” to the enclosing scope.
However, as shown in Listing 1, if a name is made available
in the enclosing scope through more than one inline
namespace, unqualified lookup for the given name is ambiguous.
In Section 4.3, we tackle this issue by manually adjusting the
lookup table of the enclosing scope.
Listing 1. Ambiguous unqualified lookup
inline namespace ns0 { int i = 0; }
inline namespace ns1 { double i = 1.0; }
auto j = i; // unqualified lookup is ambiguous
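As a complement to Listing 1, the following minimal sketch reuses the same two namespaces to show that qualified names stay unambiguous even when the unqualified name does not. The helper function name is an assumption made for illustration only.

```cpp
#include <cassert>
#include <type_traits>

inline namespace ns0 { int i = 0; }
inline namespace ns1 { double i = 1.0; }

// auto j = i;  // would not compile: unqualified lookup of `i` is ambiguous

// Qualified lookup names exactly one entity, so both definitions stay reachable.
static_assert(std::is_same<decltype(ns0::i), int>::value, "ns0::i is the int");
static_assert(std::is_same<decltype(ns1::i), double>::value, "ns1::i is the double");

// Hypothetical helper: reads both definitions through their qualified names.
void check_qualified_access() {
    assert(ns0::i == 0);
    assert(ns1::i == 1.0);
}
```

Only the unqualified use is rejected by the compiler, which is why the technique needs an adjustment to the lookup table and not a change to how names are declared.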
4.1 Covered cases and exceptions
Not all declarations that introduce a name are subject to
the aforementioned transformation. Instead, it can only be
applied in contexts where an inline namespace may be used.
Therefore, to avoid ill-formed namespace constructs, we
restrict this transformation to the translation-unit level.
Similarly, only named declarations that are definitions, or that
may be defined later, i.e. forward declarations, should be
moved into a namespace.
Additionally, some declarations that introduce a name
must not be nested into a namespace, either because
repetition is allowed, or because nesting them changes the original
meaning. This includes:
using-directive, e.g. using namespace std, which makes
  all the names in std visible for unqualified name lookup.
  In this case, such declarations shall not be transformed,
  since issuing a using-directive twice in a given scope
  does not pose any problem.
using-declaration, e.g. using std::vector, which makes
  vector accessible for unqualified lookup in the current
  scope. The same rationale applies to these declarations.
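The claim that repetition is harmless for these two kinds of declarations can be checked with a short sketch in standard C++. The function name `count_elements` is an assumption made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using namespace std;   // using-directive
using namespace std;   // repeated: still well-formed

using std::vector;     // using-declaration
using std::vector;     // repeated at namespace scope: still well-formed

// Unqualified `vector` resolves to std::vector without ambiguity.
std::size_t count_elements() {
    vector<int> v{1, 2, 3};
    return v.size();
}
```

Because nothing is redefined here, there is no previous definition to shadow, which is consistent with leaving such declarations untouched.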
4.2 Rules for AST transformation
The proposed transformation may be formally described
using a syntax-directed definition (SDD) [12], which appends
semantic rules to the grammar productions relating to
declarations whose redefinition is to be allowed. In this notation,
each grammar production is associated with a set of
semantic rules that are evaluated in the specified order. Each
rule may either set the value of an attribute for the given
entity, e.g. E.attr = ..., or call a function that may have
side effects.
As shown in Table 1, for top-level redefinable declarations,
the semantic declaration context (DeclContext attribute)
is set to a synthesized uniquely-named inline namespace,
which in turn is added to the translation unit. Added semantic
rules have been typeset in boldface. Due to space limitations,
only some productions are shown.
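The effect of the added semantic rules can be sketched on a toy tree. This is an illustration under stated assumptions, not Cling or Clang code: the `Node` type, the `next_namespace_id` counter and the `NS_<n>` naming scheme are all hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy AST node: either a plain declaration or a namespace that owns children.
struct Node {
    std::string name;
    bool is_inline_namespace = false;
    std::vector<std::unique_ptr<Node>> children;
};

// Hypothetical counter used to synthesize unique namespace names.
static unsigned next_namespace_id = 0;

// Mirrors the rules described above: build an inline namespace around the
// declaration, then make that namespace the node the production yields.
std::unique_ptr<Node> nest_in_inline_namespace(std::unique_ptr<Node> decl) {
    auto ns = std::make_unique<Node>();
    ns->name = "NS_" + std::to_string(next_namespace_id++);
    ns->is_inline_namespace = true;
    ns->children.push_back(std::move(decl));
    return ns;
}
```

Each call produces a fresh namespace, so two declarations with the same name never end up in the same scope.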
4.3 Invalidation of ambiguous unqualified names
The rules in Table 1 cause target top-level declarations to be
nested into inline namespaces. Because inline namespaces
make their members visible in the enclosing scope, declared
names may still be accessed as if they were not part of a
namespace. However, if the same name is “leaked” via
different namespaces, unqualified lookup will fail due to ambiguity.
Instead, such lookups should resolve to the latest declaration.
4.3.1 Removing ambiguity
To that end, ambiguous lookups must be turned into
non-ambiguous ones that return the expected result. As shown in
Figure 3, each declared name in the inline namespace (NS1) is
made visible not only in the namespace lookup table, but also in
that of the enclosing scope (TU). If the same name is made
visible by several namespaces (NS1, NS2, ...), then there
will be more than one entry with the given name in the TU
lookup table.
Therefore, to get rid of the ambiguity, existing entries for the
given name must be removed, provided that they cannot be
considered an overload. An overload set is a set of declarations
that share the same name but that the compiler is able to
disambiguate using the number and/or type of the arguments.
This adjustment is made by the fix_TU_lookup_table()
function. Moreover, if the whole functionality is encapsulated
in the allow_redefine() function shown in Listing 2,
then allowing a declaration to adopt a new definition may be
accomplished simply by adding a call to allow_redefine(), as
shown in the syntax-directed translation scheme (SDT) [12]
excerpt in Listing 3.
Listing 2. The allow_redefine() function used in the SDT
D.DeclContext = new Namespace("__NS_xxx", INLINE, { D.node })
D.node = D.DeclContext
fix_TU_lookup_table()
Figure 3. Lookup tables for translation unit and inline namespaces
inline namespace NS1 {
  int PI = 0;
  std::string S;
}
inline namespace NS2 {
  double PI = 3.14159265,
         J;
}
···
NS1 lookup table
  PI   int
  S    std::string
NS2 lookup table
  PI   double = 3.14...
  J    double
TU lookup table
  PI   int = 0
  PI   double = 3.14159265
  S    std::string
  J    double
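The lookup-table adjustment described in this section can be modelled with a small, self-contained sketch. It is a toy under stated assumptions: `Entry`, `LookupTable`, `declare` and `lookup_count` are hypothetical names, and comparing signature strings is a crude stand-in for a real overload check.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One visible declaration: its name, the owning inline namespace, and a
// signature string (empty for non-functions) used to detect overloads.
struct Entry {
    std::string name;
    std::string owner;      // e.g. "NS1"
    std::string signature;  // e.g. "(int)"; empty for variables and types
};

using LookupTable = std::vector<Entry>;

// Two entries overload each other if both are functions with distinct signatures.
bool is_overload(const Entry& a, const Entry& b) {
    return !a.signature.empty() && !b.signature.empty() && a.signature != b.signature;
}

// Adds `fresh` and drops every older same-named entry that is not an overload of it.
void declare(LookupTable& tu, const Entry& fresh) {
    tu.erase(std::remove_if(tu.begin(), tu.end(),
                            [&](const Entry& old) {
                                return old.name == fresh.name && !is_overload(old, fresh);
                            }),
             tu.end());
    tu.push_back(fresh);
}

// Counts the entries that unqualified lookup of `name` would return.
std::size_t lookup_count(const LookupTable& tu, const std::string& name) {
    return static_cast<std::size_t>(std::count_if(
        tu.begin(), tu.end(), [&](const Entry& e) { return e.name == name; }));
}
```

Replaying the declarations of Figure 3 through `declare` leaves a single `PI` entry, owned by `NS2`, while `S` and `J` are untouched. The per-namespace tables are not modelled, which matches the idea that the shadowed entry is still reachable through its qualified name.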
4.3.2 Exceptions: overloads, unscoped enumerations, etc.
Some particular cases require either preserving existing
lookup table entries or invalidating additional ones, namely:
Function overloads. If all the duplicated entries refer
  to function overloads, none of them shall be removed.
  In this case, the lookup result is said to be overloaded
  (not ambiguous). Additionally, ISO C++ paragraph
  [temp.over.link]p4 [10] must be checked for overloaded
  templated functions.
Unscoped enumerations. An unscoped enumeration
  is a transparent context, i.e. enumerators are made
  visible in the parent context. Because declared
  enumerators are made visible in the enclosing inline namespace,
  and therefore in the translation unit, the removal of
  all those names from the TU lookup table shall also be
  considered.
Declaration after definition. Any non-definition
  declaration that comes after a definition is ignored, e.g.
  class C { ... };
  class C; // ignored
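The first two exceptions correspond to behavior that standard C++ already exhibits for inline namespaces, as the following sketch suggests. The namespace, function and enumeration names are assumptions chosen for illustration.

```cpp
#include <cassert>

inline namespace ns_a { int f(int)    { return 1; } }
inline namespace ns_b { int f(double) { return 2; } }

// An unscoped enumeration is a transparent context: its enumerators are
// visible in ns_c and, through the inline namespace, at translation-unit scope.
inline namespace ns_c { enum Color { Red, Green }; }

// `f` is found in two namespaces, but both results are functions, so they
// form an overload set instead of an ambiguity.
int call_with_int()    { return f(1); }
int call_with_double() { return f(1.0); }

// `Green` is reachable without qualification from the enclosing scope.
int green_value() { return Green; }
```

This is why overloads can be left in the lookup table, whereas redefining an unscoped enumeration has to take its enumerators into account as well.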
5 Cling implementation
This section describes the implementation of the
aforementioned technique on top of the Cling C++ interpreter. Given
that Cling’s architecture allows for AST transformation
before the JIT compilation takes place, all the additional
handling required for supporting redefinition has been fitted into
the new DefinitionShadower AST transformer¹.
¹ DefinitionShadower has been merged into the Cling master branch. See
https://github.com/root-project/cling/.
Table 1. Syntax-directed definition to nest declarations into a namespace

Production:
  function-definition D → ··· declarator virt-specifier-seq_opt function-body
Semantic rules:
  D.node = new FunctionDefinition(..., declarator, function_body)
  D.DeclContext = new Namespace("__NS_xxx", INLINE, { D.node })
  D.node = D.DeclContext

Production:
  simple-declaration D → ··· decl-specifier-seq init-declarator-list ';'
Semantic rules:
  D.node = new SimpleDeclaration(...)
  D.DeclContext = new Namespace("__NS_xxx", INLINE, { D.node })
  D.node = D.DeclContext

  ⋮
Listing 3. Modified SDT that allows redefining entities
function-definition → attribute-specifier-seq_opt decl-specifier-seq_opt declarator virt-specifier-seq_opt
                      function-body { ··· ; allow_redefine(); }
simple-declaration → decl-specifier-seq init-declarator-list_opt ';'
                   | attribute-specifier-seq decl-specifier-seq init-declarator-list ';'
                     { ··· ; allow_redefine(); }
···
Cling AST transformers run strictly in the order in which they
are registered. As will be discussed in Sections 5.1.2 and 5.1.3,
DefinitionShadower must run before the existing DeclExtractor
transformer to produce the expected behavior.
5.1 The DefinitionShadower AST transformer
This transformer employs the aforementioned “shadowing”
technique, and therefore needs to rewrite most top-level
declarations as if they were nested into an inline namespace,
and to apply the fixes detailed in Section 4.3 to the lookup
table of the enclosing scope (TU), so that unqualified lookup
always resolves to the latest declaration. These changes do
not require a patch to the Clang sources, and can be entirely
implemented in Cling.
5.1.1 Namespacing top-level declarations
The DefinitionShadower::Transform(Decl *) function
implements the transformation described in Section 4.2.
Specifically, it performs the following steps: (i) creating, if
needed, a uniquely-named per-transaction NamespaceDecl node
(referred to as DefinitionShadowNS) that has been marked as
inline, and adding it to the TranslationUnitDecl declaration
list; (ii) removing the given named declaration from the
TranslationUnitDecl declaration list; (iii) setting its
declaration context to the DefinitionShadowNS namespace; and
(iv) adding it to the DefinitionShadowNS declaration list.
Note that step (iii) fails for out-of-line member function
definitions, because the semantic declaration context should
be the CXXRecordDecl of the class, and cannot be changed.
Therefore, out-of-line member functions cannot be directly
shadowed. As a workaround, the owning class has to be
redefined prior to attaching a new out-of-line function definition.
Additionally, because function template instantiations
inherit the declaration context of the templated declaration, the
instantiation pattern must also be updated. Otherwise, if we
try to redefine a templated function, the mangled name for
template instantiations may clash with a previous definition
of the same template.
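The restriction on out-of-line member functions has a source-level analogue that may help build intuition. The sketch below is standard C++, not the Clang-internal mechanism, and the `shadow_ns_*` names are hypothetical stand-ins for the synthesized namespaces.

```cpp
#include <cassert>

// Stand-in for a synthesized per-input namespace (the name is hypothetical).
inline namespace shadow_ns_50 {
    struct S { int f(); };
}

// Well-formed: the translation unit encloses shadow_ns_50, so the member can
// be defined here, but the definition still belongs to S.
int S::f() { return 42; }

// Ill-formed if uncommented: shadow_ns_51 does not enclose S.
// inline namespace shadow_ns_51 { int S::f() { return 0; } }

int call_member() { return S{}.f(); }
```

A member definition is tied to its class and not to the scope where it is written, so moving it into a fresh namespace is not an option; redefining the class first, as the workaround above suggests, sidesteps this.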
5.1.2 Adjusting the translation unit lookup table
The required patching of the translation-unit lookup table is
performed by the invalidatePreviousDefinitions(Decl *D)
function. Provided that D is a definition, this function
hides any previous definition of the same entity from Sema
lookup. Note that, while unqualified lookup will only return
the latest definition, shadowed declarations remain reachable
via qualified lookup, e.g. __cling_N50::decl.
This function first checks whether the given declaration is a
wrapper function generated by Cling, in which case it iterates
through all local declarations (which will be moved by
DeclExtractor), invalidating any previous global definition.
The invalidatePreviousDefinitions(NamedDecl *D) overload
handles the invalidation of any previous definition of a named
declaration. In general, we look up the given name in the
translation unit and iterate through the results, skipping
over non-definitions. Candidates for removal are checked for
function/template overload using the Sema::IsOverload()
function, and if so they are kept. Otherwise, we remove the
declaration from the StoredDeclsList (lookup table) of the
translation unit. As a special case, because unscoped
enumerations “leak” enumerator names to the enclosing scope,
we also invalidate any previous definition of the enumerators.
Also, because some Cling extensions cache information
about declarations, e.g. TCling, we registered an interpreter
callback that provides a notification when a definition has
been shadowed. Therefore, the new DefinitionShadowed
callback may be used in that case to erase cached information.
5.1.3 Modifications to declaration extraction
The implementation required minor changes to the DeclExtractor
transformer, so as to properly move declarations
to the enclosing scope. The unmodified DeclExtractor
incorrectly assumed that this scope is always the translation
unit. However, if DefinitionShadower is enabled, the wrapper
function has been moved to an inline namespace and
declarations should be extracted into it, as can be seen in
Figure 4.
Figure 4. New DeclExtractor behavior
`-NamespaceDecl __cling_N50 inline
  |-
  `-FunctionDecl __cling_Un1Qu30 'void (void *)'
    |-ParmVarDecl vpClingValue 'void *'
    `-CompoundStmt
      |-DeclStmt
      | `-VarDecl i 'int' cinit
      |   `-IntegerLiteral 'int' 0
This fact also implies that the DefinitionShadower trans-
former should always run before DeclExtractor.
5.2 Enabling/disabling the new transformation
If registered, the AST transformer may be turned on/off for
the next input line by means of the EnableShadowing
compilation option. Compilation options control several aspects
of Cling, such as the optimization level or whether a feature
is enabled, e.g. declaration extraction, invalid memory
reference protection, etc.
EnableShadowing is set to 0 if Cling raw input is enabled.
Otherwise, if EnableShadowing equals 1, valid named top-level
declarations shall be transformed except in the following
cases:
Not typed in the Cling prompt. Shadowing is enabled
  only for declarations that were parsed from an input
  line, and is therefore disabled for #include’d files;
  otherwise, it might break system header files. Because
  Cling stores input lines in a virtual file with overridden
  contents, they may be easily recognized based on their
  source location.
Is a UsingDirectiveDecl/UsingDecl. As discussed in
  Section 4.1, using-directives and using-declarations
  should not be transformed.
Is a NamespaceDecl. Shadowing namespace members
  is currently not supported.
Is a function template instantiation. Cling copies
  input lines into a distinct virtual file and starts parsing it.
  Consequently, at end of file,
  ASTConsumer::HandleTranslationUnit() emits pending template
  instantiations. These instantiations are fed through the AST
  transformers as top-level declarations, and should be ignored
  by DefinitionShadower.
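The rules above amount to a simple predicate over a declaration and the current options. The sketch below is a toy under stated assumptions and does not use the Clang or Cling APIs; `Kind`, `DeclInfo`, `Options` and `should_shadow` are hypothetical names.

```cpp
#include <cassert>

// Hypothetical summary of the properties the rules above depend on.
enum class Kind { Variable, Function, Class, Namespace, UsingDirective, UsingDeclaration };

struct DeclInfo {
    Kind kind;
    bool from_prompt;                // parsed from an input line, not an #include'd file
    bool is_template_instantiation;  // emitted at the end of the virtual file
};

struct Options {
    bool raw_input;
    bool enable_shadowing;
};

// Returns true if the declaration should be nested into an inline namespace.
bool should_shadow(const Options& opts, const DeclInfo& d) {
    if (opts.raw_input || !opts.enable_shadowing) return false;
    if (!d.from_prompt) return false;
    if (d.is_template_instantiation) return false;
    switch (d.kind) {
        case Kind::UsingDirective:
        case Kind::UsingDeclaration:
        case Kind::Namespace:
            return false;
        default:
            return true;
    }
}
```

Ordering the checks from the cheapest global condition to the per-declaration ones keeps the common case, a plain definition typed at the prompt, on the shortest path.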
5.3 Other minor changes
Cling is able to pretty-print the type and value of an
expression. This behavior is automatically turned on if an input
line is not terminated by a semicolon. However, nesting type
declarations into a namespace changes the qualified name
of the type, which affects how it is printed.
Because the proposed transformation moves most top-level
NamedDecl nodes into a namespace, their fully qualified
name changes w.r.t. the original type name as seen by the
user. Take the input “class MyClass { ... } X” as an
example. As shown in Figure 5.b, X is pretty-printed by Cling
as “(class __cling_N50::MyClass &) @0x7f0...” after
enabling DefinitionShadower.
As can be seen, the type name shown in the output changes
w.r.t. Figure 5.a. The issue is fixed by setting the
PrintingPolicy flag SuppressUnwrittenScope = 1 in
ValuePrinter.cpp. This flag specifies whether to suppress
the parts of qualified names that are not required to be
written, e.g. inline/anonymous namespaces.
Figure 5. Cling pretty-print for “class MyClass {} X”
(a) Original Cling ValuePrinter output
root [0] class MyClass {} X
(class MyClass &) @0x7fb4deb45008
(b) DefinitionShadower enabled
root [0] class MyClass {} X
(class __cling_N50::MyClass &) @0x7f0f2da63008
(c) DefinitionShadower enabled and fixed ValuePrinter
root [0] class MyClass {} X
(class MyClass &) @0x7fdf6baac008
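The difference between panels (b) and (c) comes down to whether unwritten scopes are kept when a qualified name is assembled. The following toy sketch illustrates the idea only; it is not the Clang printing code, and `Scope` and `qualified_name` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One component of a qualified name; `unwritten` marks inline or anonymous
// namespaces, which a user never has to spell out.
struct Scope {
    std::string name;
    bool unwritten;
};

// Joins the scopes and the entity name with "::", optionally skipping the
// components that are not required to be written.
std::string qualified_name(const std::vector<Scope>& scopes,
                           const std::string& entity,
                           bool suppress_unwritten_scope) {
    std::string out;
    for (const Scope& s : scopes) {
        if (suppress_unwritten_scope && s.unwritten) continue;
        out += s.name + "::";
    }
    return out + entity;
}
```

With the flag off, the synthesized namespace appears in the printed name; with it on, the user sees the name as originally typed.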
5.4 Limitations
While the current implementation closes the behavioral gap
between the Cling C++ interpreter and other interpreted
languages, e.g. Python, it has some known limitations that
restrict its use, namely:
Shadowing a global object does not free storage. In
  C++, an l-value designates an object that has a memory
  location, e.g. a variable, and therefore it may appear on the
  left-hand side of an assignment expression. A shadowed
  l-value cannot be found via unqualified lookup,
  but the memory it was referring to is still allocated.
  Furthermore, these objects can be referenced using
  their qualified name.
Changes in RTTI type information. Run-Time Type
  Identification (RTTI) is a C++ mechanism for type
  introspection. As discussed in the previous section,
  nesting type declarations into a namespace changes
  the qualified name of types, which might be a problem
  for applications relying heavily on RTTI. Fixing this
  issue requires additional patches to the compiler.
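Both limitations can be observed at the source level with ordinary inline namespaces. The sketch below is standard C++ and uses hypothetical `shadow_ns_*` names in place of the synthesized ones.

```cpp
#include <cassert>
#include <type_traits>
#include <typeinfo>

inline namespace shadow_ns_0 { struct MyClass { int x = 1; };      int  counter = 10; }
inline namespace shadow_ns_1 { struct MyClass { double y = 2.0; }; long counter = 20; }

// Same unqualified name, but two unrelated types as far as the compiler
// and RTTI are concerned.
static_assert(!std::is_same<shadow_ns_0::MyClass, shadow_ns_1::MyClass>::value,
              "each namespace owns a distinct type");

bool rtti_differs() {
    return typeid(shadow_ns_0::MyClass) != typeid(shadow_ns_1::MyClass);
}

// Both objects keep their own storage for the lifetime of the program.
bool storage_is_distinct() {
    const void* first  = &shadow_ns_0::counter;
    const void* second = &shadow_ns_1::counter;
    return first != second;
}
```

Code that stored or compared type information for the first `MyClass` would therefore not recognize objects of the second one, even though the user typed the same class name.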
6 Validation
This section presents an analysis of the behavior of Cling’s
DefinitionShadower to validate the correctness of the proposed
entity redefinition technique. To do so, we provide a close-up
view of the resulting AST and lookup table state for a
set of examples that covers most of the recurrent uses of
interpreted C++.
Table 2 shows the step-by-step sequential execution of
interpreted code in Cling, including the transformed AST
and the lookup table state.
In the first line, an integer variable (int i) is declared. This
declaration is wrapped in a function named __cling_Un1Qu30.
Then the AST is generated, and both the DefinitionShadower
and DeclExtractor transformers are executed. First,
DefinitionShadower transforms the AST by nesting the function
into the __cling_N50 inline namespace. Then, DeclExtractor
extracts the declaration out of the wrapper function,
yielding the AST shown in Table 2. Given that this is the first
declaration named i, the TU lookup table is not modified.
In the second line, a variable with the same name but a
different type (double i) is declared. As before, the declaration
is wrapped in a function named __cling_Un1Qu31. After the
AST is created, DefinitionShadower nests the function into
the __cling_N51 inline namespace, and also removes the
previous entry for the given named declaration from the TU
lookup table. Finally, DeclExtractor extracts the declaration
out of the wrapper function.
Lines three and four declare two different functions with
the same name. However, since they have different
parameters, they can be considered function overloads. These
input lines do not need to be wrapped. However, both are
nested into inline namespaces (__cling_N52 and __cling_N53,
respectively) by DefinitionShadower. In this case,
DeclExtractor does not do anything and the TU lookup table is
not modified.
Line five declares a function with the same name and
parameters as the one on line four. As before, this
declaration does not require a wrapper. Again, DefinitionShadower
nests the declaration into a namespace (__cling_N54). This
transformer also removes the previous entry with the same
name and parameters from the TU lookup table.
Lines six through nine declare a templated structure with
one member (struct S), and an instance of S<int> with
the same name as the functions presented on lines three,
four and five. In this case, only the variable declaration has
been wrapped into function __cling_Un1Qu30. DefinitionShadower
nests the templated structure, along with its
specializations, into an inline namespace (__cling_N56). The
function wrapping the variable declaration is nested into a
different inline namespace (__cling_N57). This transformer
also removes all previous entries in the TU lookup table
that have the given name. Finally, DeclExtractor extracts the
declaration out of the wrapper function.
Lines ten through thirteen replace the templated structure introduced in lines six and seven. We also declare an instance of this structure (S<double> g). As before, the variable declaration requires a wrapper function, __cling_Un1Qu34. DefinitionShadower nests the templated structure, along with its specializations, into the __cling_N58 namespace. The variable declaration, on the other hand, is nested into a different inline namespace (__cling_N59). The transformer also removes the entry for the previous declaration of the structure from the TU lookup table. Finally, DeclExtractor extracts the variable declaration out of the wrapper function.
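The two versions of the templated structure can be pictured as distinct class templates that happen to share a name. The following sketch uses hypothetical namespace names and qualified references, since a standard compiler has no lookup-table adjustment to disambiguate an unqualified S.

```cpp
#include <cassert>
#include <type_traits>

inline namespace cling_N56 {
template <typename T> struct S { T i; };       // input lines 6-7
}
inline namespace cling_N57 {
cling_N56::S<int> f{99};                       // input line 9
}
inline namespace cling_N58 {
template <typename T> struct S { T i, j; };    // input lines 10-11
}
inline namespace cling_N59 {
cling_N58::S<double> g{0, 33.0};               // input line 13
}

// The redefinition introduces a new, unrelated template: objects created
// from the first S keep their original type and layout.
static_assert(!std::is_same<cling_N56::S<double>, cling_N58::S<double>>::value,
              "old and new S are different templates");
static_assert(std::is_same<decltype(cling_N57::f), cling_N56::S<int>>::value,
              "f still has the type from the first definition");
```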
Finally, in lines fourteen through seventeen, we introduce a using directive (using namespace std) and the NS namespace. Neither of these declarations has to be wrapped into a function. Also, DefinitionShadower does not modify the AST because both using directives and user-defined namespaces are considered exceptions. In this case, DeclExtractor does not do anything and the TU lookup table is not modified.
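The per-declaration decisions described in this walkthrough can be summarized as a small decision function. This is a sketch covering only the declaration kinds that appear in Table 2; DeclKind, Handling and handling_for are hypothetical names, not part of Cling.

```cpp
#include <cassert>

// Declaration kinds that appear in the walkthrough of Table 2.
enum class DeclKind { Variable, Function, ClassTemplate, UsingDirective, Namespace };

struct Handling {
    bool wrapped;  // placed in a wrapper function, later undone by DeclExtractor
    bool nested;   // moved into a fresh inline namespace by DefinitionShadower
};

// Mirrors the text: variables are wrapped and nested; functions and class
// templates are nested only; using directives and namespaces are left alone.
constexpr Handling handling_for(DeclKind kind) {
    return kind == DeclKind::Variable
               ? Handling{true, true}
               : (kind == DeclKind::Function || kind == DeclKind::ClassTemplate)
                     ? Handling{false, true}
                     : Handling{false, false};
}
```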
As can be seen, this proposal improves the user experience of using interpreted C++ for fast application prototyping, while the code can still be reused for the high-performance compiled version. Moreover, in a Jupyter notebook environment the user is allowed to edit existing cells and change type/function definitions.
Finally, for the sake of completeness, we have also evaluated the overhead caused by the transformations performed by the DefinitionShadower. To do so, we have compared the run time (JIT compilation and code execution) of the same test program with entity redefinition enabled and disabled. This test program comprises a varying number of top-level declarations of different types (function, class or variable), ranging from 128 to 16384. Note that, when entity redefinition is enabled, all the declarations have been given the same name. To obtain the overhead, we performed multiple executions and measured the average run time. All the executions were run on a platform comprising 24 × Intel(R) Xeon(R) CPU E5-2695 v2 running at 2.40 GHz, and 128 GB of RAM.
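A test program of this shape could be produced by a small generator such as the one below. The exact declarations used in the evaluation are not given here, so the bodies (an empty-ish function, a one-member struct, an int variable) and the names Kind and make_test_program are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class Kind { Function, Class, Variable };

// Builds the text of a program with `n` top-level declarations.
// With `same_name` set, every declaration reuses one name (the
// redefinition case); otherwise each one gets a unique numeric suffix.
std::string make_test_program(Kind kind, std::size_t n, bool same_name) {
    std::string src;
    for (std::size_t k = 0; k < n; ++k) {
        const std::string name =
            same_name ? std::string("e") : "e" + std::to_string(k);
        switch (kind) {
            case Kind::Function: src += "int " + name + "() { return 0; }\n"; break;
            case Kind::Class:    src += "struct " + name + " { int m; };\n"; break;
            case Kind::Variable: src += "int " + name + " = 0;\n"; break;
        }
    }
    return src;
}
```

Calling make_test_program for each kind with n taken from 128, 256, ..., 16384 would give one input per data point, to be piped through the interpreter and timed.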
CC’20, February 22–26, 2019, San Diego, CA, USA J. López-Gómez, J. Fernández, D. del Rio Astorga, V. Vassilev, et al.
Table 2. Step-by-step execution of interpreted code
Each row lists the input code, the transformed AST, and the TU lookup table (Name | Type | Value).

Code:
    1  int i = 1;
Transformed AST:
    |-NamespaceDecl __cling_N50 inline
    | |-VarDecl used i 'int' cinit
    | | `-IntegerLiteral 'int' 1
    | `-FunctionDecl __cling_Un1Qu30 'void (void *)'
TU lookup table:
    i | int | 1

Code:
    2  double i = 3.141592;
Transformed AST:
    |-NamespaceDecl __cling_N51 inline
    | |-VarDecl used i 'double' cinit
    | | `-FloatingLiteral 'double' 3.141592e+00
    | `-FunctionDecl __cling_Un1Qu31 'void (void *)'
TU lookup table:
    i | int | 1
    i | double | 3.141592

Code:
    3  char f(int x) { return 'X'; }
    4  int f() { return 0; }
Transformed AST:
    |-NamespaceDecl __cling_N52 inline
    | `-FunctionDecl f 'char (int)'
    |   |-ParmVarDecl x 'int'
    |   `-CompoundStmt
    |     `-ReturnStmt
    |       `-CharacterLiteral 'char' 88
    |-NamespaceDecl __cling_N53 inline
    | `-FunctionDecl f 'int (void)'
    |   `-CompoundStmt
    |     `-ReturnStmt
    |       `-IntegerLiteral 'int' 0
TU lookup table:
    i | double | 3.141592
    f | char (int) |
    f | int () |

Code:
    5  double f() { return 1.0; }
Transformed AST:
    |-NamespaceDecl __cling_N54 inline
    | `-FunctionDecl f 'double (void)'
    |   `-CompoundStmt
    |     `-ReturnStmt
    |       `-FloatingLiteral 'double' 1.000000e+00
TU lookup table:
    i | double | 3.141592
    f | char (int) |
    f | int () |
    f | double () |

Code:
    6  template <typename T>
    7  struct S { T i; };
    8
    9  S<int> f{99};
Transformed AST:
    |-NamespaceDecl __cling_N56 inline
    | `-ClassTemplateDecl S
    |   |-TemplateTypeParmDecl typename depth 0 index 0 T
    |   |-CXXRecordDecl struct S definition
    |   | `-FieldDecl i 'T'
    |   `-ClassTemplateSpecializationDecl struct S
    |     |-TemplateArgument type 'int'
    |     `-FieldDecl i 'int':'int'
    |-NamespaceDecl __cling_N57 inline
    | |-VarDecl f 'S<int>':'__cling_N56::S<int>'
    | | `-InitListExpr 'S<int>':'__cling_N56::S<int>'
    | |   `-IntegerLiteral 'int' 99
    | `-FunctionDecl __cling_Un1Qu33 'void (void *)'
TU lookup table:
    i | double | 3.141592
    f | char (int) |
    f | double () |
    S<T> | __cling_N55::S<T> |
    f | __cling_N55::S<int> | {99}

Code:
    10  template <typename T>
    11  struct S { T i, j; };
    12
    13  S<double> g{0, 33.0}
Transformed AST:
    |-NamespaceDecl __cling_N58 inline
    | `-ClassTemplateDecl S
    |   |-TemplateTypeParmDecl typename depth 0 index 0 T
    |   |-CXXRecordDecl struct S definition
    |   | |-FieldDecl i 'T'
    |   | `-FieldDecl j 'T'
    |   `-ClassTemplateSpecializationDecl struct S
    |     |-TemplateArgument type 'double'
    |     |-FieldDecl i 'double':'double'
    |     `-FieldDecl j 'double':'double'
    |-NamespaceDecl __cling_N59 inline
    | |-VarDecl f 'S<double>':'__cling_N58::S<double>'
    | | `-InitListExpr 'S<double>':'__cling_N58::S<double>'
    | |   |-FloatingLiteral 'double' 0.000000e+00
    | |   `-FloatingLiteral 'double' 3.300000e+01
    | `-FunctionDecl __cling_Un1Qu34 'void (void *)'
TU lookup table:
    i | double | 3.141592
    S<T> | __cling_N55::S<T> |
    f | __cling_N55::S<int> | {99}
    S<T> | __cling_N57::S<T> |
    g | __cling_N57::S<int> | {0,33.0}

Code:
    14  using namespace std;
    15  namespace NS {
    16  string s("Cling");
    17  }
Transformed AST:
    |-UsingDirectiveDecl Namespace 'std'
    |-NamespaceDecl NS
    | `-VarDecl s 'std::string':'std::basic_string<char>'
    |   `-ExprWithCleanups
    |     `-CXXConstructExpr
    |       |-ImplicitCastExpr 'const char *'
    |       | `-StringLiteral 'const char [6]' lvalue "Cling"
    |       `-CXXDefaultArgExpr
TU lookup table:
    i | double | 3.141592
    f | __cling_N55::S<int> | {99}
    S<T> | __cling_N57::S<T> |
    g | __cling_N57::S<int> | {0,33.0}
    NS | [namespace] |
Figure 6. Cling run time plots (both shadowing enabled/disabled). Panels: (a) Function (re-)definition, (b) Class (re-)definition, (c) Variable (re-)definition. Series: No shadow, Shadow; panel (c) also shows No shadow, NORT and Shadow, NORT.
As seen in Figures 6.a and 6.b, the run time behavior is similar for both the function and class definition tests, yielding a time that grows with the number of declarations. Comparing the original Cling implementation (No shadow) with our proposal (Shadow), we can conclude that the “Shadow” version incurs a non-linear overhead in the range of 4–52%. One possible explanation for this variability is that different LLVM/Clang data structure optimizations are applied depending on the entry size.
However, the test performed for variables (see Figure 6.c), while still growing with the number of declarations, exhibits a much higher run time, with a smaller overhead ranging from 2% to 13% for the “Shadow” version. This is because the wrapper functions generated around variable declarations call a Cling function that updates the internal state of the interpreter. Cling includes the -noruntime command line option that, among other things, disables this behavior. As shown in Figure 6.c, the run time using this option is comparable to the other two cases. However, the overhead is much smaller, in the range of −28% to 15%, with the “Shadow” version being faster in some cases. Again, this variability might be caused by different optimizations in LLVM/Clang data structures.
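For reference, the overhead figures can be read as a relative increase of the average run time. The sketch below assumes that definition (it is not spelled out in the text); mean and overhead_percent are hypothetical helper names, and the inputs must be non-empty.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Arithmetic mean of a non-empty list of run times.
double mean(const std::vector<double>& samples) {
    return std::accumulate(samples.begin(), samples.end(), 0.0) / samples.size();
}

// Relative overhead, in percent, of the "Shadow" runs over the
// "No shadow" baseline. A negative result means "Shadow" was faster.
double overhead_percent(const std::vector<double>& no_shadow,
                        const std::vector<double>& shadow) {
    const double base = mean(no_shadow);
    return (mean(shadow) - base) / base * 100.0;
}
```

Under this definition a negative overhead, as reported for some -noruntime runs, simply means the averaged “Shadow” time was below the baseline.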
In light of the results, we can conclude that the proposed technique brings interpreted C++ closer to the behaviour of an interpreted language, with moderate overheads.
7 Conclusion and future work
Interpreted languages have proven to be a good solution for fast prototyping in agile development methodologies, and for closing the gap between domain experts and application development. Since interpreted languages incur extra overhead at runtime, in some cases it is necessary to generate a compiled version of the application. To pave the way, multiple implementations of C and C++ interpreters have been developed. Nonetheless, these languages present some inherent limitations due to their compiled nature. In this paper, we present a technique to support entity redefinition in C++ interpreters. Relaxing the ODR aids rapid prototyping while keeping the C++ language relatively sound and consistent for the particular use case.
To validate the proposed technique, we have implemented the DefinitionShadower AST transformer to support entity redefinition in Cling with moderate overheads. As observed through the validation, entities can be given a new definition similarly to other interpreted languages. It is important to remark that the presented Cling implementation is part of the ROOT master branch, and will be used by the domain experts at CERN by the end of 2019.
As future work, we plan to address limitations of the current implementation, including allocated storage issues and RTTI. Also, we intend to add more features to Cling, such as the generation of debugging information for JIT'ed code.
Acknowledgments
This work has been partially funded by the Spanish Ministry of Economy and Competitiveness through Project Grant TIN2016-79637-P (BigHPC: Towards Unification of HPC and Big Data Paradigms).
A Artifact Appendix
A.1 Abstract
The artifact contains a pre-compiled Cling C++ interpreter in its latest GitHub revision (595580b), and the required shell scripts for evaluation. The provided files perform all the checks included in Table 2 and Figure 6, in addition to more advanced tests, e.g. shadowing typedefs, SFINAE support, etc. To evaluate the artifact, run the scripts and check the results.
A.2 Artifact check-list (meta-information)
Program: C++ code; included
Compilation: provided artifact compiled using gcc-9.2.0; minimum required gcc 4.8.5
Run-time environment: artifact provides ArchLinux 2019.12.01-x86_64; works in any of the platforms supported by the ROOT project.
Hardware: precompiled version is built for x86_64; if building from sources: any LLVM supported architecture.
Output: Runtime or test pass/failure
How much time is needed to complete experiments (approximately)?: 5–7 minutes
Publicly available?: Yes
Code licenses (if publicly available)?: Cling Release License, UI/NCSAOSL, LGPL.
Archived (provide DOI)?: 10.5281/zenodo.3579301
DOI: 10.5281/zenodo.3579302
A.3 Description
A.3.1 How delivered
Cling is open sourced under the UI/NCSAOSL and LGPL licenses, and is hosted on GitHub at https://github.com/root-project/cling/. The repository contains source code, documentation and build instructions. The provided VM image contains a pre-compiled Cling version from the GitHub repository at revision 595580b, along with all the required dependencies.
A.3.2 Software dependencies
Cling requires gcc 4.8.5 or later, LLVM, Clang, cmake and a number of other dependencies. Required packages may be checked and installed using the Cling Packaging Tool (CPT).
A.4 Installation
For convenience, we provide a VM image in OVA format that contains all the required dependencies and may be downloaded at: https://doi.org/10.5281/zenodo.3579301. Users of this VM should skip to Section A.5.
Alternatively, Cling may be compiled as specified in the GitHub project page. To build Cling run the commands below:
$ wget https://raw.githubusercontent.com/root-project/cling/master/tools/packaging/cpt.py
$ chmod +x cpt.py
$ ./cpt.py --check-requirements && ./cpt.py --create-dev-env Release --with-workdir=./cling-build/
$ export PATH=./cling-build/builddir/bin:$PATH
Note that building Cling requires at least 3 GiB of disk space.
A.5 Experiment workflow
The test-xxx.sh shell scripts automate the evaluation process. These scripts are described below:
test-noshadow.sh. Pipes the INTERPRETME.C file through Cling, with definition shadowing disabled. This file makes heavy use of the technique described in the paper, so its execution will fail. The compiler is expected to produce “error: redefinition of ...” diagnostics.
test-shadow.sh. Pipes the INTERPRETME.C file through Cling, with definition shadowing enabled; its execution is expected to be successful. Compare the output against the //CHECK: comments. Some of these tests were also discussed in Table 2 in the paper.
test-perf.sh. Runs performance tests and generates plots using gnuplot. These may vary w.r.t. those in Figure 6, depending on your hardware. Interference should be avoided while the script is running.
To proceed with evaluation, run the scripts as follows:
$ ./test-noshadow.sh
$ ./test-shadow.sh
$ ./test-perf.sh out.pdf
$ evince out.pdf
A.6 Evaluation and expected result
For test-noshadow.sh, the interpreter outputs “error: redefinition of ...” diagnostics. For test-shadow.sh, the interpreter generates no diagnostics and the output should match what is specified in the //CHECK: comments in INTERPRETME.C.
For test-perf.sh, run time plots are written to a PDF file. The results are hardware-dependent; however, in general the overall shape of the plots should be preserved.
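The comparison against the //CHECK: comments is described as a manual step. Should one want to automate it, a checker along the following lines could be used. This is a sketch under assumptions: it treats the text after each //CHECK: marker as a string that must appear in the interpreter output, in order, which may not match the exact convention used in INTERPRETME.C; expected_lines and output_matches are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Collects the text that follows each "//CHECK:" marker in `source`,
// with leading blanks removed.
std::vector<std::string> expected_lines(const std::string& source) {
    const std::string marker = "//CHECK:";
    std::vector<std::string> out;
    std::istringstream in(source);
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t pos = line.find(marker);
        if (pos == std::string::npos) continue;
        const std::string text = line.substr(pos + marker.size());
        const std::size_t first = text.find_first_not_of(" \t");
        out.push_back(first == std::string::npos ? std::string()
                                                 : text.substr(first));
    }
    return out;
}

// True if every expected string occurs in `output`, in source order.
bool output_matches(const std::string& source, const std::string& output) {
    std::size_t cursor = 0;
    for (const std::string& want : expected_lines(source)) {
        const std::size_t pos = output.find(want, cursor);
        if (pos == std::string::npos) return false;
        cursor = pos + want.size();
    }
    return true;
}
```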
A.7 Experiment customization
The INTERPRETME.C file may be modified to run other user-defined tests. Alternatively, C++ code may be typed in a Cling interactive session, e.g.
$ cling
[cling]$ // type C++ code here
Note that the proposed transformation is automatically enabled for the ROOT project, but it is disabled by default in stand-alone Cling. However, it may be enabled by typing the following lines:
[cling]$ #include "cling/Interpreter/Interpreter.h"
[cling]$ cling::runtime::gCling->allowRedefinition();
A.8 Notes
More information about Cling and the ROOT project is available at https://github.com/root-project/cling/ and https://github.com/root-project/root/, respectively.
Meaning may be assigned to a string in a context-free language by defining attributes of the symbols in a derivation tree for that string. The attributes can be defined by functions associated with each production in the grammar. This paper examines the implications of this process when some of the attributes are synthesized, i.e., defined solely in terms of attributes of thedescendants of the corresponding nonterminal symbol, while other attributes are inherited, i.e., defined in terms of attributes of theancestors of the nonterminal symbol. An algorithm is given which detects when such semantic rules could possibly lead to circular definition of some attributes. An example is given of a simple programming language defined with both inherited and synthesized attributes, and the method of definition is compared to other techniques for formal specification of semantics which have appeared in the literature.