Relaxing the One Definition Rule in Interpreted C++
Javier López-Gómez
Department of Computer Science
University Carlos III of Madrid
28911–Leganés, Spain
jalopezg@inf.uc3m.es
Javier Fernández
Department of Computer Science
University Carlos III of Madrid
28911–Leganés, Spain
jfmunoz@inf.uc3m.es
David del Rio Astorga
Department of Computer Science
University Carlos III of Madrid
28911–Leganés, Spain
drio@pa.uc3m.es
Vassil Vassilev
Princeton University
New Jersey 08544, United States
vvasilev@cern.ch
Axel Naumann
Experimental Physics
CERN
1211 Geneva 23, Switzerland
axel.naumann@cern.ch
J. Daniel García
Department of Computer Science
University Carlos III of Madrid
28911–Leganés, Spain
jdgarcia@inf.uc3m.es
Abstract
Most implementations of the C++ programming language generate binary executable code. However, interpreted execution of C++ sources has its own use cases, as the Cling interpreter from CERN’s ROOT project has shown. Some limitations derive from the ODR (One Definition Rule), which rules out multiple definitions of entities within a single translation unit (TU). The ODR is there to ensure a uniform view of a given C++ entity across translation units, which helps when producing ABI-compatible binaries. Interpreting C++ presumes a single ever-growing translation unit, which defines away some of the ODR use cases. Therefore, it may well be desirable to relax the ODR and, consequently, to support the ability of developers to override any existing definition for a given declaration. This approach is especially well-suited for iterative prototyping. In this paper, we extend Cling, a Clang/LLVM-based C++ interpreter, to enable redefinitions of C++ entities at the prompt. To achieve this, top-level declarations are nested into inline namespaces and the translation unit lookup table is adjusted to invalidate previous definitions that would otherwise result in ambiguities. Formally, this technique refactors the code to an equivalent that does not violate the ODR, as each definition is nested in a different namespace. Furthermore, any previous definition that has been shadowed is still accessible by means of its fully-qualified name. A prototype implementation of the presented technique has been integrated into the Cling C++ interpreter, showing that our technique is feasible and usable.
Keywords C++, interpreter, One-Definition-Rule, Cling
ACM Reference Format:
Javier López-Gómez, Javier Fernández, David del Rio Astorga, Vassil
Vassilev, Axel Naumann, and J. Daniel García. 2019. Relaxing the
One Definition Rule in Interpreted C++. In Proceedings of ACM
SIGPLAN 2020 International Conference on Compiler Construction
(CC’20). ACM, New York, NY, USA, 11 pages.
CC’20, February 22–26, 2019, San Diego, CA, USA
2019.
1 Introduction
Recently, interpreted languages have been widely adopted for application prototyping in multiple areas and to aid inexperienced users in defining the logic of their applications. In that regard, applications developed in compiled languages for performance reasons can benefit from an interpreter of the same language for rapid prototyping, which reduces the time-to-market. Following this idea, CERN’s ROOT project has demonstrated that the use of a C++ interpreter (Cling) can reduce the effort needed to develop prototypes and to transform them into high-performance applications.
However, since the C++ language has been designed to be compiled, interpreting this language in a user-friendly way presents some challenges. In this paper, we focus on providing Cling with the ability to redefine entities, such as variables, functions, and types, in a similar way to other interpreted languages like Python. To do so, it is necessary to relax the C++ One Definition Rule so that more than one definition per translation unit is allowed.
In this paper, we present a formalization for supporting entity redefinition in interpreted C++ and we implement this behavior in Cling as a validation of the proposed technique. Specifically, this paper makes the following contributions:
• We present a formalization for relaxing the ODR in C++ that can be leveraged by any C++ interpreter.
• We implement an Abstract Syntax Tree (AST) transformer to support entity redefinition in a real-world C++ interpreter.
• We analyze the output of the new transformer to validate the proposed technique, and evaluate the possible overhead.
The rest of this document is organized as follows. Section 2 revisits some related works in the area. Section 3 gives an overview of CERN’s ROOT framework and the Cling interpreter. Section 4 presents the formalization for supporting entity redefinition. Section 5 describes the implementation of the AST transformer required for relaxing the ODR in Cling. In Section 6, we employ some examples to validate our proposal and evaluate the overhead introduced by the required additional handling. Finally, Section 7 closes this paper with some concluding remarks and future work.
2 State of the art
Interpreted languages have become popular in the industrial and scientific areas. This is mainly due to three important characteristics: (i) the adoption of agile software development methodologies based on fast application prototyping (Rapid Application Development [14]); (ii) the need to provide tools that ease application development for non-experts, which is important in scientific areas and industrial data management; (iii) the increased portability with respect to compiled languages. For instance, Scala [16] has been widely adopted for managing large data sets in Big Data applications. On the other hand, Python [18] has become even more popular thanks to its high-level abstractions that help domain experts to develop scientific applications [17].
However, interpreted languages are slower than compiled ones due to instruction generation at run-time and the difficulties in exploiting the available resources of a given platform. Thus, to increase performance, interpreters of these languages leverage Just-in-Time (JIT) compilation techniques to generate a compiled version of application hot-paths, e.g. PyPy [19], HOPE [2]. Additionally, multiple libraries implemented in high-performance compiled languages provide bindings to be used in interpreted applications to improve performance, e.g. TensorFlow [1]. Moreover, it is worth mentioning that almost every contemporary programming language has a Read-Eval-Print-Loop (REPL), also known as a language shell, e.g. Swift [4], which despite being a compiled language heavily supports REPL-style development.
Nevertheless, in order to obtain the maximum performance and to minimize response times in a production build of the application, it is necessary to generate a compiled binary or to port the interpreted application to a compiled language. In this sense, we can find two major approaches: (i) tools that generate compiled applications, or bytecode that runs on a Virtual Machine (VM), from an interpreted language script [15], and (ii) the development of interpreters for typically compiled languages to reduce the code transformation needed for the production version.
Some examples of tools able to compile Python scripts are Cython [5], Jython [11] and IronPython [8]. For instance, Cython is a Python and C compiler that can generate optimized binaries. However, this tool requires using a superset of the Python language to exploit the available resources, e.g. releasing the global interpreter lock to exploit thread parallelism. For this reason, these tools are mainly used for implementing libraries that will be used from an interpreted script.
On the other hand, several interpreters for the C and C++ languages can be found: Ch [6], Clip [13], CInt [9], UnderC [7] and Cling [20]. These tools allow developers to take advantage of interpreted languages for fast application prototyping, resulting in code that can be compiled with minimal effort. Standards for compiled languages, and C++ specifically, present some limitations when the language is used in an interpreted fashion. An example of these limitations is the C++ ODR, which forbids entity redefinition in the same translation unit [10]. This limitation is not required in interpreted languages, since the definition of an entity depends only on the interpretation order of the script code. This way, an entity definition is valid until its next redefinition. In this paper, we present a technique to allow entity redefinition in a C++ interpreter while keeping the C++ language consistent.
3 Background
In this section, we describe the ROOT project and its C++ interpreter (Cling), widely used in the High-Energy Physics (HEP) community and other scientific areas.
3.1 ROOT project
ROOT [3] is a cross-platform C++ framework for data processing in the high-energy physics area, developed mostly at CERN. This framework is designed for storing and analyzing large amounts of data. Basically, it provides the following components:
Data model. The ROOT framework provides a data model that allows data, represented as C++ objects, to be stored in compressed, machine-independent binary files. Those binary files also store the format description of the data, allowing access to the information from anywhere.
Statistics and data analysis libraries. ROOT also provides a large set of tools for mathematical and statistical analysis that can easily operate over ROOT files. Furthermore, it also provides visualization tools to display histograms, scatter plots and function fitting. Additionally, these tools take full advantage of C++ features and parallel processing techniques.
Interactive C++ interpreter. This component provides a C++ interpreter (Cling) for interactive development; the resulting application can then be compiled to exploit the available resources. This interpreter can also be used with user-friendly development environments designed for interpreted languages such as Jupyter notebook, either through ROOT or the Xeus-cling project.
Other language bindings. ROOT provides a set of bindings that allow the framework to be used from different languages such as Python, R and Mathematica. In the context of this paper, the Python bindings (PyROOT/cppyy) are especially relevant as they leverage Cling to access the C++ side at run-time.
3.2 Cling
Cling is a Clang/LLVM-based C++ interpreter developed at CERN that has been adopted as the interpreter for the ROOT project. Cling leverages the Clang/LLVM infrastructure for parsing and code generation, meaning that it only has to deal with issues derived from C++ interpretation. This keeps the Cling codebase reasonably small (about 36K LOC) and eases maintenance. An overview of Cling is shown in Figure 1.
Figure 1. Cling input transformation
Input line → Wrap in function? → Parse (Clang) → AST transformers → JIT + exec
In general, Cling users expect a Python-like interaction. In other words, the user expects the interpreter to accept a statement even if it does not appear as part of the body of a function. However, this practice usually results in ill-formed code according to ISO C++ [10]. If the input line cannot be proved to be valid, it will be wrapped in a uniquely-named function. At this stage, several simple cases can be detected as valid (functions, classes, namespaces, etc.). However, Cling is not able to do so for variable declarations, such as “int i = 0;”. This is fixed later by the DeclExtractor transformer, which extracts declarations out of the wrapper functions. Cling also supports a “raw input” mode, in which this “wrapping in functions” stage is skipped completely.
After turning the user code into valid C++, it can be nor-
mally parsed by Clang. The output of this stage is the abstract
syntax tree (AST) for the parsed top-level declarations. Clang
also adds these to the translation unit declaration list. In this
sense, the TU is constructed incrementally.
The generated AST for top-level declarations, i.e. those that appear at the TU level, may be transformed to support other Cling features. This processing is performed by independent transformation blocks which are executed sequentially after the AST is created. These blocks may be classified as either an ASTTransformer (applied to all parsed declarations) or a WrapperTransformer (applied only to wrappers generated in the first stage). For example, declaration statements that were previously wrapped into a function must be moved back to the global scope (TU), which is done by the DeclExtractor transformer. Figure 2 shows the modifications performed by DeclExtractor for the input line “int i = 0, j;”. Additionally, Cling includes transformers to support other features, e.g. auto specifier synthesis, invalid memory reference protection, etc.
The last step in the interpreter pipeline is just-in-time (JIT) compilation and execution. Cling offloads this task to LLVM.
Figure 2. Transformation performed by DeclExtractor
|-
`-FunctionDecl __cling_Un1Qu30 'void (void *)'
  |-ParmVarDecl vpClingValue 'void *'
  `-CompoundStmt
    |-DeclStmt
    | |-VarDecl i 'int' cinit
    | | `-IntegerLiteral 'int' 0
    | `-VarDecl j 'int'
4 Proposal for entity redefinition
This section introduces the proposed technique to override a previous definition for a given declaration. The described procedure relies on nesting each redeclaration into its own scope by using C++ inline named namespaces. Therefore, using this technique neither incurs a violation of the ODR nor requires major changes to the compiler. According to ISO C++ [10], members of an inline namespace can be accessed as if they were members of the enclosing namespace, i.e. names introduced by such a namespace “leak” to the enclosing scope. However, as shown in Listing 1, if a name is made available in the enclosing scope through more than one inline namespace, unqualified lookup for the given name is ambiguous. In Section 4.3, we tackle this issue by manually adjusting the lookup table of the enclosing scope.
Listing 1. Ambiguous unqualified lookup
inline namespace ns0 { int i = 0; }
inline namespace ns1 { double i = 1.0; }
auto j = i; // unqualified lookup is ambiguous
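The ambiguity affects only unqualified lookup. As a minimal sketch in plain ISO C++ (compiled, with no interpreter involved), the two definitions of Listing 1 can coexist and each one remains reachable through its qualified name. The helper functions first_i() and second_i() are hypothetical and exist only for illustration.

```cpp
// Plain ISO C++; ns0 and ns1 follow Listing 1.
inline namespace ns0 { int i = 0; }
inline namespace ns1 { double i = 1.0; }

// auto j = i;  // ill-formed: unqualified lookup of 'i' is ambiguous

// Qualified lookup selects exactly one definition.
// first_i() and second_i() are hypothetical helpers for illustration.
int first_i() { return ns0::i; }
double second_i() { return ns1::i; }
```

This is the property that the rest of the section builds on: once the ambiguity of the unqualified name is resolved, no definition has to be destroyed.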
4.1 Covered cases and exceptions
Not all declarations that introduce a name are subject to the aforementioned transformation. Instead, it can only be applied in contexts where an inline namespace may be used. Therefore, to avoid ill-formed namespace constructs, we restrict this transformation to the translation-unit level. Similarly, only named declarations that are definitions, or that may be defined later, i.e. forward declarations, should be moved into a namespace.
Additionally, some declarations that introduce a name must not be nested into a namespace, either because repetition is allowed, or because nesting them changes the original meaning. This includes:
using-directive e.g. using namespace std, which makes all the names in std visible for unqualified name lookup. Such declarations shall not be transformed, since issuing a using-directive twice in a given scope does not pose any problem.
using-declaration e.g. using std::vector, which makes vector accessible for unqualified lookup in the current scope. The same rationale applies to these declarations.
4.2 Rules for AST transformation
The proposed transformation may be formally described using a syntax-directed definition (SDD) [12], which appends semantic rules to the grammar productions relating to declarations whose redefinition is to be allowed. In this notation, each grammar production is associated with a set of semantic rules that are evaluated in the specified order. Each rule may either set the value of an attribute for the given entity, e.g. E.attr = ..., or call a function that may have side-effects.
As shown in Table 1, for top-level redefinable declarations, the semantic declaration context (DeclContext attribute) is set to a synthesized uniquely-named inline namespace, which in turn is added to the translation unit. Added semantic rules have been typeset in boldface. Due to space limitations, only some productions are shown.
4.3 Invalidation of ambiguous unqualified names
Rules in Table 1 cause target top-level declarations to be nested into inline namespaces. Because inline namespaces make their members visible in the enclosing scope, declared names may still be accessed as if they were not part of a namespace. However, if the same name is “leaked” via different namespaces, unqualified lookup will fail due to ambiguity. Instead, such lookups should resolve to the latest declaration.
4.3.1 Removing ambiguity
To that aim, ambiguous lookups must be turned into non-ambiguous ones that return the expected result. As shown in Figure 3, each declared name in the inline namespace (NS1) is made visible not only in the namespace lookup table, but also in that of the enclosing scope (TU). If the same name is made visible by several namespaces (NS1, NS2, ...), then there will be more than one entry with the given name in the TU lookup table.
Therefore, to get rid of the ambiguity, existing entries for the given name must be removed, provided that they cannot be considered an overload. An overload set is a set of declarations that share the same name but that the compiler is able to disambiguate using the number and/or type of the arguments. This adjustment is made by the fix_TU_lookup_table() function. Moreover, if the whole functionality is encapsulated in the allow_redefine() function shown in Listing 2, then allowing a declaration to adopt a new definition may be accomplished simply by adding a call to allow_redefine(), as shown in the syntax-directed translation scheme (SDT) [12] excerpt in Listing 3.
Listing 2. The allow_redefine() function used in the SDT
D.DeclContext = new Namespace("__NS_xxx", INLINE, { D.node })
D.node = D.DeclContext
fix_TU_lookup_table()
Figure 3. Lookup tables for translation unit and inline namespaces
inline namespace NS1 {
  int PI = 0;
  std::string S;
}
inline namespace NS2 {
  double PI = 3.14159265,
         J;
}
···
NS1 lookup table:
  PI   int
  S    std::string
NS2 lookup table:
  PI   double = 3.14...
  J    double
TU lookup table:
  PI   int = 0
  PI   double = 3.14159265
  S    std::string
  J    double
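The lookup table adjustment can be sketched with a toy model. The code below is not Clang or Cling code: the Entry type, the LookupTable alias, the is_overload() predicate and the signature given here to fix_TU_lookup_table() are all assumptions made for illustration. It only shows the idea that a new entry evicts older entries with the same name unless they form an overload.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Toy model only: these types are hypothetical and do not exist in Clang/Cling.
struct Entry {
    std::string owner;   // inline namespace that declares the name
    bool is_function;    // true for function declarations
    std::string params;  // parameter types, e.g. "int"; empty for non-functions
};

using LookupTable = std::map<std::string, std::vector<Entry>>;

// Simplified overload test: both are functions and their parameters differ.
bool is_overload(const Entry& older, const Entry& newer) {
    return older.is_function && newer.is_function && older.params != newer.params;
}

// Register 'newer' under 'name' and drop older entries that it does not overload.
void fix_TU_lookup_table(LookupTable& tu, const std::string& name, const Entry& newer) {
    std::vector<Entry>& entries = tu[name];
    entries.erase(std::remove_if(entries.begin(), entries.end(),
                                 [&](const Entry& older) {
                                     return !is_overload(older, newer);
                                 }),
                  entries.end());
    entries.push_back(newer);
}
```

With the declarations of Figure 3, registering PI first from NS1 and then from NS2 would leave a single PI entry owned by NS2 in this toy table, which is the non-ambiguous state that the text asks for.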
4.3.2 Exceptions: overloads, unscoped enumerations, etc.
Some particular cases require either preserving existing lookup table entries or invalidating additional ones, namely:
Function overloads. If all the duplicated entries refer to a function overload, none of them shall be removed. In this case, the lookup result is said to be overloaded (not ambiguous). Additionally, ISO C++ paragraph [temp.over.link]p4 [10] must be verified for overloaded templated functions.
Unscoped enumerations. An unscoped enumeration is a transparent context, i.e. enumerators are made visible in the parent context. Because declared enumerators are made visible in the enclosing inline namespace, and therefore in the translation unit, the removal of all those names from the TU lookup table shall also be considered.
Declaration after definition. Any non-definition declaration that comes after a definition is ignored, e.g.
class C { ... };
class C; // ignored
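The first two exceptions follow from how ISO C++ already treats inline namespaces, which can be checked with a short compiled fragment. This is a sketch in plain C++, not interpreter output; the namespace names N1, N2, N3 and the helper functions are hypothetical.

```cpp
// Functions with the same name in different inline namespaces form an
// overload set at the enclosing scope, so unqualified calls are not ambiguous.
inline namespace N1 { int f(int) { return 1; } }
inline namespace N2 { int f(double) { return 2; } }

// Enumerators of an unscoped enumeration leak into the inline namespace
// and, through it, into the translation unit.
inline namespace N3 { enum Color { Red, Green }; }

// Hypothetical helpers for illustration.
int call_with_int() { return f(1); }       // overload resolution selects N1::f
int call_with_double() { return f(1.0); }  // overload resolution selects N2::f
int green_value() { return Green; }        // enumerator found at TU scope
```

Since Red and Green are visible at the translation-unit level, redefining Color would also have to invalidate those two names, which is why enumerators need separate handling.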
5 Cling implementation
This section describes the implementation of the aforementioned technique on top of the Cling C++ interpreter. Given that Cling’s architecture allows for AST transformation before JIT compilation takes place, all the additional handling required for supporting redefinition has been fitted into the new DefinitionShadower AST transformer.¹
¹ DefinitionShadower has been merged into the Cling master branch. See https://github.com/root-project/cling/.
Table 1. Syntax-directed definition to nest declarations into a namespace
Production:
  function-definition D → ··· declarator virt-specifier-seq_opt function-body
Semantic rules:
  D.node = new FunctionDefinition(..., declarator, function_body)
  D.DeclContext = new Namespace("__NS_xxx", INLINE, { D.node })
  D.node = D.DeclContext
Production:
  simple-declaration D → ··· decl-specifier-seq init-declarator-list ';'
Semantic rules:
  D.node = new SimpleDeclaration(...)
  D.DeclContext = new Namespace("__NS_xxx", INLINE, { D.node })
  D.node = D.DeclContext
···
Listing 3. Modified SDT that allows redefining entities
function-definition → attribute-specifier-seq_opt decl-specifier-seq_opt declarator virt-specifier-seq_opt
                      function-body { ···; allow_redefinition(); }
simple-declaration → decl-specifier-seq init-declarator-list_opt ';'
                   | attribute-specifier-seq decl-specifier-seq init-declarator-list ';'
                     { ···; allow_redefinition(); }
···
Cling AST transformers run strictly in the order in which they are registered. As will be discussed in Sections 5.1.2 and 5.1.3, DefinitionShadower must run before the existing DeclExtractor transformer to produce the expected behavior.
5.1 The DefinitionShadower AST transformer
This transformer employs the aforementioned “shadowing” technique. It therefore has to rewrite most top-level declarations as if they were nested into an inline namespace, and to apply the fixes detailed in Section 4.3 to the lookup table of the enclosing scope (TU), so that unqualified lookup always resolves to the latest declaration. These changes do not require a patch to the Clang sources, and can be entirely implemented in Cling.
5.1.1 Namespacing top-level declarations
The DefinitionShadower::Transform(Decl *) function implements the transformation described in Section 4.2. Specifically, it performs the following: (i) creating, if needed, a uniquely-named per-transaction NamespaceDecl node (referred to as DefinitionShadowNS) that has been marked as inline, and adding it to the TranslationUnitDecl declaration list; (ii) removing the given named declaration from the TranslationUnitDecl declaration list; (iii) setting its declaration context to the DefinitionShadowNS namespace; and (iv) adding it to the DefinitionShadowNS declaration list.
Note that step (iii) fails for out-of-line member function definitions, because the semantic declaration context should be the CXXRecordDecl of the class, and cannot be changed. Therefore, out-of-line member functions cannot be directly shadowed. As a workaround, the owning class has to be redefined prior to attaching a new out-of-line function definition.
Additionally, because function template instantiations inherit the declaration context of the templated declaration, the instantiation pattern must also be updated. Otherwise, if we try to redefine a templated function, the mangled name for template instantiations may clash with a previous definition of the same template.
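Steps (i) through (iv) can be sketched on a toy AST. The Decl and Context structs below are simplified stand-ins and not the Clang classes of the same role; the shadow() and get_or_create_shadow_ns() functions, and the "__cling_N5" + id naming scheme, are assumptions chosen only to resemble the names that appear in the examples of this paper.

```cpp
#include <algorithm>
#include <memory>
#include <string>
#include <vector>

// Toy stand-ins for AST nodes; these are NOT the real Clang classes.
struct Context;
struct Decl {
    std::string name;
    Context* context = nullptr;  // semantic declaration context
};
struct Context {
    std::string name;
    bool is_inline = false;
    std::vector<Decl*> decls;                          // declaration list
    std::vector<std::unique_ptr<Context>> namespaces;  // nested namespaces
};

// (i) Find or create the per-transaction inline namespace (hypothetical naming).
Context* get_or_create_shadow_ns(Context& tu, unsigned transaction_id) {
    const std::string name = "__cling_N5" + std::to_string(transaction_id);
    for (auto& ns : tu.namespaces)
        if (ns->name == name) return ns.get();
    auto ns = std::make_unique<Context>();
    ns->name = name;
    ns->is_inline = true;
    tu.namespaces.push_back(std::move(ns));
    return tu.namespaces.back().get();
}

void shadow(Context& tu, Decl& d, unsigned transaction_id) {
    Context* ns = get_or_create_shadow_ns(tu, transaction_id);  // (i)
    tu.decls.erase(std::remove(tu.decls.begin(), tu.decls.end(), &d),
                   tu.decls.end());                             // (ii)
    d.context = ns;                                             // (iii)
    ns->decls.push_back(&d);                                    // (iv)
}
```

The sketch leaves out everything that makes the real transformation delicate, in particular the out-of-line member function and template instantiation cases discussed above.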
5.1.2 Adjusting the translation unit lookup table
The required patching of the translation-unit lookup table is performed by the invalidatePreviousDefinitions(Decl *D) function. Provided that D is a definition, this function hides from Sema lookup any previous definition of the same entity. Note that, while unqualified lookup will only return the latest definition, shadowed declarations are still reachable via qualified lookup, e.g. __cling_N50::decl.
This function checks whether the given declaration is a wrapper function generated by Cling, in which case we iterate through all local declarations (that will be moved by DeclExtractor), invalidating any previous global definition.
invalidatePreviousDefinitions(NamedDecl *D) handles the invalidation of any previous definition of a named declaration. In general, we look up the given name in the translation unit and iterate through the results, skipping over non-definitions. Candidates for removal are checked for function/template overload using the Sema::IsOverload() function, and if so they are kept. Otherwise, we remove the declaration from the StoredDeclsList (lookup table) of the translation unit. As a special case, because unscoped enumerations “leak” enumerator names to the enclosing scope, we also invalidate any previous definition of the enumerators.
Also, because some Cling extensions cache information about declarations, e.g. TCling, we registered an interpreter callback that provides notification when a definition has been shadowed. Therefore, the new DefinitionShadowed callback may be used in that case to erase cached information.
5.1.3 Modifications to declaration extraction
The implementation required minor changes to the DeclExtractor transformer, so as to properly move declarations to the enclosing scope. The unmodified DeclExtractor incorrectly assumed that this scope is always the translation unit. However, if DefinitionShadower is enabled, the wrapper function has been moved to an inline namespace and declarations should be extracted into it, as can be seen in Figure 4.
Figure 4. New DeclExtractor behavior
`-NamespaceDecl __cling_N50 inline
  |-
  `-FunctionDecl __cling_Un1Qu30 'void (void *)'
    |-ParmVarDecl vpClingValue 'void *'
    `-CompoundStmt
      |-DeclStmt
      | `-VarDecl i 'int' cinit
      |   `-IntegerLiteral 'int' 0
This fact also implies that the DefinitionShadower trans-
former should always run before DeclExtractor.
5.2 Enabling/disabling the new transformation
If registered, the AST transformer may be turned on/off for the next input line by means of the EnableShadowing compilation option. Compilation options control several aspects of Cling, such as the optimization level or the toggling of a feature, e.g. declaration extraction, invalid memory reference protection, etc.
EnableShadowing is set to 0 if Cling raw input is enabled. Otherwise, if EnableShadowing equals 1, valid named top-level declarations shall be transformed except in the following cases:
Not typed in the Cling prompt. Shadowing is enabled only for declarations that were parsed from an input line, and is therefore disabled for #include’d files; otherwise, it might break system header files. Because Cling stores input lines in a virtual file with overridden contents, they may be easily recognized based on their source location.
Is a UsingDirectiveDecl/UsingDecl. As discussed in Section 4.1, using-directives and using-declarations should not be transformed.
Is a NamespaceDecl. Shadowing namespace members is currently not supported.
Is a function template instantiation. Cling copies input lines into a distinct virtual file and starts parsing it. Consequently, at end of file, ASTConsumer::HandleTranslationUnit() emits pending template instantiations. These instantiations are fed through AST transformers as top-level declarations, and should be ignored by DefinitionShadower.
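The conditions above amount to a filter that decides whether a top-level declaration is moved into a shadow namespace. A minimal sketch of such a predicate follows; the Kind enumeration, the DeclInfo and Options structs and the should_shadow() function are hypothetical and do not mirror Cling’s actual data structures.

```cpp
// Toy model of the filter; every name here is hypothetical.
enum class Kind { Variable, Function, Class, UsingDirective, UsingDeclaration, Namespace };

struct DeclInfo {
    Kind kind;
    bool from_prompt;                // parsed from an input line, not an #include'd file
    bool is_template_instantiation;  // emitted at end of the virtual file
};

struct Options {
    bool raw_input;         // raw input mode forces shadowing off
    bool enable_shadowing;  // models the EnableShadowing option
};

bool should_shadow(const DeclInfo& d, const Options& o) {
    if (o.raw_input || !o.enable_shadowing) return false;
    if (!d.from_prompt) return false;
    if (d.kind == Kind::UsingDirective || d.kind == Kind::UsingDeclaration) return false;
    if (d.kind == Kind::Namespace) return false;
    if (d.is_template_instantiation) return false;
    return true;
}
```

Ordering the checks from the cheapest global condition to the most specific one is a design choice of this sketch, not something stated by the implementation.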
5.3 Other minor changes
Cling is able to pretty-print the type and value of an expression. This behavior is automatically turned on if an input line is not terminated by a semicolon. However, nesting type declarations into a namespace changes the qualified name of the type, which affects how it is printed.
Because the proposed transformation moves most top-level NamedDecl nodes into a namespace, their fully qualified name changes w.r.t. the original typename as seen by the user. Take the input “class MyClass { ... } X” as an example. As shown in Figure 5.b, X is pretty-printed by Cling as “(class __cling_N50::MyClass &) @0x7f0...” after enabling DefinitionShadower.
As can be seen, the typename shown in the output changes w.r.t. Figure 5.a. The issue is fixed by setting the PrintingPolicy flag SuppressUnwrittenScope = 1 in ValuePrinter.cpp. This flag specifies whether to print parts of qualified names that are not required to be written, e.g. inline/anonymous namespaces.
Figure 5. Cling pretty-print for “class MyClass {} X”
(a) Original Cling ValuePrinter output
root [0] class MyClass {} X
(class MyClass &) @0x7fb4deb45008
(b) DefinitionShadower enabled
root [0] class MyClass {} X
(class __cling_N50::MyClass &) @0x7f0f2da63008
(c) DefinitionShadower enabled and fixed ValuePrinter
root [0] class MyClass {} X
(class MyClass &) @0x7fdf6baac008
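The effect of suppressing unwritten scopes on a printed name can be illustrated with a small string-level sketch. This is not how Clang implements the PrintingPolicy flag; the suppress_unwritten_scope() function and its set-based interface are assumptions made purely to show the difference between panels (b) and (c).

```cpp
#include <set>
#include <string>

// Toy illustration only: drops every qualifier listed in 'unwritten'
// (e.g. synthesized inline namespaces) from a "::"-separated name.
std::string suppress_unwritten_scope(const std::string& qualified,
                                     const std::set<std::string>& unwritten) {
    std::string result;
    std::size_t start = 0;
    while (start <= qualified.size()) {
        std::size_t end = qualified.find("::", start);
        if (end == std::string::npos) end = qualified.size();
        const std::string part = qualified.substr(start, end - start);
        if (unwritten.count(part) == 0) {
            if (!result.empty()) result += "::";
            result += part;
        }
        start = end + 2;  // skip past the "::" separator
    }
    return result;
}
```

Applied to the name from panel (b) with __cling_N50 marked as unwritten, the sketch yields the name shown in panel (c), while names without synthesized qualifiers pass through unchanged.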
5.4 Limitations
While the current implementation closes the behavioral gap between the Cling C++ interpreter and other interpreted languages, e.g. Python, it has some known limitations that restrict its use, namely:
Shadowing a global object does not free storage. In C++, an l-value is an object that has a memory location, e.g. a variable, and therefore it may appear on the left-hand side of an assignment expression. A shadowed l-value cannot be found via unqualified lookup, but the memory it was referring to is still allocated. Furthermore, these objects can be referenced using their qualified name.
Changes in RTTI type information. Run-Time Type Identification (RTTI) is a C++ mechanism for type introspection. As discussed in the previous section, nesting type declarations into a namespace changes the qualified name of types, which might be a problem for applications relying heavily on RTTI. Fixing this issue requires additional patches to the compiler.
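Both limitations can be observed in plain compiled C++ by writing the nested form by hand. In the sketch below, the namespace names N50 and N51 and the helper functions are hypothetical; the fragment only shows that two same-named types in different inline namespaces are distinct for RTTI, and that both global objects keep their own storage.

```cpp
#include <typeinfo>

// Hand-written equivalent of defining 'MyClass' and 'x' twice at the prompt.
inline namespace N50 { struct MyClass { int v = 1; }; MyClass x; }
inline namespace N51 { struct MyClass { int v = 2; }; MyClass x; }

// The two classes are unrelated types as far as RTTI is concerned.
bool same_rtti() { return typeid(N50::MyClass) == typeid(N51::MyClass); }

// The older object is still alive and reachable by qualified name.
int old_value() { return N50::x.v; }
int new_value() { return N51::x.v; }

// Each definition occupies its own storage.
bool distinct_storage() {
    return static_cast<const void*>(&N50::x) != static_cast<const void*>(&N51::x);
}
```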
6 Validation
This section presents an analysis of the behaviour of Cling’s DefinitionShadower to validate the correctness of the proposed entity redefinition technique. To do so, we provide a close-up view of the resulting AST and lookup table state for a set of examples that covers most of the recurrent uses of interpreted C++.
Table 2 shows the step-by-step sequential execution of interpreted code in Cling, including the transformed AST and the lookup table state.
In the first line, an integer variable (int i) is declared. This declaration is wrapped in a function named __cling_Un1Qu30. Then the AST is generated, and both the DefinitionShadower and DeclExtractor transformers are executed. First, DefinitionShadower transforms the AST by nesting the function into the __cling_N50 inline namespace. Then, DeclExtractor extracts the declaration out of the wrapper function, yielding the AST shown in Table 2. Given that this is the first declaration named i, the TU lookup table is not modified.
In the second line, a variable with the same name but a different type (double i) is declared. As before, the declaration is wrapped in a function, named __cling_Un1Qu31. After the AST is created, DefinitionShadower nests the function into the __cling_N51 inline namespace, and also removes the previous entry for the given named declaration from the TU lookup table. Finally, DeclExtractor extracts the declaration out of the wrapper function.
Lines three and four declare two different functions with the same name. However, since their parameters differ, they are considered function overloads. These input lines do not need to be wrapped. However, both are nested into inline namespaces (__cling_N52 and __cling_N53, respectively) by DefinitionShadower. In this case, DeclExtractor does not do anything and the TU lookup table is not modified.
Line five declares a function with the same name and parameters as the one on line four. As before, this declaration does not require a wrapper. Again, DefinitionShadower nests the declaration into a namespace, __cling_N54. This transformer also removes the previous entry with the same name from the TU lookup table.
Lines six through nine declare a templated structure with one member
(struct S), and an instance of S<int> with the same name as the
functions presented on lines three, four and five. In this case, only
the variable declaration is wrapped into a function, named
__cling_Un1Qu33 in Table 2. DefinitionShadower nests the templated
structure, along with its specializations, into an inline namespace
(__cling_N56). The function wrapping the variable declaration is
nested into a different inline namespace (__cling_N57). This
transformer also removes all previous entries in the TU lookup table
that have the given name. Finally, DeclExtractor extracts the
declaration out of the wrapper function.
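The bookkeeping described for lines one through nine can be summarised with a toy model. This is not Cling's implementation, which adjusts the real TU lookup table; `ToyLookupTable`, `Entry`, `shadows` and `replay_lines_1_to_5` are hypothetical names, and the shadowing rule is an assumption inferred from the walkthrough: functions with different parameter lists coexist as overloads, while any other declaration with the same name replaces the older entries.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// One entry of the toy lookup table.
struct Entry {
    std::string type;    // e.g. "double", "char (int)"
    bool is_function;    // true for function declarations
    std::string params;  // parameter list; only meaningful for functions
};

class ToyLookupTable {
public:
    // Registers a declaration, dropping the entries it shadows.
    void declare(const std::string& name, const Entry& incoming) {
        std::vector<Entry>& entries = table_[name];
        entries.erase(
            std::remove_if(entries.begin(), entries.end(),
                           [&](const Entry& old) {
                               return shadows(incoming, old);
                           }),
            entries.end());
        entries.push_back(incoming);
    }

    // Returns the types currently visible under `name`, oldest first.
    std::vector<std::string> visible_types(const std::string& name) const {
        std::vector<std::string> types;
        auto it = table_.find(name);
        if (it != table_.end())
            for (const Entry& e : it->second) types.push_back(e.type);
        return types;
    }

private:
    // Two functions with different parameter lists are overloads and
    // coexist; any other pair with the same name is a redefinition.
    static bool shadows(const Entry& incoming, const Entry& old) {
        bool overload = incoming.is_function && old.is_function &&
                        incoming.params != old.params;
        return !overload;
    }

    std::map<std::string, std::vector<Entry>> table_;
};

// Replays input lines one to five of Table 2 against the toy table.
ToyLookupTable replay_lines_1_to_5() {
    ToyLookupTable t;
    t.declare("i", {"int", false, ""});
    t.declare("i", {"double", false, ""});
    t.declare("f", {"char (int)", true, "int"});
    t.declare("f", {"int ()", true, ""});
    t.declare("f", {"double ()", true, ""});
    return t;
}
```

After the replay, `i` resolves only to the `double` definition and `f` to the `char (int)` and `double ()` overloads. Declaring a variable named `f` afterwards, as line nine does, removes both function entries in this model.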
Lines ten through thirteen replace the templated structure introduced
in lines six and seven. Also, we declare an instance of this
structure (S<double> g). As before, the variable declaration requires
a wrapper function, __cling_Un1Qu34. DefinitionShadower nests the
templated structure, along with its specializations, into the
__cling_N58 namespace. On the other hand, the variable declaration is
nested into a different inline namespace (__cling_N59). The
transformer also removes the entry for the previous declaration of
the structure from the TU lookup table. Finally, DeclExtractor
extracts the variable declaration out of the wrapper function.
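One consequence of this scheme is that an object created from the old template keeps its original type after the template is replaced. The sketch below approximates that in standard C++, with stand-in namespace names and explicit qualification in place of the lookup table adjustment; `old_member` and `new_member` are hypothetical helpers.

```cpp
#include <cassert>
#include <type_traits>

inline namespace cling_N56 { template <typename T> struct S { T i; }; }
inline namespace cling_N57 { cling_N56::S<int> f{99}; }
inline namespace cling_N58 { template <typename T> struct S { T i, j; }; }
inline namespace cling_N59 { cling_N58::S<double> g{0, 33.0}; }

// Same template name and argument, yet two unrelated types.
static_assert(!std::is_same<cling_N56::S<int>, cling_N58::S<int>>::value,
              "old and new S<int> are distinct types");

// `f` still has the one-member layout it was created with, while `g`
// uses the two-member replacement.
int old_member() { return f.i; }
double new_member() { return g.j; }
```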
Finally, in lines fourteen through seventeen, we introduce a using
directive (using namespace std) and the NS namespace. Neither of
these declarations has to be wrapped into a function. Also,
DefinitionShadower does not modify the AST because both using
directives and user-defined namespaces are considered exceptions. In
this case, DeclExtractor does not do anything and the TU lookup table
is not modified.
As can be seen, this proposal improves the user experience of using
interpreted C++ for fast application prototyping, while the code can
still be reused for the high-performance compiled version. Moreover,
in a Jupyter notebook environment the user is allowed to edit
existing cells and change type/function definitions.
Finally, for the sake of completeness, we have also evaluated the
overhead caused by the transformations performed by the
DefinitionShadower. To do so, we have compared the run time (JIT
compilation and code execution) of the same test program, both
enabling and disabling entity redefinition. This test program
comprises a varying number of top-level declarations of different
kinds (function, class or variable), ranging from 128 to 16384. Note
that, when entity redefinition is enabled, all the declarations are
given the same name. In order to obtain the overhead, we performed
multiple executions and measured the average run time. All the
executions were run on a platform comprising 24 × Intel(R) Xeon(R)
CPU E5-2695 v2 running at 2.40 GHz, and 128 GB of RAM.
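A test program of the kind described above could be generated along the following lines. This is a sketch under stated assumptions, not the benchmark used for the measurements: the function `make_test_program`, the `Kind` enumeration, the declaration bodies and the entity name `e` are all hypothetical, and the use of distinct names (`e0`, `e1`, ...) when redefinition is disabled is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

enum class Kind { Function, Class, Variable };

// Builds the source text of a test program with `count` top-level
// declarations. With `same_name` set, every declaration reuses one
// name, which is only accepted when entity redefinition is enabled.
std::string make_test_program(Kind kind, std::size_t count, bool same_name) {
    std::ostringstream out;
    for (std::size_t n = 0; n < count; ++n) {
        std::string name = "e";
        if (!same_name) name += std::to_string(n);
        switch (kind) {
        case Kind::Function:
            out << "int " << name << "() { return 0; }\n";
            break;
        case Kind::Class:
            out << "struct " << name << " {};\n";
            break;
        case Kind::Variable:
            out << "int " << name << " = 0;\n";
            break;
        }
    }
    return out.str();
}
```

The generated text would then be piped through the interpreter once per configuration, with `count` taking the values 128, 256, and so on up to 16384.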
CC’20, February 22–26, 2019, San Diego, CA, USA J. López-Gómez, J. Fernández, D. del Rio Astorga, V. Vassilev, et al.
Table 2. Step-by-step execution of interpreted code

Each step lists the input code, the transformed AST, and the state of
the TU lookup table after the step. Entries marked (removed) have been
dropped from the TU lookup table by DefinitionShadower.

Code:
    1  int i = 1;
Transformed AST:
    |-NamespaceDecl __cling_N50 inline
    | |-VarDecl used i 'int' cinit
    | | `-IntegerLiteral 'int' 1
    | `-FunctionDecl __cling_Un1Qu30 'void (void *)'
TU lookup table:
    Name   Type                    Value
    i      int                     1

Code:
    2  double i = 3.141592;
Transformed AST:
    |-NamespaceDecl __cling_N51 inline
    | |-VarDecl used i 'double' cinit
    | | `-FloatingLiteral 'double' 3.141592e+00
    | `-FunctionDecl __cling_Un1Qu31 'void (void *)'
TU lookup table:
    Name   Type                    Value
    i      int                     1          (removed)
    i      double                  3.141592

Code:
    3  char f(int x) { return 'X'; }
    4  int f() { return 0; }
Transformed AST:
    |-NamespaceDecl __cling_N52 inline
    | `-FunctionDecl f 'char (int)'
    |   |-ParmVarDecl x 'int'
    |   `-CompoundStmt
    |     `-ReturnStmt
    |       `-CharacterLiteral 'char' 88
    |-NamespaceDecl __cling_N53 inline
    | `-FunctionDecl f 'int (void)'
    |   `-CompoundStmt
    |     `-ReturnStmt
    |       `-IntegerLiteral 'int' 0
TU lookup table:
    Name   Type                    Value
    i      double                  3.141592
    f      char (int)
    f      int ()

Code:
    5  double f() { return 1.0; }
Transformed AST:
    |-NamespaceDecl __cling_N54 inline
    | `-FunctionDecl f 'double (void)'
    |   `-CompoundStmt
    |     `-ReturnStmt
    |       `-FloatingLiteral 'double' 1.000000e+00
TU lookup table:
    Name   Type                    Value
    i      double                  3.141592
    f      char (int)
    f      int ()                             (removed)
    f      double ()

Code:
    6  template <typename T>
    7  struct S { T i; };
    8
    9  S<int> f{99};
Transformed AST:
    |-NamespaceDecl __cling_N56 inline
    | `-ClassTemplateDecl S
    |   |-TemplateTypeParmDecl typename depth 0 index 0 T
    |   |-CXXRecordDecl struct S definition
    |   | `-FieldDecl i 'T'
    |   `-ClassTemplateSpecializationDecl struct S
    |     |-TemplateArgument type 'int'
    |     `-FieldDecl i 'int':'int'
    |-NamespaceDecl __cling_N57 inline
    | |-VarDecl f 'S<int>':'__cling_N56::S<int>'
    | | `-InitListExpr 'S<int>':'__cling_N56::S<int>'
    | |   `-IntegerLiteral 'int' 99
    | `-FunctionDecl __cling_Un1Qu33 'void (void *)'
TU lookup table:
    Name   Type                    Value
    i      double                  3.141592
    f      char (int)                         (removed)
    f      double ()                          (removed)
    S<T>   __cling_N56::S<T>
    f      __cling_N56::S<int>     {99}

Code:
    10  template <typename T>
    11  struct S { T i, j; };
    12
    13  S<double> g{0, 33.0}
Transformed AST:
    |-NamespaceDecl __cling_N58 inline
    | `-ClassTemplateDecl S
    |   |-TemplateTypeParmDecl typename depth 0 index 0 T
    |   |-CXXRecordDecl struct S definition
    |   | |-FieldDecl i 'T'
    |   | `-FieldDecl j 'T'
    |   `-ClassTemplateSpecializationDecl struct S
    |     |-TemplateArgument type 'double'
    |     |-FieldDecl i 'double':'double'
    |     `-FieldDecl j 'double':'double'
    |-NamespaceDecl __cling_N59 inline
    | |-VarDecl g 'S<double>':'__cling_N58::S<double>'
    | | `-InitListExpr 'S<double>':'__cling_N58::S<double>'
    | |   |-FloatingLiteral 'double' 0.000000e+00
    | |   `-FloatingLiteral 'double' 3.300000e+01
    | `-FunctionDecl __cling_Un1Qu34 'void (void *)'
TU lookup table:
    Name   Type                    Value
    i      double                  3.141592
    S<T>   __cling_N56::S<T>                  (removed)
    f      __cling_N56::S<int>     {99}
    S<T>   __cling_N58::S<T>
    g      __cling_N58::S<double>  {0,33.0}

Code:
    14  using namespace std;
    15  namespace NS {
    16    string s("Cling");
    17  }
Transformed AST:
    |-UsingDirectiveDecl Namespace 'std'
    |-NamespaceDecl NS
    | `-VarDecl s 'std::string':'std::basic_string<char>'
    |   `-ExprWithCleanups
    |     `-CXXConstructExpr
    |       |-ImplicitCastExpr 'const char *'
    |       | `-StringLiteral 'const char [6]' lvalue "Cling"
    |       `-CXXDefaultArgExpr
TU lookup table:
    Name   Type                    Value
    i      double                  3.141592
    f      __cling_N56::S<int>     {99}
    S<T>   __cling_N58::S<T>
    g      __cling_N58::S<double>  {0,33.0}
    NS     [namespace]
Relaxing the One Definition Rule in Interpreted C++ CC’20, February 22–26, 2019, San Diego, CA, USA
Figure 6. Cling run time plots (both shadowing enabled/disabled):
(a) Function (re-)definition, (b) Class (re-)definition,
(c) Variable (re-)definition.
[Plots not reproduced. In each panel, one axis is marked from 128 to
16384 (the number of declarations) and the other from 1 to 40. Panels
(a) and (b) show the series "No shadow" and "Shadow"; panel (c) also
shows "No shadow, NORT" and "Shadow, NORT".]
As seen in Figures 6.a and 6.b, the run time behavior is similar for
both the function and class definition tests, yielding a time that
grows with the number of declarations. Comparing the original Cling
implementation (No shadow) with our proposal (Shadow), we can
conclude that the “Shadow” version incurs a non-linear overhead in
the range of 4% to 52%. One possible explanation for this variability
is that different LLVM/Clang data structure optimizations are applied
depending on the entry size.
However, the test performed for variables (see Figure 6.c), while
still growing with the number of declarations, exhibits a much higher
run time, with a smaller overhead ranging from 2% to 13% for the
“Shadow” version. This is due to the fact that wrapper functions
generated around variable declarations call a Cling function that
updates the internal state of the interpreter. Cling includes the
-noruntime command line option that, among other things, disables
this behavior. As shown in Figure 6.c, the run time using this option
is comparable to the other two cases. However, the overhead is much
smaller, in the range of −28% to 15%, with the “Shadow” version being
faster in some cases. Again, this variability might be caused by
different optimizations in LLVM/Clang data structures.
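The overhead figures above are percentages relative to the original implementation. A minimal sketch of one way to compute them, assuming that the overhead is the relative difference between the average run times of the two configurations (the text does not spell out the formula, and the function names `mean` and `overhead_percent` are hypothetical):

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Average run time over several executions; `samples` must be non-empty.
double mean(const std::vector<double>& samples) {
    return std::accumulate(samples.begin(), samples.end(), 0.0) /
           static_cast<double>(samples.size());
}

// Relative overhead of "Shadow" with respect to "No shadow", in percent.
// A negative result means the "Shadow" version was faster.
double overhead_percent(double no_shadow, double shadow) {
    return 100.0 * (shadow - no_shadow) / no_shadow;
}
```

Under this definition, a negative value such as the −28% reported for the -noruntime case simply means that the average “Shadow” run time was below the “No shadow” one.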
In light of the results, we can conclude that the proposed technique
allows interpreted C++ to behave more like an interpreted language,
with moderate overheads.
7 Conclusion and future work
Interpreted languages have proven to be a good solution for fast
prototyping in agile development methodologies, and for closing the
gap between domain experts and application development. Since
interpreted languages incur extra overhead at runtime, in some cases
it is necessary to generate a compiled version of the application. To
pave the way, multiple implementations of C and C++ interpreters have
been developed. Nonetheless, these languages present some inherent
limitations due to their compiled nature. In this paper, we present a
technique to support entity redefinition in C++ interpreters.
Relaxing the ODR aids rapid prototyping while keeping the C++
language relatively sound and consistent for the particular use case.
To validate the proposed technique, we have implemented the
DefinitionShadower AST transformer to support entity redefinition in
Cling with moderate overheads. As observed through the validation,
entities can be given a new definition similarly to other interpreted
languages. It is important to remark that the presented Cling
implementation is part of the ROOT master branch, and will be used by
the domain experts at CERN by the end of 2019.
As future work, we plan to address limitations of the cur-
rent implementation, including allocated storage issues, and
RTTI. Also, we intend to add more features to Cling, such as
the generation of debugging information for JIT’ed code.
Acknowledgments
This work has been partially funded by the Spanish Ministry of
Economy and Competitiveness through Project Grant TIN2016-79637-P
(BigHPC: Towards Unification of HPC and Big Data Paradigms).
A Artifact Appendix
A.1 Abstract
The artifact contains a pre-compiled Cling C++ interpreter in its
latest GitHub revision (595580b), and the required shell scripts for
evaluation. The provided files perform all the checks included in
Table 2 and Figure 6, in addition to more advanced tests, e.g.
shadowing typedefs, SFINAE support, etc. To evaluate the artifact,
run the scripts and check the results.
A.2 Artifact check-list (meta-information)
• Program: C++ code; included
• Compilation: provided artifact compiled using gcc-9.2.0; minimum
  required gcc ≥ 4.8.5
• Run-time environment: artifact provides ArchLinux
  2019.12.01-x86_64; works in any of the platforms supported by the
  ROOT project.
• Hardware: precompiled version is built for x86_64; if building from
  sources: any LLVM supported architecture.
• Output: Runtime or test pass/failure
• How much time is needed to complete experiments (approximately)?:
  5–7 minutes
• Publicly available?: Yes
• Code licenses (if publicly available)?: Cling Release License,
  UI/NCSAOSL, LGPL.
• Archived (provide DOI)?: 10.5281/zenodo.3579301
  (DOI badge: 10.5281/zenodo.3579302)
A.3 Description
A.3.1 How delivered
Cling is open sourced under the UI/NCSAOSL and LGPL licenses, and is
hosted on GitHub at https://github.com/root-project/cling/. The
repository contains source code, documentation and build
instructions. The provided VM image contains a pre-compiled Cling
version from the GitHub repository at revision 595580b, along with
all the required dependencies.
A.3.2 Software dependencies
Cling requires gcc ≥ 4.8.5, LLVM, Clang, cmake and a number of other
dependencies. Required packages may be checked and installed using
the Cling Packaging Tool (CPT).
A.4 Installation
For convenience, we provide a VM image in OVA format that contains
all the required dependencies and may be downloaded at:
https://doi.org/10.5281/zenodo.3579301. Users of this VM should skip
to Section A.5.
Alternatively, Cling may be compiled as specified in the GitHub
project page. To build Cling, run the commands below:
$ wget https://raw.githubusercontent.com/root-project/cling/master/tools/packaging/cpt.py
$ chmod +x cpt.py
$ ./cpt.py --check-requirements && ./cpt.py --create-dev-env Release --with-workdir=./cling-build/
$ export PATH=./cling-build/builddir/bin:$PATH
Note that building Cling requires at least 3 GiB of disk space.
A.5 Experiment workflow
The test-xxx.sh shell scripts automate the evaluation process. These
scripts are described below:
test-noshadow.sh. Pipes the INTERPRETME.C file through Cling, with
definition shadowing disabled. This file makes heavy use of the
technique described in the paper, so its execution will fail. The
compiler is expected to produce “error: redefinition of . . . ”
diagnostics.
test-shadow.sh. Pipes the INTERPRETME.C file through Cling, with
definition shadowing enabled; its execution is expected to be
successful. Compare the output against the //CHECK: comments. Some of
these tests were also discussed in Table 2 in the paper.
test-perf.sh. Runs performance tests and generates plots using
gnuplot. These may vary w.r.t. those in Figure 6, depending on your
hardware. Interference should be avoided while the script is running.
To proceed with evaluation, run the scripts as follows:
$ ./test-noshadow.sh
$ ./test-shadow.sh
$ ./test-perf.sh out.pdf
$ evince out.pdf
A.6 Evaluation and expected result
For test-noshadow.sh, the interpreter outputs “error: redefinition
of . . . ” diagnostics. For test-shadow.sh, the interpreter generates
no diagnostics and the output should match what is specified in the
//CHECK: comments in INTERPRETME.C.
For test-perf.sh, run time plots are written to a PDF file. The
results are hardware-dependent; however, the overall shape of the
plots should be preserved.
A.7 Experiment customization
The INTERPRETME.C file may be modified to run other user-defined
tests. Alternatively, C++ code may be typed in a Cling interactive
session, e.g.
$ cling
[cling]$ // type C++ code here
Note that the proposed transformation is automatically enabled for
the ROOT project, but it is disabled by default in Cling stand-alone.
However, it may be enabled by typing the following lines:
[cling]$ #include "cling/Interpreter/Interpreter.h"
[cling]$ cling::runtime::gCling->allowRedefinition();
A.8 Notes
More information about Cling and the ROOT project is available at
https://github.com/root-project/cling/ and
https://github.com/root-project/root/, respectively.