Locomotor: Transparent Migration of Client-Side Database Code
Michael J. Mior
University of Waterloo
200 University Avenue West
Waterloo, Ontario N2L 3G1
mmior@uwaterloo.ca
ABSTRACT
Server-side execution is a well-known method for improving the performance of database applications. Running code on the database server eliminates round trips to the client application, resulting in significantly reduced latency. However, the common approach of explicitly writing server-side code in stored procedures has significant drawbacks. Application developers must develop and maintain code in two separate languages and manually partition code between the client and server. Code shipping is a viable alternative but still requires an explicit specification of which code can be run on the server. We propose a hybrid shipping approach based on static analysis which automatically partitions client code and only ships code to the server which is likely to improve performance.

We demonstrate the viability of this approach in a prototype system which we call Locomotor. Locomotor operates on Python applications using the Redis key-value store. Through static analysis, it identifies fragments of code which can benefit from being executed on the server and automatically performs translation to execute the fragments on the server. Unlike some previous systems, Locomotor is not pattern-based and is able to ship a wide variety of code. By shipping code to the server, Locomotor is able to achieve significant performance gains over client-side execution with no modifications to the application code.
CCS CONCEPTS
• Information systems → Database query processing; • Software and its engineering → Scripting languages;
ACM Reference format:
Michael J. Mior. 2017. Locomotor: Transparent Migration of Client-Side
Database Code. In Proceedings of DBPL 2017, Munich, Germany, September 1,
2017, 5 pages.
DOI: 10.1145/3122831.3122840
1 INTRODUCTION
It is common for databases to allow the creation of parameterized
stored procedures which execute on the database server. These
stored procedures are generally more efficient than equivalent
client-side code since the use of stored procedures can eliminate
multiple network round trips to the database server. Some systems,
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
DBPL 2017, Munich, Germany
©2017 ACM. 978-1-4503-5354-0/17/09. . . $15.00
DOI: 10.1145/3122831.3122840
def incr_ex(redis, key):
    if redis.exists(key):
        return redis.incr(key)
    else:
        return None
Figure 1: Python code to conditionally increment in Redis
such as VoltDB [12], rely on the use of stored procedures to achieve high performance. However, stored procedures can be more difficult to write and manage than client code. Stored procedures may be written in a different language than the application and it is necessary to deploy and maintain stored procedures on the database server as the code changes.
One example of these challenges is present in the Redis key-value
store. Applications accessing Redis do so via a programmatic API
using libraries provided in the source language of the application
(e.g. Python). For example, Figure 1 shows Python code that increments the value of a key in Redis only if the key exists. This code requires two round trips to the server: first to check if the key exists (EXISTS), and second to increment the key (INCR). The remainder of this work explores this problem using Redis and Python, but a similar solution would be possible with other scripting languages, such as Ruby and JavaScript, and other datastores supporting server-side scripting, such as MongoDB.
Redis has a mechanism to enable scripts written in Lua to be
executed on the server. As shown in Figure 2, writing a server-side
script can reduce the latency of application code involving multiple
operations. Lua is a popular embedded scripting language, but it is
rarely used in application development. Therefore, it is likely that
code interacting with the database will be written in the source
language of the application. Conversion to a server-side script is typically only considered in the case of performance issues since it requires introducing an additional language as well as the burden of maintaining the code on the server.
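To make the translation concrete, the conditional increment from Figure 1 can be expressed as a single Lua script invoked in one EVAL round trip, in the style of Redis's scripting conventions (KEYS, redis.call). A real invocation would go through a Redis client library; the FakeRedis class below is a hypothetical stand-in that mimics the script's effect so the sketch is self-contained:

```python
# The conditional increment from Figure 1 as one server-side Lua script.
INCR_EX_LUA = """
if redis.call('EXISTS', KEYS[1]) == 1 then
    return redis.call('INCR', KEYS[1])
else
    return nil
end
"""

class FakeRedis:
    """Illustrative stand-in that evaluates the script's logic in Python."""
    def __init__(self):
        self.store = {}

    def eval(self, script, numkeys, *keys):
        key = keys[0]
        if key in self.store:      # EXISTS
            self.store[key] += 1   # INCR
            return self.store[key]
        return None                # key absent: no increment

r = FakeRedis()
r.store['hits'] = 41
print(r.eval(INCR_EX_LUA, 1, 'hits'))   # 42, in one round trip instead of two
```

The key point is the call pattern: the EXISTS/INCR dependency executes entirely on the server, so the client pays one network round trip rather than two.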
Our goal in this work is to automate this approach of translating
client application code. We have developed a prototype system we
call Locomotor which performs the automation via static analysis
and code shipping. Section 2 provides an overview of our code analysis techniques. Section 3 then discusses the benefits of partitioning code between the client and the server. We give the details of our approach to code shipping in Section 4 and evaluate the effectiveness of our approach in Section 5. Finally, we examine some related work in Section 6 and conclude in Section 7.
DBPL 2017, September 1, 2017, Munich, Germany Michael J. Mior
[Figure 2: two sequence diagrams between a client and Redis. (a) Client-driven: separate EXISTS and INCR round trips. (b) Stored procedure: a single EVAL call.]
Figure 2: Sequence diagrams for client-driven and scripted conditional increment
2 CLIENT CODE ANALYSIS
We have two tasks to perform when analyzing database client application code. The first is to identify where database access is occurring. This is necessary in order to determine where we need to insert code which calls native server methods inside the stored procedure. Second, we perform data flow analysis to determine what data needs to be passed to/from a block of code which will be translated into a stored procedure.
2.1 Identifying Database Access
We take a simplistic approach to identifying code which accesses the
database. First, we consider the scope for the potential optimizations
as a Python function. Second, we assume that application code uses
simple objects as the only interface to the backend database. Since
this form of encapsulation is good software engineering practice, we expect this assumption to hold in practice. With additional effort, it should be possible to automate the identification of database access throughout the code instead of limiting the scope to single functions.
With Redis as our target database, we assume that database access
objects have methods which are named after Redis commands.
Locomotor currently searches for these method calls to identify
database access objects. It is also possible to explicitly annotate
database access objects. Although this approach is simplistic, it is
easy to implement and can be easily replaced if an application uses
another database interface.
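The detection step described above can be sketched with Python's ast module: walk a function's AST and record any receiver whose method name matches a Redis command. REDIS_COMMANDS below is a small illustrative subset, not Locomotor's actual command table:

```python
# Sketch: identify database access objects by looking for method calls
# named after Redis commands, as described above.
import ast

REDIS_COMMANDS = {'get', 'set', 'exists', 'incr', 'hget', 'lpush'}

def find_db_objects(source):
    """Return (receiver names, (receiver, method, lineno) call records)."""
    objs, calls = set(), []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr in REDIS_COMMANDS
                and isinstance(node.func.value, ast.Name)):
            objs.add(node.func.value.id)
            calls.append((node.func.value.id, node.func.attr, node.lineno))
    return objs, calls

src = """
def incr_ex(redis, key):
    if redis.exists(key):
        return redis.incr(key)
    return None
"""
objs, calls = find_db_objects(src)
print(objs)                             # {'redis'}
print(sorted(c[1] for c in calls))      # ['exists', 'incr']
```

As the section notes, this lookup-by-name scheme is simplistic (a non-database object with a method named `get` would be misidentified), but it is easy to replace with explicit annotations.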
After identifying these objects, we note where method calls on
these objects occur within the application source code. We only
consider shipping code to the server which contains database ac-
cess since there is no benet to shipping code which does not
interact with the server. We further ensure that any database calls
executed on the server have their parameters available by shipping
the relevant data as we further discuss in Section 4.
2.2 Data Flow Analysis
We use a simple form of information flow tracking [10] to identify how data fetched from the database is propagated throughout the application code. It is most beneficial to colocate database calls on the server which have dependencies between their parameters. For example, if a store to the database depends on a previous read then colocating those calls on the server will reduce the number of round trips.
trips. After identifying database objects as in the previous section,
we denote any values returned by calling methods on those objects
as tainted. When those values are assigned to variables or used in
expressions, we mark those values as also tainted. Finally, when a
tainted value is used in a future call on a database object, we can
identify a link between these two calls.
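A minimal version of this taint propagation can be sketched by scanning a function's statements in source order: assignments from database calls taint their targets, and any later database call that consumes a tainted variable is linked to the call that produced it. This simplified sketch only tracks direct assignments, not arbitrary expressions:

```python
import ast

def is_db_call(node, db_objs):
    """True if node is a method call on an identified database object."""
    return (isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id in db_objs)

def find_links(func_source, db_objs):
    """Return (producer, consumer) method-name pairs linked by tainted data."""
    tainted = {}   # variable name -> db method whose result it holds
    links = []
    func = ast.parse(func_source).body[0]
    for stmt in func.body:                 # statements in source order
        # a db call consuming a tainted variable links back to its producer
        for call in (n for n in ast.walk(stmt) if is_db_call(n, db_objs)):
            for n in ast.walk(call):
                if isinstance(n, ast.Name) and n.id in tainted:
                    links.append((tainted[n.id], call.func.attr))
        # assignments from db calls taint their targets
        if isinstance(stmt, ast.Assign) and is_db_call(stmt.value, db_objs):
            for tgt in stmt.targets:
                if isinstance(tgt, ast.Name):
                    tainted[tgt.id] = stmt.value.func.attr
    return links

src = """
def move(r, src_key, dst_key):
    v = r.get(src_key)
    r.set(dst_key, v)
"""
print(find_links(src, {'r'}))   # [('get', 'set')]
```

The `('get', 'set')` link is exactly the read-then-write dependency described above: colocating both calls on the server saves a round trip.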
Alongside this taint analysis, we perform a simple data flow analysis to identify values required and emitted by a particular block of code. For each block of code that we ship to the server, we must ship all data which is used by that block of code. Furthermore, we must ship back any data modified by the code or values which need to be returned from the enclosing function. Locomotor is currently able to send and receive integer and floating point numbers, strings, and lists or associative arrays consisting of these values. We discuss the full details of translation in Section 4.
3 CODE PARTITIONING
A key insight is that it is only necessary to ship code which accesses
the database to the server. This reduces the load on the server and
eliminates the need to add support for arbitrary application libraries
on the server. As we discuss later, we only translate a subset of
application syntax to the server. This simplies the translation pro-
cess while still allowing Locomotor to translate signicant portions
of real-world code. Since applications may mix database and non-
database code within a single method, we perform code partitioning
at the level of source lines. Making use of the information obtained
via analysis of the client code, Locomotor aims to reduce the latency
of application operations involving database code.
We assume that code involving database access is not CPU intensive and that the main source of overhead is network traffic between the application and the database [13]. The potential gains we can achieve from sending the code to the server result from eliminating network overhead between the application and the database server.
However, it is impossible to precisely determine this overhead stat-
ically. The cost of partitioning the code at any point depends on
factors such as the size of the required data which must be trans-
ferred and the number of iterations for loops in the application
code. We make several simplifying assumptions which we expect
to hold in an OLTP setting. Firstly, we assume the amount of data
processed results in small transfer times relative to network latency.
This reduces the problem to determining the estimated minimum
number of round trips to the database. Second, we assume any
iterative code involving database access will execute more quickly
on the server, so we always ship iterative code. Since we assume
data size is small and network round trips dominate the runtime,
we expect this assumption to hold in practice.
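Under these assumptions, the shipping decision reduces to a static estimate of round trips. A sketch of such a heuristic: count calls in a block and always ship when a call sits inside a loop. For brevity the sketch treats every call as a database call; Locomotor would first filter to calls on identified database objects:

```python
import ast

def analyze(node, in_loop=False, stats=None):
    """Count calls and note whether any call appears inside a loop."""
    if stats is None:
        stats = {'calls': 0, 'loop_call': False}
    for child in ast.iter_child_nodes(node):
        child_in_loop = in_loop or isinstance(node, (ast.For, ast.While))
        if isinstance(child, ast.Call):
            stats['calls'] += 1
            if child_in_loop:
                stats['loop_call'] = True
        analyze(child, child_in_loop, stats)
    return stats

def should_ship(source):
    stats = analyze(ast.parse(source))
    # ship iterative database code, or any block needing >1 round trip
    return stats['loop_call'] or stats['calls'] > 1

print(should_ship("r.get('a')"))                      # False: one round trip
print(should_ship("r.get('a')\nr.set('b', 1)"))       # True: two round trips
print(should_ship("for k in keys:\n    r.incr(k)"))   # True: calls in a loop
```

This mirrors the two assumptions above: loop bodies always ship because round trips dominate, and straight-line code ships only when it would otherwise need multiple round trips.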
4 CODE SHIPPING
Our approach to code shipping contains two main steps. First, we
transpile the client application code to the same language used by
server-side stored procedures. Second, we patch the client appli-
cation code to invoke the stored procedure in the location where
the original code was located. We describe the complete process of
code shipping in the following section.
Locomotor: Transparent Migration of Client-Side Database Code DBPL 2017, September 1, 2017, Munich, Germany
4.1 Transpilation
High-level scripting languages such as Python, JavaScript, or Ruby are popular choices for building database applications [8]. Redis, our target database for Locomotor, makes use of the Lua programming language as an embedded scripting engine. Our choice of dynamic scripting languages for both application and server-side code helps simplify transpilation. When performing
transpilation, we operate at the level of the abstract syntax tree
(AST). The similarity of the two languages means that the ASTs of equivalent Python and Lua code are very similar. We can therefore perform transpilation by walking the AST of the Python code we wish to translate while emitting the equivalent Lua code. There are some caveats to this approach resulting from minor differences in the two languages which we discuss in Appendix A. After performing transpilation, Redis provides an interface to register the
script with the server via a string of source code. We discuss in the
following section how we patch the invocation of server-side code
into the client application.
We also need a mechanism for transporting data structures be-
tween the two languages. As discussed in the previous section,
Locomotor allows the use of complex types such as lists and as-
sociative arrays. Since the Redis scripting interface only permits
the transfer of binary strings and numbers, we must serialize these
values during transfer. The Redis scripting engine contains a library
for working with the MessagePack [6] serialization format. MessagePack is an efficient binary format which supports all the data types we choose to work with in Locomotor. This solves the problem of translating complex sequences of nested objects between
the two languages. The Python MessagePack library is invoked in
the client application when values are sent to the server. The Lua
MessagePack library deserializes the data and then serializes any
values which are required by the application code.
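The serialization boundary amounts to a round trip: pack the values a shipped block needs on the client, unpack them on the server, and ship results back the same way. The sketch below uses the standard-library json module purely as a dependency-free stand-in for MessagePack; Locomotor itself uses the MessagePack libraries on both sides:

```python
import json

def pack_for_server(values):
    # Only the types Locomotor supports cross the boundary:
    # numbers, strings, and lists/associative arrays of these.
    return json.dumps(values)

def unpack_on_server(blob):
    return json.loads(blob)

inputs = {'order_ids': [1, 2, 3], 'warehouse': 'W1', 'amount': 9.5}
blob = pack_for_server(inputs)
assert unpack_on_server(blob) == inputs   # lossless round trip
```

The design constraint is the same either way: since the Redis scripting interface only transfers strings and numbers, any structured value must survive an encode/decode round trip, which restricts shippable data to the types listed in Section 2.2.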
4.2 Method Patching
After translating the Python code to the equivalent Lua, we must
modify the application source to call server-side code for that por-
tion of the method. To accomplish this, Locomotor edits the byte-
code of functions to replace the relevant portion with a call to the
server-side method. This call accepts all the relevant values which
must be passed to the server-side code. Upon returning from the server-side code, the Locomotor-patched code checks a flag to see if it should return from the enclosing function. If so, the return
value is unpacked from the server and the function returns. Other-
wise, we unpack all the values which must be stored back to local
variables and continue executing the client-side code.
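The patched call protocol can be sketched as follows. run_on_server is a hypothetical stand-in for the actual EVALSHA invocation, and the return shape (a flag plus either a return value or locals to write back) illustrates the protocol described above rather than Locomotor's exact wire format:

```python
def run_on_server(packed_args):
    # Stand-in for the shipped fragment: here it pretends the server-side
    # code incremented a counter and did not hit a return statement.
    return {'should_return': False,
            'locals': {'count': packed_args['count'] + 1}}

def enclosing_function(count):
    result = run_on_server({'count': count})
    if result['should_return']:
        # the shipped fragment executed a return; propagate its value
        return result['value']
    # otherwise, write shipped-back values into locals and continue
    count = result['locals']['count']
    return count * 2   # remaining client-side portion of the function

print(enclosing_function(5))   # 12
```

The flag is what lets a single shipped fragment cover both cases: code paths that return from the enclosing function and code paths that fall through into the remaining client-side code.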
5 EVALUATION
The goal of our evaluation is to demonstrate that automated ap-
proaches for stored procedure conversion can improve application
performance without requiring runtime information. We focus on the delivery transaction of py-tpcc [11], an implementation of the TPC-C benchmark [9] in Python with support for Redis. Some minor modifications were required to fix bugs discovered in the implementation. To perform a delivery, few parameters are required, but
many keys are accessed on the Redis server. Furthermore, a small
amount of data is returned when the transaction completes. These
properties make this transaction an ideal target for optimization.
Each machine used in the remainder of this section has two six-core Xeon E5-2620 processors operating at 2.10 GHz and 64 GB of memory. A 1 TB SATA drive operating at 7,200 RPM was used for
the Redis data directory. A manually compiled version of Redis
3.2 (unreleased) was used to serve all requests. This version of
Redis was required because previous versions do not allow any
scripts which are nondeterministic. This includes scripts which
write values from a previous read, a restriction which would limit
our potential optimizations.
5.1 Manual Partitioning
To explore the effects of code shipping on performance, we first start with a manual partitioning of the TPC-C delivery transaction. In this partitioning, we ship the entire transaction to the server for execution. To measure the effect on latency, we execute a number of transactions in four different settings: 1) the unmodified Python application, 2) a pure Lua client-side implementation of the same transaction generated by Locomotor, 3) a hand-optimized Lua client-side implementation of the same transaction, 4) the entire transaction shipped to the server as Lua. For each implementation, we execute 100,000 transactions in a single thread and measure the total completion time to see how latency is affected. Results are shown in Figure 3.
We first note that there is significant room for optimization of the automatically translated Lua code. Because both Python and Lua are dynamically typed languages with differing semantics, Locomotor currently requires some runtime checking. We expect that much of this overhead could be eliminated by performing type inference on the client application in Python. The hand-optimized version of the Lua code is approximately 37% faster than the one generated by Locomotor, suggesting that there are indeed significant opportunities for improvement.
Although the Lua implementation of the transaction generated by Locomotor is slower than the original Python application, we still see a 4× reduction in runtime when this code is shipped to the server as a stored procedure. This is because the transaction contains multiple loops that make a number of database calls. When all this code resides on the client, there are a significant number of network round trips between the application and the server. By placing this code on the server as a stored procedure, we greatly reduce the latency of each of these database accesses.
For example, we measured the number of network round trips
between the client and the server for a single instance of the deliv-
ery transaction in the above experiments. With the unpartitioned
Python implementation, 24 round trips were required. Using the
partitioned Python implementation reduced the number of net-
work round trips to 8. The network used in our experiments has
a measured average latency of approximately 0.163 ms. This leads
to an anticipated time savings from reduced round trips of 16.3
seconds. The actual improvement is approximately 14.7 seconds.
We expect the performance penalty is a result of inefficiencies in our Lua script implementation.
[Figure 3: bar chart of execution time in seconds (0–35) for the Python, Lua, Optimized Lua, and Partitioned implementations.]
Figure 3: TPC-C delivery transaction runtime
6 RELATED WORK
While systems exist which perform automated partitioning of ap-
plications between client and server, Locomotor has several key
distinctions. Firstly, existing systems which perform code shipping assume that the client and the server are written in the same language and that the server code can be modified to enable the shipping process. Existing work also makes limited use of the information available from analyzing application code and instead requires significant runtime analysis. We provide a more detailed comparison with related systems below.
There has been significant work in optimizing the performance of applications via automatic partitioning. Coign [7] automatically partitions applications using Microsoft's COM interface. This partitioning relies on profiling application binaries to determine communication patterns. After profiling, components are partitioned using a series of classifiers to decide which to place on the client or server. Unlike Locomotor, the client and server both use the same runtime and code can be moved largely unchanged between the client and server.
Others have examined automatic partitioning for mobile applications. MAUI [5] and CloneCloud [4] aim to optimize performance of mobile device applications by offloading compute-intensive code to more powerful processors on remote servers. Both also use coarse-grained partitioning and require a significant amount of profiling data to make correct decisions.
Another approach to reducing communication overhead is to automatically batch application queries together in a single communication round with the database server. Scalpel [1] attempts to do so by monitoring database requests at the network layer and prefetching results for queries which are expected to be issued in the near future. Furthermore, Scalpel is able to rewrite queries to fetch correlated results automatically. Both of these tasks are achieved by predicting correlations of parameters between multiple queries. While Scalpel performs well, it does not take advantage of any knowledge of the application code and performs unnecessary requests to the database when predictions fail. Sloth [3] performs batching by delaying queries until their results are needed by subsequent application code. Queries which are not dependent on the results of previous queries can then be batched together to save on round trips. Sloth performs this batching by adding lazy evaluation to Java programs using JDBC. While this approach works well, it is unable to batch queries which have dependencies between them. Furthermore, the lazy evaluation required to perform batching adds 5–15% overhead.
The most similar work to ours is Pyxis [2]. Pyxis uses runtime profiling to dynamically ship portions of JDBC application code onto a server-side component. The goal of Pyxis is to minimize round trips between the client and the server when an application makes consecutive database requests. However, Pyxis requires dynamic profiling of the application code which adds up to 6× overhead for code not involving database access. This restricts the class of applications which can benefit from Pyxis without manual restructuring of the application to separate compute-intensive code. The overhead is partially a result of a server-side component which is required to communicate with the database process on the server. We avoid this issue in Locomotor by using the scripting language already embedded in the database server, thus requiring no changes to the server code. Furthermore, Locomotor directly handles dynamic languages while Pyxis relies on a compiler which modifies Java application source code.
7 CONCLUSIONS AND FUTURE WORK
By automatically translating code that interacts with the database, we are able to produce efficient stored procedures while eliminating the overhead of manual rewriting. Our prototype implementation showed a 4× performance increase over a purely client-side application for the TPC-C delivery transaction. Furthermore, scripts can be automatically installed on the server and updated as the source language changes. For code which includes multiple interactions with the database, this conversion to stored procedures can result in significant performance improvements.
Currently it is necessary for the application developer to specify
functions they wish to execute on the server. In future work, we
anticipate that by constructing an accurate performance model of
both the server and client-side code, it should be possible to automate the selection of code fragments to execute on the server.
While our prototype only operates on Python applications commu-
nicating with Redis, we believe similar techniques could apply to
other databases.
REFERENCES
[1] I.T. Bowman and K. Salem. 2005. Optimization of Query Streams Using Semantic Prefetching. ACM TODS 30, 4 (Dec 2005), 1056–1101.
[2] A. Cheung and others. 2012. Automatic Partitioning of Database Applications. Proc. VLDB Endow. 5, 11 (Jul 2012).
[3] A. Cheung and others. 2014. Sloth: Being Lazy is a Virtue (when Issuing Database Queries). In Proc. of SIGMOD '14.
[4] B. Chun and others. 2011. CloneCloud: Elastic Execution Between Mobile Device and Cloud. In Proc. of EuroSys '11. ACM.
[5] E. Cuervo and others. 2010. MAUI: Making Smartphones Last Longer with Code Offload. In Proc. of MobiSys '10. ACM.
[6] S. Furuhashi. 2017. MessagePack. (2017). https://msgpack.org/
[7] G.C. Hunt and M.L. Scott. 1999. The Coign Automatic Distributed Partitioning System. In Proc. of OSDI '99. USENIX Association, 1471–1482.
[8] M. Jazayeri. 2007. Some Trends in Web Application Development. In Proc. FOSE '07. IEEE Comp. Soc., 199–213.
[9] W. Kohler and others. 1991. Overview of TPC Benchmark C: The Order-Entry Benchmark. Transaction Processing Performance Council, Technical Report (1991).
return {'a': 1}['a']
(a) Python

d = { ['__DICT'] = true, ['a'] = 1 }
return d[(d.__DICT) and ('a') or ('a' + 1)]
(b) Lua

Figure 4: Dictionary access
[10] L. Lam and others. 2006. A General Dynamic Information Flow Tracking Framework for Security Applications. In Proc. ACSAC '06. 463–472.
[11] A. Pavlo. 2015. Python implementation of TPC-C. (2015). https://github.com/apavlo/py-tpcc
[12] M. Stonebraker and A. Weisberg. 2013. The VoltDB Main Memory DBMS. IEEE Data Eng. Bull. 36, 2 (2013), 21–27.
[13] X. Wu and others. 2013. Optimizing Event Polling for Network-Intensive Applications: A Case Study on Redis. In Proc. ICPADS '13. 687–692.
A PYTHON TO LUA TRANSLATION DETAILS
In general, our translation process works by examining an AST
node in the original Python code and emitting the corresponding
Lua code. However, there are caveats which arise as a result of the differences between the two languages which we detail below.
A.1 Data Type Conversions
One of the main difficulties with translating between two languages is the incongruence of their type systems. For example, Python has explicit types for arrays (lists) and hash tables (dictionaries) while Lua has a single type (tables) which is used for both purposes. This is complicated by the fact that Lua uses 1-based indexing for tables with integer indices (arrays). The syntax for accessing both arrays and dictionaries is the same in Python. This means it is necessary to know at runtime whether the Lua table is being used as an array or hash table to construct the appropriate index. We solve this problem by annotating each Lua table with an attribute indicating if it corresponds to a Python dictionary. On each table access, we check this attribute to construct the correct index into the table. An example translation is given in Figure 4.
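The index-selection logic that the generated Lua performs at runtime can be mirrored in Python to show the idea: each table carries a __DICT marker, and every access consults it to choose between the original key (dictionary) and a shifted 1-based index (array). The helper below is illustrative, not Locomotor's emitted code:

```python
def lua_index(table, py_index):
    """Choose the index a Lua table access would use for a Python index."""
    if table.get('__DICT'):
        return py_index        # dictionaries keep their original keys
    return py_index + 1        # Lua arrays are 1-based, so shift

# A Python dict translated to an annotated Lua table
d = {'__DICT': True, 'a': 1}
print(d[lua_index(d, 'a')])    # 1

# A Python list translated to a Lua-style 1-based array (modeled as a dict)
arr = {'__DICT': False, 1: 'x', 2: 'y'}
print(arr[lua_index(arr, 0)])  # 'x': Python index 0 becomes Lua index 1
```

This is exactly the `(d.__DICT) and ('a') or ('a' + 1)` pattern in Figure 4, with the check hoisted into a helper for readability.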
A.2 Language Semantics
While there is a high correspondence between the operators available in Python and Lua, some operators have slightly different semantics. For example, Lua has two separate operators for numerical addition (+) and string concatenation (..). Python simply uses + for both purposes and we cannot necessarily statically determine whether the operation represents addition or concatenation. To resolve this issue, we use the simple heuristic of identifying a numerical or string constant on either side of the operator and falling back to integer addition. This heuristic could be replaced by either static type inference (if possible) or runtime type checks as needed.
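A sketch of this heuristic over a Python AST: emit Lua's .. when either operand of + is a string constant, and fall back to + otherwise. The function below is a simplified illustration of the decision, not Locomotor's actual emitter:

```python
import ast

def lua_binop(expr):
    """Pick the Lua operator ('..' or '+') for a Python `a + b` expression."""
    node = ast.parse(expr, mode='eval').body
    def is_str(n):
        return isinstance(n, ast.Constant) and isinstance(n.value, str)
    if is_str(node.left) or is_str(node.right):
        return '..'        # a string constant on either side: concatenation
    return '+'             # fall back to numerical addition

print(lua_binop("x + 'suffix'"))   # ..
print(lua_binop("x + 1"))          # +
print(lua_binop("x + y"))          # + (fallback; may be wrong for strings)
```

The last case shows why the section calls this a heuristic: with no constant on either side, the translation guesses addition, and only type inference or a runtime check could do better.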
Lua's truth semantics also differ from those of Python. In Lua, only nil and false evaluate to false while in Python other values such as empty strings and collections also evaluate to false. To match these semantics, we simply check Lua values according to Python semantics whenever an expression is evaluated in a boolean context.
for i in range(10):
    if i == 3:
        break
    else:
        continue
    x += 1
(a) Python

local __BREAK1 = false
for i = 0, 10 - 1, 1 do
    if __BREAK1 then break end
    repeat
        if i == 3 then
            __BREAK1 = true
            do break end
        else
            if true then break end
        end
        x = x + 1
    until true
end
(b) Lua

Figure 5: Loop break and continue
A.3 Built-in Functions
Python has built-in functions as well as a large standard library which is commonly used by applications. We choose to assume that built-in functions have not been redefined. In practice, it is possible that a developer chose to name their own functions using these names, in which case the translation would be invalid. Most of the time this assumption is valid, but if it is violated, then the translation will not be faithful to the original code. While this has not been a problem with the examples we considered, this issue could be resolved by more detailed static analysis to identify overridden functions, coupled with runtime checks to verify that the function has not been overridden. When translating built-in functions, we either use an equivalent built-in function in Lua or provide a simple Lua implementation.
A.4 Loop Constructs
Python and Lua have different constructs for looping. For example, Lua does not support a continue statement, which Python uses to end the current loop iteration and continue with the next. We emulate this in Lua by nesting an additional loop and using a break statement to stop executing the rest of the loop. This has the disadvantage of disallowing both break and continue in a single loop. We can resolve this issue by modifying break statements in the original code to set a flag which can be checked on each iteration to abort the loop, as we demonstrate in Figure 5.