A Queue-oriented Transaction Processing Paradigm
Middleware 2019 Doctoral Symposium
Thamir M. Qadah∗†
Exploratory Systems Lab
School of Electrical and Computer Engineering, Purdue University, West Lafayette
tqadah@purdue.edu
Abstract
Transaction processing has been an active area of research
for several decades. A fundamental characteristic of classical
transaction processing protocols is non-determinism, which
causes them to suffer from performance issues in modern
computing environments such as main-memory databases
running on many-core, multi-socket CPUs, as well as in
distributed environments. Recent proposals of deterministic
transaction processing techniques have shown great potential
in addressing these performance issues. In this position paper,
I argue for a queue-oriented transaction processing paradigm
that leads to better design and implementation of deterministic
transaction processing protocols. I support my approach
with extensive experimental evaluations and demonstrate
significant performance gains.
CCS Concepts • Information systems → Database transaction processing; Distributed database transactions; Main memory engines; • Computer systems organization → Multicore architectures; Distributed architectures;
Keywords database systems, transaction processing, concurrency control, distributed database systems, performance evaluation
ACM Reference Format:
Thamir M. Qadah. 2019. A Queue-oriented Transaction Processing
Paradigm: Middleware 2019 Doctoral Symposium. In Middleware
’19: 20th International Middleware Conference Doctoral Symposium
(Middleware ’19), December 9–13, 2019, Davis, CA, USA. ACM, New
York, NY, USA, 5 pages. https://doi.org/10.1145/3366624.3368163
∗The author is co-advised by Prof. Mohammad Sadoghi
†Also with Umm Al-Qura University, Makkah, Saudi Arabia.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are not
made or distributed for profit or commercial advantage and that copies bear
this notice and the full citation on the first page. Copyrights for components
of this work owned by others than ACM must be honored. Abstracting with
credit is permitted. To copy otherwise, or republish, to post on servers or to
redistribute to lists, requires prior specific permission and/or a fee. Request
permissions from permissions@acm.org.
Middleware '19, December 9–13, 2019, Davis, CA, USA
© 2019 Association for Computing Machinery.
ACM ISBN 978-1-4503-7039-4/19/12...$15.00
https://doi.org/10.1145/3366624.3368163
1 Introduction
Transaction processing is an age-old problem that has been
an active area of research for the past 40 years [8]. Classical
transaction processing is characterized as non-deterministic
because the final database state cannot be entirely determined
by the input database state and the input set of transactions.
The output database state is acceptable as long as the
resulting history of concurrent transaction execution is
equivalent to some serial history of execution according to
serializability theory.
The goal of transaction processing protocols is to ensure
ACID properties and increase the concurrency of executed
transactions. Serializable isolation ensures anomaly-free
execution. Using other isolation levels (e.g., read-committed)
improves concurrency but is prone to producing anomalies
that defy users' intentions and leave the database in an
undesirable, inconsistent state.
Due to their non-deterministic nature, classical transaction
processing protocols suffer from performance issues in
modern computing environments such as main-memory
databases that use many-core, multi-socket CPUs, and
cloud-based distributed environments. In this Ph.D.
dissertation, I look into ways to impose determinism to
improve the performance of transaction processing in
modern computing environments.
2 Transaction Processing in Modern
Computing Environments
This section describes two major performance issues that
arise when running database transactions under
non-deterministic transaction processing protocols.
Throughout this section, the discussion assumes that
serializable isolation is required.
2.1 High-contention Workloads
Under high-contention workloads, non-deterministic
transaction processing protocols suffer from high abort rates
because their concurrency control algorithms need to ensure
serializable histories. Pessimistic concurrency control
algorithms abort transactions to avoid deadlocks, while
optimistic concurrency control algorithms abort transactions
during the validation phase. Ensuring deadlock-free execution
and validating transactions require extensive coordination
among concurrent threads executing transactions while
guaranteeing serializability. The main research questions for
this problem are: Is it possible to process highly contended
workloads in a concurrency-control-free manner with minimal
coordination while ensuring serializability? What are the
right abstractions and principles to achieve that?
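To make this abort behavior concrete, consider a minimal no-wait locking rule in the spirit of 2PL-NoWait [24]. The following Python sketch uses hypothetical names and is purely illustrative, not code from any system evaluated here:

# Sketch of a no-wait locking rule (illustrative; cf. 2PL-NoWait [24]).
# Under contention, lock requests frequently hit an already-held lock
# and abort immediately, producing the high abort rates described above.
class NoWaitLockTable:
    def __init__(self):
        self.holders = {}  # record key -> id of the transaction holding it

    def try_lock(self, key, txn_id):
        holder = self.holders.get(key)
        if holder is not None and holder != txn_id:
            return False  # conflict: abort the requester, never wait
        self.holders[key] = txn_id
        return True

    def release_all(self, txn_id):
        for key in [k for k, t in self.holders.items() if t == txn_id]:
            del self.holders[key]

locks = NoWaitLockTable()
assert locks.try_lock("x", 1)       # transaction 1 acquires record "x"
assert not locks.try_lock("x", 2)   # transaction 2 conflicts and must abort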
2.2 Distributed Commit Protocols
In distributed transaction processing, agreement protocols
introduce significant overhead because all participant nodes
need to agree on the fate of an executed distributed
transaction. Achieving this agreement involves multiple
rounds of messages exchanged among the participating nodes.
The state of the art for solving the agreement problem on
the fate of transactions in database systems is the two-phase
commit protocol (2PC) [9]. In the general case, 2PC is required
to ensure atomicity when processing distributed transactions.
Note that 2PC by itself does not ensure serializable histories;
a distributed concurrency control protocol augments it to
guarantee serializable execution of transactions. Therefore,
the research questions for this problem are as follows: Can we
reduce the cost of commitment in distributed transaction
processing protocols? What conditions are needed to avoid
using the costly 2PC-based protocol?
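To see where this cost comes from, the following Python sketch of a 2PC coordinator (hypothetical names; a hedged illustration, not any system's actual protocol code) makes the two mandatory message rounds per distributed transaction explicit:

# Minimal 2PC coordinator sketch (illustrative). Every distributed
# transaction pays two rounds of messages to every participant, plus
# durable log writes, before it can commit.
from enum import Enum

class Vote(Enum):
    YES = 1
    NO = 2

class Participant:
    """Stand-in for a remote node; a real system exchanges RPCs."""
    def __init__(self, name, will_commit=True):
        self.name = name
        self.will_commit = will_commit

    def prepare(self, txn_id):         # round 1: PREPARE / vote
        return Vote.YES if self.will_commit else Vote.NO

    def decide(self, txn_id, commit):  # round 2: COMMIT or ABORT
        pass  # apply or roll back locally, then acknowledge

def two_phase_commit(txn_id, participants):
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare(txn_id) for p in participants]
    commit = all(v == Vote.YES for v in votes)
    # (A real coordinator durably logs the decision at this point.)
    # Phase 2: broadcast the decision and await acknowledgments.
    for p in participants:
        p.decide(txn_id, commit)
    return commit

nodes = [Participant("n1"), Participant("n2"), Participant("n3", will_commit=False)]
print(two_phase_commit("t42", nodes))  # False: a single NO vote aborts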
Fortunately, in many useful and practical cases, we can do
away with 2PC, as the work on deterministic transaction
processing protocols has demonstrated. The next section
describes how determinism is a step toward overcoming this
obstacle. However, the proposed deterministic transaction
processing protocols suffer from inefficiencies. Another step
toward eliminating these inefficiencies is the proposed
queue-oriented paradigm, which addresses the following
additional research questions: What is the best way to
abstract deterministic transaction processing? Is it possible
to provide a unified framework for both centralized and
distributed transaction processing?
2.3 Potential and Limitations of Determinism
Work on deterministic transaction processing protocols has
demonstrated great potential for improving the performance
of transaction processing systems [2]. In distributed
transaction processing systems, recently proposed
deterministic approaches almost eliminate the need to
perform a costly 2PC protocol [18]. In other words, they rely
on commit protocols that minimize the overhead of committing
a distributed transaction because they perform agreement
ahead of time, which avoids aborting transactions for
non-deterministic reasons (e.g., deadlocks, validation, or
node failures).
In deterministic databases, the output database state is
entirely determined by the input database state and the
input set of transactions. Thus, full knowledge of the
read/write set is required to process transactions
deterministically, which is the main weakness of deterministic
transaction processing protocols. Despite this limitation,
there are commercial offerings that adopt the deterministic
philosophy [7, 21], which indicates that the approach has
found practical use cases.
3 Approach
Our goal is to process transactions efficiently in modern
computing environments with minimal coordination among
the threads running in our system. The proposed approach
addresses the research questions presented in the previous
section. The answer to these questions relies on three
principles: transaction fragmentation, deterministic two-phase
processing, and a priority-based, queue-oriented
representation of the transactional workload. The essence of
the approach is to minimize the overhead of transactional
concurrency control and coordination across the whole system.
A second goal is to provide a unified, extensible abstraction
for deterministic transaction processing that seamlessly
admits various configurations (e.g., speculative execution,
conservative execution, serializable isolation, and
read-committed isolation).
To lay the foundations for describing the queue-oriented
transaction processing paradigm, we start by describing the
transaction fragmentation model.
3.1 Transaction Fragmentation Model
Now, I briefly describe the transaction fragmentation model;
for a more formal specification of this model, I refer the
reader to [17]. In this model, a transaction is broken into
fragments containing the relevant transaction logic and abort
conditions. A fragment can perform multiple operations on
the same record, such as read, modify, and write operations.
A fragment can cause the transaction to abort; we refer to
such fragments as abortable fragments. Table 1 summarizes
the kinds of dependencies that may exist among fragments,
and the sketch below illustrates the model.
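The following Python sketch renders the model concretely. The names and structure are hypothetical illustrations of the model in [17], not the engine's actual data structures:

# Sketch of the transaction fragmentation model (illustrative; see [17]).
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Fragment:
    txn_id: int
    key: str                       # the record this fragment operates on
    logic: Callable[[dict], None]  # read/modify/write logic for that record
    abortable: bool = False        # True if this fragment may abort the txn
    deps: list = field(default_factory=list)  # fragments this one depends on

@dataclass
class Transaction:
    txn_id: int
    fragments: list

# Example: transfer 10 units from record "a" to record "b".
def withdraw(db):                  # abortable: insufficient funds aborts
    if db["a"] < 10:
        raise RuntimeError("abort transaction 1")
    db["a"] -= 10

def deposit(db):
    db["b"] += 10

w = Fragment(1, "a", withdraw, abortable=True)
d = Fragment(1, "b", deposit, deps=[w])  # commit dependency: w may abort
t = Transaction(1, [w, d])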
3.2 Queue-oriented Transaction Processing
The essence of this paradigm is to process batches of
transactions in two deterministic phases. Figure 1 depicts the
basic flow. The first phase is called the planning phase, where
planning threads deterministically create queues tagged with
deterministic priorities and containing transaction fragments.
Dependencies among fragments are not shown in Figure 1; the
dependency information is maintained in a shared, lock-free,
thread-safe, distributed data structure. In the second phase,
the execution phase, execution threads receive their assigned
queues (filled with fragments) and use the tagged priorities
to determine the processing order of queues from different
planning threads. At this point, execution threads are not
aware of the actual transactions. They simply execute the
logic associated with the fragments in the queues and obey
the FIFO property of queues when processing fragments with
conflict dependencies. Processing all queues is equivalent to
processing the whole batch of planned transactions and
committing them.
Name                   | Fragment relation      | Notes
Data dependency        | Same transaction       | the dependent fragment requires values read by the dependee fragment
Conflict dependency    | Different transactions | fragments access the same record
Commit dependency      | Same transaction       | the dependee fragment may abort while the dependent fragment updates the database
Speculation dependency | Different transactions | the dependent fragment uses data values updated by an abortable fragment
Table 1. Summary of dependencies in the transaction fragmentation model
Figure 1. Queue-oriented Transaction Processing Architecture
Other than the necessary communication to resolve
dependencies among fragments, no other coordination is
needed.
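A minimal sequential sketch of the two phases follows, with hypothetical names (the actual engine [17] is a multi-threaded C++ implementation; here a single thread drains the queues for clarity):

# Sketch of queue-oriented two-phase processing (illustrative; see [17]).
from collections import defaultdict
import zlib

class Fragment:  # stripped-down fragment, kept self-contained
    def __init__(self, key, logic):
        self.key = key      # record the fragment touches
        self.logic = logic  # operation on that record

def plan(batch, num_queues):
    """Planning phase: deterministically route every fragment into a
    priority-tagged FIFO queue; batch order fixes the order of conflicts."""
    queues = defaultdict(list)  # priority -> FIFO list of fragments
    for txn in batch:
        for frag in txn:
            priority = zlib.crc32(frag.key.encode()) % num_queues  # stable hash
            queues[priority].append(frag)
    return queues

def execute(queues, db):
    """Execution phase: drain queues in priority order. FIFO order within a
    queue preserves the planned order of conflicting fragments, so no locks
    or validation are needed. Real execution threads drain their assigned
    queues concurrently; this sketch is sequential."""
    for priority in sorted(queues):
        for frag in queues[priority]:
            frag.logic(db)

# A two-transaction batch: t1 moves 10 from "a" to "b"; t2 then reads "b".
t1 = [Fragment("a", lambda db: db.update(a=db["a"] - 10)),
      Fragment("b", lambda db: db.update(b=db["b"] + 10))]
t2 = [Fragment("b", lambda db: print("t2 reads b =", db["b"]))]
db = {"a": 100, "b": 0}
execute(plan([t1, t2], num_queues=4), db)
print(db)  # {'a': 90, 'b': 10}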
Queue Execution Mechanisms. The proposed paradigm
supports multiple execution mechanisms, such as speculative
or conservative execution. When using speculative execution,
additional speculation dependencies occur, and resolving them
may cause cascading aborts (sketched below). Conservative
execution, on the other hand, ensures that uncommitted
updates are not processed until all abortable fragments have
completed without aborting, which requires additional
synchronization and coordination among threads.
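As a rough illustration of the speculative case (hypothetical structures, not the engine's code), aborting a fragment must transitively abort every fragment that consumed its speculative writes:

# Sketch: cascading aborts under speculative execution (illustrative).
# spec_deps maps a fragment id to the fragments that read its
# speculative writes; aborting it transitively aborts those readers.
def cascade_abort(frag_id, spec_deps, aborted=None):
    if aborted is None:
        aborted = set()
    if frag_id in aborted:
        return aborted
    aborted.add(frag_id)
    for dependent in spec_deps.get(frag_id, []):
        cascade_abort(dependent, spec_deps, aborted)
    return aborted

# f2 and f3 speculatively read f1's writes; f4 speculatively read f3's.
deps = {"f1": ["f2", "f3"], "f3": ["f4"]}
print(sorted(cascade_abort("f1", deps)))  # ['f1', 'f2', 'f3', 'f4']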
Isolation Levels. The queue-oriented paradigm admits the
read-committed isolation level in addition to serializable
isolation. Supporting read-committed isolation with
speculative execution is interesting because it requires
maintaining both a speculative version and a committed
version of each record. Beyond the storage requirements, the
planning phase would create additional queues for read
operations; in the execution phase, multiple threads can
execute these read operations using committed data.
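The storage requirement can be sketched as a record holding two slots, one committed and one speculative; this is a hypothetical illustration, not the actual storage layout:

# Sketch: dual-version record for read-committed under speculation
# (illustrative). Read-committed readers only ever see committed data.
class DualVersionRecord:
    def __init__(self, value):
        self.committed = value
        self.speculative = None  # set by speculative (uncommitted) writers

    def write_speculative(self, value):
        self.speculative = value

    def read_committed(self):    # used by read-committed read queues
        return self.committed

    def read_speculative(self):  # used by serializable speculative execution
        return self.committed if self.speculative is None else self.speculative

    def finalize(self, commit):  # promote or discard the speculative version
        if commit and self.speculative is not None:
            self.committed = self.speculative
        self.speculative = None

r = DualVersionRecord(5)
r.write_speculative(7)
print(r.read_committed(), r.read_speculative())  # 5 7
r.finalize(commit=True)
print(r.read_committed())                        # 7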
4 Evaluation
For the evaluation, I implemented the queue-oriented
processing protocol in ExpoDB [10, 11]. I also ported
state-of-the-art non-deterministic and deterministic protocols
into ExpoDB; using a single test-bed implementation allows an
apples-to-apples comparison among the different protocols. I
used industry-standard macro-benchmarks such as YCSB [4] and
TPC-C [19]. Table 2 summarizes the experimental results
obtained from the centralized implementation running on
multi-core hardware with speculative execution; more details
on our centralized implementation are available in [17].
Furthermore, Table 2 reports results for our distributed
implementation against a state-of-the-art distributed
deterministic transaction processing protocol. The key
performance metrics for evaluating transaction processing
protocols are throughput and latency.
Another criterion for evaluating this paradigm is its
applicability and broader impact. On this criterion, the
queue-oriented paradigm scores high because it is the first
deterministic transaction processing paradigm that allows
different execution models and isolation levels. It also has
the potential to guide implementations that improve
blockchain systems.
5 Related Work
The related work to my Ph.D. dissertation falls into two
categories. The first category comprises centralized
deterministic transaction processing protocols. LADS by Yao
et al. [23] creates multiple sub-graphs representing the
transaction dependencies of a batch of transactions, and
executes these transactions according to the dependency
sub-graphs.
Environment                     | Compared protocols                                                            | Throughput improvement  | Macro-benchmark | Notes
Centralized (deterministic)     | H-Store [13]                                                                  | two orders of magnitude | YCSB            | multi-partition workload
Distributed (deterministic)     | Calvin [18]                                                                   | 22×                     | YCSB            | low-contention workload (uniform access)
Centralized (non-deterministic) | Cicada [16], TicToc [25], FOEDUS [15], ERMIA [14], Silo [20], 2PL-NoWait [24] | 3×                      | TPC-C           | high-contention workload (1 warehouse)
Table 2. Experimental results using TPC-C and YCSB for the centralized implementation of the queue-oriented paradigm [17], and a distributed deterministic database.
The main issue with this approach is that graph-based
processing is not efficient. Using a different graph-based
approach, Faleiro et al. [6] process transactions
deterministically and introduce the notion of "early write
visibility," which allows transactions to read uncommitted
data safely. In our approach, we use queues of transaction
fragments with different dependency semantics, which allows
us to process transactions more efficiently than a purely
graph-based approach. BOHM [5] started the re-thinking of
multi-version concurrency control for deterministic
multi-core in-memory data stores. BOHM relies on pessimistic
transactional concurrency control, while our proposed
paradigm avoids transactional concurrency control during
execution. Some ideas presented in [5, 6] are complementary
to our approach; for example, our current implementation is
single-version but can be extended to multi-version in the
future.
In the second category, one of the first proposed distributed
deterministic database systems is H-Store [13], which focuses
on partitioned workloads. The design of H-Store does not lend
itself to multi-partition transactional workloads because of
its partition-level locking mechanism and its reliance on 2PC.
To improve the performance of multi-partition workloads,
Jones et al. [12] introduced the idea of speculative execution
in H-Store while still relying on 2PC as the distributed
commit protocol. In contrast to these proposals, the use of
speculative execution in the proposed paradigm is different
because speculation operates at the level of fragments.
Furthermore, the proposed paradigm does not require 2PC to
commit distributed multi-partition transactions.
As mentioned previously, Calvin [18] greatly reduces the
overhead of distributed transactions because it does not rely
on 2PC. Wu et al. propose T-Part [22], which uses the same
fundamental design as Calvin. T-Part optimizes the handling
of remote reads by using a forward-pushing technique, at the
cost of more complex scheduling that involves solving a
graph-partitioning problem. The key characteristic of Calvin
and T-Part is that they use thread-to-transaction assignment,
while our approach uses thread-to-queue assignment.
Therefore, these systems cannot exploit intra-transaction
parallelism within a single node.
6 Conclusion
In this paper, I argued for a queue-oriented transaction
processing paradigm, which improves the performance of
deterministic databases. Ongoing work includes using this
paradigm to design and implement distributed transaction
processing with Byzantine fault tolerance.
Future work includes using the proposed paradigm to realize
a deterministic version of production-ready NewSQL databases
such as TiDB [1]. Moreover, I believe that this paradigm can
also improve the performance of blockchain systems. In
particular, the queue-oriented paradigm can lead to a design
and implementation that improves the performance of the
ordering service in Hyperledger Fabric [3].
Acknowledgments
I want to thank my co-advisors, Prof. Arif Ghafoor, for his
continuous support during my Ph.D. journey, and Prof.
Mohammad Sadoghi, for his valuable comments that helped me
develop the ideas in my thesis. I would also like to thank
the anonymous referees and Yahya Javed for their valuable
comments and helpful suggestions. This work is supported in
part by a scholarship from Umm Al-Qura University, Makkah,
Saudi Arabia.
References
[1] 2019. TiDB | SQL at Scale. https://pingcap.com/en/.
[2] Daniel J. Abadi and Jose M. Faleiro. 2018. An Overview of Deterministic Database Systems. Commun. ACM 61, 9 (Aug. 2018), 78–88. https://doi.org/10.1145/3181853
[3] Elli Androulaki, Artem Barger, Vita Bortnikov, Christian Cachin, Konstantinos Christidis, Angelo De Caro, David Enyeart, Christopher Ferris, Gennady Laventman, Yacov Manevich, Srinivasan Muralidharan, Chet Murthy, Binh Nguyen, Manish Sethi, Gari Singh, Keith Smith, Alessandro Sorniotti, Chrysoula Stathakopoulou, Marko Vukolić, Sharon Weed Cocco, and Jason Yellick. 2018. Hyperledger Fabric: A Distributed Operating System for Permissioned Blockchains. In Proceedings of the Thirteenth EuroSys Conference (EuroSys '18). ACM, New York, NY, USA, 30:1–30:15. https://doi.org/10.1145/3190508.3190538
[4] Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell Sears. 2010. Benchmarking Cloud Serving Systems with YCSB. In Proc. SoCC. ACM, 143–154. https://doi.org/10.1145/1807128.1807152
[5] Jose M. Faleiro and Daniel J. Abadi. 2015. Rethinking Serializable Multiversion Concurrency Control. Proc. VLDB Endow. 8, 11 (July 2015), 1190–1201. https://doi.org/10.14778/2809974.2809981
[6] Jose M. Faleiro, Daniel J. Abadi, and Joseph M. Hellerstein. 2017. High Performance Transactions via Early Write Visibility. Proc. VLDB Endow. 10, 5 (Jan. 2017), 613–624. https://doi.org/10.14778/3055540.3055553
[7] FaunaDB. 2019. FaunaDB Website. https://fauna.com/.
[8] Jim Gray and Andreas Reuter. 1992. Transaction Processing: Concepts and Techniques (1st ed.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.
[9] J. N. Gray. 1978. Notes on Data Base Operating Systems. In Operating Systems: An Advanced Course, R. Bayer, R. M. Graham, and G. Seegmüller (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 393–481.
[10] Suyash Gupta and Mohammad Sadoghi. 2018. Blockchain Transaction Processing. In Encyclopedia of Big Data Technologies, Sherif Sakr and Albert Zomaya (Eds.). Springer International Publishing, Cham, 1–11. https://doi.org/10.1007/978-3-319-63962-8_333-1
[11] Suyash Gupta and Mohammad Sadoghi. 2018. EasyCommit: A Non-Blocking Two-Phase Commit Protocol. In EDBT. https://doi.org/10.5441/002/edbt.2018.15
[12] Evan P. C. Jones, Daniel J. Abadi, and Samuel Madden. 2010. Low Overhead Concurrency Control for Partitioned Main Memory Databases. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data (SIGMOD '10). ACM, New York, NY, USA, 603–614. https://doi.org/10.1145/1807167.1807233
[13] Robert Kallman, Hideaki Kimura, Jonathan Natkins, Andrew Pavlo, Alexander Rasin, Stanley Zdonik, Evan P. C. Jones, Samuel Madden, Michael Stonebraker, Yang Zhang, John Hugg, and Daniel J. Abadi. 2008. H-Store: A High-Performance, Distributed Main Memory Transaction Processing System. Proc. VLDB Endow. 1, 2 (Aug. 2008), 1496–1499. https://doi.org/10.14778/1454159.1454211
[14] Kangnyeon Kim, Tianzheng Wang, Ryan Johnson, and Ippokratis Pandis. 2016. ERMIA: Fast Memory-Optimized Database System for Heterogeneous Workloads. In Proceedings of the 2016 International Conference on Management of Data (SIGMOD '16). ACM, San Francisco, California, USA, 1675–1687. https://doi.org/10.1145/2882903.2882905
[15] Hideaki Kimura. 2015. FOEDUS: OLTP Engine for a Thousand Cores and NVRAM. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data (SIGMOD '15). ACM, Melbourne, Victoria, Australia, 691–706. https://doi.org/10.1145/2723372.2746480
[16] Hyeontaek Lim, Michael Kaminsky, and David G. Andersen. 2017. Cicada: Dependably Fast Multi-Core In-Memory Transactions. In Proc. SIGMOD. ACM, 21–35. https://doi.org/10.1145/3035918.3064015
[17] Thamir M. Qadah and Mohammad Sadoghi. 2018. QueCC: A Queue-Oriented, Control-Free Concurrency Architecture. In Proceedings of the 19th International Middleware Conference (Middleware '18). ACM, New York, NY, USA, 13–25. https://doi.org/10.1145/3274808.3274810
[18] Alexander Thomson, Thaddeus Diamond, Shu C. Weng, Kun Ren, Philip Shao, and Daniel J. Abadi. 2012. Calvin: Fast Distributed Transactions for Partitioned Database Systems. In Proc. SIGMOD. ACM, 1–12. https://doi.org/10.1145/2213836.2213838
[19] TPC. 2010. TPC-C, On-Line Transaction Processing Benchmark, Version 5.11.0. TPC Corporation.
[20] Stephen Tu, Wenting Zheng, Eddie Kohler, Barbara Liskov, and Samuel Madden. 2013. Speedy Transactions in Multicore In-Memory Databases. In SOSP. ACM, 18–32. https://doi.org/10.1145/2517349.2522713
[21] VoltDB. 2019. VoltDB. https://www.voltdb.com/.
[22] Shan-Hung Wu, Tsai-Yu Feng, Meng-Kai Liao, Shao-Kan Pi, and Yu-Shan Lin. 2016. T-Part: Partitioning of Transactions for Forward-Pushing in Deterministic Database Systems. In Proceedings of the 2016 International Conference on Management of Data (SIGMOD '16). ACM, New York, NY, USA, 1553–1565. https://doi.org/10.1145/2882903.2915227
[23] C. Yao, D. Agrawal, G. Chen, Q. Lin, B. C. Ooi, W. F. Wong, and M. Zhang. 2016. Exploiting Single-Threaded Model in Multi-Core In-Memory Systems. IEEE TKDE 28, 10 (2016), 2635–2650. https://doi.org/10.1109/TKDE.2016.2578319
[24] Xiangyao Yu, George Bezerra, Andrew Pavlo, Srinivas Devadas, and Michael Stonebraker. 2014. Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores. Proc. VLDB Endow. 8, 3 (Nov. 2014), 209–220. https://doi.org/10.14778/2735508.2735511
[25] Xiangyao Yu, Andrew Pavlo, Daniel Sanchez, and Srinivas Devadas. 2016. TicToc: Time Traveling Optimistic Concurrency Control. In Proc. SIGMOD. ACM, 1629–1642. https://doi.org/10.1145/2882903.2882935