A Queue-oriented Transaction Processing Paradigm
Middleware 2019 Doctoral Symposium
Thamir M. Qadah∗†
Exploratory Systems Lab
School of Electrical and Computer Engineering, Purdue University, West Lafayette
tqadah@purdue.edu
Abstract
Transaction processing has been an active area of research for several decades. A fundamental characteristic of classical transaction processing protocols is non-determinism, which causes them to suffer from performance issues in modern computing environments such as main-memory databases using many-core and multi-socket CPUs, and distributed environments. Recent proposals of deterministic transaction processing techniques have shown great potential in addressing these performance issues. In this position paper, I argue for a queue-oriented transaction processing paradigm that leads to better design and implementation of deterministic transaction processing protocols. I support my approach with extensive experimental evaluations and demonstrate significant performance gains.
CCS Concepts • Information systems → Database transaction processing; Distributed database transactions; Main memory engines; • Computer systems organization → Multicore architectures; Distributed architectures
Keywords database systems, transaction processing, concurrency control, distributed database systems, performance evaluation
ACM Reference Format:
Thamir M. Qadah. 2019. A Queue-oriented Transaction Processing Paradigm: Middleware 2019 Doctoral Symposium. In Middleware '19: 20th International Middleware Conference Doctoral Symposium (Middleware '19), December 9–13, 2019, Davis, CA, USA. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3366624.3368163
∗ The author is co-advised by Prof. Mohammad Sadoghi.
† Also with Umm Al-Qura University, Makkah, Saudi Arabia.
1 Introduction
Transaction processing is an age-old problem that has been an active area of research for the past 40 years [8]. Classical transaction processing is characterized as non-deterministic because the final database state cannot be entirely determined by the input database state and the input set of transactions. The output database state is acceptable as long as the resulting history of concurrent transaction execution is equivalent to some serial history of execution according to serializability theory.
The goal of transaction processing protocols is to ensure ACID properties and increase the concurrency of executed transactions. Serializable isolation ensures anomaly-free execution. Using other isolation levels (e.g., read-committed) improves concurrency but is prone to producing anomalies that defy users' intentions and leave the database in an undesirable inconsistent state.
Because of their non-deterministic nature, classical transaction processing protocols suffer from performance issues in modern computing environments such as main-memory databases that use many-core and multi-socket CPUs, and cloud-based distributed environments. In this Ph.D. dissertation, I look into ways to impose determinism to improve the performance of transaction processing in modern computing environments.
2 Transaction Processing in Modern Computing Environments
This section describes two major performance issues that arise when running database transactions under non-deterministic transaction processing protocols. The discussion in this section assumes that serializable isolation is required.
2.1 High-contention Workloads
Under high-contention workloads, non-deterministic transaction processing protocols suffer from high abort rates because their concurrency control algorithms need to ensure serializable histories. Pessimistic concurrency control algorithms abort transactions to avoid deadlocks, and optimistic concurrency control algorithms abort transactions during the validation phase. Ensuring deadlock-free execution and validating transactions require extensive coordination among the concurrent threads executing transactions while guaranteeing serializability. The main research questions for this problem are: Is it possible to process high-contention workloads in a concurrency-control-free manner with minimal coordination while ensuring serializability? What are the right abstractions and principles to achieve that?
2.2 Distributed Commit Protocols
In distributed transaction processing, agreement protocols introduce significant overhead because all participant nodes need to agree on the fate of an executed distributed transaction. Achieving this agreement involves multiple rounds of messages exchanged among the participating nodes.
The state of the art for solving the agreement problem on the fate of transactions in database systems is the two-phase commit protocol (2PC) [9]. In the general case, 2PC is required to ensure atomicity when processing distributed transactions. Note that 2PC by itself does not ensure serializable histories; a distributed concurrency control protocol augments it to guarantee serializable execution of transactions. Therefore, the research questions for this problem are as follows: Can we reduce the cost of commitment in distributed transaction processing protocols? What conditions are needed to avoid using the costly 2PC-based protocol?
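To make the message cost concrete, the sketch below shows the basic 2PC pattern in a failure-free run: one prepare/vote round followed by one decision round per transaction. It is a minimal illustration with hypothetical names, not a production protocol, which would add durable logging, timeouts, and recovery.

```python
# Minimal, failure-free sketch of the 2PC message pattern. Names are
# illustrative; real implementations add logging, timeouts, and recovery.
class Participant:
    def __init__(self, will_commit=True):
        self.will_commit = will_commit   # stands in for local validation/locking
        self.state = "active"

    def prepare(self):                   # round 1: vote
        self.state = "prepared"
        return self.will_commit

    def finish(self, decision):          # round 2: apply coordinator's decision
        self.state = decision

def two_phase_commit(participants):
    votes = [p.prepare() for p in participants]     # first message round
    decision = "commit" if all(votes) else "abort"  # unanimous yes => commit
    for p in participants:
        p.finish(decision)                          # second message round
    return decision

# Usage: one participant votes no, so the whole transaction aborts.
print(two_phase_commit([Participant(), Participant(will_commit=False)]))  # abort
```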
Fortunately, in many useful and practical cases, we can do away with 2PC, as the work on deterministic transaction processing protocols has demonstrated. The next section describes how determinism is a step toward overcoming this obstacle. However, the proposed deterministic transaction processing protocols suffer from inefficiencies. Another step toward eliminating these inefficiencies is the proposed queue-oriented paradigm, which addresses the following additional research questions: What is the best way to abstract deterministic transaction processing? Is it possible to provide a unified framework for both centralized and distributed transaction processing?
2.3 Potentials and Limitations of Determinism
Work on deterministic transaction processing protocols has demonstrated great potential for improving the performance of transaction processing systems [2]. In distributed transaction processing systems, recently proposed deterministic approaches almost eliminate the need to perform a costly 2PC protocol [18]. In other words, they rely on commit protocols that minimize the overhead of committing a distributed transaction because they perform the agreement ahead of time, which avoids aborting transactions for non-deterministic reasons (e.g., deadlocks, validation, or node failures).
In deterministic databases, the output database state is entirely determined by the input database state and the input set of transactions. Thus, full knowledge of the read/write set is required to process transactions deterministically, which is the main weakness of deterministic transaction processing protocols. Despite this limitation, there are commercial offerings that adopt the deterministic philosophy [7, 21], which indicates that the approach has found practical use cases.
3 Approach
Our goal is to process transactions efficiently in modern computing environments with minimal coordination among the threads running in the system. The proposed approach addresses the research questions presented in the previous section. The answers to these questions rely on three principles: transaction fragmentation, deterministic two-phase processing, and a priority-based, queue-oriented representation of the transactional workload. The essence of the approach is to minimize the overhead of transactional concurrency control and coordination across the whole system. A second goal is to provide a unified, extensible abstraction for deterministic transaction processing that seamlessly admits various configurations (e.g., speculative execution, conservative execution, serializable isolation, and read-committed isolation). To lay the foundations for describing the queue-oriented transaction processing paradigm, we start by describing the transaction fragmentation model.
3.1 Transaction Fragmentation Model
I now briefly describe the transaction fragmentation model; for a more formal specification, I refer the reader to [17]. In this model, a transaction is broken into fragments containing the relevant transaction logic and abort conditions. A fragment can perform multiple operations, such as read, modify, and write, on the same record. A fragment can cause the transaction to abort; we refer to such fragments as abortable fragments. Table 1 summarizes the kinds of dependencies that may exist among fragments.
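As a concrete illustration, the following sketch renders the fragmentation model in code. The names (Fragment, DependencyKind, decrement_stock) are hypothetical and are not taken from the implementation in [17]; they merely mirror the concepts above and the dependency kinds of Table 1.

```python
# Illustrative sketch of the transaction fragmentation model; names are
# hypothetical and do not come from the paper's implementation.
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable, List, Tuple

class DependencyKind(Enum):
    DATA = auto()         # same transaction: dependent needs values read by dependee
    CONFLICT = auto()     # different transactions: fragments access the same record
    COMMIT = auto()       # same transaction: dependee may abort, dependent writes
    SPECULATION = auto()  # different transactions: dependent reads uncommitted writes

@dataclass
class Fragment:
    txn_id: int
    key: str                              # record this fragment operates on
    logic: Callable[[dict], bool]         # read/modify/write ops; False => abort
    abortable: bool = False               # can this fragment abort the transaction?
    deps: List[Tuple[DependencyKind, "Fragment"]] = field(default_factory=list)

# Example: an abortable fragment that decrements stock only when available.
def decrement_stock(db: dict) -> bool:
    if db.get("item-1", 0) <= 0:
        return False                      # abort condition triggers
    db["item-1"] -= 1
    return True

frag = Fragment(txn_id=1, key="item-1", logic=decrement_stock, abortable=True)
```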
3.2 Queue-oriented Transaction Processing
The essence of this paradigm is to process batches of transactions in two deterministic phases. Figure 1 depicts the basic flow. The first phase is a planning phase, in which planning threads deterministically create queues of transaction fragments tagged with deterministic priorities. Dependencies among fragments are not shown in Figure 1; the dependency information is maintained in a shared, lock-free, thread-safe distributed data structure. In the second, execution phase, execution threads receive their assigned queues (filled with fragments) and use the tagged priorities to determine the processing order of queues from different planning threads. At this point, execution threads are not aware of the actual transactions. They simply execute the logic associated with the fragments in the queues and obey the FIFO property of queues when processing fragments with conflict dependencies. Processing all queues is equivalent to processing the whole batch of planned transactions and committing them.
Name | Fragment relation | Notes
Data dependency | Same transaction | dependent fragment requires values read by the dependee fragment
Conflict dependency | Different transactions | fragments access the same record
Commit dependency | Same transaction | dependee fragment may abort while the dependent fragment updates the database
Speculation dependency | Different transactions | dependent fragment uses data values updated by an abortable fragment
Table 1. Summary of dependencies in the transaction fragmentation model
Figure 1. Queue-oriented Transaction Processing Architecture
Other than the necessary communication to resolve dependencies among fragments, no other coordination is needed.
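The following single-threaded sketch, reusing the hypothetical Fragment type from Section 3.1, illustrates the shape of the two phases under simplifying assumptions: a stable hash assigns records to queues, and the batch position serves as the deterministic priority. The actual system runs many planning and execution threads in parallel and resolves cross-queue dependencies through the shared dependency structure.

```python
# Single-threaded sketch of the two deterministic phases. The real design
# parallelizes both phases; this sketch also ignores cross-queue
# data/commit dependency resolution.
import zlib
from collections import defaultdict, deque

def partition(key: str, num_queues: int) -> int:
    # A stable hash keeps the record-to-queue mapping deterministic
    # across runs and nodes (Python's built-in hash() is salted).
    return zlib.crc32(key.encode()) % num_queues

def plan(batch, num_queues):
    """Planning phase: assign each fragment to a queue keyed by its record.
    The batch position acts as the deterministic priority, so conflicting
    fragments land in the same queue in a fixed order."""
    queues = defaultdict(deque)
    for priority, txn_fragments in enumerate(batch):
        for frag in txn_fragments:
            queues[partition(frag.key, num_queues)].append((priority, frag))
    return queues

def execute(queues, db):
    """Execution phase: drain each queue FIFO, which preserves the planned
    order for conflict dependencies; no locks or validation are needed."""
    for qid in sorted(queues):
        for _priority, frag in queues[qid]:
            frag.logic(db)                # run the fragment's logic
```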
Queue Execution Mechanisms. The proposed paradigm supports multiple execution mechanisms, such as speculative or conservative execution. When using speculative execution, additional speculation dependencies occur, and resolving them may cause cascading aborts. Conservative execution, on the other hand, ensures that uncommitted updates are not processed until all abortable fragments complete without aborting, which requires additional synchronization and coordination among threads.
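To illustrate why speculation dependencies can cascade, the sketch below walks the transitive closure of speculative readers: aborting one transaction aborts every transaction that consumed its uncommitted writes, directly or indirectly. The names and the dependency map are hypothetical.

```python
# Illustrative transitive-closure walk over speculation dependencies:
# aborting one transaction aborts everything that read its uncommitted writes.
def cascade_abort(root_txn, spec_readers):
    """spec_readers maps a txn id -> ids of txns that speculatively read its writes."""
    to_visit, aborted = [root_txn], {root_txn}
    while to_visit:
        txn = to_visit.pop()
        for reader in spec_readers.get(txn, ()):
            if reader not in aborted:
                aborted.add(reader)
                to_visit.append(reader)
    return aborted

# Usage: T2 read T1's uncommitted writes, and T3 read T2's;
# aborting T1 therefore aborts all three.
print(cascade_abort(1, {1: [2], 2: [3]}))   # {1, 2, 3}
```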
Isolation Levels. In addition to serializable isolation, the queue-oriented paradigm admits the read-committed isolation level. Supporting read-committed isolation with speculative execution is interesting because it requires maintaining both a speculative version and a committed version of each record. Beyond the storage requirements, the planning phase would create additional queues for read operations, and in the execution phase, multiple threads can execute these read operations using committed data.
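A minimal sketch of the per-record storage this could require is shown below; this is an assumed design, not the paper's implementation. Each record keeps a committed slot that read-committed readers observe and a speculative slot that in-flight writers update.

```python
# Assumed two-version record layout for read-committed under speculative
# execution: readers never observe the speculative slot.
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class TwoVersionRecord:
    committed: Any = None
    speculative: Optional[Any] = None

    def read_committed(self):
        return self.committed             # read-committed queues read this side

    def write_speculative(self, value):
        self.speculative = value          # writers update the speculative side

    def commit(self):
        if self.speculative is not None:  # promote on transaction commit
            self.committed, self.speculative = self.speculative, None

    def abort(self):
        self.speculative = None           # discard the speculative version
```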
4 Evaluation
For evaluation, I implemented the queue-oriented processing protocol in ExpoDB [10, 11]. I also ported state-of-the-art non-deterministic and deterministic protocols into ExpoDB. Using a single test-bed implementation allows apples-to-apples comparisons among the different protocols. I used industry-standard macro-benchmarks such as YCSB [4] and TPC-C [19]. Table 2 summarizes the experimental results obtained from the centralized implementation running on multi-core hardware with speculative execution; more details on the centralized implementation are available in [17]. Furthermore, Table 2 reports results for our distributed implementation against a state-of-the-art distributed deterministic transaction processing protocol. The key performance metrics for evaluating transaction processing protocols are throughput and latency.
Another criterion for evaluating this paradigm is its applicability and broader impact. On this criterion, the queue-oriented paradigm scores high because it is the first deterministic transaction processing paradigm that allows different execution models and isolation levels. It also has the potential to guide implementations that improve blockchain systems.
5 Related Work
The work related to my Ph.D. dissertation falls into two categories. In the first category, many centralized deterministic transaction processing protocols have been proposed. LADS by Yao et al. [23] creates multiple sub-graphs representing the transaction dependencies of a batch of transactions and executes these transactions according to the dependency sub-graphs.
Environment | Compared protocols | Throughput improvement | Macro-benchmark | Notes
Centralized (deterministic) | H-Store [13] | two orders of magnitude | YCSB | multi-partition workload
Distributed (deterministic) | Calvin [18] | 22× | YCSB | low-contention workload (uniform access)
Centralized (non-deterministic) | Cicada [16], TicToc [25], FOEDUS [15], ERMIA [14], Silo [20], 2PL-NoWait [24] | 3× | TPC-C | high-contention workload (1 warehouse)
Table 2. Experimental results using TPC-C and YCSB for the centralized implementation of the queue-oriented paradigm [17] and a distributed deterministic database.
The main issue with this approach is that graph-based processing is not efficient. Using a different graph-based approach, Faleiro et al. [6] process transactions deterministically and introduce the notion of "early write visibility," which allows transactions to read uncommitted data safely. In our approach, we use queues of transaction fragments with different dependency semantics, which allows us to process transactions more efficiently than a purely graph-based approach. BOHM [5] rethinks multi-version concurrency control for deterministic multi-core in-memory data stores. BOHM relies on pessimistic transactional concurrency control, while our proposed paradigm avoids transactional concurrency control during execution. Some ideas presented in [5, 6] are complementary to our approach; for example, our current implementation is single-version but can be extended to multi-version in the future.
In the second category, one of the first proposed distributed deterministic database systems is H-Store [13], which focuses on partitioned workloads. The design of H-Store does not lend itself to multi-partition transactional workloads because of its partition-level locking mechanism and its use of 2PC. To improve the performance of multi-partition workloads, Jones et al. [12] introduced the idea of speculative execution into H-Store while still relying on 2PC as the distributed commit protocol. In contrast to these proposals, the proposed paradigm applies speculative execution differently, at the level of fragments. Furthermore, the proposed paradigm does not require 2PC to commit distributed multi-partition transactions.
As mentioned previously, Calvin [18] greatly reduces the overhead of distributed transactions because it does not rely on 2PC. Wu et al. propose T-Part [22], which uses the same fundamental design as Calvin. T-Part optimizes the handling of remote reads by using a forward-pushing technique at the cost of more complex scheduling that involves solving a graph-partitioning problem. The key characteristic of Calvin and T-Part is that they use thread-to-transaction assignment, while our approach uses thread-to-queue assignment. Therefore, these systems cannot exploit intra-transaction parallelism within a single node.
6 Conclusion
In this paper, I argued for a queue-oriented transaction processing paradigm, which improves the performance of deterministic databases. Ongoing work includes using this paradigm to design and implement distributed transaction processing with Byzantine fault tolerance.
Future work includes using the proposed paradigm to realize a deterministic version of production-ready NewSQL databases such as TiDB [1]. Moreover, I believe that this paradigm can also improve the performance of blockchain systems. In particular, the queue-oriented paradigm can lead to a design and implementation that improves the performance of the ordering service in Hyperledger Fabric [3].
Acknowledgments
I want to thank my co-advisors, Prof. Arif Ghafoor, for his continuous support during my Ph.D. journey, and Prof. Mohammad Sadoghi, for his valuable comments that helped me develop the ideas in my thesis. I would also like to thank the anonymous referees and Yahya Javed for their valuable comments and helpful suggestions. This work is supported in part by a scholarship from Umm Al-Qura University, Makkah, Saudi Arabia.
References
[1] 2019. TiDB | SQL at Scale. https://pingcap.com/en/.
[2] Daniel J. Abadi and Jose M. Faleiro. 2018. An Overview of Deterministic Database Systems. Commun. ACM 61, 9 (Aug. 2018), 78–88. https://doi.org/10.1145/3181853
[3] Elli Androulaki, Artem Barger, Vita Bortnikov, Christian Cachin, Konstantinos Christidis, Angelo De Caro, David Enyeart, Christopher Ferris, Gennady Laventman, Yacov Manevich, Srinivasan Muralidharan, Chet Murthy, Binh Nguyen, Manish Sethi, Gari Singh, Keith Smith, Alessandro Sorniotti, Chrysoula Stathakopoulou, Marko Vukolić, Sharon Weed Cocco, and Jason Yellick. 2018. Hyperledger Fabric: A Distributed Operating System for Permissioned Blockchains. In Proceedings of the Thirteenth EuroSys Conference (EuroSys '18). ACM, New York, NY, USA, 30:1–30:15. https://doi.org/10.1145/3190508.3190538
[4] Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell Sears. 2010. Benchmarking Cloud Serving Systems with YCSB. In Proc. SoCC. ACM, 143–154. https://doi.org/10.1145/1807128.1807152
[5] Jose M. Faleiro and Daniel J. Abadi. 2015. Rethinking Serializable Multiversion Concurrency Control. Proc. VLDB Endow. 8, 11 (July 2015), 1190–1201. https://doi.org/10.14778/2809974.2809981
[6] Jose M. Faleiro, Daniel J. Abadi, and Joseph M. Hellerstein. 2017. High Performance Transactions via Early Write Visibility. Proc. VLDB Endow. 10, 5 (Jan. 2017), 613–624. https://doi.org/10.14778/3055540.3055553
[7] FaunaDB. 2019. FaunaDB Website. https://fauna.com/.
[8] Jim Gray and Andreas Reuter. 1992. Transaction Processing: Concepts and Techniques (1st ed.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.
[9] J. N. Gray. 1978. Notes on Data Base Operating Systems. In Operating Systems: An Advanced Course, R. Bayer, R. M. Graham, and G. Seegmüller (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 393–481.
[10] Suyash Gupta and Mohammad Sadoghi. 2018. Blockchain Transaction Processing. In Encyclopedia of Big Data Technologies, Sherif Sakr and Albert Zomaya (Eds.). Springer International Publishing, Cham, 1–11. https://doi.org/10.1007/978-3-319-63962-8_333-1
[11] Suyash Gupta and Mohammad Sadoghi. 2018. EasyCommit: A Non-Blocking Two-Phase Commit Protocol. In EDBT. https://doi.org/10.5441/002/edbt.2018.15
[12] Evan P. C. Jones, Daniel J. Abadi, and Samuel Madden. 2010. Low Overhead Concurrency Control for Partitioned Main Memory Databases. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data (SIGMOD '10). ACM, New York, NY, USA, 603–614. https://doi.org/10.1145/1807167.1807233
[13] Robert Kallman, Hideaki Kimura, Jonathan Natkins, Andrew Pavlo, Alexander Rasin, Stanley Zdonik, Evan P. C. Jones, Samuel Madden, Michael Stonebraker, Yang Zhang, John Hugg, and Daniel J. Abadi. 2008. H-Store: A High-Performance, Distributed Main Memory Transaction Processing System. Proc. VLDB Endow. 1, 2 (Aug. 2008), 1496–1499. https://doi.org/10.14778/1454159.1454211
[14] Kangnyeon Kim, Tianzheng Wang, Ryan Johnson, and Ippokratis Pandis. 2016. ERMIA: Fast Memory-Optimized Database System for Heterogeneous Workloads. In Proceedings of the 2016 International Conference on Management of Data (SIGMOD '16). ACM, San Francisco, California, USA, 1675–1687. https://doi.org/10.1145/2882903.2882905
[15] Hideaki Kimura. 2015. FOEDUS: OLTP Engine for a Thousand Cores and NVRAM. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data (SIGMOD '15). ACM, Melbourne, Victoria, Australia, 691–706. https://doi.org/10.1145/2723372.2746480
[16] Hyeontaek Lim, Michael Kaminsky, and David G. Andersen. 2017. Cicada: Dependably Fast Multi-Core In-Memory Transactions. In Proc. SIGMOD. ACM, 21–35. https://doi.org/10.1145/3035918.3064015
[17] Thamir M. Qadah and Mohammad Sadoghi. 2018. QueCC: A Queue-Oriented, Control-Free Concurrency Architecture. In Proceedings of the 19th International Middleware Conference (Middleware '18). ACM, New York, NY, USA, 13–25. https://doi.org/10.1145/3274808.3274810
[18] Alexander Thomson, Thaddeus Diamond, Shu C. Weng, Kun Ren, Philip Shao, and Daniel J. Abadi. 2012. Calvin: Fast Distributed Transactions for Partitioned Database Systems. In Proc. SIGMOD. ACM, 1–12. https://doi.org/10.1145/2213836.2213838
[19] TPC. 2010. TPC-C, On-Line Transaction Processing Benchmark, Version 5.11.0. TPC Corporation.
[20] Stephen Tu, Wenting Zheng, Eddie Kohler, Barbara Liskov, and Samuel Madden. 2013. Speedy Transactions in Multicore In-Memory Databases. In SOSP. ACM, 18–32. https://doi.org/10.1145/2517349.2522713
[21] VoltDB. 2019. VoltDB. https://www.voltdb.com/.
[22] Shan-Hung Wu, Tsai-Yu Feng, Meng-Kai Liao, Shao-Kan Pi, and Yu-Shan Lin. 2016. T-Part: Partitioning of Transactions for Forward-Pushing in Deterministic Database Systems. In Proceedings of the 2016 International Conference on Management of Data (SIGMOD '16). ACM, New York, NY, USA, 1553–1565. https://doi.org/10.1145/2882903.2915227
[23] C. Yao, D. Agrawal, G. Chen, Q. Lin, B. C. Ooi, W. F. Wong, and M. Zhang. 2016. Exploiting Single-Threaded Model in Multi-Core In-Memory Systems. IEEE TKDE 28, 10 (2016), 2635–2650. https://doi.org/10.1109/TKDE.2016.2578319
[24] Xiangyao Yu, George Bezerra, Andrew Pavlo, Srinivas Devadas, and Michael Stonebraker. 2014. Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores. Proc. VLDB Endow. 8, 3 (Nov. 2014), 209–220. https://doi.org/10.14778/2735508.2735511
[25] Xiangyao Yu, Andrew Pavlo, Daniel Sanchez, and Srinivas Devadas. 2016. TicToc: Time Traveling Optimistic Concurrency Control. In Proc. SIGMOD. ACM, 1629–1642. https://doi.org/10.1145/2882903.2882935