Mohammad Sadoghi, PhD
University of California, Davis | UCD · Department of Computer Science
ResilientDB: Global-Scale Sustainable Blockchain Fabric
About
149 Publications · 67,719 Reads · 2,221 Citations
Introduction
Our mission at Exploratory Systems Lab (ExpoLab) is to pioneer a resilient data platform at scale, a distributed ledger centered around a democratic and decentralized computational model.
Publications (149)
Recent developments in blockchain technology have inspired innovative new designs in distributed resilient database systems. At their core, these database systems typically use Byzantine fault-tolerant consensus protocols to maintain a common state across all replicas, even if some replicas are faulty or malicious. Unfortunately, existing consensus...
Since the introduction of Bitcoin—the first widespread application driven by blockchain—the interest of the public and private sectors in blockchain has skyrocketed. In recent years, blockchain-based fabrics have been used to address challenges in diverse fields such as trade, food production, property rights, identity management, aid delivery, hea...
The recent growth of blockchain technology has accelerated research on decentralized platforms. Early platforms decide what should be added to the ledger using the Proof-of-Work (PoW) consensus protocol. PoW requires its participants to perform massive computations, which leads to enormous energy waste. Existing solutions to rep...
The problem of distributed consensus has played a major role in the development of distributed data management systems. This includes the development of distributed atomic commit and replication protocols. In this monograph, we present foundations of consensus protocols and the ways they were utilized to solve distributed data management problems....
Consensus is a fundamental problem in distributed systems, involving the challenge of achieving agreement among distributed nodes. It plays a critical role in various distributed data management problems. This tutorial aims to provide a comprehensive primer for data management researchers on the topic of consensus and its fundamental and modern app...
Stream processing acceleration is driven by the continuously increasing volume and velocity of data generated on the Web and the limitations of storage, computation, and power consumption. Hardware solutions provide better performance and power consumption, but they are hindered by the high research and development costs and the long time to market...
This paper introduces HotStuff-1, a BFT consensus protocol that improves the latency of HotStuff-2 by two network-hops while maintaining linear communication complexity against faults. Additionally, HotStuff-1 incorporates an incentive-compatible leader rotation regime that motivates leaders to commit consensus decisions promptly. HotStuff-1 achiev...
In the realm of blockchain systems, smart contracts have gained widespread adoption owing to their programmability. Consequently, developing a system capable of facilitating high throughput and scalability is of paramount importance. Directed acyclic graph (DAG) consensus protocols have demonstrated notable enhancements in both throughput and laten...
Data regulations, such as GDPR, are increasingly being adopted globally to protect against unsafe data management practices. Such regulations are often ambiguous (with multiple valid interpretations) when it comes to defining the expected dynamic behavior of data processing systems. This paper argues that it is possible to represent regulations su...
The growing interest in reliable multi-party applications has fostered widespread adoption of Byzantine Fault-Tolerant (bft) consensus protocols. Existing bft protocols need f more replicas than Paxos-style protocols to prevent equivocation attacks. trust-bft protocols seek to minimize this cost by making use of trusted components at replicas. Thi...
The emergence of blockchains has fueled the development of resilient systems that deal with Byzantine failures due to crashes, bugs, or even malicious behavior. Recently, we have also seen the exploration of sharding in these resilient systems to provide the scalability required by very large data-based applications. Unfortunately, current sh...
Federated Learning (FL) is a machine learning approach that allows multiple clients to collaboratively learn a shared model without sharing raw data. However, current FL systems provide an all-in-one solution, which can hinder the wide adoption of FL in certain domains such as scientific applications. To overcome this limitation, this paper propose...
The emergence of blockchain technology has renewed the interest in consensus-based resilient data management systems that can provide resilience to failures and can manage data between fully-independent parties. To maximize the performance of these systems, we have recently seen the development of several prototype consensus solutions that optimize...
Agreement protocols have been extensively used by distributed data management systems to provide robustness and high availability. The broad spectrum of design dimensions, applications, and fault models has resulted in different flavors of agreement protocols. This proliferation of agreement protocols has made it hard to argue their correctness an...
The emergence of blockchains is fueling the development of resilient data management systems that can deal with Byzantine failures due to crashes, bugs, or even malicious behavior. As traditional resilient systems lack the scalability required for modern data, several recent systems explored using sharding. Enabling these sharded designs requires t...
Byzantine fault-tolerant protocols cover a broad spectrum of design dimensions, from environmental settings such as communication topology, to more technical features such as commitment strategy, and even fundamental social-choice properties like order fairness. Designing and building BFT protocols remains a laborious task despite years of inten...
The introduction of Bitcoin fueled the development of blockchain-based resilient data management systems that are resilient against failures, enable federated data management, and can support data provenance. The key factor determining the performance of such resilient data management systems is the consensus protocol used by the system to replicat...
The growing interest in secure multi-party database applications has led to the widespread adoption of Byzantine Fault-Tolerant (BFT) consensus protocols that can handle malicious attacks from Byzantine replicas. Existing BFT protocols permit Byzantine replicas to equivocate their messages. As a result, they need f more replicas than Paxos-style pr...
With a growing interest in edge applications, such as the Internet of Things, the continued reliance of developers on existing edge architectures poses a threat. Existing edge applications make use of edge devices that have access to limited resources. Hence, they delegate compute-intensive tasks to the third-party cloud servers. In such an edge-cl...
Traditional resilient systems operate on fully-replicated fault-tolerant clusters, which limits their scalability and performance. One way to move toward resilient high-performance systems that can deal with huge workloads is to enable independent fault-tolerant clusters to efficiently communicate and cooperate with each other, as this...
The recent surge in federated data-management applications has brought forth concerns about the security of underlying data and the consistency of replicas in the presence of malicious attacks. A prominent solution in this direction is to employ a permissioned blockchain framework that is modeled around traditional Byzantine Fault-Tolerant (BFT) co...
A blockchain is an append-only linked-list of blocks, which is maintained at each participating node. Each block records a set of transactions and their associated metadata. Blockchain transactions act on the identical ledger data stored at each node. Blockchain was first perceived by Satoshi Nakamoto as a peer-to-peer digital-commodity (also known...
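The append-only, hash-linked structure described above can be sketched in a few lines of Python. This is only a toy illustration of the idea; the `Ledger` class and its field names are hypothetical and not drawn from any of the systems listed here:

```python
import hashlib
import json

def block_hash(block: dict) -> str:
    """Deterministic SHA-256 digest of a block's contents."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

class Ledger:
    """An append-only list of blocks, each linking to its predecessor's hash."""

    def __init__(self):
        genesis = {"height": 0, "prev": "0" * 64, "txns": []}
        self.chain = [genesis]

    def append(self, txns):
        prev = self.chain[-1]
        self.chain.append({"height": prev["height"] + 1,
                           "prev": block_hash(prev),
                           "txns": txns})

    def verify(self) -> bool:
        """Tampering with any block breaks every later 'prev' link."""
        return all(b["prev"] == block_hash(p)
                   for p, b in zip(self.chain, self.chain[1:]))
```

Because each block commits to the hash of its predecessor, modifying any recorded transaction invalidates all subsequent links, which is what makes the ledger tamper-evident.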
Deterministic database systems have received increasing attention from the database research community in recent years. Despite their current limitations, recent proposals of distributed deterministic transaction processing systems demonstrated significant improvements over systems using traditional transaction processing techniques (e.g., two-phas...
The emergence of blockchains has fueled the development of resilient systems that can deal with Byzantine failures due to crashes, bugs, or even malicious behavior. Recently, we have also seen the exploration of sharding in these resilient systems to provide the scalability required by very large data-based applications. Unfortunately, curren...
Due to the recent explosion of data volume and velocity, a new array of lightweight key-value stores has emerged to serve as alternatives to traditional databases. The majority of these storage engines, however, sacrifice their read performance in order to cope with write throughput by avoiding random disk access when writing a record in favor of...
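The write-optimized design alluded to above, buffering writes in memory and flushing them as sorted, sequentially written runs at the cost of consulting multiple runs on reads, can be illustrated with a toy sketch. The `LSMStore` class and its parameters are hypothetical, not the design of any specific engine:

```python
from bisect import bisect_left

class LSMStore:
    """Toy log-structured store: writes go to an in-memory table and are
    flushed as sorted immutable runs (sequential I/O in a real engine);
    reads must check the memtable plus every run, newest first - the
    classic write-over-read trade-off."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.runs = []          # list of sorted [(key, value), ...] runs
        self.limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.limit:
            self.runs.append(sorted(self.memtable.items()))  # flush
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):           # newest run wins
            i = bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None
```

A `put` is always an in-memory insert plus an occasional append of a whole sorted run, whereas a `get` may have to probe several runs; real engines mitigate this read penalty with Bloom filters and compaction.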
In the previous chapter, we characterized blockchains as fully replicated resilient distributed systems. Furthermore, we introduced the consensus problem, the problem of coordinating between possibly faulty replicas, that is at the core of such systems, and studied consensus from a theoretical perspective. From this theoretical perspective, the con...
The rise of Bitcoin [193] and other cryptocurrencies [57, 244] led to the research and design of several new BFT consensus protocols. These protocols were designed with the following two goals in mind: (1) running BFT consensus among a massively large set of replicas and (2) protecting the identities of the participating replicas. The former goal is a cons...
Until now, we looked at the design of different BFT protocols that can help achieve consensus among a set of replicas. A key use case for these protocols is a permissioned blockchain fabric, which, unlike a permissionless blockchain application, expects the identities of the participants to be known a priori. Permissioned blockchains have found app...
In the previous two chapters, we reviewed the basic technologies necessary to operate a permissioned blockchain. First, we considered PBFT, a practical and efficient consensus protocol. Then, we investigated clever engineering and implementation techniques that can be applied to PBFT (and other consensus protocols) to yield systems that can process te...
In the previous chapter, we presented a simplified version of the PBFT consensus protocol. Although this simplified protocol can easily serve as the workhorse in any deployment of a permissioned blockchain, there is still much room for improvement. In this chapter, we will take six steps to develop such improvements. First, we will formally model...
Due to the recent explosion of data volume and velocity, a new array of lightweight key-value stores has emerged to serve as alternatives to traditional databases. The majority of these storage engines, however, sacrifice their read performance in order to cope with write throughput by avoiding random disk access when writing a record in favor of fast...
To enable high-performance and scalable blockchains, we need to step away from traditional consensus-based fully-replicated designs. One direction is to explore the usage of sharding in which we partition the managed dataset over many shards that independently operate as blockchains. Sharding requires an efficient fault-tolerant primitive for the o...
With the advent of Bitcoin, the interest of the database community in blockchain systems has steadily grown. However, many existing blockchain applications use blockchains as a platform for monetary transactions. We deviate from this philosophy and present ResilientDB, which can serve in a suite of non-monetary data-processing blockchain applicatio...
Since the introduction of Bitcoin---the first widespread application driven by blockchains---the interest in the design of blockchain-based applications has increased tremendously. At the core of these applications are consensus protocols that securely replicate client requests among all replicas, even if some replicas are Byzantine faulty. Unfortu...
In-memory key-value stores have quickly become a key enabling technology to build high-performance applications that must cope with massively distributed workloads. In-memory key-value stores (also referred to as NoSQL) primarily aim to offer low-latency and high-throughput data access, which motivates the rapid adoption of modern network cards such...
Large scale distributed databases are designed to support commercial and cloud based applications. The minimal expectation from such systems is that they ensure consistency and reliability in case of node failures. The distributed database guarantees reliability through the use of atomic commitment protocols. Atomic commitment protocols help in ens...
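The atomic commitment role described above is commonly explained via two-phase commit: a coordinator commits a transaction only if every participant votes yes in the prepare phase, and otherwise everyone aborts. A minimal sketch, with hypothetical `two_phase_commit` and `Participant` names, ignoring the failure handling, timeouts, and logging that a real protocol must provide:

```python
def two_phase_commit(participants):
    """Phase 1: collect votes. Phase 2: broadcast the unanimous decision."""
    votes = [p.prepare() for p in participants]        # voting phase
    decision = "commit" if all(votes) else "abort"
    for p in participants:                             # decision phase
        p.commit() if decision == "commit" else p.abort()
    return decision

class Participant:
    """A resource manager that votes on whether it can commit."""

    def __init__(self, can_commit=True):
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"
```

The all-or-nothing outcome is exactly the consistency guarantee the abstract refers to: either every node applies the transaction or none does.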
Distributed database systems partition the data across multiple nodes to improve concurrency, which leads to higher throughput performance. Traditional concurrency control algorithms aim at producing an execution history equivalent to some serial history of transaction execution. Hence an agreement on the final serial history is required for co...
Recent developments in blockchain technology have inspired innovative new designs in resilient distributed and database systems. At their core, these blockchain applications typically use Byzantine fault-tolerant consensus protocols to maintain a common state across all replicas, even if some replicas are faulty or malicious. Unfortunately, existin...
Since the introduction of Bitcoin---the first widespread application driven by blockchains---the interest of the public and private sectors in blockchains has skyrocketed. At the core of this interest are the ways in which blockchains can be used to improve data management, e.g., by enabling federated data management via decentralization, resilienc...
Since the inception of Bitcoin, the distributed and database community has shown interest in the design of efficient blockchain systems. At the core of any blockchain application is a Byzantine Fault-Tolerant (BFT) protocol that helps a set of replicas reach an agreement on the order of a client request. Initial blockchain applications (like Bitcoi...
The recent surge in blockchain applications and database systems has renewed the interest in traditional Byzantine Fault-Tolerant consensus protocols (BFT). Several such BFT protocols follow a primary-backup design, in which a primary replica coordinates the consensus protocol. In primary-backup designs, the normal-case operations are rather simpl...
Since the introduction of blockchains, several new database systems and applications have tried to employ them. At the core of such blockchain designs are Byzantine Fault-Tolerant (BFT) consensus protocols that enable designing systems that are resilient to failures and malicious behavior. Unfortunately, existing BFT protocols seem unsuitable for u...
The recent surge of blockchain systems has renewed the interest in traditional Byzantine fault-tolerant consensus protocols. Many such consensus protocols have a primary-backup design in which an assigned replica, the primary, is responsible for coordinating the consensus protocol. Although the primary-backup design leads to relatively simple and h...
The development of fault-tolerant distributed systems that can tolerate Byzantine behavior has traditionally been focused on consensus protocols, which support fully-replicated designs. For the development of more sophisticated high-performance Byzantine distributed systems, more specialized fault-tolerant communication primitives are necessary, ho...
Efficient real-time analytics are an integral part of an increasing number of data management applications, such as computational targeted advertising, algorithmic trading, and Internet of Things. In this paper, we focus primarily on accelerating stream joins, which are arguably one of the most commonly used and resource-intensive operators in stre...
Blockchain is an enabler of many emerging decentralized applications in areas of cryptocurrency, the Internet of Things, and smart healthcare, among many others. Although various open-source blockchain frameworks are available, the infrastructure is complex and difficult for many users to modify or to test out new research ideas. To make it worse, many...
The Byzantine fault-tolerance model captures a wide range of failures common in real-world scenarios, such as those due to malicious attacks and arbitrary software/hardware errors. We propose Blockplane, a middleware that enables existing benign systems to tolerate Byzantine failures. This is done by making the existing system use Blockplane for...
The last decade has brought groundbreaking developments in transaction processing. This resurgence of an otherwise mature research area has spurred from the diminishing cost per GB of DRAM that allows many transaction processing workloads to be entirely memory-resident. This shift demanded a pause to fundamentally rethink the architecture of databa...
In this chapter, we cover the fundamental concepts of transaction processing in databases, which is essential to appreciate the later chapters.
The hardware trends described in Chapter 1 suggest that future hardware will be increasingly more heterogeneous and reconfigurable to accommodate domain-specific or problem-specific processing needs. To ensure uniformity in communication between discrete hardware units with different roles, data transmissions will largely be through memory-cent...
There are many open challenges ahead for transaction processing. A major challenge is the scalability of transaction processing when databases do not fit in the memory of a single node. This commonly arises with modern Hybrid Transactional-Analytical (HTAP) workloads. Such databases necessitate a scalable and distributed solution. The problem with...
The previous chapter focused on transaction processing schemes that attempt to non-deterministically identify the most effective interleaving of transaction operations. Such non-determinism is not a fundamental necessity as it often results in high abort rates and transaction restarts for contention-intensive workloads. This has led to a new wave o...
Many crucial components of a transaction processing system are orthogonal to the design of the concurrency control kernel. Such utility modules in a transactional system automatically perform database partitioning (sharding) and index data for efficient accesses. This chapter describes different designs, which are often complementary, for data part...
There exists a rich body of research on concurrency control techniques, including several classic books on the subject (e.g., [12, 68, 127]). In the past decade the interest in multi-version concurrency control protocols has revived, leading to the development of numerous novel approaches. To close this gap, this chapter first focuses on optimistic...
As the hardware landscape changes, database system designers continuously seek opportunities to utilize the new resources. The current trends point to massive concurrency in a single box due to the emergence of many-core architectures coupled with the substantial increase in the size of main memory at a diminishing cost. Thus, there are a multitude...
We investigate a coordination-free approach to transaction processing on emerging multi-socket, many-core, shared-memory architectures to harness their unprecedented available parallelism. We propose a queue-oriented, control-free concurrency architecture, referred to as QueCC, that exhibits minimal contention among concurrent threads by eliminating...
Known for powering cryptocurrencies such as Bitcoin and Ethereum, blockchain is seen as a disruptive technology capable of revolutionizing a wide variety of domains, ranging from finance to governance, by offering superior security, reliability, and transparency founded upon a decentralized and democratic computational model. In this tutorial, we f...
The maturity of RDBMSs has motivated academia and industry to invest efforts in leveraging RDBMSs for graph processing, where efficiency is proven for vital graph queries. However, none of these efforts process graphs natively inside the RDBMS, which is particularly challenging due to the impedance mismatch between the relational and the graph mode...
The plethora of graphs and relational data gives rise to many interesting graph-relational queries in various domains, e.g., finding related proteins retrieved by a relational subquery in a biological network. The maturity of RDBMSs has motivated academia and industry to invest efforts in leveraging RDBMSs for graph processing, where efficiency is prov...
While the growing corpus of knowledge is now being encoded in the form of knowledge graphs with rich semantics, the current graph embedding models do not incorporate ontology information into the modeling. We propose a scalable and ontology-aware graph embedding model, EmbedS, which is able to capture RDFS ontological assertions. EmbedS models enti...
To derive real-time actionable insights from the data, it is important to bridge the gap between managing the data that is being updated at a high velocity (i.e., OLTP) and analyzing a large volume of data (i.e., OLAP). However, there has been a divide where specialized solutions were often deployed to support either OLTP or OLAP workloads but not...