Chapter

RecAGT: Shard Testable Codes with Adaptive Group Testing for Malicious Nodes Identification in Sharding Permissioned Blockchain


Abstract

Recently, permissioned blockchain has been extensively explored in various fields, such as asset management, supply chains, and healthcare. Much work has been dedicated to improving its verifiability, scalability, and performance through sharding techniques, including grouping nodes and handling cross-shard transactions. However, this work ignores the node-vulnerability problem, i.e., there is no guarantee that nodes will not be maliciously controlled at some point in their life cycle. Facing this challenge, we propose RecAGT, a novel identification scheme that reduces communication overhead and identifies potential malicious nodes. First, shard testable codes are designed to encode the original data, guarding against leakage of confidential data. Second, a new identity-proof protocol is presented as evidence against malicious behavior. Finally, adaptive group testing is used to identify malicious nodes. Notably, our work focuses on operation internal to the committee and can therefore be applied to any sharded permissioned blockchain. Simulation results show that the proposed scheme effectively identifies malicious nodes with low communication and computational costs.
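To make the third step concrete, here is a minimal Python sketch of the adaptive group-testing idea. The oracle is_contaminated stands in for the shard-testable-code check (it reports whether a queried group contains at least one node that returned a corrupted coded share); binary splitting then isolates the culprits. All names are illustrative, not the authors' API.

    # A minimal sketch of adaptive group testing for malicious-node
    # identification. `is_contaminated` is a stand-in for a coded test:
    # it returns True iff at least one node in the group misbehaved.
    def identify_malicious(nodes, is_contaminated):
        """Return the set of malicious nodes via adaptive binary splitting."""
        malicious = set()

        def search(group):
            if not group or not is_contaminated(group):
                return                   # one test clears the whole group
            if len(group) == 1:
                malicious.add(group[0])  # a contaminated singleton is malicious
                return
            mid = len(group) // 2
            search(group[:mid])
            search(group[mid:])

        search(list(nodes))
        return malicious

    # Toy usage: nodes 3 and 7 are malicious.
    bad = {3, 7}
    oracle = lambda g: any(n in bad for n in g)
    assert identify_malicious(range(10), oracle) == bad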

Article
This paper studies PBFT-based sharded permissioned blockchains, which run either in a local datacenter or on a rented cloud platform. In such a permissioned blockchain, the transaction (TX) assignment strategy can be malicious, so the network shards may receive imbalanced transactions or even bursty-TX injection attacks. An imbalanced transaction assignment seriously threatens the stability of the sharded blockchain, whereas a stable sharded blockchain ensures that each shard processes arriving transactions in a timely manner. Since system stability is closely related to blockchain throughput, maintaining a stable sharded blockchain becomes a challenge. To model transaction processing in each network shard, we adopt the Lyapunov optimization framework. Exploiting the drift-plus-penalty (DPP) technique, we then propose an adaptive resource-allocation algorithm that yields a near-optimal solution for each network shard while keeping the shard queues stable. We also rigorously analyze theoretical bounds on both the system objective and the queue length of shards. The numerical results show that the proposed algorithm achieves a better balance between resource consumption and queue stability than other baselines. We particularly evaluate two representative cases of bursty-TX injection attacks, i.e., continued attacks against all network shards and drastic attacks against a single network shard. The evaluation results show that the DPP-based algorithm alleviates imbalanced TX assignment while maintaining high throughput and consuming fewer resources than other baselines.
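As a rough illustration of the drift-plus-penalty idea this paper builds on (not the authors' algorithm), the sketch below has a shard pick, each slot, the service rate minimizing V * cost(r) - Q * r, trading resource cost against queue drift. The rate grid, cost model, and arrival process are invented for the example.

    # A hedged sketch of one drift-plus-penalty (DPP) slot per shard.
    import random

    def dpp_step(Q, arrivals, rates, cost, V):
        """Choose a service rate by the DPP rule, then update the TX queue."""
        r = min(rates, key=lambda r: V * cost(r) - Q * r)  # DPP objective
        Q_next = max(Q + arrivals - r, 0)                  # queue dynamics
        return r, Q_next

    Q = 0.0
    for t in range(1000):
        a = random.uniform(0, 10)                # TXs arriving this slot
        r, Q = dpp_step(Q, a, rates=[0, 4, 8, 12],
                        cost=lambda r: r ** 2,   # convex resource cost
                        V=50.0)                  # cost/stability trade-off knob
    print(f"final queue length: {Q:.1f}")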
Conference Paper
The unique features of blockchains, such as immutability, transparency, provenance, and authenticity, have been used by many large-scale data management systems to deploy a wide range of distributed applications, including supply chain management, healthcare, and crowdworking, in a permissioned setting. Unlike permissionless settings, e.g., Bitcoin, where the network is public and anyone can participate without a specific identity, a permissioned blockchain system consists of a set of known, identified nodes that might not fully trust each other. While the characteristics of permissioned blockchains are appealing to a wide range of large-scale data management systems, these systems have to satisfy four main requirements: confidentiality, verifiability, performance, and scalability. Various approaches have been developed in industry and academia to satisfy these requirements with varying assumptions and costs. The focus of this tutorial is on presenting many of these techniques while highlighting the trade-offs among them. We demonstrate the practicality of such techniques in real life by presenting three different applications, i.e., supply chain management, large-scale databases, and multi-platform crowdworking, and show how those techniques can be utilized to meet the requirements of such applications.
Article
In this paper, we systematically explore the attack surface of the Blockchain technology, with an emphasis on public Blockchains. Towards this goal, we attribute attack viability in the attack surface to 1) the Blockchain cryptographic constructs, 2) the distributed architecture of the systems using Blockchain, and 3) the Blockchain application context. To each of those contributing factors, we outline several attacks, including selfish mining, the 51% attack, DNS attacks, distributed denial-of-service (DDoS) attacks, consensus delay (due to selfish behavior or distributed denial-of-service attacks), Blockchain forks, orphaned and stale blocks, block ingestion, wallet thefts, smart contract attacks, and privacy attacks. We also explore the causal relationships between these attacks to demonstrate how various attack vectors are connected to one another. A secondary contribution of this work is outlining effective defense measures taken by the Blockchain technology or proposed by researchers to mitigate the effects of these attacks and patch associated vulnerabilities.
Conference Paper
Permissioned blockchain systems rely mainly on Byzantine fault-tolerant protocols to establish consensus on the order of transactions. While Byzantine fault-tolerant protocols typically guarantee consistency (safety) in an asynchronous network using 3f+1 machines to overcome the simultaneous malicious failure of any f nodes, in many systems, e.g., blockchain systems, the number of available nodes (resources) is much larger than 3f+1. To utilize such extra resources, in this paper we introduce a model that leverages transaction parallelism by partitioning the nodes into clusters (partitions) and processing independent transactions on different partitions simultaneously. The model also shards the blockchain ledger, assigns different shards of the ledger to different clusters, and supports both intra-shard and cross-shard transactions. Since more than one cluster is involved in each cross-shard transaction, the ledger forms a directed acyclic graph.
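The ledger-as-DAG structure is easy to picture in code: each block belongs to one shard, but a block committing a cross-shard transaction references parent blocks in every involved shard, so the overall ledger is a DAG rather than a chain. The following Python sketch is illustrative; the block fields and tip-tracking are assumptions, not the paper's format.

    from dataclasses import dataclass
    from hashlib import sha256

    @dataclass(frozen=True)
    class Block:
        shard: int
        txs: tuple
        parents: tuple          # ids of parent blocks, possibly in other shards

        @property
        def block_id(self):
            raw = repr((self.shard, self.txs, self.parents)).encode()
            return sha256(raw).hexdigest()[:12]

    tips = {}                                   # latest block id per shard

    def append(shard, txs, involved_shards):
        """Append a block; cross-shard txs link to tips of all involved shards."""
        parents = tuple(tips[s] for s in involved_shards if s in tips)
        b = Block(shard, tuple(txs), parents)
        tips[shard] = b.block_id
        return b

    append(0, ["a pays b"], [0])                # intra-shard block on shard 0
    append(1, ["c pays d"], [1])                # intra-shard block on shard 1
    x = append(0, ["b pays d"], [0, 1])         # cross-shard: two parents
    print(x.parents)                            # references tips of shards 0 and 1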
Article
Traditional distributed transaction processing (TP) systems, such as replicated databases, have faced difficulties gaining wide adoption in enterprise-integration scenarios due to the level of mutual trust required. Ironically, public blockchains, which promised to solve the problem of mutual trust in collaborative processes, suffer from issues like scalability, probabilistic transaction finality, and lack of data confidentiality. To tackle these issues, permissioned blockchains were introduced as an alternative approach that combines the positives of the two worlds while avoiding their drawbacks. However, no sufficient analysis has been conducted of their actual TP capabilities. In this paper, we identify a suitable collection of TP criteria for analyzing permissioned blockchains and apply them to a prominent set of these systems. Finally, we compare the derived properties and provide general conclusions.
Conference Paper
Existing blockchain systems scale poorly because of their distributed consensus protocols. Current attempts at improving blockchain scalability are limited to cryptocurrency. Scaling blockchain systems under general workloads (i.e., non-cryptocurrency applications) remains an open question. This work takes a principled approach to apply sharding to blockchain systems in order to improve their transaction throughput at scale. This is challenging, however, due to the fundamental difference in failure models between databases and blockchain. To achieve our goal, we first enhance the performance of Byzantine consensus protocols, improving individual shards' throughput. Next, we design an efficient shard formation protocol that securely assigns nodes into shards. We rely on trusted hardware, namely Intel SGX, to achieve high performance for both consensus and shard formation protocol. Third, we design a general distributed transaction protocol that ensures safety and liveness even when transaction coordinators are malicious. Finally, we conduct an extensive evaluation of our design both on a local cluster and on Google Cloud Platform. The results show that our consensus and shard formation protocols outperform state-of-the-art solutions at scale. More importantly, our sharded blockchain reaches a high throughput that can handle Visa-level workloads, and is the largest ever reported in a realistic environment.
Conference Paper
Designing a secure permissionless distributed ledger (blockchain) that performs on par with centralized payment processors, such as Visa, is a challenging task. Most existing distributed ledgers are unable to scale out, i.e., to grow their total processing capacity with the number of validators; and those that do compromise security or decentralization. We present OmniLedger, a novel scale-out distributed ledger that preserves long-term security under permissionless operation. It ensures security and correctness by using a bias-resistant public-randomness protocol to choose large, statistically representative shards that process transactions, and by introducing an efficient cross-shard commit protocol that atomically handles transactions affecting multiple shards. OmniLedger also optimizes performance via parallel intra-shard transaction processing, ledger pruning via collectively signed state blocks, and low-latency "trust-but-verify" validation for low-value transactions. An evaluation of our experimental prototype shows that OmniLedger's throughput scales linearly in the number of active validators, supporting Visa-level workloads and beyond, while confirming typical transactions in under two seconds.
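The cross-shard commit can be condensed into a client-driven lock/unlock exchange in the spirit of OmniLedger's Atomix protocol: input shards first lock (accept or reject) the transaction's inputs, then the client either commits on all involved shards or unlocks the inputs. The UTXO model, class names, and single-process mocks below are assumptions for illustration; real shards run BFT consensus internally.

    from dataclasses import dataclass

    @dataclass
    class Proof:
        shard: int
        accepted: bool

    class Shard:
        """A mock shard holding a UTXO set."""
        def __init__(self, sid, utxos):
            self.sid, self.utxos, self.locked = sid, set(utxos), set()

        def lock(self, tx):
            ins = tx["inputs"].get(self.sid, set())
            ok = ins <= self.utxos and not (ins & self.locked)
            if ok:
                self.locked |= ins
            return Proof(self.sid, ok)

        def commit(self, tx):
            ins = tx["inputs"].get(self.sid, set())
            self.utxos -= ins
            self.locked -= ins
            self.utxos |= tx["outputs"].get(self.sid, set())

        def unlock(self, tx):
            self.locked -= tx["inputs"].get(self.sid, set())

    def cross_shard_commit(tx, shards):
        """Client-driven atomic commit: lock inputs, then commit or unlock."""
        in_shards = [shards[s] for s in tx["inputs"]]
        proofs = [s.lock(tx) for s in in_shards]           # phase 1: lock
        if all(p.accepted for p in proofs):
            for s in set(in_shards) | {shards[s] for s in tx["outputs"]}:
                s.commit(tx)                               # phase 2a: commit
            return "committed"
        for s, p in zip(in_shards, proofs):
            if p.accepted:
                s.unlock(tx)                               # phase 2b: abort
        return "aborted"

    shards = {0: Shard(0, {"u1"}), 1: Shard(1, set())}
    tx = {"inputs": {0: {"u1"}}, "outputs": {1: {"u2"}}}
    print(cross_shard_commit(tx, shards))                  # -> committed
    print(shards[1].utxos)                                 # -> {'u2'}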
Article
We consider a large-scale matrix multiplication problem where the computation is carried out using a distributed system with a master node and multiple worker nodes, where each worker can store parts of the input matrices. We propose a computation strategy that leverages ideas from coding theory to design intermediate computations at the worker nodes, in order to deal efficiently with straggling workers. The proposed strategy, named polynomial codes, achieves the optimum recovery threshold, defined as the minimum number of workers that the master needs to wait for in order to compute the output. Furthermore, by leveraging the algebraic structure of polynomial codes, we can map the reconstruction problem of the final output to a polynomial interpolation problem, which can be solved efficiently. Polynomial codes provide order-wise improvement over the state of the art in terms of recovery threshold, and are also optimal in terms of several other metrics. Furthermore, we extend this code to distributed convolution and show its order-wise optimality.
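A small numeric sketch of the construction: A is split into m row blocks and B into n column blocks; worker i evaluates the encoded blocks at its point x_i and returns one product, and any m*n results suffice to interpolate all block products. The real construction works over a finite field; the floats, block sizes, and evaluation points below are for illustration only, so expect tiny numerical error.

    import numpy as np

    m = n = 2
    A = np.random.rand(4, 3)            # split row-wise into A0, A1
    B = np.random.rand(3, 4)            # split column-wise into B0, B1
    A_blocks = np.split(A, m, axis=0)
    B_blocks = np.split(B, n, axis=1)

    def encode(x):
        Ax = sum(Ab * x**j for j, Ab in enumerate(A_blocks))        # degree m-1
        Bx = sum(Bb * x**(k * m) for k, Bb in enumerate(B_blocks))  # degree (n-1)m
        return Ax, Bx

    # Workers at distinct points; any m*n = 4 results are enough, so extra
    # (straggling) workers beyond 4 would simply be ignored.
    points = [1.0, 2.0, 3.0, 4.0]
    results = [encode(x)[0] @ encode(x)[1] for x in points]

    # Master: interpolate the degree-(mn-1) matrix polynomial A(x)B(x).
    V = np.vander(points, m * n, increasing=True)
    coeffs = np.tensordot(np.linalg.inv(V), np.stack(results), axes=1)

    # The coefficient of x^(j + k*m) is exactly the block product A_j @ B_k.
    assert np.allclose(coeffs[0 + 1 * m], A_blocks[0] @ B_blocks[1])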
Article
Spanner is Google’s scalable, multiversion, globally distributed, and synchronously replicated database. It is the first system to distribute data at global scale and support externally-consistent distributed transactions. This article describes how Spanner is structured, its feature set, the rationale underlying various design decisions, and a novel time API that exposes clock uncertainty. This API and its implementation are critical to supporting external consistency and a variety of powerful features: nonblocking reads in the past, lock-free snapshot transactions, and atomic schema changes, across all of Spanner.
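The novel time API is TrueTime, whose interval-based clock makes Spanner's commit-wait rule easy to sketch: TT.now() returns an interval bounding true time, a transaction takes the interval's upper end as its timestamp, and it delays its commit until the interval's lower end has passed that timestamp. The toy Python below shows the rule only; the 4 ms uncertainty bound and function names are invented for the example.

    import time

    EPSILON = 0.004                      # assumed clock-uncertainty bound (4 ms)

    def tt_now():
        t = time.time()
        return t - EPSILON, t + EPSILON  # (earliest, latest) bounds on true time

    def commit(txn):
        s = tt_now()[1]                  # timestamp = latest possible true time
        while tt_now()[0] <= s:          # commit wait: roughly 2 * EPSILON
            time.sleep(EPSILON / 4)
        print(f"{txn} committed at timestamp {s:.6f}")

    commit("T1")                         # guarantees external consistency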
Article
Blockchain performance cannot yet meet the requirements of today's applications. One of the crucial ways to improve performance is sharding. However, most blockchain sharding research focuses on public blockchains. For consortium blockchains, previous studies cannot support high cross-shard efficiency, multiple-shard contract calling, strict transaction atomicity, and shard availability, which are essential requirements but also challenges in consortium blockchain systems. Facing these challenges, we propose Meepo, a systematic study of sharded consortium blockchains. Meepo enhances cross-shard efficiency via the cross-epoch and cross-call. Moreover, a partial cross-call merging strategy is designed to handle multi-state dependency in contract calls, achieving flexible multiple-shard contract calling. Meepo employs a replay-epoch to ensure strict transaction atomicity, and it also uses a backup algorithm called shadow-shard-based recovery to improve shard robustness. On a testbed of 128 AliCloud servers with 32 shards and 4 consortium members, Meepo-OpenEthereum achieves more than 140,000 cross-shard TPS under a workload of 100,000,000 asset transactions. It also shows more than 50,000 TPS on transactions drawn from real-world shopping behavior.
Article
Coded computing has proved its efficiency in handling the straggler issue in distributed computing frameworks, using error-correcting codes to mitigate the effect of stragglers. However, in a coded distributed computing framework there may also exist Byzantine workers who send wrong computation results to the master in order to contaminate the overall computation output. It is therefore essential to identify Byzantine workers from their computation results. In this paper, we consider the Byzantine attack identification problem in coded computing for distributed matrix multiplication tasks. We propose a new coding scheme that facilitates efficient Byzantine attack identification, namely locally testable codes, and suggest a hierarchical group testing method for identifying the attackers. We derive the number of tests required for group testing in our scheme and show that it is smaller than that of the conventional group testing method applied to existing coded computing schemes.
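Hierarchical group testing differs from plain binary splitting by first screening disjoint groups and then searching only inside the contaminated ones. The Python sketch below illustrates that two-level structure; the tainted oracle stands in for a locally-testable-code check on workers' returned computations, and all names are illustrative.

    def hierarchical_identify(workers, tainted, g=4):
        """Screen groups of size g, then binary-search each positive group."""
        byzantine, tests = [], 0

        def probe(group):
            nonlocal tests
            tests += 1
            return tainted(group)

        def drill(group):                  # group is known to be tainted here
            if len(group) == 1:
                byzantine.append(group[0])
                return
            mid = len(group) // 2
            left, right = group[:mid], group[mid:]
            if probe(left):
                drill(left)
                if probe(right):
                    drill(right)
            else:
                drill(right)               # left clean => right must be tainted

        for i in range(0, len(workers), g):
            group = workers[i:i + g]
            if probe(group):
                drill(group)
        return byzantine, tests

    bad = {5}
    oracle = lambda grp: any(w in bad for w in grp)
    found, n_tests = hierarchical_identify(list(range(16)), oracle)
    print(found, n_tests)                  # [5] found with 7 tests, not 16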
Chapter
In recent years, blockchain technology has received more and more attention. Blockchain is a storage technology for public decentralized databases, and its emergence makes it possible to solve the trust problem among distributed-system nodes within a wide area network. This chapter elaborates on the current advantages and disadvantages of blockchain technology and the main problems facing the development of wide-area distributed systems. It attempts to combine the ideas of blockchains and distributed systems, and gives a complete blockchain design for building a wide-area distributed system. The designed blockchain not only retains the tamper-proof and traceable characteristics of existing blockchain technology, but can also overcome the node-trust problem of wide-area distributed systems.
Conference Paper
We consider a scenario involving computations over a massive dataset stored distributedly across multiple workers, which is at the core of distributed learning algorithms. We propose Lagrange Coded Computing (LCC), a new framework to simultaneously provide (1) resiliency against stragglers that may prolong computations; (2) security against Byzantine (or malicious) workers that deliberately modify the computation for their benefit; and (3) (information-theoretic) privacy of the dataset amidst possible collusion of workers. LCC, which leverages the well-known Lagrange polynomial to create computation redundancy in a novel coded form across workers, can be applied to any computation scenario in which the function of interest is an arbitrary multivariate polynomial of the input dataset, hence covering many computations of interest in machine learning. LCC significantly generalizes prior works to go beyond linear computations. It also enables secure and private computing in distributed settings, improving the computation and communication efficiency of the state-of-the-art. Furthermore, we prove the optimality of LCC by showing that it achieves the optimal tradeoff between resiliency, security, and privacy, i.e., in terms of tolerating the maximum number of stragglers and adversaries, and providing data privacy against the maximum number of colluding workers. Finally, we show via experiments on Amazon EC2 that LCC speeds up the conventional uncoded implementation of distributed least-squares linear regression by up to 13.43×, and also achieves a 2.36×-12.65× speedup over the state-of-the-art straggler mitigation strategies.
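A tiny numeric sketch of the Lagrange encoding for f(x) = x**2: K data blocks are interpolated into u(z) with u(beta_j) = X_j, each worker returns f(u(alpha_i)), and since f(u(z)) has degree deg(f)*(K-1), that many results plus one recover every f(X_j). Floats replace the finite field of the actual scheme, so this shows the algebra, not the straggler/Byzantine/privacy guarantees; all points and sizes are invented for the example.

    import numpy as np

    X = np.array([3.0, 5.0])               # K = 2 data blocks
    betas = np.array([0.0, 1.0])           # interpolation points for the data
    alphas = np.array([2.0, 3.0, 4.0])     # worker evaluation points

    def u(z):                               # Lagrange interpolation of the data
        return sum(X[j] * np.prod([(z - betas[k]) / (betas[j] - betas[k])
                                   for k in range(len(X)) if k != j])
                   for j in range(len(X)))

    worker_results = [u(a) ** 2 for a in alphas]   # each worker computes f

    # Master: fit the degree-2 polynomial f(u(z)) from 3 results, then
    # evaluate it back at the betas to decode every f(X_j).
    poly = np.polyfit(alphas, worker_results, deg=2)
    decoded = np.polyval(poly, betas)
    assert np.allclose(decoded, X ** 2)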
Article
Blockchain technologies are expected to make a significant impact on a variety of industries. However, one issue holding them back is their limited transaction throughput, especially compared to established solutions such as distributed database systems. In this paper, we rearchitect a modern permissioned blockchain system, Hyperledger Fabric, to increase transaction throughput from 3000 to 20 000 transactions per second. We focus on performance bottlenecks beyond the consensus mechanism, and we propose architectural changes that reduce computation and I/O overhead during transaction ordering and validation to greatly improve throughput. Notably, our optimizations are fully plug-and-play and do not require any interface changes to Hyperledger Fabric. This work shows how a permissioned blockchain framework such as Hyperledger Fabric can be re-engineered to support nearly 20 000 transactions per second, a factor of almost 7 better than prior work. We accomplished this goal by implementing a series of independent optimizations focusing on I/O, caching, parallelism, and efficient data access, including aggressive caching and lightweight data structures for fast data access on the critical path.
Conference Paper
In this paper, we explore attack surfaces in the open-source permissioned blockchain project Hyperledger Fabric that can be exploited and compromised through cryptographic tactics. Attacks such as insider threats, DNS attacks, private key attacks, and certificate authority (CA) attacks are proposed and discussed. Points in the transaction flow where the proposed attacks threaten the permissioned blockchain are specified and analyzed. Key management systems are discussed, and a deep analysis of hierarchical deterministic wallets is conducted. The Membership Service Provider (MSP) proves to be a centralizing aspect of an otherwise decentralized system, and a weakness of the permissioned blockchain network.
Conference Paper
Different types of encryption techniques are used to ensure the privacy of data transmitted over the internet. A digital signature is a mathematical scheme that ensures the privacy of a conversation, the integrity of data, the authenticity of a digital message and its sender, and the sender's non-repudiation. A digital signature may be embedded in a hardware device or exist as a file on a storage device, and signatures are certified by a third party, a certifying authority. This paper describes the key factors of digital signatures and how they work, covering the various methods and procedures involved in signing data or messages with a digital signature, and introduces the algorithms used in digital signatures.
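The sign/verify flow described here can be shown with a textbook-RSA toy: the signer hashes the message and exponentiates the digest with the private key, and anyone holding the public key can verify. The 16-bit primes below are for readability only and offer no security; production code should use a vetted library with proper padding (e.g., RSA-PSS) rather than this sketch.

    import hashlib

    p, q = 61, 53                        # toy primes (never use sizes like this)
    n, e = p * q, 17                     # public key (n, e)
    d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (Python 3.8+)

    def h(msg: bytes) -> int:
        return int.from_bytes(hashlib.sha256(msg).digest(), "big") % n

    def sign(msg: bytes) -> int:
        return pow(h(msg), d, n)         # exponentiate digest with private key

    def verify(msg: bytes, sig: int) -> bool:
        return pow(sig, e, n) == h(msg)  # anyone with (n, e) can check

    sig = sign(b"transfer 10 coins")
    assert verify(b"transfer 10 coins", sig)
    assert not verify(b"transfer 99 coins", sig)   # tampering is detected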
Article
Consider a population of items, each classified as either defective or non-defective. This article presents a new group testing method to detect all defectives in the population. The proposed method applies either when the actual number of defectives (or an upper bound on it) is known or when its probability distribution is known. The method is surprisingly simple, yet compares favorably with other existing methods in the number of tests required to detect all defectives.
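The test-count trade-off such methods optimize can be seen in the classic two-stage (Dorfman-style) calculation: given defect probability p, the expected number of tests per item for group size g is 1/g + 1 - (1-p)**g, and one picks the g minimizing it. This sketch illustrates that trade-off, not the article's exact procedure.

    def expected_tests_per_item(g: int, p: float) -> float:
        # one group test per g items, plus individual retests of positive groups
        return 1 / g + 1 - (1 - p) ** g

    p = 0.01                                        # defect probability
    best = min(range(2, 101), key=lambda g: expected_tests_per_item(g, p))
    print(best, round(expected_tests_per_item(best, p), 4))
    # -> 11 0.1956, i.e. about a 5x saving over testing every item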
Article
We obtain randomized algorithms for factoring degree n univariate polynomials over F_q requiring O(n^(1.5+o(1)) log^(1+o(1))q + n^(1+o(1)) log^(2+o(1))q) bit operations. When log q < n, this is asymptotically faster than the best previous algorithms [J. von zur Gathen and V. Shoup, Comput. Complexity, 2 (1992), pp. 187–224; E. Kaltofen and V. Shoup, Math. Comp., 67 (1998), pp. 1179–1197]; for log q ≥ n, it matches the asymptotic running time of the best known algorithms. The improvements come from new algorithms for modular composition of degree n univariate polynomials, which is the asymptotic bottleneck in fast algorithms for factoring polynomials over finite fields. The best previous algorithms for modular composition use O(n^((ω+1)/2)) field operations, where ω is the exponent of matrix multiplication [R. P. Brent and H. T. Kung, J. Assoc. Comput. Mach., 25 (1978), pp. 581–595], with a slight improvement in the exponent achieved by employing fast rectangular matrix multiplication [X. Huang and V. Y. Pan, J. Complexity, 14 (1998), pp. 257–299]. We show that modular composition and multipoint evaluation of multivariate polynomials are essentially equivalent, in the sense that an algorithm for one achieving exponent α implies an algorithm for the other with exponent α+o(1), and vice versa. We then give two new algorithms that solve the problem near-optimally: an algebraic algorithm for fields of characteristic at most n^(o(1)), and a nonalgebraic algorithm that works in arbitrary characteristic. The latter algorithm works by lifting to characteristic 0, applying a small number of rounds of multimodular reduction, and finishing with a small number of multidimensional FFTs. The final evaluations are reconstructed using the Chinese remainder theorem. As a bonus, this algorithm produces a very efficient data structure supporting polynomial evaluation queries, which is of independent interest. Our algorithms use techniques that are commonly employed in practice, in contrast to all previous subquadratic algorithms for these problems, which relied on fast matrix multiplication.
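For contrast with the fast algorithms above, here is the naive baseline for the operation being accelerated: modular composition f(g(x)) mod (h(x), q) via Horner's rule, where each step costs a full polynomial multiply-and-reduce. The coefficient-list representation (lowest degree first) and the small worked example are illustrative choices.

    def polymulmod(a, b, h, q):
        """Multiply polynomials a, b (coeff lists, low degree first),
        reducing mod the monic polynomial h and the prime q."""
        prod = [0] * (len(a) + len(b) - 1)
        for i, ai in enumerate(a):
            for j, bj in enumerate(b):
                prod[i + j] = (prod[i + j] + ai * bj) % q
        dh = len(h) - 1
        while len(prod) > dh:            # long division by monic h
            c = prod[-1]
            for k in range(dh + 1):
                idx = len(prod) - 1 - k
                prod[idx] = (prod[idx] - c * h[dh - k]) % q
            prod.pop()
        return prod

    def modcomp(f, g, h, q):
        """Compute f(g(x)) mod (h(x), q) by Horner's rule over f's coefficients."""
        acc = [f[-1] % q]
        for c in reversed(f[:-1]):
            acc = polymulmod(acc, g, h, q)
            acc[0] = (acc[0] + c) % q
        return acc

    # f = x^2 + 1, g = x + 2, h = x^2 + 1, q = 7:
    # f(g(x)) = (x+2)^2 + 1 = x^2 + 4x + 5 = 4x + 4 (mod x^2 + 1, 7)
    print(modcomp([1, 0, 1], [2, 1], [1, 0, 1], 7))   # -> [4, 4]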