All content in this area was uploaded by Huawei Huang on Apr 20, 2021
MVCom: Scheduling Most Valuable Committees for
the Large-Scale Sharded Blockchain
Huawei Huang∗, Zhenyi Huang∗, Xiaowen Peng∗, Zibin Zheng∗, Song Guo†
∗School of Computer Science and Engineering, Sun Yat-Sen University, Guangzhou, China
huanghw28@mail.sysu.edu.cn; {huangzhy83; pengxw3}@mail2.sysu.edu.cn
†Department of Computing, The Hong Kong Polytechnic University, Hong Kong. song.guo@polyu.edu.hk
Corresponding author: Zibin Zheng, zhzibin@mail.sysu.edu.cn
Abstract—In a large-scale sharded blockchain, transactions are
processed by a number of parallel committees collaboratively.
Thus, the blockchain throughput can be strongly boosted. A
problem is that some groups of blockchain nodes take a long
time to form committees at the beginning of each epoch.
Furthermore, the heterogeneous processing capabilities of
different committees also result in unbalanced consensus latency.
Such unbalanced two-phase latency brings a large cumulative age
to the transactions waiting in the final committee. Consequently,
the blockchain throughput can be significantly degraded by the
large cumulative age of transactions. We believe that a
good committee-scheduling strategy can reduce the cumulative
age, and thus benefit the blockchain throughput. However, we
have not yet found a committee-scheduling scheme that works
for accelerating block formation in the context of blockchain
sharding. To this end, this paper studies a fine-balanced tradeoff
between the throughput of transactions and their cumulative age
in a large-scale sharded blockchain. We formulate this tradeoff
as a utility-maximization problem, which is proved NP-hard. To
solve this problem, we propose an online distributed Stochastic-
Exploration (SE) algorithm, which guarantees a near-optimal
system utility. The theoretical convergence time of the proposed
algorithm as well as the performance perturbation brought by the
committee’s failure are also analyzed rigorously. We then evaluate
the proposed algorithm using the dataset of blockchain-sharding
transactions. The simulation results demonstrate that the proposed
SE algorithm substantially outperforms the baselines in terms of
both system utility and the contributing degree while processing
shard transactions.
Index Terms—Sharded Blockchain, Committee Scheduling
I. INTRODUCTION
Bitcoin adopts the classical Byzantine consensus protocol,
i.e., the Nakamoto consensus [1], in which miners generate
new blocks by solving a puzzle. When one of the miners
calculates a nonce value that fits the required difficulty, a
new block is generated. Through varying the difficulty, Bitcoin
maintains a stable rate of block generation. However, such a
Proof-of-Work (PoW) protocol does not scale well in terms of
transaction throughput. Thus, Luu et al. [2] proposed a new
distributed agreement protocol, called Elastico, for the permis-
sionless blockchains. The key idea of Elastico is to partition
the blockchain network nodes into smaller committees. Each
committee consists of a group of miners, who collaboratively
process a disjoint set of transactions, which is called a shard.
[Figure 1: member committees C1 to C4 each run PoW at their nodes (stage of committee formation) and then BFT (stage of intra-committee consensus) to produce Shards 1 to 4; the final committee C5 runs BFT (stage of final consensus) and appends the final block to the root chain. In this epoch, C1 to C4 are member committees and C5 is the final committee.]
Fig. 1. Motivation: at each epoch of Elastico [2], the committee-formation latency plus the intra-committee consensus latency consumed by some straggler committees will slow down the block formation in the final committee.
In the Elastico-based sharded blockchain shown in Fig. 1, we define an epoch as the procedure by which a new global block is generated in the root chain. Referring to Elastico [2], each epoch generally includes the following 5 stages:
1) Committee Formation. Some groups of processors, i.e.,
miners, will be chosen to form committees following the
PoW-based election mechanism.
2) Overlay Configuration. Processors are configured to dis-
cover and identify each other by exchanging the commit-
tee membership.
3) Intra-Committee Consensus. Processors in each commit-
tee achieve an agreed set of transactions, i.e., a shard, by
running a standard Byzantine protocol such as PBFT [3].
4) Final Consensus. Next, the shards generated by all com-
mittees will be submitted to the final committee, which
yields a new global block for the root chain.
5) Epoch Randomness Refreshing. Finally, the final commit-
tee generates a set of random strings, which are used to
help other committees form new ones in the next epoch.
Motivation. We first define a two-phase latency for each committee, consisting of its formation latency and its intra-committee consensus latency. To find out how large this two-phase latency is, we implemented the Elastico sharding protocol and measured the latency spent on the stages of committee formation and intra-committee consensus. The results are shown in Fig. 2. Firstly, Fig. 2(a) shows the two-phase latency while changing the network size, i.e., the number of nodes participating in committee formation. We see that the formation latency accounts for a large portion compared with the consensus latency, and that it increases linearly as the network size scales out. Then, Fig. 2(b) shows that both latency terms are randomly distributed within a particular range.
[Figure 2: (a) Two-phase latency. (b) CDF of latency.]
Fig. 2. The measurement of two-phase latency, which includes committee-formation latency and intra-committee consensus latency, under Elastico [2].
For stage 3) of the Elastico protocol [2], we have shown that the intra-committee consensus latency varies among different committees. This is because committees perform their local consensus with heterogeneous network connections and transaction-verification capabilities. Thus, when committees finish their local PBFT protocol and submit their individual shard blocks to the final committee, the intra-committee consensus latency shows an unbalanced distribution. Therefore, the unbalanced two-phase latency results in some stragglers at the final committee, and thus prevents the final committee from starting the final consensus early. Considering these facts, the tradeoff between the contributing degree of each committee (measured by the number of transactions included in this committee’s shard) and its two-phase latency should be carefully made.
We further show a motivating example using Fig. 1. Suppose that C5 is the final committee, and C1-C4 are the other member committees. Each member committee spends a certain two-phase latency to achieve a local consensus, and then notifies the final committee C5 towards the final consensus. However, due to the diverse two-phase latency of different committees, some stragglers (e.g., C3) may exist when the final committee intends to start the final consensus. As a result, the large two-phase latency harms the freshness of the transactions included in the committees’ shards. The freshness of transactions can be measured by their cumulative age. A large cumulative age significantly degrades the transaction throughput of the sharded blockchain. On the other hand, suppose that the total number of transactions packaged in C3 is the largest among the 4 committees. Then, the dilemma is whether the final committee shall wait for the straggler C3. To solve this dilemma for each epoch, a good committee-scheduling scheme that considers both the cumulative age of transactions and the total number of transactions contributed by each committee is needed. However, we have not yet found such a scheduling scheme for the large-scale sharded blockchain in the literature.
Motivated by this tradeoff, this paper studies the committee-evaluating
mechanism in the context of such a 5-stage sharding
protocol [2]. The objective of our committee scheduling is to
maximize the total number of processed transactions in each
epoch, and to minimize the transaction’s cumulative age of all
shards submitted to the final committee, simultaneously.
Contributions.
• We focus on how to accelerate the block formation for the large-scale sharded blockchain, when the final committee executes the final consensus for the root chain.
• We propose an online distributed algorithm that can schedule the most valuable committees for the sharded blockchain by exploiting the stochastic-exploration optimization technique. The algorithm can also handle the dynamic joining and leaving events of member committees. The theoretical convergence time and the performance perturbation brought by committee’s failure are also analyzed rigorously.
• We conduct trace-driven simulations using the dataset of real-world blockchain transactions. The simulation results show that the proposed algorithm can select the most valuable committees to participate in the final committee, thus accelerating the block formation by eliminating the straggler shards in each epoch.
The remainder of this paper is organized
as follows. Section II reviews the related work. Section III
presents the system model and problem formulation. Section
IV elaborates the proposed stochastic-exploration algorithm.
In Section V, we analyze the theoretical properties of the
proposed algorithm when handling the dynamic committee-
failure events. Then, Section VI conducts performance evalu-
ation. Finally, Section VII concludes this paper.
II. THE RELATED WORK
Bitcoin’s single-leader consensus protocol suffers from significant transaction-throughput and network-scalability issues due to the computational requirements of its PoW-based leader-selection strategy. To overcome this issue, a number of studies [4] proposed the sharding-based technique. Essentially, the sharding technique is one of the committee-based approaches (e.g., Elastico [2]) that partition the whole group of transactions into different shards and select multiple committees to process different bunches of transactions in parallel. The
synchrony [5] gains growing attention in some protocol details
of blockchains. For example, LibraBFT incorporates a novel
round synchronization mechanism that provides bounded com-
mit latency under synchrony. ACCEL [6] also proposed an
accelerating mechanism towards the transaction-verification in
the latency-bounded blockchain networks.
On the other hand, we also review more sharding blockchain
protocols here to show the difference of our study. Since
Elastico [2], many other sharding blockchain protocols [5],
[7]–[13] have been proposed. Some representative studies are
reviewed as follows. For example, Zamani et al. [7] proposed
a Byzantine-resilient sharding protocol, namely RapidChain,
for permissionless blockchains. RapidChain improves
the throughput by exploiting an efficient cross-shard verifi-
cation approach that avoids the flooding broadcast messages.
Aiming to improve the efficiency of cross-shard transactions,
Amiri et al. [8] proposed SharPer, a permissioned blockchain
system for improving the scalability of blockchains by dividing
and reallocating different data shards to various network
clusters. Furthermore, Nguyen et al. [10] proposed a new
shard-placement paradigm different from the random sharding,
called OptChain, which can minimize the number of cross-
shard transactions. Similar to OptChain, Dang et al. [5]
proposed a general distributed transaction protocol for sharded
blockchains, aiming to design an efficient shard formation
approach based on the synchronous sharding protocol Elastico.
In summary, the state-of-the-art sharding protocols improve
the scalability or the throughput by proposing new methods
to handle cross-shard transactions or to reduce the heavy
cross-shard data migration. In contrast, we attempt to improve
the throughput of sharded blockchains by eliminating the
unbalanced two-phase latency of member committees when
executing the final consensus in the final committee.
III. PROBLEM STATEMENT
A. System Model
We consider a large-scale sharded blockchain that executes in a number of epochs denoted by $J = \{1, 2, \cdots, j, \cdots\}$. For each epoch $j \in J$, as illustrated in Fig. 1, there is a committee leader called the final committee, which is in charge of generating the final block for the root chain. All the other committees are called member committees in this paper. All shards generated by member committees are denoted by the set $I_j = \{1, 2, \cdots, i, \cdots\}, \forall j \in J$. At the beginning of each epoch, the final committee waits for the two most important features from each member committee, i.e., a) the two-phase latency, and b) the number of transactions packaged in the shard delivered by the member committee. In reality, those two features, i.e., the two-phase latency and the number of transactions (abbreviated as TXs), can be quickly reported to the final committee. The set of individual two-phase latencies of member committees is denoted by $L_j = \{l_i^j, i \in I_j\}, \forall j \in J$, while the set of the numbers of TXs included in each shard is written as $S_j = \{s_i^j, i \in I_j\}, \forall j \in J$. Note that the size of $L_j$ grows when new shards are submitted.
For each epoch, we assume the sharding protocol can set a deadline (DDL) for all member committees. At a specified DDL, the final committee needs to choose a subset of the shards generated by all member committees to form a final block by running a Byzantine consensus protocol [3]. Note that we do not change the Byzantine consensus for the final committee. What we attempt to do is to help the final committee identify the most valuable group of shards generated by member committees before running the Byzantine consensus protocol.
Thus, we define a binary variable $x_i^j \in \{0, 1\}$ to denote whether a shard $i \in I_j$ is permitted to participate in the stage of final consensus at an epoch $j \in J$:
$$x_i^j = \begin{cases} 1, & \text{if the final committee permits the consensus shard } i \in I_j \text{ for the final PBFT at epoch } j \in J; \\ 0, & \text{otherwise.} \end{cases}$$
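[Figure 3: timeline of committees C1, C2, C3, ..., Ci over two successive epochs, showing their two-phase latencies $l_i^j$ in epoch $j$ and $l_i^{j+1}$ in epoch $j+1$. Suppose C1 and C3 are the final committees of epoch $j$ and $j+1$, respectively. If the DDL of epoch $j$ is placed as marked, Ci will not be permitted; at the marked DDL for epoch $j+1$, C1 will not be permitted.]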
Fig. 3. Handling the two-phase latency across 2 successive epochs.
In each epoch, the predefined DDL is the maximum tolerable time to wait for a sufficient number of member committees to submit their shards. Note that this paper does not address how to set the DDL. Instead, the proposed approach helps the final committee evaluate which groups of member committees should be permitted to participate in the final consensus before a specified DDL. In particular, when the transaction capacity of the final block is limited, the DDL should be shortened as much as possible to accelerate the generation of the final block. We then let $t_j \in \{\min L_j, \cdots, \max L_j\}$ represent the given DDL for assessing the received shards at epoch $j \in J$. In practice, the DDL can be set to the moment when a predefined percentage of committees have submitted their shards to the final committee. Thus, we have $t_j = \max_{i \in I_j} l_i^j, \forall j \in J$. For example, as shown in Fig. 1, suppose that the two-phase latency of committees C1, C2, C3, and C4 is 800, 900, 1200, and 1000 seconds, respectively. If the DDL is set to the 1000th second, C3 becomes a straggler that will very likely not be included by the final committee in the current epoch.
Another question is how to handle the refused committees in the next epoch. As shown in Fig. 3, if Ci was not permitted in epoch $j$, its two-phase latency is updated in epoch $j+1$ by subtracting the previous DDL. Thus, a refused committee is more likely to be permitted at epoch $j+1$ with its new, smaller two-phase latency.
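The carry-over rule for refused committees can be sketched in a few lines of Python. This is only an illustration under an assumption: the text does not give a formula, so the sketch reads the update as subtracting the previous DDL from the old latency, floored at zero. The function name `carry_over_latency` is hypothetical.

```python
def carry_over_latency(latency, previous_ddl):
    """Latency that a refused committee carries into the next epoch.

    Assumption: the update subtracts the previous DDL from the old
    two-phase latency; the floor at zero is an added safeguard.
    """
    return max(latency - previous_ddl, 0)


# Epoch j: C3 needs 1200 s, the DDL is at 1000 s, so C3 is refused.
# In epoch j+1, C3 only needs the remaining part of its latency.
remaining = carry_over_latency(1200, 1000)
```

Under this reading, the straggler of the example above enters epoch j+1 with a latency of 200 seconds, which is why it is more likely to be permitted there.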
Now we can calculate the cumulative age of all TXs packaged in each shard $i \in I_j$ using the summed waiting time counted from the beginning of committee-formation to the specified DDL. Thus, it is written as follows:
$$\Pi_i^j = x_i^j (t_j - l_i^j) = x_i^j \left(\max_{k \in I_j} l_k^j - l_i^j\right), \forall i \in I_j, j \in J. \tag{1}$$
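As a small worked instance of Eq. (1), the following Python sketch applies it to the four-committee example above (latencies of 800, 900, 1200, and 1000 seconds with the DDL at the 1000th second). It assumes, purely for illustration, that every shard arriving by the DDL is permitted; the helper name `cumulative_ages` is hypothetical.

```python
def cumulative_ages(latencies, ddl):
    """Map each committee to (x, Pi): permit flag and cumulative age, Eq. (1).

    Assumption for this sketch: a shard is permitted (x = 1) exactly
    when its two-phase latency does not exceed the DDL.
    """
    result = {}
    for name, latency in latencies.items():
        x = 1 if latency <= ddl else 0
        result[name] = (x, x * (ddl - latency))
    return result


latencies = {"C1": 800, "C2": 900, "C3": 1200, "C4": 1000}
ages = cumulative_ages(latencies, ddl=1000)
```

Here C1 waits 200 seconds and C2 waits 100 seconds at the final committee, C4 arrives exactly at the DDL, and the straggler C3 contributes no age because it is not permitted.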
B. Offline-Version Problem Formulation
The problem we study has two objectives: (1) to maximize the total number of TXs permitted by the final committee, and (2) to minimize the cumulative age of those TXs. The first objective is positively related to the root-chain throughput, while the second one is inversely related to the freshness of the TXs packaged in the permitted committee shards. Therefore, we have the following offline-version utility-maximization problem: scheduling the Most Valuable Committees (MVCom) for the large-scale sharded blockchain.
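$$\textbf{MVCom}: \max\ U = \sum_{j \in J} \sum_{i \in I_j} (\alpha \cdot x_i^j s_i^j - \Pi_i^j) \tag{2}$$
$$\text{s.t.} \sum_{i \in I_j} x_i^j \ge N_{\min}, \forall j \in J. \tag{3}$$
$$\sum_{i \in I_j} x_i^j s_i^j \le \hat{C}, \forall j \in J. \tag{4}$$
$$\text{Variables: } x_i^j \in \{0, 1\}, \forall i \in I_j, \forall j \in J. \tag{5}$$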
In objective function (2), $\alpha$ is a tunable weight measuring the bias of the total number of permitted TXs. Const. (3) specifies that the number of chosen member committees should be at least a predefined minimum number $N_{\min}$ for each epoch. In const. (4), $\hat{C}$ represents the capacity of the final block, i.e., the number of TXs that can be packaged in the final block at each epoch. Thus, const. (4) states that the capacity of the final block should not be violated for any epoch $j \in J$.
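To make the formulation concrete, the following Python sketch evaluates objective (2) and constraints (3)-(4) for a single epoch and finds the best schedule by exhaustive search. It is a minimal illustration, not the proposed algorithm: the function names and the three-shard instance are assumptions, and the search enumerates all 2^|I_j| assignments, so it is only usable for tiny instances.

```python
from itertools import product


def utility(x, s, l, t, alpha):
    """Single-epoch objective (2): sum of alpha*x_i*s_i - x_i*(t - l_i)."""
    return sum(xi * (alpha * si - (t - li)) for xi, si, li in zip(x, s, l))


def feasible(x, s, n_min, capacity):
    """Constraints (3) and (4) for one epoch."""
    return sum(x) >= n_min and sum(xi * si for xi, si in zip(x, s)) <= capacity


def brute_force(s, l, t, alpha, n_min, capacity):
    """Enumerate every 0/1 assignment and keep the best feasible one."""
    best_x, best_u = None, float("-inf")
    for x in product((0, 1), repeat=len(s)):
        if feasible(x, s, n_min, capacity):
            u = utility(x, s, l, t, alpha)
            if u > best_u:
                best_x, best_u = x, u
    return best_x, best_u


# Hypothetical instance: TX counts and two-phase latencies of 3 shards.
s = [500, 300, 400]
l = [800, 900, 1000]
best_x, best_u = brute_force(s, l, t=1000, alpha=1.0, n_min=1, capacity=900)
```

In this instance the per-shard values are 300, 200, and 400, and the capacity of 900 TXs rules out taking all three shards, so the search selects the first and third shards with a utility of 700.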
C. NP-hardness Proof
Lemma 1: The MVCom problem is NP-hard.
Proof: The NP-hardness of the MVCom problem can be proved by a reduction from the well-known Binary Knapsack Problem (BKP) [14], which is defined as follows. We are given a knapsack with a capacity $\bar{C}$ and a set $\mathcal{K}$ of item types. Each item type $k \in \mathcal{K}$ has a weight $w_k$ and a value $p_k$, and each type $k$ has only one item. The goal of BKP is to find a subset of items to place into the knapsack, such that the total value of the selected items is maximized, and the total weight of the selected items does not exceed the capacity of the knapsack.
$$\textbf{BKP}: \max \sum_{k \in \mathcal{K}} p_k \cdot x_k$$
$$\text{s.t.} \sum_{k \in \mathcal{K}} w_k \cdot x_k \le \bar{C},$$
$$\text{Variables: } x_k \in \{0, 1\}, \forall k \in \mathcal{K},$$
where $x_k$ is a 0-1 variable representing whether to select the type-$k$ item to package into the knapsack.
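We now construct a problem based on the original BKP. First we have the following reconstruction parameters: $p_k = \alpha \cdot s_k^1 - (t_1 - l_k^1), \forall k \in \mathcal{K}$; $w_k = s_k^1, \forall k \in \mathcal{K}$; and $\bar{C} = \hat{C}$. We then have the following BKP-New problem:
$$\textbf{BKP-New}: \max \sum_{k \in \mathcal{K}} (\alpha \cdot s_k^1 - (t_1 - l_k^1)) \cdot x_k^1$$
$$\text{s.t.} \sum_{k \in \mathcal{K}} s_k^1 \cdot x_k^1 \le \hat{C},$$
$$\text{Variables: } x_k^1 \in \{0, 1\}, \forall k \in \mathcal{K}.$$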
Discussion: Through the reduction shown above, we can see that the BKP-New problem is essentially a simplified case of the proposed MVCom problem when $J = \{1\}$ and $N_{\min} = 0$. The mapping from the original BKP to the simplified MVCom problem, i.e., BKP-New, is completed in polynomial time. Due to the NP-hardness of BKP, we conclude that the MVCom problem is NP-hard, too.
IV. ONLINE DISTRIBUTED STOCHASTIC-EXPLORATION
ALGORITHM
A. Advantages of the Proposed Online Distributed Algorithm
In a real-world sharded blockchain, for each epoch, it is already too late to start evaluating which group of shards is worth packing into the final block once all member committees have submitted their shards. Thus, to shorten the block formation at each epoch in real time, we design an online distributed algorithm by adopting the framework of the Stochastic Exploration (SE) technique [15]. On one hand, such an SE-based algorithm can make real-time decisions to terminate the current epoch by eliminating both the stragglers whose two-phase latency exceeds the predefined DDL, and the group of shards having low values. Thus, the final committee can start the final consensus earlier than the original schedule. On the other hand, the proposed SE algorithm can execute independently, outside the sharded blockchain system, in a distributed manner. This is because the proposed algorithm consists of multiple independent threads that can run on either a single machine or multiple distributed machines, as long as those threads can communicate with each other with a low delay. The distributed execution manner improves the algorithm’s robustness. Furthermore, the proposed online algorithm can handle the committees’ dynamic joining and leaving events.
In the following, we describe the two major stages of the
algorithm design: log-sum-exp approximation and implemen-
tation of Markov Chain.
B. Log-Sum-Exp Approximation
For each $j \in J$, let $f = \{x_i, \forall i \in I_j\}$ denote a feasible solution to the MVCom problem, and $\mathcal{F} = \{f_1, f_2, ...\}$ the set of all feasible solutions to the MVCom problem. For convenience of presentation, we denote by $U_f$ the value of objective function (2) corresponding to a given solution $f$. To better understand the log-sum-exp approximation, let each solution $f \in \mathcal{F}$ be associated with a probability $p_f$, which indicates the percentage of time the solution $f$ is in use by the system. Then, MVCom can be approximated by applying the approximation technique proposed in [15]. We call the problem approximated by the log-sum-exp technique the MVCom($\beta$) problem, where $\beta \in (0, \infty)$ is a positive constant related to the approximation performance. The motivation behind such a time-averaged approximation is that it potentially leads to stochastic solutions. Among those stochastic solutions, a near-optimal solution can be reached with a well-designed transition-rate matrix in the Markovian space constructed upon the solution set $\mathcal{F}$.
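Let $p_f^* (f \in \mathcal{F})$ be the optimal solution to the MVCom($\beta$) problem. Then, by solving the Karush-Kuhn-Tucker (KKT) conditions [16] of MVCom($\beta$), we obtain the stationary probability of each solution $f \in \mathcal{F}$:
$$p_f^* = \frac{\exp(\beta U_f)}{\sum_{f' \in \mathcal{F}} \exp(\beta U_{f'})}, \forall f \in \mathcal{F}. \tag{6}$$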
Remark 1: With the log-sum-exp approximation approach described above, we obtain an approximate version of the MVCom problem with the assistance of an entropy term $\frac{1}{\beta} \sum_{f \in \mathcal{F}} p_f \log p_f$. If we can assign different solutions into different time shares according to the stochastic probability distribution $p_f^*$ shown in (6), then MVCom can be solved approximately within an approximation-loss bound $\frac{1}{\beta} \log |\mathcal{F}|$.
C. Markov Chain Design
Next we design a Markov chain with a state space consisting of all feasible solutions $f \in \mathcal{F}$ and a stationary distribution represented by $p_f^*$ given in (6). Since the system operates under different solutions, the transition between two states in the designed Markovian space indicates replacing a permitted shard generated by any member committee. Therefore, in the implemented Markov chain, if the transitions among states can be trained to converge to the desired stationary distribution $p_f^*$, the system can achieve a near-optimal performance.
To construct a time-reversible Markov chain with the stationary distribution $p_f^*$, we need to define $f, f' \in \mathcal{F}$ as two states of the Markov chain, and let $q_{f,f'}$ be the non-negative transition rate from $f$ to $f'$. As illustrated in Fig. 4, the following two conditions must be guaranteed when a state transits to another: (a) in the Markov chain, any two states are mutually reachable, and (b) the detailed balance equation $p_f^* q_{f,f'} = p_{f'}^* q_{f',f}$ [17], [18] must be satisfied for any pair of adjacent states $\forall f, f' \in \mathcal{F}$. Our design of the desired Markov chain is then described as follows.
1) State-Space Structure: Recall that a solution $f \in \mathcal{F}$ represents a set of selected committee shards to participate in the block formation in the final committee. Initially, we set the transition rate between two solutions $f$ and $f'$ to 0, until they satisfy the following two conditions. Condition a): $|f \cup f'| - |f \cap f'| = 2$; Condition b): $f \cup f' - f \cap f' \in \{x_i\}$, where $i$ is the shard that causes the state transition $f \to f'$. That is, as shown in Fig. 4, the state $f$ transits to $f'$ only when the adoption decisions of a pair of shards are swapped, i.e., $x_i^j$ turns from 1 to 0 and the other involved variable turns from 0 to 1.
2) Design of Transition Rate: In our design, we let the transition rate $q_{f,f'}$ positively correlate to the system-utility difference under two adjacent states $f$ and $f'$ in the state space of the Markov chain. In particular, the transition rate $q_{f,f'}$ is defined as follows.
$$q_{f,f'} = \frac{1}{\exp(\tau)} \exp\left(\frac{1}{2}\beta(U_{f'} - U_f)\right), \forall f, f' \in \mathcal{F}, \tag{7}$$
where $\tau$ is a conditional constant used to prevent the computed value of the exp(.) function from being floored to zero. It can be seen that when $U_{f'} - U_f$ grows, the transition rate also becomes larger, and vice versa. Thus, this design of the transition rate aims to drive the system towards a better solution $f'$ with a larger utility.
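A short numerical check shows why the rate in (7) fits the stationary distribution in (6): for any pair of adjacent states, the probability flow in both directions is the same. The sketch below uses two made-up utilities and hypothetical names; the normalizer is taken over the two states only, which does not affect the check because the same normalizer appears on both sides.

```python
import math


def transition_rate(u_from, u_to, beta, tau):
    """Eq. (7): q = exp(-tau) * exp(0.5 * beta * (U_to - U_from))."""
    return math.exp(-tau) * math.exp(0.5 * beta * (u_to - u_from))


beta, tau = 0.01, 1.0
u_f, u_g = 500.0, 700.0  # hypothetical utilities of two adjacent states

# Stationary probabilities per Eq. (6), normalized over the two states.
z = math.exp(beta * u_f) + math.exp(beta * u_g)
p_f, p_g = math.exp(beta * u_f) / z, math.exp(beta * u_g) / z

flow_forward = p_f * transition_rate(u_f, u_g, beta, tau)
flow_backward = p_g * transition_rate(u_g, u_f, beta, tau)
```

Both flows equal $\exp(\frac{1}{2}\beta(U_f + U_{f'}) - \tau)$ divided by the normalizer, which is symmetric in the two states; at the same time the rate towards the higher-utility state is the larger of the two.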
D. Implementation of Stochastic-Exploration (SE) Algorithm
Algorithm 1: Online Distributed Stochastic-Exploration Algorithm
Input : $I_j, j \in J$
Output: A converged solution $f$
1 if the # of arrived committees exceeds $N_{\min}$ and $\sum_{i \in I_j} s_i^j > \hat{C}$ then
2     Start to listen to dynamic events for this epoch $j$
3 for each $n \in \{1, 2, ..., |I_j| - 1\}$ do
4     $f_n \leftarrow \emptyset$
5     $f_n \leftarrow$ Initialization($I_j$, $n$)
6     $\langle T_n, \tilde{i}, \ddot{i} \rangle \leftarrow$ Set-timer($f_n$)
7 while Algorithm is not converged do
8     /*Online Dynamics-Handling*/
9     if any committee $i$ joins in or leaves (fails) then
10        Update $I_j$ and $n$
11        $f \leftarrow$ Initialization($I_j$, $n$)
12        Set-timer($f$)
13    /*Listen to State-Transit*/
14    if $T_n$ expires then
15        $x_{\tilde{i}} \leftarrow 0$; $x_{\ddot{i}} \leftarrow 1$
16        Set-timer($f_n$)
17        Broadcast RESET signals to other solutions $f \in \mathcal{F} \backslash \{f_n\}$
18    /*Listen to RESET Signals*/
19    if a solution $f$ receives a RESET signal then
20        Set-timer($f$)
21    /*Listen to Convergence*/
22    for $n \in \{1, 2, ..., |I_j| - 1\}$ do
23        $\lambda \leftarrow \lambda \cup \{f_n\}$
24    if utility converges then
25        $\lambda \leftarrow \lambda \cup \{f_{|I_j|}\}$, if $\hat{C}$ is satisfied
26        $f \leftarrow f$ : $\arg\max \{U_f, f \in \lambda\}$
27        Break
28    /*Listen to Termination Signal*/
29    if final committee receives more than a specified maximum percentage $N_{\max}$ of all member committees then
30        Stop listening to the member committees newly arrived

Algorithm 2: Initialization()
Input : $I_j (j \in J)$, $n (\in \{1, 2, ..., |I_j| - 1\})$
Output: $f_n (n \in \{1, 2, ..., |I_j| - 1\})$
1 // Generate the initial solutions with different # of shards
  for a specified $n \in \{1, 2, ..., |I_j| - 1\}$ do
2     $f_n \leftarrow \emptyset$; $I' \leftarrow \emptyset$
3     while $I'$ is empty or the permitted shards in $I'$ do not satisfy Cons. (4) do
4         $I' \leftarrow$ randomly pick $n$ numbers from $I_j$
5     for $i \in I'$ do
6         $x_i^j \leftarrow 1$
7     for $i \in I_j \backslash I'$ do
8         $x_i^j \leftarrow 0$
9     $f_n = \{x_i^j, \forall i \in I_j\}$

Algorithm 3: Set-timer()
Input : $f_n (n \in \{1, 2, ..., |I_j| - 1\})$
Output: $\langle T_n, \tilde{i}, \ddot{i} \rangle$
1 $\tilde{i} \leftarrow i$ : arg(if $x_i^j = 1$, $x_i^j$ is randomly chosen from $f_n$)
2 $\ddot{i} \leftarrow i$ : arg(if $x_i^j = 0$, $x_i^j$ is randomly chosen from $f_n$)
3 $U_f \leftarrow$ current system utility under $f_n$
4 $U_{f'} \leftarrow$ estimated system utility if the values of $x_{\tilde{i}}^j$ and $x_{\ddot{i}}^j$ are swapped
5 Generate a random exponentially distributed timer $T_n$ for $f_n$ with the mean value that is equal to
$$\frac{\exp(\tau - \frac{1}{2}\beta(U_{f'} - U_f))}{|I_j| - n} \tag{8}$$
6 Let timer $T_n$ begin to count down

[Figure 4: two adjacent states $f$ and $f'$ connected by the transition rates $q_{f,f'}$ and $q_{f',f}$; the transition $f \leftrightarrow f'$ swaps the adoption decisions of any pair of shards.]
Fig. 4. Transition between two adjacent states (i.e., solutions).

The implementation of the proposed SE algorithm based on Markov chain theory with a well-designed transition-rate matrix is presented in Alg. 1. The interactions between committees and the proposed algorithm are shown in Fig. 5, in which the proposed algorithm executes on several independent parallel threads, each of which runs a set of feasible solutions $\{f_n\} (n = 1, 2, ..., |I_j| - 1)$ and their timers $\{T_n\} (n = 1, 2, ..., |I_j| - 1)$. All the parallel threads communicate with committees in real time only for sharing very limited state information, such as the RESET signals and the current system utility. That is why we claim that the proposed algorithm can execute in an online distributed manner. Typically, each feasible solution follows a general state machine shown in Fig. 6, using which we describe the design details of the SE algorithm as follows.
• Bootstrap: Once the condition in line 1 is triggered, Alg. 1 starts to listen to the committees’ dynamic events.
• Initialization(): According to the live committee shards $i \in I_j (j \in J)$ in the sharded blockchain system, the main algorithm (i.e., Alg. 1) creates an associated solution $f_n (n \in \{1, 2, ..., |I_j| - 1\})$ for each solution case where the # of permitted shards is $n$. Note that the initial solution $f_n$ yielded by the Initialization() function must satisfy Cons. (4), as shown in lines 3-4 of Alg. 2.
• Set-timer(): This function mainly generates a timer for a given solution. Let $f$ and $f'$ denote the currently adopted and the next potential solutions, respectively. This function first chooses two random indices of a pair of committees $\tilde{i}$ and $\ddot{i}$ from the solutions $f$ (i.e., the given $f_n$ shown in Alg. 3) and $f'$, respectively, with $x_{\tilde{i}}^j = 1$ and $x_{\ddot{i}}^j = 0$. Next, the algorithm generates a timer $T_n$ for $f_n$ with a mean value that is equal to $\frac{\exp(\tau - \frac{1}{2}\beta(U_{f'} - U_f))}{|I_j| - n}$. Then, timer $T_n$ begins to count down to zero. See Alg. 3 for the other details.
• State Transit: When a timer $T_n$ expires (i.e., counts down to 0), the main algorithm swaps the chosen pair of member committees $\tilde{i}$ and $\ddot{i}$. As a result, the variables are changed to $x_{\tilde{i}}^j = 0$ and $x_{\ddot{i}}^j = 1$. Next, the main algorithm repeats Set-timer() for the solution $f_n$, and broadcasts a RESET signal to all the other solutions $f \in \mathcal{F} \backslash \{f_n\}$.
• RESET: When a solution $f_n$ receives a RESET signal, it refreshes its timer $T_n$ referring to equation (8) with the updated system utility $U_{f'}$, which is yielded by the previous variable swapping shown in line 15 of Alg. 1.
• Check Convergence: When the utility converges, we say that the algorithm yields a converged solution. In practice, a converged solution can be obtained by specifying an empirical number of running iterations.
• Termination: Once the condition (line 29) is satisfied, Alg. 1 terminates.
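The steps above can be sketched as a sequential Python simulation of one epoch. This is a simplified illustration under several assumptions, not the authors' implementation: the parallel threads are emulated in a single loop, re-drawing every timer in each iteration stands in for the RESET broadcast, a swap that would violate Cons. (4) is treated as a timer that never fires, convergence is replaced by a fixed iteration count, and joining or leaving events are omitted. All function and variable names are hypothetical.

```python
import math
import random
from itertools import combinations


def utility(selected, s, l, t, alpha):
    """Single-epoch utility of a set of permitted shard indices."""
    return sum(alpha * s[i] - (t - l[i]) for i in selected)


def set_timer(selected, s, l, t, alpha, beta, tau, capacity, rng):
    """Pick a random swap and draw its exponential timer (Alg. 3, Eq. (8))."""
    i_out = rng.choice(sorted(selected))
    i_in = rng.choice([i for i in range(len(s)) if i not in selected])
    candidate = (selected - {i_out}) | {i_in}
    if sum(s[i] for i in candidate) > capacity:
        return math.inf, None  # assumption: infeasible swaps never fire
    gain = utility(candidate, s, l, t, alpha) - utility(selected, s, l, t, alpha)
    mean = math.exp(tau - 0.5 * beta * gain) / (len(s) - len(selected))
    return rng.expovariate(1.0 / mean), candidate


def stochastic_exploration(s, l, t, alpha, beta, tau, capacity, iterations, seed=0):
    rng = random.Random(seed)
    # Initialization: one feasible solution f_n per shard count n.
    threads = {}
    for n in range(1, len(s)):
        options = [set(c) for c in combinations(range(len(s)), n)
                   if sum(s[i] for i in c) <= capacity]
        if options:
            threads[n] = rng.choice(options)
    best = max(threads.values(), key=lambda f: utility(f, s, l, t, alpha))
    for _ in range(iterations):
        # Every solution draws a fresh timer (emulates the RESET broadcast).
        timers = {n: set_timer(f, s, l, t, alpha, beta, tau, capacity, rng)
                  for n, f in threads.items()}
        first = min(timers, key=lambda n: timers[n][0])
        _, candidate = timers[first]
        if candidate is None:
            continue
        threads[first] = candidate  # state transit of the expired solution
        if utility(candidate, s, l, t, alpha) > utility(best, s, l, t, alpha):
            best = candidate
    return best


s = [500, 300, 400]
l = [800, 900, 1000]
best = stochastic_exploration(s, l, t=1000, alpha=1.0, beta=0.01, tau=1.0,
                              capacity=900, iterations=200)
```

Because a swap with a larger utility gain gets a shorter expected timer, improving swaps tend to win the race against the other solutions, which is the mechanism that biases the exploration towards high-utility schedules.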
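[Figure 5: committees C1, C2, C3, ..., Ci report 1) their # of TXs and 2) their latency $l_i^j$ to the algorithm through real-time message exchange. The algorithm maintains the solutions $f_1, f_2, ..., f_n$ with their timers $T_1, T_2, ..., T_n$ and starts the next iteration whenever a timer expires. For each epoch, a converged solution $f = \arg\max U_f$ over $\{f_1, f_2, ..., f_n\}$ is delivered to the final committee C0.]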
Fig. 5. Interactions between the committee and the distributed algorithm.
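[Figure 6: state machine with the states Initialization(), Set-timer(), and State Transit & Broadcast RESET. A solution returns to Set-timer() when it receives a RESET signal; a state transit is triggered when any timer counts down to 0.]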
Fig. 6. State machine for each parallel feasible solution in Alg. 1.
E. Time-Reversible Property of the Proposed Algorithm
We now analyze the theoretical properties of the proposed stochastic-exploration algorithm. The first step is to prove that the proposed algorithm can realize a time-reversible Markov chain with the stationary distribution given in (6) through the following properties. First, all the transition rates in the devised Markov chain are evidently finite by equation (7). We then have the following two lemmas.
Lemma 2: The constructed Markov chain is irreducible.
Proof: The designed state space implies all solutions are
reachable from each other within a finite number of transitions.
Thus, the constructed Markov chain is irreducible.
Lemma 3: The detailed balance equations $p_f^* q_{f,f'} = p_{f'}^* q_{f',f}$ [17], [18] hold in the constructed Markov chain.
Proof: In Alg. 3, we assign each feasible solution a random timer following an exponential distribution. Now we analyze the transition rate when adopting a committee shard.
At an epoch $j \in J$, let $Pr_{f \to f'}$ denote the transition probability from the current state $f$ to the next state $f'$ when a timer $T_n$ expires ($T_n$ is the timer of $f_n$), and define $N_f$ as the set of all neighbor solutions given a state $f$. According to Alg. 3, the current shard adoption solution $f_n$ can transit to any other neighbor state $f' \in N_{f_n}$ with equal probabilities. When the main algorithm selects the next feasible solution, there are $|N_{f_n}| = \sum_{n=1}^{|I_j|-1} \binom{|I_j|-n}{1}$ choices. Thus, $Pr_{f \to f'} = 1/|N_{f_n}| = 1/\sum_{n=1}^{|I_j|-1} (|I_j| - n), \forall n = 1, 2, ..., |I_j| - 1$.
Referring to (8), given a current state $f$, each timer counts down with a rate equal to the reciprocal of its mean value:
$$\rho_f = \frac{1}{\exp(\tau)} \cdot (|I_j| - n) \cdot \exp\left(\frac{1}{2}\beta(U_{f'} - U_f)\right), \forall f' \in N_{f_n}. \tag{9}$$
Thus, Alg. 1 leaves state $f$ with a cumulative rate $\sum_{n=1}^{|I_j|-1} \rho_f$. The transition rate from $f$ to $f'$ is then calculated as:
$$q_{f,f'} = \sum_{n=1}^{|I_j|-1} \rho_f \times Pr_{f \to f'} = \exp(-\tau) \cdot \exp\left(\frac{1}{2}\beta(U_{f'} - U_f)\right). \tag{10}$$
Similarly, the transition rate $q_{f',f} = \exp(\frac{1}{2}\beta(U_f - U_{f'}) - \tau)$. Substituting (6), both $p_f^* q_{f,f'}$ and $p_{f'}^* q_{f',f}$ equal $\exp(\frac{1}{2}\beta(U_f + U_{f'}) - \tau) / \sum_{f'' \in \mathcal{F}} \exp(\beta U_{f''})$. Finally, we have $p_f^* q_{f,f'} = p_{f'}^* q_{f',f}$. That is, the detailed balance equations hold.
Based on Lemma 2 and Lemma 3, and referring to [17],
[18], we conclude that Alg. 1 realizes a time-reversible Markov
chain with the stationary distribution shown in (6).
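The detailed balance property of Lemma 3 can be checked numerically on a toy instance. The following is a minimal sketch, assuming the stationary distribution has the form $p^*_f \propto \exp(\beta U_f)$ and the transition rate has the form given in (10). The utility values, the function names, and the choice to test every pair of states are illustrative assumptions, not part of the original algorithm.

```python
import itertools
import math
import random


def stationary(utilities, beta):
    """p*_f proportional to exp(beta * U_f); shifted by max(U) for numerical stability."""
    m = max(utilities)
    weights = [math.exp(beta * (u - m)) for u in utilities]
    z = sum(weights)
    return [w / z for w in weights]


def rate(u_from, u_to, beta, tau):
    """Transition rate exp(-tau) * exp(0.5 * beta * (U_to - U_from)), as in (10)."""
    return math.exp(-tau) * math.exp(0.5 * beta * (u_to - u_from))


def max_detailed_balance_gap(utilities, beta, tau):
    """Largest |p*_a q_{a,b} - p*_b q_{b,a}| over all pairs of toy states."""
    p = stationary(utilities, beta)
    gap = 0.0
    for a, b in itertools.combinations(range(len(utilities)), 2):
        lhs = p[a] * rate(utilities[a], utilities[b], beta, tau)
        rhs = p[b] * rate(utilities[b], utilities[a], beta, tau)
        gap = max(gap, abs(lhs - rhs))
    return gap


# Hypothetical utilities for six toy states (not taken from the paper's dataset).
random.seed(0)
toy_utilities = [random.uniform(0.0, 3.0) for _ in range(6)]
print(max_detailed_balance_gap(toy_utilities, beta=2.0, tau=0.0))
```

The printed gap should be at the level of floating-point rounding, which is what the symmetric term $\exp(\frac{1}{2}\beta(U_f + U_{f'}) - \tau)$ predicts.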
F. Convergence Time of the Proposed Algorithm
Next, we analyze the convergence property of the proposed
Alg. 1. In general, the convergence time of a Markov chain is
measured by the mixing time [18] of the Markovian random field.
Let $H_t(f)$ denote the probability distribution of all states in
$\mathcal{F}$ at time $t$ if the initial state is $f$. Recall that $p^*$ is
the stationary distribution of the designed Markov chain. We
define the mixing time as follows:
$$t_{mix}(\epsilon) := \inf \left\{ t \geq 0 : \max_{f \in \mathcal{F}} \| H_t(f) - p^* \|_{TV} \leq \epsilon \right\}, \tag{11}$$
where $\epsilon > 0$ represents the gap between the performance of the
optimal solution and that of the converged solution, while the
term $\| \cdot \|_{TV}$ denotes the total variation distance between the
probability distributions $H_t(f)$ and $p^*$.
For each epoch $j \in J$, the total size of the feasible solution
set is $\eta = \sum_{n=0}^{|I_j|} \binom{|I_j|}{n} = 2^{|I_j|}$, and the
maximum number of choices when transiting to a neighbor
solution is $|N_{f_n}| = \sum_{n=1}^{|I_j|-1} \binom{|I_j|-n}{1} = \frac{1}{2}(|I_j|^2 - |I_j|)$. Then,
we have the following conclusions on the convergence time
of the proposed algorithm.
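Theorem 1: Given a set of member committees, and letting
$U_{max} = \max_{\forall f \in \mathcal{F}} U_f$ and $U_{min} = \min_{\forall f \in \mathcal{F}} U_f$, the mixing time $t_{mix}(\epsilon)$
for each epoch of the constructed Markov chain in Algorithm
1 is bounded by:
$$t_{mix}(\epsilon) \geq \frac{\exp\left[\tau - \frac{1}{2}\beta(U_{max} - U_{min})\right]}{|I_j|^2 - |I_j|} \ln \frac{1}{2\epsilon}, \tag{12}$$
and
$$t_{mix}(\epsilon) \leq 4^{|I_j|} (|I_j|^2 - |I_j|) \exp\left[\tfrac{3}{2}\beta(U_{max} - U_{min}) + \tau\right] \cdot \left[ \ln \frac{1}{2\epsilon} + \frac{1}{2}|I_j| \ln 2 + \frac{1}{2}\beta(U_{max} - U_{min}) \right]. \tag{13}$$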
The proof can be conducted by referring to the uniformiza-
tion technique [15] following the routine presented in [18],
[19]. Thus, we omit the proof details due to space limitation.
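To get a feeling for the magnitudes in Theorem 1, the two bounds can be evaluated in logarithmic form, since the factor $4^{|I_j|}$ overflows ordinary floating-point numbers for realistic $|I_j|$. The following is a minimal sketch that assumes the bounds read as in (12) and (13). The function name and the restriction $0 < \epsilon < 0.5$ (so that $\ln\frac{1}{2\epsilon}$ is positive) are assumptions made for illustration.

```python
import math


def log_mixing_time_bounds(n_committees, beta, tau, u_max, u_min, eps):
    """Return (ln of lower bound (12), ln of upper bound (13)) on t_mix(eps)."""
    if n_committees < 2:
        raise ValueError("need at least two committees so that |I_j|^2 - |I_j| > 0")
    if not 0.0 < eps < 0.5:
        raise ValueError("eps must lie in (0, 0.5) so that ln(1/(2*eps)) > 0")
    n = n_committees
    gap = u_max - u_min
    pairs = n * n - n                      # |I_j|^2 - |I_j|
    log_term = math.log(1.0 / (2.0 * eps))  # ln(1/(2*eps))
    # ln of (12): exp[tau - beta*gap/2] / pairs * ln(1/(2*eps))
    log_lower = tau - 0.5 * beta * gap - math.log(pairs) + math.log(log_term)
    # ln of (13): 4^n * pairs * exp[3*beta*gap/2 + tau] * bracket
    bracket = log_term + 0.5 * n * math.log(2.0) + 0.5 * beta * gap
    log_upper = (n * math.log(4.0) + math.log(pairs)
                 + 1.5 * beta * gap + tau + math.log(bracket))
    return log_lower, log_upper


# Hypothetical parameter values chosen only to show the scale of the bounds.
lo, hi = log_mixing_time_bounds(50, beta=2.0, tau=0.0, u_max=3.0, u_min=1.0, eps=0.01)
print(f"ln(lower) = {lo:.2f}, ln(upper) = {hi:.2f}")
```

Working with logarithms keeps the gap between the two bounds visible: the term $|I_j| \ln 4$ dominates the upper bound as the number of committees grows.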
Remark 2: First, we see that the upper bound on the convergence
time of the proposed Algorithm 1 scales in the order
of $O(4^{|I_j|}) \cdot O(\exp(\beta)) \cdot O(\ln \frac{1}{\epsilon})$. Thus, if we pursue a very
small performance loss (i.e., a small $\epsilon$) for the converged
solution, the upper bound on the convergence time becomes
large, and vice versa. We then discuss the trade-off between the
optimality loss $\frac{1}{\beta} \log |\mathcal{F}|$ and the mixing time under different
values of $\beta$. As $\beta \to \infty$, the optimality loss
approaches 0, but the upper bound on the mixing time scales in
the order of $O(4^{|I_j|}) \cdot O(\ln \frac{1}{\epsilon})$ and approaches infinity. That
is, a large $\beta$ results in very slow convergence.
Fig. 7. Suppose that $C_3$ fails due to an attack or a network failure; its
connection latency is then measured as infinity. Thus, the original solution space
$\mathcal{F}$ should be trimmed to $\mathcal{G}$ by eliminating all states that relate to $C_3$. That is, in
the space $\mathcal{F} \setminus \mathcal{G}$, every single trimmed state $\tilde{f}_n$ ($n = 1, 2, \ldots, |I_j|-1$) is associated
with the failed committee $C_3$.
V. THEORETICAL ANALYSIS OF COMMITTEE FAILURE
A. When a Member Committee Fails
When running the online distributed SE algorithm, we are
particularly interested in what happens if a member committee
fails. For example, a committee might go offline
if it is under a DoS (Denial of Service) attack. In practice, the
final committee can detect a failed member committee by
using the ping network protocol. Once a member committee
is found to have a large ping delay, it is viewed as failed.
We then study the following two crucial questions.
• Can we still use the proposed stochastic-exploration algorithm
when a member committee fails?
• What is the performance perturbation brought by the
failed committee?
These two questions bear on the practicality and robustness
of the proposed algorithm. Thus, we give the related
theoretical findings in the following subsection.
B. Performance Perturbation Brought by a Failed Committee
When a committee fails, the feasible solutions associated
with this committee become invalid. That is, those
solutions should be removed from the original solution space
$\mathcal{F}$, which then turns into a trimmed space $\mathcal{G}$. We record
the removed solutions in the set $\mathcal{F} \setminus \mathcal{G}$. As shown in Fig. 7,
the longest feasible solution $f_{|I_j|}$ is removed first. After that,
all the candidate solutions that include the failed committee,
i.e., $\tilde{f}_n$ ($n = 1, 2, \ldots, |I_j|-1$), are migrated to $\mathcal{F} \setminus \mathcal{G}$. The
associated transition rates become invalid as well.
It can be proved that the new Markov chain is
still irreducible after removing these solutions and
their corresponding transition rates. Thus, Algorithm
1 only needs to keep running with the trimmed solution space
and the updated transition rate matrix in real time.
We then study the performance perturbation brought by
the failed committee. First, the stationary distribution of the
solutions in the trimmed Markov chain is denoted by $\mathbf{q}^* : [q^*_g(\mathbf{u}), g \in \mathcal{G}]$.
We define another vector $\tilde{\mathbf{q}} : [\tilde{q}_g(\mathbf{u}), g \in \mathcal{G}]$
to represent the distribution of the solutions $g \in \mathcal{G}$ at the exact
moment when a member committee fails. To measure the distance between the
two vectors $\mathbf{q}^*$ and $\tilde{\mathbf{q}}$, we define $d_{TV}(\mathbf{q}^*, \tilde{\mathbf{q}})$ as their total
variation distance [19]. Thus, we have the following lemma.
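Lemma 4: The total variation distance between $\mathbf{q}^*$ and $\tilde{\mathbf{q}}$
is bounded by
$$d_{TV}(\mathbf{q}^*, \tilde{\mathbf{q}}) \triangleq \frac{1}{2} \sum_{g \in \mathcal{G}} |q^*_g - \tilde{q}_g| \leq \frac{1}{2}. \tag{14}$$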
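Proof: Referring to (6), the stationary distribution of the
solutions in the trimmed Markov chain is:
$$q^*_g = \frac{\exp(\beta U_g)}{\sum_{g' \in \mathcal{G}} \exp(\beta U_{g'})}, \quad \forall g \in \mathcal{G}. \tag{15}$$
Next, we analyze the instant distribution of the solutions $g \in \mathcal{G}$
in the trimmed Markov chain, i.e., $\tilde{\mathbf{q}}$. It is not hard to see that
the surviving solutions $g \in \mathcal{G}$ still follow the original stationary
distribution at the exact moment when a committee fails. Thus,
$$\tilde{q}_g = \frac{\exp(\beta U_g)}{\sum_{f' \in \mathcal{F}} \exp(\beta U_{f'})}, \quad \forall g \in \mathcal{G}. \tag{16}$$
Since $\mathcal{G} \subset \mathcal{F}$, we have $q^*_g \geq \tilde{q}_g, \forall g \in \mathcal{G}$. Therefore,
$$q^*_g - \tilde{q}_g \leq \frac{\exp(\beta U_g)}{\sum_{g' \in \mathcal{G}} \exp(\beta U_{g'})} - \frac{\exp(\beta U_g)}{\sum_{f' \in \mathcal{F}} \exp(\beta U_{f'})} = \frac{\exp(\beta U_g)}{\sum_{g' \in \mathcal{G}} \exp(\beta U_{g'})} - \frac{\exp(\beta U_g)}{\sum_{f' \in \mathcal{G}} \exp(\beta U_{f'}) + \Delta}, \quad \forall g \in \mathcal{G}, \tag{17}$$
where $\Delta = \sum_{\check{g} \in \mathcal{F} \setminus \mathcal{G}} \exp(\beta U_{\check{g}})$.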
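Thus, $d_{TV}(\mathbf{q}^*, \tilde{\mathbf{q}})$ is calculated as follows:
$$d_{TV}(\mathbf{q}^*, \tilde{\mathbf{q}}) = \frac{1}{2} \sum_{g \in \mathcal{G}} |q^*_g - \tilde{q}_g| = \sum_{g \in g_o} (q^*_g - \tilde{q}_g), \tag{18}$$
where $g_o \triangleq \{ g \in \mathcal{G} : q^*_g \geq \tilde{q}_g \}$, and $g_o \subseteq \mathcal{G}$.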
Furthermore, the system utilities $U_f$ ($f \in \mathcal{F}$) are independent of
each other and follow the normal distribution. That is, $U_f$ ($f \in \mathcal{F}$)
are independent and identically distributed (i.i.d.) discrete random
values, and the expectation of the system utility exists within
the finite solution space $\mathcal{F}$. Let this expectation be denoted
by $\bar{U}$. Referring to the law of large numbers [20], we
have $\sum_{f \in \mathcal{F} \setminus \mathcal{G}} \exp(\beta U_f) = |\mathcal{F} \setminus \mathcal{G}| \exp(\beta \bar{U})$ with probability
1 and $\sum_{f \in \mathcal{F}} \exp(\beta U_f) = |\mathcal{F}| \exp(\beta \bar{U})$ with probability 1.
Therefore, we can continue as
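$$\begin{aligned}
d_{TV}(\mathbf{q}^*, \tilde{\mathbf{q}}) &= \sum_{g \in g_o} (q^*_g - \tilde{q}_g) \leq \sum_{g \in \mathcal{G}} (q^*_g - \tilde{q}_g) \\
&= \frac{\sum_{g \in \mathcal{G}} \exp(\beta U_g)}{\sum_{g' \in \mathcal{G}} \exp(\beta U_{g'})} - \frac{\sum_{g \in \mathcal{G}} \exp(\beta U_g)}{\sum_{f' \in \mathcal{G}} \exp(\beta U_{f'}) + \Delta} \\
&= 1 - \frac{\sum_{g \in \mathcal{G}} \exp(\beta U_g)}{\sum_{f' \in \mathcal{G}} \exp(\beta U_{f'}) + \sum_{f \in \mathcal{F} \setminus \mathcal{G}} \exp(\beta U_f)} \\
&= \frac{\sum_{f \in \mathcal{F} \setminus \mathcal{G}} \exp(\beta U_f)}{\sum_{f' \in \mathcal{F}} \exp(\beta U_{f'})} = \frac{|\mathcal{F} \setminus \mathcal{G}| \exp(\beta \bar{U})}{|\mathcal{F}| \exp(\beta \bar{U})} = \frac{|\mathcal{F} \setminus \mathcal{G}|}{|\mathcal{F}|}.
\end{aligned}$$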
On the other hand, we already know that the size of the set $\mathcal{F}$
is $\sum_{n=0}^{|I_j|} \binom{|I_j|}{n} = 2^{|I_j|}$. Now we calculate the size of the
trimmed space $\mathcal{G}$. Recall that, in each of the trimmed feasible
solutions $\tilde{f}_n$ ($n = 1, 2, \ldots, |I_j|-1$), the failed committee is
one of the permitted committees. Thus, the number
of independent Markov chain states in $\mathcal{G}$ is $\sum_{m=0}^{|I_j|-1} \binom{|I_j|-1}{m} = 2^{|I_j|-1}$.
Therefore, $|\mathcal{F} \setminus \mathcal{G}| = 2^{|I_j|} - 2^{|I_j|-1} = 2^{|I_j|-1}$. Finally,
$\frac{|\mathcal{F} \setminus \mathcal{G}|}{|\mathcal{F}|} = \frac{1}{2}$. This concludes Lemma 4.
Leveraging Lemma 4, we have the following theorem on
the utility perturbation when any single committee fails.
Theorem 2: Suppose that a single committee fails during
the running of Algorithm 1. Then the performance perturbation is
bounded by
$$\| \mathbf{q}^* \mathbf{u}^T - \tilde{\mathbf{q}} \mathbf{u}^T \| \leq \max_{g \in \mathcal{G}} U_g, \tag{19}$$
where $\max_{g \in \mathcal{G}} U_g$ (denoted by $\tilde{U}_{max}$) represents the utility
under the best solution in the new state space $\mathcal{G}$.
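Proof: The utility perturbation is calculated as follows:
$$\| \mathbf{q}^* \mathbf{u}^T - \tilde{\mathbf{q}} \mathbf{u}^T \| = \Big\| \sum_{g \in \mathcal{G}} (q^*_g - \tilde{q}_g) \cdot U_g \Big\| \leq \tilde{U}_{max} \cdot \sum_{g \in \mathcal{G}} |q^*_g - \tilde{q}_g| = \max_{g \in \mathcal{G}} U_g \cdot 2 d_{TV}(\mathbf{q}^*, \tilde{\mathbf{q}}) \leq \max_{g \in \mathcal{G}} U_g.$$
This concludes the proof.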
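The bound of Theorem 2 can be illustrated with a few numbers. The following is a minimal sketch, assuming non-negative utilities (so that $|U_g| \leq \tilde{U}_{max}$ holds) and the distributions of (15) and (16). The utility values and the function name are hypothetical.

```python
import math


def utility_perturbation(surviving_utils, removed_utils, beta):
    """Return (|q* u^T - q~ u^T|, max utility in G) for the given toy utilities."""
    w_g = [math.exp(beta * u) for u in surviving_utils]
    z_trim = sum(w_g)                                            # normalizer over G
    z_full = z_trim + sum(math.exp(beta * u) for u in removed_utils)  # normalizer over F
    q_star = [w / z_trim for w in w_g]    # form of (15)
    q_tilde = [w / z_full for w in w_g]   # form of (16)
    perturbation = abs(sum((a - b) * u
                           for a, b, u in zip(q_star, q_tilde, surviving_utils)))
    return perturbation, max(surviving_utils)


# Hypothetical utilities: three surviving states and two removed states.
perturbation, bound = utility_perturbation([0.2, 1.0, 0.7], [0.9, 0.4], beta=2.0)
print(perturbation, bound)
```

With these values the perturbation stays below the best surviving utility, as (19) states.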
Remark 3: Theorem 2 implies that although a committee's
failure might reduce the utility by up to $\max_{g \in \mathcal{G}} U_g$, the proposed
SE algorithm still works. We only have to recalculate the
theoretical bounds on the algorithm's mixing time $t_{mix}(\epsilon)$
(i.e., convergence time) when a committee failure occurs,
by referring to Theorem 1 with the updated parameter $\eta = \sum_{n=0}^{|I_j|-1} \binom{|I_j|-1}{n} = 2^{|I_j|-1}$,
$\tilde{U}_{max}$, and $\tilde{U}_{min} = \min_{\forall g \in \mathcal{G}} U_g$.
VI. PERFORMANCE EVALUATION
A. Simulation Settings
Dataset. We generate a synthesized multiple-shard transaction
dataset from real-world Bitcoin TXs [21].
The adopted Bitcoin TXs are sampled from the historical
TX snapshot, which contains the first 1,500,000 (1500K)
TXs recorded in January 2016. We randomly sample 1378
transaction blocks; each item includes 4 fields:
blockID, bhash (block hash), btime (creation timestamp
of this block), and txs (number of TXs). For each epoch,
those blocks are divided into a different number of groups to
simulate the transaction shards generated by member committees.
In each shard, the total number of TXs is accumulated
over all included blocks. We calculate the two-phase
latency of each shard by considering the delays of both
the committee formation and the intra-committee consensus. In
detail, the committee-formation latency of each committee is
generated by solving the PoW puzzle, in which the expectation
of the solving latency is set to 600 seconds. Then, the
consensus latency of each committee is obtained by running
the PBFT protocol, in which the total consensus latency is
the sum of the voting times spent on the three stages
(i.e., pre-preparation, preparation, and commit) of PBFT. The
expectation of the consensus latency is set to 54.5 seconds. For
the dynamic online cases, we also generate the dataset used
for multiple epochs. In each epoch, multiple committees join
the final committee depending on their two-phase latency.
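The two-phase latency described above can be imitated with a short sampling routine. The following is only a sketch under stated assumptions: the text gives the two expectations (600 s and 54.5 s) but not the distributions, so exponential distributions and an equal split of the PBFT mean over the three stages are assumed here purely for illustration. The function and constant names are hypothetical.

```python
import random

POW_MEAN_S = 600.0   # expected committee-formation latency stated in the text
PBFT_MEAN_S = 54.5   # expected total consensus latency stated in the text


def sample_two_phase_latency(n_committees, seed=None):
    """Sample formation + consensus latency (seconds) for each committee.

    Assumption: both phases are exponentially distributed; the paper only
    specifies their expectations.
    """
    rng = random.Random(seed)
    latencies = []
    for _ in range(n_committees):
        formation = rng.expovariate(1.0 / POW_MEAN_S)
        # Three PBFT stages (pre-preparation, preparation, commit), each
        # assumed to contribute one third of the expected consensus latency.
        stages = [rng.expovariate(3.0 / PBFT_MEAN_S) for _ in range(3)]
        latencies.append(formation + sum(stages))
    return latencies


latencies = sample_two_phase_latency(50, seed=7)
print(min(latencies), max(latencies))
```

Committees with a small sampled two-phase latency would arrive at the final committee early, which is the heterogeneity the scheduling problem has to cope with.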
Fig. 8. Convergence of the Stochastic-Exploration (SE) algorithm under different
$\Gamma$ ($\Gamma$ is defined as the number of distributed parallel execution threads).
Other Parameters. The number of member committees
varies from 50 to 1000. The weight of the total number of
TXs, i.e., $\alpha$, ranges from 1.5 to 10. By default, $\beta$ and $\tau$ are
set to 2 and 0, respectively. For all online cases,
$N_{min}$ is fixed to $50\% \cdot |I_j|$,
and $N_{max}$ is set to 80%.
B. Baseline Algorithms
We also implement 3 other state-of-the-art baseline algorithms
to compare their performance with that of the proposed SE.
SA [22]: Simulated Annealing (SA) is a conventional
heuristic algorithm that searches for the global optimal solution by
simulating the annealing process of metals.
DP [23], [24]: Dynamic Programming (DP) is a classical
decision-making technique for mathematical optimization. The
basic idea is to simplify a complicated problem by dividing it
into several simpler sub-problems which can be solved in an
iterative manner.
WOA [25], [26]: The Whale Optimization Algorithm (WOA)
[26] was proposed in 2016 to solve optimization problems by
imitating the hunting mechanism of humpback whales.
Note that both SA and WOA execute in an iterative
way, as does the proposed SE algorithm.
C. Effect of the Number of Parallel Execution Threads
As described in Section IV-D, the proposed SE algorithm
can run in a different number of distributed parallel threads to
accelerate convergence. We denote the number of
parallel execution threads as $\Gamma$. Fig. 8 shows the performance
of SE when varying $\Gamma$ from 1 to 25 and fixing $|I_j|$=500,
$\hat{C}$=500K, $\alpha$=1.5. When $\Gamma$=1, the SE algorithm
executes in a single thread. From the results, we see that when
$\Gamma$ becomes larger, the utility of the SE algorithm
converges faster, and the final converged utility also
grows. However, the benefit of enlarging the number of parallel
execution threads saturates when $\Gamma$ exceeds 10. This
is because an appropriate number of parallel threads can help
the SE algorithm find good feasible solutions more quickly, but
too many parallel threads bring no further gain since the convergence
speed of the SE algorithm is restricted by its theoretical bound
given in Theorem 1.
D. Dynamic Event-Handling
To show that the proposed SE algorithm can handle
dynamic committee joining and leaving events, we first
generate a dataset that manually forces a committee
to fail and then to recover within an epoch of sharding
operation. This dataset imitates the dynamic events
where a committee fails due to some network anomaly (i.e.,
the leaving event) and then comes back online shortly afterwards (i.e., the
joining event). Fig. 9(a) illustrates that the SE algorithm
quickly converges again when a committee either fails or
recovers. The performance perturbation brought by
the leaving event is large. However, the results
show that the SE algorithm can still quickly find a good
converged solution within the trimmed solution space. Next, in the
simulation shown in Fig. 9(b), we evaluate how the SE
algorithm performs while new committees keep joining the
final committee consecutively. The results show that SE
converges to the maximum within the first few hundred iterations
after each new committee joins. Thus, we believe that the
proposed SE is able to handle both leaving and
joining events for the sharded blockchain.
Fig. 9. Results of dynamic-event handling, with parameters $\alpha$ (the weight of
the number of TXs) = 1.5 and $\Gamma$=1. (a) Committee's leaving & joining, $|I_j|$=50, $\hat{C}$=40K.
(b) Committee's consecutive joining, $|I_j|$=100, $\hat{C}$=80K.
Fig. 10. Algorithm's Valuable Degree, which is defined as the ratio
of the total number of processed TXs to the cumulative age of the TXs
packaged in the permitted shards.
E. Valuable Degree of Algorithms
To evaluate the efficacy of the solutions yielded by each
algorithm, we define a Valuable Degree, which is calculated
as $\sum_{i \in I_j, j \in J} (x_i^j \cdot s_i^j / \Pi_i^j)$. It can be seen that the Valuable
Degree is positively related to the total number of processed
TXs, and inversely related to the cumulative age of the TXs
packaged in the permitted shards. A large Valuable Degree
indicates that the corresponding algorithm yields a highly
valuable solution.
Fig. 11. Convergence of algorithms with a fixed set of committees, while varying $|I_j|$={500, 800, 1000}, and fixing $\alpha$=1.5, $\Gamma$=10, $\hat{C}$=1000×$|I_j|$.
(a) $|I_j|$=500, $\hat{C}$=0.5 million, $\alpha$=1.5. (b) $|I_j|$=800, $\hat{C}$=0.8 million, $\alpha$=1.5. (c) $|I_j|$=1000, $\hat{C}$=1 million, $\alpha$=1.5.
Fig. 12. Convergence of algorithms with a fixed set of committees, while varying $\alpha$={1.5, 5, 10}, and fixing $|I_j|$=50, $\Gamma$=25, $\hat{C}$=50,000 (50K).
(a) $|I_j|$=50, $\hat{C}$=50K, $\alpha$=1.5. (b) $|I_j|$=50, $\hat{C}$=50K, $\alpha$=5. (c) $|I_j|$=50, $\hat{C}$=50K, $\alpha$=10.
Fig. 13. Distribution of the converged utilities with a fixed set of committees, while varying $\alpha$={1.5, 5, 10}, and fixing $|I_j|$=50, $\Gamma$=25, $\hat{C}$=50,000 (50K).
(a) $|I_j|$=50, $\hat{C}$=50K, $\alpha$=1.5. (b) $|I_j|$=50, $\hat{C}$=50K, $\alpha$=5. (c) $|I_j|$=50, $\hat{C}$=50K, $\alpha$=10.
Fig. 14. Online execution with committee's consecutive joining events, while varying $\alpha$={1.5, 5, 10}, and fixing $|I_j|$=50, $\Gamma$=25, $\hat{C}$=40,000 (40K).
(a) $|I_j|$=50, $\hat{C}$=40K, $\alpha$=1.5. (b) $|I_j|$=50, $\hat{C}$=40K, $\alpha$=5. (c) $|I_j|$=50, $\hat{C}$=40K, $\alpha$=10.
In the simulation shown in Fig. 10, we compare the
Valuable Degree of the member committees chosen
by the 4 algorithms, under the parameter settings $|I_j|$=500,
$\hat{C}$=500K, $\alpha$=1.5, $\Gamma$=25. We observe that the proposed SE
algorithm achieves the highest Valuable Degree compared
with the other baselines. SA performs close to the
proposed SE algorithm but with a lower Valuable Degree.
Surprisingly, the DP and WOA algorithms generate solutions with
low Valuable Degrees. This result suggests that DP and
WOA fail to help the final committee choose the most
valuable member committees.
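The Valuable Degree defined in this subsection can be computed directly from a selection of shards. The following is a minimal sketch for a single epoch, assuming that $x_i^j$ is the 0/1 indicator of whether shard $i$ is permitted, $s_i^j$ is its number of TXs, and $\Pi_i^j$ is its cumulative-age term. The function name and the sample values are hypothetical.

```python
def valuable_degree(selected, tx_counts, cumulative_ages):
    """Sum of x_i * s_i / Pi_i over the shards of one epoch.

    selected        -- 0/1 indicators x_i (1 if the shard is permitted)
    tx_counts       -- number of TXs s_i in each shard
    cumulative_ages -- cumulative-age term Pi_i of each shard (must be > 0)
    """
    if not (len(selected) == len(tx_counts) == len(cumulative_ages)):
        raise ValueError("all three sequences must have the same length")
    total = 0.0
    for x, s, age in zip(selected, tx_counts, cumulative_ages):
        if x:
            if age <= 0:
                raise ValueError("cumulative age must be positive")
            total += s / age
    return total


# Hypothetical epoch with three shards, of which the first and third are permitted.
print(valuable_degree([1, 0, 1], [100, 200, 300], [50.0, 10.0, 100.0]))
```

Summing the per-epoch values over all epochs $j \in J$ gives the full quantity; a solution that admits shards with many TXs and a small cumulative age scores higher.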
F. Varying $|I_j|$ with a Fixed Set of Arrived Committees
In this group of simulations, we evaluate the impact
of the number of committees by varying $|I_j|$ within
{500, 800, 1000} and fixing a set of arrived member committees.
Fig. 11(a)-Fig. 11(c) illustrate the utilities of the algorithms
under the parameters $\alpha$=1.5, $\Gamma$=10, and $\hat{C}$=1000×$|I_j|$.
We see that the performance of the proposed SE algorithm
eventually converges and maintains around 20%-30% higher
converged utilities compared with the other baselines. Furthermore,
we observe that the performance gaps between SE and the other
baselines seem to widen when the number of shards increases
from 500 to 1000. We attribute this finding to the fact that it
becomes easier for the SE algorithm to explore a better solution when the
number of feasible solutions $f_n$ grows. In contrast, the utility
of DP becomes higher than that of SA while $|I_j|$ varies from
500 to 1000. WOA always has the lowest converged utility
among the 4 algorithms.
G. Effect of Varying $\alpha$ under Two Cases
We then evaluate the weight of the number of all processed
TXs by varying $\alpha$ within {1.5, 5, 10}, and setting $|I_j|$=50 and
$\Gamma$=25. Fig. 12 and Fig. 13 illustrate the converged utilities
in the case of a fixed set of arrived committees, and
Fig. 14 shows the case with committee's consecutive joining
events. In Fig. 12 and Fig. 13, we see that an increasing $\alpha$
makes the converged utilities grow for all algorithms. In addition,
the utility gap between SE and the other baselines seems to
widen when $\alpha$ increases. Then, with a total of 23 committee
joining events in an epoch, we plot the converged utilities in
Fig. 14. Again, we observe that the converged utilities of SE
exceed those of the other 3 baselines by around 20%-30%. Furthermore,
as $\alpha$ increases, the converged utilities
of all algorithms also improve, and their utility distributions
look similar while varying $\alpha$.
VII. CONCLUSION
This paper studies how to schedule the most valuable
committees for the final committee in a large-scale sharded
blockchain. This scheduling problem is denoted by MVCom,
which considers the tradeoff between the number of transac-
tions processed in an epoch and their cumulative waiting la-
tency. We first prove the NP-hardness of the MVCom problem,
and then address it by exploiting the stochastic-exploration
(SE) technique. The theoretical properties of the proposed
SE algorithm, such as the convergence-time boundaries and
the performance perturbation brought by committee’s failure
are also analyzed rigorously in Theorem 1 and Theorem 2,
respectively. The trace-driven simulations demonstrate that the
proposed SE algorithm can handle the committee’s dynamic
joining and leaving events. Then, more extensive results show
that the SE algorithm outperforms the other three baselines, including
the Simulated Annealing, Dynamic Programming and WOA algorithms,
in terms of both the converged system utility and the
valuable degree of the yielded solutions.
VIII. ACKNOWLEDGEMENT
This work was partially supported by the Key-Area Re-
search and Development Program of Guangdong Province
(No.2019B020214006), the National Natural Science Foun-
dation of China (No. 62032025, No. 61902445), the Guang-
dong Basic and Applied Basic Research Foundation (No.
2019A1515011798), and Alibaba Group through Alibaba In-
novative Research (AIR) programme.
REFERENCES
[1] S. Nakamoto, “Bitcoin: A peer-to-peer electronic cash system,” Working
Paper, 2008.
[2] L. Luu, V. Narayanan, C. Zheng, K. Baweja, S. Gilbert, and P. Saxena,
“A secure sharding protocol for open blockchains,” in Proc. of ACM
SIGSAC CCS, 2016, pp. 17–30.
[3] M. Castro, B. Liskov et al., “Practical byzantine fault tolerance,” in
Proc. of OSDI, vol. 99, no. 1999, 1999, pp. 173–186.
[4] H. Huang, W. Kong, S. Zhou, Z. Zheng, and S. Guo, “A survey of
state-of-the-art on blockchains: Theories, modelings, and tools,” ACM
Computing Surveys (CSUR), vol. 54, no. 2, pp. 1–42, 2021.
[5] H. Dang, T. T. A. Dinh, D. Loghin, E.-C. Chang, Q. Lin, and B. C. Ooi,
“Towards scaling blockchain systems via sharding,” in Proc. of the 2019
International Conference on Management of Data, 2019, pp. 123–140.
[6] A. Hari, M. Kodialam, and T. Lakshman, “Accel: Accelerating the
bitcoin blockchain for high-throughput, low-latency applications,” in
IEEE Conference on Computer Communications (INFOCOM). IEEE,
2019, pp. 2368–2376.
[7] M. Zamani, M. Movahedi, and M. Raykova, “Rapidchain: Scaling
blockchain via full sharding,” in Proc. of ACM SIGSAC CCS, 2018,
pp. 931–948.
[8] M. J. Amiri, D. Agrawal, and A. E. Abbadi, “Sharper: Shard-
ing permissioned blockchains over network clusters,” arXiv preprint
arXiv:1910.00765, 2019.
[9] J. Wang, Y. Zhou, X. Li, T. Xu, and T. Qiu, “A Node Rating Based
Sharding Scheme for Blockchain,” in Proc. of IEEE 25th International
Conference on Parallel and Distributed Systems (ICPADS). IEEE, 2019,
pp. 302–309.
[10] L. N. Nguyen, T. D. Nguyen, T. N. Dinh, and M. T. Thai, “Optchain:
optimal transactions placement for scalable blockchain sharding,” in
Proc. of IEEE ICDCS, 2019, pp. 525–535.
[11] H. Chen and Y. Wang, “Sschain: A full sharding protocol for public
blockchain without data migration overhead,” Pervasive and Mobile
Computing, vol. 59, p. 101055, 2019.
[12] J. Wang and H. Wang, “Monoxide: Scale out Blockchains with Asyn-
chronous Consensus Zones,” in Proc. of NSDI, 2019, pp. 95–112.
[13] P. Zheng, Q. Xu, Z. Zheng, Z. Zhou, Y. Yan, and H. Zhang, “Meepo:
Sharded consortium blockchain,” in Proc. of ICDE, 2021, pp. 1–6.
[14] A. Fréville, “The multidimensional 0–1 knapsack problem: An
overview,” European Journal of Operational Research, vol. 155, no. 1,
pp. 1–21, 2004.
[15] M. Chen, S. C. Liew, Z. Shao, and C. Kai, “Markov approximation for
combinatorial network optimization,” IEEE Transactions on Information
Theory, vol. 59, no. 10, pp. 6301–6327, 2013.
[16] S. Boyd and L. Vandenberghe, Convex optimization. Cambridge
university press, 2004.
[17] F. P. Kelly, Reversibility and stochastic networks. Cambridge University
Press, 2011.
[18] D. A. Levin and Y. Peres, Markov chains and mixing times. American
Mathematical Soc., 2017, vol. 107.
[19] P. Diaconis and D. Stroock, “Geometric bounds for eigenvalues of
markov chains,” The Annals of Applied Probability, pp. 36–61, 1991.
[20] K. L. Judd, “The law of large numbers with a continuum of iid random
variables,” Journal of Economic theory, vol. 35, no. 1, pp. 19–25, 1985.
[21] J. Wu, J. Liu, W. Chen, H. Huang, Z. Zheng, and Y. Zhang, “Detecting
mixing services via mining bitcoin transaction network with hybrid
motifs,” IEEE Trans. on Systems, Man, and Cybernetics: Systems, pp.
1–13, 2021.
[22] F. Knust and L. Xie, “Simulated annealing approach to nurse rostering
benchmark and real-world instances,” Annals of Operations Research,
vol. 272, no. 1-2, pp. 187–216, 2019.
[23] R. Mitze and M. Mönnigmann, “A dynamic programming approach to
solving constrained linear–quadratic optimal control problems,” Automatica,
vol. 120, p. 109132, 2020.
[24] M. Han, H. Kim, G. Gu, K. Park, and W.-S. Han, “Efficient subgraph
matching: Harmonizing dynamic programming, adaptive matching order,
and failing set together,” in Proceedings of the 2019 International
Conference on Management of Data, 2019, pp. 1429–1446.
[25] Q.-V. Pham, S. Mirjalili, N. Kumar, M. Alazab, and W.-J. Hwang,
“Whale optimization algorithm with applications to resource allocation
in wireless networks,” IEEE TVT, vol. 69, no. 4, pp. 4285–4297, 2020.
[26] S. Mirjalili and A. Lewis, “The whale optimization algorithm,” Advances
in engineering software, vol. 95, pp. 51–67, 2016.