Patterns for Blockchain Data Migration
HMN Dilum Bandara
Xiwei Xu
firstname.lastname@data61.csiro.au
Data61, CSIRO
Sydney, Australia
Ingo Weber
firstname.lastname@tu-berlin.de
Chair of Software and Business Engineering
Technische Universitaet Berlin
Berlin, Germany
ABSTRACT
With the rapid evolution of technological, economic, and regulatory
landscapes, contemporary blockchain platforms are all but certain
to undergo major changes. Therefore, the applications that rely on
them will eventually need to migrate from one blockchain instance to
another to remain competitive and secure, as well as to enhance the
business process, performance, cost efficiency, privacy, and regula-
tory compliance. However, the differences in data and smart contract
representations, modes of hosting, transaction fees, as well as the
need to preserve consistency, immutability, and data provenance in-
troduce unique challenges over database migration. We first present
a set of blockchain migration scenarios and data fidelity levels using
an illustrative example. We then present a set of migration patterns to
address those scenarios and the above data management challenges.
Finally, we demonstrate how the effort, cost, and risk of migration
could be minimized by choosing a suitable set of data migration
patterns, data fidelity level, and proactive system design. Practical
considerations and research challenges are also highlighted.
CCS CONCEPTS
• Software and its engineering → Software development techniques;
• Applied computing → Enterprise data management;
• Computing methodologies → Distributed algorithms.
KEYWORDS
blockchain, data migration, patterns, smart contract, transactions
ACM Reference Format:
HMN Dilum Bandara, Xiwei Xu, and Ingo Weber. 2020. Patterns for Block-
chain Data Migration. In European Conference on Pattern Languages of Pro-
grams 2020 (EuroPLoP ’20), July 1–4, 2020, Virtual Event, Germany. ACM,
New York, NY, USA, 19 pages. https://doi.org/10.1145/3424771.3424796
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for components of this work owned by others than ACM
must be honored. Abstracting with credit is permitted. To copy otherwise, or republish,
to post on servers or to redistribute to lists, requires prior specific permission and/or a
fee. Request permissions from permissions@acm.org.
EuroPLoP ’20, July 1–4, 2020, Virtual Event, Germany
©2020 Association for Computing Machinery.
ACM ISBN 978-1-4503-7769-0/20/07...$15.00
https://doi.org/10.1145/3424771.3424796
1 INTRODUCTION
Since the launch of Bitcoin over a decade ago [33], an unprecedented
number of blockchain platforms with different designs, features, and
operational models have emerged. While each claims its superiority
over predecessors in terms of performance, features, security, or
governance, given an application scenario, it is non-trivial to identify
a fitting blockchain platform. However, not wanting to lose the early
adopter advantage, even enterprise information systems and business
process management systems are starting to adopt blockchain
platforms [27, 53]. At the same time, as the technological, business,
economic, and regulatory landscapes are still evolving, it is unclear
which blockchain platforms will make the cut in such domains.
Therefore, if the chosen blockchain turned out to be ill-suited, it might
be difficult, costly, and risky to change it. This is due to the incom-
patibilities in platforms, mode of hosting, and blockchain properties
such as consistency, immutability, transparency, and openness. Thus,
even though data migration has been an afterthought, it is imperative
to know the feasibility and caveats of blockchain migration.
An application that uses a blockchain as an underlying data store
may opt to migrate its data due to diverse reasons. For example,
Fig. 1 shows the reasons for data migration cited by 72 blockchain
platforms and Decentralized Applications (DApps) that migrated
their data between July 2017 and April 2020 (see Section 5.2
for details). Business reasons include the interest to launch own
blockchain instances, partnerships, mergers and acquisitions, and
multi-blockchain operations. Another key reason is the emergence
of new blockchain platforms with better performance (i.e., higher
throughput, lower latency, or faster finality), new features, and low
transaction fees compared to incumbent platforms such as Bitcoin
and Ethereum. Essential upgrades due to the blockchain platform
changes, bug fixes, security, and governance issues also lead to mi-
gration. For example, Gartner expects that “through 2021, 90% of
the enterprise blockchain implementations will require replacement
within 18 months to remain competitive and secure, and to avoid
obsolescence” [21]. Most reasons for migration stem from the immaturity
of technical, business, economic, and regulatory facets of
blockchains. However, in the same way that database migration
and enterprise application integration have not gone away, data mi-
gration in the blockchain context is also a lasting problem. For
example, business mergers and acquisitions, establishing/joining a
new consortium in cross-organizational processes, as well as regulatory
changes [22, 32] may also force an organization to move into a
consortium blockchain or Blockchain as a Service (BaaS) platform.
Moreover, business process reengineering [12], separation of internal
and shared data, change of hardware, and consolidation may
also require a migration [32]. Furthermore, an organization may also
adopt multiple blockchains with different cost, performance, and
workload characteristics, which might involve partial data migration.
Figure 1: Reasons for blockchain data migration.
While most applications use blockchain as a data store, blockchains
have several notable differences compared to conventional
databases. For example, a blockchain could be abstracted as a key-value
store that maintains a set of states [37]. Thus, the schema-less
nature makes the mapping between blockchains relatively easy.
While blockchain transactions could perform complex operations
on multiple states, they do not fully support CRUD operations and
ACID properties [37, 48]. Nevertheless, the quality of data on a
blockchain is generally high due to the data consistency and completeness
provided by consensus and immutability, respectively. Optionally,
blockchains might include business logic in the form of
smart contracts [59]. Smart contracts are not only more expressive
and complex than stored procedures but also specific to a blockchain
platform and instance [2]. Thus, smart contracts may need to be
ported to the target blockchain, which can introduce errors even after
significant time and cost are spent to modify/rewrite and test them.
Moreover, as smart contracts have embedded data, ported smart con-
tracts also need to reestablish their data. However, the global state,
transaction sequence, and history cannot be arbitrarily recreated on
the target blockchain due to the consensus process and the need to
preserve consistency, transparency, and data provenance. Even when
they are recreated on the target blockchain, records must be kept on
the blockchain itself on how the new data came into existence and
any changes made to data during the migration process. Furthermore,
as transactions are digitally signed, they may not be replayable across
blockchains without having access to the private keys. Also, block-
chain migration has to be a single-shot process, as rollbacks could
be impossible due to immutability. The replication factor of block-
chains is several orders of magnitude higher than in databases, and
both the replica holders and users have a say in the governance pro-
cess. Thus, in addition to paying transaction fees to offset the cost of
resource utilization, multi-party approval is needed to introduce any
changes to data and platform. Therefore, even when workarounds
are possible, significant time and cost are required to migrate a large
volume of accounts, native assets, states, transactions, and smart
contracts that are interrelated. Consequently, blockchain migration
becomes a nontrivial, costly, and risky process [21] compared to
database migration. While we can learn from database migration
best practices [26, 32] and numerous patterns [41, 42, 47, 55], it is
imperative to answer the following research question to understand
the extent to which we can generalize the related work and identify
new patterns to address unique challenges of blockchains:
What are the effective patterns for safeguarding data when
migrating from one blockchain to another?
In this paper, we first explain a set of blockchain migration scenar-
ios and data fidelity levels using an illustrative example. The example
is derived based on the literature and our experience from several
blockchain projects for industry and government agencies. Second,
we identify ten patterns to achieve those migration scenarios under
varying data fidelity levels. While four of these patterns stem
from database migration, specific adaptations are needed to support
blockchain migration. Three additional patterns from blockchain
and database literature are also adopted to address non-functional as-
pects such as quality of migrated data, cost, and privacy. Third, using
the illustrative example, we discuss how the proposed patterns and
data fidelity levels could be used to address the identified migration
scenarios and data management challenges. The proposed patterns
are applicable to software architects, developers, system administra-
tors, and technical leads who need to plan, develop, configure, and
monitor blockchain and distributed ledger data migration projects.
In conclusion, while migrating to a private or consortium blockchain
could be achieved relatively easily, recreating full blockchain his-
tory on an existing public blockchain is impractical. Nevertheless,
the global state can still be recreated, which is sufficient for most
practical migration scenarios. Therefore, the success of blockchain
migration boils down to choosing a suitable data fidelity level and a
set of data migration patterns that balances competing factors such as
performance, cost, time, effort, granularity of data, transparency, se-
curity, privacy, and risk. Finally, we discuss practical considerations
and several research challenges.
The rest of the paper is organized as follows. Section 2 defines
terminology and presents an illustrative example. Migration scenarios
and data fidelity levels are presented in Section 3. Section 4
explains the methodology we adopted to identify patterns and the
proposed patterns are described in Section 5. Migration pattern to
scenario mapping and a set of use cases are presented in Section 6.
Section 7 discusses practical considerations and research challenges.
Concluding remarks are highlighted in Section 8.
2 PRELIMINARIES
2.1 Background and Definitions
We first define a set of terms related to blockchain data and migration.
While most of these terms originate from Ethereum and Bitcoin,
they are widely adopted by the blockchain industry and research
community [10, 20, 23]. These terms are defined to precisely express
the illustrative example, migration scenarios, data fidelity levels, and
proposed patterns.
A state is anything tracked by a blockchain such as the balance of
an account, Unspent Transaction Output (UTXO), value of an asset,
ownership of an asset represented as a digital token, or an attribute of
a physical object. Public blockchains use one or more native assets
(i.e., default cryptocurrency) for value transfer and transaction fees.
Our definition of state also includes the states embedded within
a smart contract such as tokens and other data. An account (aka.,
address) is a reference/key to a state or smart contract, e.g., owner of
a UTXO in Bitcoin [33] or balance and other data of an Ethereum
account [59]. Global state (aka., world state) is the collection of all
accounts and their current states tracked by the blockchain.
A transaction (TX) is used to initiate a valid state transition. For
example, a transaction can change the ownership of a title or debit
cryptocurrency from one account and credit it to another. A Smart
Contract (SC) is a set of executable instructions that are activated
in response to a transaction. When executing, these instructions
may change the states and call other functions on the blockchain.
History (aka., block history) is the collection of all blocks produced
by the blockchain. Each block tracks all the included transactions
and resulting states after executing those transactions and smart
contracts.
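The definitions above can be made concrete with a small sketch. The Python model below is illustrative only: the names `Transaction`, `Ledger`, and `apply` are assumptions introduced here and do not correspond to any particular blockchain platform. It treats the global state as a key-value map from accounts to balances and the history as an append-only list of blocks, each recording its transactions and the resulting states.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Transaction:
    """A value transfer that triggers a state transition (hypothetical model)."""
    sender: str
    receiver: str
    amount: int


@dataclass
class Ledger:
    """Toy ledger: global state as a key-value map plus an append-only history."""
    global_state: dict = field(default_factory=dict)  # account -> balance
    history: list = field(default_factory=list)       # one entry per block

    def apply(self, txs):
        """Validate and apply a batch of transactions as one block."""
        state = dict(self.global_state)  # work on a copy so a bad block changes nothing
        for tx in txs:
            if tx.amount <= 0 or state.get(tx.sender, 0) < tx.amount:
                raise ValueError(f"invalid transaction: {tx}")
            state[tx.sender] -= tx.amount
            state[tx.receiver] = state.get(tx.receiver, 0) + tx.amount
        self.global_state = state
        # Each block tracks its transactions and the resulting states.
        self.history.append({"txs": list(txs), "state": dict(state)})


ledger = Ledger(global_state={"nonprofit": 5000})
ledger.apply([Transaction("nonprofit", "venue", 1200)])
ledger.apply([Transaction("nonprofit", "marketing", 300)])
```

In this simplified view, migrating the global state means copying `global_state`, whereas migrating the history means reproducing every entry of `history` in order.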
Blockchain platform is the software needed to operate a blockchain
[62], such as the client software for Bitcoin, Ethereum, or
Hyperledger. Execution of such platform software with a specific
set of nodes, storage, permissions, and configurations is referred to
as a blockchain instance. We consider scenarios where an application
using a blockchain instance needs to either move to a different
instance or hardware. The current blockchain instance is referred to
as the source blockchain and the one after the change is referred to
as the target blockchain. The source and target blockchains could be
on the same or different instances with varying platforms and levels
of permissions such as private, consortium, and public.
Different terms such as migration, transfer, conversion, moving,
and replication refer to the processes of copying data from a source
data store to the target. Moreover, blockchain-specific terminology
such as mainnet migration/swap [3, 36], token migration/swap
[31, 54], DApp migration/swap, smart contract migration, bootstrapping
[16, 38], hard spooning [34], and teleportation [45] also refer
to various forms of data copying within and across blockchains.
Therefore, in this paper, we define blockchain data migration as the
process of recreating full or part of the accounts, states, transactions,
smart contracts, and history on the target blockchain while adher-
ing to the consistency, immutability, and transparency properties of
blockchains. Underneath, this includes not only data transfer but
also distributed computation, data exchange, and consensus across
a large number of distributed nodes.
Migration issues may also arise within the scope of a single block-
chain platform, e.g., across versions, smart contract languages, and
APIs. For example, some smart contracts and DApps had to be up-
graded when Ethereum Virtual Machine (EVM) instructions and
their gas consumption were changed to strengthen the security and
optimize the cost and performance [35]. Similarly, Ethereum smart
contracts written in the Serpent language need to be rewritten, as Serpent
is now deprecated due to known weaknesses [4]. Also, the Ethereum
Web3 API has many upgraded and deprecated functions, in turn necessitating
DApp upgrades to remain relevant. The above definition
of blockchain data migration includes scenarios where data need to
be recreated on the same or a different blockchain instance. How-
ever, we exclude reconfiguration, software upgrade, and cross-chain
operation where data remain backward compatible and within the
same blockchain instance. Such examples include changes to DApps
and smart contracts that do not require movement of data, soft forks
where old blocks remain valid under updated state transition rules,
and changing the consensus algorithm in Hyperledger Fabric while
retaining the state database [23]. While we also consider migration
as a one-way process with no return [32], we consider the two options
of continuing or decommissioning the source blockchain after
the migration. Such consideration is necessary as a public blockchain
may continue, even though an alt-coin (i.e., alternative currency)
moves to its own blockchain instance [16, 40, 43].
2.2 Motivating Application Scenario
Based on the reasons for blockchain data migration listed in Fig. 1
and our experience in multiple blockchain projects, we construct a
hypothetical scenario to illustrate different data migration scenarios,
migration options, proposed patterns, and their interrelationships.
This scenario captures the growing concerns on poor performance,
increasing transaction fees, and poor quality of service experienced
in public blockchains, like Ethereum [5], NEO [9], EOS [25], and
IOTA [13].
Suppose a nonprofit hosted a concert to raise funds for its charity
work. The nonprofit pledged to allocate at least 50% of the profit
to the charity work. The nonprofit put up $5,000 as the seed money,
hoping to recover it from the remaining profit. It further planned
to set aside up to $3,000 as next year’s seed money and use the
remaining balance for further charity work. Moreover, they reached
out to several sponsors and donors to raise additional capital. Neces-
sary expenses include payments to key artists (although most artists
volunteered), venue, equipment, and marketing.
The nonprofit decided to use a blockchain, as it hoped better trans-
parency in managing funds would attract more attendees, sponsors,
and donors. Pledge to charity work and recovery of seed money were
defined as smart contracts. In case of a loss, the nonprofit planned
to settle part of the dues to the venue and equipment from a future
event. The nonprofit chose a public blockchain that does not charge
transaction fees. A budget was prepared, and separate accounts were
created for each budget item, and transaction records were main-
tained using the blockchain. A ConcertCoin was defined to minimize
the impact of cryptocurrency fluctuation and enhance the traceability.
Several exchanges also agreed to buy and sell ConcertCoins without
a commission. Seed money was then converted to ConcertCoins and
dispersed to accounts using a funding smart contract. Another set
of smart contracts was defined to disperse the funds, as well as to
enforce budget limits, spending rules, and payments.
The concert was an overwhelming success, partly due to its high
transparency in handling finances and ensuring that committed funds
do go to the charity work. Eventually, the ledger was balanced, and
all the payments, seed money, and next-year concert’s retention were
settled as per the smart contracts. Given the overwhelming success
of the concert and greater transparency it demonstrated, the nonprofit
planned on using the blockchain for other fundraisers and to track
the disbursement of aid.
However, soon after, the blockchain community started charging
transaction fees, as it grappled with spam transactions. Consequently,
exchanges were also forced to charge a commission for
ConcertCoin conversions. It was further realized that the cost and
performance problems are unlikely to be solved soon, as changes
to the blockchain platform’s architecture and incentive mechanism
were needed to fix the issue. Given these circumstances and the need
to integrate future fundraisers and internal activities, the nonprofit
decided to change the blockchain instance.
In this case, the nonprofit has several options, such as moving
to a private or another public blockchain. Under the first option,
the nonprofit could use the same or a different blockchain platform
from a BaaS provider or host its instance on-premise or on the
cloud. The nonprofit also has a range of choices when it comes to
migrating the data to the new blockchain instance, e.g., starting the
application without any history of the concluded concert, with the
closing balances, or with the entire history.
3 BLOCKCHAIN MIGRATION SCENARIOS
We consider different choices for the target blockchain instance,
such as preexisting or new; private, consortium, or public; and same
or a different blockchain platform. We also consider the possibility
of not decommissioning the source blockchain after the migration.
However, we do not attempt to recommend a target blockchain,
and instead assume that the nonprofit has already chosen a target
blockchain based on competing factors such as performance, cost,
time, effort, granularity of data, security, and privacy. The nonprofit
has a few options such as to cash out and start anew, transfer only
the closing balance(s), or transfer the global state and history to
the target blockchain. In the context of the nonprofit’s migration
requirements, the following data migration scenarios can be identified:
(1) Relocate – changes the hardware on which the blockchain
instance runs. For example, the nonprofit may run the same
blockchain platform as an on-premise or cloud (IaaS or BaaS)
instance to gain cost efficiency, performance, and privacy.
(2) Upgrade – creates a new blockchain or smart contract instance
to gain better performance, security, privacy, novel
features, and lower cost while losing backward compatibility.
For example, the nonprofit may move from the proof of work to
the proof of stake version of the same blockchain platform to get
better performance and cost-efficiency [18, 31].
(3) Consolidate – combines two or more existing blockchains into a
single target blockchain. For example, the concert application
could move to another existing blockchain to get benefits
similar to upgrade.
(4) Separation – forks off one or more target blockchains and
partitions the global state across them. For example, the concert
application running on the public blockchain can declare
independence by forking off its instance and managing all its
states on the new instance.
(5) Archive – creates a full archive of block history on the target
blockchain. For example, the nonprofit may create an archive,
as the source blockchain prunes old blocks to reduce storage
or goes out of business. Moreover, the archive can be used to
serve read-only workloads targeting transaction validation,
auditing, data analytics, and public access.
Differences between the source and target blockchain platforms,
their existence, and mode of deployment determine what and how
data can be extracted from the source and recreated on the target.
Moreover, as per the third golden rule of Morris’s data migration [32],
“no one needs, wants, or will pay for perfect data.” However, in the
context of blockchains, the absolute correctness of asset ownership,
data provenance, and smart contracts is essential to gain complete
trust. Also, transaction history may need to be retained for years, e.g.,
the banking and finance industry typically retains transaction history
for five to seven years. Therefore, given a migration scenario, it is
imperative to identify what data to migrate and what to retain on the
source blockchain while finding the right balance among blockchain
constraints; differences between platforms, their existence, and mode
of deployment; preserving trust, auditability, and compliance; data
consistency, granularity, and utility; and cost and time to migrate.
Thus, while moving to the target blockchain, the nonprofit may
decide to migrate data at different levels of fidelity as follows:
(1) Fresh Start – Start a new application round/instance without
migrating any blockchain data. For example, the nonprofit
could cash out any remaining ConcertCoins and disregard
blockchain state, smart contracts, and history on the source
blockchain. Next year, it can start the concert application
anew on a different blockchain instance.
(2) State Only – Migrate a chosen subset of states that is essential
to transact during the next application round to the target
blockchain. For example, the nonprofit could transfer next
year’s seed money to the new blockchain instance. It may
also transfer the closing balance of each budget item when
commitments to vendors/artists are pending, or spending rules
are attached to budget items.
(3) State and Transactions – Migrate both the chosen states and
associated transactions. For example, in addition to recreating
closing balance(s), it may be necessary to look up past transactions
of the concert application on the target blockchain.
(4) Genesis and Transactions – Initial states and all subsequent
transactions are migrated to the target blockchain to recreate
the global state and full history. This enables the nonprofit to
preserve the integrity of the data and facilitate auditing.
(5) Blockchain History – Full history of the source blockchain
(including state, transactions, smart contracts, and blocks) is
migrated to the target blockchain. This enables the nonprofit
to facilitate transparency, auditing, and data analytics.
As we go down the levels, completeness of the data migrated to
the target blockchain increases. All related smart contracts and their
embedded data need to be migrated as per the selected fidelity level.
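The five fidelity levels can be read as successively larger selections over the source data. The following Python sketch illustrates this reading under simplifying assumptions: the `Fidelity` enum, the `select_data` function, and the dictionary layout of `source` are hypothetical names introduced here, and smart contracts are omitted for brevity.

```python
from enum import IntEnum


class Fidelity(IntEnum):
    FRESH_START = 1
    STATE_ONLY = 2
    STATE_AND_TRANSACTIONS = 3
    GENESIS_AND_TRANSACTIONS = 4
    BLOCKCHAIN_HISTORY = 5


def select_data(source, level, accounts):
    """Pick what to migrate from `source` (a plain dict, hypothetical layout)."""
    if level == Fidelity.FRESH_START:
        return {}
    state = {acc: source["state"][acc] for acc in accounts}
    if level == Fidelity.STATE_ONLY:
        return {"state": state}
    if level == Fidelity.STATE_AND_TRANSACTIONS:
        txs = [tx for tx in source["transactions"]
               if tx["from"] in accounts or tx["to"] in accounts]
        return {"state": state, "transactions": txs}
    if level == Fidelity.GENESIS_AND_TRANSACTIONS:
        # Initial states plus every transaction allow the global state to be replayed.
        return {"genesis": dict(source["genesis"]),
                "transactions": list(source["transactions"])}
    # BLOCKCHAIN_HISTORY: everything, including the blocks themselves.
    return {key: source[key] for key in ("genesis", "state", "transactions", "blocks")}


source = {
    "genesis": {"nonprofit": 5000},
    "state": {"nonprofit": 3000, "venue": 1200, "artists": 800},
    "transactions": [
        {"from": "nonprofit", "to": "venue", "amount": 1200},
        {"from": "nonprofit", "to": "artists", "amount": 800},
    ],
    "blocks": [{"height": 1, "txs": [0]}, {"height": 2, "txs": [1]}],
}

seed_only = select_data(source, Fidelity.STATE_ONLY, ["nonprofit"])
```

Here `seed_only` carries just the closing balance of the `nonprofit` account, which corresponds to transferring next year's seed money in the illustrative example.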
4 METHODOLOGY
Many different constraints need to be overcome while migrating data
under the above scenarios and fidelity levels. While each migration
scenario is unique, we can always learn from the commonly occur-
ring problems and recurring solutions to those. These reusable solu-
tions can be formulated as a set of technology and implementation-
independent patterns. Patterns demonstrate techniques and strategies
to meet the requirements of a blockchain migration project while
minimizing risk and cost.
Data migration patterns are derived from our experience in multi-
ple blockchain projects and related work. While database migration
has been discussed in the formal literature, details on blockchain
migration are available only in grey literature (e.g., white papers,
online news sites, blogs, and videos). Therefore, we used a Multivocal
Literature Review (MLR) [17] – a form of systematic literature
review that includes grey literature – as our methodology to explore
the reasons and techniques for blockchain data migration. The MLR
process started with our research question in Section 1. We used
Google Scholar, News, and Search as the data sources. Search key-
words included “blockchain,” “DApp,” “token,” “smart contract,”
“migration,” “swap,” “mainnet,” and their combinations. Snowball
sampling was used to expand the pool of potential sources by fol-
lowing citations and links from already found sources. The process
was repeated until theoretical saturation was reached. Then the pool
of sources was filtered according to predefined inclusion and ex-
clusion criteria, e.g., “migration must include movement of data,”
“must specify reasons and techniques for migration,” and “migration
must be completed.” Information sources from the blockchain platform
and DApp governance body were given higher preference over
third-party sources.
Key formal literature in the vetted pool of sources included smart
contract migration [2, 11], blockchain patterns [61, 62], database migration
and patterns [19, 26, 32, 41, 42, 55], and cloud patterns [47].
While 128 web pages related to blockchain and DApp migration
were selected, only the ones related to the presented patterns and
with a detailed explanation of the migration process are listed in the
paper [1, 3–5, 7–10, 13–16, 18, 20, 22, 24, 25, 28, 29, 31, 34–36, 38–40,
43–46, 49–52, 54, 56–58, 60, 63–65]. Finally, the vetted pool of
sources was analyzed using open and axial coding. This process was
iterated several times to derive a set of generic and related patterns.
Consequently, we identified reasons (see Fig. 1) and techniques used
by 72 cases of blockchain and DApp data migrations between July
2017 and April 2020.
5 BLOCKCHAIN MIGRATION PATTERNS
5.1 Migration Architecture
We explain the patterns in the context of data migration architecture
illustrated in Fig. 2. Similar to database migration, we envision
the migration team will utilize a tool (either developed in-house
or off-the-shelf) to simplify the migration process. The migration
tool could follow the Extract, Transform, and Load (ETL) process
[19, 42] to copy data from the source blockchain and recreate them
on the target blockchain. However, phases of the ETL process may
be interchanged as per the chosen set of migration patterns. For
example, a state could be transformed as a token on the source
blockchain before extraction. Moreover, to preserve the consistency
and accountability, all data transformations must be recorded on
either the source or target blockchain. Thus, most transformations
happen within a blockchain rather than in a separate staging area.
Scattered arrows in the figure indicate this.
Figure 2: Data migration architecture.
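The ETL flow with recorded transformations can be sketched as follows. This is a minimal Python illustration, not the design of an actual migration tool: the functions `extract`, `transform`, `load`, and `digest`, the dictionary-based blockchains, and the choice to log SHA-256 digests as the on-chain migration record are all assumptions made here for illustration.

```python
import hashlib
import json


def extract(source_state, accounts):
    """Extract: copy the chosen states from the source blockchain."""
    return {acc: source_state[acc] for acc in accounts}


def transform(states, rename):
    """Transform: map source account identifiers to target identifiers."""
    return {rename.get(acc, acc): value for acc, value in states.items()}


def load(target, states, record):
    """Load: recreate the states on the target and log how they came to exist."""
    target["state"].update(states)
    target["log"].append(record)


def digest(obj):
    """Deterministic SHA-256 digest of a JSON-serializable object."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode()).hexdigest()


source_state = {"0xA1": 3000, "0xB2": 1200, "0xC3": 0}
target = {"state": {}, "log": []}  # stand-in for the target blockchain
rename = {"0xA1": "acc-nonprofit", "0xB2": "acc-venue"}

extracted = extract(source_state, ["0xA1", "0xB2"])
transformed = transform(extracted, rename)
load(target, transformed, {
    "op": "migrate",
    "source_digest": digest(extracted),    # what was read from the source
    "target_digest": digest(transformed),  # what was written to the target
    "rename": rename,                      # the transformation that was applied
})
```

Keeping the record next to the recreated states lets an auditor recompute both digests and check that the only change made during migration was the recorded renaming.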
Due to the incompatibilities between the source and target block-
chains’ data representations, as well as the creation of new accounts,
smart contracts, and replay of transactions, changes may be needed
at the Blockchain Access/API Layer (BAL) [37, 47]. Similar to
the data access layer in databases, BAL abstracts the connectivity
to the blockchain. It may also map application-level references to
blockchain identifiers (ID) as they are very different. For example, a
username used by a DApp needs to be mapped to the user’s address
or public key on the blockchain. Such application-level reference to
blockchain ID mapping is usually maintained in a protected data-
base within BAL, which we refer to as the ID database. When the
application holds the user’s private key (e.g., custodial wallet), keys
may also be maintained in this database. Therefore, in addition to
updating the BAL to integrate the target blockchain, the ID database
within the BAL needs to be updated to reflect new account and smart
contract addresses, keys, and transaction IDs during the migration.
Moreover, the ID database can be used to identify what accounts, states,
transactions, and smart contracts to migrate, as blockchains try to
be anonymous by not keeping track of applications and their users.
Furthermore, even the migration tool could use the BAL to access
both the blockchains. Therefore, BAL and its ID database are an
integral part of the migration architecture. The dotted lines in Fig. 2
show the flow of account (accID), smart contract (scID), and
transaction (txID) identifiers from/to the ID database.
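The role of the ID database during migration can be illustrated with a short sketch. The `IdDatabase` class and its methods below are hypothetical and stand in for whatever protected store a BAL actually uses. The sketch covers only account identifiers (accID); smart contract and transaction identifiers would be handled analogously.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class IdEntry:
    """Blockchain identifiers held for one application-level reference."""
    source_id: str                   # accID on the source blockchain
    target_id: Optional[str] = None  # accID on the target, set during migration


class IdDatabase:
    """Toy ID database kept inside the BAL (hypothetical interface)."""

    def __init__(self) -> None:
        self._entries: Dict[str, IdEntry] = {}

    def register(self, app_ref: str, source_id: str) -> None:
        self._entries[app_ref] = IdEntry(source_id)

    def pending(self) -> List[str]:
        """Source identifiers that still have to be migrated."""
        return sorted(e.source_id for e in self._entries.values()
                      if e.target_id is None)

    def record_migration(self, source_id: str, target_id: str) -> None:
        """Store the new identifier once the account exists on the target."""
        for entry in self._entries.values():
            if entry.source_id == source_id:
                entry.target_id = target_id
                return
        raise KeyError(source_id)

    def resolve(self, app_ref: str) -> str:
        """Return the identifier the BAL should use for this reference."""
        entry = self._entries[app_ref]
        return entry.target_id or entry.source_id


ids = IdDatabase()
ids.register("alice", "0xA1")
ids.register("venue", "0xB2")
ids.record_migration("0xA1", "tgt-001")
```

After the call to `record_migration`, the application keeps referring to `alice` while the BAL resolves that name to the new identifier, and `pending()` lists what the migration tool still has to move.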
5.2 Pattern Overview
Fig. 3 shows the proposed blockchain data migration patterns, their
grouping, and relationships. We grouped the patterns used to copy
data between source and target blockchains based on the phases
in the ETL process. Patterns related to smart contracts and non-
functional aspects are also arranged into respective groups. For ex-
ample, the snapshotting pattern in the state extraction pattern group
is used to extract states from the source blockchain. States can be
transformed before or after the extraction using state transformation
patterns. For example, the state aggregation pattern can combine
multiple states to reduce the volume of data to be migrated. States
could be marked as unusable on the source blockchain using the
token burning pattern. Moreover, snapshotting could also be used
to specify the list of states to be aggregated or marked as unusable.
Patterns in the state and transaction load group can be applied to
recreate/load states and transactions on the target blockchain. For
example, the establish genesis pattern can be used to spin up a new
blockchain instance by including the extracted data in the genesis
block. In contrast, the extracted data can be appended to an existing
blockchain using the hard fork pattern. The exchange transfer pattern
can be used to transfer tradable assets/states between blockchains,
whereas the state initialization pattern allows arbitrary recreation of
other forms of states. The node sync pattern can establish the full history on
the target blockchain. Smart contract patterns present techniques
to migrate smart contracts either by reusing the smart contract ex-
ecution environment on the target blockchain or translating smart
contracts. Smart contract patterns need to be used before recreating
the states using state and transaction load patterns.
Non-functional patterns could assist other pattern groups to opti-
mize storage, enhance security, and measure the success of migration.
For example, it is expensive and impractical to store a large volume
of data on the target blockchain. However, when the source block-
chain is decommissioned, we still need to store large snapshots,
details of state aggregation, as well as updates to the ID database and
smart contracts on the target blockchain to preserve accountability,
Figure 3: Overview of patterns.
transparency, and integrity. In such cases, the off-chain data storage
pattern [61] could be used to add a Proof of Existence (PoE) entry
by adding a hash of the required data to the blockchain while storing
the actual data off-chain. While migrating sensitive data to a public
blockchain, the encrypting on-chain data pattern [61] could be used to
encrypt the states and transaction data to enhance data confidentiality.
The measure migration quality pattern [55] uses metrics to
define success criteria for migration, as per the set application and
organizational objectives. Therefore, it is useful in determining a
suitable data fidelity level for a given migration scenario, as well
as in confirming that the migration has met the
set quality objectives. These non-functional patterns are taken from
related work [55, 61] to form a complete pattern collection. However,
they do not need extensive adaptations to use with blockchain data
migration; hence, they are not presented in detail. In contrast, shaded
boxes in the figure indicate the six new blockchain-specific data
migration patterns we identified. We also present four other patterns
from database migration, which need to be extended to work with
unique properties of blockchains.
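The Proof of Existence idea mentioned above (a hash on-chain, the data elsewhere) can be sketched as follows. This is an illustrative sketch only: the function names poe_digest and verify_poe are hypothetical, SHA-256 over canonical JSON is an assumed encoding, and how the digest is actually written to a blockchain is platform-specific and not shown.

```python
import hashlib
import json


def poe_digest(data: dict) -> str:
    """Digest that would be stored on-chain as the PoE entry (assumed scheme)."""
    # Canonical serialization so the same data always yields the same digest.
    canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_poe(data: dict, on_chain_digest: str) -> bool:
    """Check that data kept outside the chain still matches the on-chain digest."""
    return poe_digest(data) == on_chain_digest
```

Anyone holding the stored data can recompute the digest and compare it with the on-chain entry; any change to the data yields a different digest.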
Multiple technical and process aspects such as consensus, im-
mutability, state conversion, order of transactions, transaction fees,
key management, and privacy need to be taken into account to pre-
serve consistency, accountability, transparency, and provenance of
data during the migration. Hence, the proposed patterns reflect a
combination of data migration, process, and cloud architectural pat-
terns. Moreover, these patterns could be applied to migrate data
of a DApp and an entire blockchain alike. The intended audience
for the proposed patterns is software architects, developers, system
administrators, and technical leads who have to deal with the data
migration life cycle from planning, design, execution to performance
testing.
Next, we describe the patterns using the extended pattern form [30]
and use the following eight elements to present essential details.
The summary presents a high-level description of what a pattern does.
Context defines the circumstances under which the problem exists
and needs to be solved. Forces are the contradictory considerations
that must be taken into account while solving the problem under the
given context. Problem and solution have the usual meaning, where
they specify which aspect of migration is to be solved and how to solve
it. Consequences illustrate how the proposed solution resolves the
forces while addressing the problem. Other patterns that may be of
interest while solving the bigger problem are given under related
patterns. Known use illustrates the applications of the pattern.
5.3 State Extraction Pattern
5.3.1 Pattern 1 – Snapshotting.
Summary. Get a snapshot of states, smart contracts, and transac-
tions on the source blockchain.
Context. The latest states and transactions of all accounts and
smart contracts of the concert application need to be migrated to the
target blockchain. Source blockchain is currently active; hence, the
global state continues to change as new transactions arrive. The state
to be migrated is already aggregated on the source blockchain.
Problem. How to get a complete account of states, smart contracts,
and transactions on the source blockchain before migration?
Forces.
• Anonymity – A blockchain does not track the ownership
  of accounts, states, smart contracts, and transactions of an
  application to enhance the anonymity.
• Consistency – Must capture the latest states and transactions
  of all accounts and smart contracts on the source blockchain.
  However, getting a globally consistent view of a distributed
  system is hard, e.g., it is difficult to freeze all transactions
  precisely at the same time on a distributed set of nodes.
Figure 4: Process of taking a snapshot.
• Finality – A state change may not be confirmed immediately
  after a transaction is included in a block.
• Latency – Time to collect states, smart contracts, and
  transactions is proportional to the number of accounts and smart
  contracts, as well as their states and transactions.
Solution. Get a snapshot of relevant states, smart contracts, and
transactions on the source blockchain at a given time. Fig. 4 shows
the process of making a snapshot. First, select a block number to
initiate the snapshotting process and the number of blocks to wait for
finality (aka x-confirmation). Second, update all instances of
the BAL. Third, the BAL should wait until the chosen block number
is reached. Once reached, it should freeze processing further
transactions to prevent any state changes. The BAL should further wait
for x-confirmation to ensure the finality of transactions already
included in a block. Once it is reached, the migration tool can extract
all required states, smart contracts, and transactions by querying the
source blockchain via the BAL. Finally, all extracted data are saved
as a snapshot file.
Time is not reliable in blockchains due to clock skew. Therefore,
both the point at which to freeze transactions and x-confirmation
should be specified as block numbers because they are consistent
across all blockchain nodes. If the entire blockchain is being migrated,
all blockchain nodes need to freeze transaction processing at the set
block number. In case the BAL or nodes are not designed to freeze
transactions at a set block number, they need to be patched (aka soft fork).
Such changes usually require approval from the blockchain governance body.
Further considerations include the time needed to develop and patch
all the BAL instances or nodes, as well as to notify cryptocurrency
exchanges and users about the potential downtime. Given such technical,
operational, and governance considerations, the governance
body should decide on the block number to freeze transactions
well ahead of time. For example, the typical time frame for a public
blockchain migration ranges from weeks to months. x-confirmation is
essential in Nakamoto consensus [33] based blockchains, as finality
is probabilistic. A relatively higher value of x-confirmation
is used to minimize the probability of a state change after the
migration. For example, while Bitcoin and Ethereum typically use
x-confirmations of six and 12, respectively, they are almost
doubled during migration. The time that the BAL or blockchain nodes
remain frozen can be minimized by first taking a snapshot of the
relevant data after the set block number and x-confirmation, then
iteratively going through the new blocks that were generated since
then to find only the updated states. If only the application-specific
data are migrated, corresponding accounts can be found from the ID
database. States represented as native assets of a blockchain platform,
transactions, and smart contract code can be queried using the API
exposed by the blockchain client nodes. If a smart contract exposes
getter functions, its embedded state can be easily extracted.
Otherwise, states have to be extracted by going through the global state
data structure of a blockchain node. If the entire blockchain is to
be migrated, the migration tool needs to go through the global state
and entire block history to extract relevant accounts, states, smart
contracts, and transactions. While a blockchain explorer can simplify
this process, it is essential to validate that the explorer is in sync
with the source blockchain. All data extractions can be parallelized
as they are read-only.
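The block-number-based freeze logic described above can be sketched as a small decision function. This is a simplified sketch under stated assumptions: the names BalPhase, bal_phase, and updated_accounts are hypothetical, block heights are plain integers, and each block is modelled as a dictionary listing the accounts it touched. It does not query a real blockchain client.

```python
from enum import Enum


class BalPhase(Enum):
    PROCESSING = "processing"   # before the chosen block: business as usual
    FROZEN = "frozen"           # chosen block reached: waiting for finality
    READY = "ready"             # x-confirmation reached: safe to extract


def bal_phase(height: int, freeze_block: int, x_confirmation: int) -> BalPhase:
    """Decide what the BAL should do at the current block height."""
    if height < freeze_block:
        return BalPhase.PROCESSING
    if height < freeze_block + x_confirmation:
        return BalPhase.FROZEN
    return BalPhase.READY


def updated_accounts(blocks: list, snapshot_block: int) -> set:
    """Accounts touched by blocks newer than the snapshot block.

    Models the optimization of re-extracting only the states that changed
    after an initial snapshot was taken.
    """
    updated = set()
    for block in blocks:
        if block["number"] > snapshot_block:
            updated.update(block["touched"])
    return updated
```

Because both thresholds are block numbers, every BAL instance reaches the same decision for the same block height, regardless of local clocks.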
Consequences.
• Consistency is preserved as all blockchain activities are frozen
  on distributed instances of the BAL or blockchain nodes, and
  the snapshot is created after reaching time to finality.
• Transactions need to be frozen only while capturing the
  states, smart contracts, and transactions that were updated
  after the snapshotting process was initiated. Thus,
  a snapshot on a blockchain can be taken relatively faster than
  in a database, which needs to freeze all transactions due to
  the difficulty in determining the data that changed once
  the migration process begins.
• The ID database simplifies the identification of application-specific
  data, overcoming anonymity.
• This pattern can be applied to states, smart contracts, and
  transactions, and any combination thereof.
• Latency depends on the time to finality and freeze time.
  Freeze time depends on the parallelization of state, smart
  contract, and transaction extraction. Hence, this pattern is
  more desirable when the source blockchain is private, or the
  application does not interact with any external state.
Related patterns. To recreate the collected states, smart contracts,
and transactions on the target blockchain, use state and transaction
load patterns, such as establish genesis, hard fork, and state
initialization. Also, state transformation, as well as state and transaction
load patterns, could use the snapshot to decide which states and smart
contracts to transform, extract, and load to the target blockchain.
Known use. Snapshotting is commonly used to bootstrap blockchains,
sidechains, and new nodes. For example, a snapshot file
format and bootstrap procedure for Bitcoin alt-coins are presented in
[38]. EOS [31], Telos [15], TOMO [39], Tron [31], and Vechain [54]
alt-coins took a snapshot while freezing all the transactions when
they moved away from Ethereum. Bithereum [34] took a snapshot
of the Ethereum global state to spin up a duplicate blockchain
(aka hard spoon). Moreover, both Bitcoin and Ethereum support
dumping the history at different granularities and bootstrapping a
new node using the dumped snapshot. Such daily snapshots are available
on the web. Snapshots are also used while migrating a DApp
to a different smart contract or blockchain instance. For example,
Tronbet took a snapshot of all ANTE tokens before exchanging them
for the upgraded WIN tokens managed by a new smart contract [52].
5.4 State Transformation Patterns
5.4.1 Pattern 2 – State Aggregation.
Summary. Aggregate a set of states into a single state (or a few states).
Context. Concert application has a large number of accounts and
states. Extracting all states from the source blockchain and recreating
them on the target blockchain is both costly and time-consuming.
The list of states to be migrated is given in the snapshot or ID
database.
Problem. How to extract and recreate a large number of states
while minimizing the time and cost?
Forces.
• Size – A large number of accounts and their states to be
  migrated to the target blockchain.
• Latency – Time to extract and recreate the states is
  proportional to the number of accounts and their states.
• Cost – On a public blockchain, each state transfer/update
  needs to pay a transaction or exchange fee. The fee could also
  be proportional to the size of the state.
• Consistency – Must capture all accounts and states on the
  source blockchain. Any transformation of data before/after
  the migration must not violate consistency property.
• Accountability – Any data transformed to simplify the
  migration must be recorded with proof.
Solution. Transfer all ConcertCoins to a single account, such
that only the closing balance needs to be migrated. Fig. 5 shows
the sequence of activities required to perform such an aggregation
of state. To aggregate blockchain native assets, first, create a new
account on the source blockchain. To aggregate states embedded in
smart contracts, deploy a new Smart Contract (SC). Both cases will
produce a new address (aggregateAddress). Then get the users’
consent to transfer their states by signing a transaction with the
current state as payload and aggregateAddress as the recipient.
Next, submit the signed transaction to the source blockchain. Finally,
trigger the aggregate function at the aggregateAddress.
The summation of native assets will be atomically performed
once a transaction is executed. Hence, native assets do not need
a separate aggregate function. Smart contracts could be used to
perform more generic aggregation scenarios. For example, a smart
contract can aggregate fungible tokens, similar to native assets.
Non-fungible tokens that could be mapped to a set of binary states can
be concatenated to a bitmap represented as a single state. In a more
general case, a set of n states could be aggregated to a set of m states
(m << n), e.g., closing account balance vs. closing balances of each
budget item. Once the aggregate function is called, it must lock the
[Sequence diagram between the Migration Tool, User, and Aggregate Account/SC]
Figure 5: Object interaction during state aggregation.
account or smart contract using the token burning pattern to prevent
further aggregations. When a user holds the private key, he/she needs
to sign the transfer transaction via the application or wallet software.
It is pragmatic for the migration tool to submit transactions to the
blockchain on behalf of the users, as it can track the transactions and
trigger the aggregate function as soon as all the transactions are
included. In practice, this is achieved by integrating the BAL or user
wallet with the migration tool. Either the ID database or snapshot
can be used to find the list of accounts and their states to be
aggregated. Unless the application can work with the aggregated state(s)
on the target blockchain (e.g., closing balance), the state needs to
be disaggregated to match the states that were on the source
blockchain. Disaggregation could be achieved by running relevant inverse
functions and smart contracts on the target blockchain, e.g.,
ConcertCoins can be split into multiple accounts as per the state recorded in
the snapshot. Aggregation or disaggregation could also be performed
after extracting states from the source blockchain depending on the
transaction fees, performance, and complexity of functions. Loss of
accountability due to such external transformations can be overcome
by adding the list of aggregations as a PoE entry to the blockchain.
Transaction fees need to be paid in public blockchains to transfer
states and run the aggregate function. While public blockchain
users typically bear this cost, sometimes they are reimbursed while
recreating the state on the target blockchain. Alternatively, the
migration tool could top-up/reimburse user accounts before/after the
state transfer [65].
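The aggregation and disaggregation steps above can be sketched off-chain as plain functions. This is a minimal sketch, not a smart contract: the function names are hypothetical, balances are integers keyed by account ID, and the bitmap assumes each non-fungible token maps to one binary state at a fixed position.

```python
def aggregate_balances(balances: dict) -> int:
    """Fungible case: many account balances reduce to one closing balance."""
    return sum(balances.values())


def to_bitmap(flags: list) -> int:
    """Binary-state case: pack n boolean states into a single integer state."""
    bitmap = 0
    for position, flag in enumerate(flags):
        if flag:
            bitmap |= 1 << position
    return bitmap


def from_bitmap(bitmap: int, count: int) -> list:
    """Inverse of to_bitmap: recover the n boolean states."""
    return [bool((bitmap >> position) & 1) for position in range(count)]


def disaggregate(total: int, snapshot: dict) -> dict:
    """Split the aggregated balance back into accounts as per the snapshot."""
    if sum(snapshot.values()) != total:
        raise ValueError("snapshot does not add up to the aggregated balance")
    return dict(snapshot)
```

The check in disaggregate reflects the consistency force: the states recreated on the target blockchain must add up to exactly what was aggregated on the source blockchain.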
Consequences.
• The number of accounts, their states, data size, and latency to
  extract and recreate are reduced.
• Consensus and the ability to record aggregation/disaggregation
  operations within the blockchain provide greater
  consistency and accountability compared to database migration.
• In public blockchains, the cost could be reduced as the fees for
  transactions required for aggregation within the same blockchain are
  typically lower than the exchange fees or inter-blockchain
  transactions. However, the cost depends on the internal and
  external transaction fees, as well as the complexity and number
  of transactions and smart contracts required for
  aggregation/disaggregation.
• Transaction fees need to be either paid separately or could be
  deducted from the native assets being aggregated.
• While this pattern works on any blockchain, it works only on
  states that could be reduced to a single or a few values.
Related patterns. Aggregation can be performed before or after
snapshotting and before token burning. Moreover, the set of states to
be aggregated can be found using the snapshotting pattern. Relevant
state and transaction load patterns could rely on this pattern to
reduce the number of accounts and states to recreate. Off-chain data
storage pattern is needed to add a PoE entry when the aggregation is
performed outside the source blockchain.
Known use. Aggregation is used while bootstrapping a token
on the target blockchain. For example, Storj [46] and Binance [3]
used the aggregated token balance from their source blockchains
to set the initial balance of ERC-20 and BEP-2 token contracts,
respectively. This pattern was also used in the Ethereum DAO hard fork
to aggregate blocked funds and transfer them back to the users [58].
When the Bitcoin network is less congested, multiple UTXOs can be
aggregated to produce a single UTXO by paying a small transaction
fee. The resulting UTXO can be later spent during times of high
network congestion while paying a lower transaction fee, as the overall
transaction size is reduced [57].
5.4.2 Pattern 3 – Token Burning.
Summary. Make states and smart contracts on the source block-
chain unusable before the migration.
Context. Because the source blockchain used by the nonprofit
is public, it is not decommissioned after the migration. Therefore,
any state and smart contracts left in the source blockchain could
be misused (e.g., double spending). The list of states and smart
contracts to be migrated is given in the snapshot or ID database.
Problem. How to ensure states and smart contracts exist only on
the target blockchain after the migration?
Forces.
• Immutability – State of accounts and smart contracts on a
  blockchain is immutable.
• Consistency – If the source blockchain is not decommissioned,
  states and smart contracts could be used in both the
  blockchains leading to misuse, such as double-spending
  attacks [44]. Any transformation of data to prevent misuse must
  not violate consistency properties.
• Accountability – Any data transformation to prevent misuse
  must be recorded with proof.
Solution. Use transactions to transfer states such as native assets
and tokens to an unusable account. Delete tokens by calling respec-
tive functions on the smart contracts that created them. Similarly,
smart contracts can call the self-destruct function. All such attempts
to make states and smart contracts unusable are referred to as token
burning.
While transferring states/assets, it is vital to ensure that the
recipient address (aka burn address) is invalid, i.e., there will not
be a corresponding private key that can control the states sent to
that address. Therefore, it is essential to use the burn address
recommended by the chosen blockchain platform. Another alternative is
to deploy a smart contract that immediately self-destructs as soon as
it receives a transaction. However, it is more costly as smart contract
deployment and execution are relatively expensive. A smart contract
and its state can be made unusable using the self-destruct function,
e.g., the selfdestruct function in Ethereum. In case a self-destruct
function is not built into the smart contract, carefully crafted
transaction(s) may be attempted to set the state to a terminating state [4].
However, this approach is not guaranteed to work or may fail in the
future once vulnerabilities are found. Users holding the private keys
need to sign the transactions that burn their states or trigger
relevant smart contract functions. The burned state is typically verified
before recreating the state on the target blockchain (aka proof of
burn) [56]. If a custodial wallet maintains aggregated state, a user
trying to establish proof of burn for his/her state needs to transfer
the state to a non-custodial wallet before burning the state. If the
source and target blockchains are different instances, the service of
an oracle could be used to attest the state from one blockchain to
another. The migration tool may also work as an oracle, as it has
access to both the blockchains. However, when transaction fees are
charged, the burned state may not be an accurate representation of
the original state. Sometimes token burning is performed after the
migration, so that the states are retained as a rollback option in case
of a failed migration. The risk of maintaining states on two blockchains
needs to be carefully evaluated based on the permission type of
blockchains, migration window, and the ability to prevent users from
issuing transactions within the migration window. The risk could
also be reduced by burning only one state at a time and recreating
that on the target blockchain.
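The proof-of-burn check described above can be sketched as follows, from the point of view of a migration tool acting as an oracle. Everything named here is an assumption for illustration: the BurnTx record, the verified_burn_amount function, and the BURN_ADDRESS placeholder are hypothetical, and a real deployment should use the burn address recommended by the chosen platform and read transaction details from a blockchain node.

```python
from dataclasses import dataclass

# Illustrative placeholder only; use the platform-recommended burn address.
BURN_ADDRESS = "0x000000000000000000000000000000000000dEaD"


@dataclass(frozen=True)
class BurnTx:
    sender: str
    recipient: str
    amount: int          # amount that reached the burn address
    fee: int             # fee paid in the same asset (assumed for this sketch)
    confirmations: int   # blocks built on top of the burn transaction


def verified_burn_amount(tx: BurnTx, required_confirmations: int) -> int:
    """Return the amount that may be recreated on the target blockchain."""
    if tx.recipient.lower() != BURN_ADDRESS.lower():
        raise ValueError("recipient is not the burn address")
    if tx.confirmations < required_confirmations:
        raise ValueError("burn transaction is not yet final")
    return tx.amount


def original_state(tx: BurnTx) -> int:
    """State held before burning, when the fee was deducted from the same asset."""
    return tx.amount + tx.fee
```

The gap between original_state and verified_burn_amount illustrates why the burned state may not exactly represent the original state when transaction fees are charged.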
Consequences.
• Immutability, consistency, and accountability are preserved as
  all state changes are performed within the source blockchain.
  When recreating states on the target blockchain, proof of burn
  should be verified.
• This pattern works on all blockchains. On a public blockchain,
  transaction fees need to be paid to make the states and smart
  contracts unusable.
• A smart contract should either implement a self-destruct
  function or it should be possible to generate a specific transaction
  sequence that sets a smart contract’s state as terminated.
Related patterns. Snapshotting and state aggregation patterns
could be used before token burning. Moreover, the set of states and
smart contracts to burn can be found using the snapshotting pattern.
State and transaction load patterns can verify proof of burn before
initializing a new state on the target blockchain.
Known use. While migrating from Ethereum or Bitcoin to their
own instances, Binance [3], Bitizens [5], KARMA [25], Kin [8],
Safex [36], and Storj [46] asked users to burn their tokens by
transferring them to a designated address. In contrast, EOS [31], Qubicle
[40], and Vechain [54] asked users to burn a small number of tokens
to confirm the ownership of their accounts, and the remaining tokens
were burned by the respective smart contracts that created them.
Qubicle’s token burning followed the EOS inter-blockchain
communication protocol [10, 45]. While tokens are typically burned before
the migration, Binance, KARMA, and Storj tokens were burned after
the migration. This pattern is also used while bootstrapping alt-coins
in Bitcoin [56] and sidechains in Horizen [16].
5.5 State and Transaction Load Patterns
5.5.1 Pattern 4 – Node Sync.
Summary. Create a clone of a blockchain node by synchronizing
blocks and history.
Context. The nonprofit wants to add new nodes or swap nodes to
change the physical location/node, improve performance, or create
multiple instances of the source blockchain. The consensus rules
of the target blockchain platform are compatible with the source
blockchain. A snapshot of the source blockchain is available.
Problem. How to load states and smart contracts to the target
blockchain?
Forces.
• Consistency – States such as blockchain native assets cannot
  be arbitrarily created because migration must preserve system
  invariants. It is difficult to create a complete copy of the ledger
  and block history, as the system is distributed, new blocks are
  built continuously, and finality is not immediate.
• Accountability – Initiation of new accounts and their states
  on the target blockchain must be recorded with proof.
• Size – A blockchain may contain a large number of accounts,
  states, smart contracts, and transactions. Moreover, history
  may consist of a large number of blocks. A large subset of
  these may need to be initiated on the target blockchain.
• Latency – Time to create accounts and initialize their states is
  proportional to the number of accounts and their states. Given
  the large size of the global state and history, time to make a
  copy could be very long. It takes even more time to recreate
  and validate the global state by applying all transactions and
  running smart contracts.
• Cost – Each account creation and state assignment may need
  to pay a transaction fee. The transaction fee could also be
  proportional to the size of the state.
• Consensus – To accept a state, smart contract, transaction,
  or block as valid, both the sender and receiver nodes must
  follow the same consensus rules.
• Governance – The consensus of the blockchain’s governance
  body is required to update any state or blockchain platform
  software.
Solution. The steps shown in Fig. 6 can be used to synchronize the
new node using the blockchain platform-specific sync tool. First,
install the source blockchain platform’s client software (or an up-
dated version that is backward compatible) on the new node. Also,
configure the new node to connect to other members of the source
blockchain. Second, enable the sync tool on the node to copy various
data structures representing the global state, smart contracts, trans-
actions, and blocks from other blockchain nodes. Next, the node
should rebuild and validate all the transactions from the genesis to
verify the global state. Any errors, such as failed data transfer, need
to be resolved by requesting further data. Finally, reconfigure the
node to accept new transactions.
Figure 6: Process of synchronizing a node.
The volume of the global state, smart contract, transaction, and
block data of a highly-active blockchain could be in excess of hun-
dreds of Gigabytes. Hence, many days to weeks could be required
to download and validate them depending on bandwidth, disk IO,
and CPU limitations, as well as ongoing transactions. Several op-
timizations could be applied to reduce the sync time. For example,
the new node can be bootstrapped using a snapshot of the source
blockchain. Then only the data from new blocks created after the
snapshot need to be synchronized. Moreover, validation could be
performed at different levels, such as validation of the last n blocks,
random blocks, entire block history, and entire block history and
transactions. If the objective is to swap blockchain nodes,
decommission the old nodes. If the objective is instead to create multiple
blockchain instances, reconfigure the new node(s) to behave as an
independent blockchain after a set block number. This is usually
achieved by setting a new blockchain ID and disconnecting from the
parent blockchain’s nodes.
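The trade-off between validation levels can be illustrated with a toy hash chain. This sketch is not how any particular client validates blocks: the block layout, the SHA-256 over JSON encoding, and the function names are assumptions, and real platforms also validate transactions, signatures, and consensus rules.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash of a toy block (number, previous hash, and transactions)."""
    payload = json.dumps(
        {"number": block["number"], "prev_hash": block["prev_hash"], "txs": block["txs"]},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(tx_lists: list) -> list:
    """Build a toy chain where each block stores the hash of its predecessor."""
    chain = []
    prev_hash = "0" * 64  # placeholder parent hash for the genesis block
    for number, txs in enumerate(tx_lists):
        block = {"number": number, "prev_hash": prev_hash, "txs": list(txs)}
        chain.append(block)
        prev_hash = block_hash(block)
    return chain


def validate_last_n(chain: list, n: int) -> bool:
    """Check only the hash links leading into the last n blocks."""
    start = max(1, len(chain) - n)
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(start, len(chain))
    )
```

Validating only the last few blocks is faster but cannot detect tampering deeper in the history, whereas passing the full chain length checks every link from the genesis block onward.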
Consequences.
• This pattern works on any blockchain where source and
  target blockchain platforms are compatible (i.e., follow the same
  consensus rules), and the target node(s) is new. No explicit
  approval of the blockchain governance body is needed to add
  a new node.
• Blockchain sync tools rely on asynchronous messaging,
  immutability (to identify states updated after a given block
  number), consensus (to provide consistency), and validation.
  Hence, no state, history, and IDs are lost or changed in the
  process, ensuring consistency, accountability, integrity, and
  immutability.
• No transaction fees are incurred, as data synchronization occurs
  at the blockchain data structure level.
• Time to synchronize a large volume of states and history could
  vary from hours to days. Time can be substantially reduced
  by bootstrapping a node using a snapshot and synchronizing
  only the ledger state (i.e., state only data fidelity level).
Related patterns. Snapshotting pattern can be used to speed up
the bootstrap process.
Known use. This pattern is frequently used to connect new nodes
to an existing blockchain. Bitcoin, Ethereum, and Bithereum [34]
allow new nodes to download a snapshot file that reflects the global
state and blocks at a set block number, and then sync with other
nodes to speed up the process. Moreover, Ethereum supports syncing
data at different granularity levels, such as full, fast, and light [6].
Similarly, Bitcoin supports full node and thin client syncing.
5.5.2 Pattern 5 – Establish Genesis.
Summary. Set the state on the target blockchain’s genesis block.
Context. Concert application has a large number of states, and all
of them need to be recreated on the target blockchain. The nonprofit,
as the blockchain governance body, has decided to spin up a new
blockchain instance as the target and use ConcertCoin as the native
asset. The list of accounts and states to be migrated is given in the
snapshot. After the snapshot, states are marked as unusable using
token burning.
Problem. How to load states and smart contracts to the target
blockchain?
Forces. Contradictory considerations, such as consistency, ac-
countability, size, latency, cost, consensus, and governance must be
taken into account as described in pattern four.
Solution. As seen in Fig. 7, use the snapshot of states from the
source blockchain to set states on the target blockchain’s genesis
block during its initialization. If a user cannot use an existing private
key to prove ownership of a state migrated to the target
blockchain, a new key pair needs to be created. Thus, the first step is to
get each user to create a new key pair and a corresponding account
ID (accID) using the account creation algorithm of the target
blockchain. Second, update the account ID in the snapshot file with accID.
Next, create the genesis block configuration file (aka genesis file)
while including the accounts and states from the updated snapshot
file. Then use the genesis file to initialize the target blockchain.
Update the ID database on the BAL to reflect the new set of accIDs.
Also, add a PoE entry that tracks the mapping between old and new
account IDs to record how the new accounts came into existence.
The first two and last two steps are optional when the accounts can
be reused across the blockchains. However, it is desirable to create
new private and public key pairs, as well as corresponding accounts
on the target blockchain. This can guard against weaknesses that may
appear in the source blockchain, its wallets, and exchanges that could
compromise a private key or seed used to generate it. Further, the use
of new accounts could enhance the anonymity of the application’s
accounts, especially when states are not very specific, such that it
is nontrivial to build a mapping table, e.g., when the state aggregation
pattern is used. User involvement is needed to create key pairs and
new accounts if they hold the private keys. Moreover, platforms such
as Hyperledger require users to enroll their digital certificates with
the certificate authority. If the number of states in the snapshot is
too big to fit into the genesis block, the aggregation pattern could be
used to reduce the state. Native assets are typically set during the
genesis block creation. Typically, smart contracts are not deployed
during the genesis. However, while Hyperledger Fabric [23] does
not use a native asset, it allows smart contracts to be deployed using
the genesis block. The encrypting on-chain data pattern can be used to
encrypt the PoE entry added to the blockchain to hide the mapping
between old and new accounts. For example, Hyperledger Fabric
private data collections can be used to allow a subset of peers to
Migration
Tool User
Tar ge t
Blockchain
Blockchain
Access Layer
Has snapshot
createAccount()
accID
updateSnapshot(
accID)
buildGenesisFile(
snapshot)
createGenesisBlock(genesisFile)
update({oldAccID:accID})
addPoE({oldAccID:accID})
loop [for each User]
Figure 7: Object interaction to create the genesis block.
store the mapping between old and new accounts in a private state
database and share them with each other while adding a hash of that
data to the ledger using a transaction. Time taken by all users to
create new accounts is difficult to predict. Hence, sufficient notice
ranging from a couple of days to weeks needs to be given to the
users before the snapshot and genesis block are created.
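The steps of this solution can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tooling of any particular platform: the snapshot layout, the genesis file fields (chainId, alloc), and the helper names remap_snapshot, build_genesis_file, and poe_digest are all hypothetical. Only the SHA-256 digest of the account mapping is computed here, as a candidate PoE entry.

```python
import hashlib
import json


def remap_snapshot(snapshot, id_map):
    """Return a copy of the snapshot keyed by new account IDs.

    snapshot: {old_acc_id: state}; id_map: {old_acc_id: new_acc_id}.
    Accounts absent from id_map keep their old ID (reused accounts).
    """
    return {id_map.get(old_id, old_id): state for old_id, state in snapshot.items()}


def build_genesis_file(snapshot, chain_id):
    """Assemble a hypothetical genesis configuration with pre-allocated states."""
    return {"chainId": chain_id, "alloc": dict(sorted(snapshot.items()))}


def poe_digest(id_map):
    """Hash the old-to-new account mapping for a proof-of-existence entry."""
    canonical = json.dumps(id_map, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Illustrative data only: two users who each created a new account.
snapshot = {"old-alice": {"tokens": 40}, "old-bob": {"tokens": 60}}
id_map = {"old-alice": "new-alice", "old-bob": "new-bob"}

updated = remap_snapshot(snapshot, id_map)
genesis = build_genesis_file(updated, chain_id=1234)
digest = poe_digest(id_map)
```

Serializing the mapping with sorted keys keeps the digest stable regardless of the order in which users registered their new accounts.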
Consequences.
• Arbitrarily creating states on the genesis block to bootstrap
a blockchain is not considered a consistency violation, as
the genesis block is built by the blockchain governing body
and the content of the block is transparent. Moreover, the
nonprofit, as the blockchain governance body, establishes the
genesis block.
• Migration can be performed fast, as only the genesis block is
used to recreate the state. However, the process slows down
when users have to create new accounts.
• No transaction fees are incurred, as the global state is
initialized by the genesis block’s configuration rather than by
transactions.
• Accountability could be preserved by adding the PoE entry
that tracks the mapping between old and new account IDs.
• This pattern can be used to recreate native assets of any new
instance of a blockchain. The consensus is enforced only for
the format of the genesis block, as the global state is yet to
be established.
• The pattern works only if the states fit into a single block or
can be aggregated to fit into a single block.
Related patterns. State extraction and state transformation pat-
terns could be used to identify, reduce, and burn the states to include
in the genesis block. Use the off-chain data storage pattern to add a
PoE entry that tracks the mapping between old and new account IDs.
Alternatively, if the mapping between account IDs is not very large,
an encrypted version can be added using the encrypting on-chain
data pattern.
Known use. The genesis block is used to specify the initial
distribution of native assets on a blockchain. For example, the genesis
block of Æternity [1] included the ERC-20 tokens of users who
supported the initial migration from Ethereum to its blockchain.
Zeepin [64] used a similar approach as it migrated from NEO to its
blockchain. Telos [15], an instance of the EOS platform, used the
EOS snapshot file to build its genesis block. The pattern is also used
for hard spooning, e.g., Bithereum [34] created its genesis block
using a snapshot from Bitcoin at a set block number.
5.5.3 Pattern 6 – Hard Fork.
Summary. Change the global state of the target blockchain.
Context. Concert application has a large number of states, and all
of them need to be recreated on the target blockchain. The nonprofit
has chosen an existing private or consortium blockchain as the target.
The nonprofit has gained the consent of blockchain’s governance
body to introduce state changes that violate the consensus rules.
The list of accounts and states to be migrated is given in the snap-
shot. After the snapshot, states are marked as unusable using token
burning.
Problem. How to load states and smart contracts to the target
blockchain?
Forces. Contradictory considerations, such as consistency, ac-
countability, size, latency, cost, consensus, and governance must be
taken into account as described in pattern four.
Solution. As seen in Fig. 8, use the states from the snapshot to
change the global state on the target blockchain. If a user cannot
use an existing private key to access the state migrated to the target
blockchain, follow the first two steps to create a new account and
update the snapshot file. Next, update the blockchain client software
on all nodes of the target blockchain to include the snapshot file at
a set block number (blockNo). Once the blockNo is reached, all
blockchain nodes should append the states from the snapshot file into
the ledger to update the global state. Due to the arbitrary addition
of new states, blocks produced before and after the software update
will be incompatible. Such a change in blockchain state transition
rules is referred to as a hard fork [58]. Finally, update the ID database
and add a PoE entry to reflect new account IDs.
While the first two and last two steps are optional when the
accounts can be reused across the blockchains, it is desirable to create
new key pairs and accounts. As discussed in the establish genesis
pattern, this can enhance anonymity and guard against exposure
of the private key due to weaknesses that may appear in the source
blockchain. Similarly, user involvement is needed to create new accounts.
While any state can be amended during a hard fork, deployment of
smart contracts is unprecedented. However, compared to the estab-
lish genesis pattern, there is no limit on the number of states that can
be recreated during a hard fork. Similar to the snapshotting pattern,
approval of the blockchain governance body is needed to update the
blockchain client software. Further, the blockNo to initiate the hard
fork needs to be determined while considering the time needed to
develop and patch all the nodes, as well as the time taken by all users
to create new accounts. Hence, a notice period ranging from a couple
of days to weeks needs to be given before the hard fork [22].
Figure 8: Object interaction during a hard fork. [Sequence diagram
between Migration Tool, User, Target Blockchain, and Blockchain
Access Layer. Starting from an existing snapshot, the messages are
createAccount() returning accID, updateSnapshot(accID),
updateSoftware(blockNo, snapshot), wait(blockNo), append(snapshot),
update({oldAccID:accID}), and addPoE({oldAccID:accID}), with a loop
for each User.]
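The activation rule at blockNo can be illustrated with a toy node model. The sketch below is an assumption-laden simplification: the Node class, its integer balances, and the apply_block method are hypothetical and do not correspond to any real client software. It only shows that a patched node injects the snapshot states once the agreed height is reached, while an unpatched node diverges.

```python
class Node:
    """Toy blockchain node holding a global state and a block height."""

    def __init__(self, fork_block=None, fork_snapshot=None):
        self.height = 0
        self.state = {}
        self.fork_block = fork_block  # None means the node was not patched
        self.fork_snapshot = fork_snapshot or {}

    def apply_block(self, transfers):
        """Advance one block; transfers is a list of (acc_id, delta) pairs."""
        self.height += 1
        for acc_id, delta in transfers:
            self.state[acc_id] = self.state.get(acc_id, 0) + delta
        # Patched nodes append the migrated states once, at the agreed height.
        if self.fork_block is not None and self.height == self.fork_block:
            for acc_id, balance in self.fork_snapshot.items():
                self.state[acc_id] = self.state.get(acc_id, 0) + balance


# Illustrative data only: the snapshot is applied at block number 3.
snapshot = {"new-alice": 40, "new-bob": 60}
patched = Node(fork_block=3, fork_snapshot=snapshot)
unpatched = Node()

for _ in range(4):
    patched.apply_block([("miner", 1)])
    unpatched.apply_block([("miner", 1)])
```

After block 3 the two nodes no longer agree on the global state, which is why every node has to be patched before blockNo is reached.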
Consequences.
• Migration happens at a set block number and is independent of
the number of states and smart contracts to be migrated. Any
node that misses the update will no longer be part of the same
blockchain.
• No transaction fees are incurred, as the global state is altered
by the hard fork rather than by transactions.
• Accountability could be preserved by adding the PoE entry
that tracks the mapping between old and new account IDs.
• While this pattern works with any state, it violates consistency
as the changes do not comply with the consensus rules. Hence,
the consensus of the blockchain’s governance body is required
to update the blockchain software to initiate the hard fork
[58]. Therefore, it is more suitable for private and consortium
blockchains.
Related patterns. Same as pattern five.
Known use. Hard forks are used to spin up new cryptocurrencies.
For example, after launching the blockchain using the establish genesis
pattern, Æternity went through three rounds of hard forks to migrate
the remaining ERC-20 tokens [1]. When Steem was sold to Tron,
objecting users spun up a new cryptocurrency called Hive by forking
the Steem blockchain [22]. Similarly, a conflict over the use of a
hard fork to recover cryptocurrency lost due to the DAO attack [58]
resulted in Ethereum and Ethereum Classic.
5.5.4 Pattern 7 – State Initialization.
Summary. Initialize/recreate states on the target blockchain.
Context. Concert application has a large number of states and
smart contracts, and all of them need to be recreated on the target
blockchain. The nonprofit has chosen an existing public blockchain
as the target. The list of accounts, smart contracts, and states to be
migrated is given in the snapshot or ID database. After the snapshot,
states are marked as unusable using token burning.
Problem. How to load states and smart contracts to the target
blockchain?
Forces. Contradictory considerations, such as consistency, ac-
countability, size, latency, cost, consensus, and governance must be
taken into account as described in pattern four.
Solution. Create one state at a time on the target blockchain using
the steps shown in Fig. 9. If a user cannot use an existing private key
to access the state migrated to the target blockchain, follow the first
step to create a new account. Then get the user to sign a transaction
with the state listed in the snapshot as the payload and new address
(i.e., accID) on the target blockchain as the recipient. Then submit
the signed transaction to the target blockchain to recreate the state.
Update the ID database and add a PoE entry to reflect new account
IDs. It is also desirable to include the snapshot file as a PoE entry,
as, unlike in the establish genesis and hard fork patterns, it is not
included in the target blockchain’s history.
Figure 9: Object interaction during state initialization. [Sequence
diagram between Migration Tool, User, Target Blockchain, and
Blockchain Access Layer. Starting from a snapshot or proof_of_burn,
the messages are createAccount() returning accID, signTX(accID, state)
returning signedTX, submission of signedTX returning txID,
update({oldAccID:accID}), and
addPoE([{oldAccID:accID}, snapshot\proof_of_burn]), with a loop for
each User.]
While the first and last two steps are optional when the accounts
can be reused across the blockchains, it is desirable to create new key
pairs and accounts. As discussed in the establish genesis pattern, this
can enhance anonymity and guard against exposure of the private
key due to weaknesses that may appear in the source blockchain.
Also, user involvement is needed to create new accounts. When users
hold the private keys, they also need to sign the transaction used to
recreate the state. Signed transactions are issued by the migration
tool to the target blockchain, mainly for the convenience of managing
the migration. For example, when the transactions flow through
the migration tool, it is easier to update the ID database and use
the measure migration quality pattern to quantify the progress of
migration. Alternatively, users may send the signed transactions
directly, as each state is initiated using an independent transaction.
In fact, the entire loop can be parallelized. Moreover, rather than
using a snapshot file to identify the list of accounts and states to
recreate, the ID database could be used. However, it is essential that
the latest state on the source blockchain is captured and recreated on
the target blockchain. Thus, the token burning pattern can be used to
guarantee that a state will not change further on the source
blockchain. This pattern is suitable to redeploy smart contracts and
recreate states embedded in them, e.g., tokens. Sometimes a series of
transactions may need to be replayed to set a particular state.
However, this is time-consuming and costly, as transactions need to
be replayed in exact order with finality. Moreover, native assets
cannot be recreated due to the violation of system invariants. Assets
lost as transaction fees during state transformation, token burning,
and setting state could also be credited back while initiating a state.
The time taken by all users to create new accounts and sign
transactions is difficult to predict. Hence, sufficient notice ranging
from a couple of days to weeks needs to be given to the users
[9, 31, 36, 54]. However, user involvement does not slow down the
entire migration process, as each state is independently recreated.
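Because each state is recreated by an independent transaction, the per-user loop can run in parallel and can skip states that are not yet burned. The following Python sketch illustrates this under simplifying assumptions: TargetChain, migrate_state, the dictionary-based transaction format, and the placeholder signature string are all hypothetical stand-ins, and balances are plain integers.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class TargetChain:
    """Stand-in for a target blockchain that accepts token-issuing transactions."""

    def __init__(self):
        self._lock = threading.Lock()
        self.balances = {}
        self.tx_count = 0

    def submit(self, signed_tx):
        """Apply one signed transaction and return a transaction ID."""
        with self._lock:
            self.tx_count += 1
            acc_id = signed_tx["recipient"]
            self.balances[acc_id] = self.balances.get(acc_id, 0) + signed_tx["amount"]
            return f"tx-{self.tx_count}"


def migrate_state(chain, old_id, new_id, amount, burned):
    """Recreate one state, but only if it was burned on the source blockchain."""
    if old_id not in burned:
        return None  # the state could still change on the source; retry later
    # Placeholder signature: a real user would sign with the new private key.
    signed_tx = {"recipient": new_id, "amount": amount, "signature": f"sig({new_id})"}
    return chain.submit(signed_tx)


# Illustrative data only: carol has not burned her tokens yet.
snapshot = {"old-alice": 40, "old-bob": 60, "old-carol": 5}
id_map = {"old-alice": "new-alice", "old-bob": "new-bob", "old-carol": "new-carol"}
burned = {"old-alice", "old-bob"}

chain = TargetChain()
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = {
        old_id: pool.submit(migrate_state, chain, old_id, id_map[old_id], amount, burned)
        for old_id, amount in snapshot.items()
    }
tx_ids = {old_id: future.result() for old_id, future in futures.items()}
migrated_total = sum(chain.balances.values())
```

Comparing migrated_total with the snapshot total gives a simple progress measure in the spirit of the measure migration quality pattern.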
Consequences.
• One state can be initialized or one smart contract can be deployed
at a time without requiring the migration to be a single-shot
operation; hence, the risk is reduced.
• Consistency and accountability are preserved when proof of
burn is verified before the state initialization and a PoE entry
of the snapshot is added to the blockchain.
• This pattern works with any target blockchain, whether it is new
or already exists. No explicit approval of the blockchain governance
body is needed to issue transactions. However, it works only
for states and smart contracts that can be arbitrarily created,
e.g., tokens generated by smart contracts. It does not work on
native assets.
• If the number of accounts and their states is large, many
transactions are needed to set their state, increasing the cost
and latency. Size, cost, and latency could be reduced using
the state aggregation pattern.
Related patterns. Same as pattern five.
Known use. The Binance [3], Bithereum [34], Bitizens [5], Effect.AI
[9], EOS [31], Gifto [18], Kin [8], Qubicle [40], Safex [36], Storj
[46], and VeChain [54] migrations used this pattern to issue new
tokens on the target blockchain. Bithereum, Effect.AI, EOS, Kin,
Qubicle, Safex, Storj, and VeChain also relied on users to create new
accounts on the target blockchain, and report their new addresses to the
migration tool. All except Bithereum and Gifto also used
token burning in combination with this pattern.
5.5.5 Pattern 8 – Exchange Transfer.
Summary. Transfer states via an exchange.
Context. Concert application has a large number of states, and all
of them need to be recreated on the target blockchain. The nonprofit
has chosen an existing public blockchain as the target and Concert-
Coins are to be converted to its native assets. The list of states to be
migrated can be found from the ID database.
Problem. How to load states and smart contracts to the target
blockchain?
Forces. Contradictory considerations, such as consistency, ac-
countability, size, latency, cost, consensus, and governance must be
taken into account as described in pattern four.
Solution. As seen in Fig. 10, use a cryptocurrency/token exchange
to transfer the states. If a user cannot use an existing private key to
access the state migrated to the target blockchain, follow the first step
to create a new account. Then get the user to sign a transaction with
both the state and new account ID (i.e., accID) on the target
blockchain as the payload. Set the address of the exchange (excAddress)
as the recipient. Then submit the signed transaction to transfer the
state to the exchange’s account on the source blockchain. This is similar
to a sell order in financial markets. Once the exchange confirms that
the state is transferred to its account, it will look for a matching buy
order. Once a match is made, the exchange will transfer the state to
the accID from its account on the target blockchain. Finally, update
the ID database to reflect new account IDs (this step is not shown in
Fig. 10).
Figure 10: Object interaction during exchange transfer. [Sequence
diagram between Migration Tool, User, Target Blockchain, Source
Blockchain, and Exchange. The messages are createAccount() returning
accID, signTX(excAddress, [state, accID]) returning signedTX,
submission of signedTX, pull() returning signedTX, match(), and
signTX(accID, state).]
Some of the exchange-based data migration scenarios use a
combination of token burning and state initialization patterns. In
contrast, Fig. 10 captures the behavior of an exchange matching sell
and buy orders from source and target blockchains, which is more
prevalent in decentralized exchanges. The chosen exchange(s) should
support the state to be traded. If the format of states used by the
application and exchange does not match, the state could be tokenized
to a common format (e.g., the ERC-20 standard in Ethereum). This could
be achieved by deploying a smart contract to convert tokens on
the source blockchain. Similarly, tokens could be decoded on the
target blockchain. The number of public blockchains and types of
assets/tokens supported by a centralized exchange are usually limited.
However, a decentralized exchange can be configured to connect
with any blockchain and support any token format agreeable to both
the blockchains. Even the migration tool could be extended to act as
a decentralized exchange. While the first and last steps are optional
when the accounts can be reused across the blockchains, it is desirable
to create new accounts. As discussed in the establish genesis pattern,
this can enhance anonymity and guard against exposure of the private
key due to weaknesses that may appear in the source blockchain.
Also, user involvement is needed to create new accounts. Similar to
the state initialization pattern, signed transactions are issued by the
migration tool to the source blockchain, mainly for the convenience of
managing the migration. Alternatively, users may send the signed
transactions directly, as each state is exchanged using an independent
transaction. Moreover, as centralized exchanges usually keep
custody of a user’s private key, they could automatically transfer the
state without user involvement. In all scenarios, a transaction fee and
commission are usually charged. The time taken by all users to create
new accounts and sign transactions is difficult to predict. Similarly,
the time needed by the match function depends on market demand
for the asset/state. Hence, sufficient notice ranging from a couple of
weeks to months needs to be given to the users [54]. However, this
user involvement does not slow down the entire migration process,
as each state is independently transferred.
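The sell, match, and payout flow can be sketched with a toy exchange. Everything in this example is an assumption made for illustration: the Exchange class, the fixed conversion rate, the flat fee, and the first-in-first-out matching rule are hypothetical, and real exchanges use far richer order books.

```python
from collections import deque


class Exchange:
    """Toy exchange that matches sell orders against queued buy orders."""

    def __init__(self, rate, fee):
        self.rate = rate            # target assets paid per source token
        self.fee = fee              # flat commission, in target assets
        self.buy_orders = deque()   # amounts of source tokens buyers want

    def add_buy_order(self, amount):
        self.buy_orders.append(amount)

    def sell(self, amount, target_acc_id, target_balances):
        """Match a sell order; pay out to accID on the target blockchain."""
        if sum(self.buy_orders) < amount:
            return False  # no match yet; a real order would wait in the book
        remaining = amount
        while remaining > 0:
            head = self.buy_orders.popleft()
            if head > remaining:
                # Partially fill the oldest buy order and keep the rest queued.
                self.buy_orders.appendleft(head - remaining)
                remaining = 0
            else:
                remaining -= head
        payout = amount * self.rate - self.fee
        target_balances[target_acc_id] = target_balances.get(target_acc_id, 0) + payout
        return True


# Illustrative data only.
exchange = Exchange(rate=2, fee=1)
exchange.add_buy_order(30)
exchange.add_buy_order(50)

target_balances = {}
first_matched = exchange.sell(40, "new-alice", target_balances)
second_matched = exchange.sell(60, "new-bob", target_balances)
```

The second order stays unmatched because the remaining demand is too small, which mirrors why the duration of match depends on market demand.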
Consequences.
• One state can be transferred at a time without requiring the
migration to be a single-shot operation; hence, the risk is
reduced. Cost is proportional to the number of states and their
sizes/values to transfer. Bulk/aggregated transfers and the use
of centralized exchanges could reduce size, cost, and latency.
• Because the state transfers to and from the exchange’s
account are recorded on the blockchain, better consistency and
accountability can be achieved. Decentralized exchanges
provide enhanced privacy compared to centralized ones due to
disintermediation.
• This pattern works with both blockchain native assets and
tradable tokens on any blockchain platform. No explicit
approval of the blockchain governance body is needed to trade
assets via a supported exchange. However, the state should be
tradable on the chosen exchange(s) or should be convertible
to a tradable token.
• Exchanges charge transaction fees and apply asset conversion
rates, reducing the value of assets during the transfer.
Related patterns. Same as pattern five.
Known use. During the migration from Ethereum, Atomic [60],
Binance [3], Gifto [18], TOMO [39], Tron [31], and VeChain [54]
requested users to migrate their assets to a selected centralized
exchange. The ChangeNOW [60] token swap service was used by Atomic
to migrate from Ethereum to Binance. KyberSwap also offers a similar
service for ERC-20 tokens [35]. EOS [45] and Interledger [51]
provide protocols to transfer tokens via decentralized exchanges.
5.6 Smart Contract Patterns
5.6.1 Pattern 9 – Virtual Machine Emulation.
Summary. Allow smart contracts written in one language to run
on another blockchain platform.
Context. Concert application uses a set of smart contracts and
embedded states which need to be usable on the target blockchain.
State and transaction load patterns cannot be used to load smart
contracts from the source blockchain to target blockchain, as the
execution environment is not identical. However, the target block-
chain platform could emulate the Virtual Machine (VM) used to
execute smart contracts. The list of smart contracts and their states
to be migrated is given in the snapshot. After the snapshot, smart
contracts are marked as unusable using token burning.
Problem. How to load and run smart contracts from the source
blockchain on the target blockchain?
Forces.
• Platform dependence – Usually, the smart contract language
is blockchain platform-specific.
• Turing completeness – Not all smart contract languages are
Turing complete. Thus, it may not be possible to recreate the
same behavior using a different smart contract language.
• Correctness – It is difficult to guarantee that the rewritten
smart contract behaves precisely like the original smart
contract.
• Time and cost – Rewriting and testing smart contracts takes
time and is hence costly.
Solution. Use the process outlined in Fig. 11 to reuse the smart
contract execution environment (aka VM, sandbox, or container) on
the target blockchain. First, copy the VM from the source blockchain.
Second, integrate the VM into the target blockchain. Third, if the VM
does not hold the smart contract code, redeploy the smart contracts
on the target blockchain using the state initialization pattern. Fourth,
use the same pattern to set the states of the deployed smart contracts.
Smart contract code and their states can be found in the snapshot.
Fifth, update the mapping between old and new smart contract
addresses on the ID database, as smart contract addresses vary across
blockchain instances, and could also depend on the address that
deployed the smart contract, transaction sequence number, among
others. Finally, it is also desirable to include the snapshot file and
mapping between old and new smart contract addresses as a PoE
entry, as they are not included in the target blockchain’s history.
If the target blockchain allows importing the source blockchain’s
VM, it could be copied over. However, in practice, it is more likely
for the target blockchain to use a customized VM that supports
the same instruction set. For example, while Hyperledger Burrow
supports the Ethereum VM (EVM), a proxy is used to abstract the
transaction fee-related parameters, as Hyperledger does not have a
concept of transaction fees. Moreover, not all VMs hold the smart
contract code. For example, while Hyperledger chaincode containers
hold the code, EVM uses the code stored in the global state. Also,
earlier versions of Hyperledger Fabric required smart contracts to be
instantiated on the target blockchain before use. Such instantiation
may also be used to set the address and access rights of a smart
contract. Moreover, even if the addresses can be reused, it is desirable
to create new accounts, as discussed in the establish genesis pattern.
This in turn could change the address of the smart contract. Therefore,
rather than copying the VM, which is likely to require extensive
configuration, it is easier to redeploy the smart contracts on the
target blockchain’s VM. In such cases, the first two steps can be
skipped, given that the two VMs exhibit the same execution behavior.
Such compatibilities could be checked using bytecode-level formal
verification while reducing the cost and complexity of smart contract
validation [2]. When account owners hold the private keys, users
need to create new accounts, redeploy smart contracts, and set their
states. Also, if the reference to a smart contract is provided via a
smart contract registry or transactions are routed via a proxy contract
[61], first, the corresponding registry or proxy contract needs to be
deployed. Then update the registry or proxy with the new smart
contract address.
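The redeploy-then-update-registry sequence can be sketched as follows. This is an illustrative model only: Registry, redeploy, and fake_deploy are hypothetical names, the addresses are made-up strings, and the deploy callable stands in for whatever transaction the target blockchain actually requires.

```python
class Registry:
    """Toy smart contract registry: resolves a contract name to its address."""

    def __init__(self):
        self._addresses = {}

    def register(self, name, address):
        self._addresses[name] = address

    def resolve(self, name):
        return self._addresses[name]


def redeploy(contracts, deploy, registry):
    """Redeploy each contract, update the registry, and map old to new addresses.

    contracts: {name: (old_address, code, state)} taken from the snapshot.
    deploy: hypothetical callable that deploys code with state and returns
            the new address assigned by the target blockchain.
    """
    address_map = {}
    for name, (old_address, code, state) in contracts.items():
        new_address = deploy(code, state)
        registry.register(name, new_address)
        address_map[old_address] = new_address
    return address_map


deployed = []


def fake_deploy(code, state):
    """Stand-in deployment that hands out sequential addresses."""
    deployed.append((code, state))
    return f"0xNEW{len(deployed):02d}"


# Illustrative data only: the registry itself is deployed first.
contracts = {
    "ticketing": ("0xOLD01", "bytecode-a", {"sold": 10}),
    "token": ("0xOLD02", "bytecode-b", {"supply": 100}),
}
registry = Registry()
address_map = redeploy(contracts, fake_deploy, registry)
```

The returned address_map is the old-to-new mapping that would be written to the ID database and, optionally, recorded as a PoE entry.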
Consequences.
• Execution behavior and correctness of the original smart
contract are preserved, as the same smart contract code and
execution environment are used.
• Saves time and reduces cost, as no code reverse engineering,
translation, and testing are needed.
• Works only when the target blockchain platform can emulate
the source blockchain’s VM (i.e., smart contract platform
independence) without any limitations in behavior and Turing
completeness.
Related patterns. When emulation is not possible, use the smart
contract translation pattern. State extraction and state transforma-
tion patterns can be used to capture the smart contract code, em-
bedded states, aggregate the states, and burn the smart contracts.
Use the state initialization pattern to deploy and set the state of
smart contracts. PoE entries of mapping between old and new smart
contract addresses can be added using either off-chain data storage
or encrypting on-chain data patterns.
Known use. EVM bytecode can run on Hyperledger [24], VMware
Concord [11], Tron, and VeChain. For example, Deloitte was able
to move smart contracts from Ethereum to VeChain, as both
blockchains supported the same EVM [28]. Docker containers used by
Hyperledger to store and execute smart contract code could be reused
on multiple channels and across different networks.
5.6.2 Pattern 10 – Smart Contract Translation.
Summary. Translate smart contract code from one language to
another.
Context. Concert application uses a set of smart contracts and
associated states which need to be usable on the target blockchain.
The smart contract language of the target blockchain is not
interoperable; hence, state and transaction load or virtual machine
emulation patterns cannot be used. The list of smart contracts and
their states to be migrated is given in the snapshot.
Figure 11: Process of virtual machine emulation.
Figure 12: Process of smart contract translation.
Problem. How to load and run smart contracts on the target block-
chain?
Forces. Contradictory considerations, such as platform depen-
dence, Turing completeness, correctness, time, and cost must be
taken into account as described in pattern nine.
Solution. Follow the process outlined in Fig. 12 to translate and
deploy a smart contract on the target blockchain. First, verify that the
respective source code produces the exact smart contract deployed
on the source blockchain. Second, translate the smart contract to the
new language. Then, test the functional correctness and security of
the translated contract. Fourth, deploy the new smart contract to the
target blockchain using the state initialization pattern. Fifth, use the
same pattern to set the states of the deployed contract, as per the
state recorded on the snapshot. Next, update the mapping between
old and new smart contract addresses on the ID database, as smart
contract addresses vary across different blockchain instances and
could also depend on the address that deployed the smart contract,
transaction sequence number, among others. Finally, it is also desir-
able to include the original and translated code, snapshot file, and
mapping between old and new smart contract addresses as a PoE
entry, as they are not included in the target blockchain’s history.
Smart contracts updated or rewritten for the same blockchain
instance are within the purview of this pattern because the new smart
contract needs to be redeployed. Further, its states need to be set
to the values of the previous contract. When the smart contract
language is a compiled one, it is more desirable to compile the
source code and then compare it with the binary code on the source
blockchain. Compared to syntax-level comparison, this could re-
veal details such as compiler version and optimizations used during
deployment of the original contract, which may need to be taken
into consideration during the translation. Depending on the language
support and tool availability, translation may happen either at the
source or binary code level. In addition to the functional correctness,
accuracy of data types, their visibility, and smart contract function-
level access control must also be preserved. Also, aspects such as
semantic equivalence, event handling, object-oriented features, and
optimizations to reduce computation time and execution fees should
be taken into account. Ideally, extensive functional and security
testing should be performed on the test network provided by the
target blockchain. It is not uncommon to use a bug bounty program
to test critical smart contracts, e.g., token contracts. Therefore, the
migration time frame should allocate sufficient time for such testing.
When account owners hold the private keys, users need to create new
accounts, deploy new smart contracts, and set their states. Also, if the
reference to a smart contract is provided via a smart contract registry
[61], the registry contract needs to be translated and deployed first.
Then update the registry with the new smart contract address.
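The first step, compiling the source and comparing it with the deployed binary, can be sketched as a search over candidate build settings. The example below is hypothetical throughout: toy_compile is a stand-in that merely hashes its inputs, not a real compiler, and the version strings and the find_build_settings helper are chosen only for illustration.

```python
import hashlib
from itertools import product


def find_build_settings(source, deployed_code, compile_fn, versions, optimizer_flags):
    """Return the (version, optimized) pair whose output equals the deployed code.

    compile_fn is a hypothetical callable (source, version, optimized) -> bytes.
    Returns None when no candidate setting reproduces the deployed code.
    """
    for version, optimized in product(versions, optimizer_flags):
        if compile_fn(source, version, optimized) == deployed_code:
            return version, optimized
    return None


def toy_compile(source, version, optimized):
    """Stand-in compiler: output depends on source, version, and optimizer flag."""
    tag = f"{version}|{int(optimized)}|{source}"
    return hashlib.sha256(tag.encode("utf-8")).digest()


# Illustrative data only.
source = "contract Ticket { }"
versions = ["0.4.26", "0.5.17"]
deployed = toy_compile(source, "0.5.17", True)

settings = find_build_settings(source, deployed, toy_compile, versions, [False, True])
tampered = find_build_settings(source + " ", deployed, toy_compile, versions, [False, True])
```

A match confirms both that the source corresponds to the deployed contract and which compiler version and optimization setting produced it; no match means the source cannot be trusted as the basis for translation.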
Consequences.
• This pattern works for any blockchain platform that has a
smart contract language rich enough to recreate the same
behavior. However, smart contract translation is not trivial
and involves many complications.
• It is difficult to guarantee that a translated smart contract
exhibits the same behavior. Correctness can be enhanced by
verifying the source code before translation, formal
verification, and extensive testing.
• Errors, time, and cost could be reduced via automated
translation and testing.
Related patterns. VM emulation pattern is preferred when ap-
plicable. State extraction and state transformation patterns can be
used to capture the smart contract code, embedded states, aggregate
the states, and burn the smart contracts. Use the state initialization
pattern to deploy and set the state of translated smart contracts. PoE
entries of smart contract codes and updates to the contract addresses
can be added using the off-chain data storage and encrypting on-
chain data patterns.
Known use. As Serpent was deprecated and had several security
weaknesses, the Augur [4] smart contract written in Serpent had to be
rewritten in Solidity. Kyber [35] redeployed its contract with fixes, as
the contract was about to be unusable due to the revision of Ethereum
gas costs. Tronbet [52] migrated its ANTE tokens to new WIN
tokens managed by a new smart contract with enhanced features.
Both Augur and Tronbet used token burning and state initialization
patterns to set the state of the new contract. Counterparty [7] and
Tron [31] allow executing Ethereum smart contracts on Bitcoin and
Tron by modifying the Solidity code, respectively. Tools are available
to translate among smart contract languages, e.g., an Ethereum Solidity
to Hyperledger JavaScript translator is presented in [63].
6 MAPPING PATTERNS TO MIGRATION
SCENARIOS
Given the five migration scenarios, five data fidelity levels, and 13
patterns, the potential solution space is large. Therefore, we present a
selected set of use cases from the illustrative example that combines
a mix of migration scenarios, data fidelity levels, and blockchain
platforms and hosting modes. We then discuss how a migration plan
for a given use case could be achieved using a selected set of patterns.
Table 1 shows how the proposed patterns map to the migration use
cases.
Table 1: Pattern to migration scenario mapping.
Scenarios: Relocate, Upgrade, Consolidate, Separation, Archive
Pattern group: State extraction
Snapshotting
Pattern group: State transformation
State aggregation – –
Token burning – –
Pattern group: State and transaction load
Node sync ? – – ? ?
Establish genesis ? ? – ? ?
Hard fork – ? – ?
State initialization
Exchange transfer – –
Pattern group: Smart contract
Virtual machine emulation – ? ? – –
Smart contract translation – – –
Pattern group: Non-functional
Measure migration quality –
Off-chain data storage –
Encrypting on-chain data –
* Applicable (), Maybe applicable (?), Not applicable (–)
Suppose the nonprofit decided to spin up a private instance of the
same blockchain platform and migrate its data. This is an example
of the separation scenario which can be achieved by spinning up
a new blockchain instance and recreating data using most of the
migration patterns. State extraction and transformation patterns, as
well as state initialization and establish genesis patterns can be used
to accurately determine, efficiently recreate, and consistently transfer
states to the new blockchain instance. Hard fork,VM emulation, and
smart contract translation patterns are not needed, as the target
blockchain is new and compatible. Because the blockchain instance
is private, the exchange-transfer pattern is not desirable due to the
cost of using a public exchange. However, the migration tool may
act as a decentralized or private exchange that provides a common
protocol for state transfer [45, 51]. Similarly, the encrypting on-chain
data pattern is not essential as the target blockchain is private. Node
sync could also be used where the new nodes can be reconfigured
to act as a different blockchain instance after cloning. However, if
the private blockchain instance is to be set up using a BaaS, node
sync and establish genesis patterns may not work, as the BaaS may
not provide node-level access, which limits finer control over blockchain
software, data, storage, and inter-node communication. Therefore,
these patterns are marked as maybe applicable.
Alternatively, suppose the nonprofit decided to move to a private
and incompatible blockchain instance to get better performance and
features. In this case, both separation and upgrade scenarios should
be applied together, as only the application-related data are moved
to an incompatible blockchain platform. Therefore, the applicable
patterns depend on the constraints of each scenario. For example,
the node sync pattern does not apply as the target blockchain is
incompatible, whereas hard fork pattern is not needed as the instance
is new. Establish genesis pattern may not apply when the target
instance is hosted as a BaaS. The VM emulation pattern is preferred
when it is supported by the target blockchain. Else, one can use
the smart contract translation pattern given that it is possible to
translate/rewrite the smart contract to achieve the same functionality.
In another scenario, suppose the nonprofit decided to use an
existing but incompatible public blockchain as the target. In this
case, both separation and consolidate scenarios should be applied
together, as it is a separation from the point of view of the source
blockchain and consolidation from the point of view of the target
blockchain. Node sync, establish genesis, and hard fork patterns do
not apply, as the target blockchain platform is incompatible, already
established, and public. While the smart contract translation pattern
is applicable, the applicability of the VM emulation pattern depends on
the target blockchain platform.
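The three use cases above can be read as a small decision procedure over the properties of the target blockchain. The following Python sketch is one possible encoding, not a definitive rule set. The `Target` class, its field names, and the `applicable_patterns` function are hypothetical names introduced here for illustration. The `maybe` value for a hard fork on an existing, compatible target is an assumption, as that case is not covered by the use cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """Hypothetical description of a target blockchain instance."""
    is_new: bool                      # freshly created vs. already established
    is_compatible: bool               # same platform and data format as the source
    is_baas: bool = False             # hosted as Blockchain-as-a-Service
    supports_source_vm: bool = False  # target can emulate the source VM


def applicable_patterns(t: Target) -> dict:
    """Map each pattern to 'yes', 'maybe', or 'no' for the given target."""
    # On a BaaS, node-level access may be missing, so these become 'maybe'.
    node_level = "maybe" if t.is_baas else "yes"
    return {
        # Cloning nodes only works for a new and compatible instance.
        "node sync": node_level if (t.is_new and t.is_compatible) else "no",
        # A genesis block can only be crafted for a new instance.
        "establish genesis": node_level if t.is_new else "no",
        # Assumption: 'maybe' for an existing, compatible target, a case
        # that the use cases in the text do not cover.
        "hard fork": "no" if (t.is_new or not t.is_compatible) else "maybe",
        # VM emulation is only relevant for incompatible targets that support it.
        "VM emulation": "yes" if (not t.is_compatible and t.supports_source_vm) else "no",
        # Translation is the fallback whenever the target is incompatible.
        "smart contract translation": "no" if t.is_compatible else "yes",
    }


if __name__ == "__main__":
    # Second use case: new, incompatible, private instance hosted on a BaaS.
    print(applicable_patterns(Target(is_new=True, is_compatible=False, is_baas=True)))
```

When both VM emulation and smart contract translation come out as `yes`, the text prefers VM emulation; the sketch reports applicability only and leaves that preference to the caller.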
Different data fidelity levels could be used under upgrade, consolidate,
and separation scenarios. Therefore, the measure migration
quality pattern should be used in all three scenarios to ensure that
the desired subset of data was migrated as per the application and
organizational objectives, and respective blockchain constraints are
honored. For example, the nonprofit may choose to manage next
year’s concert budget using data fidelity levels such as fresh start,
state only, or genesis and transactions. Moreover, off-chain data
storage and encrypting on-chain data patterns could be used to
enhance transparency and privacy, respectively.
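A check in the spirit of the measure migration quality pattern could compare the in-scope subset of the source state with what was recreated on the target. The sketch below is illustrative only. It assumes that states can be exported as key-value dictionaries with string keys, and the function name `migration_report` and the report fields are hypothetical.

```python
def migration_report(source_state: dict, target_state: dict, in_scope: set) -> dict:
    """Compare the in-scope source states against the migrated target states."""
    # States that should have been migrated but are absent on the target.
    missing = sorted(k for k in in_scope if k not in target_state)
    # States present on both sides whose values differ.
    mismatched = sorted(
        k for k in in_scope
        if k in target_state and source_state.get(k) != target_state[k]
    )
    # States on the target that were outside the chosen subset.
    unexpected = sorted(k for k in target_state if k not in in_scope)
    return {
        "missing": missing,
        "mismatched": mismatched,
        "unexpected": unexpected,
        "ok": not (missing or mismatched or unexpected),
    }


if __name__ == "__main__":
    source = {"budget": 1000, "tickets": 250, "venue": "hall-a"}
    target = {"budget": 1000, "tickets": 200}
    print(migration_report(source, target, in_scope={"budget", "tickets", "venue"}))
```

Which keys belong in `in_scope` follows from the chosen data fidelity level; for a fresh start it would be empty, while for state only it would cover the application's current states.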
If the nonprofit needs to change the blockchain nodes, the relocate
scenario applies. Relocation could be achieved by cloning a set
of new nodes (either on-premise or on the cloud) using the node
sync pattern. Finally, these nodes can be reconfigured to run as a
separate blockchain instance. If the target blockchain is a BaaS
instance, snapshotting and state initialization patterns could be used
to recreate states on new nodes. Establish genesis and hard fork
patterns may not apply when the target instance is hosted as a BaaS.
Smart contract and non-functional patterns are not needed, as the
objective is to swap the hardware while enforcing data integrity
through blockchain properties. The archive scenario applies when
the nonprofit needs to keep the full history on a different blockchain
instance for data analytics, auditing, or archiving [37] purposes. As
the archive is created on a new blockchain while transferring only a
subset of states, transactions, and smart contracts, hard fork and non-
functional patterns also apply based on the chosen data fidelity level,
cost, and privacy goals. While these use cases consider migrating
only the concert application’s data, proposed patterns could also be
used to migrate an entire blockchain.
EuroPLoP ’20, July 1–4, 2020, Virtual Event, Germany Bandara, Xu, and Weber
7 DISCUSSION
Next, we briefly discuss practical considerations and challenges
in blockchain migration. Migration is relatively straightforward
when the target blockchain is a new instance, as it provides greater
flexibility in synchronizing and recreating data using node sync
and establish genesis patterns, respectively. In contrast, preexisting
blockchains, especially the public ones, require extensive efforts
to recreate states using relevant patterns. Such efforts involve the
use of new transactions to aggregate and initialize states, burn and
exchange tokens, disable smart contracts, and redeploy smart con-
tracts, increasing the time and cost of migration. While the node
sync pattern can be used to establish the history, it can be used only
to change the blockchain nodes or when the target blockchain is
new and compatible with the source blockchain. It could also fail
when the target blockchain is hosted on a BaaS platform. While
one could attempt to replay transactions on the target blockchain, it
is both costly and time-consuming, unless state aggregation style
optimizations are used.
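To illustrate why replaying transactions is more expensive than state aggregation, the following sketch collapses a transfer history into final balances, so that only one state-initialization transaction per non-zero account is needed on the target. It is a simplified model under stated assumptions: transfers are `(sender, receiver, amount)` tuples, a sender of `None` denotes token issuance, and the function name `aggregate_balances` is hypothetical.

```python
from collections import defaultdict


def aggregate_balances(transfers):
    """Collapse a transfer history into the final non-zero account balances."""
    balances = defaultdict(int)
    for sender, receiver, amount in transfers:
        if sender is not None:  # None marks issuance, so nothing is debited
            balances[sender] -= amount
        balances[receiver] += amount
    # Accounts that end at zero need no state on the target blockchain.
    return {account: bal for account, bal in balances.items() if bal != 0}


if __name__ == "__main__":
    history = [
        (None, "alice", 100),
        ("alice", "bob", 30),
        ("bob", "carol", 10),
        ("alice", "carol", 20),
        ("carol", "bob", 30),
    ]
    final = aggregate_balances(history)
    # Five historical transactions reduce to two state-initialization entries.
    print(len(history), "transactions ->", len(final), "states:", final)
```

The saving grows with the length of the history, at the price of losing the individual transactions, which corresponds to a state-only level of data fidelity.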
Time and cost of migration can be reduced using the state ag-
gregation pattern depending on the chosen data fidelity level. By
lowering the number of states, the risk of failure could also be reduced,
which is essential as it is next to impossible to roll back a
failed migration on a public blockchain. Therefore, while moving
into an existing blockchain, it is desirable to choose a data fidelity
level that is sufficient to retain the correct function and availability
of the application. The source blockchain can continue to provide
access to historical data, if not decommissioned. Otherwise, transac-
tions and block data can be added to the target blockchain as a set of
PoE entries using the off-chain data storage pattern while substan-
tially reducing the cost and time to migrate. The risk could be further
reduced by migrating one state and smart contract at a time, which
works with patterns like state initialization and exchange transfer
[36]. Trial migration on the test network of the target blockchain
platform is essential to reduce the risk further [42]. In conclusion,
while the global state can be recreated on any target blockchain with
reasonable effort and time, it is desirable not to recreate full history.
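As a rough illustration of adding historical data as PoE entries with the off-chain data storage pattern, the sketch below derives a fixed-size digest from a historical record; only the digest would be written to the target blockchain, while the record itself stays off-chain. This assumes that PoE refers to proof-of-existence entries, that records are JSON-serializable dictionaries, and that SHA-256 is an acceptable hash; the function names are hypothetical.

```python
import hashlib
import json


def poe_digest(record: dict) -> str:
    """Return a SHA-256 digest of a record in a canonical JSON encoding."""
    # Sorted keys and fixed separators make the encoding independent of key order.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_record(record: dict, on_chain_digest: str) -> bool:
    """Check an off-chain record against the digest stored on the blockchain."""
    return poe_digest(record) == on_chain_digest


if __name__ == "__main__":
    block = {"height": 42, "txs": [{"from": "alice", "to": "bob", "amount": 30}]}
    digest = poe_digest(block)  # this value would be stored on-chain
    print(digest, verify_record(block, digest))
```

Storing one digest per block, or per batch of blocks, keeps the on-chain footprint constant regardless of how much history is retained off-chain.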
Translating smart contracts is not only time- and cost-prohibitive
but also provides no guarantees on the exact behavior (even with
automated translation). While many high-level smart contract languages
are still being proposed, they seem to be opting for a few
instruction set architectures, e.g., EVM [20, 50]. Therefore, the reuse
of smart contracts at the bytecode level is preferred, as it could better
preserve the smart contract behavior. However, code reuse requires
both the source and target smart contract execution environment to
behave the same way [2].
To apply the token-burning pattern for smart contracts, it is es-
sential that they implement a self-destruct function before being
deployed on the source blockchain. It is recommended to separate
the smart contract state from the business logic [61], as it simplifies
smart contract translation and enables efficient migration of smart
contract state using patterns such as snapshotting, state aggregation,
token burning, establish genesis, and state initialization. As
the addresses could change during the migration, they should not be
hardcoded into the smart contract code, e.g., an address used in a delegated
call. Instead, it is recommended to use a smart contract registry
[61] to keep track of smart contract addresses. Similarly, having an
ID database at the BAL is essential to handle the change of identi-
fiers during the migration. Else, the migration will require significant
changes to the BAL or even the application. Further, the use of states
that can be tokenized is useful, as they could be created, aggregated,
swapped, and burned using smart contracts. Therefore, proactive
system design and application of smart contract best practices are
essential to simplify future blockchain migrations.
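The registry and ID database recommendations can be sketched as two lookup tables that are updated during migration, so that callers never hold a hardcoded address or identifier. The Python model below is an off-chain simplification for illustration; in practice the registry would typically be a smart contract. The class names, method names, and example addresses are all hypothetical.

```python
class ContractRegistry:
    """Maps stable logical names to the current smart contract address."""

    def __init__(self):
        self._addresses = {}

    def register(self, name: str, address: str) -> None:
        # Re-registering a name repoints it, e.g., after a migration.
        self._addresses[name] = address

    def resolve(self, name: str) -> str:
        return self._addresses[name]


class IdDatabase:
    """Maps source-chain identifiers to their target-chain equivalents."""

    def __init__(self):
        self._mapping = {}

    def link(self, source_id: str, target_id: str) -> None:
        self._mapping[source_id] = target_id

    def translate(self, source_id: str) -> str:
        # Identifiers without a recorded change are returned as-is.
        return self._mapping.get(source_id, source_id)


if __name__ == "__main__":
    registry = ContractRegistry()
    registry.register("ConcertBudget", "0xSOURCE")
    registry.register("ConcertBudget", "0xTARGET")  # updated during migration
    ids = IdDatabase()
    ids.link("acct-source-1", "acct-target-9")
    print(registry.resolve("ConcertBudget"), ids.translate("acct-source-1"))
```

Because the application resolves names at call time, a migration only has to update these two tables rather than the application or smart contract code.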
8 SUMMARY
While blockchains are designed to be immutable, many technical,
business, economic, and regulatory-level changes are already making it
necessary for an application to migrate from one blockchain instance
to another. In this paper, we outlined the need for blockchain migra-
tion in DApps, enterprise information systems, and business process
management systems. We introduced six new patterns and seven
others adapted from the literature to address five migration scenarios
and blockchain-specific data management challenges. We further
identified five data fidelity levels to balance competing factors, such
as an application’s data requirements, performance, effort, cost, time,
security, privacy, and risks of migration failure. While blockchain
migration is expected to be difficult and costly, we show that most
migration scenarios can be achieved within a reasonable time and
cost by choosing a data fidelity level that satisfies the application’s
minimum requirements and a combination of migration patterns.
Some of the challenges requiring further research are: identifying
the best data fidelity level for a given application scenario, lack of
access to private keys and external accounts, confirming the correct-
ness of translated smart contracts, best practices to simplify future
migration, and handling user permissions.
REFERENCES
[1] Aeternity-team. 2019. Frequently asked questions (FAQ): Token migration phases 1, 2, 3. Retrieved Apr. 10, 2020 from https://forum.aeternity.com/t/frequently-asked-questions-faq-token-migration-phases-1-2-3/1411
[2] Sidney Amani, Myriam Bégel, Maksym Bortin, and Mark Staples. 2018. Towards verifying Ethereum smart contract bytecode in Isabelle/HOL. In Proc. 7th ACM SIGPLAN Intl. Conf. on Certified Programs and Proofs (CPP ’18). 66–77.
[3] Binance Chain Assistant. 2019. Binance chain mainnet swap. Retrieved Jan. 28, 2020 from https://community.binance.org/topic/44/binance-chain-mainnet-swap
[4] Augur. 2017. Serpent compiler vulnerability, REP & Solidity. Retrieved Apr. 10, 2020 from https://medium.com/@AugurProject/serpent-compiler-vulnerability-rep-solidity-migration-5d91e4ae90dd
[5] BitGuild. 2018. Bitizens is moving to TRON. Retrieved Apr. 10, 2020 from https://medium.com/the-notice-board/bitizens-is-moving-to-tron-71e5c9a39ef
[6] Vitalik Buterin. 2014. A next-generation smart contract and decentralized application platform. https://github.com/ethereum/wiki/wiki/White-Paper
[7] Counterparty. [n.d.]. Smart contracts/EVM FAQ. Retrieved Jan. 28, 2020 from https://counterparty.io/docs/faq-smartcontracts/
[8] Kin Ecosystem. 2018. Kin blockchain migration - iOS. Retrieved Jan. 28, 2020 from https://kinecosystem.github.io/kin-ecosystem-sdk-docs/docs/migration_ios
[9] Effect.AI. 2019. Effect.AI brings artificial intelligence to EOS main net. Retrieved Apr. 10, 2020 from https://medium.com/effect-ai/effect-ai-brings-artificial-intelligence-to-eos-main-net-ead7e68e09fa
[10] EOS.IO. 2018. EOS.IO technical white paper v2. (Mar. 2018). https://github.com/EOSIO/Documentation/blob/master/TechnicalWhitePaper.md
[11] Guy Golan Gueta et al. 2019. SBFT: A scalable and decentralized trust infrastructure. In 49th Annual IEEE/IFIP Intl. Conf. on Dependable Systems and Networks (DSN ’20). 568–580.
[12] Jan Mendling et al. 2018. Blockchains for business process management – Challenges and opportunities. ACM Trans. on Management Information Systems (TMIS) 9, 1, Article 4 (Feb. 2018), 4:1–4:16.
[13] IOTA Foundation. 2020. Protecting user tokens and rebooting the coordinator. Retrieved Apr. 10, 2020 from https://blog.iota.org/protecting-user-tokens-and-rebooting-the-coordinator-95ff96625186
[14] TRON Foundation. 2018. Answers to frequently asked questions of TRON. Retrieved Jan. 28, 2020 from https://medium.com/tron-foundation/answers-to-frequently-asked-questions-of-tron-738b4653758a
[15] Telos Foundation. 2018. Telos token distribution - Use of the EOS genesis snapshot & why. Retrieved Apr. 10, 2020 from https://medium.com/telos-foundation/telos-token-distribution-use-of-the-eos-genesis-snapshot-why-2d849a2b0055
[16] Alberto Garoffolo and Robert Viglione. 2018. Sidechains: Decoupled consensus between chains. (Oct. 2018). https://horizen.global/assets/files/Horizen-Sidechains-Decoupled-Consensus-Between-Chains.pdf
[17] Vahid Garousi, Michael Felderer, and Mika V. Mäntylä. 2019. Guidelines for including grey literature and conducting multivocal literature reviews in software engineering. Information and Software Technology 106 (2019), 101–121.
[18] Gifto Official. 2019. Mass adoption token meets mass adoption chain: Gifto migrates to Binance chain. Retrieved Jan. 28, 2020 from https://medium.com/@gifto/mass-adoption-token-meets-mass-adoption-chain-gifto-migrates-to-binance-chain-af8cf906e13a
[19] Klaus Haller. 2009. Towards the industrialization of data migration: Concepts and patterns for standard software implementation projects. In Advanced Information Systems Eng., P. van Eck, J. Gordijn, and R. Wieringa (Eds.). 63–78.
[20] Mike Hearn. 2016. Corda: A distributed ledger. (Nov. 2016). https://www.r3.com/wp-content/uploads/2017/06/corda_technical_R3.pdf
[21] Nick Heudecker and Arun Chandrasekaran. 2018. Debunking the top 3 blockchain myths for data management. (Apr. 2018). https://gartner.com/en/documents/3871956
[22] Hive.IO. 2020. Announcing the launch of Hive blockchain. Retrieved Apr. 10, 2020 from https://steempeak.com/communityfork/@hiveio/announcing-the-launch-of-hive-blockchain
[23] Hyperledger. [n.d.]. A blockchain platform for the enterprise - Hyperledger Fabric documentation. Retrieved Jan. 28, 2020 from https://hyperledger-fabric.readthedocs.io/en/release-1.4/
[24] Hyperledger. 2018. Hyperledger Fabric now supports Ethereum. Retrieved Jan. 28, 2020 from https://hyperledger.org/blog/2018/10/26
[25] KARMA. 2019. KARMA is moving from EOS to WAX. https://medium.com/@karmaapp/karma-is-moving-from-eos-to-wax-b081100c2702
[26] Ho-Jun Kim, Eun-Jeong Ko, Young-Ho Jeon, and Ki-Hoon Lee. 2018. Migration from RDBMS to column-oriented NoSQL: Lessons learned and open problems. In Proc. 7th Intl. Conf. on Emerging Databases, Wookey Lee, Wonik Choi, Sungwon Jung, and Min Song (Eds.). 25–33.
[27] Orlenys López-Pintado, Luciano García-Bañuelos, Marlon Dumas, Ingo Weber, and Alexander Ponomarev. 2019. Caterpillar: A business process execution engine on the Ethereum blockchain. Software: Practice and Experience 49, 7 (Apr. 2019), 1162–1193.
[28] P. H. Madore. 2019. Deloitte ditches Ethereum for VeChain, brags about overtaking Bitcoin transactions. Retrieved Jan. 28, 2020 from https://finance.yahoo.com/news/deloitte-ditches-ethereum-vechain-brags-065730503.html
[29] Vakeesan Mahalingam. 2018. The ‘Hard Spoon’ concept explained. Retrieved Jan. 28, 2020 from https://cryptovest.com/education/the-hard-spoon-concept-explained/
[30] Gerard Meszaros and Jim Doble. 1997. A pattern language for pattern writing. In Proc. Intl. Conf. on Pattern Languages of Program Design.
[31] Annaliese Milano. 2018. $3 billion blockchain Tron kicks off token migration. Retrieved Jan. 28, 2020 from https://coindesk.com/3-billion-blockchain-tron-kicks-off-token-migration-today
[32] Johny Morris. 2012. Practical data migration (2nd ed.). BCS, The Chartered Institute.
[33] Satoshi Nakamoto. 2008. Bitcoin: A peer-to-peer electronic cash system. (2008). https://bitcoin.org/bitcoin.pdf
[34] Bithereum Network. 2018. Bithereum’s hard spoon snapshot is complete! Retrieved Jan. 28, 2020 from https://medium.com/bithereum-network/bithereums-hard-spoon-snapshot-is-complete-c1814024ea9a
[35] Kyber Network. 2019. Istanbul upgrade: Kyber smart contract migration. Retrieved Apr. 10, 2020 from https://blog.kyber.network/istanbul-upgrade-kyber-smart-contract-migration-c8a6bcd84a1b
[36] Safex News. 2018. Prepare yourself for the Safex blockchain swap. Retrieved Jan. 28, 2020 from https://safexnews.net/prepare-for-safex-blockchain-swap/
[37] Hye-Young Paik, Xiwei Xu, H.M.N. Dilum Bandara, Sung Une Lee, and Sin Kuang Lo. 2019. Analysis of data management in blockchain-based systems: From architecture to governance. IEEE Access 7 (2019), 186091–186107.
[38] R. Peter. 2014. Spin-offs: Bootstrap your alt-coin with a Bitcoin-blockchain-based initial coin distribution. Retrieved Jan. 28, 2020 from https://bitcointalk.org/index.php?topic=563972.0
[39] TomoChain Publisher. 2018. TomoChain’s mainnet launch, and token swapping schedule. Retrieved Apr. 10, 2020 from https://medium.com/tomochain/tomochains-mainnet-launch-and-token-swapping-schedule-6f556e2f772
[40] Qubicles. 2019. Migrating Ethereum Qubicle tokens to the Telos chain of EOS.IO using the EOS21 protocol. Retrieved Jan. 28, 2020 from https://medium.com/@Qubicles/migrating-ethereum-qubicle-tokens-to-the-telos-chain-of-eos-io-using-the-eos21-protocol-e79c14fcf112
[41] Leonardo Rocha, Fernando Vale, Elder Cirilo, Dárlinton Barbosa, and Fernando Mourão. 2015. A framework for migrating relational datasets to NoSQL. Procedia Computer Science 51 (2015), 2593–2602. Intl. Conf. on Computational Science (ICCS ’15).
[42] Andreas Rüping. 2013. Transform! Patterns for data migration. (2013), 1–23.
[43] Kai Sedgwick. 2019. Blockchain migration is all the rage. Retrieved Jan. 28, 2020 from https://news.bitcoin.com/blockchain-migration-is-all-the-rage/
[44] SFOX. 2019. Life after hard forks: What you need to know about replay protection. Retrieved Jan. 28, 2020 from https://blog.sfox.com/life-after-hard-forks-what-you-need-to-know-about-replay-protection-ab8adaf6ddf6?gi=9b4099fe431
[45] Ben Sigman and Alessandro Siniscalchi. [n.d.]. Teleport your ERC20 tokens to EOS. Retrieved Jan. 28, 2020 from https://github.com/sheos-org/eos21
[46] Storj. 2017. Token migration plan pt.1. Retrieved Apr. 10, 2020 from https://storj.io/blog/2017/04/token-migration-plan-pt.1/
[47] Steve Strauch, Vasilios Andrikopoulos, Thomas Bachmann, and Frank Leymann. 2013. Migrating application data to the cloud using cloud data patterns. In Proc. 3rd Intl. Conf. on Cloud Computing and Service Science (CLOSER ’13). 36–46.
[48] Stefan Tai, Jacob Eberhardt, and Markus Klems. 2017. Not ACID, not BASE, but SALT: A transaction processing perspective on blockchains. In Proc. 7th Intl. Conf. on Cloud Computing and Services Science (CLOSER ’17). 755–764.
[49] Moonlight Team. 2019. Moonlight contract migration – A decentralized technique. Retrieved Apr. 10, 2020 from https://blog.moonlight.io/moonlight-contract-migration-a-decentralized-technique/
[50] Vyper Team. [n.d.]. Vyper. Retrieved Jan. 28, 2020 from https://vyper.readthedocs.io
[51] Stefan Thomas and Evan Schwartz. 2015. A protocol for interledger payments. (2015). https://interledger.org/interledger.pdf
[52] TRONbet. 2019. ANTE/WIN: Prepare for take-off. Retrieved Jan. 28, 2020 from https://medium.com/@tronbethelp/ante-win-prepare-for-take-off-353d41b43401
[53] Claire Vanner. 2018. Blockchain: The fuel to energize your financial services transformation. Retrieved Apr. 10, 2020 from https://blog.bizagi.com/2018/01/23/blockchain-fuel-energize-financial-services-transformation/
[54] VeChain. 2018. VeChainThor wallet manual including token swap and X node migration. Retrieved Jan. 28, 2020 from https://cdn.vechain.com/vechainthor_wallet_manual_en_v1.0.pdf
[55] Martin Wagner and Tim Wellhausen. 2011. Patterns for data migration projects. (2011). https://tim-wellhausen.de/papers/DataMigrationPatterns.pdf
[56] Bitcoin Wiki. 2018. Proof of burn. Retrieved Jan. 28, 2020 from https://en.bitcoin.it/wiki/Proof_of_burn
[57] Bitcoin Wiki. 2019. How to cheaply consolidate coins to reduce miner fees. Retrieved Jan. 28, 2020 from https://en.bitcoin.it/wiki/How_to_cheaply_consolidate_coins_to_reduce_miner_fees
[58] Jeffrey Wilcke. 2016. To fork or not to fork. Retrieved Jan. 28, 2020 from https://blog.ethereum.org/2016/07/15/to-fork-or-not-to-fork/
[59] Gavin Wood. 2019. Ethereum: A secure decentralised generalised transaction ledger: Byzantium version. Ethereum project yellow paper (Mar. 2019). https://ethereum.github.io/yellowpaper/
[60] Elizabeth Wright. 2020. Guide to swap Atomic wallet token (AWC). Retrieved Apr. 10, 2020 from https://atomicwallet.io/binance-dex-awc-token-swap-guide
[61] Xiwei Xu, Cesare Pautasso, Liming Zhu, Qinghua Lu, and Ingo Weber. 2018. A pattern collection for blockchain-based applications. In Proc. 23rd European Conf. on Pattern Languages of Programs (EuroPLoP ’18). Article 3, 20 pages.
[62] Xiwei Xu, Ingo Weber, and Mark Staples. 2019. Architecture for blockchain applications. Springer. https://doi.org/10.1007/978-3-030-03035-3
[63] Muhammad Ahmad Zafar, Falak Sher, Muhammad Umar Janjua, and Salman Baset. 2018. Sol2js: Translating Solidity contracts into Javascript for Hyperledger Fabric. In Proc. 2nd Workshop on Scalable and Resilient Infrastructures for Distributed Ledgers. 19–24.
[64] Zeepin. 2018. Announcement: Launch of ZEEPIN mainnet and mapping of ZPT and Gala. Retrieved Apr. 10, 2020 from https://medium.com/zeeblog/announcement-launch-of-zeepin-mainnet-and-mapping-of-zpt-and-gala-e34735c65418
[65] Erik Zhang. 2019. Roadmap of NEO 3.0 development. Retrieved Jan. 28, 2020 from https://medium.com/neo-smart-economy/roadmap-of-neo-3-0-development-e2ae64edf226