Thwarting Unwanted Blockchain Content Insertion

Abstract

Since the introduction of Bitcoin in 2008, blockchain systems have seen an enormous increase in adoption. By providing a persistent, distributed, and append-only ledger, blockchains enable numerous applications such as distributed consensus, robustness against equivocation, and smart contracts. However, recent studies show that blockchain systems such as Bitcoin can be (mis)used to store arbitrary content. This has already been used to store arguably objectionable content on Bitcoin's blockchain. Already single instances of clearly objectionable or even illegal content can put the whole system at risk by making its node operators culpable. To overcome this imminent risk, we survey and discuss the design space of countermeasures against the insertion of such objectionable content. Our analysis shows a wide spectrum of potential countermeasures, which are often combinable for increased efficiency. First, we investigate special-purpose content detectors as an ad hoc mitigation. As they turn out to be easily evadable, we also investigate content-agnostic countermeasures. We find that mandatory minimum fees as well as mitigation of transaction manipulability via identifier commitments significantly raise the bar for inserting harmful content into a blockchain.
Roman Matzutt, Martin Henze, Jan Henrik Ziegeldorf, Jens Hiller, Klaus Wehrle
Communication and Distributed Systems
RWTH Aachen University
Aachen, Germany
Email: lastname@comsys.rwth-aachen.de
©2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media,
including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers
or lists, or reuse of any copyrighted component of this work in other works. DOI: 10.1109/IC2E.2018.00070
Keywords: Bitcoin, blockchain, security, objectionable content,
countermeasure
I. INTRODUCTION
Blockchain-based cryptocurrencies such as Bitcoin enjoy
unbroken popularity, averaging over 280 000 daily
confirmed transactions in 2017 [1]. This popularity is also
reflected by the size of the cryptocurrencies’ underlying
peer-to-peer networks and their user base: Bitcoin’s network
size has doubled since 2015 [2], while its number of users
is peaking in the millions [3]. Cryptocurrencies, especially
Bitcoin, have thus become a well-accepted trading medium
due to their security, timeliness, and decentralization.
Besides offering a platform for financial transactions,
recent work [4]–[6] shows that Bitcoin’s blockchain can
also be used as an anonymous and irrevocable content
store. By inserting short, non-financial messages, Bitcoin
can be extended to realize additional services, e.g., digital
notary services [7], secure releases of cryptographic com-
mitments [8], or non-equivocation schemes [9].
While this initially unintended extensibility appears
promising, and in fact 1.4 % of Bitcoin transactions hold
non-financial data [6], it can severely compromise block-
chain systems: A recent study [6] reveals that over 1600 files
have been irrevocably engraved into Bitcoin’s blockchain,
e.g., to be shared in a censorship-resistant manner. These
files range from simple text to over 155 images, source
codes, and PDF files. Any objectionable content, e.g., illegal
pornography, in such files is then inevitably distributed to all
nodes of the cryptocurrency’s underlying peer-to-peer net-
work. It is expected that court rulings in major jurisdictions
such as Germany and the USA will then find node operators
culpable of possessing objectionable content [6]. As a con-
sequence, the node operators must delete affected parts of
the blockchain, thereby breaking the blockchain’s integrity
and verifiability. The insertion of objectionable content has
thus the potential of jeopardizing cryptocurrencies, as all
users ultimately depend on this verification. Indeed, recent
research [6] finds that, while most content is likely harmless,
Bitcoin’s blockchain already today contains content that is
objectionable in many jurisdictions, e.g., an image of a nude
young woman or hundreds of links to child pornography.
Reacting to the evident threats, in this work we explore
potential countermeasures to prevent insertion of objection-
able content w.r.t. blockchain-based cryptocurrencies, using
Bitcoin as a real-world working example. We first analyze
the harmfulness of different types of blockchain content
and find that short, token-like messages enable beneficial
use cases, whereas the insertion of arbitrarily-sized content
must be prevented. We also acknowledge that fully preventing
content insertion is impossible. We hence focus on
countermeasures that either heuristically hinder the insertion of
large chunks of arbitrary data or financially disincentivize
content insertion to a freely tunable degree. Our results
are two-fold. On the one hand, we find that the naïve
approach of targeted content detection constitutes a first ad
hoc solution against objectionable content, but it is easily
evadable. On the other hand, simple adaptations to Bitcoin,
such as introducing mandatory minimum fees or replacing
manipulable blockchain identifiers, can effectively mitigate
objectionable blockchain content with moderate overheads.
II. BACKGROUND
In this section, we provide the technical background on
Bitcoin required for the remainder of this paper.
Bitcoin Primer. Bitcoin [10] was the first digital currency
to rely on the blockchain. The blockchain is a persistent,
distributed, and append-only ledger of events, serving in
Bitcoin as a distributed, immutable record of financial
transactions between pseudonymous Bitcoin addresses. A Bitcoin
address is controlled by a cryptographic key pair, which is
used to access and transfer the associated bitcoins. A trans-
action collects funds from one or more addresses, the inputs,
and reassigns them to one or more other addresses, the
outputs. To prevent double spending of bitcoins, transactions
are only considered valid if they are immutably recorded in
the blockchain. Transaction inputs and outputs are realized
using a script language that allows authenticating payments
via addresses’ public keys or hash values thereof.
Blockchain Maintenance. The blockchain is maintained
by the Bitcoin network in a distributed manner to reach
consensus about valid transactions among honest nodes [11]:
Users propose their transactions to the network, which are
then added to the blockchain by special nodes, the miners,
via a proof-of-work scheme. To ensure correctness of the
blockchain and its transactions, the full nodes of the Bitcoin
network independently verify all transactions and blocks
they receive and reject incorrect information. Furthermore,
full nodes maintain a full copy of the whole blockchain to
serve newly joining nodes. Assuming an honest majority
(among full nodes), the longest blockchain constitutes the
Bitcoin network’s consensus.
Incentives and Fees. Miners are incentivized to perform
the proof of work via block rewards. Each miner includes a
coinbase transaction in her blocks, which rewards her with
a prescribed number of freshly minted bitcoins. As the block
reward is exponentially decreasing to limit the total supply of
bitcoins, transaction fees are a second tier of miner rewards.
Miners may collect excess bitcoins from all transactions
in their blocks as a form of tip, which is paid as a byte-
wise fee. As overpaying fees incentivizes miners to consider
a transaction faster, recommendations on transaction fees
emerged [12]. For instance, to get a transaction included into
the blockchain during December 2017 within an hour, it is
recommended to pay on average 423 satoshi per Byte (B),
i.e., 16.14 USD for an average transaction of 250 B size [12].
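The byte-wise fee arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name is hypothetical, and the exchange rate is derived here from the two quoted figures rather than taken from [12].

```python
SATOSHI_PER_BTC = 100_000_000  # 1 bitcoin = 10^8 satoshi

def fee_in_satoshi(rate_sat_per_byte: int, size_bytes: int) -> int:
    """Byte-wise fee: the rate is paid once per byte of the transaction."""
    return rate_sat_per_byte * size_bytes

fee_sat = fee_in_satoshi(423, 250)        # rate and size quoted in the text
fee_btc = fee_sat / SATOSHI_PER_BTC
# Assumption: the USD/BTC rate is back-computed from the 16.14 USD figure.
implied_usd_per_btc = 16.14 / fee_btc

print(fee_sat)                     # 105750
print(round(implied_usd_per_btc))  # exchange rate implied by the quoted fee
```

The implied rate is useful later on whenever satoshi amounts have to be compared with the USD figures in the text.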
Content Insertion. A comprehensive study on how Bit-
coin transactions can be augmented with arbitrary content
is given in [6]. While short messages of up to 100 B can
be added via intended mechanisms (coinbase transactions
and OP_RETURN), transactions can be manipulated to hold
arbitrarily-sized content such as images or archives. Predom-
inantly, content inserters arbitrarily replace the blockchain
identifiers, usually 20 B-long cryptographic hash values of
public keys, of multiple outputs with their content, poten-
tially making the output unspendable [6]. Using encodings
such as Apertus [13] allows spreading content over multiple
transactions while retaining efficient decoding [6].
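To make the size relation concrete, the following sketch cuts a payload into 20 B chunks, one per manipulated output identifier. It only illustrates the chunk count; the zero padding and the helper names are assumptions and do not reproduce Apertus or any real encoding.

```python
IDENTIFIER_SIZE = 20  # bytes per output identifier, as stated in the text

def split_into_identifiers(content: bytes, size: int = IDENTIFIER_SIZE) -> list[bytes]:
    """Cut content into identifier-sized chunks, zero-padding the last one."""
    chunks = [content[i:i + size] for i in range(0, len(content), size)]
    if chunks and len(chunks[-1]) < size:
        chunks[-1] = chunks[-1].ljust(size, b"\x00")
    return chunks

def reassemble(chunks: list[bytes], original_length: int) -> bytes:
    """Concatenate the chunks and strip the padding again."""
    return b"".join(chunks)[:original_length]

payload = bytes(range(256)) * 80           # 20 KiB of stand-in data
outputs = split_into_identifiers(payload)
print(len(outputs))                        # 1024
print(reassemble(outputs, len(payload)) == payload)  # True
```

A 20 KiB payload thus needs 1024 outputs, which is the figure used for the JPEG example in Section IV-C.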
III. SCENARIO AND PROBLEM STATEMENT
In this section, we define the underlying scenario for
blockchain content insertion with the goal of designing
countermeasures against such practices. We first outline
that different classes of arbitrary blockchain content can be
harmful or beneficial to cryptocurrencies (Section III-A). We
then discuss related work (Section III-B) and show that it
is impossible to prevent insertion of all unintended content
into the blockchain (Section III-C). From this analysis, we
distill the problem statement for this paper (Section III-D).
A. Harmfulness of Arbitrary Blockchain Content
Current blockchain designs allow augmenting user-
generated transactions with short chunks of arbitrary content
as described in Section II. Notably, manipulating transac-
tions allows inserting arbitrary amounts of unintended data
even into special-purpose blockchains, e.g., storing non-
financial data on cryptocurrency blockchains. In this paper,
we refer to Bitcoin as our real-world working example.
If a miner includes content-holding transactions into her
blocks, the content is irrevocably distributed to all full
nodes. Recent research [6] shows that this puts full node
operators at risk: Although court rulings are yet to come, it is
expected that major jurisdictions could find that maintaining
a blockchain containing objectionable content, e.g., illegal
pornography, constitutes possession of illegal documents [6].
Full node operators hence face a dilemma: If they keep
maintaining the blockchain, they may become culpable. Yet,
if they delete content-holding transactions from their local
blockchain, they break its integrity and thus its verifiability.
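The dilemma can be illustrated with a toy hash chain. This is a deliberately simplified sketch: Bitcoin links block headers via double SHA-256 and commits to transactions through Merkle trees, whereas the hypothetical helpers below hash each payload directly together with its predecessor.

```python
import hashlib

GENESIS = b"\x00" * 32  # placeholder predecessor of the first block

def block_hash(prev_hash: bytes, payload: bytes) -> bytes:
    return hashlib.sha256(prev_hash + payload).digest()

def build_chain(payloads):
    chain, prev = [], GENESIS
    for payload in payloads:
        digest = block_hash(prev, payload)
        chain.append({"prev": prev, "payload": payload, "hash": digest})
        prev = digest
    return chain

def verify_chain(chain) -> bool:
    """Recompute every link; any local modification breaks the chain."""
    prev = GENESIS
    for block in chain:
        if block["prev"] != prev or block_hash(prev, block["payload"]) != block["hash"]:
            return False
        prev = block["hash"]
    return True

chain = build_chain([b"tx-a", b"tx-b with embedded content", b"tx-c"])
print(verify_chain(chain))   # True
chain[1]["payload"] = b""    # the operator deletes the content locally
print(verify_chain(chain))   # False
```

Once the payload of a single block is removed, every node that receives this copy can no longer verify it against the recorded hashes.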
Deleting blockchain content locally to comply with le-
gal obligations severely impedes the health of the Bitcoin
network. It is critical that newly joining full nodes obtain
an intact blockchain copy in order to successfully synchro-
nize with the Bitcoin network. Furthermore, also the users
of lightweight solutions such as online wallets ultimately
depend on full nodes performing the verification process on
their behalf. Hence, we argue that objectionable, i.e., illegal-
to-possess, blockchain content must be proactively prevented
from entering the blockchain to the largest extent possible.
However, only certain content can jeopardize blockchain
systems. In fact, the outlined culpability only holds if content
is objectionable and easily extractable from the blockchain.
For instance, full pictures of illegal pornography, as one of the
most ubiquitously objected-to content types, can be stored on the
blockchain using tens of kilobytes [14]. The most imminent
risk therefore stems from arbitrary-length and easy-to-read
content, especially if it is easily accessible via standard
software after extracting it from the local blockchain copy.
Contrarily, short pieces of blockchain content are less
likely to be harmful as they cannot hold objectionable
content directly. Even short links to objectionable content do
not put full node operators at risk of possessing said content:
As the content is not stored directly on the blockchain,
operators do not own a physical copy of it. Furthermore, they
could even cooperate with local authorities to take down the
target server without impeding the blockchain integrity.
As a consequence, we deem short-sized content (at most 1 KiB)
to have a lower harm potential and consider very short
contents (at most 100 B) harmless. These thresholds may, however,
require adaptation in the future, e.g., due to court rulings.
On the contrary, short-sized blockchain content has proven
to fuel innovation and create new applications. A wide range
of applications now rely on engraving short tokens into the
blockchain to leverage its security model for off-blockchain
services. By adding hash values of files, it becomes publicly
verifiable that a given document existed by the time the
transaction was added to the blockchain [7]. Similarly,
the blockchain can become a general-purpose event ledger,
e.g., for non-equivocation logs [9]. Allowing short text
messages, e.g., meta information, on the blockchain further-
more enables services such as distributed management of
assets [15] or key-value pairs [16] and the execution of smart
contracts [17]. Although arbitrary-length blockchain content
can also be beneficial, e.g., for whistleblowers to unveil
misconducts in a censorship-resistant manner, a single piece
of objectionable content can jeopardize the whole system.
Hence, the risks introduced by arbitrary-length blockchain
content by far outweigh the potential benefits.
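As an illustration of why short tokens suffice for notary-style services, the sketch below derives a fixed-size token from a document and checks it later. The choice of SHA-256 and the function names are assumptions; the text only speaks of "hash values of files".

```python
import hashlib

VERY_SHORT_BYTES = 100  # size bound for "very short" content used in the text

def notary_token(document: bytes) -> bytes:
    """Fixed-size digest that would be engraved instead of the document."""
    return hashlib.sha256(document).digest()

def matches(document: bytes, token: bytes) -> bool:
    """Anyone holding the document can later check it against the token."""
    return notary_token(document) == token

token = notary_token(b"contract draft, version 1")
print(len(token), len(token) <= VERY_SHORT_BYTES)       # 32 True
print(matches(b"contract draft, version 1", token))     # True
print(matches(b"contract draft, version 2", token))     # False
```

The token is 32 B regardless of the document size, so it stays well within the very short content class, while the document itself never touches the blockchain.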
It is thus crucial to face the currently disregarded risks
of arbitrary-length, easy-to-read blockchain content and to
design countermeasures. However, the benefits of short
blockchain messages require such designs to trade off secu-
rity against innovation by gauging which content is harmless.
B. Related Work
Monitoring incoming data as well as recognizing and
filtering unwanted content is a classical application of fire-
walls, intrusion detection systems, and spam filters [18]–
[21], which have drastically improved security and quality
of service within their respective domains. However, their
often high adaptability requirements are commonly tackled
via supervised or automated learning of what content should
be filtered. This is challenging w.r.t. blockchain systems:
learning must be guaranteed to be deterministic, and the
overhead of computation-intensive local checks multiplies
across the whole network and should thus be avoided. However,
we deem further investigation promising future work.
A new line of blockchain-based systems promises to avoid
the problems of objectionable blockchain content altogether
by persistently maintaining account balances instead of the
whole transaction ledger [22]–[24]. As transaction outputs
are separated during the balance update, it is hard to link
individual chunks of blockchain content to each other to
reveal the full content. Yet, forfeiting the event history con-
siderably limits potential applications. For instance, notary
services cannot be realized via such blockchains. The risks
of content insertion must thus be tackled for all blockchains.
Similarly, redactable blockchains [25] emerged, which
enable after-the-fact alteration and deletion of transactions.
These blockchains use chameleon hash functions [26] to
link blocks such that trusted entities or quorums can alter
them. The arising trust issues can be tackled by issuing
decentralized votes on blockchain alterations [27], but
again, this enhancement is incompatible with existing systems.
C. Impossibility of Rigorous Blockchain Content Filtering
We have shown that objectionable content puts blockchain
systems at risk. However, as we argue in this section,
unintended data cannot entirely be prevented from entering
public blockchains. This circumstance stems from the fact
that public blockchains are pseudonymous. Full nodes do not
verify whether an alleged receiver in a proposed transaction
does indeed exist. As of now, content can thus be easily
inserted by manipulating the receiver’s identifiers. E.g.,
Bitcoin allows inserting tens of kilobytes of arbitrary data
per transaction using multiple manipulated outputs [6].
Unfortunately, such manipulation cannot be comprehen-
sively detected by full nodes: Users can, and are even
encouraged to [10], refresh their Bitcoin addresses arbitrarily
often. Hence, a user willing to insert blockchain content
can continuously create new Bitcoin addresses for herself
by brute-forcing private keys such that the content can be
encoded via, e.g., the first few bytes of each output used.
The user can subsequently craft a transaction that sends
arbitrary amounts of bitcoins to each of her chosen outputs
and publish it. This transaction is valid and indistinguishable
from non-manipulated transactions. Furthermore, full nodes
cannot link the transaction outputs to the single user during
transaction validation due to the lack of centralized user-
identity associations. Hence, it is impossible for full nodes to
detect and reject all transactions holding unintended content.
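The cost of this brute-force strategy grows exponentially in the number of content bytes per output, which the following toy sketch makes tangible. It is not Bitcoin's address derivation: a truncated SHA-256 of a counter stands in for hashing a real public key, and all function names are hypothetical.

```python
import hashlib
from itertools import count

def toy_identifier(private_key: int) -> bytes:
    """Stand-in for address derivation (assumption: SHA-256 cut to 20 B)."""
    return hashlib.sha256(private_key.to_bytes(32, "big")).digest()[:20]

def brute_force_prefix(prefix: bytes) -> int:
    """Try keys 1, 2, 3, ... until the identifier starts with `prefix`.
    The returned key equals the number of trials that were needed."""
    for key in count(1):
        if toy_identifier(key).startswith(prefix):
            return key

def expected_trials(prefix_len: int) -> int:
    """Each prefix byte multiplies the expected work by 256."""
    return 256 ** prefix_len

def total_expected_trials(content_len: int, prefix_len: int) -> int:
    outputs = -(-content_len // prefix_len)   # ceiling division
    return outputs * expected_trials(prefix_len)

key = brute_force_prefix(b"Hi")
print(toy_identifier(key)[:2])               # b'Hi'
print(expected_trials(2))                    # 65536
print(total_expected_trials(20 * 1024, 2))   # 671088640
```

Under these assumptions, two content bytes per output already cost about 65 000 key derivations each, so hiding content in valid identifiers remains possible but trades work against throughput.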
In the following, we thus explore heuristics that reject
potentially harmful content, either by analyzing transactions
to reveal content, e.g., plain text or image files, or by
disincentivizing insertion of arbitrarily-sized content.
D. Problem Statement
We argued that persistently storing arbitrarily-sized con-
tent puts blockchain systems at risk while short data pieces
have proven beneficial. Furthermore, we have shown the
impossibility of preventing all unintended blockchain con-
tent. The goal of this paper is therefore to explore the
design space of countermeasure heuristics that (i) prevent
harmful content from entering the blockchain, (ii) are easily
deployable even for established systems such as Bitcoin, and
(iii) are adaptable in case that also short pieces of blockchain
content reveal unforeseen risks. To this end, content pieces
shall be (i) allowed if very short (at most 100 B), (ii) tolerated
if medium-sized (at most 1 KiB), and (iii) effectively prevented
if arbitrarily-sized. We note that these thresholds can be
freely adapted if deemed necessary, e.g., by court rulings.
As complete prevention of content insertion is impossible,
our countermeasures must render the insertion of arbitrarily-
sized content either computationally or financially infeasible.
IV. COUNTERMEASURES AGAINST CONTENT INSERTION
To effectively mitigate risks of blockchain content, we
identify countermeasures that limit the amount of insertable,
unintended content or make such insertion financially in-
feasible in accordance with our problem statement in Sec-
tion III-D. As a representative working example, we consider
countermeasures that are easily applicable to Bitcoin. After
proposing evaluation criteria for the countermeasure quality
(Section IV-A), we propose to (i) introduce naïve content
filtering (Section IV-B), to (ii) adapt the fee model (Sec-
tion IV-C), and to (iii) include proofs of key authenticity
within the transactions themselves (Section IV-D).
A. Evaluation Criteria
We evaluate our countermeasures’ efficiency against in-
sertion of harmful blockchain content w.r.t. filtering quality,
usability, network burden, and deployability. High filtering
quality means that insertion of harmful content (as in Sec-
tion III-D) becomes either computationally or financially
infeasible. A countermeasure is usable if it does not impede
normal system use. A low network burden is achieved
if neither the blockchain’s nor the Bitcoin network’s per-
formance is decreased significantly by a countermeasure.
Finally, a countermeasure should be deployable via only
minor changes to already-established blockchain systems
(changing transaction acceptance always requires an update).
B. Filtering Content-holding Transactions
An intuitive countermeasure against unintended block-
chain content is the systematic analysis of proposed trans-
actions and subsequent rejection of content-holding trans-
actions by full nodes and miners. As discussed in Sec-
tion III-A, the most imminent threat stems from arbitrary-
sized content that is objectionable and easily accessible, e.g.,
common files such as images. We thus first explore the naïve
approach of adding content detectors to Bitcoin’s transaction
verification to detect and reject content-holding transactions.
To detect easily accessible content in transactions, we
propose (i) a text detector to identify transactions carrying
text or ASCII-based files and (ii) a known-file detector to
identify binary files such as images or archives.
Detecting large fractions of printable text within a trans-
action prevents custom text as well as text-based files,
e.g., HTML pages or Python scripts, from entering the
blockchain. We hence propose a text detection threshold
t ∈ [0,1] to check whether individual transaction outputs
consist of large fractions of printable ASCII characters.
To choose t, we consider the detector’s expected
false-positive rate (FPR). False positives can occur as
blockchain identifiers may contain printable ASCII characters
by chance. Figure 1 shows the expected FPR for random
blockchain identifiers (20 B length, 95 of 256 printable
characters) and varying thresholds t. We observe that only
high thresholds lead to negligible FPRs: While t = 0.75
still yields an expected FPR of 0.064 %, t = 0.9 yields
an expected FPR of 1.42 × 10^-4 %. As this confirms the
intuition from previous works [6], [28], we suggest t = 0.9.
[Figure 1: Expected false-positive rate for text detection in a 20 B-long identifier (x-axis: text detection threshold; y-axis: false positive probability, logarithmic)]
[Figure 2: Cumulative distribution of numbers of outputs per transaction (x-axis: outputs per transaction [#], logarithmic; y-axis: fraction of data)]
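The quoted false-positive rates are consistent with a simple binomial model, sketched below. The model is an assumption made here for illustration: every byte of a random identifier is printable independently with probability 95/256, and an identifier counts as text if at least a fraction t of its bytes is printable.

```python
from math import ceil, comb

def expected_fpr(threshold: float, length: int = 20, printable: int = 95) -> float:
    """P(at least threshold * length of `length` random bytes are printable)."""
    p = printable / 256
    k_min = ceil(threshold * length - 1e-9)   # guard against float round-off
    return sum(comb(length, k) * p**k * (1 - p)**(length - k)
               for k in range(k_min, length + 1))

for t in (0.5, 0.75, 0.9):
    print(f"t = {t:.2f}: expected FPR = {expected_fpr(t) * 100:.3g} %")
```

Under this model the sketch gives roughly 0.064 % for t = 0.75 and roughly 1.42 × 10^-4 % for t = 0.9, in line with the values stated above.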
Unfortunately, reusing Bitcoin addresses, e.g., to collect
donations, can cause the text detector to reject valid pay-
ments if the corresponding blockchain identifier is a false
positive. Hence, also small expected FPRs can impede the
usability seriously as such identifiers are potentially used
heavily. This is illustrated, e.g., by one Bitcoin address (see footnote 1),
which is a false positive w.r.t. our text detector but received
bitcoins from over 370 transactions as of January 8th, 2018.
We thus propose to further restrict the text detector to
reject only transactions with more than 5 distinct text-holding
outputs (100 B worth of content). This way, only
short and thus harmless texts can be inserted. Notably, most
of these texts can already be inserted using an OP_RETURN
output (at lower costs). Hence, the text detector mitigates
harmful text insertion without discriminating honest users.
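A minimal sketch of the resulting rule could look as follows. It assumes that an output counts as text-holding if at least a fraction t of its identifier bytes is printable ASCII and that "distinct" refers to distinct identifier values; the function names are hypothetical and not part of any Bitcoin client.

```python
PRINTABLE = set(range(0x20, 0x7F))  # the 95 printable ASCII characters

def printable_fraction(identifier: bytes) -> float:
    if not identifier:
        return 0.0
    return sum(b in PRINTABLE for b in identifier) / len(identifier)

def is_text_output(identifier: bytes, t: float = 0.9) -> bool:
    return printable_fraction(identifier) >= t

def reject_transaction(output_identifiers, t: float = 0.9,
                       max_text_outputs: int = 5) -> bool:
    """Reject only if more than `max_text_outputs` distinct outputs hold text."""
    text_outputs = {o for o in output_identifiers if is_text_output(o, t)}
    return len(text_outputs) > max_text_outputs

texts = [f"chunk {i:02d} of some text".encode()[:20] for i in range(6)]
binary = [bytes([200 + i]) * 20 for i in range(6)]

print(reject_transaction(texts))            # True: six distinct text outputs
print(reject_transaction(texts[:5]))        # False: within the allowance
print(reject_transaction([texts[0]] * 6))   # False: one reused identifier
print(reject_transaction(binary))           # False: no printable content
```

Counting distinct identifiers means that a reused address that happens to look like text, such as the donation address mentioned above, does not trigger a rejection on its own.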
Contrarily, preventing insertion of binary files is not fea-
sible via this approach. While file types can be determined
using magic numbers, i.e., byte sequences that are unique to
the respective file type, these sequences are often only few
bytes long [29]. Hence, the expected FPR increases drasti-
cally. Although the accuracy can be improved by considering
more characteristic features, e.g., default headers, content
inserters can evade this overly specific detector more easily.
They can, e.g., introduce easily revertible modifications to
the content such as a deterministic padding.
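The evasion argument can be reproduced with a small sketch. The magic numbers below are the well-known signatures of the respective formats; the detector and the one-byte padding scheme are hypothetical simplifications.

```python
MAGIC_NUMBERS = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
    "pdf": b"%PDF",
    "zip": b"PK\x03\x04",
}

def detect_file_type(payload: bytes):
    """Return the first file type whose magic number starts the payload."""
    for name, magic in MAGIC_NUMBERS.items():
        if payload.startswith(magic):
            return name
    return None

def pad(payload: bytes, padding: bytes = b"\x00") -> bytes:
    """Deterministic, trivially revertible modification."""
    return padding + payload

def unpad(payload: bytes, padding: bytes = b"\x00") -> bytes:
    return payload[len(padding):]

png_like = MAGIC_NUMBERS["png"] + b"stand-in image data"
print(detect_file_type(png_like))         # png
print(detect_file_type(pad(png_like)))    # None
print(unpad(pad(png_like)) == png_like)   # True
```

A single leading byte is enough to hide the file from this detector, while any reader who knows the scheme recovers the original file unchanged.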
Evaluation. Content detectors can be fine-tuned specif-
ically to reject unwanted content with a low overhead for
full nodes. Furthermore, usability is guaranteed since we
detect false positives with high probability. Thus, honest
users are not affected by the detector. However, the detector’s
filtering quality is insufficient for binary files and we expect
a wide range of evasion schemes to emerge. This would
imply a poor deployability, as novel evasion schemes result
in frequent mandatory updates for all honest full nodes. In
conclusion, explicit content detection constitutes a first line
of defense against unwanted blockchain content and works
for text-based content. Mitigating the insertion of unwanted
binary files, however, requires more general approaches.
Footnote 1: Bitcoin address: 154QWLN3Uz43nHMAM7ioYUx8tkYXdNKDtQ
[Figure 3: Proposed minimum fees for different growths β(n): (a) piecewise constant growth, (b) piecewise linear growth (x-axis: outputs [#], logarithmic; y-axis: fees [USD]; series: our scheme, current fees)]
C. Mandatory Minimal Transaction Fees
Bitcoin transaction fees are usually paid per byte, with
a current recommendation of 423 satoshi/B (16.14 USD
for the average 250 B-large transaction) as of December
2017 [12]. Although such fees seem high, new content is
actively being added [6]. We thus propose to adapt Bitcoin’s
underlying fee model to hinder content insertion.
As Figure 2 shows, the vast majority of all nearly 255
million transactions (as of August 2017) has at most 50
outputs (99.73 %). Of all transactions, 97.22 % even have
5 outputs or fewer and 91.77 % have at most 2 outputs.
Thus, we propose to enforce mandatory minimum fees to
penalize large transactions and thus disincentivize content
insertion. Given a proposed transaction $t$ with size $s_t$ and
number of outputs $n_t$, we propose the simple fee function:
$$F(t) = \alpha \cdot (s_t + \beta(n_t) \cdot n_t).$$
Here, $\alpha$ is the byte-wise base fee and $\beta(n)$ is a
piecewise-defined function depending on thresholds $T_s$ and $T_m$ that
distinguish small, medium-sized, and large output numbers.
As an example, Figure 3 shows the resulting fees for $\alpha =
423$ satoshi/B, thresholds $T_s = 6$, $T_m = 51$ and
$$\beta_C(n) = \begin{cases} 0 & n < T_s \\ 10 & n \in [T_s, T_m] \\ 20 & n > T_m \end{cases}$$
in Figure 3a and $\beta_L(n) = \sum_{i=0}^{n} \beta_C(i)$ in Figure 3b (the
dashed lines denote $T_s$ and $T_m$). While we leave the exact
parametrization open, our choices for $\beta(n)$ to be piecewise
constant ($\beta_C(n)$) or piecewise linear ($\beta_L(n)$) showcase the
design space for mandatory minimum fees. Neither approach
impedes small transactions (over 97 % of all transactions),
but both penalize larger transactions in varying degrees.
For instance, using a piecewise constant fee growth per
output, a borderline-large transaction (50 outputs) would
incur an additional 29 USD in fees, while the linear fee growth
yields roughly 1310 USD in additional fees, i.e., legitimate
creators of medium-sized transactions are penalized more.
However, the penalty fees grow gradually for
medium-sized transactions. Content inserters, in contrast, are
especially punished by linear-growth fee penalties: E.g.,
storing a small JPEG image of 20 KiB (1024 outputs) would
cost the content inserter roughly 1.19 × 10^6 USD in fees.
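The fee function can be sketched directly from its definition. The code below follows the formula as printed, reads the piecewise linear growth as a running sum of the piecewise constant one, and reports fees in satoshi; these readings and all function names are assumptions. It makes no attempt to reproduce the USD figures quoted above, which depend on an exchange rate and on accounting details that the text does not spell out.

```python
ALPHA = 423        # byte-wise base fee in satoshi/B (value from the text)
T_S, T_M = 6, 51   # thresholds for small and medium-sized output numbers

def beta_c(n: int) -> int:
    """Piecewise constant growth."""
    if n < T_S:
        return 0
    if n <= T_M:
        return 10
    return 20

def beta_l(n: int) -> int:
    """Piecewise linear growth, read as the running sum of beta_c."""
    return sum(beta_c(i) for i in range(n + 1))

def min_fee(size_bytes: int, n_outputs: int, beta=beta_c, alpha: int = ALPHA) -> int:
    """Minimum fee alpha * (size + beta(n) * n) in satoshi."""
    return alpha * (size_bytes + beta(n_outputs) * n_outputs)

print(min_fee(225, 2))                  # 95175: small transactions pay no penalty
print(min_fee(225, 50) - 423 * 225)     # 211500: constant-growth penalty, 50 outputs
print(beta_l(50))                       # 450
print(min_fee(225, 50, beta=beta_l) > min_fee(225, 50))  # True
```

Passing the growth function as a parameter keeps the parametrization open, as the text suggests, so other penalty shapes can be compared without touching the fee check itself.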
Evaluation. Mandatory minimum fees are promising to
disincentivize content insertion. They are easily deployable
(a single check during transaction verification) and have neg-
ligible overhead as full nodes only must check whether the
transaction pays at least the required fees. However, manda-
tory fees have an inherent trade-off between usability and
filtering quality: If the fee model is tuned towards rejecting
even small amounts of content, honest users currently relying
on large transactions, e.g., exchange services, are potentially
penalized. Hence, the fee model’s parametrization must be
thoroughly evaluated prior to its deployment.
D. Self-Verifying Account Identifiers
We propose a simple adaption of standard Bitcoin trans-
actions to make content insertion computationally infeasible.
We approximate the best case of infeasible content insertion
outlined in Section III-C by only recording cryptographi-
cally non-manipulable values on the blockchain.
Currently, content inserters can easily replace mutable
identifiers in their transaction outputs with arbitrary values.
Full nodes are unable to validate the correctness of these
identifiers until they receive a future transaction attempting
to spend a particular output. Hence, full nodes are forced to
accept manipulated transaction outputs into the blockchain.
We hence propose to replace manipulable identifiers of trans-
action outputs with identifier commitments (ICs). Namely,
the IC C(x) is obtained by interpreting an identifier x as
a private key over Bitcoin’s elliptic curve secp256k1 and
signing the corresponding public value x·G (G the generator
of secp256k1) together with a salting nonce r via ECDSA:
$$C(x) := (x \cdot G,\; r,\; \mathrm{sig}(x \cdot G \,\|\, r,\; x))$$
Replacing x with C(x) ensures that (a possibly
content-hiding) x never appears on the blockchain. This hiding of x
is sufficient for outputs, i.e., inputs are not manipulable as
they can only be created by proving possession of a private
key. To hinder content insertion, x must not be efficiently
computable from C(x) and C(x) must not enable content
insertion by other means than brute-forcing it. To this end,
we rely on ICs to be one-way, fresh, and self-verifying.
The one-way property of x·G guarantees that x cannot
be computed efficiently from x·G. Thus, x·G can safely
be added to the blockchain. Furthermore, x is unknown for
manipulated x·G and hence the signature sig(x·G ∥ r, x)
cannot be computed. Thus, it is computationally infeasible
to compute a valid C(x) hiding content in x·G.
Freshness of C(x) is required to hamper the creation of
rainbow tables for identifiers x that yield ICs well-suited
for content insertion, e.g., a reusable file header. We ensure
freshness via adding the salt r = CRC32(t_1 ∥ ... ∥ t_n), where
t_i is the hash value identifying the previous, already mined
transaction referenced by the i-th input. As r depends on
predecessor transactions, it changes for almost all transactions,
which makes creating rainbow tables unprofitable. Moreover,
it prevents content storage in r itself. The 4 B short salt
ensures that the blockchain is not bloated unnecessarily.
                P2UC:               P2PKH:
Input Script:   [signature]         [signature]
                [public key]        [public key]
Output Script:  OP_DUP              OP_DUP
                OP_HASH256          OP_HASH160
                OP_COMMIT
                [commitment x*G]    [public key hash]
                [salt r]
                [sig(x*G||r, x)]
                OP_DROP
                OP_DROP
                OP_EQUALVERIFY      OP_EQUALVERIFY
                OP_CHECKSIG         OP_CHECKSIG
Figure 4: P2UC script; P2SC replaces P2SH analogously.
Finally, the self-verifying property ties C(x) to x such that
only possession of the private value x enables the spending
of funds. Furthermore, it ensures that x·G, r, and the
signature sig(x·G ∥ r, x) cannot be manipulated to contain
content without invalidating the signature.
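The three properties can be exercised with a self-contained toy implementation. This is a sketch under loud assumptions: the curve arithmetic is plain, non-constant-time Python, the signing nonce is derived by a simple hash rather than RFC 6979, messages are hashed with SHA-256, and all function names are hypothetical. It is meant to illustrate the structure of C(x), not to serve as production cryptography.

```python
import hashlib
import zlib

# secp256k1 domain parameters
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    if a[0] == b[0] and (a[1] + b[1]) % P == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], -1, P) % P
    else:
        lam = (b[1] - a[1]) * pow((b[0] - a[0]) % P, -1, P) % P
    x = (lam * lam - a[0] - b[0]) % P
    return (x, (lam * (a[0] - x) - a[1]) % P)

def scalar_mult(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result

def on_curve(pt) -> bool:
    return pt is not None and (pt[1] ** 2 - pt[0] ** 3 - 7) % P == 0

def serialize(pt) -> bytes:
    """33-byte compressed point encoding."""
    return bytes([2 + (pt[1] & 1)]) + pt[0].to_bytes(32, "big")

def digest(message: bytes) -> int:
    return int.from_bytes(hashlib.sha256(message).digest(), "big")

def sign(message: bytes, priv: int):
    k = digest(priv.to_bytes(32, "big") + message) % (N - 1) + 1  # toy nonce
    r = scalar_mult(k, G)[0] % N
    return r, pow(k, -1, N) * (digest(message) + r * priv) % N

def verify(message: bytes, signature, pub) -> bool:
    r, s = signature
    if not (0 < r < N and 0 < s < N):
        return False
    w = pow(s, -1, N)
    pt = point_add(scalar_mult(digest(message) * w % N, G),
                   scalar_mult(r * w % N, pub))
    return pt is not None and pt[0] % N == r

def salt(prev_tx_hashes) -> bytes:
    """Freshness: CRC32 over the concatenated predecessor transaction hashes."""
    return zlib.crc32(b"".join(prev_tx_hashes)).to_bytes(4, "big")

def commit(identifier: bytes, prev_tx_hashes):
    x = int.from_bytes(identifier, "big") % N   # assumed non-zero
    pub = scalar_mult(x, G)                     # one-way: x·G hides x
    r = salt(prev_tx_hashes)
    return pub, r, sign(serialize(pub) + r, x)

def validate(ic, prev_tx_hashes) -> bool:
    """Self-verifying: the signature must verify under x·G itself."""
    pub, r, sig = ic
    return (on_curve(pub) and r == salt(prev_tx_hashes)
            and verify(serialize(pub) + r, sig, pub))

prev = [hashlib.sha256(b"previous transaction").digest()]
ic = commit(hashlib.sha256(b"some public key").digest()[:20], prev)
print(validate(ic, prev))                        # True
forged = (scalar_mult(2, ic[0]), ic[1], ic[2])   # swap in a different point
print(validate(forged, prev))                    # False
```

Validation needs only the commitment and the referenced predecessor hashes, so a full node can run this check before accepting the output, without ever learning x.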
Implementation. ICs can be easily integrated into Bitcoin
by adding only a new OP_COMMIT operation to its stack-based
scripting language. This way, we can replace the most
common transaction scripts, P2PKH and P2SH, with non-
manipulable alternatives: Pay-to-User-Commitment (P2UC)
and Pay-to-Script-Commitment (P2SC).
Figure 4 shows the transition from P2PKH to P2UC as an
example (P2SC analogously replaces P2SH). We replace
the manipulable identifier x with its IC C(x). However, full
nodes must verify the correctness of C(x) using the salt r
and the signature sig(x·G ‖ r, x) before the transaction is
added to the blockchain. Hence, this check is independent
of actually executing the script when the transaction output
is later spent. Instead, full nodes only need to verify the
correctness of x·G to authenticate payments. To this end,
we introduce OP_COMMIT, which replaces the current top
stack element x with x·G. We ignore the signature and salt
during payment verification via the OP_DROP operations.
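To make the payment-verification path concrete, the sketch below replays the P2UC output script, as we read it from Figure 4, on a toy stack. It assumes that OP_HASH256 is Bitcoin's double SHA-256, that OP_COMMIT interprets the top stack element as a secp256k1 scalar x and pushes the compressed 33 B point x·G, and it stops before OP_CHECKSIG. The helper names (point_add, scalar_mult, op_commit, run_p2uc_output) are hypothetical, and the elliptic-curve code is an unoptimized textbook implementation for illustration only, not the authors' prototype.

```python
import hashlib

# secp256k1 domain parameters (curve y^2 = x^3 + 7 over F_P).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def scalar_mult(k, point):
    """Compute k * point with double-and-add."""
    result = None
    while k:
        if k & 1:
            result = point_add(result, point)
        point = point_add(point, point)
        k >>= 1
    return result


def hash256(data: bytes) -> bytes:
    """Double SHA-256 (assumed meaning of OP_HASH256)."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def op_commit(x: bytes) -> bytes:
    """Assumed OP_COMMIT semantics: map x to the compressed point x*G."""
    point = scalar_mult(int.from_bytes(x, "big") % N, G)
    if point is None:
        raise ValueError("x maps to the point at infinity")
    prefix = b"\x02" if point[1] % 2 == 0 else b"\x03"
    return prefix + point[0].to_bytes(32, "big")


def run_p2uc_output(signature, public_key, commitment, salt, ic_sig):
    """Replay the P2UC output script up to (excluding) OP_CHECKSIG."""
    stack = [signature, public_key]        # pushed by the input script
    stack.append(stack[-1])                # OP_DUP
    stack.append(hash256(stack.pop()))     # OP_HASH256
    stack.append(op_commit(stack.pop()))   # OP_COMMIT
    stack += [commitment, salt, ic_sig]    # data pushes of the IC
    stack.pop()                            # OP_DROP (IC signature)
    stack.pop()                            # OP_DROP (salt)
    if stack.pop() != stack.pop():         # OP_EQUALVERIFY
        raise ValueError("commitment mismatch")
    return stack                           # left for OP_CHECKSIG


# Hypothetical placeholder values; not a real key or signature.
pubkey = b"\x02" + bytes(32)
commitment = op_commit(hash256(pubkey))
remaining = run_p2uc_output(b"sig", pubkey, commitment, bytes(4), b"ic-sig")
print(commitment.hex(), remaining == [b"sig", pubkey])
```

In this reading, the salt and the IC signature are only dropped during spending because they were already checked when the transaction was validated for inclusion in the blockchain.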
Performance. We evaluate our scheme w.r.t. validation
and payment verification times and transaction sizes.
Validating an IC requires one additional signature
verification, which takes 0.4 ms on average on a commodity PC
(Intel Core 2 Quad Q9400 CPU running at 2.66 GHz with
4 GiB RAM). Hence, full nodes need only 3.8 s to validate
a proposed block consisting of roughly 1900 transactions
(average over all blocks of 2017 [1]) with at most 5 outputs
each (cf. Section IV-C). As new blocks are only published
every 10 min on average [11], this additional check is
clearly feasible for full nodes, even for exceptionally large
transactions with up to 50 outputs each (38 s).
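The validation times quoted above follow from simple arithmetic, which the short sketch below reproduces. It assumes one IC validation per transaction output and uses the figures from the text (0.4 ms per validation, 1900 transactions per block, a 10 min block interval); the constant and function names are illustrative.

```python
SIG_VERIFY_US = 400       # 0.4 ms per IC validation, in microseconds
TXS_PER_BLOCK = 1900      # average number of transactions per block in 2017
BLOCK_INTERVAL_S = 600    # new blocks appear every 10 min on average


def block_validation_seconds(outputs_per_tx: int) -> float:
    """Extra time to validate one IC per output for a whole block."""
    return TXS_PER_BLOCK * outputs_per_tx * SIG_VERIFY_US / 1_000_000


for outputs in (5, 50):
    seconds = block_validation_seconds(outputs)
    share = seconds / BLOCK_INTERVAL_S
    print(f"{outputs} outputs/tx: {seconds:.1f} s "
          f"({share:.1%} of the block interval)")
```

Even in the 50-output case, the additional check occupies only a small fraction of the average block interval on the measured hardware.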
Introducing ICs slightly changes payment verification:
A full node must execute OP_COMMIT and compute
OP_HASH256 instead of OP_HASH160. Computing one IC
takes 0.2 ms on average and, notably, OP_HASH256
outperforms OP_HASH160 for small input sizes. Furthermore,
the cost of computing the salt as a CRC32 checksum is negligible.
Hence, ICs do not impede payment verification.
Replacing blockchain identifiers with ICs increases the
overall transaction size. An IC is 112 B long (x·G: 33 B; r:
4 B; sig(x·G ‖ r, x): 77 B; 4 B for additional operations), in
contrast to the 20 B size of the unprotected identifier. Hence,
a standard (P2PKH) transaction consisting of one input
and two outputs grows from 225 B to 409 B. Blocks can
therefore hold up to 2445 transactions, which comfortably
sustains the current average of 1900 transactions per block.
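The size figures above can be retraced as follows. The sketch takes the 112 B IC size, the 20 B identifier size, and the 225 B baseline transaction from the text; the block size limit of 1,000,000 B is an assumption that the text does not state explicitly, and the names are illustrative.

```python
IC_SIZE_B = 112            # size of one identifier commitment
IDENTIFIER_SIZE_B = 20     # size of the unprotected identifier
P2PKH_TX_SIZE_B = 225      # standard transaction: one input, two outputs
BLOCK_LIMIT_B = 1_000_000  # assumption: 1 MB block size limit


def tx_size_with_ics(base_size: int, outputs: int) -> int:
    """Transaction size after replacing each output's identifier by an IC."""
    return base_size + outputs * (IC_SIZE_B - IDENTIFIER_SIZE_B)


p2uc_tx_size = tx_size_with_ics(P2PKH_TX_SIZE_B, outputs=2)
txs_per_block = BLOCK_LIMIT_B / p2uc_tx_size
print(f"P2UC transaction: {p2uc_tx_size} B, "
      f"about {txs_per_block:.0f} transactions per block")
```

Under this assumption, the per-block capacity remains above the average of 1900 transactions per block reported in the text.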
Evaluation. ICs have a high filtering quality as they
reduce insertable content to the theoretical minimum (cf.
Section III-C). Furthermore, usability is not impeded: Users
only need to additionally compute the IC, which is clearly
feasible at the transaction creation rates expected of
individual users. Even though we need to introduce
OP_COMMIT and extend Bitcoin’s transaction validation
process, we argue that the required changes are only minor
and thus ICs are readily deployable. ICs inevitably increase
transaction sizes. However, the Bitcoin network can still
sustain its transaction throughput. Thus, the overhead is
worth the IC-based protection against content insertion.
V. CONCLUSION
The threat of inserting arbitrary content into blockchains
was only recently recognized: Objectionable content can be
anonymously and irrevocably inserted and thus distributed
to the nodes of a blockchain-based system, whose operators
can then be culpable for possessing the content.
We proposed conceptual countermeasures to empower the
nodes of a blockchain-based system to heuristically reject
transactions holding unintended content with high
probability. Namely, we proposed a content detector to identify
and reject content-holding transactions, mandatory minimum
transaction fees to make content insertion economically
infeasible, and a computationally non-manipulable,
commitment-based replacement for easily manipulable
identifiers in transaction outputs.
Our countermeasures complement each other, but they can
also be deployed gradually, e.g., the content detector is
immediately deployable. Nevertheless, content inserters can
quickly adapt to its strategy and evade detection. It is thus
only an effective ad hoc defense that should soon be
accompanied by a fee model that disincentivizes large
transactions, which are suited to hold objectionable content.
Furthermore, using non-manipulable blockchain identifiers limits
content insertion to the theoretical minimum at moderate costs.
ACKNOWLEDGMENTS
This work has been funded by the German Federal
Ministry of Education and Research (BMBF) under funding
reference number 16KIS0443. The responsibility for the
content of this publication lies with the authors, who would
also like to thank the German Research Foundation DFG for
the kind support within the Cluster of Excellence
“Integrative Production Technology for High-Wage Countries”.
REFERENCES
[1] Blockchain.info. (2011) Bitcoin Charts & Graphs. Accessed
02/04/2018. [Online]. Available:
https://blockchain.info/charts
[2] A. Yeow, “Bitnodes: Global Bitcoin Nodes Distribution,”
2018, accessed 02/04/2018. [Online]. Available:
https://bitnodes.earn.com/dashboard/?days=730
[3] D. D. F. Maesa, A. Marino, and L. Ricci, “Data-driven
analysis of Bitcoin properties: exploiting the users graph,”
International Journal of Data Science and Analytics,
pp. 1–18, Sep. 2017.
[4] K. Shirriff. (2014) Hidden surprises in the Bitcoin blockchain
and how they are stored: Nelson Mandela, Wikileaks,
photos, and Python software. Accessed 02/04/2018.
[Online]. Available:
http://www.righto.com/2014/02/ascii-bernanke-wikileaks-photographs.html
[5] R. Matzutt, O. Hohlfeld, M. Henze, R. Rawiel,
J. H. Ziegeldorf, and K. Wehrle, “POSTER: I Don’t Want That
Content! On the Risks of Exploiting Bitcoin’s Blockchain as a
Content Store,” in Proceedings of the 23rd ACM Conference on
Computer and Communications Security (CCS). ACM, 2016.
[6] R. Matzutt, J. Hiller, M. Henze, J. H. Ziegeldorf,
D. Müllmann, O. Hohlfeld, and K. Wehrle, “A Quantitative
Analysis of the Impact of Arbitrary Blockchain Content on
Bitcoin,” in Proceedings of the 22nd International
Conference on Financial Cryptography and Data Security (FC).
Springer, 2018.
[7] PoEx Co., Ltd. (2015) Proof of Existence. Accessed
02/04/2018. [Online]. Available: https://poex.io
[8] J. Clark and A. Essex, “CommitCoin: Carbon Dating
Commitments with Bitcoin,” in Proceedings of the 16th
International Conference on Financial Cryptography and Data
Security (FC). Springer, 2012, pp. 390–398.
[9] A. Tomescu and S. Devadas, “Catena: Efficient
non-equivocation via bitcoin,” in IEEE Symposium on Security
and Privacy (S&P). IEEE, 2017, pp. 393–409.
[10] S. Nakamoto, “Bitcoin: A Peer-to-Peer Electronic Cash
System,” Tech. Rep., 2008, accessed 02/04/2018. [Online].
Available: https://bitcoin.org/bitcoin.pdf
[11] J. Bonneau, A. Miller, J. Clark, A. Narayanan, J. A. Kroll, and
E. W. Felten, “SoK: Research Perspectives and Challenges
for Bitcoin and Cryptocurrencies,” in IEEE Symposium on
Security and Privacy (S&P). IEEE, 2015, pp. 104–121.
[12] (2016) Bitcoin Transaction Fees. Accessed 02/04/2018.
[Online]. Available: https://bitcoinfees.info
[13] HugPuddle. (2013) Apertus – Archive data on your favorite
blockchains. Accessed 02/04/2018. [Online]. Available:
http://apertus.io
[14] A. Russell, P. Norby, and S. Bakhshi. (2015) Perceptual
Image Compression at Flickr. Accessed 02/04/2018.
[Online]. Available:
https://code.flickr.net/2015/09/25/perceptual-image-compression-at-flickr
[15] M. Bartoletti and L. Pompianu, “An analysis of Bitcoin
OP_RETURN metadata,” in Proceedings of the 4th Workshop
on Bitcoin and Blockchain Research (BITCOIN), 2017.
[16] Namecoin. Accessed 02/04/2018. [Online]. Available:
https://namecoin.org
[17] G. Wood, “Ethereum: A Secure Decentralised Generalised
Transaction Ledger,” 2016, accessed 02/04/2018. [Online].
Available: http://gavwood.com/Paper.pdf
[18] M. Roesch, “Snort - Lightweight Intrusion Detection for
Networks,” in Proceedings of the 13th USENIX Conference on
System Administration (LISA). USENIX Association, 1999,
pp. 229–238.
[19] S. Ioannidis, A. D. Keromytis, S. M. Bellovin, and J. M.
Smith, “Implementing a Distributed Firewall,” in Proceedings
of the 7th ACM Conference on Computer and Communications
Security (CCS). ACM, 2000, pp. 190–199.
[20] H. Hu, W. Han, G.-J. Ahn, and Z. Zhao, “FLOWGUARD:
Building Robust Firewalls for Software-defined Networks,” in
Proceedings of the Third Workshop on Hot Topics in Software
Defined Networking (HotSDN). ACM, 2014, pp. 97–102.
[21] U. Hanani, B. Shapira, and P. Shoval, “Information filtering:
Overview of issues, research and systems,” User Modeling
and User-Adapted Interaction, vol. 11, no. 3, pp. 203–259,
Aug. 2001.
[22] J. D. Bruce, “The Mini-Blockchain Scheme,” White
paper, 2014, accessed 02/04/2018. [Online]. Available:
http://cryptonite.info/files/mbc-scheme-rev3.pdf
[23] A. Chepurnoy, M. Larangeira, and A. Ojiganov, “Rollerchain,
a Blockchain With Safely Pruneable Full Blocks,” White
paper, 2016, accessed 02/04/2018. [Online]. Available:
https://arxiv.org/pdf/1603.07926
[24] A. Molina and H. Schoenfeld, “PascalCoin Version
2,” White paper, 2017, accessed 02/04/2018. [Online].
Available:
https://www.pascalcoin.org/wp-content/uploads/2017/07/PascalCoinWhitePaperV2.pdf
[25] G. Ateniese, B. Magri, D. Venturi, and E. Andrade,
“Redactable Blockchain – or – Rewriting History in Bitcoin
and Friends,” in IEEE European Symposium on Security and
Privacy (EuroS&P). IEEE, 2017, pp. 111–126.
[26] J. Camenisch, D. Derler, S. Krenn, H. C. Pöhls, K. Samelin,
and D. Slamanig, “Chameleon-Hashes with Ephemeral
Trapdoors,” in Proceedings of Public-Key Cryptography (PKC),
S. Fehr, Ed. Springer, 2017, pp. 152–182.
[27] I. Puddu, A. Dmitrienko, and S. Capkun, “µchain: How
to forget without hard forks,” IACR Cryptology ePrint
Archive, vol. 2017:106, 2017, accessed 02/04/2018. [Online].
Available: http://eprint.iacr.org/2017/106
[28] “Hyena”. Cryptograffiti.info. Accessed 02/04/2018. [Online].
Available: http://cryptograffiti.info
[29] G. Kessler. (2002) File Signature Table. Accessed 02/04/2018.
[Online]. Available:
https://www.garykessler.net/library/file_sigs.html
... One approach to reducing the risks of blockchain content is preventing it from being recorded. Available strategies include: (a) detecting content in pending transactions, (b) financially disincentivizing the creation of large transactions, or (c) hardening the easily replaceable hash values of transaction outputs against manipulation [18]. However, while these strategies can limit the extent of content insertion, none of them can provide full protection [18]. ...
... Available strategies include: (a) detecting content in pending transactions, (b) financially disincentivizing the creation of large transactions, or (c) hardening the easily replaceable hash values of transaction outputs against manipulation [18]. However, while these strategies can limit the extent of content insertion, none of them can provide full protection [18]. ...
... Unfortunately, our analysis shows that previous proposals do not provide swift and transparent redactions that also cover unintended insertion methods without complicating the validation process. Content prevention [18] mitigates blockchain content but only provides heuristics, i.e., full prevention cannot be guaranteed. Local erasure [19] allows node operators to react quickly but burdens them with removing the illicit content themselves. ...
Conference Paper
Full-text available
Blockchains gained tremendous attention for their capability to provide immutable and decentralized event ledgers that can facilitate interactions between mutually distrusting parties. However, precisely this immutability and the openness of permissionless blockchains raised concerns about the consequences of illicit content being irreversibly stored on them. Related work coined the notion of redactable blockchains, which allow for removing illicit content from their history without affecting the blockchain's integrity. While honest users can safely prune identified content, current approaches either create trust issues by empowering fixed third parties to rewrite history, cannot react quickly to reported content due to using lengthy public votings, or create large per-redaction overheads. In this paper, we instead propose to outsource redactions to small and periodically exchanged juries, whose members can only jointly redact transactions using chameleon hash functions and threshold cryptography. Multiple juries are active at the same time to swiftly redact reported content. They oversee their activities via a global redaction log, which provides transparency and allows for appealing and reversing a rogue jury's decisions. Hence, our approach establishes a framework for the swift and transparent moderation of blockchain content. Our evaluation shows that our moderation scheme can be realized with feasible per-block and per-redaction overheads, i.e., the redaction capabilities do not impede the blockchain's normal operation.
... data storage [11]- [13] has put a permanent burden onto the system and its users, as (a) such misuse typically bloats the set of unspent transaction outputs (UTXO set) with entries that are never spendable and (b) objectionable content can irrevocably be engraved into the blockchain and is subsequently distributed to all nodes [14]. Large blockchain sizes and the presence of objectionable blockchain content cause individual nodes to prune older blockchain data [15], i.e., older payment flows that have been superseded by newer ones, or locally erase UTXOs that hold unwanted content [16] at the cost of becoming dependent on other nodes for transaction validation. ...
... The spending condition associated with a P2PKH transaction output requires the designated spender to present the public key matching a cryptographic hash value given in the P2PKH script and a digital signature created using the corresponding private key. However, nodes receiving a transaction have no way to verify that this private key is indeed known to any user [14]. Therefore, nodes are forced to include potentially unspendable transaction outputs in their UTXO set, which would then reside there forever. ...
... Second, other users manipulate the mutable values of standard financial transactions, e.g., hash values within P2PKH transaction outputs, as an unintended means to insert arbitrary data at a higher capacity [12]. As this manipulation cannot be detected reliably and the affected transaction outputs cannot be determined to be unspendable [14], such transaction outputs are added to the UTXO set and are likely to remain there forever. Consequently, any objectionable content may not only enter the immutable blockchain but also reside in the UTXO set indefinitely. ...
Preprint
Popular cryptocurrencies continue to face serious scalability issues due to their ever-growing blockchains. Thus, modern blockchain designs began to prune old blocks and rely on recent snapshots for their bootstrapping processes instead. Unfortunately, established systems are often considered incapable of adopting these improvements. In this work, we present CoinPrune, our block-pruning scheme with full Bitcoin compatibility, to revise this popular belief. CoinPrune bootstraps joining nodes via snapshots that are periodically created from Bitcoin's set of unspent transaction outputs (UTXO set). Our scheme establishes trust in these snapshots by relying on CoinPrune-supporting miners to mutually reaffirm a snapshot's correctness on the blockchain. This way, snapshots remain trustworthy even if adversaries attempt to tamper with them. Our scheme maintains its retrospective deployability by relying on positive feedback only, i.e., blocks containing invalid reaffirmations are not rejected, but invalid reaffirmations are outpaced by the benign ones created by an honest majority among CoinPrune-supporting miners. Already today, CoinPrune reduces the storage requirements for Bitcoin nodes by two orders of magnitude, as joining nodes need to fetch and process only 6 GiB instead of 271 GiB of data in our evaluation, reducing the synchronization time of powerful devices from currently 7 h to 51 min, with even larger potential drops for less powerful devices. CoinPrune is further aware of higher-level application data, i.e., it conserves otherwise pruned application data and allows nodes to obfuscate objectionable and potentially illegal blockchain content from their UTXO set and the snapshots they distribute.
... Hasan et al 2020 [100] Jaiswal et al 2019 [75] Lallas et al 2019 [79] Madhumida et al 2019 [55] Bose et al 2018 [85] Shwetha et al 2019 [102] Ramalingaiah et al 2018 [103] Omar et al 2019 [104] Li et al 2019 [86] Ding et al 2020 [90] Li et 2018 [105] Xie et al 2018 [91] Bodkhe et al 2019 [94] Epiphaniou et al 2020 [5] Ledwaba et al 2019 [106] Seitz et al 2018 [107] Abdellatif et 2018 [108] Stodt et al 2018 [109] Devi et al 2019 [110] Wang et al 2019 [98] Matzutt et al 2018 [111 ] Sidorov et al 2018 [96] Dinh et al 2018 [48] Demir et al 2018 [73] Yu et al 2018 [2] Onishi 2018 [31] Karama?oski et al 2019 [38] Perez et al 2018 [39] Zou et al 2019 [93] Xu et al 2019 [14] Bellini et al 2020 [4] Kozma et al 2019 [12] Zhang et al 2018 [52] Ensor et al 2018 [46] Kshetri et al 2019 [23] Demir et al 2019 [84] Zhang et al 2019 [101] Security Assessments Soni et al 2019 [8] Zhang et al 2019 [18] Sabbagh et al 2015 [19] Xu et al 2018 [34] Rahouti et al 2018 [15] Shalini et al 201 [33] Vinayak et al 2018 [40] Monrat et al 2019 [45] Golatowski et al 2019 [57] Murray et al 2019 [59] Zorzo et al 2018 [58] Mohanty et al 2019 [92] Miscellanous Kitchenham et al 2007 [24] Kitchenham et al 2009 [25] Keele et al 2007 [26] Snyder 2019 [27] Fig. 2. Taxonomy: Distributed Ledger Technologies Integration in Supply Chain Security Management the inclusion of a fork-resolving policy that proactively ignore blocks that are not published within a set time. ZeroBlock, a timestamp free solution operates on a similar principle which allows honest miners to reject blocks that are mined for longer than a set interval. ...
... Matzutt et al [111] argued that permanently storing information on BC puts systems at risk. They highlight that it is extremely difficult to preclude unintended content in BC networks. ...
Article
Supply chains (SC) present performance bottlenecks that contribute to a high level of costs, infiltration of product quality, and impact productivity. Examples of such inhibitors include the bullwhip effect, new product lines, high inventory, and restrictive data flows. These bottlenecks can force manufacturers to source more raw materials and increase production significantly. Also, restrictive data flow in a complex global SC network generally slows down the movement of goods and services. The use of distributed ledger technologies (DLT) in SC management (SCM) demonstrates the potentials to reduce these bottlenecks through transparency, decentralization, and optimizations in data management. These technologies promise to enhance the trustworthiness of entities within the SC, ensure the accuracy of data-driven operations, and enable existing SCM processes to migrate from a linear to a fully circular economy. This article presents a comprehensive review of 111 articles published in the public domain in the use and efficacy of DLT in SC. It acts as a roadmap for current and future researchers who focus on SC security management to better understand the integration of digital technologies such as DLT. We clustered these articles using standard descriptors linked to trustworthiness, namely, immutability, transparency, traceability, and integrity.
... By allowing arbitrary data in transactions, objectionable content can find its way onto the public ledger [15]. Considering that transactions on the blockchain can be sent anonymously, the potential for criminal intentions becomes evident. ...
... Various approaches have been developed for dealing with the dangers of arbitrary content insertion on public blockchains. They can be grouped into categories as follows: avoiding the inclusion of unwanted data [15], allowing the modification (and erasure) of past blockchain state [2,9,13,21], and local pruning [12]. In this paper, we focus on developing tools for determining the severity of the original problem and whether the implementation of additional protection approaches is necessary. ...
Conference Paper
Full-text available
Blockchain-based systems have gained immense popularity as enablers of independent asset transfers and smart contract functionality. They have also, since as early as the first Bitcoin blocks, been used for storing arbitrary contents such as texts and images. On-chain data storage functionality is useful for a variety of legitimate use cases. It does, however, also pose a systematic risk. If abused, for example by posting illegal contents on a public blockchain, data storage functionality can lead to legal consequences for operators and users that need to store and distribute the blockchain, thereby threatening the operational availability of entire blockchain ecosystems. In this paper, we develop and apply a cloud-based approach for quickly discovering and classifying content on public blockchains. Our method can be adapted to different blockchain systems and offers insights into content-related usage patterns and potential cases of abuse. We apply our method on the two most prominent public blockchain systems—Bitcoin and Ethereum—and discuss our results. To the best of our knowledge, the presented study is the first to systematically analyze non-financial content stored on the Ethereum blockchain and the first to present a side-by-side comparison between different blockchains in terms of the quality and quantity of stored data.
... By allowing arbitrary data in transactions, objectionable content can find its way onto the public ledger [15]. Considering that transactions on the blockchain can be sent anonymously, the potential for criminal intentions becomes evident. ...
... Various approaches have been developed for dealing with the dangers of arbitrary content insertion on public blockchains. They can be grouped into categories as follows: avoiding the inclusion of unwanted data [15], allowing the modification (and erasure) of past blockchain state [2,9,13,21], and local pruning [12]. In this paper, we focus on developing tools for determining the severity of the original problem and whether the implementation of additional protection approaches is necessary. ...
Preprint
Full-text available
Blockchain-based systems have gained immense popularity as enablers of independent asset transfers and smart contract functionality. They have also, since as early as the first Bitcoin blocks, been used for storing arbitrary contents such as texts and images. On-chain data storage functionality is useful for a variety of legitimate use cases. It does, however, also pose a systematic risk. If abused, for example by posting illegal contents on a public blockchain, data storage functionality can lead to legal consequences for operators and users that need to store and distribute the blockchain, thereby threatening the operational availability of entire blockchain ecosystems. In this paper, we develop and apply a cloud-based approach for quickly discovering and classifying content on public blockchains. Our method can be adapted to different blockchain systems and offers insights into content-related usage patterns and potential cases of abuse. We apply our method on the two most prominent public blockchain systems - Bitcoin and Ethereum - and discuss our results. To the best of our knowledge, the presented study is the first to systematically analyze non-financial content stored on the Ethereum blockchain and the first to present a side-by-side comparison between different blockchains in terms of the quality and quantity of stored data.
... In [4], Matzutt et al. discuss numerous approaches for preventing the insertion of random, potentially unwanted data onto cryptocurrency Blockchains. Their proposed work includes content detectors, which have the ability to filter transactions based on heuristics and knowledge about frequently used data insertion methods, as well as protocol adjustment that would greatly increase the costs of including random data. ...
Article
Almost all Cloud Service Providers (CSP) takes a principled approach to the storage and deletion of Customer Data. Most of them have engineered their cloud platform to achieve a high degree of speed, availability, durability, and consistency. Their systems are designed to be optimized for these performance attributes and must be carefully balanced with the necessity to achieve accurate and timely data deletion.many researchers have turn their focus toward data storage and how it will be a challenging task for CSPs in term of storage capacity, data management and security, a considerable number of papers has been published containing new models and technique that will allow data De-duplication in a shared environment but few of them have discussed data deletion.In this paper we will be discussing a new approach that will allow a smart deletion of data stored in the file system as well as its reference in the Blockchain since, by its nature, Blockchains does not allow deletion without violating the Blockchain’s consistency, a preexisting de-duplication system will be our base platform on which we will be working to achieve an accurate and secure data deletion using Blockchain technology while preserving its consistency.
... Another approach to storage scheme optimization is block pruning, where older blockchain data can be pruned or removed by individual nodes on the network to reduce the storage burden on those nodes [72,73]. In UTXO-based blockchains, bloating of the UTXO set with non-spendable transactions and unwanted content may be ameliorated through pruning [74]. Matzutt et al. [70] proposed CoinPrune, which is a snapshot-based block pruning scheme for the Bitcoin blockchain that allows joining nodes to use only the small snapshots of the ledger for bootstrapping. ...
Article
Full-text available
Since the inception of blockchain-based cryptocurrencies, researchers have been fascinated with the idea of integrating blockchain technology into other fields, such as health and manufacturing. Despite the benefits of blockchain, which include immutability, transparency, and traceability, certain issues that limit its integration with IIoT still linger. One of these prominent problems is the storage inefficiency of the blockchain. Due to the append-only nature of the blockchain, the growth of the blockchain ledger inevitably leads to high storage requirements for blockchain peers. This poses a challenge for its integration with the IIoT, where high volumes of data are generated at a relatively faster rate than in applications such as financial systems. Therefore, there is a need for blockchain architectures that deal effectively with the rapid growth of the blockchain ledger. This paper discusses the problem of storage inefficiency in existing blockchain systems, how this affects their scalability, and the challenges that this poses to their integration with IIoT. This paper explores existing solutions for improving the storage efficiency of blockchain–IIoT systems, classifying these proposed solutions according to their approaches and providing insight into their effectiveness through a detailed comparative analysis and examination of their long-term sustainability. Potential directions for future research on the enhancement of storage efficiency in blockchain–IIoT systems are also discussed.
Thesis
Cloud computing is an information technology that enables different users to access a shared pool of configurable system resources and different services without physically acquiring them. So, it saves managing cost and time for organizations and can be rapidly provisioned with minimal management effort. Most industries nowadays such as banking, healthcare and education are migrating to cloud infrastructure due to its efficiency of services especially when it comes to data security and integrity. Cloud platforms encounter numerous challenges such as Data de-duplication, Data Transmission, Data Integrity, Virtual Machine Security, Data Availability, Bandwidth utilization… etc. In this thesis, we have adopted the blockchain technology which is the relatively new technology behind the cryptocurrency Bitcoin that proved its efficiency in securing data and assuring data integrity. In our work, Blockchains were adopted in a different way than its regular use in Bitcoin. When it comes to storage, cloud computing faces a huge amount of duplicated data and with that, various challenges rise along with data duplication such as bandwidth utilization and data security and integrity. Almost all Cloud Service Providers (CSP) take a principled approach to the storage and deletion of Customer Data. Most of them have engineered their cloud platform to achieve a high degree of speed, availability, durability, and consistency. Their systems are designed to be optimized for these performance attributes and must be carefully balanced with the necessity to achieve accurate and timely data deletion. Many researchers have turned their focus toward data storage and how it will be a challenging task for CSPs in terms of storage capacity, data management and security. 
By implementing our contribution, several new issues rose mostly because of the use of blockchain technology since it cannot be altered, modified or deleted, these issues were our source of inspiration for a second contribution that deals with data deletion in Cloud Computing environment that uses blockchain technology while respecting the international data protection regulations.
Article
Full-text available
In this article, the author primarily aims to clarify from a technical and terminological point of view several notions relevant for the blockchain ecosystem. In this context, the analysis is focused on notions such as blockchain, Proof of Work or Proof of Stake consensus mechanism, virtual currency, crypto-asset, digital wallet, centralized or decentralized exchange platform, public address, public and private cryptographic key, seed phrase, etc. Beyond these technical and terminological clarifications, the author aims to analyse the European and national regulatory framework, in an attempt to resolve, among other things, different matters such as the difference between virtual currencies, crypto-assets and electronic money or the regulation of exchange service providers and digital wallet providers. Last but not least, the author analyses both the risks and benefits of blockchain technology from a cyber security perspective, as well as several criminal behaviours in regard to virtual currencies or other crypto-assets. In this context, behaviours such as ransomware, cryptojacking, unauthorized transfer of virtual currencies or other cryptocurrencies, counterfeiting of virtual currencies, cloning or restricting access to digital wallets, etc. are all taken into consideration.
Article
Full-text available
Data analytic has recently enabled the uncovering of interesting properties of several complex networks. Among these, it is worth considering the bitcoin blockchain, because of its peculiar characteristic of reflecting a niche, but also a real economy whose transactions are publicly available. In this paper, we present the analyses we have performed on the users graph inferred from the bitcoin blockchain, dumped in December 2015, so after the occurrence of the exponential explosion in the number of transactions. We first present the analysis assessing classical graph properties like densification, distance analysis, degree distribution, clustering coefficient and several centrality measures. Then, we analyse properties strictly tied to the nature of bitcoin, like rich-get-richer property, which measures the concentration of richness in the network.
Conference Paper
Full-text available
We present Catena, an efficiently-verifiable Bitcoin witnessing scheme. Catena enables any number of thin clients, such as mobile phones, to efficiently agree on a log of application-specific statements managed by an adversarial server. Catena implements a log as an OP_RETURN transaction chain and prevents forks in the log by leveraging Bitcoin's security against double spends. Specifically, if a log server wants to equivocate it has to double spend a Bitcoin transaction output. Thus, Catena logs are as hard to fork as the Bitcoin blockchain: an adversary without a large fraction of the network's computational power cannot fork Bitcoin and thus cannot fork a Catena log either. However, different from previous Bitcoin-based work, Catena decreases the bandwidth requirements of log auditors from 90 GB to only tens of megabytes. More precisely, our clients only need to download all Bitcoin block headers (currently less than 35 MB) and a small, 600-byte proof for each statement in a block. We implement Catena in Java using the bitcoinj library and use it to extend CONIKS, a recent key transparency scheme, to witness its public-key directory in the Bitcoin blockchain where it can be efficiently verified by auditors. We show that Catena can secure many systems today, such as public-key directories, Tor directory servers and software transparency schemes.
Conference Paper
A chameleon-hash function is a hash function that involves a trapdoor the knowledge of which allows one to find arbitrary collisions in the domain of the function. In this paper, we introduce the notion of chameleon-hash functions with ephemeral trapdoors. Such hash functions feature additional, i.e., ephemeral, trapdoors which are chosen by the party computing a hash value. The holder of the main trapdoor is then unable to find a second pre-image of a hash value unless also provided with the ephemeral trapdoor used to compute the hash value. We present a formal security model for this new primitive as well as provably secure instantiations. The first instantiation is a generic black-box construction from any secure chameleon-hash function. We further provide three direct constructions based on standard assumptions. Our new primitive has some appealing use-cases, including a solution to the long-standing open problem of invisible sanitizable signatures, which we also present.
Article
The Bitcoin protocol allows saving arbitrary data on the blockchain through a special instruction of the scripting language, called OP_RETURN. A growing number of protocols exploit this feature to extend the range of applications of the Bitcoin blockchain beyond the transfer of currency. A point of debate in the Bitcoin community is whether loading data through OP_RETURN can negatively affect the performance of the Bitcoin network with respect to its primary goal. This paper is an empirical study of the usage of OP_RETURN over the years. We identify several protocols based on OP_RETURN, which we classify by their application domain. We measure the evolution over time of the usage of each protocol, the distribution of OP_RETURN transactions by application domain, and their space consumption.
Conference Paper
Bitcoin has revolutionized digital currencies and its underlying blockchain has been successfully applied to other domains. To be verifiable by every participating peer, the blockchain maintains every transaction in a persistent, distributed, and tamper-proof log that every participant needs to replicate locally. While this constitutes the central innovation of blockchain technology and is thus a desired property, it can also be abused in ways that are harmful to the overall system. We show for Bitcoin that blockchains potentially provide multiple ways to store (malicious and illegal) content that, once stored, cannot be removed and is replicated by every participating user. We study the evolution of content storage in Bitcoin’s blockchain, classify the stored content, and highlight implications of allowing the storage of arbitrary data in globally replicated blockchains.
Conference Paper
Software-Defined Networking (SDN) introduces significant granularity, visibility and flexibility to networking, but at the same time brings forth new security challenges. One of the fundamental challenges is to build robust firewalls for protecting OpenFlow-based networks where network states and traffic change frequently. To address this challenge, we introduce FlowGuard, a comprehensive framework that facilitates not only accurate detection but also effective resolution of firewall policy violations in dynamic OpenFlow networks. FlowGuard detects firewall policy violations by checking flow space against authorized flow space when network states update. Furthermore, FlowGuard conducts flexible and real-time violation resolutions with the help of several innovative resolution strategies applied to diverse network update situations. We have implemented FlowGuard in Floodlight. Our experimental results demonstrate that FlowGuard effectively addresses firewall policy violations in a real-world network topology, and produces manageable performance overhead with effective violation detection and resolution.
Chapter
Blockchains primarily enable credible accounting of digital events, e.g., money transfers in cryptocurrencies. However, beyond this original purpose, blockchains also irrevocably record arbitrary data, ranging from short messages to pictures. This does not come without risk for users as each participant has to locally replicate the complete blockchain, particularly including potentially harmful content. We provide the first systematic analysis of the benefits and threats of arbitrary blockchain content. Our analysis shows that certain content, e.g., illegal pornography, can render the mere possession of a blockchain illegal. Based on these insights, we conduct a thorough quantitative and qualitative analysis of unintended content on Bitcoin’s blockchain. Although most data originates from benign extensions to Bitcoin’s protocol, our analysis reveals more than 1600 files on the blockchain, over 99% of which are texts or images. Among these files there is clearly objectionable content such as links to child pornography, which is distributed to all Bitcoin participants. With our analysis, we thus highlight the importance for future blockchain designs to address the possibility of unintended data insertion and protect blockchain users accordingly.
Article
Bitcoin has emerged as the most successful cryptographic currency in history. Within two years of its quiet launch in 2009, Bitcoin grew to comprise billions of dollars of economic value despite only cursory analysis of the system's design. Since then a growing literature has identified hidden-but-important properties of the system, discovered attacks, proposed promising alternatives, and singled out difficult future challenges. Meanwhile a large and vibrant open-source community has proposed and deployed numerous modifications and extensions. We provide the first systematic exposition of Bitcoin and the many related cryptocurrencies or 'altcoins.' Drawing from a scattered body of knowledge, we identify three key components of Bitcoin's design that can be decoupled. This enables a more insightful analysis of Bitcoin's properties and future stability. We map the design space for numerous proposed modifications, providing comparative analyses for alternative consensus mechanisms, currency allocation mechanisms, computational puzzles, and key management tools. We survey anonymity issues in Bitcoin and provide an evaluation framework for analyzing a variety of privacy-enhancing proposals. Finally we provide new insights on what we term disintermediation protocols, which absolve the need for trusted intermediaries in an interesting set of applications. We identify three general disintermediation strategies and provide a detailed comparison.