Thwarting Unwanted Blockchain Content Insertion
Roman Matzutt, Martin Henze, Jan Henrik Ziegeldorf, Jens Hiller, Klaus Wehrle
Communication and Distributed Systems
RWTH Aachen University
©2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media,
including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers
or lists, or reuse of any copyrighted component of this work in other works. DOI: 10.1109/IC2E.2018.00070
Abstract—Since the introduction of Bitcoin in 2008, block-
chain systems have seen an enormous increase in adoption.
By providing a persistent, distributed, and append-only ledger,
blockchains enable numerous applications such as distributed
consensus, robustness against equivocation, and smart con-
tracts. However, recent studies show that blockchain systems
such as Bitcoin can be (mis)used to store arbitrary content. This
has already been used to store arguably objectionable content
on Bitcoin’s blockchain. Even a single instance of clearly
objectionable or even illegal content can put the whole system
at risk by making its node operators culpable. To overcome
this imminent risk, we survey and discuss the design space of
countermeasures against the insertion of such objectionable
content. Our analysis shows a wide spectrum of potential
countermeasures, which are often combinable for increased
efﬁciency. First, we investigate special-purpose content detectors
as an ad hoc mitigation. As they turn out to be easily evadable,
we also investigate content-agnostic countermeasures. We ﬁnd
that mandatory minimum fees as well as mitigation of trans-
action manipulability via identiﬁer commitments signiﬁcantly
raise the bar for inserting harmful content into a blockchain.
Keywords-Bitcoin, blockchain, security, objectionable content
I. INTRODUCTION
Blockchain-based cryptocurrencies such as Bitcoin enjoy
unbroken popularity, averaging over 280 000 daily
confirmed transactions in 2017. This popularity is also
reflected by the size of the cryptocurrencies’ underlying
peer-to-peer networks and their user base: Bitcoin’s network
size has doubled since 2015, while its number of users
is peaking in the millions. Cryptocurrencies, especially
Bitcoin, have thus become a well-accepted trading medium
due to their security, timeliness, and decentralization.
Besides serving as a platform for financial transactions,
Bitcoin’s blockchain can also be used as an anonymous and
irrevocable content store, as recent work shows.
By inserting short, non-financial messages, Bitcoin
can be extended to realize additional services, e.g., digital
notary services, secure releases of cryptographic commitments,
or non-equivocation schemes.
While this initially unintended extensibility appears
promising, and in fact 1.4 % of Bitcoin transactions hold
non-financial data, it can severely compromise blockchain
systems: A recent study reveals that over 1600 files
have been irrevocably engraved into Bitcoin’s blockchain,
e.g., to be shared in a censorship-resistant manner. These
ﬁles range from simple text to over 155 images, source
codes, and PDF ﬁles. Any objectionable content, e.g., illegal
pornography, in such ﬁles is then inevitably distributed to all
nodes of the cryptocurrency’s underlying peer-to-peer net-
work. It is expected that court rulings in major jurisdictions
such as Germany and the USA will then ﬁnd node operators
culpable of possessing objectionable content. As a consequence,
the node operators must delete affected parts of
the blockchain, thereby breaking the blockchain’s integrity
and verifiability. The insertion of objectionable content
thus has the potential to jeopardize cryptocurrencies, as all
users ultimately depend on this verification. Indeed, recent
research finds that, while most content is likely harmless,
Bitcoin’s blockchain already today contains content that is
objectionable in many jurisdictions, e.g., an image of a nude
young woman or hundreds of links to child pornography.
Reacting to the evident threats, in this work we explore
potential countermeasures to prevent insertion of objection-
able content w.r.t. blockchain-based cryptocurrencies, using
Bitcoin as a real-world working example. We ﬁrst analyze
the harmfulness of different types of blockchain content
and find that short, token-like messages enable beneficial
use cases, while insertion of arbitrarily-sized content must
be prevented. We also acknowledge that full prevention of
content insertion is impossible. We hence focus on countermeasures
that either heuristically hinder the insertion of
large chunks of arbitrary data or ﬁnancially disincentivize
content insertion to a freely tunable degree. Our results
are two-fold. On the one hand, we find that the naïve
approach of targeted content detection constitutes a first ad
hoc solution against objectionable content, but it is easily
evadable. On the other hand, simple adaptations to Bitcoin,
such as introducing mandatory minimum fees or replacing
manipulable blockchain identiﬁers, can effectively mitigate
objectionable blockchain content with moderate overheads.
II. BACKGROUND
In this section, we provide the technical background on
Bitcoin required for the remainder of this paper.
Bitcoin Primer. Bitcoin was the first digital currency
to rely on the blockchain. The blockchain is a persistent,
distributed, and append-only ledger of events, serving in
Bitcoin as a distributed, immutable record of financial
transactions between pseudonymous Bitcoin addresses. A Bitcoin
address is controlled by a cryptographic key pair, which is
used to access and transfer the associated bitcoins. A trans-
action collects funds from one or more addresses, the inputs,
and reassigns them to one or more other addresses, the
outputs. To prevent double spending of bitcoins, transactions
are only considered valid if they are immutably recorded in
the blockchain. Transaction inputs and outputs are realized
using a script language that allows authenticating payments
via addresses’ public keys or hash values thereof.
Blockchain Maintenance. The blockchain is maintained
by the Bitcoin network in a distributed manner to reach
consensus about valid transactions among honest nodes:
Users propose their transactions to the network, which are
then added to the blockchain by special nodes, the miners,
via a proof-of-work scheme. To ensure correctness of the
blockchain and its transactions, the full nodes of the Bitcoin
network independently verify all transactions and blocks
they receive and reject incorrect information. Furthermore,
full nodes maintain a full copy of the whole blockchain to
serve newly joining nodes. Assuming an honest majority
(among full nodes), the longest blockchain constitutes the
Bitcoin network’s consensus.
Incentives and Fees. Miners are incentivized to perform
the proof of work via block rewards. Each miner includes a
coinbase transaction in her blocks, which rewards her with
a prescribed number of freshly minted bitcoins. As the block
reward is exponentially decreasing to limit the total supply of
bitcoins, transaction fees are a second tier of miner rewards.
Miners may collect excess bitcoins from all transactions
in their blocks as a form of tip, which is paid as a byte-wise
fee. As overpaying fees incentivizes miners to consider
a transaction faster, recommendations on transaction fees
emerged. For instance, to get a transaction included into
the blockchain within an hour during December 2017, it is
recommended to pay on average 423 satoshi per byte (B),
i.e., 16.14 USD for an average transaction of 250 B.
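The fee arithmetic above can be checked with a few lines of Python. This is a minimal sketch: the helper name `fee_satoshi` is introduced here for illustration, and the exchange rate is not stated in the text but merely back-computed from the quoted 16.14 USD.

```python
SATOSHI_PER_BTC = 100_000_000  # 1 bitcoin = 10^8 satoshi


def fee_satoshi(rate_sat_per_byte: int, size_bytes: int) -> int:
    """Byte-wise transaction fee in satoshi (hypothetical helper)."""
    return rate_sat_per_byte * size_bytes


# Recommended rate and average transaction size quoted in the text.
fee = fee_satoshi(423, 250)
fee_btc = fee / SATOSHI_PER_BTC

# Exchange rate implied by the quoted 16.14 USD (an inference, not a source value).
implied_usd_per_btc = 16.14 / fee_btc

print(fee, fee_btc, round(implied_usd_per_btc))
```

The fee amounts to 105 750 satoshi; the quoted USD value then corresponds to an exchange rate of a little over 15 000 USD per bitcoin.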
Content Insertion. Prior work gives a comprehensive study of how
Bitcoin transactions can be augmented with arbitrary content.
While short messages of up to 100 B can
be added via intended mechanisms (coinbase transactions
and OP_RETURN), transactions can be manipulated to hold
arbitrarily-sized content such as images or archives. Predominantly,
content inserters arbitrarily replace the blockchain
identifiers, usually 20 B-long cryptographic hash values of
public keys, of multiple outputs with their content, potentially
making the outputs unspendable. Using encodings
such as Apertus allows spreading content over multiple
transactions while retaining efficient decoding.
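To illustrate why large content translates into many transaction outputs, the following sketch splits a payload into 20 B chunks, one per manipulated identifier. It is a simplified illustration only: the function name and the zero padding of the last chunk are assumptions, and the sketch neither builds transactions nor reproduces encodings such as Apertus.

```python
IDENTIFIER_SIZE = 20  # bytes per blockchain identifier, as stated in the text


def split_into_identifier_chunks(content: bytes, size: int = IDENTIFIER_SIZE) -> list[bytes]:
    """Split content into identifier-sized chunks (hypothetical helper).

    The last chunk is zero-padded; this padding rule is an assumption.
    """
    chunks = [content[i:i + size] for i in range(0, len(content), size)]
    if chunks and len(chunks[-1]) < size:
        chunks[-1] = chunks[-1].ljust(size, b"\x00")
    return chunks


# A 20 KiB payload needs 1024 identifier-sized chunks, i.e., 1024 outputs.
payload = bytes(20 * 1024)
chunks = split_into_identifier_chunks(payload)
print(len(chunks))
```

The number of required outputs grows linearly with the content size, which is the property that output-based countermeasures can exploit.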
III. SCENARIO AND PROBLEM STATEMENT
In this section, we deﬁne the underlying scenario for
blockchain content insertion with the goal of designing
countermeasures against such practices. We ﬁrst outline
that different classes of arbitrary blockchain content can be
harmful or beneﬁcial to cryptocurrencies (Section III-A). We
then discuss related work (Section III-B) and show that it
is impossible to prevent insertion of all unintended content
into the blockchain (Section III-C). From this analysis, we
distill the problem statement for this paper (Section III-D).
A. Harmfulness of Arbitrary Blockchain Content
Current blockchain designs allow augmenting user-
generated transactions with short chunks of arbitrary content
as described in Section II. Notably, manipulating transac-
tions allows inserting arbitrary amounts of unintended data
even into special-purpose blockchains, e.g., storing non-
ﬁnancial data on cryptocurrency blockchains. In this paper,
we refer to Bitcoin as our real-world working example.
If a miner includes content-holding transactions into her
blocks, the content is irrevocably distributed to all full
nodes. Recent research shows that this puts full node
operators at risk: Although court rulings are yet to come, it is
expected that major jurisdictions could find that maintaining
a blockchain containing objectionable content, e.g., illegal
pornography, constitutes possession of illegal documents.
Full node operators hence face a dilemma: If they keep
maintaining the blockchain, they may become culpable. Yet,
if they delete content-holding transactions from their local
blockchain, they break its integrity and thus its veriﬁability.
Deleting blockchain content locally to comply with le-
gal obligations severely impedes the health of the Bitcoin
network. It is critical that newly joining full nodes obtain
an intact blockchain copy in order to successfully synchro-
nize with the Bitcoin network. Furthermore, also the users
of lightweight solutions such as online wallets ultimately
depend on full nodes performing the veriﬁcation process on
their behalf. Hence, we argue that objectionable, i.e., illegal-
to-possess, blockchain content must be proactively prevented
from entering the blockchain to the largest extent possible.
However, only certain content can jeopardize blockchain
systems. In fact, the outlined culpability only holds if content
is objectionable and easily extractable from the blockchain.
For instance, full pictures of illegal pornography, as one of the
most ubiquitously objected-to content types, can be stored on the
blockchain using tens of kilobytes. The most imminent
risk therefore stems from arbitrary-length and easy-to-read
content, especially if it is easily accessible via standard
software after extracting it from the local blockchain copy.
Contrarily, short pieces of blockchain content are less
likely to be harmful as they cannot hold objectionable
content directly. Even short links to objectionable content do
not put full node operators at risk of possessing said content:
As the content is not stored directly on the blockchain,
operators do not own a physical copy of it. Furthermore, they
could even cooperate with local authorities to take down the
target server without impeding the blockchain integrity.
As a consequence, we deem short-sized content (≤1 KiB)
to have a lower harm potential and consider very short con-
tents (≤100 B) harmless. These thresholds may, however,
require adaption in the future, e.g., due to court rulings.
At the same time, short-sized blockchain content has proven
to fuel innovation and create new applications. A wide range
of applications now rely on engraving short tokens into the
blockchain to leverage its security model for off-blockchain
services. By adding hash values of files, it becomes publicly
verifiable that a given document existed by the time the
transaction was added to the blockchain. Similarly,
the blockchain can become a general-purpose event ledger,
e.g., for non-equivocation logs. Allowing short text
messages, e.g., meta information, on the blockchain furthermore
enables services such as distributed management of
assets or key-value pairs and the execution of smart
contracts. Although arbitrary-length blockchain content
can also be beneﬁcial, e.g., for whistleblowers to unveil
misconducts in a censorship-resistant manner, a single piece
of objectionable content can jeopardize the whole system.
Hence, the risks introduced by arbitrary-length blockchain
content by far outweigh the potential beneﬁts.
It is thus crucial to face the currently disregarded risks
of arbitrary-length, easy-to-read blockchain content and to
design countermeasures. However, the beneﬁts of short
blockchain messages require such designs to trade off secu-
rity against innovation by gauging which content is harmless.
B. Related Work
Monitoring incoming data as well as recognizing and
filtering unwanted content is a classical application of firewalls,
intrusion detection systems, and spam filters,
which have drastically improved security and quality
of service within their respective domains. However, their
often high adaptability requirements are commonly tackled
via supervised or automated learning of what content should
be filtered. This is challenging w.r.t. blockchain systems:
learning must be guaranteed to be deterministic, and the
overhead of computation-intensive local checks multiplies
across the whole network and should thus be avoided. Nevertheless,
we deem further investigation promising future work.
A new line of blockchain-based systems promises to avoid
the problems of objectionable blockchain content altogether
by persistently maintaining account balances instead of the
whole transaction ledger. As transaction outputs
are separated during the balance update, it is hard to link
individual chunks of blockchain content to each other to
reveal the full content. Yet, forfeiting the event history con-
siderably limits potential applications. For instance, notary
services cannot be realized via such blockchains. The risks
of content insertion must thus be tackled for all blockchains.
Similarly, redactable blockchains emerged, which
enable after-the-fact alteration and deletion of transactions.
These blockchains use chameleon hash functions to
link blocks such that trusted entities or quorums can alter
them. The arising trust issues can be tackled by issuing
decentralized votes on blockchain alterations, but
again, this enhancement is incompatible with existing systems.
C. Impossibility of Rigorous Blockchain Content Filtering
We have shown that objectionable content puts blockchain
systems at risk. However, as we argue in this section,
unintended data cannot entirely be prevented from entering
public blockchains. This circumstance stems from the fact
that public blockchains are pseudonymous. Full nodes do not
verify whether an alleged receiver in a proposed transaction
does indeed exist. As of now, content can thus be easily
inserted by manipulating the receiver’s identiﬁers. E.g.,
Bitcoin allows inserting tens of kilobytes of arbitrary data
per transaction using multiple manipulated outputs.
Unfortunately, such manipulation cannot be comprehensively
detected by full nodes: Users can, and are even
encouraged to, refresh their Bitcoin addresses arbitrarily
often. Hence, a user willing to insert blockchain content
can continuously create new Bitcoin addresses for herself
by brute-forcing private keys such that the content can be
encoded via, e.g., the ﬁrst few bytes of each output used.
The user can subsequently craft a transaction that sends
arbitrary amounts of bitcoins to each of her chosen outputs
and publish it. This transaction is valid and indistinguishable
from non-manipulated transactions. Furthermore, full nodes
cannot link the transaction outputs to the single user during
transaction validation due to the lack of centralized user-
identity associations. Hence, it is impossible for full nodes to
detect and reject all transactions holding unintended content.
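The cost of the brute-force strategy described above can be illustrated with a toy model. The sketch below searches for a secret whose derived identifier starts with a chosen byte prefix. It is deliberately not Bitcoin's real key and address derivation: a truncated SHA-256 hash stands in for the identifier computation, secrets are simple counters, and all function names are hypothetical. The point is only that each additional prefix byte multiplies the expected work by 256.

```python
import hashlib


def toy_identifier(secret: bytes) -> bytes:
    """Toy stand-in for deriving a 20 B identifier from a private key.

    Assumption: truncated SHA-256 replaces the real derivation.
    """
    return hashlib.sha256(secret).digest()[:20]


def expected_tries(prefix_len: int) -> int:
    """Expected number of candidates until a prefix of prefix_len bytes matches."""
    return 256 ** prefix_len


def grind_prefix(prefix: bytes, max_tries: int = 1_000_000):
    """Search counter-based secrets until the identifier starts with prefix."""
    for i in range(max_tries):
        secret = i.to_bytes(8, "big")
        identifier = toy_identifier(secret)
        if identifier.startswith(prefix):
            return secret, identifier, i + 1
    raise RuntimeError("no match within max_tries")


secret, identifier, tries = grind_prefix(b"A")
print(identifier.hex(), tries, expected_tries(1))
```

Encoding only a few bytes per output in this way keeps the outputs spendable and indistinguishable from regular ones, but the exponential cost per byte limits the amount of content that can be inserted per output.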
In the following, we thus explore heuristics that reject
potentially harmful content, either by analyzing transactions
to reveal content, e.g., plain text or image ﬁles, or by
disincentivizing insertion of arbitrarily-sized content.
D. Problem Statement
We argued that persistently storing arbitrarily-sized con-
tent puts blockchain systems at risk while short data pieces
have proven beneﬁcial. Furthermore, we have shown the
impossibility of preventing all unintended blockchain con-
tent. The goal of this paper is therefore to explore the
design space of countermeasure heuristics that (i) prevent
harmful content from entering the blockchain, (ii) are easily
deployable even for established systems such as Bitcoin, and
(iii) are adaptable in case that also short pieces of blockchain
content reveal unforeseen risks. To this end, content pieces
shall be (i) allowed if very short (≤100 B), (ii) tolerated
if medium-sized (≤1 KiB), and (iii) effectively prevented
if arbitrarily-sized. We note that these thresholds can be
freely adapted if deemed necessary, e.g., by court rulings.
As complete prevention of content insertion is impossible,
our countermeasures must render the insertion of arbitrarily-
sized content either computationally or ﬁnancially infeasible.
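The size classes of the problem statement can be summarized as a small policy function. This is only a sketch of the thresholds named above; the function name and the returned labels are hypothetical, and the thresholds are kept as constants because the text notes that they may need to be adapted.

```python
ALLOWED_MAX_BYTES = 100      # very short content: allowed
TOLERATED_MAX_BYTES = 1024   # medium-sized content (1 KiB): tolerated


def classify_content(size_bytes: int) -> str:
    """Map a content size to the handling named in the problem statement."""
    if size_bytes < 0:
        raise ValueError("size must be non-negative")
    if size_bytes <= ALLOWED_MAX_BYTES:
        return "allowed"
    if size_bytes <= TOLERATED_MAX_BYTES:
        return "tolerated"
    return "prevented"


print(classify_content(80), classify_content(512), classify_content(20 * 1024))
```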
IV. COUNTERMEASURES AGAINST CONTENT INSERTION
To effectively mitigate risks of blockchain content, we
identify countermeasures that limit the amount of insertable,
unintended content or make such insertion ﬁnancially in-
feasible in accordance with our problem statement in Sec-
tion III-D. As a representative working example, we consider
countermeasures that are easily applicable to Bitcoin. After
proposing evaluation criteria for the countermeasure quality
(Section IV-A), we propose to (i) introduce naïve
filtering (Section IV-B), to (ii) adapt the fee model
(Section IV-C), and to (iii) include proofs of key authenticity
within the transactions themselves (Section IV-D).
A. Evaluation Criteria
We evaluate our countermeasures’ efﬁciency against in-
sertion of harmful blockchain content w.r.t. ﬁltering quality,
usability, network burden, and deployability. High filtering
quality means that insertion of harmful content (as in Sec-
tion III-D) becomes either computationally or ﬁnancially
infeasible. A countermeasure is usable if it does not impede
normal system use. A low network burden is achieved
if neither the blockchain’s nor the Bitcoin network’s per-
formance is decreased signiﬁcantly by a countermeasure.
Finally, a countermeasure should be deployable via only
minor changes to already-established blockchain systems
(changing transaction acceptance always requires an update).
B. Filtering Content-holding Transactions
An intuitive countermeasure against unintended block-
chain content is the systematic analysis of proposed trans-
actions and subsequent rejection of content-holding trans-
actions by full nodes and miners. As discussed in Sec-
tion III-A, the most imminent threat stems from arbitrary-
sized content that is objectionable and easily accessible, e.g.,
common files such as images. We thus first explore the naïve
approach of adding content detectors to Bitcoin’s transaction
veriﬁcation to detect and reject content-holding transactions.
To detect easily accessible content in transactions, we
propose (i) a text detector to identify transactions carrying
text or ASCII-based ﬁles and (ii) a known-ﬁle detector to
identify binary ﬁles such as images or archives.
Detecting large fractions of printable text within a trans-
action prevents custom text as well as text-based ﬁles,
e.g., HTML pages or Python scripts, from entering the
blockchain. We hence propose a text detection threshold
$t \in [0, 1]$ to check whether individual transaction outputs
consist of large fractions of printable ASCII characters.
Figure 1: Expected false-positive rate for text detection in a 20 B-long identifier (x-axis: text detection threshold; y-axis: false positive probability).
Figure 2: Cumulative distribution of numbers of outputs per transaction (x-axis: outputs per transaction [#]; y-axis: fraction of data).
To choose $t$, we consider the detector’s expected false-positive
rate (FPR). False positives can occur as blockchain
identifiers may contain printable ASCII characters
by chance. Figure 1 shows the expected FPR for random
blockchain identifiers (20 B length, 95 of 256 printable
characters) and varying thresholds $t$. We observe that only
high thresholds lead to negligible FPRs: While $t = 0.75$
still yields an expected FPR of 0.064 %, $t = 0.9$ yields
an expected FPR of $1.42 \times 10^{-4}$ %. As this confirms the
intuition from previous works, we suggest $t = 0.9$.
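The expected FPR values quoted above can be reproduced with a binomial tail, under two assumptions that the text does not spell out: the bytes of a random identifier are independent and uniformly distributed, and an identifier is flagged when at least $\lceil t \cdot 20 \rceil$ of its bytes are printable. The function name is hypothetical.

```python
from math import ceil, comb


def expected_fpr(t: float, length: int = 20, printable: int = 95) -> float:
    """Probability that a random identifier has a printable fraction >= t.

    Assumes independent, uniformly random bytes (a modelling assumption).
    """
    p = printable / 256
    # Small epsilon guards against floating-point noise in t * length.
    k_min = ceil(t * length - 1e-9)
    return sum(
        comb(length, k) * p**k * (1 - p) ** (length - k)
        for k in range(k_min, length + 1)
    )


for threshold in (0.75, 0.9):
    print(threshold, expected_fpr(threshold))
```

Under these assumptions, `expected_fpr(0.75)` evaluates to roughly $6.4 \times 10^{-4}$ and `expected_fpr(0.9)` to roughly $1.4 \times 10^{-6}$, in line with the 0.064 % and $1.42 \times 10^{-4}$ % stated above.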
Unfortunately, reusing Bitcoin addresses, e.g., to collect
donations, can cause the text detector to reject valid pay-
ments if the corresponding blockchain identiﬁer is a false
positive. Hence, even small expected FPRs can seriously
impede usability, as such identifiers are potentially used
heavily. This is illustrated, e.g., by one Bitcoin address¹,
which is a false positive w.r.t. our text detector but received
bitcoins from over 370 transactions as of January 8th, 2018.
We thus propose to further restrict the text detector to
reject only transactions with more than 5 distinct text-holding
outputs (100 B worth of content). This way, only
short and thus harmless texts can be inserted. Notably, most
of these texts can already be inserted using an OP_RETURN
output (at lower costs). Hence, the text detector mitigates
harmful text insertion without discriminating honest users.
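The restricted text detector can be sketched as follows. This is an illustrative reading of the description above rather than a reference implementation: outputs are represented simply as the raw bytes of their identifiers, an output counts as text-holding if its printable fraction is at least $t$, and all function names are hypothetical.

```python
PRINTABLE_ASCII = range(0x20, 0x7F)  # the 95 printable ASCII characters


def printable_fraction(identifier: bytes) -> float:
    """Fraction of bytes that are printable ASCII characters."""
    if not identifier:
        return 0.0
    return sum(b in PRINTABLE_ASCII for b in identifier) / len(identifier)


def is_text_holding(identifier: bytes, t: float = 0.9) -> bool:
    """Flag an output whose printable fraction reaches the threshold t."""
    return printable_fraction(identifier) >= t


def reject_transaction(identifiers, t: float = 0.9, max_text_outputs: int = 5) -> bool:
    """Reject only if more than max_text_outputs distinct outputs hold text."""
    text_outputs = {i for i in identifiers if is_text_holding(i, t)}
    return len(text_outputs) > max_text_outputs


texts = [f"text chunk no. {i:03d}!!".encode() for i in range(6)]
print(reject_transaction(texts), reject_transaction(texts[:5]))
```

Counting distinct outputs means that repeated payments to a single false-positive address, such as the donation address mentioned above, are not rejected.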
Contrarily, preventing insertion of binary ﬁles is not fea-
sible via this approach. While ﬁle types can be determined
using magic numbers, i.e., byte sequences that are unique to
the respective file type, these sequences are often only a few
bytes long. Hence, the expected FPR increases drastically.
Although the accuracy can be improved by considering
more characteristic features, e.g., default headers, content
inserters can evade this overly speciﬁc detector more easily.
They can, e.g., introduce easily revertible modiﬁcations to
the content such as a deterministic padding.
Evaluation. Content detectors can be ﬁne-tuned specif-
ically to reject unwanted content with a low overhead for
full nodes. Furthermore, usability is guaranteed since we
detect false positives with high probability. Thus, honest
users are not affected by the detector. However, the detector’s
ﬁltering quality is insufﬁcient for binary ﬁles and we expect
a wide range of evasion schemes to emerge. This would
imply a poor deployability, as novel evasion schemes result
in frequent mandatory updates for all honest full nodes. In
conclusion, explicit content detection constitutes a first line
of defense against unwanted blockchain content and works
for text-based content. Mitigating the insertion of unwanted
binary files, however, requires more general approaches.
¹ Bitcoin address: 154QWLN3Uz43nHMAM7ioYUx8tkYXdNKDtQ
Figure 3: Proposed minimum fees for different growths β(n): (a) piecewise constant growth, (b) piecewise linear growth.
C. Mandatory Minimal Transaction Fees
Bitcoin transaction fees are usually paid per byte, with
a current recommendation of 423 satoshi/B (16.14 USD
for the average 250 B-large transaction) as of December
2017. Although such fees seem high, new content is
actively being added. We thus propose to adapt Bitcoin’s
underlying fee model to hinder content insertion.
As Figure 2 shows, the vast majority of all nearly 255
million transactions (as of August 2017) has at most 50
outputs (99.73 %). Of all transactions, 97.22 % even have
5 outputs or fewer and 91.77 % have at most 2 outputs.
Thus, we propose to enforce mandatory minimum fees to
penalize large transactions and thus disincentivize content
insertion. Given a proposed transaction $t$ with size $s_t$ and
number of outputs $n_t$, we propose the simple fee function:
$$F(t) = \alpha \cdot (s_t + \beta(n_t) \cdot n_t).$$
Here, $\alpha$ is the byte-wise base fee and $\beta(n)$ is a piecewise-defined
function depending on thresholds $T_s$ and $T_m$ that
distinguish small, medium-sized, and large output numbers.
As an example, Figure 3 shows the resulting fees for $\alpha = 423$ satoshi/B,
thresholds $T_s = 6$, $T_m = 51$, and a piecewise constant $\beta_C(n)$,
with $\beta_C(n) = 10$ for $n \in [T_s, T_m]$, in Figure 3a
and $\beta_L(n) = \sum_{i=0}^{n} \beta_C(i)$ in Figure 3b (the
dashed lines denote $T_s$ and $T_m$). While we leave the exact
parametrization open, our choices for $\beta(n)$ to be piecewise
constant ($\beta_C(n)$) or piecewise linear ($\beta_L(n)$) showcase the
design space for mandatory minimum fees. Neither approach
impedes small transactions (over 97 % of all transactions),
but both penalize larger transactions in varying degrees.
For instance, using a piecewise constant fee growth per
output, a borderline-large transaction (50 outputs) would
incur an additional 29 USD in fees, while the linear fee growth
yields roughly 1310 USD in additional fees, i.e., legitimate
creators of medium-sized transactions are penalized more.
However, the penalty fees increase only gradually for
medium-sized transactions. Content inserters, in contrast, are
especially punished by linear-growth fee penalties: E.g.,
storing a small JPEG image of 20 KiB (1024 outputs) would
cost the content inserter roughly $1.19 \times 10^{6}$ USD in fees.
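The fee function and the penalties discussed above can be sketched in Python. Several values are assumptions rather than source values and are marked as such: the text only fixes the growth factor 10 for medium-sized output numbers, so the factor 0 below $T_s$ follows from the statement that small transactions are not impeded, while the factor 20 above $T_m$ and the exchange rate of 13 750 USD per bitcoin are chosen here only because they approximately reproduce the USD figures quoted above. All function names are hypothetical.

```python
ALPHA = 423                 # byte-wise base fee in satoshi/B (from the text)
T_S, T_M = 6, 51            # thresholds (from the text)
SATOSHI_PER_BTC = 100_000_000
USD_PER_BTC = 13_750        # ASSUMPTION: not stated in the text


def beta_const(n: int) -> int:
    """Piecewise constant growth; only the middle case is given in the text."""
    if n < T_S:
        return 0            # ASSUMPTION: small transactions are not penalized
    if n <= T_M:
        return 10           # from the text
    return 20               # ASSUMPTION: value for large output numbers


def beta_linear(n: int) -> int:
    """Piecewise linear growth: running sum of the constant growth."""
    return sum(beta_const(i) for i in range(n + 1))


def min_fee(size_bytes: int, n_outputs: int, beta) -> int:
    """F(t) = alpha * (s_t + beta(n_t) * n_t), in satoshi."""
    return ALPHA * (size_bytes + beta(n_outputs) * n_outputs)


def penalty_satoshi(n_outputs: int, beta) -> int:
    """Additional fee on top of the plain byte-wise fee."""
    return ALPHA * beta(n_outputs) * n_outputs


def to_usd(satoshi: int) -> float:
    return satoshi / SATOSHI_PER_BTC * USD_PER_BTC


for n, beta in ((50, beta_const), (50, beta_linear), (1024, beta_linear)):
    print(n, beta.__name__, penalty_satoshi(n, beta), round(to_usd(penalty_satoshi(n, beta))))
```

With these assumed parameters, the three penalties come out close to the 29 USD, 1310 USD, and $1.19 \times 10^{6}$ USD discussed above; other parametrizations shift the trade-off between usability and filtering quality.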
Evaluation. Mandatory minimum fees are promising to
disincentivize content insertion. They are easily deployable
(a single check during transaction veriﬁcation) and have neg-
ligible overhead as full nodes only must check whether the
transaction pays at least the required fees. However, manda-
tory fees have an inherent trade-off between usability and
ﬁltering quality: If the fee model is tuned towards rejecting
even small amounts of content, honest users currently relying
on large transactions, e.g., exchange services, are potentially
penalized. Hence, the fee model’s parametrization must be
thoroughly evaluated prior to its deployment.
D. Self-Verifying Account Identiﬁers
We propose a simple adaption of standard Bitcoin trans-
actions to make content insertion computationally infeasible.
We approximate the best case of infeasible content insertion
outlined in Section III-C by only recording cryptographi-
cally non-manipulable values on the blockchain.
Currently, content inserters can easily replace mutable
identiﬁers in their transaction outputs with arbitrary values.
Full nodes are unable to validate the correctness of these
identiﬁers until they receive a future transaction attempting
to spend a particular output. Hence, full nodes are forced to
accept manipulated transaction outputs into the blockchain.
We therefore propose to replace manipulable identifiers of transaction
outputs with identifier commitments (ICs). Namely,
the IC $C(x)$ is obtained by interpreting an identifier $x$ as
a private key over Bitcoin’s elliptic curve secp256k1 and
signing the corresponding public value $x \cdot G$ ($G$ being the generator
of secp256k1) together with a salting nonce $r$ via ECDSA:
$$C(x) := (x \cdot G,\; r,\; \mathrm{sig}(x \cdot G \,\|\, r,\; x))$$
Replacing $x$ with $C(x)$ ensures that (a possibly content-hiding)
$x$ never appears on the blockchain. This hiding of $x$
is sufficient for outputs, i.e., inputs are not manipulable as
they can only be created by proving possession of a private
key. To hinder content insertion, $x$ must not be efficiently
computable from $C(x)$ and $C(x)$ must not enable content
insertion by other means than brute-forcing it. To this end,
we rely on ICs to be one-way, fresh, and self-verifying.
The one-way property of $x \cdot G$ guarantees that $x$ cannot
be computed efficiently from $x \cdot G$. Thus, $x \cdot G$ can safely
be added to the blockchain. Furthermore, $x$ is unknown for
manipulated $x \cdot G$ and hence the signature $\mathrm{sig}(x \cdot G \,\|\, r, x)$
cannot be computed. Thus, it is computationally infeasible
to compute a valid $C(x)$ hiding content in $x \cdot G$.
Figure 4: P2UC script; P2SC replaces P2SH analogously.
Freshness of $C(x)$ is required to hamper the creation of
rainbow tables for identifiers $x$ that yield ICs well-suited
for content insertion, e.g., a reusable file header. We ensure
freshness via adding the salt $r = \mathrm{CRC32}(t_1 \,\|\, \dots \,\|\, t_n)$, where
$t_i$ is the hash value identifying the previous, already mined
transaction referenced by the $i$-th input. As $r$ depends on predecessor
transactions, it changes for almost all transactions,
which makes creating rainbow tables unprofitable. Moreover,
it prevents content storage in $r$ itself. The 4 B short salt
ensures that the blockchain is not bloated unnecessarily.
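The salt computation can be sketched with Python's standard `zlib.crc32`. The function name, the representation of the referenced transaction hashes as raw byte strings, and the big-endian serialization of the 4 B result are assumptions made for illustration; the text only specifies that the CRC32 checksum is taken over the concatenated hash values.

```python
import zlib


def ic_salt(prev_tx_hashes: list[bytes]) -> bytes:
    """Compute r = CRC32(t_1 || ... || t_n) as a 4-byte value.

    prev_tx_hashes holds the hashes of the transactions referenced by the
    inputs, in input order. Big-endian output is an assumption.
    """
    checksum = zlib.crc32(b"".join(prev_tx_hashes))
    return checksum.to_bytes(4, "big")


# Two made-up 32 B transaction hashes, for illustration only.
t1 = bytes.fromhex("11" * 32)
t2 = bytes.fromhex("22" * 32)
print(ic_salt([t1, t2]).hex(), ic_salt([t2, t1]).hex())
```

Because the salt is fully determined by the referenced predecessor transactions, full nodes can recompute it during validation, and the sender cannot freely choose its value.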
Finally, the self-verifying property ties $C(x)$ to $x$ such that
only possession of the private value $x$ enables the spending
of funds. Furthermore, it ensures that $x \cdot G$, $r$, and the
signature $\mathrm{sig}(x \cdot G \,\|\, r, x)$ cannot be manipulated to contain
content without invalidating the signature.
Implementation. ICs can be easily integrated into Bitcoin
by adding only a new OP_COMMIT operation to its stack-based
scripting language. This way, we can replace the most
common transaction scripts, P2PKH and P2SH, with non-
manipulable alternatives: Pay-to-User-Commitment (P2UC)
and Pay-to-Script-Commitment (P2SC).
Figure 4 exemplarily shows the transition from P2PKH
to P2UC (P2SC analogously replaces P2SH). We replace
the manipulable identifier $x$ with its IC $C(x)$. However, full
nodes must verify the correctness of $C(x)$ using the salt $r$
and the signature $\mathrm{sig}(x \cdot G \,\|\, r, x)$ before the transaction is
added to the blockchain. Hence, this check is independent
from actually executing the script once the transaction output
is to be spent. Instead, full nodes only need to verify the
correctness of $x \cdot G$ to authenticate payments. To this end,
we introduce OP_COMMIT, which replaces the current top
stack element $x$ with $x \cdot G$. We ignore the signature and salt
during payment verification via the OP_DROP operations.
Performance. We evaluate our scheme w.r.t. validation
and payment veriﬁcation times and transaction sizes.
Validating an IC requires one additional signature verification,
which takes 0.4 ms on average on a commodity PC
(Intel Core 2 Quad Q9400 CPU running at 2.66 GHz with
4 GiB RAM). Hence, full nodes need only 3.8 s to validate
a proposed block consisting of roughly 1900 transactions
(average over all blocks of 2017) with at most 5 outputs
each (cf. Section IV-C). As new blocks are only published
on average every 10 min, this additional check is
clearly feasible for full nodes even for exceptionally large
transactions with up to 50 outputs each (38 s).
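The validation times above follow directly from the per-IC verification time. The following sketch recomputes them; the constant and function names are hypothetical, and the model simply assumes one signature verification per output and no other costs.

```python
SIG_VERIFY_MS = 0.4          # one IC validation, as measured in the text
TX_PER_BLOCK = 1900          # average over all blocks of 2017, per the text
BLOCK_INTERVAL_S = 10 * 60   # average block interval


def block_validation_seconds(outputs_per_tx: int) -> float:
    """Additional IC validation time per block, one verification per output."""
    return TX_PER_BLOCK * outputs_per_tx * SIG_VERIFY_MS / 1000


for outputs in (5, 50):
    seconds = block_validation_seconds(outputs)
    print(outputs, seconds, seconds / BLOCK_INTERVAL_S)
```

Even in the case of 50 outputs per transaction, the additional validation time stays well below the average block interval.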
Introducing ICs slightly changes payment veriﬁcation:
A full node must execute OP_COMMIT and compute
OP_HASH256 instead of OP_HASH160. Computing one IC
takes 0.2 ms on average and, notably, OP_HASH256 out-
performs OP_HASH160 for small input sizes. Furthermore,
computing the salt as a CRC32 checksum is negligible.
Hence, ICs do not impede payment veriﬁcation.
Replacing blockchain identifiers with ICs increases the
overall transaction size. An IC is 112 B long ($x \cdot G$: 33 B; $r$:
4 B; $\mathrm{sig}(x \cdot G \,\|\, r, x)$: 71 B; 4 B for additional operations), in
contrast to the 20 B size of the unprotected identifier. Hence,
a standard (P2PKH) transaction consisting of one input
and two outputs grows from 225 B to 409 B. Blocks can
therefore hold up to 2445 transactions, which comfortably
sustains the current average of 1900 transactions per block.
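The size overhead can be recomputed as follows. The IC and identifier sizes and the 225 B baseline are taken from the text; the block size limit of 1 000 000 B is an assumption that is not stated above but is consistent with the quoted capacity. The names are hypothetical.

```python
ID_SIZE = 20                  # unprotected identifier, from the text
IC_SIZE = 112                 # identifier commitment, from the text
P2PKH_TX_SIZE = 225           # one input, two outputs, from the text
MAX_BLOCK_BYTES = 1_000_000   # ASSUMPTION: block size limit


def tx_size_with_ics(base_size: int, n_outputs: int) -> int:
    """Transaction size after replacing each output identifier with an IC."""
    return base_size + n_outputs * (IC_SIZE - ID_SIZE)


ic_tx_size = tx_size_with_ics(P2PKH_TX_SIZE, 2)
capacity = MAX_BLOCK_BYTES / ic_tx_size

print(ic_tx_size, capacity)
```

Each output grows by 92 B, so the standard transaction grows to 409 B, and a block of the assumed size holds about 2445 such transactions.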
Evaluation. ICs have a high filtering quality as they
reduce insertable content to the theoretical minimum (cf.
Section III-C). Furthermore, usability is not impeded: Users
only need to additionally compute the IC, which is clearly
feasible at the transaction creation rates expected
of individual users. Even though we need to introduce
OP_COMMIT and extend Bitcoin's transaction validation
process, we argue that the required changes are only minor
and that ICs are thus readily deployable. ICs inevitably increase
transaction sizes. However, the Bitcoin network can still
sustain its transaction throughput. Thus, the overhead is
worth the IC-based protection against content insertion.
CONCLUSION
The threat of inserting arbitrary content into blockchains
was only recently recognized: Objectionable content can be
anonymously and irrevocably inserted and thus distributed
to the nodes of a blockchain-based system, whose operators
can then be culpable for possessing the content.
We proposed conceptual countermeasures to empower the
nodes of a blockchain-based system to heuristically reject
transactions holding unintended content with high probability.
Namely, we proposed a content detector to identify
and reject content-holding transactions, mandatory minimum
transaction fees to make content insertion economically
infeasible, and a computationally non-manipulable,
commitment-based replacement for easily manipulable identifiers
in transaction outputs.
Although our countermeasures only fully complement each other
once all of them are in place, they can be deployed gradually;
e.g., the content detector is immediately deployable. However,
content inserters can quickly adapt to its strategy and evade
detection. It is thus only an effective ad hoc defense that should
soon be accompanied by a fee model that disincentivizes large
transactions, which are suited to hold objectionable content.
Furthermore, using non-manipulable blockchain identifiers limits
content insertion to the theoretical minimum at moderate costs.
ACKNOWLEDGMENTS
This work has been funded by the German Federal
Ministry of Education and Research (BMBF) under funding
reference number 16KIS0443. The responsibility for the
content of this publication lies with the authors, who would
also like to thank the German Research Foundation DFG for
the kind support within the Cluster of Excellence “Integra-
tive Production Technology for High-Wage Countries”.