How Much Are We Willing To Lose in Cyberspace? On the Tail Risk of Scam in the Market for Initial Coin Offerings

Niranjan Sapkota (a, †), Klaus Grobys (b, †, *), Josephine Dufitinema (c)
This draft: November 18, 2020
Abstract
From an entrepreneurial perspective, the Initial Coin Offering (ICO) has become an alternative way of
obtaining funding for business projects using the newly evolving digital financial market for tokens.
Unfortunately, the majority of all ICOs turn out to be scams, which casts doubt on this innovative
tool for acquiring funding. Using a unique, intensively hand-collected data set covering more than
5000 ICOs launched in the August 2014–December 2019 period, we identify
1014 ICOs with data on raised funding, of which 576 turned out to be scam projects. The
cumulative losses due to scams in the ICO market amount to $10.12 billion, which is 66% of our
identified overall market capitalization and highlights the enormous societal impact of this criminal
activity. One novel aspect of our study is that it employs a recently proposed methodology based on
‘plug-in estimation’ to quantitatively compute the risk associated with scams in the market for ICOs.
Our results suggest that employing naïve statistics in risk management dramatically underestimates
this risk. We argue that our findings have important implications for policy makers, as they point to an
urgent need for ICO market regulation by governments and regulatory agencies to protect
investors.
JEL Classification: C19, C49, C59, G10, G15, G19.
Keywords: Crowd funding, Financial technology, Fraud, Initial Coin Offering, Plug-in estimation,
Scam.
*Corresponding author
a N. Sapkota
Finance Research Group, School of Accounting and Finance, University of Vaasa, Wolffintie 34, 65200 Vaasa, Finland
E-mail: niranjan.sapkota@uva.fi
b K. Grobys
Finance Research Group, School of Accounting and Finance, University of Vaasa, Wolffintie 34, 65200 Vaasa, Finland;
Innovation and Entrepreneurship (InnoLab), University of Vaasa, Wolffintie 34, 65200 Vaasa, Finland
E-mail: klaus.grobys@uva.fi
c J. Dufitinema
Mathematics and Statistics Research Group, School of Technology, University of Vaasa, Wolffintie 34, 65200 Vaasa,
Finland
E-mail: josephine.dufitinema@uwasa.fi
We gratefully acknowledge the Project Research Grant (Grant no. 190405) by the Foundation for Economic Education
(LIIKESIVISTYSRAHASTO), Finland.
1. Introduction
Bitcoin, cryptocurrency, digital assets, the blockchain and distributed ledger technologies are
important clusters in the new emerging digital financial ecosystem disrupting the financial sphere.
Nowadays, entrepreneurs looking to finance a new business project can do so through an Initial Coin
Offering (ICO), a variation of an Initial Public Offering (IPO) in which a blockchain-based
issuer sells cryptographically secured digital assets, typically referred to as tokens, that give the holder
the right to an issuer’s product or service. Unlike IPOs, which are subject to strict legal regulations,
ICOs only require a white paper (WP) and offer some interesting features such as (i) absence of entry
constraints, (ii) scope for exponential growth, (iii) absence of geographical barriers, and (iv) simple
validation. It may not be surprising that the ICO market has experienced explosive growth recently. For
example, a recent study by Howell, Niessner, and Yermack (2020) documents that in the January
2016 to August 2019 period, ICOs raised over $31 billion worldwide, with about two dozen
individual ICOs each acquiring more than $100 million. The authors point out that the ICO market
has become notorious for frauds, or scams. Alarmingly, according to one estimate based
on 1500 ICO projects, 80% of all ICOs that successfully raised their target have turned out to be
scams which, unsurprisingly, damaged investors’ trust in the market for ICOs.1 In this regard, Howell
et al. (2020) highlight that the ICO market’s rapid growth and novel characteristics have attracted
interest from entrepreneurs, investors, and regulators.
This paper explores a novel dimension of risk associated with this new digital
ecosystem, that is, the expected loss due to scams in the ICO market. Investigating a unique hand-collected
data set of 5036 ICOs covering the August 2014 to December 2019 period, we find that
1014 ICOs have available data for raised funding. Using these data, we find that 576 ICOs turned
out to be scams, corresponding to cumulative losses of $10.12 billion, which highlights the enormous
societal impact of this criminal activity. The largest loss in our sample is the so-called ‘Petro-scam’,
where investors lost a total of $735 million.2 In this regard, it is important to note that the Venezuelan
Legislative Assembly also declared the Petro illegal in 2018. Naïve risk management may dramatically
underestimate the risk of those fraud incidents. We examine the distribution of fraud incidents in the
ICO market and build a statistical picture of their tail properties.3 Employing Taleb’s (2020) recently
proposed ‘plug-in estimation’, we show that the distribution of fraud incidents is extremely fat-tailed,
considerably more than what one could be led to believe from the outset. It is noteworthy that Cirillo
1 See, https://research.bloomberg.com/pub/res/d28giW28tf6G7T_Wr77aU0gDgFQ.
2 See, https://www.bloomberg.com/news/articles/2018-04-03/crypto-rating-sites-are-already-calling-venezuela-s-petro-
a-scam.
3 Note that we use ‘scam’ and ‘fraud’ as synonyms in our current study.
and Taleb (2020, p.606) emphasize that fat tails represent a common, yet often ignored, regularity in
many fields of science and knowledge and argue that the main problem of naïve risk management is
that it consistently uses the wrong, thin-tailed distributions and therefore underestimates tail risks. Taleb
(2020) formally shows that in the presence of fat-tailed distributions, the law of large numbers works
too slowly. Hence, he proposes the so-called ‘plug-in estimation’ where the underlying distribution
is estimated first and then the corresponding theoretical moment. Our study is the first that makes use
of this recently proposed ‘plug-in estimation’ to explore a new dimension of risk associated with the
new emerging digital financial ecosystem, that is, the tail risk of fraud incidents in the market for
ICOs.4
Our study makes several clear and important contributions. First, from the perspective
of a broader audience, our paper extends the literature investigating the degree to which man-made
phenomena are exposed to tail risks. As man-made phenomena are often fat-tailed, and
hence modeled via power-laws, an important study in this stream of research is that of Clauset,
Shalizi, and Newman (2009), who analyze whether 24 real-world data sets from a range of different
branches of human endeavor (including physics, earth sciences, biology, ecology,
paleontology, computer and information sciences, engineering, and the social sciences) statistically follow
power-law distributions. Clauset et al. (2009) and Taleb (2007) argue that power-law distributions
occur in many situations of scientific interest and have significant consequences for our understanding
of man-made phenomena. From the perspective of a narrower audience of finance researchers,
power-law distributions are typically used in research on financial markets to model
both financial returns and the volatility of financial returns. For instance, in a recent study
Warusawitharana (2019) estimates power-law coefficients of 41 stocks over the 2003 to 2014 period
and finds that the power-law coefficient of the cross-sectional distribution ranges between 2.09 and
3.46 (see footnote 5). Given this body of literature, our study is the first to demonstrate that the losses due to
scams in the market for ICOs are indeed heavily fat-tailed, which has dramatic
consequences for the computation of risk.
Narrowing the perspective from the general finance audience to a distinct audience of
experts in the newly emerging field of financial technology, our study contributes to the fast-growing
literature exploring issues related to digital ecosystems. Specifically, a recent stream of literature
investigates aspects of the Bitcoin ecosystem, especially as they relate to finance (see Makarov and
4 Since our paper is related to the literature on financial risks, it is at this stage not our goal to explain the emergence of
these tail properties which could be an issue related to the literature on microstructures in financial markets or the literature
dealing with issues related to behavioral finance.
5 Lux and Alfarano (2016) provide a more detailed overview on the literature dealing with power-laws in financial market
data.
Schoar, 2020; Howell, Niessner, and Yermack, 2020; Grobys, 2020; Böhme, Christin, Edelman, and
Moore, 2015; Harvey, 2016; Malinova and Park, 2016; Raskin and Yermack, 2017; Aune,
Krellenstein, O’Hara, and Slama, 2017). In this stream of research, an important paper related to our
study is that of Howell et al. (2020), who investigate which issuer and ICO characteristics
predict successful real outcomes in a sample of more than 1500 ICOs that collectively raised a total
of $12.9 billion. Importantly, the authors highlight that scams have turned out to be a serious issue in
the ICO market. Another important study is that of Foley, Karlsen and Putniņš (2019), who propose
a model to identify illegal activities in Bitcoin. The authors find that about one-quarter of all users
(26%) and close to one-half of Bitcoin transactions (46%) are associated with illegal activity. Grobys
and Sapkota (2020) analyze the bankruptcy risk of 143 cryptocurrencies over the 2014–2018 period.
The authors’ findings indicate that about 60% of all cryptocurrencies eventually end up in default.
All these studies show that, unlike traditional asset markets, new digital financial markets carry
different types of risk such as fraud risk, money laundering risk, or credit risk. Given this more specific
stream of literature related to digital ecosystems, our study takes a novel perspective by (i)
identifying 13 distinct types of scams that we observed in the market for ICOs, and (ii)
quantitatively assessing the overall risk associated with them. This is an important issue because the sums
of money involved in this new market are overwhelming. As pointed out in Howell et al. (2020),
ICOs raised over $31 billion worldwide.
Finally, Cirillo and Taleb (2020, p.607) highlight that it is not rigorous to employ
naïve but reassuring statistics as a motivation for policy making from governments or regulatory
agencies. Specifically, the fatter the tails of the distribution of a social matter carrying risk, the more
statistical information resides in the extremes and the less in the bulk of the distribution. Using Taleb’s
(2020) ‘plug-in estimation’ to compute the risk of scams in the ICO market, our study provides
important novel implications, compatible with the (non-naïve) precautionary principle, for policy
makers. As pointed out in Cirillo and Taleb (2020, p.607), the precautionary principle should be the
core driver for policy decisions under jointly systemic and extreme risks. Our paper contributes to the
audience from governments and regulatory agencies by providing risk assessments of a newly emerging
tool for raising funding in a market that has so far remained mostly unregulated.
2. Data
2.1 Retrieving data for ICO scams
We used the rvest and xml2 web scraping packages in R to download data from the
website icorating.com. This website provides the risk score and the hype score for more than
5000 ICOs, along with the amount raised in USD for some of them.
Similarly, the website icosbull.com provides basic, financial, and social signal views for
around 3000 ICOs. We downloaded the financial information of those listed ICOs using the same
web scraping packages. Unfortunately, the financial information for many ICOs is missing on these
two websites. Financial information, especially on the raised amount, which is of major importance
in our study, is missing on many websites (including major ICO database providers such as icobench,
neironix, and icoholders). Nevertheless, after combining various data sources, we were able to collect
information on the raised amount for 1014 ICOs issued in the August 2014 to December
2019 period.
2.2 Identifying scams in the ICO market
It is important to note that in our study, we explored 5036 ICOs that were launched in the August
2014 to December 2019 period. In this regard, a main challenge in the data collection process was
finding a website or database that lists those ICOs that turned out to be scams. However,
there are a few websites listing coins and tokens that are no longer traded, which are often
referred to as dead coins or dead tokens. Furthermore, we also needed to identify the reason for,
or the type of, the scam before considering them as justified observations for our sample.
Fortunately, the website deadcoins.com provides a list of 2000 dead coins and tokens, and the website
coinopsy.com reports a list of 1700 (most of which are also listed on deadcoins.com),
together with the scam or other issue behind the default.6 The majority of the projects listed as scams
on these two websites are cryptocurrencies.
Unfortunately, these websites do not provide information on the financials. From the
available sample of 1014 ICOs with raised funding, we could categorize 97 ICOs as ‘scam listed by
the third party’, which were retrieved from the above-mentioned websites listing dead coins.7 For the
remaining 917 ICOs with financial information on the raised amount in USD that were not listed by any third party,
we manually searched each ICO in the Bitcoin Talk forum on the website bitcointalk.org to separate
them into either ‘scam’ or ‘legit’.8 In the section ‘trading discussion’ on the website bitcointalk.org
there are more than 800000 posts discussing scam accusations, reputation and other trading topics
6 https://www.coinopsy.com/dead-coins/, https://icosbull.com/eng/icos/past/financial (as of 11.11.2020)
7 All third-party data presented herein were obtained from publicly available sources which are believed to be reliable.
However, we make no warranty, express or implied, concerning the accuracy of such information.
8 Interestingly, nearly all ICOs have been announced in bitcointalk.org and some other forums like Bitcoin.com, Altcoin
Talks, Bitcoin Garden. This is the essential part of a PR campaign. Bitcoin Talk is the one to focus on, as it is the biggest
and most popular of such platforms, especially when it comes to announcements of new coins/tokens.
about the coins or tokens. In the ‘trading discussion’ section, 211000 threads are devoted solely to
scam accusations, whereas 84000 threads are for reputation (i.e. for promotion).9
Using the search term ‘{name}{space}{scam}’ in the Bitcoin Talk forum on the website
bitcointalk.org, we manually searched all remaining ICOs (i.e., those that were not listed by any third
party as scams). The search results showed multiple threads where the name of the ICO and the term
scam appear together. We carefully studied those threads and marked the ICO as ‘scam’ if a forum
member gave at least one valid reason for the scam accusation; otherwise the ICO was marked as ‘legit’.
In total, we were able to identify valid reasoning behind scam accusations for 576 ICOs (including
those ICOs listed by the third-party websites). We marked the remaining ICOs as ‘legit’. Of the
total of $15.38 billion raised by the ICOs included in our dataset, $10.12 billion
was lost due to scams, which is 65.80% of the total amount raised.
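As a quick check, the scam shares by count and by raised funds follow directly from the totals reported in this section (a worked calculation using only those reported numbers):

\[
\frac{576}{1014} \approx 0.568 \quad (56.8\% \text{ of ICOs with funding data}), \qquad
\frac{\$10.12\ \text{billion}}{\$15.38\ \text{billion}} \approx 0.658 \quad (65.8\% \text{ of funds raised}).
\]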
A report by SATISGROUP published in the Bloomberg research terminal also identified
scams among ICOs. In doing so, it assessed whether an ICO project either ‘had an intention’ or
‘had no intention’ of fulfilling project development duties with the funds, and/or was deemed by the
community to be a scam.10 We followed an approach similar to that of SATISGROUP. The difference
between our approach and theirs is that we directly (manually) looked into the
forum/community discussion on scam accusations. Since ICOs in our data sample were able to raise
significant amounts of money, we believe that they fulfilled the major criteria needed to present
themselves as legitimate ICOs, even though most of them were not. However, rather simple
statistical approaches based on characteristics may not accurately differentiate between fake and real
ICOs. In this regard, Chainalysis, a blogging website decoding Ethereum scams, identified over 2000
scam addresses on Ethereum that received funds from nearly 40000 unique users during the
2016–2018 period.11 Note that, with ERC-20 as the standard guideline, many ICOs use the Ethereum
blockchain for token offerings. However, in our sample, besides the Ethereum channel, many tokens use
their own channels and possibly other channels such as Waves. Using our more work-intensive approach
to identify scams, we do not categorize scams based on the channels used by the tokens.
9 https://bitcointalk.org/index.php?board=8.0 (as of 11.11.2020)
10 https://research.bloomberg.com/pub/res/d28giW28tf6G7T_Wr77aU0gDgFQ.
11 The full report can be retrieved here: https://blog.chainalysis.com/2019-cryptocrime-review.
2.3 Clustering scams in the ICO market
Analyzing our hand-collected data, we found several different ways in which investors are fooled by
scammers. We could categorize scam incidents into thirteen different types based on their
nature. The results are reported in Figure A.1 in the appendix. In doing so, we first categorized ICOs
retrieved and matched from the websites deadcoins.com and coinopsy.com as ‘Listed’. Second, users
receiving spam emails, suspicious links and pop-ups, requests for personal and financial details,
errors on withdrawals, pending withdrawals, balances disappearing from the wallet, etc. are some
common accusations, which we categorized as ‘PhishingNfraud’. We denoted the third category of
ICO scams as ‘Fake’. Specifically, we tag an ICO as fake if a Bitcoin Talk forum member identified
the ICO as having a fake team/project/wallet/social media/trading etc. Another scamming tool is a so-called
‘bounty program’, which entails financial rewards, mostly in tokens, for users’ PR activities such as
promoting the ICO on forums, Telegram, and messengers, translating and localizing documents, and posting on
social media and blogs. However, many ICOs fail to pay the bounty to those promoters.
Hence, we categorized those ICOs under ‘Bounty Scam’ if a bounty hunter has accused the ICO of being a
scam for not paying his/her bounty.
The fifth common type of scam among ICOs in our sample is the so-called ‘Exit Scam’,
where the developers and the promoters (the ones who collected the funds in an ICO) suddenly
disappear, leaving the investors with no information. Furthermore, we identified many ICO scam
accusations where the same group of scammers was actively scamming in other projects.
We categorized these scams as ‘Previous Scammers’. Unfortunately, due to the lack of regulations,
the same individual/team/promoter can potentially fool naïve investors over and over again. Next,
we define ‘Airdrop Scam’, which is the sixth category of scam in our sample, as a fraud where the
scammers steal the wallets’ private keys from the users. Specifically, scammers create a booby trap, and
users willing to acquire free tokens click on the links and give away their private information,
ultimately losing their coins to scammers.
It is important to note that there are more than 32,000 crypto exchanges/markets around
the world, which makes it difficult for users to identify scam exchanges (see footnote 12). Developers that would like
to take advantage of this situation preferably launch the ICO at a fraudulent exchange. According to
our categorization, this type of scam is defined as ‘Exchange scam’. Furthermore, we observed that
copying the whitepaper of a promising ICO and launching it under a similar or different name has
also become a new trend among scammers. According to our definition, this type of scam is referred
12 For more information, see coinmarketcap.com (as of 11.11.2020).
to as ‘Whitepaper Plagiarism Scam’. Fortunately, users are getting familiar with this type of scam
and typically report it in the Bitcoin Talk forum. We identified many scam accusations in the forum
about ICOs fully or partially plagiarizing whitepapers from successful or promising ICOs.
‘Pump and Dump’ is another technique associated with scams that we observed in our
sample. One cannot directly see this type of scam at the very beginning of a project. Unfortunately,
it is only possible to identify it when it is already too late. Usually, investors and traders rush to buy
the tokens at an early phase when the price is low. Similarly, some investors buy them at a high price
for fear of missing out. Once the scammers complete the sale, the price suddenly crashes. Moreover, ‘Ponzi
scam’ is another category of scam that we observed in our sample. While it is similar to the pyramid
scam, the essential difference is that a Ponzi scam requires that the victims invest in some
product(s) or service(s) associated with the ICO, with promised returns at a later stage. As a new
method of scamming investors, we observed that scammers launch websites that imitate the
names and domain names of other ICOs or projects. New (or naïve) investors that are
unaware of the original websites fall into this trap and lose their coins. Hence, we categorize this type
of scam as ‘Website scam’.
Unfortunately, we observed that the so-called ‘Porn scam’ seems to be increasingly
popular among scammers. Specifically, there are some ICOs offering premium access to their porn
sites and/or their products. This is perhaps the easiest way of scamming people because a user of
the site will hesitate to report the scam if it turns out to be fraud, given that in many
countries/societies watching porn is strictly prohibited. Hence, we categorize this type of scam as
‘Porn scam’. The last type of scam that we could identify in our data sample is categorized as ‘Pre-mine
scam’. Pre-mining occurs when a fraction of the tokens for the project is made available to a
group of developers and promoters prior to the public offering. This is an important means of
distributing rewards to developers. However, if the fraction of the tokens reserved for a pre-mine is
high, there is probably some reason to worry. Specifically, we define an ICO as a ‘Pre-mine scam’ if
some tokens are shared among the developers and the promoters after the final token sale instead of
burning the unsold tokens. This defrauds the investors because a higher circulating token supply
generally implies a lower token price. Furthermore, there is a chance to manipulate the market if
developers hold a large fraction of the tokens from the pre-mining activity. This also applies to the
context of cryptocurrencies (Grobys and Sapkota, 2020).
3. Empirical Framework
3.1. Thin-tailed versus fat-tailed distributed data
We investigate the distribution of losses due to scams in the ICO market covering the August 2014
to December 2019 period. In this period, we observed 576 fraud incidents totaling an overall loss of
$10.12 billion. The lowest (highest) loss due to ICO scams corresponds to $2000 ($735.00 million).
In Table 1 we report the deciles of the loss distribution. We observe that the lowest 50% of the losses
comprise about 7% of the total losses in US dollar terms, whereas the highest 20% of the losses
comprise about 71% of the total. The last column in Table 1 shows how the losses would be distributed
across deciles if they were normally distributed. For instance, if the losses were normally distributed,
the lowest 50% of the losses would comprise 20% of the total losses, whereas the highest 20% of the
losses would comprise 45% of the total. These figures are indeed very different from what we observe
in the data. Next, in Figure 1 we plot the relative frequency histogram of the losses due to ICO scams
using actual data. Specifically, we divided the overall distribution into ten bins of equal length in line
with the support of the distribution, which ranges from $2000 to $735.00
million. For instance, in the first bin we report the number of losses between $2000 and
$73.50 million divided by the total number of losses, whereas in the last bin we report the
number of losses between $661.50 million and $735.00 million divided by the total number of
losses. From Figure 1 we observe that 97.57% of the losses are in the first bin ranging from $2000 to
$73.50 million. For comparison, we also report the relative frequency histogram for hypothetically
normally distributed losses. We observe that these two relative frequency histograms are very
different from each other. Unlike hypothetically normally distributed losses, the empirical relative
frequency distribution appears to be highly skewed and possibly fat-tailed. In Table 2 we report the
descriptive statistics for our data. The kurtosis of the loss distribution exceeds 120, which is a
statistical indication of fat tails. The sample average is $17.57 million, while the sample standard
deviation is $52.49 million.
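The binning and the cumulative-share comparison described above can be sketched as follows. This is a minimal illustration, not the code behind Table 1 or Figure 1: the function names and the toy losses are hypothetical, and only the support ($2000 to $735.00 million) and the choice of ten equal-length bins are taken from the text. The sketch assumes every loss lies inside the stated support.

```python
from typing import List, Sequence


def equal_width_relative_frequencies(
    losses: Sequence[float], lower: float, upper: float, n_bins: int = 10
) -> List[float]:
    """Share of observations in each of n_bins equal-length bins on [lower, upper]."""
    width = (upper - lower) / n_bins
    counts = [0] * n_bins
    for loss in losses:
        # Map the loss to a bin index; the maximum falls into the last bin.
        index = min(int((loss - lower) / width), n_bins - 1)
        counts[index] += 1
    return [count / len(losses) for count in counts]


def share_of_total_from_smallest(losses: Sequence[float], fraction: float) -> float:
    """Share of the total loss contributed by the smallest `fraction` of observations."""
    ordered = sorted(losses)
    k = int(len(ordered) * fraction)
    return sum(ordered[:k]) / sum(ordered)


# Support reported in the text: $2000 to $735.00 million.
LOWER, UPPER = 2_000.0, 735_000_000.0
first_bin_upper_edge = LOWER + (UPPER - LOWER) / 10  # about $73.50 million

# Hypothetical losses in USD, for illustration only (NOT the paper's data).
toy_losses = [2_000.0, 50_000.0, 1_200_000.0, 8_000_000.0, 40_000_000.0, 735_000_000.0]
toy_frequencies = equal_width_relative_frequencies(toy_losses, LOWER, UPPER)
toy_bottom_half_share = share_of_total_from_smallest(toy_losses, 0.5)
```

Even in this toy sample a single extreme observation pushes almost all other losses into the first bin, which is the pattern the text reports for the actual data.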
More formally, we apply Bayes’ rule as outlined in detail in Taleb (2020, p.52) to
explore how likely it is that the data generating process of the fraud incidents is thin-tailed as
opposed to fat-tailed. To do so, we exclude the Petro scam from the sample and then compute the
actual sample mean and actual sample standard deviation, which are $16.32 million and $43.15 million,
implying that the largest loss corresponds to a 17-sigma event in the thin-tailed Gaussian world with
a corresponding probability of 8.21E-65 (see footnote 13). However, assuming a t-distribution with 3 degrees of
freedom as an example of a fat-tailed alternative distribution, a 17-sigma event occurs in 4 of 10000
13 For calculating the first and second sample moments, we here excluded Petro scam from the sample to explore how
likely it is to observe it as an outlier, given the remaining data of the distribution.
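observations, that is, with a probability of 4.43E-4. According to Bayes’ rule the conditional
probability that the event is Gaussian, given that the loss due to the Petro scam occurred, is then
defined as
\[
P(G \mid E) = \frac{P(G)\,P(E \mid G)}{P(G)\,P(E \mid G) + P(NG)\,P(E \mid NG)},
\]
where
\(P(E \mid G) = P(\mathrm{Event} \mid \mathrm{Gaussian})\),
\(P(G) = P(\mathrm{Gaussian})\),
\(P(NG) = 1 - P(G) = P(\mathrm{NonGaussian})\),
\(P(G \mid E) = P(\mathrm{Gaussian} \mid \mathrm{Event})\),
\(P(E \mid NG) = P(\mathrm{Event} \mid \mathrm{NonGaussian})\).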
In Table 3 we report the corresponding probabilities for 𝑃(𝐺𝑎𝑢𝑠𝑠𝑖𝑎𝑛|𝐸𝑣𝑒𝑛𝑡) assuming various
probabilities for 𝑃(𝐺𝑎𝑢𝑠𝑠𝑖𝑎𝑛). We observe that even if assuming a probability of 0.9999 for this
distribution being Gaussian, the probability that the distribution is Gaussian given the Petro scam is
as low as 1.85E-57. Overall, we find plenty of evidence that losses due to ICO scams follow a fat-
tailed distribution. In the next section, we employ statistical models to compute the risk of losses due
to scams in the ICO market.
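The Bayes calculation of this section can be retraced with a short sketch. It assumes that the reported tail probabilities are two-sided, P(|Z| > 17) and P(|T| > 17); the text does not state this, but the assumption comes close to the reported 8.21E-65 and 4.43E-4. The function names are hypothetical, and the Student t tail uses the closed-form distribution function for 3 degrees of freedom.

```python
import math


def two_sided_gaussian_tail(z: float) -> float:
    """P(|Z| > z) for a standard normal variable."""
    return math.erfc(z / math.sqrt(2.0))


def two_sided_t3_tail(t: float) -> float:
    """P(|T| > t) for a Student t variable with 3 degrees of freedom (closed form)."""
    x = t / math.sqrt(3.0)
    return 1.0 - (2.0 / math.pi) * (x / (1.0 + x * x) + math.atan(x))


def posterior_gaussian(prior_gaussian: float, p_event_gaussian: float,
                       p_event_fat: float) -> float:
    """Bayes' rule: P(Gaussian | Event), with P(NonGaussian) = 1 - P(Gaussian)."""
    numerator = prior_gaussian * p_event_gaussian
    return numerator / (numerator + (1.0 - prior_gaussian) * p_event_fat)


# Values reported in the text (USD millions), Petro scam excluded from the moments.
z_score = (735.00 - 16.32) / 43.15        # roughly 16.7, rounded to 17 in the text
p_gauss = two_sided_gaussian_tail(17.0)   # close to the reported 8.21E-65
p_t3 = two_sided_t3_tail(17.0)            # close to the reported 4.43E-4
posterior = posterior_gaussian(0.9999, p_gauss, p_t3)  # close to the reported 1.85E-57

print(f"z = {z_score:.2f}, P(E|G) = {p_gauss:.3g}, "
      f"P(E|NG) = {p_t3:.3g}, P(G|E) = {posterior:.3g}")
```

Because the Gaussian likelihood is about sixty orders of magnitude smaller than the fat-tailed one, the posterior stays negligible even for a prior of 0.9999 on the Gaussian.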
3.2 Computation of risk
Fat-tailed distributions describing man-made phenomena are often modeled using power-laws. In this
regard, Clauset, Shalizi, and Newman (2009) explore whether 24 real-world data sets from a range
of different disciplines follow power-law distributions. The 24 data sets that the authors study are
drawn from a broad variety of different branches of human endeavor, including physics, earth
sciences, biology, ecology, paleontology, computer and information sciences, engineering, and the
social sciences. Their findings indicate that 17 of the 24 data sets are consistent with a power-law
distribution. Both Clauset et al. (2009) and Taleb (2007) argue that power-law distributions occur in
many situations of scientific interest and have significant consequences for our understanding of man-made
phenomena. Specifically, Taleb (2020, p.34) highlights that the tail exponent of a power-law
function captures, by extrapolation, the low-probability deviation not seen in the data, but that plays
an extraordinarily large role in determining the mean. In a recent paper, Cirillo and Taleb (2020)
employ power-law modeling to compute the risk of contagious diseases. Motivated by this stream of
literature, we define the survival function of the power-law governing losses due to scams in the ICO
market as
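\[
P(X > x) = p(x) = C x^{-\alpha}, \tag{1}
\]
where \(C = (\alpha - 1)\,x_{\min}^{\alpha - 1}\), \(x_{\min}\) is the minimum realization of a loss due to ICO scam that bends the
power-law, and \(\alpha\) defines the tail exponent.14 It can be shown that the expectation \(E[X]\) is then given
by
\[
E[X] = \int_{x_{\min}}^{\infty} x\,p(x)\,dx = \frac{(\alpha - 1)}{(\alpha - 2)}\,x_{\min}, \tag{2}
\]
whereas the second moment \(E[X^2]\) is defined as
\[
E[X^2] = \int_{x_{\min}}^{\infty} x^2\,p(x)\,dx = \frac{(\alpha - 1)}{(\alpha - 3)}\,x_{\min}^2, \tag{3}
\]
and higher moments of order \(k\) are analogously defined as
\[
E[X^k] = \frac{(\alpha - 1)}{(\alpha - 1 - k)}\,x_{\min}^k. \tag{4}
\]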
From Equations (2) and (3) we recognize immediately that the mean only exists for \(\alpha > 2\), whereas the
variance only exists for \(\alpha > 3\). Since White, Enquist and Green (2008) and Clauset et al. (2009), who
explore the performance of various estimation techniques for estimating power-law exponents, find
that Maximum Likelihood Estimation (MLE) performs best, we estimate the tail exponent as
\[
\hat{\alpha} = 1 + N \left[ \sum_{i=1}^{N} \ln \frac{x_i}{x_{\min}} \right]^{-1}, \tag{5}
\]
where \(\hat{\alpha}\) denotes the MLE estimator and \(N = 576\) is the number of observations in our sample. As
we see from Equations (2) to (4), the minimum value \(x_{\min}\) is essential for the calculation of distribution
moments. Figure 2 plots the evolution of \(\hat{\alpha}\) depending on the chosen \(x_{\min}\). We observe from Figure
2 that the point estimates for alpha first increase sharply but then appear to be relatively stable,
with corresponding estimates for alpha that are close to 2.
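Equations (2) to (5) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the sample is synthetic (drawn by inverse-transform sampling from a power law with a known exponent), and the estimator counts only observations at or above the threshold, following Clauset et al. (2009). The sketch treats p(x) in Equation (1) as a density normalized on [x_min, ∞), which is what Equations (2) to (4) imply.

```python
import math
import random
from typing import Sequence


def powerlaw_alpha_mle(observations: Sequence[float], x_min: float) -> float:
    """Equation (5): 1 + n / sum(ln(x_i / x_min)) over observations with x_i >= x_min."""
    tail = [x for x in observations if x >= x_min]
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum <= 0.0:
        raise ValueError("need at least one observation strictly above x_min")
    return 1.0 + len(tail) / log_sum


def powerlaw_moment(alpha: float, x_min: float, k: int = 1) -> float:
    """Equations (2)-(4): (alpha - 1) / (alpha - 1 - k) * x_min**k, finite only if alpha > k + 1."""
    if alpha <= k + 1:
        raise ValueError(f"moment of order {k} does not exist for alpha = {alpha}")
    return (alpha - 1.0) / (alpha - 1.0 - k) * x_min ** k


# Synthetic sample (NOT the paper's data): inverse-transform draws from a power law.
rng = random.Random(0)
TRUE_ALPHA, X_MIN = 2.5, 10.0
sample = [X_MIN * (1.0 - rng.random()) ** (-1.0 / (TRUE_ALPHA - 1.0))
          for _ in range(20_000)]
alpha_hat = powerlaw_alpha_mle(sample, X_MIN)

# Evolution of the estimate over candidate thresholds, in the spirit of Figure 2.
alpha_by_threshold = {x_min: powerlaw_alpha_mle(sample, x_min)
                      for x_min in (10.0, 20.0, 40.0)}

# Plug-in mean from Equation (2) for the values reported later in the text
# (alpha = 2.0671, x_min = 10, in USD millions): arithmetic on reported numbers only.
plug_in_tail_mean = powerlaw_moment(2.0671, 10.0, k=1)
```

With an exponent this close to 2, Equation (2) gives a mean of roughly 16 times the threshold, and the second moment does not exist, so `powerlaw_moment(2.0671, 10.0, k=2)` raises an error by design.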
14 In our notation here we follow Clauset et al. (2009).
Cirillo and Taleb (2016a, p.1487-1488) argue that in risk management, when dealing with Value at
Risk (VaR) or Expected Shortfall (ES), it is common practice to just focus on the upper tail of the
distribution of losses which has the advantage of avoiding excessive parametric assumptions with
respect to the whole distribution of losses. The authors also note that it is reasonable to ignore the
other parts because the fatter the tails, the less the contribution of the body of the distribution for risk
analysis. Next, we employ Cirillo and Taleb’s (2016a, 2016b, 2020) approach to model the tail of our
loss distribution. Assessing the threshold \(x_{\min}\) that determines the power-law behavior in the tail is
ultimately an empirical question. For instance, Cirillo and Taleb (2016a) document that VaR and ES
are usually computed for very high confidence levels, from 95 to 99.9%. In this regard, Clauset et al.
(2009) also argue that it is quite common in the quantitative finance literature to limit the analysis to
the largest observed samples only. For instance, in a recent paper, Cirillo and Taleb (2020) allocate
34.7% of the data to the tail event category. In our current research context, we find that losses
exceeding $10 million are large, and hence, we specify these observations as tail events.15 This is also
in line with Clauset et al. (2009), who document that a common way of choosing the threshold is to
identify a point beyond which the value of the power-law exponent appears relatively stable.
Using \(x_{\min}\) = $10.00 million gives us a corresponding estimate for the power-law
exponent \(\hat{\alpha} = 2.0671\). To investigate how well our chosen distribution model specification fits the
empirical data, we plot in Figure 3 the empirical cumulative density function (y-axis) against the
theoretical counterpart (x-axis). If the data fit were perfect, which is of course impossible given our
real-world data, all observations would line up on the 45-degree line, which is marked in grey in
Figure 3. Visual inspection of Figure 3 suggests that our chosen model describes the data fairly
well. Next, unlike Cirillo and Taleb (2016a, 2016b, 2020), who deal with power-law exponents below
2, we estimate the risk \(\mathit{Risk}\) as
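\[ \mathit{Risk} = \frac{N_{x<x_{\min}}}{N}\,E[x \mid x < x_{\min}] + \frac{N - N_{x<x_{\min}}}{N}\,E[x \mid x \ge x_{\min}] \tag{6} \]
\[ \mathit{Risk} = \frac{N_{x<x_{\min}}}{N}\,E[x \mid x < x_{\min}] + \frac{N - N_{x<x_{\min}}}{N}\int_{x_{\min}}^{\infty} x\,p(x)\,dx, \tag{7} \]
and from Equation (2) it follows that Equation (7) is equivalent to Equations (8) and (9), that is,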
15 Especially given that most investors in the ICO market are not institutional investors but small retail investors, we find
that $10.00 million is a very plausible threshold. For instance, in contrast to stocks, ICOs allow for a fraction of a token
to be traded, which especially benefits retail investors who would like to invest smaller amounts of their wealth depending
on their risk appetite. Indeed, one could even argue that ICOs are designed for small retail investors to enable them to
participate in the financing of small businesses and start-ups (see http://www.oecd.org/finance/ICOs-for-SME-Financing.pdf).
\[ \mathit{Risk} = \frac{N_{x<x_{\min}}}{N}\,E[x \mid x < x_{\min}] + \frac{N - N_{x<x_{\min}}}{N}\,\frac{\alpha - 1}{\alpha - 2}\,x_{\min} \tag{8} \]
\[ \mathit{Risk} = \frac{1}{N}\sum_{x_i < x_{\min}} x_i + \frac{N - N_{x<x_{\min}}}{N}\,\frac{\alpha - 1}{\alpha - 2}\,x_{\min}, \tag{9} \]
where \(N_{x<x_{\min}} = 356\) denotes the number of sample observations below \(x_{\min}\). It is noteworthy that we
allocate 32.5% to the tail event category, which is very close to Cirillo and Taleb (2020), who allocate
34.7% of their data to the tail event category.16 Using our plug-in estimator, we get a shadow mean
of $159.03 million for the losses exceeding \(x_{\min}\) = $10.00 million as opposed to the naïve tail
sample mean of $40.00 million. Thus, we retrieve an overall risk estimate of $54.10 million, which
exceeds the naïve sample average of $17.58 million by a factor of 3.
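The shadow mean quoted above follows directly from the closed-form tail mean (α − 1)/(α − 2) · x_min evaluated at the estimated exponent. The sketch below reproduces that single number and shows how the two-part risk estimate could be assembled; the function names and the four-element toy loss list are assumptions for illustration, and the overall figure of $54.10 million is not recomputed because it depends on the loss data, which are not reproduced here.

```python
def shadow_mean(alpha_hat, x_min):
    """Plug-in tail mean (alpha - 1) / (alpha - 2) * x_min; requires alpha > 2."""
    if alpha_hat <= 2.0:
        raise ValueError("the power-law mean does not exist for alpha <= 2")
    return (alpha_hat - 1.0) / (alpha_hat - 2.0) * x_min


def plug_in_risk(losses, alpha_hat, x_min):
    """Body contribution (sum of losses below x_min over N) plus tail share times shadow mean."""
    n = len(losses)
    body = [x for x in losses if x < x_min]
    tail_weight = (n - len(body)) / n
    return sum(body) / n + tail_weight * shadow_mean(alpha_hat, x_min)


# Values reported in the text, in $ million.
paper_shadow_mean = shadow_mean(2.0671, 10.0)  # about 159.03

# Toy data (assumption): three body losses and one tail loss, alpha = 3.
toy_risk = plug_in_risk([1.0, 2.0, 3.0, 20.0], alpha_hat=3.0, x_min=10.0)
```

Because the denominator α − 2 is small when the exponent is only slightly above 2, the shadow mean is very sensitive to the estimate, which is why it can exceed the naïve tail sample mean by a wide margin.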
3.3 Discussion
Our results are very similar to those documented in Cirillo and Taleb (2020), who use this approach
to compute the shadow mean of the numbers of victims in pandemics. In their study, the shadow
mean using actual data is 1.5 times larger than the corresponding sample tail mean. The main
difference between our study and the studies of Cirillo and Taleb (2016a, 2016b, 2020) is that we do
not need to employ EVT to compute the shadow mean as our model suggests that the power-law
exponent slightly exceeds the critical value of 2. Since we employ a large data set in association with
maximum likelihood estimation, our point estimator \(\hat{\alpha} = 2.0671\) has a corresponding standard
deviation of \(\sigma = \frac{\hat{\alpha} - 1}{\sqrt{n}} + O\!\left(\frac{1}{n}\right) = 0.0761\) and a t-statistic of 27.15, indicating statistical significance at any
conventional level. We note that the average \(\hat{\alpha}\) across all different possible values for \(x_{\min}\) is 1.9394. For instance,
choosing \(x_{\min}\) = $5.00 million results in a maximum-likelihood estimate of \(\hat{\alpha} = 1.8112\) which, in
turn, would require an EVT-based approach as detailed in Cirillo and Taleb (2016a, 2016b, 2020).17
(This means, however, that one would allocate the majority of observations, that is 56.3%, to the tail
event category. We find that this does not appear to be plausible in our research context.) We follow
the common practice and determine the threshold for observations that are allocated to the tail event
category depending on our current research context. As pointed out in Clauset et al. (2009, p. 674), “an
alternative approach, quite common in the quantitative finance literature, is simply to limit the
analysis to the largest observed samples only”. Overall, our results strongly support the argument
16 The difference between our study’s data and Cirillo and Taleb’s (2020) data is that our estimate satisfies \(\hat{\alpha} > 2\) and hence,
their proposed approach based on Extreme Value Theory (EVT) is not applicable in our research context.
17 Future research is encouraged to address this issue.
raised in Cirillo and Taleb (2020) that a naïve use of the sample mean would be misleading and would
result in a substantial underestimation of risk.
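The precision figures discussed in this section can be checked with a few lines. This is a hedged sketch: `alpha_standard_error` implements only the leading term (α̂ − 1)/√n of the standard deviation and ignores the O(1/n) remainder, the sample size passed to it below is a made-up value because the number of tail observations is taken as given from the text, and `mean_is_finite` is a hypothetical helper name.

```python
import math


def alpha_standard_error(alpha_hat, n_tail):
    """Leading-order standard error (alpha_hat - 1) / sqrt(n); the O(1/n) term is ignored."""
    return (alpha_hat - 1.0) / math.sqrt(n_tail)


def mean_is_finite(alpha_hat):
    """The power-law mean exists only for alpha > 2; otherwise an EVT-based approach is needed."""
    return alpha_hat > 2.0


# Values reported in the text.
alpha_hat, sigma = 2.0671, 0.0761
t_stat = alpha_hat / sigma  # about 27.16; the text reports 27.15

# Hypothetical sample size, used only to illustrate the formula.
example_se = alpha_standard_error(2.0, 100)

finite_at_10m = mean_is_finite(2.0671)  # threshold of $10.00 million
finite_at_5m = mean_is_finite(1.8112)   # threshold of $5.00 million
```

The check makes the role of the threshold explicit: at $10.00 million the estimated exponent lies above 2 and the plug-in mean exists, whereas at $5.00 million it does not.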
3.4 Robustness check: Accounting for the OneCoin scam
OneCoin is considered the largest scam in the history of cryptocurrency (so far). Even though the
developers promoted it as a cryptocurrency promising a financial revolution, it turned out to be a Ponzi
scheme. Shockingly, according to estimates, criminals raised around $3.77
billion worldwide through this scam ICO.18 Zhang, Raveenthiran, Mukai, Naeem, Dhuna, Parveen and Kim (2019) consider
OneCoin a controversial example of an ICO. They argue that it was neither blockchain-based nor
had any real tokens. Furthermore, OneCoin was selling plagiarized educational packages as its
products. Therefore, it appears to have been a complete fraud from the very beginning. To check the
robustness of our estimated risk of losses due to scam in the ICO market, we decide to add OneCoin
to our data set due to the high societal impact of this fraud. After adding OneCoin, our updated
data set has 377 observations. Using again \(x_{\min}\) = $10.00 million gives us a corresponding estimate
for the power-law exponent of \(\hat{\alpha} = 2.0421\), which is 0.33 standard deviations below our previously
estimated \(\hat{\alpha}\). Since we cannot reject the null hypothesis \(H_0\) that the two exponents are equal, we consider our results robust.
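The size of the shift caused by adding OneCoin can be verified from the reported numbers. The sketch below is an illustration only: it measures the gap in units of the previously reported standard deviation of 0.0761, as the text does, and the 1.96 cutoff is an assumed two-sided 5% normal critical value that the paper does not state.

```python
def gap_in_standard_deviations(alpha_a, alpha_b, sigma):
    """Absolute difference between two exponent estimates, in units of sigma."""
    return abs(alpha_a - alpha_b) / sigma


# Values reported in the text.
gap = gap_in_standard_deviations(2.0671, 2.0421, 0.0761)  # about 0.33

# Assumed decision rule: two-sided test at the 5% level under normality.
reject_equal_exponents = gap > 1.96
```

A gap of roughly a third of a standard deviation is far below any conventional critical value, which is consistent with the conclusion that the estimate is robust to this extreme observation.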
4. Conclusion
Exploring a large data set on ICOs covering the August 2014–December 2019 period, we observed
that 1014 ICOs raised a total of $15.46 billion. Unfortunately, we identified 56% of those ICOs as
scams corresponding to $10.12 billion, which is 65% of the total raised funding. Categorizing ICO
scam into 13 different categories, our findings indicate that the so-called ‘PhishingNfraud’ appears
to be the most common type of scam, where users receive spam emails, suspicious links and pop-ups,
requests for personal and financial details, errors on withdrawals, pending withdrawals, balances
disappearing from the wallet, etc. Our further analysis indicates that losses due to ICO scams follow a
heavily fat-tailed distribution. Using appropriate statistical methodologies accounting for fat-tailed
distributions, we estimate that the fraud risk associated with ICO scam is $54.10 million. We find
that employing naïve statistics such as the sample mean underestimates this risk. Future research
could explore the positive side of ICOs, that is, the opportunity of raising funding. We argue that our
findings have significant implications, including the extent to which traditional risk management tools
18 See more at: https://www.justice.gov/usao-sdny/pr/manhattan-us-attorney-announces-charges-against-leaders-onecoin-multibillion-dollar.
can be relied upon for making decisions. Finally, our results underline the urgent need for ICO market
regulations from governments and regulatory agencies to protect investors from severe losses.
References
Aune, R., Krellenstein, A., O’Hara, M., and O. Slama, 2017. Footprints on a blockchain: trading
and information leakage in distributed ledgers. Journal of Trading 12 (2), 5–13.
Biais, B., Bisiere, C., Bouvard, M., Casamatta, C., and A. Menkveld, 2019. Equilibrium Bitcoin
Pricing, 2019 Meeting Papers 360, Society for Economic Dynamics.
Böhme, R., Christin, N., Edelman, B., and T. Moore, 2015. Bitcoin: economics, technology, and
governance. Journal of Economic Perspectives 29, 213–238.
Cirillo, P., 2013. Are your data really Pareto distributed? Physica A 392, 5947–5962.
Cirillo, P., and N. N. Taleb, 2016a. On the statistical properties and tail risk of violent conflicts.
Physica A 452, 29–45.
Cirillo, P., and N. N. Taleb, 2016b. Expected shortfall estimation for apparently infinite-mean
models of operational risk. Quantitative Finance 16, 1485–1494.
Cirillo, P., and N. N. Taleb, 2020. Tail risk of contagious diseases. Nature Physics 16, 606-613.
Clauset, A., Shalizi, C. R., and M. E. J. Newman, 2009. Power-Law Distributions in Empirical Data.
SIAM Review 51, 661-703.
Gillespie, C.S. 2015. Fitting Heavy Tailed Distributions: The poweRlaw Package. Journal of
Statistical Software 64(2), 1-16.
Foley, S., Karlsen, J.R., and T. J. Putniņš, 2019. Sex, Drugs, and Bitcoin: How Much Illegal
Activity Is Financed Through Cryptocurrencies? Review of Financial Studies 32(5),
1798-1853.
Grobys, K., 2020. When the Blockchain does not block: On Hackings and Uncertainty in the
Cryptocurrency Market. Quantitative Finance (forthcoming).
Grobys, K., and N. Sapkota, 2020. Predicting Cryptocurrency Defaults. Applied Economics 52(46),
5060-5076.
Harvey, C.R., 2016. Cryptofinance. Social Science Research Network
http://ssrn.com/abstract=2438299.
Hileman, G., and M. Rausch, 2017. Global Cryptocurrency Benchmarking Study. Cambridge
University, Cambridge Center for Alternative Finance, Cambridge, UK.
Howell, S.T., Niessner, M., and D. Yermack, 2020. Initial Coin Offerings: Financing Growth with
Cryptocurrency Token Sales, Review of Financial Studies 33(9), 3925–3974.
Lux, T., and S. Alfarano, 2016. Financial power laws: Empirical evidence, models, and
mechanisms. Chaos, Solitons and Fractals 88, 3-18.
Makarov, I., and A. Schoar, 2020. Trading and arbitrage in cryptocurrency markets, Journal of
Financial Economics 135(2), 293-319.
Malinova, K., and Park, A., 2016. Market Design with Blockchain Technology. Social Science
Research Network. https://papers.ssrn.com/sol3/Delivery.cfm?abstractid=2785626.
Plerou V, Gopikrishnan P, Amaral L, Meyer M, Stanley HE. 1999. Scaling of the distribution of
price fluctuations of individual companies. Physical Review E 60, 6519-29.
Raskin, M., and Yermack, D., 2017. Corporate governance and blockchains. Review of Finance 21,
7–31.
Taleb, N.N. 2007. The Black Swan: The Impact of the Highly Improbable.
Random House Inc., New York.
Taleb, N.N. 2020. Statistical Consequences of Fat Tails: Real World Preasymptotics, Epistemology,
and Applications. STEM Academic Press.
Warusawitharana, M., 2018. Time-varying volatility and the power law distribution of stock returns,
Journal of Empirical Finance 49, 123-141.
White, E., Enquist, B., Green, J.L., 2008. On estimating the exponent of power-law frequency
distributions. Ecology 89, 905-912.
Zhang, A.R., Raveenthiran, A., Mukai, J., Naeem, R., Dhuna, A., Parveen, Z. and Kim, H., 2019.
The regulation paradox of initial coin offerings: a case study approach. Frontiers in
Blockchain 2, p.2.
Figures
Figure 1. Histogram of loss distribution due to ICO scams
This figure reports the histogram of the loss distribution due to ICO scams. We divided the overall distribution into ten
bins of equal length in line with the support of the distribution. The support of the loss distribution is from $2000 until
$735.00 million. For instance, in the first bin we report the number of all counted losses between $2000–$73.50 million
divided by the total number of counted losses, whereas in the last bin we report the number of counted losses between
$661.50–$735.00 million divided by the total number of counted losses.
Figure 2. Power-law exponent depending on the chosen minimum
This figure plots the alpha depending on the chosen minimum value.
[Plots for Figures 1 and 2 are not reproduced here. Figure 1 legend: Normal distribution; Losses due to ICO scam.]
Figure 3. Empirical versus theoretical density function
This figure plots the empirical cumulative density function on the y-axis against the theoretical cumulative density
function on the x-axis for all observations exceeding our chosen \(x_{\min}\) = $10.00 million.
Tables
Table 1. Distribution of losses due to ICO scams
Decile   Loss in $    % of total losses   % of total losses assuming normal distribution
10 %     650000       0.16                0.76
20 %     1500000      0.80                3.07
30 %     2700000      1.98                7.11
40 %     4310000      3.91                12.88
50 %     6900000      7.02                20.38
60 %     10000000     11.88               29.77
70 %     14400000     18.59               41.29
80 %     22500000     29.08               55.49
90 %     37378000     45.32               73.78
100 %    735000000    100.00              100.00
Table 2. Descriptive statistics
Metric Losses
Mean 17572118.00
Median 6834500.00
Maximum 735000000.00
Minimum 2000.00
Std.Dev. 52493273.00
Skewness 10.10
Kurtosis 123.83
Table 3. Conditional probabilities
This table reports the conditional probability of a 17-sigma event being Gaussian as opposed to fat-tailed. As a fat-tailed
distribution we use a t-distribution with 3 degrees of freedom. According to Bayes’ rule the conditional probability that
the event is Gaussian–given that the Petro scam occurred–is defined as
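\[ P(G \mid E) = \frac{P(G)\,P(E \mid G)}{P(G)\,P(E \mid G) + P(NG)\,P(E \mid NG)}, \]
where \(P(E \mid G)\) denotes \(P(\mathit{Event} \mid \mathit{Gaussian})\), \(P(G)\) is \(P(\mathit{Gaussian})\), \(P(G \mid E)\) is \(P(\mathit{Gaussian} \mid \mathit{Event})\), \(P(NG) = 1 - P(G)\) is \(P(\mathit{NonGaussian})\), whereas
\(P(E \mid NG)\) denotes \(P(\mathit{Event} \mid \mathit{NonGaussian})\).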
P(Gaussian)           P(Gaussian|Event)
0.50                  1.85E-61
0.90                  1.67E-60
0.9999                1.85E-57
0.99999999            7.60E-53
0.9999999999999999    1.67E-45
1                     1
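The Bayes' rule calculation behind this table can be sketched with the standard library alone. The code below is a hedged illustration, not the authors' implementation: it assumes one-sided tail probabilities for the 17-sigma event, uses the closed-form upper tail of the t-distribution with 3 degrees of freedom, and all function names are hypothetical. It is meant to show the mechanics for a given prior rather than to reproduce every row.

```python
import math


def gaussian_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def student_t3_tail(t):
    """P(T > t) for a t-distribution with 3 degrees of freedom, valid for t > 0."""
    u = t / math.sqrt(3.0)
    return (math.atan(1.0 / u) - u / (1.0 + u * u)) / math.pi


def posterior_gaussian(prior, z=17.0):
    """P(Gaussian | Event) by Bayes' rule, with the t(3) law as the fat-tailed alternative."""
    p_event_gauss = gaussian_tail(z)
    p_event_fat = student_t3_tail(z)
    numerator = prior * p_event_gauss
    return numerator / (numerator + (1.0 - prior) * p_event_fat)


post_050 = posterior_gaussian(0.50)
post_090 = posterior_gaussian(0.90)
post_one = posterior_gaussian(1.0)
```

Even a prior of 0.90 in favor of the Gaussian model leaves a posterior probability that is indistinguishable from zero; only a prior of exactly 1 survives the evidence.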
Appendix
Figure A.1. Categorizing scams in the ICO market
This figure categorizes our sample of 576 ICO scams into 13 different categories.