http://www.aimspress.com/journal/QFE
QFE, 1(4): 474–485
DOI: 10.3934/QFE.2017.4.474
Received date: 10 September 2017
Accepted date: 19 November 2017
Published date: 13 December 2017
Research article
Volatility Analysis of Bitcoin Price Time Series
Lukáš Pichl* and Taisei Kaizoji
International Christian University, Osawa 3-10-2, Mitaka, Tokyo 181-8585 Japan
* Correspondence: lukas@icu.ac.jp; Tel: +81-422-33-3286; Fax: +81-422-33-3286.
Abstract: Bitcoin has the largest share in the total capitalization of cryptocurrency markets,
currently reaching above 70 billion USD. In this work we focus on the price of Bitcoin in terms of
standard currencies and its volatility over the last five years. The average day-to-day return
throughout this period is 0.328%, amounting to exponential growth from 6 USD to over 4,000 USD per
1 BTC at present. Multi-scale analysis is performed from the level of the tick data, through the
5 min, 1 hour and 1 day scales. The distribution of trading volumes (1 sec, 1 min, 1 hour and 1 day)
aggregated from the Kraken BTCEUR tick data is provided; it shows the artifacts of algorithmic
trading (selling transactions with volume peaks distributed at integer multiples of the BTC unit).
Arbitrage opportunities are studied using the EUR, USD and CNY currencies. Whereas the arbitrage
spread for the EUR-USD currency pair is found to be narrow, of the order of a percent, at the 1 hour
sampling period the arbitrage spread for USD-CNY (and similarly EUR-CNY) is found to be more
substantial, reaching above 5 percent on rare occasions. The volatility of BTC exchange rates is
modeled using the day-to-day distribution of the logarithmic return, and the Realized Volatility,
the sum of the squared logarithmic returns on a 5-minute basis. In this work we demonstrate that the
Heterogeneous Autoregressive model for Realized Volatility of Andersen et al. (2007) applies
reasonably well to the BTCUSD dataset. Finally, a feed-forward neural network with 2 hidden layers
using 10-day moving window sampling of daily return predictors is applied to estimate the next-day
logarithmic return. The results show that such an artificial neural network prediction is capable
of approximately capturing the actual log return distribution; more sophisticated methods, such as
recurrent neural networks and LSTM (Long Short Term Memory) techniques from deep learning, may be
necessary for higher prediction accuracy.
Keywords: bitcoin price; foreign exchange rate; volatility modeling; artificial neural network;
transaction volume distribution; logarithmic return
Quantitative Finance and Economics Volume 1, Issue 4, 474–485
Figure 1. (a) BTCUSD time series for the past 5 years (data source: Investing.com) in
logarithmic scale, and (b) The distribution of the corresponding daily logarithmic returns.
1. Introduction
The price of Bitcoin in US Dollars over the past five years is an example of (super)exponential
growth hardly seen anywhere in finance except for the cryptocurrency markets. The
data in Figure 1 (a) are shown in logarithmic scale, with the red line indicating the average daily
return of 0.3283% per day. The time period shown comprises 2013 daily closing values with the
logarithmic returns confined in the interval from −0.371564 (2012-08-19) to 0.308301 (2013-04-17),
attesting to the large density and magnitude of extreme events as depicted in Figure 1 (b). The main
source of Bitcoin demand and the majority of trading volume come from exchange markets in China.
Worldwide, Bitcoin as the leading cryptocurrency aspires to become a rudimentary means of
payment, being gradually accepted by online stores, cafes and restaurants, and even by
some academic institutions for tuition payments. Still, the share of Bitcoin payments per GDP
does not reach even a single per cent in any country; the growth of its value is therefore propelled by
risky speculation on its broader acceptance as a common means of payment. For instance, Estonia at
present plans to introduce its own national cryptocurrency in line with the concept of electronic
citizenship. Whether the cryptocurrencies prevail or not, and which one would eventually become a
major standard is still an open question. Thus all the cryptocurrencies, including Bitcoin, are
sensitive to major event news such as exchange market bankruptcy, fraud and occasional market
crashes on the negative side, or unheard-of lucrative speculation opportunities on the positive side
that result in herding behavior. In brief, extreme events are abundant in the cryptocurrency markets,
Bitcoin being no exception here.
We have collected representative data sets from various Bitcoin exchanges as well as data from
Bloomberg that aggregate major exchanges into relevant Bitcoin price indices, from the scale of tick
events, through 1 min, 5 min, 1 hour and 1 day sampling resolution. In what follows, the distributions
of logarithmic returns, trading volumes at various time scales, arbitrage opportunity windows,
prediction of Realized Volatility and Bitcoin daily logarithmic return prediction by means of neural
networks will be discussed, thus providing different angles of view on the extreme events in the
Bitcoin market and its volatility. The paper is organized as follows. Section 2 provides a brief
literature review of the still rather scarce but rapidly increasing research work on quantitative Bitcoin
analysis. Data analysis methods are explained in Section 3, followed by Results and Discussions in
Section 4. The paper closes with the Conclusion in Section 5. Exchange rates are denoted as
FX1FX2 (1 unit of FX1 in terms of an amount in currency FX2). The same notation is applied to
Bitcoin prices in standard currencies. We use the code BTC for Bitcoin throughout, although the
notation XBT is also common. Log returns are based on the natural logarithm.
2. Literature Review
The origins of Bitcoin date back to the end of October 2008, when a developer using the
pseudonym Satoshi Nakamoto published a paper entitled “Bitcoin: A Peer-to-Peer Electronic Cash
System” (https://bitcoin.org/bitcoin.pdf). The actual cryptocurrency software was released in the open
source domain in January 2009. Since then Bitcoin has established itself as the major cryptocurrency.
Over the last five years the Bitcoin price has increased more than 700 times; there are at least 35
Bitcoin exchange markets where Bitcoin prices are quoted in standard currencies, each with a daily
transaction volume above 1 million USD. Bitcoin is increasingly accepted in the real economy as a
means of payment. The aspiration to become a major world means of payment is still far from
accomplished, and investment in Bitcoin is a risky strategy. The number of research papers related to
Bitcoin in major journals has been limited, and has started to surge only recently, as summarized in
the following.
Balcilar et al. (2017) discuss the predictability of Bitcoin returns and volatility based on
transaction volume, finding out that in the quantile range of 0.25 to 0.75, i.e., extreme events
excluded, volume is an important predictor variable. Bariviera et al. (2017) study the stylized facts in
Bitcoin markets, showing that the Hurst exponents underwent significant changes in the early
years after the introduction of Bitcoin but have stabilized recently. Their multiscale analysis shows
the characteristics of a self-similar process. The prospects of Bitcoin and the entire cryptocurrency
markets are nicely summarized at an accessible level in the review work by Extance (2015).
Econometric methods, in particular the GARCH model, are applied to volatility estimation of Bitcoin
by Katsiampa (2017). Sentiment analysis using computational intelligence methods for Bitcoin
fluctuation prediction based on user comments is applied in Kim et al. (2016). Kristoufek (2013)
compares the Bitcoin
phenomenon to other Internet phenomena of the present day; in a different paper Kristoufek (2015)
also analyzes the main drivers of Bitcoin price, such as the demand in China, using wavelet
coherence analysis.
There is also a growing number of recent papers on the fundamental importance of Bitcoin and its
security aspects. The role of Bitcoin in present-day finance is questioned by Bouri et al. (2017b) and
Dyhrberg (2016). Bitcoin market efficiency is studied by Urquhart (2016) with the conclusion that it
is still transitioning to the regime of market efficiency. Bitcoin price clustering at round numbers is
observed by Urquhart (2017). Price dynamics and speculative trading in Bitcoin are studied by Blau
(2017) with the conclusion that speculative behavior cannot be directly linked to the unusual
volatility of the Bitcoin market. Cheah and Fry (2015) also explore the role of speculation in the
Bitcoin market from the viewpoint of Bitcoin’s fundamental value. Dwyer (2015) examines the
Bitcoin economy with the conclusion that Bitcoin is likely to limit government revenue from
inflation. Brandvold et al. (2015) study the role of various Bitcoin exchanges in the price discovery
process, indicating that the information share is dynamic and evolves significantly over time.
Security problems, inherent in the cryptocurrency world, are discussed by Bradbury (2013) for the
case of Bitcoin. Analysis is also available for the period of the crash in 2013 in the work of Bouri
et al. (2017a). There are a number of intriguing extreme events in the history of the Bitcoin market
which can be discussed from a multidisciplinary perspective, such as using the methods of Franzke (2012).
The existing research gap we are aiming at is the substantial lack of understanding of the
Bitcoin price process dynamics and the mostly unexplored applicability of standard econometric,
technical trading, and machine learning approaches.
The main contributions of the present paper, in view of the previous work, are as follows: (1)
we provide theoretical and empirical bounds on Bitcoin arbitrage opportunities using different
standard currency pairs, discovering a major opportunity window in the Chinese market; (2) we show
that the econometric HAR-RV-J model with adjusted parameters is well capable of capturing the
dynamics of the realized volatility time series; and (3) it is demonstrated that a feed-forward neural
network architecture is capable of learning the statistical distribution of the logarithmic return but it
exhibits rather limited prediction ability for the market trend and return magnitude on the daily
sampling scale. In addition, we also provide a case study insight into the Kraken market liquidity and
transaction volumes.
3. Data Analysis Methods
The Bitcoin prices in terms of a standard currency CRS, i.e. the BTCCRS time series, are
denoted as Bi, assuming an equidistant time sampling represented by integer sequence i, i=1…n. In
what follows we use the sampling frequencies of 1 second, 1 minute, 5 minutes, 1 hour, and 1 day.
3.1. Data transformation
The logarithmic return is defined as
$$R_i = \log \frac{B_i}{B_{i-1}}. \tag{1}$$
In Eq. (1), Bi stands for the price of Bitcoin at time step i. The advantage of the logarithmic
return over the prices is a symmetric representation of price increase and decrease by the same
multiple, which differ only by the sign of the respective log return; constant price levels are
represented by the zero return, and, importantly, unlike the non-stationary price process, the
time series of logarithmic returns can often be approximated as stationary. The distribution of the
daily log returns for the BTCUSD time series is shown in Figure 1 (b), illustrating the approximate
symmetry of the R distribution. We notice, for instance, that whereas Figure 1 (a) shows a clear
upward trend of the price process, Figure 1 (b) is approximately symmetric with regard to the change of
the sign of the logarithmic return; the upward trend is a consequence of the small positive value of
the mean of the distribution (the exponential growth process shown as a red line in Figure 1 (a)).
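The symmetry property of Eq. (1) can be illustrated with a minimal Python sketch. The price values below are hypothetical and chosen only so that a doubling is followed by a halving; the function name `log_returns` is likewise an illustrative choice, not part of the original analysis.

```python
import math

def log_returns(prices):
    """Return R_i = log(B_i / B_{i-1}) for each pair of consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]

# Hypothetical prices: the price doubles and then halves again.
prices = [100.0, 200.0, 100.0]
returns = log_returns(prices)

# The two returns have equal magnitude and opposite sign, and they sum to zero,
# whereas the simple returns would be +100% and -50%.
print(returns)
```

Because log returns add up over time, the sum of the daily values equals the log return of the whole period.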
3.2. Data aggregation
In order to produce the distribution of Bitcoin trading volumes, which is usually unavailable from
standard high-frequency data sources, we use the application programming interface (API) of the
Kraken exchange market, collecting the last 5,000 transactions every minute. The data show the
transaction time (resolved to 0.1 ms), price, volume (in BTC), and trade direction. Using the last
preceding transaction available, we define the OHLCV data (Open, High, Low, Close prices and Volume
of transactions) on a 1-second grid, which are then aggregated for transformation to longer sampling
periods.
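The aggregation step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: trades are assumed to be `(time_in_seconds, price, volume)` tuples sorted by time, seconds without trades are assumed to repeat the last preceding price with zero volume, and the function names and sample trades are hypothetical.

```python
import math

def ohlcv_1s(trades):
    """Build 1-second OHLCV bars from (time_s, price, volume) trades sorted by time.

    Seconds without any trade repeat the last preceding close with zero volume.
    """
    bars = {}
    for t, price, vol in trades:
        sec = math.floor(t)
        if sec not in bars:
            bars[sec] = [price, price, price, price, vol]
        else:
            b = bars[sec]
            b[1] = max(b[1], price)   # high
            b[2] = min(b[2], price)   # low
            b[3] = price              # close is the latest trade
            b[4] += vol               # accumulate volume
    grid, prev_close = [], None
    for sec in range(min(bars), max(bars) + 1):
        if sec in bars:
            o, h, l, c, v = bars[sec]
        else:
            o = h = l = c = prev_close
            v = 0.0
        grid.append((sec, o, h, l, c, v))
        prev_close = c
    return grid

def aggregate(bars, period):
    """Merge consecutive 1-second bars into bars spanning `period` seconds."""
    out = {}
    for sec, o, h, l, c, v in bars:
        key = sec - sec % period
        if key not in out:
            out[key] = [o, h, l, c, v]
        else:
            b = out[key]
            b[1], b[2], b[3] = max(b[1], h), min(b[2], l), c
            b[4] += v
    return [(k, *vals) for k, vals in sorted(out.items())]

# Hypothetical trades: (time in seconds, price, volume in BTC).
trades = [(0.2, 100.0, 0.5), (0.7, 101.0, 1.0), (3.1, 99.0, 2.0)]
bars_1s = ohlcv_1s(trades)
bars_1min = aggregate(bars_1s, 60)
print(bars_1s)
print(bars_1min)
```

Building the fine grid first keeps the longer periods consistent with each other, since every coarser bar is derived from the same 1-second series.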
3.3. Measuring arbitrage spread
Let us assume Bitcoin prices in two currencies, i.e. BTCFC1 and BTCFC2. A hypothetical arbitrage
transaction can be defined by buying 1 BTC in currency FC1 (expense –BTCFC1), selling it in currency
FC2 (revenue BTCFC2), then transforming the received cash back to currency FC1 (foreign exchange
rate FC2FC1=1/FC1FC2). The profit rate (relative to the Bitcoin price BTCFC1) is then
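$$\delta_{\mathrm{FC1},\mathrm{FC2}} = \frac{\mathrm{BTCFC2}/\mathrm{FC1FC2} - \mathrm{BTCFC1}}{\mathrm{BTCFC1}} = \frac{\mathrm{BTCFC2}/\mathrm{BTCFC1}}{\mathrm{FC1FC2}} - 1. \tag{2}$$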
In other words, the profit rate is the ratio of the Bitcoin-implied foreign exchange rate
(BTCFC2/BTCFC1≡FC1FC2(Bitcoin)) and the actual exchange rate, FC1FC2, less one. Note that no
transaction cost is assumed here; its incorporation is straightforward by distinguishing
between the supply and demand part for each price or exchange rate in Eq. (2). The direction of the
transaction, FC1FC2, is important; it holds
$$\delta_{\mathrm{FC1},\mathrm{FC2}} + \delta_{\mathrm{FC2},\mathrm{FC1}} \le 0. \tag{3}$$
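As a worked illustration of Eq. (2), the following Python sketch evaluates the profit rate for two currency pairs. All quotes are hypothetical numbers chosen for illustration, not market data, and transaction costs are ignored exactly as in Eq. (2).

```python
def arbitrage_spread(btc_fc1, btc_fc2, fc1_fc2):
    """Profit rate of Eq. (2): buy 1 BTC in FC1, sell it in FC2, convert back to FC1."""
    return (btc_fc2 / btc_fc1) / fc1_fc2 - 1.0

# Hypothetical quotes (illustrative only).
btc_eur, btc_usd, eur_usd = 3400.0, 4000.0, 1.17
btc_cny, usd_cny = 27600.0, 6.6

# EUR -> BTC -> USD -> EUR
spread_eur_usd = arbitrage_spread(btc_eur, btc_usd, eur_usd)
# USD -> BTC -> CNY -> USD
spread_usd_cny = arbitrage_spread(btc_usd, btc_cny, usd_cny)

print(f"EUR-USD spread: {spread_eur_usd:.4%}")
print(f"USD-CNY spread: {spread_usd_cny:.4%}")
```

A positive value means that the Bitcoin-implied exchange rate exceeds the actual one, so the round trip would be profitable before fees.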
3.4. Modeling realized volatility
Realized volatility, RV, is defined by
$$RV_i = \sum_{j \in \{i\}} R_j^2, \tag{4}$$
where the log returns R with index j are taken relative to a high-frequency sampling grid (5 minutes
in our case) within the duration of the longer sampling period with index i (1 day for daily returns).
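Eq. (4) can be illustrated by a short Python sketch. The intraday prices are hypothetical, and only five grid points are used for brevity (a full day on a 5-minute grid would contain 288 intervals).

```python
import math

def realized_volatility(intraday_prices):
    """Sum of squared log returns on the intraday grid, as in Eq. (4)."""
    returns = [math.log(b / a)
               for a, b in zip(intraday_prices, intraday_prices[1:])]
    return sum(r * r for r in returns)

# Hypothetical 5-minute prices for one day; the day ends where it started.
day = [4000.0, 4010.0, 3990.0, 4005.0, 4000.0]
rv = realized_volatility(day)
daily_return = math.log(day[-1] / day[0])

# The daily log return is zero although the intraday path was volatile.
print(rv, daily_return)
```

The example shows why Realized Volatility carries information that the daily log return alone does not: intraday swings contribute to RV even when they cancel over the day.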
The regression equation for the Heterogeneous Autoregressive model for Realized Volatility RV
including jumps (Andersen et al., 2007) applies the square root transform to the RV values, in particular
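$$\sqrt{RV_{t+1}} = \beta_0 + \beta_1 \sqrt{RV_t} + \beta_2 \sqrt{RV_{t-5,t}} + \beta_3 \sqrt{RV_{t-10,t}} + \beta_4 J_t + \beta_5 J_{t-5,t} + \beta_6 J_{t-10,t}, \tag{5}$$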
where the values J are the jumps defined by Andersen et al. (2007). The R package highfrequency by
Boudt et al. (2017) is used for implementation. The above model was developed in the series of
papers by Andersen et al. (2000, 2001, 2007) and Barndorff-Nielsen and Shephard (2004).
Andersen et al. (2000, 2001, 2007) sample the past series of Realized Volatility back to one week (5
days) and one month (22 days); they also use a quadratic model of Realized Volatility rather than the
square root, but suggest that the square root version or even a log(RV) version may be appropriate
depending on the process. In Eq. (5), the sampling horizons are selected as daily, 5 days back, and 10
days back; the change to 10-day rather than monthly sampling was motivated by
the observation that the coefficient β3 becomes statistically significant for the 10-day sampling
but not for the 22-day sampling. Importantly, since Bitcoin is traded 7 days a week, the 5-day
period does not correspond to a full trading week; similarly, a one-month period would use 30 days
rather than 22 days of sampling delay. Aiming at a reliable statistical estimate of the process in Eq.
(5), and given the rather short time scale of the HAR-RV-J process specific to the Bitcoin market, we
have therefore settled on the (1,5,10)-day parameter selection for the model.
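The (1,5,10)-day regressor construction can be sketched in Python as follows. This is an illustration under explicit assumptions rather than the code used in the paper (which relies on the R package highfrequency): multi-day terms are taken as trailing averages over the preceding h days, the jump series is assumed to be given, and the names `trailing_mean` and `har_rows` as well as the synthetic input series are hypothetical. The resulting rows could be passed to any ordinary least squares routine.

```python
import math

def trailing_mean(series, t, h):
    """Average of the h values ending at index t (inclusive)."""
    window = series[t - h + 1 : t + 1]
    return sum(window) / h

def har_rows(rv, jumps, horizons=(1, 5, 10)):
    """Return (target, regressors) pairs for a square-root HAR regression with jumps.

    Assumption: multi-day RV and jump terms are trailing averages; the target
    is the square root of the next day's RV.
    """
    longest = max(horizons)
    rows = []
    for t in range(longest - 1, len(rv) - 1):
        x = [1.0]                                                   # intercept
        x += [math.sqrt(trailing_mean(rv, t, h)) for h in horizons]  # RV terms
        x += [trailing_mean(jumps, t, h) for h in horizons]          # jump terms
        rows.append((math.sqrt(rv[t + 1]), x))
    return rows

# Synthetic daily series (illustrative only).
rv = [0.001 * (1 + (i % 4)) for i in range(15)]
jumps = [0.0] * 15
rows = har_rows(rv, jumps)
print(len(rows), len(rows[0][1]))
```

Each row has seven regressors, matching the seven coefficients of the regression; the first usable day is the one for which a full 10-day history exists.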
3.5. Predicting daily log returns
Machine learning has been increasingly applied in the field of quantitative finance for
prediction of prices or logarithmic returns. Here we briefly outline our method of choice. We adopt
the feed-forward neural network in a configuration with two hidden layers, using the past 10-day
moving window of daily log returns as predictors. The log returns are scaled to zero mean and a
standard deviation of 0.08, and the network is trained with a gradient-vanishing threshold of 0.005.
The neural network is initialized at random, trained on the first two thirds of the BTCUSD data set
shown in Figure 4, and tested for accuracy on the remaining part of the time series. The R package
neuralnet by Fritsch and Guenther (2016) is used for implementation.
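The data preparation described above can be sketched in Python. This is a hedged illustration and not the authors' R code: the use of the population standard deviation, the function names, and the randomly generated return series are assumptions made for the example; the network training itself is omitted.

```python
import random
import statistics

def scale(returns, target_sd=0.08):
    """Shift to zero mean and rescale to the target standard deviation."""
    mean = statistics.mean(returns)
    sd = statistics.pstdev(returns)   # assumption: population standard deviation
    return [(r - mean) / sd * target_sd for r in returns]

def windows(returns, width=10):
    """Pair each block of `width` past returns with the next-day return."""
    return [(returns[i : i + width], returns[i + width])
            for i in range(len(returns) - width)]

def split(samples):
    """Chronological split: the first two thirds train, the rest test."""
    cut = (2 * len(samples)) // 3
    return samples[:cut], samples[cut:]

# Synthetic daily log returns (illustrative only).
rng = random.Random(0)
raw = [rng.gauss(0.003, 0.05) for _ in range(40)]

scaled = scale(raw)
samples = windows(scaled)
train, test = split(samples)
print(len(samples), len(train), len(test))
```

The split is chronological rather than random, so the test samples always lie in the future of the training samples.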
4. Results and Discussions
4.1. Price and logarithmic return distribution
The BTCUSD price history over the past five years is depicted in Figure 1 (a) with the red line
showing the average logarithmic return (exponential explosion of prices with the exponent of 0.328
percent per day). The log return distribution in Figure 1 (b) shows the fat tail covering the extreme
event region of bubbles and crashes. A fat-tailed distribution of logarithmic returns is one of the most
elementary and universal characteristics of financial time series. The price of Bitcoin is highly volatile
and not supported by “fundamentals”, that is, by any real economy behind the cryptocurrency, and may
have the random walk (martingale) property, which is one of the stylized facts of financial time series.
Nevertheless the graph clearly demonstrates that the simple buy and hold strategy applied over
several years is very lucrative. This may reflect the rising worldwide bets on the fulfillment of the
major aspiration of Bitcoin, and other cryptocurrencies in general, to become the new prevailing
ways of payment for the entire global market.
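The implication of the average daily log return for a buy and hold position can be checked with a few lines of Python. The two input numbers are those quoted in the text (0.3283% per day and 2013 daily closing values); treating the 2013 closing values as 2012 daily intervals is an assumption of this sketch.

```python
import math

mean_daily_log_return = 0.003283   # 0.3283% per day, as quoted in the text
n_prices = 2013                    # number of daily closing values in the text

# Compounded growth over the whole period and the implied doubling time.
growth = math.exp(mean_daily_log_return * (n_prices - 1))
doubling_days = math.log(2) / mean_daily_log_return

print(f"growth factor: {growth:.0f}, doubling time: {doubling_days:.0f} days")
```

The resulting growth factor is of the same order as the more than 700-fold price increase mentioned in Section 2.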
4.2. Distribution of trading volumes (Kraken, BTCEUR)
We have collected all transactions available from the online API of the Kraken Bitcoin market,
which is a double auction market, and transformed them onto regular high frequency grids of 1 sec, 1
min, 1 hour and 1 day. The period covered consists of 88 days from 2017-05-17 to 2017-08-12. The
resulting distribution of transaction volume is shown in Figure 2. The results in Figure 2 (d) are only
tentative, since the number of data points is relatively small. There are small peaks in Figure 2 (a) at
the volumes of 1, 2, 3 and 5 Bitcoins, which correspond to larger sell orders being split and streamed
into the market in relatively small Bitcoin amounts.
Figure 2. Aggregate trading volumes for all BTCEUR transactions at the Kraken market over
the 88-day period from 2017-05-17 to 2017-08-12 on the (a) second, (b) minute, (c) hour, and (d) day
time scales.
The volume distribution shown in the four panels of Figure 2 scales linearly with time to larger
magnitudes, while the small peaks corresponding to distributed transaction packing on the 1 sec scale
disappear at the larger, non-technical trading time scales of 1 min and 1 hour. Since there is as yet very
little quantitative research on the liquidity of Bitcoin markets, these data provide a quantitative insight
into the typical aggregate volume of transactions that can be realized over multiple time scales within a
certain market such as Kraken.
According to the data.bitcoinity.org server (https://data.bitcoinity.org/markets/volume/30d?c=e&t=b) the
daily trading volumes in the second half of 2017 rarely fall below 100 thousand BTC; the market share of
Kraken in the trade volume is estimated to be about 8%.
We remark here that the Bitcoin price data from the Kraken market are unique in the sense that
the time stamp available for all recorded transactions is resolved to 0.1 millisecond. Most of the
repositories collecting Bitcoin transaction data, such as http://api.bitcoincharts.com/v1/csv/,
contain transactions where the date and time are represented in the format of Unix time (an integer
indicating the number of seconds elapsed since January 1st, 1970, 00:00 UTC). As such, the trades are
all resolved only to the unit of one second; consequently, at markets with frequent trading and high
liquidity, it is not uncommon that several trades share the same time stamp, and the study of the
inter-arrival time distribution becomes difficult or impossible. To our knowledge, Kraken is the only
market providing data with 0.1 millisecond resolution through the online API. Although this has no
impact on the 1 sec scale of the trade volume distribution, the data in principle allow for fits of the
inter-arrival distribution such as the self-exciting process of Hawkes and Oakes (1974).
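The effect of time stamp resolution on inter-arrival times can be illustrated with a small Python sketch. The trade times below are hypothetical values given in seconds with sub-second resolution.

```python
import math

def interarrival(times):
    """Differences between consecutive trade times."""
    return [b - a for a, b in zip(times, times[1:])]

# Hypothetical trade times in seconds, resolved to 0.1 ms.
fine = [10.1234, 10.4567, 10.9001, 12.0500]
# What an integer Unix time stamp would retain of the same trades.
coarse = [math.floor(t) for t in fine]

print(interarrival(fine))     # all strictly positive
print(interarrival(coarse))   # trades within the same second collapse to zero
```

With whole-second stamps, trades within the same second become indistinguishable in time, so their waiting times cannot be used to fit a point process.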
4.3. Arbitrage opportunities at Bitcoin markets
The data sets used for the evaluation of arbitrage opportunities are XBTEUR, XBTUSD,
XBTCNY and the foreign exchange rates EURUSD and USDCNY, all on the 1 hour scale, from
2013/2/8 to 2017/4/7. Using the methods of Section 3.3, the arbitrage spread (transaction profit rate)
is shown in Figure 3 (a) for the BTCEUR-BTCUSD currency pair, and (b) for the BTCUSD-BTCCNY
currency pair. Notice the fat tail on the right-hand side of the distribution in Figure 3 (b), showing
substantial arbitrage windows. While both distributions in (a) and (b) are roughly similar in shape,
the width is much larger when the BTCCNY market enters into the arbitrage transaction. Care should
be taken, however, in regard to the interpretation of these distributions. First, it takes from 10
minutes to hours in extreme cases to record a new transaction in the blockchain. Thus, transactions
that require moving Bitcoin across markets cannot exploit these arbitrage windows. Second, given the
exponential burst in Bitcoin prices, for most Bitcoin holders, who are not financial institutions
specialized in technical trading, the most profitable strategy may simply be to hold the Bitcoin over
time rather than to engage in trading which requires instant access to foreign exchange markets. Third,
no transaction fees are considered in Eq. (2). Nevertheless, Figure 3 (b) shows that arbitrage
possibilities may, at least theoretically, exist even in the case of substantial real transaction fees.
Whether the Bitcoin market is efficient remains a debatable issue that calls for further consideration.
Figure 3. Bitcoin arbitrage spread (transaction costs excluded) as based on the 1-hour trading
data (data source: Bloomberg) for (a) USDEUR currency pair and (b) USDCNY currency pair.
4.4. Model of realized volatility
The estimation of Realized Volatility using the HAR-RV-J model of Eq. (5) in Sec. 3.4 produces
reasonably good results as shown in Table 1 and Figure 4. In Figure 4 (a), the 5-min time series of
BTCUSD are shown that were used for the computation of realized volatility. Also, the lower panel
of Figure 4 (a) depicts the corresponding logarithmic returns.
Table 1. HAR-RV-J regression coefficients for BTCUSD (5 min, 1 day).
Coef.   Estimate   Std. error   t-value   p-value    Signif.
beta0   0.010307   0.001459     7.065     3.22E-12   ***
beta1   0.344821   0.05796      5.949     3.86E-09   ***
beta2   0.517929   0.115008     4.503     7.57E-06   ***
beta3   0.22684    0.111337     2.037     0.0419     *
beta4   0.12326    0.077452     1.591     0.1119
beta5   0.86087    0.147957     5.818     8.27E-09   ***
beta6   0.856319   0.160656     5.33      1.24E-07   ***
The actual and predicted realized volatility are shown in Figure 4 (b). Table 1 gives the values
and statistical significance of the regression coefficients. As mentioned in Sec. 3.4, the parameters of
the process (1 day, 5 day, and 10 day sampling) were selected in order to obtain as many significant
coefficients as possible. The choice of a 10-day delay instead of a 1-month delay has proven necessary to
obtain a significant value of the coefficient beta 3. We could find no parameterization that would
result in a significant value of the coefficient beta 4 (for the jump at time t). It can be seen in Figure 4
that not all the peaks of the Realized Volatility series correspond to extreme values of daily log
returns; some are due to a large intraday volatility process instead. In brief, we consider the agreement
of the observed and forecasted realized volatility in Figure 4 very good. It thus appears that the
HAR-RV-J process is well applicable to the Bitcoin market. This could by no means be expected a priori
and constitutes one of the empirical findings of this paper.
Figure 4. 5-min sampled BTCUSD time series in upper panel (a), the derived logarithmic
returns on daily basis in lower panel (a), and the actual and HAR-RV-J model predicted realized
volatility are shown in (b). Date format: (M) M DD YYYY.
4.5. Neural network prediction of log returns
Using the artificial neural network outlined in Sec. 3.5, we show that the method is roughly
capable of capturing the clustering of extreme events in the market, where the absolute value of the
log return is high, although there are some differences in Figure 5 (a). The reproduction of the
daily log return density in Figure 5 (b) is quite satisfactory. It is an open question whether more
advanced machine learning methods can provide better results, or whether the market is relatively
efficient, hence difficult to predict by any computational means based on market history.
Figure 5. (a) Scaled daily logarithmic return as used for the neural network prediction on the
data from Figure 4. The last one third of the time series is shown (testing dataset). (b)
Comparison of the actual and predicted log return distribution.
In particular, a feed-forward neural network with two fully interconnected hidden layers, an input
layer of size 10, and an output layer of size 1, when used as a statistical regression technique on scaled
data of the daily logarithmic return, is capable of capturing the shape of the logarithmic return density
distribution. The learning algorithm is the Backpropagation method explained in detail in Hsieh
(2009). A closer look at the comparison of the actual and predicted log returns shows discrepancies
in the peak location, the sign of the logarithmic return, or its magnitude. We find it interesting,
nevertheless, that the overall statistical distribution can be learned by the neural network in this case.
A detailed study of machine learning algorithms for Bitcoin price prediction will be deferred to a
subsequent paper, since it requires more advanced network topologies than the one applied in the
present work.
5. Conclusion
The present work is an empirical investigation into the properties of Bitcoin markets. Volatility
of Bitcoin prices was studied from various viewpoints, ranging from the stylized features of the
logarithmic return distribution, the transaction volume distribution at multiple time scales, arbitrage
opportunities on the 1-hour trading scale for the currency pairs of EURUSD and USDCNY, and
econometric analysis of the time series of Realized Volatility, to classical machine learning
prediction of logarithmic returns for BTCUSD on the daily time scale by means of an artificial neural
network. The time series of Bitcoin prices are substantially more volatile than those of EURUSD
exchange rates, for the sake of comparison, with market bubbles and crashes relatively abundant.
Substantial arbitrage opportunities are available for the USD and EUR currency pairs involving
CNY. The HAR-RV-J model captures well the dynamics of daily Realized Volatility as aggregated on
the 5-minute grid. Standard neural network prediction of daily logarithmic returns of the BTCUSD time
series is capable of reproducing the extreme event clustering feature and the shape of the distribution
of logarithmic returns; more sophisticated methods will be applied in a future study to discover the
ultimate prediction accuracy limits with deep learning algorithms.
Acknowledgment
This research was supported by JSPS Grants-in-Aid Nos. 2538404, 2628089. We would like to
thank Mr. Zheng Nan for collection of Bloomberg data and preliminary calculations.
Conflict of Interest
All authors declare no conflict of interest.
References
Andersen TG, Bollerslev T, Diebold FX, et al. (2000) Exchange Rate Returns Standardized by
Realized Volatility are (Nearly) Gaussian. Multinatl Finance J 4: 159–179.
Andersen TG, Bollerslev T, Diebold FX, et al. (2001) The distribution of realized stock return
volatility. J Financ Econ 61: 43–76.
Andersen TG, Bollerslev T, Diebold FX (2007) Roughing it up: including jump components in
measuring, modeling and forecasting asset return volatility. Rev Econ Stat 89: 701–720.
Balcilar M, Bouri E, Gupta R, et al. (2017) Can volume predict Bitcoin returns and volatility? A
quantiles-based approach. Econ Model 64: 74–81.
Bariviera AF, Basgall MJ, Hasperué W, et al. (2017) Some stylized facts of the Bitcoin market.
Physica A: Stat Mechanics Appl 484: 82–90.
Barndorff-Nielsen OE, Shephard N (2004) Power and Bipower Variation with Stochastic Volatility
and Jumps. J Financ Econom 2: 1–37.
Blau BM (2017) Price dynamics and speculative trading in bitcoin. Res Int Bus Financ
Boudt K, Cornelissen J, Payseur S, et al. (2017) highfrequency: Tools for High-frequency Data Analysis.
R package version 0.5. Available from: https://CRAN.R-project.org/package=highfrequency
Bouri E, Azzi G, Dyhrberg AH (2017a) On the return-volatility relationship in the Bitcoin market
around the price crash of 2013. Econ 11: 1–16.
Bouri E, Molnár P, Azzi G, et al. (2017b) On the hedge and safe haven properties of Bitcoin: Is it
really more than a diversifier? Financ Res Letters 20: 192–198.
Bradbury D (2013) The problem with Bitcoin. Comput Fraud Security
Brandvold M, Molnár P, Vagstad K, et al. (2015) … Mark, Inst Money 36: 18–35.
Cheah ET, Fry J (2015) Speculative bubbles in Bitcoin markets? An empirical investigation into the
fundamental value of Bitcoin.
Dwyer GP (2015) The economics of Bitcoin and similar private digital currencies. … 81–91.
Dyhrberg AH (2016) Hedging capabilities of … 139–144.
Extance A (2015) Bitcoin and beyond.
Franzke C (2012) Predictability of extreme events in a nonli… Physical Rev E 85.
Fritsch S, Guenther F (2016) neuralnet: Training of Neural Networks. R package version 1.33.
Available online: https://CRAN.R…
Hawkes AG, Oakes D (1974) A Cluster Process Representation of a Self… Prob 11: 493–503.
Hsieh WW (2009) Machine Learning Methods in the Environmental Sciences, Cambridge University
Press, Cambridge, UK.
Katsiampa P (2017) Volatility estimati… 158: 3–6.
Kim YB, Kim JG, Kim W, et al. (2016) Predicting Fluctuations in Cryptocurrency Transactions
Based on User Comments and Replies.
Kristoufek L (2013) BitCoin meets Google Trends and Wikipedia: Quantifying the relationship
between phenomena of the Internet era.
Kristoufek L (2015) What Are the Main Drivers of the Bitcoin Price? Evidence from Wavelet
Coherence Analysis. PLoS One
Urquhart A (2016) The inefficiency of Bitcoin.
Urquhart A (2017) Price clustering in Bitcoin.
2013: 5
Brandvold M, Molnár P, Vagstad K
, et al. (2015)
Price discovery on Bitcoin exchanges.
Cheah ET, Fry J (2015) Speculative bubbles in Bitcoin markets? An empirical investigation into the
fundamental value of Bitcoin.
Econ Lett 130: 32–36.
Dwyer GP (2015) The economics of Bitcoin and similar private digital currencies.
Dyhrberg AH (2016) Hedging capabilities of
B
itcoin. Is it the virtual gold?
A (2015) Bitcoin and beyond.
Nature 526: 21–23.
Franzke C (2012) Predictability of extreme events in a nonli
near stochastic
Guenther F (2016) neuralnet: Training of Neural Networks. R package version 1.33.
Available online: https://CRAN.R
project.org/package=neuralnet.
Oakes D (1974) A Cluster Process Representation of a Self

Exciting Process.
Hsieh WW (2009) Machine Learning Methods in the Environmental Sciences, Cambridge University
Katsiampa P (2017) Volatility estimati
on for Bitcoin: A comparison of GARCH models.
Kim W, et al. (2016) Predicting Fluctuations in Cryptocurrency Transactions
Based on User Comments and Replies.
PLoS ONE 11.
Kristoufek L (2013) BitCoin meets Google Trends and Wikipedia: Quantifying the relationship
between phenomena of the Internet era.
Scientific rep 3.
Kristoufek L (2015) What Are the Main Drivers of the Bitcoin Price? Evidence from Wavelet
PLoS One
10.
A (2016) The inefficiency of Bitcoin.
Econ Lett 148: 80–82.
(2017) Price clustering in Bitcoin.
Econ Lett 159: 145–148.
© 2017 the Author(s), licensee AIMS Press. T
his
distributed under the terms of the Creative Commons Attribution License
(http://creativecommons.org/licenses/by/4
.0)
485
Volume 1, Issue 4, 474–485
Shephard N (2004) Power and Bipower Variation with Stochastic Volatility
Bus Financ
41: 493–499.
frequency Data Analysis.
project.org/package=highfrequency
.
volatility relationship in the Bitcoin market
) On the hedge and safe haven properties of Bitcoin: Is it
2013: 5
–8.
Price discovery on Bitcoin exchanges.
J Int Financ
Cheah ET, Fry J (2015) Speculative bubbles in Bitcoin markets? An empirical investigation into the
Dwyer GP (2015) The economics of Bitcoin and similar private digital currencies.
J Financ Stab 17:
itcoin. Is it the virtual gold?
Financ Res Lett 16:
near stochastic
dynamical model.
Guenther F (2016) neuralnet: Training of Neural Networks. R package version 1.33.
Exciting Process.
J Appl
Hsieh WW (2009) Machine Learning Methods in the Environmental Sciences, Cambridge University
on for Bitcoin: A comparison of GARCH models.
Econ Lett
Kim W, et al. (2016) Predicting Fluctuations in Cryptocurrency Transactions
Kristoufek L (2013) BitCoin meets Google Trends and Wikipedia: Quantifying the relationship
Kristoufek L (2015) What Are the Main Drivers of the Bitcoin Price? Evidence from Wavelet
his
is an open access article
distributed under the terms of the Creative Commons Attribution License
.0)