Conference Paper

An Empirical Analysis of Pool Hopping Behavior in the Bitcoin Blockchain

Abstract and Figures

We provide an empirical analysis of pool hopping behavior among 15 mining pools throughout Bitcoin’s history. Mining pools have emerged as major players to ensure that the Bitcoin system stays secure, valid, and stable. Individual miners join mining pools to benefit from a more predictable income. Many questions remain open regarding how mining pools have evolved throughout Bitcoin’s history and when and why miners join or leave mining pools. We propose a heuristic algorithm to extract the payout flow from mining pools and detect the pools’ migration of miners. Our results showed that payout schemes and pool fees influence miners’ decisions to join, change, or exit from a mining pool, thus affecting the dynamics of mining pool market shares. Our analysis provides evidence that mining activity becomes an industry as miners’ decisions follow classical economic rationale.
... This article is an extension of our previous conference paper [10] in which we proposed a new approach to detect individual miners from the reward payout flow and evaluated miners' migration (pool hopping) among the top 15 mining pools. In this extension, we contribute additional updated content regarding the evolution of 23 mining pools and miners' mobility behaviors up to Aug. 2021. ...
... The authors analyzed the mining profit regarding hardware cost and electricity price and concluded that the profit became negative when the hash rate increased faster than the Bitcoin price. Belotti et al. [13] investigated pool hopping between KanoPool and SlushPool between Apr. 6 and 20, 2016, and reported that few miners tried to exploit the time difference of reward payout between two pools with diverse strategies to earn more profit from mining. Romiti et al. [14] analyzed the distribution of mining pools from Dec. 2013 to Dec. 2018 and found that 3-4 mining pools controlled >50% of the hash rate. ...
... In this case, the algorithm will calculate that the transaction purity is 1, assign it as , and follow all outputs from . . We provided an evaluation to justify that our approach can be used to detect miners' addresses in our previous article [10]. ...
Article
We analyzed 23 mining pools and explored the mobility of miners throughout Bitcoin's history. Mining pools have emerged as major players to ensure that the Bitcoin system stays secure, valid, and stable. Many questions remain open regarding how mining pools have evolved throughout Bitcoin's history and when and why miners join or leave the pools. We investigated the reward payout flow of mining pools and characterized them based on payout irregularity and structural complexity. Based on our proposed algorithm, we identified miners and studied their mobility in the pools over time. Our analysis shows that Bitcoin mining is an industry that is sensitive to external events (e.g., market price and government policy). Over time, competition between pools involving reward schemes and pool fees motivated miners to migrate between pools (i.e., pool hopping and cross pooling). These factors converged toward an optimal scheme and values, which made mining activities more stable.
... We found evidence that miners make economic decisions to select a pool and that mining pools compete to offer better reward incentives to attract miners. This chapter is an updated and extended version of my original article published at IEEE International Conference on Blockchain and Cryptocurrency (ICBC 2021) [154]. The work was led by myself in collaboration with Nicolas Soulié, Nicolas Heulot, and Petra Isenberg. ...
... Previous work had used glyph charts [172] to show numbers over time but did not highlight the association between pools. Due to the fluctuations of miner counts in the data [154], I chose to show aggregated data to reduce outliers. A heatmap matrix is an alternative to display the percentage of cross-pooling miners like in Figure 32. ...
... Throughout the design iterations of the MiningVis tool, we used visual analytics approaches to help my economist collaborator understand the evolution of mining pools and the Bitcoin mining economy as a whole. My economist collaborator and I were able to make discoveries reported in two analysis articles [153,154] based on the visualization prototypes. The economists used the tool to identify time periods to investigate pool hopping behavior, compared the characteristics between pools, and discovered contextual information to explain the behavior of actors in the activity, including individual miners, mining pools, and the Bitcoin blockchain. ...
Thesis
Bitcoin is a pioneer cryptocurrency that records transactions in a public distributed ledger called the blockchain. It has been used as a medium for payments, investments, and digital wallets that are not controlled by any government or financial institution. Over the past ten years, transaction activities in Bitcoin have increased rapidly. The volume and evolving nature of its data pose analysis challenges to explore diverse groups of users and different activities on the network. The field of Visual Analytics (VA) has been working on the development of analytical systems that allow humans to interact and gain insights from complex data. In this thesis, I make several contributions to the analysis of Bitcoin mining activity. First, I provide a characterization of the past work and research challenges related to VA for blockchains. From this assessment, I propose a VA tool to understand mining activities that ensure data integrity and security on the Bitcoin blockchain. I propose a method to extract miners from the transaction data and trace pool hopping behavior. The empirical analysis of this data revealed that emerging mining pools provided a better incentive to attract miners. Simultaneously, miners strategically chose mining pools to maximize their profit. To explore the evolution and dynamics of this activity over the long term, I developed a VA tool called MiningVis that integrates mining behavior data with contextual information from Bitcoin statistics and news. The user study demonstrates that Bitcoin miner participants use the tool to analyze higher-level mining activity rather than mining pool details. The evaluation of the tool proves that it helped participants to relate multiple pieces of information and discover historical trends of Bitcoin mining.
... They mainly utilized economic activity pattern-based ways to explore the relationships between miner-owned addresses and pools. In separate work, Tovanich et al. [10] studied pool hopping behaviors in 15 pools of Bitcoin transactions. Based on the empirical study and their proposed heuristic algorithm designed to describe the payout flows, they determined that pool fees and payout schemes are the two most important factors to influence the behaviors of miner-owned addresses. ...
... Finally, we use scikit-learn and xgboost to implement common machine learning models to test BABD-13. We perform an 8:2 split of the training set and testing set on the BABD-13 using function train_test_split() with parameter test_size=0.2. ...
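The 8:2 split mentioned in this excerpt can be sketched with scikit-learn's train_test_split. This is only a minimal illustration on synthetic stand-in data, not the BABD-13 dataset; the feature rows, the labels, and the random_state value are assumptions added to make the example self-contained and repeatable.

```python
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 100 rows with 2 features and a binary label.
features = [[i, i * 2] for i in range(100)]
labels = [i % 2 for i in range(100)]

# test_size=0.2 holds out 20% of the rows for testing, i.e. an 8:2 split.
# random_state is an assumption, fixed here only so the shuffle is repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.2, random_state=0
)

print(len(X_train), len(X_test))
```

With 100 input rows, 80 land in the training set and 20 in the testing set.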
Preprint
Cryptocurrencies are no longer just the preferred option for cybercriminal activities on darknets, due to the increasing adoption in mainstream applications. This is partly due to the transparency associated with the underpinning ledgers, where any individual can access the record of a transaction on the public ledger. In this paper, we build a dataset comprising Bitcoin transactions between 12 July 2019 and 26 May 2021. This dataset (hereafter referred to as BABD-13) contains 13 types of Bitcoin addresses, 5 categories of indicators with 148 features, and 544,462 labeled data, which is the largest labeled Bitcoin address behavior dataset publicly available to our knowledge. We then use our proposed dataset on common machine learning models, namely: k-nearest neighbors algorithm, decision tree, random forest, multilayer perceptron, and XGBoost. The results show that the accuracy rates of these machine learning models for the multi-classification task on our proposed dataset are between 93.24% and 97.13%. We also analyze the proposed features and their relationships from the experiments, and propose a k-hop subgraph generation algorithm to extract a k-hop subgraph from the entire Bitcoin transaction graph constructed by the directed heterogeneous multigraph starting from a specific Bitcoin address node (e.g., a known transaction associated with a criminal investigation). Besides, we initially analyze the behavior patterns of different types of Bitcoin addresses according to the extracted features.
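To make the idea of a k-hop subgraph concrete, the sketch below runs a breadth-first search over a small directed edge list. It is a generic illustration and not the algorithm proposed in the paper: the function name k_hop_subgraph, the choice to follow only outgoing edges, and the toy node names (addr_A, tx_1, and so on) are all assumptions.

```python
from collections import deque


def k_hop_subgraph(edges, start, k):
    """Return (nodes, edges) within k hops of `start`, following outgoing edges."""
    # Build an adjacency list; parallel edges are kept, as in a multigraph.
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)

    # Breadth-first search that records the hop distance of each reached node.
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == k:
            continue  # do not expand beyond k hops
        for neighbor in adjacency.get(node, []):
            if neighbor not in depth:
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)

    # Keep every edge whose two endpoints were both reached (induced subgraph).
    kept_edges = [(s, d) for s, d in edges if s in depth and d in depth]
    return set(depth), kept_edges


# Toy graph (assumption): address and transaction nodes alternate along each path.
toy_edges = [
    ("addr_A", "tx_1"),
    ("tx_1", "addr_B"),
    ("tx_1", "addr_C"),
    ("addr_B", "tx_2"),
    ("tx_2", "addr_D"),
]
nodes, sub_edges = k_hop_subgraph(toy_edges, "addr_A", 2)
print(sorted(nodes))
```

Starting from addr_A with k = 2, the search reaches tx_1, addr_B, and addr_C, while tx_2 and addr_D lie three and four hops away and are left out.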
... However, Token Flow does not support arbitrary cross-chain use cases. On the academic side, we have several tools that allow on-chain analysis of smart contracts for security purposes [38], [62], [63], performance [64], [64], [65], compliance and anti-fraud [66], and others [67], [68]. However, such projects provide a sort of meta-view over user activity, do not provide specific information about interaction with protocols, and are not generalizable, contrary to this work. ...
Preprint
Ecosystems of multiple blockchains are now a reality. Multi-chain applications and protocols are perceived as necessary to enable scalability, privacy, and composability. Despite being a promising emerging research area, we recently have witnessed many attacks that have caused billions of dollars in losses. Attacks against bridges that connect chains are at the top of such attacks in terms of monetary cost, and no apparent solution seems to emerge from the ongoing chaos. In this paper, we present our contribution to minimizing bridge attacks. In particular, we explore the concepts of cross-chain transaction, cross-chain logic, and the cross-chain state as the enablers of the cross-chain model. We propose Hephaestus, the first cross-chain model generator that captures the operational complexity of cross-chain applications. Hephaestus can generate cross-chain models from local transactions on different ledgers realizing arbitrary use cases and allowing operators to monitor their cross-chain applications. Monitoring helps identify outliers and malicious behavior, which can help programmatically to stop bridge hacks and other attacks. We conduct a detailed evaluation of our system, where we implement a cross-chain bridge use case. Our experimental results show that Hephaestus can process 600 cross-chain transactions in less than 5.5 seconds in an environment with two blockchains and requires sublinear storage.
... On the academic side, we have several tools that allow on-chain analysis of smart contracts for security purposes [34], [49], [50], performance [51], [51], [52], compliance and anti-fraud [53], and others [54], [55]. However, such projects provide a sort of meta-view over user activity, do not provide specific information about interaction with protocols, and are not generalizable. ...
Preprint
Ecosystems of multiple blockchains are now a reality. Multi-chain applications and protocols are perceived as necessary to enable scalability, privacy, and composability. Despite being a promising emerging research area, we recently have witnessed many attacks that have caused billions of dollars in losses. Attacks against bridges that connect chains are at the top of such attacks in terms of monetary cost, and no apparent solution seems to emerge from the ongoing chaos. In this paper, we present our contribution to minimizing bridge attacks. In particular, we explore the concepts of cross-chain transaction, cross-chain logic, and the cross-chain state as the enablers of the cross-chain model. We propose Hephaestus, the first cross-chain model generator that captures the operational complexity of cross-chain applications. Hephaestus can generate cross-chain models from local transactions on different ledgers realizing arbitrary use cases and allowing operators to monitor their cross-chain applications. Monitoring helps identify outliers and malicious behavior, which can help programmatically to stop bridge hacks and other attacks. We conduct a detailed evaluation of our system, where we implement a cross-chain bridge use case. Our experimental results show that Hephaestus can process 600 cross-chain transactions in less than 5.5 seconds in an environment with two blockchains and requires sublinear storage.
Article
Bitcoin is the most widely used crypto-currency, and one of the most studied. Thanks to the open nature of the Blockchain, transaction records are freely accessible and can be analyzed by anyone. The first step in most analytics work is to group anonymous addresses into a set of addresses, called aggregates, that are meant to correspond to unique actors. In this paper, we propose new methods to discover more accurate address aggregates using supervised learning. We introduce a way to create a labeled training set based on reliable heuristics and external information, and propose two methods. The first method automatically finds address aggregates from a set of transactions. The second one improves an address aggregate of a target actor by specializing the training for a single actor. We empirically validate our results on large-scale datasets. A striking result of our analysis is that training a model to recognize the change addresses of a particular actor is more efficient than using a larger dataset that does not target that particular actor. In doing so, we clearly show the feasibility and interest of supervised machine learning to identify Bitcoin actors.
Article
This paper proposes a conceptual framework for the analysis of reward sharing schemes in mining pools, such as those associated with Bitcoin. The framework is centered around the reported shares in a pool instead of agents and introduces two new fairness criteria: absolute and relative redistribution. These criteria impose that the addition of a share to a round affects all previous shares of the round in the same way, either in absolute amount or in relative ratio. We characterize two large classes of reward sharing schemes corresponding to each of these fairness criteria in turn. We further show that the intersection of these classes brings about a generalization of the well-known proportional scheme, which in turn leads to a new characterization of the proportional scheme itself.
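As a point of reference for the proportional scheme named in this abstract, the sketch below splits a block reward among miners in proportion to the shares each reported during a round. It shows only the basic proportional rule, not the paper's generalization or its fairness criteria; the function name, the miner names, the share counts, and the 6.25 reward value are illustrative assumptions, and pool fees are ignored.

```python
def proportional_payout(shares_by_miner, block_reward):
    """Split `block_reward` in proportion to each miner's reported shares."""
    total_shares = sum(shares_by_miner.values())
    if total_shares <= 0:
        raise ValueError("the round must contain at least one share")
    # Each miner receives reward * (own shares / all shares in the round).
    return {
        miner: block_reward * shares / total_shares
        for miner, shares in shares_by_miner.items()
    }


# Hypothetical round (assumption): three miners and a reward of 6.25 coins.
round_shares = {"alice": 60, "bob": 30, "carol": 10}
payouts = proportional_payout(round_shares, 6.25)
print(payouts)
```

Under this rule, adding one more share to the round lowers the payout of every earlier share by the same relative ratio, which is the kind of redistribution property the abstract refers to.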
Article
Cryptocurrencies gain trust in users by publicly disclosing the full creation and transaction history. In return, the transaction history faithfully records the whole spectrum of cryptocurrency user behaviors. This article analyzes and summarizes the existing research on knowledge discovery in the cryptocurrency transactions using data mining techniques. Specifically, we classify the existing research into three aspects, i.e., transaction tracings and blockchain address linking, the analyses of collective user behaviors, and the study of individual user behaviors. For each aspect, we present the problems, summarize the methodologies, and discuss major findings in the literature. Furthermore, an enumeration of transaction data parsing and visualization tools and services is also provided. Finally, we outline several gaps and trends for future investigation in this research area.
Conference Paper
We present our work on visual analytics tools to support the analysis of Bitcoin mining pool evolution. Mining blocks are a critical component of the Bitcoin ecosystem, helping to keep the system secure, valid, and stable. At the same time, mining is a resource-intensive activity that continues to get more and more difficult. Mining pools have emerged to address this issue and to ensure a more stable and predictable income by sharing computing power. Yet, increased centralization of the mining power is also not without dangers (e.g., the 51% attack), and, thus, it is important to better understand and analyze mining pool activities in Bitcoin. Here, we report three contributions: our extensive data collection on Bitcoin mining pools, our development of two custom visualizations, and our first exploratory data analysis leading to hypotheses and documented activities about pools' main features such as market share, reward rules, or location.
Article
The Bitcoin network not only is vulnerable to cyber-attacks but currently represents the most frequently used cryptocurrency for concealing illicit activities. Typically, Bitcoin activity is monitored by decreasing anonymity of its entities using machine learning-based techniques, which consider the whole blockchain. This entails two issues: first, it increases the complexity of the analysis requiring higher efforts and, second, it may hide network micro-dynamics important for detecting short-term changes in entity behavioral patterns. The aim of this paper is to address both issues by performing a “temporal dissection” of the Bitcoin blockchain, i.e., dividing it into smaller temporal batches to achieve entity classification. The idea is that a machine learning model trained on a certain time-interval (batch) should achieve good classification performance when tested on another batch if entity behavioral patterns are similar. We apply cascading machine learning principles—a type of ensemble learning applying stacking techniques—introducing a “k-fold cross-testing” concept across batches of varying size. Results show that blockchain batch size used for entity classification could be reduced for certain classes (Exchange, Gambling, and eWallet) as classification rates did not vary significantly with batch size; suggesting that behavioral patterns did not change significantly over time. Mixer and Market class detection, however, can be negatively affected. A deeper analysis of Mining Pool behavior showed that models trained on recent data perform better than models trained on older data, suggesting that “typical” Mining Pool behavior may be represented better by recent data. This work provides a first step towards uncovering entity behavioral changes via temporal dissection of blockchain data.
Chapter
The first six months of 2018 have seen cryptocurrency thefts of $761 million, and the technology is also the latest and greatest tool for money laundering. This increase in crime has caused both researchers and law enforcement to look for ways to trace criminal proceeds. Although tracing algorithms have improved recently, they still yield an enormous amount of data of which very few datapoints are relevant or interesting to investigators, let alone ordinary bitcoin owners interested in provenance. In this work we describe efforts to visualize relevant data on a blockchain. To accomplish this we come up with a graphical model to represent the stolen coins and then implement this using a variety of visualization techniques.
Article
Cryptocurrencies represented by Bitcoin have fully demonstrated their advantages and great potential in payment and monetary systems during the last decade. The mining pool, which is considered the source of Bitcoin, is the cornerstone of market stability. The surveillance of the mining pool can help regulators effectively assess the overall health of Bitcoin and issues. However, the anonymity of mining-pool miners and the difficulty of analyzing large numbers of transactions limit in-depth analysis. It is also a challenge to achieve intuitive and comprehensive monitoring of multi-source heterogeneous data. In this study, we present SuPoolVisor, an interactive visual analytics system that supports surveillance of the mining pool and de-anonymization by visual reasoning. SuPoolVisor is divided into pool level and address level. At the pool level, we use a sorted stream graph to illustrate the evolution of computing power of pools over time, and glyphs are designed in two other views to demonstrate the influence scope of the mining pool and the migration of pool members. At the address level, we use a force-directed graph and a massive sequence view to present the dynamic address network in the mining pool. Particularly, these two views, together with the Radviz view, support an iterative visual reasoning process for de-anonymization of pool members and provide interactions for cross-view analysis and identity marking. Effectiveness and usability of SuPoolVisor are demonstrated using three cases, in which we cooperate closely with experts in this field.
Article
Since its deployment in 2009, Bitcoin has achieved remarkable success and spawned hundreds of other cryptocurrencies. The author traces the evolution of the hardware underlying the system, from early GPU-based homebrew machines to today’s datacenters powered by application-specific integrated circuits. These ASIC clouds provide a glimpse into planet-scale computing’s future.
Article
Analysis of blockchain data is useful for both scientific research and commercial applications. We present BlockSci, an open-source software platform for blockchain analysis. BlockSci is versatile in its support for different blockchains and analysis tasks. It incorporates an in-memory, analytical (rather than transactional) database, making it several hundred times faster than existing tools. We describe BlockSci's design and present four analyses that illustrate its capabilities. This is a working paper that accompanies the first public release of BlockSci, available at https://github.com/citp/BlockSci. We seek input from the community to further develop the software and explore other potential applications.