Echidna: Effective, Usable, and Fast Fuzzing for Smart Contracts
Gustavo Grieco
Trail of Bits, USA
Will Song
Trail of Bits, USA
Artur Cygan
Trail of Bits, USA
Josselin Feist
Trail of Bits, USA
Alex Groce
Northern Arizona University, USA
Ethereum smart contracts (autonomous programs that run on a
blockchain) often control transactions of financial and intellectual
property. Because of the critical role they play, smart contracts need
complete, comprehensive, and effective test generation. This paper
introduces an open-source smart contract fuzzer called Echidna that
makes it easy to automatically generate tests to detect violations in
assertions and custom properties. Echidna is easy to install and does
not require a complex configuration or deployment of contracts
to a local blockchain. It offers responsive feedback, captures many
property violations, and its default settings are calibrated based on
experimental data. To date, Echidna has been used in more than
10 large paid security audits, and feedback from those audits has
driven the features and user experience of Echidna, both in terms of
practical usability (e.g., smart contract frameworks like Truffle and
Embark) and test generation strategies. Echidna aims to be good at
finding real bugs in smart contracts, with minimal user effort and
maximal speed.
CCS Concepts: Software and its engineering → Dynamic analysis; Software testing and debugging.
Keywords: smart contracts, fuzzing, test generation
ACM Reference Format:
Gustavo Grieco, Will Song, Artur Cygan, Josselin Feist, and Alex Groce.
2020. Echidna: Effective, Usable, and Fast Fuzzing for Smart Contracts. In
Proceedings of the 29th ACM SIGSOFT International Symposium on Software
Testing and Analysis (ISSTA ’20), July 18–22, 2020, Virtual Event, USA. ACM,
New York, NY, USA, 4 pages.
Smart contracts for the Ethereum blockchain [ ], usually written
in the Solidity language [ ], facilitate and verify high-value financial
transactions, as well as track physical goods and intellectual
property. Thus, it is essential that these programs be correct and
secure, which is not always the case [ ]. Recent work surveying
and categorizing flaws in critical contracts [11] established that
ISSTA ’20, July 18–22, 2020, Virtual Event, USA
©2020 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-8008-9/20/07. . . $15.00
fuzzing using custom user-defined properties might detect up to
63% of the most severe and exploitable flaws in contracts. This suggests
an important need for high-quality, easy-to-use fuzzing for
smart contract developers and security auditors. Echidna [23] is an
open-source Ethereum smart contract fuzzer. Rather than relying
on a fixed set of pre-defined bug oracles to detect vulnerabilities
during fuzzing campaigns, Echidna supports three types of properties:
(1) user-defined properties (for property-based testing [ ]), (2)
assertion checking, and (3) gas use estimation. Currently, Echidna
can test both Solidity and Vyper smart contracts, and supports most
contract development frameworks, including Truffle and Embark.
Echidna has been used by Trail of Bits for numerous code audits
[ ]. The use of Echidna in internal audits is a key driver in
three primary design goals for Echidna. Echidna must (1) be easy
to use and configure; (2) produce good coverage of the contract or
blockchain state; and (3) be fast and produce results quickly.
The third design goal is essential for supporting the first two
design goals. Tools that are easy to run and produce quick results
are more likely to be integrated by engineers during the development
process. This is why most property-based testing tools have
a small default run-time or number of tests. Speed also makes use
of a tool in continuous integration (CI) more
practical. Finally, a fast fuzzer is more amenable to experimental
techniques like mutation testing [ ] or using a large set of benchmark
contracts. The size of the statistical basis for decision-making
and parameter choices explored is directly limited by the speed of
the tool. Of course, Echidna supports lengthy testing campaigns as
well: there is no upper bound on how long Echidna can run, and
with coverage-based feedback there is a long-term improvement
in test generation quality. Nonetheless, the goal of Echidna is to
reveal issues to the user in less than 5 minutes.
Fuzzing smart contracts introduces some challenges that are unusual
for fuzzer development. First, a large amount of engineering
effort is required to represent the semantics of blockchain execution;
this is a different challenge than executing instrumented native
binaries. Second, since Ethereum smart contracts compute using
transactions rather than arbitrary byte buffers, the core problem is
one of transaction sequence generation, more akin to the problem
of unit test generation [ ] than traditional fuzzing. This makes it
important to choose parameters such as the length of sequences
[ ] that are not normally as important in fuzzing or in single-value
generation as in Haskell’s QuickCheck [7]. Finally, finding smart
contract inputs that cause pathological execution times is not an
exotic, unusual concern, as in traditional fuzzing [ ]. Execution on
the Ethereum blockchain requires use of gas, which has a price in
cryptocurrency. Inefficiency can be costly, and malicious inputs can
lock a contract by making all transactions require more gas than
a transaction is allowed to use. Therefore producing quantitative
output of maximum gas usage is an important fuzzer feature, alongside
more traditional correctness checks. Echidna incorporates a
worst-case gas estimator into a general-purpose fuzzer, rather than
forcing users to add a special-purpose gas-estimation tool [ ]
to their workflow.
Figure 1: The Echidna architecture.
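The idea of estimating worst-case gas use by fuzzing can be illustrated with a minimal Python sketch. It is not Echidna's implementation: the cost model `toy_gas_cost`, the constants `BASE` and `PER_STEP`, and the limit `GAS_LIMIT` are all hypothetical stand-ins chosen for illustration. The sketch only shows the general pattern of tracking the largest cost observed over random inputs and flagging inputs that would exceed a per-transaction limit.

```python
import random

def toy_gas_cost(n: int) -> int:
    """Hypothetical cost model: a base fee plus a fee per loop iteration."""
    BASE, PER_STEP = 21_000, 5_000  # illustrative numbers, not real EVM prices
    return BASE + PER_STEP * n

def estimate_max_gas(cost_fn, trials: int, max_arg: int, seed: int = 0):
    """Return (largest observed cost, input that caused it) over random inputs."""
    rng = random.Random(seed)
    worst_cost, worst_input = -1, None
    for _ in range(trials):
        arg = rng.randint(0, max_arg)
        cost = cost_fn(arg)
        if cost > worst_cost:
            worst_cost, worst_input = cost, arg
    return worst_cost, worst_input

GAS_LIMIT = 1_000_000  # assumed per-transaction limit for this toy model
worst_cost, worst_input = estimate_max_gas(toy_gas_cost, trials=1_000, max_arg=500)
exceeds_limit = worst_cost > GAS_LIMIT  # True means some input could lock the call
```

An estimate obtained this way is a lower bound on the true worst case, since random search may miss the most expensive input.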
2.1 Echidna Architecture
Figure 1 shows the Echidna architecture divided into two steps:
pre-processing and the fuzzing campaign. Our tool starts with a
set of provided contracts, plus properties integrated into one of the
contracts. As a first step, Echidna leverages Slither [10], our smart
contract static analysis framework, to compile the contracts and
analyze them to identify useful constants and functions that handle
Ether (ETH) directly. In the second step, the fuzzing campaign starts.
This iterative process generates random transactions using the application
binary interface (ABI) provided by the contract, important
constants defined in the contract, and any previously collected sets
of transactions from the corpus. When a property violation is detected,
a counterexample is automatically minimized to report the
smallest and simplest sequence of transactions that triggers the
failure. Optionally, Echidna can output a set of transactions that
maximize coverage over all the contracts.
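The campaign and minimization steps described above can be sketched in Python under simplifying assumptions. The class `ToyContract` is a Python stand-in loosely modeled on the contract in Figure 2a, and `fuzz`, `violates`, and `minimize` are hypothetical helper names, not Echidna functions. The shrinking shown here only deletes transactions greedily; it does not simplify arguments, and it ignores the constant dictionary, corpus, and coverage feedback.

```python
import random

class ToyContract:
    """Python stand-in for a two-flag contract, loosely modeled on Figure 2a."""
    def __init__(self):
        self.flag0 = False
        self.flag1 = False
    def set0(self, val):
        if val % 100 == 23:
            self.flag0 = True
    def set1(self, val):
        if val % 10 == 5 and self.flag0:
            self.flag1 = True
    def property_holds(self):
        return not self.flag1

ABI = ["set0", "set1"]  # callable functions, as an ABI would list them

def violates(sequence):
    """Replay a sequence on a fresh contract; True if the property breaks."""
    contract = ToyContract()
    for name, arg in sequence:
        getattr(contract, name)(arg)
        if not contract.property_holds():
            return True
    return False

def fuzz(seed=0, max_tests=10_000, seq_len=20):
    """Generate random transaction sequences until one violates the property."""
    rng = random.Random(seed)
    for _ in range(max_tests):
        seq = [(rng.choice(ABI), rng.randint(0, 999)) for _ in range(seq_len)]
        if violates(seq):
            return seq
    return None

def minimize(sequence):
    """Greedy shrinking: drop one transaction at a time while the failure persists."""
    seq = list(sequence)
    changed = True
    while changed:
        changed = False
        for i in range(len(seq)):
            candidate = seq[:i] + seq[i + 1:]
            if violates(candidate):
                seq, changed = candidate, True
                break
    return seq

failing = fuzz()
minimal = minimize(failing) if failing is not None else None
```

For this toy contract the minimized counterexample is always a call to `set0` followed by a call to `set1`, because every other transaction can be deleted without repairing the property.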
2.2 Continuous Improvement
A key element of Echidna’s design is to make continuous improvement
of functionality sustainable. Echidna has an extensive test
suite (that checks detection of seeded faults, not just proper execution)
to ensure that existing features are not degraded by improvements,
and uses Haskell language features to maximize abstraction
and applicability of code to new testing approaches.
Tuning Parameters. Echidna provides a large number of configurable
parameters that control various aspects of testing. There are
currently more than 30 settings, controlled by providing Echidna
with a YAML configuration file. However, to avoid overwhelming
users with complexity and to make the out-of-the-box experience
as smooth as possible, these settings are assigned carefully chosen
default values. Default settings with a significant impact on test
generation are occasionally re-checked via mutation testing [ ] or
benchmark examples to maintain acceptable performance. This, like
other maintenance, is required because other functionality changes
may impact defaults. For example, changes in which functions are
called (e.g., removing view/pure functions with no assertions) may
necessitate using a different default sequence length. Parameter
tuning can produce some surprises with major impact on users:
e.g., the dictionary of mined constants was initially only used infrequently
in transaction generation, but we found that mean coverage
on benchmarks could be improved significantly by using constants
a full 40% of the time.
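The effect of the dictionary frequency can be illustrated with a short Python sketch. It is not Echidna's generator: `make_arg_generator`, the parameter name `dict_freq`, and the list `mined` are hypothetical, and the fallback of drawing a uniformly random 256-bit value is an assumption made for illustration.

```python
import random

def make_arg_generator(constants, dict_freq=0.40, seed=0):
    """Return a function yielding (value, came_from_dictionary) pairs."""
    rng = random.Random(seed)
    def next_arg():
        # With probability dict_freq, reuse a constant mined from the contract.
        if constants and rng.random() < dict_freq:
            return rng.choice(constants), True
        # Otherwise fall back to a uniformly random 256-bit value.
        return rng.getrandbits(256), False
    return next_arg

mined = [23, 100, 5, 10]  # hypothetical constants a static analysis might extract
gen = make_arg_generator(mined)
samples = [gen() for _ in range(10_000)]
observed_freq = sum(1 for _, used in samples if used) / len(samples)
```

A uniformly random 256-bit value almost never satisfies a check such as `val % 100 == 23` by matching the exact constant, which is one intuition for why reusing mined constants frequently can raise coverage.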
Before starting Echidna, the smart contract to test should have
either explicit echidna properties (public methods that return a
Boolean and have no arguments) or use Solidity’s assert to express
properties. For instance, Figure 2a shows a contract with a property
that tests a simple invariant. After defining the properties to test,
running Echidna is often as simple as installing it or using the
provided Docker image and then typing:
$ echidna-test Contract.sol --contract TEST
An optional YAML configuration file overriding default settings
can be provided using the --config option. Additionally, if a path
to a directory is used instead of a file, Echidna will auto-detect the
framework used (e.g., Truffle) and start the fuzzing campaign.
By default, Echidna uses a dashboard output similar to AFL’s as
shown in Figure 2b. However, a command line option or a config
file can change this to output plaintext or JSON. The config file
also controls various properties of test generation, such as the
maximum length of generated transaction sequences, the frequency
with which mined constants are used, whether coverage-driven
feedback is applied, whether maximum gas usage is computed, and
any functions to blacklist from the fuzzing campaign.
4.1 Setup
We compared Echidna’s performance to the MythX platform [9],
accessed via the solfuzz [19] interface, on a set of reachability targets.
Our experiments are produced by insertion of assert(false)
statements, on a set of benchmark contracts [1] produced for the
VeriSmart safety-checker [22]. To our knowledge, MythX is the only
comparable fuzzer that supports arbitrary reachability targets (via
supporting assertion-checking). Comparing with a fuzzer that only
supports a custom set of built-in detectors, such as ContractFuzzer
[16], which does not support arbitrary assertions in Solidity code,
is difficult to do objectively, as any differences are likely to be due
to specification semantics, not exploration capability. MythX is a
commercial SaaS platform for analyzing smart contracts. It offers a
free tier of access (limited to 5 runs/month, however) and can easily
run assertion checking on contracts via solfuzz, which provides
an interface similar to Echidna’s. MythX analyzes the submitted
contracts using the Mythril symbolic execution tool [8] and the Harvey
fuzzer [26]. Harvey is a state-of-the-art closed-source tool, with
a research paper describing its design and implementation in ICSE
2020 [26]. We also attempted to compare to the ChainFuzz [6] tool;
unfortunately, it is not maintained, and failed to analyze contracts,
producing an error reported in a GitHub issue submitted in April
of 2019.
4.2 Datasets
VeriSmart. To compare MythX and Echidna, we first analyzed
the contracts in the VeriSmart benchmark [1] and identified all contracts
such that 1) both tools ran on the contract and 2) neither tool
reported any issues with the contract. This left us with 12 clean contracts
to compare the tools’ ability to explore behavior. We inserted
assert(false) statements into each of these contracts, after every
statement, resulting in 459 contracts representing reachability
targets. We discarded 44 of these, as the assert was unconditionally
executed in the contract’s constructor, so no behavior exploration
was required to reach it.
  contract TEST {
    bool flag0; bool flag1;
    function set0(int val) public returns (bool) {
      if (val % 100 == 23) { flag0 = true; } }
    function set1(int val) public returns (bool) {
      if (val % 10 == 5 && flag0) { flag1 = true; } }
    function echidna_flag() public returns (bool) {
      return (!flag1); }
  }
(a) A contract with an echidna property. (b) A screenshot of the UI with the result of a fuzzing campaign
Figure 2: Using Echidna to test a smart contract
Tether. For a larger, more realistic example, we modified the
actual blockchain code for the TetherToken contract, and again
inserted assert(false) targets to investigate reachability of the
code. Tether is one of the most famous “stablecoins”, a cryptocurrency
pegged to a real-world currency, in this case the US dollar, and
has a market cap of approximately 6 billion dollars. The contract
has been involved in more than 23 million blockchain transactions.
4.3 Results
We then ran solfuzz’s default quick check and Echidna with a
2-minute timeout on 40 randomly selected targets. Echidna was
able to produce a transaction sequence reaching the assert for 19 of
the 40 targets, and solfuzz/MythX generated a reaching sequence
for 15 of the 40, all of which were also reached by Echidna. While
the time to reach the assertion was usually close to 2 minutes with
solfuzz, Echidna needed a maximum of only 52 seconds to hit the
hardest target; the mean time required was 13.9 seconds, and the
median time was only 6.9 seconds. We manually examined the
targets not detected by either tool, and believe them all to represent
unreachable targets, usually due to being inserted after an
unavoidable return statement, or being inserted in the SafeMath
contract, which redefines assert. Of the reachable targets, Echidna
was able to produce sequences for 100%, and solfuzz/MythX for
78.9%. For Echidna, we repeated each experiment 10 more times,
and Echidna always reached each target. Due to the limit on MythX
runs, even under a developer license (500 executions/month), we
were unable to statistically determine the stability of its results to
the same degree, but can confirm that for two of the targets, a second
run succeeded, and for two of the targets three additional runs
still failed to reach the assert. Running solfuzz with a non-default
argument (not available to free accounts) did detect all
four, but it required at least 15 minutes of analysis time in each
case. Figure 4 shows the key part of the code for one of the four
targets Echidna, but not solfuzz/MythX (even with additional runs),
was able to reach. The assert can only be executed when an allowance
has previously been granted that lets the sender of the current
call transfer an amount greater than or equal to _amount, and when
the account from which the transfer is to be made has a token
balance of at least _amount. Generating a sequence
with the proper set of functions called and the proper relationships
between variables is a difficult problem, but Echidna’s heuristic
use of small numeric values in arguments and heuristic repetition
of addresses in arguments and as message senders is able to navigate
the set of constraints easily. A similar set of constraints over
allowances and/or senders and function arguments is involved in
two of the other four targets where Echidna performs better.
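Why small values and repeated addresses help with a target like the one in Figure 4 can be illustrated with a toy Python model. Everything here is an assumption made for illustration: the class `ToyToken`, the method names `approve` and `transfer_from`, the address constants, and the two input strategies are hypothetical and are not taken from Echidna or from the benchmark contract. The overflow check from Figure 4 is omitted because Python integers do not overflow.

```python
import random

OWNER = 0x10000  # hypothetical deployer address that holds the initial supply
SMALL_POOL = [OWNER, 0x20000, 0x30000]  # hypothetical small pool of reused addresses

class ToyToken:
    """Toy allowance-based token; `reached` stands in for hitting assert(false)."""
    def __init__(self, owner, supply):
        self.balances = {owner: supply}
        self.allowed = {}
        self.reached = False
    def approve(self, sender, spender, amount):
        self.allowed[(sender, spender)] = amount
    def transfer_from(self, sender, frm, to, amount):
        if (self.balances.get(frm, 0) >= amount
                and self.allowed.get((frm, sender), 0) >= amount
                and amount > 0):
            self.balances[frm] -= amount
            self.allowed[(frm, sender)] -= amount
            self.reached = True

def heuristic_address(rng): return rng.choice(SMALL_POOL)
def heuristic_amount(rng): return rng.randint(0, 16)
def uniform_address(rng): return rng.getrandbits(160)
def uniform_amount(rng): return rng.getrandbits(256)

def run_sequence(rng, address, amount, length=10):
    """Run one random transaction sequence; True if the target is reached."""
    token = ToyToken(OWNER, supply=1_000)
    for _ in range(length):
        sender, a, b, value = address(rng), address(rng), address(rng), amount(rng)
        if rng.random() < 0.5:
            token.approve(sender, a, value)
        else:
            token.transfer_from(sender, a, b, value)
        if token.reached:
            return True
    return False

def count_hits(address, amount, runs=500, seed=0):
    rng = random.Random(seed)
    return sum(run_sequence(rng, address, amount) for _ in range(runs))

heuristic_hits = count_hits(heuristic_address, heuristic_amount)
uniform_hits = count_hits(uniform_address, uniform_amount)
```

In this model the uniform strategy never reaches the target, since a random 160-bit address practically never coincides with the token holder, while the small-pool strategy reaches it in a noticeable fraction of sequences.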
When using the Tether contract, we again randomly selected 40
targets, and ran two minutes of testing on each with solfuzz and
Echidna. Echidna was able to reach 28 of the 40 targets, with mean
and median runtimes of 24 and 15 seconds, respectively. The longest
run required 103 seconds. On the other hand, solfuzz/MythX was
unable to reach any of the targets using the default search. Using
the standard search, MythX/solfuzz was able to detect all the targets
Echidna detected, plus one additional target. The mean
time required for detection, however, was almost 16 minutes. The
additional target reached by solfuzz involves adding an address
to a blacklist, then destroying the funds of that address. Because
an address can also be removed from a blacklist, there is no simple
coverage-feedback to encourage avoiding this, and there are many
functions to test, Echidna has trouble generating such a sequence.
However, using a prototype of a swarm testing [13] mode not yet
included in the public release of Echidna, but briefly discussed in
the conclusion below, we were able to produce such a sequence
in less than five minutes. Even without swarm testing, we were
able to detect the problem in between 10 and 12 minutes, using a
branch (to be merged in the near future) that incorporates more
information from Slither, and uses some novel mutation strategies.
Of the 11 targets hit by neither tool, we manually determined that
all but two are clearly unreachable.
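The counts and percentages reported in these results can be cross-checked with a few lines of Python. The variable names are hypothetical; every number is taken from the text, and the derived values (such as the 415 remaining VeriSmart targets) follow from simple arithmetic rather than from additional data.

```python
# VeriSmart: 459 generated targets, 44 discarded as trivially reachable.
verismart_remaining = 459 - 44

# Of 40 sampled targets, Echidna reached 19 and solfuzz/MythX reached 15.
# The unreached targets were judged unreachable, so 19 targets are reachable.
reachable = 19
echidna_rate = round(100 * 19 / reachable, 1)
mythx_rate = round(100 * 15 / reachable, 1)

# Tether: Echidna reached 28 of 40; the standard search reached one more.
tether_reached_by_either = 28 + 1
tether_reached_by_neither = 40 - tether_reached_by_either
```

The derived 78.9% and the 11 targets hit by neither tool both agree with the figures stated in the text.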
As a separate set of experiments, we measured the average coverage
obtained on the VeriSmart and Tether token contracts, with
various settings for the length of transaction sequences, ranging
from very short (length 10) to very long (length 500) for runs of 2,
5, and 10 minutes each. Figures 3a and 3b show that the current
default value used by Echidna (100) is a reasonable compromise
to maximize coverage in short fuzzing campaigns. Each run was
repeated 10 times to reduce the variability of such short campaigns.
(a) TetherToken (b) VeriSmart (avg)
Figure 3: Coverage obtained given short runs (2, 5 and 10
minutes) with different transaction sequence lengths.
  if (balances[_from] >= _amount
      && allowed[_from][msg.sender] >= _amount
      && _amount > 0
      && balances[_to] + _amount > balances[_to]) {
    balances[_from] -= _amount;
    allowed[_from][msg.sender] -= _amount;
    assert(false);
Figure 4: Code for a difficult reachability target.
Echidna inherits concepts from property-based fuzzing, first popularized
by the QuickCheck tool [7], and from coverage-driven
fuzzing, perhaps best known via the American Fuzzy Lop tool [27].
Other fuzzers for Ethereum smart contracts include Harvey [26],
ContractFuzzer [16], and ChainFuzz [6]. We were unable to get
ContractFuzzer to produce useful output within a four-hour timeout,
and ChainFuzz no longer appears to work. Harvey is closed-source,
but is usable via the MythX [9] CI platform and the solfuzz [19]
tool. Echidna uses information from the Slither static analysis tool
[10] to improve the creation of Ethereum transactions.
Echidna is an effective, easy-to-use, and fast fuzzer for Ethereum
blockchain smart contracts. Echidna provides a potent out-of-the-box
fuzzing experience with little setup or preparation, but allows
for considerable customization. Echidna supports assertion checking,
custom property-checking, and estimation of maximum gas
usage, a core feature set based on experience with security audits
of contracts. The default test generation parameters of Echidna
have been calibrated using real-world experience in commercial
audits and via benchmark experiments and mutation analysis. In
our experiments, Echidna outperformed a comparable fuzzer using
sophisticated techniques: Echidna detected, in less than 2 minutes,
many reachability targets that required 15 or more minutes with
solfuzz, on both benchmark contracts and the real-world Tether
token. Echidna is under heavy active development. Recently added
or in-progress features include gas estimation, test corpus collection,
integration of Slither static analysis information, and improved
mutation for feedback-driven fuzzing. Future work will add
a driver mode, similar to the swarm tool [14] for the SPIN model
checker [15], to make better use of configuration diversity, including
swarm testing [13], in order to fully exploit multicore machines.
In particular, this mode will enable Echidna to produce even more
accurate maximum gas usage estimates.
[1] VeriSmart benchmark.
[2] Elvira Albert, Jesús Correas, Pablo Gordillo, Guillermo Román-Díez, and Albert
Rubio. Gasol: Gas analysis and optimization for ethereum smart contracts, 2019.
[3] James H. Andrews, Alex Groce, Melissa Weston, and Ru-Gang Xu. Random test
run length and effectiveness. In Automated Software Engineering, pages 19–28,
[4] Nicola Atzei, Massimo Bartoletti, and Tiziana Cimoli. A survey of attacks on
Ethereum smart contracts SoK. In International Conference on Principles of Security
and Trust, pages 164–186, 2017.
[5] Vitalik Buterin. Ethereum: A next-generation smart contract and decentralized
application platform, 2013.
[6] Chain Security.
[7] Koen Claessen and John Hughes. QuickCheck: a lightweight tool for random testing
of Haskell programs. In International Conference on Functional Programming
(ICFP), pages 268–279, 2000.
[8] ConsenSys. Mythril: a security analysis tool for ethereum smart contracts, 2017.
[9] Consensys Diligence.
[10] Josselin Feist, Gustavo Grieco, and Alex Groce. Slither: A static analysis framework
for smart contracts. In International Workshop on Emerging Trends in
Software Engineering for Blockchain, 2019.
[11] Alex Groce, Josselin Feist, Gustavo Grieco, and Michael Colburn. What are
the actual flaws in important smart contracts (and how can we find them)? In
International Conference on Financial Cryptography and Data Security, 2020.
[12] Alex Groce, Josie Holmes, Darko Marinov, August Shi, and Lingming Zhang.
An extensible, regular-expression-based tool for multi-language mutant generation.
In Proceedings of the 40th International Conference on Software Engineering:
Companion Proceedings, ICSE ’18, pages 25–28, New York, NY, USA, 2018. ACM.
[13] Alex Groce, Chaoqiang Zhang, Eric Eide, Yang Chen, and John Regehr. Swarm
testing. In International Symposium on Software Testing and Analysis, pages 78–88,
[14] Gerard Holzmann, Rajeev Joshi, and Alex Groce. Swarm verification techniques.
IEEE Transactions on Software Engineering, 37(6):845–857, 2011.
[15] Gerard J. Holzmann. The SPIN Model Checker: Primer and Reference Manual.
Addison-Wesley Professional, 2003.
[16] Bo Jiang, Ye Liu, and WK Chan. ContractFuzzer: Fuzzing smart contracts for
vulnerability detection. In Proceedings of the 33rd ACM/IEEE International Conference
on Automated Software Engineering, pages 259–269, 2018.
[17] Caroline Lemieux, Rohan Padhye, Koushik Sen, and Dawn Song. PerfFuzz: Automatically
generating pathological inputs. In Proceedings of the 27th ACM SIGSOFT
International Symposium on Software Testing and Analysis, pages 254–265, 2018.
[18] Fuchen Ma, Ying Fu, Meng Ren, Wanting Sun, Zhe Liu, Yu Jiang, Jun Sun, and
Jiaguang Sun. Gasfuzz: Generating high gas consumption inputs to avoid out-of-gas
vulnerability, 2019.
[19] Bernhard Mueller. mueller/solfuzz.
[20] Carlos Pacheco, Shuvendu K. Lahiri, Michael D. Ernst, and Thomas Ball. Feedback-directed
random test generation. In International Conference on Software Engineering,
pages 75–84, 2007.
[21] Mike Papadakis, Marinos Kintis, Jie Zhang, Yue Jia, Yves Le Traon, and Mark
Harman. Mutation testing advances: an analysis and survey. In Advances in
Computers, volume 112, pages 275–378. Elsevier, 2019.
[22] Sunbeom So, Myungho Lee, Jisu Park, Heejo Lee, and Hakjoo Oh. VeriSmart: A
highly precise safety verifier for ethereum smart contracts. In IEEE Symposium
on Security & Privacy, 2020.
[23] Trail of Bits. Echidna: Ethereum fuzz testing framework.
crytic/echidna, 2018.
[24] Trail of Bits. Trail of bits security reviews. bits/
publications#security-reviews, 2019.
[25] Gavin Wood. Ethereum: a secure decentralised generalised transaction ledger, 2014.
[26] Valentin Wüstholz and Maria Christakis. Targeted greybox fuzzing with static
lookahead analysis. In International Conference on Software Engineering, 2020.
[27] Michal Zalewski. american fuzzy lop (2.35b).
Accessed December 20, 2016.
... Many prior works explored vulnerability detection algorithms on Ethereum smart contracts, and most of them can be categorized into two types: (1) contract-level vulnerability detection and (2) line-level/node-level vulnerability detection. Contract-level vulnerability detection takes the detection as a classification problem and uses symbolic execution [21], [23], [24], fuzzing [12], [16], [35], and machine learning algorithms [19], [27], [37], [38], [43] to find vulnerable contracts. These methods achieve over 90% F1 scores in classification, but developers are still required to examine the contract lineby-line to localize buggy statements. ...
... Fuzzing: Fuzzing [12], [16], [28], [29], [31], [35] uses different algorithms to automatically generate inputs potentially trigger errors or unexpected behaviors to detect smart contract vulnerabilities. ContractFuzzer [16] analyzes the ABI interfaces of smart contracts to generate inputs that conform to the invocation grammars of the smart contracts under test. ...
... It defines new test oracles for different types of vulnerabilities and instrument EVM to monitor smart contract executions to detect security vulnerabilities. Echidna [12] incorporates a worst-case gas estimator into a general-purpose fuzzer. ...
Full-text available
Due to the immutable and decentralized nature of Ethereum (ETH) platform, smart contracts are prone to security risks that can result in financial loss. While existing machine learning-based vulnerability detection algorithms achieve high accuracy at the contract level, they require developers to manually inspect source code to locate bugs. To this end, we present G-Scan, the first end-to-end fine-grained line-level vulnerability detection system evaluated on the first-of-its-kind real world dataset. G-Scan first converts smart contracts to code graphs in a dependency and hierarchy preserving manner. Next, we train a graph neural network to identify vulnerable nodes and assess security risks. Finally, the code graphs with node vulnerability predictions are mapped back to the smart contracts for line-level localization. We train and evaluate G-Scan on a collected real world smart contracts dataset with line-level annotations on reentrancy vulnerability, one of the most common and severe types of smart contract vulnerabilities. With the well-designed graph representation and high-quality dataset, G-Scan achieves 93.02% F1-score in contract-level vulnerability detection and 93.69% F1-score in line-level vulnerability localization. Additionally, the lightweight graph neural network enables G-Scan to localize vulnerabilities in 6.1k lines of code smart contract within 1.2 seconds.
... Researchers have devised various detecting tools [12] to address security concerns for SCs. They employed multiple techniques, including symbolic analysis [13][14][15], dynamic analysis [16,17], formal verification [18][19][20], fuzzing [21][22][23], etc. However, these techniques excessively emphasize stringent rules defined by the experts. ...
... It requires inputting large amounts of random data. Tools such as ContractFuzzer [21], Echidna [22], and ILF [23] are fuzzy testing tools. ...
Full-text available
Smart contracts are utilized widely in developing safe, secure, and efficient decentralized applications. Smart contracts hold a significant amount of cryptocurrencies, and upgrading or changing them after deployment on the blockchain is difficult. Therefore, it is essential to analyze the integrity of contracts to design secure contracts before deploying them. As a result, the effective detection of various class vulnerabilities in smart contracts is a significant concern. While human specialists are still necessary for vulnerability detection methods that utilize machine learning and deep learning, these approaches often miss numerous vulnerabilities, leading to a significant false-negative rate. This research proposes a two-step hierarchical model using deep learning techniques that significantly improve the feature extraction mechanism for Ethereum smart contracts to circumvent these limitations. The first step is to determine the relationship between opcodes using a transformer for extracting the internal features of contracts to strengthen the contextual information. Then, a Bi-GRU is employed to aggregate forward and backward sequential information for long-term reliance, including vulnerable code. In the second step, the Text-CNN and spatial attention extract the local features to emphasize the significant semantics. Experiments conducted on 49,552 real-world smart contracts have demonstrated that the proposed method is more effective than state-of-the-art methods. Extensive ablation experiments are carried out to additional illustrate the framework design option's efficacy.
... Smart contract fuzzing is a valuable technique that has been extensively researched with promising results. [24,42,47]. ...
... Afterward, we craft transactions and observe if the transactions create an erroneous state in the blockchain. Note that this is a tedious validation process, but a common issue when developing smart contract fuzzers [24,42,47]. ...
Solana has quickly emerged as a popular platform for building decentralized applications (DApps), such as marketplaces for non-fungible tokens (NFTs). A key reason for its success are Solana's low transaction fees and high performance, which is achieved in part due to its stateless programming model. Although the literature features extensive tooling support for smart contract security, current solutions are largely tailored for the Ethereum Virtual Machine. Unfortunately, the very stateless nature of Solana's execution environment introduces novel attack patterns specific to Solana requiring a rethinking for building vulnerability analysis methods. In this paper, we address this gap and propose FuzzDelSol, the first binary-only coverage-guided fuzzing architecture for Solana smart contracts. FuzzDelSol faithfully models runtime specifics such as smart contract interactions. Moreover, since source code is not available for the large majority of Solana contracts, FuzzDelSol operates on the contract's binary code. Hence, due to the lack of semantic information, we carefully extracted low-level program and state information to develop a diverse set of bug oracles covering all major bug classes in Solana. Our extensive evaluation on 6049 smart contracts shows that FuzzDelSol's bug oracles find bugs with a high precision and recall. To the best of our knowledge, this is the largest evaluation of the security landscape on the Solana mainnet.
... A number of works [41]- [43] have been developed to automatically detect flaws in smart contracts. They mainly focus on static analysis [44]- [46] and fuzzing techniques [47]- [49] to discover vulnerabilities. For example, Slither [50] converts smart contracts into an intermediate representation and performs taint analysis to detect vulnerabilities. ...
Decentralized Finance (DeFi) is emerging as a peer-to-peer financial ecosystem, enabling participants to trade products on a permissionless blockchain. Built on blockchain and smart contracts, the DeFi ecosystem has experienced explosive growth in recent years. Unfortunately, smart contracts hold a massive amount of value, making them an attractive target for attacks. So far, attacks against smart contracts and DeFi protocols have resulted in billions of dollars in financial losses, severely threatening the security of the entire DeFi ecosystem. Researchers have proposed various security tools for smart contracts and DeFi protocols as countermeasures. However, a comprehensive investigation of these efforts is still lacking, leaving a crucial gap in our understanding of how to enhance the security posture of the smart contract and DeFi landscape. To fill the gap, this paper reviews the progress made in the field of smart contract and DeFi security from the perspective of both vulnerability detection and automated repair. First, we analyze the DeFi smart contract security issues and challenges. Specifically, we lucubrate various DeFi attack incidents and summarize the attacks into six categories. Then, we present an empirical study of 42 state-of-the-art techniques that can detect smart contract and DeFi vulnerabilities. In particular, we evaluate the effectiveness of traditional smart contract bug detection tools in analyzing complex DeFi protocols. Additionally, we investigate 8 existing automated repair tools for smart contracts and DeFi protocols, providing insight into their advantages and disadvantages. To make this work useful for as wide of an audience as possible, we also identify several open issues and challenges in the DeFi ecosystem that should be addressed in the future.
... These analysis tools have been applied to detect vulnerabilities in smart contracts, such as reentrancy [43,53], arithmetic overflow [51], state inconsistency problems [23], and access control problems [32,35,42]. Dynamic analysis tools, such as fuzz testing [36,37,58,61], automatically generate test cases or inputs for smart contracts to find abnormal behaviors during runtime. Formal verification techniques like Verx [48] and VeriSmart [50] can be used to check user-provided specifications. ...
Smart contracts are prone to various vulnerabilities, leading to substantial financial losses over time. Current analysis tools mainly target vulnerabilities with fixed control or dataflow patterns, such as re-entrancy and integer overflow. However, a recent study on Web3 security bugs revealed that about 80% of these bugs cannot be audited by existing tools due to the lack of domain-specific property description and checking. Given recent advances in Generative Pre-trained Transformer (GPT) models, it is worth exploring how GPT could aid in detecting logic vulnerabilities in smart contracts. In this paper, we propose GPTScan, the first tool combining GPT with static analysis for smart contract logic vulnerability detection. Instead of relying solely on GPT to identify vulnerabilities, which can lead to high false positives and is limited by GPT's pre-trained knowledge, we utilize GPT as a versatile code understanding tool. By breaking down each logic vulnerability type into scenarios and properties, GPTScan matches candidate vulnerabilities with GPT. To enhance accuracy, GPTScan further instructs GPT to intelligently recognize key variables and statements, which are then validated by static confirmation. Evaluation on diverse datasets with around 400 contract projects and 3K Solidity files shows that GPTScan achieves high precision (over 90%) for token contracts and acceptable precision (57.14%) for large projects like Web3Bugs. It effectively detects ground-truth logic vulnerabilities with a recall of over 80%, including 9 new vulnerabilities missed by human auditors. GPTScan is fast and cost-effective, taking an average of 14.39 seconds and 0.01 USD to scan per thousand lines of Solidity code. Moreover, static confirmation helps GPTScan eliminate two-thirds of false positives.
... As a final gap, we observed that a few tools, including [56,9,5,106,18,47,36], support smart contract languages beyond Solidity. These tools are actually the exception. ...
Blockchain technology is increasingly being adopted in various domains where the immutability of recorded information can foster trust among stakeholders. However, upgradeability mechanisms such as the proxy pattern permit modifying the terms encoded by a Smart Contract even after its deployment. Ensuring that such changes do not impact previous users is of paramount importance. This paper introduces CATANA, a replay testing approach for proxy-based Ethereum applications. Experiments conducted on real-world projects demonstrate the viability of using the public history of transactions to evaluate new versions of a deployed contract and perform more reliable upgrades.
An important problem in smart contract security is understanding the likelihood and criticality of discovered, or potential, weaknesses in contracts. In this paper we provide a summary of Ethereum smart contract audits performed for 23 professional stakeholders, avoiding the common problem of reporting issues mostly prevalent in low-quality contracts. These audits were performed at a leading company in blockchain security, using both open-source and proprietary tools, as well as human code analysis performed by professional security engineers. We categorize 246 individual defects, making it possible to compare the severity and frequency of different vulnerability types, compare smart contract and non-smart contract flaws, and to estimate the efficacy of automated vulnerability detection approaches.
We present the main concepts, components, and usage of Gasol, a Gas AnalysiS and Optimization tooL for Ethereum smart contracts. Gasol offers a wide variety of cost models that allow inferring the gas consumption associated with selected types of EVM instructions and/or inferring the number of times that such types of bytecode instructions are executed. Among others, we have cost models to measure only storage opcodes, to measure a selected family of gas-consumption opcodes following Ethereum’s classification, to estimate the cost of a selected program line, etc. After choosing the desired cost model and the function of interest, Gasol returns to the user an upper bound of the cost for this function. As the gas consumption is often dominated by the instructions that access the storage, Gasol uses the gas analysis to detect under-optimized storage patterns, and includes an (optional) automatic optimization of the selected function. Our tool can be used within an Eclipse plugin for Solidity which displays the gas and instruction bounds and, when applicable, the gas-optimized Solidity function.
Conference Paper
This paper describes Slither, a static analysis framework designed to provide rich information about Ethereum smart contracts. It works by converting Solidity smart contracts into an intermediate representation called SlithIR. SlithIR uses Static Single Assignment (SSA) form and a reduced instruction set to ease implementation of analyses while preserving semantic information that would be lost in transforming Solidity to bytecode. Slither allows for the application of commonly used program analysis techniques like dataflow and taint tracking. Our framework has four main use cases: (1) automated detection of vulnerabilities, (2) automated detection of code optimization opportunities, (3) improvement of the user's understanding of the contracts, and (4) assistance with code review. In this paper, we present an overview of Slither, detail the design of its intermediate representation, and evaluate its capabilities on real-world contracts. We show that Slither's bug detection is fast, accurate, and outperforms other static analysis tools at finding issues in Ethereum smart contracts in terms of speed, robustness, and balance of detection and false positives. We compared tools using a large dataset of smart contracts and manually reviewed results for 1000 of the most used contracts.
Conference Paper
Decentralized cryptocurrencies feature the use of blockchain to transfer values among peers on networks without a central agency. Smart contracts are programs running on top of the blockchain consensus protocol to enable people to make agreements while minimizing trust. Millions of smart contracts have been deployed in various decentralized applications. The security vulnerabilities within those smart contracts pose significant threats to their applications. Indeed, many critical security vulnerabilities within smart contracts on the Ethereum platform have caused huge financial losses to their users. In this work, we present ContractFuzzer, a novel fuzzer to test Ethereum smart contracts for security vulnerabilities. ContractFuzzer generates fuzzing inputs based on the ABI specifications of smart contracts, defines test oracles to detect security vulnerabilities, instruments the EVM to log smart contracts' runtime behaviors, and analyzes these logs to report security vulnerabilities. Our fuzzing of 6991 smart contracts has flagged more than 459 vulnerabilities with high precision. In particular, our fuzzing tool successfully detects the vulnerability of the DAO contract that led to a USD 60 million loss and the vulnerabilities of Parity Wallet that have led to the loss of USD 30 million and the freezing of USD 150 million worth of Ether.
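The idea of deriving fuzzing inputs from an ABI specification can be illustrated with a minimal Python sketch. It assumes the standard Solidity ABI JSON layout (a function entry with `name` and typed `inputs`); the helper names `random_value` and `generate_call`, the supported types, and the boundary-value bias are hypothetical choices for illustration and do not reflect ContractFuzzer's actual generation strategy.

```python
import random

HEX_DIGITS = "0123456789abcdef"

def random_value(sol_type: str, rng: random.Random):
    """Return a random value for a small subset of Solidity types (illustrative only)."""
    if sol_type.startswith("uint"):
        bits = int(sol_type[4:] or 256)  # plain "uint" is an alias for uint256
        # Assumption: mix boundary values with uniformly random ones.
        return rng.choice([0, 1, 2**bits - 1, rng.randrange(2**bits)])
    if sol_type == "bool":
        return rng.choice([True, False])
    if sol_type == "address":
        return "0x" + "".join(rng.choice(HEX_DIGITS) for _ in range(40))
    raise ValueError(f"unsupported type in this sketch: {sol_type}")

def generate_call(abi_entry: dict, rng: random.Random):
    """Build one (function name, argument list) pair from an ABI function entry."""
    args = [random_value(inp["type"], rng) for inp in abi_entry["inputs"]]
    return abi_entry["name"], args

# Hypothetical ABI entry for an ERC20-style transfer function.
TRANSFER_ABI = {
    "type": "function",
    "name": "transfer",
    "inputs": [
        {"name": "to", "type": "address"},
        {"name": "amount", "type": "uint256"},
    ],
}

if __name__ == "__main__":
    print(generate_call(TRANSFER_ABI, random.Random(0)))
```

A real fuzzer would additionally encode these arguments into calldata and submit them as transactions; the sketch stops at argument generation.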
Conference Paper
Performance problems in software can arise unexpectedly when programs are provided with inputs that exhibit worst-case behavior. A large body of work has focused on diagnosing such problems via statistical profiling techniques. But how does one find these inputs in the first place? We present PerfFuzz, a method to automatically generate inputs that exercise pathological behavior across program locations, without any domain knowledge. PerfFuzz generates inputs via feedback-directed mutational fuzzing. Unlike previous approaches that attempt to maximize only a scalar characteristic such as the total execution path length, PerfFuzz uses multi-dimensional feedback and independently maximizes execution counts for all program locations. This enables PerfFuzz to (1) find a variety of inputs that exercise distinct hot spots in a program and (2) generate inputs with higher total execution path length than previous approaches by escaping local maxima. PerfFuzz is also effective at generating inputs that demonstrate algorithmic complexity vulnerabilities. We implement PerfFuzz on top of AFL, a popular coverage-guided fuzzing tool, and evaluate PerfFuzz on four real-world C programs typically used in the fuzzing literature. We find that PerfFuzz outperforms prior work by generating inputs that exercise the most-hit program branch 5x to 69x more, and result in 1.9x to 24.7x longer total execution paths.
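The multi-dimensional feedback described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not PerfFuzz itself (which is built on AFL and instruments C programs): `run_target` is a hypothetical hand-instrumented program, and the mutation operator and corpus policy are simplified. An input is kept whenever it raises the best-seen execution count for any single location, rather than only when a scalar total improves.

```python
import random
from collections import Counter

def run_target(data: bytes) -> Counter:
    """Hypothetical instrumented target: returns execution counts per location."""
    counts = Counter()
    for byte in data:
        counts["loop"] += 1
        if byte == ord("a"):
            counts["branch_a"] += 1
        elif byte == ord("b"):
            counts["branch_b"] += 1
    return counts

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one random position with 'a', 'b', or 'c' (length is preserved)."""
    buf = bytearray(data)
    buf[rng.randrange(len(buf))] = rng.choice(b"abc")
    return bytes(buf)

def fuzz(seed_input: bytes, iterations: int, seed: int = 0):
    """Keep any input that sets a new per-location maximum execution count."""
    rng = random.Random(seed)
    best = dict(run_target(seed_input))  # best-seen count for each location
    corpus = [seed_input]
    for _ in range(iterations):
        child = mutate(rng.choice(corpus), rng)
        counts = run_target(child)
        improved = [loc for loc, n in counts.items() if n > best.get(loc, 0)]
        if improved:
            corpus.append(child)
            for loc in improved:
                best[loc] = counts[loc]
    return best, corpus

if __name__ == "__main__":
    best, corpus = fuzz(b"cccccccc", iterations=2000)
    print(best, len(corpus))
```

In this toy the total path length is the same for every input, because the loop always runs once per byte, so a scalar objective gives no guidance; tracking `branch_a` and `branch_b` separately is what lets the search make progress toward distinct hot spots.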
Conference Paper
Mutation testing is widely used in research (even if not in practice). Mutation testing tools usually target only one programming language and rely on parsing a program to generate mutants, or operate not at the source level but on compiled bytecode. Unfortunately, developing a robust mutation testing tool for a new language in this paradigm is a difficult and time-consuming undertaking. Moreover, bytecode/intermediate language mutants are difficult for programmers to read and understand. This paper presents a simple tool, called universalmutator, based on regular-expression-defined transformations of source code. The primary drawback of such an approach is that our tool can generate invalid mutants that do not compile, and sometimes fails to generate mutants that a parser-based tool would have produced. Additionally, it is incompatible with some approaches to improving the efficiency of mutation testing. However, the regexp-based approach provides multiple compensating advantages. First, our tool is easy to adapt to new languages; e.g., we present here the first mutation tool for Apple's Swift programming language. Second, the method makes handling multi-language programs and systems simple, because the same tool can support every language. Finally, our approach makes it easy for users to add custom, project-specific mutations.
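A regular-expression-defined mutation step of the kind described above might look like the following Python sketch. The rule list and the function name `generate_mutants` are hypothetical and are not taken from universalmutator; the sketch only illustrates the general approach of applying one textual substitution per mutant without parsing the program.

```python
import re
from typing import List

# Hypothetical rules: (regex pattern, literal replacement text).
RULES = [
    (r"\+", "-"),
    (r"<", "<="),
    (r"==", "!="),
    (r"\btrue\b", "false"),
]

def generate_mutants(source: str) -> List[str]:
    """Return one mutant per (line, rule, match): exactly one substitution each."""
    lines = source.splitlines()
    mutants = []
    for index, line in enumerate(lines):
        for pattern, replacement in RULES:
            for match in re.finditer(pattern, line):
                mutated = line[:match.start()] + replacement + line[match.end():]
                if mutated != line:
                    mutants.append(
                        "\n".join(lines[:index] + [mutated] + lines[index + 1:])
                    )
    return mutants

if __name__ == "__main__":
    for mutant in generate_mutants("if (a < b) { return a + b; }"):
        print(mutant)
```

Because the rules know nothing about syntax, the `<` rule would also fire inside an existing `<=` and produce `<==`, which is an instance of the invalid, non-compiling mutants the abstract names as the main drawback. A practical tool would compile each mutant and discard those that fail.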