Blockchain Address Poisoning
Taro Tsuchiya¹, Jin-Dong Dong¹, Kyle Soska², Nicolas Christin¹
¹Carnegie Mellon University, ²Independent
Abstract
In many blockchains, e.g., Ethereum, Binance Smart Chain
(BSC), the primary representation used for wallet addresses
is a hardly memorable 40-digit hexadecimal string. As a re-
sult, users often select addresses from their recent transaction
history, which enables blockchain address poisoning. The
adversary first generates lookalike addresses similar to one
with which the victim has previously interacted, and then en-
gages with the victim to “poison” their transaction history.
The goal is to have the victim mistakenly send tokens to
the lookalike address, as opposed to the intended recipient.
Compared to contemporary studies, this paper provides four
notable contributions. First, we develop a detection system
and perform measurements over two years on Ethereum and
BSC. We identify 13 times the number of attack attempts re-
ported previously—totaling 270M on-chain attacks targeting
17M victims. 6,633 incidents have caused at least 83.8M USD
in losses, which makes blockchain address poisoning one of
the largest cryptocurrency phishing schemes observed in the
wild. Second, we analyze a few large attack entities using im-
proved clustering techniques, and model attacker profitability
and competition. Third, we reveal attack strategies—targeted
populations, success conditions (address similarity, timing),
and cross-chain attacks. Fourth, we mathematically define
and simulate the lookalike address-generation process across
various software- and hardware-based implementations, and
identify a large-scale attacker group that appears to use GPUs.
We also discuss defensive countermeasures.
1 Introduction
Most modern blockchains rely on “wallets” for monetary
transfers. Wallet addresses are derived from cryptographic
public keys and are often represented by long, hard to mem-
orize, strings. However, most currency or token transfers re-
quire users to manually input or select the recipient’s wallet
address. A common practice is to copy and paste addresses
or select an address with which one has previously interacted.
This introduces a new attack, blockchain address poisoning.
An adversary generates one or more “lookalike” address(es)
whose first and last characters match those of an address the
victim often interacts with. The attacker then floods (“poi-
sons”) the target’s transaction history with the lookalike ad-
dress(es). As a result, the victim may erroneously send digital
assets to a lookalike address instead of the intended address.
In contrast to traditional bank transfers, blockchain transac-
tions are irreversible, making fund recovery challenging, and
thereby exacerbating the severity of any mistakes. In this
paper, we attempt to characterize the economics of address
poisoning and to uncover attackers’ strategies and capabilities
through large-scale measurements and simulations.
We design a detection system and run it between July 1,
2022, and June 30, 2024, on Ethereum and Binance Smart
Chain (BSC). Compared to contemporary studies [10,46], our
detection 1) covers all types of poisoning attacks, 2) demon-
strates greater robustness against attacker manipulation, and
3) generalizes better across time intervals and chains.
Combining the results from both chains, we identify over
270M attack attempts (i.e., 13 times the previous efforts:
21M [46] and 14M [10]) that target over 17M victims. Out of
those, attackers successfully received 6,633 transfers, adding
up to over 83.8M USD in ill-gotten gains. This attack appears
to be one of the largest cryptocurrency phishing schemes ob-
served in the wild. We validate our results by 1) manually
verifying the phished cases, 2) directly comparing the results
with the related works, 3) checking the smart contracts used
for those attacks, and 4) comparing the data collection results
with different parameters (e.g., a window size).
Of independent interest, we leverage our detection system
to capture accidental transfers (e.g., typing mistakes in the
recipient’s address). We confirm 681 accidental transfers with
5.5M USD in losses caused solely by human mistakes, which underscores the need for more forgiving user interfaces.
Attackers typically optimize their operations by bundling
several transfers into a single transaction and reusing both
addresses and contracts. This behavior allows us to link attack
addresses through the “guilt-by-association” heuristic [43].
After removing transactions from unrelated bots that occa-
[Figure 1: Poisoning transfers. Each panel shows a victim V, the intended recipient R, a lookalike address L, and a 123.4 USDT transfer: (a) tiny transfer, (b) zero-value transfer, (c) counterfeit token transfer.]
sionally copy poisoning transactions from attackers, we can
establish the existence of a few large distinct attack groups
which we use to model the economics of address poisoning.
For each group, we calculate revenues (i.e., a sum of phished
amounts) and costs (i.e., total attack transaction costs). De-
spite high variability, most large groups manage to stay profitable by succeeding often, while smaller groups sometimes suffer losses (by losing out to the competition). There is no apparent early-mover advantage.
We also analyze attack strategies. First, we evaluate the
characteristics of the targeted victims to learn what attack-
ers are looking for. We find that targets are more likely to
have had higher balances at the time of the attack, conducted
more transactions, and transferred a larger amount compared
to random stablecoin users. Second, we investigate what influ-
ences attack success. We find that victims are more likely to
choose a lookalike address that is more similar to or appears
earlier than others. Third, given that addresses are compat-
ible across Ethereum and BSC, some attack groups re-use
lookalike addresses and target victims across chains.
To infer attackers’ (hardware) lookalike address genera-
tion capabilities, we simulate address generation rates. We
implement our scripts both naively and with optimizations
and compare the performance over different CPUs and GPUs.
Based on our benchmarks, one sophisticated attack group
appears to be using GPUs while others seem to rely on CPUs.
Finally, we show that attackers are not as efficient as they
could be, suggesting that the attack would be even more pow-
erful if they optimized their strategies. This motivates us to
propose a few mitigations for address poisoning attacks at the
protocol, contract, wallet, and user levels.
2 Background
We next discuss 1) basic blockchain primitives (smart con-
tracts, ERC-20 tokens), 2) wallet types and addresses, and 3)
differences between address poisoning and phishing attacks.
2.1 Blockchain primitives
Most popular modern public blockchains are either UTXO
(Unspent Transaction Output)-based or account-based. On
UTXO-based chains such as Bitcoin, address reuse is mini-
mal: most addresses are used only for two transactions (one
to receive some money, one to spend it). On the other hand,
account-based chains such as Ethereum see significant ad-
dress reuse. Account-based chains are therefore acutely sus-
ceptible to address poisoning as users frequently reference
recipient addresses from past transactions. In this paper, we
only focus on account-based chains, so we use “account” and “wallet address” interchangeably.
Ethereum (and derived blockchains) features two types of
addresses: wallet addresses (also called Externally Owned Accounts—EOAs) and contract addresses. Wallet addresses are used to send Ether (ETH, the native currency) or interact with smart contracts (whose addresses are contract addresses). Smart
contracts are executed by all nodes using the Ethereum Vir-
tual Machine (EVM). Smart contracts are developed in a
higher level programming language (e.g., Solidity), and then
compiled to “bytecode” intelligible by the EVM before be-
ing deployed. Smart contracts allow users to perform actions such as token implementation and token transfer. Of particular
interest, ERC-20 tokens (e.g., USDT, USDC) are fungible
assets, defined by the ERC-20 framework [40], for which
creators specify a token name, symbol, and value unit. Essen-
tially ERC-20 tokens allow developers to implement other
cryptocurrencies or payment systems on top of Ethereum.
For every action (e.g., sending ETH or ERC-20 tokens, or more generally, interacting with any contract), users pay “gas,” that is, transaction fees dependent on the computational
complexity of the action requested. Other chains, such as
Binance Smart Chain (BSC), that inherit Ethereum’s structure
are called EVM-compatible chains. Notably, wallet addresses
are compatible across EVM chains, allowing users to hold
tokens across chains in one wallet.
2.2 Wallet and blockchain address
Most users will interface with the blockchain via wallet soft-
ware such as MetaMask or Phantom. These powerful tools
enable users to have full control over their funds and perform
complex tasks such as engaging in DeFi (decentralized fi-
nance) activities. When sending direct payments, users must
often manually specify the recipient address, exacerbating
usability challenges—since addresses are represented by long
(40-character), hardly memorable, hexadecimal strings. This
UX challenge creates a significant opportunity for phishing and scams, and raises several concerns about wallet interfaces (e.g., the absence of sufficiently strong warnings) [47].
Generating a wallet address in an EVM-compatible chain
is a three-step process:
(1) Choose a 32-byte private key (k), which must be truly random to avoid a wallet breach [1].
(2) Apply elliptic curve (ECDSA) multiplication to produce a public key K = k × G, where G is the fixed generator point on the elliptic curve.
(3) Hash (Keccak-256) the public key K, take the last 20 bytes, and add the prefix 0x to obtain the address.
Elliptic curve (ECDSA) multiplication is computationally hard to invert, meaning that it is impractical to derive a private key from a public key. So, to generate an address with a certain format (e.g., a vanity address), the only viable strategy is to perform a brute-force search over potentially many private keys. The complexity of the brute-force computation depends on the number of characters that have to be fixed in the desired address. For instance, it is roughly 16^3 times harder to generate a (hexadecimal) address with a 3-character desired prefix than a random address. Attackers adopt this brute-force technique to generate lookalike addresses.
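To make this concrete, the sketch below (our illustration, not the attackers' code) brute-forces key pairs until the derived address begins with a desired prefix, assuming the go-ethereum crypto package; matching both a prefix and a suffix, as attackers do, only changes the string comparison.

```go
package main

import (
	"encoding/hex"
	"fmt"
	"strings"

	"github.com/ethereum/go-ethereum/crypto"
)

// generateWithPrefix repeatedly draws random key pairs until the derived
// address starts with the desired hexadecimal prefix (checksum case ignored).
// On average this takes about 16^len(prefix) attempts.
func generateWithPrefix(prefix string) (privHex, addrHex string) {
	prefix = strings.ToLower(prefix)
	for {
		key, err := crypto.GenerateKey() // random secp256k1 private key
		if err != nil {
			continue
		}
		addr := crypto.PubkeyToAddress(key.PublicKey) // Keccak-256 of the public key, last 20 bytes
		if strings.HasPrefix(strings.ToLower(addr.Hex()[2:]), prefix) {
			return hex.EncodeToString(crypto.FromECDSA(key)), addr.Hex()
		}
	}
}

func main() {
	priv, addr := generateWithPrefix("a36") // roughly 16^3 = 4,096 attempts on average
	fmt.Println(addr, priv)
}
```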
2.3 Attack characteristics
Phishing attacks on domain names have several variants.
Examples include typosquatting [21], which involves users
mistyping domain names; combosquatting [16], wherein at-
tackers append related keywords to domain names; and ho-
mograph attacks [7], where attackers substitute characters in
a domain name with visually similar ones (e.g., Cyrillic char-
acters instead of Latin [13]). All these attacks exploit human
features—our inability to distinguish between graphically
near-identical representations, as well as our ability to quickly
make inferences from incorrect text [28]. Domain-name-
based phishing attacks may resort to indirect monetization,
e.g., ad impressions, or affiliate program enrollment [2,13,21].
Blockchain address poisoning is markedly different from
domain-name based phishing attacks in several respects. First,
it focuses on the direct extraction of funds, and it may be
more challenging for a victim to recover funds due to the
irreversibility of cryptocurrency payments.
Second, blockchain address poisoning attempts are perma-
nently recorded in a public ledger, allowing for more accurate
estimates of the losses [19]. Estimating the impact of tradi-
tional phishing attacks is challenging due to data confidential-
ity [24] and phishing website ephemerality [3,19,37,39].
As a third difference, address poisoning targets hexadeci-
mal strings, while domain-name based attacks typically focus
on character strings. Previous typosquatting studies intro-
duced various metrics for quantifying domain name similar-
ity, such as Damerau-Levenshtein (DL) distance [6], fat fin-
ger (FF) distance [21], and visual distance [32]. However,
blockchain address poisoning involves copy-and-paste or
multiple-choice selection rather than manual typing of char-
acters. Additionally, long hexadecimal addresses are often ab-
breviated to their prefix and suffix, especially on constrained
interfaces such as mobile. Thus, DL and FF distances are
less relevant here. We instead use the number of matching
hexadecimal digits at the beginning and at the end.
Fourth, domain-name based attacks often proceed in multi-
ple stages to steal funds: the victim first clicks on a malicious
URL, is directed to a phishing website and is prompted to
input sensitive information that can eventually be monetized.
On the other hand, blockchain address poisoning is far more
direct. Victims only need to make the single mistake of sub-
mitting transactions by selecting the attacker’s address.
Finally, blockchain address poisoning imposes different costs on the attacker. They only need to deploy attack smart contracts (which can even be recycled from others for free), and one poisoning transfer costs only about a dollar on Ethereum and a cent on BSC. Generating lookalike addresses, however, has a variable cost that can be adjusted by the attacker, ranging anywhere from a laptop CPU to data centers of GPUs. Domain-based phishing attacks, on the other hand, require registering/hijacking domains, crafting phishing websites, and/or buying tool kits [3], all of which are expensive and time-intensive.
3 The attack
This section describes address poisoning in detail. We first
introduce our threat model and attacker capabilities, discuss
various ways to carry out address poisoning, and examine
how attackers execute them in practice.
3.1 Threat model
We assume the existence of an attacker that passively moni-
tors all public blockchains simultaneously in real time. When available, the attacker may also observe state that has yet to be finalized on a chain, such as transactions in the mempool. The attacker can also actively engage with any chain by
launching contracts and sending transactions using many ac-
counts, bounded only by the chain and economic constraints.
Finally, the attacker can generate lookalike addresses at a cost
consistent with (publicly known) state-of-the-art hardware.
We do not assume that the attacker has any out-of-band
information about or agency towards a victim other than what
can be obtained through public blockchains. Therefore attack-
ers cannot influence the wallet software a victim is using, or
make predictions about a victim’s transfers beyond what can
be inferred through public signals. We also assume that the
security models for the individual chains hold, so attackers
cannot, for example, perform denial of service on a chain or
drop/modify a victim’s transactions.
The attacker's ultimate goal is to have the victim send assets to (one of) the attacker's lookalike address(es) instead of the victim's intended address. To do so, the attacker first monitors blockchain transactions and identifies asset (ETH or ERC-20 token) transfer events. When such transfers are observed, the attacker identifies an intended recipient wallet, R, and generates a lookalike address L. We will later numerically evaluate the computational requirements to get lookalike addresses convincing enough for the attack to be viable.
The attacker then submits to the chain phishing transactions that interact with the victim V and contain the lookalike address L, typically within a short period of time. When the victim later attempts to send assets to R, they could get confused and mistakenly send funds to L instead. This confusion typically arises through some combination of UI/UX limitations and the victim failing to carefully check the recipient. A common vector is for a victim to transfer funds to an address that is pre-populated by their wallet software in a list of addresses with which they had recent interactions. We focus on ERC-20 (or BEP-20 on BSC) tokens as the underlying asset, as attacks with these tokens are the most prevalent.
3.2 Transfer types
The attacker poisons the victim through three types of token transfers: tiny transfers, zero-value transfers, and counterfeit token transfers. Figure 1 depicts these variations.
In a tiny transfer (Fig. 1(a)), the attacker always sends from L to V a small amount of the same token that V had previously sent to R. The idea is that L now shows up in V's recent transaction history. Later, V could mistakenly assign L as the recipient due to its similarity to R. This attack is similar to blockchain dusting [42], in which the attacker sends many small transactions to a victim for deanonymization.
In a zero-value transfer (Fig. 1(b)), the attacker creates a transfer from V to L for the same token that V sent to R, but with the value of zero. Typically, token transfers can only be sender-initiated. However, the current token standard features a public function called transferFrom, and popular token implementations allow the function callers to specify an arbitrary sender and an arbitrary receiver if the transferred value is zero. Therefore, the attacker can record a transfer event from V to L of the same ERC-20 token with zero value. Zero-value transfers are potentially even more dangerous than tiny transfers, as the victim appears to have previously sent money to the lookalike address.
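To illustrate why such zero-value calls succeed, the following minimal Go model (our simplification of typical ERC-20 allowance logic, not actual contract code) shows that the allowance check never fails when the requested value is zero, so any caller can record a Transfer event between two arbitrary addresses.

```go
package erc20model

// Minimal model of a typical ERC-20 transferFrom (not real contract code).
// With value == 0 the allowance check passes for any caller, the balance
// updates are no-ops, and a Transfer(from, to, 0) event is still recorded.
type Token struct {
	balances  map[string]uint64
	allowance map[[2]string]uint64 // (owner, spender) -> approved amount
}

func NewToken() *Token {
	return &Token{
		balances:  make(map[string]uint64),
		allowance: make(map[[2]string]uint64),
	}
}

// TransferFrom lets `caller` move `value` tokens from `from` to `to`.
func (t *Token) TransferFrom(caller, from, to string, value uint64) bool {
	if t.allowance[[2]string{from, caller}] < value { // never fails when value == 0
		return false
	}
	t.allowance[[2]string{from, caller}] -= value
	t.balances[from] -= value
	t.balances[to] += value
	// emit Transfer(from, to, value): this log entry is what wallet UIs and
	// block explorers show in the victim's transaction history.
	return true
}
```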
In a counterfeit token transfer (Fig. 1(c)), the attacker sends from V to L the same amount V sent to R, but with a counterfeit token created by the attacker (“USDTT” here). Since the attacker controls the implementation of the counterfeit token, they have the ability to transfer it on behalf of anyone. There is no restriction on names, symbols, or functionality for ERC-20 tokens [8,43], which makes it easy to create similar-looking tokens for malicious purposes. Because most popular wallet software maintains a curated list of popular (chain, contract address, symbol) tuples, attackers often avoid names (or symbols) identical to mainstream tokens to avoid triggering a warning. In the case of new or recently launched tokens, they could even proactively register the token name first [46].
Finally, these types of transactions are not mutually exclu-
sive. Indeed, the attacker can combine these strategies, flood
the victim with transactions, and wait for a mistake.
3.3 Attack implementation
[Figure 2: Attack mechanism. Legend: A = attack address, AC = attack contract, CT = counterfeit token, AT = authentic token, V = victim's address, L = lookalike address, TX = poisoning transaction, TR = poisoning transfer.]
Figure 2 shows one method of implementing blockchain address poisoning in practice. Wallets are represented by rectangles and contracts (or ERC-20 tokens created by contracts) are represented by ovals. We color attackers' instances in pink. First, an attack account (A) deploys a tailor-made attack contract (AC) and calls a function from it to execute multiple poisoning transfers at once. The attacker can also directly interact with the token contract to initiate a single poisoning transfer (without AC), which we observe in some cases despite its higher gas cost. The transaction TX bundles together several potentially unrelated phishing transfers, in this case, the transfer of a lookalike token from V1 to L1 and a zero-value transfer from V2 to L2. Notice also that attacker A1 recycles their contract AC1 and CT1 to perform additional phishing transfers in TX3, and the infrastructure that A1 has embedded on a chain is typically distinct from that of other attackers, such as A2's use of AC2. It is also common for multiple attackers to target the same victim (e.g., V1 = V4). As we will later describe, we have observed cases where attackers launch more than 100 poisoning transfers in a single transaction.
4 Detecting attacks
We next explain how we implement our detection system and
how we cluster individual attack instances into attack groups.
4.1 Detection algorithm
Ye et al. [46] identify zero-value transfers by enumerating all token transfers with zero value.
[Figure 3: Detection algorithm sketch. For every block n over two years: 1. find V = {V1, V2, ...} and R = {R1, R2, ...} in stablecoin transfers; 2. find potential poisoning transfers and C = {C1, C2, ...}; 3. calculate prefix/suffix similarity between R and C (ignoring the middle of the address; e.g., match (a = 3) and (b = 4)) and extract L = {L1, L2, ...}; 4. categorize transfers (Intended: V→R, AT, v > 0; Tiny: L→V, AT, 0 < v < thresh; Zero: V→L, AT, v = 0; Counterfeit: V→L, CT, v > 0; Payoff: V→L, AT, v > 0); 5. confirm payoff transfers by retrieving the entire history of V and re-checking.]
While computationally efficient, this method does not consider tiny and counterfeit token transfers, and may mislabel benign zero-value transfers as attacks, e.g., those used for testing a destination address. Guan
and Li [10] capture all three poisoning transfers but follow an
approach slightly different from ours. They flag suspicious
transfers based on attack characteristics first, whereas we filter
transfers when a victim interacts with two improbably similar,
but different addresses. Our approach appears more robust to
attacker manipulation (refer to Appendix A). We deploy our
own Ethereum and BSC full nodes to scan the data efficiently
instead of relying on third-party RPC endpoints. Figure 3
sketches our detection system, which consists of five steps.
Step 1: Identifying potential victims. For block n, we collect the list of addresses (senders) V = {V1, V2, ...} who have sent any amount of one (or more) of the major stablecoins¹ and the addresses R = {R1, R2, ...} that have received these tokens. V is the set of (potential) victim addresses, and R is the set of intended addresses.
Step 2: Identifying possible poisoning transfers. We next collect potential poisoning transfers, i.e., the attempt(s) to tamper with the victim's address lists. These poisoning transfers often happen soon after the attacker has identified a potential victim through the original token transfer (which we collected in Step 1). So, we focus on the 20 minutes following the original transfer, which corresponds to blocks n+1 to n+m+1, where m = 100 and 400 for Ethereum and BSC, respectively.
We validate this choice of window size in §5.2.
We look at all transfers in stablecoins, i.e., we search all transfers sent to (or from) addresses in V for potential tiny (or zero-value) transfers. We also retrieve all ERC-20 token transfers for which an address in V is a source, for potential counterfeit token transfers. We denote the set of recently used addresses that interact with V by C = {C1, C2, ...}.
We iterate this two-step process for every block n over a two-year interval.² Given that attackers often launch multiple poisoning transfers within a single transaction, we further retrieve other transfer events (if any) in the same poisoning transactions. For instance, in Figure 2, if we detect TR1, we can also capture TR2.
1 Tether (USDT), USD Coin (USDC), and DAI for Ethereum; Binance-Peg USD Coin (USDC), Binance-Peg BSC-USD (BSC-USD), Binance-Peg BUSD (BUSD), and any USDC for BSC.
2 We collect transfer events through the eth_getLogs JSON-RPC API with signature 0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef.
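As a concrete illustration of this collection step, the sketch below (our own, assuming the go-ethereum client rather than the paper's scanner) fetches all ERC-20 Transfer logs in blocks n+1 through n+m+1 following an intended transfer, filtered by the event signature from footnote 2.

```go
package detection

import (
	"context"
	"math/big"

	"github.com/ethereum/go-ethereum"
	"github.com/ethereum/go-ethereum/common"
	"github.com/ethereum/go-ethereum/core/types"
	"github.com/ethereum/go-ethereum/ethclient"
)

// transferTopic is the Keccak-256 signature of Transfer(address,address,uint256),
// i.e., the event signature referenced in footnote 2.
var transferTopic = common.HexToHash(
	"0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef")

// transfersInWindow fetches all ERC-20 Transfer logs in blocks n+1 .. n+m+1,
// mirroring the 20-minute poisoning window (m = 100 on Ethereum, 400 on BSC).
func transfersInWindow(c *ethclient.Client, n, m uint64) ([]types.Log, error) {
	q := ethereum.FilterQuery{
		FromBlock: new(big.Int).SetUint64(n + 1),
		ToBlock:   new(big.Int).SetUint64(n + m + 1),
		Topics:    [][]common.Hash{{transferTopic}},
	}
	return c.FilterLogs(context.Background(), q)
}
```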
Step 3: Extracting lookalike addresses. We then extract the lookalike addresses L that the attackers create to impersonate the intended addresses R. While researchers typically apply Hamming distance to hexadecimal strings [16], the middle part of a string matters significantly less in address poisoning. As a result, we develop our similarity score algorithm focusing on the beginning and ending hexadecimal characters of the address. For simplicity, we do not consider checksum addresses and regard upper and lower cases as equivalent. For every pair (Ri, Cj), where Ri ∈ R is an intended address and Cj ∈ C a recently used address, we check the length of the prefix and suffix match, which we record as a similarity score (a, b). To reduce false positives (i.e., addresses that are similar to each other by coincidence), we only consider the addresses in C for which a ≥ 3 and b ≥ 4.³ We choose this threshold based on the common visual representations of blockchain addresses, which use abbreviations [46]. We denote the set of these lookalike addresses by L = {L1, L2, ...} ⊆ C. By the birthday paradox, if a victim interacts with many different accounts, there is a high chance of interacting with a similar-looking address by coincidence, leading to potential false positives. Appendix B discusses how we set the threshold to exclude victims who are highly susceptible to this issue while capturing nearly all attacks.
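As an illustration of this similarity computation, the Go sketch below (a minimal re-implementation under our reading of the rule, not the authors' code) counts matching leading and trailing hexadecimal digits and applies the a ≥ 3, b ≥ 4 thresholds; the example addresses are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// similarityScore returns (a, b): the number of leading and trailing
// hexadecimal digits that two addresses share, ignoring the "0x" prefix,
// checksum capitalization, and the middle of the string.
func similarityScore(r, c string) (int, int) {
	r = strings.ToLower(strings.TrimPrefix(r, "0x"))
	c = strings.ToLower(strings.TrimPrefix(c, "0x"))
	a := 0
	for a < len(r) && a < len(c) && r[a] == c[a] {
		a++
	}
	b := 0
	for b < len(r)-a && b < len(c)-a && r[len(r)-1-b] == c[len(c)-1-b] {
		b++
	}
	return a, b
}

// isLookalike applies the thresholds used in Step 3: at least 3 matching
// leading digits and 4 matching trailing digits.
func isLookalike(r, c string) bool {
	a, b := similarityScore(r, c)
	return a >= 3 && b >= 4
}

func main() {
	r := "0xa360123456789abcdef0123456789abcdef0b43c" // hypothetical intended address
	c := "0xa36fedcba9876543210fedcba98765432109b43c" // hypothetical candidate address
	a, b := similarityScore(r, c)
	fmt.Println(a, b, isLookalike(r, c)) // prints: 3 4 true
}
```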
Step 4: Pruning possible attacks. For each victim V, the corresponding intended address R, and the associated lookalike address L, we extract all related token transfer events collected in Step 2. In other words, we only look at the cases where the victim V interacts with two very similar addresses R and L, making false positives statistically unlikely (except for those caused by typing mistakes, which we separately discuss in Appendix C).
To identify counterfeit tokens, we acquire a list of ERC-20 and BEP-20 tokens from two sources: 1) Etherscan and BSCscan⁴ and 2) Coinranking.⁵ The union of these lists forms the set of authentic tokens, AT; all others form the set of counterfeit tokens, CT.
We then categorize the transfers we have collected into three major categories (intended, poisoning, payoff), distinguishing poisoning transfers between tiny, zero-value, and counterfeit token transfers per their description in Figure 1. Let v be the monetary value of the transfer. Formally (a code sketch of these rules follows the list):
• A transfer is an intended transfer if a potential victim V sends an authentic token AT to an intended recipient R, and v > 0.
• A transfer is a poisoning transfer:
3
Previous work on typosquatting or combosquatting [16,21] avoids short
domain names for the same reason.
4 Tokens with at least a “Neutral” or “OK” reputation from Etherscan: https://etherscan.io/tokens and BSCscan: https://bscscan.com/tokens, as of August 13, 2024.
5 https://coinranking.com/coins/erc-20, as of August 31, 2024.
1. If a lookalike address L sends an authentic token AT to a victim V with 0 < v < v0 (tiny transfer attack). We set the threshold v0 to 10 USD to exclude unreasonably high transfers from the attacker.⁶
2. Or, if a victim V sends an authentic token AT to a lookalike address L with v = 0 (zero-value transfer attack).
3. Or, if a victim V sends a counterfeit token CT to a receiver L (counterfeit token transfer attack). Recall that because the token was created by the attacker, they could implement any semantics they desire, notably having anybody send tokens to anybody else, without any authentication.
• A transfer is a payoff transfer if the victim V sends an authentic token AT (v > 0) to a lookalike address L. This is the transfer that ultimately monetizes the attack for the adversary.
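The sketch below (our condensation of these rules, with illustrative names and a USD-denominated value) labels a single transfer, assuming the sets of victims, intended recipients, lookalike addresses, and authentic tokens have already been computed.

```go
package detection

// Label categorizes a transfer per Step 4; names and types are illustrative.
type Label int

const (
	Other Label = iota
	Intended
	Tiny
	Zero
	Counterfeit
	Payoff
)

const tinyThresholdUSD = 10.0 // v0 in the text

// classify labels a single token transfer (from, to, token, value in USD).
func classify(from, to, token string, valueUSD float64,
	victims, intended, lookalike, authentic map[string]bool) Label {
	switch {
	case victims[from] && intended[to] && authentic[token] && valueUSD > 0:
		return Intended
	case lookalike[from] && victims[to] && authentic[token] && valueUSD > 0 && valueUSD < tinyThresholdUSD:
		return Tiny
	case victims[from] && lookalike[to] && authentic[token] && valueUSD == 0:
		return Zero
	case victims[from] && lookalike[to] && !authentic[token] && valueUSD > 0:
		return Counterfeit
	case victims[from] && lookalike[to] && authentic[token] && valueUSD > 0:
		return Payoff
	}
	return Other
}
```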
The top-right table in Figure 3 summarizes our labeling scheme. Red cells are attackers. Note that we convert the transferred value into USD on the day of the transfer. The price data comes from the CoinGecko API, based on the contract address.
Step 5: Verifying payoffs. Among all possible payoff transfers found in Step 4, we consider a transfer to be confirmed if V has received at least one poisoning transfer from L between the intended transfer and the (candidate) payoff transfer.
Others are unconfirmed payoff transfers—the victim sends an authentic token to L without having received a poisoning transfer from L. This can happen for three reasons.
First, the victim could have legitimately interacted with two similar addresses by sheer coincidence. This is statistically unlikely, specifically because we already filter out addresses particularly vulnerable to the “birthday paradox” (refer to Appendix B for our detailed analysis).
Second, the attack may have taken place over a time interval exceeding our 20-minute monitoring window. To address this, for each unconfirmed payoff transfer, we fetch the victim's entire token transfer history (from Etherscan and BSCscan) between the intended transfer and the unconfirmed payoff transfer. If we find any poisoning transfers we had failed to capture, we confirm the payoff transfer.
Third, the victim could have accidentally sent tokens to an incorrect destination address, e.g., due to a typing mistake. If 1) the destination address never sent any tokens, i.e., the attacker does not redeem the payoff transfer, and 2) the number of matching digits d is more than 20, i.e., it is computationally infeasible to generate such an L, we assume that the destination address was actually a typo. We indeed find 681 accidental transfers from 501 users, totaling 5.5 million USD in losses (almost all on Ethereum, with only 57,000 USD on BSC); see Appendix C for the detailed method, results, and analysis.
6
More than 98.79% of tiny transfers below 10 USD are less than 3 USD.
We manually found some rare cases where the attackers sent nearly 10 USD
as a tiny transfer.
4.2 Clustering attackers
Following traditional cybercrime literature [33], we want to
link attack instances to extrapolate attack groups and analyze
strategies, profitability, and infrastructure. Each address poi-
soning attack consists of 1) a poisoning transfer (TR), 2) a poisoning transaction (TX) that initiates TR, 3) a lookalike wallet address (L), 4) a counterfeit token (CT), 5) an attack contract (AC) that initiates TX, and 6) an attack wallet address (A) that interacts with AC (see Figure 2).
We associate each poisoning transfer TR with the corresponding attack instance {TX, L, CT, AC, A}. Similar to the “guilt-by-association” technique [43], we merge two attack sets if they share at least one of TX, L, or A. In other words, we assume that 1) two poisoning transfers in the same transaction belong to the same attacker, 2) an attacker is economically rational and specifies addresses L they control (to successfully receive payoffs), and 3) an attacker does not share the private key of the attacker's wallet A with other attackers. We do not use information on AC or CT to avoid spurious cluster merges, because anyone can call an attack contract or use a counterfeit token, regardless of source code availability. For instance, in Figure 2, we 1) merge (TR1, TR2) because both transfers are in the same transaction, 2) merge (TR1, TR3) if L1 = L3, and 3) merge (TR3, TR4, TR5, TR6) if A1 = A2.
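A compact way to implement this merging is a disjoint-set (union-find) structure keyed on shared TX, L, or A values; the sketch below is our illustration of the guilt-by-association rule, not the authors' implementation, and deliberately ignores AC and CT.

```go
package clustering

type dsu struct{ parent []int }

func newDSU(n int) *dsu {
	p := make([]int, n)
	for i := range p {
		p[i] = i
	}
	return &dsu{parent: p}
}

func (d *dsu) find(x int) int {
	for d.parent[x] != x {
		d.parent[x] = d.parent[d.parent[x]] // path compression
		x = d.parent[x]
	}
	return x
}

func (d *dsu) union(a, b int) { d.parent[d.find(a)] = d.find(b) }

// instance holds the three identifiers used for merging (illustrative model).
type instance struct{ TX, L, A string }

// cluster merges attack instances that share a TX, L, or A value.
func cluster(instances []instance) *dsu {
	d := newDSU(len(instances))
	keys := []func(instance) string{
		func(i instance) string { return "tx:" + i.TX },
		func(i instance) string { return "l:" + i.L },
		func(i instance) string { return "a:" + i.A },
	}
	for _, key := range keys {
		seen := map[string]int{}
		for idx, ins := range instances {
			k := key(ins)
			if first, ok := seen[k]; ok {
				d.union(first, idx) // same TX, L, or A => same attack group
			} else {
				seen[k] = idx
			}
		}
	}
	return d
}
```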
Having applied this process to our entire dataset, we notice that two seemingly distinct large attack groups—active at different times or using different attack strategies—can be merged due to a few overlapping attack addresses A. Manual investigation revealed that those accounts appear to be bots copying poisoning transactions from two different attack groups with little or no modification, leading to a seemingly erroneous merge. Indeed, Guan and Li [10] performed similar clustering techniques, but did not consider the existence of bots and appeared to form one large cluster.
To prevent such issues, we explored several methods to exclude transactions from bots. While most of the approaches either still produce erroneous merges or remove too many transactions and underestimate the group size, analyzing account history effectively addresses both issues. For each attack address A, we additionally fetch their account history from Etherscan and calculate an attack ratio, defined as the proportion of transactions used for poisoning transfers. We then set a threshold to exclude addresses that have a low attack ratio. We document the details of those bots' behavior and the bot exclusion process in Appendix D.
5 Results and evaluation
We apply our detection algorithm over the period Jul. 1st,
2022 to Jun. 30th, 2024 for both Ethereum (blocks 15,053,226
to 20,207,948) and BSC (blocks 19,170,674 to 40,077,592).
The chain scanning process (written in Go) takes over two
weeks on Ethereum and over a month on BSC (on an Ubuntu 20.04.5 LTS server; AMD EPYC 9124 16-Core Processor;
128GB RAM), highlighting the need for our optimizations.
5.1 Detection results
We present the summary statistics for poisoning transfers in Table 1 and confirmed payoff transfers in Table 2.
On Ethereum, we find 17.3M poisoning transfers:
300,000 tiny transfers, 7.2M zero-value transfers, and 9.9M
counterfeit-token transfers over 1.7M transactions. Attackers
target 1.3M victim addresses from 6.5M lookalike addresses.
On Feb. 18th, 2023, we capture 362,934 poisoning transfers
(about 50 transfers per block). Those attack transactions consume
6.6% of total gas used that day, producing wasteful state on
the chain and increasing network operating cost.⁷
Attackers on Ethereum receive 1,738 transfers from 1,502 victims, totaling nearly 80M USD (avg. 45,853 USD, median 2,169 USD, with high variance), with a success rate of 0.01%.⁸
Some victims sent assets to the attacker’s address(es) multiple
times. For instance, one victim sent 1.999M and 2M USDC
within 10 blocks without noticing they were being phished.
On BSC, the number of attacks is significantly larger. We
identify over 252M poisoning transfers: 3.6M tiny transfers,
141M zero-value transfers, and 108M counterfeit-token trans-
fers in 17M transactions. Attackers target 16M victim ad-
dresses from 44M lookalike addresses. On Jun. 5th, 2024,
we find over 3M poisoning transfers (i.e., about 105 transfers per block). The result suggests a higher attack prevalence in
chains with lower transaction fees, leading to a significant
clutter in UIs and degradation in user experience. Attackers
on BSC receive 4,895 transfers from 4,004 victims, totaling
4.5M USD (avg. 1,164 USD, median 85 USD).
As discussed in §4, we only focus on a few stablecoins for
data collection efficiency, so our estimate is a conservative
lower bound. For instance, in a well-known address poisoning
attack, a victim mistakenly sent 1,155 WBTC (about 68M USD) to a lookalike address on Ethereum [34,38]. We capture the
attack attempt (which belongs to the 13th largest group in our
clustering results in Section 5.3) but we do not capture the
final transfer (i.e., WBTC which is not a stablecoin), so our
loss estimates do not include this attack.
5.2 Detection evaluation
Manual evaluation on successful cases. Statistically, the probability of a false positive is exceedingly low because 1) we only look at transfers where the victim has interacted with two similar addresses, and 2) we require poisoning transfers to have taken place between the intended and the payoff transfer. Nevertheless, two of the authors also manually verified
transfers labeled as successful payoff transfers.
7 We assume an average block gas limit of 15 million, and aggregate the gas usage of all poisoning transactions to calculate the percentage.
8 0.13% spam click-through rate in Twitter [9].
We sorted payoff transfers by decreasing amount and checked the 30
largest transfers for both chains. We visited each victim's page on external blockchain scanning websites (e.g., Etherscan and BSCscan) and manually confirmed 1) the intended (original)
transfer, 2) the poisoning transfer, and 3) the payoff transfer.
We verified that all 30 cases are indeed payoff transfers for
both Ethereum and BSC (i.e., no false positives). Those 30
cases account for 53.5M (67%) of Ethereum losses and 2M
(45%) of BSC losses. For accountability purposes, we list the
top 30 successful cases for Ethereum and BSC in Appendix G.
Comparison with previous work. We directly compare our
detection results by focusing on 12 and 16 months of data to
align with Ye et al.’s work [46] and Guan and Li’s [10] work,
respectively. We detect more payoff transfers than Ye et al. We
find 1,059 successful cases with 36.6M USD in losses while
Ye et al. found 560 successful cases with at most 28M USD in
losses, including two separate (non-address poisoning) attacks
they investigate. In terms of poisoning transfers, we identify
4.9M zero-value token transfers, compared to 21M for Ye et
al. It is unclear whether the difference comes from 1) false
positives on Ye et al.'s side (i.e., benign zero-value token
transfers), or 2) false negatives on our side.
We find a similar total loss on Ethereum (during the same
period) as Guan and Li [10], but we detect more than 2.2M
counterfeit token transfers, mostly from tokens with modified
symbols (refer to a more detailed comparison in Appendix A).
Checking the contract verification. We may mislabel coun-
terfeit token transfer attacks if a transferred token is authentic
but not in our ERC-20 token list (AT). In general, public
ERC-20 token creators upload the smart contract source code
to Etherscan, get verified by Etherscan, and allow users to
interact with (e.g., call a function of) that contract. Attackers,
on the other hand, have few incentives to upload their source
code. By leveraging this fact, we check the existence of the
smart contract source code (i.e., verification) on Etherscan
for each counterfeit token CT. We also examine each attack contract AC for the same reason. Only 7 (out of 6,280) counterfeit token contracts and 8 (out of 3,480) attack contracts are
verified on Etherscan. These involve only 14 out of 17.3M
poisoning transfers. Such potential false positives could hap-
pen when the victim specifies a similar address (by a typing
mistake or by coincidence) and uses a very minor (but not
counterfeit) token (i.e., one not in AT) at the same time. Some of those
verified contracts are false positives (e.g., authentic tokens
that are not in our dataset), but others are indeed counterfeit
contracts that attackers uploaded to Etherscan. In other words,
some attackers had uploaded the source code of the attack
contracts—incidentally, this helped us illuminate how the at-
tack is conducted (which we described in §3). We exclude
verified attack contracts and counterfeit token contracts from
our clustering to avoid potentially erroneous merges.
Window range. To make the chain scanning process efficient,
we assume that the attack happens in the next 20 minutes (100
blocks in Ethereum) after the original transfers.
Table 1: Attack summary statistics
Chain | Blocks | Transactions | All poisoning transfers | Tiny transfers | Zero-value transfers | Counterfeit-token transfers | Victim addresses | Lookalike addresses | Attack contracts | Counterfeit tokens
Ethereum | 5,154,722 | 1,691,529 | 17,365,954 | 308,881 | 7,185,298 | 9,871,775 | 1,330,948 | 6,492,215 | 3,480 | 6,280
BSC | 20,906,918 | 16,505,215 | 252,703,515 | 3,651,015 | 140,556,905 | 108,495,595 | 16,107,774 | 43,644,433 | 406 | 710
Total | 26,061,640 | 18,196,744 | 270,069,469 | 3,959,896 | 147,742,203 | 118,367,370 | 17,438,722 | 50,136,648 | 3,892 | 6,990
Table 2: Successful phishing attacks
Chain | Total Loss (USD) | Min (USD) | Median (USD) | Avg. (USD) | Max (USD) | Std. dev (USD) | Count | Victims
Ethereum | 79,344,412 | 0.001 | 2,160 | 45,653 | 20,000,000 | 521,663 | 1,738 | 1,502
BSC | 4,490,804 | 0.000 | 85 | 1,164 | 279,489 | 8,366 | 4,895 | 4,004
We evaluate this assumption by collecting data over 200,000 blocks start-
ing from 16,950,603 (Apr. 1st, 2023), using both a 100-block
range and a 200-block range, and comparing results. Both
ranges identify the same number of successful cases (90).
The 200-block range only identifies two additional tiny trans-
fers besides the 250 tiny transfers identified with a 100-block
range. With a 100-block range, we find 102,154 zero-value
transfers; extending to 200 blocks only identifies an addi-
tional 27 zero transfers. Finally, a 100-block window captures
704,998 counterfeit token transfers, whereas a 200-block win-
dow brings this up to 706,332. In short, doubling the window
size (20 minutes to 40 minutes) does not dramatically increase
the number of attacks we discover. We believe that collecting
other transfers within the same poisoning transaction (§4.1, Step 2) has significantly helped the coverage.
5.3 Clustering results
Before clustering, we attempt to remove addresses (bots) that
copy transactions by setting an attack ratio threshold, as dis-
cussed in §4.2. We choose a threshold of 0.5, which 1) avoids
erroneous merges of large clusters and 2) only removes 1.49%
of all transfers. We provide a detailed explanation of this
threshold selection and demonstrate the robustness of our clus-
tering results (through temporal clustering) in Appendix D.
Table 3 describes the number of attack instances and as-
signs Group IDs sorted by the number of lookalike addresses
in each group on Ethereum. The notation follows §3. We
only list the eight biggest groups because the size of the eighth group is significantly smaller than that of the top seven (i.e., fewer than 100,000 lookalike addresses, with only a few successful cases). Different groups exhibit different attack strategies; not all groups use all three types of poisoning transfers. Only Group 1 uses tiny transfers. Groups 3, 4, and 7 do not have any counterfeit
token contracts CT, and thus do not use any counterfeit-token transfers, and instead focus on zero-value transfers.
We additionally collect contract bytecode from our Ethereum node and list the number of distinct contracts in parentheses. In Group 2, for example, all CT and AC are identical, indicating the reuse of the same source code
Table 3: Attack group general statistics. Notation follows Fig. 2. The distinct number of contracts is in parentheses.
Group | L | CT | A | AC | R | TR | TX
1 | 1,390,902 | 3,309 (517) | 4,818 | 2,147 (67) | 1,091,130 | 6,485,270 | 640,600
2 | 1,221,956 | 1,491 (1) | 382 | 259 (1) | 970,832 | 3,329,931 | 573,500
3 | 1,199,268 | 0 (0) | 241 | 242 (2) | 833,140 | 2,317,361 | 127,215
4 | 473,140 | 0 (0) | 27 | 27 (4) | 308,079 | 595,720 | 38,675
5 | 465,389 | 392 (149) | 241 | 227 (74) | 375,444 | 1,003,463 | 15,222
6 | 423,107 | 81 (46) | 104 | 13 (13) | 280,045 | 1,303,112 | 63,856
7 | 272,557 | 0 (0) | 27 | 39 (39) | 272,559 | 600,876 | 16,368
8 | 99,020 | 43 (11) | 399 | 3 (2) | 75,742 | 210,831 | 13,589
[Figure 4: Number of poisoning transfers from each attack group (Groups 1-7) over time, on a weekly basis, Dec 2022 to Jun 2024.]
(but with different addresses). Given that the contract bytecode
changes even with a slight alteration in the source code, the
high number of identical contracts corroborates the validity
of our clustering techniques (especially since our clustering
does not rely on those contracts).
We also investigate the number of poisoning transfers over
time. Rather than focusing on the aggregate number, we
present the number of attacks from each group. Figure 4
illustrates the number of poisoning transfers (not transactions)
per week from the top seven groups. We first observe attacks in Dec. 2022, starting with Group 7. Early in 2023, Groups 3 and 4 enter the market, and Group 1 arrives around Feb. 2023. As we will see in Table 4, we could not confirm an
early mover advantage (i.e., later comers are still profitable).
The variation in the attack strategies and the active periods
across attack groups further supports our clustering results.
6 Analysis
We next analyze the detection outcome to identify the popu-
lation(s) targeted, illuminate attacker strategies, investigate success conditions, and estimate attack profitability.
[Figure 5: CDF of targeted victims' stablecoin balance (total balance in USD, log-scale x-axis) for the top 7 attack groups and the baseline.]
6.1 Targeted population
We first look at victim profiles, so that we can better under-
stand who the attackers target, and eventually propose de-
fenses. To make the attack more profitable, we conjecture that
attackers are more likely to target rich addresses. As different
attack groups may employ different strategies, we look at vic-
tims on a group-by-group basis on Ethereum. For this purpose,
we maintain an Ethereum archive node that retains all histor-
ical account states. Specifically, we randomly sample 1,000
victims from each of the top seven attack groups. For each
victim, we retrieve the sum of their three stablecoins (USDT,
USDC, DAI) balance just before the victim receives the
first poisoning transfer from each attack group. We also create
a baseline group by randomly sampling 1,000 USDT users
and looking up their balance before the most recent transfer.
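For illustration, the sketch below (our own, assuming a go-ethereum client pointed at an archive node) retrieves an ERC-20 balance at a historical block by calling balanceOf(address) with an explicit block number; the token and holder values are placeholders.

```go
package analysis

import (
	"context"
	"math/big"

	"github.com/ethereum/go-ethereum"
	"github.com/ethereum/go-ethereum/common"
	"github.com/ethereum/go-ethereum/ethclient"
)

// erc20BalanceAt returns the ERC-20 balance of `holder` at a historical
// block by issuing an eth_call against an archive node.
func erc20BalanceAt(client *ethclient.Client, token, holder common.Address, block *big.Int) (*big.Int, error) {
	// 0x70a08231 is the 4-byte selector of balanceOf(address).
	data := append(common.Hex2Bytes("70a08231"), common.LeftPadBytes(holder.Bytes(), 32)...)
	out, err := client.CallContract(context.Background(),
		ethereum.CallMsg{To: &token, Data: data}, block)
	if err != nil {
		return nil, err
	}
	return new(big.Int).SetBytes(out), nil
}
```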
Figure 5 illustrates the cumulative distribution function (CDF)
of the total stablecoin balance for the victims in each of the
seven attack groups and the baseline USDT users. All attack
groups tend to target addresses that have significantly more
stablecoins than regular USDT users. Group 6 particularly targets rich accounts, given that most of its victims own more than 10,000 USD at the time of the attack. Targeting rich
accounts seems successful since we detect 83 payoff transfers
with more than 100,000 USD.
Attackers also appear to target addresses that are active or
have transferred a large amount. To verify this, we sample
three sets of 100,000 accounts: 1) the most active USDT users
in terms of number of transfers, 2) the users who have made
the largest USDT transfers and 3) a baseline of randomly
selected USDT users. We calculate the number of attacks
each user receives. Figure 6 shows the CDF of the number of attacks for the three groups. Active users and large-transfer users attract significantly more attacks than other users.
Within the targeted victim population, we confirm that the
larger the number (or the amount) of transfers they are in-
volved in, the more they are targeted. Specifically, for every
victim, Spearman's rank correlation (robust to outliers) between the number of stablecoin transfers a victim has made and the number of attacks received is ρ = 0.7.
[Figure 6: CDF of the number of attacks received (log-scale x-axis) for three groups: active users, large-transfer users, and the baseline. The further a curve is to the right, the more that group is targeted.]
Spearman's rank correlation between the maximum amount of a transfer and the number of attacks received is ρ = 0.45. All of those results support that attackers appear to target users with 1) high activity, 2) large transfers, and 3) high balance.
6.2 Attack strategies
We next infer attack strategies. In particular, we investigate how attackers select which (and how similar) lookalike addresses to generate, and whether they target addresses across multiple chains.
First, we discover that the attacker tends to imitate addresses (R) from centralized entities. By manually looking at the list of addresses in R on Ethereum that the attacker imitates the most to generate L, we find that those addresses often belong to centralized exchanges (e.g., Coinbase, Binance, Bybit), probably because they are among the most active accounts. However, most of these addresses are hot wallets, with which normal users typically do not interact. Indeed, we do not find any successful attack involving those hot wallets. This appears to be a waste of time and money, as excluding such addresses would reduce address generation and transaction costs. We provide more detailed information in Appendix E.
Second, we observe distinctive patterns in generating lookalike addresses. Figure 7 illustrates the distributions of lookalike addresses based on the number of digits matched at the beginning of the string (x-axis) and at the end (y-axis) for Group 1 (left) and all other groups (right). The bubble size represents the number of lookalike addresses. The difficulty of matching addresses increases by a factor of 16 for each additional hexadecimal digit (see details in §7). For most of the groups (the right figure), the bubble size decreases as x and y increase, forming a triangle shape starting from (x, y) = (3, 4), as expected. For Group 1, we observe an increase in the number of lookalike addresses starting from (x, y) = (7, 6) (i.e., a second triangle), indicating that Group 1 seems to use two different address-generation methods. One possible conjecture is that they target different services. Most wallet software (e.g., MetaMask, Phantom Wallet) used to show only the first and last 3-5 characters, while blockchain scan services (e.g., Etherscan) used to display the first and last 6-7 characters.
Figure 8 depicts the distributions of lookalike address similarity for the top four groups.
[Figure 7: Number of lookalike addresses for which the first x and the last y digits match the intended recipient address; (a) Group 1, (b) other groups.]
[Figure 8: Distributions of the lookalike address similarity (number of matched digits) for the top 4 attack groups; y-axis on a log scale.]
While most groups generate addresses with a maximum of 14 matching digits, Group 1
manages to produce addresses with 20 matching digits, which
requires higher computational power by several orders of
magnitude. The difference in distributions also suggests that
different attack groups may have employed different infras-
tructures, which we will investigate in §7.2.
Finally, we investigate whether any attack group re-uses
lookalike addresses across different chains. For any EVM-
compatible chains, users can use the same wallet addresses
across different chains. Attackers can reduce computation es-
pecially as the most active accounts frequently hold (ERC-20
and BEP-20) tokens across chains. We observe that attack-
ers re-use 16,903 lookalike addresses and target the same
107,542 victims across the two chains. In particular, Group 4 re-uses 21, Group 6 re-uses 317, and Group 7 re-uses 14,778 lookalike addresses between Ethereum and BSC, while the
other groups do not. These results imply that the attack can
easily scale up to any EVM chain. More generally, blockchain
address poisoning can affect any blockchain whose addresses
are represented by strings or objects complex enough that a
user cannot easily spot minor differences. For example, there
have been attacks on blockchains, such as Tron,⁹ which do not use a hexadecimal addressing system.
9 https://support.token.im/hc/en-us/articles/12967949725593-Security-Alert-0-USDT-transfer-scam
[Figure 9: Success probability vs. address similarity, where the first x and the last y digits match the intended recipient address.]
6.3 When is the attack successful?
We next look into which conditions are satisfied for an attack
to be successful. We test our hypothesis that the more similar
the lookalike address is to the intended recipient address,
the more likely the victim will make a mistake. In Figure 9,
the bubble size represents the success ratio (the number of successes relative to the number of lookalike addresses) depending on similarity. We do not consider the number of attack attempts for each victim. While (x, y) = (7, 10) and (x, y) = (9, 8) have high success rates, there is no strong linear correlation between similarity and success ratio in general. This result suggests that it may have been more profitable for attackers to generate a larger number of slightly less convincing lookalike addresses and conduct more attacks.
We next focus on successful cases and examine how address similarity (or attack timing) relates to success. Figure 10 (left) shows the number of lookalike addresses targeting each victim. In 79% of the successful attacks on Ethereum, multiple lookalike addresses target one victim, indicating competition among attackers. This competition allows us to investigate how a victim selects one particular lookalike address (i.e., a winner Lwin) over a pool of other lookalike addresses (i.e., the victims' perception). We focus on two aspects: how similar the winner Lwin is to R, and how early the winning address appears, compared to other lookalike addresses. For each successful attack, we extract the list of corresponding lookalike addresses L, then check how Lwin—the recipient of the payoff transfer—ranks in these two metrics. A smaller rank implies a higher similarity and an earlier attack, respectively. Figure 10 shows that, while the number of competitors (left) decreases somewhat linearly, a disproportionately large proportion of the most similar addresses (center) and/or early-comers (right) tend to win the competition. In other words, in times of stronger competition, being the first and/or having a lookalike address most similar to the intended recipient seems to yield measurable benefits.
While those results provide insights from chain activities,
the success also relies on what wallets or chain scanning
services victims use. Appendix 8 compares how poisoning attacks appear in those services' UI and qualitatively discusses when and where victims are most likely to fall for the attack.
[Figure 10: Histograms of the number of lookalike addresses per victim (left), and the ranking of Lwin in terms of similarity (middle) and timing (right).]
Table 4: Attack group profitability
Group | Nr. Successful Attacks | Revenue (USD) | Total Cost (USD) | Profit (USD)
1 | 440 | 12,494,248 | 8,368,313 | 4,125,935
2 | 363 | 29,043,402 | 2,704,455 | 26,338,947
3 | 277 | 9,415,636 | 2,546,143 | 6,869,493
4 | 109 | 1,187,927 | 504,628 | 683,299
5 | 48 | 4,837,606 | 202,868 | 4,634,738
6 | 125 | 9,023,888 | 682,016 | 8,341,871
7 | 156 | 4,572,499 | 226,418 | 4,346,081
8 | 7 | 103,417 | 68,123 | 35,294
6.4 Benefit-cost analysis for attack groups
We next evaluate the attack profitability at the group level.
Table 4 illustrates 1) the number of successful attacks, 2) the total revenue, 3) the total cost, and 4) the total profit for the top eight attack groups. The revenue is the total amount of payoff transfers (i.e., phished amount) in USD. The cost is 1) the transferred value from the tiny transfer attacks (sent from L to V) and 2) the transaction fees for all poisoning transfers.
We defer the discussion of the address generation costs to
§7.2, and focus here on transaction-related costs.
To accurately reflect the price fluctuation, We use the ETH-
USD conversion rates from the CoinGecko API for the day
of the transaction. The profit is the difference between the
revenue and the cost. To highlight attack profitability for all
attack groups, Figure 11 illustrates the relationship between
costs (
x
-axis) and revenues (
y
-axis) where the number is the
group ID; the graph is in log-log scale. We only include attack
groups with at least one payoff transfer. Revenues tend to rise
with cost, which is primarily dependent on the number of
poisoning transfers. The red line is the breakeven line—when
costs and revenues are the same. Most groups are above the
line, and thus profitable, which indicates strong incentives to
carry out blockchain address poisoning attacks.
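To make this accounting concrete, the following is a minimal sketch (our own illustration, with hypothetical record fields and a placeholder price_usd lookup standing in for the daily CoinGecko rate) of the per-group revenue, cost, and profit computation.

```python
# Minimal sketch (our illustration) of the benefit-cost accounting:
# revenue = sum of payoff transfers (USD); cost = tiny-transfer value + gas fees (USD).
from collections import defaultdict

def group_profits(payoffs, poisoning_transfers, price_usd):
    """payoffs: [{group, amount_usd}]; poisoning_transfers: [{group, kind, value_eth, fee_eth, date}].
    price_usd(date) stands in for a daily ETH-USD conversion rate (hypothetical helper)."""
    revenue, cost = defaultdict(float), defaultdict(float)
    for p in payoffs:
        revenue[p["group"]] += p["amount_usd"]
    for t in poisoning_transfers:
        rate = price_usd(t["date"])
        cost[t["group"]] += t["fee_eth"] * rate            # gas for every poisoning transfer
        if t["kind"] == "tiny":
            cost[t["group"]] += t["value_eth"] * rate      # value sent from L to V
    return {g: revenue[g] - cost[g] for g in set(revenue) | set(cost)}
```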
Figure 11: Attack revenue vs. cost; red line: y = x.

Figure 12: Win-loss ratio between attack groups.

In general, the top seven groups are largely profitable. Despite the high variance in payoff amounts shown in Table 2, those groups manage to amortize the revenue across many payoff transfers, while some small groups do not. Given the attack scale (e.g., Group 1 has invested over 8 million USD), some groups appear to be well-funded cyber-crime entities. However, none of the attack instances comes from any sanctioned entity on the SDN list [41].
We also investigate competition between groups. For each successful attack, we collect the set of lookalike addresses from each group and record which group wins over the others. Figure 12 shows the win-loss matrix; the color indicates the win ratio of the group on the y-axis against the group on the x-axis. The win-loss ratio among the top seven groups hovers around 50%. On the other hand, these top groups mostly win against the smaller attack groups (Groups 8–10), probably because larger groups tend to generate more (or more similar) lookalike addresses and launch more attack attempts.
In summary, while some attack strategies (e.g., targeting wealthy accounts, cross-chain attacks) are effective, others (e.g., imitating hot wallets, generating lookalike addresses with large d) are not always optimal, suggesting that the attack damage could have been even more severe.
7 Lookalike address generation
Figures 7 and 8 illustrate distinct distributions in lookalike address similarity between Group 1 and the other groups, suggesting vast differences in computational capabilities. We next mathematically formulate the address generation problem, and simulate and measure the number of addresses per second (APS) of various hardware and software implementations.
7.1 Mathematical formulation
We assume that the attacker wants to match the first a and the last b digits of the intended recipient and to maximize the total number of matching digits a + b = d, with no preference for a over b. For instance, the attacker is indifferent between (a, b) = (3, 6) or (4, 5) as long as (a ≥ 3, b ≥ 4) holds, per §4.1. The attacker has a set of target addresses (R) they want to imitate.
We assume that the probability of generating a given address follows a uniform random distribution over the entire address space, and that the number of target addresses |R| is much smaller than the total address space (16^40). If |R| = r, the probability of generating a target address is

p = 1 - ((16^d - 1) / 16^d)^r ,    (1)

which reduces to p = 1/16^d = 2^(-4d) when r = 1 (i.e., a single address is targeted).
The probability of finding a collision after k trials follows the geometric distribution P(X = k) = p (1 - p)^(k-1), and the expected number of trials until a collision is E[X] = 1/p. Using a first-order Taylor approximation in Eqn. (1) yields p ≈ r/16^d, and the resulting expected number of trials is E[X] ≈ 2^(4d)/r = 16^d/r. In short, the collision probability increases linearly with the size of the target address set r, and, accordingly, the expected number of trials decreases linearly with r.
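As a quick numeric illustration of these formulas (a sketch of our own, not part of our measurement code), the expected number of trials can be computed directly:

```python
# Sketch (our illustration): expected number of trials for a d-digit match
# against any of r targets, E[X] = 1/p with p = 1 - ((16^d - 1)/16^d)^r.
# Computed via log1p/expm1 to stay numerically stable for large d.
import math

def expected_trials(d: int, r: int) -> float:
    p = -math.expm1(r * math.log1p(-16.0 ** -d))   # p = 1 - (1 - 16^-d)^r
    return 1.0 / p

print(expected_trials(7, 1))        # ~2.7e8, i.e., 16^7
print(expected_trials(20, 10**6))   # ~1.2e18, close to the approximation 16^20 / 10^6
```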
7.2 Simulating address generation
We implement and run, on various combinations of hardware
and software, a brute-forcing process to simulate lookalike
address generation. This allows us to measure the number of
addresses per second (APS) that can be generated, which in
turn gives us a lower bound on the computational resources
attackers must possess to produce the lookalike addresses we
observed in the wild.
We implement 1) a naive version in Python using the web3py library,^10 and 2) an optimized version leveraging CPU multiprocessing in Python using the multiprocessing, coincurve,^11 and pycryptodome^12 libraries. In addition, we use two popular vanity address-generating tools: Vanity-ETH,^13 a Javascript-based web application, and Profanity2,^14 a C++-based GPU implementation. We elect, for simplicity, not to check whether the generated address is in the set of target addresses R, which produces an optimistic bound on performance, and thus a conservative bound on the attackers' implied hardware capabilities.
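For reference, a minimal sketch of the kind of CPU brute-force loop described above (our own illustration, not the exact code we benchmarked), using coincurve for key generation and pycryptodome's Keccak-256 for address derivation:

```python
# Sketch (our illustration) of one lookalike-generation loop: derive an Ethereum
# address from a random private key and check whether it matches the first `a`
# and last `b` hex digits of a target address.
import os
from coincurve import PrivateKey        # optimized secp256k1 bindings
from Crypto.Hash import keccak          # pycryptodome

def eth_address(priv: bytes) -> str:
    pub = PrivateKey(priv).public_key.format(compressed=False)[1:]  # 64-byte X||Y
    return keccak.new(digest_bits=256, data=pub).hexdigest()[-40:]

def find_lookalike(target: str, a: int = 3, b: int = 4):
    t = target.lower().removeprefix("0x")
    while True:
        priv = os.urandom(32)
        addr = eth_address(priv)
        if addr[:a] == t[:a] and addr[-b:] == t[-b:]:
            return priv.hex(), "0x" + addr
```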
We test our implementations on two hardware configurations: a MacBook Air (OS version 13.1) with an 8-core Apple M1 base chip, and a server workstation (Ubuntu 22.04 LTS) with an Intel(R) Xeon(R) Silver 4214 CPU and an Nvidia RTX 8000 GPU. We run our simulations on the CPU on both machines and on the GPU on the second machine. We summarize our simulation results in Table 5.
The naive Python implementation, which models what might be an expected "first attempt" by an attacker, produces on the order of 10^3–10^4 addresses per second. The more optimized implementation yields an order-of-magnitude improvement (10^4–10^5). The Vanity-ETH implementation is much slower than the standalone clients.
10 https://github.com/ethereum/web3.py
11 https://github.com/ofek/coincurve, Python bindings for a heavily optimized C library for elliptic curve operations.
12 https://github.com/Legrandin/pycryptodome/, the default hashing package used by web3py.
13 https://vanity-eth.tk/
14 https://github.com/1inch/profanity2
Table 5: Generated addresses per second (APS)

Implementation | Mac M1 | Server (CPU-only) | Server (GPU)
Python (naive) | 12,152 | 5,769 | –
Python (optimized) | 81,660 | 460,665 | –
Vanity-ETH | 4,800 | – | –
Profanity2 | – | – | 516,437,000
The CUDA implementation from Profanity2 is three orders of magnitude faster than our best CPU implementation, with roughly half a billion APS. The Profanity2 developers self-report about a billion APS on similarly priced consumer hardware. However, the overhead imposed by checking set inclusion in R should be much higher in the GPU implementation.
Based on the distribution of the matching digits d for each attack group, we next attempt to infer the attacker's hardware capability. We define two units, CPU-day and GPU-day: the number of addresses the attacker can generate in one day on one machine with a CPU and with a GPU, respectively. From the above benchmarks, 1 CPU-day is 3.98 × 10^10 addresses and 1 GPU-day is 4.46 × 10^13 addresses. In the process of producing a d-digit match, the attacker can generate many addresses matching d' < d digits in R. Therefore, estimating the amount of computation needed to generate the maximum match d for each group suffices.
As shown in Figure 8, Group 1 produces a maximum 20-digit match while other groups (Groups 2, 3, and 4) achieve 14-digit matches. We calculate the number of expected trials for d = 20 and, based on Table 3, r = 10^6. We estimate that Group 1 has used 3.0 × 10^7 CPU-days or 27,093 GPU-days. Group 1 would have needed to run over 41,000 CPUs around the clock over our whole two years of measurements to generate their lookalike addresses. We thus suspect they perform a substantial amount of their computation on GPUs. Group 1 could also have developed specialized hardware for address generation (e.g., ASICs), which would further reduce generation time. Other groups (d = 14) only used 1.81 CPU-days, or 1.6 × 10^-3 GPU-days.
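These figures follow directly from the benchmarks above; the following back-of-the-envelope check (our own sketch) reproduces them:

```python
# Back-of-the-envelope check (our sketch) of the CPU-day / GPU-day estimates.
CPU_APS, GPU_APS = 460_665, 516_437_000                  # best CPU and GPU rates (Table 5)
CPU_DAY, GPU_DAY = CPU_APS * 86_400, GPU_APS * 86_400    # ~3.98e10 and ~4.46e13 addresses

trials_d20 = 16 ** 20 / 10 ** 6     # expected trials for d = 20, r = 1e6
print(trials_d20 / CPU_DAY)         # ~3.0e7 CPU-days
print(trials_d20 / GPU_DAY)         # ~2.7e4 GPU-days

trials_d14 = 16 ** 14 / 10 ** 6     # expected trials for d = 14, r = 1e6
print(trials_d14 / CPU_DAY)         # ~1.81 CPU-days
print(trials_d14 / GPU_DAY)         # ~1.6e-3 GPU-days
```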
To get a rough idea of the economic cost of computation, we use the AWS pricing model for similar GPUs and CPUs.^15 For GPUs, the hourly rate (yearly plan) for g4dn.16xlarge (NVIDIA T4 Tensor) is $2.612, i.e., $62.69 for one GPU-day. For CPUs, the on-demand hourly rate for r6i.4xlarge is $1.008, i.e., $24.19 for one CPU-day. Using these numbers, we estimate that Group 1 would have paid (at most) 1.7M USD to generate a 20-digit match, which is well below its 4M USD profit described in Table 4. Furthermore, these numbers are likely higher than the costs incurred by self-hosting hardware. For the other groups working with d = 14, CPU costs (1.81 × 24.19 ≈ 43 USD) are dominated by transaction costs.
15 GPUs: https://aws.amazon.com/ec2/instance-types/g4/; CPUs: https://aws.amazon.com/ec2/pricing/on-demand/
8 Mitigations
We next suggest a few countermeasures to mitigate the success
of blockchain address poisoning.
Protocol-level mitigations: One mitigation is to map human-readable strings to each address, like DNS does for IP addresses. Some entities have already adopted domain names for their wallet addresses through the Ethereum Name Service (ENS). Human-readable addresses would reduce address poisoning as well as accidental transfers (Appendix C) and enhance usability. ENS, however, fosters other security risks such as typosquatting, name hijacking (i.e., re-using expired domain names) [44], or name collisions [14].
Another protocol-level solution is to raise the cost of generating an address and make the attack economically less effective [29]. For instance, adding 1 ms of latency per generated address would cap lookalike generation at roughly 1,000 addresses per second. Verifiable delay functions [4] might be a good fit for this purpose. Alternatively, the chain could use a larger alphabet for address representation to increase the cost of finding lookalikes: e.g., Bitcoin uses Base58, so the cost of matching d characters is O(58^d) as opposed to O(16^d) in EVM-based chains. These additional costs could help render the attack economically impractical.
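For a rough sense of scale (our own arithmetic, using the d = 7 similarity threshold from §4.1): matching seven characters requires on the order of 16^7 ≈ 2.7 × 10^8 expected trials in hexadecimal, versus 58^7 ≈ 2.2 × 10^12 in Base58, i.e., roughly 8,000 times more work for the same number of matching characters.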
Contract-level mitigations: One solution is to modify the major stablecoin contracts and require the token sender's permission even when the transfer amount is zero. Such a function modification requires each ERC-20 token issuer to upgrade its smart contract, despite limited incentives for developers. This solution only mitigates zero-value transfers, but not tiny and counterfeit token transfers. In addition, token owners can quickly blacklist the attacker's address. For the largest payoff transfer we observe, the Tether team (USDT) blocked the attacker's address within an hour, which prevented the attacker from moving the retrieved funds to another address.^16
Wallet-level mitigations: Blockchain wallets or chain scanning services can improve usability. For example, they can show more of the address to make lookalike addresses easier to spot. They can also hide suspected poisoning transfers (e.g., using our detection algorithms) from the account history to prevent users from mistakenly selecting a lookalike, or at least require confirmation when sending money to a lookalike address.
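As one possible shape of such a wallet-side check, the sketch below (our own assumption of a heuristic, not a deployed feature) flags an incoming transfer whose counterparty shares a long prefix and suffix with an address the user has previously interacted with:

```python
# Sketch (our assumption of a wallet-side heuristic): flag a counterparty that
# looks like a known counterparty (shared prefix and suffix) but is not that
# address. The defaults mirror the a >= 3, b >= 4 similarity threshold we use.
def is_suspected_lookalike(counterparty: str, known_addresses, a: int = 3, b: int = 4) -> bool:
    c = counterparty.lower().removeprefix("0x")
    for known in known_addresses:
        k = known.lower().removeprefix("0x")
        if c != k and c[:a] == k[:a] and c[-b:] == k[-b:]:
            return True   # likely poisoning: hide the transfer or require confirmation
    return False
```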
User-level mitigations: Finally, users are the last line of de-
fense. They could, for instance, be trained to engage in safer
practices, e.g., building an allow-list of trusted addresses, or in-
stalling third-party extensions to flag phishing addresses [47].
This, however, adds another layer of trust, relies on the ex-
tension’s effectiveness, and remains potentially vulnerable to
anti-detection strategies [23].
16 Transaction blocking L: 0x2675aaf5db84f9f966c431ff82f2173d3251020b11b4939b4961ebedf92c78dd.
9 Related work
The surge in blockchain popularity exposes users to a range
of cryptocurrency scams and online misconduct. Examples
include phishing websites [12], giveaway scams [19], coun-
terfeit tokens and rug pulls [8,43,46], market manipula-
tion [11,18,45], exchange/marketplace scams [20,35], and
Ponzi schemes [39]. The above attacks exploit 1) the absence
of an intermediary [22], 2) pseudonymous payments for scam-
mers or criminals [5,30], 3) large price movements attracting
inexperienced investors [15,31] and inviting mischief [17,36].
We highlight work particularly related to our discussion,
with a focus on attack scale. Gao et al. [8] identified 2,117
counterfeit tokens by applying keyword matching to the 100
most popular tokens. The authors investigated two techniques
involving counterfeit tokens (airdrop and arbitrage), leading to
total financial losses exceeding 17M USD over 7,000 victims.
Xia et al. [43] identified around 10,000 counterfeit tokens on
Uniswap (50% of all Uniswap tokens) through ground truth
labels provided by the platform, address clustering, and ma-
chine learning-based techniques. They estimated these coun-
terfeit tokens collectively led to financial losses of 16M USD
from approximately 40,000 victims. He et al. [12] introduced
TxPhishScope, a tool designed to detect blockchain phishing
websites. The authors identified 26,333 phishing websites
with 3,486 phishing blockchain accounts. Other studies on
free giveaway scams reported 24M–69.9M USD losses [19],
including 872K USD losses from Twitter [17]. Compared with these other works, blockchain address poisoning appears to be one of the largest (in terms of aggregate losses) and most widely targeted criminal activities taking place on-chain.
Closest to our efforts, Ye et al. [46] examined three types of blockchain scams, one of which is address poisoning on Ethereum with zero-value transfers. In parallel with, and contemporary to, our work, Guan and Li [10] also studied address poisoning on Ethereum and discuss all three types of poisoning transfers. We provide a detailed comparison with their method and results in §4.1, §5.2, and Appendix A. A Chainalysis blog post [34] conducted a case study (a 68M USD WBTC loss) and analyzed its related addresses (victims and lookalike addresses).
10 Conclusion
We studied blockchain address poisoning, a form of phishing that exploits visual similarities in the hexadecimal wallet addresses used in EVM-compatible chains (e.g., Ethereum, BSC). We implemented a detection system, identified large attack groups and their strategies, and inferred their capabilities through simulation. We showed that blockchain address poisoning presents a considerable threat: in total, 270M attack attempts, 50M lookalike (malicious) addresses, and, more chillingly, nearly 84M USD in monetary losses (from more than 6,600 cases) on Ethereum and BSC. Worse, damages could be even higher if attackers optimized their strategies.
11 Ethical considerations
The concept of blockchain address poisoning is already the
subject of some online discussions, and, as this paper shows,
this issue is currently extensively exploited in the field. As
such, wallet operators and (some) users are most likely already
aware of at least the theoretical possibility of the attack. What
our work does, though, is to demonstrate its practical severity
in economic terms and we hope it will be a call to arms to
act sooner rather than later.
While attackers could potentially misuse some of our find-
ings to improve their strategies, besides raising community
awareness, the main contribution of our paper is to extensively
describe the attack mechanisms. We also introduce possible
protection strategies. Our proposed mitigations can hopefully
inspire wallet operators. Likewise, we hope that our work can
help users recognize when they are in targeted populations,
and accordingly, take precautionary measures.
Finally, Appendix G, specifically Table 9, includes full
transaction hashes, which could potentially be used to re-
identify the wallet addresses of the potential victims (whom
we do not directly list in the table). We decided to present the
table as is, as obfuscating part of the transaction hash would
impede reproducibility or external validation of our findings,
for little to no privacy benefit: users of Ethereum (or BSC)
are relying on a public blockchain, in which all records are
publicly available. Furthermore, wallet addresses are not, on
their own, considered personally identifiable information, as
1) there is no one-to-one mapping to specific individuals, as
a given person may hold several wallets, and a single wallet
may be shared by multiple people (e.g., belonging to the
same organization), and 2) a wallet address does not, in itself,
contain information directly mapping to a real-world identity.
Acknowledgments
This research was partially supported by a Sui Founda-
tion Academic Grant, the Carnegie Mellon CyLab’s Secure
Blockchain Initiative, and the Nakajima Foundation.
References
[1]
1inch Network. A vulnerability disclosed in Profanity, an Ethereum vanity address tool. https://blog.1inch.io/a-vulnerability-disclosed-in-profanity-an-ethereum-vanity-address-tool/, 2022. Accessed Aug. 12th, 2024.
[2]
Pieter Agten, Wouter Joosen, Frank Piessens, and Nick
Nikiforakis. Seven months’ worth of mistakes: A lon-
gitudinal study of typosquatting abuse. In Proceedings
of the 22nd Network and Distributed System Security
Symposium (NDSS 2015). Internet Society, 2015.
[3]
Hugo Bijmans, Tim Booij, Anneke Schwedersky, Aria
Nedgabat, and Rolf van Wegberg. Catching phishers
by their bait: Investigating the dutch phishing landscape
through phishing kit detection. In 30th USENIX security
symposium (USENIX security 21), pages 3757–3774,
2021.
[4]
Dan Boneh, Joseph Bonneau, Benedikt Bünz, and Ben
Fisch. Verifiable delay functions. In Annual interna-
tional cryptology conference, pages 757–788. Springer,
2018.
[5]
Nicolas Christin. Traveling the silk road: A measure-
ment analysis of a large anonymous online marketplace.
In Proceedings of the 22nd international conference on
World Wide Web, pages 213–224, 2013.
[6]
Fred J Damerau. A technique for computer detection
and correction of spelling errors. Communications of
the ACM, 7(3):171–176, 1964.
[7]
Evgeniy Gabrilovich and Alex Gontmakher. The homo-
graph attack. Communications of the ACM, 45(2):128,
2002.
[8]
Bingyu Gao, Haoyu Wang, Pengcheng Xia, Siwei Wu,
Yajin Zhou, Xiapu Luo, and Gareth Tyson. Tracking
counterfeit cryptocurrency end-to-end. Proceedings of
the ACM on Measurement and Analysis of Computing
Systems, 4(3):1–28, 2020.
[9]
Chris Grier, Kurt Thomas, Vern Paxson, and Michael
Zhang. @spam: the underground on 140 characters or
less. In Proceedings of the 17th ACM conference on
Computer and communications security, pages 27–37,
2010.
[10]
Shixuan Guan and Kai Li. Characterizing ethereum
address poisoning attack. In Proceedings of the 2024 on
ACM SIGSAC Conference on Computer and Communi-
cations Security, pages 986–1000, 2024.
[11]
JT Hamrick, Farhang Rouhi, Arghya Mukherjee, Amir
Feder, Neil Gandal, Tyler Moore, and Marie Vasek. The
economics of cryptocurrency pump and dump schemes.
Available at SSRN 3310307, 2018.
[12]
Bowen He, Yuan Chen, Zhuo Chen, Xiaohui Hu, Yufeng
Hu, Lei Wu, Rui Chang, Haoyu Wang, and Yajin Zhou.
Txphishscope: Towards detecting and understanding
transaction-based phishing on ethereum. In Proceedings
of the 2023 ACM SIGSAC Conference on Computer and
Communications Security, pages 120–134, 2023.
[13]
Tobias Holgers, David E Watson, and Steven D Gribble.
Cutting through the confusion: A measurement study
of homograph attacks. In USENIX Annual Technical
Conference, General Track, pages 261–266, 2006.
[14]
Daiki Ito, Yuta Takata, Hiroshi Kumagai, and Masaki
Kamizono. Investigations of top-level domain name col-
lisions in blockchain naming services. In Proceedings
of the ACM on Web Conference 2024, pages 2926–2935,
2024.
[15]
Daisuke Kawai, Bryan Routledge, Kyle Soska, Ariel
Zetlin-Jones, and Nicolas Christin. User participation
in cryptocurrency derivative markets. In 5th Con-
ference on Advances in Financial Technologies (AFT
2023). Schloss-Dagstuhl-Leibniz Zentrum für Infor-
matik, 2023.
[16]
Panagiotis Kintis, Najmeh Miramirkhani, Charles
Lever, Yizheng Chen, Rosa Romero-Gómez, Nikolaos
Pitropakis, Nick Nikiforakis, and Manos Antonakakis.
Hiding in plain sight: A longitudinal study of com-
bosquatting abuse. In Proceedings of the 2017 ACM
SIGSAC Conference on Computer and Communications
Security, pages 569–586, 2017.
[17]
Kai Li, Darren Lee, and Shixuan Guan. Understanding
the cryptocurrency free giveaway scam disseminated on
twitter lists. In 2023 IEEE International Conference on
Blockchain (Blockchain), pages 9–16. IEEE, 2023.
[18]
Tao Li, Donghwa Shin, and Baolian Wang. Cryptocur-
rency pump-and-dump schemes. Available at SSRN
3267041, 2021.
[19]
Xigao Li, Anurag Yepuri, and Nick Nikiforakis. Double
and nothing: Understanding and detecting cryptocur-
rency giveaway scams. In Proceedings of the Network
and Distributed System Security Symposium (NDSS),
2023.
[20]
Tyler Moore and Nicolas Christin. Beware the middle-
man: Empirical analysis of bitcoin-exchange risk. In
International conference on financial cryptography and
data security, pages 25–33. Springer, 2013.
[21]
Tyler Moore and Benjamin Edelman. Measuring the
perpetrators and funders of typosquatting. In Interna-
tional Conference on Financial Cryptography and Data
Security, pages 175–191. Springer, 2010.
[22]
Satoshi Nakamoto. Bitcoin: A peer-to-peer electronic
cash system. Technical report, 2008.
[23]
Adam Oest, Yeganeh Safaei, Penghui Zhang, Brad Wardman, Kevin Tyers, Yan Shoshitaishvili, and Adam Doupé. PhishTime: Continuous longitudinal measurement of the effectiveness of anti-phishing blacklists. In 29th USENIX Security Symposium (USENIX Security 20), pages 379–396, 2020.
[24]
Adam Oest, Penghui Zhang, Brad Wardman, Eric Nunes, Jakub Burgis, Ali Zand, Kurt Thomas, Adam Doupé, and Gail-Joon Ahn. Sunrise to sunset: Analyzing the end-to-end life cycle and effectiveness of phishing attacks at scale. In 29th USENIX Security Symposium (USENIX Security 20), 2020.
[25]
Lim Yu Qian. Most popular crypto hot wallets for self-custody (2023), August 2023. https://www.coingecko.com/research/publications/most-popular-crypto-hot-wallets.
[26]
Kaihua Qin, Stefanos Chaliasos, Liyi Zhou, Benjamin
Livshits, Dawn Song, and Arthur Gervais. The
blockchain imitation game. In 32nd USENIX Security
Symposium (USENIX Security 23), pages 3961–3978,
2023.
[27]
Kaihua Qin, Liyi Zhou, and Arthur Gervais. Quantifying
blockchain extractable value: How dark is the forest? In
2022 IEEE Symposium on Security and Privacy (SP),
pages 198–214. IEEE, 2022.
[28]
Keith Rayner, Sarah J. White, Rebecca L. Johnson, and Simon P. Liversedge. Raeding wrods with jubmled lettres: There is a cost. Psychological Science, 17(3):192–193, 2006. doi: 10.1111/j.1467-9280.2006.01684.x.
[29]
Kyle Soska. Security Defender Advantages via Eco-
nomically Rational Adversary Modeling. PhD thesis,
Carnegie Mellon University, 2021.
[30]
Kyle Soska and Nicolas Christin. Measuring the lon-
gitudinal evolution of the online anonymous market-
place ecosystem. In 24th USENIX Security Symposium
(USENIX Security 15), pages 33–48, 2015.
[31]
Kyle Soska, Jin-Dong Dong, Alex Khodaverdian, Ariel
Zetlin-Jones, Bryan Routledge, and Nicolas Christin.
Towards understanding cryptocurrency derivatives: A
case study of bitmex. In Proceedings of the 30th Web
Conference (WWW’21). Ljubljana, Slovenia (online),
2021.
[32]
Janos Szurdi and Nicolas Christin. Email typosquat-
ting. In Proceedings of the 2017 internet measurement
conference, pages 419–431, 2017.
[33]
Janos Szurdi, Balazs Kocso, Gabor Cseh, Jonathan Spring, Mark Felegyhazi, and Chris Kanich. The long "taile" of typosquatting domain names. In 23rd USENIX Security Symposium (USENIX Security 14), pages 191–206, 2014.
[34]
Chainalysis Team. Anatomy of an address poisoning scam, October 2024. https://www.chainalysis.com/blog/address-poisoning-scam/.
[35]
Taro Tsuchiya, Alejandro Cuevas, and Nicolas Christin.
Identifying risky vendors in cryptocurrency p2p market-
places. In Proceedings of the ACM on Web Conference
2024, pages 99–110, 2024.
[36]
Taro Tsuchiya, Alejandro Cuevas, Thomas Magelinski,
and Nicolas Christin. Misbehavior and account suspen-
sion in an online financial communication platform. In
WWW’23: Proceedings of the ACM Web Conference
2023, Austin, TX, USA, 2023.
[37]
Amber Van Der Heijden and Luca Allodi. Cognitive
triaging of phishing attacks. In 28th USENIX Security
Symposium (USENIX Security 19), pages 1309–1326,
2019.
[38]
Zoltan Vardai. WBTC thief returns $71 million worth of stolen funds, May 2024. https://cointelegraph.com/news/wbtc-thief-returns-71-million.
[39]
Marie Vasek and Tyler Moore. Analyzing the bitcoin
ponzi scheme ecosystem. In International Conference
on Financial Cryptography and Data Security, pages
101–112. Springer, 2018.
[40]
Fabian Vogelsteller and Vitalik Buterin. ERC-20: Token standard, November 2015. https://eips.ethereum.org/EIPS/eip-20.
[41]
Anton Wahrstätter, Jens Ernstberger, Aviv Yaish, Liyi
Zhou, Kaihua Qin, Taro Tsuchiya, Sebastian Steinhorst,
Davor Svetinovic, Nicolas Christin, Mikolaj Barczen-
tewicz, et al. Blockchain censorship. WWW’24: Pro-
ceedings of the ACM Web Conference 2024, Singapore,
2024.
[42]
Yunpeng Wang, Jin Yang, Tao Li, Fangdong Zhu, and Xi-
aojun Zhou. Anti-dust: a method for identifying and pre-
venting blockchain’s dust attacks. In 2018 international
conference on information systems and computer aided
education (ICISCAE), pages 274–280. IEEE, 2018.
[43]
Pengcheng Xia, Haoyu Wang, Bingyu Gao, Weihang Su,
Zhou Yu, Xiapu Luo, Chao Zhang, Xusheng Xiao, and
Guoai Xu. Trade or trick? detecting and characterizing
scam tokens on uniswap decentralized exchange. Pro-
ceedings of the ACM on Measurement and Analysis of
Computing Systems, 5(3):1–26, 2021.
[44]
Pengcheng Xia, Haoyu Wang, Zhou Yu, Xinyu Liu, Xi-
apu Luo, Guoai Xu, and Gareth Tyson. Challenges in
decentralized name management: the case of ens. In
Proceedings of the 22nd ACM Internet Measurement
Conference, pages 65–82, 2022.
[45]
Jiahua Xu and Benjamin Livshits. The anatomy of a cryptocurrency pump-and-dump scheme. In 28th USENIX Security Symposium, pages 1609–1625, 2019.
[46]
Guoyi Ye, Geng Hong, Yuan Zhang, and Min Yang. In-
terface illusions: Uncovering the rise of visual scams in
cryptocurrency wallets. 2024.
[47]
Yaman Yu, Tanusree Sharma, Sauvik Das, and Yang Wang. "Don't put all your eggs in one basket": How cryptocurrency users choose and secure their wallets. In Proceedings of the CHI Conference on Human Factors in Computing Systems, pages 1–17, 2024.
A Comparison with Guan and Li [10]
This section compares our detection performance to the most recent work by Guan and Li [10]. To align with their observation period, we only include our measurement results from Nov. 2022 to Feb. 2024 on Ethereum. Table 6 summarizes the results.
First, the total amount of payoff transfers is similar, corroborating the validity of both approaches. Second, both approaches estimate similar attack costs in ETH, but not in USD. Guan and Li use a fixed ETH-to-USD conversion rate (around 3,200 USD), whereas we query the price at the time of the attack. The exchange rate, however, fluctuated between 1,000 USD and 2,000 USD for most of the observation period, leading them to overestimate attack costs by over 10 million USD, which results in a difference of about 300% in the attacker's return on investment (RoI). Third, we detect over 2.2 million more counterfeit token transfers. Guan and Li do not report tokens whose symbols do not exactly match USDT or USDC. For example, out of 5,838 counterfeit tokens, only 1,922 use the exact same token symbols, whereas at least 740 tokens use different encodings of USDT or USDC (e.g., using a Cyrillic "S" and/or "T" in "USDT"). Likewise, attackers can also evade their detection if they change the attack characteristics, e.g., launching zero-value transfers without attack contracts. Our results suggest that filtering transfers based on address similarity first is more robust to attacker manipulation (fewer false negatives) than filtering by attack characteristics. On the other hand, compared to Guan and Li [10], our technique misses attacks where fewer than 7 digits match the address L.
In addition to the methodological differences, our work
has several distinctions. First, we have better coverage in
measurement (two years, two chains). We find 270M attack
attempts (compared to 14M), highlighting significant chain
waste and user experience degradation.
Table 6: Comparison of detection performance

 | Payoff transfer | Cost ETH (USD) | ROI | Tiny | Zero-value | Counterfeit token | CT
Guan and Li [10] | 76.79M | 7918.2 (25.5M) | 2.2 | 277.3K | 7.85M | 6.64M | 2,333
Ours | 76.81M | 8202 (14.9M) | 5.1 | 283.9K | 7.05M | 8.87M | 5,838
Second, although Guan and Li also perform attacker clustering, they appear to have overestimated group sizes due to erroneous merges caused by copying bots, as we show in §5.3. We carefully remove bots and extract a few large attack entities.
Last but not least, most of our analyses are novel. We analyze victim characteristics, success conditions, group competition, cross-chain attacks, and the modeling of address generation for the first time. We provide deeper insights into how attackers perform attacks and what capabilities they have.
B Birthday Paradox
This section investigates whether our detection system mislabels poisoning transfers when the victim interacts with two similar-looking addresses by coincidence (i.e., due to the "birthday paradox"). We define the number of accounts the victim interacts with as r = |R|, and the likelihood of encountering, among these accounts, by pure chance, two addresses with the first three and the last four characters matching as p, a "collision probability." By the birthday paradox, we can mathematically relate those two variables with p ≈ 1 - e^(-r^2 / (2 × 16^(3+4))). For instance, when the victim interacts with r = 19,290 accounts, the probability p becomes 0.5.
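A quick numeric check of this relation (our own sketch):

```python
# Sketch (our illustration): birthday-paradox collision probability
# p ~= 1 - exp(-r^2 / (2 * 16^7)) for a (first 3, last 4)-character match.
import math

def collision_prob(r: int) -> float:
    return 1.0 - math.exp(-r * r / (2 * 16 ** 7))

print(collision_prob(19_290))   # ~0.5, matching the example above
```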
Here, we examine whether our system should exclude the set of victims that present a collision probability greater than a given threshold a, i.e., p ≥ a. We denote this set of victims by Vp≥a. Figure 13(a) shows the number of victims (|Vp>a|) that present a collision probability greater than a, on the x-axis. Even at a small threshold a = 0.01, we only find 664 accounts (0.04% of total victims). At first glance, excluding Vp≥a does not seem to impact our detection performance. However, we find that excluding these accounts results in numerous false negatives, because attackers purposefully target very active accounts, as shown in §6.1. Figure 13(b) illustrates the number of attacks we would report if we excluded attacks on Vp>a. For example, we would discover 16.2 million poisoning transfers at a = 0.01, which is 1 million fewer than at a = 0.99. Moreover, we find that most of those detected attacks are indeed poisoning transfers: 99.7% of attacks targeting Vp>0.99 are launched by attack accounts A that have also targeted victims in Vp<0.01 (i.e., the population that is hardly susceptible to a collision).
False positives on payoff transfers are even more unlikely, as they require all of the following: 1) the victim interacts with two similar addresses, 2) one of the similar addresses "unintentionally" performs the poisoning transfer (e.g., testing with a zero transfer), and 3) the intended transfer, the poisoning transfer, and the payoff transfer happen in that exact order. As shown in §5.2, we do not find any false positives in the top 30 cases for either chain.
These results show that our system is robust against spurious collisions and that setting a low threshold reduces the performance of our detection system. Hence, we set the threshold to a = 0.999 to exclude victims that interact with a large enough number of addresses to be extremely susceptible to collisions, while capturing nearly all attacks. Setting such a high threshold only excludes, for instance, 64 Ethereum accounts.
Figure 13: The number of victims that are susceptible to a collision and the number of attacks detected when we exclude those victims. (a) Number of accounts (y-axis) with a probability of collision greater than a given threshold (x-axis). (b) Number of poisoning transfers detected at a given collision probability threshold. The x-axis does not start at 0.
C Accidental transfers
Victims could accidentally send tokens to an incorrect, but not malicious, destination address due to a typing mistake. We focus on cases in which two addresses (legitimate and lookalike) match more than 20 (out of 40) characters, regardless of position. Given current hardware capabilities, generating a lookalike address with more than 20 matching digits is computationally unlikely (see §7 for details). Furthermore, if users mistakenly specify an incorrect destination address, tokens are probably not recoverable, since it is very unlikely that anybody possesses the private key of the mistyped address. We leverage this property and check whether one of the recipient addresses has ever sent any authentic tokens;^17 if it has not, we hypothesize that the transfer is an accidental transfer, which we categorize separately from payoff transfers.
17 To be more precise, we exclude zero-value transfers or counterfeit token transfers, which can be launched without private keys.

Measurement results. We identify 363 accidental transfers (i.e., not the result of a deliberate attack) from 295 victims and 336 mistyped destination addresses, totaling a 5.5M USD loss on Ethereum. This high number suggests that some victims have sent assets to the same mistyped address multiple times, or made typing mistakes multiple times over different intended addresses. We observe a similar number of accidental transfers on BSC: 318 cases from 206 victims and 241 mistyped addresses, for a total of 57,562 USD lost. By manually looking at accidental transfers, we confirmed that those transfers appear to be typing mistakes, and we noticed common human errors. For instance, users 1) mistype a character with a neighboring key (i.e., low fat-finger distance [21]): "4" vs. "3," or "3" vs. "e"; 2) swap the order of two characters; or 3) mistype a character for something visually similar [32]: "b" vs. "d," "e" vs. "c," or "6" vs. "b." To quantitatively assess
typing mistakes, we calculated the Damerau-Levenshtein distance (i.e., the number of operations necessary to transform one string into the other, including transpositions) between the two addresses. As expected, the distance is small: 84.95% of typos have a distance of 1, and 96.69% have a distance of at most 2 on Ethereum. We obtain similar results on BSC (86.48% and 94.34% for distances of 1 and at most 2, respectively).
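For reference, a minimal sketch (our own, using the optimal string alignment variant) of the edit-distance computation with transpositions described above:

```python
# Sketch (our illustration): Damerau-Levenshtein distance (optimal string
# alignment variant) counting insertions, deletions, substitutions, and
# transpositions of adjacent characters.
def edit_distance(s: str, t: str) -> int:
    n, m = len(s), len(t)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
            if i > 1 and j > 1 and s[i - 1] == t[j - 2] and s[i - 2] == t[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)   # transposition
    return d[n][m]

# Example: a single swapped pair of hex digits yields distance 1.
print(edit_distance("4f3a", "4f3a"), edit_distance("4f3a", "43fa"))   # 0 1
```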
D Detecting copying bots
Here we explain why we use the attack ratio to exclude bots
and show the robustness of our clustering results through
temporal clustering.
As discussed in §4.2, we group two poisoning transfers if they 1) are in the same TX, 2) are launched by the same attacker A, or 3) specify the same lookalike address L. Our clustering considers two groups to be the same if they share even one link. We notice that two seemingly distinct large groups (based on their strategies and behaviors) are connected by a few attack addresses A that behave differently from other attack addresses. We find that these appear to be copying bots that duplicate transactions from multiple attack groups. For instance, the account 0x7D575a7C732D1c502c07f18BB822D29CF7DBf9E8 copies not only address poisoning transactions^18 but also other transactions, such as those from MEV (Maximum Extractable Value) bots.^19 This type of copying behavior is relatively common on Ethereum and BSC, as discussed in previous work [26, 27]. Those bots seem to copy the original transactions without properly understanding address poisoning. Without changing the lookalike address L, the payoff transfer goes to the original attacker regardless of who initiates the attack. In other words, these bots are helping other attackers.
18 The original transaction: 0x725eaedf8857e243587020de97ed503fb8bd8899bcbe8c685cf57fbce6810cc5; the copy transaction: 0x5cc772d10b9e6fd61c7402967b5ae2ff6fcae7a2ca9bef0e9c1c6202c2ea257c.
19 The original transaction: 0x0720fec6980c196b3545d273da2b5b41987bf29524c6236cbf462d683df06057; the copy transaction: 0x7aec84ca51fd5bdfcecf871776287b6732e8b8f27097e8d690c090cf6e6865bf.

We next explain why we choose the attack ratio to detect copying bots over other methods (using the notion of time, or removing all duplicated transactions).
Using the notion of time. Copy transactions generally come after the original poisoning transactions (in terms of block number), so removing the second or later transactions could be effective. However, we notice that some copy transactions end up in the same block as the original transaction. This can be achieved by monitoring unconfirmed transactions in the peer-to-peer (P2P) network. For instance, the account 0xfffdcF2B3419C243A5eba5f051A64ad629362c9a manages to include copy transactions in the same block as the original transaction most of the time. Based on the public mempool data from Flashbots,^20 we realize that this account's transactions do not appear in the public P2P network, while the original transactions do. This suggests that some bots monitor the original transactions at the P2P layer and directly submit their transactions to block builders (to potentially front-run the original transaction). As a result, we cannot systematically determine whether the first transaction is the original or the copy.
Removing all duplicated transactions or transfers. While differentiating the original from the copy transactions remains impractical, we can simply remove all transactions involved in copying behavior (including the originals). If a bot simply copies the entire transaction, the data field in the transaction stays the same. While this method captures most of the copy transactions, we observe that some bots appear to slightly modify one of the parameters, potentially to avoid detection, yielding a different data field.
We can also exclude duplicated transfers that have the same (L, V, CT (or AC), and v (value)). This method appears to capture all the transactions from the bots. However, we realize that the attackers (not the copying bots) also re-send their poisoning transfers several times against the same victim. We would remove more than 11 million (62%) "duplicated" transfers with many false positives, which would significantly undermine the final clustering results (i.e., the size of the clusters becomes exceedingly small).
Removing potential bots by attack ratio. Using only information from transactions thus leads either to 1) the removal of too many non-bot transactions or 2) the inclusion of copying-bot transactions. We decide to retrieve additional information about the attack accounts A, and exclude those A with a low attack ratio, i.e., the ratio of transactions used for address poisoning, as introduced in §4.2.
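A minimal sketch of this filtering step (our own illustration, with hypothetical inputs):

```python
# Sketch (our illustration): compute each attack account's attack ratio
# (poisoning transactions / all transactions) and keep only accounts at or
# above the threshold; the rest are treated as copying bots and excluded.
def filter_copying_bots(total_tx, poisoning_tx, threshold=0.5):
    """total_tx / poisoning_tx: dicts mapping an attack account A to transaction counts."""
    kept = set()
    for account, total in total_tx.items():
        ratio = poisoning_tx.get(account, 0) / total if total else 0.0
        if ratio >= threshold:
            kept.add(account)
    return kept
```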
We aim to set the threshold as low as possible while removing all the copying bots. Figure 14 (left) illustrates the top 10 clusters when we vary the attack-ratio threshold from 0 (removing no transfer) to 1 (removing all transfers); a line indicates the same cluster. We manually confirm that all the erroneous merges we have spotted disappear with a threshold of 0.5; the largest cluster at threshold 0 breaks down into five different groups, which, as we show in §5.3, exhibit distinct behaviors. Figure 14 (right) shows the ratio of removed transfers over the total number of transfers (y-axis) for each threshold value (x-axis). Even setting a high threshold does not exclude many attacks, because attackers generally use A only for address poisoning. The attack addresses A with an attack ratio below 0.5 account for only 1.45% of all transfers (represented by the vertical line in Figure 14 (right)), which helps avoid removing too many attack instances and preserves the structure of the clusters. Groups remain stable up to a threshold of 0.7, but the group structure starts to collapse with a threshold above 0.8 (i.e., removing 6.64% of all transfers). Therefore, we choose a threshold of 0.5.

20 https://mempool-dumpster.flashbots.net/index.html
Figure 14: Left: the top 10 clusters (y-axis) based on the threshold (x-axis). Right: the ratio of removed transfers over the total number of transfers based on the threshold.
Finally, to further validate the robustness of our clustering, we perform temporal clustering. We apply our clustering algorithm to the data up to time x and gradually increase x over time to observe the change in the clustering results. Figure 15 shows how the top 10 clusters (based on the number of lookalike addresses) evolve over time, where a link indicates the same attack group. The left figure shows the result with a threshold of 0.5; the clusters appear to be stable over time. The right figure is the result without any threshold (no bot exclusion); some large clusters remain separate for a while, but are eventually connected around July and August 2023, forming a single large cluster due to the appearance of copying bots.
E Intended addresses
We focus here on the intended addresses R, i.e., the victims' recipients whom attackers try to impersonate. Table 7 lists the top ten intended addresses that attackers impersonate the most (based on the number of lookalike addresses L targeting each R). We label each address based on the third-party Arkham Intel website.^21 We also aggregate the number of attacks targeting each R. The table shows that many of the R are centralized exchange hot wallets. Since users do not directly interact with those hot wallets, we do not observe any successful cases for those addresses. Yet, those exchanges' deposit addresses frequently interact with those hot wallets, which may explain why the attackers' algorithms capture those intended addresses (as discussed in §6.2).
F UI Design of wallets and chain scanners
This section examines how address poisoning attacks appear in victims' wallets or chain scanners. In particular, we visit those services in Jan. 2025 and investigate how they have reacted to the threat of address poisoning.

21 https://intel.arkm.com/
Table 7: Top 10 intended addresses

Intended address R | Label | Num. of L | Num. of TR
0xA9D1e0...1d3E43 | Coinbase Hot Wallet | 143,764 | 18,229
0x28C6c0...f21d60 | Binance Hot Wallet | 26,373 | 16,147
0x974CaA...ECC400 | Stake.com Hot Wallet | 3,254 | 2,355
0xF7C8dA...38d921 | Devout Deposit | 5,242 | 2,014
0xf89d7b...5EaA40 | Bybit Hot Wallet | 4,556 | 1,702
0xBFCd86...23d6aD | HitBTC Hot Wallet | 3,216 | 1,569
0x731309...AB4D2a | BitGet Hot Wallet | 7,547 | 1,477
0x99870D...9AF1e2 | HitBTC Deposit | 9,700 | 1,449
0x3CC936...aeCF18 | MEXC Hot Wallet | 6,603 | 1,397
0x75e89d...1dcB88 | MEXC Hot Wallet | 1,629 | 1,257
To discover how victims see the attacks in their wallets and on chain scanners, we first implement address poisoning transfers and attack our own accounts on the Ethereum mainnet. While our attack uses minimal blockchain resources (three poisoning transfers), conducting it on the mainnet is critical because of the significant UI differences from the testnet. Specifically, we conducted our experiment on Dec. 20th, 2024, and used 159,364 gas for the three poisoning transfers, which consumes about 1% of one block (0.00014% of gas usage for a day).
We generate a lookalike address L that matches the first six and the last five characters of R. We follow the AC and CT implementations from attackers who have uploaded their contracts' source code on Etherscan, and perform three poisoning transfers on our victim account V. Our attack can be found under V: 0xB6e84DF1cE401117C450221ccc6EF502cb0e2284.
First, we check the victim's UI on blockchain wallets (as of Jan. 2nd, 2025). Based on wallet market share [25], we study MetaMask, Trust Wallet, and Phantom. Regarding address abbreviations (first, last), MetaMask uses (4, 4), Trust Wallet (2, 4), and Phantom (4, 4). Trust Wallet and Phantom only display tiny transfers, but not zero-value or counterfeit token transfers. MetaMask does not appear to show any ERC-20 token transfer history in the first place. When initiating transfers, users no longer have the option to select the recipient from "recently used addresses" and must manually specify it on their own. This suggests that victims may rely more on blockchain scanning services to find the recipients.
Second, we check the major blockchain scanning services: Etherscan, Ethplorer, OKLink, and Arkm (as of Jan. 2nd, 2025). Table 8 compares the UI of those blockchain scanners (tiny: T, zero-value: Z, and counterfeit token: C). The level of address abbreviation differs significantly across services. Etherscan shortens the beginning and the end equally, while others significantly abbreviate one side (or do not abbreviate at all). Since attackers mostly focus on the first and last characters, displaying one side in full might be effective. Next, we check the poisoning transfers.
Figure 15: Temporal clustering results. The x-axis is the time, and the y-axis is the number of clusters. (a) With a threshold of 0.5. (b) Without a threshold.
Table 8: UI comparison of blockchain scanners

Service | Addr. abbr. | Poisoning transfers shown (default) | Shown after setting change | Token icon
Etherscan | 8, 9 | T, C | Z | Yes
Ethplorer | No abbr. | T, Z, C | – | No
OKLink | 4, 11 | T, C | Z | Yes
Arkm | 36, 0 | T | C | Yes
By default, most services (except Ethplorer) successfully hide zero-value transfers, while they fail to detect tiny transfers. By changing the settings, Etherscan and OKLink additionally show zero-value transfers, and Arkm displays counterfeit token transfers. It is generally easier for them to hide zero-value transfers (i.e., just detecting a value of 0) than tiny transfers, since they must implement some heuristics to detect tiny or counterfeit token transfers.^22 Note that hiding zero-value transfers hinders usability if users rely on zero-value transfers for (benign) testing purposes. Most services show token icons or give a warning for CT (based on their list of AC), which would be helpful for users to distinguish them.

22 Etherscan successfully detects all poisoning transfers as of our visit on Jan. 13th, 2025.
G Payoff transfers
Table 9 presents the 30 largest payoff transfers on the Ethereum blockchain and BSC as examples for readers to investigate. These transactions are the final step of the attack, in which victims mistakenly send funds to the attackers. We verify the authenticity of these attacks, discovered by our algorithm, on Etherscan (BSCscan) through the following steps:
1. On the transaction page, get the sender (i.e., from) of the Transfer event.
2. Go to the sender (victim)'s page and examine the Token Transfers (ERC-20) tab.
3. Follow the prompt and change the site settings to disable the "Ignore Tokens with Poor Reputation" and "Zero-Value Token Transfers" options.
4. Find the payoff transfer transaction and verify that there are attack attempts in earlier blocks. For example, in the first Ethereum case, a transfer of 10,000,000 counterfeit tokens from the victim to the (attacker's) lookalike address took place in block 17,778,047.
5. Find the authentic token transfer events from the victim to the intended address that shares similar characters with the lookalike address. For example, in the first Ethereum case, a transfer of 10,000,000 USDT (Tether) is sent from the victim to the address 0xa7B4BAC8f0f9692e56750aEFB5f6cB5516E90570.
Table 9: Largest payoff transfers on Ethereum and Binance Smart Chain. USD Phished is the monetary loss in this single transaction, Recipient is the attacker's lookalike address, Intended recipient is the address that the victim interacted with, Matched is the number of characters matched between the lookalike and the intended recipient; a + b indicates that the lookalike address matched the first a characters and the last b characters of the intended address.
(a) Ethereum blockchain
USD Phished Transaction Hash Recipient (attacker) Intended recipient Matched
20,000,000 0x08255ca0e42a872559437141fa46980e66d907f7668922467d67515b1ebb4b7f 0xa7Bf48...E90570 0xa7B4BA...E90570 3 + 6
3,800,000 0x2dea87ff34b0d4712d59e582e810d1a8396e469ce7830a61b9521759b28221f1 0x61b697...074c5C 0x61b9d3...174c5c 3 + 5
3,554,610 0x11ac26acce0620c50731a94b528419771211779903d6917b05f2bd90e4a6476c 0x1cbB23...9B758a 0x1CbB23...9B758a 7 + 6
2,030,000 0x0911bd6713493e9ab75ef82cc909114218996f0e717b81b71d7bf29fa06e1622 0x74C9bd...60E1cA 0x74C32c...50E1cA 3 + 5
2,000,000 0x48e591d562a5098527c0de850ba44ce21014726434184f693707f62877a16994 0xbb2EDb...619455 0xBB217c...219455 3 + 5
2,000,000 0x5ae3147442cb9cba98b8b3dc1548a7c27dab29847ca2c126059ca806ad4479c9 0xC7B14b...B33A8f 0xc7b57d...533a8F 3 + 5
1,999,000 0x7cae29b7d215ef67fa183d4455760fc6e8698bfb2c5f1744dec4749d0c4b88c7 0xC7B14b...B33A8f 0xc7b57d...533a8F 3 + 5
1,660,000 0x86d39a9fd0e4078223179e6f59d2651963edb07899efc9558c43a0a338fb40a5 0xEA0d13...CBeb14 0xEa0ddb...FcEb14 4 + 4
1,499,990 0xf9948a832bfc217451f056fb3dc2df258c6ad525d7c3c184e07eeeee3df7ed09 0x9339e7...1F4126 0x9339CF...3A4126 4 + 4
1,200,000 0x371f26eb4b08b37d69b280fa5ea06208750c4bc9cf3258950d63c842f9f01154 0xCbA796...34c994 0xCbA796...34c994 8 + 6
1,045,150 0xa7be1b408b17853c3f9de8cef05371616d154b646e0f7c556feb72dc87ca9676 0x73435A...ca79F7 0x734659...Ca79F7 3 + 6
1,000,000 0xb97df9feb55c3849da4551f43995ec0f0fd78ff0e3d28f337934dbaba154bc11 0x80D707...4dbeA2 0x80D713...f4beA2 4 + 4
960,000 0x5b03c7356fe51575bd4fd190cb83841ec7398d8531dd34d21d89fdc0f776afd5 0x946C8e...f98Cd6 0x9462B5...f98CD6 3 + 6
865,884 0x71eb719e26a31dbb9b05d38b8a36d73537af35e3a5e0d2ccf4bbb6accbd8cd36 0x55B714...Ba9d79 0x55be3c...Ca9d79 3 + 5
849,996 0xb09f4c9fe09e6bbeb3abfdbed21a0403a2d8d2b6313ed1cbcdbd620a1f748fb2 0xEB4034...84683d 0xEb4034...84683D 8 + 7
800,158 0xaeb6f1b730e65ddb05c84d71af8c370dead57c4d1aecafc7e2369fde50875ae6 0x0Cee6B...e70EbF 0x0CEE33...3f0EbF 4 + 4
749,900 0xecd48ffd8d6a09f1ca18e7acbaad4ff4f70116d70fa65dab842a7cd1651df8a0 0xd7F373...7B7403 0xd7F38D...7B7403 4 + 6
719,000 0x1576beff54a53f0ef0d881a6242edf4539d0964b5b2431f9e1f08fe781b42de3 0x9887E5...D40ea3 0x9887F3...740ea3 4 + 5
710,000 0x66096743f07f2f0d49818f5e4de28b98bbf3ac6bfcf665fa48225f97260be6b1 0x949D0D...ef861B 0x949954...58861B 3 + 4
699,990 0xb2d6e74cba922beeae6dfd9f8a832372ae1d4d5314c0d1505dda13911e840316 0xC7BA77...26b438 0xc7BD59...F5B438 3 + 4
618,000 0x273166dfae3d85e8297a0c1353d8d9bc72ab7b021a61797cd30b8d1871f59fea 0x55Eb2A...A83159 0x55e5Cf...083159 3 + 5
600,000 0xf997e3b1fcba63c52815dbc63d4364691a79ef65bb578957f182f54753ef59ca 0xE07222...6adFeA 0xE07C8D...6AdFea 3 + 6
561,035 0x65f005042adff7a66fd5e317459987367dcfbae3246a50dbde88d1f364c4d9a5 0x368595...c3aFd3 0x3683a1...C3aFD3 3 + 7
541,900 0x82be73572215148ba7c03e984b1361188b098b7ceedf1aaac2a2574ec3f52fa0 0x35D6CD...4B0305 0x35D3c7...4b0305 3 + 6
541,000 0x7b5b69fad190d0e2b233d1b5bc0fb4a92178b72a737b3f2041a438a8bcd8ecad 0x60855B...30D466 0x6085fd...c6D466 4 + 4
500,962 0xea71be058fcd162620d95d04b2d5acbec953cedbb5ccb5339e6d43198b4b026e 0x539592...0DDf0b 0x539683...a6Df0B 3 + 4
500,000 0x8e452c91d99185a0ba325b2ea3dbb276d09f4b2a6dc6772d615acd48237a3148 0x324AFa...80C28F 0x32494C...C0c28F 3 + 5
500,000 0xd4bc53f71576fc1b78dd56145e42a127d4109567e7d04c3033e59d9ce9da502c 0xB08cfc...A74a2F 0xb08D97...574A2f 3 + 5
500,000 0xe68134277191eae950ca698b6651ac7c1942c6c4578eb53a98379c7deceb3f33 0x140588...b0Cb01 0x140b09...25cB01 3 + 4
500,000 0x78cfaa6c26a06444561397b7b02a426b1e3250b1eaf7e051e91b9137a42fdf86 0xC6BCCb...Bf0115 0xc6B7A8...BF0115 3 + 6
(b) Binance Smart Chain
USD Phished Transaction Hash Recipient (attacker) Intended recipient Matched
279,489 0xa0880a04b9f4dd114eb6e5b457a4b6f260c1e1444eef350b62cdc5f1cba8b0ca 0x7e4d7a...a7f223 0x7e4D54...dcf223 4 + 4
150,000 0xba144f4c4542434bc1ebc2d70030a89f46e81c993842bf79b43ad453cd446d3e 0x112C71...C4B08a 0x112784...24B08a 3 + 5
144,876 0xc11c5930465a2c33a479b7df63eaa15610dd9ca78c70f788d0f69cfe5e96960d 0x25f566...b02c50 0x25f5e8...b02c50 4 + 6
133,000 0x266647bc5190cd516f791399ac61b10e876ad75dcc0c6fabb77ccc80f7ead038 0xe72a80...49dc98 0xe72dee...f7dc98 3 + 4
100,000 0x62798e183339f5ed70239a84b4463b7e48f497e1ef0e23ee82b78087195462d4 0xD643f1...eAF6fB 0xd64329...24F6fB 4 + 4
97,250 0x9a4081519e99e2e6fe6add185da7333d44eb28cd1c70df36a32ff8648940cbf5 0x8AC739...5cEB7B 0x8ac7CE...76eb7b 4 + 4
91,680 0x12505feedc126def4e118a2b8e970766795d45201a79dbe45f4b2b9a41391375 0x5ebefa...8De436 0x5EBe2a...EDe436 4 + 5
86,180 0xa070b46843b57ff4e529cbe7aabb67c99cb36da58510106bb464d1f039e7b3db 0x2556be...757cF1 0x255EC6...A57cF1 3 + 5
72,671 0x4e7dd743e32d51f6f8d0705d20542a15a9a8b40a80e707ad1e1f18147033b81b 0x6c9351...D01114 0x6c93Ce...EC1114 4 + 4
70,006 0x13aecc1f514cc95db38a31ada1c9827124aa87360dc11cb1b3dfd7fe0d2499e0 0x2f9233...2BC04D 0x2f9276...cbc04d 4 + 5
60,000 0x521bb785beaac9b08cc35d57096a40d28dc7e3e697fe22572a0431681770cc04 0x1616bB...4d1800 0x161c2C...6d1800 3 + 5
50,111 0xe5576fff71817e990959b6947aa54866888953e2912d3047620b27d0ac766350 0xf06e0a...b8b610 0xf06e59...29b610 4 + 4
50,094 0x09c21db09c5e92e68bf01eb46374796a9a3c16a0b42df9ef2217020f2f8e2352 0x5c3415...82E06f 0x5c3464...a2E06F 4 + 5
50,000 0x6108401848776037926010c09b848f00ec5e04e3c4075b72437575544e4c0cd9 0x809799...4A2d4c 0x8097dc...ba2D4C 4 + 5
50,000 0xb4a277c05232537bbb1dd2aa1f04c8dced24fbd5be125c65fe0498db3f4b837d 0x1487c9...f08B32 0x148799...b08b32 4 + 5
50,000 0xedfe46269453c28ff4d70f736ca73664ca032bbc65a67462b3d35ea178e47171 0xC4d43e...71d48D 0xc4D89C...81D48D 3 + 5
46,000 0xe4335fecccba5eec21dbaa18d71cdb4d1c56056ee38c8a30b2a57aaccf4c4faa 0xcc3af0...97298f 0xcc3afe...fe298f 5 + 4
40,008 0x230ef03a64905d8e0c9a074fde33c0fb993ad536b3852129c36efe485409ad69 0x458728...dF9961 0x4587C1...629961 4 + 4
39,995 0x5ac785bf1bc5273ca3183471bafce3e1a301afc969f3e5eb26e83f249cca70b4 0x73A31C...E5A703 0x73af35...35a703 3 + 5
37,000 0x41c732fa968e3816c69ffaebc849233b84a863695344f24328891654b1230edf 0x52C990...0c1961 0x52c908...fC1961 4 + 5
35,000 0xd96aa7bfbd32720c88ecfddaec21af640a65cb858ddb735d23b5d6b1fae2b1bc 0xe6287d...08d0EA 0xe62d77...f8D0ea 3 + 5
32,000 0x7cee39ef7f56a08b88346938540be29d58249a33b1b53a1166203d7c467e21c4 0xF632D3...294ee0 0xf63078...194EE0 3 + 5
30,060 0x53eb5af3962cf511e51b6369071494118460cc11e24234a3551478f3415e0a47 0x4245B2...782880 0x424276...782880 3 + 6
30,000 0x6bc9148d278302517d49f7602f5c8fd0ef516ea8563996910db846ab029e6994 0x8b8ea3...7A8F9F 0x8B808A...5a8f9f 3 + 5
30,000 0xa9c015c3b41dd449266cb67fb8c514aadc501fac88a1c49e06d6b51b0c0a9953 0x532D78...C1540f 0x532Dc7...71540f 4 + 5
30,000 0x6462aaf80cc01700133399d6086cc0381f72ae48b52cdd85529fd7ec45d19fc9 0x2FD1D3...5cc8B8 0x2fd1Da...cCC8B8 5 + 5
30,000 0x6ef40292302c45c1a0a1756b9b431e4b08f02118ca78c12892377c52e5a84796 0xBB3e51...c3aEAF 0xBb3fd3...93aEAF 3 + 5
29,377 0x74123f8aef47435e94a4c2291df65bdcf1d6735de97e05fb84974c48a207983d 0xE15D59...5bA949 0xE15f47...6ba949 3 + 5
28,650 0x1e56b900524521ef21e82392ff8992675a58f0af74a8ddf476c96876533cd5c2 0xaF64F8...Cf01d0 0xaF6859...3f01d0 3 + 5
25,666 0x60b3ca5895303d08fe83288419e805de7cd078b1778ba7ee30ceafdbac170601 0x4A2274...7b57A0 0x4a22aE...bB57a0 4 + 5