Towards systematic honeytoken fingerprinting
Shreyas Srinivasa
Aalborg University
Copenhagen, Denmark
shsr@es.aau.dk
Jens Myrup Pedersen
Aalborg University
Copenhagen, Denmark
jens@es.aau.dk
Emmanouil Vasilomanolakis
Aalborg University
Copenhagen, Denmark
emv@es.aau.dk
ABSTRACT
With the continuous rise in the number and sophistication of cyber-attacks, defenders are moving towards more proactive lines of defense. Deception methods, such as honeypots and moving target defense paradigms, are nowadays utilized in a multitude of ways. A honeytoken is an umbrella term that describes honeypot-like entities/resources that can be inserted into a network or system. The moment an adversary interacts with a honeytoken, an alert is raised. Similar to honeypots, the value of honeytokens lies in their indistinguishability; if an attacker can detect them, e.g. via a fingerprinting tool, they can easily evade them. In this paper, we propose and discuss honeytoken fingerprinting methods. To the best of our knowledge, this is the first paper to examine honeytoken-specific fingerprinting. Furthermore, we showcase a proof of concept that is able to successfully detect a number of honeytoken types.
CCS CONCEPTS
• Security and privacy → Network security.
KEYWORDS
honeytokens, fingerprinting, honeypots, deception
ACM Reference Format:
Shreyas Srinivasa, Jens Myrup Pedersen, and Emmanouil Vasilomanolakis. 2020. Towards systematic honeytoken fingerprinting. In 13th International Conference on Security of Information and Networks (SIN 2020), November 4–7, 2020, Merkez, Turkey. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3433174.3433599
1 INTRODUCTION
Proactive defense mechanisms such as honeypots and moving target defense schemes have become a common additional line of defense. A honeypot is an information system resource whose value lies in unauthorized or illicit use of that resource [24]. Over the years, a number of honeypot approaches have been proposed (e.g. [20, 22, 27, 29]) for defending a multitude of protocols and systems (ranging from industrial control systems [23, 30] to IoT devices [21]).
Honeytoken is an umbrella term for a subset of honeypots in which there is no protocol or system emulation. Instead, a honeytoken usually emulates some resource (e.g. a file or a username/password)
that is part of a real system and triggers an alert whenever it is accessed or used [25]. For example, a honeytoken can be a .docx file that contains an obfuscated script that is triggered when the file is opened.
An advantage of honeytokens over traditional honeypots is that they require fewer system resources and are simpler to manage. In addition, they are easy to generate and deploy. Honeytokens can indirectly detect the presence of diverse attack vectors (e.g. malware) and identify direct attacks like unauthorised access attempts. Due to their simple design and flexibility, honeytokens are popular among system administrators.
Over the years, there has been an increase in honeytoken research, including patents by commercial organizations [3, 19, 28]. Honeytokens can be modelled as files, directories, URLs, DNS entries, fake user accounts and fake data tuples in a database. While there is no limitation in the design, the core purpose of honeytokens is to detect unauthorized access and notify users about it. Some of the open-source and other research implementations include Canarytokens [26], honeyλ [16], honeybits [15], HoneyGen [4], honeywords [14] and lastly the honeyfile [17].
Recently, a number of researchers have discussed methods for fingerprinting honeypots [13, 18, 31]. The purpose of these works is to generate some type of signature probe that is able to distinguish between a real system and a honeypot. While this research suggests that many traditional honeypots can be easily identified, it does not take honeytokens into account. The key feature of honeytokens is that their alert logic is embedded within a real digital entity with fake contents. This makes honeytokens hard to identify as the only way of determining if an entity is a honeytoken is by utilizing it.
In this paper, we attempt a preliminary study on the possibility of fingerprinting honeytokens. We first classify the different honeytoken technologies in a systematic way and proceed by determining ways for their identification. Furthermore, we provide proof of concept experiments that demonstrate the feasibility of the proposed fingerprinting mechanisms. To the best of our knowledge, this is the first paper to examine honeytoken fingerprinting.
The rest of the paper is structured as follows. Section 2 provides background on honeytokens and honeytoken fingerprinting. In Section 3 we propose honeytoken fingerprinting techniques. In Section 4 we present a proof of concept that scans honeytokens using the proposed techniques. We conclude our paper in Section 5 along with our future work goals.
2 BACKGROUND
Since honeypots and honeytokens are flexible in their design and emulation approach, various concepts regarding their applicability and type have been proposed. Nevertheless, the factor that distinguishes honeytokens from honeypots is their ability to detect threats by emulating low-level digital resources/entities like files,
directories, user-accounts, and URLs. Honeypots operate at a higher
level by emulating services and protocols that resemble a system
or a service.
Fraunholz et al. survey deceptive technologies and provide a comprehensive overview of honeytokens as well [9]. The survey suggests that most proposals cover different types of entities and focus on the generation of deceptive digital twins. Furthermore, the authors present a classification that distinguishes between server, database, authentication, and file honeytokens. For example, the authors classify Honeyport [10] as a server-based honeytoken as it emulates an open network port within a server. Similarly, the honeytokens classified under database, authentication, and file contain deceptive elements to emulate a data record, a password, and a document, respectively.
Han et al. also survey deception techniques in computer security [11]. The authors introduce a multi-dimension classification for honeypots, based on four orthogonal dimensions: goal, unit, layer, and deployment of deception. Within the deployment dimension, a sub-class based on the deployment layer is relevant to honeytokens. The layer is further divided into network, system, application, and data layers.
Based on the various honeytokens proposed in related research, we break down the honeytoken architecture into two primary mechanisms: deception and alerting. The deception mechanism is responsible for the emulation of the digital entity and the deception logic. The alerting mechanism comprises the trigger responsible for notifying the user about the access attempt. It is triggered when the adversary tries accessing the honeytoken or using the data generated as a honeytoken for an access attempt. Both deception and alerting mechanisms may vary based on the digital entity replicated.
Honeytoken              Deceptive Entity/Resource  Alerting Mechanism
----------------------  -------------------------  ------------------
Honeyentries [4], [12]  Table data set             DB Monitor
Honeyword [14]          Password                   DB Monitor
Honeyaccount [8]        User-account               Event Logger
Honeyfile [17]          File-Google Sheets         Session Log
Honeyfile [10]          File                       Event Logger
Honeypatch [1], [2]     Vulnerability              Session Log
HoneyURL [17]           URL                        DNS Trigger
CanaryTrap [7]          Email                      Email
Honeyport [10]          Network port               Session Log
CanaryToken [26]        File-pdf, docx             DNS Trigger
CanaryToken [26]        Directory                  DNS Trigger
CanaryToken [26]        URL                        DNS Trigger
Honeybits [15]          Email                      DNS Trigger
Table 1: Honeytoken mechanisms overview
Table 1 provides an overview of the deception and alerting mechanisms employed in research-based and open-source honeytokens. The deceptive entity denotes the digital entity or resource that is emulated by the honeytoken. These include passwords, user accounts, files, directories, emails, software patches, URLs, and network ports. The alerting mechanism lists the triggering and notification technique employed by the honeytokens.
The DB Monitor monitors a dataset for changes and maintains an activity log in which all changes and access information are recorded. The Event Logger operates at the system level and maintains a log of all system events. User-defined events can be logged on the Event Logger; this is supported by most modern operating systems. The Session Log operates at the application level and logs all the events at user-defined log levels, such as informational, debug, error and warning. DNS Triggers operate at the network level by performing a name resolution query to a DNS server. The query includes a URL that triggers the alerting mechanism.
3 HONEYTOKEN FINGERPRINTING
We present generic techniques to detect honeytokens that operate at different levels. The proposed fingerprinting techniques leverage the gaps in both the deceptive entity and the honeytokens' alerting mechanism to determine if the entity is indeed a honeytoken. To understand the operating levels of the honeytokens, we classify the honeytokens (see Table 1) into Network Level, System Level, Application/File Level and Data Level. Table 2 provides an overview of the classification based on their operating level. The table also lists the fingerprinting techniques associated with each operational level corresponding to the alerting mechanism. In the following subsections we describe fingerprinting techniques on the basis of the various operating levels.
Alerting Mechanism  Operating Level       Fingerprinting Technique
------------------  --------------------  -------------------------------------
DB Monitor          Data                  Modified Date
Event Logger        System                Last Used, grep search
Session Log         Application           Grep search
DNS Trigger         Application, Network  Reverse Engineering, Network Sniffing
Table 2: Honeytokens Fingerprinting Overview
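As a reading aid, the mapping in Table 2 can be written down as a small lookup structure. The following is a minimal sketch in Python; the names FingerprintPlan, TABLE_2 and techniques_for are illustrative assumptions and not part of our tooling, and the DNS Trigger row is read here as spanning both the application and network levels.

```python
from typing import NamedTuple, Tuple


class FingerprintPlan(NamedTuple):
    """One row of Table 2 (hypothetical helper type)."""
    operating_levels: Tuple[str, ...]
    techniques: Tuple[str, ...]


# Rows transcribed from Table 2; the dictionary layout is an
# illustrative choice only.
TABLE_2 = {
    "DB Monitor": FingerprintPlan(("Data",), ("Modified Date",)),
    "Event Logger": FingerprintPlan(("System",), ("Last Used", "grep search")),
    "Session Log": FingerprintPlan(("Application",), ("grep search",)),
    "DNS Trigger": FingerprintPlan(
        ("Application", "Network"),
        ("Reverse Engineering", "Network Sniffing"),
    ),
}


def techniques_for(alerting_mechanism: str) -> Tuple[str, ...]:
    """Return the candidate techniques for a mechanism, or () if unknown."""
    plan = TABLE_2.get(alerting_mechanism)
    return plan.techniques if plan else ()
```

An adversary who has guessed the alerting mechanism of a suspected honeytoken would start from the techniques returned for that mechanism.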
3.1 Network level
Honeytoken overview. Honeytokens operating at the network level are either replicating a networking entity or using the network for communicating the alerts to the administrator. For example, the Honeyport [10] emulates an open network port on a web server and uses the web server's session logs as the alerting mechanism. However, a honeytoken may operate at a different level (e.g. the file level) and use the network to communicate the alerts (e.g. Canarytoken [26]).
Network level fingerprinting. Considering the alerting mechanisms classified to operate at the network level in Table 2, we observe the utilization of DNS. Upon detecting an access attempt, the honeytokens trigger a DNS resolution call to a domain hardcoded within the embedded alerting mechanism. For example, a file-level Canarytoken contains an alerting mechanism that performs a DNS call upon opening the respective file. Fingerprinting
these calls can be done by sniffing the DNS traffic on the compromised system. The DNS traffic will reveal the calls made to open-source honeytoken alert domains. However, inspecting the DNS traffic for calls made to honeytoken domains is a passive approach: the lookup is only observed once the honeytoken has fired, so the honeytoken owner is notified of the access attempt before the adversary identifies the honeytoken. In the following, we introduce active fingerprinting techniques that detect honeytokens without triggering an alert.
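The matching step of such a DNS check can be sketched as follows. This is a minimal illustration in Python and not the implementation from our repository: it assumes the raw DNS message bytes have already been captured by some other means, and the names KNOWN_HONEYTOKEN_DOMAINS, query_name, is_honeytoken_lookup and build_query are hypothetical. Only canarytokens.net is taken from the text; any further domains would have to be added by the reader.

```python
import struct

# canarytokens.net is the alert domain named in the text; the tuple is
# meant to be extended (additions are the reader's own assumptions).
KNOWN_HONEYTOKEN_DOMAINS = ("canarytokens.net",)


def query_name(dns_message: bytes) -> str:
    """Extract the first question name from a raw DNS message."""
    if len(dns_message) < 12:
        raise ValueError("DNS header is 12 bytes; message too short")
    (qdcount,) = struct.unpack("!H", dns_message[4:6])
    if qdcount == 0:
        raise ValueError("message carries no question")
    labels = []
    offset = 12  # the question section starts right after the header
    while True:
        if offset >= len(dns_message):
            raise ValueError("truncated question name")
        length = dns_message[offset]
        if length == 0:
            break  # the zero-length root label ends the name
        if length & 0xC0:
            raise ValueError("compression pointer not handled in this sketch")
        offset += 1
        label = dns_message[offset:offset + length]
        if len(label) != length:
            raise ValueError("truncated label")
        labels.append(label.decode("ascii", errors="replace"))
        offset += length
    return ".".join(labels).lower()


def is_honeytoken_lookup(dns_message: bytes) -> bool:
    """True if the queried name is a known alert domain or a subdomain of one."""
    name = query_name(dns_message)
    return any(name == d or name.endswith("." + d)
               for d in KNOWN_HONEYTOKEN_DOMAINS)


def build_query(name: str) -> bytes:
    """Build a minimal A-record query; used only to exercise the check."""
    header = struct.pack("!HHHHHH", 0x1234, 0x0100, 1, 0, 0, 0)
    qname = b"".join(bytes([len(part)]) + part.encode("ascii")
                     for part in name.split(".")) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)
```

For instance, is_honeytoken_lookup(build_query("abc123.canarytokens.net")) evaluates to True, while a query for an unrelated domain does not match. As noted above, a match only confirms a honeytoken after its alert has already been sent.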
3.2 Application/File Level
Honeytoken overview. The application/file level fingerprinting techniques focus on detecting honeytokens at the application or file level. These honeytokens operate by emulating a file of a specific format (e.g. pdf or docx) and obfuscating an alert mechanism within the file. The alert is triggered when the file is opened through specific applications like Adobe Reader.
Application/File level fingerprinting. File-level honeytokens using a network for alerting mechanisms can be fingerprinted by decomposing the files using reverse engineering techniques. For example, Canarytokens [26] offered in the pdf file format can be decomposed by file parsing techniques. On parsing the pdf file with a parsing tool from DidierStevensSuite we observe that the Canarytoken contains obfuscated DNS triggers to "canarytokens.net" in the /URI of the object stream [6]. We find similar obfuscated DNS triggers in other file formats like docx, which are also offered by the open-source Canarytokens service. Adversaries can use file parsing techniques to explore the honeytokens for obfuscated code fragments that trigger the alert mechanisms. Similarly, a honeydirectory, a directory-emulating honeytoken from Canarytokens, can be identified by examining its meta-data.
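The idea of statically searching a pdf file for /URI entries can be sketched with the Python standard library alone. This is a simplified stand-in for illustration and not the pdf-parser tool from [6]: it assumes the relevant streams are either uncompressed or Flate (zlib) compressed, it ignores other filters and encodings, and the names HONEYTOKEN_DOMAINS, extract_uris and honeytoken_uris are hypothetical.

```python
import re
import zlib

# canarytokens.net is the alert domain reported in the text; the tuple
# is meant to be extended by the reader.
HONEYTOKEN_DOMAINS = (b"canarytokens.net",)

# "/URI (...)" entries as they appear in PDF action dictionaries.
URI_PATTERN = re.compile(rb"/URI\s*\((.*?)\)", re.DOTALL)
# Raw bytes between the 'stream' and 'endstream' keywords.
STREAM_PATTERN = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def extract_uris(pdf_bytes: bytes) -> list:
    """Collect /URI values from the raw file and from zlib-compressed streams."""
    chunks = [pdf_bytes]
    for match in STREAM_PATTERN.finditer(pdf_bytes):
        try:
            # decompressobj tolerates the end-of-line bytes that
            # precede the 'endstream' keyword.
            chunks.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            continue  # not a Flate-compressed stream; skip it
    uris = []
    for chunk in chunks:
        uris.extend(m.group(1) for m in URI_PATTERN.finditer(chunk))
    return uris


def honeytoken_uris(pdf_bytes: bytes) -> list:
    """Return only the URIs that point to a known honeytoken alert domain."""
    return [uri for uri in extract_uris(pdf_bytes)
            if any(domain in uri.lower() for domain in HONEYTOKEN_DOMAINS)]
```

The file is only read as bytes and never rendered, so a URI embedded in the document is not resolved during the scan. A full parser such as the one in [6] handles many more cases (object streams, additional filters, escaped strings) than this sketch.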
3.3 System Level
Honeytoken overview. Honeytokens that operate at the system level use the underlying operating system's features to facilitate the alert mechanism. Examples of the system features include event-logs and inotify alerts. Honeytokens like Honeyfile [17] and Honeyaccount [8] employ system-level triggers to alert the users.
System level fingerprinting. Fingerprinting system-level alert mechanisms is complex because of their abstract calls and obfuscated deployments. Access monitors like inotify run as a background service that monitors a file or a directory for modifications. The inotify system calls are embedded within a C program and are initialized with a file descriptor, file path, and the mask modes as parameters. The mask modes offer options for the triggers like file accessed, modified, deleted, or created. The first step towards fingerprinting would be to check for inotify processes running in the background; i.e. the adversary has to list the background processes in the compromised system. A process that corresponds to the execution of such a C program indicates that an alerting mechanism is set up. The adversary can then inspect the C program that calls inotify and check which file or directory path it monitors for changes.
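On Linux, one way to approximate the first step (finding processes that hold inotify instances) is to walk the /proc filesystem. The sketch below is an assumption-laden illustration and not part of our proof of concept: it relies on the Linux convention that an inotify descriptor appears in /proc/PID/fd as a link to "anon_inode:inotify" and that /proc/PID/fdinfo lists one "inotify ..." line per watch. The function name find_inotify_watchers and the proc_root parameter are hypothetical.

```python
import os


def find_inotify_watchers(proc_root: str = "/proc") -> dict:
    """Map PID -> list of 'inotify ...' fdinfo lines for processes that
    hold at least one inotify descriptor (Linux-specific sketch)."""
    watchers = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited or permission denied
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue
            if target != "anon_inode:inotify":
                continue
            watches = []
            try:
                info_path = os.path.join(proc_root, entry, "fdinfo", fd)
                with open(info_path) as info:
                    watches = [line.strip() for line in info
                               if line.startswith("inotify ")]
            except OSError:
                pass  # keep the PID even if watch details are unreadable
            watchers.setdefault(int(entry), []).extend(watches)
    return watchers
```

The fdinfo lines identify a watched object by inode and device number rather than by path, so mapping a watch back to a file or directory requires a further lookup. Many benign programs also use inotify, which means a hit is a starting point for inspection and not proof of a honeytoken.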
3.4 Data level
Honeytoken overview. Data-level honeytokens work on the generation of fake data that resemble actual data. The generation of data-based honeytokens is complicated due to the requirement of high resemblance to real data. An example of this is the generation of employee data and access information. The honeytoken data that emulates an employee's personal and access information must resemble a real employee record. Simultaneously, this data must be fake and attractive enough for an adversary. There have been many research proposals on algorithms and techniques for the generation of such data honeytokens [4, 8, 32].
Data level fingerprinting. The fingerprinting technique for detecting data-level honeytokens depends on the type of data emulated and the alerting mechanism used. Some research concentrates only on the generation of data honeytokens and not the alerting mechanism (e.g. HoneyGen [4]). Honeytokens like Honeyword [14] use a Honeychecker module that is responsible for comparing the password-hash used by the adversary with the list of passwords and triggering an alarm in case of unauthenticated access. While it is complicated to fingerprint honeywords, we propose using meta-data to determine whether an entity is real. For example, Honeyaccount [8] creates fake user-accounts for a system. On a compromised system that is running Windows and is attached to a domain, user accounts can be listed and checked for the last known activity of a user. In addition, the adversary can make use of specific scripts in the Windows PowerShell to retrieve meta-data about user accounts in the Active Directory. By observing the meta-data retrieved from the Active Directory, the adversary can identify if the user account is real or not.
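The meta-data check can be illustrated with a simple heuristic over account records. The sketch below is hypothetical throughout: the AccountRecord fields, the looks_unused function and the 90-day threshold are assumptions chosen for illustration, and how the records are retrieved (for example from the Active Directory) is outside its scope.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class AccountRecord:
    """Hypothetical subset of account meta-data used by the heuristic."""
    name: str
    created: datetime
    last_logon: Optional[datetime]  # None if the account never logged on
    logon_count: int


def looks_unused(account: AccountRecord, now: datetime,
                 min_age: timedelta = timedelta(days=90)) -> bool:
    """Flag accounts old enough to have been used but without any activity.

    The 90-day default is an arbitrary illustrative threshold.
    """
    old_enough = now - account.created >= min_age
    never_used = account.last_logon is None or account.logon_count == 0
    return old_enough and never_used
```

An account flagged by this check is only a candidate honeyaccount, since dormant but legitimate accounts produce the same meta-data.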
4 PROOF OF CONCEPT
This section demonstrates the applicability of some of the aforementioned honeytoken fingerprinting techniques. In particular, we demonstrate fingerprinting of one of the most used honeytoken implementations, the Canarytoken [26]. With respect to our classification, we focus on network and application/file level fingerprinting. The source code and other screenshots of the proposed fingerprinting techniques can be found on our GitHub account (see footnote 1).
Firstly, we generate a pdf honeytoken by utilizing the Canarytoken service [26]. To support our claims (see Section 3.1) we manually monitor the network and open the generated pdf honeytoken. Figure 1 shows the packets captured in Wireshark from a system when a Canarytoken is accessed. To avoid manual sniffing of all network traffic we implemented a honeytoken DNS sniffer (see GitHub for the implementation code) that checks the DNS traffic of the system for calls made to known honeytoken services. We note here that if the adversary uses this method they risk triggering the honeytoken and therefore notifying the administrator.
For a stealthier option, the attacker may use file level fingerprinting techniques (see Section 3.2). We adopt the code of a pdf parser in [6] to identify honeytoken traces in a given pdf file. By using such an application-level fingerprinting technique, the pdf Canarytoken was parsed and the honeytoken was detected without triggering an alert to the administrator. A URI reference obfuscated in the pdf object 16 was detected. The URI, referenced in [5], clearly indicates the call to a domain hosted by the Canarytokens service.
Footnote 1: https://github.com/aau-network-security/tokengrabber
Figure 1: Network-level (DNS) Fingerprinting
5 CONCLUSION
In this paper, we propose fingerprinting techniques against the majority of existing honeytoken proposals and implementations. Furthermore, as a proof of concept, we successfully fingerprint open-source honeytokens. This work provides a foundation to extend our research on honeytoken fingerprinting. In particular, for future work we plan to work on countermeasures against fingerprinting for the various honeytokens. Moreover, we will further examine the possible fingerprinting attacks against them, beyond the presented proof of concept.
ACKNOWLEDGMENTS
This research was supported as part of COM³, an Interreg project supported by the North Sea Programme of the European Regional Development Fund of the European Union.
REFERENCES
[1] Frederico Araujo, Kevin W Hamlen, Sebastian Biedermann, and Stefan Katzenbeisser. 2014. From patches to honey-patches: Lightweight attacker misdirection, deception, and disinformation. In Proceedings of the 2014 ACM SIGSAC conference on computer and communications security. 942–953.
[2] Jeffrey Avery and Eugene H Spafford. 2017. Ghost patches: Fake patches for fake vulnerabilities. In IFIP International Conference on ICT Systems Security and Privacy Protection. Springer, 399–412.
[3] Tal Arieh Be’ery and Itai Grady. 2020. Systems and methods for the detection of advanced attackers using client side honeytokens. US Patent 10,609,048.
[4] Maya Bercovitch, Meir Renford, Lior Hasson, Asaf Shabtai, Lior Rokach, and Yuval Elovici. 2011. HoneyGen: An automated honeytokens generator. In Proceedings of 2011 IEEE International Conference on Intelligence and Security Informatics. IEEE, 131–136.
[5] Canarytokens. [n.d.]. Canarytokens Domain. https://ev942nscoy6b9atf1lscy5gw6.canarytokens.net/QBEINOXGQDLDQUOXILNWLUCAPCMWEAGOGJ.
[6] Didier Stevens. 2014. PDF Parser. https://github.com/DidierStevens/DidierStevensSuite/blob/master/pdf-parser.py
[7] Shehroze Farooqi, Maaz Musa, Zubair Shafiq, and Fareed Zaffar. 2020. CanaryTrap: Detecting Data Misuse by Third-Party Apps on Online Social Networks. Proceedings on Privacy Enhancing Technologies 2020, 4 (2020), 336–354.
[8] C. D. Faveri and A. Moreira. 2018. Visual Modeling of Cyber Deception. In 2018 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC). 205–209.
[9] Daniel Fraunholz, Simon Duque Anton, Christoph Lipps, Daniel Reti, Daniel Krohmer, Frederic Pohl, Matthias Tammen, and Hans Dieter Schotten. 2018. Demystifying deception technology: A survey. arXiv preprint arXiv:1804.06196 (2018).
[10] Daniel Fraunholz, Daniel Krohmer, Frederic Pohl, and Hans Dieter Schotten. 2018. On the detection and handling of security incidents and perimeter breaches - a modular and flexible honeytoken based framework. In 2018 9th IFIP International Conference on New Technologies, Mobility and Security (NTMS). IEEE, 1–4.
[11] Xiao Han, Nizar Kheir, and Davide Balzarotti. 2018. Deception techniques in computer security: A research perspective. ACM Computing Surveys (CSUR) 51, 4 (2018), 1–36.
[12] Michael Gregory Hoglund and Shawn Michael Bracken. 2017. Inoculator and antibody for computer security. US Patent 9,792,444.
[13] Cheng Huang, Jiaxuan Han, Xing Zhang, and Jiayong Liu. 2019. Automatic Identification of Honeypot Server Using Machine Learning Techniques. Security and Communication Networks 2019 (2019).
[14] Ari Juels and Ronald L Rivest. 2013. Honeywords: Making password-cracking detectable. In Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security. 145–160.
[15] Adel Karimi. [n.d.]. Honeybits. https://github.com/0x4D31/honeybits.
[16] Adel Karimi. [n.d.]. HoneyLambda. https://github.com/0x4D31/honeyLambda.
[17] Martin Lazarov, Jeremiah Onaolapo, and Gianluca Stringhini. 2016. Honey sheets: What happens to leaked google spreadsheets?. In 9th Workshop on Cyber Security Experimentation and Test (CSET 16).
[18] Shun Morishita, Takuya Hoizumi, Wataru Ueno, Rui Tanabe, Carlos Gañán, Michel JG van Eeten, Katsunari Yoshioka, and Tsutomu Matsumoto. 2019. Detect me if you... oh wait. An internet-wide view of self-revealing honeypots. In 2019 IFIP/IEEE Symposium on Integrated Network and Service Management (IM). IEEE, 134–143.
[19] Hani Hana Neuvirth, Tomer Weinberger, Yaniv Zohar, Craig A Nelson, and Andrew E Johnson. 2020. Automated generation and deployment of honey tokens in provisioned resources on a remote computer resource platform. US Patent App. 16/291,963.
[20] Michel Oosterhof. 2016. Cowrie SSH/telnet honeypot. https://github.com/micheloosterhof/cowrie
[21] Yin Minn Pa Pa, Shogo Suzuki, Katsunari Yoshioka, Tsutomu Matsumoto, Takahiro Kasama, and Christian Rossow. 2016. IoTPOT: A novel honeypot for revealing current IoT threats. Journal of Information Processing 24, 3 (2016), 522–533.
[22] L Rist. 2009. Glastopf project. The Honeynet Project (2009).
[23] Lukas Rist, Johnny Vestergaard, Daniel Haslinger, A Pasquale, and J Smith. 2013. Conpot ics/scada honeypot. Honeynet Project (conpot.org) (2013).
[24] L. Spitzner. 2003. Honeypots: catching the insider threat. In 19th Annual Computer Security Applications Conference, 2003. Proceedings. 170–179.
[25] Lance Spitzner. 2006. Honeytokens: The other honeypot. 2003. Internet: http://www.securityfocus.com/infocus/1713 (2006).
[26] Thinkst. [n.d.]. Canarytokens. https://github.com/thinkst/canarytokens.
[27] Dino Tools. 2010. Web Honeypot. https://github.com/DinoTools/dionaea/
[28] Shlomo Touboul, Hanan Levin, Stephane Roubach, Assaf Mischari, Itai Ben David, Itay Avraham, Adi Ozer, Chen Kazaz, Ofer Israeli, Olga Vingurt, et al. 2020. Multi-factor deception management and detection for malicious actions in a computer network. US Patent 10,623,442.
[29] Emmanouil Vasilomanolakis, Shankar Karuppayah, Max Mühlhäuser, and Mathias Fischer. 2014. Hostage: a mobile honeypot for collaborative defense. In Proceedings of the 7th International Conference on Security of Information and Networks. 330–333.
[30] Emmanouil Vasilomanolakis, Shreyas Srinivasa, and Max Mühlhäuser. 2015. Did you really hack a nuclear power plant? An industrial control mobile honeypot. In 2015 IEEE Conference on Communications and Network Security (CNS). IEEE, 729–730.
[31] Alexander Vetterl and Richard Clayton. 2018. Bitter harvest: Systematically fingerprinting low- and medium-interaction honeypots at internet scale. In 12th USENIX Workshop on Offensive Technologies (WOOT 18).
[32] Jonathan White. 2010. Creating personally identifiable honeytokens. In Innovations and Advances in Computer Sciences and Engineering. Springer, 227–232.