A Revised Taxonomy of Steganography Embedding Patterns

Steffen Wendzel
wendzel@hs-worms.de
Worms Univ. Appl. Sciences
Worms, Germany
FernUniversität in Hagen
Hagen, Germany
Luca Caviglione
luca.caviglione@ge.imati.cnr.it
National Research Council of Italy
Genova, Italy
Wojciech Mazurczyk
wojciech.mazurczyk@pw.edu.pl
Warsaw University of Technology
Warsaw, Poland
FernUniversität in Hagen
Hagen, Germany
Aleksandra Mileva
aleksandra.mileva@ugd.edu.mk
University Goce Delcev
Stip, North Macedonia
Jana Dittmann
jana.dittmann@iti.cs.uni-magdeburg.de
University of Magdeburg
Magdeburg, Germany
Christian Krätzer
kraetzer@iti.cs.uni-magdeburg.de
University of Magdeburg
Magdeburg, Germany
Kevin Lamshöft
kevin.lamshoeft@ovgu.de
University of Magdeburg
Magdeburg, Germany
Claus Vielhauer
claus.vielhauer@th-brandenburg.de
Brandenburg Univ. Appl. Sciences
Brandenburg, Germany
University of Magdeburg
Magdeburg, Germany
Laura Hartmann
hartmann@hs-worms.de
Worms University of Applied Sciences
Worms, RLP, Germany
FernUniversität in Hagen
Hagen, Germany
Jörg Keller
joerg.keller@fernuni-hagen.de
FernUniversität in Hagen
Hagen, Germany
Tom Neubert
tom.neubert@th-brandenburg.de
Brandenburg Univ. Appl. Sciences
Brandenburg, Germany
University of Magdeburg
Magdeburg, Germany
ABSTRACT
Steganography embraces several hiding techniques that span multiple domains. However, the related terminology is not unified among the different domains, such as digital media steganography, text steganography, cyber-physical systems steganography, network steganography (network covert channels), local covert channels, and out-of-band covert channels. To cope with this, a first attempt was made in 2015 with the introduction of the so-called hiding patterns, which allow hiding techniques to be described in a more abstract manner. Despite significant enhancements, the main limitation of this taxonomy is that it only considers the case of network steganography.
Therefore, this paper reviews both the terminology and the taxonomy of hiding patterns so as to make them more general. Specifically, hiding patterns are split into those that describe the embedding and those that describe the representation of hidden data within the cover object. As a first research action, we focus on embedding hiding patterns and show how they can be applied to multiple domains of steganography instead of being limited to the network scenario. Additionally, we exemplify representation patterns using network steganography. Our pattern collection is available under https://patterns.ztt.hs-worms.de.
This work is licensed under a Creative Commons Attribution International 4.0 License.
ARES 2021, August 17–20, 2021, Vienna, Austria
©2021 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9051-4/21/08.
https://doi.org/10.1145/3465481.3470069
CCS CONCEPTS
Security and privacy → Network security; Distributed systems security; Information flow control; Pseudonymity, anonymity and untraceability.
KEYWORDS
Network Steganography, Covert Channels, Terminology, Taxonomy, Information Hiding, Science of Security, Information Security, Patterns, PLML, Cyber Security.
ACM Reference Format:
Steffen Wendzel, Luca Caviglione, Wojciech Mazurczyk, Aleksandra Mileva, Jana Dittmann, Christian Krätzer, Kevin Lamshöft, Claus Vielhauer, Laura Hartmann, Jörg Keller, and Tom Neubert. 2021. A Revised Taxonomy of Steganography Embedding Patterns. In The 16th International Conference on Availability, Reliability and Security (ARES 2021), August 17–20, 2021, Vienna, Austria. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3465481.3470069
1 INTRODUCTION
Steganography is the art and science of hiding information in so-called cover objects, e.g., a secret message is embedded inside a digital file, network packet, or written text. Its counterpart, steganalysis, aims at detecting, preventing, and limiting steganography. Several attempts have been made to define the fundamental terminology and its domains, such as text steganography, digital media steganography, or network steganography [9, 24, 27, 28]. One of these attempts to unify and refine the terminology led to the systematization of steganographic techniques in precise, general, and abstract templates, defined as hiding patterns [37]. Each hiding pattern is described via the Pattern Language Markup Language (PLML), which allows all the various templates to be outlined in a unified manner. By using PLML, patterns can be derived from each other, forming a taxonomy, and they can also be linked or composed.
Despite being progressively adopted by the scientific community (140+ citations as of June 2021), hiding patterns have some limitations. First, hiding patterns are only defined for the sub-discipline of network steganography. Second, network-specific hiding patterns cannot be directly applied to other domains of steganography. Thus, the absence of a unified terminology and taxonomy, as well as the impossibility of exploiting overlaps to generalize core concepts, are key issues preventing the adoption of the pattern-based paradigm by a wide audience. At the same time, a precise terminology in the ever-growing research domain of steganography is a real need, so as to limit scientific re-inventions and terminological inconsistencies [36]. For instance, distinguishing between the sender and receiver side of various patterns as proposed in [22] is not an optimal solution and could lead to ambiguities.
Therefore, in this paper we aim at addressing the aforementioned issues. We first summarize the characteristics of three steganography domains, i.e., network steganography, digital media steganography, and text steganography. We then present a methodology to unify the description of hiding patterns in a domain-overlapping manner. Compared to previous works (see [22, 37]), emphasis is put on providing a less ambiguous distinction between the embedding process and the representation of hidden information within the cover object. We especially focus on the embedding patterns, for which a novel taxonomy is provided, while existing patterns are integrated into a list of representation patterns.
The rest of this paper is structured as follows. Sect. 2 explains the characteristics of three key domains of steganography, namely network steganography, digital media steganography, and text steganography. It further points out the limits of the existing network steganography-based taxonomy. Sect. 3 explains our methodology, while Sect. 4 presents our unified terminology and taxonomy of embedding patterns and exemplifies representation patterns using network steganography. Sect. 5 highlights the anticipated future developments of steganography that might influence our pattern-based taxonomy and, finally, Sect. 6 concludes the paper and provides an outlook on future work.
2 ANALYSIS OF EXISTING STEGANOGRAPHY DOMAINS
Figure 1: Various applications of steganography. (Diagram labels: Application of steganography; Covert storage; Covert transfer; Hybrid; Digital media steganography; Network steganography; Filesystem steganography.)
In general, steganographic techniques can be utilized to enable either a covert transfer or covert storage, as well as in a combined manner as depicted in Fig. 1. In both cases, the secret information is embedded into a cover object, which should be selected so that it does not represent an anomaly and has a suitable embedding capacity. Typically, a steganographic application or technique is closely related to the features characterizing the chosen hidden data carrier. In more detail, for the case of covert transfer, a covert sender (CS) transmits secret information to a covert receiver (CR). Even if many mechanisms for covert transfer exist, the most popular group of information hiding solutions exploits network traffic and protocols [38]. Instead, for the case of covert storage, the steganographer is interested only in storing sensitive data on a local information carrier (e.g., on a hard drive), in such a way that the data cannot be spotted by a third-party observer unaware of the information concealment. An example of such a technique is filesystem steganography, where an additional overlay filesystem for data hiding purposes is created by using features like the unused space in partially-allocated blocks [14].
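To give a feeling for the capacity that unused space in partially-allocated blocks can offer, the following minimal sketch computes the slack in the last block of a file. The 4096-byte block size, the file sizes, and the function name `slack_bytes` are assumptions chosen for illustration; they are not taken from [14].

```python
def slack_bytes(file_size: int, block_size: int = 4096) -> int:
    """Return the unused bytes in the last, partially filled block of a file."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    # Python's modulo is non-negative for a positive divisor, so this yields
    # the distance from file_size up to the next multiple of block_size.
    return (-file_size) % block_size


# Hypothetical file sizes in bytes; the sum is an upper bound on hidden capacity.
file_sizes = [10_000, 8_192, 123]
total_slack = sum(slack_bytes(size) for size in file_sizes)
```

A file whose size is an exact multiple of the block size contributes no slack, so the usable capacity depends strongly on the size distribution of the files on the carrier.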
Finally, for some cover objects, it is possible to perform covert transfer or covert storage depending on the required application. This is the case, for instance, of digital media steganography, where one can perform a hidden data exchange by embedding secret data into the content transferred by services like video or audio streaming, or by sending an email with an image containing secret information. Alternatively, the steganographer can utilize digital images as a vault to locally store his/her secrets [18]. Recently, another set of techniques emerged that combines covert transfer (over network covert channels) and covert storage (within the caches of network protocols), which is called a Dead Drop [29].
The remainder of this section highlights how major steganography domains differ in terms of their cover objects and embedding strategies.
Network Steganography. As hinted, the principal characteristics of network steganography are already covered by the existing terminology (see, e.g., [21] and the references therein). In essence, the main cover objects used in network steganography are provided by manipulating or injecting information in some digital artifacts belonging to the network traffic, e.g., the header or the payload of a Protocol Data Unit (PDU), as well as in the behaviors of flows/conversations consisting of a coherent sequence of packets. In general, two main flavors of network steganography exist: i) directly embedding data within the PDU, or ii) modulating the timing or the sequence of adjacent/succeeding packets. Compared to other steganography domains, the goal of network steganography is not to store but to transfer the data [21]. The capacity of a steganographic method targeting networks is limited by the traffic type and the length of a transmission. Typically, this leads to a slower embedding process compared to digital media steganography [18, 38]. The data is hidden in an ephemeral manner, and the application of network steganography can increase delays and packet loss. This can impact the stealthiness of the resulting covert transmission due to the reduction of some functionality provided by the protocol or a degradation of the transmission quality [21].
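The second flavor, modulating the timing between succeeding packets, can be illustrated with a small simulation that sends no real traffic. The two delay values, the threshold, and the jitter are hypothetical and only serve to show the principle; a practical method would have to calibrate them against the actual network.

```python
from typing import List

SHORT_GAP = 0.05  # seconds; hypothetical delay that encodes bit 0
LONG_GAP = 0.15   # seconds; hypothetical delay that encodes bit 1
THRESHOLD = (SHORT_GAP + LONG_GAP) / 2


def bits_to_gaps(bits: List[int]) -> List[float]:
    """Covert sender: map each secret bit to the delay before the next packet."""
    return [LONG_GAP if bit else SHORT_GAP for bit in bits]


def gaps_to_bits(gaps: List[float]) -> List[int]:
    """Covert receiver: recover bits by thresholding observed inter-packet gaps."""
    return [1 if gap > THRESHOLD else 0 for gap in gaps]


secret = [1, 0, 1, 1, 0]
gaps = bits_to_gaps(secret)
# Simulate mild, constant jitter that stays below the decision margin.
observed = [gap + 0.01 for gap in gaps]
recovered = gaps_to_bits(observed)
```

The sketch also makes the capacity limitation visible: every hidden bit costs one inter-packet gap, so the covert bit rate is bounded by the packet rate of the carrier flow.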
Digital Media Steganography. The term digital media steganography (or, in short, media steganography) addresses the wide field of digital steganography research and development focusing on digital media (i.e., media encoded in machine-readable formats) as cover data for a plausible, secured and hidden communication. Digital media were initially designed to address the human audio-visual system (by delivering information to a screen and/or loudspeaker) and include many heterogeneous forms such as images, audio data, videos, 3D models, etc. As with the media themselves, digital media steganography comes in a wide variety of different types that can be classified by various categories. In particular, digital media steganography can focus on the media type(s) (e.g., audio steganography), the transmission method (e.g., data as spatial image or as audio stream vs. audio files) and the basic strategy concerning the existence and plausibility of a cover data (such as a data stream to embed into). Three paradigms for the message embedding can be applied: steganography by modification, by synthesis and by selection. Further, in the case of steganography by modification, the basic coding strategies of message insertion (i.e., where to embed in the cover data), the structure of how to embed the message in the cover data (usually represented as a signal or coded signal data), as well as the usage of the steganographic key are common categories.
Since media steganography became an active research field in the 1990s, a great number and wide variety of scientific works have been published on media steganography and steganalysis. The vast majority of these publications (as well as most of the tools available) have been focusing on image steganography as the most prominent sub-domain in this field [9].
It can be stated that any continuous digital media (in the sense of temporally-changing media content) can be designed both for covert storage and covert transfer. This obviously applies mainly to audio and video, which can be streamed or stored as files. Recently, streaming services have received an increasing degree of interest, as they appear to become the new main form of media delivery and consumption in entertainment.
The capacity of digital media steganography is limited by the type and size of the digital media. For digital media steganography, capacity always depends on two other characteristics to be achieved: robustness and imperceptibility for the detection of the hidden message (also related to undetectability). Some methods of media steganography can survive conversion to another format, but a plausible cover object is always required. The application of digital media methods might decrease the quality of the cover object (e.g., image quality).
Text Steganography. This distinct branch of steganography relies on hiding information in textual messages and textual documents as cover data, including those in magazines, newspapers, word processing documents, personal notes, and music notes – just to mention a few. In contrast to digital media steganography, it uses manipulation of some lexical, syntactic or semantic features of the text content, modification of different features of the text’s elements (e.g., characters, paragraphs, sentences, words, lines) or generation of a new text that simulates some features of normal text. Several examples of such techniques are presented in [27] and more recently in [11]. The latter has identified the following concepts as embedding principles in the literature: i) word spelling, ii) semantic method, iii) line shifting, iv) abbreviation, v) word shifting, vi) syntactic method, and vii) new synonym text. Since at least three of these (i.e., ii, iv, and vii) can be considered of purely semantic nature, and since, in comparison to digital media steganography, text steganography also involves printed (non-digital) text, the distinction between them and the field of digital media steganography seems reasonable.
Similar to digital media steganography, text steganography allows the permanent hiding of information, as texts are not of an ephemeral nature like network traffic. For this reason, the vast majority of proposed concepts can be categorized as covert storage techniques. However, concepts of embedding hidden information in text streams (e.g., keystrokes or scrolling text) appear feasible.
The capacity of text-based steganographic methods is mainly limited by the size and structure (including grammar, sections and use of white-spacing) of a text. However, a suitable cover text is required to make it plausible, as auto-generated texts might appear synthetic to an observer. Similar to digital media steganography, text steganography may decrease the quality of the cover object, even if imperceptibly.
Other Steganography Domains. Additional domains of steganography bring different characteristics with them. For instance, in filesystem steganography, the cover object might be a file, unused space in a partially allocated block, cluster distribution of an existing file [14], or an inode [7]. In cyber-physical systems (CPS) steganography, a value might be embedded into a sensor value [32], an actuator state or unused registers [34], or into the control logic of a PLC [15]. Hidden data might even be embedded into the number of cyber-physical events of some machine. Hildebrandt et al. published the only available pattern-based classification for CPS steganography [13], built on top of the existing one for network steganography. However, their taxonomy adds additional categories, namely for firmware accessible and program accessible patterns.
Summary. When we look at the aforementioned steganography domains, it becomes clear that cover objects are highly different, involving events, values or states, not just files or packets. For this reason, the novel taxonomy must allow for the inclusion of highly heterogeneous events and must incorporate events, values, and states to unify the patterns of steganography.
In general, a unified theory/taxonomy can be more suitable for a research area than multiple domain-specific theories, in a similar manner that universal programming languages can be advantageous over domain-specific programming languages (see Sect. 2.1 for the limitations of the domain-specific approach).
2.1 Limitations of the Current Approach
While there are several advantages of unifying the terminology and taxonomy of hiding methods (e.g., they help structuring steganalysis processes), there are also certain limitations of the current pattern-based taxonomy, which shall be addressed by our work:
(1) Currently, the available terminology and taxonomy of hiding patterns are limited to network communications, neglecting other domains of steganography.
(2) The level of abstraction of the current taxonomy does not allow for the inclusion of non-network patterns. For instance, user-data from the perspective of network steganography might be a digital media payload. However, from a digital media perspective, the network steganography context would not matter. Thus, current pattern names, e.g., Payload Field Size Modulation (see footnote 1), and taxonomy, e.g., user-data awareness, are not fully suitable. A novel taxonomy should therefore discard domain-specific abstractions. For instance, a least significant bit(s) (LSB) method applied to an image file and an LSB method applied to a network packet share the same concept, and it is the concept that matters.
(3) The current set of available hiding patterns does not discriminate between the embedding process and the representation of hidden information in a carrier, rendering the interpretation of existing hiding patterns ambiguous.
(4) Some of the original patterns are actually hybrid patterns that should be broken down into their atomic pieces to describe them clearly (see, e.g., the Sequence Modulation pattern in Sect. 4.5.1 (1)).
(5) The current systematic categorization of patterns partially follows the “open science” paradigm by providing information about new patterns through a freely accessible website. However, the inclusion of additional scientists and research groups was not actively sought, which we aim to change by encouraging scientists to participate in our consortium.
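The point made in item (2), that an LSB method is the same concept regardless of the cover, can be sketched with a single pair of functions that operate on plain bytes. The pixel and header values below are made-up sample data, and the helper names `embed_lsb` and `extract_lsb` are illustrative only.

```python
from typing import List


def embed_lsb(cover: bytes, bits: List[int]) -> bytes:
    """Overwrite the least significant bit of the first len(bits) cover bytes."""
    if len(bits) > len(cover):
        raise ValueError("cover is too small for the message")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | (bit & 1)
    return bytes(stego)


def extract_lsb(stego: bytes, count: int) -> List[int]:
    """Read back the least significant bit of the first count bytes."""
    return [byte & 1 for byte in stego[:count]]


secret = [1, 0, 1, 1]
pixels = bytes([200, 13, 77, 54])       # hypothetical grayscale pixel values
header_bytes = bytes([64, 128, 255, 1])  # hypothetical packet header field values

stego_pixels = embed_lsb(pixels, secret)
stego_header = embed_lsb(header_bytes, secret)
```

Neither function knows whether it handles image data or packet fields, which is the kind of domain-independent abstraction the revised taxonomy aims for.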
3 METHODOLOGY
We set up a consortium consisting of eleven experts from seven institutions located in four countries. During regular consortium meetings, the following methodology emerged. Given the success and the functionality of hiding patterns, we decided to keep the concept of patterns for the new taxonomy. It was further agreed that the consortium would stick to the PLML-based pattern specification that was already applied by [37]. PLML provides a comparable and unified system for the description and management of patterns [8] that is also applied in other areas, such as software engineering. A PLML-based description contains certain attributes, such as a name for the pattern, aliases, an illustration, code snippets, evidence in the form of references, example cases, and links to related patterns [8], just to mention a few. A PLML-based specification also allows existing methodology to be exploited, such as the unified description method for hiding techniques [35] and the existing framework for determining whether some hiding technique represents a new pattern or not [36]. Furthermore, PLML enables easy indexing, extensibility and linkage of patterns to keep the provided taxonomy up-to-date in the long run. By allowing the inclusion of aliases in PLML-based specifications, different terminology can be unified under a common term as well, limiting the chance for so-called scientific re-inventions [36].
1 In this paper, pattern names are written in bold font.
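How aliases can map differing terminology onto one common term may be sketched as follows. This is not the PLML schema: the `PatternEntry` class, its fields, and the lookup function are hypothetical simplifications, and treating the older network-specific pattern names as aliases of their more generic counterparts is an assumption made only for this illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PatternEntry:
    """Simplified stand-in for a pattern description (not the PLML schema)."""
    name: str
    aliases: List[str] = field(default_factory=list)
    related: List[str] = field(default_factory=list)


def build_alias_index(entries: List[PatternEntry]) -> Dict[str, str]:
    """Map every known name or alias (lower-cased) to the common pattern name."""
    index: Dict[str, str] = {}
    for entry in entries:
        for label in [entry.name, *entry.aliases]:
            index[label.lower()] = entry.name
    return index


entries = [
    PatternEntry("Event Occurrence", aliases=["Message Timing"]),
    PatternEntry("Event/Element Interval Modulation", aliases=["Inter-packet Times"]),
]
index = build_alias_index(entries)
common_name = index["message timing"]
```

With such an index, a technique described under an older or domain-specific name resolves to the same entry as its generic name, which is the effect the alias attribute is meant to achieve.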
4 A NOVEL TAXONOMY OF HIDING PATTERNS
This section presents our taxonomy for hiding patterns in a way that incorporates the characteristics of the discussed steganography domains. The central aspect of our taxonomy is to split all patterns into two categories:
(1) Embedding Patterns describe how secret information is embedded into a cover object, such as an image file or a network packet.
(2) Representation Patterns describe how the secret information is represented in a cover object.
It must be noted that when secret data is embedded via a pattern A, it is not necessarily represented by the same pattern, although it can be. Two examples illustrate this statement:
(1) Embedding Pattern = Representation Pattern: The CS sends an IP packet to the CR in which it manipulates the least significant bit of the Time to Live (TTL) field. The CR reads the very same value. Thus, the embedding uses the State/Value Modulation pattern while the hidden information is also represented by this pattern.
(2) Embedding Pattern ≠ Representation Pattern: Let us assume an indirect covert channel, where the CS exploits functionality of a central element that is observed by the CR. Let us further assume that a third-party client is disconnected from the central network node if some specific value is sent to it. The CS would then use the so-called Value Modulation pattern to cause a disconnect of a certain client from the central element. However, the CR might only be able to poll the list of (re-)connected clients at the central element, i.e., the hidden information would be represented by the Artificial Reconnections pattern introduced in [26].
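The second example can be made concrete with a toy simulation in which the sender modulates a value while the receiver only ever sees reconnection events. Everything here is hypothetical: the `CentralNode` class, the trigger value, and the assumption that the dropped client reconnects immediately are simplifications that are not taken from [26].

```python
TRIGGER = 0xFF  # hypothetical value that makes the node drop the third-party client


class CentralNode:
    """Toy central element; only the reconnection counter is visible to the CR."""

    def __init__(self) -> None:
        self.reconnections = 0

    def receive(self, value: int) -> None:
        # On TRIGGER the third-party client is dropped and, by assumption,
        # reconnects on its own right away, which increments the counter.
        if value == TRIGGER:
            self.reconnections += 1


def cs_send(node: CentralNode, bit: int) -> None:
    """Embedding side: the CS modulates the value it sends to the node."""
    node.receive(TRIGGER if bit else 0x00)


def cr_poll(node: CentralNode) -> int:
    """Representation side: the CR only reads the reconnection counter."""
    return node.reconnections


node = CentralNode()
secret = [1, 0, 0, 1, 1]
received = []
last_seen = cr_poll(node)
for bit in secret:
    cs_send(node, bit)                  # one time slot: CS embeds one bit
    now = cr_poll(node)                 # CR polls once per slot
    received.append(1 if now > last_seen else 0)
    last_seen = now
```

The CR never observes the value that the CS sent; it infers each bit from whether a reconnection happened in the slot, so the embedding and the representation are described by different patterns.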
4.1 Justification of Taxonomy Design Decisions
In previous works [22, 24, 37], several taxonomy layers specific to network steganography have been proposed, which we modified or even discarded for the new taxonomy for the reasons given in the following subsections.
4.1.1 Previous Terminology Was Based on Packets and Messages. It became clear early in the discussions that the current network steganography hiding patterns terminology does not fully reflect other steganography domains. For instance, the pattern Inter-packet Times relates only to network packets, and a more generic pattern should thus be named Event/Element Interval Modulation. A similar case is the Message Timing pattern, which has been renamed to Event Occurrence. Similarly, Value Modulation, Message Timing and other patterns need to reflect non-network-specific aspects, such as states of cyber-physical systems, texts and filesystems, which resulted in novel terms, such as State/Value Modulation.
4.1.2 Previous Terminology Focused on Payload. Another issue when transferring the network steganography terminology to the broader steganography context was the term payload, as there was a set of payload-specific patterns. From a network perspective, an image nested in a packet would be the payload, but the image would be the major focus in digital media steganography, where the network packet headers would be irrelevant. Thus, we decided to discard the term payload as well as the taxonomy abstraction between payload and non-payload. We further removed the terms user-data (as it referred to payload) and the linked terms user-data aware and user-data agnostic.
4.1.3 Syntax vs. Semantics. We decided not to discriminate between patterns that modify (corrupt) the syntax and those that modify the semantics of a cover element. This is rooted in the fact that several patterns can modify both. For example, let us assume that we apply our new pattern Elements/Features Positioning, which modulates the position of an element (we simply use a word as an element), to the sentence “Joe has the right not to sign the document after 10.00 o’clock”. So, Joe would be allowed to reject signing the document after 10.00. When we shift the position of the word “not”, we can either break the original meaning of the sentence (“Joe has not the right to sign the document after 10.00 o’clock”, i.e., now Joe is no longer allowed to sign the document after 10.00, even if he would like to do so) or the grammar (syntax) (“Joe has the right to sign the not document after 10.00 o’clock”). Similarly, a structured network packet header could be used to exemplify this aspect, cf. [33].
Structure-preserving: In this context, we consequently decided to discard the distinction between structure-preserving and structure-modifying non-temporal methods.
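The sentence example from Sect. 4.1.3 can be reproduced with a few lines that move one word to a new position. The helper `move_word` and the chosen indices are illustrative assumptions; the sketch only shows that the very same positioning operation yields a semantic change in one case and a syntactic one in the other.

```python
ORIGINAL = "Joe has the right not to sign the document after 10.00 o'clock"


def move_word(sentence: str, word: str, new_index: int) -> str:
    """Remove the first occurrence of word and re-insert it at new_index."""
    words = sentence.split()
    words.remove(word)
    words.insert(new_index, word)
    return " ".join(words)


# Same operation, different position: one result changes the meaning,
# the other breaks the grammar.
semantic_variant = move_word(ORIGINAL, "not", 2)
syntactic_variant = move_word(ORIGINAL, "not", 7)
```

Because the operation is identical in both cases, classifying the pattern by whether it affects syntax or semantics would depend on the chosen position rather than on the pattern itself.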
4.1.4 Temporal vs. Non-temporal Patterns. While temporal hiding patterns are those that modulate timing behavior (e.g., the timing between succeeding network packets), non-temporal hiding patterns are those that do not modify temporal aspects at all. However, non-temporal patterns can still be applied in a sequence. For instance, if the Elements/Features Positioning pattern is applied to one IPv4 packet header and places some IP option at a specific position in the list of options, this is a non-temporal pattern: the sequence of bits is not considered temporal and the packet is sent in one piece. However, if the Elements/Features Positioning pattern is applied to several succeeding IP packets in a row, the pattern is still considered non-temporal. Its successive application might result in transmission errors if one packet overruns another due to temporal behavior, but the embedding process was not directly focusing on or considering this temporal behavior, nor would the data be represented by the temporal behavior (but instead by the order).
Discarding Protocol-awareness of Temporal Patterns: To ease the accessibility of our taxonomy, we discarded the previous differentiation between protocol-aware and protocol-agnostic temporal patterns. Communication protocols are no longer the core subject of the new taxonomy. Moreover, methods can be protocol-aware at one layer and protocol-agnostic at another. For instance, the original Inter-packet Times pattern ([37]) requires at least awareness of low-level frames but does not need awareness of higher-layer protocols encapsulated in the frames. Additionally, if the Inter-packet Times pattern operated on a higher level, it would require an understanding of frame structures, packet structures, etc.; e.g., when timings of UDP datagrams are modulated, the IPv4/IPv6 structure must be known.
4.1.5 Discarding ICS-specific Taxonomy Categories. The categorization between firmware accessible and program accessible patterns as proposed by Hildebrandt et al. [13] was dropped for the same reasons as network-specific categorizations: they do not fit into all domains. ICS-specific patterns will be addressed in follow-up works.
4.1.6 Extendability of the Taxonomy. A key criterion for the design of our taxonomy is its extendability. As mentioned in Sect. 3, PLML will be used as a tool to achieve extendability. With PLML, patterns can be updated (also on the website) to reflect changes; they can also be added if new patterns are discovered, and aliases as well as relations between patterns can be updated.
4.2 Naming Conventions
Hiding patterns are identified by a number (Sect. 4.2.1) and a name (Sect. 4.2.2).
4.2.1 Enumeration of Patterns. As embedding patterns are of a generic nature, they are not required to reflect any steganography domain in their enumeration. Their enumeration follows the convention E[TN]n, where [TN] means that either T or N is used. Temporal embedding patterns follow the enumeration convention ETn (embedding; temporal, number n) while non-temporal patterns follow the enumeration convention ENn (embedding; non-temporal, number n). Sub-patterns add a dot followed by an additional number, e.g., ETn.x (the x-th sub-pattern of the temporal embedding pattern ETn). Additional hierarchy layers can be represented accordingly, such as ETn.x.y or even ETn.x.y.z, if necessary.
Representation patterns are always domain-specific and follow the enumeration convention R[TN]nD, where R tells us that it is a representation pattern and T and N differentiate between temporal and non-temporal hiding patterns (same as above). n is again the number of the hiding pattern. The only novelty is the parameter D=[ndtcf], which represents the steganography domain, of which the following are defined so far: n (network steganography), d (digital media steganography), and t (text steganography). We additionally define (but do not use in this paper) the steganography domains c (cyber-physical steganography) and f (filesystem steganography). This convention might be extended in the future to reflect additional steganography domains. For instance, the representation pattern RT1t tells us that it is a temporal representation pattern with the number 1 and that it belongs to text steganography.
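The enumeration conventions E[TN]n and R[TN]nD lend themselves to a small parser, sketched below. The `PatternId` type and the function name are hypothetical, and since the text describes sub-pattern numbers only for embedding patterns, the sketch accepts them only there; this restriction is an assumption, not a rule stated by the convention.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Domain letters as listed for the parameter D=[ndtcf].
DOMAINS = {
    "n": "network",
    "d": "digital media",
    "t": "text",
    "c": "cyber-physical",
    "f": "filesystem",
}

_EMBEDDING = re.compile(r"^E([TN])(\d+)((?:\.\d+)*)$")
_REPRESENTATION = re.compile(r"^R([TN])(\d+)([ndtcf])$")


class PatternId(NamedTuple):
    kind: str                  # "embedding" or "representation"
    temporal: bool             # True for T, False for N
    number: int                # the pattern number n
    sub_path: Tuple[int, ...]  # sub-pattern numbers (x, y, z, ...), may be empty
    domain: Optional[str]      # only set for representation patterns


def parse_pattern_id(text: str) -> PatternId:
    """Parse identifiers such as 'ET2', 'EN1.3.2' or 'RT1t'."""
    match = _EMBEDDING.match(text)
    if match:
        subs = tuple(int(part) for part in match.group(3).split(".") if part)
        return PatternId("embedding", match.group(1) == "T",
                         int(match.group(2)), subs, None)
    match = _REPRESENTATION.match(text)
    if match:
        return PatternId("representation", match.group(1) == "T",
                         int(match.group(2)), (), DOMAINS[match.group(3)])
    raise ValueError(f"not a valid pattern identifier: {text!r}")


example = parse_pattern_id("RT1t")
```

For the identifier RT1t used in the text, the parser reports a temporal representation pattern with number 1 in the text domain; an unknown domain letter or a missing number raises `ValueError`.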
4.2.2 Naming of Patterns. The naming of patterns follows a clear structure. A pattern name contains three components: first, its number; second, the modifiable object (e.g., Event or Feature); and third, the action of a pattern (e.g., Modulation or Occurrence) (see footnote 2). The full pattern name separates all three components by a space, e.g., ET2. Event Occurrence. Sect. 4.3 provides a list of objects and actions. However, additional objects and actions might be defined in future work.
2 Please note that the previously introduced term cover object is not meant when we refer to a modifiable object.
4.3 Glossary
As a preliminary, we introduce some basic terminology, which will be used in the remainder of the paper. Even if the creation of a non-ambiguous vocabulary for steganographic applications is outside the scope of this work, reducing possible confusion or overloading of terms is fundamental so as not to undermine the efficiency and expressiveness of the taxonomy. Specifically, we define the term modifiable object as the general object type that will be used to contain the secret information. The process of hiding data within the cover depends on the mechanism or pattern used. In the following, we refer to such a process as embedding, injecting or hiding. The term modulating will be used in case of ambiguities, especially to highlight that the secret information is not directly stored but encoded by means of variations of the cover object. The amount of data that can be hidden will be denoted as the capacity.
In general, patterns can be used both to describe the process of hiding information for storage purposes and to secretly move data between two endpoints. To avoid burdening the text, we explicitly identify the covert sender and the receiving side only when the “transmissional” nature of the embedding process is not obvious, so as to emphasize the origin and the destination of the steganographic communication.
For the specific case of defining the taxonomy as well as describing patterns, the following formal definitions have been introduced:
(1) Modifiable Objects (see Tab. 1):
  An Event describes a (timed or forced) appearance, which can be composed of several elements, e.g., 1) the appearance of a predefined character sequence; 2) a predefined specific sound in a video; 3) network connection establishment, reset or disconnection.
  An Element represents a single unit of a whole sequence, e.g., 1) a word/character of a text; 2) a pixel of an image; 3) a network packet of the whole flow.
  A Feature characterizes a property of an element to be modulated, e.g., 1) the color of a character; 2) the attribute of a tag in vector graphics; 3) the field / the size of a network packet.
  An Interval specifies the temporal gap between two events, e.g., 1) the duration of an audio file; 2) the time between sending a message and receiving the related acknowledgement.
  A State/Value denotes a non-temporal numerical or positional quantity of an element, feature, or event, e.g., 1) the values of TCP header fields (feature value); 2) the x-y-z coordinates of a player in a 3D game.
(2) Actions:
  An Occurrence is the temporal location of a given element, feature, or event observed in the cover.
  A Modulation of an element’s (or event’s) value (or state) is the selection of one particular value/state (out of multiple possible values/states).
  A Corruption refers to the blind overwriting of an element, feature or state/value.
  Enumeration means that the overall number of appearances of something is altered.
  Repeating refers to duplicating elements, events or features (multiple times). It can be considered a sub-form of the enumeration action.
  Positioning selects the non-temporal position of an element in a sequence of elements.
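The objects and actions defined above, together with the naming rule of Sect. 4.2.2 (number, modifiable object, action), can be written down as a small data model. The following Python sketch is one possible encoding; the class names and the helper pattern_name are our own illustrative choices and are not prescribed by the taxonomy.

```python
from enum import Enum


class ModifiableObject(Enum):
    """Modifiable objects from the glossary."""
    EVENT = "Event"
    ELEMENT = "Element"
    FEATURE = "Feature"
    INTERVAL = "Interval"
    STATE_VALUE = "State/Value"


class Action(Enum):
    """Actions from the glossary."""
    OCCURRENCE = "Occurrence"
    MODULATION = "Modulation"
    CORRUPTION = "Corruption"
    ENUMERATION = "Enumeration"
    REPEATING = "Repeating"
    POSITIONING = "Positioning"


def pattern_name(number: str, obj: ModifiableObject, action: Action) -> str:
    """Join number, object and action with single spaces, e.g. 'ET2. Event Occurrence'."""
    return f"{number}. {obj.value} {action.value}"
```

Names that combine two objects, such as Event/Element Interval Modulation, would need a list of objects instead of a single one; the sketch deliberately leaves this out.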
4.4 Embedding Patterns
Our novel taxonomy of hiding patterns contains two major branches
(see Fig. 2): patterns that describe how information is embedded in
a cover object and patterns that describe how embedded secret data
is represented in it.
4.4.1 Modulation of Temporal Behavior. The covert message is embedded by modulating how a behavior evolves in time.
ET1. Event/Element Interval Modulation. The covert message is embedded by modulating the gaps between succeeding events/elements, for instance by: 1) modulating the inter-packet gap between succeeding network packets (elements) or between connection establishments (events); 2) modulating the time gap between succeeding cyber-physical actions, such as acoustic beeps.
(1) Rate/Throughput: The covert message is embedded by altering the rate of events/elements (by introducing or decreasing delays). Here, several inter-event/element intervals have to be modified in a row to embed a secret message, i.e., the message is not embedded into particular inter-event/element timings but in the overall rate/throughput. Examples: 1) modulating the packet rate while sending traffic to some destination (by decreasing/increasing delays between send() actions); 2) modulating the number of produced items per hour in a production facility.
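To illustrate ET1, the following Python sketch maps secret bits to inter-packet gaps and recovers them from observed send times. It is a toy model under stated assumptions: the two gap lengths, the threshold and all function names are hypothetical, no packets are actually sent, and network jitter is ignored.

```python
from typing import List

SHORT_GAP = 0.05  # seconds, stands for bit 0 (assumed value)
LONG_GAP = 0.20   # seconds, stands for bit 1 (assumed value)
THRESHOLD = (SHORT_GAP + LONG_GAP) / 2


def bits_to_gaps(bits: List[int]) -> List[float]:
    """Covert sender: choose one inter-element gap per secret bit."""
    return [LONG_GAP if bit else SHORT_GAP for bit in bits]


def gaps_to_send_times(gaps: List[float], start: float = 0.0) -> List[float]:
    """Turn the gaps into absolute send times; n gaps need n + 1 elements."""
    times = [start]
    for gap in gaps:
        times.append(times[-1] + gap)
    return times


def send_times_to_bits(times: List[float]) -> List[int]:
    """Covert receiver: compare each observed gap with the threshold."""
    return [1 if (later - earlier) > THRESHOLD else 0
            for earlier, later in zip(times, times[1:])]
```

Each bit is carried by one particular gap here, which corresponds to ET1 itself; the Rate/Throughput sub-pattern would instead average over several succeeding gaps.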
ET2. Event Occurrence. The covert message is encoded in the temporal location of events. In comparison to ET1.1, the rate of events is not directly modulated; instead, events are triggered at specific moments in time. Moreover, ET2 can be applied to a single event, while ET1.1 needs a sequence of elements. Examples: 1) sending a specific network packet at 6pm; 2) influencing the time at which a drone starts its journey to some destination (or its arrival time); 3) performing a disconnect at a certain time.
Note: We did not include elements in this pattern in favor of EN2 and EN3. See also Sect. 4.1.4.
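A minimal way to picture ET2 is a single event whose time slot within an agreed window carries a symbol. The Python sketch below assumes that sender and receiver share the window start and the slot layout; the constants and function names are hypothetical.

```python
SLOT_SECONDS = 10       # assumed width of one time slot
SLOTS_PER_WINDOW = 6    # assumed: a 60-second window carries one symbol in 0..5


def event_time(window_start: float, symbol: int) -> float:
    """Covert sender: trigger the event in the middle of the slot that encodes the symbol."""
    if not 0 <= symbol < SLOTS_PER_WINDOW:
        raise ValueError("symbol does not fit into the window")
    return window_start + symbol * SLOT_SECONDS + SLOT_SECONDS / 2


def symbol_from_time(window_start: float, observed_time: float) -> int:
    """Covert receiver: map the observed event time back to its slot index."""
    offset = observed_time - window_start
    if not 0 <= offset < SLOT_SECONDS * SLOTS_PER_WINDOW:
        raise ValueError("event lies outside the agreed window")
    return int(offset // SLOT_SECONDS)
```

Triggering the event in the middle of a slot tolerates a delay of just under half a slot in either direction, which is a design choice of this sketch rather than a property of the pattern.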
4.4.2 Modulation of Non-temporal Behavior.
EN1. Artificial Element-Loss Modulation. The covert message is embedded by modulating the artificial loss of elements. Examples: 1) dropping TCP segments with an even sequence number; 2) removing commas in sentences [2].
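EN1 can be sketched on a plain list of sequence numbers. In the following Python illustration, every stride-th element is assumed to be a carrier that is dropped for a secret bit 1 and kept for a 0; the stride, the function names and the assumption that no element is lost for other reasons are ours.

```python
from typing import List


def embed_by_loss(seq_numbers: List[int], bits: List[int], stride: int = 3) -> List[int]:
    """Covert sender: drop the carrier element for a 1 bit, keep it for a 0 bit."""
    carriers = seq_numbers[::stride]
    if len(bits) > len(carriers):
        raise ValueError("not enough carrier elements for the message")
    dropped = {carrier for carrier, bit in zip(carriers, bits) if bit == 1}
    return [number for number in seq_numbers if number not in dropped]


def extract_from_loss(received: List[int], expected: List[int],
                      n_bits: int, stride: int = 3) -> List[int]:
    """Covert receiver: a missing carrier element is read as 1, a present one as 0."""
    seen = set(received)
    carriers = expected[::stride][:n_bits]
    return [0 if carrier in seen else 1 for carrier in carriers]
```

On a real network, genuine packet loss would be indistinguishable from artificial loss, so a practical scheme would need error handling that this sketch omits.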
EN2. Elements/Features Positioning. The covert message is embedded by modulating the position of a predefined (set of) element(s)/feature(s) in a sequence of elements/features. Examples: 1) position of an IPv4 option in the list of options; 2) placing a drink on a table to signal a Go player to play more defensively; 3) placing a specific character in a paragraph.
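The first EN2 example can be modeled with a list of option labels in which the index of one predefined marker option carries the secret value. The Python sketch below uses plain strings as stand-ins for options; the labels and function names are illustrative assumptions.

```python
from typing import List


def embed_position(options: List[str], marker: str, value: int) -> List[str]:
    """Covert sender: move the marker so that its index equals the secret value."""
    others = [option for option in options if option != marker]
    if not 0 <= value <= len(others):
        raise ValueError("value does not fit into the option list")
    return others[:value] + [marker] + others[value:]


def extract_position(options: List[str], marker: str) -> int:
    """Covert receiver: read the index of the marker."""
    return options.index(marker)
```

A list with k other options offers k + 1 positions for the marker, so one reordering carries at most log2(k + 1) bits in this sketch.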
Table 1: Differentiation between the types of objects used in this paper.
| Domain | Interval | Event | Element | Feature | State/Value |
| --- | --- | --- | --- | --- | --- |
| network steganography | time between packets | presence of flow; disconnect | network packet | size of packet; field of packet | value of header field; number of packets |
| text steganography | time between text notes sent | occurrence of character sequence | character | color of character | number of characters |
| digital media steganography | duration of audio file | occurrence of pre-defined sound in MP3 file | pixel of image | color of pixel | value of pixel; number of pixels in image |

Hiding Patterns
  Embedding Hiding Patterns
    Modulation of Temporal Behavior
      ET1. Event/Element Interval Modulation
        ET1.1. Rate/Throughput Modulation
      ET2. Event Occurrence*
    Modulation of Non-temporal Behavior
      EN1. Artificial Element-Loss Modulation
      EN2. Elements/Features Positioning
      EN3. Elements/Features Enumeration
      EN4. State/Value Modulation
        EN4.1. Reserved/Unused State/Value Modulation
        EN4.2. Random State/Value Modulation
        EN4.3. Blind State/Value Modulation
      EN5. Feature Structure Modulation
        EN5.1. Size Feature Modulation
        EN5.2. Character Feature Modulation
  Representation Hiding Patterns (Domain Specific). Example: Network Steganography
    Recognition of Temporal Behavior
      RT1n. Event/Element Interval Modulation (derived from ET1)
        RT1.1n. Rate/Throughput Modulation (derived from ET1.1)
      RT2n. Event Occurrence (derived from ET2)
        RT2.1n. Frame Corruption (derived from RT2n)
    Recognition of Non-temporal Behavior
      RN1n. Artificial Element-Loss Modulation (derived from EN1)
        RN1.1n. Artificial (Forced) Reconnections Modulation (derived from RN1n)
      RN2n. Elements/Features Positioning (derived from EN2)
      RN3n. Elements/Features Enumeration (derived from EN3)
        RN3.1n. Artificial Retransmissions Modulation (derived from RN3n)
      RN4n. State/Value Modulation (derived from EN4)
        RN4.1n. Reserved/Unused State/Value Modulation (derived from EN4.1)
        RN4.2n. Random State/Value Modulation (derived from EN4.2)
        RN4.3n. Blind State/Value Modulation (derived from EN4.3)
      RN5n. Feature Structure Modulation (derived from EN5)
        RN5.1n. Size Feature Modulation (derived from EN5.1)
        RN5.2n. Character Feature Modulation (derived from EN5.2)
* ET2 excludes Elements in its title due to EN2 and EN3, see pattern descriptions.
Figure 2: The novel, general-purpose taxonomy of embedding hiding patterns for steganography (and exemplary representation hiding patterns for the network steganography domain).

EN3. Elements/Features Enumeration. The covert message is embedded by altering the overall number of appearances of elements or features in a sequence. Examples: 1) fragmenting a network packet into either n or m (n ≠ m) fragments; 2) modulating the number of people wearing a t-shirt in a specific color in an image file; 3) repeating an element/feature by duplicating a white space character (or not) in a text [2].
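The white space example of EN3 lends itself to a short illustration. The Python sketch below encodes one bit per gap between words, with a single space for 0 and a duplicated space for 1; the function names are hypothetical and the cover text is assumed to contain only single spaces between words.

```python
import re
from typing import List


def embed_spaces(cover: str, bits: List[int]) -> str:
    """Covert sender: duplicate the space before a word for a 1 bit, keep it single for a 0 bit."""
    words = cover.split()
    if len(bits) > len(words) - 1:
        raise ValueError("cover text has too few word gaps")
    out = [words[0]]
    for index, word in enumerate(words[1:]):
        gap = "  " if index < len(bits) and bits[index] == 1 else " "
        out.append(gap + word)
    return "".join(out)


def extract_spaces(stego: str, n_bits: int) -> List[int]:
    """Covert receiver: count the spaces in each of the first n_bits gaps."""
    gaps = re.findall(r" +", stego)
    return [1 if len(gap) == 2 else 0 for gap in gaps[:n_bits]]
```

Many renderers collapse repeated spaces, which hides the modification visually but also means that copying rendered text may destroy the message.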
EN4. State/Value Modulation. The covert message is embedded by modulating the states or values of features, e.g., 1) performing intense computation to influence some temperature/clock-skew [24]; 2) modulating other physical states, such as proximity, visibility, force, height, acceleration, speed, etc. of certain devices; 3) changing values of network packet header fields (e.g., target IP address of ARP [16], Hop Limit value in IPv6 [17] or the LSB in the IPv4 TTL); 4) modulating the x-y-z coordinates of a player in a 3D multiplayer online game [39].
(1) Reserved/Unused State/Value Modulation: The covert message is embedded by modulating reserved/unused states/values, e.g., 1) overwriting the IPv4 reserved field [12]; 2) modulation of unused registers in embedded CPS equipment [34].
(2) Random Modulation: A (pseudo-)random value or state is replaced with a secret message (that also follows a pseudo-random appearance), e.g., 1) replacing the pseudo-random content of a network header field with encrypted covert content; 2) encoding a secret message in the randomized selection of a starting player in an online chess game.
(3) Blind State/Value Modulation: Blind corruption of data, e.g., 1) blindly overwriting a checksum of a PDU to corrupt a packet (or not) to embed hidden information; 2) blindly overwriting content of a file in a filesystem, neglecting its file header; 3) blindly overwriting a TCP payload.
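The LSB example of EN4 reduces to a single bit operation on an integer field. The Python sketch below works on a bare TTL value rather than on a real packet; the function names and the restriction to values from 2 to 255 (so that the modified TTL never becomes 0) are assumptions of this illustration.

```python
def embed_bit_in_ttl(ttl: int, bit: int) -> int:
    """Covert sender: replace the least significant bit of the TTL value with the secret bit."""
    if not 2 <= ttl <= 255:
        raise ValueError("TTL must be between 2 and 255 in this sketch")
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    return (ttl & ~1) | bit


def extract_bit_from_ttl(ttl: int) -> int:
    """Covert receiver: read the least significant bit of the observed TTL value."""
    return ttl & 1
```

Routers decrement the TTL on every hop, so the receiver only recovers the bit directly if it observes the value as sent or knows the hop distance; the sketch ignores this.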
EN5. Feature Structure Modulation. This hiding pattern comprises all hiding techniques that modulate the structural properties of a feature (but not states/values (EN4), positions (EN2) or number of appearances (EN3)). Examples include: 1) increasing/decreasing the size of succeeding network packets; 2) changing the color/style of characters in texts.
(1) Size Modulation: The covert message is embedded by modulating the size of an element, e.g., 1) creating additional (unused) space in network packets for embedding hidden data, such as adding an “unused” IPv6 destination option [10]; 2) altering the size of PNG files.
(2) Character Feature Modulation: Modulation of different features of characters, such as color, size (scale), font, or the position or size of different parts of some letters, e.g., 1) using upper/lower case letters in HTTP or SMTP requests [6]; 2) modulating the color of characters in text steganography.
Relations: Partially utilizes the same methods as EN4. State/Value Modulation (e.g., an HTTP header field’s character is also a value). Thus, both are linked in Fig. 2.
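The upper/lower case example of Character Feature Modulation can be sketched on a header field name, relying on the fact that HTTP field names are case-insensitive. In the following Python illustration an uppercase letter stands for 1 and a lowercase letter for 0; the function names and this bit mapping are our own assumptions.

```python
from typing import List


def embed_case(field_name: str, bits: List[int]) -> str:
    """Covert sender: set the case of the first len(bits) letters according to the secret bits."""
    letters = [ch for ch in field_name if ch.isalpha()]
    if len(bits) > len(letters):
        raise ValueError("field name has too few letters")
    out, used = [], 0
    for ch in field_name:
        if ch.isalpha() and used < len(bits):
            out.append(ch.upper() if bits[used] else ch.lower())
            used += 1
        else:
            out.append(ch)  # hyphens and remaining letters stay unchanged
    return "".join(out)


def extract_case(field_name: str, n_bits: int) -> List[int]:
    """Covert receiver: read the case of the first n_bits letters."""
    letters = [ch for ch in field_name if ch.isalpha()]
    return [1 if ch.isupper() else 0 for ch in letters[:n_bits]]
```

Unusual capitalization such as UseR-Agent is easy to spot and easy to normalize, which illustrates why this kind of modulation is simple to embed but not robust.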
4.4.3 Hybrid Embedding Patterns. It must be noted that the hybrid application of embedding methods is feasible, too. For example, the LACK method for IP telephony uses ET1 (by applying artificial delays) and EN4 (by changing the value in the payload field) [19]. As we exemplify in Sect. 4.5.1, hybrid representation patterns exist as well.
4.4.4 Example 1: Network Steganography. As discussed, network steganography is a steganography domain for which hiding patterns were already defined. Thus, our embedding patterns were designed on the basis of the hiding patterns designed for network steganography as introduced by [37] and extended/updated by [22, 24, 26]. For this reason, the embedding patterns match the known embedding strategies for network steganography.
Instead of separating patterns into timing and storage patterns, we favored the differentiation between temporal and non-temporal behavior, which is only loosely related to the original distinction. We verified that all previously known network steganography hiding patterns’ embedding functionality can be represented by the proposed embedding patterns.
To underpin the functioning of our differentiation between embedding and representation patterns for network steganography, Sect. 4.5.1 provides details for the integration of the known network hiding patterns into their corresponding representation patterns.
4.4.5 Example 2: Digital Media Steganography. Commonly, three different approaches for generating steganographic digital media data exist, depending on the role of the underlying cover. They can be related to the Embedding Hiding Patterns proposed in Fig. 2 as follows. Cover Modification modifies a pre-existing, non-steganographic carrier medium, for example by modulating the DCT coefficients in JPEG-compressed images [9], which can be categorized as EN5. In Cover Selection, the embedder generates subsets from a previously existing set of digital media, using specific attributes of the individual digital media to encode information. One possible method is the choice of photographic images from a library: bits of value 0 or 1 are encoded by explicitly selecting portrait or landscape image orientations, respectively, and sequentially broadcasting them. This falls into the category of EN2. Cover Synthesis describes the process of artificially generating digital media to embed the hidden data. An example of pattern EN4 is a computer-generated image composed of clip-arts, in which the actual selection from the clip-art library encodes the secret message (e.g., cars for a “1”, animals for a “0”), rendering the selection a value that is modulated.
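The cover selection example (portrait for 0, landscape for 1) can be sketched without touching any image data. In the following Python illustration, the Image tuple, the library contents and the function names are hypothetical, and square images are treated as portrait for simplicity.

```python
from typing import List, NamedTuple


class Image(NamedTuple):
    name: str
    width: int
    height: int


def is_landscape(image: Image) -> bool:
    # Assumption of this sketch: square images count as portrait.
    return image.width > image.height


def select_covers(library: List[Image], bits: List[int]) -> List[Image]:
    """Covert sender: pick an unused image per bit, landscape for 1 and portrait for 0."""
    portrait = [image for image in library if not is_landscape(image)]
    landscape = [image for image in library if is_landscape(image)]
    chosen = []
    for bit in bits:
        pool = landscape if bit else portrait
        if not pool:
            raise ValueError("library has no unused image with the required orientation")
        chosen.append(pool.pop(0))
    return chosen


def read_bits(images: List[Image]) -> List[int]:
    """Covert receiver: read one bit per received image from its orientation."""
    return [1 if is_landscape(image) else 0 for image in images]
```

No image is modified in this scheme; only the order and choice of unmodified covers carries the message, at a capacity of one bit per image.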
Continuous Digital Media (i.e., temporally changing media content like audio or video) further allow the modulation of temporal behavior as an embedding pattern. For example, the time intervals between samples of an audio stream can be artificially lengthened or shortened in order to encode a hidden message, which is an example of category ET1.1.
4.4.6 Example 3: Text Steganography. The taxonomy of text hiding techniques from [1] fits naturally into our embedding hiding patterns:
  Structural methods - Open Space methods, which involve the use of white space or different Unicode spaces, can be expressed with the EN3 or EN4 patterns. Line/Word shifting, which involves the position of a word in a line or of a line in a text, can be explained with EN2. Zero-Width methods (which use ZWC Unicode characters that leave no visible trace in the text to represent different groups of n secret bits) and Emoticons use EN4. Feature/Format methods can be explained with EN5.2 for characters and EN5 for other text elements (like paragraphs, sentences, etc.).
  Linguistic methods - Semantic methods, which modify semantic attributes such as the spelling of words, abbreviations, synonyms, acronyms, paraphrasing, transliterations, and so on, can be expressed with EN4. Syntactic methods, which change the diction and structure of a text without significantly altering its meaning or tone (e.g., ambiguous punctuation, shifting the location of the noun and verb, typographical errors), can be modeled via EN2 and EN1.
  Random & Statistics methods - Compression methods (which hide the secret message in the compression codewords) and Random Cover methods (which automatically generate a cover message of some type, such as jokes, lists, notes, missing letter puzzles, Ci-poetry, etc., by using a secret bitstream, such as hiding in the first letter of the keyword) can be seen as hybrid methods. Both generate a carrier from the secret bitstream, so this can be seen as a combination of Carrier Size Feature Modulation from EN5.1 and EN4.
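The Zero-Width methods mentioned in the list above can be illustrated with two invisible Unicode characters. The Python sketch below appends U+200B (ZERO WIDTH SPACE) for a 0 and U+200C (ZERO WIDTH NON-JOINER) for a 1, i.e., it uses groups of n = 1 bit; this character choice, the placement at the end of the text and the function names are assumptions of the sketch.

```python
from typing import List

# Two zero-width characters, one per bit value (assumed mapping).
ZERO_WIDTH = {0: "\u200b", 1: "\u200c"}
BIT_OF = {char: bit for bit, char in ZERO_WIDTH.items()}


def embed_zero_width(cover: str, bits: List[int]) -> str:
    """Covert sender: append one invisible character per secret bit to the cover text."""
    return cover + "".join(ZERO_WIDTH[bit] for bit in bits)


def extract_zero_width(stego: str) -> List[int]:
    """Covert receiver: collect the bits encoded by the zero-width characters."""
    return [BIT_OF[char] for char in stego if char in BIT_OF]


def strip_zero_width(stego: str) -> str:
    """Remove the hidden characters, which recovers the visible cover text."""
    return "".join(char for char in stego if char not in BIT_OF)
```

The function strip_zero_width also hints at a simple countermeasure: normalizing text by removing zero-width characters destroys the message.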
4.5 Representation Patterns
Here, essentially the same patterns can be applied as in the case of the embedding process. However, instead of describing how data is embedded, they describe how data is represented. Moreover, new patterns can be derived from representation patterns, which might not be directly reflected by embedding patterns.
As described in Sect. 2, representation patterns need not necessarily match embedding patterns during their application. Representation patterns can cover a larger variety of ideas than embedding patterns due to their domain-specific focus and because embedding patterns can cause indirect actions, such as the termination of connections without actually performing the termination. For instance, in network steganography, the patterns Artificial (Forced) Reconnections Modulation and Artificial Retransmissions Modulation have no direct counterpart on the side of embedding patterns. Such patterns are derived from their representation parent pattern (highlighted in bold font in Fig. 2).
In the remainder, we will cover network steganography representation patterns in detail to exemplify this concept. Providing a comprehensive taxonomy of representation patterns is part of our ongoing research.
4.5.1 Network Steganography. Fig. 2 (right side) shows how network steganography representation patterns can be derived from embedding patterns.
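The derivations shown in Fig. 2 can be stored as a simple lookup table, which makes it easy to trace a representation pattern back to the embedding pattern it ultimately stems from. The following Python sketch transcribes the "derived from" labels of the figure; the dictionary and function names are our own.

```python
from typing import Dict, List

# Representation pattern -> pattern it is derived from (labels taken from Fig. 2).
DERIVED_FROM: Dict[str, str] = {
    "RT1n": "ET1", "RT1.1n": "ET1.1",
    "RT2n": "ET2", "RT2.1n": "RT2n",
    "RN1n": "EN1", "RN1.1n": "RN1n",
    "RN2n": "EN2",
    "RN3n": "EN3", "RN3.1n": "RN3n",
    "RN4n": "EN4", "RN4.1n": "EN4.1", "RN4.2n": "EN4.2", "RN4.3n": "EN4.3",
    "RN5n": "EN5", "RN5.1n": "EN5.1", "RN5.2n": "EN5.2",
}


def embedding_root(pattern: str) -> str:
    """Follow the derivation chain until an embedding pattern (prefix E) is reached."""
    current = pattern
    while current.startswith("R"):
        current = DERIVED_FROM[current]
    return current


def without_direct_embedding_peer() -> List[str]:
    """Representation patterns whose parent is itself a representation pattern."""
    return sorted(p for p, parent in DERIVED_FROM.items() if parent.startswith("R"))
```

In this table, exactly RT2.1n, RN1.1n and RN3.1n are derived from another representation pattern rather than from an embedding pattern.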
Unfortunately, the current taxonomy of network steganography hiding patterns cannot be directly applied in the context of our novel taxonomy, as the distinction between timing and non-timing channels differs. The current taxonomy classifies more patterns as temporal than our taxonomy (cf. Sect. 4.1.4). Given that our definition of a temporal hiding pattern is a bit stricter than that of the existing taxonomy, we would consider several protocol-aware hiding patterns as non-temporal, in particular the original patterns Artificial Loss, Artificial Reconnections/Retransmission and Message Ordering (determining which packet/connection is lost, retransmitted, reconnected, or in which order packets appear, is based on non-temporal attributes, such as TCP sequence numbers) as well as Temperature (temperature is a state or value that can be modulated). However, the original pattern Frame Collisions remains temporal and so do all protocol-agnostic timing patterns (Inter-packet Times, Message Timing, and Rate/Throughput), although their names have been adjusted to the new terminology.
As also shown in Fig. 2, some representation patterns have no direct peer in the embedding patterns branch (bold derivations in the figure):
(1) Frame Corruptions: Frame collisions can be caused by timing a message (Event Occurrence) in a way that two messages collide. The hidden information is then represented by the collision. Also, Frame Corruptions is not a hybrid pattern as the content of the frame does not represent hidden information (and might be lost due to the collision), but the timing is the crucial information here for this pattern, i.e., when a collision happens.
(2) Artificial Retransmissions Modulation: Several embedding actions can cause a retransmission, e.g., dropping selected TCP segments using the Artificial Element-Loss Modulation pattern or overloading a TCP buffer using the Elements/Features Enumeration pattern. However, the CR would observe the caused retransmission of PDUs.
(3) Artificial (Forced) Reconnections Modulation: same as in the case of Artificial Retransmissions Modulation.
Moreover, the following previous network steganography hiding patterns are now defined as hybrid patterns, which are not shown in Fig. 2 to not burden the taxonomy:
(1) Sequence Modulation: Because of its sub-patterns, this pattern modifies the position of each element and their overall number, which renders this pattern a hybrid form of Elements/Features Enumeration and Elements/Features Positioning.
(2) Message Ordering (former PDU Ordering pattern): This pattern orders PDUs instead of a message’s elements. It is a sub-pattern of the Sequence Modulation hybrid pattern and now considered a non-temporal pattern (see Sect. 4.1.4) as the position of the PDUs and their number are interpreted.
(3) Add Redundancy and Modify Redundancy: These patterns are known to do one of the following: 1) create additional space in a PDU to place hidden data (combination of Size Feature Modulation and State/Value Modulation); 2) compress data and then use the saved space to insert secret information, e.g., changing the transmission codec for audio streams (transcoding steganography [20]), which would mean that the State/Value Modulation pattern is applied twice in a row (first for compression and then the sub-pattern Reserved/Unused State/Value Modulation).
Taking advantage of our novel taxonomy, we were able to discard the following patterns from the network steganography domain as they mix embedding and representation patterns.
(1) Value Influencing sub-pattern: This pattern was previously considered a sub-pattern of the original Value Modulation pattern. Some value is indirectly influenced by altering some surrounding condition that results in a modified value. However, this pattern actually groups two different patterns. The embedding pattern inserts hidden information by altering the surrounding value, but the representation pattern refers directly to the influenced value.
(2) Payload Field Size Modulation, User-data Value Modulation & Reserved Unused: Their concepts are already found in the respective patterns Size Feature Modulation, State/Value Modulation and Reserved/Unused State/Value Modulation, from which they were derived. As discussed in Sect. 4.1.2, we discarded the distinction between payload and non-payload.
(3) User-data Corruption pattern: This pattern refers to hybrid methods such as HICCUPS or RSTEG, which, e.g., retransmit a message and then replace the original content. The overwriting, however, must be considered as Blind State/Value Modulation, whereas the retransmission refers to the new Artificial Retransmissions pattern. While this would render the pattern a hybrid one, it was discarded for the same reason as Payload Field Size Modulation.
(4) Temperature pattern: Similarly to the Value Influencing pattern, embedding and representation must be split. The secret data is represented by some temperature value but is embedded by, e.g., high CPU load. Moreover, this pattern is domain overlapping: network load can influence the CPU temperature (network-specific pattern) while the temperature value is a physical value, belonging to the domain of CPS steganography.
4.5.2 Other Steganography Domains. As discussed, this paper focuses on embedding patterns. Thus, representation patterns were only illustrated for the network steganography domain. Future work will extend the taxonomy to cover representation patterns for additional domains, especially digital media, text and cyber-physical systems steganography.
5 ANTICIPATED STEGANOGRAPHY DEVELOPMENTS IN THE CONTEXT OF PATTERNS
Our proposed pattern-based taxonomy needs to prove its functionality under the umbrella of future trends, such as:
(1) Novel Application Domains for Steganography. We expect several new domains of steganography to emerge during the next decade. As pointed out by Bezahaf et al., the Internet will be required to adapt to certain requirements of new services, including holographic applications, autonomous vehicles, remote surgery, and automated reality [3]. Such services will provide several new options for the embedding of digital media steganography and CPS steganography. These new services will exploit 5G+ and low-earth-orbit satellite clusters while being linked to higher performance characteristics in terms of Quality of Service [3], which will provide novel communication protocols that will allow enhanced forms of network steganography. It cannot be stated whether novel hiding patterns will emerge during these developments but their application scenarios will widen.
(2) Steganography for Machine Learning (ML) Systems. Attackers could exploit ML systems and their related processes to embed secret information. The recent body of research has shown that ML can be influenced by adversarial attacks, overfitting (which eases manipulations) and data poisoning, among other aspects [25]. This development is reflected in the ongoing work to establish a taxonomy for such attacks in the form of the so-called Adversarial ML Threat Matrix by the MITRE Corporation and others [30]. For instance, adversarial manipulations of road signs for smart vehicles can lead to false categorizations of such signs. However, the process from data collection to the generation of ML-based outputs can potentially be influenced by a steganographer. For instance, one could try to influence certain aspects of raw data in a way that the ML system still provides excellent outputs in practice, while its results differ when minimal changes to specific parts of the input data are made. The output of an ML system could then represent a secret message and the modification of input data would be the steganographic key. Alternatively, secret data could be nested directly inside the ML models. A first paper that exploits federated learning for steganography is [5]. Again, ML steganography might lead to novel hiding patterns.
(3) Adaptive Countermeasures. Current steganography countermeasures are usually tailored for testbed environments, where they provide sufficient results, see [4] for a comprehensive overview. However, real-world applications not only demand very low false-positive rates, as false positives accumulate to large numbers in large-scale scenarios [31], but also have to deal with continuously changing data. For instance, since the invention of the ARPANET, the Internet’s traffic characteristics have kept changing continuously [3]. This is a significant problem since many detection methods are tailored for Internet or network traffic provided at a given time and under specific environmental attributes. For this reason, countermeasures need to be adaptive. First approaches have already been proposed, such as the dynamic warden [23]. Further research paths for countermeasures might exploit multi-agent systems (MAS) that simulate (large-scale) network environments attacked by steganography, thus allowing the behaviour of stego-malware to be predicted and countermeasures to be adjusted. At the moment, it is still unclear how dynamic countermeasures are linked to the characteristics of specific patterns.
(4) Hybrid Transfer and Storage as well as Chaining of Patterns. Our new taxonomy allows describing hiding techniques in a much more precise manner than before. Consider, for instance, the recently introduced DeadDrops [29], which exploit network protocol caches for storage while they secretly transfer the information over a network covert channel to embed the secret data inside the caches. Such methods apply one hiding pattern for embedding secret information into the transfer from the CS to the DeadDrop, further embed the information using a second pattern (e.g., to alter an NTP or ARP cache), while the CR might indirectly retrieve the information using some third representation pattern. However, such a chaining of embedding patterns in a way that multiple steganography domains are utilized is not well understood yet.
6 CONCLUSION AND FUTURE WORK
We revised the entire taxonomy of hiding patterns. Our new taxonomy provides a tool for all domains of steganography, not solely network steganography, and thus allows the utilization of hiding patterns also in digital media, text, CPS, filesystem, and other steganography areas. We also provide a clearer distinction between the embedding process and the representation by hiding patterns than is available through the current taxonomy. Wherever suitable, we kept previous terms in order to maximize backward compatibility with the old taxonomy and to ease the transition for users of the previous taxonomy.
The next steps of our consortium will be to address the following topics in order to develop our proposed taxonomy further: We plan to extend the size of our consortium so that more stakeholders from additional domains, such as filesystem steganography and CPS steganography, can contribute to it. We will further extend the representation pattern taxonomies to fully reflect each domain. This will aid the further distribution and acceptance of the model while also improving its functionalities and widening its application domain. Moreover, we are currently evaluating the integration of linked domains, such as digital watermarking, into the taxonomy, which would require the inclusion of additional experts in our consortium.
ACKNOWLEDGMENTS
Parts of the work of the Brandenburg and Magdeburg authors in this paper (i.e., on definitions and general discussions) have been funded by the German Federal Ministry for Economic Affairs and Energy (BMWi, Stealth-Szenarien, Grant No. 1501589A and 1501589C) within the scope of the German Reactor-Safety-Research-Program.
Parts of the work of Laura Hartmann have been funded by the European Union from the European Regional Development Fund (EFRE) and the State of Rhineland-Palatinate (MWWK), Germany. Funding content: P1-SZ2-7 F&E: Wissens- und Technologietransfer (WTT), Application number: 84003751, project MADISA. Her work has also been funded by Programm zur Förderung des Forschungspersonals, Infrastruktur und forschendem Lernen (ProFIL) of the University of Applied Sciences Worms.
Parts of the work of Luca Caviglione and Wojciech Mazurczyk have been supported by the SIMARGL Project - Secure Intelligent Methods for Advanced RecoGnition of malware and stegomalware, with the support of the European Commission and the Horizon 2020 Program, under Grant Agreement No. 833042.
REFERENCES
[1]
Milad Taleby Ahvanooey,Qianmu Li, Jun Hou, Ahmed Raza Rajput, and Yini Chen.
2019. Modern Text Hiding, Text Steganalysis, and Applications: A Comparative
Analysis. Entropy 21, 4 (2019), 355.
[2]
Walter Bender, Daniel Gruhl, Norishige Morimoto, and Anthony Lu. 1996. Tech-
niques for data hiding. IBM Systems Journal 35 (Nos3&4) (1996), 313—-336.
[3]
Mehdi Bezahaf, David Hutchison, Daniel King, and Nicholas Race. 2020. Internet
Evolution: Critical Issues. IEEE Internet Computing 24, 4 (2020), 5–14.
[4]
Luca Caviglione. 2021. Trends and Challenges in Network Covert Channels
Countermeasures. Applied Sciences 11, 4 (2021), 1641.
[5]
Gabriele Costa, Fabio Pinelli, Simone Soderi, and Gabriele Tolomei. 2021. Covert
Channel Attack to Federated Learning Systems. arXiv:2104.10561 [cs.CR]
[6]
Alex Dyatlov and Simon Castro. 2003. Exploitation of Data Streams Authorized
by a Network Access Control System for Arbitrary Data Transfers: Tunneling
and Covert Channels over the HTTP Protocol. Gray-world.
[7]
Knut Eckstein and Marko Jahnke. 2005. Data hiding in journaling le systems.
In Proceedings of 5th Digital Forensic Research Workshop.
[8]
Sally Fincher. 2004. PLML: Pattern Language Markup Language / Perspectives
on HCI Patterns: Concepts and Tools. CHI 2003 summary document, https:
//www.cs.kent.ac.uk/people/sta/saf/patterns/plml.html.
[9]
Jessica Fridrich. 2009. Steganography in Digital Media: Principles, Algorithms,
and Applications. Cambridge University Press. https://doi.org/10.1017/
CBO978113919290
[10]
Thomas Graf. 2003. Messaging over IPv6 Destination Options. Swiss Unix User
Group.
[11]
S. Gupta and D. Gupta. 2011. Text-Steganography: Review Study & Comparative
Analysis. International Journal of Computer Science and Information Technologies
(IJCSIT) 2, 5 (2011), 2060–2062.
[12]
Theodore G. Handel and Maxwell T. Sandford II. 1996. Hiding data in the OSI
network model. In Proceedings of the 1st International Workshop on Information
Hiding. 23–38.
[13]
Mario Hildebrandt, Robert Altschael, Kevin Lamshöft, Mathias Lange, Martin
Szemkus, Tom Neubert, Claus Vielhauer, Yongdian Ding, and Jana Dittmann.
2020. Threat Analysis of Steganographic and Covert Communication in Nuclear
I&C Systems. In International Conference on Nuclear Security: Sustaining and
Strengthening Eorts.
[14]
Hassan Khan, Mobin Javed, Syed Ali Khayam, and Fauzan Mirza. 2011. Designing
a cluster-based covert channel to evade disk investigation and forensics.
Computers & Security 30, 1 (2011), 35–49.
[15]
Prashanth Krishnamurthy, Farshad Khorrami, Ramesh Karri, David Paul-Pena,
and Hossein Salehghaari. 2018. Process-Aware Covert Channels Using Physical
Instrumentation in Cyber-Physical Systems. IEEE Transactions on Information
Forensics and Security 13 (11) (2018), 2761–2771.
[16]
Liping Ji, Yu Fan, and Chuan Ma. 2010. Covert channel for local area network. In
2010 IEEE International Conference on Wireless Communications, Networking and
Information Security. 316–319. https://doi.org/10.1109/WCINS.2010.5541791
[17]
Norka B Lucena, Grzegorz Lewandowski, and Steve J Chapin. 2005. Covert
channels in IPv6. In International Workshop on Privacy Enhancing Technologies.
Springer, 147–166.
[18]
Wojciech Mazurczyk and Luca Caviglione. 2014. Steganography in modern smart-
phones and mitigation techniques. IEEE Communications Surveys & Tutorials 17,
1 (2014), 334–357.
[19]
W. Mazurczyk and J. Lubacz. 2010. LACK – a VoIP steganographic method.
Telecommun Syst 45 (2010), 153–163. https://doi.org/10.1007/s11235-009-9245-y
[20]
Wojciech Mazurczyk, Paweł Szaga, and Krzysztof Szczypiorski. 2014. Using
Transcoding for Hidden Communication in IP Telephony. Multimedia Tools Appl.
70, 3 (2014), 2139–2165.
[21]
Wojciech Mazurczyk and Steen Wendzel. 2017. Information Hiding: Challenges
for Forensic Experts. Commun. ACM 61, 1 (Dec. 2017), 86–94. https://doi.org/10.
1145/3158416
[22]
Wojciech Mazurczyk, Steen Wendzel, and Krzysztof Cabaj. 2018. Towards
Deriving Insights into Data Hiding Methods Using Pattern-based Approach. In
Proc. Second International Workshop on Criminal Use of Information Hiding (CUING
2018). ACM, 10:1–10:10.
[23]
Wojciech Mazurczyk, Steen Wendzel, Mehdi Chourib, and Jörg Keller. 2019.
Countering Adaptive Network Covert Communication with Dynamic Wardens.
Future Generation Computer Systems (FGCS) 94 (2019), 712–725.
[24]
Wojciech Mazurczyk, Steen Wendzel, Sebastian Zander, Amir Houmansadr, and
Krzysztof Szczypiorski. 2016. Information Hiding in Communication Networks:
Fundamentals, Mechanisms, and Applications. Wiley.
[25]
Gary McGraw, Richie Bonett, Victor Shepardson, and Harold Figueroa. 2020. The
Top 10 Risks of Machine Learning Security. IEEE Computer 53, 6 (2020), 57–61.
[26]
Aleksandra Mileva, Aleksandar Velinov, Laura Hartmann, Steffen Wendzel, and
Wojciech Mazurczyk. 2021. Comprehensive Analysis of MQTT 5.0 Susceptibility
to Network Covert Channels. Computers & Security (COSE) 104, 102207 (2021).
https://doi.org/10.1016/j.cose.2021.102207
[27]
Fabien A. P. Petitcolas, Ross J. Anderson, and Markus G. Kuhn. 1999. Information
hiding-a survey. Proc. IEEE 87, 7 (1999), 1062–1078.
[28]
Birgit Ptzmann. 1996. Information hiding terminology. In Information Hiding,
Ross Anderson (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg, 347–350.
[29]
Tobias Schmidbauer, Steen Wendzel, Aleksandra Mileva, and Wojciech Mazur-
czyk. 2019. Introducing Dead Drops to Network Steganography Using ARP-
Caches and SNMP-Walks. In Proceedings of the 14th International Conference on
Availability, Reliability and Security (Canterbury, CA, United Kingdom) (ARES ’19).
Association for Computing Machinery, New York, NY, USA, Article 64, 10 pages.
https://doi.org/10.1145/3339252.3341488
[30]
Jonathan Spring. 2020. Adversarial ML Threat Matrix: Adversarial Tactics,
Techniques, and Common Knowledge of Machine Learning. Carnegie Mellon
University, Software Engineering Institute (SEI).
https://insights.sei.cmu.edu/blog/adversarial-ml-threat-matrix-adversarial-tactics-techniques-and-common-knowledge-of-machine-learning/
[31]
Martin Steinebach, Andre Ester, and Huajian Liu. 2018. Channel steganalysis. In
Proceedings of the 13th International Conference on Availability, Reliability and
Security. 1–8.
[32]
Thomas Ulz, Markus Feldbacher, Thomas Pieber, and Christian Steger. 2019.
Sensing danger: exploiting sensors to build covert channels. In Proceedings of the
5th International Conference on Information Systems Security and Privacy (ICISSP
2019), Prague, Czech Republic. 100–113.
[33]
Steen Wendzel and Jörg Keller. 2012. Systematic Engineering of Control Pro-
tocols for Covert Channels. In Communications and Multimedia Security, Bart
De Decker and David W. Chadwick (Eds.). Springer, Berlin, Heidelberg, 131–144.
[34]
Steen Wendzel, Wojciech Mazurczyk, and Georg Haas. 2017. Don’t You Touch
My Nuts: Information Hiding in Cyber Physical Systems. In 2017 IEEE Security
and Privacy Workshops (SPW). IEEE, 29–34. https://doi.org/10.1109/SPW.2017.40
[35]
Steen Wendzel, Wojciech Mazurczyk, and Sebastian Zander. 2016. A Unied De-
scription Method for Network Information Hiding Methods. Journal of Universal
Computer Science (J.UCS) 22, 11 (2016), 1456–1486. https://doi.org/10.3217/jucs-
022-11- 1456 http://dx.doi.org/10.3217/jucs-022- 11- 1456.
[36]
Steen Wendzel and Carolin Palmer. 2015. Creativity in Mind: Evaluating and
Maintaining Advances in Network Steganographic Research. Journal of Universal
Computer Science (J.UCS) 21, 12 (2015), 1684–1705. https://doi.org/10.3217/jucs-
021-12- 1684 https://dx.doi.org/10.3217/jucs-021- 12- 1684.
[37]
Steen Wendzel, Sebastian Zander, Bernhard Fechner, and Christian Herdin.
2015. Pattern-Based Survey and Categorization of Network Covert Channel
Techniques. Computing Surveys (CSUR) 47, 3 (2015).
[38]
Sebastian Zander, Grenville Armitage, and Philip Branch. 2007. A survey of
covert channels and countermeasures in computer network protocols. IEEE
Communications Surveys & Tutorials 9, 3 (2007), 44–57.
[39]
Sebastian Zander, Grenville Armitage, and Philip Branch. 2008. Covert channels
in multiplayer rst person shooter online games. In 2008 33rd IEEE Conference on
Local Computer Networks (LCN). IEEE, 215–222.