Hidden in Plain Sight: Developing Use Cases That Nefariously Utilize Twitter’s API For The Purpose of Building Covert Communications

Ali Alamri¹, Mohammed Alshehri¹, Kyle Petty¹, and Daryl Johnson²
Abstract: With over 182 billion Tweets produced by approximately
330 million accounts on Twitter’s social media platform in 2019
alone, each account posts approximately 552 Tweets on average.
Because of the large volume of traffic and Tweets on this
platform, it is a suitable candidate for a covert channel that is
hidden in plain sight. The paper defines a covert channel as any
and all forms of communication that are hidden and pass
surreptitiously between the different endpoints. The channel
exploits Twitter’s APIs in two use cases: a malware use case and a
command and control server use case. Both use cases have been
implemented to send covert messages, execute commands remotely,
and exfiltrate data by scraping, parsing, and interpreting an
account’s user profile page. Because the traffic in both use cases
is ambiguous within a social media environment, communication
between the different hosts is meant to avoid suspicion and
mitigate the risk of detection.
I. INTRODUCTION
Twitter is one of the most popular sites on the internet, with
over 330 million users on its platform, which makes a covert
channel hidden in plain sight plausible. In 2019 alone, Twitter’s
330 million accounts generated over 182 billion Tweets. The sheer
amount of traffic and noise creates a wide range of possibilities
for crafting and implementing a covert communication mechanism,
because it helps hide the channel among the billions of Tweets.
This paper explains the use cases that have been developed for
this covert communication project and the goal each one serves.
The two use cases covered are malware and a command and control
server. Having two possible methods increases the robustness of
the covert communication channel and allows us to be flexible
about which approach is taken.
II. DEFINITIONS
A. Covert Channel
A covert channel incorporates any and all forms of communication
that are hidden and pass surreptitiously between the different
endpoints. If engineered correctly, the covert channel will be
able to pass the communication in plain sight, undetected. Several
characteristics are associated with covert channels and should be
defined before classifying a communication as covert: type,
imperceptibility, robustness, throughput, detection, and
prevention.
¹ Graduate Computing Security students at Rochester Institute of
Technology.
² D. Johnson is faculty with the Department of Computing Security,
Rochester Institute of Technology, Rochester, NY 14623, USA.
B. Social Media
Social media is defined by the Merriam-Webster dictionary as
“forms of electronic communication (such as websites for social
networking and microblogging) through which users create online
communities to share information, ideas, personal messages, and
other content (such as video)” [3]. Some common social media
platforms are Twitter, Facebook, and LinkedIn. Twitter’s social
media platform is where our hidden-in-plain-sight covert channel
currently operates.
C. Command and Control Server (C2)
To fully understand what a command and control (C2) server is, the
word server must first be defined. A server has been defined as “a
computer in a network that is used to provide services (such as
access to files or shared peripherals or routing of e-mail) to
other computers in the network” [2]. A command and control server
is used differently and is defined by Radware as “centralized
machines that are able to send commands ... Anytime attackers who
wish to ... send special commands to their [C2] servers with
instructions to perform an attack on a particular target, and any
infected machines communicating with the contacted [C2] server
will comply by launching [the payload] ...” [8].
D. Application Program Interface (API)
An application program interface (API) is defined as “a set of
functions and procedures allowing the creation of applications
that access the features or data of an operating system,
application, or other service” [7]. Twitter’s developers built an
extensive API library that can be used to connect to and build
alongside Twitter’s platform.
III. LITERATURE REVIEW
This section reviews pertinent prior work. It establishes how
covert communication has previously been built on social media
and C2 servers and the approaches taken to develop it.
BotDefender: A Framework to Detect Bots in Online Social Media
discusses how to prevent and defend against bots and botnets. The
system identifies bots and botnets with a staged three-module
approach: behavior monitoring, behavior analysis, and detection.
Behavior monitoring consists of keyboard/mouse monitoring, host
behavior monitoring, and network behavior monitoring. After
extensive monitoring, BotDefender analyzes the recorded behavior
in the behavior analyzer module. The detection module then uses
the information from the previous two modules to decide whether a
bot is present and whether to report the user to the social media
platform and request suspension for operating a bot [6].
This type of framework would not work against the design of the
covert channel presented in this paper, but it points to a
potential weakness going forward that would need to be addressed.
This is discussed in the Future Work section.
The second relevant publication is Secret Message Sharing Using
Online Social Media. It discusses using photo-sharing social media
sites to exchange secret messages, thereby creating a covert
channel. The channel uses a couple of different steganography
tools to embed the message in the image before sending it out [1].
This covert channel helped spark the idea of using social media as
the platform for the channel presented in this paper.
IV. TWITTER COVERT CHANNEL
The covert channel deployed in the use cases currently uses four
different patterns to deliver messages to their respective
endpoints. A pattern is a method or feature used to deliver the
message in a way that both sides can interpret. All four patterns
use the “Retweet” and “Like” features on the respective user’s
public-facing homepage.
A. Hashtag Pattern
The first pattern was a proof of concept that uses the API’s
"Favorite" feature: when a user favorites (“Likes”) a Tweet, the
Tweet is added to a list in the user’s profile called "Favorites".
This list is visible to anyone using Twitter’s platform. The
covert communication relies on the first letter of each hashtag.
For every character of the sender’s message, the pattern scrapes
for a Tweet whose hashtag starts with that character and
"Favorites" it, which publishes the Tweet in the "Favorites" list
of the user’s profile. The receiver then reads the hashtags in
that list to recover the message. Because the list is public, the
hidden-in-plain-sight message is posted for everyone to see and
can be accessed from anywhere in the world. For example, if Alice
wants to send the message "cat" to Bob, Alice would use Twitter’s
API to look for hashtags that start with "c" (e.g., car, club),
then for hashtags that start with "a" (e.g., apple), and so on.
This approach is naive and limited: it cannot process numbers or
special characters, because Twitter hashtags do not start with a
number and cannot contain every special character.
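The first-letter encoding can be illustrated offline, without any API access. The following is a minimal sketch, assuming a small hand-written list of candidate hashtags (hypothetical data) in place of live search results; the names pick_hashtags and read_message are illustrative and are not taken from the authors' scripts.

```python
def pick_hashtags(message, candidates):
    """For each letter of the message, pick the first candidate hashtag starting with it."""
    chosen = []
    for letter in message.lower():
        match = next(
            (tag for tag in candidates if tag.lower().startswith(letter)), None
        )
        if match is None:
            # Mirrors the limitation in the text: a character with no
            # matching hashtag (e.g. a digit) cannot be sent.
            raise ValueError(f"no hashtag starts with {letter!r}")
        chosen.append(match)
    return chosen


def read_message(hashtags):
    """Recover the message from the first letter of each hashtag, in order."""
    return "".join(tag[0].lower() for tag in hashtags)


# Hypothetical candidate pool standing in for hashtags found on the platform.
pool = ["apple", "car", "club", "tech", "travel"]
tags = pick_hashtags("cat", pool)
print(tags)                # ['car', 'apple', 'tech']
print(read_message(tags))  # cat
```

A message containing a digit, such as "c4t", raises ValueError with this pool, which is the limitation the pattern description points out.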
B. 2-Tweet Pattern
The 2-Tweet pattern uses the "Retweet" feature to Retweet Tweets
in pairs. The pattern starts with a Tweet that specifies the
length of the message being sent. The sender then Retweets one
pair of Tweets per character. The first Tweet of each pair
represents an ASCII character of the message. The second Tweet
indexes the location of that character in the previous Tweet,
which must be scraped to craft the message. Both the length and
the index are carried in the creation time of the Tweet, given by
the "created_at" value returned by Twitter’s API. For instance, to
send "cat", whose characters are "63", "61", and "74" in
hexadecimal, the first Tweet represents the length of the coming
message through its "created_at" value. An example of this value
is "Tue Dec 10 09:10:03 +0000 2019"; the receiver then knows that
the coming message has a length of "03", which is the last two
digits of "09:10:03". After that, the sender Retweets a pair of
Tweets for each of "c", "a", and "t". The first Tweet of the first
pair would have the text "It’s a cruel game."; the second Tweet of
the first pair would be any Tweet from the pool of candidate
Tweets that has a value of 7 in "created_at", since "c" is at
index 7 of the previous Tweet (counting from zero). The next pair
would be a Tweet with the text "Now available at Apple", followed
by any Tweet that has a value of 4 in "created_at", since "a" is
at index 4 of the previous Tweet. Finally, the last pair would be
a Tweet with the text "I like toys r us", followed by a Tweet that
has a value of 7 in "created_at", since "t" is at index 7.
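The receiver's side of this example can be sketched as plain string handling. The sketch below makes two assumptions that the text does not state outright: the index is carried in the seconds field of "created_at", as the length is, and it counts characters from zero. The index timestamps are hypothetical values chosen to match the example, and the function names are illustrative.

```python
from datetime import datetime

# Layout of the "created_at" string, e.g. "Tue Dec 10 09:10:03 +0000 2019".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def seconds_field(created_at):
    """Return the seconds component of a created_at string as an integer."""
    return datetime.strptime(created_at, CREATED_AT_FORMAT).second


def decode_two_tweet(length_created_at, pairs):
    """Decode a message from a length Tweet and (text, index created_at) pairs."""
    length = seconds_field(length_created_at)
    chars = []
    for text, index_created_at in pairs[:length]:
        index = seconds_field(index_created_at)  # assumed zero-based
        chars.append(text[index])
    return "".join(chars)


# Tweet texts come from the example; the index timestamps are hypothetical.
pairs = [
    ("It's a cruel game.", "Tue Dec 10 09:10:07 +0000 2019"),
    ("Now available at Apple", "Tue Dec 10 09:10:04 +0000 2019"),
    ("I like toys r us", "Tue Dec 10 09:10:07 +0000 2019"),
]
print(decode_two_tweet("Tue Dec 10 09:10:03 +0000 2019", pairs))  # cat
```

Reading the index from zero is what makes the values 7, 4, and 7 in the example line up with "c", "a", and "t".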
C. Epoch Pattern
The Epoch pattern relies on the Epoch timestamp associated with
each Tweet. It uses the API Retweet feature on Tweets whose Epoch
time has, at its third to fifth index, digits that match the
character the sender wishes to send. The digits at those positions
are read as the hexadecimal ASCII code of the character being
sent. As an example of the indexing, the timestamp 1546300800
represents January 1st, 2019; its third to fifth index (counting
from zero) holds "630". If, for instance, the sender wants to send
"cat" as a message, the channel would scrape for Tweets whose
timestamps carry "63", "61", and "74" at the third to fifth index,
respectively. The Tweets’ Epoch timestamps should look similar to
154063XXXX, 154061XXXX, and 154074XXXX, where the X’s mean that it
does not matter which digits fall there. This approach is meant to
let us capture every ASCII character we wish to send, and it
relaxes the message limitation discussed for the Hashtag pattern
and below in the Limitations section.
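A minimal sketch of this decoding follows, assuming the three digits at zero-based indices 3 to 5 of a ten-digit timestamp are read as a zero-padded hexadecimal ASCII code ("063" for "c"). The function names and the sample timestamps are illustrative, not taken from the authors' code.

```python
def char_from_epoch(timestamp):
    """Read digits 3..5 (zero-based) of an Epoch timestamp as a hex ASCII code."""
    field = str(timestamp)[3:6]
    return chr(int(field, 16))


def decode_epoch(timestamps):
    """Decode one character per timestamp, in order."""
    return "".join(char_from_epoch(ts) for ts in timestamps)


def field_for(char):
    """Return the three-digit field a timestamp must carry for this character."""
    field = format(ord(char), "03x")
    if not field.isdigit():
        # A decimal timestamp only contains the digits 0-9, so a hex code
        # with a letter digit (e.g. "j" = 6a) cannot appear in it.
        raise ValueError(f"{char!r} has hex code {field}, not expressible in decimal digits")
    return field


# Hypothetical timestamps of the form 154063XXXX, 154061XXXX, 154074XXXX.
print(decode_epoch([1540630000, 1540610000, 1540740000]))  # cat
print(field_for("c"))  # 063
```

Under this reading, characters whose hexadecimal code contains a letter digit would need a different encoding, which field_for makes explicit by raising ValueError.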
D. Dates Difference Pattern
The fourth pattern also uses the "Retweet" API mechanism. Unlike
the "2-Tweet" pattern, this pattern focuses on the dates on which
the Tweets were published. Instead of using the date directly, the
covert channel computes the difference in days between a
standardized date hard-coded in the code and the publication date
of the Tweet. The resulting number is then looked up in an ASCII
table and converted to an ASCII character. The hard-coded date
must be agreed upon by the sender and the receiver so they can
compute the difference in days correctly. For example, to send
"Dog", the sender and receiver first agree on a date; this example
uses 2019-01-01. The first Retweeted Tweet would have a date of
"2019-03-10"; the difference between that date and 2019-01-01 is
68 days, so the first Retweeted Tweet represents the letter "D".
The next Tweet would have the date "2019-04-22", for a difference
of 111 days, which represents the letter "o". Lastly, a Retweeted
Tweet created on "2019-04-14" gives a difference of 103 days,
which represents the letter "g".
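The date arithmetic can be checked with a few lines of standard-library Python. This is a minimal sketch, assuming the agreed base date 2019-01-01 from the example and ISO-formatted date strings; the function names are illustrative. With this arithmetic, a difference of 68 days ("D") falls on 2019-03-10.

```python
from datetime import date, timedelta

BASE_DATE = date(2019, 1, 1)  # date agreed upon by sender and receiver


def encode_dates(message, base=BASE_DATE):
    """Return the publication date a Tweet must have for each character."""
    return [(base + timedelta(days=ord(ch))).isoformat() for ch in message]


def decode_dates(dates, base=BASE_DATE):
    """Turn each date's offset in days from the base date into an ASCII character."""
    return "".join(chr((date.fromisoformat(d) - base).days) for d in dates)


wanted = encode_dates("Dog")
print(wanted)                # ['2019-03-10', '2019-04-22', '2019-04-14']
print(decode_dates(wanted))  # Dog
```

Because printable ASCII codes range from 32 to 126, every date the sender needs falls within roughly four months of the base date.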
Fig. 1. Standardized Performance Testing Results.
Fig. 2. Standardized Performance Testing Results.
V. USE CASES
A. Use Case 1: Malware
The covert channel is meant to be as versatile as possible and
usable in different scenarios. One of the scenarios developed is a
malware-style use case. Because the channel is packaged in an
executable binary, any attack vector can be used to deliver the
payload. The main objective is compromising the victim’s machine.
Once the machine is compromised, the malware can run hard-coded
commands that search for and collect sensitive files and system
assets. Once these are collected, the malware uses Twitter’s API
to craft the communication back to the attacker’s Twitter page,
one “Retweet” or “Like” at a time.
Fig. 3. Malware Use Case Diagram.
B. Use Case 2: Command and Control Server
The main purpose of the Command and Control use case is to send
commands from a C2 server through Twitter’s API to manage a
network of bots. Each bot scrapes the C2 server’s homepage for the
message that has been crafted and posted, digesting the page Tweet
by Tweet in chronological order. Once the covert communication
channel has been established between the network of bots and the
C2 server through the anonymity of the internet, a plethora of
opportunities are at the controller’s hands.
Fig. 4. Command and Control Use Case Diagram.
VI. LIMITATIONS
A. One-Way Communication
Currently, the covert channel can only send one-way
communications. Once a command is sent, there is no feedback
other than seeing the covert channel execute the command that was
sent.
B. Rate Limits
The API currently in use limits our covert channel through rate
limits. Because GET requests could be misused by automation and/or
bots, the API’s developers imposed rate limits divided into
intervals of fifteen minutes. On top of these rate limits, all
endpoints require authentication. Because of the rate limit, the
covert channel can only make fifteen calls every fifteen minutes.
The “Like” feature of the API allows each user only one thousand
“Likes” per day, while the “Retweet” feature can be used three
hundred times every three hours. These rate limits could constrain
the covert channel going forward, so improvements to address them
are slotted for future work.
VII. COVERT CHANNEL CHARACTERISTICS
A. Type
The covert channel would be classified as a “subliminal” type
because the channel works below the senses. A subliminal channel
provides communication over a public, insecure channel, with the
caveat that the communication must look standard and/or typical on
that platform. Because the covert channel draws on a plethora of
different users’ Tweets, it is able to pick, via the API, a word
in any of those Tweets to use for a character or word needed to
craft the message.
B. Imperceptibility
Because the channel is of the subliminal type, imperceptibility
comes down to ensuring that the channel stays hidden in plain
sight and remains a secret. If the channel is compromised, the
sender can try to delete the two accounts’ email addresses, which
are currently hard-coded in the sender.py and receiver.py scripts.
C. Robustness
Given the number of users on Twitter’s platform, it is reasonable
to predict that Twitter is not going anywhere for a long period of
time. This specific covert channel could be active for over ten
years if not compromised. Under this assumption, the remaining
risks are an external compromise, in which unwanted eyes read the
traffic of the channel, and an internal one, in which Twitter
flags and suspends the account because of some sort of bot
detection and/or a violation of its terms of service.
D. Throughput
The channel currently uses an API that rate limits the number of
Tweets it will parse, and the rate limit’s fifteen-minute window
restricts the covert channel to fifteen calls every fifteen
minutes. This caps the number of Tweets that can be liked/used at
one hundred Tweets every sixty minutes, so the throughput suffers
until a solution can be defined. The covert channel can therefore
filter one hundred words every sixty minutes, which equates to
approximately 1.67 words communicated each minute. This is one of
the limitations of the covert channel, and eliminating this rate
limit so that more words can pass through the channel every hour
is considered one of the future work tasks of the project.
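The throughput figure follows directly from the stated cap of one hundred words per sixty minutes: 100 / 60 is approximately 1.67 words per minute. The short sketch below reproduces that arithmetic and estimates how long a message of a given length would take; the function names are illustrative, and the one-hundred-word cap is taken from the text rather than measured.

```python
WORDS_PER_HOUR = 100  # cap stated in the text


def words_per_minute(words_per_hour=WORDS_PER_HOUR):
    """Convert the hourly word cap into words per minute."""
    return words_per_hour / 60


def minutes_to_send(word_count, words_per_hour=WORDS_PER_HOUR):
    """Estimate the minutes needed to pass word_count words through the channel."""
    return word_count * 60 / words_per_hour


print(round(words_per_minute(), 2))  # 1.67
print(minutes_to_send(25))           # 15.0
```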
E. Detection and Prevention
Even though Twitter has strict rules on bots and on connecting
them to Twitter’s API, operating a bot is completely legal and
does not breach Twitter’s Terms of Service [10], which is why
Twitter is not actively shutting down the bots and/or botnets it
might detect. Outside of researchers, threat actors, and hobbyists
building bot detectors, this channel should be able to work,
hidden in plain sight. As described in the literature review
section, a group of researchers built a bot detector to help them
identify, report, and shut down a bot’s account if they deem the
material an issue. During the design phase of the Hidden in Plain
Sight channel, this was an obstacle to overcome. The channel
addresses it by following the theme of a subliminal covert channel
and communicating under the guise of a social media user linking
content.
Because there are projects involving Twitter that include bot
detection, the covert channel needs to consider how to prevent
detection. The bot detection projects disclosed in the literature
review would not detect the current Hidden in Plain Sight channel,
but that does not mean they cannot detect the channel in the
future. The channel therefore needs to keep changing and adapting
to prevent detection, which is tasked as future work in the
project. There has been discussion of putting a delay in the
script to interrupt the timing analysis of bot detection. One
could also design the script to move the mouse in a pattern to
help avoid mouse detection.
VIII. FUTURE WORK
A. Eliminate the Rate Limit
Because of the API currently in use by the covert channel, the
channel is limited to 100 Tweets every 60 minutes. The task will
be either to replace the current API or to find and/or create a
custom API that does not rate limit the channel. The API the
channel currently uses is open source and could be reverse
engineered and then customized to fit the channel’s needs. The
channel could circumvent the rate limit by using a special GET
request; it just needs to be designed to work specifically with
what the channel is trying to achieve.
B. Eliminate Hard-Coded E-mail Addresses
This task would automate, through a bot or a script, the creation
of an anonymous email account from a temporary email provider. The
same bot and/or script would then create a Twitter account linked
to the new email and send the details of the email, passwords, and
user names to the C2 server or sender, depending on how the covert
channel is being implemented.
C. Bot Detection Prevention
To ensure that the covert channel remains undetected and to extend
the viability of the channel, certain features can be added to
help prevent bot detection. These include process hiding, code
that moves the mouse in a fashion that helps deter automation
detection, and a sleep or delay function that deters timing
analysis detection.
IX. CONCLUSION
In conclusion, this paper broke down the two use cases that have
been developed for the covert communication project: malware and
command and control. The channel has been shown to meet the
requirements of a subliminal covert channel that could last for
years undetected. Going forward, we will continue to build and
develop more use cases to expand the project.
REFERENCES
[1] Jianxia Ning, I. S. (2014). Secret Message Sharing Using Online
Social Media. Riverside California University Publication, 1-4.
[2] Merriam-Webster Dictionary. (2019, November 14). Server.
Retrieved from Merriam-Webster:
https://www.merriam-webster.com/dictionary/server
[3] Merriam-Webster Dictionary. (2019, September 22). Social Media.
Retrieved from Merriam-Webster:
https://www.merriam-webster.com/dictionary/social
[4] Millen, J. (2008). 20 Years of Covert Channel Modeling and
Analysis. SRI International Computer Science Laboratory.
[5] Mordechai Guri, M. M. (2016). USBee: Air-Gap Covert-Channel via
Electromagnetic Emission from USB. University of the Negev: Cyber
Security Research Center.
[6] Neharika Singh, M. C. (2017). BotDefender: A Framework to
Detect Bots in Online Social Media. Journal of Network
Communications and Emerging Technologies (JNCET).
[7] Oxford Dictionary. (2019, April 21). API. Retrieved from Lexico:
Powered by Oxford: https://www.lexico.com/en/definition/api
[8] Radware Security. (2019, March 11). DDoS Attack Definitions -
DDoSPedia. Retrieved from Security at Radware:
https://security.radware.com/ddos-knowledge-center/ddospedia/command-and-control-server/
[9] Sebastian Zander, G. A. (2008). Covert Channels in Multiplayer
First Person Shooter Online Games. IEEE, 215-220. Suite, C. C.
(1997). Craig H. Rowland. First Monday, 1-6.
[10] Twitter. (2018, May 25). Twitter Terms of Service. Retrieved
from Twitter: https://Twitter.com/en/tos