The Face of Deception: The Impact of
AI-Generated Photos on Malicious Social Bots
Maxim Kolomeets, Han Wu, Lei Shi, and Aad van Moorsel
Abstract—In this research, we investigate the influence of
utilising AI-generated photographs on malicious bots that engage
in disinformation, fraud, reputation manipulation, and other
types of malicious activity on social networks. Our research aims
to compare the performance metrics of social bots that employ
AI photos with those that use other types of photographs. To
accomplish this, we analysed a dataset with 13,748 measurements
of 11,423 bots from the VK social network and identified 73
cases where bots employed GAN-photos and 84 cases where
bots employed Diffusion or Transformers photos. We conducted
a qualitative comparison of these bots using metrics such as
price, survival rate, quality, speed, and human trust. Our study
findings indicate that bots that use AI-photos exhibit less danger
and lower levels of sophistication compared to other types: AI-
enhanced bots are less expensive, less popular on exchange plat-
forms, of inferior quality, less likely to be operated by humans,
and, as a consequence, faster and more susceptible to being
blocked by social networks. We also did not observe any signifi-
cant difference between GAN-based and Diffusion/Transformers
based bots, indicating that Diffusion/Transformers models did
not contribute to increased bot sophistication compared to GAN
models. Our contributions include a proposed methodology for
evaluating the impact of photos on bot sophistication, along
with a publicly available dataset for other researchers to study
and analyse bots. Our research findings suggest a contradiction
to theoretical expectations: in practice, bots using AI-generated
photos pose less danger.
Index Terms—AI-generated photographs, GAN, Diffusion, so-
cial bots, bot evolution, disinformation, social networks.
I. INTRODUCTION
The rise of sophisticated social engineering attacks on
social networks poses a significant threat to online safety.
Despite advancements in research, social networks remain
vulnerable to malicious actors wielding disinformation, abuse,
fraud, blackmail and more. The escalating threat is fueled by
both attackers’ growing technical expertise and the increasing
accessibility of technology, leading to complex, challenging-
to-detect attacks. One of the critical instruments in an at-
tacker’s arsenal on a social network is bots – fake, hacked,
or paid accounts, masquerading as genuine users and manip-
ulating social interactions to achieve their goals [1].
Maxim Kolomeets is with Durham University, Durham, UK, and
also with Newcastle University, Newcastle upon Tyne, UK. (e-mail:
maksim.kalameyets@newcastle.ac.uk)
Han Wu is with the University of Southampton, Southampton, UK.
(e-mail: h.wu@soton.ac.uk)
Aad van Moorsel is with the University of Birmingham, Birmingham, UK.
(e-mail: a.vanmoorsel@bham.ac.uk)
Lei Shi is with Newcastle University, Newcastle upon Tyne, UK. (e-mail:
lei.shi@newcastle.ac.uk)
This work is supported by UK Research and Innovation, United Kingdom.
Grant: “AGENCY: Assuring Citizen Agency in a World with Complex Online
Harms”, EP/W032481/2.
The integration of Artificial Intelligence (AI) in bots
presents a new frontier in online deception, raising concerns
about their increasing sophistication and potential harm. De-
spite social networks’ efforts to implement countermeasures,
bot operators remain persistent in developing increasingly
complex bots that can circumvent security measures while
fostering a sense of trust among users. This evolutionary
process of bots, termed “bot evolution”, presents a critical area
of study that holds the key to devising effective bot detection
and combat techniques. One worrisome trend in this domain
is the use of AI technologies in bot creation and management,
including automating the generation of bot accounts, imbuing
them with “natural” behaviour, and tailoring them to specific
social groups. While the full impact of AI-powered bots
remains somewhat elusive, their ability to streamline attacks
necessitates further exploration. Thus, this paper breaks new
ground by focusing on the under-examined aspect of AI-
generated photos, offering crucial insights into the evolving
landscape of bot detection and mitigation.
Over the last decade, Generative Adversarial Networks
(GANs) [2], Diffusion and Transformers models (DTM) [3]–
[5] have emerged as a captivating frontier in AI technology,
garnering considerable attention due to their remarkable ability
to generate realistic photos of people [6]. As a result, these
technologies have found application in the development of
bots, leveraging AI-generated profile photos (AI-photos) to
improve their realism and bolster users’ trust in them. By using
the power of AI, these bots can now present themselves with
a degree of truthfulness that was previously unattainable, ele-
vating their potential to seamlessly blend into social contexts
and interact with human users on an unprecedented level.
In this paper, we present a rigorous qualitative comparison
between bots leveraging AI-photos and other bots, thoroughly
evaluating critical bot metrics, including bot price, survival
rate, bot quality, speed, bot-trader type, and human trust.
Our primary objective is to delve into the profound influence
of adopting AI-photos on the behaviour and characteristics
of bots. Our research tackles key challenges in bot combat, contributing to four critical aspects:
•Disinformation and fraud: We explore challenges posed
by advanced AI technology in generating realistic photos
for malicious purposes, informing strategies to combat
these threats.
•Bot detection: We examine the implications of detect-
ing bots that employ AI-photos, drawing conclusions to
inform our understanding of potential risks and develop
effective detection strategies.
•User protection: We analyse the need for heightened
protection from AI-enhanced bots compared to other bots,
ensuring user safety and security through the development
of robust mitigation strategies.
•Bot evolution: Bots are continually evolving to bypass detection mechanisms, and AI can adapt their behaviour; therefore, understanding the bot evolution process is essential to developing effective combat techniques.
This paper presents a pioneering empirical analysis that
delves into the consequential impact of employing AI-
generated photos on bot sophistication. The crux of this study
lies in its novel methodology, devised to effectively estimate
the influence of photos, which, in turn, can serve as a crucial
tool for analysing and understanding the evolution of future
bot generations. This research goes beyond scientific merit by
offering a publicly available dataset of bots’ photos along with
bots’ metrics, empowering further research and development in
bot detection and mitigation. Moreover, it reveals a concerning
“cyber arms race” effect, highlighting the escalating need
for proactive solutions. This work is a crucial step towards
understanding and ultimately combating the evolving threat
of sophisticated bots.
The paper is structured as follows: Section II provides
essential scientific context through a detailed literature review,
shedding light on the current state of bot combat and the
implications of GAN, Diffusion and Transformers technolo-
gies. Section III introduces our approach to comparing the
key metrics of AI-enhanced bots with those utilising other
types of photos, offering a structured and robust framework
for analysis. Section IV presents the results of our experiments
conducted on malicious bots from the VK social network, un-
veiling crucial insights into bot behaviours and characteristics.
Finally, Section V critically interprets these results within the
broader context of bot sophistication. This section provides
a deeper understanding of the implications of AI-generated
photos for bot evolution and their potential ramifications, guiding future research and the development of critical security measures to combat the evolving threat of sophisticated bots.
II. RELATED WORK
The combat against social bots and the evaluation of their
harmful effects constitute a multidisciplinary research area
that engages various communities spanning communication
theory, political science, data science, information security, and
more. Given the diverse perspectives from which researchers
approach the study of social bots, as highlighted in [1], a
definitive and universally agreed definition of what exactly
qualifies as a social bot remains elusive.
Given the varied terminology used by researchers to de-
scribe malicious accounts, such as social bots, fakes, sybils,
and trolls, it becomes imperative to establish a clear and
precise definition of bots for the purpose of this paper. For
consistency and clarity, we adopt the definition given in [7]
and define bots as automated or human-operated accounts that
engage in malicious activity on social networks, intending
to distort social processes to achieve the attacker’s goals.
As previous research [8], [9] has shown, bots have become
increasingly sophisticated in their creation, management, and
camouflage techniques, which now encompass the use of AI
and the recruitment of qualified personnel for bot management.
Therefore, in this paper, we extend the definition of malicious
bots beyond automated accounts, encompassing any accounts
that orchestrate “malicious artificial activity” on command,
thereby compromising the confidentiality, integrity, or avail-
ability of users or communities. This definition encompasses
bot types, ranging from those controlled by simple algorithms
to troll factories, where real people operate bots, AI-controlled
bots, bot-exchange platforms (when legitimate users perform
actions for money), and hijacked accounts of genuine users.
The diverse and ever-evolving nature of social bots, with
their continual advancements in creation and management
techniques, challenges the notion of considering them as a
homogeneous category of accounts. Termed as “bot evolu-
tion” [1], [10], this phenomenon entails the development of
increasingly sophisticated methods to outmanoeuvre security
mechanisms of social networks, bolster their survivability, and
enhance their capacity to emulate human behaviour. However,
despite their evolution, the majority of automated bot detection
solutions still rely on supervised machine learning techniques
[11], [12], which have proven to be less effective with each
successive wave of social bots that employ more advanced
technologies [1], [13]. Acknowledging this diversification and
evolution, it becomes apparent that bots cannot be treated as a
monolithic entity [14]. Their ability to adapt and employ so-
phisticated techniques demands novel detection and mitigation
approaches that can keep pace with their cunning strategies.
The incorporation of AI in bot development and manage-
ment is a critical element of bot evolution that blurs the line
between bots and genuine users [1]. With AI at their disposal,
bots can craft more authentic profiles and generate lifelike
photos, produce realistic text, and simulate natural behaviours,
thereby appearing remarkably human-like. In the realm of
AI technologies, Generative Adversarial Networks (GANs)
have garnered significant attention over the past decade. Bots
leverage GANs to fabricate photos of non-existent persons,
enabling the creation of compelling fake profiles that can
deceive unsuspecting users and evade detection efforts. Fol-
lowing GAN technology, photo generation techniques based
on Diffusion and Transformer models (DTMs), such as Stable Diffusion (https://stability.ai) and DALL·E (https://openai.com/index/dall-e/), have emerged, producing images
that are even more realistic to the human eye.
GAN was first introduced in [2] and has since emerged
as one of the most widely used deep learning techniques
for generative modelling. At its core, a GAN consists of two neural networks: a Generator, responsible for generating synthetic images, and a Discriminator, tasked with distinguishing between fabricated and authentic
images. As training progresses, the generator’s performance
improves, culminating in the generation of synthetic images
that the discriminator cannot distinguish from authentic ones.
Initially, GAN was limited to generating small images us-
ing fully connected networks. However, the introduction of
Convolutional Neural Network (CNN) architectures, such as
DCGAN [15], facilitated the generation of more complex
images with stable training. Notably, StyleGAN [16] has
marked impressive strides in generating lifelike faces that are
nearly indistinguishable from real human faces. StyleGAN
also empowers greater control over facial features, enabling
manipulation of attributes such as age, gender, and ethnicity.
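For reference, the adversarial training described above corresponds to the minimax objective introduced in [2], where G is the generator, D the discriminator, p_data the data distribution, and p_z the noise prior:
\[
\min_G \max_D \; \mathbb{E}_{x \sim p_{\mathrm{data}}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right].
\]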
Despite the impressive visual quality of GAN-images, they
often exhibit certain hallmarks that can give away their syn-
thetic origins and be exploited for detection purposes [17].
To address this challenge, a novel image detection method
has been introduced by Mandelli et al. [18], which relies on
an ensemble of Convolutional Neural Networks (CNNs) to
detect synthetic images. The key objective of this approach
is to boost the ensemble’s diversity by training each CNN to
provide “orthogonal” results. Through empirical evaluations
conducted on StyleGAN3 [16] images, the proposed GAN
detector ranked first in the competition organized by NVIDIA.
Recently, Diffusion models [3], [4] have emerged as a
powerful successor to GANs for image generation. The main
steps involve a forward diffusion process that adds noise to the
data and a reverse denoising process that removes the noise
step-by-step to generate new, clean data samples. Diffusion
models demonstrated state-of-the-art performance in various
applications, such as image generation and natural language
processing [19]. The variations of diffusion models have fur-
ther enhanced the capability to generate high-quality images.
Notable examples include (i) GLIDE [20] (Guided Language
to Image Diffusion for Generation and Editing), which is
designed for image generation guided by text prompts, and
(ii) Latent Diffusion Models (LDM), which operate in a
compressed latent space to improve efficiency [21].
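Concretely, the forward process described above corrupts a clean sample x_0 with Gaussian noise according to a variance schedule β_t, and admits a closed form for sampling x_t in one step [3]:
\[
q(x_t \mid x_{t-1}) = \mathcal{N}\big(x_t;\, \sqrt{1-\beta_t}\,x_{t-1},\, \beta_t I\big), \qquad x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon,
\]
where \(\bar{\alpha}_t = \prod_{s=1}^{t}(1-\beta_s)\) and \(\epsilon \sim \mathcal{N}(0, I)\); the reverse network is trained to predict this noise and remove it step by step.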
Transformer-based models are also widely adopted in the
field of generative AI. These models are designed to handle sequential data; their self-attention mechanisms effectively capture long-range dependencies. Taming Transformers [5] combines transformers with CNNs for high-quality image syn-
thesis. DALL-E is designed to generate images from textual
descriptions by leveraging the transformer capabilities [22].
Another perspective on this issue is how users assess the
trustworthiness of social network profiles, particularly in the
context of AI-photos. A recent study by Mink et al. [23]
sheds light on this issue through a role-playing experiment,
where participants were asked to review fake profiles created
by the researchers, featuring photos generated using AI. The
alarming results, revealing a connection acceptance rate of
79%–85%, expose a dangerous over-trust in deepfake profiles
among average users. This highlights the critical need for
more sophisticated bot detection strategies and user education
initiatives to combat the growing threat of online deception.
As AI technologies have become widely accessible over time, it is evident that bot owners have already embraced their potential. Previously, bot owners had to steal photos from
other users to create their bots. However, they can now use
neural networks to generate fake photos with less effort. In
this paper, we set out to address a set of logical and pertinent
questions: How does AI technology impact social bots and the bot market? Does it make them more challenging to identify, better
at deceiving users compared to other bots, more expensive in
the bot market, and in general, more sophisticated?
III. METHODOLOGY
To empirically estimate the effect of AI-faces on social bots,
we propose comparing the metrics between AI-enhanced bots
and bots using other photo types. This section describes the
key elements of the methodology and the final approach.
A. Formation of sets for comparison
For the purpose of comparison, we divide bots into sets
corresponding to the types of photos they use. To create a
photo for a social bot account, a bot-trader leverages multiple
techniques influenced by factors such as technical complexity,
perceived effectiveness, and implementation cost. This paper
examines photo types commonly employed by bots (Table I).
In our proposed methodology, we assume that the bot-trader
employs a decision tree selection procedure, as depicted in
Figure 1. The leaves of this tree represent types of photos
used. Therefore, as the first step, we identify the photo type
by employing predictions from several neural networks (A-E
in Figure 1):
•YOLO – detects a person (A in Figure 1);
•CelebDetector – detects a face and identifies celebrities (B and C);
•GAN-image-detection – detects GAN usage (D);
•Diffusion-detection – detects GAN, Diffusion, and Transformers usage (E).
The YOLO (“You Only Look Once”) [25] object detection
algorithm facilitates rapid and precise detection of objects,
including individuals, in images and video streams. YOLO has
evolved through several versions, each refining its capabilities.
The algorithm is based on a single neural network that divides
the image into a grid and predicts bounding boxes, confidence
scores, and class probabilities for each grid cell. YOLOv1,
released in 2016, was the first version of the framework and
achieved real-time object detection with acceptable accuracy.
Recently, the YOLOv7 [26] model was proposed, which reduces the parameters of state-of-the-art real-time object detectors by about 40% and their computation by about 50%, resulting in faster inference and higher detection accuracy. In
this paper, we adopt YOLOv7 to detect persons in images.
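To illustrate the person check (condition A in Figure 1), the sketch below uses the off-the-shelf ultralytics package with a COCO-pretrained checkpoint as a stand-in for the YOLOv7 setup used in the paper; the package, weights file, and confidence threshold are our assumptions, not the authors' exact configuration.

```python
# Minimal sketch: decide whether a profile photo contains a person.
# Assumptions: ultralytics package and a COCO-pretrained yolov8n.pt
# checkpoint as a stand-in for YOLOv7; 0.5 confidence threshold.
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # COCO class 0 is "person"

def contains_person(image_path: str, conf: float = 0.5) -> bool:
    results = model(image_path, verbose=False)
    return any(
        int(box.cls) == 0 and float(box.conf) >= conf
        for box in results[0].boxes
    )

print(contains_person("bot_profile_photo.jpg"))  # hypothetical input file
```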
CelebDetector, a tool developed by [27], [28], can be
used for celebrity detection. The tool employs a pipeline
consisting of the following steps: (1) face identification on a
given photo using Multi-Task Cascaded Convolutional Neural
Networks (MTCNN) [29], (2) vectorization of the extracted
face image using the VGGFACE encoder model [30], [31],
and (3) matching the analysed face with that of a celebrity
using Spotify’s ANNOY (Approximate Nearest Neighbors)
library [32]. Given its combined face extraction and celebrity
detection capabilities, we employed this pipeline for the B and
C elements of the detection conditions outlined in Figure 1.
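A rough sketch of this pipeline follows; it swaps in facenet-pytorch's MTCNN and an InceptionResnetV1 encoder pretrained on VGGFace2 in place of the original MTCNN/VGGFACE components, and the index file and its parameters are hypothetical.

```python
# Sketch of a face-matching pipeline in the spirit of CelebDetector:
# (1) detect and crop the face, (2) embed it, (3) find the nearest
# celebrity embedding with ANNOY. Models and index are stand-ins.
from PIL import Image
from facenet_pytorch import MTCNN, InceptionResnetV1
from annoy import AnnoyIndex

mtcnn = MTCNN(image_size=160)                              # face detector/cropper
encoder = InceptionResnetV1(pretrained="vggface2").eval()  # 512-d face embeddings

index = AnnoyIndex(512, "angular")
index.load("celebrity_faces.ann")  # hypothetical prebuilt celebrity index

def match_celebrity(photo_path: str):
    face = mtcnn(Image.open(photo_path).convert("RGB"))
    if face is None:               # no face found (condition B fails)
        return None
    vec = encoder(face.unsqueeze(0)).squeeze(0).detach().tolist()
    ids, dists = index.get_nns_by_vector(vec, 1, include_distances=True)
    return ids[0], dists[0]        # nearest celebrity ID and its distance
```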
GAN-image-detection [33], [34], a method designed to distinguish real images from synthetic ones, utilises an ensemble of
Convolutional Neural Networks (CNNs) trained on disparate
training datasets, which are “orthogonal” to each other. “Or-
thogonal” datasets are defined as datasets that include images
with different content that underwent different processing or
were synthesised by different GANs.
TABLE I
TYPES OF PHOTOS THAT BOT-TRADERS CAN USE TO CREATE A BOT ACCOUNT.

photo type | description | technique | possible motivation
without person | photo of an animal, abstraction, etc. | search in a database, image search engine(s) | imitate a person who wants to remain anonymous
with person, without face | a person is visible but cannot be recognized | search in a database, image search engines, social network(s); may be processed | imitate a person who wants to remain anonymous; show the qualities of a person (age, hobbies, gender, etc.)
with person & stolen face | a photo is a duplicate of an existing person | search in social network(s) | the bot may look more like a real user
with person & stock face | a photo of some celebrity or well-known person | search in a database, image search engine(s) | the bot may look like a fan
with person & AI-face | photo generated by AI | own AI neural network, or service [24] | the bot may look more like a real user; will be unique
Fig. 1. Formation of sets with bots associated with the type of photo. Detection conditions illustrate how each type can be identified based on the outputs
of photo classifiers.
This approach capitalises on the fact that each CNN captures slightly different informa-
tion, and the ensemble of many CNNs trained in this orthog-
onal fashion achieves superior results compared to training a
single CNN over the entire dataset. This GAN detector secured
first place in the NVIDIA competition, exhibiting exceptional
performance that surpassed established multimedia forensics
experts. The authors [34] conducted eight testing scenarios,
achieving a minimum AUC of 0.9919 and an outstanding
global test AUC of 0.9995. These results indicate that the
detector proficiently identifies GAN-photos with an accuracy
that closely approaches the ground truth.
Since Diffusion and Transformer models have been pro-
posed only recently, their forensic properties have not been
extensively studied. As noted in [35], [36], existing generated
image detectors experience dramatic performance drops when
facing images generated by DTM. We tested the GAN detector
used in this study [33], [34] on a dataset with Diffusion and
Transformer images presented in [36] and found that it could
not effectively detect them (see Figure 2). For this reason, we
use a second AI detector that can detect DTM images – the
Diffusion-detection model presented in [36], [37]. This model
was trained on a dataset of ProGAN images [38]. The authors
reported an average AUC of approximately 90.8% when tested
on different GAN, Diffusion and Transformer images, the
same images we used to test the GAN detector (see Figure
2).
Fig. 2. False negatives produced by the GAN detector [33], [34] indicate that it cannot detect Diffusion and Transformers images (based on the dataset presented in [36]).

After processing photos through the classifiers, we apply specific conditions (outlined in Figure 1) to the outputs of the neural networks to determine the photo types. Because the GAN-
image-detection model efficiently detects only GAN-generated
images, while the Diffusion-detection model can detect images
generated by GANs, Diffusion models, and Transformers, we
use the logical conditions depicted in Figure 1 to identify
whether a bot uses a GAN or a DTM-generated photo. We
conclude that an image is GAN-generated if it is detected by
the GAN-detector, and DTM-generated if it is detected by the
Diffusion-detector but not by the GAN-detector. This allows
us to compare these technologies separately.
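To make these conditions concrete, the sketch below expresses the decision tree of Figure 1 as plain boolean logic over the classifier verdicts; the function and argument names are illustrative assumptions rather than the authors' implementation.

```python
# Sketch of the photo-type decision logic from Figure 1.
# Inputs are boolean verdicts of the classifiers A-E described above.
def classify_photo(has_person: bool, has_face: bool, is_celebrity: bool,
                   gan_detected: bool, dtm_detected: bool) -> str:
    if not has_person:
        return "without person"
    if not has_face:
        return "person without face"
    if gan_detected:               # GAN detector fires -> GAN photo
        return "person with GAN face"
    if dtm_detected:               # DTM detector fires while GAN does not
        return "person with DTM face"
    if is_celebrity:
        return "person with stock (celebrity) face"
    return "person with stolen face"
```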
After that, we conduct a manual analysis of the identified
GAN and DTM photos to remove obvious misclassifications
by the detectors, thereby improving the quality of further
analysis.
This categorisation allows us to effectively divide the dataset
into sets, each corresponding to a bot’s photo type, so we can
compare them.
B. Bot metrics
The second element of our methodology involves the met-
rics used to compare different bot sets. These metrics stem
from observations made during our previous research [7] on
bot collection and bot recognition by a human:
1) Price: This metric quantifies the cost charged by a specific bot-trader for a desired bot action, e.g., “liking” a photo. It measures the monetary value in currency and spans the range [0, +Inf).
2) Bot-Trader Type (BTT): This is a binary categorisation
of “bot-factory” used to manage bot activity sales. The
two distinct categories are {SHOP, EXCHANGE}. A
“SHOP” refers to a website owned by a single bot-trader,
where customers can purchase specific bot activity ser-
vices, like obtaining 100 comments from a single bot
owner. In contrast, an “EXCHANGE” is a web-platform,
where customers can create requests for desired bot
activities, and the bot owner can fulfil a single element
of the request. For instance, a customer can request
100 comments, and a single bot owner can provide one
comment. Bots from “EXCHANGE” platforms often
exhibit unique features including diversity and manual
control. They may involve not only fake accounts but
also real users who seek to earn money using their
accounts.
3) Normalised Bot Quality (NBQ): This metric reflects the
quality of bots as identified by experts based on bot
quality descriptions provided by bot-traders. To ensure
consistency across different traders’ descriptions (e.g.,
one trader using terms like “common” to “ultima” while
another using “normal” to “best”), NBQ is categorised
as [LOW, MED, HIGH] based on the interpretation of
a human expert.
4) Speed: This metric estimates a bot’s efficiency in ex-
ecuting actions promptly. Bots operating automatically
are expected to perform actions at a faster pace, while
manually operated bots may be slower. Speed is cate-
gorised as [MIN, HOUR, DAY], indicating the expected
time taken for a bot to perform an action.
5) Survival Rate (SR): This metric represents the probabil-
ity that a bot will be detected and subsequently blocked
by the security mechanism of a social network.
6) Trust: This metric refers to the likelihood that humans
can identify a bot. It indicates how authentic a bot
account appears and reflects the bot’s capacity to deceive
humans. The “Trust” metric reflects how convincing the
bot appears to humans (human defence ability), while
the survival rate measures the platform’s effectiveness in
detecting and blocking bots (platform’s defence ability).
The metrics used in our study were sourced from the
MKMETRIC2022 dataset [39], which contains identifiers (ID)
of 22,325 bot accounts (among which 18,444 are unique)
and their corresponding metrics for the Russian social net-
work VKontakte. The dataset also includes price, BTT, NBQ,
and speed metrics, which were estimated using information
gathered by purchasing bot activity into “honeypots”, which
are traps that mimic the victim’s account. The survival rate,
presented in the dataset, was measured two months after the
purchase. The “Trust” metric was determined by conducting
Turing tests, in which individuals attempted to recognise bots.
It is important to note that “Trust” can be measured in
various ways, considering factors such as blocked accounts.
These different methods typically exhibit a strong correlation,
following the Chaddock scale. In this study, we used a specific
variation of “Trust” as a metric, denoted as Trust^{IDN}_{NZ} in the previous research [7].
All metrics in the dataset serve as reflections of the bot’s
“sophistication”. To aid in the understanding of these metrics,
we have summarised them in Table II, indicating the direction
of sophistication based on the values. Cells highlighted in blue
represent cases where higher values indicate greater sophis-
tication (e.g., a higher price denotes greater sophistication).
Conversely, cells highlighted in red indicate situations where
higher values signify lower sophistication (e.g., a higher SR denotes lower sophistication, as it reflects the probability that the bot will be detected and blocked by the social network). To summarise, sophisticated bots
tend to be more expensive due to their advanced technology
and higher demand. They are commonly obtained from ex-
change platforms that offer a diverse selection of bots and are
regarded as being of superior quality based on rankings given
by bot-traders. However, they may operate at a slower pace,
as they are more likely to be controlled by humans to perform
intricate actions. Sophisticated bots may also exhibit a lower
SR, reducing the likelihood of being detected and blocked by
security mechanisms, and their “Trust” metric is usually lower,
making them less likely to be recognised as bots by humans.
The selection of the MKMETRIC2022 dataset for our study
is primarily due to (a) its detailed level of information and (b)
the data collection technique employed, which distinguishes it
from other existing datasets.
Most bot datasets are gathered using human-annotation
methods where users attempt to identify bots. In one of our
previous studies [40], we demonstrated that this approach is
prone to errors, such as mislabeling genuine users as bots and
vice versa. These errors increase with more sophisticated bots,
which are particularly successful at deceiving users. Conse-
quently, such datasets will likely contain numerous errors and
fewer sophisticated bots. In contrast, the MKMETRIC2022
was gathered using the “honeypots” technique, which involves
purchasing fake activity on the bot market. This method
eliminates false positives (genuine users misidentified as bots)
and minimises false negatives. It is also very diverse, as it is
sourced from 66 bot traders’ offers, ensuring the inclusion of
a wide variety of bots available in the bot market, which helps
us support the generalizability of our findings.
Another compelling reason for choosing the MKMET-
RIC2022 dataset is its inclusion of pre-estimated bot metrics
that focus on detailed market-based characteristics. In contrast,
most datasets are tailored for binary bot detection or account
credibility estimation and often lack comprehensive descrip-
tions of bot characteristics, primarily providing only “impact”
metrics. These impact metrics might track the frequency of a
bot’s posts, comments, or photos [41], analyse differences in
the complexity and centrality of social network graphs [42],
or assess text content similarity [43]. They may also include
text content metrics for sentiment analysis [44] and references
to specific individuals, products, or other entities [45], such
as instances that might discredit a politician. While helpful
in detecting bots and assessing the impacts of their activities,
these metrics fall short in providing detailed characterisations
of the types of bots involved. In contrast, the market charac-
teristics included in MKMETRIC2022 are more suitable for
our analysis as they offer insights into the technologies, efforts,
and financial resources behind the bots being studied. As noted
in [14], bots vary significantly, and categorising them by their
specific characteristics is crucial for a comprehensive analysis.
Thus, MKMETRIC2022 offers a high level of reliability
due to its “honeypots” bot collection technique and its de-
tailed market characteristics of bots, which makes the dataset
particularly valuable for our analysis compared to others.
TABLE II
BOT METRICS SUMMARY.

metric | range (categorical & numerical) | if lower | if higher | sophistication direction
price | {0, +Inf} | cheap bot | expensive bot | ↑
BTT | {SHOP = 0, EXCHANGE = 1} | one owner, automated | multiple owners, manual control | ↑
NBQ | {LOW = 0, MED = 1, HIGH = 2} | low quality | high quality | ↑
speed | {MIN = 0, HOUR = 1, DAY = 2} | fast | slow | ↑
SR | {0, 1} | will survive | will be blocked | ↓
Trust | {0, 1} | difficult to recognise | easy to recognise | ↓
C. Comparison methods
To assess the variation in bot metrics between bots with
AI-photos and others, we propose employing three methods.
Hypothesis testing. Our first proposed technique is a
hypothesis test aimed at evaluating the difference in the
distributions of the metrics between “AI-enhanced bots” and
other types of bots. As we will demonstrate in our experiments
(Section IV), “AI-enhanced bots” are not prevalent, making it
challenging to discern whether observed differences in metrics
are statistically significant or merely due to random variation.
For that reason we use hypothesis testing to ensure that any
detected differences are not the result of the small sample
sizes but reflect genuine discrepancies. For this analysis, we
formulate a null hypothesis (H0) that there is no significant
difference between the two groups. Conversely, the alternative
hypothesis (H1) posits that there is indeed a difference in their
distributions. Formally, we can express this as:
\[
\begin{aligned}
& H_0: \mathrm{metric}_{GAN|DTM} = \mathrm{metric}_{other}, \\
& H_1: \mathrm{metric}_{GAN|DTM} \lessgtr \mathrm{metric}_{other}, \\
& p\text{-value} < 0.05 \Rightarrow \text{reject } H_0,
\end{aligned}
\tag{1}
\]
where metric ∈ {price, BTT, NBQ, speed, SR, Trust}, and other ≠ AI corresponds to one of the other photo types (no person, person without face, person with face).
For our testing, we use both left-sided and right-sided
Kolmogorov-Smirnov hypothesis tests. These tests enable us
to estimate whether a metric tends to be larger or smaller for
accounts with GAN or DTM-photos.
We also analyse whether there is a significant difference between GAN and DTM bots using a two-sided Kolmogorov-Smirnov test for each metric (\(H_0: \mathrm{metric}_{GAN} = \mathrm{metric}_{DTM}\), \(H_1: \mathrm{metric}_{GAN} \lessgtr \mathrm{metric}_{DTM}\)).
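These one-sided and two-sided tests map directly onto SciPy's two-sample KS test; the snippet below is a minimal sketch, with the metric arrays as illustrative placeholders.

```python
# Minimal sketch of the KS tests described above, using SciPy.
# `ai_bots` and `other_bots` stand in for 1-D arrays of one metric
# (e.g., price) for AI-enhanced bots and a comparison set.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
ai_bots = rng.normal(0.4, 0.1, size=70)       # placeholder metric values
other_bots = rng.normal(0.6, 0.2, size=5000)

# One-sided tests (note: SciPy's 'less'/'greater' refer to the
# relation between empirical CDFs, not the raw metric values).
left = ks_2samp(ai_bots, other_bots, alternative="less")
right = ks_2samp(ai_bots, other_bots, alternative="greater")
# Two-sided test, as used to compare GAN and DTM bots.
both = ks_2samp(ai_bots, other_bots, alternative="two-sided")

for name, res in [("less", left), ("greater", right), ("two-sided", both)]:
    print(f"{name}: statistic={res.statistic:.3f}, p={res.pvalue:.2e}")
```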
Analytical statistics. For a more detailed analysis, we propose calculating statistics for each metric and photo type and comparing them visually with:
1) Distribution plot – compares the probability of observing specific metric values for different types of bots.
2) Delta mean matrix – compares the differences between metric means for different types of bots in the form of a matrix.
By conducting hypothesis tests and comparing the distribution
of metrics between AI-enhanced bots and other types of
bots, we can quantitatively measure the extent of differences
observed in AI-enhanced bots compared to other bots.
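A minimal sketch of the delta mean matrix computation follows; the DataFrame contents and column names are illustrative placeholders, not real measurements.

```python
# Sketch of the delta mean matrix: pairwise differences between the
# means of one metric across bot photo types.
import pandas as pd

df = pd.DataFrame({
    "photo_type": ["no person", "no face", "stolen/celeb", "GAN", "DTM"] * 2,
    "price": [0.9, 0.8, 1.0, 0.5, 0.45, 0.95, 0.85, 1.1, 0.55, 0.5],
})

means = df.groupby("photo_type")["price"].mean()
# delta.loc[i, j] = mean(price | type j) - mean(price | type i)
delta = pd.DataFrame({j: means[j] - means for j in means.index})
print(delta.round(3))
```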
Bot-traders connectivity. If one bot appears in the offers of
multiple traders, we can conclude that these bot-traders have
a common pool of bots. Thus, we can estimate the prevalence
of AI among bot-traders by creating a graph connecting bot-
traders based on the number of common bots:
\[
G = (V, E), \qquad E = \frac{\|A \cap B\|}{\|\min(A, B)\|}. \tag{2}
\]
The vertices V in the graph G represent the bot-traders' offers, while the edges E represent the common fraction of bots between traders A and B. To determine the common fraction, we calculate the intersection of sets A and B (its cardinality ‖A ∩ B‖ is the number of common bots) and divide it by the cardinality of the smaller set (‖min(A, B)‖). This yields the proportion of bots that both A and B have in common. This graph effectively illustrates
the connections among bot-traders, showcasing how they share
and interact with common bots. After constructing the graph,
we can colour each vertex according to the ratio of AI-enhanced bots to the size of the entire offer A:
\[
ratio_A = \frac{|A_{AI}|}{|A|}, \tag{3}
\]
where |A_{AI}| represents the number of AI-bots in bot-trader A's offer and |A| is the total number of bots in A's offer.
We also propose plotting the distribution of this ratio as a bar chart to observe the prevalence of AI technology among bot-traders.
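A minimal sketch of this construction with networkx follows; the offer dictionary and AI-bot set are illustrative placeholders.

```python
# Sketch of the bot-trader connectivity graph from Eqs. (2)-(3).
# `offers` maps each bot-trader to the set of bot IDs in its offer;
# `ai_bots` is the set of bot IDs classified as AI-enhanced.
import networkx as nx

offers = {"trader1": {1, 2, 3, 4}, "trader2": {3, 4, 5}, "trader3": {6, 7}}
ai_bots = {3, 6}

G = nx.Graph()
for trader, bots in offers.items():
    # vertex attribute: ratio of AI-enhanced bots in the offer, Eq. (3)
    G.add_node(trader, ai_ratio=len(bots & ai_bots) / len(bots))

traders = list(offers)
for i, a in enumerate(traders):
    for b in traders[i + 1:]:
        common = offers[a] & offers[b]
        if common:  # edge weight: shared fraction of bots, Eq. (2)
            w = len(common) / min(len(offers[a]), len(offers[b]))
            G.add_edge(a, b, weight=w)

print(nx.get_node_attributes(G, "ai_ratio"))
print(nx.get_edge_attributes(G, "weight"))
```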
D. Summary of the proposed approach
In summary, the proposed methodology aims to estimate
the effect of AI-faces on social bots by conducting a thorough
comparison of metrics between AI-enhanced bots and other
types of bots. The methodology consists of three elements:
formation of sets, association of sets with bots metrics, and
comparison of bot metrics:
1) To form sets for comparison, we divide bots into sets corresponding to the photo types that bots use: no person, person without face, person with stolen face, person with celebrity face, person with face generated by GAN, and person with face generated by DTM. We use predictions of the YOLO, CelebDetector, GAN-image-detection and Diffusion-detection neural networks to
identify these photo types. After processing the photos
through classifiers, we apply logical conditions to deter-
mine the corresponding photo type of each bot. Afterwards, we manually review the detected GAN and DTM photos and remove obvious misclassifications.
2) The study incorporates several bot metrics, including
price, Bot-Trader Type (BTT), Normalised Bot Quality
(NBQ), speed, survival rate, and the “Trust” metric.
These metrics were sourced from the MKMETRIC2022
dataset, which contains identifiers of VKontakte bot
accounts and their corresponding metrics.
3) To comprehensively assess the variation in bot metrics
between bots with AI-photos and other types of bots,
we use three methods: hypothesis tests to identify the
statistical differences in metrics, analytical statistics to
describe these differences, and an analysis of bot-traders
connectivity to understand the prevalence of AI technol-
ogy among bot-traders.
IV. EXPERIMENTS AND RESULTS
For experiments, we downloaded photos using the identifiers
of bots in the MKMETRIC2022 dataset [39]. The dataset
consists of 66 bot-trader offers, encompassing a total of 22,325
bot ID records. Out of these records, 18,444 bots are unique,
while the remaining bots appear in offers from multiple
bot-traders. This suggests that certain bots are under shared
control, with two or more bot-traders managing the same
accounts. However, we could successfully download photos
for only 11,423 accounts: some bots were already blocked,
and five damaged photos could not be retrieved. Despite these
limitations, the dataset still provides valuable insights into the
characteristics of a substantial number of bots. We used these
photos as the input for classifiers and obtained the following:
1) Without person: 1,866 bots were classified in this
category, with 1,581 of them being unique.
2) Person without face: 3,370 bots (2,776 unique).
3) Person with stolen or celebrity faces: The celebrity
recognition classifier identified 89 unique bots, but only 3 of them had a confidence score > 0.2. As we verified manually, most predictions below this confidence level were wrong. Because of these poor results, we decided to merge the celebrity and stolen-face classes. The merged class includes 8,355 bots (6,952 unique).
4) Bots with GAN-faces: 73 bots (83 before manual
filtering – Figure 7), with 52 of them being unique.
5) Bots with DTM-faces: 84 bots (108 before manual
filtering – Figure 7), with 62 of them being unique.
Overall, 13,748 bots were classified, and among them,
11,423 were unique. Figure 3 depicts the resulting dataflow and statistics.
Fig. 3. The result dataflow and statistics for formed sets of bots.
It is essential to acknowledge that the same bot may
appear multiple times in the analysed sets with distinct metric
values due to varying management strategies employed by bot-
traders. Several bot-traders may share one bot account but use
differing management methods. For example, a bot-trader may
rely on automated software for rapid execution, while another
bot-trader opts for manual control, which involves human
intervention, leading to slower actions from the same bot
account. Therefore, situations may arise where a bot exhibits
multiple metric measurements that may deviate. To account for
this, we utilised a non-unique bot identifier as a data example
metric measurement instead of a unique ID. This approach
enables us to conduct our analysis while acknowledging the
potential differences in bot performance and management
strategies across various instances of the same bot.
Results of Hypothesis Testing: The analysis of the hy-
pothesis test indicates significant differences in metric values
between bots utilising AI-photos and their counterparts using
other photo types. The Kolmogorov-Smirnov (KS) test results are presented in Table III for GAN and in Table IV for DTM,
where each row corresponds to a specific metric, and each
column signifies the comparison of AI-enhanced bots with a
particular type of other bots. The tables employ colour-coding for clarity: red cells with a downward arrow denote the alternative hypothesis metric_{GAN|DTM} < metric_{other}, and blue cells with an upward arrow indicate the alternative hypothesis metric_{GAN|DTM} > metric_{other}. The values within each cell represent the p-value derived from the test, which quantifies the significance of the observed differences.
At the same time, we did not observe differences between the metrics of GAN and DTM bots, as shown by the two-sided KS test presented in Table V.
TABLE III
THE KS TEST RESULTS COMPARING THE METRICS OF BOTS UTILISING GAN-PHOTOS WITH THOSE THAT DO NOT. RED CELLS (↓) INDICATE LOWER METRIC VALUES FOR THE GAN-ENHANCED BOTS, WHILE BLUE CELLS (↑) INDICATE HIGHER.

GAN vs... | without person | with person, without face | with person, stolen/celeb.
price | ↓ 3.1×10^-4 | ↓ 3.7×10^-4 | ↓ 1.0×10^-5
BTT | ↓ 5.3×10^-4 | ↓ 3.0×10^-6 | ↓ 5.6×10^-5
NBQ | ↓ 4.7×10^-4 | ↓ 8.1×10^-4 | ↓ 1.9×10^-5
speed | ↓ 1.4×10^-6 | ↓ 1.5×10^-6 | ↓ 7.7×10^-9
SR | ↑ 1.3×10^-16 | ↑ 3.4×10^-16 | ↑ 7.6×10^-19
Trust | ↑ 1.7×10^-11 | ↑ 4.9×10^-10 | ↑ 1.7×10^-10
TABLE IV
KS TEST RESULTS FOR DTM PHOTOS.

DTM vs... | without person | with person, without face | with person, stolen/celeb.
price | ↓ 1.3×10^-7 | ↓ 1.2×10^-7 | ↓ 6.7×10^-10
BTT | ↓ 1.7×10^-9 | ↓ 2.2×10^-8 | ↓ 3.8×10^-11
NBQ | ↓ 2.3×10^-8 | ↓ 4.8×10^-8 | ↓ 9.0×10^-11
speed | ↓ 1.4×10^-6 | ↓ 1.5×10^-6 | ↓ 7.7×10^-9
SR | ↑ 1.2×10^-22 | ↑ 3.1×10^-22 | ↑ 4.0×10^-26
Trust | ↑ 1.1×10^-13 | ↑ 1.8×10^-10 | ↑ 5.0×10^-11
TABLE V
TWO-SIDED KS TEST TO COMPARE GAN AND DTM BOTS.

GAN vs... | price | BTT | NBQ | speed | SR | Trust
DTM | 0.806 | 0.890 | 0.824 | 0.933 | 0.402 | 0.849
We utilised analytical statistics techniques to visualise the
distributions of metrics for each photo type in Figure 4. The
overlapping distributions accentuate differences between AI-
enhanced bots and other types, as illustrated in the bottom row
of Figure 4. The Y-axis in each plot represents the probability
of observing a specific metric’s value, with the corresponding
metric values depicted on the X-axis. The price metric is
presented in Russian rubles.
To scrutinise the differences between the means of these
distributions, we plotted the means in a matrix depicted in
Figure 5. In this matrix, each cell corresponds to the difference
between the metrics on the X-axis and those on the Y-axis. A
positive sign in a cell indicates that the metric on the X-axis
exhibits a larger mean than the metric on the Y-axis, whereas
a negative sign indicates the opposite.
For an in-depth analysis of the bot-traders connectivity,
we constructed a bot-trader graph, visually presented in Figure
6. In this graph, edges symbolise shared accounts among
different bot-traders, with varying thicknesses indicating the
common fraction of bots between these bot-traders. The vertex
colour represents the ratio of AI-enhanced bots (both GAN
and DTM) to the size of the entire offer, with the grey colour
indicating 0 value. Additionally, the left-hand side of Figure
6 features a barplot that represents the AI ratio for each bot-
trader’s offer.
In the next section, we discuss and interpret these results.
V. DISCUSSION
In our analysis, we identified a modest 73 bots utilising
GAN-photos (comprising 52 unique accounts) and 84 bots
with DTM-photos (62 unique) out of a total of 13,748 bot measurements. Despite the relatively small sample sizes, they were sufficient to reveal statistically significant differences in
bot metrics between AI-enhanced bots and those using other
types of photos. As shown in Tables III and IV, both GAN
and DTM-enhanced bots exhibited lower values for the price,
BTT,NBQ, and speed metrics, while showing higher values
for the SR and Trust metrics compared to all other types. When
comparing Tables II and III, we observed that the metrics of
AI-enhanced bots contradicted the prevailing trend of sophis-
tication across all metrics. In essence, bots with AI-photos
demonstrated lower levels of danger and sophistication based
on their metric values, challenging the anticipated escalation in
sophistication associated with the increased complexity of bot
combat. Moreover, as shown in Table V, there is no significant
difference between GAN and DTM bots. This indicates that
novel photo-generation techniques have not made bots more sophisticated: DTM bots remain very similar to GAN bots.
Based on the analysis of Figures 4 and 5, we can interpret
the impact of differences in each metric as follows:
1) Price: AI-enhanced bots are characterized by lower
costs within the bot market, as shown in Figure 4. The
bot market can be divided into two segments: cheap
bots with prices below ≈ 0.5 rubles and expensive bots with prices around ≈ 1 ruble per action. Notably,
AI-enhanced bots are, on average, approximately 8%
cheaper, as depicted in Figure 5.
2) BTT: AI-enhanced bots exhibit lower popularity on
exchange platforms, as evidenced in Figure 4. Notably,
the majority of bots are operated by shops rather than
exchange platforms. However, the proportion of AI-
enhanced bots operating on exchange platforms is com-
paratively higher, constituting approximately 20% as
depicted in Figure 5). This finding is not surprising, as
a significant proportion of bots on exchange platforms
are real users who engage in malicious activities for
financial gain. These users are less inclined to leverage
automation or AI technologies in their operations.
3) NBQ: Bot-traders assessed AI-enhanced bots to be of
inferior quality. NBQ serves as an indicator of how bot-
traders rank their bots on a quality scale. It is important
to note that the bot-traders’ assessment may not nec-
essarily align with objective reality but could instead
mirror their expectations or subjective perceptions.
4) Speed: AI-enhanced bots are faster, as shown in Figure 4, and most of them can execute actions within a minute. This indicates that AI bots are less likely to be operated under human control and rely predominantly on automated software. Furthermore, the swift execution of com-
mands raises the prospect of heightened detectability,
as rapid actions often serve as a distinctive indicator of
anomalous behaviour for detection mechanisms.
5) SR: AI-enhanced bots exhibit a higher probability of being blocked by social networks, as indicated by the SR metric. Figure 4 illustrates that, for other bot accounts, the majority of the bots were observed at SR ≈ 0, with a gradual decrease as the likelihood of being blocked increased. Conversely, AI-enhanced bots demonstrated a peak at SR ≈ 0.5. This is
consistent with our previous observation regarding the
speed metric, suggesting that AI-enhanced bots might
be more detectable, thereby contributing to an increased
likelihood of being blocked by social networks.
6) Trust: Users place less trust in bots with AI-photos. The trust
metric reflects a user’s capability to discern bots and
can be likened to the true positive rate (TPR) in the
Turing Test. As shown in Figure 4, users seem to face
challenges in accurately recognising bots, with trust
levels peaking around 0.5. This value corresponds to a
scenario akin to random guessing (i.e., flipping a coin
to determine whether an account is a bot). Interestingly,
the trust levels of AI bots exhibit a rightward shift in
the distribution, with a delta mean of approximately 6%
according to Figure 5. This shift is significant, as the trust-metric baseline is 50%. Surprisingly, our experiments
unveiled that the use of AI-photos does not improve the
realism of bots, contrary to initial expectations.
Fig. 4. Distributions of metrics values for each type of bots by used photos.
Fig. 5. Differences between the means of metrics for each pair of bot types: a significant difference between metrics of accounts with AI and other types of
photos.
Fig. 6. Bot-traders' offer connectivity graph with a barplot and colour highlighting of the AI ratio for each offer: AI-photos were found in half of the offers of unconnected bot-traders, indicating that AI technology is widespread.
In summary, AI-enhanced bots are expected to be less
expensive and faster, less popular on exchange platforms, and
assessed to be of inferior quality by bot-traders. Additionally,
they exhibit a higher likelihood of encountering blocks from
social networks, and contrary to expectations, the use of AI-
photos does not contribute to an improvement in the realism
of bots or an increase in users’ trust levels.
Upon examining the bot-trader connectivity graph and the
AI ratio distribution presented in Figure 6, it becomes evident
that only a small proportion of bots utilise AI technology.
Specifically, our observations reveal that merely 6 bot-traders’
offers feature at least 5% of bots with AI-photos. However,
when evaluating the prevalence of this technology, we found
that approximately half of the offers have at least one bot
adorned with an AI-photo. This suggests that at least half of bot-traders have incorporated AI technology into their operations.
Furthermore, the graph portrays AI usage among bot-traders
in a manner suggesting independent utilisation, as there is no
discernible connectivity, indicating autonomous use of AI.
A crucial aspect of the discussion is how the GAN and
DTM detectors’ efficiency could influence the findings. Given
that the GAN detector employed in this study demonstrated
remarkable efficiency [34] (with an AUC of 0.9995 in the
global test), we contend that the classification outcomes for
GAN-photos are nearly equivalent to the ground truth, with
only minor errors. Additionally, it is noteworthy that the
research on the detector was conducted in 2022-23, the same
time we gathered the bot sets. Therefore, we do not anticipate
that bot operators have utilised or are currently utilising a more
advanced GAN generator that could potentially introduce bias
into the metrics by generating lower-quality photos that might
be undetectable by the detector employed in this research.
Fig. 7. Filtering of found GAN and Diffusion/Transformers photos.
The efficiency of the Diffusion detector is also high, although not near the ground-truth level (average AUC of 90.8%). Nevertheless, it still detects a significant number of bots using DTM photos, albeit with minor errors. Due to the low number of
AI-photos identified in our analysis, we used manual filtering
before conducting statistical analysis. The results of this man-
ual analysis are plotted in Figure 7. The remaining photos
are likely to be AI-generated, and upon closer inspection,
some of them exhibit unnatural features such as non-natural
backgrounds and ears (although not all AI-photos exhibit
such noticeable artefacts). Despite the significant imbalance in
the dataset, with only 157 AI-generated photos compared to
13,591 non-AI photos, our AI detection analysis produced only
33 false positives among the 13,591 true negatives. Coupled
with manual filtering of these false positives, we believe
our analysis remains accurate. We removed errors from the
small sample of detected AI photos, while errors in the large
sample of non-AI photos will have a minimal effect. Therefore,
any potential undiscovered errors are unlikely to substantially
impact the statistical analysis.
It is crucial to acknowledge the principle that “correlation
does not imply causation”. The analysis presented here cannot
be used as conclusive evidence to assert that using AI-photos
leads to less sophisticated bots, as many factors can affect bot
sophistication. For example, with regard to the price metric,
AI may be more expensive. Still, bot-traders may choose to
use AI in combination with other cost-effective technologies,
thus reducing the overall price. However, our analysis does
indicate a notable association between AI-photos and simpler
bots, providing evidence for, though not proof of, such causation.
An alternative explanation for the observations presented
is the “cyber arms race” effect. AI-photos make social bots
more sophisticated and consequently more dangerous, but the
high efficiency of existing AI detectors offsets this advantage.
As a result, bot-traders may exhibit reluctance in employing
AI-generated photos for creating expensive and sophisticated
bots, given their susceptibility to detection. Consequently, bots
with AI-faces are probably more prevalent in the low-quality
and cheap segment of the bot market. Although synthetic
experiments [23] suggest the effectiveness of AI-enhanced
bots in deceiving users, our real-world observations indicate
that AI-enhanced bots are comparatively less successful than
regular bots. This difference in success rates is likely attributable to bot-traders' reservations about embracing AI technology.
Another consideration is that the experiments were con-
ducted using the Russian social network VK, and the gen-
eralizability of our results depends on the adoption of bot
technologies by bot-traders operating on other platforms and
in other countries. As noted by [8], Russian bot technologies
dominate the bot market. Therefore, we assume that bots
on VK, and in the Russian-speaking internet, may be more
sophisticated compared to those on platforms like Facebook
or Twitter. Another aspect is the interaction design and socio-
cultural differences between platforms as shown in Table VI,
as they may impact how much people interact with photos
[46] and therefore, can determine social bot metrics and
users’ trust levels. For that reason, we believe that results
obtained with VK will be relevant for platforms with similar
design and demographic profiles, such as the European part of
Facebook and LinkedIn. However, such conclusions can only
be confirmed through separate empirical analyses of the bot
markets and AI usage on each platform.
TABLE VI
COMPARISON OF SOCIAL NETWORK PLATFORMS.

platforms | interaction design | demography
VK, Facebook, Renren, Mixi, LinkedIn, Instagram | detailed profile info (high photo engagement) | Russian/European (VK); global (Facebook, LinkedIn, Instagram); Chinese (Renren); Japanese (Mixi)
X, TikTok | limited profile info (medium photo engagement) | global (X, TikTok)
WeChat, LINE, KakaoTalk, MXit | limited profile info (minimal photo engagement) | Chinese (WeChat); Japanese (LINE); South Korean (KakaoTalk); African (MXit)
We also need to consider the bot evolution phenomena and
the development of new generative AI technologies. In this
study, we did not find any significant differences between bots
using older AI-generation methods (GAN) and those using
newer (Diffusion/Transformers). However, this may change
rapidly as AI technologies evolve quickly. Therefore, we em-
phasise that the presented results are relevant to the years 2022/23
when the dataset was collected. The approaches presented here
can also be used in future research to analyze AI technology,
providing a basis for comparison with current findings.
A valuable direction for future research would be a cross-
platform analysis of the impact of AI technology. Additionally,
re-examining metrics for AI bots and comparing them to
current findings will help track bot evolution over time.
In summary, the current findings suggest that the use of AI
for creating malicious accounts may not always lead to more
dangerous social bots or malicious activity. Despite the initial
anticipation of AI technology being a valuable tool for bot
creators, it appears predominantly confined to the low-quality
and cheap segment within the bot market. This might stem, at
least in part, from the effective development of AI detectors,
creating a sense of caution among bot owners about using AI-
photos for creating more expensive bots that require significant
effort. It is important to note that the study identified the
AI detector as an effective bot detection solution, despite its
ability to detect only a small portion of all bots and the lower
quality of the detected bots. Nevertheless, this existing AI
detection solution might be adequate to dissuade bot owners
from extensively adopting AI, at least until a more advanced
AI generator emerges in the ongoing arms race [47] between
producing and combating disinformation.
VI. CONCLUSIONS
This research examines the impact of Generative Adversar-
ial Networks (GANs), Diffusion and Transformers models for
image generation on malicious bots and the bot market. A
qualitative analysis was conducted, comparing bots using AI-
photos with other bots within the Russian VK social network,
based on various metrics, including bot price, survival rate,
bot quality, speed, bot-trader type, and human trust.
The main finding of this study is that AI-based bots tend to
be less expensive and faster, but they are less popular on ex-
change platforms and generally exhibit lower quality, accord-
ing to bot-traders. Additionally, AI-enhanced bots are more
susceptible to being blocked by social networks and do not
contribute to enhancing the realism of bots or increasing users’
trust levels. The analysis further reveals that approximately
half of the offers from the analysed bot-traders contain at least
one bot with an AI-photo, with a peak occurrence of 10%. This
indicates that at least half of the bot-traders in the Russian
bot market are employing AI technology. Interestingly, the
utilisation of AI technology among bot-traders appears to be
dispersed independently rather than interconnected, suggesting
that bots with AI-photos, while not necessarily popular, are
widely prevalent in the market. It is also worth noting that
emerging AI technologies such as Diffusion and Transformers models
have not significantly contributed to the sophistication of bots
compared to GANs, as we did not find any significant differences
between these two groups of bots.
Therefore, the inference can be drawn that AI-photos are
more commonly associated with simple bots in the lower and
less expensive segment of the malicious bot market. This
observation aligns with the notion of a “cyber arms race”
effect, where bot-traders exhibit hesitancy in employing AI-
generated photos for creating sophisticated bots, acknowledg-
ing the heightened risk of detection.
DATA AVAILABILITY
We provide open access to the collected bot photos, bot IDs, and
results of their classification via Kaggle (MKPHOTO2023,
https://www.kaggle.com/datasets/guardeec/mkphoto2023).
These photos can also be associated with the relevant bot
metrics through the bot IDs provided in the MKMETRIC2022
dataset (https://github.com/guardeec/datasets#mkmetric2022).
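For illustration, the following minimal Python sketch shows how the two
releases can be linked. It assumes each release exposes a CSV with a
shared bot identifier column; the file names and column names used here
(bot_id, photo_class, price) are hypothetical placeholders, and the
actual schema should be taken from the dataset documentation.

  # Minimal sketch: linking MKPHOTO2023 photo classifications with
  # MKMETRIC2022 bot metrics through the shared bot IDs.
  # NOTE: file names and column names ("bot_id", "photo_class",
  # "price") are hypothetical placeholders; consult the dataset
  # documentation for the actual schema.
  import pandas as pd

  photos = pd.read_csv("mkphoto2023_classification.csv")  # Kaggle release
  metrics = pd.read_csv("mkmetric2022_metrics.csv")       # GitHub release

  # Inner join on the shared bot identifier.
  merged = photos.merge(metrics, on="bot_id", how="inner")

  # Example: median price of AI-photo bots vs. all other bots.
  is_ai = merged["photo_class"].isin(["GAN", "Diffusion/Transformers"])
  print(merged.groupby(is_ai)["price"].median())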
REFERENCES
[1] S. Cresci, “A decade of social bot detection,” Communications of the
ACM, vol. 63, no. 10, pp. 72–83, 2020.
[2] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley,
S. Ozair, A. Courville, and Y. Bengio, “Generative adversarial nets,”
Advances in neural information processing systems, vol. 27, 2014.
[3] J. Ho, A. Jain, and P. Abbeel, “Denoising diffusion probabilistic models,”
Advances in neural information processing systems, vol. 33, pp. 6840–
6851, 2020.
[4] J. Sohl-Dickstein, E. Weiss, N. Maheswaranathan, and S. Ganguli,
“Deep unsupervised learning using nonequilibrium thermodynamics,”
in International conference on machine learning. PMLR, 2015, pp.
2256–2265.
[5] P. Esser, R. Rombach, and B. Ommer, “Taming transformers for high-
resolution image synthesis,” in Proceedings of the IEEE/CVF conference
on computer vision and pattern recognition, 2021, pp. 12873–12883.
[6] T. Park, M.-Y. Liu, T.-C. Wang, and J.-Y. Zhu, “Semantic image
synthesis with spatially-adaptive normalization,” in Proceedings of the
IEEE/CVF conference on computer vision and pattern recognition, 2019,
pp. 2337–2346.
[7] M. Kolomeets and A. Chechulin, “Social bot metrics,” Social Network
Analysis and Mining, vol. 13, no. 1, p. 36, 2023.
[8] S. Bay and A. Reynolds, “The black market for social media manipu-
lation,” NATO StratCom COE, 2018.
[9] M. Kolomeets and A. Chechulin, “Analysis of the malicious bots mar-
ket,” in 2021 29th conference of open innovations association (FRUCT).
IEEE, 2021, pp. 199–205.
[10] E. Ferrara, O. Varol, C. Davis, F. Menczer, and A. Flammini, “The rise
of social bots,” Communications of the ACM, vol. 59, no. 7, pp. 96–104,
2016.
[11] M. Orabi, D. Mouheb, Z. Al Aghbari, and I. Kamel, “Detection of
bots in social media: a systematic review,” Information Processing &
Management, vol. 57, no. 4, p. 102250, 2020.
[12] K. Hayawi, S. Saha, M. M. Masud, S. S. Mathew, and M. Kaosar, “Social
media bot detection with deep learning methods: a systematic review,”
Neural Computing and Applications, pp. 1–16, 2023.
[13] I. Goodfellow, P. McDaniel, and N. Papernot, “Making machine learning
robust against adversarial inputs,” Communications of the ACM, vol. 61,
no. 7, pp. 56–66, 2018.
[14] S. Cresci, R. Di Pietro, A. Spognardi, M. Tesconi, and M. Petrocchi,
“Demystifying misconceptions in social bots research,” arXiv preprint
arXiv:2303.17251, 2023.
[15] A. Radford, L. Metz, and S. Chintala, “Unsupervised representation
learning with deep convolutional generative adversarial networks,” arXiv
preprint arXiv:1511.06434, 2015.
[16] T. Karras, S. Laine, and T. Aila, “A style-based generator architecture
for generative adversarial networks,” in Proceedings of the IEEE/CVF
conference on computer vision and pattern recognition, 2019, pp. 4401–
4410.
[17] D. Gragnaniello, D. Cozzolino, F. Marra, G. Poggi, and L. Verdoliva,
“Are gan generated images easy to detect? a critical analysis of the state-
of-the-art,” in 2021 IEEE international conference on multimedia and
expo (ICME). IEEE, 2021, pp. 1–6.
[18] S. Mandelli, N. Bonettini, P. Bestagini, and S. Tubaro, “Detecting gan-
generated images by orthogonal training of multiple cnns,” in 2022 IEEE
International Conference on Image Processing (ICIP). IEEE, 2022, pp.
3091–3095.
[19] L. Yang, Z. Zhang, Y. Song, S. Hong, R. Xu, Y. Zhao, W. Zhang,
B. Cui, and M.-H. Yang, “Diffusion models: A comprehensive survey
of methods and applications,” ACM Computing Surveys, vol. 56, no. 4,
pp. 1–39, 2023.
[20] A. Nichol, P. Dhariwal, A. Ramesh, P. Shyam, P. Mishkin, B. McGrew,
I. Sutskever, and M. Chen, “Glide: Towards photorealistic image gen-
eration and editing with text-guided diffusion models,” arXiv preprint
arXiv:2112.10741, 2021.
[21] R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, “High-
resolution image synthesis with latent diffusion models,” in Proceedings
of the IEEE/CVF conference on computer vision and pattern recognition,
2022, pp. 10684–10695.
[22] A. Ramesh, M. Pavlov, G. Goh, S. Gray, C. Voss, A. Radford, M. Chen,
and I. Sutskever, “Zero-shot text-to-image generation,” in International
conference on machine learning. PMLR, 2021, pp. 8821–8831.
[23] J. Mink, L. Luo, N. M. Barbosa, O. Figueira, Y. Wang, and G. Wang,
“{DeepPhish}: Understanding user trust towards artificially generated
profiles in online social networks,” in 31st USENIX Security Symposium
(USENIX Security 22), 2022, pp. 1669–1686.
[24] A. R. (2023) Github: Random face generator. [Online]. Available:
https://adityar224.github.io/Random-Face-Generator/#/
[25] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look
once: Unified, real-time object detection,” in Proceedings of the IEEE
conference on computer vision and pattern recognition, 2016, pp. 779–
788.
[26] C.-Y. Wang, A. Bochkovskiy, and H.-Y. M. Liao, “Yolov7: Trainable
bag-of-freebies sets new state-of-the-art for real-time object detectors,”
arXiv preprint arXiv:2207.02696, 2022.
[27] S. Gupta. (2020) Celebrity recognition using vggface and
annoy. [Online]. Available: https://medium.com/analytics-vidhya/celebrity-recognition-using-vggface-and-annoy-363c5df31f1e
[28] ——. (2021) Github: Celebrity recognition. [Online]. Available:
https://github.com/shobhit9618/celeb_recognition
[29] B. Jiang, Q. Ren, F. Dai, J. Xiong, J. Yang, and G. Gui, “Multi-
task cascaded convolutional neural networks for real-time dynamic
face recognition method,” in Communications, Signal Processing, and
Systems: Proceedings of the 2018 CSPS Volume III: Systems 7th.
Springer, 2020, pp. 59–66.
[30] O. Parkhi, A. Vedaldi, and A. Zisserman, “Deep face recognition,” in
BMVC15 - Proceedings of the British Machine Vision Conference, 2015.
[31] R. C. Malli et al. (2019) Github: keras-vggface. [Online]. Available:
https://github.com/rcmalli/keras-vggface
[32] E. Bernhardsson. (2013) Github: Annoy (approximate nearest neighbors) library.
[Online]. Available: https://github.com/spotify/annoy
[33] S. Mandelli, N. Bonettini, P. Bestagini, and S. Tubaro. (2022)
Github: Gan-image-detection. [Online]. Available: https://github.com/polimi-ispl/GAN-image-detection
[34] ——, “Detecting gan-generated images by orthogonal training of multi-
ple cnns,” in 2022 IEEE International Conference on Image Processing
(ICIP). IEEE, 2022, pp. 3091–3095.
[35] Z. Wang, J. Bao, W. Zhou, W. Wang, H. Hu, H. Chen, and H. Li,
“Dire for diffusion-generated image detection,” in Proceedings of the
IEEE/CVF International Conference on Computer Vision, 2023, pp.
22445–22455.
[36] R. Corvi, D. Cozzolino, G. Zingarini, G. Poggi, K. Nagano, and
L. Verdoliva, “On the detection of synthetic images generated by
diffusion models,” in ICASSP 2023-2023 IEEE International Conference
on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2023,
pp. 1–5.
[37] ——. (2023) Github: Dmimagedetection. [Online]. Available:
https://github.com/grip-unina/DMimageDetection
[38] S.-Y. Wang, O. Wang, R. Zhang, A. Owens, and A. A. Efros, “Cnn-
generated images are surprisingly easy to spot...for now,” in CVPR, 2020.
[39] M. Kolomeets. (2022) Github: Mkmetric2022 dataset. [Online].
Available: https://github.com/guardeec/datasets#mkmetric2022
[40] M. Kolomeets, O. Tushkanova, V. Desnitsky, L. Vitkova, and
A. Chechulin, “Experimental evaluation: Can humans recognise social
media bots?” Big Data and Cognitive Computing, vol. 8, no. 3, p. 24,
2024.
[41] S. Stieglitz, F. Brachten, D. Berthelé, M. Schlaus, C. Venetopoulou, and
D. Veutgen, “Do social bots (still) act different to humans?–comparing
metrics of social bots with those of humans,” in Social Computing and
Social Media. Human Behavior: 9th International Conference, SCSM
2017, Held as Part of HCI International 2017, Vancouver, BC, Canada,
July 9-14, 2017, Proceedings, Part I 9. Springer, 2017, pp. 379–395.
[42] C. Yang, R. Harkreader, and G. Gu, “Empirical evaluation and new
design for fighting evolving twitter spammers,” IEEE Transactions on
Information Forensics and Security, vol. 8, no. 8, pp. 1280–1293, 2013.
[43] R. S. Perdana, T. H. Muliawati, and R. Alexandro, “Bot spammer
detection in twitter using tweet similarity and time interval entropy,”
Jurnal Ilmu Komputer dan Informasi, vol. 8, no. 1, pp. 19–25, 2015.
[44] A. Foysal, S. Islam, and T. Rahaman, “Classification of ai powered social
bots on twitter by sentiment analysis and data mining through svm,”
International Journal of Computer Applications, vol. 117, pp. 13–19,
2019.
[45] K.-C. Yang, P.-M. Hui, and F. Menczer, “Bot electioneering volume:
Visualizing social bot activity during elections,” in Companion Proceed-
ings of The 2019 World Wide Web Conference, 2019, pp. 214–217.
[46] S. Bakhshi, D. A. Shamma, and E. Gilbert, “Faces engage us: Photos
with faces attract more likes and comments on instagram,” in Proceed-
ings of the SIGCHI conference on human factors in computing systems,
2014, pp. 965–974.
[47] S. Cresci, R. Di Pietro, M. Petrocchi, A. Spognardi, and M. Tesconi,
“The paradigm-shift of social spambots: Evidence, theories, and tools
for the arms race,” in Proceedings of the 26th international conference
on world wide web companion, 2017, pp. 963–972.