Conference Paper
PDF available

The Socialbot Network: When bots socialize for fame and money

Authors: Yazan Boshmaf, Ildar Muslukhov, Konstantin Beznosov, Matei Ripeanu

Abstract

Online Social Networks (OSNs) have become an integral part of today's Web. Politicians, celebrities, revolutionists, and others use OSNs as a podium to deliver their message to millions of active web users. Unfortunately, in the wrong hands, OSNs can be used to run astroturf campaigns to spread misinformation and propaganda. Such campaigns usually start off by infiltrating a targeted OSN on a large scale. In this paper, we evaluate how vulnerable OSNs are to a large-scale infiltration by socialbots: computer programs that control OSN accounts and mimic real users. We adopted a traditional web-based botnet design and built a Socialbot Network (SbN): a group of adaptive socialbots that are orchestrated in a command-and-control fashion. We operated such an SbN on Facebook, a 750-million-user OSN, for about 8 weeks. We collected data related to users' behavior in response to a large-scale infiltration in which socialbots were used to connect to a large number of Facebook users. Our results show that (1) OSNs, such as Facebook, can be infiltrated with a success rate of up to 80%, (2) depending on users' privacy settings, a successful infiltration can result in privacy breaches in which even more of users' data are exposed when compared to purely public access, and (3) in practice, OSN security defenses, such as the Facebook Immune System, are not effective enough in detecting or stopping a large-scale infiltration as it occurs.
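The command-and-control orchestration the abstract describes can be illustrated with a minimal sketch. All names below (Botmaster, Socialbot, the two commands) are hypothetical; the paper does not publish its implementation, so this shows only the shape of the design, not the authors' code.

```python
# Minimal sketch of a command-and-control socialbot network.
# All class and command names are hypothetical assumptions.
import queue
import random

class Botmaster:
    """Central controller that pushes social commands to its bots."""
    def __init__(self):
        self.channel = queue.Queue()

    def dispatch(self, command, target):
        self.channel.put((command, target))

class Socialbot:
    """An OSN account controlled by software that mimics a real user."""
    def __init__(self, bot_id, channel):
        self.bot_id = bot_id
        self.channel = channel

    def step(self):
        command, target = self.channel.get()
        if command == "connect":
            self.send_friend_request(target)
        elif command == "harvest":
            self.collect_profile_data(target)

    def send_friend_request(self, user):
        print(f"bot {self.bot_id}: friend request -> {user}")

    def collect_profile_data(self, user):
        print(f"bot {self.bot_id}: harvesting profile of {user}")

master = Botmaster()
bots = [Socialbot(i, master.channel) for i in range(3)]
for user in ["alice", "bob", "carol"]:
    master.dispatch("connect", user)
for bot in random.sample(bots, 3):
    bot.step()
```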
... In the highly competitive realm of online engagement, the metrification of social media, alongside concerns about data capture and amplification, has produced the need to move beyond a priori dichotomies such as 'real' and 'fake' (Burton and Chun 2023; Lindquist and Weltevrede 2024). On Instagram, comments on posts, likes in stories, tags in reels, direct messages, follower-following relations, and links in profile bios, among other pre-structured possibilities to interact, provide paths for software-supported activities programmed to mimic 'authentic' engagement (Boshmaf et al. 2011; Guilbeault 2016). As we demonstrate through associated metadata, the application of 'Instagrammatics' (Highfield and Leaver 2016; Rogers 2021) as a method to study platform cultures of use highlights bot operators' techniques of working with and around platform affordances and constraints (Bucher and Helmond 2018). ...
... Like all social bots, porn bots take advantage of software for coordinating their actions and require a face (Boshmaf et al. 2011): a persona designed to appeal to a collective formation of norms, desires, and imaginaries. Persona, in terms of its capacity to fabricate a role, implies performance and masquerade but also rule-bound repetition (Marshall, Moore, and Barbour 2019). ...
Article
Full-text available
This article presents a conceptual and methodological account of a small porn bot network, focusing on its embeddedness within Instagram use. The analysis explores the gendered design of bots as platform-native personas, particularly their capacity to perform within the confines of Instagram's increasingly strict sexual content controls. We address three performative trajectories in the bot-exploited 'Instagrammatics' of identity play, social influence, and attention capture. We argue that a bot programmed to operate with sexual content to generate attention relies on the paradoxical blend of pornographic 'imagination' and social media 'authenticity'. For our analysis, we manually identified 30 porn bot accounts spamming in the comment sections of highly visible Instagram posts (those published by @justinbieber). We then collected associated metadata: bot profile names and images, comments and comment likers, followers and followings, bot content, and links in the bot profile bios. By variously situating and combining these data, we discuss how networked automation taps into the sexualized social scripts imitated by 'artificial' and 'authentic' users alike. Our findings point to how porn bots re-enact gender as a programmed set of instructions, adapting to Instagram's vision of acceptable sexuality and revealing its normative order.
... However, a few research teams have utilized fake accounts themselves as the mechanism used to gather data and study online interactions [52,53]. In 2011, research was conducted using both passive and active fake accounts, or "socialbots", orchestrated together to demonstrate the vulnerability of personal information (PI) [54]; methods to identify fake accounts have improved significantly since then, using techniques like correlating activity across IPs and geographic locations, increasing the threshold for constructing a good fake ID. There are a variety of open-source platforms that provide automated mass account generation functionalities for different websites and social media platforms [55]. ...
Article
Full-text available
When personal information is shared across the Internet, we have limited confidence that the designated second party will safeguard it as we would prefer. Privacy policies offer insight into the best practices and intent of the organization, yet most are written so loosely that sharing with undefined third parties is to be anticipated. Tracking these sharing behaviors and identifying the source of unwanted content is exceedingly difficult when personal information is shared with multiple such second parties. This paper formulates a model for realistic fake identities, constructs a robust fake identity generator, and outlines management methods targeted towards online transactions (email, phone, text) that pass both cursory machine and human examination for use in personal privacy experimentation. This fake ID generator and a custom account signup engine are the core front-end components of our larger Use and Abuse of Personal Information system, which performs one-time transactions that, similar to a cryptographic one-time pad, ensure that we can attribute the sharing back to the single one-time transaction and/or specific second party. The flexibility and richness of the fake IDs also serve as a foundational set of control variables for a wide range of social science research questions revolving around personal information. Collectively, these fake identity models address multiple inter-disciplinary areas of common interest and serve as a foundation for eliciting and quantifying personal information-sharing behaviors.
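The one-time-transaction idea is the part of this design a short sketch can make concrete: each disclosure uses a unique identity, so later misuse is attributable to exactly one second party. The field names and alias scheme below are illustrative assumptions, not the paper's generator.

```python
# Sketch of the one-time-transaction idea: every disclosure of personal
# information uses a unique, attributable identity, so any later misuse
# traces back to exactly one second party. Fields are assumptions.
import secrets
import uuid

def make_fake_identity(second_party):
    token = secrets.token_hex(4)               # unique per transaction
    return {
        "transaction_id": str(uuid.uuid4()),
        "second_party": second_party,          # who receives this identity
        "name": f"Alex Morgan-{token[:4]}",
        "email": f"alex.{token}@example.org",  # one-time email alias
        "phone": f"+1-555-01{int(token[:2], 16) % 100:02d}",
    }

ledger = {}  # alias -> transaction, for later attribution

identity = make_fake_identity("shop.example.com")
ledger[identity["email"]] = identity

# If spam later arrives at this alias, the leak is attributable:
leaked_alias = identity["email"]
print("leak traced to:", ledger[leaked_alias]["second_party"])
```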
... Although there is ample evidence regarding bot activity on Twitter, debates about the impact of these malicious activities are yet to be settled (Duan et al., 2022; González-Bailón et al., 2021; Keijzer et al., 2021). While some researchers have been warning about their increasing sophistication (Boshmaf et al., 2011), and recent evidence suggests that there is the potential for political bots to get more advanced (Cresci, 2020), others have found that the majority of the currently available commercial services and tools only provide rather simplistic and repetitive automation (Assenmacher et al., 2020). Most of the Twitter bots in a recent study were found to be "spammers", with no advanced capabilities and limited intelligence (Assenmacher et al., 2020). ...
Preprint
Full-text available
Bots have become increasingly prevalent in the digital sphere and have taken up a proactive role in shaping democratic processes. While previous studies have focused on their influence at the individual level, their potential macro-level impact on communication dynamics remains poorly understood. This study adopts an information-theoretic approach from dynamical systems theory to examine the role of political bots in shaping the dynamics of an online political discussion on Twitter. We quantify the components of this dynamic process in terms of its complexity, predictability, and the remaining uncertainty. Our findings suggest that bot activity is associated with increased complexity and uncertainty in the structural dynamics of online political communication. This work serves as a showcase for the use of information-theoretic measures from dynamical systems theory in modeling human-bot dynamics as a computational process that unfolds over time.
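A minimal sketch of the kind of decomposition this abstract describes, assuming a symbolized activity stream and standard block-entropy estimators (the study's exact estimators are not specified here): the uncertainty of the next event splits into a predictable component shared with the past and a residual entropy rate.

```python
# Decomposing per-symbol uncertainty into a predictable part (shared
# with the past) and a residual entropy rate. The window length k and
# the plug-in block-entropy estimator are assumptions for illustration.
from collections import Counter
from math import log2

def block_entropy(seq, k):
    blocks = [tuple(seq[i:i + k]) for i in range(len(seq) - k + 1)]
    counts = Counter(blocks)
    n = len(blocks)
    return -sum(c / n * log2(c / n) for c in counts.values())

def decompose(seq, k=3):
    h1 = block_entropy(seq, 1)                                   # total uncertainty
    h_rate = block_entropy(seq, k) - block_entropy(seq, k - 1)   # residual
    return {"uncertainty": h1,
            "predictability": h1 - h_rate,   # information stored in the past
            "entropy_rate": h_rate}

activity = list("ababababcabababacab")       # toy human/bot event stream
print(decompose(activity))
```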
... Unlike most other studies, which focus on identifying the patterns of Twitter bots and the strategies based on which they were programmed (Boshmaf et al., 2011; Haustein et al., 2016; Neff and Nagy, 2016), this study follows the main discourses and users' folk theories when they encounter Russian bots on Twitter. Such an analysis is not limited to the mere identification of such bots but is rather about the perceived implications that these bots have on the daily political and social life of the users involved. ...
Article
Full-text available
Bot activity is already frequently documented in the literature, and the war between Russia and Ukraine has accentuated scholarly interest in users' sensemaking. Applying a folk theories framework to 56 semi-structured interviews with users who tweet about "Russian bots," I examine how bots might be understood as structural-computational entities, with complex roles in shaping digitally mediated realities. Findings reveal several theories associated with Russian bots. First, participants believe that these bots actively endorse users' political enemies, who are mainly politicians from the participants' countries. Second, such bots are considered to increase animosities between users, as participants actively unfollow their peers on Twitter and unfriend them in real life, based on their opinions regarding the war in Ukraine. Third, bots boost users' digital activity, given that participants consider them responsible for artificially increasing the popularity of certain accounts or, on the contrary, for systematic and aggressive attacks against others.
Preprint
Full-text available
This study investigates whether personality traits can predict and impact susceptibility to persuasion in potential social engineering scenarios. It also explores cultural differences in such susceptibility. Data was collected through an online survey with 651 participants (329 from the Arab Gulf countries and 322 from the United Kingdom). Personality traits were measured using a validated 10-item scale based on the Big-5 model. Cialdini's six persuasion principles were employed as a conceptual framework. Participants were presented with 12 scenarios: six featuring the principles and six where the principles were neutralized. They were asked questions about their level of trust in the potential social engineer and their willingness to take risks. We analysed the data to identify differences in susceptibility between the two groups, and regression analyses evaluated the impact of personality traits on susceptibility. The findings reveal no significant difference in susceptibility to persuasion tactics between Arab and UK participants. Additionally, personality traits are weak predictors of susceptibility to persuasion in social engineering scenarios in both samples. Unlike existing studies, our method isolated personality traits and did not mix them with other predictors like age, gender, or competency. This approach allowed us to scrutinize their pure impact. Scenarios were carefully designed and face-validated to revolve around the same situation while presenting each principle alone and neutralizing other variables. Previous literature used heterogeneous scenarios, making it hard to pinpoint specific causes. Additionally, this study includes a cross-cultural component with participants from the Arab Gulf countries, a segment often neglected in research.
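The regression step can be sketched as follows, with hypothetical column names and synthetic data standing in for the survey responses; the point is the shape of the analysis (traits as predictors, susceptibility as outcome), not the study's numbers.

```python
# Minimal sketch of regressing susceptibility on Big-5 traits.
# Column names and toy data are assumptions; the study used survey
# responses from 651 participants.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "extraversion": rng.normal(size=n),
    "agreeableness": rng.normal(size=n),
    "conscientiousness": rng.normal(size=n),
    "neuroticism": rng.normal(size=n),
    "openness": rng.normal(size=n),
})
# Weak-signal outcome, consistent with traits being weak predictors.
df["susceptibility"] = 0.1 * df["agreeableness"] + rng.normal(size=n)

model = smf.ols(
    "susceptibility ~ extraversion + agreeableness + conscientiousness"
    " + neuroticism + openness", data=df).fit()
print(model.rsquared)   # small R^2: traits explain little variance
```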
Article
With the rise and prevalence of social bots, their negative impacts on society are gradually recognized, prompting research attention to effective detection and countermeasures. Recently, graph neural networks (GNNs) have flourished and have been applied to social bot detection research, improving the performance of detection methods effectively. However, existing GNN-based social bot detection methods often fail to account for the heterogeneous associations among users within social media contexts, especially the heterogeneous integration of social bots into human communities within the network. To address this challenge, we propose a heterogeneous compatibility perspective for social bot detection, in which we preserve more detailed information about the varying associations between neighbors in social media contexts. Subsequently, we develop a compatibility-aware graph neural network (CGNN) for social bot detection. CGNN consists of an efficient feature processing module, and a lightweight compatibility-aware GNN encoder, which enhances the model’s capacity to depict heterogeneous neighbor relations by emulating the heterogeneous compatibility function. Through extensive experiments, we showed that our CGNN outperforms the existing state-of-the-art (SOTA) method on three commonly used social bot detection benchmarks while utilizing only about 2% of the parameter size and 10% of the training time compared with the SOTA method. Finally, further experimental analysis indicates that CGNN can identify different edge categories to a significant extent. These findings, along with the ablation study, provide strong evidence supporting the enhancement of GNN’s capacity to depict heterogeneous neighbor associations on social media bot detection tasks.
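The compatibility idea lends itself to a short sketch: a message-passing layer that learns a per-edge weight from the two endpoint features instead of treating all neighbors as equally informative. This is a minimal PyTorch analogue under stated assumptions, not the CGNN architecture itself.

```python
# A message-passing layer with learned per-edge compatibility weights.
# This is an assumption-laden minimal analogue, not the paper's CGNN.
import torch
import torch.nn as nn

class CompatibilityLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.compat = nn.Sequential(nn.Linear(2 * dim, 1), nn.Sigmoid())
        self.update = nn.Linear(2 * dim, dim)

    def forward(self, x, edge_index):
        src, dst = edge_index                      # shape: (2, num_edges)
        w = self.compat(torch.cat([x[src], x[dst]], dim=1))  # edge weights
        agg = torch.zeros_like(x)
        agg.index_add_(0, dst, w * x[src])         # weighted neighbor sum
        return torch.relu(self.update(torch.cat([x, agg], dim=1)))

x = torch.randn(5, 8)                              # 5 users, 8 features
edge_index = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])
layer = CompatibilityLayer(8)
print(layer(x, edge_index).shape)                  # torch.Size([5, 8])
```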
Article
The detection of fake profiles on social networking platforms is a pressing concern due to the proliferation of fraudulent accounts that undermine user trust and platform integrity. This paper proposes a novel framework for the automatic detection of fake profiles, leveraging the private information available within social networking platforms while respecting user privacy. The proposed scheme utilizes advanced algorithms and machine learning models to analyze various parameters, including user activity patterns, account creation details, and communication behavior, to identify potentially fraudulent accounts. Importantly, this approach ensures the preservation of user privacy by conducting analysis solely within the platform's closed environment without compromising sensitive personal information. Furthermore, the framework incorporates an alert system to notify platform administrators and users of suspicious activity indicative of fake identity creation, enabling proactive measures to prevent the spread of fake profiles and mitigate potential risks. Through the implementation of this framework, social networking companies can effectively combat the proliferation of fake profiles while upholding user privacy and fostering a safer and more trustworthy online environment for all users.
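A minimal sketch of the feature-based pipeline such a framework implies, with assumed behavioral features, toy data, and a hypothetical alert threshold:

```python
# Behavioral features feed a standard classifier; accounts scoring above
# a threshold trigger an administrator alert. Feature names, toy data,
# and the threshold are assumptions, not the paper's specification.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Features per account: [posts/day, msgs to strangers/day,
#                        account age in days, friends added/day]
X_train = np.array([[2, 1, 900, 0.2],    # genuine
                    [1, 0, 1500, 0.1],   # genuine
                    [40, 60, 3, 25.0],   # fake
                    [55, 80, 1, 40.0]])  # fake
y_train = np.array([0, 0, 1, 1])

clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)

candidate = np.array([[35, 50, 2, 30.0]])
score = clf.predict_proba(candidate)[0, 1]
if score > 0.8:                           # hypothetical alert threshold
    print(f"alert admins: probable fake profile (score={score:.2f})")
```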
Chapter
Full-text available
The ability to tell humans and computers apart is imperative to protect many services from misuse and abuse. For this purpose, tests called CAPTCHAs or HIPs have been designed and put into production. Recent history shows that most (if not all) can be broken given enough time and commercial interest: CAPTCHA design seems to be a much more difficult problem than previously thought. The assumption that hard AI problems can be easily converted into valid CAPTCHAs is misleading. There are also some extrinsic problems that do not help, especially the large number of in-house designs that are put into production without any prior public critique. In this paper we present a state-of-the-art survey of current HIPs, including proposals that are now in production. We classify them regarding their basic design ideas. We discuss current attacks as well as future attack paths, and we also present common errors in design, and how many implementation flaws can transform a not necessarily bad idea into a weak CAPTCHA. We present examples of these flaws, using specific well-known CAPTCHAs. In a more theoretical way, we discuss the threat model: confronted risks and countermeasures. Finally, we introduce and discuss some desirable properties that new HIPs should have, concluding with some proposals for future work, including methodologies for design, implementation and security assessment.
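One of the implementation flaws the survey points to, a verification scheme that leaks the answer, can be shown in a few lines: even if the underlying recognition task is hard AI, a tiny answer space plus a token derived from the answer lets an attacker skip the task entirely. The hash-token design below is a deliberately flawed, hypothetical example.

```python
# A deliberately weak CAPTCHA: the verification token is derived from
# the answer, and the answer space is tiny, so brute force replaces AI.
import hashlib

WORDS = ["apple", "house", "tiger", "cloud"]   # only 4 possible answers

def issue_challenge(answer):
    # Flaw: the server sends a token computed from the answer itself.
    return hashlib.sha256(answer.encode()).hexdigest()

def solve_without_ai(token):
    for guess in WORDS:                        # enumerate the answer space
        if hashlib.sha256(guess.encode()).hexdigest() == token:
            return guess

token = issue_challenge("tiger")
print(solve_without_ai(token))                 # "tiger", no vision needed
```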
Conference Paper
Full-text available
Within this paper we present our novel friend injection attack, which exploits the fact that the great majority of social networking sites fail to protect the communication between their users and their services. In a practical evaluation, on the basis of public wireless access points, we furthermore demonstrate the feasibility of our attack. The friend injection attack enables a stealthy infiltration of social networks and thus outlines the devastating consequences of active eavesdropping attacks against social networking sites. Keywords: social networks, privacy, infiltration
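Conceptually, the attack rests on session replay over unprotected HTTP: a cookie observed on an open wireless network authenticates arbitrary requests, including friend requests. The endpoint, cookie name, and parameters below are hypothetical, shown only to make the eavesdropping risk concrete.

```python
# Why unencrypted sessions enable injection: a passively captured
# session cookie authenticates replayed requests. The endpoint, cookie
# name, and parameters are hypothetical.
import requests

sniffed_cookie = {"session_id": "d41d8cd98f00b204"}  # captured passively

resp = requests.post(
    "http://social.example.com/ajax/add_friend",     # plain HTTP
    data={"friend_id": "attacker_profile_42"},
    cookies=sniffed_cookie,                          # replayed session
    timeout=10,
)
print(resp.status_code)  # the site cannot tell this from the real user
```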
Conference Paper
Full-text available
We propose Stegobot, a new generation botnet that communicates over probabilistically unobservable communication channels. It is designed to spread via social malware attacks and steal information from its victims. Unlike conventional botnets, Stegobot traffic does not introduce new communication endpoints between bots. Instead, it is based on a model of covert communication over a social-network overlay – bot to botmaster communication takes place along the edges of a social network. Further, bots use image steganography to hide the presence of communication within image sharing behavior of user interaction. We show that it is possible to design such a botnet even with a less than optimal routing mechanism such as restricted flooding. We analyzed a real-world dataset of image sharing between members of an online social network. Analysis of Stegobot’s network throughput indicates that stealthy as it is, it is also functionally powerful – capable of channeling fair quantities of sensitive data from its victims to the botmaster at tens of megabytes every month.
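The covert channel rests on image steganography; a generic least-significant-bit scheme (not Stegobot's actual encoder) illustrates how shared photos can double as a bot-to-botmaster link.

```python
# Generic LSB image steganography: hide message bits in the least
# significant bit of pixel values. Illustrative only; not Stegobot's
# actual encoder.
import numpy as np

def embed(pixels, message):
    bits = np.array([int(b) for byte in message.encode()
                     for b in format(byte, "08b")], dtype=np.uint8)
    flat = pixels.flatten()                            # copy of the image
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits  # overwrite LSBs
    return flat.reshape(pixels.shape)

def extract(pixels, n_chars):
    bits = pixels.flatten()[:n_chars * 8] & 1
    return bytes(int("".join(map(str, bits[i:i + 8])), 2)
                 for i in range(0, len(bits), 8)).decode()

cover = np.random.randint(0, 256, (64, 64), dtype=np.uint8)  # toy image
stego = embed(cover, "exfil:42")
print(extract(stego, 8))   # "exfil:42"
```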
Conference Paper
Full-text available
In this paper, we present a case study describing the privacy and trust that exist within a small population of online social network users. We begin by formally characterizing different graphs in social network sites like Facebook. We then determine how often people are willing to divulge personal details to an unknown online user, an adversary. While most users in our sample did not share sensitive information when asked by an adversary, we found that more users were willing to divulge personal details to an adversary if there is a mutual friend connected to the adversary and the user. We then summarize the results and observations associated with this Facebook case study.
Article
Despite neglecting even basic security measures, close to two billion people use the Internet, and only a small fraction appear to be victimized each year. This paper suggests that an explanation lies in the economics of attacks. We distinguish between scalable attacks, where costs are almost independent of the number of users attacked, and non-scalable (or targeted) attacks, which involve per-user effort. Scalable attacks reach orders of magnitude more users. To compensate for her disadvantage in terms of reach, the targeted attacker must target users with higher than average value. To accomplish this she needs that value be both visible and very concentrated, with few users having very high value while most have little. In this she is fortunate: power-law long-tail distributions that describe the distributions of wealth, fame and other phenomena are extremely concentrated. However, in these distributions only a tiny fraction of the population have above-average value. For example, fewer than 2% of people have above-average wealth in the US. Thus, when attacking assets where value is concentrated, the targeted attacker ignores the vast majority of users, since attacking them hurts rather than helps her requirement to extract greater than average value. This helps explain why many users escape harm, even when they neglect security precautions: most users never experience most attacks. Attacks that involve per-user effort will be seen by only a tiny fraction of users. No matter how clever the exploit, unless the expected value is high, there is little place for per-user effort in this world of mass-produced attacks.
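The concentration argument can be made concrete with a Pareto (power-law) distribution: with shape a > 1 and minimum x_m, the mean is a*x_m/(a-1), and the fraction of the population above the mean is P(X > mean) = ((a-1)/a)^a. The shape values below are illustrative assumptions, not figures from the paper.

```python
# For a Pareto distribution with shape a > 1 and minimum x_m:
#   mean = a * x_m / (a - 1)
#   P(X > mean) = (x_m / mean) ** a = ((a - 1) / a) ** a
def fraction_above_mean(a):
    return ((a - 1) / a) ** a

for a in (1.2, 1.5, 2.0, 3.0):
    print(f"shape={a}: {fraction_above_mean(a):.1%} above average")
# Heavier tails (smaller a) leave a smaller minority above the mean,
# so a targeted attacker can safely ignore almost everyone.
```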
Article
Popular Internet sites are under attack all the time from phishers, fraudsters, and spammers. They aim to steal user information and expose users to unwanted spam. The attackers have vast resources at their disposal. They are well-funded, with full-time skilled labor, control over compromised and infected accounts, and access to global botnets. Protecting our users is a challenging adversarial learning problem with extreme scale and load requirements. Over the past several years we have built and deployed a coherent, scalable, and extensible realtime system to protect our users and the social graph. This Immune System performs realtime checks and classifications on every read and write action. As of March 2011, this is 25B checks per day, reaching 650K per second at peak. The system also generates signals for use as feedback in classifiers and other components. We believe this system has contributed to making Facebook the safest place on the Internet for people and their information. This paper outlines the design of the Facebook Immune System, the challenges we have faced and overcome, and the challenges we continue to face.
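The realtime check pattern the paper outlines, classifiers scoring every read and write before it commits, with verdicts fed back as training signal, can be sketched minimally; the classifier names and features here are assumptions, and the production system is of course far larger.

```python
# Every action passes through realtime classifiers before it commits.
# Classifier names and features are assumptions; this is not the actual
# Facebook Immune System, only the shape of the pattern it describes.
def spam_score(action):
    links = action.get("links", 0)
    return min(1.0, 0.3 * links)            # toy feature: link density

def fraud_score(action):
    return 0.9 if action.get("new_account") else 0.1

CLASSIFIERS = [spam_score, fraud_score]

def check(action):
    scores = [c(action) for c in CLASSIFIERS]
    verdict = "block" if max(scores) > 0.8 else "allow"
    # Verdicts also feed back as training signal for the classifiers.
    return verdict, scores

print(check({"type": "write", "links": 4, "new_account": False}))
print(check({"type": "write", "links": 0, "new_account": True}))
```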
Article
A previously derived iteration formula for a random net was applied to some data on the spread of information through a population. It was found that if the axon density (the only free parameter in the formula) is determined by the first pair of experimental values, the predicted spread is much more rapid than the observed one. If the successive values of the “apparent axon density” are calculated from the successive experimental values, it is noticed that this quantity at first suffers a sharp drop from an initial high value to its lowest value and then gradually “recovers”. An attempt is made to account for this behavior of the apparent axon density in terms of the “assumption of transitivity”, based on a certain socio-structural bias, namely, that the likely contacts of two individuals who themselves have been in contact are expected to be strongly overlapping. The assumption of transitivity leads to a drop in the apparent axon density from an arbitrary initial value to the vicinity of unity (if the actual axon density is not too small). However, the “recovery” is not accounted for, and thus the predicted spread turns out to be slower than the observed one.
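For readers who want to experiment, a simulation sketch of the spread iteration follows, using one common textbook form of the random-net model with axon density a; treat this form as an assumption and take the exact formula from the article itself.

```python
# One common textbook form of the random-net spread iteration: with
# axon density a, cumulative contacted fraction P_t, and newly
# contacted fraction p_t,
#     p_{t+1} = (1 - P_t) * (1 - exp(-a * p_t)).
# This form and the parameter values are assumptions for illustration.
from math import exp

def spread(a, p0=0.01, steps=10):
    P, p = p0, p0
    history = [P]
    for _ in range(steps):
        p = (1 - P) * (1 - exp(-a * p))   # newly contacted this step
        P += p                            # cumulative contacted
        history.append(P)
    return history

for frac in spread(a=2.0):
    print(f"{frac:.3f}")   # rapid early growth, then saturation
```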