Source publication
Spam Bots have become a threat to online social networks with their malicious behavior, posting misinformation messages and influencing online platforms to fulfill their motives. As spam bots have become more advanced over time, creating algorithms to identify bots remains an open challenge. Learning low-dimensional embeddings for nodes in graph st...
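For orientation, the graph convolutional approach the abstract points to can be sketched as a two-layer GCN node classifier. The snippet below is a generic illustration in PyTorch Geometric, not the paper's implementation; the feature sizes, the random `edge_index`, and the `BotGCN` name are placeholders.

```python
# Hypothetical two-layer GCN bot/human node classifier (illustrative only;
# not the source paper's actual architecture or features).
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv


class BotGCN(torch.nn.Module):
    def __init__(self, num_features: int, hidden: int = 64, num_classes: int = 2):
        super().__init__()
        self.conv1 = GCNConv(num_features, hidden)   # aggregate 1-hop neighbours
        self.conv2 = GCNConv(hidden, num_classes)    # project to class logits

    def forward(self, x, edge_index):
        h = F.relu(self.conv1(x, edge_index))
        h = F.dropout(h, p=0.5, training=self.training)
        return self.conv2(h, edge_index)


# x: per-user feature matrix, edge_index: 2 x num_edges follower links (placeholders)
x = torch.randn(100, 16)
edge_index = torch.randint(0, 100, (2, 400))
logits = BotGCN(num_features=16)(x, edge_index)   # shape: [100, 2]
```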
Similar publications
Entity alignment refers to discovering two entities in different knowledge bases that represent the same thing in reality. Existing methods generally only adopt TransE or TransE-like knowledge graph representation learning models, which usually assume that there are enough training triples for each entity, and entities appearing in few triples are...
Various non-trivial spaces are becoming popular for embedding structured data such as graphs, texts, or images. Following spherical and hyperbolic spaces, more general product spaces have been proposed. However, searching for the best configuration of product space is a resource-intensive procedure, which reduces the practical applicability of the...
The attributes of the knowledge nodes in the interactive educational knowledge graph need to cater to students’ online learning preferences, so understanding the composition and learning preferences of students in the online learning process is helpful to the development of more targeted learning paths. Currently, there are few existing research re...
Knowledge graph embedding (KGE) models have become popular means for making discoveries in knowledge graphs (e.g., RDF graphs) in an efficient and scalable manner. The key to success of these models is their ability to learn low-rank vector representations for knowledge graph entities and relations. Despite the rapid development of KGE models, stat...
The submitted document is the extended abstract of a paper which will be presented in July if the extended abstract is accepted by the conference reviewers.
A shorter abstract (of the paper to be written) is as follows:
In this paper we develop bond graph descriptions for ideal mechanical constraints, both embedded and adjoined, by stating and ap...
Citations
... Alhosseini et al. [46] introduced the use of graph convolutional neural networks (GCNN) in bot identification. They noted that besides the users' features, the construction of a social network would enhance a model's ability to distinguish the bots from the genuine users. ...
... • Botometer [23] is a web-based program that leverages more than 1,000 user features. • Alhosseini et al. [46] introduced graph convolutional neural networks in bot detection. • SATAR [27] leverages the user's semantics, property, and neighborhood information. • BotRGCN [12] used the user's description, tweets, numerical and categorical properties, and neighborhood information. ...
... We see that our model benefits from the search for the fittest architecture that we performed beforehand, as it achieves a higher accuracy, F1-score, and MCC than other state-of-the-art methods.

Model   Accuracy         F1-score         MCC
[9]     0.7456           0.7823           0.4879
[37]    0.8191           0.8546           0.6643
[20]    0.8174           0.7517           0.6710
[21]    0.7126           0.7533           0.4193
[39]    0.4801           0.6266           -0.1372
[10]    0.4793           0.1072           0.0839
[23]    0.5584           0.4892           0.1558
[46]    0.6813           0.7318           0.3543
[27]    0.8412           0.8642           0.6863
[12]    0.8462           0.8707           0.7021
[29]    0.7466           0.7630
ours    0.8568 ± 0.004   0.8712 ± 0.003   0.7116 ± 0.007
...
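For reference, the three metrics reported in this table (accuracy, F1-score, and MCC) can be computed with scikit-learn; the label vectors in the snippet are placeholders, not the evaluation data.

```python
# Computing the three metrics from the table above with scikit-learn.
from sklearn.metrics import accuracy_score, f1_score, matthews_corrcoef

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # placeholder ground-truth labels (1 = bot)
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # placeholder model predictions

print("Accuracy:", accuracy_score(y_true, y_pred))
print("F1-score:", f1_score(y_true, y_pred))
print("MCC:     ", matthews_corrcoef(y_true, y_pred))
```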
Social media platforms, including X, Facebook, and Instagram, host millions of daily users, giving rise to bots: automated programs disseminating misinformation and ideologies with tangible real-world consequences. While bot detection on platform X has been the focus of many deep learning models with adequate results, most approaches neglect the graph structure of social media relationships and often rely on hand-engineered architectures. Our work introduces the implementation of a Neural Architecture Search (NAS) technique, namely Deep and Flexible Graph Neural Architecture Search (DFG-NAS), tailored to Relational Graph Convolutional Neural Networks (RGCNs) in the task of bot detection on platform X. Our model constructs a graph that incorporates both the user relationships and their metadata. Then, DFG-NAS is adapted to automatically search for the optimal configuration of Propagation and Transformation functions in the RGCNs. Our experiments are conducted on the TwiBot-20 dataset, constructing a graph with 229,580 nodes and 227,979 edges. We study the five architectures with the highest performance during the search and achieve an accuracy of 85.7%, surpassing state-of-the-art models. Our approach not only addresses the bot detection challenge but also advocates for the broader implementation of NAS models in neural network design automation.
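As a rough companion to this abstract, a plain RGCN node classifier over typed user relations might look like the sketch below. This is a generic PyTorch Geometric illustration, not the DFG-NAS-searched architecture; all tensor shapes, relation counts, and names are assumptions.

```python
# Hypothetical RGCN sketch for multi-relation bot detection (illustrative;
# not the architecture found by DFG-NAS in the abstract above).
import torch
import torch.nn.functional as F
from torch_geometric.nn import RGCNConv


class BotRGCNSketch(torch.nn.Module):
    def __init__(self, num_features, num_relations, hidden=64, num_classes=2):
        super().__init__()
        # one weight matrix per relation type (e.g. follower vs. following)
        self.conv1 = RGCNConv(num_features, hidden, num_relations)
        self.conv2 = RGCNConv(hidden, num_classes, num_relations)

    def forward(self, x, edge_index, edge_type):
        h = F.relu(self.conv1(x, edge_index, edge_type))
        return self.conv2(h, edge_index, edge_type)


x = torch.randn(50, 16)                       # placeholder user features
edge_index = torch.randint(0, 50, (2, 200))   # placeholder edges
edge_type = torch.randint(0, 2, (200,))       # placeholder relation ids
logits = BotRGCNSketch(16, num_relations=2)(x, edge_index, edge_type)
```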
... This shift toward deep learning enables the analysis of unstructured information, including the network structures that connected users create. Models utilizing graph convolutional networks (GCNs) have been introduced to exploit these user relationships [31]. ...
Organized misinformation campaigns on Twitter continue to proliferate, even as the platform acknowledges such activities through its transparency center. These deceptive initiatives significantly impact vital societal issues, including climate change, thus spurring research aimed at pinpointing and intercepting these malicious actors. Present-day algorithms for detecting bots harness an array of data drawn from user profiles, tweets, and network configurations, delivering commendable outcomes. Yet, these strategies mainly concentrate on postincident identification of malevolent users, hinging on static training datasets that categorize individuals based on historical activities. Diverging from this approach, we advocate for a forward-thinking methodology, which utilizes user data to foresee and mitigate potential threats before their realization, thereby cultivating more secure, equitable, and unbiased online communities. To this end, our proposed technique forecasts malevolent activities by tracing the projected trajectories of user embeddings before any malevolent action materializes. For validation, we employed a dynamic directed multigraph paradigm to chronicle the evolving engagements between Twitter users. When juxtaposed against the identical dataset, our technique eclipses contemporary methodologies by an impressive 40.66% in F score (F1 score) in the anticipatory identification of harmful users. Furthermore, we undertook a model evaluation exercise to gauge the efficiency of distinct system elements.
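One plausible realization of the forecasting idea in this abstract is to encode a user's past embedding snapshots with a recurrent network, predict the next embedding, and score it with a classifier. The snippet below is an assumed sketch, not the paper's model; every dimension and module name is a placeholder.

```python
# Hedged sketch: predict a user's next embedding from its past trajectory and
# score it with a malicious/benign classifier (assumed design, not the paper's).
import torch
import torch.nn as nn

emb_dim, hidden = 32, 64
forecaster = nn.GRU(emb_dim, hidden, batch_first=True)   # encodes the trajectory
project = nn.Linear(hidden, emb_dim)                      # predicted next embedding
classifier = nn.Linear(emb_dim, 2)                        # benign vs. malicious

trajectory = torch.randn(8, 10, emb_dim)   # 8 users, 10 past snapshots (placeholder)
_, h_n = forecaster(trajectory)            # h_n: [1, 8, hidden]
next_emb = project(h_n.squeeze(0))         # forecast of the next-step embedding
logits = classifier(next_emb)              # early-warning scores, shape [8, 2]
```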
... We find that many works collect user profile data from various online social networks for this analysis, like Twitter [212][213][214][215][216][217][218], Instagram [219][220][221], Facebook [222][223][224], YouTube [225], and Sina Weibo [226]. A different approach was used by a study [227] that collected real names from various webpages, schools, and other sources to automatically detect fake names online. ...
Fraud is a prevalent offence that extends beyond financial loss, causing psychological and physical harm to victims. The advancements in online communication technologies allowed online fraud to thrive in this vast network, with fraudsters increasingly using these channels for deception. With the progression of technologies like AI, there is a growing concern that fraud will scale up, using sophisticated methods, like deep-fakes in phishing campaigns, all generated by language generation models like ChatGPT. However, the application of AI in detecting and analyzing online fraud remains understudied. We conduct a Systematic Literature Review on AI and NLP techniques for online fraud detection. The review adhered to the PRISMA-ScR protocol, with eligibility criteria including relevance to online fraud, use of text data, and AI methodologies. We screened 2,457 academic records; 350 met our eligibility criteria, and 223 were included. We report the state-of-the-art NLP techniques for analysing various online fraud categories; the training data sources; the NLP algorithms and models built; and the performance metrics employed for model evaluation. We find that current research on online fraud is divided into various scam activities, and we identify 16 different frauds that researchers focus on. This SLR enhances the academic understanding of AI-based detection methods for online fraud and offers insights for policymakers, law enforcement, and businesses on safeguarding against such activities. We conclude that focusing on specific scams lacks generalization, as multiple models are required for different fraud types. The evolving nature of scams limits the effectiveness of models trained on outdated data. We also identify issues in data limitations, training bias reporting, and selective presentation of metrics in model performance reporting, which can lead to potential biases in model evaluation.
... This procedure baits spambots into attacking a specific system aimed at studying their behaviors and profiles [4,22]. Furthermore, some recent methods have been developed by Ali Alhosseini et al. [3] to detect traditional spambots via models based on graph convolutional neural networks. ...
Emerging technologies, particularly artificial intelligence (AI), and more specifically Large Language Models (LLMs), have provided malicious actors with powerful tools for manipulating digital discourse. LLMs have the potential to affect traditional forms of democratic engagement, such as voter choice, government surveys, or even online communication with regulators, since bots are capable of producing large quantities of credible text. To investigate the human perception of LLM-generated content, we recruited over 1,000 participants who then tried to differentiate bot from human posts in social media discussion threads. We found that humans perform poorly at identifying the true nature of user posts on social media. We also found patterns in how humans identify LLM-generated text content in social media discourse. Finally, we observed the Uncanny Valley effect in text dialogue in both user perception and identification. This indicates that despite humans being poor at the identification process, they can still sense discomfort when reading LLM-generated content.
... These methods leverage user relationships to improve accuracy and robustness and demonstrate the efficacy of capturing and utilizing the structural information. They construct various types of graphs, including isomorphic graphs [17,18], heterogeneous graphs [19,20], and multirelational graphs [21,22] based on user relationships, and employ GNNs to obtain user representations for effective bot detection. ...
... The MIU phenomenon in the user representation feature space reveals a notable overlap between the features of personalized genuine accounts and bot accounts, whereas the majority of human accounts are readily distinguishable. Therefore, we propose HR-MRG, which extends [17,20,21]. Since multiple types of user relationships have different impacts on social bot detection, we build the representation models R_b and R_r on multi-relational graphs and realize the classifiers F_b and F_r with fully connected layers. Specifically, given the dataset D = {V, X, A}, we first generate an adjacency matrix A_r for each relation r from the global adjacency matrix A, where r ∈ {1, 2, ..., R} indexes the types of interaction between users. ...
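The per-relation adjacency construction described in this excerpt (one A_r per relation r extracted from the global adjacency A) can be illustrated with a small, generic sketch; the triple format and variable names below are assumptions, not the paper's data layout.

```python
# Sketch of splitting a global multi-relational edge list into one adjacency
# matrix A_r per relation r (illustrative; field layout is an assumption).
import numpy as np
import scipy.sparse as sp

num_users, num_relations = 6, 2
# (src, dst, relation) triples standing in for the global adjacency A
edges = np.array([[0, 1, 0], [1, 2, 0], [2, 3, 1], [3, 4, 1], [4, 5, 0]])

adj_per_relation = []
for r in range(num_relations):
    mask = edges[:, 2] == r
    src, dst = edges[mask, 0], edges[mask, 1]
    A_r = sp.csr_matrix((np.ones(mask.sum()), (src, dst)),
                        shape=(num_users, num_users))
    adj_per_relation.append(A_r)   # A_r keeps only relation-r interactions
```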
... In detail, we first select subsets of unlabeled samples for LP_c and LP_f and obtain feature representations using frozen representation models R_b and R_r to construct adjacency matrices (lines 12-14). Subsequently, we perform coarse and fine label propagation separately and congregate all pseudo labels (lines 15-17). Finally, we fine-tune the models for a few epochs on the expanded dataset, consisting of labeled and unlabeled samples (lines 18-24). ...
Social bot detection is crucial for ensuring the active participation of digital twins and edge intelligence in future social media platforms. Nevertheless, the performance of existing detection methods is impeded by the limited availability of labeled accounts. Despite the notable progress made in some fields by deep semi-supervised learning with label propagation, which utilizes unlabeled data to enhance method performance, its effectiveness is significantly hindered in social bot detection due to the misdistribution of individuation users (MIU). To address these challenges, we propose a novel deep semi-supervised bot detection method, which adopts a coarse-to-fine label propagation (LP-CF) with the hybridized representation models over multi-relational graphs (HR-MRG) to enhance the accuracy of label propagation, thereby improving the effectiveness of unlabeled data in supporting the detection task. Specifically, considering the potential confusion among accounts in the MIU phenomenon, we utilize HR-MRG to obtain high-quality user representations. Subsequently, we introduce a sample selection strategy to partition unlabeled samples into two subsets and apply LP-CF to generate pseudo labels for each subset. Finally, the predicted pseudo labels of unlabeled samples, combined with labeled samples, are used to fine-tune the detection models. Comprehensive experiments on two widely used real datasets demonstrate that our method outperforms other semi-supervised approaches and achieves comparable performance to the fully supervised social bot detection method.
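To make the pseudo-labelling idea concrete, the sketch below runs plain iterative label propagation over a row-normalized affinity matrix and keeps only confident pseudo labels. It is a generic baseline under assumed settings (toy graph, fixed alpha and threshold), not the LP-CF procedure or the HR-MRG representations described in the abstract.

```python
# Standard label-propagation sketch for pseudo-labelling unlabeled accounts
# (a generic baseline, not the paper's coarse-to-fine LP-CF procedure).
import numpy as np

def propagate_labels(S, Y, labeled_mask, alpha=0.9, iters=50):
    """S: row-normalized affinity matrix, Y: one-hot labels (zeros if unknown)."""
    F = Y.copy()
    for _ in range(iters):
        F = alpha * S @ F + (1 - alpha) * Y
        F[labeled_mask] = Y[labeled_mask]      # clamp the known labels
    return F

# toy 4-user graph: users 0 and 3 are labeled (human / bot)
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], float)
S = A / A.sum(1, keepdims=True)
Y = np.zeros((4, 2)); Y[0, 0] = 1; Y[3, 1] = 1
labeled = np.array([True, False, False, True])

scores = propagate_labels(S, Y, labeled)
pseudo = scores.argmax(1)            # pseudo labels for all users
confident = scores.max(1) > 0.6      # keep only confident pseudo labels
```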
... Researchers tackled this by leveraging the graphical structure of the Twittersphere, which is composed of social relationships among Twitter users. They used Graph Neural Networks (GNNs) like Graph Convolutional Networks (GCNs) [11], Relational Graph Convolutional Networks (RGCNs) [12], and Relational Graph Transformers (RGTs) [13] for graph node classification to detect bots. Graph-based methods outperform text-based methods in detection performance and exhibit better generalization capabilities [14]. ...
Detecting Twitter Bots is crucial for maintaining the integrity of online discourse, safeguarding democratic processes, and preventing the spread of malicious propaganda. However, advanced Twitter Bots today often employ sophisticated feature manipulation and account farming techniques to blend seamlessly with genuine user interactions, posing significant challenges to existing detection models. In response to these challenges, this paper proposes a novel Twitter Bot Detection framework called BotSAI. This framework enhances the consistency of multimodal user features, accurately characterizing various modalities to distinguish between real users and bots. Specifically, the architecture integrates information from users, textual content, and heterogeneous network topologies, leveraging customized encoders to obtain comprehensive user feature representations. The heterogeneous network encoder efficiently aggregates information from neighboring nodes through oversampling techniques and local relationship transformers. Subsequently, a multi-channel representation mechanism maps user representations into invariant and specific subspaces, enhancing the feature vectors. Finally, a self-attention mechanism is introduced to integrate and refine the enhanced user representations, enabling efficient information interaction. Extensive experiments demonstrate that BotSAI outperforms existing state-of-the-art methods on two major Twitter Bot Detection benchmarks, exhibiting superior performance. Additionally, systematic experiments reveal the impact of different social relationships on detection accuracy, providing novel insights for the identification of social bots.
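The fusion step described in this abstract (integrating multi-channel user representations with self-attention) can be approximated generically as below; treating each modality as a token and applying multi-head self-attention is our assumption, not BotSAI's actual architecture, and all dimensions are placeholders.

```python
# Hedged sketch: fusing per-modality user representations with self-attention
# (an assumed reading of the abstract, not BotSAI's real fusion module).
import torch
import torch.nn as nn

dim, num_users = 64, 5
attn = nn.MultiheadAttention(embed_dim=dim, num_heads=4, batch_first=True)

# three channels per user: metadata, text, and graph-neighbourhood encodings
channels = torch.randn(num_users, 3, dim)       # [users, channels, dim]
fused, _ = attn(channels, channels, channels)   # channels attend to each other
user_repr = fused.mean(dim=1)                   # one fused vector per user
```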
... Early works on social bot detection rely on feature engineering [7]. With the advent of graph representation learning, social bot detection with graph neural networks (GNNs) has gained popularity. Alhosseini et al. [8] applied GNNs in social bot detection, using a graph convolutional neural network (GCN) to learn the features of users and their neighbors. Yang et al. [9] adopted reinforcement learning and self-supervised methods to search for optimal GNN architectures, aiming to learn the embedding of user subgraphs for social bot detection. ...
... We compare with the social user representations proposed by Alhosseini et al. [8] and Yang et al. [7]. As shown in Table 2, our method achieves the optimal value of silhouette score and DBI. ...
As online social networks grow rapidly, the emergence of a large number of virtual accounts, named social bots, poses great challenges to social security. In response to the bot invasion, bot detection methods have attracted considerable attention. Especially in recent years, with the widespread application of graphs, graph representation learning is widely applied in social bot detection. However, existing detection methods fall short in simultaneously representing diverse network structures. In this work, we propose a graph-based and structure-aware framework to alleviate this problem. Specifically, we jointly encode user semantics, attributes and neighborhood information. Moreover, we employ a refined graph attention network model for parallel computation on large-scale graphs via subgraph sampling. In particular, we construct local and remote feature extractors, which can achieve multiple network feature extraction. Finally, we adopt a multitask learning approach to construct auxiliary tasks for self-supervised training and conduct bot detection. Extensive experiments show that our model outperforms state-of-the-art methods. Further exploration also demonstrates that our model has a strong generalization ability.
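A generic version of the setup sketched in this abstract (a graph attention network trained on sampled subgraphs for scalability) is shown below in PyTorch Geometric; the sampling fan-out, layer sizes, and `BotGAT` name are placeholders rather than the paper's refined model.

```python
# Illustrative GAT node classifier with neighbour-sampled mini-batches
# (a generic PyTorch Geometric setup, not the paper's refined architecture).
import torch
import torch.nn.functional as F
from torch_geometric.data import Data
from torch_geometric.loader import NeighborLoader
from torch_geometric.nn import GATConv

data = Data(x=torch.randn(200, 16),
            edge_index=torch.randint(0, 200, (2, 800)),
            y=torch.randint(0, 2, (200,)))

loader = NeighborLoader(data, num_neighbors=[10, 10], batch_size=32)  # subgraph sampling

class BotGAT(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = GATConv(16, 8, heads=4)      # 4 attention heads
        self.conv2 = GATConv(8 * 4, 2, heads=1)   # logits: bot vs. human

    def forward(self, x, edge_index):
        return self.conv2(F.elu(self.conv1(x, edge_index)), edge_index)

model = BotGAT()
for batch in loader:                              # subgraph batches for large graphs
    logits = model(batch.x, batch.edge_index)
```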
... To detect software bots, structural graphs are analyzed using methods such as centrality measures [8], node representation learning [3], and Graph Neural Networks (GNNs) [7]. Combining different methods of graph and text analysis [32], as well as creating improved GNN architectures for analyzing heterogeneous networks [21], holds significant promise for detecting software bots. ...
The goal of this work is a detailed study of the effectiveness of using Large Language Models (LLMs) for detecting software bots in social networks. The work focuses on analyzing the effectiveness of different detection methods and assessing the potential of LLMs as a means of improving the accuracy and efficiency of the bot identification process. The study covers three main approaches to detecting software bots: metadata analysis, text analysis, and graph analysis. Both traditional machine learning methods and recent LLMs used to analyze large volumes of social network data are examined. The main methodology is a comparative analysis that uses extended datasets such as TwiBot-20 and TwiBot-22 to evaluate the performance of each method with metrics such as accuracy and F1-score, providing an objective picture of the effectiveness of different bot detection approaches. The scientific novelty of this work lies in the use of LLMs to analyze diverse types of social network data for detecting software bots. The authors consider the integration of LLMs into traditional detection methods, which allows detection processes to adapt to the complex behavior of software bots while ensuring high accuracy and efficiency. Conclusions. LLMs demonstrate high effectiveness in detecting software bots but are computationally demanding. Hybrid approaches that combine LLMs with traditional methods are therefore relevant: such hybridization would reduce resource usage and provide a more robust and adaptive bot detection system. This approach can improve the overall performance of bot detection systems, reduce computational costs, and enable more accurate and efficient detection of malicious software in social networks. Further research is recommended to improve the integration of LLMs into bot detection systems, particularly in the context of the dynamic behavior of social networks and the evolution of software bots.
... This framework treats network elements as multi-attribute graphs and uses them for semi-supervised learning to classify nodes. Ali Alhosseini and his team [19] create a model using Graph Convolutional Neural Networks that can effectively spot social bots by looking at the characteristics of a node and those around it. Additionally, Thomas Kipf and his colleagues [20] suggest a scalable approach for learning with limited supervision on graphs. ...
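The semi-supervised graph learning approach of Kipf and colleagues referenced here rests on the well-known GCN propagation rule H' = sigma(D_tilde^(-1/2) (A + I) D_tilde^(-1/2) H W); the NumPy snippet below writes out one such layer on a toy graph, with random matrices as placeholders.

```python
# One GCN layer written out in NumPy, following Kipf & Welling's propagation
# rule H' = ReLU(D_tilde^(-1/2) (A + I) D_tilde^(-1/2) H W); toy inputs only.
import numpy as np

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)   # adjacency matrix
A_tilde = A + np.eye(3)                                   # add self-loops
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_tilde.sum(1)))       # D_tilde^(-1/2)
A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt                 # normalized adjacency

H = np.random.randn(3, 4)               # node features
W = np.random.randn(4, 2)               # layer weights
H_next = np.maximum(A_hat @ H @ W, 0)   # one GCN layer with ReLU
```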
... Previous research has found that there is no significant difference in the feature of username length between social bot users and human users [19]. Therefore, evaluating and improving robustness based on the username length attribute may not be significant. ...
Online social networks are easily exploited by social bots. Although current models for detecting social bots show promising results, they mainly rely on Graph Neural Networks (GNNs), which have been shown to have robustness vulnerabilities, and these detection models are likely to share similar weaknesses. Therefore, it is crucial to evaluate and improve their robustness. This paper proposes a robustness evaluation method, Attribute Random Iteration-Fast Gradient Sign Method (ARI-FGSM), and uses simplified adversarial training to improve the robustness of social bot detection. Specifically, this study performs robustness evaluations of five bot detection models on two datasets under both black-box and white-box scenarios. The white-box experiments achieve a minimum attack success rate of 86.23%, while the black-box experiments achieve a minimum attack success rate of 45.86%. This shows that social bot detection models are vulnerable to adversarial attacks. Moreover, after executing our robustness improvement method, the robustness of the detection model increased by up to 86.98%.
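The building block underlying this evaluation is the fast gradient sign method. The snippet below shows plain FGSM on numeric account features; the attribute-random-iteration part of ARI-FGSM is not reproduced, and the tiny model and epsilon value are placeholders.

```python
# Plain FGSM perturbation on numeric account features (the basic building
# block only; ARI-FGSM's attribute-wise random iteration is not shown here).
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(4, 8, requires_grad=True)   # placeholder account features
y = torch.tensor([0, 1, 0, 1])              # placeholder labels

loss = loss_fn(model(x), y)
loss.backward()
x_adv = x + 0.05 * x.grad.sign()            # epsilon-scaled sign of the gradient
```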
... With the application of graph neural networks to node representation and node classification tasks, many works have emerged that utilize graph models for social bot detection. Alhosseini et al. [22] first implemented social bot detection based on graph models by applying the GCN method to obtain user representations and classify social accounts. BotRGCN [23] applied the relational graph convolutional neural network model to achieve social bot classification, obtaining better detection results. ...
Malicious social bots pose a serious threat to social network security by spreading false information and guiding bad opinions in social networks. The singularity and scarcity of single organization data and the high cost of labeling social bots have given rise to the construction of federated models that combine federated learning with social bot detection. In this paper, we first combine the federated learning framework with the Relational Graph Convolutional Neural Network (RGCN) model to achieve federated social bot detection. A class-level cross entropy loss function is applied in the local model training to mitigate the effects of the class imbalance problem in local data. To address the data heterogeneity issue from multiple participants, we optimize the classical federated learning algorithm by applying knowledge distillation methods. Specifically, we adjust the client-side and server-side models separately: training a global generator to generate pseudo-samples based on the local data distribution knowledge to correct the optimization direction of client-side classification models, and integrating client-side classification models’ knowledge on the server side to guide the training of the global classification model. We conduct extensive experiments on widely used datasets, and the results demonstrate the effectiveness of our approach in social bot detection in heterogeneous data scenarios. Compared to baseline methods, our approach achieves a nearly 3–10% improvement in detection accuracy when the data heterogeneity is larger. Additionally, our method achieves the specified accuracy with minimal communication rounds.
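Two ingredients mentioned in this abstract can be sketched generically: a class-level weighted cross-entropy for imbalanced local data and FedAvg-style weight averaging. The paper's distillation-based aggregation is not shown, and the class weights, client models, and dataset sizes below are placeholders.

```python
# Generic sketches of two ideas from the abstract: class-weighted local loss
# and FedAvg weight averaging (not the paper's distillation-based aggregation).
import torch
import torch.nn as nn

# class-level loss: up-weight the minority (bot) class on a client
class_weights = torch.tensor([0.3, 0.7])
local_loss = nn.CrossEntropyLoss(weight=class_weights)

def fedavg(state_dicts, sizes):
    """Average client model weights, weighted by local dataset size."""
    total = sum(sizes)
    return {k: sum(sd[k].float() * (n / total)
                   for sd, n in zip(state_dicts, sizes))
            for k in state_dicts[0]}

clients = [nn.Linear(8, 2) for _ in range(3)]              # placeholder client models
global_weights = fedavg([c.state_dict() for c in clients],
                        sizes=[100, 200, 50])              # placeholder data sizes
```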