There have been many efforts to detect rumors using various machine learning (ML) models, but their performance across different rumor topics and available features is still poorly understood, which results in significant performance degradation on completely new and unseen (unknown) rumors. To address this issue, we investigate the...
Over the past two decades, various security authorities around the world have acknowledged the importance of exploiting the ever-growing amount of information published on the web about various types of events for early detection of certain threats, situation monitoring, and risk analysis. Since the information related to a particular real-world event might...
With the increased popularity of social media platforms, people increasingly depend on them for news and updates. Even official media channels post news on social media platforms such as Twitter and Facebook. However, given the vast amount of user-generated content, the credibility of shared information must be verified, and this process should be performed automatically and efficiently to keep up with the high rate of generated posts. Current technology provides powerful methods and tools to address the spread of rumors on social networks. This study investigates the use of state-of-the-art machine learning and deep learning models to detect rumors in a collection of Arabic tweets using the ArCOV19-Rumors dataset. A comprehensive comparison of the performance of the models was conducted. In the deep learning experiments, the performance of seven optimizers was compared. The results demonstrated that using over-sampled data did not improve the classical or deep learning models. By contrast, using stacking classifiers increased the predictive model’s performance. As a result, the model became more logical and realistic in predicting rumors, non-rumors, and other classes than classical machine learning without the stacking technique. Additionally, both long short-term memory (LSTM) and bidirectional LSTM (Bi-LSTM) with the root mean square propagation (RMSprop) optimizer obtained the best results. Finally, the results were analyzed to explain and interpret the low performance.
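For readers unfamiliar with the RMSprop optimizer mentioned above, the following is a minimal sketch of its standard update rule on a scalar weight. It is not the study's training code: the toy objective, the function name `rmsprop_step`, and the hyperparameter values (`lr`, `decay`, `eps`) are all assumptions chosen for illustration.

```python
import math


def rmsprop_step(w, grad, cache, lr=0.01, decay=0.9, eps=1e-8):
    """Apply one RMSprop update to a scalar weight; returns (new_w, new_cache)."""
    # Exponential moving average of squared gradients.
    cache = decay * cache + (1.0 - decay) * grad * grad
    # Scale the step by the root of that running average.
    w = w - lr * grad / (math.sqrt(cache) + eps)
    return w, cache


# Toy objective (an assumption for illustration): f(w) = w**2, gradient 2*w.
w, cache = 1.0, 0.0
first_w, first_cache = rmsprop_step(w, 2.0 * w, cache)

for _ in range(500):
    w, cache = rmsprop_step(w, 2.0 * w, cache)
```

Because each step is divided by the running root mean square of recent gradients, the step size stays roughly on the order of `lr` regardless of the raw gradient magnitude.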
Rumor detection is a recent and quite active topic of multidisciplinary research due to its evident impact on society, which can even result in physical harm to people. Despite its broad definition and scope, most existing research focuses on Twitter, as it makes the problem somewhat more tractable from the point of view of information gathering, processing, and further retrieval of social network features. This paper presents an empirical study of novel machine learning ensembles for the rumor classification task on Twitter. As it has been observed that certain neural models perform better for specific veracity labels, we present a study on how combining such classifiers in different kinds of ensembles results in a new classifier with better performance. Using benchmark data, we evaluate three groups of models (two groups of deep neural networks and a control group of classical machine learning methods). In addition, we study the performance of three ensemble strategies: Bagging, Stacking, and Simple Soft Voting. After varying several parameters of the models, such as the number of hidden units and dropout, among other experimental factors, our study shows that the LSTM, Stacked LSTM (S-LSTM), Recurrent Convolutional Neural Networks (RCNN), and Bidirectional Gated Recurrent Unit (Bi-GRU) yield the best results, reaching an accuracy of 0.93 and an average precision of 0.97.
Keywords: Rumor detection, Fake news, Ensembles, Machine learning, Deep learning
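As an illustration of the soft voting strategy named above, the sketch below averages per-class probabilities from several classifiers and contrasts the outcome with a simple majority (hard) vote. The label set, the probability values, and the function names are hypothetical; they are not taken from the paper's models or data.

```python
from collections import Counter
from typing import Dict, List

LABELS = ["rumor", "non-rumor"]  # hypothetical label set


def soft_vote(prob_rows: List[Dict[str, float]]) -> str:
    """Average class probabilities across classifiers, return the top label."""
    n = len(prob_rows)
    avg = {label: sum(row[label] for row in prob_rows) / n for label in LABELS}
    return max(avg, key=avg.get)


def hard_vote(prob_rows: List[Dict[str, float]]) -> str:
    """Let each classifier cast one vote for its most probable label."""
    votes = [max(row, key=row.get) for row in prob_rows]
    return Counter(votes).most_common(1)[0][0]


# Made-up probabilities from three hypothetical base classifiers for one tweet.
predictions = [
    {"rumor": 0.90, "non-rumor": 0.10},
    {"rumor": 0.40, "non-rumor": 0.60},
    {"rumor": 0.45, "non-rumor": 0.55},
]

soft_label = soft_vote(predictions)
hard_label = hard_vote(predictions)
```

In this made-up case the two strategies disagree: one highly confident classifier outweighs two mildly confident ones under soft voting, while the majority wins under hard voting.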
News on social media can significantly influence users, manipulating them for political or economic reasons. Adversarial manipulations of text have been shown to expose vulnerabilities in classifiers, and current research aims to find classifier models that are not susceptible to such manipulations. In this paper, we present a novel technique called ConTheModel, which slightly modifies social media news to confuse machine learning (ML)-based classifiers under the black-box setting. ConTheModel replaces a word in the original tweet with its synonym or antonym to generate tweets that confuse classifiers. We evaluate our technique on three different scenarios of the dataset and compare five well-known machine learning algorithms, namely Support Vector Machine (SVM), Naive Bayes (NB), Random Forest (RF), eXtreme Gradient Boosting (XGBoost), and Multilayer Perceptron (MLP), to demonstrate the performance of classifiers on the modifications made by ConTheModel. Our results show that the classifiers are confused after modification, with a maximum drop of 16.36%. We additionally conducted a human study with 25 participants to validate the effectiveness of ConTheModel and found that the majority of participants (65%) found it challenging to classify the tweets correctly. We hope our work will help in finding ML models that are robust against adversarial examples.
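The single-word replacement idea described above can be sketched as follows. This is not the ConTheModel implementation: the toy lexicon, the keyword-based `toy_classify` stand-in, and all function names are assumptions for illustration. The only property carried over from the description is that the classifier is treated as a black box and queried solely through its predicted label.

```python
from typing import Callable, Dict, List

# Hypothetical toy lexicon; how the original work selects synonyms or
# antonyms is not specified here.
LEXICON: Dict[str, str] = {"huge": "massive", "confirmed": "verified"}


def perturb(tweet: str, lexicon: Dict[str, str]) -> List[str]:
    """Return one variant per replaceable word, each with a single word swapped."""
    tokens = tweet.split()
    variants = []
    for i, tok in enumerate(tokens):
        replacement = lexicon.get(tok.lower())
        if replacement is not None:
            variants.append(" ".join(tokens[:i] + [replacement] + tokens[i + 1:]))
    return variants


def confusing_variants(tweet: str, lexicon: Dict[str, str],
                       classify: Callable[[str], str]) -> List[str]:
    """Keep variants whose predicted label differs from the original's."""
    original_label = classify(tweet)
    return [v for v in perturb(tweet, lexicon) if classify(v) != original_label]


def toy_classify(text: str) -> str:
    """Stand-in black-box classifier (assumption): keys on a single word."""
    return "rumor" if "huge" in text.lower() else "non-rumor"


tweet = "Officials confirmed a huge outage"
variants = perturb(tweet, LEXICON)
flipped = confusing_variants(tweet, LEXICON, toy_classify)
```

Here only the variant that swaps the keyword the toy classifier relies on changes the prediction, which is the kind of brittleness such single-word edits are meant to expose.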