Fake News Identification on Twitter with Hybrid CNN and RNN Models

Oluwaseun Ajao
C3Ri Research Institute
Sheffield Hallam University
United Kingdom
oajao@acm.org
Deepayan Bhowmik
C3Ri Research Institute
Sheffield Hallam University
United Kingdom
d.bhowmik@ieee.org
Shahrzad Zargari
C3Ri Research Institute
Sheffield Hallam University
United Kingdom
s.zargari@shu.ac.uk
ABSTRACT
The problem associated with the propagation of fake news
continues to grow at an alarming scale. This trend has
generated much interest from politics to academia and
industry alike. We propose a framework that detects and
classifies fake news messages from Twitter posts using a
hybrid of convolutional neural network and long short-term
memory (LSTM) recurrent neural network models. The
proposed deep learning approach achieves 82% accuracy.
Our approach identifies relevant features associated with
fake news stories without previous knowledge of the
domain.
CCS CONCEPTS
Information systems → Social networking sites
KEYWORDS
Fake News, Twitter, Social Media
ACM Reference format:
Oluwaseun Ajao, Deepayan Bhowmik, and Shahrzad
Zargari. 2018. Fake News Identification on Twitter with
Hybrid CNN and RNN Models. In Proceedings of the
International Conference on Social Media & Society,
Copenhagen, Denmark (SMSociety).
DOI: https://doi.org/10.1145/3217804.3217917
Permission to make digital or hard copies of part or all of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. Copyrights for
third-party components of this work must be honored. For all other uses,
contact the owner/author(s).
SMSociety, July 2018, Copenhagen, Denmark
© 2018 Copyright held by the owner/author(s). $15.00.
1 INTRODUCTION
The growing influence of fake news propaganda is now a
cause for concern in all walks of life. Election results are
argued on some occasions to have been manipulated through
the circulation of unfounded and
doctored stories on social media including microblogs such
as Twitter. All over the world, the growing influence of
fake news is felt on a daily basis, from politics to education
and financial markets. This has become a continual cause
of concern for politicians and citizens alike. For example,
on April 23rd 2013, the Twitter account of the news
agency Associated Press, which had almost 2 million
followers at the time, was hacked.
Figure 1: Tweet allegedly sent by the Syrian
Electronic Army from hacked Twitter account of
Associated Press
The following message was sent: "Breaking: Two Explosions
in the White House and Barack Obama is injured" (shown in
Figure 1). The impact was severe: the message led to a flash
crash on the New York Stock Exchange, where investors lost
136 billion dollars on the Standard & Poor's Index in two
minutes [1]. It would be beneficial if the origin of messages
could be verified and fake messages filtered out from
authentic ones. The information that people listen to and
share on social media is largely influenced by the social
circles and relationships they form online [2]. Accurately
tracking the spread of fake messages, and especially news
content, would be of interest to researchers, politicians
and citizens around the world. This can be achieved by using
effective and relevant “social sensor tools” [3]. The need is
greater in countries that have trusted and embraced
technology as part of their electoral process and thus
adopted e-voting. For example, in France and Italy, even
though internet users may not accurately represent the
demographics of the entire population, opinions on social
media and mass surveys of citizens are correlated [4]. Both
are found to be largely influenced by external factors such
as news stories from newspapers, TV and ultimately
social media.
In addition, there is a growing and alarming use of social
media for anti-social behaviors such as cyberbullying,
propaganda, hate crimes, and for radicalization and
recruitment of individuals into terrorist organizations such
as ISIS [5]. A study by Burgess et al. of the 50 most
retweeted stories with pictures of the Hurricane Sandy
disaster found that less than 25% were real, while the rest
were either fake or from unsubstantiated sources [6].
Facebook announced the use of “filters” for removing
hoaxes and fake news from the news feed on the world's
largest social media platform [7]. The development
followed concerns that the spread of fake news on the
platform might have helped Donald Trump win the US
presidential election held in 2016 [8]. According to the
social media site, 46% of the top fake news stories
circulated on Facebook were about US politics and the
election [9]. Table 1 gives details of the top-ranking fake
news stories that were circulated on Facebook in 2016.
Tambuscio et al. proposed a model for tracking the
spread of hoaxes using four parameters: spreading rate,
gullibility, probability of verifying a hoax, and probability
of forgetting one's current belief [10]. Many organizations
now use social media accounts on Twitter, Facebook and
Instagram to announce corporate information such as
earnings reports and new product releases. Consumers,
investors, and other stakeholders take these news messages
as seriously as they would those from any other mass
medium [11]. Other reasons for the wide proliferation of
fake news include humor and ploys to get readers to click on
sponsored content (also referred to as “clickbait”).
In this work, we aim to answer the following research
questions:
(1) Given tweets about a news item or story, can we
determine their truth or authenticity based on the
content of the messages?
(2) Can semantic features or linguistic characteristics
associated with a fake news story on Twitter be
automatically identified without prior knowledge
of the domain or news topic?
The remainder of the paper is structured as follows: in
Section 2, we discuss related works in rumor and fake news
detection, Section 3 presents the methodology that we
propose and utilize in our task of fake news detection,
while the results of our findings are discussed and
evaluated in Section 4. Finally, conclusions and future
work are presented in Section 5.
2 RELATED WORKS
Fake news detection was initially studied by several
authors who referred to it only as “rumors”, until the 2016
US presidential election. During the election, the phrase
“fake news” became popular with then-candidate Donald
Trump. Until recently, Twitter limited posts on its platform
to 140 characters, constraining users' ability to
communicate with others. Therefore, users who propagate
fake news, rumors and questionable posts have been found
to incorporate other media to make their messages go
viral. A good example happened in the aftermath of
Hurricane Sandy, when enormous numbers of fake and
altered images circulated on the internet. Gupta et al.
used a decision tree classifier to distinguish between fake
and real images posted about the event [12].
Neural networks are a family of machine learning methods
that have been found to exhibit high accuracy and precision
in clustering and classification of text [13]. They also prove
effective in the prompt detection of spatio-temporal trends
in content propagation on social media. In our approach,
we combine this with the efficiency of recurrent neural
networks (RNNs) in the detection and semantic
interpretation of images. Although this hybrid approach to
the semantic interpretation of text and images is not new
[14][15], to the best of our knowledge, this is the first
attempt to use a hybrid approach in the
detection of the origin and propagation of fake news posts.
Kwon et al. identified and utilized three hand-crafted feature
types associated with rumor classification: (1) temporal
features, how a tweet propagates from one time window to
another; (2) structural features, how the influence or
followership of posters affects other posts; and (3)
linguistic features, sentiment categories of words [16].
Table 1: Most Circulated and Engaging Fake News Stories on Facebook in 2016
S/N  Category
1    Politics
2    Crime
3    Politics
4    Politics
5    Crime
6    Crime
7    Politics
8    Politics
9    Crime
10   Politics
Previous work by Gupta et al. achieved 97% accuracy
in detecting fake images from tweets posted during the
Hurricane Sandy disaster in the United States [12]. They
performed a characterization analysis to understand the
temporal, social reputation and influence patterns for the
spread of fake images by examining more than 10,000
images posted on Twitter during the incident. They
used two broad types of features in the detection of fake
images posted during the event. These include seven
user-based features such as age of the user account,
follower count and the follower-followee ratio. They also
derived eighteen tweet-based features such as tweet length,
retweet count, and presence of emoticons and exclamation
marks.
Aggarwal et al. identified four feature types, based on
URLs, protocols to query databases, content, and
follower networks of tweets associated with phishing,
which presents a similar problem to fake and non-credible
tweets but also has the potential to cause
significant financial harm to someone clicking on the links
associated with these “phishing” messages [17].
Yardi et al. developed three feature types for spam
detection on Twitter, which include searches for URLs,
matching username patterns, and detection of keywords
from supposedly spam messages [18]. O'Donovan et al.
identified the most useful indicators of credible and
non-credible tweets as URLs, mentions, retweets, and tweet
lengths [19].
Other work on credibility and veracity identification
on Twitter includes that of Gupta et al., who developed a
framework and real-time assessment system for validating
authors' content on Twitter as it is being posted [20]. Their
approach assigns a graduated credibility score or rank to
each tweet as it is posted live on the social network
platform.
3 METHODOLOGY
The approach of our work is twofold. First is the
automatic identification of features within Twitter posts
without prior knowledge of the subject domain or topic of
discussion, using a hybrid deep learning
model of LSTM and CNN models. Second is the
determination and classification of fake news posts on
Twitter using both text and images.
We posit that since the use of deep learning models
enables automatic feature extraction, the dependencies
amongst the words in fake messages can be identified
automatically [13] without expressly defining them in the
network. The knowledge of the news topic or domain being
discussed would not be necessary to achieve the feat of
fake news detection.
3.1 The Deep learning Architectures
We implemented three deep neural network variants.
The models applied to train the datasets include:
3.1.1 Long-Short Term Memory (LSTM)
An LSTM recurrent neural network (RNN) was adopted for
the sequence classification of the data. The LSTM [21]
has remained a popular method for deep learning
classification involving text since it first appeared
about 20 years ago [22].
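To make the recurrence concrete, the following is a minimal sketch of a single LSTM step on scalar inputs, using the standard gate formulation [22]. It is an illustration only and not the implementation used in this work: the function name `lstm_step`, the `params` layout and the weight values are hypothetical, and a real model operates on vectors with learned weight matrices.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, params):
    """One scalar LSTM step. `params` maps a gate name to (w_x, w_h, bias)."""
    def pre_activation(gate):
        w_x, w_h, bias = params[gate]
        return w_x * x + w_h * h_prev + bias

    i = sigmoid(pre_activation("input"))         # how much new information to write
    f = sigmoid(pre_activation("forget"))        # how much of the old cell state to keep
    o = sigmoid(pre_activation("output"))        # how much of the cell state to expose
    g = math.tanh(pre_activation("candidate"))   # candidate cell update
    c = f * c_prev + i * g                       # new cell state (the "memory")
    h = o * math.tanh(c)                         # new hidden state
    return h, c

# Hypothetical weights, chosen only so the example runs.
gates = ("input", "forget", "output", "candidate")
params = {gate: (0.5, 0.5, 0.0) for gate in gates}

h, c = 0.0, 0.0
for x in [1.0, -1.0, 0.5]:   # a toy three-step input sequence
    h, c = lstm_step(x, h, c, params)
```

With the forget gate saturated open and the input gate closed, the cell state passes through a step unchanged, which is the mechanism that lets an LSTM carry information across long sequences.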
3.1.2 LSTM with dropout regularization
LSTM with dropout regularization [23] layers between
the word embedding layer and the LSTM layer was
adopted to avoid over-fitting to the training dataset.
Following this approach, we randomly selected and
dropped weights amounting to 20% of neurons in the
LSTM layer.
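The effect of a 20% dropout rate can be sketched as follows. This is a generic "inverted dropout" illustration under our own assumptions (the rescaling of surviving units by 1/(1 - rate) follows common practice and is not stated in this paper), and the function name `dropout` is hypothetical.

```python
import random

def dropout(activations, rate=0.2, training=True, rng=None):
    """Zero each unit with probability `rate`; rescale the survivors."""
    if not training or rate == 0.0:
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - rate
    # Dividing kept units by `keep` leaves the expected activation unchanged.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)                 # fixed seed so the sketch is repeatable
hidden = [1.0] * 10000                 # toy layer output
dropped = dropout(hidden, rate=0.2, rng=rng)
zero_fraction = dropped.count(0.0) / len(dropped)   # close to 0.2
```

At inference time (`training=False`) the layer is an identity, so no unit is dropped when the model is evaluated.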
3.1.3 LSTM with convolutional neural networks (CNN)
We included a 1D CNN [14] immediately after the word
embedding layer of the LSTM model. We further added a
max pooling layer to reduce the dimensionality of the input
layer while preserving the depth and avoiding over-fitting to
the training data. This also helps in reducing the
computational time and resources needed to train the model.
The overall aim is to improve model prediction accuracy.
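How a 1D convolution followed by max pooling shortens a sequence can be sketched as below. This is a simplified illustration on a scalar sequence with a hand-picked kernel; in the actual model the convolution would run over embedding vectors with learned filters, and the helper names here are hypothetical.

```python
def conv1d(sequence, kernel):
    """'Valid' 1D convolution (cross-correlation) over a scalar sequence."""
    k = len(kernel)
    return [sum(sequence[i + j] * kernel[j] for j in range(k))
            for i in range(len(sequence) - k + 1)]

def relu(values):
    return [max(0.0, v) for v in values]

def max_pool1d(values, pool_size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + pool_size])
            for i in range(0, len(values) - pool_size + 1, pool_size)]

signal = [0.0, 1.0, 3.0, 2.0, 0.0, -1.0, 4.0, 2.0]   # toy input of length 8
kernel = [1.0, 0.0, -1.0]                             # toy filter of width 3

features = relu(conv1d(signal, kernel))   # length 6: [0.0, 0.0, 3.0, 3.0, 0.0, 0.0]
pooled = max_pool1d(features, 2)          # length 3: [0.0, 3.0, 0.0]
```

The pooled sequence is what would be passed on to the LSTM layer, so the recurrent layer processes fewer time steps than the raw input contains.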
3.2 About the Dataset
The dataset consisted of approximately 5,800 tweets
centered on five rumor stories. The tweets were collected
and used in the work of Zubiaga et al. [24]. The dataset
consisted of original tweets labeled as rumors
or non-rumors. The events were widely reported in online,
print and conventional electronic media such as radio and
television at the time of occurrence:
CharlieHebdo
SydneySiege
Ottawa Shooting
Germanwings-Crash
Ferguson Shooting
We applied ten-fold cross validation on the entire dataset
of 5,800 tweets and performed padding of the tweets (i.e.,
adding zeros to the tweets for uniform inclusion in the
feature vector for analysis and processing).
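The padding and ten-fold split described above can be sketched as follows. The token ids are made up, padding on the right is an assumption (the text does not say on which side zeros were added), and the folds are contiguous for brevity, whereas in practice the data would usually be shuffled first.

```python
def pad_sequences(sequences, max_len, pad_value=0):
    """Right-pad (or truncate) each token-id sequence to exactly `max_len`."""
    padded = []
    for seq in sequences:
        clipped = list(seq[:max_len])
        padded.append(clipped + [pad_value] * (max_len - len(clipped)))
    return padded

def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

tweets = [[4, 17, 9], [8], [3, 5, 2, 11, 6]]      # hypothetical token ids
padded = pad_sequences(tweets, max_len=4)
# padded == [[4, 17, 9, 0], [8, 0, 0, 0], [3, 5, 2, 11]]

folds = list(k_fold_indices(5800, k=10))          # 10 folds of 580 test tweets each
```

Each tweet appears in exactly one test fold, so every example is used for evaluation once and for training nine times.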
3.3 Recurrent Neural Networks (RNNs)
This type of neural network has been shown to be effective
in time and sequence based predictions [25]. Twitter posts
can be likened to events that occur in time [16], where the
intervals between the retweets of one user and another are
contained within a time window and treated sequentially.
Rumors have been examined in the context of
varying time windows [26]. Recurrent neural networks
were initially limited by problems associated with the
adjustment of weights over time. These gradient problems
can largely be categorized into two types, namely the
exploding gradient and the vanishing gradient, and several
methods have been adopted to solve each. Solutions
adopted for the former include truncated backpropagation,
penalties and gradient clipping, while the vanishing
gradient problem has been addressed using dynamic weight
initialization, echo state networks (ESN) and long
short-term memory (LSTM) networks. LSTMs are the main
focus of this work, as they preserve the memory from the
last phase and incorporate it in the prediction task of the
neural network model. Weights are the long-term memories
of the neural network.
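Of the remedies for exploding gradients listed above, gradient clipping is the simplest to illustrate. The sketch below clips by global L2 norm; the threshold `max_norm` is a hypothetical value, and this work does not state whether or how clipping was applied.

```python
import math

def clip_by_norm(gradients, max_norm=1.0):
    """Rescale the gradient vector so its L2 norm does not exceed `max_norm`."""
    norm = math.sqrt(sum(g * g for g in gradients))
    if norm <= max_norm:
        return list(gradients)          # small gradients pass through unchanged
    scale = max_norm / norm
    return [g * scale for g in gradients]

exploding = [3.0, 4.0]                  # L2 norm 5.0
clipped = clip_by_norm(exploding, max_norm=1.0)
clipped_norm = math.sqrt(sum(g * g for g in clipped))   # approximately 1.0
```

Clipping changes only the length of the update, not its direction, which is why it bounds the step size without altering which way the weights move.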
3.4 Incorporating Convolutional Neural Network
Another popular model is the convolutional neural network
(CNN) which has been well known for its application in
image processing as well as use in text mining [27]. We
posit that addition of the hybrid method would improve
performance of the model and give much better results for
the content based fake news detection. However, the hybrid
implementation for this work so far involves a text-only
approach.
3.5 Selection of Training Parameters
The following hyper-parameters were optimized using a
grid search approach: batch size, number of epochs, learning
rate, activation function and dropout regularization rate,
the last of which is set at 20%. This hybrid method of
detecting fake news complements the natural language
semantic processing of text by ensuring that images allow
for the disambiguation of posts, making them more enriched
in the identification repository. We posit that it would also
be interesting to look at users’ impact on the propagation of
these messages and fake news content on Twitter.
The aim of the study is to detect the veracity of posts on
Twitter. Possible applications include assisting law
enforcement agencies in curtailing the spread and
propagation of such messages, especially when they have
negative implications and consequences for readers who
believe them.
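A grid search over the hyper-parameters named in this section can be sketched as follows. The candidate values and the scoring function are placeholders of our own: in practice `score_fn` would train the model with the given settings and return its cross-validated accuracy, and the grid actually searched in this work is not reported.

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Evaluate every combination in `param_grid`; return the best one and its score."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical candidate values, for illustration only.
param_grid = {
    "batch_size": [32, 64, 128],
    "epochs": [5, 10],
    "learning_rate": [0.001, 0.01],
    "activation": ["relu", "tanh"],
    "dropout_rate": [0.2, 0.5],
}

def toy_score(params):
    # Placeholder objective standing in for cross-validated accuracy.
    return -abs(params["dropout_rate"] - 0.2) - abs(params["learning_rate"] - 0.001)

best_params, best_score = grid_search(param_grid, toy_score)
```

The number of model fits grows multiplicatively with each added hyper-parameter (48 combinations here), which matters when every fit is itself a ten-fold cross validation.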
4 EVALUATION, RESULTS AND DISCUSSION
Our deep learning model achieves 82% accuracy
on the classification task of detecting fake news posts
without prior domain knowledge of the topics being
discussed. In the experiments completed so far, the
plain vanilla LSTM model achieved the
best performance in terms of precision, recall and
F-measure, with an accuracy of 82%, as shown in Table 2. On
the other hand, the LSTM method with dropout
regularization performed worst on the metrics
adopted. This is likely a result of underfitting, in
addition to the lack of sufficient training data and examples
within the network. Another reason for the low
performance of the dropout regularization may be the depth
of the network: it is relatively shallow, so the
dropout layer is quite close to the input and output
layers of the model, which could severely degrade the
performance of the method. An alternative way to improve
model performance could be batch normalization
[28], where the input values in the layers have a mean
activation of zero and a standard deviation of one, as in a
standard normal distribution; this is beyond the scope of
the current work.
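The normalization step mentioned above can be sketched as follows. This is a minimal illustration of the standardization at the core of batch normalization [28]; it omits the learnable scale and shift parameters and the running statistics used at inference time, and the function name is hypothetical.

```python
import math

def batch_normalize(values, eps=1e-5):
    """Standardize a batch of activations to zero mean and (near) unit variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    # `eps` guards against division by zero for a constant batch.
    return [(v - mean) / math.sqrt(variance + eps) for v in values]

activations = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # toy batch: mean 5, variance 4
normalized = batch_normalize(activations)

norm_mean = sum(normalized) / len(normalized)
norm_std = math.sqrt(sum((v - norm_mean) ** 2 for v in normalized) / len(normalized))
```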
The LSTM-CNN hybrid model performed better than
the dropout regularization model, with 80% accuracy and
an F-measure of 39.7%, compared with 74% accuracy for
the dropout model (Table 2). However, insufficient training
examples for the neural network model left it below
the plain-vanilla LSTM model.
The precision of 68% reported for the state of the art
on the PHEME dataset [24] is still higher than the results
obtained so far. However, we believe that with the inclusion
of more training data from the reactions to the original
Twitter posts, there will be a more significant improvement
in model performance.
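The four metrics reported in Table 2 can be computed from predictions as sketched below. The labels are invented for illustration and do not reproduce the reported figures; how the scores in Table 2 were averaged across folds and classes is not stated, so the single binary computation here is an assumption.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F-measure for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}

# Toy labels: 1 = rumor, 0 = non-rumor (3 rumors among 10 tweets).
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
scores = binary_metrics(y_true, y_pred)
```

In this toy case accuracy is 0.7 while the F-measure is only 0.4, which shows how accuracy can sit well above precision and recall when one class dominates the data.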
5 CONCLUSION AND FUTURE WORK
We have presented a novel approach for the detection of
fake news spread on Twitter from message text and images.
We achieved 82% accuracy, beating the
state of the art on the PHEME dataset. It is expected that
our approach will achieve much better results following
the incorporation of fake image disambiguation (such images
are found in these tweets and are aimed at making the posts
go viral).
We leverage a hybrid implementation of two deep
learning models to find that we do not require a very large
number of tweets about an event to determine the veracity
or credibility of the messages. Our approach gives a boost
in performance while not
requiring the large amount of training data typically
associated with deep learning models. We are progressively
examining the inference of the tweet geo-locations and
origin of these fake news items and the authors that
propagate them. It would also be interesting if the training
data required in this task were relatively small, such that
fake news items can be quickly identified. It would be
further beneficial if the origin and location [29] of fake
posts were tracked with minimal computational resources.
Table 2: Proposed Deep Learning Methods in Fake News Detection
Technique  ACC    PRE    REC    F-M
LSTM       82.29  44.35  40.55  40.59
LSTMDrop   73.78  39.67  29.71  30.93
LSTM-CNN   80.38  43.94  39.53  39.70
Deep learning models such as CNNs and RNNs often
require much larger datasets, and in some cases multiple
layers of neural networks, for effective training.
In our case, we have a small dataset of 5,800
tweets. In our ongoing and future work, we have collected
hundreds of thousands of reactions of other users to these
messages via the Twitter API,
with the aim of enlarging the training dataset
and thus improving the robustness of the model
performance. We expect that this will also help to draw
more actionable insights into the propagation of these
messages from one user to another and how users react,
specifically whether they embraced or refrained from becoming
evangelists and promoters of these messages to other users
on the platform.
REFERENCES
[1] J Keller. 2013. A fake AP tweet sinks the Dow for an instant. Bloomberg Businessweek (2013).
[2] Jure Leskovec and Julian J Mcauley. 2012. Learning to discover social circles in ego networks. In Advances in neural information processing systems. 539-547.
[3] Steve Schifferes, Nic Newman, Neil Thurman, David Corney, Ayse Göker, and Carlos Martin. 2014. Identifying and verifying news through social media: Developing a user-centred tool for professional journalists. Digital Journalism 2, 3 (2014), 406-418.
[4] Andrea Ceron, Luigi Curini, Stefano M Iacus, and Giuseppe Porro. 2014. Every tweet counts? How sentiment analysis of social media can improve our knowledge of citizens’ political preferences with an application to Italy and France. New Media & Society 16, 2 (2014), 340-358.
[5] Emilio Ferrara. 2015. Manipulation and abuse on social media by emilio ferrara with ching-man au yeung as coordinator. ACM SIGWEB Newsletter Spring (2015), 4.
[6] Jean Burgess, Farida Vis, and Axel Bruns. 2012. Hurricane Sandy: The Most Tweeted Pictures. The Guardian Data Blog, November 6 (2012).
[7] BBC. 2017. Facebook to tackle fake news in Germany 2017. (2017).
[8] Olivia Solon. 2016. Facebook’s failure: Did fake news and polarized politics get Trump elected. The Guardian 10 (2016).
[9] Craig Silverman. 2016. Here are 50 of the biggest fake news hits on Facebook from 2016. BuzzFeed, https://www.buzzfeed.com/craigsilverman/top-fake-news-of-2016 (2016).
[10] Marcella Tambuscio, Giancarlo Ruffo, Alessandro Flammini, and Filippo Menczer. 2015. Fact-checking effect on viral hoaxes: A model of misinformation spread in social networks. In Proceedings of the 24th International Conference on World Wide Web. ACM, 977-982.
[11] Andreas M Kaplan and Michael Haenlein. 2010. Users of the world, unite! The challenges and opportunities of Social Media. Business horizons 53, 1 (2010), 59-68.
[12] Aditi Gupta, Hemank Lamba, Ponnurangam Kumaraguru, and Anupam Joshi. 2013. Faking sandy: characterizing and identifying fake images on twitter during hurricane sandy. In Proceedings of the 22nd international conference on World Wide Web. ACM, 729-736.
[13] Jing Ma, Wei Gao, Prasenjit Mitra, Sejeong Kwon, Bernard J Jansen, Kam-Fai Wong, and Meeyoung Cha. 2016. Detecting Rumors from Microblogs with Recurrent Neural Networks. In IJCAI. 3818-3824.
[14] Andrej Karpathy and Li Fei-Fei. 2015. Deep visual-semantic alignments for generating image descriptions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 3128-3137.
[15] Jiang Wang, Yi Yang, Junhua Mao, Zhiheng Huang, Chang Huang, and Wei Xu. 2016. Cnn-rnn: A unified framework for multi-label image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2285-2294.
[16] Sejeong Kwon, Meeyoung Cha, Kyomin Jung, Wei Chen, and Yajun Wang. 2013. Prominent features of rumor propagation in online social media. In Data Mining (ICDM), 2013 IEEE 13th International Conference on. IEEE, 1103-1108.
[17] Anupama Aggarwal, Ashwin Rajadesingan, and Ponnurangam Kumaraguru. 2012. Phishari: automatic realtime phishing detection on twitter. In eCrime Researchers Summit (eCrime), 2012. IEEE, 1-12.
[18] Sarita Yardi, Daniel Romero, Grant Schoenebeck, et al. 2009. Detecting spam in a twitter network. First Monday 15, 1 (2009).
[19] John O’Donovan, Byungkyu Kang, Greg Meyer, Tobias Hollerer, and Sibel Adali. 2012. Credibility in context: An analysis of feature distributions in twitter. In Privacy, Security, Risk and Trust (PASSAT), 2012 international conference on and 2012 international conference on social computing (SocialCom). IEEE, 293-301.
[20] Aditi Gupta, Ponnurangam Kumaraguru, Carlos Castillo, and Patrick Meier. 2014. Tweetcred: Real-time credibility assessment of content on twitter. In International Conference on Social Informatics. Springer, 228-243.
[21] Klaus Greff, Rupesh K Srivastava, Jan Koutník, Bas R Steunebrink, and Jürgen Schmidhuber. 2017. LSTM: A search space odyssey. IEEE transactions on neural networks and learning systems 28, 10 (2017), 2222-2232.
[22] Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural computation 9, 8 (1997), 1735-1780.
[23] Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research 15, 1 (2014), 1929-1958.
[24] Arkaitz Zubiaga, Maria Liakata, and Rob Procter. 2016. Learning Reporting Dynamics during Breaking News for Rumour Detection in Social Media. arXiv preprint arXiv:1610.07363 (2016).
[25] Jing Ma, Wei Gao, Zhongyu Wei, Yueming Lu, and Kam-Fai Wong. 2015. Detect rumors using time series of social context information on microblogging websites. In Proceedings of the 24th ACM International Conference on Information and Knowledge Management. ACM, 1751-1754.
[26] Sejeong Kwon, Meeyoung Cha, and Kyomin Jung. 2017. Rumor detection over varying time windows. PloS one 12, 1 (2017), e0168344.
[27] Shiou Tian Hsu, Changsung Moon, Paul Jones, and Nagiza Samatova. 2017. A Hybrid CNN-RNN Alignment Model for Phrase-Aware Sentence Classification. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, Vol. 2. 443-449.
[28] Sergey Ioffe and Christian Szegedy. 2015. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International conference on machine learning. 448-456.
[29] Oluwaseun Ajao, Jun Hong, and Weiru Liu. 2015. A survey of location inference techniques on Twitter. Journal of Information Science 41, 6 (2015), 855-864.
... Numerous studies in the field of rumor detection employ various machine learning (ML) techniques, including support vector machine (SVM) [23][24][25], the susceptible-exposed-infected-skeptic (SEIZ) model [26], decision tree (DT), random forest (RF), k-nearest neighbor (KNN) [24], and the naive Bayes (NB) classifier [27,28]. Additionally, deep learning (DL) methods, specifically convolutional neural network (CNN) [29][30][31][32] and recurrent neural network (RNN) [33][34][35] approaches, have also been explored. ...
... Asghar et al. [31] proposed a novel bidirectional CNN that classifies tweets either as a rumor or credible with 86.12% accuracy. Similarly, Ajao et al. [29] proposed a hybrid LSTM-CNN for identifying rumors on X with 82% accuracy. ...
... A summary of key studies that served as the basis for incorporating ML approaches into our proposed HEA model is provided in Table 1. [24] DT/SVM/RF X Yang et al. [25] SVM Weibo Jin et al. [26] SEIZ X Dayani et al. [27] KNN+Bayes X Ajao et al. [29] LSTM+CNN X Alsaeedi et al. [30] CNN PHEME Asghar et al. [31] CNN PHEME Alkhodair et al. [33] Word2Vec+LSTM PHEME Ma et al. [35] RNN X Proposed method HEA model PHEME ...
Article
The paper describes a novel a hybrid ensemble algorithm (HEA) that combines ensemble learning, class imbalance handling, and feature extraction. To address class imbalance in the dataset, the suggested approach integrates SMOTE oversampling and random under sampling (RU) feature extraction. To begin, Pearson correlation analysis is used to detect highly associated features in a dataset. This analysis aids in the selection of the most relevant features, which are either substantially related to the target variable or have a strong association with other features. The method seeks to improve classification performance by focusing on these correlated features. Following that, the SMOTE oversampling and RU algorithms are used to balance the majority and minority categorization characteristics. The SMOTE (synthetic minority oversampling technique) develops synthetic cases for the minority class by interpolating between existing instances, enhancing minority class representation. RU, on the other hand, removes instances from the majority class at random to obtain a balanced distribution. Furthermore, the random forest classifier (RFC) model’s key features are input into an ensemble of decision tree (DT), k-nearest neighbor (KNN), adaptive boosting (AdaBoost), and convolutional neural network (CNN) approaches. This ensemble approach combines multiple models’ predictions, exploiting their particular strengths and catching varied patterns in the data. Popular machine learning algorithms include DT, KNN, AdaBoost, and CNN, which are notable for their capacity to handle many types of data and capture complicated relationships. The evaluation findings show that the suggested HEA approach is effective, with a maximum precision, recall, F-score, and accuracy of 90%. The proposed methodology produces encouraging results, proving its applicability to a variety of categorization problems.
... Understanding the motive or emotion behind the spread lies can also help classifying an information into real or fake. Detecting and classifying information of this complexity necessitates the utilization of state of art techniques and methods like Deep Learning (DL) & Natural Language Processing (NLP) [15][16][17][18]. ...
... The authors tested this hybrid model with chi-Square and PCA which resulted in an accuracy of 95.2% and 97.8% respectively [4]. Ajao et al. also used CNN-LSTM to achieve an accuracy of 80.83% [16]. ...
Article
Spread of Fake news is a major challenge in this digital age where information is shared through various means like internet articles or social media. Since the proliferation of fake news is on the rise, the spread of misinformation is a major concern. Spreading of fake news have wide repercussions as it has a wider reach due to straightforward access and rapid dispersion. This paper proposes a Deep Learning approach along with Natural Language Processing Techniques as the solution for fake news detection. We used ISOT dataset to train and test our model with GloVe model for word embeddings and bidirectional Long Short-Term Memory (LSTM) as the neural network classifier to achieve an accuracy of 99.35%. Keywords – Deep Learning, Fake News Detection, Long Short-Term Memory (LSTM), Natural Language Processing.
... We used an SVM with a Gaussian kernel (RBF) function. • Back propagation neural network (BPNN) [65] achieves efcient mapping from input space to output space by building a series of hierarchical structures to extract and learn key features in the data layer by layer. Te structure of the BPNN model in this experiment included an input layer, an output layer, and a hidden layer. ...
Article
Full-text available
User and artificial intelligence generated contents, coupled with the multimodal nature of information, have made the identification of false news an arduous task. While models can assist users in improving their cognitive abilities, commonly used black-box models lack transparency, posing a significant challenge for interpretability. This study proposes a novel credibility assessment method of social media content, leveraging multimodal features by optimizing the hierarchical belief rule-based (HBRB) inference method. Compared to other popular feature engineering and deep learning models, our method integrates, analyses, and filters relevant features, improving the HBRB structure to make the model layered, independent, and interconnected, enhancing interpretability and controllability, thereby addressing the rule combination explosion problem. The results highlight the potential of our method to improve the integrity of the online information ecosystem, offering a promising solution for more transparent and reliable credibility assessment in social media.
... when dense units are 100 with window size 5. Further a dropout method of regularisation is used which lessens the problem of overfitting and is mostly used in the task of fake news detection (Ajao et al. 2018). ...
Article
The expeditious propagation of fake news through online social media platforms has cropped up as a captious challenge, undermining the credibility of information sources and affecting public trust. Accurate detection of fake news is imperative to maintain the integrity of online content but is constrained by availability of data. This research aims to detect fake news from online articles by proposing a novel deep learning ensemble network capable of effectively discerning between genuine and fabricated news articles using limited data. We introduce ScrutNet, which leverages the synergistic capabilities of a bidirectional long short-term memory network and a convolutional neural network, which have been meticulously designed and fine-tuned for the task by us. This comprehensive ensemble classifier captures both sequential dependencies and local patterns within the textual data without requiring very large datasets like transformer based models. Through rigorous experimentation, we optimise the individual model parameters and ensemble strategy. The experimental results showcase the remarkable efficacy of ScrutNet in the detection of fake news, with an outstanding precision of 99.56%, 99.43% specificity, and an F1 score of 99.49% achieved on the partition test of the data set. Comparative analysis against state-of-the-art baselines demonstrates the superior performance of ScrutNet, establishing its prominence as a generalised and dependable fake news detection mechanism.
... Deep learning has also demonstrated promising outcomes in various text classification tasks. In the context of false information detection, the widely implemented neural network framework approaches on detecting rumour are recurrent neural network (RNN) [6][7][8], long short-term memory (LSTM) [14][15][16], and convolutional neural network (CNN) [9][10][11][12][13]. ...
Article
Online misinformation poses a significant challenge due to its rapid spread and limited supervision. To address this issue, automated rumour detection techniques are essential for countering the negative impact of false information. Previous research primarily focussed on extracting text features, which proved time-consuming and less effective. In this study, we contribute substantially to two domains: rumour detection on Twitter and the evaluation of text embeddings. We thoroughly analyse rumour detection models and compare the quality of text embeddings generated by various fine-tuned BERT-based models. Our findings indicate that our proposed models outperform existing techniques. Notably, when we test these models on combined datasets, we observe significant performance improvements with larger training and testing data sizes. We conclude that carefully considering the dataset, data splitting, and classification techniques is crucial for evaluating solution performance. Additionally, we find that differences in the quality of text embeddings between RoBERTa, BERT, and DistilBERT are insignificant. This challenges existing assumptions and highlights the need for future research to explore these nuances further.
... The integration of multimodal data and hybrid neural architectures has significantly advanced the field since 2018. For example, Ajao et al. (2018) developed a hybrid convolutional neural network (CNN) and RNN model for the identification of fake news on Twitter (X), demonstrating how combining CNNs and RNNs can effectively analyze text and sequential patterns in social media posts. Their work exemplifies the growing importance of leveraging multiple neural architectures to improve detection accuracy and scalability. ...
Article
Social media, particularly microblogging platforms, are essential for rapid information sharing and public discussion but often allow rumors, that is, unverified information, to spread rapidly during events or persist over time. These platforms also offer opportunities to study the dynamics of rumors and develop computational methods to assess their veracity. In this paper, we provide a comprehensive review of existing theoretical foundations, interdisciplinary challenges, and emerging advancements in rumor detection research, with a focus on integrating theoretical and computational approaches. Drawing on insights from computer science, cognitive psychology, and sociology, we explore methodologies, such as multimodal fusion, graph‐based models, and attention mechanisms, while highlighting gaps in real‐world scalability, ethical transparency, and cross‐platform adaptability. Using a systematic literature review and bibliometric analysis, we identify trends, methods, and gaps in current research. Our findings emphasize interdisciplinary collaboration to develop adaptable, efficient, and ethical rumor detection strategies. We also highlight the critical role of combining socio‐psychological insights with advanced computational techniques to address the human factors in rumor spread. Furthermore, we emphasize the importance of designing systems that remain effective across diverse cultural and linguistic contexts, enhancing their global applicability. We propose a conceptual framework integrating diverse theories and computational techniques, offering a roadmap for improving detection systems and addressing misinformation challenges on microblogging platforms.
... used to detect a rumour. These features can be the source, headline, text of the post or image, etc. Several researchers have tried to identify the features that can be used for rumour detection. For example, content-based text and media (images, GIFs, etc.) are considered by some researchers to be the most helpful features for detecting rumours (Ajao et al., 2018), whereas others have found that tracking the activity of a user or account is more beneficial (Ferrara et al., 2016). Similarly, some have concluded that network features, i.e. the followers and friends of a user, can also be of great help (Tacchini et al., 2017). ...
Article
Social media plays a vital role in modern life and has attracted a huge following. As the connectivity and accessibility of networks have grown, so has the use of social media. With this increase in accessibility, the speed at which rumours, i.e. unverified information, spread has also increased. A lot of research has been done on detecting and intercepting rumours on social media. This paper highlights the prominent research in rumour detection and focuses on the process of both rumour generation and detection. Its purpose is to highlight the feature extraction phase of the rumour detection process, i.e. how and why it is significant. The paper shows the importance of feature identification by forming categorizations and finding their impact on the rumour detection process.
... Deep learning is a subset of machine learning that can analyze unsupervised, unstructured or unlabeled data. On the other hand, deep learning can learn optimal features from available data without human intervention [25]. ...
Article
Keywords: Deep learning; Feature extraction; Formatting; Machine learning; Natural language processing; NLP applications; Word embedding. The use of deep learning techniques in natural language processing (NLP) is examined thoroughly in this study, with particular attention to tasks where deep learning has been shown to perform very effectively. The primary strategies explored are word embedding, feature extraction, and text cleaning, all of which are essential for classifying text data and documents. The study looks at important tools such as software, hardware, widely used libraries, and cutting-edge applications of deep learning in NLP. In NLP, deep learning is becoming a major trend, changing many areas and making large changes in many fields. This paper stresses how wide-ranging the effect of deep learning techniques is and how important they are for moving the field forward. It also discusses how deep learning may help solve current issues and handle challenging situations in NLP research. The growing popularity of these methods indicates that they are good at handling many NLP tasks. The final part of the review covers the most recent uses, developing trends, and long-term problems in NLP. It helps practitioners and academics assess and use the capabilities of deep learning in the dynamic field of natural language processing, with relevant facts and examples. Contribution/Originality: This study differs in that it provides a comprehensive analysis of recent deep learning methods in NLP by combining an investigation of theoretical foundations with practical implementations. Unlike earlier research, it focuses on generative models, unsupervised and reinforcement learning approaches, and emerging trends, providing a comprehensive view of NLP's evolving landscape.
Preprint
Cyber information influence, or disinformation in general terms, is widely regarded as one of the biggest threats to social progress and government stability. From US presidential elections to European Union referendums and down to regional news reporting of wildfires, lies and post-truths have normalized radical decision-making. Accordingly, there has been an explosion in research seeking to detect disinformation in online media. The frontier of disinformation detection research is leveraging a variety of ML techniques such as traditional ML algorithms like Support Vector Machines, Random Forest, and Naïve Bayes. Other research has applied deep learning models including Convolutional Neural Networks, Long Short-Term Memory networks, and transformer-based architectures. Despite the overall success of such techniques, the literature demonstrates inconsistencies when viewed holistically which limits our understanding of the true effectiveness. Accordingly, this work employed a two-stage meta-analysis to (a) demonstrate an overall meta statistic for ML model effectiveness in detecting disinformation and (b) investigate the same by subgroups of ML model types. The study found the majority of the 81 ML detection techniques sampled have greater than an 80% accuracy with a Mean sample effectiveness of 79.18% accuracy. Meanwhile, subgroups demonstrated no statistically significant difference between-approaches but revealed high within-group variance. Based on the results, this work recommends future work in replication and development of detection methods operating at the ML model level.
Article
This study determines the major difference between rumors and non-rumors and explores rumor classification performance levels over varying time windows, from the first three days to nearly two months. A comprehensive set of user, structural, linguistic, and temporal features was examined and their relative strength was compared using near-complete Twitter data. Our contribution is to provide deep insight into the cumulative spreading patterns of rumors over time and to track the precise changes in predictive power across rumor features. Statistical analysis finds that structural and temporal features distinguish rumors from non-rumors over a long-term window, yet they are not available during the initial propagation phase. In contrast, user and linguistic features are readily available and act as a good indicator during the initial propagation phase. Based on these findings, we suggest a new rumor classification algorithm that achieves competitive accuracy over both short and long time windows. These findings provide new insights for explaining rumor mechanism theories and for identifying features of early rumor detection.
Article
Breaking news leads to situations of fast-paced reporting in social media, producing all kinds of updates related to news stories, albeit with the caveat that some of those early updates tend to be rumours, i.e., information with an unverified status at the time of posting. Flagging information that is unverified can be helpful to avoid the spread of information that may turn out to be false. Detection of rumours can also feed a rumour tracking system that ultimately determines their veracity. In this paper we introduce a novel approach to rumour detection that learns from the sequential dynamics of reporting during breaking news in social media to detect rumours in new stories. Using Twitter datasets collected during five breaking news stories, we experiment with Conditional Random Fields as a sequential classifier that leverages context learnt during an event for rumour detection, which we compare with the state-of-the-art rumour detection system as well as other baselines. In contrast to existing work, our classifier does not need to observe tweets querying a piece of information to deem it a rumour, but instead we detect rumours from the tweet alone by exploiting context learnt during the event. Our classifier achieves competitive performance, beating the state-of-the-art classifier that relies on querying tweets with improved precision and recall, as well as outperforming our best baseline with nearly 40% improvement in terms of F1 score. The scale and diversity of our experiments reinforces the generalisability of our classifier.
Conference Paper
Microblogging platforms are an ideal place for spreading rumors and automatically debunking rumors is a crucial problem. To detect rumors, existing approaches have relied on hand-crafted features for employing machine learning algorithms that require daunting manual effort. Upon facing a dubious claim, people dispute its truthfulness by posting various cues over time, which generates long-distance dependencies of evidence. This paper presents a novel method that learns continuous representations of microblog events for identifying rumors. The proposed model is based on recurrent neural networks (RNN) for learning the hidden representations that capture the variation of contextual information of relevant posts over time. Experimental results on datasets from two real-world microblog platforms demonstrate that (1) the RNN method outperforms state-of-the-art rumor detection models that use hand-crafted features; (2) performance of the RNN-based algorithm is further improved via sophisticated recurrent units and extra hidden layers; (3) RNN-based method detects rumors more quickly and accurately than existing techniques, including the leading online rumor debunking services.
Conference Paper
The goal of this work is to introduce a simple modeling framework to study the diffusion of hoaxes and in particular how the availability of debunking information may contain their diffusion. As traditionally done in the mathematical modeling of information diffusion processes, we regard hoaxes as viruses: users can become infected if they are exposed to them, and turn into spreaders as a consequence. Upon verification, users can also turn into non-believers and spread the same attitude with a mechanism analogous to that of the hoax-spreaders. Both believers and non-believers, as time passes, can return to a susceptible state. Our model is characterized by four parameters: spreading rate, gullibility, probability to verify a hoax, and that to forget one's current belief. Simulations on homogeneous, heterogeneous, and real networks for a wide range of parameters values reveal a threshold for the fact-checking probability that guarantees the complete removal of the hoax from the network. Via a mean field approximation, we establish that the threshold value does not depend on the spreading rate but only on the gullibility and forgetting probability. Our approach allows to quantitatively gauge the minimal reaction necessary to eradicate a hoax.
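The compartment structure described in this abstract can be illustrated with a small mean-field sketch. The abstract names the four parameters but not the transition probabilities, so the spreading terms below are an assumption made for illustration, as are the names `step`, `simulate`, `beta` (spreading rate), `alpha` (gullibility), `p_verify` and `p_forget`. This is not the authors' implementation.

```python
def step(s, b, f, beta, alpha, p_verify, p_forget):
    """Advance the susceptible (s), believer (b) and fact-checker (f)
    population fractions by one time step (mean-field approximation)."""
    # ASSUMPTION: gullibility alpha tilts exposure toward the hoax.
    # The abstract does not give these functional forms.
    weight_b = b * (1 + alpha)
    weight_f = f * (1 - alpha)
    total = weight_b + weight_f
    to_b = beta * weight_b / total if total > 0 else 0.0  # S -> B
    to_f = beta * weight_f / total if total > 0 else 0.0  # S -> F
    # Believers verify (B -> F) or forget (B -> S); fact-checkers forget (F -> S).
    new_s = s - s * (to_b + to_f) + p_forget * (b + f)
    new_b = b + s * to_b - b * (p_verify + p_forget)
    new_f = f + s * to_f + b * p_verify - f * p_forget
    return new_s, new_b, new_f


def simulate(steps, b0, beta, alpha, p_verify, p_forget):
    """Run the model from an initial believer fraction b0 with no fact-checkers."""
    s, b, f = 1.0 - b0, b0, 0.0
    for _ in range(steps):
        s, b, f = step(s, b, f, beta, alpha, p_verify, p_forget)
    return s, b, f


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only to exercise the code.
    for p_verify in (0.0, 0.05, 0.2):
        s, b, f = simulate(500, 0.05, 0.5, 0.3, p_verify, 0.1)
        print(f"p_verify={p_verify:.2f}  believers={b:.3f}  fact-checkers={f:.3f}")
```

Sweeping `p_verify` as in the usage loop is one way to explore how the believer fraction responds to the fact-checking probability; the sketch requires `p_verify + p_forget <= 1` and `beta <= 1` for the fractions to stay valid.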
Conference Paper
Automatically identifying rumors from online social media, especially microblogging websites, is an important research issue. Most existing work on rumor detection focuses on modeling features related to microblog contents, users and propagation patterns, but ignores the importance of the variation of these social context features during the message propagation over time. In this study, we propose a novel approach to capture the temporal characteristics of these features based on the time series of rumor’s lifecycle, for which time series modeling technique is applied to incorporate various social context information. Our experiments using the events in two microblog datasets confirm that the method outperforms state-of-the-art rumor detection approaches by large margins. Moreover, our model demonstrates strong performance on detecting rumors at an early stage after their initial broadcast.
Article
The increasing popularity of the social networking service, Twitter, has made it more involved in day-to-day communications, strengthening social relationships and information dissemination. Conversations on Twitter are now being explored as indicators within early warning systems to alert of imminent natural disasters such as earthquakes and aid prompt emergency responses to crime. Producers are privileged to have limitless access to market perception from consumer comments on social media and microblogs. Targeted advertising can be made more effective based on user profile information such as demography, interests and location. While these applications have proven beneficial, the ability to effectively infer the location of Twitter users has even more immense value. However, accurately identifying where a message originated from or author’s location remains a challenge thus essentially driving research in that regard. In this paper, we survey a range of techniques applied to infer the location of Twitter users from inception to state-of-the-art. We find significant improvements over time in the granularity levels and better accuracy with results driven by refinements to algorithms and inclusion of more spatial features.
Article
Deep neural nets with a large number of parameters are very powerful machine learning systems. However, overfitting is a serious problem in such networks. Large networks are also slow to use, making it difficult to deal with overfitting by combining the predictions of many different large neural nets at test time. Dropout is a technique for addressing this problem. The key idea is to randomly drop units (along with their connections) from the neural network during training. This prevents units from co-adapting too much. During training, dropout samples from an exponential number of different "thinned" networks. At test time, it is easy to approximate the effect of averaging the predictions of all these thinned networks by simply using a single unthinned network that has smaller weights. This significantly reduces overfitting and gives major improvements over other regularization methods. We show that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets. © 2014 Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever and Ruslan Salakhutdinov.
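The mechanism summarized in this abstract (randomly dropping units during training, then using the full network with scaled-down weights at test time) can be sketched in a few lines. This is a minimal illustration on a plain list of activations, not the authors' code; the names `dropout_train`, `dropout_test` and `keep_prob` are assumptions introduced here.

```python
import random


def dropout_train(activations, keep_prob, rng):
    """Training-time pass: keep each unit with probability keep_prob,
    otherwise set its activation to zero."""
    return [a if rng.random() < keep_prob else 0.0 for a in activations]


def dropout_test(activations, keep_prob):
    """Test-time pass: keep every unit but scale it by keep_prob, which
    approximates averaging over the randomly thinned networks."""
    return [a * keep_prob for a in activations]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs give the same masks
    hidden = [1.0, 2.0, 3.0, 4.0]
    print("train:", dropout_train(hidden, 0.5, rng))
    print("test: ", dropout_test(hidden, 0.5))
```

Many current libraries instead rescale the surviving units by `1 / keep_prob` during training ("inverted dropout") so that no scaling is needed at test time; the version above follows the test-time scaling described in the abstract.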
Article
Our personal social networks are big and cluttered, and currently there is no good way to organize them. Social networking sites allow users to manually categorize their friends into social circles (e.g. 'circles' on Google+, and 'lists' on Facebook and Twitter), however they are laborious to construct and must be updated whenever a user's network grows. We define a novel machine learning task of identifying users' social circles. We pose the problem as a node clustering problem on a user's ego-network, a network of connections between her friends. We develop a model for detecting circles that combines network structure as well as user profile information. For each circle we learn its members and the circle-specific user profile similarity metric. Modeling node membership to multiple circles allows us to detect overlapping as well as hierarchically nested circles. Experiments show that our model accurately identifies circles on a diverse set of data from Facebook, Google+, and Twitter for all of which we obtain hand-labeled ground-truth.