This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2021.3056079, IEEE Access
Date of publication xxxx 00, 0000, date of current version November 8, 2020.
Digital Object Identifier 10.1109/ACCESS.2017.DOI
A Novel Stacking Approach for Accurate
Detection of Fake News
TAO JIANG1, JIAN PING LI1, AMIN UL HAQ1, ABDUS SABOOR 1, And AMJAD ALI 2
1School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China.
2Department of Computer Science and Software Technology, University of Swat,19200, KPK, Pakistan.
Corresponding authors: Tao Jiang, Jian Ping Li, and Amin Ul Haq (e-mail: tao1024@yahoo.com, jpli2222@uestc.edu.cn, khan.amin50@yahoo.com).
This work was supported in part by the National Natural Science Foundation of China under Grant 61370073, in part by the National High
Technology Research and Development Program of China under Grant 2007AA01Z423, and in part by the project of the Science and
Technology Department of Sichuan Province.
ABSTRACT With the increasing popularity of social media, people have changed the way they access news. Online news has become a major source of information. However, much of the information appearing on the Internet is dubious and even intended to mislead. Some fake news items are so similar to real ones that it is difficult for humans to identify them. Therefore, automated fake news detection tools such as machine learning and deep learning models have become an essential requirement. In this paper, we evaluated the performance of five machine learning models and three deep learning models on two fake and real news datasets of different sizes using hold-out cross validation. We also used term frequency, term frequency-inverse document frequency, and embedding techniques to obtain text representations for the machine learning and deep learning models, respectively. To evaluate the models' performance, we used accuracy, precision, recall, and F1-score as evaluation metrics, and a corrected version of McNemar's test to determine whether the models' performance differs significantly. We then propose a novel stacking model, which achieved testing accuracies of 99.94% and 96.05% on the ISOT dataset and KDnugget dataset, respectively. Furthermore, the performance of the proposed method is high compared with baseline methods. We therefore highly recommend it for fake news detection.
INDEX TERMS
Deception detection, deep learning, fake news, machine learning, McNemar’s test, performance evaluation,
stacking.
I. INTRODUCTION
With the rapid development of the Internet, social media has become a perfect hotbed for spreading fake news, distorted information, fake reviews, rumors, and satire. Many people think the 2016 U.S. presidential election campaign was influenced by fake news. Subsequent to that election, the term entered the mainstream vernacular [46].
Nowadays fake news has become a major concern for both industry and academia. One solution to this problem is human fact-checking. However, the real-time nature of fake news on social media makes identifying online fake news even more difficult [46]. Expert fact-checking may be of very limited help because of its low efficiency. In addition, fact-checking by humans is laborious and expensive. Thus, we need Machine Learning (ML) and Deep Learning (DL) models to automate this process. Various hierarchical classification methods, such as [30] and [1], can also be used for fake news detection.
In recent years, to distinguish fake news from real news, many researchers have been working on effective and automatic frameworks for online fake news detection, and many have proposed models based on machine learning and deep learning techniques. However, these proposed methods have limitations in terms of accuracy. To tackle these issues and effectively detect fake news, a new method is necessary.
In this paper, we evaluated different classification algorithms such as logistic regression (LR) [45], support vector machine (SVM) [11], k-nearest neighbor (k-NN) [45], decision tree (DT) [42], random forest (RF), convolutional neural network (CNN), gated recurrent unit (GRU) [10], and long short-term memory (LSTM) [20] for the detection of fake news. We then used a stacking method to improve on the individual models' performance. Our paper involves two datasets: the ISOT dataset1 and the KDnugget dataset2. We used techniques like term frequency (TF), term frequency-inverse document frequency (TF-IDF), and embedding to tokenize the title and text features of these two datasets. The grid search technique was used for tuning hyperparameters and for model selection. Various performance evaluation metrics have been used, such as accuracy, recall, F1-score, precision, and training time. The experimental results of the proposed stacking method have been compared with state-of-the-art results in the published literature. Furthermore, all experimental results have been tabulated and shown graphically in figures for better understanding.
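As a toy illustration of the stacking idea referred to above (base models whose predictions feed a meta-learner), the sketch below uses two hypothetical rule-based "base models" and a majority-vote "meta-learner". The paper's actual stack uses the trained ML/DL models described later, not these rules; everything here is illustrative only.

```python
def base_length(text):
    # Hypothetical base model: longer articles are flagged as real (1).
    return 1 if len(text.split()) > 5 else 0

def base_keyword(text):
    # Hypothetical base model: a sensational keyword flags the item as fake (0).
    return 0 if "shocking" in text.lower() else 1

def meta_majority(preds):
    # Minimal meta-learner: majority vote over the base predictions.
    return 1 if sum(preds) * 2 >= len(preds) else 0

def stacking_predict(text, base_models=(base_length, base_keyword)):
    # Level-1 predictions become the feature vector for the meta-learner.
    level_one = [m(text) for m in base_models]
    return meta_majority(level_one)
```

In the real pipeline, the meta-learner is itself a trained classifier fitted on the base models' out-of-sample predictions rather than a fixed vote.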
This paper makes the following contributions:
Firstly, five machine learning models and three deep learning models have been trained to compare the performance differences between individual models.
Secondly, we used two datasets of different sizes to test the models' robustness.
Thirdly, we employed a corrected version of McNemar's statistical test to decide whether there are significant differences between two models' performance and to choose the best individual model for fake news detection.
Lastly, our proposed stacking model outperformed the state-of-the-art methods.
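The corrected McNemar statistic mentioned in the contributions can be computed from the two discordant counts of a paired contingency table. The sketch below assumes the standard continuity-corrected form; the exact variant used in the experiments may differ in detail.

```python
def mcnemar_corrected(b, c):
    """Continuity-corrected McNemar chi-square statistic.

    b: cases model A classified correctly and model B misclassified
    c: cases model B classified correctly and model A misclassified
    """
    if b + c == 0:
        return 0.0  # no discordant pairs, no evidence of a difference
    return (abs(b - c) - 1) ** 2 / (b + c)

# The statistic approximately follows a chi-square distribution with one
# degree of freedom; values above about 3.84 suggest a significant
# performance difference at the 5% level.
```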
The rest of this paper is organized as follows. Section 2 presents the literature review. Section 3 discusses the details of the datasets and classification models used in this paper. The experimental results are presented in Section 4. The conclusion and directions for future work are discussed in Section 5.
II. LITERATURE REVIEW
To detect fake news, numerous machine learning and deep learning techniques have been recommended by various scholars. In this research study, we present some of the baseline fake news detection techniques. The major objective of the literature review is to identify the problems in the baseline methods and provide a reliable solution.
Some researchers evaluated many machine learning models on different datasets to choose the best individual model. Ozbay et al. [32] implemented twenty-three supervised artificial intelligence algorithms on three datasets, including BayesNet, JRip, OneR, Decision Stump, ZeroR, Stochastic Gradient Descent (SGD), Logistic Model Tree (LMT), etc. According to their experimental results, the decision tree algorithm outperformed all other intelligent classification algorithms on all evaluation metrics except recall. Kaliyar et al. [22] used Random Forest, Multinomial Naïve Bayes, Gradient Boosting, Decision Tree, Logistic Regression, and Linear-SVM for fake news detection. They found that gradient boosting provides state-of-the-art results, achieving an accuracy of 86% on the Fake News Challenge dataset. Gilda et al. [15] used TF-IDF of bi-grams and probabilistic context free grammar (PCFG) features to classify news from reliable sources (labeled as 0) and unreliable sources (labeled as 1). They then evaluated SVM, SGD, Gradient Boosting, Decision Trees, and Random Forests trained on TF-IDF-only features, PCFG-only features, and combined TF-IDF and PCFG features. Finally, they concluded that the SGD model trained on the TF-IDF feature set alone presented the best performance on the ROC AUC measure.

1 https://www.uvic.ca/engineering/ece/isot/datasets/fake-news/index.php
Some researchers have designed novel neural networks for fake news detection. Umer et al. [41] proposed a model that combines neural network architectures, CNN and LSTM, with dimensionality reduction methods, PCA and Chi-Square, to determine whether a news article's headline agrees with its body text. They observed that the proposed model achieved the highest accuracy, 97.8%, in much shorter time. Kaliyar et al. [23] created a CNN-based deep neural network called FNDNet and achieved state-of-the-art results with an accuracy of 98.36% on the Kaggle fake news dataset. Kumar et al. [25] applied a CNN + BiLSTM ensemble model with an attention mechanism to their own dataset and the FakeNewsNet dataset and achieved a highest accuracy of 88.78%. Ajao et al. [4] used a hybrid of CNN and RNN to classify fake news messages from Twitter posts. They compared the performance of a plain LSTM model, an LSTM with dropout regularization, and an LSTM-CNN hybrid model on a dataset containing approximately 5,800 tweets centered on five rumor stories. They concluded that the plain LSTM model has the best performance, while the LSTM with dropout regularization suffers from underfitting and the LSTM-CNN hybrid model suffers from limited data. Roy et al. [37] utilized a CNN to identify hidden features and an RNN to capture temporal sequence, and fed the obtained representation into an MLP for classification. They used pre-trained 300-dimensional Google News Vectors to get feature embeddings and fed them into separate convolutional layers and separate Bi-LSTM layers. Their model was tested on the LIAR dataset with an accuracy of 44.87%, which outperforms the state-of-the-art model by 3%.
Some researchers have tried to address the spreading of fake news on social network platforms. Ma et al. [27] used a deep learning model to detect rumors on Twitter and Weibo. Their RNN-based model allows for early detection and achieves significant improvement over state-of-the-art algorithms. Monti et al. [29] presented a propagation-based approach for fake news detection on the Twitter social network. They used a four-layer Graph CNN with two convolutional layers and two dense layers and achieved strong performance with 92.7% ROC AUC. Ruchansky et al. [38] built a CSI model that utilized the text, the user response, and source characteristics together for fake news detection. The Capture module in CSI was constructed with an LSTM to exploit the temporal pattern of user responses and text. The Score module of CSI used a neural network and a user graph to assign a score to each user; the score can then be used to identify suspicious users. The Integrate module combined the information from the first two modules and classified each article as fake or not fake.
To improve results, some researchers have also used other news features. Reis et al. [36] evaluated discriminative features extracted from news content, news sources, and the environment using several classifiers, such as k-NN, NB, RF, SVM with an RBF kernel, and XGBoost, with respect to the area under the ROC curve and the macro F1 score. Della Vedova et al. [12] proposed a novel machine learning fake news detection method which, by combining news content and social context features, increases accuracy by up to 4.8%. They then validated it on a real-world dataset, obtaining a fake news detection accuracy of 81.7%. Shabani et al. [39] selected five different machine learning models, Logistic Regression, SVM, Random Forest, Neural Networks, and Gradient Boosting Classifier, for fake news and satire classification. Using TF-IDF features plus paralinguistic features, sentiment-related features, and text similarity features extracted by querying Google, their Neural Network model achieved the highest accuracy, 81.64%, which was better than the baseline results by 2.54%.
Some researchers have evaluated how different feature extraction methods affect results. Ahmed et al. [3] compared two feature extraction techniques, term frequency (TF) and term frequency-inverted document frequency (TF-IDF), and six n-gram machine learning classification models, including SGD, SVM, LSVM, LR, KNN, and DT, on two datasets. They observed that an increase in the n-gram size causes a decrease in accuracy. Agudelo et al. [2] used a Naive Bayes model for the identification of false news in public datasets. Their results showed that it was more effective to use CountVectorizer than TfidfVectorizer to preprocess the data, since the CountVectorizer method correctly classified 89.3% of the news.
Some researchers have proposed their own deep learning frameworks and achieved high accuracy on their datasets. Zhang et al. [46] presented a detailed comparison of thirteen existing fact-checking resources and seven public datasets. They then illustrated the overall categorization of current research on online fake news detection. Finally, they proposed a three-layer ecosystem comprising an alert layer, a detection layer (fact-checking, fake news detection), and an intervention layer. Singhania et al. [40] constructed a three-level hierarchical attention network (3HAN) based on a proposed HAN. 3HAN has three levels, one each for words, sentences, and the headline, and provides an interpretable output to enable further manual fact checking. They also used headlines to perform supervised pre-training of the initial layers of 3HAN. Long et al. [26] applied an attention-based LSTM model to the LIAR dataset, incorporating speaker profiles such as speaker name, title, party affiliation, current job, location of speech, and credit history. Their experimental results show that speaker profile information can improve CNN and LSTM models significantly.
Researchers have also done much other novel work to detect fake news. Rasool et al. [34] proposed a novel method of multilevel multiclass fake news detection based on relabeling the dataset and learning iteratively. The method was tested with different supervised machine learning algorithms, such as SVM and decision tree, using hold-out test sets and cross-validation. Their method outperformed the benchmark with an accuracy of 39.5% on the LIAR dataset. Oshikawa et al. [31] compared and discussed nine benchmark datasets and the experimental results of different methods. They suggested that metadata and additional information can be utilized to improve robustness and performance. Jain et al. [21] proposed a mix of a Naïve Bayes classifier, SVM, and natural language processing techniques on a fake news dataset; their model accuracy reached 93.6%, better than the baseline results by 6.85%. Reis et al. [35] provided a deeper understanding of how features are used in the decisions taken by models. They performed an unbiased search for XGB models, each composed of a set of randomly chosen features, and then used SHAP to explain why news items are classified as fake or real by representative models of each model cluster.
Fake news detection is a global problem, and fake news in different countries is written in different languages. Many researchers have sought a solution for multilingual fake news detection by constructing new datasets or training models on datasets in different languages. Faustini et al. [14] trained Naïve Bayes, K-Nearest Neighbors, SVM, and Random Forest on five datasets in three languages. They compared the results obtained with a custom set of features, Document-class Distance, bag-of-words, and Word2Vec using accuracy and F1-score measures. They concluded that SVM and Random Forest outperformed the other algorithms and that bag-of-words achieved the best results in general. Wang et al. [43] presented an English fake news dataset. They also designed a hybrid CNN model to integrate metadata with text and showed that this hybrid approach can improve a text-only model.
In [33], the researchers compiled a new Spanish-language corpus of news from January to July 2018. The corpus is annotated with two labels (real and fake), and the true news and fake news form pairs of events. Statistics of the corpus, such as the vocabulary overlap of the different news topics and labels, are also reported in their paper. To detect fake news, they applied SVM, LR, RF, and boosting to bag-of-words (BOW), POS-tag, and n-gram feature sets of their dataset, and found that character 4-grams without stop-word removal, combined with the boosting algorithm, gave the best accuracy. Amjad et al. [5] proposed a new Urdu-language corpus for fake news detection, "Bend The Truth", which contains 900 news articles, 500 annotated as real and 400 labeled as fake. Their text representation feature sets include combinations of word n-grams, character n-grams, and functional word n-grams (n ranging from 1 to 6) with a variety of feature weighting schemes, including binary values, normalized frequency, log-entropy weighting, raw frequency, relative frequency, and TF-IDF. They then evaluated classifiers such as Multinomial Naive Bayes (MNB), Bernoulli Naive Bayes (BNB), SVM, LR, RF, DT, and AdaBoost using balanced accuracy, F1_Real score, F1_Fake score, and ROC-AUC with 10-fold cross-validation. Finally, they found that AdaBoost with the combination of character 2-grams and word 1-grams performed best on the F1_Fake score.
The methods proposed in the literature are summarized in Table 1, which lists the proposed models, contributions, and datasets of the related work for a better understanding of fake news detection.
III. MATERIALS AND METHODS
The fundamental concepts of the proposed models are discussed in the sections below.
A. DATA-SET
Our paper involves two datasets: the ISOT fake news dataset and the KDnugget dataset. The ISOT dataset was entirely collected from real-world sources [32]. The real news was collected by crawling articles from Reuters.com, and the fake news was collected from unreliable websites flagged by Politifact and Wikipedia. The articles were mostly released in 2016 and 2017. This dataset includes 44,898 articles in total: 21,417 real news (labeled as 1) and 23,481 fake news (labeled as 0). The features include title, text (news body), subject, date, and label. The news subjects cover categories such as 'politicsNews', 'worldnews', 'News', 'politics', 'Government News', 'left-news', 'US_News', and 'Middle-east'. We selected the title and text features of the ISOT dataset to train our models.
The KDnugget dataset was made public by KDnuggets (a data website) [14] and is publicly available. It contains 6,335 fake and real news items: 3,171 real (labeled as 1) and 3,164 fake (labeled as 0). The real news came from media organizations such as the New York Times, WSJ, Bloomberg, NPR, and the Guardian and was published in 2015 or 2016. The fake news was randomly selected from the Kaggle fake news dataset. It has three useful features: title, text (news body), and label. To obtain more information, we selected the title and text features of the KDnugget dataset to train our models.
These two datasets clearly differ in size: the KDnugget dataset is only about one-seventh the size of the ISOT dataset. Small samples of the ISOT dataset and the KDnugget dataset are shown in Table 2 and Table 3, respectively.
B. PRE-PROCESSING
Before the data were fed into the machine learning and deep learning models, the text data needed to be preprocessed using methods like stop-word removal, tokenization, sentence segmentation, and punctuation removal. These operations can significantly help select the most relevant terms and increase model performance.
Both of our datasets come from real-world news articles, so they contain many meaningless URLs which carry no information. We therefore first cleaned the data by removing these URLs. Stop-word removal is the next preprocessing step. Stop words are frequently used in English sentences to complete the sentence structure, but they are insignificant in expressing an individual's thoughts, so in all our experiments we removed them lest they create too much noise. After the text data was cleaned, we tokenized it using TF-IDF and embedding techniques.
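A minimal sketch of these cleaning steps (URL removal, punctuation removal, tokenization, stop-word removal) might look as follows; the stop-word list here is a tiny illustrative subset, not the full list used in the experiments.

```python
import re

# Illustrative subset only; real pipelines use a full stop-word list
# (e.g. NLTK's English stop words).
STOP_WORDS = {"the", "a", "an", "is", "are", "to", "of", "and", "in"}

def clean(text):
    text = re.sub(r"https?://\S+", " ", text)          # strip URLs
    text = re.sub(r"[^a-zA-Z\s]", " ", text)           # strip punctuation/digits
    tokens = text.lower().split()                      # simple tokenization
    return [t for t in tokens if t not in STOP_WORDS]  # stop-word removal
```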
Term frequency (TF)
TF is a common tokenization technique that calculates the similarity between documents using the counts of words in the documents. With the TF technique, each document is represented by a vector that contains its word counts. Each vector is then normalized so that the sum of its elements is one, which converts the word counts into probabilities.
Let D denote a corpus and let d denote a document. Suppose w is a word in d and n_w(d) is the number of times the word w appears. The size of d can then be represented as |d| = \sum_{w \in d} n_w(d). The normalized TF for word w in document d is defined as follows:

TF(w)_d = \frac{n_w(d)}{|d|} \quad (1)
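Equation (1) can be sketched in a few lines of Python (whitespace tokenization assumed for brevity):

```python
from collections import Counter

def term_frequency(document):
    """Normalized term frequency TF(w)_d = n_w(d) / |d|, as in Eq. (1)."""
    tokens = document.lower().split()
    counts = Counter(tokens)           # n_w(d) for every word w
    total = len(tokens)                # |d|
    return {w: n / total for w, n in counts.items()}
```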
Term frequency-inverted document frequency (TF-IDF)
In our machine learning experiments, we also used TF-IDF to transform the data into vectors. TF-IDF is a weighting metric commonly used in text classification problems. It assigns to every term in a document a score which shows the importance of that term. In this method, a term's significance increases with its frequency in the document but decreases with the number of documents that contain it. Let D denote a corpus, namely a set of news articles, and let d denote an article which consists of a set of words w. The inverse document frequency (IDF) can be computed mathematically using the following equation:

IDF(w)_D = 1 + \log\frac{|D|}{|\{d \in D : w \in d\}|} \quad (2)

TF-IDF (term frequency-inverted document frequency) for the word w with respect to document d and corpus D is calculated as follows:

TFIDF(w)_{d,D} = TF(w)_d \times IDF(w)_D \quad (3)
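Equations (2) and (3) can be sketched as follows; documents are represented as token lists, and the tiny corpus is illustrative only:

```python
import math

def idf(word, corpus):
    """IDF(w)_D = 1 + log(|D| / |{d in D : w in d}|), as in Eq. (2).
    Assumes `word` occurs in at least one document."""
    df = sum(1 for doc in corpus if word in doc)  # document frequency
    return 1 + math.log(len(corpus) / df)

def tf_idf(word, document, corpus):
    """TF-IDF(w)_{d,D} = TF(w)_d * IDF(w)_D, as in Eq. (3)."""
    tf = document.count(word) / len(document)
    return tf * idf(word, corpus)

# Tiny illustrative corpus of tokenized documents:
corpus = [["fake", "news"], ["real", "news"], ["fake", "story"]]
```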
Embedding
In our deep learning experiments, we used a pretrained word embedding technique to obtain text representations. In a word embedding space, the geometric distance between word vectors represents their semantic relationship. Word embedding projects human language into a vector space: in an ideal embedding space, words sharing the same semantic meaning are embedded into similar vectors, and the geometric distance between two word vectors is closely associated with the linguistic relationship between the words.
4VOLUME 4, 2016
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2021.3056079, IEEE Access
Author et al.: Preparation of Papers for IEEE TRANSACTIONS and JOURNALS
TABLE 1. Summary of the published fake news papers

Ref | Year | Contributions | Data set | Models
[14] | 2020 | Trained four models on five datasets in three languages | Btvlifestyle, FakeOrRealNews, FakeNewsData1, FakeBrCorpus, TwitterBR | Naïve Bayes (NB), KNN, SVM, RF
[32] | 2020 | Implemented twenty-three supervised artificial intelligence algorithms on three datasets | ISOT, BuzzFeed Political News Data set, Random Political News Data set | ZeroR, SGD, CV Parameter Selection, Randomizable Filtered Classifier, Logistic Model Tree, Locally Weighted Learning, Classification via Clustering, Weighted Instances Handler Wrapper, Ridor, MLP, Ordinal Learning Model, Simple Cart, Attribute Selected Classifier, J48, Sequential Minimal Optimization (SMO), Bagging, DT, Kernel Logistic Regression, IBk
[41] | 2020 | Proposed a novel model that combines CNN and LSTM with PCA and Chi-Square | - | CNN+LSTM with PCA
[23] | 2020 | Achieved state-of-the-art results with an accuracy of 98.36% | Kaggle fake news dataset | DT, RF, CNN, LSTM, KNN, Multinomial Naïve Bayes
[25] | 2020 | Proposed a CNN + BiLSTM ensemble model with an attention mechanism | 1356 news instances from Twitter and other media sources | CNN+BiLSTM
[5] | 2020 | Compiled a new Urdu language corpus | 900 news articles, 500 annotated as real and 400 as fake | Multinomial Naive Bayes (MNB), Bernoulli Naive Bayes (BNB), SVM, LR, RF, DT, AdaBoost
[36] | 2019 | Evaluated features extracted from news content, news source, and environment | 2282 BuzzFeed news articles related to the 2016 U.S. election | KNN, NB, RF, SVM, XGBoost
[34] | 2019 | Proposed a novel binary classification method for a multilabel fake news dataset | LIAR dataset | SVM
[22] | 2019 | Evaluated different machine learning models for fake news detection | Kaggle fake news dataset | RF, Multinomial Naive Bayes, GB, DT, LR, Linear-SVM
[21] | 2019 | Included machine learning and deep learning models in one system | - | NB, SVM
[35] | 2019 | Explained how features are used in the decisions taken by models | - | XGBoost
[29] | 2019 | Presented a propagation-based approach | - | Graph CNN
[33] | 2019 | Compiled a new Spanish language corpus | 491 true, 480 false news | SVM, LR, RF, Boosting
[3] | 2018 | Compared different feature extraction techniques on different machine learning models | 12,600 fake news articles from Kaggle and 12,600 truthful political articles | LR, SGD, DT, KNN, LSVM, SVM
[12] | 2018 | Combined news content and social context features for fake news detection | - | LR
[2] | 2018 | Compared two different feature extraction techniques: TF and TF-IDF | - | SVM
[39] | 2018 | Used TF-IDF features, paralinguistic features, sentiment-related features, and text similarity features | - | LR, SVM, RF, GB, Neural Networks
[4] | 2018 | Used a hybrid of CNN and RNN | 5,800 tweets centered on five rumor stories | LSTM+CNN
[37] | 2018 | Utilized CNN and RNN to obtain text representations and fed them into an MLP for classification | LIAR dataset | CNN+BiLSTM
[15] | 2017 | Evaluated different models on TF-IDF features and probabilistic context free grammar (PCFG) features | - | DT, Gradient Boosting, RF, SGD, SVM
[40] | 2017 | Constructed a three-level hierarchical attention network (3HAN) | - | 3HAN
[43] | 2017 | Presented a new public benchmark dataset (LIAR) | LIAR dataset | SVM, LR, BiLSTM, CNN
[26] | 2017 | Compared models' performance with and without speaker profile information | LIAR dataset | Attention-based LSTM model
[38] | 2017 | Utilized features like the text, the user response, and source characteristics | Twitter and Weibo | CSI
[27] | 2016 | Proposed an RNN-based model for early detection | Twitter and Weibo microblog datasets | SVM-TS, SVM-RBF, DTC, RF, tanh-RNN, LSTM, GRU
TABLE 2. Description of ISOT dataset
Title Text Subject Date Label
UK transport police leading in... LONDON (Reuters) - British counter-terrorism. worldnews September 15, 2017 REAL
Pacific nations crack down on... WELLINGTON (Reuters) - South Pacific island nation. worldnews September 15, 2017 REAL
Three suspected al Qaeda... ADEN, Yemen (Reuters) - Three suspected al Qaeda... worldnews September 15, 2017 REAL
Chinese academics prod Beijing... BEIJING (Reuters) - Chinese academics are publicly. worldnews September 15, 2017 REAL
Classic! Kid Rock Hits Back At... Not much to say after this classic response from. politics Sep 2, 2017 FAKE
’My Pillow’ CEO Mike Lindell... Who hasn t seen his commercials over and over and... politics Sep 1, 2017 FAKE
Bitter John McCain Calls Trump... What the heck! Senator John McCain just admitted.. politics Sep 1, 2017 FAKE
Muslim Activist Caught Sending... This woman has no shame1 Muslim activist Linda... politics Sep 1, 2017 FAKE
TABLE 3. Description of KDnugget dataset
Id Title Text Label
7614 Globalization Expressway to... If humans were largely moral and ethical beings... FAKE
10294 Watch The Exact Moment Paul... Google Pinterest Digg Linkedin Reddit... FAKE
7060 Now Malaysia Dumps US for... Now Malaysia Dumps US for Chinese Naval Vessels... REAL
10142 Bernie supporters on Twitter... Kaydee King (@KaydeeKing) November 9, 2016 The... FAKE
875 The Battle of New York: Why... It’s primary day in New York and front-runners... REAL
6903 Tehran, USA I’m not an immigrant, but my grandparents are... FAKE
7341 Girl Horrified At What She... Share This Baylee Luciani (left), Screenshot of... FAKE
95 ‘Britain’s Schindler’ Dies at... A Czech stockbroker who saved more than 650 Jewish... REAL
To reduce the number of trainable parameters and increase time efficiency, we used a file of pretrained GloVe word embeddings, 'glove.twitter.27B.100d.txt', and loaded it into our model.
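Loading GloVe-format vectors amounts to parsing one word plus its coordinates per line. The sketch below works on any iterable of such lines; in the experiments it would be applied to 'glove.twitter.27B.100d.txt', whose vectors are 100-dimensional (the sample here is shortened for illustration).

```python
def load_glove(lines):
    """Parse GloVe-format lines ("word v1 v2 ...") into a dict of vectors."""
    embeddings = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        embeddings[parts[0]] = [float(x) for x in parts[1:]]
    return embeddings

# Tiny illustrative input (real GloVe vectors have 100 components):
sample = ["news 0.1 0.2 0.3", "fake -0.4 0.5 0.6"]
vectors = load_glove(sample)
```

In a deep learning pipeline, this dictionary would then be used to fill the weight matrix of a (frozen or trainable) embedding layer indexed by the tokenizer's vocabulary.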
C. MACHINE LEARNING CLASSIFICATION MODELS
In this study, five machine learning models have been used to classify fake and real news. These models are discussed in detail below.
1) Logistic Regression (LR)
Logistic regression is a common machine learning classification algorithm. In a binary classification problem, the goal is to predict the value of a target variable y, where y \in \{0, 1\}. The negative class is denoted by 0 and the positive class by 1.
To separate the two classes 0 and 1, a hypothesis h_\theta(x) is designed, and the classifier's output is thresholded at h_\theta(x) = 0.5. If h_\theta(x) \geq 0.5, the model predicts y = 1, which means the news is real; if h_\theta(x) < 0.5, it predicts y = 0, which means the news is fake.
Hence, the prediction of logistic regression is made under the condition 0 \leq h_\theta(x) \leq 1. The logistic regression sigmoid function can be written as in Equation 4:

h_\theta(x) = g(\theta^T x) \quad (4)

where g(z) = 1/(1 + e^{-z}) and hence h_\theta(x) = 1/(1 + e^{-\theta^T x}).
Similarly, the logistic regression cost function can be written as in Equation 5:

J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \mathrm{cost}(h_\theta(x^{(i)}), y^{(i)}) \quad (5)
2) Decision Tree (DT)
DT is an important supervised learning algorithm, and researchers tend to use tree-based ensemble models like Random Forest or Gradient Boosting on all kinds of tasks. The basic idea of DT is to build a model that predicts the value of a dependent variable by learning decision rules inferred from the data. A Decision Tree has a top-down, tree-shaped structure in which every node is either a leaf node, bound to a class label, or a decision node, responsible for making a decision. A Decision Tree makes the process behind its decisions and predictions easy to understand. However, it is a weak learner, which means it may perform badly on small datasets.
The key learning process in DT is selecting the best attribute to split on. Different trees use different metrics for this purpose, such as the information gain used in the ID3 algorithm and the gain ratio used in the C4.5 algorithm. Suppose discrete attribute A has n different values and D_i is the set of all samples in training dataset D that take the i-th value of A. The information gain and gain ratio for attribute A can be calculated as follows:

Gain(A, D) = Entropy(D) − Σ_{i=1}^{n} (|D_i| / |D|) Entropy(D_i)    (6)

GainRatio(A, D) = Gain(A, D) / IV(A)    (7)

where the intrinsic value of attribute A can be calculated as:

IV(A) = − Σ_{i=1}^{n} (|D_i| / |D|) log₂(|D_i| / |D|)    (8)
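A minimal sketch of equations 6–8 in pure Python, using a deliberately simple made-up example in which one attribute perfectly separates the two classes:

```python
import math
from collections import Counter

def entropy(labels):
    # Entropy(D) = -sum_k p_k * log2(p_k) over the class proportions
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(D, A):
    # Equation 6: Entropy(D) minus the weighted entropy of the subsets D_i
    # induced by the values of attribute A (A is a list of attribute values
    # aligned element-wise with the label list D).
    subsets = {}
    for a, label in zip(A, D):
        subsets.setdefault(a, []).append(label)
    weighted = sum(len(s) / len(D) * entropy(s) for s in subsets.values())
    return entropy(D) - weighted

def gain_ratio(D, A):
    # Equations 7 and 8: Gain(A, D) / IV(A)
    n = len(D)
    iv = -sum((c / n) * math.log2(c / n) for c in Counter(A).values())
    return information_gain(D, A) / iv

# Toy data (hypothetical): attribute value "a" only for FAKE, "b" only for REAL
labels = ["FAKE", "FAKE", "REAL", "REAL"]
attr = ["a", "a", "b", "b"]
print(information_gain(labels, attr))  # 1.0: the split removes all uncertainty
```

A perfectly separating attribute yields the maximum gain of one bit, which is why ID3/C4.5 would pick it first.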
3) K-Nearest Neighbor (KNN)
K-NN is a well-known machine learning algorithm, and its procedure is very simple. Given a test sample, it first finds the k nearest neighbors of this sample according to a distance measure. It then predicts the class label of the test instance with a majority-vote strategy. The classification performance of K-NN is sometimes poor, mostly because of the curse of dimensionality. K-NN is also a lazy learning algorithm and can spend a lot of time on classification. The main procedure of the K-NN algorithm is given in algorithm 1.
Algorithm 1 KNN algorithm
1: for all unlabeled data u do
2: for all labeled data v do
3: compute the distance between u and v
4: end for
5: find the k smallest distances and locate the corresponding labeled instances v1, ..., vk
6: assign unlabeled data u the label appearing most frequently among v1, ..., vk
7: end for
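The steps of algorithm 1 can be sketched in pure Python as follows (the 2-D points and labels are hypothetical toy data, not from the paper's experiments):

```python
from collections import Counter

def knn_predict(train, labels, query, k=3):
    # Algorithm 1: find the k nearest labeled neighbors by (squared)
    # Euclidean distance, then assign the majority label among them.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train, labels)
    )
    top_k = [y for _, y in dists[:k]]
    return Counter(top_k).most_common(1)[0][0]

# Toy 2-D points: class 0 clustered near the origin, class 1 far away
train = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6)]
labels = [0, 0, 0, 1, 1, 1]
print(knn_predict(train, labels, (0.5, 0.5), k=3))  # 0
```

Because every prediction scans the full training set, the lazy-learning cost mentioned above grows linearly with the number of training instances.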
4) Random Forest (RF)
Random Forest is an ensemble consisting of a bagging of unpruned decision trees with a randomized selection of features at each split. Each individual tree in the random forest produces a prediction, and the prediction with the most votes becomes the final prediction. The No Free Lunch theorem states that no single algorithm is always the most accurate; by aggregating many diverse trees, RF is typically more accurate and robust than any individual classifier. The random forest prediction can be expressed as

F(x) = arg max_y Σ_{i=1}^{z} I(h_i(x) = y)    (9)

where F(x) is the random forest model, y is the target category variable, h_i is the i-th of the z decision trees in the forest, and I is the indicator (characteristic) function. To ensure the diversity of the decision trees, both the sample selection of the random forest and the candidate attributes for node splitting are randomized. Pseudocode of the random forest algorithm is described in algorithm 2.
Algorithm 2 Random Forest algorithm
Require: Training set, feature set f, number of trees m_sub
Ensure: Random forest with m_sub CART trees
1: Draw m_sub bootstrap sample sets with replacement
2: Choose a sample set as the root node and train in a completely split way
3: Select f_sub features randomly from f and choose the best feature to split the node by the minimum Gini impurity principle
4: Let the nodes grow to the maximum extent; label nodes with minimum impurity as leaf nodes
5: Repeat steps 3-4 until all nodes have been trained or labeled as leaf nodes
6: Repeat steps 2-5 until all m_sub CART trees have been trained
7: Output the random forest with m_sub CART trees
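The majority-vote aggregation of equation 9 can be sketched as below; the three stub "trees" are hypothetical stand-ins for trained CART trees, used only to show how the votes are combined:

```python
from collections import Counter

def forest_predict(trees, x):
    # Equation 9: each tree casts a vote; the class with the most votes wins.
    votes = [tree(x) for tree in trees]
    return Counter(votes).most_common(1)[0][0]

# Hypothetical stand-ins for trained CART trees: simple threshold rules.
trees = [
    lambda x: "REAL" if x[0] > 0.5 else "FAKE",
    lambda x: "REAL" if x[1] > 0.5 else "FAKE",
    lambda x: "FAKE",  # a deliberately bad tree, outvoted when the others agree
]
print(forest_predict(trees, (0.9, 0.8)))  # REAL
```

The example also illustrates why diversity matters: a single bad tree is outvoted as long as the other trees are better than random guessing.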
5) Support Vector Machine (SVM)
For binary and multi-class classification problems, SVM is one of the most popular models [17], [11], [7], [9], [18]. It is a supervised machine learning classifier, and many researchers have adopted it for binary and multi-class classification problems [8]. In a binary classification problem, the instances are separated by a hyperplane w^T x + b = 0, where w is a coefficient weight vector normal to the hyperplane, the bias term b is the offset from the origin, and the data points are represented by x. Determining the values of w and b is the main task in SVM. In the linear case, w can be solved using the Lagrangian function. The data points on the maximum margin are called support vectors. As an outcome, the solution for w can be expressed mathematically as in equation 10:

w = Σ_{i=1}^{n} α_i y_i x_i    (10)

In equation 10, n denotes the number of support vectors, and y_i is the target class label corresponding to sample x_i. The bias term b can be computed from y_i(w^T x_i + b) − 1 = 0. In the nonlinear case, the kernel trick is applied, and the decision function in terms of w and b is expressed in equation 11 as follows:

f(x) = sgn( Σ_{i=1}^{n} α_i y_i K(x_i, x) + b )    (11)

The kernel functions are positive semi-definite functions satisfying Mercer's condition [45]: the polynomial kernel can be written in equation 12 as:

K(x, x_i) = ((x^T x_i) + 1)^d    (12)

The Gaussian kernel is expressed in equation 13 as:

K(x, x_i) = exp(−γ ||x − x_i||²)    (13)

Here, C and γ are two parameters required to be defined for SVM.
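As a sketch of equations 11 and 13, the snippet below evaluates the kernel decision function for a set of hypothetical support vectors, multipliers and bias (these values are illustrative, not the result of an actual fit):

```python
import math

def gaussian_kernel(x, xi, gamma=0.5):
    # Equation 13: K(x, x_i) = exp(-gamma * ||x - x_i||^2)
    sq = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-gamma * sq)

def svm_decision(support_vectors, alphas, ys, b, x, gamma=0.5):
    # Equation 11: f(x) = sgn(sum_i alpha_i * y_i * K(x_i, x) + b)
    s = sum(a * y * gaussian_kernel(sv, x, gamma)
            for sv, a, y in zip(support_vectors, alphas, ys)) + b
    return 1 if s >= 0 else -1

# Hypothetical support vectors, multipliers and bias:
svs = [(0.0, 0.0), (2.0, 2.0)]
alphas = [1.0, 1.0]
ys = [-1, 1]
print(svm_decision(svs, alphas, ys, b=0.0, x=(1.9, 1.9)))  # 1
```

Points close to the positive support vector receive a positive score through the Gaussian kernel and are classified as +1.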
D. DEEP LEARNING MODELS DESCRIPTION
In this study, three deep learning models have been used for classifying fake news and real news. Each model is discussed in detail below.
1) Convolutional Neural Networks (CNNs)
The CNN model structure has been shown in Table 4.
TABLE 4. CNN, LSTM and GRU Models’ Structures
Model name Layer(type) Output shape Param#
embedding_3(Embedding) (None,300,100) 1000000
dropout_1(Dropout) (None,300,100) 0
conv1d_1(Conv1D) (None,297,128) 51328
CNN global_max_pooling1d_1 (None,128) 0
dropout_2(Dropout) (None,128) 0
dense_5(Dense) (None,128) 16512
dense_6(Dense) (None,1) 129
embedding_1(Embedding) (None,300,100) 1000000
lstm_1(LSTM) (None,300,64) 42240
lstm_2(LSTM) (None,128) 98816
LSTM dense_1(Dense) (None,32) 4128
dense_2(Dense) (None,1) 33
embedding_5(Embedding) (None,300,100) 1000000
gru_1(GRU) (None,128) 87936
GRU dropout_5(Dropout) (None,128) 0
dense_9(Dense) (None,1) 129
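The parameter counts in Table 4 can be reproduced with a little arithmetic, assuming a vocabulary of 10,000 tokens, 100-dimensional embeddings and sequence length 300 (these settings follow from the output shapes in the table, not from the paper's text; the GRU count matches the older Keras formulation without the extra reset-gate bias):

```python
# Layer parameter counts for Table 4 (assumed vocabulary 10,000, embedding
# dimension 100, sequence length 300, inferred from the output shapes).
vocab, emb = 10_000, 100

embedding = vocab * emb                 # embedding matrix: 1,000,000
conv1d = 4 * emb * 128 + 128            # kernel size 4 (300 -> 297), 128 filters
dense_cnn = 128 * 128 + 128             # 128-unit Dense after global max pooling

# An LSTM layer has 4 gate blocks, each with input, recurrent and bias weights:
# params = 4 * ((input_dim + units + 1) * units)
lstm_1 = 4 * ((emb + 64 + 1) * 64)
lstm_2 = 4 * ((64 + 128 + 1) * 128)

# A GRU layer (without the separate reset-gate bias) has 3 gate blocks:
gru = 3 * ((emb + 128 + 1) * 128)

print(embedding, conv1d, dense_cnn, lstm_1, lstm_2, gru)
# 1000000 51328 16512 42240 98816 87936 -- matching Table 4
```

Checking counts this way is a quick sanity test that a reimplemented model really has the architecture the table describes.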
2) Long Short-Term Memory (LSTM)
To deal with the vanishing gradient problem, in which gradients shrink as the number of layers grows until the neural network becomes untrainable, Hochreiter and Schmidhuber [20] developed the LSTM algorithm. In practice, long short-term memory (LSTM) has become one of the common recurrent layers used to train on time series and sequence data. The LSTM cell structure and LSTM model structure are shown in Figure 1 and Table 4.
FIGURE 1. LSTM cells
As we can see from Figure 1, the long-term state c(t−1) is first processed by the forget gate, which drops some memories, and is then augmented with new memories selected by the input gate, producing the result c(t). Besides that, c(t) is copied, passed through the tanh function, and filtered by the output gate to compute the short-term state h(t), which is also the cell's output at time step t, y(t). In a basic RNN cell there is only one fully connected layer, which outputs g(t); in an LSTM cell there are three additional gate-controller layers. Due to their logistic activation function, their outputs range from 0 to 1. Through element-wise multiplication, an output of 0 closes the corresponding gate and an output of 1 opens it. The forget gate determines what to delete from the long-term state. The input gate determines what should be added to the long-term state. The output gate determines which parts of the long-term state should be read and output at the current time step. Let x be the vector representation of the input sequence and W be the weights associated with each matrix element. The computation involved in an LSTM cell can be summarized as follows:
i(t) = σ(W_xi^T x(t) + W_hi^T h(t−1) + b_i)    (14)

f(t) = σ(W_xf^T x(t) + W_hf^T h(t−1) + b_f)    (15)

o(t) = σ(W_xo^T x(t) + W_ho^T h(t−1) + b_o)    (16)

g(t) = tanh(W_xg^T x(t) + W_hg^T h(t−1) + b_g)    (17)

c(t) = f(t) ⊗ c(t−1) + i(t) ⊗ g(t)    (18)

y(t) = h(t) = o(t) ⊗ tanh(c(t))    (19)
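Equations 14–19 can be traced with a single scalar LSTM step, where every weight is one number; the weight values below are illustrative, not learned:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, W):
    # One LSTM cell update following equations 14-19, reduced to scalars.
    i = sigmoid(W["xi"] * x + W["hi"] * h_prev + W["bi"])    # input gate  (14)
    f = sigmoid(W["xf"] * x + W["hf"] * h_prev + W["bf"])    # forget gate (15)
    o = sigmoid(W["xo"] * x + W["ho"] * h_prev + W["bo"])    # output gate (16)
    g = math.tanh(W["xg"] * x + W["hg"] * h_prev + W["bg"])  # candidate   (17)
    c = f * c_prev + i * g                                   # long-term state (18)
    h = o * math.tanh(c)                                     # short-term state / output (19)
    return h, c

# All weights 1 and biases 0, purely for a quick check (hypothetical values):
W = {k: 1.0 for k in ("xi", "hi", "xf", "hf", "xo", "ho", "xg", "hg")}
W.update({"bi": 0.0, "bf": 0.0, "bo": 0.0, "bg": 0.0})
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=0.0, W=W)
print(0.0 < h < 1.0)  # True: the gates keep the output bounded
```

With c_prev = 0, the new long-term state is just i ⊗ g, showing how the input gate admits the candidate memory.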
3) Gated Recurrent Unit (GRU)
The GRU cell [10] is a simplified version of the LSTM cell, but its performance is sometimes even better than LSTM's. This conclusion is also borne out by our experiments, which show that their performance is almost the same while the GRU network's training time is much shorter. In the GRU cell, the long-term and short-term states are merged into a single vector h(t), and there are only two gate controllers, z(t) and r(t). When z(t) outputs 1, the forget gate is open and the input gate is closed; when z(t) outputs 0, the forget gate is closed and the input gate is open. The full state is output at every time step, unlike in the LSTM cell, where an output gate filters it. The gate r(t) controls which content of the previous state is passed to the main layer g(t). The GRU cell and model structure are shown in Figure 2 and Table 4. All the computations involved in the GRU cell can be summarized as follows:
z(t) = σ(W_xz^T x(t) + W_hz^T h(t−1) + b_z)    (20)

r(t) = σ(W_xr^T x(t) + W_hr^T h(t−1) + b_r)    (21)

g(t) = tanh(W_xg^T x(t) + W_hg^T (r(t) ⊗ h(t−1)) + b_g)    (22)

h(t) = z(t) ⊗ h(t−1) + (1 − z(t)) ⊗ g(t)    (23)
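A scalar sketch of equations 20–23, again with illustrative (not learned) weights, makes the role of z(t) concrete: when it saturates at 1, the previous state passes through unchanged, so the single gate plays both the forget and input roles described above:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gru_step(x, h_prev, W):
    # One GRU cell update following equations 20-23, reduced to scalars.
    z = sigmoid(W["xz"] * x + W["hz"] * h_prev + W["bz"])          # (20)
    r = sigmoid(W["xr"] * x + W["hr"] * h_prev + W["br"])          # (21)
    g = math.tanh(W["xg"] * x + W["hg"] * (r * h_prev) + W["bg"])  # (22)
    return z * h_prev + (1.0 - z) * g                              # (23)

W = {k: 1.0 for k in ("xz", "hz", "xr", "hr", "xg", "hg")}
W.update({"bz": 0.0, "br": 0.0, "bg": 0.0})

# A large positive bias saturates z at 1, so the previous state is kept:
W_keep = dict(W, bz=50.0)
print(gru_step(x=0.0, h_prev=0.7, W=W_keep))  # ~0.7
```

Compared with the LSTM step, there is no separate cell state and no output gate, which is where the GRU's speed advantage comes from.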
E. HOLD OUT CROSS VALIDATION METHOD
For model selection, we adopted the hold-out cross-validation method, in which the dataset is divided into two parts, one for training and one for testing. In our experiments, 80% of the instances are used to train the classifiers and the remaining 20% are used to test them.
FIGURE 2. GRU cells
F. MODEL EVALUATION CRITERIA:
We employed accuracy, precision, recall, and F1-score [19] for model evaluation; these metrics are expressed in equations 24, 25, 26, and 27 as follows:

Accuracy (Acc) = (TP + TN) / (TP + TN + FP + FN) × 100%    (24)

Recall (Re) = TP / (TP + FN) × 100%    (25)

Precision (Pre) = TP / (TP + FP) × 100%    (26)

F1-score = 2 × (Precision × Recall) / (Precision + Recall)    (27)
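Equations 24–27 can be computed directly from the confusion-matrix counts; the counts below are hypothetical, chosen only to exercise the formulas:

```python
def metrics(tp, tn, fp, fn):
    # Equations 24-27 from the confusion-matrix counts, in percent.
    accuracy = (tp + tn) / (tp + tn + fp + fn) * 100
    recall = tp / (tp + fn) * 100
    precision = tp / (tp + fp) * 100
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical confusion counts for a fake-news classifier:
acc, pre, rec, f1 = metrics(tp=90, tn=85, fp=5, fn=10)
print(round(acc, 2), round(pre, 2), round(rec, 2), round(f1, 2))
```

Note that the F1-score is the harmonic mean of precision and recall, so it always lies between the two.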
G. MCNEMARS STATISTICAL TEST
McNemar's test [28] is a nonparametric statistical test for paired nominal data. We can use it to compare the predictive accuracy of two machine learning or deep learning models. McNemar's test is based on a 2 × 2 contingency table of the two models' predictions on the test dataset; the contingency table is given in Table 5. The total number of samples in the test set is n = n00 + n01 + n10 + n11. The McNemar test statistic ("chi-squared") can be computed in equation 28 as follows:

χ² = (n01 − n10)² / (n01 + n10)    (28)
TABLE 5. Contingency Table
Model2 correct Model2 wrong
Model1 correct n11 n10
Model1 wrong n01 n00
Approximately one year after McNemar [28] published his test, Edwards [13] proposed a continuity-corrected version, which is the more commonly used variant today. McNemar's formula, corrected for continuity, may be written in equation 29 as follows:

χ² = (|n01 − n10| − 1)²  / (n01 + n10)    (29)
We use the continuity-corrected version in this paper. In McNemar's test, if the sum of cells n10 and n01 is sufficiently large, the χ² value follows a chi-squared distribution with one degree of freedom. After setting a significance threshold α, in our case α = 0.05, we compute the p-value: the probability of observing this empirical (or a larger) chi-squared value under the null hypothesis. If the p-value is lower than the chosen significance level, we reject the null hypothesis and conclude that the two models' performances are different. If the p-value is greater than the chosen significance level, we accept the null hypothesis that the models' performances are equal. Mathematically: if p < α, the hypothesis H0 is rejected and the model performances are not equal; if p > α, H0 is accepted and the models have the same performance when trained on the specific training dataset.
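As a sketch of equation 29, the corrected statistic and its p-value can be computed in the standard library alone: for one degree of freedom, the chi-squared tail probability reduces to the complementary error function. The disagreement counts below are hypothetical:

```python
import math

def mcnemar_corrected(n01, n10):
    # Continuity-corrected McNemar statistic (equation 29). For 1 degree of
    # freedom, P(chi2_1 > x) = erfc(sqrt(x / 2)), so no chi-squared table
    # or external library is needed.
    chi2 = (abs(n01 - n10) - 1) ** 2 / (n01 + n10)
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

# Hypothetical disagreement counts between two classifiers:
chi2, p = mcnemar_corrected(n01=15, n10=5)
print(round(chi2, 2), p < 0.05)  # 4.05 True -> reject H0
```

With 15 versus 5 disagreements, the corrected statistic is 4.05 and p falls just below 0.05, so H0 would be rejected at the paper's significance level.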
H. PROPOSED STACKING MECHANISM
Stacking is an ensemble method that combines multiple models of different types through a meta-classifier to achieve better results. It can be seen as a more sophisticated version of cross-validation [44]. When we utilize the stacking mechanism, we should ensure that each base learner performs better than random guessing and that the base learners are diverse; otherwise, the stacking method may not work.
In our work, we used the complete training set to train the eight base learners. To increase diversity, we used machine learning models (SVM, LR, DT, KNN and RF) and deep learning models (CNN, LSTM and GRU). We also used three different tokenization methods: embedding, TF-IDF and TF. Then the meta-classifier, an RF, is fitted on the predictions of the individual base models. Our proposed stacking method is shown in Figure 3.
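The mechanism can be sketched as below. The base learners here are hypothetical stubs (stand-ins for the trained SVM, LR, CNN, etc.), the input features are made up, and a simple majority vote stands in for the RF meta-classifier, purely to show how base predictions become the meta-level features:

```python
from collections import Counter

def stack_features(base_learners, X):
    # Level 0: collect every base learner's prediction for each sample;
    # these prediction vectors are the meta-classifier's training features.
    return [[learner(x) for learner in base_learners] for x in X]

def majority_meta(row):
    # Stand-in for the RF meta-classifier: a majority vote over the
    # base predictions, purely for illustration.
    return Counter(row).most_common(1)[0][0]

# Three hypothetical base learners over made-up surface features:
base = [
    lambda x: "FAKE" if x["caps_ratio"] > 0.3 else "REAL",
    lambda x: "FAKE" if x["exclamations"] > 2 else "REAL",
    lambda x: "REAL",  # a weak learner, outvoted when the others agree
]
X = [{"caps_ratio": 0.5, "exclamations": 4},
     {"caps_ratio": 0.1, "exclamations": 0}]

meta_X = stack_features(base, X)
print([majority_meta(row) for row in meta_X])  # ['FAKE', 'REAL']
```

In the actual pipeline, the meta-level features are the eight models' predictions over TF, TF-IDF and embedding representations, and the RF meta-classifier is trained on them instead of voting.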
IV. EXPERIMENTAL RESULTS AND DISCUSSION
In this research work, we first removed the stopwords and URLs from our datasets. Then we used tokenization methods (TF, TF-IDF and embedding) to obtain the text representations. After that, we trained the individual models, including five machine learning models (LR, DT, KNN, RF and SVM) and three deep learning models (LSTM, GRU and CNN), on these text representation features. To choose the best individual model, we used a corrected version of McNemar's test to determine whether the model with the highest accuracy differs significantly from the other models on both datasets. Finally, to improve on the individual models' performance, we proposed our stacking method of training another RF model on the prediction results of all individual models.
The experimentation on both ISOT and KDnugget datasets
was performed by using Google Colab, a free cloud service
FIGURE 3. Proposed stacking method for fake news detection
supported by Google. The programming language is Python 3.7, and all experiments were performed using Python libraries such as TensorFlow (for the deep learning experiments) and scikit-learn (for the machine learning experiments). In this work, for both the ISOT and KDnugget datasets, 80% of the instances are used to train the classifiers and the remaining 20% are used for testing.
A. MODEL CLASSIFICATION PERFORMANCE ON ISOT
DATASET
The performance of the machine learning and deep learning models was first checked on the bigger ISOT dataset in order to determine whether any model performs better than the others. All models except K-NN achieved 100 percent precision, recall and F1-score on this dataset.
According to Table 7, the classification performance of RF
with TF-IDF (RF1) is highest among all machine learning
and deep learning models with an accuracy of 99.87%.
The three deep learning models have only slightly lower accuracy than RF, and among them LSTM has the highest accuracy. However, low computation time is also an important criterion for model selection, and the running time of the GRU model is only about one hundredth of the LSTM's, so GRU may be the better option.
For the machine learning models, we used the Grid Search method to train each model with different hyperparameters and find the best ones. According to Table 6, among the six machine learning experiments, SVM with hyperparameter C = 1 and LR with C = 1 achieved the same accuracy, only slightly lower than that of RF with TF-IDF. The KNN model performs best with k = 9 compared to other values of k, so we only report its k = 9 results. Even then, KNN's performance remains much lower than the other models' in all metrics, due to the curse of dimensionality, and its running time is also much longer, so it is best avoided on large datasets. The classification performance of all models is reported graphically in Figure 4 for better understanding.
B. MODEL CLASSIFICATION PERFORMANCE ON
KDNUGGET DATASET
Both the machine learning and deep learning models were also evaluated on the KDnugget dataset to choose the best model for fake news detection. As mentioned before, the KDnugget dataset has far fewer instances than the ISOT dataset, and on this small dataset all machine learning and deep learning models performed worse in all metrics than their near-perfect performance on the big ISOT dataset. According to Table 8, LR with hyperparameter C = 10 has the highest performance in all metrics among all machine learning and deep learning models.
Deep learning models usually have many parameters to train and need more data, which explains their poor performance on this small dataset. According to Table 9, the LSTM model now has only the seventh-best accuracy. The GRU model has the best performance in all metrics among the three deep learning models, but was still no better than either random forest model.
The machine learning models also performed very differently on this small dataset. Support Vector Machine's accuracy is almost as high as Logistic Regression's. Besides that, the performance of KNN is now higher than Decision Tree's on the KDnugget dataset in all metrics. The K-NN algorithm is known to suffer from the curse of dimensionality, and as we expected, it performs much better on the small KDnugget dataset than on the big ISOT dataset. On the other hand, the performance of Decision Tree decreases significantly on this small dataset. This is because Decision Tree is a weak learner and needs big data to make decisions; therefore, on small datasets, Decision Tree is best avoided. The classification performance of all models is reported graphically in Figure 5 for better
TABLE 6. Classifier performance on ISOT dataset. * = p-value < 0.05, NS = Not Significant
Classifier Parameters Tokenization method Acc (%) Pre (%) Recall (%) F1-score (%) Training time (s) χ² p-value
SVM C=1 TF-IDF 99.63 100 100 100 20 16.0 *
LR C=1 TF-IDF 99.63 100 99 100 2 16.0 *
DT max_depth=5 TF-IDF 99.60 100 100 100 86 13.22 *
K-NN K=9 TF-IDF 68.65 94 37 53 337 2789.06 *
RF1 n_estimators=400,max_depth=40 TF-IDF 99.87 100 100 100 225 - -
RF2 n_estimators=300,max_depth=40 TF 99.84 100 100 100 160 0.57 NS
LSTM - embedding 99.74 100 100 100 1500 3.22 NS
CNN - embedding 99.52 100 100 100 45 20 *
GRU - embedding 99.69 100 100 100 14 6.61 *
FIGURE 4. Performance of models on ISOT dataset
TABLE 7. Performance Comparison on ISOT dataset
Metrics Performance comparison
Accuracy RF1>RF2>LSTM>GRU>SVM=LR>CNN>DT>KNN
Precision RF1=RF2=LSTM=GRU=SVM=LR=CNN=DT=100>KNN
Recall RF1=RF2=LSTM=GRU=SVM=CNN=DT=100>LR>KNN
F1-score RF1=RF2=LSTM=GRU=SVM=LR=CNN=DT=100>KNN
understanding.
C. MCNEMARS STATISTICAL TEST FOR MODELS
PERFORMANCE COMPARISON
Accuracy is a common metric that researchers use to select models. As we can see from Figure 6, many models had very similar accuracy. Thus, we employed a corrected version of McNemar's test to determine whether the models' accuracies differ significantly.
As we can see from Table 6 and Table 8, RF1 has the highest
accuracy in ISOT dataset and LR has the highest accuracy in
KDnugget dataset. Therefore, we computed χ2and p-value
between RF1 and other models and put the results into Table
6. Since LR has the highest accuracy on KDnugget dataset,
we computed χ2and p-value between LR and other models
and put the results into Table 8.
In our experiments, the significance level α is 0.05 and the confidence level is 0.95. Based on the p-value and α, we accept or reject the null hypothesis: if p-value < α, H0 is rejected and the models' performances are significantly different; if p-value ≥ α, H0 is accepted and the models have the same performance.
In Table 6, SVM, LR, DT, KNN, CNN and GRU have p-values less than 0.05, so their performance is significantly
TABLE 8. Classifier performance on KDnugget dataset. * = p-value < 0.05, NS = Not Significant
Classifier Parameters Tokenization method Acc (%) Pre (%) Recall (%) F1-score (%) Training time (s) χ² p-value
SVM C=1 TF-IDF 92.42 93 93 93 11 0.94 NS
LR C=10 TF-IDF 92.82 93 93 93 1 - -
DT max_depth=5 TF-IDF 79.87 83 76 80 25 108.89 *
K-NN K=9 TF-IDF 82.56 77 94 85 12 80.78 *
RF1 n_estimators=150 TF-IDF 91.63 91 93 92 12 0.70 NS
RF2 n_estimators=200 TF 91.48 91 93 92 16 13.24 *
LSTM - embedding 88.95 89 89 89 450 17.32 *
CNN - embedding 89.50 90 89 89 0.0033 11.67 *
GRU - embedding 91.32 91 91 91 2 2.59 NS
FIGURE 5. Performance of models on KDnugget dataset
TABLE 9. Performance Comparison on KDnugget dataset
Metrics Performance comparison
Accuracy LR>SVM>RF1>RF2>GRU>CNN>LSTM>KNN>DT
Precision LR=SVM>RF1=RF2=GRU>CNN>LSTM>DT>KNN
Recall KNN>LR=SVM=RF1=RF2>GRU>CNN=LSTM>DT
F1-score LR=SVM>RF1=RF2>GRU>LSTM=CNN>KNN>DT
different from RF1's at the 0.05 significance level, which means their performance is worse. On the other hand, RF2 and LSTM have p-values greater than 0.05, so their performance is as good as RF1's. According to the p-values in Table 8, only SVM, RF1 and GRU have p-values greater than 0.05; therefore we can conclude that their performance is just as good as LR's, while the other models perform worse on this dataset.
In a nutshell, RF1, RF2 and LSTM have the best performance on the bigger ISOT dataset at the 0.05 significance level, while LR, SVM, RF1 and GRU have the best performance on the smaller KDnugget dataset. Thus, the best individual model across these two datasets is Random Forest with the TF-IDF tokenization method.
D. RESULTS OF THE PROPOSED STACKING
MECHANISM
To further improve the results, we used the predictions of all individual models as the training data for the meta-classifier. Since RF is the best individual model among all machine learning and deep learning models, we used it as the meta-classifier and trained it on the prediction data.
According to Table 10, our proposed model achieves much better performance than all individual models on both datasets in all evaluation metrics, so we highly recommend it for fake news detection.
FIGURE 6. Individual model performance comparison on both datasets
TABLE 10. The performance of our proposed stacking model
Datasets Acc (%) Pre (%) Recall (%) F1-score (%)
ISOT 99.94 100 100 100
KDnugget 96.05 97 96 96
E. PERFORMANCE COMPARISON WITH BASELINE
METHODS
The proposed method for fake news detection has been compared with state-of-the-art methods in Table 11. In terms of accuracy, our method achieved 99.94% on the ISOT dataset and 96.05% on the KDnugget dataset, substantially better than the existing methods. Therefore, the proposed method is highly recommended for the detection of fake news, and it could easily be deployed in real environments.
TABLE 11. Performance comparison of the proposed method with the baseline methods
Reference dataset Accuracy (%)
[3] ISOT 92
[32] ISOT 96.8
[16] ISOT 99.8
[24] ISOT 99.86
proposed model ISOT 99.94
[6] KDnugget 92.7
[14] KDnugget 94
proposed model KDnugget 96.05
V. CONCLUSION AND FUTURE WORK
In this paper, we evaluated five machine learning models and three deep learning models on two fake news datasets of different sizes in terms of accuracy, precision, recall and F1-score. According to our experiments, some models, like K-Nearest Neighbors, performed better on the small dataset than on the large one, while others, like Decision Tree, Support Vector Machine, Logistic Regression, CNN, GRU and LSTM, performed considerably worse on the small dataset. To select the best model, we used a corrected version of McNemar's test to determine whether the models' performance differs significantly. According to our final experiments, among all individual models, Random Forest with TF-IDF has the highest accuracy on the ISOT dataset and Logistic Regression with TF-IDF has the highest accuracy on the KDnugget dataset.
The experimental results demonstrated that our proposed stacking method achieved 99.94% accuracy on the ISOT dataset and 96.05% accuracy on the KDnugget dataset, which is very high compared to the individual models. We also compared our results to other existing work and concluded that our stacking model is much better. Due to the high performance of our proposed stacking method, we recommend it for the detection of fake news.
The major innovations of this research work are as follows. Firstly, five machine learning and three deep learning models have been trained in order to compare the performance difference between machine learning and deep learning models. Secondly, we used two datasets of different sizes to evaluate the proposed method and test the models' robustness. Thirdly, we employed a corrected version of McNemar's statistical test to decide whether there really are significant differences between two models' performance and to determine the best individual model for fake news detection. Lastly, we proposed a stacking model to improve on the individual models' performance.
In the future, we will perform more experiments on other datasets in different languages and try more machine learning and deep learning models for fake news detection. We will also collect more fake and real news data in different languages to detect fake news in different countries.
CONFLICTS OF INTEREST
The authors declare that they have no conflicts of interest.
REFERENCES
[1] Giuseppe Aceto et al. “MIMETIC: Mobile encrypted
traffic classification using multimodal deep learning”.
In: Computer Networks 165 (2019), p. 106944.
[2] Gerardo Ernesto Rolong Agudelo, Octavio José Sal-
cedo Parra, and Julio Barón Velandia. “Raising a
model for fake news detection using machine learning
in Python”. In: Conference on e-Business, e-Services
and e-Society. Springer. 2018, pp. 596–604.
[3] Hadeer Ahmed, Issa Traore, and Sherif Saad. “Detect-
ing opinion spams and fake news using text classifica-
tion”. In: Security and Privacy 1.1 (2018), e9.
[4] Oluwaseun Ajao, Deepayan Bhowmik, and Shahrzad
Zargari. “Fake news identification on twitter with hy-
brid cnn and rnn models”. In: Proceedings of the 9th
international conference on social media and society.
2018, pp. 226–230.
[5] Maaz Amjad et al. ““Bend the truth”: Benchmark
dataset for fake news detection in urdu language and
its evaluation”. In: Journal of Intelligent & Fuzzy
Systems Preprint (2020), pp. 1–13.
[6] Sreyasee Das Bhattacharjee, Ashit Talukder, and Bala
Venkatram Balantrapu. “Active learning based news
veracity detection with feature weighting and deep-
shallow fusion”. In: 2017 IEEE International Confer-
ence on Big Data (Big Data). IEEE. 2017, pp. 556–
565.
[7] Chih-Chung Chang and Chih-Jen Lin. “LIBSVM: A
library for support vector machines”. In: ACM trans-
actions on intelligent systems and technology (TIST)
2.3 (2011), p. 27.
[8] Chih-Chung Chang and Chih-Jen Lin. “LIBSVM: A
library for support vector machines”. In: ACM trans-
actions on intelligent systems and technology (TIST)
2.3 (2011), p. 27.
[9] Hui-Ling Chen et al. “A support vector machine clas-
sifier with rough set-based feature selection for breast
cancer diagnosis”. In: Expert Systems with Applica-
tions 38.7 (2011), pp. 9014–9022.
[10] Kyunghyun Cho et al. “Learning phrase representa-
tions using RNN encoder-decoder for statistical ma-
chine translation”. In: arXiv preprint arXiv:1406.1078
(2014).
[11] Nello Cristianini, John Shawe-Taylor, et al. An intro-
duction to support vector machines and other kernel-
based learning methods. Cambridge university press,
2000.
[12] Marco L Della Vedova et al. “Automatic online fake
news detection combining content and social signals”.
In: 2018 22nd Conference of Open Innovations Asso-
ciation (FRUCT). IEEE. 2018, pp. 272–279.
[13] Allen L Edwards. “Note on the “correction for con-
tinuity” in testing the significance of the difference
between correlated proportions”. In: Psychometrika
13.3 (1948), pp. 185–187.
[14] Pedro Henrique Arruda Faustini and Thiago Ferreira
Covões. “Fake news detection in multiple platforms
and languages”. In: Expert Systems with Applications
(2020), p. 113503.
[15] Shlok Gilda. “Evaluating machine learning algorithms
for fake news detection”. In: 2017 IEEE 15th Student
Conference on Research and Development (SCOReD).
IEEE. 2017, pp. 110–115.
[16] Mohammad Hadi Goldani, Saeedeh Momtazi, and
Reza Safabakhsh. “Detecting Fake News with Capsule
Neural Networks”. In: arXiv preprint (2020).
[17] Amin Ul Haq et al. “A novel integrated diagnosis
method for breast cancer detection”. In: Journal of
Intelligent & Fuzzy Systems (2019).
[18] Amin Ul Haq et al. “Comparative analysis of the clas-
sification performance of machine learning classifiers
and deep neural network classifier for prediction of
Parkinson disease”. In: 2018 15th International Com-
puter Conference on Wavelet Active Media Technology
and Information Processing (ICCWAMTIP). IEEE.
2018, pp. 101–106.
[19] Amin Ul Haq et al. “Feature Selection Based on L1-
Norm Support Vector Machine and Effective Recog-
nition System for Parkinson’s Disease Using Voice
Recordings”. In: IEEE Access 7 (2019), pp. 37718–
37734.
[20] Sepp Hochreiter and Jürgen Schmidhuber. “Long
short-term memory”. In: Neural computation 9.8
(1997), pp. 1735–1780.
[21] Anjali Jain et al. “A smart System for Fake News
Detection Using Machine Learning”. In: 2019 Inter-
national Conference on Issues and Challenges in In-
telligent Computing Techniques (ICICT). Vol. 1. IEEE.
2019, pp. 1–4.
[22] Rohit Kumar Kaliyar, Anurag Goswami, and Pratik
Narang. “Multiclass Fake News Detection using En-
semble Machine Learning”. In: 2019 IEEE 9th Inter-
national Conference on Advanced Computing (IACC).
IEEE. 2019, pp. 103–107.
[23] Rohit Kumar Kaliyar et al. “FNDNet–A deep convo-
lutional neural network for fake news detection”. In:
Cognitive Systems Research 61 (2020), pp. 32–44.
[24] Sebastian Kula et al. “Sentiment analysis for fake
news detection by means of neural networks”. In:
International Conference on Computational Science.
Springer. 2020, pp. 653–666.
[25] Sachin Kumar et al. “Fake news detection using deep
learning models: A novel approach”. In: Transactions
on Emerging Telecommunications Technologies 31.2
(2020), e3767.
[26] Yunfei Long. “Fake news detection through multi-
perspective speaker profiles”. In: Association for
Computational Linguistics. 2017.
[27] Jing Ma et al. “Detecting rumors from microblogs with
recurrent neural networks”. In: Proceedings of the 25th
International Joint Conference on Artificial Intelligence
(IJCAI). 2016.
[28] Quinn McNemar. “Note on the sampling error of the
difference between correlated proportions or percent-
ages”. In: Psychometrika 12.2 (1947), pp. 153–157.
[29] Federico Monti et al. “Fake news detection on so-
cial media using geometric deep learning”. In: arXiv
preprint arXiv:1902.06673 (2019).
[30] Antonio Montieri et al. “A dive into the dark web:
Hierarchical traffic classification of anonymity tools”.
In: IEEE Transactions on Network Science and Engi-
neering (2019).
[31] Ray Oshikawa, Jing Qian, and William Yang Wang.
“A survey on natural language processing for fake
news detection”. In: arXiv preprint arXiv:1811.00770
(2018).
[32] Feyza Altunbey Ozbay and Bilal Alatas. “Fake news
detection within online social media using supervised
artificial intelligence algorithms”. In: Physica A: Sta-
tistical Mechanics and its Applications 540 (2020),
p. 123174.
[33] Juan-Pablo Posadas-Durán et al. “Detection of fake
news in a new corpus for the Spanish language”. In:
Journal of Intelligent & Fuzzy Systems 36.5 (2019),
pp. 4869–4876.
[34] Tayyaba Rasool et al. “Multi-label fake news detection
using multi-layered supervised learning”. In: Proceed-
ings of the 2019 11th International Conference on
Computer and Automation Engineering. 2019, pp. 73–
77.
[35] Julio CS Reis et al. “Explainable machine learning for
fake news detection”. In: Proceedings of the 10th ACM
Conference on Web Science. 2019, pp. 17–26.
[36] Julio CS Reis et al. “Supervised learning for fake news
detection”. In: IEEE Intelligent Systems 34.2 (2019),
pp. 76–81.
[37] Arjun Roy et al. “A deep ensemble framework for fake
news detection and classification”. In: arXiv preprint
arXiv:1811.04670 (2018).
[38] Natali Ruchansky, Sungyong Seo, and Yan Liu. “Csi:
A hybrid deep model for fake news detection”. In: Pro-
ceedings of the 2017 ACM on Conference on Informa-
tion and Knowledge Management. 2017, pp. 797–806.
[39] Shaban Shabani and Maria Sokhn. “Hybrid machine-
crowd approach for fake news detection”. In: 2018
IEEE 4th International Conference on Collaboration
and Internet Computing (CIC). IEEE. 2018, pp. 299–
306.
[40] Sneha Singhania, Nigel Fernandez, and Shrisha Rao.
“3han: A deep neural network for fake news detec-
tion”. In: International Conference on Neural Infor-
mation Processing. Springer. 2017, pp. 572–581.
[41] Muhammad Umer et al. “Fake News Stance Detection
Using Deep Learning Architecture (CNN-LSTM)”.
In: IEEE Access (2020).
[42] Wagacha. “Induction of Decision Trees”. In: Founda-
tions of Learning and Adaptive Systems 12 (2003).
[43] William Yang Wang. ““Liar, liar pants on fire”: A new
benchmark dataset for fake news detection”. In: arXiv
preprint arXiv:1705.00648 (2017).
[44] David H Wolpert. “Stacked generalization”. In: Neural
networks 5.2 (1992), pp. 241–259.
[45] Xindong Wu et al. “Top 10 algorithms in data mining”.
In: Knowledge and information systems 14.1 (2008),
pp. 1–37.
[46] Xichen Zhang and Ali A Ghorbani. “An overview of
online fake news: Characterization, detection, and dis-
cussion”. In: Information Processing & Management
57.2 (2020), p. 102025.
TAO JIANG is currently pursuing his M.S. degree with the School of
Computer Science and Engineering, UESTC, China. His research interests
include machine learning, deep learning, big data analysis, IoT,
e-health, and related technologies and algorithms.
JIAN PING LI is Chairman of Computer Sci-
ence and Engineering College and Model Soft-
ware College, University of Electronic Science
and Technology of China. He is the Director of
International Centre for Wavelet Analysis and Its
Applications. He is the Chief Editor of Interna-
tional Progress on Wavelet Active Media Tech-
nology and Information Processing, and an Associate Editor of the
International Journal of Wavelet Multimedia and Information Processing.
He serves on the National Science and Technology Award Evaluation
Committee and the National Natural Science Foundation Committee of
China, is a technical adviser to the Ministry of Public Security of the
People’s Republic of China, and holds a dozen other academic and social
positions.
AMIN UL HAQ is working as a postdoctoral scientific research fellow at
the University of Electronic Science and Technology of China (UESTC),
China. He has extensive academic, technical, and professional
experience in Pakistan, where he is a lecturer at the Agricultural
University Peshawar. His research interests include machine learning,
deep learning, medical big data, IoT, e-health and telemedicine, and
related technologies and algorithms. He is associated with the Wavelets
Active Media Technology and Big Data Laboratory as a postdoctoral
scientific research fellow, has published high-level research papers in
reputable journals, and is an invited reviewer for numerous
world-leading high-impact journals (40+ journal papers reviewed to
date).
ABDUS SABOOR is currently pursuing his M.S. degree with the School of
Computer Science and Engineering, UESTC, China. He is a lecturer at a
government university in Peshawar, Pakistan. His research interests
include machine learning, medical big data, IoT, e-health and
telemedicine, and related technologies and algorithms.
AMJAD ALI received the Ph.D. degree in real-time systems from
Gyeongsang National University, South Korea. He is currently an
Assistant Professor and the Chairman of the Department of Computer
Science and Software Technology, University of Swat. He has published
several research papers in international journals and conferences.
VI. APPENDIX
The mathematical notations used in this paper are given in Table 12.
TABLE 12. Mathematical symbols and notations used in this paper

Symbol     Description
c(t)       long-term state at time step t
h(t)       short-term state at time step t
y(t)       output at time step t
i(t)       input gate controller
f(t)       forget gate controller
o(t)       output gate controller
n          total number of instances in the dataset; number of support vectors
x          input
y          class label
b          bias, offset from the origin
w          d-dimensional coefficient vector
x_i        i-th instance of dataset sample X
y_i        target label of x_i
p-value    probability
α          significance level
K          kernel
I          image
M_{m,n}    matrix with m rows and n columns
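The p-value and significance level α above belong to the corrected McNemar’s test [13], [28] that the paper uses to decide whether two classifiers’ performance differs significantly. A minimal pure-Python sketch of that test is given below; the function name and the example discordant-pair counts are hypothetical, not taken from the paper.

```python
import math

def mcnemar_corrected(b, c):
    """McNemar's test with Edwards' continuity correction.

    b: instances classifier A gets right and classifier B gets wrong
    c: instances classifier B gets right and classifier A gets wrong
    Returns (chi2, p_value); under the null hypothesis that both
    classifiers have the same error rate, the statistic follows a
    chi-square distribution with one degree of freedom.
    """
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of chi-square with 1 dof: P(X >= x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical discordant counts between two fake-news classifiers
chi2, p = mcnemar_corrected(40, 20)
reject_null = p < 0.05  # compare the p-value against significance level alpha
```

With 40 versus 20 discordant pairs the statistic is (|40 − 20| − 1)² / 60 ≈ 6.02 and p ≈ 0.014, so at α = 0.05 the difference would be judged significant.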
With the increasing popularity of social media and web-based forums, the distribution of fake news has become a major threat to various sectors and agencies. This has abated trust in the media, leaving readers in a state of perplexity. There exists an enormous assemblage of research on the theme of Artificial Intelligence (AI) strategies for fake news detection. In the past, much of the focus has been given on classifying online reviews and freely accessible online social networking-based posts. In this work, we propose a deep convolutional neural network (FNDNet) for fake news detection. Instead of relying on hand-crafted features, our model (FNDNet) is designed to automatically learn the discriminatory features for fake news classification through multiple hidden layers built in the deep neural network. We create a deep Convolutional Neural Network (CNN) to extract several features at each layer. We compare the performance of the proposed approach with several baseline models. Benchmarked datasets were used to train and test the model, and the proposed model achieved state-of-the-art results with an accuracy of 98.36% on the test data. Various performance evaluation parameters such as Wilcoxon, false positive, true negative, precision, recall, F1, and accuracy, etc. were used to validate the results. These results demonstrate significant improvements in the area of fake news detection as compared to existing state-of-the-art results and affirm the potential of our approach for classifying fake news on social media. This research will assist researchers in broadening the understanding of the applicability of CNN-based deep models for fake news detection.