Figure 1 - available via license: Creative Commons Attribution 4.0 International
Source publication
In recent years, keeping up with global news through social media, and verifying the authenticity of that content, has become a considerable challenge. Social media enables us to access news easily, anywhere and at any time, but it also gives rise to the spread of fake news and the delivery of false information, which has a negative impact on society....
Contexts in source publication
Context 1
... in the past few years, social media channels such as Facebook, Twitter, and Instagram have emerged as platforms for the quick dissemination and retrieval of information. Figure 1 shows a snapshot of some fake news from recent years. According to various studies [2], almost 50% of the population of developed nations depends on social media for news. ...
Context 2
... an SVM classifier was applied. The confusion matrix and the accuracy of this classifier are shown in Figure 10, where it can be observed that the accuracy of the classifier is 93.15%, the precision is 92.65%, the recall is 95.71%, and the F1-score is 94.15%. ...
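The reported metrics are internally consistent: the F1-score is the harmonic mean of precision and recall, so it can be recomputed from the quoted values (the confusion matrix itself is not reproduced in this excerpt):

```python
# Sanity-check of the reported SVM metrics: recompute F1 from the
# precision (92.65%) and recall (95.71%) quoted in the text.

def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in percent)."""
    return 2 * precision * recall / (precision + recall)

f1 = f1_score(92.65, 95.71)
print(round(f1, 2))  # 94.15, matching the reported F1-score
```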
Context 3
... after implementing all of the classifiers, their results were compared, and it was observed that the support vector machine provided the best accuracy for the proposed fake news detector, outperforming the other classifiers with an accuracy of 93.15%, a precision of 92.65%, a recall of 95.71%, and an F1-score of 94.15%. Table 2 and Figure 11 compare various aspects of the classifiers. Comparing the SVM with logistic regression, the second-best classifier, it can be observed that the SVM is better in terms of accuracy, as follows: Improvement in accuracy = 93.15 ...
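The improvement expression above is truncated in this excerpt. A citation snippet later on this page quotes the SVM's margin over logistic regression as 6.82%, from which the LR accuracy can be inferred (it is not stated directly here, so treat it as a derived value):

```python
# Reconstructing the truncated comparison between SVM and logistic
# regression. The 6.82 margin is quoted in the Citations section of
# this page; the LR accuracy is inferred from it, not stated directly.

svm_accuracy = 93.15
improvement = 6.82                      # margin quoted in the Citations section
lr_accuracy = svm_accuracy - improvement
print(round(lr_accuracy, 2))  # 86.33 (inferred, not stated in the excerpt)
```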
Similar publications
Malaria is a pressing medical issue in tropical and subtropical regions. Currently, manual microscopic examination remains the gold-standard malaria diagnosis method. Nevertheless, this procedure requires highly skilled lab technicians to prepare and examine the slides. Therefore, a framework encompassing image processing and machine learning i...
Cancer is widely considered one of the leading causes of death. One of the most effective tools for handling cancer diagnosis, prognosis, and treatment is the expression profiling technique based on microarray genes. For each data point (sample), gene expression data usually comprises tens of thousands of genes. As a resu...
Key blocks are the main cause of structural failure in discontinuous rock slopes, and the automated identification of these block types is critical for evaluating stability conditions. This paper presents a classification framework to categorize rock blocks based on the principles of block theory. The deep convolutional neural netw...
Companies in the same supply chain influence each other, so sharing information enables more efficient supply chain management. An efficient supply chain requires a symmetry of information between participating entities, but in reality the information is asymmetric, causing problems. The sustainability of the supply chain continues to be threaten...
Ensemble learning methods exhibit a high level of performance, which is essential and useful in various domains. Random Forest (RF) is an ensemble learning technique that creates many trees on subsets of the data and combines the output of all the trees, thereby reducing the overfitting problem in decision trees and also the variance the...
Citations
... The study used data from the ISOT fake news dataset and compared results with other state-of-the-art machine learning techniques, such as gradient boosting machines (GBM), extreme gradient boosting (XGBoost), and the adaptive boosting regression model. Support vector machine (SVM) models have also shown promising results, with an accuracy of 93.15% achieved on a fake news dataset extracted from Kaggle, outperforming the LR approach applied to the same data by 6.82% [39]. ...
The ubiquitous access to and exponential growth of information available on social media networks have facilitated the spread of fake news, complicating the task of distinguishing between fake and real news. Fake news is a significant social barrier that has a profoundly negative impact on society. Despite the large number of studies on fake news detection, they have not yet been combined to offer coherent insights into trends and advancements in this domain. Hence, the primary objective of this study was to fill this knowledge gap. The method for selecting the pertinent articles for extraction was created using the preferred reporting items for systematic reviews and meta-analyses (PRISMA). This study reviewed deep learning, machine learning, and ensemble-based fake news detection methods through a meta-analysis of 125 studies in order to aggregate their results quantitatively. The meta-analysis primarily focused on statistics and the quantitative analysis of data from numerous separate primary investigations to identify overall trends. The results of the meta-analysis were reported by spatial distribution, the approaches adopted, the sample size, and the performance of the methods in terms of accuracy. According to the between-study variance statistics, high heterogeneity was found, with τ² = 3.441; the ratio of true heterogeneity to total observed variation was I² = 75.27%, with a heterogeneity chi-square Q = 501.34, 124 degrees of freedom, and p ≤ 0.001. A p-value of 0.912 from Egger's statistical test confirmed the absence of publication bias. The meta-analysis found the approaches recommended by the included primary studies on fake news detection to be effective. Furthermore, the findings can inform researchers about the various approaches they can use to detect online fake news.
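The heterogeneity statistics quoted in this abstract check out against the standard definition of I², which is computed from the Q statistic and its degrees of freedom:

```python
# Verify the reported meta-analysis heterogeneity: I^2 = (Q - df) / Q,
# expressed as a percentage. Q and df are taken from the abstract above.

Q = 501.34   # heterogeneity chi-square
df = 124     # degrees of freedom (125 studies - 1)
i_squared = (Q - df) / Q * 100
print(round(i_squared, 2))  # 75.27, matching the reported I^2
```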
... Numerous approaches for automatically detecting the authenticity of news have been developed. Initially, Natural Language Processing (NLP) issues were handled using traditional Machine Learning (ML) methods such as Logistic Regression and Support Vector Machine (SVM) with hand-crafted features [10,11]. These approaches inevitably produced high-dimensional interpretations of language processing, giving rise to the curse of dimensionality. ...
Fake news detection techniques are a topic of interest due to the vast abundance of fake news accessible via social media. Present fake news detection systems perform satisfactorily on well-balanced data; however, when the dataset is biased, these models perform poorly. Additionally, manual labeling of fake news data is time-consuming, even though abundant fake news traverses the internet. Thus, we introduce a text augmentation technique with a Bidirectional Encoder Representations from Transformers (BERT) language model to generate an augmented dataset composed of synthetic fake data. The proposed approach overcomes the minority-class issue and performs the classification with the AugFake-BERT model (trained with the augmented dataset). The proposed strategy is evaluated against twelve different state-of-the-art models. The proposed model outperforms the existing models with an accuracy of 92.45%. Moreover, the accuracy, precision, recall, and F1-score performance metrics are utilized to evaluate the proposed strategy and demonstrate that a balanced dataset significantly affects classification performance.
... In terms of news content, researchers created linguistic and visual methodologies for extracting information from text data. However, although linguistic-based qualities have been extensively explored to aid with general NLP tasks such as text classification and grouping, the fundamental characteristics of false news remain unknown [40]. Additionally, "embedding methods such as word embedding and deep neural networks are gaining popularity for textual feature extraction due to their ability to generate better representations and improve feature extraction accuracy" [17,23,41]. ...
Social media platforms like Twitter have become common tools for disseminating and consuming news because of the ease with which users can access and consume it. This paper focuses on the identification of false news and the use of cutting-edge detection methods at the news, user, and social levels. A fake news detection taxonomy is proposed in this research. This study examines a variety of cutting-edge methods for spotting false news and discusses their drawbacks. It also explores how to detect and recognize false news using credibility-based, time-based, and social-context-based approaches, as well as the substance of the news itself. Lastly, the paper examines various datasets used for detecting fake news and proposes an algorithm.
... In addition, in order to achieve superior accuracy in FND, the optimal weights of the ensemble learning methods were determined using the Self-Adaptive Harmony Search (SAHS) technique. Islam et al. [18] presented a new solution in which the authenticity of news is verified using NLP approaches. In detail, this work presented a new method with three stages: stance recognition, author credibility confirmation, and an ML-based classifier to verify the authenticity of the news. ...
The recent advancements made in the World Wide Web and social networking have eased the spread of fake news among people at a faster rate. Most of the time, the intention of fake news is to misinform people and create manipulated societal perceptions. The spread of low-quality news on social networking sites has a negative influence on people as well as society. In order to overcome the ever-increasing dissemination of fake news, automated detection models are developed using Artificial Intelligence (AI) and Machine Learning (ML) methods. The latest advancements in Deep Learning (DL) models and complex Natural Language Processing (NLP) tasks make the former a significant solution for achieving Fake News Detection (FND). Against this background, the current study focuses on the design and development of a Natural Language Processing with Sea Turtle Foraging Optimization-based Deep Learning Technique for Fake News Detection and Classification (STODL-FNDC) model. The aim of the proposed STODL-FNDC model is to discriminate fake news from legitimate news in an effectual manner. In the proposed STODL-FNDC model, the input data primarily undergoes pre-processing and GloVe-based word embedding. Besides, the STODL-FNDC model employs a Deep Belief Network (DBN) approach for the detection as well as classification of fake news. Finally, the STO algorithm is utilized to adjust the hyperparameters involved in the DBN model in an optimal manner. The novelty of the study lies in the design of the STO algorithm with the DBN model for FND. In order to assess the detection performance of the STODL-FNDC technique, a series of simulations was carried out on benchmark datasets. The experimental outcomes established the better performance of the STODL-FNDC approach over other methods, with a maximum accuracy of 95.50%.
... A natural language, unlike a formal language, is imprecise and highly context-dependent. However, the emergence of models such as TF-IDF approaches, support vector machines, and long short-term memory (LSTM) has enabled the analysis and comprehension of text [4]. Sentiment analysis, also called opinion mining or text mining, is a way to find out public opinion and people's reactions toward a particular entity. ...
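As a minimal sketch of the TF-IDF weighting mentioned above, the scheme can be implemented in a few lines of pure Python (real systems would typically use a library vectorizer; the toy corpus below is made up for illustration):

```python
# TF-IDF sketch: weight each term by its frequency in a document,
# discounted by how many documents it appears in.
import math
from collections import Counter

def tfidf(corpus):
    """Term frequency * inverse document frequency per tokenized document."""
    n_docs = len(corpus)
    df = Counter()                       # number of documents containing each term
    for doc in corpus:
        df.update(set(doc))
    weights = []
    for doc in corpus:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights

docs = [["fake", "news", "spreads"], ["real", "news"], ["fake", "claims"]]
w = tfidf(docs)
# "spreads" is unique to one document, so it outweighs the common "news".
```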
This paper performs sentiment analysis on tweets and social media posts of the general public of Pakistan about one of Pakistan's political movements, the Pakistan Democratic Movement (PDM). The PDM is a political movement comprising 11 political parties of Pakistan, founded in opposition to the current government of Pakistan. This paper focuses on analyzing the sentiments of common Pakistanis toward the PDM. Sentiment analysis is also called opinion mining or text mining; it is a way to find out public opinion and people's reactions toward a particular entity or topic. In the proposed system, data is extracted from Facebook using an instant data scraper, and tweets are extracted from Twitter using the Twitter API. The data was extracted based on the query: the current situation in Pakistan, i.e., the Pakistan Democratic Movement. This paper focuses on mining social media comments written in different languages, mostly English. After pre-processing, the data is labelled manually using five emotions: agree, disagree, neutral, sarcastic, and angry. After labeling the data, several algorithms are used, such as support vector machines, Long Short-Term Memory (LSTM), and Convolutional Neural Networks (CNN), to classify the tweets/posts.
... To address the detection of fake news, the authors of [5] present a solution based on three steps: stance detection, author credibility verification, and machine learning classification. Stance detection verifies the relevance between the title and the paragraphs of a news item; if there is a match, the next module checks whether the author is authentic to determine whether the news item should be believed. ...
Artificial Intelligence has gained a lot of popularity in recent years thanks to the advent of, mainly, Deep Learning techniques [...]
... Given a set of training examples, the objective of the hyperplane is to separate the set in such a way that every instance with the same label stays on the same side. SVM has been shown to be more accurate in some tasks than decision trees or neural-network-based approaches [29]. ...
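The separating-hyperplane idea described above can be illustrated with a small linear classifier trained on the SVM hinge loss via subgradient descent, in pure Python (the data points, learning rate, and regularization strength below are made up for the example):

```python
# Toy linear SVM: subgradient descent on hinge loss + L2 penalty.
# Points inside the margin pull the hyperplane; others only shrink w.

def train_linear_svm(points, labels, lr=0.01, lam=0.01, epochs=200):
    """Return weights w and bias b for a linear max-margin classifier."""
    dim = len(points[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):        # y is +1 or -1
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:                      # point violates the margin
                w = [wi + lr * (y * xi - lam * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:                               # only apply L2 shrinkage
                w = [wi - lr * lam * wi for wi in w]
    return w, b

X = [(2.0, 2.0), (3.0, 3.0), (-2.0, -1.0), (-3.0, -2.0)]
y = [1, 1, -1, -1]
w, b = train_linear_svm(X, y)
predict = lambda x: 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
# Every training instance ends up on its own side of the hyperplane.
assert all(predict(x) == t for x, t in zip(X, y))
```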
Social media is a great source of data for analysis, since it provides ways for people to share emotions, feelings, ideas, and even symptoms of diseases. By the end of 2019, a global pandemic alert was raised concerning a virus that had a high contamination rate and could cause respiratory complications. To help identify those who may have symptoms of this disease, or to detect who is already infected, this paper analyzed the performance of eight machine learning algorithms (KNN, Naive Bayes, Decision Tree, Random Forest, SVM, simple Multilayer Perceptron, Convolutional Neural Networks, and BERT) in the search for and classification of tweets that mention self-reported COVID-19 symptoms. The dataset was labeled using a set of disease symptom keywords provided by the World Health Organization. The tests showed that the Random Forest algorithm had the best results, closely followed by BERT and the Convolutional Neural Network, although traditional machine learning algorithms can also provide good results. This work could also aid in the selection of algorithms for the identification of disease symptoms in social media content.
... Fake news is defined as news that is deliberately fabricated and verifiably false [10,11]. Existing work on fake news detection can be divided into two categories: unimodal and multimodal. ...
As one of the most popular social media platforms, microblogs are ideal places for news propagation. In microblogs, tweets with both text and images are more likely to attract attention than text-only tweets. This advantage is exploited by fake news producers to publish fake news, which has a devastating impact on individuals and society. Thus, multimodal fake news detection has attracted the attention of many researchers. For news with text and an image, multimodal fake news detection utilizes both text and image information to determine the authenticity of the news. Most of the existing methods for multimodal fake news detection obtain a joint representation by simply concatenating a vector representation of the text and a visual representation of the image, which ignores the dependencies between them. Although there are a small number of approaches that use the attention mechanism to fuse them, they are not fine-grained enough in feature fusion. The reason is that, for a given image, there are multiple visual features and certain correlations between these features. These approaches do not use multiple feature vectors representing different visual features to fuse with the textual features, and they ignore the correlations, resulting in inadequate fusion of textual and visual features. In this paper, we propose a novel fine-grained multimodal fusion network (FMFN) to fully fuse textual features and visual features for fake news detection. Scaled dot-product attention is utilized to fuse word embeddings of the words in the text and multiple feature vectors representing different features of the image, which not only considers the correlations between different visual features but also better captures the dependencies between textual and visual features. We conduct extensive experiments on a public Weibo dataset.
Our approach achieves competitive results compared with other methods for fusing visual representation and text representation, which demonstrates that the joint representation learned by the FMFN (which fuses multiple visual features and multiple textual features) is better than the joint representation obtained by fusing a visual representation and a text representation in determining fake news.
... It is advisable to scale the number of input variables to reduce the cost of modeling calculations and, in some cases, to increase the model's effectiveness. Statistics-based feature selection methods compute a statistic for each input variable with respect to the target variable and select the input variables with a substantial correlation to the target [44,45]. Various feature selection techniques are available, such as Univariate Selection, Feature Importance, and a Correlation Matrix with Heatmap. ...
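A minimal sketch of the correlation-based filter described above: score each input variable by its Pearson correlation with the target and keep the strongest (the toy feature values and names below are made up for illustration):

```python
# Correlation-based feature selection: rank features by |Pearson r|
# with the target variable and keep the top-scoring one(s).
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

features = {
    "f1": [1, 2, 3, 4, 5],     # perfectly correlated with the target
    "f2": [2, 1, 4, 3, 5],     # moderately correlated
    "f3": [5, 3, 1, 4, 2],     # weakly (negatively) correlated
}
target = [10, 20, 30, 40, 50]

scores = {name: abs(pearson(vals, target)) for name, vals in features.items()}
top = max(scores, key=scores.get)
print(top)  # f1 has the highest |correlation| with the target
```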
Diabetes Mellitus is one of the most severe diseases, and many studies have been conducted to predict diabetes. This research aimed to develop an intelligent, machine-learning-based mobile application to determine whether a person is diabetic, pre-diabetic, or non-diabetic without the assistance of any physician or medical tests. The methodology of this study was divided into two parts: the Diabetes Prediction Approach and the Proposed System Architecture Design. The Diabetes Prediction Approach uses a novel approach, the Light Gradient Boosting Machine (LightGBM), to ensure a faster diagnosis. The Proposed System Architecture Design is composed of seven modules; the Answering Question Module is a natural language processing chatbot that can answer all kinds of questions related to diabetes, and the Doctor Consultation Module ensures free treatment related to diabetes. In this research, 90% accuracy was obtained by performing K-fold cross-validation on top of the K-nearest neighbors (KNN) and LightGBM algorithms. To evaluate the model's performance, the Receiver Operating Characteristic (ROC) curve and the Area Under the ROC Curve (AUC) were applied, with values of 0.948 and 0.936, respectively. This manuscript presents some exploratory data analysis, including a correlation matrix and a survey report. Moreover, the proposed solution can be adjusted to the daily activities of a diabetic patient.