... The presentation of news has changed considerably over the last decades in the light of technological advances. The new electronic environment led to the emergence of a new electronic form (apart from written and oral forms) of communication, where information is conveyed across time and borders (Bhatia et al., 2022; Blake, 2019; Yates & Orlikowski, 1992) and is manifested in the combination of interaction and communication. ...
The advancement of modern technologies has influenced the way news is presented and consumed, particularly online. Weather is an important topic for the public as it relates to the human experience and addresses current societal issues. In this paper, we introduce a systematic approach to conduct sentiment analysis of weather news stories, to specify the emotional tone and examine the role of subjectivity in online news reporting. This research falls within the scope of a lexicon-based (unsupervised) approach to sentiment analysis, which involves finding the sentiment polarity of words. The analysis is predominantly based on sentence-level sentiment analysis. Two popular online web services, MonkeyLearn and SentiStrength, were applied to automatically detect human emotions. We compared the efficiency of each tool and found that MonkeyLearn provided better final results in comparison to SentiStrength, which tended to misclassify negative sentiments into neutral ones. The final results of frequency calculation showed the dominance of weather news stories with negative sentiment polarity over positive and neutral ones, with neutral sentiments being in the minority. Based on the empirical findings, we observed an objectivity-to-subjectivity shift in online news reporting.
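The lexicon-based (unsupervised) idea described above can be sketched in a few lines: score each sentence by summing word polarities from a lexicon and threshold the total. The toy lexicon and weights below are purely illustrative assumptions, not the vocabularies actually used by MonkeyLearn or SentiStrength.

```python
# Minimal lexicon-based sentiment scorer at the sentence level.
# The tiny lexicon is illustrative only -- real tools such as
# SentiStrength ship curated lexicons with thousands of entries.
LEXICON = {
    "sunny": 1, "pleasant": 2, "mild": 1, "calm": 1,
    "storm": -2, "flood": -3, "severe": -2, "damage": -2, "warning": -1,
}

def sentence_polarity(sentence: str) -> str:
    """Sum word polarities and map the total to a discrete label."""
    score = sum(LEXICON.get(w.strip(".,!?").lower(), 0)
                for w in sentence.split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(sentence_polarity("Severe storm warning issued after flood damage."))
print(sentence_polarity("A sunny and pleasant weekend ahead."))
```

Sentences whose words are absent from the lexicon fall through to "neutral", which is one reason lexicon coverage matters for the objectivity/subjectivity findings reported above.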
Fake news, i.e., fabricated content presented as true with the intent of deceiving the public, has grown in prevalence in recent years. Spreading this kind of information undermines societal cohesion and well-being by fostering political division and distrust in government. Because of the sheer volume of news disseminated through social media, human verification has become infeasible, driving the development and deployment of automated methods for the identification of false news. Fake news publishers use a variety of stylistic techniques to boost the popularity of their works, one of which is to arouse the readers' emotions. Due to this, sentiment analysis, the text-analytics task of determining the polarity and intensity of feelings conveyed in a text, is now being utilized in fake news detection methods, either as the system's foundation or as a supplementary component. This assessment presents a full account of false news identification. The study also covers characteristics, features, taxonomy, the different sorts of data in the news, categories of false news, and detection approaches for spotting fake news. This research recognized fake news using the probabilistic latent semantic analysis approach. In particular, it describes the fundamental theory of the related work to provide a deep comparative analysis of the various literature that has contributed to this topic. In addition, different machine learning and deep learning techniques are compared to assess their performance for fake news detection. For this purpose, three datasets have been used.
In the age of the digital revolution and the widespread usage of social networks, the modalities of information consumption and production have been disrupted by the shift to instantaneous transmission. The scoop and its exclusivity sometimes last only a few minutes. Information spreads like wildfire throughout the world, with little regard for context or critical thought, resulting in the proliferation of fake news. As a result, it is preferable to have a system that allows consumers to obtain balanced news information. Some researchers have attempted to distinguish false from authentic news using tagged data and had some success. Online social groups propagate digital false news or fake news material in the form of shares, reshares, and repostings. This work aims to detect forms of fake news posted on social networks to enhance the quality of trust and transparency in the social network recommendation system. It provides an overview of traditional techniques used to detect fake news and modern approaches used for multiclassification using unlabeled data. Many researchers are focusing on detecting fake news, but fewer works highlight this detection's role in improving the quality of trust in social network recommendation systems. In this research paper, we take an improved approach to assisting users in deciding which information to read by alerting them to the degree of inaccuracy of the news items they are viewing and indicating the types of fake news that the material represents.
In recent years, consuming social media content to keep up with global news, and verifying its authenticity, has become a considerable challenge. Social media enables us to easily access news anywhere, anytime, but it also gives rise to the spread of fake news, thereby delivering false information, which has a negative impact on society. Therefore, it is necessary to determine whether or not news spreading over social media is real. This will help avoid confusion among social media users and is important in ensuring positive social development. This paper proposes a novel solution for detecting the authenticity of news through natural language processing techniques. Specifically, it proposes a scheme comprising three steps, namely stance detection, author credibility verification, and machine learning-based classification, to verify the authenticity of news. In the last stage of the proposed pipeline, several machine learning techniques are applied, such as decision tree, random forest, logistic regression, and support vector machine (SVM) algorithms. For this study, the fake news dataset was taken from Kaggle. The experimental results show an accuracy of 93.15%, precision of 92.65%, recall of 95.71%, and F1-score of 94.15% for the support vector machine algorithm. The SVM outperforms the second-best classifier, logistic regression, by 6.82%.
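The accuracy, precision, recall, and F1 figures reported in abstracts like the one above follow directly from confusion-matrix counts. A minimal sketch, with made-up counts rather than the paper's actual data:

```python
# Computing standard evaluation metrics from raw confusion-matrix
# counts (tp, fp, fn, tn). The counts below are illustrative only.
def classification_metrics(tp: int, fp: int, fn: int, tn: int):
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)           # of predicted positives, how many were right
    recall = tp / (tp + fn)              # of actual positives, how many were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f1

acc, prec, rec, f1 = classification_metrics(tp=90, fp=10, fn=5, tn=95)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```

Note that F1 can exceed accuracy (as in the SVM results above, 94.15% vs. 93.15%) whenever recall is high relative to the false-positive rate.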
People have learned about the world through media ranging from newspapers to today's digital platforms. From 1605 to 2021, the landscape of news has evolved immensely. Readers have largely abandoned newspapers and grown accustomed to digital devices, which let them view news anytime and anywhere; digital news soon became a crucial resource. Over the past few years, fake news has also evolved, and people are routinely deceived by fake stories shared by fake profiles on digital media. Most current approaches detect fake news with unidirectional neural network models. We propose BERT (Bidirectional Encoder Representations from Transformers), a bidirectional model that uses both the left and right context of each word to pre-train deep two-way representations from unlabeled text. It showed excellent results on fake news, attaining 99% accuracy and outperforming logistic regression and K-Nearest Neighbors. This method is crucial in dealing with fake news because it improves categorization and reduces computation time. Through this proposal, we aim to build a model to spot fake news across various sites. The motivation behind this work is to help people consume legitimate news while discarding misleading information on social media. Classification accuracy of fake news may be further improved through machine learning ensemble methods.
Before the internet, people acquired their news from radio, television, and newspapers. With the internet, news moved online, and suddenly anyone could post information on websites such as Facebook and Twitter. The spread of fake news has also increased with social media and has become one of the most significant issues of this century. People use fake news to damage the reputation of a well-reputed organization for their own benefit. The main motivation for this project is to build a tool that examines, through machine learning, the language patterns that distinguish fake from real news. This paper proposes machine learning models that can successfully detect fake news. These models identify whether news is real or fake and specify the accuracy of said news, even in a complex environment. After data preprocessing and exploration, we applied three machine learning models on top of a term frequency-inverse document frequency (TF-IDF) vectorizer: a random forest classifier, logistic regression, and a decision tree classifier. The accuracy of the TF-IDF vectorizer, logistic regression, random forest classifier, and decision tree classifier models was approximately 99.52%, 98.63%, 99.63%, and 99.68%, respectively. Machine learning models can be considered a great choice for finding reality-based results and can be applied to other unstructured data for various sentiment analysis applications.
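The TF-IDF weighting that several of these pipelines rely on is simple enough to sketch from scratch. The three-document corpus below is a toy assumption; real systems would use a library implementation such as scikit-learn's `TfidfVectorizer`:

```python
import math
from collections import Counter

# Minimal TF-IDF weighting over a toy corpus: term frequency within a
# document, scaled down by how many documents the term appears in.
docs = [
    "breaking news about the election",
    "the election results are official",
    "celebrity gossip breaking the internet",
]

def tfidf(corpus):
    tokenized = [d.split() for d in corpus]
    n = len(tokenized)
    # document frequency: in how many documents each term occurs
    df = Counter(w for toks in tokenized for w in set(toks))
    weights = []
    for toks in tokenized:
        tf = Counter(toks)
        weights.append({w: (tf[w] / len(toks)) * math.log(n / df[w])
                        for w in tf})
    return weights

w = tfidf(docs)
# "the" appears in every document, so its idf (and hence weight) is zero.
print(round(w[0]["the"], 3), round(w[0]["election"], 3) > 0)
```

Terms common to every document get zero weight, which is exactly why TF-IDF surfaces the distinctive vocabulary a classifier can discriminate on.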
Social media has become a popular means for people to consume and share news. However, it also enables the extensive spread of fake news, that is, news that deliberately provides false information, which has a significant negative impact on society. Especially recently, the false information about the new coronavirus disease 2019 (COVID-19) has spread like a virus around the world. The state of the Internet is forcing the world’s tech giants to take unprecedented action to protect the “information health” of the public. Despite many existing fake news datasets, comprehensive and effective algorithms for detecting fake news have become one of the major obstacles. In order to address this issue, we designed a self-learning semi-supervised deep learning network by adding a confidence network layer, which made it possible to automatically return and add correct results to help the neural network to accumulate positive sample cases, thus improving the accuracy of the neural network. Experimental results indicate that our network is more accurate than the existing mainstream machine learning methods and deep learning methods.
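The self-learning idea above, where a confidence layer feeds high-confidence predictions back into the training set, can be sketched with a stand-in base model. The nearest-centroid classifier on 1-D features below is an illustrative assumption, not the paper's deep network:

```python
# Self-training (pseudo-labeling) sketch: a base classifier labels the
# unlabeled pool, and only predictions above a confidence threshold are
# added back to the training set, round after round.
def centroid(xs):
    return sum(xs) / len(xs)

def self_train(labeled, unlabeled, threshold=0.6, rounds=5):
    labeled = dict(labeled)            # feature -> label (0 or 1)
    pool = list(unlabeled)
    for _ in range(rounds):
        c0 = centroid([x for x, y in labeled.items() if y == 0])
        c1 = centroid([x for x, y in labeled.items() if y == 1])
        still_unlabeled = []
        for x in pool:
            d0, d1 = abs(x - c0), abs(x - c1)
            conf = abs(d0 - d1) / max(d0, d1)   # margin-based confidence
            if conf >= threshold:
                labeled[x] = 0 if d0 < d1 else 1
            else:
                still_unlabeled.append(x)        # stays in the pool
        pool = still_unlabeled
    return labeled, pool

seed = {0.0: 0, 0.2: 0, 1.0: 1, 1.2: 1}
labels, leftovers = self_train(seed, [0.1, 0.15, 1.1, 0.55])
print(sorted(labels.items()))
```

The ambiguous sample (0.55, roughly equidistant from both centroids) never clears the threshold and is never pseudo-labeled, which is the mechanism that keeps self-training from accumulating noisy positives.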
Classification of fake news on social media has gained a lot of attention in the last decade due to the ease of adding fake content through social media sites. In addition, people prefer to get news on social media instead of from traditional television. These trends have led to an increased interest in fake news and its identification by researchers. This study focused on classifying fake news on social media with textual content (text classification). In this classification, four traditional methods were applied to extract features from texts (term frequency-inverse document frequency, count vector, character-level vector, and N-gram-level vector), employing 10 different machine learning and deep learning classifiers to categorize the fake news dataset. The results obtained showed that fake news with textual content can indeed be classified, especially using a convolutional neural network. This study obtained an accuracy range of 81 to 100% using different classifiers.
In this world of modern technologies and media, online news publications and portals are proliferating rapidly. As a result, it has become almost impossible to fact-check news headlines in the traditional way, given the growing number of content writers, online media portals, and news portals. Fake headlines are mostly filled with bogus or misleading content. They attract readers by putting phony words or misleading, fraudulent content in the headlines to increase views and shares. But these fake and misleading headlines create havoc in ordinary readers' lives and misguide them in many ways. We therefore took a step toward helping readers differentiate between fake and real news. We proposed a model that can successfully detect whether a story is fake or accurate based on its headline. We created a novel dataset in the Bengali language and reached our target using the Gaussian Naive Bayes algorithm. We tried other algorithms as well, but Gaussian Naive Bayes performed best in our model. It used text features based on TF-IDF, with an Extra Trees Classifier to select attributes. Using Gaussian Naive Bayes, our model achieved 87% accuracy, the best among all the algorithms we tried.
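The Gaussian Naive Bayes algorithm used above fits a per-class Gaussian to each feature and picks the class with the highest log-likelihood. The single made-up feature and training values below are illustrative assumptions; the real system works on TF-IDF features of Bengali headlines:

```python
import math

# From-scratch Gaussian Naive Bayes on one toy feature.
def fit(samples):                      # samples: list of (feature, label)
    model = {}
    for label in {y for _, y in samples}:
        xs = [x for x, y in samples if y == label]
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs) or 1e-9
        model[label] = (mean, var, len(xs) / len(samples))
    return model

def predict(model, x):
    def log_likelihood(mean, var, prior):
        # log of prior * Gaussian density at x
        return (math.log(prior)
                - 0.5 * math.log(2 * math.pi * var)
                - (x - mean) ** 2 / (2 * var))
    return max(model, key=lambda lbl: log_likelihood(*model[lbl]))

# Hypothetical feature: share of sensational words in the headline.
train = [(0.05, "real"), (0.10, "real"), (0.60, "fake"), (0.70, "fake")]
model = fit(train)
print(predict(model, 0.08), predict(model, 0.65))
```

Working in log space avoids the numerical underflow that multiplying many small densities would cause with realistic feature counts.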
Social media is a popular medium for the dissemination of real-time news all over the world. Easy and quick information proliferation is one of the reasons for its popularity. An extensive number of users of different age groups, genders, and societal beliefs are engaged on social media websites. Despite these favorable aspects, a significant disadvantage comes in the form of fake news, as people usually read and share information without caring about its genuineness. Therefore, it is imperative to research methods for the authentication of news. To address this issue, this article proposes a two-phase benchmark model named WELFake based on word embedding (WE) over linguistic features for fake news detection using machine learning classification. The first phase preprocesses the dataset and validates the veracity of news content by using linguistic features. The second phase merges the linguistic feature sets with WE and applies voting classification. To validate its approach, this article also carefully designs a novel WELFake dataset with approximately 72,000 articles, which incorporates different data sets to generate an unbiased classification output. Experimental results show that the WELFake model categorizes news as real or fake with 96.73% accuracy, which improves the overall accuracy by 1.31% compared to bidirectional encoder representations from transformers (BERT) and by 4.25% compared to convolutional neural network (CNN) models. Our frequency-based model, which focuses on analyzing writing patterns, outperforms predictive related works implemented using the Word2vec WE method by up to 1.73%.
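Linguistic features of the kind a first phase like WELFake's might compute are easy to sketch. The particular features below are illustrative assumptions, not the paper's actual feature set, which is considerably richer:

```python
import string

# Simple surface-level linguistic features of an article or headline.
def linguistic_features(text: str) -> dict:
    words = text.split()
    return {
        "word_count": len(words),
        "avg_word_len": sum(len(w) for w in words) / max(len(words), 1),
        "exclamations": text.count("!"),
        # shouting words like "SHOCKING" are a common clickbait signal
        "all_caps_words": sum(w.isupper() and len(w) > 1 for w in words),
        "punct_ratio": sum(c in string.punctuation for c in text)
                       / max(len(text), 1),
    }

feats = linguistic_features("SHOCKING!! You WON'T believe this miracle cure!")
print(feats["exclamations"], feats["all_caps_words"])
```

Feature vectors like this can then be concatenated with word-embedding vectors before the voting classifier, which is the merge step the second phase describes.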
The explosion of social media allowed individuals to spread information at no cost, with little investigation and fewer filters than before. This amplified the old problem of fake news, which has become a major concern due to the negative impact it has on communities. In order to tackle the rise and spread of fake news, automatic detection techniques have been researched, building on artificial intelligence and machine learning. The recent achievements of deep learning techniques in complex natural language processing tasks make them a promising solution for fake news detection too. This work proposes a novel hybrid deep learning model that combines convolutional and recurrent neural networks for fake news classification. The model was successfully validated on two fake news datasets (ISO and FA-KES), achieving detection results significantly better than other non-hybrid baseline methods. Further experiments on the generalization of the proposed model across different datasets yielded promising results.
In our modern era where the internet is ubiquitous, everyone relies on various online resources for news. Along with the increase in the use of social media platforms like Facebook, Twitter, etc. news spread rapidly among millions of users within a very short span of time. The spread of fake news has far-reaching consequences like the creation of biased opinions to swaying election outcomes for the benefit of certain candidates. Moreover, spammers use appealing news headlines to generate revenue using advertisements via click-baits. In this paper, we aim to perform binary classification of various news articles available online with the help of concepts pertaining to Artificial Intelligence, Natural Language Processing and Machine Learning. We aim to provide the user with the ability to classify the news as fake or real and also check the authenticity of the website publishing the news.
The explosive growth in fake news and its erosion to democracy, justice, and public trust has increased the demand for fake news detection and intervention. This survey reviews and evaluates methods that can detect fake news from four perspectives: (1) the false knowledge it carries, (2) its writing style, (3) its propagation patterns, and (4) the credibility of its source. The survey also highlights some potential research tasks based on the review. In particular, we identify and detail related fundamental theories across various disciplines to encourage interdisciplinary research on fake news. We hope this survey can facilitate collaborative efforts among experts in computer and information sciences, social sciences, political science, and journalism to research fake news, where such efforts can lead to fake news detection that is not only efficient but more importantly, explainable.
With the ever-increasing use of social media, it has become necessary to combat the spread of false information and reduce reliance on such sources for information retrieval. Social platforms are under constant pressure to come up with efficient methods to solve this problem, because users' interactions with fake and unreliable news spread it at the individual level. This spread of misinformation adversely affects perceptions of important events, and as such, it needs to be dealt with using a modern approach. In this paper, we collect 1356 news instances from various users via Twitter and media sources such as PolitiFact and create several datasets of real and fake news stories. Our study compares multiple state-of-the-art approaches such as convolutional neural networks (CNNs), long short-term memories (LSTMs), ensemble methods, and attention mechanisms. We conclude that a CNN + bidirectional LSTM ensembled network with an attention mechanism achieved the highest accuracy of 88.78%, whereas Ko et al., who tackled the same fake news identification problem, achieved a detection rate of 85%.
The issue of online fake news has attained increasing prominence in the diffusion of news stories online. Misleading or unreliable information in the form of videos, posts, articles, and URLs is extensively disseminated through popular social media platforms such as Facebook and Twitter. As a result, editors and journalists need new tools that can help them speed up the verification process for content that originated on social media. Motivated by the need for automated detection of fake news, the goal is to find out which classification model identifies phony features most accurately using three feature extraction techniques: Term Frequency-Inverse Document Frequency (TF-IDF), Count-Vectorizer (CV), and Hashing-Vectorizer (HV). This paper also proposes a novel multi-level voting ensemble model. The proposed system has been tested on three datasets using twelve classifiers, which are combined based on their false prediction ratio. It has been observed that the Passive Aggressive, Logistic Regression, and Linear Support Vector Classifier (LinearSVC) models individually perform best using the TF-IDF, CV, and HV feature extraction approaches, respectively, based on their performance metrics, whereas the proposed model outperforms the Passive Aggressive model by 0.8%, the Logistic Regression model by 1.3%, and the LinearSVC model by 0.4% using TF-IDF, CV, and HV, respectively. The proposed system can also be used to predict fake textual content from online social media websites.
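The voting step at the heart of such ensembles reduces to a majority vote over the base classifiers' predictions. The three stand-in "classifiers" below are toy rules introduced for illustration, not the twelve real models the system evaluates:

```python
from collections import Counter

# Hard-voting combiner: each base classifier casts a vote and the
# majority label wins.
def vote(predictions):
    """predictions: list of labels from the individual classifiers."""
    return Counter(predictions).most_common(1)[0][0]

def ensemble_predict(classifiers, sample):
    return vote([clf(sample) for clf in classifiers])

# Toy stand-in classifiers (hypothetical rules, for illustration only).
clf_a = lambda s: "fake" if "!" in s else "real"
clf_b = lambda s: "fake" if "miracle" in s.lower() else "real"
clf_c = lambda s: "real"

print(ensemble_predict([clf_a, clf_b, clf_c], "Miracle cure found!"))
print(ensemble_predict([clf_a, clf_b, clf_c], "Parliament passes budget"))
```

A multi-level variant would additionally weight or group the voters, e.g. by their false prediction ratio as described above, rather than counting all votes equally.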
Fake news is a phenomenon with a significant impact on our social life, particularly in the political world. Fake news detection is an emerging research area that is gaining interest but involves challenges due to the limited resources (i.e., datasets, published literature) available. We propose in this paper a fake news detection model that uses n-gram analysis and machine learning techniques. We investigate and compare two different feature extraction techniques and six different machine learning classification techniques. Experimental evaluation yields the best performance using Term Frequency-Inverse Document Frequency (TF-IDF) as the feature extraction technique and Linear Support Vector Machine (LSVM) as the classifier, with an accuracy of 92%.
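The n-gram analysis such models build on starts with extracting contiguous token sequences. A minimal word-level sketch (in a real pipeline, the n-grams would then be fed into TF-IDF weighting before classification):

```python
# Word-level n-gram extraction: all contiguous n-token windows of a text.
def ngrams(text: str, n: int):
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

print(ngrams("scientists reveal shocking truth", 2))
```

Bigrams and trigrams capture short phrases ("shocking truth") that single-word features miss, which is why n-gram features often help on stylistic detection tasks.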
This paper is based on a review of how previous studies have defined and operationalized the term “fake news.” An examination of 34 academic articles that used the term “fake news” between 2003 and 2017 resulted in a typology of types of fake news: news satire, news parody, fabrication, manipulation, advertising, and propaganda. These definitions are based on two dimensions: levels of facticity and deception. Such a typology is offered to clarify what we mean by fake news and to guide future studies.
Social media for news consumption is a double-edged sword. On the one hand, its low cost, easy access, and rapid dissemination of information lead people to seek out and consume news from social media. On the other hand, it enables the wide spread of "fake news", i.e., low quality news with intentionally false information. The extensive spread of fake news has the potential for extremely negative impacts on individuals and society. Therefore, fake news detection on social media has recently become an emerging research that is attracting tremendous attention. Fake news detection on social media presents unique characteristics and challenges that make existing detection algorithms from traditional news media ineffective or not applicable. First, fake news is intentionally written to mislead readers to believe false information, which makes it difficult and nontrivial to detect based on news content; therefore, we need to include auxiliary information, such as user social engagements on social media, to help make a determination. Second, exploiting this auxiliary information is challenging in and of itself as users' social engagements with fake news produce data that is big, incomplete, unstructured, and noisy. Because the issue of fake news detection on social media is both challenging and relevant, we conducted this survey to further facilitate research on the problem. In this survey, we present a comprehensive review of detecting fake news on social media, including fake news characterizations on psychology and social theories, existing algorithms from a data mining perspective, evaluation metrics and representative datasets. We also discuss related research areas, open problems, and future research directions for fake news detection on social media.
In analysing the news media's role in serving the functions associated with democratic citizenship, the number, diversity and range of news sources are central. Research conducted on sources has overwhelmingly focused on individual national systems. However, studying variations in news source patterns across national environments enhances understanding of the media's role. This article is based on a larger project, “Media System, Political Context and Informed Citizenship: A Comparative Study”, involving 11 countries. It seeks, first, to identify differences between countries in the sources quoted in the news; second, to establish whether there are consistent differences across countries between types of media in their sourcing patterns; and, third, to trace any emergent consistent patterns of variation between different types of organization across different countries. A range of findings related to news media source practices is discussed that highlights variations and patterns across different media and countries, thereby questioning common generalizations about the use of sources by newspapers and public service broadcasters. Finally, a case is made for comparative media research that helps enhance the news media's key role as a social institution dedicated to informed citizenship.
The significance of social media has increased manifold in the past few decades as it helps people from even the most remote corners of the world to stay connected. With the advent of technology, digital media has become more relevant and widely used than ever before, and along with this, there has been a resurgence in the circulation of fake news and tweets that demand immediate attention. In this paper, we describe a novel Fake News Detection system that automatically identifies whether a news item is "real" or "fake", as an extension of our work in the CONSTRAINT COVID-19 Fake News Detection in English challenge. We have used an ensemble model consisting of pre-trained models followed by a statistical feature fusion network, along with a novel heuristic algorithm that incorporates various attributes present in news items or tweets, such as source, username handles, URL domains, and authors, as statistical features. Our proposed framework also quantifies reliable predictive uncertainty along with a proper class-output confidence level for the classification task. We have evaluated our results on the COVID-19 Fake News dataset and the FakeNewsNet dataset to show the effectiveness of the proposed algorithm at detecting fake news in short news content as well as in news articles. We obtained a best F1-score of 0.9892 on the COVID-19 dataset and an F1-score of 0.9156 on the FakeNewsNet dataset.
In recent years, the rise of online social networks has led to a proliferation of social news such as product advertisements, political news, celebrity information, etc. Social networks such as Facebook, Instagram, and Twitter are affected by fake news spread by their users. Unfortunately, some users use unethical means to grow their links and reputation by spreading fake news in the form of texts, images, and videos. Recent information appearing on an online social network is therefore doubtful and, in many cases, misleads other users in the network. Fake news is spread intentionally to make readers believe false news, which makes it difficult for detection mechanisms to identify fake news on the basis of shared content alone. Therefore, we need to add information related to the user's profile, such as the user's involvement with others, to reach a decision. The disseminated information and its diffusion process make it hard to detect such content promptly, highlighting the need for automatic fake news detection. In this paper, we introduce an automatic fake news detection approach in a Chrome environment, where it can detect fake news on Facebook. Specifically, we use multiple features associated with a Facebook account, together with some news content features, to analyze the behavior of the account through deep learning. Experimental analysis of real-world information demonstrates that our fake news detection approach achieves higher accuracy than the existing state-of-the-art techniques.
Massive dissemination of fake news and its potential to erode democracy has increased the demand for accurate fake news detection. Recent advancements in this area have proposed novel techniques that aim to detect fake news by exploring how it propagates on social networks. Nevertheless, to detect fake news at an early stage, i.e., when it is published on a news outlet but not yet spread on social media, one cannot rely on news propagation information as it does not exist. Hence, there is a strong need to develop approaches that can detect fake news by focusing on news content. In this article, a theory-driven model is proposed for fake news detection. The method investigates news content at various levels: lexicon-level, syntax-level, semantic-level, and discourse-level. We represent news at each level, relying on well-established theories in social and forensic psychology. Fake news detection is then conducted within a supervised machine learning framework. As an interdisciplinary research, our work explores potential fake news patterns, enhances the interpretability in fake news feature engineering, and studies the relationships among fake news, deception/disinformation, and clickbaits. Experiments conducted on two real-world datasets indicate the proposed method can outperform the state-of-the-art and enable fake news early detection when there is limited content information.
The debate around fake news has grown recently because of the potential harm it can cause in different fields, with politics being one of the most affected. Due to the amount of news being published every day, several studies in computer science have proposed models using machine learning to detect fake news. However, most of these studies focus on news in one language (mostly English) or rely on characteristics of specific social media platforms (like Twitter or Sina Weibo). Our work proposes to detect fake news using only text features that can be generated regardless of the source platform and are as language-independent as possible. We carried out experiments on five datasets, comprising both texts and social media posts, in three language groups: Germanic, Latin, and Slavic, and obtained competitive results when compared to benchmarks. We compared the results obtained through a custom set of features with other popular natural language processing techniques, such as bag-of-words and Word2Vec.
Automatic detection of fake news, which could negatively affect individuals and the society, is an emerging research area attracting global attention. The problem has been approached in this paper from Natural Language Processing and Machine Learning perspectives. The evaluation is carried out for three standard datasets with a novel set of features extracted from the headlines and the contents. Performances of seven machine learning algorithms in terms of accuracies and F1 scores are compared. Gradient Boosting outperformed other classifiers with mean accuracy of 88% and F1-Score of 0.91.
False or unverified information spreads just like accurate information on the web, possibly going viral and influencing public opinion and decisions. Fake news and rumours represent the most popular forms of false and unverified information, respectively, and should be detected as soon as possible to avoid their dramatic effects. Interest in effective detection techniques has therefore been growing very fast in recent years. In this paper we survey the different approaches to automatic detection of fake news and rumours proposed in the recent literature. In particular, we focus on five main aspects. First, we report and discuss the various definitions of fake news and rumours that have been considered in the literature. Second, we highlight how the collection of relevant data for performing fake news and rumour detection is problematic, and we present the various approaches that have been adopted to gather these data, as well as the publicly available datasets. Third, we describe the features that have been considered in fake news and rumour detection approaches. Fourth, we provide a comprehensive analysis of the various techniques used to perform rumour and fake news detection. Finally, we identify and discuss future directions.
A large body of recent works has focused on understanding and detecting fake news stories that are disseminated on social media. To accomplish this goal, these works explore several types of features extracted from news stories, including source and posts from social media. In addition to exploring the main features proposed in the literature for fake news detection, we present a new set of features and measure the prediction performance of current approaches and features for automatic detection of fake news. Our results reveal interesting findings on the usefulness and importance of features for detecting false news. Finally, we discuss how fake news detection approaches can be used in the practice, highlighting challenges and opportunities.
Classification of time series has been attracting great interest over the past decade. While dozens of techniques have been introduced, recent empirical evidence has strongly suggested that the simple nearest neighbor algorithm is very difficult to beat for most time series problems, especially for large-scale datasets. While this may be considered good news, given the simplicity of implementing the nearest neighbor algorithm, there are some negative consequences of this. First, the nearest neighbor algorithm requires storing and searching the entire dataset, resulting in a high time and space complexity that limits its applicability, especially on resource-limited sensors. Second, beyond mere classification accuracy, we often wish to gain some insight into the data and to make the classification result more explainable, which global characteristics of the nearest neighbor cannot provide. In this work we introduce a new time series primitive, time series shapelets, which addresses these limitations. Informally, shapelets are time series subsequences which are in some sense maximally representative of a class. We can use the distance to the shapelet, rather than the distance to the nearest neighbor to classify objects. As we shall show with extensive empirical evaluations in diverse domains, classification algorithms based on the time series shapelet primitives can be interpretable, more accurate, and significantly faster than state-of-the-art classifiers.
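The shapelet primitive described above reduces to two small pieces: the distance from a series to a shapelet is the minimum distance over all same-length subsequences, and classification thresholds that distance. The spike-shaped shapelet, series, and threshold below are toy assumptions for illustration:

```python
# Distance from a time series to a shapelet: the minimum squared
# Euclidean distance over every same-length subsequence of the series.
def subsequence_dist(series, shapelet):
    m = len(shapelet)
    return min(
        sum((series[i + j] - shapelet[j]) ** 2 for j in range(m))
        for i in range(len(series) - m + 1)
    )

# Classify by thresholding the shapelet distance, instead of searching
# the whole training set as nearest-neighbor classification would.
def classify(series, shapelet, threshold):
    d = subsequence_dist(series, shapelet)
    return "class_a" if d <= threshold else "class_b"

shapelet = [0.0, 1.0, 0.0]                 # a "spike" pattern
spiky = [0.1, 0.0, 1.0, 0.1, 0.0]
flat = [0.1, 0.1, 0.2, 0.1, 0.1]
print(classify(spiky, shapelet, threshold=0.1))
print(classify(flat, shapelet, threshold=0.1))
```

Only the shapelet and threshold need to be stored at classification time, which is the source of the space and interpretability advantages over the nearest-neighbor baseline; finding a good shapelet, by contrast, requires the search over training subsequences that the paper addresses.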